The following bug was just identified in OpenBSD and it looks like the same
bug exists in the other BSDen NFS servers.
A Linux client (don't know which version, but you can look at
http://bugzilla.kernel.org/show_bug.cgi?id=6256)
does a Setattr of mtime to the server's time, where the file is mode 0664 and
the client user has group access (i.e. the caller is not the file owner).
The BSD servers fail the Setattr with EPERM, since the VA_UTIMES_NULL flag
isn't set before doing the VOP_SETATTR.
It seems to me that this should be allowed, since it is allowed for a local
utimes(2). If so, the fix is to set VA_UTIMES_NULL for the
"set-time-to-server-time" cases of setting atime and/or mtime.
Submitted by: rick@snowhite.cis.uoguelph.ca
Reviewed by: cel
Approved by: silby
MFC after: 1 week
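
For reference, the local behaviour the message compares against can be
exercised from userland. This is only a sketch: the function name and the
/tmp file template are made up for illustration, and because it runs as the
file owner it shows the "set to current time" call itself rather than the
group-write permission case. The kernel-side VA_UTIMES_NULL handling is not
reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Create a scratch file, push its timestamps far into the past, then reset
 * them with utimes(path, NULL), the "set to current time" form.
 * Returns 0 if mtime moved forward, -1 on any failure.
 * The function name and file template are hypothetical.
 */
int
touch_with_null_times(void)
{
	char path[] = "/tmp/utimes_null_XXXXXX";
	struct timeval old[2] = { { 1000000, 0 }, { 1000000, 0 } };
	struct stat before, after;
	int fd, ret = -1;

	fd = mkstemp(path);
	if (fd == -1)
		return (-1);
	close(fd);

	/* Explicit times need ownership; a NULL times pointer only needs write access. */
	if (utimes(path, old) == 0 && stat(path, &before) == 0 &&
	    utimes(path, NULL) == 0 && stat(path, &after) == 0 &&
	    after.st_mtime > before.st_mtime)
		ret = 0;

	unlink(path);
	return (ret);
}
```

Calling touch_with_null_times() from a small main() and checking for a 0
return is enough to confirm the local semantics on a given system.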
socket can have a tcp connection that has entered time wait
attached to it, in the event that shutdown() is called on the
socket and the FINs properly exchange before close(). In this
case we don't detach or free the inpcb, just leave the tcptw
detached and freed, but we must release the inpcb lock (which we
didn't previously).
MFC after: 3 months
acpi(4) HPET time counter support,
acpi_ibm(4) fan control support,
ddb(4) show lock,
ddb(4) show sleepq,
firmware(9) added,
random(4) MPSAFE,
new sysctl kern.sigqueue.queue_sigchild,
brandinfo BI_CAN_EXEC_DYN flag,
new sysctl kern.forcesigexit,
RedZone, a buffer corruption protection for kernel's malloc(9),
security.mac.biba.interfaces_equal for mac_biba,
POSIX_TIMERS support updated to 200112L,
initial support for POSIX message queue,
Xbox support,
DEFAULTS kernel configuration files for each arch,
cardbus(4) /dev/cardbus%d.cis device node added,
ce(4) for Cronyx Tau-PCI/32 added,
ipmi(4), OpenIPMI (Intelligent Platform Management Interface)
driver added,
kbdmux(4) integrated into syscons(4) and kbd,
uart(4) now in GENERIC kernel,
uart(4) LOM and RSC support,
snd_atiixp(4) added and suspend/resume support,
snd_solo(4) MPSAFE,
speaker(4) amd64 support,
uaudio(4) 24/32 bit audio support,
ath(4) updated to version 0.9.16.16,
bge(4) Jumbo frame support, big-endian arch support, MPSAFE,
em(4) updated to version 3.2.18, big-endian arch support,
performance improvement, suspend/resume support,
iwi(4) big-endian arch support,
le(4) for AMD Am7900 LANCE added,
myri10ge(4) for Myricom Myri10GE adapter added,
nve(4) updated to version 1.0-0310,
ti(4) big-endian arch support,
ufoma(4) for FOMA 3G mobile phone in Japan added,
vgapci(4) stub driver added,
arp(8) retransmission algorithm revised,
new sysctl net.link.ether.inet.log_arp_permanent_modify,
support for -i <if> with -d -a,
an experimental BPF Just-In-Time compiler added,
if_bridge(4) span ports support added,
if_bridge(4) RFC 3378 EtherIP support,
ipfw(4) now supports action argument substitution from table lookup,
ng_bpf(4) BPF Just-In-Time compiler support,
bug related to NFS over TCP reconnection fixed,
IPV6_V6ONLY now works for UDP,
amr(4) performance improvement, ioctl support for MegaRaid Tools,
ata(4) DMA for kernel dump and dumping to ataraid(4) devices,
ataraid(4) now supports JMicron ATA RAID metadata,
gmirror and graid3 disconnect_on_failure sysctls added,
g_md.ko renamed to geom_md.ko,
mpt(4) SAS HBA and 64-bit PCI support,
twa(4) updated to 9.3.0.1,
geli(8) now allows loading keyfiles before root file system is mounted,
initial support for SGI's XFS added,
ACPI-CA updated to 20051021,
DRM updated to 20051202,
TrustedBSD OpenBSM version 1.0 alpha 5 imported,
bsnmpd(1) Host Resources MIB in RFC 2790 support,
config(8) "nocpu" directive added,
config(8) now reads DEFAULTS, if any, before the specified config file,
csh(1) NLS catalog support,
csup(1), CVSup-compatible client written in C imported,
devd(8) -f option,
ftpd(8) change related to PID file creation,
gbde(8) -k and -K option,
gpt(8) GPT partition label setting support,
gvinum(8) now supports moving a subdisk between drives,
GSS-API version 2 (RFC2743 and RFC2744) implemented,
jail(8) -J option,
kdump(1) -H and -s option,
kgdb(1) -w option,
libarchive(3) tp format support,
ln(1) -F option,
locate(1) -I option,
mdmfs(8) -P and -E option,
mergemaster(8) -A option,
mount(8) "nodev" option removed,
netstat(1) IPsec protocol stats support,
periodic(8) daily gmirror, graid3, gstripe, gconcat support,
pkill(1) -I option,
rfcomm_pppd(8) -c servicename support,
rtld(1) ELF symbol versioning support,
sh(1) "times" built-in command support,
truss(1) -s option,
truss(1) now works on FreeBSD/ppc,
usbd(8) removed in favor of devd(8),
xargs(1) -r option,
rc.d/auditd added,
rc.d/bluetooth, rc.d/hcsecd, rc.d/sdpd added,
rc.d/ftpd added,
rc.d/hostapd added,
rc.d/netif ipv4_addrs_<ifn> support,
rc.d/rcconf.sh removed and early_late_divider variable added,
rc.initdiskless now uses tar(1) instead of pax(1),
rc.d/pccard removed,
rc.d/ppp-user added (renamed from ppp),
removable_interfaces variable removed,
bsnmpd updated from 1.11 to 1.12,
pkg_add(1) -P option,
pkg_add(1) and pkg_create(1) -K option,
pkg_create(1) -x, -E, and -G options,
local_startup directory now evaluated by rcorder(8) with
scripts in the base system,
suffix of startup scripts removed,
variables "ldconfig_local_dirs" and "ldconfig_local32_dirs" added,
@cwd in pkg-plist now allows no directory argument, and
CHECKSUM.MD5's checksum in CHECKSUM.MD5 problem fixed.
pru_abort(), pru_detach(), and in_pcbdetach():
- Universally support and enforce the invariant that so_pcb is
never NULL, converting dozens of unnecessary NULL checks into
assertions, and eliminating dozens of unnecessary error handling
cases in protocol code.
- In some cases, eliminate unnecessary pcbinfo locking, as it is no
longer required to ensure so_pcb != NULL. For example, the receive
code no longer requires the pcbinfo lock, and the send code only
requires it if building a new connection on an otherwise unconnected
socket triggered via sendto() with an address. This should
significantly reduce tcbinfo lock contention in the receive and send
cases.
- In order to support the invariant that so_pcb != NULL, it is now
necessary for the TCP code to not discard the tcpcb any time a
connection is dropped, but instead leave the tcpcb until the socket
is shutdown. This case is handled by setting INP_DROPPED, to
substitute for using a NULL so_pcb to indicate that the connection
has been dropped. This requires the inpcb lock, but not the pcbinfo
lock.
- Unlike all other protocols in the tree, TCP may need to retain access
to the socket after the file descriptor has been closed. Set
SS_PROTOREF in tcp_detach() in order to prevent the socket from being
freed, and add a flag, INP_SOCKREF, so that the TCP code knows whether
or not it needs to free the socket when the connection finally does
close. The typical case where this occurs is if close() is called on
a TCP socket before all sent data in the send socket buffer has been
transmitted or acknowledged. If INP_SOCKREF is found when the
connection is dropped, we release the inpcb, tcpcb, and socket instead
of flagging INP_DROPPED.
- Abort and detach protocol switch methods no longer return failures,
nor attempt to free sockets, as the socket layer does this.
- Annotate the existence of a long-standing race in the TCP timer code,
in which timers are stopped but not drained when the socket is freed,
as waiting for drain may lead to deadlocks, or have to occur in a
context where waiting is not permitted. This race has been handled
by testing to see if the tcpcb pointer in the inpcb is NULL (and vice
versa), which is not normally permitted, but may be true if an inpcb
and tcpcb have been freed. Add a counter to test how often this race
has actually occurred, and a large comment for each instance where
we compare potentially freed memory with NULL. This will have to be
fixed in the near future, but requires us to further address how to
handle the timer shutdown issue.
- Several TCP calls no longer potentially free the passed inpcb/tcpcb,
so no longer need to return a pointer to indicate whether the argument
passed in is still valid.
- Un-macroize debugging and locking setup for various protocol switch
methods for TCP, as it led to more obscurity, and as locking becomes
more customized to the methods, offers less benefit.
- Assert copyright on tcp_usrreq.c due to significant modifications that
have been made as part of this work.
These changes significantly modify the memory management and connection
logic of our TCP implementation, and are (as such) High Risk Changes,
and likely to contain serious bugs. Please report problems to the
current@ mailing list ASAP, ideally with simple test cases, and
optionally, packet traces.
MFC after: 3 months
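
The shift from "NULL so_pcb means dropped" to "flag means dropped" can be
sketched in userland roughly as follows. All toy_* names and the flag value
are hypothetical stand-ins, not the kernel definitions, and no locking is
modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; the real kernel structures are far larger. */
#define TOY_INP_DROPPED	0x01	/* hypothetical value; name mirrors INP_DROPPED */

struct toy_inpcb {
	int	inp_flags;
};

struct toy_socket {
	struct toy_inpcb *so_pcb;	/* invariant: never NULL */
};

/* Dropping a connection marks the pcb instead of clearing so_pcb. */
void
toy_drop(struct toy_socket *so)
{
	assert(so->so_pcb != NULL);
	so->so_pcb->inp_flags |= TOY_INP_DROPPED;
}

/* Returns 0 if the connection is usable, -1 if it has been dropped. */
int
toy_send(struct toy_socket *so)
{
	/* Formerly a runtime NULL check with an error return; now an assertion. */
	assert(so->so_pcb != NULL);
	if (so->so_pcb->inp_flags & TOY_INP_DROPPED)
		return (-1);
	return (0);
}
```

In this model the pcb stays reachable after toy_drop(), so later calls can
still inspect it and report the dropped state.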
pru_abort(), pru_detach(), and in_pcbdetach():
- Universally support and enforce the invariant that so_pcb is
never NULL, converting dozens of unnecessary NULL checks into
assertions, and eliminating dozens of unnecessary error handling
cases in protocol code.
- In some cases, eliminate unnecessary pcbinfo locking, as it is no
longer required to ensure so_pcb != NULL. For example, in protocol
shutdown methods, and in raw IP send.
- Abort and detach protocol switch methods no longer return failures,
nor attempt to free sockets, as the socket layer does this.
- Invoke in_pcbfree() after in_pcbdetach() in order to free the
detached in_pcb structure for a socket.
MFC after: 3 months
- in_pcbdetach(), which removes the link between an inpcb and its
socket.
- in_pcbfree(), which frees a detached pcb.
Unlike the previous in_pcbdetach(), neither of these functions will
attempt to conditionally free the socket, as they are responsible only
for managing in_pcb memory. Mirror these changes into in6_pcbdetach()
by breaking it into in6_pcbdetach() and in6_pcbfree().
While here, eliminate undesired checks for NULL inpcb pointers in
sockets, as we will now have as an invariant that sockets will always
have valid so_pcb pointers.
MFC after: 3 months
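
A minimal userland model of the detach/free split, assuming nothing beyond
what the message says: one routine unlinks the pcb from its socket, the other
releases the pcb memory, and neither touches the socket's own lifetime. The
toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_socket;

struct toy_inpcb {
	struct toy_socket *inp_socket;
};

struct toy_socket {
	struct toy_inpcb *so_pcb;
};

/* Allocate a pcb and link it to the socket; returns NULL on failure. */
struct toy_inpcb *
toy_pcballoc(struct toy_socket *so)
{
	struct toy_inpcb *inp;

	inp = calloc(1, sizeof(*inp));
	if (inp == NULL)
		return (NULL);
	inp->inp_socket = so;
	so->so_pcb = inp;
	return (inp);
}

/* Remove the link between the pcb and its socket; frees nothing. */
void
toy_pcbdetach(struct toy_inpcb *inp)
{
	assert(inp->inp_socket != NULL);
	inp->inp_socket->so_pcb = NULL;
	inp->inp_socket = NULL;
}

/* Free a pcb that has already been detached; never frees the socket. */
void
toy_pcbfree(struct toy_inpcb *inp)
{
	assert(inp->inp_socket == NULL);
	free(inp);
}
```

A protocol's detach method in this model would call toy_pcbdetach() followed
by toy_pcbfree(), leaving the socket for the socket layer to release.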
the so_pcb pointer on the socket is always non-NULL. This eliminates
countless unnecessary error checks, replacing them with assertions.
MFC after: 3 months
rather than an error. Detaches do not "fail", they either occur or
the protocol flags SS_PROTOREF to take ownership of the socket.
soclose() no longer looks at so_pcb to see if it's NULL, relying
entirely on the protocol to decide whether it's time to free the
socket or not using SS_PROTOREF. so_pcb is now entirely owned and
managed by the protocol code. Likewise, no longer test so_pcb in
other socket functions, such as soreceive(), which have no business
digging into protocol internals.
Protocol detach routines no longer try to free the socket on detach,
this is performed in the socket code if the protocol permits it.
In rts_detach(), no longer test for rp != NULL in detach, and
likewise in other protocols that don't permit a NULL so_pcb, reduce
the incidence of testing for it during detach.
netinet and netinet6 are not fully updated to this change, which
will be in an upcoming commit. In their current state they may leak
memory or panic.
MFC after: 3 months
than an int, as an error here is not meaningful. Modify soabort() to
unconditionally free the socket on the return of pru_abort(), and
modify most protocols to no longer conditionally free the socket,
since the caller will do this.
This commit likely leaves parts of netinet and netinet6 in a situation
where they may panic or leak memory, as they are not fully
updated by this commit. This will be corrected shortly in followup
commits to these components.
MFC after: 3 months
the file descriptor reference, rather than paying additional lock
operations to acquire a socket reference from the file descriptor.
This will also help to ensure that file descriptor based socket
requests are not delivered to a socket after close. Most consumers
have already been converted to this model.
MFC after: 3 months
be present at this point. We will eventually remove this assert because
the socket layer should never look at so_pcb, but for now it's a useful
debugging tool.
MFC after: 3 months
socket calls relating to the creation and destruction of sockets. This
will eventually form the foundation of socket(9), but is currently in too
much flux to do so.
MFC after: 3 months
There's something strange going on with async events. They seem
to be treated differently for different Fusion implementations.
Some will really tell you when it's okay to free the request that
started them. Some won't. Very disconcerting.
This is particularly bad when the chip (FC in this case) tells you
in the reply that it's not a continuation reply, which means you
can free the request that it's associated with. However, if you do
that, I've found that additional async event replies come back for
that message context after you freed it. Very Bad Things Happen.
Put in a reply register debounce. Warn about out of range context
indices. Use more MPILIB defines where possible. Replace bzero with
memset. Add tons more KASSERTS. Do a *lot* more request free list
auditing and serial number usages. Get rid of the warning about
the short IOC Facts Reply. Go back to 16 bits of context index.
Do a lot more target state auditing as well. Make a tag out
of not only the ioindex but the request index as well and worry
less about keeping a full serial number.
a different register shift and is fed by a different clock than
we use for UltraSPARC hardware. To deal with this, the regshft and
rclk fields in the class structure are removed and bus frontends
now pass the right regshft and rclk to the probe function where
they're put in the BAS and passed in to subordinate drivers.
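
As an illustration of why the two values travel with the bus access state,
here is a small sketch. It assumes a 16550-style UART that oversamples by 16,
and the struct and function names are hypothetical; only the regshft and rclk
field names come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down bus access state. */
struct toy_bas {
	unsigned int regshft;	/* left shift applied to register numbers */
	unsigned int rclk;	/* input clock in Hz */
};

/* Byte offset of register 'reg' given this frontend's register spacing. */
uint32_t
toy_regofs(const struct toy_bas *bas, unsigned int reg)
{
	return ((uint32_t)reg << bas->regshft);
}

/*
 * Baud rate divisor, assuming a UART that oversamples by 16.
 * Returns 0 for an invalid baud rate.
 */
uint32_t
toy_divisor(const struct toy_bas *bas, unsigned int baud)
{
	if (baud == 0)
		return (0);
	return (bas->rclk / (baud * 16));
}
```

Two frontends with different register spacing and clocks then yield different
offsets and divisors from the same driver code, which is the reason for
filling in both fields per bus rather than per class.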
this is used by some 3rd party applications when {e,f,g}cvt() are
not found. POSIX defines the xcvt() functions but says they are
deprecated in favor of sprintf(). We'll import these functions
from OpenBSD and remove __gdtoa() from the exported interfaces
when libc version is bumped.
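
For applications that can be changed, the sprintf()-family replacement looks
roughly like this. The helper names are hypothetical, and the helpers are not
drop-in equivalents of fcvt() or ecvt(): they return a formatted string with
the decimal point and sign in place instead of a bare digit string.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Fixed-point formatting with 'ndigits' digits after the decimal point.
 * Returns the string length, or -1 if the buffer is too small.
 */
int
format_fixed(char *buf, size_t len, double value, int ndigits)
{
	int n;

	assert(buf != NULL && ndigits >= 0);
	n = snprintf(buf, len, "%.*f", ndigits, value);
	return ((n < 0 || (size_t)n >= len) ? -1 : n);
}

/*
 * Exponent-style formatting with 'ndigits' digits after the decimal point.
 * Returns the string length, or -1 if the buffer is too small.
 */
int
format_sci(char *buf, size_t len, double value, int ndigits)
{
	int n;

	assert(buf != NULL && ndigits >= 0);
	n = snprintf(buf, len, "%.*e", ndigits, value);
	return ((n < 0 || (size_t)n >= len) ? -1 : n);
}
```

Using snprintf() with an explicit length also avoids the static result
buffer that the xcvt() family relies on.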
vnode after vflush() has succeeded. This would cause a dangling vnode
panic at unmount time otherwise. Other filesystems may have this problem
via their VFS_VGET() routines.
Found by: kris
Sponsored by: Isilon Systems, Inc.
--------------------
- Seal the fate of long standing memory leak (4 years, 7 months) during
pcm_unregister(). While destroying cdevs, scan / detect possible
children and free its SLIST placeholder properly.
- Optimize channel allocation / numbering even further. Do brute cyclic
checking only if the channel numbering is screwed up.
- Mega vchan create/destroy cleanup:
o Implement pcm_setvchans() so everybody can use it freely instead
of implementing their own, be it through sysctl or channel auto
allocation.
o Increase vchan creation/destruction resiliency:
+ it's possible to increase/decrease total vchans even during
busy playback/recording. Busy channel will be left alone, untouched.
Abusive test sample:
# play whatever...
#
while : ; do
	sysctl hw.snd.pcm0.vchans=1
	sysctl hw.snd.pcm0.vchans=10
	sysctl hw.snd.pcm0.vchans=100
	sysctl hw.snd.pcm0.vchans=200
done
# Play something else, leave above loop running frantically.
+ Seal another 4-year-old bug where it is possible to destroy a (virtual)
channel even when its cdevs are being referenced by another process.
The "First Come First Served" nature of dsp_clone() is the main
culprit of this issue, and it usually manifests itself as a dangling
channel <-> process association. Ensure that all of its cdevs
are free from being referenced before destroying it (through the
ORPHAN_CDEVT() macro).
All these fixes (including previous fixes) will be MFCed, later.