every iteration of aac_startio(). This ensures that a command that is
deferred for lack of resources doesn't immediately get retried in the
aac_startio() loop. This avoids an almost certain livelock.
the socket is on an accept queue of a listen socket. This change
renames the flags to SQ_COMP and SQ_INCOMP, and moves them to a new
state field on the socket, so_qstate, as the locking for these flags
is substantially different for the locking on the remainder of the
flags in so_state.
It doesn't take 'align' and 'flags' but 'master' instead, which is
a reference to the Master Zone, containing the backing Keg.
Pointed out by: Tim Robbins (tjr)
them to behave the same as if the SS_NBIO socket flag had been set
for this call. The SS_NBIO flag for ordinary sockets is set by
fcntl(fd, F_SETFL, O_NONBLOCK).
Pass the MSG_NBIO flag to the soreceive() and sosend() calls in
fifo_read() and fifo_write() instead of frobbing the SS_NBIO flag
on the underlying socket for each I/O operation. The O_NONBLOCK
flag is a property of the descriptor, and unlike ordinary sockets,
fifos may be referenced by multiple descriptors.
mbuma is an Mbuf & Cluster allocator built on top of a number of
extensions to the UMA framework, all included herein.
Extensions to UMA worth noting:
- Better layering between slab <-> zone caches; introduce
Keg structure which splits off slab cache away from the
zone structure and allows multiple zones to be stacked
on top of a single Keg (single type of slab cache);
perhaps we should look into defining a subset API on
top of the Keg for special use by malloc(9),
for example.
- UMA_ZONE_REFCNT zones can now be added, and reference
counters automagically allocated for them within the end
of the associated slab structures. uma_find_refcnt()
does a kextract to fetch the slab struct reference from
the underlying page, and lookup the corresponding refcnt.
mbuma things worth noting:
- integrates mbuf & cluster allocations with extended UMA
and provides caches for commonly-allocated items; defines
several zones (two primary, one secondary) and two kegs.
- change up certain code paths that always used to do:
m_get() + m_clget() to instead just use m_getcl() and
try to take advantage of the newly defined secondary
Packet zone.
- netstat(1) and systat(1) quickly hacked up to do basic
stat reporting but additional stats work needs to be
done once some other details within UMA have been taken
care of and it becomes clearer to how stats will work
within the modified framework.
From the user perspective, one implication is that the
NMBCLUSTERS compile-time option is no longer used. The
maximum number of clusters is still capped off according
to maxusers, but it can be made unlimited by setting
the kern.ipc.nmbclusters boot-time tunable to zero.
Work should be done to write an appropriate sysctl
handler allowing dynamic tuning of kern.ipc.nmbclusters
at runtime.
Additional things worth noting/known issues (READ):
- One report of 'ips' (ServeRAID) driver acting really
slow in conjunction with mbuma. Need more data.
Latest report is that ips is equally sucking with
and without mbuma.
- Giant leak in NFS code sometimes occurs, can't
reproduce but currently analyzing; brueffer is
able to reproduce but THIS IS NOT an mbuma-specific
problem and currently occurs even WITHOUT mbuma.
- Issues in network locking: there is at least one
code path in the rip code where one or more locks
are acquired and we end up in m_prepend() with
M_WAITOK, which causes WITNESS to whine from within
UMA. Current temporary solution: force all UMA
allocations to be M_NOWAIT from within UMA for now
to avoid deadlocks unless WITNESS is defined and we
can determine with certainty that we're not holding
any locks when we're M_WAITOK.
- I've seen at least one weird socketbuffer empty-but-
mbuf-still-attached panic. I don't believe this
to be related to mbuma but please keep your eyes
open, turn on debugging, and capture crash dumps.
This change removes more code than it adds.
A paper is available detailing the change and considering
various performance issues, it was presented at BSDCan2004:
http://www.unixdaemons.com/~bmilekic/netbuf_bmilekic.pdf
Please read the paper for Future Work and implementation
details, as well as credits.
Testing and Debugging:
rwatson,
brueffer,
Ketrien I. Saihr-Kesenchedra,
...
Reviewed by: Lots of people (for different parts)
Add two additional pairs of assertions, one at the end of the NFS
server event loop, and one one exit from the NFS daemon, that
assert that if debug.mpsafenet is enabled, Giant is not held, and
that if it is not enabled, Giant will be held. This is intended
to support debugging scenarios where Giant is "leaked" during NFS
processing.
my Elektor card. Note that the hints are necessary to specify the
IO base of the pcf chip. This enables to check the IO base when the
probe routine is called during ISA enumeration.
The interrupt driven code is mixed with polled mode, which is wrong
and produces supposed spurious interrupts at each access. I still have
to work on it.
install nfssvc(). It also updates the argument count, but did so
without setting SYF_MPSAFE, effectively removing the MPSAFE flag even
when syscalls.master indicates it doesn't require Giant. This change
forces the modevent to set MPSAFE as a flag to its internal notion of
an argument coutn.
Note: this duplication of information is a bad thing, but is a more
general problem I'm not currently willing to address.
kernel). No other sys/*.h file requires machine/foo.h to be included
before it. In addition, all the files that include rman.h would need
to include those two anyway. From these two perspectives, it is
traditional to include things like this.
This lets us stop treating sys/rman.h specially in every bus frontend
file.
a vnode. Not bumped into with asserts in the main tree because we
run the NFS server with Giant by default. Discovered by inspection.
Complete annotations of Giant acquisition/release to note that it's
only because of VFS that we acquire Giant in most places in the NFS
server.
long time, i.e., since the cleanup of the VM Page-queues code done two
years ago.
Reviewed by: Alan Cox <alc at freebsd.org>,
Matthew Dillon <dillon at backplane.com>
This is part 2/2 of fixing autonegotiation on hme(4) using DP83840A PHYs.
It appears to also fix the occasional problems to establish a link on
hme(4) using LU6612 PHYs and shouldn't hurt on those using QS6612 PHYs.
Obtained from: NetBSD
properly. This causes the autonegotiation to e.g. never establish a
100baseTX full-duplex link. The solution to this problem is to manually
write the capabilities from the BMSR to the ANAR every time a media
change occurs, even when already in autonegotiation mode.
The NetBSD way of doing this is to set their MIIF_FORCEANEG flag in the
NIC driver. This causes mii_phy_setmedia() to call mii_phy_auto() (which
will set the ANAR according to the BMSR) even when the PHY alread is in
autonegotiation mode. However, while doing the same on FreeBSD (which
involves porting the MIIF_FORCEANEG flag and converting nsphy.c to use
mii_phy_setmedia()) fixes autonegotiation, using mii_phy_setmedia()
causes this driver to no longer work properly in the other modes.
Another drawback of that approach is that this will also force writing
the ANAR on other PHYs whose drivers use mii_phy_setmedia() and which
are used with a NIC whose driver sets MIIF_FORCEANEG (e.g. hme(4) is
known to be used together with 3 different PHYs while only the DP83840A
require this workaround).
So instead of moving to MIIF_FORCEANEG, just call mii_phy_auto() in
nsphy_service() unconditionally when hanging off of a hme(4) and serving
a media change
This is part 1/2 of fixing autonegotiation on hme(4) using DP83840A PHYs.