After disscussing things I have decided to take the easy and
consistent 90% solution instead of aiming for the very involved 99%
solution.
If we allow forceful unmounts of DEVFS we need to decide how to handle
the devices which are in use through this filesystem at the time.
We cannot just readopt the open devices in the main /dev instance since
that would open us to security issues.
For the majority of the devices, this is relatively straightforward
as we can just pretend they got revoke(2)'ed.
Some devices get tricky: /dev/console and /dev/tty for instance
does a sort of recursive open of the real console device. Other devices
may be mmap'ed (kill the processes ?).
And then there are disk devices which are mounted.
The correct thing here would be to recursively unmount the filesystems
mounte from devices from our DEVFS instance (forcefully) and if
this succeeds, complete the forcefully unmount of DEVFS. But if
one of the forceful unmounts fail we cannot complete the forceful
unmount of DEVFS, but we are likely to already have severed a lot
of stuff in the process of trying.
Event attempting this would be a lot of code for a very far out
corner-case which most people would never see or get in touch with.
It's just not worth it.
3C920B-EMB-WNM Integrated Fast Ethernet Controller
Submitter reports that the card appears to autonegotiate properly, and
operate well with high levels of NFS traffic.
PR: 75253
Submitted by: "Oleg V. Nauman" <oleg at reis dot zp dot ua>
MFC after: 2 weeks
only one set is needed for either endianess. This also completes them for
big endian archs and fixes the compilation of libncp, netncp, etc. there.
Reviewed by: bp, rwatson
Compile tested on: i386, sparc64
MFC after: 1 week
o Move the sysctls under debug.psm.* and hw.psm.* making them a bit
clearer and more consistent with other drivers.
o Remove the debug.psm_soft_timeout sysctl. It was introduced many
moons ago in r1.64 but never referenced anywhere.
o Introduce hw.psm.tap_threshold and hw.psm.tap_timeout to control
the behaviour of taps on touchpads. People might like to fiddle
with these if tapping seems to slow or too fast for them.
o Add debug.psm.loglevel as a tunable so that verbosity can be set
easily at boot-time (to watch probes and such) without having to
compile a kernel with options PSM_DEBUG=N.
that the RFC 793 specification for accepting RST packets should be
following. When followed, this makes one vulnerable to the attacks
described in "slipping in the window", but it may be necessary in
some odd circumstances.
spx_reass() to increase atomicity across multiple operations on the
socket buffer when iterating over the SPX fragment reassembly list
for the ipxpcb, as well a to reduce the number of locking operations.
record loop for ACK'd data, rather than relying on lokcing in
sbdroprecord() and sowwakeup(), reducing the number of lock operations
as well as eliminating a possible race against the head of the send
buffer mbuf chain. Use the _locked variants of sbdroprecord() and
sowwakeup().
the peer address by using M_WAITOK in ipx_setpeeraddr() to prevent
allocation failure. The socket reference used to reach these calls
will prevent the ipxpcb from being released prematurely.
properly handle the case where a connection is disconnected. The
queue(9)-enabled version of this code broke from the inner but not
outer loop, and so potentially frobbed an ipxpcb flag after the ipxpcb
was free'd, which might be picked up later by the malloc debugging
code. Properly break from the loop context and avoid touching the
cb/ipxpcb after free.
connection rates, which is causing problems for some users.
To retain the security advantage of random ports and ensure
correct operation for high connection rate users, disable
port randomization during periods of high connection rates.
Whenever the connection rate exceeds randomcps (10 by default),
randomization will be disabled for randomtime (45 by default)
seconds. These thresholds may be tuned via sysctl.
Many thanks to Igor Sysoev, who proved the necessity of this
change and tested many preliminary versions of the patch.
MFC After: 20 seconds
expects a locked route reference. This removes a panic that occurs
when connected ipxpcb is closed and its route free'd, and may have been
present since the route locking took place.
MFC after: 2 weeks
o implement double-extended and single precision loads and stores,
o implement double precision stores,
o replace the machdep.unaligned_print sysctl with debug.unaligned_print
and change the default value to 0,
o replace the machdep.unaligned_sigbus sysctl with debug.unaligned_test,
o Remmove the fillfd() function. The function is trvial enough for
inline assembly.
The debug.unaligned_test sysctl is used to test the emulation of
misaligned loads and stores. When PSR.ac is 0, the CPU will handle
misaligned memory accesses itselfi and we don't get an exception
for it. When PSR.ac is 1, the process needs to be signalled and we
should not emulate. The sysctl takes effect when PSR.ac is 1 and
tells us that we should emulate and not send a signal.
PR: 72268
MFC after: 1 week
function provided by the driver limits allocations to the page size,
i.e. 4KB on i385 and 8KB on typical 64 bit processors. Since amd64
has 64 bit pointers, but only 4KB pages, an array of pointers that
just fits into one page on all the other processors, does require
2 pages on amd64.
In order to make this driver useful on amd64, the allocation unit
has been increased to 2 pages on amd64 and contigmalloc() is used
instead of malloc(). All other processor types are unaffected by
this change. This modification has only been compile-tested on
amd64, yet, but should just work (FLW).
machine; instead use the intended entry points. There's still
too much incestuous knowledge about the internals of the
802.11 layer but this at least fixes adhoc mode.
o ic_inact_auth is a bad name, it's the inactivity threshold
for being associated but not authorized; use it that way
o reset ni_inact when switching inactivity thresholds to
minimize the race against the timer (don't want to lock
for this stuff)
o change the inactivity probe threshold from a one-shot to
cover a range: when below this threshold but not expired
send a probe each inactivity interval; should probably
guard against the interval being turned way down as this
could cause us to spam the net with probes
we're at it:
o WPA/802.11i has a unicast key and a group key; in station mode
everything is sent with the unicast key--we were consulting the
destination mac address and incorrectly using the group key
o (perpetuate fallback use of the default tx key to maintain
compatibility with the way wpa_supplicant works)
o correct EAPOL encryption logic to check unicast key instead
of assuming other state implies this
o move QoS encapsulation up to before enmic work so TKIP has the
information required to calculate the pseudo-header
o do not do QoS-encapsulation of EAPOL frames as some ap's do the
wrong thing with such frames (may need to revisit this if ap's
start dropping non-QoS frames from stations assoc'd with QoS)
o move ieee80211_mbuf_adjust closer to its caller
unloaded, cleanup, or return ebusy of that's inconvenient.' The
default module hanlder for newbus will now call this when we get a
MOD_QUIESCE event, but in the future may call this at other times.
This shouldn't change any actual behavior until drivers start to use it.
suggested by Peter Edwards. This seems to fix my fxp problems and
likely will fix his as well. Use DELAY rather than *sleep because we
can be called from any context.
o catch one place where we were not using ath_chan_change to
switch channels; this fixes a problem where the channel
settings were not being correctly reported in captured packets
o return unique channel identification in the channel flags;
ethereal gets confused if you return merged flags (e.g. ofdm,
cck, and 2Ghz) (this is workaround and should be removed if
we can ever cleanup radiotap consumers)
o correct short/long preamble flag state for rx and treat tx
the same--use a new hwflags array that gives us the data
based on the h/w rate index/cookie
o add gross hack to handle radiotap capture of frames that
come in with hardware padding; should be replaced by a
flag in the radiotap header and more smarts in the apps
that decode radiotap data
o lintval is in ms; must convert to TU's for passing to the hal
o roundup to calculate nexttbtt (should look at current tsf and pull the
calculated nextbtt forward but this'll do for now)
o don't or- in HAL_BEACON_RESET_TSF when doing station timer setup; this
is not needed and messes up the sleep timer calcs, though it's unclear
if it mattered as the hal masks these values before use
Submitted by: Thorsten von Eicken
the netperf branch but for some reason didn't trigger a build failure
locally when I merged to CVS and omitted it. Presumably driver error.
Pointed out by: cperciva, tinderbox
schedulers a bit to ensure more correct handling of priorities and fewer
priority inversions:
- Add two functions to the sched(9) API to handle priority lending:
sched_lend_prio() and sched_unlend_prio(). The turnstile code uses these
functions to ask the scheduler to lend a thread a set priority and to
tell the scheduler when it thinks it is ok for a thread to stop borrowing
priority. The unlend case is slightly complex in that the turnstile code
tells the scheduler what the minimum priority of the thread needs to be
to satisfy the requirements of any other threads blocked on locks owned
by the thread in question. The scheduler then decides where the thread
can go back to normal mode (if it's normal priority is high enough to
satisfy the pending lock requests) or it it should continue to use the
priority specified to the sched_unlend_prio() call. This involves adding
a new per-thread flag TDF_BORROWING that replaces the ULE-only kse flag
for priority elevation.
- Schedulers now refuse to lower the priority of a thread that is currently
borrowing another therad's priority.
- If a scheduler changes the priority of a thread that is currently sitting
on a turnstile, it will call a new function turnstile_adjust() to inform
the turnstile code of the change. This function resorts the thread on
the priority list of the turnstile if needed, and if the thread ends up
at the head of the list (due to having the highest priority) and its
priority was raised, then it will propagate that new priority to the
owner of the lock it is blocked on.
Some additional fixes specific to the 4BSD scheduler include:
- Common code for updating the priority of a thread when the user priority
of its associated kse group has been consolidated in a new static
function resetpriority_thread(). One change to this function is that
it will now only adjust the priority of a thread if it already has a
time sharing priority, thus preserving any boosts from a tsleep() until
the thread returns to userland. Also, resetpriority() no longer calls
maybe_resched() on each thread in the group. Instead, the code calling
resetpriority() is responsible for calling resetpriority_thread() on
any threads that need to be updated.
- schedcpu() now uses resetpriority_thread() instead of just calling
sched_prio() directly after it updates a kse group's user priority.
- sched_clock() now uses resetpriority_thread() rather than writing
directly to td_priority.
- sched_nice() now updates all the priorities of the threads after the
group priority has been adjusted.
Discussed with: bde
Reviewed by: ups, jeffr
Tested on: 4bsd, ule
Tested on: i386, alpha, sparc64
- ipx_pcbnotify(), which is never called.
- ipx_rtchange(), which is never called, is incomplete inplemented, and
also #ifdef notdef.
- spx_fixmtu(), which is never called, is incompletely implemented, and
also #ifdef notdef.
other comments. Clarify that the next two things needed for SMP are
two lines.
- Expand mii abbreviation to miibus for clarity in the USB ethernet
comment.
is the case for most other sysctls in the System V IPC message queue
implementation.
PR: 75541
Submitted by: Sergiy Vyshnevetskiy <serg at vostok dot net>
MFC after: 2 weeks
generic bridge support was biting us more than it helped, whenever a new chipset
came out from a vendor and misprogramming it caused strange hangs or corruption.
[2] Add a large number of PCI IDs based on what the linux drivers support.
Note that the new PCI IDs haven't been tested, they're just *likely* to work.
In particular the VIA AGP 8x chipsets are concerning, due to lack of testing,
possible issues (kern/69953), and not having a nice "does this bridge say it
would do 8x" function. However, this shouldn't make the situation worse, since
these chips would have probed in the past anyway.
more general than previous. It also lets me implement cancelable point
in thread library. Also in theory, umtx_lock and umtx_unlock can
be implemented by using umtx_wait and umtx_wake, all atomic operations
can be done in userland without kernel's casuptr() function.
to remove a transaction from the async schedule. The previous method didn't
work well and led to the hardware writing to free'd buffers etc, as
it didn't always know that the transaction had been aborted.
Written after consultation with David Brownell who wrote the Linux
EHCI driver.
As part of this give the sqh structure a "previous" pointer.
MFC after: 1 week
rather than a softc pointer (with the bus structure at the start).
This is a non-functional change. It just helps when reading the code to
know that the ehci, ohci and uhci drivers share the bus structure, not the
entire softc.
of lock types in the kernel. This results in an increase of witness
data usage from ~145k to ~280k on i386 for kernels with
'options WITNESS'.
- Remove the unused witness malloc bucket.
Submitted by: Michal Mertl mime at traveller dot cz (1)
This allows boot to proceed on a real system until the issue
of calling back into certain OpenFirmware calls (e.g. finddevice)
in thread context is understood.
(this commit only affects psim users, of which I think I am the
only one...)
Silence on: net@, current@, hackers@.
No objections: joerg
Requested by: by many (mostly Cronyx) users for a long long time.
MFC after: 10 days
PR: kern/21771, kern/66348
show file name for 'mdconfig -l -u <x>' command.
This allows to preserve API/ABI compatibility with version 0 (that's why
I changed version number back to 0) and will allow to merge this change
to RELENG_5.
MFC after: 5 days
system from an AP at runtime (i.e., calling cpu_reset from ddb). Someday,
if we move to an NMI for stopping cpus instead, we can do away with this.
Requested by: jhb
- Remove the sched_add wrapper that used sched_add_internal() as a backend.
Its only purpose was to interpret one flag and turn it into an int. Do
the right thing and interpret the flag in sched_add() instead.
- Pass the flag argument to sched_add() to kseq_runq_add() so that we can
get the SRQ_PREEMPT optimization too.
- Add a KEF_INTERNAL flag. If KEF_INTERNAL is set we don't adjust the SLOT
counts, otherwise the slot counts are adjusted as soon as we enter
sched_add() or sched_rem() rather than when the thread is actually placed
on the run queue. This greatly simplifies the handling of slots.
- Remove the explicit prevention of migration for ithreads on non-x86
platforms. This was never shown to have any real benefit.
- Remove the unused class argument to KSE_CAN_MIGRATE().
- Add ktr points for thread migration events.
- Fix a long standing bug on platforms which don't initialize the cpu
topology. The ksg_maxid variable was never correctly set on these
platforms which caused the long term load balancer to never inspect
more than the first group or processor.
- Fix another bug which prevented the long term load balancer from working
properly. If stathz != hz we can't expect sched_clock() to be called
on the exact tick count that we're anticipating.
- Rearrange sched_switch() a bit to reduce indentation levels.
and threads currently holding sleep mutexes (and spin mutexes for
curthread). This can be quite useful in looking for a lock condition
summary for a system, as it avoids manually iterating through threads
and processes to find all the interesting locks.
NB: "alllocks" is up there with "lockedvnods" for a bad argument for
show.
MFC after: 2 weeks
ADVANCELOGIC->AVANCELOGIC (nothing in the tree uses it, so safe to do)
sort HAGIWARA vendor entry
sort ACTIONTAR vendor entry
Minor change to SYSTEMTALKS vendor entry.
lower the priority of the returning thread to a user priority before
calling into thread_userret() which would call wakeup() which in turn would
cause the returning thread to eventually context switch rather than
completing its slice. Allowing this thread to complete its slice first
yields a 15% performance improvement in super-smack on my dual opteron with
4BSD.
Add $NetBSD$ in a comment at the top
Update copyright dates
Update header comment
Add some of the entries not present in FreeBSD's usbdevs file
Harmonize some descriptions with NetBSD where NetBSD's were shorter
More work needs to happen here, as there's many conflicting vendor
names. There's also more harmonization that can happen before that
problem is tackled.
This was inspired by recent discussions, but none of the patches
posted were consulted to produce this commit. Other, similar ones
will follow.
cases for tcp_input():
While it is true that the pcbinfo lock provides a pseudo-reference to
inpcbs, both the inpcb and pcbinfo locks are required to free an
un-referenced inpcb. As such, we can release the pcbinfo lock as
long as the inpcb remains locked with the confidence that it will not
be garbage-collected. This leads to a less conservative locking
strategy that should reduce contention on the TCP pcbinfo lock.
Discussed with: sam
After this change, when component is disconnected because of an I/O error,
it will not be connected and synchronized automatically, it will be logged
as broken and skipped. Autosynchronization can occur, when component is
disconnected (on orphan event) and connected again - there were no I/O
error, so there is no need to not connected the component, but when there were
writes while it wasn't connected, it will be synchronized.
This fix cases, when component is disconnected because of I/O error and can be
connected again and again.
- Bump version number.
- Implement backward compatibility mechanism. After this change when metadata in
old version is detected, it is automatically upgraded to the new (current)
version.
One of a set of patches submitted by Kazuhito HONDA
to make the usb audio driver a lot more capable.
PR: 75274
Submitted by: Kazuhito HONDA (kazuhito at ph dot noda dot tus dot ac dot jp)
Obtained from: NetBSD (indirectly)
MFC after: 2 weeks
flag and busy field with the global page queues lock to synchronizing their
access with the containing object's lock. Specifically, acquire the
containing object's lock before reading the page's PG_BUSY flag and busy
field in vm_fault().
Reviewed by: tegge@
though these aren't used yet.
- Add missing function prototypes for some static functions.
- Allow lvt_mode() to handle an LVT entry with a delivery mode of fixed.
- Consolidate code duplicated in lapic_init() and lapic_setup() to program
the spurious vector register of a local APIC in a static lapic_enable()
function.
- Dump the timer, thermal, error, and performance counter LVT entries
during lapic_dump().
- Program LVT pins (currently only LINT0 and LINT1) after the local
APIC has been software enabled via lapic_enable() since otherwise the
LVT programming will not be able to unmask LVT sources.
on entry and it assumes the responsibility for releasing the page queues
lock if it must sleep.
Remove a bogus comment from pmap_enter_quick().
Using the first change, modify vm_map_pmap_enter() so that the page queues
lock is acquired and released once, rather than each time that a page
is mapped.
Currently this is only used to initiailize the TPR to 0 during initial
setup.
- Reallocate vectors for the local APIC timer, error, and thermal LVT
entries. The timer entry is allocated from the top of the I/O interrupt
range reducing the number of vectors available for hardware interrupts
to 191. Linux happens to use the same exact vector for its timer
interrupt as well. If the timer vector shared the same priority queue
as the IPI handlers, then the frequency that the timer vector will
eventually be firing at can interact badly with the IPIs resulting in
the queue filling and the dreaded IPI stuck panics, hence it being located
at the top of the previous priority queue instead.
- Fixup various minor nits in comments.
multiple MIB entries using sysctl in short order, which might
result in unexpected values for tcp_maxidle being generated by
tcp_slowtimo. In practice, this will not happen, or at least,
doesn't require an explicit comment.
MFC after: 2 weeks
substitute for a global mutex protecting the socket count and
generation number.
The observation that soreceive_rcvoob() can't return an mbuf
chain is a property, not a bug, so remove the XXXRW.
In sorflush, s/existing/previous/ for code when describing prior
behavior.
For SO_LINGER socket option retrieval, remove an XXXRW about why
we hold the mutex: this is correct and not dubious.
MFC after: 2 weeks
unnecessary use of a global variable and simplify the return case.
While here, use ()'s around return values.
In sodealloc(), remove a comment about why we bump the gencnt and
decrement the socket count separately. It doesn't add
substantially to the reading, and clutters the function.
MFC after: 2 weeks
After this change, when component is disconnected because of an I/O error,
it will not be connected and synchronized automatically, it will be logged
as broken and skipped. Autosynchronization can occur, when component is
disconnected (on orphan event) and connected again - there were no I/O
error, so there is no need to not connected the component, but when there were
writes while it wasn't connected, it will be synchronized.
This fix cases, when component is disconnected because of I/O error and can be
connected again and again.
- Bump version number.
- Add version change history.
- Implement backward compatibility mechanism. After this change when metadata in
old version is detected, it is automatically upgraded to the new (current)
version.
occur between a reader and a writer that results in a panic upon close,
e.g.,
"panic: sbflush_locked: cc 4 || mb 0xffffff0052afa400 || mbcnt 0"
Reviewed by: rwatson@
MFC after: 2 weeks
methods:
Read can see O_NONBLOCK and O_DIRECT.
Write can see O_NONBLOCK, O_DIRECT and O_FSYNC.
In addition O_DIRECT is shadowed as IO_DIRECT for now for backwards
compatibility.
fcntl.h.
This is in preparation for making the flags passed to device drivers be
consistently from fcntl.h for all entrypoints.
Today open, close and ioctl uses fcntl.h flags, while read and write
uses vnode.h flags.
while doing g_(read|write)_data() (e.g. BSD). This can cause a deadlock
in MIRROR class. Not sure if this is safe to drop the topology lock in BSD
class, so change the code in MIRROR class to avoid this deadlock.
GigE controller, so handle this.
- Use the outbound window 0 if the PCI mem requested is in its range, instead
of inconditionally use the outbound window 1.
This should be enough to get FreeBSD/arm to work on the IQ80321 board as well.
Reported and tested by: Jia-Shiun Li <jiashiun at gmail dot com>