I benchmarked this by simultaneously extracting 4 large tarballs (basically
world images) on a 4-processor AMD64 system, in a malloc-backed md.
With this patch, system time was reduced by 43%, and wall clock time by 33%.
Submitted by: jeff
MFC after: 1 week
cd9660_lookup() that was used to fix an actual race in ufs_lookup.c:1.78.
This is not currently a hazard, but the bug would be activated by
marking cd9660 as MPSAFE.
Requested by: bde
ext2_lookup() that was used to fix an actual race in ufs_lookup.c:1.78.
This is not currently a hazard, but the bug would be activated by
marking ext2fs as MPSAFE.
Requested by: bde
MFC after: 2 weeks
to the 100/1000 BCM5400 phy. This fixes the problem with
the GEM port not syncing up on Sawtooth G4's.
Obtained from: NetBSD
Reported by: Ben Rosengart <ben + freebsd org at narcissus net>
so we are ready for mpsafevfs=1 by default on sparc64 too. I have been
running this on all my sparc64 machines for over 6 months, and have not
encountered MD problems.
MFC after: 1 week
the kernel by wrapping all targets for fake opt_*.h files in
.if defined(KERNBUILDDIR). Thus, such fake files won't be
created at all if modules are built with the kernel.
Some modules undergo cleanup like removing unused or unneeded
options or .h files, without which they wouldn't build this way
or the other.
Reviewed by: ru
Tested by: no binary changes in modules built alone
Tested on: i386 sparc64 amd64
provided in the kernel build directory, fix modules that were
failing to build this way due to not quite correct kernel option
usage. In particular:
ng_mppc.c uses two complementary options, both of which are listed
in sys/conf/files. Ideally, there should be a separate option for
including ng_mppc.c in kernel build, but now only
NETGRAPH_MPPC_ENCRYPTION is usable anyway, the other one requires
proprietary files.
nwfs and smbfs were trying to ensure they were built with proper
network components, but the check was rather questionable.
Discussed with: ru
- Add newer CPUID definitions for future use.
Many thanks to Mike Tancsa <mike at sentex dot net> for providing test
cases for Intel Pentium D and AMD Athlon 64 X2.
Approved by: anholt (mentor)
case by saving the value of dp->i_ino before unlocking the vnode
for the current directory and passing the saved value to VFS_VGET().
Without this change, another thread can overwrite dp->i_ino after
the current directory is unlocked, causing ufs_lookup() to lock
and return the wrong vnode in place of the vnode for its parent
directory. A deadlock can occur if dp->i_ino was changed to a
subdirectory of the current directory because the root to leaf vnode
lock ordering will be violated. A vnode lock can be leaked if
dp->i_ino was changed to point to the current directory, which
causes the current vnode lock for the current directory to be
recursed, which confuses lookup() into calling vrele() when it
should be calling vput().
The probability of this bug being triggered seems to be quite low
unless the sysctl variable debug.vfscache is set to 0.
Reviewed by: jhb
MFC after: 2 weeks
amd64, and is a factor of 3 less than the value previously auto-sized on
a 12GB machine, which would cause an overflow in calculations involving the
maxbcache int, causing bufinit() to loop forever at boot.
Reviewed by: mlaier, peter
correspond to the commit log. It changed the maxswzone and maxbcache
parameters from int to long, without changing the extern definitions
in <sys/buf.h>.
In fact it's a good thing it did not, because other parts of the system
are not yet ready for this, and on large-memory sparc machines it causes
severe filesystem damage if you try.
The worst effect of the change was that the tunables controlling the
above variables stopped working. These were necessary to allow such
large sparc64 machines (with >12GB RAM) to boot, since sparc64 did not
set a hard-coded upper limit on these parameters and they ended
up overflowing an int, causing an infinite loop at boot in bufinit().
Reviewed by: mlaier
cards and teach the re(4) driver to attach to revision 3 cards.
Submitted by: Fredrik Lindberg fli+freebsd-current at shapeshifter dot se
MFC after: 2 weeks
Reviewed by: imp, mdodd
originally wrote it for 4.x and hasn't really had the time to fully update
it to 5.x and later. Also, the author doesn't use the hardware anymore as
well. If someone does need this driver they can always resurrect it from
the Attic.
Requested by: Frank Mayhar frank at exit dot com
the modified memory rather than using register operands that held a pointer
to the memory. The biggest effect is that we now correctly tell the
compiler that these functions change the memory that these functions
modify.
Reviewed by: cognet
do not support the GETINFO immediate command, unlike just about every other
variant of the hardware. Also document some magic values and fix some minor
nearby whitespace.
MFC After: 3 days
changes in MD code are trivial, before this change, trapsignal and
sendsig use discrete parameters, now they uses member fields of
ksiginfo_t structure. For sendsig, this change allows us to pass
POSIX realtime signal value to user code.
2. Remove cpu_thread_siginfo, it is no longer needed because we now always
generate ksiginfo_t data and feed it to libpthread.
3. Add p_sigqueue to proc structure to hold shared signals which were
blocked by all threads in the proc.
4. Add td_sigqueue to thread structure to hold all signals delivered to
thread.
5. i386 and amd64 now return POSIX standard si_code, other arches will
be fixed.
6. In this sigqueue implementation, pending signal set is kept as before,
an extra siginfo list holds additional siginfo_t data for signals.
kernel code uses psignal() still behavior as before, it won't be failed
even under memory pressure, only exception is when deleting a signal,
we should call sigqueue_delete to remove signal from sigqueue but
not SIGDELSET. Current there is no kernel code will deliver a signal
with additional data, so kernel should be as stable as before,
a ksiginfo can carry more information, for example, allow signal to
be delivered but throw away siginfo data if memory is not enough.
SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
not be caught or masked.
The sigqueue() syscall allows user code to queue a signal to target
process, if resource is unavailable, EAGAIN will be returned as
specification said.
Just before thread exits, signal queue memory will be freed by
sigqueue_flush.
Current, all signals are allowed to be queued, not only realtime signals.
Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
The receive function em_process_receive_interrupts() unlocks the
adapter while ether_input() processes the packet, and then locks
it back. In the meantime, em_init() may be called, either from
em_watchdog() from softclock interrupt or from the ifconfig(8)
program. The em_init() resets the card, in particular it sets
adapter->next_rx_desc_to_check to 0 and resets hardware RX Head
and Tail descriptor pointers. The loop in
em_process_receive_interrupts() does not expect these things to
change, and a mess may result.
This fixes long wedges of em(4) interfaces receive part under high
load and IP fastforwarding enabled.
PR: kern/87418
Submitted by: Dmitrij Tejblum <tejblum yandex-team.ru>
that the following funtions are not used, wrap in '#ifdef noused' for the
moment.
bstp_enable_change_detection
bstp_disable_change_detection
bstp_set_bridge_priority
bstp_set_port_priority
bstp_set_path_cost
the ExCA spec, and close cousins:
o Write an activate routine that works.
o merge a couple of items from oldcard before they are lost
o write a deactivate routine
I suspect we're still a ways away from having this work, but maybe for
6.1/5.5?
- move the function pointer definitions to if_bridgevar.h
- move most of the logic to the new BRIDGE_INPUT and BRIDGE_OUTPUT macros
- remove unneeded functions from if_bridgevar.h and sort a little.
previous revision only restored the MP optimization.
Describe the optimization strategy for TLB invalidations in a comment.
Reviewed by: ups@
MFC after: 3 days
Use bridge_ifdetach() to notify the bridge that a member has been detached. The
bridge can then remove it from its interface list and not try to send out via a
dead pointer.
as a Novell NE-2000. This is necessary for unpatched qemu working
correctly. qemu claims to be a RTL8029, but doesn't implement the
RTL8029 specific registers at this time. I've created patches for
that, but there's no reason we can't use qemu's emulation w/o these
patches. This should make life easier for those folks that boot
FreeBSD via qemu.
ed_probe_generic8390 where we're calling it. It will be done as part
of ed_probe_Novel_generic after things are setup in a way that
ed_probe_generic8390 will grok.
o Fix operator precedence botch that causes a panic when setting the media
type for 10baseT connections.
o Save the type of device so that it prints with the rest of the probe.
# this should make it work with qemu again, but only if it has my patches
# to actually implement the RTL8029 specific registers.
- Use device_printf() and if_printf() and remove nge_unit.
- Use callout_init_mtx() and remove nge_tick_locked() as nge_tick() is now
always called with the driver lock held.
- Use M_ZERO to contigmalloc() when allocating nge_ldata. It was possible
for the random garbage to be used in certain cases otherwise.
- Cleanup attach error handling including no longer leaking nge_ldata.
- Add locking to the ifmedia callouts.
- Lock accesses to if_hwassist and if_capenable in nge_ioctl().
Submitted by: Yuriy N. Shkandybin jura at networks dot ru (1, 3, 4)
Tested by: Yuriy N. Shkandybin jura at networks dot ru
MFC after: 3 days
cloner. This ensures that ifc->ifc_units is not prematurely freed in
if_clone_detach() before the clones are destroyed, resulting in memory modified
after free. This could be triggered with if_vlan.
Assert that all cloners have been destroyed when freeing the memory.
Change all simple cloners to destroy their clones with ifc_simple_destroy() on
module unload so the reference count is properly updated. This also cleans up
the interface destroy routines and allows future optimisation.
Discussed with: brooks, pjd, -current
Reviewed by: brooks
notifications when LIO operations completed. These were the problems
with LIO event complete notification:
- Move all LIO/AIO event notification into one general function
so we don't have bugs in different data paths. This unification
got rid of several notification bugs one of which if kqueue was
used a SIGILL could get sent to the process.
- Change the LIO event accounting to count all AIO request that
could have been split across the fast path and daemon mode.
The prior accounting only kept track of AIO op's in that
mode and not the entire list of operations. This could cause
a bogus LIO event complete notification to occur when all of
the fast path AIO op's completed and not the AIO op's that
ended up queued for the daemon.
Suggestions from: alc
auto-start, set cnp.cn_lkflags to LK_EXCLUSIVE. This flag must now
be set so that lockmgr knows what kind of lock to acquire, and it
will panic if not specified. This resulted in a panic when using
extended attributes on UFS1 as of locking work present in the 6.x
branch.
This is a RELENG_6_0 merge candidate.
Reported by: lofi
MFC after: 3 days
TLB shootdown requirements. Otherwise a CPU may not get the needed
TLB invalidation.
The PTE valid and access flags can not be used here to avoid TLB
shootdowns unless sf->cpumask == all_cpus.
( Otherwise some CPUs may still hold an even older entry in the TLB)
Since sf_buf_alloc mappings are normally always used this is
also not really useful and presetting accessed and modified
allows the CPU to speculatively load the entry into the TLB.
Both bugs can cause random data corruption.
MFC after: 3 days
based on XMAC II chip should be ready for this in their initial
mode of operation, and Yukon-based NICs are configured so by
the driver.
PR: kern/79998
MFC after: 1 month
semantics, and then was reused for next node, it still would be applied
as writer again.
To fix the regression the decision is made never to alter item->el_flags
after the item has been allocated. This requires checking for overrides
both in ng_dequeue() and in ng_snd_item().
Details:
- Caller of the ng_apply_item() knows what is the current access to
node and specifies it to ng_apply_item(). The latter drops the
given access after item has beem applied.
- ng_dequeue() needs to be supplied with int pointer, where it stores
the obtained access on node.
- Check for node/hook access overrides in ng_dequeue().
generic sounding CIS "PCMCIA", "FAST ETHERENT CARD" and a bogus MANFID
code (0xffff and 0x1090). However, since I'm not aware of 'generic'
cards that aren't NE-2000oids, go with that and hope for the best.
First and most importantly, I threw out the thread priority-twiddling
implementation of KeRaiseIrql()/KeLowerIrq()/KeGetCurrentIrql() in
favor of a new scheme that uses sleep mutexes. The old scheme was
really very naughty and sought to provide the same behavior as
Windows spinlocks (i.e. blocking pre-emption) but in a way that
wouldn't raise the ire of WITNESS. The new scheme represents
'DISPATCH_LEVEL' as the acquisition of a per-cpu sleep mutex. If
a thread on cpu0 acquires the 'dispatcher mutex,' it will block
any other thread on the same processor that tries to acquire it,
in effect only allowing one thread on the processor to be at
'DISPATCH_LEVEL' at any given time. It can then do the 'atomic sit
and spin' routine on the spinlock variable itself. If a thread on
cpu1 wants to acquire the same spinlock, it acquires the 'dispatcher
mutex' for cpu1 and then it too does an atomic sit and spin to try
acquiring the spinlock.
Unlike real spinlocks, this does not disable pre-emption of all
threads on the CPU, but it does put any threads involved with
the NDISulator to sleep, which is just as good for our purposes.
This means I can now play nice with WITNESS, and I can safely do
things like call malloc() when I'm at 'DISPATCH_LEVEL,' which
you're allowed to do in Windows.
Next, I completely re-wrote most of the event/timer/mutex handling
and wait code. KeWaitForSingleObject() and KeWaitForMultipleObjects()
have been re-written to use condition variables instead of msleep().
This allows us to use the Windows convention whereby thread A can
tell thread B "wake up with a boosted priority." (With msleep(), you
instead have thread B saying "when I get woken up, I'll use this
priority here," and thread A can't tell it to do otherwise.) The
new KeWaitForMultipleObjects() has been better tested and better
duplicates the semantics of its Windows counterpart.
I also overhauled the IoQueueWorkItem() API and underlying code.
Like KeInsertQueueDpc(), IoQueueWorkItem() must insure that the
same work item isn't put on the queue twice. ExQueueWorkItem(),
which in my implementation is built on top of IoQueueWorkItem(),
was also modified to perform a similar test.
I renamed the doubly-linked list macros to give them the same names
as their Windows counterparts and fixed RemoveListTail() and
RemoveListHead() so they properly return the removed item.
I also corrected the list handling code in ntoskrnl_dpc_thread()
and ntoskrnl_workitem_thread(). I realized that the original logic
did not correctly handle the case where a DPC callout tries to
queue up another DPC. It works correctly now.
I implemented IoConnectInterrupt() and IoDisconnectInterrupt() and
modified NdisMRegisterInterrupt() and NdisMDisconnectInterrupt() to
use them. I also tried to duplicate the interrupt handling scheme
used in Windows. The interrupt handling is now internal to ndis.ko,
and the ndis_intr() function has been removed from if_ndis.c. (In
the USB case, interrupt handling isn't needed in if_ndis.c anyway.)
NdisMSleep() has been rewritten to use a KeWaitForSingleObject()
and a KeTimer, which is how it works in Windows. (This is mainly
to insure that the NDISulator uses the KeTimer API so I can spot
any problems with it that may arise.)
KeCancelTimer() has been changed so that it only cancels timers, and
does not attempt to cancel a DPC if the timer managed to fire and
queue one up before KeCancelTimer() was called. The Windows DDK
documentation seems to imply that KeCantelTimer() will also call
KeRemoveQueueDpc() if necessary, but it really doesn't.
The KeTimer implementation has been rewritten to use the callout API
directly instead of timeout()/untimeout(). I still cheat a little in
that I have to manage my own small callout timer wheel, but the timer
code works more smoothly now. I discovered a race condition using
timeout()/untimeout() with periodic timers where untimeout() fails
to actually cancel a timer. I don't quite understand where the race
is, using callout_init()/callout_reset()/callout_stop() directly
seems to fix it.
I also discovered and fixed a bug in winx32_wrap.S related to
translating _stdcall calls. There are a couple of routines
(i.e. the 64-bit arithmetic intrinsics in subr_ntoskrnl) that
return 64-bit quantities. On the x86 arch, 64-bit values are
returned in the %eax and %edx registers. However, it happens
that the ctxsw_utow() routine uses %edx as a scratch register,
and x86_stdcall_wrap() and x86_stdcall_call() were only preserving
%eax before branching to ctxsw_utow(). This means %edx was getting
clobbered in some cases. Curiously, the most noticeable effect of this
bug is that the driver for the TI AXC110 chipset would constantly drop
and reacquire its link for no apparent reason. Both %eax and %edx
are preserved on the stack now. The _fastcall and _regparm
wrappers already handled everything correctly.
I changed if_ndis to use IoAllocateWorkItem() and IoQueueWorkItem()
instead of the NdisScheduleWorkItem() API. This is to avoid possible
deadlocks with any drivers that use NdisScheduleWorkItem() themselves.
The unicode/ansi conversion handling code has been cleaned up. The
internal routines have been moved to subr_ntoskrnl and the
RtlXXX routines have been exported so that subr_ndis can call them.
This removes the incestuous relationship between the two modules
regarding this code and fixes the implementation so that it honors
the 'maxlen' fields correctly. (Previously it was possible for
NdisUnicodeStringToAnsiString() to possibly clobber memory it didn't
own, which was causing many mysterious crashes in the Marvell 8335
driver.)
The registry handling code (NdisOpen/Close/ReadConfiguration()) has
been fixed to allocate memory for all the parameters it hands out to
callers and delete whem when NdisCloseConfiguration() is called.
(Previously, it would secretly use a single static buffer.)
I also substantially updated if_ndis so that the source can now be
built on FreeBSD 7, 6 and 5 without any changes. On FreeBSD 5, only
WEP support is enabled. On FreeBSD 6 and 7, WPA-PSK support is enabled.
The original WPA code has been updated to fit in more cleanly with
the net80211 API, and to eleminate the use of magic numbers. The
ndis_80211_setstate() routine now sets a default authmode of OPEN
and initializes the RTS threshold and fragmentation threshold.
The WPA routines were changed so that the authentication mode is
always set first, followed by the cipher. Some drivers depend on
the operations being performed in this order.
I also added passthrough ioctls that allow application code to
directly call the MiniportSetInformation()/MiniportQueryInformation()
methods via ndis_set_info() and ndis_get_info(). The ndis_linksts()
routine also caches the last 4 events signalled by the driver via
NdisMIndicateStatus(), and they can be queried by an application via
a separate ioctl. This is done to allow wpa_supplicant to directly
program the various crypto and key management options in the driver,
allowing things like WPA2 support to work.
Whew.
routine, create all the child bio objects before starting the
requests, rather than starting them as created. This closes a race
whereby some number of child operations could complete before the
rest were ever created, and prematurely freeing the parent bio.
This fixes the panics installing in VMWare and qemu
updated by a process holding the snapshot lock. Another process updating a
different inode in the same inodeblock will do copy on write checks and lock in
the opposite direction.
The snapshot code force a copy on write of these blocks manually (cf. start of
expunge_ufs[12]) and these inode blocks are later put on snapblklist.
This partial fix is to 'drain' the relevant ffs_copyonwrite() operation after
installing new snapblklist. This is not a 100% solution since a failed block
allocation can cause implicit fsync() which might deadlock before the new
snapblklist has been installed.
file is flushed by a process not holding snaplk (e.g. bufdaemon). Another
process might hold snaplk and try to access the block due to ffs_copyonwrite
processing.
the cg map buffer being held when writing indirect blocks. The process ends up
in ffs_copyonwrite(), attempting to get snaplk while holding the cg map buffer
lock.
Another process might be in ffs_copyonwrite(), trying to allocate a new block
for a copy. It would hold snaplk while trying to get the cg map buffer lock.
Release the cg map buffer early and use the copy for most of the cgaccount
processing to avoid this deadlock.
skipping the call from ffs_snapremove() if the block number is zero.
Simplify snapshot locking in ffs_copyonwrite() and ffs_snapblkfree() by using
the same locking protocol for low block numbers as for larger block numbers.
This removes a lock leak that could happen if vn_lock() succeeded after
lockmgr() failed in ffs_snapblkfree().
Check if snapshot is gone before retrying a lock in ffs_copyonwrite().
reclamation. If the vnode previously was a fifo then v_op would point to
ffs_fifoops[12] instead of the expected ffs_vnodeops[12], causing a panic at
the end of ffsext_strategy.
the UDF specification specifies a logical sectorsize of 2048.
Instead, get it from GEOM.
- When reading the UDF Anchor Volume Descriptor, use the logical
sectorsize of 2048 when calculating the offset to read from, but
use the actual sectorsize to determine how much to read.
- works with reading a DVD disk and a DVD disk image file via mdconfig
- correctly returns EINVAL if we try to mount_udf an audio CD, instead
of panicking inside GEOM when INVARIANTS is set
the modified interface that they use. Changes include:
- Register a different interrupt handler for the new interface. This one is
INTR_MPSAFE, not INTR_FAST, and directly processes completions and AIFs.
- Add an event registration and callback mechanism for the ioctl and CAM
modules can know when a resource shortage clears. This condition was
previously fatal in CAM due to programming oversights.
- Fix locking to play better with newbus.
- Provide access methods for talking to cards with the NEWCOMM interface.
- Fix up the CAM module to be better suited for dealing with newer firmware
on the PERC Si/Di series that requires talking to plain SCSI via aac.
- Add a whole slew of new PCI Id's.
Thanks to Adaptec for providing an initial version of this work and for
answering countless questions about it. There are still some rough edges in
this, but it works well enough to commit and test for now.
Obtained from: Adaptec, Inc.
o Rather than just try to turn off EXCA_INTR_RESET, set the entire register
to 0. This is slightly faster, and a better hammer.
o Move attempted clearing of the output enable (EXCA_PWRCTL_OE) back to
after we turn off the power. Modify it to write 0 so that we don't get
Bad Vcc messages on TI bridges (untested, but ru@ sent me a similar patch)
while at the same time avoiding interrupt storms on Ricoh bridges (tested
by me on my Sony).
# Many of my observations of 'breakage' for this patch are due to some bug
# in the load/unload of cbb.ko unlreated to this change. I'll be investigating
# and fixing that bug in the fullness of time.
an allocation. This fixes the malloc 'use after free' panic on boot that
many were seeing. It doesn't solve the problem of the allocations being
cached and then written past their bounds later. That will take more work.
Submitted by: kan
http://lists.freebsd.org/pipermail/cvs-src/2004-October/033496.html
The same problem applies to if_bridge(4), too.
- Copy-and-paste the if_bridge(4) related block from
if_ethersubr.c to ng_ether.c
- Add XXXs, so that copy-and-paste would be noticed by
any future editors of this code.
- Also add XXXs near if_bridge(4) declarations.
Silence from: thompsa
the 5GHz band.
o Enable 802.11a channels scanning for 2915ABG adapters.
o Fix a typo (negociated->negotiated).
With hints from NetBSD.
MFC after: 2 days
- Rename vxfoo() functions to vx_foo() to improve readability and
consistency with other drivers.
- Prefix most the softc members with 'vx_' (the other members already had
the prefix).
- Switch to using callout_init_mtx() and callout_*() rather than
timeout() and untimeout().
- Add some missing calls to if_free() in some failure cases in vx_attach().
- Use if_printf() and remove the unit number from the softc.
- Remove uses of the 'register' keyword and spls.
- Add locked variants of vx_init() and vx_start().
- Add a mutex to the softc and lock it in various appropriate places.
- Setup the interrupt handler last during attach.
Tested by: imp
MFC after: 1 week
It allows to specify options for NFS root file system.
Currently supported options are: soft, intr, conn, lockd.
I'm adding this functionality mostly for 'lockd' option, which is only
honored when performing the initial mount and will be silently ignored
if used while updating the mount options.
This will allow to use flock(2) without the need of using varmfs or
rpc.lockd and friends.
Example of use:
boot.nfsroot.options="intr,lockd"
MFC after: 2 weeks
if (foo);
bar();
to:
if (foo)
bar();
Really, really nasty bug and a very nice catch of mine.
Unfortunately, I'll not become a hero of the day, because the code is
commented out.
module name to something that wouldn't conflict with
sys/dev/firewire/firewire.c.
Submitted by: Cai, Quanqing <caiquanqing at gmail dot com>
PR: kern/82727
MFC after: 3 days
- Don't keep the SPDIF state in the driver private struct since it
can be overriden by hand with pciconf(8), query it when needed instead.
Regarding the locking I let Ariff explain it himself:
---snip---
About the locking, that is what I'm intended to do since the beginning.
The reason I'm not putting that along since my first patchset was
because several people especially from amd46 camp reported that it cause
lots of LORs, which is weird considering that I've never encounter such
in a pretty much strict locking environment (i386). However, since our
previous discussion with Pyun YongHyeon about strict locking, I've
decided to bring it back for all the affected drivers, not just for
es137x. It turns out that the root of the problem was within dsp.c
during device open, which has been fixed since dsp.c revision 1.84.
---snip---
Submitted by: Ariff Abdullah <skywizard@MyBSD.org.my>
code which may help.
People with a ich compatible soundcard which want to help out should
change the "#if 1" to a "#if 0" and try if the soundcard still works.
Reports about working or not-working soundcards with this change to
multimedia@ please.
PR: 73987
opt_device_polling.h
- Include opt_device_polling.h into appropriate files.
- Embrace with HAVE_KERNEL_OPTION_HEADERS the include in the files that
can be compiled as loadable modules.
Reviewed by: bde
modules along with kernel.
After this change it is possible to embrace opt_*.h includes with ifdef
HAVE_KERNEL_OPTION_HEADERS. And thus, avoid editing a lot of Makefiles
in modules directory each time we introduce a new opt_xxx.h.
Requested by: bde
o Add support for Tamarack TC5299J + MII found on SMC 8041TX V.2
and corega PCCCCTXD
o Add support for ISA/PCI RTL80[12]9 chips
o Improve support for the ax88790 based
o minor code movement
Submitted by: (#2) David Madole
the arp code will search all local interfaces for a match. This triggers a
kernel log if the bridge has been assigned an address.
arp: ac🇩🇪48:18:83:3d is using my IP address 192.168.0.142!
bridge0: flags=8041<UP,RUNNING,MULTICAST> mtu 1500
inet 192.168.0.142 netmask 0xffffff00
ether ac🇩🇪48:18:83:3d
Silence this warning for 6.0 to stop unnecessary bug reports, the code will need
to be reworked.
Approved by: mlaier (mentor)
MFC after: 3 days
the MAC result, as well as avoid losing the DAC check result when MAC
is enabled.
MFC after: 3 days
Reported by: Patrick LeBlanc <Patrick dot LeBlanc at sparta dot com>
Before this change a copy operation with cp(1) would not update the
file access times.
According to the POSIX mmap(2) documentation: the st_atime field
of the mapped file may be marked for update at any time between the
mmap() call and the corresponding munmap() call. The initial read
or write reference to a mapped region shall cause the file's st_atime
field to be marked for update if it has not already been marked for
update.
framework. This makes Giant protection around MAC operations which inter-
act with VFS conditional, based on the MPSAFE status of the file system.
Affected the following syscalls:
o __mac_get_fd
o __mac_get_file
o __mac_get_link
o __mac_set_fd
o __mac_set_file
o __mac_set_link
-Drop Giant all together in __mac_set_proc because the
mac_cred_mmapped_drop_perms_recurse routine no longer requires it.
-Move conditional Giant aquisitions to after label allocation routines.
-Move the conditional release of Giant to before label de-allocation
routines.
Discussed with: rwatson
instead of an int. No other FreeBSD architecture does this. Patch over
this problem in the lmc driver. While I'm here, correct a mistake with
DEVICE_POLLING.
stale flag bits left over from before the inode was recycled.
Without this change, a leftover IN_SPACECOUNTED flag could prevent
softdep_freefile() and softdep_releasefile() from incrementing
fs_pendinginodes. Because handle_workitem_freefile() unconditionally
decrements fs_pendinginodes, a negative value could be reported at
file system unmount time with a message like:
unmount pending error: blocks 0 files -3
The pending block count in fs_pendingblocks could also be negative
for similar reasons. These errors can cause the data returned by
statfs() to be slightly incorrect. Some other cleanup code in
softdep_releasefile() could also be incorrectly bypassed.
MFC after: 3 days
- Don't bzero the softc first thing in attach.
- Cleanup error handling in attach() to avoid lots of duplication.
- Don't initialize the callout handle twice.
MFC after: 3 days
The DMA controller driver only knows how to do memory to memory copies, and
the AAU driver how to zero a chunk of memory.
Use them to process big (>=1KB) copying/zeroing.
dedicated sysctl handlers. Protect manipulations with
poll_mtx. The affected sysctls are:
- kern.polling.burst_max
- kern.polling.each_burst
- kern.polling.user_frac
- kern.polling.reg_frac
o Use CTLFLAG_RD on MIBs that supposed to be read-only.
o u_int32t -> uint32_t
o Remove unneeded locking from poll_switch().
- Use the new API for pmap_copy_page() and pmap_zero_page().
- Just write-back the pages in pmap_qenter(), and invalidate it in
pmap_qremove().
- Nuke the cache flushing in pmap_enter_quick(), it's not needed anymore.
possible for do_execve() to call exit1() rather than returning. As a
result, the sequence "allocate memory; call kern_execve; free memory"
can end up leaking memory.
This commit documents this astonishing behaviour and adds a call to
exec_free_args() before the exit1() call in do_execve(). Since all
the users of kern_execve() in the tree use exec_free_args() to free
the command-line arguments after kern_execve() returns, this should
be safe, and it fixes the memory leak which can otherwise occur.
Submitted by: Peter Holm
MFC after: 3 days
Security: Local denial of service
whether the interface being accessed is IFF_NEEDSGIANT or not. This
avoids lock order reversals when calling into the interface ioctl
handler, which could potentially lead to deadlock.
The long term solution is to eliminate non-MPSAFE network drivers.
Discussed with: jhb
MFC after: 1 week
interface polling, compiles on 64-bit platforms, and compiles on NetBSD,
OpenBSD, BSD/OS, and Linux. Woo! Thanks to David Boggs for providing this
driver.
Altq, sppp, netgraph, and bpf are required for this driver to operate.
Userland tools and man pages will be committed next.
Submitted by: David Boggs
to the parent interface, such as IFF_PROMISC and
IFF_ALLMULTI. In addition, vlan(4) gains ability
to migrate from one parent to another w/o losing
its own flags.
PR: kern/81978
MFC after: 2 weeks
as it is done for usual promiscuous mode already. This info is important
because promiscuous mode in the hands of a malicious party can jeopardize
the whole network.
calling sysctl_out_proc(). -- fix from jhb
Move the code in fill_kinfo_thread() that gathers data from struct proc
into the new function fill_kinfo_proc_only().
Change all callers of fill_kinfo_thread() to call both
fill_kinfo_proc_only() and fill_kinfo() thread. When gathering
data from a multi-threaded process, fill_kinfo_proc_only() only needs
to be called once.
Grab sched_lock before accessing the process thread list or calling
fill_kinfo_thread().
PR: kern/84684
MFC after: 3 days
- Make it so one can't call db_setup_paging() if it has already been called
before. traceall needs this, or else the db_setup_paging() call from
db_trace_thread() will reset the printed line number, and override its
argument.
This is not perfect for traceall, because even if one presses 'q' while in
the middle of printing a backtrace it will finish printing the backtrace
before exiting, as db_trace_thread() won't be notified it should stop, but
it is hard to do better without reworking the pager interface a lot more.
sampling rate between playback and recording. This can be
disabled / enabled via kernel hints
(hint.pcm.<unit>.fixed_rate=0/4000-48000) or sysctl
hw.snd.pcm<unit>.fixed_rate=0/4000-48000). Default to 48khz
fixed rate. [1]
* Basic cleanup. *_es1371x_* -> *_es137x_*.
* Some locking fixes. [2]
Submitted by: Ariff Abdullah <skywizard@MyBSD.org.my>
Discussed with: yongari [2]
See also: http://lists.freebsd.org/pipermail/freebsd-multimedia/2005-September/002758.html [1]
Reported by: Jos Backus <jos at catnook.com> [1]
* General spl* cleanup. It doesn't serve any purpose anymore.
* Nuke sndstat_busy(). Addition of sndstat_acquire() /
sndstat_release() for sndstat exclusive access. [1]
sys/dev/sound/pcm/sound.c:
* Remove duplicate SLIST_INIT()
* Use sndstat_acquire() / release() to lock / release the entire
sndstat during pcm_unregister(). This should fix LOR #159 [1]
sys/dev/sound/pcm/sound.h:
* Definition of SD_F_SOFTVOL (part of feeder volume)
* Nuke sndstat_busy(). Addition of sndstat_acquire() /
sndstat_release() for exclusive sndstat access. [1]
Submitted by: Ariff Abdullah <skywizard@MyBSD.org.my>
LOR: 159 [1]
Discussed with: yongari [1]
* Added codec id for CMI9761.
* feeder_volume *whitelist* through ac97_fix_volume()
sys/dev/sound/pcm/ac97.h:
* Added AC97_F_SOFTVOL definition.
sys/dev/sound/pcm/channel.c:
* Slight changes for chn_setvolume() to conform with OSS.
* FEEDER_VOLUME is now part of feeder building process.
sys/dev/sound/pcm/mixer.c:
* General spl* cleanup. It doesn't serve any purpose anymore.
* Main hook for feeder_volume.
Submitted by: Ariff Abdullah <skywizard@MyBSD.org.my>
Tested by: multimedia@
threads. This is quite useful if generating a debug log for post-mortem
by another developer, in which case the person at the console may not
know which threads are of interest. The output of this can be quite
long.
Discussed with: kris
MFC after: 3 days
This is a special case because tcp_twstart() destroys a tcp control
block via tcp_discardcb() so we cannot call tcp_drop(struct *tcpcb) on
such connections. Use tcp_twclose() instead.
MFC after: 5 days
- WEP TX fix:
The original code called software crypto, ieee80211_crypto_encap(),
which never worked since IEEE80211_KEY_SWCRYPT was never flagged due to
ieee80211_crypto_newkey() assumes that wi always supports hardware based
crypto regardless of operational mode(by virtue of IEEE80211_C_WEP).
This fix works around that issue by adding wi_key_alloc() to force
the use of s/w crypto. Also if anyone ever decides to cleanup ioctl
handling where key changes wouldn't cause a call to wi_init() every time,
we'll need wi_key_alloc() to DTRT.
In addition to that, this fix also adds code to wi_write_wep() to force
existing keys to be switched between h/w and s/w crypto such that an
operation mode change(sta <-> hostap) will flag IEEE80211_KEY_SWCRYPT
properly.
- WEP RX fix:
Clear IEEE80211_F_DROPUNENC even in hostap mode. Quote from Sam:
"This is really gross but I don't see an easy way around it.
By doing it we lose the ability to independently drop unencode
frames (and support mixed wep/!wep use). We should really be
setting the EXCLUDE_UNENCRYPTED flag written in wi_write_wep
based on IEEE80211_F_DROPUNENC but with our clearing it we can't
depend on it being set properly."
Reported by: Holm Tiffe <holm at freibergnet dot de>
Submitted by: sam
MFC after: 3 days
o Axe poll in trap.
o Axe IFF_POLLING flag from if_flags.
o Rework revision 1.21 (Giant removal), in such a way that
poll_mtx is not dropped during call to polling handler.
This fixes problem with idle polling.
o Make registration and deregistration from polling in a
functional way, insted of next tick/interrupt.
o Obsolete kern.polling.enable. Polling is turned on/off
with ifconfig.
Detailed kern_poll.c changes:
- Remove polling handler flags, introduced in 1.21. The are not
needed now.
- Forget and do not check if_flags, if_capenable and if_drv_flags.
- Call all registered polling handlers unconditionally.
- Do not drop poll_mtx, when entering polling handlers.
- In ether_poll() NET_LOCK_GIANT prior to locking poll_mtx.
- In netisr_poll() axe the block, where polling code asks drivers
to unregister.
- In netisr_poll() and ether_poll() do polling always, if any
handlers are present.
- In ether_poll_[de]register() remove a lot of error hiding code. Assert
that arguments are correct, instead.
- In ether_poll_[de]register() use standard return values in case of
error or success.
- Introduce poll_switch() that is a sysctl handler for kern.polling.enable.
poll_switch() goes through interface list and enabled/disables polling.
A message that kern.polling.enable is deprecated is printed.
Detailed driver changes:
- On attach driver announces IFCAP_POLLING in if_capabilities, but
not in if_capenable.
- On detach driver calls ether_poll_deregister() if polling is enabled.
- In polling handler driver obtains its lock and checks IFF_DRV_RUNNING
flag. If there is no, then unlocks and returns.
- In ioctl handler driver checks for IFCAP_POLLING flag requested to
be set or cleared. Driver first calls ether_poll_[de]register(), then
obtains driver lock and [dis/en]ables interrupts.
- In interrupt handler driver checks IFCAP_POLLING flag in if_capenable.
If present, then returns.This is important to protect from spurious
interrupts.
Reviewed by: ru, sam, jhb
to avoid touching pageable memory while holding a mutex.
Simplify argument list replacement logic.
PR: kern/84935
Submitted by: "Antoine Pelisse" apelisse AT gmail.com (in a different form)
MFC after: 3 days
sys/fs/nwfs/nwfs_vfsop= s.c, introduced with the conversion to
nmount with revision 1.38. This causes mount_nwfs to fail with
the error message:
mount_nwfs: mount error: /mnt/netware: syserr = No such file or directo=
ry
This is caused by a typo on line 178, which specifies "nwfw_args"
rather than "nwfs_args".
Submitted by: Antony Mawer <gnats@mawer.org>
Fat fingers: phk
PR: 86757
MFC: 3 days
up. This make iostat report operations passed down to the device driver
instead of operations passed down to GEOM disk. The transfer size limit
imposed by the device driver is no longer hidden, improving the correlation
between iostat output and device driver workload.
> Cause all flags passed by boot2 to set the respective loader(8)
> boot_* variable. The end effect is that all flags from boot2
> are now passed to the kernel.
Add a new private thread flag to indicate that the thread should
not sleep if runningbufspace is too large.
Set this flag on the bufdaemon and syncer threads so that they skip
the waitrunningbufspace() call in bufwrite() rather than than
checking the proc pointer vs. the known proc pointers for these two
threads. A way of preventing these threads from being starved for
I/O but still placing limits on their outstanding I/O would be
desirable.
Set this flag in ffs_copyonwrite() to prevent bufwrite() calls from
blocking on the runningbufspace check while holding snaplk. This
prevents snaplk from being held for an arbitrarily long period of
time if runningbufspace is high and greatly reduces the contention
for snaplk. The disadvantage is that ffs_copyonwrite() can start
a large amount of I/O if there are a large number of snapshots,
which could cause a deadlock in other parts of the code.
Call runningbufwakeup() in ffs_copyonwrite() to decrement runningbufspace
before attempting to grab snaplk so that I/O requests waiting on
snaplk are not counted in runningbufspace as being in-progress.
Increment runningbufspace again before actually launching the
original I/O request.
Prior to the above two changes, the system could deadlock if enough
I/O requests were blocked by snaplk to prevent runningbufspace from
falling below lorunningspace and one of the bawrite() calls in
ffs_copyonwrite() blocked in waitrunningbufspace() while holding
snaplk.
See <http://www.holm.cc/stress/log/cons143.html>
the directory's inode after queuing the dirrem that will decrement
the parent directory's link count. This will force the update of
the parent directory's actual link to actually be scheduled. Without
this change the parent directory's actual link count would not be
updated until ufs_inactive() cleared the inode of the newly removed
directory, which might be deferred indefinitely. ufs_inactive()
will not be called as long as any process holds a reference to the
removed directory, and ufs_inactive() will not clear the inode if
the link count is non-zero, which could be the result of an earlier
system crash.
If a background fsck is run before the update of the parent directory's
actual link count has been performed, or at least scheduled by
putting the dirrem on the leaf directory's inodedep id_bufwait list,
fsck will corrupt the file system by decrementing the parent
directory's effective link count, which was previously correct
because it already took the removal of the leaf directory into
account, and setting the actual link count to the same value as the
effective link count after the dangling, removed, leaf directory
has been removed. This happens because fsck acts based on the
actual link count, which will be too high when fsck creates the
file system snapshot that it references.
This change has the fortunate side effect of more quickly cleaning
up the large number dirrem structures that linger for an extended
time after the removal of a large directory tree. It also fixes a
potential problem with the shutdown of the syncer thread timing out
if the system is rebooted immediately after removing a large directory
tree.
Submitted by: tegge
MFC after: 3 days