--------------------
- Seal the fate of long standing memory leak (4 years, 7 months) during
pcm_unregister(). While destroying cdevs, scan / detect possible
children and free its SLIST placeholder properly.
- Optimize channel allocation / numbering even further. Do brute cyclic
checking only if the channel numbering screwed.
- Mega vchan create/destroy cleanup:
o Implement pcm_setvchans() so everybody can use it freely instead
of implementing their own, be it through sysctl or channel auto
allocation.
o Increase vchan creation/destruction resiliency:
+ it's possible to increase/decrease total vchans even during
busy playback/recording. Busy channel will be left alone, untouched.
Abusive test sample:
# play whatever...
#
while : ; do
sysctl hw.snd.pcm0.vchans=1
sysctl hw.snd.pcm0.vchans=10
sysctl hw.snd.pcm0.vchans=100
sysctl hw.snd.pcm0.vchans=200
done
# Play something else, leave above loop running frantically.
+ Seal another 4 years old bug where it is possible to destroy (virtual)
channel even when its cdevs being referenced by other process.
The "First Come First Served" nature of dsp_clone() is the main
culprit of this issue, and usually manifest itself as dangling
channel <-> process association. Ensure that all of its cdevs
are free from being referenced before destroying it (through
ORPHAN_CDEVT() macross).
All these fixes (including previous fixes) will be MFCed, later.
to avoid possible device unregister race (impossible to reproduce, yet
possible).
- Extra sanity check to ensure proper parent channel is being selected.
- Reset parent channel once all of its children gone.
called.
- vfs_getvfs has to return a reference to prevent the returned mountpoint
from changing identities.
- Release references acquired via vfs_getvfs.
Discussed with: tegge
Tested by: kris
Sponsored by: Isilon Systems, Inc.
mount memory from being reclaimed. This resolves a number of race
conditions described in vfs_default.c and introduced with the
VFS_LOCK_GIANT macros.
- Let the mtx and lock remain valid after the mount structure has been
freed by using init and fini calls. Technically fini will never be
called but is included for completeness.
- Consistently use lockmgr directly rather than lockmgr to lock and
vfs_unbusy to unlock.
Discussed with: tegge
Tested by: kris
Sponsored by: Isilon Systems, Inc.
- Move the vn_lock of the dvp until after we've unbusied the filesystem
to avoid a LOR with the mount point lock.
- In the v_mountedhere while loop we acquire a new instance of giant each
time through without releasing the first. This would cause us to leak
Giant.
Sponsored by: Isilon Systems, Inc.
requires Giant. It is set in bgetvp and cleared in brelvp.
- Create QUEUE_DIRTY_GIANT for dirty buffers that require giant.
- In the buf daemon, only grab giant when processing QUEUE_DIRTY_GIANT and
only if we think there are buffers in that queue.
Sponsored by: Isilon Systems, Inc.
failing, print a message when we fail for some reason as most callers do
not check the return value (e.g. 'cuz they're called from SYSINIT)
Reviewed by: scottl
MFC after: 1 week
Vararg functions have a different calling convention than regular
functions on amd64. Casting a varag function to a regular one to
match the function pointer declaration will hide the varargs from
the caller and we will end up with an incorrectly setup stack.
Entirely remove the varargs from these functions and change the
functions to match the declaration of the function pointers.
Remove the now unnecessary casts.
Also change static struct ipprotosw[] to two independent
protosw/ip6protosw definitions to remove an unnecessary cast.
PR: amd64/95008
Submitted and tested by: Mats Palmgren
Reviewed by: rwatson
MFC after: 3 days
o Add the scc(4) manpage to the build.
o Update the uart(4) manpage to account for scc(4).
o Update the uart(4) module build to include support for scc(4).
controllers typically have multiple channels and support a number
of serial communications protocols. The scc(4) driver is itself
an umbrella driver that delegates the control over each channel
and mode to a subordinate driver (like uart(4)).
The scc(4) driver supports the Siemens SAB 82532 and the Zilog
Z8530 and replaces puc(4) for these devices.
case panic on sparc64.
The problem is in MD5(9) implementation. The Encode() function takes
'unsigned char *output' as its first argument, which is then assigned to
'u_int32_t *op'. If the 'output' argument is not 4 byte aligned (and in
geli(8) case it is not), sparc64 machine will panic.
I don't know how to fix MD5(9) in a clean way, so I'm implementing a
work-around in geli(8).
Reported by: brueffer
MFC after: 3 days
in the ISR doesn't read the actual socket event register, but instead
reads garbage (usually 0xffffffff, but other times other things).
This totally violates the PCI spec, but happens rarely enough that a
workaround is in order. This adds one test when we have a real
interrupt to service (which is very rare), and doesn't affect the
usualy 'nothing to see here' case at all.
Problem reported by many, but sam@ gave me this workaround after
diagnosing the problem.
a lock's priority to a sleeping thread. When we panic, dump a stack
trace of the thread that is asleep if DDB is compiled into the kernel
just before calling panic(). This is much more informative and useful
for debugging than the current behavior of getting a page fault and not
having an easy way of determining which thread caused the original problem.
MFC after: 1 week
a race where data could come in before we clear the INFLUX flag, and get
skipped over by knote (and hence never be activated, though it should of
been)...
Found by: glebius & co.
Reviewed by: glebius
MFC after: 3 days
some systems were designed so that AML writes to various resources shared
with OS drivers, including the RTC, PIC, PCI, etc. These writes could
collide with writes by the OS and should never be performed. For now, we
print a message if such an access occurs, but do not block it. To block
the access, the tunable "debug.acpi.block_bad_io" can be set to 1. In the
future, we will flip the switch and this will become the default.
Information about this problem was found in Microsoft KB 283649. They
block IO accesses if the BIOS indicates via _OSI that it is Windows 2001
or higher. They always block accesses to the PIC, cascaded PIC, and ELCRs,
no matter how old the BIOS.
systems (blade servers). On most systems, this is implemented as an IO
write to the SMI port and the BIOS generates the actual reset.
PR: kern/94939
Submitted by: dodell@ixsystems.com
Reviewed by: jhb
MFC after: 3 weeks
foreign per-CPU pages in cpu_ipi_send() in order to get the module IDs
of the other CPUs can cause a page fault. If this happens when doing a
TLB shootdown while dealing with another page fault this causes a panic
due to the recursive page fault. As I don't spot other code that assumes
or requires that accessing foreign per-CPU pages must not page fault
solve this by adding a statically allocated (and therefore locked in the
kernel pages) array which establishes a FreeBSD CPU ID -> module ID
relation and use that in cpu_ipi_selected() (instead of statically
allocating the per-CPU pages which would just waste memory on say a dual
CPU machine as sun4u theoretically supports up to 128 CPUs or wasting
dTLB slots for the foreign per-CPU pages). [1]
- Fix a potential race in cpu_ipi_send(); as we don't serialize the access
to cpu_ipi_selected() between MI and MD use (only MI-MI and MD-MD) we
might catch the NACK bit caused by sending another IPI. Solve this by
checking the NACK bit in the contents of the interrupt dispatch status
reg read while interrupts were still turned off instead of reading that
reg anew after interrupts were turned on again. This is also what the
CPU docs suggest to do.
- Add a workaround for the SpitFire erratum #54 bug (affecting interrupt
dispatch). While public info regarding what this CPU bug actually causes
is not available testing shows that with the workaround in place it's
less likely to get a "couldn't send ipi" panic, it doesn't solve these
panics entirely though. [2]
Reported by: kris [1]
Some clue from: kmacy [1]
Info from: Linux, OpenSolaris [2]
Additional testing by: kris
MFC after: 3 days
Saab for helping to track this down. Fix a error with 32bit DMA size
calculation that seemed to be harmless. Add a few micro-optimizations while
I'm here.
generating a coredump as the result of a signal.
- Fix a bug where we could leak a Giant lock if vn_start_write() failed
in coredump().
Reported by: jmg (2)
mddestroy() only if the file is from a non-MPSAFE VFS.
- No longer unconditionally hold Giant in the md kthread for vnode-backed
kthreads.
- Improve the handling of the thread exit race when destroying an md
device.
and use that instead of testing fdidx against -1 to determine if it should
release Giant if Giant was locked due to the requested file residing on a
non-MPSAFE VFS.
Discussed with: jeff
get_cyclecount() as that results in a saner value and makes schedgraph
much happier on Alpha. (schedgraph doesn't handle the fact that the
counters are out of sync though)
as we have to call tick_init() before cninit() in order to provide the
low-level console drivers with a working DELAY() which in turn means we
cannot use panic() in tick_init().
- s,to high, too high, in the panic string
Inspired by: kmacy's sun4v changes
MFC after: 3 days
probably never fully applied to IPv6. Over time it has become more
stale, so replace it with something more up to date.
Reviewed by: ume
MFC after: 1 month
ipsec_copypkt(), as this is already handled by the call to M_MOVE_PKTHDR(),
which also knows how to correctly handle MAC m_tags. This corrects a panic
when running with MAC and KAME IPSEC.
PR: kern/94599
Submitted by: zhouyi zhou <zhouyi04 at ios dot cn>
Reviewed by: bz
MFC after: 3 days
arguments. The first one is never used (all callers pass in 0); the
second is sometimes used to pass in a struct timespec * which is used as
a timeout and never modified. Constify that argument so callers can pass
a const struct timespec * without jumping through hoops.
back to using the RSDT instead. ACPI-CA already follows this same strategy
as a workaround for yet another instance of brain-damaged BIOS writers.
PR: i386/93963
Submitted by: Masayuki FUKUI <fukui.FreeBSD@fanet.net>
now just make it clear station statistics (could read
a stat block and assign to caller can do partial changes)
Reviewed by: avatar (previous version)
MFC after: 1 week
acquiring Giant in kern_sendfile().
Guard against the forced reclamation of a vnode in kern_sendfile().
Discussed with: jeff
Reviewed by: tegge
MFC after: 3 weeks
ipxpcb mutex. Contrary to the comment, even in 4.x this was unsafe,
as parallel use of the socket by another process would result in pcb
corruption if the mbuf allocation slept.
MFC after: 1 month
a problem with listing large number of md(4) devices. Either 'list' or
'query' mode uses XML.
Additionally, new functionality was introduced. It's possible to pass
multiple devices to -u:
# ./mdconfig -l -u md0,md1
Approved by: cognet (mentor)
REGRESSION is enabled, allows user space to dictate that sonewconn()
should skip it's "skip the hard work" check to see if the listen
queue is full, and instead proceed with allocation of a socket and
trimming of the overflowed queue. This makes it easier to test the
queue overflow logic.
MFC after: 1 month
IPXP_DROPPED before continuing, and return EINVAL or ECONNRESET if
it is flagged. It's unclear why each situation should be one or
the other, but it is copied from netinet which has the same bugs.
MFC after: 1 month
as belonging to SPX. This replaces the implicit assumption that the cb
pointer for non-SPX pcb's will be NULL. This isn't required in TCP/IP
as different pcb lists are maintained for different IP protocols; IPX
stores all pcbs on the same global ipxpcb_list.
Foot provided by: gnn
MFC after: 1 month
Kernel changes:
Inform hwpmc of executable objects brought into the system by
kldload() and mmap(), and of their removal by kldunload() and
munmap(). A helper function linker_hwpmc_list_objects() has been
added to "sys/kern/kern_linker.c" and is used by hwpmc to retrieve
the list of currently loaded kernel modules.
The unused `MAPPINGCHANGE' event has been deprecated in favour
of separate `MAP_IN' and `MAP_OUT' events; this change reduces
space wastage in the log.
Bump the hwpmc's ABI version to "2.0.00". Teach hwpmc(4) to
handle the map change callbacks.
Change the default per-cpu sample buffer size to hold
32 samples (up from 16).
Increment __FreeBSD_version.
libpmc(3) changes:
Update libpmc(3) to deal with the new events in the log file; bring
the pmclog(3) manual page in sync with the code.
pmcstat(8) changes:
Introduce new options to pmcstat(8): "-r" (root fs path), "-M"
(mapfile name), "-q"/"-v" (verbosity control). Option "-k" now
takes a kernel directory as its argument but will also work with
the older invocation syntax.
Rework string handling in pmcstat(8) to use an opaque type for
interned strings. Clean up ELF parsing code and add support for
tracking dynamic object mappings reported by a v2.0.00 hwpmc(4).
Report statistics at the end of a log conversion run depending
on the requested verbosity level.
Reviewed by: jhb, dds (kernel parts of an earlier patch)
Tested by: gallatin (earlier patch)
reason, seems to be where new flags are getting defined:
INP_DROPPED - The protocol has terminated this connection and the socket
is not reusable: when the socket code enters the protocol,
an error is immediately returned. This will substitute for
NULLing the so_pcb socket field, helping to implement the
invariant that all valid sockets have valid pcb's in TCP.
INP_SOCKREF - The protocol has become the owner of the socket reference,
and will need to free it when freeing the pcb, which will
be used when a TCP socket is closed but still has queued
data.
MFC after: 1 month
the error on sparc64 hadn't changed since the last checkin, pass
LINT on other platforms and mpt doesn't work on sparc64 anyway
and the tinderbox build didn't work for me in a cross build case
on my main build machine (which runs RELENG_6). Sigh. Still
need to try harder.
- Introduce invariant that all IPX/SPX sockets will have valid so_pcb
pointers to ipxpcb structures, and that for SPX, the control block
pointer will always be valid. Don't attempt to free the socket or
pcb at various odd points, such as disconnect.
- Add a new ipxpcb flag, IPXP_DROPPED, which will be set in place of
freeing PCB's so that this invariant can be maintained. This flag
is now checked instead of a NULL check in various socket protocol
calls.
- Introduce many assertions that this invariant holds.
- Various pieces of code, such as the SPX timer code, no longer needs
to jump through hoops in case it frees a PCB while running.
- Break out ipx_pcbfree() from ipx_pcbdetach(). Likewise
spx_pcbdetach().
- Comment on some SMP-related limitations to the SPX code.
- Update copyrights.
MFC after: 1 month
of its allocations fails. Allocate the ipxp last so as to avoid having
to free it if another allocation goes wrong.
Normalize retrieval of ipxp and cb from socket in spx_sp_attach(), and
add assertions.
MFC after: 1 month
especially reads of spx header structures, which will now be cached
in the stack until they can be copied out after releasing the lock.
Panic if a bad socket option direction is passed in by the caller.
MFC after: 1 month
Make the kernel side of FAST_IPSEC not depend on the shared
structures defined in /usr/include/net/pfkeyv2.h The kernel now
defines all the necessary in kernel structures in sys/netipsec/keydb.h
and does the proper massaging when moving messages around.
Sponsored By: Secure Computing
A) Fibre Channel Target Mode support mostly works
(SAS/SPI won't be too far behind). I'd say that
this probably works just about as well as isp(4)
does right now. Still, it and isp(4) and the whole
target mode stack need a bit of tightening.
B) The startup sequence has been changed so that
after all attaches are done, a set of enable functions
are called. The idea here is that the attaches do
whatever needs to be done *prior* to a port being
enabled and the enables do what need to be done for
enabling stuff for a port after it's been enabled.
This means that we also have events handled by their
proper handlers as we start up.
C) Conditional code that means that this driver goes
back all the way to RELENG_4 in terms of support.
D) Quite a lot of little nitty bug fixes- some discovered
by doing RELENG_4 support. We've been living under Giant
*waaaayyyyy* too long and it's made some of us (me) sloppy.
E) Some shutdown hook stuff that makes sure we don't blow
up during a reboot (like by the arrival of a new command
from an initiator).
There's been some testing and LINT checking, but not as
complete as would be liked. Regression testing with Fusion
RAID instances has not been possible. Caveat Emptor.
Sponsored by: LSI-Logic.
is derived from the phrase 'MegaRAID Firmware Interface' used by LSI. This
driver provides a block interface to logical disks on the card and a minimal
management device. It is MPSAFE, INTR_FAST, and 64-bit capable.
Thanks to Dell for providing hardware to test with and IronPort for
sponsoring the work.
Sponsored by: Dell, Ironport
MFC After: 3 days
being committed:
- Wrap comments more evenly on right border.
- Clean up braces.
Also, along similar lines:
- Assert some pointers are non-NULL before dereferencing them.
- Remove one assertion that looks, on face value, poor.
MFC after: 1 month
socket also supports the voltage. Some XV cards have appeared on the
scene (or cards that report they support XV), and in older machines
that have sockets that do not support XV, we were bogusly trying to
power them at XV rather than at 3.3V. Now, power up the card at the
lowest voltage supported by both the card and the socket.
MFC After: 3 days
with Giant, as there is current unsafety in the IPX tunneled over IP
code. There have been no reports of trouble, but there probably would
be if anyone were running this code at high speed on SMP systems.
MFC after: 3 days
The bug was that earlier, if a request was retransmitted,
we would do subsequent retransmits every 10 msecs.
This can cause data corruption under moderate loads by reordering
operations as seen by the client NFS attribute cache, and on the
server side when the retransmission occurs after the original request
has left the duplicate cache, since the operation will be committed
for a second time.
Further work on retransmission handling is needed (e.g. they are still
being done sent too often since they are scaled by HZ, and the size of
the dup cache is too small and easily overwhelmed on busy servers).
Submitted by: mohans
details see PR kern/94448.
PR: kern/94448
Original patch: Eygene A. Ryabinkin <rea-fbsd at rea dot mbslab dot kiae dot ru>Final patch: thompsa@
Tested by: thompsa@, Eygene A. Ryabinkin
MFC after: 7 days
variable on the spx_input() stack. It's not very large, and this will
avoid parallelism issues when spx_input() runs in more than one thread at
a time.
MFC after: 1 month
- [1] Make the driver friendly towards kernel without PREEMPTION.
Use msleep(9) instead of simple unlock-check_variable-lock mechanisme
since the later not really effective in non-preemptible kernel
(especially during codec detection routine).
- Free most driver resources in a sane manner to avoid possible
double free and panics especially during device detach and codec
detection failure.
MFC after: 3 days
[1] http://lists.freebsd.org/pipermail/freebsd-questions/2006-March/116515.html
relocate it), do not attempt to call pmap_vac_me_harder() on the page.
At this point m will be NULL, and we know we won't have any cache
issues with this page.
Correctly identify the user running opiepasswd(1) when the login name
differs from the account name. [2]
Security: FreeBSD-SA-06:11.ipsec [1]
Security: FreeBSD-SA-06:12.opie [2]
that have the specified kind, instead of assuming that there is
only one report of the right kind in the report descriptor.
Submitted by: Morten Johansen
Obtained from: NetBSD (indirectly)
PR: usb/77604
VFS_LOCK_GIANT/VFS_UNLOCK_GIANT calls. This completely removes Giant
acquisition in the syscall path for ffs.
Bug fix to kern_fhstatfs from: Todd Miller <Todd.Miller@sparta.com>
Sponsored by: Isilon Systems, Inc.
- rename some file local structure definitions, the names clash with
autogenerated names
- on !alpha add some compatibility defines for those renamed structures
- make some functions globally visible on alpha
o don't send management frames if the IFF_DRV_RUNNING flag is not set.
this prevents the timeout watchdog from being potentially re-armed
when the interface is brought down.
fixes a crash that occurs with RT2661 based adapters.
reported by Arnaud Lacombe.
Specifically, on mappings with PG_G set pmap_remove() not only performs
the necessary per-page invlpg invalidations but also performs an
unnecessary invalidation of the entire set of non-PG_G entries.
Reviewed by: tegge
multicast addresses from carp interface. [1]
o Rewrite carpdetach(), so that it does the following things: [1]
- Stops callouts.
- Decrements carp_suppress_preempt, if needed.
- Downs interface and sets CARP state to INIT.
- Calls carp_multicast_cleanup().
- Detaches softc from carp_if and if we are the last frees
the carp_if.
o Use new carpdetach() in carp_clone_destroy().
o In carp_ifdetach() acquire the carp_if lock and cleanup all
interfaces hanging on carp_if. [1]
o Make carp_ifdetach() static and use EVENT(9) to call it
from if_detach(). [2]
o In carp_setrun() exit if the softc doesn't have a valid pointer
to parent. [1]
Obtained from: OpenBSD [1]
Submitted by: Dan Lukes <dan obluda.cz> [2]
PR: kern/82908 [2]
- Determine open direction using 'flags', not 'mode'. This bug exist since
past 4 years.
- Don't allow opening the same device twice, be it in a same or different
direction.
- O_RDWR is allowed, provided that it is done by a single open (for example
by mixer(8)) and the underlying hardware support true full-duplex operation.
- Do various paranoid checking in case other process/thread trying to hijack
the same device twice (or more).
MFC after: 5 days
- <netipx> headers [1]
- IPX library (libipx)
- IPX support in ifconfig(8)
- IPXrouted(8)
- new MK_NCP option
New MK_NCP build option controls:
- <netncp> and <fs/nwfs> headers
- NCP library (libncp)
- ncplist(1) and ncplogin(1)
- mount_nwfs(8)
- ncp and nwfs kernel modules
User knobs: WITHOUT_IPX, WITHOUT_IPX_SUPPORT, WITHOUT_NCP.
[1] <netsmb/netbios.h> unconditionally uses <netipx> headers
so they are still installed. This needs to be dealt with.
"fdinit() fails to initialize newfdp->fd_fd.fd_lastfile to -1. This breaks
fdcopy() which will incorrectly set newfdp->fd_freefile to 1 if no files are
open and the last file descriptor marked as unused for fdp was 0. This later
causes descriptor 0 to be unavailable in newfdp when the optimization is
enabled.
When the last file descriptor previously marked as used is nonzero and marked
as unused, fdunused() incorrectly sets fdp->fd_lastfile to fd - 1 due to
fd_last_used() returning (size - 1). This hides the problem that breaks the
optimization."
This allows us to keep the optimization, while un-breaking it.
This is a RELENG_6 candidate.
PR: kern/87208
MFC after: 1 week
Submitted by: tegge
the target directory or file. This case should fail in the filesystem
anyway and perhaps kern_rename() should catch it.
Sponsored by: Isilon Systems, Inc.
branch:
Integrate audit.c to audit_worker.c, so as to migrate the worker
thread implementation to its own .c file.
Populate audit_worker.c using parts now removed from audit.c:
- Move audit rotation global variables.
- Move audit_record_write(), audit_worker_rotate(),
audit_worker_drain(), audit_worker(), audit_rotate_vnode().
- Create audit_worker_init() from relevant parts of audit_init(),
which now calls this routine.
- Recreate audit_free(), which wraps uma_zfree() so that
audit_record_zone can be static to audit.c.
- Unstaticize various types and variables relating to the audit
record queue so that audit_worker can get to them. We may want
to wrap these in accessor methods at some point.
- Move AUDIT_PRINTF() to audit_private.h.
Addition of audit_worker.c to kernel configuration, missed in
earlier submit.
Obtained from: TrustedBSD Project
Add ioctls to audit pipes in order to allow querying of the current
record queue state, setting of the queue limit, and querying of pipe
statistics.
Obtained from: TrustedBSD Project
net.inet.ip.portrange.reservedlow apply to IPv6 aswell as IPv4.
We could have made new sysctls for IPv6, but that potentially makes
things complicated for mapped addresses. This seems like the least
confusing option and least likely to cause obscure problems in the
future.
This change makes the mac_portacl module useful with IPv6 apps.
Reviewed by: ume
MFC after: 1 month