when not propagated on fork (due to minherit(2)). Consistency checks
otherwise fail when the vm_map is freed and it appears to have not been
emptied completely, causing an INVARIANTS panic in vm_map_zdtor().
PR: kern/68017
Submitted by: Mark W. Krentel <krentel@dreamscape.com>
Reviewed by: alc
because UFS uses fixed-size directory blocks. When using this code with
other file systems, such as HFS+, the value of auio.uio_resid will need
to be taken into account.
local to a function. Remove a couple of blank lines in variable
declarations.
In one case, explicitly test against NULL rather than using a pointer
as a boolean directly.
older API to list attributes on a file (zero-length attribute name)
to function. extattr_list_*() are now the only available APIs to
use when listing attributes.
only allow this to be further processed when bridging is active on
that interface, but also if the current packet has a VLAN tag and
VLANs are active on our interface. This gives the VLAN layers a
chance to also consider the packet (and perhaps drop it instead of the
main dispatcher).
This fixes a situation where bridging was only active on VLAN
interfaces but ether_demux() called on behalf of the main interface
had already thrown the packet away.
MFC after: 4 weeks
are currently all bad BIOS revisions that will never be able to support
ACPI. They were derived by examining which BIOSes are blacklisted by other
operating systems. Other types of quirks will be possible here as well.
network interfaces. This global mutex will protect all ifnet labels.
Acquire the mutex across various MAC activities on interfaces, such
as security checks, propagating interface labels to mbufs generated
from the interface, retrieving and setting the interface label.
Introduce mpo_copy_ifnet_label MAC policy entry point to copy the
value of an interface label from one label to another. Use this
to avoid performing a label externalize while holding mac_ifnet_mtx;
copy the label to a temporary ifnet label and then externalize that.
Implement mpo_copy_ifnet_label for various MAC policies that
implement interface labeling using generic label copying routines.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
locking in tcp_input() for TCP packets with urgent data pointers to
hold the socket buffer lock across testing and updating oobmark
from just protecting sb_state.
Update socket locking annotations
Ultra2 users may want to set OFWCONS_POLL_HZ to a value of '20'.
I have left the default value at '4' as higher values can consume more
CPU than is acceptable, and we don't have a consensus yet on what an
optimal value is.
Submitted by: Pyun YongHyeon <yongari@kt-is.co.kr>
not active GEOM providers, it will result in a kernel panic.
If the GEOM provider or disk goes away before the volume
configuration data gets written to the disk, it will result
in another kernel panic.
o Make sure that the drives specified for volume creation
are active GEOM providers.
o When writing out volume configuration data to associated drives,
make sure that the GEOM provider is active, otherwise continue
to the next drive in the volume.
Approved by: le, bmilekic (mentor)
Giant if debug.mpsafenet=0, as any points that require synchronization
in the SMPng world also required it in the Giant-world:
- inpcb locks (including IPv6)
- inpcbinfo locks (including IPv6)
- dummynet subsystem lock
- ipfw2 subsystem lock
- Assert the mutex in NG_IDHASH_FIND() since the mutex is required to
safely walk the node lists in the ng_ID_hash table.
- Acquire the ng_nodelist_mtx when walking ng_allnodes or ng_allhooks
to generate state dump output from the netgraph sysctls.
the socket buffer having its limits adjusted. sbreserve() now acquires
the lock before calling sbreserve_locked(). In soreserve(), acquire
socket buffer locks across read-modify-writes of socket buffer fields,
and calls into sbreserve/sbrelease; make sure to acquire in keeping
with the socket buffer lock order. In tcp_mss(), acquire the socket
buffer lock in the calling context so that we have atomic
read-modify-write on buffer sizes.
smp_rendezvous() to ensure we run on the BSP. This reverts rev 1.128.
Add a comment indicating that MI code should be the one that runs all
shutdown functions on the BSP with the APs halted. This should work
around problems in power off while waiting for the MI code to be improved.
waiting for the socket to connect and use msleep() on the socket
mutex rather than tsleep(). Acquire socket buffer mutexes around
read-modify-write of socket buffer flags.
The general UMA lock is a recursion-allowed lock because
there is a code path where, while we're still configured
to use startup_alloc() for backend page allocations, we
may end up in uma_reclaim() which calls zone_foreach(zone_drain),
which grabs uma_mtx, only to later call into startup_alloc()
because while freeing we needed to allocate a bucket. Since
startup_alloc() also takes uma_mtx, we need to be able to
recurse on it.
This exact explanation was also added as a comment above mtx_init().
Trace showing recursion reported by: Peter Holm <peter-at-holm.cc>
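
The recursion path described above can be modelled in user space. This is
only a toy, single-threaded sketch: struct rec_lock and the toy_* functions
are hypothetical stand-ins, not the UMA code, and the depth counter merely
imitates what a recursion-allowed mutex permits.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a recursion-allowed mutex (hypothetical). */
struct rec_lock {
	int depth;	/* times the owner currently holds the lock */
	int max_depth;	/* deepest nesting seen, for illustration */
};

static void
rec_lock_acquire(struct rec_lock *l)
{
	l->depth++;
	if (l->depth > l->max_depth)
		l->max_depth = l->depth;
}

static void
rec_lock_release(struct rec_lock *l)
{
	assert(l->depth > 0);
	l->depth--;
}

/* Models the boot-time allocator, which takes the lock itself. */
static void
toy_startup_alloc(struct rec_lock *l)
{
	rec_lock_acquire(l);
	/* ... hand out a page ... */
	rec_lock_release(l);
}

/* Models the drain path: freeing needs a bucket while the lock is held. */
static void
toy_reclaim(struct rec_lock *l)
{
	rec_lock_acquire(l);
	toy_startup_alloc(l);	/* second acquisition by the same owner */
	rec_lock_release(l);
}
```

With a non-recursive lock the nested acquisition in toy_reclaim() would be
a self-deadlock or an assertion failure, which is why the lock has to allow
recursion.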
originated on RELENG_4 and was ported to -CURRENT.
The scoreboarding code was obtained from OpenBSD, and many
of the remaining changes were inspired by OpenBSD, but not
taken directly from there.
You can enable/disable sack using net.inet.tcp.do_sack. You can
also limit the number of sack holes that all senders can have in
the scoreboard with net.inet.tcp.sackhole_limit.
Reviewed by: gnn
Obtained from: Yahoo! (Mohan Srinivasan, Jayanth Vijayaraghavan)
This should fix problems with older SMP systems that only have ISA/EISA
IRQs when routing virgin PCI interrupts as well as on other boxes whose
MADT does not have any interrupt override entries for ISA IRQs that are
used to route PCI interrupts even in APIC mode.
actually used. For most ACPI devices this means deferring the call
until bus_alloc_resource().
- Add a function acpi_config_intr() to call BUS_CONFIG_INTR() for an
ACPI IRQ resource using the trigger mode and polarity information
stored in the ACPI resource object.
- Add a function acpi_lookup_irq_resource() to look up the ACPI IRQ
resource that corresponds to a specified rid and new-bus resource.
- Have the ACPI PCI bridge driver call BUS_CONFIG_INTR() on interrupts
that it routes through link devices.
- Remove needactivate variable from acpi_alloc_resource() by changing the
function to not modify the flags variable but just mask off RF_ACTIVE when
calling rman_reserve_resource().
Reviewed by: njl (1, an earlier version)
- Allow ioapic_set_{nmi,smi,extint}() to be called multiple times on the
same pin so long as the pin's mode is the same as the mode being
requested.
- Add a notion of bus type for the interrupt associated with an interrupt pin.
This is needed so that we can force all EISA interrupts to be active high
in the forthcoming ioapic_config_intr().
- Fix a bug for EISA systems that didn't remap IRQs. This would have broken
EISA systems that tried to disable mixed mode for IRQ 0.
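
The "same pin, same mode" rule from the first item can be sketched as
follows. The type and function names are hypothetical, and returning EBUSY
for a conflicting request is an assumption made for this sketch rather than
something stated above.

```c
#include <errno.h>

/* Hypothetical pin modes; TOY_PIN_DEFAULT means "not yet claimed". */
enum toy_pin_mode {
	TOY_PIN_DEFAULT,
	TOY_PIN_NMI,
	TOY_PIN_SMI,
	TOY_PIN_EXTINT
};

struct toy_pin {
	enum toy_pin_mode mode;
};

/*
 * Claim a pin for a special mode.  Repeating a request for the mode the
 * pin already has succeeds; asking for a different mode is refused.
 */
static int
toy_pin_set_mode(struct toy_pin *pin, enum toy_pin_mode mode)
{
	if (pin->mode == mode)
		return (0);		/* idempotent repeat */
	if (pin->mode != TOY_PIN_DEFAULT)
		return (EBUSY);		/* already claimed for another mode */
	pin->mode = mode;
	return (0);
}
```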
case of NFS mounted swap, so do not try to dereference it.
While we're here, brucify the printf() call which happens when we
time out on acquisition of vm_page_queue_mtx.
PR: kern/67898
Submitted by: bde (style)
device associated with any PCI devices that are enumerated in the ACPI
tree when adding children to an ACPI PCI bus and remove the duplicate
ACPI-only device_t and replace the device_t associated with the handle with
the ACPI and PCI aware device_t.
Several changes:
* Implement read for ulpt.
* If the device is not opened for reading, occasionally drain any
data the printer might have (but don't hammer the printer with reads).
* Lower the buffer size to one page.
The driver seems to work with more printers now.
Obtained from: NetBSD
depending on namespace pollution in <sys/vnode.h> for the definition
of mutex interfaces used in SOCKBUF_*LOCK().
Sorted includes.
Removed unused includes.
the SS_NBIO flag from the parent socket to the child socket during an
accept() operation.
The file descriptor O_NONBLOCK flag would have been propagated already
by the fflag assignment, and therefore would have been inconsistent
with the underlying socket's so_state member.
This makes accept() more closely adhere to the API contract we effectively
outline in the manual page. Note also that Linux continues to differ here;
O_NONBLOCK is not propagated. The other BSDs do propagate the flag, as
does Solaris. The Single UNIX Specification does not offer specific
advice on this issue.
PR: kern/45733
Requested by: Jayanth Vijayaraghavan
Reviewed by: rwatson
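
Since platforms differ on whether the flag is inherited, portable user
code may prefer to set O_NONBLOCK explicitly on the descriptor returned by
accept(). A minimal sketch using fcntl(2); the helper name set_nonblocking()
is made up for this example.

```c
#include <fcntl.h>

/*
 * Set O_NONBLOCK on fd regardless of what was inherited.
 * Returns 0 on success, -1 on error (errno is set by fcntl).
 */
static int
set_nonblocking(int fd)
{
	int flags;

	flags = fcntl(fd, F_GETFL, 0);
	if (flags == -1)
		return (-1);
	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
		return (-1);
	return (0);
}
```

Calling this right after accept() gives the same result on systems that
propagate the flag and on those that do not.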
%di will already point to the character after the nul char when the
'repnz scasb' terminates.
Submitted by: Tom Cosgrove tom dot cosgrove at arches-consulting dot com
- Split the code out into if_clone.[ch].
- Locked struct if_clone. [1]
- Add a per-cloner match function rather than simply matching names of
the form <name><unit> and <name>.
- Use the match function to allow creation of <interface>.<tag>
vlan interfaces. The old way is preserved unchanged!
- Also use the match function to allow creation of stf(4) interfaces named
stf0, stf, or 6to4. This is the only major user visible change in
that "ifconfig stf" creates the interface stf rather than stf0 and
does not print "stf0" to stdout.
- Allow destroy functions to fail so they can refuse to delete
interfaces. Currently, we forbid the deletion of interfaces which
were created in the init function, particularly lo0, pflog0, and
pfsync0. In the case of lo0 this was a panic implementation so it
does not count as a user-visible change. :-)
- Since most interfaces do not need the new functionality, a family of
wrapper functions, ifc_simple_*(), were created to wrap old style
cloner functions.
- The IF_CLONE_INITIALIZER macro is replaced with a new incompatible
IFC_CLONE_INITIALIZER and ifc_simple consumers use IFC_SIMPLE_DECLARE
instead.
Submitted by: Maurycy Pawlowski-Wieronski <maurycy at fouk.org> [1]
Reviewed by: andre, mlaier
Discussed on: net
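
To illustrate what a per-cloner match function might look like, here is a
user-space sketch. Both functions and their names are hypothetical and are
not the if_clone code; they only show the kind of name tests described
above (stf0/stf/6to4, and <interface>.<tag> with a numeric tag).

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical matcher: accepts "stf", "6to4", or "stf<unit>". */
static int
toy_stf_match(const char *name)
{
	const char *cp;

	assert(name != NULL);
	if (strcmp(name, "stf") == 0 || strcmp(name, "6to4") == 0)
		return (1);
	if (strncmp(name, "stf", 3) != 0)
		return (0);
	for (cp = name + 3; *cp != '\0'; cp++)
		if (!isdigit((unsigned char)*cp))
			return (0);
	return (1);
}

/* Hypothetical matcher: accepts "<interface>.<tag>" with a numeric tag. */
static int
toy_vlan_match(const char *name)
{
	const char *dot, *cp;

	assert(name != NULL);
	dot = strrchr(name, '.');
	if (dot == NULL || dot == name || dot[1] == '\0')
		return (0);	/* need text on both sides of the dot */
	for (cp = dot + 1; *cp != '\0'; cp++)
		if (!isdigit((unsigned char)*cp))
			return (0);
	return (1);
}
```

A real matcher would likely also check that the parent interface exists
and that the tag is in range; the sketch leaves that out.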
Only the first link0..link$NLINKS hooks would be utilized, whereas
the link hooks may be connected sparsely.
Add a counter variable so that the link hook array is only traversed
while there is still work to do, but that it continues up to the end
if it has to.
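
A rough sketch of the counter idea, with hypothetical names and a plain
pointer array standing in for the link hook array: the loop skips empty
slots, stops as soon as every connected hook has been handled, and
otherwise runs to the end of the array.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Walk a sparsely populated hook array.  num_connected is the number of
 * non-NULL entries; *slots_examined reports how far the walk had to go.
 * Returns the number of hooks handled.
 */
static int
toy_deliver_all(void *const *links, int nlinks_max, int num_connected,
    int *slots_examined)
{
	int delivered, i, remaining;

	assert(slots_examined != NULL);
	remaining = num_connected;
	delivered = 0;
	for (i = 0; i < nlinks_max && remaining > 0; i++) {
		if (links[i] == NULL)
			continue;	/* unconnected slot, keep looking */
		/* ... deliver to links[i] here ... */
		delivered++;
		remaining--;
	}
	*slots_examined = i;
	return (delivered);
}
```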
* block packets that fail to create state table entries
* only allow non-fragmented packets to influence whether or not a logged
packet is the same as the one logged before.
* correct the ICMP packet checksum fix-up when processing ICMP errors for NAT
* implement a maximum for the number of entries in the NAT table (NAT_TABLE_MAX
and ipf_nattable_max)
* frsynclist() wasn't paying attention to all the places where interface
names are, like it should.
* fix comparing ICMP packets with established TCP state where only 8 bytes
of header are returned in the ICMP error.
MFC after: 1 week
* Obtain/release schedlock around calls to calcru.
* Sort switch cases which do not cascade per style(9).
* Sort local variables per style(9).
* Remove "superfluous" whitespace.
* Cleanup handling of NULL uap->tp in clock_getres(). It would probably
be better to return EFAULT like clock_gettime() does by passing the
pointer to copyout(), but I presume it was written to not fail on
purpose in the original code. I'll defer to -standards on this one.
Reported by: bde
This is not really used by the process but it's confusing to some
status readers to see zombie processes with "running" threads.
Pointed out by: Don Lewis <truckman@FreeBSD.org>
the GEOM topology.
There are still issues with not detaching from cam correctly such that
upon a device detach there's an invalid pointer dereference from the
later call to cam_rescan().
that are on a CISS bus to be exported up to CAM and made available as normal
devices. This will typically add one or two buses to CAM, which will be
numbered starting at 32 to allow room for CISS proxy buses. Also, the CISS
firmware usually hides disk devices, but these can also be exposed as 'pass'
devices if you set the hw.ciss.expose_hidden_physical tunable.
Sponsored by: Tape Laboratories, Inc.
MFC After: 3 days
where it is known to detect a problem but the problem is not very easy
to fix. The warning became very common recently after a call to calcru()
was added to fill_kinfo_thread().
Another (much older) cause of "negative times" (actually non-monotonic
times) was fixed in rev.1.237 of kern_exit.c.
Print separate messages for non-monotonic and negative times.
from exit1(). sched_exit() must be called unconditionally from exit1().
It was called almost unconditionally because init only exits on system
shutdown, if at all.
(2) Removed the comment that presumed to know what sched_exit() does.
sched_exit() does different things for the ULE case. The call became
essential when it started doing load average stuff, but its caller
should not know that.
(3) Didn't fix bugs caused by bitrot in the condition. The condition was
last correct in rev.1.208 when it was in wait1(). There p was spelled
curthread->td_proc and was for the waiting parent; now p is for the
exiting child. The condition was to avoid lowering init's priority.
It should be in sched_exit() itself. Lowering of priorities is broken
in other ways in at least the 4BSD scheduler, and doing it for init
causes less noticeable problems than doing it for shells.
Noticed by: julian (1)
least the pci device unloadable
- Use ttymalloc() rather than a plain malloc to allocate the
rp->rp_tty ttys. This is now required due to the recent locking
changes to ttys and prevents a panic due to locking an uninitialized
t_mtx.
- Allow the pci driver to be unloaded. This involved moving
the call to rp_releaseresource() to the end of rp_pcireleaseresource(),
since rp_pcireleaseresource() uses ctlp->dev, which is freed
by rp_releaseresource().
- Allow the generic part of the driver to be unattached by providing
a hook to cancel timeouts.
Glanced at by: obrien
- sowakeup() now asserts the socket buffer lock on entry. Move
the call to KNOTE higher in sowakeup() so that it is made with
the socket buffer lock held for consistency with other calls.
Release the socket buffer lock prior to calling into pgsigio(),
so_upcall(), or aio_swake(). Locking for this event management
will need revisiting in the future, but this model avoids lock
order reversals when upcalls into other subsystems result in
socket/socket buffer operations. Assert that the socket buffer
lock is not held at the end of the function.
- Wrapper macros for sowakeup(), sorwakeup() and sowwakeup() now
have _locked versions which assert the socket buffer lock on
entry. If a wakeup is required by sb_notify(), invoke
sowakeup(); otherwise, unconditionally release the socket buffer
lock. This results in the socket buffer lock being released
whether a wakeup is required or not.
- Break out socantsendmore() into socantsendmore_locked() that
asserts the socket buffer lock. socantsendmore()
unconditionally locks the socket buffer before calling
socantsendmore_locked(). Note that both functions return with
the socket buffer unlocked as socantsendmore_locked() calls
sowwakeup_locked() which has the same properties. Assert that
the socket buffer is unlocked on return.
- Break out socantrcvmore() into socantrcvmore_locked() that
asserts the socket buffer lock. socantrcvmore() unconditionally
locks the socket buffer before calling socantrcvmore_locked().
Note that both functions return with the socket buffer unlocked
as socantrcvmore_locked() calls sorwakeup_locked() which has
similar properties. Assert that the socket buffer is unlocked
on return.
- Break out sbrelease() into a sbrelease_locked() that asserts the
socket buffer lock. sbrelease() unconditionally locks the
socket buffer before calling sbrelease_locked().
sbrelease_locked() now invokes sbflush_locked() instead of
sbflush().
- Assert the socket buffer lock in socket buffer sanity check
functions sblastrecordchk(), sblastmbufchk().
- Assert the socket buffer lock in SBLINKRECORD().
- Break out various sbappend() functions into sbappend_locked()
(and variations on that name) that assert the socket buffer
lock. The !_locked() variations unconditionally lock the socket
buffer before calling their _locked counterparts. Internally,
make sure to call _locked() support routines, etc, if already
holding the socket buffer lock.
- Break out sbinsertoob() into sbinsertoob_locked() that asserts
the socket buffer lock. sbinsertoob() unconditionally locks the
socket buffer before calling sbinsertoob_locked().
- Break out sbflush() into sbflush_locked() that asserts the
socket buffer lock. sbflush() unconditionally locks the socket
buffer before calling sbflush_locked(). Update panic strings
for new function names.
- Break out sbdrop() into sbdrop_locked() that asserts the socket
buffer lock. sbdrop() unconditionally locks the socket buffer
before calling sbdrop_locked().
- Break out sbdroprecord() into sbdroprecord_locked() that asserts
the socket buffer lock. sbdroprecord() unconditionally locks
the socket buffer before calling sbdroprecord_locked().
- sofree() now calls socantsendmore_locked() and re-acquires the
socket buffer lock on return. It also now calls
sbrelease_locked().
- sorflush() now calls socantrcvmore_locked() and re-acquires the
socket buffer lock on return. Clean up/mess up other behavior
in sorflush() relating to the temporary stack copy of the socket
buffer used with dom_dispose by more properly initializing the
temporary copy, and selectively bzeroing/copying more carefully
to prevent WITNESS from getting confused by improperly
initialized mutexes. Annotate why that's necessary, or at
least, needed.
- soisconnected() now calls sbdrop_locked() before unlocking the
socket buffer to avoid locking overhead.
Some parts of this change were:
Submitted by: sam
Sponsored by: FreeBSD Foundation
Obtained from: BSD/OS
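
The recurring "break out foo() into foo_locked()" pattern in the list above
can be sketched in user space as follows. Everything here is a toy: the
struct, the macros and the toy_sbdrop*() names are hypothetical, and an
integer flag stands in for the real socket buffer mutex.

```c
#include <assert.h>
#include <stddef.h>

/* Toy socket buffer; the flag stands in for a real mutex (hypothetical). */
struct toy_sockbuf {
	int	locked;	/* 1 while the "lock" is held */
	size_t	cc;	/* bytes currently buffered */
};

#define	TOY_SB_LOCK(sb) do {						\
	assert(!(sb)->locked);						\
	(sb)->locked = 1;						\
} while (0)
#define	TOY_SB_UNLOCK(sb) do {						\
	assert((sb)->locked);						\
	(sb)->locked = 0;						\
} while (0)
#define	TOY_SB_LOCK_ASSERT(sb)	assert((sb)->locked)

/* _locked variant: the caller must already hold the lock. */
static void
toy_sbdrop_locked(struct toy_sockbuf *sb, size_t len)
{
	TOY_SB_LOCK_ASSERT(sb);
	sb->cc -= (len < sb->cc) ? len : sb->cc;
}

/* Unlocked variant: take the lock, delegate, release. */
static void
toy_sbdrop(struct toy_sockbuf *sb, size_t len)
{
	TOY_SB_LOCK(sb);
	toy_sbdrop_locked(sb, len);
	TOY_SB_UNLOCK(sb);
}
```

Code that already holds the lock calls the _locked variant directly and
avoids a second acquisition; everything else keeps using the plain name.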
ld: locore.o: non-pic code with imm relocation against dynamic
symbol `__gp'
With binutils 2.15, ld(1) defines the implicit/automatic symbol __gp
as a dynamic symbol and thus will now complain when used in a non-PIC
fashion (the immediate relocation used to set the GP register). Resolve
this by defining __gp in the linker script. Make sure __gp is aligned
on a 16-byte boundary.
Note: the 0x200000 magic offset is due to having a 22-bit GP-relative
relocation. The GOT will be accessed with negative offsets from GP.
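
The arithmetic behind the note can be checked with a few lines of C. A
signed 22-bit immediate spans -2^21 .. 2^21-1, and 2^21 is 0x200000. The
helper names are made up, and the way the value is composed (align to 16,
then add the offset) is an assumption for illustration, not a copy of the
linker script.

```c
#include <stdint.h>

#define	TOY_GP_ALIGN	UINT64_C(16)
#define	TOY_GP_OFFSET	(UINT64_C(1) << 21)	/* 2^21 == 0x200000 */

/* Round addr up to the next multiple of align (align is a power of 2). */
static uint64_t
toy_align_up(uint64_t addr, uint64_t align)
{
	return ((addr + align - 1) & ~(align - 1));
}

/* Hypothetical GP placement: aligned start plus the 22-bit half range. */
static uint64_t
toy_gp_value(uint64_t start)
{
	return (toy_align_up(start, TOY_GP_ALIGN) + TOY_GP_OFFSET);
}
```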
that it is a series of alphabetically-ordered #ifdef's, from Bruce Evans.
Define two new thread-related values in kproc_info, from Cyrille Lefevre.
Also remove a few values from kproc_info that were not needed, and change
around a few comments, from me. Changes are combined into a single commit
simply because it is a hassle to make sure that alignments and sizes are
not changed on any platform when modifying kproc_info.
socket from its accept queue when aborting it during a new inbound
connection. Update spx_input() to acquire the accept lock, assert
the condition of the socket on its parent queue, and appropriately
disconnect it from the queue before calling soabort() on it.
Tweak things so that ng_fec has a chance of working with things
other than ethernet. Use ifp->if_output of the underlying interfaces
and use IF_HANDOFF() rather than depending on ether_output() and
ether_output_frame() explicitly. Also, don't insist that underlying
devices be IFM_ETHER when checking their link states in the link
monitor code.
With these changes, I was able to create a two channel bundle
consisting of one ethernet interface and one 802.11 wireless
device (via ndis). Note that this only works because both devices
use the same if_output vector: ng_fec will not let you bundle
devices with different output vectors together (it really doesn't
make sense to do that).
underlying interfaces rather than using ac_netgraph in struct arpcom.
The latter is meant only for use by ng_ether, and using it breaks
interoperability with the rest of netgraph.
socket lock over pulling so_options and so_linger out of the socket
structure in order to retrieve a consistent snapshot. This may be
overkill if user space doesn't require a consistent snapshot.
resolved by socket locking: in particular, that we test the connection
state at the socket layer without locking, request that the protocol
begin listening, and then set the listen state on the socket
non-atomically, resulting in a non-atomic cross-layer test-and-set.
lots of errors. Blind substitution of "dev_t foo" by "struct cdev *foo"
in comments usually just created an English syntax error (e.g.,
"struct cdev *changes"), but here it did less than that since the dev_t
is a user dev_t.
in npxsetregs() too. npxsetregs() must overwrite the previous state, and
it is never paired with an npxgetregs() that would defuse the previous
state (since npxgetregs() would have fninit'ed the state, leaving nothing
to do).
PR: 68058 (this should complete the fix)
Tested by: Simon Barner <barner@in.tum.de>
to vm_map_find() that is less likely to be outside of addressable memory
for 32-bit processes: just past the end of the largest possible heap.
This is the same hint that mmap() uses.
ki_childutime, and ki_emul. Also uses the timevaladd() routine to
correct the calculation of ki_childtime. That will correct the value
returned when ki_childtime.tv_usec > 1,000,000.
This also implements a new KERN_PROC_GID option for kvm_getprocs().
(there will be a similar update to lib/libkvm/kvm_proc.c)
Submitted by: Cyrille Lefevre
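
For reference, the normalization that a timevaladd()-style routine performs
can be sketched like this. toy_timevaladd() is a stand-in written for this
note and assumes both inputs are already normalized; it is not the kernel
routine itself.

```c
#include <assert.h>
#include <sys/time.h>

/*
 * Add *t2 into *t1 and carry any overflow of microseconds into seconds,
 * so that 0 <= tv_usec < 1000000 holds again afterwards.
 */
static void
toy_timevaladd(struct timeval *t1, const struct timeval *t2)
{
	assert(t1->tv_usec >= 0 && t1->tv_usec < 1000000);
	assert(t2->tv_usec >= 0 && t2->tv_usec < 1000000);
	t1->tv_sec += t2->tv_sec;
	t1->tv_usec += t2->tv_usec;
	if (t1->tv_usec >= 1000000) {
		t1->tv_sec++;
		t1->tv_usec -= 1000000;
	}
}
```

Adding the two fields separately without the carry step is what leaves
tv_usec above one million.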
which just mark areas which are empty due to issues with the alignment
of already-existing fields. This defines several unrelated variables
in one shot, because most of the work for updating kinfo_proc is making
sure the sizeof(struct kinfo_proc) remains the same across all hardware
platforms, and that no space is wasted on any platform due to alignment
issues with the new variables.
Submitted by: some by Cyrille Lefevre, some by me
unnecessary because cpu_setregs() and/or npxinit() always sets CR0_TS
during system initialization, and CR0_TS is set in the next statement
(fpstate_drop()) if necessary after system initialization. Setting
it unnecessarily was less than a pessimization since it broke the
invariant that the npx can be used without an npxdna() trap if
fpucurthread is non-null. The broken invariant became harmful when I
added an fnclex to npxdrop().
Removed setting of CR0_MP in exec_setregs(). This was similarly
unnecessary but was harmless.
Updated comments (mainly by removing them). Things are simpler now
that we have cpu_setregs() and don't support a math emulator or pretend
to support not having either a math emulator or an npx.
Removed the ifdef for avoiding setting CR0_NE in the !SMP case in
cpu_setregs(). npx_probe() should reverse the setting if it wants to
force IRQ13 exception handling for testing.