in the error cases, causing panics.
Adapted from similar fix to NFSv3 mkdir submitted by Mohan Srinivasan mohans
at yahoo-inc dot com
Approved by: alfred
should not return ERESTART after it caught a signal, otherwise
thr_wake() call will be lost, also a timeout wait should not be
restarted. Final, using wakeup not wakeup_one to be safeness.
they would leave enough elements on the stack that if you escaped to the
loader prompt and then typed 'setenv', it would pull in all of the leaked
junk and cause an exception in the environment. There still seems to be
3 leaked elements, but they don't appear to be coming from this file.
a deadlock (with NFS exclusive vnode locks enabled). Lookup
grabs the parent's lock and wants to lock child. Readdirplus
locks the child and wants to lock parent (for loading the attrs
for ".."). The fix is to not load the attrs for ".." in
readdirplus.
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson
This closes a major hole in close-to-open consistency support.
Added a new sysctl so that this can be disabled for single NFS
client applications with very large amounts of mmap'ed IO (for
performance).
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson
returned back to df from a statfs call. Causing df to print negative
values.
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson
specified register, but a pointer to the in-memory representation of
that value. The reason for this is twofold:
1. Not all registers can be represented by a register_t. In particular
FP registers fall in that category. Passing the new register value
by reference instead of by value makes this point moot.
2. When we receive a G or P packet, both are for writing a register,
the packet will have the register value in target-byte order and
in the memory representation (modulo the fact that bytes are sent
as 2 printable hexadecimal numbers of course). We only need to
decode the packet to have a pointer to the register value.
This change fixes the bug of extracting the register value of the P
packet as a hexadecimal number instead of as a bit array. The quick
(and dirty) fix to bswap the register value in gdb_cpu_setreg() as
it has been added on i386 and amd64 can therefore be removed and has
in fact been that.
Tested on: alpha, amd64, i386, ia64, sparc64
@sys/dev/acpica/acpi_pci_link.c:153" panic by backing out rev 1.37 in the SMP
case. It appears that on a dual-proc machine the assertions in the rev 1.37
commit log hold true.
Introduce domain_init_status to keep track of the init status of the domains
list (surprise). 0 = uninitialized, 1 = initialized/unpopulated, 2 =
initialized/done. Higher values can be used to support late addition of
domains which right now "works", but is potential dangerous. I choose to
only give a warning when doing so.
Use domain_init_status with if_attachdomain[1]() to ensure that we have a
complete domains list when we init the if_afdata array. Store the current
value of domain_init_status in if_afdata_initialized. This way we can update
if_afdata after a new protocol has been added (once that is allowed).
Submitted by: se (with changes)
Reviewed by: julian, glebius, se
PR: kern/73321 (partly)
call net_add_domain(). Calling this function too early (or late) breaks
assertations about the global domains list.
Actually it should be forbidden to call net_add_domain() outside of
SI_SUB_PROTO_DOMAIN completely as there are many places where we traverse
the domains list unprotected, but for now we allow late calls (mostly to
support netgraph). In order to really fix this we have to lock the domains
list in all places or find another way to ensure that we can safely walk the
list while another thread might be adding a new domain.
Spotted by: se
Reviewed by: julian, glebius
PR: kern/73321 (partly)
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
observations lead me to believe that the convetion for pc98 boot
loaders is to have a jump unstruction, followed by a string, followed
by code. The jump usually doesn't have a nop after it and usually the
string is NUL terminated, but Grub/98 breaks both of these rules.
# I looked for, but failed to find the Minux boot blocks for PC-9801 port.
512. If I had an audio cdrom in my cd player when I booted my system,
I'd get a panic from geom because you can't read 8192 bytes from an
audio cdrom.
Remove XXX comment about IPL1 and replace it with some information
from my soon to be published web page on the pc98 disk layout. The
IPL1 test was the result of an observation of a disk with FreeBSD's
boot0 program. It was testing part of an area what appears to be
reserved for a boot loader name, which comes after a jump over this
area. I don't yet know if it is required to be any specific jump
instruction, or if the destination has to be location 11. [1]
[1] FreeBSD Press No. 13, page 115, poorly translated by myself. The
picture there shows offset 8 as the destination of the jump, but
FreeBSD's boot0 program has three padding NULs after the IPL1 name and
uses a 16-bit 'jmp' instruction.
resource lists. It used to be sized based only on _CRS, hence _PRS could
perform an out-of-bounds access if it was larger (i.e., when there are
dependent functions). Add asserts to detect this case. Note, this is
only a temporary fix and I believe _PRS and _CRS should have separate
arrays.
Also, fix a typo where the wrong irq was being check for the APIC case.
Submitted by: tegge
to do a window update to the peer (thru an ACK) from soreceive()
itself. TCP will do that upon return from the socket callback.
Sending a window update from soreceive() results in a lock reversal.
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson
soreceive(), then pass in M_DONTWAIT to m_copym(). Also fix up error
handling for the case where m_copym() returns failure.
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson
that the exclusive lock is already held, then we call panic. Don't
clobber internal lock state before panic'ing. This change improves
debugging if this case were to happen.
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson
Approved by: Robert Watson <rwatson@freebsd.org>
Add locking to the IPv6 scoping code.
All spl() like calls have also been removed.
Cleaning up the handling of ifnet data will happen at a later date.
correct open/close behaviour of filesystems:
When an ioctl to modify the MBR arrives, we cannot take for granted that
we have the consumer open.
The symptom is that one cannot run 'boot0cfg -s2 /dev/ad0' in single-user
mode because / is the only open partition in only open r1w0e1.
If it is not, we attempt to increase the write count by one and
decrease it again afterwards.
Presumably most if not all other slices suffer from the same problem.
the address of a channel on a SCC, it returns 0 on failure. [1]
- Hardcode channel 1 for the keyboard on Z8530, the information present
in the Open Firmware device tree doesn't allow to determine this via
uart_cpu_channel(). This makes the keyboard (if one backs out rev. 1.5
of sys/dev/puc/puc_sbus.c and has both keyboard and mouse plugged in to
avoid the hang that revision works around) and consequently syscons(4)
on Ultra 2 work. There's a problem with the keyboard LEDs similar to
the one on Ultra 60 (LEDs don't get lit under X) though, instead of
lighting just a specific single one all get lit and can't be turned off
again. [1]
- Add comments about what uart_cpu_channel() and uart_cpu_getdev_keyboard()
do and their constraints.
- Improve the comments about what uart_cpu_getdev_[console,dbgport]() do,
they don't return an address (as in bus) but an Open Firmware package
handle.
Reviewed by: marcel (modulo the comments) [1]
instead acquire it conditionally in closef() if it is required for
advisory locking. This removes Giant from the close() path of sockets
and pipes (and any other objects that don't acquire Giant in their
fo_close path, such as kqueues). Giant will still be acquired twice for
vnodes -- once for advisory lock teardown, and a second time in the
fo_close method. Both Poul-Henning and I believe that the advisory lock
teardown code can be moved into the vn_closefile path shortly.
This trims a percent or two off the cost of most non-vnode close
operations on SMP, but has a fairly minimal impact on UP where the cost
of a single mutex operation is pretty low.
pointer updates: test available space while holding the socket buffer
mutex, and continue to hold until until the pointer update has been
performed.
MFC after: 2 weeks
o Remove a bogus comment that relates to alpha.
o s/u_int64_t/uint64_t/g
o Add bi_spare2 to make the internal padding explicit.
o Move BOOTINFO_MAGIC after the field it applies to.
changing the Makefile, fail the creation of loader.efi when there are
unresolved symbols in loader.sym. This avoids silently creating a
faulty EFI binary.
operation (by subtracting the absolute result from 0), don't test
for overflow.
This avoids an arithmetic exception when dividing LONG_MIN by 1:
This is the only case that causes overflow, and the resulting value
is correct under 2's compliment arithmetic.
PR: 72024
Approved by: dwmalone@
Obtained from: NetBSD
MFC after: 4 days
normal PPP compression, as a workaround for certain (arguably) broken
Linux PPP implementations that can't handle this particular case.
MFC after: 1 week
NetBSD got activated. NetBSD has an additional change in
their mii.c rev 1.26 which got missed with that merger:
: When probing for a PHY, look at the EXTSTAT bit in the BMSR, as well,
: not just the media mask. This prevents PHYs/TBIs that only support
: Gigabit media from slipping through the cracks.
With this GE only ones like from the SK-9844 are detected again.
PR: i386/63313, i386/71733, kern/73725
Tested by: matt baker <matt at sevenone dot com>, Jin Guojun <jin at george dot lbl dot gov>
Approved by: rwatson (mentor)
Obtained from: NetBSD mii.c rev 1.26
MFC after: 1 week
This socket option allows processes query a TCP socket for some low
level transmission details, such as the current send, bandwidth, and
congestion windows. Linux provides a 'struct tcpinfo' structure
containing various variables, rather than separate socket options;
this makes the API somewhat fragile as it makes it dificult to add
new entries of interest as requirements and implementation evolve.
As such, I've included a large pad at the end of the structure.
Right now, relatively few of the Linux API fields are filled in, and
some contain no logical equivilent on FreeBSD. I've include __'d
entries in the structure to make it easier to figure ou what is and
isn't omitted. This API/ABI should be considered unstable for the
time being.
Null_open() was only here to handle MNT_NODEV, but since that does
not affect any filesystems anymore, it could only have any effect
if you nullfs mounted a devfs but didn't want devices to show up.
If you need that, there are easier ways.
window was 0 bytes in size. This may have been the cause of unsolved
"connection not closing" reports over the years.
Thanks to Michiel Boland for providing the fix and providing a concise
test program for the problem.
Submitted by: Michiel Boland
MFC after: 2 weeks
Historically, our contigmalloc1() and contigmalloc2() assumes
that a page in PQ_CACHE can be unconditionally reused by busying
and freeing it. Unfortunatelly, when object happens to be not
NULL, the code will set m->object to NULL and disregard the fact
that the page is actually in the VM page bucket, resulting in
page bucket hash table corruption and finally, a filesystem
corruption, or a 'page not in hash' panic.
This commit has borrowed the idea taken from DragonFlyBSD's fix
to the VM fix by Matthew Dillon[1]. This version of patch will
do the following checks:
- When scanning pages in PQ_CACHE, check hold_count and
skip over pages that are held temporarily.
- For pages in PQ_CACHE and selected as candidate of being
freed, check if it is busy at that time.
Note: It seems that this is might be unrelated to kern/72539.
Obtained from: DragonFlyBSD, sys/vm/vm_contig.c,v 1.11 and 1.12 [1]
Reminded by: Matt Dillon
Reworked by: alc
MFC After: 1 week
and assume that the BIOS has set it up for us. This allows folks with a
serial-aware BIOS to set the BIOS to speeds above 9600 and allow boot0 to
just use the existing settings.
- Purge some gratuitous cpp comments as per style(9).
Submitted by: Danny Braniss danny at cs dot huji dot ac dot il (1)
MFC after: 1 month
assembler to cpp(1) comment conversions. This allows btx to compile again
when BTX_SERIAL is defined.
Reported by: Danny Braniss danny at cs dot huji dot ac dot il
MFC after: 1 month
'binat from ... to ... -> (if)' are used, where the interface
is dynamic.
Discovered by: kos(at)bastard(dot)net
Analyzed by: Pyun YongHyeon
Approved by: mlaier (mentor)
MFC after: 1 week
contents of the tcpcb are read and modified in volume.
In tcp_input(), replace th comparison with 0 with a comparison with
NULL.
At the 'findpcb', 'dropafterack', and 'dropwithreset' labels in
tcp_input(), assert 'headlocked'. Try to improve consistency between
various assertions regarding headlocked to be more informative.
MFC after: 2 weeks
Printf() a warning if if_attachdomain() is called more than once on an
interface to generate some noise on mailing lists when this occurs.
Fix up style in if_start(), where spaces crept in instead of tabs at
some point.
MFC after: 1 week
MFC note: Not the printf().
When a series of traces is included in a bug report, this will make it
easier to tie the trace information back to ps or threads output,
each of which will show the pid or the tid, but usually not both.
- Use a new-bus device driver for the ACPI PCI link devices. The devices
are called pci_linkX. The driver includes suspend/resume support so that
the ACPI bridge drivers no longer have to poke the links to get them
to handle suspend/resume. Also, the code to handle which IRQs a link is
routed to and choosing an IRQ when a link is not already routed is all
contained in the link driver. The PCI bridge drivers now ask the link
driver which IRQ to use once they determine that a _PRT entry does not
use a hardwired interrupt number.
- The new link driver includes support for multiple IRQ resources per
link device as well as preserving any non-IRQ resources when adjusting
the IRQ that a link is routed to.
- The entire approach to routing when using a link device is now
link-centric rather than pci bus/device/pin specific. Thus, when
using a tunable to override the default IRQ settings, one now uses
a single tunable to route an entire link rather than routing a single
device that uses the link (which has great foot-shooting potential if
the user tries to route the same link to two different IRQs using two
different pci bus/device/pin hints). For example, to adjust the IRQ
that \_SB_.LNKA uses, one would set 'hw.pci.link.LNKA.irq=10' from the
loader.
- As a side effect of having the link driver, unused link devices will now
be disabled when they are probed.
- The algorithm for choosing an IRQ for a link that doesn't already have an
IRQ assigned is now much closer to the one used in $PIR routing. When a
link is routed via an ISA IRQ, only known-good IRQs that the BIOS has
already used are used for routing instead of using probabilities to
guess at which IRQs are probably not used by an ISA device. One change
from $PIR is that the SCI is always considered a viable ISA IRQ, so that
if the BIOS does not setup any IRQs the kernel will degenerate to routing
all interrupts over the SCI. For non ISA IRQs, interrupts are picked
from the possible pool using a simplistic weighting algorithm.
Tested by: ru, scottl, others on acpi@
Reviewed by: njl
release the pipe mutex before calling fsetown(), as fsetown()
may block. The sigio code protects the pipe sigio data using
its own mutex, and the pipe reference count held by the caller
will prevent the pipe from being prematurely garbage-collected.
Discovered by: imp
is a PAL ID, while PCPU_GET(cpuid) is a FreeBSD CPU ID. The FreeBSD CPU
ID of the BSP is always zero, so use that to see which CPU should run the
full clock functions.
structure, so assert the inpcb lock associated with the tcptw.
Also assert the tcbinfo lock, as tcp_timewait() may call
tcp_twclose() or tcp_2msl_rest(), which require it. Since
tcp_timewait() is already called with that lock from tcp_input(),
this doesn't change current locking, merely documents reasons for
it.
In tcp_twstart(), assert the tcbinfo lock, as tcp_timer_2msl_rest()
is called, which requires that lock.
In tcp_twclose(), assert the tcbinfo lock, as tcp_timer_2msl_stop()
is called, which requires that lock.
Document the locking strategy for the time wait queues in tcp_timer.c,
which consists of protecting the time wait queues in the same manner
as the tcbinfo structure (using the tcbinfo lock).
In tcp_timer_2msl_reset(), assert the tcbinfo lock, as the time wait
queues are modified.
In tcp_timer_2msl_stop(), assert the tcbinfo lock, as the time wait
queues may be modified.
In tcp_timer_2msl_tw(), assert the tcbinfo lock, as the time wait
queues may be modified.
MFC after: 2 weeks
but unlikely races that could be corrected by having tcp_keepcnt
and tcp_keepintvl modifications go through handler functions via
sysctl, but probably is not worth doing. Updates to multiple
sysctls within evaluation of a single addition are unlikely.
Annotate that tcp_canceltimers() is currently unused.
De-spl tcp_timer_delack().
De-spl tcp_timer_2msl().
MFC after: 2 weeks
on the tcpcb, but also calls into tcp_close() and tcp_twrespond().
Annotate that tcp_twrecycleable() requires the inpcb lock because it does
a series of non-atomic reads of the tcpcb, but is currently called
without the inpcb lock by the caller. This is a bug.
Assert the inpcb lock in tcp_twclose() as it performs a read-modify-write
of the timewait structure/inpcb, and calls in_pcbdetach() which requires
the lock.
Assert the inpcb lock in tcp_twrespond(), as it performs multiple
non-atomic reads of the tcptw and inpcb structures, as well as calling
mac_create_mbuf_from_inpcb(), tcpip_fillheaders(), which require the
inpcb lock.
MFC after: 2 weeks
protects access to the ISN state variables.
Acquire the tcbinfo write lock in tcp_isn_tick() to synchronize
timer-driven isn bumping.
Staticize internal ISN variables since they're not used outside of
tcp_subr.c.
MFC after: 2 weeks
It means, that node listens to flow control messages from downstreams
and removes link from list of active links whenever a LINK_IS_DOWN message
is received. If LINK_IS_UP message is received, then links is put
back into list of active links.
Approved by: julian (mentor), implicitly
MFC after: 1 week
o Implement some netgraph flow control:
- Whenever status of HDLC heartbeat from pear is timed out,
send NGM_LINK_IS_DOWN message.
- If HDLC link changes status from down to up, send
NGM_LINK_IS_UP message.
Approved by: julian (mentor), implicitly
MFC after: 1 week
- Let hme_start()/hme_init() acquire lock and then call
hme_start_locked()/hme_init_locked() respectivly.
- Teardown interrupt handler before hme_detach().
- Remove IFF_NEEDSGIANT flag and mark interrupt handler INTR_MPSAFE.
- Set callout handler to CALLOUT_MPSAFE.
- Add locks in hme MII interface.
Reviewed by: jake
Tested by: Julian C. Dunn <jdunn at opentrend dot net>
MFC after: 2 weeks
Consolidate all of the bounce tests into the BUS_DMA_COULD_BOUNCE flag.
Allocate the bounce zone at either tag creation or map creation to help
avoid null-pointer derefs later on. Track total pages per zone so that
each zone can get a minimum allocation at tag creation time instead of
being defeated by mis-behaving tags that suck up the max amount.
Allocate the bounce zone at either tag creation or map creation to help
avoid null-pointer derefs later on. Track total pages per zone so that
each zone can get a minimum allocation at tag creation time instead of
being defeated by mis-behaving tags that suck up the max amount.
without ever being changed to actually work with an i8251. Nobody is
working on this either at the moment, so it's not about to change
soon.
When the code necessary to support the i8251 is committed, this can
be reverted again.
not used and aliases for other defines.
o Add REG_DATA as an alias for com_data. Likewise for other register
defines.
o Add LCR_SBREAK and make CFCR_SBREAK an alias for it. Likewise for
the other LCR register bits that are known with the CFCR prefix.
o Add MCR_IE and make MCR_IENABLE an alias for it.
o Add LSR_TEMT and make LSR_TSRE an alias for it.
o Add LSR_THRE and make LSR_TXRDY as alias for it.
o Add FCR_ENABLE and make FIFO_ENABLE as alias for it. Likewise for
the other FCR register bits that are known with the FIFO prefix.
o Add EFR_CTS and make EFR_AUTOCTS an alias for it.
o Add EFR_RTS and make EFR_AUTORTS an alias for it.
This is a first step in cleaning up the definitions in this file.
USPACE_SVC_STACK_TOP, USPACE_SVC_STACK_BOTTOM, USPACE_UNDEF_STACK_TOP,
and USPACE_UNDEF_STACK_BOTTOM look wrong to me, so I'm leaving them
alone.
Reviewed by: arch@
mp_machdep.c was using UAREA_PAGES to allocate something that isn't a
U area, and that there seems to be an implicit assumption that the PCB
is just past the end of the kernel stack.
Reviewed by: arch@
to us was to help out the Linux port, but really just invited overflow.
In fact, the request sense timer was overflowing prior to this change making
it much shorter than intended.
aic_osm_lib.h:
Be more careful about overflow in all timer/timeout primitives.
from divert sockets.
- Remove div_disconnect() method, since it shouldn't be called now.
- Remove div_abort() method. It was never called directly, since protocol
doesn't have listen queue. It was called only from div_disconnect(),
which is removed now.
Reviewed by: rwatson, maxim
Approved by: julian (mentor)
MT5 after: 1 week
MT4 after: 1 month
setting the B_REMFREE flag in the buf. This is done to prevent lock order
reversals with code that must call bremfree() with a local lock held.
This also reduces overhead by removing two lock operations per buf for
fsync() and similar.
- Check for the B_REMFREE flag in brelse() and bqrelse() after the bqlock
has been acquired so that we may remove ourself from the free-list.
- Provide a bremfreef() function to immediately remove a buf from a
free-list for use only by NFS. This is done because the nfsclient code
overloads the b_freelist queue for its own async. io queue.
- Simplify the numfreebuffers accounting by removing a switch statement
that executed the same code in every possible case.
- getnewbuf() can encounter locked bufs on free-lists once Giant is removed.
Remove a panic associated with this condition and delay asserts that
inspect the buf until after it is locked.
Reviewed by: phk
Sponsored by: Isilon Systems, Inc.
to request from devices during the "long inquiry" portion of our probe.
This same bug was fixed in the 4.x stream a few years ago, but the fix
was never propogated to -current.
This fix is slightly different than in -stable:
o Use offsetof() instead of a hard coded constant so as the make
the code more self-explainatory.
o Round odd long inquiry lengths up so as to avoid tickling ignore
wide residue bugs in broken parallel SCSI devices running with a
wide transfer negotiation.
MFC: 3 days
after allowing more than one address with the same prefix.
Reported by: Vladimir Grebenschikov <vova NO fbsd SPAM ru>
Submitted by: ru (also NetBSD rev. 1.83)
Pointyhat to: mlaier
queue a packet to the hardware... instead of when the hardware queue is
empty..
don't initalize cur_tx now that it doesn't need to be...
Pointed out by: bde