vfs_flagopt() for binary/boolean options.
vfs_getopts() for string options
vfs_filteropt() to check for unknown options.
vfs_scanopt() for scanf() like processing of options.
Also add function for setting the stat.f_mntfromname field.
socket callbacks or similar callers, from both the NFS client and the
server.
Instituted nfsm_dissect_nonblock(), nfsm_dissect_xx_nonblock(). And
nfsm_disct() now takes an extra M_TRYWAIT/M_DONTWAIT argument.
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
is safe to turn off the nfsnode's NMODIFIED flag.
- Move the check for signals to the top of the loop where we loop
around the dirty buffers on the vnode, scheduling writes. This
ensures that we'll break ouf of the flush operation on reception of
a signal.
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Root filessytems (like NFS) don't have an associated disk device,
and even if they had, the exact semantics would be filesystem
dependent and should be implemented there.
userland and a dedicated system call to get replies.
The vnode-bypass of fifos broke this into a panic.
Ditch all the magic and create a device /dev/nfslock instead, and
use that for both directions apart from the shorter path, this is
also faster because the device driver runs Giant free using the
vnode bypass.
Noticed by: marcel
actually is a property of the northbridge and applies to all PCI/PCI-X/PCIe
devices in the system, though only PCIe devices will respond to registers
higher than 256. This uses per-CPU pools of temporary mappings so that
the whole 256MB of configuration space doesn't have to be mapped all at
once. While the sf_buf API was considered for this, the fact that it
requires sleep locks and can return failure made it unsuitable for this use.
For now only the Intel Grantsdale and Lindenhurst (925 and 752x) chipsets are
supported. Since there doesn't appear to be a compatible way to determine
northbridge support, new chipsets will have to be explicitely added in the
future.
zero-copy receive of jumbo frames. This eliminates the need for the
jumbo frame allocator implemented in kern/uipc_jumbo.c and sys/jumbo.h.
Remove it.
Note: Zero-copy receive of jumbo frames did not work without these changes;
I believe there was insufficient locking on the jumbo vm object.
Tested by: ken@
Discussed with: gallatin@
properly support bounce buffers and resource shortages. This allows the
driver to work properly and reliably with more than 4GB of RAM. Of the
three data paths that exist in the driver, (block, CAM, ioctl), the ioctl
path has not been well tested with these changes due to difficulty with
finding an application that uses it that actually works.
Sponsored by: The FreeBSD Foundation and FreeBSD Systems, Inc.
and annotate that nfs_mountroot assumes it is OK to step on the
values in the global NFSv3 diskless structure as the mountroot
function is called during a serialized part of the boot, before
any other NFS client activity occurs.
MFC after: 2 weeks
doesn't. Most of the implementations have grown weeds for this so they
copy some fields from mnt_stat if the passed argument isn't that.
Fix this the cleaner way: Always call the implementation on mnt_stat
and copy that in toto to the VFS_STATFS argument if different.
commit. In the new world order, the transitive closure on the vector
operations is not precomputed. As such, it's unsafe to actually use
any of the function pointers in an indirect function call. They can
be null, and we need to use the default vector in that case.
This is mostly a quick fix for the four function pointers that are
ed explicitly. A more generic or scalable solution is likely to see
the light of day.
No pathos on: current@
tcpip_fillheaders()
tcp_discardcb()
tcp_close()
tcp_notify()
tcp_new_isn()
tcp_xmit_bandwidth_limit()
Fix a locking comment in tcp_twstart(): the pcbinfo will be locked (and
is asserted).
MFC after: 2 weeks
inp->inp_moptions pointer, so that ip_getmoptions() can perform
necessary locking when doing non-atomic reads.
Lock the inpcb by default to copy any data to local variables, then
unlock before performing sooptcopyout().
MFC after: 2 weeks
to implement the sanity check should have been changed when we converted
the implementation of vm_pindex_t from 32 to 64 bits. (Thus, RELENG_4 is
not affected.) The consequence of this error would be a legimate write to
an extremely large file being treated as an errant attempt to write meta-
data.
Discussed with: tegge@
modifications to the inpcb IP options mbuf:
- Lock the inpcb before passing it into ip_pcbopts() in order to prevent
simulatenous reads and read-modify-writes that could result in races.
- Pass the inpcb reference into ip_pcbopts() instead of the option chain
pointer in the inpcb.
- Assert the inpcb lock in ip_pcbots.
- Convert one or two uses of a pointer as a boolean or an integer
comparison to a comparison with NULL for readability.
- Always check that index number passed from userland
is <= NG_NETFLOW_MAXIFACES. [1]
- Increase NG_NETFLOW_MAXIFACES up to 512. [2]
Noticed by: Roman Palagin [1]
Requested by: Yuri Y. Bushmelev [2]
MFC after: 1 week
header. pf finds the first TCP/UDP/ICMP6 header to filter by traversing
the header chain. In the case where headers are skipped, the protocol
checksum verification used the wrong length (included the skipped headers),
leading to incorrectly mismatching checksums. Such IPv6 packets with
headers were silently dropped.
Discovered by: Bernhard Schmidt
MFC after: 1 week
and that I've verified things seem to basically work. I was able to
boot and hot plug usb devices. Please let me know if this causes
problems for anybody.
The push down of giant has proceeded to the point that this will start
to matter more and more.
anywhere in the DAG. This includes configurations that are not
allowed by the EFI specification.
o Reject a GPT partition table if it's not preceeded by a PMBR.
There's no need to preserve the MBR partitioning anymore as GPT
is mature and with the first bullet extending the applicability
of GPT, it's better to be a bit more strict.
If we are resuming non-MPSAFE drivers, they need Giant held for them.
This may fix some obscure suspend/resume problems. It has fixed keyrate
setting problems that were triggered by cardbus (MPSAFE) changing the
ordering for syscons resume (non-MPSAFE). Also, add some asserts that
Giant is held in our suspend/resume and shutdown methods.
Found by: iedowse
MFC after: 2 days
because we know it then and we need it when inserting a component which
wasn't destroyed while device was running.
Reported by: Michael Handler <handler@grendel.net>
MFC after: 1 week
and if so call it.
The cmount method will gather and interpret omount() style arguments,
and issue a kern_[v]mount() call to execute the corresponding nmount
operation.
- Initialize sc->pcn_type during ATTACH as softc contents may not surivive
from PROBE.
- Print out chip-id to assist with ongoing pcn(4) debugging efforts.
non-standard BIOSen. We used to implement this in local patches but
now that ACPI-CA has merged/re-implemented most of our fixes, they were
no longer needed and we just needed to turn this knob on. Also, remove
an unnecessary cast.
Tested by: phk
we really want vs. the size changing 'long' (i386 vs. AMD64).
This fixes the problem with DRM with Radeon's on AMD64.
Submitted by: Jung-uk Kim <jkim@niksun.com>
back on again in resume. Override the default of D3 with the value the
BIOS specifies in _SxD, if present. Skip serial devices (PNP05xx) since
they seem to hang when set to D3 and may require special driver support.
Also, skip non-type 0 PCI devices (i.e., bridges) since our we don't yet
save/restore their config space and that seems to be necessary.
If this gives you trouble with suspend/resume, you can disable the new
ACPI and PCI power behavior separately with these tunables & sysctls:
debug.acpi.do_powerstate
hw.pci.do_powerstate
Approved by: imp (pci)
Tested by: acpi@ (numerous)
instead for the time being. Intel should fix this.
Note that if this commit is correct, it is made on the vendor branch.
We expect the Intel folks to fix it, and we don't want to unnecessarily
take files off the vendor branch.
Approved by: njl
MFC after: 1 week
initializations but we did have lofty goals and big ideals.
Adjust to more contemporary circumstances and gain type checking.
Replace the entire vop_t frobbing thing with properly typed
structures. The only casualty is that we can not add a new
VOP_ method with a loadable module. History has not given
us reason to belive this would ever be feasible in the the
first place.
Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.
Give coda correct prototypes and function definitions for
all vop_()s.
Generate a bit more data from the vnode_if.src file: a
struct vop_vector and protype typedefs for all vop methods.
Add a new vop_bypass() and make vop_default be a pointer
to another struct vop_vector.
Remove a lot of vfs_init since vop_vector is ready to use
from the compiler.
Cast various vop_mumble() to void * with uppercase name,
for instance VOP_PANIC, VOP_NULL etc.
Implement VCALL() by making vdesc_offset the offsetof() the
relevant function pointer in vop_vector. This is disgusting
but since the code is generated by a script comparatively
safe. The alternative for nullfs etc. would be much worse.
Fix up all vnode method vectors to remove casts so they
become typesafe. (The bulk of this is generated by scripts)
in the _PRS or _CRS of link devices. If faced with multiple DPFs in a
_PRS, we just use the first one. We assume that if _CRS has DPF tags they
only contain a single set since multiple DPFs wouldn't make any sense. In
practice, the only DPFs I've seen so far for link devices are that the one
IRQ resource is surrounded by a DPF tag pair for no apparent reason, and
this should handle that case fine now.
- Only allocate link structures for IRQ resources for link devices rather
than allocating a link structure for every resource.
Reviewed by: njl
Tested by: phk
in the error cases, causing panics.
Adapted from similar fix to NFSv3 mkdir submitted by Mohan Srinivasan mohans
at yahoo-inc dot com
Approved by: alfred
should not return ERESTART after it caught a signal, otherwise
thr_wake() call will be lost, also a timeout wait should not be
restarted. Final, using wakeup not wakeup_one to be safeness.
they would leave enough elements on the stack that if you escaped to the
loader prompt and then typed 'setenv', it would pull in all of the leaked
junk and cause an exception in the environment. There still seems to be
3 leaked elements, but they don't appear to be coming from this file.
a deadlock (with NFS exclusive vnode locks enabled). Lookup
grabs the parent's lock and wants to lock child. Readdirplus
locks the child and wants to lock parent (for loading the attrs
for ".."). The fix is to not load the attrs for ".." in
readdirplus.
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson
This closes a major hole in close-to-open consistency support.
Added a new sysctl so that this can be disabled for single NFS
client applications with very large amounts of mmap'ed IO (for
performance).
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson
returned back to df from a statfs call. Causing df to print negative
values.
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson
specified register, but a pointer to the in-memory representation of
that value. The reason for this is twofold:
1. Not all registers can be represented by a register_t. In particular
FP registers fall in that category. Passing the new register value
by reference instead of by value makes this point moot.
2. When we receive a G or P packet, both are for writing a register,
the packet will have the register value in target-byte order and
in the memory representation (modulo the fact that bytes are sent
as 2 printable hexadecimal numbers of course). We only need to
decode the packet to have a pointer to the register value.
This change fixes the bug of extracting the register value of the P
packet as a hexadecimal number instead of as a bit array. The quick
(and dirty) fix to bswap the register value in gdb_cpu_setreg() as
it has been added on i386 and amd64 can therefore be removed and has
in fact been that.
Tested on: alpha, amd64, i386, ia64, sparc64
@sys/dev/acpica/acpi_pci_link.c:153" panic by backing out rev 1.37 in the SMP
case. It appears that on a dual-proc machine the assertions in the rev 1.37
commit log hold true.
Introduce domain_init_status to keep track of the init status of the domains
list (surprise). 0 = uninitialized, 1 = initialized/unpopulated, 2 =
initialized/done. Higher values can be used to support late addition of
domains which right now "works", but is potential dangerous. I choose to
only give a warning when doing so.
Use domain_init_status with if_attachdomain[1]() to ensure that we have a
complete domains list when we init the if_afdata array. Store the current
value of domain_init_status in if_afdata_initialized. This way we can update
if_afdata after a new protocol has been added (once that is allowed).
Submitted by: se (with changes)
Reviewed by: julian, glebius, se
PR: kern/73321 (partly)
call net_add_domain(). Calling this function too early (or late) breaks
assertations about the global domains list.
Actually it should be forbidden to call net_add_domain() outside of
SI_SUB_PROTO_DOMAIN completely as there are many places where we traverse
the domains list unprotected, but for now we allow late calls (mostly to
support netgraph). In order to really fix this we have to lock the domains
list in all places or find another way to ensure that we can safely walk the
list while another thread might be adding a new domain.
Spotted by: se
Reviewed by: julian, glebius
PR: kern/73321 (partly)
lock collision.
2. Fix two race conditions. One is between _umtx_unlock and signal,
also a thread was marked TDF_UMTXWAKEUP by _umtx_unlock, it is
possible a signal delivered to the thread will cause msleep
returns EINTR, and the thread breaks out of loop, this causes
umtx ownership is not transfered to the thread. Another is in
_umtx_unlock itself, when the function sets the umtx to
UMTX_UNOWNED state, a new thread can come in and lock the umtx,
also the function tries to set contested bit flag, but it will
fail. Although the function will wake a blocked thread, if that
thread breaks out of loop by signal, no contested bit will be set.
observations lead me to believe that the convetion for pc98 boot
loaders is to have a jump unstruction, followed by a string, followed
by code. The jump usually doesn't have a nop after it and usually the
string is NUL terminated, but Grub/98 breaks both of these rules.
# I looked for, but failed to find the Minux boot blocks for PC-9801 port.
512. If I had an audio cdrom in my cd player when I booted my system,
I'd get a panic from geom because you can't read 8192 bytes from an
audio cdrom.
Remove XXX comment about IPL1 and replace it with some information
from my soon to be published web page on the pc98 disk layout. The
IPL1 test was the result of an observation of a disk with FreeBSD's
boot0 program. It was testing part of an area what appears to be
reserved for a boot loader name, which comes after a jump over this
area. I don't yet know if it is required to be any specific jump
instruction, or if the destination has to be location 11. [1]
[1] FreeBSD Press No. 13, page 115, poorly translated by myself. The
picture there shows offset 8 as the destination of the jump, but
FreeBSD's boot0 program has three padding NULs after the IPL1 name and
uses a 16-bit 'jmp' instruction.
resource lists. It used to be sized based only on _CRS, hence _PRS could
perform an out-of-bounds access if it was larger (i.e., when there are
dependent functions). Add asserts to detect this case. Note, this is
only a temporary fix and I believe _PRS and _CRS should have separate
arrays.
Also, fix a typo where the wrong irq was being check for the APIC case.
Submitted by: tegge
to do a window update to the peer (thru an ACK) from soreceive()
itself. TCP will do that upon return from the socket callback.
Sending a window update from soreceive() results in a lock reversal.
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson
soreceive(), then pass in M_DONTWAIT to m_copym(). Also fix up error
handling for the case where m_copym() returns failure.
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson
that the exclusive lock is already held, then we call panic. Don't
clobber internal lock state before panic'ing. This change improves
debugging if this case were to happen.
Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com
Reviewed by: rwatson
Approved by: Robert Watson <rwatson@freebsd.org>
Add locking to the IPv6 scoping code.
All spl() like calls have also been removed.
Cleaning up the handling of ifnet data will happen at a later date.