as some combinations of chipset, controller and target do not behave
correctly when DMA is enabled for other commands.
PR: kern/103602
MFC after: 2 weeks
were never freed, but the big ring was freed twice.
-Don't supply rx hw csums for frames which are padded beyond the
length specified in the ip header. If the padding is non-zero,
the hw csum will be incorrect for such frames.
Sponsored by: Myricom
non-mapped data as possible at once and not page-by-page. Which this change we
combain I/Os, but also saves many VM_OBJECT_UNLOCK()/VM_OBJECT_LOCK()
operations.
Simple 'fsx -l 33554432 -o 524288 -N 10000 /tank/fsx' test shows ~23%
performance increase.
This workaround the problem in Parallels/VMWare where the emulated drivers are
slower, especially with ATA_FLUSHCACHE. The problem appears much more
frequently with ZFS which use it a lot more.
Approved: sos, pjd
- vm_page_undirty() is enough (instead of vm_page_set_validclean()), but it has
to be called before we write the data in case someone makes page dirty after
our write, but before our vm_page_undirty() call.
- Always dmu_write, not matter if uiomove() succeeded, because it could
partially be ok and we would lose some changes.
All good ideas from: ups
In dounmount(), before or while vn_lock(coveredvp) is called, coveredvp
vnode may be VI_DOOMED due to one of the following:
- other thread finished unmount and vput()ed it, and vnode was chosen
for recycling, while vn_lock() slept;
- forced unmount of the coveredvp->v_mount fs.
In the first case, next check for changed v_mountedhere or mnt_gen counter
would be successfull. In the second case, the unmount shall be allowed.
Submitted by: sobomax
MFC after: 2 weeks
- Fix for a bug where a close would not wait for all (directio)
dirty buffers to drain. The nfsnode was not marked NMODIFIED
when there were directio dirtied buffers pending, causing this.
- No reason to vhold/vrele the vp when enqueueing DirectIO requests
for the nfsiods. The vnode can't really go way since the close
has to wait for these requests to drain.
MFC after: 1 week
Submitted by: mohans
specific request and thus should first try to be allocated from the
sys_resource pool. This avoids using the sys_resource pool for wildcard
requests that have bounded ranges coming from cbb(4) and Host-PCI pcib(4)
drivers.
Tested by: Andrea Bittau <a.bittau of cs.ucl.ac.uk fame>
Sleuthing by: Andrea Bittau as well
that the MSI mapping window is fixed at 0xfee00000 and the capability
does not include two more dwords used to program the address. Supporting
this mostly results in quieting spurious warnings during boot about
non-default MSI mapping windows.
- HT 2.00b also added a new HT capability type, so support that in pciconf.
MFC after: 3 days
Tested by: jmg
It seems that valid pause frames(Tx flow control) cause GMAC to hang
such that it resulted in watchdog timeout. As a work around don't
flush Rx MAC FIFO if we've received pause frames.
Tested by: Harald Schmalzbauer (h DOT schmalzbauer AT omnisec DOT de)
Under certain circumtances, if TSO is active, Yukon II generates
corrupted IP packets. All corrupted IP packets I noticed were the the
last segmented packet in a TSO request. The corrupted packet resulted
in retransmission of the damaged packet which in turn decreased network
performance dramatically.
Unfortunately it seems that there is no way to workaround this bug
as TSO is completely handled in hardware. Disable TSO until we find a
working workaround or a new silicon revision that doesn't have this
hardware bug.
fault. The previous method zero'd out the page tables, invalidated the
TLB, and then entered a spin loop. The idea was that the instruction after
the TLB invalidate would result in a page fault and the page fault and
subsequent double fault wouldn't be able to determine the physical page
for their fault handlers' first instruction. This stopped working when
PGE (PG_G PTE/PDE bit) support was added as a TLB invalidate via %cr3
reload doesn't clear TLB entries with PG_G set. Thus, the CPU was still
able to map the virtual address for the spin loop and happily performed
its infinite loop.
The triple fault now uses a much more deterministic sledge-hammer approach
to generate a triple fault. First, the IDT descriptor is set to point to
an empty IDT, so any interrupts (including a double fault) will instantly
fault. Second, we trigger a int 3 breakpoint to force an interrupt and
kick off a triple fault.
MFC after: 3 days
in all other file system on FreeBSD (instead from inactive() method).
A nice side-effect of this change, except that it speedups file system
when mmaped file are often open/closed, is that it makes FreeBSD's
namecache work:)
This fixes slow operations on mmaped files, because without this fix,
pages were written to disk multiple times.
If one is looking for even greater speed up for such operation, he should
disable ZIL (by setting vfs.zfs.zil_disable to 1 in /boot/loader.conf).
Disabling ZIL makes fsx run ~9 times faster.
supports software encrypt/decrypt.
The nuked code itself is quite problematic, as pointed out by sam@ ---
wk->wk_keyix should be replaced by the loop count.
Tested with WEP/TKIP/CCMP/no-protection.
Approved by: sam@ (mentor)
Noticed by: Hans Petter Selasky <hselasky@c2i.net>
o Fix linewrap issues.
o Fix two typos (s/Recomended/Recommended/ and s/tunning/tuning/)
o Remove a couple of extra instances of the word "of".
o Update names of kmem_size variables.
Approved by: pjd
where similar data structures exist to support devfs and the MAC
Framework, but are named differently.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA, Inc.
by Philippe Biondi and Arnaud Ebalard. This is a temporary fix
until more discussion can be had on the exact risks involved in
allowing source routing in IPv6
Submitted by: itojun
Reviewed by: jinmei
MFC after: 1 day
- Move FreeBSD-specific code to zfs_freebsd_*() functions in zfs_vnops.c
and keep original functions as similar to vendor's code as possible.
- Add various includes back, now that we have them.
macro, as za_first_integer field also contains type. This should be fixed in
ZFS itself, but this bug is not visible on Solaris, because there, type is
not stored in za_first_integer. On the other hand it will be visible on
MacOS X.
Reported by: Barry Pederson <bp@barryp.org>
variable name conventions for arguments passed into the framework --
for example, name network interfaces 'ifp', sockets 'so', mounts 'mp',
mbufs 'm', processes 'p', etc, wherever possible. Previously there
was significant variation in this regard.
Normalize copyright lists to ranges where sensible.
labels: the mount label (label of the mountpoint) and the fs label (label
of the file system). In practice, policies appear to only ever use one,
and the distinction is not helpful.
Combine mnt_mntlabel and mnt_fslabel into a single mnt_label, and
eliminate extra machinery required to maintain the additional label.
Update policies to reflect removal of extra entry points and label.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA, Inc.
the introduction of priv(9) and MAC Framework entry points for privilege
checking/granting. These entry points exactly aligned with privileges and
provided no additional security context:
- mac_check_sysarch_ioperm()
- mac_check_kld_unload()
- mac_check_settime()
- mac_check_system_nfsd()
Add mpo_priv_check() implementations to Biba and LOMAC policies, which,
for each privilege, determine if they can be granted to processes
considered unprivileged by those two policies. These mostly, but not
entirely, align with the set of privileges granted in jails.
Obtained from: TrustedBSD Project
- Redistribute counter declarations to where they are used, rather than at
the file header, so it's more clear where we do (and don't) have
counters.
- Add many more counters, one per policy entry point, so that many
individual access controls and object life cycle events are tracked.
- Perform counter increments for label destruction explicitly in entry
point functions rather than in LABEL_DESTROY().
- Use LABEL_INIT() instead of SLOT_SET() directly in label init functions
to be symmetric with destruction.
- Align counter names more carefully with entry point names.
- More constant and variable name normalization.
Obtained from: TrustedBSD Project
- Add a more detailed comment describing the mac_test policy.
- Add COUNTER_DECL() and COUNTER_INC() macros to declare and manage
various test counters, reducing the verbosity of the test policy
quite a bit.
- Add LABEL_CHECK() macro to abbreviate normal validation of labels.
Unlike the previous check macros, this checks for a NULL label and
doesn't test NULL labels. This means that optionally passed labels
will now be handled automatically, although in the case of optional
credentials, NULL-checks are still required.
- Add LABEL_DESTROY() macro to abbreviate the handling of label
validation and tear-down.
- Add LABEL_NOTFREE() macro to abbreviate check for non-free labels.
- Normalize the names of counters, magic values.
- Remove unused policy "enabled" flag.
Obtained from: TrustedBSD Project
set/clear it but would not do it. Now we will.
- Moved to latest socket api for extended sndrcv info struct.
- Moved to support all new levels of fragment interleave.
calls. Add MAC Framework entry points and MAC policy entry points for
audit(), auditctl(), auditon(), setaudit(), aud setauid().
MAC Framework entry points are only added for audit system calls where
additional argument context may be useful for policy decision-making; other
audit system calls without arguments may be controlled via the priv(9)
entry points.
Update various policy modules to implement audit-related checks, and in
some cases, other missing system-related checks.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA, Inc.
- Replace PRIV_NFSD with PRIV_NFS_DAEMON, add PRIV_NFS_LOCKD.
- Use PRIV_NFS_DAEMON in the NFS server.
- In the NFS client, move the privilege check from nfslockdans(), which
occurs every time a write is performed on /dev/nfslock, and instead do it
in nfslock_open() just once. This allows us to avoid checking the saved
uid for root, and just use the effective on open. Use PRIV_NFS_LOCKD.
@118370 Correct typo.
@118371 Integrate changes from vendor.
@118491 Show backtrace on unexpected code paths.
@118494 Integrate changes from vendor.
@118504 Fix sendfile(2). I had two ways of fixing it:
1. Fixing sendfile(2) itself to use VOP_GETPAGES() instead of
hacking around with vn_rdwr(UIO_NOCOPY), which was suggested
by ups.
2. Modify ZFS behaviour to handle this special case.
Although 1 is more correct, I've choosen 2, because hack from 1
have a side-effect of beeing faster - it reads ahead MAXBSIZE
bytes instead of reading page by page. This is not easy to implement
with VOP_GETPAGES(), at least not for me in this very moment.
Reported by: Andrey V. Elsukov <bu7cher@yandex.ru>
@118525 Reorganize the code to reduce diff.
@118526 This code path is expected. It is simply when file is opened with
O_FSYNC flag.
Reported by: kris
Reported by: Michal Suszko <dry@dry.pl>
vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will
be at least 256mb (for example) without forcing a particular value via vm.kmem_size.
Approved by: njl (mentor)
Reviewed by: alc
from the incoming SYN handling section of tcp_input().
Enforcement of the accept queue limits is done by sonewconn() after the
3WHS is completed. It is not necessary to have an earlier check before a
connection request enters the SYN cache awaiting the full handshake. It
rather limits the effectiveness of the syncache by preventing legit and
illegit connections from entering it and having them shaken out before we
hit the real limit which may have vanished by then.
Change return value of syncache_add() to void. No status communication
is required.
when the ACK is invalid and doesn't belong to any registered connection,
either in syncache or through SYN cookies. True but a NULL struct socket
is returned when the 3WHS completed but the socket could not be created
due to insufficient resources or limits reached.
For both cases an RST is sent back in tcp_input().
A logic error leading to a panic is fixed where syncache_expand() would
free the mbuf on socket allocation failure but tcp_input() later supplies
it to tcp_dropwithreset() to issue a RST to the peer.
Reported by: kris (the panic)
when one of links is inactive and have stale sequence number. To avoid
this sequence numbers of all links are getting updated on every
successful packet reassembling.
- ng_ppp_bump_mseq function created to simplify code.
- ng_ppp_frag_drop function separated from ng_ppp_frag_process to
simplify code.
Reviewed by: archie
Approved by: glebius (mentor)
which lead to ineffective multilink packet distribution plans.
- Changed bytesInQueue calculation math to have more precise information
about links utilization.
- Taken rough account of the link overhead. Better way to do it could be to
get exact overhead from user-level, but I have not done it to keep
binary compatibility.
Reviewed by: archie
Approved by: glebius (mentor)