I couldn't think of a way to maintain the hardware TXQ locks _and_ layer
on top of that per-TXQ software queuing and any other kind of fine-grained
locks (eg per-TID, or per-node locks.)
So for now, to facilitate some further code refactoring and development
as part of the final push to get software queue ps-poll and u-apsd handling
into this driver, just do away with them entirely.
I may eventually bring them back at some point, when it looks slightly more
architectually cleaner to do so. But as it stands at the present, it's
not really buying us much:
* in order to properly serialise things and not get bitten by scheduling
and locking interactions with things higher up in the stack, we need to
wrap the whole TX path in a long held lock. Otherwise we can end up
being pre-empted during frame handling, resulting in some out of order
frame handling between sequence number allocation and encryption handling
(ie, the seqno and the CCMP IV get out of sequence);
* .. so whilst that's the case, holding the lock for that long means that
we're acquiring and releasing the TXQ lock _inside_ that context;
* And we also acquire it per-frame during frame completion, but we currently
can't hold the lock for the duration of the TX completion as we need
to call net80211 layer things with the locks _unheld_ to avoid LOR.
* .. the other places were grab that lock are reset/flush, which don't happen
often.
My eventual aim is to change the TX path so all rejected frame transmissions
and all frame completions result in any ieee80211_free_node() calls to occur
outside of the TX lock; then I can cut back on the amount of locking that
goes on here.
There may be some LORs that occur when ieee80211_free_node() is called when
the TX queue path fails; I'll begin to address these in follow-up commits.
which dumps out the actual options being used by an NFS mount.
This will be used to implement a "-m" option for nfsstat(1).
Reviewed by: alfred
MFC after: 2 weeks
This brand of controllers expects that the number of
contexts specified in the input slot context points
to an active endpoint context, else it refuses to
operate.
- Ring the correct doorbell when streams mode is used.
- Wrap one or two long lines.
Tested by: Markus Pfeiffer (DragonFlyBSD)
MFC after: 1 week
Programming the low bits has a side-effect if unmasking the pin if it is
not disabled. So if an interrupt was pending then it would be delivered
with the correct new vector but to the incorrect old LAPIC.
This fix could be made clearer by preserving the mask bit while
programming the low bits and then explicitly resetting the mask bit
after all the programming is done.
Probability to trip over the fixed bug could be increased by bootverbose
because printing of the interrupt information in ioapic_assign_cpu
lengthened the time window during which an interrupt could arrive while
a pin is masked.
Reported by: Andreas Longwitz <longwitz@incore.de>
Tested by: Andreas Longwitz <longwitz@incore.de>
MFC after: 12 days
Also, make it explicit that V_XATTRDIR is not properly supported in gfs
code yet.
The bad code was plain incorrect: (a) it spoiled handling of v_usecount
reaching zero and (b) it leaked v_holdcnt.
The ugly code employs potentially unsafe locking tricks.
Ideally we should separate vnode lifecycle and gfs node lifecycle.
A gfs node should have its own reference count where its child nodes
should be accounted.
PR: kern/151111
Reviewed by: kib
MFC after: 13 days
... to avoid any races or inconsistencies.
This should fix a regression introduced in r243404.
Also, remove a stale comment that has not been true for quite a while
now.
Pointyhat to: avg
Teested by: trociny, emaste, dumbbell (earlier version)
MFC after: 1 week
src/sys/{bsm,security/audit}. There are a few tweaks to help with the
FreeBSD build environment that will be merged back to OpenBSM. No
significant functional changes appear on the kernel side.
Obtained from: TrustedBSD Project
Sponsored by: The FreeBSD Foundation (auditdistd)
enforcing the TXOP and TBTT limits:
* Frames which will overlap with TBTT will not TX;
* Frames which will exceed TXOP will be filtered.
This is not enabled by default; it's intended to be enabled by the
TDMA code on 802.11n capable chipsets.
the revamped sysctl code did not work, and needed a change. This
makes the limit get set at the time that all sysctl stats are
created and is actually more elegant imho anyway.
TX hot path by getting rid of index calculations and simply
managing pointers. Much of the creative code is due to my
coworker here at Intel, Alex Duyck, thanks Alex!
Also, this whole series of patches was given the critical
eye of Gleb Smirnoff and is all the better for it, thanks
Gleb!
- add a limit for both RX and TX, change the default to 256
- change the sysctl usage to be common, and now to be called
during init for each ring.
- the TX limit is not yet used, but the changes in the last
patch in this series uses the value.
- the motivation behind these changes is to improve data
locality in the final code.
- rxeof interface changes since it now gets limit from the
ring struct
Fix path handling for *at() syscalls.
Before the change directory descriptor was totally ignored,
so the relative path argument was appended to current working
directory path and not to the path provided by descriptor, thus
wrong paths were stored in audit logs.
Now that we use directory descriptor in vfs_lookup, move
AUDIT_ARG_UPATH1() and AUDIT_ARG_UPATH2() calls to the place where
we hold file descriptors table lock, so we are sure paths will
be resolved according to the same directory in audit record and
in actual operation.
Sponsored by: FreeBSD Foundation (auditdistd)
Reviewed by: rwatson
MFC after: 2 weeks
defines (at Gleb's request). Also, change the defines around
the old transmit code to IXGBE_LEGACY_TX, I do this to make
it possible to define this regardless of the OS level (it is
not defined by default). There are also a couple changed
comments for clarity.
Currently when we discover that trail file is greater than configured
limit we send AUDIT_TRIGGER_ROTATE_KERNEL trigger to the auditd daemon
once. If for some reason auditd didn't rotate trail file it will never
be rotated.
Change it by sending the trigger when trail file size grows by the
configured limit. For example if the limit is 1MB, we will send trigger
on 1MB, 2MB, 3MB, etc.
This is also needed for the auditd change that will be committed soon
where auditd may ignore the trigger - it might be ignored if kernel
requests the trail file to be rotated too quickly (often than once a second)
which would result in overwriting previous trail file.
Sponsored by: FreeBSD Foundation (auditdistd)
MFC after: 2 weeks
Currently on each record write we call VFS_STATFS() to get available space
on the file system as well as VOP_GETATTR() to get trail file size.
We can assume that trail file is only updated by the audit worker, so instead
of asking for file size on every write, get file size on trail switch only
(it should be zero, but it's not expensive) and use global variable audit_size
protected by the audit worker lock to keep track of trail file's size.
This eliminates VOP_GETATTR() call for every write. VFS_STATFS() is satisfied
from in-memory data (mount->mnt_stat), so shouldn't be expensive.
Sponsored by: FreeBSD Foundation (auditdistd)
MFC after: 2 weeks
these are FCOE stats (fiber channel over ethernet), something that
FreeBSD does not yet have, they were mistaken for flow control by
the implementor I believe. Secondly, the real flow control stats
are oddly named with a 'link' tag on the front, it was requested
by my validation engineer to make these stats have the same name as
the igb driver for clarity and that seemed reasonable to me.
Remove redundant call to AUDIT_ARG_UPATH1().
Path will be remembered by the following NDINIT(AUDITVNODE1) call.
Sponsored by: FreeBSD Foundation (auditdistd)
MFC after: 2 weeks
multiqueue code, this functionality has proven to be more
trouble than it was worth. Thanks to Gleb for a second
critical look over my code and help in the patches!
* Global IPFW_DYN_LOCK() is changed to per-bucket mutex.
* State expiration is done in ipfw_tick every second.
* No expiration is done on forwarding path.
* hash table resize is done automatically and does not flush all states.
* Dynamic UMA zone is now allocated per each VNET
* State limiting is now done via UMA(9) api.
Discussed with: ipfw
MFC after: 3 weeks
Sponsored by: Yandex LLC
- Add "fdt addr" subcommand that lets you specify preloaded blob address
- Do not pre-initialize blob for "fdt addr"
- Do not try to load dtb every time fdt subcommand is issued,
do it only once
- Change the way DTB is passed to kernel. With introduction of "fdt addr"
actual blob address can be not virtual but physical or reside in
area higher then 64Mb. ubldr should create copy of it in kernel area
and pass pointer to this newly allocated buffer which is guaranteed to work
in kernel after switching on MMU.
- Convert memreserv FDT info to "memreserv" property of root node
FDT uses /memreserve/ data to notify OS about reserved memory areas.
Technically it's not real property, it's just data blob, sequence
of <start, size> pairs where both start and size are 64-bit integers.
It doesn't fit nicely with OF API we use in kernel, so in order to unify
thing ubldr converts this data to "memreserve" property using the same
format for addresses and sizes as /memory node.
It returns memory regions restricted from being used by kernel. These
regions are dfined in "memreserve" property of root node in the same
format as "reg" property of /memory node
embryonic connection has been setup and never attempt to abort a tid
before this is done. This fixes a bad race where a listening socket is
closed when the driver is in the middle of step (b) here. The symptom
of this were "ARP miss" errors from the driver followed by tid leaks.
A hardware-offloaded passive open works this way:
a) A SYN "hits" the TCAM entry for a server tid and the chip delivers it
to the queue associated with the server tid (say, queue A). It waits
for a response from the driver telling it what to do.
b) The driver decides it is ok to proceed. It adds the new tid to the
list of embryonic connections associated with the server tid and then
hands off the SYN to the kernel's syncache to make sure that the kernel
okays it too. If it does then the driver provides an L2 table entry,
queue id (say, queue B), etc. and instructs the chip to send the SYN/ACK
response.
c) The chip delivers a status to queue B depending on how the third step
of the 3-way handshake goes. The driver removes the tid from its list
of embryonic connections and either expands the syncache entry or
destroys the tid. In any case all subsequent messages for the new tid
will be delivered to queue B, not queue A. Anything running in queue B
knows that the L2 entry has long been setup and the new flag is of no
interest from here on. If the listener is closed it will deal with
so_comp as normal.
MFC after: 1 week