This is a long standing performance bug which happened to not cause trouble
in practice due to rather limited use of these primitives.
The read side expects the writer to finish soon(tm) hence it loops with one
pause in-between. But it is possible the writer gets preempted in which case
the waiting can take a long time, especially so if it got preempted by the
reader. In principle this may never clean itself up.
In the current kernel seq is only used to obtain stable fp + capabilities
state. In order for looping at least once to occur there has to be a
concurrent writer modifying the fd slot for the very fd we are trying to
read. That is, for any looping to occur in the first place the program has
to be multithreaded and be doing something fishy to begin with. As such,
the indefinite looping is rather hard to run into unless you really try
(and I did not).
Update to what the previous code seemed to be doing via the correct
interfaces. Further issues exist in xge_ioctl_registers(), but this is
debugging code in a driver that has few users and they don't appear to
be crashes or leaks.
Reviewed by: jhb (prior version)
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14848
No functional change in practice. If the sbni driver supported
64-bit big-endian system, this would be an ABI changes, but it is
i386-only. The old version leaked a word of stack on 64-bit systems.
This eliminates the only assignment to ifr_data.
Reviewed by: kib
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14874
No functional change in practice. If the sbni driver supported
64-bit big-endian system, this would be an ABI changes, but it is
i386-only. The old version leaked a word of stack on 64-bit systems.
This eliminates the only assignment to ifr_data.
These have been supplanted by the MI signal information codes in
<sys/signal.h> since 7.0. The FPE_*_TRAP ones were deprecated even
earlier in 1999.
PR: 226579 (exp-run)
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D14637
Requests to modify the state of TLS connections need to be sent on the
same queue as TLS record transmit requests to ensure ordering.
However, in order to use the offload transmit queue in t4_set_tcb_field(),
the function needs to be updated to do proper flow control / credit
management when queueing a request to an offload queue. This required
passing a pointer to the toepcb itself to this function, so while here
remove the 'tid' and 'iqid' parameters and obtain those values from the
toepcb in t4_set_tcb_field() itself.
Submitted by: Harsh Jain @ Chelsio (original version)
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D14871
The original implementation used a reference to ifr_data and a cast to
do the equivalent of accessing ifr_addr. This was copied multiple
times since 1996.
Approved by: kib
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14873
Make all kernel accesses to ifru_buffer go via access functions
which take the process ABI into account and use an appropriate union
to access members in the correct place in struct ifreq.
Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14846
It is random collection of fixes for issues not yet corrected,
reported at https://tsyrklevi.ch/clang_analyzer/freebsd_013017/. Many
issues from that list were already corrected. Most of them are for
compat32, old compat32 or affect both primary host ABI and compat32.
The freebsd32_kldstat(), for instance, was already fixed by using
malloc(M_ZERO). Patch includes correction to report the supplied
version back, which is just pedantic.
Reviewed by: brooks, emaste (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D14868
xforms that support processing of multiple blocks at a time (to support more
efficient modes, for example) can define the encrypt_ and decrypt_multi
interfaces. If these interfaces are not present, the generic cryptosoft
code falls back on the block-at-a-time encrypt/decrypt interfaces.
Stream ciphers may support arbitrarily sized inputs (equivalent to an input
block size of 1 byte) but may be more efficient if a larger block is passed.
Sponsored by: Dell EMC Isilon
This is more correct in that ioctl commands have no meaning until they
hit the handler associated with the file descriptor.
Add support for MDIOCRESIZE_32 which was missed when it was added.
Reviewed by: cem, kib, markj (various versions)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14714
According to 802.1Q-2014, VLAN tagged packets with VLAN id 0 should be
considered as untagged, and only PCP and DEI values from the VLAN tag
are meaningful. See for instance
https://www.cisco.com/c/en/us/td/docs/switches/connectedgrid/cg-switch-sw-master/software/configuration/guide/vlan0/b_vlan_0.html.
Make it possible to specify PCP value for outgoing packets on an
ethernet interface. When PCP is supplied, the tag is appended, VLAN
id set to 0, and PCP is filled by the supplied value. The code to do
VLAN tag encapsulation is refactored from the if_vlan.c and moved into
if_ethersubr.c.
Drivers might have issues with filtering VID 0 packets on
receive. This bug should be fixed for each driver.
Reviewed by: ae (previous version), hselasky, melifaro
Sponsored by: Mellanox Technologies
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D14702
Include _uio.h instead of uio.h in several headers to reduce header
polution.
Fix a few places that relied on header polution to get the uio.h header.
I have not moved struct uio as many more things that use it rely on
header polution to get other definitions from uio.h.
Reviewed by: cem, kib, markj
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14811
Drop our local patch and restore full vanilla upstream code in
contrib/libb2.
No functional change intended. explicit_bzero() should continue to be used.
Obtained from: libb2 b4b241a34824b51956a7866606329a065d397525
Sponsored by: Dell EMC Isilon
If the operation is not an update, if neither r/w nor r/o mode is
explicitly requested, if the error code hints at the possibility of the
media being read-only, and if the fallback is allowed, then we can try
to automatically downgrade to the readonly mode.
This is especially useful for auto-mounting of removable media that
sometimes can happen to be write-protected.
The fallback to r/o is not enabled by default. It can be requested on a
per-mount basis with a new mount option, 'autoro'. Or it can be
globally allowed by setting vfs.default_autoro.
Reviewed by: cem, kib
MFC after: 3 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D13361
assignment. Device drivers are able to override the default assignment
if they bind directly. There are severe performance penalties for
handling interrupts on remote CPUs and this should only be done in
very controlled circumstances.
Reviewed by: jhb, kib
Tested by: pho (earlier version)
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14838
Includes our local patch to conditionalize use of __builtin_clz(ll) on
Clang's __has_builtin() (which is just defined to false when building with
GCC).
The issue is tracked upstream at https://github.com/facebook/zstd/pull/884 .
Otherwise, these are vanilla Zstandard 1.3.4 files.
Reported by: allanjude, Yann Collet
Sponsored by: Dell EMC Isilon
No functional change for Skipjack, AES-ICM, Blowfish, CAST-128, Camellia,
DES3, Rijndael128, DES. All of these have identical IV and blocksizes
declared in the associated enc_xform.
Functional changes for:
* AES-GCM: block len of 1, IV len of 12
* AES-XTS: block len of 16, IV len of 8
* NULL: block len of 4, IV len of 0
For these, it seems like the IV specified in the enc_xform is correct (and
the blocksize used before was wrong).
Additionally, the not-yet-OCFed cipher Chacha20 has a logical block length
of 1 byte, and a 16 byte IV + nonce.
Rationalize references to IV lengths to refer to the declared ivsize, rather
than declared blocksize.
Sponsored by: Dell EMC Isilon
Singed calculations in cubic_cwnd() can result in negative cwnd
value which is then cast to an unsigned value. Values less than
1 mss are generally bad for other parts of the code, also fixed.
Submitted by: Jason Eggleston <jason@eggnet.com>
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D14141
To fix the GCC build, remove multiple redundant declarations of
vmci_send_datagram() (the copy in vmci.h as well as the extern definition in
vmci_queue_pair.c were wholly redundant).
Also to fix the GCC build, include a non-empty format string in the vmci(4)
definition of ASSERT(). It seems harmless either way, but adding the
stringified invariant is easier than masking the warning.
The other vmci_kernel_defs.h changes are cosmetic and simply match macros to
existing definitions.
Reported by: GCC 6.4.0
Sponsored by: Dell EMC Isilon
should be honored.
We must not sleep or acquire any MI VM locks if TDP_NOFAULTING is
specified. On the other hand, there were some callers in the tree
which set TDP_NOFAULTING for larger scope than needed, I fixed the
code which I wrote, but I suspect that linuxkpi and out of tree drm
drivers might abuse this still.
So only enable the mode for vm_fault_quick_hold_pages() where
vm_fault_hold() is not called when specifically asked by user. I
decided to use vm_prot_t flag to not change KPI. Since number of
flags in vm_prot_t is limited, I reused the same flag which was
already consumed for vm_map_lookup().
Reported and tested by: pho (as part of the larger patch)
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D14825
If one has added fields to struct mbuf such that MHLEN is smaller than
this threshold (128), iflib_rxd_pkt_get() may otherwise overrun the
internal mbuf buffer while copying.
Reviewed by: mmacy
MFC after: 3 days
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14843
We may remove a sleepqueue from the hash table in
sleepq_resume_thread().
Reviewed by: kib
MFC after: 3 days
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14847
after the array to its proper location. Otherwise, the linker.hints
file has things out of order and we associated it with whatever was
the previous module.
In a virtual machine, VMCI is exposed as a regular PCI device. The primary
communication mechanisms supported are a point-to-point bidirectional
transport based on a pair of memory-mapped queues, and asynchronous
notifications in the form of datagrams and doorbells. These features are
available to kernel level components such as vSockets through the VMCI
kernel API. In addition to this, the VMCI kernel API provides support for
receiving events related to the state of the VMCI communication channels,
and the virtual machine itself.
Submitted by: Vishnu Dasa <vdasa@vmware.com>
Reviewed by: bcr, imp
Obtained from: VMware
Differential Revision: https://reviews.freebsd.org/D14289
DTB Overlays are useful to change/add nodes to a dtb without the need to
modify it.
Add support for building dtbo during buildkernel.
The goal of DTBO present in the FreeBSD source tree is to fill a gap in
time when we submit changes upstream (Linux). Instead of waiting 2 to 4 months
we can add a DTBO in tree in the meantime.
This is not for adding DTBO for capes/hat/addon boards, those will be
better to put in a ports.
This is also not for enabling a i2c/spi/pwm controller on certain pins,
each user have a different use case for those (which pins to use etc ...)
and we cannot have all possible configuration.
Add a dtbo for sun8i-h3-sid which add the SID node missing in upstream dts.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D14782
Assert that all such memory is unwired on return to usermode.
The count of the wired memory will be used to detect the copyout mode.
Tested by: pho (as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
copyout(9) while owning zone lock.
Despite old value sysctl buffer is wired, spurious faults might still
occur.
Note that we still own the uma_rwlock there, but this lock does not
participate in sensitive lock orders.
Reported and tested by: pho (as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
currently on the queue. This prevents accidentally doubly-removing a DAD
entry from the queue, while also simplifying some of the logic in
nd6_dad_stop().
Reviewed by: ae, hrs, vangyzen
MFC after: 2 weeks
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D10943
Most important for the future use, do not call
vm_fault_quick_hold_pages() with disabled pagefaults.
Reported and tested by: pho (as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Current code, which copies the potential syscall arguments into the
current frame, puts an arbitrary limit on the number of syscall
arguments. Apparently, mmap(2) and lseek(2) (?) require larger
number. But there is an issue that stack is only need to be mapped to
contain the number of arguments required by the syscall, so copying
arbitrary large number of words from the stack is not completely safe.
Use different approach to convert lcall frame into int $0x80 frame in
place, by doing the retl in kernel. This also allows to stop proceed
vfork case specially, and stop making assumptions about %cs at the
syscall time.
Also, improve comments with the formulations provided by bde.
Reviewed and tested by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
controlled by the TCP_BLACKBOX option.
Enable this as part of amd64 GENERIC. For now, leave it disabled on
other platforms.
Sponsored by: Netflix, Inc.