- Add RATELIMIT kernel configuration keyword which must be set to
enable the new functionality.
- Add support for hardware driven, Receive Side Scaling, RSS aware, rate
limited sendqueues and expose the functionality through the already
established SO_MAX_PACING_RATE setsockopt(). The API support rates in
the range from 1 to 4Gbytes/s which are suitable for regular TCP and
UDP streams. The setsockopt(2) manual page has been updated.
- Add rate limit function callback API to "struct ifnet" which supports
the following operations: if_snd_tag_alloc(), if_snd_tag_modify(),
if_snd_tag_query() and if_snd_tag_free().
- Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT
flag, which tells if a network driver supports rate limiting or not.
- This patch also adds support for rate limiting through VLAN and LAGG
intermediate network devices.
- How rate limiting works:
1) The userspace application calls setsockopt() after accepting or
making a new connection to set the rate which is then stored in the
socket structure in the kernel. Later on when packets are transmitted
a check is made in the transmit path for rate changes. A rate change
implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the
destination network interface, which then sets up a custom sendqueue
with the given rate limitation parameter. A "struct m_snd_tag" pointer is
returned which serves as a "snd_tag" hint in the m_pkthdr for the
subsequently transmitted mbufs.
2) When the network driver sees the "m->m_pkthdr.snd_tag" different
from NULL, it will move the packets into a designated rate limited sendqueue
given by the snd_tag pointer. It is up to the individual drivers how the rate
limited traffic will be rate limited.
3) Route changes are detected by the NIC drivers in the ifp->if_transmit()
routine when the ifnet pointer in the incoming snd_tag mismatches the
one of the network interface. The network adapter frees the mbuf and
returns EAGAIN which causes the ip_output() to release and clear the send
tag. Upon next ip_output() a new "snd_tag" will be tried allocated.
4) When the PCB is detached the custom sendqueue will be released by a
non-blocking ifp->if_snd_tag_free() call to the currently bound network
interface.
Reviewed by: wblock (manpages), adrian, gallatin, scottl (network)
Differential Revision: https://reviews.freebsd.org/D3687
Sponsored by: Mellanox Technologies
MFC after: 3 months
Replace archaic "busses" with modern form "buses."
Intentionally excluded:
* Old/random drivers I didn't recognize
* Old hardware in general
* Use of "busses" in code as identifiers
No functional change.
http://grammarist.com/spelling/buses-busses/
PR: 216099
Reported by: bltsrc at mail.ru
Sponsored by: Dell EMC Isilon
Previously code ignored resid field and returned extra zeroes in case of
data underflow. Now it returns only real bytes received from target.
MFC after: 2 weeks
If our buffer is too small, we may receive part of the page, and should
not try read/write past the end of the buffer.
Reported by: Coverity
CID: 1368374, 1368375
MFC after: 1 week
This is very preliminary and mostly enough for me (with other patches)
to work on VHT support.
It adds:
* VHT20, VHT40 and VHT80 regulatory/band awareness
* VHT20, VHT40 and VHT80 channel configuration / population
* Parses vht channel specifications (eg ifconfig wlan0 create wlandev athp0 wlanmode monitor channel 36:vht/80)
* Configuration of VHT, VHT40, VHT80, VHT80+80, VHT160 channel
width (IEEE80211_FVHT_VHT* flags in net80211)
TODO:
* No VHT80+80 or VHT160 channels yet - I don't yet have hardware, and I'm
not yet sure how to support/populate VHT80+80 channels.
* No, I won't update the manpage until this is "more done", lest someone
tries using vht and gets upset with me.
* No, I won't commit the regulatory database I'm testing with, so you'll
just end up with no VHT channels ever populated. Which is good, as there
isn't an 11ac driver in-tree yet to try it with.
struct ip in ping(8):
sbin/ping/ping.c:1684:53: error: taking address of packed member
'ip_src' of class or structure 'ip' may result in an unaligned pointer
value [-Werror,-Waddress-of-packed-member]
(void)printf(" %s ", inet_ntoa(*(struct in_addr *)&ip->ip_src.s_addr));
^~~~~~~~~~~~~~~~~
sbin/ping/ping.c:1685:53: error: taking address of packed member
'ip_dst' of class or structure 'ip' may result in an unaligned pointer
value [-Werror,-Waddress-of-packed-member]
(void)printf(" %s ", inet_ntoa(*(struct in_addr *)&ip->ip_dst.s_addr));
^~~~~~~~~~~~~~~~~
MFC after: 3 days
The offending code has been dead ever since the import from OpenBSD in
r195805. OpenBSD later deleted that entire function.
Reported by: Coverity
CID: 500059
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
is a 32-bit socklen_t, do_get3() passes the kernel to access the wrong
32-bit half on big-endian LP64 machines when simply casting the 64-bit
size_t optlen to a socklen_t pointer.
While at it and given that the intention of do_get3() apparently is to
hide/wrap the fact that socket options are used for communication with
ipfw(4), change the optlen parameter of do_set3() to be of type size_t
and as such more appropriate than uintptr_t, too.
MFC after: 3 days
In this specific case the src address can be set to any, which was not
accepted prior to this commit.
pfSense bug report: https://redmine.pfsense.org/issues/6985
Reviewed by: kp
Obtained from: pfSense
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC (Netgate)
The primary purpose is to call nmount() in a loop with new iovec's so
free_iovec takes arguments by reference and resets their values.
Reviewed by: cem
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D8513
Changes include modifications in kernel crash dump routines, dumpon(8) and
savecore(8). A new tool called decryptcore(8) was added.
A new DIOCSKERNELDUMP I/O control was added to send a kernel crash dump
configuration in the diocskerneldump_arg structure to the kernel.
The old DIOCSKERNELDUMP I/O control was renamed to DIOCSKERNELDUMP_FREEBSD11 for
backward ABI compatibility.
dumpon(8) generates an one-time random symmetric key and encrypts it using
an RSA public key in capability mode. Currently only AES-256-CBC is supported
but EKCD was designed to implement support for other algorithms in the future.
The public key is chosen using the -k flag. The dumpon rc(8) script can do this
automatically during startup using the dumppubkey rc.conf(5) variable. Once the
keys are calculated dumpon sends them to the kernel via DIOCSKERNELDUMP I/O
control.
When the kernel receives the DIOCSKERNELDUMP I/O control it generates a random
IV and sets up the key schedule for the specified algorithm. Each time the
kernel tries to write a crash dump to the dump device, the IV is replaced by
a SHA-256 hash of the previous value. This is intended to make a possible
differential cryptanalysis harder since it is possible to write multiple crash
dumps without reboot by repeating the following commands:
# sysctl debug.kdb.enter=1
db> call doadump(0)
db> continue
# savecore
A kernel dump key consists of an algorithm identifier, an IV and an encrypted
symmetric key. The kernel dump key size is included in a kernel dump header.
The size is an unsigned 32-bit integer and it is aligned to a block size.
The header structure has 512 bytes to match the block size so it was required to
make a panic string 4 bytes shorter to add a new field to the header structure.
If the kernel dump key size in the header is nonzero it is assumed that the
kernel dump key is placed after the first header on the dump device and the core
dump is encrypted.
Separate functions were implemented to write the kernel dump header and the
kernel dump key as they need to be unencrypted. The dump_write function encrypts
data if the kernel was compiled with the EKCD option. Encrypted kernel textdumps
are not supported due to the way they are constructed which makes it impossible
to use the CBC mode for encryption. It should be also noted that textdumps don't
contain sensitive data by design as a user decides what information should be
dumped.
savecore(8) writes the kernel dump key to a key.# file if its size in the header
is nonzero. # is the number of the current core dump.
decryptcore(8) decrypts the core dump using a private RSA key and the kernel
dump key. This is performed by a child process in capability mode.
If the decryption was not successful the parent process removes a partially
decrypted core dump.
Description on how to encrypt crash dumps was added to the decryptcore(8),
dumpon(8), rc.conf(5) and savecore(8) manual pages.
EKCD was tested on amd64 using bhyve and i386, mipsel and sparc64 using QEMU.
The feature still has to be tested on arm and arm64 as it wasn't possible to run
FreeBSD due to the problems with QEMU emulation and lack of hardware.
Designed by: def, pjd
Reviewed by: cem, oshogbo, pjd
Partial review: delphij, emaste, jhb, kib
Approved by: pjd (mentor)
Differential Revision: https://reviews.freebsd.org/D4712
This is imported from NetBSD. The author--Joerg Sonnenberger--agreed
to apply a two-clause BSD license, just so the license was clear.
This source tree location matches NetBSD, and is the first place someone
might look for such a tool.
Obtained from: Joerg Sonnenberger via NetBSD
MFC after: 3 days
Sponsored by: Dell EMC
This adds support to camcontrol(8) and libcam(3) for getting and setting
the time on SCSI protocol drives. This is more commonly found on tape
drives, but is a SPC (SCSI Primary Commands) command, and may be found
on any device that speaks SCSI.
The new camcontrol timestamp subcommand allows getting the current device
time or setting the time to the current system time or any arbitrary time.
sbin/camcontrol/Makefile:
Add timestamp.c.
sbin/camcontrol/camcontrol.8:
Document the new timestamp subcommand.
sbin/camcontrol/camcontrol.c:
Add the timestamp subcommand to camcontrol.
sbin/camcontrol/camcontrol.h:
Add the timestamp() function prototype.
sbin/camcontrol/timestamp.c:
Timestamp setting and reporting functionality.
sys/cam/scsi/scsi_all.c:
Add two new CCB building functions, scsi_set_timestamp() and
scsi_report_timestamp(). Also, add a new helper function,
scsi_create_timestamp().
sys/cam/scsi/scsi_all.h:
Add CDB and parameter data for the the set and report timestamp
commands.
Add function declarations for the new CCB building and helper
functions.
Submitted by: Sam Klopsch
Sponsored by: Spectra Logic
MFC After: 2 weeks
It is quite specific mode of operation without storing on-disk metadata.
It can be useful in some cases in combination with some external control
tools handling mirror creation and disks hot-plug.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
the BIOCSETIF ioctl.
The kernel always copies an entire struct ifreq and IPv4 addresses will
always fit in an ifreq.
On systems with pointers larger than 64-bits, the computed size will be
less than the size of struct ifreq, potentially resulting in the kernel
attempting to copyin memory from outside the allocation.
Reviewed by: jhb
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D8445
and uses TCP for the Unmount RPC if the mount is over TCP.
Without this patch, umount does an Unmount RPC over UDP for all NFS mounts.
Suggested by: cperciva
Reviewed by: cperciva
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D8503
instead. Since we're little endian, we can get away with it. Also,
since the counters in quesitons would require billions of iops for
tens of billions of seconds to overflow, and since such data rates are
unlikely for people using i386 for a while, that's OK. The fastest
cards today can't do even a million IOPs.
Noticed by: dim@
Sponsored by: Netflix, Inc
it in human readable form. Include a pointer to the public spec that
was followed to implement this in the code. Samsung also implements
page 0xca on some of their drives, but the format is slighly
different, so the code skips printing zero keys. Samsung's log page
has additional, unknown data after the end of Intel defined data which
isn't displayed.
Supported by: Netfix, Inc
number is printed, even though you'd need like a billion IOPs for a 10
billion seconds to overflow the 64-bit counters (~300 years).
Sponsored by: Netflix, Inc
are valid or not. While many pages are reserved in the standard, that
doesn't make them invalid and future versions of the standard may
define then.
Sponsored by: Netflix, Inc