- Add RATELIMIT kernel configuration keyword which must be set to
enable the new functionality.
- Add support for hardware driven, Receive Side Scaling, RSS aware, rate
limited sendqueues and expose the functionality through the already
established SO_MAX_PACING_RATE setsockopt(). The API support rates in
the range from 1 to 4Gbytes/s which are suitable for regular TCP and
UDP streams. The setsockopt(2) manual page has been updated.
- Add rate limit function callback API to "struct ifnet" which supports
the following operations: if_snd_tag_alloc(), if_snd_tag_modify(),
if_snd_tag_query() and if_snd_tag_free().
- Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT
flag, which tells if a network driver supports rate limiting or not.
- This patch also adds support for rate limiting through VLAN and LAGG
intermediate network devices.
- How rate limiting works:
1) The userspace application calls setsockopt() after accepting or
making a new connection to set the rate which is then stored in the
socket structure in the kernel. Later on when packets are transmitted
a check is made in the transmit path for rate changes. A rate change
implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the
destination network interface, which then sets up a custom sendqueue
with the given rate limitation parameter. A "struct m_snd_tag" pointer is
returned which serves as a "snd_tag" hint in the m_pkthdr for the
subsequently transmitted mbufs.
2) When the network driver sees the "m->m_pkthdr.snd_tag" different
from NULL, it will move the packets into a designated rate limited sendqueue
given by the snd_tag pointer. It is up to the individual drivers how the rate
limited traffic will be rate limited.
3) Route changes are detected by the NIC drivers in the ifp->if_transmit()
routine when the ifnet pointer in the incoming snd_tag mismatches the
one of the network interface. The network adapter frees the mbuf and
returns EAGAIN which causes the ip_output() to release and clear the send
tag. Upon next ip_output() a new "snd_tag" will be tried allocated.
4) When the PCB is detached the custom sendqueue will be released by a
non-blocking ifp->if_snd_tag_free() call to the currently bound network
interface.
Reviewed by: wblock (manpages), adrian, gallatin, scottl (network)
Differential Revision: https://reviews.freebsd.org/D3687
Sponsored by: Mellanox Technologies
MFC after: 3 months
the fifth argument to functions being traced, however there was an error
where the userspace stack was being used. This may be invalid leading to
a kernel panic if this address is unmapped.
Submitted by: Graeme Jenkinson <graeme.jenkinson@cl.cam.ac.uk>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9229
As the efi_devpath_last_node() and efi_devpath_trim() can return NULL
pointers, the consumers of this API should check the the NULL pointers.
Same for efinet_dev_init() using calloc().
Reported by: Robert Mustacchi <rm@joyent.com>
Reviewed by: jhb, allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9203
Clang apparently requires the explicit form of this instruction, and rejects
uses which ignore the optional cmpD register. This was the only use of the
shorthand form of the instruction, so just fix it up to match the others.
PR: kern/215681
Submitted by: Mark Millard
Reported by: Mark Millard <markmi _AT_ dsl-only.net>
MFC after: 2 weeks
Languages like C++17 and Go provide direct support for slice types:
pointer/length pairs. The CloudABI generator now has more complete for
this, meaning that for the C binding, pointer/length pairs now use an
automatic naming scheme of ${name} and ${name}_len.
Apart from this change and some reformatting, the ABI definitions are
identical. Binary compatibility is preserved entirely.
This field has no practical use and never readed. Initiators already
receive respective residual size from frontends. Removed field had
different semantics, which looks useless, and was never passed through
by any frontend.
While there, fix kern_data_resid field support in case of HA, missed in
r312291.
MFC after: 13 days
This lock was replaced from rwlock in r272840. But unlike rwlock, rmlock
doesn't allow recursion on rm_rlock(), so at this time fix this with
RM_RECURSE flag. Later we need to change ipfw to avoid such recursions.
PR: 216171
Reported by: Eugene Grosbein
MFC after: 1 week
to a non-wildcard address. As documented in ip(4), doing sendmsg(2) with
IP_SENDSRCADDR on a socket that is bound to non-wildcard address is
completely different to using this control message on a wildcard one.
A fix is to add a bool to mark whether we did setsockopt(IP_RECVDSTADDR)
on the socket, and use IP_SENDSRCADDR control message only if we did.
While here, garbage collect absolutely useless udp_recv() function that
establishes some structures on stack to never use them later.
Running smilint against MIB definitions is useful in finding
functional problems with MIB definitions/descriptions.
This is inspired by the smilint targets defined in
usr.sbin/bsnmpd/modules/{snmp_hostres,snmp_mibII}/Makefile
Document all of the variables that are involved in running the
smilint target, as well as all of the prerequisites to running
it.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9099
The Zedboard has a hardware bug where initialization of the USB PHY
occasionally fails on boot-up. Fix regression in -CURRENT when
kernel panics on such occasion. 11-RELEASE branch works fine
PR: 215862
Submitted by: Thomas Skibo <thoma555-bsd@yahoo.com>
Previously "panic: msleep" could happen for a few different reasons.
Break the KASSERTs out into individual cases to identify the failing
condition. Found during the investigation that resulted in r308288.
Reviewed by: kib, jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D8604
sources to return timestamps when SO_TIMESTAMP is enabled. Two additional
clock sources are:
o nanosecond resolution realtime clock (equivalent of CLOCK_REALTIME);
o nanosecond resolution monotonic clock (equivalent of CLOCK_MONOTONIC).
In addition to this, this option provides unified interface to get bintime
(equivalent of using SO_BINTIME), except it also supported with IPv6 where
SO_BINTIME has never been supported. The long term plan is to depreciate
SO_BINTIME and move everything to using SO_TS_CLOCK.
Idea for this enhancement has been briefly discussed on the Net session
during dev summit in Ottawa last June and the general input was positive.
This change is believed to benefit network benchmarks/profiling as well
as other scenarios where precise time of arrival measurement is necessary.
There are two regression test cases as part of this commit: one extends unix
domain test code (unix_cmsg) to test new SCM_XXX types and another one
implementis totally new test case which exchanges UDP packets between two
processes using both conventional methods (i.e. calling clock_gettime(2)
before recv(2) and after send(2)), as well as using setsockopt()+recv() in
receive path. The resulting delays are checked for sanity for all supported
clock types.
Reviewed by: adrian, gnn
Differential Revision: https://reviews.freebsd.org/D9171
gtaskqueue bits at SI_SUB_INIT_IF instead of waiting until SI_SUB_SMP
which is far too late.
Add an assertion in taskqgroup_attach() to catch startup initialization
failures in the future.
Reported by: kib bde
it into pmap-v4.h where they are used. Other than those few lines of
support for different MMU types, nothing in cpuconf.h has been used in our
code for quite a while.
The file existed to set up a variety of symbols to describe the
architecture. Over the past few years we have converted all of our source
to use the new architecture symbols standardized by ARM Inc, and predefined
by both clang and gcc.
PR: 216104
It seems like kern_data_resid was never really implemented. This change
finally does it. Now frontends update this field while transferring data,
while CTL/backends getting it can more flexibly handle the result.
At this point behavior should not change significantly, still reporting
errors on write overrun, but that may be changed later, if we decide so.
CAM target frontend still does not properly handle overruns due to CAM API
limitations. We may need to add some fields to struct ccb_accept_tio to
pass information about initiator requested transfer size(s).
MFC after: 2 weeks
This patch adds driver for temperature/humidity sensor connected via GPIO.
To compile it into kernel add "device gpioths". To activate driver, use
hints (.at and .pins) for gpiobus. As result it will provide temperature &
humidity values via sysctl.
DHT11 is cheap & popular temperature/humidity sensor used via GPIO on ARM
or MIPS devices like Raspberry Pi or Onion Omega.
Reviewed by: adrian
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D9185
On rela architectures GNU BFD ld and gold store the relocation addend
in GOT entries (in addition to the relocation's r_addend field).
rtld previously relied on this to access its own _DYNAMIC symbol in
order to apply its own relocations.
However, recording addends in the GOT is not specified by the ABI,
and some versions of LLVM's LLD linker leave the GOT uninitialized on
rela architectures.
BFD ld does not populate the GOT on sparc64, and sparc64 rtld has a
machine-dependent rtld_dynamic_addr() function that returns the
_DYNAMIC address. Use the same approach on amd64, obtaining the %rip-
relative _DYNAMIC address following a suggestion from Rafael Espíndola.
Architectures other than amd64 should be addressed in future work.
PR: 214972
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D9180
The find_currdev() is using variable "copy" to store the reference to trimmed
devpath pointer, if for some reason the efi_devpath_handle() fails, we will
leak this copy.
Also we can simplify the code there a bit.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9191
Replace archaic "busses" with modern form "buses."
Intentionally excluded:
* Old/random drivers I didn't recognize
* Old hardware in general
* Use of "busses" in code as identifiers
No functional change.
http://grammarist.com/spelling/buses-busses/
PR: 216099
Reported by: bltsrc at mail.ru
Sponsored by: Dell EMC Isilon
sh has defaulted to 'set -o emacs' since FreeBSD 9.0. Therefore, do not set
this again in .shrc, since that only serves to prevent invocations like
'sh -o vi' and 'sh +o emacs' to have the intended effect.
PR: 215958
Submitted by: Andras Farkas
MFC after: 1 week