When epoch(9) was introduced to network stack, it was basically
dropped in place of existing locking, which was mutexes and
rwlocks. For the sake of performance mutex covered areas were
as small as possible, so became epoch covered areas.
However, epoch doesn't introduce any contention, it just delays
memory reclaim. So, there is no point to minimise epoch covered
areas in sense of performance. Meanwhile entering/exiting epoch
also has non-zero CPU usage, so doing this less often is a win.
Not the least is also code maintainability. In the new paradigm
we can assume that at any stage of processing a packet, we are
inside network epoch. This makes coding both input and output
path way easier.
On output path we already enter epoch quite early - in the
ip_output(), in the ip6_output().
This patch does the same for the input path. All ISR processing,
network related callouts, other ways of packet injection to the
network stack shall be performed in net_epoch. Any leaf function
that walks network configuration now asserts epoch.
Tricky part is configuration code paths - ioctls, sysctls. They
also call into leaf functions, so some need to be changed.
This patch would introduce more epoch recursions (see EPOCH_TRACE)
than we had before. They will be cleaned up separately, as several
of them aren't trivial. Note, that unlike a lock recursion the
epoch recursion is safe and just wastes a bit of resources.
Reviewed by: gallatin, hselasky, cy, adrian, kristof
Differential Revision: https://reviews.freebsd.org/D19111
can handle. Instead using an array on node private data, use per-hook
private data.
- Use NG_NODE_FOREACH_HOOK() to traverse through hooks instead of array.
PR: 240787
Submitted by: Lutz Donnerhacke <lutz donnerhacke.de>
Differential Revision: https://reviews.freebsd.org/D21803
This removes the last consumer of the modified zlib originally
bundled with Paul's PPP implementation, which will be removed
in a follow up commit.
PR: 229763
Differential Revision: https://reviews.freebsd.org/D21186
Add helper function for synchronous execution of HCI commands at probe
stage and use this function to check firmware state of Intel Wireless
8260/8265 bluetooth devices found in many post 2016 year laptops.
Attempt to initialize FreeBSD bluetooth stack while such a device is in
bootloader mode locks the adapter hardly so it requires power on/off
cycle to restore.
This change blocks ng_ubt attachment unless operational firmware is
loaded thus preventing the lock up.
PR: 237083
Reviewed by: hps, emax
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D21071
or the likes. Add new control message types: setdlt and getdlt to switch
from default DLT_RAW (no encapsulation) to DLT_EN10MB (ethernet).
Approved by: glebius
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D18535
tv_usec has "long" type for all architecture in FreeBSD
and follows __LP64__. However, this is not true for tv_sec
that has "time_t" type.
Since r320347 that changed time_t from 32 to 64 bit integer
for 32 bit version of powerpc architecture, we have only single
i386 architecture having 32 bit time_t type.
Submitted by: jhb
MFC after: 1 week.
reordered in transit instead of dropping them altogether.
It uses sequence numbers of PPtPGRE packets.
A set of new sysctl(8) added to control this ability or disable it:
net.graph.pptpgre.reorder_max (1) defines maximum length of node's
private reorder queue used to keep data waiting for late packets.
Zero value disables reordering. Default value 1 allows the node to restore
the order for two packets swapped in transit. Greater values allow the node
to deliver packets being late after more packets in sequence
at cost of increased kernel memory usage.
net.graph.pptpgre.reorder_timeout (1) defines time value in miliseconds
used to wait for late packets. It may be useful to increase this
if reordering spot is distant.
MFC after: 1 month
given in random(4).
This includes updating of the relevant man pages, and no-longer-used
harvesting parameters.
Ensure that the pseudo-unit-test still does something useful, now also
with the "other" algorithm instead of Yarrow.
PR: 230870
Reviewed by: cem
Approved by: so(delphij,gtetlow)
Approved by: re(marius)
Differential Revision: https://reviews.freebsd.org/D16898
A few ISP filter PADI requests based on such tag,
to force the use of their own routers.
The custom Host-Uniq tag is passed in the NGM_PPPOE_CONNECT
control message, so it can be used with FreeBSD ppp(8)
and mpd without any other change.
Add support to send and receive PADM messages,
HURL and MOTM, often used by service providers to provide
ACS information and other configuration settings
to the user CPE.
Submitted by: ale
Approved by: mav (mentor)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D9270
Uses of mallocarray(9).
The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.
Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.
Reported by: wosch
PR: 225197
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.
This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.
X-Differential revision: https://reviews.freebsd.org/D13837
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
No functional change intended.
o Separate fields of struct socket that belong to listening from
fields that belong to normal dataflow, and unionize them. This
shrinks the structure a bit.
- Take out selinfo's from the socket buffers into the socket. The
first reason is to support braindamaged scenario when a socket is
added to kevent(2) and then listen(2) is cast on it. The second
reason is that there is future plan to make socket buffers pluggable,
so that for a dataflow socket a socket buffer can be changed, and
in this case we also want to keep same selinfos through the lifetime
of a socket.
- Remove struct struct so_accf. Since now listening stuff no longer
affects struct socket size, just move its fields into listening part
of the union.
- Provide sol_upcall field and enforce that so_upcall_set() may be called
only on a dataflow socket, which has buffers, and for listening sockets
provide solisten_upcall_set().
o Remove ACCEPT_LOCK() global.
- Add a mutex to socket, to be used instead of socket buffer lock to lock
fields of struct socket that don't belong to a socket buffer.
- Allow to acquire two socket locks, but the first one must belong to a
listening socket.
- Make soref()/sorele() to use atomic(9). This allows in some situations
to do soref() without owning socket lock. There is place for improvement
here, it is possible to make sorele() also to lock optionally.
- Most protocols aren't touched by this change, except UNIX local sockets.
See below for more information.
o Reduce copy-and-paste in kernel modules that accept connections from
listening sockets: provide function solisten_dequeue(), and use it in
the following modules: ctl(4), iscsi(4), ng_btsocket(4), ng_ksocket(4),
infiniband, rpc.
o UNIX local sockets.
- Removal of ACCEPT_LOCK() global uncovered several races in the UNIX
local sockets. Most races exist around spawning a new socket, when we
are connecting to a local listening socket. To cover them, we need to
hold locks on both PCBs when spawning a third one. This means holding
them across sonewconn(). This creates a LOR between pcb locks and
unp_list_lock.
- To fix the new LOR, abandon the global unp_list_lock in favor of global
unp_link_lock. Indeed, separating these two locks didn't provide us any
extra parralelism in the UNIX sockets.
- Now call into uipc_attach() may happen with unp_link_lock hold if, we
are accepting, or without unp_link_lock in case if we are just creating
a socket.
- Another problem in UNIX sockets is that uipc_close() basicly did nothing
for a listening socket. The vnode remained opened for connections. This
is fixed by removing vnode in uipc_close(). Maybe the right way would be
to do it for all sockets (not only listening), simply move the vnode
teardown from uipc_detach() to uipc_close()?
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D9770
patm(4) devices.
Maintaining an address family and framework has real costs when we make
infrastructure improvements. In the case of NATM we support no devices
manufactured in the last 20 years and some will not even work in modern
motherboards (some newer devices that patm(4) could be updated to
support apparently exist, but we do not currently have support).
With this change, support remains for some netgraph modules that don't
require NATM support code. It is unclear if all these should remain,
though ng_atmllc certainly stands alone.
Note well: FreeBSD 11 supports NATM and will continue to do so until at
least September 30, 2021. Improvements to the code in FreeBSD 11 are
certainly welcome.
Reviewed by: philip
Approved by: harti
There were several places where reference to compression were left
unfinished. Furthermore, KASSERTs contained references to MPPC_INVALID
which is not defined in the tree and therefore were sure to break with
INVARIANTS: comment them out.
Reported by: Eugene Grosbein
PR: 216265
MFC after: 3 days
callout_stop() recently started returning -1 when the callout is already
stopped, which is not handled by the netgraph code. Properly filter
the return value. Netgraph callers only want to know if the callout
was cancelled and not draining or already stopped.
Discussed with: julian, glebius
MFC after: 2 weeks
contiguous memory but in one path we did not always guarantee this,
thus do a m_pullup() there.
PR: 214385
Submitted by: Joe Jones (joeknockando googlemail.com)
MFC after: 3 days
lot of this module and I want to get the style and whitespace changes in
a separate commit (or maybe more).
PR: 206185
Submitted by: Dmitry Vagin
MFC after: 1 month
than removing the network interfaces first. This change is rather larger
and convoluted as the ordering requirements cannot be separated.
Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and
related modules to their own SI_SUB_PROTO_FIREWALL.
Move initialization of "physical" interfaces to SI_SUB_DRIVERS,
move virtual (cloned) interfaces to SI_SUB_PSEUDO.
Move Multicast to SI_SUB_PROTO_MC.
Re-work parts of multicast initialisation and teardown, not taking the
huge amount of memory into account if used as a module yet.
For interface teardown we try to do as many of them as we can on
SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling
over a higher layer protocol such as IP. In that case the interface
has to go along (or before) the higher layer protocol is shutdown.
Kernel hhooks need to go last on teardown as they may be used at various
higher layers and we cannot remove them before we cleaned up the higher
layers.
For interface teardown there are multiple paths:
(a) a cloned interface is destroyed (inside a VIMAGE or in the base system),
(b) any interface is moved from a virtual network stack to a different
network stack ("vmove"), or (c) a virtual network stack is being shut down.
All code paths go through if_detach_internal() where we, depending on the
vmove flag or the vnet state, make a decision on how much to shut down;
in case we are destroying a VNET the individual protocol layers will
cleanup their own parts thus we cannot do so again for each interface as
we end up with, e.g., double-frees, destroying locks twice or acquiring
already destroyed locks.
When calling into protocol cleanups we equally have to tell them
whether they need to detach upper layer protocols ("ulp") or not
(e.g., in6_ifdetach()).
Provide or enahnce helper functions to do proper cleanup at a protocol
rather than at an interface level.
Approved by: re (hrs)
Obtained from: projects/vnet
Reviewed by: gnn, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6747