ip_dn_io_ptr() (i.e. dummynet_io()) can return the mbuf immediately (as
opposed to owning it and later passing it through dummynet_send(), which
returns it to pf_test()). In that case we must clear the PF_TAG_DUMMYNET
flag to ensure we don't skip any subsequent firewall passes.
This can happen if we process a packet in PFIL_IN, set PF_TAG_DUMMYNET
on it, pass it to ip_dn_io_ptr() but have it returned immediately. The
packet continues its normal path, eventually hitting
pf_test(dir=PFIL_OUT), where we'd skip when we're not supposed to.
Sponsored by: Rubicon Communications, LLC ("Netgate")
This printf was designed to catch misqueued bio requests. Prior to
supporting read_bias == 0, we couldn't get anything but reads and writes
in this queue. However, for read_bias == 0 we queue everything except
BIO_DELETE to this queue, so remove the printf. We don't need to update
any statistics.
Sponsored by: Netflix
Robert Morris reported that, if a client sends an absurdly
large Owner/OwnerGroup string, the kernel malloc() for the
large size string can block forever.
This patch adds a sanity limit for Owner/OwnerGroup string
length. Since the RFCs do not specify any limit and FreeBSD
can handle a group name greater than 1Kbyte, the limit is
set at a generous 10Kbytes.
Reported by: rtm@lcs.mit.edu
PR: 260546
MFC after: 2 weeks
When the MDS of a pNFS service receives an Open/Create
and the file already exists, it must do a Setattr of
size == 0. Without this patch, this was eroneously
done via a VOP_SETAATR() call, which would set the
length of the MDS file to 0 (which is already is,
since all data lives on the DSs).
This patch fixes the problem by doing a nfsvno_setattr()
instead of VOP_SETATTR(), which knows to do a proxied
Setattr on the DSs.
For a non-pNFS server, the change has no effect, since
nfsvno_setattr() only does a VOP_SETATTR() for that case.
This was found during a recent IETF NFSv4 testing event.
MFC after: 2 weeks
This reverts commit 91f44749c6.
Devirtualization of V_if_index and V_ifindex_table was rushed into
the tree lacking proper context, discussion, and declaration of intent,
so I'm backing it out as harmful to VNET on the following grounds:
1) The change repurposed the decades-old and stable if_index KBI for
new, unclear goals which were omitted from the commit note.
2) The change opened up a new resource exhaustion vector where any vnet
could starve the system of ifnet indices, including vnet0.
3) To circumvent the newly introduced problem of separating ifnets
belonging to different vnets from the globalized ifindex_table, the
author introduced sysctl_ifcount() which does a linear traversal over
the (potentially huge) global ifnet list just to return a simple upper
bound on existing ifnet indices.
4) The change effectively led to nonuniform ifnet index allocation
among vnets.
5) The commit note clearly stated that the patch changed the implicit
if_index ABI contract where ifnet indices were assumed to be starting
from one. The commit note also included a correct observation that
holes in interface indices were always allowed, but failed to declare
that the userland-observable ifindex tables could now include huge
empty spans even under modest operating conditions.
6) The author had an earlier proposal in the works which did not
affect per-vnet ifnet lists (D33265) but which he abandoned without
providing the rationale behind his decision to do so, at the expense
of sacrificing the vnet isolation contract and if_index ABI / KBI.
Furthermore, the author agreed to back out his changes himself and
to follow up with a proposal for a less intrusive alternative, but
later silently declined to act. Therefore, I decided to resolve the
status-quo by backing this out myself. This in no way precludes a
future proposal aiming to mitigate ifnet-removal related system
crashes or panics to be accepted, provided it would not unnecessarily
compromise the goal of as strict as possible isolation between vnets.
Obtained from: github.com/glebius/FreeBSD/commits/backout-ifindex
This reverts commit 703e533da5.
Revert "ifnet/mbuf: provide KPI to serialize/restore m->m_pkthdr.rcvif"
This reverts commit e1882428dc.
Obtained from: github.com/glebius/FreeBSD/commits/backout-ifindex
Only drop BULK and INTERRUPT endpoints, to reset the data toggle,
because for other endpoint types this is not critical.
Tested by: ehaupt@
PR: 262882
MFC after: 3 hours
Sponsored by: NVIDIA Networking
When the NFSv4.1/4.2 client is doing a pnfs mount to
mirrored DS(s), asynchronous threads are used to do the
RPCs against the DS(s) concurrently. If a DS is slow
to reply, it is possible for the "cred" to be free'd
before the asynchronous thread is done with it, causing
a panic/crash.
This patch fixes the problem by acquiring a refcount on
the "cred" while it is being used by the asynchronous thread
for a DS RPC. This bug was found during a recent IETF
NFSv4 testing event.
This bug only affects "pnfs" mounts to mirrored pNFS
servers.
MFC after: 2 weeks
The page size specified for RQ, SQ and CQ is always in units of 4KBytes.
Make sure we subtract MLX5_ADAPTER_PAGE_SHIFT, 12, instead of PAGE_SHIFT
which may vary. This fixes support for using the mlx5en driver on systems
having non-4K page size.
Linux commit:
68cdf5d6e91068c98d6091b193dc7a5ab7dcf5eb
MFC after: 1 week
Sponsored by: NVIDIA Networking
Without this patch the NFSv4.1/4.2 server erroneously
always frees session slot zero for callbacks. This only
affects 4.1/4.2 mounts if the server has delegations
enabled or is a pNFS configuration. Even for those
cases, the effect is mainly to only use slot 0 for
callbacks, serializing all of them. There is a slight
chance that callbacks will fail if the client performs
them in a different order than received on the TCP
connection.
If this bug affects your server, you will see console
messages like:
newnfs_request: Bad session slot
This patch fixes the problem. Found during a recent
IETF NFSv4 testing event.
PR: 263728
MFC after: 2 weeks
This fixes incomplete commit 2e547442ab
New sysctl allows to mark transmitted PPPoE LCP Control
ethernet frames with needed 3-bit Priority Code Point (PCP) value.
Confirming driver like if_vlan(4) uses the value to fill
IEEE 802.1p class of service field.
This is similar to Cisco IOS "control-packets vlan cos priority"
command.
It helps to avoid premature disconnection of user sessions
due to control frame drops (LCP Echo etc.)
if network infrastructure has a botteleck at a switch
or the xdsl DSLAM.
See also:
https://sourceforge.net/p/mpd/discussion/44692/thread/c7abe70e3a/
Tested by: Klaus Fokuhl at SourceForge
MFC after: 2 weeks
Robert Morris reported that, for the case of SecinfoNoname
with the Parent option, providing a non-directory could
cause a crash.
This patch adds a sanity check for v_type == VDIR for
this case, to avoid the crash.
Reported by: rtm@lcs.mit.edu
PR: 260300
MFC after: 2 weeks
In the places where we set an integer to 0 or 1 and then use it like a
boolean, replace int with bool and 0/1 with false/true. Left alone
places where this is a function argument or return value. No functional
changes intended.
Sponsored by: Netflix
Allow a global setting for the read_bias for the dynamic io
scheduler. This allows global policy to be set, in addition to the
existing per-drive policy. kern.cam.iosched.read_bias is a new tunable.
Sponsored by: Netflix
Reviewed by: chs
Differential Revision: https://reviews.freebsd.org/D34365