In Linux this takes a refcount_t argument but in linuxkpi struct kref
uses an atomic_t for the refcount and code in drm directly uses this
function with a kref so use an atomic_t here.
Reviewed by: bz
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D36099
This adds missing includes, uses the standard dirent.h rather than the
BSD-specific sys/dirent.h subset (which works on macOS but not Linux)
and works around Linux's lack of st_birthtim.
This allows usr.sbin/makefs to be added to LOCAL_XTOOL_DIRS again on
macOS and Linux so that disk images can be cross-built.
Reviewed by: markj
Fixes: 240afd8c1fcc ("makefs: Add ZFS support")
Differential Revision: https://reviews.freebsd.org/D36135
flsll is needed for makefs's new ZFS support, and the others are added
for completeness.
Reviewed by: emaste, arichardson
Fixes: 240afd8c1fcc ("makefs: Add ZFS support")
Differential Revision: https://reviews.freebsd.org/D36134
This is needed for building makefs as a cross-tool since the ZFS code
uses these APIs.
Reviewed by: emaste
Fixes: 240afd8c1fcc ("makefs: Add ZFS support")
Differential Revision: https://reviews.freebsd.org/D36133
This will allow the code to be reused by the cross-build sys/types.h
wrapper in order to provide the APIs for greater compatibility. This
also provides a path towards eventually removing the definitions from
sys/types.h altogether if so desired by gradually migrating users to
including sys/bitcount.h explicitly, but that is not the primary goal
here.
Note that the copyright header is a direct copu of sys/types.h's given
that's where this code comes from. This could be replaced in future with
a more specific one restricted to just the code in question, depending
on what the copyright for that code is.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D36132
Currently the code copies a struct timespec's raw bits as a pair of
uint64_t. On 64-bit systems this has the same representation, but on
32-bit issues there are two issues:
1. tv_sec is a time_t which is 32-bit on i386 specifically
2. tv_nsec is a long not a 64-bit integer
On i386, this means the assertion should fire as the size doesn't match.
On other 32-bit systems there are 4 bytes of padding after tv_nsec,
which in practice are probably 0, as this data is ultimately coming from
the kernel, so it's deterministic (though the padding bytes are not
required to be preserved by the compiler, so are strictly unspecified).
However, on 32-bit big-endian systems, the padding bytes are in the
wrong half to be harmless, resulting in the nanoseconds being multiplied
by 2^32.
Fix this all by marshalling via a real uint64_t pair like is done by the
real ZFS_TIME_ENCODE.
Reviewed by: markj
Fixes: 240afd8c1fcc ("makefs: Add ZFS support")
Differential Revision: https://reviews.freebsd.org/D36131
Rather than using 'git checkout' to move to the commit in question for
create and update, use the '--head' argument to 'arc diff'. This
avoids the need to alter the current checkout and the related bits to
save/restore HEAD in the current checkout.
Reviewed by: imp, markj
Differential Revision: https://reviews.freebsd.org/D36248
The fsnode tree traversal routines used in ZFS mode assume that all
children of a (directory) fsnode can be accessed using a directory fd
for the parent and the child name. This is true when populating the
image using an mtree manifest or from a single staging directory, but
doesn't work when multiple staging directories are specified.
Change the traversal routines to use absolute path lookups when an mtree
manifest is not in use. This isn't ideal, but it's the simplest way to
fix the problem.
Reported by: imp
Sponsored by: The FreeBSD Foundation
The UFS filesystem expects to find '.' and '..' as the first two entries
in a directory. The kernel's UFS name cache can become quite confused
when these two entries are not present as the first two entries.
Prior to this change, when the fsck_ffs(8) utility detected that
'.' and/or '..' were missing, it would report them, but only offered
to replace them if the space at the beginning of the directory was
available. Otherwise it was left to the system administrator to
move the offending file(s) out of the way and then rerun fsck_ffs(8)
to create the '.' and '..' entries.
With this change, fsck_ffs(8) will always be able to create the '.'
and/or '..' entries. It moves any files in the way elsewhere in the
directory block. If there is no room in the directory block to which
to move them, they are placed in the lost+found directory.
Reported by: Peter Holm
Sponsored by: The FreeBSD Foundation
When either searching for backup UFS superblocks or when explicitly asked
to use one with the -b option, report the reason for failure if it cannot
be used.
Reported by: Peter Holm
Sponsored by: The FreeBSD Foundation
Implement the virtio_bus_finalize_features method so the FEATURES_OK
status bit is set. Implement the virtio_bus_config_generation method
to ensure larger than 4-byte reads are consistent.
Reviewed by: cperciva
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D36150
This permits inlining the comparisons even in the 16K page case.
Note that since PC_FREEN is -1, values can be compared efficiently
without having to fetch words of pc_freemask from memory via the
'cmn <reg>, #0x1' instruction.
Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36218
- Define PC_FREEL and _NPCM in terms of _NPCPV rather than via magic
numbers.
- Remove assertions about _NPC* values from pmap.c. This is less
relevant now that PC_FREEL and _NPCM are derived from _NPCPV.
- Add a helper inline function pc_is_full() which uses a loop to check
if pc_map is all zeroes. Use this to replace three places that
check for a full mask assuming there are only 3 entries in pc_map.
Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36217
o Assert that every protosw has pr_attach. Now this structure is
only for socket protocols declarations and nothing else.
o Merge struct pr_usrreqs into struct protosw. This was suggested
in 1996 by wollman@ (see 7b187005d18ef), and later reiterated
in 2006 by rwatson@ (see 6fbb9cf860dcd).
o Make struct domain hold a variable sized array of protosw pointers.
For most protocols these pointers are initialized statically.
Those domains that may have loadable protocols have spacers. IPv4
and IPv6 have 8 spacers each (andre@ dff3237ee54ea).
o For inetsw and inet6sw leave a comment noting that many protosw
entries very likely are dead code.
o Refactor pf_proto_[un]register() into protosw_[un]register().
o Isolate pr_*_notsupp() methods into uipc_domain.c
Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36232
For many years only TCP debugging used them, but relatively recently
TCP DTrace probes also start to use them. Move their declarations
into tcp_debug.h, but start including tcp_debug.h unconditionally,
so that compilation with DTrace and without TCPDEBUG is possible.
The method was called for two different conditions: 1) the VM layer is
low on pages or 2) one of UMA zones of mbuf allocator exhausted.
This change 2) into a new event handler, but all affected network
subsystems modified to subscribe to both, so this change shall not
bring functional changes under different low memory situations.
There were three subsystems still using pr_drain: TCP, SCTP and frag6.
The latter had its protosw entry for the only reason to register its
pr_drain method.
Reviewed by: tuexen, melifaro
Differential revision: https://reviews.freebsd.org/D36164
They were useful many years ago, when the callwheel was not efficient,
and the kernel tried to have as little callout entries scheduled as
possible.
Reviewed by: tuexen, melifaro
Differential revision: https://reviews.freebsd.org/D36163
While here remove recursive network epoch entry in mld_fasttimo_vnet(),
as this function is already in epoch.
Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36161
Modern TCP stacks uses multiple callouts per tcpcb, and a global
callout is ancient artifact. However it is still used to garbage
collect compressed timewait entries.
Reviewed by: melifaro, tuexen
Differential revision: https://reviews.freebsd.org/D36159
The protosw KPI historically has implemented two quite orthogonal
things: protocols that implement a certain kind of socket, and
protocols that are IPv4/IPv6 protocol. These two things do not
make one-to-one correspondence. The pr_input and pr_ctlinput methods
were utilized only in IP protocols. This strange duality required
IP protocols that doesn't have a socket to declare protosw, e.g.
carp(4). On the other hand developers of socket protocols thought
that they need to define pr_input/pr_ctlinput always, which lead to
strange dead code, e.g. div_input() or sdp_ctlinput().
With this change pr_input and pr_ctlinput as part of protosw disappear
and IPv4/IPv6 get their private single level protocol switch table
ip_protox[] and ip6_protox[] respectively, pointing at array of
ipproto_input_t functions. The pr_ctlinput that was used for
control input coming from the network (ICMP, ICMPv6) is now represented
by ip_ctlprotox[] and ip6_ctlprotox[].
ipproto_register() becomes the only official way to register in the
table. Those protocols that were always static and unlikely anybody
is interested in making them loadable, are now registered by ip_init(),
ip6_init(). An IP protocol that considers itself unloadable shall
register itself within its own private SYSINIT().
Reviewed by: tuexen, melifaro
Differential revision: https://reviews.freebsd.org/D36157
Move the mbr non-geli zfs cases to no-priv creation with makefs / mkimg.
Add comments about the weird thing we do for MBR + ZFS + Legacy. Add
comments about other architectures. Still need to think through how to
leverage a completed universe to do all the architectures...
Sponsored by: Netflix
Certain operations such as checksum insertion and VLAN insertion
require the device model to rewrite the packet header. The first step
in rewriting the packet header is to copy the existing packet header
from the source packet. This copy is done by copying data from an
iovec array that corresponds to the S/G entries described by transmit
descriptors. However, if the total packet length is smaller than the
headers that need to be copied as the initial template, this copy can
overflow the iovec array and use garbage values as the source pointer
to memcpy. The PR used a single descriptor with a length of 0 in its
PoC.
To fix, track the total packet length and drop requests to transmit
packets whose payload is smaller than the required header length.
While here, fix another issue where the final descriptor could have an
invalid length (too short) that could underflow 'len' when stripping
the checksum. Skip those requests instead, too.
PR: 264372
Reported by: Robert Morris <rtm@lcs.mit.edu>
Reviewed by: grehan, markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36182
This avoids type confusion where a malicious guest could rewrite the
MaxPStreams field in an endpoint context after the endpoint was
initialized causing the device model to interpret a guest provided
address (stored in ep_ringaddr of the "software" endpoint state) as a
bhyve host process address (ep_sctx_trbs). It also prevents a malicious
guest from triggering overflows of ep_sctx_trbs[] by increasing the
number of streams after the endpoint has been initialized.
Rather than re-reading the MaxPStreams value out of the endpoint context
in guest memory on subsequent operations, cache the value in the software
endpoint state. Possibly the device model should raise errors if the
value of MaxPStreams changes while an endpoint is running. This approach
simply ignores any such changes by the guest.
PR: 264294, 264347
Reported by: Robert Morris <rtm@lcs.mit.edu>
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36181
This reverts commit f6ffed44a8eb5d1ab89a18e60fb056aab2105be7.
fetchadd will fail the waiters flag, which can cause other
code to wait when it should not with nothing clear it
Revert until I sort this out.
Reported by: markj
"Invalid TXQ id" and "Queue <n> is stuck <x> <y>" are two errors seen
more commonly by FreeBSD users. Try to gather some extra data the
"easy way" adding more error logging for these situations in the hope
to find a clue or at least do more targetd debugging in the future.
Note that for one of the errors the Linux Intel driver has a TODO to
print register data. If that will show up in future versions of the
driver this may also help.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
If hardware vlan tagging is disabled (after a vlan has been added) we
receive double-tagged packets, even if the packet on the wire only has a
single VLAN tag. That looks like this:
17:29:30.370787 00:51:82:11:22:02 > 90:ec:77:1f:8a:5f, ethertype 802.1Q (0x8100), length 64: vlan 0, p 0, ethertype 802.1Q, vlan 1001, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.101.0.12 is-at 00:51:82:11:22:02, length 42
This happens because the ixgbe driver does not clear the vlan flags in
the hardware (such as IXGBE_RXDCTL_VME) if IFCAP_VLAN_HWTAGGING is
cleared.
Add code to do so, which fixes this issue.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D36139
This also fixes a bug where not-last unbusy failed to post a release
fence.
Reviewed by: markj (previous version), kib (previous version)
Differential Revision: https://reviews.freebsd.org/D36084