A small example of how these functions can be used to simplify checks of
this nature.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26271
This adds the getenv_bool() function, to parse a boolean value from a
kernel environment variable or tunable. This works for traditional
boolean values like "0" and "1", and also "true" and "false"
(case-insensitive). These semantics do not yet apply to sysctls declared
using SYSCTL_BOOL with CTLFLAG_TUN (they still only parse 1 and 0).
Also added are two wrapper functions, getenv_is_true() and
getenv_is_false(). These are slightly simpler for callers wishing to
perform a single check of a configuration variable.
Reviewed by: jhb (slightly earlier version)
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26270
OTG mode is not supported still. It's easy to do it as a one-off
detection, but the proper support requires continuous monitoring and
communicating the current state to the USB layer.
Also, fix phy0_route setting for H3. Remove duplicate register
definitions.
Tested on Orange Pi PC Plus with dr_mode="peripheral" using
hw.usb.template=3
umodem_load="YES"
Reviewed by: manu
MFC after: 5 weeks
Differential Revision: https://reviews.freebsd.org/D26348
bootonce feature is temporary, one time boot, activated by
"bectl activate -t BE", "bectl activate -T BE" will reset the bootonce flag.
By default, the bootonce setting is reset on attempt to boot and the next
boot will use previously active BE.
By setting zfs_bootonce_activate="YES" in rc.conf, the bootonce BE will
be set permanently active.
bootonce dataset name is recorded in boot pool labels, bootenv area.
in case of nextboot, the nextboot_enable boolean variable is recorded in
freebsd:nvstore nvlist, also stored in boot pool label bootenv area.
On boot, the loader will process /boot/nextboot.conf if nextboot_enable
is "YES", and will set nextboot_enable to "NO", preventing /boot/nextboot.conf
processing on next boot.
bootonce and nextboot features are usable in both UEFI and BIOS boot.
To use bootonce/nextboot features, the boot loader needs to be updated on disk;
if loader.efi is stored on ESP, then ESP needs to be updated and
for BIOS boot, stage2 (zfsboot or gptzfsboot) needs to be updated
(gpart or other tools).
At this time, only lua loader is updated.
Sponsored by: Netflix, Klara Inc.
Differential Revision: https://reviews.freebsd.org/D25512
This was broken in r357940 which introduced the __typeof use. We need
the volatile qualifier to be on the pointee not the pointer otherwise it
does nothing. This was found by mhorne in D26498, noticing there was a
problem (a spin loop condition was hoisted for RISC-V boot code) but not
the root cause of it.
Reported by: mhorne
Reviewed by: mhorne, mjg
Approved by: mhorne, mjg
Differential Revision: https://reviews.freebsd.org/D26500
Possible in LA57 pmap config.
Noted by: alc
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26492
* Zero gw_sdl if switching to interface route - the assumption
that underlying storage is zeroed is incorrect with route changes.
* Apply proper flag mask to rte.
Reported by: vangyzen
- Move assertions out of the main loop to avoid duplicate conditional
expressions, and improve assertion messages.
- Fix va_next updates. In some cases we were not doing the wraparound
check before continuing the loop.
- Use the right va_next. In pmap_advise() and pmap_copy() we would step
through 1G pages 2M at a time.
- Copy 1G mappings in pmap_copy().
Reviewed by: alc, kib
MFC with: r365518
Sponsored by: Juniper Networks, Inc., Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26463
Due to a HW bug in the RockChip PCIe implementation, attempting to access
a non-existent register in the configuration space will throw an exception.
Use new bus functions bus_peek() and bus_poke() to overcomme this limitation.
One problem with the bus_space_read_N() and bus_space_write_N() family of
functions is that they provide no protection against exceptions which can
occur when no physical hardware or device responds to the read or write
cycles. In such a situation, the system typically would panic due to a
kernel-mode bus error. The bus_space_peek_N() and bus_space_poke_N() family
of functions provide a mechanism to handle these exceptions gracefully
without the risk of crashing the system.
Typical example is access to PCI(e) configuration space in bus enumeration
function on badly implemented PCI(e) root complexes (RK3399 or Neoverse
N1 N1SDP and/or access to PCI(e) register when device is in deep sleep state.
This commit adds a real implementation for arm64 only. The remaining
architectures have bus_space_peek()/bus_space_poke() emulated by using
bus_space_read()/bus_space_write() (without exception handling).
MFC after: 1 month
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D25371
Recent testing of the NFS-over-TLS code found a LOR between the mutex lock
used for sessions and the sleep lock used for server side krpc socket
structures in nfsrv_checksequence(). This was fixed by r365789.
A similar bug exists in nfsrv_bindconnsess(), where SVC_RELEASE() is called
while mutexes are held.
This patch applies a fix similar to r365789, moving the SVC_RELEASE() call
down to after the mutexes are released.
This patch fixes the problem by moving the SVC_RELEASE() call in
nfsrv_checksequence() down a few lines to below where the mutex is released.
MFC after: 1 week
Fix the logic so it works as it appears.
Reported by: Coverity
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: D26211 (in progress, so omitting full URL)
vm_ooffset_t is now unsigned. Remove some tests for negative values,
or make other adjustments accordingly.
Reported by: Coverity
Reviewed by: kib markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D26214
Move the initialization of these variables to the beginning of their
respective functions.
On our end this creates a small amount of unneeded churn, as these
variables are properly initialized before their first use in all cases.
However, changing this benefits at least one downstream consumer
(NetApp) by allowing local and future modifications to these functions
to be made without worrying about where the initialization occurs.
Reviewed by: melifaro, rscheff
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26454
Hardware assistance includes checksumming (tx and rx), TSO, and RSS on
the inner traffic in a VXLAN tunnel.
Relnotes: Yes
Sponsored by: Chelsio Communications
This lets a VXLAN pseudo-interface take advantage of hardware checksumming (tx
and rx), TSO, and RSS if the NIC is capable of performing these operations on
inner VXLAN traffic.
A VXLAN interface inherits the capabilities of its vxlandev interface if one is
specified or of the interface that hosts the vxlanlocal address. If other
interfaces will carry traffic for that VXLAN then they must have the same
hardware capabilities.
On transmit, if_vxlan verifies that the outbound interface has the required
capabilities and then translates the CSUM_ flags to their inner equivalents.
This tells the hardware ifnet that it needs to operate on the inner frame and
not the outer VXLAN headers.
An event is generated when a VXLAN ifnet starts. This allows hardware drivers to
configure their devices to expect VXLAN traffic on the specified incoming port.
On receive, the hardware does RSS and checksum verification on the inner frame.
if_vxlan now does a direct netisr dispatch to take full advantage of RSS. It is
not very clear why it didn't do this already.
Future work:
Rx: it should be possible to avoid the first trip up the protocol stack to get
the frame to if_vxlan just so it can decapsulate and requeue for a second trip
up the stack. The hardware NIC driver could directly call an if_vxlan receive
routine for VXLAN traffic instead.
Rx: LRO. depends on what happens with the previous item. There will have to to
be a mechanism to indicate that it's time for if_vxlan to flush its LRO state.
Reviewed by: kib@
Relnotes: Yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25873
This will be used by some upcoming changes to if_vxlan(4). RFC 7348 (VXLAN)
says that the UDP checksum "SHOULD be transmitted as zero. When a packet is
received with a UDP checksum of zero, it MUST be accepted for decapsulation."
But the original IPv6 RFCs did not allow zero UDP checksum. RFC 6935 attempts
to resolve this.
Reviewed by: kib@
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25873
These are similar to the existing VLAN capabilities.
Reviewed by: kib@
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25873
These are being added to support VXLAN but will work for GENEVE as well.
ENCAP_RSVD1 will likely become ENCAP_GENEVE in the future.
The size of struct mbuf does not change and that means this change can be MFC'd.
If size wasn't a constraint a cleaner way may have been to add inner_csum_flags
and inner_csum_data to go with csum_flags and csum_data.
Reviewed by: kib@
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25873
Change the zone setup:
- Allow slabs to be returned to the OS
- Set the number of slots to the max devctl will queue before discarding
- Reserve 2% of the max (capped at 100) for low memory allocations
- Disable per-cpu caching since we don't need it and we avoid some pathologies
Change the alloation strategiy a bit:
- If a normal allocation fails, try to get the reserve
- If a reserve allocation fails, re-use the oldest-queued entry for storage
- If there's a weird race/failure and nothing on the queue to steal, return NULL
This addresses two main issues in the old code:
- If devd had died, and we're generating a lot of messages, we have an
unbounded leak. This new scheme avoids the issue that lead to this.
- The MPASS that was 'sure' the allocation couldn't have failed turned out
to be wrong in some rare cases. The new code doesn't make this assumption.
Since we reserve only 2% of the space, we go from about 1MB of
allocation all the time to more like 50kB for the reserve.
Reviewed by: markj@
Differential Revision: https://reviews.freebsd.org/D26448
Since r347532 (merged to stable/12) we only count user-wired pages
towards the system limit. However, we now also treat pages wired by
hypervisors (bhyve and virtualbox) as user-wired, so starting VMs with
large amounts of RAM tends to fail due to the low limit.
The purpose of the limit is to provide a seatbelt, not to impose some
policy on the use of wired memory. Thus, increase the default limit to
allow reasonable VM configurations to work without tuning.
Reviewed by: kib
Discussed with: dougm
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26424
This is followup to r365477.
If pre-formatted device has GPT and a partition covering
last available LBAs and the device is attached using
a bridge reducing amount of LBAs, then it could be not enough
forcing GEOM to use primary GPT. Also, we should make it possible
to recover GPT and this requires either deleting or resizing the partition.
This change enables "gpart delete" and "gpart resize" commands
on corrupted GPT with following "gpart recover".
It still does not allow modifying corrupted GPT without
preliminary setting sysctl kern.geom.part.check_integrity=0
For example:
# gpart show da0
=> 34 3906963389 da0 GPT (1.8T) [CORRUPT]
34 262144 1 ms-reserved (128M)
262178 2014 - free - (1.0M)
264192 3906764943 2 freebsd-swap (1.8T)
# gpart resize -i 2 -s 3900000000 da0
# gpart recover da0
Reported by: Alex Korchmar
MFC after: 3 days
Both before and after job control adjustments.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26416
Orphans affect job control state, we must account for them when
changing pg_jobc.
Instead of p_pptr, use proc_realparent() to get parent relevant for
job control.
Use correct calculation of the parent for exiting process. For jobc
purposes, we must use realparent, but if it is also exiting, we should
fall to reaper, then recursively find non-exiting reaper.
Reported by: trasz
PR: 249257
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26416
Use ddb pager.
Make lines more compact.
Eliminate unneeded casts.
Print more job-control related info when reporting process group.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26416
Split TMPFS_NODE_ACCCESSED bit into dedicated byte that can be updated
atomically without locks or (locked) atomics.
tn_update_getattr() change also contains unrelated bug fix.
Reported by: lwhsu
PR: 249362
Reviewed by: markj (previous version)
Discussed with: mjg
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26451
As with .text, the aim is to ensure that executable sections are
segregated from the rest, to avoid creation of writeable and executable
mappings. Recent versions of LLVM emit a PLT in firmware modules.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26444