It's not supposed to be legal for two jails to contain the same IP address,
unless both jails contain only that one address. This is the behavior
documented in jail(8), and is there to prevent confusion when multiple
jails are listening on IADDR_ANY.
VIMAGE jails (now the default for GENERIC kernels) test this correctly,
but non-VIMAGE jails have been performing an incomplete test when nested
jails are used.
Approved by: re@ (kib@)
MFC after: 5 days
After r273201 it is supported "/{udp,tcp,proto}" suffix into
$firewall_myservices, and in the rc.conf the information is outdated.
Reviewed by: bcr, rgrimes
Approved by: re (gjb), doc (bcr), src (rgrimes)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D17338
When using a vlan with igb and the vlanhwcsum option, any mbufs which
already had the TCP, UDP, or SCTP checksum calculated and therefore don't
have the CSUM_[IP|IP6]_[TCP|UDP|SCTP] bits set in the csum_flags field would
have the L4 checksum corrupted by the hardware.
This was caused by the driver setting E1000_TXD_POPTS_TXSM any time a
checksum bit was set OR a vlan tag was present.
The patched driver only sets E1000_TXD_POPTS_TXSM when an offload is
requested.
PR: 231416
Reported by: pi
Approved by: re (gjb)
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D17404
See r339205 for details.
An unused ERMS support is retained in the macro. It will be activated
after ifunc support lands.
Reviewed by: kib
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17405
rep stos has a high startup time even on modern microarchitectures like
Skylake. Intel optimization manuals discuss how for small sizes it is
beneficial to go for streaming stores. Since those cannot be used without
extra penalty in the kernel I investigated performance impact of just
regular movs.
The patch below implements a very simple scheme: a 32-byte loop followed
by filling in the remainder of at most 31 bytes. It has a 256 breaking
point on which it falls back to rep stos. It provides a significant win
over the current primitive on several machines I tested (both Intel and
AMD). A 64-byte loop did not provide any benefit even for multiple of 64
sizes.
See the review for benchmark data.
Reviewed by: kib
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17398
This was mostly a cosmetic issue. autoboot_delay=-1 is documented to bypass
the loader menu and immediately execute the boot command, but lualoader
would draw the menu and immediately execute the boot command. No interaction
was possible with the menu.
The fix lifts autoboot_delay processing out of menu.autoboot, which now
takes a delay and does nothing if no delay is specified. This lines up with
my expectations of menu.autoboot's usage from a third party, which may
want more control over the process than the default behavior.
PR: 231610
Approved by: re (gjb)
When getting the number of bytes to checksum make sure to convert the UDP
length to host byte order when the entire header is not in the first mbuf.
Reviewed by: jtl, tuexen, ae
Approved by: re (gjb), jtl (mentor)
Differential Revision: https://reviews.freebsd.org/D17357
Reported by: Jose Luis Duran
Reviewed by: bcr
Approved by: re (gjb), krion (mentor, implicit), mat (mentor, implicit)
Differential Revision: https://reviews.freebsd.org/D17409
Refactor sample ring buffer ring handling to make it more robust to
long running callchain collection handling
r338112 introduced a (now fixed) regression that exposed a number of race
conditions within the management of the sample buffers. This
simplifies the handling and moves the decision to overwrite a
callchain sample that has taken too long out of the NMI in to the
hardlock handler. With this change the problem no longer shows up as a
ring corruption but as the code spending all of its time in callchain
collection.
- Makes the producer / consumer index incrementing monotonic, making it
easier (for me at least) to reason about.
- Moves the decision to overwrite a sample from NMI context to interrupt
context where we can enforce serialization.
- Puts a time limit on waiting to collect a user callchain - putting a
bound on head-of-line blocking causing samples to be dropped
- Removes the flush routine which was previously needed to purge
dangling references to the pmc from the sample buffers but now is only
a source of a race condition on unload.
Previously one could lock up or crash HEAD by running:
pmcstat -S inst_retired.any_p -T and then hitting ^C
After this change it is no longer possible.
PR: 231793
Reviewed by: markj@
Approved by: re (gjb@)
Differential Revision: https://reviews.freebsd.org/D17011
Change swap_reserve and swap_total to be in units of pages so that
swap reservations can be done using only atomics instead of using a single
global mutex for swap_reserve and a single mutex for all processes running
under the same uid for uid accounting.
Results in mmap speed up and a 70% increase in brk calls / second.
Reviewed by: alc@, markj@, kib@
Approved by: re (delphij@)
Differential Revision: https://reviews.freebsd.org/D16273
With the new route cache feature udp_notify() will modify the inp when it
needs to invalidate the route cache. Ensure that we hold a write lock on
the inp before calling the function to ensure that multiple threads don't
race while trying to invalidate the cache (which previously lead to a page
fault).
Differential Revision: https://reviews.freebsd.org/D17246
Reviewed by: sbruno, bz, karels
Sponsored by: Dell EMC Isilon
Approved by: re (gjb)
NL_ARGMAX is the maximum number of positional arguments supported by
printf(3). Prior to r308145 it was declared as 99 and not enforced.
r308145 added enforcement and increased the value to 64k.
Unfortunately, development versions of PostgreSQL used the system
definition to allocate and zero an NL_ARGMAX * 4 sized array on the
stack of its snprintf implementation with measurable performance
impacts. This has been fixed in new PostgreSQL versions, but it is
possible that other programs suffer from this problem.
A value of 4096 puts us on par with Linux and is certainly large enough
for any reasonable program.
Reviewed by: mjg
Reported by: mjg
Approved by: re (gjb)
Differential revision: https://reviews.freebsd.org/D17387
Differential revision: https://reviews.freebsd.org/D8286
This change is a no-op in terms of semantics, but has a side effect
of removing a perfectly useless nop sled for CPUs with ERMS.
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
This makes it easier to grep the source tree for these notes, and
ensures that they will remain in sync.
Reviewed by: kib
Approved by: re (gjb)
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17408
- Extend the bsdinstall(8) man page with ZFS installation scripting
details. [1]
- Extend the bsdinstall(8) man page with the description of all the ZFS
variables involved in a scripted installation of ZFS-based systems. [1]
- Extend the SCRIPTING section with an example for a ZFS-based scripted
installation. [1]
- Create a new section explaining how ZFS datasets must be written into
a variable to get them set on the final system. [1]
While here:
- Add Roberto to the copyrights for recognition as changes to the manual
page are huge.
- Use "Dq" for default values.
- Use sysrc(8) instead of echo in examples.
Submitted by: Roberto Fernandez Cueto <roberfern@gmail.com> [1]
Reviewed by: dteske
Approved by: re (gjb), krion (mentor, implicit), mat (mentor, implicit)
Differential Revision: https://reviews.freebsd.org/D14169
move all elements from the adist_send and adist_recv lists back onto the
adist_free list, but we don't wake consumers waitings for the adist_free list
to become non-empty. This can lead to the sender process stopping audit trail
files distribution and waiting forever.
Fix the problem by adding the missing wakeup.
While here slow down spinning on CPU in case of a short race in
sender_disconnect() and add an explaination when it can occur.
PR: 201953
Reported by: peter
Approved by: re (kib)
file name and opening it. This race was not properly handled, because we were
copying new name before checking for openat(2) error and when we were trying
again we were starting with the next trail file. This could result in skipping
distribution of such a trail file.
Fix this problem by checking for ENOENT first (only for .not_terminated files)
and then updating (or not) tr_filename before restarting the search.
PR: 200139
Reported by: peter
Approved by: re (kib)
target.
The doc/share/mk/doc.commands.mk sets SVN to /usr/local/bin/svn
by default, which is not necessarily installed by the documentation
project textproc/docproj port.
Ensure SVN can be evaluated properly to include the hardware pages
by iterating through /usr/local/bin and /usr/bin and looking for
both svn and svnlite binaries, and pass the SVN variable explicitly
through env(1) in the reldoc target to avoid failures if it does not
exist.
Approved by: re (rgrimes)
Sponsored by: The FreeBSD Foundation
ioctl(2) commands only have meaning in the context of a file descriptor
so translating them in the syscall layer is incorrect.
The new handler users an accessor to retrieve/construct a pointer from
the last member of the passed structure and relies on type punning to
access the other member which requires no translation.
Reviewed by: kib
Approved by: re (rgrimes, gjb)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Review: https://reviews.freebsd.org/D17388
shutdown() to wakeup another thread blocked on a stream listen socket.
This code is failing, while it used to work on FreeBSD 10 and still
works on Linux.
It seems reasonable to add another exception to support something users are
actually doing, which used to work on FreeBSD 10, and still works on Linux.
And, it seems like it should be acceptable to POSIX, as we still return
ENOTCONN.
This patch is different to what had been committed to stable/11, since
code around listening sockets is different. Patch in D15019 is written
by jtl@, slightly modified by me.
PR: 227259
Obtained from: jtl
Approved by: re (kib)
Differential Revision: D15019
This caused microcode to be updated only on the BSP if hyperthreading
was disabled, typically resulting in a hang or reset.
Approved by: re (kib)
Sponsored by: The FreeBSD Foundation
ioctl(2) commands only have meaning in the context of a file descriptor
so translating them in the syscall layer is incorrect.
The new handler users an accessor to retrieve/construct a pointer from
the last member of the passed structure and relies on type punning to
access the other members which require no translation.
Reviewed by: kib (prior version), jhb
Approved by: re (rgrimes)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Review: https://reviews.freebsd.org/D17378
using an application trying to use a v4mapped destination address on a
kernel without INET support or on a v6only socket.
Catch this case and prevent the packet from going anywhere;
else, without the KASSERT() armed, a v4mapped destination
address might go out on the wire or other undefined behaviour
might happen, while with the KASSERT() we panic.
PR: 231728
Reported by: Jeremy Faulkner (gldisater gmail.com)
Approved by: re (kib)
Provide an example of specifying a common vendor value as the documentation
is not clear enough at the moment.
While here, add 'D:#' to the previous example to eat the remaining
description string.
Also, pet mandoc a bit.
Submitted by: Yuri Pankov <yuripv@yuripv.net>
Reviewed by: cem, imp
Approved by: re (kib), krion (mentor, implicit), mat (mentor, implicit)
Differential Revision: https://reviews.freebsd.org/D17321
system-call entry and whenever audit arguments or return values are
captured:
1. Expose a single global, audit_syscalls_enabled, which controls
whether the audit framework is entered, rather than exposing
components of the policy -- e.g., if the trail is enabled,
suspended, etc.
2. Introduce a new function audit_syscalls_enabled_update(), which is
called to update audit_syscalls_enabled whenever an aspect of the
policy changes, so that the value can be updated.
3. Remove a check of trail enablement/suspension from audit_new() --
at the point where this function has been entered, we believe that
system-call auditing is already in force, or we wouldn't get here,
so simply proceed to more expensive policy checks.
4. Use an audit-provided global, audit_dtrace_enabled, rather than a
dtaudit-provided global, to provide policy indicating whether
dtaudit would like system calls to be audited.
5. Do some minor cosmetic renaming to clarify what various variables
are for.
These changes collectively arrange it so that traditional audit
(trail, pipes) or the DTrace audit provider can enable system-call
probes without the other configured. Otherwise, dtaudit cannot
capture system-call data without auditd(8) started.
Reviewed by: gnn
Sponsored by: DARPA, AFRL
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D17348
libelf maintains two views of endianness: e_byteorder, and
e_ident[EI_DATA] in the ELF header itself. e_byteorder is not always
kept in sync, so use the ELF header endianness to test for mips64el.
PR: 231790
Bisected by: sbruno
Reviewed by: jhb
Approved by: re (kib)
MFC with: r338478
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17380
Due to markup issues, the DESCRIPTION OF MEMORY section is rather
unreadable; rework it a bit, using subsections for different lines of the
top output, and move it closer to description.
While here, pet manlint ordering other sections as expected.
Submitted by: Yuri Pankov <yuripv@yuripv.net>
Reviewed by: eadler
Approved by: re (gjb), krion (mentor)
Differential Revision: https://reviews.freebsd.org/D17369
This is a depessimization, see r334537 for an explanation. Routines
remain significantly slower than they have to be.
bzero was removed from the kernel but remains in libc. Macroify to
accommodate differences to memset (no return value, always setting to 0).
The bzero.S file is left in place due to libc build magic which pulls in
a C variant if a matching .S file is missing.
Reviewed by: kib
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D17355
In the probe case for SCSI SMR Host Aware or Most Managed drives, be sure
to free allocated memory.
sys/cam/scsi/scsi_da.c:
In dadone_probezone(), free the data pointer before returning.
MFC after: 3 days
Sponsored by: Spectra Logic
Approved by: re (kib)
Otherwise (iter % ds->ds_cnt) is not guaranteed to lie in the range
[0, MAXMEMDOM).
Reported by: pho
Reviewed by: kib
Approved by: re (rgrimes)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17374
Tested with ifunc resolvers in the kernel and module with calls from
kernel to kernel, module to kernel, and module to module.
Reviewed by: kib (previous version)
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D17370
Belatedly add a comment to the amd64 pmap explaining why we initialize
the kernel pmap's resident page count.
Reviewed by: alc, kib
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17377
Fix the issue with subtracting the TLS_TCB_SIZE too when we are trying to get
the 'where' in the R_PPC_TPREL32 case. At allocation time we added an offset
and the TLS_TCB_SIZE. This has to be subtracted as well.
Now all the issues reported are fixed. Tests were done on G4 and G5 PowerMac's.
Additionally I ran the tls tests from the gcc test suite and made sure the
results are as good as pre 338486.
Thanks to tuexen for reporting the malfunction and for patient testing.
Also testing thanks goes to jhibbits.
Reported by: tuexen
Discussed with: jhibbits, nwhitehorn
Approved by: re (gjb)
Pointyhat to: andreast
GCC 8.1 failed to build LLVM's libc++ when -Wshadow is set,
so lower down WARNS flag to 3.
This is similar to dtc(1) which uses libc++ and sets WARNS to 3.
Approved by: re (gjb)
Sponsored by: DARPA, AFRL
This allows older BEs to be destroyed as they become replaced by a BE
created from them: e.g.
bectl create -e brokenworld fixedworld
bectl activate fixedworld
bectl destroy brokenworld
Submitted by: Shawn Webb
Approved by: re (gjb)
Obtained from: HardenedBSD (5948c0581e)
Such data may later be unmapped. This occurs, for example, when a
loader-provided microcode update file is discarded.
Reviewed by: alc, kib
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17340
The initial raise in r336519 wasn't enough for using big resolution
(1920 x 1200 for example). Raise it again.
Reported by: bob prohaska <fbsd@www.zefox.net>
Tested by: bob prohaska <fbsd@www.zefox.net>
Approved by: re (gjb@)
The AMD Threadripper 2990WX is basically a slightly crippled Epyc.
Rather than having 4 memory controllers, one per NUMA domain, it has
only 2 memory controllers enabled. This means that only 2 of the
4 NUMA domains can be populated with physical memory, and the
others are empty.
Add support to FreeBSD for empty NUMA domains by:
- creating empty memory domains when parsing the SRAT table,
rather than failing to parse the table
- not running the pageout deamon threads in empty domains
- adding defensive code to UMA to avoid allocating from empty domains
- adding defensive code to cpuset to avoid binding to an empty domain
Thanks to Jeff for suggesting this strategy.
Reviewed by: alc, markj
Approved by: re (gjb@)
Differential Revision: https://reviews.freebsd.org/D1683