It can useful for code outside the VM system to look up the NUMA domain
of a page backing a virtual or physical address, specifically when
creating NUMA-aware data structures. We have _vm_phys_domain() for
this, but the leading underscore implies that it's an internal function,
and vm_phys.h has dependencies on a number of other headers.
Rename vm_phys_domain() to vm_page_domain(), and _vm_phys_domain() to
vm_phys_domain(). Make the latter an inline function.
Add _vm_phys.h and define struct vm_phys_seg there so that it's easier
to use in other headers. Include it from vm_page.h so that
vm_page_domain() can be defined there.
Include machine/vmparam.h from _vm_phys.h since it depends directly on
some constants defined there.
Reviewed by: alc
Reviewed by: dougm, kib (earlier versions)
Differential Revision: https://reviews.freebsd.org/D27207
The arm configs that required it have been removed from the tree.
Removing this option makes the callout code easier to read and
discourages developers from adding new configs without eventtimer
drivers.
Reviewed by: ian, imp, mav
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27270
kernel during dump time.
A real life scenario is that cores are compressed to reduce
size of dumpon partition, but we either don't care about space
in the /var/crash or we have a filesystem level compression of
/var/crash. And we want cores to be uncompressed in /var/crash
because we'd like to instantily read them with kgdb. In this
case we want kernel to write cores compressed, but savecore(1)
write them uncompressed.
Reviewed by: markj, gallatin
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D27245
unsigned char promotes to int, which can overflow when shifted left by
24 bits or more. this has been reported multiple times but then
forgotten. it's expected to be benign UB, but can trap when built with
explicit overflow catching (ubsan or similar). fix it now.
note that promotion to uint32_t is safe and portable even outside of
the assumptions usually made in musl, since either uint32_t has rank
at least unsigned int, so that no further default promotions happen,
or int is wide enough that the shift can't overflow. this is a
desirable property to have in case someone wants to reuse the code
elsewhere.
musl commit: 593caa456309714402ca4cb77c3770f4c24da9da
Obtained from: musl
first, the condition (mem && k < p) is redundant, because mem being
nonzero implies the needle is periodic with period exactly p, in which
case any byte that appears in the needle must appear in the last p
bytes of the needle, bounding the shift (k) by p.
second, the whole point of replacing the shift k by mem (=l-p) is to
prevent shifting by less than mem when discarding the memory on shift,
in which case linear time could not be guaranteed. but as written, the
check also replaced shifts greater than mem by mem, reducing the
benefit of the shift. there is no possible benefit to this reduction of
the shift; since mem is being cleared, the full shift is valid and
more optimal. so only replace the shift by mem when it would be less
than mem.
musl commits:
8f5a820d147da36bcdbddd201b35d293699dacd8
122d67f846cb0be2c9e1c3880db9eb9545bbe38c
Obtained from: musl
MFC after: 2 weeks
We have adopted these and don't consider them 'contrib' code, so bring
them closer to style(9). This is a followon to r315467 and r351700.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
The suser_enable sysctl allows to remove a privileged rights from uid 0.
This change introduce per jail setting which allow to make root a
normal user.
Reviewed by: jamie
Previous version reviewed by: kevans, emaste, markj, me_igalic.co
Discussed with: pjd
Differential Revision: https://reviews.freebsd.org/D27128
local software base directory, as committed in SVN rev. 367813.
The pkg and mailwrapper programs used the LOCALBASE environment variable
for this purpose and this functionality is preserved by getlocalbase().
After this change, the value of the user.localbase sysctl variable is used
if present (and not overridden in the environment).
The nvmecontrol program gains support of a dynamic path to its plugin
directory with this update.
Differential Revision: https://reviews.freebsd.org/D27237
"hardware threads", use cpuset_getaffinity(2) on FreeBSD, so it will
honor processor sets configured by the cpuset(1) command.
This should make it possible to avoid e.g. lld creating a huge number of
threads on a machine with many cores, even for linking simple programs.
This will also be submitted upstream.
Submitted by: mjg
MFC after: 1 week
The size on LP64 is 80 bytes, which is just more than a cacheline, does
not lend itself to easy shrinking and rounding up to 2 would be a huge
waste given NOFREE marker.
The least which can be done is to reorder it so that most commonly used
fields are less likely to span different lines, and consequently suffer
less false sharing.
With the change at hand most commonly used fields land in the same line
about 3/4 of the time, as opposed to 2/4.
This function returns the path to the local software base directory, by
default "/usr/local" (or the value of _PATH_LOCALBASE in include/paths.h
when building the world).
The value returned can be overridden by 2 methods:
- the LOCALBASE environment variable (ignored by SUID programs)
- else a non-default user.localbase sysctl value
Reviewed by: hps (earlier version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D27236
Some platforms have additional architecture-specific floating-point flags.
Msun test cases lrint and test_fegsetenv (fenv) expects only standard flags,
so make sure to mask them appropriately.
This makes test pass on PowerPC64.
Reviewed by: jhibbits, ngie
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D27202
/etc/os-release is now a symbolic link to a generated file. Make
mergemaster cope with symbolic links generically. I'm no longer
a big mergemaster user, so this has only been lightly tested
by me, though Kimura-san has ran it through its paces.
Submitted by: Yasushiro KIMURA-san
PR: 242212
MFC After: 2 weeks
make it create the temporary file in the same directory as the source
file by default, instead of always using $TMPDIR or /tmp. If creating
that file fails because the directory is not writable, also fallback to
$TMPDIR or /tmp.
This has also been submitted upstream as:
https://sourceforge.net/p/elftoolchain/tickets/597/
Reported by: cem
PR: 250872
MFC after: 2 weeks
- Mask out recently added VV_* bits to avoid printing them twice.
- Keep VI_LOCKed on the same line as the rest of the flags.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27261
Some of the PCI ID were described as ENA with LLQ support - it's not
fully accurate and because of that, their names were changed.
Instead of LLQ, use RSERV0 for the description of those devices.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27119
The new HAL allows the driver to read extra ENI stats. Exact meaning of
each of them can be found in base/ena_defs/ena_admin_defs.h file and
structure ena_admin_eni_stats.
Those stats are being updated inside of the timer service, which is
executed every second.
ENI metrics are turned off by default. They can be enabled, using the
sysctl node: dev.ena.X.eni_metrics.update_delay
0 value in this node means that the update is turned off. Other values
determine how many seconds must pass, before ENI metrics will be
updated.
They can be acquired, using sysctl:
sysctl dev.ena.X.eni_metrics
Where X stands for the interface number.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27118
Refering to guide: https://wiki.freebsd.org/SPDX the SPDX tag should not
replace the standard license text, however it should be added over the
standard license text to make the automation easier.
Because of that, the old license was kept, but the SPDX tag was added
on top of every ENA driver file.
Submited by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27117
For the first descriptor in a chain the data may start at an offset.
It is optional feature of some devices, so the driver must ack that
it supports it.
The data pointer of the mbuf is simply shifted by the given value.
Submitted by: Maciej Bielski <mba@semihalf.com>
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27116
* Use the new API of ena_trace_*
* Fix typo syndrom --> syndrome
* Remove validation of the Rx req ID (already performed in the ena-com)
* Remove usage of deprecated ENA_ASSERT macro
Submitted by: Ido Segev <idose@amazon.com>
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27115
When links come and go, lacp goes into a "suppress distributing" mode
where it drops traffic for 3 seconds. When in this mode, lagg/lacp
historiclally drops traffic with ENETDOWN. That return value causes TCP
to close any connection where it gets that value back from the lower
parts of the stack. This means that any TCP connection with active
traffic during a 3-second windown when an LACP link comes or goes
would get closed.
TCP treats return values of ENOBUFS as transient errors, and re-schedules
transmission later. So rather than returning ENETDOWN, lets
return ENOBUFS instead. This allows TCP connections to be preserved.
I've tested this by repeatedly bouncing links on a Netlfix CDN server
under a moderate (20Gb/s) load and overved ENOBUFS reported back to
the TCP stack (as reported by a RACK TCP sysctl).
Reviewed by: jhb, jtl, rrs
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27188
Add support for the ENI metrics, bug fix for destroying wait event and
also other minor bug fixes, improvements, etc.
Submitted by: Ido Segev <idose@amazon.com>
Obtained from: Amazon, Inc.
The latest generation hardware requires IO CQ (completion queue)
descriptors memory to be aligned to a 4K. It needs that feature for
the best performance.
Allocating unaligned descriptors will have a big performance impact as
the packet processing in a HW won't be optimized properly. For that
purpose adjust ena_dma_alloc() to support it.
It's a critical fix, especially for the arm64 EC2 instances.
Submitted by: Ido Segev <idose@amazon.com>
Obtained from: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27114
The latest generation hardware requires IO CQ (completion queue)
descriptors memory to be aligned to a 4K. It needs that feature for
the best performance.
Allocating unaligned descriptors will have a big performance impact as
the packet processing in a HW won't be optimized properly.
It's a critical fix, especially for the arm64 EC2 instances.
FreeBSD's NFS exporter has long exported some unused statistics fields.
Revision r366992 removed them from nfsstat. This revision renames those
fields in the kernel's exported structures to make it clear to other
consumers that they are unused.
Reported by: emaste
Reviewed by: emaste
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D27258
Ecmd memory is not directly related to the request queue, only referenced
from it sometimes in target mode. Separate allocation should be easier
in case of fragmented memory and can be skipped when target is not built.
MFC after: 1 month
A copy-pasto left us copying in 24-bytes at the address of the rb pointer
instead of the intended target.
Reported by: sigsys@gmail.com
Sighing: kevans
The two flags are distinct and it is impossible to correctly handle clone(2)
without the assistance of fork1(). This change depends on the pwddesc split
introduced in r367777.
I've added a fork_req flag, FR2_SHARE_PATHS, which indicates that p_pd
should be treated the opposite way p_fd is (based on RFFDG flag). This is a
little ugly, but the benefit is that existing RFFDG API is preserved.
Holding FR2_SHARE_PATHS disabled, RFFDG indicates both p_fd and p_pd are
copied, while !RFFDG indicates both should be cloned.
In Chrome, clone(2) is used with CLONE_FS, without CLONE_FILES, and expects
independent fd tables.
The previous conflation of CLONE_FS and CLONE_FILES was introduced in
r163371 (2006).
Discussed with: markj, trasz (earlier version)
Differential Revision: https://reviews.freebsd.org/D27016
No functional change intended.
Tracking these structures separately for each proc enables future work to
correctly emulate clone(2) in linux(4).
__FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof.
Reviewed by: kib
Discussed with: markj, mjg
Differential Revision: https://reviews.freebsd.org/D27037
As this ABI is still fresh (r367287), let's correct some mistakes now:
- Version the structure to allow for future changes
- Include sender's pid in control message structure
- Use a distinct control message type from the cmsgcred / sockcred mess
Discussed with: kib, markj, trasz
Differential Revision: https://reviews.freebsd.org/D27084
This is used by some Linux programs using filehandles (r367773) to locate
the mountpoint for a given fsid.
Differential Revision: https://reviews.freebsd.org/D27136