MFV:
commit ee36c709c3d5f7040e1bd11f5c75318aa03e789f
Author: Gvozden Neskovic <neskovic@gmail.com>
Date: Sat Aug 27 20:12:53 2016 +0200
perf: 2.75x faster ddt_entry_compare()
First 256bits of ddt_key_t is a block checksum, which are expected
to be close to random data. Hence, on average, comparison only needs to
look at first few bytes of the keys. To reduce number of conditional
jump instructions, the result is computed as: sign(memcmp(k1, k2)).
Sign of an integer 'a' can be obtained as: `(0 < a) - (a < 0)` := {-1, 0, 1} ,
which is computed efficiently. Synthetic performance evaluation of
original and new algorithm over 1G random keys on 2.6GHz Intel(R) Xeon(R)
CPU E5-2660 v3:
old 6.85789 s
new 2.49089 s
perf: 2.8x faster vdev_queue_offset_compare() and vdev_queue_timestamp_compare()
Compute the result directly instead of using conditionals
perf: zfs_range_compare()
Speedup between 1.1x - 2.5x, depending on compiler version and
optimization level.
perf: spa_error_entry_compare()
`bcmp()` is not suitable for comparator use. Use `memcmp()` instead.
perf: 2.8x faster metaslab_compare() and metaslab_rangesize_compare()
perf: 2.8x faster zil_bp_compare()
perf: 2.8x faster mze_compare()
perf: faster dbuf_compare()
perf: faster compares in spa_misc
perf: 2.8x faster layout_hash_compare()
perf: 2.8x faster space_reftree_compare()
perf: libzfs: faster avl tree comparators
perf: guid_compare()
perf: dsl_deadlist_compare()
perf: perm_set_compare()
perf: 2x faster range_tree_seg_compare()
perf: faster unique_compare()
perf: faster vdev_cache _compare()
perf: faster vdev_uberblock_compare()
perf: faster fuid _compare()
perf: faster zfs_znode_hold_compare()
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Richard Elling <richard.elling@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#5033
The canonical form of sync is:
sync L, E (if Category Elemental Memory Barriers implemented)
The L bits (2) denote the type of sync:
0 -- hwsync
1 -- lwsync
2 -- ptesync or hwsync
It's been found that most 32-bit CPUs designed prior to the introduction of
lwsync will ignore the L bits. However, some cores, particularly the e500 core,
will trigger an illegal instruction exception. Adding these variants will make
it easier to see which sync variant is actually being used in case of a trap.
libl is needed for config(8), which is a bootstrap-tool. It is possible to
build a system WITHOUT_TOOLCHAIN to exclude lex and thus, libl. We still
need to support building from this kind of host, though.
While here, group the config(8) dependencies together and add a small
explanation. These can likely both be scoped more clearly, but this will
need some further investigation.
Reported by: rgrimes (not WITHOUT_TOOLCHAIN, but provoked investigation)
MFC after: immediately
in ipf_nat_checkout() and report it in the frb_natv4out and frb_natv4in
dtrace probes.
This is currently being used to diagnose NAT failures in PR/208566. It's
rather handy so this commit makes it available for future diagnosis and
debugging efforts.
PR: 208566
MFC after: 1 week
No functional change.
Note that this change is careful to set the CCB header xflags after
foo_fill_bar() routines, which generally zero existing flags. An earlier
version of this patch mistakenly set the flag before the fill routines.
Submitted by: Scott Ferris <sferris AT isilon.com>, jhibbits@
Reviewed by: bdrewery@, markj@, and non-committer FreeBSD contributor Anton Rang
Sponsored by: Dell EMC Isilon
BPF (eBPF) is an independent instruction set architecture which is
introduced in Linux a few years ago. Originally, eBPF execute
environment was only inside Linux kernel. However, recent years there
are some user space implementation (https://github.com/iovisor/ubpf,
https://doc.dpdk.org/guides/prog_guide/bpf_lib.html) and kernel space
implementation for FreeBSD is going on
(https://github.com/YutaroHayakawa/generic-ebpf).
The BPF target support can be enabled using WITH_LLVM_TARGET_BPF, as it
is not built by default.
Submitted by: Yutaro Hayakawa <yhayakawa3720@gmail.com>
Reviewed by: dim, bdrewery
Differential Revision: https://reviews.freebsd.org/D16033
This should have been done as part of r336019 -- including ${SRCTOP}/sys is
not a good business model for something that's build in legacy/bootstrap
stages.
Beyond that, libnv seems to build quite alright as legacy, part of
buildworld, and standalone without. Axe it.
Reported by: truckman (head building stable/11)
Tested by: Shawn Webb (HardenedBSD)
MFC after: 3 days
Before r329882 the target would be computed after lowmem handlers run
and free pages. On some systems a significant amount of page
reclamation happens this way. However, with r329882 the target is
computed first, which can lead to unnecessary reclamation from the
page cache, and this in turn may result in excessive swapping.
Instead, adjust the target after running lowmem handlers. Don't
invoke the lowmem handlers before the PID controller, though, since
that would hide the true rate of page allocation.
Reviewed by: alc, kib (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16606
BOOT_TAG lived shortly in sys/msgbuf.h, but this wasn't necessarily great
for changing it or removing it. Move it into subr_prf.c and add options for
it to opt_printf.h.
One can specify both the BOOT_TAG and BOOT_TAG_SZ (really, size of the
buffer that holds the BOOT_TAG). We expose it as kern.boot_tag and also add
a loader tunable by the same name that we'll fetch upon initialization of
the msgbuf.
This allows for flexibility and also ensures that there's a consistent way
to figure out the boot tag of the running kernel, rather than relying on
headers to be in-sync.
Prodded super-super-lightly by: imp
own region in the TCAM starting with T6, unlike previous chips where
they were in the same region as normal filters.
These filters "hit" before anything else in the LE's lookup. The exact
order is:
a) High priority filters
b) TOE's active region (TCAM and/or hash)
c) Servers (TOE hw listeners)
d) Normal filters
MFC after: 1 week
Sponsored by: Chelsio Communications
On PowerPC (and possibly other architectures), that doesn't use
EARLY_AP_STARTUP, the config task queue may be used initialized.
This was observed while trying to mount the root fs from NFS, as
reported here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230168.
This patch has 2 main changes:
1- Perform a basic initialization of qgroup_config, similar to
what is done in taskqgroup_adjust, but simpler.
This makes qgroup_config ready to be used during NFS root mount.
2- When EARLY_AP_STARTUP is not used, call inm_init() and
in6m_init() right before SI_SUB_ROOT_CONF, because bootp needs
to send multicast packages to request an IP.
PR: Bug 230168
Reported by: sbruno
Reviewed by: jhibbits, mmacy, sbruno
Approved by: jhibbits
Differential Revision: D16633
nonexistent NAT instance or nonexistent rule.
This allows execute batched `delete` commands and do not fail when
found nonexistent rule.
Obtained from: Yandex LLC
MFC after: 2 weeks
Sponsored by: Yandex LLC
We use ps to collect the information of all processes in textdump. But
it doesn't contain process arguments which however sometimes are very
useful for debugging. The new 'a' modifier adds that capability.
While here, remove 'm' modifier from ddb.4. It was in the manual page
from its very first revision, but I could not find any evidence of the
code ever supporting it.
Submitted by: Terry Hu <thu@panzura.com>
Reviewed by: kib
MFC after: 1 week
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D16603
It was possible in some rare circumstances for ngets to behave terribly with
bhyveload and some form of redirecting user input over a pipe.
PR: 198706
Submitted by: Ivan Krivonos <int0dster@gmail.com>
MFC after: 1 week
and make sure it would terminate with nul with strlcpy().
Reviewed by: imp (earlier revision)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D16595
gptboot was broken when r316078 added the LOADER_GELI_SUPPORT #ifdef to
not pass geliargs via __exec. KARGS_FLAGS_EXTARG must not be used if we're
not going to pass an additional argument to __exec.
PR: 228151
Submitted by: guyyur@gmail.com
MFC after: 1 week
From the "newly licensed to drive" PR department, add a BOOT_TAG marker (by
default, --<<BOOT>>--, to the beginning of each boot's dmesg. This makes it
easier to do textproc magic to locate the start of each boot and, of
particular interest to some, the dmesg of the current boot.
The PR has a dmesg(8) component as well that I've opted not to include for
the moment- it was the more contentious part of this PR.
bde@ also made the statement that this boot tag should be written with an
ordinary printf, which I've- for the moment- declined to change about this
patch to keep it more transparent to observer of the boot process.
PR: 43434
Submitted by: dak <aurelien.nephtali@wanadoo.fr> (basically rewritten)
MFC after: maybe never
filter_create_ext() is documented to take a NULL terminated set of
arguments. 0 is promoted to an int so this would fail on 64-bit
systems if the value was not passed in a register. On all currently
supported 64-bit architectures it is.
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
COLORTERM is the de facto standard, while CLICOLOR is generally specific to
FreeBSD and ls(1).
PR: 230101
Submitted by: D Green <dfrg@xsmail.com> (with manpage additions by myself)
Reviewed by: cem ("LGTM" in PR; pre-manpage changes)
MFC after: 1 week
This reports the current status on a single line every second, mirroring
similar functionality in GNU dd, and carefully interacts with SIGINFO.
PR: 229615
Submitted by: Thomas Hurst <tom@hur.st> (modified for style(9) nits by me)
MFC after: 1 week
Using a space as the magic character would result in problems if the command
started with a number:
- For a 'valid' number n, n < size of argv, it would erroneously get
replaced with that argument; e.g. `apply -a ' ' -d 1rm x => `execxrm x`
- For an 'invalid' number n, n >= size of argv, it would segfault.
e.g. `apply -a ' ' 2to3 test.py` would try to access argv[2]
This problem occurred because apply(1) would prepend "exec " to the command
string before doing the actual magic number replacements, so it would come
across "exec 2to3 1" and assume that the " 2" is also a magic number to be
replaced.
Re-work this to instead just append "exec " to the command sbuf and
workaround the ugliness. This also simplifies stuff in the process.
PR: 226948
Submitted by: Tobias Stoeckmann <tobias@stoeckmann.org>
MFC after: 1 week
If OPAL_RTC_READ is busy and does not return the information on the first run,
as returning OPAL_BUSY_EVENT, the system will crash since ymd and hmsm variable
will contain junk values.
This is happening because we were not calling OPAL_RTC_READ again after
OPAL_POLL_EVENTS' return, which would finally replace the old/junk hmsm and ymd
values.
The code was also mixing OPAL_RTC_READ and OPAL_POLL_EVENTS return values.
This patch fix this logic and guarantee that we call OPAL_RTC_READ after
OPAL_POLL_EVENTS return, and guarantee the code will only proceed if
OPAL_RTC_READ returns OPAL_SUCCESS.
Reviewed by: jhibbits
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D16617
argv has been incremented during argument handling, so elements of the
array are no longer valid. Change the err() arguments so only the
first string pointer in argv is used.
Found during code inspection.
After a re-read of the appropriate section of RFC5661, I decided that a
few things should be changed related to LayoutRecall callback handling.
Here are the things fixed by this patch.
- For two of the three cases that LayoutRecall is done, I now think
setting the clora_changed argument false is correct.
- All errors other than NFSERR_DELAY returned by LayoutRecall appear
permanent, so don't retry for any of them. (NFSERR_DELAY is retried by
newnfs_request(), so it is not affected by this patch.)
- Instead of waiting "forever" (actually until the process is SIGTERM'd)
for Layouts to be returned during a mirror copy, fail and return
ENXIO after about 1minute.
Waiting for a <ctrl>C made sense when pnfsdscopymr() was done by itself,
but did not make sense when done via find(1).
This patch only affects the pNFS server.