Effectively all i386 kernels now have two pmaps compiled in: one
managing PAE pagetables, and another non-PAE. The implementation is
selected at cold time depending on the CPU features. The vm_paddr_t is
always 64bit now. As result, nx bit can be used on all capable CPUs.
Option PAE only affects the bus_addr_t: it is still 32bit for non-PAE
configs, for drivers compatibility. Kernel layout, esp. max kernel
address, low memory PDEs and max user address (same as trampoline
start) are now same for PAE and for non-PAE regardless of the type of
page tables used.
Non-PAE kernel (when using PAE pagetables) can handle physical memory
up to 24G now, larger memory requires re-tuning the KVA consumers and
instead the code caps the maximum at 24G. Unfortunately, a lot of
drivers do not use busdma(9) properly so by default even 4G barrier is
not easy. There are two tunables added: hw.above4g_allow and
hw.above24g_allow, the first one is kept enabled for now to evaluate
the status on HEAD, second is only for dev use.
i386 now creates three freelists if there is any memory above 4G, to
allow proper bounce pages allocation. Also, VM_KMEM_SIZE_SCALE changed
from 3 to 1.
The PAE_TABLES kernel config option is retired.
In collaboarion with: pho
Discussed with: emaste
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D18894
The need to use libc malloc(3) from some places in libthr always
caused issues. For instance, per-thread key allocation was switched to
use plain mmap(2) to get storage, because some third party mallocs
used keys for implementation of calloc(3).
Even more important, libthr calls calloc(3) during initialization of
pthread mutexes, and jemalloc uses pthread mutexes. Jemalloc provides
some way to both postpone the initialization, and to make
initialization to use specialized allocator, but this is very fragile
and often breaks. See the referenced PR for another example.
Add the small malloc implementation used by rtld, to libthr. Use it in
thr_spec.c and for mutexes initialization. This avoids the issues with
mutual dependencies between malloc and libthr in principle. The
drawback is that some more allocations are not interceptable for
alternate malloc implementations. There should be not too much memory
use from this allocator, and the alternative, direct use of mmap(2) is
obviously worse.
PR: 235211
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D18988
This includes the bump for cdevsw d_version. Otherwise, the impact on
the ABI (not KBI) is surprisingly low. The most important affected
interface is devname(3) and ttyname(3) which already correctly handle
long names (and ttyname(3) should not be affected at all).
Still, due to the d_version bump, I argue that the change is not MFC-able.
Requested by: mmacy
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D18932
The strdup() call does not take advantage of the known length of the
source string. Replace by malloc() and memcpy() utilizimng the pre-
calculated string length.
Submitted by: cperciva
Reported by: rgrimes
MFC after: 2 weeks
Casper library should not use exit(3) function because before setting it up
applications may register it. Casper doesn't depend on any registered exit
function, so it safe to change this.
Reported by: jceel
MFC after: 2 weeks
The return value (`err`) should be checked; not the `errno` value.
PR: 235200
Approved by: emaste (mentor)
Reviewed by: asomers, lwhsu
MFC after: 28 days
MFC with: r343362, r343365, r343367-r343368
Differential Revision: https://reviews.freebsd.org/D18969
This is meant to clarify the fact that the system call will not fail
with -1/EFAULT, as one might expect, when reading the sendfile(2)
manpage today.
While here, pet the mandoc linter, when dealing with the section that
describes valid values for `flags`.
PR: 232210
MFC after: 2 weeks
Approved by: emaste (mentor)
Reviewed by: glebius, 0mp
Differential Revision: https://reviews.freebsd.org/D18949
I should have only changed the format qualifier with the `size_t` value,
`length`, not the other [`off_t`] value, `dest_file_size`.
MFC after: 1 month
MFC with: r343362, r343365, r343367
Approved by: emaste (mentor; implicit)
Reported by: gcc 8.x
gcc 8.x is more pedantic than clang 7.x with format strings and the tests
passed `void*` variables while supplying `%s` (which is technically
incorrect).
Make the affected `void*` variables use `char*` storage instead to address
this issue, as the compiler will upcast the values to `char*`.
MFC after: 1 month
MFC with: r343362
Approved by: emaste (mentor; implicit)
Reviewed by: asomers
Differential Revision: https://reviews.freebsd.org/D18934
These testcases exercise a number of functional requirements for sendfile(2).
The testcases use IPv4 and IPv6 domain sockets with TCP, and were confirmed
functional on UFS and ZFS. UDP address family sockets cannot be used per the
sendfile(2) contract, thus using UDP sockets is outside the scope of
testing the syscall in positive cases. As seen in
`:s_negative_udp_socket_test`, UDP is used to test the sendfile(2) contract
to ensure that EINVAL is returned by sendfile(2).
The testcases added explicitly avoid testing out `SF_SYNC` due to the
complexity of verifying that support. However, this is a good next logical
item to verify.
The `hdtr_positive*` testcases work to a certain degree (the header
testcases pass), but the trailer testcases do not work (it is an expected
failure). In particular, the value received by the mock server doesn't match
the expected value, and instead looks something like the following (using
python array notation):
`trailer[:]message[1:]`
instead of:
`message[:]trailer[:]`
This makes me think there's a buffer overrun issue or problem with the
offset somewhere in the sendfile(2) system call, but I need to do some
other testing first to verify that the code is indeed sane, and my
assumptions/code isn't buggy.
The `sbytes_negative` testcases that check `sbytes` being set to an
invalid value resulting in `EFAULT` fails today as the other change
(which checks `copyout(9)`) has not been committed [1]. Thus, it
should remain an expected failure (see bug 232210 for more details
on this item).
Next steps for testing sendfile(2):
1. Fix the header/trailer testcases so that they pass.
2. Setup if_tap interface and test with it, instead of using "localhost", per
@asomers's suggestion.
3. Handle short recv(2)'s in `server_cat(..)`.
4. Add `SF_SYNC` support.
5. Add some more negative tests outside the scope of the functional contract.
MFC after: 1 month
Reviewed by: asomers
Approved by: emaste (mentor)
PR: 232210
Sponsored by: Netflix, Inc
Differential Revision: https://reviews.freebsd.org/D18625
Previously, we directly used libzfs_core's lzc_receive to import to a
temporary snapshot, then cloned the snapshot and setup the properties. This
failed when attempting to import replication streams with questionable
error.
libzfs's zfs_receive is a much better fit here, so we now use it instead
with the destination dataset and let libzfs take care of the dirty details.
be_import is greatly simplified as a result.
Reported by: Marie Helene Kvello-Aune <freebsd@mhka.no>
MFC after: 1 week
An integrity check such as a check-hash or a cross-correlation failed.
The integrity error falls between EINVAL that identifies errors in
parameters to a system call and EIO that identifies errors with the
underlying storage media. EINTEGRITY is typically raised by intermediate
kernel layers such as a filesystem or an in-kernel GEOM subsystem when
they detect inconsistencies. Uses include allowing the mount(8) command
to return a different exit value to automate the running of fsck(8)
during a system boot.
These changes make no use of the new error, they just add it. Later
commits will be made for the use of the new error number and it will
be added to additional manual pages as appropriate.
Reviewed by: gnn, dim, brueffer, imp
Discussed with: kib, cem, emaste, ed, jilles
Differential Revision: https://reviews.freebsd.org/D18765
This is CVS revision 1.31 from NetBSD lib/libedit/chartype.c:
Make sure that argv is NULL terminated since functions like tty_stty rely
on it to be so (Gerry Swinslow)
This broke when the wide-character support was enabled in libedit. The
conversion from multibyte to wide-character did not supply the apparently
expected terminating NULL in the new argv array.
PR: 233343
Submitted by: Yuichiro NAITO
Obtained from: NetBSD
MFC after: 1 week
Based on the description in Linux man page.
Reviewed by: markj, ngie (previous version)
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D18837
two zones sharing a keg may have different limits. Now this is going
to work:
zone = uma_zcreate();
uma_zone_set_max(zone, limit);
zone2 = uma_zsecond_create(zone);
uma_zone_set_max(zone2, limit2);
Kegs no longer have uk_maxpages field, but zones have uz_items. When
set, it may be rounded up to minimum possible CPU bucket cache size.
For small limits bucket cache can also be reconfigured to be smaller.
Counter uz_items is updated whenever items transition from keg to a
bucket cache or directly to a consumer. If zone has uz_maxitems set and
it is reached, then we are going to sleep.
o Since new limits don't play well with multi-keg zones, remove them. The
idea of multi-keg zones was introduced exactly 10 years ago, and never
have had a practical usage. In discussion with Jeff we came to a wild
agreement that if we ever want to reintroduce the idea of a smart allocator
that would be able to choose between two (or more) totally different
backing stores, that choice should be made one level higher than UMA,
e.g. in malloc(9) or in mget(), or whatever and choice should be controlled
by the caller.
o Sleeping code is improved to account number of sleepers and wake them one
by one, to avoid thundering herd problem.
o Flag UMA_ZONE_NOBUCKETCACHE removed, instead uma_zone_set_maxcache()
KPI added. Having no bucket cache basically means setting maxcache to 0.
o Now with many fields added and many removed (no multi-keg zones!) make
sure that struct uma_zone is perfectly aligned.
Reviewed by: markj, jeff
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D17773
Summary:
GCC expects to link in a crtsavres.o on powerpc platforms. On
powerpc64 this is an empty file, but on powerpc and powerpcspe this does contain
some save/restore functions, which may not actually be necessary for newer
modern GCC and clang. This appeases the in-tree gcc, though, and is needed in
order to switch to the BSD CRTRBEGIN.
PR: 233751
Reviewed By: andrew
Differential Revision: https://reviews.freebsd.org/D18826
Highlights:
- Make sure that only TLS sections are sorted into TLS segment.
- Fixed multiple errors in "Section to Segment mapping".
- Man page updates
- ar improvements
- elfcopy: avoid filter_reloc uninitialized variable for rela
- elfcopy: avoid stripping relocations from static binaries
- readelf: avoid printing directory in front of absolute path
- readelf: add NT_FREEBSD_FEATURE_CTL FreeBSD note type
- test improvements
NOTES:
Some of these changes originated in FreeBSD and simply reduce diffs
between contrib and vendor.
ELF Tool Chain ar is not (currently) used in FreeBSD, and there are
improvements in both FreeBSD and ELF Tool Chain ar that are not in
the other.
Sponsored by: The FreeBSD Foundation
This set of changes is geared towards making bectl respect deep boot
environments when they exist and are mounted. The deep BE composition
functionality (`bectl add`) remains disabled for the time being. This set of
changes has no effect for the average user. but allows deep BE users to
upgrade properly with their current setup.
libbe(3): Open the target boot environment and get a zfs handle, then pass
that with the target mountpoint to be_mount_iter; If the BE_MNT_DEEP flag is
set call zfs_iter_filesystems and mount the child datasets.
Similar logic is employed when unmounting the datasets, save for children
are unmounted first.
bectl(8): Change bectl_cmd_jail to pass the BE_MNT_DEEP flag when
calling be_mount as well as call be_unmount when cleaning up after the
jail has exited instead of umount(2) directly.
PR: 234795
Submitted by: Wes Maag <jwmaag_gmail.com> (test additions by kevans)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18796
We could perhaps have a method that does this given a dataset, but it's yet
clear that we'll always want to bypass the altroot when we grab the
mountpoint. For now, we'll refactor things a bit so we grab the altroot
length when libbe is initialized and have a common method that does the
necessary augmentation (replace with / if it's the root, return a pointer to
later in the string if not).
This will be used in some upcoming work to make be_mount work properly for
deep BEs.
MFC after: 1 week
from the local mapping.
Enable the setting by default.
The article behind the change: https://arxiv.org/abs/1901.01161
Reviewed by: markj
Discussed with: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D18764
j is int32_t and thus j<<31 is undefined if j==1.
Hinted by: muusl-lib (git 688d3da0f1730daddbc954bbc2d27cc96ceee04c)
Discussed with: freebsd-numerics (kargl)
Previously, the following sequence of events was feasible under some
circumstance:
bectl create test
bectl activate test
# the test BE dataset gets promoted and set as bootfs
bectl destroy test
I was unable to reproduce the destroy succeeding, but we should be rejecting
this before it even gets to libzfs because it would leave the system in an
inconsistent state. Forcing the user to be explicit as to which environment
should be activated instead is much better.
Reported by: Graham Perrin <grahamperrin@gmail.com>
MFC after: 3 days
lib/csu/tests/dynamiclib requires libh_csu.so be built first. I'm not
sure this is the most correct/best way to address this but it solves
the issue in my testing.
PR: 233734
Sponsored by: The FreeBSD Foundation
As it does for recv*(2), MSG_DONTWAIT indicates that the call should
not block, returning EAGAIN instead. Linux and OpenBSD both implement
this, so the change makes porting easier, especially since we do not
return EINVAL or so when unrecognized flags are specified.
Submitted by: Greg V <greg@unrelenting.technology>
Reviewed by: tuexen
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18728
When presented with an arg string like '-l-', getopt_long will successfully
parse out the 'l' short option, then proceed to match '--' against the first
longopts entry as it later does a strncmp with len=0. This latter bit is
arguably another bug in itself, but presumably not a practical issue as all
callers of parse_long_options are already doing the right thing (except this
one pointed out).
An opt string like '-l-' should be considered malformed and throw a bad
argument rather than behaving as if '--' were passed. It cannot possibly do
what the invoker expects, and it's probably the result of a typo (ls -l- a)
rather than any intent.
Reported by: Tony Overfield <toverfield@yahoo.com>
Reviewed by: imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D18616
Add a short description of the function to the appropriate man page and add
reference to it where it makes sense.
Reviewed by: bcr, markj, 0mp
Approved by: markj
Differential Revision: https://reviews.freebsd.org/D18725
This merge brings in a couple new files, which needed to be attached to the
build; a new dependency on <limits.h>, which must be stubbed; and a name
change in the Context parameter constants, from ZSTD_p_foo to ZSTD_c_foo.
Significantly, it fixes a kernel build error with GCC where floating-point
functions were included in the kernel build, by hiding them under the same
compile-time #ifdef that already covered their invocation. That issue was
introduced to FreeBSD in the 1.3.7 update and tracked upstream here:
https://github.com/facebook/zstd/issues/1386
The full 1.3.8 release notes can be found on Github:
https://github.com/facebook/zstd/releases/tag/v1.3.8
Relnotes: yes
The long double aliases of double functions are only exposed as aliases if
LDBL_MANT_DIG is 53 (same as DBL_MANT_DIG). Without float.h included these
files were not exposing weak aliases as expected, leading to link failures
if programs use the *l functions. This should fix editors/calligra on
targets with 64-bit long double, which uses erfl and erfcl. Found on
powerpc64.
Reviewed by: kargl@
Addition of the new errno values requires adding new elements to
sys_errlist array, which is actually ABI-incompatible, since ELF
records the object size. Expand array in advance to 150 elements so
that we have our users to go over the issue only once, at least until
more than 53 new errors are added.
I did not bumped the symbol version, same as it was not done for
previous increases of the array size. Runtime linker only copies as
much data into binary object on copy relocation as the binary'object
specifies. This is not fixable for binaries which access sys_errlist
directly.
While there, correct comment and calculation of the temporary buffer
size for the message printed for unknown error. The on-stack buffer
is used only for the number and delimiter since r108603.
Requested by: mckusick
Reviewed by: mckusick, yuripv
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D18656
Error messages in gai_strerror(3) vary largely among OSs.
For new software we largely replaced the obsoleted EAI_NONAME and
with EAI_NODATA but we never updated the corresponding message to better
match the intended use. We also have references to ai_flags and ai_family
which are not very descriptive for non-developer end users.
Bring new new error messages based on informational RFC 3493, which has
obsoleted RFC 2553, and make them consistent among the header adn
manpage.
MFC after: 1 month
Differentical Revision: D18630