20737 Commits

Author SHA1 Message Date
jhb
d3e4e51223 Initial support for bhyve save and restore.
Save and restore (also known as suspend and resume) permits a snapshot
to be taken of a guest's state that can later be resumed.  In the
current implementation, bhyve(8) creates a UNIX domain socket that is
used by bhyvectl(8) to send a request to save a snapshot (and
optionally exit after the snapshot has been taken).  A snapshot
currently consists of two files: the first holds a copy of guest RAM,
and the second file holds other guest state such as vCPU register
values and device model state.

To resume a guest, bhyve(8) must be started with a matching pair of
command line arguments to instantiate the same set of device models as
well as a pointer to the saved snapshot.

While the current implementation is useful for several use cases, it
has a few limitations.  The file format for saving the guest state is
tied to the ABI of internal bhyve structures and is not
self-describing (in that it does not communicate the set of device
models present in the system).  In addition, the state saved for some
device models closely matches the internal data structures which might
prove a challenge for compatibility of snapshot files across a range
of bhyve versions.  The file format also does not currently support
versioning of individual chunks of state.  As a result, the current
file format is not a fixed binary format and future revisions to save
and restore will break binary compatibility of snapshot files.  The
goal is to move to a more flexible format that adds versioning,
etc. and at that point to commit to providing a reasonable level of
compatibility.  As a result, the current implementation is not enabled
by default.  It can be enabled via the WITH_BHYVE_SNAPSHOT=yes option
for userland builds, and the kernel option BHYVE_SNAPSHOT.

Submitted by:	Mihai Tiganus, Flavius Anton, Darius Mihai
Submitted by:	Elena Mihailescu, Mihai Carabas, Sergiu Weisz
Relnotes:	yes
Sponsored by:	University Politehnica of Bucharest
Sponsored by:	Matthew Grooms (student scholarships)
Sponsored by:	iXsystems
Differential Revision:	https://reviews.freebsd.org/D19495
2020-05-05 00:02:04 +00:00
jhb
a643b1d159 Remove support for IPsec algorithms deprecated in r348205 and r360202.
Examples of deprecated algorithms in manual pages and sample configs
are updated where relevant.  I removed the one example of combining
ESP and AH (vs using a cipher and auth in ESP) as RFC 8221 says this
combination is NOT RECOMMENDED.

Specifically, this removes support for the following ciphers:
- des-cbc
- 3des-cbc
- blowfish-cbc
- cast128-cbc
- des-deriv
- des-32iv
- camellia-cbc

This also removes support for the following authentication algorithms:
- hmac-md5
- keyed-md5
- keyed-sha1
- hmac-ripemd160

Reviewed by:	cem, gnn (older versions)
Relnotes:	yes
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24342
2020-05-02 00:06:58 +00:00
jhb
4ca3575516 Remove the SYMVER build option.
This option was added as a transition aid when symbol versioning was
first added.  It was enabled by default in 2007 and is supported even
by the old GPLv2 binutils.  Trying to disable it currently fails to
build in libc and at this point it isn't worth fixing the build.

Reported by:	Michael Dexter
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D24637
2020-04-30 22:08:40 +00:00
emaste
89677cbc50 liblua: ensure that "require" will fail in bootstrap flua
We do not want to support bootstrapping lua modules, so ensure that
require will fail by providing a nonexistent path.

Reviewed by:	kevans
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24610
2020-04-29 13:41:32 +00:00
markj
c52a2f979c Document handling of connection-mode sockets by sendto(2).
sendto(2), sendmsg(2) and sendmmsg(2) return ENOTCONN if a destination
address is specified and the socket is not connected and the socket
protocol does not automatically connect ("implied connect").  Document
that.  Also document the fact that the destination address is ignored
for connection-mode sockets if the socket is already connected.

PR:		245817
Submitted by:	Erik Inge Bolsø <knan-bfo@modirum.com>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D24530
2020-04-27 16:12:32 +00:00
markj
3cbb3f5c2b Fix handling of EV_EOF for named pipes.
Contrary to the kevent man page, EV_EOF on a fifo is not cleared by
EV_CLEAR.  Modify the read and write filters to clear EV_EOF when the
fifo's PIPE_EOF flag is clear, and update the man page to document the
new behaviour.

Modify the write filter to return the amount of buffer space available
even if no readers are present.  This matches the behaviour for sockets.

When reading from a pipe, only call pipeselwakeup() if some data was
actually read.  This prevents the continuous re-triggering of an
EVFILT_READ event on EOF when in edge-triggered mode.

PR:		203366, 224615
Submitted by:	Jan Kokemüller <jan.kokemueller@gmail.com>
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D24528
2020-04-27 15:59:19 +00:00
cem
fe223e0b07 libc: partially revert r326576
In r326576 ("use @@@ instead of @@ in __sym_default"), an earlier version of
the phabricator-discussed patch was inadvertently committed.  The commit
message claims that @@@ means that weak is not needed, but that was due to a
misunderstanding of the use of weak symbols in this context by the submitter
in the first draft of the patch; the description text was not updated to
match the discussion.  As discussed in phabricator, weak is needed for
symbol interposing because of the behavior of our rtld, and is widely used
elsewhere in libc.

This partial revert restores the approved version of the patch and permits
symbol interposing for openat.

Reported by:	Raymond Ramsden <rramsden AT isilon.com>
Reviewed by:	dim, emaste, kib (2017)
Discussed with:	kib (2020)
Differential Revision:	https://reviews.freebsd.org/D11653
2020-04-25 14:24:54 +00:00
0mp
c932333080 Fix a typo
Reported by:	pstef
MFC after:	2 days
2020-04-24 22:04:14 +00:00
mav
2698bb5d21 Map family 0x5F (Denverton) to goldmont.
According to the 325462-071US document, they should be the same.

MFC after:	1 week
2020-04-24 16:05:35 +00:00
vangyzen
9cea565ffa Update jemalloc to version 5.2.1
Revert r354606 to restore r354605.

Apply one line from jemalloc commit d01b425e5d1e1 in hash_x86_128()
to fix the build with gcc, which only allows a fallthrough attribute
to appear before a case or default label.

Submitted by:	jasone in r354605
Discussed with:	jasone
Reviewed by:	bdrewery
MFC after:	never, due to gcc 4.2.1
Relnotes:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D24522
2020-04-23 23:57:43 +00:00
kp
9ae2e3fb2d libc: Shortcut if_indextoname() if index == 0
If the index we're trying to convert is 0 we can avoid a potentially
expensive call to getifaddrs(). No interface has an ifindex of zero, so
we can handle this as an error: set the errno to ENXIO and return NULL.
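
The shortcut described above can be sketched as a thin wrapper around the
standard if_indextoname(3). This is only an illustration of the idea, not the
libc change itself; the name sketch_if_indextoname is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <net/if.h>

#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical wrapper: reject index 0 up front instead of paying for
 * an interface enumeration that can never find a match.
 */
char *
sketch_if_indextoname(unsigned int ifindex, char *ifname)
{
	if (ifindex == 0) {
		/* No interface has index 0; report it as an error. */
		errno = ENXIO;
		return (NULL);
	}
	/* Any other index takes the regular lookup path. */
	return (if_indextoname(ifindex, ifname));
}
```

A caller passes a buffer of at least IF_NAMESIZE bytes and checks for a NULL
return, exactly as with the unwrapped function.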

Submitted by:	Nick Rogers
Reviewed by:	lutz at donnerhacke.de
MFC after:	2 weeks
Sponsored by:	RG Nets
Differential Revision:	https://reviews.freebsd.org/D24524
2020-04-23 21:16:51 +00:00
kevans
68b40071f8 kqueue(2): de-vandalize the random sentence in the middle
A last minute change appears to have inadvertently vandalized unrelated
parts of the manpage with the date. =-(

Reported by:	rpokala
2020-04-22 04:05:02 +00:00
kevans
fd3851a43c kqueue(2): add a note about EV_RECEIPT
In the below-referenced PR, a case is attached of a simple reproducer that
exhibits suboptimal behavior: EVFILT_READ and EVFILT_WRITE being set in the
same kevent(2) call will only honor the first one. This is, in fact, how
it's supposed to work.

A read of the manpage leads me to believe we could be more clear about this;
right now there's a logical leap to make in the relevant statement: "When
passed as input, it forces EV_ERROR to always be returned." -- the logical
leap being that this indicates the caller should have allocated space for
the change to be returned with EV_ERROR indicated in the events, or
subsequent filters will get dropped on the floor.

Another possible workaround that accomplishes similar effect without needing
space for all events is just setting EV_RECEIPT on the final change being
passed in; if any errored before it, the kqueue would not be drained. If we
made it to the final change with EV_RECEIPT set, then we would return that
one with EV_ERROR and still not drain the kqueue. This would seem to not be
all that advisable.

PR:		229741
MFC after:	1 week
2020-04-22 03:45:52 +00:00
jhb
7ed94416c0 Map negative types passed to vm_capability_type2name to NULL.
Submitted by:	vangyzen
2020-04-21 21:48:35 +00:00
jhb
33e82dfc5f Check the magic value in longjmp() before calling sigprocmask().
This avoids passing garbage to sigprocmask() if the jump buffer is
invalid.

Reviewed by:	mhorne
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24483
2020-04-21 17:40:23 +00:00
jhb
11f65d310a Add description string for VM_CAP_BPT_EXIT.
While here, replace the array of mapping structures with an array of
string pointers where the index is the capability value.
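
A minimal sketch of the table layout described above, assuming a small
made-up capability enum. All sketch_ names and the strings are hypothetical
and do not reflect the real bhyve definitions. The range check also covers
negative values, as in the follow-up commit listed earlier in this log.

```c
#include <stddef.h>

/* Hypothetical capability values; the real enum lives in the vmm headers. */
enum sketch_cap {
	SKETCH_CAP_HALT_EXIT,
	SKETCH_CAP_PAUSE_EXIT,
	SKETCH_CAP_BPT_EXIT,
	SKETCH_CAP_MAX
};

/* Array of string pointers indexed directly by the capability value. */
const char *sketch_cap_names[SKETCH_CAP_MAX] = {
	[SKETCH_CAP_HALT_EXIT] = "hlt_exit",
	[SKETCH_CAP_PAUSE_EXIT] = "pause_exit",
	[SKETCH_CAP_BPT_EXIT] = "bpt_exit",
};

/* Map a capability value to its name; out-of-range values yield NULL. */
const char *
sketch_cap_type2name(int type)
{
	if (type < 0 || type >= SKETCH_CAP_MAX)
		return (NULL);
	return (sketch_cap_names[type]);
}
```

Compared with an array of {value, name} structures, the lookup is a bounds
check plus one index operation, and adding a capability means adding one
designated initializer.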

Submitted by:	Rob Fairbanks <rob.fx907@gmail.com>
Reviewed by:	rgrimes
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D24289
2020-04-21 17:30:56 +00:00
asomers
f99f5d8582 libauditd: make it a PRIVATELIB
According to the upstream man page (which we don't install), none of
libauditd's symbols are intended to be public. Also, I can't find any
evidence for a port that uses libauditd. Therefore, we should treat it like
other such libraries and use PRIVATELIB.

Reported by:	phk
Reviewed by:	cem, emaste
MFC after:	2 weeks
2020-04-19 02:20:39 +00:00
asomers
d71fd57cdd libbsm: fix some MLINKS
Add missing MLINKS entries for a few functions. Remove some old typo
entries.

Reported by:	phk
Reviewed by:	cem
MFC after:	2 weeks
2020-04-19 02:18:40 +00:00
asomers
696799af17 cap_dns.3: fix some orphan .Xr links
Reported by:	phk
MFC after:	2 weeks
2020-04-18 20:13:43 +00:00
brooks
8c6291d34e Attempt to use AT_PS_STRINGS to get the ps_strings pointer.
This saves a system call and avoids one of the (relatively rare) cases
of the kernel exporting pointers via sysctl.

As a temporary measure, keep the sysctl support to allow limited
compatibility with old kernels.

Fail gracefully if ps_strings can't be found (should never happen).

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24407
2020-04-15 20:28:20 +00:00
brooks
af2831122f Support AT_PS_STRINGS in _elf_aux_info().
This will be used by setproctitle().

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24407
2020-04-15 20:26:41 +00:00
brooks
2423c967d5 Fix -Wvoid-pointer-to-enum-cast warnings.
This pattern is used in callbacks with void * data arguments and seems
both relatively uncommon and relatively harmless.  Silence the warning
by casting through uintptr_t.

This warning is on by default in Clang 11.
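
The cast pattern mentioned above looks roughly like this. The enum and
callback are hypothetical stand-ins for the affected code.

```c
#include <stdint.h>

/* Hypothetical enum that a caller smuggles through a void * argument. */
enum sketch_mode {
	SKETCH_MODE_READ = 1,
	SKETCH_MODE_WRITE = 2
};

/*
 * Callback taking an opaque data pointer.  Casting directly from
 * void * to the enum triggers -Wvoid-pointer-to-enum-cast; going
 * through uintptr_t makes the integer conversion explicit.
 */
int
sketch_callback(void *data)
{
	enum sketch_mode mode = (enum sketch_mode)(uintptr_t)data;

	return (mode == SKETCH_MODE_WRITE);
}
```

The caller performs the mirror-image cast when registering the callback, for
example `sketch_callback((void *)(uintptr_t)SKETCH_MODE_WRITE)`.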

Reviewed by:	arichardson
Obtained from:	CheriBSD (partial)
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24425
2020-04-15 18:15:58 +00:00
jhb
2314192d69 Remove support for geli(4) algorithms deprecated in r348206.
This removes support for reading and writing volumes using the
following algorithms:

- Triple DES
- Blowfish
- MD5 HMAC integrity

In addition, this commit adds an explicit whitelist of supported
algorithms to give a better error message when an invalid or
unsupported algorithm is used by an existing volume.

Reviewed by:	cem
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24343
2020-04-15 00:14:50 +00:00
kevans
2abd45cb37 closefrom: clamp lowfd to >= 0; close_range's parameters are unsigned.
Pointy hat:	kevans
Reported by:	CI (lwhsu)
2020-04-14 23:24:24 +00:00
kevans
79165c9642 Mark closefrom(2) COMPAT12, reimplement in libc to wrap close_range
Include a temporary compatibility shim as well for kernels predating
close_range, since closefrom is used in some critical areas.

Reviewed by:	markj (previous version), kib
Differential Revision:	https://reviews.freebsd.org/D24399
2020-04-14 18:07:42 +00:00
jtl
5c3de7e0d4 Make sonewconn() overflow messages have per-socket rate-limits and values.
sonewconn() emits debug-level messages when a listen socket's queue
overflows. Currently, sonewconn() tracks overflows on a global basis. It
will only log one message every 60 seconds, regardless of how many sockets
experience overflows. And, when it next logs at the end of the 60 seconds,
it records a single message referencing a single PCB with the total number
of overflows across all sockets.

This commit changes to per-socket overflow tracking. The code will now
log one message every 60 seconds per socket. And, the code will provide
per-socket queue length and overflow counts. It also provides a way to
change the period between log messages using a sysctl.
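
The per-socket scheme described above can be sketched in userspace as
follows. This is not the kernel code: the structure, field names and the
sketch_log_period variable (standing in for the sysctl) are assumptions made
for illustration, and the current time is passed in so the logic is easy to
exercise.

```c
#include <time.h>

/* Hypothetical per-socket state; the kernel keeps this in struct socket. */
struct sketch_sock {
	time_t		last_log;	/* time of last message, 0 if none yet */
	unsigned int	overflows;	/* overflows since the last message */
};

/* Stand-in for the sysctl that controls the logging period, in seconds. */
int sketch_log_period = 60;

/*
 * Record one queue overflow on this socket.  Returns the number of
 * overflows to report if a message should be logged now, or 0 if the
 * socket is still inside its own quiet period.
 */
unsigned int
sketch_overflow(struct sketch_sock *so, time_t now)
{
	unsigned int count;

	so->overflows++;
	if (so->last_log != 0 && now - so->last_log < sketch_log_period)
		return (0);

	count = so->overflows;
	so->overflows = 0;
	so->last_log = now;
	return (count);
}
```

Because both the timestamp and the counter live in the per-socket structure,
a busy socket cannot suppress or absorb the messages of another one.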

Reviewed by:	jhb (previous version), bcr (manpages)
MFC after:	2 weeks
Sponsored by:	Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D24316
2020-04-14 15:38:18 +00:00
kevans
e5fb66229b libc: remove shm_open(2)'s compat fallback
This had been introduced to ease any pain for using slightly older kernels
with a newer libc, e.g., for bisecting a kernel across the introduction of
shm_open2(2). Six months have passed; retire the fallback and let shm_open()
unconditionally call shm_open2().

Stale includes are removed as well.
2020-04-13 15:59:15 +00:00
delphij
8c6c4def20 Sync with OpenBSD:
arc4random.c: In the incredibly unbelievable circumstance where
_rs_init() fails to allocate pages, don't call abort() because of
corefile data leakage concerns, but simply _exit().  The reasoning
is _rs_init() will only fail if someone finds a way to apply
specific pressure against this failure point, for the purpose of
leaking information into a core which they can read.  We don't
need a corefile in this instance to debug that.  So take this
"lever" away from whoever in the future wants to do that.

arc4random.3: reference random(4)

arc4random_uniform.c: include stdint.h over sys/types.h
2020-04-13 08:42:13 +00:00
kevans
6371039d47 Implement a close_range(2) syscall
close_range(min, max, flags) allows for a range of descriptors to be
closed. The Python folk have indicated that they would much prefer this
interface to closefrom(2), as the case may be that they/someone have special
fds dup'd to higher in the range and they can't necessarily closefrom(min)
because they don't want to hit the upper range, but relocating them to lower
isn't necessarily feasible.

sys_closefrom has been rewritten to use kern_close_range() using ~0U to
indicate closing to the end of the range. This was chosen rather than
requiring callers of kern_close_range() to hold FILEDESC_SLOCK across the
call to kern_close_range for simplicity.

The flags argument of close_range(2) is currently unused, so setting any
flag currently returns EINVAL. It was added to the interface in Linux so that future
flags could be added for, e.g., "halt on first error" and things of this
nature.
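
The semantics described above can be approximated in userspace as shown
below. This is an emulation for illustration only, not the syscall or the
kernel's kern_close_range(); the sketch_ names and the fallback limit are
assumptions. The closefrom wrapper clamps negative arguments because the
range parameters are unsigned.

```c
#include <errno.h>
#include <limits.h>
#include <unistd.h>

/*
 * Hypothetical emulation: close every descriptor in [lowfd, highfd].
 * No flags are defined yet, so any set bit is rejected.
 */
int
sketch_close_range(unsigned int lowfd, unsigned int highfd, int flags)
{
	long limit;
	unsigned int fd, last;

	if (flags != 0 || highfd < lowfd) {
		errno = EINVAL;
		return (-1);
	}

	/* Never walk past the per-process descriptor limit. */
	limit = sysconf(_SC_OPEN_MAX);
	if (limit <= 0)
		limit = 1024;		/* assumed fallback */
	if (limit > INT_MAX)
		limit = INT_MAX;
	if ((unsigned long)lowfd >= (unsigned long)limit)
		return (0);
	if ((unsigned long)highfd < (unsigned long)limit - 1)
		last = highfd;
	else
		last = (unsigned int)(limit - 1);

	for (fd = lowfd; fd <= last; fd++)
		(void)close((int)fd);	/* unused slots just fail with EBADF */
	return (0);
}

/* closefrom() expressed as "close to the end of the range". */
void
sketch_closefrom(int lowfd)
{
	if (lowfd < 0)
		lowfd = 0;
	(void)sketch_close_range((unsigned int)lowfd, ~0U, 0);
}
```

A caller that keeps special descriptors high in the table can close only
the span below them, which is the case closefrom(2) alone cannot express.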

This patch is based on a syscall of the same design that is expected to be
merged into Linux.

Reviewed by:	kib, markj, vangyzen (all slightly earlier revisions)
Differential Revision:	https://reviews.freebsd.org/D21627
2020-04-12 21:23:19 +00:00
melifaro
4eda536b2e Introduce nexthop objects and new routing KPI.
This is the foundational change for the routing subsystem rearchitecture.
 More details and goals are available in https://reviews.freebsd.org/D24141 .

This patch introduces the concept of nexthop objects and a new nexthop-based
 routing KPI.

Nexthops are objects containing all the information necessary for making
 the packet output decision: output interface, MTU, flags and gateway address.
 In most cases, these objects will serve the same role that
 struct rtentry currently serves.
Typically there will be low tens of such objects on a router, even with
 multiple BGP full-views, as these objects will be shared between routing
 entries. This makes it possible to store more information in the nexthop.

New KPI:

struct nhop_object *fib4_lookup(uint32_t fibnum, struct in_addr dst,
  uint32_t scopeid, uint32_t flags, uint32_t flowid);
struct nhop_object *fib6_lookup(uint32_t fibnum, const struct in6_addr *dst6,
  uint32_t scopeid, uint32_t flags, uint32_t flowid);

These two functions are intended to replace all flavours of
 <in_|in6_>rtalloc[1]<_ign><_fib>, mpath functions  and the previous
 fib[46]-generation functions.

Upon successful lookup, they return nexthop object which is guaranteed to
 exist within current NET_EPOCH. If longer lifetime is desired, one can
 specify NHR_REF as a flag and get a referenced version of the nexthop.
 Reference semantic closely resembles rtentry one, allowing sed-style conversion.

Additionally, another 2 functions are introduced to support uRPF functionality
 inside variety of our firewalls. Their primary goal is to hide the multipath
 implementation details inside the routing subsystem, greatly simplifying
 firewalls implementation:

int fib4_lookup_urpf(uint32_t fibnum, struct in_addr dst, uint32_t scopeid,
  uint32_t flags, const struct ifnet *src_if);
int fib6_lookup_urpf(uint32_t fibnum, const struct in6_addr *dst6, uint32_t scopeid,
  uint32_t flags, const struct ifnet *src_if);

All functions have a separate scopeid argument, paving the way to eliminating IPv6 scope
 embedding and making it possible to support IPv4 link-locals in the future.

Structure changes:
 * rtentry gets new 'rt_nhop' pointer, slightly growing the overall size.
 * rib_head gets new 'rnh_preadd' callback pointer, slightly growing overall sz.

Old KPI:
During the transition state the old and new KPIs will coexist. As there are another 4-5
 decent-sized conversion patches, it will probably take a couple of weeks.
To support both KPIs, fields not required by the new KPI (most of rtentry) have to be
 kept, resulting in the temporary size increase.
Once conversion is finished, rtentry will notably shrink.

More details:
* architectural overview: https://reviews.freebsd.org/D24141
* list of the next changes: https://reviews.freebsd.org/D24232

Reviewed by:	ae,glebius(initial version)
Differential Revision:	https://reviews.freebsd.org/D24232
2020-04-12 14:30:00 +00:00
carlavilla
1fce59f215 Add HISTORY section to getc(3)
PR:		240269
Submitted by:	Gordon Bergling
Differential Revision:	https://reviews.freebsd.org/D24295
2020-04-10 09:37:20 +00:00
carlavilla
adb000eb30 exit(3): Add HISTORY section
PR:		240259
Submitted by:	Gordon Bergling
Obtained from:	OpenBSD
Differential Revision:	https://reviews.freebsd.org/D24146
2020-04-10 09:27:18 +00:00
carlavilla
8e3569e8c5 arc4random(3): Expand the SEE ALSO section
Submitted by:	Gordon Bergling
Approved by:	brueffer@
Obtained from:	NetBSD
Differential Revision:	https://reviews.freebsd.org/D23716
2020-04-10 09:12:41 +00:00
kib
ba55a3bd97 libc: Fix possible overflow in binuptime().
This is an application of the kernel overflow fix from r357948 to
userspace, based on the algorithm developed by Bruce Evans. To keep
the ABI of the vds_timekeep stable, instead of adding the large_delta
member, MSB of both multipliers are added to quickly estimate the overflow.
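
One way such an estimate can work is sketched below: if the positions of the
most significant set bits of the two multipliers sum to more than 63, the
64-bit product may overflow, so the multiplication is split into high and low
halves. The structure and all sketch_ names are hypothetical; this shows the
technique and is not a copy of the libc code.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct bintime: seconds plus a 64-bit fraction. */
struct sketch_bintime {
	uint64_t sec;
	uint64_t frac;
};

/* Add to the fraction, carrying into the seconds on wraparound. */
void
sketch_addx(struct sketch_bintime *bt, uint64_t x)
{
	uint64_t old = bt->frac;

	bt->frac += x;
	if (old > bt->frac)
		bt->sec++;
}

/* 1-based position of the most significant set bit, 0 for zero. */
int
sketch_fls64(uint64_t v)
{
	int n = 0;

	while (v != 0) {
		n++;
		v >>= 1;
	}
	return (n);
}

/* Accumulate scale * delta (up to 96 bits wide) into bt. */
void
sketch_add_scaled(struct sketch_bintime *bt, uint64_t scale, uint32_t delta)
{
	uint64_t x;

	/* Cheap, conservative overflow estimate from the two MSBs. */
	if (sketch_fls64(scale) + sketch_fls64(delta) > 63) {
		/* High half of scale: its product is weighted by 2^32. */
		x = (scale >> 32) * delta;
		scale &= 0xffffffff;
		bt->sec += x >> 32;
		sketch_addx(bt, x << 32);
	}
	/* Low half (or the whole scale if no split was needed). */
	sketch_addx(bt, scale * delta);
}
```

The estimate can take the split path when the product would still have fit,
but the split path yields the same result, so it only costs a little time in
the rare case.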

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2020-04-09 23:22:35 +00:00
sjg
37df3456a0 Improve interaction of vectx and tftp
On slow platforms, it helps to spread the hashing load
over time so that tftp does not timeout.

Also, some .4th files are too big to fit in cache of pkgfs,
so increase cache size and ensure fully populated.

Reviewed by:	stevek
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D24287
2020-04-07 16:56:34 +00:00
cem
1cb653bc5a libcasper(3): Export functions to C++
We must wrap C declarations in __BEGIN / __END_DECLS to avoid C++ name-mangling
of the declaration when including the C header; name-mangling causes the linker
to attempt to locate the wrong (C++ ABI) symbol name.
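
The wrapping described above amounts to the following header pattern. The
SKETCH_ macros spell out what __BEGIN_DECLS and __END_DECLS conventionally
expand to, and sketch_cap_limit is a hypothetical function, not part of
libcasper.

```c
/*
 * Hypothetical equivalents of __BEGIN_DECLS / __END_DECLS: in C++ they
 * open and close an extern "C" block, in C they expand to nothing.
 */
#ifdef __cplusplus
#define SKETCH_BEGIN_DECLS	extern "C" {
#define SKETCH_END_DECLS	}
#else
#define SKETCH_BEGIN_DECLS
#define SKETCH_END_DECLS
#endif

/* Header part: declarations that a C++ consumer must see unmangled. */
SKETCH_BEGIN_DECLS
int sketch_cap_limit(int fd);
SKETCH_END_DECLS

/* Library part, compiled as C: the symbol is plain "sketch_cap_limit". */
int
sketch_cap_limit(int fd)
{
	return (fd >= 0 ? 0 : -1);
}
```

Without the wrapper, a C++ translation unit including the header would
reference a mangled name that the C library never defines, and the link
would fail.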

Reviewed by:	markj, oshogbo (earlier version both)
Differential Revision:	https://reviews.freebsd.org/D24323
2020-04-07 16:40:41 +00:00
sobomax
21e37b36a4 Normalize deployment tools usage and definitions by putting into one place
instead of sprinkling them out over many disjoint files. This is a follow-up
to achieve the same goal in an incomplete rev.348521.

Approved by:	imp
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D20520
2020-04-07 02:46:22 +00:00
cem
15e07eb7cf libcasper: Constify cap_sysctl_limit_mib() mib parameter
No functional change. Minor API change that is nicer for consumers. ABI is
identical; the routine never needed to modify the pointed to value.

Reviewed by:	emaste, markj
Differential Revision:	https://reviews.freebsd.org/D24319
2020-04-06 23:07:56 +00:00
kevans
8ef9470bff llvm: add a build knob for enabling assertions
For head/, this will remain eternally default-on to maintain the status quo.
For stable/ branches, it should be flipped to default-off to maintain the
status quo.

There's value in being able to flip it one way or the other easily on head
or stable branches, whether you want to gain some performance back on head/
(for machines there's little chance you'll actually hit an assertion) or
potentially diagnose a problem with the version of llvm on an older branch.

Currently, stable branches get the CFLAGS+= -ndebug line uncommented; going
forward, they will instead have the default of LLVM_ASSERTIONS flipped.

Reviewed by:	dim, emaste, re (gjb)
MFC after:	1 week
MFC note:	flip the default of LLVM_ASSERTIONS
Differential Revision:	https://reviews.freebsd.org/D24264
2020-04-06 01:27:17 +00:00
carlavilla
50e8f4b13f Fix typo
2020-04-04 07:43:47 +00:00
mmacy
b460ff02eb Update x86 counters
MFC after:	1 week
2020-04-03 22:36:22 +00:00
emaste
04f16bb71a ANSIfy and KNF function arg definitions in libmd/md4.c
Reported by:	bde, in 2017
2020-04-03 20:56:43 +00:00
emaste
be2f3ab689 lldb: build and enable lua script bindings
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24266
2020-04-03 16:54:13 +00:00
emaste
939ddaef3b lldb: commit generated LLDBWrapLua.cpp
2020-04-03 15:55:58 +00:00
emaste
1055ebaeb1 lldb: add rule to generate LLDBWrapLua.cpp
Building lldb's lua/python bindings requires swig, but we do not want to
include it in the FreeBSD base system (as a build tool) because it has
non-trivial dependencies.  As a workaround, add a make rule to generate
LLDBWrapLua.cpp, and we will commit the generated file.

Requires the swig30 package.

Reviewed by:	brooks
Discussed with:	dim
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24265
2020-04-03 15:52:44 +00:00
imp
0f6c6dff74 Note some functions that appeared in First Edition Unix
These functions first appeared in the First Edition of Unix (or earlier in the
pdp-7 version). Just claim 1st Edition for all this. The pdp-7 code is too
fragmented at this point to extend history that far back.
2020-04-01 22:50:41 +00:00
jhb
efd93357ab Retire procfs-based process debugging.
Modern debuggers and process tracers use ptrace() rather than procfs
for debugging.  ptrace() has a superset of functionality available
via procfs and new debugging features are only added to ptrace().
While the two debugging services share some fields in struct proc,
they each use dedicated fields and separate code.  This results in
extra complexity to support a feature that hasn't been enabled in the
default install for several years.

PR:		244939 (exp-run)
Reviewed by:	kib, mjg (earlier version)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D23837
2020-04-01 19:22:09 +00:00
harti
017de2fd54 Merge release 1.14 of bsnmp.
2020-04-01 15:25:16 +00:00
0mp
5c3a6ff8de Use proper mdoc(7) macros for literal text and do not use Tn
Tn is deprecated and upsets linters.

MFC after:	3 days
2020-04-01 09:01:35 +00:00
sjg
23cfd5d47c Do not claim libbearssl et al are INTERNALLIB
If INTERNALLIB is defined we need PIE and bsd.incs.mk is
not included.

PR:		245189
Reviewed by:	emaste
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org//D24233
2020-04-01 05:45:12 +00:00