Commit Graph

8949 Commits

Author SHA1 Message Date
Kristof Provost
b1f3ab0051 pfctl: Fix 'set skip' handling for groups
When we skip on a group the kernel will automatically skip on the member
interfaces. We still need to update our own cache though, or we risk
overruling the kernel afterwards.

This manifested as 'set skip' working initially, then not working when
the rules were reloaded.

PR:		229241
MFC after:	1 week
2019-01-13 05:30:26 +00:00
Kyle Evans
0a603a6ece libbe(3): Change be_mount to mount/unmount child datasets
This set of changes is geared towards making bectl respect deep boot
environments when they exist and are mounted. The deep BE composition
functionality (`bectl add`) remains disabled for the time being. This set of
changes has no effect for the average user. but allows deep BE users to
upgrade properly with their current setup.

libbe(3): Open the target boot environment and get a zfs handle, then pass
that with the target mountpoint to be_mount_iter; If the BE_MNT_DEEP flag is
set call zfs_iter_filesystems and mount the child datasets.

Similar logic is employed when unmounting the datasets, save for children
are unmounted first.

bectl(8): Change bectl_cmd_jail to pass the BE_MNT_DEEP flag when
calling be_mount as well as call be_unmount when cleaning up after the
jail has exited instead of umount(2) directly.

PR:		234795
Submitted by:	Wes Maag <jwmaag_gmail.com> (test additions by kevans)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D18796
2019-01-10 03:27:20 +00:00
Enji Cooper
8b5fede0ac route(8): clarify -prefixlen description
Try to reword -prefixlen section to more clearly and accurately describe how
the -prefixlen modifier works.

While here, fix a word that igor considered a typo: aggregatable addresses is a
valid technical term per RFC-2374, however, it was superseded by the term
"aggregator" in RFC-3587.

MFC after:	1 week
Reviewed by:	0mp, crees
Approved by:	emaste (mentor)
Differential Revision:	https://reviews.freebsd.org/D10087
2019-01-10 00:10:12 +00:00
Mark Johnston
04e9edb544 Capsicumize rtsol(8) and rtsold(8).
These programs parse ND6 Router Advertisement messages; rtsold(8) has
required an SA, SA-14:20.rtsold, for a bug in this code.  Thus, they
are good candidates for sandboxing.

The approach taken is to run the main executable in capability mode
and use Casper services to provide functionality that cannot be
implemented within the sandbox.  In particular, several custom services
were required.

- A Casper service is used to send Router Solicitation messages on a
  raw ICMP6 socket.  Initially I took the approach of creating a
  socket for each interface upon startup, and connect(2)ing it to
  the all-routers multicast group for the interface.  This permits
  the use of sendmsg(2) in capability mode, but only works if the
  interface's link is up when rtsol(d) starts.  So, instead, the
  rtsold.sendmsg service is used to transmit RS messages on behalf
  of the main process.  One could alternately define a service
  which simply creates and connects a socket for each destination
  address, and returns the socket to the sandboxed process.  However,
  to implement rtsold's -m option we also need to read the ND6 default
  router list, and this cannot be done in capability mode.
- rtsold may execute resolvconf(8) in response to RDNSS and DNSSL
  options in received RA messages.  A Casper service is used to
  fork and exec resolvconf(8), and to reap the child process.
- A service is used to determine whether a given interface's
  link-local address is useable (i.e., not duplicated or undergoing
  DAD).  This information is supplied by getifaddrs(3), which reads
  a sysctl not available in capability mode.  The SIOCGIFCONF socket
  ioctl provides equivalent information and can be used in capability
  mode, but I decided against it for now because of some limitations
  of that interface.

In addition to these new services, cap_syslog(3) is used to send
messages to syslogd.

Reviewed by:	oshogbo
Tested by:	bz (previous versions)
MFC after:	2 months
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17572
2019-01-05 16:05:39 +00:00
Mark Johnston
0fadd6731f Disable savecore(8)'s libcasper support when WITHOUT_DYNAMICROOT=yes.
This follows the example of other Capsicumized programs in /sbin.

Reported by:	Manfred Antar <manfredantar@gmail.com>
MFC with:	r342699
Sponsored by:	The FreeBSD Foundation
2019-01-04 19:20:19 +00:00
Mark Johnston
2e4c75c15e Fix an error check after r342699.
Reported by:	gcc
MFC with:	r342699
Sponsored by:	The FreeBSD Foundation
2019-01-02 17:34:25 +00:00
Mark Johnston
d7fffd0689 Capsicumize savecore(8).
- Use cap_fileargs(3) to open dump devices after entering capability
  mode, and use cap_syslog(3) to log messages.
- Use a relative directory fd to open output files.
- Use zdopen(3) to compress kernel dumps in capability mode.

Reviewed by:	cem, oshogbo
MFC after:	2 months
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18458
2019-01-02 17:09:35 +00:00
Kyle Evans
7ce09314b2 bectl: use jail id as the default jail name for a boot environment
By default, bectl is setting the jail 'name' parameter to the boot
environment name, which causes an error when the boot environment name is
not a valid jail name. With the attached fix, when no name is supplied, the
default jail name will be the jail id - this is is the same behavior as the
jail command.

Additionally, this commit addresses two other bugs that prevented unjailing
in scenarios where the jail name does not match the boot environment name:

1. In 'bectl_locate_jail', 'mountpoint' is used to resolve the boot
  environment path, but really 'mounted' should be used. 'mountpoint' is the
  path where the zfs dataset will be mounted. 'mounted' is the path where
  the dataset is actually mounted.

2. in 'bectl_search_jail_paths', 'jail_getv' would fail after the first
  call. Which is fine, if the boot environment you're unjailing is the next
  one up. According to 'man jail_getv', it's expecting name and value
  strings. 'jail_getv' is being passed an integer for the lastjid, so amend
  that to use a string instead.

Test cases have been amended to reflect the bugs found.

PR:		233637
Submitted by:	Rob <rob.fx907_gmail.com>
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D18607
2018-12-25 15:18:41 +00:00
Eugene Grosbein
8ebaf58450 ifconfig.4, lagg.4: fix documentation bug: -use_flowid needs to be used
to force local hash computation and disable usage of RSS hash
provided by driver.

PR:		234242
MFC after:	1 week
2018-12-22 11:38:54 +00:00
Warner Losh
9d0e9f8ef5 Try the first 256 units with nvmecontrol devlist.
The nvmecontrol code that did the devlist assumed that we had a
tightly-packed allocation of units. Since pci writing exists, this
isn't the case. Loop over the first 256 units, which is a reasonable
number of possible units.

Sponsored by: Netflix
2018-12-21 23:22:37 +00:00
Andrey V. Elsukov
a5178bca19 Allow use underscores and dots in service names without escaping.
PR:		234237
MFC after:	1 week
2018-12-21 10:41:45 +00:00
Bruce Evans
9e5ed8593f Use VOP_ADVISE() with POSIX_FADV_DONTNEED instead of IO_DIRECT to
implement not double-caching for reads from vnode-backed md devices.
Use VOP_ADVISE() similarly instead of !IO_DIRECT unsimilarly for writes.
Add a "cache" option to mdconfig to allow changing the default of not
caching.

This depends on a recent commit to fix VOP_ADVISE().  A previous version
had optimizations for sequential i/o's (merge the i/o's and only uncache
for discontiguous i/o's and for full blocks), but optimizations and
knowledge of block boundaries belong in VOP_ADVISE().  Read-ahead should
also be handled better, by supporting it in md and discarding it in
VOP_ADVISE().

POSIX_FADV_DONTNEED is ignored by zfs, but so is IO_DIRECT.

POSIX_FADV_DONTNEED works better than IO_DIRECT if it is not ignored,
since it only discards from the buffer cache immediately, while
IO_DIRECT also discards from the page cache immediately.

IO_DIRECT was not used for writes since it was claimed to be too slow,
but most of the slowness for writes is from doing them synchronously by
default.  Non-synchronous writes still deadlock in many cases.

IO_DIRECT only has a special implementation for ffs reads with DIRECTIO
configured.  Otherwise, if it is not ignored than it uses the buffer and
page caches normally except for discarding everything after each i/o,
and then it has much the same overheads as POSIX_FADV_DONTNEED.  The
overheads for reading with ffs and DIRECTIO were similar in tests of md.

Reviewed by:	kib
2018-12-21 08:15:31 +00:00
Bruce Evans
e6f6d8853c Fix missing (sub)options in usage message to prepare for adding a new one.
Reviewed by:	kib
2018-12-21 06:38:13 +00:00
Mark Johnston
18fcfaa4ca Use caph_enter_casper() in ping(8).
Reported by:	oshogbo
MFC with:	r341837
Sponsored by:	The FreeBSD Foundation
2018-12-18 16:47:03 +00:00
Poul-Henning Kamp
96a3750174 Make (no)ro an alias for (no)readonly 2018-12-16 18:10:55 +00:00
Kirk McKusick
e155208020 Fsck would find, report, and offer to fix inode check-hash failures.
If requested to fix the inode check-hash it would confirm having done
it, but then fail to make the fix. The same code is used in fsdb which,
unlike fsck, would actually fix the inode check-hash.

The discrepancy occurred because fsck has two ways to fetch inodes.
The inode by number function ginode() and the streaming inode
function getnextinode() used during pass1. Fsdb uses the ginode()
function which correctly does the fix, while fsck first encounters
the bad inode check-hash in pass1 where it is using the getnextinode()
function that failed to make the correction. This patch corrects
the getnextinode() function so that fsck now correctly fixes inodes
with incorrect inode check-hashs.

Reported by:  Gary Jennejohn <gljennjohn@gmail.com>
Sponsored by: Netflix
2018-12-15 17:32:47 +00:00
Edward Tomasz Napierala
04e5c6f18a Make fsck(8) use pread(2). This cuts the number of syscalls by half.
Reviewed by:	kib, mckusick
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17586
2018-12-15 11:36:20 +00:00
Mark Johnston
7bdc329113 Use Capsicum helpers in ping(8).
Also use caph_cache_catpages() to ensure that strerror() works when
run with kern.trap_enotcap=1.

Reviewed by:	oshogbo
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18514
2018-12-12 02:33:01 +00:00
Kirk McKusick
8f829a5cf0 Continuing efforts to provide hardening of FFS. This change adds a
check hash to the filesystem inodes. Access attempts to files
associated with an inode with an invalid check hash will fail with
EINVAL (Invalid argument). Access is reestablished after an fsck
is run to find and validate the inodes with invalid check-hashes.
This check avoids a class of filesystem panics related to corrupted
inodes. The hash is done using crc32c.

Note this check-hash is for the inode itself and not any of its
indirect blocks. Check-hash validation may be extended to also
cover indirect block pointers, but that will be a separate (and
more costly) feature.

Check hashes are added only to UFS2 and not to UFS1 as UFS1 is
primarily used in embedded systems with small memories and low-powered
processors which need as light-weight a filesystem as possible.

Reviewed by:  kib
Tested by:    Peter Holm
Sponsored by: Netflix
2018-12-11 22:14:37 +00:00
Andrey V. Elsukov
a895c1c28a Rework how protocol number is tracked in rule. Save it when O_PROTO
opcode will be printed. This should solve the problem, when protocol
name is not printed in `ipfw -N show`.

Reported by:	Claudio Eichenberger <cei at yourshop.com>
MFC after:	1 week
2018-12-10 16:23:11 +00:00
Andrey V. Elsukov
5f9c94c592 Use correct size for IPv4 address in gethostbyaddr().
When u_long is 8 bytes, it returns EINVAL and 'ipfw -N show' doesn't work.

Reported by:	Claudio Eichenberger <cei at yourshop.com>
MFC after:	1 week
2018-12-10 15:42:13 +00:00
Eugene Grosbein
2d0a6ce24c ping(8): add space after "<=" as per style(9).
MFC after:	1 week
X-MFC-with:	r341768
2018-12-10 14:39:21 +00:00
Eugene Grosbein
65c3a67d23 ping(8): remove needless comparision with LONG_MAX
after unsigned long ultmp changed to long ltmp in r340245.

MFC after:	1 week
2018-12-09 21:11:15 +00:00
Warner Losh
e8c7837685 Update paths based on last-minute changes from libexec to lib. 2018-12-06 23:40:56 +00:00
Warner Losh
44d31a441e Declare global function print_intel_add_smart in header 2018-12-06 23:29:06 +00:00
Warner Losh
4b639af1d4 Use proper prototypes. 2018-12-06 23:28:55 +00:00
Warner Losh
e4aa091bd1 It's useful to have this be a global function.
Other vendors base their additional smart info pages on what Intel did
plus some other bits. So it's convenient to have this be global.

Sponsored by: Netflix
2018-12-06 22:59:18 +00:00
Warner Losh
8775459902 This is not a samsung standard, so remove that alias.
This was never documented, and isn't needed, so it's best removed to
avoid confusion.

Sponsored by: Netflix
Differential Revision:	https://reviews.freebsd.org/D18460
2018-12-06 22:59:08 +00:00
Warner Losh
eac8e82796 Move intel and wdc files to their own modules
Move the intel and wdc vendor specific stuff to their own modules.

Sponsored by: Netflix
Differential Revision:  https://reviews.freebsd.org/D18460
2018-12-06 22:58:55 +00:00
Warner Losh
0d095c23a0 Const poison the command interface
Make the pointers we pass into the commands const, also make the
linker set mirrors const.

Suggested by: cem@
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18459
2018-12-06 22:58:42 +00:00
Warner Losh
228c425533 Dynamically load .so modules to expand functionality
o Dynamically load all the .so files found in /libexec/nvmecontrol and
  /usr/local/libexec/nvmecontrol.
o Link nvmecontrol -rdynamic so that its symbols are visible to the
  libraries we load.
o Create concatinated linker sets that we dynamically expand.
o Add the linked-in top and logpage linker sets to the mirrors for them
  and add those sets to the mirrors when we load a new .so.
o Add some macros to help hide the names of the linker sets.
o Update the man page.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18455

fold
2018-12-06 22:58:26 +00:00
Kirk McKusick
fb14e73cb4 Normally when an attempt is made to mount a UFS/FFS filesystem whose
superblock has a check-hash error, an error message noting the
superblock check-hash failure is printed and the mount fails. The
administrator then runs fsck to repair the filesystem and when
successful, the filesystem can once again be mounted.

This approach fails if the filesystem in question is a root filesystem
from which you are trying to boot. Here, the loader fails when trying
to access the filesystem to get the kernel to boot. So it is necessary
to allow the loader to ignore the superblock check-hash error and make
a best effort to read the kernel. The filesystem may be suffiently
corrupted that the read attempt fails, but there is no harm in trying
since the loader makes no attempt to write to the filesystem.

Once the kernel is loaded and starts to run, it attempts to mount its
root filesystem. Once again, failure means that it breaks to its prompt
to ask where to get its root filesystem. Unless you have an alternate
root filesystem, you are stuck.

Since the root filesystem is initially mounted read-only, it is
safe to make an attempt to mount the root filesystem with the failed
superblock check-hash. Thus, when asked to mount a root filesystem
with a failed superblock check-hash, the kernel prints a warning
message that the root filesystem superblock check-hash needs repair,
but notes that it is ignoring the error and proceeding. It does
mark the filesystem as needing an fsck which prevents it from being
enabled for writing until fsck has been run on it. The net effect
is that the reboot fails to single user, but at least at that point
the administrator has the tools at hand to fix the problem.

Reported by:    Rick Macklem (rmacklem@)
Discussed with: Warner Losh (imp@)
Sponsored by:   Netflix
2018-12-06 00:09:39 +00:00
Kirk McKusick
8ebae128be Ensure that cylinder-group check-hashes are properly updated when first
creating them and when correcting them when they are found to be corrupted.

Reported by:  Don Lewis (truckman@)
Sponsored by: Netflix
2018-12-05 06:31:50 +00:00
Andrey V. Elsukov
d66f9c86fa Add ability to request listing and deleting only for dynamic states.
This can be useful, when net.inet.ip.fw.dyn_keep_states is enabled, but
after rules reloading some state must be deleted. Added new flag '-D'
for such purpose.

Retire '-e' flag, since there can not be expired states in the meaning
that this flag historically had.

Also add "verbose" mode for listing of dynamic states, it can be enabled
with '-v' flag and adds additional information to states list. This can
be useful for debugging.

Obtained from:	Yandex LLC
MFC after:	2 months
Sponsored by:	Yandex LLC
2018-12-04 16:12:43 +00:00
Ed Maste
133f9fcfff ggated: do not expose stack data in sendfail()
admbugs:	590
Submitted by:	Fabian Keil <fk@fabiankeil.de>
Obtained from:	ElectroBSD
2018-12-04 15:25:15 +00:00
Renato Botelho
270adb2182 Restore /var/crash permissions to 0750, as declared in mtree file. After
r337337 it changed to 0755.

Reviewed by:	loos
Approved by:	loos
MFC after:	3 days
Sponsored by:	Rubicon Communications, LLC (Netgate)
Differential Revision:	https://reviews.freebsd.org/D18355
2018-12-04 12:34:22 +00:00
Warner Losh
e860439466 Fix typo in comment
Sponsored by: Netflix
2018-12-02 23:13:45 +00:00
Warner Losh
48133c3ff3 Delete the undocumented alias 'wds'.
This was a typo for wdc. Eliminate it since it was in error. People
should use either 'wdc' or 'hgst' for the vendor from now on. 'hgst'
works for all versions this functionality is present for.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18403
2018-12-02 23:13:35 +00:00
Warner Losh
2da383a59a Move Intel specific log pages to intel.c
Move the Intel specific log pages (including the one that samsung
implements) to intel.c. Add comment to the samsung vendor that it will
be going away soon.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18403
2018-12-02 23:13:24 +00:00
Warner Losh
d4fdb249f2 Usage cleanup pt 2
Eliminage redundant spaces and nvmecontrol at start of all the usage
strings. Update the usage printing code to add them back when
presenting to the user. Allow multi-line usage messages and print
proper leading spaces for lines starting with a space.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18403
2018-12-02 23:13:12 +00:00
Warner Losh
7d923c13d7 Usage cleanup pt 1
Provide a usage() function that takes a struct nvme_function pointer
and produces a usage mssage. Eliminate all now-redundant usage
functions. Propigate the new argument through the program as needed.
Use common routine to print usage.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18403
2018-12-02 23:12:58 +00:00
Warner Losh
fbf14fe84b Return after we find the dispatched function.
If the dispatched function doesn't exit, then we get can get a
spurious function not found message. They all do exit, but this is a
little cleaner.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18403
2018-12-02 23:12:48 +00:00
Warner Losh
e2ed7941e0 Move the hgst/wdc log page printing code into wdc.c
These are all hgst/wdc specific, so move them into the wdc.c to live
with the wdc command.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18403
2018-12-02 23:12:37 +00:00
Warner Losh
a773b08b88 Move common logpage routines into nvmecontrol.h
For the upcoming move of vendor specific code into vendor specific
files, make the common logpage routines global and move them to
nvmecontrol.h.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18403
2018-12-02 23:12:26 +00:00
Warner Losh
aecd1901a9 Make logpage functions a linker set.
Move logpage function def to header. Convert all the logpage_function
elements to elements of the linker set. Leave them all in logpage.c
for the moment.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18403
2018-12-02 23:12:16 +00:00
Warner Losh
a13a291adf Move nvmecontrol to using linker sets for commands
More commands will be added to nvmecontrol. Also, there will be a few
more vendor commands (some of which may need to remain private to
companies writing them). The first step on that journey is to move to
using linker sets to dispatch commands. The next step will be using
dlopen to bring in the .so's that have the command that might need
to remain private for seamless integration.

Similar changes to this will be needed for vendor specific log pages.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18403
2018-12-02 23:10:55 +00:00
Eugene Grosbein
3b78397008 Small language fix after r340978.
MFC after:	3 days
2018-11-26 16:10:20 +00:00
Eugene Grosbein
3a498c2e8a ipfw.8: add new section to EXAMPLES:
SELECTIVE MIRRORING
     If your network has network traffic analyzer connected to your host
     directly via dedicated interface or remotely via RSPAN vlan, you can
     selectively mirror some ethernet layer2 frames to the analyzer.
     ...
2018-11-26 16:02:17 +00:00
Yuri Pankov
52ee41b778 bectl: sync usage with man page, removing stray multibyte characters
in the process.

PR:		233526
Submitted by:	tigersharke@gmail.com (original version)
Reviewed by:	kevans
Approved by:	kib (mentor, implicit)
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D18335
2018-11-26 15:11:32 +00:00
Kirk McKusick
038c170fc2 Properly recover from superblock check-hash failures. Specifically,
report the check-hash failure and offer to search for and use
alternate superblocks.  Prior to this fix fsck_ffs would simply
report the check-hash failure and exit.

Reported by:  Julian H. Stacey <jhs@berklix.com>
Tested by:    Peter Holm
Sponsored by: Netflix
2018-11-25 18:09:39 +00:00