* Use 0/1 instead of sysexits. Man pages are confusing on this topic,
but 0/1 is sufficient for nvmecontrol.
* Use err function family where possible instead of fprintf/exit.
* Fix some typing errors.
* Clean up some error message inconsistencies.
Sponsored by: Intel
Submitted by: bde (parts of firmware.c changes)
MFC after: 3 days
that looks for interface skips interfaces that are not UP. We need to call
dhclient-script PREINIT before we call discover_interfaces(), so the script has
a chance to bring the interface UP.
Reported by: alfred
Revoke all capability rights from STDIN and allow only for write to STDOUT and
STDERR. All those descriptors are redirected to /dev/null.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
Once PID is written to the pidfile, revoke all capability rights.
We just want to keep the pidfile open.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
Limit routing socket so only poll(2) and read(2) are allowed (CAP_POLL_EVENT
and CAP_READ). This prevents unprivileged process from adding, removing or
modifying system routes.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
- Limit bpf descriptor in unprivileged process to CAP_POLL_EVENT, CAP_READ and
allow for SIOCGIFFLAGS, SIOCGIFMEDIA ioctls.
- While here limit bpf descriptor in privileged process to only CAP_WRITE.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
Currently it was allowed to send any UDP packets from unprivileged process and
possibly any packets because /dev/bpf was open for writing.
Move sending packets to privileged process. Unprivileged process has no longer
access to not connected UDP socket and has only access to /dev/bpf in read-only
mode.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
- Add new request (IMSG_SEND_PACKET) that will be handled by privileged process.
- Add $FreeBSD$.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
The gethostname(3) function won't work in capability mode, because reading
kern.hostname sysctl is not permitted there. Cache hostname early and use
cached value later.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
Make use of two fields: rfdesc and wfdesc to keep bpf descriptor open for
reading only in rfdesc and bpf descriptor open for writing only in wfdesc.
In the end they will be used by two different processes.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
contained in the DHCP offer, and write it out to the lease file
as an unquoted value of the "next-server" keyword. The value is ignored
when the lease is read back by dhclient, however other applications
are free to parse it.
The intent behind this change is to allow easier interoperability
with automated installation systems e.g. Cobbler, Foreman, Razor;
FreeBSD installation kernels can automatically probe the network
to discover deployment servers. There are no plans to MFC this
change unless a backport is specifically requested.
The syntax of the "next-server <ip>" lease keyword is intended to be
identical to that used by the ISC DHCPD server in its configuration files.
The required defines are already present in dhclient but were unused before
this change. (Note: This is NOT the same as Option 66, tftp-server-name).
It has been exercised in a university protocol testbed environment, with
Cobbler and an mfsBSD image containing pc-sysinstall (driven by Cobbler
Cheetah templates). The SYSLINUX memdisk driver is used to boot mfsBSD.
Currently this approach requires that a dedicated system profile has
been created for the node where FreeBSD is to be deployed. If this
is not present, the pc-sysinstall wrapper will be unable to obtain
a node configuration. There is code in progress to allow mfsBSD images
to obtain the required hints from the memdisk environment by parsing
the MBFT ACPI chunk. This is non-standard as it is not linked into
the platform's ACPI RSDT.
Reviewed by: des
in gpart(8) and boot(8), adding references to gptboot(8) in both.
Reviewed by: jhb, ae, pjd, Paul Schenkeveld <bsdcan@psconsult.nl>, david_a_bright@dell.com (portions), gjb
MFC after: 1 week
sbin/devd/devd.cc
All output will now go to syslog(3) if devd is daemonized, or stderr
if it's running in the foreground.
sbin/devd/devd.8
Remove the "-D" flag. Filtering messages by priority now
happens in the usual syslog way. For performance reasons, a few
extra-verbose debugging statements are now conditional on the "-d" (do
not daemonize) flag.
etc/syslog.conf
etc/newsyslog.conf
Direct messages from devd(8) to /var/log/devd.log, but leave it
disabled by default
Reviewed by: eadler
Approved by: gibbs (co-mentor)
MFC after: never (removed a command-line option from devd)
which is presently: AES-XTS, no authentication. Create provider
with pagesize as sectorsize by default.
- Rewrite parsing code for geli(8)-backed swap options, now options
are required to be exact match, and unrecognized options will trigger
a warning.
- Don't initialize GELI device if it's already initialized. This
restores previous behavior.
- Don't duplicate file descriptor when working with geli(8) and
gbde(8) as there is no need to communicate with the utility other
than exit status.
- When calling swap_on_off_* routines, which_prog can only be SWAP_ON
or SWAP_OFF. Eliminate unneeded case branches by replacing switch
with if's.
- Plug a few memory leaks.
Reviewed by: hrs (but bugs are mine)
MFC after: 1 week
X-MFC-with: r252310, r252332, r252345
- Reconnect with some minor modifications, in particular now selsocket()
internals are adapted to use sbintime units after recent'ish calloutng
switch.
device names "md" or "md[0-9]*" and a "file" option are specified in
/etc/fstab like this:
md none swap sw,file=/swap.bin 0 0
- Add GBDE/GELI encrypted swap space specification support, which
rc.d/encswap supported. The /etc/fstab lines are like the following:
/dev/ada1p1.bde none swap sw 0 0
/dev/ada1p2.eli none swap sw 0 0
.eli devices accepts aalgo, ealgo, keylen, and sectorsize as options.
swapctl(8) can understand an encrypted device in the command line
like this:
# swapctl -a /dev/ada2p1.bde
- "-L" flag is added to support "late" option to defer swapon until
rc.d/mountlate runs.
- rc.d script change:
rc.d/encswap -> removed
rc.d/addswap -> just display a warning message if $swapfile is defined
rc.d/swap1 -> renamed to rc.d/swap
rc.d/swaplate -> newly added to support "late" option
These changes alleviate a race condition between device creation/removal
and swapon/swapoff.
MFC after: 1 week
Reviewed by: wblock (manual page)
a new firmware command.
NVMe controllers may support up to 7 firmware slots for storing of
different firmware revisions. This new firmware command supports
firmware replacement (i.e. firmware download) with or without immediate
activation, or activation of a previously stored firmware image. It
also supports selection of the firmware slot during replacement
operations, using IDENTIFY information from the controller to
check that the specified slot is valid.
Newly activated firmware does not take effect until the new controller
reset, either via a reboot or separate 'nvmecontrol reset' command to the
same controller.
Submitted by: Joe Golio <joseph.golio@emc.com>
Obtained from: EMC / Isilon Storage Division
MFC after: 3 days
This includes pretty printers for all of the standard NVMe log pages
(Error, SMART/Health, Firmware), as well as hex output for non-standard
or vendor-specific log pages.
Submitted by: Joe Golio <joseph.golio@emc.com>
Obtained from: EMC / Isilon Storage Division
MFC after: 3 days
commands.
Also improve the checking of device node names, so that better error
messages are displayed when incorrect names are specified.
Sponsored by: Intel
MFC after: 3 days
usage message for each nvmecontrol command. This helps reduce some code
clutter both now and for future commits which will add logpage and
firmware support to nvmecontrol(8).
Also move helper function prototypes to the end of the header file, after
the per-command functions.
Sponsored by: Intel
MFC after: 3 days
specified, only md(4) devices which have the specified file as backing
store are displayed.
- Use MD_NAME instead of "md".
- Use _PATH_DEV instead of "/dev/".
MFC after: 1 week
C11 atomics now work on all the architectures. Have at least a single
piece of software in our base system that uses C11 atomics. This
somewhat makes it less likely that we break it because of LLVM imports,
etc.
- When operating on a core file (-M) and -c is specified we don't clear
the message buffer of the running system.
- If we don't have permission to clear the buffer print the error message
only. That's what Linux does in this case, where this feature was ported
from, and it ensures that the error message doesn't get lost in the noise.
Discussed with: antoine, cognet
Approved by: cognet
This allows setting attributes on tables. One simply does not provide
an index in that case. Otherwise the entry corresponding the index has
the attribute set or unset.
Use this change to fix a relatively longstanding bug in our GPT scheme
that's the result of rev 198097 (relatively harmless) followed by rev
237057 (damaging). The damaging part being that our GPT scheme always
has the active flag set on the PMBR slice. This is in violation with
EFI. Existing EFI implementions for both x86 and ia64 reject the GPT.
As such, GPT disks created by us aren't usable under EFI because of
that.
After this change, GPT disks never have the active flag set on the PMBR
slice. In order to make the GPT disk bootable under some x86 BIOSes,
the reason of rev 198097, one must now set the active attribute on the
gpt table. The kernel will apply this to the PMBR slice For (S)ATA:
gpart set -a active ada0
To fix an existing GPT disk that has the active flag set in the PMBR,
and that does not need the flag, use (again for (S)ATA):
gpart unset -a active ada0
The EBR, MBR & PC98 schemes, which also impement at least 1 attribute,
now check to make sure the entry passed is valid. They do not have
attributes that apply to the table.
The "failok" option doesn't have any effect at all unless specified in
fstab(5) and combined with the -a flag. The "failok" option is already
documented in fstab(5).
PR: 177630
No objection: eadler
MFC after: 1 week
recreating the filesystem, check for and output the -i, -k, and -l
options if appropriate.
Note the remaining deficiencies of the -m option in the dumpfs(8)
manual page. Specifically that newfs(8) options -E, -R, -S, and -T
options are not handled and that -p is not useful so is omitted.
Also document that newfs(8) options -n and -r are neither checked
for nor output but should be. The -r flag is needed if the filesystem
uses gjournal(8).
PR: bin/163992
Reported by: Dieter <freebsd@sopwith.solgatos.com>
Submitted by: Andy Kosela <akosela@andykosela.com>
MFC after: 1 week
If an expander returns 0x00 (no device attached) in the ATTACHED DEVICE
field of the SMP DISCOVER response, ignore the value of ATTACHED SAS
ADDRESS, because it is invalid. Some expanders zero out the address
when the attached device is removed, but others do not. Section
9.4.3.10 of the SAS Protocol Layer 2 revision 04b does not require them
to do so.
Approved by: ken (mentor)
MFC after: 3 weeks
been printed. This provides compatibility with other *nix systems
(including Linux).
While here use stdbool booleans for 'all'.
PR: bin/178295
Submitted by: Levent Serinol <lserinol@gmail.com>
Reviewed by: will
This adds the support to the config keyword (vlan operation mode), ports
flags, prints the vlan mode and vlan capabilities. It also adds some basic
information to usage() and support the keyword 'help' as a shortcut to
usage(). The manual page is also updated with the new options.
Submitted by: Luiz Otavio O Souza <loos.br@gmail.com>
Reviewed by: ray
Address. Although KAME implementation used FF02:0:0:0:0:2::/96 based on
older versions of draft-ietf-ipngwg-icmp-name-lookup, it has been changed
in RFC 4620.
The kernel always joins the /104-prefixed address, and additionally does
/96-prefixed one only when net.inet6.icmp6.nodeinfo_oldmcprefix=1.
The default value of the sysctl is 1.
ping6(8) -N flag now uses /104-prefixed one. When this flag is specified
twice, it uses /96-prefixed one instead.
Reviewed by: ume
Based on work by: Thomas Scheffler
PR: conf/174957
MFC after: 2 weeks
names within the std namespace (and possibly within the global
namespace).
The main advantage is that the C++ versions can provide optimized
versions or simplified interfaces.
piece them together from multiple reads(). It's as if /dev/devctl is
a datagram device instead of a stream device. However, devd's
internal buffer was too small (1025 bytes) to read an entire
ereport.fs.zfs.checksum event (variable, up to ~1300 bytes). This
commit enlarges the buffer to 8k.
Reviewed by: imp
Approved by: ken (mentor)
MFC after: 2 weeks
than VLAN groups.
Some chips (eg this rtl8366rb) has a VLAN group per port - you first
define a set of VLANs in a vlan group, then you assign a VLAN group
to a port.
Other chips (eg the AR8xxx switch chips) have a VLAN ID array per
port - there's no group per se, just a list of vlans that can be
configured.
So for now, the switch API will use the latter and rely on drivers
doing the heavy lifting if one wishes to use the VLAN group method.
Maybe later on both can be supported.
PR: kern/177878
PR: kern/177873
Submitted by: Luiz Otavio O Souza <loos.br@gmail.com>
Reviewed by: ray
This compiler flag enforces that that people either mark variables
static or use an external declarations for the variable, similar to how
-Wmissing-prototypes works for functions.
Due to the fact that Yacc/Lex generate code that cannot trivially be
changed to not warn because of this (lots of yy* variables), add a
NO_WMISSING_VARIABLE_DECLARATIONS that can be used to turn off this
specific compiler warning.
Announced on: toolchain@
disks such as SSD's
Adds the ability to run ATA commands via the SCSI ATA Pass-Through(16) comand
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 2 weeks
most kernels before FreeBSD 9.0. Remove such modules and respective kernel
options: atadisk, ataraid, atapicd, atapifd, atapist, atapicam. Remove the
atacontrol utility and some man pages. Remove useless now options ATA_CAM.
No objections: current@, stable@
MFC after: never
fail in nvmecontrol(8).
While here, use consistent checks of return values from stat, open and
ioctl.
Sponsored by: Intel
Suggested by: carl
Reviewed by: carl
invoke it from nvmecontrol(8).
Controller reset will be performed in cases where I/O are repeatedly
timing out, the controller reports an unrecoverable condition, or
when explicitly requested via IOCTL or an nvme consumer. Since the
controller may be in such a state where it cannot even process queue
deletion requests, we will perform a controller reset without trying
to clean up anything on the controller first.
Sponsored by: Intel
Reviewed by: carl
Clang errors around printf could be trivially fixed, but the breakage in
sbin/fsdb were to significant for this type of change.
Submitter of this changeset has been notified and hopefully this can be
restored soon.
that they do not need to be read again in pass5. As this nearly
doubles the memory requirement for fsck, the cache is thrown away
if other memory needs in fsck would otherwise fail. Thus, the
memory footprint of fsck remains unchanged in memory constrained
environments.
This work was inspired by a paper presented at Usenix's FAST '13:
www.usenix.org/conference/fast13/ffsck-fast-file-system-checker
Details of this implementation appears in the April 2013 of ;login:
www.usenix.org/publications/login/april-2013-volume-38-number-2.
A copy of the April 2013 ;login: paper can also be downloaded
from: www.mckusick.com/publications/faster_fsck.pdf.
Reviewed by: kib
Tested by: Peter Holm
MFC after: 4 weeks
running time for a full fsck. It also reduces the random access time
for large files and speeds the traversal time for directory tree walks.
The key idea is to reserve a small area in each cylinder group
immediately following the inode blocks for the use of metadata,
specifically indirect blocks and directory contents. The new policy
is to preferentially place metadata in the metadata area and
everything else in the blocks that follow the metadata area.
The size of this area can be set when creating a filesystem using
newfs(8) or changed in an existing filesystem using tunefs(8).
Both utilities use the `-k held-for-metadata-blocks' option to
specify the amount of space to be held for metadata blocks in each
cylinder group. By default, newfs(8) sets this area to half of
minfree (typically 4% of the data area).
This work was inspired by a paper presented at Usenix's FAST '13:
www.usenix.org/conference/fast13/ffsck-fast-file-system-checker
Details of this implementation appears in the April 2013 of ;login:
www.usenix.org/publications/login/april-2013-volume-38-number-2.
A copy of the April 2013 ;login: paper can also be downloaded
from: www.mckusick.com/publications/faster_fsck.pdf.
Reviewed by: kib
Tested by: Peter Holm
MFC after: 4 weeks
Setting DSCP support is done via O_SETDSCP which works for both
IPv4 and IPv6 packets. Fast checksum recalculation (RFC 1624) is done for IPv4.
Dscp can be specified by name (AFXY, CSX, BE, EF), by value
(0..63) or via tablearg.
Matching DSCP is done via another opcode (O_DSCP) which accepts several
classes at once (af11,af22,be). Classes are stored in bitmask (2 u32 words).
Many people made their variants of this patch, the ones I'm aware of are
(in alphabetic order):
Dmitrii Tejblum
Marcelo Araujo
Roman Bogorodskiy (novel)
Sergey Matveichuk (sem)
Sergey Ryabin
PR: kern/102471, kern/121122
MFC after: 2 weeks
given descriptors, use Capsicum sandboxing for hastd in primary and secondary
modes. Allow for DIOCGDELETE and DIOCGFLUSH ioctls on provider descriptor and
for G_GATE_CMD_MODIFY, G_GATE_CMD_START, G_GATE_CMD_DONE and G_GATE_CMD_DESTROY
on GEOM Gate descriptor.
Sponsored by: The FreeBSD Foundation
more terse output more observable for both scripts and humans.
Also, it shifts hastctl closer to GEOM utilities with their list/status command
pairs.
Approved by: pjd
MFC after: 4 weeks
of upgrading older machines using ataraid(4) to newer releases.
This optional parameter is controlled via kern.geom.raid.legacy_aliases
and will create a /dev/ar0 device that will point at /dev/raid/r0 for
example.
Tested on Dell SC 1425 DDF-1 format software raid controllers installing from
stable/7 and upgrading to stable/9 without having to adjust /etc/fstab
Reviewed by: mav
Obtained from: Yahoo!
MFC after: 2 Weeks
from the tree since few months (please note that the userland bits
were already disconnected since a long time, thus there is no need
to update the OLD* entries).
This is not targeted for MFC.
why it will now be the default.
- Bump protocol version to 2 and add backward compatibility for version 1.
- Allow to specify hosts by kern.hostid as well (in addition to hostname and
kern.hostuuid) in configuration file.
Sponsored by: Panzura
Tested by: trociny
Since ARP and routing are separated, "proxy only" entries
don't have any meaning, thus we don't need additional field
in sockaddr to pass SIN_PROXY flag.
New kernel is binary compatible with old tools, since sizes
of sockaddr_inarp and sockaddr_in match, and sa_family are
filled with same value.
The structure declaration is left for compatibility with
third party software, but in tree code no longer use it.
Reviewed by: ru, andre, net@
heavily used when parsing config files. Mostly these changes avoid making
temporary copies of the strings, and avoid doing byte at a time append
operations, on the most-used code path.
On a 1.2 GHz ARM processor this reduces the time to parse the config files
from 13 to 6 seconds.
Reviewed by: imp
Approved by: cognet (mentor)
their socket connection any time, and devd only notices that when it gets an
error trying to write an event to the client. On a system with no device
change activity, clients could connect and disappear repeatedly without devd
noticing, leading to an ever-growing list of open socket descriptors in devd.
Now devd uses poll(2) looking for POLLHUP on all existing clients every time
a new client connection is established, and also periodically (once a minute)
to proactively find zombie clients and reap the socket descriptors. It also
now has a connection limit, configurable with a new -l <num> command line arg.
When the maximum number of connections is reached it stops accepting new
connections until some current clients drop off.
Reviewed by: imp
Approved by: cognet (mentor)
- Simplify diagnostic messages.
- Adopt lowercase first letters to make the messages
more canonical.
PR: bin/175404
Submitted by: Christoph Mallon
Reviewed by: bde
MFC after: 3 days
It stops treating the address on the interface as special by source
address selection rule even when the interface is outgoing interface.
This is desired in some situation.
Requested by: hrs
Reviewed by: IHANet folks including hrs
MFC after: 1 week