PJDLOG_ASSERT() and PJDLOG_VERIFY() that will check the given condition
and log the problem where appropriate. The difference between those
two is that PJDLOG_VERIFY() always work and PJDLOG_ASSERT() can be
turned off by defining NDEBUG.
MFC after: 1 month
thus don't depend on one_pass flag anymore.
This is a POLA violation, but it is quite difficult to restore
the old behavior with new code. Also, the new behavior matches
behavior of the older "tee" action, and this is more intuitive.
-S option is meant to be "inclusive".
The original issue of the PR was already fixed.
PR: docs/142418
Submitted by: David Naylor (naylor dot b dot david at gmail dot com)
No objection from: kib
MFC after: 5 days
of add verb. Mention about maximum size limit for "freebsd-boot"
partition. It should be smaller than 545 KB (hardcoded in pmbr).
Show usage of SI unit suffixes in example.
Approved by: mav (mentor)
MFC after: 1 week
it to configure the interface. When the script is complete, dhclient
monitors the routing socket and will terminate if its address is
deleted or if its interface is removed or brought down.
Because the routing socket is already open when dhclient-script is
run, dhclient ignores address deletions for 10 seconds after the
script was run.
If the address that will be obtained is already configured on the
interface before dhclient starts, and if dhclient-script takes more
than 10 seconds (perhaps due to dhclient-*-hooks latencies), on script
completion, dhclient will immediately and silently exit when it sees
the RTM_DELADDR routing message resulting from the script reassigning
the address to the interface.
This change logs dhclient's reason for exiting and also changes the
10 second timeout to be effective from completion of dhclient-script
rather than from when it was started.
We now ignore RTM_DELADDR and RTM_NEWADDR messages when the message
contains no interface address (which should not happen) rather than
exiting.
Not reviewed by: brooks (timeout)
MFC after: 3 weeks
directory truncation to proceed before the link has been cleared. This
is accomplished by detecting a directory with no . or .. links and
clearing the named directory entry in the parent.
- Add a new function ino_remref() which handles the details of removing
a reference to an inode as a result of a lost directory. There were
some minor errors in various subcases of this routine.
int.
- Use errx(3) instead of err(3) to print the error message on short
reads in readlabel(). errno won't be set on short reads which can
easily occur here due to the fixed size read request.
PR: 144307
Reviewed by: bde
need. Close the pidfile. Then close all descriptors >= 3 to avoid
information leakage to children.
This solves the problem of not being able to restart devd when you
have, for example, a dhclient forked to configure your network...
MFC after: 3 days
- Use err/errx only when the case is really fatal. For other
cases, fall back to full fsck instead of quiting fsck.
- Plug a memory leak.
- Avoid divide by zero when printing summary.
- Output "FILE SYSTEM IS MARKED CLEAN" when a successful
journal recovering is done.
- When -f is specified, do full fsck instead of journal recovery.
Move code that converts params from humanized numbers to sectors count
to subr.c and adjust comment.
Add post-processing for "size" and "start offset" params in gpart,
now they are properly converted to sectors count with known sector size
that can be greater that 512 bytes.
Also replace "unsigned long long" type to "off_t" for unify code since
it used for medium size in libgeom(3) and DIOCGMEDIASIZE ioctl.
PR: bin/146277
Reviewed by: marcel (previous version)
Approved by: kib (mentor)
MFC after: 1 month
- remove stray argument [1]
- remove stray whitespace
- use canonical wording for the HISTORY section
PR: docs/147119 [1]
Submitted by: Alexander Best <alexbestms@wwu.de> [1]
MFC after: 1 week
we grow more descriptors, but I'll reconsider readding them once we get there.
Passing (a = b) expression to FD_ISSET() is bad idea, as FD_ISSET() evaluates
its argument twice.
Found by: Coverity Prevent
CID: 5243
MFC after: 3 days
- Add information regarding VTOC8 bootrstrap code and how it's handled with
r208777 in place.
- Document the mapping of partition types to VTOC8 tags.
- Add examples for VTOC8 to the respective section.
- Eliminated hard sentence breaks.
Reviewed by: marcel (slightly buggy version)
MFC after: 3 days
file to be of maximum size.
- Add special handling required for SMI/VTOC8 disklabel partcode, i.e. avoid
overwriting the label when writing the bootstrap code to the partition
starting at 0 and install it to all partitions when the -i option is omitted
just like geom_sunlabel(4) and sunlabel(8) do by default.
- Add missing prototypes.
- Add const where applicable.
Reviewed by: marcel
MFC after: 3 days
mount(8): add xref to devfs(5)
devfs(5): change example to something more likely to be useful (it is not
necessary to mount a devfs on /dev manually, but for chroots/jails it is
often needed), mention since when devfs is preferred to device nodes on ufs
PR: 146600
MFC after: 2 weeks
Reported by: Mikolaj Golub <to.my.trociny@gmail.com>
- Only require 256k of blocks per-cg when trying to allocate contiguous
journal blocks. The storage may not actually be contiguous but is at
least within one cg.
- When disabling SUJ leave SU enabled and report this to the user. It
is expected that users will upgrade SU filesystems to SUJ and want
a similar downgrade path.
- Drop bogus quad_t cast for di_gen, it is a 32bit type
- Print di_gen with leading zeros, to get consistent output
Before this change, amd64 would print:
ino 18 gen 616ca2bd
ino 19 gen ffffffff95c2a3ff
ino 20 gen 25c3a3d5
ino 21 gen 8dc1472
ino 22 gen 3797056b
ino 23 gen 1d47853a
ino 24 gen ffffffff82d26995
After the change
ino 18 gen 616ca2bd
ino 19 gen 95c2a3ff
ino 20 gen 25c3a3d5
ino 21 gen 08dc1472
ino 22 gen 3797056b
ino 23 gen 1d47853a
ino 24 gen 82d26995
PR: bin/139994 (sort of)
Reviewed by: mckusick
bottom of the manpages and order them consistently.
GNU groff doesn't care about the ordering, and doesn't even mention
CAVEATS and SECURITY CONSIDERATIONS as common sections and where to put
them.
Found by: mdocml lint run
Reviewed by: ru
- device initiated power management (some devices support only this way);
- Automatic Partial to Slumber Transition (more power saving);
- DMA auto-activation (expected to slightly improve performance).
More features could be added later, when hardware supports.
structure so that we correctly reload. Note that tunefs doesn't
properly detect the need to reload if the disk device is specified
for a read-only mounted filesystem.
- Lessen the contiguity requirement for the journal so that it is more
likely to succeed.
make socket non-blocking, connect() and if we get EINPROGRESS, we have to
wait using select(). Very complex, but I know no other way to define
connection timeout for a given socket.
Reported by: hiroshi@soupacific.com
MFC after: 3 days
secondary, which died between send(2) and recv(2). Do it by adding timeout
to recv(2) for primary incoming and outgoing sockets and secondary outgoing
socket.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com>
Tested by: Mikolaj Golub <to.my.trociny@gmail.com>
MFC after: 3 days
brings in support for an optional intent log which eliminates the need
for background fsck on unclean shutdown.
Sponsored by: iXsystems, Yahoo!, and Juniper.
With help from: McKusick and Peter Holm
interface considers that it hits a fatal error, and will not copyout
the request structure back for _IOW and _IOWR ioctls, keeping them
untouched.
The previous implementation of the SIOCGIFDESCR ioctl intends to
feed the buffer length back to userland. However, if we return
an error, the feedback would be defeated and ifconfig(8) would
trap into an infinite loop.
This commit changes SIOCGIFDESCR to set buffer field to NULL to
indicate the previous ENAMETOOLONG case.
Reported by: bschmidt
MFC after: 2 weeks
Although groff_mdoc(7) gives another impression, this is the ordering
most widely used and also required by mdocml/mandoc.
Reviewed by: ru
Approved by: philip, ed (mentors)
in a device independent manner. Also include an example anticipatory
scheduler, gsched_rr, which gives very nice performance improvements
in presence of competing random access patterns.
This is joint work with Fabio Checconi, developed last year
and presented at BSDCan 2009. You can find details in the
README file or at
http://info.iet.unipi.it/~luigi/geom_sched/
to support various storage boxes which really aren't active-active.
We only write the label on the *first* provider. For all other providers
we just "add" the disk. This also allows for an "add" verb.
A usage implication is that you should specificy the currently active
storage path as the first provider.
Note that this does not add RDAC-like functionality, but better allows for
autovolumefailover configurations (additional checkins elsewhere will support
this).
Sponsored by: Panasas
MFC after: 1 month
if the interface has such capability. The interface
capability flag indicates whether such capability
exists. This approach is much more backward compatible.
Physical device driver changes will be part of another
commit.
Also updated the ifconfig utility to show the LINKSTATE
capability if present.
Reviewed by: rwatson, imp, juli
MFC after: 3 days
ipfw add 100 allow ip from { 1.2.3.4 or 5.6.7.8 }
(note that the above example could be better written as
ipfw add 100 allow dst-ip 1.2.3.4,5.6.7.8
Submitted by: Riccardo Panicucci
dscp as a search key in table lookups;
+ (re)implement a sysctl variable to control the expire frequency of
pipes and queues when they become empty;
+ add 'queue number' as optional part of the flow_id. This can be
enabled with the command
queue X config mask queue ...
and makes it possible to support priority-based schedulers, where
packets should be grouped according to the priority and not some
fields in the 5-tuple.
This is implemented as follows:
- redefine a field in the ipfw_flow_id (in sys/netinet/ip_fw.h) but
without changing the size or shape of the structure, so there are
no ABI changes. On passing, also document how other fields are
used, and remove some useless assignments in ip_fw2.c
- implement small changes in the userland code to set/read the field;
- revise the functions in ip_dummynet.c to manipulate masks so they
also handle the additional field;
There are no ABI changes in this commit.
of ip->ip_tos) in a table. This can be useful to direct traffic to
different pipes/queues according to the DSCP of the packet, as follows:
ipfw add 100 queue tablearg lookup dscp 3 // table 3 maps dscp->queue
This change is a no-op (but harmless) until the two-line kernel
side is committed, which will happen shortly.
We'll start noticing this with the next flag introduced as the lower
32bit are all used.
As this is old code we might need to do a full tree sweep one day, unless
changing our strategy to use a different `API' for getting/setting flags
along with the rest of the statfs data.
While here compare to 0 explicitly [1].
Suggested by: kib [1]
Reviewed by: kib
MFC after: 5 days
- Move check of /dev/ prefix and copy into a function to save code duplication.
This also fixes a bug where the /dev/ prefix could not be used when creating
volumes on the command line.
Tested by: Niclas Zeising <niclas.zeising - at - gmail.com>
size or size-like argument. I.e. "-s 32k" instead of "-s 32768".
Size parsing function has been shamelessly stolen from the truncate(1).
I'm sure many sysadmins out there will appreciate this small
improvement.
MFC after: 1 week
and tested over the past two months in the ipfw3-head branch. This
also happens to be the same code available in the Linux and Windows
ports of ipfw and dummynet.
The major enhancement is a completely restructured version of
dummynet, with support for different packet scheduling algorithms
(loadable at runtime), faster queue/pipe lookup, and a much cleaner
internal architecture and kernel/userland ABI which simplifies
future extensions.
In addition to the existing schedulers (FIFO and WF2Q+), we include
a Deficit Round Robin (DRR or RR for brevity) scheduler, and a new,
very fast version of WF2Q+ called QFQ.
Some test code is also present (in sys/netinet/ipfw/test) that
lets you build and test schedulers in userland.
Also, we have added a compatibility layer that understands requests
from the RELENG_7 and RELENG_8 versions of the /sbin/ipfw binaries,
and replies correctly (at least, it does its best; sometimes you
just cannot tell who sent the request and how to answer).
The compatibility layer should make it possible to MFC this code in a
relatively short time.
Some minor glitches (e.g. handling of ipfw set enable/disable,
and a workaround for a bug in RELENG_7's /sbin/ipfw) will be
fixed with separate commits.
CREDITS:
This work has been partly supported by the ONELAB2 project, and
mostly developed by Riccardo Panicucci and myself.
The code for the qfq scheduler is mostly from Fabio Checconi,
and Marta Carbone and Francesco Magno have helped with testing,
debugging and some bug fixes.
- add static and const where appropriate
- check pointers against NULL
- minor styling nits
- it is actually WARNS=6 clean for non-strict alignment platforms
This is shamelessly stolen from DragonflyBSD and reduces our diff.
PR: bin/140078
Approved by: ed (co-mentor)
- The MACHINE_ARCH check is not exhaustive (missing at least powerpc),
and generally not worth maintaining.
- While here, fix whitespace and ordering of the Makefile
PR: bin/140081
Approved by: ed (co-mentor)
HAST allows to transparently store data on two physically separated machines
connected over the TCP/IP network. HAST works in Primary-Secondary
(Master-Backup, Master-Slave) configuration, which means that only one of the
cluster nodes can be active at any given time. Only Primary node is able to
handle I/O requests to HAST-managed devices. Currently HAST is limited to two
cluster nodes in total.
HAST operates on block level - it provides disk-like devices in /dev/hast/
directory for use by file systems and/or applications. Working on block level
makes it transparent for file systems and applications. There in no difference
between using HAST-provided device and raw disk, partition, etc. All of them
are just regular GEOM providers in FreeBSD.
For more information please consult hastd(8), hastctl(8) and hast.conf(5)
manual pages, as well as http://wiki.FreeBSD.org/HAST.
Sponsored by: FreeBSD Foundation
Sponsored by: OMCnet Internet Service GmbH
Sponsored by: TransIP BV
incomplete as some info doesn't really belong to the structs where it is
defined.
Submitted by: Pedro F. Giffuni <giffunip tutopia com>
Reviewed by: bde
MFC after: 2 weeks
- fix sign-compare issues.
- ANSIfy a couple of functions.
- Remove more duplicate #includes.
- Memory leak found by Coverity on NetBSD.
Submitted by: Pedro F. Giffuni <giffunip tutopia com>
Reviewed by: bde
MFC after: 2 weeks
- C99 initializers.
- Change the default volume label from "NO NAME" to "NO_NAME".
- Set OEM String to "BSD4.4 " following the unnamed spacing convention
in that other OS that suggests "MSWIN4.1"
Also, David Naylor's changes for Clang, mostly changing the signess
of constants.
Submitted by: Pedro F. Giffuni <giffunip tutopia com>
Clang fixes by: David Naylor <naylor.b.david gmail com>
Reviewed by: bde (with some disagreement about Clang issues)
MFC after: 2 weeks
cylinder groups that are created. When the filesystem is first created,
newfs always initialises the first two blocks of inodes, and then in the
UFS1 case will also initialise the remaining inode blocks. The changes in
growfs.c 1.23 broke the initialisation of all inodes, seemingly based on
this implementation detail in newfs(8). The result was that instead of
initialising all inodes, we would actually end up initialising all but the
first two blocks of inodes. If the filesystem was grown into empty
(all-zeros) space then the resulting filesystem was fine, however when
grown onto non-zeroed space the filesystem produced would appear to have
massive corruption on the first fsck after growing.
A test case for this problem can be found in the PR audit trail.
Fix this by once again initialising all inodes in the UFS1 case.
PR: bin/115174
Submitted by: Nate Eldredgei nge cs.hmc.edu
Reviewed by: mjacob
MFC after: 1 month
inodes by cutting back on the number of inodes per cylinder group if
necessary to stay under the limit. For a default (16K block) file
system, this limit begins to take effect for file systems above 32Tb.
This fix is in addition to -r203763 which corrected a problem in the
kernel that treated large inode numbers as negative rather than unsigned.
For a default (16K block) file system, this bug began to show up at a
file system size above about 16Tb.
Reported by: Scott Burns, John Kilburg, Bruce Evans
Followup by: Jeff Roberson
PR: 133980
MFC after: 2 weeks
Since the existing implementation searches ':' backward, a path which
includes ':' could not be mounted. You can now mount such path by
enclosing an IP address by '[]'.
Though we should change to search ':' forward, it will break
'ipv6addr:path' which is currently working. So, it still searches ':'
backward, at least for now.
MFC after: 2 weeks
retrieving individual OIDs. This allows the same list of OIDs to be
passed to sysctl(8) across different systems where particular OIDs may not
exist, and still get as much information as possible from them.
PR: bin/123644
Submitted by: dhw
Approved by: ed (mentor)
MFC after: 2 weeks
In 99% of the cases people just want to recreate device nodes they
removed from /dev. There is no reason to pass the additional "c 0 0"
anymore.
Also slightly improve the manpage. Remove references to non-existent
device names and platforms.
handled when reading from pipes.
- Remove dead code related to the -P option from getvol(). pipein and
pipecmdin are never set at the same time.
PR: bin/121502
Approved by: trasz (mentor)
MFC after: 2 weeks
in format strings.
- Use (void) instead of (void *) when discarding strcat(3) return value.
- Format string fixes to match variable types.
- Change canon() len parameter and getcmd() size parameter type from
int to size_t.
- Style Makefile and increase WARNS to 2.
PR: bin/140061
Submitted by: uqs
Approved by: trasz (mentor)
- Constify geom_config_get() name argument.
- Add void keyword for usage().
- Initialize mdunit to NULL.
- Don't call md_prthumanval() at all if length is NULL.
Approved by: trasz (mentor)
Note that due to e.g. write throttling ('wdrain'), it can stall all the disk
I/O instead of just the device it's configured for. Using it for removable
media is therefore not a good idea.
Reviewed by: pjd (earlier version)
non-digit character.
Due to an issue with rc(8) in a test configuration, ifconfig was being
invoked with the address used again as the width - for example,
ifconfig vlan0 10.0.0.1/10.0.0.1
Prior to this change, that address/width would be interpreted as
10.0.0.1/10.
According to a comment, we cannot safely remove utmpx entries here
anymore. This is because the libc routines may block on file locking. In
an ideal world login(1) should just remove the entries, which is why I'm
disabling this code for now. If it turns out we get lots of stale
entries here, we should figure out a way to deal with that.
in background mode to correct expected inconsistencies that arise
during directory rename (see immediately previous update to this
file for details). If run on a kernel without the new functionality,
background fsck will simply ignore these inconsistencies rather
than fail.
Reported by: jeff
states. First its new name will be created causing it to have two
names (from possibly different parents). Next, if it has different
parents, its value of ".." will be changed from pointing to the old
parent to pointing to the new parent. Concurrently, its old name
will be removed bringing it back into a consistent state. When fsck
encounters an extra name for a directory, it offers to remove the
"extraneous hard link"; when it finds that the names have been
changed but the update to ".." has not happened, it offers to rewrite
".." to point at the correct parent. Both of these changes were
considered unexpected so would cause fsck in preen mode or fsck in
background mode to fail with the need to run fsck manually to fix
these problems.
This update changes these errors to be expected so that in preen
mode fsck will simply fix these transitional errors. For now,
background fsck will note these errors, but will need additional
kernel support to fix them, so will simply ignore them rather than
fail. A future update will allow background fsck to fix these
problems.
Reported by: jeff
cylinder group of a UFS1 filesystem as bad. The error was in the check
and not in the cylinder group itself. So even though fsck fixed the
cylinder group correctly, it was still endlessly reported as bad.
PR: 141992
MFC after: 2 weeks
Reported by: Dan Strick
when trees were big and FAST mode was enabled by default.
So small block size doesn't benefits linear I/O operations in FAST and
significantly slowdowns in ECONOMIC (default) mode. For single stream random
I/Os so small block doesn't give much benefits, as access time is usually
bigger then transfer time there. Same time it requires all heads to seek
together for every single request, reducing performance on parallel load.
(sblock.fs_magic == FS_UFS1_MAGIC) case, so the check within the
loop is redundant.
Submitted by: Nate Eldredge nge cs.hmc.edu
Reviewed by: mjacob
Approved by: ed (mentor)
MFC after: 1 month
I was considering committing all these patches one by one, but as
discussed with brooks@, there is no need to do this. If we ever
need/want to merge these changes back, it is still possible to do this
per application.
Fix some wrong usages.
Note: this does not affect generated binaries as this argument is not used.
PR: 137213
Submitted by: Eygene Ryabinkin (initial version)
MFC after: 1 month
regulatory domain with the "country" parameter, but will also take a full
country name. The man page warns that only the ISO code is unambiguous.
In reality, however, the first match on either would be accepted, leading
to "DE" being interpreted as the "DEBUG" country rather than Germany, and
"MO" selecting Morocco rather than the correct country, Macau.
Fix this by always checking for an ISO CC match first, and only search on
the full country name if that fails.
PR: bin/140571
Tested by: Dirk Meyer dirk.meyer dinoex.sub.org
Reviewed by: sam
Approved by: ed (mentor)
MFC after: 1 month
lookup {dst-ip|src-ip|dst-port|src-port|uid|jail} N
which searches the specified field in table N and sets tablearg
accordingly.
With dst-ip or src-ip the option replicates two existing options.
When used with other arguments, the option can be useful to
quickly dispatch traffic based on other fields.
Work supported by the Onelab project.
MFC after: 1 week
"split" is very ineffective for devices with rotating media as HDDs.
To be effective, it needs that transfer time reduction due to block
splitting was bigger then access time increase due to non-sequential
access. For modern HDDs I was able to reproduce it only with read sizes
of 2MB and above, which is almost not applicable in real life.
"load" algorithm same time is more universal and effective now.
Reviewed by: pjd
it seems that now it is necessary for 'forward' to work outside lo0.
The bug (and fix) was reported on 8.0. This patch probably applies
to RELENG_7 as well.
It seems that 'pf' has a similar bug.
Submitted by: Lytochkin Boris
MFC after: 3 days
Introduce ATA_CAM kernel option, turning ata(4) controller drivers into
cam(4) interface modules. When enabled, this options deprecates all ata(4)
peripheral drivers (ad, acd, ...) and interfaces and allows cam(4) drivers
(ada, cd, ...) and interfaces to be natively used instead.
As side effect of this, ata(4) mode setting code was completely rewritten
to make controller API more strict and permit above change. While doing
this, SATA revision was separated from PATA mode. It allows DMA-incapable
SATA devices to operate and makes hw.ata.atapi_dma tunable work again.
Also allow ata(4) controller drivers (except some specific or broken ones)
to handle larger data transfers. Previous constraint of 64K was artificial
and is not really required by PCI ATA BM specification or hardware.
Submitted by: nwitehorn (powerpc part)
logwtmp() gets called with the raw strings that are written to disk. For
regular user entries, this isn't too bad, but when booting/shutting
down, the contents get rather cryptic.
Just call the standardized pututxline().
long as I remember, and completely superseded by better maintained umass(4).
It's main idea was to optionally avoid CAM dependency for such devices, but
with move ATA to CAM, it is not actual any more.
No objections: hselasky@, thompsa@, arch@
- Extend XPT-SIM transfer settings control API. Now it allows to report to
SATA SIM number of tags supported by each device, implement ATA mode and
SATA revision negotiation for both SATA and PATA SIMs.
- Make ahci(4) and siis(4) to use submitted maximum tag number, when
scheduling requests. It allows to support NCQ on devices with lower tags
count then controller supports.
- Make PMP driver to report attached devices connection speeds.
- Implement ATA mode negotiation between user settings, device and
controller capabilities.