camcontrol.c: In buildbusdevlist(), don't attempt to call
getdevid() for an unconfigured device, even when the
verbose flag is set. The cam_open_btl() call will almost
certainly fail.
Probe for the buffer size when issuing the XPT_GDEV_ADVINFO
CCB. Probing for the buffer size first helps us avoid
allocating the maximum buffer size when it really may not
be necessary. This also helps avoid errors from
cam_periph_mapmem() if we attempt to map more than MAXPHYS.
cam_periph.c: In cam_periph_mapmem(), if the XPT_GDEV_ADVINFO CCB
shows a bufsiz of 0, we don't have anything to map,
so just return.
Also, set the maximum mapping size to MAXPHYS
instead of DFLTPHYS for XPT_GDEV_ADVINFO CCBs,
since they don't actually go down to the hardware.
scsi_pass.c: Don't bother mapping the buffer in XPT_GDEV_ADVINFO
CCBs if bufsiz is 0.
isn't configurable in a meaningful way. This way ifconfig(8) and
other tools don't need code changes whenever IFT_USB-like interfaces are
registered in the interface list.
Reviewed by: brooks
No objections: gavin, jkim
This includes support in the kernel, camcontrol(8), libcam and the mps(4)
driver for SMP passthrough.
The CAM SCSI probe code has been modified to fetch Inquiry VPD page 0x00
to determine supported pages, and will now fetch page 0x83 in addition to
page 0x80 if supported.
Add two new CAM CCBs, XPT_SMP_IO, and XPT_GDEV_ADVINFO. The SMP CCB is
intended for SMP requests and responses. The ADVINFO is currently used to
fetch cached VPD page 0x83 data from the transport layer, but is intended
to be extensible to fetch other types of device-specific data.
SMP-only devices are not currently represented in the CAM topology, and so
the current semantics are that the SIM will route SMP CCBs to either the
addressed device, if it contains an SMP target, or its parent, if it
contains an SMP target. (This is noted in cam_ccb.h, since it will change
later once we have the ability to have SMP-only devices in CAM's topology.)
smp_all.c,
smp_all.h: New helper routines for SMP. This includes
SMP request building routines, response parsing
routines, error decoding routines, and structure
definitions for a number of SMP commands.
libcam/Makefile: Add smp_all.c to libcam, so that SMP functionality
is available to userland applications.
camcontrol.8,
camcontrol.c: Add smp passthrough support to camcontrol. Several
new subcommands are now available:
'smpcmd' functions much like 'cmd', except that it
allows the user to send generic SMP commands.
'smprg' sends the SMP report general command, and
displays the decoded output. It will automatically
fetch extended output if it is available.
'smppc' sends the SMP phy control command, with any
number of potential options. Among other things,
this allows the user to reset a phy on a SAS
expander, or disable a phy on an expander.
'smpmaninfo' sends the SMP report manufacturer
information and displays the decoded output.
'smpphylist' displays a list of phys on an
expander, and the CAM devices attached to those
phys, if any.
cam.h,
cam.c: Add a status value for SMP errors
(CAM_SMP_STATUS_ERROR).
Add a missing description for CAM_SCSI_IT_NEXUS_LOST.
Add support for SMP commands to cam_error_string().
cam_ccb.h: Rename the CAM_DIR_RESV flag to CAM_DIR_BOTH. SMP
commands are by nature bi-directional, and we may
need to support bi-directional SCSI commands later.
Add the XPT_SMP_IO CCB. Since SMP commands are
bi-directional, there are pointers for both the
request and response.
Add a fill routine for SMP CCBs.
Add the XPT_GDEV_ADVINFO CCB. This is currently
used to fetch cached page 0x83 data from the
transport layer, but is extensible to fetch many
other types of data.
cam_periph.c: Add support in cam_periph_mapmem() for XPT_SMP_IO
and XPT_GDEV_ADVINFO CCBs.
cam_xpt.c: Add support for executing XPT_SMP_IO CCBs.
cam_xpt_internal.h: Add fields for VPD pages 0x00 and 0x83 in struct
cam_ed.
scsi_all.c: Add scsi_get_sas_addr(), a function that parses
VPD page 0x83 data and pulls out a SAS address.
scsi_all.h: Add VPD page 0x00 and 0x83 structures, and a
prototype for scsi_get_sas_addr().
scsi_pass.c: Add support for mapping buffers in XPT_SMP_IO and
XPT_GDEV_ADVINFO CCBs.
scsi_xpt.c: In the SCSI probe code, first ask the device for
VPD page 0x00. If any VPD pages are supported,
that page is required to be implemented. Based on
the response, we may probe for the serial number
(page 0x80) or device id (page 0x83).
Add support for the XPT_GDEV_ADVINFO CCB.
sys/conf/files: Add smp_all.c.
mps.c: Add support for passing in a uio in mps_map_command(),
so we can map a S/G list at once.
Add support for SMP passthrough commands in
mps_data_cb(). SMP is a special case, because the
first buffer in the S/G list is outbound and the
second buffer is inbound.
Add support for warning the user if the busdma code
comes back with more buffers than will work for the
command. This will, for example, help the user
determine why an SMP command failed if busdma comes
back with three buffers.
mps_pci.c: Add sys/uio.h.
mps_sas.c: Add the SAS address and the parent handle to the
list of fields we pull from device page 0 and cache
in struct mpssas_target. These are needed for SMP
passthrough.
Add support for the XPT_SMP_IO CCB. For now, this
CCB is routed to the addressed device if it supports
SMP, or to its parent if it does not and the parent
does. This is necessary because CAM does not
currently support SMP-only nodes in the topology.
Make SMP passthrough support conditional on
__FreeBSD_version >= 900026. This will make it
easier to MFC this change to the driver without
MFCing the CAM changes as well.
mps_user.c: Un-staticize mpi_init_sge() so we can use it for
the SMP passthrough code.
mpsvar.h: Add a uio and iovecs into struct mps_command for
SMP passthrough commands.
Add a cm_max_segs field to struct mps_command so
that we can warn the user if busdma comes back with
too many segments.
Clear the cm_reply when a command gets freed. If
it is not cleared, reply frames will eventually get
freed into the pool multiple times and corrupt the
pool. (This fix is from scottl.)
Add a prototype for mpi_init_sge().
sys/param.h: Bump __FreeBSD_version to 900026 for the
inclusion of the XPT_GDEV_ADVINFO and XPT_SMP_IO
CAM CCBs.
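The XPT_SMP_IO routing rule described for mps_sas.c above (addressed
device if it supports SMP, otherwise its parent if the parent does) can be
sketched as follows. This is only an illustration: struct sas_node and
smp_route() are hypothetical names, not the driver's real data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a SAS target in the topology. */
struct sas_node {
	bool has_smp_target;		/* node contains an SMP target */
	struct sas_node *parent;	/* parent node, or NULL at the top */
};

/*
 * Pick the node that should receive an SMP request addressed to "dev":
 * the device itself if it has an SMP target, otherwise its parent if the
 * parent has one, otherwise NULL (the request cannot be routed).
 */
static struct sas_node *
smp_route(struct sas_node *dev)
{
	assert(dev != NULL);
	if (dev->has_smp_target)
		return (dev);
	if (dev->parent != NULL && dev->parent->has_smp_target)
		return (dev->parent);
	return (NULL);
}
```

With this rule a request addressed to a disk behind an expander ends up at
the expander, which is the behaviour the text describes until SMP-only
nodes can be represented in CAM's topology.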
does restore them only when -l option is specified [1]. Make number of
entries field in backup format optional. Document -l and -r options of
`gpart show` action.
Suggested by: pjd [1]
MFC after: 1 week
path id for enumerating the available busses. Previously camcontrol was
implicitly passing 0 as the first path id, which meant that if bus 0 was not
present camcontrol would fail with EINVAL instead of rescanning/resetting any
busses that were present.
Approved by: emaste (mentor)
MFC after: 1 week
in a comma delimited list instead of repeating "mediaopt" for each one.
This matches how the options of the active media are printed with
print_media_word() and brings us in line with what NetBSD does.
MFC after: 2 weeks
This fixes verbose mode when either -i specified non-existent kldfile
id, or the file was unloaded between two kldnext(2) calls.
While there, fix printfile() definition to be style(9)-compliant.
Submitted by: arundel
MFC after: 1 week
the "sockarg" ipfw option matches packets associated to
a local socket and with a non-zero so_user_cookie value.
The value is made available as tablearg, so it can be used
as a skipto target or pipe number in ipfw/dummynet rules.
Code by Paul Joe, manpage by me.
Submitted by: Paul Joe
MFC after: 1 week
races - in this case a keepalive packet was sent from the wrong thread, which
led to the connection being dropped because of a corrupted packet.
Fix it by sending keepalive packets directly from the send thread.
As a bonus we now send keepalive packets only when connection is idle.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com>
MFC after: 3 days
big sector size. When gctl error is set gctl_has_param() always returns
'false', which prevents geli(8) from finding some arguments and also masks
an error, which is generated in such a case.
MFC after: 3 days
This was needed for recover implementation.
Implement the recover command for GPT. Now GPT will be marked as
corrupt when any of three types of corruption is detected:
1. Damaged primary GPT header or table
2. Damaged secondary GPT header or table
3. Secondary header is not located in the last LBA
A GPT marked as corrupt becomes read-only. Any changes to a corrupt table
are prohibited. Only the "destroy" and "recover" commands are allowed.
Discussed with: geom@ (mostly silence)
Tested by: Ilya A. Arhipov
Approved by: mav (mentor)
MFC after: 2 weeks
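The policy above can be sketched as a small predicate. This is a sketch
under assumptions: the flag names and gpt_modify_allowed() are
hypothetical and do not come from the GEOM code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flags, one per kind of corruption listed above. */
enum gpt_damage {
	GPT_PRIMARY_BAD		= 1 << 0,	/* primary header or table */
	GPT_SECONDARY_BAD	= 1 << 1,	/* secondary header or table */
	GPT_SECONDARY_MISPLACED	= 1 << 2	/* secondary not in last LBA */
};

/* Any single kind of damage is enough to mark the table corrupt. */
static bool
gpt_is_corrupt(unsigned damage)
{
	return (damage != 0);
}

/*
 * Decide whether a modifying verb may run.  On a corrupt table only
 * "destroy" and "recover" are let through; a healthy table allows all.
 */
static bool
gpt_modify_allowed(unsigned damage, const char *verb)
{
	assert(verb != NULL);
	if (!gpt_is_corrupt(damage))
		return (true);
	return (strcmp(verb, "destroy") == 0 || strcmp(verb, "recover") == 0);
}
```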
initialize all the data. This is a huge waste of time and resources if
there were no writes yet, as there is no real data to synchronize.
Optimize this by sending "virgin" argument to secondary, which gives it a hint
that synchronization is not needed.
In the common case (where both nodes are configured at the same time) instead
of synchronizing everything, we don't synchronize at all.
MFC after: 1 week
It's a bit more pedantic regarding .Bl list elements. This has an added
benefit of unbreaking the ipfw(8) manpage, where groff was silently
skipping one list element.
Before this change if you wanted to suspend your laptop and be sure that your
encryption keys are safe, you had to stop all processes that use file system
stored on encrypted device, unmount the file system and detach geli provider.
This isn't very handy. If you are a lucky user of a laptop where suspend/resume
actually works with FreeBSD (I'm not!) you most likely want to suspend your
laptop, because you don't want to start everything over again when you turn
your laptop back on.
And this is where geli suspend/resume steps in. When you execute:
# geli suspend -a
geli will wait for all in-flight I/O requests, suspend new I/O requests, remove
all geli sensitive data from the kernel memory (like encryption keys) and will
wait for either 'geli resume' or 'geli detach'.
Now with no keys in memory you can suspend your laptop without stopping any
processes or unmounting any file systems.
When you resume your laptop you have to resume geli devices using 'geli resume'
command. You need to provide your passphrase, etc. again so the keys can be
restored and suspended I/O requests released.
Of course you need to remember that 'geli suspend' won't clear file system
cache and other places where data from your geli-encrypted file system might be
present. But to get rid of those, stopping processes and unmounting file systems
won't help either - you have to turn your laptop off. Be warned.
Also note that suspending the geli device which contains the file system with the
geli utility (or anything used by 'geli resume') is not a very good idea, as you
won't be able to resume it - when you execute geli(8), the kernel will try to read it
and this read I/O request will be suspended.
error messages, so when we clean up after child process, we have to check if
the event socketpair is still there.
Submitted by: Mikolaj Golub <to.my.trociny@gmail.com>
MFC after: 3 days
I'm unable to reproduce the race described in comment anymore and also the
comment is incorrect - localfd represents local component from configuration
file, eg. /dev/da0 and not HAST provider.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com>
MFC after: 1 week
masking it.
This fixes bogus reports about hooks running for too long and other problems
related to garbage-collecting child processes.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com>
MFC after: 3 days
This is especially useful for things like installers, where regular
geli prompt can't be used.
- Add support for specifying multiple -K or -k options, so there is no
need to cat all keyfiles and read them from standard input.
Requested by: Kris Moore <kris@pcbsd.org>, thompsa
MFC after: 2 weeks
Large (60GB) filesystems created using "newfs -U -O 1 -b 65536 -f 8192"
show incorrect results from "df" for free and used space when mounted
immediately after creation. fsck on the new filesystem (before ever
mounting it once) gives a "SUMMARY INFORMATION BAD" error in phase 5.
This error hasn't occurred in any runs of fsck immediately after
"newfs -U -b 65536 -f 8192" (leaving out the "-O 1" option).
Solution:
The default UFS1 superblock is located at offset 8K in the filesystem
partition; the default UFS2 superblock is located at offset 64K in
the filesystem partition. For UFS1 filesystems with a blocksize of
64K, the first alternate superblock resides at 64K which is the
location used for the default UFS2 superblock. By default, the
system first checks for a valid superblock at the default location
for a UFS2 filesystem. For a UFS1 filesystem with a blocksize of
64K, there is a valid UFS1 superblock at this location. Thus, even
though it is expected to be a backup superblock, the system will
use it as its default superblock. So, we have to ensure that all the
statistics on usage are correct in this first alternate superblock
as it is the superblock that will actually be used.
While tracking down this problem, another limitation of UFS1 became
evident. For UFS1, the number of inodes per cylinder group is stored
in an int16_t. Thus the maximum number of inodes per cylinder group
is limited to 2^15 - 1. This limit can easily be exceeded for block
sizes of 32K and above. Thus when building UFS1 filesystems, newfs
must limit the number of inodes per cylinder group to 2^15 - 1.
Reported by: Guy Helmer<ghelmer@palisadesys.com>
Followup by: Bruce Cran <brucec@freebsd.org>
PR: 107692
MFC after: 4 weeks
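The inode limit described above amounts to a clamp. The following is a
minimal sketch, not the newfs code: clamp_ipg() is a hypothetical helper,
and the real calculation may also round to other boundaries.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value an int16_t field can hold: 2^15 - 1 = 32767. */
#define	UFS1_MAX_IPG	INT16_MAX

/*
 * Limit the requested inodes per cylinder group for UFS1, where the
 * count is stored in an int16_t.  UFS2 values are passed through.
 */
static uint32_t
clamp_ipg(uint32_t wanted_ipg, int is_ufs1)
{
	assert(wanted_ipg > 0);
	if (is_ufs1 && wanted_ipg > (uint32_t)UFS1_MAX_IPG)
		return ((uint32_t)UFS1_MAX_IPG);
	return (wanted_ipg);
}
```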
This option is not passed to the kernel; it is handled in user space.
With the -F option, gpart creates a new "delete" request for each
partition in the table. Each request has flags="X", which disables the
auto-commit feature. The last request is the original "destroy" request.
It has its own flags and can have auto-commit disabled or enabled.
If an error occurs while deleting partitions, a new "undo" request
is created and all changes are rolled back.
Approved by: kib (mentor)
This way the primary process inherits signal mask from the main process,
which fixes a race where signal is delivered to the primary process before
configuring signal mask.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com>
MFC after: 3 days
while the main process sends control message to the worker process, but worker
process hasn't started control thread yet, because it waits for reply from the
main process.
The fix is to start the control thread before sending any events.
Reported and fix suggested by: Mikolaj Golub <to.my.trociny@gmail.com>
MFC after: 3 days
to growing the filesystem.
Refuse to attach providers where the metadata provider size is
wrong. This makes post-boot attaches behave consistently with
pre-boot attaches. Also refuse to restore metadata to a provider
of the wrong size without the new -f switch. The new -f switch
forces the metadata restoration despite the provider size, and
updates the provider size in the restored metadata to the correct
value.
Helped by: pjd
Reviewed by: pjd
Casting from (char *) to (struct ufs1_dinode *) changes the
alignment requirement of the pointer and GCC does not know that
the pointer is adequately aligned (due to malloc(3)), and warns
about it. Cast to (void *) first to by-pass the check.
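A minimal sketch of the cast described above. struct fake_dinode is a
hypothetical stand-in for struct ufs1_dinode, and roundtrip_mode() exists
only to show the pattern.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the on-disk inode; only alignment matters. */
struct fake_dinode {
	uint64_t size;
	uint32_t mode;
};

/*
 * Allocate a byte buffer and view its start as a struct.  malloc(3)
 * returns storage aligned for any object type, so the access is safe;
 * the intermediate (void *) only keeps the compiler from warning about
 * a direct (char *) to (struct fake_dinode *) cast.
 */
static uint32_t
roundtrip_mode(uint32_t mode)
{
	char *buf = malloc(sizeof(struct fake_dinode));
	struct fake_dinode *dp;
	uint32_t result;

	assert(buf != NULL);
	dp = (struct fake_dinode *)(void *)buf;
	dp->size = 0;
	dp->mode = mode;
	result = dp->mode;
	free(buf);
	return (result);
}
```

The cast silences the warning but does not add any alignment guarantee of
its own, so it is only appropriate when the buffer is known to be suitably
aligned, as it is here.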
into un-zeroed storage.
The original patch was questioned by Kirk as it forces the filesystem
to do excessive work initialising inodes on first use, and was never
MFC'd. This change mimics the newfs(8) approach of zeroing two
blocks of inodes for each new cylinder group.
Reviewed by: mckusick
MFC after: 3 weeks
allow the option to be specified multiple times. This will help to
implement things like passing multiple keyfiles to geli(8) instead of
cat(1)ing them all into stdin and reading from there using one '-k -'
option.
understand everything correctly, we don't really need it.
- Provide default numeric value as strings. This allows to simplify
a lot of code.
- Bump version number.
upper layer. Until now, unionfs refused to use that kind of
file system as the upper layer. This change allows it, so you
can now use a file system that does not support whiteouts
(tmpfs in particular) as the upper layer. This is very useful
for combining tmpfs as the upper layer with a read-only file
system as the lower layer.
By definition, without whiteout support from the file system
backing the upper layer, there is no way that delete and rename
operations on lower layer objects can be done. EOPNOTSUPP, as
generated by VOP_WHITEOUT(), is returned for these operations
and for any others that would modify the lower layer, such as
chmod(1).
This change is suggested by ed.
Submitted by: ed
limited to async-signal safe functions in the child process), move all hooks
execution to the main (non-threaded) process.
Do it by maintaining connection (socketpair) between child and parent
and sending events from the child to parent, so it can execute the hook.
This is a step in the right direction for other reasons too. For example, it
removes one obstacle to dropping privileges in worker processes.
MFC after: 2 weeks
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
This fixes various races and eliminates use of pthread* API in signal handler.
Pointed out by: kib
With help from: jilles
MFC after: 2 weeks
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
function to make code more readable.
- Be sure not to reconnect too often in case of signal delivery, etc.
MFC after: 2 weeks
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
process, once it starts to use hooks.
- Add hook_check_one() in case the caller expects different child processes
and once it can recognize it, it will pass pid and status to hook_check_one().
MFC after: 2 weeks
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
- ad0 was referred to as da0
- wrong parameter -s instead of -a in example
- use double quotes consistently
PR: docs/150082
Submitted by: N.J. Mann <njm@njm.me.uk>
MFC after: 2 weeks
- Keep all hooks we're running in a global list, so we can report when
they finish and also report when they are running for too long.
MFC after: 2 weeks
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
node failures quickly for HAST resources that are rarely modified.
Remove XXX from a comment now that the guard thread never sleeps infinitely.
MFC after: 2 weeks
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
not available. This improves error reporting when bsdlabel(8) is unable
to open a device for writing. If GEOM_BSD was unavailable, only a rather
obscure error message "Class not found" was printed.
PR: bin/58390
Reviewed by: ae
Discussed with: marcel
MFC after: 1 month
- Increase target limit from 4 to 64; this limit will be removed entirely
at a later time.
- Improve recovery from lost network connections.
- Fix some potential deadlocks and a serious memory leak.
- Fix incorrect use of MH_ALIGN (instead of M_ALIGN), which makes no
practical difference, but triggers a KASSERT with INVARIANTS.
- Fix some warnings in iscontrol(8) and improve the man page somewhat.
Submitted by: Daniel Braniss <danny@cs.huji.ac.il>
Sponsored by: Dansk Scanning A/S, Data Robotics Inc.
use a different interface type (IFT_L2VLAN vs IFT_ETHER). Treat IFT_L2VLAN
interfaces like IFT_ETHER interfaces when handling link layer addresses.
Reviewed by: syrinx (bsnmpd)
MFC after: 1 week
- Load added resources.
- Stop and forget removed resources.
- Update modified resources in least intrusive way, ie. don't touch
/dev/hast/<name> unless path to local component or provider name were
modified.
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
MFC after: 1 month
- Don't exit on errors if not requested.
- Don't keep configuration in global variable, but allocate memory for
configuration.
- Call yyrestart() before yyparse() so that on error in configuration file
we will start from the beginning next time and not from the place we left off.
MFC after: 1 month
PJDLOG_ASSERT() and PJDLOG_VERIFY() that will check the given condition
and log the problem where appropriate. The difference between those
two is that PJDLOG_VERIFY() always works and PJDLOG_ASSERT() can be
turned off by defining NDEBUG.
MFC after: 1 month
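The difference between the two kinds of check can be sketched as below.
MY_VERIFY(), MY_ASSERT() and check_failed() are hypothetical names used
for illustration; this sketch only counts and logs failures, and is not
the pjdlog implementation.

```c
#include <assert.h>	/* assert(3) is the model for MY_ASSERT() */
#include <stdio.h>

static int check_failures;	/* number of failed checks seen so far */

/* Record and report one failed check. */
static void
check_failed(const char *expr, const char *file, int line)
{
	check_failures++;
	fprintf(stderr, "check failed: (%s) at %s:%d\n", expr, file, line);
}

/* Always evaluated, even in builds with NDEBUG defined. */
#define	MY_VERIFY(cond) do {						\
	if (!(cond))							\
		check_failed(#cond, __FILE__, __LINE__);		\
} while (0)

/* Compiled out when NDEBUG is defined, like assert(3). */
#ifdef NDEBUG
#define	MY_ASSERT(cond)	((void)0)
#else
#define	MY_ASSERT(cond)	MY_VERIFY(cond)
#endif
```

The always-on variant suits conditions that must hold in production; the
other one suits checks that are too expensive or too noisy for release
builds.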
thus don't depend on one_pass flag anymore.
This is a POLA violation, but it is quite difficult to restore
the old behavior with new code. Also, the new behavior matches
behavior of the older "tee" action, and this is more intuitive.
-S option is meant to be "inclusive".
The original issue of the PR was already fixed.
PR: docs/142418
Submitted by: David Naylor (naylor dot b dot david at gmail dot com)
No objection from: kib
MFC after: 5 days
of add verb. Mention about maximum size limit for "freebsd-boot"
partition. It should be smaller than 545 KB (hardcoded in pmbr).
Show usage of SI unit suffixes in example.
Approved by: mav (mentor)
MFC after: 1 week
it to configure the interface. When the script is complete, dhclient
monitors the routing socket and will terminate if its address is
deleted or if its interface is removed or brought down.
Because the routing socket is already open when dhclient-script is
run, dhclient ignores address deletions for 10 seconds after the
script was run.
If the address that will be obtained is already configured on the
interface before dhclient starts, and if dhclient-script takes more
than 10 seconds (perhaps due to dhclient-*-hooks latencies), on script
completion, dhclient will immediately and silently exit when it sees
the RTM_DELADDR routing message resulting from the script reassigning
the address to the interface.
This change logs dhclient's reason for exiting and also changes the
10 second timeout to be effective from completion of dhclient-script
rather than from when it was started.
We now ignore RTM_DELADDR and RTM_NEWADDR messages when the message
contains no interface address (which should not happen) rather than
exiting.
Not reviewed by: brooks (timeout)
MFC after: 3 weeks
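The revised timeout can be sketched as a comparison against the script's
completion time. ignore_deladdr() is a hypothetical helper, and whether
the real check is inclusive at exactly 10 seconds is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define	DELADDR_GRACE_SECS	10	/* the 10 second window from the text */

/*
 * Return true while address deletions should still be ignored, counting
 * from the moment dhclient-script finished rather than when it started.
 */
static bool
ignore_deladdr(time_t script_done, time_t now)
{
	assert(now >= script_done);
	return ((now - script_done) < DELADDR_GRACE_SECS);
}
```

Measuring from completion means a slow script no longer uses up the whole
window before it has even reassigned the address.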
directory truncation to proceed before the link has been cleared. This
is accomplished by detecting a directory with no . or .. links and
clearing the named directory entry in the parent.
- Add a new function ino_remref() which handles the details of removing
a reference to an inode as a result of a lost directory. There were
some minor errors in various subcases of this routine.
int.
- Use errx(3) instead of err(3) to print the error message on short
reads in readlabel(). errno won't be set on short reads which can
easily occur here due to the fixed size read request.
PR: 144307
Reviewed by: bde
need. Close the pidfile. Then close all descriptors >= 3 to avoid
information leakage to children.
This solves the problem of not being able to restart devd when you
have, for example, a dhclient forked to configure your network...
MFC after: 3 days
- Use err/errx only when the case is really fatal. For other
cases, fall back to full fsck instead of quitting fsck.
- Plug a memory leak.
- Avoid divide by zero when printing summary.
- Output "FILE SYSTEM IS MARKED CLEAN" when journal
recovery completes successfully.
- When -f is specified, do full fsck instead of journal recovery.
Move the code that converts params from humanized numbers to a sector count
to subr.c and adjust the comment.
Add post-processing for the "size" and "start offset" params in gpart;
they are now properly converted to a sector count using the known sector size,
which can be greater than 512 bytes.
Also replace the "unsigned long long" type with "off_t" to unify the code, since
it is used for the medium size in libgeom(3) and the DIOCGMEDIASIZE ioctl.
PR: bin/146277
Reviewed by: marcel (previous version)
Approved by: kib (mentor)
MFC after: 1 month
- remove stray argument [1]
- remove stray whitespace
- use canonical wording for the HISTORY section
PR: docs/147119 [1]
Submitted by: Alexander Best <alexbestms@wwu.de> [1]
MFC after: 1 week
we grow more descriptors, but I'll reconsider readding them once we get there.
Passing an (a = b) expression to FD_ISSET() is a bad idea, as FD_ISSET() evaluates
its argument twice.
Found by: Coverity Prevent
CID: 5243
MFC after: 3 days
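Why an argument with side effects is risky in such a macro can be shown
with a small sketch. IN_RANGE() is a hypothetical macro that mentions its
argument twice; it is not FD_ISSET() itself, and a counted function call
stands in for the assignment.

```c
#include <assert.h>

/* Hypothetical macro that mentions its argument more than once. */
#define	IN_RANGE(n)	((n) >= 0 && (n) < 64)

static int calls;		/* how many times next_fd() has run */

/* Function with a visible side effect. */
static int
next_fd(void)
{
	calls++;
	return (5);
}

/* Side effect inside the macro argument: it runs once per mention. */
static int
unsafe_check(void)
{
	calls = 0;
	(void)IN_RANGE(next_fd());
	return (calls);
}

/* Evaluate once into a local variable, then test the variable. */
static int
safe_check(void)
{
	int fd;

	calls = 0;
	fd = next_fd();
	(void)IN_RANGE(fd);
	return (calls);
}
```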
- Add information regarding VTOC8 bootstrap code and how it's handled with
r208777 in place.
- Document the mapping of partition types to VTOC8 tags.
- Add examples for VTOC8 to the respective section.
- Eliminated hard sentence breaks.
Reviewed by: marcel (slightly buggy version)
MFC after: 3 days
file to be of maximum size.
- Add special handling required for SMI/VTOC8 disklabel partcode, i.e. avoid
overwriting the label when writing the bootstrap code to the partition
starting at 0 and install it to all partitions when the -i option is omitted
just like geom_sunlabel(4) and sunlabel(8) do by default.
- Add missing prototypes.
- Add const where applicable.
Reviewed by: marcel
MFC after: 3 days
mount(8): add xref to devfs(5)
devfs(5): change example to something more likely to be useful (it is not
necessary to mount a devfs on /dev manually, but for chroots/jails it is
often needed), mention since when devfs is preferred to device nodes on ufs
PR: 146600
MFC after: 2 weeks
Reported by: Mikolaj Golub <to.my.trociny@gmail.com>
- Only require 256k of blocks per-cg when trying to allocate contiguous
journal blocks. The storage may not actually be contiguous but is at
least within one cg.
- When disabling SUJ leave SU enabled and report this to the user. It
is expected that users will upgrade SU filesystems to SUJ and want
a similar downgrade path.
- Drop bogus quad_t cast for di_gen, it is a 32bit type
- Print di_gen with leading zeros, to get consistent output
Before this change, amd64 would print:
ino 18 gen 616ca2bd
ino 19 gen ffffffff95c2a3ff
ino 20 gen 25c3a3d5
ino 21 gen 8dc1472
ino 22 gen 3797056b
ino 23 gen 1d47853a
ino 24 gen ffffffff82d26995
After the change
ino 18 gen 616ca2bd
ino 19 gen 95c2a3ff
ino 20 gen 25c3a3d5
ino 21 gen 08dc1472
ino 22 gen 3797056b
ino 23 gen 1d47853a
ino 24 gen 82d26995
PR: bin/139994 (sort of)
Reviewed by: mckusick
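The before and after output above can be reproduced with a short sketch.
It assumes the generation number is held in a signed 32-bit field, as the
sign-extended values suggest; format_gen_old() and format_gen_new() are
hypothetical helpers, not the actual code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>	/* strcmp(), handy for checking the output */

/* Old behaviour: widening the signed 32-bit value sign-extends it. */
static void
format_gen_old(int32_t gen, char *buf, size_t len)
{
	snprintf(buf, len, "%llx", (unsigned long long)(long long)gen);
}

/* New behaviour: keep the value at 32 bits and pad to 8 hex digits. */
static void
format_gen_new(int32_t gen, char *buf, size_t len)
{
	snprintf(buf, len, "%08" PRIx32, (uint32_t)gen);
}
```

Values with the top bit set no longer grow a run of leading f digits, and
small values are padded so every line has the same width.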
bottom of the manpages and order them consistently.
GNU groff doesn't care about the ordering, and doesn't even mention
CAVEATS and SECURITY CONSIDERATIONS as common sections and where to put
them.
Found by: mdocml lint run
Reviewed by: ru
- device initiated power management (some devices support only this way);
- Automatic Partial to Slumber Transition (more power saving);
- DMA auto-activation (expected to slightly improve performance).
More features could be added later, when the hardware supports them.
structure so that we correctly reload. Note that tunefs doesn't
properly detect the need to reload if the disk device is specified
for a read-only mounted filesystem.
- Lessen the contiguity requirement for the journal so that it is more
likely to succeed.
make socket non-blocking, connect() and if we get EINPROGRESS, we have to
wait using select(). Very complex, but I know no other way to define
connection timeout for a given socket.
Reported by: hiroshi@soupacific.com
MFC after: 3 days
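The non-blocking connect() plus select() approach described above can be
sketched as follows. connect_timeout() is a hypothetical helper written
for illustration, not the code from this change; for brevity it does not
retry select() after EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <arpa/inet.h>		/* htonl(), for callers building addresses */
#include <netinet/in.h>		/* struct sockaddr_in, for callers */
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Connect "fd" to "sa", giving up after "timeout_secs" seconds.
 * Returns 0 on success, -1 on failure with errno set (ETIMEDOUT on
 * timeout).  The socket is left in non-blocking mode.
 */
static int
connect_timeout(int fd, const struct sockaddr *sa, socklen_t salen,
    int timeout_secs)
{
	fd_set wfds;
	struct timeval tv;
	socklen_t errlen;
	int flags, error, ret;

	assert(fd >= 0 && fd < FD_SETSIZE);

	flags = fcntl(fd, F_GETFL);
	if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
		return (-1);

	if (connect(fd, sa, salen) == 0)
		return (0);		/* connected immediately */
	if (errno != EINPROGRESS)
		return (-1);

	/* Wait until the socket is writable or the timeout expires. */
	FD_ZERO(&wfds);
	FD_SET(fd, &wfds);
	tv.tv_sec = timeout_secs;
	tv.tv_usec = 0;
	ret = select(fd + 1, NULL, &wfds, NULL, &tv);
	if (ret == 0) {
		errno = ETIMEDOUT;
		return (-1);
	}
	if (ret == -1)
		return (-1);

	/* Writable: fetch the final result of the connection attempt. */
	errlen = sizeof(error);
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &errlen) == -1)
		return (-1);
	if (error != 0) {
		errno = error;
		return (-1);
	}
	return (0);
}
```

Writability alone does not mean success, which is why the sketch reads
SO_ERROR afterwards: a refused connection also makes the socket writable.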
secondary, which died between send(2) and recv(2). Do it by adding timeout
to recv(2) for primary incoming and outgoing sockets and secondary outgoing
socket.
Reported by: Mikolaj Golub <to.my.trociny@gmail.com>
Tested by: Mikolaj Golub <to.my.trociny@gmail.com>
MFC after: 3 days
brings in support for an optional intent log which eliminates the need
for background fsck on unclean shutdown.
Sponsored by: iXsystems, Yahoo!, and Juniper.
With help from: McKusick and Peter Holm
interface considers that it hits a fatal error, and will not copyout
the request structure back for _IOW and _IOWR ioctls, keeping them
untouched.
The previous implementation of the SIOCGIFDESCR ioctl intends to
feed the buffer length back to userland. However, if we return
an error, the feedback would be defeated and ifconfig(8) would
trap into an infinite loop.
This commit changes SIOCGIFDESCR to set buffer field to NULL to
indicate the previous ENAMETOOLONG case.
Reported by: bschmidt
MFC after: 2 weeks
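From the caller's side, the new convention leads to a retry loop along
these lines. Everything here is a mock for illustration: struct descr_req
stands in for the buffer part of the request, mock_get_descr() imitates
the ioctl, and reporting the needed length alongside the NULL buffer is
an assumption based on the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the buffer fields of the request. */
struct descr_req {
	char	*buffer;
	size_t	 length;
};

static const char *fake_descr = "uplink to core switch";

/*
 * Mock of the ioctl.  It returns 0 in both cases so the updated request
 * reaches the caller; a too-small buffer is signalled by setting buffer
 * to NULL and length to the size that would be needed.
 */
static int
mock_get_descr(struct descr_req *req)
{
	size_t need = strlen(fake_descr) + 1;

	if (req->length < need) {
		req->buffer = NULL;
		req->length = need;
		return (0);
	}
	memcpy(req->buffer, fake_descr, need);
	return (0);
}

/* Return a malloc'ed copy of the description, or NULL on failure. */
static char *
get_descr(void)
{
	struct descr_req req;
	char *buf = NULL, *tmp;
	size_t len = 8;		/* deliberately small first guess */

	for (;;) {
		tmp = realloc(buf, len);
		if (tmp == NULL)
			break;
		buf = tmp;
		req.buffer = buf;
		req.length = len;
		if (mock_get_descr(&req) != 0)
			break;
		if (req.buffer != NULL)
			return (buf);	/* description copied out */
		if (req.length <= len)
			break;		/* no progress; do not loop forever */
		len = req.length;
	}
	free(buf);
	return (NULL);
}
```

The explicit "no progress" check guards against the endless retry that
the text describes for the earlier behaviour.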
Although groff_mdoc(7) gives another impression, this is the ordering
most widely used and also required by mdocml/mandoc.
Reviewed by: ru
Approved by: philip, ed (mentors)
in a device independent manner. Also include an example anticipatory
scheduler, gsched_rr, which gives very nice performance improvements
in presence of competing random access patterns.
This is joint work with Fabio Checconi, developed last year
and presented at BSDCan 2009. You can find details in the
README file or at
http://info.iet.unipi.it/~luigi/geom_sched/
to support various storage boxes which really aren't active-active.
We only write the label on the *first* provider. For all other providers
we just "add" the disk. This also allows for an "add" verb.
A usage implication is that you should specify the currently active
storage path as the first provider.
Note that this does not add RDAC-like functionality, but better allows for
autovolumefailover configurations (additional checkins elsewhere will support
this).
Sponsored by: Panasas
MFC after: 1 month
if the interface has such capability. The interface
capability flag indicates whether such capability
exists. This approach is much more backward compatible.
Physical device driver changes will be part of another
commit.
Also updated the ifconfig utility to show the LINKSTATE
capability if present.
Reviewed by: rwatson, imp, juli
MFC after: 3 days
ipfw add 100 allow ip from { 1.2.3.4 or 5.6.7.8 }
(note that the above example could be better written as
ipfw add 100 allow dst-ip 1.2.3.4,5.6.7.8
Submitted by: Riccardo Panicucci
dscp as a search key in table lookups;
+ (re)implement a sysctl variable to control the expire frequency of
pipes and queues when they become empty;
+ add 'queue number' as optional part of the flow_id. This can be
enabled with the command
queue X config mask queue ...
and makes it possible to support priority-based schedulers, where
packets should be grouped according to the priority and not some
fields in the 5-tuple.
This is implemented as follows:
- redefine a field in the ipfw_flow_id (in sys/netinet/ip_fw.h) but
without changing the size or shape of the structure, so there are
no ABI changes. On passing, also document how other fields are
used, and remove some useless assignments in ip_fw2.c
- implement small changes in the userland code to set/read the field;
- revise the functions in ip_dummynet.c to manipulate masks so they
also handle the additional field;
There are no ABI changes in this commit.
of ip->ip_tos) in a table. This can be useful to direct traffic to
different pipes/queues according to the DSCP of the packet, as follows:
ipfw add 100 queue tablearg lookup dscp 3 // table 3 maps dscp->queue
This change is a no-op (but harmless) until the two-line kernel
side is committed, which will happen shortly.
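The lookup key can be sketched as below. The DSCP occupies the upper six
bits of the TOS byte; the 64-entry array and queue_for_packet() are
hypothetical and much simpler than a real ipfw table.

```c
#include <assert.h>
#include <stdint.h>

/* DSCP is the upper six bits of the TOS byte; the low two are ECN. */
static uint8_t
dscp_from_tos(uint8_t tos)
{
	return ((uint8_t)(tos >> 2));
}

/*
 * Hypothetical table with one slot per DSCP value, mapping it to a
 * queue number (0 meaning "no entry").
 */
static uint16_t
queue_for_packet(const uint16_t table[64], uint8_t tos)
{
	return (table[dscp_from_tos(tos)]);
}
```

For example, a TOS byte of 0xb8 carries DSCP 46, so an entry at index 46
selects the queue for that traffic regardless of the ECN bits.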
We'll start noticing this with the next flag introduced as the lower
32bit are all used.
As this is old code we might need to do a full tree sweep one day, unless
changing our strategy to use a different `API' for getting/setting flags
along with the rest of the statfs data.
While here compare to 0 explicitly [1].
Suggested by: kib [1]
Reviewed by: kib
MFC after: 5 days
- Move check of /dev/ prefix and copy into a function to save code duplication.
This also fixes a bug where the /dev/ prefix could not be used when creating
volumes on the command line.
Tested by: Niclas Zeising <niclas.zeising - at - gmail.com>
size or size-like argument. I.e. "-s 32k" instead of "-s 32768".
Size parsing function has been shamelessly stolen from the truncate(1).
I'm sure many sysadmins out there will appreciate this small
improvement.
MFC after: 1 week
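Suffix handling of this kind can be sketched as follows. parse_size() is
a hypothetical helper written for illustration; it is not the function
taken from truncate(1), and the set of accepted suffixes is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse "32k", "4M", "1g" or a plain number into bytes.
 * Returns 0 on success, -1 on a malformed string or overflow.
 */
static int
parse_size(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long n;
	unsigned shift = 0;

	assert(s != NULL && out != NULL);
	if (!isdigit((unsigned char)s[0]))
		return (-1);		/* must start with a digit */
	errno = 0;
	n = strtoull(s, &end, 10);
	if (errno != 0)
		return (-1);

	/* Optional single-letter binary suffix. */
	switch (tolower((unsigned char)*end)) {
	case 'k': shift = 10; end++; break;
	case 'm': shift = 20; end++; break;
	case 'g': shift = 30; end++; break;
	case 't': shift = 40; end++; break;
	case '\0': break;
	default: return (-1);
	}
	if (*end != '\0')
		return (-1);		/* trailing garbage */
	if (shift != 0 && n > (UINT64_MAX >> shift))
		return (-1);		/* would overflow */
	*out = (uint64_t)n << shift;
	return (0);
}
```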
and tested over the past two months in the ipfw3-head branch. This
also happens to be the same code available in the Linux and Windows
ports of ipfw and dummynet.
The major enhancement is a completely restructured version of
dummynet, with support for different packet scheduling algorithms
(loadable at runtime), faster queue/pipe lookup, and a much cleaner
internal architecture and kernel/userland ABI which simplifies
future extensions.
In addition to the existing schedulers (FIFO and WF2Q+), we include
a Deficit Round Robin (DRR or RR for brevity) scheduler, and a new,
very fast version of WF2Q+ called QFQ.
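For readers unfamiliar with Deficit Round Robin, the sketch below shows the
textbook algorithm on packet lengths only. It is not the dummynet scheduler;
all names (drr_flow, drr_enqueue, drr_round) are hypothetical, and the
per-flow queue is a plain array with no wraparound to keep the example short.

```c
#include <stddef.h>

#define DRR_MAXPKTS 16

/* One flow: a FIFO of packet lengths plus its DRR state. */
struct drr_flow {
	unsigned pkts[DRR_MAXPKTS];	/* queued packet lengths, in bytes */
	size_t head, tail;		/* dequeue at head, enqueue at tail */
	unsigned quantum;		/* credit added on each visit */
	unsigned deficit;		/* unused credit carried over */
};

static int
drr_enqueue(struct drr_flow *f, unsigned len)
{
	if (f->tail == DRR_MAXPKTS)
		return (-1);		/* queue full */
	f->pkts[f->tail++] = len;
	return (0);
}

/*
 * Run one round over all flows. Each backlogged flow earns its quantum
 * and sends head packets while they fit in its deficit. A flow that
 * drains its queue forfeits leftover credit. Sent packets are recorded
 * as (flow index, length) pairs; the number recorded is returned.
 */
static size_t
drr_round(struct drr_flow *flows, size_t nflows, size_t *out_flow,
    unsigned *out_len, size_t outmax)
{
	size_t i, n = 0;

	for (i = 0; i < nflows; i++) {
		struct drr_flow *f = &flows[i];

		if (f->head == f->tail)
			continue;	/* idle flows earn no credit */
		f->deficit += f->quantum;
		while (f->head < f->tail && n < outmax &&
		    f->pkts[f->head] <= f->deficit) {
			f->deficit -= f->pkts[f->head];
			out_flow[n] = i;
			out_len[n] = f->pkts[f->head++];
			n++;
		}
		if (f->head == f->tail)
			f->deficit = 0;	/* no backlog: drop leftover credit */
	}
	return (n);
}
```

Carrying the deficit across rounds is what lets a flow with packets larger
than its quantum eventually send, while still bounding each flow's share.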
Some test code is also present (in sys/netinet/ipfw/test) that
lets you build and test schedulers in userland.
Also, we have added a compatibility layer that understands requests
from the RELENG_7 and RELENG_8 versions of the /sbin/ipfw binaries,
and replies correctly (at least, it does its best; sometimes you
just cannot tell who sent the request and how to answer).
The compatibility layer should make it possible to MFC this code in a
relatively short time.
Some minor glitches (e.g. handling of ipfw set enable/disable,
and a workaround for a bug in RELENG_7's /sbin/ipfw) will be
fixed with separate commits.
CREDITS:
This work has been partly supported by the ONELAB2 project, and
mostly developed by Riccardo Panicucci and myself.
The code for the qfq scheduler is mostly from Fabio Checconi,
and Marta Carbone and Francesco Magno have helped with testing,
debugging and some bug fixes.
- add static and const where appropriate
- check pointers against NULL
- minor styling nits
- it is actually WARNS=6 clean for non-strict alignment platforms
This is shamelessly stolen from DragonFly BSD and reduces our diff.
PR: bin/140078
Approved by: ed (co-mentor)
- The MACHINE_ARCH check is not exhaustive (missing at least powerpc),
and generally not worth maintaining.
- While here, fix whitespace and ordering of the Makefile
PR: bin/140081
Approved by: ed (co-mentor)
HAST allows data to be stored transparently on two physically separated
machines connected over a TCP/IP network. HAST works in a Primary-Secondary
(Master-Backup, Master-Slave) configuration, which means that only one of the
cluster nodes can be active at any given time. Only the Primary node is able to
handle I/O requests to HAST-managed devices. Currently HAST is limited to two
cluster nodes in total.
HAST operates at the block level: it provides disk-like devices in the
/dev/hast/ directory for use by file systems and/or applications. Working at
the block level makes it transparent to file systems and applications. There is
no difference between using a HAST-provided device and a raw disk, partition,
etc. All of them are just regular GEOM providers in FreeBSD.
For more information please consult the hastd(8), hastctl(8) and hast.conf(5)
manual pages, as well as http://wiki.FreeBSD.org/HAST.
Sponsored by: FreeBSD Foundation
Sponsored by: OMCnet Internet Service GmbH
Sponsored by: TransIP BV
incomplete as some info doesn't really belong to the structs where it is
defined.
Submitted by: Pedro F. Giffuni <giffunip tutopia com>
Reviewed by: bde
MFC after: 2 weeks
- fix sign-compare issues.
- ANSIfy a couple of functions.
- Remove more duplicate #includes.
- Memory leak found by Coverity on NetBSD.
Submitted by: Pedro F. Giffuni <giffunip tutopia com>
Reviewed by: bde
MFC after: 2 weeks
- C99 initializers.
- Change the default volume label from "NO NAME" to "NO_NAME".
- Set OEM String to "BSD4.4 " following the unnamed spacing convention
in that other OS that suggests "MSWIN4.1"
Also, David Naylor's changes for Clang, mostly changing the signedness
of constants.
Submitted by: Pedro F. Giffuni <giffunip tutopia com>
Clang fixes by: David Naylor <naylor.b.david gmail com>
Reviewed by: bde (with some disagreement about Clang issues)
MFC after: 2 weeks
cylinder groups that are created. When the filesystem is first created,
newfs always initialises the first two blocks of inodes, and then in the
UFS1 case will also initialise the remaining inode blocks. The changes in
growfs.c 1.23 broke the initialisation of all inodes, seemingly based on
this implementation detail in newfs(8). The result was that instead of
initialising all inodes, we would actually end up initialising all but the
first two blocks of inodes. If the filesystem was grown into empty
(all-zeros) space then the resulting filesystem was fine; however, when
grown onto non-zeroed space the filesystem produced would appear to have
massive corruption on the first fsck after growing.
A test case for this problem can be found in the PR audit trail.
Fix this by once again initialising all inodes in the UFS1 case.
PR: bin/115174
Submitted by: Nate Eldredge <nge cs.hmc.edu>
Reviewed by: mjacob
MFC after: 1 month