Commit Graph

6636 Commits

Author SHA1 Message Date
Jeff Roberson
280e091a99 Implement fully asynchronous partial truncation with softupdates journaling
to resolve errors which can cause corruption on recovery with the old
synchronous mechanism.

 - Append partial truncation freework structures to indirdeps while
   truncation is proceeding.  These prevent new block pointers from
   becoming valid until truncation completes and serialize truncations.
 - On completion of a partial truncate journal work waits for zeroed
   pointers to hit indirects.
 - softdep_journal_freeblocks() handles last frag allocation and last
   block zeroing.
 - vtruncbuf/ffs_page_remove moved into softdep_*_freeblocks() so it
   is only implemented in one place.
 - Block allocation failure handling moved up one level so it does not
   proceed with buf locks held.  This permits us to do more extensive
   reclaims when filesystem space is exhausted.
 - softdep_sync_metadata() is broken into two parts, the first executes
   once at the start of ffs_syncvnode() and flushes truncations and
   inode dependencies.  The second is called on each locked buf.  This
   eliminates excessive looping and rollbacks.
 - Improve the mechanism in process_worklist_item() that handles
   acquiring vnode locks for handle_workitem_remove() so that it works
   more generally and does not loop excessively over the same worklist
   items on each call.
 - Don't corrupt directories by zeroing the tail in fsck.  This is only
   done for regular files.
 - Push a fsync complete record for files that need it so the checker
   knows a truncation in the journal is no longer valid.

Discussed with:	mckusick, kib (ffs_pages_remove and ffs_truncate parts)
Tested by:	pho
2011-06-10 22:48:35 +00:00
Kenneth D. Merry
b4775b4c1d Add dump.c to the rtsol build. It is needed now that sec2str is non-static
and used in rtsold.c.
2011-06-08 21:59:07 +00:00
Xin LI
c73830758a Add a special mount option "failok" to indicate that the administrator wants
the system to proceed to boot without bailing out into single user mode,
even when the file system can not be successfully mounted.

This option is implemented in mount(8) and not passed into kernel.

MFC after:	1 month
2011-06-07 18:48:49 +00:00
Andrey V. Elsukov
08892bf4bf Do not use LCM from stripesize and user specified alignment value.
When user wants have specific alignment - do what user wants.
Use stripesize as alignment value in case, when some of gpart's
arguments are ommitted for automatic calculation.

Suggested by:	mav
2011-06-07 11:11:11 +00:00
Gavin Atkinson
8a7fca58aa Rework parts of this man page to improve grammar.
Inspired by, and parts submitted by...
PR:		docs/157467
Submitted by:	Ben Kaduk <kaduk mit.edu>
MFC after:	2 weeks
2011-06-06 21:02:26 +00:00
Ed Schouten
48a16a34d8 Remove redundant assignments to WARNS.
For these directories, WARNS is already implied to be 6.
2011-06-06 20:24:17 +00:00
Gavin Atkinson
d452fb8af7 Add another example to mount(8) on using the "-o" argument.
PR:		docs/157389
Submitted by:	Warren Block <wblock wonkity.com>
MFC after:	1 week
2011-06-06 13:24:54 +00:00
Gavin Atkinson
6fbdd4e705 Bump .Dd
Forgotten by:	gavin
MFC after:	1 week
2011-06-06 13:18:29 +00:00
Gavin Atkinson
c6852de31b Document that REQUIRES, PROVIDES and KEYWORDS are alos accepted. This
chnage is different to the one suggested in the PR to try to avoid
cluttering the man page too much.

PR:		docs/154494
Submitted by:	kilian <kilian.klimek googlemail.com>
MFC after:	1 week
2011-06-06 13:13:48 +00:00
Andrey V. Elsukov
71f3650a41 Initialize co.use_set variable before parsing each new rule.
PR:		bin/134975
MFC after:	2 weeks
2011-06-06 11:10:38 +00:00
Andrey V. Elsukov
796051d664 Increase buffer size for the command line.
PR:		bin/125370
Submitted by:	sem
MFC after:	2 weeks
2011-06-06 10:52:26 +00:00
Hiroki Sato
e7fa8d0ada - Accept Router Advertisement messages even when net.inet6.ip6.forwarding=1.
- A new per-interface knob IFF_ND6_NO_RADR and sysctl IPV6CTL_NO_RADR.
  This controls if accepting a route in an RA message as the default route.
  The default value for each interface can be set by net.inet6.ip6.no_radr.
  The system wide default value is 0.

- A new sysctl: net.inet6.ip6.norbit_raif.  This controls if setting R-bit in
  NA on RA accepting interfaces.  The default is 0 (R-bit is set based on
  net.inet6.ip6.forwarding).

Background:

 IPv6 host/router model suggests a router sends an RA and a host accepts it for
 router discovery.  Because of that, KAME implementation does not allow
 accepting RAs when net.inet6.ip6.forwarding=1.  Accepting RAs on a router can
 make the routing table confused since it can change the default router
 unintentionally.

 However, in practice there are cases where we cannot distinguish a host from
 a router clearly.  For example, a customer edge router often works as a host
 against the ISP, and as a router against the LAN at the same time.  Another
 example is a complex network configurations like an L2TP tunnel for IPv6
 connection to Internet over an Ethernet link with another native IPv6 subnet.
 In this case, the physical interface for the native IPv6 subnet works as a
 host, and the pseudo-interface for L2TP works as the default IP forwarding
 route.

Problem:

 Disabling processing RA messages when net.inet6.ip6.forwarding=1 and
 accepting them when net.inet6.ip6.forward=0 cause the following practical
 issues:

 - A router cannot perform SLAAC.  It becomes a problem if a box has
   multiple interfaces and you want to use SLAAC on some of them, for
   example.  A customer edge router for IPv6 Internet access service
   using an IPv6-over-IPv6 tunnel sometimes needs SLAAC on the
   physical interface for administration purpose; updating firmware
   and so on (link-local addresses can be used there, but GUAs by
   SLAAC are often used for scalability).

 - When a host has multiple IPv6 interfaces and it receives multiple RAs on
   them, controlling the default route is difficult.  Router preferences
   defined in RFC 4191 works only when the routers on the links are
   under your control.

Details of Implementation Changes:

 Router Advertisement messages will be accepted even when
 net.inet6.ip6.forwarding=1.  More precisely, the conditions are as
 follow:

 (ACCEPT_RTADV && !NO_RADR && !ip6.forwarding)
	=> Normal RA processing on that interface. (as IPv6 host)

 (ACCEPT_RTADV && (NO_RADR || ip6.forwarding))
	=> Accept RA but add the router to the defroute list with
	   rtlifetime=0 unconditionally.  This effectively prevents
	   from setting the received router address as the box's
	   default route.

 (!ACCEPT_RTADV)
	=> No RA processing on that interface.

 ACCEPT_RTADV and NO_RADR are per-interface knob.  In short, all interface
 are classified as "RA-accepting" or not.  An RA-accepting interface always
 processes RA messages regardless of ip6.forwarding.  The difference caused by
 NO_RADR or ip6.forwarding is whether the RA source address is considered as
 the default router or not.

 R-bit in NA on the RA accepting interfaces is set based on
 net.inet6.ip6.forwarding.  While RFC 6204 W-1 rule (for CPE case) suggests
 a router should disable the R-bit completely even when the box has
 net.inet6.ip6.forwarding=1, I believe there is no technical reason with
 doing so.  This behavior can be set by a new sysctl net.inet6.ip6.norbit_raif
 (the default is 0).

Usage:

 # ifconfig fxp0 inet6 accept_rtadv
	=> accept RA on fxp0
 # ifconfig fxp0 inet6 accept_rtadv no_radr
	=> accept RA on fxp0 but ignore default route information in it.
 # sysctl net.inet6.ip6.norbit_no_radr=1
	=> R-bit in NAs on RA accepting interfaces will always be set to 0.
2011-06-06 02:14:23 +00:00
Hiroki Sato
c3cc3217bc Add the "nd6 options" line handler as af_other_status() of AF_INET6, not as an
own address family.

Reviewed by:	bz
2011-06-05 11:37:20 +00:00
Maxim Sobolev
98453c81af Read from the socket using the same max buffer size as we use while
sending. What happens otherwise is that the sender splits all the
traffic into 32k chunks, while the receiver is waiting for the whole
packet. Then for a certain packet sizes, particularly 66607 bytes in
my case, the communication stucks to secondary is expecting to
read one chunk of 66607 bytes, while primary is sending two chunks
of 32768 bytes and third chunk of 1071. Probably due to TCP windowing
and buffering the final chunk gets stuck somewhere, so neither server
not client can make any progress.

This patch also protect from short reads, as according to the manual
page there are some cases when MSG_WAITALL can give less data than
expected.

MFC after:	3 days
2011-06-04 16:01:30 +00:00
Ruslan Ermilov
8bf9aaabf9 Generally clean up markup. 2011-06-03 10:39:36 +00:00
Andrey V. Elsukov
57512b16ae Always use LCM when stripesize > 0. 2011-06-02 22:15:19 +00:00
Andrey V. Elsukov
a6c21ef2d1 Use stripesize and stripeoffset in the automatic calculation of
partition offsets. If user requests specific alignment and
provider's stripesize is not zero, then use a least common multiple
from the stripesize and user specified value.
Also fix "gpart resize" implementation: do not try to align the partition
size, because the start offset may be not aligned. Instead align the
end offset and then calculate size. Also use stripesize and stripeoffset
for "gpart resize" command.
2011-06-02 21:59:21 +00:00
Ulrich Spörlein
b2e52ced25 mdoc: fix markup 2011-06-02 09:56:42 +00:00
Rick Macklem
6b43e31fe7 Add a sentence to the umount.8 man page to clarify the behaviour
for forced dismount when used on an NFS mount point. Requested by
Jeremy Chadwick.
This is a content change.

MFC after:	2 weeks
2011-05-31 18:27:18 +00:00
Bjoern A. Zeeb
5af3fa9a5f Conditionally compile in the af_inet and af_inet6, af_nd6 modules.
If compiled in for dual-stack use, test with feature_present(3)
to see if we should register the IPv4/IPv6 address family related
options.

In case there is no "inet" support we would love to go with the
usage() and make the address family mandatory (as it is for anything
but inet in theory).  Unfortunately people are used to
  ifconfig IF up/down
etc. as well, so use a fallback of "link".  Adjust the man page
to reflect these minor details.

Improve error handling printing a warning in addition to the usage
telling that we do not know the given address family in two places.

Reviewed by:	hrs, rwatson
Sponsored by:	The FreeBSD Foundation
Sponsored by:	iXsystems
MFC after:	2 weeks
2011-05-31 14:40:21 +00:00
Andrey V. Elsukov
6f5286dca6 Document kern.geom.part.check_integrity sysctl variable. 2011-05-30 11:17:42 +00:00
Andrey V. Elsukov
41b6083752 Add tablearg support for ipfw setfib.
PR:		kern/156410
MFC after:	2 weeks
2011-05-30 05:37:26 +00:00
Mikolaj Golub
a01a750f32 If READ from the local node failed we send the request to the remote
node. There is no use in doing this for synchronization requests.

Approved by:	pjd (mentor)
MFC after:	1 week
2011-05-29 21:20:47 +00:00
Rick Macklem
b37ce15446 Modify the umount(8) command so that it doesn't do
a sync(2) syscall before unmount(2) for the "-f" case.
This avoids a forced dismount from getting stuck for
an NFS mountpoint in sync() when the server is not
responsive. With this commit, forced dismounts should
normally work for the NFS clients, but can take up to
about 1minute to complete.

PR:		kern/157365
Reviewed by:	kib
MFC after:	2 weeks
2011-05-29 21:13:53 +00:00
Kirk McKusick
3fa3e267de Update the manual page to reflect the new 32K/4K defaults.
Reminded by: Ivan Voras
2011-05-28 15:14:50 +00:00
Andrey V. Elsukov
ca203c4faa Add example how to create MBR and BSD schemes and install boot code. 2011-05-27 15:29:39 +00:00
Andrey V. Elsukov
cbcc2a4fd6 Synchronize manpage's synopsis with program's usage. Since -l and -r
keys are mutually exclusive for the `gpart show` command, then mark
them so.

Requested by:	ru
2011-05-27 14:30:13 +00:00
Kirk McKusick
20f2694aa9 Raise the default blocksize for UFS/FFS filesystems from
16K to 32K and the default fragment size from 2K to 4K.

The rational is that most disks are now running with 4K
sectors.  While they can (slowly) simulate 512-byte sectors
by doing a read-modify-write, it is desirable to avoid this
functionality.  By raising the minimum filesystem allocation
to 4K, the filesystem will never trigger the small sector
emulation.

Also, the growth of disk sizes has lead us to double the
default block size about every ten years.  The rise from 8K
to 16K blocks was done in 2001.  So, by the 10-year metric,
the time has come for 32K blocks.

Discussed at: May 2011 BSDCan Developer Summit
Reference: http://wiki.freebsd.org/201105DevSummit/FileSystems
2011-05-26 18:22:49 +00:00
Andrey V. Elsukov
53fb12dbef Simplify ALIGNDOWN macro. 2011-05-24 17:03:46 +00:00
Andrey V. Elsukov
72b066244c Fix calculation of alignment for odd values. Also do not change value
when it is already aligned.
2011-05-24 16:49:34 +00:00
Pawel Jakub Dawidek
3db86c39ae Keep statistics on number of BIO_READ, BIO_WRITE, BIO_DELETE and BIO_FLUSH
requests as well as number of activemap updates.

Number of BIO_WRITEs and activemap updates are especially interesting, because
if those two are too close to each other, it means that your workload needs
bigger number of dirty extents. Activemap should be updated as rarely as
possible.

MFC after:	1 week
2011-05-23 21:15:19 +00:00
Pawel Jakub Dawidek
1c6689d58d To handle BIO_FLUSH and BIO_DELETE requests in secondary worker we need
to use ioctl(2). This is why we can't use capsicum for now to sandbox
secondary. Capsicum is still used to sandbox hastctl.

MFC after:	1 week
2011-05-23 20:59:50 +00:00
Ulrich Spörlein
6e18fca127 Re-encode files from ISO-8859-1 to UTF-8 2011-05-22 14:03:30 +00:00
Pawel Jakub Dawidek
aa27d9ef94 Recognize HIO_FLUSH requests.
MFC after:	1 week
2011-05-21 20:21:20 +00:00
Pawel Jakub Dawidek
588e8623d0 Document IPv6 support.
MFC after:	3 weeks
2011-05-20 11:21:39 +00:00
Pawel Jakub Dawidek
89bad89a59 If no listen address is specified, bind by default to:
tcp4://0.0.0.0:8457
	tcp6://[::]:8457

MFC after:	3 weeks
2011-05-20 11:16:25 +00:00
Pawel Jakub Dawidek
a87399ba7f Rename ipv4/ipv6 to tcp4/tcp6.
MFC after:	3 weeks
2011-05-20 11:15:27 +00:00
Pawel Jakub Dawidek
dc18c8ae6c Now that hell is fully frozen it is good time to add IPv6 support to HAST.
MFC after:	3 weeks
2011-05-20 11:14:05 +00:00
Pawel Jakub Dawidek
496a87aa30 Allow [ ] characters in strings. They might be used in IPv6 addresses.
MFC after:	3 weeks
2011-05-20 11:10:39 +00:00
Pawel Jakub Dawidek
bdbd046b35 Rename tcp4 to tcp in preparation for IPv6 support.
MFC after:	3 weeks
2011-05-20 11:09:02 +00:00
Pawel Jakub Dawidek
933728eea2 Rename proto_tcp4.c to proto_tcp.c in preparation for IPv6 support.
MFC after:	2 weeks
2011-05-20 11:06:17 +00:00
Pawel Jakub Dawidek
d4cb6369e6 In preparation for IPv6 support allow to specify multiple addresses to
listen on.

MFC after:	3 weeks
2011-05-19 23:18:42 +00:00
Pawel Jakub Dawidek
0855e42386 - Add support for AF_INET6 sockets for %S format character.
- Use inet_ntop(3) instead of reimplementing it.
- Use %hhu for unsigned char instead of casting it to unsigned int and
  using %u.

MFC after:	1 week
2011-05-18 22:43:56 +00:00
Sergey Kandaurov
3e71d7d04e mdoc:
- use a proper macro for interface name ipfw0.
- add missing section number for bpf cross reference.
2011-05-17 12:58:19 +00:00
Andrey V. Elsukov
6c7e04f0f3 Some partitioning schemes want to have partitions that are aligned
with geometry. And they do recalculation of user specified parameters.
MBR, PC98, VTOC8, EBR schemes are doing that. For these schemes an
auto alignment feature (ie. gpart add -a alignment) would not work.
But it can work for GPT and BSD schemes. BSD scheme usualy is created
inside MBR, so we can use knowledge about offset of MBR partition to
calculate aligned values for BSD partitions.

Use "offset" attribute of the parent provider for better alignment.

MFC after:	2 weeks
2011-05-15 16:16:48 +00:00
Marius Strobl
6f135a7584 When setting media always and not just in case of switching to IFM_AUTO
clear the options of the current media, i.e. only inherit the instance,
which matches what NetBSD does. Without this it's really non-intuitive
that the following sequence:
	ifconfig bge0 media 1000baseT mediaopt full-duplex
	ifconfig bge0 media 100baseTX
results in 100baseTX full-duplex to be set or that:
	ifconfig bge0 media autoselect mediaopt flowcontrol
	ifconfig bge0 media 1000baseT mediaopt full-duplex
tries to set 1000baseT full-duplex with flowcontrol, which isn't suported
und thus fails while the following:
	ifconfig re0 media 1000baseT mediaopt flowcontrol,full-duplex
	ifconfig re0 media autoselect
just switches to autoselection without flowcontrol.

MFC after:	2 weeks
2011-05-15 12:51:00 +00:00
Andrey V. Elsukov
cb86ada75d Simplify the code a bit. For own providers GEOM_PART always provides
"start" and "end" config attributes.

MFC after:	1 week
2011-05-15 11:45:13 +00:00
Pawel Jakub Dawidek
0cddb12ffd Currently we are unable to use capsicum for the primary worker process,
because we need to do ioctl(2)s, which are not permitted in the capability
mode. What we do now is to chroot(2) to /var/empty, which restricts access
to file system name space and we drop privileges to hast user and hast
group.

This still allows to access to other name spaces, like list of processes,
network and sysvipc.

To address that, use jail(2) instead of chroot(2). Using jail(2) will restrict
access to process table, network (we use ip-less jails) and sysvipc (if
security.jail.sysvipc_allowed is turned off). This provides much better
separation.

MFC after:	1 week
2011-05-14 17:02:03 +00:00
Pawel Jakub Dawidek
bcc9f32110 When using capsicum to sanbox, still use other methods first, just in case
one of them have some problems.
2011-05-14 16:55:24 +00:00
Bruce M Simpson
e1e3c30189 Typo. For USB devices, 'serial' should be 'sernum'.
See sys/dev/usb/usb_device.c for what devctl_notify() gets.
2011-05-10 02:34:11 +00:00