- A new per-interface knob IFF_ND6_NO_RADR and sysctl IPV6CTL_NO_RADR.
This controls if accepting a route in an RA message as the default route.
The default value for each interface can be set by net.inet6.ip6.no_radr.
The system wide default value is 0.
- A new sysctl: net.inet6.ip6.norbit_raif. This controls if setting R-bit in
NA on RA accepting interfaces. The default is 0 (R-bit is set based on
net.inet6.ip6.forwarding).
Background:
IPv6 host/router model suggests a router sends an RA and a host accepts it for
router discovery. Because of that, KAME implementation does not allow
accepting RAs when net.inet6.ip6.forwarding=1. Accepting RAs on a router can
make the routing table confused since it can change the default router
unintentionally.
However, in practice there are cases where we cannot distinguish a host from
a router clearly. For example, a customer edge router often works as a host
against the ISP, and as a router against the LAN at the same time. Another
example is a complex network configurations like an L2TP tunnel for IPv6
connection to Internet over an Ethernet link with another native IPv6 subnet.
In this case, the physical interface for the native IPv6 subnet works as a
host, and the pseudo-interface for L2TP works as the default IP forwarding
route.
Problem:
Disabling processing RA messages when net.inet6.ip6.forwarding=1 and
accepting them when net.inet6.ip6.forward=0 cause the following practical
issues:
- A router cannot perform SLAAC. It becomes a problem if a box has
multiple interfaces and you want to use SLAAC on some of them, for
example. A customer edge router for IPv6 Internet access service
using an IPv6-over-IPv6 tunnel sometimes needs SLAAC on the
physical interface for administration purpose; updating firmware
and so on (link-local addresses can be used there, but GUAs by
SLAAC are often used for scalability).
- When a host has multiple IPv6 interfaces and it receives multiple RAs on
them, controlling the default route is difficult. Router preferences
defined in RFC 4191 works only when the routers on the links are
under your control.
Details of Implementation Changes:
Router Advertisement messages will be accepted even when
net.inet6.ip6.forwarding=1. More precisely, the conditions are as
follow:
(ACCEPT_RTADV && !NO_RADR && !ip6.forwarding)
=> Normal RA processing on that interface. (as IPv6 host)
(ACCEPT_RTADV && (NO_RADR || ip6.forwarding))
=> Accept RA but add the router to the defroute list with
rtlifetime=0 unconditionally. This effectively prevents
from setting the received router address as the box's
default route.
(!ACCEPT_RTADV)
=> No RA processing on that interface.
ACCEPT_RTADV and NO_RADR are per-interface knob. In short, all interface
are classified as "RA-accepting" or not. An RA-accepting interface always
processes RA messages regardless of ip6.forwarding. The difference caused by
NO_RADR or ip6.forwarding is whether the RA source address is considered as
the default router or not.
R-bit in NA on the RA accepting interfaces is set based on
net.inet6.ip6.forwarding. While RFC 6204 W-1 rule (for CPE case) suggests
a router should disable the R-bit completely even when the box has
net.inet6.ip6.forwarding=1, I believe there is no technical reason with
doing so. This behavior can be set by a new sysctl net.inet6.ip6.norbit_raif
(the default is 0).
Usage:
# ifconfig fxp0 inet6 accept_rtadv
=> accept RA on fxp0
# ifconfig fxp0 inet6 accept_rtadv no_radr
=> accept RA on fxp0 but ignore default route information in it.
# sysctl net.inet6.ip6.norbit_no_radr=1
=> R-bit in NAs on RA accepting interfaces will always be set to 0.
sending. What happens otherwise is that the sender splits all the
traffic into 32k chunks, while the receiver is waiting for the whole
packet. Then for a certain packet sizes, particularly 66607 bytes in
my case, the communication stucks to secondary is expecting to
read one chunk of 66607 bytes, while primary is sending two chunks
of 32768 bytes and third chunk of 1071. Probably due to TCP windowing
and buffering the final chunk gets stuck somewhere, so neither server
not client can make any progress.
This patch also protect from short reads, as according to the manual
page there are some cases when MSG_WAITALL can give less data than
expected.
MFC after: 3 days
partition offsets. If user requests specific alignment and
provider's stripesize is not zero, then use a least common multiple
from the stripesize and user specified value.
Also fix "gpart resize" implementation: do not try to align the partition
size, because the start offset may be not aligned. Instead align the
end offset and then calculate size. Also use stripesize and stripeoffset
for "gpart resize" command.
If compiled in for dual-stack use, test with feature_present(3)
to see if we should register the IPv4/IPv6 address family related
options.
In case there is no "inet" support we would love to go with the
usage() and make the address family mandatory (as it is for anything
but inet in theory). Unfortunately people are used to
ifconfig IF up/down
etc. as well, so use a fallback of "link". Adjust the man page
to reflect these minor details.
Improve error handling printing a warning in addition to the usage
telling that we do not know the given address family in two places.
Reviewed by: hrs, rwatson
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 2 weeks
a sync(2) syscall before unmount(2) for the "-f" case.
This avoids a forced dismount from getting stuck for
an NFS mountpoint in sync() when the server is not
responsive. With this commit, forced dismounts should
normally work for the NFS clients, but can take up to
about 1minute to complete.
PR: kern/157365
Reviewed by: kib
MFC after: 2 weeks
16K to 32K and the default fragment size from 2K to 4K.
The rational is that most disks are now running with 4K
sectors. While they can (slowly) simulate 512-byte sectors
by doing a read-modify-write, it is desirable to avoid this
functionality. By raising the minimum filesystem allocation
to 4K, the filesystem will never trigger the small sector
emulation.
Also, the growth of disk sizes has lead us to double the
default block size about every ten years. The rise from 8K
to 16K blocks was done in 2001. So, by the 10-year metric,
the time has come for 32K blocks.
Discussed at: May 2011 BSDCan Developer Summit
Reference: http://wiki.freebsd.org/201105DevSummit/FileSystems
requests as well as number of activemap updates.
Number of BIO_WRITEs and activemap updates are especially interesting, because
if those two are too close to each other, it means that your workload needs
bigger number of dirty extents. Activemap should be updated as rarely as
possible.
MFC after: 1 week
with geometry. And they do recalculation of user specified parameters.
MBR, PC98, VTOC8, EBR schemes are doing that. For these schemes an
auto alignment feature (ie. gpart add -a alignment) would not work.
But it can work for GPT and BSD schemes. BSD scheme usualy is created
inside MBR, so we can use knowledge about offset of MBR partition to
calculate aligned values for BSD partitions.
Use "offset" attribute of the parent provider for better alignment.
MFC after: 2 weeks
clear the options of the current media, i.e. only inherit the instance,
which matches what NetBSD does. Without this it's really non-intuitive
that the following sequence:
ifconfig bge0 media 1000baseT mediaopt full-duplex
ifconfig bge0 media 100baseTX
results in 100baseTX full-duplex to be set or that:
ifconfig bge0 media autoselect mediaopt flowcontrol
ifconfig bge0 media 1000baseT mediaopt full-duplex
tries to set 1000baseT full-duplex with flowcontrol, which isn't suported
und thus fails while the following:
ifconfig re0 media 1000baseT mediaopt flowcontrol,full-duplex
ifconfig re0 media autoselect
just switches to autoselection without flowcontrol.
MFC after: 2 weeks
because we need to do ioctl(2)s, which are not permitted in the capability
mode. What we do now is to chroot(2) to /var/empty, which restricts access
to file system name space and we drop privileges to hast user and hast
group.
This still allows to access to other name spaces, like list of processes,
network and sysvipc.
To address that, use jail(2) instead of chroot(2). Using jail(2) will restrict
access to process table, network (we use ip-less jails) and sysvipc (if
security.jail.sysvipc_allowed is turned off). This provides much better
separation.
MFC after: 1 week
operate on one type of filesystem, mention this.
While here, capitalise the use of "UFS" in growfs.8 to match other uses of
the term in other man pages.
MFC after: 1 week
checking at open time. It may improve performance for read-only
NFS mounts. Use deliberately.
MFC after: 1 week
Reviewed by: rmacklem, jhb (earlier version)
This improves usability a little as we no longer require using touch.
Also reword the manpage wrt. parameters and fix usage() [1]
With no media in a cd(4) drive, the reads will loop producing EINVAL,
abort in that case [2].
Document the shortcoming of sectorsize and MAXPHYS (a quick solution
to this might be having MAXPHYS as the "bigsize", in short testing it
didn't make a difference on throughput).
Submitted by: arundel [1]
PR: bin/154528 [2]
that was built before ffs grew support for TRIM, your filesystem will have
plenty of free blocks that the flash chip doesn't know are free, so it
can't take advantage of them for wear leveling. Once you've upgraded your
kernel, you enable TRIM on the filesystem (tunefs -t enable), then run
fsck_ffs -E on it before mounting it.
I tested this patch by half-filling an mdconfig'ed filesystem image,
running fsck_ffs -E on it, then verifying that the contents were not
damaged by comparing them to a pristine copy using rsync's checksum
functionality. There is no reliable way to test it on real hardware.
Many thanks to mckusick@, who provided the tricky parts of this patch and
reviewed the final version.
Reviewed by: mckusick@
MFC after: 3 weeks
"mdconfig -f file", I decided that it would be easier to make mdconfig
DWIM than to teach my fingers to type the correct command line.
MFC after: 3 weeks