Normally the socket buffers are static (either derived from global
defaults or set with setsockopt) and do not adapt to real network
conditions. Two things happen: a) your socket buffers are too small
and you can't reach the full potential of the network between both
hosts; b) your socket buffers are too big and you waste a lot of
kernel memory for data just sitting around.
With automatic TCP send and receive socket buffers we can start with a
small buffer and quickly grow it in parallel with the TCP congestion
window to match real network conditions.
FreeBSD has a default 32K send socket buffer. This supports a maximal
transfer rate of only slightly more than 2Mbit/s on a 100ms RTT
trans-continental link. Or at 200ms just above 1Mbit/s. With TCP send
buffer auto scaling and the default values below it supports 20Mbit/s
at 100ms and 10Mbit/s at 200ms. That's an improvement of factor 10, or
1000%. For the receive side it looks slightly better with a default of
64K buffer size.
New sysctls are:
net.inet.tcp.sendbuf_auto=1 (enabled)
net.inet.tcp.sendbuf_inc=8192 (8K, step size)
net.inet.tcp.sendbuf_max=262144 (256K, growth limit)
net.inet.tcp.recvbuf_auto=1 (enabled)
net.inet.tcp.recvbuf_inc=16384 (16K, step size)
net.inet.tcp.recvbuf_max=262144 (256K, growth limit)
Tested by: many (on HEAD and RELENG_6)
Approved by: re
MFC after: 1 month
upper-bounding it to the size of the initial socket buffer lower-bound it
to the smallest MSS we accept. Ideally we'd use the actual MSS information
here but it is not available yet.
For socket buffer auto sizing to be effective we need room to grow the
receive window. The window scale shift is determined at connection setup
and can't be changed afterwards. The previous, original, method effectively
just did a power of two roundup of the socket buffer size at connection
setup severely limiting the headroom for larger socket buffers.
Tested by: many (as part of the socket buffer auto sizing patch)
MFC after: 1 month
cannot change (because its referenced by curthread). This fixes
a LOR caused by acquiring emul_shared_lock while holding emul_lock.
Fix typo in comment.
Submitted by: rdivacky
p->p_emuldata is properly initialized in the time when the child can run.
Do not set p->p_emuldata to NULL when the process is exiting.
It does not make any sense and only costs 2 mutex operations.
Do not lock emul_data to unlock it on the very next line.
Comment on possible race while there.
Reparent all procs that are part of a threading group but not its leaders
to init and SIGCHLD init to finish the zombies off. This fixes zombies
left after opera's exit. [1]
There is no need to lock p_em in the linux_proc_init CLONE_THREAD
case because the process cannot change the address of the p_em->shared
because its currently running this code path.
Move assigning of em->shared outside emul_shared_lock.
Noticed by: Scott Robbins <scottro@nyc.rr.com> [1]
Submitted by: rdivacky
buffer resizing, etc.) that was here since eon. Free all (unmanaged)
allocated buffer through sndbuf_destroy() in case we forgot to call
sndbuf_free(). For a managed buffer (mostly hw specific managed buffer),
either provide CHANNEL_FREE() method with appropriate return value to
invoke semi-automatic sndbuf_free() or simply do it on their own. If
everything is failed, sndbuf_destroy() will come to the rescue as a
final measure.
MFC after: 3 days
This should fix the run time bustage observed on recent -CURRENT whilst
mounting a MSDOS filesystem with non-default locale/code page:
link_elf: symbol msdosfs_fileno_free undefined
KLD msdosfs_iconv.ko: depends on msdosfs - not available
using the callers UID instead of the GID when performing group
operations. This could allow users to determine group quota
information for groups they are not a member of in some cases.
Rename the "uid" parameter in ufs_quotactl to "id" to better show
that it is used for more than just the uid, and to be more in line
with the naming conventions in the other quota routines.
PR: kern/33940
added with a single invocation of pkg_add, replacing it with something
rather more dynamic.
Approved by: portmgr (pav)
Tested by: full pointyhat package run
MFC after: 1 week
for pci_cfg_restore() to be exported. It was tested using a
hackily accessed pci_cfg_restore().
- Add ifmedia_removeall() to mxge_detach() in order to stop leaking
an ifaddr
- Fix a small acounting bug introduced by the locking code shuffle
which could cause spurious watchdog resets now that we have a
watchdog.
Sponsored by: Myricom
locking in preparation for adding a watchdog handler (callouts must
not use sleepable locks). This required shuffling memory and
interrupt allocation to the attach routine rather than if_ioctl so as
to avoid potential sleeps while bringing up the interface.
This is not a functional change.
IN_LINKLOCAL() tests if an address falls within the IPv4 link-local prefix.
IN_PRIVATE() tests if an address falls within an RFC 1918 private prefix.
IN_LOCAL_GROUP() tests if an address falls within the statically assigned
link-local multicast scope specified in RFC 2365.
IN_ANY_LOCAL() tests for either of IN_LINKLOCAL() or IN_LOCAL_GROUP().
As with the existing macros in the FreeBSD netinet stack, comparisons
are performed in host-byte order.
See also: RFC 1918, RFC 2365, RFC 3927
Obtained from: NetBSD (dyoung@)
MFC after: 2 weeks
warnx(3) to be compiled on systems that have it (e.g. FreeBSD),
while the intention was opposite, i.e., compile them on systems
that don't have them. Also fixes static linkage of pkg_sign(1).
- initialize ifq_drv_maxlen correctly
- mark the interface as jumbo capable
- keep stats on the number of times the hw transmit queue filled and
was restarted.