received on a TCP session that has entered the ESTABLISHED state. This
results in a lot of calls to reset the keepalive timer.
This patch changes the behavior so we set the keepalive timer for the
keepalive idle time (TP_KEEPIDLE). When the keepalive timer fires, it will
first check to see if the session has been idle for TP_KEEPIDLE ticks. If
not, it will reschedule the keepalive timer for the time the session will
have been idle for TP_KEEPIDLE ticks.
For a session with regular communication, the keepalive timer should fire
approximately once every TP_KEEPIDLE ticks. For sessions with irregular
communication, the keepalive timer might fire more often. But, the
disruption from a periodic keepalive timer should be less than the regular
cost of resetting the keepalive timer on every packet.
(FWIW, this change saved approximately 1.73% of the busy CPU cycles on a
particular test system with a heavy TCP output load. Of course, the
actual impact is very specific to the particular hardware and workload.)
Reviewed by: gallatin, rrs
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D8243
6988 spa_sync() spends half its time in dmu_objset_do_userquota_updates
Using a benchmark which creates 2 million files in one TXG, I observe
that the thread running spa_sync() is on CPU almost the entire time we
are syncing, and therefore can be a performance bottleneck. About 50% of
the time in spa_sync() is in dmu_objset_do_userquota_updates().
The problem is that dmu_objset_do_userquota_updates() calls
zap_increment_int(DMU_USERUSED_OBJECT) once for every file that was
modified (or created). In this benchmark, all the files are owned by the
same user/group, so all 2 million calls to zap_increment_int() are
modifying the same entry in the zap. The same issue exists for the
DMU_GROUPUSED_OBJECT.
We should keep an in-memory map from user to space delta while we are
syncing, and when we finish, iterate over the in-memory map and modify
the ZAP once per entry. This reduces the number of calls to
zap_increment_int() from "number of objects modified" to "number of
owners/groups of modified files".
This reduced the time spent in spa_sync() in the file create benchmark
by ~33%, from 11 seconds to 7 seconds.
Closes#107
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Ned Bass <bass6@llnl.gov>
Reviewed by: Jinshan Xiong <jinshan.xiong@intel.com>
Author: Matthew Ahrens <mahrens@delphix.com>
openzfs/openzfs@5fc46359c5
5120 zfs should allow large block/gzip/raidz boot pool (loader project)
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Toomas Soome <tsoome@me.com>
openzfs/openzfs@c8811bd3e2
FreeBSD still does not support booting from gzip-compressed datasets,
so keep one chunk of this commit out.
Submitted by: Hongjiang Zhang <honzhan microsoft com>
Reported by: Lili Deng <v-lide microsoft com>
MFC after: 3 days
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D8238
RPI3 kernel config builds kernel compatible with latest upstream device
tree and firmware: https://github.com/raspberrypi/firmware/tree/master/boot
As of today it's 597c662a613df1144a6bc43e5f4505d83bd748ca
Default console is PL01x, so pi3-disable-bt dt overlay should be configured
in config.txt and stock U-Boot should be patched to use proper serial port.
Yet unsupported: SMP, VCHIQ, RNG driver. RNG requires some work due to
upstream device tree incompatibility.
Multiple people contributed to this work over time: db@, loos@, manu@
- Revert BUS_SPACE_PHYSADDR back to rman_get_start. BUS_SPACE_PHYSADDR was
introduced in 2013 as temporary wrapper until proper solution appears.
It's ARM only and since we need this file for ARM64 build and no proper
API has been introduced - just revert the change and make sure it's
going to appear when people grep for BUS_SPACE_PHYSADDR in sources.
- Fix printf format for size_t variables
dev_net.c code.
The NETIF_OPEN_CLOSE_ONCE flag was added in r201932 to prevent that behaviour
on some architectures (sparc64 and powerpc64) the default was left to always
open and close the device for each open and close of a file by the loader
because it was necessary for u-boot on arm.
Since it has been added, the flag was turned on for every arches including the
u-boot loader for arm.
This also fixes netbooting on RPi3 (tested by gonzo@)
For the loader.efi it greatly speeds up netbooting
Reviewed by: emaste, gonzo, tsoome
Approved by: gonzo
MFC after: 1 month
Sponsored by: Gandi.net
Differential Revision: https://reviews.freebsd.org/D8230
Ignore the ECN bits on 'tos' and 'set-tos' and allow to use
DCSP names instead of having to embed their TOS equivalents
as plain numbers.
Obtained from: OpenBSD
Sponsored by: OPNsense
Differential Revision: https://reviews.freebsd.org/D8165
buildkernel failed with GCC 5.3 with
error: comparison of constant '852736' with boolean expression is always true
Sponsored by: The FreeBSD Foundation
Linux's copy of the Cavium SDK does not have these non-ASCII characters
and this reduces noise in diffs when comparing the two.
Sponsored by: The FreeBSD Foundation
Suppose that we have an exclusively busy page, and a thread which can
accept shared-busy page. In this case, typical code waiting for the
page xbusy state to pass is
again:
VM_OBJECT_WLOCK(object);
...
if (vm_page_xbusied(m)) {
vm_page_lock(m);
VM_OBJECT_WUNLOCK(object); <---1
vm_page_busy_sleep(p, "vmopax");
goto again;
}
Suppose that the xbusy state owner locked the object, unbusied the
page and unlocked the object after we are at the line [1], but before we
executed the load of the busy_lock word in vm_page_busy_sleep(). If it
happens that there is still no waiters recorded for the busy state,
the xbusy owner did not acquired the page lock, so it proceeded.
More, suppose that some other thread happen to share-busy the page
after xbusy state was relinquished but before the m->busy_lock is read
in vm_page_busy_sleep(). Again, that thread only needs vm_object lock
to proceed. Then, vm_page_busy_sleep() reads busy_lock value equal to
the VPB_SHARERS_WORD(1).
In this case, all tests in vm_page_busy_sleep(9) pass and we are going
to sleep, despite the page being share-busied.
Update check for m->busy_lock == VPB_UNBUSIED in vm_page_busy_sleep(9)
to also accept shared-busy state if we only wait for the xbusy state to
pass.
Merge sequential if()s with the same 'then' clause in
vm_page_busy_sleep().
Note that the current code does not share-busy pages from parallel
threads, the only way to have more that one sbusy owner is right now
is to recurse.
Reported and tested by: pho (previous version)
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D8196
Previously the driver used more low level operations like iicbus_start
and iicbus_write. The problem is that those operations are not
implemented by iicbus(4) and the calls were effectively routed to
a driver to which the bus is attached.
But not all of the controllers implement such low level operations
while all of the drivers are expected to have iicbus_transfer.
While there fix incorrect implementation of iicsmb_bwrite and iicsmb_bread.
The former should send a byte count before the actual bytes, while the
latter should first receive the byte count and then receive the bytes.
I have tested only these commands:
- quick (r/w)
- send byte
- receive byte
- read byte
- write byte
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D8170
with no creative content. Include "lost" changes from git:
o Use /dev/efi instead of /dev/efidev
o Remove redundant NULL checks.
Submitted by: kib@, dim@, zbb@, emaste@
The runtime kernel loader, linker_load_file, unloads kernel files that
failed to load all of their modules. For consistency, treat preloaded
(loader.conf loaded) kernel files in the same way.
Reviewed by: kib
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D8200
to ieee80211_add_rx_params() + drop last (ieee80211_rx_stats) parameter
Note: there is an additional check for ieee80211_get_rx_params()
return value (which does not exist in the original diff).
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D8207
- If check for net,ethernet/usb,device compatible node fails, try to find
.../usb/hub/ethernet, where ... is bus path that can depend on actual HW.
net,ethernet/usb,device compatibity strings are FreeBSD custom invention
that is used only in RPi DTBs and since there is no other way to tie USB
device to FDT node we just do our best effort here to work with upstream
device tree
- Use -1 value to indicate invalid phandle_t, 0 is valid phandle value and
shouldn't be used as error signal
the TCP_RFC7413 kernel option. This change removes those few instructions
from the packet processing path.
While not strictly necessary, for the sake of consistency, I applied the
new IS_FASTOPEN macro to all places in the packet processing path that
used the (t_flags & TF_FASTOPEN) check.
Reviewed by: hiren
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D8219
As of r302092, pipe is a wrapper around pipe2 and the pipe syscall is no
longer used. It is included only with the COMPAT_FREEBSD10 kernel option.
Add the compat option to support upgrades from systems with an earlier
userland.
MFC after: 1 week
Keep resource state consistent with INTRNG state - if intr_activate_irq
fails - deactivate resource and propagate error to calling function
Reviewed by: mmel
This will allow to add slave drivers in the same fashion as for iicbus.
Also, allow other code to add a child device and set its 'addr' ivar.
The ivar can only be set if it's unset, it can not be changed.
That could be used, for example, by a platform driver that has
a precise description of the hardware and, thus, knows what drivers
can handle what slaves.
The slave auto-probing code is unsafe and broken because it uses
7-bit slave addresses. It's going to be removed.
Note: internally the driver uses address of zero as an unset address
while smbus_get_addr() returns it as -1 for compatibility reasons.
The address is expected to be unset only for children that do not
work with slaves like, for example, smb(4).
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D8173
This should have been committed in r307093: resource allocation depends
on source of the device tree. upstream dts has extra interrupt that we can
ignore