ZFS SLOGs have very specific access pattern with many cache flushes,
which none of benchmarks I know can simulate. Since SSD vendors rarely
specify cache flush time, this measurement can be useful to explain why
some ZFS pools are slower then expected. This test writes data chunks
of different size followed by cache flush, alike to what ZFS SLOG does,
and measures average time.
To illustrate, here is result for 6 years old SATA Intel 710 Series SSD:
Synchronous random writes:
0.5 kbytes: 138.3 usec/IO = 3.5 Mbytes/s
1 kbytes: 137.7 usec/IO = 7.1 Mbytes/s
2 kbytes: 151.1 usec/IO = 12.9 Mbytes/s
4 kbytes: 158.2 usec/IO = 24.7 Mbytes/s
8 kbytes: 175.6 usec/IO = 44.5 Mbytes/s
16 kbytes: 210.1 usec/IO = 74.4 Mbytes/s
32 kbytes: 274.2 usec/IO = 114.0 Mbytes/s
64 kbytes: 416.5 usec/IO = 150.1 Mbytes/s
128 kbytes: 776.6 usec/IO = 161.0 Mbytes/s
256 kbytes: 1503.1 usec/IO = 166.3 Mbytes/s
512 kbytes: 2968.7 usec/IO = 168.4 Mbytes/s
1024 kbytes: 5866.8 usec/IO = 170.5 Mbytes/s
2048 kbytes: 11696.6 usec/IO = 171.0 Mbytes/s
4096 kbytes: 23329.6 usec/IO = 171.5 Mbytes/s
8192 kbytes: 46779.5 usec/IO = 171.0 Mbytes/s
, and much newer and supposedly much faster NVMe Samsung 950 PRO SSD:
Synchronous random writes:
0.5 kbytes: 2092.9 usec/IO = 0.2 Mbytes/s
1 kbytes: 2013.1 usec/IO = 0.5 Mbytes/s
2 kbytes: 2014.8 usec/IO = 1.0 Mbytes/s
4 kbytes: 2090.7 usec/IO = 1.9 Mbytes/s
8 kbytes: 2044.5 usec/IO = 3.8 Mbytes/s
16 kbytes: 2084.8 usec/IO = 7.5 Mbytes/s
32 kbytes: 2137.1 usec/IO = 14.6 Mbytes/s
64 kbytes: 2173.4 usec/IO = 28.8 Mbytes/s
128 kbytes: 2923.9 usec/IO = 42.8 Mbytes/s
256 kbytes: 3085.3 usec/IO = 81.0 Mbytes/s
512 kbytes: 3112.2 usec/IO = 160.7 Mbytes/s
1024 kbytes: 2430.6 usec/IO = 411.4 Mbytes/s
2048 kbytes: 3788.9 usec/IO = 527.9 Mbytes/s
4096 kbytes: 6198.0 usec/IO = 645.4 Mbytes/s
8192 kbytes: 10764.9 usec/IO = 743.2 Mbytes/s
While the first one obviously has maximal throughput limitations, the
second one has so high cache flush latency (about 2 millisecond), that
it makes one almost useless in SLOG role, despite of its good throughput
numbers. Power loss protection is out of scope of this test, but I
suspect it can be related.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
sbavail() returns u_int and sendwin is a uint32_t. Therefore, min() (which
operates on two u_int values) is able to correctly calculate the minimum
of these two arguments.
Reported by: rrs
MFC after: 1 week
Sponsored by: Netflix
Even though gdb and kgdb may not be removed for 12.0 on some architectures,
the notice is unconditional as these tools will likely be removed at some
point in the future when adequate replacements are available (gdb in ports
or lldb in base).
Reviewed by: emaste
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D11477
This patch adds new bsdinstall option to hardening section that allows users
to change this behaviour to secure one and updates stack guard option so it
would set the value of relevant sysctl to 512 (2MB)
Submitted by: Bartek Rutkowski
Reviewed by: adrian, bapt, emaste
Approved by: bapt, emaste
MFC after: 1 day
Sponsored by: Pixeware LTD
Differential Revision: https://reviews.freebsd.org/D9700
the terminal work properly out of the box when logging over a serial
line, which is quite important for the user experience on boards like
Raspberry Pi. It doesn't affect cases where the terminal size is
already non-zero, such as SSH or vt(4) sessions.
Note that this doesn't handle a scenario pointed out by rgrimes@:
when the terminal is resized after login, the terminal size won't
get updated even after logging out and back in.
Reviewed by: imp
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D10642
when jails are being used on the system.
It is hoped that the patches in PR#205193 will someday get tested/debugged
so that they can be committed to fix this.
This is a content change.
PR: 205193
MFC after: 2 weeks
Use the standard syntax of name@version, I do not expect a confusion
due to unlikely possibility of the name containing the '@' character.
Requested by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This driver is standard rather than optional because it can always provide
time after a reboot, but it will only provide time after a power cycle if
battery power is supplied to the chip's SNVS power domain.
socket structure into a listening socket. This resulted in an invalid
instruction fault for all 32-bit platforms.
When INVARIANTS is set the union where the two uninitialized fields
reside gets properly zeroed. This patch ensures the two uninitialized
fields are zeroed when INVARIANTS is undefined.
For 64-bit platforms this issue was not visible because so->sol_upcall
which is uninitialized overlaps with so->so_rcv.sb_state which is
already zero during soalloc();
For 32-bit platforms this issue was visible and resulted in an invalid
instruction fault, because so->sol_upcall overlaps with
so->so_rcv.sb_sel which is always initialized to a valid data pointer
during soalloc().
Verifying the offset locations mentioned above are identical is left
as an exercise to the reader.
PR: 220452
PR: 220358
Reviewed by: ae (network), gallatin
Differential Revision: https://reviews.freebsd.org/D11475
Sponsored by: Mellanox Technologies
Collapse should be more effective than defragmentation.
Added missing declaration of ena_check_and_collapse_mbuf().
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon.com Inc.
TSO settings were not reflecting real HW capabilities.
DMA tags were created with wrong window - high address was the same as
low, so excluding window was not working.
Capabilities of TX dma transaction were not set properly - TSO max size
had been increased and size of one segment had been adjusted.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon.com Inc.
RX lock is no longer required. There can only be one RX cleanup task
running at a time, RX cleanup cannot be executed if interface is not
yet initialized and ena_down() will not free any RX resources if any io
interrupt is being handled - RX cleanup task is only called from an
interrupt handler.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon.com Inc.
If drbr_advance() is not called before doing cleanup and packet is
already enqueued for sending (tx_info is holding pointer to mbuf), then
mbuf is cleaned both in drbr_flush() and in cleanup routine, when all
mbufs hold by tx_buffer_info are being released.
This causes panic, because mbuf is released twice.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon.com Inc.
If driver left MSI-x handlling routine because interface was put down,
it is not unmasking IRQs, so any requesting interrupt will be awaiting
for unmasking.
On ena_up() routine all interrupts are being unmasked and any awaiting
interrupt will be handled right away.
If handler was executed before driver state was set as running, handling
routine is being ended immediately, leaving IO irqs for given queue
masked.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon.com Inc.
It is required to hold lock that is associated with buffer ring before
flushing drbr.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon.com Inc.
Lack of this lock was causing crash if down was called in
parallel with the initialization routine.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon.com Inc.
- Address most of the post-commit comments on D11128.[1]
- Reference the man pages for the lock types supported by the provider.
- Add a BUGS section.
- Eliminate some redundancy by describing similar probes in the same
paragraph.
- Fix several inaccuracies, particularly in the probe argument
descriptions.
Submitted by: wblock [1]
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D11293
the commit message; as actually implemented, the intent is to retry
up to 2 ms for controllers to enable bus power.
Noticed by: ian@, rgrimes@
Additional note: Among others, the problem addressed by r320577 is
the APL32 ("Storage Controllers May Not Be Power Gated") erratum.
Hopefully, along with r318282, r320577 works around the remaining
problems seen with Intel Apollo Lake eMMC and SDXC controllers.
The vm_map_fixed() and vm_map_stack() VM functions return Mach error
codes. Convert them into errno values before returning result from
exec_new_vmspace().
While there, modernize the comment and do minor style adjustments.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week