Prior to this commit, the loader would perform readaheads of up to
128 kB; when booting on a UFS filesystem this resulted in a series
of 160 kB reads (32 kB request + 128 kB readahead).
This commit allows readaheads to be longer, subject to a total I/O
size limit of 256 kB; i.e. 32 kB read requests will have added
readaheads of up to 224 kB.
In my testing on an EC2 c5.xlarge instance, this change reduces the
boot time by roughly 80 ms.
Reviewed by: tsoome
MFC after: 1 week
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D32251
The loader bcache attempts to determine whether readahead is useful,
increasing or decreasing its readahead length based on whether a
read could be serviced out of the cache. This resulted in two
unfortunate behaviours:
1. A series of consecutive 32 kB reads are requested and bcache
performs 16 kB readaheads. For each read, bcache determines that,
since only the first 16 kB is already in the cache, the readahead
was not useful, and keeps the readahead at the minimum (16 kB) level.
2. A series of consecutive 32 kB reads are requested and bcache
starts with a 32 kB readahead resulting in a 64 kB being read on
the first request. The second 32 kB request can be serviced out of
the cache, and bcache responds by doubling its readahead length to
64 kB. The third 32 kB request cannot be serviced out of the cache,
and bcache reduces its readahead length back down to 32 kB.
The first syndrome converts a series of 32 kB reads into a series of
(misaligned) 32 kB reads, while the second syndrome converts a series
of 32 kB reads into a series of 64 kB reads; in both cases we do not
increase the readahead length to its limit (currently 128 kB) no
matter how many consecutive read requests are made.
This change avoids this problem by tracking the "unconsumed
readahead" length; readahead is deemed to be useful (and the
read-ahead length is potentially increased) not only if a request was
completely serviced out of the cache, but also if *any* of the request
was serviced out of the cache and that length matches the amount of
unconsumed readahead. Conversely, we now only reduce the readahead
length in cases where there was unconsumed readahead data.
In my testing on an EC2 c5.xlarge instance, this change reduces the
boot time by roughly 120 ms.
Reviewed by: imp, tsoome
MFC after: 1 week
Sponsored by: https://patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D32250
The change makes block caching algorithm to work better for remote
media on low-BW/high-delay links.
This cuts boot time over IP KVMs noticeably, since the initialization
stage reads bunch of small 4th (and now lua) files that are not in
the same cache stripe (usually), thus wasting lot of bandwidth and
increasing latency even further.
The original regression came in 2017 with revision 87ed2b7f5. We've
seen increase of time it takes for the loader to get to the kernel
loading from under a minute to 10-15 minutes in many cases.
Reviewed by: tsoome
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D31623
Some of these files using <FOO>_DEBUG defined a DEBUG() macro to serve as a
debug-printf. -DDEBUG is useful to enable some debugging output across
multiple ELF/common parts, so switch the DEBUG-as-printf macros over to
something more like DPRINTF that is more commonly used for this kind of
thing and less likely to conflict.
userboot/elf64_freebsd debugging also assumed %llx for uint64; use PRIx64
instead.
MFC after: 1 week
the malloc()/free() as well as having potential of softening the handling
in case error is detected down to a mere warning as compared to hard panic
in free().
Submitted by: tsoome
Differential Revision: https://reviews.freebsd.org/D18299