net_open() does replace f_devdata with pointer to netdev_sock,
this will cause memory leak when device is closed, but also does
alter the devopen() logic.
We should store &netdev_sock to dev->d_opendata instead, this
would preserve and follow the devopen() logic.
Fixes network boot on aarch64 (tested by bz).
Reviewed-by: imp
MFC After: 2 weeks
Differential Revision: https://reviews.freebsd.org/D32227
As Radix MMU with superpages enabled is now stable, make it the
default choice on supported hardware (POWER9 and above), since its
performance is greater than that of HPT MMU.
Reviewed by: alfredo, jhibbits
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D30797
Pass the ivlen along through, and just drop this KASSERT() if we're
building _STANDALONE for the time being.
Fixes: 1833d6042c ("crypto: Permit variable-sized IVs ...")
Prior to this commit, the loader would perform readaheads of up to
128 kB; when booting on a UFS filesystem this resulted in a series
of 160 kB reads (32 kB request + 128 kB readahead).
This commit allows readaheads to be longer, subject to a total I/O
size limit of 256 kB; i.e. 32 kB read requests will have added
readaheads of up to 224 kB.
In my testing on an EC2 c5.xlarge instance, this change reduces the
boot time by roughly 80 ms.
Reviewed by: tsoome
MFC after: 1 week
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D32251
The loader bcache attempts to determine whether readahead is useful,
increasing or decreasing its readahead length based on whether a
read could be serviced out of the cache. This resulted in two
unfortunate behaviours:
1. A series of consecutive 32 kB reads are requested and bcache
performs 16 kB readaheads. For each read, bcache determines that,
since only the first 16 kB is already in the cache, the readahead
was not useful, and keeps the readahead at the minimum (16 kB) level.
2. A series of consecutive 32 kB reads are requested and bcache
starts with a 32 kB readahead resulting in a 64 kB being read on
the first request. The second 32 kB request can be serviced out of
the cache, and bcache responds by doubling its readahead length to
64 kB. The third 32 kB request cannot be serviced out of the cache,
and bcache reduces its readahead length back down to 32 kB.
The first syndrome converts a series of 32 kB reads into a series of
(misaligned) 32 kB reads, while the second syndrome converts a series
of 32 kB reads into a series of 64 kB reads; in both cases we do not
increase the readahead length to its limit (currently 128 kB) no
matter how many consecutive read requests are made.
This change avoids this problem by tracking the "unconsumed
readahead" length; readahead is deemed to be useful (and the
read-ahead length is potentially increased) not only if a request was
completely serviced out of the cache, but also if *any* of the request
was serviced out of the cache and that length matches the amount of
unconsumed readahead. Conversely, we now only reduce the readahead
length in cases where there was unconsumed readahead data.
In my testing on an EC2 c5.xlarge instance, this change reduces the
boot time by roughly 120 ms.
Reviewed by: imp, tsoome
MFC after: 1 week
Sponsored by: https://patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D32250
Booting on an EC2 c5.xlarge instance, this reduces the number of I/Os
performed from 609 to 432, reduces the total number of blocks read
from 61963 to 60797, and reduces the time spent in the loader by 39 ms.
Note that b4cb3fe0e3 allowed the bcache to be retained for most of
the boot process, but relies on mounting filesystems; this commit
allows the bcache to be retained at the start of the boot process,
before the root filesystem has been located.
Reviewed by: imp, tsoome
MFC after: 1 week
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D32239
The definition for 'ST' is in efilib.h, so we don't need extern ST here.
Sponsored by: Netflix
Reviewed by: tsoome, kevans
Differential Revision: https://reviews.freebsd.org/D32225
Lua bindings appeared in FreeBSD 12.0. Delete the authors section of the
man page, since it's unclear who wrote different parts of the man
page.
Noted by: Trond Endrestol
Sponsored by: Netflix
Create a man page per loader. Loader(8) will have information common to
all of them, while loader_${INTERP}(8) will have information relevant to
that specific loader. Rewrite loader(8) to give an overview and point to
the appropriate man page. Rewrite each of the loader_${INTER}(8) man
pages to contain only the relevant information to that loader. Put all
the common commands, environment variables, etc in loader_simp(8) and
refernce that from the loader_lua or loader_4th man pages. The
loader_lua(8) could use more details about the Lua
integration. Additional organization may be benefitial.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D31340
Booting FreeBSD on an EC2 c5.xlarge instance, the loader "twiddles"
810 times over the course of 510 ms, a rate of 1.59 kHz. Even accepting
that many systems are slower than this particular VM and will take
longer to boot (especially if using spinning-rust disks), this seems
like an unhelpfully large amount of twiddling when compared to the
~60 Hz frame rate of many displays; printing the twiddles also consumes
roughly 10% of the boot time on the aforementioned VM.
Setting the default globaldiv to 16 dramatically reduces the time spent
printing twiddles to the console while still twiddling at roughly 100
Hz; this should be ample even for systems which take longer to boot and
consequently twiddle slower.
Note that this can adjusted via the twiddle_divisor variable in
loader.conf, but that file is not processed until nearly halfway
through the loader's runtime.
Reviewed by: allanjude, jrtc27, kevans
MFC after: 1 week
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: <https://reviews.freebsd.org/D32163>
Now that the loader tslog code doesn't call printf, we can profile
printf using TSLOG. On an EC2 c5.xlarge instance, we spend roughly
45 ms here (out of roughly 500 ms), presumably due to the time spent
writing output to the console.
MFC after: 1 week
Sponsored by: https://www.patreon.com/cperciva
After b4cb3fe0e3, loader started crashing on PowerPC64, with a
Program Exception (700) error. The problem was that archsw was
used before being initialized, with the new mount feature. This
change fixes the issue by initializing archsw earlier, before
setting currdev, that triggers the mount.
Reviewed by: tsoome
MFC after: 1 month
X-MFC-With: b4cb3fe0e3
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D32027
Use radix_mmu environment variable to select between Hash or Radix
MMU, when performing the CAS method call. This matches kernel's
behavior, by selecting Hash MMU by default and Radix if radix_mmu
is not zero, to make sure that both loader and kernel always select
the same MMU.
The device tree is queried to detect Radix/GTSE support and to
find out if CAS is supported, making the old CPU version and HV
bit checks unnecessary now.
Reviewed by: jhibbits
MFC after: 2 weeks
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D31951
The behavior remains the same, but lualoader now uses the more concise
verbiage that forthloader used. This is particularly important because
the previous line would exceed the right boundary of the menu and run
straight into space that would typically be allowed for the logo.
This makes it slightly easier to port logos from forthloader to
lualoader.
We want to keep our root file system open to preserve bcache segment
between file accesses, thus reducing physical disk IO.
Reviewed by: imp, allanjude, kevans (previous version)
Differential Revision: https://reviews.freebsd.org/D30848
MFC after: 1 month
Because lld 13 and higher default to garbage collecting start/stop
symbols when using --gc-sections, the linker sets used in the i386 boot
loaders will disappear. This leads to the loaders not recognizing any
commands, and failure to boot.
Until we have a good set of linker scripts for the loaders, work around
it by disabling the start-stop-gc feature.
MFC after: 1 week
When Boot Services (BS) are switched off, we can not use BS
functions any more. Since drawn console does implement our own
Blt(), we can use it to draw the console.
However, SimpleTextOutput protocol based console output must be
blocked.
Tested by inserting printf() after ExitBootServices() call.
MFC after: 1 week
We need to round it up to 2M, for instance. Having staging area too small
might cause the first resize to use negative size for memmove()/memcpy(),
which kills loader.
Tested by: Harry Schmalzbauer <freebsd@omnilan.de>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The definition can be overridden by users, and before f75caed644 it
was in MBs. Make the symbol' unit MB, to be compatible with users
customizations.
Reported and tested by: Harry Schmalzbauer <freebsd@omnilan.de>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This pushes the bulk of the rx servicing into a single loop that's only
slightly convoluted, and it addresses a problem with rx handling in the
process. If we hit a tx interrupt while we're processing, we'd
previously drop the frame on the floor completely and ultimately
timeout, increasing boot time on particularly busy hosts as we keep
having to backoff and resend.
After this patch, we don't seem to hit timeouts at all on zoo anymore
though loading a 27M kernel is still relatively slow (~1m20s).
Reviewed by: tsoome
Triage by: Ash Gokhale <ashfixit gmail com>
Sponsored By: National Bureau of Economic Research
Sponsored by: Klara, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31512
When we quit pager, the return value 1 is returned and command_more()
interprets it as error.
when lua loader gets error from command, it will try to
interpret it once more, so we get the same file shown once more.
There is no reason why we should return error from command_more().
MFC after: 1 week
Scrolling screen will leave "trail" of chars from first column.
Apparently caused by cursor location mismanagement.
Make sure we do not [attempt to] set cursor out of the screen.
MFC after: 1 week
The change makes block caching algorithm to work better for remote
media on low-BW/high-delay links.
This cuts boot time over IP KVMs noticeably, since the initialization
stage reads bunch of small 4th (and now lua) files that are not in
the same cache stripe (usually), thus wasting lot of bandwidth and
increasing latency even further.
The original regression came in 2017 with revision 87ed2b7f5. We've
seen increase of time it takes for the loader to get to the kernel
loading from under a minute to 10-15 minutes in many cases.
Reviewed by: tsoome
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D31623
The Xen kernel has no symbol tables, so calling lookup_symbol against
it triggers the following Divide by Zero fault:
Loading Xen kernel...
/boot/xen data=0x2809c8+0x149638 |
!!!! X64 Exception Type - 00(#DE - Divide Error) CPU Apic ID - 00000000 !!!!
Fix lookup_symbol to prevent the #DE fault from happening if the
symbol table is not loaded and also fix loadfile_raw to mark multiboot
kernels as relocatable, since the only multiboot kernel supported is
Xen and was already unconditionally booted as relocatable.
Fixes: f75caed644 ('amd64 UEFI loader: stop copying staging area to 2M physical')
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D31507
Summary:
Open file list is currently created as statically allocated array (64 items).
Once this array is filled up, loader will not be able to operate with files.
In most cases, this mechanism is good enough, but the problem appears, when
we have many disks with zfs pool(s). In current loader implementation, all
discovered zfs pool configurations are kept in memory and disk devices open -
consuming the open file array. Rewrite the open file mechanism to use
dynamically allocated list.
Reviewed by: imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D31364
There's no need to build both pie and non-pie .o's for stand. There's
some other build thing with MK_BEAR_SSL=yes and/or MK_LOADER_VERIEXEC=yes
that causes the pie build to fail that the 'ar' stage now. Since we don't
need the PIE stuff and the non-PIE stuff, disable PIE for the boot loader.
Reviewed by: emaste
Sponsored by: Netflix
On amd64, add a possibility to activate kernel with staging area in place.
Add 'copy_staging' command to control this. For now, by default the
old mode of copying kernel to 2M phys is retained. It is going to be
changed in several weeks.
On amd64, add some slop to the staging area to satisfy both requirements
of the kernel startup allocator, and to have space for minor staging data
increase after the final size is calculated. Add a new command
'staging_slop' to control its size.
Improve staging area resizing, in particular, reallocate it anew if
we cannot grow it neither down nor up.
Reviewed by: kevans, markj
Discussed with: emaste (the delivery plan)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31121
It is more idiomatic. CFLAGS is only augmented with $SSP_CFLAGS when
$MK_SSP != "no".
Reviewed by: imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31401
The userboot/test program links against the default userspace libraries
(e.g. shared libgcc_s.so) that will be instrumented if WITH_ASAN is set.
All other programs link against libsa instead of libc and therefore can't
use the sanitizer runtime library. To fix the stand/ build with
sanitizers, we disable MK_ASAN/MK_UBSAN if -nostdlib is found in the
LDFLAGS (i.e. we are using libsa instead of libc).
Reviewed By: imp, tsoome
Differential Revision: https://reviews.freebsd.org/D31047
servip is set from bootp bp_siaddr (if present) and rootip is
set immediately from servip in tha sane bootp code.
However, the common/dev_net.c does only set rootip (based on
url processing etc). Therefore, we should also use rootip in tftp
reader.
Fixes hung tftp based boot when bp_siaddr is not provided.
MFC after: 1 week
disable-device fooX will set hint.foo.X.disabled=1 as a way to easily
disable a device attaching during boot.
Reviewed by: tsoome
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D31297
Large nextboot.conf files (over 80 bytes) are not read correctly by the
Forth loader, causing file parsing to abort, and nextboot configuration
fails to apply.
Simple repro:
nextboot -e foo=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
shutdown -r now
That will cause the bug to cause a parse failure but shouldn't otherwise
affect the boot. Depending on your loader configuration, you may also
have to set beastie_disable and/or reduce the number of modules loaded
to see the error on a small console screen. 12.0 or CURRENT users will
also have to explicitly use the Forth loader instead of the Lua loader.
The error will look something like:
Warning: syntax error on file /boot/loader.conf.local
foo="xxxxxxxxxxxxxxnextboot_enable="YES"
^
/boot/support.4th has crude file I/O buffering, which uses a buffer
'read_buffer', defined to be 80 bytes by the 'read_buffer_size'
constant. The loader first tastes nextboot.conf, reading and parsing
the first line in it for nextboot_enable="YES". If this is true, then
it reopens the file and parses it like other loader .conf files.
Unfortunately, the file I/O buffering code does not fully reset the
buffer state in the reset_line_reading word. If the last file was read
to the end, that doesn't matter; the file buffer is treated as empty
anyway. But in the nextboot.conf case, the loader will not read to the
end of file if it is over 80 bytes, and the file buffer may be reused
when reading the next file. When the file is reread, the corrupt text
may cause file parsing to abort on bad syntax (if the corrupt line has
<>2 quotes in it), the wrong variable to be set, no variable to be set
at all, or (if the splice happens to land at a line ending) something
approximating normal operation.
The bug is very old, dating back to at least 2000 if not before, and is
still present in 12.0 and CURRENT r345863 (though it is now hidden by
the Lua loader by default).
Suggested one-line attached. This does change the behavior of the
reset_line_reading word, which is exported in the line-reading
dictionary (though the export is not documented in loader man pages).
But repo history shows it was probably exported for the PNP support
code, which was never included in the loader build, and was removed 5
months ago.
One thing that puzzles me: how has this bug gone unnoticed/unfixed for
nearly 2 decades? I find it hard to believe that nobody's tried to do
something interesting with nextboot, like load a kernel and filesystem,
which is what I'm doing.
Tested by: Gary Jennejohn
PR: 239315
MFC After: 3 weeks
Reviewed by: imp (and correctly applied this time)
Differential Revision: https://reviews.freebsd.org/D31328