The block cache implementation in loader has proven to be almost useless, and in the worst case it even slows down disk reads because of the insufficient cache size and the extra memory copy.
Also, the current cache implementation does not cache reads from CDs and does not work with ZFS built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read-ahead management, the read-ahead will not wrap over the end of the bcache, so in the worst case a single-block physical read will be performed to fill the last block in the bcache.
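As a rough illustration of the scheme described above (not the loader's actual code), a per-device cache indexed by a hash of the block number, with read-ahead clamped at the end of the cache, might look like the sketch below. The names struct bcache, bcache_init(), bcache_read() and dev_read(), the 512-byte block size and the 8-slot cache are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLKSZ	512	/* assumed block size */
#define NBLKS	8	/* assumed number of slots; must be a power of two */

/* Hypothetical per-device cache; one instance per block device. */
struct bcache {
	int64_t		tag[NBLKS];		/* block held in each slot, -1 if empty */
	uint8_t		data[NBLKS * BLKSZ];	/* slot contents, stored contiguously */
	unsigned	phys_reads;		/* physical read requests issued */
};

/* Stand-in for the device read: fills each block with its block number. */
static void
dev_read(struct bcache *bc, int64_t blk, unsigned n, uint8_t *buf)
{
	unsigned i;

	bc->phys_reads++;
	for (i = 0; i < n; i++)
		memset(buf + (size_t)i * BLKSZ, (int)((blk + i) & 0xff), BLKSZ);
}

static void
bcache_init(struct bcache *bc)
{
	unsigned i;

	memset(bc, 0, sizeof(*bc));
	for (i = 0; i < NBLKS; i++)
		bc->tag[i] = -1;
}

/* Read one block through the cache, with up to 'ra' blocks of read-ahead. */
static void
bcache_read(struct bcache *bc, int64_t blk, unsigned ra, uint8_t *out)
{
	unsigned i, n, slot;

	assert(blk >= 0);
	slot = (unsigned)(blk & (NBLKS - 1));	/* O(1) hash lookup */
	if (bc->tag[slot] != blk) {
		/* Miss: fetch the block plus read-ahead, never past the last slot. */
		n = 1 + ra;
		if (n > NBLKS - slot)
			n = NBLKS - slot;
		dev_read(bc, blk, n, bc->data + (size_t)slot * BLKSZ);
		for (i = 0; i < n; i++)
			bc->tag[slot + i] = blk + i;
	}
	memcpy(out, bc->data + (size_t)slot * BLKSZ, BLKSZ);
}
```

With 8 slots and a read-ahead of 3, a miss on block 7 lands in the last slot and is served by a single-block read, which is the worst case mentioned above.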
Booting from a virtual CD over IPMI:
0ms latency, before: 27 seconds, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
translate directly into calls to their namesake API functions in
libvmmapi.
It is an improvement over the existing setreg(), setmsr(), setcr(),
setgdt() and exec() callbacks in that the new additions give full
control and don't assume we're booting FreeBSD, as exec() does, and
don't assume one only wants to set the value of RSP, as setreg() does.
These are no longer needed after the recent 'beforebuild: depend' changes
and hooking DIRDEPS_BUILD into a subset of FAST_DEPEND which supports
skipping 'make depend'.
Sponsored by: EMC / Isilon Storage Division
Add a few other safeguards to ensure things do not break when the
boot device cannot be determined.
Reported by: flo
MFC after: 3 days
Sponsored by: ScaleEngine Inc.
Userboot's copy of the libstand Makefile had more extensive changes
compared to the one in sys/boot/libstand32, but it turns out these are
not intentional and we can just include lib/libstand/Makefile as done
for libstand32 in r293040.
Reviewed by: imp, jhb
Tested by: allanjude
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D4793
This does not properly respect WITHOUT or ARCH dependencies in targets/.
Doing so would require a massive effort to rework targets/. A
better approach would be to either include the SUBDIR Makefiles directly
and map to DIRDEPS or just dynamically look up the SUBDIR. These lose
the benefit of having a userland/lib, userland/libexec, etc., though, and
result in a massive package. The current implementation of targets/ is
very unmaintainable.
Currently rescue/rescue and sys/modules are still not connected.
Sponsored by: EMC / Isilon Storage Division
Go ahead and define -D_STANDALONE for all targets (only strictly
needed for some architectures, but harmless on those where it isn't
required). Also add -msoft-float to all architectures uniformly rather
than higgledy-piggledy like it is today.
Differential Revision: https://reviews.freebsd.org/D3496
Additionally, sort all real filesystems before the virtual ones.
Differential Revision: https://reviews.freebsd.org/D2709
Reviewed by: grehan
MFC after: 5 days
This should be a non-functional change. A future change should
address the functional differences between these three and converge
on a single source.
Differential Revision: https://reviews.freebsd.org/D2058
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
particular, allow loaders to define the name of the RC script the
interpreter needs to use. Use this new-found control to have the
PXE loader (when compiled with TFTP support and not NFS support)
read from ${bootfile}.4th, where ${bootfile} is the name of the
file fetched by the PXE firmware.
The normal startup process involves reading the following files:
1. /boot/boot.4th
2. /boot/loader.rc or alternatively /boot/boot.conf
When these come from a FreeBSD-defined file system, this is all
good. But when we boot over the network, subdirectories and fixed
file names are often painful to administrators and there's really
no way for them to change the behaviour of the loader.
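To make the naming rule concrete, the sketch below derives the script name from the name of the file fetched by the PXE firmware. It is an illustration only: rc_script_name() is a hypothetical helper and not a function in the loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "<bootfile>.4th" into buf.
 * Returns 0 on success, -1 if bootfile is empty or the result does not fit.
 */
static int
rc_script_name(const char *bootfile, char *buf, size_t len)
{
	int n;

	assert(buf != NULL && len > 0);
	if (bootfile == NULL || bootfile[0] == '\0')
		return (-1);
	n = snprintf(buf, len, "%s.4th", bootfile);
	if (n < 0 || (size_t)n >= len)
		return (-1);	/* output error or truncated */
	return (0);
}
```

A caller would be expected to fall back to the fixed file names listed above when the helper fails.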
Obtained from: Juniper Networks, Inc.
This includes:
o All directories named *ia64*
o All files named *ia64*
o All ia64-specific code guarded by __ia64__
o All ia64-specific makefile logic
o Mention of ia64 in comments and documentation
This excludes:
o Everything under contrib/
o Everything under crypto/
o sys/xen/interface
o sys/sys/elf_common.h
Discussed at: BSDcan
This allows existing loader.conf files that set "console=comconsole"
to work without failing. No functional difference otherwise.
Reported by: Michael Dexter, pfSense install.
Reviewed by: neel
MFC after: 3 weeks
and finish the job. ncurses is now the only Makefile in the tree that
uses it since it wasn't a simple mechanical change, and will be
addressed in a future commit.
Apparently, LIBZFS is set to a non-empty string when WITHOUT_CDDL/WITHOUT_ZFS
are set. I think this is a bug, but work around it for now.
Reviewed by: grehan
Modelled after the i386 zfsloader. However, with no
2nd stage zfsboot to search for a bootable dataset,
attempt a ZFS boot if there is more than one ZFS
dataset found during the disk probe.
sys/boot/userboot/zfs
- build the ZFS boot library
sys/boot/userboot/userboot/
conf.c
- Add the ZFS pool and filesystem tables
devicename.c
- correctly format ZFS devices
main.c
- increase the size of the libstand malloc pool
to account for the increased usage from ZFS buffers
- probe for a ZFS dataset, and if one is
found, attempt to boot from it.
usr.sbin/bhyveload/bhyveload.c
- allow multiple invocations of the '-d' option
to specify multiple disks e.g. a raidz set.
Up to 32 disks are supported.
Tested with various combinations of GPT, MBR, single
and multiple disks, RAID-Z, mirrors.
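The repeated '-d' handling can be pictured as appending each path to a fixed-size table. This is a minimal sketch, assuming a hypothetical disk_add() handler that is called once per '-d' occurrence. Only the limit of 32 comes from the description above; the names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NDISKS	32	/* limit stated in the description above */

static const char *disk_path[NDISKS];	/* hypothetical table of disk images */
static int ndisks;

/* Hypothetical handler, called once for every '-d <path>' on the command line. */
static int
disk_add(const char *path)
{
	assert(path != NULL);
	if (ndisks >= NDISKS)
		return (-1);	/* table full: reject the extra disk */
	disk_path[ndisks++] = path;
	return (0);
}
```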
Reviewed by: neel
Discussed with: avg
Tested by: Michael Dexter and others
MFC after: 3 weeks
- Similar to the hack for bootinfo32.c in userboot, define
_MACHINE_ELF_WANT_32BIT in the load_elf32 file handlers in userboot.
This allows userboot to load 32-bit kernels and modules.
- Copy the SMAP generation code out of bootinfo64.c and into its own
file so it can be shared with bootinfo32.c to pass an SMAP to the i386
kernel.
- Use uint32_t instead of u_long when aligning module metadata in
bootinfo32.c in userboot, as otherwise the metadata used 64-bit
alignment which corrupted the layout.
- Populate the basemem and extmem members of the bootinfo struct passed
to 32-bit kernels.
- Fix the 32-bit stack in userboot to start at the top of the stack
instead of the bottom so that there is room to grow before the
kernel switches to its own stack.
- Push a fake return address onto the 32-bit stack in addition to the
arguments normally passed to exec() in the loader. This return
address is needed to convince recover_bootinfo() in the 32-bit
locore code that it is being invoked from a "new" boot block.
- Add a routine to libvmmapi to set up a 32-bit flat mode register state
including a GDT and TSS that is able to start the i386 kernel and
update bhyveload to use it when booting an i386 kernel.
- Use the guest register state to determine the CPU's current instruction
mode (32-bit vs 64-bit) and paging mode (flat, 32-bit, PAE, or long
mode) in the instruction emulation code. Update the gla2gpa() routine
used when fetching instructions to handle flat mode, 32-bit paging, and
PAE paging in addition to long mode paging. Don't look for a REX
prefix when the CPU is in 32-bit mode, and use the detected mode to
enable the existing 32-bit mode code when decoding the mod r/m byte.
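The mode detection in the last item can be sketched from the architectural control bits alone. The bit positions below (CR0.PG, CR4.PAE, EFER.LMA) are the standard x86 ones; the function and enum names are hypothetical and this is not the code in the instruction emulation layer.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PG		0x80000000u	/* CR0 bit 31: paging enabled */
#define CR4_PAE		0x00000020u	/* CR4 bit 5: physical address extension */
#define EFER_LMA	0x00000400u	/* EFER bit 10: long mode active */

enum paging_mode { PAGING_FLAT, PAGING_32, PAGING_PAE, PAGING_64 };

/* Hypothetical helper: classify the paging mode from guest register state. */
static enum paging_mode
paging_mode(uint64_t cr0, uint64_t cr4, uint64_t efer)
{
	/* Long mode can only be active with paging and PAE both enabled. */
	assert((efer & EFER_LMA) == 0 ||
	    ((cr0 & CR0_PG) != 0 && (cr4 & CR4_PAE) != 0));

	if ((cr0 & CR0_PG) == 0)
		return (PAGING_FLAT);	/* no translation at all */
	if ((cr4 & CR4_PAE) == 0)
		return (PAGING_32);	/* classic 2-level 32-bit paging */
	if ((efer & EFER_LMA) != 0)
		return (PAGING_64);	/* 4-level long mode paging */
	return (PAGING_PAE);
}

/* Hypothetical helper: 64-bit code needs long mode active and CS.L set. */
static int
cpu_is_64bit(uint64_t efer, int cs_long)
{
	return ((efer & EFER_LMA) != 0 && cs_long != 0);
}
```

A decoder built on this would skip the REX prefix check whenever cpu_is_64bit() returns 0.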
Reviewed by: grehan, neel
MFC after: 1 month
WITH[OUT]_SSP to avoid hitting an error if user has WITH_SSP in their
make.conf. Ports now use this knob.
make[7]: "/usr/src/share/mk/bsd.own.mk" line 466: WITH_SSP and
WITHOUT_SSP can't both be set.
This is similar to the previous cleanup done in r188895.
Approved by: bapt
Reviewed by: jlh (earlier version)
Approved by: re (marius)
MFC after: 1 week
machine/signal.h and machine/ucontext.h into common x86 includes,
copying from amd64 and merging with i386.
Kernel-only compat definitions are kept in the i386/include/sigframe.h
and i386/include/signal.h, to reduce amd64 kernel namespace pollution.
The amd64 compat uses its own definitions so far.
The _MACHINE_ELF_WANT_32BIT definition is to allow the
sys/boot/userboot/userboot/elf32_freebsd.c to use i386 ELF definitions
on the amd64 compile host. The same hack could be usefully abused by
other code too.
r238966
Bump up the heap size to 1MB. With a few kernel modules, libstand
zalloc and userboot seem to want to use ~600KB of heap space, which
results in a segfault when malloc fails in bhyveload.
r241180
Clarify comment about default number of FICL dictionary cells.
r241153
Allow the number of FICL dictionary cells to be overridden.
Loading a 7.3 ISO with userboot/amd64 takes up 10035 cells,
overflowing the long-standing default of 10000.
Bump userboot's value up to 15000 cells.
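An override of this kind is usually a compile-time default that a Makefile can replace with -D. The sketch below assumes a hypothetical macro name, DICT_CELLS. Only the figures 10000 and 10035 are taken from the text above.

```c
#include <assert.h>

/* Hypothetical knob: a loader's Makefile could pass -DDICT_CELLS=15000. */
#ifndef DICT_CELLS
#define DICT_CELLS	10000	/* long-standing default */
#endif

/* Catch a nonsensical override at compile time. */
static_assert(DICT_CELLS > 0, "DICT_CELLS must be positive");

/* Report whether a dictionary needing 'needed' cells fits. */
static int
dict_fits(unsigned needed)
{
	return (needed <= DICT_CELLS);
}
```

Built without an override, dict_fits(10035) is false, which corresponds to the overflow described above. Built with -DDICT_CELLS=15000 it is true.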
Reviewed by: dteske (r238966,241180)
Obtained from: NetApp
.. so that consistent compilation algorithms are used for both
architectures as in practice the binaries are expected to be
interchangeable (for the time being).
Previously i386 used default settings, which were equivalent to
-march=i486 -mtune=generic.
The only difference is using smaller but slower "leave" instructions.
Discussed with: jhb, dim
MFC after: 29 days
disk_open(). Very often this is called several times for one file.
This leads to reading partition table metadata on each call. To
reduce the number of disk I/O operations we have a simple block cache,
but it is very dumb, and more than half of the I/O operations related
to reading metadata miss this cache.
Introduce a new cache layer to resolve this problem. It is independent
and doesn't need initialization like bcache, and will work by default
for all loaders which use the new DISK API. A successful disk_open()
call on each new disk or partition produces a new entry in the cache.
Moreover, when a disk is already open, opening any nested
partition does not require reading the top-level partition table.
So, where partition table metadata was previously read around
20-50 times during boot, it is now read only once. This affects
booting from UFS on GPT and MBR.
It uses the new API from part.c to work with partition tables.
Update userboot's disk driver to use the new API. Note that struct
loader_callbacks_v1 has changed.
(x86 assembler optimization is disabled for now because it
requires the new .cfi_* directives, which are not supported
by the base system binutils).
MFC after: 1 week