disk_open(). Very often this is called several times for one file.
This leads to reading partition table metadata for each call. To
reduce the number of disk I/O we have a simple block cache, but it
is very dumb and more than half of I/O operations related to reading
metadata, misses this cache.
Introduce new cache layer to resolve this problem. It is independent
and doesn't need initialization like bcache, and will work by default
for all loaders which use the new DISK API. A successful disk_open()
call to each new disk or partition produces new entry in the cache.
Even more, when disk was already open, now opening of any nested
partitions does not require reading top level partition table.
So, if without this cache, partition table metadata was read around
20-50 times during boot, now it reads only once. This affects the booting
from GPT and MBR from the UFS.
in sys/boot/i386/libi386/biosdisk.c. Otherwise, when DISK_DEBUG is
enabled, the DEBUG() macros will clobber those fields, and cause the
probing to always fail mysteriously when debugging is enabled.
and constants related to the BIOS Enhanced Disk Drive Specification.
- Use this header instead of magic numbers and various duplicate structure
definitions for doing I/O.
- Use an actual structure for the request to fetch drive parameters in
drvsize() rather than a gross hack of a char array with some magic
size. While here, change drvsize() to only pass the 1.1 version of
the structure and not request device path information. If we want
device path information you have to set the length of the device
path information as an input (along with probably checking the actual
EDD version to see which size one should use as the device path
information is variable-length). This fixes data smashing problems
from passing an EDD 3 structure to BIOSes supporting EDD 4.
Reviewed by: avg
Tested by: Dennis Koegel dk neveragain.de
MFC after: 1 week
it possible to boot from ZFS RAIDZ for example from within VirtualBox.
The problem with VirtualBox is that its BIOS reports only one disk present.
If we choose to ignore this report, we can find all the disks available.
We can't have this work-around to be turned on by default, because some broken
BIOSes report true when it comes to number of disks, but present the same disk
multiple times.
slicei, Apple EFI hardware), the bootloader will fail to recognize the GPT
if it finds anything else but the EFI partition. Change the check to continue
detecting the GPT by looking at the EFI partition on the MBR but
stopping successfuly after finding it.
PR: kern/134590
Submitted by: Christoph Langguth <christoph at rosenkeller.org>
Reviewed by: jhb
MFC after: 2 weeks
Approved by: re (kib)
open partition. This fixes access to partitions whose starting offset
is >= 2 TB.
Submitted by: "James R. Van Artsdalen" james jrv.org
MFC after: 3 days
booting because the CD driver did not use bounce buffers to ensure
request buffers sent to the BIOS were always in the first 1MB. Copy over
the bounce buffer logic from the BIOS disk driver (minus the 64k boundary
code for floppies) to fix this.
Reported by: kensmith
in make.conf or src.conf.
- When GPT is enabled (which it is by default), use memory above 1 MB and
leave the memory from the end of the bss to the end of the 640k window
purely for the stack. The loader has grown and now it is much more
common for the heap and stack to grow into each other when both are
located in the 640k window.
PR: kern/129526
MFC after: 1 week
- Consolidate the code to humanize the size of a disk partition into a
single function based on the code for GPT partitions and use it for
GPT partitions, BSD slices, and BSD partitions.
- Teach the humanize code to use KB for small partitions (e.g. GPT boot
partitions now show up as 64KB rather than 0MB).
- Pad a few partition type names out so that things line up in the
common case.
MFC after: 1 week
Enhanced Disk Drive Specification Ver 3.0 defines that the version
of extension in AH would be 30h.
Correct the check for that to be >=30h instead of >3h.
MFC after: 2 months
on i386 and amd64 machines. The overall process is that /boot/pmbr lives
in the PMBR (similar to /boot/mbr for MBR disks) and is responsible for
locating and loading /boot/gptboot. /boot/gptboot is similar to /boot/boot
except that it groks GPT rather than MBR + bsdlabel. Unlike /boot/boot,
/boot/gptboot lives in its own dedicated GPT partition with a new
"FreeBSD boot" type. This partition does not have a fixed size in that
/boot/pmbr will load the entire partition into the lower 640k. However,
it is limited in that it can only be 545k. That's still a lot better than
the current 7.5k limit for boot2 on MBR. gptboot mostly acts just like
boot2 in that it reads /boot.config and loads up /boot/loader. Some more
details:
- Include uuid_equal() and uuid_is_nil() in libstand.
- Add a new 'boot' command to gpt(8) which makes a GPT disk bootable using
/boot/pmbr and /boot/gptboot. Note that the disk must have some free
space for the boot partition.
- This required exposing the backend of the 'add' function as a
gpt_add_part() function to the rest of gpt(8). 'boot' uses this to
create a boot partition if needed.
- Don't cripple cgbase() in the UFS boot code for /boot/gptboot so that
it can handle a filesystem > 1.5 TB.
- /boot/gptboot has a simple loader (gptldr) that doesn't do any I/O
unlike boot1 since /boot/pmbr loads all of gptboot up front. The
C portion of gptboot (gptboot.c) has been repocopied from boot2.c.
The primary changes are to parse the GPT to find a root filesystem
and to use 64-bit disk addresses. Currently gptboot assumes that the
first UFS partition on the disk is the / filesystem, but this algorithm
will likely be improved in the future.
- Teach the biosdisk driver in /boot/loader to understand GPT tables.
GPT partitions are identified as 'disk0pX:' (e.g. disk0p2:) which is
similar to the /dev names the kernel uses (e.g. /dev/ad0p2).
- Add a new "freebsd-boot" alias to g_part() for the new boot UUID.
MFC after: 1 month
Discussed with: marcel (some things might still change, but am committing
what I have so far)
on duplicated code and support 64-bit LBAs for GPT.
- The code to manage an EDD or C/H/S I/O request are now in their own
routines. The EDD routine now handles a full 64-bit LBA instead of
truncating LBAs to the lower 32-bits. (MBRs and BSD labels only
have 32-bit LBAs anyway, so the only LBAs ever passed down were 32-bit).
- All of the bounce buffer and retry logic duplicated in bd_read() and
bd_write() are merged into a single bd_io() routine that takes an
extra direction argument. bd_read() and bd_write() are now simple
wrappers around bd_io().
- If a disk supports EDD then always use it rather than only using it if
the cylinder is > 1023. Other parts of the boot code already do
something similar to this. Also, GPT just uses LBAs, so for a GPT disk
it's probably best to ignore C/H/S completely. Always using EDD when
it is supported by a disk is an easy way to accomplish this.
MFC after: 1 week
macros to treat the 'slice' field as a real part of the bootdev instead
of as hack that spans two other fields (adaptor (sic) and controller)
that are not used in any modern FreeBSD boot code.
MFC after: 1 week
device (kind) specific unit field to the common field. This change
allows a future version of libefi to work without requiring anything
more than what is defined in struct devdesc and as such makes it
possible to compile said version of libefi for different platforms
without requiring that those platforms have identical derivatives
of struct devdesc.
fixes filesystem corruption when nextboot.conf is located after
cylinder 1023. The bug appears to have been introduced at the time
bd_read was copied to create bd_write.
PR: bin/98005
Reported by: yar
MFC after: 1 week
memory directly available to loader(8) and friends was limited to 640K on i386.
Those times have passed long time ago and now loader(8) can directly access
up to 4GB of RAM at least theoretically. At the same time, there are several
places where it's assumed that malloc() will only allocate memory within
first megabyte.
Remove that assumption by allocating appropriate bounce buffers for BIOS
calls on stack where necessary.
This allows using memory above first megabyte for heap if necessary.
use a bounce buffer for the actual transfer to avoid crossing a 64k
boundary. To do this, we malloc a buffer twice as big as we need and then
find an aligned block within that buffer to do the transfer. The check
to see which part of the block we use used the wrong variable for part of
the condition meaning that in certain edge cases we would ask the BIOS to
cross a 64k boundary. The BIOS request would then fail resulting in file
transfers that just magically fail in the middle without any apparent
reason. Specifically, my tests for the splitfs boot floppies managed to
trigger this edge case.
MFC after: 1 week
X-MFC-info: along with fixes to libstand filesystems
Move the remaining bits of <sys/diskslice.h> to <i386/include/bootinfo.h>
Move i386/pc98 specific bits from <sys/reboot.h> to
<i386/include/bootinfo.h> as well.
Adjust includes in sys/boot accordingly.
Peter had repocopied sys/disklabel.h to sys/diskpc98.h and sys/diskmbr.h.
These two new copies are still intact copies of disklabel.h and
therefore protected by #ifndef _SYS_DISKLABEL_H_ so #including them
in programs which already include <sys.disklabel.h> is currently a
no-op.
This commit adds a number of such #includes.
Once I have verified that I have fixed all the places which need fixing,
I will commit the updated versions of the three #include files.
Sponsored by: DARPA & NAI Labs.
- Add in support for the EDD (Enhanced Disk Drive) BIOS extensions to
use LBA mode for accessing drives past cylinder 1024. This should allow
us to load a kernel from anywhere on a newer drive up to 2 TB. Part
of this came from the PR below.
PR: i386/13847
Submitted by: Tor Egge <Tor.Egge@fast.no>
- Don't hard code 0x10000 as the entry point for the loader. Instead add
src/sys/boot/i386/Makefile.inc which defines a make variable with the
entry point for the loader. Move the loader's entry point up to
0x20000, which makes PXE happy.
- Don't try to use cpp to parse btxldr for the optional BTXLDR_VERBOSE,
instead use m4 to achieve this. Also, add a BTXLDR_VERBOSE knob in the
btxldr Makefile to turn this option on.
- Redo parts of cdldr's Makefile so that it now builds and installs cdboot
instead of having i386/loader/Makefile do that. Also, add in some more
variables to make the pxeldr Makefile almost identical and thus to ease
maintainability.
- Teach cdldr about the a.out format. Cdldr now parsers the a.out header
of the loader binary and relocates it based on that. The entry point of
the loader no longer has to be hardcoded into cdldr. Also, the boot
info table from mkisofs is no longer required to get a useful cdboot.
- Update the lsdev function for BIOS disks to parse other file systems
(such as DOS FAT) that we currently support. This is still buggy as
it assumes that a floppy with a DOS boot sector actually has a MBR and
parses it as such. I'll be fixing this in the future.
- The biggie: Add in support for booting off of PXE-enabled network
adapters. Currently, we use the TFTP API provided by the PXE BIOS.
Eventually we will switch to using the low-level NIC driver thus
allowing both TFTP and NFS to be used, but for now it's just TFTP.
Submitted by: ps, alfred
Testing by: Benno Rice <benno@netizen.com.au>
necessary. Pass an absolute block number too, instead of receiving a
relative one in realstrategy(), as bcache_strategy() requires this.
The fix is sligthly different from the one in the PR.
PR: 17098
Submitted by: John Hood <jhood@sitaranetworks.com>
This should resolve the problem raised in PR 12315, and incidentally
makes it easier to determine what geometry the BIOS is actually using
(by way of boot -v and dmesg).