In some Dell systems and usb stick combinations, it is found that
int13 AH=08 is reporting back bad sector information, preventing the
boot.
This update is allowing bd_int13probe() to use extended info call to
build disk properties.
It also can happen the total sectors count from extended info may be
wrong, in such case, the CHS data is used to calculate total sectors.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D7718
beyond the end of disk. r298900 added code to prevent this. Some BIOSes
cause significant delays if asked to read past end-of-disk.
We never trusted the BIOS to accurately report the sectorsize of disks
before and this set of changes. Unfortuately they interact badly with
the infamous >2TB wraparound bugs. We have a number of relatively-recent
machines in the FreeBSD.org cluster where the BIOS reports 3TB disks as 1TB.
With pre-r298900 they work just fine. After r298900 they stop working if
the boot environment attempts to access anything outside the first 1TB on
the disk. 'ZFS: I/O error, all block copies unavailable' etc. It affects
both UFS and ZFS if they try to boot from large volumes.
This change replaces the blind trust of the BIOS end-of-disk reporting
with a read-ahead clip to prevent reads crossing the of end-of-disk
boundary. Since 2^32 (2TB) size reporting truncation is not uncommon,
the clipping is done on 2TB aliases of the reported end-of-disk.
ie: a 3TB disk reported as 1TB has readahead clipped at 1TB, 3TB, 5TB, ...
as one of them is likely to be the real end-of-disk.
This should make the loader on these broken machines behave the same as
traditional pre-r298900 loader behavior, without disabling read-ahead.
PR: 212139
Discussed with: tsoome, allanjude
Replace all rounding with the round{up,down}2 macros
a missing set of braces caused the previous code to be incorrect
replace alloca() with malloc() because alloca() can return an allocation
that is actually invalid, causing boot to fail
Reviewed by: emaste, ed
Thanks To: peter
Sponsored by: ScaleEngine Inc.
Differential Revision: https://reviews.freebsd.org/D6213
return value when it could return 1 (indicating we should stop).
Fix a few instances of pager_open() / pager_close() not being called.
Actually use these routines for the environment variable printing code
I just committed.
The new bcache code does not know the size of the disk, and therefore may attempt to read past the end of the disk while trying to fill its read-ahead cache.
This is usually not an issue, it fails gracefully on all of my machines, but some BIOSes seem to retry the reads for up to 30 seconds each, resulting in a long stall during boot
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: jhb, np
Differential Revision: https://reviews.freebsd.org/D6109
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
Add support for 4k sector GELI encrypted partitions to the bootloader
This is the default created by the installer
Because the IV is different for each sector, and the XTS tweak carries forward you can not decrypt a partial sector if the starting offset is not 0
Make boot2 and the loader read in 4k aligned chunks
Reviewed by: ed, oshogbo
Sponsored by: ScaleEngine Inc.
Differential Revision: https://reviews.freebsd.org/D5820
disk_open(). Very often this is called several times for one file.
This leads to reading partition table metadata for each call. To
reduce the number of disk I/O we have a simple block cache, but it
is very dumb and more than half of I/O operations related to reading
metadata, misses this cache.
Introduce new cache layer to resolve this problem. It is independent
and doesn't need initialization like bcache, and will work by default
for all loaders which use the new DISK API. A successful disk_open()
call to each new disk or partition produces new entry in the cache.
Even more, when disk was already open, now opening of any nested
partitions does not require reading top level partition table.
So, if without this cache, partition table metadata was read around
20-50 times during boot, now it reads only once. This affects the booting
from GPT and MBR from the UFS.
in sys/boot/i386/libi386/biosdisk.c. Otherwise, when DISK_DEBUG is
enabled, the DEBUG() macros will clobber those fields, and cause the
probing to always fail mysteriously when debugging is enabled.
and constants related to the BIOS Enhanced Disk Drive Specification.
- Use this header instead of magic numbers and various duplicate structure
definitions for doing I/O.
- Use an actual structure for the request to fetch drive parameters in
drvsize() rather than a gross hack of a char array with some magic
size. While here, change drvsize() to only pass the 1.1 version of
the structure and not request device path information. If we want
device path information you have to set the length of the device
path information as an input (along with probably checking the actual
EDD version to see which size one should use as the device path
information is variable-length). This fixes data smashing problems
from passing an EDD 3 structure to BIOSes supporting EDD 4.
Reviewed by: avg
Tested by: Dennis Koegel dk neveragain.de
MFC after: 1 week
it possible to boot from ZFS RAIDZ for example from within VirtualBox.
The problem with VirtualBox is that its BIOS reports only one disk present.
If we choose to ignore this report, we can find all the disks available.
We can't have this work-around to be turned on by default, because some broken
BIOSes report true when it comes to number of disks, but present the same disk
multiple times.
slicei, Apple EFI hardware), the bootloader will fail to recognize the GPT
if it finds anything else but the EFI partition. Change the check to continue
detecting the GPT by looking at the EFI partition on the MBR but
stopping successfuly after finding it.
PR: kern/134590
Submitted by: Christoph Langguth <christoph at rosenkeller.org>
Reviewed by: jhb
MFC after: 2 weeks
Approved by: re (kib)
open partition. This fixes access to partitions whose starting offset
is >= 2 TB.
Submitted by: "James R. Van Artsdalen" james jrv.org
MFC after: 3 days
booting because the CD driver did not use bounce buffers to ensure
request buffers sent to the BIOS were always in the first 1MB. Copy over
the bounce buffer logic from the BIOS disk driver (minus the 64k boundary
code for floppies) to fix this.
Reported by: kensmith
in make.conf or src.conf.
- When GPT is enabled (which it is by default), use memory above 1 MB and
leave the memory from the end of the bss to the end of the 640k window
purely for the stack. The loader has grown and now it is much more
common for the heap and stack to grow into each other when both are
located in the 640k window.
PR: kern/129526
MFC after: 1 week
- Consolidate the code to humanize the size of a disk partition into a
single function based on the code for GPT partitions and use it for
GPT partitions, BSD slices, and BSD partitions.
- Teach the humanize code to use KB for small partitions (e.g. GPT boot
partitions now show up as 64KB rather than 0MB).
- Pad a few partition type names out so that things line up in the
common case.
MFC after: 1 week
Enhanced Disk Drive Specification Ver 3.0 defines that the version
of extension in AH would be 30h.
Correct the check for that to be >=30h instead of >3h.
MFC after: 2 months
on i386 and amd64 machines. The overall process is that /boot/pmbr lives
in the PMBR (similar to /boot/mbr for MBR disks) and is responsible for
locating and loading /boot/gptboot. /boot/gptboot is similar to /boot/boot
except that it groks GPT rather than MBR + bsdlabel. Unlike /boot/boot,
/boot/gptboot lives in its own dedicated GPT partition with a new
"FreeBSD boot" type. This partition does not have a fixed size in that
/boot/pmbr will load the entire partition into the lower 640k. However,
it is limited in that it can only be 545k. That's still a lot better than
the current 7.5k limit for boot2 on MBR. gptboot mostly acts just like
boot2 in that it reads /boot.config and loads up /boot/loader. Some more
details:
- Include uuid_equal() and uuid_is_nil() in libstand.
- Add a new 'boot' command to gpt(8) which makes a GPT disk bootable using
/boot/pmbr and /boot/gptboot. Note that the disk must have some free
space for the boot partition.
- This required exposing the backend of the 'add' function as a
gpt_add_part() function to the rest of gpt(8). 'boot' uses this to
create a boot partition if needed.
- Don't cripple cgbase() in the UFS boot code for /boot/gptboot so that
it can handle a filesystem > 1.5 TB.
- /boot/gptboot has a simple loader (gptldr) that doesn't do any I/O
unlike boot1 since /boot/pmbr loads all of gptboot up front. The
C portion of gptboot (gptboot.c) has been repocopied from boot2.c.
The primary changes are to parse the GPT to find a root filesystem
and to use 64-bit disk addresses. Currently gptboot assumes that the
first UFS partition on the disk is the / filesystem, but this algorithm
will likely be improved in the future.
- Teach the biosdisk driver in /boot/loader to understand GPT tables.
GPT partitions are identified as 'disk0pX:' (e.g. disk0p2:) which is
similar to the /dev names the kernel uses (e.g. /dev/ad0p2).
- Add a new "freebsd-boot" alias to g_part() for the new boot UUID.
MFC after: 1 month
Discussed with: marcel (some things might still change, but am committing
what I have so far)
on duplicated code and support 64-bit LBAs for GPT.
- The code to manage an EDD or C/H/S I/O request are now in their own
routines. The EDD routine now handles a full 64-bit LBA instead of
truncating LBAs to the lower 32-bits. (MBRs and BSD labels only
have 32-bit LBAs anyway, so the only LBAs ever passed down were 32-bit).
- All of the bounce buffer and retry logic duplicated in bd_read() and
bd_write() are merged into a single bd_io() routine that takes an
extra direction argument. bd_read() and bd_write() are now simple
wrappers around bd_io().
- If a disk supports EDD then always use it rather than only using it if
the cylinder is > 1023. Other parts of the boot code already do
something similar to this. Also, GPT just uses LBAs, so for a GPT disk
it's probably best to ignore C/H/S completely. Always using EDD when
it is supported by a disk is an easy way to accomplish this.
MFC after: 1 week
macros to treat the 'slice' field as a real part of the bootdev instead
of as hack that spans two other fields (adaptor (sic) and controller)
that are not used in any modern FreeBSD boot code.
MFC after: 1 week
device (kind) specific unit field to the common field. This change
allows a future version of libefi to work without requiring anything
more than what is defined in struct devdesc and as such makes it
possible to compile said version of libefi for different platforms
without requiring that those platforms have identical derivatives
of struct devdesc.
fixes filesystem corruption when nextboot.conf is located after
cylinder 1023. The bug appears to have been introduced at the time
bd_read was copied to create bd_write.
PR: bin/98005
Reported by: yar
MFC after: 1 week
memory directly available to loader(8) and friends was limited to 640K on i386.
Those times have passed long time ago and now loader(8) can directly access
up to 4GB of RAM at least theoretically. At the same time, there are several
places where it's assumed that malloc() will only allocate memory within
first megabyte.
Remove that assumption by allocating appropriate bounce buffers for BIOS
calls on stack where necessary.
This allows using memory above first megabyte for heap if necessary.
use a bounce buffer for the actual transfer to avoid crossing a 64k
boundary. To do this, we malloc a buffer twice as big as we need and then
find an aligned block within that buffer to do the transfer. The check
to see which part of the block we use used the wrong variable for part of
the condition meaning that in certain edge cases we would ask the BIOS to
cross a 64k boundary. The BIOS request would then fail resulting in file
transfers that just magically fail in the middle without any apparent
reason. Specifically, my tests for the splitfs boot floppies managed to
trigger this edge case.
MFC after: 1 week
X-MFC-info: along with fixes to libstand filesystems
Move the remaining bits of <sys/diskslice.h> to <i386/include/bootinfo.h>
Move i386/pc98 specific bits from <sys/reboot.h> to
<i386/include/bootinfo.h> as well.
Adjust includes in sys/boot accordingly.
Peter had repocopied sys/disklabel.h to sys/diskpc98.h and sys/diskmbr.h.
These two new copies are still intact copies of disklabel.h and
therefore protected by #ifndef _SYS_DISKLABEL_H_ so #including them
in programs which already include <sys.disklabel.h> is currently a
no-op.
This commit adds a number of such #includes.
Once I have verified that I have fixed all the places which need fixing,
I will commit the updated versions of the three #include files.
Sponsored by: DARPA & NAI Labs.