it possible to boot from ZFS RAIDZ for example from within VirtualBox.
The problem with VirtualBox is that its BIOS reports only one disk present.
If we choose to ignore this report, we can find all the disks available.
We can't have this work-around to be turned on by default, because some broken
BIOSes report true when it comes to number of disks, but present the same disk
multiple times.
the file handle's size and was recently committed to
lib/libstand/nfs.c. This allows pxeboot to use NFSv3 and work
correcty for non-FreeBSD as well as FreeBSD NFS servers.
If built with OLD_NFSV2 defined, the old
code that predated this patch will be used.
Tested by: danny at cs.huji.ac.il
heap when using a range above 1MB.
Previously the loader would always use the last 3MB in the first memory
range above 1MB for the heap. However, this memory range is also where the
kernel and any modules are loaded. If this memory range is "small", then
using the high 3MB for the heap may not leave enough room for the kernel
and modules.
Now the loader will use any range below 4GB for the heap, and the logic to
choose the "high" heap region has moved into biosmem.c. It sets two
variables that the loader can use for a high heap if it desires. When a
high heap is enabled (BZIP2, FireWire, GPT, or ZFS), then the following
memory ranges are preferred for the heap in order from best to worst:
- The largest memory region in the SMAP with a start address greater than
1MB. The memory region must be at least 3MB in length. This leaves the
region starting at 1MB purely for use by the kernel and modules.
- The last 3MB of the memory region starting at 1MB if it is at least 3MB
in size. This matches the current behavior except that the current loader
would break horribly if the first region was not at least 3MB in size.
- The memory range from the end of the loader up to the 640k window. This
is the range the loader uses when none of the high-heap-requesting options
are enabled.
Tested by: hrs
MFC after: 1 week
video console which doesn't take any input from keyboard and hides
all output replacing it with ``spinning'' character (useful for
embedded products and custom installations).
Sponsored by: Sippy Software, Inc.
slicei, Apple EFI hardware), the bootloader will fail to recognize the GPT
if it finds anything else but the EFI partition. Change the check to continue
detecting the GPT by looking at the EFI partition on the MBR but
stopping successfuly after finding it.
PR: kern/134590
Submitted by: Christoph Langguth <christoph at rosenkeller.org>
Reviewed by: jhb
MFC after: 2 weeks
Approved by: re (kib)
- Do not iterate int 15h, function e820h twice. Instead, we use STAILQ to
store each return buffer and copy all at once.
- Export optional extended attributes defined in ACPI 3.0 as separate
metadata. Currently, there are only two bits defined in the specification.
For example, if the descriptor has extended attributes and it is not
enabled, it has to be ignored by OS. We may implement it in the kernel
later if it is necessary and proven correct in reality.
- Check return buffer size strictly as suggested in ACPI 3.0.
Reviewed by: jhb
open partition. This fixes access to partitions whose starting offset
is >= 2 TB.
Submitted by: "James R. Van Artsdalen" james jrv.org
MFC after: 3 days
- First three fields of system UUID may be little-endian as described in
SMBIOS Specification v2.6. For now, we keep the network byte order for
backward compatibility (and consistency with popular dmidecode tool)
if SMBIOS table revision is less than 2.6. However, little-endian format
can be forced by defining BOOT_LITTLE_ENDIAN_UUID from make.conf(5) if it
is necessary.
- Replace overly ambitious optimizations with more readable code.
- Update comments to SMBIOS Specification v2.6 and clean up style(9) bugs.
as 'real memory' instead of Maxmem if the value is available.
Note amd64 displayed physmem as 'usable memory' since machdep.c r1.640
to unconfuse users. Now it is consistent across amd64 and i386 again.
While I am here, clean up smbios.c a bit and update copyright date.
Reviewed by: jhb
booting because the CD driver did not use bounce buffers to ensure
request buffers sent to the BIOS were always in the first 1MB. Copy over
the bounce buffer logic from the BIOS disk driver (minus the 64k boundary
code for floppies) to fix this.
Reported by: kensmith
in make.conf or src.conf.
- When GPT is enabled (which it is by default), use memory above 1 MB and
leave the memory from the end of the bss to the end of the 640k window
purely for the stack. The loader has grown and now it is much more
common for the heap and stack to grow into each other when both are
located in the 640k window.
PR: kern/129526
MFC after: 1 week
This bring huge amount of changes, I'll enumerate only user-visible changes:
- Delegated Administration
Allows regular users to perform ZFS operations, like file system
creation, snapshot creation, etc.
- L2ARC
Level 2 cache for ZFS - allows to use additional disks for cache.
Huge performance improvements mostly for random read of mostly
static content.
- slog
Allow to use additional disks for ZFS Intent Log to speed up
operations like fsync(2).
- vfs.zfs.super_owner
Allows regular users to perform privileged operations on files stored
on ZFS file systems owned by him. Very careful with this one.
- chflags(2)
Not all the flags are supported. This still needs work.
- ZFSBoot
Support to boot off of ZFS pool. Not finished, AFAIK.
Submitted by: dfr
- Snapshot properties
- New failure modes
Before if write requested failed, system paniced. Now one
can select from one of three failure modes:
- panic - panic on write error
- wait - wait for disk to reappear
- continue - serve read requests if possible, block write requests
- Refquota, refreservation properties
Just quota and reservation properties, but don't count space consumed
by children file systems, clones and snapshots.
- Sparse volumes
ZVOLs that don't reserve space in the pool.
- External attributes
Compatible with extattr(2).
- NFSv4-ACLs
Not sure about the status, might not be complete yet.
Submitted by: trasz
- Creation-time properties
- Regression tests for zpool(8) command.
Obtained from: OpenSolaris
boot an amd64 kernel. If not, then fail the boot request with an error
message. Otherwise, the boot attempt will fail with a BTX fault when
trying to read the EFER MSR.
MFC after: 3 days
commit, calling i386_parsedev(..., X, ...) where X is "ad", "bge", or
any other disk or network device name without a unit number, would
result in dereferencing whatever happened to be on the stack where the
variable "cp" is stored.
Found by: LLVM/Clang Static Checker
libi386's time(), caused by a qemu bug. The bug might
be present in other BIOSes, too.
qemu either does not simulate the AT RTC correctly or
has a broken BIOS 1A/02 implementation, and will return
an incorrect value if the RTC is read while it is being
updated.
The effect is worsened by the fact that qemu's INT 15/86
function ("wait" a.k.a. usleep) is non-implmeneted or
broken and returns immediately, causing beastie.4th to
spin in a tight loop calling the "read RTC" function
millions of times, triggering the problem quickly.
Therefore, we keep reading the BIOS value until we get
the same result twice. This change fixes beastie.4th's
countdown under qemu.
Approved by: des (mentor)
entry in the SMAP is a 20 byte structure and they are queried from the
BIOS via sucessive BIOS calls. Due to an apparent bug in the R900's
BIOS, for some SMAP requests the BIOS overflows the 20 byte buffer
trashing a few bytes of memory immediately after the SMAP structure. As
a workaround, add 8 bytes of padding after the SMAP structure used in
the loader for SMAP queries.
PR: i386/122668
Submitted by: Mike Hibler mike flux.utah.edu, silby
MFC after: 3 days
- Consolidate the code to humanize the size of a disk partition into a
single function based on the code for GPT partitions and use it for
GPT partitions, BSD slices, and BSD partitions.
- Teach the humanize code to use KB for small partitions (e.g. GPT boot
partitions now show up as 64KB rather than 0MB).
- Pad a few partition type names out so that things line up in the
common case.
MFC after: 1 week
Enhanced Disk Drive Specification Ver 3.0 defines that the version
of extension in AH would be 30h.
Correct the check for that to be >=30h instead of >3h.
MFC after: 2 months
on i386 and amd64 machines. The overall process is that /boot/pmbr lives
in the PMBR (similar to /boot/mbr for MBR disks) and is responsible for
locating and loading /boot/gptboot. /boot/gptboot is similar to /boot/boot
except that it groks GPT rather than MBR + bsdlabel. Unlike /boot/boot,
/boot/gptboot lives in its own dedicated GPT partition with a new
"FreeBSD boot" type. This partition does not have a fixed size in that
/boot/pmbr will load the entire partition into the lower 640k. However,
it is limited in that it can only be 545k. That's still a lot better than
the current 7.5k limit for boot2 on MBR. gptboot mostly acts just like
boot2 in that it reads /boot.config and loads up /boot/loader. Some more
details:
- Include uuid_equal() and uuid_is_nil() in libstand.
- Add a new 'boot' command to gpt(8) which makes a GPT disk bootable using
/boot/pmbr and /boot/gptboot. Note that the disk must have some free
space for the boot partition.
- This required exposing the backend of the 'add' function as a
gpt_add_part() function to the rest of gpt(8). 'boot' uses this to
create a boot partition if needed.
- Don't cripple cgbase() in the UFS boot code for /boot/gptboot so that
it can handle a filesystem > 1.5 TB.
- /boot/gptboot has a simple loader (gptldr) that doesn't do any I/O
unlike boot1 since /boot/pmbr loads all of gptboot up front. The
C portion of gptboot (gptboot.c) has been repocopied from boot2.c.
The primary changes are to parse the GPT to find a root filesystem
and to use 64-bit disk addresses. Currently gptboot assumes that the
first UFS partition on the disk is the / filesystem, but this algorithm
will likely be improved in the future.
- Teach the biosdisk driver in /boot/loader to understand GPT tables.
GPT partitions are identified as 'disk0pX:' (e.g. disk0p2:) which is
similar to the /dev names the kernel uses (e.g. /dev/ad0p2).
- Add a new "freebsd-boot" alias to g_part() for the new boot UUID.
MFC after: 1 month
Discussed with: marcel (some things might still change, but am committing
what I have so far)
on duplicated code and support 64-bit LBAs for GPT.
- The code to manage an EDD or C/H/S I/O request are now in their own
routines. The EDD routine now handles a full 64-bit LBA instead of
truncating LBAs to the lower 32-bits. (MBRs and BSD labels only
have 32-bit LBAs anyway, so the only LBAs ever passed down were 32-bit).
- All of the bounce buffer and retry logic duplicated in bd_read() and
bd_write() are merged into a single bd_io() routine that takes an
extra direction argument. bd_read() and bd_write() are now simple
wrappers around bd_io().
- If a disk supports EDD then always use it rather than only using it if
the cylinder is > 1023. Other parts of the boot code already do
something similar to this. Also, GPT just uses LBAs, so for a GPT disk
it's probably best to ignore C/H/S completely. Always using EDD when
it is supported by a disk is an easy way to accomplish this.
MFC after: 1 week
macros to treat the 'slice' field as a real part of the bootdev instead
of as hack that spans two other fields (adaptor (sic) and controller)
that are not used in any modern FreeBSD boot code.
MFC after: 1 week
to get the physical address doesn't work for all values of KVA_PAGES,
while masking 8 MSBs works for all values of KVA_PAGES that are
multiple of 4 for non-PAE and 8 for PAE. (This leaves us limited
with 12MB for non-PAE kernels and 14MB for PAE kernels.)
To get things right, we'd need to subtract the KERNBASE from the
virtual address (but KERNBASE is not easy to figure out from here),
or have physical addresses set properly in the ELF headers.
Discussed with: jhb
device (kind) specific unit field to the common field. This change
allows a future version of libefi to work without requiring anything
more than what is defined in struct devdesc and as such makes it
possible to compile said version of libefi for different platforms
without requiring that those platforms have identical derivatives
of struct devdesc.
are no longer limited to a virtual address space of 16 megabytes,
only mask high two bits of a virtual address. This allows to load
larger kernels (up to 1 gigabyte). Not masking addresses at all
was a bad idea on machines with less than >3G of memory -- kernels
are linked at 0xc0xxxxxx, and that would attempt to load a kernel
at above 3G. By masking only two highest bits we stay within the
safe limits while still allowing to boot larger kernels.
(This is a safer reimplmentation of sys/boot/i386/boot2/boot.2.c
rev. 1.71.)
Prodded by: jhb
Tested by: nyan (pc98)
fixes filesystem corruption when nextboot.conf is located after
cylinder 1023. The bug appears to have been introduced at the time
bd_read was copied to create bd_write.
PR: bin/98005
Reported by: yar
MFC after: 1 week