This bring huge amount of changes, I'll enumerate only user-visible changes:
- Delegated Administration
Allows regular users to perform ZFS operations, like file system
creation, snapshot creation, etc.
- L2ARC
Level 2 cache for ZFS - allows to use additional disks for cache.
Huge performance improvements mostly for random read of mostly
static content.
- slog
Allow to use additional disks for ZFS Intent Log to speed up
operations like fsync(2).
- vfs.zfs.super_owner
Allows regular users to perform privileged operations on files stored
on ZFS file systems owned by him. Very careful with this one.
- chflags(2)
Not all the flags are supported. This still needs work.
- ZFSBoot
Support to boot off of ZFS pool. Not finished, AFAIK.
Submitted by: dfr
- Snapshot properties
- New failure modes
Before if write requested failed, system paniced. Now one
can select from one of three failure modes:
- panic - panic on write error
- wait - wait for disk to reappear
- continue - serve read requests if possible, block write requests
- Refquota, refreservation properties
Just quota and reservation properties, but don't count space consumed
by children file systems, clones and snapshots.
- Sparse volumes
ZVOLs that don't reserve space in the pool.
- External attributes
Compatible with extattr(2).
- NFSv4-ACLs
Not sure about the status, might not be complete yet.
Submitted by: trasz
- Creation-time properties
- Regression tests for zpool(8) command.
Obtained from: OpenSolaris
boot an amd64 kernel. If not, then fail the boot request with an error
message. Otherwise, the boot attempt will fail with a BTX fault when
trying to read the EFER MSR.
MFC after: 3 days
commit, calling i386_parsedev(..., X, ...) where X is "ad", "bge", or
any other disk or network device name without a unit number, would
result in dereferencing whatever happened to be on the stack where the
variable "cp" is stored.
Found by: LLVM/Clang Static Checker
libi386's time(), caused by a qemu bug. The bug might
be present in other BIOSes, too.
qemu either does not simulate the AT RTC correctly or
has a broken BIOS 1A/02 implementation, and will return
an incorrect value if the RTC is read while it is being
updated.
The effect is worsened by the fact that qemu's INT 15/86
function ("wait" a.k.a. usleep) is non-implmeneted or
broken and returns immediately, causing beastie.4th to
spin in a tight loop calling the "read RTC" function
millions of times, triggering the problem quickly.
Therefore, we keep reading the BIOS value until we get
the same result twice. This change fixes beastie.4th's
countdown under qemu.
Approved by: des (mentor)
entry in the SMAP is a 20 byte structure and they are queried from the
BIOS via sucessive BIOS calls. Due to an apparent bug in the R900's
BIOS, for some SMAP requests the BIOS overflows the 20 byte buffer
trashing a few bytes of memory immediately after the SMAP structure. As
a workaround, add 8 bytes of padding after the SMAP structure used in
the loader for SMAP queries.
PR: i386/122668
Submitted by: Mike Hibler mike flux.utah.edu, silby
MFC after: 3 days
- Consolidate the code to humanize the size of a disk partition into a
single function based on the code for GPT partitions and use it for
GPT partitions, BSD slices, and BSD partitions.
- Teach the humanize code to use KB for small partitions (e.g. GPT boot
partitions now show up as 64KB rather than 0MB).
- Pad a few partition type names out so that things line up in the
common case.
MFC after: 1 week
Enhanced Disk Drive Specification Ver 3.0 defines that the version
of extension in AH would be 30h.
Correct the check for that to be >=30h instead of >3h.
MFC after: 2 months
on i386 and amd64 machines. The overall process is that /boot/pmbr lives
in the PMBR (similar to /boot/mbr for MBR disks) and is responsible for
locating and loading /boot/gptboot. /boot/gptboot is similar to /boot/boot
except that it groks GPT rather than MBR + bsdlabel. Unlike /boot/boot,
/boot/gptboot lives in its own dedicated GPT partition with a new
"FreeBSD boot" type. This partition does not have a fixed size in that
/boot/pmbr will load the entire partition into the lower 640k. However,
it is limited in that it can only be 545k. That's still a lot better than
the current 7.5k limit for boot2 on MBR. gptboot mostly acts just like
boot2 in that it reads /boot.config and loads up /boot/loader. Some more
details:
- Include uuid_equal() and uuid_is_nil() in libstand.
- Add a new 'boot' command to gpt(8) which makes a GPT disk bootable using
/boot/pmbr and /boot/gptboot. Note that the disk must have some free
space for the boot partition.
- This required exposing the backend of the 'add' function as a
gpt_add_part() function to the rest of gpt(8). 'boot' uses this to
create a boot partition if needed.
- Don't cripple cgbase() in the UFS boot code for /boot/gptboot so that
it can handle a filesystem > 1.5 TB.
- /boot/gptboot has a simple loader (gptldr) that doesn't do any I/O
unlike boot1 since /boot/pmbr loads all of gptboot up front. The
C portion of gptboot (gptboot.c) has been repocopied from boot2.c.
The primary changes are to parse the GPT to find a root filesystem
and to use 64-bit disk addresses. Currently gptboot assumes that the
first UFS partition on the disk is the / filesystem, but this algorithm
will likely be improved in the future.
- Teach the biosdisk driver in /boot/loader to understand GPT tables.
GPT partitions are identified as 'disk0pX:' (e.g. disk0p2:) which is
similar to the /dev names the kernel uses (e.g. /dev/ad0p2).
- Add a new "freebsd-boot" alias to g_part() for the new boot UUID.
MFC after: 1 month
Discussed with: marcel (some things might still change, but am committing
what I have so far)
on duplicated code and support 64-bit LBAs for GPT.
- The code to manage an EDD or C/H/S I/O request are now in their own
routines. The EDD routine now handles a full 64-bit LBA instead of
truncating LBAs to the lower 32-bits. (MBRs and BSD labels only
have 32-bit LBAs anyway, so the only LBAs ever passed down were 32-bit).
- All of the bounce buffer and retry logic duplicated in bd_read() and
bd_write() are merged into a single bd_io() routine that takes an
extra direction argument. bd_read() and bd_write() are now simple
wrappers around bd_io().
- If a disk supports EDD then always use it rather than only using it if
the cylinder is > 1023. Other parts of the boot code already do
something similar to this. Also, GPT just uses LBAs, so for a GPT disk
it's probably best to ignore C/H/S completely. Always using EDD when
it is supported by a disk is an easy way to accomplish this.
MFC after: 1 week
macros to treat the 'slice' field as a real part of the bootdev instead
of as hack that spans two other fields (adaptor (sic) and controller)
that are not used in any modern FreeBSD boot code.
MFC after: 1 week
to get the physical address doesn't work for all values of KVA_PAGES,
while masking 8 MSBs works for all values of KVA_PAGES that are
multiple of 4 for non-PAE and 8 for PAE. (This leaves us limited
with 12MB for non-PAE kernels and 14MB for PAE kernels.)
To get things right, we'd need to subtract the KERNBASE from the
virtual address (but KERNBASE is not easy to figure out from here),
or have physical addresses set properly in the ELF headers.
Discussed with: jhb
device (kind) specific unit field to the common field. This change
allows a future version of libefi to work without requiring anything
more than what is defined in struct devdesc and as such makes it
possible to compile said version of libefi for different platforms
without requiring that those platforms have identical derivatives
of struct devdesc.
are no longer limited to a virtual address space of 16 megabytes,
only mask high two bits of a virtual address. This allows to load
larger kernels (up to 1 gigabyte). Not masking addresses at all
was a bad idea on machines with less than >3G of memory -- kernels
are linked at 0xc0xxxxxx, and that would attempt to load a kernel
at above 3G. By masking only two highest bits we stay within the
safe limits while still allowing to boot larger kernels.
(This is a safer reimplmentation of sys/boot/i386/boot2/boot.2.c
rev. 1.71.)
Prodded by: jhb
Tested by: nyan (pc98)
fixes filesystem corruption when nextboot.conf is located after
cylinder 1023. The bug appears to have been introduced at the time
bd_read was copied to create bd_write.
PR: bin/98005
Reported by: yar
MFC after: 1 week
Use 'BOOT_SENSITIVE_INFO=YES' variable to turn them on.
- Use 'uint*_t' instead of 'u_int*_t', correct compilation warnings, and
update copyright while I am here.
3MB of physical memory for heap instead of range between 1MB and 4MB.
This makes this feature working with PAE and amd64 kernels, which are
loaded at 2MB. Teach i386_copyin() to avoid using range allocated by
heap in such case, so that it won't trash heap in the low memory
conditions.
This should make loading bzip2-compressed kernels/modules/mfs images
generally useable, so that re@ team is welcome to evaluate merits
of using this feature in the installation CDs.
Valuable suggestions by: jhb
memory directly available to loader(8) and friends was limited to 640K on i386.
Those times have passed long time ago and now loader(8) can directly access
up to 4GB of RAM at least theoretically. At the same time, there are several
places where it's assumed that malloc() will only allocate memory within
first megabyte.
Remove that assumption by allocating appropriate bounce buffers for BIOS
calls on stack where necessary.
This allows using memory above first megabyte for heap if necessary.
the serial console speed (i386 and amd64 only). If the previous
stage boot loader requested a serial console (RB_SERIAL or RB_MULTIPLE)
then the default speed is determined from the current serial port
speed. Otherwise it is set to 9600 or the value of BOOT_COMCONSOLE_SPEED
at compile time.
This makes it possible to set the serial port speed once in
/boot.config and the setting will propagate to boot2, loader and
the kernel serial console.
variables to loader:
hint.smbios.0.enabled "YES" when SMBIOS is detected
hint.smbios.0.bios.vendor BIOS vendor
hint.smbios.0.bios.version BIOS version
hint.smbios.0.bios.reldate BIOS release date
hint.smbios.0.system.maker System manufacturer
hint.smbios.0.system.product System product name
hint.smbios.0.system.version System version number
hint.smbios.0.planar.maker Base board manufacturer
hint.smbios.0.planar.product Base board product name
hint.smbios.0.planar.version Base board version number
hint.smbios.0.chassis.maker Enclosure manufacturer
hint.smbios.0.chassis.version Enclosure version
These strings can be used to detect hardware quirks and to set appropriate
flags. For example, Compaq R3000 series and some HP laptops require
hint.atkbd.0.flags="0x9"
to boot. See amd64/67745 for more detail.
Note: Please do not abuse this feature to resolve general problem when it
can be fixed programmatically. This must be used as a last resort.
PR: kern/81449
Approved by: anholt (mentor)
- Teach the i386 and pc98 loaders to honor multiple console requests from
their respective boot2 binaries so that the same console(s) are used in
both boot2 and the loader.
- Since the kernel doesn't support multiple consoles, whichever console is
listed first is treated as the "primary" console and is passed to the
kernel in the boot_howto flags.
PR: kern/66425
Submitted by: Gavin Atkinson gavin at ury dot york dot ac dot uk
MFC after: 1 week
to stll be able to mount NFS root as prescribed by DCHP configuration. Since
pxeboot is using TFTP to get to the files, pxeboot can not rely on NFS to
provide it a root directory hande as a side effect. pxeboot has to make RPC
mount call itself.
format modules, which are currently only used on the amd64 platform.
This initial implementation just parses enough of the module to
allow it to extract dependencies and load all the bits into the
right place in memory, so the kernel must still do the full relocation
and linking. The details of the loaded sections are passed to the
kernel by supplying a copy of the ELF section header table as module
metadata with the MODINFOMD_SHDR tag.
fills its field (6 characters). In that case the OEMID is not
null-terminated, and the sprintf that was used would copy up to the
next null byte, which could be pretty far away.
switch to using C99-style comments everywhere in preprocessed
assembler. The reason is that lines starting with the regexp
'^[[:space:]]#' are treated as preprocessing directives, and
while it seems to work now with GCC, it's not necessarily has
to work. Use C99 comments `//' for the trailing comments to
save whitespace.
- do not use PROG for what's not a real C program,
- use sys.mk transformation rules where possible,
- only create the "machine" symlink on AMD64,
- removed MAINTAINER lines in individual makefiles,
- added the LIBSTAND defitinion to <bsd.libnames.mk>,
- somewhat better contents in .depend files.
Tested on: i386, amd64
Prodded by: bde
use a bounce buffer for the actual transfer to avoid crossing a 64k
boundary. To do this, we malloc a buffer twice as big as we need and then
find an aligned block within that buffer to do the transfer. The check
to see which part of the block we use used the wrong variable for part of
the condition meaning that in certain edge cases we would ask the BIOS to
cross a 64k boundary. The BIOS request would then fail resulting in file
transfers that just magically fail in the middle without any apparent
reason. Specifically, my tests for the splitfs boot floppies managed to
trigger this edge case.
MFC after: 1 week
X-MFC-info: along with fixes to libstand filesystems
the terminating '\0'. Since the initialisation of rootpath in
libstand/bootp.c may copy junk into the rest of the buffer, it was
possible for the code to find a ':' after the '\0' and do the wrong
thing.
Reviewed by: ps
MFC after: 1 week
things over floppy size limits, I can exclude it for release builds or
something like that. Most of the changes are to get the load_elf.c file
into a seperate elf32_ or elf64_ namespace so that you can have two
ELF loaders present at once. Note that for 64 bit kernels, it actually
starts up the kernel already in 64 bit mode with paging enabled. This
is really easy because we have a known minimum feature set.
Of note is that for amd64, we have to pass in the bios int 15 0xe821
memory map because once in long mode, you absolutely cannot make VM86
calls. amd64 does not use 'struct bootinfo' at all. It is a pure loader
metadata startup, just like sparc64 and powerpc. Much of the
infrastructure to support this was adapted from sparc64.
* AcpiOsDerivePciId(): finds a bus number, given the slot/func and the
acpi parse tree.
* AcpiOsPredefinedOverride(): use the sysctl hw.acpi.os_name to
override the value for _OS.
Ideas from: takawata, jhb
Reviewed by: takawata, marcel
Tested on: i386, ia64
Move the remaining bits of <sys/diskslice.h> to <i386/include/bootinfo.h>
Move i386/pc98 specific bits from <sys/reboot.h> to
<i386/include/bootinfo.h> as well.
Adjust includes in sys/boot accordingly.
Peter had repocopied sys/disklabel.h to sys/diskpc98.h and sys/diskmbr.h.
These two new copies are still intact copies of disklabel.h and
therefore protected by #ifndef _SYS_DISKLABEL_H_ so #including them
in programs which already include <sys.disklabel.h> is currently a
no-op.
This commit adds a number of such #includes.
Once I have verified that I have fixed all the places which need fixing,
I will commit the updated versions of the three #include files.
Sponsored by: DARPA & NAI Labs.
Fix device hints entry for disabling acpi(4).
This also should fix the arbitration with apm(4) when both drivers
are enabled.
Note that your /boot/device.hints needs to be updated if you want to
stop auto-loading acpi.ko or disable acpi(4).
this was quite broken, it never was updated for metadata support.
The a.out kld file support was never really used, as it wasn't necessary.
You could always load elf kld's, even in an a.out kernel.
Add definition of COMPILER_DEPENDENT_INT64 and also
fix definition of COMPILER_DEPENDENT_UINT64.
Pointed-out by: Michael Nottebrock <michaelnottebrock@gmx.net>
filesystem expands the inode to 256 bytes to make space for 64-bit
block pointers. It also adds a file-creation time field, an ability
to use jumbo blocks per inode to allow extent like pointer density,
and space for extended attributes (up to twice the filesystem block
size worth of attributes, e.g., on a 16K filesystem, there is space
for 32K of attributes). UFS2 fully supports and runs existing UFS1
filesystems. New filesystems built using newfs can be built in either
UFS1 or UFS2 format using the -O option. In this commit UFS1 is
the default format, so if you want to build UFS2 format filesystems,
you must specify -O 2. This default will be changed to UFS2 when
UFS2 proves itself to be stable. In this commit the boot code for
reading UFS2 filesystems is not compiled (see /sys/boot/common/ufsread.c)
as there is insufficient space in the boot block. Once the size of the
boot block is increased, this code can be defined.
Things to note: the definition of SBSIZE has changed to SBLOCKSIZE.
The header file <ufs/ufs/dinode.h> must be included before
<ufs/ffs/fs.h> so as to get the definitions of ufs2_daddr_t and
ufs_lbn_t.
Still TODO:
Verify that the first level bootstraps work for all the architectures.
Convert the utility ffsinfo to understand UFS2 and test growfs.
Add support for the extended attribute storage. Update soft updates
to ensure integrity of extended attribute storage. Switch the
current extended attribute interfaces to use the extended attribute
storage. Add the extent like functionality (framework is there,
but is currently never used).
Sponsored by: DARPA & NAI Labs.
Reviewed by: Poul-Henning Kamp <phk@freebsd.org>