Add support for Pre-Boot Virtual Memory (PBVM) to the loader.
PBVM allows us to link the kernel at a fixed virtual address without
having to make any assumptions about the physical memory layout. On
the SGI Altix 350 for example, there's no usuable physical memory
below 192GB. Also, the PBVM allows us to control better where we're
going to physically load the kernel and its modules so that we can
make sure we load the kernel in memory that's close to the BSP.
The PBVM is managed by a simple page table. The minimum size of the
page table is 4KB (EFI page size) and the maximum is currently set
to 1MB. A page in the PBVM is 64KB, as that's the maximum alignment
one can specify in a linker script. The bottom line is that PBVM is
between 64KB and 8GB in size.
The loader maps the PBVM page table at a fixed virtual address and
using a single translations. The PBVM itself is also mapped using a
single translation for a maximum of 32MB.
While here, increase the heap in the EFI loader from 512KB to 2MB
and set the stage for supporting relocatable modules.
o bunch of variables are turned into uint8_t
o initial setting of namep[] in lookup() is removed
as it's only overwritten a few lines down
o kname is explicitly initialized in main() as BSS
in boot2 is not zeroed
o the setting and reading of "fmt" in load() is removed
o buf in printf() is made static to save space
Reviewed by: jhb
Tested by: me and Fabian Keil <freebsd-listen fabiankeil de>
little further. This gets us further on the way to be able to build it
successfully with clang. Using in-tree gcc, this shrinks boot2.bin with
60 bytes, the in-tree clang shaves off 72 bytes, and ToT clang 84 bytes.
Submitted by: rdivacky
Reviewed by: imp
clean up most layering violations:
sys/boot/i386/common/rbx.h:
RBX_* defines
OPT_SET()
OPT_CHECK()
sys/boot/common/util.[ch]:
memcpy()
memset()
memcmp()
bcpy()
bzero()
bcmp()
strcmp()
strncmp() [new]
strcpy()
strcat()
strchr()
strlen()
printf()
sys/boot/i386/common/cons.[ch]:
ioctrl
putc()
xputc()
putchar()
getc()
xgetc()
keyhit() [now takes number of seconds as an argument]
getstr()
sys/boot/i386/common/drv.[ch]:
struct dsk
drvread()
drvwrite() [new]
drvsize() [new]
sys/boot/common/crc32.[ch] [new]
sys/boot/common/gpt.[ch] [new]
- Teach gptboot and gptzfsboot about new files. I haven't touched the
rest, but there is still a lot of code duplication to be removed.
- Implement full GPT support. Currently we just read primary header and
partition table and don't care about checksums, etc. After this change we
verify checksums of primary header and primary partition table and if
there is a problem we fall back to backup header and backup partition
table.
- Clean up most messages to use prefix of boot program, so in case of an
error we know where the error comes from, eg.:
gptboot: unable to read primary GPT header
- If we can't boot, print boot prompt only once and not every five
seconds.
- Honour newly added GPT attributes:
bootme - this is bootable partition
bootonce - try to boot from this partition only once
bootfailed - we failed to boot from this partition
- Change boot order of gptboot to the following:
1. Try to boot from all the partitions that have both 'bootme'
and 'bootonce' attributes one by one.
2. Try to boot from all the partitions that have only 'bootme'
attribute one by one.
3. If there are no partitions with 'bootme' attribute, boot from
the first UFS partition.
- The 'bootonce' functionality is implemented in the following way:
1. Walk through all the partitions and when 'bootonce'
attribute is found without 'bootme' attribute, remove
'bootonce' attribute and set 'bootfailed' attribute.
'bootonce' attribute alone means that we tried to boot from
this partition, but boot failed after leaving gptboot and
machine was restarted.
2. Find partition with both 'bootme' and 'bootonce' attributes.
3. Remove 'bootme' attribute.
4. Try to execute /boot/loader or /boot/kernel/kernel from that
partition. If succeeded we stop here.
5. If execution failed, remove 'bootonce' and set 'bootfailed'.
6. Go to 2.
If whole boot succeeded there is new /etc/rc.d/gptboot script coming
that will log all partitions that we failed to boot from (the ones with
'bootfailed' attribute) and will remove this attribute. It will also
find partition with 'bootonce' attribute - this is the partition we
booted from successfully. The script will log success and remove the
attribute.
All the GPT updates we do here goes to both primary and backup GPT if
they are valid. We don't touch headers or partition tables when
checksum doesn't match.
Reviewed by: arch (Message-ID: <20100917234542.GE1902@garage.freebsd.pl>)
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
MFC after: 2 weeks
Clang to compile this file: it was using the builtin memcpy and we want
to use the memcpy defined in gptboot.c. (Clang can't compile boot2 yet).
Submitted by: Dimitry Andric <dimitry at andric.com>
Reviewed by: jhb
Current code doesn't check size of elf sections and may perform needless
actions of zero-sized memory allocation and similar.
The bigger issue is that alignment requirement of a zero-sized section
gets effectively applied to the next section if it has smaller alignment
requirement. But other tools, like gdb and consequently kgdb,
completely ignore zero-sized sections and thus may map symbols to
addresses differently.
Zero-sized sections are not typical in general.
Their typical (only, even) cause in FreeBSD modules is inline assembly that
creates custom sections which is found in pcpu.h and vnet.h. Mere inclusion
of one of those header files produces a custom section in elf output.
If there is no actual use for the section in a given module, then the
section remains empty.
Better solution is to avoid creating zero-sized sections altogether,
which is in plans.
Preloaded modules are handled in boot code (load_elf_obj.c), while
dynamically loaded modules are handled by kernel (link_elf_obj.c).
Based on code by: np
MFC after: 3 weeks
as this only allows us to access file systems that EFI knows about.
With a loader that can only use EFI-supported file systems, we're
forced to put /boot on the EFI system partition. This is suboptimal
in the following ways:
1. With /boot a symlink to /efi/boot, mergemaster complains about
the mismatch and there's no quick solution.
2. The EFI loader can only boot a single version of FreeBSD. There's
no way to install multiple versions of FreeBSD and select one
at the loader prompt.
3. ZFS maintains /boot/zfs/zpool.cache and with /boot a symlink we
end up with the file on a MSDOS file system. ZFS does not have
proper handling of file systems that are under Giant.
Implement a disk device based on the block I/O protocol instead and
pull in file system code from libstand. The disk devices are really
the partitions that EFI knows about.
This change is backward compatible.
MFC after: 1 week
by keeping it opened after the first open and closing it via the
cleanup handler when NETIF_OPEN_CLOSE_ONCE is defined in order to
avoid the open-close-dance on every file access which with firmware
that for example performs an auto-negotiation on every open causes
netbooting to take horribly long. Basically the behavior with this
knob enabled resembles the one employed between r60506 and r177108
(and for sparc64 also again since r182919) with the addition that
the network device now is closed eventually before entering the
kernel and before rebooting. Actually I think this should be the
desired MI behavior, however the U-Boot loader actually requires
net_close() to be called after every transaction in order for some
local shutdown operations to be performed (and which I think thus
will break on concurrent opens, i.e. when netdev_opens is > 1, like
the loader does at least for disks when LOADER_GZIP_SUPPORT is
enabled).
- Use NETIF_OPEN_CLOSE_ONCE to replace the hack, which artificially
increased netdev_opens for sparc64 in order to keep the network
device opened forever, as at least some firmware versions require
the network device to be closed eventually before entering the
kernel or otherwise will DMA received packets to stale memory.
The powerpc OFW loader probably wants NETIF_OPEN_CLOSE_ONCE to be
set as well for the same reasons.
is determined by MD_IMAGE_SIZE. A file system can be embedded
into the loader with /sys/tools/embed_mfs.sh.
Note that md.c is not included when MD_IMAGE_SIZE is not set.
only when typing the sequence "123" (opposite to the standard 'push any
button' approach).
That results useful when using serial lines sending garbage and leading
to unwilling boot prompt appearence.
Obtained from: Sandvine Incorporated
Reviewed by: emaste, jhb
Sponsored by: Sandvine Incorporated
MFC: 1 week
=================
Extend the loader to parse the root file system mount options in /etc/fstab,
and set a new loader variable vfs.root.mountfrom.options with these options.
The root mount options must be a comma-delimited string, as specified in
/etc/fstab.
Only set the vfs.root.mountfrom.options variable if it has not been
set in the environment.
sys/kern/vfs_mount.c
====================
When mounting the root file system, pass the mount options
specified in vfs.root.mountfrom.options, but filter out "rw" and "noro",
since the initial mount of the root file system must be done as "ro".
While we are here, try to add a few hints to the mountroot prompt
to give users and idea what might of gone wrong during mounting
of the root file system.
Reviewed by: jhb (an earlier patch)
uses the generic struct dirent, which happens to look identical to UFS's
struct direct. If BSD ever changes dirent then this will be a problem.
Submitted by: matthew dot fleming at isilon dot com
above) exhibits some misbehaviours on machines with AMD64 CPUs,
which at least in some cases I have tracked down to a heap overflow.
It is unclear whether it depends on the CPU or on the pxe bios
itself which may use more memory on AMD machines.
Noticeably a pxeboot compiled from 6.x sources works fine on all
machines I have tried so far, while a pxeboot compiled from 7.x
sources does not.
This patch is a first step in reducing the amount of memory used
while processing the configuration files read by the loader at boot
(some of them are quite large, 1700+ lines), and it does so by:
+ moving a buffer to static memory instead of allocating in the heap;
+ skipping empty lines;
+ reducing the amount of memory used for line descriptors;
Unfortunately there are several changes between 6.x and above,
affecting the compiler, the loader code itself, and libstand,
and it is not so straightforward to
These changes fix the behaviour on one motherboard with a
single-core AMD cpu, but are still not enough e.g on an Asus
M2N-VM (with a dual-core CPU).
I need to investigate the problem a bit more before figuring
out what should be committed to RELENG_7
PR: kern/118222
This bring huge amount of changes, I'll enumerate only user-visible changes:
- Delegated Administration
Allows regular users to perform ZFS operations, like file system
creation, snapshot creation, etc.
- L2ARC
Level 2 cache for ZFS - allows to use additional disks for cache.
Huge performance improvements mostly for random read of mostly
static content.
- slog
Allow to use additional disks for ZFS Intent Log to speed up
operations like fsync(2).
- vfs.zfs.super_owner
Allows regular users to perform privileged operations on files stored
on ZFS file systems owned by him. Very careful with this one.
- chflags(2)
Not all the flags are supported. This still needs work.
- ZFSBoot
Support to boot off of ZFS pool. Not finished, AFAIK.
Submitted by: dfr
- Snapshot properties
- New failure modes
Before if write requested failed, system paniced. Now one
can select from one of three failure modes:
- panic - panic on write error
- wait - wait for disk to reappear
- continue - serve read requests if possible, block write requests
- Refquota, refreservation properties
Just quota and reservation properties, but don't count space consumed
by children file systems, clones and snapshots.
- Sparse volumes
ZVOLs that don't reserve space in the pool.
- External attributes
Compatible with extattr(2).
- NFSv4-ACLs
Not sure about the status, might not be complete yet.
Submitted by: trasz
- Creation-time properties
- Regression tests for zpool(8) command.
Obtained from: OpenSolaris
This uses the common U-Boot support lib (sys/boot/uboot, already used on
FreeBSD/powerpc), and assumes the underlying firmware has the modern API for
stand-alone apps enabled in the config (CONFIG_API).
Only netbooting is supported at the moment.
Obtained from: Marvell, Semihalf
isn't fixed to only open the network device once and not do a open
and close dance on every file access; the firmwares of newer sparc64
machines perform an auto-negotiation with every open which in turn
causes netbooting to take horribly long if we open and close the
device over and over again.
This was introduced as a workaround long time ago for some Alpha firmware
(which is now gone), and actually prevented net_close() to ever be
called.
Certain firmwares (U-Boot) need local shutdown operations to be performed on a
network controller upon transaction end: such platform-specific hooks are
supposed to be called via netif_close() (from within net_close()).
This change effectively reverts the following CVS commit:
sys/boot/common/dev_net.c
revision 1.7
date: 2000/05/13 15:40:46; author: dfr; state: Exp; lines: +2 -1
Only probe network settings on the first open of the network device.
The alpha firmware takes a seriously long time to open the network device
the first time.
Also suppress excessive output while netbooting via loader, unless debugging.
While there, make sys/boot/uboot more style(9) compliant.
Reviewed by: imp
Approved by: cognet (mentor)
(link) address and the physical (load) address. Ideally, the mapping
between link and load addresses should be abstracted by the copyin(),
copyout() and readin() functions, so that we don't have to add kluges
in __elfN(loadimage)(). Then, we could also have paged virtual memory
for the kernel. This can be important under EFI, where you need to
allocate physical memory form the firmware if you want to work in all
scenarios.
defined. This lets each boot program choose which version of cgbase() it
wants to use rather than forcing ufsread.c to have that knowledge.
MFC after: 1 week
Discussed with: imp
saves about 500 bytes in the boot code. While the AT91RM9200 has 12k
of space for the boot loader, which is more than i386's 8k, the code
generated by gcc is a bit bigger.
I've had this in p4 for about two years now.
on i386 and amd64 machines. The overall process is that /boot/pmbr lives
in the PMBR (similar to /boot/mbr for MBR disks) and is responsible for
locating and loading /boot/gptboot. /boot/gptboot is similar to /boot/boot
except that it groks GPT rather than MBR + bsdlabel. Unlike /boot/boot,
/boot/gptboot lives in its own dedicated GPT partition with a new
"FreeBSD boot" type. This partition does not have a fixed size in that
/boot/pmbr will load the entire partition into the lower 640k. However,
it is limited in that it can only be 545k. That's still a lot better than
the current 7.5k limit for boot2 on MBR. gptboot mostly acts just like
boot2 in that it reads /boot.config and loads up /boot/loader. Some more
details:
- Include uuid_equal() and uuid_is_nil() in libstand.
- Add a new 'boot' command to gpt(8) which makes a GPT disk bootable using
/boot/pmbr and /boot/gptboot. Note that the disk must have some free
space for the boot partition.
- This required exposing the backend of the 'add' function as a
gpt_add_part() function to the rest of gpt(8). 'boot' uses this to
create a boot partition if needed.
- Don't cripple cgbase() in the UFS boot code for /boot/gptboot so that
it can handle a filesystem > 1.5 TB.
- /boot/gptboot has a simple loader (gptldr) that doesn't do any I/O
unlike boot1 since /boot/pmbr loads all of gptboot up front. The
C portion of gptboot (gptboot.c) has been repocopied from boot2.c.
The primary changes are to parse the GPT to find a root filesystem
and to use 64-bit disk addresses. Currently gptboot assumes that the
first UFS partition on the disk is the / filesystem, but this algorithm
will likely be improved in the future.
- Teach the biosdisk driver in /boot/loader to understand GPT tables.
GPT partitions are identified as 'disk0pX:' (e.g. disk0p2:) which is
similar to the /dev names the kernel uses (e.g. /dev/ad0p2).
- Add a new "freebsd-boot" alias to g_part() for the new boot UUID.
MFC after: 1 month
Discussed with: marcel (some things might still change, but am committing
what I have so far)