Updated sha512 from illumos.
Using skein from freebsd crypto tree.
Since loader itself is using 64MB memory for heap, updated zfsboot to
use same, and this also allows to support zfs large blocks.
Note, adding additional features does increate zfsboot code, therefore
this update does increase zfsboot code to 128k, also I have ported gptldr.S
update to zfsldr.S to support 64k+ code.
With this update, boot1.efi has almost reached the current limit of the size
set for it, so one of the future patches for boot1.efi will need to
increase the limit.
Currently known missing zfs features in boot loader are edonr and gzip support.
Reviewed by: delphij, imp
Approved by: imp (mentor)
Obtained from: sha256.c update and skein_zfs.c stub from illumos.
Differential Revision: https://reviews.freebsd.org/D7418
OpenZFS uses feature flags instead of a zpool version number to track
features since the split from Oracle. In addition to avoiding confusion
on ZFS vs OpenZFS version numbers, this also allows features to be added
to different operating systems that use OpenZFS in different order.
The previous zfs boot code (gptzfsboot) and loader (zfsloader) blindly
tries to read the pool, and if failed provided only a vague error message.
With this change, both the boot code and loader check the MOS features
list in the ZFS label and compare it against the list of features that
the loader supports. If any unsupported feature is active, the pool is
not considered as a candidate for booting, and a helpful diagnostic
message is printed to the screen. Features that are merely enabled via
zpool upgrade, but not in use, do not block booting from the pool.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij, mav
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D6857
There is no reason to return non-zero value from zfs_probe_partition()
as that causes following partitions to not be probed for ZFS vdevs.
A particular scenario that I encountered is a GPT partitioned disk
where several partitions have freebsd-zfs type. A partition with a lower
index is used as a cache (l2arc) vdev and in that case case zfs_probe()
returned a non-zero status. That status was returned to ptable_iterate()
and caused it to abort the iteration. Because of that the subsequent
partitions were not probed and a root pool was not discovered resulting
in a boot failure.
While there fix the style for nearby return statements.
Approved by: re (kib)
return value when it could return 1 (indicating we should stop).
Fix a few instances of pager_open() / pager_close() not being called.
Actually use these routines for the environment variable printing code
I just committed.
rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.
This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.
Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.
Submitted by: Toomas Soome <tsoome@me.com>
Reviewed by: delphij (previous version), emaste (previous version)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D4713
This causes boot from external media (installer USB image) to mount / from
the default ZFS BE, rather than the USB device.
Reported by: kmoore
MFC after: 5 days
Sponsored by: ScaleEngine Inc.
Silence a bogus compiler warning about indexing past the end of dn_bonus.
The ZFS code ensures this is not possible but the compiler can't determine
this so added an additional check to prevent this warning.
Sponsored by: Multiplay
member secsz was a uint16_t
sys/boot/zfs/zfs.c has a probe args structure member, secsz, that is a
uint16_t for media sector size; it is used as an argument for ioctl()
at line 484. however, this ioctl writes 32 bits of data (u_int *) and
therefore this ioctl will overwrite and corrupt 16 bits of memory.
other use cases seem to use correct u_int type for secsz.
PR: 204358
Submitted by: Toomas Soome <tsoome at me.com>
Reviewed by: asomers, delphij, smh
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D4811
Add a few other safeguards to ensure things do not break when the
boot device cannot be determined
Reported by: flo
MFC after: 3 days
Sponsored by: ScaleEngine Inc.
If the system was booted with ZFS, a new menu item (#7) appears
It contains an autogenerated list of ZFS Boot Environments
This allows the user to switch to an alternate root file system
Use Cases:
- Revert a failed upgrade
- Concurrently run different versions of FreeBSD with common home directory
- Easier integration with the sysadmin/beadm utility
Requested by: many
Reviewed by: dteske
MFC after: 10 days
Relnotes: yes
Sponsored by: ScaleEngine Inc.
Differential Revision: https://reviews.freebsd.org/D3167
This is not properly respecting WITHOUT or ARCH dependencies in target/.
Doing so requires a massive effort to rework targets/ to do so. A
better approach will be to either include the SUBDIR Makefiles directly
and map to DIRDEPS or just dynamically lookup the SUBDIR. These lose
the benefit of having a userland/lib, userland/libexec, etc, though and
results in a massive package. The current implementation of targets/ is
very unmaintainable.
Currently rescue/rescue and sys/modules are still not connected.
Sponsored by: EMC / Isilon Storage Division
Go ahead and defined -D_STANDALONE for all targets (only strictly
needed for some architecture, but harmless on those it isn't required
for). Also add -msoft-float to all architectures uniformly rather
that higgley piggley like it is today.
Differential Revision: https://reviews.freebsd.org/D3496
A hole block pointer can be encountered at any level and the hole
can cover multiple data blocks, but we are reading the data blocks
one by one, so we care only about the current one.
PR: 199804
Reported by: Toomas Soome <tsoome@me.com>
Submitted by: Toomas Soome <tsoome@me.com> (earlier version)
Tested by: Toomas Soome <tsoome@me.com>
MFC after: 11 days
ZFS large block support.
Please note that booting from datasets that have recordsize greater
than 128KB is not supported (but it's Okay to enable the feature on
the pool). This *may* remain unchanged because of memory constraint.
Limited safety belt is provided for mounted root filesystem but use
caution is advised.
Illumos issue:
5027 zfs large block support
MFC after: 1 month
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
illumos/illumos-gate@43466aae47
NOTE: Make sure the boot code is updated if a zpool upgrade is
done on boot zpool.
MFC after: 2 weeks
.. so that consistent compilation algorithms are used for both
architectures as in practice the binaries are expected to be
interchangeable (for time being).
Previously i386 used default setting which were equivalent to
-march=i486 -mtune=generic.
The only difference is using smaller but slower "leave" instructions.
Discussed with: jhb, dim
MFC after: 29 days
- only filesystem datasets are supported
- children names are printed to stdout
To do: allow to iterate over the list and fetch names programatically
MFC after: 17 days
The previous one was totally bogus as it used hash value of
_output_ variable as an index for searching...
The only reliable way to do a reverse lookup here is to iterate
over all entries.
MFC after: 15 days
often modified directory created symbolic links points to - it cause
unnecessary full rebuilds each time make runs when directory is changed.
So do it only if symbolic link does not exists, which usually means that
objdir is clean anyway.
MFC after: 1 week
This is to silence warnings that result from different definitions of
uint64_t on different architectures, specifically i386 and sparc64.
MFC after: 1 month
In zfs loader zfs device name format now is "zfs:pool/fs",
fully qualified file path is "zfs:pool/fs:/path/to/file"
loader allows accessing files from various pools and filesystems as well
as changing currdev to a different pool/filesystem.
zfsboot accepts kernel/loader name in a format pool:fs:path/to/file or,
as before, pool:path/to/file; in the latter case a default filesystem
is used (pool root or bootfs). zfsboot passes guids of the selected
pool and dataset to zfsloader to be used as its defaults.
zfs support should be architecture independent and is provided
in a separate library, but architectures wishing to use this zfs support
still have to provide some glue code and their devdesc should be
compatible with zfs_devdesc.
arch_zfs_probe method is used to discover all disk devices that may
be part of ZFS pool(s).
libi386 unconditionally includes zfs support, but some zfs-specific
functions are stubbed out as weak symbols. The strong definitions
are provided in libzfsboot.
This change mean that the size of i386_devspec becomes larger
to match zfs_devspec.
Backward-compatibility shims are provided for recently added sparc64
zfs boot support. Currently that architecture still works the old
way and does not support the new features.
TODO:
- clear up pool root filesystem vs pool bootfs filesystem distinction
- update sparc64 support
- set vfs.root.mountfrom based on currdev (for zfs)
Mid-future TODO:
- loader sub-menu for selecting alternative boot environment
Distant future TODO:
- support accessing snapshots, using a snapshot as readonly root
Reviewed by: marius (sparc64),
Gavin Mu <gavin.mu@gmail.com> (sparc64)
Tested by: Florian Wagner <florian@wagner-flo.net> (x86),
marius (sparc64)
No objections: fs@, hackers@
MFC after: 1 month
V100, the firmware is known to be broken and not allowing to simultaneously
open disk devices, causing attempts to boot from a mirror or RAIDZ to cause
a crash. This will be worked around later. The firmwares of newer sun4u models
don't seem to exhibit this problem though.
Steps for ZFS booting:
1. create VTOC8 label
# gpart create -s vtoc8 da0
2. add partitions, f.e.:
# gpart add -t freebsd-zfs -s 60g da0
# gpart add -t freebsd-swap da0
resulting in something like:
# gpart show
=> 0 143331930 da0 VTOC8 (68G)
0 125821080 1 freebsd-zfs (60G)
125821080 17510850 2 freebsd-swap (8.4G)
3. create zpool
# zpool create bunker da0a
or for mirror/RAIDZ (after preparing additional disks as in steps 1. + 2.):
# zpool create bunker mirror da0a da1a
# zpool create bunker raidz da0a da1a da2a ...
4. set bootfs
# zpool set bootfs=bunker bunker
5. install zfsboot
# zpool export bunker
# gpart bootcode -p /boot/zfsboot da0
6. write zfsloader to the ZFS Boot Block (so far, there's no dedicated tool
for this, so dd(1) has to be used for this purpose)
When using mirror/RAIDZ, step 4. and the dd(1) invocation should be repeated
for the additional disks in order to be able to boot from another disk in
case of failure.
# sysctl kern.geom.debugflags=0x10
# dd if=/boot/zfsloader of=/dev/da0a bs=512 oseek=1024 conv=notrunc
# zpool import bunker
7. install system on ZFS filesystem
Don't forget to set 'zfs_load="YES"' and vfs.root.mountfrom="zfs:bunker" in
loader.conf as well as 'zfs_enable="YES"'in rc.conf.
8. copy zpool.cache to the ZFS filesystem
cp -p /boot/zfs/zpool.cache /bunker/boot/zfs/zpool.cache
9. set mountpoint
# zfs set mountpoint=/ bunker
10. Now, given that aliases for all disks in the zpool exists (check with
the `devalias` command on the boot monitor prompt) and disk0 corresponds
to da0 (likewise for additional disks), the system can be booted from the
ZFS with:
{1} ok boot disk0
PR: 165025
Submitted by: Gavin Mu
- Decompress assembled gang block data if compressed.
- Verify checksum of a gang header.
- Verify checksum of assembled gang block data.
- Verify checksum of uber block.
Submitted by: avg
MFC after: 3 days
physical block size declared in bp may not always be what we want.
For example in case of gang block header physical block size declared
in bp is much larger than SPA_GANGBLOCKSIZE (512 bytes) and checksum
calculation failed. This bug could lead to accessing unallocated
memory and resets/failures during boot.
MFC after: 3 days
handled by lower layers like vdev_raidz, which uses bp for checksum
verification. This bug could lead to NULL pointer reference and resets
during boot.
MFC after: 3 days