99 Commits

Author SHA1 Message Date
Toomas Soome
3384149c15 loader: zfs reader vdev_probe should check for minimum device size
The smallest device we can have in the pool is 64MB, since we are trying to
walk all four labels to find the most up to date uberblock, this limit will
also give us good method to check if we even should attempt to probe.

Enforcing the check also will make sure we are not getting wrapped while
calculating the label offset.

Also, after label check, we should verify if we actually got any UB or not.

PR:		218473
Reported by:	Masachika ISHIZUKA
Reviewed by:	allanjude
Differential Revision:	https://reviews.freebsd.org/D10381
2017-04-18 15:43:47 +00:00
Toomas Soome
e41fab8d40 loader: zfs reader should check all labels
The current zfs reader is only checking first label from each device, however,
we do have 4 labels on device and we should check all 4 to be protected
against disk failures and incomplete label updates.

The difficulty is about the fact that 2 label copies are in front of the
pool data, and 2 are at the end, which means, we have to know the size of
the pool data area.

Since we have now the mechanism from common/disk.c to use the partition
information, it does help us in this task; however, there are still some
corner cases.

Namely, if the pool is created without partition, directly on the disk,
and firmware will give us the wrong size for the disk, we only can check
the first two label copies.

Reviewed by:	allanjude
Differential Revision:	https://reviews.freebsd.org/D10203
2017-04-06 18:17:29 +00:00
Allan Jude
ec5c0e5be9 Implement boot-time encryption key passing (keybuf)
This patch adds a general mechanism for providing encryption keys to the
kernel from the boot loader. This is intended to enable GELI support at
boot time, providing a better mechanism for passing keys to the kernel
than environment variables. It is designed to be extensible to other
applications, and can easily handle multiple encrypted volumes with
different keys.

This mechanism is currently used by the pending GELI EFI work.
Additionally, this mechanism can potentially be used to interface with
GRUB, opening up options for coreboot+GRUB configurations with completely
encrypted disks.

Another benefit over the existing system is that it does not require
re-deriving the user key from the password at each boot stage.

Most of this patch was written by Eric McCorkle. It was extended by
Allan Jude with a number of minor enhancements and extending the keybuf
feature into boot2.

GELI user keys are now derived once, in boot2, then passed to the loader,
which reuses the key, then passes it to the kernel, where the GELI module
destroys the keybuf after decrypting the volumes.

Submitted by:	Eric McCorkle <eric@metricspace.net> (Original Version)
Reviewed by:	oshogbo (earlier version), cem (earlier version)
MFC after:	3 weeks
Relnotes:	yes
Sponsored by:	ScaleEngine Inc.
Differential Revision:	https://reviews.freebsd.org/D9575
2017-04-01 05:05:22 +00:00
Enji Cooper
e468767c4b Remove redundant declaration for zfs_crc64_table
zfssubr.c already defines this statically. Besides, zfsimpl.c defined it, but
didn't use it.

This fixes a -Wredundant-decls warning.

MFC after:	3 days
Reported by:	amd64-gcc-6.3.0 (devel/amd64-xtoolchain-gcc)
Sponsored by:	Dell EMC Isilon
2017-03-28 20:52:59 +00:00
Enji Cooper
5b594b7f21 Don't shadow read(2) definition with read argument in vdev_{create,probe}
This fixes several -Wshadow warnings introduced in r192194, but now errors
with gcc 6.3.0.

MFC after:	3 days
Reported by:	amd64-gcc-6.3.0 (devel/amd64-xtoolchain-gcc)
Sponsored by:	Dell EMC Isilon
2017-03-28 20:39:24 +00:00
Toomas Soome
d8b59bf47d loader: r314112 did introduce dereference freed pointer entry
CID: 1371675
Reported by:	Coverity
Reviewed by:	jhb, allanjude
Approved by:	allanjude (mentor)
Differential Revision:	https://reviews.freebsd.org/D9846
2017-03-01 19:02:43 +00:00
Toomas Soome
84a6eddc43 loader: update symlink support in zfs reader
As the current zfs file system is providing symlink via system attributes, need
to update the code accordingly.

Note, as the zfsboot code does not free the memory at this time, the
object list will put some stress on the boot2 heap, eventually we should
address the issue.

Reviewed by:	allanjude, smh
Approved by:	allanjude (mentor)
Differential Revision:	https://reviews.freebsd.org/D9706
2017-02-22 22:00:50 +00:00
Pedro F. Giffuni
e099b90b80 sys: Replace zero with NULL for pointers.
Found with:	devel/coccinelle
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D9694
2017-02-22 02:35:59 +00:00
Toomas Soome
467c82cb84 loader: Replace EFI part devices.
Rewrite EFI part device interface to present disk devices in more
user friendly way.

We keep list of three types of devices: floppy, cd and disk, the
visible names: fdX: cdX: and diskX:

Use common/disk.c and common/part.c interfaces to manage the
partitioning.

The lsdev -l will additionally list the device path.

Reviewed by:	imp, allanjude
Approved by:	imp (mentor), allanjude (mentor)
Differential Revision:	https://reviews.freebsd.org/D8581
2017-02-06 09:18:47 +00:00
Toomas Soome
c12dbfe608 loader: Implement disk_ioctl() to support DIOCGSECTORSIZE and DIOCGMEDIASIZE.
Need interface to extract information about disk abstraction,
to read disk or partition size depending on the provided argument
and adjust disk size based on information in partition table.

The disk handle from disk_open() has d_offset field to point to
partition start. So we can use this fact to return either whole disk
size or partition size. For this we only need to record partition size
we get from disk_open() anyhow.

In addition, this will also make it possible to adjust the disk media size
based on information from partition table. The problem with disk size is
about some BIOS systems reporting bogus disk size for 2+TB disks, but
since such disks are using GPT partitioning, and GPT does have information
about disk size (alternate LBA + 1), we can use this fact to record disk
size based on partition table.

This patch does exactly this: implements DIOCGSECTORSIZE and DIOCGMEDIASIZE
ioctl, and DIOCGMEDIASIZE will report either disk media size or partition size.

Adds ptable_getsize() call to read partition size in bytes from ptable pointer.
Updates disk_open() to use ptable_getsize() to update mediasize value.

Implements GPT detection function to update ptable size (used by
ptable_getsize()) according to alternate lba (which is location of backup copy
of GPT header table).

Reviewed by:	allanjude
Approved by:	allanjude (mentor)
Differential Revision:	https://reviews.freebsd.org/D8594
2017-02-06 08:26:45 +00:00
Toomas Soome
151139ad9e loader: disk/part api needs to use uint64_t offsets
The disk_* and part_* api is using 64bit values for media size and
offsets. However, the current api is using off_t type, which is signed
64-bit int.

In this context the signed media size does not make any sense, and
the offsets are used to mark absolute, not relative locations.

Also, the data from GPT partition table and some other sources is
already using uint64_t data type, so using signed off_t can cause sign
issues.

Reviewed by:	imp
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D8710
2017-02-01 20:10:56 +00:00
Toomas Soome
1ecc859193 dosfs support in libstand is broken since r298230
Apparently the libstand dosfs optimization is a bit too optimistic
and did introduce possible memory corruption.

This patch is backing out the bad part and since this results in
dosfs reading full blocks now, we can also remove extra offset argument
from dv_strategy callback.

The analysis of the issue and the backout patch is provided by Mikhail Kupchik.

PR:		214423
Submitted by:	Mikhail Kupchik
Reported by:	Mikhail Kupchik
Reviewed by:	bapt, allanjude
Approved by:	allanjude (mentor)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D8644
2016-12-30 19:06:29 +00:00
Toomas Soome
77d81a4dab lsdev device name section headers should be printed by dv_print callback.
lsdev command does walk over devsw list, prints list element name and
will use dv_print() callback to print the device list.
Unfortunately this approach will add unneeded noise when there are no
particular devices detected.

To remove "empty" device section headers, the dv_print() callback
should print the header instead.

In addition, fixed dv_print callback for md module.

Reviewed by:	imp
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D8551
2016-11-19 08:54:21 +00:00
Toomas Soome
0e75a50173 loader: zfs toplevel vdev must have spa set.
The salt based checksum mechanisms, such as skein, are storing the seed
in spa structure, and need to access the spa to use the seed. The current
mechanism for quick access to correct spa is via pointer provided by
vdev structure, but unfortunately the current code does set spa only
for the leaf vdev. This patch will fix the issue by making sure the
loader zfs reader will set spa also for top-level vdevs.

PR:		214375
Reported by:	lstewart
Reviewed by:	allanjude, imp
Approved by:	allanjude (mentor), imp (mentor)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D8487
2016-11-17 19:38:30 +00:00
Toomas Soome
cbd6713146 Loader paged/pageable data is not always paged.
This change does modify devsw dv_print() to return the int value,
enabling walkers to interrupt the walk on non zero value from dv_print().

This will allow the pager_print actually to stop displaying data on
user input, and additionally pager is used in various *dev_print callbacks,
where it was missing.

For test, lsdev [-v] command should display data by screenfuls and should
stop when the key 'q' is pressed on pager prompt.

Reviewed by:	allanjude
Approved by:	allanjude (mentor)
Differential Revision:	https://reviews.freebsd.org/D5461
2016-11-08 06:50:18 +00:00
Toomas Soome
1d1140af58 loader should boot pre-feature flags pools.
The feature flags chek is missing the corner case where we have valid pool
version, but feature flags are not enabled - as for example plain v28 pool.

This update does fix the boot support for such pools.

Reviewed by:	avg, allanjude
Approved by:	allanjude (mentor)
Differential Revision:	https://reviews.freebsd.org/D8331
2016-10-24 16:28:54 +00:00
Allan Jude
3595d72f86 Disable loop unrolling in skein for sys/boot
When tsoome@ added skein support to the ZFS boot code and zfsloader, it
resulted in an explosion in code size, running close to a number of
limits.

The default for the C version of skein is to unroll all loops for
skein-256 and 512

Disabling the loop unrolling saves 20-28kb from each binary
boot1.efi
gptzfsboot
loader.efi
userboot.so
zfsloader

Reviewed by:	emaste, tsoome
Sponsored by:	ScaleEngine Inc.
Differential Revision:	https://reviews.freebsd.org/D7826
2016-10-06 03:32:30 +00:00
Toomas Soome
2c55d0903d Add SHA512, skein, large blocks support for loader zfs.
Updated sha512 from illumos.
Using skein from freebsd crypto tree.
Since loader itself is using 64MB memory for heap, updated zfsboot to
use same, and this also allows to support zfs large blocks.

Note, adding additional features does increate zfsboot code, therefore
this update does increase zfsboot code to 128k, also I have ported gptldr.S
update to zfsldr.S to support 64k+ code.

With this update, boot1.efi has almost reached the current limit of the size
set for it, so one of the future patches for boot1.efi will need to
increase the limit.

Currently known missing zfs features in boot loader are edonr and gzip support.

Reviewed by:	delphij, imp
Approved by:	imp (mentor)
Obtained from:	sha256.c update and skein_zfs.c stub from illumos.
Differential Revision:	https://reviews.freebsd.org/D7418
2016-08-18 00:37:07 +00:00
Allan Jude
4deb8929ea Make boot code and loader check for unsupported ZFS feature flags
OpenZFS uses feature flags instead of a zpool version number to track
features since the split from Oracle. In addition to avoiding confusion
on ZFS vs OpenZFS version numbers, this also allows features to be added
to different operating systems that use OpenZFS in different order.

The previous zfs boot code (gptzfsboot) and loader (zfsloader) blindly
tries to read the pool, and if failed provided only a vague error message.

With this change, both the boot code and loader check the MOS features
list in the ZFS label and compare it against the list of features that
the loader supports. If any unsupported feature is active, the pool is
not considered as a candidate for booting, and a helpful diagnostic
message is printed to the screen. Features that are merely enabled via
zpool upgrade, but not in use, do not block booting from the pool.

Submitted by:	Toomas Soome <tsoome@me.com>
Reviewed by:	delphij, mav
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D6857
2016-08-01 19:37:43 +00:00
Andriy Gapon
dc87c22a61 fix a zfs boot regression introduced in r300117 by accident
There is no reason to return non-zero value from zfs_probe_partition()
as that causes following partitions to not be probed for ZFS vdevs.
A particular scenario that I encountered is a GPT partitioned disk
where several partitions have freebsd-zfs type.  A partition with a lower
index is used as a cache (l2arc) vdev and in that case case zfs_probe()
returned a non-zero status.  That status was returned to ptable_iterate()
and caused it to abort the iteration.  Because of that the subsequent
partitions were not probed and a root pool was not discovered resulting
in a boot failure.

While there fix the style for nearby return statements.

Approved by:	re (kib)
2016-06-16 07:45:57 +00:00
Warner Losh
9a0b26ec6f Fix several instances where the boot loader ignored pager_output
return value when it could return 1 (indicating we should stop).
Fix a few instances of pager_open() / pager_close() not being called.
Actually use these routines for the environment variable printing code
I just committed.
2016-05-18 05:59:05 +00:00
Pedro F. Giffuni
d9c9c81c08 sys: use our roundup2/rounddown2() macros when param.h is available.
rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.

This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.
2016-04-21 19:57:40 +00:00
Allan Jude
87ed2b7f5a A new implementation of the loader block cache
The block cache implementation in loader has proven to be almost useless, and in worst case even slowing down the disk reads due to insufficient cache size and extra memory copy.
Also the current cache implementation does not cache reads from CDs, or work with zfs built on top of multiple disks.
Instead of an LRU, this code uses a simple hash (O(1) read from cache), and instead of a single global cache, a separate cache per block device.
The cache also implements limited read-ahead to increase performance.
To simplify read ahead management, the read ahead will not wrap over bcache end, so in worst case, single block physical read will be performed to fill the last block in bcache.

Booting from a virtual CD over IPMI:
0ms latency, before: 27 second, after: 7 seconds
60ms latency, before: over 12 minutes, after: under 5 minutes.

Submitted by:	Toomas Soome <tsoome@me.com>
Reviewed by:	delphij (previous version), emaste (previous version)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D4713
2016-04-18 23:09:22 +00:00
Allan Jude
b996592c3e Implement GELI (AES-XTS and AES-CBC only) in gptboot and gptzfsboot
Allows booting from a GELI encrypted root file system, via UFS or ZFS

Reviewed by:	gnn, smh (previous version), delphij (previous version)
Relnotes:	yes
Sponsored by:	ScaleEngine Inc.
Differential Revision:	https://reviews.freebsd.org/D4593
2016-03-16 23:12:19 +00:00
Bryan Drewery
a09f4b4d12 Add more .NOMETA missed in r291320
Sponsored by:	EMC / Isilon Storage Division
2016-03-11 23:45:51 +00:00
Allan Jude
9cdff681a4 Do not set vfs.root.mountfrom unnecessarily
This causes boot from external media (installer USB image) to mount / from
the default ZFS BE, rather than the USB device.

Reported by:	kmoore
MFC after:	5 days
Sponsored by:	ScaleEngine Inc.
2016-02-07 00:49:15 +00:00
Allan Jude
fbe958861e Move init_zfs_bootenv to sys/boot/zfs/zfs.c instead of having a copy in each loader
While here, add a filter to ignore special datasets

MFC after:	3 days
Sponsored by:	ScaleEngine Inc.
2016-01-15 05:45:10 +00:00
Steven Hartland
276a15676c Prevent bogus compiler in ZFS boot code
Silence a bogus compiler warning about indexing past the end of dn_bonus.

The ZFS code ensures this is not possible but the compiler can't determine
this so added an additional check to prevent this warning.

Sponsored by:	Multiplay
2016-01-14 21:31:26 +00:00
Allan Jude
076b613091 DIOCGSECTORSIZE expects to write to a u_int, but struct zfs_probe_args
member secsz was a uint16_t

sys/boot/zfs/zfs.c has a probe args structure member, secsz, that is a
uint16_t for media sector size; it is used as an argument for ioctl()
at line 484. however, this ioctl writes 32 bits of data (u_int *) and
therefore this ioctl will overwrite and corrupt 16 bits of memory.
other use cases seem to use correct u_int type for secsz.

PR:		204358
Submitted by:	Toomas Soome <tsoome at me.com>
Reviewed by:	asomers, delphij, smh
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4811
2016-01-11 15:35:29 +00:00
Allan Jude
fce8d0e350 Only call init_zfs_bootenv() when the system was booted with ZFS
Add a few other safeguards to ensure things do not break when the
boot device cannot be determined

Reported by:	flo
MFC after:	3 days
Sponsored by:	ScaleEngine Inc.
2016-01-09 00:54:08 +00:00
Steven Hartland
5d07d143e6 Fix return from zfs_probe_dev
Ensure zfs_probe_dev returns the correct value.

Also fix a style(9) trailing whitespace issue while here.

MFC after:	2 weeks
X-MFC-With:	r293268
Sponsored by:	Multiplay
2016-01-06 20:25:41 +00:00
Allan Jude
f1aba489c1 Introduce the ZFS Boot Environments menu to the loader menu
If the system was booted with ZFS, a new menu item (#7) appears
It contains an autogenerated list of ZFS Boot Environments

This allows the user to switch to an alternate root file system
Use Cases:
 - Revert a failed upgrade
 - Concurrently run different versions of FreeBSD with common home directory
 - Easier integration with the sysadmin/beadm utility

Requested by:	many
Reviewed by:	dteske
MFC after:	10 days
Relnotes:	yes
Sponsored by:	ScaleEngine Inc.
Differential Revision:	https://reviews.freebsd.org/D3167
2015-12-31 20:00:53 +00:00
Bryan Drewery
b1f92fa229 META MODE: Update dependencies with 'the-lot' and add missing directories.
This is not properly respecting WITHOUT or ARCH dependencies in target/.
Doing so requires a massive effort to rework targets/ to do so.  A
better approach will be to either include the SUBDIR Makefiles directly
and map to DIRDEPS or just dynamically lookup the SUBDIR.  These lose
the benefit of having a userland/lib, userland/libexec, etc, though and
results in a massive package.  The current implementation of targets/ is
very unmaintainable.

Currently rescue/rescue and sys/modules are still not connected.

Sponsored by:	EMC / Isilon Storage Division
2015-12-01 05:23:19 +00:00
Warner Losh
9d2edd63d4 Use CFLAGS_NO_SIMD in preference to varying lists of -mno-xxxx flags.
Go ahead and defined -D_STANDALONE for all targets (only strictly
needed for some architecture, but harmless on those it isn't required
for). Also add -msoft-float to all architectures uniformly rather
that higgley piggley like it is today.

Differential Revision: https://reviews.freebsd.org/D3496
2015-08-27 23:46:42 +00:00
Andriy Gapon
649a80dc4c dnode_read: fixup r284025, BP_IS_HOLE macro expects a pointer
PR:		199804
Reported by:	sbruno
Pointyhat to:	avg
MFC after:	10 days
X-MFC with:	r284025
2015-06-05 17:02:21 +00:00
Andriy Gapon
dade6e27ab dnode_read: handle hole blocks in zfs boot code
A hole block pointer can be encountered at any level and the hole
can cover multiple data blocks, but we are reading the data blocks
one by one, so we care only about the current one.

PR:		199804
Reported by:	Toomas Soome <tsoome@me.com>
Submitted by:	Toomas Soome <tsoome@me.com> (earlier version)
Tested by:	Toomas Soome <tsoome@me.com>
MFC after:	11 days
2015-06-05 15:32:04 +00:00
Xin LI
8bcd603968 MFV r274273:
ZFS large block support.

Please note that booting from datasets that have recordsize greater
than 128KB is not supported (but it's Okay to enable the feature on
the pool).  This *may* remain unchanged because of memory constraint.

Limited safety belt is provided for mounted root filesystem but use
caution is advised.

Illumos issue:
    5027 zfs large block support

MFC after:	1 month
2014-11-10 08:20:21 +00:00
Xin LI
29441ba3fa MFV r267565:
4757 ZFS embedded-data block pointers ("zero block compression")
4913 zfs release should not be subject to space checks

MFC after:	2 weeks
2014-07-01 06:43:15 +00:00
Xin LI
f4c8ba8370 MFV r259170:
4370 avoid transmitting holes during zfs send

4371 DMU code clean up

illumos/illumos-gate@43466aae47

NOTE: Make sure the boot code is updated if a zpool upgrade is
done on boot zpool.

MFC after:	2 weeks
2014-01-01 00:45:28 +00:00
Dimitry Andric
0673132dcb For libstand and sys/boot, split off gcc-only flags into CFLAGS.gcc.
MFC after:	3 days
X-MFC-With:	r259730
2013-12-26 11:32:39 +00:00
Andriy Gapon
881a94fab8 boot: use -march=i386 for both i386 and amd64 builds
.. so that consistent compilation algorithms are used for both
architectures as in practice the binaries are expected to be
interchangeable (for time being).
Previously i386 used default setting which were equivalent to
-march=i486 -mtune=generic.
The only difference is using smaller but slower "leave" instructions.

Discussed with:	jhb, dim
MFC after:	29 days
2012-10-20 16:57:23 +00:00
Andriy Gapon
aae0c9de03 zfs boot: export boot/primary pool and vdev guid all the way to kenv
This is work in progress to for znextboot and it also provides
some convenient infrastructure.

MFC after:	20 days
2012-10-06 19:47:24 +00:00
Andriy Gapon
d39075208e zfs loader: treat plain pool name as a name of its root dataset
... as opposed to the previous behavior of treating it as boot
dataset (specified by bootfs or default)

MFC after:	19 days
2012-10-06 19:42:50 +00:00
Andriy Gapon
edfd4fce8f zfs boot spa_status: print bootfs for each reported pool
MFC after:	9 days
2012-10-06 19:42:05 +00:00
Andriy Gapon
164efe4010 boot/zfs: a small whitespace cleanup
MFC after:	5 days
2012-10-06 19:41:11 +00:00
Andriy Gapon
62c725a9db boot/zfs: call zfs_spa_init for all found pools
... and drop those for which it fails.
Also, add more sanity checking to the function.

MFC after:	16 days
2012-10-06 19:40:12 +00:00
Andriy Gapon
74b3e265c7 zfs boot: add code for listing child datasets of a given dataset
- only filesystem datasets are supported
- children names are printed to stdout

To do: allow to iterate over the list and fetch names programatically

MFC after:	17 days
2012-10-06 19:27:04 +00:00
Andriy Gapon
84b339ac4c zfs boot: chose a "first" pool if none is explicitly requested
MFC after:	8 days
2012-10-06 19:25:40 +00:00
Andriy Gapon
7ae0dc79b7 zfs boot: add a size check for a value in fzap_lookup
MFC after:	25 days
2012-09-11 07:15:11 +00:00
Andriy Gapon
4b7fc6b08e zfs boot: print only an attribute name in fzap_list
... this matches mzap_list behavior

MFC after:	12 days
2012-09-11 07:13:58 +00:00