This fixes booting from a ZFS mirror with a unavailable primary device.
PR: kern/148655
Reviewed by: avg
Approved by: delphij (mentor)
MFC after: 3 days
Current code doesn't check size of elf sections and may perform needless
actions of zero-sized memory allocation and similar.
The bigger issue is that alignment requirement of a zero-sized section
gets effectively applied to the next section if it has smaller alignment
requirement. But other tools, like gdb and consequently kgdb,
completely ignore zero-sized sections and thus may map symbols to
addresses differently.
Zero-sized sections are not typical in general.
Their typical (only, even) cause in FreeBSD modules is inline assembly that
creates custom sections which is found in pcpu.h and vnet.h. Mere inclusion
of one of those header files produces a custom section in elf output.
If there is no actual use for the section in a given module, then the
section remains empty.
Better solution is to avoid creating zero-sized sections altogether,
which is in plans.
Preloaded modules are handled in boot code (load_elf_obj.c), while
dynamically loaded modules are handled by kernel (link_elf_obj.c).
Based on code by: np
MFC after: 3 weeks
out that "on amd64, libstand.a is compiled for i386, but is still installed
under ${WORLDTMP}/usr/lib instead of ${WORLDTMP}/usr/lib32. Even if it
would be installed there, ld on amd64 is set up incorrectly with a
${TOOLS_PREFIX}/usr/lib/i386 default path, so it wouldn't link. The reason
it does link under gcc is that gcc passes -L${WORLDTMP}/usr/lib twice,
even for -m32 builds, which is also incorrect, but accidentally works in
this case."
Submitted by: Dimitry Andric <dimitry at andric.com>
GCC forwards the -N flag directly to ld. This flag is not documented and
not supported by (for example) Clang. Just use -Wl,-N.
Submitted by: Pawel Worach
- use correct size (512) while reading a gang block
- skip holes while reading child blocks
- advance buffer pointer while reading child blocks
PR: 144214
MFC after: 10 days
o DB-88F5182
o DB-88F5281
o DB-88F6281
o DB-78100
o SheevaPlug
This also includes device tree bindings definitions for some newly introduced
nodes (mpp, gpio).
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
o This is disabled by default for now, and can be enabled using WITH_FDT at
build time.
o Tested with ARM and PowerPC.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
locate a high memory area for the heap using the SMAP.
- Read the number of hard drive devices from the BIOS instead of hardcoding
a limit of 128. Some BIOSes duplicate disk devices once you get beyond
the maximum drive number.
MFC after: 1 month
bottom of the manpages and order them consistently.
GNU groff doesn't care about the ordering, and doesn't even mention
CAVEATS and SECURITY CONSIDERATIONS as common sections and where to put
them.
Found by: mdocml lint run
Reviewed by: ru
HAL/Fujitsu) CPUs. For the most part this consists of fleshing out the
MMU and cache handling, it doesn't add pmap optimizations possible with
these CPU, yet, though.
With these changes FreeBSD runs stable on Fujitsu Siemens PRIMEPOWER 250
and likely also other models based on SPARC64 V like 450, 650 and 850.
Thanks go to Michael Moll for providing access to a PRIMEPOWER 250.
This driver was written by Alexander Pohoyda and greatly enhanced
by Nikolay Denev. I don't have these hardwares but this driver was
tested by Nikolay Denev and xclin.
Because SiS didn't release data sheet for this controller, programming
information came from Linux driver and OpenSolaris. Unlike other open
source driver for SiS190/191, sge(4) takes full advantage of TX/RX
checksum offloading and does not require additional copy operation in
RX handler.
The controller seems to have advanced offloading features like VLAN
hardware tag insertion/stripping, TCP segmentation offload(TSO) as
well as jumbo frame support but these features are not available
yet. Special thanks to xclin <xclin<> cs dot nctu dot edu dot tw>
who sent fix for receiving VLAN oversized frames.
but also of different types, f.e. Sun Fire V890 can be equipped with a
mix of UltraSPARC IV and IV+ CPUs, requiring different MMU initialization
and different workarounds for model specific errata. Therefore move the
CPU implementation number from a global variable to the per-CPU data.
Functions which are called before the latter is available are passed the
implementation number as a parameter now.
OpenBSD and OpenSolaris do instead of fiddling with the MMUs ourselves.
Unlike direct access the firmware methods don't automatically use the
next free (?) TLB slot, instead the slot to be used has to be specified.
We allocate the TLB slots for the kernel top-down as OpenSolaris suggests
that the firmware will always allocate the ones for its own use bottom-up.
Besides being simpler, according to OpenBSD using the firmware methods is
required to allow booting on Sun Fire E10K with multi-systemboard domains.
of Sun Fire V1280 doesn't round up the size itself but instead lets
claiming of non page-sized amounts of memory fail.
- Change parameters and variables related to the TLB slots to unsigned
which is more appropriate.
- Search the whole OFW device tree instead of only the children of the
root nexus device for the BSP as starting with UltraSPARC IV the 'cpu'
nodes hang off of from 'cmp' (chip multi-threading processor) or 'core'
or combinations thereof. Also in large UltraSPARC III based machines
the 'cpu' nodes hang off of 'ssm' (scalable shared memory) nodes which
group snooping-coherency domains together instead of directly from the
nexus.
- Add support for UltraSPARC IV and IV+ BSPs. Due to the fact that these
are multi-core each CPU has two Fireplane config registers and thus the
module/target ID has to be determined differently so the one specific
to a certain core is used. Similarly, starting with UltraSPARC IV the
individual cores use a different property in the OFW device tree to
indicate the CPU/core ID as it no longer is in coincidence with the
shared slot/socket ID.
While at it additionally distinguish between CPUs with Fireplane and
JBus interconnects as these also use slightly different sizes for the
JBus/agent/module/target IDs.
- Check the return value of init_heap(). This requires moving it after
cons_probe() so we can panic when appropriate. This should be fine as
the PowerPC OFW loader uses that order for quite some time now.
Note that due to e.g. write throttling ('wdrain'), it can stall all the disk
I/O instead of just the device it's configured for. Using it for removable
media is therefore not a good idea.
Reviewed by: pjd (earlier version)
kern.ngroups+1. kern.ngroups can range from NGROUPS_MAX=1023 to
INT_MAX-1. Given that the Windows group limit is 1024, this range
should be sufficient for most applications.
MFC after: 1 month
as this only allows us to access file systems that EFI knows about.
With a loader that can only use EFI-supported file systems, we're
forced to put /boot on the EFI system partition. This is suboptimal
in the following ways:
1. With /boot a symlink to /efi/boot, mergemaster complains about
the mismatch and there's no quick solution.
2. The EFI loader can only boot a single version of FreeBSD. There's
no way to install multiple versions of FreeBSD and select one
at the loader prompt.
3. ZFS maintains /boot/zfs/zpool.cache and with /boot a symlink we
end up with the file on a MSDOS file system. ZFS does not have
proper handling of file systems that are under Giant.
Implement a disk device based on the block I/O protocol instead and
pull in file system code from libstand. The disk devices are really
the partitions that EFI knows about.
This change is backward compatible.
MFC after: 1 week
by keeping it opened after the first open and closing it via the
cleanup handler when NETIF_OPEN_CLOSE_ONCE is defined in order to
avoid the open-close-dance on every file access which with firmware
that for example performs an auto-negotiation on every open causes
netbooting to take horribly long. Basically the behavior with this
knob enabled resembles the one employed between r60506 and r177108
(and for sparc64 also again since r182919) with the addition that
the network device now is closed eventually before entering the
kernel and before rebooting. Actually I think this should be the
desired MI behavior, however the U-Boot loader actually requires
net_close() to be called after every transaction in order for some
local shutdown operations to be performed (and which I think thus
will break on concurrent opens, i.e. when netdev_opens is > 1, like
the loader does at least for disks when LOADER_GZIP_SUPPORT is
enabled).
- Use NETIF_OPEN_CLOSE_ONCE to replace the hack, which artificially
increased netdev_opens for sparc64 in order to keep the network
device opened forever, as at least some firmware versions require
the network device to be closed eventually before entering the
kernel or otherwise will DMA received packets to stale memory.
The powerpc OFW loader probably wants NETIF_OPEN_CLOSE_ONCE to be
set as well for the same reasons.
for each vdev's status. Booting from a degraded vdev should now be
more robust.
Submitted by: Matt Reimer <mattjreimer at gmail.com>
Sponsored by: VPOP Technologies, Inc.
MFC after: 2 weeks
It's based on the newest i386's one and has the advantage of:
- ELF binary support.
- UFS2 filesystem support.
- Many FreeBSD slices support on a disk.
Tested by: SATOU Tomokazu ( tomo1770 _ maple _ ocn _ ne _ jp ),
WATANABE Kazuhiro ( CQG00620 _ nifty _ ne _ jp ) and
nyan
MFC after: 2 week
Happy New Year in Japan!!
Fix some wrong usages.
Note: this does not affect generated binaries as this argument is not used.
PR: 137213
Submitted by: Eygene Ryabinkin (initial version)
MFC after: 1 month
M5229 appears to be once again fixed. If this happens to return
we probably should disable ATAPI DMA in ataacerlabs(4) instead
just like the Linux libATA does.
is determined by MD_IMAGE_SIZE. A file system can be embedded
into the loader with /sys/tools/embed_mfs.sh.
Note that md.c is not included when MD_IMAGE_SIZE is not set.
gptzfsboot. I got the segment and offset fields reversed in the structure,
but I also succeeded in crossing the assignments so the actual EDD packet
ended up correct.
MFC after: 1 week
safely allocate a heap region above 1MB. This enables {gpt,}zfsboot()
to allocate much larger buffers than before.
- Use a larger buffer (1MB instead of 128K) for temporary ZFS buffers. This
allows more reliable reading of compressed files in a raidz/raidz2 pool.
Submitted by: Matt Reimer mattjreimer of gmail
MFC after: 1 week
heap when using a range above 1MB.
Previously the loader would always use the last 3MB in the first memory
range above 1MB for the heap. However, this memory range is also where the
kernel and any modules are loaded. If this memory range is "small", then
using the high 3MB for the heap may not leave enough room for the kernel
and modules.
Now the loader will use any range below 4GB for the heap, and the logic to
choose the "high" heap region has moved into biosmem.c. It sets two
variables that the loader can use for a high heap if it desires. When a
high heap is enabled (BZIP2, FireWire, GPT, or ZFS), then the following
memory ranges are preferred for the heap in order from best to worst:
- The largest memory region in the SMAP with a start address greater than
1MB. The memory region must be at least 3MB in length. This leaves the
region starting at 1MB purely for use by the kernel and modules.
- The last 3MB of the memory region starting at 1MB if it is at least 3MB
in size. This matches the current behavior except that the current loader
would break horribly if the first region was not at least 3MB in size.
- The memory range from the end of the loader up to the 640k window. This
is the range the loader uses when none of the high-heap-requesting options
are enabled.
Tested by: hrs
MFC after: 1 week
video console which doesn't take any input from keyboard and hides
all output replacing it with ``spinning'' character (useful for
embedded products and custom installations).
Sponsored by: Sippy Software, Inc.
This adds zfsloader which will be called by zfsboot/gptzfsboot code rather
than the tradional loader. This eliminates the need to set the
LOADER_ZFS_SUPPORT variable in order to get a ZFS enabled loader.
Note however, that you must reinstall your bootcode (zfsboot/gptzfsboot)
in order for the boot process to use the new loader.
New installations will no longer be required to build a ZFS enabled
loader for a working ZFS boot system. Installing zfsboot/gptzfsboot is
sufficient for acknowledging the use of CDDL code and therefore the ZFS
enabled loader.
Based on a previous patch from jhb@
Reviewed by: jhb@
MFC after: 2 weeks
fully support booting from large volumes.
Tested by: Emil Smolenski ambsd of raisa.eu.org
Submitted by: Matt Reimer mattjreimer of gmail (most of the C bits)
MFC after: 1 week
only when typing the sequence "123" (opposite to the standard 'push any
button' approach).
That results useful when using serial lines sending garbage and leading
to unwilling boot prompt appearence.
Obtained from: Sandvine Incorporated
Reviewed by: emaste, jhb
Sponsored by: Sandvine Incorporated
MFC: 1 week
- Teach it to read gang blocks. (essentially untested)
If you see "ZFS: gang block detected!", please let
me know, so we can either remove the printf if it
works, or fix it if it doesn't.
- If multiple partitions exist on a disk, probe them all.
We also need to reset dsk->start to 0 to read the right
sector here.
- With GPT, we can have 128 partitions.
- If the bootfs property has ever been set on a pool
it seems that it never goes away. zpool won't allow
you to add to the pool with the bootfs property set.
However, if you clear the property back to default
we end up getting 0 for the object number and read
a bogus block pointer and fail to boot.
- Fix some error printfs. The printf in the loader is
only capable of c,s and u formats.
- Teach printf how to display %llu
Reviewed by: dfr, jhb
MFC after: 2 weeks
short read requests, so the result was that a /boot.config smaller than 512
bytes was ignored. boot2 uses fsread() instead of xfsread() to read
/boot.config already, so this makes zfsboot more like boot2.
Submitted by: Johny Mattsson johny-freebsd of earthmagic org
Reviewed by: dfr
MFC after: 3 days
devices that we also support, just not by default (thus only LINT or
module builds by default).
While currently there is only "/dev/full" [2], we are planning to see more
in the future. We may decide to change the module/dependency logic in the
future should the list grow too long.
This is not part of linux.ko as also non-linux binaries like kFreeBSD
userland or ports can make use of this as well.
Suggested by: rwatson [1] (name)
Submitted by: ed [2]
Discussed with: markm, ed, rwatson, kib (weeks ago)
Reviewed by: rwatson, brueffer (prev. version)
PR: kern/68961
MFC after: 6 weeks
things a bit:
- use dpcpu data to track the ifps with packets queued up,
- per-cpu locking and driver flags
- along with .nh_drainedcpu and NETISR_POLICY_CPU.
- Put the mbufs in flight reference count, preventing interfaces
from going away, under INVARIANTS as this is a general problem
of the stack and should be solved in if.c/netisr but still good
to verify the internal queuing logic.
- Permit changing the MTU to virtually everythinkg like we do for loopback.
Hook epair(4) up to the build.
Approved by: re (kib)
slicei, Apple EFI hardware), the bootloader will fail to recognize the GPT
if it finds anything else but the EFI partition. Change the check to continue
detecting the GPT by looking at the EFI partition on the MBR but
stopping successfuly after finding it.
PR: kern/134590
Submitted by: Christoph Langguth <christoph at rosenkeller.org>
Reviewed by: jhb
MFC after: 2 weeks
Approved by: re (kib)
DP83065 Saturn Gigabit Ethernet controllers. These are the successors
of the Sun GEM controllers and still have a similar but extended transmit
logic. As such this driver is based on gem(4).
Thanks to marcel@ for providing a Sun Quad GigaSwift Ethernet UTP (QGE)
card which was vital for getting this driver to work on architectures
not using Open Firmware.
Approved by: re (kib)
MFC after: 2 weeks
controller. These controllers are also known as L1C(AR8131) and
L2C(AR8132) respectively. These controllers resembles the first
generation controller L1 but usage of different descriptor format
and new register mappings over L1 register space requires a new
driver. There are a couple of registers I still don't understand
but the driver seems to have no critical issues for performance and
stability. Currently alc(4) supports the following hardware
features.
o MSI
o TCP Segmentation offload
o Hardware VLAN tag insertion/stripping
o Tx/Rx interrupt moderation
o Hardware statistics counters(dev.alc.%d.stats)
o Jumbo frame
o WOL
AR8131/AR8132 also supports Tx checksum offloading but I disabled
it due to stability issues. I'm not sure this comes from broken
sample boards or hardware bugs. If you know your controller works
without problems you can still enable it. The controller has a
silicon bug for Rx checksum offloading, so the feature was not
implemented.
I'd like to say big thanks to Atheros. Atheros kindly sent sample
boards to me and answered several questions I had.
HW donated by: Atheros Communications, Inc.
=================
Extend the loader to parse the root file system mount options in /etc/fstab,
and set a new loader variable vfs.root.mountfrom.options with these options.
The root mount options must be a comma-delimited string, as specified in
/etc/fstab.
Only set the vfs.root.mountfrom.options variable if it has not been
set in the environment.
sys/kern/vfs_mount.c
====================
When mounting the root file system, pass the mount options
specified in vfs.root.mountfrom.options, but filter out "rw" and "noro",
since the initial mount of the root file system must be done as "ro".
While we are here, try to add a few hints to the mountroot prompt
to give users and idea what might of gone wrong during mounting
of the root file system.
Reviewed by: jhb (an earlier patch)
uses the generic struct dirent, which happens to look identical to UFS's
struct direct. If BSD ever changes dirent then this will be a problem.
Submitted by: matthew dot fleming at isilon dot com
- Do not iterate int 15h, function e820h twice. Instead, we use STAILQ to
store each return buffer and copy all at once.
- Export optional extended attributes defined in ACPI 3.0 as separate
metadata. Currently, there are only two bits defined in the specification.
For example, if the descriptor has extended attributes and it is not
enabled, it has to be ignored by OS. We may implement it in the kernel
later if it is necessary and proven correct in reality.
- Check return buffer size strictly as suggested in ACPI 3.0.
Reviewed by: jhb
open partition. This fixes access to partitions whose starting offset
is >= 2 TB.
Submitted by: "James R. Van Artsdalen" james jrv.org
MFC after: 3 days
- First three fields of system UUID may be little-endian as described in
SMBIOS Specification v2.6. For now, we keep the network byte order for
backward compatibility (and consistency with popular dmidecode tool)
if SMBIOS table revision is less than 2.6. However, little-endian format
can be forced by defining BOOT_LITTLE_ENDIAN_UUID from make.conf(5) if it
is necessary.
- Replace overly ambitious optimizations with more readable code.
- Update comments to SMBIOS Specification v2.6 and clean up style(9) bugs.
as 'real memory' instead of Maxmem if the value is available.
Note amd64 displayed physmem as 'usable memory' since machdep.c r1.640
to unconfuse users. Now it is consistent across amd64 and i386 again.
While I am here, clean up smbios.c a bit and update copyright date.
Reviewed by: jhb
driver in Linux 2.6. uscanner was just a simple wrapper around a fifo and
contained no logic, the default interface is now libusb (supported by sane).
Reviewed by: HPS
booting because the CD driver did not use bounce buffers to ensure
request buffers sent to the BIOS were always in the first 1MB. Copy over
the bounce buffer logic from the BIOS disk driver (minus the 64k boundary
code for floppies) to fix this.
Reported by: kensmith
in make.conf or src.conf.
- When GPT is enabled (which it is by default), use memory above 1 MB and
leave the memory from the end of the bss to the end of the 640k window
purely for the stack. The loader has grown and now it is much more
common for the heap and stack to grow into each other when both are
located in the 640k window.
PR: kern/129526
MFC after: 1 week
the disklabel in the 2nd sector for boot code. Even with both UFS1
and UFS2 supported, there's enough bytes left that we don't have to
nibble from the disklabel.
Thus, the entire 2nd sector is now reserved for the disklabel, which
makes the bootcode compatible again with disklabels that have more
than 8 partitions -- such as those created and supported by gpart.
i386: 135 bytes available
amd64: 151 bytes available
Ok'd by: jhb
The old BTX passed the general purpose registers from the 32-bit client to
the routines called via virtual 86 mode. The new BTX did the same thing.
However, it turns out that some instructions behave differently in virtual 86
mode and real mode (even though this is under-documented). For example, the
LEAVE instruction will cause an exception in real mode if any of the upper
16-bits of %ebp are non-zero after it executes. In virtual 8086 mode the
upper 16-bits are simply ignored. This could cause faults in hardware
interrupt handlers that inherited an %ebp larger than 0xffff from the 32-bit
client (loader, boot2, etc.) while running in real mode.
To fix, when executing hardware interrupt handlers provide an explicit clean
state where all the general purpose and segment registers are zero upon
entry to the interrupt handler. While here, I attempted to simplify the
control flow in the 'intusr' code that sets up the various stack frames
and exits protected mode to invoke the requested routine via real mode.
A huge thanks to Tor Egge (tegge@) for debugging this issue.
Submitted by: tegge
Reviewed by: tegge
Tested by: bz
MFC after: 1 week
kernel one as the non-faulting flush address in the loader so
we can can change KERNBASE and VM_MIN_KERNEL_ADDRESS if we
ever want to without needing to worry about using a compatible
loader.
- Correctly check for LOADER_DEBUG.
- Add a missing const for page_sizes[].
functions used in the bootloader. The goal is to make the code more
readable and smaller (especially because we have size issues
in the loader's environment).
High level description of the changes:
+ define some string manipulation functions to improve readability;
+ create functions to manipulate module descriptors, removing some
duplicated code;
+ rename the error codes to ESOMETHING;
+ consistently use set_environment_variable (which evaluates
$variables) when interpreting variable=value assignments;
I have tested the code, but there might be code paths that I have
not traversed so please let me know of any issues.
Details of this change:
--- loader.4th ---
+ add some module operators, to remove duplicated code while parsing
module-related commands:
set-module-flag
enable-module
disable-module
toggle-module
show-module
--- pnp.4th ---
+ move here the definition related to the pnp devices list, e.g.
STAILQ_* , pnpident, pnpinfo
--- support.4th ---
+ rename error codes to capital e.g. ENOMEM EFREE ... and do obvious
changes related to the renaming;
+ remove unused structures (those relevant to pnp are moved to pnp.4th)
+ various string functions
- strlen removed (it is an internal function)
- strchr, defined as the C function
- strtype -- type a string to output
- strref -- assign a reference to the string on the stack
- unquote -- remove quotes from a string
+ remove reset_line_buffer
+ move up the 'set_environment_variable' function (which now
uses the interpreter, so $variables are evaluated).
Use the function in various places
+ add a 'test_file function' for debugging purposes
MFC after: 4 weeks
and re-enable it as default.
In particular:
+ re-enable the 'update' flag in the Makefile (of course!);
+ commit Warner's patch "orb $NOUPDATE,_FLAGS(%bp)"
to avoid writing to disk in case of a timeout/default choice;
+ fix an off-by-one count in the partition scan code that would
print the wrong name for unknown partitions;
+ unconditionally change the boot prompt to 'Boot:' instead of 'Default:'
to make room for the extra code/checks/messages. Some of the changes
listed below are also made to save space;
+ rearrange and fix comments for known partition types. Right now we
explicitly recognise *BSD, Linux, FAT16 (type 6, used on many USB keys),
NTFS (type 7), FAT32 (type 11).
Depending on other options we also recognise Extended (type 5),
FAT12 (type 1) and FAT16 < 32MB (type 4).
+ Add an entry "F6 PXE" when the code is built with -DPXE (which is
a default now). Technically, F6 boots through INT18, so the prompt 'PXE'
is a bit misleading. Unfortunately the name INT18
is too long and does not fit in - we could use ROM perhaps.
The reason I picked 'PXE' is that on many (I believe) new systems
INT18 calls PXE.
Apart from the choice of the name for PXE/ROM/INT18, this should close
pending issues on the 1-sector boot0 code and we should be able to
move the code to RELENG_7 when it reopens.
No boot0cfg changes are necessary.
MFC after: 3 weeks