Runtime services require special execution environment for the call.
Besides that, OS must inform firmware about runtime virtual memory map
which will be active during the calls, with the SetVirtualAddressMap()
runtime call, done while the 1:1 mapping is still used. There are two
complication: the SetVirtualAddressMap() effectively must be done from
loader, which needs to know kernel address map in advance. More,
despite not explicitely mentioned in the specification, both 1:1 and
the map passed to SetVirtualAddressMap() must be active during the
SetVirtualAddressMap() call. Second, there are buggy BIOSes which
require both mappings active during runtime calls as well, most likely
because they fail to identify all relocations to perform.
On amd64, we can get rid of both problems by providing 1:1 mapping for
the duration of runtime calls, by temprorary remapping user addresses.
As result, we avoid the need for loader to know about future kernel
address map, and avoid bugs in BIOSes. Typically BIOS only maps
something in low 4G. If not runtime bugs, we would take advantage of
the DMAP, as previous versions of this patch did.
Similar but more complicated trick can be used even for i386 and 32bit
runtime, if and when the EFI boot on i386 is supported. We would need
a trampoline page, since potentially whole 4G of VA would be switched
on calls, instead of only userspace portion on amd64.
Context switches are disabled for the duration of the call, FPU access
is granted, and interrupts are not disabled. The later is possible
because kernel is mapped during calls.
To test, the sysctl mib debug.efi_time is provided, setting it to 1
makes one call to EFI get_time() runtime service, on success the efitm
structure is printed to the control terminal. Load efirt.ko, or add
EFIRT option to the kernel config, to enable code.
Discussed with: emaste, imp
Tested by: emaste (mac, qemu)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
There is nothing CPU specific here, and it's usable by both fdt and Open
Firmware based systems. Rather than keeping the same file in every one, just
add it to the ofw/fdt block in the main file.
This is in preparation for linking with LLVM's lld, which does not have
a compiled-in default output emulation. lld requires that it is
specified via the -m option, or obtained from the object file(s) being
linked.
This will also allow all build targets to share a common linker binary.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D7837
In order to make CloudABI work on ARMv6, start off by copying over the
sysvec for ARM64 and adjust it to use 32-bit registers. Also add code
for fetching arguments from the stack if needed, as there are fewer
register than on ARM64.
Also import the vDSO that is needed to invoke system calls. This vDSO
uses the intra procedure call register (ip) to store the system call
number. This is a bit simpler than what native FreeBSD does, as FreeBSD
uses r7, while preserving the original r7 into ip.
This sysvec seems to be complete enough to start CloudABI processes.
These processes are capable of linking in the vDSO and are therefore
capable of executing (most?) system calls successfully. Unfortunately,
the biggest show stopper is still that TLS is completely broken:
- The linker used by CloudABI, LLD, still has troubles with some of the
relocations needed for TLS. See LLVM bug 30218 for more details.
- Whereas FreeBSD uses the tpidruro register for TLS, for CloudABI I
want to make use of tpidrurw, so that userspace can modify the base
address directly. This is needed for efficient emulation.
Unfortunately, this register doesn't seem to be preserved across
context switches yet.
Obtained from: https://github.com/NuxiNL/cloudabi (the vDSO)
evdev is a generic input event interface compatible with Linux
evdev API at ioctl level. It allows using unmodified (apart from
header name) input evdev drivers in Xorg, Wayland, Qt.
This commit has only generic kernel API. evdev support for individual
hardware drivers like ukbd, ums, atkbd, etc. will be committed later.
Project was started by Jakub Klama as part of GSoC 2014. Jakub's
evdev implementation was later used as a base, updated and finished
by Vladimir Kondratiev.
Submitted by: Vladimir Kondratiev <wulf@cicgroup.ru>
Reviewed by: adrian, hans
Differential Revision: https://reviews.freebsd.org/D6998
The cxgbev/cxlv driver supports Virtual Function devices for Chelsio
T4 and T4 adapters. The VF devices share most of their code with the
existing PF4 driver (cxgbe/cxl) and as such the VF device driver
currently depends on the PF4 driver.
Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf
PCI device driver that attaches to the VF device. It then creates
child cxgbev/cxlv devices representing ports assigned to the VF.
By default, the PF driver assigns a single port to each VF.
t4vf_hw.c contains VF-specific routines from the shared code used to
fetch VF-specific parameters from the firmware.
t4_vf.c contains the VF-specific PCI device driver and includes its
own attach routine.
VF devices are required to use a different firmware request when
transmitting packets (which in turn requires a different CPL message
to encapsulate messages). This alternate firmware request does not
permit chaining multiple packets in a single message, so each packet
results in a firmware request. In addition, the different CPL message
requires more detailed information when enabling hardware checksums,
so parse_pkt() on VF devices must examine L2 and L3 headers for all
packets (not just TSO packets) for VF devices. Finally, L2 checksums
on non-UDP/non-TCP packets do not work reliably (the firmware trashes
the IPv4 fragment field), so IPv4 checksums for such packets are
calculated in software.
Most of the other changes in the non-VF-specific code are to expose
various variables and functions private to the PF driver so that they
can be used by the VF driver.
Note that a limited subset of cxgbetool functions are supported on VF
devices including register dumps, scheduler classes, and clearing of
statistics. In addition, TOE is not supported on VF devices, only for
the PF interfaces.
Reviewed by: np
MFC after: 2 months
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D7599
This commit adds drivers for Alpine Cache Coherency Unit
and North Bridge Service whose task is to configure
the system fabric and enable cache coherency.
Obtained from: Semihalf
Submitted by: Michal Stanek <mst@semihalf.com>
Sponsored by: Annapurna Labs
Reviewed by: wma
Differential Revision: https://reviews.freebsd.org/D7565
This defines a new bhnd_erom_if API, providing a common interface to device
enumeration on siba(4) and bcma(4) devices, for use both in the bhndb bridge
and SoC early boot contexts, and migrates mips/broadcom over to the new API.
This also replaces the previous adhoc device enumeration support implemented
for mips/broadcom.
Migration of bhndb to the new API will be implemented in a follow-up commit.
- Defined new bhnd_erom_if interface for bhnd(4) device enumeration, along
with bcma(4) and siba(4)-specific implementations.
- Fixed a minor bug in bhndb that logged an error when we attempted to map the
full siba(4) bus space (18000000-17FFFFFF) in the siba EROM parser.
- Reverted use of the resource's start address as the ChipCommon enum_addr in
bhnd_read_chipid(). When called from bhndb, this address is found within the
host address space, resulting in an invalid bridged enum_addr.
- Added support for falling back on standard bus_activate_resource() in
bhnd_bus_generic_activate_resource(), enabling allocation of the bhnd_erom's
bhnd_resource directly from a nexus-attached bhnd(4) device.
- Removed BHND_BUS_GET_CORE_TABLE(); it has been replaced by the erom API.
- Added support for statically initializing bhnd_erom instances, for use prior
to malloc availability. The statically allocated buffer size is verified both
at runtime, and via a compile-time assertion (see BHND_EROM_STATIC_BYTES).
- bhnd_erom classes are registered within a module via a linker set, allowing
mips/broadcom to probe available EROM parser instances without creating a
strong reference to bcma/siba-specific symbols.
- Migrated mips/broadcom to bhnd_erom_if, replacing the previous MIPS-specific
device enumeration implementation.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D7748
Idle page zeroing has been disabled by default on all architectures since
r170816 and has some bugs that make it seemingly unusable. Specifically,
the idle-priority pagezero thread exacerbates contention for the free page
lock, and yields the CPU without releasing it in non-preemptive kernels. The
pagezero thread also does not behave correctly when superpage reservations
are enabled: its target is a function of v_free_count, which includes
reserved-but-free pages, but it is only able to zero pages belonging to the
physical memory allocator.
Reviewed by: alc, imp, kib
Differential Revision: https://reviews.freebsd.org/D7714
cnv API is a set of functions for managing name/value pairs by cookie.
The cookie can be obtained by nvlist_next(), nvlist_get_parent() or
nvlist_get_pararr() function. This patch also includes unit tests.
Submitted by: Adam Starak <starak.adam@gmail.com>
- Added bhnd_pmu driver implementations for PMU and PWRCTL chipsets,
derived from Broadcom's ISC-licensed HND code.
- Added bhnd bus-level support for routing per-core clock and resource
power requests to the PMU device.
- Lift ChipCommon support out into the bhnd module, dropping
bhnd_chipc.
Reviewed by: mizhka
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D7492
A nice thing about requiring a vDSO is that it makes it incredibly easy
to provide full support for running 32-bit processes on 64-bit systems.
Instead of letting the kernel be responsible for composing/decomposing
64-bit arguments across multiple registers/stack slots, all of this can
now be done in the vDSO. This means that there is no need to provide
duplicate copies of certain system calls, like the sys_lseek() and
freebsd32_lseek() we have for COMPAT_FREEBSD32.
This change imports a new vDSO from the CloudABI repository that has
automatically generated code in it that copies system call arguments
into a buffer, padding them to eight bytes and zero-extending any
pointers/size_t arguments. After returning from the kernel, it does the
inverse: extracting return values, in the process truncating
pointers/size_t values to 32 bits.
Obtained from: https://github.com/NuxiNL/cloudabi
An optimization is in place to skip reading the .depend.* files with
'make install'. This was too strong and broke 'make all install' and
'make foo.o foo install'. Now only skip reading the dependency files
if all make targets ran are install targets.
The problem comes about because headers are only added in as a guessed
dependency if .depend.* files do not yet exist. If they do exist, even
if being skipped from being read, then the header dependencies are not
applied. This applies to all #included files, and not just headers.
Reported by: kib
MFC after: 1 day
Sponsored by: EMC / Isilon Storage Division
Copy over amd64's cloudabi64_sysvec.c into i386 and tailor it to work.
Again, we use a system call convention similar to FreeBSD, except that
there is no support for indirect system calls (%eax == 0).
Where i386 differs from amd64 is that we have to store thread/process
entry arguments on the stack instead of using registers. We also have to
put an extra pointer on the stack for TLS (for GSBASE). Place that
pointer in the empty slot that is normally used to hold return
addresses. That seems to keep the code simple.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D7590
Essentially, this is a literal copy of the code in sys/compat/cloudabi64,
except that it now makes use of 32-bits datatypes and limits. In
sys/conf/files, we now need to take care to build the code in
sys/compat/cloudabi if either COMPAT_CLOUDABI32 or COMPAT_CLOUDABI64 is
turned on.
This change does not yet include any of the CPU dependent bits. Right
now I have implementations for running i386 binaries both on i386 and
x86-64, which I will send out for review separately.
The reason why the old vDSOs were written in C using inline assembly was
purely because they were embedded in the C library directly as static
inline functions. This was practical during development, because it
meant you could invoke system calls without any library dependencies.
The vDSO was simply a copy of these functions.
Now that we require the use of the vDSO, there is no longer any need for
embedding them in C code directly. Rewriting them in assembly has the
advantage that they are closer to ideal (less useless branching, less
assumptions about registers remaining unclobbered by the kernel, etc).
They are also easier to build, as they no longer depend on the C type
information for CloudABI.
Obtained from: https://github.com/NuxiNL/cloudabi
This driver only supports 10Mb Ethernet using PIO (the hardware supports
DMA, but the driver only does PIO). There are not any PCCard adapters
supported by this driver, only ISA cards. In addition, it does not use
bus_space but instead uses bcopy with volatile pointers triggering a
host of warnings. (if_ie.c is one of 3 files always built with
-Wno-error)
Relnotes: yes
This hardware is not present on any modern systems. The driver is quite
hackish (raw inb/outb instead of bus_space, and raw inb/outb to random
I/O ports to enable ACPI since it predated proper ACPI support).
Relnotes: yes
The wl(4) driver supports pre-802.11 PCCard wireless adapters that
are slower than 802.11b. They do not work with any of the 802.11
framework and the driver hasn't been reported to actually work in a
long time.
Relnotes: yes
The si(4) driver supported multiport serial adapters for ISA, EISA, and
PCI buses. This driver does not use bus_space, instead it depends on
direct use of the pointer returned by rman_get_virtual(). It is also
still locked by Giant and calls for patch testing to convert it to use
bus_space were unanswered.
Relnotes: yes
This follows NTB subsystem modularization in Linux, tuning it to FreeBSD
native NewBus interfaces. This change allows to support different types
of hardware with different drivers, support multiple NTB instances in a
system, ntb_transport module use for needs other then if_ntb, etc.
Sponsored by: iXsystems, Inc.
- Added a generic bhnd_nvram_parser API, with support for the TLV format
used on WGT634U devices, the standard BCM NVRAM format used on most
modern devices, and the "board text file" format used on some hardware
to supply external NVRAM data at runtime (e.g. via an EFI variable).
- Extended the bhnd_bus_if and bhnd_nvram_if interfaces to support both
string-based and primitive data type variable access, required for
common behavior across both SPROM and NVRAM data sources.
- Extended the existing SPROM implementation to support the new
string-based NVRAM APIs.
- Added an abstract bhnd_nvram driver, implementing the bhnd_nvram_if
atop the bhnd_nvram_parser API.
- Added a CFE-based bhnd_nvram driver to provide read-only access to
NVRAM data on MIPS SoCs, pending implementation of a flash-aware
bhnd_nvram driver.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D7489
This is a driver for a pre-ATAPI ISA CD-ROM adapter. As noted in
the manpage, this driver is only useful as a backend to cdcontrol to
play audio CDs since it doesn't use DMA, so its data performance is
"abysmal" (and that was true in the mid 90's).
The module works together with ipfw(4) and implemented as its external
action module.
Stateless NAT64 registers external action with name nat64stl. This
keyword should be used to create NAT64 instance and to address this
instance in rules. Stateless NAT64 uses two lookup tables with mapped
IPv4->IPv6 and IPv6->IPv4 addresses to perform translation.
A configuration of instance should looks like this:
1. Create lookup tables:
# ipfw table T46 create type addr valtype ipv6
# ipfw table T64 create type addr valtype ipv4
2. Fill T46 and T64 tables.
3. Add rule to allow neighbor solicitation and advertisement:
# ipfw add allow icmp6 from any to any icmp6types 135,136
4. Create NAT64 instance:
# ipfw nat64stl NAT create table4 T46 table6 T64
5. Add rules that matches the traffic:
# ipfw add nat64stl NAT ip from any to table(T46)
# ipfw add nat64stl NAT ip from table(T64) to 64:ff9b::/96
6. Configure DNS64 for IPv6 clients and add route to 64:ff9b::/96
via NAT64 host.
Stateful NAT64 registers external action with name nat64lsn. The only
one option required to create nat64lsn instance - prefix4. It defines
the pool of IPv4 addresses used for translation.
A configuration of instance should looks like this:
1. Add rule to allow neighbor solicitation and advertisement:
# ipfw add allow icmp6 from any to any icmp6types 135,136
2. Create NAT64 instance:
# ipfw nat64lsn NAT create prefix4 A.B.C.D/28
3. Add rules that matches the traffic:
# ipfw add nat64lsn NAT ip from any to A.B.C.D/28
# ipfw add nat64lsn NAT ip6 from any to 64:ff9b::/96
4. Configure DNS64 for IPv6 clients and add route to 64:ff9b::/96
via NAT64 host.
Obtained from: Yandex LLC
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D6434
* make interface cloner VNET-aware;
* simplify cloner code and use if_clone_simple();
* migrate LOGIF_LOCK() to rmlock;
* add ipfw_bpf_mtap2() function to pass mbuf to BPF;
* introduce new additional ipfwlog0 pseudo interface. It differs from
ipfw0 by DLT type used in bpfattach. This interface is intended to
used by ipfw modules to dump packets with additional info attached.
Currently pflog format is used. ipfw_bpf_mtap2() function uses second
argument to determine which interface use for dumping. If dlen is equal
to ETHER_HDR_LEN it uses old ipfw0 interface, if dlen is equal to
PFLOG_HDRLEN - ipfwlog0 will be used.
Obtained from: Yandex LLC
Sponsored by: Yandex LLC
- Move group task queue into kern/subr_gtaskqueue.c
- Change intr_enable to return an int so it can be detected if it's not
implemented
- Allow different TX/RX queues per set to be different sizes
- Don't split up TX mbufs before transmit
- Allow a completion queue for TX as well as RX
- Pass the RX budget to isc_rxd_available() to allow an earlier return
and avoid multiple calls
Submitted by: shurd
Reviewed by: gallatin
Approved by: scottl
Differential Revision: https://reviews.freebsd.org/D7393
These may have ccache in them or -target/--sysroot from external
compiler or SYSTEM_COMPILER support. Many ports do not support
a CC with spaces in it, such as emulators/virtualbox-ose-kmod.
Passing --sysroot to ports makes no sense as ports doesn't support
--sysroot currently.
If these variables need to be overridden for ports then they can
be set in make.conf or passed as make arguments.
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
CloudABI executables already provide support for passing in vDSOs. This
functionality is used by the emulator for OS X to inject system call
handlers. On FreeBSD, we could use it to optimize calls to
gettimeofday(), etc.
Though I don't have any plans to optimize any system calls right now,
let's go ahead and already pass in a vDSO. This will allow us to
simplify the executables, as the traditional "syscall" shims can be
removed entirely. It also means that we gain more flexibility with
regards to adding and removing system calls.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D7438
Machine privilege level was specially designed to use in vendor's
firmware or bootloader. We have implemented operation in machine
mode in FreeBSD as part of understanding RISC-V ISA, but it is time
to remove it.
We now use BBL (Berkeley Boot Loader) -- standard RISC-V firmware,
which provides operation in machine mode for us.
We now use standard SBI calls to machine mode, instead of handmade
'syscalls'.
o Remove HTIF bus.
HTIF bus is now legacy and no longer exists in RISC-V specification.
HTIF code still exists in Spike simulator, but BBL do not provide
raw interface to it.
Memory disk is only choice for now to have multiuser booted in Spike,
until Spike has implemented more devices (e.g. Virtio, etc).
Sponsored by: DARPA, AFRL
Sponsored by: HEIF5
_prison_check_ip4 renamed to prison_check_ip4_locked
Move IPv6-specific jail functions to new file netinet6/in6_jail.c
_prison_check_ip6 renamed to prison_check_ip6_locked
Add appropriate prototypes to sys/sys/jail.h
Adjust kern_jail.c to call prison_check_ip4_locked and
prison_check_ip6_locked accordingly.
Add netinet/in_jail.c and netinet6/in6_jail.c to the list of files that
need to be built when INET and INET6, respectively, are configured in the
kernel configuration file.
Reviewed by: jtl
Approved by: sjg (mentor)
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D6799