We're currently seeing how hard it would be to run CloudABI binaries on
operating systems cannot be modified easily (Windows, Mac OS X). The
idea is that we want to just run them without any sandboxing. Now
that CloudABI executables are PIE, this is already a bit easier, but TLS
is still problematic:
- CloudABI executables want to write to the %fs, which typically
requires extra system calls by the emulator every time it needs to
switch between CloudABI's and its own TLS.
- If CloudABI executables overwrite the %fs base unconditionally, it
also becomes harder for the emulator to store a backup of the old
value of %fs. To solve this, let's no longer overwrite %fs, but just
%fs:0.
As CloudABI's C library does not use a TCB, this space can now be used
by an emulator to keep track of its internal state. The executable can
now safely overwrite %fs:0, as long as it makes sure that the TCB is
copied over to the new TLS area.
Ensure that there is an initial TLS area set up when the process starts,
only containing a bogus TCB. We don't really care about its contents on
FreeBSD.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D5836
Allow using DTRACE for performance analysis of userspace
applications - the function call stack can be captured.
This is almost an exact copy of AMD64 solution.
Obtained from: Semihalf
Sponsored by: Cavium
Reviewed by: emaste, gnn, jhibbits
Differential Revision: https://reviews.freebsd.org/D5779
Instead of providing a wrapper around device_delete_child() that the PCI
bus and child bus drivers must call explicitly, move the bulk of the logic
from pci_delete_child() into a bus_child_deleted() method
(pci_child_deleted()). This allows PCI devices to be safely deleted via
device_delete_child().
- Add a bus_child_deleted method to the ACPI PCI bus which clears the
device_t associated with the corresponding ACPI handle in addition to
the normal PCI bus cleanup.
- Change cardbus_detach_card to call device_delete_children() and move
CardBus-specific delete logic into a new cardbus_child_deleted() method.
- Use device_delete_child() instead of pci_delete_child() in the SRIOV code.
- Add a bus_child_deleted method to the OpenFirmware PCI bus drivers which
frees the OpenFirmware device info for each PCI device.
Reviewed by: imp
Tested on: amd64 (CardBus and PCI-e hotplug)
Differential Revision: https://reviews.freebsd.org/D5831
A-MSDU is another 11n aggregation mechanism where multiple ethernet
frames get LLC encapsulated (so they have a length field), padded,
and put in a single MPDU (802.11 MAC frame.) This means it gets sent
out as a single frame, with a single seqno, it's acked as one frame, etc.
It turns out that, hah, atheros fast frames is almost but not quite
like this, so I'm reusing all of the current superg/fast-frames stuff
in order to actually transmit A-MSDU. Yes, this means that A-MSDU
frames are also only aggregated two at a time, so it's not necessarily
a huge win, but it's better than nothing.
This doesn't do anything by default - the driver needs to say it does
A-MSDU as well as set the AMSDU software TX capability so this code path
gets exercised.
For now, the only driver that enables this is urtwn. I'll enable it
for rsu at some point soon.
Tested:
* Add an amsdu encap path to aggregate two frames, same as the
fast-frames path.
* Always do the superg init/teardown and node init/teardown stuff,
regardless of whether the nodes are doing fast-frames (the ATH
capability stuff.) That way we can reuse it for amsdu.
* Don't do AMSDU for multicast/broadcast and EAPOL frames.
* If we're doing A-MPDU, then don't bother doing FF/A-MSDU.
We can likely do both together, but I don't want to change
behaviour.
* Teach the fast frames approx txtime logic to support the 11n
rates. But, since we don't currently have a full "current rate"
support, assume it's HT20, long-gi, etc. That way we overshoot
on the TX time estimation, so we're always inside the requirements.
(And we only aggregate two frames for now, so we're not really
going to exceed that.)
* Drop the maximum FF age default down to 2ms, otherwise we end up
with some very annoyingly large latencies.
TODO:
* We only aggregate two ethernet frames, so I'm not checking the max
A-MSDU size. But when it comes time to support >2 frames, we should
obey that.
Tested:
* urtwn(4)
This appears to be implementation dependent but convenient and makes
our sed behave more like GNU sed.
Given that it is not the historic behavior, bump FreeBSD_version
should userland/ports somehow depend on it.
Obtained from: NetBSD (bin/49872)
Reviewed by: bdrewery
PR: 208554
Merge after: NEVER
We were setting an incorrect/undefined size and as it came out the st
struct was not really being used at all. This was actually a bug but
by sheer luck it had no visual effect.
CID: 1194320
Reviewed by: grehan
kern.features.linux: 1 meaning linux 32 bits binaries are supported
kern.features.linux64: 1 meaning linux 64 bits binaries are supported
The goal here is to help 3rd party applications (including ports) to determine
if the host do support linux emulation
Reviewed by: dchagin
MFC after: 1 week
Relnotes: yes
Differential Revision: D5830
The urtwn hardware transmits FF/A-MSDU just fine - it takes an 802.11
frame and will dutifully send the thing.
So:
* bump RX queue up from 1. Why's it 1? That's really silly.
* Add the "software A-MSDU" encap capability bit.
* bump the TX buffer size up so we can at least send A-MSDU frames.
* track active frames submitted to the NIC - we can't make assumptions
about how many are in flight in the NIC though. For 88E parts we
could use per-packet TX indication, but for R92 parts we can't.
So, just fake it somewhat.
* Kick the transmit queue when we finish reception; try to avoid stalls.
* Kick the FF queue a little more regularly.
A-MSDU TX won't happen until the net80211 side is done, but atheros
fast-frames support should now work.
Tested:
* urtwn0: MAC/BB RTL8188EU, RF 6052 1T1R ; A-MSDU transmit.
* begin moving the 11n macros out of ieee80211_phy.c and
into a header so they can be used elsewhere.
* rename some of them into the IEEE80211_* namespace.
* convert HT_RC_2_MCS() to work with three-stream rates.
do software A-MSDU encapsulation.
Right now there's AMSDU TX/RX capability bits and they're mostly
unused, however I'd like to maintain those as the general configuration,
not also "please software encap AMSDU." For platforms that can do
A-MSDU in firmware (iwn, iwm, etc) then their init paths can read
this flag to configure A-MSDU.
ncq was not being inititialized properly but it was not actually
necessary either, so make the code smaller by removing it.
CID: 1248842
Reviewed by: grehan
The PLL_X, base CPU frequency source, doesn't have a bypass switch and thus
we must use another frequency source for CPU while changing its frequency.
PLL_P is ideal for this, it runs at 480MHz and CPU can be clocked at this
frequency at any CPU voltage.
This is compatible with the ds1307, but comparing the mcp7941x datasheet vs the
ds1307 code, appears there is one bit placement difference, so that is now
accounted for.
Relnotes: yes
OFW i2c probing requires a new method ofw_bus_get_node(), and the bus device is
assumed iichb. With these changes, i2c devices attached in fdt are probed and
attached automagically.
The fdc worker thread was using a one second timeout while waiting for
a new bio to arrive or for the device to detach. However, the driver
already does a wakeup when queueing a new bio or asking the thread to
detach, so the timeout only served to waste CPU time waking up the
thread once a second just so it could go right back to sleep. Use an
infinite timeout instead.
Discussed with: phk
Sponsored by: Netflix
many SoCs these two are the same, however there is no requirement for this
to be the case, e.g. on the ARM Juno we boot on what the GIC thinks of as
CPU 2, but FreeBSD numbers it CPU 0.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Previously, the code determined a topology of processing units
(hardware threads, cores, packages) and then deduced a cache topology
using certain assumptions. The new code builds a topology that
includes both processing units and caches using the information
provided by the hardware.
At the moment, the discovered full topology is used only to creeate
a scheduling topology for SCHED_ULE.
There is no KPI for other kernel uses.
Summary:
- based on APIC ID derivation rules for Intel and AMD CPUs
- can handle non-uniform topologies
- requires homogeneous APIC ID assignment (same bit widths for ID
components)
- topology for dual-node AMD CPUs may not be optimal
- topology for latest AMD CPU models may not be optimal as the code is
several years old
- supports only thread/package/core/cache nodes
Todo:
- AMD dual-node processors
- latest AMD processors
- NUMA nodes
- checking for homogeneity of the APIC ID assignment across packages
- more flexible cache placement within topology
- expose topology to userland, e.g., via sysctl nodes
Long term todo:
- KPI for CPU sharing and affinity with respect to various resources
(e.g., two logical processors may share the same FPU, etc)
Reviewed by: mav
Tested by: mav
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D2728