Some of the collisions that are occurring are due to flowtable lookups
that succeed but have an invalid lle - typically because the L2 adjacency
lookup hasn't completed. This would lead to a follow-up insert which
would then fail (i.e., a collision) and the code would fall through to doing
a slow-path L2/L3 lookup in the netinet/netinet6 code.
This patch simply aborts storing a new flowtable entry if the lle isn't
yet valid.
Whilst I'm here, add a new pcpu counter for the item so the number of
failures can be tracked separately from generic "collisions."
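
A minimal sketch of the check described above, assuming hypothetical
stand-in types (struct lle_sketch, struct flow_stats_sketch) and a
hypothetical helper flow_try_insert(); none of these are the kernel's
real names, and the real code uses pcpu counters rather than plain
integers:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the kernel's real structures. */
struct lle_sketch {
	bool valid;		/* true once L2 resolution has completed */
};

struct flow_stats_sketch {
	uint64_t inserts;	/* entries actually stored */
	uint64_t collisions;	/* generic insert collisions */
	uint64_t invalid_lle;	/* inserts skipped because of the lle */
};

/*
 * Decide whether a new flow entry may be stored.  Returns 0 when the
 * entry is stored and -1 when the insert is skipped because the lle is
 * missing or not yet valid.  The skip is counted separately from
 * generic collisions.
 */
static int
flow_try_insert(const struct lle_sketch *lle, struct flow_stats_sketch *st)
{
	if (lle == NULL || !lle->valid) {
		st->invalid_lle++;
		return (-1);
	}
	st->inserts++;
	return (0);
}
```

With this shape a caller that gets -1 goes straight to the slow path
without attempting an insert that is bound to be useless.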
Reviewed by: glebius
MFC after: 10 days
Sponsored by: Netflix, Inc.
because we use the 1MiB section maps as they only need a single pagetable.
To allow this we only use pc relative loads to ensure we only load from
physical addresses until we are running from a known virtual address.
As a side effect any data from before or 64MiB after the kernel needs to
be mapped in to be used. This should not be an issue for kernels loaded
with ubldr as it places this data just after the kernel. It will be a
problem when loading directly from anything using the Linux ABI that
places the ATAG data outside this range, for example U-Boot.
Currently we have 3 usage patterns:
1) nd6_output (most traffic flow, no lle supplied, lle RLOCK sufficient)
2) corner cases for output (no lle, STALE lle, and so on). lle WLOCK needed.
3) nd* internal machinery (WLOCK'ed lle provided, perform packet queueing).
We separate case 1 and implement it inside its only customer - nd6_output.
This leads to some code duplication (especially SEND stuff, which should be
hooked to output in a different way), but simplifies locking and control
flow logic for nd6_output_lle.
Reviewed by: ae
MFC after: 3 weeks
Sponsored by: Yandex LLC
Add gpioled(4) to BEAGLEBONE kernel and add the description of the four
on-board leds of beaglebone-black to its DTS file.
Approved by: adrian (mentor, implicit)
change the gpio children can be described as directly connected to the GPIO
controller without the need of describing the OFW GPIO bus itself on the
DTS file.
With this commit the OFW GPIO bus is fully functional on BBB and RPi.
GPIO controllers which want to use the OFW GPIO bus will need similar
changes.
Approved by: adrian (mentor, implicit)
This change makes ofw_iicbus attach to iicbb(4) controllers in addition to
the already supported i2c host bridges (iichb).
On iicbb(4) allow the direct access of the OFW parent node by its children,
so they can be directly attached to iicbb(4) node on the DTS without the
need of describing the i2c bus.
Approved by: adrian (mentor, implicit)
gpioled(4).
Tested on RPi and BBB (using the hardware I2C controller and gpioiic(4) for
the I2C tests). It was also verified for regressions on RSPRO (MIPS/ar71xx)
used as reference for a non OFW-based system.
Update the gpioled(4) and gpioiic(4) man pages with some details and
examples about the FDT/OFW support.
Some compatibility details pointed out by imp@ will follow in subsequent
commits.
Approved by: adrian (mentor, implicit)
describe GPIO bindings in the system.
Move the GPIOBUS lock macros to gpiobusvar.h as they are now shared between
the OFW and the non OFW versions of GPIO bus.
Export gpiobus_print_pins() so it can also be used on the OFW GPIO bus.
Approved by: adrian (mentor, implicit)
- Get USB input report length from HID descriptor.
- Use 1 finger TAP for devices which have no integrated button.
- Move data buffer to softc instead of allocating it.
MFC after: 1 week
and probably is a leftover from first prototyping by Kip. The
non-pcpu implementation used mutexes, so it doubtfully worked
better than simple routing lookup.
o Use UMA_ZONE_PCPU zone for pointers instead of [MAXCPU] arrays,
use zpcpu_get() to access data in there.
o Substitute own single list implementation with SLIST(). This
has two functional side effects:
- new flows go to the head of a list; previously they went to the tail.
- a bug when incorrect flow was deleted in flow cleaner is
fixed.
o Due to cache line alignment, there is no reason to keep
different zones for IPv4 and IPv6 flows. Both consume one
cache line, real size of allocation is equal.
o Rely on f_hash, f_rt, f_lle being stable during fle
lifetime, remove useless volatile qualifiers.
o More INET/INET6 splitting.
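
A minimal userland sketch of the SLIST() usage mentioned above, using
the queue(3) macros from <sys/queue.h>. The struct flow_sketch type and
the flow_list_add()/flow_list_remove() helpers are hypothetical and only
illustrate head insertion and removal of one specific element:

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical flow entry; the real fle carries much more state. */
struct flow_sketch {
	int id;
	SLIST_ENTRY(flow_sketch) link;
};

SLIST_HEAD(flow_list, flow_sketch);

/* New flows go to the head of the list. */
static void
flow_list_add(struct flow_list *head, struct flow_sketch *f)
{
	SLIST_INSERT_HEAD(head, f, link);
}

/*
 * Unlink the flow with the given id and return it, or return NULL if
 * no such flow is on the list.  We return right after the removal, so
 * the iterator is never advanced through an unlinked element.
 */
static struct flow_sketch *
flow_list_remove(struct flow_list *head, int id)
{
	struct flow_sketch *f;

	SLIST_FOREACH(f, head, link) {
		if (f->id == id) {
			SLIST_REMOVE(head, f, flow_sketch, link);
			return (f);
		}
	}
	return (NULL);
}
```

Letting the macros do the unlinking means the element removed is the
one that was matched, which is the class of bug a hand-rolled list makes
easy to introduce.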
Reviewed by: adrian
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
to decide what protections are required by the faulting access. The old
code disassembled the faulting instruction, and there are a lot of new
instructions that aren't handled. The old code is still used for armv4/5
which doesn't have the WNR bit)
should fix DMA descriptor caching issues seen with the EHCI controller
found in Google Chromebook C720 during removal and insertion of USB
devices.
MFC after: 1 week
Reported by: Matthew Dillon at DragonFlyBSD
I/O windows, the default is to preserve the firmware-assigned resources.
PCI bus numbers are only managed if NEW_PCIB is enabled and the architecture
defines a PCI_RES_BUS resource type.
- Add a helper API to create top-level PCI bus resource managers for each
PCI domain/segment. Host-PCI bridge drivers use this API to allocate
bus numbers from their associated domain.
- Change the PCI bus and CardBus drivers to allocate a bus resource for
their bus number from the parent PCI bridge device.
- Change the PCI-PCI and PCI-CardBus bridge drivers to allocate the
full range of bus numbers from secbus to subbus from their parent bridge.
The drivers also always program their primary bus register. The bridge
drivers also support growing their bus range by extending the bus resource
and updating subbus to match the larger range.
- Add support for managing PCI bus resources to the Host-PCI bridge drivers
used for amd64 and i386 (acpi_pcib, mptable_pcib, legacy_pcib, and qpi_pcib).
- Define a PCI_RES_BUS resource type for amd64 and i386.
Reviewed by: imp
MFC after: 1 month
routine, now a platform can provide a pointer to an early_putc() routine
which is used instead of cn_putc(). Control can be handed off from early
printf support to standard console support by NULLing out the pointer
during standard console init.
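
A minimal sketch of the hand-off described above, assuming hypothetical
names throughout (early_putc_hook, sketch_putc(), sketch_console_init()
and the two buffer-backed output routines are for illustration only and
are not the kernel's real symbols):

```c
#include <stddef.h>

typedef void early_putc_t(int c);

/* Hypothetical hook a platform sets before the console is ready. */
static early_putc_t *early_putc_hook;

/* Buffers stand in for a UART and for the standard console. */
static char early_buf[64];
static size_t early_len;
static char console_buf[64];
static size_t console_len;

static void
platform_early_putc(int c)
{
	if (early_len < sizeof(early_buf))
		early_buf[early_len++] = (char)c;
}

static void
console_putc(int c)
{
	if (console_len < sizeof(console_buf))
		console_buf[console_len++] = (char)c;
}

/* Output path: use the early hook while it is set, else the console. */
static void
sketch_putc(int c)
{
	if (early_putc_hook != NULL)
		early_putc_hook(c);
	else
		console_putc(c);
}

/* Standard console init hands control over by clearing the hook. */
static void
sketch_console_init(void)
{
	early_putc_hook = NULL;
}
```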
This leverages all the existing error reporting that uses printf calls,
such as panic() which can now be usefully employed even in early
platform init code (useful at least to those who maintain that code and
build kernels with EARLY_PRINTF defined).
Reviewed by: imp, eadler
reuse the first page of the crashdumpmap as CMAP1/CADDR1. For i386,
remove CMAP1/CADDR1 entirely and reuse CMAP3/CADDR3 for the memory test.
Reviewed by: alc, peter
MFC after: 2 weeks
Real means the one TSC / Symmetricom / Microsemi actually uses on their 4370
and other rm9200 boards. This code demonstrates a variety of useful things
board init code can do, including adjusting the master clock frequency.
* Check ND6_IFF_IFDISABLED before acquiring any locks
* Assume m is always non-NULL
* Remove the 'bad' case, which is not used anymore
* Simplify if_output conditional
MFC after: 2 weeks
Sponsored by: Yandex LLC
communicate the kernel's physical load address from where it's known in
initarm() into cpu_mp_start() which is called from non-arm code and
takes no parameters.
This adds the global variable and ensures that all the various copies
of initarm() set it. It uses the variable in cpu_mp_start(), eliminating
the last uses of KERNPHYSADDR outside of locore.S (where we can now
calculate it instead of relying on the constant).
a new physmem.c file. The new code provides helper routines that can be
used by legacy SoCs and newer FDT-based systems. There are routines to
add one or more regions of physically contiguous ram, and exclude one or
more physically contiguous regions of ram. Ram can be excluded from crash
dumps, from being given over to the vm system for allocation management,
or both. After all the included and excluded regions have been added,
arm_physmem_init_kernel_globals() processes the regions into the global
dump_avail and phys_avail arrays and realmem and physmem variables that
communicate memory configuration to the rest of the kernel.
Convert all existing SoCs to use the new helper code.
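
A simplified sketch of the add/exclude idea, not the code in physmem.c.
All names here (struct physmem_sketch, sketch_add_hw(), sketch_add_ex(),
sketch_build_avail()) are hypothetical, the separate dump/allocation
exclusion flags are omitted, and excluded regions are assumed to be
added in ascending address order:

```c
#include <stddef.h>
#include <stdint.h>

#define	SKETCH_MAX_REGIONS	8

struct region_sketch {
	uint64_t start;		/* inclusive */
	uint64_t end;		/* exclusive */
};

struct physmem_sketch {
	struct region_sketch hw[SKETCH_MAX_REGIONS];
	size_t nhw;
	struct region_sketch ex[SKETCH_MAX_REGIONS];
	size_t nex;
};

/* Record a region of physically contiguous ram. */
static int
sketch_add_hw(struct physmem_sketch *pm, uint64_t start, uint64_t size)
{
	if (pm->nhw == SKETCH_MAX_REGIONS)
		return (-1);
	pm->hw[pm->nhw].start = start;
	pm->hw[pm->nhw].end = start + size;
	pm->nhw++;
	return (0);
}

/* Record a region to exclude; callers add these in ascending order. */
static int
sketch_add_ex(struct physmem_sketch *pm, uint64_t start, uint64_t size)
{
	if (pm->nex == SKETCH_MAX_REGIONS)
		return (-1);
	pm->ex[pm->nex].start = start;
	pm->ex[pm->nex].end = start + size;
	pm->nex++;
	return (0);
}

/* Subtract the excluded regions from the ram regions into avail[]. */
static size_t
sketch_build_avail(const struct physmem_sketch *pm,
    struct region_sketch *avail, size_t max)
{
	size_t i, j, n = 0;

	for (i = 0; i < pm->nhw; i++) {
		uint64_t cur = pm->hw[i].start;
		uint64_t end = pm->hw[i].end;

		for (j = 0; j < pm->nex; j++) {
			const struct region_sketch *x = &pm->ex[j];

			if (x->end <= cur || x->start >= end)
				continue;	/* no overlap with what is left */
			if (x->start > cur && n < max) {
				avail[n].start = cur;
				avail[n].end = x->start;
				n++;
			}
			cur = x->end;
		}
		if (cur < end && n < max) {
			avail[n].start = cur;
			avail[n].end = end;
			n++;
		}
	}
	return (n);
}
```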
This was an optimization used only by a few xscale platforms. Part of
the optimization was to create a direct map for all physical pages, and
that resulted in making multiple mappings of pages in a way that bypassed
the logic in pmap.c to handle VIVT cache aliasing. It also just generally
made the code more complex and hard to maintain for all SoCs.
Reviewed by: cognet
XSAVE-enabled features like AVX.
- Store a per-cpu guest xcr0 register. When switching to the guest FPU
state, switch to the guest xcr0 value. Note that the guest FPU state is
saved and restored using the host's xcr0 value and xcr0 is saved/restored
"inside" of saving/restoring the guest FPU state.
- Handle VM exits for the xsetbv instruction by updating the guest xcr0.
- Expose the XSAVE feature to the guest only if the host has enabled XSAVE,
and only advertise XSAVE features enabled by the host to the guest.
This ensures that the guest will only adjust FPU state that is a subset
of the guest FPU state saved and restored by the host.
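
A minimal sketch of the subset rule, assuming a hypothetical helper
guest_xcr0_ok() and hypothetical XFEATURE_* macro names. The bit
positions (x87 in bit 0, SSE in bit 1, AVX in bit 2) and the rules that
x87 must stay enabled and that AVX requires SSE are architectural; the
real xsetbv exit handler performs further checks not shown here:

```c
#include <stdbool.h>
#include <stdint.h>

/* XCR0 feature bits; macro names are hypothetical. */
#define	XFEATURE_X87	(1ULL << 0)
#define	XFEATURE_SSE	(1ULL << 1)
#define	XFEATURE_AVX	(1ULL << 2)

/*
 * Return true if a guest xsetbv to XCR0 may be accepted: the new value
 * must be a subset of the features the host has enabled, x87 state
 * must stay enabled, and AVX may only be enabled together with SSE.
 */
static bool
guest_xcr0_ok(uint64_t host_xcr0, uint64_t guest_new)
{
	if ((guest_new & ~host_xcr0) != 0)
		return (false);
	if ((guest_new & XFEATURE_X87) == 0)
		return (false);
	if ((guest_new & XFEATURE_AVX) != 0 &&
	    (guest_new & XFEATURE_SSE) == 0)
		return (false);
	return (true);
}
```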
Reviewed by: grehan
4574 get_clones_stat does not call zap_count in non-debug kernel
zap_count(...) is never called in non-DEBUG kernel.
As a result "count" variable is always 0, and "goto fail" is always
reached. This means get_clones_stat function never makes up list
of clones for "clones" properties.
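
One common way to end up with a call that only runs in DEBUG kernels is
to place it inside an assertion macro that compiles to nothing
otherwise. The sketch below assumes that is what happened here; the
SKETCH_ASSERT macro, sketch_count() and both callers are hypothetical
and are not the ZFS code:

```c
#include <stdint.h>

/* Hypothetical assertion macro, compiled out unless SKETCH_DEBUG is set. */
#ifdef SKETCH_DEBUG
#include <stdlib.h>
#define	SKETCH_ASSERT(e)	((e) ? (void)0 : abort())
#else
#define	SKETCH_ASSERT(e)	((void)0)
#endif

/* Stand-in for a counting routine that reports its result via *count. */
static int
sketch_count(uint64_t *count)
{
	*count = 3;
	return (0);
}

/* Buggy shape: the call disappears together with the assertion. */
static uint64_t
count_buggy(void)
{
	uint64_t count = 0;

	SKETCH_ASSERT(sketch_count(&count) == 0);
	return (count);
}

/* Fixed shape: always make the call, assert only on its result. */
static uint64_t
count_fixed(void)
{
	uint64_t count = 0;
	int err;

	err = sketch_count(&count);
	SKETCH_ASSERT(err == 0);
	(void)err;
	return (count);
}
```

Built without SKETCH_DEBUG, count_buggy() never calls sketch_count() and
returns 0, while count_fixed() returns the real count.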
MFC after: 2 weeks
k_ipad.
Note that the two consumers in geli(4) are not affected by this
issue because of the way the code is constructed and, as such, we
believe there is no security impact with or without this change
with geli(4)'s usage.
Reported by: Serge van den Boom <serge vdboom.org>
Reviewed by: pjd
MFC after: 2 weeks
If a VM-exit is caused by an NMI then "blocking by NMI" is in effect on the
CPU when the VM-exit is completed. No more NMIs will be recognized until
the execution of an "iret".
Prior to this change the NMI handler was dispatched via a software interrupt
with interrupts enabled. This meant that an interrupt could be recognized
by the processor before the NMI handler completed its execution. The "iret"
issued by the interrupt handler would then cause the "blocking by NMI" to
be cleared prematurely.
This is now fixed by handling the NMI with interrupts disabled in addition
to "blocking by NMI" already established by the VM-exit.
- ip_output() and ip6_output() simply call flowtable_lookup(),
passing mbuf and address family. That's the only code under
#ifdef FLOWTABLE in the protocols code now.
o Revamp statistics gathering and export.
- Remove hand made pcpu stats, and utilize counter(9).
- Snapshot of statistics is available via 'netstat -rs'.
- All sysctls are moved into net.flowtable namespace, since
spreading them over net.inet isn't correct.
o Properly separate at compile time INET and INET6 parts.
o General cleanup.
- Remove chain of multiple flowtables. We simply have one for
IPv4 and one for IPv6.
- Flowtables are allocated in flowtable.c, symbols are static.
- With proper argument to SYSINIT() we no longer need flowtable_ready.
- Hash salt doesn't need to be per-VNET.
- Removed rudimentary debugging, which is quite useless in the dtrace era.
The runtime behavior of flowtable shouldn't be changed by this commit.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
and where the code that references it can safely be elided if it's not
defined (meaning the code is used for legacy arm platforms that still
define the compile-time PHYSADDR but not on newer systems that calculate
the value at runtime).
in the sysctl_root().
Note: SYSCTL_VNET_* macros can be removed as well. All that is
needed to virtualize a sysctl oid is to set CTLFLAG_VNET on it.
But for now keep macros in place to avoid large code churn.
Sponsored by: Nginx, Inc.