r274560 modified kqueue_register() to only test the event condition if the
corresponding knote is not disabled. However, this check takes place before
the EV_ENABLE flag is used to clear the KN_DISABLED flag on the knote, so
enabling a previously-disabled kevent would not result in a notification for
a triggered event. This change fixes the problem by testing for EV_ENABLED
before possibly checking the event condition.
This change also updates a kqueue regression test to exercise this case.
PR: 206368
Reviewed by: kib
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D5307
This will speed up some tree-walks with FAST_DEPEND which otherwise
would include length(SRCS) .depend files.
This also uses a trick suggested by sjg@ to still read them in when
specifying _V_READ_DEPEND=1 in the env/make args.
Sponsored by: EMC / Isilon Storage Division
the if statement it pairs with). While not an error today, a careless
edit in the future could cause problems (though given the nature of
this specific code, the problems quite likely would be some variation
of "most direct access SCSI storage devices won't attach," which is
unlikely to go unnoticed).
PVS-Studio: V705
Import portions of the PowerPC OF PCI implementation into
new file "ofw_pci.c", common for other platforms. The files ofw_pci.c and
ofw_pci.h from sys/powerpc/ofw no longer exist. All required declarations
are moved to sys/dev/ofw/ofw_pci.h.
This creates a new ofw_pci_write_ivar() function and modifies
ofw_pci_nranges(), ofw_pci_read_ivar(), ofw_pci_route_interrupt() methods.
Most functions contain existing ppc implementations in the majority
unchanged. Now there is no need to have multiple identical copies
of methods for various architectures.
Submitted by: Marcin Mazurek <mma@semihalf.com>
Obtained from: Semihalf
Sponsored by: Annapurna Labs
Reviewed by: jhibbits, mmel
Differential Revision: https://reviews.freebsd.org/D4879
Provide bus_get_bus_tag() for sparc64, powerpc, arm, arm64 and mips
nexus and its children in order to return a platform specific default tag.
This is required to ensure generic correctness of the bus_space tag.
It is especially needed for arches where child bus tag does not match
the parent bus tag. This solves the problem with ppc architecture
where the PCI bus tag differs from parent bus tag which is big-endian.
This commit is a part of the following patch:
https://reviews.freebsd.org/D4879
Submitted by: Marcin Mazurek <mma@semihalf.com>
Obtained from: Semihalf
Sponsored by: Annapurna Labs
Reviewed by: jhibbits, mmel
Differential Revision: https://reviews.freebsd.org/D4879
Resource list for devices that are not ofwbus descendants, but
got to ofwbus method via bus_generic_release_resource() call chain,
cannot be found using BUS_GET_RESOURCE_LIST() used by ofwbus.
In that case, changing device's resource list should be avoided
(will not contain resource list prepared by ofw or simplebus).
Pointy-hat to: zbb
Reviewed by: wma
Obtained from: Semihalf
Sponsored by: Cavium
Differential Revision: https://reviews.freebsd.org/D5304
So one spinlock is avoided, which would be potentially dangerous for
virtual machine, if the spinlock holder was scheduled out by the host,
as noted by royger.
Old spinlock based txdesc list is still kept around, so we could have
a safe fallback.
No performance regression nor improvement is observed.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5290
This paves the way for upcoming vRSS stuffs and eases more code cleanup.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5283
Performance stays same; so no need to use fast taskqueue here.
Suggested by: royger
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5282
This also eases experiment on the non-fast taskqueue.
Reviewed by: adrian, Jun Su <junsu microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5276
This paves the way for upcoming vRSS stuffs and eases more code cleanup.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5275
And use SYSCTL+CTLFLAG_RDTUN for them.
Suggested by: adrian
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5274
This one gives the best performance so far.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5273
It is off by default. This eases further experimenting on this driver.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5272
Set TCP ACK append limit to 1, i.e. aggregate 2 ACKs at most. Aggregating
anything more than 2 hurts TCP sending performance in hyperv. This
significantly improves the TCP sending performance when the number of
concurrent connetion is low (2~8). And it greatly stabilizes the TCP
sending performance in other cases.
Set TCP data segments aggregation length limit to 37500. Without this
limitation, hn(4) could aggregate ~45 TCP data segments for each
connection (even at 64 or more connections) before dispatching them to
socket code; large aggregation slows down ACK sending and eventually
hurts/destabilizes TCP reception performance. This setting stabilizes
and improves TCP reception performance for >4 concurrent connections
significantly.
Make them sysctls so they could be adjusted.
Reviewed by: adrian, gallatin (previous version), hselasky (previous version)
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5185
ACK aggregation limit is append count based, while the TCP data segment
aggregation limit is length based. Unless the network driver sets these
two limits, it's an NO-OP.
Reviewed by: adrian, gallatin (previous version), hselasky (previous version)
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5185
unlinked. Otherwise the vnode stays cached, causing leak. This is
similar to r292961 for regular files.
Reported and tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
numbers range.
This effectively skips indirect and extdata blocks on the buffer
queue. Since their logical block numbers are negative, bnoreuselist()
could loop infinitely.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
configuration from the FDT data, then set the pins into the requested
state. As part of this the gpio controller now reports the correct number
of pins instead of returning the number of bank * 32.
To allow for a future consolidated kernel we add the SOC_ALLWINNER_A10 and
SOC_ALLWINNER_A20 kernel options. These need to be set as appropriate for
the SoC the kernel will boot on.
Submitted by: Emmanuel Vadot <manu@bidouilliste.com>
Differential Revision: https://reviews.freebsd.org/D5177
for all struct bio you get back from g_{new,alloc}_bio. Temporary
bios that you create on the stack or elsewhere should use this before
first use of the bio, and between uses of the bio. At the moment, it
is nothing more than a wrapper around bzero, but that may change in
the future. The wrapper also removes one place where we encode the
size of struct bio in the KBI.
are not utilized there. Only domain #0 is used and there is no reference
to it in the whole pmap-v6.c. Thus initialize domain access register in
locore-v6.c without reference too.
Before this change all mappings done by this function were executable
as pte entries have NOT EXECUTABLE bit.
The function is used only for static device mappings at present. Thus
this is also a fix as DEVICE memory should not be mapped as executable.
struct socket. When sockbuf.h was moved out of socketvar.h, the locking
key was no longer nearby. Instead, add a new key for sockbuf and use
a single item for the socket buffer lock instead of separate entries for
receive vs send buffers.
Reviewed by: adrian
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D4901
will allow for code that uses the old fdt_get_range and fdt_regsize
functions to find a range, map it, access, then unmap to replace this, up
to and including the map, with a call to OF_decode_addr.
As this function should only be used in the early boot code the unmap is
mostly do document we no longer need the mapping as it's a no-op, at least
on arm.
Reviewed by: jhibbits
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D5258
am335x_prcm driver which uses it. Used BUS_PASS_BUS is a quick pick
for now and may be revised when other drivers start using multipass
feature.
This is needed after an update of Linux dts files done in r295436.
(1) The channel mask is get from "brcm,dma-channel-mask" property of
dma node, and if not provided, from "broadcom,channels" property.
(2) Consequently, sdhci driver does not allocate any specific channel.
(3) Use CS_RESET bit for initial channel reset.
Differential Revision: https://reviews.freebsd.org/D4303
A10/A20 SoC. Based loosely on the submitters NetBSD driver, tested on
Cubieboard 2. Playback and capture are supported.
Submitted by: Jared McNeill <jmcneill@invisible.ca>
Differential Revision: https://reviews.freebsd.org/D5202
Some chip revisions don't have their external PCIe buses
behind the internal bridge. Add support for FDT-configurable
PEMs but keep ability for PCIe enumeration.
Reviewed by: andrew, wma
Obtained from: Semihalf
Sponsored by: Cavium
Differential Revision: https://reviews.freebsd.org/D5285
dts files. It may be removed once it will be fixed upstream.
This is done just to supresses a warning during dtb evaluation as
there is no elm driver in tree at present.
least the audio codec driver currently in review.
Submitted by: Jared McNeill <jmcneill@invisible.ca>
Differential Revision: https://reviews.freebsd.org/D5050
with this and an Allwinner SoC to power off.
Submitted by: Emmanuel Vadot <manu@bidouilliste.com>
Differential Revision: https://reviews.freebsd.org/D4954
Marvell twsi part, however uses different register locations, as such split
the existing driver into Marvell and Allwinner attachments.
While here clean a few style issues.
Submitted by: Emmanuel Vadot <manu@bidouilliste.com>
Differential Revision: https://reviews.freebsd.org/D4846
This patch allows the newly imported INTRNG code to be built without necessarily
having FDT support in the kernel. This may be useful for some MIPS platforms
that wish to move to INTRNG, but not to FDT at the same time.
Basically all the code is already within ifdef's where FDT is concerned,
it's just the headers that aren't.
Submitted by: Stanislav Galabov <sgalabov@gmail.com>
Differential Revision: https://reviews.freebsd.org/D5249
This patch comes from Dave Jiang's Linux tree, davejiang/ntb. It hasn't
been accepted into Linus' tree, so I do not have an authoritative SHA1
to point at. Original commit log:
=====================================================================
A hardware errata causes the NTB to hang when heavy bi-directional
traffic in addition to the usage of BAR0/1 (where the registers reside,
including the doorbell registers to trigger interrupts).
This workaround is only available on Haswell and Broadwell platform.
The workaround is to enable split BAR in the BIOS to allow the 64bit
BAR4 to be split into two 32bit BAR4 and BAR5. The BAR4 shall be pointed
to LAPIC region of the remote host. We will bypass the db mechanism and
directly trigger the MSIX interrupts. The offsets and vectors are
exchanged during transport scratch pad negotiation. The scratch pads are
now overloaded in order to allow the exchange of the information. This
gets around using the doorbell and prevents the lockup with additional
pcode changes in BIOS.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
=====================================================================
Notable changes in the FreeBSD version of this patch:
* The MSIX BAR is configurable, like hw.ntb.b2b_mw_idx (msix_mw_idx).
The Linux version of the patch only uses BAR4.
* MSIX negotiation aborts if the link goes down.
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Sync with r84642 from UFS:
The panics are inappropriate because the IN_RENAME flag only fixes a
few of the huge number of race conditions that can result in the
source path becoming invalid even prior to the VOP_RENAME() call.
Found accidentally while checking an issue from PVS Static Analysis.
MFC after: 3 days
The I/OAT HW reset process may sleep, so it is invalid to perform a
channel reset from the software interrupt thread.
Sponsored by: EMC / Isilon Storage Division
Cleanup some checks for NULL. Most of these were always unnecessary and
starting with r294954 brelse() doesn't need any NULL checks at all.
For now keep the checks somewhat consistent with NetBSD in case we want to
merge the cleanups to older versions.
It is otherwise left dangling, and callers that request cookies always free
the cookie buffer, even when VOP_READDIR(9) returns an error. This results
in a double free if tmpfs_readdir() returns an error to the NFS server or
the Linux getdents(2) emulation code.
Reported by: pho
MFC after: 1 week
Security: double free of malloc(9)-backed memory
Sponsored by: EMC / Isilon Storage Division
values.
If switching from a thread that used floating-point registers to a thread
that is still running, but holding the blocked_lock lock we would switch
the curthread to the new (running) thread, then call critical_enter. This
will non-atomically increment td_critnest, and later call critical_exit to
non-atomically decrement this value.
This can happen at the same time as the new thread is still running on the
old core, also calling these functions. In this case there will be a race
between these non-atomic operations. This can be an issue as we could loose
one of these operations leading to the value to not return to zero.
If, later on, we then hit a data abort we check if the td_critnest is zero.
If this check fails we will panic the kernel.
This has been observed when running pcmstat on a Cavium ThunderX. The pcm
thread will use the blocked_lock lock and there is a high chance userspace
will use the floating-point registers. When, later on, pmcstat triggers a
data abort we will hit this panic.
The fix is to update these values after storing the floating-point state.
This means we use the correct curthread while storing the state so it will
not be an issue that the changes to td_critnest are non-atomic.
Sponsored by: ABT Systems Ltd
ucontext_t available. Our code even has XXX comment about this.
Add a bit of compliance by moving struct __ucontext definition into
sys/_ucontext.h and including it into signal.h and sys/ucontext.h.
Several machine/ucontext.h headers were changed to use namespace-safe
types (like uint64_t->__uint64_t) to not depend on sys/types.h.
struct __stack_t from sys/signal.h is made always visible in private
namespace to satisfy sys/_ucontext.h requirements.
Apparently mips _types.h pollutes global namespace with f_register_t
type definition. This commit does not try to fix the issue.
PR: 207079
Reported and tested by: Ting-Wei Lan <lantw44@gmail.com>
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
supported, use full-width aliases MSRs for writes. This fixes the
"[pmc,X] negative increment" assertion on the context switch when
clipped counter value is sign-extended.
Add definitions for the MSR IA32_PERF_CAPABILITIES needed to detect
the feature.
PR: 207068
Submitted by: joss.upton@yahoo.com
MFC after: 2 weeks
mbuf(9) manipulation functions into uipc_mbuf.c. This looks like
the initial intent, but had diffused in the last decade.
o Gather all declarations in mbuf.h in one place and sort them.
o Uninline m_clget() and m_cljget().
There are no functional changes in this patch.
The patch comes from a larger version, where all mbuf(9) allocation was
uninlined, which allowed to make mbuf(9) UMA zones private to kern_mbuf.c.
The performance impact of the total uninlining is still unclear, so we
are holding on now with larger version.
Together with: melifaro, olivier
nvme(4) issues a SET_NUM_QUEUES command during device
initialization to ensure enough I/O queues exists for each
of the MSI-X vectors we have allocated. The SET_NUM_QUEUES
command is then issued again during nvme_ctrlr_start(), to
ensure that is properly set after any controller reset.
At least one NVMe drive exists which fails this second
SET_NUM_QUEUES command during device initialization. So
change nvme_ctrlr_start() to only issue its SET_NUM_QUEUES
command when it is coming out of a reset - avoiding the
duplicate SET_NUM_QUEUES during device initialization.
Reported by: gallatin
MFC after: 3 days
Sponsored by: Intel
Fix a panic that occurs when a vnet interface is unavailable at the time the
vnet jail referencing said interface is stopped.
Sponsored by: FIS Global, Inc.
o Add kernel configuration for QEMU.
Both SPIKE and QEMU kernel configs are temporary (until
we will be able to obtain DTB from loader).
Sponsored by: DARPA, AFRL
Sponsored by: HEIF5
Summary:
The revised Book-E spec, adding the specification for the MMUv2 and e6500,
includes a hardware PTE layout for indirect page tables. In order to support
this in the future, migrate the PTE format to match the MMUv2 hardware PTE
format.
Test Plan: Boot tested on a P5020 board. Booted to multiuser mode.
Differential Revision: https://reviews.freebsd.org/D5224
- Add MOVI command and routine for the LPI migration
- Allow to search for the ITS device descriptor using
not only devID but also LPI number.
- Bind SPIs in the Distributor
- Don't bind its_dev to collection. Keep track of the collection
IDs for each LPI.
Reviewed by: wma
Obtained from: Semihalf
Sponsored by: Cavium
Differential Revision: https://reviews.freebsd.org/D5231
- Change locks' names to be more suitable
- Don't use blocking mutex. Lock only basic operations such
as lists or bitmaps modifications.
Reviewed by: wma
Obtained from: Semihalf
Sponsored by: Cavium
Differential Revision: https://reviews.freebsd.org/D5230
This should be done by routing all interrupts to CPU0,
different assignment will be induced by either interrupts
shuffling or bus_bind_intr().
Reviewed by: wma
Obtained from: Semihalf
Sponsored by: Cavium
Differential Revision: https://reviews.freebsd.org/D5229
pmc_hook() was called only in case of the stray interrupt but should
rather be called on each interrupt. Move in to the arm_cpu_intr()
handler, out of the critical section too.
Reviewed by: br
Obtained from: Semihalf
Sponsored by: Cavium
Differential Revision: https://reviews.freebsd.org/D5161
It can be used to bind specific interrupt to a particular CPU.
Requires PIC support for interrupts binding.
Reviewed by: wma
Obtained from: Semihalf
Sponsored by: Cavium
Differential Revision: https://reviews.freebsd.org/D5122
Separate interrupt descriptors lookup from allocation. It was possible
to perform config on non-existing interrupt simply by allocating spurious
descriptor.
Must lock the interrupt descriptors table lookup to avoid mismatches.
This ought to prevent trouble while setting up new interrupt
and dispatching existing one.
Use spin mutex rather than sleep mutex. This is mainly due to lock in
arm_dispatch_intr.
This should be eventually changed to a lock-less solution without
walking through a linked list on each interrupt.
Reviewed by: andrew, wma
Obtained from: Semihalf
Sponsored by: Cavium
Differential Revision: https://reviews.freebsd.org/D5121
xn_ifp is allocated in create_netdev with if_alloc(IFT_ETHER).
According to the current arrangement it can't be NULL.
Coverity ID: 1349805
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D5252
The variable error is assigned to 0 before entering the switch.
Assigning error to 0 before break pointless rewrites the real error
value that should be returned.
Coverity ID: 1304974
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D5250
This fixes operation in QEMU and saves some booting time as well.
Pointed out by: Sagar Karandikar <skarandikar@berkeley.edu>
Sponsored by: DARPA, AFRL
Sponsored by: HEIF5
the sign bit doesn't cause an overflow. The overflow manifests itself
as a sorting index wrap around in the middle of the sorted array,
which is not a problem for the LRO code, but might be a problem for
the logic inside qsort().
Reviewed by: gnn @
Sponsored by: Mellanox Technologies
Differential Revision: https://reviews.freebsd.org/D5239
pmap_unmapdev respectively) so that resources are properly managed.
This is work originally done by kan@. Stanislav picked it up as part
of his Mediatek SoC work.
Tested:
* Carambola2, AR933x SoC
Submitted by: Stanislav Galabov <sgalabov@gmail.com>
Reviewed by: kan
Differential Revision: https://reviews.freebsd.org/D5184
This was originall done by kan@.
Submitted by: Stanislav Galabov <sgalabov@gmail.com>
Reviewed by: kan
Differential Revision: https://reviews.freebsd.org/D5184
This is a prelude to intr-ng support for MIPS boards that need it -
notably the CI20 port from kan@ that's upcoming, but also work that
Stanislav is doing for the Mediatek platforms.
This is the initial platform dependent bits in include/intr.h, some
#defines for the nexus code for the intrng initialisation/runtime
bits, some changed naming (which I'll fix later to be the same, much
like what I did for ARM intr-ng) in exception.S, and the first cut
at a PIC.
Stanislav and I refactored out the common code for intrng support,
so the mips intrng definitions are quite small (sys/mips/include/intr.h.)
This is all work done by kan@, which stanislav has been cherry picking
into common code for his mediatek chipset work.
Tested:
* Carambola2 - no regressions (not intr-ng though!)
Submitted by: Stanislav Galabov <sgalabov@gmail.com>
Reviewed by: kan (original author)
Differential Revision: https://reviews.freebsd.org/D5182
This is ongoing work from Damjan Jovanovic to improve ext4 read support
with sparse files:
Keep track of the first and last block in each extent as it descends down
the extent tree, thus being able to work out that some blocks are sparse
earlier. This solves an issue on r293680.
In ext4_bmapext() start supporting the runb parameter, which appears to be
the number of adjacent blocks prior to the block being converted in the
same way that runp is the number of blocks after, speding up random access
to mmaped files.
PR: 206652
Replace the hw.ntb.enable_writecombine tunable with
hw.ntb.default_mw_pat. It can be set with several specific numerical
values to select a caching type. Any bogus value is treated as
Uncacheable (UC).
The ntb_mw_set_wc() KPI has removed the restriction that the selected
mode must be one of UC, WC, or WB.
Sponsored by: EMC / Isilon Storage Division
The IOCTL is used by 'ifconfig -v' to show SFP+/QSFP+ information
including inventory information and dianostics (temperature, light
levels, voltage etc).
Reviewed by: gnn,melifaro
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D5240
* Use the Linux compat string
* Use EARLY_DRIVER_MODULE to attach at the right time
* Add a generic A10 kernel config file
* A20 now use generic_timer
* Add two new dts files for Olimex boards
* Update our custom DTS file for A10 and A20 to use the same compatible
property names as the vendor ones.
Submitted by: Emmanuel Vadot <manu@bidouilliste.com>
Differential Revision: https://reviews.freebsd.org/D4792
made writeable by the root user. Userspace audio daemons can add or
update an entry in /dev/sndstat by doing a single system write call to
any /dev/sndstat file descriptor handle. When the audio daemon closes the
file handle or is killed the entry disappears.
While at it, cleanup the sound status code a bit:
- keep the device list sorted to avoid sorting the list every time a
/dev/sndstat read request is made.
- factor out locking into a pair of locking macros.
- use the sound status lock to protect all per file handle states,
when generating the output for /dev/sndstat and when removing or
adding sound status devices. This way sndstat_acquire() and
sndstat_release() become superfluous and can be removed.
Reviewed by: mav @
Differential Revision: https://reviews.freebsd.org/D5191
copyin and copyout code handle virtual addresses such that they will take
a virtual address and convert it into a valid physical address. It may
also mean we fail to boot as the elf files load address could be 0.
Sponsored by: ABT Systems Ltd
Prevent the function from null-pointer-dereference when unexisting
mapping is being processed.
Obtained from: Semihalf
Sponsored by: Cavium
Approved by: cognet (mentor)
Reviewed by: zbb, cognet
Differential revision: https://reviews.freebsd.org/D5228
Currently, there is no easy way to know in advance how many entries a list parsed by
ofw_bus_parse_xref_list_alloc() in sys/dev/ofw/ofw_bus_subr.c has.
This patch:
* teaches the existing function about handling idx == -1 and returning how big
the set is; then renames it as _internal;
* create a new function that asserts idx != -1, so the old API is maintained;
* add a new function that returns just the list length.
Submitted by: Stanislav Galabov <sgalabov@gmail.com>
Differential Revision: https://reviews.freebsd.org/D5043
so will access data from an unrelocated address. This is only needed for
self relocating code on ARMv7, however this is true for both ubldr and
loader.efi, the only two loaders we support on ARMv7.
While here also force the fpu to be none as is done in libstand.
Sponsored by: ABT Systems Ltd
stores to clear it.
While here reduce the alignment of the data from 4k to 16 byte aligned.
This should be more than enough, without wasting too much space.
Sponsored by: ABT Systems Ltd
Kernel threads (and processes) are supposed to call kthread_exit() (or
kproc_exit()) to terminate. However, the kernel includes a fallback in
fork_exit() to force a kthread exit if a kernel thread's "main" routine
returns. This fallback was added back when the kernel only had processes
and was not updated to call kthread_exit() instead of kproc_exit() when
threads were added to the kernel.
This mistake was particular exciting when the errant thread belonged to
proc0. Due to the missing P_KTHREAD flag the fallback did not kick in
and instead tried to return to userland via whatever garbage was in the
trapframe. With P_KTHREAD set it tried to terminate proc0 resulting in
other amusements.
PR: 204999
MFC after: 1 week
All other kernel processes have this flag set and all threads in proc0
(including thread0) have the similar TDP_KTHREAD flag set.
PR: 204999
Submitted by: Oliver Pinter @ HardenedBSD
Reviewed by: kib
MFC after: 1 week
and a retry is scheduled.
Instead of leaving the device queue frozen, unfreeze the device queue so
that the retry can happen.
Sponsored by: Spectra Logic
MFC after: 3 days
This allows skipping 'make depend' or running 'make clean all' without
getting a flip-flopping dependency due to the exists() just below.
Otherwise an error is encountered, such as:
fatal error: 'machine/endian.h' file not found.
Sponsored by: EMC / Isilon Storage Division
While the later is a better random generator than the former, the main
reason of the change is that random() has a better chance to work with
libstand(3).
At this time we don't include random number generators in bootforth
so this has no effect.
in boot1, like is normally done. When a keyboard appears in the UEFI
device tree, assume -D -h, just like on a BIOS boot.
# It is unclear if an ACPI keyboard appearing in the tree means there's
# a real keyboard or not. A USB keyboard doesn't seem to appear unless
# it is really there.
Differential Revision: https://reviews.freebsd.org/D5223
I previously disconnected kgzdr based on a misunderstanding.
I'd still like to transition to supporting only the loader(8)-based
boot path for handling compressed kernels, but that can follow the
standard deprecation procedure.
This reverts r291113.
Requested by: dteske
disable compilation of the code which made it possible to call
stop_all_proc() from usermode at all.
Move the comment to the preamble of stop_all_proc() and reword it to
give overview of the function intent.
proc0 has P_HADTHREADS flag set due to kthread_add(), but no
P_KTHREAD, which triggered the assert, which does not serve a purpose
now.
Reported by: Oliver Pinter
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Although POSIX literally permits failing with [EINVAL] if IPC_CREAT and
IPC_EXCL were both passed, the semaphore set already exists and has fewer
semaphores than nsems, this does not allow an application to retry safely:
if the [EINVAL] is actually because of the semmsl limit, an infinite loop
would result.
PR: 206927
CID 1018688 is a false positive.
The initialization is done by calling vn_start_write(... &mp, flags).
mp is only an output parameter unless (flags & V_MNTREF), and fdesc
doesn't put V_MNTREF in flags.
Pointed out by: bde
This causes boot from external media (installer USB image) to mount / from
the default ZFS BE, rather than the USB device.
Reported by: kmoore
MFC after: 5 days
Sponsored by: ScaleEngine Inc.
With warnings now enabled some plaforms where failing due to warnings.
* Fix st_size printed as a size_t when its actually an off_t.
* Fix pointer conversion in load_elf for some 32bit platforms due to 64bit
off in ef.
MFC after: 2 days
X-MFC-With:
Sponsored by: Multiplay
when its result is immediately ignored, i.e. for kernel processes
forked from the user process. Do not test for non-null before freeing
string.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Typically <foo>list is used for a structure that holds a list head in
FreeBSD, not for members of a list. As such, rename 'struct aiocblist'
to 'struct kaiocb' (the kernel version of 'struct aiocb').
While here, use more consistent variable names for AIO control blocks:
- Use 'job' instead of 'aiocbe', 'cb', 'cbe', or 'iocb' for kernel job
objects.
- Use 'jobn' instead of 'cbn' for use with TAILQ_FOREACH_SAFE().
- Use 'sjob' and 'sjobn' instead of 'scb' and 'scbn' for fsync jobs.
- Use 'ujob' instead of 'aiocbp', 'job', 'uaiocb', or 'uuaiocb' to hold
a user pointer to a 'struct aiocb'.
- Use 'ujobp' instead of 'aiocbp' for a user pointer to a 'struct aiocb *'.
Reviewed by: kib
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D5125
Major changes:
- Add i219/i219(2) hardware support. (Found on Skylake generation and newer
chipsets.)
- Further to the last Skylake support diff, this one also includes support for
the Lewisburg chipset (i219(3)).
- Add a workaround to an igb hardware errata.
All 1G server products need to have IPv6 extension header parsing turned off.
This should be listed in the specification updates for current 1G server
products, e.g. for i350 it's errata #37 in this document:
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/ethernet-controller-i350-spec-update.pdf
- Avoton (i354) PHY errata workaround added
And a bunch of minor fixes, as well as #defines for things that the current
em(4)/igb(4) drivers don't implement.
Differential Revision: https://reviews.freebsd.org/D3162
Reviewed by: sbruno, marius, gnn
Approved by: gnn
MFC after: 2 weeks
Sponsored by: Intel Corporation
Fix EFI boot support when presented with multiple valid boot partitions
across multiple devices.
It now prefers to boot from partitions that are present on the underlying
device that the boot1 image was loaded from. This means that it will boot
from the partitions on device the user chose from EFI boot menu in
preference to those on other devices.
Also fixed is the recovery from a failed attempt to boot, from a seemingly
valid partition, by continuing to trying all other available partitions
no matter what the error.
boot1 now use * to signify a partition what was accepted from the preferred
device and + otherwise.
Finally some error messages where improved and DPRINTF's with slowed boot
to aid debugging.
ZFS will still be preferred over UFS when both are available on the boot
device.
Reviewed by: imp
MFC after: 1 week
Sponsored by: Multiplay
Differential Revision: https://reviews.freebsd.org/D5108
We will eventually convert them to use busdma.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: adrian, sephe, Dexuan Cui <decui microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5087
And convert rndis non-hot path spinlock to mutex.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: adrian, sephe
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5081
HyperV code was ported from Linux. There is an implementation of
work queue called hv_work_queue. In FreeBSD, taskqueue could be
used for the same purpose. Convert all the consumer of hv_work_queue
to use taskqueue, and remove work queue implementation.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4963
It is off by default. This eases more experiment on hn(4).
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5175
This significantly increases LRO aggregation ratio when there are
large amount of connections (improves reception performance a lot).
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5167
hn(4) only has one RX ring currently, so default 8 LRO entries
are too small.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5166
We lost half of the chimney sending space, because we mis-used
ffs() on a 64 bits mask, where ffsl() should be used.
While I'm here:
- Use system atomic operation instead.
- Stringent chimney sending index assertion.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5159
It will be shared w/ upcoming ifnet.if_transmit implementaion.
No functional changes.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5158
So that:
- TCP/IP stack will not do unnecessary IP header checksum for TSO
packets.
- Reduce guest load for non-TSO IP packets.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5099
- For non-TSO offloading, we don't need to access mbuf to know
which csum offloading is requested, we can just use the
CSUM_{IP,TCP,UDP} in the csum_flags.
- For TSO offloading, we still can depend on CSUM_{TSO4,TSO6}
in the csum_flags to tell whether the TSO packet is an IPv4
TSO packet or an IPv6 TSO packet.
This streamlines csum offloading handling (remove the two goto)
and allows us the nuke the unnecessary get_transport_proto_type().
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5098
- Record csum features in softc, so we don't need to duplicate the
logic from attach path to ioctl path.
- Protect if_capenable and if_hwassist changes by main lock.
- Prefer turn on/off bits in if_hwassist explicitly instead of using
XOR.
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5085
a mips big-endian board.
This is (hopefully! ish!) a temporary change until a slightly better way
can be found to express this without a config option.
Tested:
* BUFFALO WZR-HP-G300NH 1stGen (by submitter)
Submitted by: Mori Hiroki <yamori813@yahoo.co.jp>
with interpreter name exactly matching one wanted by the binary. If
no such brand exists, return first brand which accepted the binary by
note.
The change fixes a regression after r292749, where e.g. our two ia32
compat brands, ia32_brand_info and ia32_brand_oinfo, only differ by
the interpeter path and binary matches to a brand by linkage order.
Then old binaries which require /usr/libexec/ld-elf.so.1 but matched
against ia32_brand_info with interp_path /libexec/ld-elf.so.1, were
considered requiring non-standard interpreter name, and magic to force
ld-elf32.so.1 did not happen.
Note that it might make sense to apply the same selection of brands
for other matching criteria, SCO EI_OSABI and 3.x string.
Reported and tested by: dwmalone
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Add #defines for ATA_WRITE_UNCORRECTABLE48 and its features. Update the
decoding in ATACAM to recognize the new values. Also improve command
decoding for a few other commands (SMART, NOP, SET_FEATURES). Bring the
decoding in ata(4) up to parity with ATACAM.
Reviewed by: mav, imp
MFC after: 1 month
Sponsored by: Panasas, Inc.
Differential Revision: https://reviews.freebsd.org/D5181
also for 82598, which doesn't support it.
The legacy code has a check for it, which was missed when the code for dealing with
CSUM_IP6_* was added. Add the same check for FreeBSD 10 and higher.
Differential Revision: https://reviews.freebsd.org/D5192
ABI of struct fpreg. The FPU emulator operates on the "raw" FPU state
stored in the pcb rather than the "cooked" fpreg state used for ptrace()
and cores.
Reported by: bz
16-byte value. With this the hardware will check if a memory access uses
an incorrectly aligned stack pointer as the base address.
Sponsored by: ABT Systems Ltd
which return -1 as well as on tier 1 archs. Remove block_userspace_access
used only in these implementations.
(1) These functions may be called in interrupt context and pcb_onfault
can be already set in this time. Thus, prior pcb_onfault must be saved
and restored afterwards.
(2) The check that an abort came either from nested interrupt or while
in critical section or holding not sleepable lock must be avoided for
this case.
These functions are called only for profiling reason, so there will be
only small gain by making the code more complex.
export_args on mount update, bzero() is consistent with
vfs_oexport_conv().
Make the code structure more explicit by using switch.
Return EINVAL if export option layout (deduced from size) is unknown.
Based on the submission by: bde
Sponsored by: The FreeBSD Foundation
(1) Move cnt.v_trap increment to the beginning. There is cnt.v_vm_faults
counter in vm_fault(), so a number of hardware emulation aborts may be
get roughly as difference.
(2) Move kdb_reenter() up to not be ignored if pmap_fault() has failed.
(3) Update comments.
gp (global pointer) is used by compiler in userland only,
so re-use it for pcpup in kernel, save it on stack on switching
out to userland and load back on return to kernel.
Discussed with: jhb, andrew, kib
Sponsored by: DARPA, AFRL
Sponsored by: HEIF5
Differential Revision: https://reviews.freebsd.org/D5178