While I'm here, add comment along the attach DEVMETHOD.
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D7874
* hard code a noise floor of -96 for now. The noise floor update code returns
some "interesting" values that I can't map to anything useful right now.
* Ensure a default noise floor is set - otherwise the initial scan results
have a noise floor of '0'.
* Fix up the RSSI calculation to be correctly relative to the noise floor.
The RSSI routines return an absolute value in dBm - so fix this up.
* Cap RSSI values appropriately.
* Ensure we pass in a 1/2 dB unit value in to net80211.
Tested:
* Intel 7260, STA mode
iwm0: <Intel Dual Band Wireless AC 7260> mem 0xf1400000-0xf1401fff irq 17 at device 0.0 on pci2
iwm0: hw rev 0x140, fw ver 16.242414.0, address xx:xx:xx:xx:xx:xx
In r304870, refcount handling was lifted out into a common OTP/SPROM code
path, but the refcount assertions in chipc_disable_sprom_pins() were not
updated accordingly; this triggered an assertion on BCM4331 devices when
releasing a SPROM pin reservation.
Approved by: adrian (mentor, implicit)
- Use _Bool to not require userspace to include stdbool.h.
- Make extattr.h usable without vnode_if.h.
- Follow i_ump to get cdev pointer.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Remove redunand i_dev and i_fs pointers, which are available as
ip->i_ump->um_dev and ip->i_ump->um_fs, and reorder members by size to
reduce padding. To compensate added derefences, the most often i_ump
access to differentiate between UFS1 and UFS2 dinode layout is
removed, by addition of the new i_flag IN_UFS2. Overall, this
actually reduces the amount of memory dereferences.
On 64bit machine, original struct inode size is 176, reduced to 152
bytes with the change.
Tested by: pho (previous version)
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Make virtio_console(4) create `/dev/vtcon/<port_name>` alias pointing
to /dev/ttyVx.y upon receiving PORT_NAME (id = 7) event over the control
queue.
Approved by: trasz
MFC after: 1 month
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D7182
preserve the ABI and API for applications. It was removed in the port
to amd64, but was remained as garbage giving a micro-pessimization and
spurious single-step traps on i386.
pcb_psl was intended to be used just to do a context switch of PSL_I,
but this context switch was null in most or all versions, and
mis-switching of PSL_T was done instead.
Some history:
- in 386BSD-0.0, cpu_switch() ran at splhigh() and splhigh() did too
much interrupt disabling, so interrupts were hard-disabled across
cpu_switch() and too many other places
- in 386BSD-0.0-patchkit through FreeBSD-4 and FreeBSD-5 before
SMPng, splhigh() did soft interrupt masking, and cpu_switch() was
excessively cautious and did a cli at the start and a sti at the
end to hard-disable interrupts across the switch
- SMPng replaced the spl's and cli's by spinlocks (just sched_lock?),
so interrupts were hard-disabled across cpu_switch() and too many
other places again
- initial attempts to fix this intended to restore some soft
interrupt disabling, but to support variations in this cpu_switch()
used pushfl/popfl into pcb_psl to avoid hard-coding the assumption
that the initial and final states have PSL_I enabled. But the
version with soft interrupt disabling wasn't used for long, or was
never committed, (except I always used my different version of it
for UP) so the pushfl/popl and pcb_psl to hold them have been doing
less than nothing for about 14 years.
off single-stepping). Only do this on arches (only x86 so far)
which classify single-step traps unambiguously.
This allows other parts of the kernel to be intentionally and
unintentionally sloppy about generating single-step traps. On
x86, at least the following places were unintentionally sloppy:
- all operations that context-switched [er]flags. Especially
spinlock_enter()/exit() and cpu_switch(). When single-stepped,
saving the flags leaves PSL_T set in the saved flags, so
restoring gives a trap that is spurious if it occurs after
single-step mode has been left. Switching contexts away from
a low priority thread gives especially long-lived saved copies.
- the vm86 emulation allows user mode to set PSL_T. This was
correct until vm86 bios call mode was unintentionally given
access to kdb handling its single-step traps.
Now these places are intentionally sloppy, but unexpected
debugger traps still cause panics if no debugger that handles
the trap is attached when the trap is delivered.
IBSS negotiation is a subset of the STA/AP negotiation. We always have a
current channel, so base the HT capabilities on the current channel.
This is then put into IBSS probe requests to inform peers of our
11n capabilities.
When mlx5e_sq_xmit() returns an error code and the mbuf pointer is set,
we should not free the mbuf, because the caller will keep the mbuf in
the drbr. Make sure the mbuf pointer is correctly set upon function
exit.
Sponsored by: Mellanox Technologies
MFC after: 1 week
The hardware MTU size can't be set to a value less than 1500 bytes due
to side-band management support. Allow setting the software MTU size
below 1500 bytes, thus creating a mismatch between hardware and
software MTU sizes.
Sponsored by: Mellanox Technologies
MFC after: 1 week
Try to reuse code to setup sendqueues when possible by making some static
functions global. Further split the mlx5e_close_sq_wait() function to
separate out reusable parts.
Sponsored by: Mellanox Technologies
MFC after: 1 week
This change also reduces the size of the mlx5e_sq structure so that the last
queue_state element will fit into the previous cacheline and then the mlx5e_sq
structure becomes one cacheline less for amd64.
Sponsored by: Mellanox Technologies
MFC after: 1 week
Make some functions and structures global to allow for code reuse
when creating rate limiting sendqueues.
Sponsored by: Mellanox Technologies
MFC after: 1 week
Move setting of CQ moderation mode together with the other
CQ moderation parameters. Pass completion event vector as
a separate argument to mlx5e_open_cq(), because its value is
different for each call. Pass mlx5e_priv pointer instead of
mlx5e_channel pointer so that code can be used by rate
limiting sendqueues.
Sponsored by: Mellanox Technologies
MFC after: 1 week
This change allows for reusing the transmit path for so called
rate limited senqueues. While at it optimise some pointer lookups
in the fast path.
Sponsored by: Mellanox Technologies
MFC after: 1 week
- Add new firmware commands and update existing ones.
- Add more firmware related structures and update existing ones.
- Some minor fixes, like adding missing \n to some prints.
Sponsored by: Mellanox Technologies
MFC after: 1 week
i386-only section, and fix a comment about the amd64 kernel trapframe
not having stackregs.
tf_rsp doesn't need decoding on amd64, but had an old clone of i386
code to do this in 1 place, and since the amd64 kernel trapframe does
have stackregs, the result was an off-by-16 error for %rsp in an error
message.
current on first entry. This fixes a spurious "Stepping aborted"
message when the first entry is for a breakpoint.
Don't reset to the run mode to STEP_NONE when stopping, and remove
STEP_NONE. This mode was never really used, except transiently to
mis-decide whether to print the message on first entry.
While here, avoid using the old variable 'code' and remove it
in trap(). ('code' was meant for holding things like %dr6,
but is too small to hold %dr6 on amd64 and was reduced to an
obfuscation of tf_err, with early truncation on amd64.)
Submitted by: Michael Butler (imb@...)
The fill pattern was previously an ia64 instruction sequence. Presumably
ia64's linker script was copied as a starting point.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
come up as 't6nex' nexus devices with 'cc' ports hanging off them.
The T6 firmware and configuration files will be added as soon as they
are released. For now the driver will try to work with whatever
firmware and configuration is on the card's flash.
Sponsored by: Chelsio Communications
Add IEEE80211_KEY_SWCRYPT / IEEE80211_KEY_SWMIC bits to the
IEEE80211_KEY_DEVICE mask - as a result, those bits will be preserved
during group key handshake.
A driver can override them in iv_key_alloc() for some keys in case
when hardware crypto support is not possible. As an example:
- multi-vap without multicast key search support;
- IBSS RSN for devices w/ fixed storage for group keys;
Tested with RTL8188EU (AP, sw crypto) and
RTL8821AU (STA, sw crypto for group keys + hw crypto for pairwise keys)
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D7901
Move to a per-proc TLS offset rather than incorrectly keying off the
presense of freebsd32 compability in the kernel.
Reviewed by: adrian, sbruno
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D7843
This is not very easy to do, since ddb didn't know when traps are
for single-stepping. It more or less assumed that traps are either
breakpoints or single-step, but even for x86 this became inadequate
with the release of the i386 in ~1986, and FreeBSD passes it other
trap types for NMIs and panics.
On x86, teach ddb when a trap is for single stepping using the %dr6
register. Unknown traps are now treated almost the same as breakpoints
instead of as the same as single-steps. Previously, the classification
of breakpoints was almost correct and everything else was unknown so
had to be treated as a single-step. Now the classification of single-
steps is precise, the classification of breakpoints is almost correct
(as before) and everything else is unknown and treated like a
breakpoint.
This fixes:
- breakpoints not set by ddb, including the main one in kdb_enter(),
were treated as single-steps and not stopped on when stepping
(except for the usual, simple case of a step with residual count 1).
As special cases, kdb_enter() didn't stop for fatal traps or panics
- similarly for "hardware breakpoints".
Use a new MD macro IS_SSTEP_TRAP(type, code) to code to classify
single-steps. This is excessively complicated for bug-for-bug and
backwards compatibilty. Design errors apparently started in Mach
in ~1990 or perhaps in the FreeBSD interface in ~1993. Common trap
types like single steps should have a unique MI code (like the TRAP*
codes for user SIGTRAP) so that debuggers don't need macros like
IS_SSTEP_TRAP() to decode them. But 'type' is actually an ambiguous
MD trap number, and code was always 0 (now it is (int)%dr6 on x86).
So it was impossible to determine the trap type from the args.
Global variables had to be used.
There is already a classification macro db_pc_is_single_step(), but
this just gets in the way. It is only used to recover from bugs in
IS_BREAKPOINT_TRAP(). On some arches, IS_BREAKPOINT_TRAP() just
duplicates the ambiguity in 'type' and misclassifies single-steps as
breakpoints. It defaults to 'false', which is the opposite of what is
needed for bug-for-bug compatibility.
When this is cleaned up, MI classification bits should be passed in
'code'. This could be done now for positive-logic bits, since 'code'
was always 0, but some negative logic is needed for compatibility so
a simple MI classificition is not usable yet.
After reading %dr6, clear the single-step bit in it so that the type
of the next debugger trap can be decoded. This is a little
ddb-specific. ddb doesn't understand the need to clear this bit and
doing it before calling kdb is easiest. gdb would need to reverse
this to support hardware breakpoints, but it just doesn't support
them now since gdbstub doesn't support %dr*.
Fix a bug involving %dr6: when emulating a single-step trap for vm86,
set the bit for it in %dr6. Userland debuggers need this. ddb now
needs this for vm86 bios calls. The bit gets copied to 'code' then
cleared again.
Fix related style bugs:
- when clearing bits for hardware breakpoints in %dr6, spell the mask
as ~0xf on both amd64 and i386 to get the correct number of bits
using sign extension and not need a comment about using the wrong
mask on amd64 (amd64 traps for invalid results but clearing the
reserved top bits didn't trap since they are 0).
- rewrite my old wrong comments about using %dr6 for ddb watchpoints.
The 'cpu' and 'cpu_class' variables were always set to the same value
on amd64 and are legacy holdovers from i386. Remove them entirely on
amd64.
Reviewed by: imp, kib (older version)
Differential Revision: https://reviews.freebsd.org/D7888
* TCP_KEEPINIT
* TCP_KEEPINTVL
* TCP_KEEPIDLE
* TCP_KEEPCNT
always always report the values currently used when getsockopt()
is used. This wasn't the case when the sysctl-inherited default
values where used.
Ensure that the IPPROTO_TCP level socket option TCP_INFO has the
TCPI_OPT_ECN flag set in the tcpi_options field when ECN support
has been negotiated successfully.
Reviewed by: rrs, jtl, hiren
MFC after: 1 month
Differential Revision: 7833
SEL_UPL and sometimes PSL_VM. This is just a style change on amd64,
but on i386 it fixes 1 unimportant place where the PSL_VM check was
missing and starts fixing 1 important place where the PSL_VM check
had a logic error.
Fix logic errors in treating vm86 bioscall mode as kernel mode. The
main place checked all the necessary flags, but put the necessary
parentheses for the PSL_VM and PCB_VM86CALL checks in the wrong
place. The broken case is only reached if a vm86 bioscall uses a
%cs which is nonzero mod 4, but that is unusual -- most bios calls
start with %cs = 0xc000 or 0xf000 and rarely change it. Another
place was missing the check for PCB_VM86CALL, but was only reachable
if there are bugs virtualizing PSL_I.
Add a macro TF_HAS_STACKREGS() and use this instead of converting
open-coded checks of SEL_UPL, etc. to TRAPF_USERMODE() when we only
care about whether the frame has stack registers. This fixes 3
places in my recent fix for register variables in vm86 mode where I
messed up the PSL_VM check and cleans up other places.
mutexes or using any callouts when active.
Trying to lock a mutex when KDB is active or the scheduler is stopped
can result in infinite wait loops. The same goes for calling callout
related functions which in turn lock mutexes.
If the USB controller at which a USB keyboard is connected is idle
when KDB is entered, polling the USB keyboard via USB will always
succeed. Else polling may fail depending on which state the USB
subsystem and USB interrupt handler is in. This is unavoidable unless
KDB can wait for USB interrupt threads to complete before stalling the
CPU(s).
Tested by: Bruce Evans <bde@freebsd.org>
MFC after: 4 weeks
- ifnet.if_mtu does not require explicit setting.
- ifnet.if_hdrlen must be set after ether_ifattach().
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D7873
It is available on both stable/10 and stable/11. This eases future MFCs
to stable/10.
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D7872
This paves the way for the dynamic RSS key and indirect table setting.
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D7864
- Add few checks for group/pairwise ciphers into
ieee80211_parse_{wpa,rsn}().
- Split error code and cipher value in wpa_cipher() / rsn_cipher(); current
hack with (1 << 32) does not work - it's 1, not 0 (detected by CSA).
- Return IEEE80211_REASON_UNSUPP_RSN_IE_VERSION instead of
IEEE80211_REASON_IE_INVALID when version field is not equal to RSN_VERSION.
Tested with wpi(4) / urtwn(4) (HOSTAP mode).
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D7887
We have 6 opcode rewriters for table opcodes. When `set swap' command
invoked, it is called for each rewriter, so at the end we get the same
result, because opcode rewriter uses ETLV type to match opcode. And all
tables opcodes have the same ETLV type. To solve this problem, use
separate sets handler for one opcode rewriter. Use it to handle TEST_ALL,
SWAP_ALL and MOVE_ALL commands.
PR: 212630
MFC after: 1 week
As this is evaluation hardware with only a few users, and there is a lack
of information add a warning when booting on this hardware.
Reported by: cognet
Obtained from: ABT Systems Ltd
MFC after: Instant
Sponsored by: The FreeBSD Foundation
This paves the way for further attach/detach code reorganization.
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D7858
In case if there is already running interface, a second non-sta
interface will omit scanning, going directly to RUN state. Handle
this case for adhoc mode appropriately.
Tested with RTL8821AU, 2 vaps in IBSS mode.
In particular, reset the DF_QUIET flag when detaching from a device so
that a driver that marks a device quiet doesn't dictate policy for a
different driver that may claim the device in the future.
Reviewed by: rpokala, wblock
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D7803
- Certain pic_assign_cpu, e.g. msi_assign_cpu can have quite a long
call chain. For msi_assign_cpu, mutex makes complex PCI bridge
drivers more tricky, e.g. sleep can note be called, etc, it will
be pretty tricky for upcoming Hyper-V PCI bridge driver for PCI
pass-through.
- It is not used on any hot code path nor non-sleepable context, so
sx should have the same effect as mutex.
PIC list is still protected by mutex to keep suspend/resume work.
Discussed with: jhb
Reviewed by: jhb
MFC after: 3 weeks
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D7784
* Don't do RTS/CTS - experiments show that we get ACK frames for each of them
and this ends up causing the timestamps to look all funny.
* Set the HAL_TXDESC_POS bit, so the AR9300 HAL sets up the hardware to return
location and CSI information.
- evdev_set_methods call is not required if actual methods are no-ops
- evdev_set_serial is also optional if there is no meaningful input device
identifier
- evdev_set_id on the other hand is mandatory, so set virtual bus with
dummy vendor/product/version
Suggested by: Vladimir Kondratiev
Add generic evdev support to touchscreen part of ti_adc: two absolute
coordinates + button touch to indicate pen position. Pressure value
reporting is not implemented yet.
Tested on: Beaglebone Black + 4DCAPE-43T + tslib
evdev is a generic input event interface compatible with Linux
evdev API at ioctl level. It allows using unmodified (apart from
header name) input evdev drivers in Xorg, Wayland, Qt.
This commit has only generic kernel API. evdev support for individual
hardware drivers like ukbd, ums, atkbd, etc. will be committed later.
Project was started by Jakub Klama as part of GSoC 2014. Jakub's
evdev implementation was later used as a base, updated and finished
by Vladimir Kondratiev.
Submitted by: Vladimir Kondratiev <wulf@cicgroup.ru>
Reviewed by: adrian, hans
Differential Revision: https://reviews.freebsd.org/D6998
The flag specifies that the block which uses FPU must be executed in
critical section, i.e. take no context switches, and does not need an
FPU save area during the execution.
It is intended to be applied around fast and short code pathes where
save area allocation is impossible or undesirable, due to context or
due to the relative cost of calculation vs. allocation.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Rename registers as in the manual.
Do a hard reset of the controller before a soft one.
Since DMA is always used remove dependancy on allwinner_soc_family, it was used
to differentiate SoC as the fdt compatible string were the same.
Tested on A10, A20, H3 and A64.
Reviewed by: jmcneill
Differential Revision: https://reviews.freebsd.org/D6868
Move PMAP_TS_REFERENCED_MAX out of the various pmap implementations and
into vm/pmap.h, and describe what its purpose is. Eliminate the archaic
"XXX" comment about its value. I don't believe that its exact value, e.g.,
5 versus 6, matters.
Update the arm64 and riscv pmap implementations of pmap_ts_referenced()
to opportunistically update the page's dirty field.
On amd64, use the PDE value already cached in a local variable rather than
dereferencing a pointer again and again.
Reviewed by: kib, markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D7836
An array of bucket locks is added.
All modifications still require the global cache_lock to be held for
writing. However, most readers only need the relevant bucket lock and in
effect can run concurrently to the writer as long as they use a
different lock. See the added comment for more details.
This is an intermediate step towards removal of the global lock.
Reviewed by: kib
Tested by: pho
Our shim for Solaris random_get_bytes() uses read_random(), that looks
reasonable, since it guaranties reliably seeded random data. On the other
side Solaris random_get_pseudo_bytes() does not provide this guarantie,
and its original Solaris implementation is equivalent to our arc4rand(),
using software crypto without stressing slower hardware RNG.
If wait4() or wait6() return 0 because of WNOHANG, the status, rusage and
wrusage information should not be returned.
PR: 212048
Reported by: Casey Lucas
MFC after: 2 weeks
that the latter can easily determine what the trap type actually is
after callers are fixed to encode the type unambigously.
ddb currently barely understands breakpoints, and it treats all
non-breakpoints as single-step traps. This works OK for stopping
after every instruction when single-stepping, but is broken for
single-stepping with a count > 1 (especially with a large count).
ddb needs to stop on the first non-single-step trap while single-
stepping. Otherwise, ddb doesn't even stop the first time for
fatal traps and external breakpoints like the one in kdb_enter().
The lower vnode is already referenced and nodeget is supposed to consume
the reference. Thus the extra vref call was causing a leak.
Reported by: pho
Reviewed by: kib
MFC after: 1 week
Trying to build a MIPS platform that uses INTRNG needs this
for this to work right in gpiobusvar.h :
#ifdef INTRNG
struct intr_map_data_gpio {
struct intr_map_data hdr;
...
};
#endif
* change the HT_RC_2_MCS to do MCS0..23
* Use it when looking up the ht20/ht40 array for bits-per-symbol
* add a clk_to_psec (picoseconds) routine, so we can get sub-microsecond
accuracy for the math
* .. and make that + clk_to_usec public, so higher layer code that is
returning clocks (eg the ANI diag routines, some upcoming locationing
experiments) can be converted to microseconds.
Whilst here, add a comment in ar5416 so i or someone else can revisit the
latency values.
On big endian hardware that uses 1 byte bool a type mismatch of bool vs int will
cause the least signifcant byte of db_cmd_loop_done to be set, but the MSB to be
read, and read as 0. This causes ddb to stay in an infinite loop.
MFC after: 1 week
Split the QUEUE_MACRO_DEBUG into QUEUE_MACRO_DEBUG_TRACE and
QUEUE_MACRO_DEBUG_TRASH.
Add the debug macrso QMD_IS_TRASHED() and QMD_SLIST_CHECK_PREVPTR().
Document these in queue.3.
Reviewed by: emaste
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D3984
mounts in almost all cases instead of in most cases. Don't override
DOINGASYNC() by any condition except IO_SYNC.
Fix previous sprinking of DOINGASYNC() checks. Don't override IO_SYNC
by DOINGASYNC(). In ffs_write() and ffs_extwrite(), there were
intentional overrides that just broke O_SYNC of data. In
ffs_truncate(), there are 5 calls to ffs_update(), 4 with
apparently-unintentional overrides and 1 without; this had no effect
due to the main async mount hack descibed below.
Fix 1 place in ffs_truncate() where the caller's IO_ASYNC was overridden
for the soft updates case too (to do a delayed write instead of a sync
write). This is supposed to be the only change that affects anything
except async mounts.
In ffs_update(), remove the 19 year old efficiency hack of ignoring
the waitfor flag for async mounts, so that fsync() almost works for
async mounts. All callers are supposed to be fixed to not ask for a
sync update unless they are for fsync() or [I]O_SYNC operations.
fsync() now almost works for async mounts. It used to sync the data
but not the most important metdata (the inode). It still doesn't sync
associated directories.
This gave 10-20% fewer writes for my makeworld benchmark with async
mounted tmp and obj directories from an already small number.
Style fixes:
- in ffs_balloc.c, remove rotted quadruplicated comments about the
simplest part of the DOING*() decisions and rearrange the nearly-
quadruplicated code to be more nearly so.
- in ufs_vnops.c, use a consistent style with less negative logic and
no manual "optimization" of || to | in DOING*() expressions.
Reviewed by: kib (previous version)
This will be required for SMP support on MIPS Malta platform.
Reviewed by: adrian
Sponsored by: DARPA, AFRL
Sponsored by: HEIF5
Differential Revision: https://reviews.freebsd.org/D7835
In vm86.c, fix 2 (rarely used) cases where the return code lost the
single-step indicator. While here, fix 2 misspellings of PSL_T as
PSL_TF (TF is the CPU manufacturer's spelling, but we use T).
In trap.c, turn T_PROTFLT and T_STKFLT into T_TRCTRAP if
vm86_emulate() asked for this (it does this when the instruction is
being traced and was successully emulated). In the kernel case, we
used to deliver the trap as SIGTRAP to the process, where it always
terminated the process; now we deliver the trap as T_TRCTRAP to kdb,
where it normally gives single-stepping. In the user case, the only
difference is that we now clear PSL_T and initialize ucode properly.
Reviewed by: kib
Some statuses, such as "ATA pass through information available", are part
part of absolutely normal operation and do not worth reporting.
MFC after: 2 weeks
truncation failed.
Doing so resulted in inconsistent state of the ufs dirhash with regard
to the actual directory inode state, and could lead to spurious ENOENT
errors for lookups of existing files in production kernels, or
assertion failures in the debugging kernels.
Change the logic of calling ufsdirhash_dirtrunc() to be same as in
ufs_direnter(). Execute UFS_TRUNCATE() first, log error, and only do
dirtrunc() if UFS_TRUNCATE() succeeded.
Note that the problem was exacerbated by the bug in the
flush_newblk_dep() function (see r305599), which caused in the spurios
errors from ffs_sync() and then ffs_truncate().
In collaboration with: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
The buffer lock is retried on failed LK_SLEEPFAIL attempt, and error
from the failed attempt is irrelevant. But since there is path after
retry which does not clear error, it is possible to return spurious
error from the function.
The issue resulted in a spurious failure of softdep_sync_buf(),
causing further spurious failure of ffs_sync().
In collaboration with: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
This change is formally not needed, since i_endoff not used in all
code paths after the call to ufs_direnter(), and i_endoff is
recalculated by the next lookup. But having the value correct makes
the reasoning about code simpler.
Reported and tested by: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
cleared since nothing prevents completion of the parallel quotaoff.
There is nothing to sync in this case, and no reason to panic.
Reported and tested by: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
operations. Instead of upgrading, assert that the lock is exclusive.
Explain the cause in comments.
This effectively reverts r209367.
Tested by: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
going to re-read inodes.
Secondary write initiators, e.g. ufs_inactive(), might need to start a
write while owning the vnode lock. Since the suspended state
established by /dev/ufssuspend prevents them from entering
vn_start_secondary_write(), we get deadlock otherwise.
Note that it is arguably not very useful to re-read inodes after
/dev/ufssuspend suspension, because the suspension does not block
readers, and other threads might read existing files in parallel with
suspension owner (for now, only growfs(8)) operations. This
effectively means that suspension owner cannot safely modify existing
inodes, and then there is no sense in re-reading. But keep the code
enabled for now.
Reported and tested by: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
The GUID string provided by hypervisor has leading and trailing braces,
while our GUID string does not have braces at all. Both braces should
be ignored, when the GUID strings are compared.
Submitted by: Hongjiang Zhang <honzhan microsoft com>
Modified by: sephe
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D7809
While I'm here, pull up the channel callback related code too.
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D7805
The DS_FIELD_LARGE_BLOCKS macro has been unused since the integration of
this patch:
commit ca0cc3918a1789fa839194af2a9245f801a06b1a
Author: Matthew Ahrens <mahrens@delphix.com>
Date: Fri Jul 24 09:53:55 2015 -0700
5959 clean up per-dataset feature count code
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
This patch simply removes this macro from dsl_dataset.h.
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Author: Matthew Ahrens <mahrens@delphix.com>
When changing zfs_arc_max (e.g. as zdb does), it may be set to less
than the default arc_c_min. arc_c_min should decrease to not be more than
arc_c_max, but it doesn't; therefore tuning of arc_c_max is ineffective.
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Author: Matthew Ahrens <mahrens@delphix.com>
openzfs/openzfs@608764bead
The cxgbev/cxlv driver supports Virtual Function devices for Chelsio
T4 and T4 adapters. The VF devices share most of their code with the
existing PF4 driver (cxgbe/cxl) and as such the VF device driver
currently depends on the PF4 driver.
Similar to the cxgbe/cxl drivers, the VF driver includes a t4vf/t5vf
PCI device driver that attaches to the VF device. It then creates
child cxgbev/cxlv devices representing ports assigned to the VF.
By default, the PF driver assigns a single port to each VF.
t4vf_hw.c contains VF-specific routines from the shared code used to
fetch VF-specific parameters from the firmware.
t4_vf.c contains the VF-specific PCI device driver and includes its
own attach routine.
VF devices are required to use a different firmware request when
transmitting packets (which in turn requires a different CPL message
to encapsulate messages). This alternate firmware request does not
permit chaining multiple packets in a single message, so each packet
results in a firmware request. In addition, the different CPL message
requires more detailed information when enabling hardware checksums,
so parse_pkt() on VF devices must examine L2 and L3 headers for all
packets (not just TSO packets) for VF devices. Finally, L2 checksums
on non-UDP/non-TCP packets do not work reliably (the firmware trashes
the IPv4 fragment field), so IPv4 checksums for such packets are
calculated in software.
Most of the other changes in the non-VF-specific code are to expose
various variables and functions private to the PF driver so that they
can be used by the VF driver.
Note that a limited subset of cxgbetool functions are supported on VF
devices including register dumps, scheduler classes, and clearing of
statistics. In addition, TOE is not supported on VF devices, only for
the PF interfaces.
Reviewed by: np
MFC after: 2 months
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D7599
If a packet contains the Ethernet header (14 bytes) in the first mbuf
and the payload (IP + UDP + data) in the second mbuf, then the attempt
to fetch the l3hdr will return a NULL pointer. The first loop iteration
will drop len to zero and exit the loop without setting 'p'. However,
the desired data is at the start of the second mbuf, so the correct
behavior is to loop around and let the conditional set 'p' to m_data of
the next mbuf (and leave offset as 0).
Reviewed by: np
Sponsored by: Chelsio Communications
the data cache to the point of unification. This is the point where the
two caches are unified to a single unified cache so cleaning past here
is just extra unneeded work.
This was noticed when investigating r305545.
Reported by: bz
Obtained from: ABT Systems Ltd
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
page is non-executable the contents of the i-cache are unimportant so this
call is just adding unneeded overhead when inserting pages.
While doing research using gem5 with an O3 pipeline and 1k/32k/1M iTLB/L1
iCache/L2 Bjoern Zeeb (bz@) observed a fairly high rate of calls into
arm64_icache_sync_range() from pmap_enter() along with a high number of
instruction fetches and iTLB/iCache hits.
Limiting the calls to arm64_icache_sync_range() to only executable pages,
we observe the iTLB and iCache Hit going down by about 43%. These numbers
are quite misleading when looked at alone as at the same time instructions
retired were reduced by 19.2% and instruction fetches were reduced by 38.8%.
Overall this reduced the runtime of the test program by 22.4%.
On Juno hardware, in steady-state, running the same test, using the cycle
count to determine runtime, we do see a reduction of up to 28.9% in runtime.
While these numbers certainly depend on the program executed, we expect an
overall performance improvement.
Reported by: bz
Obtained from: ABT Systems Ltd
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Due to reading initialized variable, FIS receive area was always allocated
as 256 bytes, suitable for command-based switching, instead of 4096 bytes,
required for FIS-based switching. This caused memory corruption in case of
port multipliers used on FBS-capable HBAs (Marvell).
MFC after: 1 week
More changes to MIPS may be required, as commented in D7692, but this
revision aims to restore MIPS INTRNG functionality so we can move on
with working interrupts.
Reported by: yamori813@yahoo.co.jp
Tested by: mizhka (on BCM), sgalabov (on Mediatek)
Reviewed by: adrian, nwhitehorn (older version)
Sponsored by: Smartcom - Bulgaria AD
Differential Revision: https://reviews.freebsd.org/D7692
Let drivers for Alpine CCU, NB and Serdes take care of internal SoC configuration.
Obtained from: Semihalf
Submitted by: Michal Stanek <mst@semihalf.com>
Sponsored by: Annapurna Labs
Reviewed by: imp,wma
Differential Revision: https://reviews.freebsd.org/D7566
This commit adds drivers for Alpine Cache Coherency Unit
and North Bridge Service whose task is to configure
the system fabric and enable cache coherency.
Obtained from: Semihalf
Submitted by: Michal Stanek <mst@semihalf.com>
Sponsored by: Annapurna Labs
Reviewed by: wma
Differential Revision: https://reviews.freebsd.org/D7565
pmap_early_io_map()/pmap_early_io_unmap(), if used in pairs, should be used in
the form:
pmap_early_io_map()
..do stuff..
pmap_early_io_unmap()
Without other allocations in the middle. Without reclaiming memory this can
leave large holes in the device space.
While here, make a simple change to the unmap loop which now permits it to unmap
multiple TLB entries in the range.
Such errors can occur as the result of a write error or because the disk
backing the mirror element was removed. They result in a generation ID bump
on all active elements of the mirror, so we can safely disconnect the mirror
component rather than destroy it.
MFC after: 2 weeks
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D7750
These are useful for testing changes to I/O error handling, and for
reproducing existing bugs in a controlled manner. The fail points are
g_mirror_regular_request_read
g_mirror_regular_request_write
g_mirror_sync_request_read
g_mirror_sync_request_write
g_mirror_metadata_write
They all effectively allow one to inject an error value into the bio_error
field of a corresponding BIO request as it is being completed.
MFC after: 2 weeks
Sponsored by: EMC / Isilon Storage Division
have the serious problem of not actually attaching the hardware they
are driving at the bus level. This causes creator(4) and machfb(4)
to attach and drive the very same hardware in parallel when both
syscons(4) and vt(4) as well as their associated hardware drivers
are built into a kernel, i. e. GENERIC, at the same time.
Also, syscons(4) and its drivers still are way superior to vt(4) and
its equivalents; unlike the syscons(4) counterparts the vt(4) drivers
don't provide hardware acceleration resulting in considerably slower
screen drawing, creator_vt(4) doesn't provide a /dev/fb node as
required by the Xorg sunffb(4) etc. In theory, vt_ofwfb(4) should be
able to handle more devices than machfb(4). However, testing shows
that it hardly works with any hardware machfb(4) isn't also able to
drive, making vt(4) and vt_ofwfb(4) not favorable for the time being
from that perspective either.
MFC after: 3 days
Use C99 designators to set the value of each slot and the nitems macro to
check for valid entries. In the process, switch to indexing by signal
number rather than signal-1 for improved clarity.
Obtained from: CheriBSD (a6053c5abf03a5f53bbfcdd3a26429383f67e09f)
Sponsored by: DARPA, AFRL
Reviewed by: kib
The previous code was forcing an expensive walk in vop_stdvptocnp,
which was causing performance issues on highly contended zfs.
No objections: kib
MFC after: 2 weeks
Add routines to trigger a function level reset (FLR) of a PCI-express
device via the PCI-express device control register. This also includes
support routines to wait for pending transactions to complete as well
as calculating the maximum completion timeout permitted by a device.
Change the ppt(4) driver to reset pass through devices before attaching
to a VM during startup and before detaching from a VM during shutdown.
Reviewed by: imp, wblock (earlier version)
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D7751
This driver supports two bindings:
- cpufreq-dt: systems which share clock and voltage across all CPUs
- arm_big_little_dt: systems which share clock and voltage across all
CPUs in a single cluster
Reviewed by: andrew, imp
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D7741
When the I/O MMU is active in bhyve, all PCI devices need valid entries
in the DMAR context tables. The I/O MMU code does a single enumeration
of the available PCI devices during initialization to add all existing
devices to a domain representing the host. The ppt(4) driver then moves
pass through devices in and out of domains for virtual machines as needed.
However, when new PCI devices were added at runtime either via SR-IOV or
HotPlug, the I/O MMU tables were not updated.
This change adds a new set of EVENTHANDLERS that are invoked when PCI
devices are added and deleted. The I/O MMU driver in bhyve installs
handlers for these events which it uses to add and remove devices to
the "host" domain.
Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D7667
This allows a pass through device to be reset to a normal device driver
on the host and reused on the host. ppt devices are now always active in
some I/O MMU domain when the I/O MMU is active, either the host domain
or the domain of a VM they are attached to.
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D7666
This is required on my system, which loads nvidia, vmm, and zfs, and 48M is
no longer enough for that. nvidia-driver's recent update increased its size
by several megabytes.
Reviewed by: jhb
MFC after: 1 week
Other files including pci_host_generic.h failed to compile
due to missing declaration of enum pci_id_type.
Obtained from: Semihalf
Submitted by: Michal Stanek <mst@semihalf.com>
Sponsored by: Annapurna Labs
Reviewed by: wma
Differential Revision: https://reviews.freebsd.org/D7561
Split usbd_xfer_status() check:
- Check xfer length: must be longer, than Rx descriptor size.
- Check frame size: must be shorter than xfer length.
- Discard too short frames.
Tested with WUSB54GC, STA/MONITOR modes.
The upstream change introduced a new load state, SPA_LOAD_CREATE,
and vdev_geom code needs to be aware of it.
Tested by: cy
MFC after: 1 week
X-MFC with: r305331
This adds bhnd(4) bus-level support for querying backplane interrupt vector
routing, and delegating machine/bridge-specific interrupt handling to the
concrete bhnd(4) driver implementation.
On bhndb(4) bridged PCI devices, we provide the PCI/MSI interrupt directly
to attached cores.
On MIPS devices, we report a backplane interrupt count of 0, effectively
disabling the bus-level interrupt assignment. This allows mips/broadcom
to temporarily continue using hard-coded MIPS IRQs until bhnd_mips PIC
support is implemented.
Reviewed by: mizhka
Approved by: adrian (mentor, implicit)
Broadcom Intensi-fi chipsets provided a common set of IP cores; on PCI/PCIe
devices, the USB11 host controller is left floating.
Approved by: adrian (mentor, implicit)
This patch adds driver implementation for BHND USB core. Driver has been
imported from ZRouter project with small adaptions for FreeBSD 11.
Also it's enabled for BroadCom MIPS74k boards by default. It's fully tested
on Asus boards (RT-N16: external USB, RT-N53: USB bus between SoC and WiFi
chips).
Reviewed by: adrian (mentor), ray
Approved by: adrian (mentor)
Obtained from: ZRouter
Differential Revision: https://reviews.freebsd.org/D7781
by reviving the SX control request lock and refining which lock
protects the common scratch area in "struct usb_device".
The SX control request lock was removed by r246759 because it caused a
lock order reversal with the USB enumeration lock inside
usbd_transfer_setup() as a function of r246616. It was thought that
reducing the number of locks would resolve the LOR, but because some
USB device drivers use usbd_do_request_flags() inside callback
functions, like in taskqueues, a deadlock may occur when these are
drained from device_detach(). By restoring the SX control request
lock usbd_do_request_flags() is allowed to complete its execution
when a USB device driver is detaching. By using the SX control request
lock to protect the scratch area, the LOR introduced by r246616 is
also resolved.
Bump the FreeBSD version while at it to force recompilation of all USB
kernel modules.
Found by: avos@
MFC after: 1 week
like we've done elsewhere, e.g., amd64.
As an optimization to the machine-independent layer, change the machine-
dependent pmap_ts_referenced() so that it updates the page's dirty field
if a modified bit is found while counting reference bits. This
opportunistic update can be performed at low cost and can eliminate the
need for some future calls to pmap_is_modified() by the machine-
independent layer.
MFC after: 3 weeks
survived multiple world and kernel builds, and of poudriere building full
package sets.
I have observed a 3% reduction in buildworld times with superpages enabled,
however further testing is needed to see if this is observed in other
workloads.
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
warning:
sys/netinet/igmp.c:546:21: error: implicit conversion from 'int' to 'char' changes value from 148 to -108 [-Werror,-Wconstant-conversion]
p->ipopt_list[0] = IPOPT_RA; /* Router Alert Option */
~ ^~~~~~~~
sys/netinet/ip.h:153:19: note: expanded from macro 'IPOPT_RA'
#define IPOPT_RA 148 /* router alert */
^~~
This is because ipopt_list is an array of char, so IPOPT_RA is wrapped
to a negative value. It would be nice to change ipopt_list to an array
of u_char, but it changes the signature of the public struct ipoption,
so add an explicit cast to suppress the warning.
Reviewed by: imp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D7777
sys/dev/usb/serial/uplcom.c:543:29: error: implicit conversion from 'int' to 'int8_t' (aka 'signed char') changes value from 192 to -64 [-Werror,-Wconstant-conversion]
if (uplcom_pl2303_do(udev, UT_READ_VENDOR_DEVICE, UPLCOM_SET_REQUEST, 0x8484, 0, 1)
~~~~~~~~~~~~~~~~ ^~~~~~~~~~~~~~~~~~~~~
sys/dev/usb/usb.h:179:53: note: expanded from macro 'UT_READ_VENDOR_DEVICE'
#define UT_READ_VENDOR_DEVICE (UT_READ | UT_VENDOR | UT_DEVICE)
~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~
This is because UT_READ is 0x80, so the int8_t argument is wrapped to a
negative value. Fix this by using uint8_t instead.
Reviewed by: imp, hselasky
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D7776
Since negative entries are managed with a LRU list, a hit requires a
modificaton.
Currently the code tries to upgrade the global lock if needed and is
forced to retry the lookup if it fails.
Provide a dedicated lock for use when the cache is only shared-locked.
Reviewed by: kib
MFC after: 1 week
This fixes bhnd(4) nvram handling on devices that map SPROM CSRs via PCI
configuration space.
The probe method previously required that a bhnd(4) device be attached to the
parent bridge; now that the bhnd_nvram device is always attached first, this
unnecessary sanity check always failed.
Approved by: adrian (mentor, implicit)
On BCM4321 chipsets, both PCI and PCIe cores are included, with one of
the cores potentially left floating.
Since the PCI core appears first in the device table, and the PCI
profiles appear first in the resource configuration tables, this resulted in
incorrectly matching and using the PCI/v1 resource configuration on PCIe
devices, rather than the correct PCIe/v1 profile.
Approved by: adrian (mentor, implicit)
Adds support for probing and initializing bhndb(4) bridge state using
the bhnd_erom API, ensuring that full bridge configuration is available
*prior* to actually attaching and enumerating the bhnd(4) child device,
allowing us to safely allocate bus-level agent/device resources during
bhnd(4) bus enumeration.
- Add a bhnd_erom_probe() method usable by bhndb(4). This is an analogue
to the existing bhnd_erom_probe_static() method, and allows the bhndb
bridge to discover the best available erom parser class prior to newbus
probing of its children.
- Add support for supplying identification hints when probing erom
devices. This is required on early EXTIF-only chipsets, where chip
identification registers are not available.
- Migrate bhndb over to the new bhnd_erom API, using bhnd_core_info
records rather than bridged bhnd(4) device_t references to determine
the bridged chipsets' capability/bridge configuration.
- The bhndb parent (e.g. if_bwn) is now required to supply a hardware
priority table to the bridge. The default table is currently sufficient
for our supported devices.
- Drop the two-pass attach approach we used for compatibility with bhndb(4) in
the bhnd(4) bus drivers, and instead perform bus enumeration immediately,
and allocate bridged per-child bus-level resources during that enumeration.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D7768
The pager getpages interface allows the caller to bound the number of
readahead and readbehind pages, and vm_fault_hold() makes use of this
feature. These bounds were ignored after r305056, causing the swap pager
to potentially page in more than the specified number of pages.
Reported and reviewed by: alc
X-MFC with: r305056
This defines a new bhnd_erom_if API, providing a common interface to device
enumeration on siba(4) and bcma(4) devices, for use both in the bhndb bridge
and SoC early boot contexts, and migrates mips/broadcom over to the new API.
This also replaces the previous adhoc device enumeration support implemented
for mips/broadcom.
Migration of bhndb to the new API will be implemented in a follow-up commit.
- Defined new bhnd_erom_if interface for bhnd(4) device enumeration, along
with bcma(4) and siba(4)-specific implementations.
- Fixed a minor bug in bhndb that logged an error when we attempted to map the
full siba(4) bus space (18000000-17FFFFFF) in the siba EROM parser.
- Reverted use of the resource's start address as the ChipCommon enum_addr in
bhnd_read_chipid(). When called from bhndb, this address is found within the
host address space, resulting in an invalid bridged enum_addr.
- Added support for falling back on standard bus_activate_resource() in
bhnd_bus_generic_activate_resource(), enabling allocation of the bhnd_erom's
bhnd_resource directly from a nexus-attached bhnd(4) device.
- Removed BHND_BUS_GET_CORE_TABLE(); it has been replaced by the erom API.
- Added support for statically initializing bhnd_erom instances, for use prior
to malloc availability. The statically allocated buffer size is verified both
at runtime, and via a compile-time assertion (see BHND_EROM_STATIC_BYTES).
- bhnd_erom classes are registered within a module via a linker set, allowing
mips/broadcom to probe available EROM parser instances without creating a
strong reference to bcma/siba-specific symbols.
- Migrated mips/broadcom to bhnd_erom_if, replacing the previous MIPS-specific
device enumeration implementation.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D7748
Idle page zeroing has been disabled by default on all architectures since
r170816 and has some bugs that make it seemingly unusable. Specifically,
the idle-priority pagezero thread exacerbates contention for the free page
lock, and yields the CPU without releasing it in non-preemptive kernels. The
pagezero thread also does not behave correctly when superpage reservations
are enabled: its target is a function of v_free_count, which includes
reserved-but-free pages, but it is only able to zero pages belonging to the
physical memory allocator.
Reviewed by: alc, imp, kib
Differential Revision: https://reviews.freebsd.org/D7714