cause stack overflow for IO's which return ZIO_PIPELINE_CONTINUE
from the zio_vdev_io_start stage and hence don't suspend and complete
in a different thread.
This prevents double fault panic on slow machines running ZFS on
GELI volumes which return EOPNOTSUPP directly to BIO_DELETE requests.
MFC after: 1 month
X-MFC-With: r265152
sys/systm.h must always come after sys/param.h.
Remove sys/types.h which should never be included together with sys/param.h.
Add sys/malloc.h for correctness even if it seems to don't be needed.
Remove more unused headers found by unusedinc (from bde@) and tested with a
universe build.
Reported by: bde
It looks like current consumers are either unaware
of MRT (and uses RT_DEFAULT_FIB implicitly) or
know what thay are doing, In latter case they
will be either hit by KASSERT or ESCRH will be returned
due to NULL rnh.
disk. That has a side effect of corrupting the "." entries names on
rename, since the call to createde() in the msdosfs_rename() sets the
de_Name to the target name. If any change to the directory attributes
is performed, the wrong name is written back to the on-disk direntry
on update.
Overwrite the de_Name for the directories on rename to correct the dot
name.
Submitted by: bde
MFC after: 1 week
This allows to run 32bit applications on a 64bit host. This was tested
successfully with Wine (emulators/i386-wine-devel) and StarCraft II.
Submitted by: Jan Kokemüller <jan.kokemueller@gmail.com>
MFC after: 1 week
should either accept owner and owner_group strings that are just
the digits of the uid/gid or return NFS4ERR_BADOWNER.
This patch adds a sysctl vfs.nfsd.enable_stringtouid, which can
be set to enable the server w.r.t. accepting numeric string. It
also ensures that NFS4ERR_BADOWNER is returned if numeric uid/gid
strings are not enabled. This fixes the server for recent Linux
nfs4 clients that use numeric uid/gid strings by default.
Reported and tested by: craigyk@gmail.com
MFC after: 2 weeks
This is derived from the mps(4) driver, but it supports only the 12Gb
IT and IR hardware including the SAS 3004, SAS 3008 and SAS 3108.
Some notes about this driver:
o The 12Gb hardware can do "FastPath" I/O, and that capability is included in
this driver.
o WarpDrive functionality has been removed, since it isn't supported in
the 12Gb driver interface.
o The Scatter/Gather list handling code is significantly different between
the 6Gb and 12Gb hardware. The 12Gb boards support IEEE Scatter/Gather
lists.
Thanks to LSI for developing and testing this driver for FreeBSD.
share/man/man4/mpr.4:
mpr(4) man page.
sys/dev/mpr/*:
mpr(4) driver files.
sys/modules/Makefile,
sys/modules/mpr/Makefile:
Add a module Makefile for the mpr(4) driver.
sys/conf/files:
Add the mpr(4) driver.
sys/amd64/conf/GENERIC,
sys/i386/conf/GENERIC,
sys/mips/conf/OCTEON1,
sys/sparc64/conf/GENERIC:
Add the mpr(4) driver to all config files that currently
have the mps(4) driver.
sys/ia64/conf/GENERIC:
Add the mps(4) and mpr(4) drivers to the ia64 GENERIC
config file.
sys/i386/conf/XEN:
Exclude the mpr module from building here.
Submitted by: Steve McConnell <Stephen.McConnell@lsi.com>
MFC after: 3 days
Tested by: Chris Reeves <chrisr@spectralogic.com>
Sponsored by: LSI, Spectra Logic
Relnotes: LSI 12Gb SAS driver mpr(4) added
The thread that is destroying the lagg has already set sc->sc_psc=NULL when
the "ifconfig -am" thread gets to lacp_req(). It tries to dereference
sc->sc_psc and panics. The solution is for lacp_req() to check the value of
sc->sc_psc. If NULL, harmlessly return an lacp_opreq structure full of
zeros. Full details in GNATS.
PR: kern/189003
Reviewed by: timeout on freebsd-net@
MFC after: 3 weeks
Sponsored by: Spectra Logic Corporation
for lockmgr and sx interlocks, but unused since optimised versions of
those sleep locks were introduced. This will save a (quite) small
amount of memory in all kernel configurations. The sleep mutex pool is
retained as it is used for 'struct bio' and several other consumers.
Discussed with: jhb
MFC after: 3 days
lindev(4) was only used to provide /dev/full which is now a standard feature of
FreeBSD. /dev/full was never linux-specific and provides a generally useful
feature.
Document this in UPDATING and bump __FreeBSD_version. This will be documented
in the PH shortly.
Reported by: jkim
lindev(4) was only used to provide /dev/full which is now a standard feature of
FreeBSD. /dev/full was never linux-specific and provides a generally useful
feature.
Document this in UPDATING and bump __FreeBSD_version. This will be documented
in the PH shortly.
Reported by: jkim
Adjust the exynos and zedboard dts files to use max-frequency (the
documented standard property) instead of clock-frequency.
Submitted by: Thomas Skibo <ThomasSkibo@sbcglobal.net>
It can fail if pipe map is exhausted (as a result of too many pipes created),
but it is not fatal and could be provoked by unprivileged users. The only
consequence is worse performance with given pipe.
Reported by: ivoras
Suggested by: kib
MFC after: 1 week
The hardware can generate its own frames (eg RTS/CTS exchanges, other
kinds of 802.11 management stuff, especially when it comes to 802.11n)
and these also have PWRMGT flags. So if the VAP is asleep but the
NIC is in force-awake for some reason, ensure that the self-generated
frames have PWRMGT set to 1.
Now, this (like basically everything to do with powersave) is still
racy - the only way to guarantee that it's all actually consistent
is to pause transmit and let it finish before transitioning the VAP
to sleep, but this at least gets the basic method of tracking and
updating the state debugged.
Tested:
* AR5416, STA mode
* AR9380, STA mode
to sleep permanently by executing a HLT with interrupts disabled.
When this condition is detected the guest with be suspended with a reason of
VM_SUSPEND_HALT and the bhyve(8) process will exit.
Tested by executing "halt" inside a RHEL7-beta guest.
Discussed with: grehan@
Reviewed by: jhb@, tychon@
If time_t is 64-bit (i.e. isn't 32-bit) allow any value of year, not
just years less than 2038.
Don't bother fixing the underflow in the case of years before 1903.
MFC after: 1 week
Sponsored by: DARPA, AFRL
and geom_uncompress(4). Now, they produce an almost clean diff(1) output.
Remove a duplicated variable from g_uncompress.c and an unnecessary header
from g_uzip.c.
No functional changes.
This allows existing loader.conf files that set "console=comconsole"
to work without failing. No functional difference otherwise.
Reported by: Michael Dexter, pfSense install.
Reviewed by: neel
MFC after: 3 weeks
Nobody yet reported disk supporting I/Os less then our MAXPHYS value, but
since we any way have code to read Block Limits VPD page, that is easy.
MFC after: 2 weeks
#NO_UNIVERSE. Many of these config files are important examples, but
add little to no regresive value to the intended purpose of
UNIVERSE. We now build over 120 kernels during universe. There's
really little to no value to this over building say 60 or even 30 of
them (either is still a way too big number). This is especially true
for kernels that are nothing more than including a common base and
adding a static DTB file. Start by pruning 1/3 of the arm kernels that
add little regresion value.
kernel config file. If you also want to have a static DTB compiled
into your kernel, however, it cannot be a list. We have no mechanism
in the kernel for picking one, so that doesn't make sense and will
result in a compile-time error.
The changes how TRIM requests are generated to use ZIO_TYPE_FREE + a priority
instead of ZIO_TYPE_IOCTL, until processed by vdev_geom; only then is it
translated the required geom values. This reduces the amount of changes
required for FREE requests to be supported by the new IO scheduler. This
also eliminates the need for a specific DKIOCTRIM.
Also fixed FREE vdev child IO's from running ZIO_STAGE_VDEV_IO_DONE as part
of their schedule.
As the new IO scheduler can result in a request to execute one type of IO to
actually run a different type of IO it requires that zio_trim requests are
processed without holding the trim map lock (tm->tm_lock), as the free request
execute call may result in write request running hence triggering a
trim_map_write_start call, which takes the trim map lock and hence would result
in recused on no-recursive sx lock.
This is based off avg's original work, so credit to him.
MFC after: 1 month
Instead of rereading VPD pages on every device open, do it only on initial
device probe, and in cases when device reported via UNIT ATTENTIONs that
something has changed. Capacity is still rereaded on every open because
it is more critical for operation and more probable to change in run time.
On my tests with Intel 530 SSDs on mps(4) HBA this change reduces time
GEOM needs to retaste the device (that includes few open/close cycles)
from ~150ms to ~30ms.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
fixes and beacon programming / debugging into the ath(4) driver.
The basic power save tracking:
* Add some new code to track the current desired powersave state; and
* Add some reference count tracking so we know when the NIC is awake; then
* Add code in all the points where we're about to touch the hardware and
push it to force-wake.
Then, how things are moved into power save:
* Only move into network-sleep during a RUN->SLEEP transition;
* Force wake the hardware up everywhere that we're about to touch
the hardware.
The net80211 stack takes care of doing RUN<->SLEEP<->(other) state
transitions so we don't have to do it in the driver.
Next, when to wake things up:
* In short - everywhere we touch the hardware.
* The hardware will take care of staying awake if things are queued
in the transmit queue(s); it'll then transit down to sleep if
there's nothing left. This way we don't have to track the
software / hardware transmit queue(s) and keep the hardware
awake for those.
Then, some transmit path fixes that aren't related but useful:
* Force EAPOL frames to go out at the lowest rate. This improves
reliability during the encryption handshake after 802.11
negotiation.
Next, some reset path fixes!
* Fix the overlap between reset and transmit pause so we don't
transmit frames during a reset.
* Some noisy environments will end up taking a lot longer to reset
than normal, so extend the reset period and drop the raise the
reset interval to be more realistic and give the hardware some
time to finish calibration.
* Skip calibration during the reset path. Tsk!
Then, beacon fixes in station mode!
* Add a _lot_ more debugging in the station beacon reset path.
This is all quite fluid right now.
* Modify the STA beacon programming code to try and take
the TU gap between desired TSF and the target TU into
account. (Lifted from QCA.)
Tested:
* AR5210
* AR5211
* AR5212
* AR5413
* AR5416
* AR9280
* AR9285
TODO:
* More AP, IBSS, mesh, TDMA testing
* Thorough AR9380 and later testing!
* AR9160 and AR9287 testing
Obtained from: QCA
Some code will appear soon that is actually setting the chip powerstate
separate from the self-generated frames power state.
* Allow the AR5416 family chips to actually have the power state changed
from the self generated state change.
Tested (STA mode):
* AR5210
* AR5211
* AR5412
* AR5413
* AR5416
* AR9285
All rtsock-initiated rte creation/modification are now
performed in route.c holding radix tree write lock.
This reduces the need for per-rte mutex.
Sponsored by: Yandex LLC
MFC after: 1 month
the 'HLT' instruction. This condition was detected by 'vm_handle_hlt()' and
converted into the SPINDOWN_CPU exitcode . The bhyve(8) process would exit
the vcpu thread in response to a SPINDOWN_CPU and when the last vcpu was
spun down it would reset the virtual machine via vm_suspend(VM_SUSPEND_RESET).
This functionality was broken in r263780 in a way that made it impossible
to kill the bhyve(8) process because it would loop forever in
vm_handle_suspend().
Unbreak this by removing the code to spindown vcpus. Thus a 'halt' from
a Linux guest will appear to be hung but this is consistent with the
behavior on bare metal. The guest can be rebooted by using the bhyvectl
options '--force-reset' or '--force-poweroff'.
Reviewed by: grehan@
into the area backed by vm_page_array wrongly compared end with
vm_page_array_size. It should be adjusted by first_page index to be
correct.
Also, the corner and incorrect case of the requested range extending
after the end of the vm_page_array was incorrectly handled by
allocating the segment.
Fix the comparision for the end of range and return EINVAL if the end
extends beyond vm_page_array.
Discussed with: royger
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
pmap pvlist locks which are scaled by MAXCPU.
This allows an amd64 system to boot with MAXCPU set
to 256, which is currently FreeBSD's hard limit without
x2apic support.
Compile-tested for other arch's.
PR: 185831
Discussed with: jhb
MFC after: 3 weeks
exists on another interface. The panic was introduced by change 264887, which
changed the fibnum parameter in the call to rtalloc1_fib() in
ifa_switch_loopback_route() from RT_DEFAULT_FIB to RT_ALL_FIBS. The solution
is to use the interface fib in that call. For the majority of users, that will
be equivalent to the legacy behavior.
PR: kern/189089
Reported by: neel
Reviewed by: neel
MFC after: 3 weeks
X-MFC with: 264887
Sponsored by: Spectra Logic
platforms which do not use loaders or kernels that want to hardcode
options or for FDT passed in by loader.
Also fix a build issue by putting the kmdp variable accessed back under
the #ifdef FDT; we may wish to revisit decision in which case more
code needs changing.
Submitted by: brooks
by adding an argument to the VM_SUSPEND ioctl that specifies how the virtual
machine should be suspended, viz. VM_SUSPEND_RESET or VM_SUSPEND_POWEROFF.
The disposition of VM_SUSPEND is also made available to the exit handler
via the 'u.suspended' member of 'struct vm_exit'.
This capability is exposed via the '--force-reset' and '--force-poweroff'
arguments to /usr/sbin/bhyvectl.
Discussed with: grehan@
old devd's and thus hosts that get IP addresses from DHCP was too much
of a POLA violation.
The sysctl may be removed again after r263758 has been merged to at
least stable/9 and stable/10, and releases have been cut from those
branches.
Discussed with: mjg
Reported by: theraven, rwatson
It exposes I/O resources to user space, so that programs can peek
and poke at the hardware. It does not itself have knowledge about
the hardware device it attaches to.
Sponsored by: Juniper Networks, Inc.
Instead opening/closing provider by each of metadata classes, do it only
once in core code. Since for SCSI disks open/close means sending some
SCSI commands to the device, this change reduces taste time.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
FD_SETSIZE is often used as an argument to select or compared with an
integer file descriptor. Rather than force 3rd party software to add
explicit casts, just make it a plain (int) constant as on other
operating systems.
Previous discussion:
http://lists.freebsd.org/pipermail/freebsd-standards/2012-July/002410.html
returns ZIO_PIPELINE_CONTINUE from vdev_op_io_start to zio_execute resulting
in the wrong ZIO continuing its pipeline.
This is a serious issue which could cause data loss / corruption but appears
to be limited to error handling such as when vdev_readable(vd) returns false.
MFC after: 2 days
the MYBEACON RX filter (only receive beacons which match the BSSID)
or all beacons on the current channel.
* Add the relevant RX filter entry for MYBEACON.
Tested:
* AR5416, STA
* AR9285, STA
TODO:
* once the code is in -HEAD, just make sure that the code which uses it
correctly sets BEACON for pre-AR5416 chips.
Obtained from: QCA, Linux ath9k
the QCA HAL.
This fires off an interrupt if the TSF from the AP / IBSS peer is
wildly out of range. I'll add some code to the ath(4) driver soon
which makes use of this.
TODO:
* verify this didn't break TDMA!
to the hardware.
The QCA HAL has a comment noting that if this isn't done, modifications
to AR_IMR_S2 before AR_IMR is flushed may produce spurious interrupts.
Obtained from: QCA
Flushing the caches is required before doing a panic dump, but ARM
doesn't provide a flavor of flush that gets broadcast to other cores.
However, all cores except one are stopped before doing a dump, so this
works around the lack of a global flush/invalidate by doing it locally
on each CPU as part of stopping.
Discussed with: cognet@
This was added ca. 2004 for the purpose of ensuring the caches were in the
right state after the debugger set a breakpoint. kdb_cpu_sync_icache()
was added in 2007 to handle that situation, and now the wbinv_all is
actually harmful because the operation isn't broadcast to other cores.
* memory is now allocated as early as possible, without holding locks.
* sysctl users are now guaranteed to get a response (M_WAITOK buffer prealloc).
* socket users are more likely to use on-stack buffer for replies.
* standard kernel malloc/free functions are now used instead of radix wrappers.
rt_msg2() has been renamed to rtsock_msg_buffer().
MFC after: 1 month