Commit Graph

246194 Commits

Author SHA1 Message Date
Doug Moore
2767c9f36a Where 'current' is used to index over vm_map entries, use
'entry'. Where 'entry' is used to identify the starting point for
iteration, use 'first_entry'. These are the naming conventions used in
most of the vm_map.c code.  Where VM_MAP_ENTRY_FOREACH can be used, do
so. Squeeze a few lines to fit in 80 columns.  Where lines are being
modified for these reasons, look to remove style(9) violations.

Reviewed by: alc, markj
Differential Revision: https://reviews.freebsd.org/D22458
2019-11-25 02:19:47 +00:00
Alexander Motin
62ba8e8469 Report XLAT0 register for completeness. 2019-11-25 01:00:51 +00:00
Bjoern A. Zeeb
3232273f42 Allow kernel to compile without BPF.
r297816 added some bpf magic for VIMAGE unconditionally which no longer
allows kernels to compile without bpf (but with other networking).
Add the missing ifdef checks and allow a kernel to compile without bpf
again.

PR:		242136
Reported by:	dave mischler.com
MFC after:	2 weeks
2019-11-24 23:21:47 +00:00
Ian Lepore
b39d851dcb When doing ARM stack unwinding as part of stack_save(9), do not search
loaded modules (pass 0/false for the can_lock arg).  Searching the unwind
info in modules acquires an exclusive sxlock, and the stack(9) functions can
be called in a context where unbounded sleeps are forbidden (such as from
the witness checkorder code).

Just ignoring the existence of modules in stack_save() is not ideal, so I'm
looking for a better solution, but this commit will make it possible to boot
an ARM kernel with WITNESS enabled again, until I get something better.

PR:		242200
2019-11-24 21:08:56 +00:00
Vladimir Kondratyev
71b8e362c5 Linux epoll: Allow passing of any negative timeout value to epoll_wait
Linux epoll allow passing of any negative timeout value to epoll_wait()
to cause unbound blocking

Reviewed by:	emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22517
2019-11-24 20:51:09 +00:00
Vladimir Kondratyev
335fe0afb8 Linux epoll: Register events with zero event mask
Such an events are legal and should be interpreted as EPOLLERR | EPOLLHUP.
Register a disabled kqueue event in that case as we do not support EPOLLHUP yet.

Required by Linux Steam client.

PR:		240590
Reported by:	Alex S <iwtcex@gmail.com>
Reviewed by:	emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22516
2019-11-24 20:47:40 +00:00
Vladimir Kondratyev
461120b834 Linux epoll: Check both read and write kqueue events existence in EPOLL_CTL_ADD
Linux epoll EPOLL_CTL_ADD op handler should always check registration
of both EVFILT_READ and EVFILT_WRITE kevents to deceide if supplied
file descriptor fd is already registered with epoll instance.

Reviewed by:	emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22515
2019-11-24 20:44:14 +00:00
Vladimir Kondratyev
896a4c279d Linux epoll: Don't deregister file descriptor after EPOLLONESHOT is fired
Linux epoll does not remove descriptor after one-shot event has been triggered.
Set EV_DISPATCH kqueue flag rather then EV_ONESHOT to get the same behavior.

Required by Linux Steam client.

PR:		240590
Reported by:	Alex S <iwtcex@gmail.com>
Reviewed by:	emaste, imp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22513
2019-11-24 20:41:47 +00:00
Konstantin Belousov
3236244936 Ignore object->handle for OBJ_ANON objects.
Note that the change in vm_object_collapse() is arguably a correctness
fix.  We must not collapse into content-identity carrying objects.

Reviewed by:	jeff
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D22467
2019-11-24 19:18:12 +00:00
Eitan Adler
2e3fa849dd bsd-family-tree: correct macOS release date
Reported by: Herbert J. Skuhra <herbert@gojira.at>
Reported by: Maxim Konovalov <maxim.konovalov@gmail.com>
2019-11-24 19:16:57 +00:00
Konstantin Belousov
b631c36f0d Record part of the owner struct thread pointer into busy_lock.
Record as much bits from curthread into busy_lock as fits.  Low bits
for struct thread * representation are zero due to struct and zone
alignment, and they leave space for busy flags (perhaps except
statically allocated thread0).  Upper bits are not very interesting
for assert, and in most practical situations recorded value should
allow to manually identify the owner with certainity.

Assert that unbusy is performed by the owner, except few places where
unbusy is done in io completion handler.  For this case, add
_unchecked variants of asserts and unbusy primitives.

Reviewed by:	markj (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D22298
2019-11-24 19:12:23 +00:00
Konstantin Belousov
dbe257d253 tmpfs: resolve deadlock between rename and unmount.
Top-level kern_renameat() increases the writecount on the mount point,
which, together with tmpfs unmount suspending the mount, already
ensures that unmount cannot proceed while rename unlocks and relocks
all operated vnodes.

Remove vfs_busy() call from tmpfs_rename() which was done while
holding a vnode lock, creating the deadlock.  The only intent of the
busy operation seems to be the prevention of unmount, which is already
ensured.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-11-24 19:06:38 +00:00
Konstantin Belousov
13189065cb amd64: assert that EARLY_COUNTER does not corrupt memory.
Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D22514
2019-11-24 19:02:13 +00:00
Navdeep Parhar
515a40d5d9 cxgbe(4): sysctl to reset the temperature/voltage sensor.
# sysctl dev.<nexus>.<inst>.reset_sensor=1
# sysctl dev.t6nex.0.reset_sensor=1

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-11-24 16:40:54 +00:00
Warner Losh
dfdbb32093 Don't need giant for these drivers dev nodes.
Also, Giant isn't required to busy / unbusy a device, so drop that too while I'm
here. It's not done elsewhere in the tree and in the future will likely be
handled by a node lock to ensure consistency. Leave Giant in place for attach
and removing childing, as that's actually still needed, even if imperfect.

Remove stale comment about contigmalloc taking Giant and calling w/o the lock
held. Neither of these is still true.
2019-11-24 15:37:19 +00:00
Warner Losh
96b506a57c Hoist locking giant back up into the ioctl handler
Move the locking back into the ioctl handler. This "fixes" the race where we hve
a hot plug event just after the dropping of Giant in pci_find_dbsf, assuming the
driver doesn't then call anything that drops and picks up Giant again... It's a
little safer since don't think it doesn't, but we lack the tools to know for
sure.
2019-11-24 15:37:14 +00:00
Warner Losh
57aa9163fd Fix leak in state machine for commands.
When we get a device departed message from the firmware, we send a TARGET_REST
to the device to let the firmware know we're done and as part of the recovery
process. This will abort all the commands. While the documentation says the IOC
is responsible for writing the completion message for all the commands pending
with an aborted status, we sometimes have queued commands for the target that
haven't been completed so are in the INQUEUE state. So, when we later complete
the pending CCB as aborted, these commands are freed and we hit the "state not
busy" panic.

Elsewhere where we dequeue commands, we move the state to BUSY from INQUEUE. Do
that here as well. In talking to Ken, Scott and Justin, they recommended a
series of tests to see if this is 100% safe. Those tests are ongoing, but
preliminary tests suggest this is safe as we see no duplicate completions when
we hit this case at work. We have a machine that has a dodgy powersupply which
usually doesn't apply power to a few drives, but sometimes does when the machine
is under heavy load so we get a rash of the connect / disconnect messages over
half an hour. Without this change, we'd see state not busy panic. With this
change, the drives just annoyingly come and go without affecting the rest of the
machine, but without a complete error injection test suite, it's hard to know if
all edge cases are now covered or not.

Discussed with: scottl, ken, gibbs
2019-11-24 15:24:05 +00:00
Li-Wen Hsu
b9e5cb0580 Fix gcc build
We have -Werror=strict-overflow so gcc complains:

In file included from /tmp/obj/workspace/src/amd64.amd64/tmp/usr/include/bitstring.h:36:0,
                 from /workspace/src/tests/sys/sys/bitstring_test.c:34:
/workspace/src/tests/sys/sys/bitstring_test.c: In function 'bit_ffc_at_test':
/workspace/src/sys/sys/bitstring.h:239:5: error: assuming signed overflow does not occur when assuming that (X + c) >= X is always true [-Werror=strict-overflow]
  if (_start >= _nbits) {
     ^

Disable assuming overflow of signed integer will never happen by specifying
-fno-strict-overflow

Sponsored by:	The FreeBSD Foundation
2019-11-24 15:03:35 +00:00
Kristof Provost
492f3a312a pf: Add endline to all DPFPRINTF()
DPFPRINTF() doesn't automatically add an endline, so be consistent and
always add it.
2019-11-24 13:53:36 +00:00
Eitan Adler
71fb754d11 bsd-family-tree: add several new entries
Reviewed by:	imp, scottl
Differential Revision:	https://reviews.freebsd.org/D22529
2019-11-24 07:52:35 +00:00
Brandon Bergren
e58d379587 [PowerPC] Fix stack padding issue on ppc32.
Four bytes of padding are needed in the regular powerpc case to bring the
stack frame size up to a multiple of 16 bytes to meet ABI requirements.

Fixes odd hangs I was encountering during testing.
2019-11-24 06:43:03 +00:00
Navdeep Parhar
e56d731b7d cxgbe(4): Update the firmware interface header.
This allows the driver to be updated for the next firmware without
waiting for it to be released.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2019-11-24 05:37:28 +00:00
Justin Hibbits
7511645efa rtld/powerpc: Fix _rtld_bind_start for powerpcspe
Summary:
We need to save off the full 64-bit register, not just the low 32 bits,
of all registers getting saved off in _rtld_bind_start.  Additionally,
we need to save off the other SPE registers (SPEFSCR and accumulator),
so that their program state is not affected by the PLT resolver.

Reviewed by:	bdragon
Differential Revision:	https://reviews.freebsd.org/D22520
2019-11-24 04:35:29 +00:00
Warner Losh
a921c2003f Add a warning about Giant Locked devices
Add a warning when a device registers with devfs and requests
D_NEEDGIANT. The warning says the device will go away before
13.0. This is needed to flush out the devices in the tree that are
still Giant locked. This warning, or some variant of it, should have
gone into the tree a long time ago...

The intention is to require all devices be converted to not use
automatic giant in this way, or remove any such devices that remain
that we don't have the hardware to test a conversion of.

kbd so far is the only device that can't leave the tree, yet needs
something sensible done to avoid the auto giant lock (even if it is
just doing the wrapping itself). There may be others added to this
list... Any discussions of this topic will take place on arch@.
2019-11-23 23:57:26 +00:00
Warner Losh
283a5a3796 We don't even need Giant here. It isn't protecting anything internal
to geom, and nothing we call requires it to be held. It's left over
from a time when the latter wasn't the case. Retire it.

Reviewed in concept: scottl@
2019-11-23 23:44:00 +00:00
Warner Losh
dd615d09c4 Push Giant down one layer
The /dev/pci device doesn't need GIANT, per se. However, one routine
that it calls, pci_find_dbsf implicitly does. It walks a list that can
change when PCI scans a new bus. With hotplug, this means we could
have a race with that scanning. To prevent that, take out Giant around
scanning the list.

However, given that we have places in the tree that drop giant, if
held when we call into them, the whole use of Giant to protect newbus
may be less effective that we desire, so add a comment about why we're
talking it out, and we'll address the issue when we lock newbus with
something other than Giant.
2019-11-23 23:43:52 +00:00
Brandon Bergren
0ee420b608 [PowerPC] Fix typo in _ctx_start on ppc32
Theoretically, this was breaking the size calculation for the symbol.

Noticed when doing a readthrough.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D22525
2019-11-23 23:41:21 +00:00
Brandon Bergren
a638bf2a76 [PowerPC] Use QEMU-compatible version of SPE accumulator save
Switch from "evaddumiaaw 0,0" to "evmwumiaa 0,0,0" when persisting the
accumulator. This has the benefit of actually being implemented in QEMU
as it is the form Linux uses for the same task.

Both instructions are functionally equivilent, as we are using them for
their side effect of copying the accumulator to GPRs rather than for the
actual math operation that they are performing.

Reviewed by: jhibbits
2019-11-23 21:18:55 +00:00
Dimitry Andric
1bb8eb56ef libclang_rt: enable on powerpc*
Summary:
Enable on powerpc64 and in lib/libclang_rt/Makefile change
MACHINE_CPUARCH to MACHINE_ARCH because on powerpc64
MACHINE_ARCH==MACHINE_CPUARCH so the 32-bit library overwrites 64-bit
library during installworld.

This patch doesn't enable any other libclang_rt libraries because they
need to be separately ported.

I have verified that games/julius (which fails on powerpc64 elfv2
without this change because of no libclang_rt profiling library) builds.

Test Plan: Ship it, test on powerpc and powerpcspe

Submitted by:	pkubaj
Reviewed by:	dim, jhibbits
Differential Revision: https://reviews.freebsd.org/D22425
MFC after:	1 month
X-MFC-With:	r353358
2019-11-23 19:35:09 +00:00
Doug Moore
be252a414f The error messages that indicate bugs in 'area' bitstring functions
should identify accurately which function exhibited the bug.

Reviewed by: asomers
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22519
2019-11-23 17:22:36 +00:00
Kyle Evans
8922c2ca03 bcm2835_sdhci: fix non-INVARIANTS build
sc is now only used to make sure we're not re-entering the data handling
path erroneously.

Reported by:	Mark Millard
2019-11-23 13:39:47 +00:00
Kyle Evans
0227a14997 arm64/NOTES: add SOC_BRCM_BCM2838
This should have been done back when it was added, but it was not. It only
really adds an extra entry for memory mapping bits in bcm2835_vcbus.c, so
nothing too extensive yet.
2019-11-23 03:38:26 +00:00
Kyle Evans
d5f1d33c67 bcm2835_dma: rip out the "use_dma" flag, make it non-optional
Now that it works for the Raspberry Pi 4, we can discontinue our workarounds
that were put in place to at least get a bootable kernel for other testing.
2019-11-23 01:47:17 +00:00
Kyle Evans
d7399dfdba bcm2835_sdhci: "fix" DMA on the RPi 4
According to the documentation I have, DREQ pacing should be required here.
The DREQ# hasn't changed since the BCM2835. As soon as we attempt to setup
DREQ, DMA stalls and there's no clear reason why as of yet. Setting this
back to NONE seems to work just as well, though it's yet to be determined if
this is a sustainable model in high-throughput scenarios.
2019-11-23 01:46:02 +00:00
Conrad Meyer
7993a104a1 Add explicit SI_SUB_EPOCH
Add explicit SI_SUB_EPOCH, after SI_SUB_TASKQ and before SI_SUB_SMP
(EARLY_AP_STARTUP).  Rename existing "SI_SUB_TASKQ + 1" to SI_SUB_EPOCH.

epoch(9) consumers cannot epoch_alloc() before SI_SUB_EPOCH:SI_ORDER_SECOND,
but likely should allocate before SI_SUB_SMP.  Prior to this change,
consumers (well, epoch itself, and net/if.c) just open-coded the
SI_SUB_TASKQ + 1 order to match epoch.c, but this was fragile.

Reviewed by:	mmacy
Differential Revision:	https://reviews.freebsd.org/D22503
2019-11-22 23:23:40 +00:00
Alexander Motin
bae3729be4 Do not retry long ready waits if previous gave nothing.
I have some disks reporting "Logical unit is in process of becoming ready"
for about half an hour before finally reporting failure.  During that time
CAM waits for the readiness during ~2 minutes for each request, that makes
system boot take very long time.

This change reduces wait times for the following requests to ~1 second if
previously long wait for that device has timed out.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-11-22 21:31:59 +00:00
Conrad Meyer
b6db1cc710 random(4): De-export random_sources list
The internal datastructures do not need to be visible outside of
random_harvestq, and this helps ensure they are not misused.

No functional change.

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22485
2019-11-22 20:24:15 +00:00
Scott Long
02d4535d2d Mark hpt27xx for removal in 13.0; all CAM drivers will be Giant-free by then.
Relnotes:	yes
2019-11-22 20:23:22 +00:00
Conrad Meyer
d7a23f9f6b random(4): Use ordinary sysctl definitions
There's no need to dynamically populate them; the SYSCTL_ macros take care
of load/unload appropriately already (and random_harvestq is 'standard' and
cannot be unloaded anyway).

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22484
2019-11-22 20:22:29 +00:00
Dave Cottlehuber
130cfcf3fc dhclient: support option 114, default-url ascii
This will enable further automation of HTTP UEFI boot loader support by
providing a specific option for providing the boot URL to FreeBSD.

Documented in:

https://www.iana.org/assignments/bootp-dhcp-parameters/bootp-dhcp-parameters.xhtml
https://kb.isc.org/docs/isc-dhcp-44-manual-pages-dhcp-options
https://tools.ietf.org/html/rfc3679

Approved by:	emaste
MFC after:	2 weeks
Sponsored by:	SkunkWerks, GmbH
Differential Revision:	https://reviews.freebsd.org/D22475
2019-11-22 20:22:16 +00:00
Conrad Meyer
f19de0a945 random(4): Abstract loader entropy injection
Break random_harvestq_prime up into some logical subroutines.  The goal
is that it becomes easier to add other early entropy sources.

While here, drop pre-12.0 compatibility logic.  loader default configuration
should preload the file as expeced since 12.0.

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22482
2019-11-22 20:20:37 +00:00
Conrad Meyer
92ebf15da5 random(4): Remove unused definitions
Approved by:	csprng(gordon, markm)
Differential Revision:	https://reviews.freebsd.org/D22481
2019-11-22 20:18:07 +00:00
Kyle Evans
ba78f78f44 bcm2835_vcbus: add the *other* rpi4 compat string
The DTS I used initially had brcm,bcm2838; the new one uses brcm,bcm2711.
Add that one as well.
2019-11-22 19:56:52 +00:00
Kyle Evans
5b0a8ee218 MMCCAM: defer release of ccb until we're done with it
If we've found a device, we attempt to call xpt_action() on a ccb that's
already been released. Simply defer release until after we're done with it.

Reviewed by:	imp, scottl
MFC after:	1 week
2019-11-22 19:54:14 +00:00
Conrad Meyer
cb285f7c7c random/ivy: Provide mechanism to read independent seed values from rdrand
On x86 platforms with the intrinsic, rdrand is a deterministic bit generator
(AES-CTR) seeded from an entropic source.  On x86 platforms with rdseed, it
is something closer to the upstream entropic source.  (There is more nuance;
a block diagram is provided in [1].)

On devices with rdrand and without rdseed, there is no good intrinsic for
acecssing the good entropic soure directly.  However, the DRBG is guaranteed
to reseed every 8 kB on these platforms.  As a conservative option, on such
hardware we can read an extra 7.99kB samples every time we want a sample
from an independent seed.

As one can imagine, this drastically slows the effective read rate of
RDRAND (a factor of 1024 on amd64 and 2048 on ia32).  Microbenchmarks on AMD
Zen (has RDSEED) show an RDRAND rate of 25 MB/s and Intel Haswell (no
RDSEED) show RDRAND of 170 MB/s.  This would reduce the read rate on Haswell
to ~170 kB/s (at 100% CPU).  random(4)'s harvestq thread periodically
"feeds" from pure sources in amounts of 128-1024 bytes.  On Haswell,
enabling this feature increases the CPU time of RDRAND in each "feed" from
approximately 0.7-6 µs to 0.7-6 ms.

Because there is some performance penalty to this more conservative option,
a knob is provided to enable the change.  The change does not affect
platforms with RDSEED.

[1]: https://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide#inpage-nav-4-2

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22455
2019-11-22 19:30:31 +00:00
Alexander Motin
7e8baf37e0 Remove xpt_lock mutex.
CAM does not require SIM locks for years, and obviously does not require
it for completely virtual XPT SIM.

MFC after:	2 weeks
2019-11-22 18:55:27 +00:00
Scott Long
8823960b8d Schedule the trm(4) driver for removal. It relies on Giant and thus has
required compat shims in CAM for 12 years.

Relnotes:	yes
2019-11-22 18:50:53 +00:00
Brooks Davis
edb0ec001e Revert r354909: Make the warning for deprecated NO_ variables an error.
An unexpectidly large number of ports define NO_MAN (and sometimes the
long-dead NOMAN).  I'll fix ports and then re-commit.
2019-11-22 18:41:09 +00:00
Alexander Motin
a4876fbfc3 Make CAM use root_mount_hold_token() to delay boot.
Before this change CAM used config_intrhook_establish() for this purpose,
but that approach does not allow to delay it again after releasing once.

USB stack uses root_mount_hold() to delay boot until bus scan is complete.
But once it is, CAM had no time to scan SCSI bus, registered by umass(4),
if it already done other scans and called config_intrhook_disestablish().
The new approach makes it work smooth, assuming the USB device is found
during the initial bus scan.  Devices appearing on USB bus later may still
require setting kern.cam.boot_delay, but hopefully those are minority.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-11-22 18:39:51 +00:00
Scott Long
f0d6f5774a Remove NEEDGIANT from the scsi_sg /dev node. It likely has not been
needed for many years.

Reported by:	imp
2019-11-22 18:18:36 +00:00