Commit Graph

238506 Commits

Author SHA1 Message Date
Mateusz Guzik
b0b246b0ba Remove proctree acquire from note_procstat_proc
It is not needed since r340482 ("proc: always store parent pid in p_oppid")

Sponsored by:	The FreeBSD Foundation
2018-12-08 11:38:39 +00:00
Mateusz Guzik
eab2132ad9 Fix a corner case in ID bitmap management.
If all IDs from trypid to pid_max were used as pids, the code would enter
a loop which would be infinite if none of the IDs could become free (e.g.
they all belong to processes which did not transitioned to zombie).

Fixes:	r341684 ("Manage process-related IDs with bitmaps")

Sponsored by:	The FreeBSD Foundation
2018-12-08 10:22:12 +00:00
Mateusz Guzik
e52327e3c5 proc: postpone proc unlock until after reporting with kqueue
kqueue would always relock immediately afterwards.

While here drop the NULL check for list itself. The list is
always allocated.

Sponsored by:	The FreeBSD Foundation
2018-12-08 06:34:12 +00:00
Mateusz Guzik
eadb1dcb71 proc: handle sdt exit probe before taking the proc lock
Sponsored by:	The FreeBSD Foundation
2018-12-08 06:31:43 +00:00
Mateusz Guzik
13a45e4b14 Provide SDT_PROBES_ENABLED macro.
Sponsored by:	The FreeBSD Foundation
2018-12-08 06:30:41 +00:00
Mateusz Guzik
3c76ace36b amd64: stop re-reading curpc on subyte/suword
Originally read value is still safely kept. Re-reading code was there
for previous iterations which were partially shared with i386.

Sponsored by:	The FreeBSD Foundation
2018-12-08 04:53:08 +00:00
Konstantin Belousov
18519f1583 Simplify kern_readlink_vp().
When we detected that the vnode is not symlink, return immediately.
This moves the readlink code out of else branch and unindents it.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-12-07 23:07:51 +00:00
Konstantin Belousov
978f879483 Fix expression evaluation.
Braces were put in the wrong place, causing failing EAGAIN check to
return zero result.  Remove the problematic assignment from the
conditional expression at all.

While there, remove used once variable vp, and wrap too long line.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-12-07 23:05:12 +00:00
Warner Losh
91182bcfb6 Even though they are reserved, cdw2 and cdw3 can be set via nvme-cli
(and soon nvmecontrol). Go ahead and copy them into rsvd2 and rsvd3.

Sponsored by: Netflix
2018-12-07 21:58:08 +00:00
Warner Losh
80c8ffad94 Add nda(4) cross reference to nvme(4) 2018-12-07 21:57:39 +00:00
Alexander Motin
6810fd0acf Make virtio-scsi pass SCSI Task Attributes to CTL.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2018-12-07 20:55:29 +00:00
Alexander Motin
99fa47de81 Fix several iov handling bugs in bhyve virtio-scsi backend.
- buf_to_iov() does not use buflen parameter, allowing out of bound read.
 - buf_to_iov() leaks memory if seek argument > 0.
 - iov_to_buf() doesn't need to reallocate buffer for every segment.
 - there is no point to use size_t for iov counts, int is more then enough.
 - some iov function arguments can be constified.
 - pci_vtscsi_request_handle() used truncate_iov() incorrectly, allowing
   getting out of buffer and possibly corrupting data.
 - pci_vtscsi_controlq_notify() written returned status at wrong offset.
 - pci_vtscsi_controlq_notify() leaked one buffer per event.

Reported by:	wg
Reviewed by:	araujo
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D18465
2018-12-07 20:30:00 +00:00
Alexander Motin
9ce46e8107 Fill initid explicitly on requests.
Unfortunately ctl_scsi_zero_io() wipes that field, so it was always zero.
While there, targ_port is set by kernel, so user-space should not fill it.

MFC after:	1 week
2018-12-07 19:10:51 +00:00
Ed Maste
b9de82a654 BSD.debug.dist: add newly added nvmecontrol directory 2018-12-07 16:52:52 +00:00
Mateusz Guzik
08d005e6a3 fd: use racct_set_unlocked
Sponsored by:	The FreeBSD Foundation
2018-12-07 16:51:38 +00:00
Mateusz Guzik
448db4f761 racct: add RACCT_ENABLED macro and racct_set_unlocked
This allows to remove PROC_LOCK/UNLOCK pairs spread thorought the kernel
only used to appease racct_set.

Sponsored by:	The FreeBSD Foundation
2018-12-07 16:47:34 +00:00
Mateusz Guzik
82f4b82634 fd: try do less work with the lock in dup
Sponsored by:	The FreeBSD Foundation
2018-12-07 16:44:52 +00:00
Mateusz Guzik
83764b446a vm: use fcmpset for vmspace reference counting
Sponsored by:	The FreeBSD Foundation
2018-12-07 16:22:54 +00:00
Mateusz Guzik
6ff4688b09 Replace hand-rolled unrefs if > 1 with refcount_release_if_not_last
Sponsored by:	The FreeBSD Foundation
2018-12-07 16:11:45 +00:00
Mateusz Guzik
f07c942dd8 refcount: remove a stale comment about conditional ref/unref routines
It was there for vfs-specific variants and was copied over when it should not.

Sponsored by:	The FreeBSD Foundation
2018-12-07 16:10:13 +00:00
Andriy Gapon
6d29ba58a8 acpi_MatchHid: use ACPI_MATCHHID_NOMATCH instead of FALSE
Binary representation of both is the same (zero), but
ACPI_MATCHHID_NOMATCH is better for consistency.

MFC after:	4 days
X-MFC with:	r339754
2018-12-07 16:05:39 +00:00
Andriy Gapon
f01b5ed9c8 aibs: fix a typo in the probe method that was introduced in r339754
Because of that typo the driver would try to attach to every device
on acpi bus.  That disrupted acpi attachment of uart driver, at least.

MFC after:	4 days
X-MFC with:	r339754
2018-12-07 16:01:51 +00:00
Mark Johnston
1a153f42fa Update the description of the address space layout on RISC-V.
This adds more detail and fixes some inaccuracies.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18463
2018-12-07 15:56:40 +00:00
Mark Johnston
1f5e341b46 Rename sptbr to satp per v1.10 of the privileged architecture spec.
Add a subroutine for updating satp, for use when updating the
active pmap.  No functional change intended.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18462
2018-12-07 15:55:23 +00:00
Mark Johnston
08c4a937a6 Let the cap_syslog capability inherit stdio descriptors.
Otherwise cap_openlog(LOG_PERROR) doesn't work.

Reviewed by:	oshogbo
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18457
2018-12-07 15:52:50 +00:00
Konstantin Belousov
fd52edaf70 Regen. 2018-12-07 15:19:00 +00:00
Konstantin Belousov
d1fd400a80 Add new file handle system calls.
Namely, getfhat(2), fhlink(2), fhlinkat(2), fhreadlink(2).  The
syscalls are provided for a NFS userspace server (nfs-ganesha).

Submitted by:	Jack Halford <jack@gandi.net>
Sponsored by:	Gandi.net
Tested by:	pho
Feedback from:	brooks, markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D18359
2018-12-07 15:17:29 +00:00
Mateusz Guzik
b1fbffe73c proc: when exiting move to zombproc before taking proctree
The kernel was already doing this prior to r329615. It was changed
to reduce contention on allproc. However, introduction of pidhash
locks and removal of proctree -> allproc ordering from fork thanks
to bitmaps fixed things enough to make this change pessimal.

waitpid takes proctree on each call and this change (now) causes
avoidable stalls if allproc is held.

Sponsored by:	The FreeBSD Foundation
2018-12-07 12:32:25 +00:00
Mateusz Guzik
34ebdceac0 Manage process-related IDs with bitmaps
Currently unique pid allocation on fork often requires a full walk of
process, group, session lists to make sure it is not used by anything.
This has a side effect of requiring proctree to be held along with allproc,
which adds more contention in poudriere -j 128.

The patch below implements trivial bitmaps which gets rid of the problem.
Dedicated lock is introduced to manage IDs.

While here a bug was discovered: all processes would inherit reap id from
the first process spawned by init. This had a side effect of keeping the
ID used and when allocation rolls over to the beginning it keeps being
skipped.

The patch is loosely based on initial work by mjoras@.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
2018-12-07 12:22:32 +00:00
Mateusz Guzik
6e8c1ccbe2 Annotate Giant drop/pickup macros with __predict_false
They are used in important places of the kernel with the lock not being held
majority of the time.

Sponsored by:	The FreeBSD Foundation
2018-12-07 12:06:03 +00:00
Mateusz Guzik
afacf95d5c unr64: use locked variant if not __LP64__
The current ifdefs are not sufficient to distinguish 32- and 64- bit
variants, which results e.g. in powerpc64 not using atomics.

While some 32-bit archs provide 64-bit atomics, there is no huge advantage
of using them on these platforms.

Reported by:	many
Suggested by:	jhb
Sponsored by:	The FreeBSD Foundation
2018-12-07 12:05:11 +00:00
Andriy Gapon
2e2b365e47 daprobedone: announce if a disk is write-protected
MFC after:	2 weeks
2018-12-07 12:02:31 +00:00
Vincenzo Maffione
2605ddfce9 netmap: remove dead code obsoleted by iflib
The iflib subsystem implements netmap support in a driver-independent
way (sys/net/iflib.c). We can therefore remove the headers that
used to implement netmap support for all the drivers now supported
by iflib (em, igb, ixl, ixgbe, lem).

MFC after:	1 week
2018-12-07 11:47:42 +00:00
Michal Meloun
025489d337 Fix cut&paste typo in atomic_fetchadd_64().
Reported by:	Jia-Shiun Li <jiashiun@gmail.com>
MFC after:	1 week
2018-12-07 11:10:27 +00:00
Pawel Jakub Dawidek
4926792bc9 Consider the following situation:
The sender has .not_terminated file. It gets disconnected. The last trail
file is then terminated without adding new data (this can happen for example
when auditd is being stopped on the sender). After reconnect the .not_terminated
was not renamed on the receiver as it should.

We were already handling similar situation where the sender crashed and the
.not_terminated trail file was renamed to .crash_recovery. Extend this case to
handle the situation above.
2018-12-07 03:13:36 +00:00
Conrad Meyer
af7dcae0e2 gmirror: Evaluate mirror components against newest metadata copy
Re-apply r341665 with format strings fixed.

If we happen to taste a stale mirror component first, don't reject valid,
newer components that have differing metadata from the stale component
(during STARTING).  Instead, update our view of the most recent metadata as
we taste components.

Like mediasize beforehand, remove some checks from g_mirror_check_metadata
which would evict valid components due to metadata that can change over a
mirror's lifetime.  g_mirror_check_metadata is invoked long before we check
genid/syncid and decide which component(s) are newest and whether or not we
have quorum.

Before checking if we can enter RUNNING (i.e., we have quorum) after a NEW
component is added, first remove any known stale or inconsistent disks from
the mirrorset, rather than removing them *after* deciding we have quorum.
Check if we have quorum after removing these components.

Additionally, add a knob, kern.geom.mirror.launch_mirror_before_timeout, to
force gmirrors to wait out the full timeout (kern.geom.mirror.timeout)
before transitioning from STARTING to RUNNING.  This is a kludge to help
ensure all eligible, boot-time available mirror components are tasted before
RUNNING a gmirror.

Add a basic test case for STARTING -> RUNNING startup behavior around stale
genids.

PR:		232671, 232835
Submitted by:	Cindy Yang <cyang AT isilon.com> (previous version)
Reviewed by:	markj (kernel portions)
Discussed with:	asomers, Cindy Yang
Tested by:	pho
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D18062
2018-12-07 02:44:04 +00:00
Conrad Meyer
c4e87bdfc1 Revert r341665 due to tinderbox breakage
I didn't notice that some format strings were non-portable.  Will fix and
re-commit later.
2018-12-07 00:47:05 +00:00
Alan Somers
a9ebbf33ea geom tests: Fix cleanup of ATF tests since r341392
r341392 changed common test cleanup routines in a way that allowed them to
be used by TAP tests as well as ATF tests.  However, a late change made
during code review resulted in cleanup being broken for ATF tests, which
source geom_subr.sh separately during the body and cleanup phases of the
test.  The result was that md(4) devices wouldn't get cleaned up.

MFC after:	2 weeks
X-MFC-With:	341392
2018-12-07 00:27:38 +00:00
Conrad Meyer
bc1ee0be2d gmirror: Evaluate mirror components against newest metadata copy
If we happen to taste a stale mirror component first, don't reject valid,
newer components that have differing metadata from the stale component
(during STARTING).  Instead, update our view of the most recent metadata as
we taste components.

Like mediasize beforehand, remove some checks from g_mirror_check_metadata
which would evict valid components due to metadata that can change over a
mirror's lifetime.  g_mirror_check_metadata is invoked long before we check
genid/syncid and decide which component(s) are newest and whether or not we
have quorum.

Before checking if we can enter RUNNING (i.e., we have quorum) after a NEW
component is added, first remove any known stale or inconsistent disks from
the mirrorset, rather than removing them *after* deciding we have quorum.
Check if we have quorum after removing these components.

Additionally, add a knob, kern.geom.mirror.launch_mirror_before_timeout, to
force gmirrors to wait out the full timeout (kern.geom.mirror.timeout)
before transitioning from STARTING to RUNNING.  This is a kludge to help
ensure all eligible, boot-time available mirror components are tasted before
RUNNING a gmirror.

When we are instructed to forget mirror components, bump the generation id
to avoid confusion with such stale components later.

Add a basic test case for STARTING -> RUNNING startup behavior around stale
genids.

PR:		232671, 232835
Submitted by:	Cindy Yang <cyang AT isilon.com> (previous version)
Reviewed by:	markj (kernel portions)
Discussed with:	asomers, Cindy Yang
Tested by:	pho
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D18062
2018-12-06 23:55:39 +00:00
Warner Losh
e8c7837685 Update paths based on last-minute changes from libexec to lib. 2018-12-06 23:40:56 +00:00
Warner Losh
44d31a441e Declare global function print_intel_add_smart in header 2018-12-06 23:29:06 +00:00
Warner Losh
4b639af1d4 Use proper prototypes. 2018-12-06 23:28:55 +00:00
Warner Losh
e4aa091bd1 It's useful to have this be a global function.
Other vendors base their additional smart info pages on what Intel did
plus some other bits. So it's convenient to have this be global.

Sponsored by: Netflix
2018-12-06 22:59:18 +00:00
Warner Losh
8775459902 This is not a samsung standard, so remove that alias.
This was never documented, and isn't needed, so it's best removed to
avoid confusion.

Sponsored by: Netflix
Differential Revision:	https://reviews.freebsd.org/D18460
2018-12-06 22:59:08 +00:00
Warner Losh
eac8e82796 Move intel and wdc files to their own modules
Move the intel and wdc vendor specific stuff to their own modules.

Sponsored by: Netflix
Differential Revision:  https://reviews.freebsd.org/D18460
2018-12-06 22:58:55 +00:00
Warner Losh
0d095c23a0 Const poison the command interface
Make the pointers we pass into the commands const, also make the
linker set mirrors const.

Suggested by: cem@
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18459
2018-12-06 22:58:42 +00:00
Warner Losh
228c425533 Dynamically load .so modules to expand functionality
o Dynamically load all the .so files found in /libexec/nvmecontrol and
  /usr/local/libexec/nvmecontrol.
o Link nvmecontrol -rdynamic so that its symbols are visible to the
  libraries we load.
o Create concatinated linker sets that we dynamically expand.
o Add the linked-in top and logpage linker sets to the mirrors for them
  and add those sets to the mirrors when we load a new .so.
o Add some macros to help hide the names of the linker sets.
o Update the man page.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18455

fold
2018-12-06 22:58:26 +00:00
John Baldwin
dab45de671 Rename riscv64-freebsd.c to riscv-freebsd.c.
This fixes truss when built as part of a riscv64sf world.  Additionally,
if FreeBSD ever supports RV32 RISC-V most of this file can be used as-is
just as a single file is used for all of the MIPS ABIs.

Sponsored by:	DARPA
2018-12-06 22:35:07 +00:00
Konstantin Belousov
a748d99a17 Fix build with option RSS, removing unused variables.
Reported by:	np
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-12-06 21:52:40 +00:00
Navdeep Parhar
9b11a65d1c cxgbe(4): Get Linux cxgb4vf working in bhyve VMs with VFs passed
through.

cxgb4vf doesn't own the buffer size list but still expects the first two
entries to be 4K and some power of 2 respectively.  The BSD cxgbe
doesn't care where its preferred buffer sizes are as long as they're in
the list somewhere, so just move its entries towards the end as a
workaround.

MFC after:	1 month
Sponsored by:	Chelsio Communicatons
2018-12-06 21:33:08 +00:00