Commit Graph

15052 Commits

Author SHA1 Message Date
rwatson
e2edde369a Audit the accepted (or rejected) username argument to setlogin(2).
(NB: This was likely a mismerge from XNU in audit support, where the
text argument to setlogin(2) is captured -- but as a text token,
whereas this change uses the dedicated login-name field in struct
audit_record.)

MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2016-08-20 20:28:08 +00:00
rwatson
d27d338583 Audit additional vnode information in the implementation of the
ftruncate(2) system call.  This was not required by the Common
Criteria, which needed only open-time audit.

MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2016-08-20 18:51:48 +00:00
markj
b3cc80b493 Don't set P2_PTRACE_FSTP in a process that invokes ptrace(PT_TRACE_ME).
Such processes are stopped synchronously by a direct call to
ptracestop(SIGTRAP) upon exec. P2_PTRACE_FSTP causes the exec()ing thread
to suspend itself while waiting for a SIGSTOP that never arrives.

Reviewed by:	kib
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D7576
2016-08-19 17:57:14 +00:00
mmel
28257ccca8 INTRNG: Rework handling with resources. Partially revert r301453.
- Read interrupt properties at bus enumeration time and store
   it into global mapping table.
 - At bus_activate_resource() time, given mapping entry is resolved and
   connected to real interrupt source. A copy of mapping entry is attached
   to given resource.
 - At bus_setup_intr() time, mapping entry stored in resource is used
   for delivery of requested interrupt configuration.
 - For MSI/MSIX interrupts, mapping entry is created within
   pci_alloc_msi()/pci_alloc_msix() call.
 - For legacy PCI interrupts, mapping entry must be created within
   pcib_route_interrupt() by pcib driver itself.

Reviewed by: nwhitehorn, andrew
Differential Revision: https://reviews.freebsd.org/D7493
2016-08-19 10:52:39 +00:00
markj
03e282f619 Correct a check for P2_PTRACE_FSTP in ptracestop().
MFC after:	1 day
2016-08-19 01:27:24 +00:00
gnn
956381ee95 Remove the obsolete and unused openbsd_poll system call. (Phase 2)
Reported by:	brooks
Reviewed by:	brooks, jhb
Differential Revision:	https://reviews.freebsd.org/D7548
2016-08-18 10:54:39 +00:00
gnn
14495cd3b7 Remove unusedd and obsolete openbsd_poll system call. (Phase 1)
Reported by:	brooks
Reviewed by:	brooks,jhb
Differential Revision:	https://reviews.freebsd.org/D7548
2016-08-18 10:50:40 +00:00
bdrewery
4eb119b8e7 Garbage collect _umtx_lock(2)/_umtx_unlock(2) references removed in r263318.
This has no real impact on the resulting libc.so file.

MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2016-08-17 10:20:05 +00:00
kib
5413d2d271 Remove duplicated code.
aio_aqueue() calls aio_init_aioinfo() as the first action. There is no
need to duplicate the code in kern_aio_fsync().

Also fix indent for aio_aqueue() definition.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D7523
2016-08-17 10:14:22 +00:00
kib
e56264ca17 Implement userspace gettimeofday(2) with HPET timecounter.
Right now, userspace (fast) gettimeofday(2) on x86 only works for
RDTSC.  For older machines, like Core2, where RDTSC is not C2/C3
invariant, and which fall to HPET hardware, this means that the call
has both the penalty of the syscall and of the uncached hw behind the
QPI or PCIe connection to the sought bridge.  Nothing can me done
against the access latency, but the syscall overhead can be removed.
System already provides mappable /dev/hpetX devices, which gives
straight access to the HPET registers page.

Add yet another algorithm to the x86 'vdso' timehands. Libc is updated
to handle both RDTSC and HPET.  For HPET, the index of the hpet device
to mmap is passed from kernel to userspace, index might be changed and
libc invalidates its mapping as needed.

Remove cpu_fill_vdso_timehands() KPI, instead require that
timecounters which can be used from userspace, to provide
tc_fill_vdso_timehands{,32}() methods.  Merge i386 and amd64
libc/<arch>/sys/__vdso_gettc.c into one source file in the new
libc/x86/sys location.  __vdso_gettc() internal interface is changed
to move timecounter algorithm detection into the MD code.

Measurements show that RDTSC even with the syscall overhead is faster
than userspace HPET access.  But still, userspace HPET is three-four
times faster than syscall HPET on several Core2 and SandyBridge
machines.

Tested by:	Howard Su <howard0su@gmail.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D7473
2016-08-17 09:52:09 +00:00
glebius
0d6b2c40a8 Fix a stupid typo (or copy/paste buffer malfunction). 2016-08-16 23:00:22 +00:00
glebius
0495c9a2f3 We should not be allowing a timeout to reset when a drain is in progress on
it (either async or sync drain).

At this moment the only user of drain is TCP, but TCP wouldn't reschedule a
callout after it has drained it, since it drains only when a tcpcb is closed.
This for now the problem isn't observed.

Submitted by:	rrs
2016-08-16 21:55:34 +00:00
ed
b877ee1f8a Eliminate use of sys_fsync() and sys_fdatasync().
Make the kern_fsync() function public, so that it can be used by other
parts of the kernel. Fix up existing consumers to make use of it.

Requested by:	kib
2016-08-15 20:11:52 +00:00
badger
c22ca3f26a sem_post(): wake up the sleeper only after adjusting has_waiters
If the caller of sem_post() wakes up a thread sleeping via sem_wait()
before it clears the has_waiters flag, the caller of sem_wait() has no way of
knowing when it is safe to destroy the semaphore and reuse the memory. This is
because the caller of sem_post() may be interrupted between the wake step and
the clearing of has_waiters. It will then write into the has_waiters flag in
userspace after being preempted for some unknown amount of time.

Reviewed by:	jhb, kib, vangyzen
Approved by:	kib (mentor), vangyzen (mentor)
MFC after:	2 weeks
Sponsored by:	Dell Inc.
Differential Revision:	https://reviews.freebsd.org/D7505
2016-08-15 20:09:09 +00:00
kib
30646b2071 Implement VOP_FDATASYNC() for msdosfs.
Standard VOP_FSYNC() implementation just syncs data buffers, and due
to this, is the correct and efficient implementation for msdosfs or
any other filesystem which uses bufer cache trivially.  Provide
globally visible wrapper vop_stdfdatasync_buf() for future consumption
by other filesystems.

Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D7471
2016-08-15 19:17:00 +00:00
kib
ab615e726f Regen after r304176, fdatasync(2) addition. 2016-08-15 19:15:46 +00:00
kib
aa6b4fc56a Add an implementation of fdatasync(2).
The syscall is a trivial wrapper around new VOP_FDATASYNC(), sharing
code with fsync(2).  For all filesystems, this commit provides the
implementation which delegates the work of VOP_FDATASYNC() to
VOP_FSYNC().  This is functionally correct but not efficient.

This is not yet POSIX-compliant implementation, because it does not
ensure that queued AIO requests are completed before returning.

Reviewed by:	mckusick
Discussed with:	avg (ZFS), jhb (AIO part)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D7471
2016-08-15 19:08:51 +00:00
kib
af1734a6ca VOP_FSYNC() does not take cred as an argument. Correct comment.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-15 18:55:33 +00:00
alc
c2131431d1 Eliminate unneeded vm_page_xbusy() and vm_page_xunbusy() operations when
neither vm_pager_has_page() nor vm_pager_get_pages() is called.

Reviewed by:	kib, markj
MFC after:	3 weeks
2016-08-14 22:00:45 +00:00
bde
1cd636e74d Print the tid of curthread in "show pcpu" in ddb.
It was remarkably hard to trace all current threads.  "show pcpu" only
showed the pid, and there was nothing (?) better than searching ps output
to find the tids on CPUs.  This change simplifies the search, but you
still have to trace the tid for each CPU manually.
2016-08-14 15:52:00 +00:00
alc
adb85a5730 Eliminate two calls to vm_page_xunbusy() that are both unnecessary and
incorrect from the error cases in exec_map_first_page().  They are
unnecessary because we automatically unbusy the page in vm_page_free()
when we remove it from the object.  The calls are incorrect because they
happen after the page is freed, so we might actually unbusy the page
after it has been reallocated to a different object.  (This error was
introduced in r292373.)

Reviewed by:	kib
MFC after:	1 week
2016-08-13 18:10:32 +00:00
trasz
523c191717 Remove unused "X" vnode lock assertion, somehow missed in r303743.
MFC after:	1 month
2016-08-12 22:22:11 +00:00
trasz
cf980f6c7a Print vnode details when vnode locking assertion gets triggered.
MFC after:	1 month
2016-08-12 22:20:52 +00:00
shurd
181dc4875b Update iflib to support more NIC designs
- Move group task queue into kern/subr_gtaskqueue.c
- Change intr_enable to return an int so it can be detected if it's not
  implemented
- Allow different TX/RX queues per set to be different sizes
- Don't split up TX mbufs before transmit
- Allow a completion queue for TX as well as RX
- Pass the RX budget to isc_rxd_available() to allow an earlier return
  and avoid multiple calls

Submitted by:	shurd
Reviewed by:	gallatin
Approved by:	scottl
Differential Revision:	https://reviews.freebsd.org/D7393
2016-08-12 21:29:44 +00:00
markj
6843d46c78 Remove b_pin_count from struct buf.
It was added in r153192 for XFS and doesn't appear to have been used for
anything else. XFS was disconnected in r241607 and removed entirely in
r247631.

Reported by:	mlaier
Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D7468
2016-08-11 07:58:23 +00:00
trasz
255ed885fa Replace all remaining calls to vprint(9) with vn_printf(9), and remove
the old macro.

MFC after:	1 month
2016-08-10 16:12:31 +00:00
mjg
6d8d343a28 ktrace: do a lockless check on fork to see if tracing is enabled
This saves 2 lock acquisitions in the common case.
2016-08-10 15:25:44 +00:00
mjg
0a1afe2a20 sigio: do a lockless check in funsetownlist
There is no need to grab the lock first to see if sigio is used, and it
typically is not.
2016-08-10 15:24:15 +00:00
kib
9e755a47c1 Fix indentation.
Reported by:	hselasky
MFC after:	17 days
2016-08-10 14:41:53 +00:00
kib
3c085e9419 Re-schedule signals after kthread exits, since apparently there are
processes which combine kernel and non-kernel threads, e.g. nfsd.  For
such processes, termination of a kthread must recheck signal delivery
among other threads according to masks.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-10 13:47:12 +00:00
dumbbell
371e0a7254 Consistently use device_t
Several files use the internal name of `struct device` instead of
`device_t` which is part of the public API. This patch changes all
`struct device *` to `device_t`.

The remaining occurrences of `struct device` are those referring to the
Linux or OpenBSD version of the structure, or the code is not built on
FreeBSD and it's unclear what to do.

Submitted by:	Matthew Macy <mmacy@nextbsd.org> (previous version)
Approved by:	emaste, jhibbits, sbruno
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D7447
2016-08-09 19:32:06 +00:00
stevek
12aa5d6087 Move IPv4-specific jail functions to new file netinet/in_jail.c
_prison_check_ip4 renamed to prison_check_ip4_locked

Move IPv6-specific jail functions to new file netinet6/in6_jail.c
_prison_check_ip6 renamed to prison_check_ip6_locked

Add appropriate prototypes to sys/sys/jail.h

Adjust kern_jail.c to call prison_check_ip4_locked and
prison_check_ip6_locked accordingly.

Add netinet/in_jail.c and netinet6/in6_jail.c to the list of files that
need to be built when INET and INET6, respectively, are configured in the
kernel configuration file.

Reviewed by:	jtl
Approved by:	sjg (mentor)
Sponsored by:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D6799
2016-08-09 02:16:21 +00:00
markj
cfea0efd4a Handle races with listening socket close when connecting a unix socket.
If the listening socket is closed while sonewconn() is executing, the
nascent child socket is aborted, which results in recursion on the
unp_link lock when the child's pru_detach method is invoked. Fix this
by using a flag to mark such sockets, and skip a part of the socket's
teardown during detach.

Reported by:	Raviprakash Darbha <rdarbha@juniper.net>
Tested by:	pho
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D7398
2016-08-08 20:25:04 +00:00
bdrewery
23ccf3c427 Regenerate after r303755.
MFC after:	3 days
X-MFC-With:	r303755
Sponsored by:	EMC / Isilon Storage Division
2016-08-04 19:15:51 +00:00
bdrewery
d7bdbffe9e Still provide freebsd10_* symbols from libc for COMPAT10.
r296773 was done to only remove libc symbols for <7.  We want to provide
the syscall symbols going forward for 7+.

Discussed with:	jhb
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2016-08-04 19:14:18 +00:00
trasz
522c3665ca Remove unused - never actually implemented - vnode lock types
from vnode_if.src.

MFC after:	1 month
2016-08-04 13:45:18 +00:00
bdrewery
fd91db24a2 Correct some comments.
Sponsored by:	EMC / Isilon Storage Division
MFC after:	3 days
2016-08-03 18:48:56 +00:00
mjg
bdc6c48a91 locks: fix sx compilation on mips after r303643
The kernel.h header is required for the SYSINIT macro, which apparently
was present on amd64 by accident.

Reported by:	kib
2016-08-03 09:15:10 +00:00
kib
f7fe41e771 Remove mention of the Giant from the fork_return() description.
Making emphasis on this lock in the core function comment is confusing
for the modern kernel.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2016-08-03 07:10:09 +00:00
ed
532d4f6297 Regenerate system call tables for r303699 and r303700. 2016-08-03 06:36:45 +00:00
ed
36994c6f2e Re-add traling slash that was removed in r303699.
I must have accidentally pressed some random key in vim.
2016-08-03 06:35:58 +00:00
ed
f14bc84311 mprotect(): Change prototype to comply to POSIX.
Our mprotect() function seems to take a "const void *" address to the
pages whose permissions need to be adjusted. POSIX uses "void *". Simply
stick to the POSIX one to prevent us from writing unportable code.

PR:		211423 (exp-run)
Tested by:	antoine@ (Thanks!)
2016-08-03 06:33:04 +00:00
mjg
7955eb834a locks: fix compilation for KDTRACE_HOOKS && !ADAPTIVE_* case
Reported by:	Michael Butler <imb protected-networks.net>
2016-08-02 03:05:59 +00:00
mjg
21bfe79a18 locks: fix up ifdef guards introduced in r303643
Both sx and rwlocks had copy-pasted ADAPTIVE_MUTEXES instead of the correct
define.

MFC after:	1 week
2016-08-02 00:15:08 +00:00
mjg
e5c0a1549b Implement trivial backoff for locking primitives.
All current spinning loops retry an atomic op the first chance they get,
which leads to performance degradation under load.

One classic solution to the problem consists of delaying the test to an
extent. This implementation has a trivial linear increment and a random
factor for each attempt.

For simplicity, this first thouch implementation only modifies spinning
loops where the lock owner is running. spin mutexes and thread lock were
not modified.

Current parameters are autotuned on boot based on mp_cpus.

Autotune factors are very conservative and are subject to change later.

Reviewed by:	kib, jhb
Tested by:	pho
MFC after:	1 week
2016-08-01 21:48:37 +00:00
mjg
75eb95197c locks: change sleep_cnt and spin_cnt types to u_int
Both variables are uint64_t, but they only count spins or sleeps.
All reasonable values which we can get here comfortably hit in 32-bit range.

Suggested by: kib
MFC after:	1 week
2016-07-31 12:11:55 +00:00
mjg
a2a371f3c8 sx: increment spin_cnt before cpu_spinwait in xlock
The change is a no-op only done for consistency with the rest of the file.
2016-07-30 22:23:31 +00:00
mjg
636d512a65 rwlock: s/READER/WRITER/ in wlock lockstat annotation 2016-07-30 22:21:48 +00:00
kib
a5b5388a46 Cache getbintime(9) answer in timehands, similarly to getnanotime(9)
and getmicrotime(9).

Suggested and reviewed by:	bde (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2016-07-30 09:25:57 +00:00
jhb
040924b411 Don't treat NOCPU as a valid CPU to CPU_ISSET.
If a thread is created bound to a cpuset it might already be bound before
it's very first timeslice, and td_lastcpu will be NOCPU in that case.

MFC after:	1 week
2016-07-29 20:19:14 +00:00