freebsd-nq

Author	SHA1	Message	Date
Enji Cooper	d467b2ee0c	encode_long, encode_timeval: mechanically replace `exp` with `exponent` This helps fix a -Wshadow issue with exp(3) with tests/sys/acct/acct_test, which include math.h, which in turn defines exp(3) MFC after: 2 weeks Tested with: clang, gcc 4.2.1, gcc 4.9 Sponsored by: Dell EMC Isilon	2017-01-14 05:06:14 +00:00
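The shadowing problem named in d467b2ee0c can be illustrated with a small sketch. The function name `toy_scale_pow2` is hypothetical and not taken from acct_test; the only point shown is that a parameter named `exponent` cannot collide with the `exp(3)` declaration that `math.h` brings into scope.

```c
#include <assert.h>
#include <math.h>	/* declares exp(3), the symbol the commit says was shadowed */

/*
 * Hypothetical helper: multiply a mantissa by 2^exponent.
 * Had the second parameter been called `exp`, it would hide the
 * exp(3) declaration from math.h inside this function, which is the
 * -Wshadow issue the commit message describes.
 */
static unsigned long
toy_scale_pow2(unsigned long mantissa, unsigned int exponent)
{
	/* Keep the shift well defined for this toy example. */
	assert(exponent < 32);
	return (mantissa << exponent);
}
```

Renaming the parameter is a purely mechanical change, which matches how the commit describes itself.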
Enji Cooper	66db8cca1a	Clean up trailing whitespace MFC after: 3 days Sponsored by: Dell EMC Isilon	2017-01-14 04:16:13 +00:00
Enji Cooper	5e8fcdfe1b	Fix -Wunused on gcc 4.9 (x was set but not used) MFC after: 3 days Sponsored by: Dell EMC Isilon	2017-01-14 04:13:28 +00:00
Gleb Smirnoff	4fce19da8d	Remove deprecated fgetsock() and fputsock().	2017-01-13 22:16:41 +00:00
Ian Lepore	d5b937680c	Correct the comments about how much buffer is allocated.	2017-01-13 17:03:23 +00:00
Ian Lepore	a6f63533a7	Check tty_gone() after allocating IO buffers. The tty lock has to be dropped then reacquired due to using M_WAITOK, which opens a window in which the tty device can disappear. Check for this and return ENXIO back up the call chain so that callers can cope. This closes a race where TF_GONE would get set while buffers were being allocated as part of ttydev_open(), causing a subsequent call to ttydevsw_modem() later in ttydev_open() to assert. Reported by: pho Reviewed by: kib	2017-01-13 16:37:38 +00:00
Ian Lepore	e046e8e680	Restructure the tty_drain loop so that device-busy is checked one more time after tty_timedwait() returns an error only if the error is EWOULDBLOCK; other errors cause an immediate return. This fixes the case of the tty disappearing while in tty_drain(). Reported by: pho	2017-01-12 21:18:43 +00:00
Ravi Pokala	8e712af70b	Remove writability requirement for single-mbuf, contiguous-range m_pulldown() m_pulldown() only needs to determine if a mbuf is writable if it is going to copy data into the data region of an existing mbuf. It does this to create a contiguous data region in a single mbuf from multiple mbufs in the chain. If the requested memory region is already contiguous and nothing needs to change, the mbuf does not need to be writeable. Submitted by: Brian Mueller <bmueller@panasas.com> Reviewed by: bz MFC after: 1 week Sponsored by: Panasas Differential Revision: https://reviews.freebsd.org/D9053	2017-01-12 06:38:03 +00:00
Ian Lepore	f64342e354	Rework tty_drain() to poll the hardware for completion, and restore drain timeout handling to historical FreeBSD behavior. The primary reason for these changes is the need to have tty_drain() call ttydevsw_busy() at some reasonable sub-second rate, to poll hardware that doesn't signal an interrupt when the transmit shift register becomes empty (which includes virtually all USB serial hardware). Such hardware hangs in a ttyout wait, because it never gets an opportunity to trigger a wakeup from the sleep in tty_drain() by calling ttydisc_getc() again, after handing the last of the buffered data to the hardware. While researching the history of changes to tty_drain() I stumbled across some email describing the historical BSD behavior of tcdrain() and close() on serial ports, and the ability of comcontrol(1) to control timeout behavior. Using that and some advice from Bruce Evans as a guide, I've put together these changes to implement the hardware polling and restore the historical timeout behaviors... - tty_drain() now calls ttydevsw_busy() in a loop at 10 Hz to accommodate hardware that requires polling for busy state. - The "new historical" behavior for draining during close(2) is retained: the drain timeout is "1 second without making any progress". When the 1-second timeout expires, if the count of bytes remaining in the tty layer buffer is smaller than last time, the timeout is extended for another second. Unfortunately, the same logic cannot be extended all the way down to the hardware, because the interface to that layer is a simple busy/not-busy indication. - Due to the previous point, an application that needs a guarantee that all data has been transmitted must use TIOCDRAIN/tcdrain(3) before calling close(2). - The historical behavior of honoring the drainwait setting for TIOCDRAIN (used by tcdrain(3)) is restored. - The historical kern.drainwait sysctl to control the global default drainwait time is restored, but is now named kern.tty_drainwait. - The historical default drainwait timeout of 300 seconds is restored. - Handling of TIOCGDRAINWAIT and TIOCSDRAINWAIT ioctls is restored (this also makes the comcontrol(1) drainwait verb work again). - Manpages are updated to document these behaviors. Reviewed by: bde (prior version)	2017-01-12 00:48:06 +00:00
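The "1 second without making any progress" rule from f64342e354 can be modelled in user space. This is a minimal sketch under stated assumptions, not the kernel code: `toy_drain_seconds` and its parameters are hypothetical, and each array element stands for the byte count observed when one 1-second timeout expires.

```c
#include <stddef.h>

/*
 * Hypothetical model of the close(2) drain rule.
 *   initial   bytes buffered when draining starts
 *   samples   bytes still buffered at each 1-second timeout expiry
 *   n         number of samples available
 *   timed_out set to 1 if a full second passed without progress
 * Returns the number of seconds waited before the loop stopped.
 */
static size_t
toy_drain_seconds(size_t initial, const size_t *samples, size_t n,
    int *timed_out)
{
	size_t last = initial;
	size_t i;

	*timed_out = 0;
	for (i = 0; i < n; i++) {
		if (samples[i] == 0)
			return (i + 1);		/* buffer fully drained */
		if (samples[i] >= last) {
			*timed_out = 1;		/* no progress this second */
			return (i + 1);
		}
		last = samples[i];		/* progress: wait one more second */
	}
	return (n);				/* ran out of samples */
}
```

The model only sees the buffered byte count. The commit notes that the hardware layer reports only busy or not-busy, so this progress check cannot be extended down to it.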
Mark Johnston	90e17792c8	Do not set BIO_DONE if the BIO specifies a completion handler. biowait() will otherwise race with completions of such BIOs. In-tree code only calls biowait() on BIOs that do not specify a handler, so this change should not have any functional impact. Reviewed by: mav MFC after: 1 month Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D9070	2017-01-10 21:41:28 +00:00
John Baldwin	14da48cbe4	Set MORETOCOME for AIO write requests on a socket. Add a MSG_MORETOCOME message flag. When this flag is set, sosend* set PRUS_MORETOCOME when invoking the protocol send method. The aio worker tasks for sending on a socket set this flag when there are additional write jobs waiting on the socket buffer. Reviewed by: adrian MFC after: 1 month Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D8955	2017-01-06 23:41:45 +00:00
Konstantin Belousov	6e89d383c7	Explicitly add "opt_compat.h" to kern_exec.c: fix powerpc LINT builds. sys/ptrace.h includes sys/signal.h, which includes sys/_sigset.h. Note that sys/_sigset.h only defines osigset_t if COMPAT_43 was defined. Two lines later, sys/ptrace.h includes machine/reg.h, which in case of powerpc, includes opt_compat.h. After the include headers reordering in r311345, we have sys/ptrace.h included before sys/sysproto.h. If COMPAT_43 was requested in the kernel config, the result is that sys/_sigset.h does not define osigset_t, but sys/sysproto.h sees COMPAT_43 and uses osigset_t. Fix this by explicitly including opt_compat.h to cover the whole kern/kern_exec.c scope. Sponsored by: The FreeBSD Foundation	2017-01-06 16:56:24 +00:00
Konstantin Belousov	2f304845e2	Do not allocate struct statfs on kernel stack. Right now the size of the structure is 472 bytes on amd64, which is already large and stack allocations are undesirable. With the ino64 work, MNAMELEN is increased to 1024, which will make it impossible to have struct statfs on the stack. Extracted from: ino64 work by gleb Discussed with: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-01-05 17:19:26 +00:00
Konstantin Belousov	607fa849d2	Some style fixes for getfstat(2)-related code. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-01-05 17:03:35 +00:00
Mark Johnston	ec492b13f1	Add a small allocator for exec_map entries. Upon each execve, we allocate a KVA range for use in copying data to the new image. Pages must be faulted into the range, and when the range is freed, the backing pages are freed and their mappings are destroyed. This is a lot of needless overhead, and the exec_map management becomes a bottleneck when many CPUs are executing execve concurrently. Moreover, the number of available ranges is fixed at 16, which is insufficient on large systems and potentially excessive on 32-bit systems. The new allocator reduces overhead by making exec_map allocations persistent. When a range is freed, pages backing the range are marked clean and made easy to reclaim. With this change, the exec_map is sized based on the number of CPUs. Reviewed by: kib MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D8921	2017-01-05 01:44:12 +00:00
Mark Johnston	eeeaa7ba22	Sort includes in kern_exec.c. MFC after: 1 week	2017-01-05 01:28:08 +00:00
Gleb Smirnoff	bfc8c24c73	Move bogus_page declaration to vm_page.h and initialization to vm_page.c. Reviewed by: kib	2017-01-04 22:27:19 +00:00
Konstantin Belousov	6c4338f2ef	The callers of kern_getfsstat(UIO_SYSSPACE) expect that buf always returns memory which must be freed, regardless of the error. Assign NULL to buf in case we are not going to allocate any memory due to invalid mode. Reported and tested by: pho Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 3 weeks (together with r310638) Differential revision: https://reviews.freebsd.org/D9042	2017-01-04 16:09:45 +00:00
Edward Tomasz Napierala	5ec7cde488	Fix bug that would result in a kernel crash in some cases involving a symlink and an autofs mount request. The crash was caused by namei() calling bcopy() with a negative length, caused by numeric underflow: in lookup(), in the relookup path, the ni_pathlen was decremented too many times. The bug was introduced in r296715. Big thanks to Alex Deiter for his help with debugging this. Reviewed by: kib@ Tested by: Alex Deiter <alex.deiter at gmail.com> MFC after: 1 month	2017-01-04 14:43:57 +00:00
Mateusz Guzik	391df78ad4	mtx: plug open-coded mtx_lock access missed in r311172	2017-01-04 02:25:31 +00:00
Mateusz Guzik	5e5ad162ad	Reduce lock accesses in thread lock similarly to r311172.	2017-01-03 23:08:11 +00:00
Mateusz Guzik	2604eb9e17	mtx: reduce lock accesses Instead of spuriously re-reading the lock value, read it once. This change also has a side effect of fixing a performance bug: on failed _mtx_obtain_lock, it was possible that re-read would find the lock is unowned, but in this case the primitive would make a trip through turnstile code. This is diff reduction to a variant which uses atomic_fcmpset. Discussed with: jhb (previous version) Tested by: pho (previous version)	2017-01-03 21:36:15 +00:00
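The "read it once" pattern from 2604eb9e17 can be sketched with C11 atomics. This is an illustration under assumptions, not the mtx code: `struct toy_lock` and `toy_lock_try` are hypothetical, and `atomic_compare_exchange_weak_explicit` stands in for `atomic_fcmpset` because both write the observed value back into the caller's variable when the comparison fails.

```c
#include <stdatomic.h>
#include <stdint.h>

#define	TOY_UNOWNED	((uintptr_t)0)

/* Hypothetical lock word: 0 when free, otherwise the owner's id. */
struct toy_lock {
	_Atomic uintptr_t word;
};

/* Try to take the lock at most `tries` times; returns 1 on success. */
static int
toy_lock_try(struct toy_lock *l, uintptr_t self, int tries)
{
	uintptr_t v;

	/* One initial read of the lock word. */
	v = atomic_load_explicit(&l->word, memory_order_relaxed);
	while (tries-- > 0) {
		if (v != TOY_UNOWNED) {
			/* Owned: a real lock would spin or block here. */
			v = atomic_load_explicit(&l->word,
			    memory_order_relaxed);
			continue;
		}
		/*
		 * On failure the current value is stored into v, so the
		 * next iteration needs no separate re-read.
		 */
		if (atomic_compare_exchange_weak_explicit(&l->word, &v, self,
		    memory_order_acquire, memory_order_relaxed))
			return (1);
	}
	return (0);
}
```

Because the failed exchange already returns the fresh value, the loop can act on it directly instead of reading the lock word again.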
Konstantin Belousov	7ee34a31fd	There is no need to use temporary statfs buffer for fsid obliteration and prison enforcement. Do it on the caller buffer directly. Besides eliminating memory copies, this change also removes large structure from the kernel stack. Extracted from: ino64 work by gleb Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-01-02 18:59:23 +00:00
Konstantin Belousov	b961dc3193	Style. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-01-02 18:49:48 +00:00
Konstantin Belousov	f2af4041fa	Move common code from kern_statfs() and kern_fstatfs() into a new helper. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-01-02 18:20:22 +00:00
Mark Johnston	b5442eba5c	Factor out instances of a knote detach followed by a knote_drop() call. Reviewed by: kib (previous version) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D9015	2017-01-02 01:23:21 +00:00
Sean Bruno	1248952a50	2017 IFLIB updates in preparation for commits to e1000 and ixgbe. - iflib - add checksum in place support (mmacy) - iflib - initialize IP for TSO (going to be needed for e1000) (mmacy) - iflib - move isc_txrx from shared context to softc context (mmacy) - iflib - Normalize checks in TXQ drainage. (shurd) - iflib - Fix queue capping checks (mmacy) - iflib - Fix invalid assert, em can need 2 sentinels (mmacy) - iflib - let the driver determine what capabilities are set and what tx csum flags are used (mmacy) - add INVARIANTS debugging hooks to gtaskqueue enqueue (mmacy) - update bnxt(4) to support the changes to iflib (shurd) Some other various, sundry updates. Slightly more verbose changelog: Submitted by: mmacy@nextbsd.org Reviewed by: shurd MFC after: Sponsored by: LimeLight Networks and Dell EMC Isilon	2017-01-02 00:56:33 +00:00
Mateusz Guzik	d4db49c4c7	fd: access openfiles once in falloc_noinstall This is similar to what's done with nprocs. Note this is only a band aid.	2017-01-01 08:55:28 +00:00
Mateusz Guzik	41b0046a4d	vfs: switch nodes_created, recycles_count and free_owe_inact to counter(9) Reviewed by: kib	2016-12-31 19:59:31 +00:00
Mateusz Guzik	0b3b55a0f2	Remove cpu_spinwait after seq_consistent. It does not add any benefit as the read routine will do it as necessary.	2016-12-30 06:26:17 +00:00
Mateusz Guzik	4938d86764	cache: sprinkle __predict_false	2016-12-29 16:35:49 +00:00
Mateusz Guzik	b37707533e	cache: move shrink lock init to nchinit This gets rid of unnecessary sysinit usage. While here also rename the lock to be consistent with the rest.	2016-12-29 12:01:54 +00:00
Mateusz Guzik	0569bc9ca9	cache: depessimize hashing macros/inlines All hash sizes are power-of-2, but the compiler does not know that for sure and 'foo % size' forces doing a division. Store the size - 1 and use 'foo & hash' instead which allows mere shift.	2016-12-29 08:41:25 +00:00
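The hashing change in 0569bc9ca9 relies on a standard identity: for a power-of-2 table size, `hash % size` equals `hash & (size - 1)`. A minimal sketch, with hypothetical helper names that are not taken from the name cache code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: true when n is a nonzero power of two. */
static int
toy_is_pow2(uint32_t n)
{
	return (n != 0 && (n & (n - 1)) == 0);
}

/*
 * Bucket index via modulo. The compiler cannot assume that size is a
 * power of two, so it has to emit a division.
 */
static uint32_t
toy_bucket_mod(uint32_t hash, uint32_t size)
{
	return (hash % size);
}

/* Bucket index via a stored mask (size - 1): one bitwise AND. */
static uint32_t
toy_bucket_mask(uint32_t hash, uint32_t mask)
{
	assert(toy_is_pow2(mask + 1));
	return (hash & mask);
}
```

Storing `size - 1` once at table creation means every lookup pays only for the AND.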
Mateusz Guzik	6dd9661b77	cache: drop the NULL check from VP2VNODELOCK Now that negative entries are annotated with a dedicated flag, NULL vnodes are no longer passed.	2016-12-29 08:34:50 +00:00
John Baldwin	1fabda45c3	Regen after r310638. Differential Revision: https://reviews.freebsd.org/D8854	2016-12-27 20:22:17 +00:00
John Baldwin	34ed0c63c8	Rename the 'flags' argument to getfsstat() to 'mode' and validate it. This argument is not a bitmask of flags, but only accepts a single value. Fail with EINVAL if an invalid value is passed to 'flag'. Rename the 'flags' argument to getmntinfo(3) to 'mode' as well to match. This is a followup to r308088. Reviewed by: kib MFC after: 1 month	2016-12-27 20:21:11 +00:00
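The validation described in 34ed0c63c8 treats the argument as a single value rather than a bitmask. A minimal sketch of that idea, assuming stand-in constants: `TOY_WAIT` and `TOY_NOWAIT` are hypothetical placeholders for the real mode values, and `toy_check_mode` is not the kernel function.

```c
#include <errno.h>

/* Hypothetical stand-ins for the accepted mode values. */
enum toy_mode {
	TOY_WAIT = 1,
	TOY_NOWAIT = 2
};

/* Returns 0 for an accepted mode, EINVAL for anything else. */
static int
toy_check_mode(int mode)
{
	switch (mode) {
	case TOY_WAIT:
	case TOY_NOWAIT:
		return (0);
	default:
		/* Combinations such as TOY_WAIT | TOY_NOWAIT are rejected. */
		return (EINVAL);
	}
}
```

A `switch` over the exact values makes it explicit that OR-ing two modes together is not a valid request.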
Konstantin Belousov	fd30dd7c26	Make knote KN_INFLUX state counted. This is final fix for the issue closed by r310302 for knote(). If KN_INFLUX | KN_SCAN flags are set for the note passed to knote() or knote_fork(), i.e. the knote is scanned, we might erroneously clear INFLUX when finishing notification. For normal knote() it was fixed in r310302 simply by remembering the fact that we do not own KN_INFLUX, since there we own knlist lock and scan thread cannot clear KN_INFLUX until we drop the lock. For knote_fork(), the situation is more complicated, we must drop knlist lock AKA the process lock, since we need to register new knotes. Change KN_INFLUX into counter and allow shared ownership of the in-flux state between scan and knote_fork() or knote(). Both in-flux setters need to ensure that knote is not dropped in parallel. Added assert about kn_influx == 1 in knote_drop() verifies that in-flux state is not shared when knote is destroyed. Since KBI of the struct knote is changed by addition of the int kn_influx field, reorder kn_hook and kn_hookid to fill pad on LP64 arches [1]. This keeps sizeof(struct knote) to same 128 bytes as it was before addition of kn_influx, on amd64. Reviewed by: markj Suggested by: markj [1] Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D8898	2016-12-26 19:33:40 +00:00
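The move from a flag to a counter in fd30dd7c26 can be shown with a toy model. This sketch assumes a simplified structure and omits all locking; `struct toy_note` and its helpers are hypothetical and only demonstrate why a counter lets two parties share the in-flux state.

```c
#include <assert.h>

/* Hypothetical note with a counted in-flux state instead of one bit. */
struct toy_note {
	int influx;
};

/* Each party that needs the note kept alive takes one reference. */
static void
toy_influx_enter(struct toy_note *n)
{
	n->influx++;
}

/* Dropping a reference never clears state owned by another party. */
static void
toy_influx_leave(struct toy_note *n)
{
	assert(n->influx > 0);
	n->influx--;
}

static int
toy_in_flux(const struct toy_note *n)
{
	return (n->influx > 0);
}
```

With a single bit, the first party to finish would clear the state while the second still depended on it. With a counter, the note stays in flux until the last holder leaves.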
Konstantin Belousov	5c36b2e8cb	Change knlist_destroy() to assert that knlist is empty instead of accepting the wrong state and printing warning. Do not obliterate kl_lock and kl_unlock pointers, they are often useful for post-mortem analysis. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks X-Differential revision: https://reviews.freebsd.org/D8898	2016-12-26 19:28:10 +00:00
Konstantin Belousov	34311568dc	Style. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week X-Differential revision: https://reviews.freebsd.org/D8898	2016-12-26 19:26:40 +00:00
Konstantin Belousov	fc05543fa7	Some optimizations for kqueue timers. There is no need to do two allocations per kqueue timer. Gather all data needed by the timer callout into the structure and allocate it at once. Use the structure to preserve the result of timer2sbintime(), to not perform repeated 64bit calculations in callout. Remove tautological casts. Remove now unused p_nexttime [1]. Noted by: markj [1] Reviewed by: markj (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week X-MFC note: do not remove p_nexttime Differential revision: https://reviews.freebsd.org/D8901	2016-12-25 19:49:35 +00:00
Konstantin Belousov	7611b72816	Some style. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week X-Differential revision: https://reviews.freebsd.org/D8901	2016-12-25 19:38:07 +00:00
Mark Johnston	eab80d9276	Add a comment explaining the race fixed by r310423. Suggested and reviewed by: jhb X-MFC With: r310423	2016-12-23 05:02:17 +00:00
Mark Johnston	aa3c544349	Revert part of r300109. The removal of TAILQ_FOREACH_SAFE introduced a small race: when the last thread on a sleepqueue is awoken, it reclaims the sleepqueue and may begin executing on a different CPU before sleepq_resume_thread() returns. This leaves a window during which it may go back to sleep and incorrectly be awoken again by the caller of sleepq_broadcast(). Reported and tested by: pho MFC after: 3 days Sponsored by: Dell EMC Isilon	2016-12-22 17:51:44 +00:00
John Baldwin	99bc7e4123	Don't spin in pause() during early boot for kthreads other than thread0. pause() uses a spin loop to simulate a sleep during early boot. However, we only need this for thread0 to get far enough in the boot process to enable timers (at which point pause() can sleep). For other kthreads, sleeping in pause() is ok as the callout will be scheduled and will eventually fire once thread0 initializes timers. Tested by: Steven Kargl Sleuthing by: markj MFC after: 1 week Sponsored by: Netflix	2016-12-20 19:44:44 +00:00
Konstantin Belousov	4afd808be7	Do not clear KN_INFLUX when not owning influx state. For notes in KN_INFLUX|KN_SCAN state, the influx bit is set by a parallel scan. When knote() reports event for the vnode filters, which require kqueue unlocked, it unconditionally sets and then clears influx to keep note around kqueue unlock. There, do not clear influx flag if a scan set it, since we do not own it, instead we prevent scan from executing by holding knlist lock. The knote_fork() function has somewhat similar problem, it might set KN_INFLUX for scanned note, drop kqueue and list locks, and then clear the flag after relock. A solution there would be different enough, as well as the test program, so close the reported issue first. Reported and test case provided by: yjh0502@gmail.com PR: 214923 Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-12-19 22:18:36 +00:00
Konstantin Belousov	69baec3619	Switch from stdatomic.h to atomic.h for kernel. Apparently stdatomic.h implementation for gcc 4.2 on sparc64 does not work properly. This effectively reverts r251803. Reported and tested by: lidl Discussed with: ed Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-12-16 17:41:20 +00:00
Ed Schouten	669a25b50d	Document the existence of the {0, 6, ...} sysctl.	2016-12-15 15:45:11 +00:00
Jilles Tjoelker	b9a6fb9343	reaper: Make REAPER_KILL_SUBTREE actually work. MFC after: 2 weeks	2016-12-14 22:49:20 +00:00
Ed Schouten	ae15715360	Add a "device_index" label to all sysctls under dev.$driver.$index. This way it becomes possible to graph a property for all instances of a single driver. For example, graphing the number of packets across all USB controllers, the amount of dropped packets on all NICs, etc. Reviewed by: cem Differential Revision: https://reviews.freebsd.org/D8775	2016-12-14 13:03:01 +00:00
Ed Schouten	fd0f59709d	Add labels to sysctls related to clocks. Sysctls like kern.eventtimer.et.*.quality currently embed the name of the clock device. This is problematic for the Prometheus metrics exporter for two reasons: - Some of those clocks have dashes in their names, which Prometheus doesn't allow to be used in metric names. - It doesn't allow for extracting the same property of all clocks on the system from within a single query. Attach these nodes to have a label, so that the Prometheus metrics exporter gives these metrics a uniform name with the name of the clock attached as a label. Reviewed by: cem Differential Revision: https://reviews.freebsd.org/D8775	2016-12-14 12:56:58 +00:00

15221 Commits