freebsd-dev

Author	SHA1	Message	Date
Konstantin Belousov	ea117d1735	The vinactive() call in vgonel() may start writes for the dirty pages, creating delayed write buffers belonging to the reclaimed vnode. Put the buffer cleanup code after inactivation. Add asserts that ensure that buffer queues are empty and add BO_DEAD flag for bufobj to check that no buffers are added after the cleanup. BO_DEAD is only used by INVARIANTS-enabled kernels. Reported and tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-13 16:02:37 +00:00
Konstantin Belousov	fe21241ee0	For architectures where time_t is wide enough, in particular, 64bit platforms, avoid overflow after year 2038 in clock_ct_to_ts(). PR: 195868 Reviewed by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-12 09:37:18 +00:00
Konstantin Belousov	b2344ab5ff	Do not call VFS_SYNC() before VFS_UNMOUNT() for forced unmount. Since VFS does not/cannot stop writes, sync might run indefinitely, or be a wrong thing to do at all. E.g. NFS ignores VFS_SYNC() for forced unmounts, since non-responding server does not allow sync to finish. On the other hand, filesystems can and do stop writes using fs-specific facilities, and should already fully flush caches in VFS_UNMOUNT() due to the race. Adjust msdosfs to sync in unmount for forced call, to accommodate the new behaviour. Note that it is still racy, since writes are not stopped. Discussed with: avg, bjk, mckusick Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2014-12-09 10:00:47 +00:00
Konstantin Belousov	a77c72f5ae	Apply chunk forgotten in r275620. Remove local variable for real. CID: 1257462 Sponsored by: The FreeBSD Foundation	2014-12-09 09:36:28 +00:00
Konstantin Belousov	a25100c539	Add functions syncer_suspend() and syncer_resume(), which are supposed to be called before suspension and after resume, correspondingly. The syncer_suspend() ensures that all filesystems dirty data and metadata are saved to the permanent storage, and stops kernel threads which might modify filesystems. The syncer_resume() restores stopped threads. For now, only syncer is stopped. This is needed, because each sync loop causes superblock updates for UFS. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-08 16:48:57 +00:00
Konstantin Belousov	904ed548bb	When getnewbuf_reuse_bp() is called to reclaim some (clean) buffer, the vnode owning the buffer is not locked. More, it cannot be locked safely, since getnewbuf_reuse_bp() is called from newbuf(), and some other vnode is already locked, for which reused buffer will be reassigned. As the consequence, reclamation of the owning vnode could go in parallel, in particular, the call to vnode_destroy_vobject(), which deallocates the vm object and zeroes the v_bufobj->bo_object. Note that the pages wired by the buffer are left wired and can be safely freed by the vfs_vmio_release() without the need for the vm object lock. Also, seeing stale pointer to the v_object is safe due to vm object type stability. Check for bo_bufobj != NULL and cache the value in local variable to avoid trying to lock NULL vm object. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-08 16:42:34 +00:00
Konstantin Belousov	07a9368a48	Do some refactoring and minor cleanups of the thread_single() code in preparation for the global stop commit. Move the code to weed suspended or sleeping threads into the appropriate state, into the helper weed_inhib(). Current code already has deep nesting and hard to follow [1]. Add currently useless helper remain_for_mode(), which returns the count of threads which are allowed to run, according to the single-threading mode. In thread_single_end(), do not save curthread into local variable, it is unused after, except to find curproc. Remove stray empty line. Requested by: avg [1] Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-08 16:27:43 +00:00
Konstantin Belousov	8638fe7bea	Thread waiting for the vfork(2)-ed child to exec or exit, must allow for the suspension. Currently, the loop performs uninterruptible cv_wait(9) call, which prevents suspension until child allows further execution of parent. If child is stopped, suspension or single-threading is delayed indefinitely. Create a helper thread_suspend_check_needed() to identify the need for a call to thread_suspend_check(). It is required since call to the thread_suspend_check() cannot be safely done while owning the child (p2) process lock. Only when suspension is needed, drop p2 lock and call thread_suspend_check(). Perform wait for cv with timeout, in case suspend is requested after wait started; I do not see a better way to interrupt the wait. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-08 16:18:05 +00:00
Konstantin Belousov	aba1ca528e	When process is exiting, check for suspension regardless of multithreaded status of the process. The stopped state must be cleared before P_WEXIT is set. A stop signal delivered just before first PROC_LOCK() block in exit1(9) would put the process into pending stop with P_WEXIT set or assertion triggered. Also recheck for the suspension after failed thread_single(9) call, since process lock could be dropped. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-08 16:02:02 +00:00
Andriy Gapon	036a8c5dac	remove opensolaris cyclic code, replace with high-precision callouts In the old days callout(9) had 1 tick precision and that was inadequate for some uses, e.g. DTrace profile module, so we had to emulate cyclic API and behavior. Now we can directly use callout(9) in the very few places where cyclic was used. Differential Revision: https://reviews.freebsd.org/D1161 Reviewed by: gnn, jhb, markj MFC after: 2 weeks	2014-12-07 11:21:41 +00:00
Warner Losh	d0b6da086f	Const poison in a few places to ensure we don't modify things through the module data pointer.	2014-12-03 22:14:13 +00:00
John Baldwin	b10c08a52b	Revert device_getenv_int() for now as it duplicates resource_int_value(). We should perhaps implement a device_getenv_<type>() and device_setenv_<type>() API as a convenience wrapper on top of resource_<type>_value() and resource_set_<type>().	2014-12-03 15:29:53 +00:00
Konstantin Belousov	6afb32fc67	Disable recursion for the process spinlock. Tested by: pho Discussed with: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 month	2014-12-01 17:36:10 +00:00
Justin T. Gibbs	2c6bf3d90b	Remove trailing whitespace.	2014-11-30 19:32:00 +00:00
Gleb Smirnoff	c80ea19b38	Merge from projects/sendfile: Provide pru_ready for AF_LOCAL sockets. Local sockets send data directly to the receive buffer of the peer, thus pru_ready also works on the peer socket. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-11-30 13:40:58 +00:00
Gleb Smirnoff	651e4e6a30	Merge from projects/sendfile: extend protocols API to support sending not ready data: o Add new flag to pru_send() flags - PRUS_NOTREADY. o Add new protocol method pru_ready(). Sponsored by: Nginx, Inc. Sponsored by: Netflix	2014-11-30 13:24:21 +00:00
Gleb Smirnoff	0f9d0a73a4	Merge from projects/sendfile: o Introduce a notion of "not ready" mbufs in socket buffers. These mbufs are now being populated by some I/O in background and are referenced outside. This forces following implications: - An mbuf which is "not ready" can't be taken out of the buffer. - An mbuf that is behind a "not ready" in the queue neither. - If socket buffer is flushed, then "not ready" mbufs shouldn't be freed. o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc. The sb_ccc stands for "claimed character count", or "committed character count". And the sb_acc is "available character count". Consumers of socket buffer API shouldn't already access them directly, but use sbused() and sbavail() respectively. o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones with M_BLOCKED. o New field sb_fnrdy points to the first not ready mbuf, to avoid linear search. o New function sbready() is provided to activate certain amount of mbufs in a socket buffer. A special note on SCTP: SCTP has its own sockbufs. Unfortunately, FreeBSD stack doesn't yet allow protocol specific sockbufs. Thus, SCTP does some hacks to make itself compatible with FreeBSD: it manages sockbufs on its own, but keeps sb_cc updated to inform the stack of amount of data in them. The new notion of "not ready" data isn't supported by SCTP. Instead, only a mechanical substitute is done: s/sb_cc/sb_ccc/. A proper solution would be to take away struct sockbuf from struct socket and allow protocols to implement their own socket buffers, like SCTP already does. This was discussed with rrs@. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-11-30 12:52:33 +00:00
Gleb Smirnoff	57f43a45a3	- Move sbcheck() declaration under SOCKBUF_DEBUG. - Improve SOCKBUF_DEBUG macros. - Improve sbcheck(). Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-11-30 11:22:39 +00:00
Gleb Smirnoff	8967b220a3	Make sballoc() and sbfree() functions. Ideally, they could be marked as static, but unfortunately Infiniband (ab)uses them. Sponsored by: Nginx, Inc.	2014-11-30 11:02:07 +00:00
Warner Losh	fac92ae126	The current limit of 100k for the linker hints file is getting a bit crowded as we now are at about 70k. Bump the limit to 1MB instead which is still quite a reasonable limit and allows for future growth of this file and possible future expansion to additional data. MFC After: 2 weeks	2014-11-29 17:29:30 +00:00
Konstantin Belousov	6762091ea4	Remove lock recursion for the pipe pair mutex, and disable the recursion on mutex initialization. The only places where the recursive acquire is performed are read and write filters, since knlist, which uses the pipe pair mutex as lock, is locked when filter is called. The recursion was added in r93296, and consistent locking for kn_fop->f_event() introduced in r133741. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 month	2014-11-29 17:18:20 +00:00
Konstantin Belousov	70778bba03	Assert the state of the process lock and sigact mutex in kern_sigprocmask() and reschedule_signals(). Discussed with: rea Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-11-28 10:20:00 +00:00
Hans Petter Selasky	50ae6690fc	Style changes: - Move two IOCTL related defines to the top of the C-file - Add more comments describing the recently added IOCTL small size and small align macros	2014-11-28 09:32:07 +00:00
Alfred Perlstein	56c14bca7e	Make igb and ixgbe check tunables at probe time. This allows one to make a kernel module to tune the number of queues before the driver loads. This is needed so that a module at SI_SUB_CPU can set tunables for these drivers to take. Otherwise getenv is called too early by the TUNABLE macros. Reviewed by: smh Phabric: https://reviews.freebsd.org/D1149	2014-11-26 20:19:36 +00:00
Konstantin Belousov	5c7bebf961	The process spin lock currently has the following distinct uses: - Threads lifetime cycle, in particular, counting of the threads in the process, and interlocking with process mutex and thread lock. The main reason of this is that turnstile locks are after thread locks, so you e.g. cannot unlock blockable mutex (think process mutex) while owning thread lock. - Virtual and profiling itimers, since the timers activation is done from the clock interrupt context. Replace the p_slock by p_itimmtx and PROC_ITIMLOCK(). - Profiling code (profil(2)), for similar reason. Replace the p_slock by p_profmtx and PROC_PROFLOCK(). - Resource usage accounting. Need for the spinlock there is subtle, my understanding is that spinlock blocks context switching for the current thread, which prevents td_runtime and similar fields from changing (updates are done at the mi_switch()). Replace the p_slock by p_statmtx and PROC_STATLOCK(). The split is done mostly for code clarity, and should not affect scalability. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-11-26 14:10:00 +00:00
Konstantin Belousov	e442f29f08	Fix SA_SIGINFO | SA_RESETHAND handling. The sysent' sv_sendsig() method needs pre-reset state of the ps_siginfo to correctly construct signal frame. Move sigdflt() call after the sv_sendsig() invocation in postsig(). Simultaneously extract common code from trapsignal() and postsig() into new helper postsig_done(). Submitted by: rea MFC after: 1 week	2014-11-26 14:09:04 +00:00
John Baldwin	a2d751936b	Add a bus_get_domain() wrapper around BUS_GET_DOMAIN(). Use this to add a new per-device '%domain' sysctl node that returns the NUMA domain a device is associated with if it is associated with one. Note that this API is still a WIP and might change before 11.0 actually ships. Differential Revision: https://reviews.freebsd.org/D930 Reviewed by: kib, adrian	2014-11-24 19:55:45 +00:00
John Baldwin	20abb66ede	Properly initialize the capability rights for vnodes exported to procstat that aren't for file descriptors (cwd, jdir, tracevp, etc.). Submitted by: Mikhail <mp@lenta.ru>	2014-11-24 18:34:11 +00:00
Gleb Smirnoff	90effb2341	Merge from projects/sendfile: o Provide a new VOP_GETPAGES_ASYNC(), which works like VOP_GETPAGES(), but doesn't sleep. It returns immediately, and will execute the I/O done handler function that must be supplied as argument. o Provide VOP_GETPAGES_ASYNC() for the FFS, which uses vnode_pager. o Extend pagertab to support pgo_getpages_async method, and implement this method for vnode_pager. Reviewed by: kib Tested by: pho Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-11-23 12:01:52 +00:00
Mateusz Guzik	dff9862c0e	ifdef RACCT ui_racct_foreach and struct uidinfo's ui_racct Change racct_ create and destroy to macros evaluating to nothing without RACCT so that their callers passing ui_racct don't have to be ifdefed.	2014-11-23 08:25:44 +00:00
Mateusz Guzik	0c0d16e8ac	filedesc: plug a test for impossible condition in fgetvp_rights	2014-11-23 00:12:27 +00:00
Konstantin Belousov	64779280c9	The size value should be asserted when it is known. Reported and tested by: pho Sponsored by: The FreeBSD Foundation	2014-11-22 18:15:02 +00:00
John Baldwin	180e57e5c7	Improve support for XSAVE with debuggers. - Dump an NT_X86_XSTATE note if XSAVE is in use. This note is designed to match what Linux does in that 1) it dumps the entire XSAVE area including the fxsave state, and 2) it stashes a copy of the current xsave mask in the unused padding between the fxsave state and the xstate header at the same location used by Linux. - Teach readelf() to recognize NT_X86_XSTATE notes. - Change PT_GET/SETXSTATE to take the entire XSAVE state instead of only the extra portion. This avoids having to always make two ptrace() calls to get or set the full XSAVE state. - Add a PT_GET_XSTATE_INFO which returns the length of the current XSTATE save area (so the size of the buffer needed for PT_GETXSTATE) and the current XSAVE mask (%xcr0). Differential Revision: https://reviews.freebsd.org/D1193 Reviewed by: kib MFC after: 2 weeks	2014-11-21 20:53:17 +00:00
Gleb Smirnoff	67af272bcf	Do not allocate zero-length mbuf in sosend_generic(). Found by: pho Sponsored by: Nginx, Inc.	2014-11-19 14:27:38 +00:00
Zbigniew Bodek	dc61566f95	Stop using early_putc immediately after configuring console with cninit() Early UART should be released right after system console initialization is completed. Otherwise, after cninit() both early and system console coexist what may lead to various issues (i.a. writing to unmapped early UART address). This cannot be done in cninit_finish() since it can be called late at the end of MI configuration. Obtained from: Semihalf Reviewed by: andrew Sponsored by: The FreeBSD Foundation	2014-11-19 14:23:29 +00:00
Warner Losh	40e6bdaf1e	opt_global.h is included automatically in the build. No need to explicitly include it in these places. Sponsored by: Netflix	2014-11-18 17:06:56 +00:00
John-Mark Gurney	2c30bc1fcf	prevent doing filter ops locking for statically compiled filter ops... This significantly reduces lock contention when adding/removing knotes on busy multi-kq system... Next step is to cache these references per kq.. i.e. kq refs it once and keeps a local ref count so that the same refs don't get accessed by many cpus... only allocate a knote when we might use it... Add a new flag, _FORCEONESHOT.. This allows a thread to force the delivery of another event in a safe manner, say waking up an idle http connection to force it to be reaped... If we are _DISABLE'ing a knote, don't bother to call f_event on it, it's disabled, so won't be delivered anyways.. Tested by: adrian	2014-11-16 01:18:41 +00:00
Gleb Smirnoff	8146bcfea1	- Use NULL to compare a pointer. - Use KASSERT() instead of panic. - Remove useless 'continue', no need to restart cycle here. Sponsored by: Nginx, Inc.	2014-11-14 15:44:19 +00:00
Gleb Smirnoff	6bf6b25e88	Merge from projects/sendfile: Use sbcut_locked() instead of manually editing a sockbuf. Sponsored by: Nginx, Inc.	2014-11-14 15:33:40 +00:00
Konstantin Belousov	5fab60a071	In vfs_write_suspend_umnt(), if suspension cannot be established, do not forget to restore write ops count when returning the error. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-11-14 11:31:10 +00:00
Gleb Smirnoff	f274e25659	There should not be zero length mbufs in socket buffers. The code comes from r1451, and thus can't be explained. A patch with explicit panic() here survived all tests. Tested by: pho Sponsored by: Nginx, Inc.	2014-11-14 06:02:29 +00:00
Jung-uk Kim	db1ec81edd	Correct a typo to fix chown(2). It was broken since r274476. Pointy hat to: kib X-MFC-With: r274476	2014-11-13 23:51:13 +00:00
Mateusz Guzik	eb48fbd963	filedesc: fixup fdinit to lock fdp and prepare files conditionally Not all consumers providing fdp to copy from want files. Perhaps these functions should be reorganized to better express the outcome. This fixes up panics after r273895. Reported by: markj	2014-11-13 21:15:09 +00:00
Konstantin Belousov	416be7a1c6	Fix assertion, &uc->uc_busy is never zero, the intent is to test the uc_busy value, and not its address [1]. Remove the single use of the macro, write KASSERT() explicitly in the code of umtxq_sleep_pi(). Submitted by: Eric van Gyzen <eric@vangyzen.net> [1] MFC after: 1 week	2014-11-13 18:51:09 +00:00
Konstantin Belousov	6e646651d3	Remove the no-at variants of the kern_xx() syscall helpers. E.g., we have both kern_open() and kern_openat(); change the callers to use kern_openat(). This removes one (sometimes two) levels of indirection and consolidates arguments checks. Reviewed by: mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-11-13 18:01:51 +00:00
Konstantin Belousov	e64b4fa858	Do not try to dereference thread pointer when the value is not a pointer. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-11-13 17:44:35 +00:00
Konstantin Belousov	f2c1a52afb	Remove fossil. It has been present in 4.4Lite2, but its use was removed for some time. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-11-13 17:43:37 +00:00
Dmitry Chagin	c28d9d0f9f	Regen for r274462.	2014-11-13 05:28:06 +00:00
Dmitry Chagin	186d9c3473	Add the ppoll() system call. Export kern_poll() needed by an upcoming Linuxulator change. Differential Revision: https://reviews.freebsd.org/D1133 Reviewed by: kib, wblock MFC after: 1 month	2014-11-13 05:26:14 +00:00
Konstantin Belousov	389a25c716	For posix_fallocate(2) and posix_fadvise(2), return ESPIPE when underlying file does not have DFLAG_SEEKABLE set [1]. For posix_fallocate(2), simplify error handling logic. Do return when fp is not yet referenced. Noted by: bde [1] Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-11-12 17:31:38 +00:00

14012 Commits