freebsd-nq

Author	SHA1	Message	Date
Mateusz Guzik	84267cacf4	sysctl: don't modify oid_running for static nodes It is necessary to prevent nodes from being destroyed while used, but static ones cannot be destroyed.	2014-12-28 19:24:01 +00:00
Rick Macklem	46b34e5adc	Fix the comment introduced in r276192 so that it clearly states that the change is needed to avoid a deadlock. Suggested by: kib MFC after: 1 week	2014-12-25 14:44:04 +00:00
Rick Macklem	65cb225c5e	Modify vop_stdadvlock{async}() so that it only locks/unlocks the vnode and does a VOP_GETATTR() for the SEEK_END case. This is safe to do, since lf_advlock{async}() only uses the size argument for the SEEK_END case. The NFSv4 server needs this when vfs.nfsd.enable_locallocks!=0 since locking the vnode results in a LOR that can cause a deadlock for the nfsd threads. Reviewed by: kib MFC after: 1 week	2014-12-24 22:58:08 +00:00
Gleb Smirnoff	53b680caa2	In sbappend*() family of functions clear M_PROTO flags of incoming mbufs. sbappendstream() already does this in m_demote(). PR: 196174 Sponsored by: Nginx, Inc.	2014-12-22 15:39:24 +00:00
Konstantin Belousov	8ee9765a9d	Add VN_OPEN_NAMECACHE flag for vn_open_cred(9), which requests that the created file name was cached. Use the flag for core dumps. Requested by: rpaulo Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-12-21 13:32:07 +00:00
Warner Losh	61f26cae7d	Where appropriate, use the modern terms for the one true time base (UTC) rather than the archaic (GMT) in comments. Except where the comments are making fun of people doing this (and pedants who insist on the new terms).	2014-12-21 05:07:11 +00:00
Gleb Smirnoff	e834a84026	Revert r274494, r274712, r275955 and provide extra comments explaining why there could appear a zero-sized mbufs in socket buffers. A proper fix would be to divorce record socket buffers and stream socket buffers, and divorce pru_send that accepts normal data from pru_send that accepts control data.	2014-12-20 22:12:04 +00:00
Gleb Smirnoff	b7413232f6	Add to sbappendstream_locked() a check against NULL mbuf, like it is done in sbappend_locked() and sbappendrecord_locked(). This is a quick fix to the panic introduced by r274712. A proper solution should be to make sosend_generic() avoid calling pru_send() with NULL mbuf for the protocols that do not understand control messages. Those protocols that understand control messages, should be able to receive NULL mbuf, if control is non-NULL.	2014-12-20 14:19:46 +00:00
Konstantin Belousov	6c21f6edb8	The VOP_LOOKUP() implementations for CREATE op do not put the name into namecache, to avoid cache trashing when doing large operations. E.g., tar archive extraction is not usually followed by access to many of the files created. Right now, each VOP_LOOKUP() implementation explicitely knowns about this quirk and tests for both MAKEENTRY flag presence and op != CREATE to make the call to cache_enter(). Centralize the handling of the quirk into VFS, by deciding to cache only by MAKEENTRY flag in VOP. VFS now sets NOCACHE flag for CREATE namei() calls. Note that the change in semantic is backward-compatible and could be merged to the stable branch, and is compatible with non-changed third-party filesystems which correctly handle MAKEENTRY. Suggested by: Chris Torek <torek@pi-coral.com> Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-12-18 10:01:12 +00:00
Gleb Kurtsou	dde58752db	Adjust printf format specifiers for dev_t and ino_t in kernel. ino_t and dev_t are about to become uint64_t. Reviewed by: kib, mckusick	2014-12-17 07:27:19 +00:00
Konstantin Belousov	5b73811feb	Add missed break. CID: 1258587 Sponsored by: The FreeBSD Foundation MFC after: 20 days	2014-12-16 09:49:07 +00:00
Konstantin Belousov	917dd39084	Add missed break. CID: 1258586 Sponsored by: The FreeBSD Foundation MFC after: 4 days	2014-12-16 09:48:23 +00:00
John Baldwin	5ad25ceb41	Check for SS_NBIO in so->so_state instead of sb->sb_flags in soreceive_stream(). Differential Revision: https://reviews.freebsd.org/D1299 Reviewed by: bz, gnn MFC after: 1 week	2014-12-15 17:52:08 +00:00
Konstantin Belousov	237623b028	Add a facility for non-init process to declare itself the reaper of the orphaned descendants. Base of the API is modelled after the same feature from the DragonFlyBSD. Requested by: bapt Reviewed by: jilles (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2014-12-15 12:01:42 +00:00
Konstantin Belousov	83d7eb544c	Fix gcc build. Sponsored by: The FreeBSD Foundation MFC after: 13 days	2014-12-14 08:43:13 +00:00
Dmitry Chagin	fd07ddcf6f	Add _NEW flag to mtx(9), sx(9), rmlock(9) and rwlock(9). A _NEW flag passed to _init_flags() to avoid check for double-init. Differential Revision: https://reviews.freebsd.org/D1208 Reviewed by: jhb, wblock MFC after: 1 Month	2014-12-13 21:00:10 +00:00
Konstantin Belousov	6ddcc23386	Add facility to stop all userspace processes. The supposed use of the feature is to quisce the system before suspend. Stop is implemented by reusing the thread_single(9) with the special mode SINGLE_ALLPROC. SINGLE_ALLPROC differs from the existing single-threading modes by allowing (requiring) caller to operate on other process. Interruptible sleeps for !TDF_SBDRY threads are suspended like SIGSTOP does it, instead of aborting the sleep, like SINGLE_NO_EXIT, to avoid spurious EINTRs on resume. Provide debugging sysctl debug.stop_all_proc, which causes total stop and suspends syncer, while waiting for variable reset for resume. It is used for debugging; should be removed after the real use of the interface is added. In collaboration with: pho Discussed with: avg Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-12-13 16:18:29 +00:00
Konstantin Belousov	0061ddb3ed	Only sleep interruptible while waiting for suspension end when filesystem specified VFCF_SBDRY flag, i.e. for NFS. There are two issues with the sleeps. First, applications may get unexpected EINTR from the disk i/o syscalls. Second, interruptible sleep allows the stop of the process, and since mount point is referenced while thread sleeps, unmount cannot free mount point structure' memory, blocking unmount indefinitely. Even for NFS, it is probably only reasonable to enable PCATCH for intr mounts, but this information is currently not available at VFS level. Reported and tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-13 16:07:01 +00:00
Konstantin Belousov	ea117d1735	The vinactive() call in vgonel() may start writes for the dirty pages, creating delayed write buffers belonging to the reclaimed vnode. Put the buffer cleanup code after inactivation. Add asserts that ensure that buffer queues are empty and add BO_DEAD flag for bufobj to check that no buffers are added after the cleanup. BO_DEAD is only used by INVARIANTS-enabled kernels. Reported and tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-13 16:02:37 +00:00
Konstantin Belousov	fe21241ee0	For architectures where time_t is wide enough, in particular, 64bit platforms, avoid overflow after year 2038 in clock_ct_to_ts(). PR: 195868 Reviewed by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-12 09:37:18 +00:00
Konstantin Belousov	b2344ab5ff	Do not call VFS_SYNC() before VFS_UNMOUNT() for forced unmount. Since VFS does not/cannot stop writes, sync might run indefinitely, or be a wrong thing to do at all. E. g. NFS ignores VFS_SYNC() for forced unmounts, since non-responding server does not allow sync to finish. On the other hand, filesystems can and do stop writes using fs-specific facilities, and should already fully flush caches in VFS_UNMOUNT() due to the race. Adjust msdosfs tp sync in unmount for forced call, to accomodate the new behaviour. Note that it is still racy, since writes are not stopped. Discussed with: avg, bjk, mckusick Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2014-12-09 10:00:47 +00:00
Konstantin Belousov	a77c72f5ae	Apply chunk forgotten in r275620. Remove local variable for real. CID: 1257462 Sponsored by: The FreeBSD Foundation	2014-12-09 09:36:28 +00:00
Konstantin Belousov	a25100c539	Add functions syncer_suspend() and syncer_resume(), which are supposed to be called before suspension and after resume, correspondingly. The syncer_suspend() ensures that all filesystems dirty data and metadata are saved to the permanent storage, and stops kernel threads which might modify filesystems. The syncer_resume() restores stopped threads. For now, only syncer is stopped. This is needed, because each sync loop causes superblock updates for UFS. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-08 16:48:57 +00:00
Konstantin Belousov	904ed548bb	When getnewbuf_reuse_bp() is called to reclaim some (clean) buffer, the vnode owning the buffer is not locked. More, it cannot be locked safely, since getnewbuf_reuse_bp() is called from newbuf(), and some other vnode is already locked, for which reused buffer will be reassigned. As the consequence, reclamation of the owning vnode could go in parallel, in particular, the call to vnode_destroy_vobject(), which deallocates the vm object and zeroes the v_bufobj->bo_object. Note that the pages wired by the buffer are left wired and can be safely freed by the vfs_vmio_release() without the need for the vm object lock. Also, seeing stale pointer to the v_object is safe due to vm object type stability. Check for bo_bufobj != NULL and cache the value in local variable to avoid trying to lock NULL vm object. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-08 16:42:34 +00:00
Konstantin Belousov	07a9368a48	Do some refactoring and minor cleanups of the thread_single() code in preparation for the global stop commit. Move the code to weed suspended or sleeping threads into the appropriate state, into the helper weed_inhib(). Current code already has deep nesting and hard to follow [1]. Add currently useless helper remain_for_mode(), which returns the count of threads which are allowed to run, according to the single-threading mode. In thread_single_end(), do not save curthread into local variable, it is unused after, except to find curproc. Remove stray empty line. Requested by: avg [1] Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-08 16:27:43 +00:00
Konstantin Belousov	8638fe7bea	Thread waiting for the vfork(2)-ed child to exec or exit, must allow for the suspension. Currently, the loop performs uninterruptible cv_wait(9) call, which prevents suspension until child allows further execution of parent. If child is stopped, suspension or single-threading is delayed indefinitely. Create a helper thread_suspend_check_needed() to identify the need for a call to thread_suspend_check(). It is required since call to the thread_suspend_check() cannot be safely done while owning the child (p2) process lock. Only when suspension is needed, drop p2 lock and call thread_suspend_check(). Perform wait for cv with timeout, in case suspend is requested after wait started; I do not see a better way to interrupt the wait. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-08 16:18:05 +00:00
Konstantin Belousov	aba1ca528e	When process is exiting, check for suspension regardless of multithreaded status of the process. The stopped state must be cleared before P_WEXIT is set. A stop signal delivered just before first PROC_LOCK() block in exit1(9) would put the process into pending stop with P_WEXIT set or assertion triggered. Also recheck for the suspension after failed thread_single(9) call, since process lock could be dropped. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-12-08 16:02:02 +00:00
Andriy Gapon	036a8c5dac	remove opensolaris cyclic code, replace with high-precision callouts In the old days callout(9) had 1 tick precision and that was inadequate for some uses, e.g. DTrace profile module, so we had to emulate cyclic API and behavior. Now we can directly use callout(9) in the very few places where cyclic was used. Differential Revision: https://reviews.freebsd.org/D1161 Reviewed by: gnn, jhb, markj MFC after: 2 weeks	2014-12-07 11:21:41 +00:00
Warner Losh	d0b6da086f	Const poison in a few places to ensure we don't modify things through the module data pointer.	2014-12-03 22:14:13 +00:00
John Baldwin	b10c08a52b	Revert device_getenv_int() for now as it duplicates resource_int_value(). We should perhaps implement a device_getenv_() and device_setenv_() API as a convenience wrapper on top of resource__value() and resource_set_().	2014-12-03 15:29:53 +00:00
Konstantin Belousov	6afb32fc67	Disable recursion for the process spinlock. Tested by: pho Discussed with: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 month	2014-12-01 17:36:10 +00:00
Justin T. Gibbs	2c6bf3d90b	Remove trailing whitespace.	2014-11-30 19:32:00 +00:00
Gleb Smirnoff	c80ea19b38	Merge from projects/sendfile: Provide pru_ready for AF_LOCAL sockets. Local sockets sendsdata directly to the receive buffer of the peer, thus pru_ready also works on the peer socket. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-11-30 13:40:58 +00:00
Gleb Smirnoff	651e4e6a30	Merge from projects/sendfile: extend protocols API to support sending not ready data: o Add new flag to pru_send() flags - PRUS_NOTREADY. o Add new protocol method pru_ready(). Sponsored by: Nginx, Inc. Sponsored by: Netflix	2014-11-30 13:24:21 +00:00
Gleb Smirnoff	0f9d0a73a4	Merge from projects/sendfile: o Introduce a notion of "not ready" mbufs in socket buffers. These mbufs are now being populated by some I/O in background and are referenced outside. This forces following implications: - An mbuf which is "not ready" can't be taken out of the buffer. - An mbuf that is behind a "not ready" in the queue neither. - If sockbet buffer is flushed, then "not ready" mbufs shouln't be freed. o In struct sockbuf the sb_cc field is split into sb_ccc and sb_acc. The sb_ccc stands for ""claimed character count", or "committed character count". And the sb_acc is "available character count". Consumers of socket buffer API shouldn't already access them directly, but use sbused() and sbavail() respectively. o Not ready mbufs are marked with M_NOTREADY, and ready but blocked ones with M_BLOCKED. o New field sb_fnrdy points to the first not ready mbuf, to avoid linear search. o New function sbready() is provided to activate certain amount of mbufs in a socket buffer. A special note on SCTP: SCTP has its own sockbufs. Unfortunately, FreeBSD stack doesn't yet allow protocol specific sockbufs. Thus, SCTP does some hacks to make itself compatible with FreeBSD: it manages sockbufs on its own, but keeps sb_cc updated to inform the stack of amount of data in them. The new notion of "not ready" data isn't supported by SCTP. Instead, only a mechanical substitute is done: s/sb_cc/sb_ccc/. A proper solution would be to take away struct sockbuf from struct socket and allow protocols to implement their own socket buffers, like SCTP already does. This was discussed with rrs@. Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-11-30 12:52:33 +00:00
Gleb Smirnoff	57f43a45a3	- Move sbcheck() declaration under SOCKBUF_DEBUG. - Improve SOCKBUF_DEBUG macros. - Improve sbcheck(). Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-11-30 11:22:39 +00:00
Gleb Smirnoff	8967b220a3	Make sballoc() and sbfree() functions. Ideally, they could be marked as static, but unfortunately Infiniband (ab)uses them. Sponsored by: Nginx, Inc.	2014-11-30 11:02:07 +00:00
Warner Losh	fac92ae126	The current limit of 100k for the linker hints file is getting a bit crowded as we now are at about 70k. Bump the limit to 1MB instead which is still quite a reasonable limit and allows for future growth of this file and possible future expansion to additional data. MFC After: 2 weeks	2014-11-29 17:29:30 +00:00
Konstantin Belousov	6762091ea4	Remove lock recursion for the pipe pair mutex, and disable the recursion on mutex initialization. The only places where the recursive acquire is performed are read and write filters, since knlist, which uses the pipe pair mutex as lock, is locked when filter is called. The recursion was added in r93296, and consistent locking for kn_fop->f_event() introduced in r133741. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 month	2014-11-29 17:18:20 +00:00
Konstantin Belousov	70778bba03	Assert the state of the process lock and sigact mutex in kern_sigprocmask() and reschedule_signals(). Discussed with: rea Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-11-28 10:20:00 +00:00
Hans Petter Selasky	50ae6690fc	Style changes: - Move two IOCTL related defines to the top of the C-file - Add more comments describing the recently added IOCTL small size and small align macros	2014-11-28 09:32:07 +00:00
Alfred Perlstein	56c14bca7e	Make igb and ixgbe check tunables at probe time. This allows one to make a kernel module to tune the number of queues before the driver loads. This is needed so that a module at SI_SUB_CPU can set tunables for these drivers to take. Otherwise getenv is called too early by the TUNABLE macros. Reviewed by: smh Phabric: https://reviews.freebsd.org/D1149	2014-11-26 20:19:36 +00:00
Konstantin Belousov	5c7bebf961	The process spin lock currently has the following distinct uses: - Threads lifetime cycle, in particular, counting of the threads in the process, and interlocking with process mutex and thread lock. The main reason of this is that turnstile locks are after thread locks, so you e.g. cannot unlock blockable mutex (think process mutex) while owning thread lock. - Virtual and profiling itimers, since the timers activation is done from the clock interrupt context. Replace the p_slock by p_itimmtx and PROC_ITIMLOCK(). - Profiling code (profil(2)), for similar reason. Replace the p_slock by p_profmtx and PROC_PROFLOCK(). - Resource usage accounting. Need for the spinlock there is subtle, my understanding is that spinlock blocks context switching for the current thread, which prevents td_runtime and similar fields from changing (updates are done at the mi_switch()). Replace the p_slock by p_statmtx and PROC_STATLOCK(). The split is done mostly for code clarity, and should not affect scalability. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-11-26 14:10:00 +00:00
Konstantin Belousov	e442f29f08	Fix SA_SIGINFO \| SA_RESETHAND handling. The sysent' sv_sendsig() method needs pre-reset state of the ps_siginfo to correctly construct signal frame. Move sigdflt() call after the sv_sendsig() invocation in postsig(). Simultaneously extract common code from trapsignal() and postsig() into new helper postsig_done(). Submitted by: rea MFC after: 1 week	2014-11-26 14:09:04 +00:00
John Baldwin	a2d751936b	Add a bus_get_domain() wrapper around BUS_GET_DOMAIN(). Use this to add a new per-device '%domain' sysctl node that returns the NUMA domain a device is associated with if it is associated with one. Note that this API is still a WIP and might change before 11.0 actually ships. Differential Revision: https://reviews.freebsd.org/D930 Reviewed by: kib, adrian	2014-11-24 19:55:45 +00:00
John Baldwin	20abb66ede	Properly initialize the capability rights for vnodes exported to procstat that aren't for file descriptors (cwd, jdir, tracevp, etc.). Submitted by: Mikhail <mp@lenta.ru>	2014-11-24 18:34:11 +00:00
Gleb Smirnoff	90effb2341	Merge from projects/sendfile: o Provide a new VOP_GETPAGES_ASYNC(), which works like VOP_GETPAGES(), but doesn't sleep. It returns immediately, and will execute the I/O done handler function that must be supplied as argument. o Provide VOP_GETPAGES_ASYNC() for the FFS, which uses vnode_pager. o Extend pagertab to support pgo_getpages_async method, and implement this method for vnode_pager. Reviewed by: kib Tested by: pho Sponsored by: Netflix Sponsored by: Nginx, Inc.	2014-11-23 12:01:52 +00:00
Mateusz Guzik	dff9862c0e	ifdef RACCT ui_racct_foreach and struct uidinfo's ui_racct Change racct_ create and destroy to macros evaluating to nothing without RACCT so that their callers passing ui_racct don't have to be ifdefed.	2014-11-23 08:25:44 +00:00
Mateusz Guzik	0c0d16e8ac	filedesc: plug a test for impossible condition in fgetvp_rights	2014-11-23 00:12:27 +00:00
Konstantin Belousov	64779280c9	The size value should be asserted when it is known. Reported and tested by: pho Sponsored by: The FreeBSD Foundation	2014-11-22 18:15:02 +00:00

1 2 3 4 5 ...

14030 Commits