freebsd-skq

Author	SHA1	Message	Date
julian	396ed947f6	Threading cleanup.. part 2 of several. Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated. All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it. Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable. The ULE scheduler compiles again but I have no idea if it works. The 4bsd scheduler still reqires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit. Tested by David Xu, and Dan Eischen using libthr and libpthread.	2006-12-06 06:34:57 +00:00
rwatson	10d0d9cf47	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
jb	f82c799735	Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE). Reviewed by: davidxu@	2006-10-26 21:42:22 +00:00
rwatson	7beaaf5cd2	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
netchild	b2a39f267a	- Change process_exec function handlers prototype to include struct image_params arg. - Change struct image_params to include struct sysentvec pointer and initialize it. - Change all consumers of process_exit/process_exec eventhandlers to new prototypes (includes splitting up into distinct exec/exit functions). - Add eventhandler to userret. Sponsored by: Google SoC 2006 Submitted by: rdivacky Parts suggested by: jhb (on hackers@)	2006-08-15 12:10:57 +00:00
jhb	9cbcfd1fff	Don't lock each of the processes while looking for a pid. The allproc and proctree locks that we already hold provide sufficient protection.	2006-08-01 15:30:56 +00:00
pjd	03a43a81a3	- Use suser_cred(9) instead of checking cr_ruid directly. - For privileged processes safe two mutex operations. We may want to consider if this is good idea to use SUSER_ALLOWJAIL here, but for now I didn't wanted to change the original behaviour. Reviewed by: rwatson	2006-06-27 11:28:50 +00:00
davidxu	7e196fdd6e	Fix a race between file operations and rfork(RFCFDG) by parking all other threads at user boundary, the race can crash kernel under stress testing. Reviewed by: jhb MFC after: 3 days	2006-03-15 23:24:14 +00:00
phk	74f8e63a10	Simplify system time accounting for profiling. Rename struct thread's td_sticks to td_pticks, we will need the other name for more appropriately named use shortly. Reduce it from uint64_t to u_int. Clear td_pticks whenever we enter the kernel instead of recording its value as reference for userret(). Use the absolute value of td->pticks in userret() and eliminate third argument.	2006-02-08 08:09:17 +00:00
jhb	2eb77c6d18	We don't need the proc lock to check P_KTHREAD on curthread since it is only set before the kthread starts executing and is never cleared.	2006-02-06 21:54:47 +00:00
wsalamon	c41a486364	Audit the args to rfork(), and the child PID for all fork system calls. Obtained from: TrustedBSD Project Approved by: rwatson (mentor)	2006-02-06 00:28:50 +00:00
rwatson	53a606d94a	Hook up audit to fork() and exit() events. These changes manage the audit state on processes, not auditing of these events. Much work by: wsalamon Obtained from: TrustedBSD Project	2006-02-02 01:32:58 +00:00
rwatson	2a5785fb21	Moderate rewrite of kernel ktrace code to attempt to generally improve reliability when tracing fast-moving processes or writing traces to slow file systems by avoiding unbounded queueuing and dropped records. Record loss was previously possible when the global pool of records become depleted as a result of record generation outstripping record commit, which occurred quickly in many common situations. These changes partially restore the 4.x model of committing ktrace records at the point of trace generation (synchronous), but maintain the 5.x deferred record commit behavior (asynchronous) for situations where entering VFS and sleeping is not possible (i.e., in the scheduler). Records are now queued per-process as opposed to globally, with processes responsible for committing records from their own context as required. - Eliminate the ktrace worker thread and global record queue, as they are no longer used. Keep the global free record list, as records are still used. - Add a per-process record queue, which will hold any asynchronously generated records, such as from context switches. This replaces the global queue as the place to submit asynchronous records to. - When a record is committed asynchronously, simply queue it to the process. - When a record is committed synchronously, first drain any pending per-process records in order to maintain ordering as best we can. Currently ordering between competing threads is provided via a global ktrace_sx, but a per-process flag or lock may be desirable in the future. - When a process returns to user space following a system call, trap, signal delivery, etc, flush any pending records. - When a process exits, flush any pending records. - Assert on process tear-down that there are no pending records. - Slightly abstract the notion of being "in ktrace", which is used to prevent the recursive generation of records, as well as generating traces for ktrace events. Future work here might look at changing the set of events marked for synchronous and asynchronous record generation, re-balancing queue depth, timeliness of commit to disk, and so on. I.e., performing a drain every (n) records. MFC after: 1 month Discussed with: jhb Requested by: Marc Olzheim <marcolz at stack dot nl>	2005-11-13 13:27:44 +00:00
ssouhlal	efe31cd3da	Fix the recent panics/LORs/hangs created by my kqueue commit by: - Introducing the possibility of using locks different than mutexes for the knlist locking. In order to do this, we add three arguments to knlist_init() to specify the functions to use to lock, unlock and check if the lock is owned. If these arguments are NULL, we assume mtx_lock, mtx_unlock and mtx_owned, respectively. - Using the vnode lock for the knlist locking, when doing kqueue operations on a vnode. This way, we don't have to lock the vnode while holding a mutex, in filt_vfsread. Reviewed by: jmg Approved by: re (scottl), scottl (mentor override) Pointyhat to: ssouhlal Will be happy: everyone	2005-07-01 16:28:32 +00:00
davidxu	0719b14efb	Inherit signal mask for child process in fork1(), RELENG_4 and other *BSD have this behaviour, also it is required by POSIX. PR: kern/80130 Submitted by: Kostik Belousov konstantin.belousov at zoral dot com dot ua	2005-04-20 13:14:52 +00:00
jhb	41cadaa11e	Divorce critical sections from spinlocks. Critical sections as denoted by critical_enter() and critical_exit() are now solely a mechanism for deferring kernel preemptions. They no longer have any affect on interrupts. This means that standalone critical sections are now very cheap as they are simply unlocked integer increments and decrements for the common case. Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter() and spinlock_exit(). This KPI is responsible for providing whatever MD guarantees are needed to ensure that a thread holding a spin lock won't be preempted by any other code that will try to lock the same lock. For now all archs continue to block interrupts in a "spinlock section" as they did formerly in all critical sections. Note that I've also taken this opportunity to push a few things into MD code rather than MI. For example, critical_fork_exit() no longer exists. Instead, MD code ensures that new threads have the correct state when they are created. Also, we no longer try to fixup the idlethreads for APs in MI code. Instead, each arch sets the initial curthread and adjusts the state of the idle thread it borrows in order to perform the initial context switch. This change is largely a big NOP, but the cleaner separation it provides will allow for more efficient alternative locking schemes in other parts of the kernel (bare critical sections rather than per-CPU spin mutexes for per-CPU data for example). Reviewed by: grehan, cognet, arch@, others Tested on: i386, alpha, sparc64, powerpc, arm, possibly more	2005-04-04 21:53:56 +00:00
imp	20280f1431	/* -> /*- for copyright notices, minor format tweaks as necessary	2005-01-06 23:35:40 +00:00
phk	e622b9bcba	Add new function fdunshare() which encapsulates the necessary light magic for ensuring that a process' filedesc is not shared with anybody. Use it in the two places which previously had private implmentations. This collects all fd_refcnt handling in kern_descrip.c	2004-12-14 07:20:03 +00:00
das	130bed6547	Don't include sys/user.h merely for its side-effect of recursively including other headers.	2004-11-27 06:51:39 +00:00
das	6175c08488	Remove local definitions of RANGEOF() and use __rangeof() instead. Also remove a few bogus casts.	2004-11-20 23:00:59 +00:00
das	22907ad4ac	Malloc p_stats instead of putting it in the U area. We should consider simply embedding it in struct proc. Reviewed by: arch@	2004-11-20 02:28:48 +00:00
phk	216166ee0d	Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST. Use this in all the places where sleeping with the lock held is not an issue. The distinction will become significant once we finalize the exact lock-type to use for this kind of case.	2004-11-13 11:53:02 +00:00
phk	d24107be6b	Use more intuitive pointer for fdinit() and fdcopy(). Change fdcopy() to take unlocked filedesc.	2004-11-08 12:43:23 +00:00
phk	418eb9d34b	Allow fdinit() to be called with a NULL fdp argument so we can use it when setting up init. Make fdinit() lock the fdp argument as needed.	2004-11-07 12:39:28 +00:00
das	35b6f981ab	Back out rev 1.240; it is unnecessary. In particular, p1 == curthread, so _PHOLD(p1) will not have to block to swap in p1. Noticed by: jhb	2004-10-06 23:53:49 +00:00
das	e399d76f1b	Avoid calling _PHOLD(p1) with p2's lock held, since _PHOLD() may block to swap in p1. Instead, call _PHOLD earlier, at a point where the only lock held happens to be p1's.	2004-10-01 05:01:29 +00:00
julian	29732c6fb7	make some of these conditions apply equally to both threading systems.	2004-09-13 22:10:04 +00:00
julian	5813d27029	Refactor a bunch of scheduler code to give basically the same behaviour but with slightly cleaned up interfaces. The KSE structure has become the same as the "per thread scheduler private data" structure. In order to not make the diffs too great one is #defined as the other at this time. The KSE (or td_sched) structure is now allocated per thread and has no allocation code of its own. Concurrency for a KSEGRP is now kept track of via a simple pair of counters rather than using KSE structures as tokens. Since the KSE structure is different in each scheduler, kern_switch.c is now included at the end of each scheduler. Nothing outside the scheduler knows the contents of the KSE (aka td_sched) structure. The fields in the ksegrp structure that are to do with the scheduler's queueing mechanisms are now moved to the kg_sched structure. (per ksegrp scheduler private data structure). In other words how the scheduler queues and keeps track of threads is no-one's business except the scheduler's. This should allow people to write experimental schedulers with completely different internal structuring. A scheduler call sched_set_concurrency(kg, N) has been added that notifies teh scheduler that no more than N threads from that ksegrp should be allowed to be on concurrently scheduled. This is also used to enforce 'fainess' at this time so that a ksegrp with 10000 threads can not swamp a the run queue and force out a process with 1 thread, since the current code will not set the concurrency above NCPU, and both schedulers will not allow more than that many onto the system run queue at a time. Each scheduler should eventualy develop their own methods to do this now that they are effectively separated. Rejig libthr's kernel interface to follow the same code paths as linkse for scope system threads. This has slightly hurt libthr's performance but I will work to recover as much of it as I can. Thread exit code has been cleaned up greatly. exit and exec code now transitions a process back to 'standard non-threaded mode' before taking the next step. Reviewed by: scottl, peter MFC after: 1 week	2004-09-05 02:09:54 +00:00
alc	82e55fdf76	Push Giant deep into vm_forkproc(), acquiring it only if the process has mapped System V shared memory segments (see shmfork_myhook()) or requires the allocation of an ldt (see vm_fault_wire()).	2004-09-03 05:11:32 +00:00
julian	e9d9514975	Give setrunqueue() and sched_add() more of a clue as to where they are coming from and what is expected from them. MFC after: 2 days	2004-09-01 02:11:28 +00:00
julian	ee753ed190	Remove sched_free_thread() which was only used in diagnostics. It has outlived its usefulness and has started causing panics for people who turn on DIAGNOSTIC, in what is otherwise good code. MFC after: 2 days	2004-08-31 06:12:13 +00:00
jmg	bc1805c6e8	Add locking to the kqueue subsystem. This also makes the kqueue subsystem a more complete subsystem, and removes the knowlege of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using it's filter ops. Currently, it uses the MTX_DUPOK even though it is not always safe to aquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases). Reviewed by: green, rwatson (both earlier versions)	2004-08-15 06:24:42 +00:00
julian	61fada7840	Increase the amount of data exported by KTR in the KTR_RUNQ setting. This extra data is needed to really follow what is going on in the threaded case.	2004-08-09 18:21:12 +00:00
bmilekic	1c3958ce88	Move the schedlock owner state update following the context switch in fork_exit() to before anything else is done (but keep schedlock for the deadthread check). This means one less nasty bug if ever in the future whatever might have been called before the update played with schedlock or critical sections. Discussed with: tjr	2004-07-27 03:46:31 +00:00
cperciva	c009fddfd6	In revision 1.228, I accidentally broke the "total number of processes in the system" resource limit code: When checking if the caller has superuser privileges, we should be checking the real user, not the effective user. (In general, resource limiting is done based on the real user, in order to avoid resource-exhaustion-by-setuid-program attacks.) Now that a SUSER_RUID flag to suser_cred exists, use it here to return this code to its correct behaviour. Pointed out by: rwatson	2004-07-26 07:54:39 +00:00
julian	a488bebcd2	When calling scheduler entrypoints for creating new threads and processes, specify "us" as the thread not the process/ksegrp/kse. You can always find the others from the thread but the converse is not true. Theorotically this would lead to runtime being allocated to the wrong entity in some cases though it is not clear how often this actually happenned. (would only affect threaded processes and would probably be pretty benign, but it WAS a bug..) Reviewed by: peter	2004-07-18 23:36:13 +00:00
phk	e86017f8eb	fix compilation.	2004-07-13 16:33:38 +00:00
cperciva	6bbfebb261	Replace "uid != 0" with "suser(td->td_ucred) != 0" when checking if we've hit the maximum number of processes. The last ten processes are reserved for the non-jailed superuser.	2004-07-13 13:10:07 +00:00
marcel	49e32d12eb	Allocate TIDs in thread_init() and deallocate them in thread_fini(). The overhead of unconditionally allocating TIDs (and likewise, unconditionally deallocating them), is amortized across multiple thread creations by the way UMA makes it possible to have type-stable storage. Previously the cost was kept down by having threads created as part of a fork operation use the process' PID as the TID. While this had some nice properties, it also introduced complexity in the way TIDs were allocated. Most importantly, by using the type-stable storage that UMA gives us this was also unnecessary. This change affects how core dumps are created and in particular how the PRSTATUS notes are dumped. Since we don't have a thread with a TID equalling the PID, we now need a different way to preserve the old and previous behavior. We do this by having the given thread (i.e. the thread passed to the core dump code in td) dump it's state first and fill in pr_pid with the actual PID. All other threads will have pr_pid contain their TIDs. The upshot of all this is that the debugger will now likely select the right LWP (=TID) as the initial thread. Credits to: julian@ for spotting how we can utilize UMA. Thanks to: all who provided julian@ with test results.	2004-06-26 18:58:22 +00:00
imp	74cf37bd00	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999. Approved by: core	2004-04-05 21:03:37 +00:00
marcel	1d37410c51	Assign thread IDs to kernel threads. The purpose of the thread ID (tid) is twofold: 1. When a 1:1 or M:N threaded process dumps core, we need to put the register state of each of its kernel threads in the core file. This can only be done by differentiating the pid field in the respective note. For this we need the tid. 2. When thread support is present for remote debugging the kernel with gdb(1), threads need to be identified by an integer due to limitations in the remote protocol. This requires having a tid. To minimize the impact of having thread IDs, threads that are created as part of a fork (i.e. the initial thread in a process) will inherit the process ID (i.e. tid=pid). Subsequent threads will have IDs larger than PID_MAX to avoid interference with the pid allocation algorithm. The assignment of tids is handled by thread_new_tid(). The thread ID allocation algorithm has been written with 3 assumptions in mind: 1. IDs need to be created as fast a possible, 2. Reuse of IDs may happen instantaneously, 3. Someone else will write a better algorithm.	2004-04-03 15:59:13 +00:00
peter	bd5efd4600	Make the process_exit eventhandler run without Giant. Add Giant hooks in the two consumers that need it.. processes using AIO and netncp. Update docs. Say that process_exec is called with Giant, but not to depend on it. All our consumers can handle it without Giant.	2004-03-14 02:06:28 +00:00
peter	963c36c195	Move the process_fork event out from under Giant. This one is easy, since there are no consumers in the tree. Document this.	2004-03-14 01:48:32 +00:00
peter	1cb95fd2b7	Push Giant down a little further: - no longer serialize on Giant for thread_single*() and family in fork, exit and exec - thread_wait() is mpsafe, assert no Giant - reduce scope of Giant in exit to not cover thread_wait and just do vm_waitproc(). - assert that thread_single() family are not called with Giant - remove the DROP/PICKUP_GIANT macros from thread_single() family - assert that thread_suspend_check() s not called with Giant - remove manual drop_giant hack in thread_suspend_check since we know it isn't held. - remove the DROP/PICKUP_GIANT macros from thread_suspend_check() family - mark kse_create() mpsafe	2004-03-13 22:31:39 +00:00
jmg	cf1b8bdb72	make sure we had the filedesc lock when calling fdinit when RFCFDG is set on call to rfork. Submitted by: Brian Buchanan Semi-Reviewed by: rwatson	2004-03-10 00:27:36 +00:00
peter	836666b0d7	Move a vref call outside of proc locks and Giant. By virtue of the fact that we (p1) are currently running, we hold a reference on p_textvp which means the vnode cannot go away. p2 cannot run yet (and hence cannot exit) so this should be safe to do at this point. As a bonus, it removes a block of under-Giant code that was there to support the vref.	2004-03-08 00:32:34 +00:00
alc	94f567f9bb	Giant is not required for vm_thread_new_altkstack().	2004-03-07 00:06:32 +00:00
peter	8ac8c686e1	Add a missing part of jhb's previous commit. It looks like he had a patch chunk rejected that he missed. This would manifest as a lock assertion panic at boot (Giant not locked in kern_fork.c). Obtained from: jhb	2004-03-06 00:44:59 +00:00
jhb	af72c48e5f	- Grab a share lock of the proctree lock while looking for a pid due to the process group and session dereferences. Also, check that p_pgrp and p_sesssion are NULL before dereferencing them. - Push down Giant in fork1(). Requested by: peter	2004-03-05 22:37:32 +00:00
bde	9c7937d9cf	Fixed some style bugs (mainly English usage errors in comments).	2004-03-04 09:56:29 +00:00

1 2 3 4 5 ...

264 Commits