freebsd-skq

Author	SHA1	Message	Date
alc	5d9c66a3f8	The page queues lock is no longer required by vm_page_busy() or vm_page_wakeup(). Reduce or eliminate its use accordingly.	2006-10-22 21:18:48 +00:00
rwatson	7beaaf5cd2	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
alc	cbcb760109	Replace PG_BUSY with VPO_BUSY. In other words, changes to the page's busy flag, i.e., VPO_BUSY, are now synchronized by the per-vm object lock instead of the global page queues lock.	2006-10-22 04:28:14 +00:00
wsalamon	c62317c442	Audit the argv and env vectors passed in on exec: Add the argument auditing functions for argv and env. Add kernel-specific versions of the tokenizer functions for the arg and env represented as a char array. Implement the AUDIT_ARGV and AUDIT_ARGE audit policy commands to enable/disable argv/env auditing. Call the argument auditing from the exec system calls. Obtained from: TrustedBSD Project Approved by: rwatson (mentor)	2006-09-01 11:45:40 +00:00
netchild	b2a39f267a	- Change process_exec function handlers prototype to include struct image_params arg. - Change struct image_params to include struct sysentvec pointer and initialize it. - Change all consumers of process_exit/process_exec eventhandlers to new prototypes (includes splitting up into distinct exec/exit functions). - Add eventhandler to userret. Sponsored by: Google SoC 2006 Submitted by: rdivacky Parts suggested by: jhb (on hackers@)	2006-08-15 12:10:57 +00:00
rwatson	81673aab33	In execve(), audit the path name being executed. In the future, it would also be good to audit the interpreter pathname, if any. Obtained from: TrustedBSD Project	2006-05-28 08:28:47 +00:00
tegge	ce79e019da	Temporarily unlock vnode for new image being executed to avoid lock order reversals that can lead to deadlocks. Normally vn_close(), namei() or vrele() should not be called while holding vnode locks.	2006-05-05 20:25:05 +00:00
peter	0f363b7d24	Remove the unused sva and eva arguments from pmap_remove_pages().	2006-04-03 21:16:10 +00:00
ups	5ad34fd1d6	Fix exec_map resource leaks. Tested by: kris@	2006-03-08 20:21:54 +00:00
jhb	ae432f93f2	- Always call exec_free_args() in kern_execve() instead of doing it in all the callers if the exec either succeeds or fails early. - Move the code to call exit1() if the exec fails after the vmspace is gone to the bottom of kern_execve() to cut down on some code duplication.	2006-02-06 22:06:54 +00:00
jeff	01caaf4329	- textvp may have been from a different mountpoint than ndp->ni_vp and we may need to acquire giant to vrele it. Found by: mjacob MFC After: 3 days	2006-02-02 08:39:39 +00:00
alc	a5d0ac5faf	Remove unneeded calls to pmap_remove_all(). The given page is not mapped. Reviewed by: tegge	2005-12-11 22:06:57 +00:00
davidxu	1628e16677	Register itimers_event_hook as a kernel event handler, so I don't have to duplicate code to call it in exec() and exit1().	2005-12-09 05:43:26 +00:00
alc	71b52e0fd2	Reduce the scope of the page queues lock in exec_map_first_page(). The vm object lock is sufficient for reading a page's PG_BUSY and busy flags. MFC after: 1 week	2005-12-06 07:39:36 +00:00
davidxu	301b115d60	Cleanup some signal interfaces. Now the tdsignal function accepts both proc pointer and thread pointer, if thread pointer is NULL, tdsignal automatically finds a thread, otherwise it sends signal to given thread. Add utility function psignal_event to send a realtime sigevent to a process according to the delivery requirement specified in struct sigevent.	2005-11-03 04:49:16 +00:00
ps	e0951fe504	Calling setrlimit from 32bit apps could potentially increase certain limits beyond what should be capiable in a 32bit process, so we must fixup the limits. Reviewed by: jhb	2005-11-02 21:18:07 +00:00
davidxu	7086514f41	Make p_itimers as a pointer, so file sys/proc.h does not need to include sys/timers.h.	2005-10-23 12:19:08 +00:00
davidxu	cf50eec401	Implement POSIX timers. Current only CLOCK_REALTIME and CLOCK_MONOTONIC clock are supported. I have plan to merge XSI timer ITIMER_REAL and other two CPU timers into the new code, current three slots are available for the XSI timers. The SIGEV_THREAD notification type is not supported yet because our sigevent struct lacks of two member fields: sigev_notify_function sigev_notify_attributes I have found the sigevent is used in AIO, so I won't add the two members unless the AIO code is adjusted.	2005-10-23 04:22:56 +00:00
davidxu	3fbdb3c215	1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most changes in MD code are trivial, before this change, trapsignal and sendsig use discrete parameters, now they uses member fields of ksiginfo_t structure. For sendsig, this change allows us to pass POSIX realtime signal value to user code. 2. Remove cpu_thread_siginfo, it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread. 3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc. 4. Add td_sigqueue to thread structure to hold all signals delivered to thread. 5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed. 6. In this sigqueue implementation, pending signal set is kept as before, an extra siginfo list holds additional siginfo_t data for signals. kernel code uses psignal() still behavior as before, it won't be failed even under memory pressure, only exception is when deleting a signal, we should call sigqueue_delete to remove signal from sigqueue but not SIGDELSET. Current there is no kernel code will deliver a signal with additional data, so kernel should be as stable as before, a ksiginfo can carry more information, for example, allow signal to be delivered but throw away siginfo data if memory is not enough. SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can not be caught or masked. The sigqueue() syscall allows user code to queue a signal to target process, if resource is unavailable, EAGAIN will be returned as specification said. Just before thread exits, signal queue memory will be freed by sigqueue_flush. Current, all signals are allowed to be queued, not only realtime signals. Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64	2005-10-14 12:43:47 +00:00
dds	0fb2e655fd	Move execve's access time update functionality into a new vfs_mark_atime() function, and use the new function for performing efficient atime updates in mmap(). Reviewed by: bde MFC after: 2 weeks	2005-10-12 06:56:00 +00:00
truckman	3b657afbab	Add missing word to comment.	2005-10-04 04:02:33 +00:00
cperciva	9e9bb7c935	If sufficiently bad things happen during a call to kern_execve(), it is possible for do_execve() to call exit1() rather than returning. As a result, the sequence "allocate memory; call kern_execve; free memory" can end up leaking memory. This commit documents this astonishing behaviour and adds a call to exec_free_args() before the exit1() call in do_execve(). Since all the users of kern_execve() in the tree use exec_free_args() to free the command-line arguments after kern_execve() returns, this should be safe, and it fixes the memory leak which can otherwise occur. Submitted by: Peter Holm MFC after: 3 days Security: Local denial of service	2005-10-03 12:49:54 +00:00
truckman	6c6c328bd0	Copy new process argument list in do_execve() before grabbing PROC_LOCK to avoid touching pageable memory while holding a mutex. Simplify argument list replacement logic. PR: kern/84935 Submitted by: "Antoine Pelisse" apelisse AT gmail.com (in a different form) MFC after: 3 days	2005-10-01 08:33:56 +00:00
jkoshy	3cade8d074	MFP4: - pmcstat(8) gprof output mode fixes: lib/libpmc/pmclog.{c,h}, sys/sys/pmclog.h: + Add a 'is_usermode' field to the PMCLOG_PCSAMPLE event + Add an 'entryaddr' field to the PMCLOG_PROCEXEC event, so that pmcstat(8) can determine where the runtime loader /libexec/ld-elf.so.1 is getting loaded. sys/kern/kern_exec.c: + Use a local struct to group the entry address of the image being exec()'ed and the process credential changed flag to the exec handling hook inside hwpmc(4). usr.sbin/pmcstat/*: + Support "-k kernelpath", "-D sampledir". + Implement the ELF bits of 'gmon.out' profile generation in a new file "pmcstat_log.c". Move all log related functions to this file. + Move local definitions and prototypes to "pmcstat.h" - Other bug fixes: + lib/libpmc/pmclog.c: correctly handle EOF in pmclog_read(). + sys/dev/hwpmc_mod.c: unconditionally log a PROCEXIT event to all attached PMCs when a process exits. + sys/sys/pmc.h: correct a function prototype. + Improve usage checks in pmcstat(8). Approved by: re (blanket hwpmc)	2005-06-30 19:01:26 +00:00
peter	c9db8c3eb4	Move HWPMC_HOOKS into its own opt_hwpmc_hooks.h file. It doesn't merit being in opt_global.h and forcing a global recompile when only a few files reference it. Approved by: re	2005-06-24 00:16:57 +00:00
jkoshy	1d3209ab83	MFP4: - Implement sampling modes and logging support in hwpmc(4). - Separate MI and MD parts of hwpmc(4) and allow sharing of PMC implementations across different architectures. Add support for P4 (EMT64) style PMCs to the amd64 code. - New pmcstat(8) options: -E (exit time counts) -W (counts every context switch), -R (print log file). - pmc(3) API changes, improve our ability to keep ABI compatibility in the future. Add more 'alias' names for commonly used events. - bug fixes & documentation.	2005-06-09 19:45:09 +00:00
kensmith	3a7e275ce6	This patch addresses a standards violation issue. The standards say a file's access time should be updated when it gets executed. A while ago the mechanism used to exec was changed to use a more mmap based mechanism and this behavior was broken as a side-effect of that. A new vnode flag is added that gets set when the file gets executed, and the VOP_SETATTR() vnode operation gets called. The underlying filesystem is expected to handle it based on its own semantics, some filesystems don't support access time at all. Those that do should handle it in a way that does not block, does not generate I/O if possible, etc. In particular vn_start_write() has not been called. The UFS code handles it the same way as it would normally handle the access time if a file was read - the IN_ACCESS flag gets set in the inode but no other action happens at this point. The actual time update will happen later during a sync (which handles all the necessary locking). Got me into this: cperciva Discussed with: a lot with bde, a little with kan Showed patches to: phk, jeffr, standards@, arch@ Minor discussion on: arch@	2005-05-31 19:39:52 +00:00
jeff	33ac8108e2	- Initialize vfslocked correctly early enough for MAC to compile. - Fix one place where we explicitly drop Giant! Pointy hat to: me Submitted by: Max Laier Warned by: Tinderbox	2005-05-03 16:24:59 +00:00
jeff	ab437d7b1d	- Use namei to acquire Giant for VFS if it is necessary. Drop the explicit Giant acquisition. - Remove GIANT_REQUIRED in the few remaining cases; the vm and vfs have both been locked.	2005-05-03 10:55:05 +00:00
jeff	80bb41c921	- Return EACCES if we're trying to exec on a vp with no object. Errno supplied by: cperciva	2005-05-01 00:58:19 +00:00
jeff	f869be5c72	- Pass the ISOPEN flag to namei so filesystems will know we're about to open them or otherwise access the data.	2005-04-27 09:05:19 +00:00
jkoshy	dc3444cd91	Bring a working snapshot of hwpmc(4), its associated libraries, userland utilities and documentation into -CURRENT. Bump FreeBSD_version. Reviewed by: alc, jhb (kernel changes)	2005-04-19 04:01:25 +00:00
sobomax	3155a16bbf	Welcome to the 21st century: increase MAXSHELLCMDLEN from 128 bytes to PAGE_SIZE. Unlike originator of the PR suggests retain MAXSHELLCMDLEN definition (he has been proposing to replace it with PAGE_SIZE everywhere), not only this reduced the diff significantly, but prevents code obfuscation and also allows to increase/decrease this parameter easily if needed. PR: kern/64196 Submitted by: Magnus Bäckström <b@etek.chalmers.se>	2005-02-25 11:49:42 +00:00
sobomax	5fd43d6c79	Grrr, this committer needs to have a sleep. Remove lines from the previous delta not intended for public consumption. MFC after: 2 weeks	2005-01-29 23:51:05 +00:00
sobomax	bc473990f5	Fix small non-conformance introduced in the previous commit: execve() is expected to return ENAMETOOLONG, not E2BIG if first argument doesn't fit into {PATH_MAX} bytes. MFC after: 2 weeks	2005-01-29 23:47:36 +00:00
sobomax	f489acaf0f	o Split out kernel part of execve(2) syscall into two parts: one that copies arguments into the kernel space and one that operates completely in the kernel space; o use kernel-only version of execve(2) to kill another stackgap in linuxlator/i386. Obtained from: DragonFlyBSD (partially) MFC after: 2 weeks	2005-01-29 23:12:00 +00:00
phk	796d435574	Don't use VOP_GETVOBJECT, use vp->v_object directly.	2005-01-25 00:40:01 +00:00
imp	20280f1431	/* -> /*- for copyright notices, minor format tweaks as necessary	2005-01-06 23:35:40 +00:00
phk	e622b9bcba	Add new function fdunshare() which encapsulates the necessary light magic for ensuring that a process' filedesc is not shared with anybody. Use it in the two places which previously had private implmentations. This collects all fd_refcnt handling in kern_descrip.c	2004-12-14 07:20:03 +00:00
das	130bed6547	Don't include sys/user.h merely for its side-effect of recursively including other headers.	2004-11-27 06:51:39 +00:00
phk	216166ee0d	Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST. Use this in all the places where sleeping with the lock held is not an issue. The distinction will become significant once we finalize the exact lock-type to use for this kind of case.	2004-11-13 11:53:02 +00:00
phk	d24107be6b	Use more intuitive pointer for fdinit() and fdcopy(). Change fdcopy() to take unlocked filedesc.	2004-11-08 12:43:23 +00:00
peter	09964b7499	Put on my peril sensitive sunglasses and add a flags field to the internal sysctl routines and state. Add some code to use it for signalling the need to downconvert a data structure to 32 bits on a 64 bit OS when requested by a 32 bit app. I tried to do this in a generic abi wrapper that intercepted the sysctl oid's, or looked up the format string etc, but it was a real can of worms that turned into a fragile mess before I even got it partially working. With this, we can now run 'sysctl -a' on a 32 bit sysctl binary and have it not abort. Things like netstat, ps, etc have a long way to go. This also fixes a bug in the kern.ps_strings and kern.usrstack hacks. These do matter very much because they are used by libc_r and other things.	2004-10-11 22:04:16 +00:00
davidxu	94500a0336	Add an execve command for kse_thr_interrupt to allow libpthread to restore signal mask correctly, this is required by POSIX. Reviewed by: deischen	2004-10-07 13:50:10 +00:00
davidxu	793ea9317e	In original kern_execve() code, at the start of the function, it forces all other threads to suicide, problem is execve() could be failed, and a failed execve() would change threaded process to unthreaded, this side effect is unexpected. The new code introduces a new single threading mode SINGLE_BOUNDARY, in the mode, all threads should suspend themself at user boundary except the singler. we can not use SINGLE_NO_EXIT because we want to start from a clean state if execve() is successful, suspending other threads at unknown point and later resuming them from there and forcing them to exit at user boundary may cause the process to start from a dirty state. If execve() is successful, current thread upgrades to SINGLE_EXIT mode and forces other threads to suicide at user boundary, otherwise, other threads will be resumed and their interrupted syscall will be restarted. Reviewed by: julian	2004-10-06 00:40:41 +00:00
jhb	1f2758a712	- Don't try to unlock Giant if single threading fails since we don't have it locked. - Unlock Giant before calling exit1() since exit1() does not require Giant.	2004-09-23 21:01:50 +00:00
julian	5015e1ce1f	Revert the last change.. Better to kill all other threads than to panic the system if 2 threads call execve() at the same time. A better fix will be committed later. Note that this only affects the case where the execve fails.	2004-09-22 01:30:23 +00:00
julian	fca3e8d0f0	In a threaded process, don't kill off all the other threads until we have a reasonable chance that the eceve() is going to succeeed. I.e. wait until we've done the permission checks etc. MFC after: 1 week	2004-09-21 21:05:13 +00:00
julian	5813d27029	Refactor a bunch of scheduler code to give basically the same behaviour but with slightly cleaned up interfaces. The KSE structure has become the same as the "per thread scheduler private data" structure. In order to not make the diffs too great one is #defined as the other at this time. The KSE (or td_sched) structure is now allocated per thread and has no allocation code of its own. Concurrency for a KSEGRP is now kept track of via a simple pair of counters rather than using KSE structures as tokens. Since the KSE structure is different in each scheduler, kern_switch.c is now included at the end of each scheduler. Nothing outside the scheduler knows the contents of the KSE (aka td_sched) structure. The fields in the ksegrp structure that are to do with the scheduler's queueing mechanisms are now moved to the kg_sched structure. (per ksegrp scheduler private data structure). In other words how the scheduler queues and keeps track of threads is no-one's business except the scheduler's. This should allow people to write experimental schedulers with completely different internal structuring. A scheduler call sched_set_concurrency(kg, N) has been added that notifies teh scheduler that no more than N threads from that ksegrp should be allowed to be on concurrently scheduled. This is also used to enforce 'fainess' at this time so that a ksegrp with 10000 threads can not swamp a the run queue and force out a process with 1 thread, since the current code will not set the concurrency above NCPU, and both schedulers will not allow more than that many onto the system run queue at a time. Each scheduler should eventualy develop their own methods to do this now that they are effectively separated. Rejig libthr's kernel interface to follow the same code paths as linkse for scope system threads. This has slightly hurt libthr's performance but I will work to recover as much of it as I can. Thread exit code has been cleaned up greatly. exit and exec code now transitions a process back to 'standard non-threaded mode' before taking the next step. Reviewed by: scottl, peter MFC after: 1 week	2004-09-05 02:09:54 +00:00
jmg	bc1805c6e8	Add locking to the kqueue subsystem. This also makes the kqueue subsystem a more complete subsystem, and removes the knowlege of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using it's filter ops. Currently, it uses the MTX_DUPOK even though it is not always safe to aquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases). Reviewed by: green, rwatson (both earlier versions)	2004-08-15 06:24:42 +00:00

1 2 3 4 5 ...

298 Commits