freebsd-dev

Author	SHA1	Message	Date
Pawel Jakub Dawidek	2c6040bbb7	We probably shouldn't allow users to mount file systems with MNT_SUIDDIR. There should be no shell access when SUIDDIR is compiled in, but better be sure. Reviewed by: rwatson	2004-03-26 21:12:14 +00:00
Alan Cox	2b63e7f397	Use uiomove_fromphys() instead of pmap_qenter() and pmap_qremove() in proc_rwmem().	2004-03-24 23:35:04 +00:00
Warner Losh	9fc0327792	Conform to local file style and prefer (a && (b & flag)).	2004-03-24 16:49:37 +00:00
David E. O'Brien	0d50bcb36b	Change the !MPSAFE boot string to something that doesn't potentially scare users that the kernel won't run on MP systems.	2004-03-23 01:58:09 +00:00
Alfred Perlstein	12e9993f65	Emit a traceback when witness_trace is set and witness_warn() is called and triggers (typically caused by sleeping with a non-sleepable lock). Reviewed by: jhb	2004-03-23 00:32:27 +00:00
David E. O'Brien	f1c8692d0a	Rather than display which interrupts are MPSAFE, display those that aren't. This way we can take stock of the work to be done. boot -v will note those interrupts that are MPSAFE.	2004-03-22 22:36:11 +00:00
Paul Saab	2eada6bc8e	Remove some netbsd debug code that crept into rev 1.116	2004-03-22 10:17:40 +00:00
David E. O'Brien	b003da7938	Give a more reasonable CPU time to the threads which are using scheduler activation (i.e., applications are using libpthread). This is because SCHED_ULE sometimes puts P_SA processes into ksq_next unnecessarily. Which doesn't give fair amount of CPU time to processes which are using scheduler-activation-based threads when other (semi-)CPU-intensive, non-P_SA processes are running. Further work will no doubt be done by jeffr at a later date. Submitted by: Taku YAMAMOTO <taku@cent.saitama-u.ac.jp> Reviewed by: rwatson, freebsd-current@	2004-03-21 18:53:29 +00:00
Julian Elischer	84eef27df4	Massively up the (artificial) limit on system scope threads in a process from 50 to 500 Also up the number of process scope threads allowed to be in the kernel at one time from 150 to 1500 (per process)	2004-03-21 09:22:38 +00:00
Brian Feldman	150883179a	Add the missing Giant when doing anything with VFS -- in this case, releasing the ktrace vnode.	2004-03-18 18:15:58 +00:00
Jacques Vidrine	3dc19c4677	Verify more bits of the ELF header: the program header table entry size and the ELF version. Also, avoid a potential integer overflow when determining whether the ELF header fits entirely within the first page. Reviewed by: jdp A panic when attempting to execute an ELF binary with a bogus program header table entry size was Reported by: Christer Öberg <christer.oberg@texonet.com>	2004-03-18 16:33:05 +00:00
Alan Cox	9508f75c23	Revise socow_iodone() in light of recent sf_buf changes. Specifically, use sf_buf_free() instead of sf_buf_mext() to consolidate all actions that require the page queues lock in one critical section. While I'm here remove unnecessary splvm() and splx() calls.	2004-03-17 23:25:04 +00:00
John Baldwin	b7e23e826c	- Replace wait1() with a kern_wait() function that accepts the pid, options, status pointer and rusage pointer as arguments. It is up to the caller to copyout the status and rusage to userland if needed. This lets us axe the 'compat' argument and hide all that functionality in owait(), by the way. This also cleans up some locking in kern_wait() since it no longer has to drop locks around copyout() since all the copyout()'s are deferred. - Convert owait(), wait4(), and the various ABI compat wait() syscalls to use kern_wait() rather than wait1() or wait4(). This removes a bit more stackgap usage. Tested on: i386 Compiled on: i386, alpha, amd64	2004-03-17 20:00:00 +00:00
Pawel Jakub Dawidek	9cdb62160b	Fix information leakage. Without this fix it is possible to cheat policies like: - sysctl security.bsd.see_other_[gu]ids=0, - mac_seeotheruids(4), - jail(2) and get full processes list with their arguments. This problem exists from revision 1.62 of kern_proc.c when it was introduced. Reviewed by: nectar, rwatson.	2004-03-17 13:19:43 +00:00
Colin Percival	018e32c194	Adjust the number of processes waiting on a semaphore properly if we're woken up in the middle of sleeping. PR: misc/64347 Reviewed by: tjr MFC after: 7 days	2004-03-17 09:37:13 +00:00
Alan Cox	90ecfebd82	Refactor the existing machine-dependent sf_buf_free() into a machine-dependent function by the same name and a machine-independent function, sf_buf_mext(). Aside from the virtue of making more of the code machine-independent, this change also makes the interface more logical. Before, sf_buf_free() did more than simply undo an sf_buf_alloc(); it also unwired and if necessary freed the page. That is now the purpose of sf_buf_mext(). Thus, sf_buf_alloc() and sf_buf_free() can now be used as a general-purpose ephemeral map cache.	2004-03-16 19:04:28 +00:00
John Baldwin	27de234992	Remove a bogus assertion and readd it in a more correct location. A thread might be enqueued on a sleep queue but not be asleep when the timeout fires if it is blocked on a lock trying to check for pending signals before going to sleep. In the case of fixing up the TDF_TIMEOUT race, however, the thread must be marked asleep. Reported by: kan (the bogus one)	2004-03-16 18:56:22 +00:00
Peter Grehan	721b6196d5	Add powerpc to temporary fix. The new cpu device claims all 'generic' OpenFirmware nexus nodes, since it uses bus_generic_probe. Maybe the cpu device probe should be MD.	2004-03-16 13:34:50 +00:00
David Malone	31c7e8b05b	Nudge Giant as far as I can into kern_open(). Mark open() as MPSAFE. Use kern_open() to implement creat() rather than taking the long route through open(). Mark creat as MPSAFE. While I'm at it, mark nosys() (syscall 0) as MPSAFE, for all the difference it will make.	2004-03-16 10:46:42 +00:00
David Malone	1f325ae35e	Get ready to mark open, creat and nosys as MPSAFE.	2004-03-16 10:41:23 +00:00
Tim J. Robbins	537370d0a4	Make vfs_nmount() public. The Linux emulator needs this in order to mount linprocfs filesystems.	2004-03-16 08:59:37 +00:00
Don Lewis	a961520c13	Rename the wiredlen member of struct sysctl_req to validlen and always set it to avoid the need for a bunch of code that tests whether or not the lock member is set to REQ_WIRED in order to determine which length member should be used. Fix another bug in the oldlen return value code. Fix a potential wired memory leak if a sysctl handler uses sysctl_wire_old_buffer() and returns an EAGAIN error to trigger a retry.	2004-03-16 06:53:03 +00:00
Don Lewis	8ac3e8e940	Don't bother calling vslock() and vsunlock() if oldlen is zero. If vslock() returns ENOMEM, sysctl_wire_old_buffer() should set wiredlen to zero and return zero (success) so that the handler will operate according to sysctl(3): The size of the buffer is given by the location specified by oldlenp before the call, and that location gives the amount of data copied after a successful call and after a call that returns with the error code ENOMEM. The handler will return an ENOMEM error because the zero length buffer will overflow.	2004-03-16 01:28:45 +00:00
John Baldwin	6b55d75c44	Regen for ptrace being safe again.	2004-03-15 18:50:06 +00:00
John Baldwin	8ac61436e6	Drop the proc lock around calls to the MD functions ptrace_single_step(), ptrace_set_pc(), and cpu_ptrace() so that those functions are free to acquire Giant, sleep, etc. We already do a PHOLD/PRELE around them so that it is safe to sleep inside of these routines if necessary. This allows ptrace() to be marked MP safe again as it no longer triggers lock order reversals on Alpha. Tested by: wilko	2004-03-15 18:48:28 +00:00
Pawel Jakub Dawidek	7f4704c01d	Remove sysctl security.jail.list_allowed. This functionality was a misfeature, sysctl was added and turned off by default just to check if nobody complains. Reviewed by: rwatson	2004-03-15 12:10:34 +00:00
Don Lewis	ce8660e395	Revert to the original vslock() and vsunlock() API with the following exceptions: Retain the recently added vslock() error return. The type of the len argument should be size_t, not u_int. Suggested by: bde	2004-03-15 06:42:40 +00:00
Poul-Henning Kamp	bcfe6d8b26	Annual NTP kernel code spring-cleaning: Use int64_t rather than long long for the fixpoint type. Don't discard fractional nanosecond frequency correction.	2004-03-14 15:23:05 +00:00
Peter Wemm	8f650450c6	Set default HZ to 1024 for amd64. The comment in kern/tty.c doesn't apply here because we have 64 bit longs and don't suffer the hz > 169 overflows.	2004-03-14 05:49:31 +00:00
Peter Wemm	a5bdcb2a2f	Make the process_exit eventhandler run without Giant. Add Giant hooks in the two consumers that need it.. processes using AIO and netncp. Update docs. Say that process_exec is called with Giant, but not to depend on it. All our consumers can handle it without Giant.	2004-03-14 02:06:28 +00:00
Peter Wemm	8a412f314e	Move the process_fork event out from under Giant. This one is easy, since there are no consumers in the tree. Document this.	2004-03-14 01:48:32 +00:00
Peter Wemm	78c45c5d66	Regen for mpsafe kse_create()	2004-03-13 22:32:17 +00:00
Peter Wemm	37814395c1	Push Giant down a little further: - no longer serialize on Giant for thread_single*() and family in fork, exit and exec - thread_wait() is mpsafe, assert no Giant - reduce scope of Giant in exit to not cover thread_wait and just do vm_waitproc(). - assert that thread_single() family are not called with Giant - remove the DROP/PICKUP_GIANT macros from thread_single() family - assert that thread_suspend_check() is not called with Giant - remove manual drop_giant hack in thread_suspend_check since we know it isn't held. - remove the DROP/PICKUP_GIANT macros from thread_suspend_check() family - mark kse_create() mpsafe	2004-03-13 22:31:39 +00:00
Robert Watson	5d8dd01da2	Add annotations to mtx_lock(&Giant) in kern_select() and poll() that we always grab Giant, even if we're actually only polling objects that don't require giant. Once socket locking is merged, there will be strong motivation to fix this.	2004-03-13 05:58:57 +00:00
Bruce Evans	0249823ecb	Align the offset in vn_rdwr_inchunks() so that at most the first and the last chunk are misaligned relative to a MAXBSIZE byte boundary. vn_rdwr_inchunks() is used mainly for elf core dumps, and elf sections are usually perfectly misaligned relative to MAXBSIZE, and chunking prevents the file system from doing much realigning. This gives a surprisingly large speedup for core dumps -- from 50 to 13 seconds for a 512MB core dump here. The pessimization was mostly from an interaction of the misalignment with IO_DIRECT. It increased the number of i/o's for each chunk by a factor of 5 (3 writes and 2 read-before-writes instead of 1 write).	2004-03-13 02:56:27 +00:00
Tom Rhodes	a122cca953	These are changes to allow to use the Intel C/C++ compiler (lang/icc) to build the kernel. It doesn't affect the operation of gcc. Most of the changes are just adding __INTEL_COMPILER to #ifdef's, as icc v8 may define __GNUC__ some parts may look strange but are necessary. Additional changes: - in_cksum.[ch]: * use a generic C version instead of the assembly version in the !gcc case (ASM code breaks with the optimizations icc does) -> no bad checksums with an icc compiled kernel Help from: andre, grehan, das Stolen from: alpha version via ppc version The entire checksum code should IMHO be replaced with the DragonFly version (because it isn't guaranteed future revisions of gcc will include similar optimizations) as in: ---snip--- Revision Changes Path 1.12 +1 -0 src/sys/conf/files.i386 1.4 +142 -558 src/sys/i386/i386/in_cksum.c 1.5 +33 -69 src/sys/i386/include/in_cksum.h 1.5 +2 -0 src/sys/netinet/igmp.c 1.6 +0 -1 src/sys/netinet/in.h 1.6 +2 -0 src/sys/netinet/ip_icmp.c 1.4 +3 -4 src/contrib/ipfilter/ip_compat.h 1.3 +1 -2 src/sbin/natd/icmp.c 1.4 +0 -1 src/sbin/natd/natd.c 1.48 +1 -0 src/sys/conf/files 1.2 +0 -1 src/sys/conf/files.amd64 1.13 +0 -1 src/sys/conf/files.i386 1.5 +0 -1 src/sys/conf/files.pc98 1.7 +1 -1 src/sys/contrib/ipfilter/netinet/fil.c 1.10 +2 -3 src/sys/contrib/ipfilter/netinet/ip_compat.h 1.10 +1 -1 src/sys/contrib/ipfilter/netinet/ip_fil.c 1.7 +1 -1 src/sys/dev/netif/txp/if_txp.c 1.7 +1 -1 src/sys/net/ip_mroute/ip_mroute.c 1.7 +1 -2 src/sys/net/ipfw/ip_fw2.c 1.6 +1 -2 src/sys/netinet/igmp.c 1.4 +158 -116 src/sys/netinet/in_cksum.c 1.6 +1 -1 src/sys/netinet/ip_gre.c 1.7 +1 -2 src/sys/netinet/ip_icmp.c 1.10 +1 -1 src/sys/netinet/ip_input.c 1.10 +1 -2 src/sys/netinet/ip_output.c 1.13 +1 -2 src/sys/netinet/tcp_input.c 1.9 +1 -2 src/sys/netinet/tcp_output.c 1.10 +1 -1 src/sys/netinet/tcp_subr.c 1.10 +1 -1 src/sys/netinet/tcp_syncache.c 1.9 +1 -2 src/sys/netinet/udp_usrreq.c 1.5 +1 -2 src/sys/netinet6/ipsec.c 1.5 +1 -2 src/sys/netproto/ipsec/ipsec.c 1.5 +1 -1 src/sys/netproto/ipsec/ipsec_input.c 1.4 +1 -2 src/sys/netproto/ipsec/ipsec_output.c and finally remove sys/i386/i386 in_cksum.c sys/i386/include in_cksum.h ---snip--- - endian.h: * DTRT in C++ mode - quad.h: * we don't use gcc v1 anymore, remove support for it Suggested by: bde (long ago) - assym.h: * avoid zero-length arrays (remove dependency on a gcc specific feature) This change changes the contents of the object file, but as it's only used to generate some values for a header, and the generator knows how to handle this, there's no impact in the gcc case. Explained by: bde Submitted by: Marius Strobl <marius@alchemy.franken.de> - aicasm.c: * minor change to teach it about the way icc spells "-nostdinc" Not approved by: gibbs (no reply to my mail) - bump __FreeBSD_version (lang/icc needs to know about the changes) Incarnations of this patch survive gcc compiles since a loooong time, I use it on my desktop. An icc compiled kernel works since Nov. 2003 (exceptions: snd_* if used as modules), it survives a build of the entire ports collection with icc. Parts of this commit contains suggestions or submissions from Marius Strobl <marius@alchemy.franken.de>. Reviewed by: -arch Submitted by: netchild	2004-03-12 21:45:33 +00:00
Ruslan Ermilov	7700eb86e7	Do what the execve(2) manpage says and enforce what a Strictly Conforming POSIX application should do by disallowing the argv argument to be NULL. PR: kern/33738 Submitted by: Marc Olzheim, Serge van den Boom OK'ed by: nectar	2004-03-12 21:06:20 +00:00
Ken Smith	db322c7eba	This is a temporary fix to solve a regression issue on sparc64 that is caused by the way sparc64 registers its CPUs. Nate will work on a real fix shortly. Approved by: njl	2004-03-12 20:35:21 +00:00
John Baldwin	1ed3e44f22	- Remove old sleep queues. - Remove sleepqueue argument from sleepq_set_timeout() since it is not used.	2004-03-12 19:06:18 +00:00
John Baldwin	595bc82a1d	Fixup a comment.	2004-03-12 19:05:46 +00:00
Dag-Erling Smørgrav	30a058027a	Replace a manual check of a VMIO candidate with vn_canvmio(). This silences an annoying warning in getblk() when VMIO'ing on a directory vnode, which can happen when vfs.vmiodirenable is 1. Bring the warning message in line with reality at the same time. Submitted by: hmp	2004-03-12 12:02:12 +00:00
Poul-Henning Kamp	ceb58ca58f	When I was a kid my work table was one cluttered mess and cleaning it up was a rather overwhelming task. I soon learned that if you don't know where you're going to store something, at least try to pile it next to something slightly related in the hope that a pattern emerges. Apply the same principle to the ffs/snapshot/softupdates code which has leaked into specfs: Add yet another buf-quasi-method and call it from the only two places I can see it can make a difference and implement the magic in ffs_softdep.c where it belongs. It's not pretty, but at least it's one less layer violated.	2004-03-11 18:50:33 +00:00
Poul-Henning Kamp	4d453ef101	Properly vector all bwrite() and BUF_WRITE() calls through the same path and s/BUF_WRITE()/bwrite()/ since it now does the same as bwrite().	2004-03-11 18:02:36 +00:00
Poul-Henning Kamp	2b348f7429	Remove unused mnt_reservedvnlist field.	2004-03-11 16:59:57 +00:00
Poul-Henning Kamp	651b11eaf2	Remove unused second arg to vfinddev(). Don't call addaliasu() on VBLK nodes.	2004-03-11 16:33:11 +00:00
Poul-Henning Kamp	8666b655b5	Correctly account for extra bits in unit numbers when looking for next free unit.	2004-03-11 14:11:02 +00:00
Poul-Henning Kamp	9397290e76	Add clone_setup() function rather than rely on lazy initialization. Requested by: rwatson	2004-03-11 12:58:55 +00:00
John-Mark Gurney	0235bf0261	make sure we had the filedesc lock when calling fdinit when RFCFDG is set on call to rfork. Submitted by: Brian Buchanan Semi-Reviewed by: rwatson	2004-03-10 00:27:36 +00:00
Nate Lawson	29f5b9a8c1	Hook CPUs up to newbus. CPUs will ultimately be a bus driver so that multiple CPU-specific drivers can attach. This is a work in progress so children aren't supported yet. Help from: jhb	2004-03-09 03:37:21 +00:00
Robert Watson	ce89352952	Mark loadaverage callout as CALLOUT_MPSAFE. Reviewed by: jhb	2004-03-08 22:01:19 +00:00
Pawel Jakub Dawidek	dd604e2647	Add two new sysctls: - security.bsd.hardlink_check_uid, when set, means, that unprivileged users are not permitted to create hard links to files not owned by them, - security.bsd.hardlink_check_gid, when set, means, that unprivileged users are not permitted to create hard links to files owned by group they don't belong to. OK'ed by: rwatson	2004-03-08 20:37:25 +00:00
Peter Wemm	a69d88af52	Move a vref call outside of proc locks and Giant. By virtue of the fact that we (p1) are currently running, we hold a reference on p_textvp which means the vnode cannot go away. p2 cannot run yet (and hence cannot exit) so this should be safe to do at this point. As a bonus, it removes a block of under-Giant code that was there to support the vref.	2004-03-08 00:32:34 +00:00
Alan Cox	3eba15c12e	Remove GIANT_REQUIRED from vunmapbuf().	2004-03-07 00:37:18 +00:00
Alan Cox	5fadbfeac2	Giant is not required for vm_thread_new_altkstack().	2004-03-07 00:06:32 +00:00
Alexander Kabaev	ff85a3f0e1	Always call vn_finished_write after vn_start_write was called. All occurrences of 'goto done' after vn_start_write invocation were cleaning up incompletely.	2004-03-06 04:09:54 +00:00
Peter Wemm	5750ee293d	Add a missing part of jhb's previous commit. It looks like he had a patch chunk rejected that he missed. This would manifest as a lock assertion panic at boot (Giant not locked in kern_fork.c). Obtained from: jhb	2004-03-06 00:44:59 +00:00
John Baldwin	6074439965	kthread_exit() no longer requires Giant, so don't force callers to acquire Giant just to call kthread_exit(). Requested by: many	2004-03-05 22:42:17 +00:00
John Baldwin	4ae89b957c	- Push down Giant in exit() and wait(). - Push Giant down a bit in coredump() and call coredump() with the proc lock already held rather than unlocking it only to turn around and relock it. Requested by: peter	2004-03-05 22:39:53 +00:00
John Baldwin	8144e3b884	Lock Giant around the single threading code in exec() to satisfy an assertion in the single threading code.	2004-03-05 22:38:26 +00:00
John Baldwin	5ce2f67858	- Grab a share lock of the proctree lock while looking for a pid due to the process group and session dereferences. Also, check that p_pgrp and p_session are NULL before dereferencing them. - Push down Giant in fork1(). Requested by: peter	2004-03-05 22:37:32 +00:00
Don Lewis	169299398a	Undo the merger of mlock()/vslock and munlock()/vsunlock() and the introduction of kern_mlock() and kern_munlock() in src/sys/kern/kern_sysctl.c 1.150 src/sys/vm/vm_extern.h 1.69 src/sys/vm/vm_glue.c 1.190 src/sys/vm/vm_mmap.c 1.179 because different resource limits are appropriate for transient and "permanent" page wiring requests. Retain the kern_mlock() and kern_munlock() API in the revived vslock() and vsunlock() functions. Combine the best parts of each of the original sets of implementations with further code cleanup. Make the mlock() and vslock() implementations as similar as possible. Retain the RLIMIT_MEMLOCK check in mlock(). Move the most stringent test, which can return EAGAIN, last so that requests that have no hope of ever being satisfied will not be retried unnecessarily. Disable the test that can return EAGAIN in the vslock() implementation because it will cause the sysctl code to wedge. Tested by: Cy Schubert <Cy.Schubert AT komquats.com>	2004-03-05 22:03:11 +00:00
Robert Watson	8cbec0c8dd	The roundrobin callout from sched_4bsd is MPSAFE, so set up the callout as MPSAFE to avoid grabbing Giant. Reviewed by: jhb	2004-03-05 19:27:04 +00:00
Robert Watson	16df17d062	Put "failed to set signal flags properly for ast()" check under DIAGNOSTIC instead of INVARIANTS. INVARIANTS is intended for tests that don't substantially change code flow or behavior (passive), but this test required locking both the proc lock and scheduler lock in order to execute. It also appears to be a very advisory diagnostic as opposed to an invariant violation. Following discussion with: bde	2004-03-05 17:35:28 +00:00
Poul-Henning Kamp	1e0e79c993	Just because the timecounter reads the same value on two samples after each other doesn't mean that nothing happened.	2004-03-04 14:14:23 +00:00
Bruce Evans	c8564ad433	Fixed some style bugs (mainly English usage errors in comments).	2004-03-04 09:56:29 +00:00
Bruce Evans	01e3f3ae4f	Fixed some style bugs (mainly misplaced comments, and totally disordered declarations in acct_process()).	2004-03-04 09:47:09 +00:00
Robert Watson	0b759971a2	Remove unneeded label 'done2' from socket(). We now grab Giant only around socreate(), and don't need it for file descriptor accesses. Submitted by: sam	2004-03-04 01:57:48 +00:00
Dag-Erling Smørgrav	86b5e56351	Use different dummy wait channels to avoid panic in msleep(). Reviewed by: jhb	2004-03-03 23:03:18 +00:00
John Baldwin	efac7951fe	Always assert that the passed in lock is the same as the saved lock in the sleep queue now that the one abnormal case has been fixed.	2004-03-02 15:02:08 +00:00
John Baldwin	959c0c4122	Correct handling of PDROP in msleep() to just skip the mtx_lock() rather than clear the lock pointer so that sleepq_add() still gets the correct lock pointer and doesn't bogusly trip an assertion.	2004-03-02 14:58:33 +00:00
John Baldwin	707559e402	Check for TDF_SINTR before calling sleepq_abort() as there is a narrow race in between sleepq_add() and sleepq_catch_signals() in that setting td_wchan and TDF_SINTR is not atomic to sched_lock but only to the sleepq lock. This band-aid will stop assertion failures, but there is perhaps a larger problem with the sleepq_add/sleepq_catch_signals race that I am not sure how to solve. For the signals case the race is harmless because we always call cursig() after setting TDF_SINTR. However, KSE doesn't do anything in sleepq_catch_signals() to check that this race was lost, so I am unsure if this race is harmful for this specific abort.	2004-03-01 23:07:58 +00:00
Robert Watson	746e5bf09b	Rename dup_sockaddr() to sodupsockaddr() for consistency with other functions in kern_socket.c. Rename the "canwait" field to "mflags" and pass M_WAITOK and M_NOWAIT in from the caller context rather than "1" or "0". Correct mflags pass into mac_init_socket() from previous commit to not include M_ZERO. Submitted by: sam	2004-03-01 03:14:23 +00:00
Scott Long	740d9ba692	Convert the other use of flags to mflags in soalloc().	2004-03-01 01:14:28 +00:00
Robert Watson	2bc87dcfbe	Modify soalloc() API so that it accepts a malloc flags argument rather than a "waitok" argument. Callers now passing M_WAITOK or M_NOWAIT rather than 0 or 1. This simplifies the soalloc() logic, and also makes the waiting behavior of soalloc() more clear in the calling context. Submitted by: sam	2004-02-29 17:54:05 +00:00
Poul-Henning Kamp	2cf6bdac50	Loudly announce WITNESS and DIAGNOSTIC options and warn about reduced performance.	2004-02-29 16:56:54 +00:00
Poul-Henning Kamp	3d6e5ccb06	Make sure to disable the watchdog if we cannot honour the timeout.	2004-02-28 22:01:19 +00:00
Poul-Henning Kamp	4103b7652d	Rename the WATCHDOG option to SW_WATCHDOG and make it use the generic watchdog(9) interface. Make watchdogd(8) perform as watchdog(8) as well, and make it possible to specify a check command to run, timeout and sleep periods. Update watchdog(4) to talk about the generic interface and add new watchdog(8) page.	2004-02-28 20:56:35 +00:00
John Baldwin	44f3b09204	Switch the sleep/wakeup and condition variable implementations to use the sleep queue interface: - Sleep queues attempt to merge some of the benefits of both sleep queues and condition variables. Having sleep queues in a hash table avoids having to allocate a queue head for each wait channel. Thus, struct cv has shrunk down to just a single char * pointer now. However, the hash table does not hold threads directly, but queue heads. This means that once you have located a queue in the hash bucket, you no longer have to walk the rest of the hash chain looking for threads. Instead, you have a list of all the threads sleeping on that wait channel. - Outside of the sleepq code and the sleep/cv code the kernel no longer differentiates between cv's and sleep/wakeup. For example, calls to abortsleep() and cv_abort() are replaced with a call to sleepq_abort(). Thus, the TDF_CVWAITQ flag is removed. Also, calls to unsleep() and cv_waitq_remove() have been replaced with calls to sleepq_remove(). - The sched_sleep() function no longer accepts a priority argument as sleeps no longer inherently bump the priority. Instead, this is solely a property of msleep() which explicitly calls sched_prio() before blocking. - The TDF_ONSLEEPQ flag has been dropped as it was never used. The associated TDF_SET_ONSLEEPQ and TDF_CLR_ON_SLEEPQ macros have also been dropped and replaced with a single explicit clearing of td_wchan. TD_SET_ONSLEEPQ() would really have only made sense if it had taken the wait channel and message as arguments anyway. Now that that only happens in one place, a macro would be overkill.	2004-02-27 18:52:44 +00:00
John Baldwin	e5bb601d87	Drop sched_lock around the wakeup of the parent process after setting the process state to zombie when a process exits to avoid a lock order reversal with the sleepqueue locks. This appears to be the only place that we call wakeup() with sched_lock held.	2004-02-27 18:39:09 +00:00
John Baldwin	dd75b0a90d	Add an implementation of a generic sleep queue abstraction that is used to queue threads sleeping on a wait channel similar to how turnstiles are used to queue threads waiting for a lock. This subsystem will be used as the backend for sleep/wakeup and condition variables initially. Eventually it will also be used to replace the ithread-specific iwait thread inhibitor. Sleep queues are also not locked by sched_lock, so this splits sched_lock up a bit further increasing concurrency within the scheduler. Sleep queues also natively support timeouts on sleeps and interruptible sleeps allowing for the reduction of a lot of duplicated code between the sleep/wakeup and condition variable implementations. For more details on the sleep queue implementation, check the comments in sys/sleepqueue.h and kern/subr_sleepqueue.c.	2004-02-27 18:33:09 +00:00
Dag-Erling Smørgrav	21885af505	Add sysctl_move_oid() which reparents an existing OID.	2004-02-27 17:13:23 +00:00
John Baldwin	5b7de7e19e	Clarify and tweak some comments.	2004-02-27 16:14:27 +00:00
John Baldwin	03129ba97f	Fix _sx_assert() to panic() rather than printf() when an assertion fails and ignore assertions if we have already panicked.	2004-02-27 16:13:44 +00:00
John Baldwin	f4114c3d7f	Replace the ktrace queue's semaphore with a condition variable instead as it is slightly more efficient since we already have a mutex to protect the queue. Ktrace originally used a semaphore more as a proof of concept.	2004-02-26 19:30:22 +00:00
Don Lewis	47934cef8f	Split the mlock() kernel code into two parts, mlock(), which unpacks the syscall arguments and does the suser() permission check, and kern_mlock(), which does the resource limit checking and calls vm_map_wire(). Split munlock() in a similar way. Enable the RLIMIT_MEMLOCK checking code in kern_mlock(). Replace calls to vslock() and vsunlock() in the sysctl code with calls to kern_mlock() and kern_munlock() so that the sysctl code will obey the wired memory limits. Nuke the vslock() and vsunlock() implementations, which are no longer used. Add a member to struct sysctl_req to track the amount of memory that is wired to handle the request. Modify sysctl_wire_old_buffer() to return an error if its call to kern_mlock() fails. Only wire the minimum of the length specified in the sysctl request and the length specified in its argument list. It is recommended that sysctl handlers that use sysctl_wire_old_buffer() should specify reasonable estimates for the amount of data they want to return so that only the minimum amount of memory is wired no matter what length has been specified by the request. Modify the callers of sysctl_wire_old_buffer() to look for the error return. Modify sysctl_old_user to obey the wired buffer length and clean up its implementation. Reviewed by: bms	2004-02-26 00:27:04 +00:00
Robert Watson	049ffe98a8	Assert pipe mutex in pipeselwakeup(), as we manipulate pipe_state in a non-atomic manner. It appears to always be called with the mutex (good).	2004-02-26 00:18:22 +00:00
Robert Watson	094bdd260c	Update comment regarding MAC labels: we no longer pass endpoints into the MAC Framework, just the pipe pair. GC 'hadpeer' used in pipedestroy(), which is no longer needed as we check pipe_present flags on the pair.	2004-02-25 23:30:56 +00:00
Dag-Erling Smørgrav	854a417d92	Whitespace cleanup	2004-02-24 19:31:30 +00:00
Poul-Henning Kamp	652d04726d	Fix two oversights here: don't trash the freelist, and properly cleanup the cdevsw{}. Submitted by: tegge	2004-02-23 08:42:55 +00:00
Brian Feldman	240160d48b	Correct some major SMP-harmful problems in the pipe implementation. First of all, PIPE_EOF is not checked pervasively after everything that can drop the pipe mutex and msleep(), so fix. Additionally, though it might not harm anything, pipelock() and pipeunlock() are not used consistently. Third, the kqueue support functions do not use the pipe mutex correctly. Last, but absolutely not least, is a race: if pipe_busy is not set on the closing side of the pipe, the other side that is trying to write to that will crash BECAUSE PIPE_EOF IS NOT SET! Unconditionally set PIPE_EOF, and get rid of all the lockups/crashes I have seen trying to build ports.	2004-02-22 23:00:14 +00:00
Daniel Eischen	2648efa621	Add sysctls to allow showing threads for pgrp, tty, uid, ruid, and pid.	2004-02-22 17:54:32 +00:00
Pawel Jakub Dawidek	63dba32b76	Reimplement sysctls handling by MAC framework. Now I believe it is done in the right way. Removed some XXMAC cases, we now assume 'high' integrity level for all sysctls, except those with CTLFLAG_ANYBODY flag set. No more magic. Reviewed by: rwatson Approved by: rwatson, scottl (mentor) Tested with: LINT (compilation), mac_biba(4) (functionality)	2004-02-22 12:31:44 +00:00
Colin Percival	b17dd2bcc0	If we're going to panic(), do it before dereferencing a NULL pointer. Reported by: "Ted Unangst" <tedu@coverity.com> Approved by: rwatson (mentor)	2004-02-22 01:11:53 +00:00
Robert Watson	f6a4109212	Update my personal copyrights and NETA copyrights in the kernel to use the "year1-year3" format, as opposed to "year1, year2, year3". This seems to make lawyers more happy, but also prevents the lines from getting excessively long as the years start to add up. Suggested by: imp	2004-02-22 00:33:12 +00:00
Poul-Henning Kamp	ded67d0f77	Check for NODEV return from udev2dev()	2004-02-21 23:52:03 +00:00
Poul-Henning Kamp	cd690b60de	Device megapatch 6/6: This is what we came here for: Hang dev_t's from their cdevsw, refcount cdevsw and dev_t and generally keep track of things a lot better than we used to: Hold a cdevsw reference around all entrances into the device driver, this will be necessary to safely determine when we can unload driver code. Hold a dev_t reference while the device is open. KASSERT that we do not enter the driver on a non-referenced dev_t. Remove old D_NAG code, anonymous dev_t's are not a problem now. When destroy_dev() is called on a referenced dev_t, move it to dead_cdevsw's list. When the refcount drops, free it. Check that cdevsw->d_version is correct. If not, set all methods to the dead_*() methods to prevent entrance into driver. Print warning on console to this effect. The device driver may still explode if it is also incompatible with newbus, but in that case we probably didn't get this far in the first place.	2004-02-21 21:57:26 +00:00
Poul-Henning Kamp	816d62bbb9	Device megapatch 5/6: Remove the unused second argument from udev2dev(). Convert all remaining users of makedev() to use udev2dev(). The semantic difference is that udev2dev() will only locate a pre-existing dev_t, it will not, like makedev(), create a new one. Apart from the tiny well controlled window in D_PSEUDO drivers, there should no longer be any "anonymous" dev_t's in the system now, only dev_t's created with make_dev() and make_dev_alias()	2004-02-21 21:32:15 +00:00
Poul-Henning Kamp	dc08ffec87	Device megapatch 4/6: Introduce d_version field in struct cdevsw, this must always be initialized to D_VERSION. Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.	2004-02-21 21:10:55 +00:00
Poul-Henning Kamp	8e1f1df080	Device megapatch 3/6: Add missing D_TTY flags to various drivers. Complete asserts that dev_t's passed to ttyread(), ttywrite(), ttypoll() and ttykqwrite() have (d_flags & D_TTY) and a struct tty pointer. Make ttyread(), ttywrite(), ttypoll() and ttykqwrite() the default cdevsw methods for D_TTY drivers and remove the explicit initializations in various drivers cdevsw structures.	2004-02-21 20:41:11 +00:00
Poul-Henning Kamp	b0b0334878	Device megapatch 2/6: This commit adds a couple of functions for pseudodrivers to use for implementing cloning in a manner we will be able to lock down (shortly). Basically what happens is that pseudo drivers get a way to ask for "give me the dev_t with this unit number" or alternatively "give me a dev_t with the lowest guaranteed free unit number" (there is unfortunately a lot of non-POLA in the exact numeric value of this number, just live with it for now). Managing the unit number space this way removes the need to use rman(9) to do so in the drivers; this greatly simplifies the code in the drivers because even using rman(9) they still needed to manage their dev_t's anyway. I have taken the if_tun, if_tap, snp and nmdm drivers through the mill, partly because they (ab)used makedev(), but mostly because together they represent three different problems for device-cloning: if_tun and snp are the plain case: just give me a device. if_tap has two kinds of devices, with a flag for device type. nmdm has paired devices (ala pty) and you can clone either of them.	2004-02-21 20:29:52 +00:00
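
Note on commit 0249823ecb (vn_rdwr_inchunks() alignment): the message describes splitting a transfer so that at most the first and last chunks are misaligned relative to a MAXBSIZE boundary. The following is a rough userland sketch of that arithmetic only, not the kernel code. The function name aligned_chunks and the MAXBSIZE value of 65536 are assumptions made for illustration.

```python
MAXBSIZE = 65536  # assumed block size, for illustration only


def aligned_chunks(offset, length, bsize=MAXBSIZE):
    """Split [offset, offset + length) into (offset, size) chunks.

    The first chunk is shortened so that it ends on a bsize boundary;
    every later chunk therefore starts aligned, and only the last one
    may be shorter than bsize.
    """
    chunks = []
    while length > 0:
        # Distance from the current offset to the next bsize boundary.
        chunk = bsize - (offset % bsize)
        if chunk > length:
            chunk = length
        chunks.append((offset, chunk))
        offset += chunk
        length -= chunk
    return chunks


if __name__ == "__main__":
    for off, size in aligned_chunks(100, 200000):
        print(off, size)
```

With a fixed-size split, a transfer starting at a misaligned offset leaves every chunk misaligned; shortening only the first chunk moves all interior chunk boundaries onto multiples of the block size.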

7147 Commits