freebsd-dev

Author	SHA1	Message	Date
Robert Watson	b8ae1cd619	Add sosend_dgram(), a greatly reduced and simplified version of sosend() intended for use solely with atomic datagram socket types, and relies on the previous break-out of sosend_copyin(). Changes to allow UDP to optionally use this instead of sosend() will be committed as a follow-up.	2006-01-13 10:22:01 +00:00
Robert Watson	d7dca9034c	XXX a comment in uipc_usrreq.c that requires updating.	2006-01-13 00:00:32 +00:00
Alfred Perlstein	7d7e053c21	Novel idea, don't print a string if it is NULL! This protects people from loading _really_ old modules, like say from 5.x to a 6.x or 7.x system, like for instance right after an upgrade.	2006-01-12 19:15:14 +00:00
Scott Long	1c3a3b0bd0	The interlock in taskqueue_terminate() is completely wrong for taskqueues that use spinlocks. Remove it for now.	2006-01-11 00:37:13 +00:00
Poul-Henning Kamp	d3e64681d6	Move the old BSD4.3 tty compatibility from (!BURN_BRIDGES && COMPAT_43) to COMPAT_43TTY. Add COMPAT_43TTY to NOTES and */conf/GENERIC Compile tty_compat.c only under the new option. Spit out #warning "Old BSD tty API used, please upgrade." if ioctl_compat.h gets #included from userland.	2006-01-10 09:19:10 +00:00
Scott Long	9df1a6dd61	Add functions and macros and refactor code to make it easier to manage fast taskqueues. The following have been added: TASKQUEUE_FAST_DEFINE() - create a global task queue. an arbitrary execution context. TASKQUEUE_FAST_DEFINE_THREAD() - create a global taskqueue that uses a dedicated kthread. taskqueue_create_fast() - create a local/private taskqueue. These are all complimentary of the standard taskqueue functions. They are primarily useful for fast interrupt handlers that can only use spinlock for synchronization. I personally think that the taskqueue API is starting to get too narrow and hairy, but fixing it will require a major redesign on the API. Such a redesign would be good but would break compatibility with FreeBSD 6.x, so it really isn't desirable at this time. Submitted by: sam	2006-01-10 06:31:12 +00:00
Tor Egge	82be0a5a24	Add marker vnodes to ensure that all vnodes associated with the mount point are iterated over when using MNT_VNODE_FOREACH. Reviewed by: truckman	2006-01-09 20:42:19 +00:00
Scott Long	861a23087b	If destroying a spinlock, make sure that it is exited properly. Submitted by: jhb MFC After: 3 days	2006-01-08 00:18:34 +00:00
John Baldwin	3b783acd2a	Revert an untested local change that crept in with the lo_class changes and subsequently broke the build. This change is supposed to fix the case where doing a mtx_destroy() off a spin mutex while you hold it fails. If it had been tested I would just leave it in, but it hasn't been tested yet, so it will have to wait until later.	2006-01-07 14:03:15 +00:00
David Xu	0a5cd498bb	Add a new feature to thr_kill, if thread ID argument is -1, send signals to all threads except current sender.	2006-01-07 03:15:21 +00:00
Tai-hwa Liang	75d6a87fa3	Trying to fix compilation bustage introduced in rev1.160 by converting a missing lo_class to LO_CLASSINDEX().	2006-01-07 02:07:08 +00:00
John Baldwin	3c6decc327	Trim another pointer from struct lock_object (and thus from struct mtx and struct sx). Instead of storing a direct pointer to a our lock_class struct in lock_object, reserve 4 bits in the lo_flags field to serve as an index into a global lock_classes array that contains pointers to the lock classes. Only debugging code such as WITNESS or INVARIANTS checks and KTR logging need to access the lock_class member, so this shouldn't add any overhead to production kernels. It might add some slight overhead to kernels using those debug options however. As with the previous set of changes to lock_object, this is going to completely obliterate the kernel ABI, so be sure to recompile all your modules.	2006-01-06 18:07:32 +00:00
John Baldwin	af56abaab5	Return error from fget_write() rather than hardcoding EBADF now that fget_write() DTRT. Requested by: bde	2006-01-06 16:34:22 +00:00
John Baldwin	38f63f7e47	Return EBADF rather than EINVAL for FWRITE failure as per POSIX. MFC after: 1 week	2006-01-06 16:30:30 +00:00
John Baldwin	e730167f16	Remove XXX comments complaining that write(2) on a read-only descriptor returns EBADF. That errno is correct and is mandated by POSIX. It also goes back to revision 1.1 of our CVS history (i.e. 4.4BSD). The _fget() function should probably also be upated as it currently returns EINVAL in that case rather than EBADF. (It does return EBADF for reads on a write-only descriptor without any XXX comments oddly enough.) Discussed with: scottl, grog, mjacob, bde	2006-01-05 22:20:31 +00:00
Bjoern A. Zeeb	ba0b6851b4	Minor whitespace cleanup.	2006-01-04 17:40:54 +00:00
Poul-Henning Kamp	d5f1e0d1ef	Deorbit ttymalloc() in preference for ttyalloc()	2006-01-04 09:59:07 +00:00
Poul-Henning Kamp	8607f52b66	Use ttyalloc() instead of ttymalloc()	2006-01-04 09:09:46 +00:00
Poul-Henning Kamp	246b8d448a	Use MTX_SYSINIT to set up the tty list mutex.	2006-01-04 08:22:39 +00:00
Diomidis Spinellis	c3d78136c9	Fix style bug. Prompted by: bde	2006-01-04 07:50:54 +00:00
Diomidis Spinellis	f8ccc6ceb9	Replace tv_usec normalization with the return of EINVAL. This addresses two objections to the previous behavior, and unbreaks the alpha tinderbox build. TODO: update the utimes(2) man page.	2006-01-04 00:47:13 +00:00
Diomidis Spinellis	51339e8593	Normalize the tv_usec part of the utimes(2) arguments to ensure that a file's atime and mtime are only set to correct fractional second values (0-999999000ns with the current interface). Prior to this change users could create files with values outside that range. Moreover, on 32-bit machines tv_usec offsets larger than 4.3s would result in an unnormalized AND wrong timestamp value, due to overflow. MFC after: 1 week	2006-01-03 21:58:21 +00:00
Alexander Leidinger	ef39c05baa	MI changes: - provide an interface (macros) to the page coloring part of the VM system, this allows to try different coloring algorithms without the need to touch every file [1] - make the page queue tuning values readable: sysctl vm.stats.pagequeue - autotuning of the page coloring values based upon the cache size instead of options in the kernel config (disabling of the page coloring as a kernel option is still possible) MD changes: - detection of the cache size: only IA32 and AMD64 (untested) contains cache size detection code, every other arch just comes with a dummy function (this results in the use of default values like it was the case without the autotuning of the page coloring) - print some more info on Intel CPU's (like we do on AMD and Transmeta CPU's) Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue" and report if the cache* values are zero (= bug in the cache detection code) or not. Based upon work by: Chad David <davidc@acns.ab.ca> [1] Reviewed by: alc, arch (in 2004) Discussed with: alc, Chad David, arch (in 2004)	2005-12-31 14:39:20 +00:00
Pawel Jakub Dawidek	d362c40d3a	Improve memguard a bit: - Provide tunable vm.memguard.desc, so one can specify memory type without changing the code and recompiling the kernel. - Allow to use memguard for kernel modules by providing sysctl vm.memguard.desc, which can be changed to short description of memory type before module is loaded. - Move as much memguard code as possible to memguard.c. - Add sysctl node vm.memguard. and move memguard-specific sysctl there. - Add malloc_desc2type() function for finding memory type based on its short description (ks_shortdesc field). - Memory type can be changed (via vm.memguard.desc sysctl) only if it doesn't exist (will be loaded later) or when no memory is allocated yet. If there is allocated memory for the given memory type, return EBUSY. - Implement two ways of memory types comparsion and make safer/slower the default.	2005-12-30 11:45:07 +00:00
Pawel Jakub Dawidek	e7736557d6	Print a warning when we miss vinactive() call, because of race in vget(). The race is very real, but conditions needed for triggering it are rather hard to meet now. When gjournal will be committed (where it is quite easy to trigger) we need to fix it. For now, verify if it is really hard to trigger. Discussed with: kan	2005-12-29 22:52:09 +00:00
John Baldwin	8963150678	patch(1) and I aren't friends today. Axe a duplicate copy of the msleep_spin() function definition. Spotted by: pjd	2005-12-29 21:15:32 +00:00
John Baldwin	0cb7e6aec8	Add a new function msleep_spin() which is a slightly stripped down version of msleep(). msleep_spin() doesn't support changing the priority of the thread while it is asleep nor does it support interruptible sleeps (PCATCH) or the PDROP flag. It does support timeouts however. It differs from msleep() in that the passed in mutex is a spin mutex. This means one can use msleep_spin() and wakeup() with a spin mutex similar to msleep() and wakeup() with a regular mutex. Note that the spin mutex in question needs to come before sched_lock and the sleepq locks in lock order.	2005-12-29 20:57:45 +00:00
John Baldwin	b0e9883e2f	Teach WITNESS_SAVE() and WITNESS_RESTORE() to work with spin locks instead of only sleep locks.	2005-12-29 20:54:25 +00:00
John Baldwin	0a46ed7d56	Fix a deadlock I introduced with the recently added printf to warn about spin locks that are not in the static order list. It is not safe to call printf while holding the witness spin mutex since the console drivers that back printf may need to use their own spin locks which would try to talk to witness when they were locked. Given this, it is possible for one CPU to lock a console driver lock (such as sio) which then tries to lock the witness lock while another CPU is doing the printf while holding the witness lock. Fix this by moving the printf outside of the witness lock. All other printf's in witness are already correct. MFC after: 3 days	2005-12-29 20:53:01 +00:00
John Baldwin	42b6a681bc	Increment kobj_lookup_misses on a miss rather than decrementing it. Otherwise, the miss count is actually -kobj_lookup_misses. Mostly a pedantic change as KOBJ_STATS isn't on by default.	2005-12-29 18:00:42 +00:00
David Xu	3357835a46	Add code to report zombie state. PR: threads/91044 MFC after: 3 days	2005-12-29 13:00:42 +00:00
Alexander Kabaev	3f34977614	Trim trailing whitespace.	2005-12-28 17:13:31 +00:00
Pawel Jakub Dawidek	619f284195	In realloc(9), determine size of the original block based on UMA_SLAB_MALLOC flag. In some circumstances (I observed it when I was doing a lot of reallocs) UMA_SLAB_MALLOC can be set even if us_keg != NULL. If this is the case we have wonderful, silent data corruption, because less data is copied to the newly allocated region than should be. I'm not sure when this bug was introduced, it could be there undetected for years now, as we don't have a lot of realloc(9) consumers and it was hard to reproduce it... ...but what I know for sure, is that I don't want to know who introduce the bug:) It took me two/three days to track it down (of course most of the time I was looking for the bug in my own code).	2005-12-28 01:53:13 +00:00
David Xu	9f8eb3cb52	Use variable i instead of variable cpus as an index to get correct kseq.	2005-12-27 12:02:03 +00:00
Maxim Sobolev	d49b21093c	Fix breakage introduced in the previous commit.	2005-12-26 22:32:52 +00:00
Maxim Sobolev	900b28f9f6	Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually allow executing elf dynamic binaries (aka shared libraries). When it is requested to execute ET_DYN elf image check if this flag is on after we know the elf brand allowing execution if so. PR: kern/87615 Submitted by: Marcin Koziej <creep@desk.pl>	2005-12-26 21:23:57 +00:00
Alan Cox	60bb39431a	Maintain the lock on the vnode for most of exec_elfN_imgact(). Specifically, it is required for the I/O that may be performed by elfN_load_section(). Avoid an obscure deadlock in the a.out, elf, and gzip image activators. Add a comment describing why the deadlock does not occur in the common case and how it might occur in less usual circumstances. Eliminate an unused variable from exec_aout_imgact(). In collaboration with: tegge	2005-12-24 04:57:50 +00:00
David Xu	d7bc12b096	Avoid kernel panic when attaching a process which may not be stopped by debugger, e.g process is dumping core. Only access p_xthread if P_STOPPED_TRACE is set, this means thread is ready to exchange signal with debugger, print a warning if P_STOPPED_TRACE is not set due to some bugs in other code, if there is. The patch has been tested by Anish Mistry mistry.7 at osu dot edu, and is slightly adjusted.	2005-12-24 02:59:29 +00:00
Jeff Roberson	49bdcff518	- Remove and unused include. Submitted by: Antoine Brodin <antoine.brodin@laposte.net>	2005-12-23 21:32:40 +00:00
Poul-Henning Kamp	25f6e35a05	Regenerate sysent with new abort2 system call. Implement abort2(const char reason, int narg, void *args); Submitted by: "Wojciech A. Koszek" <dunstan@freebsd.czest.pl>	2005-12-23 11:58:42 +00:00
Poul-Henning Kamp	5a56b437ec	Add abort2() systemcall.	2005-12-23 11:54:11 +00:00
Poul-Henning Kamp	49091c48d5	Make sbuf_copyin() return the number of bytes copied on success. Submitted by: "Wojciech A. Koszek" <dunstan@freebsd.czest.pl>	2005-12-23 11:49:53 +00:00
Scott Long	d2a401cb70	Create the taskqueue_fast handler with INTR_MPSAFE so that it doesn't run with Giant. MFC After: 3 days	2005-12-23 06:18:33 +00:00
John Baldwin	b439e431bf	Tweak how the MD code calls the fooclock() methods some. Instead of passing a pointer to an opaque clockframe structure and requiring the MD code to supply CLKF_FOO() macros to extract needed values out of the opaque structure, just pass the needed values directly. In practice this means passing the pair (usermode, pc) to hardclock() and profclock() and passing the boolean (usermode) to hardclock_cpu() and hardclock_process(). Other details: - Axe clockframe and CLKF_FOO() macros on all architectures. Basically, all the archs were taking a trapframe and converting it into a clockframe one way or another. Now they can just extract the PC and usermode values directly out of the trapframe and pass it to fooclock(). - Renamed hardclock_process() to hardclock_cpu() as the latter is more accurate. - On Alpha, we now run profclock() at hz (profhz == hz) rather than at the slower stathz. - On Alpha, for the TurboLaser machines that don't have an 8254 timecounter, call hardclock() directly. This removes an extra conditional check from every clock interrupt on Alpha on the BSP. There is probably room for even further pruning here by changing Alpha to use the simplified timecounter we use on x86 with the lapic timer since we don't get interrupts from the 8254 on Alpha anyway. - On x86, clkintr() shouldn't ever be called now unless using_lapic_timer is false, so add a KASSERT() to that affect and remove a condition to slightly optimize the non-lapic case. - Change prototypeof arm_handler_execute() so that it's first arg is a trapframe pointer rather than a void pointer for clarity. - Use KCOUNT macro in profclock() to lookup the kernel profiling bucket. Tested on: alpha, amd64, arm, i386, ia64, sparc64 Reviewed by: bde (mostly)	2005-12-22 22:16:09 +00:00
Alan Cox	373d1a3f8c	Maintain the vnode lock throughout elfN_load_file() rather than releasing it and reacquiring it in vrele(). Consequently, there is no reason to increase the reference count on the vm object caching the file's pages. Reviewed by: tegge Eliminate unused parameters to elfN_load_file().	2005-12-21 18:58:40 +00:00
Alan Cox	ff6f03c7cd	Eliminate an unneeded (vm_prot_t) parameter from two functions. Eliminate unnecessary uses of a local variable. Reviewed by: tegge	2005-12-20 23:42:18 +00:00
Pawel Jakub Dawidek	c505fe7a0f	Reduce Giant scope a bit, as fdrop() is believed to be MPSAFE. The purpose of this change is consistency (not performance improvement:)), as it was hard to tell if fdrop() is MPSAFE or not when I saw it sometimes under the Giant and sometimes without it. Glanced at by: ssouhlal, kan	2005-12-20 00:49:59 +00:00
Pawel Jakub Dawidek	ade9b797a0	vfs_mount_alloc() always returns 0, but what we really want is newly allocated 'struct mount *' pointer, so simplify code a bit and return the pointer directly. Reviewed by: ssouhlal	2005-12-20 00:43:51 +00:00
Pawel Jakub Dawidek	003ba8a000	Use 'td' instead of 'curthread'.	2005-12-19 16:27:13 +00:00
David Xu	a1d4fe69d2	Fix a bug in slice calculation code, current code uses hz but sched_clock() is called by state clock. Submitted by: taku at tackymt dot homeip dot net	2005-12-19 08:26:09 +00:00
Nate Lawson	bd6b217753	Remove the KTR for hardclock completely. It seems to not be useful. Requested by: jhb	2005-12-18 18:11:55 +00:00
Nate Lawson	1335c4df32	Restore KTR_CRITICAL but conditionally compile it in as KTR_SCHED. Requested by: scottl, jhb	2005-12-18 18:10:57 +00:00
Marcel Moolenaar	757686b115	Make our ELF64 type definitions match standards. In particular this means: o Remove Elf64_Quarter, o Redefine Elf64_Half to be 16-bit, o Redefine Elf64_Word to be 32-bit, o Add Elf64_Xword and Elf64_Sxword for 64-bit entities, o Use Elf_Size in MI code to abstract the difference between Elf32_Word and Elf64_Word. o Add Elf_Ssize as the signed counterpart of Elf_Size. MFC after: 2 weeks	2005-12-18 04:52:37 +00:00
Alan Cox	044bbbb523	Correct a long-standing problem in elfN_map_insert(): In order to copy a page to user space, the user space mapping must allow write access. In collaboration with: tegge@ MFC after: 3 weeks	2005-12-17 19:40:47 +00:00
Nate Lawson	8615fd8696	Clean up unused or poorly utilized KTR values. Remove KTR_FS, KTR_KGDB, and KTR_IO as they were never used. Remove KTR_CLK since it was only used for hardclock firing and use KTR_INTR there instead. Remove KTR_CRITICAL since it was only used for crit enter/exit and use KTR_CONTENTION instead.	2005-12-17 03:57:10 +00:00
John Baldwin	5c8b444153	- Use uintfptr_t rather than int for the kernel profiling index (though it really should be a fptrdiff_t if we had that) in profclock(). - Don't try to profile kernel pc's that are >= the kernel lowpc to avoid underflows when computing a profiling index. - Use the PC_TO_I() macro to compute the kernel profiling index rather than doing it inline. Discussed with: bde	2005-12-16 22:11:52 +00:00
John Baldwin	cb49fcd145	Change the addupc_*() functions to use the uintfptr_t type for pc rather than uintptr_t as that is technically more correct.	2005-12-16 22:08:32 +00:00
Alan Cox	584716b08a	Style: The second argument to vm_map_find() should be NULL instead of 0.	2005-12-16 19:14:25 +00:00
Alan Cox	da61b9a69e	Use sf_buf_alloc() instead of vm_map_find() on exec_map to create the ephemeral mappings that are used as the source for three copy operations from kernel space to user space. There are two reasons for making this change: (1) Under heavy load exec_map can fill up causing vm_map_find() to fail. When it fails, the nascent process is aborted (SIGABRT). Whereas, this reimplementation using sf_buf_alloc() sleeps. (2) Although it is possible to sleep on vm_map_find()'s failure until address space becomes available (see kmem_alloc_wait()), using sf_buf_alloc() is faster. Furthermore, the reimplementation uses a CPU private mapping, avoiding a TLB shootdown on multiprocessors. Problem uncovered by: kris@ Reviewed by: tegge@ MFC after: 3 weeks	2005-12-16 18:34:14 +00:00
Xin LI	6ba9ec2d09	In pipe_write(): when uiomove() fails, do not spin on it forever. Submitted by: Kostik Belousov <kostikbel at gmail.com> on -current@ Message-ID: <20051216151016.GE84442@deviant.zoral.local> MFC After: 3 weeks	2005-12-16 18:32:39 +00:00
David Xu	03f70aec67	Replace selwakeuppri with selwakeup, let scheduler figure out appropriate thread priority.	2005-12-16 15:01:16 +00:00
Ed Maste	63e6f39011	When using m_dup(9) to copy more than MHLEN bytes of data, don't create an mbuf chain that starts with a cluster containing just MHLEN bytes. This happened because m_dup called m_get or m_getcl depending on the amount of data to copy, but then always set the size available in the first mbuf to MHLEN. Submitted by: Matt Koivisto <mkoivisto at sandvine dot com> Approved by: jmg Silence from: rwatson (mentor)	2005-12-14 23:34:26 +00:00
Maxime Henrion	e59898ff36	Fix a bunch of SYSCTL_INT() that should have been SYSCTL_ULONG() to match the type of the variable they are exporting. Spotted by: Thomas Hurst <tom@hur.st> MFC after: 3 days	2005-12-14 22:27:48 +00:00
Dag-Erling Smørgrav	0430a5e289	Eradicate caddr_t from the VFS API.	2005-12-14 00:49:52 +00:00
John Baldwin	d272fe53a4	Add a new 'show lock' command to ddb. If the argument has a valid lock class, then it displays various information about the lock and calls a new function pointer in lock_class (lc_ddb_show) to dump class-specific information about the lock as well (such as the owner of a mutex or xlock'ed sx lock). This is easier than staring at hex dumps of locks to figure out who owns the lock, etc. Note that extending lock_class doesn't affect the ABI for any kernel modules as the only code that deals with lock_class structures directly is kern_mutex.c, kern_sx.c, and witness. MFC after: 1 week	2005-12-13 23:14:35 +00:00
David Xu	dd1a6f53ac	Stop fiddling thread priority with msleep, eliminating unnecessary context switching. This improves performance about 30% on UP machine.	2005-12-12 05:04:56 +00:00
Craig Rodrigues	92f44a3f3c	Contributions from XFS for FreeBSD project: - Implement cv_wait_unlock() method which has semantics compatible with the sv_wait() method in IRIX. For cv_wait_unlock(), the lock must be held before entering the function, but is not held when the function is exited. - Implement the existing cv_wait() function in terms of cv_wait_unlock(). Submitted by: kan Feedback from: jhb, trhodes, Christoph Hellwig <hch at infradead dot org>	2005-12-12 00:02:22 +00:00
Alan Cox	05406e6f33	Remove unneeded calls to pmap_remove_all(). The given page is not mapped. Reviewed by: tegge	2005-12-11 22:06:57 +00:00
Andre Oppermann	36ae3fd3c3	Hide the 4k mbuf clusters if the normal clusters are defined to be 4k already. This unbreaks tinderbox. Submitted by: ru	2005-12-10 15:21:04 +00:00
David Xu	3e70c6f047	Fix compiling warning on 64 bits system.	2005-12-09 13:16:48 +00:00
David Xu	f71a882f15	Add a sysctl to force a process to sigexit if a trap signal is being hold by current thread or ignored by current process, otherwise, it is very possible the thread will enter an infinite loop and lead to an administrator's nightmare.	2005-12-09 08:29:29 +00:00
David Xu	d26b1a1fb9	Register itimers_event_hook as a kernel event handler, so I don't have to duplicate code to call it in exec() and exit1().	2005-12-09 05:43:26 +00:00
David Xu	102178d0df	Comment out mqfs_create_link. Inline some small functions.	2005-12-09 02:38:29 +00:00
David Xu	5c4745177f	Now SIGCHLD is always queued.	2005-12-09 02:27:55 +00:00
David Xu	761a4d9423	Cleanup sigqueue sysctl.	2005-12-09 02:26:44 +00:00
Andre Oppermann	d5269a636b	Add an API for jumbo mbuf cluster allocation and also provide 4k clusters in addition to 9k and 16k ones. struct mbuf m_getjcl(int how, short type, int flags, int size) void m_cljget(struct mbuf *m, int how, int size) m_getjcl() returns an mbuf with a cluster of the specified size attached like m_getcl() does for 2k clusters. m_cljget() is different from m_clget() as it can allocate clusters without attaching them to an mbuf. In that case the return value is the pointer to the cluster of the requested size. If an mbuf was specified, it gets the cluster attached to it and the return value can be safely ignored. For size both take MCLBYTES, MJUM4BYTES, MJUM9BYTES, MJUM16BYTES. Reviewed by: glebius Tested by: glebius Sponsored by: TCP/IP Optimization Fundraise 2005	2005-12-08 13:13:06 +00:00
Craig Rodrigues	d5989f64cf	In devfs_first(), set mp->mnt_opt to a valid empty list of mount options instead of leaving it NULL. This eliminates a kernel panic when trying to do a mount -o update of /dev. Noticed by: cjsp Reviewed by: phk	2005-12-08 04:27:53 +00:00
Craig Rodrigues	8539ca4cde	Add "errmsg" to list of global mount options.	2005-12-08 04:09:29 +00:00
Craig Rodrigues	6951bea6c8	Changes imported from XFS for FreeBSD project: - add fields to struct buf (needed by XFS) - 3 private fields: b_fsprivate1, b_fsprivate2, b_fsprivate3 - b_pin_count, count of pinned buffer - add new B_MANAGED flag - add breada() function to initiate asynchronous I/O on read-ahead blocks. - add bufdone_finish(), bpin(), bunpin_wait() functions Patches provided by: kan Reviewed by: phk Silence on: arch@	2005-12-07 03:39:08 +00:00
Alan Cox	8ad398d089	Reduce the scope of the page queues lock in exec_map_first_page(). The vm object lock is sufficient for reading a page's PG_BUSY and busy flags. MFC after: 1 week	2005-12-06 07:39:36 +00:00
David Xu	9da8a32aae	o Turn on MPSAFE flag for mqueuefs. o Reuse si_mqd field in siginfo_t, this also gives userland information about which descriptor is notified.	2005-12-06 06:22:12 +00:00
David Xu	027f760408	Fix a lock leak in childproc_continued().	2005-12-06 05:30:13 +00:00
John Baldwin	5d2162b2f8	Tweak witness handling of lock object to shave 2 pointers off of each lock object (and thus off of each mutex and sx lock): - Rename the all_locks list to pending_locks and only put locks initialized before SI_SUB_WITNESS on the list so that the SI_SUB_WITNESS can add them to witness once it starts up. - Now that pending_locks is only used during early startup, change it from a TAILQ to an STAILQ. This removes a pointer from the STAILQ_ENTRY in struct lock_object. - Since the pending_locks list is only used during the single-threaded early boot it no longer needs to be protected by a mutex, so remove all_mtx. - Since the lo_list member of struct lock_object is now only used during early boot before witness is running, collapse lo_list and lo_witness into a union. This shaves the second pointer off of struct lock_object. - Axe lock_cur_cnt and lock_max_cnt. With these changes, struct mtx shrinks from 36 to 28 bytes on 32-bit platforms and from 72 to 56 bytes on 64-bit platforms. Note that this commit will completely and utterly destroy the kernel ABI, so no MFC. Tested on: alpha, amd64, i386, sparc64	2005-12-05 20:45:24 +00:00
David Xu	052ea11c71	After reading some documents, I realized SIGEV_NONE != NULL, also fix code in mqueue_send_notification to handle SIGEV_NONE.	2005-12-05 04:41:32 +00:00
David Xu	9947b45978	Handle SIGEV_NONE, if notification is SIGEV_NONE, error status and return status will be set, but no notification will be registered. Increase hard limit of maxmsg to 100, so posixtestsuite ports can run.	2005-12-05 03:23:27 +00:00
Ruslan Ermilov	f4e9888107	Fix -Wundef.	2005-12-04 02:12:43 +00:00
Craig Rodrigues	1245b3433e	Add "rdonly" to global_opts, and parse it in vfs_donmount(). Requested by: rwatson	2005-12-03 12:04:20 +00:00
Craig Rodrigues	ec528a3472	- Add "rw" mount option to global_opts. - In vfs_donmount(), parse "ro", "noro", and "rw", in order to set or unset the MNT_RDONLY filesystem flag.	2005-12-03 01:26:27 +00:00
David Xu	5ee2d4ac5a	1. Cleanup including. 2. Set configuration value for CTL_P1003_1B_MESSAGE_PASSING.	2005-12-02 14:09:32 +00:00
David Xu	a6de716d7e	1. Check if message priority is less than MQ_PRIO_MAX. 2. Use getnanotime instead of getnanouptime. 3. Don't free message in _mqueue_send, mqueue_send will free it.	2005-12-02 08:23:49 +00:00
David Xu	77e718f773	1. Set timer configuration values for sysconf(). 2. Set overrun limit to INT_MAX, report ERANGE error if overrun will be greater than INT_MAX.	2005-12-01 07:56:15 +00:00
David Xu	b51d237a67	set signal queue values for sysconf().	2005-12-01 00:25:50 +00:00
David Xu	b2f92ef96b	Last step to make mq_notify conform to POSIX standard, If the process has successfully attached a notification request to the message queue via a queue descriptor, file closing should remove the attachment.	2005-11-30 05:12:03 +00:00
John Baldwin	398293a8de	Fix snderr() to not leak the socket buffer lock if an error occurs in sosend(). Robert accidentally changed the snderr() macro to jump to the out label which assumes the lock is already released rather than the release label which drops the lock in his previous change to sosend(). This should fix the recent panics about returning from write(2) with the socket lock held and the most recent LOR on current@.	2005-11-29 23:07:14 +00:00
Robert Watson	66dd8a6f99	Move zero copy statistics structure before sosend_copyin(). MFC after: 1 month Reported by: tinderbox, sam	2005-11-28 21:45:36 +00:00
John Baldwin	ef627e7da0	When checking to see if a process has exceeded its time limit, flag the process as over the limit when its time is >= to the limit rather than > the limit. Technically, if p->p_rux.rux_runtime.sec == p->p_pcpulimit and p->p_rux.rux_runtime.frac == 0, the process hasn't exceeded the limit yet. However, having the fraction exactly equal to 0 is rather rare, and it is not worth the overhead to handle that edge case. With just the > comparison, the process would have to exceed its limit by almost a second before it was killed. PR: kern/83192 Submitted by: Maciej Zawadzinski mzawadzinski at gmail dot com Reviewed by: bde MFC after: 1 week	2005-11-28 19:09:08 +00:00
Robert Watson	a725629cf8	Break out functionality in sosend() responsible for building mbuf chains and copying in mbufs from the body of the send logic, creating a new function sosend_copyin(). This changes makes sosend() almost readable, and will allow the same logic to be used by tailored socket send routines. MFC after: 1 month Reviewed by: andre, glebius	2005-11-28 18:09:03 +00:00
David Xu	f72b11a40c	Fix a stupid compiler warining, remove a redundant line.	2005-11-27 22:59:47 +00:00
David Xu	47bf2cf9fe	Change filesystem name from mqueue to mqueuefs for style consistent. Suggested by: rwatson	2005-11-27 08:30:12 +00:00
David Xu	6829585c43	Regen.	2005-11-27 01:23:31 +00:00
David Xu	94e1294b06	Don't use OpenBSD syscall numbers, instead, use new syscall numbers for POSIX message queue. Suggested by: rwatson	2005-11-27 01:13:00 +00:00
Robert Watson	5e758b9561	Add several aliases for existing clockid_t names to indicate that the application wishes to request high precision time stamps be returned: Alias Existing CLOCK_REALTIME_PRECISE CLOCK_REALTIME CLOCK_MONOTONIC_PRECISE CLOCK_MONOTONIC CLOCK_UPTIME_PRECISE CLOCK_UPTIME Add experimental low-precision clockid_t names corresponding to these clocks, but implemented using cached timestamps in kernel rather than a full time counter query. This offers a minimum update rate of 1/HZ, but in practice will often be more frequent due to the frequency of time stamping in the kernel: New clockid_t name Approximates existing clockid_t CLOCK_REALTIME_FAST CLOCK_REALTIME CLOCK_MONOTONIC_FAST CLOCK_MONOTONIC CLOCK_UPTIME_FAST CLOCK_UPTIME Add one additional new clockid_t, CLOCK_SECOND, which returns the current second without performing a full time counter query or cache lookup overhead to make sure the cached timestamp is stable. This is intended to support very low granularity consumers, such as time(3). The names, visibility, and implementation of the above are subject to change, and will not be MFC'd any time soon. The goal is to expose lower quality time measurement to applications willing to sacrifice accuracy in performance critical paths, such as when taking time stamps for the purpose of rescheduling select() and poll() timeouts. Future changes might include retrofitting the time counter infrastructure to allow the "fast" time query mechanisms to use a different time counter, rather than a cached time counter (i.e., TSC). NOTE: With different underlying time mechanisms exposed, using different time query mechanisms in the same application may result in relative non-monoticity or the appearance of clock stalling for a single clockid_t, as a cached time stamp queried after a precision time stamp lookup may be "before" the time returned by the earlier live time counter query.	2005-11-27 00:55:18 +00:00
David Xu	7023331e59	Regen.	2005-11-26 12:45:22 +00:00
David Xu	655291f2ae	Bring in experimental kernel support for POSIX message queue.	2005-11-26 12:42:35 +00:00
Craig Rodrigues	5e6b93a014	In nmount() and vfs_donmount(), do not strcmp() the options in the iovec directly. We need to copyin() the strings in the iovec before we can strcmp() them. Also, when we want to send the errmsg back to userspace, we need to copyout()/copystr() the string. Add a small helper function vfs_getopt_pos() which takes in the name of an option, and returns the array index of the name in the iovec, or -1 if not found. This allows us to locate an option in the iovec without actually manipulating the iovec members. directly via strcmp(). Noticed by: kris on sparc64	2005-11-23 20:51:15 +00:00
John Polstra	ba3612cd5c	Fix a bug in the loop in sonewconn that makes room on the incomplete connection queue for a new connection. It was removing connections from the wrong list. Submitted by: Paul Mikesell Sponsored by: Isilon Systems MFC after: 1 week	2005-11-22 01:55:29 +00:00
Marcel Moolenaar	60b7823989	Fix bug introduced in revision 1.186: When all file systems have a time stamp of zero, which is the case for example when the root file system is on a read-only medium, we ended up not calling inittodr() at all. A potential uncleanliness existed as well. If multiple file systems had a non-zero time stamp, we would call inittodr() multiple times. While this should not be harmful, it's definitely not ideal. Fix both issues by iterating over the mounted file systems to find the largest time stamp and call inittodr() exactly once with that time stamp. This could of course be a zero time stamp if none of the mounted file systems have a non-zero time stamp. In that case the annoying errors mentioned in the commit log for revision 1.186 still haven't been avoided. The bottom line is that inittodr() should not complain when it gets a time base of zero. At the time of this commit only alpha seems to have that problem. Reported by: Dario Freni (saturnero at freesbie dot org) MFC after: 1 week	2005-11-19 21:51:45 +00:00
Craig Rodrigues	425e5b6268	Parse more mount options in vfs_donmount(), before vfs_domount() is called. It looks like there are lots of different mount flags checked in vfs_domount(), so we need to do the parsing for these particular mount flags earlier on. The new flags parsed are: async, force, multilabel, noasync, noatime, noclusterr, noclusterw, noexec, nosuid, nosymfollow, snapshot, suiddir, sync, union. Existing code which uses mount() to mount UFS filesystems is not affected, but new code which uses nmount() to mount UFS filesystems should behave better.	2005-11-19 21:22:21 +00:00
Andre Oppermann	5eefd88949	Add CLOCK_UPTIME to clock_gettime(2) reporting the current uptime measured in SI seconds. Sponsored by: TCP/IP Optimization Fundraise 2005	2005-11-18 16:51:13 +00:00
Craig Rodrigues	8fd860cfa1	In vfs_nmount(), check to see if "update" mount option was passed in, and if so, set MNT_UPDATE filesystem flag. vfs_nmount() calls vfs_domount(), and there is special logic inside vfs_domount() if MNT_UPDATE is set. This is very important when we want to do an update mount of the root filesystem, using nmount().	2005-11-18 01:31:10 +00:00
Pyun YongHyeon	dc22aef2ae	Prefer NULL to 0. Add missing lock/unlock in sysctl handler. Protect accessing NULL pointer when resource allocation was failed. style(9) Reviewed by: scottl MFC after: 1 week	2005-11-17 08:56:21 +00:00
Olivier Houchard	481a1fe19e	Add a new sysctl, kern.elf[32\|64].can_exec_dyn. When set to 1, one can execute a ET_DYN binary (shared object). This does not make much sense, but some linux scripts expect to be able to execute /lib/ld-linux.so.2 (ldd comes to mind). The sysctl defaults to 0. MFC after: 3 days	2005-11-14 22:24:00 +00:00
Robert Watson	c5c9bd5b72	In ktr_getrequest(), acquire ktrace_mtx earlier -- while the race currently present is minor and offers no real semantic issues, it also doesn't make sense since an earlier lockless check has already occurred. Also hold the mutex longer, over a manipulation of per-process ktrace state, which requires synchronization. MFC after: 1 month Pointed out by: jhb	2005-11-14 19:30:09 +00:00
Robert Watson	2c255e9df6	Moderate rewrite of kernel ktrace code to attempt to generally improve reliability when tracing fast-moving processes or writing traces to slow file systems by avoiding unbounded queueuing and dropped records. Record loss was previously possible when the global pool of records become depleted as a result of record generation outstripping record commit, which occurred quickly in many common situations. These changes partially restore the 4.x model of committing ktrace records at the point of trace generation (synchronous), but maintain the 5.x deferred record commit behavior (asynchronous) for situations where entering VFS and sleeping is not possible (i.e., in the scheduler). Records are now queued per-process as opposed to globally, with processes responsible for committing records from their own context as required. - Eliminate the ktrace worker thread and global record queue, as they are no longer used. Keep the global free record list, as records are still used. - Add a per-process record queue, which will hold any asynchronously generated records, such as from context switches. This replaces the global queue as the place to submit asynchronous records to. - When a record is committed asynchronously, simply queue it to the process. - When a record is committed synchronously, first drain any pending per-process records in order to maintain ordering as best we can. Currently ordering between competing threads is provided via a global ktrace_sx, but a per-process flag or lock may be desirable in the future. - When a process returns to user space following a system call, trap, signal delivery, etc, flush any pending records. - When a process exits, flush any pending records. - Assert on process tear-down that there are no pending records. - Slightly abstract the notion of being "in ktrace", which is used to prevent the recursive generation of records, as well as generating traces for ktrace events. Future work here might look at changing the set of events marked for synchronous and asynchronous record generation, re-balancing queue depth, timeliness of commit to disk, and so on. I.e., performing a drain every (n) records. MFC after: 1 month Discussed with: jhb Requested by: Marc Olzheim <marcolz at stack dot nl>	2005-11-13 13:27:44 +00:00
Craig Rodrigues	d5328381f1	style(9) cleanups. Spotted by: njl, bde	2005-11-12 14:41:44 +00:00
Robert Watson	71909edec8	Significant refactoring of the accounting code to improve locking and VFS happiness, as well as correct other bugs: - Replace notion of current and saved accounting credential/vnode with a single credential/vnode and an acct_suspended flag. This simplifies the accounting logic substantially. - Replace acct_mtx with acct_sx, a sleepable lock held exclusively during reconfiguration and space polling, but shared during log entry generation. This avoids holding a mutex over sleepable VFS operations. - Hold the sx lock over the duration of the I/O so that the vnode I/O cannot occur after vnode close, which could occur previously if accounting was disabled as a process exited. - Write the accounting log entry with Giant conditionally acquired based on the file system where the log is stored. Previously, the accounting code relied on the caller acquiring Giant. - Acquire Giant conditionally in the accounting callout based on the file system where the accounting log is stored. Run the callout MPSAFE. - Expose acct_suspended via a read-only sysctl so it is possibly to programmatically determine whether accounting is suspended or not without attempting to parse logs. - Check both acct_vp and acct_suspended lock-free before entering the accounting sx lock in acct(). - When accounting is disabled due to a VBAD vnode (i.e., forceable unmount), generate a log message indicating accounting has been disabled. - Correct a long-standing bug in how free space is calculated and compared to the required space: generate and compare signed results, not unsigned results, or negative free space will cause accounting to not be suspended when required, or worse, incorrectly resumed once negative free space is reached. MFC after: 2 weeks	2005-11-12 10:45:13 +00:00
David Xu	413cf3bbe1	Make sure only remove one signal by debugger.	2005-11-12 04:22:16 +00:00
Robert Watson	a0ec558af0	Correct a number of serious and closely related bugs in the UNIX domain socket file descriptor garbage collection code, which is intended to detect and clear cycles of orphaned file descriptors that are "in-flight" in a socket when that socket is closed before they are received. The algorithm present was both run at poor times (resulting in recursion and reentrance), and also buggy in the presence of parallelism. In order to fix these problems, make the following changes: - When there are in-flight sockets and a UNIX domain socket is destroyed, asynchronously schedule the garbage collector, rather than running it synchronously in the current context. This avoids lock order issues when the garbage collection code reenters the UNIX domain socket code, avoiding lock order reversals, deadlocks, etc. Run the code asynchronously in a task queue. - In the garbage collector, when skipping file descriptors that have entered a closing state (i.e., have f_count == 0), re-test the FDEFER flag, and decrement unp_defer. As file descriptors can now transition to a closed state, while the garbage collector is running, it is no longer the case that unp_defer will remain an accurate count of deferred sockets in the mark portion of the GC algorithm. Otherwise, the garbage collector will loop waiting waiting for unp_defer to reach zero, which it will never do as it is skipping file descriptors that were marked in an earlier pass, but now closed. - Acquire the UNIX domain socket subsystem lock in unp_discard() when modifying the unp_rights counter, or a read/write race is risked with other threads also manipulating the counter. While here: - Remove #if 0'd code regarding acquiring the socket buffer sleep lock in the garbage collector, this is not required as we are able to use the socket buffer receive lock to protect scanning the receive buffer for in-flight file descriptors on the socket buffer. - Annotate that the description of the garbage collector implementation is increasingly inaccurate and needs to be updated. - Add counters of the number of deferred garbage collections and recycled file descriptors. This will be removed and is here temporarily for debugging purposes. With these changes in place, the unp_passfd regression test now appears to be passed consistently on UP and SMP systems for extended runs, whereas before it hung quickly or panicked, depending on which bug was triggered. Reported by: Philip Kizer <pckizer at nostrum dot com> MFC after: 2 weeks	2005-11-10 16:06:04 +00:00
Robert Watson	742be7821c	Add the f_msgcount field to the set of struct file fields printed in show files. MFC after: 1 week	2005-11-10 13:26:29 +00:00
Robert Watson	2be165c93e	Expanet of details printed for each file descriptor to include it's garbage collection flags. Reformat generally to make this fit and leave some room for future expansion. MFC after: 1 week	2005-11-10 11:35:59 +00:00
Robert Watson	b4e507aafa	Add a DDB "show files" command to list the current open file list, some state about each open file, and identify the first process in the process table that references the file. This is helpful in debugging leaks of file descriptors. MFC after: 1 week	2005-11-10 10:42:50 +00:00
Doug White	16e35dcc39	This is a workaround for a complicated issue involving VFS cookies and devfs. The PR and patch have the details. The ultimate fix requires architectural changes and clarifications to the VFS API, but this will prevent the system from panicking when someone does "ls /dev" while running in a shell under the linuxulator. This issue affects HEAD and RELENG_6 only. PR: 88249 Submitted by: "Devon H. O'Dell" <dodell@ixsystems.com> MFC after: 3 days	2005-11-09 22:03:50 +00:00
Robert Watson	f8a9ed1fa7	Fix typo in recent comment tweak. Submitted by: jkim MFC after: 1 week	2005-11-09 22:02:02 +00:00
Robert Watson	923633b4b5	In closef(), remove the assumption that there is a thread associated with the file descriptor. When a file descriptor is closed as a result of garbage collecting a UNIX domain socket, the file descriptor will not have any associated thread, so the logic to identify advisory locks held by that thread is not appropriate. Check the thread for NULL to avoid this scenario. Expand an existing comment to say a bit more about this. MFC after: 1 week	2005-11-09 20:54:25 +00:00
Warner Losh	5d56add2ba	General consensus is that it would be even better to run this in a thread context. While it doesn't matter too much at the moment, in the future we could be back in the same boat if/when more restrictions are placed (or enforced) in a SWI. Suggested by: njl, bde, jhb, scottl	2005-11-09 16:22:56 +00:00
John Baldwin	26b7bd707c	Use intptr_t casts to convert void * <--> int to make 64-bit archs happy.	2005-11-09 15:15:59 +00:00
Ruslan Ermilov	303989a2f3	Use sparse initializers for "struct domain" and "struct protosw", so they are easier to follow for the human being.	2005-11-09 13:29:16 +00:00
David Xu	f4d8522334	WIFxxx macros requires an int type but p_xstat is short, convert it to int before using the macros. Bug reported by : Pyun YongHyeon pyunyh at gmail dot com	2005-11-09 07:58:16 +00:00
Warner Losh	161604d863	Kick off the suspend sequence from the keyboard in a SWI rather than in the hardware interrupt context (even if it is likely just an ithread). We don't document that suspend/resume routines are run from such a context and some of the things that happen in those routines aren't interrupt safe. Since there's no real need to run from that context, this restores assumptions that suspend routines have made. This fixes Thierry Herbelot's 'Trying to sleep while sleeping is prohibited' problem.	2005-11-09 07:32:01 +00:00
Warner Losh	2002eaadb7	Clarify panic message, I parsed the old one 'trying to sleep while sleeping'	2005-11-09 07:28:52 +00:00
Craig Rodrigues	4560dfb5b1	For nmount(), allow a text string error message to be propagated back to user-space if a parameter named "errmsg" is passed into the iovec. Used in conjunction with vfs_mount_error(), more useful error messages than errno can be passed back to userspace when mounting a filesystem fails. Discussed with: phk, pjd	2005-11-09 02:26:38 +00:00
David Xu	323fe56580	In aio_waitcomplete, do not return EAGAIN if no other threads have started aio, instead, initialize aio management structure if it hasn't been done, the reason to adjust this behavior is to make it a bit friendly for threaded program, consider two threads, one submits aio_write, and another just calls aio_waitcomplete to wait any I/O to be completed and recycle the aio requests, before submitter doing any I/O, the recycler wants to wait in kernel. This also fixes inconsistency with other aio syscalls.	2005-11-08 23:48:32 +00:00
David Xu	c20cedbfc9	Make sure pending SIGCHLD is removed from previous parent when process is attached or detached.	2005-11-08 23:28:12 +00:00
John Baldwin	2a522eb9d3	Various and sundry cleanups: - Use curthread for calls to knlist_delete() and add a big comment explaining why as well as appropriate assertions. - Use TAILQ_FOREACH and TAILQ_FOREACH_SAFE instead of handrolling them. - Use fget() family of functions to lookup file objects instead of grovelling around in file descriptor tables. - Destroy the aio_freeproc mutex if we are unloaded. Tested on: i386	2005-11-08 17:43:05 +00:00
Christian S.J. Peron	576068804d	Giant clean up for exit(2) -Change unconditional aquisition of Giant to only pickup Giant if the vnode for the controlling tty resides on a non-mpsafe file system. -Pickup Giant around executable vnode reference counting operations only if the executable resides on a non-mpsafe file system. -If this process is being traced, pickup Giant for trace file reference count operations only if it resides on a non-mpsafe file system. Discussed with: jhb Tested by: kris	2005-11-08 17:11:03 +00:00
David Xu	ebceaf6dc7	Add support for queueing SIGCHLD same as other UNIX systems did. For each child process whose status has been changed, a SIGCHLD instance is queued, if the signal is stilling pending, and process changed status several times, signal information is updated to reflect latest process status. If wait() returns because the status of a child process is available, pending SIGCHLD signal associated with the child process is discarded. Any other pending SIGCHLD signals remain pending. The signal information is allocated at the same time when proc structure is allocated, if process signal queue is fully filled or there is a memory shortage, it can still send the signal to process. There is a booting time tunable kern.sigqueue.queue_sigchild which can control the behavior, setting it to zero disables the SIGCHLD queueing feature, the tunable will be removed if the function is proved that it is stable enough. Tested on: i386 (SMP and UP)	2005-11-08 09:09:26 +00:00
Craig Rodrigues	84e69560b6	Add utility function to propagate mount errors as text string messages. Discussed with: phk	2005-11-08 04:13:39 +00:00
Gleb Smirnoff	49d46b616e	Fix panic string in last revision.	2005-11-06 16:47:59 +00:00
Andre Oppermann	cd5bb63b3d	Free only those mbuf+clusters back to the packet zone that were allocated from there. All others get broken up and free'd individually to the mbuf and cluster zones. The packet zone is a secondary zone to the mbuf zone. There is currently a limitation in UMA which prevents decreasing the packet zone stock when the mbuf and cluster zone are drained and all their members are part of packets. When this is fixed this change may be reverted.	2005-11-05 19:43:55 +00:00
Andre Oppermann	a5f7708723	Fix a logic error introduced with mandatory mbuf cluster refcounting and freeing of mbufs+clusters back to the packet zone.	2005-11-04 17:20:53 +00:00
David Xu	8f0371f19d	Fix name compatible problem with POSIX standard. the sigval_ptr and sigval_int really should be sival_ptr and sival_int. Also sigev_notify_function accepts a union sigval value but not a pointer.	2005-11-04 09:41:00 +00:00
John Baldwin	091e8307d0	Add stoppcbs[] arrays on Alpha and sparc64 and have each CPU save its current context in the IPI_STOP handler so that we can get accurate stack traces of threads on other CPUs on these two archs like we do now on i386 and amd64. Tested on: alpha, sparc64	2005-11-03 21:08:20 +00:00
John Baldwin	55de4dcab6	Fix 'show allpcpu' ddb command on non-x86. CPU IDs are in the range 0 .. mp_maxid, not 0 .. mp_maxid - 1. The result was that the highest numbered CPU was skipped on Alpha and sparc64. MFC after: 1 week	2005-11-03 21:06:29 +00:00
Pawel Jakub Dawidek	2a143d5bf5	Detect memory leaks when memory type is being destroyed. This is very helpful for detecting kernel modules memory leaks on unload. Discussed and reviewed by: rwatson	2005-11-03 13:48:59 +00:00
David Xu	4c0fb2cfff	Support sending realtime signal information via signal queue, realtime signal memory is pre-allocated, so kernel can always notify user code.	2005-11-03 05:25:26 +00:00
David Xu	6d7b314b14	Cleanup some signal interfaces. Now the tdsignal function accepts both proc pointer and thread pointer, if thread pointer is NULL, tdsignal automatically finds a thread, otherwise it sends signal to given thread. Add utility function psignal_event to send a realtime sigevent to a process according to the delivery requirement specified in struct sigevent.	2005-11-03 04:49:16 +00:00
David Xu	d1f16b4d2e	Oops, don't change tdsignal call.	2005-11-03 01:38:49 +00:00
David Xu	44355392b4	Add thread_find() function to search a thread by lwpid.	2005-11-03 01:34:08 +00:00
Paul Saab	1471f287e1	Calling setrlimit from 32bit apps could potentially increase certain limits beyond what should be capiable in a 32bit process, so we must fixup the limits. Reviewed by: jhb	2005-11-02 21:18:07 +00:00
Andre Oppermann	56a4e45ab3	Mandatory mbuf cluster reference counting and groundwork for UMA based jumbo 9k and jumbo 16k cluster support. All mbuf's with external storage attached are mandatory reference counted. For clusters and jumbo clusters UMA provides the refcnt storage directly. It does not have to be separatly allocated. Any other type of external storage gets its own refcnt allocated from an UMA mbuf refcnt zone instead of normal kernel malloc. The refcount API MEXT_ADD_REF() and MEXT_REM_REF() is no longer publically accessible. The proper m_* functions have to be used. mb_ctor_clust() and mb_dtor_clust() both handle normal 2K as well as 9k and 16k clusters. Clusters and jumbo clusters may be obtained without attaching it immideatly to an mbuf. This is for high performance cluster allocation in network drivers where mbufs are attached after the cluster has been filled. Tested by: rwatson Sponsored by: TCP/IP Optimizations Fundraise 2005	2005-11-02 16:20:36 +00:00

1 2 3 4 5 ...

9058 Commits