freebsd-skq

Author	SHA1	Message	Date
jeff	df170ebc61	- It has long been my suspicion that we don't actually need a loop in vn_lock(). Add an assert that will help me gain more confidence that this is correct. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:47:29 +00:00
jeff	2ef7df2a1a	- Add KTR_VFS events to vdestroy, vtruncbuf, vinvalbuf, vfreehead. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:46:37 +00:00
jeff	bf15cf7167	- Add KTR_VFS messages for various name cache related events. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:46:03 +00:00
jeff	c92b8a6f78	- Split one KASSERT in bremfree() into two to aid in debugging. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:45:05 +00:00
jeff	2599199edf	- Dramatically simplify bioqdisksort(). We no longer do ordered bios so most of the code to deal with them has been dead for some time. Simplify the code by doing an insert sort hinted by the current head position. Met with apathy by: arch@	2005-06-12 22:32:29 +00:00
pjd	d2fe610c90	Do not allocate memory while holding a mutex. I introduce a very small race here (some file system can be mounted or unmounted between 'count' calculation and file systems list creation), but it is harmless. Found by: FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/ Reported by: Peter Holm <peter@holm.cc>	2005-06-12 07:03:23 +00:00
pjd	be79126844	Do not allocate memory based on an unchecked argument from userland. It can be used to panic the kernel by giving too big a value. Fix it by moving allocation and size verification into kern_getfsstat(). This even simplifies kern_getfsstat() consumers, but destroys symmetry - memory is allocated inside kern_getfsstat(), but has to be freed by the caller. Found by: FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/ Reported by: Peter Holm <peter@holm.cc>	2005-06-11 14:58:20 +00:00
maxim	e5e29d142d	o setsockopt(2) cannot remove accept filter. [1] o getsockopt(SO_ACCEPTFILTER) always returns success on a listen socket even if we didn't install an accept filter on the socket. o Fix these bugs and add regression tests for them. Submitted by: Igor Sysoev [1] Reviewed by: alfred MFC after: 2 weeks	2005-06-11 11:59:48 +00:00
jeff	306b180d66	- Assert that we're not in the name cache anymore in vdestroy(). Sponsored by: Isilon Systems, Inc.	2005-06-11 08:48:09 +00:00
jeff	8a4fe36603	- Assert that we're not adding a doomed vnode to the name cache. Sponsored by: Isilon Systems, Inc.	2005-06-11 08:47:30 +00:00
jeff	3625e8746b	- Add KTR_VFS tracing to track the life of vnodes. Eventually KTR_VFS events could be added to cover other interesting details. - Add some VNASSERTs to discover places where we access vnodes after they have been uma_zfree'd before we try to free them again. - Add a few more VNASSERTs to vdestroy() to be certain that the vnode is really unused. Sponsored by: Isilon Systems, Inc.	2005-06-11 01:16:46 +00:00
green	ff904ffb64	Fix a serious deadlock with the NFS client. Given a large enough atomic write request, it can fill the buffer cache with the entirety of that write in order to handle retries. However, it never drops the vnode lock, or else it wouldn't be atomic, so it ends up waiting indefinitely for more buf memory that cannot be gotten as it has it all, and it waits in an uncancellable state. To fix this, hibufspace is exported and scaled to a reasonable fraction. This is used as the limit of how much of an atomic write request by the NFS client will be handled asynchronously. If the request is larger than this, it will be turned into a synchronous request which won't deadlock the system. It's possible this value is far off from what is required by some, so it shall be tunable as soon as mount_nfs(8) learns of the new field. The slowdown between an asynchronous and a synchronous write on NFS appears to be on the order of 2x-4x. General nod by: gad MFC after: 2 weeks More testing: wes PR: kern/79208	2005-06-10 23:50:41 +00:00
jeff	d372186b52	- Add curthread to the state that ktr is saving. The extra information is well worth the bloat. - Change the formatting of 'show ktr' slightly to accommodate the additional field. Remove a tab from the verbose output and place the actual trace data after a : so it is easier to understand which part is the event and which is part of the record.	2005-06-10 23:21:29 +00:00
jkoshy	b195d18520	Fix typo. Reviewed by: rwatson, sam	2005-06-10 18:06:59 +00:00
brooks	567ba9b00a	Stop embedding struct ifnet at the top of driver softcs. Instead the struct ifnet or the layer 2 common structure it was embedded in have been replaced with a struct ifnet pointer to be filled by a call to the new function, if_alloc(). The layer 2 common structure is also allocated via if_alloc() based on the interface type. It is hung off the new struct ifnet member, if_l2com. This change removes the size of these structures from the kernel ABI and will allow us to better manage them as interfaces come and go. Other changes of note: - Struct arpcom is no longer referenced in normal interface code. Instead the Ethernet address is accessed via the IFP2ENADDR() macro. To enforce this ac_enaddr has been renamed to _ac_enaddr. - The second argument to ether_ifattach is now always the mac address from driver private storage rather than sometimes being ac_enaddr. Reviewed by: sobomax, sam	2005-06-10 16:49:24 +00:00
ups	5273b0bf9f	Restore preemption of idle threads. Submitted by: jhb	2005-06-10 03:00:29 +00:00
ssouhlal	0835f7b4a9	Allow EVFILT_VNODE events to work on every filesystem type, not just UFS by: - Making the pre and post hooks for the VOP functions work even when DEBUG_VFS_LOCKS is not defined. - Moving the KNOTE activations into the corresponding VOP hooks. - Creating a MNTK_NOKNOTE flag for the mnt_kern_flag field of struct mount that permits filesystems to disable the new behavior. - Creating a default VOP_KQFILTER function: vfs_kqfilter() My benchmarks have not revealed any performance degradation. Reviewed by: jeff, bde Approved by: rwatson, jmg (kqueue changes), grehan (mentor)	2005-06-09 20:20:31 +00:00
scottl	7a9b003ce5	Drat! Committed from the wrong branch. Restore HEAD to its previous goodness.	2005-06-09 19:59:09 +00:00
scottl	6be4cb00a4	Back out 1.68.2.26. It was a mis-guided change that was already backed out of HEAD and should not have been MFC'd. This will restore UDP socket functionality, which will correct the recent NFS problems. Submitted by: rwatson	2005-06-09 19:56:38 +00:00
jkoshy	1d3209ab83	MFP4: - Implement sampling modes and logging support in hwpmc(4). - Separate MI and MD parts of hwpmc(4) and allow sharing of PMC implementations across different architectures. Add support for P4 (EMT64) style PMCs to the amd64 code. - New pmcstat(8) options: -E (exit time counts) -W (counts every context switch), -R (print log file). - pmc(3) API changes, improve our ability to keep ABI compatibility in the future. Add more 'alias' names for commonly used events. - bug fixes & documentation.	2005-06-09 19:45:09 +00:00
ups	4421a08742	Lots of whitespace cleanup. Fix for broken if condition. Submitted by: nate@	2005-06-09 19:43:08 +00:00
pjd	47f442bcb9	Rename sysctl security.jail.getfsstatroot_only to security.jail.enforce_statfs and extend its functionality: value policy 0 show all mount-points without any restrictions 1 show only mount-points below jail's chroot and show only part of the mount-point's path (if jail's chroot directory is /jails/foo and mount-point is /jails/foo/usr/home only /usr/home will be shown) 2 show only mount-point where jail's chroot directory is placed. Default value is 2. Discussed with: rwatson	2005-06-09 18:49:19 +00:00
pjd	5269cbb9cd	Remove process information leak from inside a jail, when security.bsd.see_other_uids is set to 0, etc. One can check if invisible process is active, by doing: # ktrace -p <pid> If ktrace returns 'Operation not permitted' the process is alive and if returns 'No such process' there is no such process. MFC after: 1 week	2005-06-09 18:33:21 +00:00
ups	d9753fcc91	Fix some race conditions for pinned threads that may cause them to run on the wrong CPU. Add IPI support for preempting a thread on another CPU. MFC after: 3 weeks	2005-06-09 18:26:31 +00:00
pjd	3af857a21a	Avoid code duplication in several places by introducing universal kern_getfsstat() function. Obtained from: jhb	2005-06-09 17:44:46 +00:00
imp	6bc1b07ae1	Simplify the code a bit after the bzero().	2005-06-09 05:50:01 +00:00
jeff	f637381b78	- My sub-par public school education has been exposed. s/sentinal/sentinel/ Noticed by: Emil Mikulic	2005-06-09 04:40:20 +00:00
gad	d916eb91e6	Remove the previous parsing-logic for arguments on the '#!'-line of shell scripts. As far as I know, no one has needed the '#!#<' kludge to get at the behavior implemented by the historical parsing.	2005-06-09 00:27:02 +00:00
jeff	b53b83993c	- Under heavy IO load the buf daemon can run for many hundreds of milliseconds due to what is essentially n^2 algorithmic complexity. This change makes the algorithm N*2 instead. This heavy processing manifested itself as skipping in audio and video playback due to the long scheduling latencies and contention on giant by pcm. - flushbufqueues() is now responsible for flushing multiple buffers rather than one at a time. This allows us to save our progress in the list by using a sentinel. We must do the numdirtywakeup() and waitrunningbufspace() here now rather than in buf_daemon(). - Also add a uio_yield() after we have processed the list once for bufs without deps and again for bufs with deps. This is to release Giant and allow any other giant locked code to proceed. Tested by: Many users on current@ Revealed by: schedgraph traces sent by Emil Mikulic & Anthony Ginepro	2005-06-08 20:26:05 +00:00
rodrigc	b2d9df7a8b	Initialize uio_iovcnt to 1 in extattr_list_vp() and extattr_get_vp() PR: kern/79357 Approved by: rwatson	2005-06-08 13:22:10 +00:00
rwatson	1a25bf9ccd	In sem_forkhook(), don't attempt to generate a copy of the process semaphore list on fork() if the process doesn't actually have references to any semaphores. This avoids extra work, as well as potentially asking to allocate storage for 0 references. Found by: avatar MFC after: 1 week	2005-06-08 07:29:22 +00:00
jeff	4a9af33a3f	- Clear OWEINACT prior to calling VOP_INACTIVE to remove the possibility of a vget causing another call to INACTIVE before we're finished.	2005-06-07 22:05:32 +00:00
alc	43bc57303e	In lio_listio(2) change jobref from an int to a long so that lio_listio(LIO_WAIT, ...) works correctly on 64-bit architectures. Reviewed by: tegge	2005-06-07 05:28:21 +00:00
rwatson	ee01c1bf47	Gratuitous renaming of four System V Semaphore MAC Framework entry points to convert _sema() to _sem() for consistency purposes with respect to the other semaphore-related entry points: mac_init_sysv_sema() -> mac_init_sysv_sem() mac_destroy_sysv_sema() -> mac_destroy_sysv_sem() mac_create_sysv_sema() -> mac_create_sysv_sem() mac_cleanup_sysv_sema() -> mac_cleanup_sysv_sem() Congruent changes are made to the policy interface to support this. Obtained from: TrustedBSD Project Sponsored by: SPAWAR, SPARTA	2005-06-07 05:03:28 +00:00
jeff	6eaf0bed4f	- Fix the case where we're not preempting but there is already a newtd as this happens via thread_switchout(). I don't particularly like the structure of the code here. We twice call out to thread code when a thread is voluntarily switching. Once to thread_switchout() and once to slot_fill(), while sched_4BSD does even more work which is redundant to select another thread to use our remaining slice. This should be simplified in the future, but for now I'm only going to fix the bug not the bad design.	2005-06-07 02:59:16 +00:00
dwhite	1d894721d3	Make "show msgbuf" use the pager instead of blasting the whole thing out. MFC after: 3 days	2005-06-06 22:18:32 +00:00
davidxu	c2895a92bd	Fix a bug relevant to debugging: a masked signal unexpectedly interrupts a sleeping thread when the process is being debugged. PR: GNU/77818 Tested by: Sean C. Farley <sean-freebsd at farley org>	2005-06-06 05:13:10 +00:00
gallatin	c6980c2b7a	Allow sends sent from non page-aligned userspace addresses to be considered for zero-copy sends. Reviewed by: alc Submitted by: Romer Gil at Rice University	2005-06-05 17:13:23 +00:00
alc	981752ea4e	Eliminate an unused field from struct aio_liojob.	2005-06-05 05:41:48 +00:00
marius	c74fc16e2d	After some input from bde@ and rereading the datasheet use a MTX_SPIN mutex instead of a MTX_DEF one in order to defer preemption while reading the date and time registers. If we don't manage to read them within the time slot where we are guaranteed that no updates occur we might actually read them during an update in which case the output is undefined.	2005-06-04 23:24:50 +00:00
alc	369cab6800	Eliminate the original method of requesting notification of aio_read(2) and aio_write(2) completion through kevent(2). This method does not work on 64-bit architectures. It was deprecated in FreeBSD 4.4. See revisions 1.87 and 1.70.2.7. Change aio_physwakeup() to call psignal(9) directly rather than indirectly through a timeout(9). Discussed with: bde Correct a bug introduced in revision 1.65 that could result in premature delivery of a signal if an lio_listio(2) consisted of a mixture of direct/raw and queued I/O operations. Observed by: tegge Eliminate a field from struct kaioinfo that is now unused. Reviewed by: tegge	2005-06-04 19:16:33 +00:00
jeff	d33221f20a	- It's 2005 already, I've been working on this for three years.	2005-06-04 09:24:15 +00:00
jeff	c720bf6f50	- Don't SLOT_USE() in the preempt case, sched_add() has already taken the slot for us. Previously, we would take two slots on every preempt, and setrunqueue() would fix it up for us in the non threaded case. The threaded case was simply broken. - Clean up flags, prototypes, comments.	2005-06-04 09:23:28 +00:00
ps	bac0ce72d5	Wrap copyin/copyout for kevent so the 32bit wrapper does not have to malloc nchanges * sizeof(struct kevent) AND/OR nevents * sizeof(struct kevent) on every syscall. Glanced at by: peter, jmg Obtained from: Yahoo! MFC after: 2 weeks	2005-06-03 23:15:01 +00:00
alc	652afa11cf	Synchronize access to the per process aiocb lists in many of the functions.	2005-06-03 05:27:20 +00:00
alc	685bbca37b	In aio_waitcomplete() correct two cases of using an aiocb after freeing it.	2005-06-02 23:14:38 +00:00
alc	47a9b57f58	Giant is no longer required in kern_setrlimit(); remove its acquisition and release. Reviewed by: jhb	2005-06-01 17:52:51 +00:00
kensmith	3a7e275ce6	This patch addresses a standards violation issue. The standards say a file's access time should be updated when it gets executed. A while ago the mechanism used to exec was changed to use a more mmap based mechanism and this behavior was broken as a side-effect of that. A new vnode flag is added that gets set when the file gets executed, and the VOP_SETATTR() vnode operation gets called. The underlying filesystem is expected to handle it based on its own semantics, some filesystems don't support access time at all. Those that do should handle it in a way that does not block, does not generate I/O if possible, etc. In particular vn_start_write() has not been called. The UFS code handles it the same way as it would normally handle the access time if a file was read - the IN_ACCESS flag gets set in the inode but no other action happens at this point. The actual time update will happen later during a sync (which handles all the necessary locking). Got me into this: cperciva Discussed with: a lot with bde, a little with kan Showed patches to: phk, jeffr, standards@, arch@ Minor discussion on: arch@	2005-05-31 19:39:52 +00:00
alc	09a2a99469	Synchronize access to aio_freeproc with a mutex. Eliminate related spl calls. Reduce the scope of Giant in aio_daemon().	2005-05-30 22:26:34 +00:00
alc	0a10c2b5cd	Use the proc mtx to prevent simultaneous changes to p_aioinfo.	2005-05-30 19:33:33 +00:00
alc	3ecc8d1129	Eliminate unnecessary calls to wakeup(); no one sleeps on &aio_freeproc. Eliminate an unused flag, AIOP_SCHED; it's cleared but never set.	2005-05-30 18:02:00 +00:00
rwatson	5010364761	Rebuild generated system call definition files following the addition of the audit event field to the syscalls.master file format. Submitted by: wsalamon Obtained from: TrustedBSD Project	2005-05-30 15:20:21 +00:00
rwatson	370e72b242	Introduce a new field in the syscalls.master file format to hold the audit event identifier associated with each system call, which will be stored by makesyscalls.sh in the sy_auevent field of struct sysent. For now, default the audit identifier on all system calls to AUE_NULL, but in the near future, other BSM event identifiers will be used. The mapping of system calls to event identifiers is many:one due to multiple system calls that map to the same end functionality across compatibility wrappers, ABI wrappers, etc. Submitted by: wsalamon Obtained from: TrustedBSD Project	2005-05-30 15:09:18 +00:00
jeff	33b78c31e9	- Add bufobj_wrefl() to add a write ref to a bufobj that is already locked.	2005-05-30 07:01:18 +00:00
jkoshy	ad86ac4ba4	Kernel hooks to support PMC sampling modes. Reviewed by: alc	2005-05-30 06:29:29 +00:00
alc	f570134192	Eliminate aio_activeproc; it's unused.	2005-05-30 05:25:10 +00:00
alc	404d37a14d	Eliminate aio_bufjobs; it's unused.	2005-05-29 21:29:15 +00:00
rwatson	aaf5c1d3e8	Normalize white space in syscalls.master: try to use tabs before system call types.	2005-05-29 20:20:16 +00:00
rwatson	7035f9f56a	Kernel malloc layers malloc_type allocation over one of two underlying allocators: a set of power-of-two UMA zones for small allocations, and the VM page allocator for large allocations. In order to maintain unified statistics for specific malloc types, kernel malloc maintains a separate per-type statistics pool, which can be monitored using vmstat -m. Prior to this commit, each pool of per-type statistics was protected using a per-type mutex associated with the malloc type. This change modifies kernel malloc to maintain per-CPU statistics pools for each malloc type, and protects writing those statistics using critical sections. It also moves to unsynchronized reads of per-CPU statistics when generating coalesced statistics. To do this, several changes are implemented: - In the previous world order, the statistics memory was allocated by the owner of the malloc type structure, allocated statically using MALLOC_DEFINE(). This embedded the definition of the malloc_type structure into all kernel modules. Move to a model in which a pointer within struct malloc_type points at a UMA-allocated malloc_type_internal data structure owned and maintained by kern_malloc.c, and not part of the exported ABI/API to the rest of the kernel. For the purposes of easing a possible MFC, re-use an existing pointer in 'struct malloc_type', and maintain the current malloc_type structure size, as well as layout with respect to the fields reused outside of the malloc subsystem (such as ks_shortdesc). There are several unused fields as a result of no longer requiring the mutex in malloc_type. - Struct malloc_type_internal contains an array of malloc_type_stats, of size MAXCPU. The structure defined above avoids hard-coding a kernel compile-time value of MAXCPU into kernel modules that interact with malloc. 
- When accessing per-cpu statistics for a malloc type, surround read - modify - update requests with critical_enter()/critical_exit() in order to avoid races during write. The per-CPU fields are written only from the CPU that owns them. - Per-CPU stats now maintain "allocated" and "freed" counters for number of allocations/frees and bytes allocated/freed, since there is no longer a coherent global notion of the totals. When coalescing malloc stats, accept a slight race between reading stats across CPUs, and avoid showing the user a negative allocation count for the type in the event of a race. The global high watermark is no longer maintained for a malloc type, as there is no global notion of the number of allocations. - While tearing up the sysctl() path, also switch to using sbufs. The current "export as text" sysctl format is retained with the same syntax. We may want to change this in the future to export more per-CPU information, such as how allocations and frees are balanced across CPUs. This change results in a substantial speedup of kernel malloc and free paths on SMP, as critical sections (where usable) out-perform mutexes due to avoiding atomic/bus-locked operations. There is also a minor improvement on UP due to the slightly lower cost of critical sections there. The cost of the change to this approach is the loss of a continuous notion of total allocations that can be exploited to track per-type high watermarks, as well as increased complexity when monitoring statistics. Due to carefully avoiding changing the ABI, as well as hardening the ABI against future changes, it is not necessary to recompile kernel modules for this change. However, MFC'ing this change to RELENG_5 will require also MFC'ing optimizations for soft critical sections, which may modify exposed kernel ABIs. The internal malloc API is changed, and modifications to vmstat in order to restore "vmstat -m" on core dumps will follow shortly. 
Several improvements from: bde Statistics approach discussed with: ups Tested by: scottl, others	2005-05-29 13:38:07 +00:00
pjd	58d2b4c193	Fix panic when module is compiled in and it is loaded from loader.conf. Only panic is fixed, module will be still listed in kldstat(8) output. Not sure what is correct fix, because adding unloading code in case of failure to linker_init_kernel_modules() doesn't work.	2005-05-28 23:20:05 +00:00
gad	2add4b872d	Change the way options are parsed on the `#!'-line of a shell-script. Instead of having the kernel parse that line and add an entry to the argument list for each 'separate word' it finds, have it add only one entry which holds all the words found on that line. The old behavior is useful in some situations, but it does not match the way any other operating system will parse that line. This has been discussed in the thread "Bug in #! processing - One More Time" on the freebsd-arch mailing list (starting back on Feb 24, 2005). The first few messages in that thread provide the background in much detail. PR: 16393 Reviewed by: freebsd-arch	2005-05-28 22:42:41 +00:00
pjd	7543a23525	Prevent loading modules which are compiled into the kernel. PR: kern/48759 Submitted by: Paweł Małachowski <pawmal@unia.3lo.lublin.pl> Patch from: demon MFC after: 2 weeks	2005-05-28 22:29:44 +00:00
rwatson	067b94d2d9	Regenerate from syscalls.master.	2005-05-28 14:35:43 +00:00
rwatson	c0001c0613	Mark ntp_gettime() as MSTD, since its system call path will acquire Giant if required.	2005-05-28 14:35:05 +00:00
rwatson	f1dfea9d61	Explicitly acquire Giant around the ntp_gettime() and assert it in the sysctl path. While this code is close to MPSAFE, it may require some additional locking. Mark ntp_gettime1() as GIANT_REQUIRED for now. Suggested by: phk	2005-05-28 14:34:41 +00:00
rwatson	fb931ae00a	Regenerate for updated syscalls.master.	2005-05-28 13:24:05 +00:00
rwatson	ceb26b4c48	Mark the following compatibility system calls as MCOMPAT or MCOMPAT4 based on their simply wrapping MPSAFE implementations of existing MPSAFE system calls: getfsstat() lseek() stat() lstat() truncate() ftruncate() statfs() fstatfs() Note that ogetdirentries() is not marked MPSAFE because it does not share the MPSAFE implementation used for getdirentries(), and requires separate locking to be implemented.	2005-05-28 13:23:42 +00:00
rwatson	c060f4b949	Regenerate from syscalls.master.	2005-05-28 13:13:01 +00:00
rwatson	0439e13c01	Mark quotactl() as MSTD.	2005-05-28 13:12:04 +00:00
rwatson	527c640ad3	Acquire Giant explicitly in quotactl() so that the syscalls.master entry can become MSTD.	2005-05-28 13:11:35 +00:00
rwatson	ff36d1a493	Regenerate from updated syscalls.master.	2005-05-28 13:09:56 +00:00
rwatson	35ffa17830	Mark kenv(2) as MPSAFE, since it appears to be properly locked down.	2005-05-28 13:09:41 +00:00
rwatson	ea08d61a73	Regenerate system call tables from syscalls.master.	2005-05-28 13:08:26 +00:00
rwatson	acb673063c	Also mark the COMPAT4 version of fhstatfs() as MPSAFE.	2005-05-28 13:07:43 +00:00
rwatson	fa7cf37c72	Mark fhopen(), fhstat(), and fhstatfs() as MSTD, since they now acquire Giant themselves.	2005-05-28 12:59:33 +00:00
rwatson	66d882141f	Acquire Giant explicitly in fhopen(), fhstat(), and kern_fhstatfs(), so that we can start to eliminate the presence of non-MPSAFE system call entries in syscalls.master.	2005-05-28 12:58:54 +00:00
pjd	ac435fbb13	Remove (now) unused argument 'td' from cvtstatfs().	2005-05-27 19:23:48 +00:00
pjd	788f75ddb2	Sync locking in freebsd4_getfsstat() with getfsstat(). Giant is probably also needed in kern_fhstatfs().	2005-05-27 19:21:08 +00:00
pjd	2fc56b12a9	Use consistent style in functions I want to modify in the near future.	2005-05-27 19:15:46 +00:00
rwatson	ac1a365e2d	In the current world order, each socket has two mutexes: a mutex that protects socket and receive socket buffer state, and a second mutex to protect send socket buffer state. In some places, the mutex shared between the socket and receive socket buffer will be acquired twice, once by each layer, resulting in some inconsistency, but providing the abstraction benefit of being able to more easily separate the two mutexes in the future if desired. When transitioning a socket to the SS_ISDISCONNECTING or SS_ISDISCONNECTED states, grab the socket/receive socket buffer lock once rather than grabbing it as the socket lock, modifying socket state, then grabbing a second time as the receive lock in order to modify the socket buffer state to indicate no further data can be read. This change is believed to close a race between the change in socket state and the change in socket buffer state, which for a remotely initiated close on a UNIX domain socket, resulted in soreceive() returning ENOTCONN rather than an EOF condition. A similar race still exists in the case of send, however, and is harder to fix as the socket and send socket buffer mutexes are not the same, and we would like to avoid holding combinations of socket mutexes over sb_upcall until we've finished clarifying the locking protocol for upcalls. This change has the side effect of reducing the number of mutex operations to initiate disconnect or perform disconnect on a socket by two. PR: 78824 Reported by: Marc Olzheim <marcolz@stack.nl> MFC after: 2 weeks	2005-05-27 17:16:43 +00:00
davidxu	5a8d3af0d6	Remove thread_upcall_check, it was used to avoid race bug in earlier day's sleep queue code, today the bug no longer exists. please see 04/25/2004 freebsd-threads@ mailing list archive.	2005-05-27 15:57:27 +00:00
davidxu	3fbc6983fa	Remove sleep queue hack, it is no longer needed with current sleep queue. Actually, it causes process to hang when it is being debugged. PR: gnu/77818	2005-05-27 04:27:22 +00:00
jmg	07e93041c6	make stat return a zero'd struct, and be a FIFO again... This is only to fix libc_r since it requires stat to close fd's, and so commented in the code... PR: threads/75795 Reviewed by: ps MFC after: 1 week	2005-05-24 23:42:50 +00:00
cognet	9bcd47137c	Don't set the default of kern.fallback_elf_brand to FreeBSD for arm, as binutils now do the job for us	2005-05-24 22:21:44 +00:00
ups	acfce18a2a	Use low level constructs borrowed from interrupt threads to wait for work in proc0. Remove the TDP_WAKEPROC0 workaround.	2005-05-23 23:01:53 +00:00
pjd	0b89469bda	Protect fsid in freebsd4_getfsstat() in a similar way as it is done in getfsstat().	2005-05-22 23:05:27 +00:00
pjd	a6e0e217b2	If we need to hide fsid, kern_statfs()/kern_fstatfs() will do it for us, so do not duplicate the code in cvtstatfs(). Note, that we now need to clear fsid in freebsd4_getfsstat(). This moves all security related checks from functions like cvtstatfs() and will allow to add more security related stuff (like statfs(2), etc. protection for jails) a bit easier.	2005-05-22 21:52:30 +00:00
njl	9ab8d98ce5	Document that the returned pointer should be freed even if the number of items returned is 0.	2005-05-20 05:04:22 +00:00
ups	c8d93020ce	Fix a bug that caused preemption to happen for a thread in the same ksegrp with the same priority as the currently running thread. This can cause propagate_priority() to panic. Pointy hat to: ups	2005-05-19 01:08:30 +00:00
pjd	4c810f35cd	devfs_first() return value isn't used, remove it.	2005-05-18 22:05:12 +00:00
alc	bcfd7ad6a6	Revert revision 1.164: pmap_qremove() does not require protection by VM_LOCK_GIANT. Discussed with: jeff	2005-05-14 05:09:11 +00:00
jhb	6772446cb8	Actually use the iterating variable in the for loop when trying to avoid overflow. Reported by: Vladislav Shabanov vs at rambler-co dot ru MFC after: 1 week Glanced at: alfred	2005-05-12 20:04:48 +00:00
pjd	c6e5e8f446	We don't use 'mp' variable, but we do want to mount devfs, ehh.	2005-05-12 01:49:51 +00:00
pjd	91b47597be	Remove unused variable introduced by accident in rev 1.168. Found by: Coverity Prevent analysis tool	2005-05-11 19:50:34 +00:00
pjd	f66a55ffcd	Plug memory leaks. Found by: Coverity Prevent analysis tool	2005-05-11 19:27:38 +00:00
kan	4085840a33	Handle theoretical case of vfs_export being called with both MNT_DELEXPORT and MNT_EXPORT flags set. Do not reuse the memory that has just been freed.	2005-05-11 18:25:42 +00:00
cperciva	a199a4f74b	Fix two issues which were missed in FreeBSD-SA-05:08.kmem. Reported by: Uwe Doering	2005-05-07 00:41:36 +00:00
cperciva	e513415af9	If we are going to 1. Copy a NULL-terminated string into a fixed-length buffer, and 2. copyout that buffer to userland, we really ought to 0. Zero the entire buffer first. Security: FreeBSD-SA-05:08.kmem	2005-05-06 02:50:00 +00:00
davidxu	af64c19b3b	Only check signal event, single threading event shouldn't be reported.	2005-05-05 06:42:02 +00:00
emax	a52b6c9ce3	Change m_uiotombuf so it will accept offset at which data should be copied to the mbuf. Offset cannot exceed MHLEN bytes. This is currently used to fix Ethernet header alignment problem on alpha and sparc64. Also change all users of m_uiotombuf to pass proper offset. Reviewed by: jmg, sam Tested by: Sten Spans "sten AT blinkenlights DOT nl" MFC after: 1 week	2005-05-04 18:55:03 +00:00
rwatson	2197ab2d93	Introduce MAC Framework and MAC Policy entry points to label and control access to POSIX Semaphores: mac_init_posix_sem() Initialize label for POSIX semaphore mac_create_posix_sem() Create POSIX semaphore mac_destroy_posix_sem() Destroy POSIX semaphore mac_check_posix_sem_destroy() Check whether semaphore may be destroyed mac_check_posix_sem_getvalue() Check whether semaphore may be queried mac_check_posix_sem_open() Check whether semaphore may be opened mac_check_posix_sem_post() Check whether semaphore may be posted to mac_check_posix_sem_unlink() Check whether semaphore may be unlinked mac_check_posix_sem_wait() Check whether may wait on semaphore Update Biba, MLS, Stub, and Test policies to implement these entry points. For information flow policies, most semaphore operations are effectively read/write. Submitted by: Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net> Sponsored by: DARPA, McAfee, SPARTA Obtained from: TrustedBSD Project	2005-05-04 10:39:15 +00:00
rwatson	182429e8d0	Move definitions of 'struct kuser' and 'struct ksem' from uipc_sem.c to ksem.h so that they are accessible from the MAC Framework for the purposes of labeling and enforcing additional protections. #error if these are included without _KERNEL, since they are not intended (nor installed) for user application use. Submitted by: Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net> Sponsored by: DARPA, SPARTA Obtained from: TrustedBSD Project	2005-05-03 20:21:24 +00:00
jeff	33ac8108e2	- Initialize vfslocked correctly early enough for MAC to compile. - Fix one place where we explicitly drop Giant! Pointy hat to: me Submitted by: Max Laier Warned by: Tinderbox	2005-05-03 16:24:59 +00:00
jeff	79452537e3	- Remove two mtx_asserts that can incorrectly trigger if devstat_end_transaction is called from a fast interrupt. Presently there is no way for mtx_assert to determine that we're not executing in a real thread context. Submitted by: jhusted@isilon.com	2005-05-03 10:58:05 +00:00
jeff	92f17d1e6a	- A vnode may have made its way onto the free list while it was being vgone'd. We must remove it from the freelist before returning in vtryrecycle() or we may get a duplicate free. Reported by: kkenn	2005-05-03 10:56:00 +00:00
jeff	ab437d7b1d	- Use namei to acquire Giant for VFS if it is necessary. Drop the explicit Giant acquisition. - Remove GIANT_REQUIRED in the few remaining cases; the vm and vfs have both been locked.	2005-05-03 10:55:05 +00:00
jeff	451e14446f	- Use NAMEI to pickup Giant if we need it in fpcheckstd().	2005-05-03 10:52:22 +00:00
jeff	617ce99006	- Neither of our image formats require Giant now that the vm and vfs have been locked.	2005-05-03 10:51:38 +00:00
csjp	431f1afe8c	Since it is not possible for curthread to be NULL in this context, drop the check+initialization for a straight initialization. Also assert that curthread will never be NULL just to be sure. Discussed with: rwatson, peter MFC after: 1 week	2005-05-02 02:07:55 +00:00
jeff	dd41538cd8	- All buffers should either be clean or dirty. If neither of these flags are set when we attempt to remove a buffer from a queue we should panic. Hopefully this will catch the source of the wrong bufobj panics. Sponsored by: Isilon Systems, Inc.	2005-05-01 12:00:36 +00:00
jeff	ff4a7a72e9	- Remove spls and comments relating to them.	2005-05-01 01:01:17 +00:00
jeff	22004a9723	- Remove an old splcam hack.	2005-05-01 00:59:55 +00:00
jeff	1bc61f8f0f	- Remove unnecessary spls.	2005-05-01 00:59:34 +00:00
jeff	80bb41c921	- Return EACCES if we're trying to exec on a vp with no object. Errno supplied by: cperciva	2005-05-01 00:58:19 +00:00
sam	17d6060ac9	o enable shutdown of taskqueue threads; the thread servicing the queue checks a new entry in the taskqueue struct each time it wakes up to see if it should terminate o adjust TASKQUEUE_DEFINE_THREAD & co. to record the thread/proc identity for the shutdown rendezvous o replace wakeup after adding a task to a queue with wakeup_one; this helps queues where multiple threads are used to service tasks (e.g. acpi) o remove NULL check of tq_enqueue method; it should never be NULL Reviewed by: dfr, njl	2005-05-01 00:38:11 +00:00
dwhite	c8fa809967	Implement an alternate method to stop CPUs when entering DDB. Normally we use a regular IPI vector, but this vector is blocked when interrupts are disabled. With "options KDB_STOP_NMI" and debug.kdb.stop_cpus_with_nmi set, KDB will send an NMI to each CPU instead. The code also has a context-stuffing feature which helps ddb extract the state of processes running on the stopped CPUs. KDB_STOP_NMI is only useful with SMP and complains if SMP is not defined. This feature only applies to i386 and amd64 at the moment, but could be used on other architectures with the appropriate MD bits. Submitted by: ups	2005-04-30 20:01:00 +00:00
jeff	cb9dfadd87	- Remove long dead splbio() calls and comments relating to the old synchronization mechanism.	2005-04-30 12:18:50 +00:00
jeff	116d72569a	- Don't acquire Giant before calling b_biodone, individual consumers are now required to do so themselves. Sponsored by: Isilon Systems, Inc.	2005-04-30 11:44:22 +00:00
jeff	32c015f463	- Acquire Giant in AIO's iodone routine. VFS will no longer do it for us soon. Sponsored by: Isilon Systems, Inc.	2005-04-30 11:27:31 +00:00
jeff	f9172cb275	- Call VM_LOCK_GIANT in cluster_callback() to protect some pmap calls. VFS will not be acquiring Giant before calling this function anymore. Sponsored by: Isilon Systems, Inc.	2005-04-30 11:26:58 +00:00
jeff	7354fc5e28	- In vnlru_free() remove the vnode from the free list before we call vtryrecycle(). We could sometimes get into situations where two threads could try to recycle the same vnode before this. - vtryrecycle() is now responsible for returning the vnode to the free list if it fails and someone else hasn't done it. - Make a new function vfreehead() which moves a vnode to the head of the free list and use it in vgone() to clean up that code a bit. Sponsored by: Isilon Systems, Inc. Reported by: pho, kkenn	2005-04-30 11:22:40 +00:00
jeff	0e56b01ed6	- Don't vgonel() via vgone() or vrecycle() if the vnode is already doomed. This fixes forced unmounts via nullfs. Reported by: kkenn Sponsored by: Isilon Systems, Inc.	2005-04-27 10:03:21 +00:00
jeff	a80bbe799e	- Stop setting vxthread, we've asserted that it was useless for several weeks now.	2005-04-27 09:17:33 +00:00
jeff	18cd3a36d3	- Stop checking vxthread, we've asserted that it was useless for several weeks.	2005-04-27 09:17:11 +00:00
jeff	f869be5c72	- Pass the ISOPEN flag to namei so filesystems will know we're about to open them or otherwise access the data.	2005-04-27 09:05:19 +00:00
mdodd	56c42039a5	Add missing break. Found by: marcus	2005-04-25 00:48:04 +00:00
sam	0f63abff2a	o eliminate modification of task structures after their run to avoid modify-after-free races when the task structure is malloc'd o shrink task structure by removing ta_flags (no longer needed with avoid fix) and combining ta_pending and ta_priority Reviewed by: dwhite, dfr MFC after: 4 days	2005-04-24 16:52:45 +00:00
davidxu	50a5bbcbfd	Wake up swapper process if needed. PR: kern/78474 Submitted by: Sam Lawrance <boris at brooknet dot com dot au>	2005-04-23 05:06:44 +00:00
davidxu	a247de6aeb	Regen.	2005-04-23 02:38:17 +00:00
davidxu	1b8f9e10e1	Add new syscall thr_new to create thread in atomic, it will inherit signal mask from parent thread, setup TLS and stack, and user entry address. Also support POSIX thread's PTHREAD_SCOPE_PROCESS and PTHREAD_SCOPE_SYSTEM, sysctl is also provided to control the scheduler scope.	2005-04-23 02:36:07 +00:00
davidxu	2155a04472	Change cpu_set_kse_upcall to more generic style, so we can reuse it in other codes. Add cpu_set_user_tls, use it to tweak user register and setup user TLS. I ever wanted to merge it into cpu_set_kse_upcall, but since cpu_set_kse_upcall is also used by M:N threads which may not need this feature, so I wrote a separated cpu_set_user_tls.	2005-04-23 02:32:32 +00:00
jeff	4eaa5ebe1b	- Define the real lock order with cdev and a few vm/vfs related locks. This can be removed once cdev no longer calls free() with the cdev lock held.	2005-04-22 22:43:31 +00:00
jeff	b29bfc6efa	- Check LO_DUPOK as well as LOP_DUPOK when determining whether we should warn about duplicate acquires. Sponsored by: Isilon Systems, Inc.	2005-04-22 22:39:46 +00:00
trhodes	f02068c038	Get the directory structure correct in a comment. Submitted by: Samy Al Bahra	2005-04-22 19:09:12 +00:00
jeff	31cfb7f242	- Disable code which allows getnewvnode() to fail. Many ffs_vget() callers do not correctly deal with failures. This presently risks deadlock problems if dependency processing is held up by failures to allocate a vnode, however, this is better than the situation with the failures. Sponsored by: Isilon Systems, Inc.	2005-04-22 00:57:05 +00:00
jeff	d8b31a35ea	- Add two KASSERTs to prevent us from recycling a buf that is still on a bufobj list. Sponsored by: Isilon Systems, Inc.	2005-04-22 00:53:20 +00:00
marcel	dd5b3be596	Do not conditionally compile the contents of this file upon whether HWPMC_HOOKS is defined. The pmc_cpu_is_() functions in this file are referenced unconditionally by hwpmc(4). This is mostly a stop-gap. The pmc_cpu_is() function should probably be declared inline in <sys/pmc.h> or <sys/pmckern.h> and the function pointers with corresponding SX lock should probably be moved to another file and compiled conditionally upon HWPMC_HOOKS. Ok'd by: jkoshy@	2005-04-20 20:30:59 +00:00
davidxu	0719b14efb	Inherit signal mask for child process in fork1(), RELENG_4 and other *BSD have this behaviour, also it is required by POSIX. PR: kern/80130 Submitted by: Kostik Belousov konstantin.belousov at zoral dot com dot ua	2005-04-20 13:14:52 +00:00
mdodd	7826c585d5	Check sopt_level in uipc_ctloutput() and return early if it is non-zero. This prevents unintended consequences when an application calls things like setsockopt(x, SOL_SOCKET, SO_REUSEADDR, ...) on a Unix domain socket.	2005-04-20 02:57:56 +00:00
pjd	db9ce4609f	Call g_waitidle() before every check the list of holds is empty. Suggested by: phk	2005-04-19 21:44:44 +00:00
davidxu	9452a25d2d	Clear P_STATCHILD earlier to avoid unnecessary retrying.	2005-04-19 12:31:15 +00:00
davidxu	913d50be4f	Oops, forgot to update this file. Fix a race condition between kern_wait() and thread_stopped(). Problem is in kern_wait(), parent process steps through children list, once a child process is skipped, and later even if the child is stopped, parent process still sleeps in msleep(), the race happens if parent masked SIGCHLD. Submitted by : Peter Edwards peadar.edwards at gmail dot com MFC after : 4 days	2005-04-19 08:11:28 +00:00
davidxu	02615ff23a	Fix a race condition between kern_wait() and thread_stopped(). Problem is in kern_wait(), parent process steps through children list, once a child process is skipped, and later even if the child is stopped, parent process still sleeps in msleep(), the race happens if parent masked SIGCHLD. Submitted by : Peter Edwards peadar.edwards at gmail dot com MFC after : 4 days	2005-04-19 08:07:28 +00:00
phk	ed5a7da798	Call g_waitidle() instead of GEOM using the root_mount_hold() KPI. GEOM could (and will) get events as a result of drivers coming in late so a one-shot method is not good enough for GEOM.	2005-04-19 06:23:59 +00:00
jkoshy	dc3444cd91	Bring a working snapshot of hwpmc(4), its associated libraries, userland utilities and documentation into -CURRENT. Bump FreeBSD_version. Reviewed by: alc, jhb (kernel changes)	2005-04-19 04:01:25 +00:00
phk	b7f29c0fc0	Add a named reference-count KPI to hold off mounting of the root filesystem. While we wait for holds to be released, print a list of who holds us back once per second. Use the new KPI from GEOM instead of vfs_mount.c calling g_waitidle(). Use the new KPI also from ata. With ATAmkIII's newbusification, ata could narrowly miss the window and ad0 would not exist when we tried to mount root.	2005-04-18 21:21:26 +00:00
phk	4bd811c8dd	Initialize mountlist_mtx with an MTX_SYSINIT(), we need it to be ready earlier.	2005-04-18 21:11:47 +00:00
rwatson	75030e30f6	Introduce p_canwait() and MAC Framework and MAC Policy entry points mac_check_proc_wait(), which control the ability to wait4() specific processes. This permits MAC policies to limit information flow from children that have changed label, although has to be handled carefully due to common programming expectations regarding the behavior of wait4(). The cr_seeotheruids() check in p_canwait() is #if 0'd for this reason. The mac_stub and mac_test policies are updated to reflect these new entry points. Sponsored by: SPAWAR, SPARTA Obtained from: TrustedBSD Project	2005-04-18 13:36:57 +00:00
rwatson	997d8772c4	Remove end-of-line tabs. MFC after: 3 days	2005-04-18 11:51:10 +00:00
das	5aec008257	Add a sysctl that returns the full path of a process' text file. This information is needed by things like `gdb -p' and Sun's javac, and previously it could only be obtained via procfs	2005-04-18 02:10:37 +00:00

8659 Commits