freebsd-dev

Author	SHA1	Message	Date
Peter Wemm	62919d788b	Jumbo-commit to enhance 32 bit application support on 64 bit kernels. This is good enough to be able to run a RELENG_4 gdb binary against a RELENG_4 application, along with various other tools (eg: 4.x gcore). We use this at work. ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace, procfs and core dumps. procfs_regs.c: vary the format of proc/XXX/regs depending on the client and target application. procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their sscanf fails. They expect an unsigned long. imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps. sys_process.c: handle 32 bit consumers debugging 32 bit targets. Note that 64 bit consumers can still debug 32 bit targets. IA64 has got stubs for ia32_reg.c. Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't implemented in the 32/64 wrapper yet. We also make a tiny patch to gdb to pacify it over conflicting formats of ld-elf.so.1. Approved by: re	2005-06-30 07:49:22 +00:00
Peter Wemm	48033188a6	Second part of commit for moving KDB_STOP_NMI from opt_global.h to opt_kdb.h. Found by: kris Approved by: re	2005-06-30 03:38:10 +00:00
Peter Wemm	2de92a386e	Conditionally weaken sys_generic.c rev 1.136 to allow certain dubious ioctl numbers in backwards compatibility mode. eg: an IOC_IN ioctl with a size of zero. Traditionally this was what you did before IOC_VOID existed, and we had some established users of this in the tree, namely procfs. Certain 3rd party drivers with binary userland components also have this too. This is necessary to have 4.x and 5.x binaries use these ioctls. We found this at work when trying to run 4.x binaries. Approved by: re	2005-06-30 00:19:08 +00:00
Peter Wemm	f0c6706de9	Move the KDB_STOP_NMI option from opt_global.h to opt_kdb.h Approved by: re	2005-06-29 23:23:16 +00:00
Mike Silbersack	a7b844d2be	Fix the false memory modified after free messages some users have been reporting - in my previous change, I missed the case where a mbuf from the packet zone was freed back to the mbuf/packet keg, where it was subsequently put into the mbuf zone and found not to contain the expected trash. This change adds the necessary trash_dtor call inside mb_fini_pack so that everything is correct. Thanks to Bosko for finding the bug and showing me how secondary zones work. Approved by: re (dwhite)	2005-06-29 08:18:26 +00:00
Dima Dorfman	1ee6b74603	Fix fdcheckstd to pass the file descriptor along through vn_open. When opening a device, devfs_open needs the file descriptor to install its own fileops. Failing to pass the file descriptor causes the vnode to be returned with the regular vnops, which will cause a panic on the first read or write because devfs_specops is not meant to support those operations. This bug caused a panic after exec'ing any set[ug]id program with fds 0..2 closed (i.e., if any action had to be taken by fdcheckstd, we would panic if the exec'd program ever tried to use any of those descriptors). Reviewed by: phk Approved by: re (scottl)	2005-06-25 03:34:49 +00:00
Pawel Jakub Dawidek	400a74bff8	Close another information leak in ktrace(2): one was able to find active process groups outside a jail, etc. by using ktrace(2). OK'ed by: rwatson Approved by: re (scottl) MFC after: 1 week	2005-06-24 12:05:24 +00:00
Peter Wemm	4da0d332f4	Move HWPMC_HOOKS into its own opt_hwpmc_hooks.h file. It doesn't merit being in opt_global.h and forcing a global recompile when only a few files reference it. Approved by: re	2005-06-24 00:16:57 +00:00
Pawel Jakub Dawidek	06a137780b	Actually only protect mount-point if security.jail.enforce_statfs is set to 2. If we don't return statistics about requested file systems, system tools may not work correctly or at all. Approved by: re (scottl)	2005-06-23 22:13:29 +00:00
John Baldwin	57dbcb11db	Fix a typo in a comment. Approved by: re (scottl)	2005-06-23 21:55:43 +00:00
Mike Silbersack	121f050976	Change the mbuf, mbuf cluster, and mbuf packet allocation routines so that the UMA "trash" allocator is used - this ensures that any writes to a freed mbuf should provoke a panic. Only enabled under INVARIANTS, of course. Approved by: re (scottl)	2005-06-23 04:33:39 +00:00
Pawel Jakub Dawidek	b0d9aedd28	Add missing unlock. Pointy hat to: pjd Approved by: re (dwhite)	2005-06-21 21:17:02 +00:00
John Baldwin	943928c905	Simplify the storming logic and remove a variable as a result. Approved by: re (dwhite)	2005-06-20 19:32:23 +00:00
Garance A Drosehn	bd3aace7e4	Fix a panic which could occur parsing #!-lines in a shell-script. If the #!-line had multiple whitespace characters after the interpreter name, and it did not have any options, then the code would do nasty things trying to process a (non-existent) option-string which "ended before it began"... Submitted by: Morten Johansen Approved by: re (dwhite)	2005-06-19 02:21:03 +00:00
Jeff Roberson	b770ff6eb2	- Try to catch the wrong bufobj panics a little earlier. I believe they are actually caused by a buf with both VNCLEAN and VNDIRTY set. In the traces it is clear that the buf is removed from the dirty queue while it is actually on the clean queue which leaves the tail pointer set. Assert that both flags are not set in buf_vlist_add and buf_vlist_remove. Sponsored by: Isilon Systems, Inc. Approved by: re (blanket vfs)	2005-06-18 18:17:03 +00:00
Jeff Roberson	32b6dcd8a4	- Fix a leaked reference to a vnode via v_dd. We rely on cache_purge() and cache_zap() to clear the v_dd pointers when a directory vnode is forcibly discarded. For this to work, all vnodes with v_dd pointers to a directory must also have name cache entries linked via v_cache_dst to that dvp otherwise we could not find them at cache_purge() time. The following code snippet could break this guarantee by unlinking a directory before fetching its dotdot. The dotdot lookup would initialize the v_dd field of the unlinked directory which could never be cleared. To fix this we don't initialize v_dd for orphaned vnodes. printf("rmdir: %d\n", rmdir("../foo")); /* foo is cwd */ printf("chdir: %d\n", chdir("..")); printf("%s\n", getwd(NULL)); Sponsored by: Isilon Systems, Inc. Discovered by: kkenn Approved by: re (blanket vfs)	2005-06-17 01:05:13 +00:00
Ken Smith	c0cac8dc20	Remove a variable that became unused as a result of changes made in v1.139. This was only exposed if MALLOC_PROFILE was defined. Submitted by: Gary Jennejohn Pointy hat: rwatson Approved by: re (scottl)	2005-06-16 16:01:46 +00:00
Jeff Roberson	114a1006a8	- Change holdcnt use around vnode recycling. We now always keep a holdcnt ref while we're calling vgone(). This prevents transient refs from re-adding us to the free list. Previously, a vfree() triggered via vinvalbuf() getting rid of all of a vnode's pages could place a partially destructed vnode on the free list where vtryrecycle() could find it. The first call to vtryrecycle would hang up on the vnode lock, but when it failed it would place a now dead vnode onto the free list, and another call to vtryrecycle() would free an already free vnode. There were many complications of having a zero ref count while freeing which can now go away. - Change vdropl() to release the interlock before returning. All callers now respect this, so vdropl() directly frees VI_DOOMED vnodes once the last ref is dropped. This means that we'll never have VI_DOOMED vnodes on the free list. - Separate v_incr_usecount() into v_incr_usecount(), v_decr_usecount() and v_decr_useonly(). The incr/decr split is so that incr usecount can return with the interlock still held while decr drops the interlock so it can call vdropl() which will potentially free the vnode. The calling function can't drop the lock of an already free'd node. v_decr_useonly() drops a usecount without dropping the hold count. This is done so the usecount reaches zero in vput() before we recycle, however the holdcount is still 1 which prevents any new references from placing the vnode back on the free list. - Fix vnlrureclaim() to vhold the vnode since it doesn't do a vget(). We wouldn't want vnlrureclaim() to bump the usecount since this has different semantics. Also change vnlrureclaim() to do a NOWAIT on the vn_lock. When this function runs we're usually in a desperate situation and we wouldn't want to wait for any specific vnode to be released. - Fix a bunch of misc comments to reflect the new behavior. - Add vhold() and vdrop() to vflush() for the same reasons that we do in vlrureclaim(). Previously we held no reference and a vnode could have been freed while we were waiting on the lock. - Get rid of vlruvp() and vfreehead(). Neither is used. vlruvp() should really be rethought before it's reintroduced. - vgonel() always returns with the vnode locked now and never puts the vnode back on a free list. The vnode will be freed as soon as the last reference is released. Sponsored by: Isilon Systems, Inc. Debugging help from: Kris Kennaway, Peter Holm Approved by: re (blanket vfs)	2005-06-16 04:41:42 +00:00
Jeff Roberson	bdcd9f26b0	- Fix insertions of bios which represent data earlier than anything else in the queue. The insertion sort assumed this had already been taken care of. Spotted by: Antoine Brodin Approved by: re (scottl)	2005-06-15 23:32:07 +00:00
Jeff Roberson	7a06fe49dc	- Add and enhance asserts related to the wrong bufobj panic. Sponsored by: Isilon Systems, Inc. Approved by: re (blanket vfs)	2005-06-14 20:32:27 +00:00
Jeff Roberson	12c2dcde40	- In reassignbuf() add many asserts to validate the head and tail pointers of the clean and dirty lists. This is in an attempt to catch the wrong bufobj problem sooner. - In vgonel() don't acquire an extra reference in the active case, the vnode lock and VI_DOOMED protect us from recursively cleaning. - Also in vgonel() clean up some stale comments. Sponsored by: Isilon Systems, Inc. Approved by: re (blanket vfs)	2005-06-14 20:31:53 +00:00
Jeff Roberson	dbb3ec5ce3	- Remove vnode lock asserts at the end of vfs syscalls. These asserts were used to ensure that we weren't exiting the syscall with a lock still held. This wasn't safe, however, because we'd already executed a vput() and on a loaded system the vnode may have been free'd by the time we assert. This functionality is also handled by the td_locks assert in userret, which doesn't tell you what the syscall was, but will at least panic before you deadlock. Sponsored by: Isilon Systems, Inc. Discovered by: Peter Holm Approved by: re (blanket vfs)	2005-06-14 01:14:40 +00:00
Jeff Roberson	b930d85380	- Don't make vgonel() globally visible, we want to change its prototype anyway and it's not used outside of vfs_subr.c. - Change vgonel() to accept a parameter which determines whether or not we'll put the vnode on the free list when we're done. - Use the new vgonel() parameter rather than VI_DOOMED to signal our intentions in vtryrecycle(). - In vgonel() return if VI_DOOMED is already set, this vnode has already been reclaimed. Sponsored by: Isilon Systems, Inc.	2005-06-13 06:26:55 +00:00
Jeff Roberson	6bd8103d33	- Clear v_dd in cache_zap() instead of cache_purge() as cache_purge() may not be called in all cases where we free the cnp. Sponsored by: Isilon Systems, Inc.	2005-06-13 05:59:59 +00:00
Jeff Roberson	d598b04d44	- It has long been my suspicion that we don't actually need a loop in vn_lock(). Add an assert that will help me gain more confidence that this is correct. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:47:29 +00:00
Jeff Roberson	d2ad9baac0	- Add KTR_VFS events to vdestroy, vtruncbuf, vinvalbuf, vfreehead. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:46:37 +00:00
Jeff Roberson	eff2d12635	- Add KTR_VFS messages for various name cache related events. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:46:03 +00:00
Jeff Roberson	748c92fbad	- Split one KASSERT in bremfree() into two to aid in debugging. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:45:05 +00:00
Jeff Roberson	f19f6869cf	- Dramatically simplify bioqdisksort(). We no longer do ordered bios so most of the code to deal with them has been dead for some time. Simplify the code by doing an insert sort hinted by the current head position. Met with apathy by: arch@	2005-06-12 22:32:29 +00:00
Pawel Jakub Dawidek	65ac438c8f	Do not allocate memory while holding a mutex. I introduce a very small race here (some file system can be mounted or unmounted between 'count' calculation and file systems list creation), but it is harmless. Found by: FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/ Reported by: Peter Holm <peter@holm.cc>	2005-06-12 07:03:23 +00:00
Pawel Jakub Dawidek	3a996d6e91	Do not allocate memory based on not-checked argument from userland. It can be used to panic the kernel by giving too big value. Fix it by moving allocation and size verification into kern_getfsstat(). This even simplifies kern_getfsstat() consumers, but destroys symmetry - memory is allocated inside kern_getfsstat(), but has to be freed by the caller. Found by: FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/ Reported by: Peter Holm <peter@holm.cc>	2005-06-11 14:58:20 +00:00
Maxim Konovalov	922a5d9c2b	o setsockopt(2) cannot remove accept filter. [1] o getsockopt(SO_ACCEPTFILTER) always returns success on listen socket even if we didn't install accept filter on the socket. o Fix these bugs and add regression tests for them. Submitted by: Igor Sysoev [1] Reviewed by: alfred MFC after: 2 weeks	2005-06-11 11:59:48 +00:00
Jeff Roberson	d6dbf760a6	- Assert that we're not in the name cache anymore in vdestroy(). Sponsored by: Isilon Systems, Inc.	2005-06-11 08:48:09 +00:00
Jeff Roberson	1b2da2d0fa	- Assert that we're not adding a doomed vnode to the name cache. Sponsored by: Isilon Systems, Inc.	2005-06-11 08:47:30 +00:00
Jeff Roberson	9aa0eba464	- Add KTR_VFS tracing to track the life of vnodes. Eventually KTR_VFS events could be added to cover other interesting details. - Add some VNASSERTs to discover places where we access vnodes after they have been uma_zfree'd before we try to free them again. - Add a few more VNASSERTs to vdestroy() to be certain that the vnode is really unused. Sponsored by: Isilon Systems, Inc.	2005-06-11 01:16:46 +00:00
Brian Feldman	cc3149b1ea	Fix a serious deadlock with the NFS client. Given a large enough atomic write request, it can fill the buffer cache with the entirety of that write in order to handle retries. However, it never drops the vnode lock, or else it wouldn't be atomic, so it ends up waiting indefinitely for more buf memory that cannot be gotten as it has it all, and it waits in an uncancellable state. To fix this, hibufspace is exported and scaled to a reasonable fraction. This is used as the limit of how much of an atomic write request by the NFS client will be handled asynchronously. If the request is larger than this, it will be turned into a synchronous request which won't deadlock the system. It's possible this value is far off from what is required by some, so it shall be tunable as soon as mount_nfs(8) learns of the new field. The slowdown between an asynchronous and a synchronous write on NFS appears to be on the order of 2x-4x. General nod by: gad MFC after: 2 weeks More testing: wes PR: kern/79208	2005-06-10 23:50:41 +00:00
Jeff Roberson	37ee2d8dd4	- Add curthread to the state that ktr is saving. The extra information is well worth the bloat. - Change the formatting of 'show ktr' slightly to accommodate the additional field. Remove a tab from the verbose output and place the actual trace data after a : so it is easier to understand which part is the event and which is part of the record.	2005-06-10 23:21:29 +00:00
Joseph Koshy	8c61b21927	Fix typo. Reviewed by: rwatson, sam	2005-06-10 18:06:59 +00:00
Brooks Davis	fc74a9f93a	Stop embedding struct ifnet at the top of driver softcs. Instead the struct ifnet or the layer 2 common structure it was embedded in have been replaced with a struct ifnet pointer to be filled by a call to the new function, if_alloc(). The layer 2 common structure is also allocated via if_alloc() based on the interface type. It is hung off the new struct ifnet member, if_l2com. This change removes the size of these structures from the kernel ABI and will allow us to better manage them as interfaces come and go. Other changes of note: - Struct arpcom is no longer referenced in normal interface code. Instead the Ethernet address is accessed via the IFP2ENADDR() macro. To enforce this ac_enaddr has been renamed to _ac_enaddr. - The second argument to ether_ifattach is now always the mac address from driver private storage rather than sometimes being ac_enaddr. Reviewed by: sobomax, sam	2005-06-10 16:49:24 +00:00
Stephan Uphoff	3ea6bbc59a	Restore preemption of idle threads. Submitted by: jhb	2005-06-10 03:00:29 +00:00
Suleiman Souhlal	679985d03a	Allow EVFILT_VNODE events to work on every filesystem type, not just UFS by: - Making the pre and post hooks for the VOP functions work even when DEBUG_VFS_LOCKS is not defined. - Moving the KNOTE activations into the corresponding VOP hooks. - Creating a MNTK_NOKNOTE flag for the mnt_kern_flag field of struct mount that permits filesystems to disable the new behavior. - Creating a default VOP_KQFILTER function: vfs_kqfilter() My benchmarks have not revealed any performance degradation. Reviewed by: jeff, bde Approved by: rwatson, jmg (kqueue changes), grehan (mentor)	2005-06-09 20:20:31 +00:00
Scott Long	8bde93598a	Drat! Committed from the wrong branch. Restore HEAD to its previous goodness.	2005-06-09 19:59:09 +00:00
Scott Long	76b472dbda	Back out 1.68.2.26. It was a mis-guided change that was already backed out of HEAD and should not have been MFC'd. This will restore UDP socket functionality, which will correct the recent NFS problems. Submitted by: rwatson	2005-06-09 19:56:38 +00:00
Joseph Koshy	f263522a45	MFP4: - Implement sampling modes and logging support in hwpmc(4). - Separate MI and MD parts of hwpmc(4) and allow sharing of PMC implementations across different architectures. Add support for P4 (EMT64) style PMCs to the amd64 code. - New pmcstat(8) options: -E (exit time counts) -W (counts every context switch), -R (print log file). - pmc(3) API changes, improve our ability to keep ABI compatibility in the future. Add more 'alias' names for commonly used events. - bug fixes & documentation.	2005-06-09 19:45:09 +00:00
Stephan Uphoff	a3f2d84279	Lots of whitespace cleanup. Fix for broken if condition. Submitted by: nate@	2005-06-09 19:43:08 +00:00
Pawel Jakub Dawidek	820a0de9a9	Rename sysctl security.jail.getfsstatroot_only to security.jail.enforce_statfs and extend its functionality: value policy 0 show all mount-points without any restrictions 1 show only mount-points below jail's chroot and show only part of the mount-point's path (if jail's chroot directory is /jails/foo and mount-point is /jails/foo/usr/home only /usr/home will be shown) 2 show only mount-point where jail's chroot directory is placed. Default value is 2. Discussed with: rwatson	2005-06-09 18:49:19 +00:00
Pawel Jakub Dawidek	4eb7c9f6c9	Remove process information leak from inside a jail, when security.bsd.see_other_uids is set to 0, etc. One can check if invisible process is active, by doing: # ktrace -p <pid> If ktrace returns 'Operation not permitted' the process is alive and if returns 'No such process' there is no such process. MFC after: 1 week	2005-06-09 18:33:21 +00:00
Stephan Uphoff	f3a0f87396	Fix some race conditions for pinned threads that may cause them to run on the wrong CPU. Add IPI support for preempting a thread on another CPU. MFC after: 3 weeks	2005-06-09 18:26:31 +00:00
Pawel Jakub Dawidek	13a82b9623	Avoid code duplication in several places by introducing universal kern_getfsstat() function. Obtained from: jhb	2005-06-09 17:44:46 +00:00
Warner Losh	139f16505d	Simplify the code a bit after the bzero().	2005-06-09 05:50:01 +00:00
Jeff Roberson	a3d239bc29	- My sub-par public school education has been exposed. s/sentinal/sentinel/ Noticed by: Emil Mikulic	2005-06-09 04:40:20 +00:00
Garance A Drosehn	386ea9321d	Remove the previous parsing-logic for arguments on the '#!'-line of shell scripts. As far as I know, no one has needed the '#!#<' kludge to get at the behavior implemented by the historical parsing.	2005-06-09 00:27:02 +00:00
Jeff Roberson	9e879a5ee0	- Under heavy IO load the buf daemon can run for many hundreds of milliseconds due to what is essentially n^2 algorithmic complexity. This change makes the algorithm N*2 instead. This heavy processing manifested itself as skipping in audio and video playback due to the long scheduling latencies and contention on giant by pcm. - flushbufqueues() is now responsible for flushing multiple buffers rather than one at a time. This allows us to save our progress in the list by using a sentinel. We must do the numdirtywakeup() and waitrunningbufspace() here now rather than in buf_daemon(). - Also add a uio_yield() after we have processed the list once for bufs without deps and again for bufs with deps. This is to release Giant and allow any other giant locked code to proceed. Tested by: Many users on current@ Revealed by: schedgraph traces sent by Emil Mikulic & Anthony Ginepro	2005-06-08 20:26:05 +00:00
Craig Rodrigues	1209e08faf	Initialize uio_iovcnt to 1 in extattr_list_vp() and extattr_get_vp() PR: kern/79357 Approved by: rwatson	2005-06-08 13:22:10 +00:00
Robert Watson	e2f7a83d6b	In sem_forkhook(), don't attempt to generate a copy of the process semaphore list on fork() if the process doesn't actually have references to any semaphores. This avoids extra work, as well as potentially asking to allocate storage for 0 references. Found by: avatar MFC after: 1 week	2005-06-08 07:29:22 +00:00
Jeff Roberson	fae89dce3e	- Clear OWEINACT prior to calling VOP_INACTIVE to remove the possibility of a vget causing another call to INACTIVE before we're finished.	2005-06-07 22:05:32 +00:00
Alan Cox	b490cc72b2	In lio_listio(2) change jobref from an int to a long so that lio_listio(LIO_WAIT, ...) works correctly on 64-bit architectures. Reviewed by: tegge	2005-06-07 05:28:21 +00:00
Robert Watson	3831e7d7f5	Gratuitous renaming of four System V Semaphore MAC Framework entry points to convert _sema() to _sem() for consistency purposes with respect to the other semaphore-related entry points: mac_init_sysv_sema() -> mac_init_sysv_sem() mac_destroy_sysv_sema() -> mac_destroy_sysv_sem() mac_create_sysv_sema() -> mac_create_sysv_sem() mac_cleanup_sysv_sema() -> mac_cleanup_sysv_sem() Congruent changes are made to the policy interface to support this. Obtained from: TrustedBSD Project Sponsored by: SPAWAR, SPARTA	2005-06-07 05:03:28 +00:00
Jeff Roberson	6680bbd529	- Fix the case where we're not preempting but there is already a newtd as this happens via thread_switchout(). I don't particularly like the structure of the code here. We twice call out to thread code when a thread is voluntarily switching. Once to thread_switchout() and once to slot_fill(), while sched_4BSD does even more work which is redundant to select another thread to use our remaining slice. This should be simplified in the future, but for now I'm only going to fix the bug not the bad design.	2005-06-07 02:59:16 +00:00
Doug White	4a30c508d1	Make "show msgbuf" use the pager instead of blasting the whole thing out. MFC after: 3 days	2005-06-06 22:18:32 +00:00
David Xu	ec8297bda1	Fix a bug relevant to debugging: a masked signal unexpectedly interrupts a sleeping thread when the process is being debugged. PR: GNU/77818 Tested by: Sean C. Farley <sean-freebsd at farley org>	2005-06-06 05:13:10 +00:00
Andrew Gallatin	92dd256bd4	Allow sends sent from non page-aligned userspace addresses to be considered for zero-copy sends. Reviewed by: alc Submitted by: Romer Gil at Rice University	2005-06-05 17:13:23 +00:00
Alan Cox	67b95a95eb	Eliminate an unused field from struct aio_liojob.	2005-06-05 05:41:48 +00:00
Marius Strobl	fce21e7e25	After some input from bde@ and rereading the datasheet use a MTX_SPIN mutex instead of a MTX_DEF one in order to defer preemption while reading the date and time registers. If we don't manage to read them within the time slot where we are guaranteed that no updates occur we might actually read them during an update in which case the output is undefined.	2005-06-04 23:24:50 +00:00
Alan Cox	bbe7bbdfee	Eliminate the original method of requesting notification of aio_read(2) and aio_write(2) completion through kevent(2). This method does not work on 64-bit architectures. It was deprecated in FreeBSD 4.4. See revisions 1.87 and 1.70.2.7. Change aio_physwakeup() to call psignal(9) directly rather than indirectly through a timeout(9). Discussed with: bde Correct a bug introduced in revision 1.65 that could result in premature delivery of a signal if an lio_listio(2) consisted of a mixture of direct/raw and queued I/O operations. Observed by: tegge Eliminate a field from struct kaioinfo that is now unused. Reviewed by: tegge	2005-06-04 19:16:33 +00:00
Jeff Roberson	9fe02f7e16	- It's 2005 already, I've been working on this for three years.	2005-06-04 09:24:15 +00:00
Jeff Roberson	21381d1b9e	- Don't SLOT_USE() in the preempt case, sched_add() has already taken the slot for us. Previously, we would take two slots on every preempt, and setrunqueue() would fix it up for us in the non threaded case. The threaded case was simply broken. - Clean up flags, prototypes, comments.	2005-06-04 09:23:28 +00:00
Paul Saab	efe5becafa	Wrap copyin/copyout for kevent so the 32bit wrapper does not have to malloc nchanges * sizeof(struct kevent) AND/OR nevents * sizeof(struct kevent) on every syscall. Glanced at by: peter, jmg Obtained from: Yahoo! MFC after: 2 weeks	2005-06-03 23:15:01 +00:00
Alan Cox	3769f562e2	Synchronize access to the per process aiocb lists in many of the functions.	2005-06-03 05:27:20 +00:00
Alan Cox	e293dc860c	In aio_waitcomplete() correct two cases of using an aiocb after freeing it.	2005-06-02 23:14:38 +00:00
Alan Cox	f0e5132053	Giant is no longer required in kern_setrlimit(); remove its acquisition and release. Reviewed by: jhb	2005-06-01 17:52:51 +00:00
Ken Smith	6341095e0d	This patch addresses a standards violation issue. The standards say a file's access time should be updated when it gets executed. A while ago the mechanism used to exec was changed to use a more mmap based mechanism and this behavior was broken as a side-effect of that. A new vnode flag is added that gets set when the file gets executed, and the VOP_SETATTR() vnode operation gets called. The underlying filesystem is expected to handle it based on its own semantics, some filesystems don't support access time at all. Those that do should handle it in a way that does not block, does not generate I/O if possible, etc. In particular vn_start_write() has not been called. The UFS code handles it the same way as it would normally handle the access time if a file was read - the IN_ACCESS flag gets set in the inode but no other action happens at this point. The actual time update will happen later during a sync (which handles all the necessary locking). Got me into this: cperciva Discussed with: a lot with bde, a little with kan Showed patches to: phk, jeffr, standards@, arch@ Minor discussion on: arch@	2005-05-31 19:39:52 +00:00
Alan Cox	3148c2c96a	Synchronize access to aio_freeproc with a mutex. Eliminate related spl calls. Reduce the scope of Giant in aio_daemon().	2005-05-30 22:26:34 +00:00
Alan Cox	3999ebe3b6	Use the proc mtx to prevent simultaneous changes to p_aioinfo.	2005-05-30 19:33:33 +00:00
Alan Cox	8285135020	Eliminate unnecessary calls to wakeup(); no one sleeps on &aio_freeproc. Eliminate an unused flag, AIOP_SCHED; it's cleared but never set.	2005-05-30 18:02:00 +00:00
Robert Watson	3984b2328c	Rebuild generated system call definition files following the addition of the audit event field to the syscalls.master file format. Submitted by: wsalamon Obtained from: TrustedBSD Project	2005-05-30 15:20:21 +00:00
Robert Watson	f3596e3370	Introduce a new field in the syscalls.master file format to hold the audit event identifier associated with each system call, which will be stored by makesyscalls.sh in the sy_auevent field of struct sysent. For now, default the audit identifier on all system calls to AUE_NULL, but in the near future, other BSM event identifiers will be used. The mapping of system calls to event identifiers is many:one due to multiple system calls that map to the same end functionality across compatibility wrappers, ABI wrappers, etc. Submitted by: wsalamon Obtained from: TrustedBSD Project	2005-05-30 15:09:18 +00:00
Jeff Roberson	1f22a07afd	- Add bufobj_wrefl() to add a write ref to a bufobj that is already locked.	2005-05-30 07:01:18 +00:00
Joseph Koshy	36c0fd9d0f	Kernel hooks to support PMC sampling modes. Reviewed by: alc	2005-05-30 06:29:29 +00:00
Alan Cox	95eca142ec	Eliminate aio_activeproc; it's unused.	2005-05-30 05:25:10 +00:00
Alan Cox	8484b5e66c	Eliminate aio_bufjobs; it's unused.	2005-05-29 21:29:15 +00:00
Robert Watson	45cb0a0074	Normalize white space in syscalls.master: try to use tabs before system call types.	2005-05-29 20:20:16 +00:00
Robert Watson	63a7e0a3f9	Kernel malloc layers malloc_type allocation over one of two underlying allocators: a set of power-of-two UMA zones for small allocations, and the VM page allocator for large allocations. In order to maintain unified statistics for specific malloc types, kernel malloc maintains a separate per-type statistics pool, which can be monitored using vmstat -m. Prior to this commit, each pool of per-type statistics was protected using a per-type mutex associated with the malloc type. This change modifies kernel malloc to maintain per-CPU statistics pools for each malloc type, and protects writing those statistics using critical sections. It also moves to unsynchronized reads of per-CPU statistics when generating coalesced statistics. To do this, several changes are implemented: - In the previous world order, the statistics memory was allocated by the owner of the malloc type structure, allocated statically using MALLOC_DEFINE(). This embedded the definition of the malloc_type structure into all kernel modules. Move to a model in which a pointer within struct malloc_type points at a UMA-allocated malloc_type_internal data structure owned and maintained by kern_malloc.c, and not part of the exported ABI/API to the rest of the kernel. For the purposes of easing a possible MFC, re-use an existing pointer in 'struct malloc_type', and maintain the current malloc_type structure size, as well as layout with respect to the fields reused outside of the malloc subsystem (such as ks_shortdesc). There are several unused fields as a result of no longer requiring the mutex in malloc_type. - Struct malloc_type_internal contains an array of malloc_type_stats, of size MAXCPU. The structure defined above avoids hard-coding a kernel compile-time value of MAXCPU into kernel modules that interact with malloc. 
- When accessing per-cpu statistics for a malloc type, surround read - modify - update requests with critical_enter()/critical_exit() in order to avoid races during write. The per-CPU fields are written only from the CPU that owns them. - Per-CPU stats now maintained "allocated" and "freed" counters for number of allocations/frees and bytes allocated/freed, since there is no longer a coherent global notion of the totals. When coalescing malloc stats, accept a slight race between reading stats across CPUs, and avoid showing the user a negative allocation count for the type in the event of a race. The global high watermark is no longer maintained for a malloc type, as there is no global notion of the number of allocations. - While tearing up the sysctl() path, also switch to using sbufs. The current "export as text" sysctl format is retained with the same syntax. We may want to change this in the future to export more per-CPU information, such as how allocations and frees are balanced across CPUs. This change results in a substantial speedup of kernel malloc and free paths on SMP, as critical sections (where usable) out-perform mutexes due to avoiding atomic/bus-locked operations. There is also a minor improvement on UP due to the slightly lower cost of critical sections there. The cost of the change to this approach is the loss of a continuous notion of total allocations that can be exploited to track per-type high watermarks, as well as increased complexity when monitoring statistics. Due to carefully avoiding changing the ABI, as well as hardening the ABI against future changes, it is not necessary to recompile kernel modules for this change. However, MFC'ing this change to RELENG_5 will require also MFC'ing optimizations for soft critical sections, which may modify exposed kernel ABIs. The internal malloc API is changed, and modifications to vmstat in order to restore "vmstat -m" on core dumps will follow shortly. 
Several improvements from: bde Statistics approach discussed with: ups Tested by: scottl, others	2005-05-29 13:38:07 +00:00
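The per-CPU statistics scheme described in the commit above can be illustrated with a small userland sketch. This is not the code from kern_malloc.c: every `sketch_` name is hypothetical, `SKETCH_MAXCPU` merely stands in for the kernel's MAXCPU, and the two no-op functions stand in for critical_enter()/critical_exit(), which only exist in the kernel. It shows the two ideas the message names: each CPU updates only its own slot, and the reader sums the slots without locking and clamps the result so a race never shows a negative count.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAXCPU 4 /* hypothetical; stands in for the kernel's MAXCPU */

/* One set of counters per CPU; a slot is written only by its owning CPU. */
struct sketch_type_stats {
	uint64_t allocs;      /* number of allocations */
	uint64_t frees;       /* number of frees */
	uint64_t bytes_alloc; /* bytes allocated */
	uint64_t bytes_freed; /* bytes freed */
};

/* Hypothetical analogue of the internal per-type structure. */
struct sketch_type_internal {
	struct sketch_type_stats stats[SKETCH_MAXCPU];
};

/* No-op stand-ins for critical_enter()/critical_exit() (assumption). */
static void sketch_critical_enter(void) {}
static void sketch_critical_exit(void) {}

/* Record an allocation of 'size' bytes on the given CPU's slot. */
void
sketch_record_alloc(struct sketch_type_internal *mti, unsigned cpu, uint64_t size)
{
	assert(cpu < SKETCH_MAXCPU);
	sketch_critical_enter();
	mti->stats[cpu].allocs++;
	mti->stats[cpu].bytes_alloc += size;
	sketch_critical_exit();
}

/* Record a free; it may land on a different CPU than the allocation did. */
void
sketch_record_free(struct sketch_type_internal *mti, unsigned cpu, uint64_t size)
{
	assert(cpu < SKETCH_MAXCPU);
	sketch_critical_enter();
	mti->stats[cpu].frees++;
	mti->stats[cpu].bytes_freed += size;
	sketch_critical_exit();
}

/*
 * Coalesce the per-CPU counters without any locking. If the frees read so
 * far exceed the allocations read so far, report zero rather than a
 * negative in-use count.
 */
uint64_t
sketch_allocs_in_use(const struct sketch_type_internal *mti)
{
	uint64_t allocs = 0, frees = 0;

	for (unsigned i = 0; i < SKETCH_MAXCPU; i++) {
		allocs += mti->stats[i].allocs;
		frees += mti->stats[i].frees;
	}
	return (allocs > frees ? allocs - frees : 0);
}
```

Because an object can be allocated on one CPU and freed on another, an individual slot's "allocs minus frees" is meaningless on its own; only the sum across all slots approximates the in-use count, which is why the sketch keeps separate allocated and freed counters.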
Pawel Jakub Dawidek	885fec3e08	Fix a panic when a module is compiled into the kernel and is also loaded from loader.conf. Only the panic is fixed; the module will still be listed in kldstat(8) output. Not sure what the correct fix is, because adding unloading code in case of failure to linker_init_kernel_modules() doesn't work.	2005-05-28 23:20:05 +00:00
Garance A Drosehn	5f49915eb2	Change the way options are parsed on the `#!'-line of a shell-script. Instead of having the kernel parse that line and add an entry to the argument list for each 'separate word' it finds, have it add only one entry which holds all the words found on that line. The old behavior is useful in some situations, but it does not match the way any other operating system will parse that line. This has been discussed in the thread "Bug in #! processing - One More Time" on the freebsd-arch mailing list (starting back on Feb 24, 2005). The first few messages in that thread provide the background in much detail. PR: 16393 Reviewed by: freebsd-arch	2005-05-28 22:42:41 +00:00
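The difference between the two `#!'-line parsing behaviors in the commit above can be sketched in userland C. This is an illustration only, not the kernel's image activator code: all `sketch_` names and the fixed-size buffers are hypothetical. It also rests on one reading of the message, namely that the interpreter path stays its own entry and everything after it becomes a single argument; the message itself does not spell that detail out.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAXARGS 8   /* hypothetical limits for this sketch */
#define SKETCH_MAXLEN  128

struct sketch_args {
	int  count;
	char arg[SKETCH_MAXARGS][SKETCH_MAXLEN];
};

/* Append len bytes of p as one entry, truncating to fit the buffer. */
static void
sketch_push(struct sketch_args *out, const char *p, size_t len)
{
	if (out->count >= SKETCH_MAXARGS)
		return;
	if (len >= SKETCH_MAXLEN)
		len = SKETCH_MAXLEN - 1;
	memcpy(out->arg[out->count], p, len);
	out->arg[out->count][len] = '\0';
	out->count++;
}

/* Old behavior: each whitespace-separated word becomes its own entry. */
void
sketch_parse_split(const char *line, struct sketch_args *out)
{
	assert(strncmp(line, "#!", 2) == 0);
	const char *p = line + 2;

	out->count = 0;
	for (;;) {
		p += strspn(p, " \t");
		size_t len = strcspn(p, " \t\n");
		if (len == 0)
			break;
		sketch_push(out, p, len);
		p += len;
	}
}

/* New behavior (as read here): interpreter, then the rest as one entry. */
void
sketch_parse_single(const char *line, struct sketch_args *out)
{
	assert(strncmp(line, "#!", 2) == 0);
	const char *p = line + 2;

	out->count = 0;
	p += strspn(p, " \t");
	size_t len = strcspn(p, " \t\n");
	if (len == 0)
		return;
	sketch_push(out, p, len);	/* interpreter path */
	p += len;
	p += strspn(p, " \t");
	len = strcspn(p, "\n");
	/* Trim trailing blanks so the single entry has no padding. */
	while (len > 0 && (p[len - 1] == ' ' || p[len - 1] == '\t'))
		len--;
	if (len > 0)
		sketch_push(out, p, len);	/* all remaining words, unsplit */
}
```

For the line `#!/bin/sh -x -e`, the first function yields three entries (`/bin/sh`, `-x`, `-e`) and the second yields two (`/bin/sh`, `-x -e`), so an interpreter that wants separate options has to split that second entry itself.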
Pawel Jakub Dawidek	870fba2648	Prevent loading modules that are compiled into the kernel. PR: kern/48759 Submitted by: Paweł Małachowski <pawmal@unia.3lo.lublin.pl> Patch from: demon MFC after: 2 weeks	2005-05-28 22:29:44 +00:00
Robert Watson	0cc0090517	Regenerate from syscalls.master.	2005-05-28 14:35:43 +00:00
Robert Watson	d85bfefd79	Mark ntp_gettime() as MSTD, since its system call path will acquire Giant if required.	2005-05-28 14:35:05 +00:00
Robert Watson	75b8223886	Explicitly acquire Giant around the ntp_gettime() and assert it in the sysctl path. While this code is close to MPSAFE, it may require some additional locking. Mark ntp_gettime1() as GIANT_REQUIRED for now. Suggested by: phk	2005-05-28 14:34:41 +00:00
Robert Watson	7329f580c8	Regenerate for updated syscalls.master.	2005-05-28 13:24:05 +00:00
Robert Watson	d7b9187bff	Mark the following compatibility system calls as MCOMPAT or MCOMPAT4 based on their simply wrapping MPSAFE implementations of existing MPSAFE system calls: getfsstat() lseek() stat() lstat() truncate() ftruncate() statfs() fstatfs() Note that ogetdirentries() is not marked MPSAFE because it does not share the MPSAFE implementation used for getdirentries(), and requires separate locking to be implemented.	2005-05-28 13:23:42 +00:00
Robert Watson	958a52b82b	Regenerate from syscalls.master.	2005-05-28 13:13:01 +00:00
Robert Watson	160349adb1	Mark quotactl() as MSTD.	2005-05-28 13:12:04 +00:00
Robert Watson	f8e5f64207	Acquire Giant explicitly in quotactl() so that the syscalls.master entry can become MSTD.	2005-05-28 13:11:35 +00:00
Robert Watson	a72baeca1d	Regenerate from updated syscalls.master.	2005-05-28 13:09:56 +00:00
Robert Watson	ec792a6740	Mark kenv(2) as MPSAFE, since it appears to be properly locked down.	2005-05-28 13:09:41 +00:00
Robert Watson	848c3ec33f	Regenerate system call tables from syscalls.master.	2005-05-28 13:08:26 +00:00
Robert Watson	5267dc0b3a	Also mark the COMPAT4 version of fhstatfs() as MPSAFE.	2005-05-28 13:07:43 +00:00
Robert Watson	2191a5d154	Mark fhopen(), fhstat(), and fhstatfs() as MSTD, since they now acquire Giant themselves.	2005-05-28 12:59:33 +00:00
Robert Watson	f73e1f57cf	Acquire Giant explicitly in fhopen(), fhstat(), and kern_fhstatfs(), so that we can start to eliminate the presence of non-MPSAFE system call entries in syscalls.master.	2005-05-28 12:58:54 +00:00

8630 Commits