freebsd-nq

Author	SHA1	Message	Date
Pyun YongHyeon	ccd8d954f3	Add missing socket buffer unlock before returning to userland. Reviewed by: rwatson	2007-05-08 12:34:14 +00:00
Paolo Pisati	bafe5a3118	Bring in the remaining bits to make interrupt filtering work: o push much of the i386 and amd64 MD interrupt handling code (intr_machdep.c::intr_execute_handlers()) into MI code (kern_intr.c::ithread_loop()) o move filter handling to kern_intr.c::intr_filter_loop() o factor out the code necessary to mask and ack an interrupt event (intr_machdep.c::intr_eoi_src() and intr_machdep.c::intr_disab_eoi_src()), and make them part of 'struct intr_event', passing them as arguments to kern_intr.c::intr_event_create(). o spawn a private ithread per handler (struct intr_handler::ih_thread) with filter and ithread functions. Approved by: re (implicit?)	2007-05-06 17:02:50 +00:00
Wojciech A. Koszek	9e2894466a	Don't acquire Giant unconditionally. Reviewed by: rwatson	2007-05-06 12:00:38 +00:00
Konstantin Belousov	5c76452f8f	Mark the filedescriptor table entries with VOP_OPEN being performed for them as UF_OPENING. Disable closing of those entries. This should fix the crashes caused by devfs_open() (and fifo_open()) dereferencing struct file * by index, while the filedescriptor is closed by a parallel thread. Idea by: tegge Reviewed by: tegge (previous version of patch) Tested by: Peter Holm Approved by: re (kensmith) MFC after: 3 weeks	2007-05-04 14:23:29 +00:00
Robert Watson	7abab91135	sblock() implements a sleep lock by interlocking SB_WANT and SB_LOCK flags on each socket buffer with the socket buffer's mutex. This sleep lock is used to serialize I/O on sockets in order to prevent I/O interlacing. This change replaces the custom sleep lock with an sx(9) lock, which results in marginally better performance, better handling of contention during simultaneous socket I/O across multiple threads, and a cleaner separation between the different layers of locking in socket buffers. Specifically, the socket buffer mutex is now solely responsible for serializing simultaneous operation on the socket buffer data structure, and not for I/O serialization. While here, fix two historic bugs: (1) a bug allowing I/O to be occasionally interlaced during long I/O operations (discovered by Isilon). (2) a bug in which failed non-blocking acquisition of the socket buffer I/O serialization lock might be ignored (discovered by sam). SCTP portion of this patch submitted by rrs.	2007-05-03 14:42:42 +00:00
Alan Cox	fa75abb0d2	Remove unneeded include files.	2007-05-01 06:35:54 +00:00
John-Mark Gurney	ebf750a9fd	Complete removal of restriction about overlaps to rman_manage_region: remove comment and man page verbiage... Document return values for rman_init and rman_manage_region.. MFC after: 1 week	2007-04-28 07:37:49 +00:00
John Baldwin	06e043fb20	Avoid a lot of code duplication by using kern_open() to open /dev/null in fdcheckstd() instead of a stripped down version of kern_open()'s code. MFC after: 1 week Reviewed by: cperciva	2007-04-26 18:01:19 +00:00
Konstantin Belousov	e5ea32c290	Allow the dounmount() to proceed even for doomed coveredvp. In dounmount(), before or while vn_lock(coveredvp) is called, coveredvp vnode may be VI_DOOMED due to one of the following: - other thread finished unmount and vput()ed it, and vnode was chosen for recycling, while vn_lock() slept; - forced unmount of the coveredvp->v_mount fs. In the first case, next check for changed v_mountedhere or mnt_gen counter would be successful. In the second case, the unmount shall be allowed. Submitted by: sobomax MFC after: 2 weeks	2007-04-26 08:56:56 +00:00
Konstantin Belousov	8e68f804a7	Disable nesting of BOP_BDFLUSH(). VOP_FSYNC() call in bdwrite() could result in bdwrite() being reentered, thus causing infinite recursion. Reported and tested by: Peter Holm Reviewed by: tegge MFC after: 2 weeks	2007-04-24 10:59:21 +00:00
Pawel Jakub Dawidek	8c804c7c98	Correct typo.	2007-04-23 12:53:00 +00:00
Robert Watson	c14d15ae3e	Remove MAC Framework access control check entry points made redundant with the introduction of priv(9) and MAC Framework entry points for privilege checking/granting. These entry points exactly aligned with privileges and provided no additional security context: - mac_check_sysarch_ioperm() - mac_check_kld_unload() - mac_check_settime() - mac_check_system_nfsd() Add mpo_priv_check() implementations to Biba and LOMAC policies, which, for each privilege, determine if they can be granted to processes considered unprivileged by those two policies. These mostly, but not entirely, align with the set of privileges granted in jails. Obtained from: TrustedBSD Project	2007-04-22 15:31:22 +00:00
Stephane E. Potvin	0e5179e441	Add support for specifying a minimal size for vm.kmem_size in the loader via vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem_size will be at least 256mb (for example) without forcing a particular value via vm.kmem_size. Approved by: njl (mentor) Reviewed by: alc	2007-04-21 01:14:48 +00:00
Pawel Jakub Dawidek	eed20b37f5	Don't reinvent vm_page_grab(). Reviewed by: ups	2007-04-20 19:49:20 +00:00
Kip Macy	fb1e3ccd7e	Schedule the ithread on the same cpu as the interrupt Tested by: kmacy Submitted by: jeffr	2007-04-20 05:45:46 +00:00
Joseph Koshy	382d30cdd8	Fix witness(4) warnings about mutex use. Group mutexes used in hwpmc(4) into 3 "types" in the sense of witness(4): - leaf spin mutexes---only one of these should be held at a time, so these mutexes are specified as belonging to a single witness type "pmc-leaf". - `struct pmc_owner' descriptors are protected by a spin mutex of witness type "pmc-owner-proc". Since we call wakeup_one() while holding these mutexes, the witness type of these mutexes needs to dominate that of "sleepq chain" mutexes. - logger threads use a sleep mutex, of type "pmc-sleep". Submitted by: wkoszek (earlier patch)	2007-04-19 08:02:51 +00:00
Pawel Jakub Dawidek	fb1daf8164	Fix a bug in sendfile(2) when files are larger than page size and nbytes=0. When nbytes=0, sendfile(2) should use file size. Because of the bug, it was sending half of a file. The bug is that 'off' variable can't be used for size calculation, because it changes inside the loop, so we should use uap->offset instead.	2007-04-19 05:54:45 +00:00
Nate Lawson	0ae62c18a0	Bump the interrupt storm detection counter to 1000. My slow fileserver gets a bogus irq storm detected when periodic daily kicks off at 3 am and disconnects the disk. Change the print logic to print once per second when the storm is occurring instead of only once. Otherwise, it appeared that something else was causing the errors each night at 3 am since the print only occurred the first time. Reviewed by: jhb MFC after: 1 week	2007-04-19 01:24:32 +00:00
Pawel Jakub Dawidek	7760d8409f	Export vfs_mount_alloc() as it is used in ZFS.	2007-04-17 21:14:06 +00:00
John Baldwin	2248f68064	- Add a 'show rman <rm>' DDB command to dump the resources in a resource manager similar to 'devinfo -u'. - Add a 'show allrman' DDB command that effectively does 'show rman' on all resource managers in the system.	2007-04-16 21:09:03 +00:00
Kip Macy	21ee3e7aff	remove now invalid check from m_sanity; panic on m_sanity check failure with INVARIANTS	2007-04-14 20:19:16 +00:00
Pawel Jakub Dawidek	24b0502ee0	Fix jails and jail-friendly file systems handling: - We need to allow for PRIV_VFS_MOUNT_OWNER inside a jail. - Move security checks to vfs_suser() and deny unmounting and updating for jailed root from different jails, etc. OK'ed by: rwatson	2007-04-13 23:54:22 +00:00
Pawel Jakub Dawidek	6bc3ab2574	When we are running low on vnodes, there is currently no way to ask other subsystems to release some vnodes. Implement backpressure based on vfs_lowvnodes event (similar to vm_lowmem for memory).	2007-04-13 08:38:48 +00:00
Robert Watson	94b94b2b49	Remove now-obsolete comment regarding mqueue privileges in jail.	2007-04-11 16:22:59 +00:00
Robert Watson	4b08405682	Allow PRIV_NETINET_REUSEPORT in jail.	2007-04-10 15:59:49 +00:00
Robert Watson	9956b3f5e4	Do allow POSIX mqueue unlink privilege inside a jail, as we allow all other POSIX mqueue privileges inside a jail.	2007-04-10 15:40:27 +00:00
Pawel Jakub Dawidek	08be819487	Minor style cleanups (mostly removal of trailing whitespaces).	2007-04-10 15:29:37 +00:00
Pawel Jakub Dawidek	21ff8c6715	Correct typos.	2007-04-10 15:22:40 +00:00
Nate Lawson	a363f67a81	Restore the locking for the sleep/wakeup to avoid waiting an extra 1 sec if a race was lost. We're still single-threaded at this point, but just be safe for the future.	2007-04-09 21:10:04 +00:00
Nate Lawson	6b1e469ea5	Clean up the root mount and mount wait code. No mutexes are needed here since a spurious wakeup() is the only possible outcome and this is fine in the BSD programming model.	2007-04-09 19:23:52 +00:00
Pawel Jakub Dawidek	82068fe7a9	Add kern.hostuuid sysctl, which will be used to keep host's UUID. Reviewed by: mlaier, rink, brooks, rwatson	2007-04-09 19:18:09 +00:00
Pawel Jakub Dawidek	2eb68d493f	Add root_mounted() function that returns true if the root file system is already mounted.	2007-04-08 23:54:01 +00:00
Pawel Jakub Dawidek	c2cda60911	prison_free() can be called with a mutex held. This wasn't a problem until I converted allprison_mtx mutex to allprison_lock sx lock. To fix this LOR, move prison removal to prison_complete() entirely. To ensure that no one will reference this prison before it's being removed from the list skip prisons with 'pr_ref == 0' in prison_find() and assert that pr_ref has to be greater than 0 in prison_hold(). Reported by: kris OK'ed by: rwatson	2007-04-08 10:46:23 +00:00
Pawel Jakub Dawidek	b63b0c6529	Only use prison mutex to protect the fields that need to be protected by it.	2007-04-08 10:21:38 +00:00
Pawel Jakub Dawidek	264de85e73	pr_list is protected by the allprison_lock.	2007-04-08 02:13:32 +00:00
Robert Watson	7b20aa9ca6	Remove XXX comment that changes to file fields should be protected with the file lock rather than the filedesc lock: I fixed this in the last revision. Spotted by: kris	2007-04-06 23:31:30 +00:00
Pawel Jakub Dawidek	028e84c68b	allprison mutex was converted to sx(9) lock.	2007-04-05 23:32:32 +00:00
Pawel Jakub Dawidek	dc68a63332	Implement functionality I called 'jail services'. It may be used for external modules to attach some data to jail's in-kernel structure. - Change allprison_mtx mutex to allprison_sx sx(9) lock. We will need to call external functions while holding this lock, which may want to allocate memory. Make use of the fact that this is shared-exclusive lock and use shared version when possible. - Implement the following functions: prison_service_register() - registers a service that wants to be noticed when a jail is created and destroyed prison_service_deregister() - deregisters service prison_service_data_add() - adds service-specific data to the jail structure prison_service_data_get() - takes service-specific data from the jail structure prison_service_data_del() - removes service-specific data from the jail structure Reviewed by: rwatson	2007-04-05 23:19:13 +00:00
Pawel Jakub Dawidek	54b369c1ae	Make prison_find() globally accessible.	2007-04-05 21:34:54 +00:00
Pawel Jakub Dawidek	f6521d1c31	Implement SEEK_DATA and SEEK_HOLE extensions to lseek(2) as found in OpenSolaris. For more information please refer to: http://blogs.sun.com/bonwick/entry/seek_hole_and_seek_data	2007-04-05 21:10:53 +00:00
Pawel Jakub Dawidek	f3a8d2f93c	Add security.jail.mount_allowed sysctl, which allows mounting and unmounting jail-friendly file systems from within a jail. Precisely it grants PRIV_VFS_MOUNT, PRIV_VFS_UNMOUNT and PRIV_VFS_MOUNT_NONUSER privileges for a jailed super-user. It is turned off by default. A jail-friendly file system is a file system whose driver registers itself with VFCF_JAIL flag via VFS_SET(9) API. The lsvfs(1) command can be used to see which file systems are jail-friendly ones. There are currently no jail-friendly file systems, ZFS will be the first one. In the future we may consider marking file systems like nullfs as jail-friendly. Reviewed by: rwatson	2007-04-05 21:03:05 +00:00
Kip Macy	0f4d9d04ea	Fix mb_ctor_clust and mb_dtor_clust to reference the appropriate zone, simplify setting refcnt Reviewed by: andre, rwatson, and glebius MFC after: 3 days	2007-04-04 21:27:01 +00:00
Robert Watson	5e3f7694b1	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are important for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
Kip Macy	59a31e6acf	fix typo	2007-04-04 00:11:22 +00:00
Kip Macy	e2bc106690	style fixes and make sure that the lock is treated as released in the sharers == 0 case; note that this is somewhat racy because a new sharer can come in while we're updating stats	2007-04-04 00:01:05 +00:00
Kip Macy	afc0bfbd90	Fixes to sx for newsx - fix recursed case and move out of inline Submitted by: Attilio Rao <attilio@freebsd.org>	2007-04-03 22:58:21 +00:00
Kip Macy	70fe8436c8	move lock_profile calls out of the macros and into kern_mutex.c add check for mtx_recurse == 0 when releasing sleep lock	2007-04-03 22:52:31 +00:00
Kip Macy	8289600ce7	skip call to _lock_profile_obtain_lock_success entirely if acquisition time is non-zero (i.e. recursing or adding sharers)	2007-04-03 18:36:27 +00:00
Pawel Jakub Dawidek	afd894bb12	Add root_mount_wait() function which can be used to wait until the root file system is mounted. This is useful for kernel modules loaded from /boot/loader.conf, that have to access file system.	2007-04-03 11:45:28 +00:00
John Baldwin	1ce2bc9187	Fix a fd leak in socketpair(): - Close the new file objects created during socketpair() if the copyout of the new file descriptors fails. - Add a test to the socketpair regression test for this edge case.	2007-04-02 19:15:47 +00:00

9900 Commits