freebsd-skq

Author	SHA1	Message	Date
phk	239117c33f	Make the DIAGNOSTIC code which complains about long {call\|time}out(9) functions less noisy: We printf if a new function took longer than the previous record holder, or if the previous record holder took more than twice as long as the current record.	2003-12-07 20:03:28 +00:00
marcel	f3326a4c71	Regen due to kse_switchin(2).	2003-12-07 19:36:16 +00:00
marcel	2ba380839b	Add kse_switchin(2). This syscall can be used by KSE implementations to have the kernel switch to a new thread, instead of doing it in userland. It is in fact needed on ia64 where syscall restarts do not return to userland first. It's completely handled inside the kernel. As such, any context created by the kernel as part of an upcall and caused by some syscall needs to be restored by the kernel.	2003-12-07 19:34:29 +00:00
peter	e28e232dff	rqb_bits[] may be an int64_t (eg: on alpha, and recently on amd64). Be sure to shift (long)1 << 33 and higher, not (int)1. Otherwise bad things happen(TM). This is why beast.freebsd.org panicked with ULE. Reviewed by: jeff	2003-12-07 09:57:51 +00:00
scottl	2b68e67c6d	Re-arrange and consolidate some random debugging stuff	2003-12-07 05:04:49 +00:00
alc	4408614be4	- Giant is no longer required by vm_thread_new().	2003-12-07 04:16:49 +00:00
rwatson	08335c63bf	Rename mac_create_cred() MAC Framework entry point to mac_copy_cred(), and the mpo_create_cred() MAC policy entry point to mpo_copy_cred_label(). This is more consistent with similar entry points for creation and label copying, as mac_create_cred() was called from crdup() as opposed to during process creation. For a number of policies, this removes the requirement for special handling when copying credential labels, and improves consistency. Approved by: re (scottl) Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-12-06 21:48:03 +00:00
jhb	4b61439e79	Fix all users of mp_maxid to use the same semantics, namely: 1) mp_maxid is a valid FreeBSD CPU ID in the range 0 .. MAXCPU - 1. 2) For all active CPUs in the system, PCPU_GET(cpuid) <= mp_maxid. Approved by: re (scottl) Tested on: i386, amd64, alpha	2003-12-03 14:57:26 +00:00
jhb	907202ec1f	Export a few SMP related symbols in UP kernels as well. This is needed to aid other kernel code, especially code which can be in a module such as the acpi_cpu(4) driver, to work properly with both SMP and UP kernels. The exported symbols include mp_ncpus, all_cpus, mp_maxid, smp_started, and the smp_rendezvous() function. This also means that CPU_ABSENT() is now always implemented the same on all kernels. Approved by: re (scottl)	2003-12-03 14:55:31 +00:00
dg	da88330aaa	Fixed a bug in sendfile(2) where the sent data would be corrupted due to sendfile(2) being erroneously automatically restarted after a signal is delivered. Fixed by converting ERESTART to EINTR prior to exiting. Updated manual page to indicate the potential EINTR error, its cause and consequences. Approved by: re@freebsd.org	2003-12-01 22:12:50 +00:00
iedowse	bc2791c3fa	In dounmount(), only call checkdirs() prior to VFS_UNMOUNT() in the forced unmount case. Otherwise, a file system that is referenced only by process fd_cdir/fd_rdir references to the file system root vnode will be successfully unmounted without the MNT_FORCE flag. The previous behaviour was not compatible with the unmount semantics required by amd(8), so file systems could be unexpectedly unmounted while there were still references to the file system root directory. Reported by: Erez Zadok <ezk@cs.sunysb.edu> Approved by: re (scottl)	2003-11-30 23:30:09 +00:00
jeff	e35dcab926	- Don't forget to unlock the vnode interlock in the LK_NOWAIT case. Submitted by: Stephan Uphoff <ups@stups.com> Approved by: re (rwatson)	2003-11-30 22:09:58 +00:00
kan	c31eef63dc	Do not attempt to destroy NULL vfs options list. Approved by: re (scottl) Reported by: Christian Laursen <xi atborderworlds dot dk>	2003-11-23 17:13:48 +00:00
jhb	bbe7d290ea	- Split cpu_mp_probe() into two parts. cpu_mp_setmaxid() is still called very early (SI_SUB_TUNABLES - 1) and is responsible for setting mp_maxid. cpu_mp_probe() is now called at SI_SUB_CPU and determines if SMP is actually present and sets mp_ncpus and all_cpus. Splitting these up allows an architecture to probe CPUs later than SI_SUB_TUNABLES by just setting mp_maxid to MAXCPU in cpu_mp_setmaxid(). This could allow the CPU probing code to live in a module, for example, since sysinits in modules cannot be invoked prior to SI_SUB_KLD. This is needed to re-enable the ACPI module on i386. - For the alpha SMP probing code, use LOCATE_PCS() instead of duplicating its contents in a few places. Also, add a smp_cpu_enabled() function to avoid duplicating some code. There is room for further code reduction later since much of this code is also present in cpu_mp_start(). - All archs besides i386 still set mp_maxid to the same values they set it to before this change. i386 now sets mp_maxid to MAXCPU. Tested on: alpha, amd64, i386, ia64, sparc64 Approved by: re (scottl)	2003-11-21 22:23:26 +00:00
markm	6a2f4748c4	Fix a major faux pas of mine. I was causing 2 very bad things to happen in interrupt context; 1) sleep locks, and 2) malloc/free calls. 1) is fixed by using spin locks instead. 2) is fixed by preallocating a FIFO (implemented with a STAILQ) and using elements from this FIFO instead. This turns out to be rather fast. OK'ed by: re (scottl) Thanks to: peter, jhb, rwatson, jake Apologies to: *	2003-11-20 15:35:48 +00:00
markm	832b97971f	Hackfix to patch around a kernel panic I introduced. Real fix to follow. In the meanwhile, we are not harvesting interrupt entropy. Approved by: re (jhb)	2003-11-18 14:35:43 +00:00
rwatson	9c969b771a	Introduce a MAC label reference in 'struct inpcb', which caches the MAC label referenced from 'struct socket' in the IPv4 and IPv6-based protocols. This permits MAC labels to be checked during network delivery operations without dereferencing inp->inp_socket to get to so->so_label, which will eventually avoid our having to grab the socket lock during delivery at the network layer. This change introduces 'struct inpcb' as a labeled object to the MAC Framework, along with the normal circus of entry points: initialization, creation from socket, destruction, as well as a delivery access control check. For most policies, the inpcb label will simply be a cache of the socket label, so a new protocol switch method is introduced, pr_sosetlabel() to notify protocols that the socket layer label has been updated so that the cache can be updated while holding appropriate locks. Most protocols implement this using pru_sosetlabel_null(), but IPv4/IPv6 protocols using inpcbs use the worker function in_pcbsosetlabel(), which calls into the MAC Framework to perform a cache update. Biba, LOMAC, and MLS implement these entry points, as do the stub policy, and test policy. Reviewed by: sam, bms Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-11-18 00:39:07 +00:00
rwatson	cc012e0835	Add a sysctl, security.bsd.see_other_gids, similar in semantics to see_other_uids but with the logical conversion. This is based on (but not identical to) the patch submitted by Samy Al Bahra. Submitted by: Samy Al Bahra <samy@kerneled.com>	2003-11-17 20:20:53 +00:00
peter	9dedda25aa	Initial landing of SMP support for FreeBSD/amd64. - This is heavily derived from John Baldwin's apic/pci cleanup on i386. - I have completely rewritten or drastically cleaned up some other parts. (in particular, bootstrap) - This is still a WIP. It seems that there are some highly bogus bioses on nVidia nForce3-150 boards. I can't stress how broken these boards are. I have a workaround in mind, but right now the Asus SK8N is broken. The Gigabyte K8NPro (nVidia based) is also mind-numbingly hosed. - Most of my testing has been with SCHED_ULE. SCHED_4BSD works. - the apic and acpi components are 'standard'. - If you have an nVidia nForce3-150 board, you are stuck with 'device atpic' in addition, because they somehow managed to forget to connect the 8254 timer to the apic, even though its in the same silicon! ARGH! This directly violates the ACPI spec.	2003-11-17 08:58:16 +00:00
jeff	71a2f6d146	- Mark ksq_assigned as volatile so that when this code is used without sched_lock we can be sure that we'll pick up the new value.	2003-11-17 08:27:11 +00:00
jeff	a6911261a3	- Remove long dead code. rslices hasn't been used in some time and neither has sched_pickcpu().	2003-11-17 08:24:14 +00:00
peter	38ebd79a92	Expand the argument to the ithread enable/disable helper hooks from an int to something big enough to hold a pointer. amd64 needs this.	2003-11-17 06:08:10 +00:00
rwatson	7aa5c2497a	Implement sockets support for __mac_get_fd() and __mac_set_fd() system calls, and prefer these calls over getsockopt()/setsockopt() for ABI reasons. When addressing UNIX domain sockets, these calls retrieve and modify the socket label, not the label of the rendezvous vnode. - Create mac_copy_socket_label() entry point based on mac_copy_pipe_label() entry point, intended to copy the socket label into temporary storage that doesn't require a socket lock to be held (currently Giant). - Implement mac_copy_socket_label() for various policies. - Expose socket label allocation, free, internalize, externalize entry points as non-static from mac_net.c. - Use mac_socket_label_set() in __mac_set_fd(). MAC-aware applications may now use mac_get_fd(), mac_set_fd(), and mac_get_peer() to retrieve and set various socket labels without directly invoking the getsockopt() interface. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-11-16 23:31:45 +00:00
rwatson	f9ad21ec5d	Reduce gratuitous redundancy and length in function names: mac_setsockopt_label_set() -> mac_setsockopt_label() mac_getsockopt_label_get() -> mac_getsockopt_label() mac_getsockopt_peerlabel_get() -> mac_getsockopt_peerlabel() Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-11-16 18:25:20 +00:00
alc	74614e7f63	- Modify alpha's sf_buf implementation to use the direct virtual-to-physical mapping. - Move the sf_buf API to its own header file; make struct sf_buf's definition machine dependent. In this commit, we remove an unnecessary field from struct sf_buf on the alpha, amd64, and ia64. Ultimately, we may eliminate struct sf_buf on those architectures except as an opaque pointer that references a vm page.	2003-11-16 06:11:26 +00:00
rwatson	0dca1bb7dd	When implementing getsockopt() for SO_LABEL and SO_PEERLABEL, make sure to sooptcopyin() the (struct mac) so that the MAC Framework knows which label types are being requested. This fixes process queries of socket labels. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-11-16 03:53:36 +00:00
bde	60cfaec287	Localized the cy driver's locking.	2003-11-16 00:55:54 +00:00
phk	1cf4c96919	Rename the debugging mutex "callout_no_sleep" to "dont_sleep_in_callout".	2003-11-15 18:33:54 +00:00
tjr	9e7ac78554	Initialize sequence numbers to 0 in seminit() instead of using whatever garbage happens to be in memory. This did not seem to cause any problems except making semaphore ID's unpredictable (and ugly in ipcs(1) output).	2003-11-15 11:56:53 +00:00
phk	d04b779c2d	Send B_PHYS out to pasture, it no longer serves any function.	2003-11-15 09:28:09 +00:00
alc	0e7d8d9c51	- Remove the remaining now unnecessary checks for the buf's b_object being NULL. See revision 1.421 for more detail. - Remove GIANT_REQUIRED from vfs_unbusy_pages(). Discussed with: jeff	2003-11-15 08:45:36 +00:00
jeff	be190686fa	- Introduce kseq_runq_{add,rem}() which are used to insert and remove kses from the run queues. Also, on SMP, we track the transferable count here. Threads are transferable only as long as they are on the run queue. - Previously, we adjusted our load balancing based on the transferable count minus the number of actual cpus. This was done to account for the threads which were likely to be running. All of this logic is simpler now that transferable accounts for only those threads which can actually be taken. Updated various places in sched_add() and kseq_balance() to account for this. - Rename kseq_{add,rem} to kseq_load_{add,rem} to reflect what they're really doing. The load is accounted for separately from the runq because the load is accounted for even as the thread is running. - Fix a bug in sched_class() where we weren't properly using the PRI_BASE() version of the kg_pri_class. - Add a large comment that describes the impact of a seemingly simple conditional in sched_add(). - Also in sched_add() check the transferable count and KSE_CAN_MIGRATE() prior to checking kseq_idle. This reduces the frequency of access for kseq_idle which is a shared resource.	2003-11-15 07:32:07 +00:00
cognet	690de3f7ac	Better fix than my previous commit: in exit1(), make sure the p_klist is empty after sending NOTE_EXIT. The process won't report fork() or execve() and won't be able to handle NOTE_SIGNAL knotes anyway. This fixes some race conditions with do_tdsignal() calling knote() while the process is exiting. Reported by: Stefan Farfeleder <stefan@fafoe.narf.at> MFC after: 1 week	2003-11-14 18:49:01 +00:00
kan	1246b503c6	Fix a number of style(9) bugs introduced in r1.113 by me. Suggested by: bde	2003-11-14 05:27:41 +00:00
jeff	76902f9650	- regen.	2003-11-14 03:49:41 +00:00
jeff	7491142e9a	- Revision 1.156 marked ptrace() SMP safe. Unfortunately, alpha implements parts of ptrace using proc_rwmem(). proc_rwmem() requires giant, and giant must be acquired prior to the proc lock, so ptrace must require giant still.	2003-11-14 03:48:37 +00:00
phk	c92feb226d	Various minor details: Give the HZ/overflow check a 10% margin. Eliminate bogus newline. If timecounters have equal quality, prefer higher frequency. Some inspiration from: bde	2003-11-13 10:03:58 +00:00
jhb	989e0408dd	- Close a race where a thread on another CPU could release a contested lock and empty its turnstile while the blocking threads still pointed to the turnstile. If the thread on the first CPU blocked on a lock owned by one of the threads blocked on the turnstile just woken up, then the first CPU could try to manipulate a bogus thread queue in the turnstile during priority propagation. - Update locking notes for ts_owner and always clear ts_owner, not just under INVARIANTS. Tested by: sam (1)	2003-11-12 23:48:42 +00:00
mckusick	b2ca74654c	At the request of several developers, restore the DIAGNOSTIC code deleted in 1.81. Increase the initial timeout limit to 2ms to eliminate spurious messages of excessive timeouts in the NFS client code. Requested by: Poul-Henning Kamp <phk@phk.freebsd.dk> Requested by: Mike Silbersack <silby@silby.com> Requested by: Sam Leffler <sam@errno.com>	2003-11-12 22:28:27 +00:00
rwatson	3f5efde5af	Mark __mac_get_pid() as MPSAFE in the comment, as it runs without Giant and is also MPSAFE. Push Giant further down into __mac_get_fd() and __mac_set_fd(), grabbing it only for constrained regions dealing with VFS, and dropping it entirely for operations related to labeling of pipes. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-11-12 22:19:15 +00:00
peter	e97c7f33cb	MNAMELEN is back to an int again after Kirk's statfs commit kern/vfs_mount.c:1305: warning: signed size_t format, different type arg (arg 4) *** Error code 1	2003-11-12 17:09:12 +00:00
jhb	b996af9fb8	Fix a typo in a comment. Submitted by: das	2003-11-12 14:55:45 +00:00
phk	70388f65e9	Replace B_PHYS conditional assignment to bio_offset with KASSERT check to see that the originating code already did it right.	2003-11-12 10:27:06 +00:00
mckusick	a7fc450278	Update the five files derived from /sys/kern/syscalls.master after the additions made for the new statfs structure (version 1.157). These must be updated in a separate checkin after syscalls.master has been checked in so that they reflect its new CVS identity. As these are purely derived files, it is not clear to me why they are under CVS at all. I presume that it has something to do with having `make world' operate properly.	2003-11-12 08:09:19 +00:00
mckusick	6a4c30bccd	Update the statfs structure with 64-bit fields to allow accurate reporting of multi-terabyte filesystem sizes. You should build and boot a new kernel BEFORE doing a `make world' as the new kernel will know about binaries using the old statfs structure, but an old kernel will not know about the new system calls that support the new statfs structure. Running an old kernel after a `make world' will cause programs such as `df' that do a statfs system call to fail with a bad system call. Reviewed by: Bruce Evans <bde@zeta.org.au> Reviewed by: Tim Robbins <tjr@freebsd.org> Reviewed by: Julian Elischer <julian@elischer.org> Reviewed by: the hordes of <arch@freebsd.org> Sponsored by: DARPA & NAI Labs.	2003-11-12 08:01:40 +00:00
rwatson	77ed6e2d1c	Modify the MAC Framework so that instead of embedding a (struct label) in various kernel objects to represent security data, we embed a (struct label *) pointer, which now references labels allocated using a UMA zone (mac_label.c). This allows the size and shape of struct label to be varied without changing the size and shape of these kernel objects, which become part of the frozen ABI with 5-STABLE. This opens the door for boot-time selection of the number of label slots, and hence changes to the bound on the number of simultaneous labeled policies at boot-time instead of compile-time. This also makes it easier to embed label references in new objects as required for locking/caching with fine-grained network stack locking, such as inpcb structures. This change also moves us further in the direction of hiding the structure of kernel objects from MAC policy modules, not to mention dramatically reducing the number of '&' symbols appearing in both the MAC Framework and MAC policy modules, and improving readability. While this results in minimal performance change with MAC enabled, it will observably shrink the size of a number of critical kernel data structures for the !MAC case, and should have a small (but measurable) performance benefit (i.e., struct vnode, struct socket) due to memory conservation and reduced cost of zeroing memory. NOTE: Users of MAC must recompile their kernel and all MAC modules as a result of this change. Because this is an API change, third party MAC modules will also need to be updated to make less use of the '&' symbol. Suggestions from: bmilekic Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-11-12 03:14:31 +00:00
kan	9352a05d40	1. Consolidate mount struct allocation/destruction into a common code in vfs_mount_alloc/vfs_mount_destroy functions and take care to completely destroy the mount point along with its locks. Mount struct has grown in complexity recently and depending on each failure path to destroy it completely isn't working anymore. 2. Eliminate largely identical vfs_mount and vfs_unmount question by moving the code to handle both cases into a newly introduced vfs_domount function. 3. Simplify nfs_mount_diskless to always expect an allocated mount struct and never attempt an allocation/destruction itself. The vfs_allocroot allocation was there to support 'magic' swap space configuration for diskless clients that was already removed by PHK some time ago. 4. Include a vfs_buildopts cleanups by Peter Edwards to validate the sanity of nmount parameters passed from userland. Submitted by: (4) Peter Edwards <peter.edwards@openet-telecom.com> Reviewed by: rwatson	2003-11-12 02:54:47 +00:00
jhb	6cc1f7e330	Add an implementation of turnstiles and change the sleep mutex code to use turnstiles to implement blocking instead of implementing a thread queue directly. These turnstiles are somewhat similar to those used in Solaris 7 as described in Solaris Internals but are also different. Turnstiles do not come out of a fixed-sized pool. Rather, each thread is assigned a turnstile when it is created that it frees when it is destroyed. When a thread blocks on a lock, it donates its turnstile to that lock to serve as queue of blocked threads. The queue associated with a given lock is found by a lookup in a simple hash table. The turnstile itself is protected by a lock associated with its entry in the hash table. This means that sched_lock is no longer needed to contest on a mutex. Instead, sched_lock is only used when manipulating run queues or thread priorities. Turnstiles also implement priority propagation inherently. Currently turnstiles only support mutexes. Eventually, however, turnstiles may grow two queues to support a non-sleepable reader/writer lock implementation. For more details, see the comments in sys/turnstile.h and kern/subr_turnstile.c. The two primary advantages from the turnstile code include: 1) the size of struct mutex shrinks by four pointers as it no longer stores the thread queue linkages directly, and 2) less contention on sched_lock in SMP systems including the ability for multiple CPUs to contend on different locks simultaneously (not that this last detail is necessarily that much of a big win). Note that 1) means that this commit is a kernel ABI breaker, so don't mix old modules with a new kernel and vice versa. Tested on: i386 SMP, sparc64 SMP, alpha SMP	2003-11-11 22:07:29 +00:00
jkoshy	ac83b0ec2b	Bound the number of iterations a thread can perform inside ktr_resize_pool(); this eliminates a potential livelock. Return ENOSPC only if we encountered an out-of-memory condition when trying to increase the pool size. Reviewed by: jhb, bde (style)	2003-11-11 09:09:26 +00:00
jkoshy	edc6e45a50	Have utrace(2) return ENOMEM if malloc() fails. Document this error return in its manual page. Reviewed by: jhb	2003-11-11 04:54:11 +00:00


6826 Commits