freebsd-nq

Author	SHA1	Message	Date
Robert Watson	68352adfe7	Define four constants, MBUF_{,MEM,CLUSTER,PACKET,TAG}_MEM_NAME, which are string names for their respective UMA zones and malloc types, and are passed into uma_zcreate() and MALLOC_DEFINE(). Export them outside of _KERNEL in mbuf.h so that netstat can reference them. Change the names to improve consistency, with each zone/type associated with the mbuf allocator being prefixed mbuf_. MFC after: 1 week	2005-07-17 14:04:03 +00:00
John Baldwin	122eceef61	Convert the atomic_ptr() operations over to operating on uintptr_t variables rather than void * variables. This makes it easier and simpler to get asm constraints and volatile keywords correct. MFC after: 3 days Tested on: i386, alpha, sparc64 Compiled on: ia64, powerpc, amd64 Kernel toolchain busted on: arm	2005-07-15 18:17:59 +00:00
Robert Watson	4f8721d2a9	Correct build on 64-bit: cast u_int64_t to (unsigned long long) before printfing as (unsigned long long). 32-bit build on i386 didn't notice this. Whoops. Reported by: arved Tested by: sledge	2005-07-14 15:21:18 +00:00
Robert Watson	cd814b2692	Introduce a new sysctl, kern.malloc_stats, which exports kernel malloc statistics via a binary structure stream: - Add structure 'malloc_type_stream_header', which defines a stream version, definition of MAXCPUS used in the stream, and a number of malloc_type records in the stream. - Add structure 'malloc_type_header', which defines the name of the malloc type being reported on. - When the sysctl is queried, return a stream header, followed by a series of type descriptions, each consisting of a type header followed by a series of MAXCPUS malloc_type_stats structures holding per-CPU allocation information. Typical values of MAXCPUS will be 1 (UP compiled kernel) and 16 (SMP compiled kernel). This query mechanism allows user space monitoring tools to extract memory allocation statistics in a machine-readable form, and to do so at a per-CPU granularity, allowing monitoring of allocation patterns across CPUs in order to better understand the distribution of work and memory flow over multiple CPUs. While here: - Bump statistics width to uint64_t, and hard code using fixed-width type in order to be more sure about structure layout in the stream. We allocate and free a lot of memory. - Add kmemcount, a counter of the number of registered malloc types, in order to avoid excessive manual counting of types. Export via a new sysctl to allow user-space code to better size buffers. - De-XXX comment on no longer maintaining the high watermark in old sysctl monitoring code. A follow-up commit of libmemstat(3), a library to monitor kernel memory allocation, will occur in the next few days. Likewise, similar changes to UMA.	2005-07-14 11:52:06 +00:00
Robert Watson	49bb6870cc	Bump the module versions of the MAC Framework and MAC policy modules from 2 (6.x) to 3 (7.x) to allow for future changes in the MAC policy module ABI in 7.x. Obtained from: TrustedBSD Project	2005-07-14 10:46:03 +00:00
Robert Watson	d26dd2d99e	When devfs cloning takes place, provide access to the credential of the process that caused the clone event to take place for the device driver creating the device. This allows cloned device drivers to adapt the device node based on security aspects of the process, such as the uid, gid, and MAC label. - Add a cred reference to struct cdev, so that when a device node is instantiated as a vnode, the cloning credential can be exposed to MAC. - Add make_dev_cred(), a version of make_dev() that additionally accepts the credential to stick in the struct cdev. Implement it and make_dev() in terms of a back-end make_dev_credv(). - Add a new event handler, dev_clone_cred, which can be registered to receive the credential instead of dev_clone, if desired. - Modify the MAC entry point mac_create_devfs_device() to accept an optional credential pointer (may be NULL), so that MAC policies can inspect and act on the label or other elements of the credential when initializing the skeleton device protections. - Modify tty_pty.c to register clone_dev_cred and invoke make_dev_cred(), so that the pty clone credential is exposed to the MAC Framework. While currently primarily focussed on MAC policies, this change is also a prerequisite for changes to allow ptys to be instantiated with the UID of the process looking up the pty. This requires further changes to the pty driver -- in particular, to immediately recycle pty nodes on last close so that the credential-related state can be recreated on next lookup. Submitted by: Andrew Reisse <andrew.reisse@sparta.com> Obtained from: TrustedBSD Project Sponsored by: SPAWAR, SPARTA MFC after: 1 week MFC note: Merge to 6.x, but not 5.x for ABI reasons	2005-07-14 10:22:09 +00:00
John Baldwin	2c65cb82ad	Add a 'sysent' target that depends on the various files built from syscalls.master for the master list and the Alpha/OSF1 compat ABI to be consistent with all the other compat ABIs where 'make sysent' already works. MFC after: 3 days	2005-07-13 20:50:17 +00:00
David Xu	740fd64d65	Validate if the value written into {FS,GS}.base is a canonical address, writting non-canonical address can cause kernel a panic, by restricting base values to 0..VM_MAXUSER_ADDRESS, ensuring only canonical values get written to the registers. Reviewed by: peter, Josepha Koshy < joseph.koshy at gmail dot com > Approved by: re (scottl)	2005-07-10 23:31:11 +00:00
John Baldwin	522ccb2381	Regen. Approved by: re (scottl)	2005-07-08 15:06:58 +00:00
John Baldwin	4acd2e73e5	Mark second instance of lchown() MP safe just like the first. Approved by: re (scottl)	2005-07-08 15:01:13 +00:00
John Baldwin	9f3157a254	Regenerate. Approved by: re (scottl)	2005-07-07 18:20:38 +00:00
John Baldwin	bcd9e0dd20	- Add two new system calls: preadv() and pwritev() which are like readv() and writev() except that they take an additional offset argument and do not change the current file position. In SAT speak: preadv:readv::pread:read and pwritev:writev::pwrite:write. - Try to reduce code duplication some by merging most of the old kern_foov() and dofilefoo() functions into new dofilefoo() functions that are called by kern_foov() and kern_pfoov(). The non-v functions now all generate a simple uio on the stack from the passed in arguments and then call kern_foov(). For example, read() now just builds a uio and calls kern_readv() and pwrite() just builds a uio and calls kern_pwritev(). PR: kern/80362 Submitted by: Marc Olzheim marcolz at stack dot nl (1) Approved by: re (scottl) MFC after: 1 week	2005-07-07 18:17:55 +00:00
Robert Watson	6758f88ea4	Add MAC Framework and MAC policy entry point mac_check_socket_create(), which is invoked from socket() and socketpair(), permitting MAC policy modules to control the creation of sockets by domain, type, and protocol. Obtained from: TrustedBSD Project Sponsored by: SPARTA, SPAWAR Approved by: re (scottl) Requested by: SCC	2005-07-05 22:49:10 +00:00
Pawel Jakub Dawidek	c23c87bd93	Fix one "wrong b_bufobj" panic in reassignbuf() by moving VI_UNLOCK(vp) below KASSERT()s, which means there was no real problem here, we just needed better locking for assertions. OK'ed by: jeff Approved by: re (scottl)	2005-07-05 15:57:55 +00:00
Suleiman Souhlal	571dcd15e2	Fix the recent panics/LORs/hangs created by my kqueue commit by: - Introducing the possibility of using locks different than mutexes for the knlist locking. In order to do this, we add three arguments to knlist_init() to specify the functions to use to lock, unlock and check if the lock is owned. If these arguments are NULL, we assume mtx_lock, mtx_unlock and mtx_owned, respectively. - Using the vnode lock for the knlist locking, when doing kqueue operations on a vnode. This way, we don't have to lock the vnode while holding a mutex, in filt_vfsread. Reviewed by: jmg Approved by: re (scottl), scottl (mentor override) Pointyhat to: ssouhlal Will be happy: everyone	2005-07-01 16:28:32 +00:00
Joseph Koshy	151392465f	MFP4: - pmcstat(8) gprof output mode fixes: lib/libpmc/pmclog.{c,h}, sys/sys/pmclog.h: + Add a 'is_usermode' field to the PMCLOG_PCSAMPLE event + Add an 'entryaddr' field to the PMCLOG_PROCEXEC event, so that pmcstat(8) can determine where the runtime loader /libexec/ld-elf.so.1 is getting loaded. sys/kern/kern_exec.c: + Use a local struct to group the entry address of the image being exec()'ed and the process credential changed flag to the exec handling hook inside hwpmc(4). usr.sbin/pmcstat/*: + Support "-k kernelpath", "-D sampledir". + Implement the ELF bits of 'gmon.out' profile generation in a new file "pmcstat_log.c". Move all log related functions to this file. + Move local definitions and prototypes to "pmcstat.h" - Other bug fixes: + lib/libpmc/pmclog.c: correctly handle EOF in pmclog_read(). + sys/dev/hwpmc_mod.c: unconditionally log a PROCEXIT event to all attached PMCs when a process exits. + sys/sys/pmc.h: correct a function prototype. + Improve usage checks in pmcstat(8). Approved by: re (blanket hwpmc)	2005-06-30 19:01:26 +00:00
Paul Saab	cff2e749e2	Use SCTL_MASK32 to determine that the sysctl call is from a 32bit binary for kern.cp_time. Approved by: re	2005-06-30 17:17:29 +00:00
Peter Wemm	62919d788b	Jumbo-commit to enhance 32 bit application support on 64 bit kernels. This is good enough to be able to run a RELENG_4 gdb binary against a RELENG_4 application, along with various other tools (eg: 4.x gcore). We use this at work. ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace, procfs and core dumps. procfs_regs.c: vary the format of proc/XXX/regs depending on the client and target application. procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their sscanf fails. They expect an unsigned long. imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps. sys_process.c: handle 32 bit consumers debugging 32 bit targets. Note that 64 bit consumers can still debug 32 bit targets. IA64 has got stubs for ia32_reg.c. Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't implemented in the 32/64 wrapper yet. We also make a tiny patch to gdb pacify it over conflicting formats of ld-elf.so.1. Approved by: re	2005-06-30 07:49:22 +00:00
Peter Wemm	48033188a6	Second part of commit for moving KDB_STOP_NMI from opt_global.h to opt_kdb.h. Found by: kris Approved by: re	2005-06-30 03:38:10 +00:00
Peter Wemm	2de92a386e	Conditionally weaken sys_generic.c rev 1.136 to allow certain dubious ioctl numbers in backwards compatability mode. eg: an IOC_IN ioctl with a size of zero. Traditionally this was what you did before IOC_VOID existed, and we had some established users of this in the tree, namely procfs. Certain 3rd party drivers with binary userland components also have this too. This is necessary to have 4.x and 5.x binaries use these ioctl's. We found this at work when trying to run 4.x binaries. Approved by: re	2005-06-30 00:19:08 +00:00
Peter Wemm	f0c6706de9	Move the KDB_STOP_NMI option from opt_global.h to opt_kdb.h Approved by: re	2005-06-29 23:23:16 +00:00
Mike Silbersack	a7b844d2be	Fix the false memory modified after free messages some users have been reporting - in my previous change, I missed the case where a mbuf from the packet zone was freed back to the mbuf/packet keg, where it was subsequently put into the mbuf zone and found not to contain the expected trash. This change adds the necessary trash_dtor call inside mb_fini_pack so that everything is correct. Thanks for Bosko for finding the bug and showing me how secondary zones work. Approved by: re (dwhite)	2005-06-29 08:18:26 +00:00
Dima Dorfman	1ee6b74603	Fix fdcheckstd to pass the file descriptor along through vn_open. When opening a device, devfs_open needs the file descriptor to install its own fileops. Failing to pass the file descriptor causes the vnode to be returned with the regular vnops, which will cause a panic on the first read or write because devfs_specops is not meant to support those operations. This bug caused a panic after exec'ing any set[ug]id program with fds 0..2 closed (i.e., if any action had to be taken by fdcheckstd, we would panic if the exec'd program ever tried to use any of those descriptors). Reviewed by: phk Approved by: re (scottl)	2005-06-25 03:34:49 +00:00
Pawel Jakub Dawidek	400a74bff8	Close another information leak in ktrace(2): one was able to find active process groups outside a jail, etc. by using ktrace(2). OK'ed by: rwatson Approved by: re (scottl) MFC after: 1 week	2005-06-24 12:05:24 +00:00
Peter Wemm	4da0d332f4	Move HWPMC_HOOKS into its own opt_hwpmc_hooks.h file. It doesn't merit being in opt_global.h and forcing a global recompile when only a few files reference it. Approved by: re	2005-06-24 00:16:57 +00:00
Pawel Jakub Dawidek	06a137780b	Actually only protect mount-point if security.jail.enforce_statfs is set to 2. If we don't return statistics about requested file systems, system tools may not work correctly or at all. Approved by: re (scottl)	2005-06-23 22:13:29 +00:00
John Baldwin	57dbcb11db	Fix a typo in a comment. Approved by: re (scottl)	2005-06-23 21:55:43 +00:00
Mike Silbersack	121f050976	Change the mbuf, mbuf cluster, and mbuf packet allocation routines so that the UMA "trash" allocator is used - this ensures that any writes to a freed mbuf should provoke a panic. Only enabled under INVARIANTS, of course. Approved by: re (scottl)	2005-06-23 04:33:39 +00:00
Pawel Jakub Dawidek	b0d9aedd28	Add missing unlock. Pointy hat to: pjd Approved by: re (dwhite)	2005-06-21 21:17:02 +00:00
John Baldwin	943928c905	Simplify the storming logic and remove a variable as a result. Approved by: re (dwhite)	2005-06-20 19:32:23 +00:00
Garance A Drosehn	bd3aace7e4	Fix a panic which could occur parsing #!-lines in a shell-script. If the #!-line had multiple whitespace characters after the interpreter name, and it did not have any options, then the code would do nasty things trying to process a (non-existent) option-string which "ended before it began"... Submitted by: Morten Johansen Approved by: re (dwhite)	2005-06-19 02:21:03 +00:00
Jeff Roberson	b770ff6eb2	- Try to catch the wrong bufobj panics a little earlier. I believe they are actually caused by a buf with both VNCLEAN and VNDIRTY set. In the traces it is clear that the buf is removed from the dirty queue while it is actually on the clean queue which leaves the tail pointer set. Assert that both flags are not set in buf_vlist_add and buf_vlist_remove. Sponsored by: Isilon Systems, Inc. Approved by: re (blanket vfs)	2005-06-18 18:17:03 +00:00
Jeff Roberson	32b6dcd8a4	- Fix a leaked reference to a vnode via v_dd. We rely on cache_purge() and cache_zap() to clear the v_dd pointers when a directory vnode is forcibly discarded. For this to work, all vnodes with v_dd pointers to a directory must also have name cache entries linked via v_cache_dst to that dvp otherwise we could not find them at cache_purge() time. The following code snipit could break this guarantee by unlinking a directory before fetching it's dotdot. The dotdot lookup would initialize the v_dd field of the unlinked directory which could never be cleared. To fix this we don't initialize v_dd for orphaned vnodes. printf("rmdir: %d\n", rmdir("../foo")); /* foo is cwd */ printf("chdir: %d\n", chdir("..")); printf("%s\n", getwd(NULL)); Sponsored by: Isilon Systems, Inc. Discovered by: kkenn Approved by: re (blanket vfs)	2005-06-17 01:05:13 +00:00
Ken Smith	c0cac8dc20	Remove a variable that became unused as a result of changes made in v1.139. This was only exposed if MALLOC_PROFILE was defined. Submitted by: Gary Jennejohn Pointy hat: rwatson Approved by: re (scottl)	2005-06-16 16:01:46 +00:00
Jeff Roberson	114a1006a8	- Change holdcnt use around vnode recycling. We now always keep a holdcnt ref while we're calling vgone(). This prevents transient refs from re-adding us to the free list. Previously, a vfree() triggered via vinvalbuf() getting rid of all of a vnode's pages could place a partially destructed vnode on the free list where vtryrecycle() could find it. The first call to vtryrecycle would hang up on the vnode lock, but when it failed it would place a now dead vnode onto the free list, and another call to vtryrecycle() would free an already free vnode. There were many complications of having a zero ref count while freeing which can now go away. - Change vdropl() to release the interlock before returning. All callers now respect this, so vdropl() directly frees VI_DOOMED vnodes once the last ref is dropped. This means that we'll never have VI_DOOMED vnodes on the free list. - Seperate v_incr_usecount() into v_incr_usecount(), v_decr_usecount() and v_decr_useonly(). The incr/decr split is so that incr usecount can return with the interlock still held while decr drops the interlock so it can call vdropl() which will potentially free the vnode. The calling function can't drop the lock of an already free'd node. v_decr_useonly() drops a usecount without droping the hold count. This is done so the usecount reaches zero in vput() before we recycle, however the holdcount is still 1 which prevents any new references from placing the vnode back on the free list. - Fix vnlrureclaim() to vhold the vnode since it doesn't do a vget(). We wouldn't want vnlrureclaim() to bump the usecount since this has different semantics. Also change vnlrureclaim() to do a NOWAIT on the vn_lock. When this function runs we're usually in a desperate situation and we wouldn't want to wait for any specific vnode to be released. - Fix a bunch of misc comments to reflect the new behavior. - Add vhold() and vdrop() to vflush() for the same reasons that we do in vlrureclaim(). Previously we held no reference and a vnode could have been freed while we were waiting on the lock. - Get rid of vlruvp() and vfreehead(). Neither are used. vlruvp() should really be rethought before it's reintroduced. - vgonel() always returns with the vnode locked now and never puts the vnode back on a free list. The vnode will be freed as soon as the last reference is released. Sponsored by: Isilon Systems, Inc. Debugging help from: Kris Kennaway, Peter Holm Approved by: re (blanket vfs)	2005-06-16 04:41:42 +00:00
Jeff Roberson	bdcd9f26b0	- Fix insertions of bios which represent data earlier than anything else in the queue. The insertion sort assumed this had already been taken care of. Spotted by: Antoine Brodin Approved by: re (scottl)	2005-06-15 23:32:07 +00:00
Jeff Roberson	7a06fe49dc	- Add and enhance asserts related to the wrong bufobj panic. Sponsored by: Isilon Systems, Inc. Approved by: re (blanket vfs)	2005-06-14 20:32:27 +00:00
Jeff Roberson	12c2dcde40	- In reassignbuf() add many asserts to validate the head and tail pointers of the clean and dirty lists. This is in an attempt to catch the wrong bufobj problem sooner. - In vgonel() don't acquire an extra reference in the active case, the vnode lock and VI_DOOMED protect us from recursively cleaning. - Also in vgonel() clean up some stale comments. Sponsored by: Isilon Systems, Inc. Approved by: re (blanket vfs)	2005-06-14 20:31:53 +00:00
Jeff Roberson	dbb3ec5ce3	- Remove vnode lock asserts at the end of vfs syscalls. These asserts were used to ensure that we weren't exiting the syscall with a lock still held. This wasn't safe, however, because we'd already executed a vput() and on a loaded system the vnode may have been free'd by the time we assert. This functionality is also handled by the td_locks assert in userret, which doesn't tell you what the syscall was, but will at least panic before you deadlock. Sponsored by: Isilon Systems, Inc. Discovred by: Peter Holm Approved by: re (blanket vfs)	2005-06-14 01:14:40 +00:00
Jeff Roberson	b930d85380	- Don't make vgonel() globally visible, we want to change its prototype anyway and it's not used outside of vfs_subr.c. - Change vgonel() to accept a parameter which determines whether or not we'll put the vnode on the free list when we're done. - Use the new vgonel() parameter rather than VI_DOOMED to signal our intentions in vtryrecycle(). - In vgonel() return if VI_DOOMED is already set, this vnode has already been reclaimed. Sponsored by: Isilon Systems, Inc.	2005-06-13 06:26:55 +00:00
Jeff Roberson	6bd8103d33	- Clear v_dd in cache_zap() instead of cache_purge() as cache_purge() may not be called in all cases where we free the cnp. Sponsored by: Isilon Systems, Inc.	2005-06-13 05:59:59 +00:00
Jeff Roberson	d598b04d44	- It has long been my suspicion that we don't actually need a loop in vn_lock(). Add an assert that will help me gain more confidence that this is correct. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:47:29 +00:00
Jeff Roberson	d2ad9baac0	- Add KTR_VFS events to vdestroy, vtruncbuf, vinvalbuf, vfreehead. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:46:37 +00:00
Jeff Roberson	eff2d12635	- Add KTR_VFS messages for various name cache related events. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:46:03 +00:00
Jeff Roberson	748c92fbad	- Split one KASSERT in bremfree() into two to aid in debugging. Sponsored by: Isilon Systems, Inc.	2005-06-13 00:45:05 +00:00
Jeff Roberson	f19f6869cf	- Dramatically simplify bioqdisksort(). We no longer do ordered bios so most of the code to deal with them has been dead for sometime. Simplify the code by doing an insert sort hinted by the current head position. Met with apathy by: arch@	2005-06-12 22:32:29 +00:00
Pawel Jakub Dawidek	65ac438c8f	Do not allocate memory while holding a mutex. I introduce a very small race here (some file system can be mounted or unmounted between 'count' calculation and file systems list creation), but it is harmless. Found by: FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/ Reported by: Peter Holm <peter@holm.cc>	2005-06-12 07:03:23 +00:00
Pawel Jakub Dawidek	3a996d6e91	Do not allocate memory based on not-checked argument from userland. It can be used to panic the kernel by giving too big value. Fix it by moving allocation and size verification into kern_getfsstat(). This even simplifies kern_getfsstat() consumers, but destroys symmetry - memory is allocated inside kern_getfsstat(), but has to be freed by the caller. Found by: FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/ Reported by: Peter Holm <peter@holm.cc>	2005-06-11 14:58:20 +00:00
Maxim Konovalov	922a5d9c2b	o setsockopt(2) cannot remove accept filter. [1] o getsockopt(SO_ACCEPTFILTER) always returns success on listen socket even we didn't install accept filter on the socket. o Fix these bugs and add regression tests for them. Submitted by: Igor Sysoev [1] Reviewed by: alfred MFC after: 2 weeks	2005-06-11 11:59:48 +00:00
Jeff Roberson	d6dbf760a6	- Assert that we're not in the name cache anymore in vdestroy(). Sponsored by: Isilon Systems, Inc.	2005-06-11 08:48:09 +00:00

1 2 3 4 5 ...

8697 Commits