freebsd-skq

Author	SHA1	Message	Date
rafan	ff392b04b7	- Remove UMAP filesystem. It was disconnected from build three years ago, and it is seriously broken. Discussed on: freebsd-arch@ Approved by: re (mux)	2007-06-25 05:06:57 +00:00
delphij	1da67b5003	Use vfs_timestamp() instead of nanotime() - make it up to the user to make decisions about how detail they wanted timestamps to have.	2007-06-18 14:40:19 +00:00
delphij	d50b261fe6	MFp4: fix two locking problems: - Hold TMPFS_LOCK while updating tm_pages_used. - Hold vm page while doing uiomove. This will hopefully fix all known panics. Submitted by: Howard Su	2007-06-18 01:43:13 +00:00
delphij	ef518e4122	MFp4: Add tmpfs, an efficient memory file system. Please note that, this is currently considered as an experimental feature so there could be some rough edges. Consult http://wiki.freebsd.org/TMPFS for more information. For now, connect tmpfs to build on i386 and amd64 architectures only. Please let us know if you have success with other platforms. This work was developed by Julio M. Merino Vidal for NetBSD as a SoC project; Rohit Jalan ported it from NetBSD to FreeBSD. Howard Su and Glen Leeder are worked on it to continue this effort. Obtained from: NetBSD via p4 Submitted by: Howard Su (with some minor changes) Approved by: re (kensmith)	2007-06-16 01:56:05 +00:00
rwatson	00b02345d4	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
remko	5983912c3d	Correct corrupt read when the read starts at a non-aligned offset. PR: kern/77234 MFC After: 1 week Approved by: imp (mentor) Requested by: many many people Submitted by: Andriy Gapon <avg at icyb dot net dot ua>	2007-06-11 20:14:44 +00:00
attilio	12d804e413	rufetch and calcru sometimes should be called atomically together. This patch fixes places where they should be called atomically changing their locking requirements (both assume per-proc spinlock held) and introducing rufetchcalc which wrappers both calls to be performed in atomic way. Reviewed by: jeff Approved by: jeff (mentor)	2007-06-09 21:48:44 +00:00
bmah	229a009b9d	Fix off-by-one error (introduced in r1.60) that had the effect of disallowing a read of exactly MAXPHYS bytes. Reviewed by: des, rdivacky MFC after: 1 week Sponsored by: nCircle Network Security	2007-06-07 15:04:30 +00:00
jeff	91d1501790	Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling sychronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-05 00:00:57 +00:00
attilio	9bd4fdf7ce	Do proper "locking" for missing vmmeters part. Now, we assume no more sched_lock protection for some of them and use the distribuited loads method for vmmeter (distribuited through CPUs). Reviewed by: alc, bde Approved by: jeff (mentor)	2007-06-04 21:45:18 +00:00
trhodes	4173ae4155	Revert previous, part of NFS that I didn't know about.	2007-06-01 17:06:46 +00:00
trhodes	aae93b87b9	Garbage collect msdosfs_fhtovp; it appears unused and I have been using MSDOSFS without this function and problems for the last month.	2007-06-01 14:57:19 +00:00
kib	17260ba6f1	Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file: part 2. Convert calls missed in the first big commit. Noted by: rwatson Pointy hat to: kib	2007-06-01 14:33:11 +00:00
attilio	7dd8ed88a9	Revert VMCNT_* operations introduction. Probabilly, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)	2007-05-31 22:52:15 +00:00
kib	f13486a222	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
rwatson	be966bcc03	Where I previously removed calls to kdb_enter(), now remove include of kdb.h. Pointed out by: bde	2007-05-29 11:28:28 +00:00
rwatson	d162983163	Rather than entering the debugger via kdb_enter() when detecting memory corruption under SMBUFS_NAME_DEBUG, panic() with the same error message.	2007-05-27 13:12:36 +00:00
rwatson	b3193c8a43	Rather than entering the debugger via kdb_enter() in the event the root vnode is unexpectedly locked under NULLFS_DEBUG in nullfs and then returning EDEADLK, panic.	2007-05-27 13:10:16 +00:00
kib	162fa8dc6d	Since renaming of vop_lock to _vop_lock, pre- and post-condition function calls are no more generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.	2007-05-18 13:02:13 +00:00
jeff	e1996cb960	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>	2007-05-18 07:10:50 +00:00
des	cde2655812	The process lock is held when procfs_ioctl() is called. Assert that this is so, and PHOLD the process while sleeping since msleep() will release the lock.	2007-05-01 12:59:20 +00:00
des	d5a4cf1cbe	Fix old locking bugs which were revealed when pseudofs was made MPSAFE. Submitted by: tegge	2007-04-23 19:17:01 +00:00
rwatson	62d2d15116	Rename macdevfsdirent() to macdevfs() to synchronize with SEDarwin, where similar data structures exist to support devfs and the MAC Framework, but are named differently. Obtained from: TrustedBSD Project Sponsored by: SPARTA, Inc.	2007-04-23 13:36:54 +00:00
alc	11f5869ec4	Add synchronization. Eliminate the acquisition and release of Giant. Reviewed by: tegge	2007-04-23 06:12:24 +00:00
trhodes	a559ca6419	In some cases, like whenever devfs file times are zero, the fix(aa) will not be applied to dev entries. This leaves us with file times like "Jan 1 1970." Work around this problem by replacing the tv_sec == 0 check with a <= 3600 check. It's doubtful anyone will be booting within an hour of the Epoch, let alone care about a few seconds worth of nonzero timestamps. It's a hackish work around, but it does work and I have not experienced any negatives in my testing. Discussed with: bde "Ok with me: phk	2007-04-20 01:47:05 +00:00
des	6a33fe5e57	Avoid "unused variable" warning when building without PSEUDOFS_TRACE.	2007-04-15 20:35:18 +00:00
des	4632956a5b	Make pseudofs (and consequently procfs, linprocfs and linsysfs) MPSAFE.	2007-04-15 17:10:01 +00:00
des	29fb9c20c7	Instead of stating GIANT_REQUIRED, just acquire and release Giant where needed. This does not make a difference now, but will when procfs is marked MPSAFE.	2007-04-15 17:06:09 +00:00
des	7b0adac1da	Fix the same bug as in procfs_doproc{,db}regs(): check that uio_offset is 0 upon entry, and don't reset it before returning. MFC after: 3 weeks	2007-04-15 13:29:36 +00:00
des	5fa10252a8	Don't reset uio_offset to 0 before returning. Instead, refuse to service requests where uio_offset is not 0 to begin with. This fixes a long- standing bug where e.g. 'cat /proc/$$/regs' would loop forever. MFC after: 3 weeks	2007-04-15 13:24:03 +00:00
des	8dea6eb551	Further pseudofs improvements: The pfs_info mutex is only needed to lock pi_unrhdr. Everything else in struct pfs_info is modified only while Giant is held (during vfs_init() / vfs_uninit()); add assertions to that effect. Simplify pfs_destroy somewhat. Remove superfluous arguments from pfs_fileno_{alloc,free}(), and the assertions which were added in the previous commit to ensure they were consistent. Assert that Giant is held while the vnode cache is initialized and destroyed. Also assert that the cache is empty when it is destroyed. Rename the vnode cache mutex for consistency. Fix a long-standing bug in pfs_getattr(): it would uncritically return the node's pn_fileno as st_ino. This would result in st_ino being 0 if the node had not previously been visited by readdir(), and also in an incorrect st_ino for process directories and any files contained therein. Correct this by abstracting the fileno manipulations previously done in pfs_readdir() into a new function, pfs_fileno(), which is used by both pfs_getattr() and pfs_readdir().	2007-04-14 14:08:30 +00:00
des	517c13e66c	Add a flag to struct pfs_vdata to mark the vnode as dead (e.g. process- specific nodes when the process exits) Move the vnode-cache-walking loop which was duplicated in pfs_exit() and pfs_disable() into its own function, pfs_purge(), which looks for vnodes marked as dead and / or belonging to the specified pfs_node and reclaims them. Note that this loop is still extremely inefficient. Add a comment in pfs_vncache_alloc() explaining why we have to purge the vnode from the vnode cache before returning, in case anyone should be tempted to remove the call to cache_purge(). Move the special handling for pfstype_root nodes into pfs_fileno_alloc() and pfs_fileno_free() (the root node's fileno must always be 2). This also fixes a bug where pfs_fileno_free() would reclaim the root node's fileno, triggering a panic in the unr code, as that fileno was never allocated from unr to begin with. When destroying a pfs_node, release its fileno and purge it from the vnode cache. I wish we could put off the call to pfs_purge() until after the entire tree had been destroyed, but then we'd have vnodes referencing freed pfs nodes. This probably doesn't matter while we're still under Giant, but might become an issue later. When destroying a pseudofs instance, destroy the tree before tearing down the fileno allocator. In pfs_mount(), acquire the mountpoint interlock when required. MFC after: 3 weeks	2007-04-11 22:40:57 +00:00
des	4d2fbcb41b	Whitespace nits.	2007-04-05 13:43:00 +00:00
rwatson	765a83fd79	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
kris	2ddeaf9203	Annotate that this giant acqusition is dependent on tty locking.	2007-03-26 21:56:46 +00:00
maxim	5134b2c9f3	o cd9660 code repo-copied, update a comment.	2007-03-24 22:40:16 +00:00
tegge	214bc5723c	Make insmntque() externally visibile and allow it to fail (e.g. during late stages of unmount). On failure, the vnode is recycled. Add insmntque1(), to allow for file system specific cleanup when recycling vnode on failure. Change getnewvnode() to no longer call insmntque(). Previously, embryonic vnodes were put onto the list of vnode belonging to a file system, which is unsafe for a file system marked MPSAFE. Change vfs_hash_insert() to no longer lock the vnode. The caller now has that responsibility. Change most file systems to lock the vnode and call insmntque() or insmntque1() after a new vnode has been sufficiently setup. Handle failed insmntque*() calls by propagating errors to callers, possibly after some file system specific cleanup. Approved by: re (kensmith) Reviewed by: kib In collaboration with: kib	2007-03-13 01:50:27 +00:00
des	3d3e50beaf	Add a pn_destroy field to pfs_node. This field points to a destructor function which is called from pfs_destroy() before the node is reclaimed. Modify pfs_create_{dir,file,link}() to accept a pointer to a destructor function in addition to the usual attr / fill / vis pointers. This breaks both the programming and binary interfaces between pseudofs and its consumers. It is believed that there are no pseudofs consumers outside the source tree, so that the impact of this change is minimal. Submitted by: Aniruddha Bohra <bohra@cs.rutgers.edu>	2007-03-12 12:16:52 +00:00
mpp	61cebfa898	Change fifo_printinfo to check if the vnode v_fifoinfo pointer is NULL and print a message to that effect to prevent a panic.	2007-03-02 00:10:11 +00:00
jhb	9081d44243	Use pause() rather than tsleep() on stack variables and function pointers.	2007-02-27 17:23:29 +00:00
cognet	0592e954a1	Check that the error returned by vfs_getopts() is not ENOENT before assuming there's actually an error. This is just in order to unbreak ntfs on current, before a proper solution is committed.	2007-02-21 00:30:09 +00:00
rwatson	228e8a2b29	Do allow PIOCSFL in jail for setguid processes; this is more consistent with other debugging checks elsewhere. XXX comment on the fact that p_candebug() is not being used here remains.	2007-02-19 13:04:25 +00:00
pjd	cb2d7c85a8	Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs	2007-02-15 22:08:35 +00:00
rodrigc	b236d22de9	Forced commit and #include changes for repo copy from sys/isofs/cd9660 to sys/fs/cd9660. Discussed on freebsd-current.	2007-02-11 13:54:25 +00:00
rodrigc	9327f3ee0d	Add noatime to the list of mount options that msdosfs accepts. PR: 108896 Submitted by: Eugene Grosbein <eugen grosbein pp ru>	2007-02-08 02:30:55 +00:00
rodrigc	9a7c587caa	Style fixes: use ANSI C function declarations.	2007-02-08 02:25:35 +00:00
kib	0e5b15d726	Fix the race of dereferencing /proc/<pid>/file with execve(2) by caching the value of p_textvp. This way, we always unlock the locked vnode. While there, vhold() the vnode around the vn_lock(). Reported and tested by: Guy Helmer (ghelmer palisadesys com) Approved by: des (procfs maintainer) MFC after: 1 week	2007-02-07 10:30:49 +00:00
rodrigc	5efe7e2fa5	Eliminate some dead code which was introduced in 1.23, yet was always commented out.	2007-02-06 03:30:58 +00:00
pjd	a71f42e832	coda_vptofh is never defined nor used.	2007-02-02 15:47:28 +00:00
avatar	15257284d7	Fixing compilation bustage by removing references to opt_msdosfs.h. This auto-generated header file no longer exists since the removal of MSDOSFS_LARGE in sys/conf/options:1.574.	2007-01-30 08:05:04 +00:00
trhodes	810381390e	Fix spacing from my previous commit to this file: Noticed by: fjoe	2007-01-30 04:41:38 +00:00
rodrigc	fdf518fe9a	Add a "-o large" mount option for msdosfs. Convert compile-time checks for #ifdef MSDOSFS_LARGE to run-time checks to see if "-o large" was specified. Test case provided by Oliver Fromme: truncate -s 200G test.img mdconfig -a -t vnode -f test.img -u 9 newfs_msdos -s 419430400 -n 1 /dev/md9 zip250 mount -t msdosfs /dev/md9 /mnt # should fail mount -t msdosfs -o large /dev/md9 /mnt # should succeed PR: 105964 Requested by: Oliver Fromme <olli lurza secnetix de> Tested by: trhodes MFC after: 2 weeks	2007-01-30 03:11:45 +00:00
kib	79752b63e1	Below is slightly edited description of the LOR by Tor Egge: -------------------------- [Deadlock] is caused by a lock order reversal in vfs_lookup(), where [some] process is trying to lock a directory vnode, that is the parent directory of covered vnode) while holding an exclusive vnode lock on covering vnode. A simplified scenario: root fs var fs / A / (/var) D /var B /log (/var/log) E vfs lock C vfs lock F Within each file system, the lock order is clear: C->A->B and F->D->E When traversing across mounts, the system can choose between two lock orders, but everything must then follow that lock order: L1: C->A->B \| +->F->D->E L2: F->D->E \| +->C->A->B The lookup() process for namei("/var") mixes those two lock orders: VOP_LOOKUP() obtains B while A is held vfs_busy() obtains a shared lock on F while A and B are held (follows L1, violates L2) vput() releases lock on B VOP_UNLOCK() releases lock on A VFS_ROOT() obtains lock on D while shared lock on F is held vfs_unbusy() releases shared lock on F vn_lock() obtains lock on A while D is held (violates L1, follows L2) dounmount() follows L1 (B is locked while F is drained). Without unmount activity, vfs_busy() will always succeed without blocking and the deadlock isn't triggered (the system behaves as if L2 is followed). With unmount, you can get 4 processes in a deadlock: p1: holds D, want A (in lookup()) p2: holds shared lock on F, want D (in VFS_ROOT()) p3: holds B, want drain lock on F (in dounmount()) p4: holds A, want B (in VOP_LOOKUP()) You can have more than one instance of p2. The reversal was introduced in revision 1.81 of src/sys/kern/vfs_lookup.c and MFCed to revision 1.80.2.1, probably to avoid a cascade of vnode locks when nfs servers are dead (VFS_ROOT() just hangs) spreading to the root fs root vnode. - Tor Egge To fix the LOR, ups@ noted that when crossing the mount point, ni_dvp is actually not used by the callers of namei. Thus, placeholder deadfs vnode vp_crossmp is introduced that is filled into ni_dvp. Idea by: ups Reviewed by: tegge, ups, jeff, rwatson (mac interaction) Tested by: Peter Holm MFC after: 2 weeks	2007-01-22 11:25:22 +00:00
trhodes	317819a5f6	Add a 3rd entry in the cache, which keeps the end position from just before extending a file. This has the desired effect of keeping the write speed constant. And yes, that helps a lot copying large files always at full speed now, and I have seen improvements using benchmarks/bonnie. Stolen from: NetBSD Reviewed by: bde	2007-01-16 23:43:14 +00:00
pav	9ca9d354d9	Rewrite the udf_read() routine to use a file vnode instead of the devvp vnode. The code is modelled after cd9660, including support for simple read-ahead courtesy of clustered read. Fix udf_strategy to DTRT. This change fixes sendfile(2) not to send out garbage. Reviewed by: scottl MFC after: 1 month	2007-01-15 18:45:36 +00:00
pav	419ce53db8	Tell backing v_object the filesize right on it's creation. MFC after: 1 week	2007-01-07 23:53:16 +00:00
rodrigc	a533f9bcf6	When performing a mount update to change a mount from read-only to read-write, do not call markvoldirty() until the mount has been flagged as read-write. Due to the nature of the msdosfs code, this bug only seemed to appear for FAT-16 and FAT-32. This fixes the testcase: #!/bin/sh dd if=/dev/zero bs=1m count=1 oseek=119 of=image.msdos mdconfig -a -t vnode -f image.msdos newfs_msdos -F 16 /dev/md0 fd120m mount_msdosfs -o ro /dev/md0 /mnt mount \| grep md0 mount -u -o rw /dev/md0; echo $? mount \| grep md0 umount /mnt mdconfig -d -u 0 PR: 105412 Tested by: Eugene Grosbein <eugen grosbein pp ru>	2007-01-06 20:46:02 +00:00
rodrigc	605122b836	Simplify code in union_hashins() and union_hashget() functions. These functions now more closely resemble similar functions in nullfs. This also eliminates some errors. Submitted by: daichi, Masanori OZAWA <ozawa ongs co jp>	2007-01-05 14:06:42 +00:00
rodrigc	0da08222b7	Eliminate obsolete comment, now that getushort() is implemented in terms of functions in <sys/endian.h>.	2007-01-05 05:28:57 +00:00
rodrigc	ce2987dc85	Eliminate ASSERT_VOP_ELOCKED panics when doing mkdir or symlink when sysctl vfs.lookup_shared=1. Submitted by: daichi, Masanori OZAWA <ozawa ongs co jp>	2007-01-05 02:25:44 +00:00
jhb	bdddbb74e0	Use the vnode interlock to close a race where pfs_vncache_alloc() could attempt to vn_lock() a destroyed vnode resulting in a hang. MFC after: 1 week Submitted by: ups Reviewed by: des	2007-01-02 17:27:52 +00:00
pav	0fd460000c	Call vnode_create_vobject() in VOP_OPEN. Makes mmap work on UDF filesystem. PR: kern/92040 Approved by: scottl MFC after: 1 week	2006-12-23 18:53:22 +00:00
marcel	24b8c057ed	Unbreak 64-bit little-endian systems that do require alignment. The fix involves using le16dec(), le32dec(), le16enc() and le32enc(). This eliminates invalid casts and duplicated logic.	2006-12-21 05:40:46 +00:00
rodrigc	35ede48071	For big-endian version of getulong() macro, cast result to u_int32_t. This macro was written expecting a 32-bit unsigned long, and doesn't work properly on 64-bit systems. This bug caused vn_stat() to return incorrect values for files larger than 2gb on msdosfs filesystems on 64-bit systems. PR: 106703 Submitted by: Axel Gonzalez <loox e-shell net> MFC after: 3 days	2006-12-19 02:31:58 +00:00
rodrigc	24c66ef262	Fix get_ulong() macro on AMD64 (or any little-endian 64-bit platform). This bug caused vn_stat() to fail on files larger than 2gb on msdosfs filesystems on AMD64. PR: 106703 Tested by: Axel Gonzalez <loox e-shell net> MFC after: 3 days	2006-12-19 01:55:45 +00:00
rodrigc	ef305f6c5d	Remove unused variable in unionfs_root(). Submitted by: daichi, Masanori OZAWA	2006-12-09 17:24:18 +00:00
rodrigc	930aa00466	Use vfs_mount_error() in a few places to give more descriptive mount error messages.	2006-12-09 17:21:25 +00:00
rodrigc	b2bd033255	Add locking around calls to unionfs_get_node_status() in unionfs_ioctl() and unionfs_poll(). Submitted by: daichi, Masanori OZAWA <ozawa@ongs.co.jp> Prompted by: kris	2006-12-09 16:51:09 +00:00
rodrigc	951f903c66	In unionfs_readdir(), prevent a possible NULL dereference. CID: 1667 Found by: Coverity Prevent (tm)	2006-12-09 16:34:37 +00:00
rodrigc	e9a21b7fd8	In unionfs_hashrem(), use LIST_FOREACH_SAFE when iterating over the list of nodes to free them. CID: 1668 Found by: Coverity Prevent (tm)	2006-12-09 16:27:50 +00:00
rodrigc	ca5229f781	Minor cleanup. If we are doing a mount update, and we pass in an "export" flag indicating that we are trying to NFS export the filesystem, and the MSDOSFS_LARGEFS flag is set on the filesystem, then deny the mount update and export request. Otherwise, let the full mount update proceed normally. MSDOSFS_LARGES and NFS don't mix because of the way inodes are calculated for MSDOSFS_LARGEFS. MFC after: 3 days	2006-12-09 01:49:19 +00:00
kientzle	86c1b97857	The ISO9660 spec does allow files up to 4G. Change the i_size field to "unsigned long" so that it actually works. Thanks to Robert Sciuk for sending me a DVD that demonstrated ISO9660-formatted media with a file >2G. I've now fixed this both in libarchive and in the cd9660 filesystem. MFC after: 14 days	2006-12-08 07:43:53 +00:00
julian	396ed947f6	Threading cleanup.. part 2 of several. Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated. All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it. Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable. The ULE scheduler compiles again but I have no idea if it works. The 4bsd scheduler still reqires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit. Tested by David Xu, and Dan Eischen using libthr and libpthread.	2006-12-06 06:34:57 +00:00
maxim	b3ab8f2011	o Do not leave uninitialized birthtime: in MSDOSFSMNT_LONGNAME set birthtime to FAT CTime (creation time) and in the other cases set birthtime to -1. o Set ctime to mtime instead of FAT CTime which has completely different meaning. PR: kern/106018 Submitted by: Oliver Fromme MFC after: 1 month	2006-12-03 19:04:26 +00:00
rodrigc	792d126127	Add missing includes for <sys/buf.h> and <sys/bio.h>.	2006-12-02 22:30:30 +00:00
rodrigc	c4618bacd3	Many, many thanks to Masanori OZAWA <ozawa@ongs.co.jp> and Daichi GOTO <daichi@FreeBSD.org> for submitting this major rewrite of unionfs. This rewrite was done to try to solve many of the longstanding crashing and locking issues in the existing unionfs implementation. This implementation also adds a 'MASQUERADE mode', which allows the user to set different user, group, and file permission modes in the upper layer. Submitted by: daichi, Masanori OZAWA Reviewed by: rodrigc (modified for minor style issues)	2006-12-02 19:35:56 +00:00
maxim	24b0d3f0fd	o From the submitter: dos2unixchr will convert to lower case if LCASE_BASE or LCASE_EXT or both are set. But dos2unixfn uses dos2unixchr separately for the basename and the extension. So if either LCASE_BASE or LCASE_EXT is set, dos2unixfn will convert both the basename and extension to lowercase because it is blindly passing in the state of both flags to dos2unixchr. The bit masks I used ensure that only the state of LCASE_BASE gets passed to dos2unixchr when the basename is converted, and only the state of LCASE_EXT is passed in when the extension is converted. PR: kern/86655 Submitted by: Micah Lieske MFC after: 3 weeks	2006-11-26 18:49:44 +00:00
le	d8d1f1dab4	Fix an integer overflow and allow access to files larger than 4GB on NTFS.	2006-11-20 19:28:36 +00:00
kib	5da2fd53cf	Wake up PIOCWAIT handler on the process exit in addition to the stop events. &p->p_stype is explicitely woken up on process exit for us. Now, truss /nonexistent exits with error instead of waiting until killed by signal. Reported by: Nikos Vassiliadis nvass at teledomenet gr Reviewed by: jhb MFC after: 1 week	2006-11-17 14:52:38 +00:00
kmacy	0c00ea16db	change vop_lock handling to allowing tracking of callers' file and line for acquisition of lockmgr locks Approved by: scottl (standing in for mentor rwatson)	2006-11-13 05:51:22 +00:00
rwatson	10d0d9cf47	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
bp	85d804ea67	Create a bidirectional mapping of the DOS 'read only' attribute to the 'w' flag. PR: kern/77958 Submitted by: ghozzy gmail com MFC after: 1 month	2006-11-05 06:38:42 +00:00
jb	f82c799735	Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE). Reviewed by: davidxu@	2006-10-26 21:42:22 +00:00
phk	3334f92e7f	Ditch crummy fattime <--> timespec conversion functions	2006-10-24 11:55:18 +00:00
phk	a13705978b	Drop crummy fattime to timespec conversion routines. Leave a XXX here for anybody able to test.	2006-10-24 11:43:41 +00:00
phk	abedeeee55	Replace slightly crummy fattime<->timespec conversion functions.	2006-10-24 11:14:05 +00:00
rwatson	7beaaf5cd2	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
trhodes	7e0c59262e	Fake the link count until we have no choice but to load data from the MFT. PR: 86965 Submitted by: Lowell Gilbert <lgfbsd@be-well.ilk.org>	2006-10-21 08:17:17 +00:00
kib	8a69d7f5b4	Update the access and modification times for dev while still holding thread reference on it. Reviewed by: tegge Approved by: pjd (mentor)	2006-10-20 08:03:42 +00:00
kib	5f5bf9dadc	Fix the race between devfs_fp_check and devfs_reclaim. Derefence the vnode' v_rdev and increment the dev threadcount , as well as clear it (in devfs_reclaim) under the dev_lock(). Reviewed by: tegge Approved by: pjd (mentor)	2006-10-20 07:59:50 +00:00
kib	afa2e43fb6	Properly lock the vnode around vgone() calls. Unlock the vnode in devfs_close() while calling into the driver d_close() routine. devfs_revoke() changes by: ups Reviewed and bugfixes by: tegge Tested by: mbr, Peter Holm Approved by: pjd (mentor) MFC after: 1 week	2006-10-18 11:17:14 +00:00
phk	638e020bc6	Use utc_offset() where applicable, and hide the internals of it as static variables.	2006-10-02 18:23:37 +00:00
phk	50c81b8a9a	First part of a little cleanup in the calendar/timezone/RTC handling. Move relevant variables to <sys/clock.h> and fix #includes as necessary. Use libkern's much more time- & spamce-efficient BCD routines.	2006-10-02 12:59:59 +00:00
ru	4ef62e4ca5	Fix our ioctl(2) implementation when the argument is "int". New ioctls passing integer arguments should use the _IOWINT() macro. This fixes a lot of ioctl's not working on sparc64, most notable being keyboard/syscons ioctls. Full ABI compatibility is provided, with the bonus of fixing the handling of old ioctls on sparc64. Reviewed by: bde (with contributions) Tested by: emax, marius MFC after: 1 week	2006-09-27 19:57:02 +00:00
tegge	83154f853d	Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag. This eliminates a race where MNT_UPDATE flag could be lost when nmount() raced against sync(), sync_fsync() or quotactl().	2006-09-26 04:12:49 +00:00
kib	11200e2de3	Fix the bug in rev. 1.134. In devfs_allocv_drop_refs(), when not_found == 2 and drop_dm_lock is true, no unlocking shall be attempted. The lock is already dropped and memory is freed. Found with: Coverity Prevent(tm) CID: 1536 Approved by: pjd (mentor)	2006-09-19 14:03:02 +00:00
kib	ecf34f4504	Resolve the devfs deadlock caused by LOR between devfs_mount->dm_lock and vnode lock in devfs_allocv. Do this by temporary dropping dm_lock around vnode locking. For safe operation, add hold counters for both devfs_mount and devfs_dirent, and DE_DOOMED flag for devfs_dirent. The facilities allow to continue after dropping of the dm_lock, by making sure that referenced memory does not disappear. Reviewed by: tegge Tested by: kris Approved by: kan (mentor) PR: kern/102335	2006-09-18 13:23:08 +00:00
imp	478d307596	Put the osta.c license on osta.h. The license is the same. Approved by: scottl@	2006-09-12 19:02:34 +00:00
imp	db85f415fa	while (0); -> while (0) in multi-line macros	2006-08-17 22:50:33 +00:00
alc	b98eae58a6	Introduce a field to struct vm_page for storing flags that are synchronized by the lock on the object containing the page. Transition PG_WANTED and PG_SWAPINPROG to use the new field, eliminating the need for holding the page queues lock when setting or clearing these flags. Rename PG_WANTED and PG_SWAPINPROG to VPO_WANTED and VPO_SWAPINPROG, respectively. Eliminate the assertion that the page queues lock is held in vm_page_io_finish(). Eliminate the acquisition and release of the page queues lock around calls to vm_page_io_finish() in kern_sendfile() and vfs_unbusy_pages().	2006-08-09 17:43:27 +00:00
yar	209e4786e7	Commit the results of the typo hunt by Darren Pilgrim. This change affects documentation and comments only, no real code involved. PR: misc/101245 Submitted by: Darren Pilgrim <darren pilgrim bitfreak org> Tested by: md5(1) MFC after: 1 week	2006-08-04 07:56:35 +00:00
delphij	578a019ead	When the volume is being downgraded from a read-write mode, mark it as clean. PR: kern/85366 Submitted by: Dan Lukes <dan at obluda dot cz> MFC After: 2 weeks	2006-08-03 03:55:52 +00:00
yar	ee7ffb561f	In udf_find_partmaps(), when we find a type 1 partition map, we have to skip the actual type 1 length (6 bytes). With this change, it is now possible to correctly spot the VAT partition map in certain discs. Submitted by: Pedro Martelletto <pedro@ambientworks.net>	2006-07-25 14:15:50 +00:00
jhb	4e36804678	Update comment.	2006-07-18 22:29:54 +00:00
jhb	f126d3e34b	Lock the smb share before doing a 'put' on it in smbfs_unmount(). Tested by: "Jiawei Ye" <leafy7382 at gmail>	2006-07-17 16:13:42 +00:00
phk	0f924547b0	Remove the NDEVFSINO and NDEVFSOVERFLOW options which no longer exists in DEVFS. Remove the opt_devfs.h file now that it is empty.	2006-07-17 09:07:02 +00:00
ups	b5dc376dfa	Add vnode interlocking to devfs. This prevents race conditions that can cause pagefaults or devfs to use arbitrary vnodes. MFC after: 1 week	2006-07-12 20:25:35 +00:00
jhb	e09e5b52db	Add a kern_close() so that the ABIs can close a file descriptor w/o having to populate a close_args struct and change some of the places that do.	2006-07-08 20:03:39 +00:00
rwatson	ef5c0fe5ce	Remove unneeded mac.h include. MFC after: 3 days	2006-07-06 13:25:01 +00:00
rwatson	c9e5505f09	Remove now unneeded opt_mac.h and mac.h includes. MFC after: 3 days	2006-07-06 13:24:22 +00:00
rwatson	9d9a014802	Use #include "", not #include <> for opt_foo.h. MFC after: 3 days	2006-07-06 13:22:08 +00:00
netchild	1fbdab64b8	Correctly calculate a buffer length. It was off by one so a read() returned one byte less than needed. This is a RELENG_x_y candidate, since it fixes a problem with Oracle 10. Noticed by: Dmitry Ganenko <dima@apk-inform.com> Testcase by: Dmitry Ganenko <dima@apk-inform.com> Reviewed by: des Submitted by: rdivacky Sponsored by: Google SoC 2006 MFC after: 1 week	2006-06-27 20:21:38 +00:00
scottl	36eaea84d8	Fix a memory leak and a nested 'for' loop in the spare table handling. Submitted by: Pedro Martelletto	2006-06-26 03:21:19 +00:00
ghelmer	253ab973ad	Upon further review, DES prefers this change over that in revision 1.13 to resolve the directory access problem for processes with P_SUGID flag set. Suggested by: des	2006-06-05 16:41:27 +00:00
rodrigc	f00265f1cc	mount_msdosfs.c: - remove call to getmntopts(), and just pass -o options to nmount(). This removes some confusion as to what options msdosfs can parse, by pushing the responsibility of option parsing to the VFS and FS specific code in the kernel. msdosfs_vfsops.c: - add "force" and "sync" to msdosfs_opts. They used to be specified in mount_msdosfs.c, so move them here. It's not clear whethere these options should be placed into global_opts in vfs_mount.c or not. Motivated by: marcus	2006-06-01 02:25:00 +00:00
cperciva	4e501fd8a3	Enable inadvertantly disabled "securenet" access controls in ypserv. [1] Correct a bug in the handling of backslash characters in smbfs which can allow an attacker to escape from a chroot(2). [2] Security: FreeBSD-SA-06:15.ypserv [1] Security: FreeBSD-SA-06:16.smbfs [2]	2006-05-31 22:32:22 +00:00
rodrigc	92e37b9fef	Remove incorrect null_checkexp() routine. This will allow the NFS server to call vfs_stdcheckexp() on the exported nullfs filesystem, not the underlying filesystem being nullfs mounted. If the lower filesystem was not NFS exported, then the NFS exported null filesystem would not work. Pointed out by: scottl PR: kern/87906 MFC after: 1 week	2006-05-28 22:45:52 +00:00
rodrigc	d1d9c4f5bc	Modify MNT_UPDATE behavior for nullfs so that it does not return EOPNOTSUPP if an "export" parameter was passed in. This should allow nullfs mounts to be NFS exported. PR: kern/87906 MFC after: 1 week	2006-05-28 20:09:18 +00:00
rodrigc	52bbc2b4ab	Remove calls to vfs_export() for exporting a filesystem for NFS mounting from individual filesystems. Call it instead in vfs_mount.c, after we call VFS_MOUNT() for a specific filesystem.	2006-05-26 01:21:51 +00:00
rodrigc	055e2abe68	Remove calls to vfs_export() for exporting a filesystem for NFS mounting from individual filesystems. Call it instead in vfs_mount.c, after we call VFS_MOUNT() for a specific filesystem.	2006-05-26 00:32:21 +00:00
ups	d193a40233	Call vm_object_page_clean() with the object lock held. Submitted by: kensmith@ Reviewed by: mohans@ MFC after: 6 days	2006-05-25 17:16:11 +00:00
ups	4eb5a7d9ee	Do not set B_NOCACHE on buffers when releasing them in flushbuflist(). If B_NOCACHE is set the pages of vm backed buffers will be invalidated. However clean buffers can be backed by dirty VM pages so invalidating them can lead to data loss. Add support for flush dirty page in the data invalidation function of some network file systems. This fixes data losses during vnode recycling (and other code paths using invalbuf(,V_SAVE,,*)) for data written using an mmaped file. Collaborative effort by: jhb@,mohans@,peter@,ps@,ups@ Reviewed by: tegge@ MFC after: 7 days	2006-05-25 01:00:35 +00:00
ghelmer	8ffa3afe92	Revision 1.4 set access for all sensitive files in /proc/<PID> to mode 0 if a process's uid or gid has changed, but the /proc/<PID> directory itself was also set to mode 0. Assuming this doesn't open any security holes, open access to the /proc/<PID> directory for users other than root to read or search the directory. Reviewed by: des (back in February) MFC after: 3 weeks	2006-05-24 14:03:51 +00:00
phk	ef310efff8	Since DELAY() was moved, most <machine/clock.h> #includes have been unnecessary.	2006-05-16 14:37:58 +00:00
kbyanc	defe42e909	Restore the ability to mount procfs and fdescfs filesystems via the mount(2) system call: * Add cmount hook to fdescfs and pseudofs (and, by extension, procfs and linprocfs). This (mostly) restores the ability to mount these filesystems using the old mount(2) system call (see below for the rest of the fix). * Remove not-NULL check for the data argument from the mount(2) entry point. Per the mount(2) man page, it is up to the individual filesystem being mounted to verify data. Or, in the case of procfs, etc. the filesystem is free to ignore the data parameter if it does not use it. Enforcing data to be not-NULL in the mount(2) system call entry point prevented passing NULL to filesystems which ignored the data pointer value. Apparently, passing NULL was common practice in such cases, as even our own mount_std(8) used to do it in the pre-nmount(2) world. All userland programs in the tree were converted to nmount(2) long ago, but I've found at least one external program which broke due to this (presumably unintentional) mount(2) API change. One could argue that external programs should also be converted to nmount(2), but then there isn't much point in keeping the mount(2) interface for backward compatibility if it isn't backward compatible.	2006-05-15 19:42:10 +00:00
pjd	b8538a9381	Remove unused prototypes.	2006-04-12 12:17:29 +00:00
jeff	158187fcb0	- Add a bogus vhold/vdrop around vgone() in devfs_revoke. Without this the vnode is never recycled. It is bogus because the reference really should be associated with the devfs dirent.	2006-03-31 23:37:29 +00:00
tegge	1952671e7a	Call vn_start_write() before locking vnode.	2006-03-19 20:45:06 +00:00
rwatson	918de4c556	Add a_fdidx to comment prototype for fifo_open(). MFC after: 3 days Submitted by: Kostik Belousov <kostikbel at gmail dot com>	2006-03-15 10:15:35 +00:00
rwatson	40fd390520	If fifo_open() is called with a negative file descriptor, return EINVAL rather than panicking later. This can occur if the kernel calls vn_open() on a fifo, as there will be no associated file descriptor, and therefore the file descriptor operations cannot be modified to point to the fifo operation set. MFC after: 3 days Reported by: Martin <nakal at nurfuerspam dot de> PR: 94278	2006-03-14 19:29:45 +00:00
joerg	0ec2804cee	When encountering a ISO_SUSP_CFLAG_ROOT element in Rock Ridge processing, this actually means there's a double slash recorded in the symbolic link's path name. We used to start over from / then, which caused link targets like ../../bsdi.1.0/include//pathnames.h to be interpreted as /pathnahes.h. This is both contradictionary to our conventional slash interpretation, as well as potentially dangerous. The right thing to do is (obviously) to just ignore that element. bde once pointed out that mistake when he noticed it on the 4.4BSD-Lite2 CD-ROM, and asked me for help. Reviewed by: bde (about half a year ago) MFC after: 3 days	2006-03-13 22:32:33 +00:00
jeff	c98db28d0e	- Define a null_getwritemount to get the mount-point for the lower filesystem so that nullfs doesn't permit you to circumvent snapshots. Discussed with: tegge Sponsored by: Isilon Systems, Inc.	2006-03-12 04:58:18 +00:00
kris	f1759f2396	Correct the vnode locking in fdescfs. PR: kern/93905 Submitted by: Kostik Belousov <kostikbel@gmail.com> Reviewed by: jeff MFC After: 1 week	2006-02-28 00:05:44 +00:00
yar	3822005cd3	CODA_COMPAT_5 may not be defined unconditionally in the coda5 module. Otherwise a kernel build would break in the coda5 module if the main kernel conf file enabled CODA_COMPAT_5, too. Redefined symbols are strictly disallowed by -Werror. To overcome this issue, introduce a different symbol indicating coda5 build, CODA5_MODULE, and translate it to CODA_COMPAT_5 appropriately in /sys/coda/coda.h. MFC after: 3 days	2006-02-27 12:04:13 +00:00
jhb	ff9c76bccd	Close some races between procfs/ptrace and exit(2): - Reorder the events in exit(2) slightly so that we trigger the S_EXIT stop event earlier. After we have signalled that, we set P_WEXIT and then wait for any processes with a hold on the vmspace via PHOLD to release it. PHOLD now KASSERT()'s that P_WEXIT is clear when it is invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops to zero. - Change proc_rwmem() to require that the processing read from has its vmspace held via PHOLD by the caller and get rid of all the junk to screw around with the vmspace reference count as we no longer need it. - In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it doesn't exist. - Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem() to clear an earlier single-step simualted via a breakpoint). We only do one to avoid races. Also, by making the EINVAL error for unknown requests be part of the default: case in the switch, the various switch cases can now just break out to return which removes a _lot_ of duplicated PRELE and proc unlocks, etc. Also, it fixes at least one bug where a LWP ptrace command could return EINVAL with the proc lock still held. - Changed the locking for ptrace_single_step(), ptrace_set_pc(), and ptrace_clear_single_step() to always be called with the proc lock held (it was a mixed bag previously). Alpha and arm have to drop the lock while the mess around with breakpoints, but other archs avoid extra lock release/acquires in ptrace(). I did have to fix a couple of other consumers in kern_kse and a few other places to hold the proc lock and PHOLD. Tested by: ps (1 mostly, but some bits of 2-4 as well) MFC after: 1 week	2006-02-22 18:57:50 +00:00
jhb	82b4c89720	Change pfs_visible() to optionally return a pointer to the process associated with the passed in pfs_node. If it does return a pointer, it keeps the process locked. This allows a lot of places that were calling pfind() again right after pfs_visible() to not have to do that and avoids races since we don't drop the proc lock just to turn around and lock it again. This will become more important with future changes to fix races between procfs/ptrace and exit(2). Also, removed a duplicate pfs_visible() call in pfs_getextattr(). Reviewed by: des MFC after: 1 week	2006-02-22 17:24:54 +00:00
jhb	d6902d680b	Hold the proc lock while calling proc_sstep() since the function asserts it and remove a PRELE() that didn't have a matching PHOLD(). The calling code already has a PHOLD anyway. MFC after: 1 week	2006-02-22 17:20:37 +00:00
jeff	4c3ad6634a	- We must hold a reference to a vnode before calling vgone() otherwise it may not be removed from the freelist. MFC After: 1 week Found by: kris	2006-02-22 09:05:40 +00:00
jeff	d08a9e33bf	- spell VOP_LOCK(vp, LK_RELEASE... VOP_UNLOCK(vp,... so that asserts in vop_lock_post do not trigger. - Rearrange null_inactive to null_hashrem earlier so there is no chance of finding the null node on the hash list after the locks have been switched. - We should never have a NULL lowervp in null_reclaim() so there is no need to handle this situation. panic instead. MFC After: 1 week	2006-02-22 06:17:31 +00:00
jeff	79ca2be05c	- Assert that the lowervp is locked in null_hashget(). - Simplify the logic dealing with recycled vnodes in null_hashget() and null_hashins(). Since we hold the lower node locked in both cases the null node can not be undergoing recycling unless reclaim somehow called null_nodeget(). The logic that was in place was not safe and was essentially dead code. MFC After: 1 week	2006-02-22 06:15:12 +00:00
jeff	3133ba817d	- Deadfs should not use the std GETWRITEMOUNT routine. Add one that always returns NULL. MFC After: 1 week	2006-02-22 06:11:59 +00:00
jhb	6acd384eb7	Correctly set MNTK_MPSAFE flag from the lower vnode's mount rather than always turning it on along with any flags set in the lower mount. Tested by: kris Reviewed by: jeff MFC after: 3 days	2006-02-10 18:06:49 +00:00
jeff	9ea95f5e38	- No need to WANTPARENT when we're just going to vrele it in a deadlock prone way later. Reported by: kkenn MFC After: 3 days	2006-02-07 11:31:32 +00:00
will	a82365919d	Make UDF endian-safe. Submitted by: Pedro Martelletto <pedro@ambientworks.net> (via scottl) Tested on: sparc64	2006-02-03 15:25:52 +00:00
jeff	30a231055b	- Reorder calls to vrele() after calls to vput() when the vrele is a directory. vrele() may lock the passed vnode, which in these cases would give an invalid lock order of child -> parent. These situations are deadlock prone although do not typically deadlock because the vrele is typically not releasing the last reference to the vnode. Users of vrele must consider it as a call to vn_lock() and order it appropriately. MFC After: 1 week Sponsored by: Isilon Systems, Inc. Tested by: kkenn	2006-02-01 00:25:26 +00:00
jeff	af5f248494	- Remove a stale comment. This function was rewritten to be SMP safe some time ago. Sponsored by: Isilon Systems, Inc.	2006-01-30 08:24:14 +00:00
trhodes	54e5c67329	Update incorrect comments here, there should not be a call to panic() over fs corruption. Discussed with: alfred, phk	2006-01-23 17:45:57 +00:00
fjoe	e39df5af9f	Do not assume that `char direntry::deExtension[3]' starts right after `char direntry::deName[8]' and access deExtension[] explicitly. Found by: Coverity Prevent(tm) CID: 350, 351, 352	2006-01-22 21:09:38 +00:00
rwatson	56bc8d8e33	Convert last four functions in coda_vnops.c to ANSI C function declarations. I knew I would get to fix something in Coda eventually. MFC after: 1 week	2006-01-21 19:51:47 +00:00
alfred	a0282ebc04	I ran into an nfs client panic a couple of times in a row over the last few days. I tracked it down to the fact that nfs_reclaim() is setting vp->v_data to NULL _before_ calling vnode_destroy_object(). After silence from the mailing list I checked further and discovered that ufs_reclaim() is unique among FreeBSD filesystems for calling vnode_destroy_object() early, long before tossing v_data or much of anything else, for that matter. The rest, including NFS, appear to be identical, as if they were just clones of one original routine. The enclosed patch fixes all file systems in essentially the same way, by moving the call to vnode_destroy_object() to early in the routine (before the call to vfs_hash_remove(), if any). I have only tested NFS, but I've now run for over eighteen hours with the patch where I wouldn't get past four or five without it. Submitted by: Frank Mayhar Requested by: Mohan Srinivasan MFC After: 1 week	2006-01-17 17:29:03 +00:00
tegge	d344c11861	Add marker vnodes to ensure that all vnodes associated with the mount point are iterated over when using MNT_VNODE_FOREACH. Reviewed by: truckman	2006-01-09 20:42:19 +00:00
maxim	93d7e294fc	o Fix typo in the define: s/MRAK_INT_GEN/MARK_INT_GEN/. The typo was harmless because the define is not used in coda_vfsops.c. Submitted by: Hugo Meiland	2006-01-09 18:07:06 +00:00
maxim	e065d5a185	o Typo in the debug message: s/skiped/skipped. PR: kern/91346 Submitted by: Gavin Atkinson	2006-01-05 13:39:23 +00:00
rwatson	428f554873	When returning EIO from DEVFSIO_RADD ioctl, drop the exclusive rule lock. Otherwise the system comes to a rather sudden and grinding halt. MFC after: 1 week	2006-01-03 09:49:10 +00:00
trhodes	412f766852	Make tv_sec a time_t on all platforms but alpha. Brings us more in line with POSIX. This also makes the struct correct we ever implement an i386-time64 architecture. Not that we need too. Reviewed by: imp, brooks Approved by: njl (acpica), des (no objects, touches procfs) Tested with: make universe	2005-12-24 22:22:17 +00:00
des	5d3c44687b	Eradicate caddr_t from the VFS API.	2005-12-14 00:49:52 +00:00
avatar	257abc1fc1	Recent nmount(2) adoption in mount_smbfs(8) did not flag the "long" option since mount_smbfs(8) assumed long name mounting by default unless "-n long" was explicitly specified. Rather than supplying a "long" option in mount_smbfs(8), this commit brings back the original behaviour by associating SMBFS_MOUNT_NO_LONG with the "nolong" option. This should fix the broken long file names on smbfs people observed recently. Reported by: Vladimir Grebenschikov <vova at fbsd dot ru> Reviewed by: phk Tested by: Slawa Olhovchenkov <slw at zxy dot spb dot ru>	2005-12-05 19:05:06 +00:00
ru	9b19d72862	Fix -Wundef warnings found when compiling i386 LINT, GENERIC and custom kernels.	2005-12-05 11:58:35 +00:00
ru	798500dfd8	Fix -Wundef from compiling the amd64 LINT.	2005-12-04 10:06:06 +00:00
ru	522e9c2b7b	Fix -Wundef.	2005-12-04 02:12:43 +00:00
bp	4248fbe6a6	Fix interaction with Windows 2000/XP based servers: If the complete reply on the TRANS2_FIND_FIRST2 request fits exactly into one responce packet, then next call to TRANS2_FIND_NEXT2 will return zero entries and server will close current transaction. To avoid subsequent errors we should not perform FIND_CLOSE2 request. PR: kern/78953 Submitted by: Jim Carroll	2005-11-22 07:13:00 +00:00
rodrigc	736e6b710d	Properly parse the nowin95 mount option. Tested by: Rainer Hurling <rhurlin at gwdg dot de>	2005-11-19 16:38:39 +00:00
rodrigc	666e602c46	Add "shortnames" and "longnames" mount options which are synonyms for "shortname" and "longname" mount options. The old (before nmount()) mount_msdosfs program accepted "shortnames" and "longnames", but the kernel nmount() checked for "shortname" and "longname". So, make the kernel accept "shortnames", "longnames", "shortname", "longname" for forwards and backwarsd compatibility. Discovered by: Rainer Hurling <rhurlin at gwdg dot de>	2005-11-18 22:34:31 +00:00
rodrigc	bda951e116	- Add errmsg to the list of smbfs mount options. - Use vfs_mount_error() to propagate smbfs mount errors back to userspace. Reviewed by: bp (smbfs maintainer)	2005-11-16 02:26:25 +00:00
dwhite	0bcdf7c033	This is a workaround for a complicated issue involving VFS cookies and devfs. The PR and patch have the details. The ultimate fix requires architectural changes and clarifications to the VFS API, but this will prevent the system from panicking when someone does "ls /dev" while running in a shell under the linuxulator. This issue affects HEAD and RELENG_6 only. PR: 88249 Submitted by: "Devon H. O'Dell" <dodell@ixsystems.com> MFC after: 3 days	2005-11-09 22:03:50 +00:00
rwatson	be4f357149	Normalize a significant number of kernel malloc type names: - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat. - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters. - Disambiguate some collisions by adding subsystem prefixes to some memory types. - Generally prefer lower case to upper case. - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases. Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.	2005-10-31 15:41:29 +00:00
phk	3ed5c9efd0	Use correct cirteria for determining which directory entries we can purge right away and which we merely can hide. Beaten into my skull by: kris	2005-10-18 20:21:25 +00:00
des	4426988f2c	Implement the full range of ISO9660 number conversion routines in iso.h. MFC after: 2 weeks	2005-10-18 13:35:08 +00:00
rodrigc	cb57dd08dd	Unconditionally mount a CD9660 filesystem as read-only, instead of returning EROFS if we forget to mount it as read-only.	2005-10-17 03:29:53 +00:00
rodrigc	49d8776e03	Use the actual sector size of the media instead of hard-coding it to 2048. This eliminates KASSERTs in GEOM if we accidentally mount an audio CD as a cd9660 filesystem.	2005-10-17 03:27:35 +00:00
rodrigc	adce9d7a14	Unconditionally mount a UDF filesystem as read-only, instead of returning an EROFS if we forget to mount it as read-only.	2005-10-17 03:07:36 +00:00
flz	a0cf3f9b58	- Fix typo. Approved by: ssouhlal MFC after: 1 week	2005-10-17 00:04:35 +00:00
truckman	321926b9ba	Update nwfs_lookup() to match the current cache_lookup() API. cache_lookup() has returned a ref'ed and locked vnode since vfs_cache.c:1.96, dated Tue Mar 29 12:59:06 2005 UTC. This change is similar to the change made to smbfs_lookup() in smbfs_vnops.c:1.58. Tested by: "Antony Mawer" ant AT mawer.org MFC after: 2 weeks	2005-10-16 21:54:35 +00:00
kris	fa8ac58228	Reflect mpsafety of the underlying filesystem in the nullfs image. I benchmarked this by simultaneously extracting 4 large tarballs (basically world images) on a 4-processor AMD64 system, in a malloc-backed md. With this patch, system time was reduced by 43%, and wall clock time by 33%. Submitted by: jeff MFC after: 1 week	2005-10-16 21:45:25 +00:00
truckman	80700e8efc	Apply the same fix to a potential race in the ISDOTDOT code in cd9660_lookup() that was used to fix an actual race in ufs_lookup.c:1.78. This is not currently a hazard, but the bug would be activated by marking cd9660 as MPSAFE. Requested by: bde	2005-10-16 21:41:54 +00:00
yar	924e74a759	In preparation for making the modules actually use opt_*.h files provided in the kernel build directory, fix modules that were failing to build this way due to not quite correct kernel option usage. In particular: ng_mppc.c uses two complementary options, both of which are listed in sys/conf/files. Ideally, there should be a separate option for including ng_mppc.c in kernel build, but now only NETGRAPH_MPPC_ENCRYPTION is usable anyway, the other one requires proprietary files. nwfs and smbfs were trying to ensure they were built with proper network components, but the check was rather questionable. Discussed with: ru	2005-10-14 23:17:45 +00:00
davidxu	3fbdb3c215	1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most changes in MD code are trivial, before this change, trapsignal and sendsig use discrete parameters, now they uses member fields of ksiginfo_t structure. For sendsig, this change allows us to pass POSIX realtime signal value to user code. 2. Remove cpu_thread_siginfo, it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread. 3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc. 4. Add td_sigqueue to thread structure to hold all signals delivered to thread. 5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed. 6. In this sigqueue implementation, pending signal set is kept as before, an extra siginfo list holds additional siginfo_t data for signals. kernel code uses psignal() still behavior as before, it won't be failed even under memory pressure, only exception is when deleting a signal, we should call sigqueue_delete to remove signal from sigqueue but not SIGDELSET. Current there is no kernel code will deliver a signal with additional data, so kernel should be as stable as before, a ksiginfo can carry more information, for example, allow signal to be delivered but throw away siginfo data if memory is not enough. SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can not be caught or masked. The sigqueue() syscall allows user code to queue a signal to target process, if resource is unavailable, EAGAIN will be returned as specification said. Just before thread exits, signal queue memory will be freed by sigqueue_flush. Current, all signals are allowed to be queued, not only realtime signals. Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64	2005-10-14 12:43:47 +00:00
rodrigc	5eb4cdb703	- Do not hardcode the bsize to a sectorsize of 2048, even though the UDF specification specifies a logical sectorsize of 2048. Instead, get it from GEOM. - When reading the UDF Anchor Volume Descriptor, use the logical sectorsize of 2048 when calculating the offset to read from, but use the actual sectorsize to determine how much to read. - works with reading a DVD disk and a DVD disk image file via mdconfig - correctly returns EINVAL if we try to mount_udf an audio CD, instead of panicking inside GEOM when INVARIANTS is set	2005-10-09 04:45:33 +00:00
pjd	5c01c35a0c	We don't need 'imp' here.	2005-10-07 10:30:47 +00:00
rwatson	5758dab896	Second attempt at a work-around for fifo-related socket panics during make -j with high levels of parallelism: acquire Giant in fifo I/O routines. Discussed with: ups MFC after: 3 days	2005-10-01 20:15:41 +00:00
phk	b23c35710b	The NWFS code in RELENG_6 is broken due to a typo in sys/fs/nwfs/nwfs_vfsop= s.c, introduced with the conversion to nmount with revision 1.38. This causes mount_nwfs to fail with the error message: mount_nwfs: mount error: /mnt/netware: syserr = No such file or directo= ry This is caused by a typo on line 178, which specifies "nwfw_args" rather than "nwfs_args". Submitted by: Antony Mawer <gnats@mawer.org> Fat fingers: phk PR: 86757 MFC: 3 days	2005-09-30 18:21:05 +00:00
peadar	e0565b5794	Remove checks for BOOTSIG[23] from FAT32 bootblocks. There seems to be very little documentary evidence outside this implementation to suggest a these checks are neccessary, and more than one camera-formatted flash disk fails the check, but mounts successfully on most other systems. Reviewed By: bde@	2005-09-29 14:09:46 +00:00
rwatson	332a994af0	Back out fifo_vnops.c:1.127, which introduced an sx lock around I/O on a fifo. While this did indeed close the race, confirming suspicions about the nature of the problem, it causes difficulties with blocking I/O on fifos. Discussed with: ups Also spotted by: Peter Holm <peter at holm dot cc>	2005-09-27 16:45:22 +00:00
rwatson	87ac2d2498	Assert v_fifoinfo is non-NULL in fifo_close() in order to catch non-conforming cases sooner. MFC after: 3 days Reported by: Peter Holm <peter at holm dot cc>	2005-09-26 08:17:03 +00:00
rwatson	c9044f078d	Lock the read socket receive buffer when frobbing the sb_state flag on that socket during open, not the write socket receive buffer. This might explain clearing of the sb_state SB_LOCK flag seen occasionally in soreceive() on fifos. MFC after: 3 days Spotted by: ups	2005-09-25 19:52:09 +00:00
phk	a951e327e6	Make rule zero really magical, that way we don't have to do anything when we mount and get zero cost if no rules are used in a mountpoint. Add code to deref rules on unmount. Switch from SLIST to TAILQ. Drop SYSINIT, use SX_SYSINIT and static initializer of TAILQ instead. Drop goto, a break will do. Reduce double pointers to single pointers. Combine reaping and destroying rulesets. Avoid memory leaks in a some error cases.	2005-09-24 07:03:09 +00:00
rwatson	27cf2ffed6	For reasons of consistency (and necessity), assert an exclusive vnode lock on the fifo vnode in fifo_open(): we rely on the vnode lock to serialize access to v_fifoinfo. MFC after: 3 days	2005-09-23 12:39:51 +00:00
rwatson	9c2c1cb9fb	Add fi_sx, an sx lock to serialize I/O operations on the socket pair underlying the POSIX fifo implementation. In 6.x/7.x, fifo access is moved from the VFS layer, where it was serialized using the vnode lock, to the file descriptor layer, where access is protected by a reference count but not serialized. This exposed socket buffer locking to high levels of parallelism in specific fifo workloads, such as make -j 32, which expose as yet unresolved socket buffer bugs. fi_sx re-adds serialization about the read and write routines, although not paths that simply test socket buffer mbuf queue state, such as the poll and kqueue methods. This restores the extra locking cost previously present in some cases, but is an effective workaround for the instability that has been experienced. This workaround should be removed once the bug in socket buffer handling has been fixed. Reported by: kris, jhb, Julien Gabel <jpeg at thilelli dot net>, Peter Holm <peter at holm dot cc>, others MFC after: 3 days	2005-09-22 10:51:12 +00:00
phk	6a408cbd71	Rewamp DEVFS internals pretty severely [1]. Give DEVFS a proper inode called struct cdev_priv. It is important to keep in mind that this "inode" is shared between all DEVFS mountpoints, therefore it is protected by the global device mutex. Link the cdev_priv's into a list, protected by the global device mutex. Keep track of each cdev_priv's state with a flag bit and of references from mountpoints with a dedicated usecount. Reap the benefits of much improved kernel memory allocator and the generally better defined device driver APIs to get rid of the tables of pointers + serial numbers, their overflow tables, the atomics to muck about in them and all the trouble that resulted in. This makes RAM the only limit on how many devices we can have. The cdev_priv is actually a super struct containing the normal cdev as the "public" part, and therefore allocation and freeing has moved to devfs_devs.c from kern_conf.c. The overall responsibility is (to be) split such that kern/kern_conf.c is the stuff that deals with drivers and struct cdev and fs/devfs handles filesystems and struct cdev_priv and their private liason exposed only in devfs_int.h. Move the inode number from cdev to cdev_priv and allocate inode numbers properly with unr. Local dirents in the mountpoints (directories, symlinks) allocate inodes from the same pool to guarantee against overlaps. Various other fields are going to migrate from cdev to cdev_priv in the future in order to hide them. A few fields may migrate from devfs_dirent to cdev_priv as well. Protect the DEVFS mountpoint with an sx lock instead of lockmgr, this lock also protects the directory tree of the mountpoint. Give each mountpoint a unique integer index, allocated with unr. Use it into an array of devfs_dirent pointers in each cdev_priv. Initially the array points to a single element also inside cdev_priv, but as more devfs instances are mounted, the array is extended with malloc(9) as necessary when the filesystem populates its directory tree. Retire the cdev alias lists, the cdev_priv now know about all the relevant devfs_dirents (and their vnodes) and devfs_revoke() will pick them up from there. We still spelunk into other mountpoints and fondle their data without 100% good locking. It may make better sense to vector the revoke event into the tty code and there do a destroy_dev/make_dev on the tty's devices, but that's for further study. Lots of shuffling of stuff and churn of bits for no good reason[2]. XXX: There is still nothing preventing the dev_clone EVENTHANDLER from being invoked at the same time in two devfs mountpoints. It is not obvious what the best course of action is here. XXX: comment out an if statement that lost its body, until I can find out what should go there so it doesn't do damage in the meantime. XXX: Leave in a few extra malloc types and KASSERTS to help track down any remaining issues. Much testing provided by: Kris Much confusion caused by (races in): md(4) [1] You are not supposed to understand anything past this point. [2] This line should simplify life for the peanut gallery.	2005-09-19 19:56:48 +00:00
rwatson	2d8b6f2e27	Assert that (vp) is locked in fifo_close(), since we rely on the exclusive vnode lock to synchronize the reference counts on struct fifoinfo. MFC after: 3 days	2005-09-18 10:44:50 +00:00
phk	2d4ad1cc44	Don't attempt to recurse lockmgr, it doesn't like it.	2005-09-15 21:16:43 +00:00
kan	b4bff5a977	Handle a race condition where NULLFS vnode can be cleaned while threads can still be asleep waiting for lowervp lock. Tested by: kkenn Discussed with: ssouhlal, jeffr	2005-09-15 19:21:26 +00:00
rwatson	7584b6b2c3	The socket pointers in fifoinfo are not permitted to be NULL, so don't check if they are, it just confuses the fifo code more. MFC after: 3 days	2005-09-15 15:45:34 +00:00
phk	3c4b94c1fe	Various minor polishing.	2005-09-15 10:28:19 +00:00
phk	eafa84f647	Protect the devfs rule internal global lists with a sx lock, the per mount locks are not enough. Finer granularity (x)locking could be implemented, but I prefer to keep it simple for now.	2005-09-15 08:50:16 +00:00
phk	1a7de63bbb	Absolve devfs_rule.c from locking responsibility and call it with all necessary locking held.	2005-09-15 08:36:37 +00:00
phk	a543c5b761	Close a race which could result in unwarranted "ruleset %d already running" panics. Previously, recursion through the "include" feature was prevented by marking each ruleset as "running" when applied. This doesn't work for the case where two DEVFS instances try to apply the same ruleset at the same time. Instead introduce the sysctl vfs.devfs.rule_depth (default == 1) which limits how many levels of "include" we will traverse. Be aware that traversal of "include" is recursive and kernel stack size is limited. MFC: after 3 days	2005-09-15 06:57:28 +00:00
rwatson	0e4f08263a	Trim down now (believed to be) unused fifo_ioctl() and fifo_kqfilter() VOP implementations, since they in theory are used only on open file descriptors, in which case the ioctls are via fifo_ioctl_f() and kqueue requests are via fifo_kqfilter_f(). Generate warnings if they are entered for now. These printf() calls should become panic() calls. Annotate and re-implement fifo_ioctl_f(): don't arbitrarily forward ioctls to the socket layer, only forward the ones we explicitly support for fifos. In the case of FIONREAD, don't forward the request to the write socket on a read-write fifo, or the read result is overwritten. Annotate a nasty case for the undefined POSIX O_RDWR on fifos, in which failure of the second ioctl will result in the socket pair being in an inconsistent state. Assert copyright as I find myself rewriting non-trivial parts of fifofs. MFC after: 3 days	2005-09-13 17:46:48 +00:00
rwatson	afc7b6e916	As a result of kqueue locking work, socket buffer locks will always be held when entering a kqueue filter for fifos via a socket buffer event: as such, assert the lock unconditionally rather than acquiring it conditionall. MFC after: 3 days	2005-09-13 10:39:24 +00:00
rwatson	2bda369cf8	Annotate two issues: 1) fifo_kqfilter() is not actually ever used, it likely should be GC'd. 2) fifo_kqfilter_f() doesn't implement EVFILT_VNODE, so detecting events on the underlying vnode for a fifo no longer works (it did in 4.x). Likely, fifo_kqfilter_f() should forward the request to the VFS using fp->f_vnode, which would work once fifo_kqfilter() was detached from the vnode operation vector (removing the fifo override). Discussed with: phk	2005-09-13 09:23:22 +00:00

... 2 3 4 5 6 ...

2086 Commits