freebsd-skq

Author	SHA1	Message	Date
Konstantin Belousov	e3c7e75305	Look up the directory entry for the tmpfs node being deleted by both node pointer and name component. This does the right thing for hardlinks to the same node in the same directory. Submitted by: Yoshihiro Ota <ota j email ne jp> PR: kern/131356 MFC after: 2 weeks	2009-02-08 19:18:33 +00:00
John Baldwin	e3024df2e0	Add rudimentary support for symbolic links on UDF. Links are stored as a sequence of pathname components. We walk the list building a string in the caller's passed in buffer. Currently this only handles path names in CS8 (character set 8) as that is what mkisofs generates for UDF images. MFC after: 1 month	2009-02-06 22:24:03 +00:00
John Baldwin	61e69c80e4	Add support for fifos to UDF: - Add a separate set of vnode operations that inherits from the fifo ops and use it for fifo nodes. - Add a VOP_SETATTR() method that allows setting the size (by silently ignoring the requests) of fifos. This is to allow O_TRUNC opens of fifo devices (e.g. I/O redirection in shells using ">"). - Add a VOP_PRINT() handler while I'm here.	2009-02-06 20:09:14 +00:00
John Baldwin	8941aad19b	Tweak the output of VOP_PRINT/vn_printf() some. - Align the fifo output in fifo_print() with other vn_printf() output. - Remove the leading space from lockmgr_printinfo() so its output lines up in vn_printf(). - lockmgr_printinfo() now ends with a newline, so remove an extra newline from vn_printf().	2009-02-06 20:06:48 +00:00
Bjoern A. Zeeb	13fd4d2163	After r186194 the fs_strategy() functions always return 0. So we are no longer interested in the error returned from the fs_doio() functions. With that we can remove the error variable as its value is unused now. Submitted by: Christoph Mallon christoph.mallon@gmx.de	2009-01-31 18:06:34 +00:00
Bjoern A. Zeeb	7956d34b95	Remove unused local variables. Submitted by: Christoph Mallon christoph.mallon@gmx.de Reviewed by: kib MFC after: 2 weeks	2009-01-31 17:36:22 +00:00
Ed Schouten	f3b86a5fd7	Mark most often used sysctl's as MPSAFE. After running a `make buildkernel', I noticed most of the Giant locks in sysctl are only caused by a very small amount of sysctl's: - sysctl.name2oid. This one is locked by SYSCTL_LOCK, just like sysctl.oidfmt. - kern.ident, kern.osrelease, kern.version, etc. These are just constant strings. - kern.arandom, used by the stack protector. It is already protected by arc4_mtx. I also saw the following sysctl's show up. Not as often as the ones above, but still quite often: - security.jail.jailed. Also mark security.jail.list as MPSAFE. They don't need locking or already use allprison_lock. - kern.devname, used by devname(3), ttyname(3), etc. This seems to reduce Giant locking inside sysctl by ~75% in my primitive test setup.	2009-01-28 19:58:05 +00:00
Warner Losh	bb5d3b71d3	Use the correct field name for the size of the sierra_id. While this is the same size as id, and is unlikely to change, it seems better to use the correct field here. There's no difference in the generated code.	2009-01-28 19:09:49 +00:00
John Baldwin	c222ece0cc	Mark cd9660 MPSAFE and add support for using shared vnode locks during pathname lookups. - Remove 'i_offset' and 'i_ino' from the ISO node structure and replace them with local variables in the lookup routine instead. - Cache a copy of 'i_diroff' for use during a lookup in a local variable. - Save a copy of the found directory entry in a malloc'd buffer after a successful lookup before getting the vnode. This allows us to release the buffer holding the directory block before calling vget() which otherwise resulted in a LOR between "bufwait" and the vnode lock. - Use an inlined version of vn_vget_ino() to handle races with .. lookups. I had to inline the code here since cd9660 uses an internal vget routine to save a disk I/O that would otherwise re-read the directory block. - Honor the requested locking flags during lookups to allow for shared locking. - Honor the requested locking flags passed to VFS_ROOT() and VFS_VGET() similar to UFS. - Don't make every ISO 9660 vnode hold a reference on the vnode of the underlying device vnode of the mountpoint. The mountpoint already holds a suitable reference.	2009-01-28 18:54:56 +00:00
John Baldwin	04c98d464f	Sync with ufs_vnops.c:1.245 and remove support for accessing device nodes in ISO 9660 filesystems.	2009-01-28 18:46:29 +00:00
John Baldwin	be09858abd	Assert an exclusive vnode lock for fifo_cleanup() and fifo_close() since they change v_fifoinfo. Discussed with: ups (a while ago)	2009-01-28 18:10:57 +00:00
Ed Schouten	a4611ab612	Last step of splitting up minor and unit numbers: remove minor(). Inside the kernel, the minor() function was responsible for obtaining the device minor number of a character device. Because we made device numbers dynamically allocated and independent of the unit number passed to make_dev() a long time ago, it was actually a misnomer. If you really want to obtain the device number, you should use dev2udev(). We already converted all the drivers to use dev2unit() to obtain the device unit number, which is still used by a lot of drivers. I've noticed not a single driver passes NULL to dev2unit(). Even if they would, its behaviour would make little sense. This is why I've removed the NULL check. This commit removes minor(), minor2unit() and unit2minor() from the kernel. Because there was a naming collision with uminor(), we can rename umajor() and uminor() back to major() and minor(). This means that the makedev(3) manual page also applies to kernel space code now. I suspect umajor() and uminor() aren't used that often in external code, but to make it easier for other parties to port their code, I've increased __FreeBSD_version to 800062.	2009-01-28 17:57:16 +00:00
Konstantin Belousov	e442a285a6	The kernel may do unbalanced calls to fifo_close() for fifo vnode, without corresponding number of fifo_open(). This causes assertion failure in fifo_close() due to vp->v_fifoinfo being NULL for kernel with INVARIANTS, or NULL pointer dereference otherwise. In fact, we may ignore excess calls to fifo_close() without bad consequences. Turn KASSERT() into the return, and print warning for now. Tested by: pho Reviewed by: rwatson MFC after: 2 weeks	2009-01-26 14:21:00 +00:00
Edward Tomasz Napierala	abb0cbf9c9	Turn a "panic: non-decreasing id" into an error printf. This seems to be caused by a metadata corruption that occurs quite often after unplugging a pendrive during write activity. Reviewed by: scottl Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation	2009-01-13 22:35:26 +00:00
Edward Tomasz Napierala	f99f675d5a	Fix msdosfs_print(), which in turn fixes "show lockedvnods" for msdosfs vnodes. Reviewed by: kib Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation	2009-01-11 17:11:01 +00:00
Joe Marcus Clarke	4424c9d053	Fix a deadlock which can occur due to a pseudofs vnode not getting unlocked. Reported by: Richard Todd <rmtodd@ichotolot.servalan.com> Reviewed by: kib Approved by: kib	2009-01-09 22:06:48 +00:00
Edward Tomasz Napierala	71624181c8	Don't panic with "vinvalbuf: dirty bufs" when the mounted device that was being written to goes away. Reviewed by: kib, scottl Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation	2009-01-08 19:13:34 +00:00
Joe Marcus Clarke	e7f54c1b71	Add a VOP_VPTOCNP implementation for pseudofs which covers file systems such as procfs and linprocfs. This implementation's locking was enhanced by kib. Reviewed by: kib des Approved by: des kib Tested by: pho	2008-12-30 21:49:39 +00:00
Konstantin Belousov	78e4cea909	When the insmntque() in the pfs_vncache_alloc() fails, vop_reclaim calls pfs_vncache_free() that removes pvd from the list, while it is not yet put on the list. Prevent the invalid removal from the list by clearing pvd_next and pvd_prev for the newly allocated pvd, and only move pfs_vncache list head when the pvd was at the head. Suggested and approved by: des MFC after: 2 weeks	2008-12-29 13:25:58 +00:00
Konstantin Belousov	22a448c4d9	vm_map_lock_read() does not increment map->timestamp, so we should compare map->timestamp with saved timestamp after map read lock is reacquired, not with saved timestamp + 1. The only consequence of the +1 was unconditional lookup of the next map entry, though. Tested by: pho Approved by: des MFC after: 2 weeks	2008-12-29 12:45:11 +00:00
Konstantin Belousov	c990bf0896	Use curproc->p_sysent->sv_flags bit SV_ILP32 for detection of the 32 bit caller, instead of direct comparison with ia32_freebsd_sysvec. Tested by: pho Approved by: des MFC after: 2 weeks	2008-12-29 12:41:32 +00:00
Konstantin Belousov	505d02eebe	Drop the pseudofs vnode lock around call to pfs_read handler. The handler may need to lock arbitrary vnodes, causing either lock order reversal or recursive vnode lock acquisition. Tested by: pho Approved by: des MFC after: 2 weeks	2008-12-29 12:12:23 +00:00
Konstantin Belousov	99ec92c962	After the pfs_vncache_mutex is dropped, another thread may attempt to do pfs_vncache_alloc() for the same pfs_node and pid. In this case, we could end up with two vnodes for the pair. Recheck the cache under the locked pfs_vncache_mutex after all sleeping operations are done [1]. This case mostly cannot happen now because pseudofs uses exclusive vnode locking for lookup. But it does drop the vnode lock for dotdot lookups, and Marcus' pseudofs_vptocnp implementation is vulnerable too. Do not call free() on the struct pfs_vdata after insmntque() failure, because vp->v_data points to the structure, and pseudofs_reclaim() frees it by the call to pfs_vncache_free(). Tested by: pho [1] Approved by: des MFC after: 2 weeks	2008-12-29 12:07:18 +00:00
Edward Tomasz Napierala	0da50f6ef8	According to phk@, VOP_STRATEGY should never, _ever_, return anything other than 0. Make it so. This fixes "panic: VOP_STRATEGY failed bp=0xc320dd90 vp=0xc3b9f648", encountered when writing to an orphaned filesystem. Reason for the panic was the following assert: KASSERT(i == 0, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp)); at vfs_bio:bufstrategy(). Reviewed by: scottl, phk Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation	2008-12-16 21:13:11 +00:00
Konstantin Belousov	c7462f4387	Reference the vmspace of the process being inspected by procfs, linprocfs and sysctl kern_proc_vmmap handlers. Reported and tested by: pho Reviewed by: rwatson, des MFC after: 1 week	2008-12-12 12:12:36 +00:00
Konstantin Belousov	c7c7520a95	Do not leak devfs_de_interlock on error. Another pointy hat for my collection.	2008-12-12 11:10:10 +00:00
Joe Marcus Clarke	4c44fd376a	Implement VOP_VPTOCNP for devfs. Directory and character device vnodes are properly translated to their component names. Reviewed by: arch Approved by: kib	2008-12-12 01:00:38 +00:00
Joe Marcus Clarke	933efde2e5	Add a simple VOP_VPTOCNP implementation for deadfs which returns EBADF. Reviewed by: arch Approved by: kib	2008-12-12 00:59:36 +00:00
Konstantin Belousov	c96f374195	Relock user map earlier, to have the lock held when break leaves the loop earlier due to sbuf error. Pointy hat to: me Submitted by: dchagin	2008-12-10 16:11:09 +00:00
Konstantin Belousov	9499cb83bf	Make two style changes to create new commit and document proper commit message for r185765. Noted by: rdivacky Requested by: des Commit message for r185765 should be: In procfs map handler, and in linprocfs maps handler, do not call vn_fullpath() while having vm map locked. This is done in anticipation of the vop_vptocnp commit, that would make vn_fullpath sometimes acquire vnode lock. Also, in linprocfs, maps handler already acquires vnode lock. No objections from: des MFC after: 2 weeks	2008-12-08 13:15:31 +00:00
Konstantin Belousov	5a66e0259b	Change the linprocfs <pid>/maps and procfs <pid>/map handlers to use sbuf instead of doing uiomove. This allows for reads from non-zero offsets to work. The patch is a forward-port of des@'s, adapted to the current code by dchagin@ and me. Reviewed by: des (linprocfs part) PR: kern/101453 MFC after: 1 week	2008-12-08 12:34:52 +00:00
Tim Kientzle	5c423e0640	The timezone byte is a signed value, treat it as such. Otherwise, time zone information for time zones west of GMT gets discarded. PR: kern/128934 Submitted by: J.R. Oldroyd MFC after: 4 days	2008-11-27 06:21:04 +00:00
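A minimal sketch of why signedness matters for commit 5c423e0640, assuming the ISO 9660 convention that the timezone byte counts 15-minute intervals from GMT in the range -48 (west) to +52 (east). The helper names below are hypothetical and this is not the cd9660 code itself; it only illustrates how reading the byte as unsigned makes every western offset fail the range check.

```c
#include <stdint.h>

/* Valid range of the timezone byte, in 15-minute units (ISO 9660). */
#define TZ_MIN (-48)
#define TZ_MAX 52

/*
 * Hypothetical helper: interpret the raw byte as a signed value and
 * return the offset in seconds east of GMT, or 0 if out of range.
 */
static int tz_seconds_signed(uint8_t raw)
{
    /* Portable two's-complement reinterpretation of the byte. */
    int tz = (raw >= 128) ? (int)raw - 256 : (int)raw;

    if (tz < TZ_MIN || tz > TZ_MAX)
        return 0;
    return tz * 15 * 60;
}

/*
 * Hypothetical helper showing the bug: the byte is treated as
 * unsigned, so negative offsets appear as 208..255 and are discarded.
 */
static int tz_seconds_unsigned(uint8_t raw)
{
    int tz = raw; /* always 0..255 */

    if (tz < TZ_MIN || tz > TZ_MAX)
        return 0;
    return tz * 15 * 60;
}
```

For a raw byte of 0xEC (-20 as a signed value, i.e. five hours west of GMT), the signed helper yields -18000 seconds while the unsigned one returns 0, which is the "discarded" behaviour the commit message describes.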
Konstantin Belousov	5147a76a0e	In null_lookup(), do the needed cleanup instead of panicking saying the cleanup is needed. Reported by: kris, pho Tested by: pho MFC after: 2 weeks	2008-11-26 13:41:15 +00:00
Ulf Lilleengen	f7b8cfa890	- Support IEEE_P1282 and IEEE_1282 tags in the rock ridge extensions record. PR: kern/128942 Submitted by: "J.R. Oldroyd" <fbsd - at - opal.com>	2008-11-26 13:09:45 +00:00
Daichi GOTO	16385727ce	Simplify mode_t check treatment (suggested by trasz). By semantical view, trasz's code is better than prior one. Submitted by: trasz Reviewed by: Masanori OZAWA <ozawa@ongs.co.jp>	2008-11-25 03:49:41 +00:00
Daichi GOTO	1e5da15a63	Fixes Unionfs socket issue reported as kern/118346. PR: 118346 Submitted by: Masanori OZAWA <ozawa@ongs.co.jp> Discussed at: devsummit Strassburg, EuroBSDCon2008 Discussed with: rwatson, gnn, hrs MFC after: 2 week	2008-11-25 03:18:35 +00:00
John Baldwin	66a6ea1de2	- Fix a typo in a comment. - Whitespace fix. - Remove #if 0'd BSD 4.x code for flushing busy buffers from a mountpoint during an unmount. FreeBSD uses vflush() for this.	2008-11-18 23:19:43 +00:00
John Baldwin	77ddca67d5	When looking up the vnode for the device to mount the filesystem on, ask NDINIT to return a locked vnode instead of letting it drop the lock and return a referenced vnode and then relock the vnode a few lines down. This matches the behavior of other filesystem mount routines.	2008-11-18 23:18:37 +00:00
John Baldwin	1ea456e7a6	Remove copy/paste code from UFS to handle sparse blocks. While Rock Ridge does support sparse files, the cd9660 code does not currently support them.	2008-11-18 23:15:17 +00:00
John Baldwin	05b1d36516	Remove unused i_flags field and IN_ACCESS flag from cd9660 in-memory i-nodes. cd9660 doesn't support access times.	2008-11-18 23:13:40 +00:00
John Baldwin	2ff47c5f18	Remove unnecessary locking around vn_fullpath(). The vnode lock for the vnode in question does not need to be held. All the data structures used during the name lookup are protected by the global name cache lock. Instead, the caller merely needs to ensure a reference is held on the vnode (such as vhold()) to keep it from being freed. In the case of procfs' <pid>/file entry, grab the process lock while we gain a new reference (via vhold()) on p_textvp to fully close races with execve(2). For the kern.proc.vmmap sysctl handler, use a shared vnode lock around the call to VOP_GETATTR() rather than an exclusive lock. MFC after: 1 month	2008-11-04 19:04:01 +00:00
John Baldwin	7265164f53	Don't pass WANTPARENT to the pathname lookup of the mount point for a unionfs mount just so we can immediately drop the reference on the parent directory vnode without using it.	2008-11-04 18:54:44 +00:00
Edward Tomasz Napierala	ea49f15447	Fix few missed accmode changes in coda. Approved by: rwatson (mentor)	2008-11-03 16:36:23 +00:00
Doug Rabson	a9148abd9d	Implement support for RPCSEC_GSS authentication to both the NFS client and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. 
The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month	2008-11-03 10:38:00 +00:00
Robert Watson	2b7da2dbf1	Catch up with netsmb locking: explicit thread arguments no longer required.	2008-11-02 23:20:27 +00:00
Edward Tomasz Napierala	2a9e5e2e7c	Remove the call to getinoquota() from ntfs_access. How did it get there?! Approved by: rwatson (mentor)	2008-11-02 11:49:19 +00:00
Edward Tomasz Napierala	15bc6b2bd8	Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which are 16 bits wide. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
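The motivation for accmode_t in commit 15bc6b2bd8 can be sketched in user space, assuming a 16-bit mode type as the message states. All type and constant names below are hypothetical stand-ins, not the kernel's definitions; the point is only that a flag above bit 15 is silently lost when an access mask passes through a 16-bit variable.

```c
#include <stdint.h>

/* Hypothetical stand-ins for a 16-bit mode type and a wider access type. */
typedef uint16_t demo_mode16_t;
typedef int demo_accmode_t;

/* Hypothetical access flags: one fits in 16 bits, one does not. */
#define DEMO_VREAD  0x00000100
#define DEMO_VEXTRA 0x00010000

/* Returns 1 if 'flag' is still set after a round trip through 16 bits. */
static int survives_in_mode16(demo_accmode_t requested, demo_accmode_t flag)
{
    demo_mode16_t narrow = (demo_mode16_t)requested; /* truncates to 16 bits */

    return (narrow & flag) != 0;
}

/* Returns 1 if 'flag' is set when the wider type is kept throughout. */
static int survives_in_accmode(demo_accmode_t requested, demo_accmode_t flag)
{
    return (requested & flag) != 0;
}
```

With a request of `DEMO_VREAD | DEMO_VEXTRA`, the low flag survives both paths, but `DEMO_VEXTRA` survives only when the wider type is used end to end.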
Dag-Erling Smørgrav	e11e3f187d	Fix a number of style issues in the MALLOC / FREE commit. I've tried to be careful not to fix anything that was already broken; the NFSv4 code is particularly bad in this respect.	2008-10-23 20:26:15 +00:00
Dag-Erling Smørgrav	1ede983cc9	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
Robert Watson	f17c6f031e	The locking in portalfs's socket connect code is no less correct than identical code in connect(2), so remove XXX that it might be incorrect. MFC after: 3 days	2008-10-12 19:23:02 +00:00
Attilio Rao	0d7935fd01	Remove the useless struct thread argument from the bufobj interface. In particular, the KPI of the following functions changes: - bufobj_invalbuf() - bufsync() and the BO_SYNC() "virtual method" of the buffer objects set. The main consumers of the bufobj functions are affected by this change too; in particular, the functions whose KPI changed are: - vinvalbuf() - g_vfs_close() Due to the KPI breakage, __FreeBSD_version will be bumped in a later commit. As a side note, please consider the 'curthread' argument passed to VOP_SYNC() (in bufsync()) as temporary, since it will be axed out ASAP. Reviewed by: kib Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-10-10 21:23:50 +00:00
Robert Watson	4759ebf015	Use soconnect2() rather than directly invoking uipc_connect2() to interconnect two UNIX domain sockets. MFC after: 3 days	2008-10-06 18:38:50 +00:00
Konstantin Belousov	9a1e630dfd	Change the linprocfs <pid>/maps and procfs <pid>/map handlers to use sbuf instead of doing uiomove. This allows for reads from non-zero offsets to work. The patch is a forward-port of des@'s, adapted to the current code by dchagin@ and me. Reviewed by: des (linprocfs part) PR: kern/101453 MFC after: 1 week	2008-10-04 14:08:16 +00:00
Edward Tomasz Napierala	a37d6ec935	Fix Vflags abuse in fdescfs. There should be no functional changes. Approved by: rwatson (mentor)	2008-10-03 23:21:14 +00:00
Edward Tomasz Napierala	464119c422	Fix Vflags abuse in cd9660. There should be no functional changes. Approved by: rwatson (mentor)	2008-10-03 23:17:22 +00:00
Marko Zec	8b615593fc	Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparison between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-10-02 15:37:58 +00:00
Konstantin Belousov	7818e0a545	Save previous content of the td_fpop before storing the current filedescriptor into it. Make sure that td_fpop is NULL when calling d_mmap from dev_pager_getpages(). Change guards against td_fpop field being non-NULL with private state for another device, and against sudden clearing the td_fpop. This could occur when either a driver method calls another driver through the filedescriptor operation, or a page fault happen while driver is writing to a memory backed by another driver. Noted by: rwatson Tested by: rnoland MFC after: 3 days	2008-09-26 14:50:49 +00:00
Ed Schouten	d3ce832719	Remove unit2minor() use from kernel code. When I changed kern_conf.c three months ago I made device unit numbers equal to (unneeded) device minor numbers. We used to require bitshifting, because there were eight bits in the middle that were reserved for a device major number. Not very long after I turned dev2unit(), minor(), unit2minor() and minor2unit() into macro's. The unit2minor() and minor2unit() macro's were no-ops. We'd better not remove these four macro's from the kernel, because there is a lot of (external) code that may still depend on them. For now it's harmless to remove all invocations of unit2minor() and minor2unit(). Reviewed by: kib	2008-09-26 14:19:52 +00:00
David E. O'Brien	ae72afe0f2	The kernel implemented 'memcmp' is an alias for 'bcmp'. However, memcmp and bcmp are not the same thing. 'man bcmp' states that the return is "non-zero" if the two byte strings are not identical, whereas 'man memcmp' states that the return is the "difference between the first two differing bytes (treated as unsigned char values)" if the two byte strings are not identical. So provide a proper memcmp(9), but it is a C implementation, not a tuned assembly implementation. Therefore bcmp(9) should be preferred over memcmp(9).	2008-09-23 14:45:10 +00:00
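The difference between the two contracts quoted in commit ae72afe0f2 can be sketched as follows. These are illustrative user-space functions with hypothetical names, not the kernel's bcmp(9) or memcmp(9); they only show that an equality-style result cannot stand in for an ordering-style result.

```c
#include <stddef.h>

/*
 * bcmp-style contract: 0 if the byte strings are identical,
 * some non-zero value otherwise. No ordering information.
 */
static int sketch_bcmp(const void *a, const void *b, size_t len)
{
    const unsigned char *p = a, *q = b;

    for (size_t i = 0; i < len; i++)
        if (p[i] != q[i])
            return 1;
    return 0;
}

/*
 * memcmp-style contract: the difference between the first two
 * differing bytes, both treated as unsigned char values.
 */
static int sketch_memcmp(const void *a, const void *b, size_t len)
{
    const unsigned char *p = a, *q = b;

    for (size_t i = 0; i < len; i++)
        if (p[i] != q[i])
            return (int)p[i] - (int)q[i];
    return 0;
}
```

A caller that sorts with the result needs the memcmp-style sign: comparing "abc" with "abd" gives a negative value from `sketch_memcmp` but only "non-zero" from `sketch_bcmp`, so aliasing one to the other breaks ordered comparisons.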
Ed Schouten	219cc94999	Already initialize the vfs timestamps inside the cdev upon allocation. In the MPSAFE TTY branch I noticed the vfs timestamps inside devfs were allocated with 0, where the getattr() routine bumps the timestamps to boottime if the value is below 3600. The reason why it has been designed like this, is because timestamps during boot are likely to be invalid. This means that device nodes that are created on demand (posix_openpt()) have timestamps with a value of boottime, which is not what we want. Solve this by calling vfs_timestamp() inside devfs_alloc(). Discussed with: kib	2008-09-21 14:02:43 +00:00
Konstantin Belousov	caf8aec886	fdescfs, devfs, mqueuefs, nfs, portalfs, pseudofs, tmpfs and xfs initialize the vattr structure in VOP_GETATTR() with VATTR_NULL(), vattr_null() or by zeroing it. Remove these to allow preinitialization of fields work in vn_stat(). This is needed to get birthtime initialized correctly. Submitted by: Jaakko Heinonen <jh saunalahti fi> Discussed on: freebsd-fs MFC after: 1 month	2008-09-20 19:50:52 +00:00
Konstantin Belousov	4c5a20e3da	Initialize va_rdev to NODEV instead of 0 or VNOVAL in VOP_GETATTR(). NODEV is more appropriate when va_rdev doesn't have a meaningful value. Submitted by: Jaakko Heinonen <jh saunalahti fi> Suggested by: bde Discussed on: freebsd-fs MFC after: 1 month	2008-09-20 19:49:15 +00:00
Konstantin Belousov	86dacdfe2b	Initialize va_flags and va_filerev properly in VOP_GETATTR(). Don't initialize va_vaflags and va_spare because they are not part of the VOP_GETATTR() API. Also don't initialize birthtime to ctime or zero. Submitted by: Jaakko Heinonen <jh saunalahti fi> Reviewed by: bde Discussed on: freebsd-fs MFC after: 1 month	2008-09-20 19:46:45 +00:00
Ed Schouten	19c5cd6288	Fix two small typo's in comments in the nullfs vnops code. Submitted by: Jille Timmermans <jille quis cx>	2008-09-11 20:15:34 +00:00
Xin LI	e08d55674d	Reflect license change of NetBSD code. Obtained from: NetBSD MFC after: 3 days	2008-09-03 18:53:48 +00:00
Konstantin Belousov	67c7bbf39c	In rev. 1.17 (r33548) of msdosfs_fat.c, relative cluster numbers were replaced by file relative sector numbers as the buffer block number when zero-padding a file during extension. Revert the change, it causes wrong blocks filled with zeroes on seeking beyond end of file. PR: kern/47628 Submitted by: tegge MFC after: 3 days	2008-09-01 13:18:16 +00:00
Attilio Rao	0359a12ead	Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and therefore useless. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-08-28 15:23:18 +00:00
Ed Schouten	bc093719ca	Integrate the new MPSAFE TTY layer to the FreeBSD operating system. The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following: - Improved driver model: The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers. If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver. - Improved hotplugging: With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc). The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly. - Improved performance: One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters. Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING. Obtained from: //depot/projects/mpsafetty/... Approved by: philip (ex-mentor) Discussed: on the lists, at BSDCan, at the DevSummit Sponsored by: Snow B.V., the Netherlands dcons(4) fixed by: kan	2008-08-20 08:31:58 +00:00
Bjoern A. Zeeb	603724d3ab	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
Konstantin Belousov	f35db5f7ca	Remove unnecessary locking around pointer fetch. Requested by: jhb	2008-08-12 19:34:45 +00:00
Robert Watson	4f7d1876d5	Introduce a new lock, hostname_mtx, and use it to synchronize access to global hostname and domainname variables. Where necessary, copy to or from a stack-local buffer before performing copyin() or copyout(). A few uses, such as in cd9660 and daemon_saver, remain under-synchronized and will require further updates. Correct a bug in which a failed copyin() of domainname would leave domainname potentially corrupted. MFC after: 3 weeks	2008-07-05 13:10:10 +00:00
Konstantin Belousov	813d71de08	The uniqdosname() function takes char[12] as its third argument. Found by: -fstack-protector Reported by: dougb Tested by: dougb, Rainer Hurling <rhurlin gwdg de> MFC after: 3 days	2008-07-04 09:40:52 +00:00
Robert Watson	e54fdca237	Remove unused 'td' arguments from smbfs_hash_lock() and smbfs_hash_unlock(). MFC after: 3 days	2008-07-01 07:51:16 +00:00
Oleksandr Tymoshenko	2da528a74f	Get pointer to devfs_ruleset struct after garbage collection has been performed. Otherwise if ruleset is used by given mountpoint and is empty it's freed by devfs_ruleset_reap and pointer becomes bogus. Submitted by: Mateusz Guzik <mjguzik@gmail.com> PR: kern/124853	2008-06-22 14:34:38 +00:00
Konstantin Belousov	05427aafc6	Struct cdev is always the member of the struct cdev_priv. When devfs needed to promote cdev to cdev_priv, the si_priv pointer was followed. Use member2struct() to calculate address of the wrapping cdev_priv. Rename si_priv to __si_reserved. Tested by: pho Reviewed by: ed MFC after: 2 weeks	2008-06-16 17:34:59 +00:00
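The technique named in commit 05427aafc6, recovering the enclosing structure from a pointer to one of its members, can be sketched with offsetof. The macro and struct names below are hypothetical illustrations of the pattern and are not copied from the kernel headers.

```c
#include <stddef.h>

/*
 * Hypothetical container-of style macro: given a pointer x to member m
 * of struct s, compute the address of the enclosing struct s.
 */
#define DEMO_MEMBER2STRUCT(s, m, x) \
    ((struct s *)(void *)((char *)(x) - offsetof(struct s, m)))

/* Hypothetical public part, always embedded in the private wrapper. */
struct demo_cdev {
    int unit;
};

/* Hypothetical private wrapper that embeds the public part. */
struct demo_cdev_priv {
    int flags;
    struct demo_cdev cdp_c;
};

/* Promote a pointer to the embedded member to its wrapper. */
static struct demo_cdev_priv *demo_cdev2priv(struct demo_cdev *c)
{
    return DEMO_MEMBER2STRUCT(demo_cdev_priv, cdp_c, c);
}
```

Because the wrapper address is computed from the member's offset, no back pointer needs to be stored in the embedded structure, which is consistent with the commit retiring si_priv. The sketch is only valid when the member pointer really points into a wrapper, as the commit message asserts for cdev.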
Konstantin Belousov	a0b454dc4b	Do not redo the vnode tear-down work already done by insmntque() when vnode cannot be put on the vnode list for mount. Reported and tested by: marck Guilty party: me MFC after: 3 days	2008-06-15 18:40:58 +00:00
Ed Schouten	29d4cb241b	Don't enforce unique device minor number policy anymore. Except for the case where we use the cloner library (clone_create() and friends), there is no reason to enforce a unique device minor number policy. There are various drivers in the source tree that allocate unr pools and such to provide minor numbers, without using them themselves. Because we still need to support unique device minor numbers for the cloner library, introduce a new flag called D_NEEDMINOR. All cdevsw's that are used in combination with the cloner library should be marked with this flag to make the cloning work. This means drivers can now freely use si_drv0 to store their own flags and state, making it effectively the same as si_drv1 and si_drv2. We still keep the minor() and dev2unit() routines around to make drivers happy. The NTFS code also used the minor number in its hash table. We should not do this anymore. If the si_drv0 field would be changed, it would no longer end up in the same list. Approved by: philip (mentor)	2008-06-11 18:55:19 +00:00
Konstantin Belousov	ac8b6edd89	In cd9660_readdir vop, always initialize the idp->uio_off member. The while loop that is assumed to initialize the uio_off later may not be entered at all, causing an uninitialized value to be returned in uio->uio_offset. PR: 122925 Submitted by: Jaakko Heinonen <jh saunalahti fi> MFC after: 1 week	2008-06-11 12:46:09 +00:00
Konstantin Belousov	9e40a5f827	When devfs_allocv() committed to create new vnode, since de_vnode is NULL, the dm_lock is held while the newly allocated vnode is locked. Since no other threads may try to lock the new vnode yet, the LOR there cannot result in the deadlock. Shut down the witness warning to note this fact. Tested by: pho Prodded by: attilio	2008-06-05 09:15:47 +00:00
Ed Schouten	16151645c2	Revert the changes I made to devfs_setattr() in r179457. As discussed with Robert Watson and John Baldwin, it would be better if PTY's are created with proper permissions, turning grantpt() into a no-op. Bypassing security frameworks like MAC by passing NOCRED to VOP_SETATTR() will only make things more complex. Approved by: philip (mentor)	2008-06-01 14:02:46 +00:00
Ed Schouten	34d1dcf0cc	Merge back devfs changes from the mpsafetty branch. In the mpsafetty branch, PTY's are allocated through the posix_openpt() system call. The controller side of a PTY now uses its own file descriptor type (just like sockets, vnodes, pipes, etc). To remain compatible with existing FreeBSD and Linux C libraries, we can still create PTY's by opening /dev/ptmx or /dev/ptyXX. These nodes implement d_fdopen(). Devfs has been slightly changed here, to allow finit() to be called from d_fdopen(). The routine grantpt() has also been moved into the kernel. This routine is a little odd, because it needs to bypass standard UNIX permissions. It needs to change the owner/group/mode of the slave device node, which may often not be possible. The old implementation solved this by spawning a setuid utility. When VOP_SETATTR() is called with NOCRED, devfs_setattr() dereferences ap->a_cred, causing a kernel panic. Change the de_{uid,gid,mode} code to allow changes when a->a_cred is set to NOCRED. Approved by: philip (mentor)	2008-05-31 14:06:37 +00:00
Ulf Lilleengen	60af8a6a7a	- Add locking to all filesystem operations in fdescfs and flag it as MPSAFE. - Use proper synchronization primitives to protect the internal fdesc node cache used in fdescfs. - Properly initialize and uninitialize hash. - Remove unused functions. Since fdescfs might recurse on itself, adding proper locking to it needed some tricky workarounds in some parts to make it work. For instance, a descriptor in fdescfs could refer to an open descriptor to itself, thus forcing the thread to recurse on vnode locks. Because of this, other race conditions also had to be fixed. Tested by: pho Reviewed by: kib (mentor) Approved by: kib (mentor)	2008-05-24 14:51:30 +00:00
Konstantin Belousov	772e245341	When vget() fails (because the vnode has been reclaimed), there is no sense to loop trying to vget() the vnode again. PR: 122977 Submitted by: Arthur Hartwig <arthur.hartwig nokia com> Tested by: pho Reviewed by: jhb MFC after: 1 week	2008-05-23 16:36:39 +00:00
Konstantin Belousov	82f4d64035	Implement the per-open file data for the cdev. The patch does not change the cdevsw KBI. Management of the data is provided by the functions int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr); int devfs_get_cdevpriv(void **datap); void devfs_clear_cdevpriv(void); All of the functions are supposed to be called from the cdevsw method contexts. - devfs_set_cdevpriv assigns the priv as private data for the file descriptor which is used to initiate currently performed driver operation. dtr is the function that will be called when either the last reference to the file goes away, the device is destroyed or devfs_clear_cdevpriv is called. - devfs_get_cdevpriv is the obvious accessor. - devfs_clear_cdevpriv allows to clear the private data for the still open file. Implementation keeps the driver-supplied pointers in the struct cdev_privdata, that is referenced both from the struct file and struct cdev, and cannot outlive any of the referee. Man pages will be provided after the KPI stabilizes. Reviewed by: jhb Useful suggestions from: jeff, antoine Debugging help and tested by: pho MFC after: 1 month	2008-05-21 09:31:44 +00:00
Markus Brueffer	9c2bf69d32	Fix and speedup timestamp calculations which is roughly based on the patch in the mentioned PR: - bounds check time->month as it is used as an array index - fix usage of time->month as array index (month is 1-12) - fix calculation based on time->day (day is 1-31) - fix the speedup code as it doesn't calculate correct timestamps before the year 2000 and reduce the number of calculation in the year-by-year code - speedup month calculations by replacing the array content with cumulative values - add microseconds calculation - fix an endian problem PR: kern/97786 Submitted by: Andriy Gapon <avg@topspin.kiev.ua> Reviewed by: scottl (earlier version) Approved by: emax (mentor) MFC after: 1 week	2008-05-16 22:31:17 +00:00
Attilio Rao	58c5a5eb70	lockinit() can't accept LK_EXCLUSIVE as an initialization flag, so just drop it. Reported by: Josh Carroll <josh dot carroll at gmail dot com> Submitted by: jhb	2008-05-15 21:39:25 +00:00
John Baldwin	06d0d0e274	Don't explicitly drop Giant around d_open/d_fdopen/d_close for MPSAFE drivers. Since devfs is already marked MPSAFE it shouldn't be held anyway. MFC after: 2 weeks Discussed with: phk	2008-05-07 19:03:57 +00:00
Daichi GOTO	3af387c9d2	- change function name from _vdir to _vnode because VSOCK has been added as cache target. Now they process not only VDIR but also VSOCK. - fixed panic issue caused by cache incorrect free process by "umount -f" Submitted by: Masanori OZAWA <ozawa@ongs.co.jp> MFC after: 1 week	2008-05-07 05:32:55 +00:00
Daichi GOTO	fe5f08cda3	o Fixed multi thread access issue reported by Alexander V. Chernikov (admin@su29.net) fixed: kern/109950 PR: kern/109950 Submitted by: Alexander V. Chernikov (admin@su29.net) Reviewed by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week	2008-04-25 11:37:20 +00:00
Daichi GOTO	938161d61a	o Improved unix socket connection issue fixed: kern/118346 PR: kern/118346 Submitted by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week	2008-04-25 09:53:52 +00:00
Daichi GOTO	5307411cbe	o Fixed rename panic issue Submitted by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week	2008-04-25 09:44:47 +00:00
Daichi GOTO	a9b794ff5e	o Fixed inaccessible issue especially including devfs on unionfs case. fixed also: kern/117829 PR: kern/117829 Submitted by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week	2008-04-25 09:38:48 +00:00
Daichi GOTO	a68ae31c71	o Added system hang-up process when VOP_READDIR of unionfs_nodeget() returns not end of the file status on debug mode (DIAGNOSTIC defined) kernel. Submitted by: Masanori OZAWA (ozawa@ongs.co.jp) MFC after: 1 week	2008-04-25 07:58:19 +00:00
Konstantin Belousov	eab626f110	Move the head of byte-level advisory lock list from the filesystem-specific vnode data to the struct vnode. Provide the default implementation for the vop_advlock and vop_advlockasync. Purge the locks on the vnode reclaim by using the lf_purgelocks(). The default implementation is augmented for the nfs and smbfs. In the nfs_advlock, push the Giant inside the nfs_dolock. Before the change, the vop_advlock and vop_advlockasync have taken the unlocked vnode and dereferenced the fs-private inode data, racing with the vnode reclamation due to forced unmount. Now, the vop_getattr under the shared vnode lock is used to obtain the inode size, and later, in the lf_advlockasync, after locking the vnode interlock, the VI_DOOMED flag is checked to prevent an operation on the doomed vnode. The implementation of the lf_purgelocks() is submitted by dfr. Reported by: kris Tested by: kris, pho Discussed with: jeff, dfr MFC after: 2 weeks	2008-04-16 11:33:32 +00:00
Doug Rabson	18121c17f5	When calling lf_advlock to unlock a record, make sure that ap->a_fl->l_type is F_UNLCK otherwise we trigger a LOCKF_DEBUG panic. MFC after: 3 days	2008-04-14 09:22:48 +00:00
Attilio Rao	047dd67e96	Optimize lockmgr in order to get rid of the pool mutex interlock, of the state transitioning flags and of msleep(9) calls. Use, instead, an algorithm very similar to what sx(9) and rwlock(9) already do and direct accesses to the sleepqueue(9) primitive. In order to avoid writer starvation a mechanism very similar to what rwlock(9) uses now is implemented, with the corresponding per-thread shared lockmgrs counter. This patch also adds 2 new functions to lockmgr KPI: lockmgr_rw() and lockmgr_args_rw(). These two are like the 2 "normal" versions, but they both accept a rwlock as interlock. In order to realize this, the general lockmgr manager function "__lockmgr_args()" has been implemented through the generic lock layer. It supports all the blocking primitives, but currently only these 2 mappers live. The patch drops the support for WITNESS atm, but it will probably be added soon. Also, there is a little race in the draining code which is also present in the current CVS stock implementation: if some sharers, once they wake up, are in the runqueue they can contend the lock with the exclusive drainer. This is hard to fix but the now committed code mitigates this issue a lot better than the (past) CVS version. In addition assertive KA_HELD and KA_UNHELD have been made mute assertions because they are dangerous and they will no longer be supported soon. In order to avoid namespace pollution, stack.h is split into two parts: one which includes only the "struct stack" definition (_stack.h) and one defining the KPI. In this way, newly added _lockmgr.h can just include _stack.h. Kernel ABI results heavily changed by this commit (the now committed version of "struct lock" is a lot smaller than the previous one) and KPI results broken by lockmgr_rw() / lockmgr_args_rw() introduction, so manpages and __FreeBSD_version will be updated accordingly. Tested by: kris, pho, jeff, danger Reviewed by: jeff Sponsored by: Google, Summer of Code program 2007	2008-04-06 20:08:51 +00:00
Konstantin Belousov	8eb6b6ecb6	The temporary workaround for the call to the vget() without lock type in the fdesc_allocvp(). The caller of the fdesc_allocvp() expects that the returned vnode is not reclaimed. Do lock the vnode exclusive and drop the lock after. Reported by: pho Reviewed by: jeff	2008-04-04 09:37:57 +00:00
Konstantin Belousov	57b4252e45	Add the support for the AT_FDCWD and fd-relative name lookups to the namei(9). Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho	2008-03-31 12:01:21 +00:00
Jeff Roberson	4c65d593e2	- Simplify null_hashget() and null_hashins() by using vref() rather than a complex series of steps involving vget() without a lock type to emulate the same thing.	2008-03-29 23:24:54 +00:00
Doug Rabson	dfdcada31e	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
Jeff Roberson	698b1a6643	- Complete part of the unfinished bufobj work by consistently using BO_LOCK/UNLOCK/MTX when manipulating the bufobj. - Create a new lock in the bufobj to lock bufobj fields independently. This leaves the vnode interlock as an 'identity' lock while the bufobj is an io lock. The bufobj lock is ordered before the vnode interlock and also before the mnt ilock. - Exploit this new lock order to simplify softdep_check_suspend(). - A few sync related functions are marked with a new XXX to note that we may not properly interlock against a non-zero bv_cnt when attempting to sync all vnodes on a mountlist. I do not believe this race is important. If I'm wrong this will make these locations easier to find. Reviewed by: kib (earlier diff) Tested by: kris, pho (earlier diff)	2008-03-22 09:15:16 +00:00
Konstantin Belousov	91a35e7870	Do not dereference cdev->si_cdevsw, use the dev_refthread() to properly obtain the reference. In particular, this fixes the panic reported in the PR. Remove the comments stating that this needs to be done. PR: kern/119422 MFC after: 1 week	2008-03-20 16:08:42 +00:00
Jeff Roberson	6617724c5f	Remove kernel support for M:N threading. While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.	2008-03-12 10:12:01 +00:00
Robert Watson	970a2d8770	Replace lockmgr lock protecting nwfs vnode hash table with an sx lock. MFC after: 1 month	2008-03-02 19:02:30 +00:00
Robert Watson	7947229ff6	Replace lockmgr lock protecting smbfs node hash table with sx lock. MFC after: 1 month	2008-03-02 18:56:13 +00:00
Attilio Rao	7fbfba7bf8	- Handle buffer lock waiters count directly in the buffer cache instead than rely on the lockmgr support [1]: * bump the waiters only if the interlock is held * let brelvp() return the waiters count * rely on brelvp() instead than BUF_LOCKWAITERS() in order to check for the waiters number - Remove a namespace pollution introduced recently with lockmgr.h including lock.h by including lock.h directly in the consumers and making it mandatory for using lockmgr. - Modify flags accepted by lockinit(): * introduce LK_NOPROFILE which disables lock profiling for the specified lockmgr * introduce LK_QUIET which disables ktr tracing for the specified lockmgr [2] * disallow LK_SLEEPFAIL and LK_NOWAIT to be passed there so that it can only be used on a per-instance basis - Remove BUF_LOCKWAITERS() and lockwaiters() as they are no longer used This patch breaks KPI so __FreeBSD_version will be bumped and manpages updated by further commits. Additively, 'struct buf' changes results in a disturbed ABI also. [2] Really, currently there is no ktr tracing in the lockmgr, but it will be added soon. [1] Submitted by: kib Tested by: pho, Andrea Barberio <insomniac at slackware dot it>	2008-03-01 19:47:50 +00:00
Konstantin Belousov	e6591b84ff	Rename fdescfs vnode from "fdesc" to "fdescfs" to avoid name collision of the vnode lock with the fdesc_mtx mutex. Having different kinds of locks with the same name confuses witness.	2008-02-26 10:10:55 +00:00
Robert Watson	18ff731caa	Add "Make MPSAFE" to the Coda todo list. MFC after: 3 days	2008-02-26 09:27:47 +00:00
Attilio Rao	81c794f998	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-25 18:45:57 +00:00
Attilio Rao	628f51d275	Introduce some functions in the vnode locks namespace and in the ffs namespace in order to handle lockmgr fields in a controlled way instead than spreading all around bogus stubs: - VN_LOCK_AREC() allows lock recursion for a specified vnode - VN_LOCK_ASHARE() allows lock sharing for a specified vnode In FFS land: - BUF_AREC() allows lock recursion for a specified buffer lock - BUF_NOREC() disallows recursion for a specified buffer lock Side note: union_subr.c::unionfs_node_update() is the only other function directly handling lockmgr fields. As this is not simple to fix, it has been left behind as "sole" exception.	2008-02-24 16:38:58 +00:00
Marcel Moolenaar	043ec583dc	Don't check the bpbSecPerTrack and bpbHeads fields of the BPB. They are typically 0 on new ia64 systems. Since we don't use either field, there's no harm in not checking.	2008-02-21 03:19:46 +00:00
Robert Watson	fa8003c6b9	Remove custom queue macros in Coda, replacing them with queue(9) tailq macros. The only semantic change was the need to add a vc_opened field to struct vcomm since we can no longer use the request queue returning to an uninitialized state to hold whether or not the device is open. MFC after: 1 month	2008-02-17 14:33:28 +00:00
Robert Watson	b15ce9be2e	Remove namecache performance-tuning todo for Coda: we now use the FreeBSD name cache. MFC after: 1 month	2008-02-17 12:40:27 +00:00
Robert Watson	a8c34e8ee0	The possibly interruptible msleep in coda_call() means well, but is fundamentally fairly confused about how signals work and when it is appropriate for upcalls to be interrupted. In particular, we should be exempting certain upcalls from interruption, we should not always eventually time out sleeping on a upcall, and we should not be interrupting the sleep for certain signals that we currently are (including SIGINFO). This code needs to be reworked in the style of NFS interruptible mounts. MFC after: 1 month	2008-02-15 13:31:35 +00:00
Robert Watson	c30ddc8d99	Spell replys as replies. MFC after: 1 month	2008-02-15 12:11:45 +00:00
Robert Watson	93b510870f	Reorder and clean up make_coda_node(), annotate weaknesses in the implementation. MFC after: 1 month	2008-02-15 11:58:11 +00:00
Robert Watson	c0964f549b	Remove debugging code under OLD_DIAGNOSTIC; this is all >10 years old and hasn't been used in that time. MFC after: 1 month	2008-02-14 00:55:03 +00:00
Robert Watson	57a77b811f	In Coda, flush the attribute cache for a cnode when its fid is changed, as its synthesized inode number may have changed and we want stat(2) to pick up the new inode number. MFC after: 1 month	2008-02-14 00:30:06 +00:00
Robert Watson	89d1d7886a	Update cache flushing behavior in light of recent namecache and access cache improvements: - Flush just access control state on CODA_PURGEUSER, not the full namecache for /coda. - When replacing a fid on a cnode as a result of, e.g., reintegration after offline operation, we no longer need to purge the namecache entries associated with its vnode. MFC after: 1 month	2008-02-13 19:50:17 +00:00
Robert Watson	38ab9a906a	Implement a rudimentary access cache for the Coda kernel module, modeled on the access cache found in NFS, smbfs, and the Linux coda module. This is a positive access cache of a single entry per file, tracking recently granted rights, but unlike NFS and smbfs, supporting explicit invalidation by the distributed file system. For each cnode, maintain a C_ACCCACHE flag indicating the validity of the cache, and a cached uid and mode tracking recently granted positive access control decisions. Prefer the cache to venus_access() in VOP_ACCESS() if it is valid, and when we must fall back to venus_access(), update the cache. Allow Venus to clear the access cache, either the whole cache on CODA_FLUSH, or just entries for a specific uid on CODA_PURGEUSER. Unlike the Coda module on Linux, we don't flush all entries on a user purge using a generation number, we instead walk present cnodes and clear only entries for the specific user, meaning it is somewhat more expensive but won't hit all users. Since the Coda module is aggressive about not keeping around unopened cnodes, the utility of the cache is somewhat limited for files, but works well for directories. We should make Coda less aggressive about GCing cnodes in VOP_INACTIVE() in order to improve the effectiveness of in-kernel caching of attributes and access rights. MFC after: 1 month	2008-02-13 15:45:12 +00:00
Robert Watson	d25a3c4c44	Remove now-unused Coda namecache. MFC after: 1 month	2008-02-13 13:26:01 +00:00
Robert Watson	44abffb44b	Rather than having the Coda module use its own namecache, use the global VFS namecache, as is done by the Coda module on Linux. Unlike the Coda namecache, the global VFS namecache isn't tagged by credential, so use more conservative flushing behavior (for now) when CODA_PURGEUSER is issued by Venus. This improves overall integration with the FreeBSD VFS, including allowing __getcwd() to work better, procfs/procstat monitoring, and so on. This improves shell behavior in many cases, and improves ".." handling. It may lead to some slowdown until we've implemented a specific access cache, which should net improve performance, but in the mean time, lookup access control now always goes to Venus, whereas previously it didn't. MFC after: 1 month	2008-02-13 13:06:22 +00:00
Attilio Rao	d1215e10d2	Fix a lock leak in the ntfs locking scheme: When ntfs_ntput() reaches 0 in the refcount the inode lockmgr is not released and directly destroyed. Fix this by unlocking the lockmgr() even in the case of zero-refcount. Reported by: dougb, yar, Scot Hetzel <swhetzel at gmail dot com> Submitted by: yar	2008-02-13 13:02:12 +00:00
Robert Watson	4f52b754df	Clean up coda_pathconf() slightly while debugging a problem there. MFC after: 1 month	2008-02-11 00:01:45 +00:00
Robert Watson	21bb029533	Since we're now actively maintaining the Coda module in the FreeBSD source tree, restyle everything but coda.h (which is more explicitly shared across systems) into a closer approximation to style(9). Remove a few more unused function prototypes. Add or clarify some comments. MFC after: 1 month	2008-02-10 11:18:12 +00:00
Robert Watson	d57786ec68	Various further non-functional cleanups to coda: - Rename print_vattr to coda_print_vattr and make static, rename print_cred to coda_print_cred. - Remove unused coda_vop_nop. - Add XXX comment because coda_readdir forwards to the cache vnode's readdir rather than venus_readdir, and annotate venus_readdir as unused. - Rename vc_nb_* to vc_*. - Use d_open_t, d_close_t, d_read_t, d_write_t, d_ioctl_t and d_poll_t for prototyping vc_* as that is the intent, don't use our own definitions. - Rename coda_nb_statfs to coda_statfs, rename NB_SFS_SIZ to CODA_SFS_SIZ. - Replace one more OBE reference to NetBSD with a reference to FreeBSD. - Tidy up a little vertical whitespace here and there. - Annotate coda_nc_zapvnode as unused. - Remove unused vcodattach. - Annotate VM_INTR as unused. - Annotate that coda_fhtovp is unused and doesn't match the FreeBSD prototype, so isn't hooked up to vfs_fhtovp. If we want NFS export of Coda to work someday, this needs to be fixed. - Remove unused getNewVnode. - Remove unused coda_vget, coda_init, coda_quotactl prototypes. MFC after: 1 month	2008-02-09 12:49:18 +00:00
Robert Watson	fc9d8f0057	No reason not to maintain stats on statfs in Coda, as it's done for other VFS operations, so uncomment the existing statistics gathering. MFC after: 1 month	2008-02-09 11:40:49 +00:00
Robert Watson	8571e9a189	Remove unused devtomp(), which exploited UFS-specific knowledge to find the mountpoint for a specific device. This was implemented incorrectly, a bad idea in a fundamental sense, and also never used, so presumably a long-idle debugging function. MFC after: 1 month	2008-02-09 11:12:18 +00:00
Robert Watson	82e4904ffb	Since Coda is effectively a stacked file system, use VOP_EOPNOTSUPP for vop_bmap; delete the existing stub that returned either EINVAL or EOPNOTSUPP, and had unreachable calls to VOP_BMAP on the cache vnode. MFC after: 1 month	2008-02-09 09:33:19 +00:00
Robert Watson	37245e3742	Lock cache vnode when VOP_FSYNC() is called on a Coda vnode. MFC after: 1 month	2008-02-09 00:12:22 +00:00
Robert Watson	6dc70a9dec	Make all calls to vn_lock() in Coda, including recently added ones, use LK_RETRY, since failure is undesirable (and not handled). MFC after: 1 month Pointed out by: kib	2008-02-09 00:03:22 +00:00
Robert Watson	7a246a6314	The Coda module was originally ported to NetBSD from Mach by rvb, and then later to FreeBSD. Update various NetBSD-related comments: in some cases delete them because they don't apply, in others update to say FreeBSD as they still apply but in FreeBSD (and might for that matter no longer apply on NetBSD), and flag one case where I'm not sure whether it applies. MFC after: 1 month	2008-02-08 23:15:36 +00:00
Robert Watson	efeac2fb25	Before invoking vnode operations on cache vnodes, acquire the vnode locks of those vnodes. Probably, Coda should do the same lock sharing/ pass-through that is done for nullfs, but in the mean time this ensures that locks are adequately held to prevent corruption of data structures in the cache file system. Assuming most operations came from the top layer of Coda and weren't performed directly on the cache vnodes, in practice this corruption was relatively unlikely as the Coda vnode locks were ensuring exclusive access for most consumers. This causes WITNESS to squeal like a pig immediately when Coda is used, rather than waiting until file close; I noticed these problems because of the lack of said squealing. MFC after: 1 month	2008-02-08 23:01:40 +00:00
Robert Watson	99a2317ed3	Remove undefined coda excluded by #if 1 #else, which previously protected vget() calls using inode numbers to query the root of /coda, which is not needed since we now cache the root vnode with the mountpoint. MFC after: 1 month	2008-02-08 22:37:15 +00:00
Attilio Rao	2433c4883e	Convert all explicit instances of VOP_ISLOCKED(arg, NULL) into VOP_ISLOCKED(arg, curthread). Now, VOP_ISLOCKED() and lockstatus() should only acquire curthread as argument; this will lead in axing the additional argument from both functions, making the code cleaner. Reviewed by: jeff, kib	2008-02-08 21:45:47 +00:00
Robert Watson	c55376e791	Remove Giant acquisition around soreceive() and sosend() in fifofs. The bug that caused us to reintroduce it is believed to be fixed, and Kris says he no longer sees problems with fifofs in highly parallel builds. If this works out, we'll MFC it for 7.1. MFC after: 3 months Pointed out by: kris	2008-01-26 12:34:23 +00:00
Attilio Rao	0e9eb108f0	Cleanup lockmgr interface and exported KPI: - Remove the "thread" argument from the lockmgr() function as it is always curthread now - Axe lockcount() function as it is no longer used - Axe LOCKMGR_ASSERT() as it is bogus really and not currently used. Hopefully this will soon be replaced by something suitable for it. - Remove the prototype for dumplockinfo() as the function is no longer present Additionally: - Introduce a KASSERT() in lockstatus() in order to let it accept only curthread or NULL as they should only be passed - Do a little bit of style(9) cleanup on lockmgr.h KPI results heavily broken by this change, so manpages and FreeBSD_version will be modified accordingly by further commits. Tested by: matteo	2008-01-24 12:34:30 +00:00
Robert Watson	9d3e5c0e2b	Put "coda_rdwr: Internally Opening" printf generated by in-kernel writes to files, such as ktrace output, under CODA_VERBOSE. Otherwise, each such call to VOP_WRITE() results in a kernel printf. MFC after: 3 days Obtained from: NetBSD	2008-01-21 21:39:08 +00:00
Robert Watson	e866951b59	Replace references to VOP_LOCK() w/o LK_RETRY to vn_lock() with LK_RETRY, avoiding extra error handling, or in some cases, missing error handling. MFC after: 3 days Discussed with: kib	2008-01-21 21:19:07 +00:00
Robert Watson	9440b9f7ea	Remove unused oldhash definition from Coda namecache. MFC after: 3 days	2008-01-19 19:21:07 +00:00
Robert Watson	de5910460a	Improve default vnode operation handling for Coda: - Don't specify vnode operations for mknod, lease, and advlock--let them fall through to vop_default. - Implement vop_default with &default_vnodeops, rather than with VOP_PANIC, so that unimplemented vnode operations are handled in more sensible ways than panicking, such as EOPNOTSUPP on ACL queries generated by bsdtar, or mknod. MFC after: 3 days	2008-01-19 17:12:44 +00:00
Robert Watson	aeab4f72a0	Rework coda_statfs(): no longer need to zero the statfs structure or fill out all fields, just fill out the ones the file system knows about. Among other things, this causes the output of "mount" and "df" to make quite a bit more sense as /dev/cfs0 is specified as the mountfrom name. MFC after: 3 days	2008-01-19 16:39:14 +00:00
Robert Watson	82bf4517ef	Zero mi_rootvp and coda_ctlvp immediately after calling vrele() on the vnodes during coda_unmount() in order to detect errant use of them after the vnode references may no longer be valid. No need to clear the VV_ROOT flag on mi_rootvp (especially after the vnode reference is no longer valid) as this isn't done on other file systems. MFC after: 3 days	2008-01-19 15:40:46 +00:00
Robert Watson	96b1e9b015	Don't acquire an additional vnode reference to a vnode when it is opened and then release it when it is closed: we rely on the caller to keep the vnode around with a valid reference. This avoids vrele() destroying the vnode vop_close() is being called from during a call to vop_close(), and a crash due to lockmgr recursing the vnode lock when a Coda unmount occurs. MFC after: 3 days	2008-01-19 15:39:10 +00:00
Robert Watson	76898521e8	Don't declare functions as extern. Move all extern variable definitions to associated .h files, move some extern variable definitions between include files to place them more appropriately. MFC after: 3 days	2008-01-19 14:32:44 +00:00
Robert Watson	11cc4ab95a	Use VOP_NULL rather than VOP_PANIC for Coda's vop_print routine, so as to avoid panicking in DDB show lockedvnods. MFC after: 3 days	2008-01-19 13:41:56 +00:00
Robert Watson	d883e8e720	Lock the new directory vnode returned by coda_mkdir(), as this is required by FreeBSD's vnode locking protocol. MFC after: 3 days	2008-01-19 13:29:14 +00:00
Robert Watson	6885d70dfe	Borrow the VM object associated with an underlying cache vnode with the Coda vnode derived from it, in the style of nullfs. This allows files in the Coda file system to be memory-mapped, such as with execve(2) or mmap(2). MFC after: 3 days Reported by: Rune <u+openafsdev-sr55 at chalmers dot se>	2008-01-19 13:27:14 +00:00
Konstantin Belousov	61af195933	udf_vget() shall vgone() the vnode when the file_entry cannot be allocated or read from the volume. Otherwise, half-constructed vnode could be found later and cause panic when accessed. PR: 118322 MFC after: 1 week	2008-01-18 12:09:54 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with a 'thread' argument that is always curthread. Remove the unneeded extra argument and explicitly pass curthread to lower layer functions when necessary. This change breaks the KPI, which should affect several ports, so a version bump and manpage update will be committed later. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Attilio Rao	cb05b60a89	vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This modification makes the code cleaner and in particular removes an annoying dependence, helping the next lockmgr() cleanup. The KPI, obviously, changes. Manpage and FreeBSD_version will be updated through further commits. As a side note, it is worth saying that the next commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
Attilio Rao	d7a7e17968	Remove explicit calling of lockmgr() with the NULL argument. Now, the lockmgr() function can only be called passing curthread and the KASSERT() is upgraded accordingly. In order to support on-the-fly owner switching, the new function lockmgr_disown() has been introduced and is used in BUF_KERNPROC(). The KPI therefore changes and the FreeBSD version will be bumped soon. Differently from the previous code, we assume the idle thread cannot try to acquire the lockmgr as it cannot sleep, so lose the relative check[1] in BUF_KERNPROC(). Tested by: kris [1] kib asked for a KASSERT in lockmgr_disown() about this condition, but after thinking about it, as this is a well known general rule, I found it not really necessary.	2008-01-08 23:48:31 +00:00
John Baldwin	314464f422	Lock the vnode interlock while reading v_usecount to update si_usecount in a cdev in devfs_reclaim(). MFC after: 3 days Reviewed by: jeff (a while ago)	2008-01-08 04:45:24 +00:00
John Baldwin	e46502943a	Make ftruncate a 'struct file' operation rather than a vnode operation. This makes it possible to support ftruncate() on non-vnode file types in the future. - 'struct fileops' grows a 'fo_truncate' method to handle an ftruncate() on a given file descriptor. - ftruncate() moves to kern/sys_generic.c and now just fetches a file object and invokes fo_truncate(). - The vnode-specific portions of ftruncate() move to vn_truncate() in vfs_vnops.c which implements fo_truncate() for vnode file types. - Non-vnode file types return EINVAL in their fo_truncate() method. Submitted by: rwatson	2008-01-07 20:05:19 +00:00
Attilio Rao	7a52326a0d	g_vfs_close() wants the sx topology lock held while executing, so just add correct locking to the operation of unmounting. This will prevent debugging kernels from panicking if mounting a non-hpfs partition (I'm not sure if this can be a problem with a successful mounting operation though). MFC: 3 days	2008-01-07 16:51:24 +00:00
Jeff Roberson	397c19d175	Remove explicit locking of struct file. - Introduce a finit() which is used to initialize the fields of struct file in such a way that the ops vector is only valid after the data, type, and flags are valid. - Protect f_flag and f_count with atomic operations. - Remove the global list of all files and associated accounting. - Rewrite the unp garbage collection such that it no longer requires the global list of all files and instead uses a list of all unp sockets. - Mark sockets in the accept queue so we don't incorrectly gc them. Tested by: kris, pho	2007-12-30 01:42:15 +00:00
Attilio Rao	100f241571	Trim out the now unused option LK_EXCLUPGRADE from the lockmgr namespace. This option just adds complexity and the new implementation will no longer support it, so axing it now that it is unused is probably the better idea. The FreeBSD version is bumped in order to reflect the KPI breakage introduced by this patch. In the ports tree, kris found that only old OSKit code uses it, but as it is thought to work only on the 2.x kernel series, version bumping will solve any problem.	2007-12-28 00:38:13 +00:00
Robert Watson	3de213cc00	Add a new 'why' argument to kdb_enter(), and a set of constants to use for that argument. This will allow DDB to detect the broad category of reason why the debugger has been entered, which it can use for the purposes of deciding which DDB script to run. Assign approximate why values to all current consumers of the kdb_enter() interface.	2007-12-25 17:52:02 +00:00
Markus Brueffer	a8a27cb0f9	Fix calculation of descriptor tag checksums. According to ECMA-167, Part 4, 7.2.3, bytes 0-3 and 5-15 are used to calculate the checksum of a descriptor tag. PR: kern/90521 Submitted by: Björn König <bkoenig@cs.tu-berlin.de> Reviewed by: scottl Approved by: emax (mentor)	2007-12-11 19:49:40 +00:00
Xin LI	1fa8f5f051	Turn MPASS(0) into a panic with a more obvious reason why the assertion failed.	2007-12-07 00:00:21 +00:00
Xin LI	745973bd99	size_max should be unsigned, as such, use size_t here.	2007-12-06 23:19:05 +00:00
Wojciech A. Koszek	9889281da3	Explicitly initialize 'error' to 0 (two places). This lets one build tmpfs from the latest source tree with an older compiler (gcc3). Reviewed by: kib@ (on freebsd-current@) Approved by: cognet@ (mentor)	2007-12-04 20:14:15 +00:00
Maxim Konovalov	23c1e989a6	o English lesson from bde@: "iff" is not a typo, it means "if and only if". Backout previous.	2007-11-18 09:21:30 +00:00
Xin LI	7871e52bfd	MFp4: Several fixes to tmpfs which make it survive pho@'s stress2 suite. To quote his letter, this change: 1. It removes the tn_lookup_dirent stuff. I think this cannot be fixed, because nothing protects vnode/tmpfs node between lookup is done, and actual operation is performed, in the case the vnode lock is dropped. At least, this is the case with the from vnode for rename. For now, we do the linear lookup in the parent node. This has its own drawbacks. Not mentioning speed (that could be fixed by using hash), the real problem is the situation where several hardlinks exist in the dvp. But, I think this is fixable. 2. The patch restores the VV_ROOT flag on the root vnode after it became reclaimed and allocated again. This fixes MPASS assertion at the start of the tmpfs_lookup() reported by many. Submitted by: kib	2007-11-18 04:52:40 +00:00
Xin LI	e0f51ae7cd	MFp4: Fix several style(9) bugs. Submitted by: des	2007-11-18 04:40:42 +00:00
Maxim Konovalov	3f61687ba1	o Mask maximum file permissions we get from mount_ntfs -m with ACCESSPERMS. Document in mount_ntfs(8) only the nine low-order bits of mask are used (taken from mount_msdosfs(8)). PR: kern/114856 Submitted by: Ighighi MFC after: 1 month	2007-11-17 17:05:01 +00:00
Maxim Konovalov	4adf89efc6	o Fix a typo in the comment.	2007-11-17 16:19:48 +00:00
Maxim Konovalov	6b0659fc0f	o Do not leak inodes hash table at module unload. PR: kern/118017 Submitted by: Ighighi MFC after: 1 week	2007-11-13 19:34:06 +00:00
Xin LI	eed4ee29e5	Correct a stack overflow which will trigger panics when mode= is specified, caused by incorrect format string specified to vfs_scanopt() and subsequently vsscanf(). Pointed out by: kib Submitted by: des	2007-11-12 18:57:33 +00:00
Tom Rhodes	ededffc06b	Remove some debugging code that, while useful, doesn't belong in the committed version. While here, expand a macro only used once. Discussed with/oked by: bde	2007-10-25 08:23:08 +00:00
Robert Watson	30d239bc4c	Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updated as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer	2007-10-24 19:04:04 +00:00
Xin LI	3247c9ddcc	Fixes to msdosfs dirtyflag related stuff: - markvoldirty() needs to write to underlying GEOM provider. We have to do that before g_access() which sets the GEOM provider to read-only. - Remove dirty flag before free'ing iconv related resources. The dirty flag removal could fail, and it is hard to revert the iconv-free after the fail. - Mark volume as dirty if we have failed to mark it clean for safe. - Other style fixes to the touched functions.	2007-10-22 17:43:43 +00:00
Bruce Evans	cb65c1ee29	Implement the async (really, delayed-write) mount option for msdosfs. This is much simpler than for ffs since there are many fewer places where we need to choose between a delayed write and a sync write -- just 5 in msdosfs and more than 30 in ffs. This is more complete and correct than in ffs. Several places in ffs are still missing the choice. ffs_update() has a layering violation that breaks callers which want to force a sync update (mainly fsync(2) and O_SYNC write(2)). However, fsync(2) and O_SYNC write(2) are still more broken than in ffs, since they are broken for default (non-sync non-async) mounts too. Both fail to sync the FAT in all cases, and both fail to sync the directory entry in some cases after losing a race. Async everything is probably safer than the half-baked sync of metadata given by default mounts.	2007-10-19 12:23:25 +00:00
Bruce Evans	9e916c3163	Add noclusterr and noclusterw options to the options list. I forgot these when I implemented clustering.	2007-10-18 16:25:47 +00:00
Bruce Evans	7c3fc9de5c	Fix some style bugs in the mount options list. Mainly, sort the list, leaving space for adding missing options. Negative options are sorted after removing their "no" prefix, and generic options are sorted before msdosfs-specific ones.	2007-10-18 15:48:10 +00:00
Bruce Evans	cefb55828f	In msdosfs_setattr(), don't do synchronous updates of the denode (except indirectly for the size pseudo-attribute). If anything deserves a sync update, then it is ids and immutable flags, since these are related to security, but ffs never synced these and msdosfs doesn't support them. (ufs_setattr() only does an update in one case where it is least needed (for timestamps); it did pessimal sync updates for timestamps until 1998/03/08 but was changed for unlogged reasons related to soft updates.) Now msdosfs calls deupdat() with waitfor == 0, which normally gives a delayed update to disk but always gives a sync update of timestamps in core, while for ffs everything is delayed until the syncer daemon or other activity causes an update (except for timestamps). This gives a large optimization mainly for things like cp -p, where attribute adjustment could easily triple the number of physical I/O's if it is done synchronously (but cp -p to msdosfs is not as bad as that, since msdosfs doesn't support many attributes so null adjustments are more common, and msdosfs doesn't support ctimes so even if cp doesn't weed out null adjustments they don't become non-null after clobbering the ctime).	2007-10-18 07:26:21 +00:00
Alfred Perlstein	77465d9390	Get rid of qaddr_t. Requested by: bde	2007-10-16 10:54:55 +00:00
Daichi GOTO	1016626062	These changes make nullfs work correctly with the latest unionfs. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:57:11 +00:00
Daichi GOTO	20885def58	Added whiteout behavior option. ``-o whiteout=always'' is default mode (it is established practice) and ``-o whiteout=whenneeded'' is less disk-space using mode especially for resource restricted environments like embedded environments. (Contributed by Ed Schouten. Thanks) Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:55:38 +00:00
Daichi GOTO	524f3f285d	Default copy mode has been changed from traditional-mode to transparent-mode. Some folks who reported issues have solved them with transparent mode. We guess it is time to change the default copy mode. The transparent-mode is the best in most situations. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:53:38 +00:00
Daichi GOTO	7d72c5e67d	Fixed un-vrele issue of upper layer root vnode of unionfs. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:52:01 +00:00
Daichi GOTO	6c98d0e9db	Added NULL check code pointed out by Coverity. (via Stanislav Sedov. Thanks) Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:50:58 +00:00
Daichi GOTO	57821163d3	- It has become MPSAFE. - Fixed lock panic issue under MPSAFE. - Fixed panic issue whenever it locks vnode with reclaim. - Fixed lock implementations not conforming to vnode_if.src style. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:49:30 +00:00
Daichi GOTO	7e0c899579	Fixed vnode unlock/vrele untreated issues whenever errors have occurred during some treatments. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:47:44 +00:00
Daichi GOTO	dc2dd18518	- Added support for vfs_cache on unionfs. As a result, you can use applications that use procfs on unionfs. - Removed unionfs internal cache mechanism because it has vfs_cache support instead. As a result, it just simplified code of unionfs. - Fixed kern/111262 issue. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:46:11 +00:00
Daichi GOTO	5adc408078	Added handling to prevent a readdir infinite loop when used with the Linux binary compatibility feature. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:44:06 +00:00
Daichi GOTO	b2b0db08c5	Changed it to free unneeded memory ASAP. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:42:05 +00:00
Daichi GOTO	3282e2c406	Improved access permission checks. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:37:52 +00:00
John Baldwin	c1f7cf23b1	Use the correct pid when checking to see whether or not the /proc/<pid> directory itself (rather than any of its contents) is visible to the current thread. MFC after: 1 week PR: kern/90063 Submitted by: john of 8192.net Approved by: re (kensmith)	2007-10-05 17:37:25 +00:00
Xin LI	3543c1b429	MFp4: Provide a dummy verb "export" to shut up the message shown at start when NFS is enabled. Reported by: rafan Approved by: re (tmpfs blanket)	2007-10-04 17:11:48 +00:00
Xin LI	386c969205	Additional work is still needed before we can claim that tmpfs is stable enough for production usage. Warn user upon mount. Approved by: re (tmpfs blanket)	2007-10-04 17:08:46 +00:00
Bruce Evans	ed316d339f	Remove some of the pessimizations involving writing the fsi sector. All active fields in fsi are advisory/optional, so we shouldn't do extra work to make them valid at all times, but instead we write to the fsi too often (we still do), and we searched for a free cluster for fsinxtfree too often. This commit just removes the whole search and its results, so that we write out our in-core copy of fsinxtfree instead of writing a "fixed" copy and clobbering our in-core copy. This saves fixing 3 bugs: - off-by-1 error for the end of the search, resulting in fsinxtfree not actually being adjusted iff only the last cluster is free. - missing adjustment when no clusters are free. - off-by-many error for the start of the search. Starting the search at 0 instead of at (the in-core copy of) fsinxtfree did more than defeat the reasons for existence of fsinxtfree. fsinxtfree exists mainly to avoid having to start at 0 for just the first search per mount, but has the side effect of reducing bias towards allocating near cluster 0. The bias would normally only be generated by the first search per mount (if fsinxtfree is not supported), but since we also adjusted the in-core copy of fsinxtfree here, we were doing extra work to maximize the bias. Approved by: re (kensmith)	2007-09-23 14:49:32 +00:00
Craig Rodrigues	00cedf971b	Disable multiple ntfs mounts to the same mountpoint. Eliminates panics due to locking issues. Idea taken from src/sys/gnu/fs/xfs/FreeBSD/xfs_super.c. PR: 89966, 92000, 104393 Reported by: H. Matsuo <hiroshi50000 yahoo co jp>, Chris <m2chrischou gmail.com>, Andrey V. Elsukov <bu7cher yandex ru>, Jan Henrik Sylvester <me janh de> Approved by: re (kensmith)	2007-09-21 23:50:15 +00:00
Jeff Roberson	b61ce5b0e6	- Move all of the PS_ flags into either p_flag or td_flags. - p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or previously the sched_lock. These bugs have existed for some time. - Allow swapout to try each thread in a process individually and then swapin the whole process if any of these fail. This allows us to move most scheduler related swap flags into td_flags. - Keep ki_sflag for backwards compat but change all in source tools to use the new and more correct location of P_INMEM. Reported by: pho Reviewed by: attilio, kib Approved by: re (kensmith)	2007-09-17 05:31:39 +00:00
Bruce Evans	c2819440b3	Fix races in msdosfs_lookup() and msdosfs_readdir(). These functions can easily block in bread(), and then there was nothing to prevent the static buffer (nambuf_{ptr,len,last_id}) being clobbered by another thread. The effects of the bug seem to have been limited to failed lookups and mangled names in readdir(), since Giant locking provides enough serialization to prevent concurrent calls to the functions that access the buffer. They were very obvious for multiple concurrent tree walks, especially with a small cluster size. The bug was introduced in msdosfs_conv.c 1.34 and associated changes, and is in all releases starting with 5.2. The fix is to allocate the buffer as a local variable and pass around pointers to it like "_r" functions in libc do. Stack use from this is large but not too large. This also fixes a memory leak on module unload. Reviewed by: kib Approved by: re (kensmith)	2007-08-31 22:29:55 +00:00
Xin LI	1f32d0127b	MFp4: rework tmpfs_readdir() logic in terms of correctness. Approved by: re (tmpfs blanket) Tested with: fstest, fsx	2007-08-16 11:00:07 +00:00
John Baldwin	1dc5b1cc56	On 6.x this works: % mount \| grep home /dev/ad4s1e on /home (ufs, local, noatime, soft-updates) % mount -u -o atime /home % mount \| grep home /dev/ad4s1e on /home (ufs, local, soft-updates) Restore this behavior on 7.x for the following mount options: noatime, noclusterr, noclusterw, noexec, nosuid, nosymfollow In addition, on 7.x, the following are equivalent: mount -u -o atime /home mount -u -o nonoatime /home Ideally, when we introduce new mount options, we should avoid options starting with "no". :) Requested by: jhb Reported by: Karol Kwiat <karol.kwiat gmail com>, Scott Hetzel <swhetzel gmail com> Approved by: re (bmah) Proxy commit for: rodrigc	2007-08-15 17:40:09 +00:00
Xin LI	ad3638ee08	MFp4: - LK_RETRY prohibits vget() and vn_lock() to return error. Remove associated code. [1] - Properly use vhold() and vdrop() instead of their unlocked versions, we are guaranteed to have the vnode's interlock unheld. [1] - Fix a pseudo-infinite loop caused by 64/32-bit arithmetic with the same way used in modern NetBSD versions. [2] - Reorganize tmpfs_readdir to reduce duplicated code. Submitted by: kib [1] Obtained from: NetBSD [2] Approved by: re (tmpfs blanket)	2007-08-10 11:00:30 +00:00
Xin LI	0ae6383d39	MFp4: - Respect cnflag and don't lock vnode always as LK_EXCLUSIVE [1] - Properly lock around tn_vnode to avoid NULL dereference - Be more careful handling vnodes () () This is a WIP [1] by pjd via howardsu Thanks kib@ for his valuable VFS related comments. Tested with: fsx, fstest, tmpfs regression test set Found by: pho's stress2 suite Approved by: re (tmpfs blanket)	2007-08-10 05:24:49 +00:00
Bruce Evans	a4e6807c49	In msdosfs_read() and msdosfs_write(), don't check explicitly for (uio_offset < 0) since this can't happen. If this happens, then the general code handles the problem safely (better than before for reading, returning 0 (EOF) instead of the bogus errno EINVAL, and the same as before for writing, returning EFBIG). In msdosfs_read(), don't check for (uio_resid < 0). msdosfs_write() already didn't check. In msdosfs_read(), document in a comment our assumptions that the caller passed a valid uio_offset and uio_resid. ffs checks using KASSERT(), and that is enough sanity checking. In the same comment, partly document there is no need to check for the EOVERFLOW case, unlike in ffs where this case can happen at least in theory. In msdosfs_write(), add a comment about why the checking of (uio_resid == 0) is explicit, unlike in ffs. In msdosfs_write(), check for impossibly large final offsets before checking if the file size rlimit would be exceeded, so that we don't have an overflow bug in the rlimit check and are consistent with ffs. We now return EFBIG instead of EFBIG plus a SIGXFSZ signal if the final offset would be impossibly large but not so large as to cause overflow. Overflow normally gave the benign behaviour of no signal. Approved by: re (kensmith) (blanket)	2007-08-07 10:35:27 +00:00

2331 Commits