Commit Graph

995 Commits

Author SHA1 Message Date
Robert Watson
70f3685105 o Change the API and ABI of the Extended Attribute kernel interfaces to
introduce a new argument, "namespace", rather than relying on a first-
  character namespace indicator.  This is in line with more recent
  thinking on EA interfaces on various mailing lists, including the
  posix1e, Linux acl-devel, and trustedbsd-discuss forums.  Two namespaces
  are defined by default, EXTATTR_NAMESPACE_SYSTEM and
  EXTATTR_NAMESPACE_USER, where the primary distinction lies in the
  access control model: user EAs are accessible based on the normal
  MAC and DAC file/directory protections, and system attributes are
  limited to kernel-originated or appropriately privileged userland
  requests.

o These API changes occur at several levels: the namespace argument is
  introduced in the extattr_{get,set}_file() system call interfaces,
  at the vnode operation level in the vop_{get,set}extattr() interfaces,
  and in the UFS extended attribute implementation.  Changes are also
  introduced in the VFS extattrctl() interface (system call, VFS,
  and UFS implementation), where the arguments are modified to include
  a namespace field, as well as modified to advoid direct access to
  userspace variables from below the VFS layer (in the style of recent
  changes to mount by adrian@FreeBSD.org).  This required some cleanup
  and bug fixing regarding VFS locks and the VFS interface, as a vnode
  pointer may now be optionally submitted to the VFS_EXTATTRCTL()
  call.  Updated documentation for the VFS interface will be committed
  shortly.

o In the near future, the auto-starting feature will be updated to
  search two sub-directories to the ".attribute" directory in appropriate
  file systems: "user" and "system" to locate attributes intended for
  those namespaces, as the single filename is no longer sufficient
  to indicate what namespace the attribute is intended for.  Until this
  is committed, all attributes auto-started by UFS will be placed in
  the EXTATTR_NAMESPACE_SYSTEM namespace.

o The default POSIX.1e attribute names for ACLs and Capabilities have
  been updated to no longer include the '$' in their filename.  As such,
  if you're using these features, you'll need to rename the attribute
  backing files to the same names without '$' symbols in front.

o Note that these changes will require changes in userland, which will
  be committed shortly.  These include modifications to the extended
  attribute utilities, as well as to libutil for new namespace
  string conversion routines.  Once the matching userland changes are
  committed, a buildworld is recommended to update all the necessary
  include files and verify that the kernel and userland environments
  are in sync.  Note: If you do not use extended attributes (most people
  won't), upgrading is not imperative although since the system call
  API has changed, the new userland extended attribute code will no longer
  compile with old include files.

o Couple of minor cleanups while I'm there: make more code compilation
  conditional on FFS_EXTATTR, which should recover a bit of space on
  kernels running without EA's, as well as update copyright dates.

Obtained from:	TrustedBSD Project
2001-03-15 02:54:29 +00:00
Maxim Sobolev
a7436e684a Add missed MODULE_VERSION() call, so loading of unicode conversion routine
works properly.

Clue beaten in by:	des
2001-03-11 15:28:42 +00:00
Boris Popov
e3c805cd07 Do not kill vnodes after rename. This can cause deadlocks in the deadfs.
Noticed by:	Matthew N. Dodd <winter@jurai.net>
2001-03-11 11:51:42 +00:00
Boris Popov
c35e8e54cd Add a mount time option which slightly relaxes checks for valid Joilet
extensions.

PR:		kern/23315
Reviewed by:	adrian
2001-03-11 10:05:08 +00:00
Boris Popov
1db5c04bc0 Slightly reorganize allocation of new vnode. Use bit NVOLUME to detected
vnodes which represent volumes (before it was done via strcmp()).
Turn n_refparent into bit in the n_flag field.
2001-03-10 05:39:03 +00:00
Boris Popov
d691852ce6 Synch with changes in the NCP requester. 2001-03-10 05:31:22 +00:00
Kirk McKusick
589c7af992 Fixes to track snapshot copy-on-write checking in the specinfo
structure rather than assuming that the device vnode would reside
in the FFS filesystem (which is obviously a broken assumption with
the device filesystem).
2001-03-07 07:09:55 +00:00
John Baldwin
19eb87d22a Grab the process lock while calling psignal and before calling psignal. 2001-03-07 03:37:06 +00:00
John Baldwin
931cccf603 Proc locking identical to that of linprocfs' vnops except that we hold the
proc lock while calling psignal.
2001-03-07 03:15:05 +00:00
John Baldwin
30ac5d0f9e Protect read to p_pptr with proc lock rather than proctree lock. 2001-03-07 03:10:20 +00:00
John Baldwin
c65c565b44 Proc locking. Lock around psignal() and also ensure both an exclusive
proctree lock and the process lock are held when updating p_pptr and
p_oppid.  When we are just reaading p_pptr we only need the proc lock and
not a proctree lock as well.
2001-03-07 03:09:40 +00:00
John Baldwin
0087374731 Protect p_flag with the proc lock. 2001-03-07 02:07:56 +00:00
Boris Popov
1cebc48fb3 A name of the file can change while its id stays the same. So, we have
to update it as well.

Remove unused function.
2001-03-06 09:59:18 +00:00
Doug Rabson
a76decc6f7 Remove the copyinstr call which was trying to copy the pathname in from
user space. It has already been copied in and mp->mnt_stat.f_mntonname has
already been initialised by the caller.

This fixes a panic on the alpha caused by the fact that the variable
'size' wasn't initialised because the call to copyinstr() bailed out with
an EFAULT error.
2001-03-03 15:15:33 +00:00
Adrian Chadd
f3a90da995 Reviewed by: jlemon
An initial tidyup of the mount() syscall and VFS mount code.

This code replaces the earlier work done by jlemon in an attempt to
make linux_mount() work.

* the guts of the mount work has been moved into vfs_mount().

* move `type', `path' and `flags' from being userland variables into being
  kernel variables in vfs_mount(). `data' remains a pointer into
  userspace.

* Attempt to verify the `type' and `path' strings passed to vfs_mount()
  aren't too long.

* rework mount() and linux_mount() to take the userland parameters
  (besides data, as mentioned) and pass kernel variables to vfs_mount().
  (linux_mount() already did this, I've just tidied it up a little more.)

* remove the copyin*() stuff for `path'. `data' still requires copyin*()
  since its a pointer into userland.

* set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each
  filesystem.  This variable is generally initialised with `path', and
  each filesystem can override it if they want to.

* NOTE: f_mntonname is intiailised with "/" in the case of a root mount.
2001-03-01 21:00:17 +00:00
Alfred Perlstein
8283130be4 Display the Joliet Extension 'level' in the log message.
PR: kern/24998
2001-02-23 03:43:05 +00:00
Robert Watson
91421ba234 o Move per-process jail pointer (p->pr_prison) to inside of the subject
credential structure, ucred (cr->cr_prison).
o Allow jail inheritence to be a function of credential inheritence.
o Abstract prison structure reference counting behind pr_hold() and
  pr_free(), invoked by the similarly named credential reference
  management functions, removing this code from per-ABI fork/exit code.
o Modify various jail() functions to use struct ucred arguments instead
  of struct proc arguments.
o Introduce jailed() function to determine if a credential is jailed,
  rather than directly checking pointers all over the place.
o Convert PRISON_CHECK() macro to prison_check() function.
o Move jail() function prototypes to jail.h.
o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the
  flag in the process flags field itself.
o Eliminate that "const" qualifier from suser/p_can/etc to reflect
  mutex use.

Notes:

o Some further cleanup of the linux/jail code is still required.
o It's now possible to consider resolving some of the process vs
  credential based permission checking confusion in the socket code.
o Mutex protection of struct prison is still not present, and is
  required to protect the reference count plus some fields in the
  structure.

Reviewed by:	freebsd-arch
Obtained from:	TrustedBSD Project
2001-02-21 06:39:57 +00:00
Poul-Henning Kamp
3e8bea9634 Remove a debug printf. 2001-02-18 09:16:49 +00:00
Jonathan Lemon
608a3ce62a Extend kqueue down to the device layer.
Backwards compatible approach suggested by: peter
2001-02-15 16:34:11 +00:00
Maxim Sobolev
c4fefc4887 Add a hook for loading of a Unicode -> char conversion routine as a kld at a
run-time. This is temporary solution until proper kernel Unicode interfaces
are in place and as such was purposely designed to be as tiny as possible
(3 lines of the code not counting comments). The port with conversion routines
for the most popular single-byte languages will be added later today

Reviewed by:	bp, "Michael C . Wu" <keichii@iteration.net>
Approved by:	bp
2001-02-13 11:48:31 +00:00
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
Jeroen Ruigrok van der Werven
1a6e52d0e9 Fix typo: seperate -> separate.
Seperate does not exist in the english language.
2001-02-06 11:21:58 +00:00
Poul-Henning Kamp
37d4006626 Another round of the <sys/queue.h> FOREACH transmogriffer.
Created with:   sed(1)
Reviewed by:    md5(1)
2001-02-04 16:08:18 +00:00
Poul-Henning Kamp
fc2ffbe604 Mechanical change to use <sys/queue.h> macro API instead of
fondling implementation details.

Created with: sed(1)
Reviewed by: md5(1)
2001-02-04 13:13:25 +00:00
Poul-Henning Kamp
ef9e85abba Use <sys/queue.h> macro API. 2001-02-04 12:37:48 +00:00
Poul-Henning Kamp
b99cfaf32c Remove a DIAGNOSTIC check which belongs in <sys/queue.h> if anyplace at all. 2001-02-04 11:53:51 +00:00
Poul-Henning Kamp
4b1c62b3f2 At the point in time where most devices are created, we don't know what
time it is because boottime is not yet initialized.  Finagle the relevant
fields when we get the chance.
2001-02-02 22:54:41 +00:00
Poul-Henning Kamp
ecde9a6dae Only superuser can create symlinks.
Give symlinks mode 755 by default to avoid triggering alert eyes.
(the mode isn't use on symlinks)
2001-02-02 18:35:29 +00:00
Peter Wemm
2508f69037 Zap last remaining references to (and a use use of) of simple_locks. 2001-01-31 04:29:52 +00:00
Poul-Henning Kamp
4997ad7c1f Add a BUF_KERNPROC() in the BIO_DELETE path.
This seems to fix the problem which md(4) backed filesystems exposed.
2001-01-30 10:06:08 +00:00
Poul-Henning Kamp
aadf265525 Fix two minor nits.
Existences revealed, but no details offered by: bp
2001-01-30 08:39:52 +00:00
Matthew Dillon
2a9737202a This patch reestablishes the spec_fsync() guarentee that synchronous
fsyncs, which typically occur during unmounting, will drain all dirty
buffers even if it takes multiple passes to do so.  The guarentee was
mangled by the last patch which solved a problem due to -current disabling
interrupts while holding giant (which caused an infinite spin loop waiting for
I/O to complete).  -stable does not have either patch, but has a similar
bug in the original spec_fsync() code which is triggered by a bug in the
softupdates umount code, a fix for which will be committed to -current
as soon as Kirk stamps it.  Then both solutions will be MFC'd to -stable.

-stable currently suffers from a combination of the softupdates bug and
a small window of opportunity in the original spec_fsync() code, and -stable
also suffers from the spin-loop bug but since interrupts are enabled the
spin resolves itself in a few milliseconds.
2001-01-29 08:19:28 +00:00
John Baldwin
ba88dfc733 Back out proc locking to protect p_ucred for obtaining additional
references along with the actual obtaining of additional references.
2001-01-27 00:01:31 +00:00
Jason Evans
1b367556b5 Convert all simplelocks to mutexes and remove the simplelock implementations. 2001-01-24 12:35:55 +00:00
John Baldwin
b939335607 - Catch up to proc flag changes. 2001-01-24 11:20:05 +00:00
John Baldwin
54bd3c0306 The lock being destroyed was misnamed, not unused. Add the lockdestroy()
back in but with the proper name so that this compiles.

Submitted by:	jasone
2001-01-24 02:18:54 +00:00
John Baldwin
cfb4c0b4f9 Proc locking to protect p_ucred while we obtain additional references. 2001-01-24 00:26:19 +00:00
John Baldwin
d19a727628 - Remove unused header include.
- Use queue macros.
2001-01-23 22:38:38 +00:00
John Baldwin
1aab03a584 Proc locking to protect p_ucred while we obtain an additional reference. 2001-01-23 22:38:15 +00:00
John Baldwin
f5343b3219 - FreeBSD doesn't have an abortop vnop as far as I can tell, so #ifdef
references to the hpf op out.
- Remove a lockdestroy() on a non-existent variable.
2001-01-23 22:37:30 +00:00
Peter Wemm
10cf882b4f Fix breakage unconvered by LINT - dont refer to undefined variables in
KASSERT()
2001-01-17 01:10:23 +00:00
Garrett Wollman
d31a0944a1 Delete unused #include <sys/select.h>. 2001-01-09 04:32:24 +00:00
Garrett Wollman
b7ef0b1281 Don't compile a dead variable declaration. 2001-01-09 04:24:43 +00:00
Poul-Henning Kamp
49851cc706 Use macro API to <sys/queue.h> 2000-12-31 10:24:19 +00:00
Matthew Dillon
08c0a67b2e Fix a lockup problem that occurs with 'cvs update'. specfs's fsync can
get into the same sort of infinite loop that ffs's fsync used to get
into, probably due to background bitmap writes.  The solution is
the same.
2000-12-30 23:32:24 +00:00
Matthew Dillon
2b6b0df712 This implements a better launder limiting solution. There was a solution
in 4.2-REL which I ripped out in -stable and -current when implementing the
low-memory handling solution.  However, maxlaunder turns out to be the saving
grace in certain very heavily loaded systems (e.g. newsreader box).  The new
algorithm limits the number of pages laundered in the first pageout daemon
pass.  If that is not sufficient then suceessive will be run without any
limit.

Write I/O is now pipelined using two sysctls, vfs.lorunningspace and
vfs.hirunningspace.  This prevents excessive buffered writes in the
disk queues which cause long (multi-second) delays for reads.  It leads
to more stable (less jerky) and generally faster I/O streaming to disk
by allowing required read ops (e.g. for indirect blocks and such) to occur
without interrupting the write stream, amoung other things.

NOTE: eventually, filesystem write I/O pipelining needs to be done on a
per-device basis.  At the moment it is globalized.
2000-12-26 19:41:38 +00:00
Jake Burkholder
98f03f9030 Protect proc.p_pptr and proc.p_children/p_sibling with the
proctree_lock.

linprocfs not locked pending response from informal maintainer.

Reviewed by:	jhb, -smp@
2000-12-23 19:43:10 +00:00
John Baldwin
ea10b6b78e When p_ucred is passed to the venus daemon, first grab the proc lock to
protect the p_ucred pointer, obtain a seperate reference to the ucred,
release the lock, and then pass in the new ucred reference.
2000-12-15 00:12:30 +00:00
Robert Watson
f6a99e61c5 o Tighten restrictions on use of /proc/pid/ctl and move access checks
in ctl to using centralized p_can() inter-process access control
  interface.

Reviewed by:	sef
2000-12-13 04:28:24 +00:00
Jake Burkholder
c0c2557090 - Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead
of explicit calls to lockmgr.  Also provides macros for the flags
  pased to specify shared, exclusive or release which map to the
  lockmgr flags.  This is so that the use of lockmgr can be easily
  replaced with optimized reader-writer locks.
- Add some locking that I missed the first time.
2000-12-13 00:17:05 +00:00
Dag-Erling Smørgrav
668891c57b Add a module version (so that linprocfs can properly depend on procfs) 2000-12-09 13:17:51 +00:00
David Malone
7cc0979fd6 Convert more malloc+bzero to malloc+M_ZERO.
Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
2000-12-08 21:51:06 +00:00
Poul-Henning Kamp
2a0436783d staticize. 2000-12-08 15:07:24 +00:00
John Baldwin
f3a16450d0 Protect accesses to member of struct proc with the proc lock. 2000-12-06 01:45:20 +00:00
John Baldwin
c3f52eedeb Protect p_stat with the sched_lock.
Reviewed by:	jake
2000-12-02 01:58:15 +00:00
Jonathan Lemon
747fa57549 Update to reflect the disappearance of getsock().
Found by:  LINT
2000-11-25 07:16:06 +00:00
Boris Popov
f2b1e0d206 Use vop_defaultop() instead of ntfs_bypass().
PR:		kern/22756
2000-11-18 02:47:12 +00:00
Kirk McKusick
dc029a4bd7 Missed conversion of CIRCLEQ => TAILQ for mount list. 2000-11-14 06:38:18 +00:00
Eivind Eklund
b8c8516a7f More paranoia against overflows 2000-11-08 21:53:05 +00:00
Boris Popov
1c4c2d8235 v_interlock is a mutex now, not simple lock. 2000-11-04 02:42:11 +00:00
Poul-Henning Kamp
1d7e3e42e7 Take VBLK devices further out of their missery.
This should fix the panic I introduced in my previous commit on this topic.
2000-11-02 21:14:13 +00:00
Eivind Eklund
ab3240e198 Fix overflow from jail hostname.
Bug found by:	Esa Etelavuori <eetelavu@cc.hut.fi>
2000-11-01 19:38:08 +00:00
Eivind Eklund
e3c4036b18 Give vop_mmap an untimely death. The opportunity to give it a timely
death timed out in 1996.
2000-11-01 17:57:24 +00:00
David Malone
da71e9a21b Make malloc use M_ZERO in some more locations.
Don't check for a null pointer if malloc called with M_WAITOK.

Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
Approved by:	bp
2000-10-29 16:14:28 +00:00
Poul-Henning Kamp
cf9fa8e725 Move suser() and suser_xxx() prototypes and a related #define from
<sys/proc.h> to <sys/systm.h>.

Correctly document the #includes needed in the manpage.

Add one now needed #include of <sys/systm.h>.
Remove the consequent 48 unused #includes of <sys/proc.h>.
2000-10-29 16:06:56 +00:00
Poul-Henning Kamp
9f69a4578a Weaken a bogus dependency on <sys/proc.h> in <sys/buf.h> by #ifdef'ing
the offending inline function (BUF_KERNPROC) on it being #included
already.

I'm not sure BUF_KERNPROC() is even the right thing to do or in the
right place or implemented the right way (inline vs normal function).

Remove consequently unneeded #includes of <sys/proc.h>
2000-10-29 14:54:55 +00:00
Poul-Henning Kamp
53ce36d17a Remove unneeded #include <sys/proc.h> lines. 2000-10-29 13:57:19 +00:00
Boris Popov
6716c905c9 Rev 1.41 was committed from wrong diff, now do it right. 2000-10-22 16:15:12 +00:00
Boris Popov
3ae19dd8cd Release and unlock vnode if resource deadlock detected. 2000-10-22 15:40:22 +00:00
Boris Popov
b1b494a765 Update stale comment.
PR:		kern/21805
2000-10-22 14:24:30 +00:00
Boris Popov
e7b1ac75dd Remove de_lock field from denode structure and make msdosfs PDIRUNLOCK aware. 2000-10-22 14:22:17 +00:00
Boris Popov
d45a191e99 Fix nullfs breakage caused by incomplete migration of v_interlock from
simple_lock to mutex.

Reset LK_INTERLOCK flag when interlock released manually.
2000-10-15 06:25:42 +00:00
Chris Costello
3157714de7 o Move from Alfred Perstein's "exclusion" technique of handling special
file types to requiring all file types to properly implement fo_stat.
  This makes any new file type additions much easier as this code no
  longer has to be modified to accomodate it.

o Instead of using curproc in fdesc_allocvp, pass a `struct proc' pointer as
  a new fifth parameter.
2000-10-09 20:06:13 +00:00
Eivind Eklund
7eb9fca557 Blow away the v_specmountpoint define, replacing it with what it was
defined as (rdev->si_mountpoint)
2000-10-09 17:31:39 +00:00
Poul-Henning Kamp
8abea41d80 Don't hold an extra reference to vnodes. Devfs vnodes are sufficiently
cheap to setup that it doesn't really matter that we recycle device
vnodes at kleenex speed.

Implement first cut try at killing cloned devices when they are
not needed anymore.  For now only the bpf driver is involved in
this experiment.  Cloned devices can set the SI_CHEAPCLONE flag
which allows us to destroy_dev() it when the vcount() drops to zero
and the vnode is reclaimed.  For now it's a requirement that the
driver doesn't keep persistent state from close to (re)open.

Some whitespace changes.
2000-10-09 14:18:07 +00:00
Alfred Perlstein
e5b7b6b78e return correct type for process directory entries, DT_DIR not DT_REG 2000-10-05 23:19:51 +00:00
Bruce Evans
253315a5a3 Forward-declare struct mbuf so that this file is less self-insufficient
-- don't depend on garbage in <sys/mount.h>.  mbufs aren't actually
used here either.  They should have been completely removed from filesystem
interfaces when they were removed from the interfaces to convert between
file handles and vnodes.
2000-10-05 11:58:22 +00:00
Jason Evans
a18b1f1d4d Convert lockmgr locks from using simple locks to using mutexes.
Add lockdestroy() and appropriate invocations, which corresponds to
lockinit() and must be called to clean up after a lockmgr lock is no
longer needed.
2000-10-04 01:29:17 +00:00
Boris Popov
beafe997c7 Make cd9660 filesystem PDIRUNLOCK aware. Now it can be used in vnode stacks
and nullfs mounts.

Remove now unnecessary i_lock field from the iso_node structure.
2000-10-03 04:39:50 +00:00
Boris Popov
611025ec57 Prevent dereference of NULL pointer when null_lock() and null_unlock()
called and there is no underlying vnode.
2000-10-03 04:25:53 +00:00
Boris Popov
5c4db877e4 Protect hash data with lock manager instead of home grown one.
Replace shared lock on vnode with exclusive one. It shouldn't impact
perfomance as NCP protocol doesn't support outstanding requests.

Do not hold simple lock on vnode for long period of time.

Add functionality to the nwfs_print() routine.
2000-10-02 09:49:04 +00:00
Boris Popov
7f66accf02 Get rid from the legacy __P() macro. Remove 'register' keywords. 2000-10-02 09:29:59 +00:00
Peter Wemm
6adbf7beeb PDIRUNLOCK now exists on FreeBSD. Remove the (now incorrect) redefinition. 2000-10-02 04:47:19 +00:00
Boris Popov
4451405fbd Fix vnode locking bugs in the nullfs.
Add correct support for v_object management, so mmap() operation should
work properly.
Add support for extattrctl() routine (submitted by semenu).

At this point nullfs can be considered as functional and much more stable.
In fact, it should behave as a "hard" "symlink" to underlying filesystem.

Reviewed in general by:		mckusick, dillon
Parts of logic obtained from:	NetBSD
2000-09-25 15:38:32 +00:00
Poul-Henning Kamp
eb7ba7f95c Ignore attempts to set flags to zero. This quenches a syslog warning
from login(1).
2000-09-18 09:40:01 +00:00
Poul-Henning Kamp
c80d29139c Add canonical checks to devfs_setattr(). 2000-09-16 12:06:58 +00:00
John Baldwin
b570da11fe Use size_t instead of u_int for 4th argument to copyinstr(). 2000-09-12 22:39:34 +00:00
Jason Evans
0384fff8c5 Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*().  See mutex(9).  (Note: The
  alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
  preempted (i386 only).

Partially contributed by:	BSDi (BSD/OS)
Submissions by (at least):	cp, dfr, dillon, grog, jake, jhb, sheldonh
2000-09-07 01:33:02 +00:00
Poul-Henning Kamp
93bcdfe270 Add refcounts to the "global" DEVFS inode slots, this allows us
to recycle inodes after a destroy_dev() but not until all mounts
have picked up the change.

Add support for an overflow table for DEVFS inodes.  The static
table defaults to 1024 inodes, if that fills, an overflow table
of 32k inodes is allocated.  Both numbers can be changed at
compile time, the size of the overflow table also with the
sysctl vfs.devfs.noverflow.

Use atomic instructions to barrier between make_dev()/destroy_dev()
and the mounts.

Add lockmgr() locking of directories for operations accessing or
modifying the directory TAILQs.

Various nitpicking here and there.
2000-09-06 11:26:43 +00:00
Boris Popov
8da8066061 Various cleanups towards make nullfs functional (it is still broken
at this point):

Replace all '#ifdef DEBUG' with '#ifdef NULLFS_DEBUG' and add NULLFSDEBUG
macro.

Protect nullfs hash table with lockmgr.

Use proper order of operations when freeing mnt_data.

Return correct fsid in the null_getattr().

Add null_open() function to catch MNT_NODEV (obtained from NetBSD).

Add null_rename() to catch cross-fs rename operations (submitted by
Ustimenko Semen <semen@iclub.nsu.ru>)

Remove duplicate $FreeBSD$ tags.
2000-09-05 09:02:07 +00:00
Boris Popov
7da1e3f0b0 Get rid from the __P() macros.
Encouraged by:	peter
2000-09-05 07:54:39 +00:00
Poul-Henning Kamp
7665e72021 Off by one error.
Submitted by:	des
2000-09-04 18:24:30 +00:00
Dag-Erling Smørgrav
18203af447 Remove a comment that has been not only obsolete but patently wrong for the
last 31 revisions (almost three years).
2000-09-04 18:18:17 +00:00
Poul-Henning Kamp
db90128160 Avoid the modules madness I inadvertently introduced by making the
cloning infrastructure standard in kern_conf.  Modules are now
the same with or without devfs support.

If you need to detect if devfs is present, in modules or elsewhere,
check the integer variable "devfs_present".

This happily removes an ugly hack from kern/vfs_conf.c.

This forces a rename of the eventhandler and the standard clone
helper function.

Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include
like <sys/queue.h>

Remove all #includes of opt_devfs.h they no longer matter.
2000-09-02 19:17:34 +00:00
Robert Watson
84a5637620 o Simplify if/then clause equating ESRCH with ENOENT when hiding a process
Submitted by:	des
2000-09-01 18:41:32 +00:00
Robert Watson
ca94dd37a3 o Make procfs use vaccess() for procfs_access() DAC and super-user checks,
rather than implementing its own {uid,gid,other} checks against vnode
  mode.  Similar change to linprocfs currently under review.

Obtained from:	TrustedBSD Project
2000-09-01 13:41:41 +00:00
Robert Watson
387d2c036b o Centralize inter-process access control, introducing:
int p_can(p1, p2, operation, privused)

  which allows specification of subject process, object process,
  inter-process operation, and an optional call-by-reference privused
  flag, allowing the caller to determine if privilege was required
  for the call to succeed.  This allows jail, kern.ps_showallprocs and
  regular credential-based interaction checks to occur in one block of
  code.  Possible operations are P_CAN_SEE, P_CAN_SCHED, P_CAN_KILL,
  and P_CAN_DEBUG.  p_can currently breaks out as a wrapper to a
  series of static function checks in kern_prot, which should not
  be invoked directly.

o Commented out capabilities entries are included for some checks.

o Update most inter-process authorization to make use of p_can() instead
  of manual checks, PRISON_CHECK(), P_TRESPASS(), and
  kern.ps_showallprocs.

o Modify suser{,_xxx} to use const arguments, as it no longer modifies
  process flags due to the disabling of ASU.

o Modify some checks/errors in procfs so that ENOENT is returned instead
  of ESRCH, further improving concealment of processes that should not
  be visible to other processes.  Also introduce new access checks to
  improve hiding of processes for procfs_lookup(), procfs_getattr(),
  procfs_readdir().  Correct a bug reported by bp concerning not
  handling the CREATE case in procfs_lookup().  Remove volatile flag in
  procfs that caused apparently spurious qualifier warnigns (approved by
  bde).

o Add comment noting that ktrace() has not been updated, as its access
  control checks are different from ptrace(), whereas they should
  probably be the same.  Further discussion should happen on this topic.

Reviewed by:	bde, green, phk, freebsd-security, others
Approved by:	bde
Obtained from:	TrustedBSD Project
2000-08-30 04:49:09 +00:00
Robert Watson
012c643d3e o Restructure vaccess() so as to check for DAC permission to modify the
object before falling back on privilege.  Make vaccess() accept an
  additional optional argument, privused, to determine whether
  privilege was required for vaccess() to return 0.  Add commented
  out capability checks for reference.  Rename some variables to make
  it more clear which modes/uids/etc are associated with the object,
  and which with the access mode.
o Update file system use of vaccess() to pass NULL as the optional
  privused argument.  Once additional patches are applied, suser()
  will no longer set ASU, so privused will permit passing of
  privilege information up the stack to the caller.

Reviewed by:	bde, green, phk, -security, others
Obtained from:	TrustedBSD Project
2000-08-29 14:45:49 +00:00
Poul-Henning Kamp
c32d0a1dcd Reorder vop's alphabetically.
Smarter use of devfs_allocv() (from bp@)
 Introduce devfs_find()
 ".." fixes to devfs_lookup (from bp@)
2000-08-27 14:46:36 +00:00
Poul-Henning Kamp
749e1537ec Minor cleanups tp devfs_readdir();
Add devfs_read() for directories.  (inspired by bp@)
2000-08-26 16:20:57 +00:00
Bruce Evans
ff4ad0c4d8 Quick fix for msdsofs_write() on alphas and other machines with either
longs larger than 32 bits or strict alignment requirements.

pm_fatmask had type u_long, but it must have a type that has precisely
32 bits and this type must be no smaller than int, so that ~pmp->pm_fatmask
has no bits above the 31st set.  Otherwise, comparisons between (cn
| ~pmp->pm_fatmask) and magic 32-bit "cluster" numbers always fail.
The correct fix is to use the C99 type uint_least32_t and mask with
0xffffffff.  The quick fix is to use u_int32_t and assume that ints
have

msdosfs metadata is riddled with unaligned fields, and on alphas,
unaligned_fixup() apparently has problems fixing up the unaligned
accesses caused by this.  The quick fix is to not comment out the
NetBSD code that sort of handles this, and define UNALIGNED_ACCESS on
i386's so that the code doesn't change on i386's.  The correct fix
would define UNALIGNED_ACCESS in a central machine-dependent header
and maybe add some extra cases to unaligned_fixup().  UNALIGNED_ACCESS
is also tested in isofs.

Submitted by:	parts by Mark Abene <phiber@radicalmedia.com>
PR:		19086
2000-08-25 09:03:58 +00:00
Poul-Henning Kamp
a481b90b82 Fix panic when removing open device (found by bp@)
Implement subdirs.
 Build the full "devicename" for cloning functions.
 Fix panic when deleted device goes away.
 Collaps devfs_dir and devfs_dirent structures.
 Add proper cloning to the /dev/fd* "device-"driver.
 Fix a bug in make_dev_alias() handling which made aliases appear
  multiple times.
 Use devfs_clone to implement getdiskbyname()
 Make specfs maintain the stat(2) timestamps per dev_t
2000-08-24 15:36:55 +00:00
Poul-Henning Kamp
fcc9b84ca5 Fix devfs_access() bug on directories.
Remove unused #includes.

Bug spotted by:	markm
2000-08-21 14:45:19 +00:00
Poul-Henning Kamp
3f54a085a6 Remove all traces of Julians DEVFS (incl from kern/subr_diskslice.c)
Remove old DEVFS support fields from dev_t.

  Make uid, gid & mode members of dev_t and set them in make_dev().

  Use correct uid, gid & mode in make_dev in disk minilayer.

  Add support for registering alias names for a dev_t using the
  new function make_dev_alias().  These will show up as symlinks
  in DEVFS.

  Use makedev() rather than make_dev() for MFSs magic devices to prevent
  DEVFS from noticing this abuse.

  Add a field for DEVFS inode number in dev_t.

  Add new DEVFS in fs/devfs.

  Add devfs cloning to:
        disk minilayer (ie: ad(4), sd(4), cd(4) etc etc)
        md(4), tun(4), bpf(4), fd(4)

  If DEVFS add -d flag to /sbin/inits args to make it mount devfs.

  Add commented out DEVFS to GENERIC
2000-08-20 21:34:39 +00:00
Poul-Henning Kamp
e39c53eda5 Centralize the canonical vop_access user/group/other check in vaccess().
Discussed with: bde
2000-08-20 08:36:26 +00:00
Poul-Henning Kamp
39f70682ae Introduce vop_stdinactive() and make it the default if no vop_inactive
is declared.

Sort and prune a few vop_op[].
2000-08-18 10:01:02 +00:00
Sheldon Hearn
1b2fbe6ff9 Rename the loadable nullfs kernel module: null -> nullfs 2000-07-28 11:54:09 +00:00
Kirk McKusick
9b97113391 This patch corrects the first round of panics and hangs reported
with the new snapshot code.

Update addaliasu to correctly implement the semantics of the old
checkalias function. When a device vnode first comes into existence,
check to see if an anonymous vnode for the same device was created
at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than
creating a new vnode for the device. This corrects a problem which
caused the kernel to panic when taking a snapshot of the root
filesystem.

Change the calling convention of vn_write_suspend_wait() to be the
same as vn_start_write().

Split out softdep_flushworklist() from softdep_flushfiles() so that
it can be used to clear the work queue when suspending filesystem
operations.

Access to buffers becomes recursive so that snapshots can recursively
traverse their indirect blocks using ffs_copyonwrite() when checking
for the need for copy on write when flushing one of their own indirect
blocks. This eliminates a deadlock between the syncer daemon and a
process taking a snapshot.

Ensure that softdep_process_worklist() can never block because of a
snapshot being taken. This eliminates a problem with buffer starvation.

Cleanup change in ffs_sync() which did not synchronously wait when
MNT_WAIT was specified. The result was an unclean filesystem panic
when doing forcible unmount with heavy filesystem I/O in progress.

Return a zero'ed block when reading a block that was not in use at
the time that a snapshot was taken. Normally, these blocks should
never be read. However, the readahead code will occationally read
them which can cause unexpected behavior.

Clean up the debugging code that ensures that no blocks be written
on a filesystem while it is suspended. Snapshots must explicitly
label the blocks that they are writing during the suspension so that
they do not cause a `write on suspended filesystem' panic.

Reorganize ffs_copyonwrite() to eliminate a deadlock and also to
prevent a race condition that would permit the same block to be
copied twice. This change eliminates an unexpected soft updates
inconsistency in fsck caused by the double allocation.

Use bqrelse rather than brelse for buffers that will be needed
soon again by the snapshot code. This improves snapshot performance.
2000-07-24 05:28:33 +00:00
David Malone
4ebb509c1f Certain error contitions cause msdosfs_rename() to decrement the
vnode reference count on 'fdvp' more times than it should.

PR:		17347
Submitted by:	Ian Dowse <iedowse@maths.tcd.ie>
Approved by:	bde
2000-07-14 11:52:56 +00:00
Kirk McKusick
f2a2857bb3 Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
2000-07-11 22:07:57 +00:00
Poul-Henning Kamp
77978ab8bc Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
Pointed out by:	bde
2000-07-04 11:25:35 +00:00
Poul-Henning Kamp
b7ffb34243 Pull the rug under block mode devices. they return ENXIO on open(2) now. 2000-07-03 13:48:37 +00:00
Poul-Henning Kamp
82d9ae4e32 Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our
sources:

        -sysctl_vm_zone SYSCTL_HANDLER_ARGS
        +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
2000-07-03 09:35:31 +00:00
Boris Popov
6843189ab9 Fix memory leakage on module unload.
Spotted by:	fixed INVARIANTS code
2000-06-29 01:19:12 +00:00
Boris Popov
432a84000f Fix memory leakage on module unload.
Spotted by:	fixed INVARIANTS code
2000-06-29 01:12:47 +00:00
Chris Costello
726bd7bebf fdesc_getattr:
Don't fake any file types, just set vap->va_type to IFTOVT(stb.st_mode).
  If something does not report its mode, vap->va_type is set to VNON
  accordingly.
2000-06-28 19:18:25 +00:00
Alfred Perlstein
974c7068d7 by changing the logic here we can support dynamic additions of new
filetypes.

Reviewed by: green
2000-06-27 22:46:35 +00:00
Alfred Perlstein
70cb8de9ba if there are leading zeros fail the lookup
Pointed out by: Alexander Viro <viro@math.psu.edu>
2000-06-27 21:37:17 +00:00
Boris Popov
b1bd38b351 Remove obsolete comment.
Submitted by:	Marius Bendiksen <mbendiks@eunet.no>
2000-06-25 02:29:45 +00:00
Chris Costello
04e58856a6 Rename the VRXEC' macro used to clear read and exec bits to FDRX' so
as not to impede upon VFS namespace.
2000-06-20 20:34:11 +00:00
Poul-Henning Kamp
a2e7a027a7 Virtualizes & untangles the bioops operations vector.
Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
2000-06-16 08:48:51 +00:00
Chris Costello
0b1574bd33 Remove unused include <sys/socketvar.h>. 2000-06-15 20:13:51 +00:00
Chris Costello
3b5ec07347 Replace vattr_null() with VATTR_NULL() and do not explicity set vattr
fields to VNOVAL afterwards.
2000-06-15 17:19:22 +00:00
Jonathan M. Bresler
caea99a368 before this commit, specfs reported disk partitions
using decimal major and minor numbers.  "ls -l" reports
	disk partitions using decimal major numbers and hex
	minor numbers.

	make specfs use decimal major numbers and hex minor numbers,
	just like "ls -l"
2000-06-12 10:20:18 +00:00
Chris Costello
f574046378 Instead of completely disallowing VOP_SETATTR, just do it where there is
an underlying vnode.

Suggested by:	bde
2000-06-06 00:35:39 +00:00
Chris Costello
c1e15e06bb Update the comment for fdesc_setattr to reflect that we no longer
actually setattr() on underlying vnodes.
2000-06-02 07:08:18 +00:00
Chris Costello
908a81d424 - Do not allow VOP_SETATTR to modify underlying vnodes at all. This caused
problems when fetch(1) was passed `-o -'.  The rationale of this change
  is that applications attempting to change underlying vnodes for /dev/fd
  nodes are improperly written and the use of this interface should not
  ever have been encouraged.  Proper alternatives are fchmod, fchown and
  others.

  PR:		18952

- Remove stale, unused fdescnode->fd_link structure member.
2000-06-02 07:02:45 +00:00
Jake Burkholder
e39756439c Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00
Jake Burkholder
740a1973a6 Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by:	phk
Reviewed by:	phk
Approved by:	mdodd
2000-05-23 20:41:01 +00:00
Chris Costello
38edb6a32f Adapt fdesc to be mounted on /dev/fd and remove fd, stdin, stdout and
stderr nodes.  More specific items of this patch:
  o Removed support for symbolic links, and the need for
    fdesc_readlink().
  o Put all the code from fdesc_attr() into fdesc_getattr() and removed
    fdesc_attr().  This also made it easier to properly give all nodes
    unique inode numbers.
  o The removal of all non-fd nodes allowed the removal of the fdesc_read(),
    fdesc_write(), and fdesc_ioctl() nodes, since we no longer have nodes
    that get special handling.
  o Correct the component name validity-checking in fdesc_lookup().  It
    previously detected the end of the string by checking for a terminating
    NUL, now it uses cnp->cn_namelen.
  o Handle kqueue files as FIFOs.  This is probably the closest file type
    to represent this type of file there is, and it is unfortunately not
    very representative of a kqueue.  Creation time is not supported by
    kqueue, so ctime, mtime and atime are all set to the current time when
    getattr() was called.
  o Also set st_[mca]time to the current time since there's no data in
    socket structures that can be used to fill this in (FIFOs).
  o Simplify fdesc_readdir() since it only has to report the numbered
    fd nodes.  Add `.' and `..' directory links as well.
  o Remove read bits from directories as they tend to confuse programs
    like tar(1).

Reviewed by:	phk
Discussed with:	bde (earlier on, not quite review)
2000-05-11 22:10:51 +00:00
Poul-Henning Kamp
192c06ea1b Change the "bdev-whiner" to whine when open is attempted and extend
the deadline a month.
2000-05-09 18:53:57 +00:00
Poul-Henning Kamp
9626b608de Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by:    peter
2000-05-05 09:59:14 +00:00
Poul-Henning Kamp
e9340d96fa Remove 42 unneeded #include <sys/ioccom.h>.
ioccom.h defines only implementation detail, and should therefore
only be included from the #include which defines the ioctl tags,
in other words: never include it from *.c
2000-05-03 07:31:38 +00:00
Peter Wemm
365c5db0a7 Add $FreeBSD$ 2000-05-01 20:32:07 +00:00
Poul-Henning Kamp
2c9b67a8df Remove unneeded #include <vm/vm_zone.h>
Generated by:	src/tools/tools/kerninclude
2000-04-30 18:52:11 +00:00
Poul-Henning Kamp
eb95c536ad Remove unneeded #include <sys/kernel.h> 2000-04-29 15:36:14 +00:00
Peter Wemm
6ff4e7af7e nwfs depends on ncp 2000-04-29 13:34:28 +00:00
Brian Feldman
b7db19017b Move procfs_fullpath() to vfs_cache.c, with a rename to textvp_fullpath().
There's no excuse to have code in synthetic filestores that allows direct
references to the textvp anymore.

Feature requested by:	msmith
Feature agreed to by:	warner
Move requested by:	phk
Move agreed to by:	bde
2000-04-26 11:57:45 +00:00
Brian Feldman
28749c79c8 Quiet an unused variable warning by commenting out a variable declaration
that goes with a commented out statement.
2000-04-22 17:58:40 +00:00
Brian Feldman
bb529be712 There's no reason to make "file" 0500 rather than 0555. 2000-04-22 04:01:54 +00:00
Brian Feldman
081d7b00c7 Welcome back our old friend from procfs, "file"! 2000-04-22 03:44:41 +00:00
Poul-Henning Kamp
3389ae9350 Remove ~25 unneeded #include <sys/conf.h>
Remove ~60 unneeded #include <sys/malloc.h>
2000-04-19 14:58:28 +00:00
Poul-Henning Kamp
ed6aff7387 Remove unneeded <sys/buf.h> includes.
Due to some interesting cpp tricks in lockmgr, the LINT kernel shrinks
by 924 bytes.
2000-04-18 15:15:39 +00:00
Jonathan Lemon
cb679c385e Introduce kqueue() and kevent(), a kernel event notification facility. 2000-04-16 18:53:38 +00:00
Poul-Henning Kamp
8177437d85 Complete the bio/buf divorce for all code below devfs::strategy
Exceptions:
        Vinum untouched.  This means that it cannot be compiled.
        Greg Lehey is on the case.

        CCD not converted yet, casts to struct buf (still safe)

        atapi-cd casts to struct buf to examine B_PHYS
2000-04-15 05:54:02 +00:00
Robert Watson
a64ed08955 Introduce extended attribute support for FFS, allowing arbitrary
(name, value) pairs to be associated with inodes.  This support is
used for ACLs, MAC labels, and Capabilities in the TrustedBSD
security extensions, which are currently under development.

In this implementation, attributes are backed to data vnodes in the
style of the quota support in FFS.  Support for FFS extended
attributes may be enabled using the FFS_EXTATTR kernel option
(disabled by default).  Userland utilities and man pages will be
committed in the next batch.  VFS interfaces and man pages have
been in the repo since 4.0-RELEASE and are unchanged.

o ufs/ufs/extattr.h: UFS-specific extattr defines
o ufs/ufs/ufs_extattr.c: bulk of support routines
o ufs/{ufs,ffs,mfs}/*.[ch]: hooks and extattr.h includes
o contrib/softupdates/ffs_softdep.c: extattr.h includes
o conf/options, conf/files, i386/conf/LINT: added FFS_EXTATTR

o coda/coda_vfsops.c: XXX required extattr.h due to ufsmount.h
(This should not be the case, and will be fixed in a future commit)

Currently attributes are not supported in MFS.  This will be fixed.

Reviewed by:	adrian, bp, freebsd-fs, other unthanked souls
Obtained from:	TrustedBSD Project
2000-04-15 03:34:27 +00:00
Boris Popov
0f15f7aba5 Try to obtain timezone offset from an environment of mount program.
This helps in cases where CMOS clock set to UTC time.
2000-04-05 10:44:04 +00:00
Poul-Henning Kamp
c244d2de43 Move B_ERROR flag to b_ioflags and call it BIO_ERROR.
(Much of this done by script)

Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.

Move b_pblkno and b_iodone_chain to struct bio while we transition, they
will be obsoleted once bio structs chain/stack.

Add bio_queue field for struct bio aware disksort.

Address a lot of stylistic issues brought up by bde.
2000-04-02 15:24:56 +00:00
Matthew Dillon
7c58e473f5 Commit the buffer cache cleanup patch to 4.x and 5.x. This patch fixes a
fragmentation problem due to geteblk() reserving too much space for the
    buffer and imposes a larger granularity (16K) on KVA reservations for
    the buffer cache to avoid fragmentation issues.  The buffer cache size
    calculations have been redone to simplify them (fewer defines, better
    comments, less chance of running out of KVA).

    The geteblk() fix solves a performance problem that DG was able reproduce.

    This patch does not completely fix the KVA fragmentation problems, but
    it goes a long way

Mostly Reviewed by: bde and others
Approved by: jkh
2000-03-27 21:29:33 +00:00
Poul-Henning Kamp
b99c307a21 Rename the existing BUF_STRATEGY() to DEV_STRATEGY()
substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)

substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)

This patch is machine generated except for the ccd.c and buf.h parts.
2000-03-20 11:29:10 +00:00
Poul-Henning Kamp
21144e3bf1 Remove B_READ, B_WRITE and B_FREEBUF and replace them with a new
field in struct buf: b_iocmd.  The b_iocmd is enforced to have
exactly one bit set.

B_WRITE was bogusly defined as zero giving rise to obvious coding
mistakes.

Also eliminate the redundant struct buf flag B_CALL, it can just
as efficiently be done by comparing b_iodone to NULL.

Should you get a panic or drop into the debugger, complaining about
"b_iocmd", don't continue.  It is likely to write on your disk
where it should have been reading.

This change is a step in the direction towards a stackable BIO capability.

A lot of this patch were machine generated (Thanks to style(9) compliance!)

Vinum users:  Greg has not had time to test this yet, be careful.
2000-03-20 10:44:49 +00:00
Poul-Henning Kamp
db5f635acc Eliminate the undocumented, experimental, non-delivering and highly
dangerous MAX_PERF option.
2000-03-16 08:51:55 +00:00
Yoshihiro Takahashi
01f6cfbae0 Supported non-512 bytes/sector format.
PR:		misc/12992
Submitted by:	chi@bd.mbn.or.jp (Chiharu Shibata) and
		Dmitrij Tejblum <tejblum@arc.hq.cti.ru>
Reviewed by:	Dmitrij Tejblum <tejblum@arc.hq.cti.ru>
2000-01-27 14:43:07 +00:00
Robert Watson
8f0738756c Fix bde'isms in acl/extattr syscall interface, renaming syscalls to
prettier (?) names, adding some const's around here, et al.

Reviewed by:	bde
2000-01-19 06:07:34 +00:00
Boris Popov
1e77d7a17d Check if module was compiled without SMP support and running on
an SMP system.
2000-01-15 08:35:48 +00:00
Boris Popov
6cca21b14d Add VT_NWFS tag. 2000-01-15 08:28:03 +00:00
Bruce Evans
f9ad65b0f2 Forward declare some structs so that this header is more self-suifficent. 2000-01-14 19:54:42 +00:00
Bruce Evans
d3efeb2865 Use MALLOC_DECLARE when it is #defined, not when a (wrong) test of
__FreeBSD_version succeeds.
2000-01-14 19:47:07 +00:00
Poul-Henning Kamp
200988017c remove check now done in vn_isdisk(). 2000-01-10 12:24:36 +00:00
Poul-Henning Kamp
ba4ad1fcea Give vn_isdisk() a second argument where it can return a suitable errno.
Suggested by:	bde
2000-01-10 12:04:27 +00:00
Boris Popov
37713edc2d Treat negative uio_offset value as eof (idea by: bde).
Prevent overflows by casting uio_offset to uoff_t.
Return correct error number if directory entry is broken.

Reviewed by:	bde
2000-01-08 10:45:54 +00:00
Poul-Henning Kamp
e90d68c69c Return ENXIO if there is no device. 2000-01-02 15:16:17 +00:00
Boris Popov
70852092e8 Fix the mess with signed/unsigned longs and ints (inspired by bde).
Fix potential bug with directory reading.
Explicitly limit file size to 4GB (msdos can't handle larger files).
Slightly reorganize msdosfs_read() to reduce number of 'if's.
2000-01-02 03:30:42 +00:00
Peter Wemm
c447342094 Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot).  This is consistant with the other
BSD's who made this change quite some time ago.  More commits to come.
1999-12-29 05:07:58 +00:00
Boris Popov
687fce0361 Avoid to write garbage if uiomove fails. 1999-12-28 16:14:54 +00:00
Boris Popov
dc22f85f34 Fix an overflow in the msdosfs_read() function which exposed on the files
with size > 2GB.

PR:		15639
Submitted by:	Tim Kientzle <kientzle@acm.org>
Reviewed by:	phk
1999-12-28 15:34:23 +00:00
Boris Popov
499d3ffa94 It is possible that number of sectors specified in the BPB
will exceed FAT capacity. This will lead to kernel panic while other
systems just limit number of clusters.

PR:		4381, 15136
Reviewed by:	phk
1999-12-28 15:27:39 +00:00
Peter Wemm
e0f9d2869a Fix typo "," vs ";"
PR:		15696
Submitted by:	Takashi Okumura <taka@cs.pitt.edu>
1999-12-27 16:03:38 +00:00
Chris Costello
1b900f573c Fix a typo that was doing something kind of silly, and that is initializing
the creation time for files to the uninitialized value:

	vap->va_ctime = vap->va_ctime;

Changed to what was intended, assigning it to the modification time (thus
making all three values of access time, modification time and creation time
the same thing).

Reviewed by:	grog
1999-12-21 06:29:00 +00:00
Eivind Eklund
a6b174a83b Include vm/vm_extern.h to get at prototypes 1999-12-20 18:26:58 +00:00
Robert Watson
91f37dcba1 Second pass commit to introduce new ACL and Extended Attribute system
calls, vnops, vfsops, both in /kern, and to individual file systems that
require a vfsop_ array entry.

Reviewed by:	eivind
1999-12-19 06:08:07 +00:00
Eivind Eklund
762e6b856c Introduce NDFREE (and remove VOP_ABORTOP) 1999-12-15 23:02:35 +00:00
Peter Wemm
fc6e400716 Fix pointer problem for the Alpha 1999-12-12 21:10:53 +00:00
Boris Popov
6f6da2f326 Bump local version number to 1.3.4. 1999-12-12 05:53:02 +00:00
Eivind Eklund
6bdfe06ad9 Lock reporting and assertion changes.
* lockstatus() and VOP_ISLOCKED() gets a new process argument and a new
  return value: LK_EXCLOTHER, when the lock is held exclusively by another
  process.
* The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them
* Extend the vnode_if.src format to allow more exact specification than
  locked/unlocked.

This commit should not do any semantic changes unless you are using
DEBUG_VFS_LOCKS.

Discussed with:	grog, mch, peter, phk
Reviewed by:	peter
1999-12-11 16:13:02 +00:00
Peter Wemm
99b30c79b0 Don't simulate a pseudo address-space beyond VM_MAXUSER_ADDRESS that
maps onto the upages.  We used to use this extensively, particularly
for ps and gdb.  Both of these have been "fixed".  ps gets the p_stats
via eproc along with all the other stats, and gdb uses the regs, fpregs
etc files.

Once apon a time the UPAGES were mapped here, but that changed back
in January '96.  This essentially kills my revisions 1.16 and 1.17.
The 2-page "hole" above the stack can be reclaimed now.
1999-12-11 10:21:34 +00:00
Semen Ustimenko
daabca392e First version of HPFS stuff. 1999-12-09 19:10:13 +00:00
Poul-Henning Kamp
5ecdb702b0 Remove unused #includes.
Obtained from:	http://bogon.freebsd.dk/include
1999-12-08 08:59:40 +00:00
Søren Schmidt
cb2a8dffa0 Commit the kernel part of our DVD support. Nothing much to say really,
its just a number of new ioctl's, the rest is done in userland.
1999-12-07 22:25:28 +00:00
Semen Ustimenko
b17f083b0f Merged NetBSD version, as they have done improvements:
1. ntfs_read*attr*() functions now accept
	uio structure to eliminate one data copying.
	2. found and removed deadlock caused
	by 6 concurent ls -lR.
	3. started implementation of nromal
	Unicode<->unix recodeing.

Obtained from:	NetBSD
1999-12-03 20:37:40 +00:00
Kirk McKusick
e9cc475851 Collect read and write counts for filesystems. This new code
drops the counting in bwrite and puts it all in spec_strategy.
I did some tests and verified that the counts collected for writes
in spec_strategy is identical to the counts that we previously
collected in bwrite. We now also get read counts (async reads
come from requests for read-ahead blocks). Note that you need
to compile a new version of mount to get the read counts printed
out. The old mount binary is completely compatible, the only
reason to install a new mount is to get the read counts printed.

Submitted by:	Craig A Soules <soules+@andrew.cmu.edu>
Reviewed by:	Kirk McKusick <mckusick@mckusick.com>
1999-12-01 02:09:30 +00:00
Boris Popov
cb815af365 Remove abuse of struct nameidata.
Pointed by:	Eivind Eklund
1999-11-27 17:46:04 +00:00
Poul-Henning Kamp
a8704f8999 Add a sysctl to control if argv is disclosed to the world:
kern.ps_argsopen
It defaults to 1 which means that all users can see all argvs in ps(1).

Reviewed by:	Warner
1999-11-26 08:27:16 +00:00
Poul-Henning Kamp
a9e0361b4a Introduce the new function
p_trespass(struct proc *p1, struct proc *p2)
which returns zero or an errno depending on the legality of p1 trespassing
on p2.

Replace kern_sig.c:CANSIGNAL() with call to p_trespass() and one
extra signal related check.

Replace procfs.h:CHECKIO() macros with calls to p_trespass().

Only show command lines to process which can trespass on the target
process.
1999-11-21 19:03:20 +00:00
Boris Popov
4a22b7e60e Remove race condition under SMP.
Noted by:	Denis Kalinin <denis@mail.rbc.ru>
1999-11-21 16:35:29 +00:00
Poul-Henning Kamp
da654d9070 s/p_cred->pc_ucred/p_ucred/g 1999-11-21 12:38:21 +00:00
Sean Eric Fagan
13baacebcb A process should be able to examine itself. 1999-11-20 18:22:14 +00:00
Poul-Henning Kamp
0429e37ade struct mountlist and struct mount.mnt_list have no business being
a CIRCLEQ.  Change them to TAILQ_HEAD and TAILQ_ENTRY respectively.

This removes ugly  mp != (void*)&mountlist  comparisons.

Requested by:   phk
Submitted by:   Jake Burkholder jake@checker.org
PR:             14967
1999-11-20 10:00:46 +00:00
Peter Wemm
ac09d23cfa Fix an unused variable warning. 1999-11-18 09:07:30 +00:00
Peter Wemm
6224a63b8b Fix a warning. 1999-11-18 08:47:10 +00:00
Poul-Henning Kamp
6153cb2048 Make proc/*/cmdline use the cached argv if available.
Submitted by:   Paul Saab <paul@mu.org>
Reviewed by:    phk
1999-11-17 21:35:07 +00:00
Poul-Henning Kamp
3cf5d0fd07 The function `procfs_getattr()' in procfs doesn't set the value of
vap->va_fsid, so we cannot get valid information about procfs.

Submitted by:   SAWADA Mizuki miz@pa.aix.or.jp
Reviewed by:    phk
PR:     1654
1999-11-17 21:33:25 +00:00
Eivind Eklund
dd8c04f4c7 Remove WILLRELE from VOP_SYMLINK
Note: Previous commit to these files (except coda_vnops and devfs_vnops)
that claimed to remove WILLRELE from VOP_RENAME actually removed it from
VOP_MKNOD.
1999-11-13 20:58:17 +00:00
Eivind Eklund
edfe736df9 Remove WILLRELE from VOP_RENAME 1999-11-12 03:34:28 +00:00
Poul-Henning Kamp
698f9cf828 Next step in the device cleanup process.
Correctly lock vnodes when calling VOP_OPEN() from filesystem mount code.

Unify spec_open() for bdev and cdev cases.

Remove the disabled bdev specific read/write code.
1999-11-09 14:15:33 +00:00
Alan Cox
b561683329 Passing "0" or "FALSE" as the fourth argument to vm_fault is wrong. It
should be "VM_FAULT_NORMAL".
1999-11-09 01:44:28 +00:00
Poul-Henning Kamp
e89fecb4d8 remove a confusing and stale comment. 1999-11-08 13:52:57 +00:00
Poul-Henning Kamp
b454088b82 Oops, a bit too hasty there. 1999-11-08 13:08:02 +00:00
Poul-Henning Kamp
0ed43ec68c Various cleanups. 1999-11-08 09:59:34 +00:00
Sean Eric Fagan
75bd443641 Explain why Warner is right, and I am wrong, in the removing of the
file object.  Also explain some possible directions to re-implement it --
I'm not sure it should be, given the minimal application use.  (Other
than having the debugger automatically access the symbols for a process,
the main use I'd found was with some minor accounting ability, but _that_
depends on it being in the filesystem space; an ioctl access method would
be useless in that case.)

This is a code-less change; only a comment has been added.
1999-11-08 05:13:54 +00:00
Peter Wemm
1949905f8b Update for fileops.fo_stat() addition. Note, this would panic if
it saw a DTYPE_PIPE.  This isn't quite right but should stop a crash.
1999-11-08 03:36:29 +00:00
Poul-Henning Kamp
f7ee7bbb21 Use vop_panic() instead of spec_badop(). 1999-11-07 15:09:59 +00:00
Poul-Henning Kamp
be8479a836 Remove the iskmemdev() function. Make it the responsibility of the mem.c
drivers to enforce the securelevel checks.
1999-11-07 12:01:32 +00:00
Sean Eric Fagan
900e2da760 Make an incredibly stupid change because Warner threatened to do it and
continue doing it despite objections by me (the principal author).

Note that this doesn't fix the real problem -- the real problem is generally
bad setup by ignorant users, and education is the right way to fix it.

So while this doesn't actually solve the prolem mentioned in the complaint
(since it's still possible to do it via other methods, although they mostly
involve a bit more complicity), and there are better methods to do this,
nobody was willing or able to provide me with a real world example that
couldn't be worked around using the existing permissions and group
mechanism.  And therefore, security by removing features is the method of
the day.

I only had three applications that used it, in any event.  One of them would
have made debugging easier, but I still haven't finished it, and won't
now, so it doesn't really matter.
1999-11-07 07:52:02 +00:00
Archie Cobbs
60fffafdc3 Change structure field named 'toupper' to 'to_upper' to avoid conflict
with the macro of the same name.  Same thing for 'tolower'.
1999-11-02 22:46:42 +00:00
Mike Smith
6d14782861 Newline-terminate the complaint message about not being able to find
the root vnode pointer.
1999-11-01 23:57:28 +00:00
Poul-Henning Kamp
dc0f93d45d Remove specfs::vop_lookup() There is no code path which can call it. 1999-11-01 02:53:38 +00:00
Boris Popov
96a9a981cc Bump version number to sync with ncplib 1.3.3 1999-10-31 15:11:43 +00:00
Poul-Henning Kamp
923502ff91 useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>.  This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.
1999-10-29 18:09:36 +00:00
Matthew Dillon
dfbdd7d2b3 A tentative agreement has been reached in regards to a procedure
to remove 'b'lock devices.  The agreement is, essentially, that
    block devices will be collapsed into character devices as a first
    step (though I don't particularly agree), and raw device names 'rxxx'
    will become simply 'xxx' in devfs in the second step (i.e. no 'rxxx'
    names will exist).  The renaming will not effect the original /dev
    and the expectation is that devfs will eventually (but not immediately)
    become the standard way to access devices in the system.

    If it is determined that a reimplementation of block device access
    characteristics is beneficial, a number of alternatives will
    be possible that do not involve resurrecting the 'b'lock device class.
    For example, an ioctl() that might be made on an open character device
    descriptor or a generic buffered overlay device.

    This commit removes the blockdev disablement sysctl which does not
    apply to the solution that was reached.
1999-10-20 06:31:49 +00:00
Poul-Henning Kamp
8342096f29 Change the default for the vfs.bdev_buffered sysctl to zero.
This means that access to block devices nodes will act the
same as char device nodes for disk-like devices.

If you encounter problems after this, where programs accessing
disks directly fail to operate, please use the following command
to revert to previous behaviour:

        sysctl -w vfs.bdev_buffered=1

And verify that this was indeed the cause of your trouble.

See the mail-archives of the arch@FreeBSD.org list for background.
1999-10-18 16:59:50 +00:00
Boris Popov
2e60e8b92e Under some condition vnode can reference itself. 1999-10-14 09:35:37 +00:00
Boris Popov
51533e5859 Isolate old constant NCP_VOLNAME_LEN. 1999-10-14 08:57:54 +00:00
Boris Popov
cff51c813a Remove unnessary includes. 1999-10-12 10:37:00 +00:00
Poul-Henning Kamp
889fb68f34 remove unused #includes 1999-10-11 19:18:43 +00:00
Poul-Henning Kamp
1201869007 Add a couple of strategic KASSERTs 1999-10-08 19:07:23 +00:00
Poul-Henning Kamp
856de19089 Add back sysctl vfs.enable_userblk_io 1999-10-08 18:25:19 +00:00
Boris Popov
f05e3aac6f Put back cn_namelen initialization. Removed by phk in rev 1.2. 1999-10-07 12:18:12 +00:00
Poul-Henning Kamp
adab70d67a Warn once per driver about dev_t's not registered with make_dev(). 1999-10-04 12:33:05 +00:00
Poul-Henning Kamp
aa4f4b695e Move the buffered read/write code out of spec_{read|write} and into
two new functions spec_buf{read|write}.

Add sysctl vfs.bdev_buffered which defaults to 1 == true.  This
sysctl can be used to experimentally turn buffered behaviour for
bdevs off.  I should not be changed while any blockdevices are
open.  Remove the misplaced sysctl vfs.enable_userblk_io.

No other changes in behaviour.
1999-10-04 11:23:10 +00:00
Poul-Henning Kamp
3b6fb88590 Before we start to mess with the VFS name-cache clean things up a little bit:
Isolate the namecache in its own file, and give it a dedicated malloc type.
1999-10-03 12:18:29 +00:00
Boris Popov
dd166d3282 Import kernel part of ncplib: netncp and nwfs
Reviewed by:	msmith, peter
Obtained from:	ncplib
1999-10-02 04:06:24 +00:00
Poul-Henning Kamp
b89392e703 Remove the D_NOCLUSTER[RW] options which were added because vn had
problems.  Now that Matt has fixed vn, this can go.  The vn driver
should have used d_maxio (now si_iosize_max) anyway.
1999-09-30 07:11:30 +00:00
Poul-Henning Kamp
1b5464ef9d Remove v_maxio from struct vnode.
Replace it with mnt_iosize_max in struct mount.

Nits from:	bde
1999-09-29 20:05:33 +00:00
Marcel Moolenaar
2c42a14602 sigset_t change (part 2 of 5)
-----------------------------

The core of the signalling code has been rewritten to operate
on the new sigset_t. No methodological changes have been made.
Most references to a sigset_t object are through macros (see
signalvar.h) to create a level of abstraction and to provide
a basis for further improvements.

The NSIG constant has not been changed to reflect the maximum
number of signals possible. The reason is that it breaks
programs (especially shells) which assume that all signals
have a non-null name in sys_signame. See src/bin/sh/trap.c
for an example. Instead _SIG_MAXSIG has been introduced to
hold the maximum signal possible with the new sigset_t.

struct sigprop has been moved from signalvar.h to kern_sig.c
because a) it is only used there, and b) access must be done
though function sigprop(). The latter because the table doesn't
holds properties for all signals, but only for the first NSIG
signals.

signal.h has been reorganized to make reading easier and to
add the new and/or modified structures. The "old" structures
are moved to signalvar.h to prevent namespace polution.

Especially the coda filesystem suffers from the change, because
it contained lines like (p->p_sigmask == SIGIO), which is easy
to do for integral types, but not for compound types.

NOTE: kdump (and port linux_kdump) must be recompiled.

Thanks to Garrett Wollman and Daniel Eischen for pressing the
importance of changing sigreturn as well.
1999-09-29 15:03:48 +00:00
Matthew Dillon
e3a285c715 Make sure file after VOP_OPEN is VMIO'd when transfering control from
a lower layer to an upper layer.  I'm not sure how necessary this is
    for reading.

    Fix bug in union_lookup() (note: there are probably still several bugs
    in union_lookup()).  This one set lerror as a side effect without
    setting lowervp, causing copyup code further on down to crash on a null
    lowervp pointer.  Changed the side effect to use a temporary variable
    instead.
1999-09-28 05:48:39 +00:00
Matthew Dillon
2a31267e43 This is a major fixup of unionfs. At least 30 serious bugs have been
fixed (many due to changing semantics in other parts of the kernel and not
    the original author's fault), including one critical one: unionfs could
    cause UFS corruption in the fronting store due to calling VOP_OPEN for
    writing without turning on vmio for the UFS vnode.

    Most of the bugs were related to semantics changes in VOP calls, lock
    ordering problems (causing deadlocks), improper handling of a read-only
    backing store (such as an NFS mount), improper referencing and locking
    of vnodes, not using real struct locks for vnode locking, not using
    recursive locks when accessing the fronting store, and things like that.

    New functionality has been added:  unionfs now has mmap() support, but
    only partially tested, and rename has been enhanced considerably.

    There are still some things that unionfs cannot do.   You cannot
    rename a directory without confusing unionfs, and there are issues
    with softlinks, hardlinks, and special files.  unionfs mostly doesn't
    understand them (and never did).

    There are probably still panic situations, but hopefully no where near
    as many as before this commit.

    The unionfs in this commit has been tested overlayed on /usr/src
    (backing /usr/src being a read-only NFS mount, fronting /usr/src being
    a local filesystem).  kernel builds have been tested, buildworld is
    undergoing testing.  More testing is necessary.
1999-09-26 20:52:41 +00:00
Poul-Henning Kamp
9ab8e01991 Remove a warning check which was too general. 1999-09-25 18:52:03 +00:00
Poul-Henning Kamp
d6a0e38a1b Remove five now unused fields from struct cdevsw. They should never
have been there in the first place.  A GENERIC kernel shrinks almost 1k.

Add a slightly different safetybelt under nostop for tty drivers.

Add some missing FreeBSD tags
1999-09-25 18:24:47 +00:00
Poul-Henning Kamp
ae8e1d08d7 This patch clears the way for removing a number of tty related
fields in struct cdevsw:

        d_stop          moved to struct tty.
        d_reset         already unused.
        d_devtotty      linkage now provided by dev_t->si_tty.

These fields will be removed from struct cdevsw together with
d_params and d_maxio Real Soon Now.

The changes in this patch consist of:

        initialize dev->si_tty in *_open()
        initialize tty->t_stop
        remove devtotty functions
        rename ttpoll to ttypoll
        a few adjustments to these changes in the generic code
        a bump of __FreeBSD_version
        add a couple of FreeBSD tags
1999-09-25 16:21:39 +00:00
Poul-Henning Kamp
c428d4c048 Kill the cdevsw->d_maxio field.
d_maxio is replaced by the dev->si_iosize_max field which the driver
should be set in all calls to cdevsw->d_open if it has a better
idea than the system wide default.

The field is a generic dev_t field (ie: not disk specific) so that
tapes and other devices can use physio as well.
1999-09-22 19:56:14 +00:00
Matthew Dillon
67ddfcaf69 More removals of vnode->v_lastr, replaced by preexisting seqcount
heuristic to detect sequential operation.

    VM-related forced clustering code removed from ufs in preparation for a
    commit to vm/vm_fault.c that does it more generally.

Reviewed by: David Greenman <dg@root.com>, Alan Cox <alc@cs.rice.edu>
1999-09-20 23:27:58 +00:00
Matthew Dillon
8411bf657d Fix handling of a device EOF that occurs in the middle of a block. The
transfer size calculation was incorrect resulting in the last read being
    potentially larger then the actual extent of the device.

    EOF and write handling has not yet been fixed.

Reviewed by:	Tor.Egge@fast.no
1999-09-20 23:17:47 +00:00
Poul-Henning Kamp
fae03f66d1 Step one of replacing devsw->d_maxio with si_bsize_max.
Rename dev->si_bsize_max to si_iosize_max and set it in spec_open
if the device didn't.

Set vp->v_maxio from dev->si_bsize_max in spec_open rather than
in ufs_bmap.c
1999-09-20 19:57:28 +00:00
Matthew Dillon
bb01f28e97 Add vfs.enable_userblk_io sysctl to control whether user reads and writes
to buffered block devices are allowed.  The default is to be backwards
    compatible, i.e. reads and writes are allowed.

    The idea is for a larger crowd to start running with this disabled and
    see what problems, if any, crop up, and then to change the default to
    off and see if any problems crop up in the next 6 months prior to
    potentially removing support entirely.  There are still a few people,
    Julian and myself included, who believe the buffered block device
    access from usermode to be useful.

    Remove use of vnode->v_lastr from buffered block device I/O in
    preparation for removal of vnode->v_lastr field, replacing it with
    the already existing seqcount metric to detect sequential operation.

Reviewed by:	Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>
1999-09-17 06:10:27 +00:00
Alfred Perlstein
c24fda81c9 Seperate the export check in VFS_FHTOVP, exports are now checked via
VFS_CHECKEXP.

Add fh(open|stat|stafs) syscalls to allow userland to query filesystems
based on (network) filehandle.

Obtained from:	NetBSD
1999-09-11 00:46:08 +00:00
Julian Elischer
85a219d201 Changes to centralise the default blocksize behaviour.
More likely to follow.

Submitted by: phk@freebsd.org
1999-09-09 19:08:44 +00:00
Alfred Perlstein
5a5fccc8e7 All unimplemented VFS ops now have entries in kern/vfs_default.c that return
reasonable defaults.

This avoids confusing and ugly casting to eopnotsupp or making dummy functions.
Bogus casting of filesystem sysctls to eopnotsupp() have been removed.

This should make *_vfsops.c more readable and reduce bloat.

Reviewed by:	msmith, eivind
Approved by:	phk
Tested by:	Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>
1999-09-07 22:42:38 +00:00
Bruce Evans
a0f40f54aa Get rid of the NULLFS_DIAGNOSTIC option. This option was as useful as
the other XXXFS_DIAGNOSTIC options (not very) and mostly controlled
tracing of normal operation.  Use `#ifdef DEBUG' for non-diagnostics
and `#ifdef DIAGNOSTIC' for diagnostics.
1999-09-04 12:35:09 +00:00
Bruce Evans
622163afbe Fixed the previous change. Some more code controlled by UMAPFS_DIAGNOSTIC
is actually for diagnostics; control it with DIAGNOSTIC and not DDB.
1999-09-04 11:51:41 +00:00
Julian Elischer
1f7e07329d Print out the device name when there is an uninitialised IO size or IO error
in spec_getpages().

Submitted by:	phk suggested the idea.
1999-09-03 09:14:36 +00:00
Julian Elischer
c5d6480694 Add a catchall to set default blocksize values for disk like devices.
Submitted by:	phk@freebsd.org
1999-09-03 08:26:46 +00:00
Julian Elischer
7012bab988 Revert a bunch of contraversial changes by PHK. After
a quick think and discussion among various people some form of some of
these changes will probably be recommitted.

The reversion requested was requested by dg while discussions proceed.
PHK has indicated that he can live with this, and it has been agreed
that some form of some of these changes may return shortly after further
discussion.
1999-09-03 05:16:59 +00:00
Poul-Henning Kamp
fd559c6fc1 Fix the sense of the vn_isdisk() check. 1999-09-01 15:17:18 +00:00
Poul-Henning Kamp
3b7df19ba4 Set the buffersize for non BSDFFS labeled partitions to
max(dev->si_bsize_phys, BLKDEV_IOSIZE).

Requested by:   davidg
1999-08-31 21:46:42 +00:00
Poul-Henning Kamp
586e1b7b46 Make buffered acces to bdevs from userland controllable with
a sysctl vfs.bdev_access.
1999-08-31 21:01:57 +00:00
Poul-Henning Kamp
02e1576966 Make bdev userland access work like cdev userland access unless
the highly non-recommended option ALLOW_BDEV_ACCESS is used.

(bdev access is evil because you don't get write errors reported.)

Kill si_bsize_best before it kills Matt :-)

Use the specfs routines rather having cloned copies in devfs.
1999-08-30 07:56:23 +00:00
Bruce Evans
4047cd0bb2 Converted the silly SAFTEY option into a new-style option by renaming it to
DIAGNOSTIC.

Fixed an English style bug in the panic messages controlled by SAFETY.
1999-08-30 07:08:04 +00:00
Bruce Evans
8021ba879a Changed old-style option UNION_DIAGNOSTIC to DEBUG and fixed printf
format errors exposed by this.  It has nothing to do with diagnostics
since it does little more than control tracing of normal operation.
Actual diagnostics for the union file system are still controlled by
the DIAGNOSTIC option.
1999-08-29 10:03:35 +00:00
Bruce Evans
b00b6d73dd Changed old-style options UMAPFS_DIAGNOSTIC and UMAP_DIAGNOSTIC to DEBUG
or DDB and fixed printf format errors exposed by this.  The options had
little to do with diagnostics; they mostly controlled tracing of normal
operation.
1999-08-29 09:54:17 +00:00