Commit Graph

103 Commits

Author SHA1 Message Date
Robert Watson
74237f55b0 Part I: Update extended attribute API and ABI:
o Modify the system call syntax for extattr_{get,set}_{fd,file}() so
  as not to use the scatter gather API (which appeared not to be used
  by any consumers, and be less portable), rather, accepts 'data'
  and 'nbytes' in the style of other simple read/write interfaces.
  This changes the API and ABI.

o Modify system call semantics so that extattr_get_{fd,file}() return
  a size_t.  When performing a read, the number of bytes read will
  be returned, unless the data pointer is NULL, in which case the
  number of bytes of data are returned.  This changes the API only.

o Modify the VOP_GETEXTATTR() vnode operation to accept a *size_t
  argument so as to return the size, if desirable.  If set to NULL,
  the size will not be returned.

o Update various filesystems (pseodofs, ufs) to DTRT.

These changes should make extended attributes more useful and more
portable.  More commits to rebuild the system call files, as well
as update userland utilities to follow.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-02-10 04:43:22 +00:00
Bruce Evans
860965f144 Made osigreturn(2) standard so that SYS_osigreturn can be used in the
signal trampoline for old signals.  The arches that support old signals
currently abuse sigreturn(2) instead.  This mainly complicates things
and slightly breaks the the new sigreturn(2).

COMPAT is too limited to support the correct configuration of osigreturn,
and this commit doesn't attempt to fix it; it just moves the bogusness:
osigreturn() must now be provided unconditionally even on arches that
don't really need it; previously it had to be provided under the bogus
condition defined(COMPAT_43).
2002-02-01 17:27:14 +00:00
Alfred Perlstein
21d56e9c33 Make AIO a loadable module.
Remove the explicit call to aio_proc_rundown() from exit1(), instead AIO
will use at_exit(9).

Add functions at_exec(9), rm_at_exec(9) which function nearly the
same as at_exec(9) and rm_at_exec(9), these functions are called
on behalf of modules at the time of execve(2) after the image
activator has run.

Use a modified version of tegge's suggestion via at_exec(9) to close
an exploitable race in AIO.

Fix SYSCALL_MODULE_HELPER such that it's archetecuterally neutral,
the problem was that one had to pass it a paramater indicating the
number of arguments which were actually the number of "int".  Fix
it by using an inline version of the AS macro against the syscall
arguments.  (AS should be available globally but we'll get to that
later.)

Add a primative system for dynamically adding kqueue ops, it's really
not as sophisticated as it should be, but I'll discuss with jlemon when
he's around.
2001-12-29 07:13:47 +00:00
Poul-Henning Kamp
c60693dbd3 Reserve 378 for the new mount syscall Maxime Henrion <mux@qualys.com>
is working on.  (This is to get us more than 32 mountoptions).
2001-11-02 17:58:26 +00:00
Robert Watson
b55abfd929 o Reserve system call 377 for afs_syscall; by reserving a system call
number, portable OpenAFS applications don't have to attempt to determine
  what system call number was dynamically allocated.  No system call
  prototype or implementation is defined.

Requested by:	Tom Maher <tardis@watson.org>
2001-10-13 13:19:34 +00:00
Robert Watson
9c94f7731e o Introduce eaccess(2), a version of access(2) that uses the effective
credentials rather than the real credentials.  This is useful for
  implementing GUI's which need to modify icons based on access rights,
  but where use of open(2) is too expensive, use of stat(2) doesn't
  reflect the file system's real protection model, and use of
  access() suffers from real/effective credential confusion.  This
  implementation provides the same semantics as the call of the same
  name on SCO OpenServer.  Note: using this call improperly can
  leave you subject to some of the same races present in the
  access(2) call.
o To implement this, break out the basic logic of access(2) into
  vpaccess(), which accepts a passed credential to perform the
  invocation of VOP_ACCESS().  Add eaccess(2) to invoke vpaccess(),
  and modify access(2) to use vpaccess().

Obtained from:	TrustedBSD Project
2001-09-21 21:33:22 +00:00
Peter Wemm
eb25edbda3 Cleanup and split of nfs client and server code.
This builds on the top of several repo-copies.
2001-09-18 23:32:09 +00:00
Matthew Dillon
257d198890 Synchronize syscalls.master(s) with recent Giant pushdown work 2001-09-01 19:36:48 +00:00
Matthew Dillon
918c3b1361 Make yield() MPSAFE.
Synchronize syscalls.master with all MPSAFE changes to date.  Synchronize
new syscall generation follows because yield() will panic if it is out
of sync with syscalls.master.
2001-09-01 03:54:09 +00:00
Matthew Dillon
df9987602f Giant pushdown syscalls in kern/uipc_syscalls.c. Affected calls:
recvmsg(), sendmsg(), recvfrom(), accept(), getpeername(), getsockname(),
socket(), connect(), accept(), send(), recv(), bind(), setsockopt(), listen(),
sendto(), shutdown(), socketpair(), sendfile()
2001-08-31 00:37:34 +00:00
Matthew Dillon
b6a4b4f9ae Giant Pushdown: sysv shm, sem, and msg calls. 2001-08-31 00:02:18 +00:00
Matthew Dillon
356861db03 Remove the MPSAFE keyword from the parser for syscalls.master.
Instead introduce the [M] prefix to existing keywords.  e.g.
MSTD is the MP SAFE version of STD.  This is prepatory for a
massive Giant lock pushdown.  The old MPSAFE keyword made
syscalls.master too messy.

Begin comments MP-Safe procedures with the comment:
/*
 * MPSAFE
 */
This comments means that the procedure may be called without
Giant held (The procedure itself may still need to obtain
Giant temporarily to do its thing).

sv_prepsyscall() is now MP SAFE and assumed to be MP SAFE
sv_transtrap() is now MP SAFE and assumed to be MP SAFE

ktrsyscall() and ktrsysret() are now MP SAFE (Giant Pushdown)
trapsignal() is now MP SAFE (Giant Pushdown)

Places which used to do the if (mtx_owned(&Giant)) mtx_unlock(&Giant)
test in syscall[2]() in */*/trap.c now do not.  Instead they
explicitly unlock Giant if they previously obtained it, and then
assert that it is no longer held to catch broken system calls.

Rebuild syscall tables.
2001-08-30 18:50:57 +00:00
Poul-Henning Kamp
b63436919d Remove a comment which was past its shelf life.
PR:		18750
Submitted by:	Tony Finch <dot@dotat.at>
2001-05-29 09:22:22 +00:00
Alfred Perlstein
2395531439 Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level
vm operations.

faults can not be taken without holding Giant.

Memory subsystems can now call the base page allocators safely.

Almost all atomic ops were removed as they are covered under the
vm mutex.

Alpha and ia64 now need to catch up to i386's trap handlers.

FFS and NFS have been tested, other filesystems will need minor
changes (grabbing the vm lock when twiddling page properties).

Reviewed (partially) by: jake, jhb
2001-05-19 01:28:09 +00:00
Tor Egge
b4b469e6bb gettimeofday() is MP safe on both -current and -stable. 2001-05-11 17:05:12 +00:00
Robert Watson
130d0157d1 o Introduce a new system call, __setsugid(), which allows a process to
toggle the P_SUGID bit explicitly, rather than relying on it being
  set implicitly by other protection and credential logic.  This feature
  is introduced to support inter-process authorization regression testing
  by simplifying userland credential management allowing the easy
  isolation and reproduction of authorization events with specific
  security contexts.  This feature is enabled only by "options REGRESSION"
  and is not intended to be used by applications.  While the feature is
  not known to introduce security vulnerabilities, it does allow
  processes to enter previously inaccessible parts of the credential
  state machine, and is therefore disabled by default.  It may not
  constitute a risk, and therefore in the future pending further analysis
  (and appropriate need) may become a published interface.

Obtained from:	TrustedBSD Project
2001-04-11 20:20:40 +00:00
Robert Watson
fec605c882 o Introduce extattr_{delete,get,set}_fd() to allow extended attribute
operations on file descriptors, which complement the existing set of
  calls, extattr_{delete,get,set}_file() which act on paths.  In doing
  so, restructure the system call implementation such that the two sets
  of functions share most of the relevant code, rather than duplicating
  it.  This pushes the vnode locking into the shared code, but keeps
  the copying in of some arguments in the system call code.  Allowing
  access via file descriptors reduces the opportunity for race
  conditions when managing extended attributes.

Obtained from:	TrustedBSD Project
2001-03-31 16:20:05 +00:00
Robert Watson
3063207147 o Rename "namespace" argument to "attrnamespace" as namespace is a C++
reserved word.

Submitted by:	jkh
Obtained from:	TrustedBSD Project
2001-03-19 05:44:15 +00:00
Robert Watson
70f3685105 o Change the API and ABI of the Extended Attribute kernel interfaces to
introduce a new argument, "namespace", rather than relying on a first-
  character namespace indicator.  This is in line with more recent
  thinking on EA interfaces on various mailing lists, including the
  posix1e, Linux acl-devel, and trustedbsd-discuss forums.  Two namespaces
  are defined by default, EXTATTR_NAMESPACE_SYSTEM and
  EXTATTR_NAMESPACE_USER, where the primary distinction lies in the
  access control model: user EAs are accessible based on the normal
  MAC and DAC file/directory protections, and system attributes are
  limited to kernel-originated or appropriately privileged userland
  requests.

o These API changes occur at several levels: the namespace argument is
  introduced in the extattr_{get,set}_file() system call interfaces,
  at the vnode operation level in the vop_{get,set}extattr() interfaces,
  and in the UFS extended attribute implementation.  Changes are also
  introduced in the VFS extattrctl() interface (system call, VFS,
  and UFS implementation), where the arguments are modified to include
  a namespace field, as well as modified to advoid direct access to
  userspace variables from below the VFS layer (in the style of recent
  changes to mount by adrian@FreeBSD.org).  This required some cleanup
  and bug fixing regarding VFS locks and the VFS interface, as a vnode
  pointer may now be optionally submitted to the VFS_EXTATTRCTL()
  call.  Updated documentation for the VFS interface will be committed
  shortly.

o In the near future, the auto-starting feature will be updated to
  search two sub-directories to the ".attribute" directory in appropriate
  file systems: "user" and "system" to locate attributes intended for
  those namespaces, as the single filename is no longer sufficient
  to indicate what namespace the attribute is intended for.  Until this
  is committed, all attributes auto-started by UFS will be placed in
  the EXTATTR_NAMESPACE_SYSTEM namespace.

o The default POSIX.1e attribute names for ACLs and Capabilities have
  been updated to no longer include the '$' in their filename.  As such,
  if you're using these features, you'll need to rename the attribute
  backing files to the same names without '$' symbols in front.

o Note that these changes will require changes in userland, which will
  be committed shortly.  These include modifications to the extended
  attribute utilities, as well as to libutil for new namespace
  string conversion routines.  Once the matching userland changes are
  committed, a buildworld is recommended to update all the necessary
  include files and verify that the kernel and userland environments
  are in sync.  Note: If you do not use extended attributes (most people
  won't), upgrading is not imperative although since the system call
  API has changed, the new userland extended attribute code will no longer
  compile with old include files.

o Couple of minor cleanups while I'm there: make more code compilation
  conditional on FFS_EXTATTR, which should recover a bit of space on
  kernels running without EA's, as well as update copyright dates.

Obtained from:	TrustedBSD Project
2001-03-15 02:54:29 +00:00
Jake Burkholder
86360fee54 Remove thr_sleep and thr_wakeup. Remove fields p_nthread and p_wakeup
from struct proc, which are now unused (p_nthread already was).
Remove process flag P_KTHREADP which was untested and only set
in vfs_aio.c (it should use kthread_create).  Move the yield
system call to kern_synch.c as kern_threads.c has been removed
completely.

moral support from:	alfred, jhb
2000-12-02 05:41:30 +00:00
Alfred Perlstein
78525ce318 sysvipc loadable.
new syscall entry lkmressys - "reserved loadable syscall"

Make syscall_register allow overwriting of such entries (lkmressys).
2000-12-01 08:57:47 +00:00
Marcel Moolenaar
ae51d56ce1 Fix prototypes for {o|}{g|s}etrlimit. A recent change in the
Linuxulator caused this bug to trigger.
2000-08-28 07:50:44 +00:00
Peter Wemm
4e0f152bbe Sigh. Fix SYS_exit problems. I misunderstood the significance of these
trailing options.
2000-07-29 10:05:25 +00:00
Peter Wemm
ac2b067b9a Change the 'exit()' system call to 'sys_exit()'. This avoids overlapping
gcc's internal exit() prototypes and the (futile) hackery that we did to
try and avoid warnings.  main() was renamed for similar reasons.
Remove an exit related hack from makesyscalls.sh.
2000-07-29 00:16:28 +00:00
Jonathan Lemon
a8e65b915e Simplify kqueue API slightly.
Discussed on:	-arch
2000-07-18 19:31:52 +00:00
Robert Watson
92eebb8a9b o Introduce syscall prototypes, stubs for __cap_{get,set}_{fd,file},
syscalls to manage capability sets on files.  First of two commits.

Obtained from:	TrustedBSD Project
2000-07-13 20:31:24 +00:00
Robert Watson
b09b66abf6 Introduce syscalls for process capability manipulation. Currently backs
onto already committed stubs.  Commit one of two.

Reviewed by:	Damned if I can remember.  Many people.
Obtained from:	TrustedBSD Project
2000-06-15 23:08:17 +00:00
Bruce Evans
aa4b7eae22 Fixed the declaration of mmap(). The crufty padding arg had the wrong
type.  This gave an inconsistent amount of crufty padding on i386's with
64-bit longs (8 bytes instead of 4).  On alphas it gives a consistent
amount of crufty padding (8 bytes) in addition to the 4 bytes of normal
padding caused by passing int args as register_t's.

Fixed the args struct tag for the NOPROTO syscalls (netbsd_lchown() and
netbsd_msync()).  The tag is currently unused for NOPROTO syscalls, so
the bug has no effect, but it will be used even in the NOPROTO case to
calculate sy_nargs correctly.
2000-05-09 08:31:06 +00:00
Peter Wemm
39e4c0c888 Remove undocumented broken-as-designed semconfig() syscall. 2000-05-01 11:11:44 +00:00
Jonathan Lemon
cb679c385e Introduce kqueue() and kevent(), a kernel event notification facility. 2000-04-16 18:53:38 +00:00
Alfred Perlstein
c01df63183 Make makesyscalls.sh parse an optional field 'MPSAFE' that specifies
that a syscall does not want the BGL to be grabbed automatically.

Add the new MPSAFE flag to the syscalls that dillon has determined to
be MPSAFE.
2000-04-03 06:36:14 +00:00
Robert Watson
5134b3e92a Fix bde'isms in acl/extattr syscall interface, renaming syscalls to
prettier (?) names, adding some const's around here, et al.

Commit 1 out of 3.

Reviewed by:	bde
2000-01-19 06:01:07 +00:00
Peter Wemm
8ccd633455 Implement setres[ug]id() and getres[ug]id(). This has been sitting in
my tree for ages (~2 years) waiting for an excuse to commit it.  Now Linux
has implemented it and it seems that Staroffice (when using the
linux_base6.1 port's libc) calls this in the linux emulator and dies in
setup.  The Linux emulator can call these now.
2000-01-16 16:34:26 +00:00
Jason Evans
bfbbc4aa44 Add aio_waitcomplete(). Make aio work correctly for socket descriptors.
Make gratuitous style(9) fixes (me, not the submitter) to make the aio
code more readable.

PR:		kern/12053
Submitted by:	Chris Sedore <cmsedore@maxwell.syr.edu>
2000-01-14 02:53:29 +00:00
Alfred Perlstein
20883b0f10 make getfh a standard syscall instead of dependant on having
NFSSERVER defined, useful for userland fileservers that want to
use a filehandle type interface to the filesystem.

Submitted by: Assar Westerlund assar@stacken.kth.se
PR: kern/15452
1999-12-21 20:21:12 +00:00
Robert Watson
ef351daa32 First pass commit to introduce new ACL and Extended Attribute system calls.
The second pass commit with all the supporting code will happen shortly
afterwards.

Reviewed by:	eivind
1999-12-19 05:54:46 +00:00
Brian Somers
b08210f5fa modfind(char *) -> modfind(const char *)
Reminded by:	dfr
1999-11-17 21:32:40 +00:00
Marcel Moolenaar
b7d8512385 Now that userland including modules don't use the osig* syscalls,
make them of type COMPAT.
1999-10-12 09:29:53 +00:00
Marcel Moolenaar
da3605dbea sigset_t change (part 1 of 5)
-----------------------------

Rename sigaction, sigprocmask, sigpending and sigsuspend to
osigaction, osigprocmask, osigpending and osigsuspend (resp)
and add new syscalls for them to support the new sisgset_t
without breaking existing binaries.

Change the prototype of sigaltstack to use the typedef stack_t
instead of struct sigaltstack to reflect that it is SUSv2
compliant.

Also, rename sigreturn to osigreturn and add a new syscall
to support the modified stackframe. The change is caused by
sigreturn operating on ucontext_t now and the fact that
siginfo_t has been updated to conform to SUSv2.
1999-09-29 15:01:21 +00:00
Alfred Perlstein
c24fda81c9 Seperate the export check in VFS_FHTOVP, exports are now checked via
VFS_CHECKEXP.

Add fh(open|stat|stafs) syscalls to allow userland to query filesystems
based on (network) filehandle.

Obtained from:	NetBSD
1999-09-11 00:46:08 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Nik Clayton
2395507999 Add CPT_NOA, LIBCOMPAT, NODEF, NOARGS, NOPROTO, and NOIMPL to the commented
list of available types.

PR:             docs/13007
Submitted by:   Assar Westerlund <assar@sics.se>
1999-08-11 22:13:46 +00:00
Jordan K. Hubbard
45f26d4120 Move syscall 180 back to where it was before and fix the
incorrect comment which led me to move it in the first place.
1999-08-05 08:18:45 +00:00
Jordan K. Hubbard
b24eb2795d Reserve a syscall for the arla folks. I'm assuming that since syscalls.c
and init_sysent.c are checked into CVS, I should also commit the regenerated
copies even though they're built by syscalls.master.  Correct?  Bruce? :)
1999-08-04 20:04:25 +00:00
Bruce Evans
f664346fbe Fixed nonsense arg type `const caddr_t' in the prototype() for utrace().
Changed to `const void *'.  utrace() is undocumented, so nothing should
notice.

Fixed missing consts for utrace() and ktrace() in syscalls.master.

sys/ktrace.h is missing some Lite2 changes of shorts to ints.
1999-05-13 09:09:37 +00:00
Poul-Henning Kamp
02daf150a4 Add the jail system call. 1999-04-28 11:28:49 +00:00
Dmitrij Tejblum
8fe387ab84 Add standard padding argument to pread and pwrite syscall. That should make them
NetBSD compatible.

Add parameter to fo_read and fo_write. (The only flag FOF_OFFSET mean that
the offset is set in the struct uio).

Factor out some common code from read/pread/write/pwrite syscalls.
1999-04-04 21:41:28 +00:00
Alan Cox
4160ccd978 Added pread and pwrite. These functions are defined by the X/Open
Threads Extension.  (Note: We use the same syscall numbers as NetBSD.)

Submitted by:	John Plevyak <jplevyak@inktomi.com>
1999-03-27 21:16:58 +00:00
Peter Wemm
325e13dd19 A kldsym(2) syscall prototype for extracting information from the in-kernel
linker.  This is intended to replace kvm_mkdb etc.  The first version
only does name->value lookups, but it's open ended.  value->name lookups
would probably be a good thing to do too.

It's been suggested to try and connect the symbol tables to sysctl (which
is probably a more flexible way of doing it if it's done right), but that
is far more complex and difficult than I was ready to have a shot at.
1998-11-11 12:45:14 +00:00
David Greenman
dd0b2081f4 Implemented zero-copy TCP/IP extensions via sendfile(2) - send a
file to a stream socket. sendfile(2) is similar to implementations in
HP-UX, Linux, and other systems, but the API is more extensive and
addresses many of the complaints that the Apache Group and others have
had with those other implementations. Thanks to Marc Slemko of the
Apache Group for helping me work out the best API for this.
Anyway, this has the "net" result of speeding up sends of files over
TCP/IP sockets by about 10X (that is to say, uses 1/10th of the CPU
cycles) when compared to a traditional read/write loop.
1998-11-05 14:28:26 +00:00