Commit Graph

408 Commits

Author SHA1 Message Date
Warner Losh
42b3331b8a Fix compiling FREEBSD_COMPAT[4,5,6] without FREEBSD_COMPAT7.
Note: Not sure this is the right way to do compat, but it makes the
headers consistent with the implementations.
2009-12-16 17:17:40 +00:00
Konstantin Belousov
9afcc9986c Regenerate. 2009-12-04 21:53:20 +00:00
Konstantin Belousov
195edf8a5a Add several syscall compat32 entries for acl manipulation.
They do not require translation of the arguments.

Tested by:	bsam
MFC after:	1 week
2009-12-04 21:52:31 +00:00
Konstantin Belousov
063e9958a6 Regenerate 2009-10-27 11:02:04 +00:00
Konstantin Belousov
066d836b02 Current pselect(3) is implemented in usermode and thus vulnerable to
well-known race condition, which elimination was the reason for the
function appearance in first place. If sigmask supplied as argument to
pselect() enables a signal, the signal might be delivered before thread
called select(2), causing lost wakeup. Reimplement pselect() in kernel,
making change of sigmask and sleep atomic.

Since signal shall be delivered to the usermode, but sigmask restored,
set TDP_OLDMASK and save old mask in td_oldsigmask. The TDP_OLDMASK
should be cleared by ast() in case signal was not gelivered during
syscall execution.

Reviewed by:	davidxu
Tested by:	pho
MFC after:	1 month
2009-10-27 10:55:34 +00:00
Konstantin Belousov
d6e029adbe In r197963, a race with thread being selected for signal delivery
while in kernel mode, and later changing signal mask to block the
signal, was fixed for sigprocmask(2) and ptread_exit(3). The same race
exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls.

Use kern_sigprocmask() instead of direct manipulation of td_sigmask to
reschedule newly blocked signals, closing the race.

Reviewed by:	davidxu
Tested by:	pho
MFC after:	1 month
2009-10-27 10:47:58 +00:00
Konstantin Belousov
84440afb54 In kern_sigsuspend(), better manipulate thread signal mask using
kern_sigprocmask() to properly notify other possible candidate threads
for signal delivery.

Since sigsuspend() shall only return to usermode after a signal was
delivered, do cursig/postsig loop immediately after waiting for
signal, repeating the wait if wakeup was spurious due to race with
other thread fetching signal from the process queue before us. Add
thread_suspend_check() call to allow the thread to be stopped or killed
while in loop.

Modify last argument of kern_sigprocmask() from boolean to flags,
allowing the function to be called with locked proc. Convertion of the
callers that supplied 1 to the old argument will be done in the next
commit, and due to SIGPROCMASK_OLD value equial to 1, code is formally
correct in between.

Reviewed by:	davidxu
Tested by:	pho
MFC after:	1 month
2009-10-27 10:42:24 +00:00
Robert Watson
718dbcfeaa Regenerate system call files following r197636. 2009-09-30 08:48:59 +00:00
Robert Watson
72c35fc69c Reserve system call numbers for Capsicum security framework capabilities,
capability mode, and process descriptors: cap_new, cap_getrights, cap_enter,
cap_getmode, pdfork, pdkill, pdgetpid, and pdwait.

Obtained from:	TrustedBSD Project
Sponsored by:	Google
MFC after:	3 weeks
2009-09-30 08:46:01 +00:00
Dag-Erling Smørgrav
edefbdd7cb If a certain feature that was present in FreeBSD 7 was removed or changed in
FreeBSD 8, the compatibility shims should be built not just when FreeBSD 7
compatibility is requested, but also when compatibility with any older
FreeBSD version where that feature was present is requested.o

Without this patch, a kernel config that sets COMPAT_FREEBSD6 but not *7
would fail to build due to inconsistencies between the declaration of the
compatibility shims and their use in the SysV code.

There are similar errors in other *proto.h headers in the tree.

MFC after:	3 weeks
2009-09-10 08:33:28 +00:00
Konstantin Belousov
b55ef216fe kern_select(9) copies fd_set in and out of userspace in quantities of
longs. Since 32bit processes longs are 4 bytes, 64bit kernel may copy in
or out 4 bytes more then the process expected.

Calculate the amount of bytes to copy taking into account size of fd_set
for the current process ABI.

Diagnosed and tested by:	Peter Jeremy <peterjeremy acm org>
Reviewed by:	jhb
MFC after:	1 week
2009-09-09 20:59:01 +00:00
John Baldwin
8e3764c0a6 Fix the freebsd32 versions of semsys(), shmsys(), and msgsys() to use the
old ABI versions of the relevant control system call (e.g.
freebsd7_freebsd32_msgctl() instead of freebsd32_msgctl() for msgsys()).

Approved by:	re (kib)
2009-07-27 16:03:04 +00:00
Edward Tomasz Napierala
51334c8257 Regen the freebsd32 parts.
Approved by:	re (kib)
2009-07-08 16:30:34 +00:00
Edward Tomasz Napierala
92f76afe3a Fix freebsd32 version of lpathconf(2).
Approved by:	re (kib)
2009-07-08 16:26:43 +00:00
Edward Tomasz Napierala
c38898116a There is an optimization in chmod(1), that makes it not to call chmod(2)
if the new file mode is the same as it was before; however, this
optimization must be disabled for filesystems that support NFSv4 ACLs.
Chmod uses pathconf(2) to determine whether this is the case - however,
pathconf(2) always follows symbolic links, while the 'chmod -h' doesn't.

This change adds lpathconf(3) to make it possible to solve that problem
in a clean way.

Reviewed by:	rwatson (earlier version)
Approved by:	re (kib)
2009-07-08 15:23:18 +00:00
Robert Watson
14961ba789 Replace AUDIT_ARG() with variable argument macros with a set more more
specific macros for each audit argument type.  This makes it easier to
follow call-graphs, especially for automated analysis tools (such as
fxr).

In MFC, we should leave the existing AUDIT_ARG() macros as they may be
used by third-party kernel modules.

Suggested by:	brooks
Approved by:	re (kib)
Obtained from:	TrustedBSD Project
MFC after:	1 week
2009-06-27 13:58:44 +00:00
John Baldwin
3899e0dfe7 Regen. 2009-06-24 21:54:08 +00:00
John Baldwin
b648d4806b Change the ABI of some of the structures used by the SYSV IPC API:
- The uid/cuid members of struct ipc_perm are now uid_t instead of unsigned
  short.
- The gid/cgid members of struct ipc_perm are now gid_t instead of unsigned
  short.
- The mode member of struct ipc_perm is now mode_t instead of unsigned short
  (this is merely a style bug).
- The rather dubious padding fields for ABI compat with SV/I386 have been
  removed from struct msqid_ds and struct semid_ds.
- The shm_segsz member of struct shmid_ds is now a size_t instead of an
  int.  This removes the need for the shm_bsegsz member in struct
  shmid_kernel and should allow for complete support of SYSV SHM regions
  >= 2GB.
- The shm_nattch member of struct shmid_ds is now an int instead of a
  short.
- The shm_internal member of struct shmid_ds is now gone.  The internal
  VM object pointer for SHM regions has been moved into struct
  shmid_kernel.
- The existing __semctl(), msgctl(), and shmctl() system call entries are
  now marked COMPAT7 and new versions of those system calls which support
  the new ABI are now present.
- The new system calls are assigned to the FBSD-1.1 version in libc.  The
  FBSD-1.0 symbols in libc now refer to the old COMPAT7 system calls.
- A simplistic framework for tagging system calls with compatibility
  symbol versions has been added to libc.  Version tags are added to
  system calls by adding an appropriate __sym_compat() entry to
  src/lib/libc/incldue/compat.h. [1]

PR:		kern/16195 kern/113218 bin/129855
Reviewed by:	arch@, rwatson
Discussed with:	kan, kib [1]
2009-06-24 21:10:52 +00:00
John Baldwin
3c366f1f14 Add a new COMPAT7 flag for FreeBSD 7.x compatibility system calls. 2009-06-24 13:36:37 +00:00
John Baldwin
e47a833f38 Regen. 2009-06-22 20:24:03 +00:00
John Baldwin
0b0fe06a5e Fix a typo in a comment. 2009-06-22 20:12:40 +00:00
John Baldwin
4b7b144f63 Regen. 2009-06-17 19:53:47 +00:00
John Baldwin
21def99b51 - Add the ability to mix multiple flags seperated by pipe ('|') characters
in the type field of system call tables.  Specifically, one can now use
  the 'NO*' types as flags in addition to the 'COMPAT*' types.  For example,
  to tag 'COMPAT*' system calls as living in a KLD via NOSTD.  The COMPAT*
  type is required to be listed first in this case.
- Add new functions 'type()' and 'flag()' to the embedded awk script in
  makesyscalls.sh that return true if a requested flag is found in the
  type field ($3).  The flag() function checks all of the flags in the
  field, but type() only checks the first flag.  type() is meant to be
  used in the top-level "switch" statement and flag() should be used
  otherwise.
- Retire the CPT_NOA type, it is now replaced with "COMPAT|NOARGS" using
  the flags approach.
- Tweak the comment descriptions of COMPAT[46] system calls so that they
  say "freebsd[46] foo" rather than "old foo".
- Document the COMPAT6 type.
- Sync comments in compat32 syscall table with the master table.
2009-06-17 19:50:38 +00:00
John Baldwin
6653a58307 Regen. 2009-06-15 20:40:23 +00:00
John Baldwin
c4f16b69e1 Add a new 'void closefrom(int lowfd)' system call. When called, it closes
any open file descriptors >= 'lowfd'.  It is largely identical to the same
function on other operating systems such as Solaris, DFly, NetBSD, and
OpenBSD.  One difference from other *BSD is that this closefrom() does not
fail with any errors.  In practice, while the manpages for NetBSD and
OpenBSD claim that they return EINTR, they ignore internal errors from
close() and never return EINTR.  DFly does return EINTR, but for the common
use case (closing fd's prior to execve()), the caller really wants all
fd's closed and returning EINTR just forces callers to call closefrom() in
a loop until it stops failing.

Note that this implementation of closefrom(2) does not make any effort to
resolve userland races with open(2) in other threads.  As such, it is not
multithread safe.

Submitted by:	rwatson (initial version)
Reviewed by:	rwatson
MFC after:	2 weeks
2009-06-15 20:38:55 +00:00
Konstantin Belousov
859357f2b3 Regenerate 2009-06-10 13:48:43 +00:00
Konstantin Belousov
f68be132fe Add several syscall compat32 entries for extattr manipulation syscalls,
that do not require translation of the arguments.

Requested by:	kientzle
Reviewed by:	jhb (previous wrong version)
MFC after:	1 week
2009-06-10 13:48:13 +00:00
Robert Watson
33dd50646e Regenerate generated syscall files following changes to struct sysent in
r193234.
2009-06-01 16:14:38 +00:00
Jamie Gritton
0304c73163 Add hierarchical jails. A jail may further virtualize its environment
by creating a child jail, which is visible to that jail and to any
parent jails.  Child jails may be restricted more than their parents,
but never less.  Jail names reflect this hierarchy, being MIB-style
dot-separated strings.

Every thread now points to a jail, the default being prison0, which
contains information about the physical system.  Prison0's root
directory is the same as rootvnode; its hostname is the same as the
global hostname, and its securelevel replaces the global securelevel.
Note that the variable "securelevel" has actually gone away, which
should not cause any problems for code that properly uses
securelevel_gt() and securelevel_ge().

Some jail-related permissions that were kept in global variables and
set via sysctls are now per-jail settings.  The sysctls still exist for
backward compatibility, used only by the now-deprecated jail(2) system
call.

Approved by:	bz (mentor)
2009-05-27 14:11:23 +00:00
Jamie Gritton
fe2f3c651f Regen for new jail system calls in r191673.
Approved by:	bz (mentor)
2009-04-29 21:50:13 +00:00
Jamie Gritton
b38ff370e4 Introduce the extensible jail framework, using the same "name=value"
interface as nmount(2).  Three new system calls are added:
* jail_set, to create jails and change the parameters of existing jails.
  This replaces jail(2).
* jail_get, to read the parameters of existing jails.  This replaces the
  security.jail.list sysctl.
* jail_remove to kill off a jail's processes and remove the jail.
Most jail parameters may now be changed after creation, and jails may be
set to exist without any attached processes.  The current jail(2) system
call still exists, though it is now a stub to jail_set(2).

Approved by:	bz (mentor)
2009-04-29 21:14:15 +00:00
Konstantin Belousov
9da6804056 Regen 2009-04-01 13:12:40 +00:00
Konstantin Belousov
088b38ac84 Rename implementation function for freebsd32 sysarch(2) to allow for
the arguments translations. Provide ABI-compatible definition of the
struct i386_ldt_args for freebsd32 compat layer.

In collaboration with:	pho
Reviewed by:	jhb
2009-04-01 13:11:50 +00:00
Ed Schouten
7eac36ed3c Emulate the FIODGNAME ioctl in our 32-bit emulator.
It's quite strange that nobody reported this issue before. It turns out
functions like ttyname(), ptsname() and fdevname() don't work in
compat32. This means it't not even possible to run applications like
script(1) inside a 32-bit FreeBSD jail.

Fix this by converting 32-bit fiodgname_arg structures to their 64-bit
equivalent.

Reported by:	kris
Tested by:	kris
2009-03-29 20:09:51 +00:00
Jamie Gritton
8571af59e5 Whitespace/spelling fixes in advance of upcoming functional changes.
Approved by:	bz (mentor)
2009-03-27 13:13:59 +00:00
Jamie Gritton
f86bce5ed0 Extend the "vfsopt" mount options for more general use. Make struct
vfsopt and the vfs_buildopts function public, and add some new fields
to struct vfsopt (pos and seen), and new functions vfs_getopt_pos and
vfs_opterror.

Further extend the interface to allow reading options from the kernel
in addition to sending them to the kernel, with vfs_setopt and related
functions.

While this allows the "name=value" option interface to be used for more
than just FS mounts (planned use is for jails), it retains the current
"vfsopt" name and <sys/mount.h> requirement.

Approved by:	bz (mentor)
2009-03-02 23:26:30 +00:00
Ed Schouten
ddf9d24349 Push down Giant inside sysctl. Also add some more assertions to the code.
In the existing code we didn't really enforce that callers hold Giant
before calling userland_sysctl(), even though there is no guarantee it
is safe. Fix this by just placing Giant locks around the call to the oid
handler. This also means we only pick up Giant for a very short period
of time. Maybe we should add MPSAFE flags to sysctl or phase it out all
together.

I've also added SYSCTL_LOCK_ASSERT(). We have to make sure sysctl_root()
and name2oid() are called with the sysctl lock held.

Reviewed by:	Jille Timmermans <jille quis cx>
2008-12-29 12:58:45 +00:00
Bjoern A. Zeeb
b2dbdae9f3 Add 32-bit compat support for AIO.
jhb probably forgot to commit this file with r185878 and will want to
review this. It unbreaks the build here.

Obtained from:	p4 //depot/user/jhb/lock/compat/freebsd32/freebsd32_signal.h#2
2008-12-11 00:58:05 +00:00
John Baldwin
5d8d23c71b Regen. 2008-12-10 20:57:16 +00:00
John Baldwin
3858a1f4f5 - Add 32-bit compat system calls for VFS_AIO. The system calls live in the
aio code and are registered via the recently added SYSCALL32_*() helpers.
- Since the aio code likes to invoke fuword and suword a lot down in the
  "bowels" of system calls, add a structure holding a set of operations for
  things like storing errors, copying in the aiocb structure, storing
  status, etc.  The 32-bit system calls use a separate operations vector to
  handle fuword32 vs fuword, etc.  Also, the oldsigevent handling is now
  done by having seperate operation vectors with different aiocb copyin
  routines.
- Split out kern_foo() functions for the various AIO system calls so the
  32-bit front ends can manage things like copying in and converting
  timespec structures, etc.
- For both the native and 32-bit aio_suspend() and lio_listio() calls,
  just use copyin() to read the array of aiocb pointers instead of using
  a for loop that iterated over fuword/fuword32.  The error handling in
  the old case was incomplete (lio_listio() just ignored any aiocb's that
  it got an EFAULT trying to read rather than reporting an error), and
  possibly slower.

MFC after:	1 month
2008-12-10 20:56:19 +00:00
John Baldwin
3cdf485f87 When unloading a 32-bit system call module, restore the sysent vector in
the 32-bit system call table instead of the main system call table.
2008-12-03 18:45:38 +00:00
Bjoern A. Zeeb
759e7c0bbb Regen after jail support was added in r185435. 2008-11-29 14:34:30 +00:00
Bjoern A. Zeeb
413628a7e3 MFp4:
Bring in updated jail support from bz_jail branch.

This enhances the current jail implementation to permit multiple
addresses per jail. In addtion to IPv4, IPv6 is supported as well.
Due to updated checks it is even possible to have jails without
an IP address at all, which basically gives one a chroot with
restricted process view, no networking,..

SCTP support was updated and supports IPv6 in jails as well.

Cpuset support permits jails to be bound to specific processor
sets after creation.

Jails can have an unrestricted (no duplicate protection, etc.) name
in addition to the hostname. The jail name cannot be changed from
within a jail and is considered to be used for management purposes
or as audit-token in the future.

DDB 'show jails' command was added to aid debugging.

Proper compat support permits 32bit jail binaries to be used on 64bit
systems to manage jails. Also backward compatibility was preserved where
possible: for jail v1 syscalls, as well as with user space management
utilities.

Both jail as well as prison version were updated for the new features.
A gap was intentionally left as the intermediate versions had been
used by various patches floating around the last years.

Bump __FreeBSD_version for the afore mentioned and in kernel changes.

Special thanks to:
- Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches
  and Olivier Houchard (cognet) for initial single-IPv6 patches.
- Jeff Roberson (jeff) and Randall Stewart (rrs) for their
  help, ideas and review on cpuset and SCTP support.
- Robert Watson (rwatson) for lots and lots of help, discussions,
  suggestions and review of most of the patch at various stages.
- John Baldwin (jhb) for his help.
- Simon L. Nielsen (simon) as early adopter testing changes
  on cluster machines as well as all the testers and people
  who provided feedback the last months on freebsd-jail and
  other channels.
- My employer, CK Software GmbH, for the support so I could work on this.

Reviewed by:	(see above)
MFC after:	3 months (this is just so that I get the mail)
X-MFC Before:   7.2-RELEASE if possible
2008-11-29 14:32:14 +00:00
Peter Wemm
dc5aaa8410 Sigh. Fix a pointer/int compile error. 2008-11-10 23:36:20 +00:00
Peter Wemm
a22600a1dd Fix a signal emulation bug introduced in r163018 (and present in 7.x).
This prevents 32 bit signal handlers from finding out what the faulting
address is.  Both the secret 4th argument and siginfo->si_addr are zero.
2008-11-10 23:26:52 +00:00
Ed Schouten
ebb45b0620 Regenerate system call tables for r184789. 2008-11-09 10:48:06 +00:00
Ed Schouten
a1b5a8955e Mark uname(), getdomainname() and setdomainname() with COMPAT_FREEBSD4.
Looking at our source code history, it seems the uname(),
getdomainname() and setdomainname() system calls got deprecated
somewhere after FreeBSD 1.1, but they have never been phased out
properly. Because we don't have a COMPAT_FREEBSD1, just use
COMPAT_FREEBSD4.

Also fix the Linuxolator to build without the setdomainname() routine by
just making it call userland_sysctl on kern.domainname. Also replace the
setdomainname()'s implementation to use this approach, because we're
duplicating code with sysctl_domainname().

I wasn't able to keep these three routines working in our
COMPAT_FREEBSD32, because that would require yet another keyword for
syscalls.master (COMPAT4+NOPROTO). Because this routine is probably
unused already, this won't be a problem in practice. If it turns out to
be a problem, we'll just restore this functionality.

Reviewed by:	rdivacky, kib
2008-11-09 10:45:13 +00:00
Doug Rabson
45e6ab7f81 Regen. 2008-11-03 10:39:35 +00:00
Doug Rabson
a9148abd9d Implement support for RPCSEC_GSS authentication to both the NFS client
and server. This replaces the RPC implementation of the NFS client and
server with the newer RPC implementation originally developed
(actually ported from the userland sunrpc code) to support the NFS
Lock Manager.  I have tested this code extensively and I believe it is
stable and that performance is at least equal to the legacy RPC
implementation.

The NFS code currently contains support for both the new RPC
implementation and the older legacy implementation inherited from the
original NFS codebase. The default is to use the new implementation -
add the NFS_LEGACYRPC option to fall back to the old code. When I
merge this support back to RELENG_7, I will probably change this so
that users have to 'opt in' to get the new code.

To use RPCSEC_GSS on either client or server, you must build a kernel
which includes the KGSSAPI option and the crypto device. On the
userland side, you must build at least a new libc, mountd, mount_nfs
and gssd. You must install new versions of /etc/rc.d/gssd and
/etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf.

As long as gssd is running, you should be able to mount an NFS
filesystem from a server that requires RPCSEC_GSS authentication. The
mount itself can happen without any kerberos credentials but all
access to the filesystem will be denied unless the accessing user has
a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There
is currently no support for situations where the ticket file is in a
different place, such as when the user logged in via SSH and has
delegated credentials from that login. This restriction is also
present in Solaris and Linux. In theory, we could improve this in
future, possibly using Brooks Davis' implementation of variant
symlinks.

Supporting RPCSEC_GSS on a server is nearly as simple. You must create
service creds for the server in the form 'nfs/<fqdn>@<REALM>' and
install them in /etc/krb5.keytab. The standard heimdal utility ktutil
makes this fairly easy. After the service creds have been created, you
can add a '-sec=krb5' option to /etc/exports and restart both mountd
and nfsd.

The only other difference an administrator should notice is that nfsd
doesn't fork to create service threads any more. In normal operation,
there will be two nfsd processes, one in userland waiting for TCP
connections and one in the kernel handling requests. The latter
process will create as many kthreads as required - these should be
visible via 'top -H'. The code has some support for varying the number
of service threads according to load but initially at least, nfsd uses
a fixed number of threads according to the value supplied to its '-n'
option.

Sponsored by:	Isilon Systems
MFC after:	1 month
2008-11-03 10:38:00 +00:00
John Baldwin
23aa8eeafc Regen for freebsd32_getdirentries(). 2008-10-22 21:56:44 +00:00
John Baldwin
63f8fe9e8b Split the copyout of *base at the end of getdirentries() out leaving the
rest in kern_getdirentries().  Use kern_getdirentries() to implement
freebsd32_getdirentries().  This fixes a bug where calls to getdirentries()
in 32-bit binaries would trash the 4 bytes after the 'long base' in
userland.

Submitted by:	ups
MFC after:	1 week
2008-10-22 21:55:48 +00:00
John Baldwin
88ac915a9b Add support for installing 32-bit system calls from kernel modules. This
includes syscall32_{de,}register() routines as well as a module handler
and wrapper macros similar to the support for native syscalls in
<sys/sysent.h>.

MFC after:	1 month
2008-09-25 20:50:21 +00:00
John Baldwin
d47faadce3 Sort includes and add multiple include guards. 2008-09-25 20:12:38 +00:00
John Baldwin
74d9b5a551 Regen. 2008-09-25 20:08:36 +00:00
John Baldwin
48a43ae819 Tidy up a few things with syscall generation:
- Instead of using a syscall slot (370) just to get a function prototype
  for lkmressys(), add an explicit function prototype to <sys/sysent.h>.
  This also removes unused special case checks for 'lkmressys' from
  makesyscalls.sh.
- Instead of having magic logic in makesyscalls.sh to only generate a
  function prototype the first time 'lkmnosys' is seen, make 'NODEF'
  always not generate a function prototype and include an explicit
  prototype for 'lkmnosys' in <sys/sysent.h>.
- As a result of the fix in (2), update the LKM syscall entries in
  the freebsd32 syscall table to use 'lkmnosys' rather than 'nosys'.
- Use NOPROTO for the __syscall() entry (198) in the native ABI.  This
  avoids the need for magic logic in makesyscalls.h to only generate
  a function prototype the first time 'nosys' is encountered.
2008-09-25 20:07:42 +00:00
David E. O'Brien
c750e17cf5 Add freebsd32 compat shims for ioctl(2)
CDIOREADTOCHEADER and CDIOREADTOCENTRYS requests.
2008-09-22 16:24:36 +00:00
David E. O'Brien
663c58007e Regenerate for r183270. 2008-09-22 16:09:43 +00:00
David E. O'Brien
ae528485c4 Add freebsd32 compat shims for ioctl(2)
MDIOCATTACH, MDIOCDETACH, MDIOCQUERY, and MDIOCLIST requests.
2008-09-22 16:09:16 +00:00
David E. O'Brien
f1287854fd Regenerate for r183188. 2008-09-19 15:21:40 +00:00
David E. O'Brien
6e6049e9df Add freebsd32 compat shim for nmount(2).
(and quiet some compiler warnings for vfs_donmount)
2008-09-19 15:17:32 +00:00
David E. O'Brien
109ea24cc1 style(9) 2008-09-15 17:39:40 +00:00
David E. O'Brien
7e29bc757e Regenerate for r183042. 2008-09-15 17:39:01 +00:00
David E. O'Brien
f0f53d8f79 Fix bug in r100384 (rev 1.2) in which the 32-bit swapon(2) was made
"obsolete, not included in system", where as the system call does exist.
2008-09-15 17:37:41 +00:00
Robert Watson
5ae504055a Regenerate following r182123. 2008-08-24 21:23:08 +00:00
Robert Watson
e484af13ed When MPSAFE ttys were merged, a new BSM audit event identifier was
allocated for posix_openpt(2).  Unfortunately, that identifier
conflicts with other events already allocated to other systems in
OpenBSM.  Assign a new globally unique identifier and conform
better to the AUE_ event naming scheme.

This is a stopgap until a new OpenBSM import is done with the
correct identifier, so we'll maintain this as a local diff in svn
until then.

Discussed with:	ed
Obtained from:	TrustedBSD Project
2008-08-24 21:20:35 +00:00
David E. O'Brien
35c316caaf Add comments on NOARGS, NODEF, and NOPROTO. 2008-08-21 22:57:31 +00:00
Ed Schouten
18cf135421 Update system call tables.
The previous commit also included changes to all the system call lists,
but it is a tradition to update these lists in a second commit, so rerun
make sysent to update the $FreeBSD$ tags inside these files to refer to
the latest version of syscalls.master.

Requested by:	rwatson
2008-08-20 08:39:10 +00:00
Ed Schouten
bc093719ca Integrate the new MPSAFE TTY layer to the FreeBSD operating system.
The last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:

- Improved driver model:

  The old TTY layer has a driver model that is not abstract enough to
  make it friendly to use. A good example is the output path, where the
  device drivers directly access the output buffers. This means that an
  in-kernel PPP implementation must always convert network buffers into
  TTY buffers.

  If a PPP implementation would be built on top of the new TTY layer
  (still needs a hooks layer, though), it would allow the PPP
  implementation to directly hand the data to the TTY driver.

- Improved hotplugging:

  With the old TTY layer, it isn't entirely safe to destroy TTY's from
  the system. This implementation has a two-step destructing design,
  where the driver first abandons the TTY. After all threads have left
  the TTY, the TTY layer calls a routine in the driver, which can be
  used to free resources (unit numbers, etc).

  The pts(4) driver also implements this feature, which means
  posix_openpt() will now return PTY's that are created on the fly.

- Improved performance:

  One of the major improvements is the per-TTY mutex, which is expected
  to improve scalability when compared to the old Giant locking.
  Another change is the unbuffered copying to userspace, which is both
  used on TTY device nodes and PTY masters.

Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.

Obtained from:		//depot/projects/mpsafetty/...
Approved by:		philip (ex-mentor)
Discussed:		on the lists, at BSDCan, at the DevSummit
Sponsored by:		Snow B.V., the Netherlands
dcons(4) fixed by:	kan
2008-08-20 08:31:58 +00:00
Brooks Davis
e44f0b2a63 style(9): put parentheses around return values. 2008-07-10 19:54:34 +00:00
Brooks Davis
774b72e12e Regen 2008-07-10 17:46:58 +00:00
Brooks Davis
a8c6d6d0ba id_t is a 64-bit integer and thus is passed as two arguments like off_t is.
As a result, those arguments must be recombined before calling the real
syscal implementation.  This change fixes 32-bit compatibility for
cpuset_getid(), cpuset_setid(), cpuset_getaffinity(), and
cpuset_setaffinity().
2008-07-10 17:45:57 +00:00
Konstantin Belousov
f2296b585e Regen 2008-03-31 12:12:27 +00:00
Konstantin Belousov
4f1e7213d4 Add the freebsd32 compatibility shims for the *at() syscalls.
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:08:30 +00:00
Doug Rabson
a7ac0db6cb Regen. 2008-03-26 15:24:02 +00:00
Doug Rabson
dfdcada31e Add the new kernel-mode NFS Lock Manager. To use it instead of the
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.

Highlights include:

* Thread-safe kernel RPC client - many threads can use the same RPC
  client handle safely with replies being de-multiplexed at the socket
  upcall (typically driven directly by the NIC interrupt) and handed
  off to whichever thread matches the reply. For UDP sockets, many RPC
  clients can share the same socket. This allows the use of a single
  privileged UDP port number to talk to an arbitrary number of remote
  hosts.

* Single-threaded kernel RPC server. Adding support for multi-threaded
  server would be relatively straightforward and would follow
  approximately the Solaris KPI. A single thread should be sufficient
  for the NLM since it should rarely block in normal operation.

* Kernel mode NLM server supporting cancel requests and granted
  callbacks. I've tested the NLM server reasonably extensively - it
  passes both my own tests and the NFS Connectathon locking tests
  running on Solaris, Mac OS X and Ubuntu Linux.

* Userland NLM client supported. While the NLM server doesn't have
  support for the local NFS client's locking needs, it does have to
  field async replies and granted callbacks from remote NLMs that the
  local client has contacted. We relay these replies to the userland
  rpc.lockd over a local domain RPC socket.

* Robust deadlock detection for the local lock manager. In particular
  it will detect deadlocks caused by a lock request that covers more
  than one blocking request. As required by the NLM protocol, all
  deadlock detection happens synchronously - a user is guaranteed that
  if a lock request isn't rejected immediately, the lock will
  eventually be granted. The old system allowed for a 'deferred
  deadlock' condition where a blocked lock request could wake up and
  find that some other deadlock-causing lock owner had beaten them to
  the lock.

* Since both local and remote locks are managed by the same kernel
  locking code, local and remote processes can safely use file locks
  for mutual exclusion. Local processes have no fairness advantage
  compared to remote processes when contending to lock a region that
  has just been unlocked - the local lock manager enforces a strict
  first-come first-served model for both local and remote lockers.

Sponsored by:	Isilon Systems
PR:		95247 107555 115524 116679
MFC after:	2 weeks
2008-03-26 15:23:12 +00:00
John Baldwin
5c63b21a1a Regen. 2008-03-25 19:35:34 +00:00
John Baldwin
30c6422a8a Add entries for the cpuset-related system calls. The existing system calls
can be used on little endian systems.

Pointy hat to:	jeff
2008-03-25 19:34:47 +00:00
Jeff Roberson
6617724c5f Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to
FreeBSD, the M:N approach taken by the kse library was never developed
to its full potential.  Backwards compatibility will be provided via
libmap.conf for dynamically linked binaries and static binaries will
be broken.
2008-03-12 10:12:01 +00:00
Ruslan Ermilov
b95bd24d29 Regenerate for readlink(2). 2008-02-12 20:11:54 +00:00
Ruslan Ermilov
5f56182b6f Change readlink(2)'s return type and type of the last argument
to match POSIX.

Prodded by:	Alexey Lyashkov
2008-02-12 20:09:04 +00:00
Robert Watson
20c6fe828a Regenerate. 2008-01-20 23:44:24 +00:00
Robert Watson
6c902059f2 Use audit events AUE_SHMOPEN and AUE_SHMUNLINK with new system calls
shm_open() and shm_unlink().  More auditing will need to be done for
these calls to capture arguments properly.
2008-01-20 23:43:06 +00:00
John Baldwin
4ad6d200d6 Regen for shm_open(2) and shm_unlink(2). 2008-01-08 22:01:26 +00:00
John Baldwin
8e38aeff17 Add a new file descriptor type for IPC shared memory objects and use it to
implement shm_open(2) and shm_unlink(2) in the kernel:
- Each shared memory file descriptor is associated with a swap-backed vm
  object which provides the backing store.  Each descriptor starts off with
  a size of zero, but the size can be altered via ftruncate(2).  The shared
  memory file descriptors also support fstat(2).  read(2), write(2),
  ioctl(2), select(2), poll(2), and kevent(2) are not supported on shared
  memory file descriptors.
- shm_open(2) and shm_unlink(2) are now implemented as system calls that
  manage shared memory file descriptors.  The virtual namespace that maps
  pathnames to shared memory file descriptors is implemented as a hash
  table where the hash key is generated via the 32-bit Fowler/Noll/Vo hash
  of the pathname.
- As an extension, the constant 'SHM_ANON' may be specified in place of the
  path argument to shm_open(2).  In this case, an unnamed shared memory
  file descriptor will be created similar to the IPC_PRIVATE key for
  shmget(2).  Note that the shared memory object can still be shared among
  processes by sharing the file descriptor via fork(2) or sendmsg(2), but
  it is unnamed.  This effectively serves to implement the getmemfd() idea
  bandied about the lists several times over the years.
- The backing store for shared memory file descriptors are garbage
  collected when they are not referenced by any open file descriptors or
  the shm_open(2) virtual namespace.

Submitted by:	dillon, peter (previous versions)
Submitted by:	rwatson (I based this on his version)
Reviewed by:	alc (suggested converting getmemfd() to shm_open())
2008-01-08 21:58:16 +00:00
John Baldwin
0a63574164 Bah, remove last vestiges of some statfs conversion fixes that aren't quite
ready for CVS yet that snuck into 1.68.

Pointy hat to:	jhb
2007-12-10 19:42:23 +00:00
Scott Long
d637500d06 Grrr, remove an unused variable missed in the last commit. 2007-12-08 01:41:31 +00:00
Scott Long
7815c9e2db Don't expect a return value from statfs_scale_blocks(). 2007-12-07 22:32:09 +00:00
John Baldwin
8120bb7e3a Regen. 2007-12-06 23:37:26 +00:00
John Baldwin
695e8d536c Add freebsd32 compat wrappers for msgctl() and __semctl() using
kern_msgctl() and kern_semctl().

MFC after:	1 week
2007-12-06 23:36:57 +00:00
John Baldwin
3c39e0d8d4 Add freebsd32 compat wrappers for msgctl() and _semctl() using
kern_msgctl() and kern_semctl().

MFC after:	1 week
2007-12-06 23:35:29 +00:00
John Baldwin
d43c6fa4fe Move 32-bit SYSV IPC structure definitions into freebsd32_ipc.h.
MFC after:	1 week
2007-12-06 23:23:16 +00:00
John Baldwin
74427aa423 Move several data structure definitions out of freebsd32_misc.c and into
freebsd32.h instead.

MFC after:	1 week
2007-12-06 23:11:27 +00:00
Jung-uk Kim
959a913b87 Remove redundant checks for msgsnd(3) and msgrcv(3).
COMPAT_IA32 (implicitly) requires SYSVSEM, SYSVSHM and SYSVMSG in kernel.

Pointed out by:	jhb
2007-12-04 20:25:41 +00:00
John Baldwin
cc479dda4a Rework the routines to convert a 5.x+ statfs structure (with fixed-size
64-bit counters) to a 4.x statfs structure (with long-sized counters).
- For block counters, we scale up the block size sufficiently large so
  that the resulting block counts fit into a the long-sized (long for the
  ABI, so 32-bit in freebsd32) counters.  In 4.x the NFS client's statfs
  VOP did this already.  This can lie about the block size to 4.x binaries,
  but it presents a more accurate picture of the ratios of free and
  available space.
- For non-block counters, fix the freebsd32 stats converter to cap the
  values at INT32_MAX rather than losing the upper 32-bits to match the
  behavior of the 4.x statfs conversion routine in vfs_syscalls.c

Approved by:	re (kensmith)
2007-08-28 20:28:12 +00:00
David Xu
6ec46f7aa8 Regenerate.
Approved by: re(kensmith)
2007-08-16 05:32:26 +00:00
David Xu
81ca5b4257 Add thr_kill2 compat32 syscall.
Submitted by: Tijl Coosemans tijl at ulyssis dot org
Approved by: re (kensmith)
2007-08-16 05:30:04 +00:00
Peter Wemm
5aa69f9c72 Add compat6 wrapper code for mmap/lseek/pread/pwrite/truncate/ftruncate.
Approved by:  re (kensmith)
2007-07-04 23:04:41 +00:00
Peter Wemm
486abf939c Regenerate after mmap/lseek/etc syscall changes
Approved by:  re (kensmith)
2007-07-04 23:03:50 +00:00
Peter Wemm
b9f3e68f95 Add i386 emulation wrappers for mmap/lseek/etc. These use COMPAT6, so
you must use the already existing, already in generic, COMPAT_FREEBSD6
kernel option for running old 32 bit binaries.

Approved by:  re (kensmith)
2007-07-04 23:02:40 +00:00
Matt Jacob
739c673c8d Try a cheap way to get around gcc4.2 believing that user arguments
to system calls can change across intervening functions.
2007-06-17 04:37:57 +00:00
Ed Maste
1dd702a59a Remove stale 'XXX implement' comments for syscalls which have since been
implemented.
2007-06-15 21:54:26 +00:00
Olivier Houchard
302e130edc Remove duplicate includes.
Submitted by:   Cyril Nguyen Huu <cyril ci0 org>
2007-05-23 13:36:02 +00:00
Alan Cox
37f3c8939a Eliminate the use of Giant from ia64-specific code in freebsd32_mmap(). 2007-05-01 17:10:01 +00:00
Jung-uk Kim
5e868cbb79 Regen. 2006-12-20 19:39:10 +00:00
Jung-uk Kim
127891cab9 MFP4: (part of) 110058
Fix 32-bit msgsnd(3) and msgrcv(3) emulations for amd64.
2006-12-20 19:36:03 +00:00
Ruslan Ermilov
9f70620442 Regen.
Forgotten by:	trhodes
2006-11-11 21:49:08 +00:00
Ruslan Ermilov
f42326c579 Regen. 2006-11-03 21:23:33 +00:00
Ruslan Ermilov
0b160a7d2b Fix build breakage introduced in previous commit (redeclatation
of sctp functions).
2006-11-03 21:21:28 +00:00
Randall Stewart
af99851047 This commits the remake in kern/ make sysent to get
the correct syscalls.master's $FreeBSD$ tag record and
a make sysent in sys/compat/freebsd32. Thanks Ruslan
for pointing out the steps I missed :-0
Approved by:	gnn
2006-11-03 18:57:49 +00:00
Randall Stewart
f8829a4a40 Ok, here it is, we finally add SCTP to current. Note that this
work is not just mine, but it is also the works of Peter Lei
and Michael Tuexen. They both are my two key other developers
working on the project.. and they need ata-boy's too:
****
peterlei@cisco.com
tuexen@fh-muenster.de
****
I did do a make sysent which updated the
syscall's and sysproto.. I hope that is correct... without
it you don't build since we have new syscalls for SCTP :-0

So go out and look at the NOTES, add
option SCTP (make sure inet and inet6 are present too)
and play with SCTP.

I will see about comitting some test tools I have after I
figure out where I should place them. I also have a
lib (libsctp.a) that adds some of the missing socketapi
functions that I need to put into lib's.. I will talk
to George about this :-)

There may still be some 64 bit issues in here, none of
us have a 64 bit processor to test with yet.. Michael
may have a MAC but thats another beast too..

If you have a mac and want to use SCTP contact Michael
he maintains a web site with a loadable module with
this code :-)

Reviewed by:	gnn
Approved by:	gnn
2006-11-03 15:23:16 +00:00
Maxim Sobolev
016b81e405 Regen. 2006-10-24 17:25:36 +00:00
Maxim Sobolev
ef16706d34 Fix kernel breakage introduced in the previous commit (redeclatation
of the audit functions).
2006-10-24 17:24:11 +00:00
Robert Watson
c71bf4bf63 Regenerate. 2006-10-24 13:54:56 +00:00
Robert Watson
a1dce47980 Hook up audit functions in the freebsd32 compatibility code. It is
believed these likely don't require wrappers.

Reported by:	sobomax
MFC after:	3 days
2006-10-24 13:49:44 +00:00
David Xu
034b26fc65 Regenerate. 2006-10-17 02:28:58 +00:00
David Xu
3f9223b65d Sync with master. 2006-10-17 02:28:26 +00:00
David Xu
295426f4c5 Regenerate. 2006-10-06 08:24:37 +00:00
David Xu
ae7d8a6766 Implement 32bit umtx_lock and umtx_unlock system calls, these two system
calls are not used by libthr in RELENG_6 and HEAD, it is only used by
the libthr in RELENG-5, the _umtx_op system call can do more incremental
dirty works than these two system calls without having to introduce new
system calls or throw away old system calls when things are going on.
2006-10-06 08:22:08 +00:00
David Xu
312a0e5f06 Regenerate. 2006-10-05 01:58:57 +00:00
David Xu
e6e7f16cb4 Oops, add the missing file. 2006-10-05 01:58:08 +00:00
David Xu
c6511aea86 Move some declaration of 32-bit signal structures into file
freebsd32-signal.h, implement sigtimedwait and sigwaitinfo system calls.
2006-10-05 01:56:11 +00:00
Robert Watson
531147aa3e Regenerate. 2006-10-03 20:48:11 +00:00
Robert Watson
dfb041ca62 Change getpagesize() system call audit event to more clearly indicate
that we don't audit it.

MFC after:	3 days
Obtained from:	TrustedBSD Project
2006-10-03 20:48:03 +00:00
Poul-Henning Kamp
f645b0b51c First part of a little cleanup in the calendar/timezone/RTC handling.
Move relevant variables to <sys/clock.h> and fix #includes as necessary.

Use libkern's much more time- & spamce-efficient BCD routines.
2006-10-02 12:59:59 +00:00
David Xu
4af4fcb71a Regenerate. 2006-09-23 00:27:53 +00:00
David Xu
5c26f4cea8 Enable sigwait. 2006-09-23 00:27:11 +00:00
David Xu
ac3674aa52 Regenerate. 2006-09-22 15:05:34 +00:00
David Xu
cda9a0d1c2 Add compatible code to let 32bit libthr work on 64bit kernel. 2006-09-22 15:04:28 +00:00
David Xu
27bbb2e71f Regenerate. 2006-09-22 00:53:43 +00:00
David Xu
1eec02f538 Add umtx support for 32bit process on AMD64 machine. 2006-09-22 00:52:54 +00:00
David Xu
ecc313475b Regenerate. 2006-09-21 04:50:38 +00:00
David Xu
47bd78d24d sync with master. 2006-09-21 04:49:36 +00:00
Robert Watson
da7cbdc2b3 Regenerate. 2006-09-17 13:29:36 +00:00
Robert Watson
6c2d307a0e AUE_SIGALTSTACK instead of AUE_SIGPENDING for sigaltstack().
Obtained from:	TrustedBSD Project
MFC after:	3 days
2006-09-17 13:28:11 +00:00
David Xu
c0ba6c1783 The following functions need not to be reimplemented, reuse 64bit
syscalls instead:
sigqueue, thr_set_name, thr_setscheduler, thr_getscheduler,
thr_setschedparam.
2006-09-09 01:22:13 +00:00
Robert Watson
e482025ebd Regenerate. 2006-09-03 16:24:36 +00:00
Robert Watson
e8a6d7e554 Set freebsd32 system call event identifiers for:
- old truncate, ftruncate
- old getpeername, gethostid, sethostid, getrlimit, setrlimit, killpg.
- old quota, getsockname, getdirentries.
- lgetfh
- old getdomainname, setdomainname
- sysarch, rtprio, __getcwd, jail, sigtimedwait
- extattrctl, extattr_{get,set,delete,list}_{file,fd,link}
- getresgid, getresuid, kqueue, eaccess, nmount, sendfile
- fhstatfs, kldunloadf

Right identifiers for:

- nfssvc

Remove incorrect identifier for:

- __acl_get_file

Compile tested with help of:	sam
Obtained from:	TrustedBSD Project
2006-09-03 16:17:49 +00:00
Robert Watson
8075da7e8b Regenerate. Looks like someone missed doing this previously as more than
just the audit event change appears in the diff.
2006-09-03 13:47:52 +00:00
Robert Watson
1b25e5f3c4 Use AUE_NTP_ADJTIME instead of AUE_ADJTIME for ntp_adjtime().
Obtained from:	TrustedBSD Project
2006-09-03 13:47:24 +00:00
Warner Losh
1a3c917f9d while (0); -> while (0) in multi-line macros 2006-08-17 22:50:33 +00:00
Peter Wemm
bad9a7a5f9 Grab two syscall numbers. One is used to emulate functionality that linux
has in its procfs (do a readlink of /proc/self/fd/<nn> to find the pathname
that corresponds to a given file descriptor).  Valgrind-3.x needs this
functionality.  This is a placeholder only at this time.
2006-08-16 22:32:50 +00:00
Jung-uk Kim
a88d050dfc Include sys/limits.h for INT_MAX. freebsd32_proto.h 1.58 does not include
sys/umtx.h any more and previously it was included from there.
2006-08-16 00:02:36 +00:00
John Baldwin
f8f1f7fb85 Regen to propogate <prefix>_AUE_<mumble> changes as well as the earlier
systrace changes.
2006-08-15 17:37:01 +00:00
John Baldwin
df78f6d313 - Remove unused sysvec variables from various syscalls.conf.
- Send the systrace_args files for all the compat ABIs to /dev/null for
  now.  Right now makesyscalls.sh generates a file with a hardcoded
  function name, so it wouldn't work for any of the ABIs anyway.  Probably
  the function name should be configurable via a 'systracename' variable
  and the functions should be stored in a function pointer in the sysvec
  structure.
2006-08-15 17:25:55 +00:00
John Baldwin
91ce2694d1 Regen for MPSAFE flag removal. 2006-07-28 19:08:37 +00:00
John Baldwin
af5bf12239 Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to
mark system calls as being MPSAFE:
- Stop conditionally acquiring Giant around system call invocations.
- Remove all of the 'M' prefixes from the master system call files.
- Remove support for the 'M' prefix from the script that generates the
  syscall-related files from the master system call files.
- Don't explicitly set SYF_MPSAFE when registering nfssvc.
2006-07-28 19:05:28 +00:00
John Baldwin
e0b4add8d8 Various fixes to comments in the syscall master files including removing
cruft from the audit import and adding mention of COMPAT4 to freebsd32.
2006-07-28 18:55:18 +00:00
David Xu
2df96d8e02 sync with master. 2006-07-14 01:57:09 +00:00
John Baldwin
c870740e09 - Split out kern_accept(), kern_getpeername(), and kern_getsockname() for
use by ABI emulators.
- Alter the interface of kern_recvit() somewhat.  Specifically, go ahead
  and hard code UIO_USERSPACE in the uio as that's what all the callers
  specify.  In place, add a new uioseg to indicate what type of pointer
  is in mp->msg_name.  Previously it was always a userland address, but
  ABI emulators may pass in kernel-side sockaddrs.  Also, remove the
  namelenp field and instead require the two places that used it to
  explicitly copy mp->msg_namelen out to userland.
- Use the patched kern_recvit() to replace svr4_recvit() and the stock
  kern_sendit() to replace svr4_sendit().
- Use kern_bind() instead of stackgap use in ti_bind().
- Use kern_getpeername() and kern_getsockname() instead of stackgap in
  svr4_stream_ti_ioctl().
- Use kern_connect() instead of stackgap in svr4_do_putmsg().
- Use kern_getpeername() and kern_accept() instead of stackgap in
  svr4_do_getmsg().
- Retire the stackgap from SVR4 compat as it is no longer used.
2006-07-10 21:38:17 +00:00
John Baldwin
acdd09f944 Unexpand PTRIN() in several places and fix one instance where 0 was being
used instead of NULL.
2006-07-10 19:37:43 +00:00