103 Commits

Author SHA1 Message Date
Peter Wemm
2e17a05929 Fix warning:
413: warning: long unsigned int format, vm_offset_t arg (arg 2)
2001-06-15 07:46:18 +00:00
Robert Watson
b1fc0ec1a7 o Merge contents of struct pcred into struct ucred. Specifically, add the
real uid, saved uid, real gid, and saved gid to ucred, as well as the
  pcred->pc_uidinfo, which was associated with the real uid, only rename
  it to cr_ruidinfo so as not to conflict with cr_uidinfo, which
  corresponds to the effective uid.
o Remove p_cred from struct proc; add p_ucred to struct proc, replacing
  the original macro that pointed p->p_ucred to p->p_cred->pc_ucred.
o Universally update code so that it makes use of ucred instead of pcred,
  p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo,
  cr_{r,sv}{u,g}id instead of p_*, etc.
o Remove pcred0 and its initialization from init_main.c; initialize
  cr_ruidinfo there.
o Restructure many credential modification chunks to always crdup while
  we figure out locking and optimizations; generally speaking, this
  means moving to a structure like this:
        newcred = crdup(oldcred);
        ...
        p->p_ucred = newcred;
        crfree(oldcred);
  It's not race-free, but better than nothing.  There are also races
  in sys_process.c, all inter-process authorization, fork, exec, and
  exit.
o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid;
  remove comments indicating that the old arrangement was a problem.
o Restructure exec1() a little to use newcred/oldcred arrangement, and
  use improved uid management primitives.
o Clean up exit1() so as to do less work in credential cleanup due to
  pcred removal.
o Clean up fork1() so as to do less work in credential cleanup and
  allocation.
o Clean up ktrcanset() to take into account changes, and move to using
  suser_xxx() instead of performing a direct uid==0 comparison.
o Improve commenting in various kern_prot.c credential modification
  calls to better document current behavior.  In a couple of places,
  current behavior is a little questionable and we need to check
  POSIX.1 to make sure it's "right".  More commenting work still
  remains to be done.
o Update credential management calls, such as crfree(), to take into
  account new ruidinfo reference.
o Modify or add the following uid and gid helper routines:
      change_euid()
      change_egid()
      change_ruid()
      change_rgid()
      change_svuid()
      change_svgid()
  In each case, the call now acts on a credential not a process, and as
  such no longer requires more complicated process locking/etc.  They
  now assume the caller will do any necessary allocation of an
  exclusive credential reference.  Each is commented to document its
  reference requirements.
o CANSIGIO() is simplified to require only credentials, not processes
  and pcreds.
o Remove lots of (p_pcred==NULL) checks.
o Add an XXX to authorization code in nfs_lock.c, since it's
  questionable, and needs to be considered carefully.
o Simplify posix4 authorization code to require only credentials, not
  processes and pcreds.  Note that this authorization, as well as
  CANSIGIO(), needs to be updated to use the p_cansignal() and
  p_cansched() centralized authorization routines, as they currently
  do not take into account some desirable restrictions that are handled
  by the centralized routines, as well as being inconsistent with other
  similar authorization instances.
o Update libkvm to take these changes into account.

Obtained from:	TrustedBSD Project
Reviewed by:	green, bde, jhb, freebsd-arch, freebsd-audit
2001-05-25 16:59:11 +00:00
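
The copy/modify/swap/free structure described in the message above can be sketched in user space. This is only a minimal sketch, assuming a simplified reference-counted credential; every `toy_` name is hypothetical and none of this is the kernel's actual code. Like the original, it is not race-free.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the kernel's struct ucred. */
struct toy_ucred {
    int cr_ref;         /* reference count */
    unsigned cr_uid;    /* effective uid */
    unsigned cr_ruid;   /* real uid (merged in from pcred) */
    unsigned cr_svuid;  /* saved uid (merged in from pcred) */
};

struct toy_proc {
    struct toy_ucred *p_ucred;
};

/* Allocate a zeroed credential holding one reference. */
static struct toy_ucred *toy_crget(void)
{
    struct toy_ucred *cr = calloc(1, sizeof(*cr));

    if (cr != NULL)
        cr->cr_ref = 1;
    return cr;
}

/* Copy a credential; the copy starts with a single reference. */
static struct toy_ucred *toy_crdup(const struct toy_ucred *old)
{
    struct toy_ucred *cr = toy_crget();

    if (cr != NULL) {
        *cr = *old;
        cr->cr_ref = 1;
    }
    return cr;
}

/* Drop one reference and free the credential on the last one. */
static void toy_crfree(struct toy_ucred *cr)
{
    if (--cr->cr_ref == 0)
        free(cr);
}

/* Helper that acts on a credential, not a process.
 * The caller must hold an exclusive reference. */
static void toy_change_euid(struct toy_ucred *cr, unsigned euid)
{
    cr->cr_uid = euid;
}

/* Change the effective uid using the newcred/oldcred arrangement. */
static int toy_seteuid(struct toy_proc *p, unsigned euid)
{
    struct toy_ucred *oldcred = p->p_ucred;
    struct toy_ucred *newcred = toy_crdup(oldcred);

    if (newcred == NULL)
        return -1;
    toy_change_euid(newcred, euid);
    p->p_ucred = newcred;
    toy_crfree(oldcred);
    return 0;
}
```

Because the process always gets a fresh copy, any other holder of the old credential keeps seeing the old values.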
Mark Murray
fb919e4d5a Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h from kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by:	bde (with reservations)
2001-05-01 08:13:21 +00:00
Robert Watson
c7e1887023 o Change a suser() call to a suser_xxx(..., PRISON_ROOT) call in the
linuxulator so as to allow privileged processes within a jail() to
  invoke the Linux initgroups() system call.  This allows the Linux
  "su" to work properly (better) when running a complete Linux
  environment under jail().  This problem was reported by Attila
  Nagy <bra@fsn.hu>.

Reviewed by:	marcel
2001-04-24 19:08:53 +00:00
John Baldwin
33a9ed9d0e Change the pfind() and zpfind() functions to lock the process that they
find before releasing the allproc lock and returning.

Reviewed by:	-smp, dfr, jake
2001-04-24 00:51:53 +00:00
Alan Cox
21c8cdfb96 Add linux_sched_get_priority_max() and linux_sched_get_priority_min(): The
policy parameter requires translation.
2001-04-01 06:37:40 +00:00
Andrew Gallatin
6d4aa00ac1 fix linux_times() to take into account linux's value of CLK_TCK on the alpha.
Previously, results were off by a factor of 10

Tested by: Yoriaki FUJIMORI <fujimori@grafin.fujimori.cache.waseda.ac.jp>
2001-03-23 19:22:21 +00:00
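
A factor-of-10 error of this kind is what one would expect from converting time to ticks with the wrong tick rate. The sketch below is an illustration only: the function name is hypothetical, and the rates used in the usage note (100 and 1024) are assumptions, not values taken from the commit.

```c
/* Convert a (seconds, microseconds) pair to clock ticks at the
 * given tick rate. Hypothetical helper for illustration. */
static long long toy_timeval_to_ticks(long long sec, long long usec,
    long long clk_tck)
{
    return sec * clk_tck + (usec * clk_tck) / 1000000LL;
}
```

For 2.5 seconds, a rate of 100 gives 250 ticks while a rate of 1024 gives 2560, roughly ten times more.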
Jonathan Lemon
2459336973 Allow debugging output to be controlled on a per-syscall granularity.
Also clean up debugging output in a slightly more uniform fashion.

The default behavior remains the same (all debugging output is turned on)
2001-02-16 16:40:43 +00:00
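
One way to get per-syscall control with "everything on" as the default is a bitmap in which a set bit suppresses output. The sketch below assumes that design; the names and the table size are hypothetical and do not come from the commit.

```c
#include <limits.h>

#define TOY_NSYSCALLS 256   /* assumed table size */
#define TOY_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* A set bit means "debug output suppressed". Zero-initialised
 * storage therefore leaves debugging on for every syscall. */
static unsigned long toy_debug_off[(TOY_NSYSCALLS + TOY_WORD_BITS - 1) /
    TOY_WORD_BITS];

/* Turn debugging for one syscall number on or off. */
static void toy_debug_set(unsigned nr, int on)
{
    unsigned long bit = 1UL << (nr % TOY_WORD_BITS);

    if (nr >= TOY_NSYSCALLS)
        return;
    if (on)
        toy_debug_off[nr / TOY_WORD_BITS] &= ~bit;
    else
        toy_debug_off[nr / TOY_WORD_BITS] |= bit;
}

/* Return nonzero if debug output is enabled for this syscall. */
static int toy_debug_enabled(unsigned nr)
{
    if (nr >= TOY_NSYSCALLS)
        return 0;
    return (toy_debug_off[nr / TOY_WORD_BITS] &
        (1UL << (nr % TOY_WORD_BITS))) == 0;
}
```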
Jonathan Lemon
705deb78a3 Add mount syscall to linux emulation. Also improve emulation of reboot.
2001-02-16 14:42:11 +00:00
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarly, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
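
The interface split described above moves the lock type from every call site to initialisation time. The user-space sketch below illustrates that idea only: the `toy_` names are hypothetical, the "sleep" lock merely yields the CPU instead of blocking, and recursion, MTX_QUIET and MTX_NOSWITCH are not modelled.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

enum toy_mtx_kind { TOY_MTX_DEF, TOY_MTX_SPIN };

struct toy_mtx {
    atomic_flag held;
    enum toy_mtx_kind kind;     /* fixed at init time */
};

static void toy_mtx_init(struct toy_mtx *m, enum toy_mtx_kind kind)
{
    atomic_flag_clear(&m->held);
    m->kind = kind;
}

/* Single attempt to take the lock; true on success. */
static bool toy_obtain_lock(struct toy_mtx *m)
{
    return !atomic_flag_test_and_set_explicit(&m->held,
        memory_order_acquire);
}

/* Sleep-lock entry point: no type argument at the call site. */
static void toy_mtx_lock(struct toy_mtx *m)
{
    assert(m->kind == TOY_MTX_DEF);
    while (!toy_obtain_lock(m))
        sched_yield();          /* stand-in for blocking */
}

static void toy_mtx_unlock(struct toy_mtx *m)
{
    assert(m->kind == TOY_MTX_DEF);
    atomic_flag_clear_explicit(&m->held, memory_order_release);
}

/* Spin-lock entry point: busy-waits until the lock is free. */
static void toy_mtx_lock_spin(struct toy_mtx *m)
{
    assert(m->kind == TOY_MTX_SPIN);
    while (!toy_obtain_lock(m))
        ;
}

static void toy_mtx_unlock_spin(struct toy_mtx *m)
{
    assert(m->kind == TOY_MTX_SPIN);
    atomic_flag_clear_explicit(&m->held, memory_order_release);
}
```

Calling the wrong entry point for a lock trips an assertion here, which mirrors the point that the two kinds of lock have different semantics.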
John Baldwin
ba88dfc733 Back out proc locking to protect p_ucred for obtaining additional
references along with the actual obtaining of additional references.
2001-01-27 00:01:31 +00:00
John Baldwin
fb29c3e083 Protect calcru() with sched_lock.
2001-01-23 20:50:40 +00:00
John Baldwin
216af8221e Lock access to proc members.
Glanced over by:	marcel
2000-12-15 19:41:27 +00:00
Marcel Moolenaar
b4c6727a3a Don't auto-generate the syscalls.
2000-12-03 01:30:31 +00:00
Jake Burkholder
4f55983606 Use callout_reset instead of timeout(9). Most callouts are statically
allocated, 2 have been added to struct proc for setitimer and sleep.

Reviewed by:	jhb, jlemon
2000-11-27 22:52:31 +00:00
Marcel Moolenaar
ebea866055 Revert auto-generation. The Alpha port is broken.
Syncing with it is wrong.
2000-11-10 21:30:19 +00:00
Marcel Moolenaar
2da829a0c8 Sync with Alpha:
Do not use sysent.c, proto.h and syscall.h in source tree;
use auto-generated versions.
2000-11-09 07:27:55 +00:00
David E. O'Brien
5231fb2059 The MI/MD split wasn't perfect and the MI files need hacks for the
AlphaLinux compat bits.  This will be better cleaned up soon.

Agreed to what ever was necessary by:	marcel
2000-11-01 19:48:35 +00:00
Marcel Moolenaar
4a22d85023 Fix bug in previous commit. We need to trim the limits to fit
the datatype (= long). Use ULONG_MAX and LONG_MAX to avoid
creating MD code.
2000-08-26 05:08:10 +00:00
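
Trimming a wide limit to fit a `long` without machine-dependent constants could look roughly like the sketch below. It assumes the native limit is a 64-bit unsigned quantity and simply clamps to LONG_MAX; the helper name is hypothetical and the commit's actual logic may differ.

```c
#include <limits.h>
#include <stdint.h>

/* Clamp a 64-bit limit so that it fits in a long. */
static long toy_trim_rlimit(uint64_t value)
{
    if (value > (uint64_t)LONG_MAX)
        return LONG_MAX;
    return (long)value;
}
```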
Marcel Moolenaar
eebc2a071f Re-implement linux_{g|s}etrlimit in terms of {g|s}etrlimit
instead of the o{g|s}etrlimit so that the dependency on
COMPAT_43 is removed.
2000-08-26 02:18:41 +00:00
Marcel Moolenaar
a751315ca8 Update include directives.
Move linux_select to MD code (i386 compat. syscall).

Move linux_fork, linux_vfork, linux_clone, linux_mmap,
linux_pipe, linux_ioperm, linux_iopl and linux_modify_ldt
to MD code.
2000-08-22 01:46:50 +00:00
Marcel Moolenaar
03567510a8 Add bounds checking to stackgap_alloc. Previously it was possible
to construct a path that was long enough (ie longer than
SPARE_USRSPACE bytes) and trash the stack.

Note that SPARE_USRSPACE is much smaller than MAXPATHLEN so that
the Linuxulator will now return ENAMETOOLONG even if the path
is smaller than MAXPATHLEN.

PR: 12749
2000-07-23 16:54:18 +00:00
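
The bounds check described above can be illustrated with a small bump allocator that refuses requests larger than the space left. This is a sketch under assumptions: the buffer size is made up, alignment is ignored, and the `toy_` names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_SPARE_USRSPACE 256  /* assumed size, for illustration */

struct toy_stackgap {
    char buf[TOY_SPARE_USRSPACE];
    size_t used;
};

/* Hand out sz bytes from the gap, or NULL if they do not fit. */
static void *toy_stackgap_alloc(struct toy_stackgap *sg, size_t sz)
{
    void *p;

    /* Compare against the remaining space so that the check
     * itself cannot overflow. */
    if (sz > sizeof(sg->buf) - sg->used)
        return NULL;
    p = sg->buf + sg->used;
    sg->used += sz;
    return p;
}

/* Copy a path into the gap; too-long paths yield ENAMETOOLONG. */
static int toy_copy_path(struct toy_stackgap *sg, const char *path,
    char **out)
{
    size_t len = strlen(path) + 1;
    char *dst = toy_stackgap_alloc(sg, len);

    if (dst == NULL)
        return ENAMETOOLONG;
    memcpy(dst, path, len);
    *out = dst;
    return 0;
}
```

A path longer than the gap is rejected even when it is shorter than the system's usual path limit, which matches the behaviour noted in the message.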
Marcel Moolenaar
a603fe5a07 Revert implementation of setfsuid and setfsgid due to security
issues.

Requested by: rwatson
Backed by: kris
2000-07-20 05:37:41 +00:00
Marcel Moolenaar
ddb48608ab Implement setfsuid and setfsgid. Implementation derived from patch
in PR.

PR: 16993
Submitted by: Bjoern Groenvall <bg@sics.se>
2000-07-16 21:23:34 +00:00
Martin Cracauer
6f6b2cd019 Linux allows an anonymous mmap with a file descriptor passed; FreeBSD
doesn't.  In the Linux emulation layer, ignore the fd passed when
MAP_ANON is specified.

Known application to be fixed: Xanalys/Harlequin Lispworks

Also improve debug output for mmap, now showing what the emulation
layer mapped to what (-DDEBUG).

Reviewed by:	marcel
2000-06-15 09:57:34 +00:00
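
The fd handling described above amounts to one extra step during argument translation. The sketch below shows that step in isolation; the flag values and the `toy_` names are illustrative assumptions, not the emulation layer's definitions.

```c
/* Illustrative flag values; treat them as assumptions. */
#define TOY_LINUX_MAP_SHARED    0x01
#define TOY_LINUX_MAP_PRIVATE   0x02
#define TOY_LINUX_MAP_ANON      0x20

#define TOY_BSD_MAP_SHARED      0x0001
#define TOY_BSD_MAP_PRIVATE     0x0002
#define TOY_BSD_MAP_ANON        0x1000

struct toy_mmap_args {
    int flags;
    int fd;
};

/* Translate Linux mmap flags and fd into native ones. */
static struct toy_mmap_args toy_translate_mmap(int linux_flags,
    int linux_fd)
{
    struct toy_mmap_args bsd = { 0, linux_fd };

    if (linux_flags & TOY_LINUX_MAP_SHARED)
        bsd.flags |= TOY_BSD_MAP_SHARED;
    if (linux_flags & TOY_LINUX_MAP_PRIVATE)
        bsd.flags |= TOY_BSD_MAP_PRIVATE;
    if (linux_flags & TOY_LINUX_MAP_ANON) {
        bsd.flags |= TOY_BSD_MAP_ANON;
        bsd.fd = -1;    /* ignore whatever fd the caller passed */
    }
    return bsd;
}
```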
Poul-Henning Kamp
2c9b67a8df Remove unneeded #include <vm/vm_zone.h>
Generated by:	src/tools/tools/kerninclude
2000-04-30 18:52:11 +00:00
Marcel Moolenaar
3c1124cfdf Fix bug in linux_wait4 and linux_waitpid where garbage in the status
argument could panic the kernel.

Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
Prompted by: jkh, gallatin
Approved by: prompters
2000-03-09 17:52:01 +00:00
Eivind Eklund
762e6b856c Introduce NDFREE (and remove VOP_ABORTOP)
1999-12-15 23:02:35 +00:00
Poul-Henning Kamp
923502ff91 useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>.  This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.
1999-10-29 18:09:36 +00:00
Marcel Moolenaar
956d3333ca sigset_t change (part 4 of 5)
-----------------------------

The compatibility code and/or emulators have been updated:

iBCS2 now mostly uses the older syscalls. SVR4 now properly
handles all signals. This has been achieved by using the
new sigset_t throughout the emulator. The Linuxulator has
been severely updated. Internally the new Linux sigset_t is
made the default. These are then mapped to and from the
new FreeBSD sigset_t.

Also, rt_sigsuspend has been implemented in the Linuxulator.
Implementing this syscall basically caused all this sigset_t
changing in the first place and the syscall has been used
throughout the change as a means for testing. It basically is
too much work to undo the implementation so that it can
later be added again.

A special note on the use of sv_sigtbl and sv_sigsize in
struct sysentvec:
Every signal larger than sv_sigsize is not translated and is
passed on to the signal handler unmodified. Signals in the
range 1 up to and including sv_sigsize are translated.
The rationale is that only the system defined signals need to
be translated.

The emulators also have been updated so that the translation
tables are only indexed for valid (system defined) signals.
This change also fixes the translation bug already in the
SVR4 emulator.
1999-09-29 15:12:18 +00:00
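
The sv_sigtbl/sv_sigsize rule in the note above can be written as a small lookup. This is a sketch only: the table contents are invented, and indexing the table with `sig - 1` is an assumption about the layout.

```c
#define TOY_SIGSIZE 4

/* Hypothetical translation table for signals 1..TOY_SIGSIZE. */
static const int toy_sigtbl[TOY_SIGSIZE] = { 1, 2, 30, 10 };

/* Translate signals in 1..sigsize; pass everything else through. */
static int toy_translate_signal(const int *sigtbl, int sigsize, int sig)
{
    if (sig >= 1 && sig <= sigsize)
        return sigtbl[sig - 1];
    return sig;
}
```

The range check also guarantees that the table is only ever indexed for valid, system-defined signals.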
Luoqi Chen
2323686abc Implement linux_ioperm() syscall. Fix linux_iopl() to use the level argument.
SVGAlib should now work.

Reviewed by:	marcel
1999-09-22 22:01:51 +00:00
Marcel Moolenaar
6771d80337 I missed the namechange of field desc in struct i386_ldt_args into descs while
reviewing luoqi's changes...

Pointed out by: luoqi
1999-09-03 06:18:39 +00:00
Marcel Moolenaar
ff78e85043 Implementation of the modify_ldt syscall. Use the sysarch() interface to do
the actual work. When USER_LDT is not defined for a kernel, sysarch returns
EOPNOTSUPP. Display a message in that case and return ENOSYS to userland.

Reviewed by: luoqi
1999-09-02 21:50:42 +00:00
Marcel Moolenaar
d4c45842d7 Fix a missing '-1' in the size argument of copyout in getgroups. Spotted while
reviewing the MFC in -stable.
1999-08-29 08:52:38 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$
1999-08-28 01:08:13 +00:00
Marcel Moolenaar
c6dfea0ebd Add sysctl variables for the Linuxulator. These reside under `compat.linux' as
discussed on current.

The following variables are defined (for now):

    osname (defaults to "Linux")
        Allow users to change the name of the OS as returned by uname(2),
        specially added for all those Linux Netscape users and statistics
        maniacs :-) We now have what we all wanted!

    osrelease (defaults to "2.2.5")
        Allow users to change the version of the OS as returned by uname(2).
        Since -current supports glibc2.1 now, change the default to 2.2.5
        (was 2.0.36).

    oss_version (defaults to 198144 [0x030600])
        This one will be used by the OSS_GETVERSION ioctl (PR 12917) which I
        can commit now that we have the MIB. The default version number is the
        lowest version possible with the current 'encoding'.

A note about imprisoned processes (see jail(2)):
  These variables are copy-on-write (as suggested by phk). This means that
  imprisoned processes will use the system wide value unless it is written/set
  by the process. From that moment on, a copy local to the prison will be
  used.

A note about the implementation:
  I chose to add a single pointer to struct prison, because I didn't like the
  idea of changing struct prison every time I come up with a new variable. As
  a side effect, the extra storage is only needed when a variable is set from
  within the prison. This also minimizes kernel bloat when the Linuxulator is
  not used; both compiled in or as a module.

Reviewed by: bde (first version only) and phk
1999-08-27 19:47:41 +00:00
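
The copy-on-write behaviour and the single-pointer design described above could be sketched as follows. This is an illustration under assumptions: only osname is modelled, an empty string marks "not set", and all `toy_` names are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

#define TOY_OSNAME_MAX 32

/* System-wide value, used until a prison overrides it. */
static char toy_global_osname[TOY_OSNAME_MAX] = "Linux";

/* Per-prison storage, allocated only on the first write. */
struct toy_linux_prison {
    char pr_osname[TOY_OSNAME_MAX];
};

/* The prison itself carries just one extra pointer. */
struct toy_prison {
    struct toy_linux_prison *pr_linux;
};

/* Read osname: prison-local copy if set, else the global value. */
static const char *toy_get_osname(const struct toy_prison *pr)
{
    if (pr != NULL && pr->pr_linux != NULL &&
        pr->pr_linux->pr_osname[0] != '\0')
        return pr->pr_linux->pr_osname;
    return toy_global_osname;
}

/* Write osname: pr == NULL means "not imprisoned". */
static int toy_set_osname(struct toy_prison *pr, const char *name)
{
    if (pr == NULL) {
        snprintf(toy_global_osname, sizeof(toy_global_osname),
            "%s", name);
        return 0;
    }
    if (pr->pr_linux == NULL) {
        pr->pr_linux = calloc(1, sizeof(*pr->pr_linux));
        if (pr->pr_linux == NULL)
            return -1;
    }
    snprintf(pr->pr_linux->pr_osname,
        sizeof(pr->pr_linux->pr_osname), "%s", name);
    return 0;
}
```

A prison that never writes a variable costs one NULL pointer and keeps following the system-wide value.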
Marcel Moolenaar
c85f67175f Fix {g|s}etgroups semantics. We use cr_groups[0] to hold egid. This means that
egid will be twice in the set and that setting cr_groups[0] will change egid.
This is simply solved by ignoring cr_groups[0]. That is; linux_getgroups does
not return cr_groups[0] and linux_setgroups does not touch it.

Noticed by: bde
Brought to my attention by: sheldonh
1999-08-25 14:11:01 +00:00
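
Skipping cr_groups[0] in both directions might look like the sketch below. It is a simplified user-space model: plain arrays stand in for the credential, -1 stands in for an error return, and the `toy_` names are hypothetical.

```c
/* Return the supplementary groups, leaving out slot 0 (the egid).
 * With outsize == 0 only the count is returned. */
static int toy_linux_getgroups(const unsigned *cr_groups, int cr_ngroups,
    unsigned *out, int outsize)
{
    int n = cr_ngroups > 0 ? cr_ngroups - 1 : 0;
    int i;

    if (outsize == 0)
        return n;
    if (outsize < n)
        return -1;
    for (i = 0; i < n; i++)
        out[i] = cr_groups[i + 1];
    return n;
}

/* Replace the supplementary groups without touching slot 0. */
static int toy_linux_setgroups(unsigned *cr_groups, int *cr_ngroups,
    int cr_max, const unsigned *in, int n)
{
    int i;

    if (n < 0 || n + 1 > cr_max)
        return -1;
    for (i = 0; i < n; i++)
        cr_groups[i + 1] = in[i];
    *cr_ngroups = n + 1;
    return 0;
}
```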
Marcel Moolenaar
2fdc82e093 Change all UNIMPL syscalls to STD and add them to linux_dummy. Now we always
know if and when an unimplemented or obsoleted syscall is being used. Make the
message more end-user friendly.

And as long as we're here, rename some unimplemented syscalls (linux_phys ->
linux_umount2, linux_vm86 -> linux_vm86old, linux_new_vm86 -> linux_vm86).

Change prototype for linux_newuname from `struct linux_newuname_t *' into
`struct linux_new_utsname *'. This change is reflected in linux.h and
linux_misc.c.
1999-08-25 11:19:03 +00:00
Marcel Moolenaar
ce2b2a92fc Fix bug in the debug-printf of the vfork syscall, where the format specifier
didn't match the argument (p->p_pid).

While I'm at it, also fix the dupo in the format string and fix the annoying
inconsistency in all the debug-printfs wrt p_pid arguments. Change all of them
to use the %ld format specifier and cast the p_pid arguments to long.

Submitted by: billf
1999-08-17 10:09:06 +00:00
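
The %ld-plus-cast convention works because the width of a pid type is not guaranteed to match any one printf specifier. A minimal sketch follows; the message text and the helper name are hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>

/* Format a debug line; the pid is cast to long to match %ld. */
static int toy_format_vfork_debug(char *buf, size_t len, pid_t pid)
{
    return snprintf(buf, len, "Linux-emul(%ld): vfork()", (long)pid);
}
```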
Marcel Moolenaar
42035021f5 Implement linux_vfork() syscall by calling vfork(). Analogous to the
linux_fork() implementation.
1999-08-16 11:49:30 +00:00
Marcel Moolenaar
a171f5adb6 Provide wrappers for sched_{s|g}etscheduler. We need to convert the policy
argument.

PR: 12006
Originator: Jean-Claude MICHOT <jcmichot@teaser.fr>
1999-08-15 17:28:40 +00:00
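
The policy conversion such wrappers need is a two-way mapping between small integers. The sketch below assumes the numeric values shown in the macros; they are included for illustration and should be checked against the real headers.

```c
/* Assumed policy numbers on each side. */
#define TOY_LINUX_SCHED_OTHER   0
#define TOY_LINUX_SCHED_FIFO    1
#define TOY_LINUX_SCHED_RR      2

#define TOY_BSD_SCHED_FIFO      1
#define TOY_BSD_SCHED_OTHER     2
#define TOY_BSD_SCHED_RR        3

/* Linux policy -> native policy, or -1 if unknown. */
static int toy_policy_to_bsd(int linux_policy)
{
    switch (linux_policy) {
    case TOY_LINUX_SCHED_OTHER:
        return TOY_BSD_SCHED_OTHER;
    case TOY_LINUX_SCHED_FIFO:
        return TOY_BSD_SCHED_FIFO;
    case TOY_LINUX_SCHED_RR:
        return TOY_BSD_SCHED_RR;
    default:
        return -1;
    }
}

/* Native policy -> Linux policy, or -1 if unknown. */
static int toy_policy_to_linux(int bsd_policy)
{
    switch (bsd_policy) {
    case TOY_BSD_SCHED_OTHER:
        return TOY_LINUX_SCHED_OTHER;
    case TOY_BSD_SCHED_FIFO:
        return TOY_LINUX_SCHED_FIFO;
    case TOY_BSD_SCHED_RR:
        return TOY_LINUX_SCHED_RR;
    default:
        return -1;
    }
}
```

A setscheduler wrapper would translate on the way in, and a getscheduler wrapper on the way out.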
Marcel Moolenaar
20c661befb Include opt_compat.h so that COMPAT_43 is defined. This gives us the proper
prototypes of o{s|g}etrlimit (from sys/sysproto.h). Update linux_{s|g}etrlimit
so that the arguments to o{s|g}etrlimit correspond to the prototypes.

Pointed out by: bde
1999-08-15 13:28:35 +00:00
Marcel Moolenaar
175db64b3e Do not map {s|g}etrlimit onto FreeBSD syscalls. The arguments don't match.
The linux syscalls translate the arguments first before invoking the
FreeBSD native syscalls.

PR: kern/9591
Originator: John Plevyak <jplevyak@inktomi.com>
1999-08-11 13:34:31 +00:00
Marcel Moolenaar
6a6ea79ac8 Fix page fault in linux_uselib syscall.
PR: 12910
Submitted by: Peter Holm <peter@holm.cc>
1999-08-08 11:26:46 +00:00
Marcel Moolenaar
19e520961c Let newuname return "Linux" as the OS name and not "FreeBSD". Also, return a
more sensible (for Linux applications) release number. Hardcoding a release
number has its drawbacks, but it will do for now.
1999-07-05 19:18:03 +00:00
Peter Wemm
d5558c001a Fix up a few easy 'assignment used as truth value' and 'suggest parens
around && within ||' type warnings.  I'm pretty sure I have not masked
any problems here, I've committed real problem fixes separately.
1999-05-06 18:44:42 +00:00
Luoqi Chen
5206bca10a Enable vmspace sharing on SMP. Major changes are,
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector
  is added to point to per-cpu pages, per-cpu global variables are now
  accessed through this new selector (%fs). The selectors in gdt table are
  rearranged for cache line optimization.
- fast_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		John Dyson	<dyson@iquest.net>
		Julian Elischer	<julian@whistel.com>
		Bruce Evans	<bde@zeta.org.au>
		David Greenman	<dg@root.com>
1999-04-28 01:04:33 +00:00
Poul-Henning Kamp
1c308b817a Change suser_xxx() to suser() where it applies.
1999-04-27 12:21:16 +00:00
Poul-Henning Kamp
f711d546d2 Suser() simplification:
1:
  s/suser/suser_xxx/

2:
  Add new function: suser(struct proc *), prototyped in <sys/proc.h>.

3:
  s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/

The remaining suser_xxx() calls will be scrutinized and dealt with
later.

There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.

More changes to the suser() API will come along with the "jail" code.
1999-04-27 11:18:52 +00:00
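
The substitution in step 3 implies that the new one-argument form is shorthand for the common two-argument call. A user-space sketch of that relationship is below; the structures are simplified, the flag value is assumed, and the `toy_` names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_ASU 0x02    /* assumed "used superuser" accounting flag */

struct toy_ucred {
    unsigned cr_uid;
};

struct toy_proc {
    struct toy_ucred *p_ucred;
    unsigned short p_acflag;
};

/* Old-style check: explicit credential and accounting flag. */
static int toy_suser_xxx(const struct toy_ucred *cred,
    unsigned short *acflag)
{
    if (cred->cr_uid == 0) {
        if (acflag != NULL)
            *acflag |= TOY_ASU;
        return 0;
    }
    return EPERM;
}

/* New-style check: takes the process and fills in the rest. */
static int toy_suser(struct toy_proc *p)
{
    return toy_suser_xxx(p->p_ucred, &p->p_acflag);
}
```

Call sites that pass anything other than the process's own credential and flag do not fit this pattern, which is why some suser_xxx() calls remain.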
Peter Wemm
db42d90829 unifdef -DVM_STACK - it's been on for a while for x86 and was checked
and appeared to be working for the Alpha some time ago.
1999-04-19 14:14:14 +00:00