syscall path inward. A system call may select whether it needs the MP
lock or not (the default being that it does need it).
A great deal of conditional SMP code for various deadended experiments
has been removed. 'cil' and 'cml' have been removed entirely, and the
locking around the cpl has been removed. The conditional
separately-locked fast-interrupt code has been removed, meaning that
interrupts must hold the CPL now (but they pretty much had to anyway).
Another reason for doing this is that the original separate-lock for
interrupts just doesn't apply to the interrupt thread mechanism being
contemplated.
Modifications to the cpl may now ONLY occur while holding the MP
lock. For example, if an otherwise MP safe syscall needs to mess with
the cpl, it must hold the MP lock for the duration and must (as usual)
save/restore the cpl in a nested fashion.
This is precursor work for the real meat coming later: avoiding having
to hold the MP lock for common syscalls and I/O's and interrupt threads.
It is expected that the spl mechanisms and new interrupt threading
mechanisms will be able to run in tandem, allowing a slow piecemeal
transition to occur.
This patch should result in a moderate performance improvement due to
the considerable amount of code that has been removed from the critical
path, especially the simplification of the spl*() calls. The real
performance gains will come later.
Approved by: jkh
Reviewed by: current, bde (exception.s)
Some work taken from: luoqi's patch
my tree for ages (~2 years) waiting for an excuse to commit it. Now Linux
has implemented it and it seems that Staroffice (when using the
linux_base6.1 port's libc) calls this in the linux emulator and dies in
setup. The Linux emulator can call these now.
to wake up any processes waiting via PIOCWAIT on process exit, and truss
needs to be more aware that a process may actually disappear while it's
waiting.
Reviewed by: Paul Saab <ps@yahoo-inc.com>
login (or not if root)
then exit the shell
truss will get stuct in tsleep
I dont know if this is correct, but it fixes the problem and
according to the commends in pioctl.h, PF_ISUGID is set when we
want to ignore UID changes.
The code is checking for when PF_ISUGID is not set and since it
never is set, we always ignore UID changes.
Submitted by: Paul Saab <ps@yahoo-inc.com>
p_trespass(struct proc *p1, struct proc *p2)
which returns zero or an errno depending on the legality of p1 trespassing
on p2.
Replace kern_sig.c:CANSIGNAL() with call to p_trespass() and one
extra signal related check.
Replace procfs.h:CHECKIO() macros with calls to p_trespass().
Only show command lines to process which can trespass on the target
process.
This is a seriously beefed up chroot kind of thing. The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.
For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact: "real virtual servers".
Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.
Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.
It generally does what one would expect, but setting up a jail
still takes a little knowledge.
A few notes:
I have no scripts for setting up a jail, don't ask me for them.
The IP number should be an alias on one of the interfaces.
mount a /proc in each jail, it will make ps more useable.
/proc/<pid>/status tells the hostname of the prison for
jailed processes.
Quotas are only sensible if you have a mountpoint per prison.
There are no privisions for stopping resource-hogging.
Some "#ifdef INET" and similar may be missing (send patches!)
If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!
Tools, comments, patches & documentation most welcome.
Have fun...
Sponsored by: http://www.rndassociates.com/
Run for almost a year by: http://www.servetheweb.com/
1:
s/suser/suser_xxx/
2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.
3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/
The remaining suser_xxx() calls will be scrutinized and dealt with
later.
There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.
More changes to the suser() API will come along with the "jail" code.
dereference a NULL pointer, causing a panic. Instead of following
s_leader to find the session id, store it in the session structure.
Jukka found the following info:
BTW - I just found what I have been looking for. Std 1003.1
Part 1: SYSTEM API [C LANGUAGE] section 2.2.2.80 states quite
explicitly...
Session lifetime: The period between when a session is created
and the end of lifetime of all the process groups that remain
as members of the session.
So, this quite clearly tells that while there is any single
process in any process group which is a member of the session,
the session remains as an independent entity.
Reviewed by: peter
Submitted by: "Jukka A. Ukkonen" <jau@jau.tmt.tele.fi>
flag is set in the p_pfsflags field. This, essentially, prevents an SUID
proram from hanging after being traced. (E.g., "truss /usr/bin/rlogin" would
fail, but leave rlogin in a stopevent state.) Yet another case where procctl
is (hopefully ;)) no longer needed in the general case.
Reviewed by: bde (thanks bruce :))
it in struct proc instead.
This fixes a boatload of compiler warning, and removes a lot of cruft
from the sources.
I have not removed the /*ARGSUSED*/, they will require some looking at.
libkvm, ps and other userland struct proc frobbing programs will need
recompiled.
Distribute all but the most fundamental malloc types. This time I also
remembered the trick to making things static: Put "static" in front of
them.
A couple of finer points by: bde
same syscall number as NetBSD/OpenBSD. The getpgid() came from NetBSD
(I think) originally, but it's basically cut/paste/edit from the other
simple get*() syscalls.
by bde.
Don't return EPERM in setre[ug]id() just because the caller passes in
the current effective id in the second arg (ie: no change), as suggested
by ache.
This is valueable for library code which needs to be able to find out
whether the current process is or *was* set[ug]id at some point in the
past, and may have a "tainted" execution environment. This is especially
a problem with the trend to immediately revoke privs at startup and regain
them for critical sections. One problem with this is that if a cracker
is able to compromise the program while it's still got a saved id, the
cracker can direct the program to regain the privs. Another problem is
that the user may be able to affect the program in some other way (eg:
setting resolver host aliases) and the library code needs to know when it
should disable these sorts of features.
Reviewed by: ache
Inspired by: OpenBSD (but with a different implementation)
that allows traditional BSD setuid/setgid behavior.
The only visible difference should be that a non-root setuid program
(eg: inn's "rnews" program) that is setuid to news, can completely
"become" uid news. (ie: setuid(geteuid()) This was allowed in
traditional 4.2/4.3BSD and is now "blessed" by Posix as a special
case of "appropriate privilige".
Also, be much more careful with the P_SUGID flag so that we can use it
for issetugid() - only set it if something changed.
Reviewed by: ache
vector except for the egid in groups[0]. There is a risk that programs
that come from SYSV/Linux that expect this to work and don't check for
error returns may accidently pass root's groups on to child processes.
We now do what is least suprising (to non BSD programs/programmers) in
this scenario, and nothing is changed for programs written with BSD groups
rules in mind.
Reviewed by: ache
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
structs and prototypes for syscalls.
Ifdefed duplicated decentralized declarations of args structs. It's
convenient to have this visible but they are hard to maintain. Some
are already different from the central declarations. 4.4lite2 puts
them in comments in the function headers but I wanted to avoid the
large changes for that.
Prototypes are located in <sys/sysproto.h>.
Add appropriate #include <sys/sysproto.h> to files that needed
protos from systm.h.
Add structure definitions to appropriate files that relied on sys/systm.h,
right before system call definition, as in the rest of the kernel source.
In kern_prot.c, instead of using the dummy structure "args", create
individual dummy structures named <syscall>_args. This makes
life easier for prototype generation.