Commit Graph

1900 Commits

Author SHA1 Message Date
jonathan
5ecd1c9d40 Add experimental support for process descriptors
A "process descriptor" file descriptor is used to manage processes
without using the PID namespace. This is required for Capsicum's
Capability Mode, where the PID namespace is unavailable.

New system calls pdfork(2) and pdkill(2) offer the functional equivalents
of fork(2) and kill(2). pdgetpid(2) allows querying the PID of the remote
process for debugging purposes. The currently-unimplemented pdwait(2) will,
in the future, allow querying rusage/exit status. In the interim, poll(2)
may be used to check (and wait for) process termination.

When a process is referenced by a process descriptor, it does not issue
SIGCHLD to the parent, making it suitable for use in libraries---a common
scenario when using library compartmentalisation from within large
applications (such as web browsers). Some observers may note a similarity
to Mach task ports; process descriptors provide a subset of this behaviour,
but in a UNIX style.

This feature is enabled by "options PROCDESC", but as with several other
Capsicum kernel features, is not enabled by default in GENERIC 9.0.

Reviewed by: jhb, kib
Approved by: re (kib), mentor (rwatson)
Sponsored by: Google Inc
2011-08-18 22:51:30 +00:00
rwatson
4af919b491 Second-to-last commit implementing Capsicum capabilities in the FreeBSD
kernel for FreeBSD 9.0:

Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *.  With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.

Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.

In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.

Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.

Approved by:	re (bz)
Submitted by:	jonathan
Sponsored by:	Google Inc
2011-08-11 12:30:23 +00:00
kib
96a4fe50dc Implement the linprocfs swaps file, providing information about the
configured swap devices in the Linux-compatible format.

Based on the submission by:	Robert Millan <rmh debian org>
PR:	kern/159281
Reviewed by:	bde
Approved by:	re (kensmith)
MFC after:	2 weeks
2011-08-01 19:12:15 +00:00
bz
1a8cc2bad9 Rename ki_ocomm to ki_tdname and OCOMMLEN to TDNAMLEN.
Provide backward compatibility defines under BURN_BRIDGES.

Suggested by:	jhb
Reviewed by:	emaste
Sponsored by:	Sandvine Incorporated
Approved by:	re (kib)
2011-07-18 20:06:15 +00:00
marck
2f33209e13 Correct small typo in a do{}while(0) define
Approved by:	kib
MFC after:	2 weeks
2011-07-17 17:12:17 +00:00
bz
0b1276123e Remove the 'either' from the comment as it'll be less obvious that we
removed semmap in a bit of time from now.   Re-wrap.

Suggested by:	jhb
2011-07-17 05:33:22 +00:00
jonathan
5132d7b9f3 Auto-generated system call code with cap_new(), cap_getrights().
Approved by: mentor (rwatson), re (Capsicum blanket)
Sponsored by: Google Inc
2011-07-15 18:33:12 +00:00
jonathan
4ec3aaddb5 Add cap_new() and cap_getrights() system calls.
Implement two previously-reserved Capsicum system calls:
- cap_new() creates a capability to wrap an existing file descriptor
- cap_getrights() queries the rights mask of a capability.

Approved by: mentor (rwatson), re (Capsicum blanket)
Sponsored by: Google Inc
2011-07-15 18:26:19 +00:00
bz
8448ba638c Remove semaphore map entry count "semmap" field and its tuning
option that is highly recommended to be adjusted in too much
documentation while doing nothing in FreeBSD since r2729 (rev 1.1).

ipcs(1) needs to be recompiled as it is accessing _KERNEL private
variables.

Reviewed by:	jhb (before comment change on linux code)
Sponsored by:	Sandvine Incorporated
2011-07-14 14:18:14 +00:00
pluknet
108671f810 Return empty cmdline/environ string for processes with kernel address
space. This is consistent with the behavior in linux.

PR:		kern/157871
Reported by:	Petr Salinger <Petr Salinger att seznam cz>
Verified on:	GNU/kFreeBSD debian 8.2-1-amd64 (by reporter)
Reviewed by:	kib (some time ago)
MFC after:	2 weeks
2011-06-17 07:30:56 +00:00
kib
7c77d46862 Regen. 2011-06-16 22:06:35 +00:00
kib
d5407645c7 Implement compat32 for old lseek, for the a.out binaries on amd64. 2011-06-16 22:05:56 +00:00
netchild
428c437c67 Commit the missing linux_videdev2_compat.h (lost somewhere between
commit tree patch generation -> successful compile tree build test -> commmit).

Pointy hat to:	netchild
2011-05-04 13:09:20 +00:00
netchild
897499f709 Add FEATURE macros for v4l and v4l2 to the linuxulator.
Suggested by:	ae
2011-05-04 09:52:34 +00:00
netchild
939c0e0ef5 This is v4l2 support for the linuxulator. This allows to access FreeBSD
native devices which support the v4l2 API from processes running within
the linuxulator, e.g. skype or flash can access the multimedia/pwcbsd
or multimedia/webcamd supplied drivers.

Submitted by:	nox
MFC after:	1 month
2011-05-04 09:05:39 +00:00
netchild
5e0eb77d30 Fix typo in comment, improve comment. 2011-05-04 08:42:31 +00:00
netchild
991566d22b Add explanation about the use-permission and FreeBSDify it. 2011-05-04 08:41:55 +00:00
netchild
443cec2596 Copy the v4l2 header unchanged from the vendor branch. 2011-05-04 08:31:58 +00:00
mdf
b0f8474766 Regen. 2011-04-18 16:32:47 +00:00
mdf
9c9a32d97b Add the posix_fallocate(2) syscall. The default implementation in
vop_stdallocate() is filesystem agnostic and will run as slow as a
read/write loop in userspace; however, it serves to correctly
implement the functionality for filesystems that do not implement a
VOP_ALLOCATE.

Note that __FreeBSD_version was already bumped today to 900036 for any
ports which would like to use this function.

Also reserve space in the syscall table for posix_fadvise(2).

Reviewed by:	-arch (previous version)
2011-04-18 16:32:22 +00:00
trasz
935751ee5c Remove stray semicolon. 2011-04-10 10:15:49 +00:00
jkim
95c723445e Use atomic load & store for TSC frequency. It may be overkill for amd64 but
safer for i386 because it can be easily over 4 GHz now.  More worse, it can
be easily changed by user with 'machdep.tsc_freq' tunable (directly) or
cpufreq(4) (indirectly).  Note it is intentionally not used in performance
critical paths to avoid performance regression (but we should, in theory).
Alternatively, we may add "virtual TSC" with lower frequency if maximum
frequency overflows 32 bits (and ignore possible incoherency as we do now).
2011-04-07 23:28:28 +00:00
trasz
92bec9b84c Add accounting for most of the memory-related resources.
Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-04-05 20:23:59 +00:00
kib
4999e59537 Implement compat32 shims for PCIOCGETCONF.
There is a generic problem with the shims for ioctls that receive
pointers to the usermode data areas in the data argument. We either have
to modify the handler to accept UIO_USERSPACE/UIO_SYSSPACE indicator, or
allocate and fill a usermode memory for data buffer in the host format.
The change goes the second route, in particular because we do not need
to modify the handler.

Submitted by:	John Wehle <john feith com>
MFC after:	2 weeks
2011-04-02 16:02:25 +00:00
kib
72cac8e666 Provide the structures and ioctl number definition for handling
PCIOCGETCONF compat32.

Submitted by:	John Wehle <john feith com>
MFC after:	2 weeks
2011-04-02 15:47:23 +00:00
kib
e452eed194 Regen 2011-04-01 11:16:53 +00:00
kib
7c2eaa21fe Add support for executing the FreeBSD 1/i386 a.out binaries on amd64.
In particular:
- implement compat shims for old stat(2) variants and ogetdirentries(2);
- implement delivery of signals with ancient stack frame layout and
  corresponding sigreturn(2);
- implement old getpagesize(2);
- provide a user-mode trampoline and LDT call gate for lcall $7,$0;
- port a.out image activator and connect it to the build as a module
  on amd64.

The changes are hidden under COMPAT_43.

MFC after:   1 month
2011-04-01 11:16:29 +00:00
avg
94ec7d2988 Revert r220032:linux compat: add SO_PASSCRED option with basic handling
I have not properly thought through the commit.  After r220031 (linux
compat: improve and fix sendmsg/recvmsg compatibility) the basic
handling for SO_PASSCRED is not sufficient as it breaks recvmsg
functionality for SCM_CREDS messages because now we would need to handle
sockcred data in addition to cmsgcred.  And that is not implemented yet.

Pointyhat to:	avg
2011-03-31 08:14:51 +00:00
trasz
3adbd8337d Regenerate. 2011-03-30 17:59:54 +00:00
trasz
2f99052d80 Add rctl. It's used by racct to take user-configurable actions based
on the set of rules it maintains and the current resource usage.  It also
privides userland API to manage that ruleset.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-03-30 17:48:15 +00:00
kib
dfc44ec337 Regen. 2011-03-30 14:46:55 +00:00
kib
5c02dd1e9a Provide compat32 shims for kldstat(2).
Requested and tested by:	jpaetzel
MFC after:	1 week
2011-03-30 14:46:12 +00:00
avg
df7a39b1d0 linux compat: add SO_PASSCRED option with basic handling
This seems to have been a part of a bigger patch by dchagin that either
haven't been committed or committed partially.

Submitted by:	dchagin, nox
MFC after:	2 weeks
2011-03-26 11:25:36 +00:00
avg
a923658583 linux compat: improve and fix sendmsg/recvmsg compatibility
- implement baseic stubs for capget, capset, prctl PR_GET_KEEPCAPS
  and prctl PR_SET_KEEPCAPS.
- add SCM_CREDS support to sendmsg and recvmsg
- modify sendmsg to ignore control messages if not using UNIX
  domain sockets

This should allow linux pulse audio daemon and client work on FreeBSD
and interoperate with native counter-parts modulo the differences in
pulseaudio versions.

PR:		kern/149168
Submitted by:	John Wehle <john@feith.com>
Reviewed by:	netchild
MFC after:	2 weeks
2011-03-26 11:05:53 +00:00
kib
948b7589fc Implement compat32 MEMRANGE_GET and MEMRANGE_SET. This is needed to
run 32bit Xorg server with VESA driver.

Submitted by:	John Wehle <john feith com>
MFC after:	1 week
2011-03-25 11:52:31 +00:00
kib
0d40bf4b19 Fully emulate MDIOCLIST for compat32.
MFC after:	1 week
2011-03-25 11:43:49 +00:00
kib
abbe29dccf Remove unneccessary panics, that can be easily triggered by user.
The copyin() function handles NULL as well as any other pointer.

MFC after:	3 days
2011-03-25 11:05:28 +00:00
kib
1d38d5630c Fix file leakage in the freebsd32_ioctl routines.
Code inspection shows freebsd32_ioctl calls fget for a fd and calls
a subroutine to handle each specific ioctl.  It is expected that the
subroutine will call fdrop when done.  However many of the subroutines
will exit out early if copyin encounters an error resulting in fdrop
never being called.

Submitted by:	John Wehle <john feith com>
MFC after:	3 days
2011-03-25 10:57:57 +00:00
jhb
c7ac62aecd Fix some locking nits with the p_state field of struct proc:
- Hold the proc lock while changing the state from PRS_NEW to PRS_NORMAL
  in fork to honor the locking requirements.  While here, expand the scope
  of the PROC_LOCK() on the new process (p2) to avoid some LORs.  Previously
  the code was locking the new child process (p2) after it had locked the
  parent process (p1).  However, when locking two processes, the safe order
  is to lock the child first, then the parent.
- Fix various places that were checking p_state against PRS_NEW without
  having the process locked to use PROC_LOCK().  Every place was already
  locking the process, just after the PRS_NEW check.
- Remove or reduce the use of PROC_SLOCK() for places that were checking
  p_state against PRS_NEW.  The PROC_LOCK() alone is sufficient for reading
  the current state.
- Reorder fill_kinfo_proc() slightly so it only acquires PROC_SLOCK() once.

MFC after:	1 week
2011-03-24 18:40:11 +00:00
netchild
2d4b88f638 Staticize functions which are not used somewhere else, move the
corresponding prototypes from the header to the code file.
2011-03-15 13:40:47 +00:00
avg
5a2a285ac9 add DTrace systrace support for linux32 and freebsd32 on amd64 syscalls
Regenerate system call and systrace support files.

PR:		kern/152822
Submitted by:	Artem Belevich <fbsdlist@src.cx>
Reviewed by:	jhb (earlier version)
MFC after:	3 weeks
2011-03-12 08:58:19 +00:00
avg
666906fcd7 add DTrace systrace support for linux32 and freebsd32 on amd64 syscalls
This commits makes necessary changes in syscall/sysent generation
infrastructure.

PR:		kern/152822
Submitted by:	Artem Belevich <fbsdlist@src.cx>
Reviewed by:	jhb (ealier version)
MFC after:	3 weeks
2011-03-12 08:51:43 +00:00
dchagin
1878bfa86d Style(9) fixes. No functional changes.
MFC after:	2 Week
2011-03-12 07:47:05 +00:00
jhb
faa7c47cee Remove now-obsolete comment.
Submitted by:	netchild
MFC after:	1 week
2011-03-10 19:50:12 +00:00
jkim
a023421583 Remove custom interrupt dispatcher. This is a pointless micro-optimization
and it may cause problems if SS and SP are modified by real-mode code.

MFC after:	1 month
2011-03-09 16:16:38 +00:00
dchagin
83dbcc9afd Indeed, remove bogus since r219405 check of the Linux ABI.
Pointed out:	jhb

MFC after:	2 Week
2011-03-09 05:59:33 +00:00
dchagin
69b8756d3d Extend struct sysvec with new method sv_schedtail, which is used for an
explicit process at fork trampoline path instead of eventhadler(schedtail)
invocation for each child process.

Remove eventhandler(schedtail) code and change linux ABI to use newly added
sysvec method.

While here replace explicit comparing of module sysentvec structure with the
newly created process sysentvec to detect the linux ABI.

Discussed with:	kib

MFC after:	2 Week
2011-03-08 19:01:45 +00:00
trasz
1618438630 Export login class information via kinfo and make it possible to view
it using "ps -o class".
2011-03-05 14:41:49 +00:00
trasz
0525662d59 Regenerate. 2011-03-05 12:46:24 +00:00
trasz
62f6a13e39 Add two new system calls, setloginclass(2) and getloginclass(2). This makes
it possible for the kernel to track login class the process is assigned to,
which is required for RCTL.  This change also make setusercontext(3) call
setloginclass(2) and makes it possible to retrieve current login class using
id(1).

Reviewed by:	kib (as part of a larger patch)
2011-03-05 12:40:35 +00:00