212 Commits

Author SHA1 Message Date
Robert Watson
32f9753cfb Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.

Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.

We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths.  Do, however, move those prototypes to priv.h.

Reviewed by:	csjp
Obtained from:	TrustedBSD Project
2007-06-12 00:12:01 +00:00
Attilio Rao
a1fe14bc33 rufetch and calcru sometimes should be called atomically together.
This patch fixes places where they should be called atomically changing
their locking requirements (both assume per-proc spinlock held) and
introducing rufetchcalc which wrappers both calls to be performed in
atomic way.

Reviewed by: jeff
Approved by: jeff (mentor)
2007-06-09 21:48:44 +00:00
Attilio Rao
2feb50bf7d Revert VMCNT_* operations introduction.
Probabilly, a general approach is not the better solution here, so we should
solve the sched_lock protection problems separately.

Requested by: alc
Approved by: jeff (mentor)
2007-05-31 22:52:15 +00:00
Konstantin Belousov
9e223287c0 Revert UF_OPENING workaround for CURRENT.
Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation
argument from being file descriptor index into the pointer to struct file.

Proposed and reviewed by:	jhb
Reviewed by:	daichi (unionfs)
Approved by:	re (kensmith)
2007-05-31 11:51:53 +00:00
Jeff Roberson
222d01951f - define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating
vmcnts.  This can be used to abstract away pcpu details but also changes
   to use atomics for all counters now.  This means sched lock is no longer
   responsible for protecting counts in the switch routines.

Contributed by:		Attilio Rao <attilio@FreeBSD.org>
2007-05-18 07:10:50 +00:00
Alexander Leidinger
802e08a360 Partial MFp4 of 114977:
Whitespace commit: Fix grammar, spelling and punctuation.

Submitted by:	"Scot Hetzel" <swhetzel@gmail.com>
2007-02-24 16:49:25 +00:00
Alexander Leidinger
1a26db0a3a MFp4 (114193 (i386 part), 114194, 114195, 114200):
- Dont "return" in linux_clone() after we forked the new process in a case
   of problems.
 - Move the copyout of p2->p_pid outside the emul_lock coverage in
   linux_clone().
 - Cache the em->pdeath_signal in a local variable and move the copyout
   out of the emul_lock coverage.
 - Move the free() out of the emul_shared_lock coverage in a preparation
   to switch emul_lock to non-sleepable lock (mutex).

Submitted by:	rdivacky
2007-02-23 22:39:26 +00:00
Konstantin Belousov
75ee4e5462 No need to lock emul_lock in exit_group() because em->shared
cannot change (because its referenced by curthread). This fixes
a LOR caused by acquiring emul_shared_lock while holding emul_lock.

Fix typo in comment.

Submitted by: rdivacky
2007-02-01 13:33:33 +00:00
Alexander Leidinger
a849401985 MFp4 (112646):
Now (ok it's been a while...) that FreeBSD has RLIMIT_AS too, we can use
it in the linuxolator instead of ignoring it.

This fixes a LTP test.

Submitted by:	rdivacky
2007-01-07 19:30:19 +00:00
Alexander Leidinger
0ed6f09c4e MFp4 (112534):
Dont lock em in a case of just using em->shared->group_pid because
the group_pid never changes.

Submitted by:	rdivacky
Reviewed by:	kib
Glanced at by:	jhb
2007-01-07 19:14:06 +00:00
Alexander Leidinger
1c65504ca8 MFp4 (112498):
Rename the locking flags to EMUL_DOLOCK and EMUL_DONTLOCK to prevent confusion.

Submitted by:	rdivacky
2007-01-07 19:00:38 +00:00
Alexander Leidinger
c9447c7551 MFp4 (111746, 108671, 108945, 112352):
- add linux utimes syscall [1]
 - add linux rt_sigtimedwait syscall [2]

Submitted by:	"Scot Hetzel" <swhetzel@gmail.com> [1]
Submitted by:	Bruce Becker <hostmaster@whois.gts.net> [2]
PR:		93199 [2]
2006-12-31 13:16:00 +00:00
Alexander Leidinger
9ce8f9bcdd MFp4 (111746+):
Redo the checking for 2.6 emulation. We now cache the value of
  use26 and replace calls to linux_get_osrelease() + parsing with
  a call to linux_use26(). Typical path is lockless now.

  Pointed out by: kib

This allows to ship RELENG_7_0 with a default osrelease of 2.4.2 and the
possibility to enable 2.6.x emulation without the possible performance
impact of the previous version of the check.

Submitted by:	rdivacky
2006-12-31 12:39:10 +00:00
Alexander Leidinger
ef95cfeab9 MFp4:
- semi-automatic style fixes
 - spelling fixes in comments
 - add some comments
2006-12-31 11:56:16 +00:00
Jung-uk Kim
b34608fea5 MFP4: 109653
Linux mknod(2) can open any files, not just char/block or fifo files.
This fixes Linux Test Project test cases mknod01, mknod07 and mknod09.
2006-12-04 22:46:09 +00:00
Alexander Leidinger
f6018b1434 MFP4 (108673, 110519, 110874):
- Currently LINUX_MAX_COMM_LEN is smaller than MAXCOMLEN, but in case
  this will change we have a buffer overflow. Apply some defensive
  programming to DTRT when this should happen.
- Use copyinstr() instead of copyin where appropriate.
  * Fallback to copyin() in case of ENAMETOOLONG. [1]
  * Use the right source and destination (it was wrong before).
- Use strlcpy instead of strcpy.
- Properly lock the read case (PR_GET_NAME) like the write case.

Reviewed by:	rwatson (except [1])
Suggested by:	rwatson [1]
2006-12-02 14:56:25 +00:00
Konstantin Belousov
cce1514679 Sync struct sysinfo with real one from linux.
Submitted by:	rdivacky
2006-11-18 14:37:54 +00:00
Konstantin Belousov
d559d18183 Add debuging printfs to syscalls that do not contain it yet. In
sethostname do not print the hostname because it would require to copyin
the string. Sethostname is not very frequently used.

Submitted by:	rdivacky
2006-11-18 13:00:59 +00:00
Konstantin Belousov
f472c6e35a Remove unecessary locking of process in linux_getpid.
Suggested by:	jhb
Submitted by:	rdivacky
2006-11-18 10:12:43 +00:00
Konstantin Belousov
0132096dfd In rev 1.188 of linux_misc.c the added check for valid options ommited
__WCLONE. This fixes it thus fixing skype/teamspeak to not keep zombies
after exit.

Submitted by: rdivacky
Reported by: Bakul Shah (bakul at bitblocks com)
2006-11-15 10:01:06 +00:00
Tom Rhodes
6aeb05d7be Merge posix4/* into normal kernel hierarchy.
Reviewed by:	glanced at by jhb
Approved by:	silence on -arch@ and -standards@
2006-11-11 16:26:58 +00:00
Robert Watson
acd3428b7d Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
Alexander Leidinger
c4ce314b40 Fix style(9).
Noticed by:	rwatson
2006-10-28 16:47:38 +00:00
Alexander Leidinger
955d762aca MFP4:
Implement prctl().

Submitted by:	rdivacky
Tested with:	LTP
2006-10-28 10:59:59 +00:00
Robert Watson
aed5570872 Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h.  sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA
2006-10-22 11:52:19 +00:00
Alexander Leidinger
7660ace19c - Replace homegrown check for FIFO with S_ISFIFO. [1]
- Check the status of the options before messing with it.

Inspired by:	NetBSD [1]
Submitted by:	rdivacky
Tested with:	LTP
2006-10-08 17:08:27 +00:00
Alexander Leidinger
18f81b3dfa - don't reboot() when feed with wrong parameters (and enough permissions) [1]
- add support to power off the system [2]
- check the linux magic values [3]

Submitted by:	Marcin Cieslak <saper@SYSTEM.PL> [1,2]
Modelled after:	linux man page of the reboot() syscall [3]
Found by:	LTP testcase "reboot02" [1]
Tested with:	LTP testcase "reboot02" [1,3]
MFC after:	1 week
2006-09-16 14:12:04 +00:00
Robert Watson
3e8df637c0 Don't call suser_cred() directly from linux_sethostname(), as it just
wraps userland_sysctl(), which performs necessary privilege checks as
part of its normal operation.

MFC after:	1 week
2006-08-25 11:02:42 +00:00
Alexander Leidinger
1a28c0df09 Sync the MI parts for amd64 with i386 and remove the corresponding special
handling for amd64 in the common code. The MD parts for amd64 are still
outstanding, but at least this fixes some panics on amd64.

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Tested by:	bsam
2006-08-20 13:50:27 +00:00
Alexander Leidinger
590e3a06e8 - disable some more code when osrelease=2.4.2
- protect td->td_proc->p_pid with the proc lock in linux_getpid
  in the amd64 (= non i386) case [1]

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	netchild [1]
2006-08-17 21:21:30 +00:00
Alexander Leidinger
94cb2ecf79 Move some stuff into headers where they belong.
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	jhb, ssouhlal
2006-08-17 21:06:48 +00:00
Alexander Leidinger
a43eeaabe4 Disable some parts of the code on amd64 for now to prevent a panic. A better
fix will come later.

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
2006-08-15 15:15:17 +00:00
Alexander Leidinger
9b44bfc556 Add the linux 2.6.x stuff (not used by default!):
- TLS - complete
 - pid/tid mangling - complete
 - thread area - complete
 - futexes - complete with issues
 - clone() extension - complete with some possible minor issues
 - mq*/timer*/clock* stuff - complete but untested and the mq* stuff is
   disabled when not build as part of the kernel with native FreeBSD mq*
   support (module support for this will come later)

Tested with:
 - linux-firefox - works, tested
 - linux-opera - works, tested
 - linux-realplay - doesnt work, issue with futexes
 - linux-skype - doesnt work, issue with futexes
 - linux-rt2-demo - works, tested
 - linux-acroread - doesnt work, unknown reason (coredump) and sometimes
   issue with futexes
 - various unix utilities in linux-base-gentoo3 and linux-base-fc4:
   everything tried worked

On amd64 not everything is supported like on i386, the catchup is planned for
later when the remaining bugs in the new functions are fixed.

To test this new stuff, you have to run
	sysctl compat.linux.osrelease=2.6.16
to switch back use
	sysctl compat.linux.osrelease=2.4.2

Don't switch while running a linux program, strange things may or may not
happen.

Sponsored by:			Google SoC 2006
Submitted by:			rdivacky
Some suggestions/help by:	jhb, kib, manu@NetBSD.org, netchild
2006-08-15 12:54:30 +00:00
John Baldwin
b4c63329d5 - Pass the MPSAFE flag to namei() in linux_uselib() and handle conditional
Giant VFS locking in that function.
- Remove bogus code to handle the case where namei() returns success but a
  NULL vnode pointer.
- Note that this code duplicates exec_check_permissions() and annotate
  where it differs.
- Hold the vnode lock longer to protect the write to set VV_TEXT in
  v_vflag.
- Mark linux_uselib() MPSAFE.

Reviewed by:	rwatson
2006-07-21 20:22:13 +00:00
Alexander Leidinger
555f86b8b6 The linux times syscall can be called with a NULL pointer, so keep cool
and don't panic.

This fix is different from the patch submitted as it not only prevents
a NULL-pointer dereference, but also skips some work in this case.

Noticed by:	Dmitry Ganenko <dima@apk-inform.com>
Reviewed by:	rdivacky (the original version as in emulation@)
MFC after:	1 week
Security:	This is a RELENG_x_y candidate (local DoS).
Go ahead by:	secteam (cperciva)
2006-06-23 18:49:38 +00:00
Alexander Leidinger
01e0ffbae8 Now that we don't have a linuxolator on alpha anymore:
- unifdef __alpha__
 - revert rev. 1.66 of linux_socket.c
2006-05-10 20:38:16 +00:00
Tai-hwa Liang
d9d46ed258 Unbreaking build by removing a now unused variable. 2006-03-27 23:27:11 +00:00
John Baldwin
b77619bd7f Use td_ucred rather than p_ucred to avoid panics and general unhappiness.
Pointy hat to:	netchild
2006-03-27 19:16:31 +00:00
Ruslan Ermilov
aefce619cf Unbreak COMPAT_LINUX32 option support on amd64.
Broken by:	netchild
2006-03-19 11:10:33 +00:00
Alexander Leidinger
d4a3f5ddb6 Fixup some problems in my previous commit (COMPAT_43).
Pointyhat to:	netchild
2006-03-18 20:47:36 +00:00
Alexander Leidinger
5c8919adf4 Get rid of the need of COMPAT_43 in the linuxolator.
Submitted by:	Divacky Roman <xdivac02@stud.fit.vutbr.cz>
Obtained from:	DragonFly (some parts)
2006-03-18 18:20:17 +00:00
Tom Rhodes
0e36e11d57 Cast tv_sec to intmax_t and print with %jd in some ifdef'ed code. 2005-12-28 07:08:54 +00:00
David Xu
9104847f21 1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most
changes in MD code are trivial, before this change, trapsignal and
   sendsig use discrete parameters, now they uses member fields of
   ksiginfo_t structure. For sendsig, this change allows us to pass
   POSIX realtime signal value to user code.

2. Remove cpu_thread_siginfo, it is no longer needed because we now always
   generate ksiginfo_t data and feed it to libpthread.

3. Add p_sigqueue to proc structure to hold shared signals which were
   blocked by all threads in the proc.

4. Add td_sigqueue to thread structure to hold all signals delivered to
   thread.

5. i386 and amd64 now return POSIX standard si_code, other arches will
   be fixed.

6. In this sigqueue implementation, pending signal set is kept as before,
   an extra siginfo list holds additional siginfo_t data for signals.
   kernel code uses psignal() still behavior as before, it won't be failed
   even under memory pressure, only exception is when deleting a signal,
   we should call sigqueue_delete to remove signal from sigqueue but
   not SIGDELSET. Current there is no kernel code will deliver a signal
   with additional data, so kernel should be as stable as before,
   a ksiginfo can carry more information, for example, allow signal to
   be delivered but throw away siginfo data if memory is not enough.
   SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
   not be caught or masked.
   The sigqueue() syscall allows user code to queue a signal to target
   process, if resource is unavailable, EAGAIN will be returned as
   specification said.
   Just before thread exits, signal queue memory will be freed by
   sigqueue_flush.
   Current, all signals are allowed to be queued, not only realtime signals.

Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
2005-10-14 12:43:47 +00:00
John Baldwin
8d948cd1ec Fix the computation of uptime for linux_sysinfo(). Before it was returning
the uptime in seconds mod 60 which wasn't very useful.

Approved by:	re (scottl)
2005-07-07 19:17:55 +00:00
Maxim Sobolev
bc165ab0fe Properly convert FreeBSD priority values into Linux values in the
getpriority(2) syscall.

PR:		kern/81951
Submitted by:	Andriy Gapon <avg@icyb.net.ua>
2005-06-08 20:41:28 +00:00
Jeff Roberson
7625cbf3cc - Pass the ISOPEN flag to namei so filesystems will know we're about to
open them or otherwise access the data.
2005-04-27 09:05:19 +00:00
John Baldwin
98df9218da - Change the vm_mmap() function to accept an objtype_t parameter specifying
the type of object represented by the handle argument.
- Allow vm_mmap() to map device memory via cdev objects in addition to
  vnodes and anonymous memory.  Note that mmaping a cdev directly does not
  currently perform any MAC checks like mapping a vnode does.
- Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the
  cdev the ioctl is acting on rather than trying to find a suitable vnode
  to map from.

Reviewed by:	alc, arch@
2005-04-01 20:00:11 +00:00
Maxim Sobolev
e3478fe000 Handle unimplemented syscall by instantly returning ENOSYS instead of sending
signal first and only then returning ENOSYS to match what real linux does.

PR:		kern/74302
Submitted by:	Travis Poppe <tlp@LiquidX.org>
2005-03-07 00:18:06 +00:00
John Baldwin
12dd959a7d Use kern_setitimer() to implement linux_alarm() instead of fondling the
real interval timer directly.
2005-02-07 18:36:21 +00:00
Maxim Sobolev
cfa0efe7ab Split out kernel side of {get,set}itimer(2) into two parts: the first that
pops data from the userland and pushes results back and the second which does
actual processing. Use the latter to eliminate stackgap in the linux wrappers
of those syscalls.

MFC after:	2 weeks
2005-01-25 21:28:28 +00:00