Commit Graph

94 Commits

Author SHA1 Message Date
jkim
e210f689a8 - Implement pipe2 syscall for Linuxulator. This syscall appeared in 2.6.27
but GNU libc used it without checking its kernel version, e. g., Fedora 10.
- Move pipe(2) implementation for Linuxulator from MD files to MI file,
sys/compat/linux/linux_file.c.  There is no MD code for this syscall at all.
- Correct an argument type for pipe() from l_ulong * to l_int *.  Probably
this was the source of MI/MD confusion.

Reviewed by:	emulation
2012-04-16 21:22:02 +00:00
kmacy
99851f359e In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by:	rwatson
Approved by:	re (bz)
2011-09-16 13:58:51 +00:00
rwatson
4af919b491 Second-to-last commit implementing Capsicum capabilities in the FreeBSD
kernel for FreeBSD 9.0:

Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *.  With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.

Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.

In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.

Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.

Approved by:	re (bz)
Submitted by:	jonathan
Sponsored by:	Google Inc
2011-08-11 12:30:23 +00:00
dchagin
9f708ad0aa Move linux_clone(), linux_fork(), linux_vfork() to a MI path. 2011-02-12 18:17:12 +00:00
dchagin
a999d3553b In preparation for moving linux_clone() to a MI path
introduce linux_set_upcall_kse().
2011-02-12 16:33:00 +00:00
dchagin
8b4a007006 In preparation for moving linux_clone () to a MI path
move the TLS code in a separate function.

Use function parameter instead of direct using register.
2011-02-12 15:50:21 +00:00
dchagin
6115f650de The kern_wait() code already removes the SIGCHLD signal for the waited
process. Removing other SIGCHLD signals is not needed and may cause
problems.

Pointed out by:	jilles

MFC after:	1 Month.
2011-01-30 18:17:38 +00:00
dchagin
051ceeb5f3 Implement a variation of the linux_common_wait() which should
be used by linuxolator itself.

Move linux_wait4() to MD path as it requires native struct
rusage translation to struct l_rusage on linux32/amd64.

MFC after:	1 Month.
2011-01-28 18:47:07 +00:00
dchagin
1e124ec538 Add macro to test the sv_flags of any process. Change some places to test
the flags instead of explicit comparing with address of known sysentvec
structures.

MFC after:	1 month
2011-01-26 20:03:58 +00:00
kan
46bba0ff81 Do not require pos parameter to be zero in MAP_ANONYMOUS mmap requests
in Linux emulation layer. Linux seems to only require that pos is
page-aligned, but otherwise ignores it. Default FreeBSD mmap parameter
checking is too strict to allow some Linux binaries to run. tsMuxeR is
one example of such a binary.

Discussed with:	jhb
MFC after:	1 week
2010-06-10 17:59:47 +00:00
jhb
b13876f956 Fix some problems with effective mmap() offsets > 32 bits. This was
partially fixed on amd64 earlier.  Rather than forcing linux_mmap_common()
to use a 32-bit offset, have it accept a 64-bit file offset.  This offset
is then passed to the real mmap() call.  Rather than inventing a structure
to hold the normal linux_mmap args that has a 64-bit offset, just pass
each of the arguments individually to linux_mmap_common() since that more
closes matches the existing style of various kern_foo() functions.

Submitted by:	Christian Zander @ Nvidia
MFC after:	1 week
2009-10-28 20:17:54 +00:00
jhb
12b7caa1dd Return ENOSYS instead of EINVAL for invalid function codes to match the
behavior of Linux.

Reported by:	Alexander Best  alexbestms of math.uni-muenster.de
Approved by:	re (kib)
2009-06-26 19:39:33 +00:00
kib
021a7529ae Adapt linux emulation to use cv for vfork wait.
Submitted by:	Takahiro Kurosawa <takahiro.kurosawa gmail com>
PR:	kern/131506
2009-02-18 16:11:39 +00:00
ed
8d12469978 Several cleanups related to pipe(2).
- Use `fildes[2]' instead of `*fildes' to make more clear that pipe(2)
  fills an array with two descriptors.

- Remove EFAULT from the manual page. Because of the current calling
  convention, pipe(2) raises a segmentation fault when an invalid
  address is passed.

- Introduce kern_pipe() to make it easier for binary emulations to
  implement pipe(2).

- Make Linux binary emulation use kern_pipe(), which means we don't have
  to recover td_retval after calling the FreeBSD system call.

Approved by:	rdivacky
Discussed on:	arch
2008-11-11 14:55:59 +00:00
jkim
3bffed0bec Fix Linux mmap with MAP_GROWSDOWN flag.
Reported by:	Andriy Gapon (avg at icyb dot net dot ua)
Tested by:	Andriy Gapon (avg at icyb dot net dot ua)
Pointyhat:	me
MFC after:	3 days
2008-02-11 19:35:03 +00:00
kib
20098981e1 Implement read_default_ldt in linux_modify_ldt(). It copies out zeroed
descriptor, like real Linux does.

Tested by: Yuriy Tsibizov <yuriy.tsibizov at gmail com>
Submitted by:	rdivacky
MFC after:	1 week
2007-11-26 11:06:19 +00:00
attilio
d6dfb4f4cb i386_set_ioperm, i386_get_ldt and i386_set_ldt are now MPSAFE
(Giant/sched_lock free) so remove unuseful Giant cruft.

Approved by: jeff
Approved by: re
Sponsorized by: NGX Italy (http://www.ngx.it)
2007-07-20 08:35:18 +00:00
peter
6d9e6c677c Don't add the 'pad' argument to the mmap/truncate/etc syscalls.
Submitted by: kensmith
Approved by: re (kensmith)
2007-07-04 23:06:43 +00:00
jeff
91d1501790 Commit 14/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
   sychronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-05 00:00:57 +00:00
kan
ea141892dc Do not dereference linux_to_bsd_signal[-1] if userland has
passed zero as exit signal.

GCC 4.2 changes the kernel data segment layout not to have 0
in that memory location. This code ran by luck before and now
the luck has run out.
2007-05-11 01:25:51 +00:00
jkim
554fb0a678 MFP4: 115220, 115222
- Fix style(9) and reduce diff between amd64 and i386.
- Prefix Linuxulator macros with LINUX_ to prevent future collision.
2007-03-02 00:08:47 +00:00
jkim
2620bd06da MFP4: 115094
Linux does not check file descriptor when MAP_ANONYMOUS is set.
This should fix recent LTP test regressions.

Reported by:	Scot Hetzel (swhetzel at gmail dot com)
		netchild
2007-02-27 02:08:01 +00:00
netchild
888f5e57b2 Partial MFp4 of 114977:
Whitespace commit: Fix grammar, spelling and punctuation.

Submitted by:	"Scot Hetzel" <swhetzel@gmail.com>
2007-02-24 16:49:25 +00:00
netchild
902cc4aeba MFp4 (114193 (i386 part), 114194, 114195, 114200):
- Dont "return" in linux_clone() after we forked the new process in a case
   of problems.
 - Move the copyout of p2->p_pid outside the emul_lock coverage in
   linux_clone().
 - Cache the em->pdeath_signal in a local variable and move the copyout
   out of the emul_lock coverage.
 - Move the free() out of the emul_shared_lock coverage in a preparation
   to switch emul_lock to non-sleepable lock (mutex).

Submitted by:	rdivacky
2007-02-23 22:39:26 +00:00
jkim
df99d574b5 MFP4: 113025, 113146, 113177, 113203, 113500, 113546, 113570
- PROT_READ, PROT_WRITE, or PROT_EXEC implies PROT_READ and PROT_EXEC.
Linux/ia64's i386 emulation layer does this and it complies with Linux
header files.  This fixes mmap05 LTP test case on amd64.
- Do not adjust stack size when failure has occurred.
- Synchronize i386 mmap/mprotect with amd64.
2007-02-15 00:54:40 +00:00
kib
b9ce1aaa2a Fix LOR that occurs because proctree_lock was acquired while holding
emuldata lock by moving the code upwards outside the emul_lock coverage.

Submitted by: rdivacky
2007-02-01 13:27:52 +00:00
jeff
474b917526 - Remove setrunqueue and replace it with direct calls to sched_add().
setrunqueue() was mostly empty.  The few asserts and thread state
   setting were moved to the individual schedulers.  sched_add() was
   chosen to displace it for naming consistency reasons.
 - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be
   different on all three schedulers where it was only called in one place
   each.
 - Remove the long ifdef'd out remrunqueue code.
 - Remove the now redundant ts_state.  Inspect the thread state directly.
 - Don't set TSF_* flags from kern_switch.c, we were only doing this to
   support a feature in one scheduler.
 - Change sched_choose() to return a thread rather than a td_sched.  Also,
   rely on the schedulers to return the idlethread.  This simplifies the
   logic in choosethread().  Aside from the run queue links kern_switch.c
   mostly does not care about the contents of td_sched.

Discussed with:	julian

 - Move the idle thread loop into the per scheduler area.  ULE wants to
   do something different from the other schedulers.

Suggested by:	jhb

Tested on:	x86/amd64 sched_{4BSD, ULE, CORE}.
2007-01-23 08:46:51 +00:00
netchild
42392e7a0b MFp4 (113077, 113083, 113103, 113124, 113097):
Dont expose em->shared to the outside world before its properly
	initialized. Might not affect anything but its at least a better
	coding style.

	Dont expose em via p->p_emuldata until its properly initialized.
	This also enables us to get rid of some locking and simplify the
	code because we are workin on a local copy.

	In linux_fork and linux_vfork create the process in stopped state
	to be sure that the new process runs with fully initialized emuldata
	structure [1]. Also fix the vfork (both in linux_clone and linux_vfork)
	race that could result in never woken up process [2].

Reported by:	Scot Hetzel	[1]
Suggested by:	jhb		[2]
Reviewed by:	jhb (at least some important parts)
Submitted by:	rdivacky
Tested by:	Scot Hetzel (on amd64)

Change 2 comments (in the new code) to comply to style(9).

Suggested by:	jhb
2007-01-20 14:58:59 +00:00
netchild
4ffc7bc7ea MFp4 (112893):
Make linux_vfork() actually work. This enables make to work again with 2.6.
It also fixes the LTP vfork tests.

Submitted by:	rdivacky
2007-01-14 16:20:37 +00:00
netchild
977ef4a8bc MFp4 (112498):
Rename the locking flags to EMUL_DOLOCK and EMUL_DONTLOCK to prevent confusion.

Submitted by:	rdivacky
2007-01-07 19:00:38 +00:00
rwatson
10d0d9cf47 Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
netchild
357a456178 Fix a recent regression regarding valid signals.
Submitted by:	rdivacky
2006-10-20 10:09:40 +00:00
netchild
a9266094f2 MFP4 (106538 + 106541):
Implement CLONE_VFORK. This fixes the clone05 LTP test.

Submitted by:	rdivacky
2006-10-15 13:39:40 +00:00
netchild
96943a3038 Revert my previous commit, I mismerged this to the wrong place.
Pointy hat to:	netchild
2006-10-15 13:30:45 +00:00
netchild
7e5ad63262 MFP4 (106541): Fix the clone05 test in the LTP.
Submitted by:	rdivacky
2006-10-15 13:25:23 +00:00
netchild
6dc0f7cde5 MFP4 (107144[1]): Implement CLONE_FS on i386[1] and amd64.
Submitted by:	rdivacky	[1]
2006-10-15 13:22:14 +00:00
netchild
4afde07449 MFP4 (107868 - 107870):
Use a macro to test for a valid signal instead of doing it my hand everywhere.

Submitted by:	rdivacky
2006-10-15 12:51:43 +00:00
netchild
81cdbc19d7 style(9)
While I'm here add a MFC reminder, I forgot it in the previous commit.

Noticed by:	ssouhlal
MFC after:	1 week
2006-09-20 19:27:11 +00:00
netchild
9f4eed62b7 Bring the i386 linux mmap code more into line with how linux (2.4.x)
behaves. This fixes a lot of test which failed before. For amd64 there
are still some problems, but without any testers which apply patches
and run some predefines tests we can't do more ATM.

Submitted by:	Marcin Cieslak <saper@SYSTEM.PL> (minor fixups by myself)
Tested with:	LTP
2006-09-20 17:24:20 +00:00
netchild
c1c941b5f5 Fix video playing and network connections in realplayer (and most likely
other stuff) in the osrelease=2.6.16 case:
 - implement CLONE_PARENT semantic
 - fix TLS loading in clone CLONE_SETTLS
 - lock proc in the currently disabled part of CLONE_THREAD

I suggest to not unload the linux module after testing this, there are
some "<defunct>" processes hanging around after exiting (they aren't
with osrelease=2.4.2) and they may panic your kernel when unloading the
linux module. They are in state Z and some of them consume CPU according
to ps. But I don't trust the CPU part, the idle threads gets too much CPU
that this may be possible (accumulating idle, X and 2 defunct processes
results in 104.7%, this looks to much to be a rounding error).

Noticed by:	Intron <mag@intron.ac>
Submitted by:	rdivacky (in collaboration with Intron)
Tested by:	Intron, netchild
Reviewed by:	jhb (previous version)
2006-08-27 18:51:32 +00:00
netchild
fedc5604a0 Emulate what vfork does instead of using it in linux_vfork. This way
we can do the stuff we need to do with linux processes at fork and
don't panic the kernel at exit of the child.

Submitted by:	rdivacky
Tested with:	tst-vfork* (glibc regression tests)
Tested by:	netchild
2006-08-25 11:59:56 +00:00
netchild
5d552cdc47 Move some stuff into headers where they belong.
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	jhb, ssouhlal
2006-08-17 21:06:48 +00:00
netchild
39fd1c6d47 Style fixes to comments.
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	jhb, ssouhlal
2006-08-16 18:54:51 +00:00
netchild
ec2ba5d85d Add the linux 2.6.x stuff (not used by default!):
- TLS - complete
 - pid/tid mangling - complete
 - thread area - complete
 - futexes - complete with issues
 - clone() extension - complete with some possible minor issues
 - mq*/timer*/clock* stuff - complete but untested and the mq* stuff is
   disabled when not build as part of the kernel with native FreeBSD mq*
   support (module support for this will come later)

Tested with:
 - linux-firefox - works, tested
 - linux-opera - works, tested
 - linux-realplay - doesnt work, issue with futexes
 - linux-skype - doesnt work, issue with futexes
 - linux-rt2-demo - works, tested
 - linux-acroread - doesnt work, unknown reason (coredump) and sometimes
   issue with futexes
 - various unix utilities in linux-base-gentoo3 and linux-base-fc4:
   everything tried worked

On amd64 not everything is supported like on i386, the catchup is planned for
later when the remaining bugs in the new functions are fixed.

To test this new stuff, you have to run
	sysctl compat.linux.osrelease=2.6.16
to switch back use
	sysctl compat.linux.osrelease=2.4.2

Don't switch while running a linux program, strange things may or may not
happen.

Sponsored by:			Google SoC 2006
Submitted by:			rdivacky
Some suggestions/help by:	jhb, kib, manu@NetBSD.org, netchild
2006-08-15 12:54:30 +00:00
jhb
ae432f93f2 - Always call exec_free_args() in kern_execve() instead of doing it in all
the callers if the exec either succeeds or fails early.
- Move the code to call exit1() if the exec fails after the vmspace is
  gone to the bottom of kern_execve() to cut down on some code duplication.
2006-02-06 22:06:54 +00:00
sobomax
c3270af7f0 Propagate error code of kern_execve() to the caller properly.
PR:		81670
Submitted by:	Andrew Bliznak <andriko.b@gmail.com>
Pointy hat to:	sobomax
2005-08-01 17:35:48 +00:00
sobomax
1485460070 In linux emulation layer try to detect attempt to use linux_clone() to
create kernel threads and call rfork(2) with RFTHREAD flag set in this case,
which puts parent and child into the same threading group. As a result
all threads that belong to the same program end up in the same threading
group.

This is similar to what linuxthreads port does, though in this case we don't
have a luxury of having access to the source code and there is no definite
way to differentiate linux_clone() called for threading purposes from other
uses, so that we have to resort to heuristics.

Allow SIGTHR to be delivered between all processes in the same threading
group previously it has been blocked for s[ug]id processes.

This also should improve locking of the same file descriptor from different
threads in programs running under linux compat layer.

PR:			kern/72922
Reported by:		Andriy Gapon <avg@icyb.net.ua>
Idea suggested by:	rwatson
2005-03-03 16:57:55 +00:00
jhb
2e8b9720fa Use the LCONVPATHEXIST() macro rather than it's exact expansion to be
consistent.
2005-02-07 18:37:13 +00:00
sobomax
f489acaf0f o Split out kernel part of execve(2) syscall into two parts: one that
copies arguments into the kernel space and one that operates
  completely in the kernel space;

o use kernel-only version of execve(2) to kill another stackgap in
  linuxlator/i386.

Obtained from:  DragonFlyBSD (partially)
MFC after:      2 weeks
2005-01-29 23:12:00 +00:00
sobomax
ef41053770 o Move copyin()/copyout() out of i386_{get,set}_ldt() and
i386_{get,set}_ioperm() and make those APIs visible in the kernel namespace;

o use i386_{get,set}_ldt() and i386_{get,set}_ioperm() instead of sysarch()
  in the linuxlator, which allows to kill another two stackgaps.

MFC after:	2 weeks
2005-01-26 13:59:46 +00:00