Commit Graph

45 Commits

Author SHA1 Message Date
Dmitry Chagin
23e8912c60 Implement Linux personality() system call mainly due to READ_IMPLIES_EXEC flag.
In Linux if this flag is set, PROT_READ implies PROT_EXEC for mmap().
Linux/i386 set this flag automatically if the binary requires executable stack.

READ_IMPLIES_EXEC flag will be used in the next Linux mmap() commit.
2016-07-10 08:15:50 +00:00
Dmitry Chagin
32ba368ba9 Finish r283544. In exec case properly detach threads from user space
before suicide.
2015-06-06 06:12:14 +00:00
Dmitry Chagin
d707582f83 When I merged the lemul branch I missied kib@'s r282708 commit.
This is not the final fix as I need properly cleanup thread resources
before other threads suicide.

Tested by:	Ruslan Makhmatkhanov
2015-05-25 20:44:46 +00:00
Dmitry Chagin
7d96520b25 Improve ktr(9) records in thread managment code.
Differential Revision:	https://reviews.freebsd.org/D1464
Reviewed by:	trasz
2015-05-24 17:09:07 +00:00
Dmitry Chagin
68cf0367e9 Use local struct proc * varable instead of dereferencing td->td_proc.
Differential Revision:	https://reviews.freebsd.org/D1463
Reviewed by:	emaste
2015-05-24 17:08:25 +00:00
Dmitry Chagin
97cfa5c899 Avoid unnecessary em zeroing in non-exec path
as it already zeroed by malloc with M_ZERO flag
and move zeroing to the proper place in exec path.

Differential Revision:	https://reviews.freebsd.org/D1462
Reviewed by:	trasz
2015-05-24 17:07:10 +00:00
Dmitry Chagin
e0327ddba0 Remove the unnecessary cast.
Differential Revision:	https://reviews.freebsd.org/D1461
Reviewed by:	emaste
2015-05-24 17:05:59 +00:00
Dmitry Chagin
e16fe1c730 Implement epoll family system calls. This is a tiny wrapper
around kqueue() to implement epoll subset of functionality.
The kqueue user data are 32bit on i386 which is not enough for
epoll user data, so we keep user data in the proc emuldata.

Initial patch developed by rdivacky@ in 2007, then extended
by Yuri Victorovich @ r255672 and finished by me
in collaboration with mjg@ and jillies@.

Differential Revision:	https://reviews.freebsd.org/D1092
2015-05-24 16:41:39 +00:00
Dmitry Chagin
e0d3ea8c65 Where possible we will use M_LINUX malloc(9) type.
Move M_FUTEX defines to the linux_common.ko.

Differential Revision:	https://reviews.freebsd.org/D1077
Reviewed by:	emaste
2015-05-24 16:14:41 +00:00
Dmitry Chagin
bc27367760 Refund the proc emuldata struct for future use. For now move flags from
thread emuldata to proc emuldata as it was originally intended.

As we can have both 64 & 32 bit Linuxulator running any eventhandler
can be called twice for us. To prevent this move eventhandlers code
from linux_emul.c to the linux_common.ko module.

Differential Revision:	https://reviews.freebsd.org/D1073
2015-05-24 15:54:58 +00:00
Dmitry Chagin
5e609834bd pthread_join() caller do futex_wait on child_clear_tid. As a results
of multiple simultaneous calls to pthread_join() specifying the same
target thread are undefined wake up the one thread.

Differential Revision:	https://reviews.freebsd.org/D1040
2015-05-24 14:54:12 +00:00
Dmitry Chagin
81338031c4 Switch linuxulator to use the native 1:1 threads.
The reasons:
1. Get rid of the stubs/quirks with process dethreading,
   process reparent when the process group leader exits and close
   to this problems on wait(), waitpid(), etc.
2. Reuse our kernel code instead of writing excessive thread
   managment routines in Linuxulator.

Implementation details:

1. The thread is created via kern_thr_new() in the clone() call with
   the CLONE_THREAD parameter. Thus, everything else is a process.
2. The test that the process has a threads is done via P_HADTHREADS
   bit p_flag of struct proc.
3. Per thread emulator state data structure is now located in the
   struct thread and freed in the thread_dtor() hook.
   Mandatory holdig of the p_mtx required when referencing emuldata
   from the other threads.
4. PID mangling has changed. Now Linux pid is the native tid
   and Linux tgid is the native pid, with the exception of the first
   thread in the process where tid and pid are one and the same.

Ugliness:

   In case when the Linux thread is the initial thread in the thread
   group thread id is equal to the process id. Glibc depends on this
   magic (assert in pthread_getattr_np.c). So for system calls that
   take thread id as a parameter we should use the special method
   to reference struct thread.

Differential Revision:	https://reviews.freebsd.org/D1039
2015-05-24 14:53:16 +00:00
Attilio Rao
54366c0bd7 - For kernel compiled only with KDTRACE_HOOKS and not any lock debugging
option, unbreak the lock tracing release semantic by embedding
  calls to LOCKSTAT_PROFILE_RELEASE_LOCK() direclty in the inlined
  version of the releasing functions for mutex, rwlock and sxlock.
  Failing to do so skips the lockstat_probe_func invokation for
  unlocking.
- As part of the LOCKSTAT support is inlined in mutex operation, for
  kernel compiled without lock debugging options, potentially every
  consumer must be compiled including opt_kdtrace.h.
  Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the
  dependency by opt_kdtrace.h for all files, as now only KDTRACE_FRAMES
  is linked there and it is only used as a compile-time stub [0].

[0] immediately shows some new bug as DTRACE-derived support for debug
in sfxge is broken and it was never really tested.  As it was not
including correctly opt_kdtrace.h before it was never enabled so it
was kept broken for a while.  Fix this by using a protection stub,
leaving sfxge driver authors the responsibility for fixing it
appropriately [1].

Sponsored by:	EMC / Isilon storage division
Discussed with:	rstone
[0] Reported by:	rstone
[1] Discussed with:	philip
2013-11-25 07:38:45 +00:00
John Baldwin
d825ce0a5d Reduce duplication between i386/linux/linux.h and amd64/linux32/linux.h
by moving bits that are MI out into headers in compat/linux.

Reviewed by:	Chagin Dmitry  dmitry | gmail
MFC after:	2 weeks
2013-01-29 18:41:30 +00:00
Alexander Leidinger
19e252baeb - >500 static DTrace probes for the linuxulator
- DTrace scripts to check for errors, performance, ...
  they serve mostly as examples of what you can do with the static probe;s
  with moderate load the scripts may be overwhelmed, excessive lock-tracing
  may influence program behavior (see the last design decission)

Design decissions:
 - use "linuxulator" as the provider for the native bitsize; add the
   bitsize for the non-native emulation (e.g. "linuxuator32" on amd64)
 - Add probes only for locks which are acquired in one function and released
   in another function. Locks which are aquired and released in the same
   function should be easy to pair in the code, inter-function
   locking is more easy to verify in DTrace.
 - Probes for locks should be fired after locking and before releasing to
   prevent races (to provide data/function stability in DTrace, see the
   man-page of "dtrace -v ..." and the corresponding DTrace docs).
2012-05-05 19:42:38 +00:00
Kip Macy
8451d0dd78 In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by:	rwatson
Approved by:	re (bz)
2011-09-16 13:58:51 +00:00
Dmitry Chagin
a2cd91cf28 Indeed, remove bogus since r219405 check of the Linux ABI.
Pointed out:	jhb

MFC after:	2 Week
2011-03-09 05:59:33 +00:00
Dmitry Chagin
e5d81ef1b5 Extend struct sysvec with new method sv_schedtail, which is used for an
explicit process at fork trampoline path instead of eventhadler(schedtail)
invocation for each child process.

Remove eventhandler(schedtail) code and change linux ABI to use newly added
sysvec method.

While here replace explicit comparing of module sysentvec structure with the
newly created process sysentvec to detect the linux ABI.

Discussed with:	kib

MFC after:	2 Week
2011-03-08 19:01:45 +00:00
Dmitry Chagin
d14cc07d07 Rename used_requeue and use it as bitwise field to store more flags.
Reimplement used_requeue logic with LINUX_XDEPR_REQUEUEOP flag.
2011-02-12 20:58:59 +00:00
Dimitry Andric
95353459ae Fix linux kernel module breakage introduced in r215675, by including
<sys/sysent.h>.

Noticed by:	many
Pointy hat to:	netchild
2010-11-22 20:23:18 +00:00
Alexander Leidinger
526384ecf2 Do not take the process lock. The assignment to u_short inside the
properly aligned structure is atomic on all supported architectures, and
the thread that should see side-effect of assignment is the same thread
that does assignment.

Use a more appropriate conditional to detect the linux ABI.

Suggested by:	kib
X-MFC:		together with r215664
2010-11-22 12:42:32 +00:00
Alexander Leidinger
bb63fdde6d By using the 32-bit Linux version of Sun's Java Development Kit 1.6
on FreeBSD (amd64), invocations of "javac" (or "java") eventually
end with the output of "Killed" and exit code 137.

This is caused by:
1. After calling exec() in multithreaded linux program threads are not
   destroyed and continue running. They get killed after program being
   executed finishes.

2. linux_exit_group doesn't return correct exit code when called not
   from group leader. Which happens regularly using sun jvm.

The submitters fix this in a similar way to how NetBSD handles this.

I took the PRs away from dchagin, who seems to be out of touch of
this since a while (no response from him).

The patches committed here are from [2], with some little modifications
from me to the style.

PR:		141439 [1], 144194 [2]
Submitted by:	Stefan Schmidt <stefan.schmidt@stadtbuch.de>, gk
Reviewed by:	rdivacky (in april 2010)
MFC after:	5 days
2010-11-22 09:06:59 +00:00
Dmitry Chagin
b1121623d2 Remove support for FUTEX_REQUEUE operation.
Glibc does not use this operation since 2.3.3 version (Jun 2004),
as it is racy and replaced by FUTEX_CMP_REQUEUE operation.
Glibc versions prior to 2.3.3 fall back to FUTEX_WAKE when
FUTEX_REQUEUE returned EINVAL.

Any application directly using FUTEX_REQUEUE without return
value checking are definitely broken.

Limit quantity of messages per process about unsupported
operation.

Approved by:	kib (mentor)
MFC after:	1 month
2009-04-19 13:48:42 +00:00
Konstantin Belousov
17b9edd35a The code in linux_proc_exit() contains a race when multiple linux based
processes exits at the same time.  The linux_emuldata structure is freed
but p->p_emuldata is left as a dangling pointer to the just freed memory.

The check for W_EXIT in the loop scanning the child processes isn't safe
since the state of the child process can change right afterwards. Lock
the process and check the W_EXIT before delivering signal.

Submitted by:	tegge
Reviewed by:	davidxu
MFC after:	1 week
2008-10-31 10:38:30 +00:00
Roman Divacky
4732e446fb Implement robust futexes. Most of the code is modelled after
what Linux does. This is because robust futexes are mostly
userspace thing which we cannot alter. Two syscalls maintain
pointer to userspace list and when process exits a routine
walks this list waking up processes sleeping on futexes
from that list.

Reviewed by:	kib (mentor)
MFC after:	1 month
2008-05-13 20:01:27 +00:00
Jung-uk Kim
357afa7113 MFP4: Turn emul_lock into a mutex.
Submitted by:	rdivacky
2007-04-02 18:38:13 +00:00
Jung-uk Kim
a4e3bad794 MFP4: 115220, 115222
- Fix style(9) and reduce diff between amd64 and i386.
- Prefix Linuxulator macros with LINUX_ to prevent future collision.
2007-03-02 00:08:47 +00:00
Alexander Leidinger
802e08a360 Partial MFp4 of 114977:
Whitespace commit: Fix grammar, spelling and punctuation.

Submitted by:	"Scot Hetzel" <swhetzel@gmail.com>
2007-02-24 16:49:25 +00:00
Alexander Leidinger
1a26db0a3a MFp4 (114193 (i386 part), 114194, 114195, 114200):
- Dont "return" in linux_clone() after we forked the new process in a case
   of problems.
 - Move the copyout of p2->p_pid outside the emul_lock coverage in
   linux_clone().
 - Cache the em->pdeath_signal in a local variable and move the copyout
   out of the emul_lock coverage.
 - Move the free() out of the emul_shared_lock coverage in a preparation
   to switch emul_lock to non-sleepable lock (mutex).

Submitted by:	rdivacky
2007-02-23 22:39:26 +00:00
Alexander Leidinger
e8b8b834b4 MFp4 (part of 114132):
- Fix a LOR caused by holding emul_lock and proctree_lock at once.

Submitted by:	rdivacky
2007-02-23 22:29:24 +00:00
Konstantin Belousov
b4bb515484 Remove extern int hz; use proper include file instead. 2007-02-02 08:58:16 +00:00
Konstantin Belousov
25954d7430 No need to synchronize linux_schedtail with linux_proc_init.
p->p_emuldata is properly initialized in the time when the child can run.

Do not set p->p_emuldata to NULL when the process is exiting.
It does not make any sense and only costs 2 mutex operations.

Do not lock emul_data to unlock it on the very next line.
Comment on possible race while there.

Reparent all procs that are part of a threading group but not its leaders
to init and SIGCHLD init to finish the zombies off. This fixes zombies
left after opera's exit. [1]

There is no need to lock p_em in the linux_proc_init CLONE_THREAD
case because the process cannot change the address of the p_em->shared
because its currently running this code path.
Move assigning of em->shared outside emul_shared_lock.

Noticed by: Scott Robbins <scottro@nyc.rr.com> [1]
Submitted by: rdivacky
2007-02-01 13:29:27 +00:00
Alexander Leidinger
d071f5048c MFp4 (113077, 113083, 113103, 113124, 113097):
Dont expose em->shared to the outside world before its properly
	initialized. Might not affect anything but its at least a better
	coding style.

	Dont expose em via p->p_emuldata until its properly initialized.
	This also enables us to get rid of some locking and simplify the
	code because we are workin on a local copy.

	In linux_fork and linux_vfork create the process in stopped state
	to be sure that the new process runs with fully initialized emuldata
	structure [1]. Also fix the vfork (both in linux_clone and linux_vfork)
	race that could result in never woken up process [2].

Reported by:	Scot Hetzel	[1]
Suggested by:	jhb		[2]
Reviewed by:	jhb (at least some important parts)
Submitted by:	rdivacky
Tested by:	Scot Hetzel (on amd64)

Change 2 comments (in the new code) to comply to style(9).

Suggested by:	jhb
2007-01-20 14:58:59 +00:00
Alexander Leidinger
291081ce0a MFp4 (112499):
Protect em->shared with the lock in case of CLONE_THREAD.

Submitted by:	rdivacky
2007-01-07 19:09:20 +00:00
Alexander Leidinger
1c65504ca8 MFp4 (112498):
Rename the locking flags to EMUL_DOLOCK and EMUL_DONTLOCK to prevent confusion.

Submitted by:	rdivacky
2007-01-07 19:00:38 +00:00
Alexander Leidinger
a628609ee9 MFp4:
- semi-automatic style fixes
2006-12-31 12:42:55 +00:00
Konstantin Belousov
292a85f4a8 Group pid and parent are shared in a case of CLONE_THREAD not CLONE_VM.
This fix lets clone02 LTP test pass with 2.6 emulation. In reality 99%
of the cases are that CLONE_VM and CLONE_THREAD are both set so it
seemed to work.

Submitted by: rdivacky
2006-11-15 11:04:37 +00:00
Alexander Leidinger
955d762aca MFP4:
Implement prctl().

Submitted by:	rdivacky
Tested with:	LTP
2006-10-28 10:59:59 +00:00
Alexander Leidinger
28638377ad - change if (cond) panic() to KASSERT.
- Dont forget to free em in a case of error.

Suggested by:	ssouhlal
Submitted by:	rdivacky
Tested with:	LTP
2006-10-08 17:10:34 +00:00
Alexander Leidinger
8618fd85a3 - Extend the coverage of PROC_LOCK to cover wakeup(&p->p_emuldata);
- Lock the emuldata in a case when we just created it.

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Suggested by:	jhb
2006-09-09 16:55:55 +00:00
Suleiman Souhlal
c67e0cc9e7 FREE -> free
Submitted by:	rdivacky
2006-08-28 13:52:27 +00:00
Suleiman Souhlal
5342db0872 MALLOC -> malloc and FREE -> free
Submitted by:	rdivacky
Pointed out by:	jhb
2006-08-19 11:54:19 +00:00
Alexander Leidinger
9b0bcbfbda Fix the DEBUG build:
- linux_emul.c [1]
 - linux_futex.c [2]

Sponsored by:	Google SoC 2006	[1]
Submitted by:	rdivacky	[1]
		netchild	[2]
2006-08-17 09:50:30 +00:00
Alexander Leidinger
0eef2f8a4e Style fixes to comments.
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	jhb, ssouhlal
2006-08-16 18:54:51 +00:00
Alexander Leidinger
ad2056f2c4 Add some new files needed for linux 2.6.x compatibility.
Please don't style(9) the NetBSD code, we want to stay in sync. Not imported
on a vendor branch since we need local changes.

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
With help from:	manu@NetBSD.org
Obtained from:	NetBSD (linux_{futex,time}.*)
2006-08-15 12:20:59 +00:00