freebsd-skq

Author	SHA1	Message	Date
dchagin	010f4da5f8	To avoid excessive code duplication move MI definitions to the MI header file. As it is defined in Linux. Approved by: kib (mentor) MFC after: 1 month	2009-05-07 09:39:20 +00:00
dchagin	32b5830d97	Move extern variable definitions to the header file. Approved by: kib (mentor) MFC after: 1 month	2009-05-02 10:06:49 +00:00
dchagin	dca50049ce	Reimplement futexes. The old implementation used Giant to protect the kernel data structures, but at the same time called malloc(M_WAITOK), which could cause the calling thread to sleep and lose Giant protection. The user-visible result was a missed wakeup. The new implementation uses one sx lock per futex. The sx lock protects the futex structures and allows sleeping while copyin or copyout are performed. Unlike Linux, we return EINVAL when the FUTEX_CMP_REQUEUE operation is requested and either the caller-specified futexes are equal or the second futex already exists. This is acceptable since the situation can only arise from an application error, and glibc falls back to the old FUTEX_WAKE operation when FUTEX_CMP_REQUEUE returns an error. Approved by: kib (mentor) MFC after: 1 month	2009-05-01 15:36:02 +00:00
dchagin	01bf63c9fb	Fix KBI breakage by r190520 which affects older linux.ko binaries: 1) Move the new field (brand_note) to the end of the Brandinfo structure. 2) Add a new flag BI_BRAND_NOTE that indicates that the brand_note pointer is valid. 3) Use the brand_note field only if the flag BI_BRAND_NOTE is set; since old modules won't have the flag set, the new field brand_note is ignored for them. Suggested by: jhb Reviewed by: jhb Approved by: kib (mentor) MFC after: 6 days	2009-04-05 09:27:19 +00:00
dchagin	2408b715a0	Implement a new way of branding ELF binaries by looking at the ".note.ABI-tag" section. The search order for a brand is changed: the ".note.ABI-tag" section is now checked first. Move the code which fetches osreldate for an ELF binary to the check_note() handler. PR: 118473 Approved by: kib (mentor)	2009-03-13 16:40:51 +00:00
jhb	e1b708897e	A better fix for handling different FPU initial control words for different ABIs: - Store the FPU initial control word in the pcb for each thread. - When first using the FPU, load the initial control word after restoring the clean state if it is not the standard control word. - Provide a correct control word for Linux/i386 binaries under FreeBSD/amd64. - Adjust the control word returned for fpugetregs()/npxgetregs() when a thread hasn't used the FPU yet to reflect the real initial control word for the current ABI. - The Linux/i386 ABI for FreeBSD/i386 now properly sets the right control word instead of trashing whatever the current state of the FPU is. Reviewed by: bde	2009-03-05 19:42:11 +00:00
dchagin	45cda70b8f	Add AT_PLATFORM, AT_HWCAP and AT_CLKTCK auxiliary vector entries which are used by glibc. This silences the message "2.4+ kernel w/o ELF notes?" from some programs at start, among them top and pkill. Do the assignment of the vector entries in elf_linux_fixup() as it is done in glibc. Fix some minor style issues. Submitted by: Marcin Cieslak <saper at SYSTEM PL> Approved by: kib (mentor) MFC after: 1 week	2009-03-04 12:14:33 +00:00
kib	021a7529ae	Adapt linux emulation to use cv for vfork wait. Submitted by: Takahiro Kurosawa <takahiro.kurosawa gmail com> PR: kern/131506	2009-02-18 16:11:39 +00:00
obrien	7a153194ec	Change some movl's to mov's. Newer GAS no longer accepts 'movl' instructions for moving between a segment register and a 32-bit memory location. Looked at by: jhb	2009-01-31 11:37:21 +00:00
imp	e4a424be30	Remove obsolete AT_DEBUG stuff. It never should have been committed in the first place, let alone migrated to linux emulation. Reviewed by: peter, rdivacky	2008-12-17 06:11:42 +00:00
kib	8ffb383318	Make linux_sendmsg() and linux_recvmsg() work on linux32/amd64. Change types used in the linux' struct msghdr and struct cmsghdr definitions to the properly-sized architecture-specific types. Move ancillary data handler from linux_sendit() to linux_sendmsg(). Submitted by: dchagin	2008-11-29 17:14:06 +00:00
kib	8fad2283b3	Add sv_flags field to struct sysentvec with intention to provide description of the ABI of the currently executing image. Change some places to test the flags instead of explicit comparing with address of known sysentvec structures to determine ABI features. Discussed with: dchagin, imp, jhb, peter	2008-11-22 12:36:15 +00:00
kib	f5d16a4d66	In the robust futexes list head, futex_offset shall be signed, and glibc actually supplies negative offsets. Change l_ulong to l_long. Submitted by: dchagin	2008-11-16 15:45:41 +00:00
ed	8d12469978	Several cleanups related to pipe(2). - Use `fildes[2]' instead of `*fildes' to make more clear that pipe(2) fills an array with two descriptors. - Remove EFAULT from the manual page. Because of the current calling convention, pipe(2) raises a segmentation fault when an invalid address is passed. - Introduce kern_pipe() to make it easier for binary emulations to implement pipe(2). - Make Linux binary emulation use kern_pipe(), which means we don't have to recover td_retval after calling the FreeBSD system call. Approved by: rdivacky Discussed on: arch	2008-11-11 14:55:59 +00:00
ed	7baae41248	Regenerate system call tables for r184789.	2008-11-09 10:48:06 +00:00
ed	9d3703b842	Mark uname(), getdomainname() and setdomainname() with COMPAT_FREEBSD4. Looking at our source code history, it seems the uname(), getdomainname() and setdomainname() system calls got deprecated somewhere after FreeBSD 1.1, but they have never been phased out properly. Because we don't have a COMPAT_FREEBSD1, just use COMPAT_FREEBSD4. Also fix the Linuxolator to build without the setdomainname() routine by just making it call userland_sysctl on kern.domainname. Also replace the setdomainname()'s implementation to use this approach, because we're duplicating code with sysctl_domainname(). I wasn't able to keep these three routines working in our COMPAT_FREEBSD32, because that would require yet another keyword for syscalls.master (COMPAT4+NOPROTO). Because this routine is probably unused already, this won't be a problem in practice. If it turns out to be a problem, we'll just restore this functionality. Reviewed by: rdivacky, kib	2008-11-09 10:45:13 +00:00
kib	29ccf7d166	Correctly fill siginfo for the signals delivered by linux tkill/tgkill. It is required for async cancellation to work. Fix PROC_LOCK leak in linux_tgkill when a signal delivery attempt is made to a non-Linux process. Do not call em_find(p, ...) with p unlocked. Move common code for linux_tkill() and linux_tgkill() into linux_do_tkill(). Change the linux siginfo_t definition to match the actual Linux one. Extend uid fields to 4 bytes from 2. The extension does not change the structure layout and is binary compatible with the previous definition, because i386 is little endian, and each uid field has 2 bytes of padding after it. Reported by: Nicolas Joly <njoly pasteur fr> Submitted by: dchagin MFC after: 1 month	2008-10-19 10:02:26 +00:00
kib	faae1c0f2f	Make robust futexes work on linux32/amd64. Use PTRIN to read user-mode pointers. Change types used in the structures definitions to properly-sized architecture-specific types. Submitted by: dchagin MFC after: 1 week	2008-10-14 07:59:23 +00:00
kib	c500808674	Change the static struct sysentvec and struct Elf_Brandinfo initializers to the C99 style. At least, it is easier to read sysent definitions that way, and search for the actual instances of sigcode etc. Explicitly initialize sysentvec.sv_maxssiz that was missed in most sysvecs. No objection from: jhb MFC after: 1 month	2008-09-24 10:14:37 +00:00
kib	a568a3185e	Segment registers are stored in the uc_mcontext member of the struct l_ucontext. To restore the registers content, trampoline needs to dereference uc_mcontext instead of taking some undefined values from l_ucontext. Submitted by: Dmitry Chagin <dchagin@> MFC after: 1 week	2008-09-07 16:39:21 +00:00
rdivacky	faae559cb1	Regen. Approved by: kib (mentor)	2008-05-13 20:02:26 +00:00
rdivacky	13cbd9c97e	Implement robust futexes. Most of the code is modelled after what Linux does. This is because robust futexes are mostly userspace thing which we cannot alter. Two syscalls maintain pointer to userspace list and when process exits a routine walks this list waking up processes sleeping on futexes from that list. Reviewed by: kib (mentor) MFC after: 1 month	2008-05-13 20:01:27 +00:00
rdivacky	dd1e82ea4d	Implement linux_truncate64() syscall. Tested by: Aline de Freitas <aline@riseup.net> Approved by: kib (mentor)	2008-04-23 15:56:33 +00:00
jkim	e0c673b7e2	Regenerate.	2008-04-16 19:27:36 +00:00
jkim	513781a1c1	Add stubs for syscalls introduced in Linux 2.6.17 kernel. Some GNU libc version started using them before 2.6.17 was officially out. MFC after: 3 days	2008-04-16 19:25:39 +00:00
kib	133f8f7798	Regenerate	2008-04-08 09:51:19 +00:00
kib	eb77b477b4	Implement the linux syscalls openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat, renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat. Submitted by: rdivacky Sponsored by: Google Summer of Code 2007 Tested by: pho	2008-04-08 09:45:49 +00:00
kib	eff8c6d35e	Add the support for the AT_FDCWD and fd-relative name lookups to the namei(9). Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho	2008-03-31 12:01:21 +00:00
rdivacky	64c7931e65	Regen.	2008-03-16 16:29:37 +00:00
rdivacky	b13a84dcb7	Implement sched_setaffinity and sched_getaffinity using real cpu affinity setting primitives. Reviewed by: jeff Approved by: kib (mentor)	2008-03-16 16:27:44 +00:00
kib	be9c86776f	Since version 4.3, gcc changed its behaviour concerning the i386/amd64 ABI and the direction flag, that is it now assumes that the direction flag is cleared at the entry of a function and it doesn't clear once more if needed. This new behaviour conforms to the i386/amd64 ABI. Modify the signal handler frame setup code to clear the DF {e,r}flags bit on the amd64/i386 for the signal handlers. jhb@ noted that it might break old apps if they assumed DF == 1 would be preserved in the signal handlers, but that such apps should be rare and that older versions of gcc would not generate such apps. Submitted by: Aurelien Jarno <aurelien aurel32 net> PR: 121422 Reviewed by: jhb MFC after: 2 weeks	2008-03-13 10:54:38 +00:00
jeff	acb93d599c	Remove kernel support for M:N threading. While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.	2008-03-12 10:12:01 +00:00
jkim	3bffed0bec	Fix Linux mmap with MAP_GROWSDOWN flag. Reported by: Andriy Gapon (avg at icyb dot net dot ua) Tested by: Andriy Gapon (avg at icyb dot net dot ua) Pointyhat: me MFC after: 3 days	2008-02-11 19:35:03 +00:00
attilio	71b7824213	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with a 'thread' argument that is always curthread. Remove the unneeded extra argument and pass curthread explicitly to lower layer functions, when necessary. The KPI is broken by this change, which should affect several ports, so a version bump and manpage update will be committed later. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
attilio	18d0a0dd51	vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This change makes the code cleaner and, in particular, removes an annoying dependency, helping the next lockmgr() cleanup. The KPI is, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, the next commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
kib	20098981e1	Implement read_default_ldt in linux_modify_ldt(). It copies out zeroed descriptor, like real Linux does. Tested by: Yuriy Tsibizov <yuriy.tsibizov at gmail com> Submitted by: rdivacky MFC after: 1 week	2007-11-26 11:06:19 +00:00
kib	9ae733819b	Fix for the panic("vm_thread_new: kstack allocation failed") and silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when kmem_alloc_nofault() failed to allocate address space. Both functions now return an error instead of panicking or dereferencing NULL. As a consequence, vmspace_exec() and vmspace_unshare() return the errno int. A struct vmspace arg was added to vm_forkproc() to avoid dealing with failed allocation when most of the fork1() job is already done. The kernel stack for the thread is now set up in thread_alloc(), which itself may return NULL. Also, allocation of the first process thread is performed in fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup(), called from fork1(), and proc_linkup0(), which is used to set up the kernel process (was known as swapper). In collaboration with: Peter Holm Reviewed by: jhb	2007-11-05 11:36:16 +00:00
kib	038cf0387b	Fill in cr2 in the signal context from ksi->ksi_addr. Together with the sys/i386/i386/trap.c rev. 1.306 it fixes the PR. Submitted by: rdivacky Suggested by: jhb Sponsored by: Google Summer of Code 2007 PR: kern/77710 Approved by: re (kensmith)	2007-09-20 13:46:26 +00:00
dwmalone	11cf0c8f4a	regen. Approved by: re (kensmith)	2007-09-18 19:51:49 +00:00
dwmalone	37c880369b	The kernel version of Linux statfs64 is actually supposed to take 3 arguments, but we had forgotten the second argument. Also make the Linux statfs64 struct depend on the architecture because it has an extra 4 bytes padding on amd64 compared to i386. The three argument fix is from David Taylor, the struct statfs64 stuff is my fault. With this patch I can install i386 Linux matlab on an amd64 machine. Submitted by: David Taylor <davidt_at_yadt.co.uk> Approved by: re (kensmith)	2007-09-18 19:50:33 +00:00
jeff	3fc0f8b973	- Move all of the PS_ flags into either p_flag or td_flags. - p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or previously the sched_lock. These bugs have existed for some time. - Allow swapout to try each thread in a process individually and then swapin the whole process if any of these fail. This allows us to move most scheduler related swap flags into td_flags. - Keep ki_sflag for backwards compat but change all in source tools to use the new and more correct location of P_INMEM. Reported by: pho Reviewed by: attilio, kib Approved by: re (kensmith)	2007-09-17 05:31:39 +00:00
kib	5b26984cf1	Regenerate. Approved by: re (kensmith)	2007-08-28 12:36:23 +00:00
kib	39e24dc75d	Implement fake linux sched_getaffinity() syscall to enable java to work with Linux 2.6 emulation. This shall be reimplemented once FreeBSD gets native scheduler affinity syscalls. Submitted by: rdivacky Reviewed by: jkim Sponsored by: Google Summer of Code 2007 Approved by: re (kensmith)	2007-08-28 12:26:35 +00:00
attilio	d6dfb4f4cb	i386_set_ioperm, i386_get_ldt and i386_set_ldt are now MPSAFE (Giant/sched_lock free) so remove useless Giant cruft. Approved by: jeff Approved by: re Sponsored by: NGX Italy (http://www.ngx.it)	2007-07-20 08:35:18 +00:00
peter	6d9e6c677c	Don't add the 'pad' argument to the mmap/truncate/etc syscalls. Submitted by: kensmith Approved by: re (kensmith)	2007-07-04 23:06:43 +00:00
jeff	91d1501790	Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling synchronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-05 00:00:57 +00:00
kib	cdee790df9	Move futex support code from <arch>/support.s into linux compat directory. Implement all futex atomic operations in assembler to not depend on the fuword() that does not allow to distinguish between -1 and failure return. Correctly return 0 from atomic operations on success. In collaboration with: rdivacky Tested by: Scot Hetzel <swhetzel gmail com>, Milos Vyletel <mvyletel mzm cz> Sponsored by: Google SoC 2007	2007-05-23 08:33:06 +00:00
kan	ea141892dc	Do not dereference linux_to_bsd_signal[-1] if userland has passed zero as exit signal. GCC 4.2 changes the kernel data segment layout not to have 0 in that memory location. This code ran by luck before and now the luck has run out.	2007-05-11 01:25:51 +00:00
jkim	b204c9cc13	MFP4: Turn emul_lock into a mutex. Submitted by: rdivacky	2007-04-02 18:38:13 +00:00
julian	93fc8e768e	Implement the openat() linux syscall Submitted by: Roman Divacky (rdivacky@) MFC after: 2 weeks	2007-03-29 02:11:46 +00:00
jkim	554fb0a678	MFP4: 115220, 115222 - Fix style(9) and reduce diff between amd64 and i386. - Prefix Linuxulator macros with LINUX_ to prevent future collision.	2007-03-02 00:08:47 +00:00
jkim	2620bd06da	MFP4: 115094 Linux does not check file descriptor when MAP_ANONYMOUS is set. This should fix recent LTP test regressions. Reported by: Scot Hetzel (swhetzel at gmail dot com) netchild	2007-02-27 02:08:01 +00:00
netchild	888f5e57b2	Partial MFp4 of 114977: Whitespace commit: Fix grammar, spelling and punctuation. Submitted by: "Scot Hetzel" <swhetzel@gmail.com>	2007-02-24 16:49:25 +00:00
netchild	902cc4aeba	MFp4 (114193 (i386 part), 114194, 114195, 114200): - Don't "return" in linux_clone() after we forked the new process in case of problems. - Move the copyout of p2->p_pid outside the emul_lock coverage in linux_clone(). - Cache the em->pdeath_signal in a local variable and move the copyout out of the emul_lock coverage. - Move the free() out of the emul_shared_lock coverage in preparation for switching emul_lock to a non-sleepable lock (mutex). Submitted by: rdivacky	2007-02-23 22:39:26 +00:00
jkim	6fadbd6f66	Regen.	2007-02-15 00:57:04 +00:00
jkim	df99d574b5	MFP4: 113025, 113146, 113177, 113203, 113500, 113546, 113570 - PROT_READ, PROT_WRITE, or PROT_EXEC implies PROT_READ and PROT_EXEC. Linux/ia64's i386 emulation layer does this and it complies with Linux header files. This fixes mmap05 LTP test case on amd64. - Do not adjust stack size when failure has occurred. - Synchronize i386 mmap/mprotect with amd64.	2007-02-15 00:54:40 +00:00
kib	8f812418c1	Introduce some more SO_ option equivalents from Linux to FreeBSD. The msg variable in linux_recvmsg() was not initialized. Copy it from userspace. Submitted by: rdivacky	2007-02-01 13:36:19 +00:00
kib	b9ce1aaa2a	Fix LOR that occurs because proctree_lock was acquired while holding emuldata lock by moving the code upwards outside the emul_lock coverage. Submitted by: rdivacky	2007-02-01 13:27:52 +00:00
jeff	474b917526	- Remove setrunqueue and replace it with direct calls to sched_add(). setrunqueue() was mostly empty. The few asserts and thread state setting were moved to the individual schedulers. sched_add() was chosen to displace it for naming consistency reasons. - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be different on all three schedulers where it was only called in one place each. - Remove the long ifdef'd out remrunqueue code. - Remove the now redundant ts_state. Inspect the thread state directly. - Don't set TSF_* flags from kern_switch.c, we were only doing this to support a feature in one scheduler. - Change sched_choose() to return a thread rather than a td_sched. Also, rely on the schedulers to return the idlethread. This simplifies the logic in choosethread(). Aside from the run queue links kern_switch.c mostly does not care about the contents of td_sched. Discussed with: julian - Move the idle thread loop into the per scheduler area. ULE wants to do something different from the other schedulers. Suggested by: jhb Tested on: x86/amd64 sched_{4BSD, ULE, CORE}.	2007-01-23 08:46:51 +00:00
netchild	42392e7a0b	MFp4 (113077, 113083, 113103, 113124, 113097): Don't expose em->shared to the outside world before it's properly initialized. Might not affect anything but it's at least a better coding style. Don't expose em via p->p_emuldata until it's properly initialized. This also enables us to get rid of some locking and simplify the code because we are working on a local copy. In linux_fork and linux_vfork create the process in stopped state to be sure that the new process runs with a fully initialized emuldata structure [1]. Also fix the vfork (both in linux_clone and linux_vfork) race that could result in a process that is never woken up [2]. Reported by: Scot Hetzel [1] Suggested by: jhb [2] Reviewed by: jhb (at least some important parts) Submitted by: rdivacky Tested by: Scot Hetzel (on amd64) Change 2 comments (in the new code) to comply with style(9). Suggested by: jhb	2007-01-20 14:58:59 +00:00
netchild	4ffc7bc7ea	MFp4 (112893): Make linux_vfork() actually work. This enables make to work again with 2.6. It also fixes the LTP vfork tests. Submitted by: rdivacky	2007-01-14 16:20:37 +00:00
netchild	977ef4a8bc	MFp4 (112498): Rename the locking flags to EMUL_DOLOCK and EMUL_DONTLOCK to prevent confusion. Submitted by: rdivacky	2007-01-07 19:00:38 +00:00
netchild	f87d2b65bb	regen after addition of linux_utimes and linux_rt_sigtimedwait	2006-12-31 13:20:31 +00:00
netchild	33166d619b	MFp4 (111746, 108671, 108945, 112352): - add linux utimes syscall [1] - add linux rt_sigtimedwait syscall [2] Submitted by: "Scot Hetzel" <swhetzel@gmail.com> [1] Submitted by: Bruce Becker <hostmaster@whois.gts.net> [2] PR: 93199 [2]	2006-12-31 13:16:00 +00:00
jkim	6397f13732	Regen (just to fix 'generated from' line from the previous commit).	2006-12-20 20:42:58 +00:00
jkim	da8c5f8136	Add linux_nanosleep() and regen.	2006-12-20 20:21:48 +00:00
jkim	3b05cb0c58	MFP4: 109655 - Move linux_nanosleep() from src/sys/amd64/linux32/linux32_machdep.c to src/sys/compat/linux/linux_time.c. - Validate timespec ranges before use as Linux kernel does. - Fix l_timespec structure. - Clean up style(9) nits.	2006-12-20 20:17:35 +00:00
trhodes	58cca8458a	Merge posix4/* into normal kernel hierarchy. Reviewed by: glanced at by jhb Approved by: silence on -arch@ and -standards@	2006-11-11 16:26:58 +00:00
rwatson	10d0d9cf47	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
netchild	248a5cec67	regen after linux_io_* backout	2006-10-29 14:12:44 +00:00
netchild	b17bbadb52	Backout the linux aio stuff. Several problems were identified and the dynamic nature (if no native aio code is available, the linux part returns ENOSYS because of missing requisites) should be solved differently than it is. All this will be done in P4. Not included in this commit is a backout of the changes to the native aio code (removing static in some places). Those changes (and some more) will also be needed when the reworked linux aio stuff reenters the tree. Requested by: rwatson Discussed with: rwatson	2006-10-29 14:02:39 +00:00
netchild	75889b9911	regen (prctl addition)	2006-10-28 11:24:38 +00:00
netchild	963ac453db	MFP4: Implement prctl(). Submitted by: rdivacky Tested with: LTP	2006-10-28 10:59:59 +00:00
netchild	357a456178	Fix a recent regression regarding valid signals. Submitted by: rdivacky	2006-10-20 10:09:40 +00:00
netchild	f2cc0e8140	regen (linux AIO stuff)	2006-10-15 14:24:10 +00:00
netchild	183bd5a34b	MFP4 (with some minor changes): Implement the linux_io_* syscalls (AIO). They are only enabled if the native AIO code is available (either compiled in to the kernel or as a module) at the time the functions are used. If the AIO stuff is not available there will be a ENOSYS. From the submitter: ---snip--- DESIGN NOTES: 1. Linux permits a process to own multiple AIO queues (distinguished by "context"), but FreeBSD creates only one single AIO queue per process. My code maintains a request queue (STAILQ of queue(3)) per "context", and throws all AIO requests of all contexts owned by a process into the single FreeBSD per-process AIO queue. When the process calls io_destroy(2), io_getevents(2), io_submit(2) and io_cancel(2), my code can pick out requests owned by the specified context from the single FreeBSD per-process AIO queue according to the per-context request queues maintained by my code. 2. The request queue maintained by my code stores contrast information between Linux IO control blocks (struct linux_iocb) and FreeBSD IO control blocks (struct aiocb). FreeBSD IO control block actually exists in userland memory space, required by FreeBSD native aio_XXXXXX(2). 3. It is quite troubling that the function io_getevents() of libaio-0.3.105 needs to use Linux-specific "struct aio_ring", which is a partial mirror of context in user space. I would rather take the address of context in kernel as the context ID, but the io_getevents() of libaio forces me to take the address of the "ring" in user space as the context ID. To my surprise, one comment line in the file "io_getevents.c" of libaio-0.3.105 reads: Ben will hate me for this REFERENCE: 1. Linux kernel source code: http://www.kernel.org/pub/linux/kernel/v2.6/ (include/linux/aio_abi.h, fs/aio.c) 2. Linux manual pages: http://www.kernel.org/pub/linux/docs/manpages/ (io_setup(2), io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2)) 3. Linux Scalability Effort: http://lse.sourceforge.net/io/aio.html The design notes: http://lse.sourceforge.net/io/aionotes.txt 4. The package libaio, both source and binary: http://rpmfind.net/linux/rpm2html/search.php?query=libaio Simple transparent interface to Linux AIO system calls. 5. Libaio-oracle: http://oss.oracle.com/projects/libaio-oracle/ POSIX AIO implementation based on Linux AIO system calls (depending on libaio). ---snip--- Submitted by: Li, Xiao <intron@intron.ac>	2006-10-15 14:22:14 +00:00
netchild	a9266094f2	MFP4 (106538 + 106541): Implement CLONE_VFORK. This fixes the clone05 LTP test. Submitted by: rdivacky	2006-10-15 13:39:40 +00:00
netchild	96943a3038	Revert my previous commit, I mismerged this to the wrong place. Pointy hat to: netchild	2006-10-15 13:30:45 +00:00
netchild	7e5ad63262	MFP4 (106541): Fix the clone05 test in the LTP. Submitted by: rdivacky	2006-10-15 13:25:23 +00:00
netchild	6dc0f7cde5	MFP4 (107144[1]): Implement CLONE_FS on i386[1] and amd64. Submitted by: rdivacky [1]	2006-10-15 13:22:14 +00:00
netchild	4afde07449	MFP4 (107868 - 107870): Use a macro to test for a valid signal instead of doing it by hand everywhere. Submitted by: rdivacky	2006-10-15 12:51:43 +00:00
rwatson	582a76db5e	Regenerate.	2006-09-21 16:20:38 +00:00
rwatson	03d308eea1	Use AUE_CREAT instead of AUE_O_CREAT for linux_creat(). Obtained from: TrustedBSD Project	2006-09-21 16:18:33 +00:00
rwatson	9dfc38a01d	Regenerate.	2006-09-21 16:13:16 +00:00
rwatson	d385bd6fbb	Use AUE_GETDIRENTRIES instead of AUE_O_GETDENTS and AUE_NULL for a number of directory reading system calls. Respell a mis-spelled event name. Clean up white space/line wraps in a couple of places. Assign event numbers to some new system call entries that have turned up in the list since audit support was added. Obtained from: TrustedBSD Project	2006-09-21 16:12:58 +00:00
netchild	81cdbc19d7	style(9) While I'm here add a MFC reminder, I forgot it in the previous commit. Noticed by: ssouhlal MFC after: 1 week	2006-09-20 19:27:11 +00:00
netchild	9f4eed62b7	Bring the i386 linux mmap code more into line with how linux (2.4.x) behaves. This fixes a lot of tests which failed before. For amd64 there are still some problems, but without any testers who apply patches and run some predefined tests we can't do more ATM. Submitted by: Marcin Cieslak <saper@SYSTEM.PL> (minor fixups by myself) Tested with: LTP	2006-09-20 17:24:20 +00:00
netchild	2140995733	Change futex lock from mutex to sx. Make futex_get atomic (protected by the futex lock). Sponsored by: Google SoC 2006 Submitted by: rdivacky Suggested by: jhb	2006-09-09 16:25:25 +00:00
netchild	c1c941b5f5	Fix video playing and network connections in realplayer (and most likely other stuff) in the osrelease=2.6.16 case: - implement CLONE_PARENT semantic - fix TLS loading in clone CLONE_SETTLS - lock proc in the currently disabled part of CLONE_THREAD I suggest not unloading the linux module after testing this: there are some "<defunct>" processes hanging around after exiting (they aren't with osrelease=2.4.2) and they may panic your kernel when unloading the linux module. They are in state Z and some of them consume CPU according to ps. But I don't trust the CPU part; the idle threads get so much CPU that this may be possible (accumulating idle, X and 2 defunct processes results in 104.7%, this looks too much to be a rounding error). Noticed by: Intron <mag@intron.ac> Submitted by: rdivacky (in collaboration with Intron) Tested by: Intron, netchild Reviewed by: jhb (previous version)	2006-08-27 18:51:32 +00:00
netchild	ac9f0aa27b	regen	2006-08-27 08:58:00 +00:00
netchild	33681b868d	Add the linux statfs64 call. This allows Tivoli backup to proceed a little bit further on -current (still not successful, but a step in the right direction). Sponsored by: Google SoC 2006 Submitted by: rdivacky Tested by: Paul Mather <paul@gromit.dlib.vt.edu>	2006-08-27 08:56:54 +00:00
netchild	fedc5604a0	Emulate what vfork does instead of using it in linux_vfork. This way we can do the stuff we need to do with linux processes at fork and don't panic the kernel at exit of the child. Submitted by: rdivacky Tested with: tst-vfork* (glibc regression tests) Tested by: netchild	2006-08-25 11:59:56 +00:00
netchild	81450589e7	Get rid of some nested includes. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb	2006-08-19 15:13:01 +00:00
netchild	5d552cdc47	Move some stuff into headers where they belong. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb, ssouhlal	2006-08-17 21:06:48 +00:00
netchild	39fd1c6d47	Style fixes to comments. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb, ssouhlal	2006-08-16 18:54:51 +00:00
jhb	d900df3c77	Regen to propagate <prefix>_AUE_<mumble> changes as well as the earlier systrace changes.	2006-08-15 17:37:01 +00:00
jhb	61e1e0725a	- Remove unused sysvec variables from various syscalls.conf. - Send the systrace_args files for all the compat ABIs to /dev/null for now. Right now makesyscalls.sh generates a file with a hardcoded function name, so it wouldn't work for any of the ABIs anyway. Probably the function name should be configurable via a 'systracename' variable and the functions should be stored in a function pointer in the sysvec structure.	2006-08-15 17:25:55 +00:00
netchild	133c6ea862	add autogenerated systrace_args stuff for dtrace	2006-08-15 12:56:36 +00:00
netchild	ec2ba5d85d	Add the linux 2.6.x stuff (not used by default!): - TLS - complete - pid/tid mangling - complete - thread area - complete - futexes - complete with issues - clone() extension - complete with some possible minor issues - mq/timer/clock* stuff - complete but untested and the mq* stuff is disabled when not built as part of the kernel with native FreeBSD mq* support (module support for this will come later) Tested with: - linux-firefox - works, tested - linux-opera - works, tested - linux-realplay - doesn't work, issue with futexes - linux-skype - doesn't work, issue with futexes - linux-rt2-demo - works, tested - linux-acroread - doesn't work, unknown reason (coredump) and sometimes issue with futexes - various unix utilities in linux-base-gentoo3 and linux-base-fc4: everything tried worked On amd64 not everything is supported like on i386, the catchup is planned for later when the remaining bugs in the new functions are fixed. To test this new stuff, you have to run sysctl compat.linux.osrelease=2.6.16; to switch back use sysctl compat.linux.osrelease=2.4.2 Don't switch while running a linux program, strange things may or may not happen. Sponsored by: Google SoC 2006 Submitted by: rdivacky Some suggestions/help by: jhb, kib, manu@NetBSD.org, netchild	2006-08-15 12:54:30 +00:00
netchild	e8cb5b5578	regen	2006-08-15 12:51:45 +00:00
netchild	fd333609bf	Add new syscalls in the linuxolator (only used when the sysctl compat.linux.osrelease is changed to "2.6.16" or similar). On amd64 not everything is supported like on i386, the catchup is planned for later when the remaining bugs in the new functions are fixed. Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-08-15 12:28:14 +00:00
netchild	1f1a93f2ab	Add some more errno mappings (bsd -> linux) and a comment about the status. Submitted by: "Intron" <mag@intron.ac>	2006-08-10 22:05:25 +00:00
jhb	dee1b3da95	Regen for MPSAFE flag removal.	2006-07-28 19:08:37 +00:00
jhb	c62c38439f	Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to mark system calls as being MPSAFE: - Stop conditionally acquiring Giant around system call invocations. - Remove all of the 'M' prefixes from the master system call files. - Remove support for the 'M' prefix from the script that generates the syscall-related files from the master system call files. - Don't explicitly set SYF_MPSAFE when registering nfssvc.	2006-07-28 19:05:28 +00:00
jhb	6a211b6d81	Various fixes to comments in the syscall master files including removing cruft from the audit import and adding mention of COMPAT4 to freebsd32.	2006-07-28 18:55:18 +00:00
jhb	e96f2e292b	Regen.	2006-07-21 20:41:33 +00:00
jhb	675c87997e	- Pass the MPSAFE flag to namei() in linux_uselib() and handle conditional Giant VFS locking in that function. - Remove bogus code to handle the case where namei() returns success but a NULL vnode pointer. - Note that this code duplicates exec_check_permissions() and annotate where it differs. - Hold the vnode lock longer to protect the write to set VV_TEXT in v_vflag. - Mark linux_uselib() MPSAFE. Reviewed by: rwatson	2006-07-21 20:22:13 +00:00
jhb	286a0ec5a8	Regen.	2006-07-11 20:55:23 +00:00
jhb	9569e81b84	- Add conditional VFS Giant locking to getdents_common() (linux ABIs), ibcs2_getdents(), ibcs2_read(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() similar to that in getdirentries(). - Mark ibcs2_getdents(), ibcs2_read(), linux_getdents(), linux_getdents64(), linux_readdir(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() MPSAFE.	2006-07-11 20:52:08 +00:00
jhb	a63b63284f	Regen.	2006-07-06 21:43:14 +00:00
jhb	4d231459c7	- Protect the list of linux ioctl handlers with an sx lock. - Hold Giant while calling linux ioctl handlers for now as they aren't all known to be MPSAFE yet. - Mark linux_ioctl() MPSAFE.	2006-07-06 21:42:36 +00:00
jhb	693417c025	Regen.	2006-06-27 18:32:16 +00:00
jhb	dff69a853e	- Add a kern_semctl() helper function for __semctl(). It accepts a pointer to a copied-in copy of the 'union semun' and a uioseg to indicate which memory space the 'buf' pointer of the union points to. This is then used in linux_semctl() and svr4_sys_semctl() to eliminate use of the stackgap. - Mark linux_ipc() and svr4_sys_semsys() MPSAFE.	2006-06-27 18:28:50 +00:00
jhb	db4d1f72c7	Regen.	2006-06-27 14:47:08 +00:00
jhb	5ceeece21b	- Expand the scope of Giant some in mount(2) to protect the vfsp structure from going away. mount(2) is now MPSAFE. - Expand the scope of Giant some in unmount(2) to protect the mp structure (or rather, to handle concurrent unmount races) from going away. umount(2) is now MPSAFE, as well as linux_umount() and linux_oldumount(). - nmount(2) and linux_mount() were already MPSAFE.	2006-06-27 14:46:31 +00:00
jhb	368eefb9bf	Regen.	2006-06-26 18:37:36 +00:00
jhb	ddfdf64e37	linux_brk() is MPSAFE.	2006-06-26 18:36:16 +00:00
netchild	64550de991	regen after change to syscalls.master	2006-06-20 20:41:29 +00:00
netchild	247b98ef25	Switch to using the DUMMY infrastructure instead of UNIMPL for the new syscalls. This way there will be a log message printed to the console (this time for real). Note: UNIMPL should be used for syscalls we do not implement ever, e.g. syscalls to load linux kernel modules. Submitted by: rdivacky Sponsored by: Google SoC 2006 P4 IDs: 99600, 99602	2006-06-20 20:38:44 +00:00
netchild	de5cf4e1bd	regen after MFP4 (soc2006/rdivacky_linuxolator) of syscalls.master P4-Changes: similar to 98673 and 98675 but regenerated locally Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-06-13 18:48:30 +00:00
netchild	a561ebc3f4	MFP4 (soc2006/rdivacky_linuxolator) Update of syscall.master: o Adding of several new dummy syscalls (268-310) o Synchronization of amd64 syscall.master with i386 one o Auditing added to amd64 syscall.master o Change auditing type for lstat syscall (bugfix). [1] P4-Changes: 98672, 98674 Noticed by: rwatson [1] Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-06-13 18:43:55 +00:00
netchild	021fd75458	regen (linux rt_sigpending)	2006-05-10 18:19:51 +00:00
netchild	24c492f42c	Implement rt_sigpending in the linuxolator. PR: 92671 Submitted by: Markus Niemist"o <markus.niemisto@gmx.net>	2006-05-10 18:17:29 +00:00
ambrisko	31b22ce017	Enhance the Linux emulation layer to make MegaRAID SAS managements tool happy. Add back in a scheme to emulate old type major/minor numbers via hooks into stat, linprocfs to return major/minors that Linux app's expect. Currently only /dev/null is always registered. Drivers can register via the Linux type shim similar to the ioctl shim but by using linux_device_register_handler/linux_device_unregister_handler functions. The structure is: struct linux_device_handler { char bsd_driver_name; char linux_driver_name; char bsd_device_name; char linux_device_name; int linux_major; int linux_minor; int linux_char_device; }; Linprocfs uses this to display the major number of the driver. The soon to be available linsysfs will use it to fill in the driver name. Linux_stat uses it to translate the major/minor into Linux type values. Note major numbers are dynamically assigned via passing in a -1 for the major number so we don't need to keep track of them. This is somewhat needed due to us switching to our devfs. MegaCli will not run until I add in the linsysfs and mfi Linux compat changes. Sponsored by: IronPort Systems	2006-05-05 16:10:45 +00:00
netchild	39276e2b1e	regen	2006-03-18 20:49:01 +00:00
netchild	d1db96cb48	Fixup some problems in my previous commit (COMPAT_43). Pointyhat to: netchild	2006-03-18 20:47:36 +00:00
netchild	8fd6664412	regen after COMPAT_43 removal	2006-03-18 18:24:38 +00:00
netchild	c1829f604c	Get rid of the need of COMPAT_43 in the linuxolator. Submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz> Obtained from: DragonFly (some parts)	2006-03-18 18:20:17 +00:00
jhb	ff9c76bccd	Close some races between procfs/ptrace and exit(2): - Reorder the events in exit(2) slightly so that we trigger the S_EXIT stop event earlier. After we have signalled that, we set P_WEXIT and then wait for any processes with a hold on the vmspace via PHOLD to release it. PHOLD now KASSERT()'s that P_WEXIT is clear when it is invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops to zero. - Change proc_rwmem() to require that the process being read from has its vmspace held via PHOLD by the caller and get rid of all the junk to screw around with the vmspace reference count as we no longer need it. - In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it doesn't exist. - Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem() to clear an earlier single-step simulated via a breakpoint). We only do one to avoid races. Also, by making the EINVAL error for unknown requests be part of the default: case in the switch, the various switch cases can now just break out to return which removes a _lot_ of duplicated PRELE and proc unlocks, etc. Also, it fixes at least one bug where a LWP ptrace command could return EINVAL with the proc lock still held. - Changed the locking for ptrace_single_step(), ptrace_set_pc(), and ptrace_clear_single_step() to always be called with the proc lock held (it was a mixed bag previously). Alpha and arm have to drop the lock while they mess around with breakpoints, but other archs avoid extra lock release/acquires in ptrace(). I did have to fix a couple of other consumers in kern_kse and a few other places to hold the proc lock and PHOLD. Tested by: ps (1 mostly, but some bits of 2-4 as well) MFC after: 1 week	2006-02-22 18:57:50 +00:00
jhb	ae432f93f2	- Always call exec_free_args() in kern_execve() instead of doing it in all the callers if the exec either succeeds or fails early. - Move the code to call exit1() if the exec fails after the vmspace is gone to the bottom of kern_execve() to cut down on some code duplication.	2006-02-06 22:06:54 +00:00
rwatson	3a79f09166	Regenerate.	2006-02-06 01:40:48 +00:00
rwatson	59732048da	Assign audit event identifiers to Linux i386 system calls. Obtained from: TrustedBSD Project	2006-02-06 01:40:30 +00:00
sobomax	34fa5a81a5	Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually allow executing elf dynamic binaries (aka shared libraries). When it is requested to execute ET_DYN elf image check if this flag is on after we know the elf brand allowing execution if so. PR: kern/87615 Submitted by: Marcin Koziej <creep@desk.pl>	2005-12-26 21:23:57 +00:00
jhb	feebef55c2	Remove linux_mib_destroy() (which I actually added in between 5.0 and 5.1) which existed to cleanup the linux_osname mutex. Now that MTX_SYSINIT() has grown a SYSUNINIT to destroy mutexes on unload, the extra destroy here was redundant and resulted in panics in debug kernels. MFC after: 1 week Reported by: Goran Gajic ggajic at afrodita dot rcub dot bg dot ac dot yu	2005-12-15 16:30:41 +00:00
jhb	ecc6e8dc5a	The signal code is now an int rather than a long, so update debug printfs.	2005-10-14 20:22:57 +00:00
davidxu	3fbdb3c215	1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most changes in MD code are trivial, before this change, trapsignal and sendsig use discrete parameters, now they uses member fields of ksiginfo_t structure. For sendsig, this change allows us to pass POSIX realtime signal value to user code. 2. Remove cpu_thread_siginfo, it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread. 3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc. 4. Add td_sigqueue to thread structure to hold all signals delivered to thread. 5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed. 6. In this sigqueue implementation, pending signal set is kept as before, an extra siginfo list holds additional siginfo_t data for signals. kernel code uses psignal() still behavior as before, it won't be failed even under memory pressure, only exception is when deleting a signal, we should call sigqueue_delete to remove signal from sigqueue but not SIGDELSET. Current there is no kernel code will deliver a signal with additional data, so kernel should be as stable as before, a ksiginfo can carry more information, for example, allow signal to be delivered but throw away siginfo data if memory is not enough. SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can not be caught or masked. The sigqueue() syscall allows user code to queue a signal to target process, if resource is unavailable, EAGAIN will be returned as specification said. Just before thread exits, signal queue memory will be freed by sigqueue_flush. Current, all signals are allowed to be queued, not only realtime signals. Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64	2005-10-14 12:43:47 +00:00
sobomax	c3270af7f0	Propagate error code of kern_execve() to the caller properly. PR: 81670 Submitted by: Andrew Bliznak <andriko.b@gmail.com> Pointy hat to: sobomax	2005-08-01 17:35:48 +00:00
jhb	114f6b764d	Move MODULE_DEPEND() statements for SYSVIPC dependencies to linux_ipc.c so that they aren't duplicated 3 times and are also in the same file as the code that depends on the SYSVIPC modules.	2005-07-29 19:40:39 +00:00
jhb	8ca187d620	Regen.	2005-07-13 20:35:09 +00:00
jhb	7e35629af2	Make a pass through all the compat ABIs synchronizing the MP safe flags with the master syscall table as well as marking several ABI wrapper functions safe. MFC after: 1 week	2005-07-13 20:32:42 +00:00
delphij	019106f6e5	Remove the CPU_ENABLE_SSE option from the i386 and pc98 architectures, as they are already default for I686_CPU for almost 3 years, and CPU_DISABLE_SSE always disables it. On the other hand, CPU_ENABLE_SSE does not work for I486_CPU and I586_CPU. This commit has: - Removed the option from conf/options.* - Removed the option and comments from MD NOTES files - Simplified the CPU_ENABLE_SSE ifdef's so they don't deal with CPU_ENABLE_SSE from kernel configuration. () For most users, this commit should be largely no-op. If you used to place CPU_ENABLE_SSE into your kernel configuration for some reason, it is time to remove it. () The ifdef's of CPU_ENABLE_SSE are not removed at this point, since we need to change it to !defined(CPU_DISABLE_SSE) && defined(I686_CPU), not just !defined(CPU_DISABLE_SSE), if we really want to do so. Discussed on: -arch Approved by: re (scottl)	2005-07-02 20:06:44 +00:00
sobomax	3d445ed2f2	Regen after addition of linux_getpriority wrapper. PR: kern/81951 Submitted by: Andriy Gapon <avg@icyb.net.ua> MFC after: 1 week	2005-06-08 20:47:30 +00:00
sobomax	307c6bb149	Properly convert FreeBSD priority values into Linux values in the getpriority(2) syscall. PR: kern/81951 Submitted by: Andriy Gapon <avg@icyb.net.ua>	2005-06-08 20:41:28 +00:00
rwatson	5010364761	Rebuild generated system call definition files following the addition of the audit event field to the syscalls.master file format. Submitted by: wsalamon Obtained from: TrustedBSD Project	2005-05-30 15:20:21 +00:00
rwatson	370e72b242	Introduce a new field in the syscalls.master file format to hold the audit event identifier associated with each system call, which will be stored by makesyscalls.sh in the sy_auevent field of struct sysent. For now, default the audit identifier on all system calls to AUE_NULL, but in the near future, other BSM event identifiers will be used. The mapping of system calls to event identifiers is many:one due to multiple system calls that map to the same end functionality across compatibility wrappers, ABI wrappers, etc. Submitted by: wsalamon Obtained from: TrustedBSD Project	2005-05-30 15:09:18 +00:00
mdodd	6f940cf20f	Add support for O_NOFOLLOW and O_DIRECT to Linux fcntl() F_GETFL/F_SETFL.	2005-04-13 04:31:43 +00:00
jhb	a3c6b782c3	- Change the vm_mmap() function to accept an objtype_t parameter specifying the type of object represented by the handle argument. - Allow vm_mmap() to map device memory via cdev objects in addition to vnodes and anonymous memory. Note that mmaping a cdev directly does not currently perform any MAC checks like mapping a vnode does. - Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the cdev the ioctl is acting on rather than trying to find a suitable vnode to map from. Reviewed by: alc, arch@	2005-04-01 20:00:11 +00:00
sobomax	44e9d0b353	Regen after addition of linux_nosys handler.	2005-03-07 00:23:58 +00:00
sobomax	f706f4bce8	Handle unimplemented syscall by instantly returning ENOSYS instead of sending signal first and only then returning ENOSYS to match what real linux does. PR: kern/74302 Submitted by: Travis Poppe <tlp@LiquidX.org>	2005-03-07 00:18:06 +00:00
sobomax	1485460070	In linux emulation layer try to detect attempt to use linux_clone() to create kernel threads and call rfork(2) with RFTHREAD flag set in this case, which puts parent and child into the same threading group. As a result all threads that belong to the same program end up in the same threading group. This is similar to what linuxthreads port does, though in this case we don't have a luxury of having access to the source code and there is no definite way to differentiate linux_clone() called for threading purposes from other uses, so that we have to resort to heuristics. Allow SIGTHR to be delivered between all processes in the same threading group previously it has been blocked for s[ug]id processes. This also should improve locking of the same file descriptor from different threads in programs running under linux compat layer. PR: kern/72922 Reported by: Andriy Gapon <avg@icyb.net.ua> Idea suggested by: rwatson	2005-03-03 16:57:55 +00:00
jhb	d92d6a0f9f	Use linux_emul_convpath() rather than linux_emul_find() as linux_emul_find() is going away.	2005-02-07 18:37:51 +00:00
jhb	2e8b9720fa	Use the LCONVPATHEXIST() macro rather than its exact expansion to be consistent.	2005-02-07 18:37:13 +00:00
das	89cc41ef8f	When running Linux binaries, set up the initial FPU state as Linux would. PR: 28966	2005-02-06 17:29:20 +00:00
sobomax	f489acaf0f	o Split out kernel part of execve(2) syscall into two parts: one that copies arguments into the kernel space and one that operates completely in the kernel space; o use kernel-only version of execve(2) to kill another stackgap in linuxlator/i386. Obtained from: DragonFlyBSD (partially) MFC after: 2 weeks	2005-01-29 23:12:00 +00:00
sobomax	ef41053770	o Move copyin()/copyout() out of i386_{get,set}_ldt() and i386_{get,set}_ioperm() and make those APIs visible in the kernel namespace; o use i386_{get,set}_ldt() and i386_{get,set}_ioperm() instead of sysarch() in the linuxlator, which allows to kill another two stackgaps. MFC after: 2 weeks	2005-01-26 13:59:46 +00:00
imp	8d58b9df12	/* -> /*- for copyright notices, minor format tweaks as necessary	2005-01-06 22:18:23 +00:00
das	7f13dc5af0	Axe the semblance of support for PECOFF and Linux a.out core dumps.	2004-11-27 06:46:45 +00:00
das	8d8b5ace18	Maintain the broken state of backwards compatibility for a.out (and PECOFF!) core dumps. None of the old versions of gdb I tried were able to read a.out core dumps before or after this change. Reviewed by: arch@	2004-11-20 02:32:04 +00:00
das	81fc7cf485	Fix the following race: 1. Process p1 is currently being swapped in. 2. Process p2 calls linux_ptrace(PTRACE_GETFPXREGS, p1_pid, ...) 3. After acquiring a reference to FIRST_THREAD_IN_PROC(p1), p2 blocks in faultin() while p1 finishes being swapped in. This means p2 won't get back the lock on p1 until after p1's threads are runnable. 4. After p1 is swapped in, the first thread in p1 exits. 5. p2 now uses its dangling reference to p1's first thread.	2004-10-01 05:01:00 +00:00
dfr	abfb7537fa	Regen.	2004-09-06 09:33:30 +00:00
dfr	865b03d472	Add a few stub syscalls to get TransGaming's winex a bit closer to working.	2004-09-06 09:32:59 +00:00
julian	e9d9514975	Give setrunqueue() and sched_add() more of a clue as to where they are coming from and what is expected from them. MFC after: 2 days	2004-09-01 02:11:28 +00:00
jhb	325fe79e0c	Correct the arguments to kern_sigaltstack() as they were reversed. PR: kern/68079 Submitted by: Georg-W. Koltermann gwk at rahn-koltermann dot de	2004-08-24 20:52:52 +00:00
jhb	ac08ecfc54	Regenerate after fcntl() wrappers were marked MP safe.	2004-08-24 20:24:34 +00:00
jhb	cc23ea84d0	Fix the ABI wrappers to use kern_fcntl() rather than calling fcntl() directly. This removes a few more users of the stackgap and also marks the syscalls using these wrappers MP safe where appropriate. Tested on: i386 with linux acroread5 Compiled on: i386, alpha LINT	2004-08-24 20:21:21 +00:00
tjr	e6930a385c	Add a new type, l_uintptr_t, which is an unsigned integer type with the same width as a pointer under Linux. Add two new macros, PTRIN and PTROUT, which convert between l_uintptr_t and native pointers.	2004-08-16 07:05:44 +00:00
phk	5c95d686a1	Do a pass over all modules in the kernel and make them return EOPNOTSUPP for unknown events. A number of modules return EINVAL in this instance, and I have left those alone for now and instead taught MOD_QUIESCE to accept this as "didn't do anything".	2004-07-15 08:26:07 +00:00
stefanf	9dea8aeba1	Consistently use __inline instead of __inline__ as the former is an empty macro in <sys/cdefs.h> for compilers without support for inline.	2004-07-04 16:11:03 +00:00
obrien	73ce2e712d	Add casts so all these quantities are a constant type.	2004-06-24 02:24:39 +00:00
tjr	02a7d287a2	Change the types of vn_rdwr_inchunks()'s len and aresid arguments to size_t and size_t *, respectively. Update callers for the new interface. This is a better fix for overflows that occurred when dumping segments larger than 2GB to core files.	2004-06-05 02:18:28 +00:00
bms	54410035ea	Use the BSD madvise() syscall implementation for Linux binary emulation, instead of treating it as an unimplemented syscall. This appears to make StarOffice 7.0 Linux binaries work according to submitter; also tested with nvidia driver by submitter. Submitted by: Matthias Schuendehuette	2004-03-28 21:43:27 +00:00
jhb	15596982a5	Regenerate.	2004-03-15 22:44:35 +00:00
jhb	28f51bd3cc	- Mark ABI syscalls that call wait4() MP safe as recent changes to the kernel wait4() made these all panic() implementations otherwise. - The i386 linux_ptrace() syscall is MP safe. Alpha was already marked MP safe.	2004-03-15 22:43:49 +00:00
jhb	27c73ac133	Regen.	2004-02-04 22:00:44 +00:00
jhb	bb001b4d31	The following compat syscalls are now mpsafe: linux_getrlimit(), linux_setrlimit(), linux_old_getrlimit(), osf1_getrlimit(), osf1_setrlimit(), svr4_sys_ulimit(), svr4_sys_setrlimit(), svr4_sys_getrlimit(), svr4_sys_setrlimit64(), svr4_sys_getrlimit64(), ibcs2_sysconf(), and ibcs2_ulimit().	2004-02-04 21:57:00 +00:00
jhb	279b2b8278	Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64	2004-02-04 21:52:57 +00:00
davidxu	d72ded3ec8	Make sigaltstack as per-threaded, because per-process sigaltstack state is useless for threaded programs, multiple threads can not share same stack. The alternative signal stack is private for thread, no lock is needed, the original P_ALTSTACK is now moved into td_pflags and renamed to TDP_ALTSTACK. For single thread or Linux clone() based threaded program, there is no semantic changed, because those programs only have one kernel thread in every process.	2004-01-03 23:31:29 +00:00
davidxu	f39653dda8	Make sigaltstack as per-threaded, because per-process sigaltstack state is useless for threaded programs, multiple threads can not share same stack. The alternative signal stack is private for thread, no lock is needed, the original P_ALTSTACK is now moved into td_pflags and renamed to TDP_ALTSTACK. For single thread or Linux clone() based threaded program, there is no semantic changed, because those programs only have one kernel thread in every process. Reviewed by: deischen, dfr	2004-01-03 02:02:26 +00:00
bde	14fc79e77b	Sorted includes. Removed duplicates exposed by this.	2003-12-29 06:51:10 +00:00
peter	72906fa267	GC unused 'syshide' override to /dev/null. This was here to disable the output of the namespc column. Its functionality was removed some time ago, but the overrides and the namespc column remained.	2003-12-24 00:32:07 +00:00
peter	66b968e3cb	Regen (should be a NOP except for rcsid changes)	2003-12-23 03:55:06 +00:00
peter	1246f19923	GC unused third namespace column.	2003-12-23 03:54:40 +00:00
peter	998b79089f	Add an additional field to the elf brandinfo structure to support quicker exec-time replacement of the elf interpreter on an emulation environment where an entire /compat/* tree isn't really warranted.	2003-12-23 02:42:39 +00:00
sobomax	a621621dc9	Pull latest changes from OpenBSD: - improve sysinfo(2) syscall; - add dummy fadvise64(2) syscall; - add dummy *xattr(2) family of syscalls; - add protos for the syscalls 222-225, 238-249 and 253-267; - add exit_group(2) syscall, which is currently just wired to exit(2). Obtained from: OpenBSD MFC after: 2 weeks	2003-11-16 15:07:10 +00:00
jhb	2be76da54f	Regen.	2003-11-07 21:36:35 +00:00
jhb	15178fba7e	Sync up MP safe flags with global syscalls.master for the first time. This includes read(), write(), close(), linux_setuid16(), linux_getuid16(), linux_pause(), linux_nice(), linux_kill(), dup(), linux_pipe(), linux_setgid16(), linux_getgid16(), linux_signal(), linux_geteuid16(), linux_getegid16(), acct(), setpgid(), umask(), dup2(), getppid(), getpgrp(), setsid(), linux_sigaction(), linux_sgetmask(), linux_ssetmask(), linux_setreuid16(), linux_setregid16(), linux_sigsuspend(), getrusage(), gettimeofday(), linux_getgroups16(), linux_setgroups16(), getpriority(), setpriority(), linux_sigreturn(), linux_clone(), linux_sigprocmask(), linux_getsid(), mlock(), munlock(), mlockall(), munlockall(), sched_setparam(), sched_getparam(), linux_sched_setscheduler(), linux_sched_getscheduler(), linux_sched_get_priority_max(), linux_sched_get_priority_min(), sched_rr_get_interval(), linux_setresuid16(), linux_getresuid16(), linux_setresgid16(), linux_getresgid16(), linux_rt_sigaction(), linux_rt_sigprocmask(), linux_rt_sigsuspend(), geteuid(), getegid(), setreuid(), setregid(), linux_getgroups(), linux_setgroups(), setresuid(), getresuid(), setresgid(), getresgid(), setuid(), and setgid().	2003-11-07 21:36:14 +00:00
peter	8ecb3577d8	Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit systems where the data/stack/etc limits are too big for a 32 bit process. Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c. Supply an ia32_fixlimits function. Export the clip/default values to sysctl under the compat.ia32 hierarchy. Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max value rather than the sysctl tweakable variable. This allows mmap to place mappings at sensible locations when limits have been reduced. Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same method as mmap(0, ...) now does. Note that we cannot remove all references to the sysctl tweakable maxdsiz etc variables because /etc/login.conf specifies a datasize of 'unlimited'. And that causes exec etc to fail since it can no longer find space to mmap things.	2003-09-25 01:10:26 +00:00
bde	216d4d73b8	Restored non-egregious casts so that this file compiles on i386's with 64-bit longs again.	2003-09-07 13:23:45 +00:00
davidxu	abb4420bbe	Rename P_THREADED to P_SA. P_SA means a process is using scheduler activations.	2003-06-15 00:31:24 +00:00
obrien	d898e2dba5	Use __FBSDID().	2003-06-02 16:56:40 +00:00
jhb	89a4eb17de	- Merge struct procsig with struct sigacts. - Move struct sigacts out of the u-area and malloc() it using the M_SUBPROC malloc bucket. - Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(), sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared(). - Remove the p_sigignore, p_sigacts, and p_sigcatch macros. - Add a mutex to struct sigacts that protects all the members of the struct. - Add sigacts locking. - Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now that sigacts is locked. - Several in-kernel functions such as psignal(), tdsignal(), trapsignal(), and thread_stopped() are now MP safe. Reviewed by: arch@ Approved by: re (rwatson)	2003-05-13 20:36:02 +00:00
mdodd	232af61924	Provide exec_linux_setregs() to override exec_setregs(). Linux initializes %gs to 0. Mimic this behavior. Submitted by: Christian Zander <zander@minion.de> Reviewed by: jake Approved by: re	2003-05-11 21:51:11 +00:00
jhb	d5cf4c5275	Prefer the proc lock to sched_lock when testing PS_INMEM now that it is safe to do so.	2003-04-22 20:01:56 +00:00
jhb	85d7526d96	Synchronize the two linux_clone() implementations which includes a few minor cleanups in both.	2003-04-18 20:54:41 +00:00
jhb	f6f1e291b9	Don't drop the proc lock just to reacquire it after a few simple assignment statements. Just hold the lock the entire time.	2003-04-17 22:18:07 +00:00
jhb	64f102eb0c	Sync up with changes to ptrace() and use P_SHOULDSTOP instead of a duplicate P_TRACED check. Submitted by: marcel	2003-04-15 16:29:39 +00:00
jeff	46e6ba39f1	- Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with a follow on commit to kern_sig.c - signotify() now operates on a thread since unmasked pending signals are stored in the thread. - PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.	2003-03-31 22:49:17 +00:00
jeff	4a3718fb25	- Change trapsignal() to accept a thread and not a proc. - Change all consumers to pass in a thread. Right now this does not cause any functional changes but it will be important later when signals can be delivered to specific threads.	2003-03-31 22:02:38 +00:00
jhb	fdc61a3a24	Add missing includes from previous commit. Reported by: des	2003-03-27 18:18:35 +00:00
jhb	72a1a2619c	Add a cleanup function to destroy the osname_lock and call it on module unload. Submitted by: gallatin Reported by: Martin Karlsson <mk-freebsd@bredband.net>	2003-03-26 18:29:44 +00:00

789 Commits