freebsd-skq

Author	SHA1	Message	Date
Roman Divacky	2e1a489300	d_ino member of linux_dirent structure should be unsigned long. Submitted by: Chagin Dmitry <chagin.dmitry@gmail.com> Approved by: kib (mentor)	2008-06-08 11:09:25 +00:00
Roman Divacky	a47444d525	Switch to emulating Linux 2.6 on default. Approved by: kib (mentor)	2008-06-03 17:50:13 +00:00
Ed Schouten	a147e6cadf	Push down the major/minor conversion for pts/%u to improve consistency. In the mpsafetty branch, Linux sshd seems to work properly inside a jail. Some small modifications had to be made to the Linux compatibility layer. The Linux PTY routines always expect the device major number to be 136 or higher. Our code always set the major/minor number pair to 136:0. This makes routines like ttyname() and ptsname() fail, because we'll end up having ambiguous device numbers. The conversion was not performed on all *stat() routines, which meant in some cases the numbers didn't get transformed. By pushing the conversion into linux_driver_get_major_minor(), the transformation will take place on all calls. Approved by: philip (mentor), rdivacky	2008-06-02 08:40:06 +00:00
Roman Divacky	4732e446fb	Implement robust futexes. Most of the code is modelled after what Linux does. This is because robust futexes are mostly userspace thing which we cannot alter. Two syscalls maintain pointer to userspace list and when process exits a routine walks this list waking up processes sleeping on futexes from that list. Reviewed by: kib (mentor) MFC after: 1 month	2008-05-13 20:01:27 +00:00
Roman Divacky	a6d043e30d	Implement linux_truncate64() syscall. Tested by: Aline de Freitas <aline@riseup.net> Approved by: kib (mentor)	2008-04-23 15:56:33 +00:00
Roman Divacky	872cbe6466	Remove using magic value of -1 to distinguish between linux_open() and linux_openat(). Instead just pass AT_FDCWD into linux_common_open() for the linux_open() case. This prevents passing -1 as a dirfd to openat() from succeeding which is wrong. Suggested by: rwatson, kib Approved by: kib (mentor)	2008-04-09 16:42:50 +00:00
Konstantin Belousov	48b05c3f82	Implement the linux syscalls openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat, renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat. Submitted by: rdivacky Sponsored by: Google Summer of Code 2007 Tested by: pho	2008-04-08 09:45:49 +00:00
Konstantin Belousov	57b4252e45	Add the support for the AT_FDCWD and fd-relative name lookups to the namei(9). Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho	2008-03-31 12:01:21 +00:00
Doug Rabson	dfdcada31e	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
Ruslan Ermilov	d7a38db650	Fix build. Reported by: ache, tinderbox	2008-03-25 13:20:52 +00:00
Roman Divacky	6af821237d	o Add stub support for some new futex operations, so the annoying message is not printed. o Don't warn about FUTEX_FD not being implemented and return ENOSYS instead of 0 (eg. success). o Clear FUTEX_PRIVATE_FLAG as we actually implement only private futexes so there is no reason to return ENOSYS when app asks for a private futex. We don't reject shared futexes because they worked just fine with our implementation so far. Approved by: kib (mentor) Tested by: bsam MFC after: 1 week	2008-03-20 17:03:55 +00:00
Roman Divacky	5dfb688191	Implement sched_setaffinity and get_setaffinity using real cpu affinity setting primitives. Reviewed by: jeff Approved by: kib (mentor)	2008-03-16 16:27:44 +00:00
Konstantin Belousov	a0b0d286bc	Return ENOSYS instead of 0 for the unknown futex operations. Submitted by: rdivacky Reported and tested by: Gary Stanley <gary velocity-servers net>	2008-03-02 14:00:50 +00:00
Konstantin Belousov	cbd2c621f8	Sanitize arguments to linux_mremap(). Check that only MREMAP_FIXED and MREMAP_MAYMOVE flags are specified. Check for the page alignment of the addr argument. Submitted by: rdivacky MFC after: 1 week	2008-02-22 11:47:56 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Attilio Rao	cb05b60a89	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
Konstantin Belousov	d075105da0	After applying LCONVPATH() to the path, do use the converted path instead of original user-mode string in the linux_stat() and linux_lstat() syscalls. Tested by: Peter Holm MFC after: 3 days	2008-01-05 12:36:35 +00:00
Konstantin Belousov	93eba2d50d	Plug the leaks in the present (hopefully, soon to be replaced) implementation of the linux_openat() for the quick MFC. Reported and tested by: Peter Holm MFC after: 3 days	2007-12-29 14:28:01 +00:00
Konstantin Belousov	15b78ac5d1	Apply the LCONVPATH() to the (old) linux_stat() and linux_lstat() syscalls. Without it, code has two problems: - behaviour of the old and new [l]stat are different with regard of the /compat/linux - directly accessing the userspace data from the kernel asks for the panics. Reported and tested by: Peter Holm Reviewed by: rdivacky MFC after: 3 days	2007-12-29 14:25:29 +00:00
Konstantin Belousov	d60f0a3d6a	Implement LINUX_SIOCGIFCOUNT and LINUX_SIOCGIFINDEX/LINUX_SIOGIFINDEX. LINUX_SIOCGIFCOUNT just returns 0 since it is not implemented in the Linux 2.6.16. LINUX_SIOCGIFINDEX/LINUX_SIOGIFINDEX are mapped to the FreeBSD native SIOCGIFINDEX. Tested by: Peter Kostouros <kpeter@melbpc.org.au> Reviewed by: brooks, rpaulo (on net@) Submitted by: rdivacky MFC after: 1 week	2007-11-07 16:42:52 +00:00
Robert Watson	30d239bc4c	Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer	2007-10-24 19:04:04 +00:00
David Malone	3ab8526963	The kernel version of Linux statfs64 is actually supposed to take 3 arguments, but we had forgotten the second argument. Also make the Linux statfs64 struct depend on the architecture because it has an extra 4 bytes padding on amd64 compared to i386. The three argument fix is from David Taylor, the struct statfs64 stuff is my fault. With this patch I can install i386 Linux matlab on an amd64 machine. Submitted by: David Taylor <davidt_at_yadt.co.uk> Approved by: re (kensmith)	2007-09-18 19:50:33 +00:00
Konstantin Belousov	b6e645c90f	Implement fake linux sched_getaffinity() syscall to enable java to work with Linux 2.6 emulation. This shall be reimplemented once FreeBSD gets native scheduler affinity syscalls. Submitted by: rdivacky Reviewed by: jkim Sponsored by: Google Summer of Code 2007 Approved by: re (kensmith)	2007-08-28 12:26:35 +00:00
Robert Watson	0bf686c125	Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminated quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)	2007-08-06 14:26:03 +00:00
Peter Wemm	79d5bdcca5	Don't add the 'pad' argument to the mmap/truncate/etc syscalls. Submitted by: kensmith Approved by: re (kensmith)	2007-07-04 23:06:43 +00:00
Robert Watson	32f9753cfb	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
Matt Jacob	2ba956ed13	Ensure that newpath is always initialized, even for the error case.	2007-06-10 04:37:22 +00:00
Attilio Rao	a1fe14bc33	rufetch and calcru sometimes should be called atomically together. This patch fixes places where they should be called atomically changing their locking requirements (both assume per-proc spinlock held) and introducing rufetchcalc which wrappers both calls to be performed in atomic way. Reviewed by: jeff Approved by: jeff (mentor)	2007-06-09 21:48:44 +00:00
Attilio Rao	2feb50bf7d	Revert VMCNT_* operations introduction. Probabilly, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)	2007-05-31 22:52:15 +00:00
Konstantin Belousov	9e223287c0	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
Konstantin Belousov	1c182de9a9	Move futex support code from <arch>/support.s into linux compat directory. Implement all futex atomic operations in assembler to not depend on the fuword() that does not allow to distinguish between -1 and failure return. Correctly return 0 from atomic operations on success. In collaboration with: rdivacky Tested by: Scot Hetzel <swhetzel gmail com>, Milos Vyletel <mvyletel mzm cz> Sponsored by: Google SoC 2007	2007-05-23 08:33:06 +00:00
Jeff Roberson	222d01951f	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>	2007-05-18 07:10:50 +00:00
Robert Watson	d72a615878	Some Linux applications (ping) pass a non-NULL msg_control argument to sendmsg() while using a 0-length msg_controllen. This isn't allowed in the FreeBSD system call ABI, so detect this case and set msg_control to NULL. This allows Linux ping to work. Submitted by: rdivacky	2007-04-14 10:35:09 +00:00
Scott Long	6eef46be3b	Whitespace fixes	2007-04-10 21:37:37 +00:00
Scott Long	1eba4c7948	Add the CAM 'SG' peripheral device. This device implements a subset of the Linux SCSI SG passthrough device API. The intention is to allow for both running of Linux apps that want to talk to /dev/sg* nodes, and to facilitate porting of apps from Linux to FreeBSD. As such, both native and linuxolator entry points and definitions are provided. Caveats: - This does not support the procfs and sysfs nodes that the Linux SG driver provides. Some Linux apps may rely on these for operation, others may only use them for informational purposes. - More ioctls need to be implemented. - Linux uses a naming scheme of "sg[a-z]" for devices, while FreeBSD uses a scheme of "sg[0-9]". Devfs aliasis (symlinks) are automatically created to link the two together. However, tools like camcontrol only see the native names. - Some operations were originally designed to return byte counts or other data directly as the syscall return value. The linuxolator doesn't appear to support this well, so this driver just punts for these cases. Now that the driver is in place, others are welcome to add missing functionality. Thanks to Roman Divacky for pushing this work along.	2007-04-07 19:40:58 +00:00
Robert Watson	5e3f7694b1	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
Jung-uk Kim	357afa7113	MFP4: Turn emul_lock into a mutex. Submitted by: rdivacky	2007-04-02 18:38:13 +00:00
Jung-uk Kim	a328699b34	MFP4: Linux futex support for amd64. Initial patch was submitted by kib and additional work was done by Divacky Roman. Tested by: emulation	2007-03-30 01:07:28 +00:00
Julian Elischer	6734f35eac	Implement the openat() linux syscall Submitted by: Roman Divacky (rdivacky@) MFC after: 2 weeks	2007-03-29 02:11:46 +00:00
Robert Watson	b77ad8fc3b	In translate_path_major_minor(), do not calculate otherwise unused 'fp' variable, avoiding an extra locking of the file descriptor array.	2007-03-06 07:39:12 +00:00
Jung-uk Kim	a4e3bad794	MFP4: 115220, 115222 - Fix style(9) and reduce diff between amd64 and i386. - Prefix Linuxulator macros with LINUX_ to prevent future collision.	2007-03-02 00:08:47 +00:00
Alexander Leidinger	8cf5ee2e2a	MFp4 (110541): Sync with rev 1.7 in NetBSD. Obtained from: NetBSD	2007-02-25 12:43:07 +00:00
Alexander Leidinger	f9dac96185	MFp4 (110523, parts which apply cleanly): semi-automatic style(9) The futex stuff already differs a lot (only a small part does not differ) from NetBSD, so we are already way off and can't apply changes from NetBSD automatically. As we need to merge everything by hand already, we can even make the files comply to our world order.	2007-02-25 12:40:35 +00:00
Alexander Leidinger	802e08a360	Partial MFp4 of 114977: Whitespace commit: Fix grammar, spelling and punctuation. Submitted by: "Scot Hetzel" <swhetzel@gmail.com>	2007-02-24 16:49:25 +00:00
Alexander Leidinger	1a26db0a3a	MFp4 (114193 (i386 part), 114194, 114195, 114200): - Dont "return" in linux_clone() after we forked the new process in a case of problems. - Move the copyout of p2->p_pid outside the emul_lock coverage in linux_clone(). - Cache the em->pdeath_signal in a local variable and move the copyout out of the emul_lock coverage. - Move the free() out of the emul_shared_lock coverage in a preparation to switch emul_lock to non-sleepable lock (mutex). Submitted by: rdivacky	2007-02-23 22:39:26 +00:00
Alexander Leidinger	e8b8b834b4	MFp4 (part of 114132): - Fix a LOR caused by holding emul_lock and proctree_lock at once. Submitted by: rdivacky	2007-02-23 22:29:24 +00:00
Konstantin Belousov	b4bb515484	Remove extern int hz; use proper include file instead.	2007-02-02 08:58:16 +00:00
Konstantin Belousov	d0b2365eec	Introduce some more SO_ option equivalents from Linux to FreeBSD. The msg variable in linux_recvmsg() was not initialized. Copy it from userspace. Submitted by: rdivacky	2007-02-01 13:36:19 +00:00
Konstantin Belousov	75ee4e5462	No need to lock emul_lock in exit_group() because em->shared cannot change (because its referenced by curthread). This fixes a LOR caused by acquiring emul_shared_lock while holding emul_lock. Fix typo in comment. Submitted by: rdivacky	2007-02-01 13:33:33 +00:00
Konstantin Belousov	25954d7430	No need to synchronize linux_schedtail with linux_proc_init. p->p_emuldata is properly initialized in the time when the child can run. Do not set p->p_emuldata to NULL when the process is exiting. It does not make any sense and only costs 2 mutex operations. Do not lock emul_data to unlock it on the very next line. Comment on possible race while there. Reparent all procs that are part of a threading group but not its leaders to init and SIGCHLD init to finish the zombies off. This fixes zombies left after opera's exit. [1] There is no need to lock p_em in the linux_proc_init CLONE_THREAD case because the process cannot change the address of the p_em->shared because its currently running this code path. Move assigning of em->shared outside emul_shared_lock. Noticed by: Scott Robbins <scottro@nyc.rr.com> [1] Submitted by: rdivacky	2007-02-01 13:29:27 +00:00

635 Commits