freebsd-dev

Author	SHA1	Message	Date
Konstantin Belousov	74f5d68011	Make linux_sendmsg() and linux_recvmsg() work on linux32/amd64. Change types used in the linux' struct msghdr and struct cmsghdr definitions to the properly-sized architecture-specific types. Move ancillary data handler from linux_sendit() to linux_sendmsg(). Submitted by: dchagin	2008-11-29 17:14:06 +00:00
Bjoern A. Zeeb	759e7c0bbb	Regen after jail support was added in r185435.	2008-11-29 14:34:30 +00:00
Bjoern A. Zeeb	413628a7e3	MFp4: Bring in updated jail support from bz_jail branch. This enhances the current jail implementation to permit multiple addresses per jail. In addition to IPv4, IPv6 is supported as well. Due to updated checks it is even possible to have jails without an IP address at all, which basically gives one a chroot with restricted process view, no networking,.. SCTP support was updated and supports IPv6 in jails as well. Cpuset support permits jails to be bound to specific processor sets after creation. Jails can have an unrestricted (no duplicate protection, etc.) name in addition to the hostname. The jail name cannot be changed from within a jail and is considered to be used for management purposes or as audit-token in the future. DDB 'show jails' command was added to aid debugging. Proper compat support permits 32bit jail binaries to be used on 64bit systems to manage jails. Also backward compatibility was preserved where possible: for jail v1 syscalls, as well as with user space management utilities. Both jail as well as prison version were updated for the new features. A gap was intentionally left as the intermediate versions had been used by various patches floating around the last years. Bump __FreeBSD_version for the aforementioned and in kernel changes. Special thanks to: - Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches and Olivier Houchard (cognet) for initial single-IPv6 patches. - Jeff Roberson (jeff) and Randall Stewart (rrs) for their help, ideas and review on cpuset and SCTP support. - Robert Watson (rwatson) for lots and lots of help, discussions, suggestions and review of most of the patch at various stages. - John Baldwin (jhb) for his help. - Simon L. Nielsen (simon) as early adopter testing changes on cluster machines as well as all the testers and people who provided feedback the last months on freebsd-jail and other channels. - My employer, CK Software GmbH, for the support so I could work on this. 
Reviewed by: (see above) MFC after: 3 months (this is just so that I get the mail) X-MFC Before: 7.2-RELEASE if possible	2008-11-29 14:32:14 +00:00
Roman Divacky	7356a43c88	Document that all the other commands are either identical to the FreeBSD ones or rejected by kern_msgctl(). Found with: Coverity Prevent(tm) CID: 3456 Approved by: kib (mentor)	2008-11-26 16:38:43 +00:00
Konstantin Belousov	b4cf0e62f4	Add sv_flags field to struct sysentvec with intention to provide description of the ABI of the currently executing image. Change some places to test the flags instead of explicit comparing with address of known sysentvec structures to determine ABI features. Discussed with: dchagin, imp, jhb, peter	2008-11-22 12:36:15 +00:00
Konstantin Belousov	62162dfc94	In the robust futexes list head, futex_offset shall be signed, and glibc actually supplies negative offsets. Change l_ulong to l_long. Submitted by: dchagin	2008-11-16 15:45:41 +00:00
Peter Wemm	dc5aaa8410	Sigh. Fix a pointer/int compile error.	2008-11-10 23:36:20 +00:00
Peter Wemm	a22600a1dd	Fix a signal emulation bug introduced in r163018 (and present in 7.x). This prevents 32 bit signal handlers from finding out what the faulting address is. Both the secret 4th argument and siginfo->si_addr are zero.	2008-11-10 23:26:52 +00:00
Ed Schouten	ebb45b0620	Regenerate system call tables for r184789.	2008-11-09 10:48:06 +00:00
Ed Schouten	a1b5a8955e	Mark uname(), getdomainname() and setdomainname() with COMPAT_FREEBSD4. Looking at our source code history, it seems the uname(), getdomainname() and setdomainname() system calls got deprecated somewhere after FreeBSD 1.1, but they have never been phased out properly. Because we don't have a COMPAT_FREEBSD1, just use COMPAT_FREEBSD4. Also fix the Linuxolator to build without the setdomainname() routine by just making it call userland_sysctl on kern.domainname. Also replace the setdomainname()'s implementation to use this approach, because we're duplicating code with sysctl_domainname(). I wasn't able to keep these three routines working in our COMPAT_FREEBSD32, because that would require yet another keyword for syscalls.master (COMPAT4+NOPROTO). Because this routine is probably unused already, this won't be a problem in practice. If it turns out to be a problem, we'll just restore this functionality. Reviewed by: rdivacky, kib	2008-11-09 10:45:13 +00:00
Dag-Erling Smørgrav	faecfd5641	utf-8 MFC after: 3 weeks	2008-11-05 15:08:09 +00:00
John Baldwin	b1b3a8653d	Don't leak a reference on the /compat/linux vnode everytime the linprocfs 'mtab' file is read. MFC after: 1 month	2008-11-04 18:53:33 +00:00
Doug Rabson	45e6ab7f81	Regen.	2008-11-03 10:39:35 +00:00
Doug Rabson	a9148abd9d	Implement support for RPCSEC_GSS authentication to both the NFS client and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. 
The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month	2008-11-03 10:38:00 +00:00
Konstantin Belousov	17b9edd35a	The code in linux_proc_exit() contains a race when multiple linux based processes exit at the same time. The linux_emuldata structure is freed but p->p_emuldata is left as a dangling pointer to the just freed memory. The check for W_EXIT in the loop scanning the child processes isn't safe since the state of the child process can change right afterwards. Lock the process and check the W_EXIT before delivering signal. Submitted by: tegge Reviewed by: davidxu MFC after: 1 week	2008-10-31 10:38:30 +00:00
Edward Tomasz Napierala	15bc6b2bd8	Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
Dag-Erling Smørgrav	1ede983cc9	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
John Baldwin	23aa8eeafc	Regen for freebsd32_getdirentries().	2008-10-22 21:56:44 +00:00
John Baldwin	63f8fe9e8b	Split the copyout of *base at the end of getdirentries() out leaving the rest in kern_getdirentries(). Use kern_getdirentries() to implement freebsd32_getdirentries(). This fixes a bug where calls to getdirentries() in 32-bit binaries would trash the 4 bytes after the 'long base' in userland. Submitted by: ups MFC after: 1 week	2008-10-22 21:55:48 +00:00
Konstantin Belousov	aa8b201112	Correctly fill siginfo for the signals delivered by linux tkill/tgkill. It is required for async cancellation to work. Fix PROC_LOCK leak in linux_tgkill when signal delivery attempt is made to a non-linux process. Do not call em_find(p, ...) with p unlocked. Move common code for linux_tkill() and linux_tgkill() into linux_do_tkill(). Change linux siginfo_t definition to match actual linux one. Extend uid fields to 4 bytes from 2. The extension does not change structure layout and is binary compatible with previous definition, because i386 is little endian, and each uid field has 2 byte padding after it. Reported by: Nicolas Joly <njoly pasteur fr> Submitted by: dchagin MFC after: 1 month	2008-10-19 10:02:26 +00:00
Konstantin Belousov	175c6c319b	Make robust futexes work on linux32/amd64. Use PTRIN to read user-mode pointers. Change types used in the structures definitions to properly-sized architecture-specific types. Submitted by: dchagin MFC after: 1 week	2008-10-14 07:59:23 +00:00
Konstantin Belousov	68da8b22d2	Current linux_fooaffinity() emulation fails, as the FreeBSD affinity syscalls expect the bitmap size in the range from 32 to 128. Old glibc always assumed size 1024, while newer glibc searches for appropriate size, starting from 1024 and going up. For now, use FreeBSD size of cpuset_t for bitmap size parameter and return EINVAL if length of user space bitmap is less than our size of cpuset_t. Submitted by: dchagin MFC after: 1 week [This requires MFC of the actual linux affinity syscalls]	2008-10-04 19:23:30 +00:00
Konstantin Belousov	9a1e630dfd	Change the linprocfs <pid>/maps and procfs <pid>/map handlers to use sbuf instead of doing uiomove. This allows for reads from non-zero offsets to work. The patch is a forward-port of des@'s one, and was adapted to current code by dchagin@ and me. Reviewed by: des (linprocfs part) PR: kern/101453 MFC after: 1 week	2008-10-04 14:08:16 +00:00
Marko Zec	8b615593fc	Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparison between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-10-02 15:37:58 +00:00
Olivier Houchard	3b30175391	Advertise bit 26 as sse2. Spotted out by: gahr	2008-09-26 15:29:18 +00:00
John Baldwin	88ac915a9b	Add support for installing 32-bit system calls from kernel modules. This includes syscall32_{de,}register() routines as well as a module handler and wrapper macros similar to the support for native syscalls in <sys/sysent.h>. MFC after: 1 month	2008-09-25 20:50:21 +00:00
John Baldwin	d47faadce3	Sort includes and add multiple include guards.	2008-09-25 20:12:38 +00:00
John Baldwin	74d9b5a551	Regen.	2008-09-25 20:08:36 +00:00
John Baldwin	48a43ae819	Tidy up a few things with syscall generation: - Instead of using a syscall slot (370) just to get a function prototype for lkmressys(), add an explicit function prototype to <sys/sysent.h>. This also removes unused special case checks for 'lkmressys' from makesyscalls.sh. - Instead of having magic logic in makesyscalls.sh to only generate a function prototype the first time 'lkmnosys' is seen, make 'NODEF' always not generate a function prototype and include an explicit prototype for 'lkmnosys' in <sys/sysent.h>. - As a result of the fix in (2), update the LKM syscall entries in the freebsd32 syscall table to use 'lkmnosys' rather than 'nosys'. - Use NOPROTO for the __syscall() entry (198) in the native ABI. This avoids the need for magic logic in makesyscalls.sh to only generate a function prototype the first time 'nosys' is encountered.	2008-09-25 20:07:42 +00:00
Konstantin Belousov	a8d403e102	Change the static struct sysentvec and struct Elf_Brandinfo initializers to the C99 style. At least, it is easier to read sysent definitions that way, and search for the actual instances of sigcode etc. Explicitly initialize sysentvec.sv_maxssiz that was missed in most sysvecs. No objection from: jhb MFC after: 1 month	2008-09-24 10:14:37 +00:00
Edward Tomasz Napierala	9545354ed5	Fix usage of mac_vnode_check_open() in linuxulator - last argument should be VREAD, not FREAD. Approved by: rwatson (mentor)	2008-09-22 18:59:24 +00:00
David E. O'Brien	c750e17cf5	Add freebsd32 compat shims for ioctl(2) CDIOREADTOCHEADER and CDIOREADTOCENTRYS requests.	2008-09-22 16:24:36 +00:00
David E. O'Brien	663c58007e	Regenerate for r183270.	2008-09-22 16:09:43 +00:00
David E. O'Brien	ae528485c4	Add freebsd32 compat shims for ioctl(2) MDIOCATTACH, MDIOCDETACH, MDIOCQUERY, and MDIOCLIST requests.	2008-09-22 16:09:16 +00:00
David E. O'Brien	f1287854fd	Regenerate for r183188.	2008-09-19 15:21:40 +00:00
David E. O'Brien	6e6049e9df	Add freebsd32 compat shim for nmount(2). (and quiet some compiler warnings for vfs_donmount)	2008-09-19 15:17:32 +00:00
David E. O'Brien	109ea24cc1	style(9)	2008-09-15 17:39:40 +00:00
David E. O'Brien	7e29bc757e	Regenerate for r183042.	2008-09-15 17:39:01 +00:00
David E. O'Brien	f0f53d8f79	Fix bug in r100384 (rev 1.2) in which the 32-bit swapon(2) was made "obsolete, not included in system", whereas the system call does exist.	2008-09-15 17:37:41 +00:00
Ed Schouten	7969b32c44	Allow COMPAT_SVR4 to be built without COMPAT_43. It seems we only depend on COMPAT_43 to implement the send() and recv() routines. We can easily implement them using sendto() and recvfrom(), just like we do inside our very own C library. I wasn't able to really test it, apart from simple compilation testing. I've heard rumours that COMPAT_SVR4 is broken inside execve() anyway. It's still worth to fix this, because I suspect we'll get rid of COMPAT_43 somewhere in the future... Reviewed by: rdivacky Discussed with: jhb	2008-09-15 15:09:35 +00:00
Andrew Thompson	8fa962c745	Allow PAGE_SHIFT to already be defined. Submitted by: Hans Petter Selasky	2008-09-13 17:34:18 +00:00
Roman Divacky	0d62170990	The ERESTART to EINTR conversion is already done in kern_select so there is no need to repeat it in linux_select(). Submitted by: Dmitry Chagin <dchagin@> MFC after: 1 week Approved by: kib (mentor)	2008-09-11 15:28:28 +00:00
Roman Divacky	2963584278	Getdents requires padding with 2 bytes instead of 1 byte as with getdents64. The last byte is used for storing the d_type, add this to plain getdents case where it was missing before. Also change the code to use strlcpy instead of plain strcpy. These changes fix the getdents crash we had reports about (hl2 server etc.) PR: kern/117010 MFC after: 1 week Submitted by: Dmitry Chagin (dchagin@) Tested by: MITA Yoshio <mita ee.t.u-tokyo.ac jp> Approved by: kib (mentor)	2008-09-09 16:00:17 +00:00
Konstantin Belousov	745aaef5b5	Remove superfluous copyin() of args, structures are already in kernel space. Submitted by: dchagin MFC after: 1 week	2008-09-09 13:01:14 +00:00
Attilio Rao	0359a12ead	Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-08-28 15:23:18 +00:00
Julian Elischer	576c43c844	We left out V_static_len from ip_fw2.c (also a whitespace diff that I'd rather fix here than break in the vimage branch.)	2008-08-25 05:38:18 +00:00
Julian Elischer	1d89fc4ebe	All opt_x.h includes go at the top of other includes.	2008-08-25 04:55:29 +00:00
Robert Watson	5ae504055a	Regenerate following r182123.	2008-08-24 21:23:08 +00:00
Robert Watson	e484af13ed	When MPSAFE ttys were merged, a new BSM audit event identifier was allocated for posix_openpt(2). Unfortunately, that identifier conflicts with other events already allocated to other systems in OpenBSM. Assign a new globally unique identifier and conform better to the AUE_ event naming scheme. This is a stopgap until a new OpenBSM import is done with the correct identifier, so we'll maintain this as a local diff in svn until then. Discussed with: ed Obtained from: TrustedBSD Project	2008-08-24 21:20:35 +00:00
David E. O'Brien	35c316caaf	Add comments on NOARGS, NODEF, and NOPROTO.	2008-08-21 22:57:31 +00:00
Ed Schouten	18cf135421	Update system call tables. The previous commit also included changes to all the system call lists, but it is a tradition to update these lists in a second commit, so rerun make sysent to update the $FreeBSD$ tags inside these files to refer to the latest version of syscalls.master. Requested by: rwatson	2008-08-20 08:39:10 +00:00
Ed Schouten	bc093719ca	Integrate the new MPSAFE TTY layer to the FreeBSD operating system. The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following: - Improved driver model: The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers. If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver. - Improved hotplugging: With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc). The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly. - Improved performance: One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters. Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING. Obtained from: //depot/projects/mpsafetty/... Approved by: philip (ex-mentor) Discussed: on the lists, at BSDCan, at the DevSummit Sponsored by: Snow B.V., the Netherlands dcons(4) fixed by: kan	2008-08-20 08:31:58 +00:00
Bjoern A. Zeeb	603724d3ab	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
Ed Schouten	b377be43a5	Add TIOCPKT and TIOCSPTLCK to the Linuxolator. We're very lucky, because the flags used by our TIOCPKT implementation are the same as flags used by Linux. We can safely enable TIOCPKT, assuming EXTPROC is not used. TIOCSPTLCK is used by unlockpt(). Because we don't need unlockpt() in our implementation, make this ioctl a no-op. Approved by: philip (mentor, implicit), rdivacky Obtained from: P4 (//depot/projects/mpsafetty/...)	2008-07-23 17:47:44 +00:00
Roman Divacky	0864e2a4f1	Fix linux_alarm, the linux behaviour is to limit the secs to INT_MAX when the passed in parameter is bigger than INT_MAX. Submitted by: Dmitry Chagin <chagin.dmitry gmail com> Approved by: kib (mentor)	2008-07-23 17:19:02 +00:00
Weongyo Jeong	138ddff935	When the NDIS framework tries to query/set information, NDIS drivers can return NDIS_STATUS_PENDING. In this case, it currently waits 5 secs for the response from the driver. However, some NDIS drivers can send the response before the NDIS framework is ready to receive it, so we might always be blocked for 5 secs in the current implementation. The NDIS framework should reset the event before calling the NDIS driver's callback, not after. MFC after: 1 month	2008-07-23 10:49:27 +00:00
Brooks Davis	e44f0b2a63	style(9): put parentheses around return values.	2008-07-10 19:54:34 +00:00
Brooks Davis	774b72e12e	Regen	2008-07-10 17:46:58 +00:00
Brooks Davis	a8c6d6d0ba	id_t is a 64-bit integer and thus is passed as two arguments like off_t is. As a result, those arguments must be recombined before calling the real syscall implementation. This change fixes 32-bit compatibility for cpuset_getid(), cpuset_setid(), cpuset_getaffinity(), and cpuset_setaffinity().	2008-07-10 17:45:57 +00:00
Robert Watson	4f7d1876d5	Introduce a new lock, hostname_mtx, and use it to synchronize access to global hostname and domainname variables. Where necessary, copy to or from a stack-local buffer before performing copyin() or copyout(). A few uses, such as in cd9660 and daemon_saver, remain under-synchronized and will require further updates. Correct a bug in which a failed copyin() of domainname would leave domainname potentially corrupted. MFC after: 3 weeks	2008-07-05 13:10:10 +00:00
Coleman Kane	38ad9366dc	Silence warning about missing IoGetDeviceObjectPointer by implementing a simple stub that always returns STATUS_SUCCESS. Submitted by: Paul B. Mahol <onemda@gmail.com> Reviewed by: thompsa MFC after: 1 week	2008-06-15 13:37:29 +00:00
Wojciech A. Koszek	53a609f064	Remove obsolete PECOFF image activator support. PRs assigned at the time of removal: kern/80742 Discussed on: freebsd-current (silence), IRC Tested by: make universe Approved by: cognet (mentor)	2008-06-14 12:51:44 +00:00
Weongyo Jeong	1f22fabdfb	Fix a page fault that occurred when ifp is NULL. This bug happens when the NDIS driver's initialization has failed and the NDIS driver is trying to call NdisWriteErrorLogEntry().	2008-06-11 07:55:07 +00:00
Roman Divacky	2e1a489300	d_ino member of linux_dirent structure should be unsigned long. Submitted by: Chagin Dmitry <chagin.dmitry@gmail.com> Approved by: kib (mentor)	2008-06-08 11:09:25 +00:00
Roman Divacky	a47444d525	Switch to emulating Linux 2.6 by default. Approved by: kib (mentor)	2008-06-03 17:50:13 +00:00
Ed Schouten	a147e6cadf	Push down the major/minor conversion for pts/%u to improve consistency. In the mpsafetty branch, Linux sshd seems to work properly inside a jail. Some small modifications had to be made to the Linux compatibility layer. The Linux PTY routines always expect the device major number to be 136 or higher. Our code always set the major/minor number pair to 136:0. This makes routines like ttyname() and ptsname() fail, because we'll end up having ambiguous device numbers. The conversion was not performed on all *stat() routines, which meant in some cases the numbers didn't get transformed. By pushing the conversion into linux_driver_get_major_minor(), the transformation will take place on all calls. Approved by: philip (mentor), rdivacky	2008-06-02 08:40:06 +00:00
Weongyo Jeong	32e9c9dc71	Fix a panic caused by a priority value passed to cv_broadcastpri(9) that can be < 0. We don't ignore an `increment' argument but at least we keep the priority value of NDIS threads over PRI_MIN_KERN. Reviewed by: thompsa	2008-05-30 06:31:55 +00:00
Weongyo Jeong	d9585f801b	Fix a panic that occurred while initializing the ndis driver, because it tries to read the network address through the ifnet structure, which is NULL until the ndis driver's initialization is finished. Reviewed by: thompsa	2008-05-15 04:29:28 +00:00
Roman Divacky	4732e446fb	Implement robust futexes. Most of the code is modelled after what Linux does. This is because robust futexes are mostly userspace thing which we cannot alter. Two syscalls maintain pointer to userspace list and when process exits a routine walks this list waking up processes sleeping on futexes from that list. Reviewed by: kib (mentor) MFC after: 1 month	2008-05-13 20:01:27 +00:00
Roman Divacky	a6d043e30d	Implement linux_truncate64() syscall. Tested by: Aline de Freitas <aline@riseup.net> Approved by: kib (mentor)	2008-04-23 15:56:33 +00:00
Roman Divacky	cabce2bf19	The vmspace->vm_daddr is constant until freed, there is no need to hold lock while accessing it. Approved by: kib (mentor)	2008-04-21 21:24:08 +00:00
Roman Divacky	872cbe6466	Remove using magic value of -1 to distinguish between linux_open() and linux_openat(). Instead just pass AT_FDCWD into linux_common_open() for the linux_open() case. This prevents passing -1 as a dirfd to openat() from succeeding which is wrong. Suggested by: rwatson, kib Approved by: kib (mentor)	2008-04-09 16:42:50 +00:00
Konstantin Belousov	48b05c3f82	Implement the linux syscalls openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat, renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat. Submitted by: rdivacky Sponsored by: Google Summer of Code 2007 Tested by: pho	2008-04-08 09:45:49 +00:00
Konstantin Belousov	f2296b585e	Regen	2008-03-31 12:12:27 +00:00
Konstantin Belousov	4f1e7213d4	Add the freebsd32 compatibility shims for the *at() syscalls. Reviewed by: rwatson, rdivacky Tested by: pho	2008-03-31 12:08:30 +00:00
Konstantin Belousov	57b4252e45	Add the support for the AT_FDCWD and fd-relative name lookups to the namei(9). Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho	2008-03-31 12:01:21 +00:00
John Birrell	8f0cc58815	Remove files that have been repo copied to their new location in cddl-specific parts of the source tree.	2008-03-28 00:08:47 +00:00
Doug Rabson	a7ac0db6cb	Regen.	2008-03-26 15:24:02 +00:00
Doug Rabson	dfdcada31e	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. 
* Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
John Baldwin	5c63b21a1a	Regen.	2008-03-25 19:35:34 +00:00
John Baldwin	30c6422a8a	Add entries for the cpuset-related system calls. The existing system calls can be used on little endian systems. Pointy hat to: jeff	2008-03-25 19:34:47 +00:00
Ruslan Ermilov	d7a38db650	Fix build. Reported by: ache, tinderbox	2008-03-25 13:20:52 +00:00
Roman Divacky	6af821237d	o Add stub support for some new futex operations, so the annoying message is not printed. o Don't warn about FUTEX_FD not being implemented and return ENOSYS instead of 0 (i.e. success). o Clear FUTEX_PRIVATE_FLAG as we actually implement only private futexes so there is no reason to return ENOSYS when app asks for a private futex. We don't reject shared futexes because they worked just fine with our implementation so far. Approved by: kib (mentor) Tested by: bsam MFC after: 1 week	2008-03-20 17:03:55 +00:00
Antoine Brodin	afe5acff1b	Simplify fcntl(SVR4_F_DUP2FD) code now that FreeBSD has F_DUP2FD. Approved by: rwatson (mentor)	2008-03-17 18:27:28 +00:00
Roman Divacky	5dfb688191	Implement sched_setaffinity and sched_getaffinity using real cpu affinity setting primitives. Reviewed by: jeff Approved by: kib (mentor)	2008-03-16 16:27:44 +00:00
Jeff Roberson	66257bc8d9	- The P_SA flag has been removed. Don't reference it in a KASSERT.	2008-03-12 22:17:06 +00:00
Jeff Roberson	6617724c5f	Remove kernel support for M:N threading. While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.	2008-03-12 10:12:01 +00:00
Konstantin Belousov	a0b0d286bc	Return ENOSYS instead of 0 for the unknown futex operations. Submitted by: rdivacky Reported and tested by: Gary Stanley <gary velocity-servers net>	2008-03-02 14:00:50 +00:00
Konstantin Belousov	cbd2c621f8	Sanitize arguments to linux_mremap(). Check that only MREMAP_FIXED and MREMAP_MAYMOVE flags are specified. Check for the page alignment of the addr argument. Submitted by: rdivacky MFC after: 1 week	2008-02-22 11:47:56 +00:00
Ruslan Ermilov	b95bd24d29	Regenerate for readlink(2).	2008-02-12 20:11:54 +00:00
Ruslan Ermilov	5f56182b6f	Change readlink(2)'s return type and type of the last argument to match POSIX. Prodded by: Alexey Lyashkov	2008-02-12 20:09:04 +00:00
Poul-Henning Kamp	cf827063a9	Give MEXTADD() another argument to make both void pointers to the free function controllable, instead of passing the KVA of the buffer storage as the first argument. Fix all conventional users of the API to pass the KVA of the buffer as the first argument, to make this a no-op commit. Likely break the only non-conventional user of the API, after informing the relevant committer. Update the mbuf(9) manual page, which was already out of sync on this point. Bump __FreeBSD_version to 800016 as there is no way to tell how many arguments a CPP macro needs any other way. This paves the way for giving sendfile(9) a way to wait for the passed storage to have been accessed before returning. This does not affect the memory layout or size of mbufs. Parental oversight by: sam and rwatson. No MFC is anticipated.	2008-02-01 19:36:27 +00:00
Pawel Jakub Dawidek	44ce1efd91	Change type of kmem_used() and kmem_size() functions to uint64_t, so it doesn't overflow in arc.c in this check: if (kmem_used() > (kmem_size() * 4) / 5) return (1); With this bug ZFS almost doesn't cache. Only 32bit machines are affected that have vm.kmem_size set to values >=1GB. Reported by: David Taylor <davidt@yadt.co.uk>	2008-01-24 11:21:54 +00:00
Robert Watson	20c6fe828a	Regenerate.	2008-01-20 23:44:24 +00:00
Robert Watson	6c902059f2	Use audit events AUE_SHMOPEN and AUE_SHMUNLINK with new system calls shm_open() and shm_unlink(). More auditing will need to be done for these calls to capture arguments properly.	2008-01-20 23:43:06 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the unneeded extra argument and pass curthread explicitly to lower layer functions, when necessary. The KPI is broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Attilio Rao	cb05b60a89	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modification makes the code cleaner and in particular removes an annoying dependence, helping the next lockmgr() cleanup. The KPI has, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, it is worth saying that the next commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
John Baldwin	4ad6d200d6	Regen for shm_open(2) and shm_unlink(2).	2008-01-08 22:01:26 +00:00
John Baldwin	8e38aeff17	Add a new file descriptor type for IPC shared memory objects and use it to implement shm_open(2) and shm_unlink(2) in the kernel: - Each shared memory file descriptor is associated with a swap-backed vm object which provides the backing store. Each descriptor starts off with a size of zero, but the size can be altered via ftruncate(2). The shared memory file descriptors also support fstat(2). read(2), write(2), ioctl(2), select(2), poll(2), and kevent(2) are not supported on shared memory file descriptors. - shm_open(2) and shm_unlink(2) are now implemented as system calls that manage shared memory file descriptors. The virtual namespace that maps pathnames to shared memory file descriptors is implemented as a hash table where the hash key is generated via the 32-bit Fowler/Noll/Vo hash of the pathname. - As an extension, the constant 'SHM_ANON' may be specified in place of the path argument to shm_open(2). In this case, an unnamed shared memory file descriptor will be created similar to the IPC_PRIVATE key for shmget(2). Note that the shared memory object can still be shared among processes by sharing the file descriptor via fork(2) or sendmsg(2), but it is unnamed. This effectively serves to implement the getmemfd() idea bandied about the lists several times over the years. - The backing store for shared memory file descriptors are garbage collected when they are not referenced by any open file descriptors or the shm_open(2) virtual namespace. Submitted by: dillon, peter (previous versions) Submitted by: rwatson (I based this on his version) Reviewed by: alc (suggested converting getmemfd() to shm_open())	2008-01-08 21:58:16 +00:00
Konstantin Belousov	d075105da0	After applying LCONVPATH() to the path, do use the converted path instead of original user-mode string in the linux_stat() and linux_lstat() syscalls. Tested by: Peter Holm MFC after: 3 days	2008-01-05 12:36:35 +00:00
Jeff Roberson	397c19d175	Remove explicit locking of struct file. - Introduce a finit() which is used to initialize the fields of struct file in such a way that the ops vector is only valid after the data, type, and flags are valid. - Protect f_flag and f_count with atomic operations. - Remove the global list of all files and associated accounting. - Rewrite the unp garbage collection such that it no longer requires the global list of all files and instead uses a list of all unp sockets. - Mark sockets in the accept queue so we don't incorrectly gc them. Tested by: kris, pho	2007-12-30 01:42:15 +00:00
Konstantin Belousov	93eba2d50d	Plug the leaks in the present (hopefully, soon to be replaced) implementation of the linux_openat() for the quick MFC. Reported and tested by: Peter Holm MFC after: 3 days	2007-12-29 14:28:01 +00:00
Konstantin Belousov	15b78ac5d1	Apply the LCONVPATH() to the (old) linux_stat() and linux_lstat() syscalls. Without it, code has two problems: - behaviour of the old and new [l]stat are different with regard of the /compat/linux - directly accessing the userspace data from the kernel asks for the panics. Reported and tested by: Peter Holm Reviewed by: rdivacky MFC after: 3 days	2007-12-29 14:25:29 +00:00
Robert Watson	3de213cc00	Add a new 'why' argument to kdb_enter(), and a set of constants to use for that argument. This will allow DDB to detect the broad category of reason why the debugger has been entered, which it can use for the purposes of deciding which DDB script to run. Assign approximate why values to all current consumers of the kdb_enter() interface.	2007-12-25 17:52:02 +00:00
John Baldwin	0a63574164	Bah, remove last vestiges of some statfs conversion fixes that aren't quite ready for CVS yet that snuck into 1.68. Pointy hat to: jhb	2007-12-10 19:42:23 +00:00
Scott Long	d637500d06	Grrr, remove an unused variable missed in the last commit.	2007-12-08 01:41:31 +00:00
Scott Long	7815c9e2db	Don't expect a return value from statfs_scale_blocks().	2007-12-07 22:32:09 +00:00
John Baldwin	8120bb7e3a	Regen.	2007-12-06 23:37:26 +00:00
John Baldwin	695e8d536c	Add freebsd32 compat wrappers for msgctl() and __semctl() using kern_msgctl() and kern_semctl(). MFC after: 1 week	2007-12-06 23:36:57 +00:00
John Baldwin	3c39e0d8d4	Add freebsd32 compat wrappers for msgctl() and _semctl() using kern_msgctl() and kern_semctl(). MFC after: 1 week	2007-12-06 23:35:29 +00:00
John Baldwin	d43c6fa4fe	Move 32-bit SYSV IPC structure definitions into freebsd32_ipc.h. MFC after: 1 week	2007-12-06 23:23:16 +00:00
John Baldwin	74427aa423	Move several data structure definitions out of freebsd32_misc.c and into freebsd32.h instead. MFC after: 1 week	2007-12-06 23:11:27 +00:00
Jung-uk Kim	959a913b87	Remove redundant checks for msgsnd(3) and msgrcv(3). COMPAT_IA32 (implicitly) requires SYSVSEM, SYSVSHM and SYSVMSG in kernel. Pointed out by: jhb	2007-12-04 20:25:41 +00:00
Andrew Thompson	ac740aebcf	Implement functions required by some ndis drivers. NdisIMCopySendPerPacketInfo [1] KeQuerySystemTime [1] KeTickCount [1] strncat [1] KeBugCheckEx Submitted by: Marcin Simonides [1]	2007-12-03 23:43:58 +00:00
Andrew Thompson	e880149eb9	Correct the calculation for the number of 100ns intervals since January 1, 1601. The 1601 - 1970 period was in seconds rather than 100ns units. Remove duplication by having NdisGetCurrentSystemTime call ntoskrnl_time.	2007-12-02 08:54:50 +00:00
Andrew Thompson	f3ad39ccf5	Correct the nwbx_ies field type in struct ndis_wlan_bssid_ex. PR: kern/118369 Submitted by: Weongyo Jeong	2007-12-02 04:04:42 +00:00
Peter Wemm	7628402b07	Move the shared cp_time array (counts %sys, %user, %idle etc) to the per-cpu area. cp_time[] goes away and a new function creates a merged cp_time-like array for things like linprocfs, sysctl etc. The atomic ops for updating cp_time[] in statclock go away, and the scope of the thread lock is reduced. sysctl kern.cp_time returns a backwards compatible cp_time[] array. A new kern.cp_times sysctl returns the individual per-cpu stats. I have pending changes to make top and vmstat optionally show per-cpu stats. I'm very aware that there are something like 5 or 6 other versions "out there" for doing this - but none were handy when I needed them. I did merge my changes with John Baldwin's, and ended up replacing a few chunks of my stuff with his, and stealing some other code. Reviewed by: jhb Partly obtained from: jhb	2007-11-29 06:34:30 +00:00
John Birrell	35a04710d7	Remove some compatibility stuff that we now get from the Solaris header.	2007-11-29 00:15:08 +00:00
John Birrell	57438287ab	Add more OpenSolaris compatibility headers.	2007-11-28 21:50:40 +00:00
John Birrell	eca148b637	Remove an extern that is defined elsewhere.	2007-11-28 21:50:05 +00:00
John Birrell	edadde229a	Add compatibility cruft moved from under _SOLARIS_C_SOURCE in sys/types.h	2007-11-28 21:49:16 +00:00
John Birrell	35ba7f225f	Remove a typedef which was just a hack to avoid including vmem.h. That typedef breaks other Solaris code.	2007-11-28 21:48:25 +00:00
John Birrell	773f4e3849	Add a missing volatile so that the code compiles cleanly.	2007-11-28 21:47:09 +00:00
John Birrell	4fc8feafc7	Rename the definition of lbolt to LBOLT to avoid a clash with a global variable in FreeBSD. Until now lbolt in sys/proc.h has been #ifdef'ed out based on _SOLARIS_C_SOURCE, but that is going away now.	2007-11-28 21:44:17 +00:00
Konstantin Belousov	d60f0a3d6a	Implement LINUX_SIOCGIFCOUNT and LINUX_SIOCGIFINDEX/LINUX_SIOGIFINDEX. LINUX_SIOCGIFCOUNT just returns 0 since it is not implemented in the Linux 2.6.16. LINUX_SIOCGIFINDEX/LINUX_SIOGIFINDEX are mapped to the FreeBSD native SIOCGIFINDEX. Tested by: Peter Kostouros <kpeter@melbpc.org.au> Reviewed by: brooks, rpaulo (on net@) Submitted by: rdivacky MFC after: 1 week	2007-11-07 16:42:52 +00:00
Pawel Jakub Dawidek	171eb887e9	Remove "zfs:" prefix from lock and condvar names and also skip non-letter characters (mostly "&"). Because top(1) shows only first six characters of wait channel, without this change we saw only one meaningful character. Requested by: kris & others MFC after: 1 week	2007-11-05 18:40:55 +00:00
Konstantin Belousov	89b57fcf01	Fix for the panic("vm_thread_new: kstack allocation failed") and silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when the kmem_alloc_nofault() failed to allocate address space. Both functions now return error instead of panicking or dereferencing NULL. As a consequence, vmspace_exec() and vmspace_unshare() return the errno int. struct vmspace arg was added to vm_forkproc() to avoid dealing with failed allocation when most of the fork1() job is already done. The kernel stack for the thread is now set up in the thread_alloc(), that itself may return NULL. Also, allocation of the first process thread is performed in the fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup() called from fork1(), and proc_linkup0(), that is used to set up the kernel process (was known as swapper). In collaboration with: Peter Holm Reviewed by: jhb	2007-11-05 11:36:16 +00:00
Pawel Jakub Dawidek	4f2398ea17	- Move crfree() outside MNT_ILOCK()/MNT_IUNLOCK() to eliminate a LOR: 1st 0xc4cea568 struct mount mtx (struct mount mtx) @ /usr/src/sys/modules/zfs/../../compat/opensolaris/kern/opensolaris_vfs.c:209 2nd 0xc3ee9010 sleep mtxpool (sleep mtxpool) @ /usr/src/sys/kern/kern_resource.c:1266 - Move crdup() outside MNT_ILOCK()/MNT_IUNLOCK(), as it can sleep. Reported by: Olli Hauer <ohauer@gmx.de> MFC after: 3 days	2007-11-01 08:58:29 +00:00
Robert Watson	30d239bc4c	Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updated as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer	2007-10-24 19:04:04 +00:00
Julian Elischer	3745c395ec	Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. This makes way for us to add REAL kthread_create() and friends that actually make threads. It turns out that most of these calls actually end up being moved back to the thread version when it's added, but we need to make this cosmetic change first. I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.	2007-10-20 23:23:23 +00:00
Kevin Lo	976b010645	Spelling fix for interupt -> interrupt	2007-10-12 06:03:46 +00:00
John Baldwin	977b6507cb	Allow the ia32 resource limits (compat.ia32.max{dsiz,ssiz,vmem} to be set via loader tunables. They are already tunable via sysctl. MFC after: 1 week Approved by: re (kensmith)	2007-09-24 20:49:39 +00:00
David Malone	3ab8526963	The kernel version of Linux statfs64 is actually supposed to take 3 arguments, but we had forgotten the second argument. Also make the Linux statfs64 struct depend on the architecture because it has an extra 4 bytes padding on amd64 compared to i386. The three argument fix is from David Taylor, the struct statfs64 stuff is my fault. With this patch I can install i386 Linux matlab on an amd64 machine. Submitted by: David Taylor <davidt_at_yadt.co.uk> Approved by: re (kensmith)	2007-09-18 19:50:33 +00:00
John Baldwin	cc479dda4a	Rework the routines to convert a 5.x+ statfs structure (with fixed-size 64-bit counters) to a 4.x statfs structure (with long-sized counters). - For block counters, we scale up the block size sufficiently large so that the resulting block counts fit into the long-sized (long for the ABI, so 32-bit in freebsd32) counters. In 4.x the NFS client's statfs VOP did this already. This can lie about the block size to 4.x binaries, but it presents a more accurate picture of the ratios of free and available space. - For non-block counters, fix the freebsd32 stats converter to cap the values at INT32_MAX rather than losing the upper 32-bits to match the behavior of the 4.x statfs conversion routine in vfs_syscalls.c Approved by: re (kensmith)	2007-08-28 20:28:12 +00:00
Konstantin Belousov	b6e645c90f	Implement fake linux sched_getaffinity() syscall to enable java to work with Linux 2.6 emulation. This shall be reimplemented once FreeBSD gets native scheduler affinity syscalls. Submitted by: rdivacky Reviewed by: jkim Sponsored by: Google Summer of Code 2007 Approved by: re (kensmith)	2007-08-28 12:26:35 +00:00
Pawel Jakub Dawidek	70eaa4219c	Some ZFS threads needs stack larger than the default 8kB, so use 16kB of alternate stack if the default is smaller than 16kB. Approved by: re (rwatson)	2007-08-16 20:33:20 +00:00
David Xu	6ec46f7aa8	Regenerate. Approved by: re(kensmith)	2007-08-16 05:32:26 +00:00
David Xu	81ca5b4257	Add thr_kill2 compat32 syscall. Submitted by: Tijl Coosemans tijl at ulyssis dot org Approved by: re (kensmith)	2007-08-16 05:30:04 +00:00
Robert Watson	0bf686c125	Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminating quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)	2007-08-06 14:26:03 +00:00
Andrew Thompson	a4e531102e	ndis will signal the kthread to exit and then sleep on the proc pointer to be woken up by kthread_exit. This is racy and in some cases the kthread will exit before ndis gets around to sleep so it will be stuck indefinitely. This change reuses the kq_exit variable to indicate that the thread has gone and will loop on tsleep with a timeout waiting for it. If the kthread has already exited then it will not sleep at all. Approved by: re (rwatson)	2007-07-22 20:53:28 +00:00
John Baldwin	59d8f3ff08	Fix a couple of issues with the stack limit for 32-bit processes on 64-bit kernels exposed by the recent fixes to resource limits for 32-bit processes on 64-bit kernels: - Let ABIs expose their maximum stack size via a new pointer in sysentvec and use that in preference to maxssiz during exec() rather than always using maxssiz for all processes. - Apply the ABI's limit fixup to the previous stack size when adjusting RLIMIT_STACK to determine if the existing mapping for the stack needs to be grown or shrunk (as well as how much it should be grown or shrunk). Approved by: re (kensmith)	2007-07-12 18:01:31 +00:00
Peter Wemm	b77acb8748	Quiet warnings. I believe gcc is incorrect about these. Approved by: re (rwatson)	2007-07-05 07:38:17 +00:00
Peter Wemm	79d5bdcca5	Don't add the 'pad' argument to the mmap/truncate/etc syscalls. Submitted by: kensmith Approved by: re (kensmith)	2007-07-04 23:06:43 +00:00
Peter Wemm	5aa69f9c72	Add compat6 wrapper code for mmap/lseek/pread/pwrite/truncate/ftruncate. Approved by: re (kensmith)	2007-07-04 23:04:41 +00:00
Peter Wemm	486abf939c	Regenerate after mmap/lseek/etc syscall changes Approved by: re (kensmith)	2007-07-04 23:03:50 +00:00
Peter Wemm	b9f3e68f95	Add i386 emulation wrappers for mmap/lseek/etc. These use COMPAT6, so you must use the already existing, already in generic, COMPAT_FREEBSD6 kernel option for running old 32 bit binaries. Approved by: re (kensmith)	2007-07-04 23:02:40 +00:00
Matt Jacob	739c673c8d	Try a cheap way to get around gcc4.2 believing that user arguments to system calls can change across intervening functions.	2007-06-17 04:37:57 +00:00
Ed Maste	1dd702a59a	Remove stale 'XXX implement' comments for syscalls which have since been implemented.	2007-06-15 21:54:26 +00:00
Robert Watson	32f9753cfb	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
Matt Jacob	3a4ac24970	Quiesce warnings by initializing irql values to zero.	2007-06-10 04:40:13 +00:00
Matt Jacob	2ba956ed13	Ensure that newpath is always initialized, even for the error case.	2007-06-10 04:37:22 +00:00
Attilio Rao	a1fe14bc33	rufetch and calcru sometimes should be called atomically together. This patch fixes places where they should be called atomically, changing their locking requirements (both assume per-proc spinlock held) and introducing rufetchcalc, which wraps both calls so that they are performed atomically. Reviewed by: jeff Approved by: jeff (mentor)	2007-06-09 21:48:44 +00:00
Attilio Rao	a140976eb4	The current rusage code shows peculiar problems: - Unsafeness on ruadd() in thread_exit() - Unatomicity of thread_exit() in the exit1() operations This patch addresses these problems allocating p_fd as part of the process and modifying the way it is accessed. A small chunk of this patch resolves a race about p_state in kern_wait(), since we have to be sure about the zombif-ing process. Submitted by: jeff Approved by: jeff (mentor)	2007-06-09 18:56:11 +00:00
Pawel Jakub Dawidek	3b7917d766	- Reduce number of atomic operations needed to be implemented in asm by implementing some of them using existing ones. - Allow to compile ZFS on all archs and use atomic operations surrounded by global mutex on archs we don't have or can't have all atomic operations needed by ZFS.	2007-06-08 12:35:47 +00:00
Jeff Roberson	982d11f836	Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling synchronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-05 00:00:57 +00:00
David Malone	041b706b2f	Despite several examples in the kernel, the third argument of sysctl_handle_int is not sizeof the int type you want to export. The type must always be an int or an unsigned int. Remove the instances where a sizeof(variable) is passed to stop people accidentally cutting and pasting these examples. In a few places sysctl_handle_int was being used on 64 bit types, which would truncate the value to be exported. In these cases use sysctl_handle_quad to export them and change the format to Q so that sysctl(1) can still print them.	2007-06-04 18:25:08 +00:00
Pawel Jakub Dawidek	b166b92692	Reimplement traverse() helper function: 1. Pass locking flags to VFS_ROOT(). 2. Check v_mountedhere while the vnode is locked. 3. Always return locked vnode on success. Change 1 fixes problem reported by Stephen M. Rumble - after zfs_vfsops.c,1.9 change, zfs_root() no longer locks the vnode unconditionally and traverse() didn't pass right lock type to VFS_ROOT(). The result was that kernel panicked when .zfs/ directory was accessed via NFS.	2007-06-04 11:31:46 +00:00
Attilio Rao	2feb50bf7d	Revert VMCNT_* operations introduction. Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)	2007-05-31 22:52:15 +00:00
Konstantin Belousov	9e223287c0	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
Pawel Jakub Dawidek	0d99488ded	There are too many false positive LORs reported by WITNESS, so when ZFS debug is turned off, initialize locks with NOWITNESS flag. At some point I'll get back to them, we would probably need BLESSING functionality, which is currently turned off by default.	2007-05-26 21:37:14 +00:00
Pawel Jakub Dawidek	fbd08bbe6a	DNLC_NO_VNODE can't be NULL. Reported by: ru	2007-05-24 13:44:45 +00:00
Pawel Jakub Dawidek	d4c4dfe96f	FreeBSD's namecache works quite well with ZFS, so remove DNLC.	2007-05-23 21:33:02 +00:00
Olivier Houchard	302e130edc	Remove duplicate includes. Submitted by: Cyril Nguyen Huu <cyril ci0 org>	2007-05-23 13:36:02 +00:00
Konstantin Belousov	1c182de9a9	Move futex support code from <arch>/support.s into linux compat directory. Implement all futex atomic operations in assembler to not depend on the fuword() that does not allow to distinguish between -1 and failure return. Correctly return 0 from atomic operations on success. In collaboration with: rdivacky Tested by: Scot Hetzel <swhetzel gmail com>, Milos Vyletel <mvyletel mzm cz> Sponsored by: Google SoC 2007	2007-05-23 08:33:06 +00:00
Alexander Kabaev	23a29e45cd	Allow FreeBSD's native ELF image activators to execute shared libraries the same way it was enabled for Linux binaries in linuxulator. This allows binaries built with -pie. Many ports auto-detect -fPIE support in GCC 4.2 and build binaries FreeBSD was unable to run.	2007-05-22 02:22:58 +00:00
Jeff Roberson	0ad5e7f326	- Move GDT/LDT locking into a separate spinlock, removing the global scheduler lock from this responsibility. Contributed by: Attilio Rao <attilio@FreeBSD.org> Tested by: jeff, kkenn	2007-05-20 22:03:57 +00:00
Jeff Roberson	222d01951f	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>	2007-05-18 07:10:50 +00:00
John Baldwin	19059a13ed	Rework the support for ABIs to override resource limits (used by 32-bit processes under 64-bit kernels). Previously, each 32-bit process overwrote its resource limits at exec() time. The problem with this approach is that the new limits affect all child processes of the 32-bit process, including if the child process forks and execs a 64-bit process. To fix this, don't overwrite the resource limits during exec(). Instead, sv_fixlimits() is now replaced with a different function sv_fixlimit() which asks the ABI to sanitize a single resource limit. We then use this when querying and setting resource limits. Thus, if a 32-bit process sets a limit, then that new limit will be inherited by future children. However, if the 32-bit process doesn't change a limit, then a future 64-bit child will see the "full" 64-bit limit rather than the 32-bit limit. MFC is tentative since it will break the ABI of old linux.ko modules (no other modules are affected). MFC after: 1 week	2007-05-14 22:40:04 +00:00
Pawel Jakub Dawidek	57504dcfaf	Share-lock a vnode where possible.	2007-05-02 01:03:10 +00:00
Alan Cox	37f3c8939a	Eliminate the use of Giant from ia64-specific code in freebsd32_mmap().	2007-05-01 17:10:01 +00:00
Alan Cox	4bd4f5a2e2	Synchronize vm map and object accesses. Approved by: des@	2007-05-01 03:09:57 +00:00
Pawel Jakub Dawidek	cc7cd831b2	MFp4: Reduce diff against vendor code: - Move FreeBSD-specific code to zfs_freebsd_*() functions in zfs_vnops.c and keep original functions as similar to vendor's code as possible. - Add various includes back, now that we have them.	2007-04-23 00:52:07 +00:00
Dag-Erling Smørgrav	7621783a55	Now that we're MPSAFE, tell namei() to acquire Giant if necessary.	2007-04-22 08:41:52 +00:00
Pawel Jakub Dawidek	9de81c7273	MFp4: @118370 Correct typo. @118371 Integrate changes from vendor. @118491 Show backtrace on unexpected code paths. @118494 Integrate changes from vendor. @118504 Fix sendfile(2). I had two ways of fixing it: 1. Fixing sendfile(2) itself to use VOP_GETPAGES() instead of hacking around with vn_rdwr(UIO_NOCOPY), which was suggested by ups. 2. Modify ZFS behaviour to handle this special case. Although 1 is more correct, I've chosen 2, because hack from 1 has a side-effect of being faster - it reads ahead MAXBSIZE bytes instead of reading page by page. This is not easy to implement with VOP_GETPAGES(), at least not for me in this very moment. Reported by: Andrey V. Elsukov <bu7cher@yandex.ru> @118525 Reorganize the code to reduce diff. @118526 This code path is expected. It is simply when file is opened with O_FSYNC flag. Reported by: kris Reported by: Michal Suszko <dry@dry.pl>	2007-04-21 12:02:57 +00:00
Pawel Jakub Dawidek	32371d2025	MFp4: Fix automatic snapshot mount when unprivileged user does lookup on a snapshot directory: - Remove PRIV_VFS_MOUNT check - regular users can mount snapshots via lookups on snapshot directory. - Reset mount credential to kcred, so user won't be able to unmount the snapshot. - Reset owner uid. - Unlock vnode in case of a failure. Reported by: simokawa	2007-04-18 15:24:48 +00:00
Pawel Jakub Dawidek	a1bcf4dc7b	- Fix a leftover - vfs_mount_alloc() is now exported properly. This fixes strange panics when listing .zfs/snapshot/ directory for me. Reported by: simokawa Reported by: Johan Hendriks <Johan@double-l.nl> - Hide cache_purge() under FREEBSD_NAMECACHE like in other files. - Protect mnt_flag with mount interlock.	2007-04-17 21:16:34 +00:00
Dag-Erling Smørgrav	78c3440e7d	Whitespace cleanup.	2007-04-15 17:02:03 +00:00
Robert Watson	d72a615878	Some Linux applications (ping) pass a non-NULL msg_control argument to sendmsg() while using a 0-length msg_controllen. This isn't allowed in the FreeBSD system call ABI, so detect this case and set msg_control to NULL. This allows Linux ping to work. Submitted by: rdivacky	2007-04-14 10:35:09 +00:00
Wojciech A. Koszek	f7caeade24	strchr() and strrchr() are already present in the kernel, but with less popular names. Hence: - comment current index() and rindex() functions, as these serve the same functionality as, respectively, strchr() and strrchr() from userland; - add inlined version of strchr() and strrchr(), as we tend to use them more often; - remove str[r]chr() definitions from ZFS code; Reviewed by: pjd Approved by: cognet (mentor)	2007-04-10 21:42:12 +00:00
Scott Long	6eef46be3b	Whitespace fixes	2007-04-10 21:37:37 +00:00
Pawel Jakub Dawidek	2d03e33170	Try to stabilize ZFS with regard to memory consumption: - Allow to shrink ARC down to 16MB (instead of 64MB). - Set arc_max to 1/2 of kmem_map by default. - Start freeing things earlier when low memory situation is detected. - Serialize execution of arc_lowmem(). I decided to setup minimum ZFS memory requirements to 512MB of RAM and 256MB of kmem_map size. If there is less RAM or kmem_map, a warning will be printed. World is cruel, be no better. In other words: modern file system requires modern hardware:) From ZFS administration guide: "Currently the minimum amount of memory recommended to install a Solaris system is 512 Mbytes. However, for good ZFS performance, at least one Gbyte or more of memory is recommended."	2007-04-10 02:35:57 +00:00
Pawel Jakub Dawidek	24bda1641f	Instead of detecting if lock is already initialized based on standard 1 bit check, use more accurate 13 bits check. We had too many false-positives with the standard check. Reported by: mlaier	2007-04-09 01:05:31 +00:00
Pawel Jakub Dawidek	bdebccf9b9	Extend kobj compatibility KPI to support operating on files before and after the root file system is mounted. This is one of the changes that will allow to put root file system on ZFS.	2007-04-08 23:57:08 +00:00
Pawel Jakub Dawidek	ffe54ff0ec	MFp4: Synchronize with recent OpenSolaris changes.	2007-04-08 16:29:25 +00:00
Scott Long	1eba4c7948	Add the CAM 'SG' peripheral device. This device implements a subset of the Linux SCSI SG passthrough device API. The intention is to allow for both running of Linux apps that want to talk to /dev/sg* nodes, and to facilitate porting of apps from Linux to FreeBSD. As such, both native and linuxolator entry points and definitions are provided. Caveats: - This does not support the procfs and sysfs nodes that the Linux SG driver provides. Some Linux apps may rely on these for operation, others may only use them for informational purposes. - More ioctls need to be implemented. - Linux uses a naming scheme of "sg[a-z]" for devices, while FreeBSD uses a scheme of "sg[0-9]". Devfs aliases (symlinks) are automatically created to link the two together. However, tools like camcontrol only see the native names. - Some operations were originally designed to return byte counts or other data directly as the syscall return value. The linuxolator doesn't appear to support this well, so this driver just punts for these cases. Now that the driver is in place, others are welcome to add missing functionality. Thanks to Roman Divacky for pushing this work along.	2007-04-07 19:40:58 +00:00
Jung-uk Kim	6e612eca81	Fix kernel module dependency. linprocfs depends on sysvmsg and sysvsem. Submitted by: nork	2007-04-06 18:15:56 +00:00
Pawel Jakub Dawidek	4d00f78b40	We have strcasecmp() in libkern now.	2007-04-06 11:18:57 +00:00
Pawel Jakub Dawidek	f0a75d274a	Please welcome ZFS - The last word in file systems. ZFS file system was ported from OpenSolaris operating system. The code is under CDDL license. I'd like to thank all SUN developers that created this great piece of software. Supported by: Wheel LTD (http://www.wheel.pl/) Supported by: The FreeBSD Foundation (http://www.freebsdfoundation.org/) Supported by: Sentex (http://www.sentex.net/)	2007-04-06 01:09:06 +00:00
Robert Watson	5e3f7694b1	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are important for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
Jung-uk Kim	357afa7113	MFP4: Turn emul_lock into a mutex. Submitted by: rdivacky	2007-04-02 18:38:13 +00:00
Jung-uk Kim	3dd8390fd9	Use underlying structures instead of kernel_sysctlbyname() for msginfo and seminfo because kernel_sysctlbyname() is slow. There is no dependency problem since linux module depends on both sysvmsg and sysvsem and linprocfs depends on it in turn. Pointed out by: des Reviewed by: des	2007-03-30 17:56:44 +00:00
Jung-uk Kim	a328699b34	MFP4: Linux futex support for amd64. Initial patch was submitted by kib and additional work was done by Divacky Roman. Tested by: emulation	2007-03-30 01:07:28 +00:00
Julian Elischer	6734f35eac	Implement the openat() linux syscall Submitted by: Roman Divacky (rdivacky@) MFC after: 2 weeks	2007-03-29 02:11:46 +00:00
Dag-Erling Smørgrav	771709eb78	Add a pn_destroy field to pfs_node. This field points to a destructor function which is called from pfs_destroy() before the node is reclaimed. Modify pfs_create_{dir,file,link}() to accept a pointer to a destructor function in addition to the usual attr / fill / vis pointers. This breaks both the programming and binary interfaces between pseudofs and its consumers. It is believed that there are no pseudofs consumers outside the source tree, so that the impact of this change is minimal. Submitted by: Aniruddha Bohra <bohra@cs.rutgers.edu>	2007-03-12 12:16:52 +00:00
Robert Watson	b77ad8fc3b	In translate_path_major_minor(), do not calculate otherwise unused 'fp' variable, avoiding an extra locking of the file descriptor array.	2007-03-06 07:39:12 +00:00
Jung-uk Kim	5017af608d	MFP4: 113090, 113130, 113132 Add Linux kernel version strings to /proc/sys/kernel.	2007-03-02 01:10:26 +00:00
Jung-uk Kim	a4e3bad794	MFP4: 115220, 115222 - Fix style(9) and reduce diff between amd64 and i386. - Prefix Linuxulator macros with LINUX_ to prevent future collision.	2007-03-02 00:08:47 +00:00
Alexander Leidinger	8cf5ee2e2a	MFp4 (110541): Sync with rev 1.7 in NetBSD. Obtained from: NetBSD	2007-02-25 12:43:07 +00:00
Alexander Leidinger	f9dac96185	MFp4 (110523, parts which apply cleanly): semi-automatic style(9) The futex stuff already differs a lot (only a small part does not differ) from NetBSD, so we are already way off and can't apply changes from NetBSD automatically. As we need to merge everything by hand already, we can even make the files comply to our world order.	2007-02-25 12:40:35 +00:00
Alexander Leidinger	802e08a360	Partial MFp4 of 114977: Whitespace commit: Fix grammar, spelling and punctuation. Submitted by: "Scot Hetzel" <swhetzel@gmail.com>	2007-02-24 16:49:25 +00:00
Alexander Leidinger	1a26db0a3a	MFp4 (114193 (i386 part), 114194, 114195, 114200): - Don't "return" in linux_clone() after we forked the new process in a case of problems. - Move the copyout of p2->p_pid outside the emul_lock coverage in linux_clone(). - Cache the em->pdeath_signal in a local variable and move the copyout out of the emul_lock coverage. - Move the free() out of the emul_shared_lock coverage in a preparation to switch emul_lock to non-sleepable lock (mutex). Submitted by: rdivacky	2007-02-23 22:39:26 +00:00
Alexander Leidinger	e8b8b834b4	MFp4 (part of 114132): - Fix a LOR caused by holding emul_lock and proctree_lock at once. Submitted by: rdivacky	2007-02-23 22:29:24 +00:00
John Baldwin	a96255b62d	Use 'pause' in several places rather than trying to tsleep() on NULL (which triggers a KASSERT) or local variables. In the case of kern_ndis, the tsleep() actually used a common sleep address (curproc) making it susceptible to a premature wakeup.	2007-02-23 16:25:08 +00:00
Paolo Pisati	ef544f6312	o break newbus api: add a new argument of type driver_filter_t to bus_setup_intr() o add an int return code to all fast handlers o retire INTR_FAST/IH_FAST For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current Reviewed by: many Approved by: re@	2007-02-23 12:19:07 +00:00
Konstantin Belousov	b4bb515484	Remove extern int hz; use proper include file instead.	2007-02-02 08:58:16 +00:00
Konstantin Belousov	d0b2365eec	Introduce some more SO_ option equivalents from Linux to FreeBSD. The msg variable in linux_recvmsg() was not initialized. Copy it from userspace. Submitted by: rdivacky	2007-02-01 13:36:19 +00:00
Konstantin Belousov	75ee4e5462	No need to lock emul_lock in exit_group() because em->shared cannot change (because it's referenced by curthread). This fixes a LOR caused by acquiring emul_shared_lock while holding emul_lock. Fix typo in comment. Submitted by: rdivacky	2007-02-01 13:33:33 +00:00
Konstantin Belousov	25954d7430	No need to synchronize linux_schedtail with linux_proc_init. p->p_emuldata is properly initialized in the time when the child can run. Do not set p->p_emuldata to NULL when the process is exiting. It does not make any sense and only costs 2 mutex operations. Do not lock emul_data to unlock it on the very next line. Comment on possible race while there. Reparent all procs that are part of a threading group but not its leaders to init and SIGCHLD init to finish the zombies off. This fixes zombies left after opera's exit. [1] There is no need to lock p_em in the linux_proc_init CLONE_THREAD case because the process cannot change the address of the p_em->shared because it's currently running this code path. Move assigning of em->shared outside emul_shared_lock. Noticed by: Scott Robbins <scottro@nyc.rr.com> [1] Submitted by: rdivacky	2007-02-01 13:29:27 +00:00
Alexander Leidinger	eff9c72b4b	Use a printf-modifier which doesn't need a cast. Submitted by: scottl	2007-01-21 13:18:52 +00:00
Alexander Leidinger	9cb5a012fb	Fix tinderbox build on amd64.	2007-01-20 19:32:23 +00:00
Alexander Leidinger	d071f5048c	MFp4 (113077, 113083, 113103, 113124, 113097): Don't expose em->shared to the outside world before it's properly initialized. Might not affect anything but it's at least a better coding style. Don't expose em via p->p_emuldata until it's properly initialized. This also enables us to get rid of some locking and simplify the code because we are working on a local copy. In linux_fork and linux_vfork create the process in stopped state to be sure that the new process runs with fully initialized emuldata structure [1]. Also fix the vfork (both in linux_clone and linux_vfork) race that could result in never woken up process [2]. Reported by: Scot Hetzel [1] Suggested by: jhb [2] Reviewed by: jhb (at least some important parts) Submitted by: rdivacky Tested by: Scot Hetzel (on amd64) Change 2 comments (in the new code) to comply to style(9). Suggested by: jhb	2007-01-20 14:58:59 +00:00
Alexander Leidinger	f0cad96d23	Ooops, fix the ratelimit.	2007-01-20 11:31:14 +00:00
Alexander Leidinger	456ede3976	Convert a KASSERT into a runtime warning (rate limited) + failsafe fallback. Because of a stupid bug (also fixed with this commit) the KASSERT was triggered when running the linux top. Pointy hat to: netchild	2007-01-20 11:07:41 +00:00
Konstantin Belousov	4349c6ba29	Add support for LINUX_O_DIRECT, LINUX_O_DIRECT and LINUX_O_NOFOLLOW flags to open() [1]. Improve locking for accessing session control structures [2]. Try to document (most likely harmless) races in the code [3]. Based on submission by: Intron (intron at intron ac) [1] Reviewed by: jhb [2] Discussed with: netchild, rwatson, jhb [3]	2007-01-18 09:32:08 +00:00
Alexander Leidinger	17011df1e1	MFp4 (112379): Implement SETALL/GETALL IPC primitives. This fixes some LTP testcases and LabView is able to proceed a little bit further. Submitted by: rdivacky	2007-01-14 16:34:43 +00:00
Alexander Leidinger	31becc7692	MFp4 (112705): Inherit setting of the default emulation version to the jails. Pointed out by: jhb Submitted by: rdivacky	2007-01-14 16:07:01 +00:00
Alexander Leidinger	a849401985	MFp4 (112646): Now (ok it's been a while...) that FreeBSD has RLIMIT_AS too, we can use it in the linuxolator instead of ignoring it. This fixes a LTP test. Submitted by: rdivacky	2007-01-07 19:30:19 +00:00
Alexander Leidinger	bb419e1b5b	MFp4 (112535): No need to lock prison in a case of linux_use26 because the int setting is atomic and process cannot leave jail. Submitted by: kib Reviewed by: jhb Requested by: rdivacky	2007-01-07 19:20:17 +00:00
Alexander Leidinger	0ed6f09c4e	MFp4 (112534): Don't lock em in a case of just using em->shared->group_pid because the group_pid never changes. Submitted by: rdivacky Reviewed by: kib Glanced at by: jhb	2007-01-07 19:14:06 +00:00
Alexander Leidinger	291081ce0a	MFp4 (112499): Protect em->shared with the lock in case of CLONE_THREAD. Submitted by: rdivacky	2007-01-07 19:09:20 +00:00
Alexander Leidinger	1c65504ca8	MFp4 (112498): Rename the locking flags to EMUL_DOLOCK and EMUL_DONTLOCK to prevent confusion. Submitted by: rdivacky	2007-01-07 19:00:38 +00:00
Xin LI	59038483f5	Fix amd64 build. Submitted by: Divacky Roman <xdivac02 stud fit vutbr cz>	2007-01-01 14:47:45 +00:00
Alexander Leidinger	c9447c7551	MFp4 (111746, 108671, 108945, 112352): - add linux utimes syscall [1] - add linux rt_sigtimedwait syscall [2] Submitted by: "Scot Hetzel" <swhetzel@gmail.com> [1] Submitted by: Bruce Becker <hostmaster@whois.gts.net> [2] PR: 93199 [2]	2006-12-31 13:16:00 +00:00
Alexander Leidinger	a628609ee9	MFp4: - semi-automatic style fixes	2006-12-31 12:42:55 +00:00
Alexander Leidinger	9ce8f9bcdd	MFp4 (111746+): Redo the checking for 2.6 emulation. We now cache the value of use26 and replace calls to linux_get_osrelease() + parsing with a call to linux_use26(). Typical path is lockless now. Pointed out by: kib This allows to ship RELENG_7_0 with a default osrelease of 2.4.2 and the possibility to enable 2.6.x emulation without the possible performance impact of the previous version of the check. Submitted by: rdivacky	2006-12-31 12:39:10 +00:00
Alexander Leidinger	ef95cfeab9	MFp4: - semi-automatic style fixes - spelling fixes in comments - add some comments	2006-12-31 11:56:16 +00:00
Sam Leffler	b1d83a2508	add entry points required by newer broadcom wireless driver PR: kern/106131 Submitted by: Scot Hetzel MFC after: 2 weeks	2006-12-25 17:04:41 +00:00
Alexander Leidinger	de6bf3bfcd	MFP4 (110956): Add definition for LINUX_MSG_INFO. This fixes the tinderbox errors. Submitted by: rdivacky	2006-12-21 13:11:06 +00:00
Jung-uk Kim	77424f4177	MFP4: 109655 - Move linux_nanosleep() from src/sys/amd64/linux32/linux32_machdep.c to src/sys/compat/linux/linux_time.c. - Validate timespec ranges before use as Linux kernel does. - Fix l_timespec structure. - Clean up style(9) nits.	2006-12-20 20:17:35 +00:00
Jung-uk Kim	34ec45fe0d	MFP4: 110179 Add rudimentary IPC_INFO/MSG_INFO command support for linux_msgctl() to pacify Linux ipcs(1). While I am here, add more bound checks for linux_msgsnd() and linux_msgrcv().	2006-12-20 20:08:45 +00:00
Jung-uk Kim	5e868cbb79	Regen.	2006-12-20 19:39:10 +00:00
Jung-uk Kim	127891cab9	MFP4: (part of) 110058 Fix 32-bit msgsnd(3) and msgrcv(3) emulations for amd64.	2006-12-20 19:36:03 +00:00
Jung-uk Kim	f61480ecf5	MFP4: (part of) 110058 Use new kern_msgsnd()/kern_msgrcv() to fix linux32 emulation on amd64.	2006-12-20 19:30:52 +00:00
Jung-uk Kim	b34608fea5	MFP4: 109653 Linux mknod(2) can open any files, not just char/block or fifo files. This fixes Linux Test Project test cases mknod01, mknod07 and mknod09.	2006-12-04 22:46:09 +00:00
Jung-uk Kim	b256a1e10b	MFP4: 109652 Fixes for 'blocking in fifoor state' problem of LTP tests. linux_stat() functions were opening files with O_RDONLY to get major/minor pair for char/block special files. Unfortunately, when these functions are used against fifo, it is blocked forever because there is no writer. Instead, we only open char/block special files for major/minor conversion. We have to get rid of kern_open() entirely from translate_path_major_minor() but today is not the day. While I am here, add checks for errors before calling translate_path_major_minor().	2006-12-04 22:38:52 +00:00
Alexander Leidinger	5ac7315788	MFP4 (110957) Use TAILQ_FOREACH_SAFE instead of the unsafe one where an item is removed from the queue. This prevents a panic on kldunload. Submitted by: rdivacky Tested by: bsam	2006-12-03 21:00:31 +00:00
Alexander Leidinger	f6018b1434	MFP4 (108673, 110519, 110874): - Currently LINUX_MAX_COMM_LEN is smaller than MAXCOMLEN, but in case this will change we have a buffer overflow. Apply some defensive programming to DTRT when this should happen. - Use copyinstr() instead of copyin where appropriate. * Fallback to copyin() in case of ENAMETOOLONG. [1] * Use the right source and destination (it was wrong before). - Use strlcpy instead of strcpy. - Properly lock the read case (PR_GET_NAME) like the write case. Reviewed by: rwatson (except [1]) Suggested by: rwatson [1]	2006-12-02 14:56:25 +00:00
Jung-uk Kim	e40fc50b9f	MFP4: Change 109654 Add two linprocfs entries for Linux IPC: /proc/sys/kernel/msgmni -> kern.ipc.msgmni /proc/sys/kernel/sem -> kern.ipc.semmsl kern.ipc.semmns kern.ipc.semopm kern.ipc.semmni This fixes msgget03 and semget05 from Linux Test Project (LTP) test suite. msgctl08 and msgctl09 also use /proc/sys/kernel/msgmni but another fix is required from p4 (Change 110179). Requested by: netchild	2006-11-27 21:10:55 +00:00
Konstantin Belousov	bdaee9ef4e	Add missed ")". Fix the build. Pointy hat to: kib	2006-11-18 17:27:39 +00:00
Konstantin Belousov	cce1514679	Sync struct sysinfo with real one from linux. Submitted by: rdivacky	2006-11-18 14:37:54 +00:00
Konstantin Belousov	0c00520b93	Use standard debugging facilities in linux_getcwd(). Submitted by: rdivacky	2006-11-18 13:31:03 +00:00
Konstantin Belousov	d559d18183	Add debugging printfs to syscalls that do not contain it yet. In sethostname do not print the hostname because it would require to copyin the string. Sethostname is not very frequently used. Submitted by: rdivacky	2006-11-18 13:00:59 +00:00
Konstantin Belousov	f472c6e35a	Remove unnecessary locking of process in linux_getpid. Suggested by: jhb Submitted by: rdivacky	2006-11-18 10:12:43 +00:00
Konstantin Belousov	292a85f4a8	Group pid and parent are shared in a case of CLONE_THREAD not CLONE_VM. This fix lets clone02 LTP test pass with 2.6 emulation. In reality 99% of the cases are that CLONE_VM and CLONE_THREAD are both set so it seemed to work. Submitted by: rdivacky	2006-11-15 11:04:37 +00:00
Konstantin Belousov	0132096dfd	In rev 1.188 of linux_misc.c the added check for valid options omitted __WCLONE. This fixes it thus fixing skype/teamspeak to not keep zombies after exit. Submitted by: rdivacky Reported by: Bakul Shah (bakul at bitblocks com)	2006-11-15 10:01:06 +00:00
Ruslan Ermilov	9f70620442	Regen. Forgotten by: trhodes	2006-11-11 21:49:08 +00:00
Tom Rhodes	6aeb05d7be	Merge posix4/* into normal kernel hierarchy. Reviewed by: glanced at by jhb Approved by: silence on -arch@ and -standards@	2006-11-11 16:26:58 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
Ruslan Ermilov	f42326c579	Regen.	2006-11-03 21:23:33 +00:00
Ruslan Ermilov	0b160a7d2b	Fix build breakage introduced in previous commit (redeclaration of sctp functions).	2006-11-03 21:21:28 +00:00
Randall Stewart	af99851047	This commits the remake in kern/ make sysent to get the correct syscalls.master's $FreeBSD$ tag record and a make sysent in sys/compat/freebsd32. Thanks Ruslan for pointing out the steps I missed :-0 Approved by: gnn	2006-11-03 18:57:49 +00:00
Randall Stewart	f8829a4a40	Ok, here it is, we finally add SCTP to current. Note that this work is not just mine, but it is also the works of Peter Lei and Michael Tuexen. They both are my two key other developers working on the project.. and they need ata-boy's too: ** peterlei@cisco.com tuexen@fh-muenster.de ** I did do a make sysent which updated the syscall's and sysproto.. I hope that is correct... without it you don't build since we have new syscalls for SCTP :-0 So go out and look at the NOTES, add option SCTP (make sure inet and inet6 are present too) and play with SCTP. I will see about committing some test tools I have after I figure out where I should place them. I also have a lib (libsctp.a) that adds some of the missing socketapi functions that I need to put into lib's.. I will talk to George about this :-) There may still be some 64 bit issues in here, none of us have a 64 bit processor to test with yet.. Michael may have a MAC but that's another beast too.. If you have a mac and want to use SCTP contact Michael he maintains a web site with a loadable module with this code :-) Reviewed by: gnn Approved by: gnn	2006-11-03 15:23:16 +00:00
Alexander Leidinger	3680a41902	Backout the linux aio stuff. Several problems were identified and the dynamic nature (if no native aio code is available, the linux part returns ENOSYS because of missing requisites) should be solved differently than it is. All this will be done in P4. Not included in this commit is a backout of the changes to the native aio code (removing static in some places). Those changes (and some more) will also be needed when the reworked linux aio stuff will reenter the tree. Requested by: rwatson Discussed with: rwatson	2006-10-29 14:02:39 +00:00
Alexander Leidinger	e3e6449247	style(9) Noticed by: rwatson	2006-10-29 09:50:55 +00:00
Alexander Leidinger	c4ce314b40	Fix style(9). Noticed by: rwatson	2006-10-28 16:47:38 +00:00
Alexander Leidinger	955d762aca	MFP4: Implement prctl(). Submitted by: rdivacky Tested with: LTP	2006-10-28 10:59:59 +00:00
Maxim Sobolev	016b81e405	Regen.	2006-10-24 17:25:36 +00:00
Maxim Sobolev	ef16706d34	Fix kernel breakage introduced in the previous commit (redeclaration of the audit functions).	2006-10-24 17:24:11 +00:00
Robert Watson	c71bf4bf63	Regenerate.	2006-10-24 13:54:56 +00:00
Robert Watson	a1dce47980	Hook up audit functions in the freebsd32 compatibility code. It is believed these likely don't require wrappers. Reported by: sobomax MFC after: 3 days	2006-10-24 13:49:44 +00:00
Robert Watson	aed5570872	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
David Xu	034b26fc65	Regenerate.	2006-10-17 02:28:58 +00:00
David Xu	3f9223b65d	Sync with master.	2006-10-17 02:28:26 +00:00
Alexander Leidinger	6474221698	Fix compile (use the right variable name).	2006-10-15 14:34:03 +00:00
Alexander Leidinger	6a1162d4cd	MFP4 (with some minor changes): Implement the linux_io_* syscalls (AIO). They are only enabled if the native AIO code is available (either compiled in to the kernel or as a module) at the time the functions are used. If the AIO stuff is not available there will be a ENOSYS. From the submitter: ---snip--- DESIGN NOTES: 1. Linux permits a process to own multiple AIO queues (distinguished by "context"), but FreeBSD creates only one single AIO queue per process. My code maintains a request queue (STAILQ of queue(3)) per "context", and throws all AIO requests of all contexts owned by a process into the single FreeBSD per-process AIO queue. When the process calls io_destroy(2), io_getevents(2), io_submit(2) and io_cancel(2), my code can pick out requests owned by the specified context from the single FreeBSD per-process AIO queue according to the per-context request queues maintained by my code. 2. The request queue maintained by my code stores contrast information between Linux IO control blocks (struct linux_iocb) and FreeBSD IO control blocks (struct aiocb). FreeBSD IO control block actually exists in userland memory space, required by FreeBSD native aio_XXXXXX(2). 3. It is quite troubling that the function io_getevents() of libaio-0.3.105 needs to use Linux-specific "struct aio_ring", which is a partial mirror of context in user space. I would rather take the address of context in kernel as the context ID, but the io_getevents() of libaio forces me to take the address of the "ring" in user space as the context ID. To my surprise, one comment line in the file "io_getevents.c" of libaio-0.3.105 reads: Ben will hate me for this REFERENCE: 1. Linux kernel source code: http://www.kernel.org/pub/linux/kernel/v2.6/ (include/linux/aio_abi.h, fs/aio.c) 2. Linux manual pages: http://www.kernel.org/pub/linux/docs/manpages/ (io_setup(2), io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2)) 3. 
Linux Scalability Effort: http://lse.sourceforge.net/io/aio.html The design notes: http://lse.sourceforge.net/io/aionotes.txt 4. The package libaio, both source and binary: http://rpmfind.net/linux/rpm2html/search.php?query=libaio Simple transparent interface to Linux AIO system calls. 5. Libaio-oracle: http://oss.oracle.com/projects/libaio-oracle/ POSIX AIO implementation based on Linux AIO system calls (depending on libaio). ---snip--- Submitted by: Li, Xiao <intron@intron.ac>	2006-10-15 14:22:14 +00:00
Alexander Leidinger	687c23be1d	MFP4 (107868 - 107870): Use a macro to test for a valid signal instead of doing it by hand everywhere. Submitted by: rdivacky	2006-10-15 12:51:43 +00:00
Giorgos Keramidas	050f8bb67d	Spell proc/sys/kernel/pid_max correctly in a comment. Submitted by: rdivacky	2006-10-11 20:32:46 +00:00
John Baldwin	8528552b0d	Don't pass unused bufsz to kern_shmctl().	2006-10-10 22:46:50 +00:00
John Baldwin	f3ea244ea9	Only try to copyin a msqid for the IPC_SET command to msgctl(). Other commands (such as IPC_RMID) were bogusly failing with EFAULT. Tested by: jkim	2006-10-10 22:46:22 +00:00
John Baldwin	7f4c1dd0d6	Remove unnecessary casts before PTRIN().	2006-10-10 22:44:59 +00:00
Alexander Leidinger	28638377ad	- change if (cond) panic() to KASSERT. - Don't forget to free em in a case of error. Suggested by: ssouhlal Submitted by: rdivacky Tested with: LTP	2006-10-08 17:10:34 +00:00
Alexander Leidinger	7660ace19c	- Replace homegrown check for FIFO with S_ISFIFO. [1] - Check the status of the options before messing with it. Inspired by: NetBSD [1] Submitted by: rdivacky Tested with: LTP	2006-10-08 17:08:27 +00:00
Alexander Leidinger	236e97b2b2	Implement /proc/sys/kernel/pid_max. Submitted by: rdivacky Tested with: LTP	2006-10-08 16:55:27 +00:00
David Xu	295426f4c5	Regenerate.	2006-10-06 08:24:37 +00:00
David Xu	ae7d8a6766	Implement 32bit umtx_lock and umtx_unlock system calls, these two system calls are not used by libthr in RELENG_6 and HEAD, it is only used by the libthr in RELENG-5, the _umtx_op system call can do more incremental dirty works than these two system calls without having to introduce new system calls or throw away old system calls when things are going on.	2006-10-06 08:22:08 +00:00
David Xu	312a0e5f06	Regenerate.	2006-10-05 01:58:57 +00:00
David Xu	e6e7f16cb4	Oops, add the missing file.	2006-10-05 01:58:08 +00:00
David Xu	c6511aea86	Move some declaration of 32-bit signal structures into file freebsd32-signal.h, implement sigtimedwait and sigwaitinfo system calls.	2006-10-05 01:56:11 +00:00
Robert Watson	531147aa3e	Regenerate.	2006-10-03 20:48:11 +00:00
Robert Watson	dfb041ca62	Change getpagesize() system call audit event to more clearly indicate that we don't audit it. MFC after: 3 days Obtained from: TrustedBSD Project	2006-10-03 20:48:03 +00:00
Poul-Henning Kamp	f645b0b51c	First part of a little cleanup in the calendar/timezone/RTC handling. Move relevant variables to <sys/clock.h> and fix #includes as necessary. Use libkern's much more time- & space-efficient BCD routines.	2006-10-02 12:59:59 +00:00
Alexander Leidinger	d4b7423fa1	MFp4: - Linux returns ENOPROTOOPT in a case of not supported opt to setsockopt. - Return EISDIR in pread() when arg is a directory. - Return EINVAL instead of EFAULT when namelen is not correct in accept(). - Return EINVAL instead of EACCES if invalid access mode is entered in access(). - Return EINVAL instead of EADDRNOTAVAIL in a case of bad salen param to bind(). Submitted by: rdivacky Tested with: LTP (vfork01 fails now, but it seems to be a race and not caused by those changes) MFC after: 1 week	2006-09-23 19:06:54 +00:00
David Xu	4af4fcb71a	Regenerate.	2006-09-23 00:27:53 +00:00
David Xu	5c26f4cea8	Enable sigwait.	2006-09-23 00:27:11 +00:00
David Xu	ac3674aa52	Regenerate.	2006-09-22 15:05:34 +00:00
David Xu	cda9a0d1c2	Add compatible code to let 32bit libthr work on 64bit kernel.	2006-09-22 15:04:28 +00:00
David Xu	27bbb2e71f	Regenerate.	2006-09-22 00:53:43 +00:00
David Xu	1eec02f538	Add umtx support for 32bit process on AMD64 machine.	2006-09-22 00:52:54 +00:00
David Xu	ecc313475b	Regenerate.	2006-09-21 04:50:38 +00:00
David Xu	47bd78d24d	sync with master.	2006-09-21 04:49:36 +00:00
Robert Watson	da7cbdc2b3	Regenerate.	2006-09-17 13:29:36 +00:00
Robert Watson	6c2d307a0e	AUE_SIGALTSTACK instead of AUE_SIGPENDING for sigaltstack(). Obtained from: TrustedBSD Project MFC after: 3 days	2006-09-17 13:28:11 +00:00
Alexander Leidinger	18f81b3dfa	- don't reboot() when fed with wrong parameters (and enough permissions) [1] - add support to power off the system [2] - check the linux magic values [3] Submitted by: Marcin Cieslak <saper@SYSTEM.PL> [1,2] Modelled after: linux man page of the reboot() syscall [3] Found by: LTP testcase "reboot02" [1] Tested with: LTP testcase "reboot02" [1,3] MFC after: 1 week	2006-09-16 14:12:04 +00:00
Alexander Leidinger	db0d964062	The Linux unlink syscall uses a different errno value when trying to unlink a directory. PR: 102897 [1] Noticed by: Knut Anders Hatlen <kahatlen@gmail.com>, testrun with LTP [1] Submitted by: Marcin Cieslak <saper@SYSTEM.PL> Tested by: netchild (LTP test run)	2006-09-10 13:47:56 +00:00
Alexander Leidinger	8618fd85a3	- Extend the coverage of PROC_LOCK to cover wakeup(&p->p_emuldata); - Lock the emuldata in a case when we just created it. Sponsored by: Google SoC 2006 Submitted by: rdivacky Suggested by: jhb	2006-09-09 16:55:55 +00:00
Alexander Leidinger	bb59e63f8f	Change futex lock from mutex to sx. Make futex_get atomic (protected by the futex lock). Sponsored by: Google SoC 2006 Submitted by: rdivacky Suggested by: jhb	2006-09-09 16:25:25 +00:00
Alexander Leidinger	c19ddeda07	- don't wake every sleeper just the first one [1] - remove debugging printf [2] Submitted by: intron <mag@intron.ac> [1], rdivacky [2]	2006-09-09 13:04:28 +00:00
David Xu	c0ba6c1783	The following functions need not to be reimplemented, reuse 64bit syscalls instead: sigqueue, thr_set_name, thr_setscheduler, thr_getscheduler, thr_setschedparam.	2006-09-09 01:22:13 +00:00
Robert Watson	e482025ebd	Regenerate.	2006-09-03 16:24:36 +00:00
Robert Watson	e8a6d7e554	Set freebsd32 system call event identifiers for: - old truncate, ftruncate - old getpeername, gethostid, sethostid, getrlimit, setrlimit, killpg. - old quota, getsockname, getdirentries. - lgetfh - old getdomainname, setdomainname - sysarch, rtprio, __getcwd, jail, sigtimedwait - extattrctl, extattr_{get,set,delete,list}_{file,fd,link} - getresgid, getresuid, kqueue, eaccess, nmount, sendfile - fhstatfs, kldunloadf Right identifiers for: - nfssvc Remove incorrect identifier for: - __acl_get_file Compile tested with help of: sam Obtained from: TrustedBSD Project	2006-09-03 16:17:49 +00:00
Robert Watson	8075da7e8b	Regenerate. Looks like someone missed doing this previously as more than just the audit event change appears in the diff.	2006-09-03 13:47:52 +00:00
Robert Watson	1b25e5f3c4	Use AUE_NTP_ADJTIME instead of AUE_ADJTIME for ntp_adjtime(). Obtained from: TrustedBSD Project	2006-09-03 13:47:24 +00:00
Robert Watson	0ee913128d	Remove two hypothetical calls to suser() in ifdef'd (and uncompilable) svr4 code: this code would call centralized sysctl code that does these checks also. MFC after: 1 week Obtained from: TrustedBSD Project Sponsored by: nCircle Network Security, Inc.	2006-09-02 08:18:22 +00:00
Suleiman Souhlal	c67e0cc9e7	FREE -> free Submitted by: rdivacky	2006-08-28 13:52:27 +00:00
Alexander Leidinger	835e506190	Add the linux statfs64 call. This allows Tivoli backup to proceed a little bit further on -current (still not successful, but a step in the right direction). Sponsored by: Google SoC 2006 Submitted by: rdivacky Tested by: Paul Mather <paul@gromit.dlib.vt.edu>	2006-08-27 08:56:54 +00:00
Alexander Leidinger	84ed9f91d8	Correct the number of retries in a futex_wake() call. Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-08-26 10:36:16 +00:00
Robert Watson	3e8df637c0	Don't call suser_cred() directly from linux_sethostname(), as it just wraps userland_sysctl(), which performs necessary privilege checks as part of its normal operation. MFC after: 1 week	2006-08-25 11:02:42 +00:00
Alexander Leidinger	1a28c0df09	Sync the MI parts for amd64 with i386 and remove the corresponding special handling for amd64 in the common code. The MD parts for amd64 are still outstanding, but at least this fixes some panics on amd64. Sponsored by: Google SoC 2006 Submitted by: rdivacky Tested by: bsam	2006-08-20 13:50:27 +00:00
Alexander Leidinger	29ddc19bbf	Get rid of some nested includes. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb	2006-08-19 15:13:01 +00:00
Suleiman Souhlal	5342db0872	MALLOC -> malloc and FREE -> free Submitted by: rdivacky Pointed out by: jhb	2006-08-19 11:54:19 +00:00
Suleiman Souhlal	b273d5aa72	ifdef DEBUG a printf Submitted by: rdivacky	2006-08-19 11:07:22 +00:00
Warner Losh	1a3c917f9d	while (0); -> while (0) in multi-line macros	2006-08-17 22:50:33 +00:00
Alexander Leidinger	590e3a06e8	- disable some more code when osrelease=2.4.2 - protect td->td_proc->p_pid with the proc lock in linux_getpid in the amd64 (= non i386) case [1] Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: netchild [1]	2006-08-17 21:21:30 +00:00
Alexander Leidinger	94cb2ecf79	Move some stuff into headers where they belong. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb, ssouhlal	2006-08-17 21:06:48 +00:00
Alexander Leidinger	9b0bcbfbda	Fix the DEBUG build: - linux_emul.c [1] - linux_futex.c [2] Sponsored by: Google SoC 2006 [1] Submitted by: rdivacky [1] netchild [2]	2006-08-17 09:50:30 +00:00
Peter Wemm	bad9a7a5f9	Grab two syscall numbers. One is used to emulate functionality that linux has in its procfs (do a readlink of /proc/self/fd/<nn> to find the pathname that corresponds to a given file descriptor). Valgrind-3.x needs this functionality. This is a placeholder only at this time.	2006-08-16 22:32:50 +00:00
Alexander Leidinger	0eef2f8a4e	Style fixes to comments. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb, ssouhlal	2006-08-16 18:54:51 +00:00
Jung-uk Kim	a88d050dfc	Include sys/limits.h for INT_MAX. freebsd32_proto.h 1.58 does not include sys/umtx.h any more and previously it was included from there.	2006-08-16 00:02:36 +00:00
John Baldwin	f8f1f7fb85	Regen to propagate <prefix>_AUE_<mumble> changes as well as the earlier systrace changes.	2006-08-15 17:37:01 +00:00
John Baldwin	df78f6d313	- Remove unused sysvec variables from various syscalls.conf. - Send the systrace_args files for all the compat ABIs to /dev/null for now. Right now makesyscalls.sh generates a file with a hardcoded function name, so it wouldn't work for any of the ABIs anyway. Probably the function name should be configurable via a 'systracename' variable and the functions should be stored in a function pointer in the sysvec structure.	2006-08-15 17:25:55 +00:00
Alexander Leidinger	a43eeaabe4	Disable some parts of the code on amd64 for now to prevent a panic. A better fix will come later. Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-08-15 15:15:17 +00:00
Alexander Leidinger	9b44bfc556	Add the linux 2.6.x stuff (not used by default!): - TLS - complete - pid/tid mangling - complete - thread area - complete - futexes - complete with issues - clone() extension - complete with some possible minor issues - mq/timer/clock* stuff - complete but untested and the mq* stuff is disabled when not build as part of the kernel with native FreeBSD mq* support (module support for this will come later) Tested with: - linux-firefox - works, tested - linux-opera - works, tested - linux-realplay - doesnt work, issue with futexes - linux-skype - doesnt work, issue with futexes - linux-rt2-demo - works, tested - linux-acroread - doesnt work, unknown reason (coredump) and sometimes issue with futexes - various unix utilities in linux-base-gentoo3 and linux-base-fc4: everything tried worked On amd64 not everything is supported like on i386, the catchup is planned for later when the remaining bugs in the new functions are fixed. To test this new stuff, you have to run sysctl compat.linux.osrelease=2.6.16 to switch back use sysctl compat.linux.osrelease=2.4.2 Don't switch while running a linux program, strange things may or may not happen. Sponsored by: Google SoC 2006 Submitted by: rdivacky Some suggestions/help by: jhb, kib, manu@NetBSD.org, netchild	2006-08-15 12:54:30 +00:00
Alexander Leidinger	ad2056f2c4	Add some new files needed for linux 2.6.x compatibility. Please don't style(9) the NetBSD code, we want to stay in sync. Not imported on a vendor branch since we need local changes. Sponsored by: Google SoC 2006 Submitted by: rdivacky With help from: manu@NetBSD.org Obtained from: NetBSD (linux_{futex,time}.*)	2006-08-15 12:20:59 +00:00
Konstantin Belousov	1565bf54af	Lock the vnode around the call to VOP_GETATTR. Move the locked code and vn_fullpath (that calls malloc(..., M_WAITOK)) from under the vm object lock, since sleep is not allowed while holding the mutex. While there, wrap the VOP_GETATTR call with a conditional Giant acquire. Currently this is (almost) a noop because pseudofs is Giant-locked. Tested by: kris Approved by: pjd (mentor) MFC after: 2 weeks	2006-08-08 12:29:26 +00:00
Robert Watson	3d0685834f	With socket code no longer in svr4_stream.c, MAC includes are no longer required, so GC.	2006-08-05 22:04:21 +00:00
Brooks Davis	012759b743	Use TAILQ_EMPTY instead of checking if TAILQ_FIRST is NULL.	2006-08-04 21:15:09 +00:00
John Baldwin	91ce2694d1	Regen for MPSAFE flag removal.	2006-07-28 19:08:37 +00:00
John Baldwin	af5bf12239	Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to mark system calls as being MPSAFE: - Stop conditionally acquiring Giant around system call invocations. - Remove all of the 'M' prefixes from the master system call files. - Remove support for the 'M' prefix from the script that generates the syscall-related files from the master system call files. - Don't explicitly set SYF_MPSAFE when registering nfssvc.	2006-07-28 19:05:28 +00:00
John Baldwin	e0b4add8d8	Various fixes to comments in the syscall master files including removing cruft from the audit import and adding mention of COMPAT4 to freebsd32.	2006-07-28 18:55:18 +00:00
John Baldwin	78371ec202	Regen.	2006-07-28 16:56:44 +00:00
John Baldwin	95e7d19dfa	- Explicitly lock Giant to protect the fields in the svr4_strm structure except for s_family (which is read-only once after it is set when the structure is created). - Mark svr4_sys_ioctl(), svr4_sys_getmsg(), and svr4_sys_putmsg() MPSAFE.	2006-07-28 16:56:17 +00:00
John Baldwin	f30e89ced3	Fix a file descriptor race I reintroduced when I split accept1() up into kern_accept() and accept1(). If another thread closed the new file descriptor and the first thread later got an error trying to copyout the socket address, then it would attempt to close the wrong file object. To fix, add a struct file ** argument to kern_accept(). If it is non-NULL, then on success kern_accept() will store a pointer to the new file object there and not release any of the references. It is up to the calling code to drop the references appropriately (including a call to fdclose() in case of error to safely handle the aforementioned race). While I'm at it, go ahead and fix the svr4 streams code to not leak the accept fd if it gets an error trying to copyout the streams structures.	2006-07-27 19:54:41 +00:00
John Baldwin	3ce72960e8	Regen.	2006-07-21 20:41:33 +00:00
John Baldwin	e0569c0798	Clean up the svr4 socket cache and streams code some to make it more easily locked. - Move all the svr4 socket cache code into svr4_socket.c, specifically move svr4_delete_socket() over from streams.c. Make the socket cache entry structure and svr4_head private to svr4_socket.c as a result. - Add a mutex to protect the svr4 socket cache. - Change svr4_find_socket() to copy the sockaddr_un struct into a caller-supplied sockaddr_un rather than giving the caller a pointer to our internal one. This removes the one case where code outside of svr4_socket.c could access data in the cache. - Add an eventhandler for process_exit and process_exec to purge the cache of any entries for the exiting or execing process. - Add methods to init and destroy the socket cache and call them from the svr4 ABI module's event handler. - Conditionally grab Giant around socreate() in streamsopen(). - Use fdclose() instead of inlining it in streamsopen() when handling socreate() failure. - Only allocate a stream structure and attach it to a socket in streamsopen(). Previously, if a svr4 program performed a stream operation on an arbitrary socket not opened via the streams device, we would attach streams state data to it and change f_ops of the associated struct file while it was in use. The latter was especially not safe, and if a program wants a stream object it should open it via the streams device anyway. - Don't bother locking so_emuldata in the streams code now that we only touch it right after creating a socket (in streamsopen()) or when tearing it down when the file is closed. - Remove D_NEEDGIANT from the streams device as it is no longer needed.	2006-07-21 20:40:13 +00:00
John Baldwin	52d639a953	Add conditional VFS Giant locking to svr4_sys_fchroot() and mark it MPSAFE. Also, call change_dir() instead of doing part of it inline (this now adds a mac_check_vnode_chdir() call) to match fchdir() and call mac_check_vnode_chroot() to match chroot(). Also, use the change_root() function to do the actual change root to match chroot(). Reviewed by: rwatson	2006-07-21 20:28:56 +00:00
John Baldwin	b4c63329d5	- Pass the MPSAFE flag to namei() in linux_uselib() and handle conditional Giant VFS locking in that function. - Remove bogus code to handle the case where namei() returns success but a NULL vnode pointer. - Note that this code duplicates exec_check_permissions() and annotate where it differs. - Hold the vnode lock longer to protect the write to set VV_TEXT in v_vflag. - Mark linux_uselib() MPSAFE. Reviewed by: rwatson	2006-07-21 20:22:13 +00:00
John Baldwin	7cf6a457ea	Regen.	2006-07-19 19:03:21 +00:00
John Baldwin	a1ca3e0ba7	Add conditional VFS Giant locking to svr4_sys_resolvepath() and mark it MPSAFE.	2006-07-19 19:03:03 +00:00
John Baldwin	a3616b117c	Make svr4_sys_waitsys() a lot less ugly and mark it MPSAFE. - If the WNOWAIT flag isn't specified and either of WEXITED or WTRAPPED is set, then just call kern_wait() and let it do all the work. This means that this function no longer has to duplicate the work to teardown zombies that is done in kern_wait(). Instead, if the above conditions aren't true, then it uses a simpler loop to implement WNOWAIT and/or tracing for only stopped or continued processes. This function still has to duplicate code from kern_wait() for the latter two cases, but those are much simpler. - Sync the code to handle the WCONTINUED and WSTOPPED cases with the equivalent code in kern_wait(). - Fix several places that would return with the proctree lock still held. - Lock the current process to prevent lost wakeup races when blocking.	2006-07-19 19:01:10 +00:00
John Baldwin	b33887ea31	Don't free the sockaddr in kern_bind() and kern_connect() as not all callers pass a sockaddr allocated via malloc() from M_SONAME anymore. Instead, free it in the callers when necessary.	2006-07-19 18:28:52 +00:00
John Baldwin	a02f5c6204	Initialize svr4_head during MOD_LOAD rather than on demand.	2006-07-19 18:26:09 +00:00
David Xu	2df96d8e02	sync with master.	2006-07-14 01:57:09 +00:00
John Baldwin	90aff9de2d	Regen.	2006-07-11 20:55:23 +00:00
John Baldwin	be5747d5b5	- Add conditional VFS Giant locking to getdents_common() (linux ABIs), ibcs2_getdents(), ibcs2_read(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() similar to that in getdirentries(). - Mark ibcs2_getdents(), ibcs2_read(), linux_getdents(), linux_getdents64(), linux_readdir(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() MPSAFE.	2006-07-11 20:52:08 +00:00
John Baldwin	c870740e09	- Split out kern_accept(), kern_getpeername(), and kern_getsockname() for use by ABI emulators. - Alter the interface of kern_recvit() somewhat. Specifically, go ahead and hard code UIO_USERSPACE in the uio as that's what all the callers specify. In place, add a new uioseg to indicate what type of pointer is in mp->msg_name. Previously it was always a userland address, but ABI emulators may pass in kernel-side sockaddrs. Also, remove the namelenp field and instead require the two places that used it to explicitly copy mp->msg_namelen out to userland. - Use the patched kern_recvit() to replace svr4_recvit() and the stock kern_sendit() to replace svr4_sendit(). - Use kern_bind() instead of stackgap use in ti_bind(). - Use kern_getpeername() and kern_getsockname() instead of stackgap in svr4_stream_ti_ioctl(). - Use kern_connect() instead of stackgap in svr4_do_putmsg(). - Use kern_getpeername() and kern_accept() instead of stackgap in svr4_do_getmsg(). - Retire the stackgap from SVR4 compat as it is no longer used.	2006-07-10 21:38:17 +00:00
John Baldwin	acdd09f944	Unexpand PTRIN() in several places and fix one instance where 0 was being used instead of NULL.	2006-07-10 19:37:43 +00:00
John Baldwin	c1cccebe8b	Add a kern_close() so that the ABIs can close a file descriptor w/o having to populate a close_args struct and change some of the places that do.	2006-07-08 20:03:39 +00:00
John Baldwin	b1ee5b654d	Rework kern_semctl a bit to always assume the UIO_SYSSPACE case. This mostly consists of pushing a few copyin's and copyout's up into __semctl() as all the other callers were already doing the UIO_SYSSPACE case. This also changes kern_semctl() to set the return value in a passed in pointer to a register_t rather than td->td_retval[0] directly so that callers can only set td->td_retval[0] if all the various copyout's succeed. As a result of these changes, kern_semctl() no longer does copyin/copyout (except for GETALL/SETALL) so simplify the locking to acquire the semakptr mutex before the MAC check and hold it all the way until the end of the big switch statement. The GETALL/SETALL cases have to temporarily drop it while they do copyin/malloc and copyout. Also, simplify the SETALL case to remove handling for a non-existent race condition.	2006-07-08 19:51:38 +00:00
John Baldwin	ad6d226d43	- Protect the list of linux ioctl handlers with an sx lock. - Hold Giant while calling linux ioctl handlers for now as they aren't all known to be MPSAFE yet. - Mark linux_ioctl() MPSAFE.	2006-07-06 21:42:36 +00:00
John Baldwin	d699b1ce00	Don't try to copyin extra data for IPC_RMID requests to msgctl() or shmctl(). None of the other ABIs do this (including the native FreeBSD ABI), and uselessly trying to do a copyin() can actually result in a bogus EFAULT if a process specifies NULL for the optional argument (which is what they should do in this case).	2006-07-06 21:38:24 +00:00
Mark Murray	93c005929f	Housekeeping. Update for maintainers who have handed in their commit bits or (in my case) no longer feel that oversight is necessary.	2006-07-01 10:51:55 +00:00
Alexander Leidinger	550be19e16	Improve linprocfs to provide/fix the - process state (idle, sleeping, running, ...) [1] - the process group ID of the process which owns the connected tty - some page fault stats - time spent in kernel/userland - priority/nice value - starttime [1] - memory/swap stats - scheduling policy Additionally add some new fields and correct some not filled out ones. This brings us down to 15 dummy fields. The fields marked with [1] are needed to get Oracle 10 running. The starttime field is not completely right, since it displays the _same_ starttime for _every_ process, but at least it is not 0 and Oracle accepts this. This is a RELENG_x_y candidate. Noticed by: Dmitry Ganenko <dima@apk-inform.com> [1] Reviewed by: des, rdivacky MFC after: 1 week	2006-06-27 20:11:58 +00:00
John Baldwin	cec34dbf79	Regen.	2006-06-27 18:32:16 +00:00
John Baldwin	bb639715d4	Use kern_shmctl() in svr4_sys_shmctl() and drop use of the stackgap. Mark svr4_sys_shmctl() MPSAFE.	2006-06-27 18:31:36 +00:00
John Baldwin	4db580972e	Axe the stackgap macros as the Linux ABIs no longer use the stackgap.	2006-06-27 18:30:49 +00:00
John Baldwin	49d409a108	- Add a kern_semctl() helper function for __semctl(). It accepts a pointer to a copied-in copy of the 'union semun' and a uioseg to indicate which memory space the 'buf' pointer of the union points to. This is then used in linux_semctl() and svr4_sys_semctl() to eliminate use of the stackgap. - Mark linux_ipc() and svr4_sys_semsys() MPSAFE.	2006-06-27 18:28:50 +00:00
John Baldwin	0cceebeeb2	Regen.	2006-06-27 14:47:08 +00:00
John Baldwin	597d608f86	- Expand the scope of Giant some in mount(2) to protect the vfsp structure from going away. mount(2) is now MPSAFE. - Expand the scope of Giant some in unmount(2) to protect the mp structure (or rather, to handle concurrent unmount races) from going away. umount(2) is now MPSAFE, as well as linux_umount() and linux_oldumount(). - nmount(2) and linux_mount() were already MPSAFE.	2006-06-27 14:46:31 +00:00
John Baldwin	b820787fb3	Regen.	2006-06-26 18:37:36 +00:00
John Baldwin	b0f6106af9	Change svr4_sys_break() to just call obreak() and mark it MPSAFE. Not objected to by: alc	2006-06-26 18:36:57 +00:00
John Baldwin	04a8728231	- Sync with master: rmdir(), mkdir(), and extattr_*() are all MPSAFE. - freebsd32_utimes() is MPSAFE.	2006-06-26 18:35:57 +00:00
Alexander Leidinger	555f86b8b6	The linux times syscall can be called with a NULL pointer, so keep cool and don't panic. This fix is different from the patch submitted as it not only prevents a NULL-pointer dereference, but also skips some work in this case. Noticed by: Dmitry Ganenko <dima@apk-inform.com> Reviewed by: rdivacky (the original version as in emulation@) MFC after: 1 week Security: This is a RELENG_x_y candidate (local DoS). Go ahead by: secteam (cperciva)	2006-06-23 18:49:38 +00:00
Diomidis Spinellis	462da4d616	Move conditional preprocessing out of the SYSCTL_ADD_STRING macro invocation. Per C99 6.10.3 paragraph 11 preprocessing directives appearing inside macro arguments yield undefined behavior.	2006-06-22 13:11:36 +00:00
John Baldwin	62d615d508	Conditionally acquire Giant around VFS operations.	2006-06-20 21:31:38 +00:00
John Baldwin	932151064a	- Add a new linker_file_foreach() function that walks the list of linker file objects calling a user-specified predicate function on each object. The iteration terminates either when the entire list has been iterated over or the predicate function returns a non-zero value. linker_file_foreach() returns the value returned by the last invocation of the predicate function. It also accepts a void * context pointer that is passed to the predicate function as well. Using an iterator function avoids exposing linker internals to the rest of the kernel making locking simpler. - Use linker_file_foreach() instead of walking the list of linker files manually to lookup ndis files in ndis(4). - Use linker_file_foreach() to implement linker_hwpmc_list_objects().	2006-06-20 20:37:17 +00:00
John Baldwin	a6e25132d4	Forcefully turn off GPROF in this file if it is enabled as GPROF's attempt to use a macro for 'ret' doesn't play well with the wrappers trying to implement 'Pascal-style' calling conventions.	2006-06-12 20:35:59 +00:00
Dag-Erling Smørgrav	5ef57544fc	Add the model name, obtained from the hw.model sysctl variable. MFC after: 3 weeks	2006-06-12 18:14:49 +00:00
Paul Saab	e8b62ee79e	Do not copy out the iovec in the 32bit recvmsg call since soreceive calls uiomove directly. Reviewed by: ups MFC after: 1 week	2006-06-08 18:33:08 +00:00
Dag-Erling Smørgrav	b19bfd3db5	As far as I can tell, the correct CPU family for amd64 (which Linux calls x86_64) is 15, not 6. MFC after: 3 weeks	2006-06-02 13:01:25 +00:00
Doug Ambrisko	edb75eca27	Fix file leaking in translate_path_major_minor.	2006-05-16 17:57:00 +00:00
Poul-Henning Kamp	c40da00ca3	Since DELAY() was moved, most <machine/clock.h> #includes have been unnecessary.	2006-05-16 14:37:58 +00:00
John Baldwin	73dbd3da73	Remove various bits of conditional Alpha code and fixup a few comments.	2006-05-12 05:04:46 +00:00
Doug Ambrisko	0b1c233427	Remove the dependency on procfs since it isn't used. Noticed by: des	2006-05-11 15:27:58 +00:00
Alexander Leidinger	01e0ffbae8	Now that we don't have a linuxolator on alpha anymore: - unifdef __alpha__ - revert rev. 1.66 of linux_socket.c	2006-05-10 20:38:16 +00:00
Alexander Leidinger	17138b619c	Implement rt_sigpending in the linuxolator. PR: 92671 Submitted by: Markus Niemist"o <markus.niemisto@gmx.net>	2006-05-10 18:17:29 +00:00
Doug Ambrisko	32397ce071	Add in linsysfs. A linux 2.6 like sys filesystem to pacify the Linux LSI MegaRAID SAS utility. Sponsored by: IronPort Systems Man page help from: brueffer	2006-05-09 22:27:01 +00:00
Doug Ambrisko	03487601c2	Fix the duplicate cut-n-paste in linux_fstat64 pointed out by Alexander Leidinger. I forgot to fix it in this version.	2006-05-05 16:17:59 +00:00
Doug Ambrisko	060e488247	Enhance the Linux emulation layer to make the MegaRAID SAS management tool happy. Add back in a scheme to emulate old type major/minor numbers via hooks into stat, linprocfs to return major/minors that Linux apps expect. Currently only /dev/null is always registered. Drivers can register via the Linux type shim similar to the ioctl shim but by using linux_device_register_handler/linux_device_unregister_handler functions. The structure is: struct linux_device_handler { char *bsd_driver_name; char *linux_driver_name; char *bsd_device_name; char *linux_device_name; int linux_major; int linux_minor; int linux_char_device; }; Linprocfs uses this to display the major number of the driver. The soon to be available linsysfs will use it to fill in the driver name. Linux_stat uses it to translate the major/minor into Linux type values. Note major numbers are dynamically assigned via passing in a -1 for the major number so we don't need to keep track of them. This is somewhat needed due to us switching to our devfs. MegaCli will not run until I add in the linsysfs and mfi Linux compat changes. Sponsored by: IronPort Systems	2006-05-05 16:10:45 +00:00
Robert Watson	f7f45ac8e2	Annotate uses of fgetsock() with indications that they should rely on their existing file descriptor references to sockets, rather than use fgetsock() to retrieve a direct socket reference. MFC after: 3 months	2006-04-01 15:25:01 +00:00
Paul Saab	74f7258fb7	regen for 32bit System V shared memory	2006-03-30 07:43:01 +00:00
Paul Saab	fbb273bc05	Properly support FreeBSD 4 32bit System V shared memory. Submitted by: peter Obtained from: Yahoo! MFC after: 3 weeks	2006-03-30 07:42:32 +00:00
Tai-hwa Liang	d9d46ed258	Unbreaking build by removing a now unused variable.	2006-03-27 23:27:11 +00:00
John Baldwin	b77619bd7f	Use td_ucred rather than p_ucred to avoid panics and general unhappiness. Pointy hat to: netchild	2006-03-27 19:16:31 +00:00
Alexander Leidinger	1daa386fcf	Fix the LINT build on alpha: - rename some file local structure definitions, the names clash with autogenerated names - on !alpha add some compatibility defines for those renamed structures - make some functions globally visible on alpha	2006-03-21 21:56:04 +00:00
Alexander Leidinger	61da9d97fb	Fix tinderbox on alpha. Tested by: cross-compile	2006-03-20 19:46:56 +00:00
Ruslan Ermilov	aefce619cf	Unbreak COMPAT_LINUX32 option support on amd64. Broken by: netchild	2006-03-19 11:10:33 +00:00
Alexander Leidinger	d4a3f5ddb6	Fixup some problems in my previous commit (COMPAT_43). Pointyhat to: netchild	2006-03-18 20:47:36 +00:00
Alexander Leidinger	5c8919adf4	Get rid of the need of COMPAT_43 in the linuxolator. Submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz> Obtained from: DragonFly (some parts)	2006-03-18 18:20:17 +00:00
Stephan Uphoff	68ff3c2445	Fix exec_map resource leaks. Tested by: kris@	2006-03-08 20:21:54 +00:00
Paul Saab	6308f39da8	use strlcpy in cvtstatfs and copy_statfs instead of bcopy to ensure the copied strings are properly terminated. bzero the statfs32 struct in copy_statfs.	2006-03-04 00:09:09 +00:00
Paul Saab	26e4fb05dc	regen for 32bit sendfile	2006-02-28 19:39:52 +00:00
Paul Saab	fa545f434c	Fix 32bit sendfile by implementing kern_sendfile so that it takes the header and trailers as iovec arguments instead of copying them in inside of sendfile. Reviewed by: jhb MFC after: 3 weeks	2006-02-28 19:39:18 +00:00
John Baldwin	8917b8d28c	- Always call exec_free_args() in kern_execve() instead of doing it in all the callers if the exec either succeeds or fails early. - Move the code to call exit1() if the exec fails after the vmspace is gone to the bottom of kern_execve() to cut down on some code duplication.	2006-02-06 22:06:54 +00:00
Jeff Roberson	c4be19469a	- Remove ifdef disabled code that doesn't have a chance of working anymore.	2006-02-06 10:10:42 +00:00
Robert Watson	ef572cf5bb	Regenerate.	2006-02-04 13:29:09 +00:00
Robert Watson	2b8d08f814	Audit FreeBSD 32-bit system calls on 64-bit FreeBSD systems. Obtained from: TrustedBSD Project	2006-02-04 13:28:55 +00:00
Jeff Roberson	d6791f7615	- vn_lock with LK_RETRY can not return an error. The code that handled this case was not necessary. Sponsored by: Isilon Systems, Inc.	2006-01-30 08:22:56 +00:00
Olivier Houchard	d425dbec89	Fix a typo : deivce => device Spotted by: rwatson	2006-01-26 21:48:50 +00:00
Olivier Houchard	e83d253beb	Linux compat bits needed to make linux programs use the new ptys : linux_ioctl.[ch] : Implement LINUX_TIOCGPTN, which returns the pty number linux_stats.c : - Return the magic number for devfs. - In various stats()-related functions, check that we're stating a file in /dev/pts, and if so, change the st_rdev field to match what linux expects to be there for a slave pty device. The glibc checks for this, and their openpty() fails if it is not correct.	2006-01-26 01:32:46 +00:00
Doug Ambrisko	f06b864361	Fix the build. When I added the lutimes the futimes definitions went away in the generated files? This didn't happen on my amd64 test machine but did when I committed it on my other i386 machine. I need to figure this out since a regen on the amd64 doesn't fix it now. For now make the build work again. Matt caught this before my local mirror caught up.	2006-01-20 20:51:27 +00:00
Doug Ambrisko	cac2fa646c	Regen.	2006-01-20 16:22:37 +00:00
Doug Ambrisko	08a3081da8	Add 32bit version of lutimes so untar doesn't mess up sym-links on amd64.	2006-01-20 16:22:06 +00:00
Tom Rhodes	0e36e11d57	Cast tv_sec to intmax_t and print with %jd in some ifdef'ed code.	2005-12-28 07:08:54 +00:00
Gleb Smirnoff	3c6160327d	Add \n to log() message. Submitted by: Stanislaw Halik <weirdo tehran.lain.pl>	2005-12-27 00:17:11 +00:00
Maxim Sobolev	900b28f9f6	Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually allow executing elf dynamic binaries (aka shared libraries). When it is requested to execute ET_DYN elf image check if this flag is on after we know the elf brand allowing execution if so. PR: kern/87615 Submitted by: Marcin Koziej <creep@desk.pl>	2005-12-26 21:23:57 +00:00
Ruslan Ermilov	bebb4536ce	Regen.	2005-12-23 20:06:50 +00:00
Ruslan Ermilov	c647318411	Fix build.	2005-12-23 20:06:14 +00:00
Poul-Henning Kamp	25f6e35a05	Regenerate sysent with new abort2 system call. Implement abort2(const char *reason, int narg, void **args); Submitted by: "Wojciech A. Koszek" <dunstan@freebsd.czest.pl>	2005-12-23 11:58:42 +00:00
Poul-Henning Kamp	fe322ece24	Add missing 455-462 syscalls as unimplemented	2005-12-23 11:56:39 +00:00
Poul-Henning Kamp	5a56b437ec	Add abort2() systemcall.	2005-12-23 11:54:11 +00:00
John Baldwin	410d857972	Remove linux_mib_destroy() (which I actually added in between 5.0 and 5.1) which existed to cleanup the linux_osname mutex. Now that MTX_SYSINIT() has grown a SYSUNINIT to destroy mutexes on unload, the extra destroy here was redundant and resulted in panics in debug kernels. MFC after: 1 week Reported by: Goran Gajic ggajic at afrodita dot rcub dot bg dot ac dot yu	2005-12-15 16:30:41 +00:00
Xin LI	1278dd6847	In Linux, kernel parameters passed to ioctl are by value, while in FreeBSD they are passed by reference. Handle the difference within the linux_ioctl_termio on the LINUX_TCFLSH path. Submitted by: Jaroslav Drzik <jaro_AT_coop-voz_dot_sk>	2005-12-13 15:32:52 +00:00
Max Laier	2694019753	Fix calculation of meminfo's swaptotal and swapfree on at least amd64. MFC after: 3 days	2005-12-11 21:37:42 +00:00
Doug Ambrisko	204634a652	Regen for futimes.	2005-12-08 22:15:09 +00:00
Doug Ambrisko	8e7604db06	Add 32bit version of futimes so untar doesn't result in bad dates (Jan 1, 1970) when run on amd64. Reviewed by: ps	2005-12-08 22:14:25 +00:00
Gleb Smirnoff	7a14354549	Suppress logging about unimplemented syscalls to one time per process. This prevents hard flood of the system console. Reviewed by: bde	2005-12-08 13:33:57 +00:00
Peter Wemm	79880f7327	Catch up to the system siginfo changes. Use a union for the ia32 layout of siginfo just like the system one. There are now two fields to copy instead of one.	2005-12-06 23:06:29 +00:00
Ruslan Ermilov	f4e9888107	Fix -Wundef.	2005-12-04 02:12:43 +00:00
Craig Rodrigues	2207c7648e	Remove MNT_NODEV mount option. In RELENG_6, MNT_NODEV was a no-op. The presence of MNT_NODEV was confusing the am-utils autoconf scripts. PR: conf/79715	2005-11-29 00:28:17 +00:00
Bill Paul	f1b78ee016	Somehow memmove() got mapped to memset() in the patch table. Create a real memmove() implementation and use that instead.	2005-11-23 17:10:46 +00:00
Bill Paul	78edd540cf	Correct the API for Windows interrupt handling a little. The prototype for a Windows ISR is 'BOOLEAN isrfunc(KINTERRUPT *, void *)' meaning the ISR gets a pointer to the interrupt object and a context pointer, and returns TRUE if the ISR determines the interrupt was really generated by the associated device, or FALSE if not. I had mistakenly used 'void isrfunc(void *)' instead. It happens the only thing this affects is the internal ndis_intr() ISR in subr_ndis.c, but it should be fixed just in case we ever need to register a real Windows ISR via IoConnectInterrupt(). For NDIS miniports that provide a MiniportISR() method, the 'is_our_intr' value returned by the method serves as the return value from ndis_isr(), and 'call_isr' is used to decide whether or not to schedule the interrupt handler via DPC. For drivers that only supply MiniportEnableInterrupt() and MiniportDisableInterrupt() methods, call_isr is always TRUE and is_our_intr is always FALSE. In the end, there should be no functional changes, except that now ntoskrnl_intr() can terminate early once it finds the ISR that wants to service the interrupt.	2005-11-20 01:29:29 +00:00
Ruslan Ermilov	f95871b97b	Unlike the rest of the world, NDIS code can access "struct ifnet" before it has been fully initialized by if_attach(). Account for that to avoid a null pointer dereference.	2005-11-14 18:19:57 +00:00
Bill Paul	86a8393963	Restore backwards source compatibility with 6.x and 5.x.	2005-11-13 21:36:48 +00:00
Ruslan Ermilov	4a0d6638b3	- Store pointer to the link-level address right in "struct ifnet" rather than in ifindex_table[]; all (except one) accesses are through ifp anyway. IF_LLADDR() works faster, and all (except one) ifaddr_byindex() users were converted to use ifp->if_addr. - Stop storing a (pointer to) Ethernet address in "struct arpcom", and drop the IFP2ENADDR() macro; all users have been converted to use IF_LLADDR() instead.	2005-11-11 16:04:59 +00:00
Bill Paul	e73e17729b	Implement RtlZeroMemory() and RtlCopyMemory(). This seems to allow the Broadcom Win64 wireless driver for the BCM4318 to work on amd64.	2005-11-10 02:22:55 +00:00
Bill Paul	65983e40e1	Change the definition for EXT_NDIS to EXT_NET_DRV. Since the latest mbuf code changes, MEXTADD() can be used to add an external buffer with arbitrary type, but mb_ext_free() won't let you free it.	2005-11-07 16:57:14 +00:00
Bill Paul	b5b548a6bc	The latest version of the Intel 2200BG/2915ABG driver (9.0.0.3-9) from Intel's web site requires some minor tweaks to get it to work: - The driver seems to have been released with full WMI tracing enabled, and makes references to some WMI APIs, namely IoWMIRegistrationControl(), WmiQueryTraceInformation() and WmiTraceMessage(). Only the first one is ever called (during initialization). These have been implemented as do-nothing stubs for now. Also added a definition for STATUS_NOT_FOUND to ntoskrnl_var.h, which is used as a return code for one of the WMI routines. - The driver references KeRaiseIrqlToDpcLevel() and KeLowerIrql() (the latter as a function, which is unusual because normally KeLowerIrql() is a macro in the Windows DDK that calls KfLowerIrql()). I'm not sure why these are being called since they're not really part of WDM. Presumably they're being used for backwards compatibility with old versions of Windows. These have been implemented in subr_hal.c. (Note that they're _stdcall routines instead of _fastcall.) - When querying the OID_802_11_BSSID_LIST OID to get a BSSID list, you don't know ahead of time how many networks the NIC has found during scanning, so you're allowed to pass 0 as the list length. This should cause the driver to return an 'insufficient resources' error and set the length to indicate how many bytes are actually needed. However for some reason, the Intel driver does not honor this convention: if you give it a length of 0, it returns some other error and doesn't tell you how much space is really needed. To get around this, if using a length of 0 yields anything besides the expected error case, we arbitrarily assume a length of 64K. This is similar to the hack that wpa_supplicant uses when doing a BSSID list query.	2005-11-06 19:38:34 +00:00
Paul Saab	506df56c79	Copy out the number of iovecs in freebsd32_recvmsg, not the length of a single iovec.	2005-11-06 18:12:43 +00:00
Paul Saab	1471f287e1	Calling setrlimit from 32bit apps could potentially increase certain limits beyond what should be capable in a 32bit process, so we must fixup the limits. Reviewed by: jhb	2005-11-02 21:18:07 +00:00
Bill Paul	a91395a9d0	Tests with my dual Opteron system have shown that it's possible for code to start out on one CPU when thunking into Windows mode in ctxsw_utow(), and then be pre-empted and migrated to another CPU before thunking back to UNIX mode in ctxsw_wtou(). This is bad, because then we can end up looking at the wrong 'thread environment block' when trying to come back to UNIX mode. To avoid this, we now pin ourselves to the current CPU when thunking into Windows code. Few other cleanups, since I'm here: - Get rid of the ndis_isr(), ndis_enable_interrupt() and ndis_disable_interrupt() wrappers from kern_ndis.c and just invoke the miniport's methods directly in the interrupt handling routines in subr_ndis.c. We may as well lose the function call overhead, since we don't need to export these things outside of ndis.ko now anyway. - Remove call to ndis_enable_interrupt() from ndis_init() in if_ndis.c. We don't need to do it there anyway (the miniport init routine handles it, if needed). - Fix the logic in NdisWriteErrorLogEntry() a little. - Change some NDIS_STATUS_xxx codes in subr_ntoskrnl.c into STATUS_xxx codes. - Handle kthread_create() failure correctly in PsCreateSystemThread().	2005-11-02 18:01:04 +00:00
Andre Oppermann	34333b16cd	Retire MT_HEADER mbuf type and change its users to use MT_DATA. Having an additional MT_HEADER mbuf type is superfluous and redundant as nothing depends on it. It only adds a layer of confusion. The distinction between header mbuf's and data mbuf's is solely done through the m->m_flags M_PKTHDR flag. Non-native code is not changed in this commit. For compatibility MT_HEADER is mapped to MT_DATA. Sponsored by: TCP/IP Optimization Fundraise 2005	2005-11-02 13:46:32 +00:00
Bill Paul	fde84c1850	Clean up one remaining 'multiple DPC thread' bogon: only bzero() one sizeof(kq_queue), not sizeof(kq_queue) * mp_ncpus.	2005-11-01 09:24:35 +00:00
Paul Saab	ecc44de7a2	Reformat socket control messages on input/output for 32bit compatibility on 64bit systems. Submitted by: ps, ups Reviewed by: jhb	2005-10-31 21:09:56 +00:00
Peter Wemm	946bca4fcd	Regenerate (with the correct #ifdef COMPAT_43 tests now)	2005-10-26 22:21:03 +00:00
Peter Wemm	767dfc44be	There is no 'freebsd3_' prefix for COMPAT_43 syscalls. Those are all bundled under MCOMPAT and have an 'o' prefix. Adjust as appropriate. This re-enables compiling without COMPAT_43 again.	2005-10-26 22:19:51 +00:00
Bill Paul	4cf9a535a8	Minor nit: in ntoskrnl_finddev(), only free the 'children' device_t array if device_find_children() actually returned a non-NULL array pointer.	2005-10-26 20:21:45 +00:00
Bill Paul	51d6d0952b	Clean up and apply the fix for PR 83477. The calculation for locating the start of the section headers has to take into account the fact that the image_nt_header is really variable sized. It happens that the existing calculation is correct for _most_ production binaries produced by the Windows DDK, but if we get a binary with oddball offsets, the PE loader could crash. Changes from the supplied patch are: - We don't really need to use the IMAGE_SIZEOF_NT_HEADER() macro when computing how much of the header to return to callers of pe_get_optional_header(). While it's important to take the variable size of the header into account in other calculations, we never actually look at anything outside the non-variable portion of the header. This saves callers from having to allocate a variable sized buffer off the heap (I purposely tried to avoid using malloc() in subr_pe.c to make it easier to compile in both the -D_KERNEL and !-D_KERNEL case), and since we're copying into a buffer on the stack, we always have to copy the same amount of data or else we'll trash the stack something fierce. - We need <stddef.h> to get offsetof() in the !-D_KERNEL case. - ndiscvt.c needs the IMAGE_FIRST_SECTION() macro too, since it does a little bit of section pre-processing. PR: kern/83477	2005-10-26 18:46:27 +00:00
Bill Paul	7f3cc43211	Get rid of the timer tracking and reaping code in NdisMInitializeTimer() and ndis_halt_nic(). It's been disabled for some time anyway, and it turns out there's a possible deadlock in NdisMInitializeTimer() when acquiring the miniport block lock to modify the timer list: it's possible for a driver to call NdisMInitializeTimer() when the miniport block lock has already been acquired by an earlier piece of code. You can't acquire the same spinlock twice, so this can deadlock. Also, implement MmMapIoSpace() and MmUnmapIoSpace(), and make NdisMMapIoSpace() and NdisMUnmapIoSpace() use them. There are some drivers that want MmMapIoSpace() and MmUnmapIoSpace() so that they can map arbitrary register spaces not directly associated with their device resources. For example, there's an Atheros driver for a miniPci card (0x168C:0x1014) on the IBM Thinkpad x40 that wants to map some I/O spaces at 0xF00000 and 0xE00000 which are held by the acpi0 device. I don't know what it wants these ranges for, but if it can't map and access them, the MiniportInitialize() method fails.	2005-10-26 06:52:57 +00:00
Bill Paul	4ba4b2c45c	Fix handling of message table messages that got broken when I converted NdisWriteErrorLogEntry() to use the RtlXXX unicode/ansi conversion routines.	2005-10-24 05:05:09 +00:00
David E. O'Brien	2f3e5b2f15	Add a 'clean' target.	2005-10-23 23:58:23 +00:00
Paul Saab	90168b92f2	regen	2005-10-23 10:43:39 +00:00
Paul Saab	e7abd4a000	Implement for FreeBSD 3 32 binaries: sigaction, sigprocmask, sigpending, sigvec, sigblock, sigsetmask, sigsuspend, sigstack	2005-10-23 10:43:14 +00:00
Bill Paul	a50286e21d	Make the multiple DPC threads an option, and create only one by default. This avoids the need for sched_bind() in the default case so that you can start up the NDIS subsystem at boot time when only CPU 0 is running. There are potentially ways to fix it so that the DPC threads aren't started until after the other CPUs are launched, but doing it correctly is tricky. You need to defer the startup of the ntoskrnl subsystem (ntoskrnl_libinit()), not just defer ndis_attach(). For now, I don't think it will make much difference having just the single DPC thread (I started out with just one anyway). Note that this turns the KeSetTargetProcessorDpc() routine into a no-op, since the CPU number in struct kdpc is now ignored.	2005-10-22 05:15:20 +00:00
Bill Paul	87ff20ed78	Correct the macro definition for KeRaiseIrql(). The official API is KeRaiseIrql(newirql, &oldirql), not oldirql = KeRaiseIrql(newirql). (The macro ultimately translates to KfRaiseIrql() which does use the latter API, so this has no effect on generated code.) Also, wait for thread termination the right way: kthread_exit() will ultimately do a wakeup(td->td_proc). This is the event we should wait on. Eliminate the previous synchronization machinery for this since it was never guaranteed to work correctly.	2005-10-21 05:23:20 +00:00
Bill Paul	1e956d87e1	Use sched_bind() to make sure the DPC threads are bound to the correct processor, to ensure DPC thread 0 runs on CPU0, DPC thread 1 runs on CPU1, and so on. Elevate the priority of the workitem threads, though don't use as high a priority as the DPC threads.	2005-10-20 17:45:58 +00:00
David Xu	3e1c732ffa	Fix compiling problem by adding prefix name svr4 to si_xxx macro, the si_xxx macro should not be used in compat headers, as these are standard member names or only can be used in our native header file signal.h.	2005-10-19 09:33:15 +00:00
Bill Paul	a3ced67adf	Another round of cleanups and fixes: - Change ndis_return() from a DPC to a workitem so that it doesn't run at DISPATCH_LEVEL (with the dispatcher lock held). - In if_ndis.c, submit packets to the stack via (*ifp->if_input)() in a workitem instead of doing it directly in ndis_rxeof(), because ndis_rxeof() runs in a DPC, and hence at DISPATCH_LEVEL. This implies that the 'dispatch level' mutex for the current CPU is being held, and we don't want to call if_input while holding any locks. - Reimplement IoConnectInterrupt()/IoDisconnectInterrupt(). The original approach I used to track down the interrupt resource (by scanning the device tree starting at the nexus) is prone to problems when two devices share an interrupt. (E.g removing ndis1 might disable interrupts for ndis0.) The new approach is to multiplex all the NDIS interrupts through a common internal dispatcher (ntoskrnl_intr()) and allow IoConnectInterrupt()/IoDisconnectInterrupt() to add or remove interrupts from the dispatch list. - Implement KeAcquireInterruptSpinLock() and KeReleaseInterruptSpinLock(). - Change the DPC and workitem threads to use the KeXXXSpinLock API instead of mtx_lock_spin()/mtx_unlock_spin(). - Simplify the NdisXXXPacket routines by creating an actual packet pool structure and using the InterlockedSList routines to manage the packet queue. - Only honor the value returned by OID_GEN_MAXIMUM_SEND_PACKETS for serialized drivers. For deserialized drivers, we now create a packet array of 64 entries. (The Microsoft DDK documentation says that for deserialized miniports, OID_GEN_MAXIMUM_SEND_PACKETS is ignored, and the driver for the Marvell 8335 chip, which is a deserialized miniport, returns 1 when queried.) - Clean up timer handling in subr_ntoskrnl. 
- Add the following conditional debugging code: NTOSKRNL_DEBUG_TIMERS - add debugging and stats for timers; NDIS_DEBUG_PACKETS - add extra sanity checking for NdisXXXPacket API; NTOSKRNL_DEBUG_SPINLOCKS - add test for spinning too long. - In kern_ndis.c, always start the HAL first and shut it down last, since Windows spinlocks depend on it. Ntoskrnl should similarly be started second and shut down next to last.	2005-10-18 19:52:15 +00:00
Paul Saab	15857ef5ea	regen after recvmsg, recvfrom, sendmsg	2005-10-15 05:57:34 +00:00
Paul Saab	a372f8224c	Implement the 32bit versions of recvmsg, recvfrom, sendmsg Partially obtained from: jhb	2005-10-15 05:57:06 +00:00
Paul Saab	fd151bb940	regen for clock_gettime, clock_settime, clock_getres	2005-10-15 02:54:39 +00:00
Paul Saab	f0b479cd75	Implement 32bit wrappers for clock_gettime, clock_settime, and clock_getres.	2005-10-15 02:54:18 +00:00
Paul Saab	145f7e60da	regen	2005-10-15 02:40:34 +00:00
Paul Saab	d5c7796115	Correct the prototype for freebsd32_nanosleep and use the proper size when copying struct timespec32 in and out.	2005-10-15 02:40:10 +00:00
David Xu	9104847f21	1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most changes in MD code are trivial, before this change, trapsignal and sendsig use discrete parameters, now they use member fields of ksiginfo_t structure. For sendsig, this change allows us to pass POSIX realtime signal value to user code. 2. Remove cpu_thread_siginfo, it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread. 3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc. 4. Add td_sigqueue to thread structure to hold all signals delivered to thread. 5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed. 6. In this sigqueue implementation, pending signal set is kept as before, an extra siginfo list holds additional siginfo_t data for signals. Kernel code that uses psignal() still behaves as before and won't fail even under memory pressure; the only exception is when deleting a signal, we should call sigqueue_delete to remove signal from sigqueue but not SIGDELSET. Currently no kernel code delivers a signal with additional data, so kernel should be as stable as before, a ksiginfo can carry more information, for example, allow signal to be delivered but throw away siginfo data if memory is not enough. SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can not be caught or masked. The sigqueue() syscall allows user code to queue a signal to target process, if resource is unavailable, EAGAIN will be returned as the specification says. Just before thread exits, signal queue memory will be freed by sigqueue_flush. Currently, all signals are allowed to be queued, not only realtime signals. Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64	2005-10-14 12:43:47 +00:00
Bill Paul	85c13a8375	Convert ndis_set_info() and ndis_get_info() from using msleep() to KeSetEvent()/KeWaitForSingleObject(). Also make object argument of KeWaitForSingleObject() a void * like it's supposed to be.	2005-10-12 03:02:50 +00:00
Bill Paul	21628ddbd6	This commit makes a big round of updates and fixes many, many things. First and most importantly, I threw out the thread priority-twiddling implementation of KeRaiseIrql()/KeLowerIrql()/KeGetCurrentIrql() in favor of a new scheme that uses sleep mutexes. The old scheme was really very naughty and sought to provide the same behavior as Windows spinlocks (i.e. blocking pre-emption) but in a way that wouldn't raise the ire of WITNESS. The new scheme represents 'DISPATCH_LEVEL' as the acquisition of a per-cpu sleep mutex. If a thread on cpu0 acquires the 'dispatcher mutex,' it will block any other thread on the same processor that tries to acquire it, in effect only allowing one thread on the processor to be at 'DISPATCH_LEVEL' at any given time. It can then do the 'atomic sit and spin' routine on the spinlock variable itself. If a thread on cpu1 wants to acquire the same spinlock, it acquires the 'dispatcher mutex' for cpu1 and then it too does an atomic sit and spin to try acquiring the spinlock. Unlike real spinlocks, this does not disable pre-emption of all threads on the CPU, but it does put any threads involved with the NDISulator to sleep, which is just as good for our purposes. This means I can now play nice with WITNESS, and I can safely do things like call malloc() when I'm at 'DISPATCH_LEVEL,' which you're allowed to do in Windows. Next, I completely re-wrote most of the event/timer/mutex handling and wait code. KeWaitForSingleObject() and KeWaitForMultipleObjects() have been re-written to use condition variables instead of msleep(). This allows us to use the Windows convention whereby thread A can tell thread B "wake up with a boosted priority." (With msleep(), you instead have thread B saying "when I get woken up, I'll use this priority here," and thread A can't tell it to do otherwise.) The new KeWaitForMultipleObjects() has been better tested and better duplicates the semantics of its Windows counterpart. 
I also overhauled the IoQueueWorkItem() API and underlying code. Like KeInsertQueueDpc(), IoQueueWorkItem() must ensure that the same work item isn't put on the queue twice. ExQueueWorkItem(), which in my implementation is built on top of IoQueueWorkItem(), was also modified to perform a similar test. I renamed the doubly-linked list macros to give them the same names as their Windows counterparts and fixed RemoveListTail() and RemoveListHead() so they properly return the removed item. I also corrected the list handling code in ntoskrnl_dpc_thread() and ntoskrnl_workitem_thread(). I realized that the original logic did not correctly handle the case where a DPC callout tries to queue up another DPC. It works correctly now. I implemented IoConnectInterrupt() and IoDisconnectInterrupt() and modified NdisMRegisterInterrupt() and NdisMDisconnectInterrupt() to use them. I also tried to duplicate the interrupt handling scheme used in Windows. The interrupt handling is now internal to ndis.ko, and the ndis_intr() function has been removed from if_ndis.c. (In the USB case, interrupt handling isn't needed in if_ndis.c anyway.) NdisMSleep() has been rewritten to use a KeWaitForSingleObject() and a KeTimer, which is how it works in Windows. (This is mainly to ensure that the NDISulator uses the KeTimer API so I can spot any problems with it that may arise.) KeCancelTimer() has been changed so that it only cancels timers, and does not attempt to cancel a DPC if the timer managed to fire and queue one up before KeCancelTimer() was called. The Windows DDK documentation seems to imply that KeCancelTimer() will also call KeRemoveQueueDpc() if necessary, but it really doesn't. The KeTimer implementation has been rewritten to use the callout API directly instead of timeout()/untimeout(). I still cheat a little in that I have to manage my own small callout timer wheel, but the timer code works more smoothly now. 
I discovered a race condition using timeout()/untimeout() with periodic timers where untimeout() fails to actually cancel a timer. I don't quite understand where the race is, but using callout_init()/callout_reset()/callout_stop() directly seems to fix it. I also discovered and fixed a bug in winx32_wrap.S related to translating _stdcall calls. There are a couple of routines (i.e. the 64-bit arithmetic intrinsics in subr_ntoskrnl) that return 64-bit quantities. On the x86 arch, 64-bit values are returned in the %eax and %edx registers. However, it happens that the ctxsw_utow() routine uses %edx as a scratch register, and x86_stdcall_wrap() and x86_stdcall_call() were only preserving %eax before branching to ctxsw_utow(). This means %edx was getting clobbered in some cases. Curiously, the most noticeable effect of this bug is that the driver for the TI AXC110 chipset would constantly drop and reacquire its link for no apparent reason. Both %eax and %edx are preserved on the stack now. The _fastcall and _regparm wrappers already handled everything correctly. I changed if_ndis to use IoAllocateWorkItem() and IoQueueWorkItem() instead of the NdisScheduleWorkItem() API. This is to avoid possible deadlocks with any drivers that use NdisScheduleWorkItem() themselves. The unicode/ansi conversion handling code has been cleaned up. The internal routines have been moved to subr_ntoskrnl and the RtlXXX routines have been exported so that subr_ndis can call them. This removes the incestuous relationship between the two modules regarding this code and fixes the implementation so that it honors the 'maxlen' fields correctly. (Previously it was possible for NdisUnicodeStringToAnsiString() to possibly clobber memory it didn't own, which was causing many mysterious crashes in the Marvell 8335 driver.) 
The registry handling code (NdisOpen/Close/ReadConfiguration()) has been fixed to allocate memory for all the parameters it hands out to callers and delete them when NdisCloseConfiguration() is called. (Previously, it would secretly use a single static buffer.) I also substantially updated if_ndis so that the source can now be built on FreeBSD 7, 6 and 5 without any changes. On FreeBSD 5, only WEP support is enabled. On FreeBSD 6 and 7, WPA-PSK support is enabled. The original WPA code has been updated to fit in more cleanly with the net80211 API, and to eliminate the use of magic numbers. The ndis_80211_setstate() routine now sets a default authmode of OPEN and initializes the RTS threshold and fragmentation threshold. The WPA routines were changed so that the authentication mode is always set first, followed by the cipher. Some drivers depend on the operations being performed in this order. I also added passthrough ioctls that allow application code to directly call the MiniportSetInformation()/MiniportQueryInformation() methods via ndis_set_info() and ndis_get_info(). The ndis_linksts() routine also caches the last 4 events signalled by the driver via NdisMIndicateStatus(), and they can be queried by an application via a separate ioctl. This is done to allow wpa_supplicant to directly program the various crypto and key management options in the driver, allowing things like WPA2 support to work. Whew.	2005-10-10 16:46:39 +00:00
John Baldwin	f2107e8d54	Use the constants for the syscall names from syscall.h rather than hardcoding the numbers for the SYSVIPC syscalls.	2005-10-03 18:34:17 +00:00
Robert Watson	5f419982c2	Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57, osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60, svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81, svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55, svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10, ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58, unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133: Now that Giant is acquired in uprintf() and tprintf(), the caller no longer needs to acquire Giant unless it also holds another mutex that would generate a lock order reversal when calling into these functions. Specifically not backed out is the acquisition of Giant in nfs_socket.c and rpcclnt.c, where local mutexes are held and would otherwise violate the lock order with Giant. This aligns this code more with the eventual locking of ttys. Suggested by: bde	2005-09-28 07:03:03 +00:00
Peter Wemm	a11ea6e325	Regenerate	2005-09-27 18:04:52 +00:00
Peter Wemm	add121a476	Implement 32 bit getcontext/setcontext/swapcontext on amd64. I've added stubs for ia64 to keep it compiling. These are used by 32 bit apps such as gdb.	2005-09-27 18:04:20 +00:00
Robert Watson	84d2b7df26	Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(), as they both interact with the tty code (!MPSAFE) and may sleep if the tty buffer is full (per comment). Modify all consumers of uprintf() and tprintf() to hold Giant around calls into these functions. In most cases, this means adding an acquisition of Giant immediately around the function. In some cases (nfs_timer()), it means acquiring Giant higher up in the callout. With these changes, UFS no longer panics on SMP when either blocks are exhausted or inodes are exhausted under load due to races in the tty code when running without Giant. NB: Some reduction in calls to uprintf() in the svr4 code is probably desirable. NB: In the case of nfs_timer(), calling uprintf() while holding a mutex, or even in a callout at all, is a bad idea, and will generate warnings and potential upset. This needs to be fixed, but was a problem before this change. NB: uprintf()/tprintf() sleeping is generally a bad idea, as is having non-MPSAFE tty code. MFC after: 1 week	2005-09-19 16:51:43 +00:00
Andre Oppermann	e72b668b69	Test the mbuf flags against the correct constant. The previous version worked as intended but only by chance. MT_HEADER == M_PKTHDR == 0x2.	2005-08-30 16:21:51 +00:00
Xin LI	e68796868a	Fix kernel build. Reported by: tinderbox	2005-08-28 13:11:08 +00:00
Craig Rodrigues	8739cd44d0	Rewrite linux_ifconf() to be more like ifconf() in net/if.c so that we do not call uiomove() while IFNET_RLOCK() is held. This eliminates the witness warning: Calling uiomove() with the following non-sleepable locks held: exclusive sleep mutex ifnet r = 0 (0xc096dd60) locked @ /usr/src/sys/modules/linux/../../compat/linux/linux_ioctl.c:2170 MFC after: 2 days	2005-08-27 14:44:10 +00:00
Robert Watson	13f4c340ae	Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to ifnet.if_drv_flags. Device drivers are now responsible for synchronizing access to these flags, as they are in if_drv_flags. This helps prevent races between the network stack and device driver in maintaining the interface flags field. Many __FreeBSD__ and __FreeBSD_version checks maintained and continued; some less so. Reviewed by: pjd, bz MFC after: 7 days	2005-08-09 10:20:02 +00:00
John Baldwin	ec1f24a934	Add missing dependencies on the SYSVIPC modules.	2005-07-29 19:41:04 +00:00
John Baldwin	813a5e14ec	Move MODULE_DEPEND() statements for SYSVIPC dependencies to linux_ipc.c so that they aren't duplicated 3 times and are also in the same file as the code that depends on the SYSVIPC modules.	2005-07-29 19:40:39 +00:00
John Baldwin	ac5ee935dd	Regen.	2005-07-13 20:35:09 +00:00
John Baldwin	8683e7fdc1	Make a pass through all the compat ABIs synchronizing the MP safe flags with the master syscall table as well as marking several ABI wrapper functions safe. MFC after: 1 week	2005-07-13 20:32:42 +00:00
John Baldwin	6e9b02cf80	Regen.	2005-07-13 15:14:54 +00:00
John Baldwin	2773347338	- Stop hardcoding #define's for options and use the appropriate opt_foo.h headers instead. - Hook up the IPC SVR4 syscalls. MFC after: 3 days	2005-07-13 15:14:33 +00:00
John Baldwin	fa34d9b7a5	Wrap the ia64-specific freebsd32_mmap_partial() hack in Giant for now since it calls into VFS and VM. This makes the freebsd32_mmap() routine MP safe and the extra Giants here can be revisited later. Glanced at by: marcel MFC after: 3 days	2005-07-13 15:12:19 +00:00
John Baldwin	02295eedc7	Add Giant around linux_getcwd_common() in linux_getcwd(). Approved by: re (scottl)	2005-07-09 12:34:49 +00:00
John Baldwin	4641373fde	Add missing locking to linux_connect() so that it can be marked MP safe: - Conditionally grab Giant around the EISCONN hack at the end based on debug.mpsafenet. - Protect access to so_emuldata via SOCK_LOCK. Reviewed by: rwatson Approved by: re (scottl)	2005-07-09 12:26:22 +00:00
Roman Kurakin	fbb7165a4b	Use implicit type cast for ->k_lock to fix compilation of ndis as a part of the GENERIC kernel with INVARIANT* and WITNESS* turned off. (For non GENERIC kernel KTR and MUTEX_PROFILING should be also off). Submitted by: Eygene A. Ryabinkin <rea at rea dot mbslab dot kiae dot ru> Approved by: re (scottl) PR: 81767	2005-07-08 18:36:59 +00:00
John Baldwin	55522478e6	Lock Giant in svr4_add_socket() so that the various svr4_*stat() calls can be marked MP safe as this is the only part of them that is not already MP safe. Approved by: re (scottl)	2005-07-07 19:27:29 +00:00
John Baldwin	03badf38ab	Remove an unused syscallarg() macro leftover from this code's origins in NetBSD. Approved by: re (scottl)	2005-07-07 19:26:43 +00:00
John Baldwin	07fac65b15	Rototill this file so that it actually compiles. It doesn't do anything in the build still due to some #undef's in svr4.h, but if you hack around that and add some missing entries to syscalls.master, then this file will now compile. The changes involved proc -> thread, using FreeBSD syscall names instead of NetBSD, and axeing syscallarg() and retval arguments. Approved by: re (scottl)	2005-07-07 19:25:47 +00:00
John Baldwin	8d948cd1ec	Fix the computation of uptime for linux_sysinfo(). Before it was returning the uptime in seconds mod 60 which wasn't very useful. Approved by: re (scottl)	2005-07-07 19:17:55 +00:00
John Baldwin	9f3157a254	Regenerate. Approved by: re (scottl)	2005-07-07 18:20:38 +00:00
John Baldwin	bcd9e0dd20	- Add two new system calls: preadv() and pwritev() which are like readv() and writev() except that they take an additional offset argument and do not change the current file position. In SAT speak: preadv:readv::pread:read and pwritev:writev::pwrite:write. - Try to reduce code duplication some by merging most of the old kern_foov() and dofilefoo() functions into new dofilefoo() functions that are called by kern_foov() and kern_pfoov(). The non-v functions now all generate a simple uio on the stack from the passed in arguments and then call kern_foov(). For example, read() now just builds a uio and calls kern_readv() and pwrite() just builds a uio and calls kern_pwritev(). PR: kern/80362 Submitted by: Marc Olzheim marcolz at stack dot nl (1) Approved by: re (scottl) MFC after: 1 week	2005-07-07 18:17:55 +00:00
Peter Wemm	62919d788b	Jumbo-commit to enhance 32 bit application support on 64 bit kernels. This is good enough to be able to run a RELENG_4 gdb binary against a RELENG_4 application, along with various other tools (eg: 4.x gcore). We use this at work. ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace, procfs and core dumps. procfs_regs.c: vary the format of proc/XXX/regs depending on the client and target application. procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their sscanf fails. They expect an unsigned long. imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps. sys_process.c: handle 32 bit consumers debugging 32 bit targets. Note that 64 bit consumers can still debug 32 bit targets. IA64 has got stubs for ia32_reg.c. Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't implemented in the 32/64 wrapper yet. We also make a tiny patch to gdb to pacify it over conflicting formats of ld-elf.so.1. Approved by: re	2005-06-30 07:49:22 +00:00
John Baldwin	19042f9cce	- Change the commented out freebsd32_xxx() example to use kern_xxx() along with a single copyin() + translate and translate + copyout() rather than using the stackgap. - Remove implementation of the stackgap for freebsd32 since it is no longer used for that compat ABI. Approved by: re (scottl)	2005-06-29 15:16:20 +00:00
John Baldwin	de1c01ad37	Correct the amount of data to allocate in these local copies of exec_copyin_strings() to catch up to rev 1.266 of kern_exec.c. This fixes panics on amd64 with compat binaries since exec_free_args() was freeing more memory than these functions were allocating and the mismatch could cause memory to be freed out from under other concurrent execs. Approved by: re (scottl)	2005-06-24 17:41:28 +00:00
Pawel Jakub Dawidek	06a137780b	Actually only protect mount-point if security.jail.enforce_statfs is set to 2. If we don't return statistics about requested file systems, system tools may not work correctly or at all. Approved by: re (scottl)	2005-06-23 22:13:29 +00:00
Pawel Jakub Dawidek	3a996d6e91	Do not allocate memory based on an unchecked argument from userland. It can be used to panic the kernel by passing too large a value. Fix it by moving allocation and size verification into kern_getfsstat(). This even simplifies kern_getfsstat() consumers, but destroys symmetry - memory is allocated inside kern_getfsstat(), but has to be freed by the caller. Found by: FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/ Reported by: Peter Holm <peter@holm.cc>	2005-06-11 14:58:20 +00:00
Brooks Davis	fc74a9f93a	Stop embedding struct ifnet at the top of driver softcs. Instead the struct ifnet or the layer 2 common structure it was embedded in have been replaced with a struct ifnet pointer to be filled by a call to the new function, if_alloc(). The layer 2 common structure is also allocated via if_alloc() based on the interface type. It is hung off the new struct ifnet member, if_l2com. This change removes the size of these structures from the kernel ABI and will allow us to better manage them as interfaces come and go. Other changes of note: - Struct arpcom is no longer referenced in normal interface code. Instead the Ethernet address is accessed via the IFP2ENADDR() macro. To enforce this ac_enaddr has been renamed to _ac_enaddr. - The second argument to ether_ifattach is now always the mac address from driver private storage rather than sometimes being ac_enaddr. Reviewed by: sobomax, sam	2005-06-10 16:49:24 +00:00
Pawel Jakub Dawidek	820a0de9a9	Rename sysctl security.jail.getfsstatroot_only to security.jail.enforce_statfs and extend its functionality. Value and policy: 0 = show all mount-points without any restrictions; 1 = show only mount-points below jail's chroot and show only part of the mount-point's path (if jail's chroot directory is /jails/foo and mount-point is /jails/foo/usr/home only /usr/home will be shown); 2 = show only mount-point where jail's chroot directory is placed. Default value is 2. Discussed with: rwatson	2005-06-09 18:49:19 +00:00
Pawel Jakub Dawidek	13a82b9623	Avoid code duplication in several places by introducing a universal kern_getfsstat() function. Obtained from: jhb	2005-06-09 17:44:46 +00:00
Maxim Sobolev	bc165ab0fe	Properly convert FreeBSD priority values into Linux values in the getpriority(2) syscall. PR: kern/81951 Submitted by: Andriy Gapon <avg@icyb.net.ua>	2005-06-08 20:41:28 +00:00
Paul Saab	efe5becafa	Wrap copyin/copyout for kevent so the 32bit wrapper does not have to malloc nchanges * sizeof(struct kevent) AND/OR nevents * sizeof(struct kevent) on every syscall. Glanced at by: peter, jmg Obtained from: Yahoo! MFC after: 2 weeks	2005-06-03 23:15:01 +00:00
Robert Watson	3984b2328c	Rebuild generated system call definition files following the addition of the audit event field to the syscalls.master file format. Submitted by: wsalamon Obtained from: TrustedBSD Project	2005-05-30 15:20:21 +00:00
Robert Watson	f3596e3370	Introduce a new field in the syscalls.master file format to hold the audit event identifier associated with each system call, which will be stored by makesyscalls.sh in the sy_auevent field of struct sysent. For now, default the audit identifier on all system calls to AUE_NULL, but in the near future, other BSM event identifiers will be used. The mapping of system calls to event identifiers is many:one due to multiple system calls that map to the same end functionality across compatibility wrappers, ABI wrappers, etc. Submitted by: wsalamon Obtained from: TrustedBSD Project	2005-05-30 15:09:18 +00:00
Yoshihiro Takahashi	d4fcf3cba5	Remove bus_{mem,p}io.h and related code for a micro-optimization on i386 and amd64. The optimization is trivial on recent machines. Reviewed by: -arch (imp, marcel, dfr)	2005-05-29 04:42:30 +00:00
Pawel Jakub Dawidek	d0cad55da8	Remove (now) unused argument 'td' from bsd_to_linux_statfs().	2005-05-27 19:25:39 +00:00
Paul Saab	473dd55f2e	Copyout to userland if kern_sigaction succeeds	2005-05-24 17:52:14 +00:00
Pawel Jakub Dawidek	672d95c55d	The code is under '#ifdef not_that_way', but anyway: - Add missing prison_check_mount() check.	2005-05-22 22:30:31 +00:00
Pawel Jakub Dawidek	a0e96a49df	If we need to hide fsid, kern_statfs()/kern_fstatfs() will do it for us, so do not duplicate the code in cvtstatfs(). Note that we now need to clear fsid in freebsd4_getfsstat(). This moves all security related checks from functions like cvtstatfs() and makes it a bit easier to add more security related stuff (like statfs(2), etc. protection for jails).	2005-05-22 21:52:30 +00:00
Bill Paul	0b6c3bf1bc	Missed kern_windrv.c in the last checkin.	2005-05-20 04:01:36 +00:00
Bill Paul	450a94af7a	Deal with a few bootstrap issues: We can't call KeFlushQueuedDpcs() during bootstrap (cold == 1), since the flush operation sleeps to wait for completion, and we can't sleep here (clowns will eat us). On an i386 SMP system, if we're loaded/probed/attached during bootstrap, smp_rendezvous() won't run us anywhere except CPU 0 (since the other CPUs aren't launched until later), which means we won't be able to set up the GDTs anywhere except CPU 0. To deal with this case, ctxsw_utow() now checks to see if the TID for the current processor has been properly initialized and sets up the GDT for the current CPU if not. Lastly, in if_ndis.c:ndis_shutdown(), do an ndis_stop() to ensure we really halt the NIC and stop interrupts from happening. Note that loading a driver during bootstrap is, unfortunately, kind of a hit or miss sort of proposition. In Windows, the expectation is that by the time a given driver's MiniportInitialize() method is called, the system is already in 'multiuser' state, i.e. it's up and running enough to support all the stuff specified in the NDIS API, which includes the underlying OS-supplied facilities it implicitly depends on, such as having all CPUs running, having the DPC queues initialized, WorkItem threads running, etc. But in UNIX, a lot of that stuff won't work during bootstrap. This causes a problem since we need to call MiniportInitialize() at least once during ndis_attach() in order to find out what kind of NIC we have and learn its station address. What this means is that some cards just plain won't work right if you try to pre-load the driver along with the kernel: they'll only be probed/attached correctly if the driver is kldloaded _after_ the system has reached multiuser. I can't really think of a way around this that would still preserve the ability to use an NDIS device for diskless booting.	2005-05-20 04:00:50 +00:00
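
The preadv()/pwritev() entry above (bcd9e0dd20) describes calls that behave like readv()/writev() but take an explicit offset and leave the current file position alone. As a rough userland illustration only, and not the kernel code from that commit, the semantics can be observed through Python's `os.preadv` wrapper on a Unix-like system. The helper names `scatter_read_at` and `demo` are made up for this sketch.

```python
import os
import tempfile


def scatter_read_at(fd, sizes, offset):
    """Read consecutive chunks of the given sizes starting at `offset`.

    Uses os.preadv(), which fills several buffers in one call and does not
    move the descriptor's current file position.
    """
    buffers = [bytearray(n) for n in sizes]
    nread = os.preadv(fd, buffers, offset)
    return nread, [bytes(b) for b in buffers]


def demo():
    # Hypothetical demo: write 16 bytes, then read two chunks at offset 2.
    with tempfile.TemporaryFile() as f:
        f.write(b"0123456789abcdef")
        f.flush()
        fd = f.fileno()
        os.lseek(fd, 0, os.SEEK_SET)            # put the file position at 0
        before = os.lseek(fd, 0, os.SEEK_CUR)   # position before the read
        nread, chunks = scatter_read_at(fd, [4, 6], 2)
        after = os.lseek(fd, 0, os.SEEK_CUR)    # position after the read
        return nread, chunks, before, after


if __name__ == "__main__":
    print(demo())
```

The position reported before and after the call is the same, which is the difference from readv() that the commit message points out.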


1978 Commits