freebsd-skq

Author	SHA1	Message	Date
kib	d5407645c7	Implement compat32 for old lseek, for the a.out binaries on amd64.	2011-06-16 22:05:56 +00:00
netchild	428c437c67	Commit the missing linux_videdev2_compat.h (lost somewhere between commit tree patch generation -> successful compile tree build test -> commit). Pointy hat to: netchild	2011-05-04 13:09:20 +00:00
netchild	897499f709	Add FEATURE macros for v4l and v4l2 to the linuxulator. Suggested by: ae	2011-05-04 09:52:34 +00:00
netchild	939c0e0ef5	This is v4l2 support for the linuxulator. This allows to access FreeBSD native devices which support the v4l2 API from processes running within the linuxulator, e.g. skype or flash can access the multimedia/pwcbsd or multimedia/webcamd supplied drivers. Submitted by: nox MFC after: 1 month	2011-05-04 09:05:39 +00:00
netchild	5e0eb77d30	Fix typo in comment, improve comment.	2011-05-04 08:42:31 +00:00
netchild	991566d22b	Add explanation about the use-permission and FreeBSDify it.	2011-05-04 08:41:55 +00:00
netchild	443cec2596	Copy the v4l2 header unchanged from the vendor branch.	2011-05-04 08:31:58 +00:00
mdf	b0f8474766	Regen.	2011-04-18 16:32:47 +00:00
mdf	9c9a32d97b	Add the posix_fallocate(2) syscall. The default implementation in vop_stdallocate() is filesystem agnostic and will run as slow as a read/write loop in userspace; however, it serves to correctly implement the functionality for filesystems that do not implement a VOP_ALLOCATE. Note that __FreeBSD_version was already bumped today to 900036 for any ports which would like to use this function. Also reserve space in the syscall table for posix_fadvise(2). Reviewed by: -arch (previous version)	2011-04-18 16:32:22 +00:00
trasz	935751ee5c	Remove stray semicolon.	2011-04-10 10:15:49 +00:00
jkim	95c723445e	Use atomic load & store for TSC frequency. It may be overkill for amd64 but safer for i386 because it can be easily over 4 GHz now. Worse, it can be easily changed by user with 'machdep.tsc_freq' tunable (directly) or cpufreq(4) (indirectly). Note it is intentionally not used in performance critical paths to avoid performance regression (but we should, in theory). Alternatively, we may add "virtual TSC" with lower frequency if maximum frequency overflows 32 bits (and ignore possible incoherency as we do now).	2011-04-07 23:28:28 +00:00
trasz	92bec9b84c	Add accounting for most of the memory-related resources. Sponsored by: The FreeBSD Foundation Reviewed by: kib (earlier version)	2011-04-05 20:23:59 +00:00
kib	4999e59537	Implement compat32 shims for PCIOCGETCONF. There is a generic problem with the shims for ioctls that receive pointers to the usermode data areas in the data argument. We either have to modify the handler to accept UIO_USERSPACE/UIO_SYSSPACE indicator, or allocate and fill a usermode memory for data buffer in the host format. The change goes the second route, in particular because we do not need to modify the handler. Submitted by: John Wehle <john feith com> MFC after: 2 weeks	2011-04-02 16:02:25 +00:00
kib	72cac8e666	Provide the structures and ioctl number definition for handling PCIOCGETCONF compat32. Submitted by: John Wehle <john feith com> MFC after: 2 weeks	2011-04-02 15:47:23 +00:00
kib	e452eed194	Regen	2011-04-01 11:16:53 +00:00
kib	7c2eaa21fe	Add support for executing the FreeBSD 1/i386 a.out binaries on amd64. In particular: - implement compat shims for old stat(2) variants and ogetdirentries(2); - implement delivery of signals with ancient stack frame layout and corresponding sigreturn(2); - implement old getpagesize(2); - provide a user-mode trampoline and LDT call gate for lcall $7,$0; - port a.out image activator and connect it to the build as a module on amd64. The changes are hidden under COMPAT_43. MFC after: 1 month	2011-04-01 11:16:29 +00:00
avg	94ec7d2988	Revert r220032:linux compat: add SO_PASSCRED option with basic handling I have not properly thought through the commit. After r220031 (linux compat: improve and fix sendmsg/recvmsg compatibility) the basic handling for SO_PASSCRED is not sufficient as it breaks recvmsg functionality for SCM_CREDS messages because now we would need to handle sockcred data in addition to cmsgcred. And that is not implemented yet. Pointyhat to: avg	2011-03-31 08:14:51 +00:00
trasz	3adbd8337d	Regenerate.	2011-03-30 17:59:54 +00:00
trasz	2f99052d80	Add rctl. It's used by racct to take user-configurable actions based on the set of rules it maintains and the current resource usage. It also provides userland API to manage that ruleset. Sponsored by: The FreeBSD Foundation Reviewed by: kib (earlier version)	2011-03-30 17:48:15 +00:00
kib	dfc44ec337	Regen.	2011-03-30 14:46:55 +00:00
kib	5c02dd1e9a	Provide compat32 shims for kldstat(2). Requested and tested by: jpaetzel MFC after: 1 week	2011-03-30 14:46:12 +00:00
avg	df7a39b1d0	linux compat: add SO_PASSCRED option with basic handling This seems to have been a part of a bigger patch by dchagin that either hasn't been committed or was committed partially. Submitted by: dchagin, nox MFC after: 2 weeks	2011-03-26 11:25:36 +00:00
avg	a923658583	linux compat: improve and fix sendmsg/recvmsg compatibility - implement basic stubs for capget, capset, prctl PR_GET_KEEPCAPS and prctl PR_SET_KEEPCAPS. - add SCM_CREDS support to sendmsg and recvmsg - modify sendmsg to ignore control messages if not using UNIX domain sockets This should allow linux pulse audio daemon and client work on FreeBSD and interoperate with native counter-parts modulo the differences in pulseaudio versions. PR: kern/149168 Submitted by: John Wehle <john@feith.com> Reviewed by: netchild MFC after: 2 weeks	2011-03-26 11:05:53 +00:00
kib	948b7589fc	Implement compat32 MEMRANGE_GET and MEMRANGE_SET. This is needed to run 32bit Xorg server with VESA driver. Submitted by: John Wehle <john feith com> MFC after: 1 week	2011-03-25 11:52:31 +00:00
kib	0d40bf4b19	Fully emulate MDIOCLIST for compat32. MFC after: 1 week	2011-03-25 11:43:49 +00:00
kib	abbe29dccf	Remove unnecessary panics, that can be easily triggered by user. The copyin() function handles NULL as well as any other pointer. MFC after: 3 days	2011-03-25 11:05:28 +00:00
kib	1d38d5630c	Fix file leakage in the freebsd32_ioctl routines. Code inspection shows freebsd32_ioctl calls fget for a fd and calls a subroutine to handle each specific ioctl. It is expected that the subroutine will call fdrop when done. However many of the subroutines will exit out early if copyin encounters an error resulting in fdrop never being called. Submitted by: John Wehle <john feith com> MFC after: 3 days	2011-03-25 10:57:57 +00:00
jhb	c7ac62aecd	Fix some locking nits with the p_state field of struct proc: - Hold the proc lock while changing the state from PRS_NEW to PRS_NORMAL in fork to honor the locking requirements. While here, expand the scope of the PROC_LOCK() on the new process (p2) to avoid some LORs. Previously the code was locking the new child process (p2) after it had locked the parent process (p1). However, when locking two processes, the safe order is to lock the child first, then the parent. - Fix various places that were checking p_state against PRS_NEW without having the process locked to use PROC_LOCK(). Every place was already locking the process, just after the PRS_NEW check. - Remove or reduce the use of PROC_SLOCK() for places that were checking p_state against PRS_NEW. The PROC_LOCK() alone is sufficient for reading the current state. - Reorder fill_kinfo_proc() slightly so it only acquires PROC_SLOCK() once. MFC after: 1 week	2011-03-24 18:40:11 +00:00
netchild	2d4b88f638	Staticize functions which are not used somewhere else, move the corresponding prototypes from the header to the code file.	2011-03-15 13:40:47 +00:00
avg	5a2a285ac9	add DTrace systrace support for linux32 and freebsd32 on amd64 syscalls Regenerate system call and systrace support files. PR: kern/152822 Submitted by: Artem Belevich <fbsdlist@src.cx> Reviewed by: jhb (earlier version) MFC after: 3 weeks	2011-03-12 08:58:19 +00:00
avg	666906fcd7	add DTrace systrace support for linux32 and freebsd32 on amd64 syscalls This commit makes necessary changes in syscall/sysent generation infrastructure. PR: kern/152822 Submitted by: Artem Belevich <fbsdlist@src.cx> Reviewed by: jhb (earlier version) MFC after: 3 weeks	2011-03-12 08:51:43 +00:00
dchagin	1878bfa86d	Style(9) fixes. No functional changes. MFC after: 2 Week	2011-03-12 07:47:05 +00:00
jhb	faa7c47cee	Remove now-obsolete comment. Submitted by: netchild MFC after: 1 week	2011-03-10 19:50:12 +00:00
jkim	a023421583	Remove custom interrupt dispatcher. This is a pointless micro-optimization and it may cause problems if SS and SP are modified by real-mode code. MFC after: 1 month	2011-03-09 16:16:38 +00:00
dchagin	83dbcc9afd	Indeed, remove bogus since r219405 check of the Linux ABI. Pointed out: jhb MFC after: 2 Week	2011-03-09 05:59:33 +00:00
dchagin	69b8756d3d	Extend struct sysvec with new method sv_schedtail, which is used for an explicit process at fork trampoline path instead of eventhandler(schedtail) invocation for each child process. Remove eventhandler(schedtail) code and change linux ABI to use newly added sysvec method. While here replace explicit comparing of module sysentvec structure with the newly created process sysentvec to detect the linux ABI. Discussed with: kib MFC after: 2 Week	2011-03-08 19:01:45 +00:00
trasz	1618438630	Export login class information via kinfo and make it possible to view it using "ps -o class".	2011-03-05 14:41:49 +00:00
trasz	0525662d59	Regenerate.	2011-03-05 12:46:24 +00:00
trasz	62f6a13e39	Add two new system calls, setloginclass(2) and getloginclass(2). This makes it possible for the kernel to track login class the process is assigned to, which is required for RCTL. This change also make setusercontext(3) call setloginclass(2) and makes it possible to retrieve current login class using id(1). Reviewed by: kib (as part of a larger patch)	2011-03-05 12:40:35 +00:00
dchagin	75c35db69e	Print out shared flag for debug purpose. MFC after: 1 Week	2011-03-03 18:29:55 +00:00
dchagin	a583811c0a	Switch PROCESS_SHARE to AUTO_SHARE (as umtx do). Even for SHARED, if page mapped MAP_ANON linux uses private algorithm too. Discussed with: jhb MFC after: 3 Days	2011-03-03 18:19:10 +00:00
rwatson	9c02915234	Regenerate system call files following addition of cap_enter(2), cap_getmode(2), and capabilities.conf. Reviewed by: anderson Discussed with: benl, kris, pjd Obtained from: Capsicum Project Sponsored by: Google, Inc. MFC after: 3 months	2011-03-01 13:30:23 +00:00
rwatson	6894aabcb5	Add initial support for Capsicum's Capability Mode to the FreeBSD kernel, compiled conditionally on options CAPABILITIES: Add a new credential flag, CRED_FLAG_CAPMODE, which indicates that a subject (typically a process) is in capability mode. Add two new system calls, cap_enter(2) and cap_getmode(2), which allow setting and querying (but never clearing) the flag. Export the capability mode flag via process information sysctls. Sponsored by: Google, Inc. Reviewed by: anderson Discussed with: benl, kris, pjd Obtained from: Capsicum Project MFC after: 3 months	2011-03-01 13:23:37 +00:00
brucec	17108b16ca	Use the cprd_mem field when setting the start and length for a memory resource - the layout of cprd_port is identical but using cprd_mem makes the code easier to understand. PR: kern/118493 Submitted by: Weongyo Jeong <weongyo.jeong at gmail.com> MFC after: 3 days	2011-02-23 21:45:28 +00:00
jhb	6e95ae3692	Use umtx_key objects to uniquely identify futexes. Private futexes in different processes that happen to use the same user address in the separate processes will now be treated as distinct futexes rather than the same futex. We can now honor shared futexes properly by mapping them to a PROCESS_SHARED umtx_key. Private futexes use THREAD_SHARED umtx_key objects. In conjunction with: dchagin Reviewed by: kib MFC after: 1 week	2011-02-23 13:23:28 +00:00
brucec	6d9b42b486	Fix typos - remove duplicate "the". PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-21 09:01:34 +00:00
dchagin	7309a9eb12	Do not clobber %rdx. Before calling vfork() syscall the linux user-space stores the current PID in the %rdx and restore it when the parent process will leave the kernel.	2011-02-20 07:58:30 +00:00
dchagin	be13e396c9	For realtime signals fill the sigval value.	2011-02-15 21:46:36 +00:00
dchagin	77c093e6c2	Make the linux_rt_sigtimedwait() system call actually work. 1) Translate the native signal number in the appropriate Linux signal. 2) Remove bogus code, which can lead to a panic as it calls kern_sigtimedwait with same ksiginfo. 3) Return the corresponding signal number.	2011-02-15 21:42:48 +00:00
dchagin	ee445bda81	Style(9) fix. Wrap long lines in linux_rt_sigtimedwait().	2011-02-15 21:24:50 +00:00
dchagin	5a6a633611	Put the macro declaration in the relevant include file for future use.	2011-02-15 21:22:09 +00:00
dchagin	efb892cc10	Style(9) fix. Do not initialize variables in the declarations.	2011-02-14 17:24:58 +00:00
dchagin	185c49188e	Sort include files in the alphabetical order.	2011-02-13 20:07:48 +00:00
dchagin	bc6bf7dad6	Remove comment about 'ftlk' LOR.	2011-02-13 18:46:34 +00:00
dchagin	d9864e4484	Stop printing the LOR, as this is expected behavior.	2011-02-13 18:41:40 +00:00
dchagin	5cfdaf1c59	The bitset field of freshly created futex should be initialized explicitly. Otherwise, REQUEUE operations fail.	2011-02-13 17:56:22 +00:00
dchagin	c26a933750	Rename used_requeue and use it as bitwise field to store more flags. Reimplement used_requeue logic with LINUX_XDEPR_REQUEUEOP flag.	2011-02-12 20:58:59 +00:00
dchagin	bd20f71621	Slightly rewrite linux_fork: 1) Remove bogus error checking. 2) A new process exit from kernel through fork_trampoline(), so remove bogus check.	2011-02-12 20:16:25 +00:00
dchagin	9cf49ae032	Remove bogus include <machine/frame.h>	2011-02-12 19:14:57 +00:00
dchagin	9f708ad0aa	Move linux_clone(), linux_fork(), linux_vfork() to a MI path.	2011-02-12 18:17:12 +00:00
netchild	35ebea3dcf	Linux' shm_open() fails because it wants to find some funky shmfs to construct the full pathname. It starts to search at the default mountpoint which is /dev/shm. If this fails it runs through fstab and searches for shmfs and tmpfs. Whatever it finds will be statfs()'ed to be checked for Linux' fs magic for shmfs (0x01021994). Ideally our tmpfs should deliver this fs magic to Linux processes, but as our tmpfs is considered to be an experimental feature we can not assume that there is always a tmpfs available. To make shared memory work in the Linuxulator, force the fs type of /dev/shm (which can be a symlink) to match what Linux expects. The user is responsible (info has to be added to the linux base ports and the docs) to setup a suitable link for /dev/shm. Noticed by: Andre Albsmeier <Andre.Albsmeier@siemens.com> Submitted by: Andre Albsmeier <Andre.Albsmeier@siemens.com> MFC after: 1 month	2011-02-09 20:23:22 +00:00
dchagin	633a205e26	Yet another unimplemented futex operation, print out about. Submitted by: arundel MFC after: 1 month.	2011-01-31 06:06:23 +00:00
dchagin	6570332d31	Implement a futex BITSET op. Submitted by: arundel MFC after: 1 month.	2011-01-31 05:59:05 +00:00
bz	126cbabd2e	Update interface stats counters to match the current format in linux and try to export as much information as we can match. Requested on: Debian GNU/kFreeBSD list (debian-bsd lists.debian.org) 2010-12 Tested by: Mats Erik Andersson (mats.andersson gisladisker.se) MFC after: 10 days	2011-01-31 00:09:52 +00:00
dchagin	e85dbed4b7	Style(9) fixes. MFC after: 1 Month.	2011-01-28 19:04:15 +00:00
dchagin	051ceeb5f3	Implement a variation of the linux_common_wait() which should be used by linuxolator itself. Move linux_wait4() to MD path as it requires native struct rusage translation to struct l_rusage on linux32/amd64. MFC after: 1 Month.	2011-01-28 18:47:07 +00:00
dchagin	2501c5eeee	Style(9) fix. MFC after: 1 month.	2011-01-28 05:42:14 +00:00
dchagin	1e124ec538	Add macro to test the sv_flags of any process. Change some places to test the flags instead of explicit comparing with address of known sysentvec structures. MFC after: 1 month	2011-01-26 20:03:58 +00:00
dchagin	72b8fc74b4	Style(9) fix. Approved by: kib(mentor) MFC after: 1 month	2011-01-23 09:50:39 +00:00
kib	71201deeff	In linuxolator getdents_common(), it seems there is no reason to loop if no records were returned by VOP_READDIR(). Readdir implementations are allowed to return 0 records when first record is larger than supplied buffer. In this case trying to execute VOP_READDIR() again causes the syscall looping forever. The goto was there from day 1, which goes back to 1995. Reported and tested by: Beat Gätzi <beat chruetertee ch> MFC after: 2 weeks	2011-01-19 12:19:25 +00:00
mdf	caeebe8c54	Fix a few more SYSCTL_PROC() that were missing a CTLFLAG type specifier.	2011-01-19 00:57:58 +00:00
kib	06eb8de1e1	Create shared (readonly) page. Each ABI may specify the use of page by setting SV_SHP flag and providing pointer to the vm object and mapping address. Provide simple allocator to carve space in the page, tailored to put the code with alignment restrictions. Enable shared page use for amd64, both native and 32bit FreeBSD binaries. Page is private mapped at the top of the user address space, moving a start of the stack one page down. Move signal trampoline code from the top of the stack to the shared page. Reviewed by: alc	2011-01-08 16:13:44 +00:00
scf	3e36dcd710	Fix the LINUX_SOUND_MIXER_INFO ioctl to return success after the information is set to FreeBSD. It had been falling through to the end of linux_ioctl_sound() and returning ENOIOCTL. Noticed when running the Linux ALSA amixer tool. Add a LINUX_SOUND_MIXER_READ_CAPS ioctl which is used by the Skype v2.1.0.81 binary. Reviewed by: gavin MFC after: 2 weeks	2010-12-30 02:18:04 +00:00
tijl	0f810ef0a2	Merge amd64 and i386 bus.h and move the resulting header to x86. Replace the original amd64 and i386 headers with stubs. Rename (AMD64\|I386)_BUS_SPACE_* to X86_BUS_SPACE_* everywhere. Reviewed by: imp (previous version), jhb Approved by: kib (mentor)	2010-12-20 16:39:43 +00:00
kib	3d12be787f	Restore the ABI of struct kinfo_proc32 after r213536. MFC after: 3 days	2010-12-19 21:18:33 +00:00
bschmidt	bd1f37ab17	Implement NdisGetRoutineAddress and MmGetSystemRoutineAddress used in newer Ralink drivers. Submitted by: Paul B Mahol <onemda at gmail.com>	2010-12-06 20:54:53 +00:00
bschmidt	2fe8bd5bcc	Add a dummy for IoOpenDeviceRegistryKey(). With that change the Atheros 9xxx driver is actually usable and does not panic anymore. Submitted by: Paul B Mahol <onemda at gmail.com> MFC after: 2 weeks	2010-11-29 10:21:45 +00:00
bschmidt	a055d1840e	Some drivers rely on the existence of certain keys. The Atheros 9xxx driver for example requests the NetCfgInstanceId but doesn't check the returned status code and will happily access random memory instead. Submitted by: Paul B Mahol <onemda at gmail.com> MFC after: 2 weeks	2010-11-29 10:10:56 +00:00
bschmidt	77837b201a	Add prototype for InitializeSListHead().	2010-11-23 22:17:06 +00:00
bschmidt	b1be2833eb	Add a few functions used in newer drivers. Fix RtlCompareMemory() while here. Submitted by: Paul B Mahol <onemda@gmail.com>	2010-11-23 21:49:32 +00:00
pluknet	64dd3dbf39	Update MNT_ROOTFS comments after changes in the root mount logic. Reported by: arundel Suggested by: marcel (phrasing) Approved by: kib (mentor)	2010-11-23 13:49:15 +00:00
kib	d6e3624f09	Add include guards. MFC after: 3 days	2010-11-23 12:47:15 +00:00
bschmidt	82896fc74f	Resurrect amd64 support. - Many drivers on amd64 are picking system uptime, interrupt time and ticks via global data structure instead of calling functions for performance reasons. For now just patch such address so driver will not trigger page fault when trying to access such data. In future, additional callout may be added to update data in periodic intervals. - On amd64 we need to allocate "shadow space" on stack before calling any function. Submitted by: Paul B Mahol <onemda at gmail.com>	2010-11-22 20:46:38 +00:00
bschmidt	06a992a873	Prefer pmap_extract() over pmap_kextract() as done in MmIsAddressValid(). According to the comment for MmIsAddressValid() there are issues on PAE kernels using pmap_kextract(). Submitted by: Paul B Mahol <onemda at gmail.com>	2010-11-22 20:39:29 +00:00
dim	11b4830687	Fix linux kernel module breakage introduced in r215675, by including <sys/sysent.h>. Noticed by: many Pointy hat to: netchild	2010-11-22 20:23:18 +00:00
attilio	7718cbcbf4	Add the ability for GDB to printout the thread name along with other thread specific information. In order to do that, and in order to avoid KBI breakage with existing infrastructure the following semantic is implemented: - For live programs, a new member to the PT_LWPINFO is added (pl_tdname) - For cores, a new ELF note is added (NT_THRMISC) that can be used for storing thread specific, miscellaneous, information. Right now it is just populated with a thread name. GDB, then, retrieves the correct information from the corefile via the BFD interface, as it groks the ELF notes and create appropriate pseudo-sections. Sponsored by: Sandvine Incorporated Tested by: gianni Discussed with: dim, kan, kib MFC after: 2 weeks	2010-11-22 14:42:13 +00:00
netchild	81eaf1c587	Do not take the process lock. The assignment to u_short inside the properly aligned structure is atomic on all supported architectures, and the thread that should see side-effect of assignment is the same thread that does assignment. Use a more appropriate conditional to detect the linux ABI. Suggested by: kib X-MFC: together with r215664	2010-11-22 12:42:32 +00:00
netchild	47500ea1a1	Remove trailing dot from the unimplemented futex messages to make them consistent with the syscall and ipc messages. Submitted by: arundel MFC after: 3 days	2010-11-22 09:25:32 +00:00
netchild	46e50a7603	By using the 32-bit Linux version of Sun's Java Development Kit 1.6 on FreeBSD (amd64), invocations of "javac" (or "java") eventually end with the output of "Killed" and exit code 137. This is caused by: 1. After calling exec() in multithreaded linux program threads are not destroyed and continue running. They get killed after program being executed finishes. 2. linux_exit_group doesn't return correct exit code when called not from group leader. Which happens regularly using sun jvm. The submitters fix this in a similar way to how NetBSD handles this. I took the PRs away from dchagin, who seems to be out of touch of this since a while (no response from him). The patches committed here are from [2], with some little modifications from me to the style. PR: 141439 [1], 144194 [2] Submitted by: Stefan Schmidt <stefan.schmidt@stadtbuch.de>, gk Reviewed by: rdivacky (in april 2010) MFC after: 5 days	2010-11-22 09:06:59 +00:00
bschmidt	389cf6e631	Fix a panic on i386 for drivers using MmAllocateContiguousMemory() and MmAllocateContiguousMemorySpecifyCache(). Those two functions take 64-bit variable(s) for their arguments. On i386 that takes additional 32-bit variable per argument. This is required so that windrv_wrap() can correctly wrap function that miniport driver calls with stdcall convention. Similar explanation is provided in subr_ndis.c for other functions. Submitted by: Paul B Mahol <onemda at gmail.com>	2010-11-17 09:32:39 +00:00
bschmidt	87fbb488d4	Use kmem_alloc_contig() to honour the cache_type variable. Pointed out by: alc	2010-11-17 09:28:17 +00:00
des	5c90ddd6c4	Remove no-op assignment. Submitted by: clang via arundel@ MFC after: 2 weeks	2010-11-15 23:14:14 +00:00
netchild	ec2a62f430	Some style(9) fixes. Submitted by: arundel MFC after: 1 week	2010-11-15 13:07:10 +00:00
netchild	d3aba4235e	- print out the PID and program name of the program trying to use an unsupported futex operation - for those futex operations which are known to be not supported, print out which futex operation it is - shortcut the error return of the unsupported FUTEX_CLOCK_REALTIME in some cases: FUTEX_CLOCK_REALTIME can be used to tell linux to use CLOCK_REALTIME instead of CLOCK_MONOTONIC. FUTEX_CLOCK_REALTIME however must only be set, if either FUTEX_WAIT_BITSET or FUTEX_WAIT_REQUEUE_PI are set too. If that's not the case we can die with ENOSYS right at the beginning. Submitted by: arundel Reviewed by: rdivacky (earlier iteration of the patch) MFC after: 1 week	2010-11-15 13:03:35 +00:00
bschmidt	b1835fd211	According to specs for MmAllocateContiguousMemorySpecifyCache() physically contiguous memory with requested restrictions must be allocated. Submitted by: Paul B Mahol <onemda at gmail.com>	2010-11-11 18:43:31 +00:00
des	87d7052b67	Break long line.	2010-11-08 15:14:14 +00:00
des	543d80b9c5	Fix CPU ID in /proc/cpuinfo. PR: kern/56451 Submitted by: arundel@ MFC after: 3 weeks	2010-11-08 12:04:41 +00:00
bschmidt	7edbc46989	Remove 4.x, 5.x and 6.x compatibility bits. Submitted by: Paul B Mahol <onemda at gmail.com>	2010-11-04 18:43:57 +00:00
kib	dd8752941f	Remove stale comment. Submitted by: arundel MFC after: 3 days	2010-10-14 19:30:44 +00:00
kib	84212c7551	Add macro DECLARE_MODULE_TIED to denote a module as requiring the kernel of exactly the same __FreeBSD_version as the headers module was compiled against. Mark our in-tree ABI emulators with DECLARE_MODULE_TIED. The modules use kernel interfaces that the Release Engineering Team feel are not stable enough to guarantee they will not change during the life cycle of a STABLE branch. In particular, the layout of struct sysentvec is declared to be not part of the STABLE KBI. Discussed with: bz, rwatson Approved by: re (bz, kensmith) MFC after: 2 weeks	2010-10-12 09:18:17 +00:00
jkim	6c222550e8	Simplify timeout check in futex_wait() using itimerfix() and return error if the given timeout is invalid. Consistently use int type for timeout and correct a format string in futex_sleep().	2010-10-06 18:51:22 +00:00
netchild	77b4c590d7	Fix a comparison of an uninitialised pointer. Submitted by: arundel Found by: clang analysis (automatic service by uqs@) Reviewed by: rdivacky	2010-10-06 07:34:41 +00:00
thompsa	b942a32370	Use the printf-like capability from kproc_create(). Submitted by: Paul B Mahol	2010-10-05 20:56:08 +00:00
jkim	e9d0730bf8	Prefer pmap_unmapbios() over pmap_unmapdev(). The binary does not change after this because pmap_unmapbios() is a macro for pmap_unmapdev() on amd64.	2010-10-05 18:38:23 +00:00
kib	2fc806f00c	In linprocfs_doargv(): - handle compat32 processes; - remove the checks for copied in addresses to belong into valid usermode range, proc_rwmem() does this; - simplify loop reading single string, limit the total amount of strings collected by ARG_MAX bytes; - correctly add '\0' at the end of each copied string; - fix style. In linprocfs_doprocenviron(): - unlock the process before calling copyin code [1]. The process is held by pseudofs. In linprocfs_doproccmdline: - use linprocfs_doargv() to handle !curproc case for which p_args is not cached. Reported by: pluknet [1] Tested by: pluknet Approved by: des (linprocfs maintainer, previous version of the patch) MFC after: 3 weeks	2010-09-28 11:32:17 +00:00
des	a14c121cce	Implement proc/$$/environment. Submitted by: Fernando Apesteguía <fernando.apesteguia@gmail.com> MFC after: 3 weeks	2010-09-16 07:56:34 +00:00
mdf	ab3a8b533a	Replace sbuf_overflowed() with sbuf_error(), which returns any error code associated with overflow or with the drain function. While this function is not expected to be used often, it produces more information in the form of an errno than sbuf_overflowed() did.	2010-09-10 16:42:16 +00:00
jkim	0117aaf574	Add x86bios_set_intr() to set interrupt vectors for real mode and simplify x86bios_get_intr() a little.	2010-08-25 21:03:50 +00:00
jkim	8c8d33fe9f	Check opcode for short jump as well. Some option ROMs do short jumps (e.g., some NVIDIA video cards) and we were not able to do POST while resuming because we only honored long jump. MFC after: 3 days	2010-08-25 20:52:40 +00:00
kib	d9f088a03e	Supply some useful information to the started image using ELF aux vectors. In particular, provide pagesize and pagesizes array, the canary value for SSP use, number of host CPUs and osreldate. Tested by: marius (sparc64) MFC after: 1 month	2010-08-17 08:55:45 +00:00
jkim	b25a196078	Place spinlock_enter() and spinlock_exit() just around X86EMU calls.	2010-08-10 15:22:48 +00:00
jkim	781d513f0b	Tidy up locking and memory allocation for the real mode emulator wrapper. Now we use a regular mutex instead of a spin mutex. When we enter and exit the emulator, spinlock_enter() and spinlock_exit() are additionally used. Move some page table related stuff from x86bios_init() and x86bios_uninit() to x86bios_map_mem() and x86bios_unmap_mem().	2010-08-10 06:25:08 +00:00
jkim	6328a1bf23	Tidy up printf() calls for debugging.	2010-08-09 22:06:08 +00:00
jkim	c07bc7f517	Initialize a variable just before its use.	2010-08-09 18:10:32 +00:00
jkim	c9e34bbc39	Reduce diffs between VM86 and X86EMU wrappers for x86bios_alloc() and x86bios_free(). Add strict sanity checks for VM86 wrapper and add strict page table locking for X86EMU wrapper.	2010-08-09 17:54:26 +00:00
kib	8e1e89f01b	Prefer struct sysentvec sv_psstrings to hardcoding FREEBSD32_PS_STRINGS in the compat32 code. Use sv_usrstack instead of FREEBSD32_USRSTACK as well. MFC after: 1 week	2010-08-07 11:57:13 +00:00
kib	8043767b92	Add compat32 definition for (old) struct ostat. MFC after: 1 week	2010-08-07 11:53:38 +00:00
jkim	77b28d0e95	Do not block any I/O port on amd64.	2010-08-07 04:05:58 +00:00
jkim	012f478c81	Optimize interrupt vector lookup. There is no need to check the page table.	2010-08-07 03:45:45 +00:00
jkim	57b610d580	Consistently use architecture specific macros.	2010-08-06 15:24:37 +00:00
jkim	c1d76c06de	Fix allocation of multiple pages, which forgot to increase page number. Particularly, it caused "vm86_addpage: overlap" panics under VirtualBox. Add a safety check before freeing memory while I am here.	2010-08-06 15:04:01 +00:00
jkim	9c60808f39	Re-add flag register for output. Some BIOS calls actually use it to return success/failure status. Oops.	2010-08-05 19:30:57 +00:00
jkim	0a7f48a833	Do not copy stack pointer and flags. These registers are unconditionally destroyed from vm86_prepcall().	2010-08-05 19:12:35 +00:00
jkim	f183f61cf2	Implement a simple native VM86 backend for X86BIOS. Now i386 uses native VM86 calls instead of the real mode emulator as a backend. VM86 has been proven reliable for very long time and it is actually a few times faster than emulation. Increase maximum number of page table entries per VM86 context from 3 to 8 pages. It was (ridiculously) low and insufficient for new VM86 backend, which shares one context globally. Slightly rearrange and clean up the emulator backend to accommodate new code. The only visible change here is stack size, which is decreased from 64K to 4K bytes to sync. with VM86. Actually, it seems there is no need for big stack in real mode. MFC after: 1 month	2010-08-05 18:48:30 +00:00
kib	4a7e2ba2a3	Copy inode birthtime to the struct stat32. MFC after: 1 week	2010-08-04 14:38:20 +00:00
kib	36b27b8587	Fix style. MFC after: 1 week	2010-08-04 14:35:05 +00:00
kib	734aeecfaf	When compat32 recvmsg(2) does not need to copy out control messages, set msg_controllen to 0. PR: kern/149227 Submitted by: Stef Walter <stef memberwebs com> MFC after: 1 weeks	2010-08-03 11:23:44 +00:00
alc	256c63de28	Introduce exec_alloc_args(). The objective being to encapsulate the details of the string buffer allocation in one place. Eliminate the portion of the string buffer that was dedicated to storing the interpreter name. The pointer to the interpreter name can simply be made to point to the appropriate argument string. Reviewed by: kib	2010-07-27 17:31:03 +00:00
kib	5d01c62502	Revert r210451, and the similar part of the r210431. The forward-declaration for the enum tag when enum definition is not complete is not allowed by C99, and is gcc extension. Requested by: stefanf MFC after: 28 days	2010-07-26 12:52:44 +00:00
alc	02c0473d35	Change the order in which the file name, arguments, environment, and shell command are stored in exec*()'s demand-paged string buffer. For a "buildworld" on an 8GB amd64 multiprocessor, the new order reduces the number of global TLB shootdowns by 31%. It also eliminates about 330k page faults on the kernel address space. Change exec_shell_imgact() to use "args->begin_argv" consistently as the start of the argument and environment strings. Previously, it would sometimes use "args->buf", which is the start of the overall buffer, but no longer the start of the argument and environment strings. While I'm here, eliminate unnecessary passing of "&length" to copystr(), where we don't actually care about the length of the copied string. Clean up the initialization of the exec map. In particular, use the correct size for an entry, and express that size in the same way that is used when an entry is allocated. The old size was one page too large. (This discrepancy originated in 2004 when I rewrote exec_map_first_page() to use sf_buf_alloc() instead of the exec map for mapping the first page of the executable.) Reviewed by: kib	2010-07-25 17:43:38 +00:00
kib	229b3b9c19	Remove linux_exec_copyin_args(); freebsd32_exec_copyin_args() may serve as well. COMPAT_FREEBSD32 is a prerequisite for COMPAT_LINUX32. Reviewed by: alc MFC after: 3 weeks	2010-07-23 21:30:33 +00:00
alc	0c709bf109	Eliminate a little bit of duplicated code.	2010-07-23 18:58:27 +00:00
trasz	e2cd3ad716	Remove proc locking, it's not needed after r210132.	2010-07-17 15:52:11 +00:00
trasz	e3a946ddad	Make svr4(4) version of poll(2) use the same limit of file descriptors as the usual poll(2) does, instead of checking resource limits.	2010-07-15 18:44:58 +00:00
kib	5f8b30cbbb	Constify source argument for siginfo_to_siginfo32(). MFC after: 1 week	2010-07-04 11:43:53 +00:00
jhb	df7979cf76	Tweak the in-kernel API for sending signals to threads: - Rename tdsignal() to tdsendsignal() and make it private to kern_sig.c. - Add tdsignal() and tdksignal() routines that mirror psignal() and pksignal() except that they accept a thread as an argument instead of a process. They send a signal to a specific thread rather than to an individual process. Reviewed by: kib	2010-06-29 20:41:52 +00:00
kib	180cca1c2d	Regenerate	2010-06-28 18:17:21 +00:00
kib	b6d8416eac	Count number of threads that enter and leave dynamically registered syscalls. On the dynamic syscall deregistration, wait until all threads leave the syscall code. This somewhat increases the safety of the loadable modules unloading. Reviewed by: jhb Tested by: pho MFC after: 1 month	2010-06-28 18:06:46 +00:00
jkim	f68d88b142	Let x86bios_alloc() pass contigmalloc(9) flags. Use it to set M_WAITOK from VESA BIOS initialization. All other malloc(9) uses in the function are blocking anyway.	2010-06-23 17:20:51 +00:00
ed	d7eaa4520b	ANSIfy prototypes in subr_usbd.c. Clang generates the following warnings when building subr_usbd.c: \| subr_usbd.c:598:13: warning: promoted type 'int' of K&R function \| parameter is not compatible with the parameter type 'uint8_t' (aka \| 'unsigned char') declared in a previous prototype \| subr_usbd.c:627:13: warning: promoted type 'int' of K&R function \| parameter is not compatible with the parameter type 'uint8_t' (aka \| 'unsigned char') declared in a previous prototype \| subr_usbd.c:649:13: warning: promoted type 'int' of K&R function \| parameter is not compatible with the parameter type 'uint8_t' (aka \| 'unsigned char') declared in a previous prototype Instead of just ANSIfying these three prototypes, do it for the entire file. Spotted by: clang	2010-06-12 12:19:08 +00:00
jhb	9b74a62d73	Update several places that iterate over CPUs to use CPU_FOREACH().	2010-06-11 18:46:34 +00:00
wkoszek	ab9f5dbe35	Bring USB fixes for linux(4). Intention of this commit is to let us take a full advantage of libusb(8) ported to Linux. This decreases a possibility of getting any collisions within ioctl() "command" space, especially with relation to LINUX_SNDCTL_SEQ... stuff. Basically, we provide commands, that will be mapped in the kernel to correct ones and forward those to the USB layer. Port enabling functionality brought with this patch is here: http://www.freebsd.org/cgi/query-pr.cgi?pr=146895 Bump __FreeBSD_version to catch, since which version installing a port makes sense. This patch should bring no regressions. So far, only i386 is tested. Tested by: thompsa@ Reviewed by: thompsa@ OKed by: netchild@	2010-05-24 07:04:00 +00:00
kib	4208ccbe79	Reorganize syscall entry and leave handling. Extend struct sysvec with three new elements: sv_fetch_syscall_args - the method to fetch syscall arguments from usermode into struct syscall_args. The structure is machine-dependent (this might be reconsidered after all architectures are converted). sv_set_syscall_retval - the method to set a return value for usermode from the syscall. It is a generalization of cpu_set_syscall_retval(9) to allow ABIs to override the way to set a return value. sv_syscallnames - the table of syscall names. Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding the call to cpu_set_syscall_retval(). The new functions syscallenter(9) and syscallret(9) are provided that use sv_syscall pointers and contain the common repeated code from the syscall() implementations for the architecture-specific syscall trap handlers. Syscallenter() fetches arguments, calls syscall implementation from ABI sysent table, and sets up return frame. The end of syscall bookkeeping is done by syscallret(). Take advantage of single place for MI syscall handling code and implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the thread is stopped at syscall entry or return point respectively. The EXEC flag augments SCX and notifies debugger that the process address space was changed by one of exec(2)-family syscalls. The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are changed to use syscallenter()/syscallret(). MIPS and arm are not converted and use the mostly unchanged syscall() implementation. Reviewed by: jhb, marcel, marius, nwhitehorn, stas Tested by: marcel (ia64), marius (sparc64), nwhitehorn (powerpc), stas (mips) MFC after: 1 month	2010-05-23 18:32:02 +00:00
netchild	88731dce0c	- #ifdef out the cliplist part, skype seems like using an uninitialized variable and can cause problems, without the cliplist handling it works without problems - improve the cliplist error handling - fix VIDIOCGTUNER and VIDIOCSMICROCODE (still no hardware available to test) Submitted by: J.R. Oldroyd <jr@opal.com> X-MFC after: soon (together with all the v4l stuff)	2010-05-03 14:19:58 +00:00
jkim	428aff4f1c	Reduce MD code further. At least, it compiles on ia64 now (but it is not connected to build). The idea/code was shamelessly taken from r207329.	2010-05-01 01:05:07 +00:00
jkim	b092dfff59	Do not initialize mutex and return error if it cannot map memory.	2010-05-01 00:36:40 +00:00
kib	1b4a81ab7e	Provide compat32 shims for kinfo_proc sysctl. This allows 32bit ps(1) to mostly work on 64bit host. The work is based on an original patch submitted by emaste, obtained from Sandvine's source tree. Reviewed by: jhb MFC after: 1 week	2010-04-21 19:32:00 +00:00
kib	d4d906be00	Extract the code to copy-out struct rusage32 from struct rusage into the new function. Reviewed by: jhb MFC after: 1 week	2010-04-21 19:28:01 +00:00
emaste	c1468b9b67	Linux puts a blank line between each CPU.	2010-04-14 13:44:22 +00:00
bz	7fe3c0a85b	Add a forward declaration to silence a warning when compiling ia32_genassym.c. Reviewed by: kib MFC after: 3 days	2010-04-03 12:34:32 +00:00
netchild	1edbfe1bf0	Re-apply r205683 with some modifications: Fix some bogus values in linprocfs. Submitted by: Petr Salinger <Petr.Salinger@seznam.cz> Verified on: GNU/kFreeBSD debian 8.0-1-686 (by submitter) PR: 144584 Reviewed by / discussed with: kib, des, jhb, submitter	2010-04-02 06:50:28 +00:00
ed	4f08ecd7ed	Rename st_timespec fields to st_tim for POSIX 2008 compliance. A nice thing about POSIX 2008 is that it finally standardizes a way to obtain file access/modification/change times in sub-second precision, namely using struct timespec, which we already have for a very long time. Unfortunately POSIX uses different names. This commit adds compatibility macros, so existing code should still build properly. Also change all source code in the kernel to work without any of the compatibility macros. This makes it all less ambiguous. I am also renaming st_birthtime to st_birthtim, even though it was a local extension anyway. It seems Cygwin also has a st_birthtim.	2010-03-28 13:13:22 +00:00
netchild	d82aae616f	Revert r205683 to resolve some code quality issues which do not affect the build or use of linprocfs, before committing the reworked patch again. Requested by: des	2010-03-26 14:36:16 +00:00
netchild	8cb631a42d	Fix some bogus values in linprocfs. Submitted by: Petr Salinger <Petr.Salinger@seznam.cz> Verified on: GNU/kFreeBSD debian 8.0-1-686 (by submitter) PR: 144584	2010-03-26 11:43:15 +00:00
netchild	db8514f06a	Fix some problems which may lead to a panic: - right order of src and dst in memcpy - NULL out the clips after freeing to prevent an accident Noticed by: hselasky	2010-03-26 08:42:11 +00:00
jkim	3ce45e9870	Revert accidentally committed initial real mode %sp change of r205347. Note I am keeping %ds change because X.org int10 handler does it and it seems reasonable.	2010-03-25 17:14:47 +00:00
jkim	7725810bd9	Optimize real mode page table lookup.	2010-03-25 17:03:52 +00:00
jkim	49ca5d49ba	Fix stupid typos. Some VESA BIOSes directly call BIOS interrupt handlers within the VBE interrupt handler. Unfortunately it was causing real mode page faults because we were fetching instructions from bogus addresses. Pass me the pointyhat, please. PR: kern/144654 MFC after: 3 days	2010-03-25 15:56:04 +00:00
nwhitehorn	d63c82a6ac	Change the arguments of exec_setregs() so that it receives a pointer to the image_params struct instead of several members of that struct individually. This makes it easier to expand its arguments in the future without touching all platforms. Reviewed by: jhb	2010-03-25 14:24:00 +00:00
jhb	72e71a3c3d	Add missing Giant locking for the vfsconf list. Submitted by: kib	2010-03-24 14:20:37 +00:00
jhb	878de09a93	Implement /proc/filesystems. Submitted by: Fernando Apesteguia fernando.apesteguia (gmail)	2010-03-23 21:49:33 +00:00
jkim	a40c3ddf5a	Support memory wraparound instead of high memory as VM86 mode does. Suggested by: delphij	2010-03-22 18:43:36 +00:00
jkim	c581c7d82f	Fix i386 PAE kernel build. Reported by: tinderbox	2010-03-22 17:30:34 +00:00
ed	6156503467	Actually make O_DIRECTORY work. According to POSIX open() must return ENOTDIR when the path name does not refer to a directory. Change vn_open() to respect this flag. This also simplifies the Linuxolator a bit.	2010-03-21 20:43:23 +00:00
jkim	a0f967165a	- Map EBDA if available and add 64KB above 1MB (high memory), just in case. - Print the initial memory map when bootverbose is set. - Change the page fault address format from linear to %cs:%ip style. - Move duplicate code into a newly added function. - Add strictly aligned memory access for distant future. ;-)	2010-03-19 21:15:43 +00:00
kib	169d74536d	Regen	2010-03-19 11:14:37 +00:00
kib	b0ac73182e	Remove empty line. MFC after: 2 weeks	2010-03-19 11:13:42 +00:00
kib	06319cba03	Implement compat32 shims for mqueuefs. Reviewed by: jhb MFC after: 2 weeks	2010-03-19 11:10:24 +00:00
kib	34d2655cb1	Implement compat32 shims for ksem syscalls. Reviewed by: jhb MFC after: 2 weeks	2010-03-19 11:08:43 +00:00
kib	b27fa06f97	Move SysV IPC freebsd32 compat shims from freebsd32_misc.c to corresponding sysv_{msg,sem,shm}.c files. Mark SysV IPC freebsd32 syscalls as NOSTD and add required SYSCALL_INIT_HELPER/SYSCALL32_INIT_HELPERs to provide auto register/unregister on module load. This makes COMPAT_FREEBSD32 functional with SysV IPC compiled and loaded as modules. Reviewed by: jhb MFC after: 2 weeks	2010-03-19 11:04:42 +00:00
kib	610214ed4c	Move SysV IPC freebsd32 compat shims helpers from freebsd32_misc.c to sysv_ipc.c. Reviewed by: jhb MFC after: 2 weeks	2010-03-19 11:01:51 +00:00
kib	d19a162142	Introduce SYSCALL_INIT_HELPER and SYSCALL32_INIT_HELPER macros and necessary support functions to allow registering dynamically loaded syscalls from the MOD_LOAD handlers. Helpers handle registration failures semi-automatically. Reviewed by: jhb MFC after: 2 weeks	2010-03-19 10:56:30 +00:00
kib	7932a1f757	For SYSCALL_MODULE_HELPER, use "sys/<syscallname>" module name. For SYSCALL32_MODULE_HELPER, use "sys32/<syscallname>" module name. This avoids modules name conflict when compat32 syscall does not need shims. Note that SYSCALL_MODULE_HELPER is going to be unused in the tree by several next commits. Suggested by: jhb MFC after: 2 weeks	2010-03-19 10:52:54 +00:00
kib	e28b3013f9	Make freebsd32_copyiniov() available outside of freebsd32_misc. MFC after: 2 weeks	2010-03-19 10:49:03 +00:00
jkim	de04cb233d	Detect illegal access to unmapped memory within real mode emulator to aid debugging. Update copyright date while I am here.	2010-03-18 20:15:34 +00:00
nwhitehorn	8cbb7143ab	Regen after big endian compatibility import.	2010-03-11 14:56:59 +00:00
nwhitehorn	142a4d2993	Provide groundwork for 32-bit binary compatibility on non-x86 platforms, for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32 option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts of the kernel and enhances the freebsd32 compatibility code to support big-endian platforms. Reviewed by: kib, jhb	2010-03-11 14:49:06 +00:00
ed	3c620db1c4	Make /proc/self/fd `work'. On Linux, /proc/<pid>/fd is comparable to fdescfs, where it allows you to inspect the file descriptors used by each process. Glibc's ttyname() works by performing a readlink() on these nodes, since all nodes in this directory are symlinks. It is a bit hard to implement this in linprocfs right now, so I am not going to bother. Add a way to make ttyname(3) work, by adding a /proc/<pid>/fd symlink, which points to /dev/fd only if the calling process matches. When fdescfs is mounted, this will cause the readlink() in ttyname() to fail, causing it to fall back on manually finding a matching node in /dev. Discussed on: emulation@	2010-03-07 10:43:45 +00:00
joel	58fee7f0f3	The NetBSD Foundation has granted permission to remove clause 3 and 4 from their software. Obtained from: NetBSD	2010-03-01 17:20:04 +00:00
pjd	c527452336	No need to include security/mac/mac_framework.h here.	2010-02-18 22:26:01 +00:00
delphij	54ac71ad0f	- Return EAFNOSUPPORT instead of EINVAL for unsupported address family, this matches the Linux behavior. - Check if we have sufficient space allocated for socket structure, which fixes a buffer overflow when wrong length is being passed into the emulation layer. [1] PR: kern/138860 Submitted by: Mateusz Guzik <mjguzik gmail com> Reported by: Alexander Best [1] MFC after: 2 weeks	2010-02-09 22:30:51 +00:00
ed	d40177139e	Remove unused LIBCOMPAT keyword from syscalls.master.	2010-02-08 10:02:01 +00:00
wkoszek	260e3efd2e	Let us use our libusb(3) in Linuxolator. With this change, Linux binaries can work with our libusb(3) when it's compiled against our header files on GNU/Linux system -- this solves the problem with differences between /dev layouts. With ported libusb(3), I am able to use my USB JTAG cable with Linux binaries that support it. Reviewed by: thompsa	2010-01-18 22:46:06 +00:00
netchild	7cde5eb62d	Whitespace change to be able to provide the correct commit log for r202364: ---snip--- Add video clipping support but with the caveats below. Background info: Video clipping allows the user to provide either a series of clip rectangles or a clip bitmap to the driver and have the driver mask the video according to the clipping specs provided. Adding support for clipping to the FreeBSD Linux emulator is problematic because it seems that this feature is not supported by many drivers and therefore it is ignored by many applications. Unfortunately, when not using it, rather than passing in a null clipping list, some apps leave the clipping fields uninitialized, causing random values to be passed in. In the case where the driver does not use the clipping info, this is not a problem (although it is bad form). But the Linux emulator does not know which drivers will use this and which won't, so the Linux emulator must try to handle this clip list, and deal gracefully with cases where the values seem to be uninitialized. Video clipping info is passed in using the VIDIOCSWIN ioctl in two fields in the video_window structure: the integer clipcount and the pointer clips. How the linuxulator handles this from this commit on: * if (clipcount == VIDEO_CLIP_BITMAP) The clips variable is a void * pointer to a 128*625 byte (1024*625 bit) memory area containing a bitmap of the clipping area. The pointer in the video_window structure is copied, but no video_clip structures are copied. * if (clipcount > 0 && clipcount <= 16384) The clips variable is pointer to a list of video_clip structures. Up to clipcount structures are copied and passed to the driver. The upper limit of 16384 was imposed here so that user code that does not properly initialize clipcount falls through below and no attempt is made to copy an uninitialized list. This value was found by examining Linux drivers that support the clip list. * else The clipcount is either negative (but not VIDEO_CLIP_BITMAP), zero or positive (> 16384). All these cases are treated as invalid data. Both the clipcount field and clips pointer are forced to zero/NULL and passed to the driver. It should be noted that, at the time of developing this V4L emulator code, the pwc(4) V4L driver does not support clipping. Submitted by: J.R. Oldroyd <fbsd@opal.com> MFC after: 1 month ---snip---	2010-01-15 15:38:31 +00:00
netchild	9c3e54b694	This is v4l support for the linuxulator. This allows to access FreeBSD native devices which support the v4l API from processes running within the linuxulator, e.g. skype or flash can access the multimedia/pwcbsd driver. Not tested is firmware upload, framebuffer stuff and video tuner stuff due to lack of hardware. The clipping part (VIDIOCSWIN) needs a little bit of further work (partly in progress, but can not be tested due to lack of a suitable device). The submitter tested this successfully with Skype and flash apps on amd64 and i386 with the multimedia/pwcbsd driver. Submitted by: J.R. Oldroyd <fbsd@opal.com>	2010-01-15 14:58:19 +00:00
brooks	f354ae0814	Since all other comparisons involving ngroups_max use "ngroups_max + 1", use ">= ngroups_max+1" instead of the equivalent "> ngroups_max" to reduce confusion.	2010-01-15 07:05:00 +00:00
brooks	a093b41daf	Replace the static NGROUPS=NGROUPS_MAX+1=1024 with a dynamic kern.ngroups+1. kern.ngroups can range from NGROUPS_MAX=1023 to INT_MAX-1. Given that the Windows group limit is 1024, this range should be sufficient for most applications. MFC after: 1 month	2010-01-12 07:49:34 +00:00
mckusick	0cddeb2cb4	Background: When renaming a directory it passes through several intermediate states. First its new name will be created causing it to have two names (from possibly different parents). Next, if it has different parents, its value of ".." will be changed from pointing to the old parent to pointing to the new parent. Concurrently, its old name will be removed bringing it back into a consistent state. When fsck encounters an extra name for a directory, it offers to remove the "extraneous hard link"; when it finds that the names have been changed but the update to ".." has not happened, it offers to rewrite ".." to point at the correct parent. Both of these changes were considered unexpected so would cause fsck in preen mode or fsck in background mode to fail with the need to run fsck manually to fix these problems. Fsck running in preen mode or background mode now corrects these expected inconsistencies that arise during directory rename. The functionality added with this update is used by fsck running in background mode to make these fixes. Solution: This update adds three new fsck sysctl commands to support background fsck in correcting expected inconsistencies that arise from incomplete directory rename operations. They are: setcwd(dirinode) - set the current directory to dirinode in the filesystem associated with the snapshot. setdotdot(oldvalue, newvalue) - Verify that the inode number for ".." in the current directory is oldvalue then change it to newvalue. unlink(nameptr, oldvalue) - Verify that the inode number associated with nameptr in the current directory is oldvalue then unlink it. As with all other fsck sysctls, these new ones may only be used by processes with appropriate privilege. Reported by: jeff Security issues: rwatson	2010-01-11 20:44:05 +00:00
mbr	7450f52a57	Remove extraneous semicolons, no functional changes. Submitted by: Marc Balmer <marc@msys.ch> MFC after: 1 week	2010-01-07 21:01:37 +00:00
kib	19ee2b964b	Signal 0 is used to check the permission for current process to signal target one. Since r184058, linux_do_tkill() calls tdsignal() instead of kill(), without checking for validity of supplied signal number. Prevent panic when supplied signal is 0 by finishing work after checks. Found and tested by: scf MFC after: 3 days	2009-12-18 14:27:18 +00:00
imp	747567aeb7	Revert 200606.	2009-12-16 21:53:56 +00:00
imp	09c3f98bcc	Fix compiling FREEBSD_COMPAT[4,5,6] without FREEBSD_COMPAT7. Note: Not sure this is the right way to do compat, but it makes the headers consistent with the implementations.	2009-12-16 17:17:40 +00:00
jkim	ddc390f24b	Add two new debugging tunables for x86bios instead of abusing bootverbose, i.e., debug.x86bios.call and debug.x86bios.int.	2009-12-15 22:44:28 +00:00
kib	ed39e6c80b	Regenerate.	2009-12-04 21:53:20 +00:00
kib	080e59eb45	Add several syscall compat32 entries for acl manipulation. They do not require translation of the arguments. Tested by: bsam MFC after: 1 week	2009-12-04 21:52:31 +00:00
netchild	3b599cd6fa	This is v4l support for the linuxulator. This allows to access FreeBSD native devices which support the v4l API from processes running within the linuxulator, e.g. skype or flash can access the multimedia/pwcbsd driver. Not tested is firmware upload, framebuffer stuff and video tuner stuff due to lack of hardware. The clipping part (VIDIOCSWIN) needs a little bit of further work (partly in progress, but can not be tested due to lack of a suitable device). The submitter tested this successfully with Skype and flash apps on amd64 and i386 with the multimedia/pwcbsd driver. Submitted by: J.R. Oldroyd <fbsd@opal.com>	2009-12-04 21:06:54 +00:00
netchild	21465cca7f	Import the unchanged v4l videodev.h from the vendor branch.	2009-12-04 20:46:45 +00:00
ed	0ddd901675	Include <sys/tty.h> instead of <sys/termios.h>. Right now <sys/termios.h> includes <sys/ttycom.h>, which provides the TTY ioctls to the svr4 code. We need both struct termios and the ioctls, so include <sys/tty.h> for now.	2009-11-28 16:30:06 +00:00
netchild	063f564d20	Fix typo in kernel message. The fix is based upon the patch in the PR. PR: kern/140279 Submitted by: Alexander Best <alexbestms@math.uni-muenster.de> MFC after: 1 week	2009-11-05 07:37:48 +00:00
rpaulo	36d63b3d79	Revert a functional change that snuck in.	2009-11-02 19:13:12 +00:00
rpaulo	10361c8d0a	Fix a non-style change that snuck in. Spotted by: danfe	2009-11-02 18:51:24 +00:00
rpaulo	898a75fb36	Big style cleanup. While there remove references to FreeBSD versions older than 6.0. Submitted by: Paul B Mahol <onemda at gmail.com>	2009-11-02 11:07:42 +00:00
kib	7b3cdceb8e	Regenerate	2009-10-27 11:02:04 +00:00
kib	08e5013938	Current pselect(3) is implemented in usermode and thus vulnerable to well-known race condition, whose elimination was the reason the function was introduced in the first place. If sigmask supplied as argument to pselect() enables a signal, the signal might be delivered before thread called select(2), causing lost wakeup. Reimplement pselect() in kernel, making change of sigmask and sleep atomic. Since signal shall be delivered to the usermode, but sigmask restored, set TDP_OLDMASK and save old mask in td_oldsigmask. The TDP_OLDMASK should be cleared by ast() in case signal was not delivered during syscall execution. Reviewed by: davidxu Tested by: pho MFC after: 1 month	2009-10-27 10:55:34 +00:00
kib	ce081b037e	In r197963, a race with thread being selected for signal delivery while in kernel mode, and later changing signal mask to block the signal, was fixed for sigprocmask(2) and pthread_exit(3). The same race exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls. Use kern_sigprocmask() instead of direct manipulation of td_sigmask to reschedule newly blocked signals, closing the race. Reviewed by: davidxu Tested by: pho MFC after: 1 month	2009-10-27 10:47:58 +00:00
kib	eb4c68098b	In kern_sigsuspend(), better manipulate thread signal mask using kern_sigprocmask() to properly notify other possible candidate threads for signal delivery. Since sigsuspend() shall only return to usermode after a signal was delivered, do cursig/postsig loop immediately after waiting for signal, repeating the wait if wakeup was spurious due to race with other thread fetching signal from the process queue before us. Add thread_suspend_check() call to allow the thread to be stopped or killed while in loop. Modify last argument of kern_sigprocmask() from boolean to flags, allowing the function to be called with locked proc. Conversion of the callers that supplied 1 to the old argument will be done in the next commit, and due to SIGPROCMASK_OLD value equal to 1, code is formally correct in between. Reviewed by: davidxu Tested by: pho MFC after: 1 month	2009-10-27 10:42:24 +00:00
bz	b991a4ad12	Unconditionally call the setsockopt for IPV6_V6ONLY for v6 linux sockets no matter whether we are compiled as module or if our default of the net.inet6.ip6.v6only sysctl already matches what we would set. This avoids unnecessary complications with modules, VIMAGES, INET6 and the sysctl value, especially considering that most users will use linux compat as a module. Discussed with: kib, rwatson (weeks ago) Reviewed by: rwatson MFC after: 6 weeks	2009-10-25 09:58:56 +00:00
jkim	22d120ca09	Fix a copy-and-pasto in the previous commit.	2009-10-19 21:01:42 +00:00
jkim	99279734b8	Rewrite x86bios and update its dependent drivers. - Do not map entire real mode memory (1MB). Instead, we map IVT/BDA and ROM area separately. Most notably, ROM area is mapped as device memory (uncacheable) as it should be. User memory is dynamically allocated and free'ed with contigmalloc(9) and contigfree(9). Remove now redundant and potentially dangerous x86bios_alloc.c. If this emulator ever grows to support non-PC hardware, we may implement it with rman(9) later. - Move all host-specific initializations from x86emu_util.c to x86bios.c and remove now unnecessary x86emu_util.c. Currently, non-PC hardware is not supported. We may use bus_space(9) later when the KPI is fixed. - Replace all bzero() calls for emulated registers with more obviously named x86bios_init_regs(). This function also initializes DS and SS properly. - Add x86bios_get_intr(). This function checks if the interrupt vector is available for the platform. It is not necessary for PC-compatible hardware but it may be needed later. ;-) - Do not try turning off monitor if DPMS does not support the state. - Allocate stable memory for VESA OEM strings instead of just holding pointers to them. They may or may not be accessible always. Fix a memory leak of video mode table while I am here. - Add (experimental) BIOS POST call for vesa(4). This function calls VGA BIOS POST code from the current VGA option ROM. Some video controllers cannot save and restore the state properly even if it is claimed to be supported. Usually the symptom is blank display after resuming from suspend state. If the video mode does not match the previous mode after restoring, we try BIOS POST and force the known good initial state. Some magic was taken from NetBSD (and it was taken from vbetool, I believe.) - Add a loader tunable for vgapci(4) to give a hint to dpms(4) and vesa(4) to identify who owns the VESA BIOS. This is very useful for multi-display adapter setup. By default, the POST video controller is automatically probed and the tunable "hw.pci.default_vgapci_unit" is set to corresponding vgapci unit number. You may override it from loader but it is very unlikely to be necessary. Unfortunately only AGP/PCI/PCI-E controllers can be matched because ISA controller does not have necessary device IDs. - Fix a long standing bug in state save/restore function. The state buffer pointer should be ES:BX, not ES:DI according to VBE 3.0. If it ever worked, that's because BX was always zero. :-) - Make register initializations clearer per VBE 3.0. - Fix a lot of style issues with vesa(4).	2009-10-19 20:58:10 +00:00
bz	8e183cd852	Make sure that the primary native brandinfo always gets added first and the native ia32 compat as middle (before other things). o(ld)brandinfo as well as third party like linux, kfreebsd, etc. stays on SI_ORDER_ANY coming last. The reason for this is only to make sure that even in case we would overflow the MAX_BRANDS sized array, the native FreeBSD brandinfo would still be there and the system would be operational. Reviewed by: kib MFC after: 1 month	2009-10-03 11:57:21 +00:00
rwatson	47ee86367a	Regenerate system call files following r197636.	2009-09-30 08:48:59 +00:00
rwatson	3d5e3df28c	Reserve system call numbers for Capsicum security framework capabilities, capability mode, and process descriptors: cap_new, cap_getrights, cap_enter, cap_getmode, pdfork, pdkill, pdgetpid, and pdwait. Obtained from: TrustedBSD Project Sponsored by: Google MFC after: 3 weeks	2009-09-30 08:46:01 +00:00
delphij	987c4a758d	Use a 2 clause BSD-style license instead of stating the code as public domain, as requested by core@ and reviewed by the author.	2009-09-28 08:14:15 +00:00
jkim	08823d6c8c	- Reduce BIOS memory mapping. We want 1MB of physical memory, not 12MB[1]. - Remove CS and IP registers from x86bios.h. They have no use for us. - Adjust register dump to make it little bit more useful for debugging. Submitted by: paradox (ddkprog yahoo com)[1] (initial version)	2009-09-25 17:56:32 +00:00
jkim	21b9526006	Dump real mode registers under bootverbose to help debugging BIOS emulator.	2009-09-24 22:42:35 +00:00
jkim	54c804074e	- Use FreeBSD function naming convention. - Change x86biosCall() to more appropriate x86bios_intr().[1] Discussed with: delphij, paradox (ddkprog yahoo com) Submitted by: paradox (ddkprog yahoo com)[1]	2009-09-24 19:24:42 +00:00
jkim	4f6b75d358	Move sys/dev/x86bios to sys/compat/x86bios. It may not be optimal but it is clearly better than the old place. OK'ed by: delphij, paradox (ddkprog yahoo com)	2009-09-23 20:49:14 +00:00
zec	840419a4c4	Lock the ifnet list while iterating over it. Submitted by: julian MFC after: 3 days	2009-09-13 21:30:18 +00:00
des	e4fc8331e6	As jhb@ pointed out to me, r197057 was incorrect, not least because these are generated files.	2009-09-10 13:20:27 +00:00
des	b9f96dd3c0	If a certain feature that was present in FreeBSD 7 was removed or changed in FreeBSD 8, the compatibility shims should be built not just when FreeBSD 7 compatibility is requested, but also when compatibility with any older FreeBSD version where that feature was present is requested.o Without this patch, a kernel config that sets COMPAT_FREEBSD6 but not 7 would fail to build due to inconsistencies between the declaration of the compatibility shims and their use in the SysV code. There are similar errors in other proto.h headers in the tree. MFC after: 3 weeks	2009-09-10 08:33:28 +00:00
kib	91e6a5b3cc	kern_select(9) copies fd_set in and out of userspace in quantities of longs. Since longs are 4 bytes in 32bit processes, a 64bit kernel may copy in or out 4 bytes more than the process expected. Calculate the amount of bytes to copy taking into account size of fd_set for the current process ABI. Diagnosed and tested by: Peter Jeremy <peterjeremy acm org> Reviewed by: jhb MFC after: 1 week	2009-09-09 20:59:01 +00:00
bz	840afe36da	Make sure FreeBSD binaries without .note.ABI-tag section work correctly and do not match a colliding Debian GNU/kFreeBSD brandinfo statements. For this mark the Debian GNU/kFreeBSD brandinfo that it must have an .note.ABI-tag section and ignore the old EI_OSABI brandinfo when comparing a possibly colliding set of options. Due to SYSINIT we add the brandinfo in a non-deterministic order, so native FreeBSD is not always first. We may want to consider to force native FreeBSD to come first as well. The only way a problem could currently be noticed is when running an i386 binary without the .note.ABI-tag on amd64 and the Debian GNU/kFreeBSD brandinfo was matched first, as the fallback to ld-elf32.so.1 does not exist in that case. Reported and tested by: ticso In collaboration with: kib MFC after: 3 days	2009-08-30 14:38:17 +00:00
zec	e9ba936d59	Fix a few panics in linuxulator + VIMAGE due to curvnet not being set. This change affects only options VIMAGE builds. Reviewed by: julian MFC after: 3 days	2009-08-28 22:51:07 +00:00
bz	ba7b3afabc	Fix handling of .note.ABI-tag section for GNU systems [1]. Handle GNU/Linux according to LSB Core Specification 4.0, Chapter 11. Object Format, 11.8. ABI note tag. Also check the first word of desc, not only name, according to glibc abi-tags specification to distinguish between Linux and kFreeBSD. Add explicit handling for Debian GNU/kFreeBSD, which runs on our kernels as well [2]. In {amd64,i386}/trap.c, when checking osrel of the current process, also check the ABI to not change the signal behaviour for Linux binary processes, now that we save an osrel version for all three from the lists above in struct proc [2]. These changes make it possible to run FreeBSD, Debian GNU/kFreeBSD and Linux binaries on the same machine again for at least i386 and amd64, and no longer break kFreeBSD which was detected as GNU(/Linux). PR: kern/135468 Submitted by: dchagin [1] (initial patch) Suggested by: kib [2] Tested by: Petr Salinger (Petr.Salinger seznam.cz) for kFreeBSD Reviewed by: kib MFC after: 3 days	2009-08-24 16:19:47 +00:00
rwatson	ef8d755d4d	Rework global locks for interface list and index management, correcting several critical bugs, including race conditions and lock order issues: Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an sxlock. Either can be held to stabilize the lists and indexes, but both are required to write. This allows the list to be held stable in both network interrupt contexts and sleepable user threads across sleeping memory allocations or device driver interactions. As before, writes to the interface list must occur from sleepable contexts. Reviewed by: bz, julian MFC after: 3 days	2009-08-23 20:40:19 +00:00
rwatson	fb9ffed650	Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)	2009-08-01 19:26:27 +00:00
jhb	da6cb6e20c	Fix the freebsd32 versions of semsys(), shmsys(), and msgsys() to use the old ABI versions of the relevant control system call (e.g. freebsd7_freebsd32_msgctl() instead of freebsd32_msgctl() for msgsys()). Approved by: re (kib)	2009-07-27 16:03:04 +00:00
jamie	274ea197bb	Some jail parameters (in particular, "ip4" and "ip6" for IP address restrictions) were found to be inadequately described by a boolean. Define a new parameter type with three values (disable, new, inherit) to handle these and future cases. Approved by: re (kib), bz (mentor) Discussed with: rwatson	2009-07-25 14:48:57 +00:00
rwatson	57ca4583e7	Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing them in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)	2009-07-14 22:48:30 +00:00
trasz	f2432e2bd7	Regen the freebsd32 parts. Approved by: re (kib)	2009-07-08 16:30:34 +00:00
trasz	3189216dbc	Fix freebsd32 version of lpathconf(2). Approved by: re (kib)	2009-07-08 16:26:43 +00:00
trasz	09784497a2	There is an optimization in chmod(1), that makes it not to call chmod(2) if the new file mode is the same as it was before; however, this optimization must be disabled for filesystems that support NFSv4 ACLs. Chmod uses pathconf(2) to determine whether this is the case - however, pathconf(2) always follows symbolic links, while the 'chmod -h' doesn't. This change adds lpathconf(3) to make it possible to solve that problem in a clean way. Reviewed by: rwatson (earlier version) Approved by: re (kib)	2009-07-08 15:23:18 +00:00
rwatson	da78c9e4a2	Replace AUDIT_ARG() with variable argument macros with a set of more specific macros for each audit argument type. This makes it easier to follow call-graphs, especially for automated analysis tools (such as fxr). In MFC, we should leave the existing AUDIT_ARG() macros as they may be used by third-party kernel modules. Suggested by: brooks Approved by: re (kib) Obtained from: TrustedBSD Project MFC after: 1 week	2009-06-27 13:58:44 +00:00
weongyo	9a472dd588	Provide an extra write buffer when the NDIS driver wants to send a request whose body carries some data through the default pipe. Tested by: Nikos Vassiliadis <nvass9573 at gmx.com>	2009-06-26 01:42:41 +00:00
jhb	2908b25ed7	Regen.	2009-06-24 21:54:08 +00:00
jhb	6f52fe78fb	Change the ABI of some of the structures used by the SYSV IPC API: - The uid/cuid members of struct ipc_perm are now uid_t instead of unsigned short. - The gid/cgid members of struct ipc_perm are now gid_t instead of unsigned short. - The mode member of struct ipc_perm is now mode_t instead of unsigned short (this is merely a style bug). - The rather dubious padding fields for ABI compat with SV/I386 have been removed from struct msqid_ds and struct semid_ds. - The shm_segsz member of struct shmid_ds is now a size_t instead of an int. This removes the need for the shm_bsegsz member in struct shmid_kernel and should allow for complete support of SYSV SHM regions >= 2GB. - The shm_nattch member of struct shmid_ds is now an int instead of a short. - The shm_internal member of struct shmid_ds is now gone. The internal VM object pointer for SHM regions has been moved into struct shmid_kernel. - The existing __semctl(), msgctl(), and shmctl() system call entries are now marked COMPAT7 and new versions of those system calls which support the new ABI are now present. - The new system calls are assigned to the FBSD-1.1 version in libc. The FBSD-1.0 symbols in libc now refer to the old COMPAT7 system calls. - A simplistic framework for tagging system calls with compatibility symbol versions has been added to libc. Version tags are added to system calls by adding an appropriate __sym_compat() entry to src/lib/libc/include/compat.h. [1] PR: kern/16195 kern/113218 bin/129855 Reviewed by: arch@, rwatson Discussed with: kan, kib [1]	2009-06-24 21:10:52 +00:00
jhb	cceae54c51	Add a new COMPAT7 flag for FreeBSD 7.x compatibility system calls.	2009-06-24 13:36:37 +00:00
bz	0808d0b1a6	After cleaning up rt_tables from vnet.h and cleaning up opt_route.h a lot of files no longer need route.h either. Garbage collect them. While here remove now unneeded vnet.h #includes as well.	2009-06-23 17:03:45 +00:00
thompsa	30004d4d8e	Fix a typo in the frame len function to unbreak the build, make it shorter while I am here.	2009-06-23 06:00:31 +00:00
thompsa	74c6c20b93	- Make struct usb_xfer opaque so that drivers can not access the internals - Reduce the number of headers needed for a usb driver, the common case is just usb.h and usbdi.h	2009-06-23 02:19:59 +00:00
jhb	e206daf142	Regen.	2009-06-22 20:24:03 +00:00
jhb	062accfe3d	Fix a typo in a comment.	2009-06-22 20:12:40 +00:00
brooks	f53c1c309d	Rework the credential code to support larger values of NGROUPS and NGROUPS_MAX, eliminate ABI dependencies on them, and raise them to 1024 and 1023 respectively. (Previously they were equal, but under a close reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it is the number of supplemental groups, not total number of groups.) The bulk of the change consists of converting the struct ucred member cr_groups from a static array to a pointer. Do the equivalent in kinfo_proc. Introduce new interfaces crcopysafe() and crsetgroups() for duplicating a process credential before modifying it and for setting group lists respectively. Both interfaces take care of the details of allocating groups array. crsetgroups() takes care of truncating the group list to the current maximum (NGROUPS) if necessary. In the future, crsetgroups() may be responsible for ensuring invariants such as sorting the supplemental groups to allow groupmember() to be implemented as a binary search. Because we can not change struct xucred without breaking application ABIs, we leave it alone and introduce a new XU_NGROUPS value which is always 16 and is to be used or NGRPS as appropriate for things such as NFS which need to use no more than 16 groups. When feasible, truncate the group list rather than generating an error. Minor changes: - Reduce the number of hand rolled versions of groupmember(). - Do not assign to both cr_gid and cr_groups[0]. - Modify ipfw to cache ucreds instead of part of their contents since they are immutable once referenced by more than one entity. Submitted by: Isilon Systems (initial implementation) X-MFC after: never PR: bin/113398 kern/133867	2009-06-19 17:10:35 +00:00
jhb	0abfb2bd6a	Regen.	2009-06-17 19:53:47 +00:00
jhb	fd29528e09	- Add the ability to mix multiple flags separated by pipe ('|') characters in the type field of system call tables. Specifically, one can now use the 'NO' types as flags in addition to the 'COMPAT' types. For example, to tag 'COMPAT' system calls as living in a KLD via NOSTD. The COMPAT type is required to be listed first in this case. - Add new functions 'type()' and 'flag()' to the embedded awk script in makesyscalls.sh that return true if a requested flag is found in the type field ($3). The flag() function checks all of the flags in the field, but type() only checks the first flag. type() is meant to be used in the top-level "switch" statement and flag() should be used otherwise. - Retire the CPT_NOA type, it is now replaced with "COMPAT|NOARGS" using the flags approach. - Tweak the comment descriptions of COMPAT[46] system calls so that they say "freebsd[46] foo" rather than "old foo". - Document the COMPAT6 type. - Sync comments in compat32 syscall table with the master table.	2009-06-17 19:50:38 +00:00
bz	48dc6805f8	Add explicit includes for jail.h to the files that need them and remove the "hidden" one from vimage.h.	2009-06-17 15:01:01 +00:00
jhb	28b41377e3	Regen.	2009-06-15 20:40:23 +00:00
jhb	447d980cd0	Add a new 'void closefrom(int lowfd)' system call. When called, it closes any open file descriptors >= 'lowfd'. It is largely identical to the same function on other operating systems such as Solaris, DFly, NetBSD, and OpenBSD. One difference from other *BSD is that this closefrom() does not fail with any errors. In practice, while the manpages for NetBSD and OpenBSD claim that they return EINTR, they ignore internal errors from close() and never return EINTR. DFly does return EINTR, but for the common use case (closing fd's prior to execve()), the caller really wants all fd's closed and returning EINTR just forces callers to call closefrom() in a loop until it stops failing. Note that this implementation of closefrom(2) does not make any effort to resolve userland races with open(2) in other threads. As such, it is not multithread safe. Submitted by: rwatson (initial version) Reviewed by: rwatson MFC after: 2 weeks	2009-06-15 20:38:55 +00:00
jamie	f950eed7d7	Get vnets from creds instead of threads where they're available, and from passed threads instead of curthread. Reviewed by: zec, julian Approved by: bz (mentor)	2009-06-15 19:01:53 +00:00
thompsa	06303d491a	s/usb2_/usb_|usbd_/ on all function names for the USB stack.	2009-06-15 01:02:43 +00:00
dchagin	c4e9ea4c7e	Unlock process lock when return error from getrobustlist call. Tested by: Alexander Best <alexbestms at math uni-muenster de> Approved by: kib (mentor) MFC after: 3 days	2009-06-14 17:53:55 +00:00
jamie	e9da16507b	Add counterparts to getcredhostname: getcreddomainname, getcredhostuuid, getcredhostid Suggested by: rmacklem Approved by: bz	2009-06-13 00:12:02 +00:00
kib	6e0d8393e5	Regenerate	2009-06-10 13:48:43 +00:00
kib	6013601373	Add several syscall compat32 entries for extattr manipulation syscalls, that do not require translation of the arguments. Requested by: kientzle Reviewed by: jhb (previous wrong version) MFC after: 1 week	2009-06-10 13:48:13 +00:00
bz	b7ff2bdc20	After r193232 rt_tables in vnet.h are no longer indirectly dependent on the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds. Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.	2009-06-08 19:57:35 +00:00
thompsa	2d149b09c5	Rename usb pipes to endpoints as it better represents what they are, and struct usb_pipe may be used for a different purpose later on.	2009-06-07 19:41:11 +00:00
rwatson	f4934662e5	Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd	2009-06-05 14:55:22 +00:00
dchagin	8e9d8c289c	Add the flags argument forgotten in the previous commit. Approved by: kib (mentor) MFC after: 1 month	2009-06-01 20:54:41 +00:00
dchagin	bb8f1f3e67	Implement accept4 syscall. Approved by: kib (mentor) MFC after: 1 month	2009-06-01 20:48:39 +00:00
dchagin	76d24c5be3	Implement a variation of the accept_common() which takes a flags argument. Do not preserve td_retval before kern_fcntl(F_SETFL) as it does not change. Approved by: kib (mentor) MFC after: 1 month	2009-06-01 20:44:58 +00:00
dchagin	0cc88e7ca3	Split linux_accept() syscall into linux_accept_common() which should be used by linuxulator and linux_accept() itself. Approved by: kib (mentor) MFC after: 1 month	2009-06-01 20:42:27 +00:00
rwatson	b850b85534	Regenerate generated syscall files following changes to struct sysent in r193234.	2009-06-01 16:14:38 +00:00
dchagin	6fb0275352	Implement a variation of the socketpair() syscall which takes a flags in addition to the type argument. Approved by: kib (mentor) MFC after: 1 month	2009-05-31 12:16:31 +00:00
dchagin	ab797d42e4	Move new socket flags handling into a separate function as Linux introduced more syscalls which uses these flags. Approved by: kib (mentor) MFC after: 1 month	2009-05-31 12:04:01 +00:00
dchagin	fbb545b684	Remove empty lines. Approved by: kib (mentor) MFC after: 1 month	2009-05-31 12:00:16 +00:00
delphij	7059dd02fa	Attempt to fix build by updating hostid to follow the new world order.	2009-05-30 07:33:32 +00:00
jamie	572db1408a	Place hostnames and similar information fully under the prison system. The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible. The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed. Approved by: bz (mentor)	2009-05-29 21:27:12 +00:00
thompsa	44c17bdf07	s/usb2_/usb_/ on all typedefs for the USB stack.	2009-05-29 18:46:57 +00:00
delphij	fa743d1903	Implement SI_ISALIST. PR: kern/91293 Submitted by: "Pedro f. Giffuni" <giffunip asme org> Obtained from: NetBSD	2009-05-29 06:27:30 +00:00
delphij	10c7a24940	Fix the sysinfo(SI_HW_SERIAL, emulation so that we actually get the hostid of the machine rather than always getting "0". PR: kern/91293 Submitted by: "Pedro f. Giffuni" <giffunip asme org> Obtained from: NetBSD	2009-05-29 06:19:37 +00:00
delphij	0339972dd9	copyinstr(9) takes parameter 'len' as a size_t , not int . PR: kern/91293 Submitted by: "Pedro f. Giffuni" <giffunip asme org> Obtained from: NetBSD	2009-05-29 06:04:26 +00:00
delphij	5abf6d12a5	de-register. Submitted by: "Pedro f. Giffuni" <giffunip asme org> Obtained from: NetBSD PR: kern/91293	2009-05-29 05:58:46 +00:00
delphij	88adf607ea	svr4_sys_getdents64() should not assume that the cookie would exist everywhere. PR: kern/91293 Submitted by: "Pedro f. Giffuni" <giffunip asme org> Obtained from: NetBSD	2009-05-29 05:51:19 +00:00
delphij	2095d11d4e	Add new sysconfig bits, Fix the bogus numbering of the old bits. Submitted by: "Pedro f. Giffuni" <giffunip asme org> Obtained from: NetBSD PR: kern/91293	2009-05-29 05:37:27 +00:00
delphij	9c4052a028	Use strlcpy().	2009-05-28 21:12:43 +00:00
thompsa	af6fb4f3d2	s/usb2_/usb_/ on all C structs for the USB stack.	2009-05-28 17:36:36 +00:00
avg	8466b56c6c	linux_ioctl_cdrom: reduce stack usage ... by moving two ~2KB structures from stack to heap allocation. I experienced stack overflow in linux emulation on i386 (8K stack) when LINUX_DVD_READ_STRUCT ioctl was performed on atapicam cd device and there was an error that resulted in additional quite heavy stack use in cam layer. Reviewed by: dchagin Approved by: jhb (mentor)	2009-05-27 15:23:12 +00:00
jamie	a013e0afcb	Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)	2009-05-27 14:11:23 +00:00
antoine	29949828e6	Remove an unused variable.	2009-05-24 18:35:53 +00:00
jhb	27f9129d79	Comment nits.	2009-05-20 18:36:17 +00:00
jhb	bc9abffcd5	Put the vnode returned from namei() immediately after namei() returns in svr4_sys_resolvepath().	2009-05-20 18:25:16 +00:00
dchagin	56c9819821	Validate user-supplied arguments values. Args argument is a pointer to the structure located in user space in which the socketcall arguments are packed. The structure must be copied to the kernel instead of direct dereferencing. Approved by: kib (mentor) MFC after: 1 week	2009-05-19 09:10:53 +00:00
dchagin	7316b5296a	Implement MSG_CMSG_CLOEXEC flag for linux_recvmsg(). Approved by: kib (mentor) MFC after: 1 month	2009-05-18 04:07:46 +00:00
dchagin	5351e06699	Somewhere between 2.6.23 and 2.6.27, Linux added SOCK_CLOEXEC and SOCK_NONBLOCK flags, that allow to save fcntl() calls. Implement a variation of the socket() syscall which takes a flags in addition to the type argument. Approved by: kib (mentor) MFC after: 1 month	2009-05-16 18:48:41 +00:00
dchagin	a0c026b20b	Return EINVAL in case when the incorrect or unsupported type argument is specified. Do not map type argument value as its Linux values are identical to FreeBSD values. Approved by: kib (mentor)	2009-05-16 18:46:51 +00:00
dchagin	eae11e9cce	Use the protocol family constants for the domain argument validation. Return immediately when the socket() failed. Approved by: kib (mentor) MFC after: 1 month	2009-05-16 18:44:56 +00:00
dchagin	bc4e3c1f6d	Emulate SO_PEERCRED socket option. Temporarily use 0 for pid member as the FreeBSD does not cache remote UNIX domain socket peer pid. PR: kern/102956 Reviewed by: rwatson Approved by: kib (mentor) MFC after: 1 month	2009-05-16 18:42:18 +00:00
brueffer	e420c3ccc5	Remove an unused variable. Found with: Coverity Prevent(tm) CID: 1167	2009-05-14 09:28:02 +00:00
brueffer	7850ca4b03	Fix memory leak in an error case. Found with: Coverity Prevent(tm) CID: 371 MFC after: 2 weeks	2009-05-13 08:50:13 +00:00
dchagin	ebcb202672	Translate l_timeval arg to native struct timeval in linux_setsockopt()/linux_getsockopt() for SO_RCVTIMEO, SO_SNDTIMEO opts as l_timeval has MD members. Remove bogus __packed attribute from l_timeval struct on __amd64__. PR: kern/134276 Submitted by: Thomas Mueller <tmueller sysgo com> Approved by: kib (mentor) MFC after: 2 weeks	2009-05-11 13:50:42 +00:00
dchagin	4f4faf9d43	Add forgotten linux to bsd flags argument mapping into the linux_recv(). PR: kern/134276 Submitted by: Thomas Mueller <tmueller sysgo com> Approved by: kib (mentor) MFC after: 2 weeks	2009-05-11 13:42:40 +00:00
dchagin	51f122997d	Do not export AT_CLKTCK when emulating Linux kernel prior to 2.4.0, as it has appeared in the 2.4.0-rc7 first time. Being exported, AT_CLKTCK is returned by sysconf(_SC_CLK_TCK), glibc falls back to the hard-coded CLK_TCK value when aux entry is not present. Glibc versions prior to 2.2.1 always use hard-coded CLK_TCK value. For older applications/libc's which depends on hard-coded CLK_TCK value user should set compat.linux.osrelease less than 2.4.0. Approved by: kib (mentor)	2009-05-10 18:43:43 +00:00
dchagin	e4e6bf246f	Introduce linux_kernver() interface which is intended for an exact designation of the emulated kernel version. linux_kernver() returns integer value formatted as 'VVVMMMIII' where VVV - version, MMM - major revision, III - minor revision. Approved by: kib (mentor)	2009-05-10 18:27:20 +00:00
dchagin	ab5a6b0d18	Rework r189362, r191883. The frequency of the statistics clock is given by stathz. Use stathz if it is available, otherwise use hz. Pointed out by: bde Approved by: kib (mentor)	2009-05-10 18:16:07 +00:00
ed	8dbae36d2b	Regenerate system call tables to use SVN ids.	2009-05-08 20:16:04 +00:00
ed	59fb74ae92	Burn TTY ioctl bridges in compat layers. I really don't want any pieces of code to include ioctl_compat.h, so let the ibcs2 and svr4 compat leave sgtty alone. If they want to support sgtty, they should emulate it on top of termios, not sgtty. The code has been marked with BURN_BRIDGES for a long time. ibcs2 and svr4 are not really popular pieces of code anyway.	2009-05-08 20:06:37 +00:00
zec	639797b2e6	Introduce a new virtualization container, provisionally named vprocg, to hold virtualized instances of hostname and domainname, as well as a new top-level virtualization struct vimage, which holds pointers to struct vnet and struct vprocg. Struct vprocg is likely to become replaced in the near future with a new jail management API import. As a consequence of this change, change struct ucred to point to a struct vimage, instead of directly pointing to a vnet. Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage branch. Permit kldload / kldunload operations to be executed only from the default vimage context. This change should have no functional impact on nooptions VIMAGE kernel builds. Reviewed by: bz Approved by: julian (mentor)	2009-05-08 14:11:06 +00:00
jamie	fa0fd85038	Give vfs_getopt the type it's expecting. Write 100 times: "32 bits is so twentieth century." Noticed by: dchagin	2009-05-07 19:46:29 +00:00
jamie	267ea54b44	Move the per-prison Linux MIB from a private one-off pointer to the new OSD-based jail extensions. This allows the Linux MIB to be accessed via jail_set and jail_get, and serves as a demonstration of adding jail support to a module. Reviewed by: dchagin, kib Approved by: bz (mentor)	2009-05-07 18:36:47 +00:00
dchagin	dfa2940c7a	Add KTR(9) tracing for futex emulation. Approved by: kib (mentor) MFC after: 1 month	2009-05-07 16:14:31 +00:00
dchagin	f096e29878	Linux exports HZ value to user space via AT_CLKTCK auxiliary vector entry, which is available for Glibc as sysconf(_SC_CLK_TCK). If AT_CLKTCK entry is not exported, Glibc uses 100. linux_times() shall use the value that is exported to user space. Pointyhat to: dchagin PR: kern/134251 Approved by: kib (mentor) MFC after: 2 weeks	2009-05-07 14:24:50 +00:00
dchagin	e0ce6b415e	Change linux struct tms definition to match actual linux one. Approved by: kib (mentor) MFC after: 2 weeks	2009-05-07 12:55:58 +00:00
dchagin	69492be31f	Add preliminary KTR(9) support to the linux emulation layer. Approved by: kib (mentor) MFC after: 1 month	2009-05-07 10:01:05 +00:00
dchagin	010f4da5f8	To avoid excessive code duplication move MI definitions to the MI header file. As it is defined in Linux. Approved by: kib (mentor) MFC after: 1 month	2009-05-07 09:39:20 +00:00
dchagin	3ce50871ce	Return EAFNOSUPPORT instead of EINVAL in case when the incorrect or unsupported domain argument is specified. Approved by: kib (mentor)	2009-05-07 09:34:02 +00:00
dchagin	9f1df51422	Rework r191742. Use the protocol family constants for the domain argument validation. Return EAFNOSUPPORT in case when the incorrect domain argument is specified. Return EPROTONOSUPPORT instead of passing values that are not 0 to the BSD layer. Suggested by: rwatson Approved by: kib (mentor) MFC after: 1 month	2009-05-07 03:23:22 +00:00
jamie	9fea2e998c	Mark Linux MIB sysctls MPSAFE. Reviewed by: dchagin, kib Approved by: bz (mentor)	2009-05-04 19:06:05 +00:00
dchagin	f04150bca8	Linux socketpair() call expects explicit specified protocol for AF_LOCAL domain unlike FreeBSD which expects 0 in this case. Approved by: kib (mentor) MFC after: 1 month	2009-05-02 10:51:40 +00:00
dchagin	32b5830d97	Move extern variable definitions to the header file. Approved by: kib (mentor) MFC after: 1 month	2009-05-02 10:06:49 +00:00
dchagin	dca50049ce	Reimplement futexes. Old implementation used Giant to protect the kernel data structures, but at the same time called malloc(M_WAITOK), that could cause the calling thread to sleep and lose Giant protection. User-visible result was the missed wakeup. New implementation uses one sx lock per futex. The sx protects the futex structures and allows to sleep while copyin or copyout are performed. Unlike linux, we return EINVAL when FUTEX_CMP_REQUEUE operation is requested and either caller specified futexes are equal or second futex already exists. This is acceptable since the situation can only occur from the application error, and glibc falls back to old FUTEX_WAKE operation when FUTEX_CMP_REQUEUE returns an error. Approved by: kib (mentor) MFC after: 1 month	2009-05-01 15:36:02 +00:00
jamie	8fbb51e637	Regen for new jail system calls in r191673. Approved by: bz (mentor)	2009-04-29 21:50:13 +00:00
jamie	453b86f943	Introduce the extensible jail framework, using the same "name=value" interface as nmount(2). Three new system calls are added: * jail_set, to create jails and change the parameters of existing jails. This replaces jail(2). * jail_get, to read the parameters of existing jails. This replaces the security.jail.list sysctl. * jail_remove to kill off a jail's processes and remove the jail. Most jail parameters may now be changed after creation, and jails may be set to exist without any attached processes. The current jail(2) system call still exists, though it is now a stub to jail_set(2). Approved by: bz (mentor)	2009-04-29 21:14:15 +00:00
zec	8d976eab5c	In preparation for turning on options VIMAGE in next commits, rearrange / replace / adjust several INIT_VNET_* initializer macros, all of which currently resolve to whitespace. Reviewed by: bz (an older version of the patch) Approved by: julian (mentor)	2009-04-26 22:06:42 +00:00
dchagin	ada9604fd2	Remove support for FUTEX_REQUEUE operation. Glibc does not use this operation since 2.3.3 version (Jun 2004), as it is racy and replaced by FUTEX_CMP_REQUEUE operation. Glibc versions prior to 2.3.3 fall back to FUTEX_WAKE when FUTEX_REQUEUE returned EINVAL. Any application directly using FUTEX_REQUEUE without return value checking are definitely broken. Limit quantity of messages per process about unsupported operation. Approved by: kib (mentor) MFC after: 1 month	2009-04-19 13:48:42 +00:00
thompsa	f498dc2227	MFp4 //depot/projects/usb@159909 - make usb2_power_mask_t 16-bit - remove "usb2_config_sub" structure from "usb2_config". To compensate for this "usb2_config" has a new field called "usb_mode" which select for which mode the current xfer entry is active. Options are: a) Device mode only b) Host mode only (default-by-zero) c) Both modes. This change was scripted using the following sed script: "s/\.mh\././g". - the standard packet size table in "usb_transfer.c" is now a function, hence the code for the function uses less memory than the table itself. Submitted by: Hans Petter Selasky	2009-04-05 18:20:38 +00:00
dchagin	01bf63c9fb	Fix KBI breakage by r190520 which affects older linux.ko binaries: 1) Move the new field (brand_note) to the end of the Brandinfo structure. 2) Add a new flag BI_BRAND_NOTE that indicates that the brand_note pointer is valid. 3) Use the brand_note field if the flag BI_BRAND_NOTE is set and as old modules won't have the flag set, so the new field brand_note would be ignored. Suggested by: jhb Reviewed by: jhb Approved by: kib (mentor) MFC after: 6 days	2009-04-05 09:27:19 +00:00
kib	9fb298f6bc	Regen	2009-04-01 13:12:40 +00:00
kib	75320d2f76	Rename implementation function for freebsd32 sysarch(2) to allow for the arguments translations. Provide ABI-compatible definition of the struct i386_ldt_args for freebsd32 compat layer. In collaboration with: pho Reviewed by: jhb	2009-04-01 13:11:50 +00:00
kib	8e7a736a88	Add all segment registers for the amd64 CPU to struct reg and mcontext. To keep these structures ABI-compatible, half the size of r_trapno, r_err, mc_trapno, mc_flags. Add fsbase and gsbase to mcontext on both amd64 and i386. Add flags to amd64 mcontext to indicate that it contains valid segments or bases. In collaboration with: pho Discussed with: peter Reviewed by: jhb	2009-04-01 12:44:17 +00:00
ed	7fc8939e4a	Emulate the FIODGNAME ioctl in our 32-bit emulator. It's quite strange that nobody reported this issue before. It turns out functions like ttyname(), ptsname() and fdevname() don't work in compat32. This means it's not even possible to run applications like script(1) inside a 32-bit FreeBSD jail. Fix this by converting 32-bit fiodgname_arg structures to their 64-bit equivalent. Reported by: kris Tested by: kris	2009-03-29 20:09:51 +00:00
jamie	5a5a677581	Whitespace/spelling fixes in advance of upcoming functional changes. Approved by: bz (mentor)	2009-03-27 13:13:59 +00:00
ambrisko	ac334eb30e	Add stuff to support upcoming BMC/IPMI flashing of newer Dell machine via the Linux tool. - Add Linux shim to ipmi(4) - Create a partitions file to linprocfs to make Linux fdisk see disks. This file is dynamic so we can see disks come and go. - Convert msdosfs to vfat in mtab since Linux uses that for msdosfs. - In the Linux mount path convert vfat passed in to msdosfs so Linux mount works on FreeBSD. Note that tasting works so that if da0 is a msdos file system /compat/linux/bin/mount /dev/da0 /mnt works. - Fix a 64-bit bug for l_off_t. Grabbing sh, mount, fdisk, df from Linux, creating a symlink of mtab to /compat/linux/etc/mtab and then some careful unpacking of the Linux bmc update tool and hacking makes it work on newer Dell boxes. Note, probably if you can't figure out how to do this, then you probably shouldn't be doing it :-)	2009-03-26 17:14:22 +00:00
weongyo	7fabe111cb	Some NDIS USB drivers try to call URB funcs like URB_FUNCTION_VENDOR_xxx or URB_FUNCTION_CLASS_xxx with HAL preemption lock that means it's non-sleepable during USB requests though usb2_do_request() requires a sleep so it needs to send queries to the default pipe without those interfaces to avoid sleep.	2009-03-18 02:38:35 +00:00
weongyo	bcc40d445d	If the caller sets irp_usriostat or irp_usrevent, try to process it whatever the IRP flag is, because some drivers (e.g. the RTL8187L NDIS driver) call IoCompleteRequest() without setting flags. This prevents waiting for an event forever at attach.	2009-03-18 01:57:54 +00:00
kib	e905171fbe	Supply AT_EXECPATH auxinfo entry to the interpreter, both for native and compat32 binaries. Tested by: pho Reviewed by: kan	2009-03-17 12:53:28 +00:00
weongyo	a204e02a55	Grab the NDIS USB lock instead of HAL preemption. This change should have happened in the previous commit.	2009-03-17 05:57:43 +00:00
weongyo	bde9ac2d70	use usb2_desc_foreach() to iterate the USB config descriptor instead of accessing structures directly to check some invalid descriptors. Pointed by: hps	2009-03-16 11:19:07 +00:00
dchagin	f248585449	Sort include files in the alphabetical order. Approved by: kib (mentor) MFC after: 2 weeks	2009-03-16 05:39:37 +00:00
dchagin	e488f4df7a	Ignore FUTEX_FD op, as it is done by linux. Approved by: kib (mentor) MFC after: 2 weeks	2009-03-15 19:38:34 +00:00
dchagin	09af73f25f	Include linux_futex.h before linux_emul.h Approved by: kib (mentor) MFC after: 6 days	2009-03-15 19:16:12 +00:00
dchagin	2408b715a0	Implement new way of branding ELF binaries by looking to a ".note.ABI-tag" section. The search order of a brand is changed, now first of all the ".note.ABI-tag" is looked through. Move code which fetch osreldate for ELF binary to check_note() handler. PR: 118473 Approved by: kib (mentor)	2009-03-13 16:40:51 +00:00
weongyo	efbcaa065b	o Change the lock model from one based on the HAL preemption lock to a normal mtx. The HAL preemption lock is a problem on SMP machines and causes a panic. o When a device is detached, the current tactic for detaching the NDIS USB driver is to raise the SURPRISE_REMOVED event, so there is no need to call ndis_halt_nic() again. This fixes some page faults when some drivers behave abnormally. o It is now assumed that URB_FUNCTION_BULK_OR_INTERRUPT_TRANSFER is in DISPATCH_LEVEL (non-sleepable); as further work, URB_FUNCTION_VENDOR_XXX and URB_FUNCTION_CLASS_XXX should be as well. Reviewed by: Hans Petter Selasky <hselasky_at_freebsd.org> Tested by: Paul B. Mahol <onemda_at_gmail.com>	2009-03-12 02:51:55 +00:00
weongyo	6d523cd42a	o port NDIS USB support from USB1 to the new usb(USB2). o implement URB_FUNCTION_ABORT_PIPE handling. o remove unused code related with canceling the timer list for USB drivers. o whitespace cleanup and style(9) Obtained from: hps's original patch	2009-03-07 07:26:22 +00:00
jhb	e1b708897e	A better fix for handling different FPU initial control words for different ABIs: - Store the FPU initial control word in the pcb for each thread. - When first using the FPU, load the initial control word after restoring the clean state if it is not the standard control word. - Provide a correct control word for Linux/i386 binaries under FreeBSD/amd64. - Adjust the control word returned for fpugetregs()/npxgetregs() when a thread hasn't used the FPU yet to reflect the real initial control word for the current ABI. - The Linux/i386 ABI for FreeBSD/i386 now properly sets the right control word instead of trashing whatever the current state of the FPU is. Reviewed by: bde	2009-03-05 19:42:11 +00:00
dchagin	45cda70b8f	Add AT_PLATFORM, AT_HWCAP and AT_CLKTCK auxiliary vector entries which are used by glibc. This silences the message "2.4+ kernel w/o ELF notes?" from some programs at start, among them are top and pkill. Do the assignment of the vector entries in elf_linux_fixup() as it is done in glibc. Fix some minor style issues. Submitted by: Marcin Cieslak <saper at SYSTEM PL> Approved by: kib (mentor) MFC after: 1 week	2009-03-04 12:14:33 +00:00
jamie	63f98fcc6a	Extend the "vfsopt" mount options for more general use. Make struct vfsopt and the vfs_buildopts function public, and add some new fields to struct vfsopt (pos and seen), and new functions vfs_getopt_pos and vfs_opterror. Further extend the interface to allow reading options from the kernel in addition to sending them to the kernel, with vfs_setopt and related functions. While this allows the "name=value" option interface to be used for more than just FS mounts (planned use is for jails), it retains the current "vfsopt" name and <sys/mount.h> requirement. Approved by: bz (mentor)	2009-03-02 23:26:30 +00:00
bz	df2be82cec	For all files including net/vnet.h directly include opt_route.h and net/route.h. Remove the hidden include of opt_route.h and net/route.h from net/vnet.h. We need to make sure that both opt_route.h and net/route.h are included before net/vnet.h because of the way MRT figures out the number of FIBs from the kernel option. If we do not, we end up with the default number of 1 when including net/vnet.h and array sizes are wrong. This does not change the list of files which depend on opt_route.h but we can identify them now more easily.	2009-02-27 14:12:05 +00:00
rdivacky	e5bfcba080	Change the functions to ANSI in those cases where it breaks promotion to int rule. See ISO C Standard: SS6.7.5.3:15. Approved by: kib (mentor) Reviewed by: warner Tested by: silence on -current	2009-02-24 18:09:31 +00:00
thompsa	44cdb003f7	Move usb to a graveyard location under sys/legacy/dev, it is intended that the new USB2 stack will fully replace this for 8.0. Remove kernel modules, a subsequent commit will update conf/files. Unhook usbdevs from the build.	2009-02-23 18:16:17 +00:00
ed	72727e8d9f	Don't make Linux stat() open character devices to resolve its name. The existing code calls kern_open() to resolve the vnode of a pathname right after a stat(). This is not correct, because it causes random character devices to be opened in /dev. This means ls'ing a tape streamer will cause it to rewind, for example. Changes I have made: - Add kern_statat_vnhook() to allow binary emulators to `post-process' struct stat, using the proper vnode. - Remove unneeded printf's from stat() and statfs(). - Make the Linuxolator use kern_statat_vnhook(), replacing translate_path_major_minor_at(). - Let translate_fd_major_minor() use vp->v_rdev instead of vp->v_un.vu_cdev. Result: crw-rw-rw- 1 root root 0, 14 Feb 20 13:54 /dev/ptmx crw--w---- 1 root adm 136, 0 Feb 20 14:03 /dev/pts/0 crw--w---- 1 root adm 136, 1 Feb 20 14:02 /dev/pts/1 crw--w---- 1 ed tty 136, 2 Feb 20 14:03 /dev/pts/2 Before this commit, ptmx also had a major number of 136, because it silently allocated and deallocated a pseudo-terminal. Device nodes that cannot be opened now have proper major/minor-numbers. Reviewed by: kib, netchild, rdivacky (thanks!)	2009-02-20 13:05:29 +00:00
jhb	26e338d6fc	Use shared vnode locks when invoking VOP_READDIR(). MFC after: 1 month	2009-02-13 18:18:14 +00:00
jhb	7ef64c1bf9	Fix a bug in the previous change to the mtab handler: use the path returned by vn_fullpath() when vn_fullpath() succeeds instead of when it fails. Submitted by: Artem Belevich fbsdlist of src.cx MFC after: 3 days	2009-02-13 15:32:03 +00:00
netchild	810bd8f924	Fix an edge-case of the linux readdir: We need the size of a linux dirent structure, not the size of a pointer to it. PR: 131099 Submitted by: Andreas Kies <andikies@gmail.com> MFC after: 2 weeks	2009-02-13 11:55:19 +00:00
obrien	7a153194ec	Change some movl's to mov's. Newer GAS no longer accept 'movl' instructions for moving between a segment register and a 32-bit memory location. Looked at by: jhb	2009-01-31 11:37:21 +00:00
ed	a964306db9	Last step of splitting up minor and unit numbers: remove minor(). Inside the kernel, the minor() function was responsible for obtaining the device minor number of a character device. Because we made device numbers dynamically allocated and independent of the unit number passed to make_dev() a long time ago, it was actually a misnomer. If you really want to obtain the device number, you should use dev2udev(). We already converted all the drivers to use dev2unit() to obtain the device unit number, which is still used by a lot of drivers. I've noticed not a single driver passes NULL to dev2unit(). Even if they would, its behaviour would make little sense. This is why I've removed the NULL check. This commit removes minor(), minor2unit() and unit2minor() from the kernel. Because there was a naming collision with uminor(), we can rename umajor() and uminor() back to major() and minor(). This means that the makedev(3) manual page also applies to kernel space code now. I suspect umajor() and uminor() isn't used that often in external code, but to make it easier for other parties to port their code, I've increased __FreeBSD_version to 800062.	2009-01-28 17:57:16 +00:00
jkim	ad7caec2a5	Replace couple of strcmp(cpu_vendor, "foo") with cpu_vendor_id for i386 and hide i386-specific code under #ifdef.	2009-01-22 17:06:33 +00:00
ed	f3a9a195cb	Push down Giant inside sysctl. Also add some more assertions to the code. In the existing code we didn't really enforce that callers hold Giant before calling userland_sysctl(), even though there is no guarantee it is safe. Fix this by just placing Giant locks around the call to the oid handler. This also means we only pick up Giant for a very short period of time. Maybe we should add MPSAFE flags to sysctl or phase it out all together. I've also added SYSCTL_LOCK_ASSERT(). We have to make sure sysctl_root() and name2oid() are called with the sysctl lock held. Reviewed by: Jille Timmermans <jille quis cx>	2008-12-29 12:58:45 +00:00
kib	bd5d614be8	vm_map_lock_read() does not increment map->timestamp, so we should compare map->timestamp with saved timestamp after map read lock is reacquired, not with saved timestamp + 1. The only consequence of the +1 was unconditional lookup of the next map entry, though. Tested by: pho Approved by: des MFC after: 2 weeks	2008-12-29 12:45:11 +00:00
ganbold	ab8a937c28	Remove unused variable. Found with: Coverity Prevent(tm) CID: 542 Approved by: weongyo	2008-12-28 13:50:58 +00:00
weongyo	4a6f6562e4	Fix a bug in handling the argument: a `device_t' is passed but it is handled as a `struct ndis_softc'. This would cause a panic when the driver is detached.	2008-12-27 09:42:17 +00:00
weongyo	0f8825b3f7	Integrate the NDIS USB support code to CURRENT. Now the NDISulator supports NDIS USB drivers that it've tested with devices as follows: - Anygate XM-142 (Conexant) - Netgear WG111v2 (Realtek) - U-Khan UW-2054u (Marvell) - Shuttle XPC Accessory PN20 (Realtek) - ipTIME G054U2 (Ralink) - UNiCORN WL-54G (ZyDAS) - ZyXEL G-200v2 (ZyDAS) All of them succeeded to attach and worked though there are still some problems that it's expected to be solved. To use NDIS USB support, you should rebuild and install ndiscvt(8) and if you encounter a problem to attach please set `hw.ndisusb.halt' to 0 then retry. I expect no changes of the NDIS code for PCI, PCMCIA devices. Obtained from: //depot/projects/ndisusb/...	2008-12-27 08:03:32 +00:00
kib	ce7791f58d	Remove two remnant uses of AT_DEBUG.	2008-12-17 13:13:35 +00:00
kib	e747469903	Reference the vmspace of the process being inspected by procfs, linprocfs and sysctl kern_proc_vmmap handlers. Reported and tested by: pho Reviewed by: rwatson, des MFC after: 1 week	2008-12-12 12:12:36 +00:00
bz	da8c897826	Add 32-bit compat support for AIO. jhb probably forgot to commit this file with r185878 and will want to review this. It unbreaks the build here. Obtained from: p4 //depot/user/jhb/lock/compat/freebsd32/freebsd32_signal.h#2	2008-12-11 00:58:05 +00:00
jhb	0bf9254d64	Regen.	2008-12-10 20:57:16 +00:00
jhb	f3dcc2d9e0	- Add 32-bit compat system calls for VFS_AIO. The system calls live in the aio code and are registered via the recently added SYSCALL32_*() helpers. - Since the aio code likes to invoke fuword and suword a lot down in the "bowels" of system calls, add a structure holding a set of operations for things like storing errors, copying in the aiocb structure, storing status, etc. The 32-bit system calls use a separate operations vector to handle fuword32 vs fuword, etc. Also, the oldsigevent handling is now done by having separate operation vectors with different aiocb copyin routines. - Split out kern_foo() functions for the various AIO system calls so the 32-bit front ends can manage things like copying in and converting timespec structures, etc. - For both the native and 32-bit aio_suspend() and lio_listio() calls, just use copyin() to read the array of aiocb pointers instead of using a for loop that iterated over fuword/fuword32. The error handling in the old case was incomplete (lio_listio() just ignored any aiocb's that it got an EFAULT trying to read rather than reporting an error), and possibly slower. MFC after: 1 month	2008-12-10 20:56:19 +00:00
kib	ca4d4f3494	Relock user map earlier, to have the lock held when break leaves the loop earlier due to sbuf error. Pointy hat to: me Submitted by: dchagin	2008-12-10 16:11:09 +00:00
kib	5981f9f73b	Make two style changes to create new commit and document proper commit message for r185765. Noted by: rdivacky Requested by: des Commit message for r185765 should be: In procfs map handler, and in linprocfs maps handler, do not call vn_fullpath() while having vm map locked. This is done in anticipation of the vop_vptocnp commit, that would make vn_fullpath sometime acquire vnode lock. Also, in linprocfs, maps handler already acquires vnode lock. No objections from: des MFC after: 2 week	2008-12-08 13:15:31 +00:00
kib	4f0c734de3	Change the linprocfs <pid>/maps and procfs <pid>/map handlers to use sbuf instead of doing uiomove. This allows for reads from non-zero offsets to work. Patch is forward-ported des@'s one, and was adapted to current code by dchagin@ and me. Reviewed by: des (linprocfs part) PR: kern/101453 MFC after: 1 week	2008-12-08 12:34:52 +00:00
jhb	d5575b642e	When unloading a 32-bit system call module, restore the sysent vector in the 32-bit system call table instead of the main system call table.	2008-12-03 18:45:38 +00:00
bz	604d89458a	Rather than using hidden includes (with circular dependencies), directly include only the header files needed. This reduces the unneeded spamming of various headers into lots of files. For now, this leaves us with very few modules including vnet.h and thus needing to depend on opt_route.h. Reviewed by: brooks, gnn, des, zec, imp Sponsored by: The FreeBSD Foundation	2008-12-02 21:37:28 +00:00
kib	8ffb383318	Make linux_sendmsg() and linux_recvmsg() work on linux32/amd64. Change types used in the linux' struct msghdr and struct cmsghdr definitions to the properly-sized architecture-specific types. Move ancillary data handler from linux_sendit() to linux_sendmsg(). Submitted by: dchagin	2008-11-29 17:14:06 +00:00
bz	7a6d0a128f	Regen after jail support was added in r185435.	2008-11-29 14:34:30 +00:00
bz	d2730d5b27	MFp4: Bring in updated jail support from bz_jail branch. This enhances the current jail implementation to permit multiple addresses per jail. In addition to IPv4, IPv6 is supported as well. Due to updated checks it is even possible to have jails without an IP address at all, which basically gives one a chroot with restricted process view, no networking,.. SCTP support was updated and supports IPv6 in jails as well. Cpuset support permits jails to be bound to specific processor sets after creation. Jails can have an unrestricted (no duplicate protection, etc.) name in addition to the hostname. The jail name cannot be changed from within a jail and is considered to be used for management purposes or as audit-token in the future. DDB 'show jails' command was added to aid debugging. Proper compat support permits 32bit jail binaries to be used on 64bit systems to manage jails. Also backward compatibility was preserved where possible: for jail v1 syscalls, as well as with user space management utilities. Both jail as well as prison version were updated for the new features. A gap was intentionally left as the intermediate versions had been used by various patches floating around the last years. Bump __FreeBSD_version for the aforementioned and in kernel changes. Special thanks to: - Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches and Olivier Houchard (cognet) for initial single-IPv6 patches. - Jeff Roberson (jeff) and Randall Stewart (rrs) for their help, ideas and review on cpuset and SCTP support. - Robert Watson (rwatson) for lots and lots of help, discussions, suggestions and review of most of the patch at various stages. - John Baldwin (jhb) for his help. - Simon L. Nielsen (simon) as early adopter testing changes on cluster machines as well as all the testers and people who provided feedback the last months on freebsd-jail and other channels. - My employer, CK Software GmbH, for the support so I could work on this. Reviewed by: (see above) MFC after: 3 months (this is just so that I get the mail) X-MFC Before: 7.2-RELEASE if possible	2008-11-29 14:32:14 +00:00
rdivacky	b213864d66	Document that all the other commands are either identical to the FreeBSD ones or rejected by kern_msgctl(). Found with: Coverity Prevent(tm) CID: 3456 Approved by: kib (mentor)	2008-11-26 16:38:43 +00:00
kib	8fad2283b3	Add sv_flags field to struct sysentvec with intention to provide description of the ABI of the currently executing image. Change some places to test the flags instead of explicit comparing with address of known sysentvec structures to determine ABI features. Discussed with: dchagin, imp, jhb, peter	2008-11-22 12:36:15 +00:00
kib	f5d16a4d66	In the robust futexes list head, futex_offset shall be signed, and glibc actually supplies negative offsets. Change l_ulong to l_long. Submitted by: dchagin	2008-11-16 15:45:41 +00:00
peter	c0dbc72cb7	Sigh. Fix a pointer/int compile error.	2008-11-10 23:36:20 +00:00
peter	82e115654c	Fix a signal emulation bug introduced in r163018 (and present in 7.x). This prevents 32 bit signal handlers from finding out what the faulting address is. Both the secret 4th argument and siginfo->si_addr are zero.	2008-11-10 23:26:52 +00:00
ed	7baae41248	Regenerate system call tables for r184789.	2008-11-09 10:48:06 +00:00
ed	9d3703b842	Mark uname(), getdomainname() and setdomainname() with COMPAT_FREEBSD4. Looking at our source code history, it seems the uname(), getdomainname() and setdomainname() system calls got deprecated somewhere after FreeBSD 1.1, but they have never been phased out properly. Because we don't have a COMPAT_FREEBSD1, just use COMPAT_FREEBSD4. Also fix the Linuxolator to build without the setdomainname() routine by just making it call userland_sysctl on kern.domainname. Also replace the setdomainname()'s implementation to use this approach, because we're duplicating code with sysctl_domainname(). I wasn't able to keep these three routines working in our COMPAT_FREEBSD32, because that would require yet another keyword for syscalls.master (COMPAT4+NOPROTO). Because this routine is probably unused already, this won't be a problem in practice. If it turns out to be a problem, we'll just restore this functionality. Reviewed by: rdivacky, kib	2008-11-09 10:45:13 +00:00
des	dd07e118d8	utf-8 MFC after: 3 weeks	2008-11-05 15:08:09 +00:00
jhb	31cc9ab8d9	Don't leak a reference on the /compat/linux vnode everytime the linprocfs 'mtab' file is read. MFC after: 1 month	2008-11-04 18:53:33 +00:00
dfr	6929a6d99b	Regen.	2008-11-03 10:39:35 +00:00
dfr	2fb03513fc	Implement support for RPCSEC_GSS authentication to both the NFS client and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month	2008-11-03 10:38:00 +00:00
kib	288874a97d	The code in linux_proc_exit() contains a race when multiple linux based processes exit at the same time. The linux_emuldata structure is freed but p->p_emuldata is left as a dangling pointer to the just freed memory. The check for W_EXIT in the loop scanning the child processes isn't safe since the state of the child process can change right afterwards. Lock the process and check the W_EXIT before delivering signal. Submitted by: tegge Reviewed by: davidxu MFC after: 1 week	2008-10-31 10:38:30 +00:00
trasz	0ad8692247	Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
des	66f807ed8b	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
jhb	e416d53f44	Regen for freebsd32_getdirentries().	2008-10-22 21:56:44 +00:00
jhb	327ae6eb3a	Split the copyout of *base at the end of getdirentries() out leaving the rest in kern_getdirentries(). Use kern_getdirentries() to implement freebsd32_getdirentries(). This fixes a bug where calls to getdirentries() in 32-bit binaries would trash the 4 bytes after the 'long base' in userland. Submitted by: ups MFC after: 1 week	2008-10-22 21:55:48 +00:00
kib	29ccf7d166	Correctly fill siginfo for the signals delivered by linux tkill/tgkill. It is required for async cancellation to work. Fix PROC_LOCK leak in linux_tgkill when signal delivery attempt is made to a non-linux process. Do not call em_find(p, ...) with p unlocked. Move common code for linux_tkill() and linux_tgkill() into linux_do_tkill(). Change linux siginfo_t definition to match actual linux one. Extend uid fields to 4 bytes from 2. The extension does not change structure layout and is binary compatible with previous definition, because i386 is little endian, and each uid field has 2 byte padding after it. Reported by: Nicolas Joly <njoly pasteur fr> Submitted by: dchagin MFC after: 1 month	2008-10-19 10:02:26 +00:00
kib	faae1c0f2f	Make robust futexes work on linux32/amd64. Use PTRIN to read user-mode pointers. Change types used in the structures definitions to properly-sized architecture-specific types. Submitted by: dchagin MFC after: 1 week	2008-10-14 07:59:23 +00:00
kib	d7ec3f21ab	Current linux_fooaffinity() emulation fails, as the FreeBSD affinity syscalls expect the bitmap size in the range from 32 to 128. Old glibc always assumed size 1024, while newer glibc searches for appropriate size, starting from 1024 and going up. For now, use FreeBSD size of cpuset_t for bitmap size parameter and return EINVAL if length of user space bitmap less than our size of cpuset_t. Submitted by: dchagin MFC after: 1 week [This requires MFC of the actual linux affinity syscalls]	2008-10-04 19:23:30 +00:00
kib	79819484b6	Change the linprocfs <pid>/maps and procfs <pid>/map handlers to use sbuf instead of doing uiomove. This allows for reads from non-zero offsets to work. Patch is forward-ported des@'s one, and was adapted to current code by dchagin@ and me. Reviewed by: des (linprocfs part) PR: kern/101453 MFC after: 1 week	2008-10-04 14:08:16 +00:00
zec	8797d4caec	Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparison between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-10-02 15:37:58 +00:00
cognet	f2b740a1ea	Advertise bit 26 as sse2. Spotted out by: gahr	2008-09-26 15:29:18 +00:00
jhb	97facf9f0e	Add support for installing 32-bit system calls from kernel modules. This includes syscall32_{de,}register() routines as well as a module handler and wrapper macros similar to the support for native syscalls in <sys/sysent.h>. MFC after: 1 month	2008-09-25 20:50:21 +00:00
jhb	7cd998e440	Sort includes and add multiple include guards.	2008-09-25 20:12:38 +00:00
jhb	6ccb676bf2	Regen.	2008-09-25 20:08:36 +00:00
jhb	00776aeb58	Tidy up a few things with syscall generation: - Instead of using a syscall slot (370) just to get a function prototype for lkmressys(), add an explicit function prototype to <sys/sysent.h>. This also removes unused special case checks for 'lkmressys' from makesyscalls.sh. - Instead of having magic logic in makesyscalls.sh to only generate a function prototype the first time 'lkmnosys' is seen, make 'NODEF' always not generate a function prototype and include an explicit prototype for 'lkmnosys' in <sys/sysent.h>. - As a result of the fix in (2), update the LKM syscall entries in the freebsd32 syscall table to use 'lkmnosys' rather than 'nosys'. - Use NOPROTO for the __syscall() entry (198) in the native ABI. This avoids the need for magic logic in makesyscalls.sh to only generate a function prototype the first time 'nosys' is encountered.	2008-09-25 20:07:42 +00:00
kib	c500808674	Change the static struct sysentvec and struct Elf_Brandinfo initializers to the C99 style. At least, it is easier to read sysent definitions that way, and search for the actual instances of sigcode etc. Explicitly initialize sysentvec.sv_maxssiz that was missed in most sysvecs. No objection from: jhb MFC after: 1 month	2008-09-24 10:14:37 +00:00
trasz	d1f0654ba4	Fix usage of mac_vnode_check_open() in linuxulator - last argument should be VREAD, not FREAD. Approved by: rwatson (mentor)	2008-09-22 18:59:24 +00:00
obrien	f419b74a96	Add freebsd32 compat shims for ioctl(2) CDIOREADTOCHEADER and CDIOREADTOCENTRYS requests.	2008-09-22 16:24:36 +00:00
obrien	3b732278f8	Regenerate for r183270.	2008-09-22 16:09:43 +00:00
obrien	52c49eeb94	Add freebsd32 compat shims for ioctl(2) MDIOCATTACH, MDIOCDETACH, MDIOCQUERY, and MDIOCLIST requests.	2008-09-22 16:09:16 +00:00
obrien	b0fffd3316	Regenerate for r183188.	2008-09-19 15:21:40 +00:00
obrien	0c0da6bba7	Add freebsd32 compat shim for nmount(2). (and quiet some compiler warnings for vfs_donmount)	2008-09-19 15:17:32 +00:00
obrien	be294ffd1c	style(9)	2008-09-15 17:39:40 +00:00
obrien	9966e57b58	Regenerate for r183042.	2008-09-15 17:39:01 +00:00
obrien	a842ca36c9	Fix bug in r100384 (rev 1.2) in which the 32-bit swapon(2) was made "obsolete, not included in system", whereas the system call does exist.	2008-09-15 17:37:41 +00:00
ed	98f8e6b0ee	Allow COMPAT_SVR4 to be built without COMPAT_43. It seems we only depend on COMPAT_43 to implement the send() and recv() routines. We can easily implement them using sendto() and recvfrom(), just like we do inside our very own C library. I wasn't able to really test it, apart from simple compilation testing. I've heard rumours that COMPAT_SVR4 is broken inside execve() anyway. It's still worth to fix this, because I suspect we'll get rid of COMPAT_43 somewhere in the future... Reviewed by: rdivacky Discussed with: jhb	2008-09-15 15:09:35 +00:00
thompsa	dbfcc4871f	Allow PAGE_SHIFT to already be defined. Submitted by: Hans Petter Selasky	2008-09-13 17:34:18 +00:00
rdivacky	74e5140a73	The ERESTART to EINTR conversion is already done in kern_select so there is no need to repeat it in linux_select(). Submitted by: Dmitry Chagin <dchagin@> MFC after: 1 week Approved by: kib (mentor)	2008-09-11 15:28:28 +00:00
rdivacky	817c519713	Getdents requires padding with 2 bytes instead of 1 byte as with getdents64. The last byte is used for storing the d_type, add this to plain getdents case where it was missing before. Also change the code to use strlcpy instead of plain strcpy. This changes fix the getdents crash we had reports about (hl2 server etc.) PR: kern/117010 MFC after: 1 week Submitted by: Dmitry Chagin (dchagin@) Tested by: MITA Yoshio <mita ee.t.u-tokyo.ac jp> Approved by: kib (mentor)	2008-09-09 16:00:17 +00:00
kib	626be4984b	Remove superfluous copyin() of args, structures are already in kernel space. Submitted by: dchagin MFC after: 1 week	2008-09-09 13:01:14 +00:00
attilio	dbf35e279f	Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-08-28 15:23:18 +00:00
julian	64d908d08e	We left out V_static_len from ip_fw2.c (also a whitespace diff that I'd rather fix here than break in the vimage branch.)	2008-08-25 05:38:18 +00:00
julian	18137ef251	All opt_x.h includes go at the top of other includes.	2008-08-25 04:55:29 +00:00
rwatson	70366e8fcc	Regenerate following r182123.	2008-08-24 21:23:08 +00:00
rwatson	6a45d33f33	When MPSAFE ttys were merged, a new BSM audit event identifier was allocated for posix_openpt(2). Unfortunately, that identifier conflicts with other events already allocated to other systems in OpenBSM. Assign a new globally unique identifier and conform better to the AUE_ event naming scheme. This is a stopgap until a new OpenBSM import is done with the correct identifier, so we'll maintain this as a local diff in svn until then. Discussed with: ed Obtained from: TrustedBSD Project	2008-08-24 21:20:35 +00:00
obrien	3b12eba1b0	Add comments on NOARGS, NODEF, and NOPROTO.	2008-08-21 22:57:31 +00:00
ed	4b93c9151b	Update system call tables. The previous commit also included changes to all the system call lists, but it is a tradition to update these lists in a second commit, so rerun make sysent to update the $FreeBSD$ tags inside these files to refer to the latest version of syscalls.master. Requested by: rwatson	2008-08-20 08:39:10 +00:00
ed	cc3116a938	Integrate the new MPSAFE TTY layer to the FreeBSD operating system. The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following: - Improved driver model: The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers. If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver. - Improved hotplugging: With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc). The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly. - Improved performance: One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters. Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING. Obtained from: //depot/projects/mpsafetty/... Approved by: philip (ex-mentor) Discussed: on the lists, at BSDCan, at the DevSummit Sponsored by: Snow B.V., the Netherlands dcons(4) fixed by: kan	2008-08-20 08:31:58 +00:00
bz	1021d43b56	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
ed	6b178cd86d	Add TIOCPKT and TIOCSPTLCK to the Linuxolator. We're very lucky, because the flags used by our TIOCPKT implementation are the same as flags used by Linux. We can safely enable TIOCPKT, assuming EXTPROC is not used. TIOCSPTLCK is used by unlockpt(). Because we don't need unlockpt() in our implementation, make this ioctl a no-op. Approved by: philip (mentor, implicit), rdivacky Obtained from: P4 (//depot/projects/mpsafetty/...)	2008-07-23 17:47:44 +00:00
rdivacky	32f23bcf55	Fix linux_alarm, the linux behaviour is to limit the secs to INT_MAX when the passed in parameter is bigger than INT_MAX. Submitted by: Dmitry Chagin <chagin.dmitry gmail com> Approved by: kib (mentor)	2008-07-23 17:19:02 +00:00
weongyo	0293d4a27f	when NDIS framework try to query/set informations NDIS drivers can return NDIS_STATUS_PENDING. In this case, it's waiting for 5 secs to get the response from drivers now. However, some NDIS drivers can send the response before NDIS framework gets ready to receive it so we might always be blocked for 5 secs in current implementation. NDIS framework should reset the event before calling NDIS driver's callback not after. MFC after: 1 month	2008-07-23 10:49:27 +00:00
brooks	97c7080d1f	style(9): put parentheses around return values.	2008-07-10 19:54:34 +00:00
brooks	4b0dbab536	Regen	2008-07-10 17:46:58 +00:00
brooks	87a2c8d1bf	id_t is a 64-bit integer and thus is passed as two arguments like off_t is. As a result, those arguments must be recombined before calling the real syscall implementation. This change fixes 32-bit compatibility for cpuset_getid(), cpuset_setid(), cpuset_getaffinity(), and cpuset_setaffinity().	2008-07-10 17:45:57 +00:00
rwatson	051819b847	Introduce a new lock, hostname_mtx, and use it to synchronize access to global hostname and domainname variables. Where necessary, copy to or from a stack-local buffer before performing copyin() or copyout(). A few uses, such as in cd9660 and daemon_saver, remain under-synchronized and will require further updates. Correct a bug in which a failed copyin() of domainname would leave domainname potentially corrupted. MFC after: 3 weeks	2008-07-05 13:10:10 +00:00
cokane	974e7b1858	Silence warning about missing IoGetDeviceObjectPointer by implementing a simple stub that always returns STATUS_SUCCESS. Submitted by: Paul B. Mahol <onemda@gmail.com> Reviewed by: thompsa MFC after: 1 week	2008-06-15 13:37:29 +00:00
wkoszek	813c6fb348	Remove obsolete PECOFF image activator support. PRs assigned at the time of removal: kern/80742 Discussed on: freebsd-current (silence), IRC Tested by: make universe Approved by: cognet (mentor)	2008-06-14 12:51:44 +00:00
weongyo	ad0eff64be	fix a page fault that it occurred during ifp is NULL. This bug happens when NDIS driver's initialization is failed and NDIS driver's trying to call NdisWriteErrorLogEntry().	2008-06-11 07:55:07 +00:00
rdivacky	7fba368b69	d_ino member of linux_dirent structure should be unsigned long. Submitted by: Chagin Dmitry <chagin.dmitry@gmail.com> Approved by: kib (mentor)	2008-06-08 11:09:25 +00:00
rdivacky	25a34fb524	Switch to emulating Linux 2.6 on default. Approved by: kib (mentor)	2008-06-03 17:50:13 +00:00
ed	ff609f1187	Push down the major/minor conversion for pts/%u to improve consistency. In the mpsafetty branch, Linux sshd seems to work properly inside a jail. Some small modifications had to be made to the Linux compatibility layer. The Linux PTY routines always expect the device major number to be 136 or higher. Our code always set the major/minor number pair to 136:0. This makes routines like ttyname() and ptsname() fail, because we'll end up having ambiguous device numbers. The conversion was not performed on all *stat() routines, which meant in some cases the numbers didn't get transformed. By pushing the conversion into linux_driver_get_major_minor(), the transformation will take place on all calls. Approved by: philip (mentor), rdivacky	2008-06-02 08:40:06 +00:00
weongyo	1ea90a0dea	Fix a panic that a priority value which is passed to cv_broadcastpri(9) can be < 0. We don't ignore a `increment' argument but at least we keep a priority value of NDIS threads over PRI_MIN_KERN. Reviewed by: thompsa	2008-05-30 06:31:55 +00:00
weongyo	6354067da5	Fix a panic when it occurred during initializing the ndis driver because it try to read network address through ifnet structure which is NULL until the ndis driver's initialization is finished. Reviewed by: thompsa	2008-05-15 04:29:28 +00:00
rdivacky	13cbd9c97e	Implement robust futexes. Most of the code is modelled after what Linux does. This is because robust futexes are mostly userspace thing which we cannot alter. Two syscalls maintain pointer to userspace list and when process exits a routine walks this list waking up processes sleeping on futexes from that list. Reviewed by: kib (mentor) MFC after: 1 month	2008-05-13 20:01:27 +00:00
rdivacky	dd1e82ea4d	Implement linux_truncate64() syscall. Tested by: Aline de Freitas <aline@riseup.net> Approved by: kib (mentor)	2008-04-23 15:56:33 +00:00
rdivacky	2c58e9a054	The vmspace->vm_daddr is constant until freed, there is no need to hold lock while accessing it. Approved by: kib (mentor)	2008-04-21 21:24:08 +00:00
rdivacky	69ec9a439c	Remove using magic value of -1 to distinguish between linux_open() and linux_openat(). Instead just pass AT_FDCWD into linux_common_open() for the linux_open() case. This prevents passing -1 as a dirfd to openat() from succeeding which is wrong. Suggested by: rwatson, kib Approved by: kib (mentor)	2008-04-09 16:42:50 +00:00
kib	eb77b477b4	Implement the linux syscalls openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat, renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat. Submitted by: rdivacky Sponsored by: Google Summer of Code 2007 Tested by: pho	2008-04-08 09:45:49 +00:00
kib	5c017b360f	Regen	2008-03-31 12:12:27 +00:00
kib	7a1c49c4b3	Add the freebsd32 compatibility shims for the *at() syscalls. Reviewed by: rwatson, rdivacky Tested by: pho	2008-03-31 12:08:30 +00:00
kib	eff8c6d35e	Add the support for the AT_FDCWD and fd-relative name lookups to the namei(9). Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho	2008-03-31 12:01:21 +00:00
jb	291b24b755	Remove files that have been repo copied to their new location in cddl-specific parts of the source tree.	2008-03-28 00:08:47 +00:00
dfr	1c5a20ad66	Regen.	2008-03-26 15:24:02 +00:00
dfr	79d2dfdaa6	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
jhb	fce41b3b76	Regen.	2008-03-25 19:35:34 +00:00
jhb	a8ff4f0990	Add entries for the cpuset-related system calls. The existing system calls can be used on little endian systems. Pointy hat to: jeff	2008-03-25 19:34:47 +00:00
ru	e9ab62a9ff	Fix build. Reported by: ache, tinderbox	2008-03-25 13:20:52 +00:00
rdivacky	4a8a8b1c08	o Add stub support for some new futex operations, so the annoying message is not printed. o Don't warn about FUTEX_FD not being implemented and return ENOSYS instead of 0 (eg. success). o Clear FUTEX_PRIVATE_FLAG as we actually implement only private futexes so there is no reason to return ENOSYS when app asks for a private futex. We don't reject shared futexes because they worked just fine with our implementation so far. Approved by: kib (mentor) Tested by: bsam MFC after: 1 week	2008-03-20 17:03:55 +00:00
antoine	a52d65bf2b	Simplify fcntl(SVR4_F_DUP2FD) code now that FreeBSD has F_DUP2FD. Approved by: rwatson (mentor)	2008-03-17 18:27:28 +00:00
rdivacky	b13a84dcb7	Implement sched_setaffinity and get_setaffinity using real cpu affinity setting primitives. Reviewed by: jeff Approved by: kib (mentor)	2008-03-16 16:27:44 +00:00
jeff	25711383c0	- The P_SA flag has been removed. Don't reference it in a KASSERT.	2008-03-12 22:17:06 +00:00
jeff	acb93d599c	Remove kernel support for M:N threading. While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.	2008-03-12 10:12:01 +00:00
kib	86936eba80	Return ENOSYS instead of 0 for the unknown futex operations. Submitted by: rdivacky Reported and tested by: Gary Stanley <gary velocity-servers net>	2008-03-02 14:00:50 +00:00
kib	7ad2fb2ee1	Sanitize arguments to linux_mremap(). Check that only MREMAP_FIXED and MREMAP_MAYMOVE flags are specified. Check for the page alignment of the addr argument. Submitted by: rdivacky MFC after: 1 week	2008-02-22 11:47:56 +00:00
ru	841dab65e0	Regenerate for readlink(2).	2008-02-12 20:11:54 +00:00
ru	56aa644e2a	Change readlink(2)'s return type and type of the last argument to match POSIX. Prodded by: Alexey Lyashkov	2008-02-12 20:09:04 +00:00
phk	df9c99b9c2	Give MEXTADD() another argument to make both void pointers to the free function controllable, instead of passing the KVA of the buffer storage as the first argument. Fix all conventional users of the API to pass the KVA of the buffer as the first argument, to make this a no-op commit. Likely break the only non-conventional user of the API, after informing the relevant committer. Update the mbuf(9) manual page, which was already out of sync on this point. Bump __FreeBSD_version to 800016 as there is no way to tell how many arguments a CPP macro needs any other way. This paves the way for giving sendfile(9) a way to wait for the passed storage to have been accessed before returning. This does not affect the memory layout or size of mbufs. Parental oversight by: sam and rwatson. No MFC is anticipated.	2008-02-01 19:36:27 +00:00
pjd	435a09e625	Change type of kmem_used() and kmem_size() functions to uint64_t, so it doesn't overflow in arc.c in this check: if (kmem_used() > (kmem_size() * 4) / 5) return (1); With this bug ZFS almost doesn't cache. Only 32bit machines are affected that have vm.kmem_size set to values >=1GB. Reported by: David Taylor <davidt@yadt.co.uk>	2008-01-24 11:21:54 +00:00
rwatson	0e6bbfc8e3	Regenerate.	2008-01-20 23:44:24 +00:00
rwatson	ff05f9dd9d	Use audit events AUE_SHMOPEN and AUE_SHMUNLINK with new system calls shm_open() and shm_unlink(). More auditing will need to be done for these calls to capture arguments properly.	2008-01-20 23:43:06 +00:00
attilio	71b7824213	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
attilio	18d0a0dd51	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
jhb	1975c09543	Regen for shm_open(2) and shm_unlink(2).	2008-01-08 22:01:26 +00:00
jhb	8cd9437636	Add a new file descriptor type for IPC shared memory objects and use it to implement shm_open(2) and shm_unlink(2) in the kernel: - Each shared memory file descriptor is associated with a swap-backed vm object which provides the backing store. Each descriptor starts off with a size of zero, but the size can be altered via ftruncate(2). The shared memory file descriptors also support fstat(2). read(2), write(2), ioctl(2), select(2), poll(2), and kevent(2) are not supported on shared memory file descriptors. - shm_open(2) and shm_unlink(2) are now implemented as system calls that manage shared memory file descriptors. The virtual namespace that maps pathnames to shared memory file descriptors is implemented as a hash table where the hash key is generated via the 32-bit Fowler/Noll/Vo hash of the pathname. - As an extension, the constant 'SHM_ANON' may be specified in place of the path argument to shm_open(2). In this case, an unnamed shared memory file descriptor will be created similar to the IPC_PRIVATE key for shmget(2). Note that the shared memory object can still be shared among processes by sharing the file descriptor via fork(2) or sendmsg(2), but it is unnamed. This effectively serves to implement the getmemfd() idea bandied about the lists several times over the years. - The backing store for shared memory file descriptors are garbage collected when they are not referenced by any open file descriptors or the shm_open(2) virtual namespace. Submitted by: dillon, peter (previous versions) Submitted by: rwatson (I based this on his version) Reviewed by: alc (suggested converting getmemfd() to shm_open())	2008-01-08 21:58:16 +00:00
kib	39cc81f40e	After applying LCONVPATH() to the path, do use the converted path instead of original user-mode string in the linux_stat() and linux_lstat() syscalls. Tested by: Peter Holm MFC after: 3 days	2008-01-05 12:36:35 +00:00
jeff	ce18638805	Remove explicit locking of struct file. - Introduce a finit() which is used to initialize the fields of struct file in such a way that the ops vector is only valid after the data, type, and flags are valid. - Protect f_flag and f_count with atomic operations. - Remove the global list of all files and associated accounting. - Rewrite the unp garbage collection such that it no longer requires the global list of all files and instead uses a list of all unp sockets. - Mark sockets in the accept queue so we don't incorrectly gc them. Tested by: kris, pho	2007-12-30 01:42:15 +00:00
kib	66636dffb7	Plug the leaks in the present (hopefully, soon to be replaced) implementation of the linux_openat() for the quick MFC. Reported and tested by: Peter Holm MFC after: 3 days	2007-12-29 14:28:01 +00:00
kib	6dc4f55b55	Apply the LCONVPATH() to the (old) linux_stat() and linux_lstat() syscalls. Without it, code has two problems: - behaviour of the old and new [l]stat are different with regard of the /compat/linux - directly accessing the userspace data from the kernel asks for the panics. Reported and tested by: Peter Holm Reviewed by: rdivacky MFC after: 3 days	2007-12-29 14:25:29 +00:00
rwatson	bdee30611d	Add a new 'why' argument to kdb_enter(), and a set of constants to use for that argument. This will allow DDB to detect the broad category of reason why the debugger has been entered, which it can use for the purposes of deciding which DDB script to run. Assign approximate why values to all current consumers of the kdb_enter() interface.	2007-12-25 17:52:02 +00:00
jhb	95c5027710	Bah, remove last vestiges of some statfs conversion fixes that aren't quite ready for CVS yet that snuck into 1.68. Pointy hat to: jhb	2007-12-10 19:42:23 +00:00
scottl	4570db7ea2	Grrr, remove an unused variable missed in the last commit.	2007-12-08 01:41:31 +00:00
scottl	38b1bc6f6b	Don't expect a return value from statfs_scale_blocks().	2007-12-07 22:32:09 +00:00
jhb	f05fcb702a	Regen.	2007-12-06 23:37:26 +00:00
jhb	d675d97b05	Add freebsd32 compat wrappers for msgctl() and __semctl() using kern_msgctl() and kern_semctl(). MFC after: 1 week	2007-12-06 23:36:57 +00:00
jhb	f4e63ed7ac	Add freebsd32 compat wrappers for msgctl() and _semctl() using kern_msgctl() and kern_semctl(). MFC after: 1 week	2007-12-06 23:35:29 +00:00
jhb	eb9403bc51	Move 32-bit SYSV IPC structure definitions into freebsd32_ipc.h. MFC after: 1 week	2007-12-06 23:23:16 +00:00
jhb	0373a29045	Move several data structure definitions out of freebsd32_misc.c and into freebsd32.h instead. MFC after: 1 week	2007-12-06 23:11:27 +00:00
jkim	7c00aa0ce7	Remove redundant checks for msgsnd(3) and msgrcv(3). COMPAT_IA32 (implicitly) requires SYSVSEM, SYSVSHM and SYSVMSG in kernel. Pointed out by: jhb	2007-12-04 20:25:41 +00:00
thompsa	b56e8f172a	Implement functions required by some ndis drivers. NdisIMCopySendPerPacketInfo [1] KeQuerySystemTime [1] KeTickCount [1] strncat [1] KeBugCheckEx Submitted by: Marcin Simonides [1]	2007-12-03 23:43:58 +00:00
thompsa	200d23553e	Correct the calculation for the number of 100ns intervals since January 1, 1601. The 1601 - 1970 period was in seconds rather than 100ns units. Remove duplication by having NdisGetCurrentSystemTime call ntoskrnl_time.	2007-12-02 08:54:50 +00:00
thompsa	3f699c4d4e	Correct the nwbx_ies field type in struct ndis_wlan_bssid_ex. PR: kern/118369 Submitted by: Weongyo Jeong	2007-12-02 04:04:42 +00:00
peter	8e9baed553	Move the shared cp_time array (counts %sys, %user, %idle etc) to the per-cpu area. cp_time[] goes away and a new function creates a merged cp_time-like array for things like linprocfs, sysctl etc. The atomic ops for updating cp_time[] in statclock go away, and the scope of the thread lock is reduced. sysctl kern.cp_time returns a backwards compatible cp_time[] array. A new kern.cp_times sysctl returns the individual per-cpu stats. I have pending changes to make top and vmstat optionally show per-cpu stats. I'm very aware that there are something like 5 or 6 other versions "out there" for doing this - but none were handy when I needed them. I did merge my changes with John Baldwin's, and ended up replacing a few chunks of my stuff with his, and stealing some other code. Reviewed by: jhb Partly obtained from: jhb	2007-11-29 06:34:30 +00:00
jb	ff51f4effa	Remove some compatibility stuff that we now get from the Solaris header.	2007-11-29 00:15:08 +00:00
jb	4b02e22567	Add more OpenSolaris compatibility headers.	2007-11-28 21:50:40 +00:00
jb	1e16c5b22b	Remove an extern that is defined elsewhere.	2007-11-28 21:50:05 +00:00
jb	38265529fe	Add compatibility cruft moved from under _SOLARIS_C_SOURCE in sys/types.h	2007-11-28 21:49:16 +00:00
jb	ce9a474352	Remove a typedef which was just a hack to avoid including vmem.h. That typedef breaks other Solaris code.	2007-11-28 21:48:25 +00:00
jb	7d547ae260	Add a missing volatile so that the code compiles cleanly.	2007-11-28 21:47:09 +00:00
jb	68f7a0964d	Rename the definition of lbolt to LBOLT to avoid a clash with a global variable in FreeBSD. Until now lbolt in sys/proc.h has been #ifdef'ed out based on _SOLARIS_C_SOURCE, but that is going away now.	2007-11-28 21:44:17 +00:00
kib	42f4fb0d92	Implement LINUX_SIOCGIFCOUNT and LINUX_SIOCGIFINDEX/LINUX_SIOGIFINDEX. LINUX_SIOCGIFCOUNT just returns 0 since it is not implemented in the Linux 2.6.16. LINUX_SIOCGIFINDEX/LINUX_SIOGIFINDEX are mapped to the FreeBSD native SIOCGIFINDEX. Tested by: Peter Kostouros <kpeter@melbpc.org.au> Reviewed by: brooks, rpaulo (on net@) Submitted by: rdivacky MFC after: 1 week	2007-11-07 16:42:52 +00:00
pjd	34e2d48eab	Remove "zfs:" prefix from lock and condvar names and also skip non-letter characters (mostly "&"). Because top(1) shows only first six characters of wait channel, without this change we saw only one meaningful character. Requested by: kris & others MFC after: 1 week	2007-11-05 18:40:55 +00:00
kib	9ae733819b	Fix for the panic("vm_thread_new: kstack allocation failed") and silent NULL pointer dereference in the i386 and sparc64 pmap_pinit() when the kmem_alloc_nofault() failed to allocate address space. Both functions now return error instead of panicking or dereferencing NULL. As consequence, vmspace_exec() and vmspace_unshare() returns the errno int. struct vmspace arg was added to vm_forkproc() to avoid dealing with failed allocation when most of the fork1() job is already done. The kernel stack for the thread is now set up in the thread_alloc(), that itself may return NULL. Also, allocation of the first process thread is performed in the fork1() to properly deal with stack allocation failure. proc_linkup() is separated into proc_linkup() called from fork1(), and proc_linkup0(), that is used to set up the kernel process (was known as swapper). In collaboration with: Peter Holm Reviewed by: jhb	2007-11-05 11:36:16 +00:00
pjd	72109da06e	- Move crfree() outside MNT_ILOCK()/MNT_IUNLOCK() to eliminate a LOR: 1st 0xc4cea568 struct mount mtx (struct mount mtx) @ /usr/src/sys/modules/zfs/../../compat/opensolaris/kern/opensolaris_vfs.c:209 2nd 0xc3ee9010 sleep mtxpool (sleep mtxpool) @ /usr/src/sys/kern/kern_resource.c:1266 - Move crdup() outside MNT_ILOCK()/MNT_IUNLOCK(), as it can sleep. Reported by: Olli Hauer <ohauer@gmx.de> MFC after: 3 days	2007-11-01 08:58:29 +00:00
rwatson	60570a92bf	Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer	2007-10-24 19:04:04 +00:00
julian	51d643caa6	Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. This makes way for us to add REAL kthread_create() and friends that actually make threads. it turns out that most of these calls actually end up being moved back to the thread version when it's added. but we need to make this cosmetic change first. I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.	2007-10-20 23:23:23 +00:00
kevlo	7a9f1e285b	Spelling fix for interupt -> interrupt	2007-10-12 06:03:46 +00:00
jhb	0cf1ea80ad	Allow the ia32 resource limits (compat.ia32.max{dsiz,ssiz,vmem} to be set via loader tunables. They are already tunable via sysctl. MFC after: 1 week Approved by: re (kensmith)	2007-09-24 20:49:39 +00:00
dwmalone	37c880369b	The kernel version of Linux statfs64 is actually supposed to take 3 arguments, but we had forgotten the second argument. Also make the Linux statfs64 struct depend on the architecture because it has an extra 4 bytes padding on amd64 compared to i386. The three argument fix is from David Taylor, the struct statfs64 stuff is my fault. With this patch I can install i386 Linux matlab on an amd64 machine. Submitted by: David Taylor <davidt_at_yadt.co.uk> Approved by: re (kensmith)	2007-09-18 19:50:33 +00:00
jhb	736eaf5ce3	Rework the routines to convert a 5.x+ statfs structure (with fixed-size 64-bit counters) to a 4.x statfs structure (with long-sized counters). - For block counters, we scale up the block size sufficiently large so that the resulting block counts fit into a the long-sized (long for the ABI, so 32-bit in freebsd32) counters. In 4.x the NFS client's statfs VOP did this already. This can lie about the block size to 4.x binaries, but it presents a more accurate picture of the ratios of free and available space. - For non-block counters, fix the freebsd32 stats converter to cap the values at INT32_MAX rather than losing the upper 32-bits to match the behavior of the 4.x statfs conversion routine in vfs_syscalls.c Approved by: re (kensmith)	2007-08-28 20:28:12 +00:00
kib	39e24dc75d	Implement fake linux sched_getaffinity() syscall to enable java to work with Linux 2.6 emulation. This shall be reimplemented once FreeBSD gets native scheduler affinity syscalls. Submitted by: rdivacky Reviewed by: jkim Sponsored by: Google Summer of Code 2007 Approved by: re (kensmith)	2007-08-28 12:26:35 +00:00
pjd	65eefb41d2	Some ZFS threads needs stack larger than the default 8kB, so use 16kB of alternate stack if the default is smaller than 16kB. Approved by: re (rwatson)	2007-08-16 20:33:20 +00:00
davidxu	3fef07aaec	Regenerate. Approved by: re(kensmith)	2007-08-16 05:32:26 +00:00
davidxu	06ae13be4d	Add thr_kill2 compat32 syscall. Submitted by: Tijl Coosemans tijl at ulyssis dot org Approved by: re (kensmith)	2007-08-16 05:30:04 +00:00
rwatson	23574c8673	Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminated quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)	2007-08-06 14:26:03 +00:00
thompsa	c60c38c413	ndis will signal the kthread to exit and then sleep on the proc pointer to be woken up by kthread_exit. This is racy and in some cases the kthread will exit before ndis gets around to sleep so it will be stuck indefinitely. This change reuses the kq_exit variable to indicate that the thread has gone and will loop on tsleep with a timeout waiting for it. If the kthread has already exited then it will not sleep at all. Approved by: re (rwatson)	2007-07-22 20:53:28 +00:00
jhb	ef42e8706b	Fix a couple of issues with the stack limit for 32-bit processes on 64-bit kernels exposed by the recent fixes to resource limits for 32-bit processes on 64-bit kernels: - Let ABIs expose their maximum stack size via a new pointer in sysentvec and use that in preference to maxssiz during exec() rather than always using maxssiz for all processes. - Apply the ABI's limit fixup to the previous stack size when adjusting RLIMIT_STACK to determine if the existing mapping for the stack needs to be grown or shrunk (as well as how much it should be grown or shrunk). Approved by: re (kensmith)	2007-07-12 18:01:31 +00:00
peter	a977642fca	Quiet warnings. I believe gcc is incorrect about these. Approved by: re (rwatson)	2007-07-05 07:38:17 +00:00
peter	6d9e6c677c	Don't add the 'pad' argument to the mmap/truncate/etc syscalls. Submitted by: kensmith Approved by: re (kensmith)	2007-07-04 23:06:43 +00:00
peter	01169f916d	Add compat6 wrapper code for mmap/lseek/pread/pwrite/truncate/ftruncate. Approved by: re (kensmith)	2007-07-04 23:04:41 +00:00
peter	45430aa747	Regenerate after mmap/lseek/etc syscall changes Approved by: re (kensmith)	2007-07-04 23:03:50 +00:00
peter	7acc95de6e	Add i386 emulation wrappers for mmap/lseek/etc. These use COMPAT6, so you must use the already existing, already in generic, COMPAT_FREEBSD6 kernel option for running old 32 bit binaries. Approved by: re (kensmith)	2007-07-04 23:02:40 +00:00
mjacob	b967f0e2c0	Try a cheap way to get around gcc4.2 believing that user arguments to system calls can change across intervening functions.	2007-06-17 04:37:57 +00:00
emaste	11712476cc	Remove stale 'XXX implement' comments for syscalls which have since been implemented.	2007-06-15 21:54:26 +00:00
rwatson	00b02345d4	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
mjacob	f3bb1a3d1a	Quiesce warnings by initializing irql values to zero.	2007-06-10 04:40:13 +00:00
mjacob	96d1e042bf	Ensure that newpath is always initialized, even for the error case.	2007-06-10 04:37:22 +00:00
attilio	12d804e413	rufetch and calcru sometimes should be called atomically together. This patch fixes places where they should be called atomically changing their locking requirements (both assume per-proc spinlock held) and introducing rufetchcalc which wrappers both calls to be performed in atomic way. Reviewed by: jeff Approved by: jeff (mentor)	2007-06-09 21:48:44 +00:00
attilio	c105658c88	The current rusage code shows peculiar problems: - Unsafeness on ruadd() in thread_exit() - Unatomicity of thread_exit() in the exit1() operations This patch addresses these problems allocating p_fd as part of the process and modifying the way it is accessed. A small chunk of this patch, resolves a race about p_state in kern_wait(), since we have to be sure about the zombif-ing process. Submitted by: jeff Approved by: jeff (mentor)	2007-06-09 18:56:11 +00:00
pjd	9eba2904d1	- Reduce number of atomic operations needed to be implemented in asm by implementing some of them using existing ones. - Allow to compile ZFS on all archs and use atomic operations surrounded by global mutex on archs we don't have or can't have all atomic operations needed by ZFS.	2007-06-08 12:35:47 +00:00
jeff	91d1501790	Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling synchronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-05 00:00:57 +00:00
dwmalone	771efb08f5	Despite several examples in the kernel, the third argument of sysctl_handle_int is not sizeof the int type you want to export. The type must always be an int or an unsigned int. Remove the instances where a sizeof(variable) is passed to stop people accidentally cutting and pasting these examples. In a few places sysctl_handle_int was being used on 64 bit types, which would truncate the value to be exported. In these cases use sysctl_handle_quad to export them and change the format to Q so that sysctl(1) can still print them.	2007-06-04 18:25:08 +00:00
pjd	f28297d01f	Reimplement traverse() helper function: 1. Pass locking flags to VFS_ROOT(). 2. Check v_mountedhere while the vnode is locked. 3. Always return locked vnode on success. Change 1 fixes problem reported by Stephen M. Rumble - after zfs_vfsops.c,1.9 change, zfs_root() no longer locks the vnode unconditionally and traverse() didn't pass right lock type to VFS_ROOT(). The result was that kernel panicked when .zfs/ directory was accessed via NFS.	2007-06-04 11:31:46 +00:00
attilio	7dd8ed88a9	Revert VMCNT_* operations introduction. Probably, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)	2007-05-31 22:52:15 +00:00
kib	f13486a222	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
pjd	88c31e8d66	There are too many false positive LORs reported by WITNESS, so when ZFS debug is turned off, initialize locks with NOWITNESS flag. At some point I'll get back to them, we would probably need BLESSING functionality, which is currently turned off by default.	2007-05-26 21:37:14 +00:00
pjd	257a0a9c28	DNLC_NO_VNODE can't be NULL. Reported by: ru	2007-05-24 13:44:45 +00:00
pjd	8b3bf231cc	FreeBSD's namecache works quite well with ZFS, so remove DNLC.	2007-05-23 21:33:02 +00:00
cognet	5b7b926727	Remove duplicate includes. Submitted by: Cyril Nguyen Huu <cyril ci0 org>	2007-05-23 13:36:02 +00:00
kib	cdee790df9	Move futex support code from <arch>/support.s into linux compat directory. Implement all futex atomic operations in assembler to not depend on the fuword() that does not allow to distinguish between -1 and failure return. Correctly return 0 from atomic operations on success. In collaboration with: rdivacky Tested by: Scot Hetzel <swhetzel gmail com>, Milos Vyletel <mvyletel mzm cz> Sponsored by: Google SoC 2007	2007-05-23 08:33:06 +00:00
kan	4c2d706212	Allow FreeBSD's native ELF image activators to execute shared libraries the same way it was enabled for Linux binaries in linuxulator. This allows binaries built with -pie. Many ports auto-detect -fPIE support in GCC 4.2 and build binaries FreeBSD was unable to run.	2007-05-22 02:22:58 +00:00
jeff	bcfa98d019	- Move GDT/LDT locking into a separate spinlock, removing the global scheduler lock from this responsibility. Contributed by: Attilio Rao <attilio@FreeBSD.org> Tested by: jeff, kkenn	2007-05-20 22:03:57 +00:00
jeff	e1996cb960	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>	2007-05-18 07:10:50 +00:00
jhb	b667f507a0	Rework the support for ABIs to override resource limits (used by 32-bit processes under 64-bit kernels). Previously, each 32-bit process overwrote its resource limits at exec() time. The problem with this approach is that the new limits affect all child processes of the 32-bit process, including if the child process forks and execs a 64-bit process. To fix this, don't overwrite the resource limits during exec(). Instead, sv_fixlimits() is now replaced with a different function sv_fixlimit() which asks the ABI to sanitize a single resource limit. We then use this when querying and setting resource limits. Thus, if a 32-bit process sets a limit, then that new limit will be inherited by future children. However, if the 32-bit process doesn't change a limit, then a future 64-bit child will see the "full" 64-bit limit rather than the 32-bit limit. MFC is tentative since it will break the ABI of old linux.ko modules (no other modules are affected). MFC after: 1 week	2007-05-14 22:40:04 +00:00
pjd	de2973eb6e	Share-lock a vnode where possible.	2007-05-02 01:03:10 +00:00
alc	d11e79c795	Eliminate the use of Giant from ia64-specific code in freebsd32_mmap().	2007-05-01 17:10:01 +00:00
alc	9c008bb303	Synchronize vm map and object accesses. Approved by: des@	2007-05-01 03:09:57 +00:00
pjd	2063f01374	MFp4: Reduce diff against vendor code: - Move FreeBSD-specific code to zfs_freebsd_*() functions in zfs_vnops.c and keep original functions as similar to vendor's code as possible. - Add various includes back, now that we have them.	2007-04-23 00:52:07 +00:00
des	c494d6613e	Now that we're MPSAFE, tell namei() to acquire Giant if necessary.	2007-04-22 08:41:52 +00:00
pjd	24d4489802	MFp4: @118370 Correct typo. @118371 Integrate changes from vendor. @118491 Show backtrace on unexpected code paths. @118494 Integrate changes from vendor. @118504 Fix sendfile(2). I had two ways of fixing it: 1. Fixing sendfile(2) itself to use VOP_GETPAGES() instead of hacking around with vn_rdwr(UIO_NOCOPY), which was suggested by ups. 2. Modify ZFS behaviour to handle this special case. Although 1 is more correct, I've chosen 2, because the hack from 1 has a side-effect of being faster - it reads ahead MAXBSIZE bytes instead of reading page by page. This is not easy to implement with VOP_GETPAGES(), at least not for me in this very moment. Reported by: Andrey V. Elsukov <bu7cher@yandex.ru> @118525 Reorganize the code to reduce diff. @118526 This code path is expected. It is simply when file is opened with O_FSYNC flag. Reported by: kris Reported by: Michal Suszko <dry@dry.pl>	2007-04-21 12:02:57 +00:00
pjd	65e2222ba4	MFp4: Fix automatic snapshot mount when unprivileged user does lookup on a snapshot directory: - Remove PRIV_VFS_MOUNT check - regular users can mount snapshots via lookups on snapshot directory. - Reset mount credential to kcred, so user won't be able to unmount the snapshot. - Reset owner uid. - Unlock vnode in case of a failure. Reported by: simokawa	2007-04-18 15:24:48 +00:00
pjd	8f71c77931	- Fix a leftover - vfs_mount_alloc() is now exported properly. This fixes strange panics when listing .zfs/snapshot/ directory for me. Reported by: simokawa Reported by: Johan Hendriks <Johan@double-l.nl> - Hide cache_purge() under FREEBSD_NAMECACHE like in other files. - Protect mnt_flag with mount interlock.	2007-04-17 21:16:34 +00:00
des	4d29cf6f60	Whitespace cleanup.	2007-04-15 17:02:03 +00:00
rwatson	c4b43c46c9	Some Linux applications (ping) pass a non-NULL msg_control argument to sendmsg() while using a 0-length msg_controllen. This isn't allowed in the FreeBSD system call ABI, so detect this case and set msg_control to NULL. This allows Linux ping to work. Submitted by: rdivacky	2007-04-14 10:35:09 +00:00
wkoszek	e97a378b02	strchr() and strrchr() are already present in the kernel, but with less popular names. Hence: - comment current index() and rindex() functions, as these serve the same functionality as, respectively, strchr() and strrchr() from userland; - add inlined version of strchr() and strrchr(), as we tend to use them more often; - remove str[r]chr() definitions from ZFS code; Reviewed by: pjd Approved by: cognet (mentor)	2007-04-10 21:42:12 +00:00
scottl	60bd60f09f	Whitespace fixes	2007-04-10 21:37:37 +00:00
pjd	648f58f532	Try to stabilize ZFS with regard to memory consumption: - Allow to shrink ARC down to 16MB (instead of 64MB). - Set arc_max to 1/2 of kmem_map by default. - Start freeing things earlier when low memory situation is detected. - Serialize execution of arc_lowmem(). I decided to setup minimum ZFS memory requirements to 512MB of RAM and 256MB of kmem_map size. If there is less RAM or kmem_map, a warning will be printed. World is cruel, be no better. In other words: modern file system requires modern hardware:) From ZFS administration guide: "Currently the minimum amount of memory recommended to install a Solaris system is 512 Mbytes. However, for good ZFS performance, at least one Gbyte or more of memory is recommended."	2007-04-10 02:35:57 +00:00
pjd	a26ee9422b	Instead of detecting if lock is already initialized based on standard 1 bit check, use more accurate 13 bits check. We had too many false-positives with the standard check. Reported by: mlaier	2007-04-09 01:05:31 +00:00
pjd	e47fb9eabd	Extend kobj compatibility KPI to support operating on files before and after the root file system is mounted. This is one of the changes that will allow to put root file system on ZFS.	2007-04-08 23:57:08 +00:00
pjd	2b260dcd5e	MFp4: Synchronize with recent OpenSolaris changes.	2007-04-08 16:29:25 +00:00
scottl	1320f9f144	Add the CAM 'SG' peripheral device. This device implements a subset of the Linux SCSI SG passthrough device API. The intention is to allow for both running of Linux apps that want to talk to /dev/sg* nodes, and to facilitate porting of apps from Linux to FreeBSD. As such, both native and linuxolator entry points and definitions are provided. Caveats: - This does not support the procfs and sysfs nodes that the Linux SG driver provides. Some Linux apps may rely on these for operation, others may only use them for informational purposes. - More ioctls need to be implemented. - Linux uses a naming scheme of "sg[a-z]" for devices, while FreeBSD uses a scheme of "sg[0-9]". Devfs aliases (symlinks) are automatically created to link the two together. However, tools like camcontrol only see the native names. - Some operations were originally designed to return byte counts or other data directly as the syscall return value. The linuxolator doesn't appear to support this well, so this driver just punts for these cases. Now that the driver is in place, others are welcome to add missing functionality. Thanks to Roman Divacky for pushing this work along.	2007-04-07 19:40:58 +00:00
jkim	bbfc500036	Fix kernel module dependency. linprocfs depends on sysvmsg and sysvsem. Submitted by: nork	2007-04-06 18:15:56 +00:00
pjd	6be01b7ba0	We have strcasecmp() in libkern now.	2007-04-06 11:18:57 +00:00
pjd	3b005d3302	Please welcome ZFS - The last word in file systems. ZFS file system was ported from OpenSolaris operating system. The code in under CDDL license. I'd like to thank all SUN developers that created this great piece of software. Supported by: Wheel LTD (http://www.wheel.pl/) Supported by: The FreeBSD Foundation (http://www.freebsdfoundation.org/) Supported by: Sentex (http://www.sentex.net/)	2007-04-06 01:09:06 +00:00
rwatson	765a83fd79	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00

2389 Commits