freebsd-dev

Author	SHA1	Message	Date
Pawel Jakub Dawidek	1dc31587bf	Regen after r247602.	2013-03-02 00:55:09 +00:00
Pawel Jakub Dawidek	2609222ab4	Merge Capsicum overhaul: - Capability is no longer separate descriptor type. Now every descriptor has set of its own capability rights. - The cap_new(2) system call is left, but it is no longer documented and should not be used in new code. - The new syscall cap_rights_limit(2) should be used instead of cap_new(2), which limits capability rights of the given descriptor without creating a new one. - The cap_getrights(2) syscall is renamed to cap_rights_get(2). - If CAP_IOCTL capability right is present we can further reduce allowed ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed ioctls can be retrived with cap_ioctls_get(2) syscall. - If CAP_FCNTL capability right is present we can further reduce fcntls that can be used with the new cap_fcntls_limit(2) syscall and retrive them with cap_fcntls_get(2). - To support ioctl and fcntl white-listing the filedesc structure was heavly modified. - The audit subsystem, kdump and procstat tools were updated to recognize new syscalls. - Capability rights were revised and eventhough I tried hard to provide backward API and ABI compatibility there are some incompatible changes that are described in detail below: CAP_CREATE old behaviour: - Allow for openat(2)+O_CREAT. - Allow for linkat(2). - Allow for symlinkat(2). CAP_CREATE new behaviour: - Allow for openat(2)+O_CREAT. Added CAP_LINKAT: - Allow for linkat(2). ABI: Reuses CAP_RMDIR bit. - Allow to be target for renameat(2). Added CAP_SYMLINKAT: - Allow for symlinkat(2). Removed CAP_DELETE. Old behaviour: - Allow for unlinkat(2) when removing non-directory object. - Allow to be source for renameat(2). Removed CAP_RMDIR. Old behaviour: - Allow for unlinkat(2) when removing directory. Added CAP_RENAMEAT: - Required for source directory for the renameat(2) syscall. Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR): - Allow for unlinkat(2) on any object. - Required if target of renameat(2) exists and will be removed by this call. Removed CAP_MAPEXEC. CAP_MMAP old behaviour: - Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and PROT_WRITE. CAP_MMAP new behaviour: - Allow for mmap(2)+PROT_NONE. Added CAP_MMAP_R: - Allow for mmap(PROT_READ). Added CAP_MMAP_W: - Allow for mmap(PROT_WRITE). Added CAP_MMAP_X: - Allow for mmap(PROT_EXEC). Added CAP_MMAP_RW: - Allow for mmap(PROT_READ \| PROT_WRITE). Added CAP_MMAP_RX: - Allow for mmap(PROT_READ \| PROT_EXEC). Added CAP_MMAP_WX: - Allow for mmap(PROT_WRITE \| PROT_EXEC). Added CAP_MMAP_RWX: - Allow for mmap(PROT_READ \| PROT_WRITE \| PROT_EXEC). Renamed CAP_MKDIR to CAP_MKDIRAT. Renamed CAP_MKFIFO to CAP_MKFIFOAT. Renamed CAP_MKNODE to CAP_MKNODEAT. CAP_READ old behaviour: - Allow pread(2). - Disallow read(2), readv(2) (if there is no CAP_SEEK). CAP_READ new behaviour: - Allow read(2), readv(2). - Disallow pread(2) (CAP_SEEK was also required). CAP_WRITE old behaviour: - Allow pwrite(2). - Disallow write(2), writev(2) (if there is no CAP_SEEK). CAP_WRITE new behaviour: - Allow write(2), writev(2). - Disallow pwrite(2) (CAP_SEEK was also required). Added convinient defines: #define CAP_PREAD (CAP_SEEK \| CAP_READ) #define CAP_PWRITE (CAP_SEEK \| CAP_WRITE) #define CAP_MMAP_R (CAP_MMAP \| CAP_SEEK \| CAP_READ) #define CAP_MMAP_W (CAP_MMAP \| CAP_SEEK \| CAP_WRITE) #define CAP_MMAP_X (CAP_MMAP \| CAP_SEEK \| 0x0000000000000008ULL) #define CAP_MMAP_RW (CAP_MMAP_R \| CAP_MMAP_W) #define CAP_MMAP_RX (CAP_MMAP_R \| CAP_MMAP_X) #define CAP_MMAP_WX (CAP_MMAP_W \| CAP_MMAP_X) #define CAP_MMAP_RWX (CAP_MMAP_R \| CAP_MMAP_W \| CAP_MMAP_X) #define CAP_RECV CAP_READ #define CAP_SEND CAP_WRITE #define CAP_SOCK_CLIENT \ (CAP_CONNECT \| CAP_GETPEERNAME \| CAP_GETSOCKNAME \| CAP_GETSOCKOPT \| \ CAP_PEELOFF \| CAP_RECV \| CAP_SEND \| CAP_SETSOCKOPT \| CAP_SHUTDOWN) #define CAP_SOCK_SERVER \ (CAP_ACCEPT \| CAP_BIND \| CAP_GETPEERNAME \| CAP_GETSOCKNAME \| \ CAP_GETSOCKOPT \| CAP_LISTEN \| CAP_PEELOFF \| CAP_RECV \| CAP_SEND \| \ CAP_SETSOCKOPT \| CAP_SHUTDOWN) Added defines for backward API compatibility: #define CAP_MAPEXEC CAP_MMAP_X #define CAP_DELETE CAP_UNLINKAT #define CAP_MKDIR CAP_MKDIRAT #define CAP_RMDIR CAP_UNLINKAT #define CAP_MKFIFO CAP_MKFIFOAT #define CAP_MKNOD CAP_MKNODAT #define CAP_SOCK_ALL (CAP_SOCK_CLIENT \| CAP_SOCK_SERVER) Sponsored by: The FreeBSD Foundation Reviewed by: Christoph Mallon <christoph.mallon@gmx.de> Many aspects discussed with: rwatson, benl, jonathan ABI compatibility discussed with: kib	2013-03-02 00:53:12 +00:00
Xin LI	69136e792a	Fix wrong assignment. Submitted by: Sascha Wildner <saw online de> Obtained from: DragonFly rev 9568dd07a22a136e380e6c19a8ea188eb92976d5 MFC after: 2 weeks	2013-03-01 23:21:18 +00:00
John Baldwin	d825ce0a5d	Reduce duplication between i386/linux/linux.h and amd64/linux32/linux.h by moving bits that are MI out into headers in compat/linux. Reviewed by: Chagin Dmitry dmitry \| gmail MFC after: 2 weeks	2013-01-29 18:41:30 +00:00
Dmitry Chagin	4d04cf1d9e	Arithmetic on pointers takes into account the size of the type. Properly cast the pointer to avoid incorrect pointer scaling. MFC after: 1 Week	2013-01-25 14:40:54 +00:00
John Baldwin	fb709557a3	Don't assume that all Linux TCP-level socket options are identical to FreeBSD TCP-level socket options (only the first two are). Instead, using a mapping function and fail unsupported options as we do for other socket option levels. MFC after: 2 weeks	2013-01-23 21:44:48 +00:00
Gleb Smirnoff	eb1b1807af	Mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys. Exceptions: - sys/contrib not touched - sys/mbuf.h edited manually	2012-12-05 08:04:20 +00:00
Colin Percival	43f13bea35	MFS security patches which seem to have accidentally not reached HEAD: Fix insufficient message length validation for EAP-TLS messages. Fix Linux compatibility layer input validation error. Security: FreeBSD-SA-12:07.hostapd Security: FreeBSD-SA-12:08.linux Security: CVE-2012-4445, CVE-2012-4576 With hat: so@	2012-11-23 01:48:31 +00:00
Konstantin Belousov	a2a8559624	Style fixes for r242958. Reported and reviewed by: bde MFC after: 28 days	2012-11-16 06:22:14 +00:00
Konstantin Belousov	552e993580	Regen	2012-11-13 12:53:41 +00:00
Konstantin Belousov	f13b5a0f01	Add the wait6(2) system call. It takes POSIX waitid()-like process designator to select a process which is waited for. The system call optionally returns siginfo_t which would be otherwise provided to SIGCHLD handler, as well as extended structure accounting for child and cumulative grandchild resource usage. Allow to get the current rusage information for non-exited processes as well, similar to Solaris. The explicit WEXITED flag is required to wait for exited processes, allowing for more fine-grained control of the events the waiter is interested in. Fix the handling of siginfo for WNOWAIT option for all wait*(2) family, by not removing the queued signal state. PR: standards/170346 Submitted by: "Jukka A. Ukkonen" <jau@iki.fi> MFC after: 1 month	2012-11-13 12:52:31 +00:00
Konstantin Belousov	140dedb81c	The r241025 fixed the case when a binary, executed from nullfs mount, was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already has file descriptors opened for write, but it can be executed from the nullfs overlay. Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks. Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Caling the VOPs instead of directly accessing v_writecount provide the fix described in the previous paragraph. Tested by: pho MFC after: 3 weeks	2012-11-02 13:56:36 +00:00
Konstantin Belousov	5050aa86cf	Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho	2012-10-22 17:50:54 +00:00
Kevin Lo	9823d52705	Revert previous commit... Pointyhat to: kevlo (myself)	2012-10-10 08:36:38 +00:00
Kevin Lo	a10cee30c9	Prefer NULL over 0 for pointers	2012-10-09 08:27:40 +00:00
Konstantin Belousov	877d24ac8a	Fix the mis-handling of the VV_TEXT on the nullfs vnodes. If you have a binary on a filesystem which is also mounted over by nullfs, you could execute the binary from the lower filesystem, or from the nullfs mount. When executed from lower filesystem, the lower vnode gets VV_TEXT flag set, and the file cannot be modified while the binary is active. But, if executed as the nullfs alias, only the nullfs vnode gets VV_TEXT set, and you still can open the lower vnode for write. Add a set of VOPs for the VV_TEXT query, set and clear operations, which are correctly bypassed to lower vnode. Tested by: pho (previous version) MFC after: 2 weeks	2012-09-28 11:25:02 +00:00
Kevin Lo	457a9cfbc1	Remove redundant check	2012-09-12 10:12:03 +00:00
John Baldwin	a89828a2b0	Remove some more NetBSD compat shims and other unused bits from these drivers: - Remove scsi_low_pisa.*, they were unused. - Remove <compat/netbsd/physio_proc.h> and calls to the stubs in that header. They were empty nops. - Retire sl_xname and use device_get_nameunit() and device_printf() with the underlying device_t instead. - Remove unused {ct,ncv,nsp,stg}print() functions. - Remove empty SOFT_INTR_REQUIRED() macro and the unused sl_irq member.	2012-09-10 18:49:49 +00:00
David Xu	e31eb35c3f	regen.	2012-08-17 02:47:16 +00:00
David Xu	d65f1abca7	Implement syscall clock_getcpuclockid2, so we can get a clock id for process, thread or others we want to support. Use the syscall to implement POSIX API clock_getcpuclock and pthread_getcpuclockid. PR: 168417	2012-08-17 02:26:31 +00:00
Konstantin Belousov	f9ca52f21d	Regenerate.	2012-08-15 15:18:20 +00:00
Konstantin Belousov	1a5fe89842	Provide 32bit compat for truncate(2) and ftruncate(2). MFC after: 1 week	2012-08-15 15:17:56 +00:00
Konstantin Belousov	a8d4becf02	Regenerate.	2012-08-14 12:09:36 +00:00
Konstantin Belousov	f90fabce5a	Implement the old mmap syscall for compat32, when COMPAT_43 option is enabled. The syscall is used by FreeBSD 1.1.5.1 dynamic linker. MFC after: 1 week	2012-08-14 12:09:09 +00:00
Konstantin Belousov	481af8b933	Cosmetics: define FREEBSD32_MINUSER and AOUT32_MINUSER for struct sysentvec .sv_minuser. Also improve style. Submitted by: Oliver Pinter <oliver.pinter@gmail.com> MFC after: 1 week	2012-07-22 13:41:45 +00:00
Konstantin Belousov	c5c1199c83	Extend the KPI to lock and unlock f_offset member of struct file. It now fully encapsulates all accesses to f_offset, and extends f_offset locking to other consumers that need it, in particular, to lseek() and variants of getdirentries(). Ensure that on 32bit architectures f_offset, which is 64bit quantity, always read and written under the mtxpool protection. This fixes apparently easy to trigger race when parallel lseek()s or lseek() and read/write could destroy file offset. The already broken ABI emulations, including iBCS and SysV, are not converted (yet). Tested by: pho No objections from: jhb MFC after: 3 weeks	2012-07-02 21:01:03 +00:00
Kevin Lo	544c5e5b53	Make sure that each va_start has one and only one matching va_end, especially in error cases.	2012-05-29 01:48:06 +00:00
Konstantin Belousov	371778a333	Fix ki_cow for compat32 binaries. MFC after: 3 days	2012-05-27 05:24:53 +00:00
Ed Schouten	4412ad4887	Regenerate system call tables.	2012-05-25 21:52:57 +00:00
Ed Schouten	520b6a84f6	Remove use of non-ISO-C integer types from system call tables. These files already use ISO-C-style integer types, so make them less inconsistent by preferring the standard types.	2012-05-25 21:50:48 +00:00
Gleb Kurtsou	76dcec5d09	Add kern_fhstat(), adjust sys_fhstat() to use it. Extend kern_getdirentries() to accept uio segflag and optionally return buffer residue. Sponsored by: Google Summer of Code 2011	2012-05-24 08:00:26 +00:00
Alexander Leidinger	19e252baeb	- >500 static DTrace probes for the linuxulator - DTrace scripts to check for errors, performance, ... they serve mostly as examples of what you can do with the static probe;s with moderate load the scripts may be overwhelmed, excessive lock-tracing may influence program behavior (see the last design decission) Design decissions: - use "linuxulator" as the provider for the native bitsize; add the bitsize for the non-native emulation (e.g. "linuxuator32" on amd64) - Add probes only for locks which are acquired in one function and released in another function. Locks which are aquired and released in the same function should be easy to pair in the code, inter-function locking is more easy to verify in DTrace. - Probes for locks should be fired after locking and before releasing to prevent races (to provide data/function stability in DTrace, see the man-page of "dtrace -v ..." and the corresponding DTrace docs).	2012-05-05 19:42:38 +00:00
Jung-uk Kim	d69a426fce	- Implement pipe2 syscall for Linuxulator. This syscall appeared in 2.6.27 but GNU libc used it without checking its kernel version, e. g., Fedora 10. - Move pipe(2) implementation for Linuxulator from MD files to MI file, sys/compat/linux/linux_file.c. There is no MD code for this syscall at all. - Correct an argument type for pipe() from l_ulong * to l_int *. Probably this was the source of MI/MD confusion. Reviewed by: emulation	2012-04-16 21:22:02 +00:00
Tijl Coosemans	a5e3a5a56f	Remove some unnecessary includes.	2012-03-18 19:15:11 +00:00
Tijl Coosemans	6e310b206f	Eliminate ia32_reg.h by moving its contents to x86 and ia64 reg.h. Reviewed by: kib	2012-03-18 19:12:11 +00:00
Tijl Coosemans	01cd19680d	Copy i386 reg.h to x86 and merge with amd64 reg.h. Replace i386/amd64/pc98 reg.h with stubs. The tREGISTER macros are only made visible on i386. These macros are deprecated and should not be available on amd64. The i386 and amd64 versions of struct reg have been renamed to struct __reg32 and struct __reg64. During compilation either __reg32 or __reg64 is defined as reg depending on the machine architecture. On amd64 the i386 struct is also available as struct reg32 which is used in COMPAT_FREEBSD32 code. Most of compat/ia32/ia32_reg.h is now IA64 only. Reviewed by: kib (previous version)	2012-03-18 19:06:38 +00:00
Tijl Coosemans	786645078b	Move userland bits of i386 npx.h and amd64 fpu.h to x86 fpu.h. Remove FPU types from compat/ia32/ia32_reg.h that are no longer needed. Create machine/npx.h on amd64 to allow compiling i386 code that uses this header. The original npx.h and fpu.h define struct envxmm differently. Both definitions have been included in the new x86 header as struct __envxmm32 and struct __envxmm64. During compilation either __envxmm32 or __envxmm64 is defined as envxmm depending on machine architecture. On amd64 the i386 struct is also available as struct envxmm32. Reviewed by: kib	2012-03-16 20:24:30 +00:00
Rebecca Cran	03225fac13	Fix race condition in KfRaiseIrql(). After getting the current irql, if the kthread gets preempted and subsequently runs on a different CPU, the saved irql could be wrong. Also, correct the panic string. PR: kern/165630 Submitted by: Vladislav Movchan <vladislav.movchan at gmail.com>	2012-03-04 17:08:43 +00:00
Juli Mallett	a6d20bbaa2	On MIPS, _ALIGN always aligns to 8 bytes, even for 32-bit binaries. This might not be ideal, but is the ABI we've shipped so far. Fix macros which reflect the results of _ALIGN on 32-bit MIPS to use the right alignment. This fixes sendmsg under COMPAT_FREEBSD32 on n64 MIPS kernels.	2012-03-03 21:39:12 +00:00
Juli Mallett	9624d94701	o) Add COMPAT_FREEBSD32 support for MIPS kernels using the n64 ABI with userlands using the o32 ABI. This mostly follows nwhitehorn's lead in implementing COMPAT_FREEBSD32 on powerpc64. o) Add a new type to the freebsd32 compat layer, time32_t, which is time_t in the 32-bit ABI being used. Since the MIPS port is relatively-new, even the 32-bit ABIs use a 64-bit time_t. o) Because time{spec,val}32 has the same size and layout as time{spec,val} on MIPS with 32-bit compatibility, then, disable some code which assumes otherwise wrongly when built for MIPS. A more general macro to check in this case would seem like a good idea eventually. If someone adds support for using n32 userland with n64 kernels on MIPS, then they will have to add a variety of flags related to each piece of the ABI that can vary. That's probably the right time to generalize further. o) Add MIPS to the list of architectures which use PAD64_REQUIRED in the freebsd32 compat code. Probably this should be generalized at some point. Reviewed by: gonzo	2012-03-03 08:19:18 +00:00
Martin Matuska	41c0675e6e	Add procfs to jail-mountable filesystems. Reviewed by: jamie MFC after: 1 week	2012-02-29 00:30:18 +00:00
Konstantin Belousov	526d0bd547	Fix found places where uio_resid is truncated to int. Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from the usermode. Discussed with: bde, das (previous versions) MFC after: 1 month	2012-02-21 01:05:12 +00:00
Konstantin Belousov	3494f31ad2	Fix misuse of the kernel map in miscellaneous image activators. Vnode-backed mappings cannot be put into the kernel map, since it is a system map. Use exec_map for transient mappings, and remove the mappings with kmem_free_wakeup() to notify the waiters on available map space. Do not map the whole executable into KVA at all to copy it out into usermode. Directly use vn_rdwr() for the case of not page aligned binary. There is one place left where the potentially unbounded amount of data is mapped into exec_map, namely, in the COFF image activator enumeration of the needed shared libraries. Reviewed by: alc MFC after: 2 weeks	2012-02-17 23:47:16 +00:00
Ed Schouten	7870adb640	Remove direct access to si_name. Code should just use the devtoname() function to obtain the name of a character device. Also add const keywords to pieces of code that need it to build properly. MFC after: 2 weeks	2012-02-10 12:35:57 +00:00
David Xu	d56e058a79	Add 32-bit compat code for AIO kevent flags introduced in revision 230857.	2012-02-05 04:49:31 +00:00
Konstantin Belousov	8c6f8f3d5b	Add support for the extended FPU states on amd64, both for native 64bit and 32bit ABIs. As a side-effect, it enables AVX on capable CPUs. In particular: - Query the CPU support for XSAVE, list of the supported extensions and the required size of FPU save area. The hw.use_xsave tunable is provided for disabling XSAVE, and hw.xsave_mask may be used to select the enabled extensions. - Remove the FPU save area from PCB and dynamically allocate the (run-time sized) user save area on the top of the kernel stack, right above the PCB. Reorganize the thread0 PCB initialization to postpone it after BSP is queried for save area size. - The dumppcb, stoppcbs and susppcbs now do not carry the FPU state as well. FPU state is only useful for suspend, where it is saved in dynamically allocated suspfpusave area. - Use XSAVE and XRSTOR to save/restore FPU state, if supported and enabled. - Define new mcontext_t flag _MC_HASFPXSTATE, indicating that mcontext_t has a valid pointer to out-of-struct extended FPU state. Signal handlers are supplied with stack-allocated fpu state. The sigreturn(2) and setcontext(2) syscall honour the flag, allowing the signal handlers to inspect and manipilate extended state in the interrupted context. - The getcontext(2) never returns extended state, since there is no place in the fixed-sized mcontext_t to place variable-sized save area. And, since mcontext_t is embedded into ucontext_t, makes it impossible to fix in a reasonable way. Instead of extending getcontext(2) syscall, provide a sysarch(2) facility to query extended FPU state. - Add ptrace(2) support for getting and setting extended state; while there, implement missed PT_I386_{GET,SET}XMMREGS for 32bit binaries. - Change fpu_kern KPI to not expose struct fpu_kern_ctx layout to consumers, making it opaque. Internally, struct fpu_kern_ctx now contains a space for the extended state. Convert in-kernel consumers of fpu_kern KPI both on i386 and amd64. First version of the support for AVX was submitted by Tim Bird <tim.bird am sony com> on behalf of Sony. This version was written from scratch. Tested by: pho (previous version), Yamagi Burmeister <lists yamagi org> MFC after: 1 month	2012-01-21 17:45:27 +00:00
Kirk McKusick	cc672d3599	Make sure all intermediate variables holding mount flags (mnt_flag) and that all internal kernel calls passing mount flags are declared as uint64_t so that flags in the top 32-bits are not lost. MFC after: 2 weeks	2012-01-17 01:08:01 +00:00
Mikolaj Golub	fe7f89b71a	Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want to read strings completely to know the actual size. As a side effect it fixes the issue with kern.proc.args and kern.proc.env sysctls, which didn't return the size of available data when calling sysctl(3) with the NULL argument for oldp. Note, in get_ps_strings(), which does actual work for proc_getargv() and proc_getenvv(), we still have a safety limit on the size of data read in case of a corrupted procces stack. Suggested by: kib MFC after: 3 days	2012-01-15 18:47:24 +00:00
Ulrich Spörlein	9a14aa017b	Convert files to UTF-8	2012-01-15 13:23:18 +00:00
Dimitry Andric	69ee3e2f52	In sys/compat/linux/linux_ioctl.c, work around a warning when a pointer is compared to an integer, by casting the pointer to l_uintptr_t. No functional difference on both i386 and amd64. Reviewed by: ed, jhb MFC after: 1 week	2012-01-03 18:49:39 +00:00

1 2 3 4 5 ...

1985 Commits