Commit Graph

1979 Commits

Author SHA1 Message Date
glebius
8e20fa5ae9 Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually
2012-12-05 08:04:20 +00:00
cperciva
748c98fc62 MFS security patches which seem to have accidentally not reached HEAD:
Fix insufficient message length validation for EAP-TLS messages.

Fix Linux compatibility layer input validation error.

Security:	FreeBSD-SA-12:07.hostapd
Security:	FreeBSD-SA-12:08.linux
Security:	CVE-2012-4445, CVE-2012-4576
With hat:	so@
2012-11-23 01:48:31 +00:00
kib
de90907af2 Style fixes for r242958.
Reported and reviewed by:	bde
MFC after:	28 days
2012-11-16 06:22:14 +00:00
kib
63c9e066e5 Regen 2012-11-13 12:53:41 +00:00
kib
1409e8df20 Add the wait6(2) system call. It takes POSIX waitid()-like process
designator to select a process which is waited for. The system call
optionally returns siginfo_t which would be otherwise provided to
SIGCHLD handler, as well as extended structure accounting for child
and cumulative grandchild resource usage.

Allow to get the current rusage information for non-exited processes
as well, similar to Solaris.

The explicit WEXITED flag is required to wait for exited processes,
allowing for more fine-grained control of the events the waiter is
interested in.

Fix the handling of siginfo for WNOWAIT option for all wait*(2)
family, by not removing the queued signal state.

PR:	standards/170346
Submitted by:	"Jukka A. Ukkonen" <jau@iki.fi>
MFC after:	1 month
2012-11-13 12:52:31 +00:00
kib
f16ea99007 The r241025 fixed the case when a binary, executed from nullfs mount,
was still possible to open for write from the lower filesystem.  There
is a symmetric situation where the binary could already has file
descriptors opened for write, but it can be executed from the nullfs
overlay.

Handle the issue by passing one v_writecount reference to the lower
vnode if nullfs vnode has non-zero v_writecount.  Note that only one
write reference can be donated, since nullfs only keeps one use
reference on the lower vnode.  Always use the lower vnode v_writecount
for the checks.

Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is
currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT
to manipulate the v_writecount value, which manages a single bypass
reference to the lower vnode.  Caling the VOPs instead of directly
accessing v_writecount provide the fix described in the previous
paragraph.

Tested by:	pho
MFC after:	3 weeks
2012-11-02 13:56:36 +00:00
kib
560aa751e0 Remove the support for using non-mpsafe filesystem modules.
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.

The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.

Conducted and reviewed by:	attilio
Tested by:	pho
2012-10-22 17:50:54 +00:00
kevlo
ceb08698f2 Revert previous commit...
Pointyhat to:	kevlo (myself)
2012-10-10 08:36:38 +00:00
kevlo
8747a46991 Prefer NULL over 0 for pointers 2012-10-09 08:27:40 +00:00
kib
8f845e475e Fix the mis-handling of the VV_TEXT on the nullfs vnodes.
If you have a binary on a filesystem which is also mounted over by
nullfs, you could execute the binary from the lower filesystem, or
from the nullfs mount. When executed from lower filesystem, the lower
vnode gets VV_TEXT flag set, and the file cannot be modified while the
binary is active. But, if executed as the nullfs alias, only the
nullfs vnode gets VV_TEXT set, and you still can open the lower vnode
for write.

Add a set of VOPs for the VV_TEXT query, set and clear operations,
which are correctly bypassed to lower vnode.

Tested by:	pho (previous version)
MFC after:	2 weeks
2012-09-28 11:25:02 +00:00
kevlo
ce4326d001 Remove redundant check 2012-09-12 10:12:03 +00:00
jhb
e8b429e1c0 Remove some more NetBSD compat shims and other unused bits from these
drivers:
- Remove scsi_low_pisa.*, they were unused.
- Remove <compat/netbsd/physio_proc.h> and calls to the stubs in that
  header.  They were empty nops.
- Retire sl_xname and use device_get_nameunit() and device_printf() with
  the underlying device_t instead.
- Remove unused {ct,ncv,nsp,stg}print() functions.
- Remove empty SOFT_INTR_REQUIRED() macro and the unused sl_irq member.
2012-09-10 18:49:49 +00:00
davidxu
b788233f5b regen. 2012-08-17 02:47:16 +00:00
davidxu
3f0806aa1f Implement syscall clock_getcpuclockid2, so we can get a clock id
for process, thread or others we want to support.
Use the syscall to implement POSIX API clock_getcpuclock and
pthread_getcpuclockid.

PR:	168417
2012-08-17 02:26:31 +00:00
kib
9f82bec8aa Regenerate. 2012-08-15 15:18:20 +00:00
kib
b2d9fb2c07 Provide 32bit compat for truncate(2) and ftruncate(2).
MFC after:	1 week
2012-08-15 15:17:56 +00:00
kib
876d70b046 Regenerate. 2012-08-14 12:09:36 +00:00
kib
bc5359fb02 Implement the old mmap syscall for compat32, when COMPAT_43 option is
enabled. The syscall is used by FreeBSD 1.1.5.1 dynamic linker.

MFC after:	1 week
2012-08-14 12:09:09 +00:00
kib
6253150288 Cosmetics: define FREEBSD32_MINUSER and AOUT32_MINUSER for struct
sysentvec .sv_minuser. Also improve style.

Submitted by:	Oliver Pinter <oliver.pinter@gmail.com>
MFC after:	1 week
2012-07-22 13:41:45 +00:00
kib
53224f018a Extend the KPI to lock and unlock f_offset member of struct file. It
now fully encapsulates all accesses to f_offset, and extends f_offset
locking to other consumers that need it, in particular, to lseek() and
variants of getdirentries().

Ensure that on 32bit architectures f_offset, which is 64bit quantity,
always read and written under the mtxpool protection. This fixes
apparently easy to trigger race when parallel lseek()s or lseek() and
read/write could destroy file offset.

The already broken ABI emulations, including iBCS and SysV, are not
converted (yet).

Tested by:	pho
No objections from:	jhb
MFC after:    3 weeks
2012-07-02 21:01:03 +00:00
kevlo
07ebfe1b9c Make sure that each va_start has one and only one matching va_end,
especially in error cases.
2012-05-29 01:48:06 +00:00
kib
cae6484163 Fix ki_cow for compat32 binaries.
MFC after:	3 days
2012-05-27 05:24:53 +00:00
ed
0d9131d0d0 Regenerate system call tables. 2012-05-25 21:52:57 +00:00
ed
55e4d6365d Remove use of non-ISO-C integer types from system call tables.
These files already use ISO-C-style integer types, so make them less
inconsistent by preferring the standard types.
2012-05-25 21:50:48 +00:00
gleb
3c7243df78 Add kern_fhstat(), adjust sys_fhstat() to use it.
Extend kern_getdirentries() to accept uio segflag and optionally return
buffer residue.

Sponsored by:	Google Summer of Code 2011
2012-05-24 08:00:26 +00:00
netchild
9895b5ca9d - >500 static DTrace probes for the linuxulator
- DTrace scripts to check for errors, performance, ...
  they serve mostly as examples of what you can do with the static probe;s
  with moderate load the scripts may be overwhelmed, excessive lock-tracing
  may influence program behavior (see the last design decission)

Design decissions:
 - use "linuxulator" as the provider for the native bitsize; add the
   bitsize for the non-native emulation (e.g. "linuxuator32" on amd64)
 - Add probes only for locks which are acquired in one function and released
   in another function. Locks which are aquired and released in the same
   function should be easy to pair in the code, inter-function
   locking is more easy to verify in DTrace.
 - Probes for locks should be fired after locking and before releasing to
   prevent races (to provide data/function stability in DTrace, see the
   man-page of "dtrace -v ..." and the corresponding DTrace docs).
2012-05-05 19:42:38 +00:00
jkim
e210f689a8 - Implement pipe2 syscall for Linuxulator. This syscall appeared in 2.6.27
but GNU libc used it without checking its kernel version, e. g., Fedora 10.
- Move pipe(2) implementation for Linuxulator from MD files to MI file,
sys/compat/linux/linux_file.c.  There is no MD code for this syscall at all.
- Correct an argument type for pipe() from l_ulong * to l_int *.  Probably
this was the source of MI/MD confusion.

Reviewed by:	emulation
2012-04-16 21:22:02 +00:00
tijl
212d562cf9 Remove some unnecessary includes. 2012-03-18 19:15:11 +00:00
tijl
35c7447060 Eliminate ia32_reg.h by moving its contents to x86 and ia64 reg.h.
Reviewed by:	kib
2012-03-18 19:12:11 +00:00
tijl
2bf580ea66 Copy i386 reg.h to x86 and merge with amd64 reg.h. Replace i386/amd64/pc98
reg.h with stubs.

The tREGISTER macros are only made visible on i386. These macros are
deprecated and should not be available on amd64.

The i386 and amd64 versions of struct reg have been renamed to struct
__reg32 and struct __reg64. During compilation either __reg32 or __reg64
is defined as reg depending on the machine architecture. On amd64 the i386
struct is also available as struct reg32 which is used in COMPAT_FREEBSD32
code.

Most of compat/ia32/ia32_reg.h is now IA64 only.

Reviewed by:	kib (previous version)
2012-03-18 19:06:38 +00:00
tijl
9c671fcaca Move userland bits of i386 npx.h and amd64 fpu.h to x86 fpu.h.
Remove FPU types from compat/ia32/ia32_reg.h that are no longer needed.
Create machine/npx.h on amd64 to allow compiling i386 code that uses
this header.

The original npx.h and fpu.h define struct envxmm differently. Both
definitions have been included in the new x86 header as struct __envxmm32
and struct __envxmm64. During compilation either __envxmm32 or __envxmm64
is defined as envxmm depending on machine architecture. On amd64 the i386
struct is also available as struct envxmm32.

Reviewed by:	kib
2012-03-16 20:24:30 +00:00
brucec
88520fea69 Fix race condition in KfRaiseIrql().
After getting the current irql, if the kthread gets preempted and
subsequently runs on a different CPU, the saved irql could be wrong.

Also, correct the panic string.

PR:		kern/165630
Submitted by:	Vladislav Movchan <vladislav.movchan at gmail.com>
2012-03-04 17:08:43 +00:00
jmallett
6485e73b87 On MIPS, _ALIGN always aligns to 8 bytes, even for 32-bit binaries. This might
not be ideal, but is the ABI we've shipped so far.  Fix macros which reflect
the results of _ALIGN on 32-bit MIPS to use the right alignment.

This fixes sendmsg under COMPAT_FREEBSD32 on n64 MIPS kernels.
2012-03-03 21:39:12 +00:00
jmallett
50c253779f o) Add COMPAT_FREEBSD32 support for MIPS kernels using the n64 ABI with userlands
using the o32 ABI.  This mostly follows nwhitehorn's lead in implementing
   COMPAT_FREEBSD32 on powerpc64.
o) Add a new type to the freebsd32 compat layer, time32_t, which is time_t in the
   32-bit ABI being used.  Since the MIPS port is relatively-new, even the 32-bit
   ABIs use a 64-bit time_t.
o) Because time{spec,val}32 has the same size and layout as time{spec,val} on MIPS
   with 32-bit compatibility, then, disable some code which assumes otherwise
   wrongly when built for MIPS.  A more general macro to check in this case would
   seem like a good idea eventually.  If someone adds support for using n32
   userland with n64 kernels on MIPS, then they will have to add a variety of
   flags related to each piece of the ABI that can vary.  That's probably the
   right time to generalize further.
o) Add MIPS to the list of architectures which use PAD64_REQUIRED in the
   freebsd32 compat code.  Probably this should be generalized at some point.

Reviewed by:	gonzo
2012-03-03 08:19:18 +00:00
mm
77766742e1 Add procfs to jail-mountable filesystems.
Reviewed by:	jamie
MFC after:	1 week
2012-02-29 00:30:18 +00:00
kib
80ae8fe82c Fix found places where uio_resid is truncated to int.
Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the
sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from
the usermode.

Discussed with:	bde, das (previous versions)
MFC after:	1 month
2012-02-21 01:05:12 +00:00
kib
abd1094f17 Fix misuse of the kernel map in miscellaneous image activators.
Vnode-backed mappings cannot be put into the kernel map, since it is a
system map.

Use exec_map for transient mappings, and remove the mappings with
kmem_free_wakeup() to notify the waiters on available map space.

Do not map the whole executable into KVA at all to copy it out into
usermode.  Directly use vn_rdwr() for the case of not page aligned
binary.

There is one place left where the potentially unbounded amount of data
is mapped into exec_map, namely, in the COFF image activator
enumeration of the needed shared libraries.

Reviewed by:   alc
MFC after:     2 weeks
2012-02-17 23:47:16 +00:00
ed
28b4a002d6 Remove direct access to si_name.
Code should just use the devtoname() function to obtain the name of a
character device. Also add const keywords to pieces of code that need it
to build properly.

MFC after:	2 weeks
2012-02-10 12:35:57 +00:00
davidxu
c4909ace45 Add 32-bit compat code for AIO kevent flags introduced in revision 230857. 2012-02-05 04:49:31 +00:00
kib
361bfae5c2 Add support for the extended FPU states on amd64, both for native
64bit and 32bit ABIs.  As a side-effect, it enables AVX on capable
CPUs.

In particular:

- Query the CPU support for XSAVE, list of the supported extensions
  and the required size of FPU save area. The hw.use_xsave tunable is
  provided for disabling XSAVE, and hw.xsave_mask may be used to
  select the enabled extensions.

- Remove the FPU save area from PCB and dynamically allocate the
  (run-time sized) user save area on the top of the kernel stack,
  right above the PCB. Reorganize the thread0 PCB initialization to
  postpone it after BSP is queried for save area size.

- The dumppcb, stoppcbs and susppcbs now do not carry the FPU state as
  well. FPU state is only useful for suspend, where it is saved in
  dynamically allocated suspfpusave area.

- Use XSAVE and XRSTOR to save/restore FPU state, if supported and
  enabled.

- Define new mcontext_t flag _MC_HASFPXSTATE, indicating that
  mcontext_t has a valid pointer to out-of-struct extended FPU
  state. Signal handlers are supplied with stack-allocated fpu
  state. The sigreturn(2) and setcontext(2) syscall honour the flag,
  allowing the signal handlers to inspect and manipilate extended
  state in the interrupted context.

- The getcontext(2) never returns extended state, since there is no
  place in the fixed-sized mcontext_t to place variable-sized save
  area. And, since mcontext_t is embedded into ucontext_t, makes it
  impossible to fix in a reasonable way.  Instead of extending
  getcontext(2) syscall, provide a sysarch(2) facility to query
  extended FPU state.

- Add ptrace(2) support for getting and setting extended state; while
  there, implement missed PT_I386_{GET,SET}XMMREGS for 32bit binaries.

- Change fpu_kern KPI to not expose struct fpu_kern_ctx layout to
  consumers, making it opaque. Internally, struct fpu_kern_ctx now
  contains a space for the extended state. Convert in-kernel consumers
  of fpu_kern KPI both on i386 and amd64.

First version of the support for AVX was submitted by Tim Bird
<tim.bird am sony com> on behalf of Sony. This version was written
from scratch.

Tested by:	pho (previous version), Yamagi Burmeister <lists yamagi org>
MFC after:	1 month
2012-01-21 17:45:27 +00:00
mckusick
af2e331939 Make sure all intermediate variables holding mount flags (mnt_flag)
and that all internal kernel calls passing mount flags are declared
as uint64_t so that flags in the top 32-bits are not lost.

MFC after: 2 weeks
2012-01-17 01:08:01 +00:00
trociny
d4e71152bd Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want
to read strings completely to know the actual size.

As a side effect it fixes the issue with kern.proc.args and kern.proc.env
sysctls, which didn't return the size of available data when calling
sysctl(3) with the NULL argument for oldp.

Note, in get_ps_strings(), which does actual work for proc_getargv() and
proc_getenvv(), we still have a safety limit on the size of data read in
case of a corrupted procces stack.

Suggested by:	kib
MFC after:	3 days
2012-01-15 18:47:24 +00:00
uqs
d61d88a310 Convert files to UTF-8 2012-01-15 13:23:18 +00:00
dim
d106f2fd7c In sys/compat/linux/linux_ioctl.c, work around a warning when a pointer
is compared to an integer, by casting the pointer to l_uintptr_t.  No
functional difference on both i386 and amd64.

Reviewed by:	ed, jhb
MFC after:	1 week
2012-01-03 18:49:39 +00:00
dim
e28679ec14 In sys/compat/ndis/subr_ntoskrnl.c, change the RtlFillMemory function
definition from K&R to ANSI, to avoid a clang warning about the uint8_t
parameter being promoted to int, which is not compatible with the type
declared in the earlier prototype.

MFC after:	1 week
2011-12-30 17:18:09 +00:00
jhb
cab9e82ffb Implement linux_fadvise64() and linux_fadvise64_64() using
kern_posix_fadvise().

Reviewed by:	silence on emulation@
MFC after:	2 weeks
2011-12-29 15:34:59 +00:00
trociny
783806918a Protect process environment variables with p_candebug().
Discussed with:	jilles, kib, rwatson
MFC after:	2 weeks
2011-12-04 21:43:13 +00:00
trociny
96623b87d9 Retire linprocfs_doargv(). Instead use new functions, proc_getargv()
and proc_getenvv(), which were implemented using linprocfs_doargv() as
a reference.

Suggested by:	kib
Reviewed by:	kib
Approved by:	des (linprocfs maintainer)
MFC after:	2 weeks
2011-11-22 20:45:11 +00:00
lstewart
cca3084242 - Add the ffclock_getcounter(), ffclock_getestimate() and ffclock_setestimate()
system calls to provide feed-forward clock management capabilities to
  userspace processes. ffclock_getcounter() returns the current value of the
  kernel's feed-forward clock counter. ffclock_getestimate() returns the current
  feed-forward clock parameter estimates and ffclock_setestimate() updates the
  feed-forward clock parameter estimates.

- Document the syscalls in the ffclock.2 man page.

- Regenerate the script-derived syscall related files.

Committed on behalf of Julien Ridoux and Darryl Veitch from the University of
Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward
Clock Synchronization Algorithms" project.

For more information, see http://www.synclab.org/radclock/

Submitted by:	Julien Ridoux (jridoux at unimelb edu au)
2011-11-21 01:26:10 +00:00
ed
98496492d6 Make the Linux *at() calls a bit more complete.
Properly support:

- AT_EACCESS for faccessat(),
- AT_SYMLINK_FOLLOW for linkat().
2011-11-19 07:19:37 +00:00