Commit Graph

1945 Commits

Author SHA1 Message Date
bschmidt
b1835fd211 According to specs for MmAllocateContiguousMemorySpecifyCache() physically
contiguous memory with requested restrictions must be allocated.

Submitted by:	Paul B Mahol <onemda at gmail.com>
2010-11-11 18:43:31 +00:00
des
87d7052b67 Break long line. 2010-11-08 15:14:14 +00:00
des
543d80b9c5 Fix CPU ID in /proc/cpuinfo.
PR:		kern/56451
Submitted by:	arundel@
MFC after:	3 weeks
2010-11-08 12:04:41 +00:00
bschmidt
7edbc46989 Remove 4.x, 5.x and 6.x compatibility bits.
Submitted by:	Paul B Mahol <onemda at gmail.com>
2010-11-04 18:43:57 +00:00
kib
dd8752941f Remove stale comment.
Submitted by:	arundel
MFC after:	3 days
2010-10-14 19:30:44 +00:00
kib
84212c7551 Add macro DECLARE_MODULE_TIED to denote a module as requiring the
kernel of exactly the same __FreeBSD_version as the headers module was
compiled against.

Mark our in-tree ABI emulators with DECLARE_MODULE_TIED. The modules
use kernel interfaces that the Release Engineering Team feel are not
stable enough to guarantee they will not change during the life cycle
of a STABLE branch. In particular, the layout of struct sysentvec is
declared to be not part of the STABLE KBI.

Discussed with:	bz, rwatson
Approved by:	re (bz, kensmith)
MFC after:	2 weeks
2010-10-12 09:18:17 +00:00
jkim
6c222550e8 Simplify timeout check in futex_wait() using itimerfix() and return error
if the given timeout is invalid.  Consistently use int type for timeout and
correct a format string in futex_sleep().
2010-10-06 18:51:22 +00:00
netchild
77b4c590d7 Fix a comparision of an uninitialised pointer.
Submitted by:	arundel
Found by:	clang analysis (automatic service by uqs@)
Reviewed by:	rdivacky
2010-10-06 07:34:41 +00:00
thompsa
b942a32370 Use the printf-like capability from kproc_create().
Submitted by:	Paul B Mahol
2010-10-05 20:56:08 +00:00
jkim
e9d0730bf8 Prefer pmap_unmapbios() over pmap_unmapdev(). The binary does not change
after this because pmap_unmapbios() is a macro for pmap_unmapdev() on amd64.
2010-10-05 18:38:23 +00:00
kib
2fc806f00c In linprocfs_doargv():
- handle compat32 processes;
- remove the checks for copied in addresses to belong into valid
  usermode range, proc_rwmem() does this;
- simplify loop reading single string, limit the total amount of strings
  collected by ARG_MAX bytes;
- correctly add '\0' at the end of each copied string;
- fix style.

In linprocfs_doprocenviron():
- unlock the process before calling copyin code [1]. The process is held
  by pseudofs.

In linprocfs_doproccmdline:
- use linprocfs_doargv() to handle !curproc case for which p_args is not cached.

Reported by:		plulnet [1]
Tested by:		pluknet
Approved by:		des (linprocfs maintainer, previous
				version of the patch)
MFC after:		3 weeks
2010-09-28 11:32:17 +00:00
des
a14c121cce Implement proc/$$/environment.
Submitted by:	Fernando Apesteguía <fernando.apesteguia@gmail.com>
MFC after:	3 weeks
2010-09-16 07:56:34 +00:00
mdf
ab3a8b533a Replace sbuf_overflowed() with sbuf_error(), which returns any error
code associated with overflow or with the drain function.  While this
function is not expected to be used often, it produces more information
in the form of an errno that sbuf_overflowed() did.
2010-09-10 16:42:16 +00:00
jkim
0117aaf574 Add x86bios_set_intr() to set interrupt vectors for real mode and simplify
x86bios_get_intr() a little.
2010-08-25 21:03:50 +00:00
jkim
8c8d33fe9f Check opcode for short jump as well. Some option ROMs do short jumps
(e.g., some NVIDIA video cards) and we were not able to do POST while
resuming because we only honored long jump.

MFC after:	3 days
2010-08-25 20:52:40 +00:00
kib
d9f088a03e Supply some useful information to the started image using ELF aux vectors.
In particular, provide pagesize and pagesizes array, the canary value
for SSP use, number of host CPUs and osreldate.

Tested by:	marius (sparc64)
MFC after:	1 month
2010-08-17 08:55:45 +00:00
jkim
b25a196078 Place spinlock_enter() and spinlock_exit() just around X86EMU calls. 2010-08-10 15:22:48 +00:00
jkim
781d513f0b Tidy up locking and memory allocation for the real mode emulator wrapper.
Now we use a regular mutex instead of a spin mutex.  When we enter and exit
the emulator, spinlock_enter() and spinlock_exit() are additionally used.
Move some page table related stuff from x86bios_init() and x86bios_uninit()
to x86bios_map_mem() and x86bios_unmap_mem().
2010-08-10 06:25:08 +00:00
jkim
6328a1bf23 Tidy up printf() calls for debugging. 2010-08-09 22:06:08 +00:00
jkim
c07bc7f517 Initialize a variable just before its use. 2010-08-09 18:10:32 +00:00
jkim
c9e34bbc39 Reduce diffs between VM86 and X86EMU wrappers for x86bios_alloc() and
x86bios_free().  Add strict sanity checks for VM86 wrapper and add strict
page table locking for X86EMU wrapper.
2010-08-09 17:54:26 +00:00
kib
8e1e89f01b Prefer struct sysentvec sv_psstrings to hardcoding FREEBSD32_PS_STRINGS
in the compat32 code. Use sv_usrstack instead of FREEBSD32_USRSTACK as well.

MFC after:	1 week
2010-08-07 11:57:13 +00:00
kib
8043767b92 Add compat32 definition for (old) struct ostat.
MFC after:	1 week
2010-08-07 11:53:38 +00:00
jkim
77b28d0e95 Do not block any I/O port on amd64. 2010-08-07 04:05:58 +00:00
jkim
012f478c81 Optimize interrupt vector lookup. There is no need to check the page table. 2010-08-07 03:45:45 +00:00
jkim
57b610d580 Consistently use architecture specific macros. 2010-08-06 15:24:37 +00:00
jkim
c1d76c06de Fix allocation of multiple pages, which forgot to increase page number.
Particularly, it caused "vm86_addpage: overlap" panics under VirtualBox.
Add a safety check before freeing memory while I am here.
2010-08-06 15:04:01 +00:00
jkim
9c60808f39 Re-add flag register for output. Some BIOS calls actually use it to return
success/failure status.  Oops.
2010-08-05 19:30:57 +00:00
jkim
0a7f48a833 Do not copy stack pointer and flags. These registers are unconditionally
destroyed from vm86_prepcall().
2010-08-05 19:12:35 +00:00
jkim
f183f61cf2 Implement a simple native VM86 backend for X86BIOS. Now i386 uses native
VM86 calls instead of the real mode emulator as a backend.  VM86 has been
proven reliable for very long time and it is actually few times faster than
emulation.  Increase maximum number of page table entries per VM86 context
from 3 to 8 pages.  It was (ridiculously) low and insufficient for new VM86
backend, which shares one context globally.  Slighly rearrange and clean up
the emulator backend to accommodate new code.  The only visible change here
is stack size, which is decreased from 64K to 4K bytes to sync. with VM86.
Actually, it seems there is no need for big stack in real mode.

MFC after:	1 month
2010-08-05 18:48:30 +00:00
kib
4a7e2ba2a3 Copy inode birthtime to the struct stat32.
MFC after:	1 week
2010-08-04 14:38:20 +00:00
kib
36b27b8587 Fix style.
MFC after:	1 week
2010-08-04 14:35:05 +00:00
kib
734aeecfaf When compat32 recvmsg(2) does not need to copy out control messages, set
msg_controllen to 0.

PR:	kern/149227
Submitted by:	Stef Walter <stef memberwebs com>
MFC after:	1 weeks
2010-08-03 11:23:44 +00:00
alc
256c63de28 Introduce exec_alloc_args(). The objective being to encapsulate the
details of the string buffer allocation in one place.

Eliminate the portion of the string buffer that was dedicated to storing
the interpreter name.  The pointer to the interpreter name can simply be
made to point to the appropriate argument string.

Reviewed by:	kib
2010-07-27 17:31:03 +00:00
kib
5d01c62502 Revert r210451, and the similar part of the r210431. The forward-declaration
for the enum tag when enum definition is not complete is not allowed by
C99, and is gcc extension.

Requested by:	stefanf
MFC after:	28 days
2010-07-26 12:52:44 +00:00
alc
02c0473d35 Change the order in which the file name, arguments, environment, and
shell command are stored in exec*()'s demand-paged string buffer.  For
a "buildworld" on an 8GB amd64 multiprocessor, the new order reduces
the number of global TLB shootdowns by 31%.  It also eliminates about
330k page faults on the kernel address space.

Change exec_shell_imgact() to use "args->begin_argv" consistently as
the start of the argument and environment strings.  Previously, it
would sometimes use "args->buf", which is the start of the overall
buffer, but no longer the start of the argument and environment
strings.  While I'm here, eliminate unnecessary passing of "&length"
to copystr(), where we don't actually care about the length of the
copied string.

Clean up the initialization of the exec map.  In particular, use the
correct size for an entry, and express that size in the same way that
is used when an entry is allocated.  The old size was one page too
large.  (This discrepancy originated in 2004 when I rewrote
exec_map_first_page() to use sf_buf_alloc() instead of the exec map
for mapping the first page of the executable.)

Reviewed by:	kib
2010-07-25 17:43:38 +00:00
kib
229b3b9c19 Remove the linux_exec_copyin_args(), freebsd32_exec_copyin_args() may
server as well. COMPAT_FREEBSD32 is a prerequisite for COMPAT_LINUX32.

Reviewed by:	alc
MFC after:	3 weeks
2010-07-23 21:30:33 +00:00
alc
0c709bf109 Eliminate a little bit of duplicated code. 2010-07-23 18:58:27 +00:00
trasz
e2cd3ad716 Remove proc locking, it's not needed after r210132. 2010-07-17 15:52:11 +00:00
trasz
e3a946ddad Make svr4(4) version of poll(2) use the same limit of file descriptors as the
usual poll(2) does, instead of checking resource limits.
2010-07-15 18:44:58 +00:00
kib
5f8b30cbbb Constify source argument for siginfo_to_siginfo32().
MFC after:	1 week
2010-07-04 11:43:53 +00:00
jhb
df7979cf76 Tweak the in-kernel API for sending signals to threads:
- Rename tdsignal() to tdsendsignal() and make it private to kern_sig.c.
- Add tdsignal() and tdksignal() routines that mirror psignal() and
  pksignal() except that they accept a thread as an argument instead of
  a process.  They send a signal to a specific thread rather than to an
  individual process.

Reviewed by:	kib
2010-06-29 20:41:52 +00:00
kib
180cca1c2d Regenerate 2010-06-28 18:17:21 +00:00
kib
b6d8416eac Count number of threads that enter and leave dynamically registered
syscalls. On the dynamic syscall deregistration, wait until all
threads leave the syscall code. This somewhat increases the safety
of the loadable modules unloading.

Reviewed by:	jhb
Tested by:	pho
MFC after:	1 month
2010-06-28 18:06:46 +00:00
jkim
f68d88b142 Let x86bios_alloc() pass contigmalloc(9) flags. Use it to set M_WAITOK
from VESA BIOS initialization.  All other malloc(9) uses in the function is
blocking any way.
2010-06-23 17:20:51 +00:00
ed
d7eaa4520b ANSIfy prototypes in subr_usbd.c.
Clang generates the following warnings when building subr_usbd.c:

| subr_usbd.c:598:13: warning: promoted type 'int' of K&R function
|   parameter is not compatible with the parameter type 'uint8_t' (aka
|   'unsigned char') declared in a previous prototype
| subr_usbd.c:627:13: warning: promoted type 'int' of K&R function
|   parameter is not compatible with the parameter type 'uint8_t' (aka
|   'unsigned char') declared in a previous prototype
| subr_usbd.c:649:13: warning: promoted type 'int' of K&R function
|   parameter is not compatible with the parameter type 'uint8_t' (aka
|   'unsigned char') declared in a previous prototype

Instead of just ANSIfying these three prototypes, do it for the entire
file.

Spotted by:	clang
2010-06-12 12:19:08 +00:00
jhb
9b74a62d73 Update several places that iterate over CPUs to use CPU_FOREACH(). 2010-06-11 18:46:34 +00:00
wkoszek
ab9f5dbe35 Bring USB fixes for linux(4).
Intention of this commit is to let us take a full advantage
of libusb(8) ported to Linux. This decreases a possibility of getting
any collisions within ioctl() "command" space, especially with
relation to  LINUX_SNDCTL_SEQ... stuff.

Basically, we provide commands, that will be mapped in the kernel
to correct ones and forward those to the USB layer. Port enabling
functionality brought with this patch is here:

	http://www.freebsd.org/cgi/query-pr.cgi?pr=146895

Bump __FreeBSD_version to catch, since which version installing a
port makes sense.

This patch should bring no regressions. So far, only i386 is tested.

Tested by:	thompsa@
Reviewed by:	thompsa@
OKed by:	netchild@
2010-05-24 07:04:00 +00:00
kib
4208ccbe79 Reorganize syscall entry and leave handling.
Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
  usermode into struct syscall_args. The structure is machine-depended
  (this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
  from the syscall. It is a generalization of
  cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
  return value.
sv_syscallnames - the table of syscall names.

Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().

The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.

Syscallenter() fetches arguments, calls syscall implementation from
ABI sysent table, and set up return frame. The end of syscall
bookkeeping is done by syscallret().

Take advantage of single place for MI syscall handling code and
implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at syscall entry or return point respectively.  The
EXEC flag augments SCX and notifies debugger that the process address
space was changed by one of exec(2)-family syscalls.

The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.

Reviewed by:	jhb, marcel, marius, nwhitehorn, stas
Tested by:	marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
	stas (mips)
MFC after:	1 month
2010-05-23 18:32:02 +00:00
netchild
88731dce0c - #ifdef out the cliplist part, skype seems like using an uninitialized
variable and can cause problems, without the cliplist handling it works
  without problems
- improve the cliplist error handling
- fix VIDIOCGTUNER and VIDIOCSMICROCODE (still no hardware available to test)

Submitted by:	J.R. Oldroyd <jr@opal.com>
X-MFC after:	soon (together with all the v4l stuff)
2010-05-03 14:19:58 +00:00
jkim
428aff4f1c Reduce MD code further. At least, it compiles on ia64 now (but it is not
connected to build).  The idea/code was shamelessly taken from r207329.
2010-05-01 01:05:07 +00:00
jkim
b092dfff59 Do not initialize mutex and return error if it cannot map memory. 2010-05-01 00:36:40 +00:00
kib
1b4a81ab7e Provide compat32 shims for kinfo_proc sysctl. This allows 32bit ps(1) to
mostly work on 64bit host.

The work is based on an original patch submitted by emaste, obtained
from Sandvine's source tree.

Reviewed by:	jhb
MFC after:	1 week
2010-04-21 19:32:00 +00:00
kib
d4d906be00 Extract the code to copy-out struct rusage32 from struct rusage
into the new function.

Reviewed by:	jhb
MFC after:	1 week
2010-04-21 19:28:01 +00:00
emaste
c1468b9b67 Linux puts a blank line between each CPU. 2010-04-14 13:44:22 +00:00
bz
7fe3c0a85b Add a forward declaration to silence a warning when compiling ia32_genassym.c.
Reviewed by:	kib
MFC after:	3 days
2010-04-03 12:34:32 +00:00
netchild
1edbfe1bf0 Re-apply r205683 with some modifications:
Fix some bogus values in linprocfs.

  Submitted by:	Petr Salinger <Petr.Salinger@seznam.cz>
  Verified on:	GNU/kFreeBSD debian 8.0-1-686 (by submitter)
  PR:		144584

Reviewed by / discussed with:	kib, des, jhb, submitter
2010-04-02 06:50:28 +00:00
ed
4f08ecd7ed Rename st_*timespec fields to st_*tim for POSIX 2008 compliance.
A nice thing about POSIX 2008 is that it finally standardizes a way to
obtain file access/modification/change times in sub-second precision,
namely using struct timespec, which we already have for a very long
time. Unfortunately POSIX uses different names.

This commit adds compatibility macros, so existing code should still
build properly. Also change all source code in the kernel to work
without any of the compatibility macros. This makes it all a less
ambiguous.

I am also renaming st_birthtime to st_birthtim, even though it was a
local extension anyway. It seems Cygwin also has a st_birthtim.
2010-03-28 13:13:22 +00:00
netchild
d82aae616f Revert r205683 to resolve some code quality issues which do not affect the
build or use of linprocfs, before committing the reworked patch again.

Requested by:	des
2010-03-26 14:36:16 +00:00
netchild
8cb631a42d Fix some bogus values in linprocfs.
Submitted by:	Petr Salinger <Petr.Salinger@seznam.cz>
Verified on:	GNU/kFreeBSD debian 8.0-1-686 (by submitter)
PR:		144584
2010-03-26 11:43:15 +00:00
netchild
db8514f06a Fix some problems which may lead to a panic:
- right order of src and dst in memcpy
 - NULL out the clips after freeing to prevent an accident

Noticed by:	hselasky
2010-03-26 08:42:11 +00:00
jkim
3ce45e9870 Revert accidentally committed initial real mode %sp change of r205347.
Note I am keeping %ds change because X.org int10 handler does it and
it seems reasonable.
2010-03-25 17:14:47 +00:00
jkim
7725810bd9 Optimize real mode page table lookup. 2010-03-25 17:03:52 +00:00
jkim
49ca5d49ba Fix stupid typos. Some VESA BIOSes directly call BIOS interrupt handlers
within the VBE interrupt handler.  Unfortunately it was causing real mode
page faults because we were fetching instructions from bogus addresses.
Pass me the pointyhat, please.

PR:		kern/144654
MFC after:	3 days
2010-03-25 15:56:04 +00:00
nwhitehorn
d63c82a6ac Change the arguments of exec_setregs() so that it receives a pointer
to the image_params struct instead of several members of that struct
individually. This makes it easier to expand its arguments in the future
without touching all platforms.

Reviewed by:	jhb
2010-03-25 14:24:00 +00:00
jhb
72e71a3c3d Add missing Giant locking for the vfsconf list.
Submitted by:	kib
2010-03-24 14:20:37 +00:00
jhb
878de09a93 Implement /proc/filesystems.
Submitted by:	Fernando Apesteguia fernando.apesteguia (gmail)
2010-03-23 21:49:33 +00:00
jkim
a40c3ddf5a Support memory wraparound instead of high memory as VM86 mode does.
Suggested by:	delphij
2010-03-22 18:43:36 +00:00
jkim
c581c7d82f Fix i386 PAE kernel build.
Reported by:	tinderbox
2010-03-22 17:30:34 +00:00
ed
6156503467 Actually make O_DIRECTORY work.
According to POSIX open() must return ENOTDIR when the path name does
not refer to a path name. Change vn_open() to respect this flag. This
also simplifies the Linuxolator a bit.
2010-03-21 20:43:23 +00:00
jkim
a0f967165a - Map EBDA if available and add 64KB above 1MB (high memory), just in case.
- Print the initial memory map when bootverbose is set.
- Change the page fault address format from linear to %cs:%ip style.
- Move duplicate code into a newly added function.
- Add strictly aligned memory access for distant future. ;-)
2010-03-19 21:15:43 +00:00
kib
169d74536d Regen 2010-03-19 11:14:37 +00:00
kib
b0ac73182e Remove empty line.
MFC after:	2 weeks
2010-03-19 11:13:42 +00:00
kib
06319cba03 Implement compat32 shims for mqueuefs.
Reviewed by:	jhb
MFC after:	2 weeks
2010-03-19 11:10:24 +00:00
kib
34d2655cb1 Implement compat32 shims for ksem syscalls.
Reviewed by:	jhb
MFC after:	2 weeks
2010-03-19 11:08:43 +00:00
kib
b27fa06f97 Move SysV IPC freebsd32 compat shims from freebsd32_misc.c to corresponding
sysv_{msg,sem,shm}.c files.

Mark SysV IPC freebsd32 syscalls as NOSTD and add required
SYSCALL_INIT_HELPER/SYSCALL32_INIT_HELPERs to provide auto
register/unregister on module load.

This makes COMPAT_FREEBSD32 functional with SysV IPC compiled and loaded
as modules.

Reviewed by:	jhb
MFC after:	2 weeks
2010-03-19 11:04:42 +00:00
kib
610214ed4c Move SysV IPC freebsd32 compat shims helpers from freebsd32_misc.c to
sysv_ipc.c.

Reviewed by:	jhb
MFC after:	2 weeks
2010-03-19 11:01:51 +00:00
kib
d19a162142 Introduce SYSCALL_INIT_HELPER and SYSCALL32_INIT_HELPER macros and
neccessary support functions to allow registering dynamically loaded
syscalls from the MOD_LOAD handlers. Helpers handle registration
failures semi-automatically.

Reviewed by:	jhb
MFC after:	2 weeks
2010-03-19 10:56:30 +00:00
kib
7932a1f757 FOr SYSCALL_MODULE_HELPER, use "sys/<syscallname>" module name.
FOr SYSCALL32_MODULE_HELPER, use "sys32/<syscallname>" module name.
This avoids modules name conflict when compat32 syscall does not
need shims.

Note that SYSCALL_MODULE_HELPER is going to be unused in the tree by
several next commits.

Suggested by:	jhb
MFC after:	2 weeks
2010-03-19 10:52:54 +00:00
kib
e28b3013f9 Make freebsd32_copyiniov() available outside of freebsd32_misc.
MFC after:	2 weeks
2010-03-19 10:49:03 +00:00
jkim
de04cb233d Detect illegal access to unmapped memory within real mode emulator to aid
debugging.  Update copyright date while I am here.
2010-03-18 20:15:34 +00:00
nwhitehorn
8cbb7143ab Regen after big endian compatibility import. 2010-03-11 14:56:59 +00:00
nwhitehorn
142a4d2993 Provide groundwork for 32-bit binary compatibility on non-x86 platforms,
for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32
option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts
of the kernel and enhances the freebsd32 compatibility code to support
big-endian platforms.

Reviewed by:	kib, jhb
2010-03-11 14:49:06 +00:00
ed
3c620db1c4 Make /proc/self/fd `work'.
On Linux, /proc/<pid>/fd is comparable to fdescfs, where it allows you
to inspect the file descriptors used by each process. Glibc's ttyname()
works by performing a readlink() on these nodes, since all nodes in this
directory are symlinks.

It is a bit hard to implement this in linprocfs right now, so I am not
going to bother. Add a way to make ttyname(3) work, by adding a
/proc/<pid>/fd symlink, which points to /dev/fd only if the calling
process matches. When fdescfs is mounted, this will cause the
readlink() in ttyname() to fail, causing it to fall back on manually
finding a matching node in /dev.

Discussed on:	emulation@
2010-03-07 10:43:45 +00:00
joel
58fee7f0f3 The NetBSD Foundation has granted permission to remove clause 3 and 4 from
their software.

Obtained from:	NetBSD
2010-03-01 17:20:04 +00:00
pjd
c527452336 No need to include security/mac/mac_framework.h here. 2010-02-18 22:26:01 +00:00
delphij
54ac71ad0f - Return EAFNOSUPPORT instead of EINVAL for unsupported address family,
this matches the Linux behavior.
 - Check if we have sufficient space allocated for socket structure, which
   fixes a buffer overflow when wrong length is being passed into the
   emulation layer. [1]

PR:		kern/138860
Submitted by:	Mateusz Guzik <mjguzik gmail com>
Reported by:	Alexander Best [1]
MFC after:	2 weeks
2010-02-09 22:30:51 +00:00
ed
d40177139e Remove unused LIBCOMPAT keyword from syscalls.master. 2010-02-08 10:02:01 +00:00
wkoszek
260e3efd2e Let us to use our libusb(3) in Linuxolator.
With this change, Linux binaries can work with our libusb(3) when
it's compiled against our header files on GNU/Linux system -- this
solves the problem with differences between /dev layouts.

With ported libusb(3), I am able to use my USB JTAG cable with Linux
binaries that support it.

Reviewed by:	thompsa
2010-01-18 22:46:06 +00:00
netchild
7cde5eb62d Whitespace change to be able to provide the correct commit log for r202364:
---snip---
Add video clipping support but with the caveats below.

Background info:

Video clipping allows the user to provide either a series of clip rectangles
or a clip bitmap to the driver and have the driver mask the video according
to the clipping specs provided.

Adding support for clipping to the FreeBSD Linux emulator is problematic
because it seems that this feature is not supported by many drivers and
therefore it is ignored by many applications. Unfortunately, when not
using it, rather than passing in a null clipping list, some apps leave the
clipping fields uninitialized, casuing random values to be passed in. In
the case where the driver does not use the clipping info, this is not a
problem (although it is bad form). But the Linux emulator does not know
which drivers will use this and which won't, so the Linux emulator must
try to handle this clip list, and deal gracefully with cases where the
values seem to be uninitialized.

Video clipping info is passed in using the VIDIOCSWIN ioctl in two fields
in the video_window structure: the integer clipcount and the pointer clips.

How the linuxulator handles this from this commit on:

    * if (clipcount == VIDEO_CLIP_BITMAP)
      The clips variable is a void * pointer to a 128*625 byte
      (1024*625 bit) memory area containing a bitmap of the clipping area.
      The pointer in the video_window structure is copied, but no
      video_clip structures are copied.
    * if (clipcount > 0 && clipcount <= 16384)
      The clips variable is pointer to a list of video_clip structures. Up
      to clipcount structures are copied and passed to the driver.
      The upper limit of 16384 was imposed here so that user code that does
      not properly initialize clipcount falls through below and no attempt
      is made to copy an uninitialized list. This value was found by
      examining Linux drivers that support the clip list.
    * else
      The clipcount is either negative (but not VIDEO_CLIP_BITMAP), zero or
      positive (> 16384).
      All these cases are treated as invalid data. Both the clipcount field
      and clips pointer are forced to zero/NULL and passed to the driver.

It should be noted that, at the time of developing this V4L emulator code,
the pwc(4) V4L driver does not support clipping.

Submitted by:	J.R. Oldroyd <fbsd@opal.com>
MFC after:	1 month
---snip---
2010-01-15 15:38:31 +00:00
netchild
9c3e54b694 This is v4l support for the linuxulator. This allows to access FreeBSD
native devices which support the v4l API from processes running within
the linuxulator, e.g. skype or flash can access the multimedia/pwcbsd driver.

Not tested is firmware upload, framebuffer stuff and video tuner stuff
due to lack of hardware.
The clipping part (VIDIOCSWIN) needs a little bit of further work (partly
in progress, but can not be tested due to lack of a suitable device).

The submitter tested this sucessfully with Skype and flash apps on amd64 and
i386 with the multimedia/pwcbsd driver.

Submitted by:	J.R. Oldroyd <fbsd@opal.com>
2010-01-15 14:58:19 +00:00
brooks
f354ae0814 Since all other comparisons involving ngroups_max use
"ngroups_max + 1", use ">= ngroups_max+1" instead of the equivalent
"> ngroups_max" to reduce confusion.
2010-01-15 07:05:00 +00:00
brooks
a093b41daf Replace the static NGROUPS=NGROUPS_MAX+1=1024 with a dynamic
kern.ngroups+1.  kern.ngroups can range from NGROUPS_MAX=1023 to
INT_MAX-1.  Given that the Windows group limit is 1024, this range
should be sufficient for most applications.

MFC after:	1 month
2010-01-12 07:49:34 +00:00
mckusick
0cddeb2cb4 Background:
When renaming a directory it passes through several intermediate
states. First its new name will be created causing it to have two
names (from possibly different parents). Next, if it has different
parents, its value of ".." will be changed from pointing to the old
parent to pointing to the new parent. Concurrently, its old name
will be removed bringing it back into a consistent state. When fsck
encounters an extra name for a directory, it offers to remove the
"extraneous hard link"; when it finds that the names have been
changed but the update to ".." has not happened, it offers to rewrite
".." to point at the correct parent. Both of these changes were
considered unexpected so would cause fsck in preen mode or fsck in
background mode to fail with the need to run fsck manually to fix
these problems. Fsck running in preen mode or background mode now
corrects these expected inconsistencies that arise during directory
rename. The functionality added with this update is used by fsck
running in background mode to make these fixes.

Solution:

This update adds three new fsck sysctl commands to support background
fsck in correcting expected inconsistencies that arise from incomplete
directory rename operations. They are:

setcwd(dirinode) - set the current directory to dirinode in the
    filesystem associated with the snapshot.
setdotdot(oldvalue, newvalue) - Verify that the inode number for ".."
    in the current directory is oldvalue then change it to newvalue.
unlink(nameptr, oldvalue) - Verify that the inode number associated
    with nameptr in the current directory is oldvalue then unlink it.

As with all other fsck sysctls, these new ones may only be used by
processes with appropriate priviledge.

Reported by:    	jeff
Security issues:	rwatson
2010-01-11 20:44:05 +00:00
mbr
7450f52a57 Remove extraneous semicolons, no functional changes.
Submitted by:	Marc Balmer <marc@msys.ch>
MFC after:	1 week
2010-01-07 21:01:37 +00:00
kib
19ee2b964b Signal 0 is used to check the permission for current process to signal
target one. Since r184058, linux_do_tkill() calls tdsignal() instead of
kill(), without checking for validity of supplied signal number. Prevent
panic when supplied signal is 0 by finishing work after checks.

Found and tested by:	scf
MFC after:	3 days
2009-12-18 14:27:18 +00:00
imp
747567aeb7 Revert 200606. 2009-12-16 21:53:56 +00:00
imp
09c3f98bcc Fix compiling FREEBSD_COMPAT[4,5,6] without FREEBSD_COMPAT7.
Note: Not sure this is the right way to do compat, but it makes the
headers consistent with the implementations.
2009-12-16 17:17:40 +00:00
jkim
ddc390f24b Add two new debugging tunables for x86bios instead of abusing bootverbose,
i.e., debug.x86bios.call and debug.x86bios.int.
2009-12-15 22:44:28 +00:00
kib
ed39e6c80b Regenerate. 2009-12-04 21:53:20 +00:00
kib
080e59eb45 Add several syscall compat32 entries for acl manipulation.
They do not require translation of the arguments.

Tested by:	bsam
MFC after:	1 week
2009-12-04 21:52:31 +00:00
netchild
3b599cd6fa This is v4l support for the linuxulator. This allows to access FreeBSD
native devices which support the v4l API from processes running within
the linuxulator, e.g. skype or flash can access the multimedia/pwcbsd driver.

Not tested is firmware upload, framebuffer stuff and video tuner stuff
due to lack of hardware.
The clipping part (VIDIOCSWIN) needs a little bit of further work (partly
in progress, but can not be tested due to lack of a suitable device).

The submitter tested this sucessfully with Skype and flash apps on amd64 and
i386 with the multimedia/pwcbsd driver.

Submitted by:	J.R. Oldroyd <fbsd@opal.com>
2009-12-04 21:06:54 +00:00
netchild
21465cca7f Import the unchanged v4l videodev.h from the vendor branch. 2009-12-04 20:46:45 +00:00
ed
0ddd901675 Include <sys/tty.h> instead of <sys/termios.h>.
Right now <sys/termios.h> includes <sys/ttycom.h>, which provides the
TTY ioctls to the svr4 code. We need both struct termios and the ioctls,
so include <sys/tty.h> for now.
2009-11-28 16:30:06 +00:00
netchild
063f564d20 Fix typo in kernel message. The fix is based upon the patch in the PR.
PR:		kern/140279
Submitted by:	Alexander Best <alexbestms@math.uni-muenster.de>
MFC after:	1 week
2009-11-05 07:37:48 +00:00
rpaulo
36d63b3d79 Revert a functional change that snuck in. 2009-11-02 19:13:12 +00:00
rpaulo
10361c8d0a Fix a non-style change that snuck in.
Spotted by: danfe
2009-11-02 18:51:24 +00:00
rpaulo
898a75fb36 Big style cleanup. While there remove references to FreeBSD versions
older than 6.0.

Submitted by:	Paul B Mahol <onemda at gmail.com>
2009-11-02 11:07:42 +00:00
kib
7b3cdceb8e Regenerate 2009-10-27 11:02:04 +00:00
kib
08e5013938 Current pselect(3) is implemented in usermode and thus vulnerable to
well-known race condition, which elimination was the reason for the
function appearance in first place. If sigmask supplied as argument to
pselect() enables a signal, the signal might be delivered before thread
called select(2), causing lost wakeup. Reimplement pselect() in kernel,
making change of sigmask and sleep atomic.

Since signal shall be delivered to the usermode, but sigmask restored,
set TDP_OLDMASK and save old mask in td_oldsigmask. The TDP_OLDMASK
should be cleared by ast() in case signal was not gelivered during
syscall execution.

Reviewed by:	davidxu
Tested by:	pho
MFC after:	1 month
2009-10-27 10:55:34 +00:00
kib
ce081b037e In r197963, a race with thread being selected for signal delivery
while in kernel mode, and later changing signal mask to block the
signal, was fixed for sigprocmask(2) and ptread_exit(3). The same race
exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls.

Use kern_sigprocmask() instead of direct manipulation of td_sigmask to
reschedule newly blocked signals, closing the race.

Reviewed by:	davidxu
Tested by:	pho
MFC after:	1 month
2009-10-27 10:47:58 +00:00
kib
eb4c68098b In kern_sigsuspend(), better manipulate thread signal mask using
kern_sigprocmask() to properly notify other possible candidate threads
for signal delivery.

Since sigsuspend() shall only return to usermode after a signal was
delivered, do cursig/postsig loop immediately after waiting for
signal, repeating the wait if wakeup was spurious due to race with
other thread fetching signal from the process queue before us. Add
thread_suspend_check() call to allow the thread to be stopped or killed
while in loop.

Modify last argument of kern_sigprocmask() from boolean to flags,
allowing the function to be called with locked proc. Convertion of the
callers that supplied 1 to the old argument will be done in the next
commit, and due to SIGPROCMASK_OLD value equial to 1, code is formally
correct in between.

Reviewed by:	davidxu
Tested by:	pho
MFC after:	1 month
2009-10-27 10:42:24 +00:00
bz
b991a4ad12 Unconditionally call the setsockopt for IPV6_V6ONLY for v6 linux sockets
no matter whether we are compiled as module or if our default of the
net.inet6.ip6.v6only sysctl already matches what we would set.

This avoids unnecessary complications with modules, VIMAGES, INET6 and
the sysctl value, especially considering that most users will use
linux compat as a module.

Discussed with:	kib, rwatson (weeks ago)
Reviewed by:	rwatson
MFC after:	6 weeks
2009-10-25 09:58:56 +00:00
jkim
22d120ca09 Fix a copy-and-pasto in the previous commit. 2009-10-19 21:01:42 +00:00
jkim
99279734b8 Rewrite x86bios and update its dependent drivers.
- Do not map entire real mode memory (1MB).  Instead, we map IVT/BDA and
ROM area separately.  Most notably, ROM area is mapped as device memory
(uncacheable) as it should be.  User memory is dynamically allocated and
free'ed with contigmalloc(9) and contigfree(9).  Remove now redundant and
potentially dangerous x86bios_alloc.c.  If this emulator ever grows to
support non-PC hardware, we may implement it with rman(9) later.
- Move all host-specific initializations from x86emu_util.c to x86bios.c and
remove now unnecessary x86emu_util.c.  Currently, non-PC hardware is not
supported.  We may use bus_space(9) later when the KPI is fixed.
- Replace all bzero() calls for emulated registers with more obviously named
x86bios_init_regs().  This function also initializes DS and SS properly.
- Add x86bios_get_intr().  This function checks if the interrupt vector is
available for the platform.  It is not necessary for PC-compatible hardware
but it may be needed later. ;-)
- Do not try turning off monitor if DPMS does not support the state.
- Allocate stable memory for VESA OEM strings instead of just holding
pointers to them.  They may or may not be accessible always.  Fix a memory
leak of video mode table while I am here.
- Add (experimental) BIOS POST call for vesa(4).  This function calls VGA
BIOS POST code from the current VGA option ROM.  Some video controllers
cannot save and restore the state properly even if it is claimed to be
supported.  Usually the symptom is blank display after resuming from suspend
state.  If the video mode does not match the previous mode after restoring,
we try BIOS POST and force the known good initial state.  Some magic was
taken from NetBSD (and it was taken from vbetool, I believe.)
- Add a loader tunable for vgapci(4) to give a hint to dpms(4) and vesa(4)
to identify who owns the VESA BIOS.  This is very useful for multi-display
adapter setup.  By default, the POST video controller is automatically
probed and the tunable "hw.pci.default_vgapci_unit" is set to corresponding
vgapci unit number.  You may override it from loader but it is very unlikely
to be necessary.  Unfortunately only AGP/PCI/PCI-E controllers can be
matched because ISA controller does not have necessary device IDs.
- Fix a long standing bug in state save/restore function.  The state buffer
pointer should be ES:BX, not ES:DI according to VBE 3.0.  If it ever worked,
that's because BX was always zero. :-)
- Clean up register initializations more clearer per VBE 3.0.
- Fix a lot of style issues with vesa(4).
2009-10-19 20:58:10 +00:00
bz
8e183cd852 Make sure that the primary native brandinfo always gets added
first and the native ia32 compat as middle (before other things).
o(ld)brandinfo as well as third party like linux, kfreebsd, etc.
stays on SI_ORDER_ANY coming last.

The reason for this is only to make sure that even in case we would
overflow the MAX_BRANDS sized array, the native FreeBSD brandinfo
would still be there and the system would be operational.

Reviewed by:	kib
MFC after:	1 month
2009-10-03 11:57:21 +00:00
rwatson
47ee86367a Regenerate system call files following r197636. 2009-09-30 08:48:59 +00:00
rwatson
3d5e3df28c Reserve system call numbers for Capsicum security framework capabilities,
capability mode, and process descriptors: cap_new, cap_getrights, cap_enter,
cap_getmode, pdfork, pdkill, pdgetpid, and pdwait.

Obtained from:	TrustedBSD Project
Sponsored by:	Google
MFC after:	3 weeks
2009-09-30 08:46:01 +00:00
delphij
987c4a758d Use a 2 clause BSD-style license instead of stating the code as public
domain, as requested by core@ and reviewed by the author.
2009-09-28 08:14:15 +00:00
jkim
08823d6c8c - Reduce BIOS memory mapping. We want 1MB of physical memory, not 12MB[1].
- Remove CS and IP registers from x86bios.h.  They have no use for us.
- Adjust register dump to make it little bit more useful for debugging.

Submitted by:	paradox (ddkprog yahoo com)[1] (initial version)
2009-09-25 17:56:32 +00:00
jkim
21b9526006 Dump real mode registers under bootverbose to help debugging BIOS emulator. 2009-09-24 22:42:35 +00:00
jkim
54c804074e - Use FreeBSD function naming convention.
- Change x86biosCall() to more appropriate x86bios_intr().[1]

Discussed with:	delphij, paradox (ddkprog yahoo com)
Submitted by:	paradox (ddkprog yahoo com)[1]
2009-09-24 19:24:42 +00:00
jkim
4f6b75d358 Move sys/dev/x86bios to sys/compat/x86bios.
It may not be optimal but it is clearly better than the old place.

OK'ed by:	delphij, paradox (ddkprog yahoo com)
2009-09-23 20:49:14 +00:00
zec
840419a4c4 Lock the ifnet list while iterating over it.
Submitted by:	julian
MFC after:	3 days
2009-09-13 21:30:18 +00:00
des
e4fc8331e6 As jhb@ pointed out to me, r197057 was incorrect, not least because these
are generated files.
2009-09-10 13:20:27 +00:00
des
b9f96dd3c0 If a certain feature that was present in FreeBSD 7 was removed or changed in
FreeBSD 8, the compatibility shims should be built not just when FreeBSD 7
compatibility is requested, but also when compatibility with any older
FreeBSD version where that feature was present is requested.o

Without this patch, a kernel config that sets COMPAT_FREEBSD6 but not *7
would fail to build due to inconsistencies between the declaration of the
compatibility shims and their use in the SysV code.

There are similar errors in other *proto.h headers in the tree.

MFC after:	3 weeks
2009-09-10 08:33:28 +00:00
kib
91e6a5b3cc kern_select(9) copies fd_set in and out of userspace in quantities of
longs. Since 32bit processes longs are 4 bytes, 64bit kernel may copy in
or out 4 bytes more then the process expected.

Calculate the amount of bytes to copy taking into account size of fd_set
for the current process ABI.

Diagnosed and tested by:	Peter Jeremy <peterjeremy acm org>
Reviewed by:	jhb
MFC after:	1 week
2009-09-09 20:59:01 +00:00
bz
840afe36da Make sure FreeBSD binaries without .note.ABI-tag section work
correctly and do not match a colliding Debian GNU/kFreeBSD
brandinfo statements.
For this mark the Debian GNU/kFreeBSD brandinfo that it must have
an .note.ABI-tag section and ignore the old EI_OSABI brandinfo
when comparing a possibly colliding set of options.

Due to SYSINIT we add the brandinfo in a non-deterministic order,
so native FreeBSD is not always first. We may want to consider
to force native FreeBSD to come first as well.

The only way a problem could currently be noticed is when running an
i386 binary without the .note.ABI-tag on amd64 and the Debian GNU/kFreeBSD
brandinfo  was matched first,  as the fallback to ld-elf32.so.1 does
not exist in that case.

Reported and tested by:	ticso
In collaboration with:	kib
MFC after:		3 days
2009-08-30 14:38:17 +00:00
zec
e9ba936d59 Fix a few panics in linuxulator + VIMAGE due to curvnet not being set.
This change affects only options VIMAGE builds.

Reviewed by:	julian
MFC after:	3 days
2009-08-28 22:51:07 +00:00
bz
ba7b3afabc Fix handling of .note.ABI-tag section for GNU systems [1].
Handle GNU/Linux according to LSB Core Specification 4.0,
Chapter 11. Object Format, 11.8. ABI note tag.

Also check the first word of desc, not only name, according to
glibc abi-tags specification to distinguish between Linux and
kFreeBSD.

Add explicit handling for Debian GNU/kFreeBSD, which runs
on our kernels as well [2].

In {amd64,i386}/trap.c, when checking osrel of the current process,
also check the ABI to not change the signal behaviour for Linux
binary processes, now that we save an osrel version for all three
from the lists above in struct proc [2].

These changes make it possible to run FreeBSD, Debian GNU/kFreeBSD
and Linux binaries on the same machine again for at least i386 and
amd64, and no longer break kFreeBSD which was detected as GNU(/Linux).

PR:		kern/135468
Submitted by:	dchagin [1] (initial patch)
Suggested by:	kib [2]
Tested by:	Petr Salinger (Petr.Salinger seznam.cz) for kFreeBSD
Reviewed by:	kib
MFC after:	3 days
2009-08-24 16:19:47 +00:00
rwatson
ef8d755d4d Rework global locks for interface list and index management, correcting
several critical bugs, including race conditions and lock order issues:

Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an
sxlock.  Either can be held to stablize the lists and indexes, but both
are required to write.  This allows the list to be held stable in both
network interrupt contexts and sleepable user threads across sleeping
memory allocations or device driver interactions.  As before, writes to
the interface list must occur from sleepable contexts.

Reviewed by:	bz, julian
MFC after:	3 days
2009-08-23 20:40:19 +00:00
rwatson
fb9ffed650 Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks.  Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-08-01 19:26:27 +00:00
jhb
da6cb6e20c Fix the freebsd32 versions of semsys(), shmsys(), and msgsys() to use the
old ABI versions of the relevant control system call (e.g.
freebsd7_freebsd32_msgctl() instead of freebsd32_msgctl() for msgsys()).

Approved by:	re (kib)
2009-07-27 16:03:04 +00:00
jamie
274ea197bb Some jail parameters (in particular, "ip4" and "ip6" for IP address
restrictions) were found to be inadequately described by a boolean.
Define a new parameter type with three values (disable, new, inherit)
to handle these and future cases.

Approved by:	re (kib), bz (mentor)
Discussed with:	rwatson
2009-07-25 14:48:57 +00:00
rwatson
57ca4583e7 Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator.  Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...).  This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack.  Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory.  Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy.  Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address.  When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by:  bz
Reviewed by:            bz, zec
Discussed with:         gnn, jamie, jeff, jhb, julian, sam
Suggested by:           peter
Approved by:            re (kensmith)
2009-07-14 22:48:30 +00:00
trasz
f2432e2bd7 Regen the freebsd32 parts.
Approved by:	re (kib)
2009-07-08 16:30:34 +00:00
trasz
3189216dbc Fix freebsd32 version of lpathconf(2).
Approved by:	re (kib)
2009-07-08 16:26:43 +00:00
trasz
09784497a2 There is an optimization in chmod(1), that makes it not to call chmod(2)
if the new file mode is the same as it was before; however, this
optimization must be disabled for filesystems that support NFSv4 ACLs.
Chmod uses pathconf(2) to determine whether this is the case - however,
pathconf(2) always follows symbolic links, while the 'chmod -h' doesn't.

This change adds lpathconf(3) to make it possible to solve that problem
in a clean way.

Reviewed by:	rwatson (earlier version)
Approved by:	re (kib)
2009-07-08 15:23:18 +00:00
rwatson
da78c9e4a2 Replace AUDIT_ARG() with variable argument macros with a set more more
specific macros for each audit argument type.  This makes it easier to
follow call-graphs, especially for automated analysis tools (such as
fxr).

In MFC, we should leave the existing AUDIT_ARG() macros as they may be
used by third-party kernel modules.

Suggested by:	brooks
Approved by:	re (kib)
Obtained from:	TrustedBSD Project
MFC after:	1 week
2009-06-27 13:58:44 +00:00
weongyo
9a472dd588 provides a extra write buffer when the NDIS driver want to send a
request whose body has some datas through the default pipe.

Tested by:	Nikos Vassiliadis <nvass9573 at gmx.com>
2009-06-26 01:42:41 +00:00
jhb
2908b25ed7 Regen. 2009-06-24 21:54:08 +00:00
jhb
6f52fe78fb Change the ABI of some of the structures used by the SYSV IPC API:
- The uid/cuid members of struct ipc_perm are now uid_t instead of unsigned
  short.
- The gid/cgid members of struct ipc_perm are now gid_t instead of unsigned
  short.
- The mode member of struct ipc_perm is now mode_t instead of unsigned short
  (this is merely a style bug).
- The rather dubious padding fields for ABI compat with SV/I386 have been
  removed from struct msqid_ds and struct semid_ds.
- The shm_segsz member of struct shmid_ds is now a size_t instead of an
  int.  This removes the need for the shm_bsegsz member in struct
  shmid_kernel and should allow for complete support of SYSV SHM regions
  >= 2GB.
- The shm_nattch member of struct shmid_ds is now an int instead of a
  short.
- The shm_internal member of struct shmid_ds is now gone.  The internal
  VM object pointer for SHM regions has been moved into struct
  shmid_kernel.
- The existing __semctl(), msgctl(), and shmctl() system call entries are
  now marked COMPAT7 and new versions of those system calls which support
  the new ABI are now present.
- The new system calls are assigned to the FBSD-1.1 version in libc.  The
  FBSD-1.0 symbols in libc now refer to the old COMPAT7 system calls.
- A simplistic framework for tagging system calls with compatibility
  symbol versions has been added to libc.  Version tags are added to
  system calls by adding an appropriate __sym_compat() entry to
  src/lib/libc/incldue/compat.h. [1]

PR:		kern/16195 kern/113218 bin/129855
Reviewed by:	arch@, rwatson
Discussed with:	kan, kib [1]
2009-06-24 21:10:52 +00:00
jhb
cceae54c51 Add a new COMPAT7 flag for FreeBSD 7.x compatibility system calls. 2009-06-24 13:36:37 +00:00
bz
0808d0b1a6 After cleaning up rt_tables from vnet.h and cleaning up opt_route.h
a lot of files no longer need route.h either. Garbage collect them.
While here remove now unneeded vnet.h #includes as well.
2009-06-23 17:03:45 +00:00
thompsa
30004d4d8e Fix a typeo in the frame len function to unbreak the build, make it shorter
while I am here.
2009-06-23 06:00:31 +00:00
thompsa
74c6c20b93 - Make struct usb_xfer opaque so that drivers can not access the internals
- Reduce the number of headers needed for a usb driver, the common case is just   usb.h and usbdi.h
2009-06-23 02:19:59 +00:00
jhb
e206daf142 Regen. 2009-06-22 20:24:03 +00:00
jhb
062accfe3d Fix a typo in a comment. 2009-06-22 20:12:40 +00:00
brooks
f53c1c309d Rework the credential code to support larger values of NGROUPS and
NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024
and 1023 respectively.  (Previously they were equal, but under a close
reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it
is the number of supplemental groups, not total number of groups.)

The bulk of the change consists of converting the struct ucred member
cr_groups from a static array to a pointer.  Do the equivalent in
kinfo_proc.

Introduce new interfaces crcopysafe() and crsetgroups() for duplicating
a process credential before modifying it and for setting group lists
respectively.  Both interfaces take care for the details of allocating
groups array. crsetgroups() takes care of truncating the group list
to the current maximum (NGROUPS) if necessary.  In the future,
crsetgroups() may be responsible for insuring invariants such as sorting
the supplemental groups to allow groupmember() to be implemented as a
binary search.

Because we can not change struct xucred without breaking application
ABIs, we leave it alone and introduce a new XU_NGROUPS value which is
always 16 and is to be used or NGRPS as appropriate for things such as
NFS which need to use no more than 16 groups.  When feasible, truncate
the group list rather than generating an error.

Minor changes:
  - Reduce the number of hand rolled versions of groupmember().
  - Do not assign to both cr_gid and cr_groups[0].
  - Modify ipfw to cache ucreds instead of part of their contents since
    they are immutable once referenced by more than one entity.

Submitted by:	Isilon Systems (initial implementation)
X-MFC after:	never
PR:		bin/113398 kern/133867
2009-06-19 17:10:35 +00:00
jhb
0abfb2bd6a Regen. 2009-06-17 19:53:47 +00:00
jhb
fd29528e09 - Add the ability to mix multiple flags seperated by pipe ('|') characters
in the type field of system call tables.  Specifically, one can now use
  the 'NO*' types as flags in addition to the 'COMPAT*' types.  For example,
  to tag 'COMPAT*' system calls as living in a KLD via NOSTD.  The COMPAT*
  type is required to be listed first in this case.
- Add new functions 'type()' and 'flag()' to the embedded awk script in
  makesyscalls.sh that return true if a requested flag is found in the
  type field ($3).  The flag() function checks all of the flags in the
  field, but type() only checks the first flag.  type() is meant to be
  used in the top-level "switch" statement and flag() should be used
  otherwise.
- Retire the CPT_NOA type, it is now replaced with "COMPAT|NOARGS" using
  the flags approach.
- Tweak the comment descriptions of COMPAT[46] system calls so that they
  say "freebsd[46] foo" rather than "old foo".
- Document the COMPAT6 type.
- Sync comments in compat32 syscall table with the master table.
2009-06-17 19:50:38 +00:00
bz
48dc6805f8 Add explicit includes for jail.h to the files that need them and
remove the "hidden" one from vimage.h.
2009-06-17 15:01:01 +00:00
jhb
28b41377e3 Regen. 2009-06-15 20:40:23 +00:00
jhb
447d980cd0 Add a new 'void closefrom(int lowfd)' system call. When called, it closes
any open file descriptors >= 'lowfd'.  It is largely identical to the same
function on other operating systems such as Solaris, DFly, NetBSD, and
OpenBSD.  One difference from other *BSD is that this closefrom() does not
fail with any errors.  In practice, while the manpages for NetBSD and
OpenBSD claim that they return EINTR, they ignore internal errors from
close() and never return EINTR.  DFly does return EINTR, but for the common
use case (closing fd's prior to execve()), the caller really wants all
fd's closed and returning EINTR just forces callers to call closefrom() in
a loop until it stops failing.

Note that this implementation of closefrom(2) does not make any effort to
resolve userland races with open(2) in other threads.  As such, it is not
multithread safe.

Submitted by:	rwatson (initial version)
Reviewed by:	rwatson
MFC after:	2 weeks
2009-06-15 20:38:55 +00:00
jamie
f950eed7d7 Get vnets from creds instead of threads where they're available, and from
passed threads instead of curthread.

Reviewed by:	zec, julian
Approved by:	bz (mentor)
2009-06-15 19:01:53 +00:00
thompsa
06303d491a s/usb2_/usb_|usbd_/ on all function names for the USB stack. 2009-06-15 01:02:43 +00:00
dchagin
c4e9ea4c7e Unlock process lock when return error from getrobustlist call.
Tested by:	Alexander Best <alexbestms at math uni-muenster de>
Approved by:	kib (mentor)
MFC after:	3 days
2009-06-14 17:53:55 +00:00
jamie
e9da16507b Add counterparts to getcredhostname:
getcreddomainname, getcredhostuuid, getcredhostid

Suggested by:	rmacklem
Approved by:	bz
2009-06-13 00:12:02 +00:00
kib
6e0d8393e5 Regenerate 2009-06-10 13:48:43 +00:00
kib
6013601373 Add several syscall compat32 entries for extattr manipulation syscalls,
that do not require translation of the arguments.

Requested by:	kientzle
Reviewed by:	jhb (previous wrong version)
MFC after:	1 week
2009-06-10 13:48:13 +00:00
bz
b7ff2bdc20 After r193232 rt_tables in vnet.h are no longer indirectly dependent on
the ROUTETABLES kernel option thus there is no need to include opt_route.h
anymore in all consumers of vnet.h and no longer depend on it for module
builds.

Remove the hidden include in flowtable.h as well and leave the two
explicit #includes in ip_input.c and ip_output.c.
2009-06-08 19:57:35 +00:00
thompsa
2d149b09c5 Rename usb pipes to endpoints as it better represents what they are, and struct
usb_pipe may be used for a different purpose later on.
2009-06-07 19:41:11 +00:00
rwatson
f4934662e5 Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.

Discussed with:	pjd
2009-06-05 14:55:22 +00:00
dchagin
8e9d8c289c Add forgotten in previous commit flags argument.
Approved by:	kib (mentor)
MFC after:	1 month
2009-06-01 20:54:41 +00:00
dchagin
bb8f1f3e67 Implement accept4 syscall.
Approved by:	kib (mentor)
MFC after:	1 month
2009-06-01 20:48:39 +00:00
dchagin
76d24c5be3 Implement a variation of the accept_common() which takes
a flags argument.

Do not preserve td_retval before kern_fcntl(F_SETFL) as it does not
changed.

Approved by:	kib (mentor)
MFC after:	1 month
2009-06-01 20:44:58 +00:00
dchagin
0cc88e7ca3 Split linux_accept() syscall onto linux_accept_common() which should
be used by linuxulator and linux_accept() itself.

Approved by:	kib (mentor)
MFC after:	1 month
2009-06-01 20:42:27 +00:00
rwatson
b850b85534 Regenerate generated syscall files following changes to struct sysent in
r193234.
2009-06-01 16:14:38 +00:00
dchagin
6fb0275352 Implement a variation of the socketpair() syscall which takes a flags
in addition to the type argument.

Approved by:	kib (mentor)
MFC after:	1 month
2009-05-31 12:16:31 +00:00
dchagin
ab797d42e4 Move new socket flags handling into a separate function as Linux
introduced more syscalls which uses these flags.

Approved by:	kib (mentor)
MFC after:	1 month
2009-05-31 12:04:01 +00:00
dchagin
fbb545b684 Remove empty lines.
Approved by:	kib (mentor)
MFC after:	1 month
2009-05-31 12:00:16 +00:00
delphij
7059dd02fa Attempt to fix build by updating hostid to follow the new world order. 2009-05-30 07:33:32 +00:00
jamie
572db1408a Place hostnames and similar information fully under the prison system.
The system hostname is now stored in prison0, and the global variable
"hostname" has been removed, as has the hostname_mtx mutex.  Jails may
have their own host information, or they may inherit it from the
parent/system.  The proper way to read the hostname is via
getcredhostname(), which will copy either the hostname associated with
the passed cred, or the system hostname if you pass NULL.  The system
hostname can still be accessed directly (and without locking) at
prison0.pr_host, but that should be avoided where possible.

The "similar information" referred to is domainname, hostid, and
hostuuid, which have also become prison parameters and had their
associated global variables removed.

Approved by:	bz (mentor)
2009-05-29 21:27:12 +00:00
thompsa
44c17bdf07 s/usb2_/usb_/ on all typedefs for the USB stack. 2009-05-29 18:46:57 +00:00
delphij
fa743d1903 Implement SI_ISALIST.
PR:		kern/91293
Submitted by:	"Pedro f. Giffuni" <giffunip asme org>
Obtained from:	NetBSD
2009-05-29 06:27:30 +00:00
delphij
10c7a24940 Fix the sysinfo(SI_HW_SERIAL, emulation so that we actually get the
hostid of the machine rather than always getting "0".

PR:		kern/91293
Submitted by:	"Pedro f. Giffuni" <giffunip asme org>
Obtained from:	NetBSD
2009-05-29 06:19:37 +00:00
delphij
0339972dd9 copyinstr(9) takes parameter 'len' as a size_t *, not int *.
PR:		kern/91293
Submitted by:	"Pedro f. Giffuni" <giffunip asme org>
Obtained from:	NetBSD
2009-05-29 06:04:26 +00:00
delphij
5abf6d12a5 de-register.
Submitted by:	"Pedro f. Giffuni" <giffunip asme org>
Obtained from:	NetBSD
PR:		kern/91293
2009-05-29 05:58:46 +00:00
delphij
88adf607ea svr4_sys_getdents64() should not assume that the cookie would exist
everywhere.

PR:		kern/91293
Submitted by:	"Pedro f. Giffuni" <giffunip asme org>
Obtained from:	NetBSD
2009-05-29 05:51:19 +00:00
delphij
2095d11d4e Add new sysconfig bits, Fix the bogus numbering of the old bits.
Submitted by:	"Pedro f. Giffuni" <giffunip asme org>
Obtained from:	NetBSD
PR:		kern/91293
2009-05-29 05:37:27 +00:00
delphij
9c4052a028 Use strlcpy(). 2009-05-28 21:12:43 +00:00
thompsa
af6fb4f3d2 s/usb2_/usb_/ on all C structs for the USB stack. 2009-05-28 17:36:36 +00:00
avg
8466b56c6c linux_ioctl_cdrom: reduce stack usage
... by moving two ~2KB structures from stack to heap allocation.
I experienced stack overflow in linux emulation on i386 (8K stack)
when LINUX_DVD_READ_STRUCT ioctl was performed on atapicam cd
device and there was an error that resulted in additional quite
heavy stack use in cam layer.

Reviewed by:	dchagin
Approved by:	jhb (mentor)
2009-05-27 15:23:12 +00:00
jamie
a013e0afcb Add hierarchical jails. A jail may further virtualize its environment
by creating a child jail, which is visible to that jail and to any
parent jails.  Child jails may be restricted more than their parents,
but never less.  Jail names reflect this hierarchy, being MIB-style
dot-separated strings.

Every thread now points to a jail, the default being prison0, which
contains information about the physical system.  Prison0's root
directory is the same as rootvnode; its hostname is the same as the
global hostname, and its securelevel replaces the global securelevel.
Note that the variable "securelevel" has actually gone away, which
should not cause any problems for code that properly uses
securelevel_gt() and securelevel_ge().

Some jail-related permissions that were kept in global variables and
set via sysctls are now per-jail settings.  The sysctls still exist for
backward compatibility, used only by the now-deprecated jail(2) system
call.

Approved by:	bz (mentor)
2009-05-27 14:11:23 +00:00
antoine
29949828e6 Remove an unused variable. 2009-05-24 18:35:53 +00:00
jhb
27f9129d79 Comment nits. 2009-05-20 18:36:17 +00:00
jhb
bc9abffcd5 Put the vnode returned from namei() immediately after namei() returns in
svr4_sys_resolvepath().
2009-05-20 18:25:16 +00:00
dchagin
56c9819821 Validate user-supplied arguments values.
Args argument is a pointer to the structure located in user space in
which the socketcall arguments are packed. The structure must be
copied to the kernel instead of direct dereferencing.

Approved by:	kib (mentor)
MFC after:	1 week
2009-05-19 09:10:53 +00:00
dchagin
7316b5296a Implement MSG_CMSG_CLOEXEC flag for linux_recvmsg().
Approved by:	kib (mentor)
MFC after:	1 month
2009-05-18 04:07:46 +00:00
dchagin
5351e06699 Somewhere between 2.6.23 and 2.6.27, Linux added SOCK_CLOEXEC and
SOCK_NONBLOCK flags, that allow to save fcntl() calls.

Implement a variation of the socket() syscall which takes a flags
in addition to the type argument.

Approved by:	kib (mentor)
MFC after:	1 month
2009-05-16 18:48:41 +00:00
dchagin
a0c026b20b Return EINVAL in case when the incorrect or unsupported
type argument is specified.

Do not map type argument value as its Linux values are
identical to FreeBSD values.

Approved by:	kib (mentor)
2009-05-16 18:46:51 +00:00
dchagin
eae11e9cce Use the protocol family constants for the domain argument validation.
Return immediately when the socket() failed.

Approved by:	kib (mentor)
MFC after:	1 month
2009-05-16 18:44:56 +00:00
dchagin
bc4e3c1f6d Emulate SO_PEERCRED socket option.
Temporarily use 0 for pid member as the FreeBSD does not cache remote
UNIX domain socket peer pid.

PR:		kern/102956
Reviewed by:	rwatson
Approved by:	kib (mentor)
MFC after:	1 month
2009-05-16 18:42:18 +00:00
brueffer
e420c3ccc5 Remove an unused variable.
Found with:	Coverity Prevent(tm)
CID:		1167
2009-05-14 09:28:02 +00:00
brueffer
7850ca4b03 Fix memory leak in an error case.
Found with:	Coverity Prevent(tm)
CID:		371
MFC after:	2 weeks
2009-05-13 08:50:13 +00:00
dchagin
ebcb202672 Translate l_timeval arg to native struct timeval in
linux_setsockopt()/linux_getsockopt() for SO_RCVTIMEO,
SO_SNDTIMEO opts as l_timeval has MD members.

Remove bogus __packed attribute from l_timeval struct on __amd64__.

PR:		kern/134276
Submitted by:	Thomas Mueller <tmueller sysgo com>
Approved by:	kib (mentor)
MFC after:	2 weeks
2009-05-11 13:50:42 +00:00
dchagin
4f4faf9d43 Add forgotten linux to bsd flags argument mapping into the linux_recv().
PR:		kern/134276
Submitted by:	Thomas Mueller <tmueller sysgo com>
Approved by:	kib (mentor)
MFC after:	2 weeks
2009-05-11 13:42:40 +00:00
dchagin
51f122997d Do not export AT_CLKTCK when emulating Linux kernel prior
to 2.4.0, as it has appeared in the 2.4.0-rc7 first time.
Being exported, AT_CLKTCK is returned by sysconf(_SC_CLK_TCK),
glibc falls back to the hard-coded CLK_TCK value when aux entry
is not present.

Glibc versions prior to 2.2.1 always use hard-coded CLK_TCK value.

For older applications/libc's which depends on hard-coded CLK_TCK
value user should set compat.linux.osrelease less than 2.4.0.

Approved by:	kib (mentor)
2009-05-10 18:43:43 +00:00
dchagin
e4e6bf246f Introduce linux_kernver() interface which is intended for an exact
designation of the emulated kernel version.

linux_kernver() returns integer value formatted as 'VVVMMMIII' where
VVV - version, MMM - major revision, III - minor revision.

Approved by:	kib (mentor)
2009-05-10 18:27:20 +00:00
dchagin
ab5a6b0d18 Rework r189362, r191883.
The frequency of the statistics clock is given by stathz.
Use stathz if it is available, otherwise use hz.

Pointed out by:	bde

Approved by:	kib (mentor)
2009-05-10 18:16:07 +00:00