Commit Graph

1480 Commits

Author SHA1 Message Date
Ruslan Ermilov
f42326c579 Regen. 2006-11-03 21:23:33 +00:00
Ruslan Ermilov
0b160a7d2b Fix build breakage introduced in previous commit (redeclatation
of sctp functions).
2006-11-03 21:21:28 +00:00
Randall Stewart
af99851047 This commits the remake in kern/ make sysent to get
the correct syscalls.master's $FreeBSD$ tag record and
a make sysent in sys/compat/freebsd32. Thanks Ruslan
for pointing out the steps I missed :-0
Approved by:	gnn
2006-11-03 18:57:49 +00:00
Randall Stewart
f8829a4a40 Ok, here it is, we finally add SCTP to current. Note that this
work is not just mine, but it is also the works of Peter Lei
and Michael Tuexen. They both are my two key other developers
working on the project.. and they need ata-boy's too:
****
peterlei@cisco.com
tuexen@fh-muenster.de
****
I did do a make sysent which updated the
syscall's and sysproto.. I hope that is correct... without
it you don't build since we have new syscalls for SCTP :-0

So go out and look at the NOTES, add
option SCTP (make sure inet and inet6 are present too)
and play with SCTP.

I will see about comitting some test tools I have after I
figure out where I should place them. I also have a
lib (libsctp.a) that adds some of the missing socketapi
functions that I need to put into lib's.. I will talk
to George about this :-)

There may still be some 64 bit issues in here, none of
us have a 64 bit processor to test with yet.. Michael
may have a MAC but thats another beast too..

If you have a mac and want to use SCTP contact Michael
he maintains a web site with a loadable module with
this code :-)

Reviewed by:	gnn
Approved by:	gnn
2006-11-03 15:23:16 +00:00
Alexander Leidinger
3680a41902 Backout the linux aio stuff. Several problems where identified and the
dynamic nature (if no native aio code is available, the linux part
returns ENOSYS because of missing requisites) should be solved differently
than it is.

All this will be done in P4.

Not included in this commit is a backout of the changes to the native aio
code (removing static in some places). Those changes (and some more) will
also be needed when the reworked linux aio stuff will reenter the tree.

Requested by:	rwatson
Discussed with:	rwatson
2006-10-29 14:02:39 +00:00
Alexander Leidinger
e3e6449247 style(9)
Noticed by:	rwatson
2006-10-29 09:50:55 +00:00
Alexander Leidinger
c4ce314b40 Fix style(9).
Noticed by:	rwatson
2006-10-28 16:47:38 +00:00
Alexander Leidinger
955d762aca MFP4:
Implement prctl().

Submitted by:	rdivacky
Tested with:	LTP
2006-10-28 10:59:59 +00:00
Maxim Sobolev
016b81e405 Regen. 2006-10-24 17:25:36 +00:00
Maxim Sobolev
ef16706d34 Fix kernel breakage introduced in the previous commit (redeclatation
of the audit functions).
2006-10-24 17:24:11 +00:00
Robert Watson
c71bf4bf63 Regenerate. 2006-10-24 13:54:56 +00:00
Robert Watson
a1dce47980 Hook up audit functions in the freebsd32 compatibility code. It is
believed these likely don't require wrappers.

Reported by:	sobomax
MFC after:	3 days
2006-10-24 13:49:44 +00:00
Robert Watson
aed5570872 Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h.  sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA
2006-10-22 11:52:19 +00:00
David Xu
034b26fc65 Regenerate. 2006-10-17 02:28:58 +00:00
David Xu
3f9223b65d Sync with master. 2006-10-17 02:28:26 +00:00
Alexander Leidinger
6474221698 Fix compile (use the right variable name). 2006-10-15 14:34:03 +00:00
Alexander Leidinger
6a1162d4cd MFP4 (with some minor changes):
Implement the linux_io_* syscalls (AIO). They are only enabled if the native
AIO code is available (either compiled in to the kernel or as a module) at
the time the functions are used. If the AIO stuff is not available there
will be a ENOSYS.

From the submitter:
---snip---
DESIGN NOTES:

1. Linux permits a process to own multiple AIO queues (distinguished by
   "context"), but FreeBSD creates only one single AIO queue per process.
   My code maintains a request queue (STAILQ of queue(3)) per "context",
   and throws all AIO requests of all contexts owned by a process into
   the single FreeBSD per-process AIO queue.

   When the process calls io_destroy(2), io_getevents(2), io_submit(2) and
   io_cancel(2), my code can pick out requests owned by the specified context
   from the single FreeBSD per-process AIO queue according to the per-context
   request queues maintained by my code.

2. The request queue maintained by my code stores contrast information between
   Linux IO control blocks (struct linux_iocb) and FreeBSD IO control blocks
   (struct aiocb). FreeBSD IO control block actually exists in userland memory
   space, required by FreeBSD native aio_XXXXXX(2).

3. It is quite troubling that the function io_getevents() of libaio-0.3.105
   needs to use Linux-specific "struct aio_ring", which is a partial mirror
   of context in user space. I would rather take the address of context in
   kernel as the context ID, but the io_getevents() of libaio forces me to
   take the address of the "ring" in user space as the context ID.

   To my surprise, one comment line in the file "io_getevents.c" of
   libaio-0.3.105 reads:

             Ben will hate me for this

REFERENCE:

1. Linux kernel source code:   http://www.kernel.org/pub/linux/kernel/v2.6/
   (include/linux/aio_abi.h, fs/aio.c)

2. Linux manual pages:         http://www.kernel.org/pub/linux/docs/manpages/
   (io_setup(2), io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2))

3. Linux Scalability Effort:   http://lse.sourceforge.net/io/aio.html
   The design notes:           http://lse.sourceforge.net/io/aionotes.txt

4. The package libaio, both source and binary:
       http://rpmfind.net/linux/rpm2html/search.php?query=libaio
   Simple transparent interface to Linux AIO system calls.

5. Libaio-oracle:              http://oss.oracle.com/projects/libaio-oracle/
   POSIX AIO implementation based on Linux AIO system calls (depending on
   libaio).
---snip---

Submitted by:	Li, Xiao <intron@intron.ac>
2006-10-15 14:22:14 +00:00
Alexander Leidinger
687c23be1d MFP4 (107868 - 107870):
Use a macro to test for a valid signal instead of doing it my hand everywhere.

Submitted by:	rdivacky
2006-10-15 12:51:43 +00:00
Giorgos Keramidas
050f8bb67d Spell proc/sys/kernel/pid_max correctly in a comment.
Submitted by:	rdivacky
2006-10-11 20:32:46 +00:00
John Baldwin
8528552b0d Don't pass unused bufsz to kern_shmctl(). 2006-10-10 22:46:50 +00:00
John Baldwin
f3ea244ea9 Only try to copyin a msqid for the IPC_SET command to msgctl(). Other
commands (such as IPC_RMID) were bogusly failing with EFAULT.

Tested by:	jkim
2006-10-10 22:46:22 +00:00
John Baldwin
7f4c1dd0d6 Remove unnecessary casts before PTRIN(). 2006-10-10 22:44:59 +00:00
Alexander Leidinger
28638377ad - change if (cond) panic() to KASSERT.
- Dont forget to free em in a case of error.

Suggested by:	ssouhlal
Submitted by:	rdivacky
Tested with:	LTP
2006-10-08 17:10:34 +00:00
Alexander Leidinger
7660ace19c - Replace homegrown check for FIFO with S_ISFIFO. [1]
- Check the status of the options before messing with it.

Inspired by:	NetBSD [1]
Submitted by:	rdivacky
Tested with:	LTP
2006-10-08 17:08:27 +00:00
Alexander Leidinger
236e97b2b2 Implement /proc/sys/kernel/pid_max.
Submitted by:	rdivacky
Tested with:	LTP
2006-10-08 16:55:27 +00:00
David Xu
295426f4c5 Regenerate. 2006-10-06 08:24:37 +00:00
David Xu
ae7d8a6766 Implement 32bit umtx_lock and umtx_unlock system calls, these two system
calls are not used by libthr in RELENG_6 and HEAD, it is only used by
the libthr in RELENG-5, the _umtx_op system call can do more incremental
dirty works than these two system calls without having to introduce new
system calls or throw away old system calls when things are going on.
2006-10-06 08:22:08 +00:00
David Xu
312a0e5f06 Regenerate. 2006-10-05 01:58:57 +00:00
David Xu
e6e7f16cb4 Oops, add the missing file. 2006-10-05 01:58:08 +00:00
David Xu
c6511aea86 Move some declaration of 32-bit signal structures into file
freebsd32-signal.h, implement sigtimedwait and sigwaitinfo system calls.
2006-10-05 01:56:11 +00:00
Robert Watson
531147aa3e Regenerate. 2006-10-03 20:48:11 +00:00
Robert Watson
dfb041ca62 Change getpagesize() system call audit event to more clearly indicate
that we don't audit it.

MFC after:	3 days
Obtained from:	TrustedBSD Project
2006-10-03 20:48:03 +00:00
Poul-Henning Kamp
f645b0b51c First part of a little cleanup in the calendar/timezone/RTC handling.
Move relevant variables to <sys/clock.h> and fix #includes as necessary.

Use libkern's much more time- & spamce-efficient BCD routines.
2006-10-02 12:59:59 +00:00
Alexander Leidinger
d4b7423fa1 MFp4:
- Linux returns ENOPROTOOPT in a case of not supported opt to setsockopt.
- Return EISDIR in pread() when arg is a directory.
- Return EINVAL instead of EFAULT when namelen is not correct in accept().
- Return EINVAL instead of EACCESS if invalid access mode is entered in
  access().
- Return EINVAL instead of EADDRNOTAVAIL in a case of bad salen param
  to bind().

Submitted by:	rdivacky
Tested with:	LTP (vfork01 fails now, but it seems to be a race and
		not caused by those changes)
MFC after:	1 week
2006-09-23 19:06:54 +00:00
David Xu
4af4fcb71a Regenerate. 2006-09-23 00:27:53 +00:00
David Xu
5c26f4cea8 Enable sigwait. 2006-09-23 00:27:11 +00:00
David Xu
ac3674aa52 Regenerate. 2006-09-22 15:05:34 +00:00
David Xu
cda9a0d1c2 Add compatible code to let 32bit libthr work on 64bit kernel. 2006-09-22 15:04:28 +00:00
David Xu
27bbb2e71f Regenerate. 2006-09-22 00:53:43 +00:00
David Xu
1eec02f538 Add umtx support for 32bit process on AMD64 machine. 2006-09-22 00:52:54 +00:00
David Xu
ecc313475b Regenerate. 2006-09-21 04:50:38 +00:00
David Xu
47bd78d24d sync with master. 2006-09-21 04:49:36 +00:00
Robert Watson
da7cbdc2b3 Regenerate. 2006-09-17 13:29:36 +00:00
Robert Watson
6c2d307a0e AUE_SIGALTSTACK instead of AUE_SIGPENDING for sigaltstack().
Obtained from:	TrustedBSD Project
MFC after:	3 days
2006-09-17 13:28:11 +00:00
Alexander Leidinger
18f81b3dfa - don't reboot() when feed with wrong parameters (and enough permissions) [1]
- add support to power off the system [2]
- check the linux magic values [3]

Submitted by:	Marcin Cieslak <saper@SYSTEM.PL> [1,2]
Modelled after:	linux man page of the reboot() syscall [3]
Found by:	LTP testcase "reboot02" [1]
Tested with:	LTP testcase "reboot02" [1,3]
MFC after:	1 week
2006-09-16 14:12:04 +00:00
Alexander Leidinger
db0d964062 The Linux unlink syscall uses a different errno value when trying to unlink
a directory.

PR:		102897 [1]
Noticed by:	Knut Anders Hatlen <kahatlen@gmail.com>, testrun with LTP [1]
Submitted by:	Marcin Cieslak <saper@SYSTEM.PL>
Tested by:	netchild (LTP test run)
2006-09-10 13:47:56 +00:00
Alexander Leidinger
8618fd85a3 - Extend the coverage of PROC_LOCK to cover wakeup(&p->p_emuldata);
- Lock the emuldata in a case when we just created it.

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Suggested by:	jhb
2006-09-09 16:55:55 +00:00
Alexander Leidinger
bb59e63f8f Change futex lock from mutex to sx. Make futex_get atomic (protected by the
futex lock).

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Suggested by:	jhb
2006-09-09 16:25:25 +00:00
Alexander Leidinger
c19ddeda07 - don't wake every sleeper just the first one [1]
- remove debuging printf			[2]

Submitted by:	intron <mag@intron.ac> [1], rdivacky [2]
2006-09-09 13:04:28 +00:00
David Xu
c0ba6c1783 The following functions need not to be reimplemented, reuse 64bit
syscalls instead:
sigqueue, thr_set_name, thr_setscheduler, thr_getscheduler,
thr_setschedparam.
2006-09-09 01:22:13 +00:00
Robert Watson
e482025ebd Regenerate. 2006-09-03 16:24:36 +00:00
Robert Watson
e8a6d7e554 Set freebsd32 system call event identifiers for:
- old truncate, ftruncate
- old getpeername, gethostid, sethostid, getrlimit, setrlimit, killpg.
- old quota, getsockname, getdirentries.
- lgetfh
- old getdomainname, setdomainname
- sysarch, rtprio, __getcwd, jail, sigtimedwait
- extattrctl, extattr_{get,set,delete,list}_{file,fd,link}
- getresgid, getresuid, kqueue, eaccess, nmount, sendfile
- fhstatfs, kldunloadf

Right identifiers for:

- nfssvc

Remove incorrect identifier for:

- __acl_get_file

Compile tested with help of:	sam
Obtained from:	TrustedBSD Project
2006-09-03 16:17:49 +00:00
Robert Watson
8075da7e8b Regenerate. Looks like someone missed doing this previously as more than
just the audit event change appears in the diff.
2006-09-03 13:47:52 +00:00
Robert Watson
1b25e5f3c4 Use AUE_NTP_ADJTIME instead of AUE_ADJTIME for ntp_adjtime().
Obtained from:	TrustedBSD Project
2006-09-03 13:47:24 +00:00
Robert Watson
0ee913128d Remove two hypothetical calls to suser() in ifdef'd (and uncompilable)
svr4 code: this code would call centralized sysctl code that does
these checks also.

MFC after:	1 week
Obtained from:	TrustedBSD Project
Sponsored by:	nCircle Network Security, Inc.
2006-09-02 08:18:22 +00:00
Suleiman Souhlal
c67e0cc9e7 FREE -> free
Submitted by:	rdivacky
2006-08-28 13:52:27 +00:00
Alexander Leidinger
835e506190 Add the linux statfs64 call. This allows Tivoli backup to proceed a little
but further on -current (still not successful, but a step into the right
direction).

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Tested by:	Paul Mather <paul@gromit.dlib.vt.edu>
2006-08-27 08:56:54 +00:00
Alexander Leidinger
84ed9f91d8 Correct the number of retries in a futex_wake() call.
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
2006-08-26 10:36:16 +00:00
Robert Watson
3e8df637c0 Don't call suser_cred() directly from linux_sethostname(), as it just
wraps userland_sysctl(), which performs necessary privilege checks as
part of its normal operation.

MFC after:	1 week
2006-08-25 11:02:42 +00:00
Alexander Leidinger
1a28c0df09 Sync the MI parts for amd64 with i386 and remove the corresponding special
handling for amd64 in the common code. The MD parts for amd64 are still
outstanding, but at least this fixes some panics on amd64.

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Tested by:	bsam
2006-08-20 13:50:27 +00:00
Alexander Leidinger
29ddc19bbf Get rid of some nested includes.
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	jhb
2006-08-19 15:13:01 +00:00
Suleiman Souhlal
5342db0872 MALLOC -> malloc and FREE -> free
Submitted by:	rdivacky
Pointed out by:	jhb
2006-08-19 11:54:19 +00:00
Suleiman Souhlal
b273d5aa72 ifdef DEBUG a printf
Submitted by:	rdivacky
2006-08-19 11:07:22 +00:00
Warner Losh
1a3c917f9d while (0); -> while (0) in multi-line macros 2006-08-17 22:50:33 +00:00
Alexander Leidinger
590e3a06e8 - disable some more code when osrelease=2.4.2
- protect td->td_proc->p_pid with the proc lock in linux_getpid
  in the amd64 (= non i386) case [1]

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	netchild [1]
2006-08-17 21:21:30 +00:00
Alexander Leidinger
94cb2ecf79 Move some stuff into headers where they belong.
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	jhb, ssouhlal
2006-08-17 21:06:48 +00:00
Alexander Leidinger
9b0bcbfbda Fix the DEBUG build:
- linux_emul.c [1]
 - linux_futex.c [2]

Sponsored by:	Google SoC 2006	[1]
Submitted by:	rdivacky	[1]
		netchild	[2]
2006-08-17 09:50:30 +00:00
Peter Wemm
bad9a7a5f9 Grab two syscall numbers. One is used to emulate functionality that linux
has in its procfs (do a readlink of /proc/self/fd/<nn> to find the pathname
that corresponds to a given file descriptor).  Valgrind-3.x needs this
functionality.  This is a placeholder only at this time.
2006-08-16 22:32:50 +00:00
Alexander Leidinger
0eef2f8a4e Style fixes to comments.
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	jhb, ssouhlal
2006-08-16 18:54:51 +00:00
Jung-uk Kim
a88d050dfc Include sys/limits.h for INT_MAX. freebsd32_proto.h 1.58 does not include
sys/umtx.h any more and previously it was included from there.
2006-08-16 00:02:36 +00:00
John Baldwin
f8f1f7fb85 Regen to propogate <prefix>_AUE_<mumble> changes as well as the earlier
systrace changes.
2006-08-15 17:37:01 +00:00
John Baldwin
df78f6d313 - Remove unused sysvec variables from various syscalls.conf.
- Send the systrace_args files for all the compat ABIs to /dev/null for
  now.  Right now makesyscalls.sh generates a file with a hardcoded
  function name, so it wouldn't work for any of the ABIs anyway.  Probably
  the function name should be configurable via a 'systracename' variable
  and the functions should be stored in a function pointer in the sysvec
  structure.
2006-08-15 17:25:55 +00:00
Alexander Leidinger
a43eeaabe4 Disable some parts of the code on amd64 for now to prevent a panic. A better
fix will come later.

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
2006-08-15 15:15:17 +00:00
Alexander Leidinger
9b44bfc556 Add the linux 2.6.x stuff (not used by default!):
- TLS - complete
 - pid/tid mangling - complete
 - thread area - complete
 - futexes - complete with issues
 - clone() extension - complete with some possible minor issues
 - mq*/timer*/clock* stuff - complete but untested and the mq* stuff is
   disabled when not build as part of the kernel with native FreeBSD mq*
   support (module support for this will come later)

Tested with:
 - linux-firefox - works, tested
 - linux-opera - works, tested
 - linux-realplay - doesnt work, issue with futexes
 - linux-skype - doesnt work, issue with futexes
 - linux-rt2-demo - works, tested
 - linux-acroread - doesnt work, unknown reason (coredump) and sometimes
   issue with futexes
 - various unix utilities in linux-base-gentoo3 and linux-base-fc4:
   everything tried worked

On amd64 not everything is supported like on i386, the catchup is planned for
later when the remaining bugs in the new functions are fixed.

To test this new stuff, you have to run
	sysctl compat.linux.osrelease=2.6.16
to switch back use
	sysctl compat.linux.osrelease=2.4.2

Don't switch while running a linux program, strange things may or may not
happen.

Sponsored by:			Google SoC 2006
Submitted by:			rdivacky
Some suggestions/help by:	jhb, kib, manu@NetBSD.org, netchild
2006-08-15 12:54:30 +00:00
Alexander Leidinger
ad2056f2c4 Add some new files needed for linux 2.6.x compatibility.
Please don't style(9) the NetBSD code, we want to stay in sync. Not imported
on a vendor branch since we need local changes.

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
With help from:	manu@NetBSD.org
Obtained from:	NetBSD (linux_{futex,time}.*)
2006-08-15 12:20:59 +00:00
Konstantin Belousov
1565bf54af Lock the vnode around the call to VOP_GETATTR. Move the locked code
and vn_fullpath (that call malloc(..., M_WAITOK)) from under the
vm object lock, since sleep is not allowed while holding the mutex.

Being there, wrap VOP_GETATTR call with conditional Giant aquire.
Currently this is (almost) noop because pseudofs is Giant-locked.

Tested by:	kris
Approved by:	pjd (mentor)
MFC after:	2 weeks
2006-08-08 12:29:26 +00:00
Robert Watson
3d0685834f With socket code no longer in svr4_stream.c, MAC includes are no longer
required, so GC.
2006-08-05 22:04:21 +00:00
Brooks Davis
012759b743 Use TAILQ_EMPTY instead of checking if TAILQ_FIRST is NULL. 2006-08-04 21:15:09 +00:00
John Baldwin
91ce2694d1 Regen for MPSAFE flag removal. 2006-07-28 19:08:37 +00:00
John Baldwin
af5bf12239 Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to
mark system calls as being MPSAFE:
- Stop conditionally acquiring Giant around system call invocations.
- Remove all of the 'M' prefixes from the master system call files.
- Remove support for the 'M' prefix from the script that generates the
  syscall-related files from the master system call files.
- Don't explicitly set SYF_MPSAFE when registering nfssvc.
2006-07-28 19:05:28 +00:00
John Baldwin
e0b4add8d8 Various fixes to comments in the syscall master files including removing
cruft from the audit import and adding mention of COMPAT4 to freebsd32.
2006-07-28 18:55:18 +00:00
John Baldwin
78371ec202 Regen. 2006-07-28 16:56:44 +00:00
John Baldwin
95e7d19dfa - Explicitly lock Giant to protect the fields in the svr4_strm structure
except for s_family (which is read-only once after it is set when the
  structure is created).
- Mark svr4_sys_ioctl(), svr4_sys_getmsg(), and svr4_sys_putmsg() MPSAFE.
2006-07-28 16:56:17 +00:00
John Baldwin
f30e89ced3 Fix a file descriptor race I reintroduced when I split accept1() up into
kern_accept() and accept1().  If another thread closed the new file
descriptor and the first thread later got an error trying to copyout the
socket address, then it would attempt to close the wrong file object.  To
fix, add a struct file ** argument to kern_accept().  If it is non-NULL,
then on success kern_accept() will store a pointer to the new file object
there and not release any of the references.  It is up to the calling code
to drop the references appropriately (including a call to fdclose() in case
of error to safely handle the aforementioned race).  While I'm at it, go
ahead and fix the svr4 streams code to not leak the accept fd if it gets an
error trying to copyout the streams structures.
2006-07-27 19:54:41 +00:00
John Baldwin
3ce72960e8 Regen. 2006-07-21 20:41:33 +00:00
John Baldwin
e0569c0798 Clean up the svr4 socket cache and streams code some to make it more easily
locked.
- Move all the svr4 socket cache code into svr4_socket.c, specifically
  move svr4_delete_socket() over from streams.c.  Make the socket cache
  entry structure and svr4_head private to svr4_socket.c as a result.
- Add a mutex to protect the svr4 socket cache.
- Change svr4_find_socket() to copy the sockaddr_un struct into a
  caller-supplied sockaddr_un rather than giving the caller a pointer to
  our internal one.  This removes the one case where code outside of
  svr4_socket.c could access data in the cache.
- Add an eventhandler for process_exit and process_exec to purge the cache
  of any entries for the exiting or execing process.
- Add methods to init and destroy the socket cache and call them from the
  svr4 ABI module's event handler.
- Conditionally grab Giant around socreate() in streamsopen().
- Use fdclose() instead of inlining it in streamsopen() when handling
  socreate() failure.
- Only allocate a stream structure and attach it to a socket in
  streamsopen().  Previously, if a svr4 program performed a stream
  operation on an arbitrary socket not opened via the streams device,
  we would attach streams state data to it and change f_ops of the
  associated struct file while it was in use.  The latter was especially
  not safe, and if a program wants a stream object it should open it via
  the streams device anyway.
- Don't bother locking so_emuldata in the streams code now that we only
  touch it right after creating a socket (in streamsopen()) or when
  tearing it down when the file is closed.
- Remove D_NEEDGIANT from the streams device as it is no longer needed.
2006-07-21 20:40:13 +00:00
John Baldwin
52d639a953 Add conditional VFS Giant locking to svr4_sys_fchroot() and mark it MPSAFE.
Also, call change_dir() instead of doing part of it inline (this now adds
a mac_check_vnode_chdir() call) to match fchdir() and call
mac_check_vnode_chroot() to match chroot().  Also, use the change_root()
function to do the actual change root to match chroot().

Reviewed by:	rwatson
2006-07-21 20:28:56 +00:00
John Baldwin
b4c63329d5 - Pass the MPSAFE flag to namei() in linux_uselib() and handle conditional
Giant VFS locking in that function.
- Remove bogus code to handle the case where namei() returns success but a
  NULL vnode pointer.
- Note that this code duplicates exec_check_permissions() and annotate
  where it differs.
- Hold the vnode lock longer to protect the write to set VV_TEXT in
  v_vflag.
- Mark linux_uselib() MPSAFE.

Reviewed by:	rwatson
2006-07-21 20:22:13 +00:00
John Baldwin
7cf6a457ea Regen. 2006-07-19 19:03:21 +00:00
John Baldwin
a1ca3e0ba7 Add conditional VFS Giant locking to svr4_sys_resolvepath() and mark it
MPSAFE.
2006-07-19 19:03:03 +00:00
John Baldwin
a3616b117c Make svr4_sys_waitsys() a lot less ugly and mark it MPSAFE.
- If the WNOWAIT flag isn't specified and either of WEXITED or WTRAPPED is
  set, then just call kern_wait() and let it do all the work.  This means
  that this function no longer has to duplicate the work to teardown
  zombies that is done in kern_wait().  Instead, if the above conditions
  aren't true, then it uses a simpler loop to implement WNOWAIT and/or
  tracing for only stopped or continued processes.  This function still
  has to duplicate code from kern_wait() for the latter two cases, but
  those are much simpler.
- Sync the code to handle the WCONTINUED and WSTOPPED cases with the
  equivalent code in kern_wait().
- Fix several places that would return with the proctree lock still held.
- Lock the current process to prevent lost wakeup races when blocking.
2006-07-19 19:01:10 +00:00
John Baldwin
b33887ea31 Don't free the sockaddr in kern_bind() and kern_connect() as not all
callers pass a sockaddr allocated via malloc() from M_SONAME anymore.
Instead, free it in the callers when necessary.
2006-07-19 18:28:52 +00:00
John Baldwin
a02f5c6204 Initialize svr4_head during MOD_LOAD rather than on demand. 2006-07-19 18:26:09 +00:00
David Xu
2df96d8e02 sync with master. 2006-07-14 01:57:09 +00:00
John Baldwin
90aff9de2d Regen. 2006-07-11 20:55:23 +00:00
John Baldwin
be5747d5b5 - Add conditional VFS Giant locking to getdents_common() (linux ABIs),
ibcs2_getdents(), ibcs2_read(), ogetdirentries(), svr4_sys_getdents(),
  and svr4_sys_getdents64() similar to that in getdirentries().
- Mark ibcs2_getdents(), ibcs2_read(), linux_getdents(), linux_getdents64(),
  linux_readdir(), ogetdirentries(), svr4_sys_getdents(), and
  svr4_sys_getdents64() MPSAFE.
2006-07-11 20:52:08 +00:00
John Baldwin
c870740e09 - Split out kern_accept(), kern_getpeername(), and kern_getsockname() for
use by ABI emulators.
- Alter the interface of kern_recvit() somewhat.  Specifically, go ahead
  and hard code UIO_USERSPACE in the uio as that's what all the callers
  specify.  In place, add a new uioseg to indicate what type of pointer
  is in mp->msg_name.  Previously it was always a userland address, but
  ABI emulators may pass in kernel-side sockaddrs.  Also, remove the
  namelenp field and instead require the two places that used it to
  explicitly copy mp->msg_namelen out to userland.
- Use the patched kern_recvit() to replace svr4_recvit() and the stock
  kern_sendit() to replace svr4_sendit().
- Use kern_bind() instead of stackgap use in ti_bind().
- Use kern_getpeername() and kern_getsockname() instead of stackgap in
  svr4_stream_ti_ioctl().
- Use kern_connect() instead of stackgap in svr4_do_putmsg().
- Use kern_getpeername() and kern_accept() instead of stackgap in
  svr4_do_getmsg().
- Retire the stackgap from SVR4 compat as it is no longer used.
2006-07-10 21:38:17 +00:00
John Baldwin
acdd09f944 Unexpand PTRIN() in several places and fix one instance where 0 was being
used instead of NULL.
2006-07-10 19:37:43 +00:00
John Baldwin
c1cccebe8b Add a kern_close() so that the ABIs can close a file descriptor w/o having
to populate a close_args struct and change some of the places that do.
2006-07-08 20:03:39 +00:00
John Baldwin
b1ee5b654d Rework kern_semctl a bit to always assume the UIO_SYSSPACE case. This
mostly consists of pushing a few copyin's and copyout's up into
__semctl() as all the other callers were already doing the UIO_SYSSPACE
case.  This also changes kern_semctl() to set the return value in a passed
in pointer to a register_t rather than td->td_retval[0] directly so that
callers can only set td->td_retval[0] if all the various copyout's succeed.

As a result of these changes, kern_semctl() no longer does copyin/copyout
(except for GETALL/SETALL) so simplify the locking to acquire the semakptr
mutex before the MAC check and hold it all the way until the end of the
big switch statement.  The GETALL/SETALL cases have to temporarily drop it
while they do copyin/malloc and copyout.  Also, simplify the SETALL case to
remove handling for a non-existent race condition.
2006-07-08 19:51:38 +00:00
John Baldwin
ad6d226d43 - Protect the list of linux ioctl handlers with an sx lock.
- Hold Giant while calling linux ioctl handlers for now as they aren't all
  known to be MPSAFE yet.
- Mark linux_ioctl() MPSAFE.
2006-07-06 21:42:36 +00:00
John Baldwin
d699b1ce00 Don't try to copyin extra data for IPC_RMID requests to msgctl() or
shmctl().  None of the other ABI's do this (including the native FreeBSD
ABI), and uselessly trying to do a copyin() can actually result in a
bogus EFAULT if the a process specifies NULL for the optional argument
(which is what they should do in this case).
2006-07-06 21:38:24 +00:00
Mark Murray
93c005929f Housekeeping. Update for maintainers who have handed in their commit bits
or (in my case) no longer feel that oversight is necessary.
2006-07-01 10:51:55 +00:00
Alexander Leidinger
550be19e16 Improve linprovfs to provide/fix the
- process state (idle, sleeping, running, ...) [1]
 - the process group ID of the process which owns the connected tty
 - some page fault stats
 - time spend in kernel/userland
 - priority/nice value
 - starttime [1]
 - memory/swap stats
 - scheduling policy

Additionally add some new fields and correct some not filled out ones.

This brings us down to 15 dummy fields.

The fields marked with [1] are needed to get Oracle 10 running. The starttime
field is not completely right, since it displays the _same_ starttime for
_every_ process, but at least it is not 0 and Oracle accepts this.

This is a RELENG_x_y candidate.

Noticed by:	Dmitry Ganenko <dima@apk-inform.com> [1]
Reviewed by:	des, rdivacky
MFC after:	1 week
2006-06-27 20:11:58 +00:00
John Baldwin
cec34dbf79 Regen. 2006-06-27 18:32:16 +00:00
John Baldwin
bb639715d4 Use kern_shmctl() in svr4_sys_shmctl() and drop use of the stackgap. Mark
svr4_sys_shmctl() MPSAFE.
2006-06-27 18:31:36 +00:00
John Baldwin
4db580972e Axe the stackgap macros as the Linux ABIs no longer use the stackgap. 2006-06-27 18:30:49 +00:00
John Baldwin
49d409a108 - Add a kern_semctl() helper function for __semctl(). It accepts a pointer
to a copied-in copy of the 'union semun' and a uioseg to indicate which
  memory space the 'buf' pointer of the union points to.  This is then used
  in linux_semctl() and svr4_sys_semctl() to eliminate use of the stackgap.
- Mark linux_ipc() and svr4_sys_semsys() MPSAFE.
2006-06-27 18:28:50 +00:00
John Baldwin
0cceebeeb2 Regen. 2006-06-27 14:47:08 +00:00
John Baldwin
597d608f86 - Expand the scope of Giant some in mount(2) to protect the vfsp structure
from going away.  mount(2) is now MPSAFE.
- Expand the scope of Giant some in unmount(2) to protect the mp structure
  (or rather, to handle concurrent unmount races) from going away.
  umount(2) is now MPSAFE, as well as linux_umount() and linux_oldumount().
- nmount(2) and linux_mount() were already MPSAFE.
2006-06-27 14:46:31 +00:00
John Baldwin
b820787fb3 Regen. 2006-06-26 18:37:36 +00:00
John Baldwin
b0f6106af9 Change svr4_sys_break() to just call obreak() and mark it MPSAFE.
Not objected to by:	alc
2006-06-26 18:36:57 +00:00
John Baldwin
04a8728231 - Sync with master: rmdir(), mkdir(), and extattr_*() are all MPSAFE.
- freebsd32_utimes() is MPSAFE.
2006-06-26 18:35:57 +00:00
Alexander Leidinger
555f86b8b6 The linux times syscall can be called with a NULL pointer, so keep cool
and don't panic.

This fix is different from the patch submitted as it not only prevents
a NULL-pointer dereference, but also skips some work in this case.

Noticed by:	Dmitry Ganenko <dima@apk-inform.com>
Reviewed by:	rdivacky (the original version as in emulation@)
MFC after:	1 week
Security:	This is a RELENG_x_y candidate (local DoS).
Go ahead by:	secteam (cperciva)
2006-06-23 18:49:38 +00:00
Diomidis Spinellis
462da4d616 Move conditional preprocessing out of the SYSCTL_ADD_STRING macro
invocation.  Per C99 6.10.3 paragraph 11 preprocessing directives
appearing inside macro arguments yield undefined behavior.
2006-06-22 13:11:36 +00:00
John Baldwin
62d615d508 Conditionally acquire Giant around VFS operations. 2006-06-20 21:31:38 +00:00
John Baldwin
932151064a - Add a new linker_file_foreach() function that walks the list of linker
file objects calling a user-specified predicate function on each object.
  The iteration terminates either when the entire list has been iterated
  over or the predicate function returns a non-zero value.
  linker_file_foreach() returns the value returned by the last invocation
  of the predicate function.  It also accepts a void * context pointer that
  is passed to the predicate function as well.  Using an iterator function
  avoids exposing linker internals to the rest of the kernel making locking
  simpler.
- Use linker_file_foreach() instead of walking the list of linker files
  manually to lookup ndis files in ndis(4).
- Use linker_file_foreach() to implement linker_hwpmc_list_objects().
2006-06-20 20:37:17 +00:00
John Baldwin
a6e25132d4 Forcefully turn off GPROF in this file if it is enabled as GPROF's
attempt to use a macro for 'ret' doesn't play well with the wrappers
trying to implement 'Pascal-style' calling conventions.
2006-06-12 20:35:59 +00:00
Dag-Erling Smørgrav
5ef57544fc Add the model name, obtained from the hw.model sysctl variable.
MFC after:	3 weeks
2006-06-12 18:14:49 +00:00
Paul Saab
e8b62ee79e Do not copy out the iovec in the 32bit recvmsg call since soreceive
calls uiomove directly.

Reviewed by:	ups
MFC after:	1 week
2006-06-08 18:33:08 +00:00
Dag-Erling Smørgrav
b19bfd3db5 As far as I can tell, the correct CPU family for amd64 (which Linux calls
x86_64) is 15, not 6.

MFC after:	3 weeks
2006-06-02 13:01:25 +00:00
Doug Ambrisko
edb75eca27 Fix file leaking in translate_path_major_minor. 2006-05-16 17:57:00 +00:00
Poul-Henning Kamp
c40da00ca3 Since DELAY() was moved, most <machine/clock.h> #includes have been
unnecessary.
2006-05-16 14:37:58 +00:00
John Baldwin
73dbd3da73 Remove various bits of conditional Alpha code and fixup a few comments. 2006-05-12 05:04:46 +00:00
Doug Ambrisko
0b1c233427 Remove the dependency on procfs since it isn't used.
Noticed by:	des
2006-05-11 15:27:58 +00:00
Alexander Leidinger
01e0ffbae8 Now that we don't have a linuxolator on alpha anymore:
- unifdef __alpha__
 - revert rev. 1.66 of linux_socket.c
2006-05-10 20:38:16 +00:00
Alexander Leidinger
17138b619c Implement rt_sigpending in the linuxolator.
PR:		92671
Submitted by:	Markus Niemist"o <markus.niemisto@gmx.net>
2006-05-10 18:17:29 +00:00
Doug Ambrisko
32397ce071 Add in linsysfs. A linux 2.6 like sys filesystem to pacify the Linux
LSI MegaRAID SAS utility.

Sponsored by:		IronPort Systems
Man page help from:	brueffer
2006-05-09 22:27:01 +00:00
Doug Ambrisko
03487601c2 Fix the the duplicate cut-n-paste in linux_fstat64 pointed out by
Alexander Leidinger.  I forget to fix it in this version.
2006-05-05 16:17:59 +00:00
Doug Ambrisko
060e488247 Enhance the Linux emulation layer to make MegaRAID SAS managements tool happy.
Add back in a scheme to emulate old type major/minor numbers via hooks into
stat, linprocfs to return major/minors that Linux app's expect.  Currently
only /dev/null is always registered.  Drivers can register via the Linux
type shim similar to the ioctl shim but by using
linux_device_register_handler/linux_device_unregister_handler functions.
The structure is:

    struct linux_device_handler {
        char    *bsd_driver_name;
        char    *linux_driver_name;
        char    *bsd_device_name;
        char    *linux_device_name;
        int     linux_major;
        int     linux_minor;
        int     linux_char_device;
    };

Linprocfs uses this to display the major number of the driver.  The
soon to be available linsysfs will use it to fill in the driver name.
Linux_stat uses it to translate the major/minor into Linux type values.

Note major numbers are dynamically assigned via passing in a -1 for
the major number so we don't need to keep track of them.

This is somewhat needed due to us switching to our devfs.  MegaCli
will not run until I add in the linsysfs and mfi Linux compat changes.

Sponsored by:	IronPort Systems
2006-05-05 16:10:45 +00:00
Robert Watson
f7f45ac8e2 Annotate uses of fgetsock() with indications that they should rely
on their existing file descriptor references to sockets, rather than
use fgetsock() to retrieve a direct socket reference.

MFC after:	3 months
2006-04-01 15:25:01 +00:00
Paul Saab
74f7258fb7 regen for 32bit System V shared memory 2006-03-30 07:43:01 +00:00
Paul Saab
fbb273bc05 Properly support for FreeBSD 4 32bit System V shared memory.
Submitted by:	peter
Obtained from:	Yahoo!
MFC after:	3 weeks
2006-03-30 07:42:32 +00:00
Tai-hwa Liang
d9d46ed258 Unbreaking build by removing a now unused variable. 2006-03-27 23:27:11 +00:00
John Baldwin
b77619bd7f Use td_ucred rather than p_ucred to avoid panics and general unhappiness.
Pointy hat to:	netchild
2006-03-27 19:16:31 +00:00
Alexander Leidinger
1daa386fcf Fix the LINT build on alpha:
- rename some file local structure definitions, the names clash with
  autogenerated names
- on !alpha add some compatibility defines for those renamed structures
- make some functions globally visible on alpha
2006-03-21 21:56:04 +00:00
Alexander Leidinger
61da9d97fb Fix tinderbox on alpha.
Tested by:	cross-compile
2006-03-20 19:46:56 +00:00
Ruslan Ermilov
aefce619cf Unbreak COMPAT_LINUX32 option support on amd64.
Broken by:	netchild
2006-03-19 11:10:33 +00:00
Alexander Leidinger
d4a3f5ddb6 Fixup some problems in my previous commit (COMPAT_43).
Pointyhat to:	netchild
2006-03-18 20:47:36 +00:00
Alexander Leidinger
5c8919adf4 Get rid of the need of COMPAT_43 in the linuxolator.
Submitted by:	Divacky Roman <xdivac02@stud.fit.vutbr.cz>
Obtained from:	DragonFly (some parts)
2006-03-18 18:20:17 +00:00
Stephan Uphoff
68ff3c2445 Fix exec_map resource leaks.
Tested by: kris@
2006-03-08 20:21:54 +00:00
Paul Saab
6308f39da8 use strlcpy in cvtstatfs and copy_statfs instead of bcopy to ensure
the copied strings are properly terminated.

bzero the statfs32 struct in copy_statfs.
2006-03-04 00:09:09 +00:00
Paul Saab
26e4fb05dc regen for 32bit sendfile 2006-02-28 19:39:52 +00:00
Paul Saab
fa545f434c Fix 32bit sendfile by implementing kern_sendfile so that it takes
the header and trailers as iovec arguments instead of copying them
in inside of sendfile.

Reviewed by:	jhb
MFC after:	3 weeks
2006-02-28 19:39:18 +00:00
John Baldwin
8917b8d28c - Always call exec_free_args() in kern_execve() instead of doing it in all
the callers if the exec either succeeds or fails early.
- Move the code to call exit1() if the exec fails after the vmspace is
  gone to the bottom of kern_execve() to cut down on some code duplication.
2006-02-06 22:06:54 +00:00
Jeff Roberson
c4be19469a - Remove ifdef disabled code that doesn't have a chance of working anymore. 2006-02-06 10:10:42 +00:00
Robert Watson
ef572cf5bb Regenerate. 2006-02-04 13:29:09 +00:00
Robert Watson
2b8d08f814 Audit FreeBSD 32-bit system calls on 64-bit FreeBSD systems.
Obtained from:	TrustedBSD Project
2006-02-04 13:28:55 +00:00
Jeff Roberson
d6791f7615 - vn_lock with LK_RETRY can not return an error. The code that handled this
case was not necessary.

Sponsored by:	Isilon Systems, Inc.
2006-01-30 08:22:56 +00:00
Olivier Houchard
d425dbec89 Fix a typo : deivce => device
Spotted by:	rwatson
2006-01-26 21:48:50 +00:00
Olivier Houchard
e83d253beb Linux compat bits needed to make linux programs use the new ptys :
linux_ioctl.[ch] : Implement LINUX_TIOCGPTN, which returns the pty number
linux_stats.c :
	- Return the magic number for devfs.
	- In various stats()-related functions, check that we're stating a
file in /dev/pts, and if so, change the st_rdev field to match what linux
expects to be there for a slave pty device. The glibc checks for this, and
their openpty() fails if it is no correct.
2006-01-26 01:32:46 +00:00
Doug Ambrisko
f06b864361 Fix the build. When I added the lutimes the futimes definitions
went away in the generated files?  This didn't happen on my amd64
test machine but did when I committed it on my other i386 machine.
I need to figure this out since a regen on the amd64 doesn't fix it
now.  For now make the build work again.  Matt caught this before
my local mirror caught up.
2006-01-20 20:51:27 +00:00
Doug Ambrisko
cac2fa646c Regen. 2006-01-20 16:22:37 +00:00
Doug Ambrisko
08a3081da8 Add 32bit version of lutimes so untar doesn't mess up sym-links on amd64. 2006-01-20 16:22:06 +00:00
Tom Rhodes
0e36e11d57 Cast tv_sec to intmax_t and print with %jd in some ifdef'ed code. 2005-12-28 07:08:54 +00:00
Gleb Smirnoff
3c6160327d Add \n to log() message.
Submitted by:	Stanislaw Halik <weirdo tehran.lain.pl>
2005-12-27 00:17:11 +00:00
Maxim Sobolev
900b28f9f6 Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure
with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually
allow executing elf dynamic binaries (aka shared libraries). When it is
requested to execute ET_DYN elf image check if this flag is on after we
know the elf brand allowing execution if so.

PR:		kern/87615
Submitted by:	Marcin Koziej <creep@desk.pl>
2005-12-26 21:23:57 +00:00
Ruslan Ermilov
bebb4536ce Regen. 2005-12-23 20:06:50 +00:00
Ruslan Ermilov
c647318411 Fix build. 2005-12-23 20:06:14 +00:00
Poul-Henning Kamp
25f6e35a05 Regenerate sysent with new abort2 system call.
Implement abort2(const char *reason, int narg, void **args);

Submitted by:	"Wojciech A. Koszek" <dunstan@freebsd.czest.pl>
2005-12-23 11:58:42 +00:00
Poul-Henning Kamp
fe322ece24 Add missing 455-462 syscalls as unimplemented 2005-12-23 11:56:39 +00:00
Poul-Henning Kamp
5a56b437ec Add abort2() systemcall. 2005-12-23 11:54:11 +00:00
John Baldwin
410d857972 Remove linux_mib_destroy() (which I actually added in between 5.0 and 5.1)
which existed to cleanup the linux_osname mutex.  Now that MTX_SYSINIT()
has grown a SYSUNINIT to destroy mutexes on unload, the extra destroy here
was redundant and resulted in panics in debug kernels.

MFC after:	1 week
Reported by:	Goran Gajic ggajic at afrodita dot rcub dot bg dot ac dot yu
2005-12-15 16:30:41 +00:00
Xin LI
1278dd6847 In Linux, kernel parameters passed to ioctl are by value, while in FreeBSD
they are passed by reference.  Handle the difference within the
linux_ioctl_termio on the LINUX_TCFLSH path.

Submitted by:	Jaroslav Drzik <jaro_AT_coop-voz_dot_sk>
2005-12-13 15:32:52 +00:00
Max Laier
2694019753 Fix calculation of meminfo's swaptotal and swapfree on at least amd64.
MFC after:	3 days
2005-12-11 21:37:42 +00:00
Doug Ambrisko
204634a652 Regen for futimes. 2005-12-08 22:15:09 +00:00
Doug Ambrisko
8e7604db06 Add 32bit version of futimes so untar doesn't result in bad dates
(Jan 1, 1970) when run on amd64.

Reviewed by:	ps
2005-12-08 22:14:25 +00:00
Gleb Smirnoff
7a14354549 Suppress logging about unimplemented syscalls to one time per process. This
prevents hard flood of the system console.

Reviewed by:	bde
2005-12-08 13:33:57 +00:00
Peter Wemm
79880f7327 Catch up to the system siginfo changes. Use a union for the ia32 layout
of siginfo just like the system one.  There are now two fields to copy
instead of one.
2005-12-06 23:06:29 +00:00
Ruslan Ermilov
f4e9888107 Fix -Wundef. 2005-12-04 02:12:43 +00:00
Craig Rodrigues
2207c7648e Remove MNT_NODEV mount option. In RELENG_6, MNT_NODEV was a no-op.
The presence of MNT_NODEV was confusing the am-utils autoconf scripts.

PR:	conf/79715
2005-11-29 00:28:17 +00:00
Bill Paul
f1b78ee016 Somehow memmove() got mapped to memset() in the patch table. Create a
real memmove() implementation and use that instead.
2005-11-23 17:10:46 +00:00
Bill Paul
78edd540cf Correct the API for Windows interupt handling a little. The prototype
for a Windows ISR is 'BOOLEAN isrfunc(KINTERRUPT *, void *)' meaning
the ISR get a pointer to the interrupt object and a context pointer,
and returns TRUE if the ISR determines the interrupt was really generated
by the associated device, or FALSE if not.

I had mistakenly used 'void isrfunc(void *)' instead. It happens the
only thing this affects is the internal ndis_intr() ISR in subr_ndis.c,
but it should be fixed just in case we ever need to register a real
Windows ISR vi IoConnectInterrupt().

For NDIS miniports that provide a MiniportISR() method, the 'is_our_intr'
value returned by the method serves as the return value from ndis_isr(),
and 'call_isr' is used to decide whether or not to schedule the interrupt
handler via DPC. For drivers that only supply MiniportEnableInterrupt()
and MiniportDisableInterrupt() methods, call_isr is always TRUE and
is_our_intr is always FALSE.

In the end, there should be no functional changes, except that now
ntoskrnl_intr() can terminate early once it finds the ISR that wants
to service the interrupt.
2005-11-20 01:29:29 +00:00
Ruslan Ermilov
f95871b97b Unlike the rest of the world, NDIS code can access "struct
ifnet" before is has been fully initialized by if_attach().
Account for that to avoid a null pointer dereference.
2005-11-14 18:19:57 +00:00
Bill Paul
86a8393963 Restore backwards source compatibility with 6.x and 5.x. 2005-11-13 21:36:48 +00:00
Ruslan Ermilov
4a0d6638b3 - Store pointer to the link-level address right in "struct ifnet"
rather than in ifindex_table[]; all (except one) accesses are
  through ifp anyway.  IF_LLADDR() works faster, and all (except
  one) ifaddr_byindex() users were converted to use ifp->if_addr.

- Stop storing a (pointer to) Ethernet address in "struct arpcom",
  and drop the IFP2ENADDR() macro; all users have been converted
  to use IF_LLADDR() instead.
2005-11-11 16:04:59 +00:00
Bill Paul
e73e17729b Implement RtlZeroMemory() and RtlCopyMemory(). This seems to allow
the Broadcom Win64 wireless driver for the BCM4318 to work on amd64.
2005-11-10 02:22:55 +00:00
Bill Paul
65983e40e1 Change the definition for EXT_NDIS to EXT_NET_DRV. Since the latest
mbuf code changes, MEXTADD() can be used to add an external buffer with
arbitrary type, but mb_ext_free() won't let you free it.
2005-11-07 16:57:14 +00:00
Bill Paul
b5b548a6bc The latest version of the Intel 2200BG/2915ABG driver (9.0.0.3-9) from
Intel's web site requires some minor tweaks to get it to work:

- The driver seems to have been released with full WMI tracing enabled,
  and makes references to some WMI APIs, namely IoWMIRegistrationControl(),
  WmiQueryTraceInformation() and WmiTraceMessage(). Only the first
  one is ever called (during intialization). These have been implemented
  as do-nothing stubs for now. Also added a definition for STATUS_NOT_FOUND
  to ntoskrnl_var.h, which is used as a return code for one of the WMI
  routines.

- The driver references KeRaiseIrqlToDpcLevel() and KeLowerIrql()
  (the latter as a function, which is unusual because normally
  KeLowerIrql() is a macro in the Windows DDK that calls KfLowewIrql()).
  I'm not sure why these are being called since they're not really
  part of WDM. Presumeably they're being used for backwards
  compatibility with old versions of Windows. These have been
  implemented in subr_hal.c. (Note that they're _stdcall routines
  instead of _fastcall.)

- When querying the OID_802_11_BSSID_LIST OID to get a BSSID list,
  you don't know ahead of time how many networks the NIC has found
  during scanning, so you're allowed to pass 0 as the list length.
  This should cause the driver to return an 'insufficient resources'
  error and set the length to indicate how many bytes are actually
  needed. However for some reason, the Intel driver does not honor
  this convention: if you give it a length of 0, it returns some
  other error and doesn't tell you how much space is really needed.
  To get around this, if using a length of 0 yields anything besides
  the expected error case, we arbitrarily assume a length of 64K.
  This is similar to the hack that wpa_supplicant uses when doing
  a BSSID list query.
2005-11-06 19:38:34 +00:00
Paul Saab
506df56c79 Copy out the number of iovecs in freebsd32_recvmsg, not the length
of a single iovec.
2005-11-06 18:12:43 +00:00
Paul Saab
1471f287e1 Calling setrlimit from 32bit apps could potentially increase certain
limits beyond what should be capiable in a 32bit process, so we
must fixup the limits.

Reviewed by:	jhb
2005-11-02 21:18:07 +00:00
Bill Paul
a91395a9d0 Tests with my dual Opteron system have shown that it's possible
for code to start out on one CPU when thunking into Windows
mode in ctxsw_utow(), and then be pre-empted and migrated to another
CPU before thunking back to UNIX mode in ctxsw_wtou(). This is
bad, because then we can end up looking at the wrong 'thread environment
block' when trying to come back to UNIX mode. To avoid this, we now
pin ourselves to the current CPU when thunking into Windows code.

Few other cleanups, since I'm here:

- Get rid of the ndis_isr(), ndis_enable_interrupt() and
  ndis_disable_interrupt() wrappers from kern_ndis.c and just invoke
  the miniport's methods directly in the interrupt handling routines
  in subr_ndis.c. We may as well lose the function call overhead,
  since we don't need to export these things outside of ndis.ko
  now anyway.

- Remove call to ndis_enable_interrupt() from ndis_init() in if_ndis.c.
  We don't need to do it there anyway (the miniport init routine handles
  it, if needed).

- Fix the logic in NdisWriteErrorLogEntry() a little.

- Change some NDIS_STATUS_xxx codes in subr_ntoskrnl.c into STATUS_xxx
  codes.

- Handle kthread_create() failure correctly in PsCreateSystemThread().
2005-11-02 18:01:04 +00:00
Andre Oppermann
34333b16cd Retire MT_HEADER mbuf type and change its users to use MT_DATA.
Having an additional MT_HEADER mbuf type is superfluous and redundant
as nothing depends on it.  It only adds a layer of confusion.  The
distinction between header mbuf's and data mbuf's is solely done
through the m->m_flags M_PKTHDR flag.

Non-native code is not changed in this commit.  For compatibility
MT_HEADER is mapped to MT_DATA.

Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-11-02 13:46:32 +00:00
Bill Paul
fde84c1850 Clean up one remaining 'multiple DPC thread' bogon: only bzero() one
sizeof(kq_queue), not sizeof(kq_queue) * mp_ncpus.
2005-11-01 09:24:35 +00:00
Paul Saab
ecc44de7a2 Reformat socket control messages on input/output for 32bit compatibility
on 64bit systems.

Submitted by:	ps, ups
Reviewed by:	jhb
2005-10-31 21:09:56 +00:00
Peter Wemm
946bca4fcd Regenerate (with the correct #ifdef COMPAT_43 tests now) 2005-10-26 22:21:03 +00:00
Peter Wemm
767dfc44be There is no 'freebsd3_' prefix for COMPAT_43 syscalls. Those are all
bundled under MCOMPAT and have an 'o' prefix.  Adjust as appropriate.
This re-enables compiling without COMPAT_43 again.
2005-10-26 22:19:51 +00:00
Bill Paul
4cf9a535a8 Minor nit: in ntoskrnl_finddev(), only free the 'children' device_t
array if device_find_children() actually returned a non-NULL array pointer.
2005-10-26 20:21:45 +00:00
Bill Paul
51d6d0952b Clean up and apply the fix for PR 83477. The calculation for locating
the start of the section headers has to take into account the fact
that the image_nt_header is really variable sized. It happens that
the existing calculation is correct for _most_ production binaries
produced by the Windows DDK, but if we get a binary with oddball
offsets, the PE loader could crash.

Changes from the supplied patch are:

- We don't really need to use the IMAGE_SIZEOF_NT_HEADER() macro when
  computing how much of the header to return to callers of
  pe_get_optional_header(). While it's important to take the variable
  size of the header into account in other calculations, we never
  actually look at anything outside the non-variable portion of the
  header. This saves callers from having to allocate a variable sized
  buffer off the heap (I purposely tried to avoid using malloc()
  in subr_pe.c to make it easier to compile in both the -D_KERNEL and
  !-D_KERNEL case), and since we're copying into a buffer on the
  stack, we always have to copy the same amount of data or else
  we'll trash the stack something fierce.

- We need <stddef.h> to get offsetof() in the !-D_KERNEL case.

- ndiscvt.c needs the IMAGE_FIRST_SECTION() macro too, since it does
  a little bit of section pre-processing.

PR: kern/83477
2005-10-26 18:46:27 +00:00
Bill Paul
7f3cc43211 Get rid of the timer tracking and reaping code in NdisMInitializeTimer()
and ndis_halt_nic(). It's been disabled for some time anyway, and
it turns out there's a possible deadlock in NdisMInitializeTimer() when
acquiring the miniport block lock to modify the timer list: it's
possible for a driver to call NdisMInitializeTimer() when the miniport
block lock has already been acquired by an earlier piece of code. You
can't acquire the same spinlock twice, so this can deadlock.

Also, implement MmMapIoSpace() and MmUnmapIoSpace(), and make
NdisMMapIoSpace() and NdisMUnmapIoSpace() use them. There are some
drivers that want MmMapIoSpace() and MmUnmapIoSpace() so that they can
map arbitrary register spaces not directly associated with their
device resources. For example, there's an Atheros driver for
a miniPci card (0x168C:0x1014) on the IBM Thinkpad x40 that wants
to map some I/O spaces at 0xF00000 and 0xE00000 which are held by
the acpi0 device. I don't know what it wants these ranges for,
but if it can't map and access them, the MiniportInitialize() method
fails.
2005-10-26 06:52:57 +00:00
Bill Paul
4ba4b2c45c Fix handling of message table messages that got broken when I
converted NdisWriteErrorLogEntry() to use the RtlXXX unicode/ansi
conversion routines.
2005-10-24 05:05:09 +00:00
David E. O'Brien
2f3e5b2f15 Add a 'clean' target. 2005-10-23 23:58:23 +00:00
Paul Saab
90168b92f2 regen 2005-10-23 10:43:39 +00:00
Paul Saab
e7abd4a000 Implement for FreeBSD 3 32 binaries:
sigaction, sigprocmask, sigpending, sigvec, sigblock, sigsetmask,
sigsuspend, sigstack
2005-10-23 10:43:14 +00:00
Bill Paul
a50286e21d Make the multiple DPC threads an option, and create only one by default.
This avoids the need for sched_bind() in the default case so that you
can start up the NDIS subsystem at boot time when only CPU 0 is running.

There are potentially ways to fix it so that the DPC threads aren't
started until after the other CPUs are launched, but doing it correctly
is tricky. You need to defer the startup of the ntoskrnl subsystem
(ntoskrnl_libinit()), not just defer ndis_attach().

For now, I don't think it will make much difference having just the
single DPC thread (I started out with just one anyway). Note that this
turns the KeSetTargetProcessorDpc() routine into a no-op, since the
CPU number in struct kdpc is now ignored.
2005-10-22 05:15:20 +00:00
Bill Paul
87ff20ed78 Correct the macro definition for KeRaiseIrql(). The official API
is KeRaiseIrql(newirql, &oldirql), not oldirql = KeRaiseIrql(newirql).
(The macro ultimately translates to KfRaiseIrql() which does use
the latter API, so this has no effect on generated code.)

Also, wait for thread termination the right way: kthread_exit()
will ultimately do a wakeup(td->td_proc). This is the event we
should wait on. Eliminate the previous synchronization machinery
for this since it was never guaranteed to work correctly.
2005-10-21 05:23:20 +00:00
Bill Paul
1e956d87e1 Use sched_bind() to make sure the DPC threads are bound to the correct
processor, to insure DPC thread 0 runs on CPU0, DPC thread 1 runs on
CPU1, and so on.

Elevate the priority of the workitem threads, though don't use as
high a priority as the DPC threads.
2005-10-20 17:45:58 +00:00
David Xu
3e1c732ffa Fix compiling problem by adding prefix name svr4 to si_xxx macro, the
si_xxx macro should not be used in compat headers, as these are standard
member names or only can be used in our native header file signal.h.
2005-10-19 09:33:15 +00:00
Bill Paul
a3ced67adf Another round of cleanups and fixes:
- Change ndis_return() from a DPC to a workitem so that it doesn't
  run at DISPATCH_LEVEL (with the dispatcher lock held).

- In if_ndis.c, submit packets to the stack via (*ifp->if_input)() in
  a workitem instead of doing it directly in ndis_rxeof(), because
  ndis_rxeof() runs in a DPC, and hence at DISPATCH_LEVEL. This
  implies that the 'dispatch level' mutex for the current CPU is
  being held, and we don't want to call if_input while holding
  any locks.

- Reimplement IoConnectInterrupt()/IoDisconnectInterrupt(). The original
  approach I used to track down the interrupt resource (by scanning
  the device tree starting at the nexus) is prone to problems when
  two devices share an interrupt. (E.g removing ndis1 might disable
  interrupts for ndis0.) The new approach is to multiplex all the
  NDIS interrupts through a common internal dispatcher (ntoskrnl_intr())
  and allow IoConnectInterrupt()/IoDisconnectInterrupt() to add or
  remove interrupts from the dispatch list.

- Implement KeAcquireInterruptSpinLock() and KeReleaseInterruptSpinLock().

- Change the DPC and workitem threads to use the KeXXXSpinLock
  API instead of mtx_lock_spin()/mtx_unlock_spin().

- Simplify the NdisXXXPacket routines by creating an actual
  packet pool structure and using the InterlockedSList routines
  to manage the packet queue.

- Only honor the value returned by OID_GEN_MAXIMUM_SEND_PACKETS
  for serialized drivers. For deserialized drivers, we now create
  a packet array of 64 entries. (The Microsoft DDK documentation
  says that for deserialized miniports, OID_GEN_MAXIMUM_SEND_PACKETS
  is ignored, and the driver for the Marvell 8335 chip, which is
  a deserialized miniport, returns 1 when queried.)

- Clean up timer handling in subr_ntoskrnl.

- Add the following conditional debugging code:
	NTOSKRNL_DEBUG_TIMERS - add debugging and stats for timers
	NDIS_DEBUG_PACKETS - add extra sanity checking for NdisXXXPacket API
	NTOSKRNL_DEBUG_SPINLOCKS - add test for spinning too long

- In kern_ndis.c, always start the HAL first and shut it down last,
  since Windows spinlocks depend on it. Ntoskrnl should similarly be
  started second and shut down next to last.
2005-10-18 19:52:15 +00:00
Paul Saab
15857ef5ea regen after recvmsg, recvfrom, sendmsg 2005-10-15 05:57:34 +00:00
Paul Saab
a372f8224c Implement the 32bit versions of recvmsg, recvfrom, sendmsg
Partially obtained from:	jhb
2005-10-15 05:57:06 +00:00
Paul Saab
fd151bb940 regen for clock_gettime, clock_settime, clock_getres 2005-10-15 02:54:39 +00:00
Paul Saab
f0b479cd75 Implement 32bit wrappers for clock_gettime, clock_settime, and
clock_getres.
2005-10-15 02:54:18 +00:00
Paul Saab
145f7e60da regen 2005-10-15 02:40:34 +00:00
Paul Saab
d5c7796115 Correct the prototype for freebsd32_nanosleep and use the proper
size when copying struct timespec32 in and out.
2005-10-15 02:40:10 +00:00
David Xu
9104847f21 1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most
changes in MD code are trivial, before this change, trapsignal and
   sendsig use discrete parameters, now they uses member fields of
   ksiginfo_t structure. For sendsig, this change allows us to pass
   POSIX realtime signal value to user code.

2. Remove cpu_thread_siginfo, it is no longer needed because we now always
   generate ksiginfo_t data and feed it to libpthread.

3. Add p_sigqueue to proc structure to hold shared signals which were
   blocked by all threads in the proc.

4. Add td_sigqueue to thread structure to hold all signals delivered to
   thread.

5. i386 and amd64 now return POSIX standard si_code, other arches will
   be fixed.

6. In this sigqueue implementation, pending signal set is kept as before,
   an extra siginfo list holds additional siginfo_t data for signals.
   kernel code uses psignal() still behavior as before, it won't be failed
   even under memory pressure, only exception is when deleting a signal,
   we should call sigqueue_delete to remove signal from sigqueue but
   not SIGDELSET. Current there is no kernel code will deliver a signal
   with additional data, so kernel should be as stable as before,
   a ksiginfo can carry more information, for example, allow signal to
   be delivered but throw away siginfo data if memory is not enough.
   SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
   not be caught or masked.
   The sigqueue() syscall allows user code to queue a signal to target
   process, if resource is unavailable, EAGAIN will be returned as
   specification said.
   Just before thread exits, signal queue memory will be freed by
   sigqueue_flush.
   Current, all signals are allowed to be queued, not only realtime signals.

Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
2005-10-14 12:43:47 +00:00
Bill Paul
85c13a8375 Convert ndis_set_info() and ndis_get_info() from using msleep()
to KeSetEvent()/KeWaitForSingleObject(). Also make object argument
of KeWaitForSingleObject() a void * like it's supposed to be.
2005-10-12 03:02:50 +00:00
Bill Paul
21628ddbd6 This commit makes a big round of updates and fixes many, many things.
First and most importantly, I threw out the thread priority-twiddling
implementation of KeRaiseIrql()/KeLowerIrq()/KeGetCurrentIrql() in
favor of a new scheme that uses sleep mutexes. The old scheme was
really very naughty and sought to provide the same behavior as
Windows spinlocks (i.e. blocking pre-emption) but in a way that
wouldn't raise the ire of WITNESS. The new scheme represents
'DISPATCH_LEVEL' as the acquisition of a per-cpu sleep mutex. If
a thread on cpu0 acquires the 'dispatcher mutex,' it will block
any other thread on the same processor that tries to acquire it,
in effect only allowing one thread on the processor to be at
'DISPATCH_LEVEL' at any given time. It can then do the 'atomic sit
and spin' routine on the spinlock variable itself. If a thread on
cpu1 wants to acquire the same spinlock, it acquires the 'dispatcher
mutex' for cpu1 and then it too does an atomic sit and spin to try
acquiring the spinlock.

Unlike real spinlocks, this does not disable pre-emption of all
threads on the CPU, but it does put any threads involved with
the NDISulator to sleep, which is just as good for our purposes.

This means I can now play nice with WITNESS, and I can safely do
things like call malloc() when I'm at 'DISPATCH_LEVEL,' which
you're allowed to do in Windows.

Next, I completely re-wrote most of the event/timer/mutex handling
and wait code. KeWaitForSingleObject() and KeWaitForMultipleObjects()
have been re-written to use condition variables instead of msleep().
This allows us to use the Windows convention whereby thread A can
tell thread B "wake up with a boosted priority." (With msleep(), you
instead have thread B saying "when I get woken up, I'll use this
priority here," and thread A can't tell it to do otherwise.) The
new KeWaitForMultipleObjects() has been better tested and better
duplicates the semantics of its Windows counterpart.

I also overhauled the IoQueueWorkItem() API and underlying code.
Like KeInsertQueueDpc(), IoQueueWorkItem() must insure that the
same work item isn't put on the queue twice. ExQueueWorkItem(),
which in my implementation is built on top of IoQueueWorkItem(),
was also modified to perform a similar test.

I renamed the doubly-linked list macros to give them the same names
as their Windows counterparts and fixed RemoveListTail() and
RemoveListHead() so they properly return the removed item.

I also corrected the list handling code in ntoskrnl_dpc_thread()
and ntoskrnl_workitem_thread(). I realized that the original logic
did not correctly handle the case where a DPC callout tries to
queue up another DPC. It works correctly now.

I implemented IoConnectInterrupt() and IoDisconnectInterrupt() and
modified NdisMRegisterInterrupt() and NdisMDisconnectInterrupt() to
use them. I also tried to duplicate the interrupt handling scheme
used in Windows. The interrupt handling is now internal to ndis.ko,
and the ndis_intr() function has been removed from if_ndis.c. (In
the USB case, interrupt handling isn't needed in if_ndis.c anyway.)

NdisMSleep() has been rewritten to use a KeWaitForSingleObject()
and a KeTimer, which is how it works in Windows. (This is mainly
to insure that the NDISulator uses the KeTimer API so I can spot
any problems with it that may arise.)

KeCancelTimer() has been changed so that it only cancels timers, and
does not attempt to cancel a DPC if the timer managed to fire and
queue one up before KeCancelTimer() was called. The Windows DDK
documentation seems to imply that KeCantelTimer() will also call
KeRemoveQueueDpc() if necessary, but it really doesn't.

The KeTimer implementation has been rewritten to use the callout API
directly instead of timeout()/untimeout(). I still cheat a little in
that I have to manage my own small callout timer wheel, but the timer
code works more smoothly now. I discovered a race condition using
timeout()/untimeout() with periodic timers where untimeout() fails
to actually cancel a timer. I don't quite understand where the race
is, using callout_init()/callout_reset()/callout_stop() directly
seems to fix it.

I also discovered and fixed a bug in winx32_wrap.S related to
translating _stdcall calls. There are a couple of routines
(i.e. the 64-bit arithmetic intrinsics in subr_ntoskrnl) that
return 64-bit quantities. On the x86 arch, 64-bit values are
returned in the %eax and %edx registers. However, it happens
that the ctxsw_utow() routine uses %edx as a scratch register,
and x86_stdcall_wrap() and x86_stdcall_call() were only preserving
%eax before branching to ctxsw_utow(). This means %edx was getting
clobbered in some cases. Curiously, the most noticeable effect of this
bug is that the driver for the TI AXC110 chipset would constantly drop
and reacquire its link for no apparent reason. Both %eax and %edx
are preserved on the stack now. The _fastcall and _regparm
wrappers already handled everything correctly.

I changed if_ndis to use IoAllocateWorkItem() and IoQueueWorkItem()
instead of the NdisScheduleWorkItem() API. This is to avoid possible
deadlocks with any drivers that use NdisScheduleWorkItem() themselves.

The unicode/ansi conversion handling code has been cleaned up. The
internal routines have been moved to subr_ntoskrnl and the
RtlXXX routines have been exported so that subr_ndis can call them.
This removes the incestuous relationship between the two modules
regarding this code and fixes the implementation so that it honors
the 'maxlen' fields correctly. (Previously it was possible for
NdisUnicodeStringToAnsiString() to possibly clobber memory it didn't
own, which was causing many mysterious crashes in the Marvell 8335
driver.)

The registry handling code (NdisOpen/Close/ReadConfiguration()) has
been fixed to allocate memory for all the parameters it hands out to
callers and delete whem when NdisCloseConfiguration() is called.
(Previously, it would secretly use a single static buffer.)

I also substantially updated if_ndis so that the source can now be
built on FreeBSD 7, 6 and 5 without any changes. On FreeBSD 5, only
WEP support is enabled. On FreeBSD 6 and 7, WPA-PSK support is enabled.

The original WPA code has been updated to fit in more cleanly with
the net80211 API, and to eleminate the use of magic numbers. The
ndis_80211_setstate() routine now sets a default authmode of OPEN
and initializes the RTS threshold and fragmentation threshold.
The WPA routines were changed so that the authentication mode is
always set first, followed by the cipher. Some drivers depend on
the operations being performed in this order.

I also added passthrough ioctls that allow application code to
directly call the MiniportSetInformation()/MiniportQueryInformation()
methods via ndis_set_info() and ndis_get_info(). The ndis_linksts()
routine also caches the last 4 events signalled by the driver via
NdisMIndicateStatus(), and they can be queried by an application via
a separate ioctl. This is done to allow wpa_supplicant to directly
program the various crypto and key management options in the driver,
allowing things like WPA2 support to work.

Whew.
2005-10-10 16:46:39 +00:00
John Baldwin
f2107e8d54 Use the constants for the syscall names from syscall.h rather than
hardcoding the numbers for the SYSVIPC syscalls.
2005-10-03 18:34:17 +00:00
Robert Watson
5f419982c2 Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57,
osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60,
svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81,
svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55,
svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10,
ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58,
unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133:

Now that Giant is acquired in uprintf() and tprintf(), the caller no
longer leads to acquire Giant unless it also holds another mutex that
would generate a lock order reversal when calling into these functions.
Specifically not backed out is the acquisition of Giant in nfs_socket.c
and rpcclnt.c, where local mutexes are held and would otherwise violate
the lock order with Giant.

This aligns this code more with the eventual locking of ttys.

Suggested by:	bde
2005-09-28 07:03:03 +00:00
Peter Wemm
a11ea6e325 Regenerate 2005-09-27 18:04:52 +00:00
Peter Wemm
add121a476 Implement 32 bit getcontext/setcontext/swapcontext on amd64. I've added
stubs for ia64 to keep it compiling.  These are used by 32 bit apps such
as gdb.
2005-09-27 18:04:20 +00:00
Robert Watson
84d2b7df26 Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(),
as they both interact with the tty code (!MPSAFE) and may sleep if the
tty buffer is full (per comment).

Modify all consumers of uprintf() and tprintf() to hold Giant around
calls into these functions.  In most cases, this means adding an
acquisition of Giant immediately around the function.  In some cases
(nfs_timer()), it means acquiring Giant higher up in the callout.

With these changes, UFS no longer panics on SMP when either blocks are
exhausted or inodes are exhausted under load due to races in the tty
code when running without Giant.

NB: Some reduction in calls to uprintf() in the svr4 code is probably
desirable.

NB: In the case of nfs_timer(), calling uprintf() while holding a mutex,
or even in a callout at all, is a bad idea, and will generate warnings
and potential upset.  This needs to be fixed, but was a problem before
this change.

NB: uprintf()/tprintf() sleeping is generally a bad ideas, as is having
non-MPSAFE tty code.

MFC after:	1 week
2005-09-19 16:51:43 +00:00
Andre Oppermann
e72b668b69 Test the mbuf flags against the correct constant. The previous version
worked as intended but only by chance.  MT_HEADER == M_PKTHDR == 0x2.
2005-08-30 16:21:51 +00:00
Xin LI
e68796868a Fix kernel build.
Reported by:	tinderbox
2005-08-28 13:11:08 +00:00
Craig Rodrigues
8739cd44d0 Rewrite linux_ifconf() to be more like ifconf() in net/if.c
so that we do not call uiomove() while IFNET_RLOCK() is held.
This eliminates the witness warning:

Calling uiomove() with the following non-sleepable locks held:
exclusive sleep mutex ifnet r = 0 (0xc096dd60) locked @
/usr/src/sys/modules/linux/../../compat/linux/linux_ioctl.c:2170

MFC after:	2 days
2005-08-27 14:44:10 +00:00
Robert Watson
13f4c340ae Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags.  Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags.  This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:20:02 +00:00
John Baldwin
ec1f24a934 Add missing dependencies on the SYSVIPC modules. 2005-07-29 19:41:04 +00:00
John Baldwin
813a5e14ec Move MODULE_DEPEND() statements for SYSVIPC dependencies to linux_ipc.c
so that they aren't duplicated 3 times and are also in the same file as
the code that depends on the SYSVIPC modules.
2005-07-29 19:40:39 +00:00
John Baldwin
ac5ee935dd Regen. 2005-07-13 20:35:09 +00:00
John Baldwin
8683e7fdc1 Make a pass through all the compat ABIs sychronizing the MP safe flags
with the master syscall table as well as marking several ABI wrapper
functions safe.

MFC after:	1 week
2005-07-13 20:32:42 +00:00
John Baldwin
6e9b02cf80 Regen. 2005-07-13 15:14:54 +00:00
John Baldwin
2773347338 - Stop hardcoding #define's for options and use the appropriate
opt_foo.h headers instead.
- Hook up the IPC SVR4 syscalls.

MFC after:	3 days
2005-07-13 15:14:33 +00:00
John Baldwin
fa34d9b7a5 Wrap the ia64-specific freebsd32_mmap_partial() hack in Giant for now
since it calls into VFS and VM.  This makes the freebsd32_mmap() routine
MP safe and the extra Giants here can be revisited later.

Glanced at by:	marcel
MFC after:	3 days
2005-07-13 15:12:19 +00:00
John Baldwin
02295eedc7 Add Giant around linux_getcwd_common() in linux_getcwd().
Approved by:	re (scottl)
2005-07-09 12:34:49 +00:00
John Baldwin
4641373fde Add missing locking to linux_connect() so that it can be marked MP safe:
- Conditionally grab Giant around the EISCONN hack at the end based on
  debug.mpsafenet.
- Protect access to so_emuldata via SOCK_LOCK.

Reviewed by:	rwatson
Approved by:	re (scottl)
2005-07-09 12:26:22 +00:00
Roman Kurakin
fbb7165a4b Use implicit type cast for ->k_lock to fix compilation of ndis
as a part of the GENERIC kernel with INVARIANT* and WITNESS*
turned off.
(For non GENERIC kernel KTR and MUTEX_PROFILING should be also
off).

Submitted by:	Eygene A. Ryabinkin <rea at rea dot mbslab dot kiae dot ru>
Approved by:	re (scottl)
PR:		81767
2005-07-08 18:36:59 +00:00
John Baldwin
55522478e6 Lock Giant in svr4_add_socket() so that the various svr4_*stat() calls
can be marked MP safe as this is the only part of them that is not
already MP safe.

Approved by:	re (scottl)
2005-07-07 19:27:29 +00:00
John Baldwin
03badf38ab Remove an unused syscallarg() macro leftover from this code's origins in
NetBSD.

Approved by:	re (scottl)
2005-07-07 19:26:43 +00:00
John Baldwin
07fac65b15 Rototill this file so that it actually compiles. It doesn't do anything
in the build still due to some #undef's in svr4.h, but if you hack around
that and add some missing entries to syscalls.master, then this file will
now compile.  The changes involved proc -> thread, using FreeBSD syscall
names instead of NetBSD, and axeing syscallarg() and retval arguments.

Approved by:	re (scottl)
2005-07-07 19:25:47 +00:00
John Baldwin
8d948cd1ec Fix the computation of uptime for linux_sysinfo(). Before it was returning
the uptime in seconds mod 60 which wasn't very useful.

Approved by:	re (scottl)
2005-07-07 19:17:55 +00:00
John Baldwin
9f3157a254 Regenerate.
Approved by:	re (scottl)
2005-07-07 18:20:38 +00:00
John Baldwin
bcd9e0dd20 - Add two new system calls: preadv() and pwritev() which are like readv()
and writev() except that they take an additional offset argument and do
  not change the current file position.  In SAT speak:
  preadv:readv::pread:read and pwritev:writev::pwrite:write.
- Try to reduce code duplication some by merging most of the old
  kern_foov() and dofilefoo() functions into new dofilefoo() functions
  that are called by kern_foov() and kern_pfoov().  The non-v functions
  now all generate a simple uio on the stack from the passed in arguments
  and then call kern_foov().  For example, read() now just builds a uio and
  calls kern_readv() and pwrite() just builds a uio and calls kern_pwritev().

PR:		kern/80362
Submitted by:	Marc Olzheim marcolz at stack dot nl (1)
Approved by:	re (scottl)
MFC after:	1 week
2005-07-07 18:17:55 +00:00
Peter Wemm
62919d788b Jumbo-commit to enhance 32 bit application support on 64 bit kernels.
This is good enough to be able to run a RELENG_4 gdb binary against
a RELENG_4 application, along with various other tools (eg: 4.x gcore).
We use this at work.

ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace,
	procfs and core dumps.
procfs_*regs.c: vary the format of proc/XXX/*regs depending on the client
	and target application.
procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their
	sscanf fails.  They expect an unsigned long.
imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps.
sys_process.c: handle 32 bit consumers debugging 32 bit targets.  Note
	that 64 bit consumers can still debug 32 bit targets.

IA64 has got stubs for ia32_reg.c.

Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't
implemented in the 32/64 wrapper yet.  We also make a tiny patch to
gdb pacify it over conflicting formats of ld-elf.so.1.

Approved by:	re
2005-06-30 07:49:22 +00:00
John Baldwin
19042f9cce - Change the commented out freebsd32_xxx() example to use kern_xxx() along
with a single copyin() + translate and translate + copyout() rather than
  using the stackgap.
- Remove implementation of the stackgap for freebsd32 since it is no longer
  used for that compat ABI.

Approved by:	re (scottl)
2005-06-29 15:16:20 +00:00
John Baldwin
de1c01ad37 Correct the amount of data to allocate in these local copies of
exec_copyin_strings() to catch up to rev 1.266 of kern_exec.c.  This fixes
panics on amd64 with compat binaries since exec_free_args() was freeing
more memory than these functions were allocating and the mismatch could
cause memory to be freed out from under other concurrent execs.

Approved by:	re (scottl)
2005-06-24 17:41:28 +00:00
Pawel Jakub Dawidek
06a137780b Actually only protect mount-point if security.jail.enforce_statfs is set to 2.
If we don't return statistics about requested file systems, system tools
may not work correctly or at all.

Approved by:	re (scottl)
2005-06-23 22:13:29 +00:00
Pawel Jakub Dawidek
3a996d6e91 Do not allocate memory based on not-checked argument from userland.
It can be used to panic the kernel by giving too big value.
Fix it by moving allocation and size verification into kern_getfsstat().
This even simplifies kern_getfsstat() consumers, but destroys symmetry -
memory is allocated inside kern_getfsstat(), but has to be freed by the
caller.

Found by:	FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/
Reported by:	Peter Holm <peter@holm.cc>
2005-06-11 14:58:20 +00:00
Brooks Davis
fc74a9f93a Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
 - Struct arpcom is no longer referenced in normal interface code.
   Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
   To enforce this ac_enaddr has been renamed to _ac_enaddr.
 - The second argument to ether_ifattach is now always the mac address
   from driver private storage rather than sometimes being ac_enaddr.

Reviewed by:	sobomax, sam
2005-06-10 16:49:24 +00:00
Pawel Jakub Dawidek
820a0de9a9 Rename sysctl security.jail.getfsstatroot_only to security.jail.enforce_statfs
and extend its functionality:

value	policy
0	show all mount-points without any restrictions
1	show only mount-points below jail's chroot and show only part of the
	mount-point's path (if jail's chroot directory is /jails/foo and
	mount-point is /jails/foo/usr/home only /usr/home will be shown)
2	show only mount-point where jail's chroot directory is placed.

Default value is 2.

Discussed with:	rwatson
2005-06-09 18:49:19 +00:00
Pawel Jakub Dawidek
13a82b9623 Avoid code duplication in serval places by introducing universal
kern_getfsstat() function.

Obtained from:	jhb
2005-06-09 17:44:46 +00:00
Maxim Sobolev
bc165ab0fe Properly convert FreeBSD priority values into Linux values in the
getpriority(2) syscall.

PR:		kern/81951
Submitted by:	Andriy Gapon <avg@icyb.net.ua>
2005-06-08 20:41:28 +00:00
Paul Saab
efe5becafa Wrap copyin/copyout for kevent so the 32bit wrapper does not have
to malloc nchanges * sizeof(struct kevent) AND/OR nevents *
sizeof(struct kevent) on every syscall.

Glanced at by:	peter, jmg
Obtained from:	Yahoo!
MFC after:	2 weeks
2005-06-03 23:15:01 +00:00
Robert Watson
3984b2328c Rebuild generated system call definition files following the addition of
the audit event field to the syscalls.master file format.

Submitted by:	wsalamon
Obtained from:	TrustedBSD Project
2005-05-30 15:20:21 +00:00
Robert Watson
f3596e3370 Introduce a new field in the syscalls.master file format to hold the
audit event identifier associated with each system call, which will
be stored by makesyscalls.sh in the sy_auevent field of struct sysent.
For now, default the audit identifier on all system calls to AUE_NULL,
but in the near future, other BSM event identifiers will be used.  The
mapping of system calls to event identifiers is many:one due to
multiple system calls that map to the same end functionality across
compatibility wrappers, ABI wrappers, etc.

Submitted by:	wsalamon
Obtained from:	TrustedBSD Project
2005-05-30 15:09:18 +00:00
Yoshihiro Takahashi
d4fcf3cba5 Remove bus_{mem,p}io.h and related code for a micro-optimization on i386
and amd64.  The optimization is a trivial on recent machines.

Reviewed by:	-arch (imp, marcel, dfr)
2005-05-29 04:42:30 +00:00
Pawel Jakub Dawidek
d0cad55da8 Remove (now) unused argument 'td' from bsd_to_linux_statfs(). 2005-05-27 19:25:39 +00:00
Paul Saab
473dd55f2e Copyout to userland if kern_sigaction succeeds 2005-05-24 17:52:14 +00:00
Pawel Jakub Dawidek
672d95c55d The code is under '#ifdef not_that_way', but anyway:
- Add missing prison_check_mount() check.
2005-05-22 22:30:31 +00:00
Pawel Jakub Dawidek
a0e96a49df If we need to hide fsid, kern_statfs()/kern_fstatfs() will do it for us,
so do not duplicate the code in cvtstatfs().
Note, that we now need to clear fsid in freebsd4_getfsstat().

This moves all security related checks from functions like cvtstatfs()
and will allow to add more security related stuff (like statfs(2), etc.
protection for jails) a bit easier.
2005-05-22 21:52:30 +00:00