Commit Graph

9570 Commits

Author SHA1 Message Date
Alexander Leidinger
6a1162d4cd MFP4 (with some minor changes):
Implement the linux_io_* syscalls (AIO). They are only enabled if the native
AIO code is available (either compiled in to the kernel or as a module) at
the time the functions are used. If the AIO stuff is not available there
will be a ENOSYS.

From the submitter:
---snip---
DESIGN NOTES:

1. Linux permits a process to own multiple AIO queues (distinguished by
   "context"), but FreeBSD creates only one single AIO queue per process.
   My code maintains a request queue (STAILQ of queue(3)) per "context",
   and throws all AIO requests of all contexts owned by a process into
   the single FreeBSD per-process AIO queue.

   When the process calls io_destroy(2), io_getevents(2), io_submit(2) and
   io_cancel(2), my code can pick out requests owned by the specified context
   from the single FreeBSD per-process AIO queue according to the per-context
   request queues maintained by my code.

2. The request queue maintained by my code stores contrast information between
   Linux IO control blocks (struct linux_iocb) and FreeBSD IO control blocks
   (struct aiocb). FreeBSD IO control block actually exists in userland memory
   space, required by FreeBSD native aio_XXXXXX(2).

3. It is quite troubling that the function io_getevents() of libaio-0.3.105
   needs to use Linux-specific "struct aio_ring", which is a partial mirror
   of context in user space. I would rather take the address of context in
   kernel as the context ID, but the io_getevents() of libaio forces me to
   take the address of the "ring" in user space as the context ID.

   To my surprise, one comment line in the file "io_getevents.c" of
   libaio-0.3.105 reads:

             Ben will hate me for this

REFERENCE:

1. Linux kernel source code:   http://www.kernel.org/pub/linux/kernel/v2.6/
   (include/linux/aio_abi.h, fs/aio.c)

2. Linux manual pages:         http://www.kernel.org/pub/linux/docs/manpages/
   (io_setup(2), io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2))

3. Linux Scalability Effort:   http://lse.sourceforge.net/io/aio.html
   The design notes:           http://lse.sourceforge.net/io/aionotes.txt

4. The package libaio, both source and binary:
       http://rpmfind.net/linux/rpm2html/search.php?query=libaio
   Simple transparent interface to Linux AIO system calls.

5. Libaio-oracle:              http://oss.oracle.com/projects/libaio-oracle/
   POSIX AIO implementation based on Linux AIO system calls (depending on
   libaio).
---snip---

Submitted by:	Li, Xiao <intron@intron.ac>
2006-10-15 14:22:14 +00:00
Ruslan Ermilov
a1b0a18096 Prevent IOC_IN with zero size argument (this is only supported
if backward copatibility options are present) from attempting
to free memory that wasn't allocated.  This is an old bug, and
previously it would attempt to free a null pointer.  I noticed
this bug when working on the previous revision, but forgot to
fix it.

Security:	local DoS
Reported by:	Peter Holm
MFC after:	3 days
2006-10-14 19:01:55 +00:00
Tom Rhodes
f51bf07af8 Close a race condition where num can be larger than tmp, giving the user
too large of a boundary.

Reported by:	Ilja Van Sprundel
2006-10-14 10:30:14 +00:00
Tor Egge
e0c33ad529 Wait for thread count to reach zero in destroy_devl() even when no purge
method is defined, to avoid memory being modified after free.

Temporarily increase refcount in destroy_devl() to avoid a double free
if dev_rel() is called while waiting for thread count to reach zero.
2006-10-13 20:49:24 +00:00
Gleb Smirnoff
68a57ebfad Improve ktr(4) logging for callout(9) subsystem. Log all inserts and
removals, including failures, into the callwheel.

XXX: Most of the CTR() macros are called with callout_lock spin mutex
held, thus won't be logged into file, if KTR_ALQ is used. Moving the
CTR() macros out from the spinlocked code would require copying of all
arguments. I'm too lazy to do this.
2006-10-11 14:57:03 +00:00
David Xu
ae7d8a6766 Implement 32bit umtx_lock and umtx_unlock system calls, these two system
calls are not used by libthr in RELENG_6 and HEAD, it is only used by
the libthr in RELENG-5, the _umtx_op system call can do more incremental
dirty works than these two system calls without having to introduce new
system calls or throw away old system calls when things are going on.
2006-10-06 08:22:08 +00:00
David Xu
c6511aea86 Move some declaration of 32-bit signal structures into file
freebsd32-signal.h, implement sigtimedwait and sigwaitinfo system calls.
2006-10-05 01:56:11 +00:00
Martin Blapp
89ff1e4cb8 Back out part of rev. 1.149. While adding a workaround in ptcopen() to
avoid leaked ptys works fine, this opens a possible security hole.

Submitted by:	bde
MFC after:	3 days
2006-10-04 05:43:39 +00:00
Robert Watson
531147aa3e Regenerate. 2006-10-03 20:48:11 +00:00
Robert Watson
888db9e177 Audit creat() system call (compat code), and change type for getpagesize(),
which isn't actually being audited anyway.

MFC after:	3 days
Obtained from:	TrustedBSD Project
2006-10-03 20:46:52 +00:00
Konstantin Belousov
30af71199e Fix the remaining race in the revs. 1.232, 1,233 that could occur during
unmount when mp structure is reused while waiting for coveredvp lock.
Introduce struct mount generation count, increment it on each reuse and
compare the generations before and after obtaining the coveredvp lock.

Reviewed by:	tegge, pjd
Approved by:	pjd (mentor)
MFC after:	2 weeks
2006-10-03 10:47:04 +00:00
Poul-Henning Kamp
e5037a18a9 Use utc_offset() where applicable, and hide the internals of it
as static variables.
2006-10-02 18:23:37 +00:00
Poul-Henning Kamp
f97c1c4bf7 Introduce utc_offset() to capture a calculation currently done all over the
place.
2006-10-02 16:17:23 +00:00
Poul-Henning Kamp
94d67e0fb8 Move tz_minuteswest and tz_dsttime to subr_clock.c 2006-10-02 16:06:26 +00:00
Poul-Henning Kamp
b69f71eb29 Second part of a little cleanup in the calendar/timezone/RTC handling.
Split subr_clock.c in two parts (by repo-copy):
   subr_clock.c contains generic RTC and calendaric stuff. etc.
   subr_rtc.c contains the newbus'ified RTC interface.

Centralize the machdep.{adjkerntz,disable_rtc_set,wall_cmos_clock}
sysctls and associated variables into subr_clock.c.  They are
not machine dependent and we have generic code that relies on being
present so they are not even optional.
2006-10-02 15:42:02 +00:00
Poul-Henning Kamp
f645b0b51c First part of a little cleanup in the calendar/timezone/RTC handling.
Move relevant variables to <sys/clock.h> and fix #includes as necessary.

Use libkern's much more time- & spamce-efficient BCD routines.
2006-10-02 12:59:59 +00:00
Konstantin Belousov
45ea8737bf Correct the comment: numvnodes is decreased on vdestroying the vnode.
OKed by:	tegge
Approved by:	pjd (mentor)
MFC after:	1 week
2006-10-02 07:25:58 +00:00
Tor Egge
04aa807cb6 If the buffer lock has waiters after the buffer has changed identity then
getnewbuf() needs to drop the buffer in order to wake waiters that might
sleep on the buffer in the context of the old identity.
2006-10-02 02:06:27 +00:00
Martin Blapp
570d6457d1 Readd rev. 1.145 because of vfs bugs and races near revoke(). Until they
are fixed we can't free any slaves. Add a workaround to not to leak ptys
by number.
2006-09-30 22:51:05 +00:00
Pawel Jakub Dawidek
2342d5216e Remove duplicated $FreeBSD$. 2006-09-30 16:33:29 +00:00
Martin Blapp
35dcc318f4 Any call of tty_close() with a tty refcount of <= 1 is wrong and we will
free the tty in this case. This is a workaround until the underlaying
devfs/tty problems are fixed.

MFC after:	1 day
2006-09-30 08:11:51 +00:00
Martin Blapp
9b206de5a0 Free tty struct after last close. This should fix the pty-leak by numbers.
Remove workarounds for tty_refcount beeing 0, this will be fixed differently
later.
2006-09-29 09:53:19 +00:00
Martin Blapp
e4936f3763 Free tty struct after last close. This should fix the pty-leak by numbers.
Remove workarounds for tty_refcount beeing 0, this will be fixed differently
later.

Back out rev 1.145 since we initialize the tty struct from scratch and bad
things can't happen anymore.
2006-09-29 09:52:57 +00:00
Ruslan Ermilov
9fddcc6661 Fix our ioctl(2) implementation when the argument is "int". New
ioctls passing integer arguments should use the _IOWINT() macro.
This fixes a lot of ioctl's not working on sparc64, most notable
being keyboard/syscons ioctls.

Full ABI compatibility is provided, with the bonus of fixing the
handling of old ioctls on sparc64.

Reviewed by:	bde (with contributions)
Tested by:	emax, marius
MFC after:	1 week
2006-09-27 19:57:02 +00:00
Martin Blapp
8be563721a Move Giant up even further since P_CONTROLT isn't really fully locked
yet (p_flag is, but P_CONTROLT isn't really).

Submitted by:	jhb
2006-09-27 16:42:10 +00:00
Martin Blapp
1bf5e4b866 Use ctty instead of just returning. ctty just has a simple open that
returns ENXIO.

Submitted by:	jhb
2006-09-27 16:41:15 +00:00
Tor Egge
e60c361218 Reduce fluctuations of mnt_flag to allow unlocked readers to get a
slightly more consistent view.
2006-09-26 04:20:09 +00:00
Tor Egge
fba924ce9b Don't restore MNT_QUOTA bit in mnt_flag after a failed mount with
MNT_UPDATE flag, closing a race between nmount() and quotactl().
2006-09-26 04:18:36 +00:00
Tor Egge
a1e363f256 Add mnt_noasync counter to better handle interleaved calls to nmount(),
sync() and sync_fsync() without losing MNT_ASYNC.  Add MNTK_ASYNC flag
which is set only when MNT_ASYNC is set and mnt_noasync is zero, and
check that flag instead of MNT_ASYNC before initiating async io.
2006-09-26 04:15:59 +00:00
Tor Egge
cea9d840d8 Don't restore mnt_kern_flag on failed MNT_UPDATE mount, it can race
with dounmount(), causing loss of MNTK_UNMOUNT flag.
2006-09-26 04:15:04 +00:00
Tor Egge
5da56ddb21 Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag.
This eliminates a race where MNT_UPDATE flag could be lost when nmount()
raced against sync(), sync_fsync() or quotactl().
2006-09-26 04:12:49 +00:00
Robert Watson
88b85279a9 SI_ORDER_THIRD + 2, not SI_ORDER_FOURTH + 2.
MFC after:	3 days
Submitted by:	mlaier
2006-09-26 00:15:56 +00:00
Robert Watson
5add74b4a7 Add "FreeBSD" trademark statement to copyright section of boot messages.
MFC after:	3 days
Approved by:	core, board at FreeBSDFoundation dot org
2006-09-25 23:19:01 +00:00
John-Mark Gurney
33fabe46da remove unnecessary NULL check...
Coverity ID:	1545
2006-09-25 01:29:48 +00:00
John-Mark Gurney
4db71d27a1 hide kqueue_register from public view, and replace it w/ kqfd_register...
this eliminates a possible race in aio registering a kevent..
2006-09-24 04:47:47 +00:00
John-Mark Gurney
aeab19b21f return EBADF instead of successfully attaching (and then panicing) when
an fd is dieing..

Convinced by:	jhb
PR:		103127
2006-09-24 02:29:53 +00:00
John-Mark Gurney
9edac6f3f9 add KTRACE hooks into kevent... This will help people debug their kqueue
programs to find out exactly which events were registered and which were
returned...  This should be lower in kern_kevent, but that would require
special munging due to locks and the functions used to copyin/copyout
kevents...

If someone wants to teach ktrace how to output pretty kevents, I have a
kevent prety printer that can be used...
2006-09-24 02:23:29 +00:00
Martin Blapp
45e6819160 Protect enterpgrp() against another tty/proc race case until the tty locking work
has been fixed.

MFC after:	1 week
2006-09-23 17:35:24 +00:00
Martin Blapp
7c56049e6d Check for tp->t_refcnt == 0 before doing anything in tty_open().
PR:		103520
MFC after:	1 week
2006-09-23 14:52:46 +00:00
Martin Blapp
153c21c8c1 If /dev/tty gets opened after your controlling terminal has been revoked
you can't call tty_clone afterwords. OpenBSD and NetBSD both fail the
open call in that case, so we should do so as well. This can
be done in ctty_clone by returning with *dev==NULL. Admittedly this
causes open to return ENOENT, instead of ENXIO as on the other BSDs,
but this way requires the least touching of code.

Submitted by:  Nate Eldredge <nge@cs.hmc.edu>
PR:            83375

MFC:           1 week
2006-09-23 14:44:14 +00:00
Bruce M Simpson
4a75dc2585 Fix a case where socket I/O atomicity is violated due to not dropping
the entire record when a non-data mbuf is removed in the soreceive() path.
This only triggers a panic directly when compiled with INVARIANTS.

PR:		38495
Submitted by:	James Juran
MFC after:	1 week
2006-09-22 15:34:16 +00:00
David Xu
cda9a0d1c2 Add compatible code to let 32bit libthr work on 64bit kernel. 2006-09-22 15:04:28 +00:00
David Xu
e58b17ea53 Fix umtx command order error for freebsd 32bit. 2006-09-22 14:59:10 +00:00
David Xu
1eec02f538 Add umtx support for 32bit process on AMD64 machine. 2006-09-22 00:52:54 +00:00
Martin Blapp
1c1d411bee Back out rev. 1.258. The real race cause has been fixed
in rev. 1.241 of kern_proc.c.

Requested by:	jhb
2006-09-21 14:09:26 +00:00
Randall Stewart
adf5d1c6d0 atomic_fetchadd_int is used by mb_free_ext(), but it
returns the previous value that the "add" effected (In
this case we are adding -1), afterwhich we compare it
to '0'... to see if we free the mbuf... we should
be comparing it to '1'... Note that this only effects
when there is contention since there is a first part
to the comparison that checks to see if its '1'. So
this bug would only crop up if two CPU's are trying
to free the same mbuf refcount at the same time. This
will happen in SCTP but I doubt can happen in TCP or
UDP.
PR:		N/A
Submitted by:	rrs
Reviewed by:	gnn,sam
Approved by:	gnn,sam
2006-09-21 09:55:43 +00:00
David Xu
cca0a557dd Regenerate. 2006-09-21 04:19:48 +00:00
David Xu
73fa3e5b88 Replace system call thr_getscheduler, thr_setscheduler, thr_setschedparam
with rtprio_thread, while rtprio system call is for process only, the new
system call rtprio_thread is responsible for LWP.
2006-09-21 04:18:46 +00:00
Robert Watson
f50c4fd817 Remove MAC_DEBUG + MPRINTF debugging from System V IPC. This no longer
appears to be serving a useful purpose, as it was used during initial
development of MAC support for System V IPC.

MFC after:	1 month
Obtained from:	TrustedBSD Project
Suggested by:	Christopher dot Vance at SPARTA dot com
2006-09-20 13:40:00 +00:00
Robert Watson
738f14d4b1 Remove MAC_DEBUG label counters, which were used to debug leaks and
other problems while labels were first being added to various kernel
objects.  They have outlived their usefulness.

MFC after:	1 month
Suggested by:	Christopher dot Vance at SPARTA dot com
Obtained from:	TrustedBSD Project
2006-09-20 13:33:41 +00:00