Implement the linux_io_* syscalls (AIO). They are only enabled if the native
AIO code is available (either compiled in to the kernel or as a module) at
the time the functions are used. If AIO is not available, the syscalls
return ENOSYS.
From the submitter:
---snip---
DESIGN NOTES:
1. Linux permits a process to own multiple AIO queues (distinguished by
"context"), but FreeBSD creates only one single AIO queue per process.
My code maintains a request queue (STAILQ of queue(3)) per "context",
and throws all AIO requests of all contexts owned by a process into
the single FreeBSD per-process AIO queue.
When the process calls io_destroy(2), io_getevents(2), io_submit(2) and
io_cancel(2), my code can pick out requests owned by the specified context
from the single FreeBSD per-process AIO queue according to the per-context
request queues maintained by my code.
2. The request queue maintained by my code stores the mapping between
Linux IO control blocks (struct linux_iocb) and FreeBSD IO control blocks
(struct aiocb). The FreeBSD IO control block actually resides in userland
memory, as required by the FreeBSD native aio_XXXXXX(2) calls.
3. It is quite troubling that the function io_getevents() of libaio-0.3.105
needs to use Linux-specific "struct aio_ring", which is a partial mirror
of context in user space. I would rather take the address of context in
kernel as the context ID, but the io_getevents() of libaio forces me to
take the address of the "ring" in user space as the context ID.
To my surprise, one comment line in the file "io_getevents.c" of
libaio-0.3.105 reads:
Ben will hate me for this
REFERENCE:
1. Linux kernel source code: http://www.kernel.org/pub/linux/kernel/v2.6/
(include/linux/aio_abi.h, fs/aio.c)
2. Linux manual pages: http://www.kernel.org/pub/linux/docs/manpages/
(io_setup(2), io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2))
3. Linux Scalability Effort: http://lse.sourceforge.net/io/aio.html
The design notes: http://lse.sourceforge.net/io/aionotes.txt
4. The package libaio, both source and binary:
http://rpmfind.net/linux/rpm2html/search.php?query=libaio
Simple transparent interface to Linux AIO system calls.
5. Libaio-oracle: http://oss.oracle.com/projects/libaio-oracle/
POSIX AIO implementation based on Linux AIO system calls (depending on
libaio).
---snip---
Submitted by: Li, Xiao <intron@intron.ac>
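The per-context bookkeeping described in design notes 1 and 2 can be
sketched in userland C as follows. This is only an illustration, not the
submitted code: the names ctx_request, aio_context, ctx_init, ctx_track and
ctx_find are hypothetical, and the user-space addresses are reduced to
opaque pointers. It shows one STAILQ per context being used to decide
whether a native request belongs to a given context.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical record pairing a Linux iocb with its native aiocb. */
struct ctx_request {
	const void *linux_iocb;		/* user address of struct linux_iocb */
	const void *native_aiocb;	/* user address of struct aiocb */
	STAILQ_ENTRY(ctx_request) link;
};

STAILQ_HEAD(ctx_reqlist, ctx_request);

/* Hypothetical per-context state: one request list per Linux context. */
struct aio_context {
	unsigned long id;
	struct ctx_reqlist requests;
};

static void
ctx_init(struct aio_context *ctx, unsigned long id)
{
	ctx->id = id;
	STAILQ_INIT(&ctx->requests);
}

/* Remember that a request was submitted through this context. */
static void
ctx_track(struct aio_context *ctx, struct ctx_request *req)
{
	assert(req != NULL);
	STAILQ_INSERT_TAIL(&ctx->requests, req, link);
}

/*
 * Return the record in this context that wraps the given native aiocb,
 * or NULL if that aiocb was submitted through some other context.
 */
static struct ctx_request *
ctx_find(struct aio_context *ctx, const void *native_aiocb)
{
	struct ctx_request *req;

	STAILQ_FOREACH(req, &ctx->requests, link) {
		if (req->native_aiocb == native_aiocb)
			return (req);
	}
	return (NULL);
}
```

With two contexts sharing one process-wide queue, ctx_find() on the first
context returns NULL for a request tracked only by the second one.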
it introduced a check after the call to file system's get pages method
that assumes that the get pages method does not change the array of pages
that is passed to it. In the case of vnode_pager_generic_getpages(),
this assumption has been incorrect. The contents of the array of pages
may be shifted by vnode_pager_generic_getpages(). Likely, the problem
has been hidden by vnode_pager_haspage() limiting the set of pages that
are passed to vnode_pager_generic_getpages() such that a shift never
occurs.
The fix implemented herein is to adjust the pointer to the array of pages
rather than shifting the pages within the array.
MFC after: 3 weeks
Fix suggested by: tegge
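A minimal sketch of the idea behind the fix, assuming a caller that holds
an array of page pointers. The struct page type and the helper name
skip_leading_pages are hypothetical stand-ins, not the kernel's own.
Advancing the pointer leaves the array contents where the caller expects
them, which shifting the entries would not.

```c
#include <assert.h>
#include <stddef.h>

struct page { int id; };	/* hypothetical stand-in for a VM page */

/*
 * Drop the first `skip` entries from a run of pages by advancing the
 * view of the array. The array itself is not modified, so the caller's
 * original pointer still indexes the same pages as before the call.
 */
static struct page **
skip_leading_pages(struct page **m, int *count, int skip)
{
	assert(skip >= 0 && skip <= *count);
	*count -= skip;
	return (m + skip);
}
```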
an error returned by VOP_BMAP() and a hole in the file.
Change the callers to vnode_pager_addr() such that they return
VM_PAGER_ERROR when VOP_BMAP fails instead of a zero-filled page.
Reviewed by: tegge
MFC after: 3 weeks
if backward compatibility options are present) from attempting
to free memory that wasn't allocated. This is an old bug, and
previously it would attempt to free a null pointer. I noticed
this bug when working on the previous revision, but forgot to
fix it.
Security: local DoS
Reported by: Peter Holm
MFC after: 3 days
VA_MARK_ATIME feature to fix POSIX conformance for execve() and mmap(),
we thought that it was optimized well enough for the one file system
that supports it (ffs) and harmless for other file systems (except
layered ones which already get the layering for VOP_SETATTR() wrong).
However, nfs_setattr() doesn't do much parameter checking, so when
it gets a combination of parameters that it doesn't understand, it
always does a Setattr RPC. This RPC can't do anything good, and for
VA_MARK_ATIME it is null except for wasting a lot of time.
This is the smallest and easiest to fix of several bugs that have
increased the number of RPCs for kernel builds on nfs by more than
100% since 2004-11-05. The real-time increase depends on network
latency and parallelization and can also be very large (approaching
the same percentage for unparallelized operations like "make depend"
on systems with fast CPUs and high-latency networks).
method is defined, to avoid memory being modified after free.
Temporarily increase refcount in destroy_devl() to avoid a double free
if dev_rel() is called while waiting for thread count to reach zero.
size aligned requiring heavy usage of vm_page_alloc_contig
This change makes vm_page_alloc_contig SMP safe
Approved by: scottl (acting as backup for mentor rwatson)
removals, including failures, into the callwheel.
XXX: Most of the CTR() macros are called with callout_lock spin mutex
held, thus won't be logged into file, if KTR_ALQ is used. Moving the
CTR() macros out from the spinlocked code would require copying of all
arguments. I'm too lazy to do this.
entries' by src:port and dst:port pairs. IPv6 part is non-functional
as ``limit'' does not support IPv6 flows.
PR: kern/103967
Submitted by: based on Bruce Campbell patch
MFC after: 1 month
(PICs) rather than interrupt sources. This allows interrupt controllers
with no interrupt pins in use (such as the 8259As when APIC is in use) to
participate in suspend/resume.
- Always register the 8259A PICs even if we don't use any of their pins.
- Explicitly reset the 8259As on resume on amd64 if 'device atpic' isn't
included.
- Add a "dummy" PIC for the local APIC on the BSP to reset the local APIC
on resume. This gets suspend/resume working with APIC on UP systems.
SMP still needs more work to bring the APs back to life.
The MFC after is tentative.
Tested by: anholt (i386)
Submitted by: Andrea Bittau <a.bittau at cs.ucl.ac.uk> (3)
MFC after: 1 week
vnode_pager_generic_getpages(): (1) that VOP_BMAP() is unsupported by the
underlying file system and (2) an error in performing the VOP_BMAP().
Previously, vnode_pager_generic_getpages() assumed that all errors were
of the first type. If, in fact, the error was of the second type, the
likely outcome was for the process to become permanently blocked on a busy
page.
MFC after: 3 weeks
Reviewed by: tegge
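The two failure types can be told apart by the error code. The sketch
below assumes that "unsupported" is reported as EOPNOTSUPP, which this log
message does not state; the enum and the function classify_bmap_result are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcomes for the caller of the block-mapping operation. */
enum pager_action {
	PAGER_USE_BMAP,		/* mapping succeeded, do block I/O */
	PAGER_FALLBACK,		/* mapping unsupported, use the slow path */
	PAGER_FAIL		/* genuine error, report it to the caller */
};

/*
 * Classify the result of a block-mapping call. Treating every nonzero
 * value as "unsupported" is the mistake described above.
 * Assumption: EOPNOTSUPP is the "unsupported" indication.
 */
static enum pager_action
classify_bmap_result(int error)
{
	if (error == 0)
		return (PAGER_USE_BMAP);
	if (error == EOPNOTSUPP)
		return (PAGER_FALLBACK);
	return (PAGER_FAIL);
}
```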
not trust jails enough to execute audit related system calls. An example of
this is with su(1), or login(1) within prisons. So, if the syscall request
comes from a jail return ENOSYS. This will cause these utilities to operate
as if audit is not present in the kernel.
Looking forward, this problem will be remedied by allowing non privileged
users to maintain their own audit streams, but the details of exactly how
this will be implemented need to be worked out.
This change should fix situations when options AUDIT has been compiled into
the kernel, and utilities like su(1), or login(1) fail due to audit system
call failures within jails.
This is a RELENG_6 candidate.
Reported by: Christian Brueffer
Discussed with: rwatson
MFC after: 3 days
is suspending/suspended. Doing so may result in deadlock. Instead, set the
(new) IN_LAZYACCESS flag, that becomes IN_MODIFIED when suspend is lifted.
Change the locking protocol in order to set the IN_ACCESS and timestamps
without upgrading shared vnode lock to exclusive (see comments in the
inode.h). Before that, inode was modified while holding only shared
lock.
Tested by: Peter Holm
Reviewed by: tegge, bde
Approved by: pjd (mentor)
MFC after: 3 weeks
since they just duplicated the MI `reset' command. Instead of removing
them, make `reboot' an MI alias for `reset' since this gives a better
way of killing the `r' alias for `reset'. Remove the `registers' command
that was used to kill the alias.
Turn the powerpc and sparc64 MD `halt' command into an MI command.
A copy of sparc64/db_interface.c grew in sun4v just after I found the
extra reboot commands. It has not been changed, and is now not
identical. Duplicated commands come out duplicated in ddb's online
help, but cause large problems when used (e.g., on i386's with 2 halt's
and an hwatch, typing h doesn't give the expected message about an
ambiguous command, but hangs like the halt command or a looping parser
would).
suppression is only needed at ends of lines, but rev.1.32 forced it
off precisely there.
The --More-- prompt is now cleared by explicitly forcing out the
whitespace in "\r \r". It might be better to use the line
editor's clearing functions, but these are currently static and not
much different.
- `b' is now an official alias for `break'. It used to be an unofficial
alias, but this was broken by adding the `bt' alias for `trace'.
- `t' is now an official alias for `trace'. It used to be an unofficial
alias, but this was broken by adding the `thread' command.
- `registers' is now an alias for `show registers'. This is a hack to
break the unofficial `r' alias for `reset'. `r' really means
`registers' in some debuggers, so I sometimes type it accidentally and
am annoyed when it resets the system. A short command shouldn't have
such a large effect. Now at least `res' must be typed to disambiguate
`reset'.
output width of 79, only 6 columns of width 12 each fit, but 7 columns
were printed.
The fix is to pass the width of the next output to db_end_line() and
not assume there that this width is always 1.
Related unfixed bugs:
- 1 character is wasted for a space after the last column
- suppression of trailing spaces used to limit the misformatting, but
seems to have been lost
- in db_examine(), the width of the next output is not known and is
still assumed to be 1.
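The arithmetic behind the bug can be reproduced with a small model. This
is a sketch only: line_would_overflow and fields_per_line are hypothetical
helpers, and the exact comparison used by db_end_line() is assumed rather
than taken from the source. With a line width of 79 and fields of width 12,
assuming the next output is 1 character wide lets a seventh field through,
while passing the real width stops after six.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * True if a field of `field_width` characters starting at column `pos`
 * would run past a line of `max_width` characters.
 */
static bool
line_would_overflow(int pos, int field_width, int max_width)
{
	return (pos + field_width > max_width);
}

/*
 * Count how many fixed-width fields are printed before the line is
 * ended, given the width the end-of-line check assumes for the next
 * output.
 */
static int
fields_per_line(int field_width, int max_width, int assumed_next_width)
{
	int pos = 0, n = 0;

	assert(field_width > 0);
	while (!line_would_overflow(pos, assumed_next_width, max_width)) {
		pos += field_width;
		n++;
	}
	return (n);
}
```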
- Add entries in the uscanner.4 man page (along with missing 3500).
PR: usb/100957 [1], usb/100992 [2]
Submitted by: Jim Teresco <terescoj@teresco.org> [1],
Walter C. Pelissero <walter.pelissero@iesy.net> [2]
MFC after: 3 days
in ip6_output. In case this fails handle the error directly and log it[1].
In addition permit CARP over v6 in ip_fw2.
PR: kern/98622
Similar patch by: suz
Discussed with: glebius [1]
Tested by: Paul.Dekkers surfnet.nl, Philippe.Pegon crc.u-strasbg.fr
MFC after: 3 days
will fix a problem where you boot w/ the default of autoselect, but then
set the speed to 100/full, the switch will keep the autoselect/100/full
negotiation... This will continue to work till someone resets the switch
or unplugs the cable resulting in the switch failing to autoneg and falling
back to 100/half, causing a hard to track down duplex mismatch.
Submitted by: nCircle Network Security, Inc.
MFC after: 1 week
- Add support for the Conexant Waikiki/CX20551-22, found
in most Toshiba P100 series laptops. Despite the growing
urban legend of "unsupported Conexant", this codec is fully
supported in this driver.
Note: Toshiba P100 has broken (acpi) BIOS, thus rendering
its soundchip useless. Please disable ACPI, or get
BIOS updates (if any).
Found/tested by: Vulpes Velox <v.velox@vvelox.net>
URL: http://lists.freebsd.org/pipermail/freebsd-multimedia/2006-September/004896.html
- Parser cleanups to handle possible oss/mixer collision. Found
after parsing Conexant Waikiki nodes.
- Increase resilience against resource failure during attach/detach.
- Implement simple config through hint.pcm.<unit>.config. Supported
options:
gpio0 (default on Acer), gpio1, gpio2, softpcmvol,
fixedrate (default), forcestereo (default)
* Option prefixed with "no" (such as "nofixedrate") will do
the opposite.
* Options can be separated using space " " or comma ",".
* The "no" option will take precedence over anything else.
Example:
hint.pcm.0.config="gpio2,nofixedrate,noforcestereo,nogpio0,softpcmvol"
hint.pcm.0.config="softpcmvol noforcestereo"
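The option rules above (space or comma separators, a "no" prefix that
negates, and "no" taking precedence) can be sketched as a small parser.
This is not the driver's parser: the CFG_* flag values and the functions
cfg_lookup and cfg_parse are hypothetical, and the defaults are passed in
by the caller.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits; the driver's own names and values may differ. */
enum {
	CFG_GPIO0	= 1u << 0,
	CFG_GPIO1	= 1u << 1,
	CFG_GPIO2	= 1u << 2,
	CFG_SOFTPCMVOL	= 1u << 3,
	CFG_FIXEDRATE	= 1u << 4,
	CFG_FORCESTEREO	= 1u << 5
};

static const struct {
	const char *name;
	unsigned int flag;
} cfg_options[] = {
	{ "gpio0", CFG_GPIO0 },
	{ "gpio1", CFG_GPIO1 },
	{ "gpio2", CFG_GPIO2 },
	{ "softpcmvol", CFG_SOFTPCMVOL },
	{ "fixedrate", CFG_FIXEDRATE },
	{ "forcestereo", CFG_FORCESTEREO }
};

/* Map an option name of the given length to its flag, or 0 if unknown. */
static unsigned int
cfg_lookup(const char *s, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(cfg_options) / sizeof(cfg_options[0]); i++) {
		if (strlen(cfg_options[i].name) == len &&
		    strncmp(cfg_options[i].name, s, len) == 0)
			return (cfg_options[i].flag);
	}
	return (0);
}

/*
 * Parse a config string. Enabled and disabled options are collected
 * separately so that a "no" option wins regardless of its position.
 */
static unsigned int
cfg_parse(const char *cfg, unsigned int defaults)
{
	unsigned int on = 0, off = 0;
	const char *p = cfg;

	while (*p != '\0') {
		size_t len = strcspn(p, " ,");

		if (len > 2 && strncmp(p, "no", 2) == 0)
			off |= cfg_lookup(p + 2, len - 2);
		else
			on |= cfg_lookup(p, len);
		p += len;
		p += strspn(p, " ,");	/* skip separators */
	}
	return ((defaults | on) & ~off);
}
```

Keeping the "off" set separate and applying it last is one simple way to
make the precedence rule independent of option order.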
read requests to its consumer. It has been developed to address
the problem of a horrible read performance of a 64k blocksize FS
residing on a RAID3 array with 8 data components, where a single
disk component would only get 8k read requests, thus effectively
killing disk performance under high load. Documentation will be
provided later. I'd like to thank Vsevolod Lobko for his bright
ideas, and Pawel Jakub Dawidek for helping me fix the nasty bug.
calls are not used by libthr in RELENG_6 and HEAD; they are only used by
the libthr in RELENG_5. The _umtx_op system call can take on more work
incrementally than these two system calls, without having to introduce new
system calls or throw away old ones as things evolve.
unsuspecting users.
- Add a comment in NOTES about experimental status of SCHED_ULE.
- Make warning about experimental status in sched_ule(4) a bit
stronger.
Suggested and reviewed by: dougb
Discussed on: developers
MFC after: 3 days
or not the OS has to wait for RX_RDY or TX_RDY to be set before the OS sets
the control code in the control/status register. Looking at the interface
design, it seems that RX_RDY and TX_RDY are probably there to protect
access to the data register and have nothing to do with the control/status
register. Nevertheless, try to take what I think is the more conservative
approach and always wait for the appropriate [TR]X_RDY flag to be set
before writing any of the WR_NEXT, WR_END, RD_START, or RD_NEXT control
codes to the control/status register.
commits. For some reason I thought the scale factor was a shift count
rather than the multiplicand (that is, I thought leal (%eax,%edx,4) was
going to generate %eax + %edx << 4 rather than %eax + %edx * 4). What
I need is to multiply by 16 to convert a real-mode (seg, offset) tuple
into a flat address. However, the max multiplicand for scaled/index
addressing on i386 is 8, so go back to using a shl and an add.
- Convert two more inter-register mov instructions where we don't need to
preserve the source register to xchg instructions to keep our space
savings.
Tested by: Ian FREISLICH if at hetzner.co.za
MFC after: 1 week
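The address computation discussed above, written out in C for reference.
The function name real_to_flat is hypothetical; the arithmetic is the
standard real-mode rule of segment times 16 plus offset, which needs a
shift by 4 because scaled-index addressing only multiplies by up to 8.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a real-mode (segment, offset) pair into a flat address:
 * segment * 16 + offset, i.e. a shift left by 4 followed by an add.
 */
static uint32_t
real_to_flat(uint16_t seg, uint16_t off)
{
	return (((uint32_t)seg << 4) + off);
}
```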
Move the relocation definitions to the common elf header so that DTrace
can use them on one architecture targeted to a different one.
Add the additional ELF types defines in Sun's "Linker and Libraries"
manual.
there are enough places in the DTrace kernel/module sources that
having a header that gathers together all the individual elf headers
is convenient.
Note that the Solaris compatibility definitions are conditionally
included iff _SOLARIS_C_SOURCE is defined.
by Sun's CDDL and this file is only intended for inclusion where
_SOLARIS_C_SOURCE is defined (with the assumption that the code
being compiled is licensed under the CDDL too).
unmount when mp structure is reused while waiting for coveredvp lock.
Introduce struct mount generation count, increment it on each reuse and
compare the generations before and after obtaining the coveredvp lock.
Reviewed by: tegge, pjd
Approved by: pjd (mentor)
MFC after: 2 weeks
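The generation check can be modeled in a few lines. This is a sketch, not
the kernel change itself: struct mount_slot, slot_reuse,
lock_covered_checked and the two acquire helpers are hypothetical, and the
blocking lock acquisition is replaced by a callback so the race can be
simulated deterministically.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mount structure with a generation count. */
struct mount_slot {
	unsigned int gen;	/* incremented each time the slot is reused */
};

static void
slot_reuse(struct mount_slot *mp)
{
	mp->gen++;
}

typedef void (*lock_fn)(struct mount_slot *);

/*
 * Record the generation, acquire the lock (which may sleep), then verify
 * that the structure was not recycled in the meantime. Returns false if
 * the caller must back out because the structure was reused.
 */
static bool
lock_covered_checked(struct mount_slot *mp, lock_fn acquire)
{
	unsigned int gen_before;

	assert(mp != NULL && acquire != NULL);
	gen_before = mp->gen;
	acquire(mp);
	return (mp->gen == gen_before);
}

/* Simulated lock acquisition with no concurrent activity. */
static void
acquire_uncontended(struct mount_slot *mp)
{
	(void)mp;
}

/* Simulated lock acquisition during which the slot is reused. */
static void
acquire_with_reuse(struct mount_slot *mp)
{
	slot_reuse(mp);
}
```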
close and re-open the default pipe instead of relying on the host
controller driver to notice the changes. Remove the unreliable code
that attempted to update these fields while the pipe was active.
This fixes a case where the hardware could cache and continue to
use the old address, resulting in a "getting first desc failed"
error.
PR: usb/103167
- Fix support for ASUS M5200ae (buggy BIOS)
- Fix few problems, reported by Coverity Prevent (TM).
CID: 246991, 246676, 246675, 246674, 246477
Found by: Coverity Prevent (TM)
The parallel LINT build sometimes broke if kernel-depend wasn't
fast enough in generating ukbdmap.h. If someone thinks this
option would still be useful for the module, a proper fix is
to add the code generating ukbdmap.h into modules/ukbd/Makefile
and backing this change out.
Split subr_clock.c in two parts (by repo-copy):
subr_clock.c contains generic RTC and calendaric stuff, etc.
subr_rtc.c contains the newbus'ified RTC interface.
Centralize the machdep.{adjkerntz,disable_rtc_set,wall_cmos_clock}
sysctls and associated variables into subr_clock.c. They are
not machine dependent and we have generic code that relies on them being
present so they are not even optional.
and CAM_RESRC_UNAVAIL returns. Delay a tunable amount for
either between retries.
This came up because the MPT IOC was returning "IOC out of
resources" for some user and this caused a CAM_RESRC_UNAVAIL
return. Putting a bit of delay between retries helped them
out.
There was some discussion that an async event should be used
to clear CAM_RESRC_UNAVAIL. That's probably a better notion
eventually.
Reviewed by: scsi@freebsd.org (ade, scott)
MFC after: 1 week
Add support for Intel High Definition Audio Controller.
This driver makes a special guarantee that "playback" works
on the majority of hardware with minimal or no vendor-specific
quirks.
This driver is a product of collaborative effort made by:
Stephane E. Potvin <sepotvin@videotron.ca>
Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Wesley Morgan <morganw@chemikals.org>
Daniel Eischen <deischen@FreeBSD.org>
Maxime Guillaud <bsd-ports@mguillaud.net>
Ariff Abdullah <ariff@FreeBSD.org>
....and various people from freebsd-multimedia@FreeBSD.org
Refer to snd_hda(4) for features and issues.
Welcome To HDA.
Sponsored by: Defenxis Sdn. Bhd.
- fix multiple initialization of the first codec (support for more than
one codec should be added in the future)
- use spicds instead of ak452x module
Submitted by: "Konstantin Dimitrov" <kosio.dimitrov@gmail.com>