aio_fsync(O_DSYNC, ...) is the asynchronous version of fdatasync(2).
Reviewed by: kib, asomers, jhb
Differential Review: https://reviews.freebsd.org/D25071
POSIX AIO is great, but it lacks vectored I/O functions. This commit
fixes that shortcoming by adding aio_writev and aio_readv. They aren't
part of the standard, but they're an obvious extension. They work just
like their synchronous equivalents pwritev and preadv.
It isn't yet possible to use vectored aiocbs with lio_listio, but that
could be added in the future.
Reviewed by: jhb, kib, bcr
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D27743
The 'physio' fast-path used by AIO requests on md(4) devices, is not
gated on the unsafe_aio knob. Prior to r327755, some AIO requests could
fail the fast-path and fall back to the slow-path (requests for devices
not supporting unmapped I/O and requests which failed with EFAULT during
the fast-path). However, those cases now return a suitable error rather
than using the slow-path.
PR: 217261
Reviewed by: asomers
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D14742
Remove aio_test's legacy timeout handling and cleanup routines. Instead,
use ATF's builtin capabilities. ATF automatically cleans up newly created
files, too, so we don't have to explicitly unlink them. The only tests than
need a cleanup routine are the md(4) tests, which must destroy their md
device.
Reviewed by: jhb
MFC after: 3 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D11468
These tests have been flapping (failing<->passing) on Jenkins for months.
It passes reliably for me if unsafe AIO is permitted, but it doesn't
pass on Jenkins reliably if unsafe AIO is disabled (the current default).
Mark the tests as requiring unsafe AIO to mitigate the intermittent
failures when unsafe AIO isn't permitted. If the kernel code is changed
to reliably function with md(4) devices using unsafe AIO, this commit can
be reverted.
MFC after: 2 months
PR: 217261
Reported by: Jenkins
Sponsored by: Dell EMC Isilon
* Add tests for aio_suspend(2).
* Add tests for polled completion notification.
* Test the full matrix of file descriptor types and completion notification
mechanisms.
* Don't bother with mkstemp, because ATF runs every test in its own temp dir.
* Fix some typos.
* Remove extraneous ATF_REQUIRE_KERNEL_MODULE calls.
Reviewed by: jhb
MFC after: 4 weeks
Differential Revision: https://reviews.freebsd.org/D9045
- Use correct lock in aio_cancel_sync when dequeueing job.
- Add _locked variants of aio_set/clear_cancel_function and use those
to avoid lock recursion when adding and removing fsync jobs to the
per-process sync queue.
- While here, add a basic test for aio_fsync().
PR: 211390
Reported by: Randy Westlund <rwestlun@gmail.com>
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D7339
File and disk-backed I/O requests store counts of read/written disk
blocks in each AIO job so that they can be charged to the thread that
completes an AIO request via aio_return() or aio_waitcomplete(). This
change extends AIO jobs to store counts of received/sent messages and
updates socket backends to set these counts accordingly. Note that
the socket backends are careful to only charge a single messages for
each AIO request even though a single request on a blocking socket might
invoke sosend or soreceive multiple times. This is to mimic the
resource accounting of synchronous read/write.
Adjust the UNIX socketpair AIO test to verify that the message resource
usage counts update accordingly for aio_read and aio_write.
Approved by: re (hrs)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D6911
After the previous changes to fix requests on blocking sockets to complete
across multiple operations, an edge case exists where a request can be
cancelled after it has partially completed. POSIX doesn't appear to
dictate exactly how to handle this case, but in general I feel that
aio_cancel() should arrange to cancel any request it can, but that any
partially completed requests should return a partial completion rather
than ECANCELED. To that end, fix the socket AIO cancellation routine to
return a short read/write if a partially completed request is cancelled
rather than ECANCELED.
Sponsored by: Chelsio Communications
Always requeue an AIO job at the head of the socket buffer's queue if
sosend() or soreceive() returns EWOULDBLOCK on a blocking socket.
Previously, requests were only requeued if they returned EWOULDBLOCK
and completed no data. Now after a partial completion on a blocking
socket the request is queued and the remaining request is retried when
the socket is ready. This allows writes larger than the currently
available space on a blocking socket to fully complete. Reads on a
blocking socket that satifsy the low watermark can still return a short
read (same as read()).
In order to track previously completed data, the internal 'status'
field of the AIO job is used to store the amount of previously
computed data.
Non-blocking sockets continue to return short completions for both
reads and writes.
Add a test for a "large" AIO write on a blocking socket that writes
twice the socket buffer size to a UNIX domain socket.
Sponsored by: Chelsio Communications
The older AIO code awakened all pending AIO requests on a socket
when any data arrived. This could result in AIO daemons blocking on
an empty socket buffer. These requests could not be cancelled
which led to a deadlock during process exit. This test reproduces
this case. The newer AIO code is able to cancel the pending AIO
request correctly.
Reviewed by: ngie (-ish)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D4363
The large read test uses an empty file created via mkstemp() rather than
/dev/null as character devices are subject to two different clamping
sysctls. However, I forgot to update some of the error messages after
changing to mkstemp() that were still referring to /dev/null.
First, update the return types of aio_return() and aio_waitcomplete() to
ssize_t.
POSIX requires aio_return() to return a ssize_t so that it can represent
all return values from read() and write(). aio_waitcomplete() should use
ssize_t for the same reason.
aio_return() has used ssize_t in <aio.h> since r31620 but the manpage and
system call entry were not updated. aio_waitcomplete() has always
returned int.
Note that this does not require new system call stubs as this is
effectively only an API change in how the compiler interprets the return
value.
Second, allow aio_nbytes values up to IOSIZE_MAX instead of just INT_MAX.
aio_read/write should now honor the same length limits as normal read/write.
Third, use longs instead of ints in the aio_return() and aio_waitcomplete()
system call functions so that the 64-bit size_t in the in-kernel aiocb
isn't truncated to 32-bits before being copied out to userland or
being returned.
Finally, a simple test has been added to verify the bounds checking on the
maximum read size from a file.
improve cancellation robustness.
Introduce a new file operation, fo_aio_queue, which is responsible for
queueing and completing an asynchronous I/O request for a given file.
The AIO subystem now exports library of routines to manipulate AIO
requests as well as the ability to run a handler function in the
"default" pool of AIO daemons to service a request.
A default implementation for file types which do not include an
fo_aio_queue method queues requests to the "default" pool invoking the
fo_read or fo_write methods as before.
The AIO subsystem permits file types to install a private "cancel"
routine when a request is queued to permit safe dequeueing and cleanup
of cancelled requests.
Sockets now use their own pool of AIO daemons and service per-socket
requests in FIFO order. Socket requests will not block indefinitely
permitting timely cancellation of all requests.
Due to the now-tight coupling of the AIO subsystem with file types,
the AIO subsystem is now a standard part of all kernels. The VFS_AIO
kernel option and aio.ko module are gone.
Many file types may block indefinitely in their fo_read or fo_write
callbacks resulting in a hung AIO daemon. This can result in hung
user processes (when processes attempt to cancel all outstanding
requests during exit) or a hung system. To protect against this, AIO
requests are only permitted for known "safe" files by default. AIO
requests for all file types can be enabled by setting the new
vfs.aio.enable_usafe sysctl to a non-zero value. The AIO tests have
been updated to skip operations on unsafe file types if the sysctl is
zero.
Currently, AIO requests on sockets and raw disks are considered safe
and are enabled by default. aio_mlock() is also enabled by default.
Reviewed by: cem, jilles
Discussed with: kib (earlier version)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D5289