freebsd-dev

Author	SHA1	Message	Date
Matt Macy	cbd92ce62e	Eliminate the overhead of gratuitous repeated reinitialization of cap_rights - Add macros to allow preinitialization of cap_rights_t. - Convert most commonly used code paths to use preinitialized cap_rights_t. A 3.6% speedup in fstat was measured with this change. Reported by: mjg Reviewed by: oshogbo Approved by: sbruno MFC after: 1 month	2018-05-09 18:47:24 +00:00
Alan Somers	52c0983128	lio_listio: return EAGAIN instead of EIO when out of resources This behavior is already documented by the man page, and suggested by POSIX. Reviewed by: jhb MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D15099	2018-04-16 18:12:15 +00:00
Brooks Davis	6469bdcdb6	Move most of the contents of opt_compat.h to opt_global.h. opt_compat.h is mentioned in nearly 180 files. In-progress network driver compatibility improvements may add over 100 more so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options. Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensures opt_compat.h is created on all architectures. Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files. Reviewed by: kib, cem, jhb, jtl Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14941	2018-04-06 17:35:35 +00:00
John Baldwin	86bbef4379	Don't store shadow copies of per-process AIO limits. Previously the AIO subsystem would save a snapshot of the currently configured per-process limits the first time a process used AIO. The process would continue to use the snapshotted limits ignoring any changes to the global limits during the rest of its lifetime. This change removes the snapshotted values and changes the AIO code to always check the global values which can be toggled at runtime. This means an administrator can now change the effective limits of existing processes. This is more consistent with how other limits configured via sysctl work in FreeBSD. Reviewed by: asomers, kib MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D13819	2018-01-10 21:18:46 +00:00
John Baldwin	f54c5606b3	Allow the fast-path for disk AIO requests to fail requests. - If aio_qphysio() returns a non-zero error code, fail the request rather than queueing it to the AIO kproc pool to be retried via the slow path. Currently this means that if vm_fault_quick_hold_pages() reports an error, EFAULT is returned from the fast-path rather than retrying the request in the slow path where it will still fail with EFAULT. - If aio_qphysio() wishes to use the fast path for a device that doesn't support unmapped I/O but there are already the maximum number of such requests in flight, fail with EAGAIN as we do for other AIO resource limits rather than queueing the request to the AIO kproc pool. - Move the opcode check for aio_qphysio() out of the caller and into aio_qphysio() to simplify some logic and remove two goto's while here. It also uses a whitelist (only supported for LIO_READ / LIO_WRITE) rather than a blacklist (skipped for LIO_SYNC). PR: 217261 Submitted by: jkim (an earlier version) MFC after: 2 weeks Sponsored by: Chelsio Communications	2018-01-10 00:18:47 +00:00
John Baldwin	7e40918452	Simplify some logic by merging an if test with a subsequent switch. Specifically, in aio_queue_file() the code was doing this: if (opcode == LIO_SYNC) { ... } switch (opcode) { ... case LIO_SYNC: ... } This moves the body of the if statement into the LIO_SYNC case of the switch statement. MFC after: 2 weeks Sponsored by: Chelsio Communications	2018-01-10 00:02:06 +00:00
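The restructuring described in 7e40918452 can be sketched in isolation. This is a minimal illustration, not the code from aio_queue_file(): the `sk_op` opcodes, the `sync_setup` counter, and the return values are all hypothetical stand-ins for the LIO_* handling.

```c
/* Hypothetical opcodes standing in for the LIO_* constants. */
enum sk_op { SK_READ, SK_WRITE, SK_SYNC };

/* Before: a separate if test precedes the switch on the same opcode. */
static int
sk_queue_before(enum sk_op op, int *sync_setup)
{
	if (op == SK_SYNC)
		(*sync_setup)++;	/* sync-only preparation */

	switch (op) {
	case SK_READ:
	case SK_WRITE:
		return (1);		/* treated as an I/O job */
	case SK_SYNC:
		return (2);		/* treated as a sync job */
	}
	return (-1);
}

/* After: the body of the if statement lives in the SK_SYNC case. */
static int
sk_queue_after(enum sk_op op, int *sync_setup)
{
	switch (op) {
	case SK_READ:
	case SK_WRITE:
		return (1);
	case SK_SYNC:
		(*sync_setup)++;	/* same preparation, now in one place */
		return (2);
	}
	return (-1);
}
```

Both functions behave identically for every opcode; the second simply keeps all sync-specific logic in a single branch.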
John Baldwin	8091e52b42	Add a counter to track in-flight AIO requests using unmapped I/O. MFC after: 2 weeks Sponsored by: Chelsio Communications	2018-01-09 23:57:29 +00:00
Alexander Kabaev	151ba7933a	Do pass removing some write-only variables from the kernel. This reduces noise when kernel is compiled by newer GCC versions, such as one used by external toolchain ports. Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial) Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c) Differential Revision: https://reviews.freebsd.org/D10385	2017-12-25 04:48:39 +00:00
Pedro F. Giffuni	8a36da99de	sys/kern: adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.	2017-11-27 15:20:12 +00:00
Alan Somers	df485bdb3c	Fix aio_suspend in 32-bit emulation An off-by-one error has been present since the system call was first present in 185878. It additionally became a memory corruption bug after change 324941. The failure is actually revealed by our existing AIO tests. However, apparently nobody's been running those in 32-bit emulation mode. Reported by: Coverity, cem CID: 1382114 MFC after: 18 days X-MFC-With: 324941 Sponsored by: Spectra Logic Corp	2017-10-26 19:45:15 +00:00
Alan Somers	913b932900	Remove artificial restriction on lio_listio's operation count In r322258 I made p1003_1b.aio_listio_max a tunable. However, further investigation shows that there was never any good reason for that limit to exist in the first place. It's used in two completely different ways: * To size a UMA zone, which globally limits the number of concurrent aio_suspend calls. * To artificially limit the number of operations in a single lio_listio call. There doesn't seem to be any memory allocation associated with this limit. This change does two things: * Properly names aio_suspend's UMA zone, and sizes it based on a new constant. * Eliminates the artificial restriction on lio_listio. Instead, lio_listio calls will now be limited by the more generous max_aio_queue_per_proc. The old p1003_1b.aio_listio_max is now an alias for vfs.aio.max_aio_queue_per_proc, so sysconf(3) will still work with _SC_AIO_LISTIO_MAX. Reported by: bde Reviewed by: jhb MFC after: 3 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D12120	2017-10-23 23:12:01 +00:00
Alan Somers	c45796d54e	Make p1003_1b.aio_listio_max a tunable p1003_1b.aio_listio_max is now a tunable. Its value is reflected in the sysctl of the same name, and the sysconf(3) variable _SC_AIO_LISTIO_MAX. Its value will be bounded from below by the compile-time constant AIO_LISTIO_MAX and from above by the compile-time constant MAX_AIO_QUEUE_PER_PROC and the tunable vfs.aio.max_aio_queue. Reviewed by: jhb, kib MFC after: 3 weeks Relnotes: yes Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D11601	2017-08-08 16:14:31 +00:00
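The bounding rule in c45796d54e amounts to a clamp. The sketch below assumes placeholder values for the two compile-time constants and a hypothetical helper name; the commit message does not say which bound wins if the upper limit falls below the lower one, so applying the lower bound last is an assumption made here.

```c
/* Placeholder values; the real constants are defined in the kernel sources. */
#define SK_AIO_LISTIO_MAX		16
#define SK_MAX_AIO_QUEUE_PER_PROC	256

/*
 * Hypothetical helper: clamp a requested p1003_1b.aio_listio_max value.
 * Upper bound: the smaller of SK_MAX_AIO_QUEUE_PER_PROC and the
 * vfs.aio.max_aio_queue tunable.  Lower bound: SK_AIO_LISTIO_MAX.
 */
static int
sk_clamp_listio_max(int requested, int max_aio_queue)
{
	int upper = SK_MAX_AIO_QUEUE_PER_PROC;

	if (max_aio_queue < upper)
		upper = max_aio_queue;
	if (requested > upper)
		requested = upper;
	/* Assumption: the lower bound is applied last and therefore wins. */
	if (requested < SK_AIO_LISTIO_MAX)
		requested = SK_AIO_LISTIO_MAX;
	return (requested);
}
```

With these placeholder constants, a request of 1000 with max_aio_queue set to 1024 is reduced to 256, and a request of 4 is raised to 16.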
Konstantin Belousov	711dba24d7	Allow negative aio_offset only for the read and write LIO ops on device nodes. Otherwise, the current check of aio_offset == -1LL makes it possible to pass negative file offsets down to the filesystems. This trips assertions and is even unsafe for e.g. FFS which keeps metadata at negative offsets. Reported and tested by: pho Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D11266	2017-06-19 15:17:17 +00:00
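The check described in 711dba24d7 can be sketched as a small predicate. The function name and parameters are hypothetical, and only read and write requests are modeled; the real check lives in the kernel and works on the file's vnode type.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate for a read or write request: a negative offset
 * is accepted only when the target is a device node.  Non-negative
 * offsets are always accepted by this check.
 */
static bool
sk_rw_offset_ok(bool is_device_node, int64_t offset)
{
	if (offset >= 0)
		return (true);
	return (is_device_node);
}
```

The point of the change is that a regular file never sees a negative offset, so filesystems such as FFS that keep metadata at negative offsets are not exposed to user-supplied values there.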
Konstantin Belousov	2b34e84335	Add abstime kqueue(2) timers and expand struct kevent members. This change implements NOTE_ABSTIME flag for EVFILT_TIMER, which specifies that the data field contains absolute time to fire the event. To make this useful, data member of the struct kevent must be extended to 64bit. Using the opportunity, I also added ext members. This changes struct kevent almost to Apple struct kevent64, except I did not change the type of ident and udata; the latter would cause serious API incompatibilities. The type of ident was kept uintptr_t since EVFILT_AIO returns a pointer in this field, and e.g. CHERI is sensitive to the type (discussed with brooks, jhb). Unlike Apple kevent64, symbol versioning allows us to claim ABI compatibility and still name the new syscall kevent(2). Compat shims are provided for both host native and compat32. Requested by: bapt Reviewed by: bapt, brooks, ngie (previous version) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D11025	2017-06-17 00:57:26 +00:00
Konstantin Belousov	496ab0532d	Rework r313352. Rename kern_vm_* functions to kern_*. Move the prototypes to syscallsubr.h. Also change Mach VM types to uintptr_t/size_t as needed, to avoid headers pollution. Requested by: alc, jhb Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D9535	2017-02-13 09:04:38 +00:00
Konstantin Belousov	e2a18110f0	Remove duplicated code. aio_aqueue() calls aio_init_aioinfo() as the first action. There is no need to duplicate the code in kern_aio_fsync(). Also fix indent for aio_aqueue() definition. Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D7523	2016-08-17 10:14:22 +00:00
John Baldwin	005ce8e4e6	Fix locking issues with aio_fsync(). - Use correct lock in aio_cancel_sync when dequeueing job. - Add _locked variants of aio_set/clear_cancel_function and use those to avoid lock recursion when adding and removing fsync jobs to the per-process sync queue. - While here, add a basic test for aio_fsync(). PR: 211390 Reported by: Randy Westlund <rwestlun@gmail.com> MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D7339	2016-07-29 18:26:15 +00:00
John Baldwin	b9a53e161b	Adjust tests in fsync job scheduling loop to reduce indentation.	2016-07-27 19:31:25 +00:00
John Baldwin	9c20dc9963	Add more documentation regarding unsafe AIO requests. The asynchronous I/O changes made previously result in different behavior out of the box. Previously all AIO requests failed with ENOSYS / SIGSYS unless aio.ko was explicitly loaded. Now, some AIO requests complete and others ("unsafe" requests) fail with EOPNOTSUPP. Reword the introductory paragraph in aio(4) to add a general description of AIO before describing the vfs.aio.enable_unsafe sysctl. Remove the ENOSYS error description from aio_fsync(2), aio_read(2), and aio_write(2) and replace it with a description of EOPNOTSUPP. Remove the ENOSYS error description from aio_mlock(2). Log a message to the system log the first time a process requests an "unsafe" AIO request that fails with EOPNOTSUPP. This is modeled on the log message used for processes using the legacy pty devices. Reviewed by: kib (earlier version) MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D7151	2016-07-21 22:49:47 +00:00
Konstantin Belousov	9fe297bbdc	Declare aio requests on files from local filesystems safe. Two notes: - I allow AIO on reclaimed vnodes, since it is deterministically terminated fast. - devfs mounts are marked as MNT_LOCAL, but device vnodes have type VCHR, so the slow device io is not allowed. Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D7273	2016-07-21 17:07:06 +00:00
John Baldwin	b1012d8036	Account for AIO socket operations in thread/process resource usage. File and disk-backed I/O requests store counts of read/written disk blocks in each AIO job so that they can be charged to the thread that completes an AIO request via aio_return() or aio_waitcomplete(). This change extends AIO jobs to store counts of received/sent messages and updates socket backends to set these counts accordingly. Note that the socket backends are careful to only charge a single messages for each AIO request even though a single request on a blocking socket might invoke sosend or soreceive multiple times. This is to mimic the resource accounting of synchronous read/write. Adjust the UNIX socketpair AIO test to verify that the message resource usage counts update accordingly for aio_read and aio_write. Approved by: re (hrs) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D6911	2016-06-21 22:19:06 +00:00
John Baldwin	fe0bdd1d2c	Move backend-specific fields of kaiocb into a union. This reduces the size of kaiocb slightly. I've also added some generic fields that other backends can use in place of the BIO-specific fields. Change the socket and Chelsio DDP backends to use 'backend3' instead of abusing _aiocb_private.status directly. This confines the use of _aiocb_private to the AIO internals in vfs_aio.c. Reviewed by: kib (earlier version) Approved by: re (gjb) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D6547	2016-06-15 20:56:45 +00:00
John Baldwin	f0ec174043	Consistently set status to -1 when completing an AIO request with an error. Sponsored by: Chelsio Communications	2016-05-20 19:46:25 +00:00
John Baldwin	4d805eacfa	Tidy up the unmapped I/O code in qphysio. - Move some blocks around to reduce the number of 'if (unmap)' checks. - Use 'pbuf == NULL' instead of 'unmap'. - Use nitems. - Pull an assignment out of an if expression. Reviewed by: kib Sponsored by: Chelsio Communications	2016-03-31 17:27:30 +00:00
John Baldwin	bb430bc740	Fully handle size_t lengths in AIO requests. First, update the return types of aio_return() and aio_waitcomplete() to ssize_t. POSIX requires aio_return() to return a ssize_t so that it can represent all return values from read() and write(). aio_waitcomplete() should use ssize_t for the same reason. aio_return() has used ssize_t in <aio.h> since r31620 but the manpage and system call entry were not updated. aio_waitcomplete() has always returned int. Note that this does not require new system call stubs as this is effectively only an API change in how the compiler interprets the return value. Second, allow aio_nbytes values up to IOSIZE_MAX instead of just INT_MAX. aio_read/write should now honor the same length limits as normal read/write. Third, use longs instead of ints in the aio_return() and aio_waitcomplete() system call functions so that the 64-bit size_t in the in-kernel aiocb isn't truncated to 32-bits before being copied out to userland or being returned. Finally, a simple test has been added to verify the bounds checking on the maximum read size from a file.	2016-03-21 21:37:33 +00:00
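The truncation that bb430bc740 guards against can be shown with plain integer arithmetic. This sketch uses hypothetical helper names and unsigned types so that the narrowing is well defined in C; it does not reproduce the system call code.

```c
#include <stdint.h>

/* Hypothetical: pass a 64-bit byte count through a 32-bit variable. */
static uint64_t
sk_count_via_32bit(uint64_t nbytes)
{
	uint32_t narrow = (uint32_t)nbytes;	/* upper 32 bits are dropped */

	return (narrow);
}

/* Hypothetical: keep the count in a 64-bit variable end to end. */
static uint64_t
sk_count_via_64bit(uint64_t nbytes)
{
	return (nbytes);
}
```

A count of 4 GiB plus 5 bytes comes back as 5 through the narrow path and unchanged through the wide path, which is why the commit switches the system call functions from ints to longs.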
Pedro F. Giffuni	5166fdde7f	aio_qphysio(): Avoid uninitialized pointer read on error. For the !unmap case it may happen that pbuf gets called unreferenced when vm_fault_quick_hold_pages() fails. Initialize it so it doesn't cause trouble. CID: 1352776 Reviewed by: jhb MFC after: 1 week	2016-03-18 19:04:01 +00:00
John Baldwin	399e8c1773	Simplify AIO initialization now that it is standard. - Mark AIO system calls as STD and remove the helpers to dynamically register them. - Use COMPAT6 for the old system calls with the older sigevent instead of an 'o' prefix. - Simplify the POSIX configuration to note that AIO is always available. - Handle AIO in the default VOP_PATHCONF instead of special casing it in the pathconf() system call. fpathconf() is still hackish. - Remove freebsd32_aio_cancel() as it just called the native one directly. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D5589	2016-03-09 19:05:11 +00:00
John Baldwin	f3215338ef	Refactor the AIO subsystem to permit file-type-specific handling and improve cancellation robustness. Introduce a new file operation, fo_aio_queue, which is responsible for queueing and completing an asynchronous I/O request for a given file. The AIO subsystem now exports library of routines to manipulate AIO requests as well as the ability to run a handler function in the "default" pool of AIO daemons to service a request. A default implementation for file types which do not include an fo_aio_queue method queues requests to the "default" pool invoking the fo_read or fo_write methods as before. The AIO subsystem permits file types to install a private "cancel" routine when a request is queued to permit safe dequeueing and cleanup of cancelled requests. Sockets now use their own pool of AIO daemons and service per-socket requests in FIFO order. Socket requests will not block indefinitely permitting timely cancellation of all requests. Due to the now-tight coupling of the AIO subsystem with file types, the AIO subsystem is now a standard part of all kernels. The VFS_AIO kernel option and aio.ko module are gone. Many file types may block indefinitely in their fo_read or fo_write callbacks resulting in a hung AIO daemon. This can result in hung user processes (when processes attempt to cancel all outstanding requests during exit) or a hung system. To protect against this, AIO requests are only permitted for known "safe" files by default. AIO requests for all file types can be enabled by setting the new vfs.aio.enable_unsafe sysctl to a non-zero value. The AIO tests have been updated to skip operations on unsafe file types if the sysctl is zero. Currently, AIO requests on sockets and raw disks are considered safe and are enabled by default. aio_mlock() is also enabled by default. Reviewed by: cem, jilles Discussed with: kib (earlier version) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D5289	2016-03-01 18:12:14 +00:00
John Baldwin	5652770d8f	Rename aiocblist to kaiocb and use consistent variable names. Typically <foo>list is used for a structure that holds a list head in FreeBSD, not for members of a list. As such, rename 'struct aiocblist' to 'struct kaiocb' (the kernel version of 'struct aiocb'). While here, use more consistent variable names for AIO control blocks: - Use 'job' instead of 'aiocbe', 'cb', 'cbe', or 'iocb' for kernel job objects. - Use 'jobn' instead of 'cbn' for use with TAILQ_FOREACH_SAFE(). - Use 'sjob' and 'sjobn' instead of 'scb' and 'scbn' for fsync jobs. - Use 'ujob' instead of 'aiocbp', 'job', 'uaiocb', or 'uuaiocb' to hold a user pointer to a 'struct aiocb'. - Use 'ujobp' instead of 'aiocbp' for a user pointer to a 'struct aiocb *'. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D5125	2016-02-05 20:38:09 +00:00
John Baldwin	0dd6c0352b	Various style fixes. - Wrap long lines. - Fix indentation. - Remove excessive parens. - Whitespace fixes in struct definitions. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D5025	2016-01-26 21:24:49 +00:00
John Baldwin	39314b7d99	AIO daemons have always been kernel processes to facilitate switching to user VM spaces while servicing jobs. Update various comments and data structures that refer to AIO daemons as threads to refer to processes instead. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4999	2016-01-21 02:20:38 +00:00
John Baldwin	4429f0e2e3	Remove unused variables for socket AIO. In r55943, a per-process queue of pending socket AIO requests (requests waiting for the socket to become ready) was added so that they could be cancelled during process rundown. In r154765, the rundown code was changed to handle jobs in this state (JOBST_JOBQSOCK) directly removing the need for the extra queue. However, the per-process queue head and global lock were never removed. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4997	2016-01-21 01:28:31 +00:00
John Baldwin	8a4dc40ff4	Various cleanups to the main function for AIO kernel processes: - Pull the vmspace logic out into helper functions and reduce duplication. Operations on the vmspace are all isolated to vm_map.c, but it now exports a new 'vmspace_switch_aio' for use by AIO kernel processes. - When an AIO kernel process wants to exit, break out of the main loop and perform cleanup after the loop end. This reduces a lot of indentation and allows cleanup to more closely mirror setup actions before the loop starts. - Convert a DIAGNOSTIC to KASSERT(). - Replace mycp with more typical 'p'. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4990	2016-01-19 21:37:51 +00:00
John Baldwin	f2e7f06a0d	Don't create a dedicated session for each AIO kernel process. This code dates back to the initial AIO support and the commit log does not explain why it is needed. However, I cannot find anything in the AIO code or the various file methods (fo_read/fo_write) that would change behavior due to using a private session instead of proc0's session. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4988	2016-01-19 20:46:30 +00:00
John Baldwin	6c8fd02283	Remove aiod_timeout. It hasn't been used since the AIO code was made MPSAFE 10 years ago. Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4946	2016-01-14 21:28:56 +00:00
John Baldwin	c85650cacc	Rename aiod_bio taskqueue to aiod_kick. This taskqueue is not used to handle bio requests. It is only used to run aio_kick_nowait() to spin up new aio daemon processes. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D4904	2016-01-14 20:51:48 +00:00
Pawel Jakub Dawidek	38d68e2d42	The aio_waitcomplete(2) syscall should not sleep when the given timeout is 0. Without this change it was sleeping for one tick. Maybe not a big deal, but it makes share/dtrace/blocking script to report that. Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D3814 Sponsored by: Wheel Systems, http://wheelsystems.com	2015-10-25 18:48:09 +00:00
Konstantin Belousov	9889bbac23	Mutex memory is not zeroed, add MTX_NEW. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-07-06 14:09:00 +00:00
Mateusz Guzik	f131759f54	fd: make 'rights' a mandatory argument to fget* functions	2015-07-05 19:05:16 +00:00
Alexander Motin	f743d981f3	Make AIO to not allocate pbufs for unmapped I/O like r281825. While there, make few more performance optimizations. On 40-core system doing many 512-byte AIO reads from array of raw SSDs this change removes lock congestions inside pbuf allocator and devfs, and bottleneck on single AIO completion taskqueue thread. It improves peak AIO performance from ~600K to ~1.3M IOPS. MFC after: 2 weeks	2015-04-22 18:11:34 +00:00
Mateusz Guzik	e015b1ab0a	Avoid dynamic syscall overhead for statically compiled modules. The kernel tracks syscall users so that modules can safely unregister them. But if the module is not unloadable or was compiled into the kernel, there is no need to do this. Achieve this by adding SY_THR_STATIC_KLD macro which expands to SY_THR_STATIC during kernel build and 0 otherwise. Reviewed by: kib (previous version) MFC after: 2 weeks	2014-10-26 19:42:44 +00:00
Robert Watson	4a14441044	Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h. MFC after: 3 weeks	2014-03-16 10:55:57 +00:00
Pawel Jakub Dawidek	96a62209fb	The fget() function now takes pointer to cap_rights_t, so change 0 to NULL.	2013-09-05 11:59:23 +00:00
Pawel Jakub Dawidek	7008be5bd7	Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way. The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough. The structure definition looks like this: struct cap_rights { uint64_t cr_rights[CAP_RIGHTS_VERSION + 2]; }; The initial CAP_RIGHTS_VERSION is 0. The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements. The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future. To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg. #define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL) We still support aliases that combine few rights, but the rights have to belong to the same array element, eg: #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL) #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL) #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP) There is new API to manage the new cap_rights_t structure: cap_rights_t *cap_rights_init(cap_rights_t *rights, ...); void cap_rights_set(cap_rights_t *rights, ...); void cap_rights_clear(cap_rights_t *rights, ...); bool cap_rights_is_set(const cap_rights_t *rights, ...); bool cap_rights_is_valid(const cap_rights_t *rights); void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src); void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src); bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little); Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg: cap_rights_t rights; cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT); There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg: #define cap_rights_set(rights, ...) \ __cap_rights_set((rights), __VA_ARGS__, 0ULL) void __cap_rights_set(cap_rights_t *rights, ...); Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1: cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL); Providing several rights that belongs to the same array's element this way is correct, but is not advised. It should only be used for aliases definition. This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x. Sponsored by: The FreeBSD Foundation	2013-09-05 00:09:56 +00:00
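The one-hot index encoding described in 7008be5bd7 can be sketched as follows. The SK_ prefixed names are hypothetical, and two details are assumptions rather than statements from the commit message: that the five index bits sit at bit positions 57 to 61 (directly below the top two bits), and that the lowest of those bits denotes array index 0.

```c
#include <stdint.h>

/* Assumed layout: bits 57..61 carry a one-hot array index. */
#define SK_CAPRIGHT(idx, bit)	((1ULL << (57 + (idx))) | (bit))

#define SK_CAP_LOOKUP	SK_CAPRIGHT(0, 0x0000000000000400ULL)
#define SK_CAP_FCHMOD	SK_CAPRIGHT(0, 0x0000000000002000ULL)
#define SK_CAP_PDKILL	SK_CAPRIGHT(1, 0x0000000000000800ULL)

/*
 * Return the array index encoded in a right, or -1 when the index field
 * does not contain exactly one set bit (for example, when rights from
 * different array elements were OR-ed together).
 */
static int
sk_right_index(uint64_t right)
{
	uint64_t field = (right >> 57) & 0x1f;
	int idx;

	if (field == 0 || (field & (field - 1)) != 0)
		return (-1);
	for (idx = 0; (field & 1) == 0; idx++)
		field >>= 1;
	return (idx);
}
```

Under these assumptions, SK_CAP_FCHMOD | SK_CAP_LOOKUP still decodes to index 0, while SK_CAP_LOOKUP | SK_CAP_PDKILL sets two index bits and is rejected, which mirrors the illegal combination the commit message describes.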
Kenneth D. Merry	ce625ec719	Change the way that unmapped I/O capability is advertised. The previous method was to set the D_UNMAPPED_IO flag in the cdevsw for the driver. The problem with this is that in many cases (e.g. sa(4)) there may be some instances of the driver that can handle unmapped I/O and some that can't. The isp(4) driver can handle unmapped I/O, but the esp(4) driver currently cannot. The cdevsw is shared among all driver instances. So instead of setting a flag on the cdevsw, set a flag on the cdev. This allows drivers to indicate support for unmapped I/O on a per-instance basis. sys/conf.h: Remove the D_UNMAPPED_IO cdevsw flag and replace it with an SI_UNMAPPED cdev flag. kern_physio.c: Look at the cdev SI_UNMAPPED flag to determine whether or not a particular driver can handle unmapped I/O. geom_dev.c: Set the SI_UNMAPPED flag for all GEOM cdevs. Since GEOM will create a temporary mapping when needed, setting SI_UNMAPPED unconditionally will work. Remove the D_UNMAPPED_IO flag. nvme_ns.c: Set the SI_UNMAPPED flag on cdevs created here if NVME_UNMAPPED_BIO_SUPPORT is enabled. vfs_aio.c: In aio_qphysio(), check the SI_UNMAPPED flag on a cdev instead of the D_UNMAPPED_IO flag on the cdevsw. sys/param.h: Bump __FreeBSD_version to 1000045 for the switch from setting the D_UNMAPPED_IO flag in the cdevsw to setting SI_UNMAPPED in the cdev. Reviewed by: kib, jimharris MFC after: 1 week Sponsored by: Spectra Logic	2013-08-15 22:52:39 +00:00
Gleb Smirnoff	977c7043eb	Remove extra zeroing after M_ZERO allocation.	2013-08-02 13:06:49 +00:00
Konstantin Belousov	9731998997	Move the convert_sigevent32() utility function into freebsd32_misc.c for consumption outside the vfs_aio.c. For SIGEV_THREAD_ID and SIGEV_SIGNAL notification delivery methods, also copy in the sigev_value, since librt event pumping loop compares note generation number with the value passed through sigev_value. Tested by: Petr Salinger <Petr.Salinger@seznam.cz> Sponsored by: The FreeBSD Foundation MFC after: 1 week	2013-07-21 19:33:48 +00:00
Gleb Smirnoff	6160e12c10	Add new system call - aio_mlock(). The name speaks for itself. It allows to perform the mlock(2) operation, which can consume a lot of time, under control of aio(4). Reviewed by: kib, jilles Sponsored by: Nginx, Inc.	2013-06-08 13:27:57 +00:00
Gleb Smirnoff	f95c13db04	Separate LIO_SYNC processing into a separate function aio_process_sync(), and rename aio_process() into aio_process_rw(). Reviewed by: kib Sponsored by: Nginx, Inc.	2013-06-08 13:02:43 +00:00
Konstantin Belousov	f3215a60fd	Fix a race with the vnode reclamation in the aio_qphysio(). Obtain the thread reference on the vp->v_rdev and use the returned struct cdev *dev instead of using vp->v_rdev. Call dev_strategy_csw() instead of dev_strategy(), since we now own the reference. Since the csw was already calculated, test d_flags to avoid mapping the buffer if the driver supports unmapped requests [*]. Suggested by: kan [*] Reviewed by: kan (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-03-27 11:47:52 +00:00

307 Commits