freebsd-skq

Author	SHA1	Message	Date
trociny	3ef0ae6cd1	Fix KASSERT message. MFC after: 3 days	2012-07-03 19:08:02 +00:00
kib	53224f018a	Extend the KPI to lock and unlock f_offset member of struct file. It now fully encapsulates all accesses to f_offset, and extends f_offset locking to other consumers that need it, in particular, to lseek() and variants of getdirentries(). Ensure that on 32bit architectures f_offset, which is a 64bit quantity, is always read and written under the mtxpool protection. This fixes an apparently easy to trigger race when parallel lseek()s or lseek() and read/write could destroy file offset. The already broken ABI emulations, including iBCS and SysV, are not converted (yet). Tested by: pho No objections from: jhb MFC after: 3 weeks	2012-07-02 21:01:03 +00:00
jhb	ab100847da	Honor db_pager_quit in 'show uma' and 'show malloc'. MFC after: 1 month	2012-07-02 16:14:52 +00:00
imp	492254ade0	Remove an old hack I noticed years ago, but never committed.	2012-06-28 07:33:43 +00:00
alc	c5e6daff9d	Add new pmap layer locks to the predefined lock order. Change the names of a few existing VM locks to follow a consistent naming scheme.	2012-06-27 03:45:25 +00:00
kevlo	8473fac955	Correct sizeof usage. Obtained from: DragonFly	2012-06-25 05:41:16 +00:00
kib	c763bb1500	Move the code dealing with shared page into a dedicated kern_sharedpage.c source file from kern_exec.c. MFC after: 29 days	2012-06-23 10:15:23 +00:00
kib	497817697c	Stop updating the struct vdso_timehands from the event handler executed in the scheduled task from tc_windup(). Do it directly from tc_windup in interrupt context [1]. Establish the permanent mapping of the shared page into the kernel address space, avoiding the potential need to sleep waiting for allocation of sf buffer during vdso_timehands update. As a consequence, shared_page_write_start() and shared_page_write_end() functions are not needed anymore. Guess and memorize the pointers to native host and compat32 sysentvec during initialization, to avoid the need to get shared_page_alloc_sx lock during the update. In tc_fill_vdso_timehands(), do not loop waiting for timehands generation to stabilize, since vdso_timehands is written in the same interrupt context which wrote timehands. Requested by: mav [1] MFC after: 29 days	2012-06-23 09:33:06 +00:00
kib	7b36a08108	Implement mechanism to export some kernel timekeeping data to usermode, using shared page. The structures and functions have vdso prefix, to indicate the intended location of the code in some future. The versioned per-algorithm data is exported in the format of struct vdso_timehands, which mostly repeats the content of in-kernel struct timehands. Usermode reading of the structure can be lockless. Compatibility export for 32bit processes on 64bit host is also provided. Kernel also provides usermode with indication about currently used timecounter, so that libc can fall back to syscall if configured timecounter is unknown to usermode code. The shared data updates are initiated both from the tc_windup(), where a fast task is queued to do the update, and from sysctl handlers which change timecounter. A manual override switch kern.timecounter.fast_gettime allows turning off the mechanism. Only x86 architectures export the real algorithm data, and there, only for tsc timecounter. HPET counters page could be exported as well, but I prefer to not further glue the kernel and libc ABI there until proper vdso-based solution is developed. Minimal stubs necessary for non-x86 architectures to still compile are provided. Discussed with: bde Reviewed by: jhb Tested by: flo MFC after: 1 month	2012-06-22 07:06:40 +00:00
kib	4109c3e1ac	Enhance the shared page chunk allocator. Do not rely on the busy state of the page from which we allocate the chunk, to protect allocator state. Use statically allocated sx lock instead. Provide more flexible KPI. In particular, allow to allocate chunk without providing initial data, and allow writes into existing allocation. Allow to get an sf buf which temporarily maps the chunk, to allow sequential updates to shared page content without unmapping in between. Reviewed by: jhb Tested by: flo MFC after: 1 month	2012-06-22 06:39:28 +00:00
kib	df9f3d2faa	Fix locking for f_offset, vn_read() and vn_write() cases only, for now. It seems that intended locking protocol for struct file f_offset field was as follows: f_offset should always be changed under the vnode lock (except fcntl(2) and lseek(2) did not follow the rules). Since read(2) uses shared vnode lock, FOFFSET_LOCKED block is additionally taken to serialize shared vnode lock owners. This was broken first by enabling shared lock on writes, then by fadvise changes, which moved f_offset assignment from under vnode lock, and last by vn_io_fault() doing chunked i/o. Moreover, due to uio_offset not yet valid in vn_io_fault(), the range lock for reads was taken on the wrong region. Change the locking for f_offset to always use FOFFSET_LOCKED block, which is placed before rangelocks in the lock order. Extract foffset_lock() and foffset_unlock() functions which implement FOFFSET_LOCKED lock, and consistently lock f_offset with it in the vn_io_fault() both for reads and writes, even if MNTK_NO_IOPF flag is not set for the vnode mount. Indicate that f_offset is already valid for vn_read() and vn_write() calls from vn_io_fault() with FOF_OFFSET flag, and assert that all callers of vn_read() and vn_write() follow this protocol. Extract get_advice() function to calculate the POSIX_FADV_XXX value for the i/o region, and use it where appropriate. Reviewed by: jhb Tested by: pho MFC after: 2 weeks	2012-06-21 09:19:41 +00:00
pjd	8f9f9f3c91	Check proper flag (PDF_DAEMON, not PD_DAEMON) when deciding if the process should be killed or not. This fixes killing pdfork(2)ed process on last close of the corresponding process descriptor. Reviewed by: rwatson MFC after: 1 month	2012-06-19 22:23:59 +00:00
pjd	81ad62d5c5	The falloc() function obtains two references to newly created 'fp'. On success we have to drop one after procdesc_finit() and on failure we have to close allocated slot with fdclose(), which also drops one reference for us and drop the remaining reference with fdrop(). Without this change closing process descriptor didn't result in killing pdfork(2)ed child. Reviewed by: rwatson MFC after: 1 month	2012-06-19 22:21:59 +00:00
jhb	571562fffb	Further refine the implementation of POSIX_FADV_NOREUSE. First, extend the changes in r230782 to better handle the common case of using NOREUSE with sequential reads. A NOREUSE file descriptor will now track the last implicit DONTNEED request it made as a result of a NOREUSE read. If a subsequent NOREUSE read is adjacent to the previous range, it will apply the DONTNEED request to the entire range of both the previous read and the current read. The effect is that each read of a file accessed sequentially will apply the DONTNEED request to the entire range that has been read. This allows NOREUSE to properly handle misaligned reads by flushing each buffer to cache once it has been completely read. Second, apply the same changes made to read(2) by r230782 and this change to writes. This provides much better performance in the sequential write case as it allows writes to still be clustered. It also provides much better performance for misaligned writes. It does mean that NOREUSE will be generally ineffective for non-sequential writes as the current implementation relies on a future NOREUSE write's implicit DONTNEED request to flush the dirty buffer from the current write. MFC after: 2 weeks	2012-06-19 18:42:24 +00:00
pho	85a3f61dca	In tty_makedev() the following construction: dev = make_dev_cred(); dev->si_drv1 = tp; leaves a small window where the newly created device may be opened and si_drv1 is NULL. As this is a very rare situation, using a lock to close the window seems overkill. Instead just wait for the assignment of si_drv1. Suggested by: kib MFC after: 1 week	2012-06-18 07:34:38 +00:00
pjd	0118c86062	Don't check for race with close on advisory unlock (there is nothing smart we can do when such a race occurs). This saves lock/unlock cycle for the filedesc lock for every advisory unlock operation. MFC after: 1 month	2012-06-17 21:04:22 +00:00
pjd	32ff81e94f	Extend the comment about checking for a race with close to explain why it is done and why we don't return an error in such case. Discussed with: kib MFC after: 1 month	2012-06-17 16:59:37 +00:00
pjd	9a81d01ee0	If VOP_ADVLOCK() call or earlier checks failed don't check for a race with close, because even if we had a race there is nothing to unlock. Discussed with: kib MFC after: 1 month	2012-06-17 16:32:32 +00:00
davide	163c370e14	The variable 'error' in sys_poll() is initialized in declaration to value zero but in any case is overwritten by successive copyin(), making the previous initialization useless. Remove this. As an added bonus this fixes a style(9) bug. Discussed with: kib Approved by: gnn (mentor) MFC after: 3 days	2012-06-17 13:03:50 +00:00
pjd	9719a38d39	Revert r237073. 'td' can be NULL here. MFC after: 1 month	2012-06-16 12:56:36 +00:00
pjd	144a7f643e	One more attempt to make prototypes formatted according to style(9), which hopefully recovers from the "worse than useless" state. Reported by: bde MFC after: 1 month	2012-06-15 10:00:29 +00:00
pjd	2ede2f9ae2	Update comment. MFC after: 1 month	2012-06-14 17:32:58 +00:00
pjd	c2fe03ba67	Remove fdtofp() function and use fget_locked(), which works exactly the same. MFC after: 1 month	2012-06-14 16:25:10 +00:00
pjd	0984458a79	Assert that the filedesc lock is being held when the fdunwrap() function is called. MFC after: 1 month	2012-06-14 16:23:16 +00:00
pjd	f84f6132c8	Simplify the code by making more use of the fdtofp() function. MFC after: 1 month	2012-06-14 15:37:15 +00:00
pjd	4a9c37500e	- Assert that the filedesc lock is being held when fdisused() is called. - Fix white spaces. MFC after: 1 month	2012-06-14 15:35:14 +00:00
pjd	7b02ff9171	Style fixes and assertions improvements. MFC after: 1 month	2012-06-14 15:34:10 +00:00
pjd	32b7d4b149	Assert that the filedesc lock is not held when closef() is called. MFC after: 1 month	2012-06-14 15:26:23 +00:00
pjd	e1c12932a7	Style fixes. Reported by: bde MFC after: 1 month	2012-06-14 15:21:57 +00:00
pjd	2014b8defb	Remove code duplication from fdclosexec(), which was the reason of the bug fixed in r237065. MFC after: 1 month	2012-06-14 12:43:37 +00:00
pjd	6634e42976	When we are closing capabilities during exec, we want to call mq_fdclose() on the underlying object and not on the capability itself. Similar bug was fixed in r236853. MFC after: 1 month	2012-06-14 12:41:21 +00:00
pjd	841890f62a	Style. MFC after: 1 month	2012-06-14 12:37:41 +00:00
pjd	0ca632f7e9	When checking if file descriptor number is valid, explicitly check for 'fd' being less than 0 instead of using cast-to-unsigned hack. Today's commit was brought to you by the letters 'B', 'D' and 'E' :)	2012-06-13 22:12:10 +00:00
pjd	0123f7ed5a	Now that dupfdopen() doesn't depend on finstall() being called earlier, indx will never be -1 on error, as none of dupfdopen(), finstall() and kern_capwrap() modifies it on error, but what is more important none of those functions install and leave file at indx descriptor on error. Leave an assert to prove my words. MFC after: 1 month	2012-06-13 21:38:07 +00:00
pjd	f695b590b4	Allocate descriptor number in dupfdopen() itself instead of depending on the caller using finstall(). This saves us the filedesc lock/unlock cycle, fhold()/fdrop() cycle and closes a race between finstall() and dupfdopen(). MFC after: 1 month	2012-06-13 21:32:35 +00:00
pjd	f7e18321ef	- Remove nfp variable that is not really needed. - Update comment. - Style nits. MFC after: 1 month	2012-06-13 21:22:35 +00:00
pjd	219cd5caaa	Remove duplicated code. MFC after: 1 month	2012-06-13 21:15:01 +00:00
pjd	5d3532ce69	Add missing {. MFC after: 1 month	2012-06-13 21:13:18 +00:00
pjd	c745de62f2	Style. MFC after: 1 month	2012-06-13 21:11:58 +00:00
pjd	54a86dc320	There is no need to set td->td_retval[0] to -1 on error. Confirmed by: jhb MFC after: 1 month	2012-06-13 21:10:00 +00:00
pjd	b836448bf3	There is only one caller of the dupfdopen() function, so we can simplify it a bit: - We can assert that only ENODEV and ENXIO errors are passed instead of handling other errors. - The caller always calls finstall() for indx descriptor, so we can assume it is set. Actually the filedesc lock is dropped between finstall() and dupfdopen(), so there is a window there for another thread to close the indx descriptor, but it will be closed in next commit. Reviewed by: mjg MFC after: 1 month	2012-06-13 19:00:29 +00:00
mjg	29bd2f6d46	Remove 'low' argument from fd_last_used(). This function is static and the only caller always passes 0 as low. While here update note about return values in comment. Reviewed by: pjd Approved by: trasz (mentor) MFC after: 1 month	2012-06-13 17:18:16 +00:00
mjg	1ca4c8cbf9	Re-apply reverted parts of r236935 by pjd with some changes. If fdalloc() decides to grow fdtable it does it once and at most doubles the size. This still may be not enough for sufficiently large fd. Use fd in calculations of new size in order to fix this. When growing the table, fd is already equal to first free descriptor >= minfd, also fdgrowtable() no longer drops the filedesc lock. As a result of this there is no need to retry allocation nor lookup. Fix description of fd_first_free to note all return values. In co-operation with: pjd Approved by: trasz (mentor) MFC after: 1 month	2012-06-13 17:12:53 +00:00
pjd	bcf3f4263d	Revert part of the r236935 for now, until I figure out why it doesn't work properly. Reported by: davidxu	2012-06-12 10:25:11 +00:00
pjd	ea4cd345da	fdgrowtable() no longer drops the filedesc lock so it is enough to retry finding free file descriptor only once after fdgrowtable(). Spotted by: pluknet MFC after: 1 month	2012-06-11 22:05:26 +00:00
pjd	b7902b949c	Use consistent way of checking if descriptor number is valid. MFC after: 1 month	2012-06-11 20:17:20 +00:00
pjd	00ef5a8d82	Be consistent with white spaces. MFC after: 1 month	2012-06-11 20:01:50 +00:00
pjd	d698b8f852	Remove code duplicated in kern_close() and do_dup() and use closefp() function introduced a minute ago. This code duplication was responsible for the bug fixed in r236853. Discussed with: kib Tested by: pho MFC after: 1 month	2012-06-11 20:00:44 +00:00
pjd	c8465e01a1	Introduce closefp() function that we will be able to use to eliminate code duplication in kern_close() and do_dup(). This is committed separately from the actual removal of the duplicated code, as the combined diff was very hard to read. Discussed with: kib Tested by: pho MFC after: 1 month	2012-06-11 19:57:31 +00:00
pjd	cab8c2dc3a	Merge two ifs into one to make the code almost identical to the code in kern_close(). Discussed with: kib Tested by: pho MFC after: 1 month	2012-06-11 19:53:41 +00:00
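
Commit 0ca632f7e9 above replaces a "cast-to-unsigned hack" with an explicit test for a negative descriptor number. The following is a minimal sketch of the two styles of check, not the actual kernel code: the function names `fd_valid_cast` and `fd_valid_explicit` and the `nfiles` parameter are hypothetical, chosen only for illustration, and the sketch assumes `nfiles` is non-negative.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not from the FreeBSD tree): validity check using
 * the cast-to-unsigned trick. A negative fd converts to a very large
 * unsigned value, so a single comparison rejects both fd < 0 and
 * fd >= nfiles. Assumes nfiles >= 0.
 */
static bool
fd_valid_cast(int fd, int nfiles)
{
	return ((unsigned int)fd < (unsigned int)nfiles);
}

/*
 * Hypothetical helper: the same check with the negative case spelled
 * out, which is the style the commit message describes moving to.
 */
static bool
fd_valid_explicit(int fd, int nfiles)
{
	return (fd >= 0 && fd < nfiles);
}
```

For any non-negative `nfiles` both helpers should agree on every `fd`; the explicit form simply makes the intent visible to the reader at the cost of one extra comparison.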

12736 Commits