freebsd-dev

Author	SHA1	Message	Date
Kevin Lo	c61325d009	Correct sizeof usage Obtained from: DragonFly	2012-06-25 05:41:16 +00:00
Konstantin Belousov	a665ed986c	Move the code dealing with shared page into a dedicated kern_sharedpage.c source file from kern_exec.c. MFC after: 29 days	2012-06-23 10:15:23 +00:00
Konstantin Belousov	21c295ef88	Stop updating the struct vdso_timehands from the event handler executed in the scheduled task from tc_windup(). Do it directly from tc_windup in interrupt context [1]. Establish the permanent mapping of the shared page into the kernel address space, avoiding the potential need to sleep waiting for allocation of sf buffer during vdso_timehands update. As a consequence, shared_page_write_start() and shared_page_write_end() functions are not needed anymore. Guess and memorize the pointers to native host and compat32 sysentvec during initialization, to avoid the need to get shared_page_alloc_sx lock during the update. In tc_fill_vdso_timehands(), do not loop waiting for timehands generation to stabilize, since vdso_timehands is written in the same interrupt context which wrote timehands. Requested by: mav [1] MFC after: 29 days	2012-06-23 09:33:06 +00:00
Konstantin Belousov	aea810386d	Implement mechanism to export some kernel timekeeping data to usermode, using shared page. The structures and functions have vdso prefix, to indicate the intended location of the code in some future. The versioned per-algorithm data is exported in the format of struct vdso_timehands, which mostly repeats the content of in-kernel struct timehands. Usermode reading of the structure can be lockless. Compatibility export for 32bit processes on 64bit host is also provided. Kernel also provides usermode with indication about currently used timecounter, so that libc can fall back to syscall if configured timecounter is unknown to usermode code. The shared data updates are initiated both from the tc_windup(), where a fast task is queued to do the update, and from sysctl handlers which change timecounter. A manual override switch kern.timecounter.fast_gettime allows turning off the mechanism. Only x86 architectures export the real algorithm data, and there, only for tsc timecounter. HPET counters page could be exported as well, but I prefer to not further glue the kernel and libc ABI there until proper vdso-based solution is developed. Minimal stubs necessary for non-x86 architectures to still compile are provided. Discussed with: bde Reviewed by: jhb Tested by: flo MFC after: 1 month	2012-06-22 07:06:40 +00:00
Konstantin Belousov	a9d8437c6d	Enhance the shared page chunk allocator. Do not rely on the busy state of the page from which we allocate the chunk, to protect allocator state. Use statically allocated sx lock instead. Provide more flexible KPI. In particular, allow to allocate chunk without providing initial data, and allow writes into existing allocation. Allow to get an sf buf which temporarily maps the chunk, to allow sequential updates to shared page content without unmapping in between. Reviewed by: jhb Tested by: flo MFC after: 1 month	2012-06-22 06:39:28 +00:00
Konstantin Belousov	854c3ce7ac	Fix locking for f_offset, vn_read() and vn_write() cases only, for now. It seems that intended locking protocol for struct file f_offset field was as follows: f_offset should always be changed under the vnode lock (except fcntl(2) and lseek(2) did not follow the rules). Since read(2) uses shared vnode lock, FOFFSET_LOCKED block is additionally taken to serialize shared vnode lock owners. This was broken first by enabling shared lock on writes, then by fadvise changes, which moved f_offset assignment from under vnode lock, and last by vn_io_fault() doing chunked i/o. More, due to uio_offset not yet valid in vn_io_fault(), the range lock for reads was taken on the wrong region. Change the locking for f_offset to always use FOFFSET_LOCKED block, which is placed before rangelocks in the lock order. Extract foffset_lock() and foffset_unlock() functions which implement FOFFSET_LOCKED lock, and consistently lock f_offset with it in the vn_io_fault() both for reads and writes, even if MNTK_NO_IOPF flag is not set for the vnode mount. Indicate that f_offset is already valid for vn_read() and vn_write() calls from vn_io_fault() with FOF_OFFSET flag, and assert that all callers of vn_read() and vn_write() follow this protocol. Extract get_advice() function to calculate the POSIX_FADV_XXX value for the i/o region, and use it where appropriate. Reviewed by: jhb Tested by: pho MFC after: 2 weeks	2012-06-21 09:19:41 +00:00
Pawel Jakub Dawidek	53e1646325	Check proper flag (PDF_DAEMON, not PD_DAEMON) when deciding if the process should be killed or not. This fixes killing pdfork(2)ed process on last close of the corresponding process descriptor. Reviewed by: rwatson MFC after: 1 month	2012-06-19 22:23:59 +00:00
Pawel Jakub Dawidek	0a7007b98f	The falloc() function obtains two references to newly created 'fp'. On success we have to drop one after procdesc_finit() and on failure we have to close allocated slot with fdclose(), which also drops one reference for us and drop the remaining reference with fdrop(). Without this change closing process descriptor didn't result in killing pdfork(2)ed child. Reviewed by: rwatson MFC after: 1 month	2012-06-19 22:21:59 +00:00
John Baldwin	cd4ecf3cd2	Further refine the implementation of POSIX_FADV_NOREUSE. First, extend the changes in r230782 to better handle the common case of using NOREUSE with sequential reads. A NOREUSE file descriptor will now track the last implicit DONTNEED request it made as a result of a NOREUSE read. If a subsequent NOREUSE read is adjacent to the previous range, it will apply the DONTNEED request to the entire range of both the previous read and the current read. The effect is that each read of a file accessed sequentially will apply the DONTNEED request to the entire range that has been read. This allows NOREUSE to properly handle misaligned reads by flushing each buffer to cache once it has been completely read. Second, apply the same changes made to read(2) by r230782 and this change to writes. This provides much better performance in the sequential write case as it allows writes to still be clustered. It also provides much better performance for misaligned writes. It does mean that NOREUSE will be generally ineffective for non-sequential writes as the current implementation relies on a future NOREUSE write's implicit DONTNEED request to flush the dirty buffer from the current write. MFC after: 2 weeks	2012-06-19 18:42:24 +00:00
Peter Holm	e84a11e7ff	In tty_makedev() the following construction: dev = make_dev_cred(); dev->si_drv1 = tp; leaves a small window where the newly created device may be opened and si_drv1 is NULL. As this is a very rare situation, using a lock to close the window seems overkill. Instead just wait for the assignment of si_drv1. Suggested by: kib MFC after: 1 week	2012-06-18 07:34:38 +00:00
Pawel Jakub Dawidek	d99e1d5fd6	Don't check for race with close on advisory unlock (there is nothing smart we can do when such a race occurs). This saves lock/unlock cycle for the filedesc lock for every advisory unlock operation. MFC after: 1 month	2012-06-17 21:04:22 +00:00
Pawel Jakub Dawidek	604a7c2f00	Extend the comment about checking for a race with close to explain why it is done and why we don't return an error in such case. Discussed with: kib MFC after: 1 month	2012-06-17 16:59:37 +00:00
Pawel Jakub Dawidek	fd6049b186	If VOP_ADVLOCK() call or earlier checks failed don't check for a race with close, because even if we had a race there is nothing to unlock. Discussed with: kib MFC after: 1 month	2012-06-17 16:32:32 +00:00
Davide Italiano	68cefd64ad	The variable 'error' in sys_poll() is initialized in declaration to value zero but in any case is overwritten by successive copyin(), making the previous initialization useless. Remove this. As an added bonus this fixes a style(9) bug. Discussed with: kib Approved by: gnn (mentor) MFC after: 3 days	2012-06-17 13:03:50 +00:00
Pawel Jakub Dawidek	cff2dcd10d	Revert r237073. 'td' can be NULL here. MFC after: 1 month	2012-06-16 12:56:36 +00:00
Pawel Jakub Dawidek	3cde71cb25	One more attempt to make prototypes formatted according to style(9), which hopefully recovers from the "worse than useless" state. Reported by: bde MFC after: 1 month	2012-06-15 10:00:29 +00:00
Pawel Jakub Dawidek	a79de683f5	Update comment. MFC after: 1 month	2012-06-14 17:32:58 +00:00
Pawel Jakub Dawidek	19a8f6748e	Remove fdtofp() function and use fget_locked(), which works exactly the same. MFC after: 1 month	2012-06-14 16:25:10 +00:00
Pawel Jakub Dawidek	b7fc69ca89	Assert that the filedesc lock is being held when the fdunwrap() function is called. MFC after: 1 month	2012-06-14 16:23:16 +00:00
Pawel Jakub Dawidek	1a94dc8581	Simplify the code by making more use of the fdtofp() function. MFC after: 1 month	2012-06-14 15:37:15 +00:00
Pawel Jakub Dawidek	215aeba939	- Assert that the filedesc lock is being held when fdisused() is called. - Fix white spaces. MFC after: 1 month	2012-06-14 15:35:14 +00:00
Pawel Jakub Dawidek	7aef754274	Style fixes and assertions improvements. MFC after: 1 month	2012-06-14 15:34:10 +00:00
Pawel Jakub Dawidek	8d169d9ff0	Assert that the filedesc lock is not held when closef() is called. MFC after: 1 month	2012-06-14 15:26:23 +00:00
Pawel Jakub Dawidek	eb273c01f3	Style fixes. Reported by: bde MFC after: 1 month	2012-06-14 15:21:57 +00:00
Pawel Jakub Dawidek	c7e9a659ca	Remove code duplication from fdclosexec(), which was the reason of the bug fixed in r237065. MFC after: 1 month	2012-06-14 12:43:37 +00:00
Pawel Jakub Dawidek	8f59e9fddc	When we are closing capabilities during exec, we want to call mq_fdclose() on the underlying object and not on the capability itself. Similar bug was fixed in r236853. MFC after: 1 month	2012-06-14 12:41:21 +00:00
Pawel Jakub Dawidek	5570ae7d87	Style. MFC after: 1 month	2012-06-14 12:37:41 +00:00
Pawel Jakub Dawidek	620216725a	When checking if file descriptor number is valid, explicitly check for 'fd' being less than 0 instead of using cast-to-unsigned hack. Today's commit was brought to you by the letters 'B', 'D' and 'E' :)	2012-06-13 22:12:10 +00:00
Pawel Jakub Dawidek	7080f124d2	Now that dupfdopen() doesn't depend on finstall() being called earlier, indx will never be -1 on error, as none of dupfdopen(), finstall() and kern_capwrap() modifies it on error, but what is more important none of those functions install and leave file at indx descriptor on error. Leave an assert to prove my words. MFC after: 1 month	2012-06-13 21:38:07 +00:00
Pawel Jakub Dawidek	3812dcd3de	Allocate descriptor number in dupfdopen() itself instead of depending on the caller using finstall(). This saves us the filedesc lock/unlock cycle, fhold()/fdrop() cycle and closes a race between finstall() and dupfdopen(). MFC after: 1 month	2012-06-13 21:32:35 +00:00
Pawel Jakub Dawidek	7f35af0110	- Remove nfp variable that is not really needed. - Update comment. - Style nits. MFC after: 1 month	2012-06-13 21:22:35 +00:00
Pawel Jakub Dawidek	c64dd3bae1	Remove duplicated code. MFC after: 1 month	2012-06-13 21:15:01 +00:00
Pawel Jakub Dawidek	81424ab705	Add missing {. MFC after: 1 month	2012-06-13 21:13:18 +00:00
Pawel Jakub Dawidek	85c1550d63	Style. MFC after: 1 month	2012-06-13 21:11:58 +00:00
Pawel Jakub Dawidek	baf946221d	There is no need to set td->td_retval[0] to -1 on error. Confirmed by: jhb MFC after: 1 month	2012-06-13 21:10:00 +00:00
Pawel Jakub Dawidek	6195bfebcc	There is only one caller of the dupfdopen() function, so we can simplify it a bit: - We can assert that only ENODEV and ENXIO errors are passed instead of handling other errors. - The caller always call finstall() for indx descriptor, so we can assume it is set. Actually the filedesc lock is dropped between finstall() and dupfdopen(), so there is a window there for another thread to close the indx descriptor, but it will be closed in next commit. Reviewed by: mjg MFC after: 1 month	2012-06-13 19:00:29 +00:00
Mateusz Guzik	2ca63f0a90	Remove 'low' argument from fd_last_used(). This function is static and the only caller always passes 0 as low. While here update note about return values in comment. Reviewed by: pjd Approved by: trasz (mentor) MFC after: 1 month	2012-06-13 17:18:16 +00:00
Mateusz Guzik	02efb9a8b1	Re-apply reverted parts of r236935 by pjd with some changes. If fdalloc() decides to grow fdtable it does it once and at most doubles the size. This still may be not enough for sufficiently large fd. Use fd in calculations of new size in order to fix this. When growing the table, fd is already equal to first free descriptor >= minfd, also fdgrowtable() no longer drops the filedesc lock. As a result of this there is no need to retry allocation nor lookup. Fix description of fd_first_free to note all return values. In co-operation with: pjd Approved by: trasz (mentor) MFC after: 1 month	2012-06-13 17:12:53 +00:00
Pawel Jakub Dawidek	faf0db351d	Revert part of the r236935 for now, until I figure out why it doesn't work properly. Reported by: davidxu	2012-06-12 10:25:11 +00:00
Pawel Jakub Dawidek	039dc89f0d	fdgrowtable() no longer drops the filedesc lock so it is enough to retry finding free file descriptor only once after fdgrowtable(). Spotted by: pluknet MFC after: 1 month	2012-06-11 22:05:26 +00:00
Pawel Jakub Dawidek	d3ec30e525	Use consistent way of checking if descriptor number is valid. MFC after: 1 month	2012-06-11 20:17:20 +00:00
Pawel Jakub Dawidek	fd45a47ba6	Be consistent with white spaces. MFC after: 1 month	2012-06-11 20:01:50 +00:00
Pawel Jakub Dawidek	19d9c0e11e	Remove code duplicated in kern_close() and do_dup() and use closefp() function introduced a minute ago. This code duplication was responsible for the bug fixed in r236853. Discussed with: kib Tested by: pho MFC after: 1 month	2012-06-11 20:00:44 +00:00
Pawel Jakub Dawidek	642db963ab	Introduce closefp() function that we will be able to use to eliminate code duplication in kern_close() and do_dup(). This is committed separately from the actual removal of the duplicated code, as the combined diff was very hard to read. Discussed with: kib Tested by: pho MFC after: 1 month	2012-06-11 19:57:31 +00:00
Pawel Jakub Dawidek	129c87eb7d	Merge two ifs into one to make the code almost identical to the code in kern_close(). Discussed with: kib Tested by: pho MFC after: 1 month	2012-06-11 19:53:41 +00:00
Pawel Jakub Dawidek	d327cee241	Move the code around a bit to move two parts of code duplicated from kern_close() close together. Discussed with: kib Tested by: pho MFC after: 1 month	2012-06-11 19:51:27 +00:00
Pawel Jakub Dawidek	8b40793150	Now that fdgrowtable() doesn't drop the filedesc lock we don't need to check if descriptor changed from under us. Replace the check with an assert. Discussed with: kib Tested by: pho MFC after: 1 month	2012-06-11 19:48:55 +00:00
Mitsuru IWASAKI	c1b0dc80b5	Another fix for r236772. - Adjust correct cpuset (stopped_cpus/suspended_cpus) for cpu_spinwait() in generic_stop_cpus().	2012-06-11 18:47:26 +00:00
Pawel Jakub Dawidek	f3cd980557	Style fixes and simplifications. MFC after: 1 month	2012-06-11 16:08:03 +00:00
Pawel Jakub Dawidek	effb6326a1	Remove redundant include. MFC after: 1 month	2012-06-10 20:24:01 +00:00
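
Commit 620216725a above replaces a "cast-to-unsigned hack" with an explicit check for a negative descriptor. The sketch below illustrates the two idioms side by side. It is not taken from the FreeBSD sources: the function names and the nfiles parameter (table size, assumed to be non-negative) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helpers, not FreeBSD kernel code.
 * 'nfiles' stands for the size of a descriptor table and is
 * assumed to be non-negative.
 */

/*
 * Cast idiom: a negative fd converts to a very large unsigned
 * value, so a single comparison rejects both out-of-range ends.
 */
static inline bool
fd_valid_cast(int fd, int nfiles)
{
	return ((unsigned int)fd < (unsigned int)nfiles);
}

/* Explicit idiom: both bounds are spelled out. */
static inline bool
fd_valid_explicit(int fd, int nfiles)
{
	return (fd >= 0 && fd < nfiles);
}

/* Check that the two idioms agree around the table boundaries. */
static inline void
fd_valid_selfcheck(int nfiles)
{
	int fd;

	for (fd = -2; fd <= nfiles + 1; fd++)
		assert(fd_valid_cast(fd, nfiles) ==
		    fd_valid_explicit(fd, nfiles));
}
```

Under the stated assumption the two forms accept the same descriptors. The explicit form states the intent directly and does not depend on the reader knowing how signed-to-unsigned conversion behaves.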


12734 Commits