freebsd-skq

Author	SHA1	Message	Date
John Baldwin	bf40d24a3f	Rename cache_lookup_times() to cache_lookup() and retire the old API and ABI stub for cache_lookup().	2012-02-06 17:00:28 +00:00
Konstantin Belousov	c480f781ea	Current implementations of sync(2) and syncer vnode fsync() VOP uses mnt_noasync counter to temporary remove MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity. Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC. Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons. Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks	2012-02-06 11:04:36 +00:00
Kevin Lo	0ca2381d9b	- Use uint8_t for the variable x and spell the size of the variable as sizeof(x) - Capitalized comment - Parentheses around return value Requested by: bde	2012-02-06 06:03:16 +00:00
Martin Matuska	a91d2201f9	Analogous to r230407 a separate path buffer in vfs_mount.c is required for r230129. Fixes a out of bounds write to fspath. MFC after: 10 days	2012-02-05 10:59:50 +00:00
David Xu	d56e058a79	Add 32-bit compat code for AIO kevent flags introduced in revision 230857.	2012-02-05 04:49:31 +00:00
Ryan Stone	312ac3a23a	Whenever a new kernel thread is spawned, explicitly clear any CPU affinity set on the new thread. This prevents the thread from inadvertently inheriting affinity from a random sibling. Submitted by: attilio Tested by: pho MFC after: 1 week	2012-02-04 16:49:29 +00:00
Hiroki Sato	cf8b832511	Fix input validation in SO_SETFIB. Reviewed by: bz MFC after: 1 day	2012-02-04 15:00:26 +00:00
Konstantin Belousov	6af519cf18	Add kqueue support to /dev/klog. Submitted by: Mateusz Guzik <mjguzik gmail com> PR: kern/156423 MFC after: 1 weeks	2012-02-01 14:34:52 +00:00
David Xu	fde809356a	If multiple threads call kevent() to get AIO events on same kqueue fd, it is possible that a single AIO event will be reported to multiple threads, it is not threading friendly, and the existing API can not control this behavior. Allocate a kevent flags field sigev_notify_kevent_flags for AIO event notification in sigevent, and allow user to pass EV_CLEAR, EV_DISPATCH or EV_ONESHOT to AIO kernel code, user can control whether the event should be cleared once it is retrieved by a thread. This change should be comptaible with existing application, because the field should have already been zero-filled, and no additional action will be taken by kernel. PR: kern/156567	2012-02-01 02:53:06 +00:00
Konstantin Belousov	6ad1ff09cc	A debugger which requested PT_FOLLOW_FORK should get the notification about new child not only when doing PT_TO_SCX, but also for PT_CONTINUE. If TDB_FORK flag is set, always issue a stop, the same as is done for TDB_EXEC. Reported by: Dmitry Mikulin <dmitrym juniper net> MFC after: 1 week	2012-01-30 20:00:29 +00:00
John Baldwin	2bd3e4c2c2	Refine the implementation of POSIX_FADV_NOREUSE for the read(2) case such that instead of using direct I/O it allows read-ahead similar to POSIX_FADV_NORMAL, but invokes VOP_ADVISE(POSIX_FADV_DONTNEED) after the read(2) has completed to purge just-read data. The write(2) path continues to use direct I/O for POSIX_FADV_NOREUSE for now. Note that NOREUSE works optimally if an application reads and writes full fs blocks.	2012-01-30 19:35:15 +00:00
Doug Ambrisko	8e9fc27818	When detaching an AIO or LIO requests grab the lock and tell knlist_remove that we have the lock now. This cleans up a locking panic ASSERT when knlist_empty is called without a lock when INVARIANTS etc. are turned. Reviewed by: kib jhb MFC after: 1 week	2012-01-30 19:19:22 +00:00
Konstantin Belousov	62c625fdd2	Finally, try to enable the nxstacks on amd64 and powerpc64 for both 64bit and 32bit ABIs. Also try to enable nxstacks for PAE/i386 when supported, and some variants of powerpc32. MFC after: 2 months (if ever)	2012-01-30 07:56:00 +00:00
Attilio Rao	5d7380f8e3	Avoid to check the same cache line/variable from all the locking primitives by breaking stop_scheduler into a per-thread variable. Also, store the new td_stopsched very close to td_*locks members as they will be accessed mostly in the same codepaths as td_stopsched and this results in avoiding a further cache-line pollution, possibly. STOP_SCHEDULER() was pondered to use a new 'thread' argument, in order to take advantage of already cached curthread, but in the end there should not really be a performance benefit, while introducing a KPI breakage. In collabouration with: flo Reviewed by: avg MFC after: 3 months (or never) X-MFC: r228424	2012-01-28 14:00:21 +00:00
Gleb Smirnoff	94fce84763	Fix size check, that prevents getting negative after casting to a signed type Reviewed by: bde	2012-01-27 08:58:58 +00:00
Kenneth D. Merry	7e949c467c	Xen netback driver rewrite. share/man/man4/Makefile, share/man/man4/xnb.4, sys/dev/xen/netback/netback.c, sys/dev/xen/netback/netback_unit_tests.c: Rewrote the netback driver for xen to attach properly via newbus and work properly in both HVM and PVM mode (only HVM is tested). Works with the in-tree FreeBSD netfront driver or the Windows netfront driver from SuSE. Has not been extensively tested with a Linux netfront driver. Does not implement LRO, TSO, or polling. Includes unit tests that may be run through sysctl after compiling with XNB_DEBUG defined. sys/dev/xen/blkback/blkback.c, sys/xen/interface/io/netif.h: Comment elaboration. sys/kern/uipc_mbuf.c: Fix page fault in kernel mode when calling m_print() on a null mbuf. Since m_print() is only used for debugging, there are no performance concerns for extra error checking code. sys/kern/subr_scanf.c: Add the "hh" and "ll" width specifiers from C99 to scanf(). A few callers were already using "ll" even though scanf() was handling it as "l". Submitted by: Alan Somers <alans@spectralogic.com> Submitted by: John Suykerbuyk <johns@spectralogic.com> Sponsored by: Spectra Logic MFC after: 1 week Reviewed by: ken	2012-01-26 16:35:09 +00:00
Gleb Smirnoff	434ea137cc	Although aio_nbytes is size_t, later is is signed to casted types: to ssize_t in filesystem code and to int in buf code, thus supplying a negative argument leads to kernel panic later. To fix that check user supplied argument in the beginning of syscall. Submitted by: Maxim Dounin <mdounin mdounin.ru>, maxim@	2012-01-26 11:59:48 +00:00
Konstantin Belousov	abc942b56c	When doing vflush(WRITECLOSE), clean vnode pages. Unmounts do vfs_msync() before calling VFS_UNMOUNT(), but there is still a race allowing a process to dirty pages after msync finished. Remounts rw->ro just left dirty pages in system. Reviewed by: alc, tegge (long time ago) Tested by: pho MFC after: 2 weeks	2012-01-25 20:54:09 +00:00
Konstantin Belousov	d5210589b7	Fix remaining calls to cache_enter() in both NFS clients to provide appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for. Reviewed by: jhb (previous version) MFC after: 2 weeks	2012-01-25 20:48:20 +00:00
Mikolaj Golub	45efc9b4aa	Fix CTL flags in the declarations of KERN_PROC_ENV, AUXV and PS_STRINGS sysctls: they are read only. MFC after: 1 week	2012-01-25 20:15:58 +00:00
Konstantin Belousov	7a7e609a32	Apparently, both nfs clients do not use cache_enter_time() consistently, creating some namecache entries without NCF_TS flag. This causes panic due to failed assertion. As a temporal relief, remove the assert. Return epoch timestamp for the entries without timestamp if asked. While there, consolidate the code which returns timestamps, into a helper cache_out_ts(). Discussed with: jhb MFC after: 2 weeks	2012-01-23 17:09:23 +00:00
Gleb Smirnoff	93a1b4c4cf	Convert panic()s to KASSERT()s. This is an optimisation for hashdestroy() since in absence of INVARIANTS a compiler will drop the entire for() cycle.	2012-01-23 16:31:46 +00:00
Mikolaj Golub	8854fe3915	Change kern.proc.rlimit sysctl to: - retrive only one, specified limit for a process, not the whole array, as it was previously (the sysctl has been added recently and has not been backported to stable yet, so this change is ok); - allow to set a resource limit for another process. Submitted by: Andrey Zonov <andrey at zonov.org> Discussed with: kib Reviewed by: kib MFC after: 2 weeks	2012-01-22 20:25:00 +00:00
Pawel Jakub Dawidek	9b9a01792d	TDF_* flags should be used with td_flags field and TDP_* flags should be used with td_pflags field. Correct two places where it was not the case. Discussed with: kib MFC after: 1 week	2012-01-22 11:01:36 +00:00
Konstantin Belousov	c2b396f294	Remove the nc_time and nc_ticks elements from struct namecache, and provide struct namecache_ts which is the old struct namecache. Only allocate struct namecache_ts if non-null struct timespec *tsp was passed to cache_enter_time, otherwise use struct namecache. Change struct namecache allocation and deallocation macros into static functions, since logic becomes somewhat twisty. Provide accessor for the nc_name member of struct namecache to hide difference between struct namecache and namecache_ts. The aim of the change is to not waste 20 bytes per small namecache entry. Reviewed by: jhb MFC after: 2 weeks X-MFC-note: after r230394	2012-01-22 01:11:06 +00:00
Martin Matuska	6dfe0a3dc2	Use separate buffer for global path to avoid overflow of path buffer. Reviewed by: jamie@ MFC after: 3 weeks	2012-01-21 00:06:21 +00:00
John Baldwin	5aefb4cbbf	Close a race in NFS lookup processing that could result in stale name cache entries on one client when a directory was renamed on another client. The root cause for the stale entry being trusted is that each per-vnode nfsnode structure has a single 'n_ctime' timestamp used to validate positive name cache entries. However, if there are multiple entries for a single vnode, they all share a single timestamp. To fix this, extend the name cache to allow filesystems to optionally store a timestamp value in each name cache entry. The NFS clients now fetch the timestamp associated with each name cache entry and use that to validate cache hits instead of the timestamps previously stored in the nfsnode. Another part of the fix is that the NFS clients now use timestamps from the post-op attributes of RPCs when adding name cache entries rather than pulling the timestamps out of the file's attribute cache. The latter is subject to races with other lookups updating the attribute cache concurrently. Some more details: - Add a variant of nfsm_postop_attr() to the old NFS client that can return a vattr structure with a copy of the post-op attributes. - Handle lookups of "." as a special case in the NFS clients since the name cache does not store name cache entries for ".", so we cannot get a useful timestamp. It didn't really make much sense to recheck the attributes on the the directory to validate the namecache hit for "." anyway. - ABI compat shims for the name cache routines are present in this commit so that it is safe to MFC. MFC after: 2 weeks	2012-01-20 20:02:01 +00:00
Konstantin Belousov	2974cc36f7	Use shared lock for the executable vnode in the exec path after the VV_TEXT changes are handled. Assert that vnode is exclusively locked at the places that modify VV_TEXT. Discussed with: alc MFC after: 3 weeks	2012-01-19 23:03:31 +00:00
Alan Cox	1dfab8025e	Explain why it is safe to unlock the vnode. Requested by: kib	2012-01-17 16:20:50 +00:00
Kirk McKusick	cc672d3599	Make sure all intermediate variables holding mount flags (mnt_flag) and that all internal kernel calls passing mount flags are declared as uint64_t so that flags in the top 32-bits are not lost. MFC after: 2 weeks	2012-01-17 01:08:01 +00:00
Alan Cox	292177e67a	Improve abstraction. Eliminate direct access by elf_load_section() to an OBJT_VNODE-specific field of the vm object. The same information can be just as easily obtained from the struct vattr that is in struct image_params if the latter is passed to elf_load_section(). Moreover, by replacing the vmspace and vm object parameters to elf*_load_section() with a struct image_params parameter, we actually reduce the size of the object code. In collaboration with: kib	2012-01-17 00:27:32 +00:00
Sergey Kandaurov	037f43d3ef	Be pedantic and change // comment to C-style one. Noticed by: Bruce Evans	2012-01-16 20:42:56 +00:00
Kevin Lo	575cabed9e	Fix a style bug Spotted by: avg	2012-01-16 14:54:48 +00:00
David Xu	29a06690ca	Eliminate branch and insert an explicit reader memory barrier to ensure that waiter bit is set before reading semaphore count.	2012-01-16 04:39:10 +00:00
Mikolaj Golub	fe7f89b71a	Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want to read strings completely to know the actual size. As a side effect it fixes the issue with kern.proc.args and kern.proc.env sysctls, which didn't return the size of available data when calling sysctl(3) with the NULL argument for oldp. Note, in get_ps_strings(), which does actual work for proc_getargv() and proc_getenvv(), we still have a safety limit on the size of data read in case of a corrupted procces stack. Suggested by: kib MFC after: 3 days	2012-01-15 18:47:24 +00:00
Martin Matuska	9cbe30e1d5	Fix missing in r230129: kern_jail.c: initialize fullpath_disabled to zero vfs_cache.c: add missing dot in comment Reported by: kib MFC after: 1 month	2012-01-15 18:08:15 +00:00
Ulrich Spörlein	9a14aa017b	Convert files to UTF-8	2012-01-15 13:23:18 +00:00
Martin Matuska	f6e633a9e1	Introduce vn_path_to_global_path() This function updates path string to vnode's full global path and checks the size of the new path string against the pathlen argument. In vfs_domount(), sys_unmount() and kern_jail_set() this new function is used to update the supplied path argument to the respective global path. Unbreaks jailed zfs(8) with enforce_statfs set to 1. Reviewed by: kib MFC after: 1 month	2012-01-15 12:08:20 +00:00
Eitan Adler	886e862866	- Fix undefined behavior when device_get_name is null - Make error message more informative PR: kern/149800 Submitted by: olgeni Approved by: cperciva MFC after: 1 week	2012-01-15 07:09:18 +00:00
Oleksandr Tymoshenko	4104e83567	Fix kernel modules loading for MIPS64 kernel: On amd64, link_elf_obj.c must specify KERNBASE rather than VM_MIN_KERNEL_ADDRESS to vm_map_find() because kernel loadable modules must be mapped for execution in the same upper region of the kernel map as the kernel code and data segments. For MIPS32 KERNBASE lies below KVA area (it's less than VM_MIN_KERNEL_ADDRESS) so basically vm_map_find got whole KVA to look through. On MIPS64 it's not the case because KERNBASE is set to the very end of XKSEG, well out of KVA bounds, so vm_map_find always fails. We should use VM_MIN_KERNEL_ADDRESS as a base for vm_map_find. Details obtained from: alc@	2012-01-14 00:36:07 +00:00
John Baldwin	fbcebf7f71	Convert the per-interface address list lock from a mutex to a reader/writer lock. Reviewed by: bz	2012-01-09 19:34:12 +00:00
Andriy Gapon	90d8265326	enable stop_scheduler_on_panic by default My plan is to make this behavior unconditional before 10.0 release. X-MFC after: r228424 (if ever)	2012-01-09 12:06:09 +00:00
Konstantin Belousov	3ab0160340	Avoid LOR between vfs_busy() lock and covered vnode lock on quotaon(). The vfs_busy() is after covered vnode lock in the global lock order, but since quotaon() does recursive VFS call to open quota file, we usually end up locking covered vnode after mp is busied in sys_quotactl(). Change the interface of VFS_QUOTACTL(), requiring that mp was unbusied by fs code, and do not try to pick up vfs_busy() reference in ufs quotaon, esp. if vfs_busy cannot succeed due to unmount being performed. Reported and tested by: pho MFC after: 1 week	2012-01-08 23:06:53 +00:00
Alan Cox	2971897d51	Correct an error of omission in the implementation of the truncation operation on POSIX shared memory objects and tmpfs. Previously, neither of these modules correctly handled the case in which the new size of the object or file was not a multiple of the page size. Specifically, they did not handle partial page truncation of data stored on swap. As a result, stale data might later be returned to an application. Interestingly, a data inconsistency was less likely to occur under tmpfs than POSIX shared memory objects. The reason being that a different mistake by the tmpfs truncation operation helped avoid a data inconsistency. If the data was still resident in memory in a PG_CACHED page, then the tmpfs truncation operation would reactivate that page, zero the truncated portion, and leave the page pinned in memory. More precisely, the benevolent error was that the truncation operation didn't add the reactivated page to any of the paging queues, effectively pinning the page. This page would remain pinned until the file was destroyed or the page was read or written. With this change, the page is now added to the inactive queue. Discussed with: jhb Reviewed by: kib (an earlier version) MFC after: 3 weeks	2012-01-08 20:09:26 +00:00
Hiroki Sato	ca54e1aee3	Fix a typo. (s/nessesary/necessary/)	2012-01-08 18:48:36 +00:00
John Baldwin	71eeeaf256	Add 5 spare VOPs as placeholders to avoid breaking the KBI in the future when new VOPs are MFC'd to a branch. Reviewed by: kib, bz MFC after: 3 days	2012-01-06 20:06:45 +00:00
John Baldwin	908cac07ce	Use proper argument structure types for the extattr post-VOP hooks. The wrong structure happened to work since the only argument used was the vnode which is in the same place in both VOP_SETATTR() and the two extattr VOPs. MFC after: 3 days	2012-01-06 20:05:48 +00:00
John Baldwin	948c460971	Fix a logic bug in change 228207 in the check for a thread's new user priority being a realtime priority. MFC after: 3 days	2012-01-05 19:02:52 +00:00
John Baldwin	137f91e80f	Convert all users of IF_ADDR_LOCK to use new locking macros that specify either a read lock or write lock. Reviewed by: bz MFC after: 2 weeks	2012-01-05 19:00:36 +00:00
John Baldwin	7e3a96ea37	Some small fixes to CPU accounting for threads: - Only initialize the per-cpu switchticks and switchtime in sched_throw() for the very first context switch on APs during boot. This avoids a small gap between the middle of thread_exit() and sched_throw() where time is not accounted to any thread. - In thread_exit(), update the timestamp bookkeeping to track the changes to mi_switch() introduced by td_rux so that the code once again matches the comment claiming it is mimicing mi_switch(). Specifically, only update the per-thread stats directly and depend on ruxagg() to update p_rux rather than adjusting p_rux directly. While here, move the timestamp bookkeeping as late in the function as possible. Reviewed by: bde, kib MFC after: 1 week	2012-01-03 21:03:28 +00:00

1 2 3 4 5 ...

12498 Commits