freebsd-nq

Author	SHA1	Message	Date
Martin Matuska	e7af90ab00	Analogous to r232059, add a parameter for the ZFS file system: allow.mount.zfs: allow mounting the zfs filesystem inside a jail. This way the permissions for mounting all current VFCF_JAIL filesystems inside a jail are controlled via allow.mount.* jail parameters. Update sysctl descriptions. Update jail(8) and zfs(8) manpages. TODO: document the connection of allow.mount.* and VFCF_JAIL for kernel developers MFC after: 10 days	2012-02-26 16:30:39 +00:00
Jilles Tjoelker	581400dfed	Fix fchmod() and fchown() on fifos. The new fifo implementation in r232055 broke fchmod() and fchown() on fifos. Postfix needs this. Submitted by: gianni Reported by: dougb	2012-02-26 15:14:29 +00:00
Mikolaj Golub	6ce13747dc	Add sysctl to retrieve or set umask of another process. Submitted by: Dmitry Banschikov <me ubique spb ru> Discussed with: kib, rwatson Reviewed by: kib MFC after: 2 weeks	2012-02-26 14:25:48 +00:00
Konstantin Belousov	747d2fa178	Add SO_PROTOCOL/SO_PROTOTYPE socket SOL_SOCKET-level option to get the socket protocol number. This is useful since the socket type can be implemented by different protocols in the same protocol family, e.g. SOCK_STREAM may be provided by both TCP and SCTP. Submitted by: Jukka A. Ukkonen <jau iki fi> PR: kern/162352 Discussed with: bz Reviewed by: glebius MFC after: 2 weeks	2012-02-26 13:55:43 +00:00
Konstantin Belousov	9493639e35	Remove apparently redundant checks for socket so_proto being non-NULL from sosetopt() and sogetopt(). No exposed sockets may have so_proto invalid. Discussed with: bz, rwatson Reviewed by: glebius MFC after: 2 weeks	2012-02-26 13:51:05 +00:00
Maxim Konovalov	7dfdd83d56	o Reduce chances for integer overflow. o More verbose sysctl description added. MFC after: 2 weeks Sponsored by: Nginx, Inc.	2012-02-25 12:06:40 +00:00
Mikolaj Golub	662c901c54	When detaching a unix domain socket, uipc_detach() checks the unp->unp_vnode pointer to detect if there is a vnode associated with (bound to) this socket and does necessary cleanup if there is. The issue is that after forced unmount this check may be too late as the unp_vnode is reclaimed and the reference is stale. To fix this provide a helper function that is called on a socket vnode reclamation to do necessary cleanup. Pointed by: kib Reviewed by: kib MFC after: 2 weeks	2012-02-25 10:15:41 +00:00
David Xu	df1f1bae9e	In revision 231989, we pass a 16-bit clock ID into the kernel; however, according to the POSIX document, the clock ID may be dynamically allocated, so it is unlikely to stay within 64K forever. To make it future compatible, we pack all timeout information into a new structure called _umtx_time, and use the fourth argument as a size indication: a zero means it is old code using timespec as the timeout value, but the new structure also includes flags and a clock ID, so the size argument is different than before, and it is non-zero. With this change, it is possible that a thread can sleep on any supported clock, though current kernel code does not have such a POSIX clock driver system.	2012-02-25 02:12:17 +00:00
Konstantin Belousov	dcdc6c361b	Restore the return statement erroneously removed in r232048. Submitted by: cognet Pointy hat to: kib (reuse the one I already got today) MFC after: 13 days	2012-02-24 11:02:35 +00:00
Martin Matuska	bf3db8aa65	To improve control over the use of mount(8) inside a jail(8), introduce a new jail parameter node with the following parameters: allow.mount.devfs: allow mounting the devfs filesystem inside a jail allow.mount.nullfs: allow mounting the nullfs filesystem inside a jail Both parameters are disabled by default (equals the behavior before devfs and nullfs in jails). Administrators have to explicitly allow mounting devfs and nullfs for each jail. The value "-1" of the devfs_ruleset parameter is removed in favor of the new allow setting. Reviewed by: jamie Suggested by: pjd MFC after: 2 weeks	2012-02-23 18:51:24 +00:00
Kip Macy	11ac7ec076	merge pipe and fifo implementations Also reviewed by: jhb, jilles (initial revision) Tested by: pho, jilles Submitted by: gianni Reviewed by: bde	2012-02-23 18:37:30 +00:00
Christian Brueffer	6bdc1841a9	Catch up with r195837 (2.5 years ago) which renamed net_add_domain() to domain_add(). PR: 165424 Submitted by: Lachlan Kang MFC after: 1 week	2012-02-23 17:47:19 +00:00
Konstantin Belousov	dcd432817e	Allow the parent to gather the exit status of the children reparented to the debugger. When reparenting for debugging, keep the child in the new orphan list of old parent. When looping over the children in kern_wait(), iterate over both children list and orphan list to search for the process by pid. Submitted by: Dmitry Mikulin <dmitrym juniper.net> MFC after: 2 weeks	2012-02-23 11:50:23 +00:00
David Xu	f911d9fa4d	Fix typo.	2012-02-22 07:34:23 +00:00
David Xu	b13a8fa78f	Use unused fourth argument of umtx_op to pass flags to kernel for operation UMTX_OP_WAIT. Upper 16bits is enough to hold a clock id, and lower 16bits is used to pass flags. The change saves a clock_gettime() syscall from libthr.	2012-02-22 03:22:49 +00:00
Mikolaj Golub	a95852edf3	unp_connect() may use a shared lock on the vnode to fetch the socket. Suggested by: jhb Reviewed by: jhb, kib, rwatson MFC after: 2 weeks	2012-02-21 19:40:13 +00:00
Konstantin Belousov	526d0bd547	Fix found places where uio_resid is truncated to int. Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from the usermode. Discussed with: bde, das (previous versions) MFC after: 1 month	2012-02-21 01:05:12 +00:00
Xin LI	fcdd3d322b	Revert r231923 for now. Further work is needed to make sure that the behavior is consistent.	2012-02-20 09:32:32 +00:00
Xin LI	5bfbb59851	Use uprintf instead of printf for the reason why a kernel module can not be loaded. This way, the administrator can get response immediately from the shell session rather than relying on dmesg. MFC after: 1 month	2012-02-20 01:05:17 +00:00
Alan Cox	7dc0ace10e	Close a race due to dropping of the map lock between creating a map entry for a shared mapping and marking the entry for inheritance. Reviewed by: kib X-MFC after: r231526	2012-02-19 00:28:49 +00:00
Konstantin Belousov	3494f31ad2	Fix misuse of the kernel map in miscellaneous image activators. Vnode-backed mappings cannot be put into the kernel map, since it is a system map. Use exec_map for transient mappings, and remove the mappings with kmem_free_wakeup() to notify the waiters on available map space. Do not map the whole executable into KVA at all to copy it out into usermode. Directly use vn_rdwr() for the case of not page aligned binary. There is one place left where the potentially unbounded amount of data is mapped into exec_map, namely, in the COFF image activator enumeration of the needed shared libraries. Reviewed by: alc MFC after: 2 weeks	2012-02-17 23:47:16 +00:00
Bjoern A. Zeeb	9dba179d5e	IFC @231845 Sponsored by: Cisco Systems, Inc.	2012-02-17 00:27:48 +00:00
Eitan Adler	f17a6f1b17	Add a timestamp to the msgbuf output in order to determine when messages were printed. This can be enabled with the kern.msgbuf_show_timestamp sysctl. PR: kern/161553 Reviewed by: avg Submitted by: Arnaud Lacombe <lacombar@gmail.com> Approved by: cperciva MFC after: 1 month	2012-02-16 05:11:35 +00:00
Konstantin Belousov	343b391f20	The PTRACESTOP() macro is used only once. Inline the only use and remove the macro. MFC after: 1 week	2012-02-11 14:49:25 +00:00
Ed Schouten	852b05c5b5	Remove unneeded newline. It fits in 80 columns now. Pointed out by: jh	2012-02-10 14:55:47 +00:00
Ed Schouten	8fac9b7b7d	Merge si_name and __si_namebuf. The si_name pointer always points to the __si_namebuf member inside the same object. Remove it and rename __si_namebuf to si_name.	2012-02-10 12:40:50 +00:00
Kevin Lo	de02885a7b	Add a missing break. This bug was introduced in r228856.	2012-02-10 06:30:52 +00:00
Konstantin Belousov	db3273398b	Mark the automatically attached child with PL_FLAG_CHILD in struct lwpinfo flags, for PT_FOLLOW_FORK auto-attachment. In collaboration with: Dmitry Mikulin <dmitrym juniper net> MFC after: 1 week	2012-02-10 00:02:13 +00:00
Martin Matuska	0cc207a6f5	Add support for mounting devfs inside jails. A new jail(8) option "devfs_ruleset" defines the ruleset enforcement for mounting devfs inside jails. A value of -1 disables mounting devfs in jails, a value of zero means no restrictions. Nested jails can only have mounting devfs disabled or inherit parent's enforcement as jails are not allowed to view or manipulate devfs(8) rules. Utilizes new functions introduced in r231265. Reviewed by: jamie MFC after: 1 month	2012-02-09 10:22:08 +00:00
Konstantin Belousov	9cfb2326bc	Unbreak detection of the async mode for clustered writes after r231075. Submitted by: bde MFC after: 12 days	2012-02-08 15:07:19 +00:00
Pawel Jakub Dawidek	12075c0936	Allow to set kern.ipc.shmmax from /boot/loader.conf. MFC after: 1 week	2012-02-08 09:18:22 +00:00
Ed Schouten	cd864a19a5	Fix whitespace inconsistencies in TTY code.	2012-02-06 18:15:46 +00:00
John Baldwin	bf40d24a3f	Rename cache_lookup_times() to cache_lookup() and retire the old API and ABI stub for cache_lookup().	2012-02-06 17:00:28 +00:00
Konstantin Belousov	c480f781ea	Current implementations of sync(2) and syncer vnode fsync() VOP use the mnt_noasync counter to temporarily remove the MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of the MNTK_ASYNC option is harmful because not only i/o started from the corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity. Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for the current thread only. Use the opportunity to move the DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC. Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons. Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks	2012-02-06 11:04:36 +00:00
Kevin Lo	0ca2381d9b	- Use uint8_t for the variable x and spell the size of the variable as sizeof(x) - Capitalized comment - Parentheses around return value Requested by: bde	2012-02-06 06:03:16 +00:00
Martin Matuska	a91d2201f9	Analogous to r230407 a separate path buffer in vfs_mount.c is required for r230129. Fixes an out-of-bounds write to fspath. MFC after: 10 days	2012-02-05 10:59:50 +00:00
David Xu	d56e058a79	Add 32-bit compat code for AIO kevent flags introduced in revision 230857.	2012-02-05 04:49:31 +00:00
Ryan Stone	312ac3a23a	Whenever a new kernel thread is spawned, explicitly clear any CPU affinity set on the new thread. This prevents the thread from inadvertently inheriting affinity from a random sibling. Submitted by: attilio Tested by: pho MFC after: 1 week	2012-02-04 16:49:29 +00:00
Hiroki Sato	cf8b832511	Fix input validation in SO_SETFIB. Reviewed by: bz MFC after: 1 day	2012-02-04 15:00:26 +00:00
Bjoern A. Zeeb	ee799639e8	Add SO_SETFIB option support on PF_INET6 sockets and allow inheriting the FIB number from the process, as set by setfib(2), on socket creation. Sponsored by: Cisco Systems, Inc.	2012-02-03 11:00:53 +00:00
Konstantin Belousov	6af519cf18	Add kqueue support to /dev/klog. Submitted by: Mateusz Guzik <mjguzik gmail com> PR: kern/156423 MFC after: 1 week	2012-02-01 14:34:52 +00:00
David Xu	fde809356a	If multiple threads call kevent() to get AIO events on the same kqueue fd, it is possible that a single AIO event will be reported to multiple threads. This is not threading friendly, and the existing API cannot control this behavior. Allocate a kevent flags field sigev_notify_kevent_flags for AIO event notification in sigevent, and allow the user to pass EV_CLEAR, EV_DISPATCH or EV_ONESHOT to AIO kernel code, so the user can control whether the event should be cleared once it is retrieved by a thread. This change should be compatible with existing applications, because the field should have already been zero-filled, and no additional action will be taken by the kernel. PR: kern/156567	2012-02-01 02:53:06 +00:00
Konstantin Belousov	6ad1ff09cc	A debugger which requested PT_FOLLOW_FORK should get the notification about new child not only when doing PT_TO_SCX, but also for PT_CONTINUE. If TDB_FORK flag is set, always issue a stop, the same as is done for TDB_EXEC. Reported by: Dmitry Mikulin <dmitrym juniper net> MFC after: 1 week	2012-01-30 20:00:29 +00:00
John Baldwin	2bd3e4c2c2	Refine the implementation of POSIX_FADV_NOREUSE for the read(2) case such that instead of using direct I/O it allows read-ahead similar to POSIX_FADV_NORMAL, but invokes VOP_ADVISE(POSIX_FADV_DONTNEED) after the read(2) has completed to purge just-read data. The write(2) path continues to use direct I/O for POSIX_FADV_NOREUSE for now. Note that NOREUSE works optimally if an application reads and writes full fs blocks.	2012-01-30 19:35:15 +00:00
Doug Ambrisko	8e9fc27818	When detaching AIO or LIO requests, grab the lock and tell knlist_remove that we have the lock now. This cleans up a locking panic ASSERT when knlist_empty is called without a lock when INVARIANTS etc. are turned on. Reviewed by: kib jhb MFC after: 1 week	2012-01-30 19:19:22 +00:00
Konstantin Belousov	62c625fdd2	Finally, try to enable the nxstacks on amd64 and powerpc64 for both 64bit and 32bit ABIs. Also try to enable nxstacks for PAE/i386 when supported, and some variants of powerpc32. MFC after: 2 months (if ever)	2012-01-30 07:56:00 +00:00
Attilio Rao	5d7380f8e3	Avoid checking the same cache line/variable from all the locking primitives by breaking stop_scheduler into a per-thread variable. Also, store the new td_stopsched very close to td_*locks members as they will be accessed mostly in the same codepaths as td_stopsched and this results in avoiding a further cache-line pollution, possibly. STOP_SCHEDULER() was pondered to use a new 'thread' argument, in order to take advantage of already cached curthread, but in the end there should not really be a performance benefit, while introducing a KPI breakage. In collaboration with: flo Reviewed by: avg MFC after: 3 months (or never) X-MFC: r228424	2012-01-28 14:00:21 +00:00
Gleb Smirnoff	94fce84763	Fix the size check that prevents the value from becoming negative after casting to a signed type. Reviewed by: bde	2012-01-27 08:58:58 +00:00
Kenneth D. Merry	7e949c467c	Xen netback driver rewrite. share/man/man4/Makefile, share/man/man4/xnb.4, sys/dev/xen/netback/netback.c, sys/dev/xen/netback/netback_unit_tests.c: Rewrote the netback driver for xen to attach properly via newbus and work properly in both HVM and PVM mode (only HVM is tested). Works with the in-tree FreeBSD netfront driver or the Windows netfront driver from SuSE. Has not been extensively tested with a Linux netfront driver. Does not implement LRO, TSO, or polling. Includes unit tests that may be run through sysctl after compiling with XNB_DEBUG defined. sys/dev/xen/blkback/blkback.c, sys/xen/interface/io/netif.h: Comment elaboration. sys/kern/uipc_mbuf.c: Fix page fault in kernel mode when calling m_print() on a null mbuf. Since m_print() is only used for debugging, there are no performance concerns for extra error checking code. sys/kern/subr_scanf.c: Add the "hh" and "ll" width specifiers from C99 to scanf(). A few callers were already using "ll" even though scanf() was handling it as "l". Submitted by: Alan Somers <alans@spectralogic.com> Submitted by: John Suykerbuyk <johns@spectralogic.com> Sponsored by: Spectra Logic MFC after: 1 week Reviewed by: ken	2012-01-26 16:35:09 +00:00
Gleb Smirnoff	434ea137cc	Although aio_nbytes is size_t, later it is cast to signed types: to ssize_t in filesystem code and to int in buf code, thus supplying a negative argument leads to a kernel panic later. To fix that, check the user-supplied argument at the beginning of the syscall. Submitted by: Maxim Dounin <mdounin mdounin.ru>, maxim@	2012-01-26 11:59:48 +00:00
Konstantin Belousov	abc942b56c	When doing vflush(WRITECLOSE), clean vnode pages. Unmounts do vfs_msync() before calling VFS_UNMOUNT(), but there is still a race allowing a process to dirty pages after msync finished. Remounts rw->ro just left dirty pages in system. Reviewed by: alc, tegge (long time ago) Tested by: pho MFC after: 2 weeks	2012-01-25 20:54:09 +00:00
Konstantin Belousov	d5210589b7	Fix remaining calls to cache_enter() in both NFS clients to provide appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for. Reviewed by: jhb (previous version) MFC after: 2 weeks	2012-01-25 20:48:20 +00:00
Mikolaj Golub	45efc9b4aa	Fix CTL flags in the declarations of KERN_PROC_ENV, AUXV and PS_STRINGS sysctls: they are read only. MFC after: 1 week	2012-01-25 20:15:58 +00:00
Konstantin Belousov	7a7e609a32	Apparently, both nfs clients do not use cache_enter_time() consistently, creating some namecache entries without the NCF_TS flag. This causes a panic due to a failed assertion. As a temporary relief, remove the assert. Return the epoch timestamp for the entries without a timestamp if asked. While there, consolidate the code which returns timestamps into a helper cache_out_ts(). Discussed with: jhb MFC after: 2 weeks	2012-01-23 17:09:23 +00:00
Gleb Smirnoff	93a1b4c4cf	Convert panic()s to KASSERT()s. This is an optimisation for hashdestroy() since in absence of INVARIANTS a compiler will drop the entire for() cycle.	2012-01-23 16:31:46 +00:00
Mikolaj Golub	8854fe3915	Change kern.proc.rlimit sysctl to: - retrieve only one, specified limit for a process, not the whole array, as it was previously (the sysctl has been added recently and has not been backported to stable yet, so this change is ok); - allow to set a resource limit for another process. Submitted by: Andrey Zonov <andrey at zonov.org> Discussed with: kib Reviewed by: kib MFC after: 2 weeks	2012-01-22 20:25:00 +00:00
Pawel Jakub Dawidek	9b9a01792d	TDF_* flags should be used with td_flags field and TDP_* flags should be used with td_pflags field. Correct two places where it was not the case. Discussed with: kib MFC after: 1 week	2012-01-22 11:01:36 +00:00
Konstantin Belousov	c2b396f294	Remove the nc_time and nc_ticks elements from struct namecache, and provide struct namecache_ts which is the old struct namecache. Only allocate struct namecache_ts if non-null struct timespec *tsp was passed to cache_enter_time, otherwise use struct namecache. Change struct namecache allocation and deallocation macros into static functions, since logic becomes somewhat twisty. Provide accessor for the nc_name member of struct namecache to hide difference between struct namecache and namecache_ts. The aim of the change is to not waste 20 bytes per small namecache entry. Reviewed by: jhb MFC after: 2 weeks X-MFC-note: after r230394	2012-01-22 01:11:06 +00:00
Martin Matuska	6dfe0a3dc2	Use separate buffer for global path to avoid overflow of path buffer. Reviewed by: jamie@ MFC after: 3 weeks	2012-01-21 00:06:21 +00:00
John Baldwin	5aefb4cbbf	Close a race in NFS lookup processing that could result in stale name cache entries on one client when a directory was renamed on another client. The root cause for the stale entry being trusted is that each per-vnode nfsnode structure has a single 'n_ctime' timestamp used to validate positive name cache entries. However, if there are multiple entries for a single vnode, they all share a single timestamp. To fix this, extend the name cache to allow filesystems to optionally store a timestamp value in each name cache entry. The NFS clients now fetch the timestamp associated with each name cache entry and use that to validate cache hits instead of the timestamps previously stored in the nfsnode. Another part of the fix is that the NFS clients now use timestamps from the post-op attributes of RPCs when adding name cache entries rather than pulling the timestamps out of the file's attribute cache. The latter is subject to races with other lookups updating the attribute cache concurrently. Some more details: - Add a variant of nfsm_postop_attr() to the old NFS client that can return a vattr structure with a copy of the post-op attributes. - Handle lookups of "." as a special case in the NFS clients since the name cache does not store name cache entries for ".", so we cannot get a useful timestamp. It didn't really make much sense to recheck the attributes on the directory to validate the namecache hit for "." anyway. - ABI compat shims for the name cache routines are present in this commit so that it is safe to MFC. MFC after: 2 weeks	2012-01-20 20:02:01 +00:00
Konstantin Belousov	2974cc36f7	Use shared lock for the executable vnode in the exec path after the VV_TEXT changes are handled. Assert that vnode is exclusively locked at the places that modify VV_TEXT. Discussed with: alc MFC after: 3 weeks	2012-01-19 23:03:31 +00:00
Alan Cox	1dfab8025e	Explain why it is safe to unlock the vnode. Requested by: kib	2012-01-17 16:20:50 +00:00
Kirk McKusick	cc672d3599	Make sure all intermediate variables holding mount flags (mnt_flag) and that all internal kernel calls passing mount flags are declared as uint64_t so that flags in the top 32-bits are not lost. MFC after: 2 weeks	2012-01-17 01:08:01 +00:00
Alan Cox	292177e67a	Improve abstraction. Eliminate direct access by elf_load_section() to an OBJT_VNODE-specific field of the vm object. The same information can be just as easily obtained from the struct vattr that is in struct image_params if the latter is passed to elf_load_section(). Moreover, by replacing the vmspace and vm object parameters to elf*_load_section() with a struct image_params parameter, we actually reduce the size of the object code. In collaboration with: kib	2012-01-17 00:27:32 +00:00
Sergey Kandaurov	037f43d3ef	Be pedantic and change // comment to C-style one. Noticed by: Bruce Evans	2012-01-16 20:42:56 +00:00
Kevin Lo	575cabed9e	Fix a style bug Spotted by: avg	2012-01-16 14:54:48 +00:00
David Xu	29a06690ca	Eliminate branch and insert an explicit reader memory barrier to ensure that waiter bit is set before reading semaphore count.	2012-01-16 04:39:10 +00:00
Mikolaj Golub	fe7f89b71a	Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want to read strings completely to know the actual size. As a side effect it fixes the issue with kern.proc.args and kern.proc.env sysctls, which didn't return the size of available data when calling sysctl(3) with the NULL argument for oldp. Note, in get_ps_strings(), which does actual work for proc_getargv() and proc_getenvv(), we still have a safety limit on the size of data read in case of a corrupted process stack. Suggested by: kib MFC after: 3 days	2012-01-15 18:47:24 +00:00
Martin Matuska	9cbe30e1d5	Fix missing in r230129: kern_jail.c: initialize fullpath_disabled to zero vfs_cache.c: add missing dot in comment Reported by: kib MFC after: 1 month	2012-01-15 18:08:15 +00:00
Ulrich Spörlein	9a14aa017b	Convert files to UTF-8	2012-01-15 13:23:18 +00:00
Martin Matuska	f6e633a9e1	Introduce vn_path_to_global_path() This function updates path string to vnode's full global path and checks the size of the new path string against the pathlen argument. In vfs_domount(), sys_unmount() and kern_jail_set() this new function is used to update the supplied path argument to the respective global path. Unbreaks jailed zfs(8) with enforce_statfs set to 1. Reviewed by: kib MFC after: 1 month	2012-01-15 12:08:20 +00:00
Eitan Adler	886e862866	- Fix undefined behavior when device_get_name is null - Make error message more informative PR: kern/149800 Submitted by: olgeni Approved by: cperciva MFC after: 1 week	2012-01-15 07:09:18 +00:00
Oleksandr Tymoshenko	4104e83567	Fix kernel modules loading for MIPS64 kernel: On amd64, link_elf_obj.c must specify KERNBASE rather than VM_MIN_KERNEL_ADDRESS to vm_map_find() because kernel loadable modules must be mapped for execution in the same upper region of the kernel map as the kernel code and data segments. For MIPS32 KERNBASE lies below KVA area (it's less than VM_MIN_KERNEL_ADDRESS) so basically vm_map_find got whole KVA to look through. On MIPS64 it's not the case because KERNBASE is set to the very end of XKSEG, well out of KVA bounds, so vm_map_find always fails. We should use VM_MIN_KERNEL_ADDRESS as a base for vm_map_find. Details obtained from: alc@	2012-01-14 00:36:07 +00:00
John Baldwin	fbcebf7f71	Convert the per-interface address list lock from a mutex to a reader/writer lock. Reviewed by: bz	2012-01-09 19:34:12 +00:00
Andriy Gapon	90d8265326	enable stop_scheduler_on_panic by default My plan is to make this behavior unconditional before 10.0 release. X-MFC after: r228424 (if ever)	2012-01-09 12:06:09 +00:00
Konstantin Belousov	3ab0160340	Avoid LOR between vfs_busy() lock and covered vnode lock on quotaon(). The vfs_busy() is after covered vnode lock in the global lock order, but since quotaon() does recursive VFS call to open quota file, we usually end up locking covered vnode after mp is busied in sys_quotactl(). Change the interface of VFS_QUOTACTL(), requiring that mp was unbusied by fs code, and do not try to pick up vfs_busy() reference in ufs quotaon, esp. if vfs_busy cannot succeed due to unmount being performed. Reported and tested by: pho MFC after: 1 week	2012-01-08 23:06:53 +00:00
Alan Cox	2971897d51	Correct an error of omission in the implementation of the truncation operation on POSIX shared memory objects and tmpfs. Previously, neither of these modules correctly handled the case in which the new size of the object or file was not a multiple of the page size. Specifically, they did not handle partial page truncation of data stored on swap. As a result, stale data might later be returned to an application. Interestingly, a data inconsistency was less likely to occur under tmpfs than POSIX shared memory objects. The reason being that a different mistake by the tmpfs truncation operation helped avoid a data inconsistency. If the data was still resident in memory in a PG_CACHED page, then the tmpfs truncation operation would reactivate that page, zero the truncated portion, and leave the page pinned in memory. More precisely, the benevolent error was that the truncation operation didn't add the reactivated page to any of the paging queues, effectively pinning the page. This page would remain pinned until the file was destroyed or the page was read or written. With this change, the page is now added to the inactive queue. Discussed with: jhb Reviewed by: kib (an earlier version) MFC after: 3 weeks	2012-01-08 20:09:26 +00:00
Hiroki Sato	ca54e1aee3	Fix a typo. (s/nessesary/necessary/)	2012-01-08 18:48:36 +00:00
John Baldwin	71eeeaf256	Add 5 spare VOPs as placeholders to avoid breaking the KBI in the future when new VOPs are MFC'd to a branch. Reviewed by: kib, bz MFC after: 3 days	2012-01-06 20:06:45 +00:00
John Baldwin	908cac07ce	Use proper argument structure types for the extattr post-VOP hooks. The wrong structure happened to work since the only argument used was the vnode which is in the same place in both VOP_SETATTR() and the two extattr VOPs. MFC after: 3 days	2012-01-06 20:05:48 +00:00
John Baldwin	948c460971	Fix a logic bug in change 228207 in the check for a thread's new user priority being a realtime priority. MFC after: 3 days	2012-01-05 19:02:52 +00:00
John Baldwin	137f91e80f	Convert all users of IF_ADDR_LOCK to use new locking macros that specify either a read lock or write lock. Reviewed by: bz MFC after: 2 weeks	2012-01-05 19:00:36 +00:00
John Baldwin	7e3a96ea37	Some small fixes to CPU accounting for threads: - Only initialize the per-cpu switchticks and switchtime in sched_throw() for the very first context switch on APs during boot. This avoids a small gap between the middle of thread_exit() and sched_throw() where time is not accounted to any thread. - In thread_exit(), update the timestamp bookkeeping to track the changes to mi_switch() introduced by td_rux so that the code once again matches the comment claiming it is mimicking mi_switch(). Specifically, only update the per-thread stats directly and depend on ruxagg() to update p_rux rather than adjusting p_rux directly. While here, move the timestamp bookkeeping as late in the function as possible. Reviewed by: bde, kib MFC after: 1 week	2012-01-03 21:03:28 +00:00
Ed Schouten	dc15eac046	Use strchr() and strrchr(). It seems strchr() and strrchr() are used more often than index() and rindex(). Therefore, simply migrate all kernel code to use it. For the XFS code, remove an empty line to make the code identical to the code in the Linux kernel.	2012-01-02 12:12:10 +00:00
Konstantin Belousov	cdb7a43117	Avoid double-unlock or double unreference for ndp->ni_dvp when the vnode dp lock upgrade right after the 'success' label fails. In collaboration with: pho MFC after: 1 week	2012-01-01 18:45:59 +00:00
John Baldwin	0c0d27d5dd	Cap the priority calculated from the current thread's running tick count at SCHED_PRI_RANGE to prevent overflows in the priority value. This can happen due to irregularities with clock interrupts under certain virtualization environments. Tested by: Larry Rosenman ler lerctr org MFC after: 2 weeks	2011-12-29 16:17:16 +00:00
Lawrence Stewart	6cedd609b7	Introduce the sysclock_getsnapshot() and sysclock_snap2bintime() KPIs. The sysclock_getsnapshot() function allows the caller to obtain a snapshot of all the system clock and timecounter state required to create time stamps at a later point. The sysclock_snap2bintime() function converts a previously obtained snapshot into a bintime time stamp according to the specified flags e.g. which system clock, uptime vs absolute time, etc. These KPIs enable useful functionality, including direct comparison of the feedback and feed-forward system clocks and generation of multiple time stamps with different formats from a single timecounter read. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ In collaboration with: Julien Ridoux (jridoux at unimelb edu au)	2011-12-24 01:32:01 +00:00
John Baldwin	f0d6c5caf0	Add post-VOP hooks for VOP_DELETEEXTATTR() and VOP_SETEXTATTR() and use these to trigger a NOTE_ATTRIB EVFILT_VNODE kevent when the extended attributes of a vnode are changed. Note that OS X already implements this behavior. Reviewed by: rwatson MFC after: 2 weeks	2011-12-23 20:11:37 +00:00
John Baldwin	268e76d86e	Use TASK_INITIALIZER() for dev_dtr_task rather than a dedicated SYSINIT().	2011-12-22 16:01:10 +00:00
Andriy Gapon	167057914b	ule: ensure that batch timeshare threads are scheduled fairly With the previous code, if the range of priorities for timeshare batch threads was greater than RQ_NQS, then the threads with low priorities in the part of the range above RQ_NQS would be scheduled to the run-queues as if they had high priorities at the beginning of the range. In other words, threads with a nice level of +N could be scheduled as if they had a nice level of -M. Reported by: George Mitchell <george@m5p.com> Reviewed by: jhb Tested by: George Mitchell <george@m5p.com> (earlier version) MFC after: 1 week	2011-12-19 20:01:21 +00:00
Mikolaj Golub	547b155eb1	Fix style and white spaces. MFC after: 1 week	2011-12-17 22:18:26 +00:00
Mikolaj Golub	fa3935bcea	On start most of sysctl_kern_proc functions use the same pattern: locate a process calling pfind() and do some additional checks like p_candebug(). To reduce this code duplication a new function pget() is introduced and used. As the function may be useful not only in kern_proc.c it is in the kernel name space. Suggested by: kib Reviewed by: kib MFC after: 2 weeks	2011-12-17 16:59:22 +00:00
Andriy Gapon	f389bc9585	belatedly transfer copyrights from libkern/gets.c to kern_cons.c MFC after: 2 months MFC with: r228642	2011-12-17 15:50:45 +00:00
Andriy Gapon	f6ce353e58	replace uses of libkern gets with cngets MFC after: 2 months	2011-12-17 15:26:34 +00:00
Andriy Gapon	8e62854265	introduce cngets, a method for kernel to read a string from console This is intended as a replacement for libkern's gets and mostly borrows its implementation. It uses cngrab/cnungrab to delimit kernel's access to console input. Note: libkern's gets obviously doesn't share any bits of implementation with libc's gets. They also have different APIs and the former doesn't have the overflow problems of the latter. Inspired by: bde MFC after: 2 months	2011-12-17 15:16:54 +00:00
Andriy Gapon	bf8696b408	introduce cngrab/cnungrab stub calls in some places where they make sense MFC after: 2 months	2011-12-17 15:11:22 +00:00
Andriy Gapon	9976156f12	kern cons: introduce infrastructure for console grabbing by kernel At the moment grab and ungrab methods of all console drivers are no-ops. Current intended meaning of the calls is that the kernel takes control of console input. In the future the semantics may be extended to mean that the calling thread takes full ownership of the console (e.g. console output from other threads could be suspended). Inspired by: bde MFC after: 2 months	2011-12-17 15:08:43 +00:00
John Baldwin	f427c78b19	Fire a kevent if necessary after seeking on a regular file. This fixes a case where a kevent would not fire on a regular file if an application read to EOF and then seeked backwards into the file. Reviewed by: kib MFC after: 2 weeks	2011-12-16 20:10:00 +00:00
John Baldwin	338e7cf235	Use vm_mmap_to_errno(). Submitted by: kib	2011-12-15 15:17:19 +00:00
Jilles Tjoelker	6d1c58f8a2	Fix select/poll/kqueue for write on reverse direction before first write. The reverse direction of a pipe is lazily allocated on the first write in that direction (because pipes are usually used in one direction only). A special case is needed to ensure the pipe appears writable before the first write because there are 0 bytes of pending data in 0 bytes of buffer space at that point, leaving 0 bytes of data that can be written with the normal code. Note that the first write returns [ENOMEM] if kern.ipc.maxpipekva is exceeded and does not block or return [EAGAIN], so selecting true for write is correct even in that case. PR: kern/93685 Submitted by: gianni MFC after: 2 weeks	2011-12-14 22:26:39 +00:00
John Baldwin	fb680e16f4	Add a helper API to allow in-kernel code to map portions of shared memory objects created by shm_open(2) into the kernel's address space. This provides a convenient way for creating shared memory buffers between userland and the kernel without requiring custom character devices.	2011-12-14 22:22:19 +00:00
David E. O'Brien	1c5151f3f8	Match other formatting.	2011-12-14 02:31:32 +00:00
David E. O'Brien	3d7618d8bf	Disallow various debug.kdb sysctl's when securelevel is raised. PR: 161350	2011-12-13 17:59:16 +00:00
Eitan Adler	9910b854c6	- Add a sysctl to allow non-root users the ability to set idle priorities. - While here fix up some style nits. Discussed with: cperciva (briefly) Reviewed by: pjd (earlier version) Reviewed by: bde Approved by: jhb MFC after: 1 month	2011-12-13 14:00:27 +00:00
Eitan Adler	3eb9ab5255	Document a large number of currently undocumented sysctls. While here fix some style(9) issues and reduce redundancy. PR: kern/155491 PR: kern/155490 PR: kern/155489 Submitted by: Galimov Albert <wtfcrap@mail.ru> Approved by: bde Reviewed by: jhb MFC after: 1 week	2011-12-13 00:38:50 +00:00
Andriy Gapon	7a7ce668ef	put sys/systm.h at its proper place or add it if missing Reported by: lstewart, tinderbox Pointyhat to: avg, attilio MFC after: 1 week MFC with: r228430	2011-12-12 10:05:13 +00:00
Andriy Gapon	0e225211a0	kern_racct: move sys/systm.h inclusion to its proper place This should fix the build failure introduced with r228424. Also remove duplicate inclusion of sys/param.h. Pointyhat to: avg MFC after: 1 week	2011-12-12 07:46:10 +00:00
Andriy Gapon	353705930f	panic: add a switch and infrastructure for stopping other CPUs in SMP case Historical behavior of letting other CPUs merrily go on is a default for the time being. The new behavior can be switched on via kern.stop_scheduler_on_panic tunable and sysctl. Stopping of the CPUs has (at least) the following benefits: - more of the system state at panic time is preserved intact - threads and interrupts do not interfere with dumping of the system state Only one thread runs uninterrupted after panic if stop_scheduler_on_panic is set. That thread might call code that is also used in normal context and that code might use locks to prevent concurrent execution of certain parts. Those locks might be held by the stopped threads and would never be released. To work around this issue, it was decided that instead of explicit checks for panic context, we would rather put those checks inside the locking primitives. This change has substantial portions written and re-written by attilio and kib at various times. Other changes are heavily based on the ideas and patches submitted by jhb and mdf. bde has provided many insights into the details and history of the current code. The new behavior may cause problems for systems that use a USB keyboard for interfacing with system console. This is because of some unusual locking patterns in the ukbd code which have to be used because on one hand ukbd is below syscons, but on the other hand it has to interface with other usb code that uses regular mutexes/Giant for its concurrency protection. Dumping to USB-connected disks may also be affected. PR: amd64/139614 (at least) In cooperation with: attilio, jhb, kib, mdf Discussed with: arch@, bde Tested by: Eugene Grosbein <eugen@grosbein.net>, gnn, Steven Hartland <killing@multiplay.co.uk>, glebius, Andrew Boyer <aboyer@averesystems.com> (various versions of the patch) MFC after: 3 months (or never)	2011-12-11 21:02:01 +00:00
Peter Holm	cdea31e305	Move cpu_set_upcall(newtd, td) up before the first call of thread_free(newtd). This to avoid a possible page fault in cpu_thread_clean() as seen on amd64 with syscall fuzzing. Reviewed by: kib MFC after: 1 week	2011-12-09 17:19:41 +00:00
Eitan Adler	5a01b72672	- Fix ktrace leakage if error is set PR: kern/163098 Submitted by: Loganaden Velvindron <loganaden@devio.us> Approved by: sbruno@ MFC after: 1 month	2011-12-08 03:20:38 +00:00
Alan Cox	ea3f07d3a0	Eliminate stale numbers from a comment.	2011-12-07 16:27:23 +00:00
Alan Cox	c749c003b8	Eliminate the possibility of 32-bit arithmetic overflow in the calculation of vm_kmem_size that may occur if the system administrator has specified a vm.vm_kmem_size tunable value that exceeds the hard cap. PR: 162741 Submitted by: Adam McDougall Reviewed by: bde@ MFC after: 3 weeks	2011-12-07 07:03:14 +00:00
Konstantin Belousov	93c26de0ad	Most users of pipe(2) do not call fstat(2) on the returned pipe descriptors. Optimize for this case by lazily allocating the pipe inode number at fstat(2) time. If alloc_unr(9) returns failure, do not fail fstat(2), since uses of inode numbers are even rarer than fstat(2), but provide zero inode forever. Note that alloc_unr() failure is unlikely because the total number of pipes in the system is limited by the number of file descriptors. Based on the submission by: gianni MFC after: 2 weeks	2011-12-06 11:24:03 +00:00
Mikolaj Golub	9e94d5b83f	Really protect kern.proc.ps_strings sysctls with p_candebug(). This was intended to be in r228288. Spotted by: many MFC after: 1 week	2011-12-06 06:40:14 +00:00
Mikolaj Golub	c65932be9d	Protect kern.proc.auxv and kern.proc.ps_strings sysctls with p_candebug(). Citing jilles: If we are ever going to do ASLR, the AUXV information tells an attacker where the stack, executable and RTLD are located, which defeats much of the point of randomizing the addresses in the first place. Given that the AUXV information seems to be used by debuggers only anyway, I think it would be good to move it to p_candebug() now. The full virtual memory maps (KERN_PROC_VMMAP, procstat -v) are already under p_candebug(). Suggested by: jilles Discussed with: rwatson MFC after: 1 week	2011-12-05 19:34:02 +00:00
Kevin Lo	2b69bb1f27	Add a missing curly bracket	2011-12-05 10:34:52 +00:00
Andriy Gapon	5e27a60372	critical_exit: ignore td_owepreempt if kdb_active is set calling mi_switch in such a context results in a recursion via kdb_switch Suggested by: jhb Reviewed by: jhb MFC after: 5 weeks	2011-12-04 21:27:41 +00:00
Mikolaj Golub	0f60ecdaa4	In sysctl_kern_proc_ps_strings() there is not much sense in checking for P_WEXIT and P_SYSTEM flags. Reviewed by: kib	2011-12-04 21:24:01 +00:00
Hans Petter Selasky	494b6fec82	Make sure the description of pause() is equivalent to its implementation. No code change. Suggested by: Bruce Evans MFC after: 3 days	2011-12-03 15:51:15 +00:00
Eitan Adler	f565f395c6	- Fix typos s/(more\|less) then\|\1 than/ Submitted by: Davide Italiano <davide.italiano@gmail.com> Approved by: brucec MFC after: 3 days	2011-12-03 15:41:37 +00:00
Peter Holm	9a1d0cf68f	Use umtx_copyin_timeout() to copy and check timeout parameter. In collaboration with: kib MFC after: 1 week	2011-12-03 12:35:13 +00:00
Peter Holm	662ebe9b53	Add umtx_copyin_timeout() and move parameter checks here. In collaboration with: kib MFC after: 1 week	2011-12-03 12:30:58 +00:00
Peter Holm	ff77dfb0c1	Rename copyin_timeout32 to umtx_copyin_timeout32 and move parameter check here. Include check for negative seconds value. In collaboration with: kib MFC after: 1 week	2011-12-03 12:28:33 +00:00
Marius Strobl	002214d6bb	It doesn't make much sense to check whether child is NULL after already having dereferenced it. We either should generally check the device_t's supplied to bus functions before using them (which we seem to virtually never do) or just assume that they are not NULL. While at it make this code fit 78 columns. Found with: Coverity Prevent(tm) CID: 4230	2011-12-02 22:03:27 +00:00
Marius Strobl	f60d6c2bdf	- In device_probe_child(9) check the return value of device_set_driver(9) when actually setting a driver as especially ENOMEM is fatal in these cases. - Annotate other calls to device_set_devclass(9) and device_set_driver(9) without the return value being checked and that are okay to fail. Reviewed by: yongari (slightly earlier version)	2011-12-02 21:19:14 +00:00
John Baldwin	593dd43eee	When changing the user priority of a thread, change the real priority in addition to the user priority for threads whose current real priority is equal to the previous user priority or if the new priority is a real-time priority. This allows priority changes of other threads to have an immediate effect. MFC after: 2 weeks	2011-12-02 19:59:46 +00:00
Konstantin Belousov	5ed954efd1	If alloc_unr() call in the pipe_create() failed, then pipe->pipe_ino is -1. But, because ino_t is unsigned, this case was not covered by the test ino > 0 in pipeclose(), leading to the free_unr(-1). Fix it by explicitly comparing with 0 and -1. [1] Do not access freed memory, the inode number was cached to prevent access to cpipe after it possibly was freed, but I failed to commit the right patch. Noted by: gianni [1] Pointy hat to: kib MFC after: 3 days	2011-12-01 11:36:41 +00:00
Lawrence Stewart	66dcfed32a	Revise the sysctl handling code and restructure the hierarchy of sysctls introduced when feed-forward clock support is enabled in the kernel: - Rename the "choice" variable to "available". - Streamline the implementation of the "active" variable's sysctl handler function. - Create a kern.sysclock sysctl node for general sysclock related configuration options. Place the "available" and "active" variables under this node. - Create a kern.sysclock.ffclock sysctl node for feed-forward clock specific configuration options. Place the "version" and "ffcounter_bypass" variables under this node. - Tweak some of the description strings. Discussed with: Julien Ridoux (jridoux at unimelb edu au)	2011-12-01 07:19:13 +00:00
Konstantin Belousov	dc874f9881	Rename vm_page_set_valid() to vm_page_set_valid_range(). The vm_page_set_valid() is the most reasonable name for the m->valid accessor. Reviewed by: attilio, alc	2011-11-30 17:39:00 +00:00
Lawrence Stewart	6f83fc5112	Make sysclock_active publicly available to external consumers. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ Discussed with: Julien Ridoux (jridoux at unimelb edu au) Submitted by: Julien Ridoux (jridoux at unimelb edu au)	2011-11-29 08:43:04 +00:00
Lawrence Stewart	88394fe42c	Do away with the somewhat clunky sysclock_ops structure and associated code, reimplementing the [get]{bin,nano,micro}[up]time() wrapper functions in terms of the new "fromclock" API instead. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ Discussed with: Julien Ridoux (jridoux at unimelb edu au) Submitted by: Julien Ridoux (jridoux at unimelb edu au)	2011-11-29 08:33:40 +00:00
Lawrence Stewart	e977bac333	Make the fbclock_[get]{bin,nano,micro}[up]time() function prototypes public so that new APIs with some performance sensitivity can be built on top of them. These functions should not be called directly except in special circumstances. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ Discussed with: Julien Ridoux (jridoux at unimelb edu au) Submitted by: Julien Ridoux (jridoux at unimelb edu au)	2011-11-29 06:53:36 +00:00
Lawrence Stewart	c2a4ee9906	Fix an oversight in r227747 by calling fbclock_bin{up}time() directly from the fbclock_{nanouptime\|microuptime\|bintime\|nanotime\|microtime}() functions to avoid indirecting through a sysclock_ops wrapper function. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ Submitted by: Julien Ridoux (jridoux at unimelb edu au)	2011-11-29 06:12:19 +00:00
Mikolaj Golub	9732458f35	Add sysctl to retrieve ps_strings structure location of another process. Suggested by: kib Reviewed by: kib	2011-11-27 17:05:26 +00:00
Mikolaj Golub	4fd6053b43	In sysctl_kern_proc_auxv the process was released too early: we still need to hold it when checking process sv_flags. MFC after: 2 weeks	2011-11-27 16:56:01 +00:00
Lawrence Stewart	66761af34f	Export the "ffclock" feature for kernels compiled with feed-forward clock support. Suggested by: netchild Reviewed by: netchild	2011-11-26 01:44:37 +00:00
Mikolaj Golub	9e7d058351	Add sysctl to get process resource limits. Reviewed by: kib MFC after: 2 weeks	2011-11-24 20:43:37 +00:00
Konstantin Belousov	561984be06	Fix a race between getvnode() dereferencing half-constructed file and dupfdopen(). Reported and tested by: pho MFC after: 3 days	2011-11-24 20:34:06 +00:00
Mikolaj Golub	7ad9baae41	Fix build without INVARIANTS. Discussed with: kib	2011-11-23 08:11:04 +00:00
Hans Petter Selasky	3b12bdb58f	Rename device_delete_all_children() into device_delete_children(). Suggested by: jhb @ and marius @ MFC after: 1 week	2011-11-22 21:56:55 +00:00
Hans Petter Selasky	5b288d2abf	Style change. Suggested by: jhb @ and marius @ MFC after: 1 week	2011-11-22 21:53:19 +00:00
Mikolaj Golub	c5cfcb1c19	Add new sysctls, KERN_PROC_ENV and KERN_PROC_AUXV, to return environment strings and ELF auxiliary vectors from a process stack. Make sysctl_kern_proc_args to read not cached arguments from the process stack. Export proc_getargv() and proc_getenvv() so they can be reused by procfs and linprocfs. Suggested by: kib Reviewed by: kib Discussed with: kib, rwatson, jilles Tested by: pho MFC after: 2 weeks	2011-11-22 20:40:18 +00:00
Lawrence Stewart	65e359a15c	- Add Pulse-Per-Second timestamping using raw ffcounter and corresponding ffclock time in seconds. - Add IOCTL to retrieve ffclock timestamps from userland. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ Submitted by: Julien Ridoux (jridoux at unimelb edu au)	2011-11-21 13:34:29 +00:00
Attilio Rao	9fde98bba3	Introduce the same mutex-wise fix in r227758 for sx locks. The functions that offer file and line specifications are: - sx_assert_ - sx_downgrade_ - sx_slock_ - sx_slock_sig_ - sx_sunlock_ - sx_try_slock_ - sx_try_xlock_ - sx_try_upgrade_ - sx_unlock_ - sx_xlock_ - sx_xlock_sig_ - sx_xunlock_ Now vm_map locking is fully converted and can avoid to know specifics about locking procedures. Reviewed by: kib MFC after: 1 month	2011-11-21 12:59:52 +00:00
Sergey Kandaurov	ca4aa8c363	Remove no more relevant XXXRW comments since accessing the vmspace is now properly done with the acquired vmspace reference. Pointed out by: kib	2011-11-21 12:21:00 +00:00
Sergey Kandaurov	18be8527e9	Use the acquired reference to the vmspace instead of direct dereferencing of p->p_vmspace like it is done in sysctl_kern_proc_vmmap().	2011-11-21 10:36:57 +00:00
Lawrence Stewart	cf13a58510	- Add the ffclock_getcounter(), ffclock_getestimate() and ffclock_setestimate() system calls to provide feed-forward clock management capabilities to userspace processes. ffclock_getcounter() returns the current value of the kernel's feed-forward clock counter. ffclock_getestimate() returns the current feed-forward clock parameter estimates and ffclock_setestimate() updates the feed-forward clock parameter estimates. - Document the syscalls in the ffclock.2 man page. - Regenerate the script-derived syscall related files. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ Submitted by: Julien Ridoux (jridoux at unimelb edu au)	2011-11-21 01:26:10 +00:00
Attilio Rao	ccdf233323	Introduce macro stubs in the mutex implementation that will be always defined and will allow consumers, willing to provide options, file and line to locking requests, to not worry about options redefining the interfaces. This is typically useful when there is the need to build another locking interface on top of the mutex one. The introduced functions that consumers can use are: - mtx_lock_flags_ - mtx_unlock_flags_ - mtx_lock_spin_flags_ - mtx_unlock_spin_flags_ - mtx_assert_ - thread_lock_flags_ Spare notes: - Likely we can get rid of all the 'INVARIANTS' specification in the ppbus code by using the same macro as done in this patch (but this is left to the ppbus maintainer) - all the other locking interfaces may require a similar cleanup, where the most notable case is sx which will allow a further cleanup of vm_map locking facilities - The patch should be fully compatible with older branches, thus a MFC is previewed (in fact it uses all the underlying mechanisms already present). Comments review by: eadler, Ben Kaduk Discussed with: kib, jhb MFC after: 1 month	2011-11-20 16:33:09 +00:00
Hans Petter Selasky	9e3ae31c7a	Given that the typical usage of pause() is pause("zzz", hz / N), where N can be greater than hz in some cases, simply ignore a timeout value of zero. Suggested by: Bruce Evans MFC after: 1 week	2011-11-20 08:36:18 +00:00
Hans Petter Selasky	f1a1612fc2	Minor style change: Simplify the description of pause() and shorten the KASSERT message in pause. Also add a clamp for the timo argument in the non-KASSERT case. Suggested by: Bruce Evans MFC after: 1 week	2011-11-20 08:29:23 +00:00

12631 Commits