freebsd-dev

Author	SHA1	Message	Date
Dimitry Andric	ac382cb7f1	Change definition of pipe_chmod() from K&R to C99, to avoid the following clang warning: sys/kern/sys_pipe.c:1556:10: error: promoted type 'int' of K&R function parameter is not compatible with the parameter type 'mode_t' (aka 'unsigned short') declared in a previous prototype [-Werror] mode_t mode; ^ sys/kern/sys_pipe.c:155:19: note: previous declaration is here static fo_chmod_t pipe_chmod; ^	2012-02-28 21:45:21 +00:00
John Baldwin	0428cbe4df	Properly clear a device's devclass if DEVICE_ATTACH() fails if the device does not have a fixed devclass. Reviewed by: imp MFC after: 2 weeks	2012-02-28 19:16:02 +00:00
Konstantin Belousov	1d7ca9bb8e	Currently, the debugger attached to the process executing vfork() does not get syscall exit notification until the child performed exec of exit. Swap the order of doing ptracestop() and waiting for P_PPWAIT clearing, by postponing the wait into syscallret after ptracestop() notification is done. Reported, tested and reviewed by: Dmitry Mikulin <dmitrym juniper net> MFC after: 2 weeks	2012-02-27 21:10:10 +00:00
John Baldwin	ef7b427562	Clear the a device's description string anytime it's driver changes. Descriptions are specific to drivers and we don't change drivers on attached devices. This fixes a few places where we were not clearing the description when detaching a driver (e.g. with device_attach() failed). While here, fix a few other nits: - Remove spurious call to remove a device's driver from devclass_driver_deleted(). device_detach() removes it already. - Fix a typo.	2012-02-27 16:08:18 +00:00
David Xu	24c209494a	Follow changes made in revision 232144, pass absolute timeout to kernel, this eliminates a clock_gettime() syscall.	2012-02-27 13:38:52 +00:00
Alexander Motin	36acfc6507	Rework CPU load balancing in SCHED_ULE: - In sched_pickcpu() be more careful taking previous CPU on SMT systems. Do it only if all other logical CPUs of that physical one are idle to avoid extra resource sharing. - In sched_pickcpu() change general logic of CPU selection. First look for idle CPU, sharing last level cache with previously used one, skipping SMT CPU groups. If none found, search all CPUs for the least loaded one, where the thread with its priority can run now. If none found, search just for the least loaded CPU. - Make cpu_search() compare lowest/highest CPU load when comparing CPU groups with equal load. That allows to differentiate 1+1 and 2+0 loads. - Make cpu_search() to prefer specified (previous) CPU or group if load is equal. This improves cache affinity for more complicated topologies. - Randomize CPU selection if above factors are equal. Previous code tend to prefer CPUs with lower IDs, causing unneeded collisions. - Rework periodic balancer in sched_balance_group(). With cpu_search() more intelligent now, make balansing process flat, removing recursion over the topology tree. That fixes double swap problem and makes load distribution more even and predictable. All together this gives 10-15% performance improvement in many tests on CPUs with SMT, such as Core i7, for number of threads is less then number of logical CPUs. In some tests it also gives positive effect to systems without SMT. Reviewed by: jeff Tested by: flo, hackers@ MFC after: 1 month Sponsored by: iXsystems, Inc.	2012-02-27 10:31:54 +00:00
Poul-Henning Kamp	f9a61f7dcb	Also call the low-level driver if ->c_iflag & (IXON\|IXOFF\|IXANY) changes. Uftdi(4) examines (c_iflag & (IXON\|IXOFF)) to control hw XON-XOFF support. This is obviously no good, if changes to those bits are not communicated down the stack.	2012-02-26 20:56:49 +00:00
Alan Cox	e5c92401dd	Fix typo. MFC after: 1 week	2012-02-26 19:10:14 +00:00
Martin Matuska	e7af90ab00	Analogous to r232059, add a parameter for the ZFS file system: allow.mount.zfs: allow mounting the zfs filesystem inside a jail This way the permssions for mounting all current VFCF_JAIL filesystems inside a jail are controlled wia allow.mount.* jail parameters. Update sysctl descriptions. Update jail(8) and zfs(8) manpages. TODO: document the connection of allow.mount.* and VFCF_JAIL for kernel developers MFC after: 10 days	2012-02-26 16:30:39 +00:00
Jilles Tjoelker	581400dfed	Fix fchmod() and fchown() on fifos. The new fifo implementation in r232055 broke fchmod() and fchown() on fifos. Postfix needs this. Submitted by: gianni Reported by: dougb	2012-02-26 15:14:29 +00:00
Mikolaj Golub	6ce13747dc	Add sysctl to retrieve or set umask of another process. Submitted by: Dmitry Banschikov <me ubique spb ru> Discussed with: kib, rwatson Reviewed by: kib MFC after: 2 weeks	2012-02-26 14:25:48 +00:00
Konstantin Belousov	747d2fa178	Add SO_PROTOCOL/SO_PROTOTYPE socket SOL_SOCKET-level option to get the socket protocol number. This is useful since the socket type can be implemented by different protocols in the same protocol family, e.g. SOCK_STREAM may be provided by both TCP and SCTP. Submitted by: Jukka A. Ukkonen <jau iki fi> PR: kern/162352 Discussed with: bz Reviewed by: glebius MFC after: 2 weeks	2012-02-26 13:55:43 +00:00
Konstantin Belousov	9493639e35	Remove apparently redundand checks for socket so_proto being non-NULL from sosetopt() and sogetopt(). No exposed sockets may have so_proto invalid. Discussed with: bz, rwatson Reviewed by: glebius MFC after: 2 weeks	2012-02-26 13:51:05 +00:00
Maxim Konovalov	7dfdd83d56	o Reduce chances for integer overflow. o More verbose sysctl description added. MFC after: 2 weeks Sponsored by: Nginx, Inc.	2012-02-25 12:06:40 +00:00
Mikolaj Golub	662c901c54	When detaching an unix domain socket, uipc_detach() checks unp->unp_vnode pointer to detect if there is a vnode associated with (binded to) this socket and does necessary cleanup if there is. The issue is that after forced unmount this check may be too late as the unp_vnode is reclaimed and the reference is stale. To fix this provide a helper function that is called on a socket vnode reclamation to do necessary cleanup. Pointed by: kib Reviewed by: kib MFC after: 2 weeks	2012-02-25 10:15:41 +00:00
David Xu	df1f1bae9e	In revision 231989, we pass a 16-bit clock ID into kernel, however according to POSIX document, the clock ID may be dynamically allocated, it unlikely will be in 64K forever. To make it future compatible, we pack all timeout information into a new structure called _umtx_time, and use fourth argument as a size indication, a zero means it is old code using timespec as timeout value, but the new structure also includes flags and a clock ID, so the size argument is different than before, and it is non-zero. With this change, it is possible that a thread can sleep on any supported clock, though current kernel code does not have such a POSIX clock driver system.	2012-02-25 02:12:17 +00:00
Konstantin Belousov	dcdc6c361b	Restore the return statement erronously removed in the r232048. Submitted by: cognet Pointy hat to: kib (reuse the one I already got today) MFC after: 13 days	2012-02-24 11:02:35 +00:00
Martin Matuska	bf3db8aa65	To improve control over the use of mount(8) inside a jail(8), introduce a new jail parameter node with the following parameters: allow.mount.devfs: allow mounting the devfs filesystem inside a jail allow.mount.nullfs: allow mounting the nullfs filesystem inside a jail Both parameters are disabled by default (equals the behavior before devfs and nullfs in jails). Administrators have to explicitly allow mounting devfs and nullfs for each jail. The value "-1" of the devfs_ruleset parameter is removed in favor of the new allow setting. Reviewed by: jamie Suggested by: pjd MFC after: 2 weeks	2012-02-23 18:51:24 +00:00
Kip Macy	11ac7ec076	merge pipe and fifo implementations Also reviewed by: jhb, jilles (initial revision) Tested by: pho, jilles Submitted by: gianni Reviewed by: bde	2012-02-23 18:37:30 +00:00
Christian Brueffer	6bdc1841a9	Catch up with r195837 (2.5 years ago) which renamed net_add_domain() to domain_add(). PR: 165424 Submitted by: Lachlan Kang MFC after: 1 week	2012-02-23 17:47:19 +00:00
Konstantin Belousov	dcd432817e	Allow the parent to gather the exit status of the children reparented to the debugger. When reparenting for debugging, keep the child in the new orphan list of old parent. When looping over the children in kern_wait(), iterate over both children list and orphan list to search for the process by pid. Submitted by: Dmitry Mikulin <dmitrym juniper.net> MFC after: 2 weeks	2012-02-23 11:50:23 +00:00
David Xu	f911d9fa4d	Fix typo.	2012-02-22 07:34:23 +00:00
David Xu	b13a8fa78f	Use unused fourth argument of umtx_op to pass flags to kernel for operation UMTX_OP_WAIT. Upper 16bits is enough to hold a clock id, and lower 16bits is used to pass flags. The change saves a clock_gettime() syscall from libthr.	2012-02-22 03:22:49 +00:00
Mikolaj Golub	a95852edf3	unp_connect() may use a shared lock on the vnode to fetch the socket. Suggested by: jhb Reviewed by: jhb, kib, rwatson MFC after: 2 weeks	2012-02-21 19:40:13 +00:00
Konstantin Belousov	526d0bd547	Fix found places where uio_resid is truncated to int. Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from the usermode. Discussed with: bde, das (previous versions) MFC after: 1 month	2012-02-21 01:05:12 +00:00
Xin LI	fcdd3d322b	Revert r231923 for now. Further work is needed to make sure that the behavior is consistent.	2012-02-20 09:32:32 +00:00
Xin LI	5bfbb59851	Use uprintf instead of printf for the reason why a kernel module can not be loaded. This way, the administrator can get response immediately from the shell session rather than relying on dmesg. MFC after: 1 month	2012-02-20 01:05:17 +00:00
Alan Cox	7dc0ace10e	Close a race due to dropping of the map lock between creating a map entry for a shared mapping and marking the entry for inheritance. Reviewed by: kib X-MFC after: r231526	2012-02-19 00:28:49 +00:00
Konstantin Belousov	3494f31ad2	Fix misuse of the kernel map in miscellaneous image activators. Vnode-backed mappings cannot be put into the kernel map, since it is a system map. Use exec_map for transient mappings, and remove the mappings with kmem_free_wakeup() to notify the waiters on available map space. Do not map the whole executable into KVA at all to copy it out into usermode. Directly use vn_rdwr() for the case of not page aligned binary. There is one place left where the potentially unbounded amount of data is mapped into exec_map, namely, in the COFF image activator enumeration of the needed shared libraries. Reviewed by: alc MFC after: 2 weeks	2012-02-17 23:47:16 +00:00
Bjoern A. Zeeb	9dba179d5e	IFC @231845 Sponsored by: Cisco Systems, Inc.	2012-02-17 00:27:48 +00:00
Eitan Adler	f17a6f1b17	Add a timestamp to the msgbuf output in order to determine when when messages were printed. This can be enabled with the kern.msgbuf_show_timestamp sysctl PR: kern/161553 Reviewed by: avg Submitted by: Arnaud Lacombe <lacombar@gmail.com> Approved by: cperciva MFC after: 1 month	2012-02-16 05:11:35 +00:00
Konstantin Belousov	343b391f20	The PTRACESTOP() macro is used only once. Inline the only use and remove the macro. MFC after: 1 week	2012-02-11 14:49:25 +00:00
Ed Schouten	852b05c5b5	Remove unneeded newline. It fits in 80 columns now. Pointed out by: jh	2012-02-10 14:55:47 +00:00
Ed Schouten	8fac9b7b7d	Merge si_name and __si_namebuf. The si_name pointer always points to the __si_namebuf member inside the same object. Remove it and rename __si_namebuf to si_name.	2012-02-10 12:40:50 +00:00
Kevin Lo	de02885a7b	Add a missing break. This bug was introduced in r228856.	2012-02-10 06:30:52 +00:00
Konstantin Belousov	db3273398b	Mark the automatically attached child with PL_FLAG_CHILD in struct lwpinfo flags, for PT_FOLLOWFORK auto-attachment. In collaboration with: Dmitry Mikulin <dmitrym juniper net> MFC after: 1 week	2012-02-10 00:02:13 +00:00
Martin Matuska	0cc207a6f5	Add support for mounting devfs inside jails. A new jail(8) option "devfs_ruleset" defines the ruleset enforcement for mounting devfs inside jails. A value of -1 disables mounting devfs in jails, a value of zero means no restrictions. Nested jails can only have mounting devfs disabled or inherit parent's enforcement as jails are not allowed to view or manipulate devfs(8) rules. Utilizes new functions introduced in r231265. Reviewed by: jamie MFC after: 1 month	2012-02-09 10:22:08 +00:00
Konstantin Belousov	9cfb2326bc	Unbreak detection of the async mode for clustered writes after r231075. Submitted by: bde MFC after: 12 days	2012-02-08 15:07:19 +00:00
Pawel Jakub Dawidek	12075c0936	Allow to set kern.ipc.shmmax from /boot/loader.conf. MFC after: 1 week	2012-02-08 09:18:22 +00:00
Ed Schouten	cd864a19a5	Fix whitespace inconsistencies in TTY code.	2012-02-06 18:15:46 +00:00
John Baldwin	bf40d24a3f	Rename cache_lookup_times() to cache_lookup() and retire the old API and ABI stub for cache_lookup().	2012-02-06 17:00:28 +00:00
Konstantin Belousov	c480f781ea	Current implementations of sync(2) and syncer vnode fsync() VOP uses mnt_noasync counter to temporary remove MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity. Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC. Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons. Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks	2012-02-06 11:04:36 +00:00
Kevin Lo	0ca2381d9b	- Use uint8_t for the variable x and spell the size of the variable as sizeof(x) - Capitalized comment - Parentheses around return value Requested by: bde	2012-02-06 06:03:16 +00:00
Martin Matuska	a91d2201f9	Analogous to r230407 a separate path buffer in vfs_mount.c is required for r230129. Fixes a out of bounds write to fspath. MFC after: 10 days	2012-02-05 10:59:50 +00:00
David Xu	d56e058a79	Add 32-bit compat code for AIO kevent flags introduced in revision 230857.	2012-02-05 04:49:31 +00:00
Ryan Stone	312ac3a23a	Whenever a new kernel thread is spawned, explicitly clear any CPU affinity set on the new thread. This prevents the thread from inadvertently inheriting affinity from a random sibling. Submitted by: attilio Tested by: pho MFC after: 1 week	2012-02-04 16:49:29 +00:00
Hiroki Sato	cf8b832511	Fix input validation in SO_SETFIB. Reviewed by: bz MFC after: 1 day	2012-02-04 15:00:26 +00:00
Bjoern A. Zeeb	ee799639e8	Add SO_SETFIB option support on PF_INET6 sockets and allow inheriting the FIB number from the process, as set by setfib(2), on socket creation. Sponsored by: Cisco Systems, Inc.	2012-02-03 11:00:53 +00:00
Konstantin Belousov	6af519cf18	Add kqueue support to /dev/klog. Submitted by: Mateusz Guzik <mjguzik gmail com> PR: kern/156423 MFC after: 1 weeks	2012-02-01 14:34:52 +00:00
David Xu	fde809356a	If multiple threads call kevent() to get AIO events on same kqueue fd, it is possible that a single AIO event will be reported to multiple threads, it is not threading friendly, and the existing API can not control this behavior. Allocate a kevent flags field sigev_notify_kevent_flags for AIO event notification in sigevent, and allow user to pass EV_CLEAR, EV_DISPATCH or EV_ONESHOT to AIO kernel code, user can control whether the event should be cleared once it is retrieved by a thread. This change should be comptaible with existing application, because the field should have already been zero-filled, and no additional action will be taken by kernel. PR: kern/156567	2012-02-01 02:53:06 +00:00
Konstantin Belousov	6ad1ff09cc	A debugger which requested PT_FOLLOW_FORK should get the notification about new child not only when doing PT_TO_SCX, but also for PT_CONTINUE. If TDB_FORK flag is set, always issue a stop, the same as is done for TDB_EXEC. Reported by: Dmitry Mikulin <dmitrym juniper net> MFC after: 1 week	2012-01-30 20:00:29 +00:00
John Baldwin	2bd3e4c2c2	Refine the implementation of POSIX_FADV_NOREUSE for the read(2) case such that instead of using direct I/O it allows read-ahead similar to POSIX_FADV_NORMAL, but invokes VOP_ADVISE(POSIX_FADV_DONTNEED) after the read(2) has completed to purge just-read data. The write(2) path continues to use direct I/O for POSIX_FADV_NOREUSE for now. Note that NOREUSE works optimally if an application reads and writes full fs blocks.	2012-01-30 19:35:15 +00:00
Doug Ambrisko	8e9fc27818	When detaching an AIO or LIO requests grab the lock and tell knlist_remove that we have the lock now. This cleans up a locking panic ASSERT when knlist_empty is called without a lock when INVARIANTS etc. are turned. Reviewed by: kib jhb MFC after: 1 week	2012-01-30 19:19:22 +00:00
Konstantin Belousov	62c625fdd2	Finally, try to enable the nxstacks on amd64 and powerpc64 for both 64bit and 32bit ABIs. Also try to enable nxstacks for PAE/i386 when supported, and some variants of powerpc32. MFC after: 2 months (if ever)	2012-01-30 07:56:00 +00:00
Attilio Rao	5d7380f8e3	Avoid to check the same cache line/variable from all the locking primitives by breaking stop_scheduler into a per-thread variable. Also, store the new td_stopsched very close to td_*locks members as they will be accessed mostly in the same codepaths as td_stopsched and this results in avoiding a further cache-line pollution, possibly. STOP_SCHEDULER() was pondered to use a new 'thread' argument, in order to take advantage of already cached curthread, but in the end there should not really be a performance benefit, while introducing a KPI breakage. In collabouration with: flo Reviewed by: avg MFC after: 3 months (or never) X-MFC: r228424	2012-01-28 14:00:21 +00:00
Gleb Smirnoff	94fce84763	Fix size check, that prevents getting negative after casting to a signed type Reviewed by: bde	2012-01-27 08:58:58 +00:00
Kenneth D. Merry	7e949c467c	Xen netback driver rewrite. share/man/man4/Makefile, share/man/man4/xnb.4, sys/dev/xen/netback/netback.c, sys/dev/xen/netback/netback_unit_tests.c: Rewrote the netback driver for xen to attach properly via newbus and work properly in both HVM and PVM mode (only HVM is tested). Works with the in-tree FreeBSD netfront driver or the Windows netfront driver from SuSE. Has not been extensively tested with a Linux netfront driver. Does not implement LRO, TSO, or polling. Includes unit tests that may be run through sysctl after compiling with XNB_DEBUG defined. sys/dev/xen/blkback/blkback.c, sys/xen/interface/io/netif.h: Comment elaboration. sys/kern/uipc_mbuf.c: Fix page fault in kernel mode when calling m_print() on a null mbuf. Since m_print() is only used for debugging, there are no performance concerns for extra error checking code. sys/kern/subr_scanf.c: Add the "hh" and "ll" width specifiers from C99 to scanf(). A few callers were already using "ll" even though scanf() was handling it as "l". Submitted by: Alan Somers <alans@spectralogic.com> Submitted by: John Suykerbuyk <johns@spectralogic.com> Sponsored by: Spectra Logic MFC after: 1 week Reviewed by: ken	2012-01-26 16:35:09 +00:00
Gleb Smirnoff	434ea137cc	Although aio_nbytes is size_t, later is is signed to casted types: to ssize_t in filesystem code and to int in buf code, thus supplying a negative argument leads to kernel panic later. To fix that check user supplied argument in the beginning of syscall. Submitted by: Maxim Dounin <mdounin mdounin.ru>, maxim@	2012-01-26 11:59:48 +00:00
Konstantin Belousov	abc942b56c	When doing vflush(WRITECLOSE), clean vnode pages. Unmounts do vfs_msync() before calling VFS_UNMOUNT(), but there is still a race allowing a process to dirty pages after msync finished. Remounts rw->ro just left dirty pages in system. Reviewed by: alc, tegge (long time ago) Tested by: pho MFC after: 2 weeks	2012-01-25 20:54:09 +00:00
Konstantin Belousov	d5210589b7	Fix remaining calls to cache_enter() in both NFS clients to provide appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for. Reviewed by: jhb (previous version) MFC after: 2 weeks	2012-01-25 20:48:20 +00:00
Mikolaj Golub	45efc9b4aa	Fix CTL flags in the declarations of KERN_PROC_ENV, AUXV and PS_STRINGS sysctls: they are read only. MFC after: 1 week	2012-01-25 20:15:58 +00:00
Konstantin Belousov	7a7e609a32	Apparently, both nfs clients do not use cache_enter_time() consistently, creating some namecache entries without NCF_TS flag. This causes panic due to failed assertion. As a temporal relief, remove the assert. Return epoch timestamp for the entries without timestamp if asked. While there, consolidate the code which returns timestamps, into a helper cache_out_ts(). Discussed with: jhb MFC after: 2 weeks	2012-01-23 17:09:23 +00:00
Gleb Smirnoff	93a1b4c4cf	Convert panic()s to KASSERT()s. This is an optimisation for hashdestroy() since in absence of INVARIANTS a compiler will drop the entire for() cycle.	2012-01-23 16:31:46 +00:00
Mikolaj Golub	8854fe3915	Change kern.proc.rlimit sysctl to: - retrive only one, specified limit for a process, not the whole array, as it was previously (the sysctl has been added recently and has not been backported to stable yet, so this change is ok); - allow to set a resource limit for another process. Submitted by: Andrey Zonov <andrey at zonov.org> Discussed with: kib Reviewed by: kib MFC after: 2 weeks	2012-01-22 20:25:00 +00:00
Pawel Jakub Dawidek	9b9a01792d	TDF_* flags should be used with td_flags field and TDP_* flags should be used with td_pflags field. Correct two places where it was not the case. Discussed with: kib MFC after: 1 week	2012-01-22 11:01:36 +00:00
Konstantin Belousov	c2b396f294	Remove the nc_time and nc_ticks elements from struct namecache, and provide struct namecache_ts which is the old struct namecache. Only allocate struct namecache_ts if non-null struct timespec *tsp was passed to cache_enter_time, otherwise use struct namecache. Change struct namecache allocation and deallocation macros into static functions, since logic becomes somewhat twisty. Provide accessor for the nc_name member of struct namecache to hide difference between struct namecache and namecache_ts. The aim of the change is to not waste 20 bytes per small namecache entry. Reviewed by: jhb MFC after: 2 weeks X-MFC-note: after r230394	2012-01-22 01:11:06 +00:00
Martin Matuska	6dfe0a3dc2	Use separate buffer for global path to avoid overflow of path buffer. Reviewed by: jamie@ MFC after: 3 weeks	2012-01-21 00:06:21 +00:00
John Baldwin	5aefb4cbbf	Close a race in NFS lookup processing that could result in stale name cache entries on one client when a directory was renamed on another client. The root cause for the stale entry being trusted is that each per-vnode nfsnode structure has a single 'n_ctime' timestamp used to validate positive name cache entries. However, if there are multiple entries for a single vnode, they all share a single timestamp. To fix this, extend the name cache to allow filesystems to optionally store a timestamp value in each name cache entry. The NFS clients now fetch the timestamp associated with each name cache entry and use that to validate cache hits instead of the timestamps previously stored in the nfsnode. Another part of the fix is that the NFS clients now use timestamps from the post-op attributes of RPCs when adding name cache entries rather than pulling the timestamps out of the file's attribute cache. The latter is subject to races with other lookups updating the attribute cache concurrently. Some more details: - Add a variant of nfsm_postop_attr() to the old NFS client that can return a vattr structure with a copy of the post-op attributes. - Handle lookups of "." as a special case in the NFS clients since the name cache does not store name cache entries for ".", so we cannot get a useful timestamp. It didn't really make much sense to recheck the attributes on the the directory to validate the namecache hit for "." anyway. - ABI compat shims for the name cache routines are present in this commit so that it is safe to MFC. MFC after: 2 weeks	2012-01-20 20:02:01 +00:00
Konstantin Belousov	2974cc36f7	Use shared lock for the executable vnode in the exec path after the VV_TEXT changes are handled. Assert that vnode is exclusively locked at the places that modify VV_TEXT. Discussed with: alc MFC after: 3 weeks	2012-01-19 23:03:31 +00:00
Alan Cox	1dfab8025e	Explain why it is safe to unlock the vnode. Requested by: kib	2012-01-17 16:20:50 +00:00
Kirk McKusick	cc672d3599	Make sure all intermediate variables holding mount flags (mnt_flag) and that all internal kernel calls passing mount flags are declared as uint64_t so that flags in the top 32-bits are not lost. MFC after: 2 weeks	2012-01-17 01:08:01 +00:00
Alan Cox	292177e67a	Improve abstraction. Eliminate direct access by elf_load_section() to an OBJT_VNODE-specific field of the vm object. The same information can be just as easily obtained from the struct vattr that is in struct image_params if the latter is passed to elf_load_section(). Moreover, by replacing the vmspace and vm object parameters to elf*_load_section() with a struct image_params parameter, we actually reduce the size of the object code. In collaboration with: kib	2012-01-17 00:27:32 +00:00
Sergey Kandaurov	037f43d3ef	Be pedantic and change // comment to C-style one. Noticed by: Bruce Evans	2012-01-16 20:42:56 +00:00
Kevin Lo	575cabed9e	Fix a style bug Spotted by: avg	2012-01-16 14:54:48 +00:00
David Xu	29a06690ca	Eliminate branch and insert an explicit reader memory barrier to ensure that waiter bit is set before reading semaphore count.	2012-01-16 04:39:10 +00:00
Mikolaj Golub	fe7f89b71a	Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want to read strings completely to know the actual size. As a side effect it fixes the issue with kern.proc.args and kern.proc.env sysctls, which didn't return the size of available data when calling sysctl(3) with the NULL argument for oldp. Note, in get_ps_strings(), which does actual work for proc_getargv() and proc_getenvv(), we still have a safety limit on the size of data read in case of a corrupted procces stack. Suggested by: kib MFC after: 3 days	2012-01-15 18:47:24 +00:00
Martin Matuska	9cbe30e1d5	Fix missing in r230129: kern_jail.c: initialize fullpath_disabled to zero vfs_cache.c: add missing dot in comment Reported by: kib MFC after: 1 month	2012-01-15 18:08:15 +00:00
Ulrich Spörlein	9a14aa017b	Convert files to UTF-8	2012-01-15 13:23:18 +00:00
Martin Matuska	f6e633a9e1	Introduce vn_path_to_global_path() This function updates path string to vnode's full global path and checks the size of the new path string against the pathlen argument. In vfs_domount(), sys_unmount() and kern_jail_set() this new function is used to update the supplied path argument to the respective global path. Unbreaks jailed zfs(8) with enforce_statfs set to 1. Reviewed by: kib MFC after: 1 month	2012-01-15 12:08:20 +00:00
Eitan Adler	886e862866	- Fix undefined behavior when device_get_name is null - Make error message more informative PR: kern/149800 Submitted by: olgeni Approved by: cperciva MFC after: 1 week	2012-01-15 07:09:18 +00:00
Oleksandr Tymoshenko	4104e83567	Fix kernel modules loading for MIPS64 kernel: On amd64, link_elf_obj.c must specify KERNBASE rather than VM_MIN_KERNEL_ADDRESS to vm_map_find() because kernel loadable modules must be mapped for execution in the same upper region of the kernel map as the kernel code and data segments. For MIPS32 KERNBASE lies below KVA area (it's less than VM_MIN_KERNEL_ADDRESS) so basically vm_map_find got whole KVA to look through. On MIPS64 it's not the case because KERNBASE is set to the very end of XKSEG, well out of KVA bounds, so vm_map_find always fails. We should use VM_MIN_KERNEL_ADDRESS as a base for vm_map_find. Details obtained from: alc@	2012-01-14 00:36:07 +00:00
John Baldwin	fbcebf7f71	Convert the per-interface address list lock from a mutex to a reader/writer lock. Reviewed by: bz	2012-01-09 19:34:12 +00:00
Andriy Gapon	90d8265326	enable stop_scheduler_on_panic by default My plan is to make this behavior unconditional before 10.0 release. X-MFC after: r228424 (if ever)	2012-01-09 12:06:09 +00:00
Konstantin Belousov	3ab0160340	Avoid LOR between vfs_busy() lock and covered vnode lock on quotaon(). The vfs_busy() is after covered vnode lock in the global lock order, but since quotaon() does recursive VFS call to open quota file, we usually end up locking covered vnode after mp is busied in sys_quotactl(). Change the interface of VFS_QUOTACTL(), requiring that mp was unbusied by fs code, and do not try to pick up vfs_busy() reference in ufs quotaon, esp. if vfs_busy cannot succeed due to unmount being performed. Reported and tested by: pho MFC after: 1 week	2012-01-08 23:06:53 +00:00
Alan Cox	2971897d51	Correct an error of omission in the implementation of the truncation operation on POSIX shared memory objects and tmpfs. Previously, neither of these modules correctly handled the case in which the new size of the object or file was not a multiple of the page size. Specifically, they did not handle partial page truncation of data stored on swap. As a result, stale data might later be returned to an application. Interestingly, a data inconsistency was less likely to occur under tmpfs than POSIX shared memory objects. The reason being that a different mistake by the tmpfs truncation operation helped avoid a data inconsistency. If the data was still resident in memory in a PG_CACHED page, then the tmpfs truncation operation would reactivate that page, zero the truncated portion, and leave the page pinned in memory. More precisely, the benevolent error was that the truncation operation didn't add the reactivated page to any of the paging queues, effectively pinning the page. This page would remain pinned until the file was destroyed or the page was read or written. With this change, the page is now added to the inactive queue. Discussed with: jhb Reviewed by: kib (an earlier version) MFC after: 3 weeks	2012-01-08 20:09:26 +00:00
Hiroki Sato	ca54e1aee3	Fix a typo. (s/nessesary/necessary/)	2012-01-08 18:48:36 +00:00
John Baldwin	71eeeaf256	Add 5 spare VOPs as placeholders to avoid breaking the KBI in the future when new VOPs are MFC'd to a branch. Reviewed by: kib, bz MFC after: 3 days	2012-01-06 20:06:45 +00:00
John Baldwin	908cac07ce	Use proper argument structure types for the extattr post-VOP hooks. The wrong structure happened to work since the only argument used was the vnode which is in the same place in both VOP_SETATTR() and the two extattr VOPs. MFC after: 3 days	2012-01-06 20:05:48 +00:00
John Baldwin	948c460971	Fix a logic bug in change 228207 in the check for a thread's new user priority being a realtime priority. MFC after: 3 days	2012-01-05 19:02:52 +00:00
John Baldwin	137f91e80f	Convert all users of IF_ADDR_LOCK to use new locking macros that specify either a read lock or write lock. Reviewed by: bz MFC after: 2 weeks	2012-01-05 19:00:36 +00:00
John Baldwin	7e3a96ea37	Some small fixes to CPU accounting for threads: - Only initialize the per-cpu switchticks and switchtime in sched_throw() for the very first context switch on APs during boot. This avoids a small gap between the middle of thread_exit() and sched_throw() where time is not accounted to any thread. - In thread_exit(), update the timestamp bookkeeping to track the changes to mi_switch() introduced by td_rux so that the code once again matches the comment claiming it is mimicing mi_switch(). Specifically, only update the per-thread stats directly and depend on ruxagg() to update p_rux rather than adjusting p_rux directly. While here, move the timestamp bookkeeping as late in the function as possible. Reviewed by: bde, kib MFC after: 1 week	2012-01-03 21:03:28 +00:00
Ed Schouten	dc15eac046	Use strchr() and strrchr(). It seems strchr() and strrchr() are used more often than index() and rindex(). Therefore, simply migrate all kernel code to use it. For the XFS code, remove an empty line to make the code identical to the code in the Linux kernel.	2012-01-02 12:12:10 +00:00
Konstantin Belousov	cdb7a43117	Avoid double-unlock or double unreference for ndp->ni_dvp when the vnode dp lock upgrade right after the 'success' label fails. In collaboration with: pho MFC after: 1 week	2012-01-01 18:45:59 +00:00
John Baldwin	0c0d27d5dd	Cap the priority calculated from the current thread's running tick count at SCHED_PRI_RANGE to prevent overflows in the priority value. This can happen due to irregularities with clock interrupts under certain virtualization environments. Tested by: Larry Rosenman ler lerctr org MFC after: 2 weeks	2011-12-29 16:17:16 +00:00
Lawrence Stewart	6cedd609b7	Introduce the sysclock_getsnapshot() and sysclock_snap2bintime() KPIs. The sysclock_getsnapshot() function allows the caller to obtain a snapshot of all the system clock and timecounter state required to create time stamps at a later point. The sysclock_snap2bintime() function converts a previously obtained snapshot into a bintime time stamp according to the specified flags e.g. which system clock, uptime vs absolute time, etc. These KPIs enable useful functionality, including direct comparison of the feedback and feed-forward system clocks and generation of multiple time stamps with different formats from a single timecounter read. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ In collaboration with: Julien Ridoux (jridoux at unimelb edu au)	2011-12-24 01:32:01 +00:00
John Baldwin	f0d6c5caf0	Add post-VOP hooks for VOP_DELETEEXTATTR() and VOP_SETEXTATTR() and use these to trigger a NOTE_ATTRIB EVFILT_VNODE kevent when the extended attributes of a vnode are changed. Note that OS X already implements this behavior. Reviewed by: rwatson MFC after: 2 weeks	2011-12-23 20:11:37 +00:00
John Baldwin	268e76d86e	Use TASK_INITIALIZER() for dev_dtr_task rather than a dedicated SYSINIT().	2011-12-22 16:01:10 +00:00
Andriy Gapon	167057914b	ule: ensure that batch timeshare threads are scheduled fairly With the previous code, if the range of priorities for timeshare batch threads was greater than RQ_NQS, then the threads with low priorities in the part of the range above RQ_NQS would be scheduled to the run-queues as if they had high priorities at the beginning of the range. In other words, threads with a nice level of +N could be scheduled as if they had a nice level of -M. Reported by: George Mitchell <george@m5p.com> Reviewed by: jhb Tested by: George Mitchell <george@m5p.com> (earlier version) MFC after: 1 week	2011-12-19 20:01:21 +00:00
Mikolaj Golub	547b155eb1	Fix style and white spaces. MFC after: 1 week	2011-12-17 22:18:26 +00:00
Mikolaj Golub	fa3935bcea	On start most of sysctl_kern_proc functions use the same pattern: locate a process calling pfind() and do some additional checks like p_candebug(). To reduce this code duplication a new function pget() is introduced and used. As the function may be useful not only in kern_proc.c it is in the kernel name space. Suggested by: kib Reviewed by: kib MFC after: 2 weeks	2011-12-17 16:59:22 +00:00

1 2 3 4 5 ...

12589 Commits