freebsd-nq

Author	SHA1	Message	Date
Kevin Lo	575cabed9e	Fix a style bug Spotted by: avg	2012-01-16 14:54:48 +00:00
David Xu	29a06690ca	Eliminate branch and insert an explicit reader memory barrier to ensure that waiter bit is set before reading semaphore count.	2012-01-16 04:39:10 +00:00
Mikolaj Golub	fe7f89b71a	Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want to read strings completely to know the actual size. As a side effect it fixes the issue with kern.proc.args and kern.proc.env sysctls, which didn't return the size of available data when calling sysctl(3) with the NULL argument for oldp. Note, in get_ps_strings(), which does actual work for proc_getargv() and proc_getenvv(), we still have a safety limit on the size of data read in case of a corrupted procces stack. Suggested by: kib MFC after: 3 days	2012-01-15 18:47:24 +00:00
Martin Matuska	9cbe30e1d5	Fix missing in r230129: kern_jail.c: initialize fullpath_disabled to zero vfs_cache.c: add missing dot in comment Reported by: kib MFC after: 1 month	2012-01-15 18:08:15 +00:00
Ulrich Spörlein	9a14aa017b	Convert files to UTF-8	2012-01-15 13:23:18 +00:00
Martin Matuska	f6e633a9e1	Introduce vn_path_to_global_path() This function updates path string to vnode's full global path and checks the size of the new path string against the pathlen argument. In vfs_domount(), sys_unmount() and kern_jail_set() this new function is used to update the supplied path argument to the respective global path. Unbreaks jailed zfs(8) with enforce_statfs set to 1. Reviewed by: kib MFC after: 1 month	2012-01-15 12:08:20 +00:00
Eitan Adler	886e862866	- Fix undefined behavior when device_get_name is null - Make error message more informative PR: kern/149800 Submitted by: olgeni Approved by: cperciva MFC after: 1 week	2012-01-15 07:09:18 +00:00
Oleksandr Tymoshenko	4104e83567	Fix kernel modules loading for MIPS64 kernel: On amd64, link_elf_obj.c must specify KERNBASE rather than VM_MIN_KERNEL_ADDRESS to vm_map_find() because kernel loadable modules must be mapped for execution in the same upper region of the kernel map as the kernel code and data segments. For MIPS32 KERNBASE lies below KVA area (it's less than VM_MIN_KERNEL_ADDRESS) so basically vm_map_find got whole KVA to look through. On MIPS64 it's not the case because KERNBASE is set to the very end of XKSEG, well out of KVA bounds, so vm_map_find always fails. We should use VM_MIN_KERNEL_ADDRESS as a base for vm_map_find. Details obtained from: alc@	2012-01-14 00:36:07 +00:00
John Baldwin	fbcebf7f71	Convert the per-interface address list lock from a mutex to a reader/writer lock. Reviewed by: bz	2012-01-09 19:34:12 +00:00
Andriy Gapon	90d8265326	enable stop_scheduler_on_panic by default My plan is to make this behavior unconditional before 10.0 release. X-MFC after: r228424 (if ever)	2012-01-09 12:06:09 +00:00
Konstantin Belousov	3ab0160340	Avoid LOR between vfs_busy() lock and covered vnode lock on quotaon(). The vfs_busy() is after covered vnode lock in the global lock order, but since quotaon() does recursive VFS call to open quota file, we usually end up locking covered vnode after mp is busied in sys_quotactl(). Change the interface of VFS_QUOTACTL(), requiring that mp was unbusied by fs code, and do not try to pick up vfs_busy() reference in ufs quotaon, esp. if vfs_busy cannot succeed due to unmount being performed. Reported and tested by: pho MFC after: 1 week	2012-01-08 23:06:53 +00:00
Alan Cox	2971897d51	Correct an error of omission in the implementation of the truncation operation on POSIX shared memory objects and tmpfs. Previously, neither of these modules correctly handled the case in which the new size of the object or file was not a multiple of the page size. Specifically, they did not handle partial page truncation of data stored on swap. As a result, stale data might later be returned to an application. Interestingly, a data inconsistency was less likely to occur under tmpfs than POSIX shared memory objects. The reason being that a different mistake by the tmpfs truncation operation helped avoid a data inconsistency. If the data was still resident in memory in a PG_CACHED page, then the tmpfs truncation operation would reactivate that page, zero the truncated portion, and leave the page pinned in memory. More precisely, the benevolent error was that the truncation operation didn't add the reactivated page to any of the paging queues, effectively pinning the page. This page would remain pinned until the file was destroyed or the page was read or written. With this change, the page is now added to the inactive queue. Discussed with: jhb Reviewed by: kib (an earlier version) MFC after: 3 weeks	2012-01-08 20:09:26 +00:00
Hiroki Sato	ca54e1aee3	Fix a typo. (s/nessesary/necessary/)	2012-01-08 18:48:36 +00:00
John Baldwin	71eeeaf256	Add 5 spare VOPs as placeholders to avoid breaking the KBI in the future when new VOPs are MFC'd to a branch. Reviewed by: kib, bz MFC after: 3 days	2012-01-06 20:06:45 +00:00
John Baldwin	908cac07ce	Use proper argument structure types for the extattr post-VOP hooks. The wrong structure happened to work since the only argument used was the vnode which is in the same place in both VOP_SETATTR() and the two extattr VOPs. MFC after: 3 days	2012-01-06 20:05:48 +00:00
John Baldwin	948c460971	Fix a logic bug in change 228207 in the check for a thread's new user priority being a realtime priority. MFC after: 3 days	2012-01-05 19:02:52 +00:00
John Baldwin	137f91e80f	Convert all users of IF_ADDR_LOCK to use new locking macros that specify either a read lock or write lock. Reviewed by: bz MFC after: 2 weeks	2012-01-05 19:00:36 +00:00
John Baldwin	7e3a96ea37	Some small fixes to CPU accounting for threads: - Only initialize the per-cpu switchticks and switchtime in sched_throw() for the very first context switch on APs during boot. This avoids a small gap between the middle of thread_exit() and sched_throw() where time is not accounted to any thread. - In thread_exit(), update the timestamp bookkeeping to track the changes to mi_switch() introduced by td_rux so that the code once again matches the comment claiming it is mimicing mi_switch(). Specifically, only update the per-thread stats directly and depend on ruxagg() to update p_rux rather than adjusting p_rux directly. While here, move the timestamp bookkeeping as late in the function as possible. Reviewed by: bde, kib MFC after: 1 week	2012-01-03 21:03:28 +00:00
Ed Schouten	dc15eac046	Use strchr() and strrchr(). It seems strchr() and strrchr() are used more often than index() and rindex(). Therefore, simply migrate all kernel code to use it. For the XFS code, remove an empty line to make the code identical to the code in the Linux kernel.	2012-01-02 12:12:10 +00:00
Konstantin Belousov	cdb7a43117	Avoid double-unlock or double unreference for ndp->ni_dvp when the vnode dp lock upgrade right after the 'success' label fails. In collaboration with: pho MFC after: 1 week	2012-01-01 18:45:59 +00:00
John Baldwin	0c0d27d5dd	Cap the priority calculated from the current thread's running tick count at SCHED_PRI_RANGE to prevent overflows in the priority value. This can happen due to irregularities with clock interrupts under certain virtualization environments. Tested by: Larry Rosenman ler lerctr org MFC after: 2 weeks	2011-12-29 16:17:16 +00:00
Lawrence Stewart	6cedd609b7	Introduce the sysclock_getsnapshot() and sysclock_snap2bintime() KPIs. The sysclock_getsnapshot() function allows the caller to obtain a snapshot of all the system clock and timecounter state required to create time stamps at a later point. The sysclock_snap2bintime() function converts a previously obtained snapshot into a bintime time stamp according to the specified flags e.g. which system clock, uptime vs absolute time, etc. These KPIs enable useful functionality, including direct comparison of the feedback and feed-forward system clocks and generation of multiple time stamps with different formats from a single timecounter read. Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project. For more information, see http://www.synclab.org/radclock/ In collaboration with: Julien Ridoux (jridoux at unimelb edu au)	2011-12-24 01:32:01 +00:00
John Baldwin	f0d6c5caf0	Add post-VOP hooks for VOP_DELETEEXTATTR() and VOP_SETEXTATTR() and use these to trigger a NOTE_ATTRIB EVFILT_VNODE kevent when the extended attributes of a vnode are changed. Note that OS X already implements this behavior. Reviewed by: rwatson MFC after: 2 weeks	2011-12-23 20:11:37 +00:00
John Baldwin	268e76d86e	Use TASK_INITIALIZER() for dev_dtr_task rather than a dedicated SYSINIT().	2011-12-22 16:01:10 +00:00
Andriy Gapon	167057914b	ule: ensure that batch timeshare threads are scheduled fairly With the previous code, if the range of priorities for timeshare batch threads was greater than RQ_NQS, then the threads with low priorities in the part of the range above RQ_NQS would be scheduled to the run-queues as if they had high priorities at the beginning of the range. In other words, threads with a nice level of +N could be scheduled as if they had a nice level of -M. Reported by: George Mitchell <george@m5p.com> Reviewed by: jhb Tested by: George Mitchell <george@m5p.com> (earlier version) MFC after: 1 week	2011-12-19 20:01:21 +00:00
Mikolaj Golub	547b155eb1	Fix style and white spaces. MFC after: 1 week	2011-12-17 22:18:26 +00:00
Mikolaj Golub	fa3935bcea	On start most of sysctl_kern_proc functions use the same pattern: locate a process calling pfind() and do some additional checks like p_candebug(). To reduce this code duplication a new function pget() is introduced and used. As the function may be useful not only in kern_proc.c it is in the kernel name space. Suggested by: kib Reviewed by: kib MFC after: 2 weeks	2011-12-17 16:59:22 +00:00
Andriy Gapon	f389bc9585	belatedly transfer copyrights from libkern/gets.c to kern_cons.c MFC after: 2 months MFC with: r228642	2011-12-17 15:50:45 +00:00
Andriy Gapon	f6ce353e58	replace uses of libkern gets with cngets MFC after: 2 months	2011-12-17 15:26:34 +00:00
Andriy Gapon	8e62854265	introduce cngets, a method for kernel to read a string from console This is intended as a replacement for libkern's gets and mostly borrows its implementation. It uses cngrab/cnungrab to delimit kernel's access to console input. Note: libkern's gets obviously doesn't share any bits of implementation iwth libc's gets. They also have different APIs and the former doesn't have the overflow problems of the latter. Inspired by: bde MFC after: 2 months	2011-12-17 15:16:54 +00:00
Andriy Gapon	bf8696b408	introduce cngrab/cnungrab stub calls in some places where they make sense MFC after: 2 months	2011-12-17 15:11:22 +00:00
Andriy Gapon	9976156f12	kern cons: introduce infrastructure for console grabbing by kernel At the moment grab and ungrab methods of all console drivers are no-ops. Current intended meaning of the calls is that the kernel takes control of console input. In the future the semantics may be extended to mean that the calling thread takes full ownership of the console (e.g. console output from other threads could be suspended). Inspired by: bde MFC after: 2 months	2011-12-17 15:08:43 +00:00
John Baldwin	f427c78b19	Fire a kevent if necessary after seeking on a regular file. This fixes a case where a kevent would not fire on a regular file if an application read to EOF and then seeked backwards into the file. Reviewed by: kib MFC after: 2 weeks	2011-12-16 20:10:00 +00:00
John Baldwin	338e7cf235	Use vm_mmap_to_errno(). Submitted by: kib	2011-12-15 15:17:19 +00:00
Jilles Tjoelker	6d1c58f8a2	Fix select/poll/kqueue for write on reverse direction before first write. The reverse direction of a pipe is lazily allocated on the first write in that direction (because pipes are usually used in one direction only). A special case is needed to ensure the pipe appears writable before the first write because there are 0 bytes of pending data in 0 bytes of buffer space at that point, leaving 0 bytes of data that can be written with the normal code. Note that the first write returns [ENOMEM] if kern.ipc.maxpipekva is exceeded and does not block or return [EAGAIN], so selecting true for write is correct even in that case. PR: kern/93685 Submitted by: gianni MFC after: 2 weeks	2011-12-14 22:26:39 +00:00
John Baldwin	fb680e16f4	Add a helper API to allow in-kernel code to map portions of shared memory objects created by shm_open(2) into the kernel's address space. This provides a convenient way for creating shared memory buffers between userland and the kernel without requiring custom character devices.	2011-12-14 22:22:19 +00:00
David E. O'Brien	1c5151f3f8	Match other formatting.	2011-12-14 02:31:32 +00:00
David E. O'Brien	3d7618d8bf	Disallow various debug.kdb sysctl's when securelevel is raised. PR: 161350	2011-12-13 17:59:16 +00:00
Eitan Adler	9910b854c6	- Add a sysctl to allow non-root users the ability to set idle priorities. - While here fix up some style nits. Discussed with: cperciva (breifly) Reviewed by: pjd (earlier version) Reviewed by: bde Approved by: jhb MFC after: 1 month	2011-12-13 14:00:27 +00:00
Eitan Adler	3eb9ab5255	Document a large number of currently undocumented sysctls. While here fix some style(9) issues and reduce redundancy. PR: kern/155491 PR: kern/155490 PR: kern/155489 Submitted by: Galimov Albert <wtfcrap@mail.ru> Approved by: bde Reviewed by: jhb MFC after: 1 week	2011-12-13 00:38:50 +00:00
Andriy Gapon	7a7ce668ef	put sys/systm.h at its proper place or add it if missing Reported by: lstewart, tinderbox Pointyhat to: avg, attilio MFC after: 1 week MFC with: r228430	2011-12-12 10:05:13 +00:00
Andriy Gapon	0e225211a0	kern_racct: move sys/systm.h inclusion to its proper place This should fix the build failure introduced with r228424. Also remove duplicate inclusion of sys/param.h. Pointyhat to: avg MFC after: 1 week	2011-12-12 07:46:10 +00:00
Andriy Gapon	353705930f	panic: add a switch and infrastructure for stopping other CPUs in SMP case Historical behavior of letting other CPUs merily go on is a default for time being. The new behavior can be switched on via kern.stop_scheduler_on_panic tunable and sysctl. Stopping of the CPUs has (at least) the following benefits: - more of the system state at panic time is preserved intact - threads and interrupts do not interfere with dumping of the system state Only one thread runs uninterrupted after panic if stop_scheduler_on_panic is set. That thread might call code that is also used in normal context and that code might use locks to prevent concurrent execution of certain parts. Those locks might be held by the stopped threads and would never be released. To work around this issue, it was decided that instead of explicit checks for panic context, we would rather put those checks inside the locking primitives. This change has substantial portions written and re-written by attilio and kib at various times. Other changes are heavily based on the ideas and patches submitted by jhb and mdf. bde has provided many insights into the details and history of the current code. The new behavior may cause problems for systems that use a USB keyboard for interfacing with system console. This is because of some unusual locking patterns in the ukbd code which have to be used because on one hand ukbd is below syscons, but on the other hand it has to interface with other usb code that uses regular mutexes/Giant for its concurrency protection. Dumping to USB-connected disks may also be affected. PR: amd64/139614 (at least) In cooperation with: attilio, jhb, kib, mdf Discussed with: arch@, bde Tested by: Eugene Grosbein <eugen@grosbein.net>, gnn, Steven Hartland <killing@multiplay.co.uk>, glebius, Andrew Boyer <aboyer@averesystems.com> (various versions of the patch) MFC after: 3 months (or never)	2011-12-11 21:02:01 +00:00
Peter Holm	cdea31e305	Move cpu_set_upcall(newtd, td) up before the first call of thread_free(newtd). This to avoid a possible page fault in cpu_thread_clean() as seen on amd64 with syscall fuzzing. Reviewed by: kib MFC after: 1 week	2011-12-09 17:19:41 +00:00
Eitan Adler	5a01b72672	- Fix ktrace leakage if error is set PR: kern/163098 Submitted by: Loganaden Velvindron <loganaden@devio.us> Approved by: sbruno@ MFC after: 1 month	2011-12-08 03:20:38 +00:00
Alan Cox	ea3f07d3a0	Eliminate stale numbers from a comment.	2011-12-07 16:27:23 +00:00
Alan Cox	c749c003b8	Eliminate the possibility of 32-bit arithmetic overflow in the calculation of vm_kmem_size that may occur if the system administrator has specified a vm.vm_kmem_size tunable value that exceeds the hard cap. PR: 162741 Submitted by: Adam McDougall Reviewed by: bde@ MFC after: 3 weeks	2011-12-07 07:03:14 +00:00
Konstantin Belousov	93c26de0ad	Most users of pipe(2) do not call fstat(2) on the returned pipe descriptors. Optimize for the case, by lazily allocating the pipe inode number at the fstat(2) time. If alloc_unr(9) returns failure, do not fail fstat(2), since uses of inode numbers are even rare then fstat(2), but provide zero inode forever. Note that alloc_unr() failure is unlikely due to total number of pipes in the system limited by the number of file descriptors. Based on the submission by: gianni MFC after: 2 weeks	2011-12-06 11:24:03 +00:00
Mikolaj Golub	9e94d5b83f	Really protect kern.proc.ps_strings sysctls with p_candebug(). This was intended to be in r228288. Spotted by: many MFC after: 1 week	2011-12-06 06:40:14 +00:00
Mikolaj Golub	c65932be9d	Protect kern.proc.auxv and kern.proc.ps_strings sysctls with p_candebug(). Citing jilles: If we are ever going to do ASLR, the AUXV information tells an attacker where the stack, executable and RTLD are located, which defeats much of the point of randomizing the addresses in the first place. Given that the AUXV information seems to be used by debuggers only anyway, I think it would be good to move it to p_candebug() now. The full virtual memory maps (KERN_PROC_VMMAP, procstat -v) are already under p_candebug(). Suggested by: jilles Discussed with: rwatson MFC after: 1 week	2011-12-05 19:34:02 +00:00

1 2 3 4 5 ...

12466 Commits