freebsd-dev

Author	SHA1	Message	Date
John Baldwin	88bf5036fc	Include the associated wait channel message for context switch ktrace records. kdump supports both the old and new messages. Submitted by: Andrey Zonov andrey zonov org MFC after: 1 week	2012-04-20 15:32:36 +00:00
Jaakko Heinonen	dd952f80bc	The value of flags matching VNOVAL can't be supported. Return EOPNOTSUPP from setfflags() in this case. This fixes the return value of chflags(path, -1). Discussed with: bde MFC after: 2 weeks	2012-04-20 10:08:30 +00:00
Kirk McKusick	dca5e0ec50	This update uses the MNT_VNODE_FOREACH_ACTIVE interface that loops over just the active vnodes associated with a mount point to replace MNT_VNODE_FOREACH_ALL in the vfs_msync, ffs_sync_lazy, and qsync routines. The vfs_msync routine is run every 30 seconds for every writably mounted filesystem. It ensures that any files mmap'ed from the filesystem with modified pages have those pages queued to be written back to the file from which they are mapped. The ffs_lazy_sync and qsync routines are run every 30 seconds for every writably mounted UFS/FFS filesystem. The ffs_lazy_sync routine ensures that any files that have been accessed in the previous 30 seconds have had their access times queued for updating in the filesystem. The qsync routine ensures that any files with modified quotas have those quotas queued to be written back to their associated quota file. In a system configured with 250,000 vnodes, less than 1000 are typically active at any point in time. Prior to this change all 250,000 vnodes would be locked and inspected twice every minute by the syncer. For UFS/FFS filesystems they would be locked and inspected six times every minute (twice by each of these three routines since each of these routines does its own pass over the vnodes associated with a mount point). With this change the syncer now locks and inspects only the tiny set of vnodes that are active. Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks	2012-04-20 07:00:28 +00:00
Kirk McKusick	f257ebbb2e	This change creates a new list of active vnodes associated with a mount point. Active vnodes are those with a non-zero use or hold count, e.g., those vnodes that are not on the free list. Note that this list is in addition to the list of all the vnodes associated with a mount point. To avoid adding another set of linkage pointers to the vnode structure, the active list uses the existing linkage pointers used by the free list (previously named v_freelist, now renamed v_actfreelist). This update adds the MNT_VNODE_FOREACH_ACTIVE interface that loops over just the active vnodes associated with a mount point (typically less than 1% of the vnodes associated with the mount point). Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks	2012-04-20 06:50:44 +00:00
Kirk McKusick	16165feec4	Delete a no longer useful VNASSERT missed during changes in 234400. Suggested by: kib	2012-04-18 19:34:20 +00:00
Kirk McKusick	60005d66ab	Fix a memory leak of M_VNODE_MARKER introduced in 234386. Found by: Peter Holm	2012-04-18 19:30:22 +00:00
Kirk McKusick	73305eb826	Drop export of vdestroy() function from kern/vfs_subr.c as it is used only as a helper function in that file. Replace sole call to vbusy() with inline code in vholdl(). Replace sole calls to vfree() and vdestroy() with inline code in vdropl(). The Clang compiler already inlines these functions, so they do not show up in a kernel backtrace which is confusing. Also you cannot set their frame in kgdb which means that it is impossible to view their local variables. So, while the produced code is unchanged, the debugging should be easier. Discussed with: kib MFC after: 2 weeks	2012-04-17 21:46:59 +00:00
Kirk McKusick	71469bb38f	Replace the MNT_VNODE_FOREACH interface with MNT_VNODE_FOREACH_ALL. The primary changes are that the user of the interface no longer needs to manage the mount-mutex locking and that the vnode that is returned has its mutex locked (thus avoiding the need to check to see if it is DOOMED or other possible end of life scenarios). To minimize compatibility issues for third-party developers, the old MNT_VNODE_FOREACH interface will remain available so that this change can be MFC'ed to 9. Following the MFC to 9, MNT_VNODE_FOREACH will be removed in head. The reason for this update is to prepare for the addition of the MNT_VNODE_FOREACH_ACTIVE interface that will loop over just the active vnodes associated with a mount point (typically less than 1% of the vnodes associated with the mount point). Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks	2012-04-17 16:28:22 +00:00
Edward Tomasz Napierala	9e21ef395a	Fix bug where NFSv4 ACL enforcement code wouldn't unconditionally allow the owner to read and write ACL and file attributes when there was no entry with subject matching the owner. In other words, 'getfacl meh' shouldn't fail for the owner if the ACL looks like this: # file: meh # owner: trasz # group: wheel user:root:------a-------:------:allow Reported by: kientzle	2012-04-17 14:54:00 +00:00
Edward Tomasz Napierala	0b18eb6d74	Stop treating system processes as special. This fixes panics like the one triggered by this: # kldload geom_vinum # pwait `pgrep -S gv_worker` & # kldunload geom_vinum or this: GEOM_JOURNAL: Shutting down geom gjournal 3464572051. panic: destroying non-empty racct: 1 allocated for resource 6 which were tracked by jh@ to be caused by checking p->p_flag, while it wasn't initialised yet. Basically, during fork, the code checked p_flag, concluded the process isn't marked as P_SYSTEM, incremented the counter, and later on, when exiting, checked that the process was marked as P_SYSTEM, and thus didn't decrement it. Also, I believe there wasn't any good reason for checking P_SYSTEM in the first place. Tested by: jh	2012-04-17 14:31:02 +00:00
Edward Tomasz Napierala	47f6635cc1	Fix panic, triggered like this: "int main() { thr_exit(); }" Submitted by: Mateusz Guzik	2012-04-17 13:44:40 +00:00
Edward Tomasz Napierala	786813aa1f	Enforce upper bound on the input buffer length. Reported by: Mateusz Guzik	2012-04-17 13:28:14 +00:00
Jung-uk Kim	d69a426fce	- Implement pipe2 syscall for Linuxulator. This syscall appeared in 2.6.27 but GNU libc used it without checking its kernel version, e. g., Fedora 10. - Move pipe(2) implementation for Linuxulator from MD files to MI file, sys/compat/linux/linux_file.c. There is no MD code for this syscall at all. - Correct an argument type for pipe() from l_ulong * to l_int *. Probably this was the source of MI/MD confusion. Reviewed by: emulation	2012-04-16 21:22:02 +00:00
Davide Italiano	99006d44f8	Fix a typo. Approved by: gnn (mentor) MFC after: 2 days	2012-04-14 23:59:58 +00:00
Davide Italiano	331805a5d3	Fix some style bugs introduced in a previous commit (r233045) Reported by: glebius, jmallet Reviewed by: jmallet Approved by: gnn (mentor) MFC after: 2 days	2012-04-14 23:53:31 +00:00
Marius Strobl	91849f349c	Fix !DDB build after r234190.	2012-04-14 11:21:24 +00:00
Adrian Chadd	676c1784cb	Use strdup() on the name (and free it when it's done) so non-static names can be used in firmware_register().	2012-04-13 04:22:42 +00:00
John Baldwin	0cc457b000	- Extend the KDB interface to add a per-debugger callback to print a backtrace for an arbitrary thread (rather than the calling thread). A kdb_backtrace_thread() wrapper function uses the configured debugger if possible, otherwise it falls back to using stack(9) if that is available. - Replace a direct call to db_trace_thread() in propagate_priority() with a call to kdb_backtrace_thread() instead. MFC after: 1 week	2012-04-12 17:43:59 +00:00
John Baldwin	7582954e34	If a linker file contains at least one module, but all of the modules fail to load (the MOD_LOAD event fails) during a kldload(2), unload the linker file and fail the kldload(2) with ENOEXEC. Reported by: gcooper MFC after: 1 week	2012-04-12 14:49:25 +00:00
Konstantin Belousov	2dd9ea6f70	Add thread-private flag to indicate that error value is already placed in td_errno. Flag is supposed to be used by syscalls returning EJUSTRETURN because errno was already placed into the usermode frame by a call to set_syscall_retval(9). Both ktrace and dtrace get errno value from td_errno if the flag is set. Use the flag to fix sigsuspend(2) error return ktrace records. Requested by: bde MFC after: 1 week	2012-04-12 10:48:43 +00:00
Kirk McKusick	ecb6e528c5	Export vinactive() from kern/vfs_subr.c (e.g., make it no longer static and declare its prototype in sys/vnode.h) so that it can be called from process_deferred_inactive() (in ufs/ffs/ffs_snapshot.c) instead of the body of vinactive() being cut and pasted into process_deferred_inactive(). Reviewed by: kib MFC after: 2 weeks	2012-04-11 23:01:11 +00:00
John Baldwin	77b479e644	Allow device_busy() and device_unbusy() to be invoked while a device is being attached. This is implemented by adding a new DS_ATTACHING state while a device's DEVICE_ATTACH() method is being invoked. A driver is required to not fail an attach of a busy device. The device's state will be promoted to DS_BUSY rather than DS_ACTIVE() if the device was marked busy during DEVICE_ATTACH(). Reviewed by: kib MFC after: 1 week	2012-04-11 20:57:41 +00:00
Eitan Adler	847d0034e3	Return EBADF instead of EMFILE from dup2 when the second argument is outside the range of valid file descriptors PR: kern/164970 Submitted by: Peter Jeremy <peterjeremy@acm.org> Reviewed by: jilles Approved by: cperciva MFC after: 1 week	2012-04-11 14:08:09 +00:00
Jilles Tjoelker	8a8be77610	Remove unused and wrong SA_PROC internal signal property. The SA_PROC signal property indicated whether each signal number is directed at a specific thread or at the process in general. However, that depends on how the signal was generated and not on the signal number. SA_PROC was not used.	2012-04-09 21:58:58 +00:00
Alexander Motin	70801abe8f	Microoptimize cpu_search(). According to profiling, this makes it take 6% of CPU time on hackbench with its millions of context switches per second, instead of 8% before.	2012-04-09 18:24:58 +00:00
Gleb Kurtsou	0ff93c48da	Add vfs_getopt_size. Support human readable file system options in tmpfs. Increase maximum tmpfs file system size to 4GB*PAGE_SIZE on 32 bit archs. Discussed with: delphij MFC after: 2 weeks	2012-04-07 15:27:34 +00:00
Alexander V. Chernikov	e4b3229aa5	- Improve BPF locking model. Interface locks and descriptor locks are converted from mutex(9) to rwlock(9). This greatly improves performance: in most common case we need to acquire 1 reader lock instead of 2 mutexes. - Remove filter(descriptor) (reader) lock in bpf_mtap[2] This was suggested by glebius@. We protect filter by requesting interface writer lock on filter change. - Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h without including rwlock stuff. However, this is a temporary solution, struct bpf_if should be made opaque for any external caller. Found by: Dmitrij Tejblum <tejblum@yandex-team.ru> Sponsored by: Yandex LLC Reviewed by: glebius (previous version) Reviewed by: silence on -net@ Approved by: (mentor) MFC after: 3 weeks	2012-04-06 06:53:58 +00:00
John Baldwin	35818d2e94	Add new ktrace records for the start and end of VM faults. This gives a pair of records similar to syscall entry and return that a user can use to determine how long page faults take. The new ktrace records are enabled via the 'p' trace type, and are enabled in the default set of trace points. Reviewed by: kib MFC after: 2 weeks	2012-04-05 17:13:14 +00:00
David Xu	8931e524bf	In sem_post, the field _has_waiters is no longer used, because some applications destroy the semaphore after sem_wait returns. Just enter kernel to wake up sleeping threads, only update _has_waiters if it is safe. While here, check if the value exceeds SEM_VALUE_MAX and return EOVERFLOW if this is true.	2012-04-05 03:05:02 +00:00
David Xu	17ce606321	umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses a mutex after a thread has unlocked it, it even writes data to the mutex memory to clear contention bit, there is a race that other threads can lock it and unlock it, then destroy it, so it should not write data to the mutex memory if there isn't any waiter. The new operation UMTX_OP_MUTEX_WAKE2 tries to fix the problem. It requires thread library to clear the lock word entirely, then call the WAKE2 operation to check if there is any waiter in kernel, and try to wake up a thread, if necessary, the contention bit is set again by the operation. This also mitigates the chance that other threads find the contention bit and try to enter kernel to compete with each other to wake up sleeping thread, this is unnecessary. With this change, the mutex owner is no longer holding the mutex until it reaches a point where kernel umtx queue is locked, it releases the mutex as soon as possible. Performance is improved when the mutex is contested heavily. On Intel i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c	2012-04-05 02:24:08 +00:00
Navdeep Parhar	60a305887a	- Remove redundant call to pr_ctloutput from code that handles SO_SETFIB. - Add a check for errors during copyin while here. Reviewed by: julian, bz MFC after: 2 weeks	2012-04-03 18:38:00 +00:00
Konstantin Belousov	5085ecb75a	When a process exits, not only the children shall be reparented to init, but also the orphans shall be removed from the orphan list, because the list header is destroyed. Reported and tested by: pho MFC after: 3 days	2012-04-02 19:35:36 +00:00
Konstantin Belousov	2e39e24f64	Add helper function to remove the process from the orphans list and use it instead of inlined code. Tested by: pho MFC after: 3 days	2012-04-02 19:34:56 +00:00
John Baldwin	e506e182dd	Export some more useful info about shared memory objects to userland via procstat(1) and fstat(1): - Change shm file descriptors to track the pathname they are associated with and add a shm_path() method to copy the path out to a caller-supplied buffer. - Use the fo_stat() method of shared memory objects and shm_path() to export the path, mode, and size of a shared memory object via struct kinfo_file. - Add a struct shmstat to the libprocstat(3) interface along with a procstat_get_shm_info() to export the mode and size of a shared memory object. - Change procstat to always print out the path for a given object if it is valid. - Teach fstat about shared memory objects and to display their path, mode, and size. MFC after: 2 weeks	2012-04-01 18:22:48 +00:00
David Xu	8b1eafa723	Remove stale comments.	2012-03-31 06:48:41 +00:00
David Xu	b29d7d9b60	Remove trailing semicolon, it is a typo.	2012-03-30 12:57:14 +00:00
David Xu	0cf573e989	Fix COMPAT_FREEBSD32 build. Submitted by: Andreas Tobler < andreast at fgznet dot ch >	2012-03-30 09:03:53 +00:00
David Xu	4ed8858df0	Remove trailing space.	2012-03-30 05:49:32 +00:00
David Xu	e05171d939	Merge umtxq_sleep and umtxq_nanosleep into a single function by using an abs_timeout structure which describes timeout info.	2012-03-30 05:40:26 +00:00
David Xu	d31f470d15	Reduce code size by creating common timed sleeping function.	2012-03-29 02:46:43 +00:00
Fabien Thomas	f5f9340b98	Add software PMC support. New kernel events can be added at various location for sampling or counting. This will for example allow easy system profiling whatever the processor is with known tools like pmcstat(8). Simultaneous usage of software PMC and hardware PMC is possible, for example looking at the lock acquire failure, page fault while sampling on instructions. Sponsored by: NETASQ MFC after: 1 month	2012-03-28 20:58:30 +00:00
Ryan Stone	9742410797	Instead of only iterating over the set of known SDT probes when sdt.ko is loaded and unloaded, also have sdt.ko register callbacks with kern_sdt.c that will be called when a newly loaded KLD module adds more probes or a module with probes is unloaded. This fixes two issues: first, if a module with SDT probes was loaded after sdt.ko was loaded, those new probes would not be available in DTrace. Second, if a module with SDT probes was unloaded while sdt.ko was loaded, the kernel would panic the next time DTrace had cause to try and do anything with the no-longer-existent probes. This makes it possible to create SDT probes in KLD modules, although there are still two caveats: first, any SDT probes in a KLD module must be part of a DTrace provider that is defined in that module. At present DTrace only destroys probes when the provider is destroyed, so you can still panic the system if a KLD module creates new probes in a provider from a different module (including the kernel) and then unload the first module. Second, the system will panic if you unload a module containing SDT probes while there is an active D script that has enabled those probes. MFC after: 1 month	2012-03-27 15:07:43 +00:00
Alexander V. Chernikov	b25711e6b0	- Add knlist_init_rw_reader() function to kqueue(9). Function acquired reader lock if needed. Assert check for reader or writer lock (RA_LOCKED / RA_UNLOCKED) - While here, add knlist_init_mtx.9 to MLINKS and fix some style(9) issues Reviewed by: glebius Approved by: ae(mentor) MFC after: 2 weeks	2012-03-26 09:34:17 +00:00
Mikolaj Golub	903712c99c	Add a sysctl to set and retrieve binary osreldate of another process. Suggested by: kib Reviewed by: kib MFC after: 2 weeks	2012-03-23 20:05:41 +00:00
Andrey V. Elsukov	5b0da85a41	Correct debug message.	2012-03-22 09:29:07 +00:00
Alan Cox	5730afc9b6	Handle spurious page faults that may occur in no-fault sections of the kernel. When access restrictions are added to a page table entry, we flush the corresponding virtual address mapping from the TLB. In contrast, when access restrictions are removed from a page table entry, we do not flush the virtual address mapping from the TLB. This is exactly as recommended in AMD's documentation. In effect, when access restrictions are removed from a page table entry, AMD's MMUs will transparently refresh a stale TLB entry. In short, this saves us from having to perform potentially costly TLB flushes. In contrast, Intel's MMUs are allowed to generate a spurious page fault based upon the stale TLB entry. Usually, such spurious page faults are handled by vm_fault() without incident. However, when we are executing no-fault sections of the kernel, we are not allowed to execute vm_fault(). This change introduces special-case handling for spurious page faults that occur in no-fault sections of the kernel. In collaboration with: kib Tested by: gibbs (an earlier version) I would also like to acknowledge Hiroki Sato's assistance in diagnosing this problem. MFC after: 1 week	2012-03-22 04:52:51 +00:00
Andrey V. Elsukov	c5e7f0649a	Acquire modules lock before call module_getname() in the KLD_DEBUG case. MFC after: 1 week	2012-03-21 09:48:32 +00:00
Eitan Adler	24c10828e4	- Clean up timestamps in msgbuf code. The timestamps should now be inserted after the priority token thus cleaning up the output. - Remove the needless double internal do_add_char function. - Resolve a possible deadlock if interrupts are disabled and getnanotime is called Reviewed by: bde kmacy, avg, sbruno (various versions) Approved by: cperciva MFC after: 2 weeks	2012-03-19 00:36:32 +00:00
Jaakko Heinonen	59f513cd09	Cast wallclock.tv_sec to uint64_t to avoid overflow in the calculation. PR: kern/161552 Reviewed by: trasz Tested by: Nikos Vassiliadis MFC after: 1 week	2012-03-18 19:13:32 +00:00
Davide Italiano	c6111de55d	Add rudimentary profiling of the hash table used in the umtx code to hold active lock queues. Reviewed by: attilio Approved by: davidxu, gnn (mentor) MFC after: 3 weeks	2012-03-16 20:32:11 +00:00
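
The active-vnode commits above (71469bb38f, f257ebbb2e, dca5e0ec50) describe keeping a second per-mount list that holds only vnodes with a non-zero use or hold count, so that periodic scans need not visit every vnode. The following is a minimal user-space sketch of that idea, not the kernel code: the names vnode_model, mount_model, mount_model_add, scan_all and scan_active are hypothetical, and only the TAILQ macros from sys/queue.h are real. Locking, the free list, and movement between lists as counts change are all left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical user-space model of a vnode; NOT the kernel's struct vnode. */
struct vnode_model {
	int usecount;				/* non-zero means "active" */
	TAILQ_ENTRY(vnode_model) all_link;	/* linkage on the list of all vnodes */
	TAILQ_ENTRY(vnode_model) actfree_link;	/* linkage on the active list; the
						 * commit shares this linkage with
						 * the free list (not modeled here) */
};

TAILQ_HEAD(vnode_list, vnode_model);

/* Hypothetical model of a mount point holding both lists. */
struct mount_model {
	struct vnode_list all;		/* every vnode of the mount */
	struct vnode_list active;	/* only vnodes with usecount > 0 */
};

void
mount_model_init(struct mount_model *mp)
{
	TAILQ_INIT(&mp->all);
	TAILQ_INIT(&mp->active);
}

/* Attach a vnode to the mount; active vnodes go on both lists. */
void
mount_model_add(struct mount_model *mp, struct vnode_model *vp, int usecount)
{
	assert(mp != NULL && vp != NULL);
	vp->usecount = usecount;
	TAILQ_INSERT_TAIL(&mp->all, vp, all_link);
	if (usecount > 0)
		TAILQ_INSERT_TAIL(&mp->active, vp, actfree_link);
}

/* Walk every vnode, as a scan over the full list would; returns the
 * number of vnodes inspected. */
size_t
scan_all(struct mount_model *mp)
{
	struct vnode_model *vp;
	size_t inspected = 0;

	TAILQ_FOREACH(vp, &mp->all, all_link)
		inspected++;
	return (inspected);
}

/* Walk only the active list; returns the number of vnodes inspected. */
size_t
scan_active(struct mount_model *mp)
{
	struct vnode_model *vp;
	size_t inspected = 0;

	TAILQ_FOREACH(vp, &mp->active, actfree_link)
		inspected++;
	return (inspected);
}
```

In this model, a mount populated with the proportions quoted in dca5e0ec50 (250,000 vnodes, fewer than 1000 of them active) would have scan_active() visit under 1000 entries where scan_all() visits all 250,000, which is the saving the commit message attributes to the syncer.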


12625 Commits