freebsd-skq

Author	SHA1	Message	Date
emaste	a3f6608533	Make a thread's address available via the kern proc sysctl, just like the process address. Add "tdaddr" keyword to ps(1) to display this thread address. Distilled from Sandvine's patch set by Mark Johnston.	2010-10-08 00:44:53 +00:00
rpaulo	ea11ba6788	Add an extra comment to the SDT probes definition. This allows us to get use '-' in probe names, matching the probe names in Solaris.[1] Add userland SDT probes definitions to sys/sdt.h. Sponsored by: The FreeBSD Foundation Discussed with: rwaston [1]	2010-08-22 11:18:57 +00:00
jhb	2f662f7a9c	There isn't really a need to hold the ktrace mutex just to read the value of p_traceflag that is stored in the kinfo_proc structure. It is still racey even with the lock and the code will read a consistent snapshot of the flag without the lock.	2010-08-19 16:40:30 +00:00
attilio	e56433dd50	Add the support for reporting the NOCOREDUMP flag from sysctl_kern_proc_vmmap(). Sponsored by: Sandvine Incorporated Reviewed by: kib, emaste MFC after: 1 week	2010-05-27 08:10:12 +00:00
kib	a3da7d7e69	Fix a mistake in r207603. td_rux.rux_runtime still needs conversion. Reported and tested by: nwhitehorn Pointy hat to: kib MFC after: 6 days	2010-05-05 16:05:51 +00:00
kib	7ef4b25b49	Use td_rux.rux_runtime for ki_runtime instead of redoing calculation. Submitted by: bde MFC after: 1 week	2010-05-04 06:00:39 +00:00
kib	a22b32df4a	Remove caddr_t casts. Requested by: bde MFC after: 10 days	2010-04-29 09:55:51 +00:00
kib	e91c695f77	Move the constants specifying the size of struct kinfo_proc into machine-specific header files. Add KINFO_PROC32_SIZE for struct kinfo_proc32 for architectures providing COMPAT_FREEBSD32. Add CTASSERT for the size of struct kinfo_proc32. Submitted by: pluknet Reviewed by: imp, jhb, nwhitehorn MFC after: 2 weeks	2010-04-24 12:49:52 +00:00
kib	f504a7390f	Fix typo. Submitted by: emaste Pointy hat to: kib (who needs much bigger wardrobe) MFC after: 1 week	2010-04-21 20:04:42 +00:00
kib	1b4a81ab7e	Provide compat32 shims for kinfo_proc sysctl. This allows 32bit ps(1) to mostly work on 64bit host. The work is based on an original patch submitted by emaste, obtained from Sandvine's source tree. Reviewed by: jhb MFC after: 1 week	2010-04-21 19:32:00 +00:00
kib	a8e7da4cf2	For kinfo_proc in kp->ki_siglist, return the set of the signals pending in the process queue when gathering information for the process, and set of signals pending for the thread, when gathering information for the thread. Previously, the sysctl returned a union of the process and some arbitrary thread pending set for the process, and union of the process and the thread pending set for the thread. MFC after: 1 week	2010-02-27 15:32:49 +00:00
jilles	9a57b41b18	Include terminated threads in ps's process cpu time field. MFC after: 2 weeks	2010-02-27 12:15:59 +00:00
bz	97711237c0	Remove an unused global. MFC after: 3 days	2009-12-25 20:03:03 +00:00
ed	449c6ac843	Let access overriding to TTYs depend on the cdev_priv, not the vnode. Basically this commit changes two things, which improves access to TTYs in exceptional conditions. Basically the problem was that when you ran jexec(8) to attach to a jail, you couldn't use /dev/tty (well, also the node of the actual TTY, e.g. /dev/pts/X). This is very inconvenient if you want to attach to screens quickly, use ssh(1), etc. The fixes: - Cache the cdev_priv of the controlling TTY in struct session. Change devfs_access() to compare against the cdev_priv instead of the vnode. This allows you to bypass UNIX permissions, even across different mounts of devfs. - Extend devfs_prison_check() to unconditionally expose the device node of the controlling TTY, even if normal prison nesting rules normally don't allow this. This actually allows you to interact with this device node. To be honest, I'm not really happy with this solution. We now have to store three pointers to a controlling TTY (s_ttyp, s_ttyvp, s_ttydp). In an ideal world, we should just get rid of the latter two and only use s_ttyp, but this makes certian pieces of code very impractical (e.g. devfs, kern_exit.c). Reported by: Many people	2009-12-19 18:42:12 +00:00
emaste	93e81ca098	In fill_kinfo_thread, copy the thread's name into struct kinfo_proc even if it is empty. Otherwise the previous thread's name would remain in the struct and then be reported for this thread. Submitted by: Ryan Stone MFC after: 1 week	2009-10-01 21:44:30 +00:00
kib	bae5df8cbb	Reintroduce the r196640, after fixing the problem with my testing. Remove the altkstacks, instead instantiate threads with kernel stack allocated with the right size from the start. For the thread that has kernel stack cached, verify that requested stack size is equial to the actual, and reallocate the stack if sizes differ [1]. This fixes the bug introduced by r173361 that was committed several days after r173004 and consisted of kthread_add(9) ignoring the non-default kernel stack size. Also, r173361 removed the caching of the kernel stacks for a non-first thread in the process. Introduce separate kernel stack cache that keeps some limited amount of preallocated kernel stacks to lower the latency of thread allocation. Add vm_lowmem handler to prune the cache on low memory condition. This way, system with reasonable amount of the threads get lower latency of thread creation, while still not exhausting significant portion of KVA for unused kstacks. Submitted by: peter [1] Discussed with: jhb, julian, peter Reviewed by: jhb Tested by: pho (and retested according to new test scenarious) MFC after: 1 week	2009-09-01 11:41:51 +00:00
kib	d105721a22	Reverse r196640 and r196644 for now.	2009-08-29 21:53:08 +00:00
kib	9e8ade6852	Remove the altkstacks, instead instantiate threads with kernel stack allocated with the right size from the start. For the thread that has kernel stack cached, verify that requested stack size is equial to the actual, and reallocate the stack if sizes differ [1]. This fixes the bug introduced by r173361 that was committed several days after r173004 and consisted of kthread_add(9) ignoring the non-default kernel stack size. Also, r173361 removed the caching of the kernel stacks for a non-first thread in the process. Introduce separate kernel stack cache that keeps some limited amount of preallocated kernel stacks to lower the latency of thread allocation. Add vm_lowmem handler to prune the cache on low memory condition. This way, system with reasonable amount of the threads get lower latency of thread creation, while still not exhausting significant portion of KVA for unused kstacks. Submitted by: peter [1] Discussed with: jhb, julian, peter Reviewed by: jhb Tested by: pho MFC after: 1 week	2009-08-29 13:28:02 +00:00
brooks	d91fda355e	Introduce a new sysctl process mib, kern.proc.groups which adds the ability to retrieve the group list of each process. Modify procstat's -s option to query this mib when the kinfo_proc reports that the field has been truncated. If the mib does not exist, fall back to the truncated list. Reviewed by: rwatson Approved by: re (kib) MFC after: 2 weeks	2009-07-24 19:12:19 +00:00
brooks	7931ef2c42	Revert the changes to struct kinfo_proc in r194498. Instead, fill in up to 16 (KI_NGROUPS) values and steal a bit from ki_cr_flags (all bits currently unused) to indicate overflow with the new flag KI_CRF_GRP_OVERFLOW. This fixes procstat -s. Approved by: re (kib)	2009-07-24 15:03:10 +00:00
jhb	44220d7e1e	Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver. Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks	2009-07-24 13:50:29 +00:00
brooks	f53c1c309d	Rework the credential code to support larger values of NGROUPS and NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024 and 1023 respectively. (Previously they were equal, but under a close reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it is the number of supplemental groups, not total number of groups.) The bulk of the change consists of converting the struct ucred member cr_groups from a static array to a pointer. Do the equivalent in kinfo_proc. Introduce new interfaces crcopysafe() and crsetgroups() for duplicating a process credential before modifying it and for setting group lists respectively. Both interfaces take care for the details of allocating groups array. crsetgroups() takes care of truncating the group list to the current maximum (NGROUPS) if necessary. In the future, crsetgroups() may be responsible for insuring invariants such as sorting the supplemental groups to allow groupmember() to be implemented as a binary search. Because we can not change struct xucred without breaking application ABIs, we leave it alone and introduce a new XU_NGROUPS value which is always 16 and is to be used or NGRPS as appropriate for things such as NFS which need to use no more than 16 groups. When feasible, truncate the group list rather than generating an error. Minor changes: - Reduce the number of hand rolled versions of groupmember(). - Do not assign to both cr_gid and cr_groups[0]. - Modify ipfw to cache ucreds instead of part of their contents since they are immutable once referenced by more than one entity. Submitted by: Isilon Systems (initial implementation) X-MFC after: never PR: bin/113398 kern/133867	2009-06-19 17:10:35 +00:00
rwatson	d9b4d146e8	Add a flags field to struct ucred, and export that via kinfo_proc, consuming one of its spare fields. The cr_flags field is currently unused, but will be used for features, including capability mode and pay-as-you-go audit. Discussed with: jhb, sson	2009-06-01 20:26:51 +00:00
jamie	a013e0afcb	Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)	2009-05-27 14:11:23 +00:00
attilio	bf75b4612a	- Add a function (fill_kinfo_aggregate()) which aggregates relevant members for a kinfo entry on a process-wide system. - Use the newly introduced function in order to fix cases like KERN_PROC_PROC where aggregating stats are broken because they just consider the first thread in the pool for each process. (Note, additively, that KERN_PROC_PROC is rather inaccurate on thread-wide informations like the 'state' of the process. Such informations should maybe be invalidated and being forceably discarded by the consumers?). - Simplify the logic of sysctl_out_proc() and adjust the fill_kinfo_thread() accordingly. - Remove checks on the FIRST_THREAD_IN_PROC() being NULL but add assertives. This patch should fix aggregate statistics for KERN_PROC_PROC. This is one of the reasons why top doesn't use this option and now it can be use it safely. ps, when launched in order to display just processes, now should report correct cpu utilization percentages and times (as opposed by the old code). Reviewed by: jhb, emaste Sponsored by: Sandvine Incorporated	2009-02-18 21:52:13 +00:00
jhb	0245a3370f	- Add conditional Giant locking around the vrele() in sysctl_kern_proc_pathname(). - Mark all the kern.proc.* sysctls as MPSAFE. Submitted by: csjp (2)	2009-01-23 22:46:45 +00:00
kib	bd5d614be8	vm_map_lock_read() does not increment map->timestamp, so we should compare map->timestamp with saved timestamp after map read lock is reacquired, not with saved timestamp + 1. The only consequence of the +1 was unconditional lookup of the next map entry, though. Tested by: pho Approved by: des MFC after: 2 weeks	2008-12-29 12:45:11 +00:00
kib	e747469903	Reference the vmspace of the process being inspected by procfs, linprocfs and sysctl kern_proc_vmmap handlers. Reported and tested by: pho Reviewed by: rwatson, des MFC after: 1 week	2008-12-12 12:12:36 +00:00
kib	8324189f53	Do drop vm map lock earlier in the sysctl_kern_proc_vmmap(), to avoid locking a vnode while having vm map locked. Reported and tested by: pho MFC after: 1 week	2008-12-08 12:29:30 +00:00
kib	ccad2ebfb2	Several threads in a process may do vfork() simultaneously. Then, all parent threads sleep on the parent' struct proc until corresponding child releases the vmspace. Each sleep is interlocked with proc mutex of the child, that triggers assertion in the sleepq_add(). The assertion requires that at any time, all simultaneous sleepers for the channel use the same interlock. Silent the assertion by using conditional variable allocated in the child. Broadcast the variable event on exec() and exit(). Since struct proc * sleep wait channel is overloaded for several unrelated events, I was unable to remove wakeups from the places where cv_broadcast() is added, except exec(). Reported and tested by: ganbold Suggested and reviewed by: jhb MFC after: 2 week	2008-12-05 20:50:24 +00:00
peter	76037b082e	Merge user/peter/kinfo branch as of r185547 into head. This changes struct kinfo_filedesc and kinfo_vmentry such that they are same on both 32 and 64 bit platforms like i386/amd64 and won't require sysctl wrapping. Two new OIDs are assigned. The old ones are available under COMPAT_FREEBSD7 - but it isn't that simple. The superceded interface was never actually released on 7.x. The other main change is to pack the data passed to userland via the sysctl. kf_structsize and kve_structsize are reduced for the copyout. If you have a process with 100,000+ sockets open, the unpacked records require a 132MB+ copyout. With packing, it is "only" ~35MB. (Still seriously unpleasant, but not quite as devastating). A similar problem exists for the vmentry structure - have lots and lots of shared libraries and small mmaps and its copyout gets expensive too. My immediate problem is valgrind. It traditionally achieves this functionality by parsing procfs output, in a packed format. Secondly, when tracing 32 bit binaries on amd64 under valgrind, it uses a cross compiled 32 bit binary which ran directly into the differing data structures in 32 vs 64 bit mode. (valgrind uses this to track file descriptor operations and this therefore affected every single 32 bit binary) I've added two utility functions to libutil to unpack the structures into a fixed record length and to make it a little more convenient to use.	2008-12-02 06:50:26 +00:00
peter	cd7b78c33f	Duplicate another few hundred lines of code in order to be compatible with unreleased binaries.	2008-12-01 02:13:32 +00:00
peter	83dc2280ce	WIP kinfo_file/kinfo_vmmentry tweaks. The idea: 1) to get the 32 and 64 bit versions in sync so that no shims are needed, Valgrind in particular excercises this. and: 2) reduce the size of the copyout. On large processes this turns out to be a huge problem. Valgrind also suffers from this since it needs to do this in a context that can't malloc. I want to pack the records. 3) Add new types.. 'tell me about fd N' and 'tell me about addr N'.	2008-11-29 20:55:11 +00:00
pjd	bbe899b96e	Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes. This bring huge amount of changes, I'll enumerate only user-visible changes: - Delegated Administration Allows regular users to perform ZFS operations, like file system creation, snapshot creation, etc. - L2ARC Level 2 cache for ZFS - allows to use additional disks for cache. Huge performance improvements mostly for random read of mostly static content. - slog Allow to use additional disks for ZFS Intent Log to speed up operations like fsync(2). - vfs.zfs.super_owner Allows regular users to perform privileged operations on files stored on ZFS file systems owned by him. Very careful with this one. - chflags(2) Not all the flags are supported. This still needs work. - ZFSBoot Support to boot off of ZFS pool. Not finished, AFAIK. Submitted by: dfr - Snapshot properties - New failure modes Before if write requested failed, system paniced. Now one can select from one of three failure modes: - panic - panic on write error - wait - wait for disk to reappear - continue - serve read requests if possible, block write requests - Refquota, refreservation properties Just quota and reservation properties, but don't count space consumed by children file systems, clones and snapshots. - Sparse volumes ZVOLs that don't reserve space in the pool. - External attributes Compatible with extattr(2). - NFSv4-ACLs Not sure about the status, might not be complete yet. Submitted by: trasz - Creation-time properties - Regression tests for zpool(8) command. Obtained from: OpenSolaris	2008-11-17 20:49:29 +00:00
jhb	6c6f8c89e8	Remove unnecessary locking around vn_fullpath(). The vnode lock for the vnode in question does not need to be held. All the data structures used during the name lookup are protected by the global name cache lock. Instead, the caller merely needs to ensure a reference is held on the vnode (such as vhold()) to keep it from being freed. In the case of procfs' <pid>/file entry, grab the process lock while we gain a new reference (via vhold()) on p_textvp to fully close races with execve(2). For the kern.proc.vmmap sysctl handler, use a shared vnode lock around the call to VOP_GETATTR() rather than an exclusive lock. MFC after: 1 month	2008-11-04 19:04:01 +00:00
peter	1f7fd22cbb	Add three extra to the kinfo_proc_vmmap data. kve_offset - the offset within an object that a mapping refers to. fileid and fsid are inode/dev for vnodes. (Linux procfs has these and valgrind is really unhappy without them.) I believe I didn't change the size of the struct.	2008-10-31 05:43:19 +00:00
des	66f807ed8b	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
ed	0f8f4f624b	Fix minor TTY API inconsistency. Unlike tty_rel_gone() and tty_rel_sess(), the tty_rel_pgrp() routine does not unlock the TTY. I once had the idea to make the code call tty_rel_pgrp() and tty_rel_sess(), picking up the TTY lock once. This turned out a little harder than I expected, so this is how it works now. It's a lot easier if we just let tty_rel_pgrp() unlock the TTY, because the other routines do this anyway.	2008-09-16 14:57:23 +00:00
kevlo	9f7bbf786b	If the process id specified is invalid, the system call returns ESRCH	2008-09-04 10:44:33 +00:00
ed	cc3116a938	Integrate the new MPSAFE TTY layer to the FreeBSD operating system. The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following: - Improved driver model: The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers. If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver. - Improved hotplugging: With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc). The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly. - Improved performance: One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters. Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING. Obtained from: //depot/projects/mpsafetty/... Approved by: philip (ex-mentor) Discussed: on the lists, at BSDCan, at the DevSummit Sponsored by: Snow B.V., the Netherlands dcons(4) fixed by: kan	2008-08-20 08:31:58 +00:00
kib	e2333a32b6	Call pargs_drop() unconditionally in do_execve(), the function correctly handles the NULL argument. Make pargs_free() static. MFC after: 1 week	2008-07-25 11:55:32 +00:00
jb	c4443570b6	Add DTrace 'proc' provider probes using the Statically Defined Trace (sdt) mechanism.	2008-05-24 06:22:16 +00:00
jeff	46f09d5bc3	- Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice from requiring the per-process spinlock to only requiring the process lock. - Reflect these changes in the proc.h documentation and consumers throughout the kernel. This is a substantial reduction in locking cost for these fields and was made possible by recent changes to threading support.	2008-03-19 06:19:01 +00:00
jeff	acb93d599c	Remove kernel support for M:N threading. While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.	2008-03-12 10:12:01 +00:00
rwatson	f261f9865b	Don't zero td_runtime when billing thread CPU usage to the process; maintain a separate td_incruntime to hold unbilled CPU usage for the thread that has the previous properties of td_runtime. When thread information is requested using the thread monitoring sysctls, export thread td_runtime instead of process rusage runtime in kinfo_proc. This restores the display of individual ithread and other kernel thread CPU usage since inception in ps -H and top -SH, as well for libthr user threads, valuable debugging information lost with the move to try kthreads since they are no longer independent processes. There is universal agreement that we should rewrite the process and thread export sysctls, but this commit gets things going a bit better in the mean time. Likewise, there are resevations about the continued validity of statclock given the speed of modern processors. Reviewed by: attilio, emaste, jhb, julian	2008-01-10 22:11:20 +00:00
attilio	18d0a0dd51	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
rwatson	010adfaab6	Return ESRCH when a kernel stack is queried on a process in execve() -- p_candebug() will return EAGAIN which, if the other process never leaves execve(), will result in the sysctl spinning and never returning to userspace. Processes should always eventually leave execve(), but spinning in kernel while we wait is bad for countless reasons, and particularly harmful if execve() itself is deadlocked. Possibly we should return another error, or return a marker indicating the thread is in execve() so it can be reported that way in userspace. Reported by: kris	2007-12-27 22:44:01 +00:00
rwatson	cce7cfdaf5	Check for P_WEXIT before PHOLD() on a process in kstack and vm query sysctls, as PHOLD() asserts !P_WEXIT. Reported by: Michael Plass <mfp49_freebsd at plass-family dot net>	2007-12-09 17:22:27 +00:00
rwatson	e891688481	Add another new sysctl in support of the forthcoming procstat(1) to support its -k argument: kern.proc.kstack - dump the kernel stack of a process, if debugging is permitted. This sysctl is present if either "options DDB" or "options STACK" is compiled into the kernel. Having support for tracing the kernel stacks of processes from user space makes it much easier to debug (or understand) specific wmesg's while avoiding the need to enter DDB in order to determine the path by which a process came to be blocked on a particular wait channel or lock.	2007-12-02 21:52:18 +00:00
rwatson	c25458da37	Add two new sysctls in support of the forthcoming procstat(1) to support its -f and -v arguments: kern.proc.filedesc - dump file descriptor information for a process, if debugging is permitted, including socket addresses, open flags, file offsets, file paths, etc. kern.proc.vmmap - dump virtual memory mapping information for a process, if debugging is permitted, including layout and information on underlying objects, such as the type of object and path. These provide a superset of the information historically available through the now-deprecated procfs(4), and are intended to be exported in an ABI-robust form.	2007-12-02 10:10:27 +00:00

1 2 3 4 5 ...

303 Commits