freebsd-dev

Author	SHA1	Message	Date
Rebecca Cran	8d065a3914	Fix some more style(9) issues.	2010-11-14 16:10:15 +00:00
Rebecca Cran	b389be97db	Fix style(9) issues from r215281 and r215282. MFC after: 1 week	2010-11-14 08:06:29 +00:00
Rebecca Cran	2baa5cddb6	Add some descriptions to sys/kern sysctls. PR: kern/148710 Tested by: Chip Camden <sterling at camdensoftware.com> MFC after: 1 week	2010-11-14 06:09:50 +00:00
Edward Tomasz Napierala	a37e14e1d8	Fix style. Submitted by: bde	2010-11-11 21:53:46 +00:00
Edward Tomasz Napierala	98e0196aef	Remove unneeded conditional. Discussed with: kib	2010-11-11 08:15:12 +00:00
Ed Maste	6239ef1d29	Make a thread's address available via the kern proc sysctl, just like the process address. Add "tdaddr" keyword to ps(1) to display this thread address. Distilled from Sandvine's patch set by Mark Johnston.	2010-10-08 00:44:53 +00:00
Rui Paulo	79856499bd	Add an extra comment to the SDT probes definition. This allows us to get use '-' in probe names, matching the probe names in Solaris.[1] Add userland SDT probes definitions to sys/sdt.h. Sponsored by: The FreeBSD Foundation Discussed with: rwaston [1]	2010-08-22 11:18:57 +00:00
John Baldwin	ba50d5975d	There isn't really a need to hold the ktrace mutex just to read the value of p_traceflag that is stored in the kinfo_proc structure. It is still racey even with the lock and the code will read a consistent snapshot of the flag without the lock.	2010-08-19 16:40:30 +00:00
Attilio Rao	937912ea04	Add the support for reporting the NOCOREDUMP flag from sysctl_kern_proc_vmmap(). Sponsored by: Sandvine Incorporated Reviewed by: kib, emaste MFC after: 1 week	2010-05-27 08:10:12 +00:00
Konstantin Belousov	213c077f62	Fix a mistake in r207603. td_rux.rux_runtime still needs conversion. Reported and tested by: nwhitehorn Pointy hat to: kib MFC after: 6 days	2010-05-05 16:05:51 +00:00
Konstantin Belousov	03d13670c2	Use td_rux.rux_runtime for ki_runtime instead of redoing calculation. Submitted by: bde MFC after: 1 week	2010-05-04 06:00:39 +00:00
Konstantin Belousov	e68d26fd5d	Remove caddr_t casts. Requested by: bde MFC after: 10 days	2010-04-29 09:55:51 +00:00
Konstantin Belousov	ed7806879b	Move the constants specifying the size of struct kinfo_proc into machine-specific header files. Add KINFO_PROC32_SIZE for struct kinfo_proc32 for architectures providing COMPAT_FREEBSD32. Add CTASSERT for the size of struct kinfo_proc32. Submitted by: pluknet Reviewed by: imp, jhb, nwhitehorn MFC after: 2 weeks	2010-04-24 12:49:52 +00:00
Konstantin Belousov	b4bf2ac1ac	Fix typo. Submitted by: emaste Pointy hat to: kib (who needs much bigger wardrobe) MFC after: 1 week	2010-04-21 20:04:42 +00:00
Konstantin Belousov	05e06d1157	Provide compat32 shims for kinfo_proc sysctl. This allows 32bit ps(1) to mostly work on 64bit host. The work is based on an original patch submitted by emaste, obtained from Sandvine's source tree. Reviewed by: jhb MFC after: 1 week	2010-04-21 19:32:00 +00:00
Konstantin Belousov	00305546f3	For kinfo_proc in kp->ki_siglist, return the set of the signals pending in the process queue when gathering information for the process, and set of signals pending for the thread, when gathering information for the thread. Previously, the sysctl returned a union of the process and some arbitrary thread pending set for the process, and union of the process and the thread pending set for the thread. MFC after: 1 week	2010-02-27 15:32:49 +00:00
Jilles Tjoelker	bbb17d169c	Include terminated threads in ps's process cpu time field. MFC after: 2 weeks	2010-02-27 12:15:59 +00:00
Bjoern A. Zeeb	095809b084	Remove an unused global. MFC after: 3 days	2009-12-25 20:03:03 +00:00
Ed Schouten	8dc9b4cf04	Let access overriding to TTYs depend on the cdev_priv, not the vnode. Basically this commit changes two things, which improves access to TTYs in exceptional conditions. Basically the problem was that when you ran jexec(8) to attach to a jail, you couldn't use /dev/tty (well, also the node of the actual TTY, e.g. /dev/pts/X). This is very inconvenient if you want to attach to screens quickly, use ssh(1), etc. The fixes: - Cache the cdev_priv of the controlling TTY in struct session. Change devfs_access() to compare against the cdev_priv instead of the vnode. This allows you to bypass UNIX permissions, even across different mounts of devfs. - Extend devfs_prison_check() to unconditionally expose the device node of the controlling TTY, even if normal prison nesting rules normally don't allow this. This actually allows you to interact with this device node. To be honest, I'm not really happy with this solution. We now have to store three pointers to a controlling TTY (s_ttyp, s_ttyvp, s_ttydp). In an ideal world, we should just get rid of the latter two and only use s_ttyp, but this makes certian pieces of code very impractical (e.g. devfs, kern_exit.c). Reported by: Many people	2009-12-19 18:42:12 +00:00
Ed Maste	e380cc73ed	In fill_kinfo_thread, copy the thread's name into struct kinfo_proc even if it is empty. Otherwise the previous thread's name would remain in the struct and then be reported for this thread. Submitted by: Ryan Stone MFC after: 1 week	2009-10-01 21:44:30 +00:00
Konstantin Belousov	8a945d109c	Reintroduce the r196640, after fixing the problem with my testing. Remove the altkstacks, instead instantiate threads with kernel stack allocated with the right size from the start. For the thread that has kernel stack cached, verify that requested stack size is equial to the actual, and reallocate the stack if sizes differ [1]. This fixes the bug introduced by r173361 that was committed several days after r173004 and consisted of kthread_add(9) ignoring the non-default kernel stack size. Also, r173361 removed the caching of the kernel stacks for a non-first thread in the process. Introduce separate kernel stack cache that keeps some limited amount of preallocated kernel stacks to lower the latency of thread allocation. Add vm_lowmem handler to prune the cache on low memory condition. This way, system with reasonable amount of the threads get lower latency of thread creation, while still not exhausting significant portion of KVA for unused kstacks. Submitted by: peter [1] Discussed with: jhb, julian, peter Reviewed by: jhb Tested by: pho (and retested according to new test scenarious) MFC after: 1 week	2009-09-01 11:41:51 +00:00
Konstantin Belousov	f25fa6abb2	Reverse r196640 and r196644 for now.	2009-08-29 21:53:08 +00:00
Konstantin Belousov	c3cf0b476f	Remove the altkstacks, instead instantiate threads with kernel stack allocated with the right size from the start. For the thread that has kernel stack cached, verify that requested stack size is equial to the actual, and reallocate the stack if sizes differ [1]. This fixes the bug introduced by r173361 that was committed several days after r173004 and consisted of kthread_add(9) ignoring the non-default kernel stack size. Also, r173361 removed the caching of the kernel stacks for a non-first thread in the process. Introduce separate kernel stack cache that keeps some limited amount of preallocated kernel stacks to lower the latency of thread allocation. Add vm_lowmem handler to prune the cache on low memory condition. This way, system with reasonable amount of the threads get lower latency of thread creation, while still not exhausting significant portion of KVA for unused kstacks. Submitted by: peter [1] Discussed with: jhb, julian, peter Reviewed by: jhb Tested by: pho MFC after: 1 week	2009-08-29 13:28:02 +00:00
Brooks Davis	254d03c508	Introduce a new sysctl process mib, kern.proc.groups which adds the ability to retrieve the group list of each process. Modify procstat's -s option to query this mib when the kinfo_proc reports that the field has been truncated. If the mib does not exist, fall back to the truncated list. Reviewed by: rwatson Approved by: re (kib) MFC after: 2 weeks	2009-07-24 19:12:19 +00:00
Brooks Davis	1b5768be71	Revert the changes to struct kinfo_proc in r194498. Instead, fill in up to 16 (KI_NGROUPS) values and steal a bit from ki_cr_flags (all bits currently unused) to indicate overflow with the new flag KI_CRF_GRP_OVERFLOW. This fixes procstat -s. Approved by: re (kib)	2009-07-24 15:03:10 +00:00
John Baldwin	013818111a	Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver. Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks	2009-07-24 13:50:29 +00:00
Brooks Davis	838d985825	Rework the credential code to support larger values of NGROUPS and NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024 and 1023 respectively. (Previously they were equal, but under a close reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it is the number of supplemental groups, not total number of groups.) The bulk of the change consists of converting the struct ucred member cr_groups from a static array to a pointer. Do the equivalent in kinfo_proc. Introduce new interfaces crcopysafe() and crsetgroups() for duplicating a process credential before modifying it and for setting group lists respectively. Both interfaces take care for the details of allocating groups array. crsetgroups() takes care of truncating the group list to the current maximum (NGROUPS) if necessary. In the future, crsetgroups() may be responsible for insuring invariants such as sorting the supplemental groups to allow groupmember() to be implemented as a binary search. Because we can not change struct xucred without breaking application ABIs, we leave it alone and introduce a new XU_NGROUPS value which is always 16 and is to be used or NGRPS as appropriate for things such as NFS which need to use no more than 16 groups. When feasible, truncate the group list rather than generating an error. Minor changes: - Reduce the number of hand rolled versions of groupmember(). - Do not assign to both cr_gid and cr_groups[0]. - Modify ipfw to cache ucreds instead of part of their contents since they are immutable once referenced by more than one entity. Submitted by: Isilon Systems (initial implementation) X-MFC after: never PR: bin/113398 kern/133867	2009-06-19 17:10:35 +00:00
Robert Watson	820c518125	Add a flags field to struct ucred, and export that via kinfo_proc, consuming one of its spare fields. The cr_flags field is currently unused, but will be used for features, including capability mode and pay-as-you-go audit. Discussed with: jhb, sson	2009-06-01 20:26:51 +00:00
Jamie Gritton	0304c73163	Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)	2009-05-27 14:11:23 +00:00
Attilio Rao	f8d9048018	- Add a function (fill_kinfo_aggregate()) which aggregates relevant members for a kinfo entry on a process-wide system. - Use the newly introduced function in order to fix cases like KERN_PROC_PROC where aggregating stats are broken because they just consider the first thread in the pool for each process. (Note, additively, that KERN_PROC_PROC is rather inaccurate on thread-wide informations like the 'state' of the process. Such informations should maybe be invalidated and being forceably discarded by the consumers?). - Simplify the logic of sysctl_out_proc() and adjust the fill_kinfo_thread() accordingly. - Remove checks on the FIRST_THREAD_IN_PROC() being NULL but add assertives. This patch should fix aggregate statistics for KERN_PROC_PROC. This is one of the reasons why top doesn't use this option and now it can be use it safely. ps, when launched in order to display just processes, now should report correct cpu utilization percentages and times (as opposed by the old code). Reviewed by: jhb, emaste Sponsored by: Sandvine Incorporated	2009-02-18 21:52:13 +00:00
John Baldwin	24f87fdbe8	- Add conditional Giant locking around the vrele() in sysctl_kern_proc_pathname(). - Mark all the kern.proc.* sysctls as MPSAFE. Submitted by: csjp (2)	2009-01-23 22:46:45 +00:00
Konstantin Belousov	22a448c4d9	vm_map_lock_read() does not increment map->timestamp, so we should compare map->timestamp with saved timestamp after map read lock is reacquired, not with saved timestamp + 1. The only consequence of the +1 was unconditional lookup of the next map entry, though. Tested by: pho Approved by: des MFC after: 2 weeks	2008-12-29 12:45:11 +00:00
Konstantin Belousov	c7462f4387	Reference the vmspace of the process being inspected by procfs, linprocfs and sysctl kern_proc_vmmap handlers. Reported and tested by: pho Reviewed by: rwatson, des MFC after: 1 week	2008-12-12 12:12:36 +00:00
Konstantin Belousov	118d0afa28	Do drop vm map lock earlier in the sysctl_kern_proc_vmmap(), to avoid locking a vnode while having vm map locked. Reported and tested by: pho MFC after: 1 week	2008-12-08 12:29:30 +00:00
Konstantin Belousov	aeb325719a	Several threads in a process may do vfork() simultaneously. Then, all parent threads sleep on the parent' struct proc until corresponding child releases the vmspace. Each sleep is interlocked with proc mutex of the child, that triggers assertion in the sleepq_add(). The assertion requires that at any time, all simultaneous sleepers for the channel use the same interlock. Silent the assertion by using conditional variable allocated in the child. Broadcast the variable event on exec() and exit(). Since struct proc * sleep wait channel is overloaded for several unrelated events, I was unable to remove wakeups from the places where cv_broadcast() is added, except exec(). Reported and tested by: ganbold Suggested and reviewed by: jhb MFC after: 2 week	2008-12-05 20:50:24 +00:00
Peter Wemm	43151ee6cf	Merge user/peter/kinfo branch as of r185547 into head. This changes struct kinfo_filedesc and kinfo_vmentry such that they are same on both 32 and 64 bit platforms like i386/amd64 and won't require sysctl wrapping. Two new OIDs are assigned. The old ones are available under COMPAT_FREEBSD7 - but it isn't that simple. The superceded interface was never actually released on 7.x. The other main change is to pack the data passed to userland via the sysctl. kf_structsize and kve_structsize are reduced for the copyout. If you have a process with 100,000+ sockets open, the unpacked records require a 132MB+ copyout. With packing, it is "only" ~35MB. (Still seriously unpleasant, but not quite as devastating). A similar problem exists for the vmentry structure - have lots and lots of shared libraries and small mmaps and its copyout gets expensive too. My immediate problem is valgrind. It traditionally achieves this functionality by parsing procfs output, in a packed format. Secondly, when tracing 32 bit binaries on amd64 under valgrind, it uses a cross compiled 32 bit binary which ran directly into the differing data structures in 32 vs 64 bit mode. (valgrind uses this to track file descriptor operations and this therefore affected every single 32 bit binary) I've added two utility functions to libutil to unpack the structures into a fixed record length and to make it a little more convenient to use.	2008-12-02 06:50:26 +00:00
Pawel Jakub Dawidek	1ba4a712dd	Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes. This bring huge amount of changes, I'll enumerate only user-visible changes: - Delegated Administration Allows regular users to perform ZFS operations, like file system creation, snapshot creation, etc. - L2ARC Level 2 cache for ZFS - allows to use additional disks for cache. Huge performance improvements mostly for random read of mostly static content. - slog Allow to use additional disks for ZFS Intent Log to speed up operations like fsync(2). - vfs.zfs.super_owner Allows regular users to perform privileged operations on files stored on ZFS file systems owned by him. Very careful with this one. - chflags(2) Not all the flags are supported. This still needs work. - ZFSBoot Support to boot off of ZFS pool. Not finished, AFAIK. Submitted by: dfr - Snapshot properties - New failure modes Before if write requested failed, system paniced. Now one can select from one of three failure modes: - panic - panic on write error - wait - wait for disk to reappear - continue - serve read requests if possible, block write requests - Refquota, refreservation properties Just quota and reservation properties, but don't count space consumed by children file systems, clones and snapshots. - Sparse volumes ZVOLs that don't reserve space in the pool. - External attributes Compatible with extattr(2). - NFSv4-ACLs Not sure about the status, might not be complete yet. Submitted by: trasz - Creation-time properties - Regression tests for zpool(8) command. Obtained from: OpenSolaris	2008-11-17 20:49:29 +00:00
John Baldwin	2ff47c5f18	Remove unnecessary locking around vn_fullpath(). The vnode lock for the vnode in question does not need to be held. All the data structures used during the name lookup are protected by the global name cache lock. Instead, the caller merely needs to ensure a reference is held on the vnode (such as vhold()) to keep it from being freed. In the case of procfs' <pid>/file entry, grab the process lock while we gain a new reference (via vhold()) on p_textvp to fully close races with execve(2). For the kern.proc.vmmap sysctl handler, use a shared vnode lock around the call to VOP_GETATTR() rather than an exclusive lock. MFC after: 1 month	2008-11-04 19:04:01 +00:00
Peter Wemm	7a9c4d2409	Add three extra to the kinfo_proc_vmmap data. kve_offset - the offset within an object that a mapping refers to. fileid and fsid are inode/dev for vnodes. (Linux procfs has these and valgrind is really unhappy without them.) I believe I didn't change the size of the struct.	2008-10-31 05:43:19 +00:00
Dag-Erling Smørgrav	1ede983cc9	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
Ed Schouten	42ff2756c7	Fix minor TTY API inconsistency. Unlike tty_rel_gone() and tty_rel_sess(), the tty_rel_pgrp() routine does not unlock the TTY. I once had the idea to make the code call tty_rel_pgrp() and tty_rel_sess(), picking up the TTY lock once. This turned out a little harder than I expected, so this is how it works now. It's a lot easier if we just let tty_rel_pgrp() unlock the TTY, because the other routines do this anyway.	2008-09-16 14:57:23 +00:00
Kevin Lo	f308bddd3f	If the process id specified is invalid, the system call returns ESRCH	2008-09-04 10:44:33 +00:00
Ed Schouten	bc093719ca	Integrate the new MPSAFE TTY layer to the FreeBSD operating system. The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following: - Improved driver model: The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers. If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver. - Improved hotplugging: With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc). The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly. - Improved performance: One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters. Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING. Obtained from: //depot/projects/mpsafetty/... Approved by: philip (ex-mentor) Discussed: on the lists, at BSDCan, at the DevSummit Sponsored by: Snow B.V., the Netherlands dcons(4) fixed by: kan	2008-08-20 08:31:58 +00:00
Konstantin Belousov	58e8af1bf5	Call pargs_drop() unconditionally in do_execve(), the function correctly handles the NULL argument. Make pargs_free() static. MFC after: 1 week	2008-07-25 11:55:32 +00:00
John Birrell	5d217f173c	Add DTrace 'proc' provider probes using the Statically Defined Trace (sdt) mechanism.	2008-05-24 06:22:16 +00:00
Jeff Roberson	374ae2a393	- Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice from requiring the per-process spinlock to only requiring the process lock. - Reflect these changes in the proc.h documentation and consumers throughout the kernel. This is a substantial reduction in locking cost for these fields and was made possible by recent changes to threading support.	2008-03-19 06:19:01 +00:00
Jeff Roberson	6617724c5f	Remove kernel support for M:N threading. While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.	2008-03-12 10:12:01 +00:00
Robert Watson	d92909c1d4	Don't zero td_runtime when billing thread CPU usage to the process; maintain a separate td_incruntime to hold unbilled CPU usage for the thread that has the previous properties of td_runtime. When thread information is requested using the thread monitoring sysctls, export thread td_runtime instead of process rusage runtime in kinfo_proc. This restores the display of individual ithread and other kernel thread CPU usage since inception in ps -H and top -SH, as well for libthr user threads, valuable debugging information lost with the move to try kthreads since they are no longer independent processes. There is universal agreement that we should rewrite the process and thread export sysctls, but this commit gets things going a bit better in the mean time. Likewise, there are resevations about the continued validity of statclock given the speed of modern processors. Reviewed by: attilio, emaste, jhb, julian	2008-01-10 22:11:20 +00:00
Attilio Rao	cb05b60a89	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
Robert Watson	0417fe5421	Return ESRCH when a kernel stack is queried on a process in execve() -- p_candebug() will return EAGAIN which, if the other process never leaves execve(), will result in the sysctl spinning and never returning to userspace. Processes should always eventually leave execve(), but spinning in kernel while we wait is bad for countless reasons, and particularly harmful if execve() itself is deadlocked. Possibly we should return another error, or return a marker indicating the thread is in execve() so it can be reported that way in userspace. Reported by: kris	2007-12-27 22:44:01 +00:00

1 2 3 4 5 ...

306 Commits