freebsd-nq

Author	SHA1	Message	Date
Mateusz Guzik	7e1d3eefd4	vfs: remove the unused thread argument from NDINIT* See b4a58fbf640409a1 ("vfs: remove cn_thread") Bump __FreeBSD_version to 1400043.	2021-11-25 22:50:42 +00:00
Mark Johnston	5c18bf9d5f	ktrace: Zero request structures when populating the pool Otherwise uninitialized pad bytes may be copied into the ktrace log file. Reported by: KMSAN MFC after: 1 week Sponsored by: The FreeBSD Foundation	2021-07-23 10:29:53 -04:00
Mark Johnston	283e60fb31	ktrace: Fix an inverted comparison added in commit f3851b235 Fixes: f3851b235 ("ktrace: Fix a race with fork()") Reported by: dchagin, phk	2021-06-01 09:15:35 -04:00
Mark Johnston	f3851b235b	ktrace: Fix a race with fork() ktrace(2) may toggle trace points in any of 1. a single process 2. all members of a process group 3. all descendents of the processes in 1 or 2 In the first two cases, we do not permit the operation if the process is being forked or not visible. However, in case 3 we did not enforce this restriction for descendents. As a result, the assertions about the child in ktrprocfork() may be violated. Move these checks into ktrops() so that they are applied consistently. Allow KTROP_CLEAR for nascent processes. Otherwise, there is a window where we cannot clear trace points for a nascent child if they are inherited from the parent. Reported by: syzbot+d96676592978f137e05c@syzkaller.appspotmail.com Reported by: syzbot+7c98fcf84a4439f2817f@syzkaller.appspotmail.com Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D30481	2021-05-27 15:52:20 -04:00
Mark Johnston	f885100773	ktrace: Handle negative array sizes in ktrstructarray ktrstructarray() may be used to create copies of kevent(2) change and event arrays. It is called before parameter validation is done and so should check for bogus array lengths before allocating a copy. Reported by: syzkaller Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D30479	2021-05-27 15:52:20 -04:00
Mark Johnston	6f6cd1e8e8	ktrace: Remove vrele() at the end of ktr_writerequest() As of commit fc369a353 we no longer ref the vnode when writing a record. Drop the corresponding vrele() call in the error case. Fixes: fc369a353 ("ktrace: fix a race between writes and close") Reported by: syzbot+9b96ea7a5ff8917d3fe4@syzkaller.appspotmail.com Reported by: syzbot+6120ebbb354cd52e5107@syzkaller.appspotmail.com Reviewed by: kib MFC after: 6 days Differential Revision: https://reviews.freebsd.org/D30404	2021-05-23 14:13:01 -04:00
Konstantin Belousov	fc369a353b	ktrace: fix a race between writes and close It was possible that termination of ktrace session occurred during some record write, in which case write occurred after the close of the vnode. Use ktr_io_params refcounting to avoid this situation, by taking the reference on the structure instead of vnode. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30400	2021-05-22 23:14:13 +03:00
Mark Johnston	e4b16f2fb1	ktrace: Avoid recursion in namei() sys_ktrace() calls namei(), which may call ktrnamei(). But sys_ktrace() also calls ktrace_enter() first, so if the caller is itself being traced, the assertion in ktrace_enter() is triggered. And, ktrnamei() does not check for recursion like most other ktrace ops do. Fix the bug by simply deferring the ktrace_enter() call. Also make the parameter to ktrnamei() const and convert to ANSI. Reported by: syzbot+d0a4de45e58d3c08af4b@syzkaller.appspotmail.com Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D30340	2021-05-22 12:07:32 -04:00
Konstantin Belousov	ea2b64c241	ktrace: add a kern.ktrace.filesize_limit_signal knob When enabled, writes to ktrace.out that exceed the max file size limit cause SIGXFSZ as it should be, but note that the limit is taken from the process that initiated ktrace. When disabled, write is blocked, but signal is not sent. Note that in either case ktrace for the affected process is stopped. Requested and reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30257	2021-05-22 15:16:09 +03:00
Konstantin Belousov	02645b886b	ktrace: use the limit of the trace initiator for file size limit on writes Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30257	2021-05-22 15:16:09 +03:00
Konstantin Belousov	1762f674cc	ktrace: pack all ktrace parameters into allocated structure ktr_io_params Ref-count the ktr_io_params structure instead of vnode/cred. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30257	2021-05-22 15:16:08 +03:00
Konstantin Belousov	a6144f713c	ktrace: do not stop tracing other processes if our cannot write to this vnode Other processes might still be able to write, make the decision to stop based on the per-process situation. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D30257	2021-05-22 15:16:08 +03:00
Edward Tomasz Napierala	4658877815	Move KTRUSERRET() from userret() to ast(). It's a really long detour - it writes ktrace entries to the filesystem - so the overhead of ast() won't make any difference. Reviewed by: kib Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D26404	2020-10-03 12:03:08 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Mateusz Guzik	0a1427c5ab	ktrace: provide ktrstat_error This eliminates a branch from its consumers trading it for an extra call if ktrace is enabled for curthread. Given that this is almost never true, the tradeoff is worth it.	2020-02-03 22:26:00 +00:00
Mateusz Guzik	b249ce48ea	vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427	2020-01-03 22:29:58 +00:00
Matt Macy	ad738f3791	Reduce overhead of ktrace checks in the common case. KTRPOINT() checks both if we are tracing _and_ if we are recursing within ktrace. The second condition is only ever executed if ktrace is actually enabled. This change moves the check out of the hot path in to the functions themselves. Discussed with mjg@ Reported by: mjg@ Approved by: sbruno@	2018-05-09 00:00:47 +00:00
John Baldwin	ffb6607984	Decode kevent structures logged via ktrace(2) in kdump. - Add a new KTR_STRUCT_ARRAY ktrace record type which dumps an array of structures. The structure name in the record payload is preceded by a size_t containing the size of the individual structures. Use this to replace the previous code that dumped the kevent arrays dumped for kevent(). kdump is now able to decode the kevent structures rather than dumping their contents via a hexdump. One change from before is that the 'changes' and 'events' arrays are not marked with separate 'read' and 'write' annotations in kdump output. Instead, the first array is the 'changes' array, and the second array (only present if kevent doesn't fail with an error) is the 'events' array. For kevent(), empty arrays are denoted by an entry with an array containing zero entries rather than no record. - Move kevent decoding tables from truss to libsysdecode. This adds three new functions to decode members of struct kevent: sysdecode_kevent_filter, sysdecode_kevent_flags, and sysdecode_kevent_fflags. kdump uses these helper functions to pretty-print kevent fields. - Move structure definitions for freebsd11 and freebsd32 kevent structures to <sys/event.h> so that they can be shared with userland. The 32-bit structures are only exposed if _WANT_KEVENT32 is defined. The freebsd11 structures are only exposed if _WANT_FREEBSD11_KEVENT is defined. The 32-bit freebsd11 structure requires both. - Decode freebsd11 kevent structures in truss for the compat11.kevent() system call. - Log 32-bit kevent structures via ktrace for 32-bit compat kevent() system calls. - While here, constify the 'void *data' argument to ktrstruct(). Reviewed by: kib (earlier version) MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D12470	2017-11-25 04:49:12 +00:00
Pedro F. Giffuni	51369649b0	sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.	2017-11-20 19:43:44 +00:00
Konstantin Belousov	1e4296c919	Ktracing kevent(2) calls with unusual arguments might leads to an overly large allocation requests. When ktrace-ing io, sys_kevent() allocates memory to copy the requested changes and reported events. Allocations are sized by the incoming syscall lengths arguments, which are user-controlled, and might cause overflow in calculations or too large allocations. Since io trace chunks are limited by ktr_geniosize, there is no sense it even trying to satisfy unbounded allocations. Export ktr_geniosize and clamp the buffers sizes in advance. PR: 217435 Reported by: Tim Newsham <tim.newsham@nccgroup.trust> Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-03-12 13:48:24 +00:00
Ed Maste	039644eca9	ANSYfy kern_ktrace.c and remove archaic register keyword Sponsored by: The FreeBSD Foundation	2017-01-20 14:59:56 +00:00
Ed Maste	69a2875821	Renumber license clauses in sys/kern to avoid skipping #3	2016-09-15 13:16:20 +00:00
Mateusz Guzik	7c34b35b57	ktrace: do a lockless check on fork to see if tracing is enabled This saves 2 lock acquisitions in the common case.	2016-08-10 15:25:44 +00:00
Pedro F. Giffuni	02abd40029	kernel: use our nitems() macro when it is available through param.h. No functional change, only trivial cases are done in this sweep, Discussed in: freebsd-current	2016-04-19 23:48:27 +00:00
Mateusz Guzik	4dd3a21fb3	ktrace: tidy up ktrstruct - minor style fixes - avoid doing strlen twice [1] PR: 206648 Submitted by: C Turt <ecturt gmail.com> (original version) [1]	2016-01-27 19:55:02 +00:00
Hans Petter Selasky	af3b2549c4	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
Glen Barber	37a107a407	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
Hans Petter Selasky	3da1cf1e88	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating childrens of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading cludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
Jilles Tjoelker	093e059c7d	ktrace: Use designated initializers for the data_lengths array. In the .o file, this only changes some line numbers (head amd64) because element 0 is no longer explicitly initialized. This should make bugs like FreeBSD-SA-14:12.ktrace less likely. Discussed with: des MFC after: 1 week	2014-06-06 14:49:00 +00:00
Robert Watson	4a14441044	Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h. MFC after: 3 weeks	2014-03-16 10:55:57 +00:00
Pawel Jakub Dawidek	3fded357af	Fix panic in ktrcapfail() when no capability rights are passed. While here, correct all consumers to pass NULL instead of 0 as we pass capability rights as pointers now, not uint64_t. Reported by: Daniel Peyrolon Tested by: Daniel Peyrolon Approved by: re (marius)	2013-09-18 19:26:08 +00:00
Pawel Jakub Dawidek	7008be5bd7	Change the cap_rights_t type from uint64_t to a structure that we can extend in the future in a backward compatible (API and ABI) way. The cap_rights_t represents capability rights. We used to use one bit to represent one right, but we are running out of spare bits. Currently the new structure provides place for 114 rights (so 50 more than the previous cap_rights_t), but it is possible to grow the structure to hold at least 285 rights, although we can make it even larger if 285 rights won't be enough. The structure definition looks like this: struct cap_rights { uint64_t cr_rights[CAP_RIGHTS_VERSION + 2]; }; The initial CAP_RIGHTS_VERSION is 0. The top two bits in the first element of the cr_rights[] array contain total number of elements in the array - 2. This means if those two bits are equal to 0, we have 2 array elements. The top two bits in all remaining array elements should be 0. The next five bits in all array elements contain array index. Only one bit is used and bit position in this five-bits range defines array index. This means there can be at most five array elements in the future. To define new right the CAPRIGHT() macro must be used. The macro takes two arguments - an array index and a bit to set, eg. #define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL) We still support aliases that combine few rights, but the rights have to belong to the same array element, eg: #define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL) #define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL) #define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP) There is new API to manage the new cap_rights_t structure: cap_rights_t *cap_rights_init(cap_rights_t *rights, ...); void cap_rights_set(cap_rights_t *rights, ...); void cap_rights_clear(cap_rights_t *rights, ...); bool cap_rights_is_set(const cap_rights_t *rights, ...); bool cap_rights_is_valid(const cap_rights_t *rights); void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src); void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src); bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little); Capability rights to the cap_rights_init(), cap_rights_set(), cap_rights_clear() and cap_rights_is_set() functions are provided by separating them with commas, eg: cap_rights_t rights; cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT); There is no need to terminate the list of rights, as those functions are actually macros that take care of the termination, eg: #define cap_rights_set(rights, ...) \ __cap_rights_set((rights), __VA_ARGS__, 0ULL) void __cap_rights_set(cap_rights_t *rights, ...); Thanks to using one bit as an array index we can assert in those functions that there are no two rights belonging to different array elements provided together. For example this is illegal and will be detected, because CAP_LOOKUP belongs to element 0 and CAP_PDKILL to element 1: cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL); Providing several rights that belongs to the same array's element this way is correct, but is not advised. It should only be used for aliases definition. This commit also breaks compatibility with some existing Capsicum system calls, but I see no other way to do that. This should be fine as Capsicum is still experimental and this change is not going to 9.x. Sponsored by: The FreeBSD Foundation	2013-09-05 00:09:56 +00:00
Konstantin Belousov	5050aa86cf	Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho	2012-10-22 17:50:54 +00:00
John Baldwin	88bf5036fc	Include the associated wait channel message for context switch ktrace records. kdump supports both the old and new messages. Submitted by: Andrey Zonov andrey zonov org MFC after: 1 week	2012-04-20 15:32:36 +00:00
John Baldwin	35818d2e94	Add new ktrace records for the start and end of VM faults. This gives a pair of records similar to syscall entry and return that a user can use to determine how long page faults take. The new ktrace records are enabled via the 'p' trace type, and are enabled in the default set of trace points. Reviewed by: kib MFC after: 2 weeks	2012-04-05 17:13:14 +00:00
Konstantin Belousov	526d0bd547	Fix found places where uio_resid is truncated to int. Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from the usermode. Discussed with: bde, das (previous versions) MFC after: 1 month	2012-02-21 01:05:12 +00:00
Eitan Adler	5a01b72672	- Fix ktrace leakage if error is set PR: kern/163098 Submitted by: Loganaden Velvindron <loganaden@devio.us> Approved by: sbruno@ MFC after: 1 month	2011-12-08 03:20:38 +00:00
Dag-Erling Smørgrav	e141be6f79	Revisit the capability failure trace points. The initial implementation only logged instances where an operation on a file descriptor required capabilities which the file descriptor did not have. By adding a type enum to struct ktr_cap_fail, we can catch other types of capability failures as well, such as disallowed system calls or attempts to wrap a file descriptor with more capabilities than it had to begin with.	2011-10-18 07:28:58 +00:00
Dag-Erling Smørgrav	c601ad8eeb	Add a new trace point, KTRFAC_CAPFAIL, which traces capability check failures. It is included in the default set for ktrace(1) and kdump(1).	2011-10-11 20:37:10 +00:00
Kip Macy	8451d0dd78	In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)	2011-09-16 13:58:51 +00:00
John Baldwin	e806d352d2	Fix several places to ignore processes that are not yet fully constructed. MFC after: 1 week	2011-04-06 17:47:22 +00:00
Dmitry Chagin	de60a5f38c	Style(9) fix. Fix indentation in comment, double ';' in variable declaration. MFC after: 1 Week	2011-03-05 20:54:17 +00:00
Dmitry Chagin	22ec040605	Partially reworked r219042. The reason for this is a bug at ktrops() where process dereferenced without having a lock. This might cause a panic if ktrace was runned with -p flag and the specified process exited between the dropping a lock and writing sv_flags. Since it is impossible to acquire sx lock while holding mtx switch to use asynchronous enqueuerequest() instead of writerequest(). Rename ktr_getrequest_ne() to more understandable name [1]. Requested by: jhb [1] MFC after: 1 Week	2011-03-05 20:36:42 +00:00
Dmitry Chagin	7705d4b24a	Introduce preliminary support of the show description of the ABI of traced process by adding two new events which records value of process sv_flags to the trace file at process creation/execing/exiting time. MFC after: 1 Month.	2011-02-25 22:05:33 +00:00
Dmitry Chagin	b4c20e5e37	ktrace_resize_pool() locking slightly reworked: 1) do not take a lock around the single atomic operation. 2) do not lose the invariant of lock by dropping/acquiring ktrace_mtx around free() or malloc(). MFC after: 1 Month.	2011-02-25 22:03:28 +00:00
Alexander Leidinger	de5b19526b	Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/ PMC/SYSV/...). No FreeBSD version bump, the userland application to query the features will be committed last and can serve as an indication of the availablility if needed. Sponsored by: Google Summer of Code 2010 Submitted by: kibab Reviewed by: arch@ (parts by rwatson, trasz, jhb) X-MFC after: to be determined in last commit with code from this project	2011-02-25 10:11:01 +00:00
John Baldwin	d680caab73	- When disabling ktracing on a process, free any pending requests that may be left. This fixes a memory leak that can occur when tracing is disabled on a process via disabling tracing of a specific file (or if an I/O error occurs with the tracefile) if the process's next system call is exit(). The trace disabling code clears p_traceflag, so exit1() doesn't do any KTRACE-related cleanup leading to the leak. I chose to make the free'ing of pending records synchronous rather than patching exit1(). - Move KTRACE-specific logic out of kern_(exec|exit|fork).c and into kern_ktrace.c instead. Make ktrace_mtx private to kern_ktrace.c as a result. MFC after: 1 month	2010-10-21 19:17:40 +00:00
John Baldwin	2b3fb61569	Fix a whitespace nit and remove a questioning comment. STAILQ_CONCAT() does require the STAILQ the existing list is being added to to already be initialized (it is CONCAT() vs MOVE()).	2010-08-19 16:38:58 +00:00
John Baldwin	fe41d17ab2	Keep the process locked when calling ktrops() or ktrsetchildren() instead of dropping the lock only to immediately reacquire it.	2010-08-17 21:34:19 +00:00
Gavin Atkinson	a0c87b747c	Add descriptions to a handful of sysctl nodes. PR: kern/148580 Submitted by: Galimov Albert <wtfcrap mail.ru> MFC after: 1 week	2010-08-09 14:48:31 +00:00
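
The message of commit 7008be5bd7 above describes how each capability right carries a one-hot array index in the five bits below the top two bits of a 64-bit word. The following is a minimal sketch of that layout, not the kernel's code: every `sk_`/`SK_` name is hypothetical, and the shift of 57 is an assumption derived from the commit text (64 bits minus 2 version bits minus 5 index bits). It shows why OR-ing rights from different array elements can be detected.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants derived from the commit text, not copied from a header. */
#define SK_INDEX_BITS	5			/* one-hot array index field */
#define SK_INDEX_SHIFT	(64 - 2 - SK_INDEX_BITS)	/* 57: below the two top bits */
#define SK_CAPRIGHT(idx, bit) \
	((1ULL << (SK_INDEX_SHIFT + (idx))) | (bit))

/* Example rights using the bit values quoted in the commit message. */
#define SK_CAP_LOOKUP	SK_CAPRIGHT(0, 0x0000000000000400ULL)
#define SK_CAP_FCHMOD	SK_CAPRIGHT(0, 0x0000000000002000ULL)
#define SK_CAP_PDKILL	SK_CAPRIGHT(1, 0x0000000000000800ULL)

/* Checked constructor: rejects out-of-range indexes and stray high bits. */
uint64_t
sk_capright(int idx, uint64_t bit)
{
	assert(idx >= 0 && idx < SK_INDEX_BITS);
	assert((bit >> SK_INDEX_SHIFT) == 0);
	return (SK_CAPRIGHT(idx, bit));
}

/*
 * Return the array index encoded in a right, or -1 when the index field
 * does not have exactly one bit set (for example, after OR-ing rights
 * that belong to different array elements).
 */
int
sk_right_index(uint64_t right)
{
	uint64_t field = (right >> SK_INDEX_SHIFT) & ((1ULL << SK_INDEX_BITS) - 1);
	int idx;

	if (field == 0 || (field & (field - 1)) != 0)
		return (-1);
	for (idx = 0; (field & 1) == 0; idx++)
		field >>= 1;
	return (idx);
}
```

With these definitions, an alias built from two element-0 rights still decodes to index 0, while mixing an element-0 right with an element-1 right leaves two index bits set and is rejected.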


182 Commits
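
Commits f885100773 and 1e4296c919 in the listing both concern user-controlled array lengths reaching an allocation before the system call has validated them. The sketch below illustrates that general pattern in userland C: reject non-positive counts, then clamp the copy to a fixed byte budget before allocating. It is not the kernel's `ktrstructarray()`; the function name, the `SK_MAX_COPY_BYTES` limit (standing in for a cap such as `ktr_geniosize`), and the use of `malloc()` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical byte budget; stands in for a limit such as ktr_geniosize. */
#define SK_MAX_COPY_BYTES	4096

/*
 * Copy up to nitems structures of struct_size bytes from src.
 * Returns a malloc()ed buffer and stores its length in *out_len, or
 * returns NULL (with *out_len == 0) for bogus input or allocation failure.
 */
void *
sk_copy_struct_array(const void *src, int nitems, size_t struct_size,
    size_t *out_len)
{
	size_t count, max_items, len;
	void *copy;

	assert(out_len != NULL);
	*out_len = 0;

	/* Reject negative or empty requests before any size arithmetic. */
	if (src == NULL || nitems <= 0)
		return (NULL);
	if (struct_size == 0 || struct_size > SK_MAX_COPY_BYTES)
		return (NULL);

	/* Clamp the element count so count * struct_size cannot overflow. */
	max_items = SK_MAX_COPY_BYTES / struct_size;
	count = (size_t)nitems;
	if (count > max_items)
		count = max_items;

	len = count * struct_size;
	copy = malloc(len);
	if (copy == NULL)
		return (NULL);
	memcpy(copy, src, len);
	*out_len = len;
	return (copy);
}
```

Checking the sign while the count is still an `int` matters here: converting a negative value to `size_t` first would turn it into a very large positive length.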