freebsd-skq

Author	SHA1	Message	Date
Edward Tomasz Napierala	5d1d844a77	kern_linkat: modify to accept AT_ flags instead of FOLLOW/NOFOLLOW This makes this API match other kern_xxxat() functions. Reviewed By: kib Sponsored By: EPSRC Differential Revision: https://reviews.freebsd.org/D29776	2021-04-25 14:13:12 +01:00
Robert Watson	af14713d49	Support run-time configuration of the PIPE_MINDIRECT threshold. PIPE_MINDIRECT determines at what (blocking) write size one-copy optimizations are applied in pipe(2) I/O. That threshold hasn't been tuned since the 1990s when this code was originally committed, and allowing run-time reconfiguration will make it easier to assess whether contemporary microarchitectures would prefer a different threshold. (On our local RPi4 boards, the 8k default would ideally be at least 32k, but it's not clear how generalizable that observation is.) MFC after: 3 weeks Reviewers: jrtc27, arichardson Differential Revision: https://reviews.freebsd.org/D29819	2021-04-24 20:04:28 +01:00
Mark Johnston	8e8f1cc9bb	Re-enable network ioctls in capability mode This reverts a portion of `274579831b` ("capsicum: Limit socket operations in capability mode") as at least rtsol and dhcpcd rely on being able to configure network interfaces while in capability mode. Reported by: bapt, Greg V Sponsored by: The FreeBSD Foundation	2021-04-23 09:22:49 -04:00
Warner Losh	df456a1fcf	newbus: style nit (align comments) Sponsored by: Netflix	2021-04-21 15:37:24 -06:00
Warner Losh	1eebd6158c	newbus: Optimize/Simplify kobj_class_compile_common a little "i" is not used in this loop at all. There's no need to initialize and increment it. Reviewed by: markj@ Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D29898	2021-04-21 15:37:24 -06:00
Konstantin Belousov	54f98c4dbf	vn_open_vnode(): handle error when fp == NULL If VOP_ADD_WRITECOUNT() or adv locking failed, so VOP_CLOSE() needs to be called, we cannot use fp fo_close() when there is no fp. This occurs when e.g. kernel code directly calls vn_open() instead of the open(2) syscall. In this case, VOP_CLOSE() can be called directly, after possible lock upgrade. Reported by: nvass@gmx.com PR: 255119 Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D29830	2021-04-21 18:06:51 +03:00
Konstantin Belousov	ecfbddf0cd	sysctl vm.objects: report backing object and swap use For anonymous objects, provide a handle kvo_me naming the object, and report the handle of the backing object. This allows userspace to deconstruct the shadow chain. Right now the handle is the address of the object in KVA, but this is not guaranteed. For the same anonymous objects, report the swap space used for actually swapped out pages, in kvo_swapped field. I do not believe that it is useful to report full 64bit counter there, so only uint32_t value is returned, clamped to the max. For kinfo_vmentry, report anonymous object handle backing the entry, so that the shadow chain for the specific mapping can be deconstructed. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29771	2021-04-19 21:32:01 +03:00
Konstantin Belousov	4342ba184c	sysctl_handle_string: do not malloc when SYSCTL_IN cannot fault In particular, this avoids malloc(9) calls when from early tunable handling, with no working malloc yet. Reported and tested by: mav Sponsored by: The FreeBSD Foundation MFC after: 1 week	2021-04-19 21:32:01 +03:00
Konstantin Belousov	578c26f31c	linkat(2): check NIRES_EMPTYPATH on the first fd arg Reported by: arichardson Reviewed by: markj MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29834	2021-04-19 21:32:01 +03:00
Warner Losh	571a1a64b1	Minor style tidy: if( -> if ( Fix a few 'if(' to be 'if (' in a few places, per style(9) and overwhelming usage in the rest of the kernel / tree. MFC After: 3 days Sponsored by: Netflix	2021-04-18 11:19:15 -06:00
Warner Losh	f1f9870668	Minor style cleanup We prefer 'while (0)' to 'while(0)' according to grep and style(9)'s space after keyword rule. Remove a few stragglers of the latter. Many of these usages were inconsistent within the file. MFC After: 3 days Sponsored by: Netflix	2021-04-18 11:14:17 -06:00
Konstantin Belousov	bbf7a4e878	O_PATH: allow vnode kevent filter on such files if VREAD access is checked as allowed during open Requested by: wulf Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323	2021-04-15 12:49:18 +03:00
Konstantin Belousov	f9b923af34	O_PATH: Allow to open symlink When O_NOFOLLOW is specified, namei() returns the symlink itself. In this case, open(O_PATH) should be allowed, to denote the location of symlink itself. Prevent O_EXEC in this case, execve(2) code is not ready to try to execute symlinks. Reported by: wulf Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323	2021-04-15 12:49:09 +03:00
Konstantin Belousov	a5970a529c	Make files opened with O_PATH to not block non-forced unmount by only keeping hold count on the vnode, instead of the use count. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323	2021-04-15 12:48:27 +03:00
Konstantin Belousov	8d9ed174f3	open(2): Implement O_PATH Reviewed by: markj Tested by: pho Discussed with: walker.aj325_gmail.com, wulf Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323	2021-04-15 12:48:24 +03:00
Konstantin Belousov	509124b626	Add AT_EMPTY_PATH for several *at(2) syscalls It is currently allowed to fchownat(2), fchmodat(2), fchflagsat(2), utimensat(2), fstatat(2), and linkat(2). For linkat(2), PRIV_VFS_FHOPEN privilege is required to exercise the flag. It allows to link any open file. Requested by: trasz Tested by: pho, trasz Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D29111	2021-04-15 12:48:11 +03:00
Konstantin Belousov	437c241d0c	vfs_vnops.c: Make vn_statfile() non-static Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323	2021-04-15 12:47:56 +03:00
Konstantin Belousov	42be0a7b10	Style. Add missed spaces, wrap long lines. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29323	2021-04-15 12:47:46 +03:00
Mateusz Guzik	4f0279e064	cache: extend mismatch vnode assert print to include the name	2021-04-15 07:55:43 +00:00
Mark Johnston	29bb6c19f0	domainset: Define additional global policies Add global definitions for first-touch and interleave policies. The former may be useful for UMA, which implements a similar policy without using domainset iterators. No functional change intended. Reviewed by: mav MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29104	2021-04-14 13:03:33 -04:00
Konstantin Belousov	75c5cf7a72	filt_timerexpire: avoid process lock recursion Found by: syzkaller Reported and reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29746	2021-04-14 10:53:28 +03:00
Konstantin Belousov	5cc1d19941	realtimer_expire: avoid proc lock recursion when called from itimer_proc_continue() It is fine to drop the process lock there, process cannot exit until its timers are cleared. Found by: syzkaller Reported and reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29746	2021-04-14 10:53:19 +03:00
Konstantin Belousov	116f26f947	sbuf_uionew(): sbuf_new() takes int as length and length should be not less than SBUF_MINSIZE Reported and tested by: pho Noted and reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D29752	2021-04-14 10:23:20 +03:00
Mark Johnston	06a53ecf24	malloc: Add state transitions for KASAN - Reuse some REDZONE bits to keep track of the requested and allocated sizes, and use that to provide red zones. - As in UMA, disable memory trashing to avoid unnecessary CPU overhead. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29461	2021-04-13 17:42:21 -04:00
Mark Johnston	f1c3adefd9	execve: Mark exec argument buffers We cache mapped execve argument buffers to avoid the overhead of TLB shootdowns. Mark them invalid when they are freed to the cache. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29460	2021-04-13 17:42:21 -04:00
Mark Johnston	b261bb4057	vfs: Add KASAN state transitions for vnodes vnodes are a bit special in that they may exist on per-CPU lists even while free. Add a KASAN-only destructor that poisons regions of each vnode that are not expected to be accessed after a free. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29459	2021-04-13 17:42:21 -04:00
Mark Johnston	6faf45b34b	amd64: Implement a KASAN shadow map The idea behind KASAN is to use a region of memory to track the validity of buffers in the kernel map. This region is the shadow map. The compiler inserts calls to the KASAN runtime for every emitted load and store, and the runtime uses the shadow map to decide whether the access is valid. Various kernel allocators call kasan_mark() to update the shadow map. Since the shadow map tracks only accesses to the kernel map, accesses to other kernel maps are not validated by KASAN. UMA_MD_SMALL_ALLOC is disabled when KASAN is configured to reduce usage of the direct map. Currently we have no mechanism to completely eliminate uses of the direct map, so KASAN's coverage is not comprehensive. The shadow map uses one byte per eight bytes in the kernel map. In pmap_bootstrap() we create an initial set of page tables for the kernel and preloaded data. When pmap_growkernel() is called, we call kasan_shadow_map() to extend the shadow map. kasan_shadow_map() uses pmap_kasan_enter() to allocate memory for the shadow region and map it. Reviewed by: kib MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29417	2021-04-13 17:42:20 -04:00
Mark Johnston	38da497a4d	Add the KASAN runtime KASAN enables the use of LLVM's AddressSanitizer in the kernel. This feature makes use of compiler instrumentation to validate memory accesses in the kernel and detect several types of bugs, including use-after-frees and out-of-bounds accesses. It is particularly effective when combined with test suites or syzkaller. KASAN has high CPU and memory usage overhead and so is not suited for production environments. The runtime and pmap maintain a shadow of the kernel map to store information about the validity of memory mapped at a given kernel address. The runtime implements a number of functions defined by the compiler ABI. These are prefixed by __asan. The compiler emits calls to __asan_load() and __asan_store() around memory accesses, and the runtime consults the shadow map to determine whether a given access is valid. kasan_mark() is called by various kernel allocators to update state in the shadow map. Updates to those allocators will come in subsequent commits. The runtime also defines various interceptors. Some low-level routines are implemented in assembly and are thus not amenable to compiler instrumentation. To handle this, the runtime implements these routines on behalf of the rest of the kernel. The sanitizer implementation validates memory accesses manually before handing off to the real implementation. The sanitizer in a KASAN-configured kernel can be disabled by setting the loader tunable debug.kasan.disable=1. Obtained from: NetBSD MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29416	2021-04-13 17:42:20 -04:00
Mitchell Horne	2816bd8442	rmlock(9): add an RM_DUPOK flag Allows for duplicate locks to be acquired without witness complaining. Similar flags exist already for rwlock(9) and sx(9). Reviewed by: markj MFC after: 3 days Sponsored by: NetApp, Inc. Sponsored by: Klara, Inc. NetApp PR: 52 Differential Revision: https://reviews.freebsd.org/D29683n	2021-04-12 11:42:21 -03:00
Mark Johnston	dfff37765c	Rename struct device to struct _device types.h defines device_t as a typedef of struct device *. struct device is defined in subr_bus.c and almost all of the kernel uses device_t. The LinuxKPI also defines a struct device, so type confusion can occur. This causes bugs and ambiguity for debugging tools. Rename the FreeBSD struct device to struct _device. Reviewed by: gbe (man pages) Reviewed by: rpokala, imp, jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29676	2021-04-12 09:32:30 -04:00
Andrew Turner	5d2d599d3f	Create VM_MEMATTR_DEVICE on all architectures This is intended to be used with memory mapped IO, e.g. from bus_space_map with no flags, or pmap_mapdev. Use this new memory type in the map request configured by resource_init_map_request, and in pciconf. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D29692	2021-04-12 06:15:31 +00:00
Konstantin Belousov	a091c35323	ptrace: restructure comments around reparenting on PT_DETACH style code, and use {} for both branches. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2021-04-11 14:44:30 +03:00
Konstantin Belousov	9d7e450b64	ptrace: remove dead call to FIX_SSTEP() It was an alias for procfs_fix_sstep() long time ago. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2021-04-11 14:44:30 +03:00
Piotr Pawel Stefaniak	a212f56d10	Balance parentheses in sysctl descriptions	2021-04-11 10:30:55 +02:00
Konstantin Belousov	2fd1ffefaa	Stop arming kqueue timers on knote owner suspend or terminate This way, even if the process specified very tight reschedule intervals, it should be stoppable/killable. Reported and reviewed by: markj Tested by: markj, pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D29106	2021-04-09 23:43:51 +03:00
Konstantin Belousov	533e5057ed	Add helper for kqueue timers callout scheduling Reviewed by: markj Tested by: markj, pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D29106	2021-04-09 23:42:56 +03:00
Konstantin Belousov	4d27d8d2f3	Stop arming realtime posix process timers on suspend or terminate Reported and reviewed by: markj Tested by: markj, pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D29106	2021-04-09 23:42:51 +03:00
Konstantin Belousov	dc47fdf131	Stop arming periodic process timers on suspend or terminate Reported and reviewed by: markj Tested by: markj, pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D29106	2021-04-09 23:42:44 +03:00
Mateusz Guzik	72b3b5a941	vfs: replace vfs_smr_quiesce with vfs_smr_synchronize This ends up using a smr specific method. Suggested by: markj Tested by: pho	2021-04-08 11:14:45 +00:00
Mark Johnston	0f07c234ca	Remove more remnants of sio(4) Reviewed by: imp MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29626	2021-04-07 14:33:02 -04:00
Mark Johnston	274579831b	capsicum: Limit socket operations in capability mode Capsicum did not prevent certain privileged networking operations, specifically creation of raw sockets and network configuration ioctls. However, these facilities can be used to circumvent some of the restrictions that capability mode is supposed to enforce. Add capability mode checks to disallow network configuration ioctls and creation of sockets other than PF_LOCAL and SOCK_DGRAM/STREAM/SEQPACKET internet sockets. Reviewed by: oshogbo Discussed with: emaste Reported by: manu Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D29423	2021-04-07 14:32:56 -04:00
Mateusz Guzik	13b3862ee8	cache: update an assert on CACHE_FPL_STATUS_ABORTED Since symlink support it can get upgraded to CACHE_FPL_STATUS_DESTROYED. Reported by: bdrewery	2021-04-06 22:31:58 +02:00
Mark Johnston	2425f5e912	mount: Disallow mounting over a jail root Discussed with: jamie Approved by: so Security: CVE-2020-25584 Security: FreeBSD-SA-21:10.jail_mount	2021-04-06 14:49:36 -04:00
Edward Tomasz Napierala	7f6157f7fd	lock_delay(9): improve interaction with restrict_starvation After `e7a5b3bd05`, the la->delay value was adjusted after being set by the starvation_limit code block, which is wrong. Reported By: avg Reviewed By: avg Fixes: `e7a5b3bd05` Sponsored By: NetApp, Inc. Sponsored By: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D29513	2021-04-03 13:08:53 +01:00
Mark Johnston	52a99c72b5	sendfile: Fix error initialization in sendfile_getobj() Reviewed by: chs, kib Reported by: jhb Fixes: `faa998f6ff` MFC after: 1 day Differential Revision: https://reviews.freebsd.org/D29540	2021-04-02 17:42:38 -04:00
Richard Scheffenegger	cad4fd0365	Make sbuf_drain safe for external use While sbuf_drain was an internal function, two KASSERTS checked the sanity of it being called. However, an external caller may be ignorant if there is any data to drain, or if an error has already accumulated. Be nice and return immediately with the accumulated error. MFC after: 2 weeks Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D29544	2021-04-02 20:12:11 +02:00
Konstantin Belousov	aa3ea612be	x86: remove gcov kernel support Reviewed by: jhb Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D29529	2021-04-02 15:41:51 +03:00
Mateusz Guzik	f79bd71def	cache: add high level overview Differential Revision: https://reviews.freebsd.org/D28675	2021-04-02 05:11:05 +02:00
Mateusz Guzik	dc532884d5	cache: fix resizing in face of lockless lookup Reported by: pho Tested by: pho	2021-04-02 05:11:05 +02:00
Lawrence Stewart	1eb402e47a	stats(3): Improve t-digest merging of samples which result in mu adjustment underflow. Allow the calculation of the mu adjustment factor to underflow instead of rejecting the VOI sample from the digest and logging an error. This trades off some (currently unquantified) additional centroid error in exchange for better fidelity of the distribution's density, which is the right trade off at the moment until follow up work to better handle and track accumulated error can be undertaken. Obtained from: Netflix MFC after: immediately	2021-04-02 13:17:53 +11:00
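
Commit 6faf45b34b above describes the KASAN shadow map as using one shadow byte per eight bytes of the kernel map, consulted on every load and store. The following is a minimal user-space sketch of that idea, not the FreeBSD implementation. The names `shadow_init`, `shadow_mark`, `shadow_check` and `REGION_SIZE` are hypothetical, and the shadow byte encoding (0 means all eight bytes valid, 1 to 7 means only that many leading bytes valid, negative means poisoned) is the general AddressSanitizer convention, assumed here rather than taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHADOW_SCALE 8    /* one shadow byte per eight tracked bytes (from the commit message) */
#define REGION_SIZE  256  /* hypothetical size of the tracked region, in bytes */

/* Shadow of a pretend region; offsets stand in for kernel addresses. */
static int8_t shadow[REGION_SIZE / SHADOW_SCALE];

/* Poison everything: no byte is valid until an allocator marks it. */
static void
shadow_init(void)
{
	memset(shadow, -1, sizeof(shadow));
}

/*
 * Mark the first `valid` bytes of [off, off + size) accessible and the
 * remainder poisoned, as an allocator might for a rounded-up allocation.
 * Assumed encoding: 0 = whole granule valid, 1..7 = leading bytes valid,
 * negative = poisoned.
 */
static void
shadow_mark(size_t off, size_t size, size_t valid)
{
	assert(off % SHADOW_SCALE == 0 && size % SHADOW_SCALE == 0);
	assert(off + size <= REGION_SIZE && valid <= size);

	for (size_t i = 0; i < size; i += SHADOW_SCALE) {
		size_t idx = (off + i) / SHADOW_SCALE;

		if (valid >= i + SHADOW_SCALE)
			shadow[idx] = 0;
		else if (valid > i)
			shadow[idx] = (int8_t)(valid - i);
		else
			shadow[idx] = -1;
	}
}

/* Return true if every byte of [off, off + len) is marked accessible. */
static bool
shadow_check(size_t off, size_t len)
{
	if (off > REGION_SIZE || len > REGION_SIZE - off)
		return (false);

	for (size_t a = off; a < off + len; a++) {
		int8_t s = shadow[a / SHADOW_SCALE];

		if (s == 0)
			continue;
		if (s < 0 || (a % SHADOW_SCALE) >= (size_t)s)
			return (false);
	}
	return (true);
}
```

In this sketch, marking a 16-byte slot with 13 valid bytes makes a 13-byte access at its start pass the check, while a 14-byte access fails because it reaches into the red zone at the end of the slot.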


18284 Commits