freebsd-dev

Author	SHA1	Message	Date
Mateusz Guzik	84267cacf4	sysctl: don't modify oid_running for static nodes It is necessary to prevent nodes from being destroyed while used, but static ones cannot be destroyed.	2014-12-28 19:24:01 +00:00
Mateusz Guzik	f84f8f9468	Now that sysctl_root is only called with sysctl lock in shared mode, update its assertion to require that. Update comment missed in r273400: sysctl_xlock/unlock -> sysctl_xlock/xunlock Noted by: jhb	2014-10-26 01:47:55 +00:00
Dag-Erling Smørgrav	b0d69dfad9	In all cases except CTLTYPE_STRING, penv is NULL here, so passing it indiscriminately to printf() and freeenv() is incorrect. Add a NULL check before freeenv(); as for printf(), we could use req.newptr instead, but we'd have to select the correct format string based on the type, and that's too much work for an error message, so just remove it.	2014-10-23 22:42:56 +00:00
Mateusz Guzik	fca7732078	Mark some more sysctl stuff shared-locked and MPSAFE.	2014-10-21 21:08:45 +00:00
Mateusz Guzik	b564c5d6aa	Make sysctl name2oid shared-locked as well. This is a follow-up to r273401.	2014-10-21 19:45:08 +00:00
Mateusz Guzik	efe0abddf5	Implement shared locking for sysctl.	2014-10-21 19:05:44 +00:00
Mateusz Guzik	580a011762	Rename sysctl_lock and _unlock to sysctl_xlock and _xunlock.	2014-10-21 19:02:26 +00:00
Davide Italiano	2be111bf7d	Follow up to r225617. In order to maximize the re-usability of kernel code in userland rename in-kernel getenv()/setenv() to kern_getenv()/kern_setenv(). This fixes a namespace collision with libc symbols. Submitted by: kmacy Tested by: make universe	2014-10-16 18:04:43 +00:00
Mateusz Guzik	30d58d6b39	Don't make a temporary copy of fixed sysctl strings.	2014-07-10 21:46:57 +00:00
Hans Petter Selasky	604bf9d37e	When getting the initial value of numeric tunables use the getenv_xxx() functions instead of strtoq(), because the getenv_xxx() functions include wrappers for various postfixes like G/M/K, which strtoq() doesn't do.	2014-07-05 06:12:48 +00:00
Hans Petter Selasky	6a3287f889	Fix regression issue after r267961. Handle special string case for SYSCTLs like previously. MFC after: 2 weeks Reported by: several people	2014-06-28 03:59:04 +00:00
Hans Petter Selasky	af3b2549c4	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
Glen Barber	37a107a407	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
Hans Petter Selasky	3da1cf1e88	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, since some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, since malloc()/free() are not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
Robert Watson	4a14441044	Update kernel inclusions of capability.h to use capsicum.h instead; some further refinement is required as some device drivers intended to be portable over FreeBSD versions rely on __FreeBSD_version to decide whether to include capability.h. MFC after: 3 weeks	2014-03-16 10:55:57 +00:00
Gleb Smirnoff	b5c32cf481	Remove identical vnet sysctl handlers, and handle CTLFLAG_VNET in the sysctl_root(). Note: SYSCTL_VNET_* macros can be removed as well. All is needed to virtualize a sysctl oid is set CTLFLAG_VNET on it. But for now keep macros in place to avoid large code churn. Sponsored by: Nginx, Inc.	2014-02-07 13:47:33 +00:00
Scott Long	f510415d84	Add a helpful message that can help point to why a sysctl tree removal failed Obtained from: Netflix MFC after: 3 days	2013-08-09 01:04:44 +00:00
Marius Strobl	db9066f798	- Use strdup(9) instead of reimplementing it. - Use __DECONST instead of strange casts. - Reduce code duplication and simplify name2oid(). PR: 176373 Submitted by: Christoph Mallon MFC after: 1 week	2013-03-01 18:49:14 +00:00
Marius Strobl	18716f9f4b	Update comments to reflect r246689.	2013-02-11 23:05:10 +00:00
Marius Strobl	bdc5f0172e	Make SYSCTL_{LONG,QUAD,ULONG,UQUAD}(9) work as advertised and also handle constant values. Reviewed by: kib MFC after: 3 days	2013-02-11 21:50:00 +00:00
Alan Cox	5730afc9b6	Handle spurious page faults that may occur in no-fault sections of the kernel. When access restrictions are added to a page table entry, we flush the corresponding virtual address mapping from the TLB. In contrast, when access restrictions are removed from a page table entry, we do not flush the virtual address mapping from the TLB. This is exactly as recommended in AMD's documentation. In effect, when access restrictions are removed from a page table entry, AMD's MMUs will transparently refresh a stale TLB entry. In short, this saves us from having to perform potentially costly TLB flushes. In contrast, Intel's MMUs are allowed to generate a spurious page fault based upon the stale TLB entry. Usually, such spurious page faults are handled by vm_fault() without incident. However, when we are executing no-fault sections of the kernel, we are not allowed to execute vm_fault(). This change introduces special-case handling for spurious page faults that occur in no-fault sections of the kernel. In collaboration with: kib Tested by: gibbs (an earlier version) I would also like to acknowledge Hiroki Sato's assistance in diagnosing this problem. MFC after: 1 week	2012-03-22 04:52:51 +00:00
Kip Macy	8451d0dd78	In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)	2011-09-16 13:58:51 +00:00
Robert Watson	ff66f6a404	Define two new sysctl node flags: CTLFLAG_CAPRD and CTLFLAG_CAPWR, which may be jointly referenced via the mask CTLFLAG_CAPRW. Sysctls with these flags are available in Capsicum's capability mode; other sysctl nodes are not. Flag several useful sysctls as available in capability mode, such as memory layout sysctls required by the run-time linker and malloc(3). Also expose access to randomness and available kernel features. A few sysctls are enabled to support name->MIB conversion; these may leak information to capability mode by virtue of providing resolution on names not flagged for access in capability mode. This is, generally, not a huge problem, but might be something to resolve in the future. Flag these cases with XXX comments. Submitted by: jonathan Sponsored by: Google, Inc.	2011-07-17 23:05:24 +00:00
Matthew D Fleming	3d08a76bbc	Use a name instead of a magic number for kern_yield(9) when the priority should not change. Fetch the td_user_pri under the thread lock. This is probably not necessary but a magic number also seems preferable to knowing the implementation details here. Requested by: Jason Behmer < jason DOT behmer AT isilon DOT com >	2011-05-13 05:27:58 +00:00
Jeff Roberson	e4cd31dd3c	- Merge changes to the base system to support OFED. These include a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND, and other miscellaneous small features.	2011-03-21 09:40:01 +00:00
Matthew D Fleming	e7ceb1e99b	Based on discussions on the svn-src mailing list, rework r218195: - entirely eliminate some calls to uio_yield() as being unnecessary, such as in a sysctl handler. - move should_yield() and maybe_yield() to kern_synch.c and move the prototypes from sys/uio.h to sys/proc.h - add a slightly more generic kern_yield() that can replace the functionality of uio_yield(). - replace source uses of uio_yield() with the functional equivalent, or in some cases do not change the thread priority when switching. - fix a logic inversion bug in vlrureclaim(), pointed out by bde@. - instead of using the per-cpu last switched ticks, use a per thread variable for should_yield(). With PREEMPTION, the only reasonable use of this is to determine if a lock has been held a long time and relinquish it. Without PREEMPTION, this is essentially the same as the per-cpu variable.	2011-02-08 00:16:36 +00:00
Matthew D Fleming	00f0e671ff	Explicitly wire the user buffer rather than doing it implicitly in sbuf_new_for_sysctl(9). This allows using an sbuf with a SYSCTL_OUT drain for extremely large amounts of data where the caller knows that appropriate references are held, and sleeping is not an issue. Inspired by: rwatson	2011-01-27 00:34:12 +00:00
Matthew D Fleming	73d6f8516d	Remove the CTLFLAG_NOLOCK as it seems to be both unused and unfunctional. Wiring the user buffer has only been done explicitly since r101422. Mark the kern.disks sysctl as MPSAFE since it is and it seems to have been mis-using the NOLOCK flag. Partially break the KPI (but not the KBI) for the sysctl_req 'lock' field since this member should be private and the "REQ_LOCKED" state seems meaningless now.	2011-01-26 22:48:09 +00:00
Matthew D Fleming	cbc134ad03	Introduce signed and unsigned version of CTLTYPE_QUAD, renaming existing uses. Rename sysctl_handle_quad() to sysctl_handle_64().	2011-01-19 23:00:25 +00:00
Matthew D Fleming	2fee06f087	Specify a CTLTYPE_FOO so that a future sysctl(8) change does not need to rely on the format string.	2011-01-18 21:14:18 +00:00
Matthew D Fleming	dd6312a7c1	Fix uninitialized variable warning that shows on Tinderbox but not my setup. (??) Submitted by: Michael Butler <imb at protected-networks dot net>	2010-11-29 21:53:21 +00:00
Matthew D Fleming	ccecef29d1	Do not hold the sysctl lock across a call to the handler. This fixes a general LOR issue where the sysctl lock had no good place in the hierarchy. One specific instance is #284 on http://sources.zabbadoz.net/freebsd/lor.html . Reviewed by: jhb MFC after: 1 month X-MFC-note: split oid_refcnt field for oid_running to preserve KBI	2010-11-29 18:18:07 +00:00
Matthew D Fleming	d0bb6f258b	Slightly modify the logic in sysctl_find_oid to reduce the indentation. There should be no functional change. MFC after: 3 days	2010-11-29 18:18:00 +00:00
Matthew D Fleming	5127ecb89c	Use the SYSCTL_CHILDREN macro in kern_sysctl.c to help de-obfuscate the code. MFC after: 3 days	2010-11-29 18:17:53 +00:00
Matthew D Fleming	4e6571599b	Re-add r212370 now that the LOR in powerpc64 has been resolved: Add a drain function for struct sysctl_req, and use it for a variety of handlers, some of which had to do awkward things to get a large enough SBUF_FIXEDLEN buffer. Note that some sysctl handlers were explicitly outputting a trailing NUL byte. This behaviour was preserved, though it should not be necessary. Reviewed by: phk (original patch)	2010-09-16 16:13:12 +00:00
Matthew D Fleming	404a593e28	Revert r212370, as it causes a LOR on powerpc. powerpc does a few unexpected things in copyout(9) and so wiring the user buffer is not sufficient to perform a copyout(9) while holding a random mutex. Requested by: nwhitehorn	2010-09-13 18:48:23 +00:00
Matthew D Fleming	dd67e2103c	Add a drain function for struct sysctl_req, and use it for a variety of handlers, some of which had to do awkward things to get a large enough FIXEDLEN buffer. Note that some sysctl handlers were explicitly outputting a trailing NUL byte. This behaviour was preserved, though it should not be necessary. Reviewed by: phk	2010-09-09 18:33:46 +00:00
Bjoern A. Zeeb	eb79e1c76e	Make it possible to change the vnet sysctl variables on jails with their own virtual network stack. Jails only inheriting a network stack cannot change anything that cannot be changed from within a prison. Reviewed by: rwatson, zec Approved by: re (kib)	2009-08-13 10:26:34 +00:00
Robert Watson	530c006014	Merge the remainder of kern_vimage.c and vimage.h into vnet.c and vnet.h, we now use jails (rather than vimages) as the abstraction for virtualization management, and what remained was specific to virtual network stacks. Minor cleanups are done in the process, and comments updated to reflect these changes. Reviewed by: bz Approved by: re (vimage blanket)	2009-08-01 19:26:27 +00:00
Bjoern A. Zeeb	a08362ce46	sysctl_msec_to_ticks is used with both virtualized and non-virtualized sysctls so we cannot use one common function. Add a macro to convert the arg1 in the virtualized case to vnet.h to not expose the maths to all over the code. Add a wrapper for the single virtualized call, properly handling arg1 and call the default implementation from there. Convert the two other places to use the new macro. Reviewed by: rwatson Approved by: re (kib)	2009-07-21 21:58:55 +00:00
Robert Watson	eddfbb763d	Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing them in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)	2009-07-14 22:48:30 +00:00
Bjoern A. Zeeb	ebd8672cc3	Add explicit includes for jail.h to the files that need them and remove the "hidden" one from vimage.h.	2009-06-17 15:01:01 +00:00
Jamie Gritton	9ed47d01eb	Get vnets from creds instead of threads where they're available, and from passed threads instead of curthread. Reviewed by: zec, julian Approved by: bz (mentor)	2009-06-15 19:01:53 +00:00
Robert Watson	bcf11e8d00	Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include. Discussed with: pjd	2009-06-05 14:55:22 +00:00
Dag-Erling Smørgrav	433e2f4763	Remove do-nothing code that was required to dirty the old buffer on Alpha. Coverity ID: 838 Approved by: jhb, alc	2009-05-15 21:34:58 +00:00
Konstantin Belousov	6b72d8db47	Revert r192094. The revision caused problems for sysctl(3) consumers that expect that oldlen is filled with required buffer length even when supplied buffer is too short and returned error is ENOMEM. Redo the fix for kern.proc.filedesc, by reverting the req->oldidx when remaining buffer space is too short for the current kinfo_file structure. Also, only ignore ENOMEM. We have to convert ENOMEM to no error condition to keep existing interface for the sysctl, though. Reported by: ed, Florian Smeets <flo kasimir com> Tested by: pho	2009-05-15 14:41:44 +00:00
John Baldwin	3e829b18d6	- Use a separate sx lock to try to limit the number of concurrent userland sysctl requests to avoid wiring too much user memory. Only grab this lock if the user's old buffer is larger than a page as a tradeoff to allow more concurrency for common small requests. - Just use a shared lock on the sysctl tree for user sysctl requests now. MFC after: 1 week	2009-05-14 22:01:32 +00:00
Konstantin Belousov	e401a6a54e	Do not advance req->oldidx when sysctl_old_user returns an error due to copyout failure or short buffer. The latter breaks the usermode iterators of the sysctl results that pack arbitrary number of variable-sized structures. Iterator expects that kernel filled exactly oldlen bytes, and tries to interpret half-filled or garbage structure at the end of the buffer. In particular, kinfo_getfile(3) segfaulted. Reported and tested by: pho MFC after: 3 weeks	2009-05-14 10:54:57 +00:00
Marko Zec	f6dfe47a14	Permit building kernels with options VIMAGE, restricted to only a single active network stack instance. Turning on options VIMAGE at compile time yields the following changes relative to default kernel build: 1) V_ accessor macros for virtualized variables resolve to structure fields via base pointers, instead of being resolved as fields in global structs or plain global variables. As an example, V_ifnet becomes: options VIMAGE: ((struct vnet_net *) vnet_net)->_ifnet default build: vnet_net_0._ifnet options VIMAGE_GLOBALS: ifnet 2) INIT_VNET_* macros will declare and set up base pointers to be used by V_ accessor macros, instead of resolving to whitespace: INIT_VNET_NET(ifp->if_vnet); becomes struct vnet_net *vnet_net = (ifp->if_vnet)->mod_data[VNET_MOD_NET]; 3) Memory for vnet modules registered via vnet_mod_register() is now allocated at run time in sys/kern/kern_vimage.c, instead of per vnet module structs being declared as globals. If required, vnet modules can now request the framework to provide them with allocated bzeroed memory by filling in the vmi_size field in their vmi_modinfo structures. 4) structs socket, ifnet, inpcbinfo, tcpcb and syncache_head are extended to hold a pointer to the parent vnet. options VIMAGE builds will fill in those fields as required. 5) curvnet is introduced as a new global variable in options VIMAGE builds, always pointing to the default and only struct vnet. 6) struct sysctl_oid has been extended with additional two fields to store major and minor virtualization module identifiers, oid_v_subs and oid_v_mod. SYSCTL_V_* family of macros will fill in those fields accordingly, and store the offset in the appropriate vnet container struct in oid_arg1. In sysctl handlers dealing with virtualized sysctls, the SYSCTL_RESOLVE_V_ARG1() macro will compute the address of the target variable and make it available in arg1 variable for further processing. Unused fields in structs vnet_inet, vnet_inet6 and vnet_ipfw have been deleted. Reviewed by: bz, rwatson Approved by: julian (mentor)	2009-04-30 13:36:26 +00:00
John Baldwin	a56be37e68	Add a new type of KTRACE record for sysctl(3) invocations. It uses the internal sysctl_sysctl_name() handler to map the MIB array to a string name and logs this name in the trace log. This can be useful to see exactly which sysctls a thread is invoking. MFC after: 1 month	2009-03-11 21:48:36 +00:00


239 Commits