freebsd-dev

Author	SHA1	Message	Date
Brooks Davis	dd51fec3b9	Copyout a whole int to cpuset_domain's policy pointer. The previous code only copied 16-bits and corrupted the target int. Reviewed by: kib, markj Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14611	2018-03-09 00:50:40 +00:00
Jeff Roberson	3f289c3fcf	Implement 'domainset', a cpuset based NUMA policy mechanism. This allows userspace to control NUMA policy administratively and programmatically. Implement domainset based iterators in the page layer. Remove the now legacy numa_* syscalls. Cleanup some header polution created by having seq.h in proc.h. Reviewed by: markj, kib Discussed with: alc Tested by: pho Sponsored by: Netflix, Dell/EMC Isilon Differential Revision: https://reviews.freebsd.org/D13403	2018-01-12 22:48:23 +00:00
Pedro F. Giffuni	8a36da99de	sys/kern: adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts.	2017-11-27 15:20:12 +00:00
Jung-uk Kim	a1d0659ca9	Fix size to copyout(9) for cpuset_getid(2). MFC after: 3 days	2017-08-22 20:46:29 +00:00
Allan Jude	f299c47b52	Allow cpuset_{get,set}affinity in capabilities mode bhyve was recently sandboxed with capsicum, and needs to be able to control the CPU sets of its vcpu threads Reviewed by: emaste, oshogbo, rwatson MFC after: 2 weeks Sponsored by: ScaleEngine Inc. Differential Revision: https://reviews.freebsd.org/D10170	2017-05-24 00:58:30 +00:00
Conrad Meyer	29dfb631d8	Extend cpuset_get/setaffinity() APIs Add IRQ placement-only and ithread-only API variants. intr_event_bind has been extended with sibling methods, as it has many more callsites in existing code. Reviewed by: kib@, adrian@ (earlier version) Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D10586	2017-05-03 18:41:08 +00:00
Gleb Smirnoff	6286dc78d4	Remove unneeded include of vm_phys.h.	2017-04-17 16:51:04 +00:00
Edward Tomasz Napierala	96ee43103d	Add kern_cpuset_getaffinity() and kern_cpuset_getaffinity(), and use it in compats instead of their sys_*() counterparts. Reviewed by: kib, jhb, dchagin MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D9383	2017-02-05 13:24:54 +00:00
Edward Tomasz Napierala	ea2ebdc19e	Add kern_cpuset_getid() and kern_cpuset_setid(), and use them in compat32 instead of their sub_*() counterparts. Reviewed by: jhb@, kib@ MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D9382	2017-01-31 15:11:23 +00:00
John Baldwin	62d70a8174	Add more fine-grained kernel options for NUMA support. VM_NUMA_ALLOC is used to enable use of domain-aware memory allocation in the virtual memory system. DEVICE_NUMA is used to enable affinity reporting for devices such as bus_get_domain(). MAXMEMDOM must still be set to a value greater than for any NUMA support to be effective. Note that 'cpuset -gd' always works if MAXMEMDOM is enabled and the system supports NUMA. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D5782	2016-04-09 13:58:04 +00:00
Adrian Chadd	5bbb2169d2	Un-static cpuset_which() - it's useful in other contexts, such as some CPU set operations in my upcoming NUMA work. Tested/compiled: * i386 (run) * amd64 (run) * mips (run) * mips64 (run) * armv6 (built) Sponsored by: Norse Corp, Inc.	2015-06-26 04:14:05 +00:00
Jonathan Anderson	60aa2c85fa	Allow sizeof(cpuset_t) to be queried in capability mode. This allows functions that retrieve and inspect pthread_attr_t objects to work correctly: querying the cpuset_t size is part of querying CPU affinity information, which is part of creating a complete pthread_attr_t. Approved by: rwatson (mentor) Reviewed by: pjd Sponsored by: NSERC	2015-05-14 15:14:03 +00:00
John Baldwin	bbf686ed60	Reject attempts to read the cpuset mask of a negative domain ID.	2015-01-08 19:11:14 +00:00
John Baldwin	c0ae66888b	Create a cpuset mask for each NUMA domain that is available in the kernel via the global cpuset_domain[] array. To export these to userland, add a CPU_WHICH_DOMAIN level that can be used to fetch the mask for a specific domain. Add a -d flag to cpuset(1) that can be used to fetch the mask for a given domain. Differential Revision: https://reviews.freebsd.org/D1232 Submitted by: jeff (kernel bits) Reviewed by: adrian, jeff	2015-01-08 15:53:13 +00:00
Hans Petter Selasky	f0188618f2	Fix multiple incorrect SYSCTL arguments in the kernel: - Wrong integer type was specified. - Wrong or missing "access" specifier. The "access" specifier sometimes included the SYSCTL type, which it should not, except for procedural SYSCTL nodes. - Logical OR where binary OR was expected. - Properly assert the "access" argument passed to all SYSCTL macros, using the CTASSERT macro. This applies to both static- and dynamically created SYSCTLs. - Properly assert the the data type for both static and dynamic SYSCTLs. In the case of static SYSCTLs we only assert that the data pointed to by the SYSCTL data pointer has the correct size, hence there is no easy way to assert types in the C language outside a C-function. - Rewrote some code which doesn't pass a constant "access" specifier when creating dynamic SYSCTL nodes, which is now a requirement. - Updated "EXAMPLES" section in SYSCTL manual page. MFC after: 3 days Sponsored by: Mellanox Technologies	2014-10-21 07:31:21 +00:00
Adrian Chadd	7f7528fc79	Modify cpuset_setithread() to take a CPU ID as an integer, not a char. We're going to end up having > 254 CPUs at some point.	2014-09-16 01:21:47 +00:00
Alexander V. Chernikov	c1d9ecf2be	Fix error handling in cpuset_setithread() introduced in r267716. Noted by: kib MFC after: 1 week	2014-09-13 13:46:16 +00:00
Alexander V. Chernikov	811985398d	Permit changing cpu mask for cpu set 1 in presence of drivers binding their threads to particular CPU. Changing ithread cpu mask is now performed by special cpuset_setithread(). It creates additional cpuset root group on first bind invocation. No objection: jhb Tested by: hiren MFC after: 2 weeks Sponsored by: Yandex LLC	2014-06-22 11:32:23 +00:00
John Baldwin	cd32bd7ad1	Several improvements to rmlock(9). Many of these are based on patches provided by Isilon. - Add an rm_assert() supporting various lock assertions similar to other locking primitives. Because rmlocks track readers the assertions are always fully accurate unlike rw_assert() and sx_assert(). - Flesh out the lock class methods for rmlocks to support sleeping via condvars and rm_sleep() (but only while holding write locks), rmlock details in 'show lock' in DDB, and the lc_owner method used by dtrace. - Add an internal destroyed cookie so that API functions can assert that an rmlock is not destroyed. - Make use of rm_assert() to add various assertions to the API (e.g. to assert locks are held when an unlock routine is called). - Give RM_SLEEPABLE locks their own lock class and always use the rmlock's own lock_object with WITNESS. - Use THREAD_NO_SLEEPING() / THREAD_SLEEPING_OK() to disallow sleeping while holding a read lock on an rmlock. Submitted by: andre Obtained from: EMC/Isilon	2013-06-25 18:44:15 +00:00
Jeff Roberson	17a2737732	- Add a BIT_FFS() macro and use it to replace cpusetffs_obj() Discussed with: attilio Sponsored by: EMC / Isilon Storage Division	2013-06-13 20:46:03 +00:00
John Baldwin	c9813d0a37	Do not compare the existing mask of a cpuset with a new mask when changing the mask of a cpuset. Also, change the cpuset's mask before updating the masks of all children. Previously changing a cpuset's mask first required setting the mask to a super-set of both the old and new masks and then changing it a second time to the new mask.	2013-06-06 14:43:19 +00:00
Attilio Rao	d4a2ab8c07	Post r222812 KTR_CPUMASK started being initialized only as a tunable handler and not more statically. Unfortunately, it seems that this is not ideal for new platform bringup and boot low level development (which needs ktr_cpumask to be effective before tunables can be setup). Because of this, add a way to statically initialize cpusets, by passing an list of initializers, divided by commas. Also, provide a way to enforce an all-set mask, for above mentioned initializers. This imposes some differences on how KTR_CPUMASK is setup now as a kernel option, and in particular this makes the words specifications backward wrt. what is currently in -CURRENT. In order to avoid mismatches between KTR_CPUMASK definition and other way to setup the mask (tunable, sysctl) and to print it, change the ordering how cpusetobj_print() and cpusetobj_scan() acquire the words belonging to the set. Please give a look to sys/conf/NOTES in order to understand how the new format is supposed to work. Also, ktr manpages will be updated shortly by gjb which volountereed for this. This patch won't be merged because it changes a POLA (at least from the theoretical standpoint) and this is however a patch that proves to be effective only in development environments. Requested by: rpaulo Reviewed by: jeff, rpaulo	2012-08-30 21:22:47 +00:00
Kevin Lo	2b69bb1f27	Add a missing curly bracket	2011-12-05 10:34:52 +00:00
Kip Macy	8451d0dd78	In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)	2011-09-16 13:58:51 +00:00
Attilio Rao	e370959707	Fix KTR_CPUMASK in order to accept a string representing a cpuset_t. This introduce all the underlying support for making this possible (via the function cpusetobj_strscan() and keeps ktr_cpumask exported. sparc64 implements its own assembly primitives for tracing events and needs to properly check it. Anyway the sparc64 logic is not implemented yet due to lack of knowledge (by me) and time (by marius), but it is just a matter of using ktr_cpumask when possible. Tested and fixed by: pluknet Reviewed by: marius	2011-05-31 20:48:58 +00:00
Attilio Rao	d0984adc98	Revert a change that crept in during MFC.	2011-05-31 20:23:33 +00:00
Attilio Rao	5b6ea0b538	MFC	2011-05-31 14:18:10 +00:00
Attilio Rao	217e1c0ebc	Revert a patch that unvolountary sneaked in while I was MFCing.	2011-05-23 23:50:21 +00:00
Attilio Rao	a9ff18a210	MFC	2011-05-23 01:17:30 +00:00
Attilio Rao	e3071102d6	Merge r221912 from largeSMP project branch: Fix a long-standing bug in cpuset_thread0() where only the first part of cs_mask is set full. Submitted by: anonymous MFC after: 1 week	2011-05-22 21:35:03 +00:00
Attilio Rao	34a1e065bd	Make cpusetobj_strprint() prepare the string in order to print the least significant cpuset_t word at the outmost right part of the string (more far from the beginning of it). This follows the natural build of bits rappresentation in the words.	2011-05-22 20:29:47 +00:00
Attilio Rao	f27aed53d0	Fix a longstanding bug where only the first part of the cpumask was correctly set full. Submitted by: anonymous	2011-05-14 19:36:12 +00:00
Attilio Rao	faa0e911fb	Simplify the code here. Submitted by: jhb	2011-05-14 18:22:08 +00:00
Attilio Rao	71a19bdc64	Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architecture. cpumask_t on the other side is implemented as an array of longs, and easilly extendible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, needs to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primirally done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to came soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno	2011-05-05 14:39:14 +00:00
John Baldwin	e84c2db137	When constructing a new cpuset, apply the parent cpuset's mask to the new set's mask rather than the root mask. This was causing the root mask to be modified incorrectly. Reviewed by: jeff MFC after: 1 week	2011-03-08 14:18:21 +00:00
David Xu	444528c026	Use integer for size of cpuset, as it won't be bigger than INT_MAX, This is requested by bge. Also move the sysctl into file kern_cpuset.c, because it should always be there, it is independent of thread scheduler.	2010-11-01 00:42:25 +00:00
David Xu	4a5478709b	- Revert r214409. - Use long word to figure out sizeof kernel cpuset, hope it works.	2010-10-27 09:29:03 +00:00
David Xu	1676b42546	If input parameter cpusetsize is zero, give userland size of cpuset mask kernel is using.	2010-10-27 02:32:54 +00:00
David Xu	42fe684c1a	Use function tdfind() to find a thread.	2010-10-25 13:13:16 +00:00
John Baldwin	86855bf549	Another nit that both I and ispell missed. Submitted by: Ben Kaduk minimarmot of gmail	2009-10-26 18:32:06 +00:00
John Baldwin	9390262576	Fix some spelling nits.	2009-10-26 17:42:03 +00:00
Jamie Gritton	399645e1b9	Remove unnecessary/redundant includes. Approved by: bz (mentor)	2009-06-23 14:39:21 +00:00
Jamie Gritton	0304c73163	Add hierarchical jails. A jail may further virtualize its environment by creating a child jail, which is visible to that jail and to any parent jails. Child jails may be restricted more than their parents, but never less. Jail names reflect this hierarchy, being MIB-style dot-separated strings. Every thread now points to a jail, the default being prison0, which contains information about the physical system. Prison0's root directory is the same as rootvnode; its hostname is the same as the global hostname, and its securelevel replaces the global securelevel. Note that the variable "securelevel" has actually gone away, which should not cause any problems for code that properly uses securelevel_gt() and securelevel_ge(). Some jail-related permissions that were kept in global variables and set via sysctls are now per-jail settings. The sysctls still exist for backward compatibility, used only by the now-deprecated jail(2) system call. Approved by: bz (mentor)	2009-05-27 14:11:23 +00:00
Bjoern A. Zeeb	6aaa0b3cf1	Prevent a superuser inside a jail from modifying the dedicated root cpuset of that jail. Processes inside the jail will still be able to change child sets. A superuser outside of a jail will still be able to change the jail cpuset and thus limit the number of cpus available to the jail. Problem reported by: 000.fbsd@quip.cz (Miroslav Lachman) PR: kern/134050 Reviewed by: jeff MFC after: 3 weeks X-MFC: backout r191596	2009-04-28 21:00:50 +00:00
Bjoern A. Zeeb	47479a8ceb	Correct a comment: the function name given had never existed in any (relevant) version of this file orany of my patches. MFC after: 1 month	2009-04-22 20:49:54 +00:00
Bjoern A. Zeeb	413628a7e3	MFp4: Bring in updated jail support from bz_jail branch. This enhances the current jail implementation to permit multiple addresses per jail. In addtion to IPv4, IPv6 is supported as well. Due to updated checks it is even possible to have jails without an IP address at all, which basically gives one a chroot with restricted process view, no networking,.. SCTP support was updated and supports IPv6 in jails as well. Cpuset support permits jails to be bound to specific processor sets after creation. Jails can have an unrestricted (no duplicate protection, etc.) name in addition to the hostname. The jail name cannot be changed from within a jail and is considered to be used for management purposes or as audit-token in the future. DDB 'show jails' command was added to aid debugging. Proper compat support permits 32bit jail binaries to be used on 64bit systems to manage jails. Also backward compatibility was preserved where possible: for jail v1 syscalls, as well as with user space management utilities. Both jail as well as prison version were updated for the new features. A gap was intentionally left as the intermediate versions had been used by various patches floating around the last years. Bump __FreeBSD_version for the afore mentioned and in kernel changes. Special thanks to: - Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches and Olivier Houchard (cognet) for initial single-IPv6 patches. - Jeff Roberson (jeff) and Randall Stewart (rrs) for their help, ideas and review on cpuset and SCTP support. - Robert Watson (rwatson) for lots and lots of help, discussions, suggestions and review of most of the patch at various stages. - John Baldwin (jhb) for his help. - Simon L. Nielsen (simon) as early adopter testing changes on cluster machines as well as all the testers and people who provided feedback the last months on freebsd-jail and other channels. - My employer, CK Software GmbH, for the support so I could work on this. Reviewed by: (see above) MFC after: 3 months (this is just so that I get the mail) X-MFC Before: 7.2-RELEASE if possible	2008-11-29 14:32:14 +00:00
Bjoern A. Zeeb	dea0ed6690	Add a `show cpusets' DDB command to print numbered root and assigned CPU affinity sets. Reviewed by: brooks	2008-07-07 21:32:02 +00:00
Bjoern A. Zeeb	7a8f695a21	Move cpuset_refroot and cpuset_refbase functions up, grouping the cpuset_ref* functions together. Will make it easier to read and add code without forward declarations. No functional changes.	2008-07-07 20:45:55 +00:00
Bjoern A. Zeeb	ba931c0855	Add a new priv 'PRIV_SCHED_CPUSET' to check if manipulating cpusets is allowed and replace the suser() call. Do not allow it in jails. Reviewed by: rwatson	2008-06-29 17:58:16 +00:00
Konstantin Belousov	887aedc64e	Take into account possible overflow when multiplying. The casuality is the malloc call later, panicing kernel due to the oversized allocation. Reported by: pho Reviewed by: jeff	2008-05-26 10:01:13 +00:00

1 2

59 Commits