freebsd-nq

Author	SHA1	Message	Date
Mark Johnston	a11ac730a7	Prevent CPU migration when checking the DTrace nofault flag on x86. dtrace_trap() consumes page and protection faults triggered by code running in DTrace probe context. Such faults occur with interrupts disabled and are detected using a per-CPU flag. Regular faults cause dtrace_trap() to be called with interrupts enabled, and nothing was ensuring that the flag was read from the correct CPU. This may result in dtrace_trap() consuming unrelated page and protection faults when DTrace is enabled, causing the fault handler to return without actually having handled the fault. Diagnosed by: Ryan Libby <rlibby@gmail.com> MFC after: 3 days Sponsored by: Dell EMC Isilon	2017-02-16 23:05:20 +00:00
John Baldwin	fdce57a042	Add an EARLY_AP_STARTUP option to start APs earlier during boot. Currently, Application Processors (non-boot CPUs) are started by MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until SI_SUB_SMP at which point they are released to run kernel threads. SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter the scheduler and start running threads until fairly late in the boot. This change moves SI_SUB_SMP up to just before software interrupt threads are created allowing the APs to start executing kernel threads much sooner (before any devices are probed). This allows several initialization routines that need to perform initialization on all CPUs to now perform that initialization in one step rather than having to defer the AP initialization to a second SYSINIT run at SI_SUB_SMP. It also permits all CPUs to be available for handling interrupts before any devices are probed. This last feature fixes a problem on with interrupt vector exhaustion. Specifically, in the old model all device interrupts were routed onto the boot CPU during boot. Later after the APs were released at SI_SUB_SMP, interrupts were redistributed across all CPUs. However, several drivers for multiqueue hardware allocate N interrupts per CPU in the system. In a system with many CPUs, just a few drivers doing this could exhaust the available pool of interrupt vectors on the boot CPU as each driver was allocating N * mp_ncpu vectors on the boot CPU. Now, drivers will allocate interrupts on their desired CPUs during boot meaning that only N interrupts are allocated from the boot CPU instead of N * mp_ncpu. Some other bits of code can also be simplified as smp_started is now true much earlier and will now always be true for these bits of code. This removes the need to treat the single-CPU boot environment as a special case. As a transition aid, the new behavior is available under a new kernel option (EARLY_AP_STARTUP). This will allow the option to be turned off if need be during initial testing. I plan to enable this on x86 by default in a followup commit in the next few days and to have all platforms moved over before 11.0. Once the transition is complete, the option will be removed along with the !EARLY_AP_STARTUP code. These changes have only been tested on x86. Other platform maintainers are encouraged to port their architectures over as well. The main things to check for are any uses of smp_started in MD code that can be simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in the EARLY_AP_STARTUP case (e.g. the interrupt shuffling). PR: kern/199321 Reviewed by: markj, gnn, kib Sponsored by: Netflix	2016-05-14 18:22:52 +00:00
Mark Johnston	6c2806594b	Make the second argument of dtrace_invop() a trapframe pointer. Currently this argument is a pointer into the stack which is used by FBT to fetch the first five probe arguments. On all non-x86 architectures it's simply the trapframe address, so this change has no functional impact. On amd64 it's a pointer into the trapframe such that stack[1 .. 5] gives the first five argument registers, which are deliberately grouped together in the amd64 trapframe definition. A trapframe argument simplifies the invop handlers on !x86 and makes the x86 FBT invop handler easier to understand. Moreover, it allows for invop handlers that may want to modify the register set of the interrupted thread.	2016-04-17 23:08:47 +00:00
Mark Johnston	e1e33ff912	Initialize DTrace hrtimer frequency during SI_SUB_CPU on i386 and amd64. This allows the hrtimer to be used earlier during boot. This is required for boot-time DTrace: anonymous enablings are created during SI_SUB_DTRACE_ANON, which runs before APs are started. In particular, the DTrace deadman timer requires that the hrtimer be functional. MFC after: 2 weeks	2016-04-10 01:23:39 +00:00
Mark Johnston	48cc2d5e22	Remove unused variables dtrace_in_probe and dtrace_in_probe_addr.	2016-03-17 18:55:54 +00:00
Mark Johnston	5a9f9cb38e	Remove some commented-out upstream code for handling traps from usermode DTrace probes. This handling is already done in trap() on i386 and amd64.	2015-05-10 22:27:48 +00:00
Mark Johnston	cafe874475	Restore the trap type argument to the DTrace trap hook, removed in r268600. It's redundant at the moment since it can be obtained from the trapframe on the architectures where DTrace is supported, but this won't be the case with ARM.	2014-12-23 15:38:19 +00:00
Mark Johnston	291624fdf6	Invoke the DTrace trap handler before calling trap() on amd64. This matches the upstream implementation and helps ensure that a trap induced by tracing fbt::trap:entry is handled without recursively generating another trap. This makes it possible to run most (but not all) of the DTrace tests under common/safety/ without triggering a kernel panic. Submitted by: Anton Rang <anton.rang@isilon.com> (original version) Phabric: D95	2014-07-14 04:38:17 +00:00
George V. Neville-Neil	d638e8dcde	Change UL to ULL since time is 32 bits. Pointed out by: avg@ MFC after: 2 weeks	2012-07-17 14:36:40 +00:00
George V. Neville-Neil	57d025c338	Add support for walltimestamp in DTrace. Submitted by: Fabian Keil MFC after: 2 weeks	2012-07-16 20:17:19 +00:00
George V. Neville-Neil	4737d389b0	Integrate a fix for a very odd signal delivery problem found by Bryan Cantril and others in the Solaris/Illumos version of DTrace. Obtained from: https://www.illumos.org/issues/789 MFC after: 2 weeks	2012-06-04 16:15:40 +00:00
Zachary Loafman	db5c7d363d	Fix DTrace TSC skew calculation: The skew calculation here is exactly backwards. We were able to repro it on a multi-package ESX server running a FreeBSD VM, where the TSCs can be pretty evil. MFC after: 1 week Submitted by: Jeff Ford <jeffrey.ford2@isilon.com> Reviewed by: avg, gnn	2012-06-04 16:04:01 +00:00
Attilio Rao	ada5b73915	Remove pc_cpumask usage from dtrace MD support	2011-06-28 13:14:39 +00:00
Attilio Rao	b68eda3b54	MFC	2011-05-10 15:54:37 +00:00
Andriy Gapon	d9b8935fb9	dtrace: remove unused code Which is also useless, IMO. MFC after: 5 days	2011-05-10 15:05:27 +00:00
Attilio Rao	71a19bdc64	Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architecture. cpumask_t on the other side is implemented as an array of longs, and easilly extendible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, needs to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primirally done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to came soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno	2011-05-05 14:39:14 +00:00
Jung-uk Kim	3453537fa5	Use atomic load & store for TSC frequency. It may be overkill for amd64 but safer for i386 because it can be easily over 4 GHz now. More worse, it can be easily changed by user with 'machdep.tsc_freq' tunable (directly) or cpufreq(4) (indirectly). Note it is intentionally not used in performance critical paths to avoid performance regression (but we should, in theory). Alternatively, we may add "virtual TSC" with lower frequency if maximum frequency overflows 32 bits (and ignore possible incoherency as we do now).	2011-04-07 23:28:28 +00:00
Rebecca Cran	6bccea7c2b	Fix typos - remove duplicate "the". PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-21 09:01:34 +00:00
Andriy Gapon	fe8c7b3d77	dtrace_xcall: no need for special handling of curcpu smp_rendezvous_cpus alreadt does the right thing in a very similar fashion, so the code was kind of duplicating that. MFC after: 3 weeks	2010-12-07 09:19:47 +00:00
Andriy Gapon	7becfa95b9	dtrace_gethrtime_init: pin to master while examining other CPUs Also use pc_cpumask to be future-friendly. Reviewed by: jhb MFC after: 2 weeks	2010-12-07 09:03:17 +00:00
John Baldwin	3aa6d94e0c	Update several places that iterate over CPUs to use CPU_FOREACH().	2010-06-11 18:46:34 +00:00
Andriy Gapon	b064b6d1cd	dtrace_gethrtime: improve scaling of TSC ticks to nanoseconds Currently dtrace_gethrtime uses formula similar to the following for converting TSC ticks to nanoseconds: rdtsc() * 10^9 / tsc_freq The dividend overflows 64-bit type and wraps-around every 2^64/10^9 = 18446744073 ticks which is just a few seconds on modern machines. Now we instead use precalculated scaling factor of 10^9*2^N/tsc_freq < 2^32 and perform TSC value multiplication separately for each 32-bit half. This allows to avoid overflow of the dividend described above. The idea is taken from OpenSolaris. This has an added feature of always scaling TSC with invariant value regardless of TSC frequency changes. Thus the timestamps will not be accurate if TSC actually changes, but they are always proportional to TSC ticks and thus monotonic. This should be much better than current formula which produces wildly different non-monotonic results on when tsc_freq changes. Also drop write-only 'cp' variable from amd64 dtrace_gethrtime_init() to make it identical to the i386 twin. PR: kern/127441 Tested by: Thomas Backman <serenity@exscape.org> Reviewed by: jhb Discussed with: current@, bde, gnn Silence from: jb Approved by: re (gnn) MFC after: 1 week	2009-07-15 17:07:39 +00:00
Ganbold Tsagaankhuu	79dae0aa0b	Remove unused variable. Found with: Coverity Prevent(tm) CID: 3669,3671 Approved by: jb	2008-11-25 19:25:54 +00:00
John Birrell	91eaf3e183	Custom DTrace kernel module files plus FreeBSD-specific DTrace providers.	2008-05-23 05:59:42 +00:00

24 Commits