freebsd-skq

Author	SHA1	Message	Date
mmacy	c937b516d8	CK: update consumers to use CK macros across the board r334189 changed the fields to have names distinct from those in queue.h in order to expose the oversights as compile time errors	2018-05-24 23:21:23 +00:00
mmacy	3ddb2ade67	hwpmc: set threadid in callchain records - second part of r334108	2018-05-23 17:44:29 +00:00
mmacy	0ff2c24861	pmc: detach free_gtask on unload Reported by: pho	2018-05-20 20:34:15 +00:00
mmacy	5f1cf9cd72	pmc: avoid potential race on shutdown Clear shutdown flag first, conservatively allow 5ms for all hardclock consumers to see flag before draining	2018-05-20 19:35:24 +00:00
mmacy	a48d80f193	epoch(9): Make epochs non-preemptible by default There are risks associated with waiting on a preemptible epoch section. Change the name to make them not be the default and document the issue under CAVEATS. Reported by: markj	2018-05-18 17:29:43 +00:00
mmacy	b4ad383689	hwpmc: Implement per-thread counters for PMC sampling This implements per-thread counters for PMC sampling. The thread descriptors are stored in a list attached to the process descriptor. These thread descriptors can store any per-thread information necessary for current or future features. For the moment, they just store the counters for sampling. The thread descriptors are created when the process descriptor is created. Additionally, thread descriptors are created or freed when threads are started or stopped. Because the thread exit function is called in a critical section, we can't directly free the thread descriptors. Hence, they are freed to a cache, which is also used as a source of allocations when needed for new threads. Approved by: sbruno Obtained from: jtl Sponsored by: Juniper Networks, Limelight Networks Differential Revision: https://reviews.freebsd.org/D15335	2018-05-16 22:29:20 +00:00
mmacy	4c3dbe627d	hwpmc: don't reference domain index with no memory backing it On multi-socket the domain will be correctly set for a given CPU regardless of whether or not NUMA is enabled. Approved by: sbruno	2018-05-14 06:11:25 +00:00
mmacy	4d2273cd96	pmc: don't add pmc owner to list until setup is complete Once a pmc owner is added to the pmc_ss_owners list it is visible for all to see. We don't want this to happen until setup is complete. Reported by: mjg Approved by: sbruno	2018-05-14 01:08:47 +00:00
mmacy	9888701947	hwpmc: fix load/unload race and vm map LOR - fix load/unload race by allocating the per-domain list structure at boot - fix long extant vm map LOR by replacing pmc_sx sx_slock with global_epoch to protect the liveness of elements of the pmc_ss_owners list Reported by: pho Approved by: sbruno	2018-05-14 00:21:04 +00:00
mmacy	71ab2f70a9	hwpmc/epoch - don't reference domain if NUMA is not set It appears that domain information is set correctly independent of whether or not NUMA is defined. However, there is no memory backing secondary domains leading to allocation failure. Reported by: pho@, np@ Approved by: sbruno@	2018-05-12 20:00:29 +00:00
mmacy	dcb7d046f9	hwpmc(9): clear remaining sample work for hardclock - fix last minute change in 333509 whereby runcount references to a pmc would remain, causing us to pause loop forever Approved by: sbruno	2018-05-12 03:45:30 +00:00
mmacy	2981a3420c	hwpmc(9): Make pmclog buffer pcpu and update constants On non-trivial SMP systems the contention on the pmc_owner mutex leads to a substantial number of samples captured being from the pmc process itself. This change a) makes buffers larger to avoid contention on the global list b) makes the working sample buffer per cpu. Run pmcstat in the background (default event rate of 64k): pmcstat -S UNHALTED_CORE_CYCLES -O /dev/null sleep 600 & Before: make -j96 buildkernel -s >&/dev/null 3336.68s user 24684.10s system 7442% cpu 6:16.50 total After: make -j96 buildkernel -s >&/dev/null 2697.82s user 1347.35s system 6058% cpu 1:06.77 total For more realistic overhead measurement set the sample rate for ~2kHz on a 2.1GHz processor: pmcstat -n 1050000 -S UNHALTED_CORE_CYCLES -O /dev/null sleep 6000 & Collecting 10 samples of `make -j96 buildkernel` from each: x before + after real time: N Min Max Median Avg Stddev x 10 76.4 127.62 84.845 88.577 15.100031 + 10 59.71 60.79 60.135 60.179 0.29957192 Difference at 95.0% confidence -28.398 +/- 10.0344 -32.0602% +/- 7.69825% (Student's t, pooled s = 10.6794) system time: N Min Max Median Avg Stddev x 10 2277.96 6948.53 2949.47 3341.492 1385.2677 + 10 1038.7 1081.06 1070.555 1064.017 15.85404 Difference at 95.0% confidence -2277.47 +/- 920.425 -68.1574% +/- 8.77623% (Student's t, pooled s = 979.596) x no pmc + pmc running real time: HEAD: N Min Max Median Avg Stddev x 10 58.38 59.15 58.86 58.847 0.22504567 + 10 76.4 127.62 84.845 88.577 15.100031 Difference at 95.0% confidence 29.73 +/- 10.0335 50.5208% +/- 17.0525% (Student's t, pooled s = 10.6785) patched: N Min Max Median Avg Stddev x 10 58.38 59.15 58.86 58.847 0.22504567 + 10 59.71 60.79 60.135 60.179 0.29957192 Difference at 95.0% confidence 1.332 +/- 0.248939 2.2635% +/- 0.426506% (Student's t, pooled s = 0.264942) system time: HEAD: N Min Max Median Avg Stddev x 10 1010.15 1073.31 1025.465 1031.524 18.135705 + 10 2277.96 6948.53 2949.47 3341.492 1385.2677 Difference at 95.0% confidence 2309.97 +/- 920.443 223.937% +/- 89.3039% (Student's t, pooled s = 979.616) patched: N Min Max Median Avg Stddev x 10 1010.15 1073.31 1025.465 1031.524 18.135705 + 10 1038.7 1081.06 1070.555 1064.017 15.85404 Difference at 95.0% confidence 32.493 +/- 16.0042 3.15% +/- 1.5794% (Student's t, pooled s = 17.0331) Reviewed by: jeff@ Approved by: sbruno@ Differential Revision: https://reviews.freebsd.org/D15155	2018-05-12 01:26:34 +00:00
mmacy	a0bd5d3d7f	Eliminate the overhead of gratuitous repeated reinitialization of cap_rights - Add macros to allow preinitialization of cap_rights_t. - Convert most commonly used code paths to use preinitialized cap_rights_t. A 3.6% speedup in fstat was measured with this change. Reported by: mjg Reviewed by: oshogbo Approved by: sbruno MFC after: 1 month	2018-05-09 18:47:24 +00:00
fabient	326b96a88c	Fix pmcstat exit from kernel introduced by r325275. pmcstat request for close will generate a close event. This event will be in turn received by pmcstat to close the file. Reviewed by: kib Tested by: pho MFC after: 1 week Sponsored by: Stormshield	2018-01-17 16:41:22 +00:00
pfg	1537078d8f	sys/dev: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.	2017-11-27 14:52:40 +00:00
kib	bb9ce4f821	Do not leak PMC_PO_OWNS_LOGFILE on error. Note that the PMCLOG_RESERVE_WITH_ERROR() macro contains a goto error; statement and is executed after the flag is set. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-11-13 10:45:31 +00:00
kib	6f8de9c7ee	Style bug. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2017-11-13 10:43:31 +00:00
kib	70df4cad97	Check that the pmc index is less than the number of hardware PMCs, instead of asserting the condition. The row index is directly supplied by userspace, the kernel must handle invalid values. Submitted by: pho MFC after: 3 days	2017-11-10 19:10:14 +00:00
kib	3d9fc3cca3	Do not run pmclog_configure_log() without pmc_sx protection. The r195005 unlocked pmc_sx before calling into pmclog_configure_log() to avoid the LOR, but it allows flush or closelog to run in parallel with the configuration, causing many failure modes. Revert r195005. Pre-create the logging process, allowing it to run after the set up succeeded, otherwise the process terminates itself. Reported and tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D12882	2017-11-01 11:43:39 +00:00
kib	b7057b25e3	Be protective and check the po_file validity before dropping the ref. Reported and tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week X-Differential revision: https://reviews.freebsd.org/D12882	2017-11-01 11:37:45 +00:00
kib	1ae9155255	In hwpmc, do not double-close the logging file. hwpmc(4) must not voluntarily call fo_close(), doing this causes double-close of the file. It seems to almost avoid bad consequences for pipes, but other types of files demonstrate random memory access. To fix, remove fo_close() calls, which also do not provide the declared wake-up of waiters consistently. Instead, send a signal to the logger and configure the logger process to not block it. Since logger never returns to userspace, the signal only causes termination of the interruptible sleeps in fo_write(). Reported and tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week X-Differential revision: https://reviews.freebsd.org/D12882	2017-11-01 11:32:52 +00:00
kib	024598205a	There is no use for dropping Giant in the pmc syscall. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week X-Differential revision: https://reviews.freebsd.org/D12882	2017-11-01 11:16:18 +00:00
kib	6d15235904	Minor style tweaks. Sponsored by: The FreeBSD Foundation MFC after: 1 week X-Differential revision: https://reviews.freebsd.org/D12882	2017-11-01 11:05:47 +00:00
kib	964a0edf56	Use designated initializers for pmc sysent and module data. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week X-Differential revision: https://reviews.freebsd.org/D12882	2017-11-01 10:49:41 +00:00
br	06468fac12	o Support for Kabylake CPU PMCs (fall down to PMC_CPU_INTEL_SKYLAKE). o Fix bugs in events descriptions for Skylake, Skylake Xeon and Haswell. Reviewed by: kib Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D12654	2017-10-13 15:02:29 +00:00
cem	da5d7dd0c1	hwpmc(4): Actually use a sufficiently wide type jhibbits@ points out that left shifting bits 8-11 24 bits won't fit in a 32-bit integer either. Corrects r324533. Submitted by: jhibbits Sponsored by: Dell EMC Isilon	2017-10-11 15:13:40 +00:00
cem	32335afadd	hwpmc(4): Force sufficiently wide type for left shift Ordinary input to this macro comes from pe_code, which is uint16_t. Coverity points out that shifting such a value discards the result of a 24 bit shift, which is not what we want. A follow-up to r324291. CID: 1381676 Sponsored by: Dell EMC Isilon	2017-10-11 14:59:04 +00:00
cem	e57bc68025	hwpmc(4): Add support for extended AMD events Sponsored by: Dell EMC Isilon	2017-10-04 23:35:10 +00:00
kib	3e6224e523	Skylake server core PMC support for hwpmc(4). Reviewed by: emaste Sponsored by: The FreeBSD Foundation Hardware provided by: Intel MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D12221	2017-09-06 17:19:48 +00:00
kib	65e37fca9f	Fix logic error in the assert, causing the condition to be always true. Also improve the formatting of the corresponding KASSERT message. Based on the submission by: Svyatoslav <razmyslov@viva64.com> Found by: PVS-Studio PR: 217741 Reviewed by: emaste Sponsored by: The FreeBSD Foundation (kib) MFC after: 1 week	2017-08-08 15:46:29 +00:00
zbb	13c9408e90	Fix INVARIANTS debug code in HWPMC When HWPMC stops sampling, ps_pmc may be freed before samples are processed. In such situation treat PMC as stopped. Add "ifdef" to fix build without INVARIANTS code. Submitted by: Michal Mazur <mkm@semihalf.com> Obtained from: Semihalf Sponsored by: Stormshield, Netgate Differential revision: https://reviews.freebsd.org/D10912	2017-06-13 18:53:56 +00:00
zbb	86fcc88a77	Fix event table for Cortex A9. Removed events 0x8 (INSTR_EXECUTED), 0xE (PC_PROC_RETURN) and 0x13-0x1d not supported on Cortex A9. Add events 0x68 and 0x6E which replaced 0x8 and 0xE. Submitted by: Michal Mazur <mkm@semihalf.com> Obtained from: Semihalf Sponsored by: Stormshield, Netgate Differential revision: https://reviews.freebsd.org/D10911	2017-06-13 18:52:39 +00:00
zbb	0c419353c5	Fix HWPMC interrupt handling in Counting Mode Additionally: - Fix support for Cycle Counter (evsel == 0xFF) - Stop and mask interrupts from all counters on init and finish Submitted by: Michal Mazur <mkm@semihalf.com> Obtained from: Semihalf Sponsored by: Stormshield, Netgate Differential revision: https://reviews.freebsd.org/D10910	2017-06-13 18:51:23 +00:00
fabient	74bd0be5e9	Fix arm stack frame walking support: - Adjust stack offset for Clang - Correctly fill registers for fake stack frame (soft PMC) MFC after: 1 week Sponsored by: Stormshield Differential Revision: https://reviews.freebsd.org/D7396	2017-03-14 16:06:57 +00:00
gnn	0a141676a9	Fix PMC architecture check to handle later IPAs including Skylake Tested with tools/test/hwpmc/pmctest.py Obtained from: Oliver Pinter MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D9036	2017-01-04 02:15:03 +00:00
avg	959e19a84a	pmc_process_csw_out: ignore deleted counters I see the following panic on AMD when exiting pmcstat: panic: [pmc,1473] pp_pmcval outside of expected range cpu=2 ri=17 pp_pmcval=fffffffffa529f5b pm_reloadcount=10000 It seems that at least on AMD a performance counter keeps counting after overflowing. When pmcstat exits it sets counters that it used to PMC_STATE_DELETED and waits until their use count goes to zero. amd_intr() wouldn't reload a counter in that state and, thus, a counter would be allowed to overflow. That means that the counter's value would be allowed to go outside the expected range. MFC after: 2 weeks	2016-11-10 11:12:45 +00:00
avg	864edd8840	hwpmc: fix a race between amd_stop_pmc and amd_intr It is possible that wrmsr in amd_stop_pmc() causes an overflow in a counter that it disables. In that case a non-maskable interrupt is generated. The interrupt handler code was written in such a way that it would re-enable the counter. That would lead to an unexpected interrupt later on. This problem was easy to reproduce with $ pmcstat -T -P instructions -t $pid if the target process is sufficiently busy and there are context switches from time to time. There would be a lot of interrupts to "race" with amd_stop_pmc() called during the context switches. The problem affected only AMD processors. While there, trace whether amd_intr() claimed an interrupt. Reviewed by: jhb MFC after: 2 weeks	2016-10-30 09:38:10 +00:00
emaste	2fd125607b	hwpmc: remove sys/capability.h backwards compatibility The Capsicum header is installed as sys/capsicum.h in stable/10 as well.	2016-09-20 12:56:03 +00:00
jhb	ae52b8b4ff	Apply the fix from r232612 to fixed function counters. Reviewed by: emaste MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D7397	2016-08-03 16:52:00 +00:00
andrew	15f895037c	Don't panic in hwpmc when stopping sampling. When hwpmc stops sampling it will set the pm_state to something other than PMC_STATE_RUNNING. This means the following sequence can happen: CPU 0: Enter the interrupt handler CPU 0: Set the thread TDP_CALLCHAIN pflag CPU 1: Stop sampling CPU 0: Call pmc_process_samples, sampling is stopped so clears ps_nsamples CPU 0: Finishes interrupt processing with the TDP_CALLCHAIN flag set CPU 0: Call pmc_capture_user_callchain to capture the user call chain CPU 0: Find all the pmc samples are free so no call chains need to be captured CPU 0: KASSERT because of this This fixes the issue by checking if any of the samples have been stopped and including this in the KASSERT. PR: 204273 Reviewed by: bz, gnn Obtained from: ABT Systems Ltd Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D6581	2016-05-28 13:05:39 +00:00
jhb	bcc5b0c55d	Add an EARLY_AP_STARTUP option to start APs earlier during boot. Currently, Application Processors (non-boot CPUs) are started by MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until SI_SUB_SMP at which point they are released to run kernel threads. SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter the scheduler and start running threads until fairly late in the boot. This change moves SI_SUB_SMP up to just before software interrupt threads are created allowing the APs to start executing kernel threads much sooner (before any devices are probed). This allows several initialization routines that need to perform initialization on all CPUs to now perform that initialization in one step rather than having to defer the AP initialization to a second SYSINIT run at SI_SUB_SMP. It also permits all CPUs to be available for handling interrupts before any devices are probed. This last feature fixes a problem with interrupt vector exhaustion. Specifically, in the old model all device interrupts were routed onto the boot CPU during boot. Later after the APs were released at SI_SUB_SMP, interrupts were redistributed across all CPUs. However, several drivers for multiqueue hardware allocate N interrupts per CPU in the system. In a system with many CPUs, just a few drivers doing this could exhaust the available pool of interrupt vectors on the boot CPU as each driver was allocating N * mp_ncpu vectors on the boot CPU. Now, drivers will allocate interrupts on their desired CPUs during boot meaning that only N interrupts are allocated from the boot CPU instead of N * mp_ncpu. Some other bits of code can also be simplified as smp_started is now true much earlier and will now always be true for these bits of code. This removes the need to treat the single-CPU boot environment as a special case. As a transition aid, the new behavior is available under a new kernel option (EARLY_AP_STARTUP). This will allow the option to be turned off if need be during initial testing. I plan to enable this on x86 by default in a followup commit in the next few days and to have all platforms moved over before 11.0. Once the transition is complete, the option will be removed along with the !EARLY_AP_STARTUP code. These changes have only been tested on x86. Other platform maintainers are encouraged to port their architectures over as well. The main things to check for are any uses of smp_started in MD code that can be simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in the EARLY_AP_STARTUP case (e.g. the interrupt shuffling). PR: kern/199321 Reviewed by: markj, gnn, kib Sponsored by: Netflix	2016-05-14 18:22:52 +00:00
trasz	94bd76e619	Remove misc NULL checks after M_WAITOK allocations. MFC after: 1 month Sponsored by: The FreeBSD Foundation	2016-05-10 10:26:07 +00:00
pfg	eed4bd22ad	sys/dev: minor spelling fixes. Most affect comments, very few have user-visible effects.	2016-05-03 03:41:25 +00:00
pfg	6ce01c2d90	etc: minor spelling fixes. Mostly comments but also some user-visible strings. MFC after: 2 weeks	2016-05-02 16:47:28 +00:00
pfg	42747553f4	sys: use our nitems() macro when param.h is available. This should cover all the remaining cases in the kernel. Discussed in: freebsd-current	2016-04-21 19:40:10 +00:00
pfg	fc65edc1cd	Remove slightly used const values that can be replaced with nitems(). Suggested by: jhb	2016-04-21 15:38:28 +00:00
pfg	e3c8c9cbf7	Remove unused e500_event_codes_size. Found by: jhb	2016-04-20 20:37:58 +00:00
pfg	b63211eed5	Cleanup unnecessary semicolons from the kernel. Found with devel/coccinelle.	2016-04-10 23:07:00 +00:00
jhibbits	60a000630b	Fix a masking bug for e500 PMC. No idea how this slipped through my regression testing. pe_code is the event to count, pe_cpu is the CPU family mask.	2016-04-09 01:02:17 +00:00
kib	7af72453b7	If full width writes to the performance monitoring counters are supported, use full-width aliases MSRs for writes. This fixes the "[pmc,X] negative increment" assertion on the context switch when clipped counter value is sign-extended. Add definitions for the MSR IA32_PERF_CAPABILITIES needed to detect the feature. PR: 207068 Submitted by: joss.upton@yahoo.com MFC after: 2 weeks	2016-02-12 07:27:24 +00:00

306 Commits