freebsd-skq/sys/x86/include/vdso.h

54 lines
1.9 KiB
C
Raw Normal View History

/*-
* SPDX-License-Identifier: BSD-2-Clause-FreeBSD
*
* Copyright 2012 Konstantin Belousov <kib@FreeBSD.ORG>.
Implement userspace gettimeofday(2) with HPET timecounter. Right now, userspace (fast) gettimeofday(2) on x86 only works for RDTSC. For older machines, like Core2, where RDTSC is not C2/C3 invariant, and which fall to HPET hardware, this means that the call has both the penalty of the syscall and of the uncached hw behind the QPI or PCIe connection to the sought bridge. Nothing can me done against the access latency, but the syscall overhead can be removed. System already provides mappable /dev/hpetX devices, which gives straight access to the HPET registers page. Add yet another algorithm to the x86 'vdso' timehands. Libc is updated to handle both RDTSC and HPET. For HPET, the index of the hpet device to mmap is passed from kernel to userspace, index might be changed and libc invalidates its mapping as needed. Remove cpu_fill_vdso_timehands() KPI, instead require that timecounters which can be used from userspace, to provide tc_fill_vdso_timehands{,32}() methods. Merge i386 and amd64 libc/<arch>/sys/__vdso_gettc.c into one source file in the new libc/x86/sys location. __vdso_gettc() internal interface is changed to move timecounter algorithm detection into the MD code. Measurements show that RDTSC even with the syscall overhead is faster than userspace HPET access. But still, userspace HPET is three-four times faster than syscall HPET on several Core2 and SandyBridge machines. Tested by: Howard Su <howard0su@gmail.com> Sponsored by: The FreeBSD Foundation MFC after: 1 month Differential revision: https://reviews.freebsd.org/D7473
2016-08-17 09:52:09 +00:00
* Copyright 2016 The FreeBSD Foundation.
* All rights reserved.
*
Implement userspace gettimeofday(2) with HPET timecounter. Right now, userspace (fast) gettimeofday(2) on x86 only works for RDTSC. For older machines, like Core2, where RDTSC is not C2/C3 invariant, and which fall to HPET hardware, this means that the call has both the penalty of the syscall and of the uncached hw behind the QPI or PCIe connection to the sought bridge. Nothing can me done against the access latency, but the syscall overhead can be removed. System already provides mappable /dev/hpetX devices, which gives straight access to the HPET registers page. Add yet another algorithm to the x86 'vdso' timehands. Libc is updated to handle both RDTSC and HPET. For HPET, the index of the hpet device to mmap is passed from kernel to userspace, index might be changed and libc invalidates its mapping as needed. Remove cpu_fill_vdso_timehands() KPI, instead require that timecounters which can be used from userspace, to provide tc_fill_vdso_timehands{,32}() methods. Merge i386 and amd64 libc/<arch>/sys/__vdso_gettc.c into one source file in the new libc/x86/sys location. __vdso_gettc() internal interface is changed to move timecounter algorithm detection into the MD code. Measurements show that RDTSC even with the syscall overhead is faster than userspace HPET access. But still, userspace HPET is three-four times faster than syscall HPET on several Core2 and SandyBridge machines. Tested by: Howard Su <howard0su@gmail.com> Sponsored by: The FreeBSD Foundation MFC after: 1 month Differential revision: https://reviews.freebsd.org/D7473
2016-08-17 09:52:09 +00:00
* Portions of this software were developed by Konstantin Belousov
* under sponsorship from the FreeBSD Foundation.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
* $FreeBSD$
*/
#ifndef _X86_VDSO_H
#define _X86_VDSO_H
#define VDSO_TIMEHANDS_MD \
uint32_t th_x86_shift; \
Implement userspace gettimeofday(2) with HPET timecounter. Right now, userspace (fast) gettimeofday(2) on x86 only works for RDTSC. For older machines, like Core2, where RDTSC is not C2/C3 invariant, and which fall to HPET hardware, this means that the call has both the penalty of the syscall and of the uncached hw behind the QPI or PCIe connection to the sought bridge. Nothing can me done against the access latency, but the syscall overhead can be removed. System already provides mappable /dev/hpetX devices, which gives straight access to the HPET registers page. Add yet another algorithm to the x86 'vdso' timehands. Libc is updated to handle both RDTSC and HPET. For HPET, the index of the hpet device to mmap is passed from kernel to userspace, index might be changed and libc invalidates its mapping as needed. Remove cpu_fill_vdso_timehands() KPI, instead require that timecounters which can be used from userspace, to provide tc_fill_vdso_timehands{,32}() methods. Merge i386 and amd64 libc/<arch>/sys/__vdso_gettc.c into one source file in the new libc/x86/sys location. __vdso_gettc() internal interface is changed to move timecounter algorithm detection into the MD code. Measurements show that RDTSC even with the syscall overhead is faster than userspace HPET access. But still, userspace HPET is three-four times faster than syscall HPET on several Core2 and SandyBridge machines. Tested by: Howard Su <howard0su@gmail.com> Sponsored by: The FreeBSD Foundation MFC after: 1 month Differential revision: https://reviews.freebsd.org/D7473
2016-08-17 09:52:09 +00:00
uint32_t th_x86_hpet_idx; \
uint32_t th_res[6];
#define VDSO_TH_ALGO_X86_TSC VDSO_TH_ALGO_1
#define VDSO_TH_ALGO_X86_HPET VDSO_TH_ALGO_2
#define VDSO_TH_ALGO_X86_HVTSC VDSO_TH_ALGO_3 /* Hyper-V ref. TSC */
#ifdef _KERNEL
#ifdef COMPAT_FREEBSD32
#define VDSO_TIMEHANDS_MD32 VDSO_TIMEHANDS_MD
#endif
#endif
#endif