Give the HZ/overflow check a 10% margin.
Eliminate bogus newline.
If timecounters have equal quality, prefer higher frequency.
Some inspiration from: bde
represents the purely stylistic changes and should have no net impact
on the rest of the code.
bde's more substantive changes will follow in a separate commit once
we've come to closure on them.
Submitted by: bde
ntp_update_second twice when we have a large step in case that step
goes across a scheduled leap second. The only way this could happen
would be if we didn't call tc_windup over the end of day on the day of
a leap second, which would only happen if timeouts were delayed for
seconds. While it is an edge case, it is an important one to get
right for my employer.
Sponsored by: Timing Solutions Corporation
A timecounter will be selected when registered if its quality is
not negative and no less than the current timecounter's.
Add a sysctl to report all available timecounters and their qualities.
Give the dummy timecounter a solid negative quality of minus a million.
Give the i8254 zero and the ACPI 1000.
The TSC gets 800, unless APM or SMP forces it negative.
Other timecounters default to zero quality and thereby retain current
selection behaviour.
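The selection rule above, together with the "equal quality, prefer
higher frequency" tie-break, can be sketched roughly as follows. This
is only an illustration: struct tc_sketch and tc_choose() are
hypothetical, simplified stand-ins and not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a timecounter descriptor. */
struct tc_sketch {
	const char *name;
	uint64_t    frequency;	/* counter frequency in Hz */
	int         quality;	/* negative: never selected automatically */
};

/*
 * Return the timecounter that should be active after "cand" registers,
 * given the currently active one, "cur".
 */
static const struct tc_sketch *
tc_choose(const struct tc_sketch *cur, const struct tc_sketch *cand)
{
	if (cand->quality < 0)
		return (cur);		/* negative quality is never picked */
	if (cand->quality < cur->quality)
		return (cur);		/* worse than what we already have */
	if (cand->quality == cur->quality &&
	    cand->frequency < cur->frequency)
		return (cur);		/* equal quality: keep the faster one */
	return (cand);
}
```

With the qualities listed above, a newly registered ACPI counter (1000)
would replace the i8254 (0), while a TSC (800) registering afterwards
would not displace the ACPI counter.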
Before, we would add/subtract the leap second when the system had been
up for an even multiple of days, rather than at the end of the day, as
a leap second is defined (at least wrt ntp). We do this by
calculating the notion of UTC earlier in the loop, and passing that to
get it adjusted. Any adjustments that ntp_update_second makes to this
time are then transferred to boot time. We can't pass it either the
boot time or the uptime because their sum is what determines when a
leap second is needed. This code adds an extra assignment and two
extra compares in the typical case, which is as cheap as I could make
it.
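A minimal sketch of the idea described above, not the committed code:
form UTC from boot time plus uptime, let the leap-second routine adjust
that value, and fold any change back into boot time. All names here
(sec_t, leap_adjust(), windup_second()) are hypothetical, and
leap_adjust() is only a simplified stand-in for ntp_update_second.

```c
#include <stdint.h>

typedef int64_t sec_t;

enum leap { LEAP_NONE, LEAP_INS, LEAP_DEL };

/* Simplified stand-in for ntp_update_second: apply a pending leap. */
static void
leap_adjust(sec_t *utc, enum leap *pending)
{
	if (*pending == LEAP_INS && *utc % 86400 == 0) {
		(*utc)--;		/* insert: repeat the last second */
		*pending = LEAP_NONE;
	} else if (*pending == LEAP_DEL && (*utc + 1) % 86400 == 0) {
		(*utc)++;		/* delete: skip the last second */
		*pending = LEAP_NONE;
	}
}

/* Once-per-second work: the leap decision is made on UTC, not uptime. */
static void
windup_second(sec_t *boottime, sec_t uptime, enum leap *pending)
{
	sec_t utc, t;

	utc = *boottime + uptime;	/* the sum decides when a leap is due */
	t = utc;			/* remember the unadjusted value */
	leap_adjust(&utc, pending);
	if (utc != t)
		*boottime += utc - t;	/* transfer the adjustment */
}
```

Because the test is made on the UTC value, an uptime that happens to be
a whole number of days no longer triggers the leap by itself.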
I have confirmed that with this code the kernel time does the correct thing
for both positive and negative leap seconds. Since the ntp interface
doesn't allow for +2 or -2, those cases can't be tested (and the folks
in the know here say there will never be a +2s or -2s leap event, but
rather two +1s or -1s leap events).
There will very likely be no leap seconds for a while, given how the
earth is speeding up and slowing down, so there will be plenty of time
for this fix to propagate. UT1-UTC is currently at "about -0.4s" and
decrementing by .1s every 8 months or so. 6 * 8 is 48 months, or 4
years.
-stable has different code, but a similar bug that was introduced
about the time of the last leap second, which is why nobody has
noticed until now.
MFC After: 3 weeks
Reviewed by: phk
"Furthermore, leap seconds must die." -- Cato the Elder
potential discontinuities in our UTC timescale.
Applications can monitor this variable if they want to be informed
about steps in the timescale. Slews (ntp and adjtime(2)) and
frequency adjustments (ntp) will not increment this counter, only
operations which set the clock. No attempt is made to classify
size or direction of the step.
called. Otherwise (depending on a non-deterministic sort), the timecounter
code can be initialized before the clock rate has been set (on ia64) and it
assumes hz = 100, rather than the real value of 1024. I'm not sure how much
gets upset by this.
Glanced at by: phk
functions which run for several milliseconds at a time and getting
in queue behind one or more of those makes us miss our rewind.
Instead call it from hardclock() like we used to do, but retain the
prescaler so we still cope with high HZ values.
by other bits of code, split struct timecounter into two.
struct timecounter contains just the bits which pertain to the hardware
counter and the reading of it.
struct timehands (as in "the hands on a clock") contains all the ugly
bit-fiddling stuff. Statically compile ten timehands.
This commit is the functional part. A later cosmetic patch will rename
various variables and fieldnames.
timeout loop.
Limit the rate at which we wind the timecounters to approx 1000 Hz.
This limits the precision of the get{bin,nano,micro}[up]time(9)
functions to roughly a millisecond.
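One way such a rate limit could look is a simple tick prescaler, as in
the rough sketch below. The names (tc_prescale(), tick_sketch(),
windup_stub()) and the rounding used are assumptions made for
illustration; they are not taken from the commit.

```c
/* Counts how often the stand-in windup routine has run. */
static int windup_calls;

/* Stand-in for the real windup; it only counts invocations here. */
static void
windup_stub(void)
{
	windup_calls++;
}

/* Clock ticks between windups so that windups run at roughly 1000 Hz. */
static int
tc_prescale(int hz)
{
	return (hz > 1000 ? (hz + 500) / 1000 : 1);
}

/* Called once per clock tick; winds up on every "prescale"-th call. */
static void
tick_sketch(int prescale)
{
	static int count;

	if (++count < prescale)
		return;
	count = 0;
	windup_stub();
}
```

With hz at or below 1000 the prescaler is 1 and every tick winds up;
for larger hz only every Nth tick does.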
timecounter will be used starting at the next second, which is
good enough for sysctl purposes. If better adjustment is needed
the NTP PLL should be used.
Apply the change as a continuous slew rather than as a series of
discrete steps and make it possible to adjust arbitrarily huge
amounts of time in either direction.
In practice this is done by hooking into the same once-per-second
loop as the NTP PLL, setting a suitable frequency offset and deducting
the amount slewed from the remainder. If the remaining delta is
larger than 1 second we slew at 5000PPM (5msec/sec), for a delta
less than a second we slew at 500PPM (500usec/sec) and for the last
one second period we will slew at whatever rate (less than 500PPM)
it takes to eliminate the delta entirely.
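The rate selection just described can be sketched as a per-second step
function. This is an illustration only: adj_slew_step() is a
hypothetical name, and the remaining delta is assumed to be kept in
microseconds, so that 5000PPM and 500PPM correspond to 5000 and 500
microseconds per second.

```c
#include <stdint.h>

/*
 * Called once per second.  Picks how many microseconds to slew during
 * the coming second, deducts that from the remaining delta, and
 * returns it.
 */
static int64_t
adj_slew_step(int64_t *remaining_us)
{
	int64_t rate;

	if (*remaining_us > 1000000)
		rate = 5000;		/* more than 1s left: 5000PPM */
	else if (*remaining_us < -1000000)
		rate = -5000;
	else if (*remaining_us > 500)
		rate = 500;		/* less than 1s left: 500PPM */
	else if (*remaining_us < -500)
		rate = -500;
	else
		rate = *remaining_us;	/* last period: whatever is left */
	*remaining_us -= rate;
	return (rate);
}
```

Calling this once per second until the remainder reaches zero removes
the whole delta without ever stepping the clock.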
The old implementation stepped the clock a number of microseconds
every HZ to achieve the same effect, using the same rates of change.
Eliminate the global variables tickadj, tickdelta and timedelta and
their various uses and initializations.
This removes the most significant obstacle to running timecounter and
NTP housekeeping from a timeout rather than hardclock.
our feet when we look inside timecounter structures.
Make the "sync_other" code more robust by never overwriting the
tc_next field.
Add counters for the bin[up]time functions.
Call tc_windup() in tc_init() and switch_timecounter() to make sure
we have all the fields set right.
The binary format "bintime" is a 32.64 format; it will go to 64.64
when time_t does.
The bintime format is available to consumers of time in the kernel,
and is preferable where time intervals need to be accumulated.
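As a rough illustration of why a fixed-point format suits accumulation,
the sketch below adds two 32.64 values with an explicit carry out of
the fraction. struct bt_sketch and bt_add() are hypothetical stand-ins
written for this example, with the seconds field assumed to be 32 bits
as in the 32.64 layout.

```c
#include <stdint.h>

/* Whole seconds plus a 64-bit binary fraction of a second. */
struct bt_sketch {
	int32_t  sec;
	uint64_t frac;		/* units of 2^-64 second */
};

/* a += b, carrying from the fraction into the seconds. */
static void
bt_add(struct bt_sketch *a, const struct bt_sketch *b)
{
	uint64_t old = a->frac;

	a->frac += b->frac;	/* unsigned add wraps modulo 2^64 */
	if (a->frac < old)
		a->sec++;	/* the wrap means one whole second */
	a->sec += b->sec;
}
```

No normalization or division is needed after the add, unlike a
seconds-plus-microseconds pair, which has to be renormalized.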
This change simplifies much of the magic math inside the timecounters
and improves the frequency and time precision by a couple of bits.
I have not been able to measure a performance difference which was not
a tiny fraction of the standard deviation on the measurements.
HZ=BIGNUM will strain the assumptions behind timecounters to the
point where they break.
This may or may not help people seeing microuptime() backwards messages.
Make the global timecounter variable volatile. It makes no difference in
the code GCC generates, but it represents the intent correctly.
Thanks to: jdp
MFC after: 2 weeks
include:
* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The
alpha port is still in transition and currently uses both.)
* Per-CPU idle processes.
* Interrupts are run in their own separate kernel threads and can be
preempted (i386 only).
Partially contributed by: BSDi (BSD/OS)
Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh