Commit Graph

82 Commits

Author SHA1 Message Date
kib
a25ba832b4 Fix a bug in r302252.
Change ntpadj_lock to spinlock always, and rename stuff removing
ADJ/adj from the names. ntp_update_second() requires ntp_lock and is
called from the tc_windup(), so ntp_lock must be a spinlock.  Add
missed lock to ntp_update_second().

Tested by:	pho (as part of the whole patch)
Reviewed by:	jhb (same)
Noted by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
X-Differential revision:	https://reviews.freebsd.org/D7302
2016-07-27 11:40:06 +00:00
kib
42d92058c3 Currently the ntptime code and resettodr() are Giant-locked. In
particular, the Giant is supposed to protect against parallel
ntp_adjtime(2) invocations.  But, for instance, sys_ntp_adjtime() does
copyout(9) under Giant and then examines time_status to return syscall
result.  Since copyout(9) could sleep, the syscall result might be
inconsistent.

Another and more important issue is that if PPS is configured,
hardpps(9) is executed without any protection against the parallel
top-level code invocation. Potentially, this may result in the
inconsistent state of the ntptime state variables, but I cannot say
how serious such distortion is. The non-functional splclock() call in
sys_ntp_adjtime() protected against clock interrupts calling hardpps()
in the pre-SMP era.

Modernize the locking. A mutex protects ntptime data.  Due to the
hardpps() KPI legitimately serving from the interrupt filters (and
e.g. uart(4) does call it from filter), the lock cannot be sleepable
mutex if PPS_SYNC is defined.  Otherwise, use normal sleepable mutex
to reduce interrupt latency.

Reviewed by:	  imp, jhb
Sponsored by:	  The FreeBSD Foundation
Approved by:	  re (gjb)
Differential revision:	https://reviews.freebsd.org/D6825
2016-06-28 16:43:23 +00:00
ian
098fcc1952 Use the monotonic (uptime) counter rather than time-of-day to measure elapsed
time between ntp_adjtime() clock offset adjustments.  This eliminates spurious
frequency steering after a large clock step (such as a 1970->2015 step on a
system with no battery-backed clock hardware).

This problem was discovered after the import of ntpd 4.2.8, which does things
in a slightly different (but still correct) order than the 4.2.4 we had
previously.  In particular, 4.2.4 would step the clock then immediately after
use ntp_adjtime() to set the frequency and offset to zero, which captured the
post-step time-of-day as a side effect.  In 4.2.8, ntpd sets frequency and
offset to zero before any initial clock step, capturing the time as 1970-ish,
then when it next calls ntp_adjtime() it's with a non-zero offset measurement.
This non-zero value gets multiplied by the apparent 45-year interval, which
blows up into a completely bogus frequency steer.  That gets clamped to
500ppm, but that's still enough to make the clock drift so fast that ntpd has
to keep stepping it every few minutes to compensate.
2015-07-12 18:38:17 +00:00
hselasky
35b126e324 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
gjb
fc21f40567 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
hselasky
bd1ed65f0f Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
avg
9e6374b6a9 rename scheduler->swapper and SI_SUB_RUN_SCHEDULER->SI_SUB_LAST
Also directly call swapper() at the end of mi_startup instead of
relying on swapper being the last thing in sysinits order.

Rationale:

- "RUN_SCHEDULER" was misleading, scheduling already takes place at that stage
- "scheduler" was misleading, the function swaps in the swapped out processes
- another SYSINIT(SI_SUB_RUN_SCHEDULER, SI_ORDER_ANY) could never be
  invoked depending on its relative order with scheduler; this was not obvious
  and the bug actually used to exist

Reviewed by:	kib (ealier version)
MFC after:	14 days
2013-07-24 09:45:31 +00:00
imp
193f715d75 Limit popcorn limit to something sane (either 2ns or 2 ticks if that's
longer).

PR:		156481
Submitted by:	Ian Lepore
2012-08-16 02:35:44 +00:00
lstewart
1c0bb02c84 Introduce the sysclock_getsnapshot() and sysclock_snap2bintime() KPIs. The
sysclock_getsnapshot() function allows the caller to obtain a snapshot of all
the system clock and timecounter state required to create time stamps at a later
point. The sysclock_snap2bintime() function converts a previously obtained
snapshot into a bintime time stamp according to the specified flags e.g. which
system clock, uptime vs absolute time, etc.

These KPIs enable useful functionality, including direct comparison of the
feedback and feed-forward system clocks and generation of multiple time stamps
with different formats from a single timecounter read.

Committed on behalf of Julien Ridoux and Darryl Veitch from the University of
Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward
Clock Synchronization Algorithms" project.

For more information, see http://www.synclab.org/radclock/

In collaboration with:	Julien Ridoux (jridoux at unimelb edu au)
2011-12-24 01:32:01 +00:00
eadler
3072a90209 Document a large number of currently undocumented sysctls. While here
fix some style(9) issues and reduce redundancy.

PR:		kern/155491
PR:		kern/155490
PR:		kern/155489
Submitted by:	Galimov Albert <wtfcrap@mail.ru>
Approved by:	bde
Reviewed by:	jhb
MFC after:	1 week
2011-12-13 00:38:50 +00:00
kmacy
99851f359e In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by:	rwatson
Approved by:	re (bz)
2011-09-16 13:58:51 +00:00
netchild
cc4128c6b1 Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/
PMC/SYSV/...).

No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availablility if
needed.

Sponsored by:   Google Summer of Code 2010
Submitted by:   kibab
Reviewed by:    arch@ (parts by rwatson, trasz, jhb)
X-MFC after:    to be determined in last commit with code from this project
2011-02-25 10:11:01 +00:00
mdf
306bf0c055 Fix up a few more sysctl(9) mis-typing found in various LINT builds. 2011-01-13 18:20:27 +00:00
avg
eca696eeba there must be only one SYSINIT with SI_SUB_RUN_SCHEDULER+SI_ORDER_ANY order
SI_SUB_RUN_SCHEDULER+SI_ORDER_ANY should only be used to call
scheduler() function which turns the initial thread into swapper proper
and thus there is no further SYSINIT processing.
Other SYSINITs with SI_SUB_RUN_SCHEDULER+SI_ORDER_ANY may get ordered
after scheduler() and thus never executed.  That particular relative
order is semi-arbitrary.

Thus, change such places to use SI_ORDER_MIDDLE.
Also, use SI_ORDER_MIDDLE instead of correct, but less appealing,
SI_ORDER_ANY - 1.

MFC after:	1 week
2010-09-30 17:05:23 +00:00
avg
2cfe78bdd9 kern_ntptime: drop a comment that became stale after r207359
MFC after:	1 week
X-MFC after:	r207359
2010-04-29 09:18:36 +00:00
avg
cce2a4186b periodically save system time to hardware time-of-day clock
This is done in kern_ntptime, perhaps not the best place.
This is done using resettodr().
Some features:
- make save period configurable via tunable and sysctl
- period of zero disables saving, setting a non-zero period re-enables
  it or reschedules it
- do saving only if system clock is ntp-synchronized
- save on shutdown

Discussed with:	des, Peter Jeremy <peterjeremy@acm.org>
X-Maybe:		save time near seconds boundary for better precision
MFC after:		2 weeks
2010-04-29 09:02:46 +00:00
avg
cbff4850b6 kern_ntptime: abstract time error check into a function
... to avoid code duplication

MFC after:	1 week
2010-04-29 09:02:21 +00:00
rwatson
877d7c65ba In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation.  This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after:	1 month
Discussed with:	imp, rink
2008-03-16 10:58:09 +00:00
rwatson
f5b0c94db5 Only require privilege to set the current time adjustment, not in order to
query it.
2007-06-14 18:37:58 +00:00
rwatson
69938bd196 Further system call comment cleanup:
- Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde)
- Remove extra blank lines in some cases.
- Add extra blank lines in some cases.
- Remove no-op comments consisting solely of the function name, the word
  "syscall", or the system call name.
- Add punctuation.
- Re-wrap some comments.
2007-03-05 13:10:58 +00:00
rwatson
300d4098cf Remove 'MPSAFE' annotations from the comments above most system calls: all
system calls now enter without Giant held, and then in some cases, acquire
Giant explicitly.

Remove a number of other MPSAFE annotations in the credential code and
tweak one or two other adjacent comments.
2007-03-04 22:36:48 +00:00
imp
faa5da2dc1 When ntp_gettime() was converted from a sysctl + wrapper to a system
call, its semantics were unintentionally changed.  It went from
returning the time state to returning 0 or -1.  Since 0 means time
normal, and non-zero effectively only shows up around leap seconds,
this went unnoticed until now.  At least unnoticed until someone was
trying to run a binary they didn't have source for and it was
misbehaving...

Submitted by: Judah Levine
MFC After: 2 weeks
2007-01-12 07:40:30 +00:00
rwatson
10d0d9cf47 Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
rwatson
f1dfea9d61 Explicitly acquire Giant around the ntp_gettime() and assert it in the
sysctl path.  While this code is close to MPSAFE, it may require some
additional locking.  Mark ntp_gettime1() as GIANT_REQUIRED for now.

Suggested by:	phk
2005-05-28 14:34:41 +00:00
jhb
72d1a40cb6 Implement kern_adjtime(), kern_readv(), kern_sched_rr_get_interval(),
kern_settimeofday(), and kern_writev() to allow for further stackgap
reduction in the compat ABIs.
2005-03-31 22:51:18 +00:00
imp
20280f1431 /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 23:35:40 +00:00
marks
2edc978740 Place function comment above the right function. 2004-11-19 00:58:30 +00:00
marks
d0ae6351d9 Add system call implementation of ntp_gettime(2).
Moved most of the work to ntp_gettime1(), which is now called by
ntp_gettime() and ntp_sysctl().

Reviewed by:	imp, phk, njl, peter
Approved by:	njl
2004-11-18 23:44:49 +00:00
phk
7a8abdd585 Annual NTP kernel code spring-cleaning:
Use int64_t rather than long long for the fixpoint type.

Don't discard fractional nanosecond frequency correction.
2004-03-14 15:23:05 +00:00
phk
81a3e3bfb9 Deal with MOD_FREQUENCY before MOD_OFFSET because the latter is the
one which runs the actual update.  This fixes a bug where there were
a delay in applying the frequency adjustment.  In extreme cases this
could result in marginal stability of the kernel-pll.
2004-01-24 21:48:43 +00:00
imp
27d211ac0c During a positive leap second, the tai_time offset should be
incremented at the start of the leap second, not after the leap second
has been inserted.  This is because at the start of the leap second,
we set the time back one second.  This setting back one second is the
moment that the offset changes.  The old code set it back after the
leap second, but that's one second too late.  The negative leap second
case is handled correctly.

Reviewed by: phk
2003-06-25 20:56:40 +00:00
obrien
3b8fff9e4c Use __FBSDID(). 2003-06-11 00:56:59 +00:00
peter
af85e6b2d6 Explicitly have the timecounter init happen after the cpu_initclocks is
called.  Otherwise (depending on a non-deterministic sort), the timecounter
code can be initialized before the clock rate has been set (on ia64) and it
assumes hz = 100, rather than the real value of 1024.  I'm not sure how much
gets upset by this.

Glanced at by:	phk
2003-01-06 01:01:08 +00:00
phk
ae4ae707ad Remove an unused variable. 2002-10-11 10:36:22 +00:00
phk
d1d55e6cb9 Hide the private parts of timecounter from a couple of places that don't
really need to know the gory details.
2002-04-26 21:31:44 +00:00
phk
f4a2041f29 suser is Giant safe, so optimize a pointless case. 2002-04-19 09:20:13 +00:00
phk
2edc95ffee Remove two debug printfs which should never have been committed. 2002-04-15 21:08:51 +00:00
jhb
f656d44b0b You have to cast int64_t's to long long if you printf them with %lld.
This now compiles on alpha without a warning.

Pointy-hat to:	phk
2002-04-15 21:04:32 +00:00
phk
b6bf4c07cf Improve the implementation of adjtime(2).
Apply the change as a continuous slew rather than as a series of
discrete steps and make it possible to adjust arbitraryly huge
amounts of time in either direction.

In practice this is done by hooking into the same once-per-second
loop as the NTP PLL and setting a suitable frequency offset deducting
the amount slewed from the remainder.  If the remaining delta is
larger than 1 second we slew at 5000PPM (5msec/sec), for a delta
less than a second we slew at 500PPM (500usec/sec) and for the last
one second period we will slew at whatever rate (less than 500PPM)
it takes to eliminate the delta entirely.

The old implementation stepped the clock a number of microseconds
every HZ to acheive the same effect, using the same rates of change.

Eliminate the global variables tickadj, tickdelta and timedelta and
their various use and initializations.

This removes the most significant obstacle to running timecounter and
NTP housekeeping from a timeout rather than hardclock.
2002-04-15 12:23:11 +00:00
phk
af54e26ee0 In the ntp_adjtime(2) syscall, return our actual estimate of unapplied
offset correction instead of the most recent offset applied.
2002-04-15 08:58:24 +00:00
jhb
dc2e474f79 Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API.  The entire API now consists of two functions
similar to the pre-KSE API.  The suser() function takes a thread pointer
as its only argument.  The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0.  The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on:	smp@
2002-04-01 21:31:13 +00:00
phk
a028cfa65d Revise timercounters to use binary fixed point format internally.
The binary format "bintime" is a 32.64 format, it will go to 64.64
when time_t does.

The bintime format is available to consumers of time in the kernel,
and is preferable where timeintervals needs to be accumulated.

This change simplifies much of the magic math inside the timecounters
and improves the frequency and time precision by a couple of bits.

I have not been able to measure a performance difference which was not
a tiny fraction of the standard deviation on the measurements.
2002-02-07 21:21:55 +00:00
julian
5596676e6c KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
dillon
b781e73eb6 Pushdown Giant for: profil(), ntp_adjtime(), ogethostname(),
osethostname(), ogethostid(), osethostid()
2001-09-01 05:47:58 +00:00
jhay
0bd9e525aa Update to the 2001-04-02 version of the nanokernel code from Dave Mills. 2001-04-16 13:05:05 +00:00
phk
06ca4c9ec0 Updates to the ntp pll from John Hay.
Submitted by:	jhay
2000-09-10 09:13:34 +00:00
phk
a6dc2ae9af Update the NTP kernel PLL code to the 2000-08-29 version of Dave Mills
nanokernel.

The FreeBSD private mode hardpps Type 2 PLL has been removed.
2000-09-04 08:19:32 +00:00
phk
e5de271d47 Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
Pointed out by:	bde
2000-07-04 11:25:35 +00:00
phk
61ff05be25 Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our
sources:

        -sysctl_vm_zone SYSCTL_HANDLER_ARGS
        +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
2000-07-03 09:35:31 +00:00
phk
1a34cea0e8 Isolate the Timecounter internals in their own two files.
Make the public interface more systematically named.

Remove the alternate method, it doesn't do any good, only ruins performance.

Add counters to profile the usage of the 8 access functions.

Apply the beer-ware to my code.

The weird +/- counts are caused by two repocopies behind the scenes:
	kern/kern_clock.c -> kern/kern_tc.c
	sys/time.h -> sys/timetc.h
(thanks peter!)
2000-03-20 14:09:06 +00:00