Extremely accurate system time: Chrony (not cron), PTP NIC clock and NIST (atomic clock based) time servers

I believe it is no longer necessary to use NTP every so often to resynchronize your system clock. A majority of new motherboards have a NIC with a PTP clock that supports hardware time-stamping. Using this internal clock, rather than the standard RTC of the motherboard, the system time will remain more stable with less overall drift.

There is a very high chance you have a NIC hardware clock in your system. To check a device's time-stamping capabilities:

ethtool -T device

To find the device name:

ip a

ip a is a shorter version of ip addr, which is the replacement for ifconfig.

Enter the active network device name in place of device to check its capabilities. I assume most wireless devices currently do not support PTP or hardware time-stamping.
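If you want to script that check, here is a minimal Python sketch that reads the text printed by ethtool -T and reports whether a PTP hardware clock and hardware time-stamping show up. The sample text and the function name parse_ethtool_timestamping are made up for illustration, and the exact output layout differs between ethtool versions and drivers, so treat the string matching as an assumption rather than a guaranteed format.

```python
import re

# Illustrative sample only; real `ethtool -T <device>` output differs
# between ethtool versions and drivers.
SAMPLE_OUTPUT = """\
Time stamping parameters for eth0:
Capabilities:
        hardware-transmit
        software-transmit
        hardware-receive
        software-receive
        software-system-clock
        hardware-raw-clock
PTP Hardware Clock: 0
"""


def parse_ethtool_timestamping(text):
    """Pull the PHC index and hardware time-stamping flags out of the text."""
    match = re.search(r"PTP Hardware Clock:\s*(\S+)", text)
    phc = match.group(1) if match else "none"
    return {
        # "none" (or a missing line) is treated as "no PHC exposed".
        "phc_index": int(phc) if phc.isdigit() else None,
        "hardware_tx": "hardware-transmit" in text,
        "hardware_rx": "hardware-receive" in text,
    }


info = parse_ethtool_timestamping(SAMPLE_OUTPUT)
print(info)
```

On a real machine the text would come from something like subprocess.run(["ethtool", "-T", "eth0"], capture_output=True, text=True).stdout, where eth0 is a placeholder for your own device name.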

I just recently learned about Chrony, which I assumed was something like cron, the task scheduler. Chrony is probably the most interesting Linux program there is regarding timekeeping. It doesn’t poll an NTP server constantly. Instead, through the wondrous power of arithmetic, it calculates drift.

I assume the drift can only be calculated reliably from NTP syncs spaced over longer time intervals. However it works, it will eventually figure out how much your RTC and PTP clocks are drifting. I am not sure if it changes the frequency of the RTC, but as I understand it so far, it temporarily adjusts the PTP hardware clock in the NIC to eliminate the offset from the NTP source and constantly monitors the clock accuracy while the service is running, even when not syncing to an external server.
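To make the "arithmetic" part concrete, here is a rough Python sketch of how drift could be estimated from a handful of offset measurements: fit a straight line through offset versus elapsed time and read the slope off in parts per million. This is only an illustration of the idea, not chrony's actual algorithm, and the sample measurements are hypothetical.

```python
def estimate_drift_ppm(samples):
    """Least-squares slope of clock offset against elapsed time, in ppm.

    samples: list of (elapsed_seconds, offset_seconds) pairs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_t = sum(t for t, _ in samples) / n
    mean_o = sum(o for _, o in samples) / n
    covariance = sum((t - mean_t) * (o - mean_o) for t, o in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    return covariance / variance * 1e6  # seconds per second -> ppm


# Hypothetical measurements: one sync every 1024 s against a clock that
# gains 4 microseconds per second, plus a fixed 2 ms starting offset.
samples = [(i * 1024, 0.002 + 4e-6 * i * 1024) for i in range(8)]
print(f"estimated drift: {estimate_drift_ppm(samples):.3f} ppm")
```

The fixed starting offset does not affect the slope, which is why a constant error and a frequency error can be corrected separately.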

I think this is as close as one can get to the most accurate possible system clock when PTP servers are, for unknown reasons, not accessible with software.

NIST, the National Institute of Standards and Technology, operates atomic clocks and has developed an even more accurate atomic clock. I get the impression that using their service (the one recommended for NTP only), openly listed on their website, in combination with chrony set to use the PTP (or PHC) clock is far more accurate than using basic timedatectl with the NTP Linux pool relying on the RTC.

I think this would be a great way to vastly reduce our reliance on NTP servers and make use of the new hardware clocks many of us already have in our machines. Might as well put that NIC hardware clock to good use!


Already to :wink: GPS Disciplined Rubidium Clock at home because I was a bored engineer. I basically built a GPSD-OCRCXO lol. There are also several resources out there for making your own time reference, such as the open timecard project.

NTP isn’t bad to rely on in any form. It’s fine for what it does. PTP was designed for scientific purposes. For all intents and purposes (at least terrestrially) most people don’t even need sub-ms accurate time. Data centers, on the other hand, need sub-µs lol. My clock, as well as the timecard projects out there, has sub-ns accuracy. Your consumer system RTC is still going to drift far more than these clocks, quite frequently.

What I want to do now is use PoE to power a small microcontroller with NTP capability to be an alarm clock in a 3D-printed package. Would be cool.


Can I plug my blog post?

It’s not PTP, but I’m currently going down that road and the next guide will include it.


That is NOT how NTP works.

Chrony isn’t unique in doing that. Since the mid-1990s, ntpd has done just that:

ntpd learns and remembers the clock drift and corrects it autonomously, even if there is no reachable server. Therefore large clock steps can be avoided while the machine is synchronized to some reference clock. In addition ntpd will maintain error estimates and statistics, and can offer NTP service for other machines.
https://www.ntp.org/ntpfaq/ntp-s-config/#611-cant-i-just-run-ntpdate
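To put rough numbers on why remembering the drift matters, here is a small Python sketch that converts a constant frequency error in ppm into accumulated time error. Both ppm figures are assumed values picked for illustration, not measurements.

```python
SECONDS_PER_DAY = 86_400


def accumulated_error(freq_error_ppm, elapsed_s):
    """Time error (s) built up by a constant frequency error over elapsed_s."""
    return freq_error_ppm * 1e-6 * elapsed_s


# Assumed numbers for illustration only.
raw_ppm = 4.0        # uncorrected crystal frequency error
residual_ppm = 0.01  # error left after applying the remembered drift

print(f"free-running: {accumulated_error(raw_ppm, SECONDS_PER_DAY) * 1e3:.1f} ms/day")
print(f"corrected:    {accumulated_error(residual_ppm, SECONDS_PER_DAY) * 1e3:.3f} ms/day")
```

Under these assumptions an uncorrected clock wanders by a few hundred milliseconds a day, while one that applies its learned correction stays under a millisecond even with no server reachable.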

I just skimmed through OP’s post…

A few comments:

NTP is the name of a standardised protocol as well as the name of a daemon and a set of tools which first implemented it. First mover usually enjoys such honour.

Chrony was developed by the Department of Physics at UBC, Vancouver. The NTP daemon (the program) was also developed by a science department at a university. Chrony came much more recently and hence learned a lot from the previous endeavour.

As much as I want to praise Chrony, you give proper credit where it’s due.

The hardware clock in a normal server is going to be quartz, so it’s not that precise and the clock will drift over time.
I’d be more inclined to ask the other curious minds here their thoughts regarding

That’s why people use NTP (either NTP daemon or Chrony) to discipline system clock. The drift isn’t an issue in itself.

A very niche product that some government & business institutions may find a bargain. Overpriced and over-spec’ed for everyone else, I would think.

It’s essentially a Ublox LEA-M8F. The external OCXO only meaningfully kicks in when you lose satellite signals.

So you would see no benefit in having a more PRECISE clock locally? What I mean is - if the buzzy buzz buzz vibrations say 1 second is more precisely 1 second, or better still 1ms is precisely 1ms, the radioactive ones are ns precise.

It would have to sync with NTP less frequently, since it’s drifting less. Which would make it a “better” time source for local clocks, since they will have less latency to reach it.

I guess what I am asking is, isn’t this the exact kind of thing to build a stratum 2 clock with?
While NTP itself is pretty good, I’m sure this makes a valid improvement, no?
As an example, I am stratum 2 at home, but this could use a lot of improvement:

ok…I believe you’re asking: if the quartz crystal on your motherboard is replaced with an OCXO (oven-controlled crystal oscillator), will you get better system time accuracy and jitter as a stratum 2 time source?

The answer is: most likely not.

You can achieve sub-microsecond accuracy with a GPS module attached locally to your computer. It’s very cheap…from $10 to $50.


I guess I am trying to understand the ratio of importance here. I have a 5 Gb fiber connection in my house, so latency to the outside world should be less of an issue?

Hitting lots of NTP servers here, can probably add more, but overwhelmingly because of some CoLo agreement Google is way more accurate by far than NIST and whatever other sources I had put in there.

If my offset is -1.077 from google consistently, would an OCXO not bring that value closer to 0?

It does stabilize,

So I feel like I can improve the 3ms offset while keeping the source clocks on the internet?

I think you answered your own question. The less network latency between your home and the time server, the better the accuracy and jitter.

OCXO won’t help here at all. You’re already doing pretty good with <=2ms offset and jitter.

Very few lucky people are living next to a time source (data centres where Google/Cloudflare put their CoLo servers). If you move your home close to one of them, then you’ll get sub-millisecond accuracy and jitter. That’s probably the best you could achieve with NTP over network.

A GPS module could bring it down to sub-microsecond level. A thousand times better at a much cheaper cost.
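For anyone wondering why the network path matters so much, here is a short Python sketch of the standard NTP on-wire calculation (the four-timestamp offset and delay formulas from RFC 5905). The simulate_exchange helper and the latency numbers are hypothetical, only there to show what happens when the outbound and return paths are not equally fast.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation from the four timestamps.

    t1: client send, t2: server receive, t3: server send, t4: client receive.
    t1 and t4 are read from the client clock, t2 and t3 from the server clock.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def simulate_exchange(true_offset, out_s, back_s, server_proc_s=0.0001):
    """Build the four timestamps for a server clock that is true_offset ahead."""
    t1 = 0.0
    t2 = t1 + out_s + true_offset
    t3 = t2 + server_proc_s
    t4 = t3 - true_offset + back_s
    return t1, t2, t3, t4


# Hypothetical paths: same 1 ms true offset, symmetric vs asymmetric latency.
for out_s, back_s in [(0.005, 0.005), (0.008, 0.002)]:
    offset, delay = ntp_offset_delay(*simulate_exchange(0.001, out_s, back_s))
    print(f"out={out_s * 1e3:.0f} ms back={back_s * 1e3:.0f} ms -> "
          f"measured offset {offset * 1e3:.2f} ms, round trip {delay * 1e3:.2f} ms")
```

The calculation assumes both directions take the same time. Any asymmetry shows up directly as offset error (half the difference between the two one-way delays), and the error can never exceed half the round trip, so a shorter path to the server shrinks the worst case no matter how good the local oscillator is.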


So next steps would be

  • GPS receiver: ns territory.
  • Oscillators are not oscillating fast enough: you can get further gains by adding an OCXO.
    I can fudge 2 and know I have a good enough clock?
    # your local system clock, could be used as a backup
    # (this is only useful if you need to distribute time no matter how good or bad it is)
    #server 127.127.1.0
    # but it should operate at a high stratum level to let the clients know and force them to
    # use any other timesource they may have.
    #fudge 127.127.1.0 stratum 2
  • ???
  • Atomic clock

Also - the fact that some German company ported ntpd to Windows 20 years ago, has maintained it since, and even includes a GUI. <3 me some Meinberg

Okay, I am literally insane.
After seeing this, I am in the rabbit hole.
I’m going to have the most ACCURATE time at LTX 2023 - YouTube

But I saw the livestream today and I have to make my time more perfect for the fun of it.

I’ve modified my config to try to keep more data sources for more consistently accurate time.

I’ve set up an identical config on my other lab AD domain controller, and I’ve started moving my workstations off of the default w32tm version of SNTP and am using a different

Since all of my Linux VMs are already using NTP, I just re-configured their servers the same as above.

What’s fascinating is that they both have now tuned into milbarge, which is running on a TrueNAS Mini under my desk, on the same switch as them, rather than what I would consider my primary one in my network closet.

Everything else is now pointing at emmett, which is in some cases in the same server but a different VM, or is attached to that switch.

On my VM host where the NTP server emmett2 lives

So it seems NTP really does have some stories to tell us, without getting into Chrony vs NTP or even going down the PTP path. NTP appears to be noticing the ns level difference in time between hosts on this switch, and the NTP server 1 switch away at L2.0

I think this really illustrates the needs of having a proper stratum hierarchy in a large network. I never really inherently understood that, but I think now I do. Any other clock nerds wanna chime in? @wendell are there dozens of us??

I kind of feel like all of this means we should just have an NTP server running on everything…and we kinda already do. But it should be configured to somehow determine its closest neighbors and automatically take them into consideration through some discovery method. ’Tis what stratums are all about, after all. With everything being as fast as it is today, I think we could crowdsource time a lot more aggressively. Maybe we could create a mesh network overlay for NTP.

Imagine having a bunch of phones out there with…GPS and cellular time on board…with plenty of horsepower to drive an NTP server…all available to compare themselves against. And then harvesting that network to improve your own local time on other systems. The NTP pool project makes a lot more sense now :slight_smile:

My time is pretty close now with just those changes. I think we can move the decimal point over a place and get closer to 0.008; we’ll see just how close we get when I get my fancy clock. lol.

FWIW my phone is getting .001 right now not on my network. Which is fascinating.


Bog standard Windows 10 polling a local stratum 1 NTP server every 60 seconds. Some optimisation was done to improve the accuracy.

The NTP server is just a Raspberry Pi 4 running Chrony. Time on the Pi 4 is disciplined using a GPS module, with the PPS signal piped into a GPIO pin.

Hmmm.
Am I misunderstanding something? It seems that more precise time would have tangible low level benefits in the IP stack.

TCP windowing and fragmentation for networks with large MTUs should benefit as an example, since these are measured by RTT.

Just using their software stack I can report much more precise time, and I haven’t gotten or made any hardware modifications. This is a greater than 10x improvement…and I haven’t really done anything?

So I am testing the claims here of the TimeBeat stack.
Timebeat is better than ptp4l by a mile - YouTube
They claim to be better than ptp4l. I am using them in the most basic of configuration options. My source of UTC time is still NTP over the internet.
I am using NTP both as a source of UTC and to connect my clients through the TimeBeat implementation of NTP, and while I have PTP enabled I don’t think it’s being leveraged in any way.

TLDR HERE: So far I have seen that, in terms of performance for local synchronization: w32tm and default AD functionality for time < Meinberg NTPd for Windows < Timebeat
Again, with NO hardware changes. No fancy crystal oscillators in ovens, no GPS source of UTC.

Using the Meinberg implementation of NTP has been my source of UTC forever. All of my clients were getting UTC from this server, either by w32tm or an implementation of NTP.

Previously, it was seeing an offset as high as 5ms with some fairly wide scatter points, meaning low precision.

After tweaking my Meinberg a couple days ago I was able to get it tighter:

Running Timebeat for UTC over NTP while also running Meinberg NTP server for UTC:

Running the same Timebeat iPerf3 test between my desktop and VMhost:

Over about 10 minutes of load and 15 minutes of datapoints, we are seeing accuracy of about 1ms with a traffic load applied, and without.

I wonder if there is an easy way to re-create their test grafana instance using only NTP to compare. I’m not really sure what this all means.

Time.is thinks my workstation is right on the money

So it seems we have at least achieved millisecond levels of accuracy. But from the scatterplot it looks like we lack some precision. I am going to flip on the PTPSquared functionality and see if we can improve the precision.
PTP+Squared: intelligent path selection - YouTube

Here’s an easy way to visualize it:

  • A watch that is always five minutes ahead is not accurate, but it is precise because it consistently reads five minutes ahead.
  • A watch that sometimes runs five minutes ahead, sometimes five minutes behind, and sometimes on time, is neither accurate nor precise.
  • A watch that is always on time is both accurate and precise.
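The same three watches can be put into numbers with a few lines of Python. This is just a sketch of one common way to quantify it: bias (mean error) stands in for accuracy, spread (standard deviation) stands in for precision, and RMS error shows how far off a single reading typically is. The readings are made-up errors in minutes.

```python
from math import sqrt
from statistics import mean, pstdev


def describe(errors_min):
    """Return (bias, spread, rms) for readings given as error in minutes."""
    bias = mean(errors_min)      # systematic error: large means inaccurate
    spread = pstdev(errors_min)  # scatter between readings: large means imprecise
    rms = sqrt(mean([e * e for e in errors_min]))  # typical error of one reading
    return bias, spread, rms


# Made-up readings for the three watches in the list above.
watches = {
    "always five ahead": [5, 5, 5],
    "ahead, behind, on time": [5, -5, 0],
    "always on time": [0, 0, 0],
}
for name, errors in watches.items():
    bias, spread, rms = describe(errors)
    print(f"{name:24s} bias={bias:+.2f} spread={spread:.2f} rms={rms:.2f}")
```

Note that the second watch averages out to zero bias, but any single reading can still be five minutes off, which is why it counts as neither accurate nor precise in practice.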

From a homelab perspective, and even a general IT sysadmin perspective, I’m not sure if millisecond level accuracy to UTC really matters. But I do think that having more precise clocks matters locally.

Also, is there a way to test if any of this even matters? Like in theory anything and everything relying on RTT to make decisions should benefit from more precision?

Remember, compared with where I started (which, to be fair, was actually worse with OOB Windows settings), my time is now between 10 and 100 times more accurate to UTC, while still only using NTP. I’m not sure if what I’ve done matters. So I’m less sure that further improvements in accuracy will matter, but further improvements to precision relative to systems locally might?

I have no context or reference points…so I hope my exploration here helps the next folks that come down this rabbit hole. There’s just so much code up and down the stack that assumes time is not perfectly synchronized. I’m just not sure where the point of diminishing returns is in light of that fact.

EDIT: Just want to note the patterns we see in the Meinberg reports. My first implementation polled far less frequently, and the modifications I made a couple of days ago were to increase that frequency. The Timebeat implementation of NTP seems to be analogous to Ludicrous mode on a Tesla. The frequency is insane. I’m surprised the internet NTP servers haven’t blocked me.

Not sure if I am hurting performance elsewhere in the stack here, but it seems that there is not a lot of additional load on the individual servers themselves. I don’t have good enough telemetry to report on networking, though.


I think the entire purpose of this stream of consciousness was to prove that OP’s statement. But I realized I’d lack the knowledge to refute his claims despite disagreeing with those assertions.

And this:

It seems that I have proven that protocol layer changes, such as the OP’s suggestion of using Chrony, can improve relative time precision and accuracy. But Chrony is still just NTP under the hood, so I validated this with Meinberg, which is my favorite NTP server since I am a Windows user in a Unix world.

These were fairly insightful for helping to understand NTP a bit better:

And this one a bit longer but interesting:
PWLSF - Bryan Fink on “A Brief History of NTP Time: Memoirs of an Internet Timekeeper” - YouTube

It seems that, really, NTP was inherently designed to increase the accuracy of our clocks. A relative increase in precision is really a side effect of that. So before going down the rabbit hole of PTP itself, at least I now better understand why PTP exists and the problem it is trying to solve.

In normal human interactions, relative time is far more important than having an accurate understanding of what real time is. See Jerry Seinfeld.

We also see the importance of relative time in the animal kingdom
Salmon run - Wikipedia
Bird migration - Wikipedia

Try asking a bird what Greenwich Mean Time is…and yet they inherently just know from their internal body clocks it’s time to move on. Sometimes they are extremely accurate down to the day for their migration.

Just like Jerry Seinfeld knows exactly when in a conversation to throw in a joke so that it doesn’t go flat. The difference in humans is that our interactions happen much quicker, so precision matters a lot more.

You’d think computers which were invented for both accuracy and precision would have had a mechanism such as PTP much sooner, but PTP as a standard was recently updated in 2019, and only first dropped in 2002. 1588-2019 - IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems | IEEE Standard | IEEE Xplore

Whereas David Mills invented NTP in 1985, with version 4 being from 2010. RFC 5905 - Network Time Protocol Version 4: Protocol and Algorithms Specification (ietf.org)

Why is it, as computer systems administrators, that we collectively did not recognize a need to drive precision upwards for 17 years? Why is it 21 years after such a protocol was created that it still doesn’t see ubiquitous adoption? Is NTP really not good enough as a one size fits all?

From a historical perspective, it seems that “what is good enough” has morphed over this time period. But even still clocks in 1990 were pretty dang accurate.

Clients with high stratum levels are not uncommon. A laptop or iPad is going to sync with a single peer on the internet by default. This is only possible today because we have a lot more bandwidth and compute to go around than we did in the 90s.

Now as sysadmins we are commonly driving to keep at least stratum two time highly available for clients to get time from. Multiple sources of good time locally will help drive accuracy and precision. Are efficiency gains beyond that still not worth it?

Is that not good enough?
Do we need PTP???
This looks relevant: Computer Network Time Synchronization: The Network Time Protocol on Earth and in Space, Second Edition 2, Mills, David L., eBook - Amazon.com


I used my always-on server to provide ntp time to all clients I can configure that way. Here are some time stats from server and clients:

server:
The only change to chrony.conf was to allow NTP client access from local network.

chronyc> tracking
Reference ID    : 32CD3926 (50.205.57.38)
Stratum         : 2
Ref time (UTC)  : Tue Aug 01 00:03:10 2023
System time     : 0.000189060 seconds fast of NTP time
Last offset     : +0.000019050 seconds
RMS offset      : 0.000076620 seconds
Frequency       : 4.173 ppm slow
Residual freq   : +0.000 ppm
Skew            : 0.010 ppm
Root delay      : 0.015767990 seconds
Root dispersion : 0.000467045 seconds
Update interval : 1036.2 seconds
Leap status     : Normal
chronyc> sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^- 208.67.72.43                  3  10   377   595  -2050us[-2031us] +/-   81ms
^+ t1.time.bf1.yahoo.com         2  10   377   436  +1139us[+1139us] +/- 7537us
^- io.netbunker.org              2  10   377   421   +819us[ +819us] +/-   42ms
^- 66.85.78.80                   2  10   350   80m  +2058us[+2178us] +/-   72ms
^- spidey.rellim.com             2  10   273   878  -3065us[-3046us] +/-   81ms
^* 50.205.57.38                  1  10   277   446    +19us[  +39us] +/- 8617us
^- 72-46-61-205.lnk.ne.stat>     2  10   377   823  +3713us[+3732us] +/-   59ms
^- t2.time.gq1.yahoo.com         2  10   377   226  +1752us[+1752us] +/-   35ms
^- ip229.ip-51-81-226.us         2  10   340  104m   +955us[+1275us] +/-   99ms
^- 108.61.73.244                 2  10   377   269  +1089us[+1089us] +/-   29ms
^- 192.189.65.187                2  10   326   43m  +3915us[+4025us] +/-   33ms
^- 108.61.23.93.vultruserco>     2  10   377   119  -1205us[-1205us] +/-   47ms
chronyc> sourcestats
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
208.67.72.43               11   5  172m     -0.108      0.127  -2134us   291us
t1.time.bf1.yahoo.com      49  22   13h     +0.001      0.011   +580us   318us
io.netbunker.org           35  17   10h     +0.007      0.024    +86us   438us
66.85.78.80                 7   5  120m     -0.039      0.374  +1981us   383us
spidey.rellim.com          44  25   15h     +0.006      0.020  -2245us   642us
50.205.57.38               28  14  551m     +0.009      0.025   -468us   361us
72-46-61-205.lnk.ne.stat>  26  13  431m     +0.006      0.091  +2087us  1005us
t2.time.gq1.yahoo.com      25  19  413m     +0.015      0.066    -58us   705us
ip229.ip-51-81-226.us      21  12  516m     -0.057      0.146  -1797us  1685us
108.61.73.244              13   8  224m     -0.028      0.145   +208us   519us
192.189.65.187             15   9  397m     +0.004      0.044  +3652us   324us
108.61.23.93.vultruserco>  33  16   10h     -0.010      0.028   -656us   526us
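A side note on reading the tracking output above: for an IPv4 source, the Reference ID is simply the server's address written as eight hex digits. Here is a tiny Python sketch that decodes it. The function name refid_to_ipv4 is made up for this example, and the decoding only applies to IPv4 sources, since reference clocks and IPv6 sources use the field differently.

```python
import ipaddress


def refid_to_ipv4(refid_hex):
    """Decode an 8-hex-digit Reference ID into dotted-quad IPv4 notation."""
    return str(ipaddress.IPv4Address(int(refid_hex, 16)))


print(refid_to_ipv4("32CD3926"))  # 50.205.57.38
```

That matches the address chronyc prints in parentheses next to the Reference ID, and it is the same source marked with ^* in the sources list.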

clients:
I’ve been chasing Linux/Windows clients to use my server as their NTP source. Will take note if anyone has figured out how to configure the NTP source for iOS clients.

chronyc> tracking
Reference ID    : C0A80AFE (192.168.10.254)
Stratum         : 3
Ref time (UTC)  : Tue Aug 01 00:16:04 2023
System time     : 0.000202612 seconds slow of NTP time
Last offset     : -0.000212449 seconds
RMS offset      : 0.000097993 seconds
Frequency       : 3.338 ppm slow
Residual freq   : -0.006 ppm
Skew            : 0.053 ppm
Root delay      : 0.016194632 seconds
Root dispersion : 0.000800681 seconds
Update interval : 1032.0 seconds
Leap status     : Normal
chronyc> sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^* 192.168.1.254                2   9   377   105   -279us[ -491us] +/- 9094us

So, this yields < 1ms time offset for all configured NTP clients on my network with minimal one-time configuration overhead.
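For anyone reading the sources tables above for the first time, two of the columns are encoded: Poll is the base-2 logarithm of the polling interval in seconds, and Reach is an 8-bit register printed in octal, one bit per recent poll. A small Python sketch to decode both (the helper names are made up for this example):

```python
def poll_interval_seconds(poll):
    """The Poll column is the base-2 logarithm of the polling interval in seconds."""
    return 2 ** poll


def reach_history(reach):
    """Decode the octal Reach register into 8 flags, most recent poll first."""
    bits = int(reach, 8)
    return [bool((bits >> i) & 1) for i in range(8)]


print(poll_interval_seconds(10))  # 1024
for reach in ("377", "277", "350"):
    flags = reach_history(reach)
    print(reach, "".join("Y" if ok else "-" for ok in flags))
```

So a Poll of 10 means roughly one request every 17 minutes per server, which lines up with the update interval of about 1036 seconds in the tracking output, and a Reach of 377 means the last eight polls were all answered.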

Curious, but skeptical about real-life benefits of switching to PTP.

Cheers.
