Handling clock drift

May 5, 2005 at 4:11 pm

All modern computers have clocks… But to save money, makers use very cheap hardware, which makes for inaccurate clocks. Now, if you have NTP to sync to an atomic clock somewhere, why does it matter if your clock is drifting? Because things like “sysUpTime” don’t get corrected when NTP updates the system clock on a router, for instance.

Let’s just say that I’ve found that a typical Cisco 2651XM router has a clock drift of about 5 minutes every 4 to 6 months or so…

Using SNMP (via some snazzy PHP scripts I wrote), I’m pulling the sysUpTime from about 950 Cisco 2651XM routers every day. I’m calculating back from that to find out when each router rebooted and comparing the result to a database. To account for small differences in time related to network latency, my original routines allowed as much as 5 minutes of difference between the calculated reboot time and the reboot time in the database. If there is more than that, it is assumed that the router rebooted, and the calculated time is placed in the database so we can easily keep track of the last time the router rebooted. (There’s an audit table too that lets us track all the database changes, but I’ll save that for another post.) If the difference between the old database value and the new calculated time is less than 5 minutes, I assume there was a small amount of variation due to how busy the circuits were, etc., and I ignore it, leaving the old value in the database. Oh, and an email is broadcast to people interested in these routers whenever any are found to have rebooted or had other changes applied.
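The check described above can be sketched roughly like this. It is a minimal Python illustration, not the PHP I actually run; the function names are made up for the example, and the only outside fact it leans on is that SNMP reports sysUpTime in hundredths of a second.

```python
from datetime import datetime, timedelta, timezone

# Allowance from the post: up to 5 minutes of difference is treated as noise.
TOLERANCE = timedelta(minutes=5)


def calculated_reboot_time(poll_time, sys_uptime_ticks):
    """Work backwards from the poll time to the moment the router booted.

    sysUpTime is an SNMP TimeTicks value, i.e. hundredths of a second.
    (Function name is hypothetical, for illustration only.)
    """
    return poll_time - timedelta(seconds=sys_uptime_ticks / 100)


def has_rebooted(stored_reboot, calculated_reboot, tolerance=TOLERANCE):
    """True when the calculated reboot time differs from the stored one
    by more than the tolerance, which is read as a real reboot."""
    return abs(calculated_reboot - stored_reboot) > tolerance


if __name__ == "__main__":
    poll = datetime(2005, 5, 5, 16, 11, tzinfo=timezone.utc)
    boot = calculated_reboot_time(poll, 8_640_000)  # 8,640,000 ticks = 1 day
    print(boot, has_rebooted(boot - timedelta(minutes=4), boot))
```

Taking the absolute difference means it doesn't matter whether the router's counter runs fast or slow; either direction eventually crosses the 5 minute line.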

After running for a few months, I noticed that I started getting unusual info in these alert emails… For example, a few days ago the email told me that the new “RouterRebootTime” was something like Nov 25th, 2004 at 09:04, and that the old “RouterRebootTime” was Nov 25th, 2004 at 08:59. Now, remember, months had passed since November. Other routers listed in the same email had reboot date/timestamps for the current week (let’s say this started in March)…

At least once a week I saw these… Then it seemed to pick up to a few times a week… Recently, I was getting one or two of these a day. The further we get from the originally calculated reboot time, the more routers drift past the 5 minute allowance and show up in the email…

Anyhow, I changed my methodology for handling these… If the new calculated reboot time is over a week old, I’m now checking that value against the database value, and if there is a difference of 24 hours or more, then I’m updating the database. If the calculated reboot time is less than a week old, I’m looking for the 5 minute difference, just like I used to…
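Here is a rough Python sketch of that two-tier rule, again just an illustration rather than my real PHP. The function name and the `now` parameter are assumptions made for the example; the one-week, 24 hour and 5 minute thresholds are the ones described above.

```python
from datetime import datetime, timedelta

ONE_WEEK = timedelta(weeks=1)
OLD_REBOOT_TOLERANCE = timedelta(hours=24)     # for reboots over a week old
RECENT_REBOOT_TOLERANCE = timedelta(minutes=5)  # the original allowance


def needs_update(stored_reboot, calculated_reboot, now):
    """Decide whether the database value should be replaced.

    (Hypothetical helper; all three arguments are datetime objects.)
    """
    difference = abs(calculated_reboot - stored_reboot)
    if now - calculated_reboot > ONE_WEEK:
        # Old reboot: any small difference is most likely accumulated drift.
        return difference >= OLD_REBOOT_TOLERANCE
    # Recent reboot: drift has had no time to build up, so stay strict.
    return difference > RECENT_REBOOT_TOLERANCE


if __name__ == "__main__":
    # The false alarm from the email: 08:59 in the database, 09:04 calculated.
    stored = datetime(2004, 11, 25, 8, 59)
    calculated = datetime(2004, 11, 25, 9, 4)
    print(needs_update(stored, calculated, now=datetime(2005, 5, 5)))  # False
```

A real reboot still gets caught, because the freshly calculated reboot time is then less than a week old and falls back to the strict 5 minute comparison.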

So, the possibility still exists that I’ll get these “false alarms” in my email, but since the times will have to be 24 hours apart, and so far I’ve only seen differences of 5 minutes in several months, I imagine I won’t see too many of these any time soon… In fact, I imagine that sysUpTime will roll over before there is 24 hours of drift.
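A quick back-of-the-envelope check of that last claim, as a Python sketch. It assumes sysUpTime is a 32-bit TimeTicks counter in hundredths of a second (as SNMP defines it), takes the worst drift rate I’ve seen (5 minutes in 4 months), and treats a month as 30 days; the function names are made up for the example.

```python
SECONDS_PER_DAY = 86_400


def rollover_days(counter_bits=32, ticks_per_second=100):
    """Days until a TimeTicks counter of the given width wraps to zero."""
    return (2 ** counter_bits) / ticks_per_second / SECONDS_PER_DAY


def days_until_drift(target_seconds, drift_seconds, over_days):
    """Days needed to accumulate target_seconds of drift at a steady rate
    of drift_seconds per over_days."""
    return target_seconds / drift_seconds * over_days


if __name__ == "__main__":
    wrap = rollover_days()
    # Worst case from my numbers: 5 minutes (300 s) every 4 months (~120 days).
    full_day = days_until_drift(24 * 3600, 300, 120)
    print(f"sysUpTime wraps after about {wrap:.1f} days")
    print(f"24 hours of drift takes about {full_day / 365.25:.0f} years")
```

So the counter wraps in a bit under a year and a half, while a full day of drift at that rate would take decades.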

Entry filed under: Networking.
