1. flavie+(OP)[view] [source] 2013-11-13 00:44:36
Distributed systems design aside, the core of the problem is that they relied on NTP (as they probably should), and in their case NTP was not working properly.
replies(4): >>device+84 >>scottd+Ok >>donava+rn >>global+Vo
2. device+84[view] [source] 2013-11-13 01:53:43
>>flavie+(OP)
And this is precisely why a thing that is not monitored is not actually a thing.
replies(2): >>olefoo+A8 >>specia+Sk
3. olefoo+A8[view] [source] [discussion] 2013-11-13 03:10:17
>>device+84
> "A thing that is not monitored is not actually a thing."

That should be on a cross-stitch sampler on the wall of every NOC.

4. scottd+Ok[view] [source] 2013-11-13 07:19:00
>>flavie+(OP)
The key takeaway from the article SHOULD be: don't rely on NTP if you don't have to.

There are people who have to. They run their own atomic clocks, and worry about things such as precision delivery of nuclear ordnance.

Then there's you. You should use vector clocks, with a built-in conflict resolution mechanism based on domain knowledge.

That's the point of the article.
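
For readers unfamiliar with the idea, here is a minimal sketch of a vector clock in Python. It is an illustration only, not code from the article: the function names (`increment`, `merge`, `compare`) and the node labels are assumptions made for the example. The point is that `compare` can report two updates as concurrent, which is where a domain-specific conflict resolver would have to step in.

```python
from typing import Dict

# A vector clock maps a node name to the number of events seen from that node.
VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Combine two clocks by taking the per-node maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify a relative to b: 'equal', 'before', 'after', or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    # Neither clock dominates: the writes raced, so no ordering can be
    # inferred and application-level conflict resolution is required.
    return "concurrent"


# Two replicas diverge from a shared starting point.
base = increment({}, "a")
left = increment(base, "a")   # a second write on node "a"
right = increment(base, "b")  # an independent write on node "b"
```

Here `compare(left, right)` reports the two writes as concurrent, and no wall-clock timestamp is consulted at any point.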

5. specia+Sk[view] [source] [discussion] 2013-11-13 07:20:45
>>device+84
Nice. Much stronger than the "you can only manage what you measure" adage I learned from accounting.
6. donava+rn[view] [source] 2013-11-13 08:25:19
>>flavie+(OP)
NTP is good. Assuming that time is coordinated, much less monotonically increasing, is a bad plan. Just the other week I got paged in the middle of the night because a clock moved backwards.
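
As a small illustration of the point about monotonicity, here is a sketch in Python that measures elapsed time with `time.monotonic()` instead of `time.time()`. The helper name `timed` is hypothetical. The wall clock returned by `time.time()` can be stepped backwards by a clock adjustment, while the monotonic clock is documented never to go backwards, so it is the safer choice for durations and timeouts on a single machine. It does not help with ordering events across machines.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn and return (result, elapsed seconds) using the monotonic clock.

    time.monotonic() is unaffected by system clock updates, so the elapsed
    value cannot come out negative even if the wall clock is stepped back
    while fn is running.
    """
    start = time.monotonic()
    result = fn()
    elapsed = time.monotonic() - start
    return result, elapsed


result, elapsed = timed(lambda: sum(range(1000)))
```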
7. global+Vo[view] [source] 2013-11-13 08:57:08
>>flavie+(OP)
Even if NTP had been working properly, you would not have clocks synchronised at the level of individual ticks, only to the level of time intervals. If two updates happened at roughly the same time and fell into the same time interval, there would be no way to tell which one happened before the other. A paper by Cilia et al. on timing of composite events in distributed event-based systems using NTP deals with this issue.
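
One way to picture this is to treat each timestamp as an interval rather than a point. The Python sketch below is a generic illustration and is not taken from the paper mentioned above; the class name `IntervalTimestamp`, the field `max_error`, and the numbers used are all assumptions. Two events can be ordered only when their intervals do not overlap; otherwise the honest answer is "unknown".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IntervalTimestamp:
    reading: float    # local clock reading, in seconds
    max_error: float  # assumed bound on clock error, in seconds

    @property
    def earliest(self) -> float:
        return self.reading - self.max_error

    @property
    def latest(self) -> float:
        return self.reading + self.max_error


def happened_before(a: IntervalTimestamp, b: IntervalTimestamp) -> Optional[bool]:
    """Return True if a is definitely before b, False if definitely after,
    and None when the intervals overlap and the order cannot be determined."""
    if a.latest < b.earliest:
        return True
    if b.latest < a.earliest:
        return False
    return None


# With a hypothetical 50 ms error bound, events 200 ms apart can be ordered,
# but events 40 ms apart cannot.
far_apart = happened_before(IntervalTimestamp(10.0, 0.05), IntervalTimestamp(10.2, 0.05))
too_close = happened_before(IntervalTimestamp(10.0, 0.05), IntervalTimestamp(10.04, 0.05))
```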
replies(1): >>Dylan1+6y
8. Dylan1+6y[view] [source] [discussion] 2013-11-13 12:12:55
>>global+Vo
But this is not a problem in many situations, whereas successor writes failing for an entire 30-second span is a pretty big problem.