zlacker

HN is up again
1. sillys+z 2022-07-08 20:34:23
>>tpmx+(OP)
HN was down because the failover server also failed: https://twitter.com/HNStatus/status/1545409429113229312

Double disk failure is improbable but not impossible.

The most impressive thing is that there seems to be almost no data loss whatsoever. Whatever the backup system is, it seems rock solid.

◧◩
2. hunter+g1 2022-07-08 20:37:16
>>sillys+z
What was the test to determine the data loss?
◧◩◪
3. Cerium+r1 2022-07-08 20:38:25
>>hunter+g1
I came to the same conclusion by observing that there are posts and comments from only eight hours ago.
◧◩◪◨
4. jbvers+G2 2022-07-08 20:43:54
>>Cerium+r1
So that means data loss. Probably restored from backup.

Good news for people who were banned, or for posts that didn't get enough momentum :)

Edit: it was restored from backup, so definitely data loss.

◧◩◪◨⬒
5. dang+SZ 2022-07-09 01:33:45
>>jbvers+G2
8 hours of downtime, but not data loss, since there was no data to lose during the downtime.

Last post before we went down (2022-07-08 12:46:04 UTC): https://news.ycombinator.com/item?id=32026565

First post once we were back up (2022-07-08 20:30:55 UTC): https://news.ycombinator.com/item?id=32026571 (hey, that's this thread! how'd you do that, tpmx?)

So, 7h 45m of downtime. What we don't know is how many posts (or votes, etc.) happened after our last backup, and were therefore lost. The latest vote we have was at 2022-07-08 12:46:05 UTC, which is about the same time as the last post.
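As a quick check of the arithmetic above, here is a minimal Python sketch that computes the gap between the two post timestamps quoted in this comment. The variable and function names are illustrative only; nothing here reflects how HN itself measures downtime.

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%d %H:%M:%S"

def parse_utc(stamp: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS' string and mark it as UTC."""
    return datetime.strptime(stamp, FMT).replace(tzinfo=timezone.utc)

# Timestamps quoted in the comment above.
last_post_before = parse_utc("2022-07-08 12:46:04")
first_post_after = parse_utc("2022-07-08 20:30:55")

downtime = first_post_after - last_post_before
hours, rem = divmod(int(downtime.total_seconds()), 3600)
minutes, seconds = divmod(rem, 60)

print(f"{hours}h {minutes}m {seconds}s")  # 7h 44m 51s
```

That is 7h 44m 51s between the two posts, which rounds to the 7h 45m quoted.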

There can't be many lost posts or votes, though, because I checked HN Search (https://hn.algolia.com/) just before we brought HN back up, and their most recent comment and story were behind ours. That means our last backup on the ill-fated server was taken after the last API update (HN Search relies on our API), and the API gets updated every 30 seconds.

I'm not saying that's a rock-solid argument, but it suggests that 30 seconds is an upper bound on how much data we lost.
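The reasoning above can be sketched roughly as follows. This is only an illustration of the argument, not a description of HN's backup system: the function name is hypothetical, the mirror ID in the usage lines is made up, and the only figure taken from the comment is the 30-second API update interval.

```python
from typing import Optional

# From the comment: the API gets updated every 30 seconds.
API_SYNC_INTERVAL_S = 30

def loss_upper_bound_s(
    restored_latest_id: int,
    mirror_latest_id: int,
    sync_interval_s: int = API_SYNC_INTERVAL_S,
) -> Optional[int]:
    """Return a rough upper bound (in seconds) on lost data, or None.

    If the API-fed mirror is not ahead of the restored data, the backup
    was presumably taken after the last API update, so at most one sync
    interval of activity can be missing. If the mirror is ahead, the
    argument does not apply and no bound is claimed.
    """
    if mirror_latest_id <= restored_latest_id:
        return sync_interval_s
    return None

# 32026565 is the last pre-outage post cited above; 32026560 is a
# made-up mirror ID used purely for illustration.
print(loss_upper_bound_s(32026565, 32026560))  # 30
print(loss_upper_bound_s(32026565, 32026570))  # None
```

As the comment says, this is not a rock-solid argument; the sketch only encodes the comparison, not a proof.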

◧◩◪◨⬒⬓
6. sillys+j31 2022-07-09 02:05:22
>>dang+SZ
Curiosity got the better of me. Why was there a 6 ID gap between the last post and first post? The answer seems to be that admins were making posts, which is neat. (There was also one lonely Flexport job ad.)
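For reference, the gap can be worked out from the two item IDs in the links quoted upthread. A small Python sketch (variable names are illustrative):

```python
# Item IDs from the two links quoted upthread.
last_id_before = 32026565   # last post before the outage
first_id_after = 32026571   # first post after HN came back

gap = first_id_after - last_id_before
ids_between = list(range(last_id_before + 1, first_id_after))

print(gap)               # 6
print(len(ids_between))  # 5
print(ids_between)       # [32026566, 32026567, 32026568, 32026569, 32026570]
```

So the IDs differ by 6, which leaves 5 item IDs unaccounted for between the two posts.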

Is your backup system tied to your API? Algolia is a third-party service, and streaming the latest HN data to Algolia seems pretty similar to streaming it to a backup system.

◧◩◪◨⬒⬓⬔
7. dang+v31 2022-07-09 02:06:49
>>sillys+j31
I posted a bunch of test things and then deleted them.
◧◩◪◨⬒⬓⬔⧯
8. scott_+ih2 2022-07-09 14:18:38
>>dang+v31
I love this answer so much.
◧◩◪◨⬒⬓⬔⧯▣
9. sillys+ZW2 2022-07-09 18:25:49
>>scott_+ih2
I really wanted to ask “How did you post things if the server was down?” but perhaps some things are better left as mysteries.
◧◩◪◨⬒⬓⬔⧯▣▦
10. O_____+K33 2022-07-09 19:15:03
>>sillys+ZW2
You could see them via HN’s API before they were deleted; nothing interesting. The API was back up before the www site.