zlacker

1. jabart+(OP) 2023-06-29 17:52:59
No it doesn't. The number of false alarms you can get with internet-based monitoring is more than zero. You could have a BGP route break things for one ISP your monitoring happens to use. You could have a failover event where it takes 30 seconds for everything to converge. I have multiple monitors on my app at 1-minute intervals from different vendors, and a user will ALWAYS email us within 5 seconds of an issue. It's not realistic for a company to have automatic status updates trigger things without a person manually reviewing them, because too many things can go wrong with an automatic status update and cause panic.
replies(2): >>lucb1e+H >>wongar+02
2. lucb1e+H 2023-06-29 17:54:55
>>jabart+(OP)
Who would panic? If nobody notices it's out because it's not, then nobody is going to be checking the status page. And if they do see the status page showing red while it's up, it's not like they're going to be unhappy about their SLA being met.

Maybe you want human confirmation on historic figures, but the live thing might as well be live.

3. wongar+02 2023-06-29 17:59:21
>>jabart+(OP)
Most paid status monitoring services cover BGP route problems and ISP issues by only flagging an event if it's detected from X geographically diverse endpoints.
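
A minimal sketch of that quorum rule, assuming each probe reports a region label and a pass/fail flag. The names `ProbeResult`, `is_outage`, and `min_failing_regions` are made up for illustration and are not any vendor's API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    region: str  # hypothetical label for where the probe ran, e.g. "us-east"
    ok: bool     # True if the check succeeded from that probe


def is_outage(results, min_failing_regions=3):
    """Flag an outage only if probes in enough distinct regions fail.

    Several failures from the same region count once, so a routing
    problem that affects a single vantage point does not trip the alert.
    """
    failing_regions = {r.region for r in results if not r.ok}
    return len(failing_regions) >= min_failing_regions
```

With a threshold of 3, two failing probes that both sit in one region would not flag anything, while failures from three different regions would.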

For the 30 seconds where you wait for failover to complete: that is a 30-second outage. It's not necessarily profitable to admit to it, but showing it as a 30-second outage would be accurate.

replies(2): >>jabart+wb >>jabart+Wb
4. jabart+wb 2023-06-29 18:36:54
>>wongar+02
TCP's default timeout is more than 30 seconds. The internet itself has about 99.9% uptime. If one company showed every 30-second blip on their outage page, all their competitors would have that screenshot on the first page of their pitch deck, even if they also had the same issue. 2-5 minutes is reasonable for a public service to announce an outage.
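
A rough sketch of that kind of hold-off, assuming the monitor produces time-ordered `(timestamp_seconds, ok)` samples. The function name and the 120-second default are illustrative choices, not taken from any real monitoring product:

```python
def outage_to_announce(samples, hold_seconds=120.0):
    """Return True only if the current failure streak has lasted long enough.

    samples: list of (timestamp_seconds, ok) tuples sorted by time.
    A single successful check resets the streak, so a short blip that
    recovers before hold_seconds never reaches the status page.
    """
    streak_start = None
    for ts, ok in samples:
        if ok:
            streak_start = None   # recovered: forget the earlier failures
        elif streak_start is None:
            streak_start = ts     # first failed check of a new streak
    if streak_start is None:
        return False
    return samples[-1][0] - streak_start >= hold_seconds
```

The trade-off is the one argued about in this thread: a longer hold hides short failovers, and a shorter one publishes them.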
5. jabart+Wb 2023-06-29 18:38:58
>>wongar+02
I forgot about that CenturyLink BGP infinite-loop route bug that took down their whole system nationwide. A lot of monitoring services showed red even though it was only one ISP that was down.