zlacker

Ask HN: Is GitHub down?

submitted by mikebo+(OP) on 2023-06-29 17:42:02 | 306 points 184 comments

Not loading for me at all, but status page shows green across the board.

replies(62): >>etimbe+f >>comput+r >>adpirz+E >>saintb+O >>wcarss+U >>gwbas1+W >>Mornin+X >>aftayl+Y >>Grumpy+01 >>joshst+c1 >>Actual+m1 >>klysm+s1 >>stefan+y1 >>ushako+E1 >>charli+L1 >>coder9+M1 >>whitew+N1 >>jetpac+O1 >>tednot+T1 >>turtle+b2 >>JohnMa+d2 >>hn8305+g2 >>abe-10+p2 >>kashno+v2 >>ChrisA+w2 >>almost+A2 >>distor+63 >>numbsa+73 >>jester+h3 >>duckkg+n3 >>edgyqu+t3 >>mbrees+y3 >>billy1+T3 >>facu17+q5 >>lucb1e+r5 >>Gordon+B5 >>onioni+E5 >>rk32+16 >>circui+36 >>renonc+a6 >>Actual+d6 >>datalu+e6 >>Xeamek+r6 >>lucb1e+s6 >>mydria+x6 >>oblong+G6 >>gaosha+J6 >>ukrain+N6 >>hejclo+57 >>sparc2+d7 >>connor+B7 >>sdsd+F7 >>Macuyi+M7 >>saintb+R8 >>escape+W8 >>gwbas1+Ba >>ny711+lc >>rvz+nc >>ughits+Lc >>acim+Pc >>themus+yd >>squalo+ji
1. etimbe+f[view] [source] 2023-06-29 17:43:01
>>mikebo+(OP)
Same for me
2. comput+r[view] [source] 2023-06-29 17:43:28
>>mikebo+(OP)
Yup. Totally down, happened right as I opened a PR.
replies(1): >>jonapr+E3
3. adpirz+E[view] [source] 2023-06-29 17:43:57
>>mikebo+(OP)
Same
4. saintb+O[view] [source] 2023-06-29 17:44:51
>>mikebo+(OP)
Same here.
5. wcarss+U[view] [source] 2023-06-29 17:45:02
>>mikebo+(OP)
I can't even load the status page.
replies(1): >>makewo+j1
6. gwbas1+W[view] [source] 2023-06-29 17:45:06
>>mikebo+(OP)
I can't push from my desktop, and https://github.com/ spins in the browser
7. Mornin+X[view] [source] 2023-06-29 17:45:08
>>mikebo+(OP)
Same. Spins for a while then dreaded `ERR_CONNECTION_TIMED_OUT` chrome error page.
8. aftayl+Y[view] [source] 2023-06-29 17:45:09
>>mikebo+(OP)
Unable to load Github website or push using git-remote-https as of the last several minutes.
9. Grumpy+01[view] [source] 2023-06-29 17:45:15
>>mikebo+(OP)
Same here, it can't send a verification SMS.
replies(1): >>zamale+Q3
10. joshst+c1[view] [source] 2023-06-29 17:45:39
>>mikebo+(OP)
Yep, looks like it's down. Can't pull/push and can't even get the web to load at all.
◧◩
11. makewo+j1[view] [source] [discussion] 2023-06-29 17:46:01
>>wcarss+U
Status page loads for me, it just incorrectly says all green: https://www.githubstatus.com/
replies(5): >>onioni+D1 >>ketchu+X1 >>wcarss+k2 >>comput+t2 >>ketchu+V3
12. Actual+m1[view] [source] 2023-06-29 17:46:10
>>mikebo+(OP)
Yep
13. klysm+s1[view] [source] 2023-06-29 17:46:25
>>mikebo+(OP)
Seems to be a fairly catastrophic failure. https://github.com/ fails to load. https://www.githubstatus.com/ shows all green as of this writing. Nothing on the twitter yet https://twitter.com/githubstatus

edit: The outage is now acknowledged on the status page https://www.githubstatus.com/

edit: EU folks appear to have things working so it looks like a regional network fault

replies(11): >>edgyqu+f3 >>pc86+p3 >>mkolas+w3 >>ChadyW+x3 >>urda+K3 >>cruano+14 >>fluix+54 >>pjot+94 >>deatha+i4 >>Maxion+I4 >>Macuyi+Z6
14. stefan+y1[view] [source] 2023-06-29 17:46:51
>>mikebo+(OP)
So down right now...I wonder why they still use https://www.githubstatus.com/ that reports everything is alright when it's not!
replies(4): >>SV_Bub+22 >>Shekel+S2 >>munk-a+M3 >>maschu+Gg4
◧◩◪
15. onioni+D1[view] [source] [discussion] 2023-06-29 17:47:07
>>makewo+j1
Maybe it's right and we're all wrong
16. ushako+E1[view] [source] 2023-06-29 17:47:12
>>mikebo+(OP)
Actions won't start for me
17. charli+L1[view] [source] 2023-06-29 17:47:39
>>mikebo+(OP)
Loads for me

Unless it is cached

Edit: I could even login

replies(2): >>lights+m2 >>Maxion+O4
18. coder9+M1[view] [source] 2023-06-29 17:47:43
>>mikebo+(OP)
Yep it's down. Why do they even bother with the status page
19. whitew+N1[view] [source] 2023-06-29 17:47:43
>>mikebo+(OP)
Yes, github status says all is operational but hacker news is faster. Go figure.
replies(1): >>munk-a+Z2
20. jetpac+O1[view] [source] 2023-06-29 17:47:53
>>mikebo+(OP)
It's down for me
21. tednot+T1[view] [source] 2023-06-29 17:48:11
>>mikebo+(OP)
Down for me right now.
◧◩◪
22. ketchu+X1[view] [source] [discussion] 2023-06-29 17:48:40
>>makewo+j1
That requires manual acknowledgement. Probably requires an approval from a VP or some high level exec to change that status.
◧◩
23. SV_Bub+22[view] [source] [discussion] 2023-06-29 17:48:49
>>stefan+y1
If it takes someone to manually change it from green to red, that does seem to defeat the purpose.
replies(6): >>evulho+83 >>klysm+g3 >>jabart+i3 >>distor+s3 >>numbsa+x4 >>AYBABT+v8
24. turtle+b2[view] [source] 2023-06-29 17:49:12
>>mikebo+(OP)
Won't load for me.

https://githubstatus.com shows all green, but it's not the case...

replies(1): >>jerryg+T5
25. JohnMa+d2[view] [source] 2023-06-29 17:49:15
>>mikebo+(OP)
First noticed when trying to pull a helm chart - get a 503 backend error page.
26. hn8305+g2[view] [source] 2023-06-29 17:49:19
>>mikebo+(OP)
140.82.113.0/24 is visible in the global routing table:

  route-views>sh bgp 140.82.113.0
  BGP routing table entry for 140.82.113.0/24, version 62582026
  Paths: (19 available, best #4, table default)
The route is verified by RPKI so it's not a route hijack.

Edit: deleted traceroute

replies(5): >>llimll+85 >>travis+u5 >>almost+Y5 >>iso163+M6 >>adnaus+ia
◧◩◪
27. wcarss+k2[view] [source] [discussion] 2023-06-29 17:49:23
>>makewo+j1
Interesting, I'm used to using status.github.com, which got hit by whatever issue is hitting the main site.
◧◩
28. lights+m2[view] [source] [discussion] 2023-06-29 17:49:30
>>charli+L1
Same. I'm in Europe and it loads slowly but it gets there.
replies(1): >>charli+33
29. abe-10+p2[view] [source] 2023-06-29 17:49:33
>>mikebo+(OP)
Same here
◧◩◪
30. comput+t2[view] [source] [discussion] 2023-06-29 17:49:44
>>makewo+j1
GitHub should monitor their status page traffic for spikes, which probably mean something is wrong somewhere, even if they themselves haven't noticed yet.
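A minimal sketch of the spike check suggested above, assuming per-minute request counts for the status page are already being collected somewhere. The function name, the sample numbers, and the threshold of three standard deviations are illustrative assumptions, not anything GitHub is known to run.

```python
from statistics import mean, stdev


def is_traffic_spike(history, current, threshold=3.0):
    """Return True when `current` sits more than `threshold` standard
    deviations above the mean of `history`.

    `history` is a list of recent per-minute request counts for the
    status page (hypothetical data source).
    """
    if len(history) < 2:
        return False  # not enough samples to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat baseline: any increase at all stands out.
        return current > baseline
    return (current - baseline) / spread > threshold


if __name__ == "__main__":
    recent = [100, 110, 95, 105, 90]      # made-up quiet-period counts
    print(is_traffic_spike(recent, 108))  # ordinary fluctuation
    print(is_traffic_spike(recent, 2000)) # everyone checking at once
```

A spike like this would only be a hint that something is wrong; it says nothing about what is broken or where.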
31. kashno+v2[view] [source] 2023-06-29 17:49:46
>>mikebo+(OP)
Are these status pages updated manually? At the very least it should be able to detect that the home page doesn't even load and turn itself red.
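As a rough illustration of the automatic check asked about above, here is a sketch of a probe that reports "down" only when several attempts in a row fail. The retry count, timeout, and function names are assumptions for the example; a real status page would need far more than this.

```python
import urllib.error
import urllib.request


def default_fetch(url, timeout=5.0):
    """Return the HTTP status code, or None if the request fails or times out."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered, but with an error status
    except OSError:
        return None      # DNS failure, refused connection, timeout, etc.


def homepage_is_down(url, fetch=default_fetch, attempts=3):
    """Report down only if every attempt fails outright or returns a 5xx.

    `fetch` is injectable so the logic can be exercised without network access.
    """
    for _ in range(attempts):
        status = fetch(url)
        if status is not None and status < 500:
            return False  # one healthy response is enough to stay green
    return True
```

Calling `homepage_is_down("https://github.com/")` from a single machine only tells you what that machine sees, which is exactly the regional blind spot discussed elsewhere in this thread.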
32. ChrisA+w2[view] [source] 2023-06-29 17:49:47
>>mikebo+(OP)
[dupe] / merge >>36523843
33. almost+A2[view] [source] 2023-06-29 17:50:13
>>mikebo+(OP)
Works for me?
◧◩
34. Shekel+S2[view] [source] [discussion] 2023-06-29 17:51:34
>>stefan+y1
Pretty much every company has been shown to have fake status pages at this point.
replies(5): >>ezekg+c5 >>klysm+p6 >>wsatb+E6 >>Night_+H6 >>AYBABT+u7
◧◩
35. munk-a+Z2[view] [source] [discussion] 2023-06-29 17:52:03
>>whitew+N1
At this point I don't trust self-owned status pages at all - those crowd-sourced ones where users report issues are much faster to respond to outages that may never even go reported by status pages.
◧◩◪
36. charli+33[view] [source] [discussion] 2023-06-29 17:52:17
>>lights+m2
And it seems like loading HN is even slower
replies(1): >>iso163+Pa
37. distor+63[view] [source] 2023-06-29 17:52:18
>>mikebo+(OP)
Down for me as well, Operation Timed Out errors on all attempted SSH connections
38. numbsa+73[view] [source] 2023-06-29 17:52:23
>>mikebo+(OP)
If they used AI to rebuild their system and migrated it to Azure, I bet they would stop having all the problems.
replies(1): >>889135+n4
◧◩◪
39. evulho+83[view] [source] [discussion] 2023-06-29 17:52:27
>>SV_Bub+22
Yep, and when money comes into play when you're supposed to meet SLAs, you certainly don't want it being automatic.
◧◩
40. edgyqu+f3[view] [source] [discussion] 2023-06-29 17:52:49
>>klysm+s1
Even the status page isn’t loading for me currently
◧◩◪
41. klysm+g3[view] [source] [discussion] 2023-06-29 17:52:56
>>SV_Bub+22
Possibly, but sometimes with failures this bad you can't get to the page to update it.
replies(1): >>munk-a+e4
42. jester+h3[view] [source] 2023-06-29 17:52:58
>>mikebo+(OP)
From GitHub - Incident On 2023-06-29: https://www.githubstatus.com/incidents/gqx5l06jjxhp?u=ry1xb4...
◧◩◪
43. jabart+i3[view] [source] [discussion] 2023-06-29 17:52:59
>>SV_Bub+22
No it doesn't. The amount of false alarm alerts you can get with internet based monitoring is more than 0. You could have a BGP route break things for one ISP your monitoring happens to use. You could have a failover event happening where it takes 30 seconds for everything to converge. I have multiple monitors on my app at 1 minute intervals from different vendors and ALWAYS a user will email us within 5 seconds of an issue. It's not realistic for a company to have automatic status updates trigger things without a person manually reviewing them because too many things can go wrong on the automatic status update to cause panic.
replies(2): >>lucb1e+Z3 >>wongar+i5
44. duckkg+n3[view] [source] 2023-06-29 17:53:23
>>mikebo+(OP)
Status page updated showing all red https://www.githubstatus.com/
◧◩
45. pc86+p3[view] [source] [discussion] 2023-06-29 17:53:27
>>klysm+s1
Status page is red now, it probably only checks once every couple minutes.
◧◩◪
46. distor+s3[view] [source] [discussion] 2023-06-29 17:53:35
>>SV_Bub+22
Unknown unknowns means you can have catastrophic system failures that automated alerts don't detect.
47. edgyqu+t3[view] [source] 2023-06-29 17:53:40
>>mikebo+(OP)
Every time GitHub goes down, and my push/pull is rejected, I immediately assume they’ve discovered I’m incompetent and fired me. And I’m the head of engineering at my company.
replies(13): >>antoin+J5 >>vrosas+P6 >>Maxion+R6 >>aliasx+m8 >>morkal+V9 >>mantra+jg >>voodoo+fs >>maximi+Rz >>dathin+9D >>progme+9M >>dreday+iW >>SergeA+ug1 >>rubick+MG2
◧◩
48. mkolas+w3[view] [source] [discussion] 2023-06-29 17:53:47
>>klysm+s1
Looks like they finally updated the second status page to show the outage.
◧◩
49. ChadyW+x3[view] [source] [discussion] 2023-06-29 17:53:48
>>klysm+s1
Looks like they've updated it now.
50. mbrees+y3[view] [source] 2023-06-29 17:53:48
>>mikebo+(OP)
https://www.githubstatus.com/

Just flipped to red.

See here: https://www.githubstatus.com/incidents/gqx5l06jjxhp

>Investigating - We are currently experiencing an outage of GitHub products and are investigating.

>Jun 29, 2023 - 17:52 UTC

replies(1): >>jachee+xC
◧◩
51. jonapr+E3[view] [source] [discussion] 2023-06-29 17:53:58
>>comput+r
You broke it.
◧◩
52. urda+K3[view] [source] [discussion] 2023-06-29 17:54:18
>>klysm+s1
Status page is fully red now.
◧◩
53. munk-a+M3[view] [source] [discussion] 2023-06-29 17:54:20
>>stefan+y1
https://downdetector.com/status/github/ is a far more reliable source - it's just powered by user reports and often will show issues long before the status page ever receives an update.
replies(1): >>jachee+mD
◧◩
54. zamale+Q3[view] [source] [discussion] 2023-06-29 17:54:30
>>Grumpy+01
Use the downtime to purchase a Yubikey/FIDO2.
55. billy1+T3[view] [source] 2023-06-29 17:54:36
>>mikebo+(OP)
It's down! https://twitter.com/githubstatus/status/1674475870931808256?...
◧◩◪
56. ketchu+V3[view] [source] [discussion] 2023-06-29 17:54:43
>>makewo+j1
They just updated it, now it's all red
◧◩◪◨
57. lucb1e+Z3[view] [source] [discussion] 2023-06-29 17:54:55
>>jabart+i3
Who would panic? If nobody notices it's out because it's not, then nobody is going to be checking the status page. And if they do see the status page showing red while it's up, it's not like they're going to be unhappy about their SLA being met.

Maybe you want human confirmation on historic figures, but the live thing might as well be live.

◧◩
58. cruano+14[view] [source] [discussion] 2023-06-29 17:54:56
>>klysm+s1
GitHub pages are down too, although funnily enough https://pages.github.com is up
replies(1): >>megado+o6
◧◩
59. fluix+54[view] [source] [discussion] 2023-06-29 17:55:05
>>klysm+s1
Pages hosted on github pages also show the unicorn 503 page. However, https://pages.github.com/?(null) loads.
◧◩
60. pjot+94[view] [source] [discussion] 2023-06-29 17:55:19
>>klysm+s1
Ack’d on twitter: https://twitter.com/githubstatus/status/1674475870931808256
◧◩◪◨
61. munk-a+e4[view] [source] [discussion] 2023-06-29 17:55:30
>>klysm+g3
There was that hilarious multi-hour AWS failure a while back where the status page was updated via one of their internal services... and that service went down as part of the outage.
◧◩
62. deatha+i4[view] [source] [discussion] 2023-06-29 17:55:45
>>klysm+s1
status.github.com was a timeout error for me. githubstatus.com is the rainbow unicorn.

Lunch time.

replies(1): >>klysm+g6
◧◩
63. 889135+n4[view] [source] [discussion] 2023-06-29 17:56:02
>>numbsa+73
I can't tell if this is sarcasm or not.
replies(1): >>numbsa+Q5
◧◩◪
64. numbsa+x4[view] [source] [discussion] 2023-06-29 17:56:26
>>SV_Bub+22
I bet they could teach Co-Pilot to create a PR to make the change, and build some GitHub actions to automatically merge those changes.
◧◩
65. Maxion+I4[view] [source] [discussion] 2023-06-29 17:57:09
>>klysm+s1
Strange stuff, as it works completely fine for me in the EU? I just posted comments to several issues.

Edit: Front page still loads and I am logged in. Everything is as normal. Status page shows everything is down. Lol.

replies(3): >>klysm+A5 >>facu17+P5 >>leesal+m7
◧◩
66. Maxion+O4[view] [source] [discussion] 2023-06-29 17:57:42
>>charli+L1
Same, in EU and is just as normal.
◧◩
67. llimll+85[view] [source] [discussion] 2023-06-29 17:58:42
>>hn8305+g2
Your requests made it farther than mine - mine get to charter in nyc and die there

    6  lag-26.nycmny837aw-bcr00.netops.charter.com (24.30.201.130)  158.033 ms
       lag-16.nycmny837aw-bcr00.netops.charter.com (66.109.6.74)  29.575 ms
       lag-416.nycmny837aw-bcr00.netops.charter.com (66.109.6.10)  30.077 ms
    7  lag-1.pr2.nyc20.netops.charter.com (66.109.9.5)  81.351 ms  37.879 ms  27.877 ms
    8  * * *
replies(1): >>alexel+i6
◧◩◪
68. ezekg+c5[view] [source] [discussion] 2023-06-29 17:58:53
>>Shekel+S2
Pretty much. They want the burden of proof for SLAs to fall on the customer, not on themselves. If a customer has to prove that an outage specifically affected them, they are much less likely to have a successful case against the failure to meet their SLA.

(Not directed at GitHub specifically, but at bogus status pages.)

◧◩◪◨
69. wongar+i5[view] [source] [discussion] 2023-06-29 17:59:21
>>jabart+i3
Most paid status monitoring services cover BGP route problems and ISP issues by only flagging an event if it's detected from X geographically diverse endpoints.

For the 30 seconds where you wait for failover to complete: that is a 30 second outage. It's not necessarily profitable to admit to it, but showing it as a 30 second outage would be accurate

replies(2): >>jabart+Oe >>jabart+ef
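The "X geographically diverse endpoints" rule described above can be sketched as a simple quorum check. The region names and the default of three failing regions are made up for illustration; actual monitoring vendors will have their own rules.

```python
def outage_confirmed(probe_results, min_regions=3):
    """Flag an outage only when at least `min_regions` regions report failure.

    `probe_results` maps a region name to True (probe succeeded) or
    False (probe failed). All names here are hypothetical.
    """
    failing = [region for region, ok in probe_results.items() if not ok]
    return len(failing) >= min_regions


if __name__ == "__main__":
    # Roughly the picture in this thread: US probes fail, EU and Asia succeed.
    results = {"us-east": False, "us-west": False, "eu-west": True, "ap-east": True}
    print(outage_confirmed(results))                 # two regions is below the default quorum
    print(outage_confirmed(results, min_regions=2))  # a lower quorum flags it
```

The choice of quorum is the whole trade-off: set it high and a single ISP's routing problem is ignored, but so is a genuinely regional outage like this one appears to be.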
70. facu17+q5[view] [source] 2023-06-29 17:59:47
>>mikebo+(OP)
seems to be a network issue, not a service issue
replies(1): >>lucb1e+96
71. lucb1e+r5[view] [source] 2023-06-29 17:59:50
>>mikebo+(OP)
https://codeberg.org open source GitHub without Microsoft (it's a German non-profit). You can also host your own lightweight https://forgejo.org instance

In case anyone questions whether centralizing literally everything onto GitHub is a good idea, at least as a mirror for things you depend on

replies(1): >>escape+L9
◧◩
72. travis+u5[view] [source] [discussion] 2023-06-29 18:00:07
>>hn8305+g2
found the neteng guy
◧◩◪
73. klysm+A5[view] [source] [discussion] 2023-06-29 18:00:19
>>Maxion+I4
Sounds like a regional network fault then
74. Gordon+B5[view] [source] 2023-06-29 18:00:20
>>mikebo+(OP)
No it isn't, it's working absolutely fine and has been all afternoon.
replies(1): >>sophac+47
75. onioni+E5[view] [source] 2023-06-29 18:00:25
>>mikebo+(OP)
I heard from somebody at GitHub that this one will make a good incident report. No other details or estimates for recovery time.
◧◩
76. antoin+J5[view] [source] [discussion] 2023-06-29 18:00:33
>>edgyqu+t3
> And I’m the head of engineering at my company.

Haha! Happy to see impostor syndrome goes all the way to the top of the hierarchy.

replies(1): >>dreday+5W
◧◩◪
77. facu17+P5[view] [source] [discussion] 2023-06-29 18:01:08
>>Maxion+I4
yes, seems to be a network issue, not a service issue
replies(1): >>bernie+kV
◧◩◪
78. numbsa+Q5[view] [source] [discussion] 2023-06-29 18:01:09
>>889135+n4
I'm pretty sure if I e-mailed my sales rep right now they would tell me that Azure Dev Ops doesn't have these problems.
replies(1): >>sdelli+38
◧◩
79. jerryg+T5[view] [source] [discussion] 2023-06-29 18:01:25
>>turtle+b2
Github status page doesn't even load for me ... "We're having a really bad day, the unicorns have taken over"
◧◩
80. almost+Y5[view] [source] [discussion] 2023-06-29 18:01:42
>>hn8305+g2
I'm able to reach on 192.30.252.0/22.
81. rk32+16[view] [source] 2023-06-29 18:01:51
>>mikebo+(OP)
It's all down currently
82. circui+36[view] [source] 2023-06-29 18:01:53
>>mikebo+(OP)
It works for me at the moment
◧◩
83. lucb1e+96[view] [source] [discussion] 2023-06-29 18:02:06
>>facu17+q5
Being up on the localhost interface doesn't count!
replies(1): >>bernie+RW
84. renonc+a6[view] [source] 2023-06-29 18:02:09
>>mikebo+(OP)
Not down for me accessing from Hong Kong. I suspect this is a regional outage.
85. Actual+d6[view] [source] 2023-06-29 18:02:16
>>mikebo+(OP)
Since nobody can work, I'll just leave this here: "I must have put a decimal point in the wrong place or something. Shit! I always do that. I always mess up some mundane detail."
86. datalu+e6[view] [source] 2023-06-29 18:02:24
>>mikebo+(OP)
And just when I was about to get into flow state...
◧◩◪
87. klysm+g6[view] [source] [discussion] 2023-06-29 18:02:33
>>deatha+i4
for some reason www.githubstatus.com works while githubstatus.com doesn't
◧◩◪
88. alexel+i6[view] [source] [discussion] 2023-06-29 18:02:41
>>llimll+85
I’m in US east coast with a dev box in Helsinki. My dev box can still hit github.com, but I can’t at home.
replies(2): >>llimll+t6 >>musha6+Fc
◧◩◪
89. megado+o6[view] [source] [discussion] 2023-06-29 18:03:08
>>cruano+14
That is funnily.
◧◩◪
90. klysm+p6[view] [source] [discussion] 2023-06-29 18:03:12
>>Shekel+S2
fake and not automated are pretty different
91. Xeamek+r6[view] [source] 2023-06-29 18:03:15
>>mikebo+(OP)
looking on the bright side, at least we'll get an interesting post-mortem to read in a day or two.
92. lucb1e+s6[view] [source] 2023-06-29 18:03:18
>>mikebo+(OP)
To all the "same" and "not for me" posters: the very least you could add is a location
◧◩◪◨
93. llimll+t6[view] [source] [discussion] 2023-06-29 18:03:24
>>alexel+i6
What IP does it resolve to in Helsinki?
replies(1): >>Maxion+y7
94. mydria+x6[view] [source] 2023-06-29 18:03:27
>>mikebo+(OP)
Yes. Can't even pull ;(
◧◩◪
95. wsatb+E6[view] [source] [discussion] 2023-06-29 18:03:53
>>Shekel+S2
From my experience, GitHub is the best out there when it comes to updating their status page.
96. oblong+G6[view] [source] 2023-06-29 18:03:55
>>mikebo+(OP)
I noticed Github's OIDC token changed about a half an hour ago. Security incident?
replies(3): >>klysm+48 >>Ysx+I8 >>ralgoz+Bh
◧◩◪
97. Night_+H6[view] [source] [discussion] 2023-06-29 18:03:56
>>Shekel+S2
Really? Why?

That's so disappointing.

replies(1): >>cududa+4a
98. gaosha+J6[view] [source] 2023-06-29 18:03:59
>>mikebo+(OP)
My team ran some code that crushed our Github actions very shortly before this outage. Nervous laughter around the office that it was us.
◧◩
99. iso163+M6[view] [source] [discussion] 2023-06-29 18:04:11
>>hn8305+g2
github.com for me returns 140.82.121.3 which routes fine in the uk, returning from

lb-140-82-121-3-fra.github.com

which from the distance and name I would assume is a frankfurt based load balancer. I get there from BT -> Zayo

I can reach that IP from Washington too, but github returns 140.82.114.3 and 140.82.114.4 from DNS at 1.1.1.1 on a Level3 handoff in Washington

Spot checks around the place show the first returned IP as pingable across the world

Bangkok, Dhaka, Jakarta - 20.205.243.166

Seoul - 20.200.245.247

Nairobi - 20.87.225.212

Kabul, Dakar, Amman, Amman, Cairo - 140.82.121.3

Moscow, Riga, Istanbul - 140.82.121.4

Miami - 140.82.114.3

replies(1): >>Maxion+T7
100. ukrain+N6[view] [source] 2023-06-29 18:04:14
>>mikebo+(OP)
Cannot use oauth2 in algoexpert :/
◧◩
101. vrosas+P6[view] [source] [discussion] 2023-06-29 18:04:16
>>edgyqu+t3
You’re not alone.
◧◩
102. Maxion+R6[view] [source] [discussion] 2023-06-29 18:04:21
>>edgyqu+t3
I think the same thing every time my credentials to our issue tracker expires and I have to log in again.

I am the lead dev on two projects.

replies(1): >>Commit+Sa
◧◩
103. Macuyi+Z6[view] [source] [discussion] 2023-06-29 18:04:43
>>klysm+s1
EU here. Actions are failing to run. Rest is kinda ok.
◧◩
104. sophac+47[view] [source] [discussion] 2023-06-29 18:04:48
>>Gordon+B5
githubstatus.com disagrees. Here's the specific incident: https://www.githubstatus.com/incidents/gqx5l06jjxhp

I think I'll believe them when they say it's down, no offense.

replies(1): >>Gordon+39
105. hejclo+57[view] [source] 2023-06-29 18:04:50
>>mikebo+(OP)
I feel sympathy for all the engineers at companies where I've implemented CI/CD based on GitHub Actions in recent years. It's not like I didn't tell them that GitHub has proven somewhat unreliable in recent years. Contrary to their claim that "it's just the build pipeline, not the product", I think it is a horrible incident if you're not able to deploy to production and have barely any ad-hoc backup.

I'm always evangelizing Argo or Flux and some self-hosted Gitlab or gitea, but seems like they all prefer to throw their money at Github as of now.

replies(1): >>bernie+8W
106. sparc2+d7[view] [source] 2023-06-29 18:05:10
>>mikebo+(OP)
Azure Strikes Again!
◧◩◪
107. leesal+m7[view] [source] [discussion] 2023-06-29 18:05:59
>>Maxion+I4
Switched on a VPN in EU and it started loading. I can get back to what I was doing now ;).
◧◩◪
108. AYBABT+u7[view] [source] [discussion] 2023-06-29 18:06:16
>>Shekel+S2
Status pages are updated by humans and the humans need to (1) realize there's a problem and (2) understand the magnitude of the problem and (3) put that on the status page.

It's not fake, it's just a human process. And automating this would be error prone just the same.

replies(3): >>Macuyi+t8 >>wsatb+1b >>jachee+AD
◧◩◪◨⬒
109. Maxion+y7[view] [source] [discussion] 2023-06-29 18:06:37
>>llimll+t6
From Finland, but not Helsinki: 140.82.121.3
replies(1): >>alexel+jc
110. connor+B7[view] [source] 2023-06-29 18:06:41
>>mikebo+(OP)
Up in Africa (Morocco).
111. sdsd+F7[view] [source] 2023-06-29 18:07:06
>>mikebo+(OP)
I am laughing so hard right now. My best friend mocked me for using git.lain.faith to host my code, saying it wasn't reliable. Well, well, well. In the last year GitLain hasn't gone down once.

I know he was still right in a way, who knows when git.lain.faith will just disappear. But still. I'm going to send some texts to bother him right now, hahaha.

112. Macuyi+M7[view] [source] 2023-06-29 18:07:22
>>mikebo+(OP)
Update

We have identified the root cause of the outage and are working toward mitigation Posted 4 minutes ago. Jun 29, 2023 - 18:02 UTC

replies(2): >>Maxion+18 >>cududa+19
◧◩◪
113. Maxion+T7[view] [source] [discussion] 2023-06-29 18:07:39
>>iso163+M6
Same from Finland, and same route. (Except my ISP instead of BT).
replies(1): >>iso163+Qc
◧◩
114. Maxion+18[view] [source] [discussion] 2023-06-29 18:08:03
>>Macuyi+M7
Seems like an Oopsie! If they found it so quickly.
◧◩◪◨
115. sdelli+38[view] [source] [discussion] 2023-06-29 18:08:06
>>numbsa+Q5
Still can't tell if this is sarcasm or not.
replies(1): >>jprd+zd
◧◩
116. klysm+48[view] [source] [discussion] 2023-06-29 18:08:07
>>oblong+G6
Interesting observation, but I'd be surprised if that was related to a regional network fault
◧◩
117. aliasx+m8[view] [source] [discussion] 2023-06-29 18:09:13
>>edgyqu+t3
I resonate with this.
◧◩◪◨
118. Macuyi+t8[view] [source] [discussion] 2023-06-29 18:09:22
>>AYBABT+u7
Very good points. Meanwhile I have clients asking me why they can't have a status page to which I reply: you can, but ultimately to be completely fail proof it will be a human updating it slowly. To which they reply: but GitHub or X does it...

Very infuriating, that.

replies(1): >>AYBABT+ca
◧◩◪
119. AYBABT+v8[view] [source] [discussion] 2023-06-29 18:09:33
>>SV_Bub+22
Not really, things fail in unexpected ways. Automated anomaly detection is notoriously error prone, leading to a lot of false positives and false negatives, even in the trivial case of monitoring a single timeseries. For a system the size of GitHub, you need to monitor a whole host of things, and if it's quasi impossible to do one timeseries well, there's basically no hope of doing automated anomaly detection over many timeseries with a signal-to-noise ratio that's better than "humans looking at the thing and realizing it's not going well".

There's stuff like this that can't be automated well. The automated result is far worse than the human-based alternative.

◧◩
120. Ysx+I8[view] [source] [discussion] 2023-06-29 18:10:13
>>oblong+G6
They added a second token on Tuesday: https://github.blog/changelog/2023-06-27-github-actions-upda...
replies(1): >>bernie+vW
121. saintb+R8[view] [source] 2023-06-29 18:11:11
>>mikebo+(OP)
Looks like we are back online.
122. escape+W8[view] [source] 2023-06-29 18:11:24
>>mikebo+(OP)
often people point out how unreliable self-hosted services are, well, hosted services are just as unreliable if not more.

this, ladies and gentlemen, is why you should always self host critical infrastructure

replies(1): >>iso163+Vb
◧◩
123. cududa+19[view] [source] [discussion] 2023-06-29 18:11:46
>>Macuyi+M7
Seems like it's coming back online in fits and starts
◧◩◪
124. Gordon+39[view] [source] [discussion] 2023-06-29 18:11:52
>>sophac+47
Uhm, okay.

Rather than looking at a rather noddy status page, have you tried using it?

replies(1): >>sophac+59
◧◩◪◨
125. sophac+59[view] [source] [discussion] 2023-06-29 18:12:06
>>Gordon+39
Yes and everything times out.
replies(1): >>Gordon+Dh
◧◩
126. escape+L9[view] [source] [discussion] 2023-06-29 18:14:04
>>lucb1e+r5
gitea is also a great self hosted alternative!
replies(1): >>lucb1e+Na
◧◩
127. morkal+V9[view] [source] [discussion] 2023-06-29 18:14:46
>>edgyqu+t3
Sounds kind of like those dreams where you can only run slowly, punch with noodly arms and trying to turn on a light just has a dim effect.
◧◩◪◨
128. cududa+4a[view] [source] [discussion] 2023-06-29 18:15:07
>>Night_+H6
Two technical reasons, capped by a driving business motivation:

- False positives
- Short outages that last a minute or three

Ultimately, SLAs and uptime guarantees. That way, a business can't automatically tally every minute of publicly admitted downtime against the 99.99999% uptime guarantee, and the onus to prove a breach of contract is on the customer
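To put numbers on why every admitted minute matters, here is a small worked calculation of the downtime an uptime guarantee leaves room for. The function name is made up, and a 365-day year is assumed.

```python
def allowed_downtime_seconds(uptime_percent, period_days=365):
    """Seconds of downtime permitted by an uptime guarantee over the period."""
    period_seconds = period_days * 24 * 60 * 60
    return period_seconds * (1 - uptime_percent / 100)


if __name__ == "__main__":
    for guarantee in (99.9, 99.99, 99.99999):
        budget = allowed_downtime_seconds(guarantee)
        print(f"{guarantee}% uptime allows about {budget:.2f} s of downtime per year")
```

A 99.9% guarantee leaves roughly 8.76 hours per year, while the 99.99999% figure quoted above leaves only about 3.15 seconds, so a single "minute or three" blip would blow through it many times over.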

◧◩◪◨⬒
129. AYBABT+ca[view] [source] [discussion] 2023-06-29 18:15:47
>>Macuyi+t8
There's some nice tooling these days for this. E.g. https://firehydrant.com/ and https://incident.io both make this a faster, more embedded process.
replies(2): >>sjwhit+Ug >>amanda+YS
◧◩
130. adnaus+ia[view] [source] [discussion] 2023-06-29 18:16:08
>>hn8305+g2
This is so cool! I'm not at all familiar with any of this network stuff. Any good resources for learning these tools and when to use them?

Sorry to bother!

replies(1): >>Shamel+oC
131. gwbas1+Ba[view] [source] 2023-06-29 18:17:36
>>mikebo+(OP)
Yaay! I just pushed and my new commit showed up in CI!
◧◩◪
132. lucb1e+Na[view] [source] [discussion] 2023-06-29 18:18:41
>>escape+L9
I think that's what Forgejo forked from (and Gitea, in turn, forked from Gogs). I am not involved so don't know the details, but yeah basically all of these will do. I ran my own in the Gitea era and was happy with it, 10x lighter and easier than gitlab, I expect Forgejo has a similar experience.
◧◩◪◨
133. iso163+Pa[view] [source] [discussion] 2023-06-29 18:18:44
>>charli+33
Probably because everyone in the US is piling on HN to see if github is down
◧◩◪
134. Commit+Sa[view] [source] [discussion] 2023-06-29 18:18:55
>>Maxion+R6
Is there a name for firing trauma? I'm like this ever since I was scapegoated.
replies(5): >>AdamJa+if >>jachee+zB >>addand+6O >>scrum-+Hd9 >>pandac+Td9
◧◩◪◨
135. wsatb+1b[view] [source] [discussion] 2023-06-29 18:19:37
>>AYBABT+u7
I wouldn't necessarily call them fake, but the issue often has to be big enough for most companies to admit to it. AWS often has smaller outages that they will never acknowledge.
◧◩
136. iso163+Vb[view] [source] [discussion] 2023-06-29 18:24:20
>>escape+W8
Depends on what you want. If you want uptime, then sure. If you want to be able to blame someone then no.

If you are down for 1 hour a year on self hosting, but Office 364 is down 3 days a year, your CEO is going to be more understanding of the Office outage as all his golf buddies have the same problem, and he reads about it in the NYT.

But in any case zero downtime is difficult, that's why you need two independent systems. I had a 500 microsecond outage at the weekend when a circuit failed, which caused a business-affecting incident. Not a big one fortunately, as it was only some singers, but it was still one that was unacceptable -- had it happened at a couple of other events in the last 12 months it would have been far more problematic. Work has started to ensure it doesn't happen next year.

◧◩◪◨⬒⬓
137. alexel+jc[view] [source] [discussion] 2023-06-29 18:26:14
>>Maxion+y7
yep, same
138. ny711+lc[view] [source] 2023-06-29 18:26:24
>>mikebo+(OP)
This should be a weekly Ask HN; seems to happen pretty frequently at this point
139. rvz+nc[view] [source] 2023-06-29 18:26:35
>>mikebo+(OP)
Unsurprising. There is at least one outage with GitHub every single month. [0]

[0] >>35967921

◧◩◪◨
140. musha6+Fc[view] [source] [discussion] 2023-06-29 18:28:01
>>alexel+i6
Curious aside: That sounds like quite the roundtrip for day to day work. How do you cope with that, used to IntelliJ IDEs? ;D
replies(1): >>alexel+cOG
141. ughits+Lc[view] [source] 2023-06-29 18:28:34
>>mikebo+(OP)
Good time to grab a beer!
142. acim+Pc[view] [source] 2023-06-29 18:28:45
>>mikebo+(OP)
I just noticed that artifacts download didn't work although the web site was up. There was some varnish proxy error.
◧◩◪◨
143. iso163+Qc[view] [source] [discussion] 2023-06-29 18:28:52
>>Maxion+T7
They do have other peering -- that IP from my ISP in Jakarta routes onto Hurricane Electric in Singapore and then to github. From Sao Paulo I go to Atlanta, USA, then to Paris and Frankfurt on twelve99/Telia
144. themus+yd[view] [source] 2023-06-29 18:31:35
>>mikebo+(OP)
It shouldn't matter. Nobody should be using github post-2018.
◧◩◪◨⬒
145. jprd+zd[view] [source] [discussion] 2023-06-29 18:31:40
>>sdelli+38
That's how you know it's good sarcasm
◧◩◪◨⬒
146. jabart+Oe[view] [source] [discussion] 2023-06-29 18:36:54
>>wongar+i5
TCP default is more than 30 seconds. The internet itself has about a 99.9% uptime. If one company showed every 30 second blip on their outage page all their competitors would have that screenshot on the first page of their pitch deck even if they also had the same issue. 2-5 minutes is reasonable for a public service to announce an outage.
◧◩◪◨⬒
147. jabart+ef[view] [source] [discussion] 2023-06-29 18:38:58
>>wongar+i5
Forgot about that CenturyLink BGP infinite loop route bug they had where it took down their whole system nationwide. A lot of monitoring services showed red even though it was one ISP that was down.
◧◩◪◨
148. AdamJa+if[view] [source] [discussion] 2023-06-29 18:39:36
>>Commit+Sa
I got logged out of our slack today, which I'm the primary owner of, and I was also wondering this.

I've also never been fired, so, it isn't always linked to trauma from past firings.

◧◩
149. mantra+jg[view] [source] [discussion] 2023-06-29 18:44:13
>>edgyqu+t3
Ooof, I felt that. My project management system logs me out a few times a year and each time it happens my heart rate elevates.
◧◩◪◨⬒⬓
150. sjwhit+Ug[view] [source] [discussion] 2023-06-29 18:46:32
>>AYBABT+ca
Hey, incident.io CEO here! Thanks for mentioning us.
◧◩
151. ralgoz+Bh[view] [source] [discussion] 2023-06-29 18:49:00
>>oblong+G6
I must ask, how did you notice that?!
◧◩◪◨⬒
152. Gordon+Dh[view] [source] [discussion] 2023-06-29 18:49:08
>>sophac+59
Are you sure it's GitHub and not your ISP or something? I've just pushed commits to a bunch of repositories in the past half hour.
153. squalo+ji[view] [source] 2023-06-29 18:51:58
>>mikebo+(OP)
Not a single day passes without a MAJOR outage in a Microsoft-owned service.
◧◩
154. voodoo+fs[view] [source] [discussion] 2023-06-29 19:36:23
>>edgyqu+t3
Well, I'm an IT team lead and imposter syndrome hits me hard too. I always wonder when the day will come.
◧◩
155. maximi+Rz[view] [source] [discussion] 2023-06-29 20:16:35
>>edgyqu+t3
At a friend's company, the CEO had a calendar invite of "Fire Dan", for April 1. Dan went to confirm it was an April Fools' joke. It wasn't!
replies(2): >>20afte+fI >>progme+uM
◧◩◪◨
156. jachee+zB[view] [source] [discussion] 2023-06-29 20:25:37
>>Commit+Sa
I’m no mental health professional, but that sounds like literal PTSD, to me.
replies(1): >>bernie+2V
◧◩◪
157. Shamel+oC[view] [source] [discussion] 2023-06-29 20:30:04
>>adnaus+ia
TCP/IP Illustrated is a good start.
replies(1): >>adnaus+vsp
◧◩
158. jachee+xC[view] [source] [discussion] 2023-06-29 20:30:43
>>mbrees+y3
As the person in charge of one such page, I’d like to take the opportunity to remind folks that many, if not most, of these status pages are hand-updated, and most bosses absolutely hate anyone having to update them to anything but green.
replies(1): >>salawa+JE
◧◩
159. dathin+9D[view] [source] [discussion] 2023-06-29 20:33:28
>>edgyqu+t3
That's not very healthy given how unreliable GitHub has become in recent years.

E.g. just yesterday, for a short time frame of a few hours, maybe half a day or so, they had a bug where some closed PRs were shown in the personal view that lists the _not closed_ PRs you created.

Or GitHub CI having spurious job cancellations, or sometimes, when a job fails, waiting until some (quite long) timeout is reached before reporting it.

Or it temporarily being (partially or fully) down for a few hours.

Or its documentation, even though rather complete, somehow managing to be often rather inconvenient to use. Oh wait, that's not a bug, just subtly bad design, like its PR overview/filters. Both are cases of something that seems right at first look but becomes more and more inconvenient the more you use it. A trend I would argue describes GitHub as a whole rather well.

replies(1): >>Wojtki+2G
◧◩◪
160. jachee+mD[view] [source] [discussion] 2023-06-29 20:34:08
>>munk-a+M3
Keep in mind that downdetector can be brigaded and/or show knock-on problems instead of root causes. e.g. A couple weeks ago there were fairly major spikes across a rather huge variety of services on there, but it turned out that it was actually Comcast that was having trouble, rather than any of the “down” services.
◧◩◪◨
161. jachee+AD[view] [source] [discussion] 2023-06-29 20:35:09
>>AYBABT+u7
Also (2b) convince their boss that the “optics” are better to update sooner than later.
◧◩◪
162. salawa+JE[view] [source] [discussion] 2023-06-29 20:40:09
>>jachee+xC
Sounds like an anti-pattern or SLA dodge to me.
replies(1): >>bernie+MV
◧◩◪
163. Wojtki+2G[view] [source] [discussion] 2023-06-29 20:47:20
>>dathin+9D
Something internal must be going on at Microsoft. My company's PowerBI service has had some major performance issues over the past week
◧◩◪
164. 20afte+fI[view] [source] [discussion] 2023-06-29 20:59:18
>>maximi+Rz
wtf. that's pretty cold.
◧◩
165. progme+9M[view] [source] [discussion] 2023-06-29 21:18:58
>>edgyqu+t3
I get this feeling when I get kicked out of Google services at a different time than my usual 7 days (Monday) log out. I'm an admin of the Google services we use, and I still get that stomach dropping feeling.
◧◩◪
166. progme+uM[view] [source] [discussion] 2023-06-29 21:20:28
>>maximi+Rz
This is something my boss would post in his calendar publicly without even thinking of it. I guess it helps me to get ahead of it, if it were to ever happen to me, but having the rest of the company able to see it is pretty cold and unfeeling.
◧◩◪◨
167. addand+6O[view] [source] [discussion] 2023-06-29 21:28:14
>>Commit+Sa
Borderline imposter syndrome?
◧◩◪◨⬒⬓
168. amanda+YS[view] [source] [discussion] 2023-06-29 21:54:08
>>AYBABT+ca
And Jeli.io for this! With the Statuspage integration, you can set the status, impact, write a message for customers, and select impacted components all without leaving Slack. Statuspage gets updated with a click of a button.
◧◩◪◨⬒
169. bernie+2V[view] [source] [discussion] 2023-06-29 22:04:21
>>jachee+zB
There’s a long list of signals that can trigger folks into layoff panic.
◧◩◪◨
170. bernie+kV[view] [source] [discussion] 2023-06-29 22:05:44
>>facu17+P5
That was my guess. Something on the frontend like a load balancer or proxy blocking traffic, but everything behind that was doing fine.
◧◩◪◨
171. bernie+MV[view] [source] [discussion] 2023-06-29 22:08:03
>>salawa+JE
Sometimes, but sometimes it’s just a precaution against automatic false alarms causing a huge panic.
replies(1): >>salawa+QV7
◧◩◪
172. dreday+5W[view] [source] [discussion] 2023-06-29 22:09:24
>>antoin+J5
Something that I've found helps me with impostor syndrome is to read and talk about it.

Check out this TED talk from the co-founder of Atlassian.

https://www.ted.com/talks/mike_cannon_brookes_how_you_can_us...

◧◩
173. bernie+8W[view] [source] [discussion] 2023-06-29 22:09:31
>>hejclo+57
Tradeoffs and tolerances need to be considered.
◧◩
174. dreday+iW[view] [source] [discussion] 2023-06-29 22:10:16
>>edgyqu+t3
Cannot log in to Slack
◧◩◪
175. bernie+vW[view] [source] [discussion] 2023-06-29 22:11:07
>>Ysx+I8
I never got that memo. Found out when it broke something.
◧◩◪
176. bernie+RW[view] [source] [discussion] 2023-06-29 22:12:44
>>lucb1e+96
Works on my machine!
◧◩
177. SergeA+ug1[view] [source] [discussion] 2023-06-30 00:17:04
>>edgyqu+t3
I hope you are not contributing production code.
◧◩
178. rubick+MG2[view] [source] [discussion] 2023-06-30 13:11:58
>>edgyqu+t3
I feel this in my bones. Every. Single. Time.
◧◩
179. maschu+Gg4[view] [source] [discussion] 2023-06-30 18:53:24
>>stefan+y1
Maybe the status page is down - it needs a status page to tell us if the status page is down
◧◩◪◨⬒
180. salawa+QV7[view] [source] [discussion] 2023-07-01 19:54:39
>>bernie+MV
See, I prefer panic. People don't pay enough attention to systems as it is. Many just yeet together a bunch of parts and never bother to learn to actually troubleshoot, maintain, or reason through things.
◧◩◪◨
181. scrum-+Hd9[view] [source] [discussion] 2023-07-02 07:56:48
>>Commit+Sa
Can you describe what happened that qualifies as "scapegoated"?
◧◩◪◨
182. pandac+Td9[view] [source] [discussion] 2023-07-02 07:59:50
>>Commit+Sa
Sudden self-awareness of bias?

>>36508656

◧◩◪◨
183. adnaus+vsp[view] [source] [discussion] 2023-07-06 19:46:25
>>Shamel+oC
Thank you!
◧◩◪◨⬒
184. alexel+cOG[view] [source] [discussion] 2023-07-11 23:12:41
>>musha6+Fc
Surprisingly, not that bad ;) Just a cheap Hetzner box.