> Please don't post on HN to ask or tell us something. Send it to hn@ycombinator.com.
> We're having a really bad day.
> The Unicorns have taken over. We're doing our best to get them under control and get GitHub back up and running.
Not updating that status page when the core domain goes down: less good
Investigating - We are currently experiencing an outage of GitHub products and are investigating. Jun 29, 2023 - 17:52 UTC
But remember that a large part of what GitHub offers is not directly available in git, e.g. pull requests, issues, the wiki, continuous xyz, etc. A lot of planning activities and "tell me what I need to do next" kinds of things are not tracked in git itself (of course).
So there's more to it than just the quip "git is a distributed version control system". The whole value of GitHub is more than just git commits.
No clue at all, just some romantic fantasy I concocted.
Everyone wants a green/red status, but the world is all shades of yellow.
If you can have 1% of stuff down 100% of the time, or 100% of the stuff down 1% of the time, I think there's a preference we _feel_ is better, but I'm not sure one is actually more practical than the other.
Of course, people can always mirror things, but that's not really what this comment is about, since people can do that today if they feel like it.
Although you are right in that they would be VCSing even better if they were using email as originally envisioned.
https://www.youtube.com/watch?v=YhDVC7-QgkI
One site that he mentioned was https://kernelci.org/ and the dashboard https://linux.kernelci.org/
[1] https://www.merriam-webster.com/words-at-play/jive-jibe-gibe
No changes - relatively easy to keep stable, as long as bugfixing is done.
Changes - new features = new bugs, new workloads.
Centralized VCS makes a lot of sense in a corporate flow, and isn't awful for many projects. I haven't seen a lot of projects that really embrace the distributed nature of git.
[Subtitle: Yes, he is wrong for doing that] The status page backend should actively probe the site, not just be told what to say and keep stale info around.
https://economictimes.indiatimes.com/thumb/msid-99511498,wid...
PS -- is there a better or more appropriate way to share images here? I know they're not really conducive to discussion, but given that this is a response to a joke comment I'm not sure...
Brief downtime really only affects the infrastructure surrounding the actual code. Workflows, issues, etc.
I talked to a CS person a couple months ago and they pretty much blamed the lack of stability on all the custom work they do for large customers. There's a TON of tech debt as a result basically.
Step 3: Profit
Solving for step 2: Place google ads, because those pages are favored.
(Sometimes a link to an image doesn't work for various reasons, always good to check.)
at a higher layer in the stack though, consider the well-established but mostly historic mailing-list patch flow: even when the list server goes down, i can still review and apply patches from my local inbox; i can still directly email my co-maintainers and collaborators. new patches are temporarily delayed, but retransmit logic is built in so that the user can still fire off the patch and go outside, rather than check back every so often to see if it's up yet.
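concretely, the offline half of that flow is just this (the mbox path is made up):

    # apply a patch series saved from my local inbox; --3way lets git fall back
    # to a three-way merge if the series doesn't apply cleanly onto my tree.
    git am --3way ~/Mail/lists/project/cool-series.mbox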
If you just mean checking whether downdetector.com is down, obviously you have to use a different service for that.
In either case, you should of course always have at least two custodes for cross-checking and backup purposes. (Which is the problem re Glassdoor and Yelp.)
They go for the Pull Requests, Issues, and general collaboration and workflow tools.
That's exactly the point. This infrastructure used to be supported by email which is also distributed and everyone has a complete copy of all of the data locally.
Github has been slowly trying to embrace, extend, and extinguish the distributed model.
You should have that sort of expectation with GitHub. How many more times do you need to realise that this service is unreliable?
I think we have given GitHub plenty of time to fix these issues and they haven't. So perhaps now is the perfect time to consider self-hosting as I said years ago. [1]
No more excuses this time.
[0] >>35967921
[1] >>22867803
At that point it's worse than what you already know from your browser - it may show the service is having issues when you can access it, or that the service is ok when you can't.
Hilarious
Try that and you'll understand why they update the pages manually.
Stage 1: Status is manually set. There may be various metrics around what requires an update, and there may be one or more layers of approval needed.
Problems: Delayed or missed updates. Customers complain that you're not being honest about outages.
Stage 2: Status is automatically set based on the outcome of some monitoring check or functional test.
Problems: Any issue with the system that performs the "up or not?" source of truth test can result in a status change regardless of whether an actual problem exists. "Override automatic status updates" becomes one of the first steps performed during incident response, turning this into "status is manually set, but with extra steps". Customers complain that you're not being honest about outages, and the update latency still sucks.
Stage 3: Status is automatically set based on a consensus of results from tests run from multiple points scattered across the public internet.
Problems: You now have a network of remote nodes to maintain yourself or pay someone else to maintain. The more reliable you want this monitoring to be, the more you need to spend. The cost justification discussions in an enterprise get harder as that cost rises. Meanwhile, many customers continue to say you're not being honest because they can't tell the difference between a local issue and an actual outage. Some customers might notice better alignment between the status page and their experience, but they're content, so they have little motivation to reach out and thank you for the honesty.
Eventually, the monitoring service gets axed because we can just manually update the status page after all.
Stage 4: Status is manually set. There may be various metrics around what requires an update, and there may be one or more layers of approval needed.
Not saying this is a great outcome, but it is an outcome that is understandable given the parameters of the situation.
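For what it's worth, the Stage 2/3 machinery doesn't have to be elaborate. A rough sketch of a majority-vote probe, with made-up probe hosts and health endpoint, and "flip the status page" left as a placeholder for whatever status-page API you use:

    #!/bin/sh
    # Ask several vantage points to probe the site; only call it an outage when
    # a majority of them fail, so one node's local network trouble doesn't count.
    NODES="probe-us-east probe-eu-west probe-ap-south"   # hypothetical probe hosts
    URL="https://example.com/healthz"                    # hypothetical health endpoint
    fail=0; total=0
    for node in $NODES; do
      total=$((total + 1))
      ssh "$node" "curl -fsS --max-time 10 '$URL' >/dev/null" || fail=$((fail + 1))
    done
    if [ $((fail * 2)) -gt "$total" ]; then
      echo "consensus: outage ($fail/$total probes failed)"  # placeholder: call the status-page API here
    fi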
GP is talking about directly emailing patches around or just having discussions over email. Not intermediated through GitHub.
> This does raise the question of why we don't enter this sense of jive, even though we have evidence of its use since the 1940s. [...] So far, neither jive nor gibe as substitutions for jibe has this kind of record [literally hundreds of years], but it seems possible that this use of jive will increase in the future, and if it does dictionaries will likely add it to the definition.
Take a service responding 1% of the time with errors. Probably not "down". What about 10%? Probably not. What about 50%? Maybe, hard to say.
Maybe there's a fiber cut in a rural village affecting 100% of your customers there but only 0.0001% of total customers?
Sure, there are cases like this where everything is hosed, but it sort of raises the question: is building a complex monitoring system for <some small number of downtimes a year> actually worth it?
Internal metrics: Healthy
External status check: Healthy
Did ops announce an incident: No
Backend API latency: )`'-.,_)`'-.,_)`'-.,_)`'-.,_)`'-.,_)`'-.,_)`'-.,_)`'-.,_
And when there's disagreement between indicators I can draw my own conclusions.
I guess in reality the very existence of a status page is a tenuous compromise between engineers wanting to be helpful towards external engineers, and business interests who would prefer to sweep things under various rugs as much as possible ("what's the point of a website whose entire point is to tell the world that we're currently fucking up?").
Worst case you have more data points to draw conclusions from. Status page red, works for me? Hmm, maybe that's why the engineers in the other office are goofing off on Slack. Status page green, I get HTTP 500s? Guess I can't do this thing but maybe other parts of the app still work?
Of course it still sucked when some tool decided I needed to update dependencies which all lived on regular Github, but at least our deployment stuff etc still worked.
That decision proved wise many times. I don't remember CodeCommit ever having any notable problems.
That said: if you're using GitHub in your actual dev processes (i.e. using it as a forge: using the issue tracker, PRs for reviews, etc), there's really no good way to isolate yourself as far as I know.
This is more likely a network routing issue or some other layer-4-or-below screw-up. Most application changes would be rolled out gradually with canaries and rolled back pretty quickly if things went wrong.
> i can still review and apply patches from my local inbox
`git fetch` gets me all the code from open PRs. And comments are already in email. Now I'm wondering whether I should put `git fetch` in my crontab.
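One wrinkle: a stock clone's fetch refspec only covers branch heads, so pulling down open PR heads too needs an extra refspec (assuming the remote is GitHub, which publishes them under refs/pull/<n>/head):

    # Mirror every open PR's head into refs/remotes/origin/pr/<n> on fetch:
    git config --add remote.origin.fetch '+refs/pull/*/head:refs/remotes/origin/pr/*'

    # Hypothetical crontab entry: refresh the local copy hourly.
    # 0 * * * *  cd /path/to/repo && git fetch --quiet origin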
> retransmit logic is built in so that the user can still fire off the patch and go outside
You can do that with a couple lines of bash, but I bet someone's already made a prettier script to retry an arbitrary non-interactive command like `git push`? This works best if your computer stays on while you go outside, but this is often the case even with a laptop, and even more so if you use a beefy remote server for development.
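Something like this is all I had in mind (nothing official, just the obvious loop; the retry limit and sleep are arbitrary):

    #!/bin/sh
    # retry-push.sh: keep retrying `git push` every minute until it succeeds,
    # or give up after roughly two hours.
    tries=0
    until git push "$@"; do
      tries=$((tries + 1))
      [ "$tries" -ge 120 ] && { echo "giving up after $tries attempts" >&2; exit 1; }
      sleep 60
    done

Run it as `./retry-push.sh origin my-branch` and go outside.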
Some of the earliest features and subcommands in Git were for generating and consuming patches sent and received via e-mail. Git can even send e-mail messages itself; see git-send-email(1). On open source mailing-lists when you see a long series of posts with subject prefixes like '[foo 1/7]', it's likely that series is being sent by the send-email subcommand, directly or indirectly.
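A sketch of what that looks like on the sending side; the list address and branch range are made up, and it assumes your sendemail.* SMTP settings are already configured:

    # One mail per commit plus a cover letter; the subject prefix is what
    # produces series like '[foo 1/7]' on the list.
    git format-patch --cover-letter --subject-prefix='foo' -o outgoing/ origin/master..HEAD

    # Hand the whole series to git to mail out.
    git send-email --to='foo-devel@example.org' outgoing/*.patch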
While I've long known that Git has such capabilities, that it was originally designed around the LKML workflow, and that many traditionally managed open source projects employ that workflow on both ends (sender and receiver), I've never used this feature myself, even though I actually host my own e-mail as well as my own Git repositories.[1] In fact, it was only the other day, while reading the musl-libc mailing list, that it clicked that multiple contributors had been pushing and discussing patches this way--specifically using the built-in subcommands as opposed to manually shuttling patches to and from their e-mail client--even though I've been subscribed to and following that mailing list for years.
The open source community has come to lean too heavily on GitHub- and GitLab-style, web-based pull request workflows. It's not good for the long-term health of open source, as these workflows are tailor-made for vendor lock-in, notwithstanding that GitHub and GitLab haven't yet abused this potential. Issue ticket management is a legitimate sore point for self-hosting open source projects. But Git's patch sharing capabilities are sophisticated and useful and can even be used over channels like IRC or ad hoc web forums, not simply via personal e-mail or group mailing-lists.
[1] Little known fact: you can host read-only Git repositories over HTTP statically, without any special server-side software. The git update-server-info subcommand generates auxiliary files in a bare repository that the git client automatically knows to look for when cloning over HTTP. While I use SSH to push into private Git repositories, each private Git repository has a post-receive hook that does `(cd "${M}" && git fetch && git --bare update-server-info)`, where '${M}' is a bare Git mirror[2] underneath the document root for the local HTTP server. (I would never run a git protocol daemon on my personal server; those and other niche application servers are security nightmares. But serving static files over HTTP is about as safe and foolproof as you can get.)
[2] See git clone --mirror (https://git-scm.com/docs/git-clone#Documentation/git-clone.t...)
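To make note [1] concrete, the one-time setup is roughly this (host names and paths are hypothetical):

    # Make a bare mirror under the web server's document root and generate the
    # index files (info/refs etc.) that the dumb-HTTP client looks for.
    git clone --mirror ssh://myserver/srv/git/project.git /var/www/html/project.git
    (cd /var/www/html/project.git && git update-server-info)

    # Anyone can now clone it read-only straight off the static file server:
    git clone https://example.com/project.git

The post-receive hook quoted in note [1] is then what keeps the mirror and its info/refs index fresh after each push.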
EDIT: Regarding note #1, in principle one could build a web-based Git repository browser that runs purely client-side. Using WASM one could probably quickly hack pre-existing server-side applications like gitweb to work this way, or at least make use of libgit2 for a low-level repository interface. If I could retire tomorrow, this is a project that would be at the top of my list.
Of course they should try to update their status page in a timely manner, but it is frequently manual from what I’ve seen.
This is equivalent to step 3 :)