https://gizmodo.com/5632095/justin-bieber-has-dedicated-serv...
Scaling a social network is inherently a hard problem, especially if you have a large userbase with a few very popular users. StackShare recently published a nice blog post about how we at Stream solve this for 300 million users with Go, RocksDB and Raft: https://stackshare.io/stream/stream-and-go-news-feeds-for-ov...
I think the most important part is using a combination of push and pull: you keep the most popular users in memory and serve their activity at read time, while for everyone else you use the traditional fan-out-on-write approach. The other thing that helped us scale was Go + RocksDB; the throughput is just so much higher compared to traditional databases like Cassandra.
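To make the push/pull split concrete, here is a rough sketch of the write path. Everything in it is illustrative: the types, the in-memory stores and the follower threshold are made up for the example, not Stream's actual code.

```go
package feed

// Rough sketch of a hybrid push/pull write path. All types, field names and
// the follower threshold below are illustrative, not Stream's actual code.

type Activity struct {
	AuthorID  string
	Payload   string
	Timestamp int64
}

// Above this follower count we treat an author as "popular" and skip fan-out.
const popularThreshold = 100000

type FeedService struct {
	followerCount  map[string]int        // authorID -> number of followers
	followersOf    map[string][]string   // authorID -> follower IDs
	feeds          map[string][]Activity // userID -> precomputed (pushed) feed
	popularAuthors map[string][]Activity // authorID -> posts pulled at read time
}

// AddActivity fans out on write for normal users (push) and stores posts from
// very popular users once, to be merged into timelines at read time (pull).
func (s *FeedService) AddActivity(a Activity) {
	if s.followerCount[a.AuthorID] >= popularThreshold {
		s.popularAuthors[a.AuthorID] = append(s.popularAuthors[a.AuthorID], a)
		return
	}
	for _, follower := range s.followersOf[a.AuthorID] {
		s.feeds[follower] = append(s.feeds[follower], a)
	}
}
```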
It's also interesting to note how other companies solved it. Instagram used a fan-out-on-write approach with Redis, later Cassandra, and eventually a flavor of Cassandra backed by RocksDB. They managed to make a full fan-out approach work through a combination of heavy optimization, a relatively low posting volume (compared to Twitter at least) and a ton of VC money.
Friendster and Hyves are two examples of companies that never really managed to solve this and went out of business (there were probably other factors as well, but still). I also heard one investor mention how Tumblr struggled with technical debt related to their feed. A more recent example is Vero, which basically collapsed under scaling issues.
Other components such as search were probably also quite tricky. One thing I've never figured out is how Facebook handles search on your timeline. That's a seriously complex problem.
LinkedIn recently published a bunch of papers about their feed tech:
https://engineering.linkedin.com/blog/2016/03/followfeed--li...
And Stream's StackShare post is also interesting: https://stackshare.io/stream/stream-and-go-news-feeds-for-ov...
Anyhow, the solution is easier than that (implement lock striping, i.e. https://stackoverflow.com/questions/16151606/need-simple-exp...), but I was reorg'd away before my code hit production, so I'm not sure what happened there.
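For anyone who doesn't want to click through: lock striping just means sharding one hot lock into N locks keyed by a hash of the key, so operations on unrelated keys stop contending. A minimal Go sketch (illustrative only, not the code I had in mind back then):

```go
package striped

import (
	"hash/fnv"
	"sync"
)

// StripedCounter shows lock striping: instead of one mutex guarding one big
// map, the key space is hashed across N independent (mutex, map) pairs, so
// operations on different keys rarely contend for the same lock.
type StripedCounter struct {
	stripes []stripe
}

type stripe struct {
	mu sync.Mutex
	m  map[string]int64
}

func New(n int) *StripedCounter {
	s := &StripedCounter{stripes: make([]stripe, n)}
	for i := range s.stripes {
		s.stripes[i].m = make(map[string]int64)
	}
	return s
}

// stripeFor hashes the key to pick which lock/map pair guards it.
func (s *StripedCounter) stripeFor(key string) *stripe {
	h := fnv.New32a()
	h.Write([]byte(key))
	return &s.stripes[h.Sum32()%uint32(len(s.stripes))]
}

func (s *StripedCounter) Incr(key string) {
	st := s.stripeFor(key)
	st.mu.Lock()
	st.m[key]++
	st.mu.Unlock()
}
```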
[1] https://www.wired.com/2015/09/whatsapp-serves-900-million-us...
https://www.theatlantic.com/technology/archive/2015/01/the-s...
Here is a quick excerpt; this book is filled to the brim with gems like this.
> The final twist of the Twitter anecdote: now that approach 2 is robustly implemented, Twitter is moving to a hybrid of both approaches. Most users’ tweets continue to be fanned out to home timelines at the time when they are posted, but a small number of users with a very large number of followers (i.e., celebrities) are excepted from this fan-out. Tweets from any celebrities that a user may follow are fetched separately and merged with that user’s home timeline when it is read, like in approach 1. This hybrid approach is able to deliver consistently good performance.
Approach 1 keeps a global collection of tweets; to build a user's home timeline you look up everyone they follow, fetch those users' tweets, and merge them sorted by time.
Approach 2 fans each new tweet out into every follower's home timeline, cached like a mailbox, so the timeline is already precomputed by the time it is read.
[1] https://www.amazon.com/Designing-Data-Intensive-Applications...
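To make the hybrid concrete, here is a rough sketch of the read path it describes: the precomputed (approach 2) timeline is merged at read time with tweets pulled from the few celebrities the user follows (approach 1). Types and function names are invented for illustration; this is not Twitter's code.

```go
package timeline

import "sort"

type Tweet struct {
	AuthorID string
	Text     string
	PostedAt int64
}

// LoadHomeTimeline shows the hybrid read path: the precomputed, fanned-out
// timeline (approach 2) is merged at read time with tweets pulled from the
// few celebrities the user follows (approach 1).
func LoadHomeTimeline(
	precomputed []Tweet, // already fanned out on write (approach 2)
	celebrities []string, // high-follower accounts the user follows
	recentTweetsOf func(authorID string) []Tweet, // pull path (approach 1)
) []Tweet {
	merged := append([]Tweet{}, precomputed...)
	for _, c := range celebrities {
		merged = append(merged, recentTweetsOf(c)...)
	}
	// Newest first, as a home timeline is displayed.
	sort.Slice(merged, func(i, j int) bool {
		return merged[i].PostedAt > merged[j].PostedAt
	})
	return merged
}
```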
When I was working on something with similar technical requirements, I also came across this paper (http://jeffterrace.com/docs/feeding-frenzy-sigmod10-web.pdf), which outlines the approach in a more formal manner.
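If I remember the paper correctly, its core idea is to decide push vs. pull per producer/consumer edge based on how often each side is active. Very roughly, in my paraphrase with simplified unit costs:

```go
package frenzy

// Edge describes one producer -> consumer relationship in a feed system.
// Rates are a simplification of the paper's cost model: each push and each
// pull is assumed to cost one unit of work.
type Edge struct {
	ProducerEventRate float64 // items produced per unit time
	ConsumerQueryRate float64 // feed reads per unit time
}

// ShouldPush returns true when materializing (pushing) the producer's items
// into the consumer's feed is expected to be cheaper than pulling them on
// every read, i.e. when the consumer reads at least as often as the producer posts.
func ShouldPush(e Edge) bool {
	return e.ConsumerQueryRate >= e.ProducerEventRate
}
```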
Scaling is fundamentally about a system's ability to support many servers: something is scalable if you can start with one server, grow to 100, 1,000, or 10,000 servers, and get a performance improvement commensurate with the increase in resources.
When people talk about languages scaling, that is a bit silly, because it is really the architecture that determines scalability. One language may be slower than another, but that does not affect the system's ability to add more servers.
Typically one language might be two, three, or even ten times slower than another. But in a highly scalable system, all that means is that you need two, three, or ten times as many servers to handle a given load. Servers aren't free (just ask Facebook), but a well-capitalized company can certainly afford them.
http://www.businessinsider.com/2008/5/why-can-t-twitter-scal...
Purely in terms of the app, I think the largest would be Cookpad. AirBnB never shared their numbers, so I don't know.
https://speakerdeck.com/a_matsuda/the-recipe-for-the-worlds-...
It was thrown when the system was failing at the Rails layer rather than the Apache layer. I believe that Ryan King and I were the last people to ever see the moan cone in production. =)
I am now CEO @ https://fauna.com/, making whales and fails a thing of the past for everybody. We are hiring, if you want to work on distributed databases.
Twitter was a realtime messaging platform (fundamentally not edge-cacheable) that evolved from that CMS foundation. So the reason for the difficult evolution should be clear.
It's not really a coincidence that before Twitter I worked at CNET on Urbanbaby.com, which was also a realtime, threaded, short-message web chat implemented in Rails.
Anyway the point is: use my new project/company https://fauna.com/ to scale your operational data. :-)
Pull down the code some time. Everything and the kitchen sink is in there somewhere. It's a crazy project.