zlacker

1. opport+(OP) 2019-07-05 22:15:48
Let's say you have a few tens of millions of people using your webapp and you are collecting analytics/usage data from them. You definitely don't want them all trying to authenticate + write directly to a db; you need an intermediate layer to process all the data coming from tons of different sources.

You could potentially do that with a separate microservice you talk to over HTTP, but that requires a "liveness" from the microservice that isn't really necessary: it has to process events just as fast as they come in, and you will often lose events whenever it can't keep up with the incoming load. The data flow is really just unidirectional, so the response is unnecessary; you just need to reliably transmit the data.
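To make the "lose events under load" point concrete, here is a toy simulation (not a real HTTP service). The function name, the tick model, and the load numbers are all made up for illustration; the only assumption is that a service with no buffer in front of it drops whatever it can't handle in the moment.

```python
def direct_delivery(arrivals, capacity_per_tick):
    """Simulate synchronous delivery to a service with no buffer in front of it.

    arrivals: number of events arriving in each time tick (hypothetical load).
    capacity_per_tick: how many events the service can handle per tick.
    Returns a (processed, lost) tuple.
    """
    processed = 0
    lost = 0
    for n in arrivals:
        handled = min(n, capacity_per_tick)
        processed += handled
        # Nothing holds the overflow, so it is dropped (think timeouts / 503s).
        lost += n - handled
    return processed, lost


# Mostly quiet traffic with a single spike in the middle.
bursty = [2, 2, 10, 2, 2]
print(direct_delivery(bursty, capacity_per_tick=4))  # (12, 6)
```

The service has plenty of spare capacity on average (18 events over 5 ticks against a capacity of 20), but it still loses 6 events because it has to absorb the spike the moment it happens.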

Kafka lets you handle unidirectional data flows in a way that is lazier. The data producers just write to a service and the consumers connect to that same service. In between, Kafka just behaves like a distributed message queue. This is a big benefit over writing directly to a db or any other kind of offline storage, since it can greatly reduce the connection overhead. The main benefit over using a microservice is that it relaxes the constraint that all the data is processed/handled exactly as it comes in. Adding this queue makes non-critical data flow more resilient.
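A rough sketch of what the queue buys you, using an in-memory deque as a stand-in. This is not Kafka's client API, and the function name, tick model, and load numbers are hypothetical; it only illustrates that a buffer lets the consumer fall behind during a spike and catch up afterwards.

```python
from collections import deque


def buffered_delivery(arrivals, capacity_per_tick, drain_ticks=0):
    """Simulate producers writing to a queue that a slower consumer drains.

    arrivals: number of events arriving in each time tick (hypothetical load).
    capacity_per_tick: how many events the consumer can handle per tick.
    drain_ticks: extra ticks with no arrivals, letting the consumer catch up.
    Returns a (processed, still_queued, max_backlog) tuple.
    """
    backlog = deque()
    processed = 0
    max_backlog = 0
    next_event_id = 0
    for n in list(arrivals) + [0] * drain_ticks:
        # Producers only append; they never wait on the consumer.
        for _ in range(n):
            backlog.append(next_event_id)
            next_event_id += 1
        max_backlog = max(max_backlog, len(backlog))
        # The consumer takes what it can this tick; the rest stays queued.
        for _ in range(min(capacity_per_tick, len(backlog))):
            backlog.popleft()
            processed += 1
    return processed, len(backlog), max_backlog


# Mostly quiet traffic with a single spike in the middle.
bursty = [2, 2, 10, 2, 2]
print(buffered_delivery(bursty, capacity_per_tick=4))                 # (16, 2, 10)
print(buffered_delivery(bursty, capacity_per_tick=4, drain_ticks=1))  # (18, 0, 10)
```

Nothing is dropped: the spike just shows up as a temporary backlog, and one extra quiet tick is enough to clear it. The trade-off is that events are handled late rather than exactly as they come in, which is fine for non-critical data like analytics.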

(I don't think the linked article does a great job explaining why you would use Kafka / what the alternatives are)
