zlacker

12 comments
1. berkes+(OP)[view] [source] 2021-11-26 07:51:08
When you are wondering whether you might need Kafka, it is certain that you don't need it.

But there are times when you have a problem, and amongst the possible solutions is Kafka.

I've come across Kafkaesque problems only three times in the last seven years: a hosting platform that had to parse the logs of over 700 WordPress sites for security and other business logic; putting all events of a financial app backend into data lakes; and filtering and parsing all OpenStreetMap changesets live.

replies(2): >>benjam+G1 >>olavgg+6e
2. benjam+G1[view] [source] 2021-11-26 08:13:33
>>berkes+(OP)
Not sure I agree. It seems as good a way as any to decouple systems, asynchronously exchange messages between services, get them into durable storage, and get exactly-once processing, replay semantics, etc. I think it should at least be on the table whenever we have two services that need to exchange data.
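
To make "replay semantics" concrete, here is a minimal in-memory sketch of an append-only log with per-consumer offsets. It is not Kafka's API; the `Log` class, its methods, and the consumer names are hypothetical and only illustrate the idea that each consumer tracks its own position and can rewind.

```python
from collections import defaultdict


class Log:
    """Toy append-only log (hypothetical; a stand-in for one topic partition)."""

    def __init__(self):
        self._records = []
        self._offsets = defaultdict(int)  # consumer name -> next offset to read

    def append(self, record):
        """Store a record and return the offset it was written at."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer):
        """Return the records this consumer has not read yet and advance its offset."""
        start = self._offsets[consumer]
        batch = self._records[start:]
        self._offsets[consumer] = len(self._records)
        return batch

    def seek(self, consumer, offset):
        """Move a consumer's position; rewinding is what allows replay."""
        self._offsets[consumer] = offset


log = Log()
log.append({"order": 1})
log.append({"order": 2})

first = log.poll("billing")      # both records
nothing = log.poll("billing")    # empty: the offset has already advanced
log.seek("billing", 0)
replayed = log.poll("billing")   # both records again
shipping = log.poll("shipping")  # a second consumer reads independently
```

The producer never needs to know who consumes the records, which is the decoupling being described; durability and exactly-once guarantees are out of scope for this sketch.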

Maybe a few use cases could be switched out for direct API calls, but I think Kafka hits the sweet spot in many situations.

What alternatives would you be looking at?

replies(3): >>weego+63 >>berkes+fv >>LgWood+fC
◧◩
3. weego+63[view] [source] [discussion] 2021-11-26 08:35:17
>>benjam+G1
It is as good as any if you have no financial constraints, no technical overhead, and no time constraints, but everyone has those.

Kafka is one of those systems that needs to be justified by out-scaling other solutions that don't come wedded with all its baggage.

replies(1): >>hansbo+L3
◧◩◪
4. hansbo+L3[view] [source] [discussion] 2021-11-26 08:46:06
>>weego+63
What would you say is its baggage?
replies(1): >>berkes+Pv
5. olavgg+6e[view] [source] 2021-11-26 10:52:39
>>berkes+(OP)
I work in the oil and gas industry, where legacy systems run on their last breath. Kafka is a fantastic tool and solves a shit ton of problems. We have millions of sensors on an offshore installation. These all send data into Kafka, where we generate events on new topics from different time series. Other data services consume these topics and get their data updated in near real time.

No more daily SQL dumps from offshore to onshore and big batch procedures to generate outdated events.

replies(3): >>elcano+Cr >>berkes+5u >>fatbir+Y01
◧◩
6. elcano+Cr[view] [source] [discussion] 2021-11-26 13:10:25
>>olavgg+6e
What legacy systems is the oil and gas industry using? MQTT? OPC-DA? OPC-UA?
◧◩
7. berkes+5u[view] [source] [discussion] 2021-11-26 13:39:03
>>olavgg+6e
Sounds like you have Serious Problems, for which Kafka is a very good solution.

For me, Kafka sits in the same area of solutions as Kubernetes, Hadoop clusters, or anything "webscale": you don't need it. Until you do, but by then you'll have (i) Serious Problems which such systems solve and (ii) the manpower and budgets to fix them.

By which I don't mean that you should avoid Kafka at all costs. By all means, play around with it: if anything, the event-driven approach will teach you things that make you a better Rails/Flask/WordPress developer, if that is what you do.

◧◩
8. berkes+fv[view] [source] [discussion] 2021-11-26 13:49:20
>>benjam+G1
Some alternatives are:

* Just keep your architecture a monolith. You'll do fine in the majority of cases.

* Event-sourcing doesn't require Kafka clusters. Nor do event-driven setups. You don't need complex tooling to pass around strings/JSON blobs. An S3 bucket or a PostgreSQL database storing "Events-as-json" is often fine.

* Postgres can do most of what you need (except for the "webscale" clustering etc)[0] in practice already.

* Redis[1]
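
As a rough illustration of the "Events-as-json" option, here is a minimal sketch of an append-only events table. To stay self-contained it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL (that swap is an assumption for the example, not what the comment describes); the table layout, event names, and helper functions are all hypothetical.

```python
import json
import sqlite3

# Assumption: SQLite in memory stands in for a PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " type TEXT NOT NULL,"
    " payload TEXT NOT NULL)"  # JSON stored as text
)


def append_event(event_type, payload):
    """Append one event; rows are never updated or deleted."""
    conn.execute(
        "INSERT INTO events (type, payload) VALUES (?, ?)",
        (event_type, json.dumps(payload)),
    )
    conn.commit()


def read_events(after_id=0):
    """Read events in insertion order, optionally resuming after a known id."""
    rows = conn.execute(
        "SELECT id, type, payload FROM events WHERE id > ? ORDER BY id",
        (after_id,),
    )
    return [(row_id, etype, json.loads(body)) for row_id, etype, body in rows]


append_event("account_opened", {"account": "a1"})
append_event("money_deposited", {"account": "a1", "amount": 50})
append_event("money_withdrawn", {"account": "a1", "amount": 20})

# Rebuild current state by folding over the event history.
balance = 0
for _id, etype, data in read_events():
    if etype == "money_deposited":
        balance += data["amount"]
    elif etype == "money_withdrawn":
        balance -= data["amount"]
```

The `after_id` parameter gives a consumer a crude resume point, which covers a lot of small event-driven setups before any clustering is needed.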

My main point is that while Kafka is a fantastic tool, you don't need that tool to achieve what you want in many cases.

> It seems as good a way as any to decouple systems

IMO relying on a tool to achieve a good software design, rather than design-patterns, is a recipe for trouble. If anything, because it locks you in (do you suddenly get a tightly coupled system if you remove Kafka?) or because its details force you into directions that don't naturally fit your domain or problem.

--

[0] https://spin.atomicobject.com/2021/02/04/redis-postgresql/

[1] https://redis.com/redis-best-practices/communication-pattern... etc.

◧◩◪◨
9. berkes+Pv[view] [source] [discussion] 2021-11-26 13:55:15
>>hansbo+L3
For me, the baggage is mostly the complexity of the service. With that comes monitoring, maintenance, tuning, debugging and troubleshooting.

This is lessened somewhat by SaaS products like Amazon Kinesis (technically not Kafka, but close).

Another piece of "baggage" is that an event-driven setup is eventually consistent (and async) by nature. If your software is already eventually consistent, this is not a problem. But it is a huge change if you come from a blocking/simple "crud" setup.
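
The difference from a blocking "crud" call can be sketched in a few lines. This is a toy, single-process model with hypothetical names (`pending`, `read_model`, `rename_user`, `run_projector`); in a real setup the queue would be a broker and the projector a separate consumer.

```python
from collections import deque

pending = deque()   # events that were published but not processed yet
read_model = {}     # what queries see


def rename_user(user_id, name):
    """Write side: publish an event and return immediately."""
    pending.append({"user_id": user_id, "name": name})


def run_projector():
    """Consumer side: apply queued events to the read model."""
    while pending:
        event = pending.popleft()
        read_model[event["user_id"]] = event["name"]


rename_user(1, "Ada")
stale = read_model.get(1)   # the write was accepted, but reads do not see it yet
run_projector()
fresh = read_model.get(1)   # visible once the consumer has caught up
```

Code written for a blocking setup tends to assume the read right after the write already reflects it, and that assumption is what breaks.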

replies(1): >>piyh+5A
◧◩◪◨⬒
10. piyh+5A[view] [source] [discussion] 2021-11-26 14:29:09
>>berkes+Pv
I second Kafka having massive operational overhead. It's a burden and is killing any support for it within our org.
◧◩
11. LgWood+fC[view] [source] [discussion] 2021-11-26 14:42:37
>>benjam+G1
I’ve seen comments like the gp’s often enough that it strikes me as a form of gatekeeping.

MY problems are so special that my use of Kafka was perfect, but YOURS are trivial and you shouldn’t even consider Kafka.

replies(1): >>robert+Zw1
◧◩
12. fatbir+Y01[view] [source] [discussion] 2021-11-26 17:27:47
>>olavgg+6e
I'm in the same situation in the paper making industry. Kafka is an almost perfect match for our needs: high volume, durable storage, decoupled stream processing.
◧◩◪
13. robert+Zw1[view] [source] [discussion] 2021-11-26 21:04:16
>>LgWood+fC
> it strikes me as a form of gatekeeping

I consider this a form of gatekeeping of advice on using Kafka.
