1. jaunty+(OP) 2024-08-27 01:51:21
OTel being a capture & ingest only specification is kind of messed up. There's no attempt, from what I can tell, to specify how to query or present stored data; it's just an over-the-wire specification, & that drastically limits its usable scope. It means vendors each get to make their own services & backends & tools, but it's grievously limiting the effort as a whole, & makes even an open spec like OTel a one-way door.

Ideally OTel would be more than observability, imo. Traces would be event-sources, would be a thing that begets more computing. The system to observe computing should in turn also be the system to react & respond to computing, should be begetting more computing elsewhere. That's the distributed system I want to see; local agents reporting their actions, other agents seeing that & responding. OTel would fit the need well, if only we could expand ourselves beyond thinking of observability as an operator affordance & start thinking of it as the system to track computation.
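
A rough sketch of what I mean, in plain Python with no OTel library involved. Everything here (SpanEvent, SpanBus, the span names) is made up for illustration and is not part of any OTel API; it only shows finished spans being published and other agents reacting by doing more work, which itself emits more spans.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SpanEvent:
    """A finished unit of work, loosely modeled on a trace span (hypothetical type)."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)


class SpanBus:
    """Hypothetical in-memory bus: agents publish spans, others subscribe by span name."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[SpanEvent], None]]] = {}

    def subscribe(self, span_name: str, handler: Callable[[SpanEvent], None]) -> None:
        self._handlers.setdefault(span_name, []).append(handler)

    def publish(self, event: SpanEvent) -> None:
        # Deliver the span to every agent that asked to react to this span name.
        for handler in list(self._handlers.get(event.name, [])):
            handler(event)


bus = SpanBus()
reactions: List[str] = []


def on_upload(event: SpanEvent) -> None:
    # Reacting to one span produces new work, which is reported as another span.
    reactions.append(f"thumbnail:{event.attributes['file']}")
    bus.publish(SpanEvent("thumbnail.create", {"file": event.attributes["file"]}))


def on_thumbnail(event: SpanEvent) -> None:
    reactions.append(f"index:{event.attributes['file']}")


bus.subscribe("upload.finish", on_upload)
bus.subscribe("thumbnail.create", on_thumbnail)
bus.publish(SpanEvent("upload.finish", {"file": "cat.png"}))
```

A real version would need a shared, queryable store behind the bus rather than in-process callbacks, which is exactly the part the spec leaves to vendors.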

replies(1): >>singro+S2
2. singro+S2 2024-08-27 02:32:32
>>jaunty+(OP)
OTel works as a standard since there isn't any need to innovate at that level. Despite its overcomplications, all the implementations largely have the same requirements, and it's useful to instrument everything the same way.

Querying, unfortunately, has lots of room for innovation, and it's really hard to nail down in a spec, especially when the vendors all want to compete.

replies(1): >>__turb+vj3
3. __turb+vj3 2024-08-28 06:49:24
>>singro+S2
OTel is nice and all, but I still think you are best off going 100% all in on Prometheus. Prometheus is so common that it has become a de facto standard for metrics.

At BigCo we have migrated a number of internal things to OTel, but I don’t think it has been worth the effort.

So many projects come with Prometheus metrics, dashboards, and alerts out of the box that it becomes hard to use anything else. When you pick some random Helm chart to install, you can almost guarantee that it comes with Prometheus integrations.

With Grafana Mimir you can now scale easily to a few billion metrics streams, so a lot of the issues with the old model of Prometheus have been fixed.

Like you said, I don’t think there is much to innovate on in this area, which is a good thing.
