Ideally OTel would be more than observability, imo. Traces would be event sources, a thing that begets more computing. The system that observes computing should in turn also be the system that reacts and responds to it, begetting more computing elsewhere. That's the distributed system I want to see: local agents reporting their actions, other agents seeing that and responding. OTel would fit the need well, if only we could expand beyond thinking of observability as an operator affordance and start thinking of it as the system to track computation.
This was such a well-put comment; it truly made me grok the entire article in this one statement.
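The reactive-trace idea can be sketched in a toy form. This is not the OpenTelemetry API, just a minimal, hypothetical illustration of "observability data begets more computing": the same pipeline that exports spans also fans them out to reactors.

```python
# Toy sketch: trace emission doubles as an event source.
# Not the OTel API -- names here are invented for illustration.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Span:
    name: str
    attributes: dict = field(default_factory=dict)

class TracePipeline:
    """Records spans (classic observability) and fans them out to reactors."""
    def __init__(self):
        self.exported: list[Span] = []                   # observability sink
        self.reactors: list[Callable[[Span], None]] = [] # reactive consumers

    def on_end(self, span: Span):
        self.exported.append(span)   # observe...
        for react in self.reactors:  # ...and respond
            react(span)

# A "local agent" reports its actions; another agent sees them and responds.
pipeline = TracePipeline()
followups = []
pipeline.reactors.append(
    lambda s: followups.append(f"retrying {s.name}")
    if s.attributes.get("error") else None
)

pipeline.on_end(Span("fetch-user", {"error": True}))
pipeline.on_end(Span("render-page"))
```

In a real OTel deployment the analogous hook would live in a span processor or a collector pipeline, but the shape of the idea is the same.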
---
Infrastructure needs to be invisible, and that is where the future of AI-enabled orchestration/abstraction will allow development to be more poetry than code: we describe complex logic paths and workflows in a language of intent, and all the components required to accomplish the desired outcome become a reality far more quickly and elegantly.
The real challenge ahead is the divide between those who have the capability and power of all the AI tools available to them, and those who are subjugated by those who do.
For example, an individual can build a lot with the current state of the available tool universe... but a more sophisticated and well-funded organization will have a lot more potential capability.
What I am really interested to know is: is there a dark Dystopian Cyberpunk AI underworld happening yet?
What's the state of BadActor/BigCorpo/BigSpy capability and covert action currently?
While we are distracted by AI clip art and celebrity-voice squabbles, and top AI voices are seemingly ignored after founding organizations for alignment/governance/humane AI and warning of catastrophe - can anyone define The State of Things?
But yeah - abstracting away the code, so that the logic is handled for us yet remains portable, clonable, and easily refactorable, is where we are already headed. It's amazing and terrifying at the same time.
I'm thankful for all my cyberpunk-fantasy reading, thinking, and imagining, and for my tiny part in the overall evolution of today's tech world - having the opportunity to be here, to have worked with it and built within it - and now seeing the birth of AI and using it daily in my actual IRL interactions.
Such an amazing moment in Human History to be here through this.
See also the "Linux kernel management style" document that's been in the kernel since forever: https://docs.kernel.org/6.1/process/management-style.html
> It turns out that while it’s easy to undo technical mistakes, it’s not as easy to undo personality disorders. You just have to live with theirs - and yours.
this was definitely written by Linus XD
Querying unfortunately has lots of room for innovation, and it's really hard to nail down in a spec especially when the vendors all want to compete.
Being able to bring the whole application up locally should be an absolute non-negotiable.
This usually doesn't work that well for larger systems with services split between multiple teams. And it's not typically the RAM/CPU limitations that are the problem, but the amount of configuration that needs to be customized (and, in some cases, data).
Sooner or later, you just start testing with the other teams' production/staging environments rather than deal with local incompatibilities.
The S3 API (object storage) is the accepted storage API, but you do not need AWS (though they are very good at this).
The Kafka API is the accepted stream/buffer/queue API, but you do not need Confluent.
SQL is the query language, but you do not need a relational database.
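A rough illustration of SQL as the decoupled interface, using Python's built-in sqlite3 purely as a stand-in engine: engines like DuckDB or Trino accept near-identical SQL directly over Parquet files in S3-compatible object storage, with no dedicated database server involved.

```python
# The SQL is the portable part; the engine behind it is swappable.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (service TEXT, latency_ms REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("api", 12.5), ("api", 40.0), ("worker", 7.0)],
)

# This statement would run essentially unchanged on DuckDB or Trino
# pointed at files in object storage.
rows = conn.execute(
    "SELECT service, AVG(latency_ms) FROM events "
    "GROUP BY service ORDER BY service"
).fetchall()
# rows == [('api', 26.25), ('worker', 7.0)]
```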
I was kinda expecting BigQuery to do this under the hood, but it seems like they don't, which is a shame. BigQuery isn't faster than, say, Trino on GCS, even though Google could do some major optimisations here.
If you read this section, the author gets a lot of things right, but clearly doesn't know the space that well since there have been people building things along these lines for years. And making vague commentary instead of describing the nitty-gritty doesn't evoke much confidence.
I work on one such language/tool called mgmt config, but I have had virtually no interest and/or skill in marketing it. TBQH, I'm disenchanted by the fact that to get any recognition it seems you need VCs, a three-year timeline, short-term goals, and a plan to be done by then or move on.
If you're serious about future infra, then it's all here:
https://github.com/purpleidea/mgmt/
Looking for coding help for some of the harder bits that people might wish to add, and for people to take it into production and find issues that we've missed.
But it also opens up a threat vector. And you have competing users running their predicates, so one also has to think about queues and pipelining and so on. But that's probably solvable too, just like on any multiuser system.
Interesting.
And not only because of legacy systems that are hard to migrate to a modern platform. At my place of work there are workloads that can easily run on Kubernetes and it would be wise to do so. On the other hand there are systems that are not designed to run in a container and there is frankly no need to, because not everything needs to scale up and down or be available 100% of the time at all costs.
I think configuration management systems like mgmt (or Ansible and Puppet) are here to stay.
In this post it is attributed to a Jeff Bezos quote, but it was popular in the Pacific Northwest before his rise.
I think there is even a widening talent gap where you can't get people excited about doing something that maybe should have been done years ago (assuming VM -> containers makes sense for a thing). The salary needs to go higher for things that are less beneficial to the resume.
The industry at large asks most developers to stay up-to-date, so it starts looking suspicious when a company doesn't stay up-to-date too. For C# in particular, companies who have only recently migrated to .NET 5+ are now a red flag to me considering how long .NET Core has been out.
In the latter case I would consider it a red flag if some long-deprecated tool turned up in the tech stack of a company, but there might be perfectly good reasons to stick to the former, a bunch of VMs, instead of operating a Kubernetes cluster.
I ran a small Kubernetes cluster once and it turned out to be the wrong decision _at that time_. I think I would be delighted to see a job ad from a company that mentioned both (common hypervisors/VMs, containers/Kubernetes) in their tech stack. Without more information I would think that company took their time to evaluate their needs irrespective of current tech trends.
Mgmt doesn't care whether or not you want to build your system to be immutable; that's up to you! Mgmt lets you glue together the different pieces with a safe, reactive, distributed DSL.
Regarding your Talos comment, Kubernetes makes building things so complicated, so no, I don't think it will win out long term.
I think so too; however, "mgmt config" builds a lot of radical new primitives that Ansible and Puppet don't have. It's been negative for my "PR" to classify it as "config management", because people assume I'm building a "Puppet clone", but I really do see it as being in that space; it's just that those legacy tools never correctly delivered on the idea I thought they should have.
From the 70s through the 90s or 00s everything was file system-based, and it was just assumed that the best way to store data in a distributed system - even a globally-distributed one - was some sort of distributed file system (e.g. the Andrew File System, or research projects like OceanStore).
Nowadays the file system holds applications and configuration, but applications mostly store data in databases and object stores. In distributed systems this is done almost exclusively through system-specific network connections (e.g. port 3306 to MySQL, or HTTP for S3) rather than OS-level mounting of a file system.
(not counting HPC, where distributed file systems are used to preserve the developer look and feel of early non-distributed HPC systems)
That's probably about the time when your development pace goes downhill.
I think it's an interesting idea to consider: If some team interfaces with something outside of its control, they need to have a mock of it. That policy increases the development effort by at least a factor of two (you always have to create the mock alongside the thing), but it's just a linear increase.
Either way, once the local version exists, then the job becomes maintaining all the infrastructure that lets you bring up the pieces, populate them with reasonable state and wire them into whatever the bits are that are being actively hacked-on.
Oh, absolutely. But at this point, your team is probably around several dozen people and you have a product with paying customers. This naturally slows the development speed, however you organize the development process.
> I think it's an interesting idea to consider: If some team interfaces with something outside of its control, they need to have a mock of it. That policy increases the development effort by at least a factor of two (you always have to create the mock alongside the thing), but it's just a linear increase.
The problem is, you can't really recapture the actual behavior of a service in a mock.
To give you an example, DynamoDB in AWS has a local in-memory mock DB for testing and development. It has nearly the same functionality, but stores all the data in RAM. So the simulated global secondary indexes (something like table views in classic SQL databases) are updated instantly. But on the real database they are eventually consistent, and it can take a fraction of a second for them to update.
So when you try to use your service in production, it can start breaking under the load.
Perhaps we need better mocks that also simulate the behavior of the real services: delays, retries, and so on.
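A hypothetical sketch of what such a mock could look like: a table whose secondary index only applies writes after a simulated replication lag, so tests can surface the eventual-consistency bugs that an instantly-consistent in-memory mock hides. All names here are invented; this is not DynamoDB Local's implementation.

```python
# Toy mock: strongly consistent base table, eventually consistent index.
import time

class EventuallyConsistentIndexMock:
    """Writes hit the base table instantly; the 'index' lags behind."""
    def __init__(self, lag_seconds: float):
        self.lag = lag_seconds
        self.table: dict[str, str] = {}
        self._pending: list[tuple[float, str, str]] = []  # (visible_at, k, v)

    def put(self, key: str, value: str):
        self.table[key] = value                           # consistent read path
        self._pending.append((time.monotonic() + self.lag, key, value))

    def query_index(self, key: str):
        now = time.monotonic()
        visible = {k: v for t, k, v in self._pending if t <= now}
        return visible.get(key)                           # None until lag elapses

db = EventuallyConsistentIndexMock(lag_seconds=0.05)
db.put("user#1", "alice")
stale = db.query_index("user#1")   # index hasn't caught up yet: None
time.sleep(0.1)
fresh = db.query_index("user#1")   # "alice" once the simulated lag elapses
```

A test that reads its own write through the index immediately would fail against this mock, which is exactly the failure mode the real service exhibits under load.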
At BigCo we have migrated a number of internal things to OTel, but I don't think it has been worth the effort.
So many projects come with Prometheus metrics, dashboards, and alerts out of the box that it becomes hard to use anything else. When I pick some random Helm chart to install, you can almost guarantee that it comes with Prometheus integrations.
With Grafana Mimir you can now scale easily to a few billion metric streams, so a lot of the issues with the old Prometheus model have been fixed.
Like you said I don’t think there is much to innovate on in this area, which is a good thing.
It works great for stateless things, but not so great for stateful ones. I guess this plays into state being persisted in object storage or DBs, which allows the application to be stateless.
It's good to actually see even a mention of control theory.
My degree was in electronics and control theory, and whilst I've only had one job that involved either, I often think about software in these terms: I genuinely think that as an industry we need to seriously consider the systems we build in control-theoretic terms.
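To make the framing concrete, here is a hedged sketch of a proportional-integral controller steering a replica count toward a target CPU utilization - the kind of control loop the comment is arguing we should reason about explicitly (Kubernetes' HPA is, in effect, a proportional controller). Gains and numbers are made up for illustration.

```python
# PI controller: output = Kp * error + Ki * integral(error).
class PIController:
    def __init__(self, kp: float, ki: float, setpoint: float):
        self.kp, self.ki, self.setpoint = kp, ki, setpoint
        self.integral = 0.0

    def update(self, measurement: float, dt: float) -> float:
        error = measurement - self.setpoint   # above target => positive
        self.integral += error * dt           # accumulated error term
        return self.kp * error + self.ki * self.integral

ctrl = PIController(kp=5.0, ki=1.0, setpoint=0.6)   # target 60% CPU
replicas = 4.0
for cpu in [0.9, 0.85, 0.8, 0.7, 0.65]:             # observed utilization
    replicas += ctrl.update(cpu, dt=1.0)            # scale up while hot
replicas = max(1, round(replicas))
```

Thinking this way makes questions like stability, oscillation, and overshoot (all classic control-theory failure modes of autoscalers) first-class design concerns rather than surprises.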
SharePoint CSM, Dynamics, SQL Server CLR, Visual Studio extensions, Office AddIns.
I've also recently proposed a Table Read protocol that should be a "non-vendor-controlled" equivalent of BigQuery Storage APIs: https://engineeringideas.substack.com/p/table-transfer-proto...