But I can't help but agree with a lot of points in this article. Go was designed by some old-school folks that maybe stuck a bit too hard to their principles, losing sight of the practical conveniences. That said, it's a _feeling_ I have, and maybe Go would be much worse if it had solved all these quirks. To be fair, I see more leniency in fixing quirks in the last few years, like at some point I didn't think we'd ever see generics, or custom iterators, etc.
The points about RAM and portability seem mostly like personal grievances though. If it was better, that would be nice, of course. But the GC in Go is very unlikely to cause issues in most programs even at very large scale, and it's not that hard to debug. And Go runs on most platforms anyone could ever wish to ship their software on.
But yeah the whole error / nil situation still bothers me. I find myself wishing for Result[Ok, Err] and Optional[T] quite often.
I'd say that it's entirely the other way around: they stuck to the practical convenience of solving the problem that they had in front of them, quickly, instead of analyzing the problem from the first principles, and solving the problem correctly (or using a solution that was Not Invented Here).
Go's filesystem API is the perfect example. You need to open files? Great, we'll create
func Open(name string) (*File, error)
function, you can open files now, done. What if the file name is not valid UTF-8, though? Who cares, hasn't happened to me in the first 5 years I used Go.

Is it the best or most robust, or can you do fancy shit with it? No.
But it works well enough to release reliable software along with the massive linter framework that's built on top of Go.
I agree.
The Go std-lib is fantastic.
Also no dependency-hell with Go, unlike with Python. Just ship an oven-ready binary.
And what's the alternative?
Java? Licensing sagas requiring the use of divergent forks. Plus Go is easier to work with, perhaps especially for server-side deployments.
Zig? Rust? Complex learning curve. And having to choose e.g. Rust crates re-introduces dependency hell and the potential for supply-chain attacks.
The code was on the hot path of their central routing server handling billions (with a B) of messages a second or something crazy like that.
You're not building Discord, the GC will most likely never be even a blip in your metrics. The GC is just fine.
Nothing? Neither Go nor the OS require file names to be UTF-8, I believe
you can go `uv run script.py` and it'll automatically fetch the libraries and run the script in a virtual environment.
Still no match for Go though, shipping a single cross-compiled binary is a joy. And with a bit of trickery you can even bundle in your whole static website in it :) Works great when you're building business logic with a simple UI on top.
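(The "bit of trickery" is mostly standard now via the embed package from Go 1.16; a minimal sketch, assuming your assets live in a ./static directory:)

package main

import (
    "embed"
    "log"
    "net/http"
)

//go:embed static
var site embed.FS // the whole ./static tree is compiled into the binary

func main() {
    // Note: files are served under /static/...; use io/fs.Sub to mount them at the root.
    http.Handle("/", http.FileServer(http.FS(site)))
    log.Fatal(http.ListenAndServe(":8080", nil))
}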
The issue is that it was a bit outdated in the choice of _which_ things to pick as the one Go way. People expect a map/filter method rather than a loop with off-by-one risks, a type system with the smartness of TypeScript's (if less featured and more heavily enforced), error handling is annoying, and so on.
I get that it’s tough to implement some of those features without opening the way to a lot of “creativity” in the bad sense. But I feel like go is sometimes a hard sell for this reason, for young devs whose mother language is JavaScript and not C.
Yes, my favourite is the `time` package. It's just so elegant how it's just a number under there; the nominal type system truly shines. And using it is a treat. What do you mean I can do `+= 8*time.Hour` :D
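For anyone who hasn't tried it, a tiny sketch of those ergonomics (just the stdlib):

var timeout time.Duration
timeout += 8 * time.Hour            // untyped constant * Duration: just works
fmt.Println(timeout)                // "8h0m0s", courtesy of Duration's String method
deadline := time.Now().Add(timeout) // and Time vs Duration stay distinct types
_ = deadline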
Yeah, these are sagas only, because there is basically one, single, completely free implementation anyone uses on the server-side and it's OpenJDK, which was made 100% open-source and the reference implementation by Oracle. Basically all of Corretto, AdoptOpenJDK, etc are just builds of the exact same repository.
People bringing this whole license topic up can't be taken seriously, it's like saying that Linux is proprietary because you can pay for support at Red Hat..
I've said this before, but much of Go's design looks like it's imitating the C++ style at Google. In the comments where I see people saying they like something about Go, it's often an idiom that showed up first in the C++ macros or tooling.
I used to check this before I left Google, and I'm sure it's becoming less true over time. But to me it looks like the idea of Go was basically "what if we created a Python-like compiled language that was easier to onboard than C++ but which still had our C++ ergonomics?"
I got insta-rejected in an interview when I said this in response to the interview panel's question about 'thoughts on Golang'.
Like, they said 'the interview is over' and showed me the (virtual) door. I was stunned, lol. This was during peak Golang mania. Not sure what happened to rancherlabs.
You really come to appreciate when these batteries are included with the language itself. That Go binary will _always_ run but that Python project won't build in a few years.
So you mean all those universities and other places that have been forced to spend $$$ on licenses under the new regime also can't be taken seriously? Are you saying none of them took advice and had nobody on staff to tell them OpenJDK exists?
Regarding your Linux comment, some of us are old enough to remember the SCO saga.
Sadly Oracle has deeper pockets to pay more lawyers than SCO ever did...
You can do something like WTF-8 (not a misspelling, alas) to make it bidirectional. Rust does this under the hood but doesn’t expose the internal representation.
Do they? After too many functional battles I started practicing what I'm jokingly calling "Debugging-Driven Development": just like TDD keeps design decisions in mind to allow for testability from the get-go, this makes me write code that will be trivially easy to debug (especially printf-guided debugging and step-by-step execution debugging).
Like, adding a printf in the middle of a for loop, without even needing to understand the logic of the loop. Just make a new line and write a printf. I grew tired of all those tight chains of code that iterate beautifully but later when in a hurry at 3am on a Sunday are hell to decompose and debug.
Yeah, but you still have to install `uv` as a pre-requisite.
And you still end up with a virtual environment full of dependency hell.
And then of course we all remember that whole messy era when Python 2 transitioned to Python 3, and then deferred it, and deferred it again....
You make a fair point, of course it is technically possible to make it (slightly) "cleaner". But I'll still take the Go binary thanks. ;-)
It's simplistic and that's nice for small tools or scripts, but at scale it becomes really brittle since none of the edge cases are handled
I'm only a casual user of both, but how are Rust crates meaningfully different from Go's dependency management?
I don't know what/which university you talk about, but I'm sure they were also "forced to pay $$$" for their water bills and whatnot. If they decided to go with paid support, then.. you have to pay for it. In exchange you can a) point your finger at a third-party if something goes wrong (which governments love doing/often legally necessary) b) get actual live support on Christmas Eve if needed.
This info is actually quite surprising to me; I'd never heard of it, since everywhere I know switched to OpenJDK-based alternatives from the get-go. There was no reason to stay on the Oracle one after the licensing shenanigans they tried to play.
Why did these places keep the Oracle JDK and end up paying for it? OpenJDK was a drop-in replacement; nothing of value is lost by switching...
It's far better to get some � when working with messy data instead of applications refusing to work and erroring out left and right.
I think it's a bad trade-off, most languages out there are moving away from it
[]rune is for sequences of Unicode characters. rune is an alias for int32. string, I think, is essentially an immutable []byte.
I know it's mostly a matter of taste, but darn, it feels horrible. And there are no default parameter values, and the error handling smells bad, and no real stack trace in production. And the "object orientation" syntax, adding some ugly receiver to each function. And the pointers...
It took me back to my C/C++ days. Like programming with 25 year old technology from back when I was in university in 1999.
E.g., IIRC, Rust has multiple ways of handling strings while Go has (to a big extent) only one (thanks to the GC).
Many compiled languages are very slow to compile, however, especially for large projects; C++ and Rust are the usual examples.
In Go, `int * Duration = error`, but `Duration * Duration = Duration`!
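A minimal sketch of both halves of that claim:

hours := 8                            // plain int
// d := hours * time.Hour             // compile error: mismatched int and time.Duration
d := time.Duration(hours) * time.Hour // the cast Go makes you write
weird := time.Second * time.Second    // compiles fine: Duration * Duration = (nonsense) Duration
_, _ = d, weird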
It's just that a ridiculous amount of steps in real world problems can be summarised as 'reshape this data', 'give me a subset of this set', or 'aggregate this data by this field'.
Loops are, IMO, very bad at expressing those common concepts briefly and clearly. They take a lot of screen space and usually accessory variables, and it isn't immediately clear from just seeing a for block what you're about to do. "I'm about to iterate" isn't useful information to me as a reader: are you transforming data, selecting it, aggregating it?
The consequence is that you usually end up with tons of lines like
userIds = getIdsfromUsers(users);
where the function is just burying a loop. Compare to:
userIds = users.pluck('id')
and you save the buried utility function somewhere else.
I quite often see devs introducing them in other languages like TypeScript, but it just doesn't work as well when it's introduced in userland (usually you just end up with a small island of the codebase following this standard).
Quote from this article:[1]
*He told The Register that Oracle is "putting specific Java sales teams in country, and then identifying those companies that appear to be downloading and... then going in and requesting to [do] audits. That recipe appears to be playing out truly globally at this point."*
[1] https://www.theregister.com/2025/06/13/jisc_java_oracle/

See link/quote in my earlier reply above.
Also, as another topic, Oracle does audits specifically because their software doesn't phone home to check licenses and stuff like that - which is a crucial requirement for their intended target demographics: big government organizations, safety-critical systems, etc. A whole country's healthcare system, or a nuclear power plant, can't just stop because someone forgot to pay the bill.
So instead Oracle just visits companies that have a license with them, and checks what is being used to determine whether it's in accord with the existing contract. And yeah, on that front I have also heard of a couple of stories where a company was not using the software to the letter of the contract, e.g. accidentally enabling this or that, and at the audit the Oracle salesman said they would ignore the mistake if the company subscribed to a larger package, which most managers will gladly accept as they can avoid the blame. That's a questionable business practice, but it still doesn't have anything to do with OpenJDK.
Stuff like this matters a great deal on the standard library level.
Score another for Rust's Safety Culture. It would be convenient to just have &str as an alias for &[u8] but if that mistake had been allowed all the safety checking that Rust now does centrally has to be owned by every single user forever. Instead of a few dozen checks overseen by experts there'd be myriad sprinkled across every project and always ready to bite you.
There is a difference between "small" and Rust's, which is, for all intents and purposes, non-existent.
I mean, in 2025, not having crypto in the stdlib when every man and his dog is using crypto? Or HTTP, when every man and his dog are calling REST APIs?
As the other person who replied to you said, Go just allows you to hit the ground running and get on with it.
Having to navigate the world of crates, unofficially "blessed" or not is just a bit of a re-inventing the wheel scenario really....
P.S. The Go stdlib is also well maintained, so I don't really buy the specific "dead batteries" claim either.
And sure, it is welcome from a dev POV on one hand, though from an ecosystem perspective, more languages are not necessarily good as it multiplies the effort required.
It’s part trying to keep a common direction and part fear that dislike of their tech risks the hire not staying for long.
I don’t agree with this approach, don’t get me wrong, but I’ve seen it done and it might explain your experience.
They could support passing the filename as `string | []byte`. But wait, Go does not even have union types.
Go’s more chaotic approach to allow strings to have non-Unicode contents is IMO more ergonomic. You validate that strings are UTF-8 at the place where you care that they are UTF-8. (So I’m agreeing.)
[1] ZGC has basically decoupled the heap size from the pause time, at that point you get longer pauses from the OS scheduler than from GC.
It feels often like the two principles they stuck/stick to are "what makes writing the compiler easier" and "what makes compilation fast". And those are good goals, but they're only barely developer-oriented.
In general, Windows filenames are Unicode and you can always express those filenames by using the -W APIs (like CreateFileW()).
Internally time.Duration is a single 64bit count, while time.Time is two more complicated 64bit fields plus a location
I'm not and I'm glad the core team doesn't have to maintain an http server and can spend time on the low level features I chose Rust for.
You should always be able to iterate the code points of a string, whether or not it's valid Unicode. The iterator can either silently replace any errors with replacement characters, or denote the errors by returning eg, `Result<char, Utf8Error>`, depending on the use case.
All languages that have tried restricting Unicode afaik have ended up adding workarounds for the fact that real world "text" sometimes has encoding errors and it's often better to just preserve the errors instead of corrupting the data through replacement characters, or just refusing to accept some inputs and crashing the program.
In Rust there's bstr/ByteStr (currently being added to std), awkward having to decide which string type to use.
In Python there's PEP-383/"surrogateescape", which works because Python strings are not guaranteed valid (they're potentially ill-formed UTF-32 sequences, with a range restriction). Awkward figuring out when to actually use it.
In Raku there's UTF8-C8, which is probably the weirdest workaround of all (left as an exercise for the reader to try to understand .. oh, and it also interferes with valid Unicode that's not normalized, because that's another stupid restriction).
Meanwhile the Unicode standard itself specifies Unicode strings as being sequences of code units [0][1], so Go is one of the few modern languages that actually implements Unicode (8-bit) strings. Note that at least two out of the three inventors of Go also basically invented UTF-8.
[0] https://www.unicode.org/versions/Unicode16.0.0/core-spec/cha...
> Unicode string: A code unit sequence containing code units of a particular Unicode encoding form.
[1] https://www.unicode.org/versions/Unicode16.0.0/core-spec/cha...
> Unicode strings need not contain well-formed code unit sequences under all conditions. This is equivalent to saying that a particular Unicode string need not be in a Unicode encoding form.
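(For reference, Go's range loop is the replace-don't-crash flavor of iteration; a tiny sketch:

for i, r := range "a\xffb" { // 0xff can never appear in valid UTF-8
    fmt.Printf("%d: %q\n", i, r) // prints 0: 'a', then 1: '\ufffd', then 2: 'b'
}

The underlying bytes stay untouched; only the decoded view shows U+FFFD.)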
Like, yes, those ideas have frequently been driven too far and have led to their own pain points. But people also seem to frequently rediscover that removing them entirety will lead to pain, too.
It breaks. Which is weird, because you can create a string which isn't valid UTF-8 (e.g. "\xbd\xb2\x3d\xbc\x20\xe2\x8c\x98") and print it out with no trouble; you just can't pass it to e.g. `os.Create` or `os.Open`.
(Bash and a variety of other utils will also complain about it not being valid UTF-8; neovim won't save a file under that name; etc.)
Other than having to periodically remember what 0-padded milliseconds are or whatever this isn't a huge deal.
The article tries very hard to draw a connection between the licensing costs for the universities and Oracle auditing random java downloads, but nobody actually says that this is what happened.
The waiver of historic fees goes back to the last licensing change where Oracle changed how licensing fees would be calculated. So it seems reasonable that Oracle went after them because they were paying customers that failed to pay the inflated fees.
So for a large loop, code like
for i, value := range source { result[i] = value*2 + 1 }
would be 2x faster than a pair of loops like
for i, value := range source { intermediate[i] = value * 2 }
for i, value := range intermediate { result[i] = value + 1 }
This tends to be true for most languages, even the ones with easier concurrency support. Using it correctly is the tricky part.
I have no real problem with the portability. The area I see Go shining in is stuff like AWS Lambda where you want fast execution and aren't distributing the code to user systems.
Also, as mentioned by another comment, an HTTP or crypto library can become obsolete _fast_. What about HTTP/3? What about post-quantum crypto? What about security fixes? The stdlib is tied to the language version, thus to a language release. Keeping such code independent allows it to evolve much faster, be leaner, and be more composable. So yes, the library is well maintained, but it's tied to the Go version.
Also, it enables breaking API changes if absolutely needed. I can name two precedents:
- in rust, time APIs in chrono had to be changed a few times, and the Rust maintainers were thankful it was not part of the stdlib, as it allowed massive changes
- otoh, in Go, it was found out that net.IP has an absolutely atrocious design (it's just a named []byte). Tailscale wrote a replacement that's now in a subpackage of net, but the old net.IP is set in stone. (https://tailscale.com/blog/netaddr-new-ip-type-for-go)
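For reference, a quick sketch of old vs. new (the bridging call is the part worth double-checking against the docs):

addr, err := netip.ParseAddr("192.0.2.1") // net/netip: immutable, comparable value type
if err != nil {
    log.Fatal(err)
}
legacy := net.IP(addr.AsSlice()) // bridging back to the legacy []byte-based net.IP
_ = legacy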
For example, Rust iterators are lazily evaluated with early-exits (when filtering data), thus it's your first form but as optimized as possible. OTOH python's map/filter/etc may very well return a full list each time, like with your intermediate. [EDIT] python returns generators, so it's sane.
I would say that any sane language allowing functional-style data manipulation will have them as fast as manual for-loops. (that's why Rust bugs you with .iter()/.collect())
Well, so long as you don't care about compatibility with the broad ecosystem, you can write a perfectly fine Optional yourself:
type Optional[Value any] struct {
    value  Value
    exists bool
}
// New empty.
func New[Value any]() Optional[Value] { return Optional[Value]{} }
// New of value.
func Of[Value any](value Value) Optional[Value] { return Optional[Value]{value: value, exists: true} }
// New of pointer (nil becomes empty).
func OfPointer[Value any](value *Value) Optional[Value] {
    if value == nil { return Optional[Value]{} }
    return Of(*value)
}
// Only general way to get the value.
func (o Optional[Value]) Get() (Value, bool) { return o.value, o.exists }
// Get value or panic.
func (o Optional[Value]) MustGet() Value {
    value, ok := o.Get()
    if !ok { panic("Optional is empty") }
    return value
}
// Get value or default.
func (o Optional[Value]) GetOrElse(defaultValue Value) Value {
    if o.exists { return o.value }
    return defaultValue
}
// JSON support (empty marshals as null).
func (o Optional[Value]) MarshalJSON() ([]byte, error) {
    if !o.exists { return []byte("null"), nil }
    return json.Marshal(o.value)
}
func (o *Optional[Value]) UnmarshalJSON(data []byte) error {
    if string(data) == "null" { *o = Optional[Value]{}; return nil }
    if err := json.Unmarshal(data, &o.value); err != nil { return err }
    o.exists = true
    return nil
}
// DB support (sql.Scanner / driver.Valuer; bodies left as an exercise).
func (o *Optional[Value]) Scan(value any) error { panic("TODO") }
func (o Optional[Value]) Value() (driver.Value, error) { panic("TODO") }
But you probably do care about compatibility with everyone else, so... yeah it really sucks that the Go way of dealing with optionality is slinging pointers around.

I always encounter these upsides once every few years when preparing leetcode interviews, where this kind of optimization is needed for achieving acceptable results.
In daily life, however, most of these chunks of data to transform fall in one of these categories:
- small size, where readability and maintainability matters much more than performance
- living in a db, and being filtered/reshaped by the query rather than code
- being chunked for atomic processing in a queue or similar (usual when importing a big chunk of data).
- the operation itself is a standard algorithm that you just consume from a standard library that handles the loop internally.
Much like trees and recursion, most of us don't flex that muscle often. Your mileage may vary depending on domain, of course.
Especially given how the language was criticised back in 1996.
No, there is no dependency hell in the venv.
Python 2 to 3: are you really still kicking that horse? It's dead...please move on.
I think they only work if the language is built around it. In Rust, it works, because you just can't deref an Optional type without matching it, and the matching mechanism is much more general than that. But in other languages, it just becomes a wart.
As I said, some kind of type annotation would be most Go-like, e.g.
func f(ptr PtrToData?) int { ... }
You would only be allowed to touch *ptr inside an if ptr != nil { ... }. There's a linter from Uber (nilaway) that works like that, except for the type annotation. That proposal would break existing code, so perhaps something like an explicit marker for non-nil pointers is needed instead (but that's not very ergonomic, alas).

Go has chosen explicit over implicit everywhere except initialization, the one place where I really needed "explicit."
The problem with this, as with any lack of static typing, is that you now have to validate at _every_ place that cares, or to carefully track whether a value has already been validated, instead of validating once and letting the compiler check that it happened.
$ cat main.go
package main
import (
"log"
"os"
)
func main() {
f, err := os.Create("\xbd\xb2\x3d\xbc\x20\xe2\x8c\x98")
if err != nil {
log.Fatalf("create: %v", err)
}
_ = f
}
$ go run .
$ ls -1
''$'\275\262''='$'\274'' ⌘'
go.mod
main.go

WTF-8 has some inconvenient properties. Concatenating two strings requires special handling. Rust's opaque types can patch over this but I bet Go's WTF-8 handling exposes some unintuitive behavior.
There is a desire to add a normal string API to OsStr but the details aren't settled. For example: should it be possible to split an OsStr on an OsStr needle? This can be implemented but it'd require switching to OMG-WTF-8 (https://rust-lang.github.io/rfcs/2295-os-str-pattern.html), an encoding with even more special cases. (I've thrown my own hat into this ring with OsStr::slice_encoded_bytes().)
The current state is pretty sad yeah. If you're OK with losing portability you can use the OsStrExt extension traits.
If you have an int variable hours := 8, you have to cast it before multiplying.
This is also true for simple int and float operations.
f := 2.0
3 * f
is valid, but x := 3 would need float64(x)*f to be valid. The same is true for addition etc.

The Go language and its runtime is the only system I know that is able to handle concurrency with multicore CPUs seamlessly within the language, using the CSP-like (goroutine/channel) formalism, which is easy to reason with.
Python is a mess with the GIL and async libraries that are hard to reason with. C, C++, Java etc. need external libraries to implement threading, which can't be reasoned with in the context of the language itself.
So, Go is a perfect fit for the HTTP server (or service) use case, and in my experience there is no parallel.
Because 99.999% of the time you want it to be valid and would like an error if it isn't? If you want to work with invalid UTF-8, that should be a deliberate choice.
But certainly, anyone will bring their previous experience to the project, so there must be some Plan9 influence in there somewhere
IMO the differences with Windows are such that I’m much more unhappy with WTF-8. There’s a lot that sucks about C++ but at least I can do something like
#if _WIN32
using pathchar = wchar_t;
constexpr pathchar sep = L'\\';
#else
using pathchar = char;
constexpr pathchar sep = '/';
#endif
using pathstring = std::basic_string<pathchar>;
Mind you this sucks for a lot of reasons, one big reason being that you're directly exposed to the differences between path representations on different operating systems. Despite all the ways that this (above) sucks, I still generally prefer it over the approaches of Go or Rust.

I believe it's the only system you know. But it's far from the only one.
Validation is nice but Rust’s principled approach leaves me high and dry sometimes. Maybe Rust will finish figuring out the OsString interface and at that point we can say Rust has “won” the conversation, but it’s not there yet, and it’s been years.
Consider:
for i, chr := range string([]byte{226, 150, 136, 226, 150, 136}) {
fmt.Printf("%d = %v\n", i, chr)
// note, s[i] != chr
}
How many times does that loop over 6 bytes iterate? The answer is it iterates twice, with i=0 and i=3.

There are also quite a few standard APIs that behave weirdly if a string is not valid UTF-8, which wouldn't be the case if it was just a bag of bytes.
I don't have a lot of experience with the malloc languages at scale, but I do know that heap fragmentation and GC fragmentation are very similar problems.
There are techniques in GC languages to avoid GC like arena allocation and stuff like that, generally considered non-idiomatic.
So that means that for 99% of scenarios, the difference between char[] and a proper utf8 string is none. They have the same data representation and memory layout.
The problem comes in when people start using string like they use string in PHP. They just use it to store random bytes or other binary data.
This makes no sense with the string type. String is text, but now we don't have text. That's a problem.
We should use byte[] or something for this instead of string. That's an abuse of string. I don't think allowing strings to not be text is too constraining - that's what a string is!
I'd love to see a list of these, with any references you can provide.
Except when it doesn’t and then you have to deal with fucking Cthulhu because everyone thought they could just make incorrect assumptions that aren’t actually enforced anywhere because “oh that never happens”.
That isn’t engineering. It’s programming by coincidence.
> Maybe Rust will finish figuring out the OsString interface
The entire reason OsString is painful to use is because those problems exist and are real. Golang drops them on the floor and forces you to pick up the mess on the random day when an unlucky end user loses data. Rust forces you to confront them, as unfortunate as they are. It's painful once, and then the problem is solved for the indefinite future.
Rust also provides OsStrExt if you don’t care about portability, which greatly removes many of these issues.
I don’t know how that’s not ideal: mistakes are hard, but you can opt into better ergonomics if you don’t need the portability. If you end up needing portability later, the compiler will tell you that you can’t use the shortcuts you opted into.
The ergonomics for this use case are better than in any language I ever used.
You wanted sources, here's the chapter on tasks and synchronization in the Ada LRM: http://www.ada-auth.org/standards/22rm/html/RM-9.html
For Erlang and Elixir, concurrent programming is pretty much their thing so grab any book or tutorial on them and you'll be introduced to how they handle it.
Sure it's good compared to like... C++. Is go actually competing with C++? From where I'm standing, no.
But compared to what you might actually use Go for... The tooling is bad. PHP has better tooling, dotnet has better tooling, Java has better tooling.
If you use 3) to create a &str/String from invalid bytes, you can't safely use that string as the standard library is unfortunately designed around the assumption that only valid UTF-8 is stored.
https://doc.rust-lang.org/std/primitive.str.html#invariant
> Constructing a non-UTF-8 string slice is not immediate undefined behavior, but any function called on a string slice may assume that it is valid UTF-8, which means that a non-UTF-8 string slice can lead to undefined behavior down the road.
In Go's category, there's Java, Haskell, OCaml, Julia, Nim, Crystal, Pony...
Dynamic languages are more likely to have green threads but aren't Go replacements.
I agree with this. I feel like Go was a very smart choice to create a new language to be easy and practical and have great tooling, and not to be experimental or super ambitious in any particular direction, only trusting established programming patterns. It's just weird that they missed some things that had been pretty well hashed out by 2009.
Map/filter/etc. are a perfect example. I remember around 2000 the average programmer thought map and filter were pointlessly weird and exotic. Why not use a for loop like a normal human? Ten years later the average programmer was like, for loops are hard to read and are perfect hiding places for bugs, I can't believe we used to use them even for simple things like map, filter, and foreach.
By 2010, even Java had decided that it needed to add its "stream API" and lambda functions, because no matter how awful they looked when bolted onto Java, it was still an improvement in clarity and simplicity.
Somehow Go missed this step forward the industry had taken and decided to double down on "for." Go's different flavors of for are a significant improvement over the C/C++/Java for loop, but I think it would have been more in line with the conservative, pragmatic philosophy of Go to adopt the proven solution that the industry was converging on.
You list three that don't, and then you go on to list seven languages that do.
Yes, not many languages support concurrency like Go does...
Java does not need external libraries to implement threading, it's baked into the language and its standard libraries.
What do you mean by this for Java? The library is the runtime that ships with Java, and while they're OS threads under the hood, the abstraction isn't all that leaky, and it doesn't feel like they're actually outside the JVM.
Working with them can be a bit clunky, though.
my favorite example of this was the go authors refusing to add monotonic time into the standard library because they confidently misunderstood its necessity
(presumably because clocks at google don't ever step)
then after some huge outages (due to leap seconds) they finally added it
now the libraries are a complete mess because the original clock/time abstractions weren't built with the concept of multiple clocks
and every go program written is littered with terrible bugs due to use of the wrong clock
https://github.com/golang/go/issues/12914 (https://github.com/golang/go/issues/12914#issuecomment-15075... might qualify for the worst comment ever)
There are also runtimes like e.g. Hermes (used primarily by React Native), there's support for separating operations between the graphics thread and other threads.
All that being said, I won't dispute OP's point about "handling concurrency [...] within the language"- multithreading and concurrency are baked into the Golang language in a more fundamental way than Javascript. But it's certainly worth pointing out that at least several of the major runtimes are capable of multithreading, out of the box.
I assume, anyway. Maybe the Go debugger is kind of shitty, I don't know. But in PHP with xdebug you just use all the fancy array_* methods and then step through your closures or callables with the debugger.
Elixir handling 2 million websocket connections on a single machine back in 2015 would like to have a word.[1] This is largely thanks to the Erlang runtime it sits atop.
Having written some tricky Go (I implemented Raft for a class) and a lot of Elixir (professional development), it is my experience that Go's concurrency model works for a few cases but largely sucks in others and is way easier to write footguns in Go than it ought to be.
[1]: https://phoenixframework.org/blog/the-road-to-2-million-webs...
Source: spent the last few weeks at work replacing a Go program with an Elixir one instead.
I'd use Go again (without question) but it is not a panacea. It should be the default choice for CLI utilities and many servers, but the notion that it is the only usable language with something approximating CSP is idiotic.
I feel people who complain about rustc compile times must be new to using compiled languages…
Basically OP was saying that JavaScript can run multiple tasks concurrently, but with no parallelism since all tasks map to 1 OS thread.
However no &str is not "an alias for &&String" and I can't quite imagine how you'd think that. String doesn't exist in Rust's core, it's from alloc and thus wouldn't be available if you don't have an allocator.
import "gopkg.in/yaml.v3" // does *what* now?
curl https://gopkg.in/yaml.v3?go-get=1 | grep github
<meta name="go-source" content="gopkg.in/yaml.v3 _ https://github.com/go-yaml/yaml/tree/v3.0.1{/dir} https://github.com/go-yaml/yaml/blob/v3.0.1{/dir}/{file}#L{line}">
oh, ok :-/

I would presume only a go.mod entry would specify whether it really is v3.0.0 or v3.0.1
Also, for future generations, don't use that package https://github.com/go-yaml/yaml#this-project-is-unmaintained
So it's really Go vs. Java, or you can take a performance hit and use Erlang (valid choice for some tasks but not all), or take a chance on a novel paradigm/unsupported language.
The downside of a small stdlib is the proliferation of options, and you suddenly discover(ed?, it's been a minute) that your async package written for Tokio won't work on async-std and so forth.
This has often been the case in Go too - until `log/slog` existed, lots of people chose a structured logger and made it part of their API, forcing it on everyone else.
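For anyone who missed it, the stdlib option that eventually settled that particular churn (Go 1.21+):

logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
logger.Info("request handled", "method", "GET", "status", 200)
// {"time":"...","level":"INFO","msg":"request handled","method":"GET","status":200}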
I thought it was a seldom mentioned fact in Go that CSP systems are impossible to reason about outside of toy projects so everyone uses mutexes and such for systemic coordination.
I'm not sure I've even seen channels in a production application used for anything more than stopping a goroutine, collecting workgroup results, or something equally localized.
> Within a worker thread, worker.getEnvironmentData() returns a clone of data passed to the spawning thread's worker.setEnvironmentData(). Every new Worker receives its own copy of the environment data automatically.
M:1 threaded means that the user space threads are mapped onto a single kernel thread. Go is M:N threaded: goroutines can be arbitrarily scheduled across various underlying OS threads. Its primitives (goroutines and channels) make both concurrency and parallelism notably simpler than most languages.
> But it's certainly worth pointing out that at least several of the major runtimes are capable of multithreading, out of the box.
I’d personally disagree in this context. Almost every language has pthread-style cro-magnon concurrency primitives. The context for this thread is precisely how go differs from regular threading interfaces. Quoting gp:
> The go language and its runtime is the only system I know that is able to handle concurrency with multicore cpus seamlessly within the language, using the CSP-like (goroutine/channel) formalism which is easy to reason with.
Yes other languages have threading, but in go both concurrency and parallelism are easier than most.
(But not erlang :) )
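A minimal sketch of what "easier" means in practice, fan-out and fan-in in a few lines:

results := make(chan int)
for i := 0; i < 4; i++ {
    go func(n int) { results <- n * n }(i) // concurrency: goroutines are cheap to spawn
}
for i := 0; i < 4; i++ {
    fmt.Println(<-results) // parallelism: the runtime spreads goroutines over OS threads
}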
Most of the time if there's a result, there's no error. If there's an error, there's no result. But don't forget to check every time! And make sure you don't make a mistake when you're checking and accidentally use the value anyway, because even though it's technically meaningless it's still nominally a meaningful value since zero values are supposed to be meaningful.
Oh and make sure to double-check the docs, because the language can't let you know about the cases where both returns are meaningful.
The real world is messy. And golang doesn't give you advance warning on where the messes are, makes no effort to prevent you from stumbling into them, and stands next to you constantly criticizing you while you clean them up by yourself. "You aren't using that variable any more, clean that up too." "There's no new variables now, so use `err =` instead of `err :=`."
I recently realized that there is no easy way to "bubble up a goroutine error", and I wrote some code to make sure that was possible, and that's when I realized, as usual, that I was rewriting part of the OTP library.
The whole supervisor mechanism is so valuable for concurrency.
The tasks run concurrently, but not in parallel.
For JSON, you can't encode Optional[T] as nothing at all. It has to encode to something, which usually means null. But when you decode, the absence of the field means UnmarshalJSON doesn't get called at all. This typically results in the default value, which of course you would then re-encode as null. So if you round-trip your JSON, you get a materially different output than input (this matters for some other languages/libraries). Maybe the new encoding/json/v2 library fixes this, I haven't looked yet.
Also, I would usually want Optional[T]{value:nil,exists:true} to be impossible regardless of T. But Go's type system is too limited to express this restriction, or even to express a way for a function to enforce this restriction, without resorting to reflection, and reflection has a type erasure problem making it hard to get right even then! So you'd have to write a bunch of different constructors: one for all primitive types and strings; one each for pointers, maps, and slices; three for channels (chan T, <-chan T, chan<- T); and finally one for interfaces, which has to use reflection.
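To make the round-trip asymmetry concrete, a sketch using the Optional from upthread (the Payload type here is made up):

type Payload struct {
    Name Optional[string] `json:"name"`
}

var p Payload
_ = json.Unmarshal([]byte(`{}`), &p) // absent field: UnmarshalJSON is never called
out, _ := json.Marshal(p)            // ...but MarshalJSON is
fmt.Println(string(out))             // {"name":null}, not the {} we started with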
We can try to shove it into objects that work on other text but this won't work in edge cases.
Like if I take text on Linux and try to write a Windows file with that text, it's broken. And vice versa.
Go lets you do the broken thing. In Rust, or even using libraries in most languages, you can't. You have to specifically convert between them.
That's what I mean when I say "storing random binary data as text". Sure, Windows' almost-UTF-16 abomination is kind of text, but not really. It's its own thing, which requires a different type of string OR converting it to a normal string.
That works well for Go and Google, but I'm not sure how easy that'd be to replicate with Rust or others.
Should and could golang have been so much better than it is? Would golang have been better if Pike and co. had considered use-cases outside of Google, or looked outward for inspiration even just a little? Unambiguously yes, and none of the changes would have needed it to sacrifice its priorities of language simplicity, compilation speed, etc.
It is absolutely okay to feel that Go is a better language than some of its predecessors while at the same time being utterly frustrated at the very low-hanging, comparatively obvious, missed opportunities for it to have been drastically better.
It may be legacy cruft downstream of poorly-thought-out design decisions at the system/OS level, but we're stuck with it. And a language that doesn't provide the tooling necessary to muddle through this mess safely isn't a serious platform to build on, IMHO.
There is room for languages that explicitly make the tradeoff of being easy to use (e.g. a unified string type) at the cost of not handling many real world edge cases correctly. But these should not be used for serious things like backup systems where edge cases result in lost data. Go is making the tradeoff for language simplicity, while being marketed and positioned as a serious language for writing serious programs, which it is not.
Make use of binary libraries, export templates, incremental compilation and linking with multiple cores, and if using VC++ or clang vLatest, modules.
It still isn't Delphi fast, but becomes more manageable.
That's 6 languages, a non-exhaustive list of them, that are either properly mainstream and more popular than Go or at least well-known and easy to obtain and get started with. All of which have concurrency baked in and well-supported (unlike, say, C).
EDIT: And one more thing, all but Elixir are older than Go, though Clojure only slightly. So prior art was there to learn from.
But yeah, the CSP model is mostly dead. I think the language authors' insistence that goroutines should not be addressable or even preemptible from user code makes this inevitable.
Practical Go concurrency owes more to its green threads and colorless functions than its channels.
Again, this is the same "simplistic vs. just the right abstraction" distinction; this just smudges the complexity over a much larger surface area.
If you have a byte array that is not utf-8 encoded, then just... use a byte array.
I shouldn't fault the creators. They did what they did, and that is all well and good. I am more shocked by the way it has exploded in adoption.
Would love to see a coffeescript for golang.
You hear that Rob Pike? LOL. All those years he shat on Java, it was so irritating. (Yes schadenfreude /g)
But those are not rules. If you're doing stuff for fun, check out QBE <https://c9x.me/compile/> or Plan 9 C <https://plan9.io/sys/doc/comp.html> (which Go was derived from!)
Functions are colored: those taking a context.Context and those that don't.
But I agree, this is very faint coloring compared to async implementations. One is free to context.Background() liberally.
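i.e. something like this sketch (the URL is just an example):

// "Colored": takes a Context, so deadlines and cancellation flow through.
func fetch(ctx context.Context, url string) (*http.Response, error) {
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
    if err != nil {
        return nil, err
    }
    return http.DefaultClient.Do(req)
}

// A caller with nothing to propagate just backgrounds it:
// resp, err := fetch(context.Background(), "https://example.com")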
Off the top of my head, in order of likely difficulty to calculate: byte length, number of code points, number of graphemes/characters, height/width to display.
Maybe it would be best for Str not to have len at all. It could have bytes, code_points, graphemes. And every use would be precise.
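In Go terms, a sketch of how those currently split across the stdlib:

s := "héllo"
fmt.Println(len(s))                    // 6: byte length
fmt.Println(utf8.RuneCountInString(s)) // 5: code points (unicode/utf8)
// grapheme clusters need a third-party UAX #29 segmenter;
// display width depends on the font, so it can't live on a string type at all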
No, none outside of stdlib anyway in the way you're probably thinking of.
There are specialized constructs which live in third-party crates, such as rope implementations and stack-to-heap growable Strings, but those would have to exist as external modules in Go as well.
FWIW the docs indicate that working with grapheme clusters will never end up in the standard library.
Yes this is why all competent libraries don't actually use string for path. They have their own path data type because it's actually a different data type.
Again, you can do the Go thing and just use the broken string, but that's dumb and you shouldn't. They should look at C++ std::filesystem, it's actually quite good in this regard.
> And a language that doesn't provide the tooling necessary to muddle through this mess safely isn't a serious platform to build on, IMHO.
I agree, even PHP does a better job at this than Go, which is really saying something.
> Go is making the tradeoff for language simplicity, while being marketed and positioned as a serious language for writing serious programs, which it is not.
I would agree.
Once you know about it, though, it's easy to avoid. I do think, especially given that the CSP features of Go are downplayed nowadays, this should be addressed more prominently in the docs, with the more realistic solutions presented (atomics, mutexes).
It could also potentially be addressed using 128-bit atomics, at least for strings and interfaces (whereas slices are too big, taking up 3 words). The idea of adding general 128-bit atomic support is on their radar [2] and there already exists a package for it [3], but I don't think strings or interfaces meet the alignment requirements.
[1]: https://research.swtch.com/gorace
I mean, really neither should be the default. You should have to pick chars or bytes on use, but I don't think that's palatable; most languages have chosen one or the other as the preferred form. Or some have the joy of being forward thinking in the 90s and built around UCS-2 and later extended to UTF-16, so you've got 16-bit 'characters' with some code points that are two characters. Of course, dealing with operating systems means dealing with whatever they have as well as what the language prefers (or, as discussed elsewhere in this thread, pretending it doesn't exist to make easy things easier and hard things harder)
The answer here isn't to throw up your hands, pick one, and other cases be damned. It's to expose them all and let the engineer choose. To not beat the dead horse of Rust, I'll point that Ruby gets this right too.
* String#length # count Unicode code points
* String#bytes#length # count bytes
* String#grapheme_clusters#length # count grapheme clusters
Similarly, each of those "views" lets you slice, index, etc. across those concepts naturally. Golang's string is the worst of them all. They're nominally UTF-8, but nothing actually enforces it. But really they're just buckets of bytes, unless you send them to APIs that silently require them to be UTF-8 and drop them on the floor or misbehave if they're not.

Height/width to display is font-dependent, so it can't just be on a "string" but needs an object with additional context.
What is different about it? I don't see any constraints here relevant to having a different type. Note that this thread has already confused the issue, because they said filename and you said path. A path can contain /, it just happens to mean something.
If you want a better abstraction to locations of files on disk, then you shouldn't use paths at all, since they break if the file gets moved.
I have absolutely no idea how go would solve this problem, and in fact I don't think it does at all.
> The Go std-lib is fantastic.
I have seen worse, but I would still not call it decent considering this is a fairly new language that could have done a lot more.
I am going to ignore the incredible amount of asinine and downright wrong stuff in many of the most popular libraries (even the basic ones maintained by google) since you are talking only about the stdlib.
Off the top of my head, I found: inconsistent tag handling for structs (json defaults, omitzero vs omitempty), not even errors on tag typos, the reader/writer pattern that forces you to write custom connectors between the two, bzip2 has a reader but no writer, the context linked list for K/V. Just look at the consistency of the interfaces in the "encoding" pkg and cry; the package `hash` should actually be `checksum`. Why do `strconv.Atoi`/`Itoa` still exist? Time.Add() vs Time.Sub()...
It's chock full of inconsistencies. It forces me to look at the documentation every single time I don't use something for more than a couple of days. No, the autocomplete with the 2-line documentation does not include the potential pitfalls that are explained at the top of the package only.
And please don't get me started on the wrappers I had to write around stuff in the net library to make it a bit more consistent or just less plain wrong. net/url.Parse!!! I said don't make me start on this package! nil vs NoBody! ARGH!
None of this is stuff at the language level (of which there is plenty to say).
None of it is a dealbreaker per se, but it adds attrition and becomes death by a billion cuts.
I don't even trust any parser written in go anymore, I always try to come up with corner cases to check how it reacts, and I am often surprised by most of them.
Sure, there are worse languages and libraries. Still not something I would pick up in 2025 for a new project.
This is one of the minor errors in the post.
One of the great advances of Unix was that you don't need separate handling for binary data and text data; they are stored in the same kind of file and can be contained in the same kinds of strings (except, sadly, in C). Occasionally you need to do some kind of text-specific processing where you care, but the rest of the time you can keep all your code 8-bit clean so that it can handle any data safely.
Languages that have adopted the approach you advocate, such as Python, frequently have bugs like exception tracebacks they can't print (because stdout is set to ASCII) or filenames they can't open when they're passed in on the command line (because they aren't valid UTF-8).
The entire point of UTF-8 (designed, by the way, by the group that designed Go) is to encode Unicode in such a way that these byte string operations perform the corresponding Unicode operations, precisely so that you don't have to care whether your string is Unicode or just plain ASCII, so you don't need any error handling, except for the rare case where you want to do something related to the text that the string semantically represents. The only operation that doesn't really map is measuring the length.
But it is being put. Read newsletters like "The Go Blog", "Go Weekly". It's been improving constantly. Language-changes require lots of time to be done right, but the language is evolving.
Every single thing you listed here is supported by &[u8] type. That's the point: if you want to operate on data without assuming it's valid UTF-8, you just use &[u8] (or allocating Vec<u8>), and the standard library offers what you'd typically want, except of the functions that assume that the string is valid UTF-8 (like e.g. iterating over code points). If you want that, you need to convert your &[u8] to &str, and the process of conversion forces you to check for conversion errors.
Are Java AOT compilation times just as fast as Go?
There aren't many possibilities for nil errors in Go once you eliminate the self-harm of abusing pointers to represent optionality.
After Go added generics in version 1.18, you can just import someone else's generic implementations of whatever of these functions you want and use them all throughout your code and never think about it. It's no longer a problem.
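For illustration, a minimal sketch of the kind of helpers people import (the names here are arbitrary):

// Map applies f to every element of in.
func Map[T, U any](in []T, f func(T) U) []U {
    out := make([]U, 0, len(in))
    for _, v := range in {
        out = append(out, f(v))
    }
    return out
}

// Filter keeps only the elements for which keep returns true.
func Filter[T any](in []T, keep func(T) bool) []T {
    out := make([]T, 0, len(in))
    for _, v := range in {
        if keep(v) {
            out = append(out, v)
        }
    }
    return out
}

With that, the earlier example reads userIds := Map(users, func(u User) int { return u.ID }).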
And if you're engaging in CS then Go is probably the last language you should be using. If however, what you're interested in doing is programming, the fundamental data structures there are arrays and hashmaps, which Go has built-in. Everything else is niche.
> Also, as mentioned by another comment, an HTTP or crypto library can become obsolete _fast_. What about HTTP3? What about post-quantum crypto? What about security fixes? The stdlib is tied to the language version, thus to a language release. Having such code independant allows is to evolve much faster, be leaner, and be more composable. So yes, the library is well maintained, but it's tied to the Go version.
The entire point is to have a well supported crypto library. Which Go does and it's always kept up to date. Including security fixes.
As for post-quantum: https://words.filippo.io/mlkem768/
> - otoh, in Go, it was found out that net.Ip has an absolutely atrocious design (it's just an alias for []byte). Tailscale wrote a replacement that's now in a subpackage in net, but the old net.Ip is set in stone. (https://tailscale.com/blog/netaddr-new-ip-type-for-go)
Yes, and? This seems to me to be the perfect way to handle things - at all times there is a blessed high-quality library to use. As warts of its design get found out over time, a new version is worked on and released once every ~10 years.
A total mess of barely-supported libraries that the userbase is split over is just that - a mess.
Yes, that was my assumption when bash et al also had problems with it.
Typically the way you do this is you have the constructor for path do the validation or you use a static path::fromString() function.
Also, paths breaking when a file is moved is correct behavior sometimes. For example, something like openFile() or moveFile() requires paths. Also, a path can be a relative location.
colors := items.Filter(_.age > 20).Map(_.color)
Instead of colors := items.Filter(func(x Item) bool { return x.age > 20 }).Map(func(x Item) string { return x.color })

which, as best as I can tell, is how you'd express the same thing in Go if you had a container type with Map and Filter defined.

Joda-Time is an excellent library, and indeed it was basically the basis for Java's time API, and... for pretty much any modern language's time API. But given the history, Java basically always had the best time library available at the time.
Why not? Machine code is not all that special - C++ and Rust are slow due to optimizations, not because of machine code as a target itself. Go "barely does anything", just spits out machine code almost as is.
Java AOT via GraalVM's native image is quite slow, but it has a different way of working (doing all the Java class loading and initialization and "baking" that into the native image).
It's not viable to use, but: https://github.com/borgo-lang/borgo
{
  "value": "value",
  "exists": true
}

For nil, that's interesting. I've never run into issues there, so I never considered it.

Weird, to me that is a strong argument. Choose your stewards.
If your API takes &str, and tries to do byte-based indexing, it should almost certainly be taking &[u8] instead.
Yes, and that's a good thing. It allows every code that gets &str/String to assume that the input is valid UTF-8. The alternative would be that every single time you write a function that takes a string as an argument, you have to analyze your code, consider what would happen if the argument was not valid UTF-8, and handle that appropriately. You'd also have to redo the whole analysis every time you modify the function. That's a horrible waste of time: it's much better to:
1) Convert things to String early, and assume validity later, and
2) Make functions that explicitly don't care about validity take &[u8] instead.
This is, of course, exactly what Rust does: I am not aware of a single thing that &str allows you to do that you cannot do with &[u8], except things that do require you to assume it's valid UTF-8.
It's really not. Proebsting's Law applies.
Given that, compilers/languages should be optimized for programmer productivity first and code speed second.
Can it? If you want to open a file with invalid UTF8 in the name, then the path has to contain that.
And a path can contain the path separator - it's the filename that can't contain it.
> For example something like openFile() or moveFile() requires paths.
macOS has something called bookmark URLs that can contain things like inode numbers or addresses of network mounts. Apps use it to remember how to find recently opened files even if you've reorganized your disk or the mount has dropped off.
IIRC it does resolve to a path so it can use open() eventually, but you could imagine an alternative. Well, security issues aside.
Doesn't this demonstrate my point? If you can do everything with &[u8], what's the point in validating UTF-8? It's just a less universal string type, and your program wastes CPU cycles doing unnecessary validation.
Note that &[u8] would allow things like null bytes, and maybe other edge cases.
So you naturally write another one of these functions that takes a `&str` so that it can pass to another function that only accepts `&str`.
Fundamentally no one actually requires validation (ie, walking over the string an extra time up front), we're just making it part of the contract because something else has made it part of the contract.
In Linux, they’re 8-bit almost-arbitrary strings like you noted, and usually UTF-8. So they always have a convenient 8-bit encoding (I.e. leave them alone). If you hated yourself and wanted to convert them to UTF-16, however, you’d have the same problem Windows does but in reverse.
The upshot is that since the values aren’t always UTF-16, there’s no canonical way to convert them to single byte strings such that valid UTF-16 gets turned into valid UTF-8 but the rest can still be roundtripped. That’s what bastardized encodings like WTF-8 solve. The Rust Path API is the best take on this I’ve seen that doesn’t choke on bad Unicode.
use std::env;
fn main() {
let args: Vec<String> = env::args().collect();
...
}
When I run this code, a literal example from the official manual, with this filename I have here, it panics:

$ ./main $'\200'
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: "\x80"', library/std/src/env.rs:805:51
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
($'\200' is bash's notation for a single byte with the value 128. We'll see it below in the strace output.)

So, literally any program anyone writes in Rust will crash if you attempt to pass it that filename, if it uses the manual's recommended way to accept command-line arguments. It might work fine for a long time, in all kinds of tests, and then blow up in production when a wild file appears with a filename that fails to be valid Unicode.
This C program I just wrote handles it fine:
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

char buf[4096];

/* print the system error for s and bail out */
void
err(char *s)
{
        perror(s);
        exit(-1);
}

int
main(int argc, char **argv)
{
        int input, output;
        /* argv[1] and argv[2] go to open() as raw bytes; no encoding
           validation happens anywhere */
        if ((input = open(argv[1], O_RDONLY)) < 0) err(argv[1]);
        if ((output = open(argv[2], O_WRONLY | O_CREAT, 0666)) < 0) err(argv[2]);
        for (;;) {
                ssize_t size = read(input, buf, sizeof buf);
                if (size < 0) err("read");
                if (size == 0) return 0;
                ssize_t size2 = write(output, buf, (size_t)size);
                if (size2 != size) err("write");
        }
}
(I probably should have used O_TRUNC.)

Here you can see that it does successfully copy that file:
$ cat baz
cat: baz: No such file or directory
$ strace -s4096 ./cp $'\200' baz
execve("./cp", ["./cp", "\200", "baz"], 0x7ffd7ab60058 /* 50 vars */) = 0
brk(NULL) = 0xd3ec000
brk(0xd3ecd00) = 0xd3ecd00
arch_prctl(ARCH_SET_FS, 0xd3ec380) = 0
set_tid_address(0xd3ec650) = 4153012
set_robust_list(0xd3ec660, 24) = 0
rseq(0xd3ecca0, 0x20, 0, 0x53053053) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=9788*1024, rlim_max=RLIM64_INFINITY}) = 0
readlink("/proc/self/exe", ".../cp", 4096) = 22
getrandom("\xcf\x1f\xb7\xd3\xdb\x4c\xc7\x2c", 8, GRND_NONBLOCK) = 8
brk(NULL) = 0xd3ecd00
brk(0xd40dd00) = 0xd40dd00
brk(0xd40e000) = 0xd40e000
mprotect(0x4a2000, 16384, PROT_READ) = 0
openat(AT_FDCWD, "\200", O_RDONLY) = 3
openat(AT_FDCWD, "baz", O_WRONLY|O_CREAT, 0666) = 4
read(3, "foo\n", 4096) = 4
write(4, "foo\n", 4) = 4
read(3, "", 4096) = 0
exit_group(0) = ?
+++ exited with 0 +++
$ cat baz
foo
The Rust manual page linked above explains why they think introducing this bug by default into all your programs is a good idea, and how to avoid it:

> Note that std::env::args will panic if any argument contains invalid Unicode. If your program needs to accept arguments containing invalid Unicode, use std::env::args_os instead. That function returns an iterator that produces OsString values instead of String values. We’ve chosen to use std::env::args here for simplicity because OsString values differ per platform and are more complex to work with than String values.
I don't know what's "complex" about OsString, but for the time being I'll take the manual's word for it.
So, Rust's approach evidently makes it extremely hard not to introduce problems like that, even in the simplest programs.
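For what it's worth, the args_os version isn't much longer; a minimal sketch (my own code, not from the manual):

use std::env;
use std::ffi::OsString;
use std::fs;
use std::process;

fn main() {
    // args_os never panics: it yields OsString values, which hold
    // whatever bytes (or, on Windows, 16-bit units) the OS handed us
    let args: Vec<OsString> = env::args_os().collect();
    if args.len() != 3 {
        eprintln!("usage: cp SRC DST");
        process::exit(1);
    }
    // fs::copy takes AsRef<Path>, and Path wraps OsStr, so the
    // non-UTF-8 name flows through to open() untouched
    if let Err(e) = fs::copy(&args[1], &args[2]) {
        eprintln!("cp: {}", e);
        process::exit(1);
    }
}

It sidesteps the panic because the argument is never inspected as Unicode at all.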
Go's approach doesn't have that problem; this program works just as well as the C program, without the Rust footgun:
package main
import (
"io"
"log"
"os"
)
func main() {
src, err := os.Open(os.Args[1])
if err != nil {
log.Fatalf("open source: %v", err)
}
dst, err := os.OpenFile(os.Args[2], os.O_CREATE|os.O_WRONLY, 0666)
if err != nil {
log.Fatalf("create dest: %v", err)
}
if _, err := io.Copy(dst, src); err != nil {
log.Fatalf("copy: %v", err)
}
}
(O_CREATE makes me laugh. I guess Ken did get to spell "creat" with an "e" after all!)

This program generates a much less clean strace, so I am not going to include it.
You might wonder how such a filename could arise other than as a deliberate attack. The most common scenario is when the filenames are encoded in a non-Unicode encoding like Shift-JIS or Latin-1, followed by disk corruption, but the deliberate attack scenario is nothing to sneeze at either. You don't want attackers to be able to create filenames your tools can't see, or turn to stone if they examine, like Medusa.
Note that the log message on error also includes the ill-formed Unicode filename:
$ ./cp $'\201' baz
2025/08/22 21:53:49 open source: open ζ: no such file or directory
But it didn't say ζ. It actually emitted a byte with value 129, making the error message ill-formed UTF-8. This is obviously potentially dangerous, depending on where that logfile goes, because it can include arbitrary terminal escape sequences. But note that Rust's UTF-8 validation won't protect you from that, or from things like this:

$ ./cp $'\n2025/08/22 21:59:59 oh no' baz
2025/08/22 21:59:09 open source: open
2025/08/22 21:59:59 oh no: no such file or directory
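The fix there is output escaping, not input validation. In Rust terms, a small sketch: valid UTF-8 happily carries newlines and terminal escapes, so validation buys you nothing here.

fn main() {
    // all ASCII, so this passes UTF-8 validation without complaint
    let bytes = b"\n2025/08/22 21:59:59 oh no\x1b[2J";
    let s = String::from_utf8(bytes.to_vec()).unwrap(); // no error

    // printing with {} would send the newline and the escape sequence
    // (here, "clear screen") straight to the terminal; {:?} escapes them
    println!("{:?}", s);
}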
I'm not bagging on Rust. There are a lot of good things about Rust. But its string handling is not one of them.

If you stuff random binary data into a string, Go just steams along, as described in this post.
Over the decades I have lost data to tools skipping non-UTF-8 filenames. I should not be blamed for having files that were named before UTF-8 existed.
If your API takes &str and tries to do byte-based indexing, it should almost certainly be taking &[u8] instead.
str is indexed by bytes. That's the issue.

You're meant to use `unsafe` as a way of limiting the scope of reasoning about safety.
Once you construct a `&str` using `from_utf8_unchecked`, you can't safely pass it to any other function without looking at its code and reasoning about whether it's still safe.
Also see the actual documentation: https://doc.rust-lang.org/std/primitive.str.html#method.from...
> Safety: The bytes passed in must be valid UTF-8.
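A sketch of the difference in where that proof obligation lives (the byte values are arbitrary):

fn main() {
    let bytes: &[u8] = &[0x66, 0x6f, 0x6f, 0x80]; // "foo" plus a stray 0x80

    // Checked: the invariant is established exactly here, and the Err
    // case cannot be silently ignored.
    assert!(std::str::from_utf8(bytes).is_err());

    // Unchecked: the invariant is merely asserted. Every function that
    // later receives this &str trusts it blindly, which is why the
    // safety reasoning can no longer be kept local.
    let s = unsafe { std::str::from_utf8_unchecked(bytes) };
    let _ = s; // any downstream use of s is now library-level UB
}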
It seems like there's some confusion in the GGGGGP post, since Go works correctly even if the filename is not valid UTF-8... maybe that's why they haven't noticed any issues.
Granted, many people don't ever need to handle that kind of throughput. It depends on the app and the load put on it; many people don't realize that. Which is fine! If it works, it works. But if you do fall into the need for concurrency, yeah, you probably don't want to be using Node, even the newer versions. You certainly could do worse than golang. It's good we have some choices out there.
The other thing I always say is that the choice of languages and technology is not for one person. It's for the software and team at hand. I often choose languages, frameworks, and tools specifically because of the team that's charged with building and maintaining them. If you can make that team successful, because a language gives them the type safety or memory safety that Rust offers, or a good toolchain, whatever it is that the team needs, that's really good. In fact, it could well be the difference between a successful business and an unsuccessful one. No one really cares how magical the software is if the company goes under and no one uses the software.
https://github.com/golang/go/issues/32334
oops, looks like some files are just inaccessible to you, and you cannot copy them.
Fortunately, when you try to delete the source directory, Go's standard library enters an infinite loop, which saves your data.
Then make it valid UTF-8. If you try to solve the long tail of issues in a commonly used library function, it's going to cause a lot of pain. This approach is better: if someone has a weird problem like file names with invalid characters, they can solve it themselves, even publish a package. Why complicate 100% of uses to solve 0.01% of issues?
I think you misunderstand. How do you do that for a file that already exists on disk and needs to be read? Rename it for them? They may not like that.
I guess the issue isn't so much whether strings are well-formed, but whether the conversion (e.g., from UTF-16 to UTF-8 at the filesystem boundary) raises an error or silently modifies the data to use replacement characters.
I do think that is the main fundamental mistake in Go's Unicode handling; it tends to use replacement characters automatically instead of signalling errors. Using replacement characters is at least conformant to Unicode but imo unless you know the text is not going to be used as an identifier (like a filename), conversion should instead just fail.
The other option is using some mechanism to preserve the errors instead of failing quietly (replacement) or failing loudly (raise/throw/panic/return err), and I believe that's what they're now doing for filenames on Windows, using WTF-8. I agree with this new approach, though would still have preferred they not use replacement characters automatically in various places (another one is the "json" module, which quietly corrupts your non-UTF-8 and non-UTF-16 data using replacement characters).
Probably worth noting that the WTF-8 approach works because strings are not validated; WTF-8 involves converting invalid UTF-16 data into invalid UTF-8 data such that the conversion is reversible. It would not be possible to encode invalid UTF-16 data into valid UTF-8 data without changing the meaning of valid Unicode strings.
I think this is sensible, because the fact that Windows still uses UTF-16 (or more precisely "Unicode 16-bit strings") in some places shouldn't need to complicate the API on other platforms that didn't make the UCS-2/UTF-16 mistake.
It's possible that the WTF-8 strings might not concatenate the way they do in UTF-16 or properly enforced WTF-8 (which has special behaviour on concatenation), but they'll still round-trip to the intended 16-bit string, even after concatenation.
(1) "Generics are too complicated and academical and in the real world we only need them for a small number of well-known tasks anyway, so let's just leave them out!"
(2) The amount of code that does need generics but now has to work around the lack of them piles up, leading to an explosion of different libraries, design patterns, etc, that all try to partially recreate them in their own way.
(3) The language designers finally cave and introduce some kind of generics support in a later version of the language. However, at this point, they have to deal with all the "legacy" code that is not generics-aware and with runtime environments that aren't either. It also somehow has to play nice with all the ad-hoc solutions that are still present. So the new implementation has to deal with a myriad of special cases and tradeoffs that wouldn't have been there in the first place if generics had been included in the language from the beginning.
(4) All the tradeoffs give the feature a reputation of needless complexity and frustrating limitations and/or footguns, prompting the next language designer to wonder if they should include them at all. Go to (1) ...
For validation to be justified, there should eventually be some underlying operation that works only on valid UTF-8, but no such operation exists. UTF-8 was designed such that invalid data can be detected and handled without affecting the meaning of valid subsequences in the same string.
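That design is what makes recovery mechanical. A sketch of what String::from_utf8_lossy does under the hood, using Utf8Error's valid_up_to and error_len (the function name is mine):

// Recover the valid UTF-8 pieces of a byte stream, replacing each
// bad run with U+FFFD; the valid subsequences survive untouched.
fn lossy_pieces(mut bytes: &[u8]) -> String {
    let mut out = String::new();
    loop {
        match std::str::from_utf8(bytes) {
            Ok(s) => {
                out.push_str(s);
                return out;
            }
            Err(e) => {
                let valid = e.valid_up_to();
                // the prefix up to valid_up_to() is guaranteed UTF-8
                out.push_str(std::str::from_utf8(&bytes[..valid]).unwrap());
                out.push('\u{FFFD}');
                // error_len() is None only when input ends mid-sequence
                let bad = e.error_len().unwrap_or(bytes.len() - valid);
                bytes = &bytes[valid + bad..];
            }
        }
    }
}

fn main() {
    assert_eq!(lossy_pieces(b"ab\x80cd"), "ab\u{FFFD}cd");
}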
That "reign" lasted practically forever if you count up to when java.time was introduced, and no, Calendar was not much better in the meantime. Python already had datetime in 2002 or 2003, and VB6 was miles ahead back when Java had just util.Date.
Windows doing something similar wouldn't surprise me at all. I believe NTFS internally stores filenames as UTF-16, so enforcing well-formed UTF-16 at the API boundary sounds plausible.
$ golangci-lint linters | wc -l
107
That is a long list of linters (only a few enabled by default, however). I much prefer less fragmented approaches such as clippy or ruff. It makes for a more coherent experience and surely much higher performance (parsing the AST once instead of dozens of times).

let s = "asd";
println!("{}", s[0]);

You will get a compiler error telling you that you cannot index into &str.

fn main() {
let s = "12345";
println!("{}", &s[0..1]);
}
compiles and prints out "1".

This:
fn main() {
let s = "\u{1234}2345";
println!("{}", &s[0..1]);
}
compiles and panics with the following error:

byte index 1 is not a char boundary; it is inside 'ሴ' (bytes 0..3) of `ሴ2345`
To get the nth char (scalar codepoint):

fn main() {
let s = "\u{1234}2345";
println!("{}", s.chars().nth(1).unwrap());
}
To get a substring:

fn main() {
let s = "\u{1234}2345";
println!("{}", s.chars().skip(0).take(1).collect::<String>());
}
To actually get the bytes you'd have to call #as_bytes, which works with scalar and range indices, e.g.:

fn main() {
let s = "\u{1234}2345";
println!("{:02X?}", &s.as_bytes()[0..1]);
println!("{:02X}", &s.as_bytes()[0]);
}
IMO it's less intuitive than it should be, but still less bad than e.g. Go's two types of nil, because it will fail in a visible manner.

fn between(s: &str) -> Option<&str> {
    // find returns byte indices that always sit on char boundaries;
    // this assumes 'a' occurs before 'z' (if start > end the slice
    // would still panic, but not for a boundary violation)
    let start = s.find('a')?;
    let end = s.find('z')?;
    Some(&s[start..end])
}

and it will never panic on a char-boundary error, because find never returns an index that isn't on a char boundary.

Where would you even get them from?
In my case it was in parsing text where a numeric value had a two-character prefix but a string value did not. So I was matching on 0..2 (actually 0..2.min(string.len()), which doubly highlights the indexing issue), and it blew up occasionally depending on the string values. There are perhaps smarter ways to do this (e.g. splitn on a space, a regex, a giant if-else statement, etc.), but this seemed at first glance to be the most efficient way, because it all fit neatly into a match statement.

The inverse was also a problem: laying out text with a monospace font, knowing that every character took up the same number of pixels along the x-axis (e.g. no odd emoji or whatever else). Gotta make sure to call #len on #chars instead of the string itself, as some of the text (Windows-1250 encoded) got converted into multi-byte Unicode codepoints.
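For what it's worth, the panic-free way to do that kind of prefix match is str::get, which returns None both for out-of-range indices and for non-boundary ones; a sketch (the sample strings are mine):

fn main() {
    for s in ["0x1f", "až1", "x"] {
        // get(0..2) is None if the string is too short *or* if byte 2
        // falls inside a multi-byte character (as with 'ž' here)
        match s.get(0..2) {
            Some(prefix) => println!("{:?} starts with {:?}", s, prefix),
            None => println!("{:?} has no clean two-byte prefix", s),
        }
    }
}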
I think this is a fine "fail-closed" way of language design. For example, Python has gone the other way and language complexity has gotten pretty bad since the small-language days. Trust what you are, don't try to please everyone, lest you become something like C++.
Clojure is good in this respect.
Compared to incumbents like dotnet and PHP? Uhh, no. The tooling is very far behind and cumbersome in comparison.
A couple quotes from the Go Blog by Rob Pike:
> It’s important to state right up front that a string holds arbitrary bytes. It is not required to hold Unicode text, UTF-8 text, or any other predefined format. As far as the content of a string is concerned, it is exactly equivalent to a slice of bytes.
> Besides the axiomatic detail that Go source code is UTF-8, there’s really only one way that Go treats UTF-8 specially, and that is when using a for range loop on a string.
Both from https://go.dev/blog/strings
If you want UTF-8 in a guaranteed way, use the functions available in unicode/utf8 for that. Using `string` is not sufficient unless you make sure you only put UTF-8 into those strings.
If you put valid UTF-8 into a string, you can be sure that the string holds valid UTF-8, but if someone else puts data into a string, and you assume that it is valid UTF-8, you may have a problem because of that assumption.
It does though? Strings are internable, comparable, can be keys, etc.
At the protocol (or disk, etc.) boundary. If I write code that consumes bytes that are intended to be UTF-8, I need to make a choice somewhere about what to do if they aren't UTF-8. A strict UTF-8 string forces me to make that choice in a considered location. In a language where a "string" is just bytes, I can forget, or two pieces of code can disagree on what the contract is. And bugs result.
Check out MySQL for a fun example of getting this wildly, impressively wrong. At least a Rust (or type-checked Python 3) wrapper around some MySQL code enforces a degree of correctness, which is much better than having your transaction fail to commit, or commit incorrectly, way down the stack when you get bytes you didn't expect.
(MySQL can still reject strictly valid UTF-8 data for utterly pathetic historical reasons if you configure it incorrectly.)
Command-line arguments on Windows are their own special disaster.
But there is no canonical response to invalid data. So literally every operation that might need to decide what to do when presented with invalid data should either (a) accept a parameter saying what to do on error, and potentially fail, or (b) take a parameter type that forces errors to have been handled in advance.
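Rust's standard library happens to offer both shapes for UTF-8 decoding, which makes the contrast easy to see; a sketch (the lossy variant stands in for the "parameter asking what to do" option, with replacement as the chosen policy):

fn main() {
    let bytes = b"caf\xC3".to_vec(); // truncated UTF-8 for "café"

    // (b) the type forces the caller to handle the error up front
    match String::from_utf8(bytes.clone()) {
        Ok(s) => println!("ok: {}", s),
        Err(e) => println!("refused: {}", e.utf8_error()),
    }

    // (a) the caller opts in to an explicit policy, here replacement
    println!("lossy: {}", String::from_utf8_lossy(&bytes)); // prints "caf\u{FFFD}"
}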