Consider:
s := string([]byte{226, 150, 136, 226, 150, 136})
for i, chr := range s {
	fmt.Printf("%d = %v\n", i, chr)
	// note: s[i] != chr (s[i] is a single byte, chr is the decoded rune)
}
How many times does that loop over 6 bytes iterate? It iterates twice, with i=0 and i=3. There are also quite a few standard APIs that behave unexpectedly if a string is not valid UTF-8 (encoding/json, for instance, replaces invalid bytes with U+FFFD when marshaling), which wouldn't be the case if a string were just a bag of bytes.
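For anyone who wants to check the iteration count, here is a minimal runnable sketch of the same loop. The helper name `runeStarts` is made up for illustration; the bytes are the ones from the snippet above, which encode U+2588 twice.

```go
package main

import "fmt"

// runeStarts (hypothetical helper) returns the byte offsets at which a
// for range loop over s begins each iteration: one per decoded rune,
// not one per byte.
func runeStarts(s string) []int {
	var starts []int
	for i := range s {
		starts = append(starts, i)
	}
	return starts
}

func main() {
	s := string([]byte{226, 150, 136, 226, 150, 136})

	fmt.Println(len(s))        // 6: len counts bytes
	fmt.Println(runeStarts(s)) // [0 3]: two iterations

	for i, chr := range s {
		// s[i] is the first byte of the encoding; chr is the whole code point.
		fmt.Printf("i=%d s[i]=%d chr=%d\n", i, s[i], chr)
	}
}
```

The loop body prints `s[i]=226` and `chr=9608` on both iterations, which is the `s[i] != chr` mismatch the comment in the snippet points at.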
A couple of quotes from Rob Pike's post on the Go Blog:
> It’s important to state right up front that a string holds arbitrary bytes. It is not required to hold Unicode text, UTF-8 text, or any other predefined format. As far as the content of a string is concerned, it is exactly equivalent to a slice of bytes.
> Besides the axiomatic detail that Go source code is UTF-8, there’s really only one way that Go treats UTF-8 specially, and that is when using a for range loop on a string.
Both from https://go.dev/blog/strings
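Both quoted points can be seen in a few lines. This is only a sketch with a made-up helper, `decodeAll`: the string stores an invalid byte untouched, and only the for range loop interprets the contents as UTF-8, yielding U+FFFD for the byte it cannot decode.

```go
package main

import "fmt"

// decodeAll (hypothetical helper) collects the runes that a for range
// loop yields for s.
func decodeAll(s string) []rune {
	var out []rune
	for _, r := range s {
		out = append(out, r)
	}
	return out
}

func main() {
	// 0xff never appears in valid UTF-8, yet the string holds it fine.
	s := string([]byte{'a', 0xff, 'b'})

	fmt.Println(len(s)) // 3: all bytes are kept as-is
	fmt.Println(s[1])   // 255: indexing returns the raw byte

	// Ranging decodes UTF-8; the invalid byte comes out as U+FFFD.
	fmt.Printf("%U\n", decodeAll(s)) // [U+0061 U+FFFD U+0062]
}
```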
If you need a guarantee that a string is UTF-8, check it with the functions in unicode/utf8 (for example, utf8.ValidString). Using `string` alone is not sufficient unless you make sure you only ever put UTF-8 into those strings.
If you put valid UTF-8 into a string yourself, you know it holds valid UTF-8. But if someone else puts data into a string and you merely assume it is valid UTF-8, that assumption can cause problems.
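One way to avoid relying on that assumption is to validate at the point where outside data enters your program. The sketch below is just one possible shape: `acceptText` and `errNotUTF8` are hypothetical names, and only `utf8.Valid` comes from the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// errNotUTF8 is a hypothetical sentinel error for this sketch.
var errNotUTF8 = errors.New("input is not valid UTF-8")

// acceptText is a hypothetical boundary check: it converts untrusted
// bytes to a string only if they are valid UTF-8.
func acceptText(b []byte) (string, error) {
	if !utf8.Valid(b) {
		return "", errNotUTF8
	}
	return string(b), nil
}

func main() {
	_, err := acceptText([]byte{226, 150, 136}) // complete 3-byte sequence
	fmt.Println(err)                            // <nil>

	_, err = acceptText([]byte{226, 150}) // truncated sequence
	fmt.Println(err)                      // input is not valid UTF-8
}
```

If rejecting the input is too strict for your case, strings.ToValidUTF8 can replace the invalid byte sequences with a string of your choice instead.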