zlacker

1. maxdam+(OP) 2025-08-22 22:50:47
> This is, of course, exactly what Rust does: I am not aware of a single thing that &str allows you to do that you cannot do with &[u8], except things that do require you to assume it's valid UTF-8.

Doesn't this demonstrate my point? If you can do everything with &[u8], what's the point in validating UTF-8? It's just a less universal string type, and your program wastes CPU cycles doing unnecessary validation.

replies(1): >>matt_k+j41
2. matt_k+j41 2025-08-23 10:50:59
>>maxdam+(OP)
> except things that do require you to assume it's valid UTF-8

That's the point.

replies(1): >>maxdam+zs1
3. maxdam+zs1 2025-08-23 15:02:24
>>matt_k+j41
But no one has demonstrated an actual operation that requires valid UTF-8. The reasoning is always circular: "I require valid UTF-8 because someone else requires valid UTF-8".

Eventually there should be an underlying operation which can only work on valid UTF-8, but that doesn't exist. UTF-8 was designed such that invalid data can be detected and handled, without affecting the meaning of valid subsequences in the same string.
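To make that property concrete, here is a rough Rust sketch that splits a byte slice into valid runs and invalid runs using only `std::str::from_utf8` and the `valid_up_to()` / `error_len()` methods on `Utf8Error`. The `Chunk` enum and `split_chunks` function are names made up for this example, not standard library items.

```rust
use std::str;

/// Hypothetical type for this sketch: one run of the input,
/// either valid UTF-8 or raw bytes that failed validation.
#[derive(Debug, PartialEq)]
enum Chunk<'a> {
    Valid(&'a str),
    Invalid(&'a [u8]),
}

/// Splits `input` into alternating valid and invalid runs.
fn split_chunks(mut input: &[u8]) -> Vec<Chunk<'_>> {
    let mut out = Vec::new();
    while !input.is_empty() {
        match str::from_utf8(input) {
            Ok(s) => {
                // The whole remainder is valid.
                out.push(Chunk::Valid(s));
                break;
            }
            Err(e) => {
                let (valid, rest) = input.split_at(e.valid_up_to());
                if !valid.is_empty() {
                    // valid_up_to() guarantees this prefix is valid UTF-8.
                    out.push(Chunk::Valid(str::from_utf8(valid).unwrap()));
                }
                // error_len() is None when the input ends mid-sequence;
                // in that case treat the whole tail as invalid.
                let bad = e.error_len().unwrap_or(rest.len());
                out.push(Chunk::Invalid(&rest[..bad]));
                input = &rest[bad..];
            }
        }
    }
    out
}
```

The valid text on either side of a bad byte comes out unchanged; what to do with the `Invalid` chunks is left to the caller.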

replies(1): >>amluto+ivr
4. amluto+ivr 2025-09-01 21:06:17
>>maxdam+zs1
> UTF-8 was designed such that invalid data can be detected and handled, without affecting the meaning of valid subsequences in the same string.

But there is no canonical response to invalid data. So literally every operation that might need to choose what to do when presented with invalid data should either (a) accept a parameter saying what to do on error, and potentially fail, or (b) take a parameter type that forces errors to be handled in advance.
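Roughly, in Rust terms, the two options might look like the sketch below. `OnInvalid`, `char_count_bytes` and `char_count` are hypothetical names for illustration; only `std::str::from_utf8` and `String::from_utf8_lossy` are real standard library calls.

```rust
use std::str::{self, Utf8Error};

/// Hypothetical error policy for option (a).
enum OnInvalid {
    /// Return an error on the first invalid sequence.
    Fail,
    /// Substitute U+FFFD for invalid sequences.
    Replace,
}

/// Option (a): takes raw bytes plus a policy, and may fail.
fn char_count_bytes(input: &[u8], policy: OnInvalid) -> Result<usize, Utf8Error> {
    match policy {
        OnInvalid::Fail => Ok(str::from_utf8(input)?.chars().count()),
        OnInvalid::Replace => Ok(String::from_utf8_lossy(input).chars().count()),
    }
}

/// Option (b): the `&str` parameter type means the caller has already
/// dealt with invalid data, so there is no error path here.
fn char_count(input: &str) -> usize {
    input.chars().count()
}
```

With (a), every such function carries the policy and the `Result`; with (b), the decision is made once, wherever the bytes become a `&str`.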
