zlacker

1. andyfe+(OP)[view] [source] 2025-08-22 14:37:29
I believe the same is true on Linux, which only cares about 0x2f bytes (i.e. /)
replies(3): >>matt_k+g9 >>orthox+ea >>johnco+2E1
2. matt_k+g9[view] [source] 2025-08-22 15:28:04
>>andyfe+(OP)
And 0x00.
3. orthox+ea[view] [source] 2025-08-22 15:32:47
>>andyfe+(OP)
And 0x00, if I remember correctly.
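
To illustrate the rule described in this subthread, here is a minimal sketch in Python. The helper name `is_valid_linux_component` is hypothetical, and the check covers only the two forbidden bytes mentioned above; it ignores filesystem-specific limits such as maximum name length.

```python
def is_valid_linux_component(name: bytes) -> bool:
    """Hypothetical helper: can `name` be a single path component on Linux?

    Assumption: only the two byte values from the thread are checked,
    0x2f ('/') as the separator and 0x00 as the C string terminator.
    """
    return len(name) > 0 and b"/" not in name and b"\x00" not in name


# Bytes that are not valid UTF-8 are still acceptable as a name.
print(is_valid_linux_component(b"\xff\xfe"))
print(is_valid_linux_component(b"a/b"))
```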
4. johnco+2E1[view] [source] 2025-08-23 00:04:39
>>andyfe+(OP)
Windows paths are not necessarily well-formed UTF-16 (UCS-2 by some people’s definition) down to the filesystem level. If they were always well-formed, you could convert them to an 8-bit representation by straightforward Unicode re-encoding. But since they aren’t, choices need to be made about what to do with malformed UTF-16 (i.e. unpaired surrogates) if you want to round-trip paths to byte strings in a way that matches the UTF-8 encoding whenever they are well-formed.

In Linux, they’re 8-bit almost-arbitrary strings like you noted, and usually UTF-8. So they always have a convenient 8-bit encoding (i.e. leave them alone). If you hated yourself and wanted to convert them to UTF-16, however, you’d have the same problem Windows does but in reverse.
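
As a rough sketch of the two directions described above, the following Python uses the standard `surrogateescape` and `surrogatepass` error handlers. The function names are hypothetical, and this is only one possible set of choices for handling malformed input, not necessarily what any particular OS or library does.

```python
def linux_bytes_to_str(raw: bytes) -> str:
    # Each byte that is not valid UTF-8 becomes a lone surrogate
    # U+DC80..U+DCFF, so nothing is lost.
    return raw.decode("utf-8", errors="surrogateescape")


def str_to_linux_bytes(text: str) -> bytes:
    # Inverse of linux_bytes_to_str: escaped surrogates become raw bytes again.
    return text.encode("utf-8", errors="surrogateescape")


def windows_units_to_bytes(units: list[int]) -> bytes:
    # Assumption: `units` holds the 16-bit code units of a Windows path.
    raw16 = b"".join(u.to_bytes(2, "little") for u in units)
    # Valid surrogate pairs are combined; unpaired surrogates are kept.
    text = raw16.decode("utf-16-le", errors="surrogatepass")
    # Well-formed input yields ordinary UTF-8; an unpaired surrogate is
    # encoded as a 3-byte sequence that strict UTF-8 decoders reject.
    return text.encode("utf-8", errors="surrogatepass")


# A Latin-1 name that is not valid UTF-8 survives the round trip unchanged.
name = b"caf\xe9"
assert str_to_linux_bytes(linux_bytes_to_str(name)) == name
```

The design choice in both directions is the same: map the malformed units to something reversible instead of replacing them with U+FFFD, which would make distinct names collide.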
