zlacker

> I don't think of base 10 being meaningful in binary computers.

They communicate via the network, right? And telephony has always counted in base-10 multiples of bits, as opposed to base-2 multiples of eight-bit bytes, IIUC. So these two schemes have always been in tension.

So at some point the Ki, Mi, etc. prefixes were introduced along with the b vs B suffixes, and that solved the issue 3+ decades ago, so why is this on the HN front page?!
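To make the b vs B and M vs Mi distinction concrete, here is a minimal sketch of the arithmetic. The function name is made up for illustration, and it assumes a byte is an 8-bit octet:

```c
/* Hypothetical helper (name is illustrative, not from any library):
 * convert a link rate in decimal megabits per second (Mb/s) into
 * binary mebibytes per second (MiB/s).
 * Assumption: one byte (B) is an 8-bit octet. */
double mbit_per_s_to_mib_per_s(double mbit_per_s)
{
    const double bits_per_s  = mbit_per_s * 1000.0 * 1000.0; /* M  = 10^6 */
    const double bytes_per_s = bits_per_s / 8.0;             /* B  = 8 b  */
    return bytes_per_s / (1024.0 * 1024.0);                  /* Mi = 2^20 */
}
```

For example, mbit_per_s_to_mib_per_s(1000.0) comes out at roughly 119.2, which is why a "gigabit" link never shows 125 MiB/s in a file copy dialog.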

A better question might be, why do we privilege the 8 bit byte? Shouldn't KiB officially have a subscript 8 on the end?

replies(1): >>purple+ul

>>fc417f+(OP)
To be fair, the octet as the byte has been dominant for decades. POSIX even has the definition “A byte is composed of a contiguous sequence of 8 bits.” I would wager many software engineers don't even know that non-octet bytes were a thing, given that college CS curricula typically just teach that a byte is 8 bits.

I found some search results about Texas Instruments' digital signal processors using 16-bit bytes, and came across this blog post from 2017 talking about implementing 16-bit bytes in LLVM: https://embecosm.com/2017/04/18/non-8-bit-char-support-in-cl.... Not sure if they actually implemented it, but it was surprising to me that non-octet bytes still exist, albeit in a very limited manner.

Do you know of any other uses for bytes that are not 8 bits?

replies(3): >>zineke+8r >>ahazre+rN >>fc417f+wY

>>purple+ul
> Do you know of any other uses for bytes that are not 8 bits?

For "bytes" as the term-of-art itself? Probably not. For "codes" or "words"? 5 bits are the standard in Baudot transmission (in teletype though). 6- and 7-bit words were the standards of the day for very old computers (ASCII is in itself a 7-bit code), especially on DEC-produced ones (https://rabbit.eng.miami.edu/info/decchars.html).

>>purple+ul
Back in the days of octal notation, there were computers with a 12-bit word size that used sixbit characters (early DEC PDP-8, PDP-5, early CDC machines). 'Byte' was sometimes used for 6- and 9-bit halfword values.

>>purple+ul
I wanted to reply with a bunch of DSP examples but on further investigation the ones I checked just now seem to very deliberately use the term "data word". That said, the C char type in these cases is one "data word" as opposed to 8 bits; I feel like that ought to count as a non-8-bit byte regardless of the terminology in the docs.

NXP makes a number of audio DSPs with a native 24 bit width.

Microchip still ships chips in the PIC family with instructions of various widths, including 12 and 14 bits; however, I believe the data memory on those chips is either 8 or 16 bits wide. I have no idea how to classify a machine where the instruction and data memory widths don't match.

Unlike POSIX, C merely requires that char be at least 8 bits wide, although I assume lots of real-world code would break if challenged on that particular detail.
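A rough sketch of what not hard-coding 8 looks like in C, using CHAR_BIT from <limits.h>. The two helper names are made up for illustration; only CHAR_BIT and static_assert come from the standard:

```c
#include <assert.h>  /* static_assert (C11) */
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t */

/* ISO C only guarantees CHAR_BIT >= 8; POSIX pins it to exactly 8. */
static_assert(CHAR_BIT >= 8, "ISO C requires char to be at least 8 bits");

/* Hypothetical helper: width in bits of an object that spans n_chars chars. */
size_t bits_in_chars(size_t n_chars)
{
    return n_chars * (size_t)CHAR_BIT;
}

/* Hypothetical helper: number of chars needed to hold n_octets 8-bit
 * octets, rounding up. On a DSP where char is 16 bits, two octets fit
 * in one char, so a 4-octet value needs only 2 chars. */
size_t chars_for_octets(size_t n_octets)
{
    return (n_octets * 8 + (size_t)CHAR_BIT - 1) / (size_t)CHAR_BIT;
}
```

Code that assumes the octet could instead write static_assert(CHAR_BIT == 8, "...") so that it fails at compile time on such a target rather than breaking silently.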