All they had to say was that KiB et al. were introduced in 1998, and that adoption has been slow.
And not “but a kilobyte can be 1000,” as if it’s an effort issue.
In my mind base 10 only became relevant when disk drive manufacturers came up with disks of "weird" sizes (maybe they needed to reserve some space for internals, or the platters just didn't like powers of two) and realised that a base 10 system gave them better-looking marketing numbers. Who wants a 2.9TB drive when you can get a 3TB* drive for the same price?
Three binary terabytes, i.e. 3 × 2^40, is 3,298,534,883,328 bytes, or 298,534,883,328 more bytes than 3 decimal terabytes. That difference is about 298.5 decimal gigabytes, or 278 binary gigabytes.
Indeed, early hard drives had slightly more than even the binary size: the famous 10MB IBM disk, for example, held 10,653,696 bytes, which is 167,936 bytes more than 10 binary MB (10,485,760 bytes), i.e. more than an entire 160KB floppy's worth of data.
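For anyone who wants to sanity-check the numbers in this subthread, a quick C one-off (nothing assumed beyond the figures quoted above):

    #include <stdio.h>

    int main(void) {
        /* 3 binary TB vs 3 decimal TB, from the parent comment */
        long long tib = 3LL * 1024 * 1024 * 1024 * 1024;   /* 3 * 2^40 */
        long long tb  = 3LL * 1000 * 1000 * 1000 * 1000;
        printf("3*2^40 = %lld, surplus over 3 TB = %lld\n", tib, tib - tb);

        /* the 10MB IBM disk vs 10 binary MB vs a 160KB floppy */
        long long disk   = 10653696LL;
        long long mib10  = 10LL * 1024 * 1024;
        long long floppy = 160LL * 1024;
        printf("disk surplus = %lld bytes, floppy = %lld bytes\n",
               disk - mib10, floppy);
        return 0;
    }

This prints a surplus of 167,936 bytes against a floppy capacity of 163,840, confirming the "more than a whole floppy" claim.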
Okay, but what do you mean by “10”?
They communicate via the network, right? And telephony has always counted in decimal bits, as opposed to powers-of-two counts of eight-bit bytes, IIUC. So these two schemes have always been in tension.
So at some point the Ki, Mi, etc. prefixes were introduced, along with the b vs. B suffixes, and that solved the issue over two decades ago. So why is this on the HN front page?!
A better question might be: why do we privilege the 8-bit byte? Shouldn't KiB officially have a subscript 8 on the end?
That is to say, all the (high-end/“gamer”) consumer SSDs that I’ve checked use 10% overprovisioning and achieve that by exposing a given number of binary TB of physical flash (e.g. a “2TB” SSD will have 2×1024⁴ bytes’ worth of flash chips) as the same number of decimal TB of logical addresses (e.g. that same SSD will appear to the OS as 2×1000⁴ bytes of storage space). And this makes sense: you want a round number on your sticker to make the marketing people happy, you aren’t going to make non-binary-sized chips, and 10% overprovisioning is OK-ish (in reality, probably too low, but consumers don’t shop based on the endurance metrics even if they should).
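For anyone wondering where the ~10% figure comes from, it falls straight out of 1024^4 / 1000^4 ≈ 1.0995. A quick C sketch using the hypothetical "2TB" example above:

    #include <stdio.h>

    int main(void) {
        /* "2TB" SSD: 2 * 1024^4 bytes of physical flash exposed as
           2 * 1000^4 bytes of logical address space */
        long long physical = 2LL * 1024 * 1024 * 1024 * 1024;
        long long logical  = 2LL * 1000 * 1000 * 1000 * 1000;
        printf("spare = %lld bytes (%.2f%% of logical capacity)\n",
               physical - logical,
               100.0 * (double)(physical - logical) / (double)logical);
        return 0;
    }

That works out to roughly 199 GB of spare area, i.e. 9.95% of the logical capacity.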
TLC flash actually has a total number of bits that's a multiple of 3, but it and QLC are so unreliable that a significant number of extra bits are used for error correction and such.
SSDs haven't been real binary sizes since the early days of SLC flash, which didn't need more than basic ECC. (I have an old 16MB USB drive which actually has a user-accessible capacity of 16,777,216 bytes; the NAND flash itself stores 17,301,504 bytes.)
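Side note: the ratio of those two numbers is exactly 33/32, which would be consistent with the classic small-block NAND layout of 512 data bytes plus 16 spare bytes (for ECC and metadata) per page. A quick check in C:

    #include <stdio.h>

    int main(void) {
        long long user = 16777216LL;    /* user-accessible capacity */
        long long raw  = 17301504LL;    /* raw NAND capacity */
        /* raw/user = 33/32, i.e. 16 extra bytes per 512-byte sector,
           matching the classic 528-byte small-block NAND page */
        long long sectors = user / 512;
        printf("%lld sectors * 528 bytes = %lld (raw = %lld)\n",
               sectors, sectors * 528, raw);
        return 0;
    }

32,768 sectors at 528 bytes each gives exactly 17,301,504.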
I found some search results about Texas Instruments' digital signal processors using 16-bit bytes, and came across this blog post from 2017 about implementing 16-bit bytes in LLVM: https://embecosm.com/2017/04/18/non-8-bit-char-support-in-cl.... Not sure if they actually implemented it, but it was surprising to me that non-octet bytes still exist, albeit in a very limited manner.
Do you know of any other uses for bytes that are not 8 bits?
First, you implicitly assumed a decimal number base in your comment.
Second: of course it's meaningful. It's also relevant, since humans use binary computers and numeric input and output in text is almost always in decimal.
For "bytes" as the term-of-art itself? Probably not. For "codes" or "words"? 5 bits are the standard in Baudot transmission (in teletype though). 6- and 7-bit words were the standards of the day for very old computers (ASCII is in itself a 7-bit code), especially on DEC-produced ones (https://rabbit.eng.miami.edu/info/decchars.html).
It's been well over a decade now, and neither I nor anyone I know has ever had an SSD endurance issue. So it seems like the type of problem where you should just go enterprise if you have it.
NXP makes a number of audio DSPs with a native 24-bit width.
Microchip still ships chips in the PIC family with instructions of various widths, including 12 and 14 bits; however, I believe the data memory on those chips is either 8 or 16 bits. I have no idea how to classify a machine where the instruction and data memory widths don't match.
Unlike POSIX, C merely requires that char be at least 8 bits wide, although I assume lots of real-world code would break if challenged on that particular detail.
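For anyone who does want to lawyer it in code, limits.h exposes the value directly; a minimal sketch (the compile-time guard is the whole point):

    #include <limits.h>
    #include <stdio.h>

    /* C guarantees only CHAR_BIT >= 8; POSIX additionally requires
       CHAR_BIT == 8. Fail the build on exotic targets rather than
       miscomputing sizes later. */
    #if CHAR_BIT != 8
    #error "this code assumes 8-bit bytes"
    #endif

    int main(void) {
        printf("CHAR_BIT = %d\n", CHAR_BIT);
        return 0;
    }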
A little late to lawyer that...