zlacker

That split between bytes and unicode made better code. Bytes are what you get from the network. Is it a PNG? A paragraph of text? Who knows! But in Python 2, you treated them both as the same thing: a series of bytes.

Being more or less forced to decode that series into a string of text where appropriate made a huge number of bugs vanish. Oops, forget to run `value=incoming_data.decode()` before passing incoming data to a function that expects a string, not a series of bytes? Boom! Thing is, it was always broken, but now it's visibly broken. And there was no more having to remember if you'd already .decode()d a value or whether you still needed to, because the end result isn't the same datatype anymore. It was so annoying to have an internal function in a webserver, and the old sloppiness meant that sometimes you were calling it with decoded strings and sometimes the raw bytes coming in over the wire, so sometimes it processed non-ASCII characters incorrectly, and if you tried to fix it by making it decode passed-in values, it start started breaking previously-working callers. Ugh, what a mess!

I hated the schism for about the first month because it broke a lot of my old, crappy code. Well, it didn't actually. It just forced me to be aware of my old, crappy code, and do the hard, non-automatable work of actually fixing it. The end result was far better than what I'd started with.

replies(1): >>JoshTr+G8

>>kstrau+(OP)
That distinction is indeed critical, and I'm not suggesting removing that distinction. My point is that you could give all those types names, and manage the transition by having Python 3 change the defaults (e.g. that a string is unicode).

replies(1): >>kstrau+s9

>>JoshTr+G8
I’m a little confused. That’s basically with Python 3 did, right? In py2, “foo” is a string of bytes, and u”foo” is Unicode. In py3, both are Unicode, and bytes() is a string of bytes.

replies(1): >>JoshTr+kA

>>kstrau+s9
The difference is that the two don't interoperate. You can't import a Python 3 module from Python 2 or vice versa; you have to use completely separate interpreters to run them.

I'm suggesting a model in which one interpreter runs both Python 2 and Python 3, and the underlying types are the same, so you can pass them between the two. You'd have to know that "foo" created in Python 2 is the equivalent of b"foo" created in Python 3, but that's easy enough to deal with.