zlacker

Asynchronous programming

With the addition of async to Django core, I felt it's time to finally learn the concept. I first took interest in async early last year when I re-read a Medium post on Japronto, an async Python web framework that claims to be faster than Go and Node.

Since then, I've been on the lookout for introductory posts about async, but all I see are snippets from the docs with little or no modification and a lame attempt at explaining them (or maybe I'm too dumb).

I picked up multi-threaded programming a few weeks ago, and I understand (correct me if I'm wrong) that it has similarities with asynchronous programming, but I just don't see where async fits in the puzzle.

replies(4): >>pixelm+T8 >>Fridge+fd >>starea+1e >>carapa+rU

>>bunya0+(OP)
The first couple of paragraphs of the documentation for asyncore, the module in Python's standard library that implemented the machinery for async IO all the way back in 2000, have a great description of what async programming is all about. Here it is:

https://python.readthedocs.io/en/latest/library/asyncore.htm...

'There are only two ways to have a program on a single processor do “more than one thing at a time.” Multi-threaded programming is the simplest and most popular way to do it, but there is another very different technique, that lets you have nearly all the advantages of multi-threading, without actually using multiple threads. It’s really only practical if your program is largely I/O bound. If your program is processor bound, then pre-emptive scheduled threads are probably what you really need. Network servers are rarely processor bound, however.'

'If your operating system supports the select() system call in its I/O library (and nearly all do), then you can use it to juggle multiple communication channels at once; doing other work while your I/O is taking place in the “background.” ...'
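
To make the select() idea concrete, here is a minimal sketch in Python. It is not taken from the asyncore docs: the function name juggle_channels and the use of local socket pairs as stand-ins for network peers are assumptions made purely for illustration. A single thread waits on several channels at once and only services the ones that are ready.

```python
import select
import socket


def juggle_channels(messages):
    """Send each message down its own socket pair, then collect them all with select()."""
    pairs = [socket.socketpair() for _ in messages]

    # Pretend these are remote peers: each one sends its message and hangs up.
    for (writer, _reader), msg in zip(pairs, messages):
        writer.sendall(msg)
        writer.close()

    pending = {reader: b"" for _writer, reader in pairs}
    received = []
    while pending:
        # Block until at least one channel has something to read.
        readable, _, _ = select.select(list(pending), [], [])
        for sock in readable:
            chunk = sock.recv(4096)
            if chunk:
                pending[sock] += chunk
            else:
                # An empty read means the peer closed the channel.
                received.append(pending.pop(sock))
                sock.close()
    return received


if __name__ == "__main__":
    print(sorted(juggle_channels([b"alpha", b"beta", b"gamma"])))
```

The order in which channels become readable is not guaranteed, which is why the demo sorts the results before printing.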

>>bunya0+(OP)
Someone probably much better than me can correct me where I go wrong here, but I'll have a stab at this.

The way I think about it, asynchronous programming gives you the tools to write programs that don't stop doing useful work while they're waiting for something to happen. If parallelism gives you more effective use of your CPU, asynchronous programming gives you more effective use of your time. Let's presume you have a program that does some things, makes several requests to the network or requests several things from the file system, collects the results and carries on.

In a synchronous program, you would make each request, wait for it to come back (the program would block at this point), then when it's complete, proceed with the next request. If each request takes ~2 seconds to complete, and you've got 15 to make, you've spent most of that 30 seconds just idling, not actually doing anything.

In an asynchronous program, you could submit those requests all at once, and then process them as they come back, which means you only spend about 2 seconds waiting before you start doing useful work processing the results. Even if your program is single threaded and you can only actually process one item at a time, you've made more efficient use of your time.
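
A rough sketch of that difference in Python, using asyncio. The requests are faked with sleeps, and DELAY, the function names and the squared "result" are all made up for the example; the delay is shortened from ~2 seconds so it runs quickly.

```python
import asyncio
import time

DELAY = 0.02  # hypothetical stand-in for the ~2 second request


def blocking_request(i):
    time.sleep(DELAY)  # the whole program idles here
    return i * i


def run_blocking(n):
    # Each request must finish before the next one starts.
    return [blocking_request(i) for i in range(n)]


async def async_request(i):
    await asyncio.sleep(DELAY)  # hands control back to the event loop while waiting
    return i * i


async def gather_requests(n):
    # Submit every request up front, then collect the results in order.
    return await asyncio.gather(*(async_request(i) for i in range(n)))


def run_concurrent(n):
    return asyncio.run(gather_requests(n))


def timed(fn, *args):
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (run_blocking, run_concurrent):
        result, elapsed = timed(fn, 15)
        print(f"{fn.__name__}: {elapsed:.2f}s")
```

The blocking version should take roughly 15 times the delay, the asyncio version roughly one delay plus a little event-loop overhead, all on a single thread.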

Some murkiness comes in at the intersection of the two and how it's implemented in various languages. For example, you could also dispatch each of those requests out to a thread, and if you returned all the results to the main thread before processing them you'd have the same result and nearly the same performance as the async example (give or take thread dispatch overhead and so on). The real power comes when you use both to their advantage: you can't necessarily dispatch threads forever, because the overhead will impact you and you can saturate your CPU. On the flip side, making something asynchronous that actually requires CPU work won't net any benefits, because the work still has to be done at some point. Asynchronous programming gives you a way to move things around to maximise your efficiency; it doesn't actually make you go faster.
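
For comparison, a sketch of the thread-based variant, assuming the same kind of fake request (the names and DELAY are again invented for illustration). Each blocking call gets its own worker thread from the standard library's ThreadPoolExecutor, and the results come back to the main thread as a list.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.05  # hypothetical stand-in for a ~2 second network request


def blocking_request(i):
    time.sleep(DELAY)  # this thread blocks, but the other threads keep going
    return i * i


def fetch_with_threads(n):
    # One worker per request so that every request is in flight at once.
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(blocking_request, range(n)))


if __name__ == "__main__":
    start = time.perf_counter()
    results = fetch_with_threads(15)
    print(results, f"{time.perf_counter() - start:.2f}s")
```

This works nicely for 15 requests; one worker per request stops being a good idea when the count runs into the thousands, which is the overhead trade-off mentioned above.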

JS and Python run async code on a single-threaded event loop. Rust organises chains/graphs of async code into state machines at compile time and then lets the user decide exactly how they should be run (I'm fairly sure this is correct, but if I'm wrong someone let me know). Dotnet, to the best of my knowledge, lets you write "tasks", which are usually threads behind the scenes (someone please correct me here). I don't know what Java uses, but I imagine there are a few options to choose from. Haskell performs magic as far as I can tell. I don't know how its model works, but I did once come across a library that appeared to let you write code and it would automatically figure out when something could be async, rearrange calls to make use of batching, automatically cache and reuse similar requests and just generally perform all kinds of Haskell wizardry.

>>bunya0+(OP)
The best way to learn, in my opinion, is to start with small, simple projects and build up from there. It can be a strange concept to get used to if you have a long background in synchronous programming (like I had too). I finally wrapped my head around it when I pictured the program flow as gaining another direction, perpendicular to the normal vertical flow (I prefer to think in terms of geometry ...).

Japronto does not seem to be under active development any more, but async programming is definitely the way to go in order to squeeze the most performance out of the hardware at one's disposal.

I put down some thoughts on this that track my own journey to understanding the concept (sorry if this is too basic for you, take it or leave it, and please note that I'm not an expert by a long shot).

It's not guaranteed that you have the same way of picturing things, but here goes: programs normally run in one direction, executing one line at a time from top to bottom (vertically). But one or more of those 'vertical' commands may send the computer off in a horizontal direction too (async calls), each with its own 'horizontal' chain of commands.

The problem that I (and I think many with me) had a hard time grokking at first is that the 'vertical' flow continues immediately after having issued a 'horizontal' (async) call. The computer doesn't wait for the async call to come back. To do something after the async call has finished, you have to tack a new call onto the result of the async call in the 'horizontal' chain of events, which previously often led to what was called 'callback hell' in Node.js programming.
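
Here is a small sketch of that behaviour, written with Python's asyncio rather than Node.js. The fetch coroutine and the log list are hypothetical, just there to record the order in which things happen: the 'vertical' flow logs its line first, and the follow-up step chained onto the async call only runs once the call has finished.

```python
import asyncio


async def fetch(name):
    await asyncio.sleep(0.01)  # pretend this is slow I/O
    return f"data for {name}"


async def main():
    log = []

    # Issue the 'horizontal' call; it starts running in the background.
    task = asyncio.create_task(fetch("user"))

    # Tack the follow-up step onto the result, callback style.
    task.add_done_callback(lambda t: log.append(f"callback got: {t.result()}"))

    # The 'vertical' flow carries on without waiting for fetch().
    log.append("vertical flow carries on")

    await task              # keep the demo alive until the call completes
    await asyncio.sleep(0)  # let any pending callbacks run
    return log


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

Nest a few of these callbacks inside each other and you get the pyramid shape that earned the 'callback hell' nickname.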

Not sure about PHP, but in the JavaScript world one may get round the problem of callback hell by using promises and async/await, which mimic synchronous programming, i.e. the 'vertical' flow of the awaiting function is paused until the async call returns a result. Personally I find that this adds another level of abstraction that may sometimes make things even more difficult to understand and debug. I prefer wrapping async calls in some queue construct instead (which takes care of chaining consecutive async calls); works for me.
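
The same async/await style exists in Python, so here is a sketch of it there. fetch_user, fetch_posts and ticker are invented for the example. The chained steps read top to bottom like synchronous code, but as far as I understand it only the awaiting coroutine is paused; the event loop stays free to run other tasks in the meantime, which the ticker is there to demonstrate.

```python
import asyncio


async def fetch_user(user_id):
    await asyncio.sleep(0.01)  # pretend I/O
    return {"id": user_id, "name": "ada"}


async def fetch_posts(user):
    await asyncio.sleep(0.01)  # pretend I/O
    return [f"{user['name']} post {n}" for n in range(2)]


async def show_posts(user_id):
    user = await fetch_user(user_id)  # this coroutine pauses here...
    posts = await fetch_posts(user)   # ...and again here, no callbacks needed
    return posts


async def main():
    ticks = []

    async def ticker():
        # Runs while show_posts() is waiting on its fake I/O.
        for n in range(3):
            ticks.append(n)
            await asyncio.sleep(0)

    posts, _ = await asyncio.gather(show_posts(1), ticker())
    return posts, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```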

In short, synchronous commands are automatically 'chained' top to bottom in code, while asynchronous commands have to be chained manually after the completion of each async block of code. I believe multi-threaded programming is just a more advanced case of async calls, one that often needs to be 'orchestrated', i.e. coordinated in a way that simple async calls usually don't. But all types of async programming come with some special issues, of which race conditions are maybe the most common, i.e. when several async processes try to change the value of a shared asset in an ad-hoc manner.
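
A minimal sketch of such a race condition, again in Python's asyncio and with made-up names (the shared "asset" is just a counter in a dict). Each task reads the counter, yields control, then writes back a stale value, so increments get lost. Guarding the read-modify-write with an asyncio.Lock is one way to coordinate the tasks.

```python
import asyncio


async def unsafe_increment(state):
    current = state["count"]      # read the shared value
    await asyncio.sleep(0)        # yield to other tasks mid-update
    state["count"] = current + 1  # write back a possibly stale value


async def safe_increment(state, lock):
    async with lock:  # only one task at a time may read-modify-write
        current = state["count"]
        await asyncio.sleep(0)
        state["count"] = current + 1


async def run(n, use_lock):
    state = {"count": 0}
    lock = asyncio.Lock()
    if use_lock:
        jobs = [safe_increment(state, lock) for _ in range(n)]
    else:
        jobs = [unsafe_increment(state) for _ in range(n)]
    await asyncio.gather(*jobs)
    return state["count"]


if __name__ == "__main__":
    print("without lock:", asyncio.run(run(10, use_lock=False)))
    print("with lock:   ", asyncio.run(run(10, use_lock=True)))
```

Without the lock the final count falls short of 10 because the tasks overwrite each other's updates; with the lock every increment survives.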