Engineers love it for prototyping, though, so maybe I just haven't worked with Matlab enough.
Another example is juggling notation, which allowed not only the sharing of patterns but the discovery of new ones [2].
Its modern descendants are https://en.wikipedia.org/wiki/J_(programming_language) and https://en.wikipedia.org/wiki/K_(programming_language).
There's also bqn[0], among others.
But that overhead is a constant factor; more or less anything you can express well in Matlab can be expressed straightforwardly in APL too, if you have the right numerical routines. That's not true in the other direction, though: there's a lot of stuff in APL you cannot express adequately in Matlab at all. For example, J (and these days Dyalog as well, IIRC) has an operation called under which basically does this: u(f,g) = x => f^-1(g(f(x))). So you can write geometric_mean = u(mean, log).
It is completely impossible to implement something like "under" in Matlab. Admittedly, the J implementation, at least, of deriving a generalized inverse for an arbitrary function f is a somewhat ill-defined hack, but this is still something that is both conceptually and practically quite powerful. Also, whilst Matlab is really clunky for anything that is not a 2D array and hardcodes matrix multiplication as the one inner product, APL has more powerful abstractions for manipulating arrays of arbitrary rank and a more general concept of inner products.
Also, APL has some really dumb but cherished-by-the-community ideas that make the language less expressive and much more awkward to learn, e.g. taking the terrible defect of conventional mathematical notation, where '-' is overloaded for both negation and subtraction, and replicating it for every other function.
Have you seen the version used by dzaima/apl[1]? The equivalent of '(-&.:{:) i.5' works and results in 0 1 2 3 _4.
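To unpack that J expression: {: takes the last item, and &.: applies negation "under" that selection, writing the result back into place. A rough Python sketch of the same structural idea (the function name here is invented):

```python
def under_last(f, xs):
    # apply f to the last element 'under' the selection, leaving the rest intact
    out = list(xs)
    out[-1] = f(out[-1])
    return out

print(under_last(lambda v: -v, [0, 1, 2, 3, 4]))  # [0, 1, 2, 3, -4]
```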
> APL has some really dumb but cherished-by-the-community ideas that make the language less expressive and much more awkward to learn, e.g. the idea of replicating the terrible defect of normal mathematical notation where - is overloaded for negation and subtraction to every other function
Klong[2] is a partial attempt to resolve this. I won't repeat the arguments in favour of ambivalent functions, as I guess you've heard them a dozen times before.
> u(f,g) = x => f^-1(g(f(x)))
Other way round; it's g^-1(f(g(x)))
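A tiny Python sketch of that corrected composition, with the inverse passed in explicitly rather than derived (all names here are illustrative):

```python
import math

def under(f, g, g_inv):
    # f 'under' g: g_inv(f(g(x))) -- transform, aggregate, transform back
    return lambda xs: g_inv(f([g(v) for v in xs]))

mean = lambda xs: sum(xs) / len(xs)
geometric_mean = under(mean, math.log, math.exp)
print(geometric_mean([1, 2, 4]))  # ≈ 2.0, i.e. exp(mean(log(xs)))
```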
Tricky.
[0] https://www.numberphile.com/videos/juggling-by-numbers
[1] As far as we know. Without the notation we don't actually know what had been done, but when I took the new patterns to juggling conventions, no one knew them[2].
[2] Actually it's stronger than that. I showed people some of the new patterns at the British Juggling Convention in 1985 and no one knew them. Then at the European Juggling Convention just 4 months later, people from the USA were proclaiming them as the latest patterns that they had just learned, and were perplexed at how I not only knew them, but knew many, many more.
[3] And actually Paul Klimek had beaten us to it, but hadn't been able to get others interested in the notation. As far as we can tell, Paul was the first to get the notation.
E( E(X|Y) ) = E(X)
This is known as "the law of total expectation", and as a programmer I find this notation so weakly typed that it makes no sense. The more correct notation is
E_Y(E_X(X|Y))
If you can see that the outer E is summing over Y and the inner one over X, then the theorem is immediately clearer and very intuitive.
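The identity is easy to check numerically. Here's a quick sanity check with a made-up discrete joint distribution (Python; the values are illustrative):

```python
# a small discrete joint distribution p(x, y); probabilities sum to 1
joint = {(1, 0): 0.25, (2, 0): 0.25, (2, 1): 0.125, (4, 1): 0.375}

def marginal_y(y):
    return sum(p for (x, yy), p in joint.items() if yy == y)

def cond_exp_x(y):
    # E[X | Y = y]
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / marginal_y(y)

lhs = sum(marginal_y(y) * cond_exp_x(y) for y in {0, 1})   # E_Y[ E[X|Y] ]
rhs = sum(x * p for (x, y), p in joint.items())            # E[X]
print(lhs, rhs)  # both 2.5
```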
Makes perfect sense. Matlab is for engineers, not for mathematicians. Mathematicians use computer algebra systems, proof assistants, etc. The difference is that engineers (and physicists) want answers and don't care about how they are obtained, while it's the reverse for mathematicians.
I think APL, although it, too, is a language for computing numbers, spiritually is a bit closer to mathematics than Matlab.
In the outer expectation, there's only one choice: X no longer exists as a potential random variable (as if it were a local variable in the inner expectation), so the outer expectation must be over Y.
I’m not saying that you’re wrong that those subscripts could be used (they often are) but the meaning of the expression is clear after a little while working with expectations.
People always say that such a thing will never fly in teams due to syntactic issues, but APL really is a productivity secret weapon for loners and small teams!
I’m a little curious about this. Does J have a notion of the relationship between certain functions and their inverse? What is it that enables “under” in J which makes it impossible in Matlab?
> Definition: The probability mass function (pmf) of a discrete random variable is the function p(a)=P(X=a).
This only makes sense if you already know what the definition is.
Following the same principle, R's dplyr lets you interact seamlessly with the DB through a translation layer. However, every time I open R I find myself writing tens if not hundreds of lines of code to shape the data where I'd do it in a few lines of J. For a single researcher, it's actually much easier to read your one page of J code 6 months later than your 500-line R script (again, IMO).
Although I imagine it could be possible to really make a specialized APL geared towards data analysis as a strict DSL (APL is not a DSL). Meaning, for example, making it more static and therefore statically compilable, at the expense of losing things such as first-class environments (namespaces) or the "execute" primitive. One could also specialize the notation further towards statistics. There really is a whole realm of possibilities here!
In a word, there is a market for lone scientists. It would be nice to have tools for that market ;-)
Yes. Many built-in words have inverses assigned, and you can assign inverse functions to your own words with :. https://code.jsoftware.com/wiki/Vocabulary/codot
EDIT: and here's a table with predefined inverses: https://code.jsoftware.com/wiki/Vocabulary/Inverses
Commercial k variants come with a columnar data store (see Kx's q/kdb+, Shakti's k9).
Even the abstractions we build into programming languages invite a certain way of thinking, which is why there are so many different paradigms: functional, imperative, logic, declarative, etc.
I was thinking of more radical changes that would certainly disappoint APL purists by exchanging some well-known and highly-regarded APL capabilities for a more specialized language. Maybe I'll put my thoughts in code someday, if life doesn't get in the way.
The strength of APLs lies in having a fixed set of versatile primitives. From there, the terseness of the language allows one to write small but expressive and useful programs. This strongly limits the need for external libs, and this is where the real value lies. Whereas in Python you'd reach for multiple libs made in another language and often by other people, in APL you learn the primitives and you're off to the races! Therefore, the fixed fraction of the programs you write (primitives vs functions) is much smaller (no libs with poor documentation), and you don't risk the rug being pulled out from under you by changing lib APIs.
BTW J is stellar for data wrangling, and I encourage everyone to endure the multiple weeks of effort required to learn the basics of the language. Spend the time, it will be really rewarding!
But interesting nonetheless.
f=:(1+*&2)
f 1 2 3 4
3 5 7 9
(f^:_1)f 1 2 3 4
1 2 3 4
(f^:_1) 1 2 3 4
0 0.5 1 1.5
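For comparison, the same session in Python with the inverse written out by hand (J derives it automatically from the built-ins composing f):

```python
f = lambda x: 1 + 2 * x
f_inv = lambda y: (y - 1) / 2   # analytic inverse of f

xs = [1, 2, 3, 4]
print([f(x) for x in xs])          # [3, 5, 7, 9]
print([f_inv(f(x)) for x in xs])   # [1.0, 2.0, 3.0, 4.0]
print([f_inv(x) for x in xs])      # [0.0, 0.5, 1.0, 1.5]
```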
Now obviously not every function is bijective, and even when it is, it may not be trivial to invert -- and J doesn't (or at least didn't) have a super well specified way of computing those generalized inverses. But still: "under" is actually pretty cool; even just conceptually I find it quite valuable.

A key takeaway for me is that computers make it possible to have unambiguous notation. Quoting from Iverson's paper: "The thesis of the present paper is that the advantages of executability and universality found in programming languages can be effectively combined, in a single coherent language, with the advantages offered by mathematical notation."
Ambiguity is a real problem with much of conventional mathematical notation, which has evolved in fits and starts throughout history. Mathematical symbols are used, reused, and overloaded with different meanings again and again, in so many ways, that context is often necessary for understanding. Ambiguity hampers the use of mathematical notation as a tool for thought.
The other big takeaway from this paper, for me, is that succinctness makes it easier to reason. That is, programming languages that enable us to express more with less code make it easier for us to reason about -- and with -- code.
>> syms x
>> f = @(x) log(sqrt(x)).^2
f = function_handle with value:
@(x)log(sqrt(x)).^2
>> f(x)
ans = log(x^(1/2))^2
>> finverse(f(x))
ans = exp(2*x^(1/2))
And to implement under:

function u = under(f, g)
  syms x
  g_inv = matlabFunction(finverse(g(x)));
  u = @(x) g_inv(f(g(x)));
end

so that, given

function y = odddouble(x)
  y = 2*x + 1;
end

you can do

>> h = under(@(x) 1/x, @odddouble)
>> h(3)

?
If so, yeah, I agree you can implement under in Matlab (as long as you have the symbolic toolbox as well); in which case it's probably one of very few non-CAS systems where you can define it.

What if you could invent a more efficient "notation" for chess? For example, FEN is a chess notation that is very efficient for computers. So maybe something similar exists for humans? Perhaps a three-dimensional notation, or perhaps a rearrangement of the board using knight moves and octagons instead of squares. Knights can jump in eight directions at most, so a board using octagons would make it easy to see where they land.
I hadn't seen Dzaima's APL, thanks! I like that he made a Processing binding; APL always seemed like such an obvious choice for dweet-style graphics code golfing that I wondered why no one seemed to be doing it. A web-based APL would be a better choice, though.
In that case you'll be wanting ngn/apl[1], which runs in a browser and compiles to js.
> ambivalent
The arguments are mostly linguistic. Natural language is also context-sensitive, so we are well-equipped to parse such formations; and they allow us to reuse information. The monadic and dyadic forms of '-' are related, so it's less cognitive overhead to recognize its meaning.
J for C Programmers is a good book - https://www.jsoftware.com/docs/help807/jforc/contents.htm - and it can take significantly less than many weeks to "get" some important ideas. Specifically, make sure you understand rank, covered in chapters 5 and 6.
After you understand how +/ with different ranks can sum along different axes, you're well on the way.
I mean, here is a cube of numbers:
i. 2 3 4
0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15
16 17 18 19
20 21 22 23
Plain +/ sums along the leading axis:

+/ i. 2 3 4
12 14 16 18
20 22 24 26
28 30 32 34
That's because the rank of +/ is infinity, so / inserts pluses between the highest-ranked items, of which there are two - a square

0 1 2 3
4 5 6 7
8 9 10 11

and a square

12 13 14 15
16 17 18 19
20 21 22 23

(rank is a sort of dimension). So +/ just adds these two squares together, element by element, giving the resulting square.

If you specify +/"0 - this sets the rank of the verb (function) to 0 - then +/ will be applied to each number separately and the results will be combined. Summing a single number (not with itself - just as it is, without another argument) gives the same number, so +/"0 doesn't change the result - it's the same cube as i. 2 3 4
Trying with +/"1 gives
+/"1 i. 2 3 4
6 22 38
54 70 86
That's because +/"1 now is a verb of rank 1, so it works with items (subarrays) of rank 1. In the cube i. 2 3 4 there are 6 subarrays of rank 1: 3 of them in the first "plane" and 3 in the second. +/"1 takes each such rank-1 subarray separately and sums the elements in it (inserts + between them), and J then aggregates the results into an array.

Finally,
+/"2 i. 2 3 4
12 15 18 21
48 51 54 57
sums within 2-dimensional arrays. The elements of such arrays are 1-dimensional arrays, so those are summed element by element. There are two planes, so the result has two elements (two arrays of rank 1), each obtained by summing 3 arrays of rank 1.

The book tells it better, of course.
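For readers more at home in NumPy, the same three sums are just different axis arguments (a rough analogue, with np.arange(24).reshape(2, 3, 4) standing in for i. 2 3 4):

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)   # the analogue of i. 2 3 4

s0 = a.sum(axis=0)   # like +/   : collapse the leading axis, adding the two planes
s1 = a.sum(axis=2)   # like +/"1 : sum within each rank-1 row -> [[6 22 38] [54 70 86]]
s2 = a.sum(axis=1)   # like +/"2 : sum each plane along its own leading axis
print(s0)            # [[12 14 16 18] [20 22 24 26] [28 30 32 34]]
```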
The opaque one-liner:
using IterTools,ImageInTerminal,Colors;for g in iterated(a->let n=sum(map(t->circshift(a,t),product(-1:1,-1:1)));(a.&(n.==4)).|(n.==3);end,rand(Bool,(99,99)));imshow(map(Gray,g));print("\n\n");end
The legible version where we give everything descriptive names so it's not cryptic and mysterious:
using ImageInTerminal,Colors #the APL demo also uses a library for pretty display
using IterTools #okay *technically* this is a minor cheat
function nextgen(grid)
neighborcount = sum(map((t)->circshift(grid,t), product(-1:1,-1:1)))
return (grid .& (neighborcount .== 4)) .| (neighborcount .== 3)
end
function animate(grid)
for gen in iterated(nextgen, grid)
imshow(map(Gray, gen))
print("\n\n")
sleep(0.05)
end
end
animate(rand(Bool,(100,100)))

I still think learning mathematical symbols is better than spelling out mathematical formulas, and likewise APL and J to me allow the same power of abstraction; it just takes some effort to learn them. A lot of the friction is simply learning something new.
It is definitely not as elegant as the built-in facility in J, but definitely doable and usable in Matlab. In fact, I think any language with flexible enough function overloading should be able to implement such a feature.
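As a sketch of that idea in Python, without any CAS: let functions carry a user-declared inverse, loosely like J's :. conjunction (the class and names here are made up for illustration):

```python
import math

class WithInverse:
    """Pair a function with a user-declared inverse so higher-order ops can use it."""
    def __init__(self, fn, inv):
        self.fn, self.inv = fn, inv
    def __call__(self, x):
        return self.fn(x)

def under(f, g):
    # f 'under' g: map through g, aggregate with f, map back through g's inverse
    return lambda xs: g.inv(f([g(v) for v in xs]))

mean = lambda xs: sum(xs) / len(xs)
rms = under(mean, WithInverse(lambda x: x * x, math.sqrt))
print(rms([3.0, 4.0]))  # sqrt((9 + 16) / 2) ≈ 3.5355
```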
dzaima/APL being written in Java means getting it to run in a browser would be a bit hard, and ngn has given up on ngn/apl, but BQN[0] could definitely get a web canvas based graphics interface.
Somewhat interesting to add to the conversation about Under is that, in my impl, calling a function, calling its inverse, or doing something under it (i.e. structural under) are all equally valid ways to "use" a function, it's just a "coincidence" that there's direct syntax for invoking only one. (Dyalog does not yet have under, but it definitely is planned.)
But the power of abstraction of APL is available to any other language, with the right functions. Most scientific languages come with those functions out of the box, as demonstrated by my 1<->1 translation of 'Life in APL' into Julia above. And APL doesn't give you a bunch of other really useful general-purpose stuff; that's why I term it a 'DSL'. It's a one-trick pony. It's a great trick, but it's ultimately not quite enough. That's why NumPy hasn't replaced pure Python - you still need to get your hands dirty outside the array paradigm from time to time, and APL is very primitive at that. In fact there's nothing to stop anyone from aliasing array-functions to their APL equivalents in any Unicode-aware language, like Julia (oddly, nobody does). What you're left with is a rather basic parser, some odd syntax quirks like arrow assignment, and some ugly imperative flow control.
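For instance, the same Life step ports almost symbol-for-symbol to NumPy (a sketch; toroidal wrap via np.roll, function name mine):

```python
import numpy as np

def next_gen(grid):
    # self-inclusive neighbour count from the 9 shifted copies (toroidal edges),
    # mirroring the circshift/product trick in the Julia version
    n = sum(np.roll(grid, (di, dj), axis=(0, 1))
            for di in (-1, 0, 1) for dj in (-1, 0, 1))
    # a live cell with count 4 has 3 neighbours; count 3 means birth or survival
    return (grid & (n == 4)) | (n == 3)

grid = np.random.default_rng(0).random((100, 100)) < 0.5
grid = next_gen(grid)  # advance one generation
```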
Are the fancy symbols really worth it?
Since the computer is just a glorified calculator with memory (sorry Apple), we can fit the whole thing into a formal mathematical framework.