I feel like this kind of operation on a list is more naturally expressed by filtering the list and taking the length of the filtered list.
Like this line of JS feels so much easier to read than that line of python:
ages.filter(age => age > 17).length
Directly translating this approach to python:
len(list(filter(lambda age: (age > 17), ages)))
Although a better way to write this in python I guess would be using list comprehensions:
len([age for age in ages if age > 17])
which I feel is more readable (but less efficient) than the APL-inspired approach. Overall, none of these Python versions seem as readable to me as my JS one-liner. Obviously, if the function is on a hot path, iterating and summing numbers is far more efficient than filtering. In that case I'd probably still use something like reduce instead of summing booleans, because the code would be more similar to other places where you need to process a list to produce a scalar value but need to do something more complex than simply adding.
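For comparison, here is a minimal sketch of the counting approaches mentioned above, side by side. The sample `ages` list and the variable names are made up for illustration; the reduce version is only one way the "use reduce" idea might look.

```python
from functools import reduce

# Hypothetical sample data, not taken from the thread.
ages = [12, 17, 18, 25, 40, 16]

# Filter, then take the length (direct translation of the JS one-liner).
by_filter = len(list(filter(lambda age: age > 17, ages)))

# List comprehension, then take the length.
by_comprehension = len([age for age in ages if age > 17])

# Sum of booleans (the APL-inspired version from the article).
by_sum = sum(age > 17 for age in ages)

# reduce with an explicit accumulator; the same shape works for folds
# that are more complex than plain addition.
by_reduce = reduce(lambda total, age: total + (age > 17), ages, 0)

print(by_filter, by_comprehension, by_sum, by_reduce)
```

All four produce the same count; they differ in whether an intermediate list is built and in how visible the "fold to a scalar" shape is.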
The blog starts out with mentioning other reduce operations in Python: 'prod from the math module; min; max; any; all; and "".join'. In APL those are:
×/ product reduce
⌊/ min reduce
⌈/ max reduce
∨/ logical OR reduce (any bool set)
∧/ logical AND reduce (all bools set)
,/ catenate reduce (join without spaces)
These show clearly, where the Python names don't, that they are related operations; that suggests that you could put any function on the left or any kind of array on the right and see what happens. And they work over multidimensional arrays - and you can swap / for ⌿ to reduce down columns instead of across the rows.
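One way to make the same connection visible in Python is to write each named function next to an explicit reduce. This is only a sketch for one-dimensional lists; the sample values are made up, and pairing each line with an APL glyph follows the list above.

```python
import math
import operator
from functools import reduce

# Hypothetical sample data.
values = [3, 1, 4, 1, 5]
flags = [True, False, True]
words = ["a", "b", "c"]

# Each named Python function next to the same operation as an explicit reduce.
assert math.prod(values) == reduce(operator.mul, values)   # ×/
assert min(values) == reduce(min, values)                  # ⌊/
assert max(values) == reduce(max, values)                  # ⌈/
assert any(flags) == reduce(operator.or_, flags)           # ∨/
assert all(flags) == reduce(operator.and_, flags)          # ∧/
assert "".join(words) == reduce(operator.add, words)       # ,/
assert sum(values) == reduce(operator.add, values)         # +/
```

Written this way, the only thing that changes from line to line is the function handed to reduce, which is the pattern the APL spellings show directly.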
Your rewritten Python and JS, by showing the operation as length instead of sum, and making a shorter filtered list, hide the connection even further instead of helping to reveal and clarify it.
> "which I feel is more readable (but less efficient) than the APL inspired approach"
I know how to read it, but just look at:
age age ages age
len age for age in ages if age
what's readable about so much repetition, what's readable about having to spot the single character plural change in the middle of 8 short words?
([>])
that's more symbols than the APL one has, the language people reject because of the heavy use of symbols(!). Why does the [] indicate loopy-listy code but so does 'for'? In PowerShell arrays have a property .Count to use instead of Length - what's readable about counting the number of ages by indirectly looking at the length of something? Is this "it's readable because I'm familiar with it" rather than "because it's objectively readable"?
Yes. One of the “repercussions” of the author’s experience with APL might have led to him recoiling from the verbosity of his Python examples and exploring languages, like Julia, that can get closer to the APL spirit.
It feels more natural to you because of familiarity. However, if you've learned Iverson Bracket notation in math (https://en.wikipedia.org/wiki/Iverson_bracket) then the APL approach will probably feel more natural, because it's a more direct expression of the mathematical foundations. Of course, the actual APL version is by far the most natural once you're familiar with the core ideas: +/ages>17
Doing @WalterBright's job for a second to let you know that in the D language, given a function that is defined as taking, say, a list as the first argument, it can be called either
myfun(list)
or
list.myfun()
and this applies without exception. This means a lot of C-library code gets easier to read when used from D:
I love D, learned it back in high school, but unfortunately it just seems like D is one of those great languages that just isn't going to take off. It just doesn't have a big enough champion to bootstrap a thriving large community that would make more people want to learn it and more companies adopt it.
On its technical merits, one of the best languages out there. It gives you insane performance but is so much easier to learn and write than C++ or Rust.
I felt the same way about Rust's .await and ?; I think we're gradually converging on the idea that postfix operators are the right way to do 1-argument functions. I'm not remotely convinced that RPN is a readable idea in general, but when we have a pipeline rather than a tree, writing it left-to-right is the winner.
Numpy is something close to APL semantics with Python syntax. There's no doubt it was heavily inspired by APL. One could argue that numpy's popularity vindicates the array model pioneered by APL, while driving a nail in the coffin of "notation as a tool of thought", or APL's version of it at any rate. Array programming has never been more popular but there's no demand for APL syntax.
I think the key to array programming's success was the possibility of having fast interpreted languages for numerical computing. I'm very used to programming in array languages (Matlab, Python, Julia, and even Mathematica has a lot of vectorized ops) and still I think a lot in terms of for loops.
Neither of your examples is doing what the parent suggested as an alternative, which is simply looping and counting rather than creating an anonymous function to repeatedly call. I doubt this makes an enormous difference in practice, but you can see your generated assembly doing all the argument passing and stack frame setup, which is extra instructions compared to just straight looping and counting.
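In Python terms, "simply looping and counting" might look like the sketch below. The function name `count_over` and its `threshold` parameter are hypothetical, and this does not reproduce the generated assembly the comment refers to; it only shows the shape of the loop with no anonymous function involved.

```python
def count_over(ages, threshold=17):
    """Count values strictly greater than threshold with a plain loop."""
    count = 0
    for age in ages:
        # No callback is invoked per element; the comparison is inline.
        if age > threshold:
            count += 1
    return count


print(count_over([12, 17, 18, 25, 40, 16]))
```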
Your list comprehension is how I would naturally write this, and to me seems the most readable - it's almost like an English language sentence. The sum example in the article would probably be my second choice.
It's in there but near the end (80% or so of the way down the page). The article would benefit from moving that to the top and drawing the comparison to the APL code earlier.
I think APL's ability to lift loop patterns into tensor patterns is interesting. It certainly results in a lot less syntax related to binding single values in an inner loop.
Kenneth E. Iverson, the inventor of APL, was truly a genius, and his primary mission was to bridge the worlds of computing and mathematics.
To do this he invented the APL notation.
If you find the article interesting, you might enjoy my curation of his work "Math For The Layman" [0] where he introduces several math topics using this "Iversonian" thinking.
Thanks, ex-APL2 coder here, will look up the Math For The Layman site later. My favorite APL story came from somebody at morning yoga practice who grew up close to Yorktown Heights: in high school, his parents got him a job writing code for Iverson, which he described as a lot of fun.
It could have been a good career choice at one point, given how many FTEs were devoted to the language at Merrill Lynch, Morgan Stanley, Lehman etc., but I took the other fork.
As a beginner I definitely thought list comprehensions were easier than apply/filter style of operations.
They amount to the same thing, but the explicit loop was much easier for me to understand (and I’m still not sure if one applies a function to a value or a value to a function, so I never remember if the function or the values go first in an apply/filter call).
bool is explicitly documented to be a subclass of int [1][2], so while it might be an obscure feature, or subjectively not someone's preferred style, I don't see any typing related issue. In general I don't think that treating an object as if it were an instance of one of its base classes is weak typing.
I'd argue that even though it's well defined behaviour in Python, it still appears to me as a programmer as weak typing. For example, Python lets me write:
if 23:
    print('hello')
and it'll print 'hello'. But I'd prefer a strongly typed approach where this code would give an error saying, 'a bool is expected here'. Sure it's a subjective thing, and this is just my preference.
“I don’t know the type hierarchy used in language X” is not the same thing as “language X is weakly typed”.
Python 3.11.4 (tags/v3.11.4:d2340ef, Jun 7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> isinstance(True, int)
True
>>> isinstance(False, int)
True
>>> issubclass(bool, int)
True
Bools are ints, but I'd say that makes Python weakly typed (ish - it's a spectrum obviously). Are you just using "weak typing" to mean "implicit type coercion"?
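A small sketch of the distinction being debated here: booleans take part in arithmetic because of subclassing, and any object can be tested for truthiness, but Python does not implicitly convert between unrelated types such as str and int. The variable name `outcome` is just for illustration.

```python
# bool is a subclass of int, so arithmetic on booleans needs no conversion.
assert True + True == 2
assert sum([True, False, True]) == 2
assert isinstance(True + True, int) and not isinstance(True + True, bool)

# Truthiness: `if 23:` works because any object can be tested for truth.
assert bool(23) is True

# By contrast, there is no implicit conversion between str and int.
try:
    "3" + 1
except TypeError as exc:
    outcome = f"TypeError: {exc}"
else:
    outcome = "no error"

print(outcome)
```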
python -m timeit 'sum(1 for age in range(100000) if age > 17)'
50 loops, best of 5: 5.08 msec per loop
python -m timeit 'sum(int(age > 17) for age in range(100000))'
50 loops, best of 5: 7.96 msec per loop
python -m timeit 'sum(age > 17 for age in range(100000))'
50 loops, best of 5: 4.78 msec per loop
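The same comparison can be run from a script with the standard timeit module. This is a rough sketch: the labels are made up, wrapping each expression in a lambda adds a little call overhead compared with the command-line form above, and the numbers will differ by machine and Python version.

```python
import timeit

# The three expressions timed above, as zero-argument callables.
candidates = {
    "sum(1 ... if)": lambda: sum(1 for age in range(100000) if age > 17),
    "sum(int(...))": lambda: sum(int(age > 17) for age in range(100000)),
    "sum(bool)": lambda: sum(age > 17 for age in range(100000)),
}

RUNS = 10  # kept small so the sketch finishes quickly

for label, fn in candidates.items():
    seconds = timeit.timeit(fn, number=RUNS)
    print(f"{label}: {seconds / RUNS * 1000:.2f} msec per loop")
```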
> Another big thing that APL made me realise is that the Boolean values True/False and the integers 1/0 are tightly connected
Amen! It's of course also a C language tenet, and a great one. Life is so much simpler and more flexible when true and false are 1 and 0. It drives me crazy when I need to use a language where the logical operators only work on bools and the arithmetic only on ints, or some coercions work and others don't. When I incorporate somebody else's code into mine, first thing I get rid of is anything called "bool", a completely useless type. (as a nice side effect, that frees up the bool keyword for Boolean sets, which are quite useful)
A disappointment with Unix is that the process return value has this a bit backward: 0 is success, nonzero is failure (probably because errno wants more than a single bit), but it's easily enough remedied with a !
I did love everything else about APL for the brief time I used it long ago (except the difficulty of entering the symbols)
> Another big thing that APL made me realise is that the Boolean values True/False and the integers 1/0 are tightly connected
Not that tightly. Which is why C and Python, which both started out with boolean values just being integers, eventually retrofitted booleans to the language. Conversions come back to bite you.
(replying to myself to add) I do use Hungarian, and I do use a bool-like abstract type, and it is indicated in my type calculus, but semantically my implementation allows for mixing true=1 and false=0 with other ints and arithmetic operators. I eschew implementations that do not have these convenient semantics, which I feel introduce no confusion, only convenience.
What is it about Python? A pretty trivial feature that has been known for 30 years (and is present in NumPy) is made into a whole blog post, linked to APL to sound more interesting and is on the front page for a day.
If the allegedly "most popular language" (confirmed by Gartner and Netcraft) needs that much proselytizing, perhaps it is artificially popular? Or has voting rings?
For a very large sequence traversing it to build a list and then traversing the list to do something you could do in one traversal without creating a list may be undesirable.
In a very large sequence, you would be using numpy. I assumed the purpose of their example was some sort of clarity, over `sum(age > 17 for age in ages)`, rather than performance.
You can't call len on a gen exp, though you could define a count function. For an unsafe variant:
def count(it):
    return sum(1 for _ in it)
Which is basically just putting a friendly name on the approach from the article.
For a safe version, you probably want to wrap it in another generator that bails with an exception at a specified size so you don’t risk an infinite loop.
def safe_count(it, limit=100):
    # returns None if actual length > limit
    nit = zip(range(limit+1), it)
    if (l := sum(1 for _ in nit)) <= limit:
        return l
Of course, you can just convert to a list and return the length, but sometimes you don’t want to build a list in memory.
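As a usage sketch, here is a variant of the bounded count written with itertools.islice instead of zip, tried on a short generator and on an infinite one. It keeps the return-None-past-the-limit behaviour described above rather than raising; the import alias `naturals` is made up for readability.

```python
from itertools import count as naturals, islice


def safe_count(it, limit=100):
    """Return the number of items in `it`, or None if there are more than `limit`."""
    # Consume at most limit + 1 items, so an infinite iterator cannot hang us.
    seen = sum(1 for _ in islice(it, limit + 1))
    return seen if seen <= limit else None


print(safe_count(age for age in [12, 18, 30] if age > 17))
print(safe_count(naturals()))  # infinite iterator: gives up after limit + 1 items
```

Note that the iterator is partially consumed either way, so this only suits cases where the items themselves are not needed afterwards.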
In Python, the Boolean values True and False are often used in arithmetic, where they behave as the integers 1 and 0 (bool is a subclass of int). sum is the built-in sum function: you can sum a sequence of numbers, but in this case it’s summing a bunch of Boolean values, which count as 1 and 0.
Because what is happening is that the list of ages is being transformed into a list of booleans where it's true if the age is greater than 17. This list of booleans is then turned into a list of integers where it's 1 if true, 0 if false. This list of integers is then being summed.
numpy (which is inspired by Matlab which is inspired by APL) does indeed have a count_nonzero function, which is intended to be used in situations like this. Unfortunately, it (like most of numpy) doesn't work with generators, just array-like objects (aka numpy arrays and python lists), so it has the same memory performance issues as filtering and using len.
If your input was a numpy array to begin with you could skip the array comprehension, and shorten it to numpy.count_nonzero(ages > 17), since numpy automatically broadcasts the comparison operation to each element of the array.
Yes, very similar. When performing an operation between an array and a scalar, it is identical to mapping that operation on each element of the array. Broadcasting generalizes this to also handle operations between matrices and vectors, such that the operation with the vector is applied to each row or column of the matrix.
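A short sketch of what is described above, assuming numpy is installed and using made-up sample data: an array-versus-scalar comparison counted with count_nonzero, then a matrix-versus-vector operation applied across rows and down columns.

```python
import numpy as np

# Hypothetical sample data.
ages = np.array([12, 17, 18, 25, 40, 16])

# Array vs scalar: the comparison is applied to each element.
mask = ages > 17
adults = np.count_nonzero(mask)

# Matrix vs vector: the vector is added to each row of the matrix.
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
row_offsets = np.array([10, 20, 30])
shifted_rows = matrix + row_offsets

# To apply a vector down the columns instead, give it a trailing axis of length 1.
col_offsets = np.array([100, 200])
shifted_cols = matrix + col_offsets[:, np.newaxis]

print(adults)
print(shifted_rows)
print(shifted_cols)
```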
Multi-line lambdas are fine: Python will accept newlines in certain parts of an expression, and you can use '\' for others; e.g.
f = lambda x: [
    x + y
    for y in range(x)
    if y % 2 == 0
]
>>> f(5)
[5, 7, 9]
Lambdas which perform multiple sequential steps are fine, since we can use tuples to evaluate expressions in order; e.g.
from sys import stdout
g = lambda x: (
    stdout.write("Given {0}\n".format(repr(x))),
    x.append(42),
    stdout.write("Mutated to {0}\n".format(repr(x))),
    len(x)
)[-1]
>>> my_list = [1, 2, 3]
>>> new_len = g(my_list)
Given [1, 2, 3]
Mutated to [1, 2, 3, 42]
>>> new_len
4
>>> my_list
[1, 2, 3, 42]
The problem is that many things in Python require statements, and lambdas cannot contain any; not even one. For example, all of the following are single lines:
>>> throw = lambda e: raise e
File "<stdin>", line 1
throw = lambda e: raise e
^^^^^
SyntaxError: invalid syntax
>>> identity = lambda x: return x
File "<stdin>", line 1
identity = lambda x: return x
^^^^^^
SyntaxError: invalid syntax
>>> abs = lambda n: -1 * (n if n < 0 else return n)
File "<stdin>", line 1
abs = lambda n: -1 * (n if n < 0 else return n)
^^^^^^
SyntaxError: invalid syntax
>>> repeat = lambda f, n: for _ in range(n): f()
File "<stdin>", line 1
repeat = lambda f, n: for _ in range(n): f()
^^^
SyntaxError: invalid syntax
>>> set_key = lambda d, k, v: d[k] = v
File "<stdin>", line 1
set_key = lambda d, k, v: d[k] = v
^^^^^^^^^^^^^^^^^^^^
SyntaxError: cannot assign to lambda
>>> set_key = lambda d, k, v: (d[k] = v)
File "<stdin>", line 1
set_key = lambda d, k, v: (d[k] = v)
^^^^
SyntaxError: cannot assign to subscript here. Maybe you meant '==' instead of '='?
Ah yes, I've been doing this with stdout.write since Python 2; it didn't occur to me that when Python 3 turned print into a function that would make it usable in lambdas!
You're right that the walrus makes assignment more usable; we can also call methods like .__setitem__ to get similar effects. Unfortunately the walrus seems to suffer the same broken/ambiguous scoping as assignment statements, e.g.
>>> a = 1
>>> b = lambda: (print(a), a := 2)
>>> b()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <lambda>
UnboundLocalError: cannot access local variable 'a' where it is not associated with a value
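For contrast with the failing case above, here is a sketch of the two workarounds that do work inside a lambda: calling `__setitem__` directly, and using the walrus when the name is only read after it has been bound in the lambda's own scope. The names `set_key`, `square_plus_one` and `settings` are illustrative.

```python
# Item assignment is a statement, but the method it calls is an ordinary expression.
set_key = lambda d, k, v: d.__setitem__(k, v)

# The walrus is fine when the name is bound before it is read in the same lambda.
square_plus_one = lambda x: (y := x * x, y + 1)[-1]

settings = {}
set_key(settings, "limit", 17)
print(settings, square_plus_one(4))
```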
> Don't need return in a lambda
`return` is needed for early returns. Notice that my `abs` example is trying to do that with `return n` (to skip the `-1 *` operation, since `n` is non-negative in that case).
> Leaves raise and a few other odds and ends
Off the top of my head: yield, match, def, import, class, with, for, while, assert, try, except, async, await.
Hmm, need to put the assignment before. Can't access nonlocals unless you only read the var and don't write to it. That's the way functions in Python work as well.
No, that would not have the intended behaviour: your `print(a)` is reading a fresh local variable, defined by the `:=` expression (which shadows that from the outer scope), so it will output '2'. The intended behaviour is to print '1' and reassign the outer name; but we can't do that (that's why I chose it as an example!)
> Can't access nonlocals unless you only read the var and don't write to it. That's the way functions in python work as well.
Yes, that is precisely what I meant when I said it "suffers the same broken/ambiguous scoping as assignment statements".
Python took a haphazard, WorseIsBetter approach to assignment/declaration/scoping. The resulting behaviour is complex, confusing and involves spooky action-at-a-distance (e.g. the meaning of my `print(a)` expression was altered by the presence of an ':=' expression which hadn't been reached yet!). Since this doesn't really affect simple "top to bottom" scripts, it only became an issue once large applications and frameworks started emerging; by that time it was too late to fix, since there was too much Python code in the wild to justify such a large breaking change :(
I merely explained why it didn't work, for any passerby. I don't think the rules are hard to understand once described. There's basically only one of consequence: if writing outside the current scope, there needs to be a clarifying statement first to avoid ambiguity. Unfortunately that prevents usage in a lambda.
However, I don't think complex functionality belongs in them either, so not a big loss. See the Beyonce rule mentioned elsewhere in this thread.
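The "clarifying statement" rule can be sketched with an ordinary def, where a `nonlocal` declaration is available (`global` plays the same role for module-level names). The function names here are hypothetical; the point is that the read-then-rebind behaviour the lambda example wanted needs a statement, which a lambda cannot contain.

```python
def demo():
    a = 1

    def show_then_rebind():
        # Without this declaration, the assignment below would make `a` local
        # to show_then_rebind, and reading it first would raise UnboundLocalError.
        nonlocal a
        seen = a  # reads the enclosing scope's `a`
        a = 2     # rebinds the enclosing scope's `a`
        return seen

    seen = show_then_rebind()
    return seen, a


print(demo())
```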
Most functional languages don't have statements at all, and Python's anonymous functions can, like most, handle any single expression, regardless of complexity or size.
Python having a statement heavy syntax and making complex expressions (while possible) awkward is the problem with its anonymous functions, not the fact that its anonymous functions are limited to a single expression.
Definitely, I'm looking at this from a superficial level of writing functional programming inspired code in imperative languages. JS makes this a bit more comfortable with its style of anonymous functions.
I use the toolz package for currying and a few other conveniences.
That plus named tuples and a little discipline gets me 80+% of the benefits of pure functional.
The biggest thing to me wasn't in your list: lack of tail call optimization means you have to be a bit careful about where you use recursion.
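A sketch of the recursion caveat: even when the recursive call is in tail position, CPython adds a stack frame per call, so a long enough input hits the recursion limit. The function names and the input size (twice the current limit) are chosen only for illustration.

```python
import sys


def count_recursive(ages, i=0, total=0):
    """Tail-recursive count of ages over 17; still one stack frame per element."""
    if i == len(ages):
        return total
    return count_recursive(ages, i + 1, total + (ages[i] > 17))


def count_iterative(ages):
    """Same result without recursion."""
    return sum(age > 17 for age in ages)


big = list(range(sys.getrecursionlimit() * 2))

try:
    count_recursive(big)
    recursed_ok = True
except RecursionError:
    recursed_ok = False

print(recursed_ok, count_iterative(big) == len(big) - 18)
```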
I personally think long anonymous functions are an anti-pattern, naming your functionality is great for readability. I learned this from Wolfram language which makes heavy use of anonymous functions, I would often find myself returning from lunch to find my code which was perfectly clear that morning had become unintelligible. Today I try to limit anonymous functions to re-ordering arguments or "picking" functions that pull simple values out of more complicated data structures.
hmm... that just sounds like a specific case of recursive application of partial functions? At least that's how I interpret the Wikipedia explanation:
"As such, curry is more suitably defined as an operation which, in many theoretical cases, is often applied recursively, but which is theoretically indistinguishable (when considered as an operation) from a partial application."
(while I see partial as having value, I'm struggling to see if currying would really be a useful addition to a scripting language)
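To make the distinction concrete, here is a minimal sketch using only the standard library. `over` and `curry2` are hypothetical helpers, and `curry2` handles only two-argument functions; a package such as toolz offers a more general curry.

```python
from functools import partial


def over(threshold, age):
    """True if age is strictly greater than threshold."""
    return age > threshold


# Partial application: fix some arguments now, supply the rest in one later call.
over_17 = partial(over, 17)


# Currying: turn a two-argument function into a chain of one-argument functions.
def curry2(fn):
    return lambda first: lambda second: fn(first, second)


curried_over = curry2(over)

ages = [12, 17, 18, 25]
print(sum(map(over_17, ages)), sum(map(curried_over(17), ages)))
```

For a two-argument function the results are the same; the difference is that partial produces one function that takes all remaining arguments at once, while the curried form always takes them one at a time.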
You're saying that brevity for typing on 110 baud teletypes was the primary reason for the density of APL operators, but that's not historically true. For one thing Iverson and pretty much everyone else high profile in the APL community has said that the conciseness was itself a major part of the power of APL.
Further proof of that: APL began as a mathematical notation on chalkboards, and only later was it decided to implement as a programming language.
As an aside, Unix also began in the era of 110 baud teletypes, which motivated brevity of many of its common commands (ls, cat, etc), and those brief names have been criticized a lot -- but people still use Unix/Linux; the brevity is a side issue, not a deciding point. As usual with most things.
Writing on a chalkboard is another situation where brevity is at a premium. There's limited space and the act of writing on it is slow and causes hand cramps if it's too verbose.
I've gotten paid for working with APL code. Math professors who aren't great at typing love that stuff, but code has to be maintained and some mild verbosity, as python has, is a very reasonable price to pay for that maintainability.
If this was punch card input, or 110 baud teletypes, where program listings come back at a snail's pace and use paper, then APL is great for that.
> "If this was punch card input, or 110 baud teletypes, where program listings come back at a snail's pace and use paper"
So your typing speed hasn't gone up in proportion to the increase in baud; I'm guessing your reading speed also hasn't gone up tens of thousands of times, and neither has your ability to hold working state in your head. What, then, is the advantage of increased baud to code readability that you are talking about?
Let's say I'm not disagreeing, but I'm trying to dig into what specifically the change is which makes the difference; the computer can display more code at you per second than in 1950, but humans can't read much faster than in 1950, so that doesn't seem like it will help. Presumably longer books aren't inherently more readable than shorter books?
Can it be that Python is more readable because it lets you skim over and not read more of the code? Since not-reading isn't reading, it seems like 'more filler' that you don't read isn't what adds to readability.
It presumably isn't that Python is more English-y, because languages which tend towards English words (SQL, Objective-C, PowerShell Cmdlets, Applescript, BASIC) are often maligned specifically for that reason, and because Python isn't English (you couldn't speak it to Shakespeare and have him understand you).
It presumably isn't because Python uses fewer symbols, or we'd all love to write Java style var1.Equals(var2) instead of == and var1.Plus(var2) instead of + and people seem to dislike that also. Why would + be preferred over .plus() but .sum() be preferred over +/ ?
Is it that Python has more visible structure to hang understanding on? Is it that it's more like walking compared to jogging compared to sprinting, that one can sustain a lower effort 'slower read' for longer?