My previous post showed a bit of code that demonstrates an actual bug I ran into this week. Here is the code again:

# Python
names = ['foo', 'bar', 'baz']
fns = [lambda: n for n in names]
print [f() for f in fns]

Here's the output:

['baz', 'baz', 'baz']

Let's try to figure out why. First, unroll the list comprehensions into plain loops:

names = ['foo', 'bar', 'baz']
fns = []
for n in names:
    fns.append(lambda: n)
result = []
for f in fns:
    result.append(f())
print result

Same result. Good. Now let's replace the lambda (anonymous function) with an ordinary function that has an explicit name.

names = ['foo', 'bar', 'baz']
fns = []
for n in names:
    def fn(): return n
    fns.append(fn)
result = []
for f in fns:
    result.append(f())
print result

Still the same result. The reason this program outputs what it does might be becoming clearer... Let's think about what a for loop does again. "for N in LS" is translated into something like "if LS has anything, assign the first item in LS to N and run the loop body. If there is another item in LS, assign that item to N and run the body again. Repeat." That assignment to N happens in the current scope, which in this program is the global scope. The loop does not create a scope of its own, so N is one and the same variable on every iteration; it just gets rebound each time.
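To make that translation concrete, here is a rough hand-written equivalent of a for loop. It is a sketch built on the built-in iter and next, not a literal description of what the interpreter does, and the names it and seen are made up for illustration. print is called with parentheses so the snippet also runs on Python 3.

```python
names = ['foo', 'bar', 'baz']

# Roughly what "for n in names: seen.append(n)" amounts to.
it = iter(names)
seen = []
while True:
    try:
        n = next(it)      # plain assignment to n in the current scope
    except StopIteration:
        break             # no items left, so the loop ends
    seen.append(n)

print(seen)  # ['foo', 'bar', 'baz']
print(n)     # baz
```

Note that n is still around after the loop has finished, holding the last item. Nothing was created or destroyed per iteration; the one variable was simply reassigned.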

Okay, getting closer... Inside a function, variable lookups start in the function's own local scope; if the lookup fails, it continues to the next enclosing scope, and so on out to the global scope and finally the builtins. (If the name isn't found anywhere, you get a NameError.) This lets you create inner functions that use variables from outside, such as globals. The important thing to note here is that these lookups only happen when the code that uses the variable actually runs, not when the function is defined.

So let's see what's happening here. The program builds a list of functions, each of which returns a variable n. Since n is not in the functions' own scopes, it is pulled from the outer scope (in this case, the globals). By the time the functions are called, however, n has been rebound several times. Its most recent value is the one the functions return; in this case, 'baz'. Thus, all of the functions return 'baz'.

For performance reasons, you wouldn't want Python to take a snapshot of the value of every variable your functions access, but I think that's what we imagine is happening when we write code like this.
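The late lookup can be seen in isolation, without any loop at all. This is a minimal sketch; greeting and get_greeting are names invented for the example.

```python
greeting = 'hello'

def get_greeting():
    # greeting is not a local name, so it is looked up in the
    # outer (here: global) scope every time this body runs.
    return greeting

first = get_greeting()
greeting = 'goodbye'    # rebind the name after the function was defined
second = get_greeting()

print(first)   # hello
print(second)  # goodbye
```

The function never stored 'hello' anywhere. It only stored the instruction "go find greeting", which is exactly what the functions in the loop do with n.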

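Is there a way to take a snapshot of variables we know will change, though? Default argument values let us do just that: a default is evaluated once, at the moment the def (or lambda) is executed, and the resulting value is stored on the function object. Replace the program with the following and it does exactly what we want.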

names = ['foo', 'bar', 'baz']
fns = []
for n in names:
    def fn(n=n): return n
    fns.append(fn)
result = []
for f in fns:
    result.append(f())
print result
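If you want to check that the value really is stored at definition time, you can look at the function's __defaults__ attribute. The sketch below repeats the loop and then inspects it; stored is just an illustrative name.

```python
names = ['foo', 'bar', 'baz']
fns = []
for n in names:
    def fn(n=n):
        return n
    fns.append(fn)

# Each function object carries its own default, captured when
# its def statement ran.
stored = [f.__defaults__ for f in fns]
print(stored)  # [('foo',), ('bar',), ('baz',)]

# Caveat: it is still only a default, so a caller can override it.
print(fns[0]('surprise'))  # surprise
```

That last line is the one downside of the trick: the captured value becomes part of the function's signature, so anything that calls the function with an argument will silently replace it.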

And translating back into the original form with lambdas and list comprehensions...

names = ['foo', 'bar', 'baz']
fns = [lambda n=n: n for n in names]
print [f() for f in fns]

Tada! Lesson: functions create scopes (every call gets its own frame, in Python terminology); loops don't. Therefore, if you create functions inside a loop whose variables will change value, use default arguments to capture their values at function creation time.
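The same lesson suggests a second fix, which is not the one used in this post: since a function call creates a scope, you can build each function inside another function call. This is a sketch, and make_fn and value are hypothetical names chosen for the example.

```python
names = ['foo', 'bar', 'baz']

def make_fn(value):
    # value is local to this particular call of make_fn, so each
    # returned function closes over its own separate variable.
    def fn():
        return value
    return fn

fns = [make_fn(n) for n in names]
result = [f() for f in fns]
print(result)  # ['foo', 'bar', 'baz']
```

It is a little more typing than n=n, but the returned functions take no arguments, so nobody can accidentally override the captured value.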