Thursday, January 28, 2010

More Assorted Closures

As I am learning more and more about functional programming, I am increasingly coming across the idea of closures. Closures deal with the concept of free variables, or variables that are not explicitly defined in the arguments to the function. The idea of a closure is that the binding of free variables in a function stays with that function. It can be thought of as saving the context a specific function runs in with the function itself.

Let's say you have a function with a free variable. Let's call it multbya, and let's define it to take an argument x, and return x * a, where a is a free value.

All languages will recognize a as a free variable. Now, different languages will do different things with this information. Many languages will simply freak out, saying "Oh God Oh God, you referenced a variable that I don't know anything about!" In some situations, this makes sense: Strongly typed languages (e.g. Scala freaks out) guarantee that the multiplication operator will work, but the language cannot guarantee that the multiplication operator is defined for any old symbol which may or may not be defined at runtime. Other languages (e.g. JavaScript doesn't freak out) may simply say "Well, I don't know what a is now, but fortunately I don't have to run the function now. I'm going to hope that a exists at runtime, and if it does, I'll use it. Also: hopefully the multiplication operator will exist for a."

Note that this concept only really applies in a REPL, as opposed to a file. In a file, nothing really is declared before or after anything else, because all the definitions are there in the file. The compiler can make 2 passes to the file, one to find symbols, and the other to define symbols. That way, symbols that are used can be defined after their use in the file. This is what Haskell appears to do.

This general idea, however, is a little different than what happens in a REPL. In Haskell and Prolog, the "program" is really what is referred to as a "knowledge base." The program doesn't flow: there are no variables and nothing changes. You define your program as a tree of transformations which transforms the input to the output. The "knowledge base" is just a list of those transformations. There are no variables - only arguments.

Okay, back to business. Let's say, for longevity's sake, that we define the variable a ahead of time. Let's set it equal to an integer equal to 3. Now, when you define multbya after defining a, all languages should be happy. Now, you can call the function and it should work just great.

Now, let's try to mask the variable a. Let's make another function, testit, with one variable, a. Inside this function, let's call multbya.

Now, in a language without closures, you could think of the body of the function simply being plopped down in place of the function call. Therefore, in a non-closured language, the a inside multbya is going to refer to the highest-level a, which is going to be the argument to testit. Therefore, in a non-closured language, the output of this function is going to be the argument to testit times the argument that testit called multbya with.

In a language with closures, the bindings associated free variables are passed around with the function. In technical language, the function is closed around its free variables. This means that, even though the highest level a is the argument to testit, multbya remembers its own binding to a. So, multbya remember that it was defined with a free variable of a, and that this free variable refers to the global a. This works anywhere, not just with global variables: functions remember what variables are closed under it so that a caller can't mask the callee's variables and change the function's functionality.

Not all languages use closures the same way, however. In our little example, let's pretend that we then change the value of the global a, then re-run testit. Now, this only works in languages where you can change values, or re-bind values.

Now, a language could deal with this in one of a few ways. Either a language lets you change values, or a language doesn't let you change values. Also, a closured language will either save values in closures, it will save references to values, or it will save symbolic references and a scope identifier.

If a language saves values in a closure, then changing the global value of a should have no effect on the output of testit.

If a language saves references in a closure, and if the language lets you change values, then the value returned from testit should change accordingly.

If references are saved, but the language simply re-binds the global name a to a new value, then the output should not be changed, because the reference still refers to the old value. Keeping this reference around makes sure that the garbage collector doesn't collect the value.

A way to get around this is to save a symbolic reference and a scope identifier. This is like saving the idea of "I want a variable a, and I want the one that's in the global namespace." Thus, no matter how references are handled under the hood, the output value is updated to use that variable.

All the languages that I have tried, except Haskell, have used either references or the last idea, because they have updated their output when I change a. As Haskell is purely functional, it might be doing some caching under the hood because values are not supposed to change.

Here is the Scala code I used:
hani:~ litherum$ scala
Welcome to Scala version (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_17).
Type in expressions to have them evaluated.
Type :help for more information.

scala> var a = 3
a: Int = 3

scala> def multbya(x : Int) : Int = x * a
multbya: (Int)Int

scala> def testit(a : Int) : Int = multbya(6)
test: (Int)Int

scala> testit(5)
res0: Int = 18

scala> a = 4
a: Int = 4

scala> testit(5)
res2: Int = 24

Here is the Clojure program I used:

hani:~ litherum$ clj
Clojure 1.1.0
user=> (def a 3)
user=> (defn multbya [x] (* x a))
user=> (defn testit [a] (multbya 6))
user=> (testit 5)
user=> (def a 4)
user=> (testit 5)

Here is the Common Lisp program I used:

hani:~ litherum$ clisp
i i i i i i i ooooo o ooooooo ooooo ooooo
I I I I I I I 8 8 8 8 8 o 8 8
I \ `+' / I 8 8 8 8 8 8
\ `-+-' / 8 8 8 ooooo 8oooo
`-__|__-' 8 8 8 8 8
| 8 o 8 8 o 8 8
------+------ ooooo 8oooooo ooo8ooo ooooo 8

Welcome to GNU CLISP 2.48 (2009-07-28)

Copyright (c) Bruno Haible, Michael Stoll 1992, 1993
Copyright (c) Bruno Haible, Marcus Daniels 1994-1997
Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998
Copyright (c) Bruno Haible, Sam Steingold 1999-2000
Copyright (c) Sam Steingold, Bruno Haible 2001-2009

Type :h and hit Enter for context help.

[1]> (setf a 3)
[2]> (defun multbya (x) (* a x))
[3]> (multbya 4)
[4]> (defun testit (a) (multbya 6))
[5]> (testit 5)
[6]> (setf a 4)
[7]> (testit 5)

Here is the JavaScript program I used:

hani:rhino1_7R2 litherum$ java -jar js.jar
Rhino 1.7 release 2 2009 03 22
js> var a = 3
js> function multbya(x) {
> return x * a;
> }
js> function testit(a) {
> return multbya(6);
> }
js> testit(5)
js> a = 4
js> testit(5)

Here is another JavaScript program I used:

hani:v8-read-only litherum$ ./shell
V8 version
> var a = 3
> function multbya(x) { return x * a; }
> function testit(a) { return multbya(6); }
> testit(5)
> a = 4
> testit(5)

Here is the Python program I used:

hani:~ litherum$ python3.1
Python 3.1.1 (r311:74480, Sep 25 2009, 09:54:21)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 3
>>> def multbya(x):
... return a * x
>>> def testit(a):
... return multbya(6)
>>> testit(5)
>>> a = 4
>>> testit(5)

Here is the Haskell program I used:

hani:~ litherum$ ghci
GHCi, version 6.10.4: :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer ... linking ... done.
Loading package base ... linking ... done.
Prelude> let a = 3
Prelude> let multbya x = x * a
Prelude> let testit a = multbya 6
Prelude> testit 5
Prelude> let a = 4
Prelude> testit 5

No comments:

Post a Comment