Difference between Special Variable and Global Variable

In GNU CLISP 2.49.92, the following code:

(defvar i 1)
(defun g ()
  (format T "g0:~d~%" i)
  (setf i 2)
  (format T "g1:~d~%" i))
(defun f ()
  (setf i 3)
  (format T "f0:~d~%" i)
  (g)
  (format T "f1:~d~%" i))
(f)

gives the following output:

f0:3
g0:3
g1:2
f1:2
NIL

Similarly, the following code in C:

#include <stdio.h>

static int i = 1;

void g (void) {
  printf("g0:%d\n", i);
  i = 2;
  printf("g1:%d\n", i);
}

void f (void) {
  i = 3;
  printf("f0:%d\n", i);
  g();
  printf("f1:%d\n", i);
}

int main() {
    f();
    return 0;
}

gives the following output:

f0:3
g0:3
g1:2
f1:2

According to the documentation that I found, defvar creates a special variable that is dynamically scoped. On the other hand, C is a statically scoped language. And yet, the two pieces of code give the same output. What then is the difference between a Special Variable and a Global Variable?

CodePudding user response:

In the case you're showing, you're only assigning to an existing binding, so nothing surprising happens. The interesting part is what happens when you rebind a special variable with let.

(defvar *i* 1)

(defun f ()
  (format t "f0: ~a~%" *i*)
  (let ((*i* (1+ *i*)))
    (format t "f1: ~a~%" *i*)
    (g)
    (incf *i*)
    (format t "f2: ~a~%" *i*))
  (format t "f3: ~a~%" *i*))

(defun g ()
  (incf *i*)
  (format t "g: ~a~%" *i*))

(f)

which prints:

f0: 1
f1: 2
g: 3
f2: 4
f3: 1

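The let of *i* creates a new dynamic binding (because *i* was globally declared special by defvar). That binding is visible to g while the body of the let is running, and the previous value of 1 is visible again once the let exits.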

CodePudding user response:

The difference is that a special variable is dynamically scoped: any binding of that name is visible to any code that runs during the dynamic extent of that binding, whether or not that binding is lexically visible to the code.

In what follows I am skating over some things: see notes at the end for some hints as to what I've skated over.

It's important to understand the difference between a binding and an assignment, which is often confused in various languages (notably Python, but not really C):

used as a noun, a binding is an association between a name and a value;
used as a verb, binding a variable creates a new association between a name and a value;
an assignment of a variable modifies an existing binding: it changes the value in an existing association between a name and a value.

So, in C:

void g (void) {
  int i;                        /* a binding */
  int j = 2;                    /* a binding with an initial value */
  i = 1;                        /* an assignment */
  {
    int i;                      /* a binding */
    i = 3;                      /* an assignment to the inner binding of i */
    j = 4;                      /* an assignment to the outer binding of j */
  }
}

C calls bindings 'declarations'.

In Lisp (by which I will mean 'Common Lisp' here & below), bindings are created by a small number of primitive binding forms: functions bind their arguments, let establishes bindings and there are some other forms perhaps. Existing bindings are mutated by, ultimately, setq and some other operators perhaps: setf is a macro which expands to setq in simple cases.

C does not have dynamic bindings: if my g function called some function h then if h tried to refer to i it would either be an error or it would be referring to some global i.

But Lisp does have such bindings, although they are not used by default.

So if you take the default case, bindings work the same way as C (in fact, they don't, but the difference does not matter here):

(defun g ()
  (let ((i)                             ;a binding (initial value will be NIL)
        (j 2))                          ;a binding with an initial value
    (setf i 1)                          ;an assignment
    (let ((i))                          ;a binding
      (setf i 3)                        ;an assignment to the inner binding of i
      (setf j 4))))                     ;an assignment to the outer binding of j

In this case you can tell, just by looking (which is what 'lexical' means) which bindings are visible, and which assignments mutate which bindings.

Something like this code would be an error (technically: is undefined behaviour, but I will call it 'an error'):

(defun g ()
  (let ((i))
    (h)))

(defun h ()
  (setf i 3))                           ;this is an error

It's an error because (assuming there's no global binding of i), h can't see the binding established by g and so cannot mutate it. This is not an error:

(defun g ()
  (let ((i 2))
    (h i)
    i))

(defun h (i)                            ;binds i
  (setf i 3))                           ;mutates that binding

But calling g will return 2, not 3, because h is mutating the binding it creates, not the binding g created.

Dynamic bindings work very differently. The normal way to create them is to use defvar (or defparameter) which declares that a given name is 'globally special' which means that all bindings of that name are dynamic (which is also called 'special'). So consider this code:

;;; Declare *i* globally special and give it an initial value of 1
(defvar *i* 1)

(defun g ()
  (let ((*i* 2))                        ;dynamically bind *i* to 2
    (h)))

(defun h ()
  *i*)                                  ;refer to the dynamic value of *i*

Calling g will return 2. And in this case:

;;; Declare *i* globally special and give it an initial value of 1
(defvar *i* 1)

(defun g ()
  (let ((*i* 2))                        ;dynamically bind *i* to 2
    (h)
    *i*))

(defun h ()
  (setf *i* 4))                         ;mutate the current dynamic binding of *i*

Calling g will return 4, because h has mutated the dynamic binding of *i* established by g. What will this return?

;;; Declare *i* globally special and give it an initial value of 1
(defvar *i* 1)

(defun g ()
  (let ((*i* 2))                        ;dynamically bind *i* to 2
    (h))
  *i*)

(defun h ()
  (setf *i* 4))                         ;mutate the current dynamic binding of *i*
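
To check the answer to that question, here is the same shape of code as a self-contained sketch. The names *j*, inner and outer are made up for this illustration, so that they don't clash with the definitions above.

```lisp
;;; Hypothetical names (*j*, inner, outer), chosen for this sketch only.
(defvar *j* 1)

(defun inner ()
  (setf *j* 4))          ; mutates whichever binding of *j* is current

(defun outer ()
  (let ((*j* 2))         ; dynamically bind *j* to 2
    (inner))             ; inner changes this binding from 2 to 4
  *j*)                   ; the let binding is gone: back to the global binding

(outer)                  ; => 1
```

The call returns 1: inner only ever saw the binding made by the let, and that binding ceased to exist when the let exited, so the global binding was never touched.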

Dynamic bindings are very useful where you want some dynamic state to be established for a computation. For instance imagine some system which deals with transactions of some kind. You might write this:

(defvar *current-transaction*)

(defun outer-thing (...)
  (let ((*current-transaction* ...))
    (inner-thing ...)))

(defun inner-thing (...)
  ...
  refer to *current-transaction* ...)

Note that *current-transaction* is essentially a bit of 'ambient state': any code in the dynamic scope of a transaction can see it but you don't have to spend some fantastic amount of work to pass it down to all the code. And note also that you can't do this with globals: you might think that this will work:

(defun outer-thing (...)
  (setf *current-transaction* ...)
  (inner-thing)
  (setf *current-transaction* nil))

And it will, superficially ... until you get an error which leaves *current-transaction* assigned to some bogus thing. Well, you can deal with that in CL:

(defun outer-thing (...)
  (setf *current-transaction* ...)
  (unwind-protect
      (inner-thing)
    (setf *current-transaction* nil)))

The unwind-protect form will mean that *current-transaction* always gets assigned to nil on the way out, regardless of whether an error happened. And that seems to work even better ... until you start using multiple threads, at which point you die screaming, because now *current-transaction* is shared across all the threads and you're just doomed (see the notes below). If you want dynamic bindings, you need dynamic bindings: you can't, in fact, fake them up with assignments.
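
For comparison, here is a minimal runnable sketch of the dynamic-binding version, showing that let undoes the binding by itself when an error unwinds the stack. Everything concrete in it is an assumption made for illustration: the initial value of NIL meaning 'no transaction', the keyword :tx-1 standing in for a transaction object, and the function names run-in-transaction, report-transaction and broken-step.

```lisp
;;; Assumption for this sketch: NIL means "no transaction is active".
(defvar *current-transaction* nil)

(defun report-transaction ()
  ;; Sees whatever binding is current; nothing is passed in.
  *current-transaction*)

(defun broken-step ()
  ;; Hypothetical step that always fails.
  (error "something went wrong in ~s" *current-transaction*))

(defun run-in-transaction (transaction thunk)
  ;; The binding lasts exactly as long as the call to THUNK,
  ;; however that call exits.
  (let ((*current-transaction* transaction))
    (funcall thunk)))

(run-in-transaction :tx-1 #'report-transaction)          ; => :TX-1
(ignore-errors (run-in-transaction :tx-1 #'broken-step)) ; the error unwinds the let
(report-transaction)                                     ; => NIL
```

No unwind-protect is needed here: leaving the let, normally or not, is what ends the binding.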

One important thing is that, because CL does not textually distinguish between operations on dynamic bindings and those on lexical bindings, it's important that there should be a convention about names, so when you read code you can understand it. For global dynamic variables, this convention is to surround the name with * characters: *foo* not foo. It's important to use this convention if you do not want to fall into a pit of confusion.

I hope that is enough to understand what bindings are, how they differ from assignments, and what dynamic bindings are and why they're interesting.

Notes.

There are other lisps than Common Lisp of course. They have different rules (for instance, for a long time in elisp, all bindings were dynamic).
In CL and its relations, dynamic bindings are called 'special' bindings, and dynamic variables are therefore 'special variables'.
It is possible to have variables which are only local but are dynamically bound, although I have not talked about them.
Primitively, Common Lisp does not support variables which are both global and lexical: all the constructs which create global variables create global dynamic (or special) variables. However CL is powerful enough that it's pretty easy to simulate global lexicals if you want them.
There is some dispute about what an assignment to an undeclared variable (one for which there is no apparent binding) should do, which I alluded to above. Some people claim this is OK: they are heretics and should be shunned. Of course they regard me as a heretic and think I should be shunned...
There is a subtlety about things like (defvar *foo*): this declares that *foo* is a dynamic variable, but doesn't give it an initial value: it is globally dynamic, but globally unbound.
Common Lisp does not define any kind of threading interface so, technically, how special variables work in the presence of threads is undefined. I'm sure that in practice all implementations which have multiple threads deal with special bindings as I've described above, because anything else would be awful. Some (perhaps all) implementations allow you to specify that new threads get a new set of bindings of some global variables but that doesn't alter any of this.
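
Two of the notes above can be tried at the REPL. This is only a small sketch with made-up names (depth, show-depth, with-local-special, *unset*). The first part makes a single let binding dynamic with a special declaration, without declaring the name globally special; the second part shows a variable that is globally special but globally unbound.

```lisp
(defun show-depth ()
  (declare (special depth))   ; refer to the dynamic binding of DEPTH
  depth)

(defun with-local-special ()
  (let ((depth 10))
    (declare (special depth)) ; only this binding of DEPTH is dynamic
    (show-depth)))

(with-local-special)          ; => 10

(defvar *unset*)              ; globally special, but no value yet
(boundp '*unset*)             ; => NIL
(let ((*unset* 3))
  (boundp '*unset*))          ; => T, but only while the let is active
```

Without the special declaration in with-local-special, the let would create a lexical binding that show-depth could not see.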

There will be other things I have missed.