I read chapter 4 of Practical Common Lisp today. In it, Seibel went over the syntax and semantics of Common Lisp.
The first thing I should note is that I was amused that he began the chapter with a section titled "What's with All the Parentheses?" Yesterday I was thinking about that. Apparently a lot of people strongly dislike the sheer number of parentheses used in CL and for many other people, they find it to be quite powerful and beautiful. I haven't been using the language long enough to have a good opinion on it, but I'm liking it so far. They just feel right. Apparently John McCarthy (the inventor of Lisp) was considering making Lisp more Algol-like (Algol was a programming language that influenced languages like C) but it never actually ended up happening because a whole bunch of people ended up actually really enjoying Lisp's "s-expressions."
I then learned the high-level overview of how the Lisp language processor is structured. Usually, the language processor of a programming language breaks a source code file into tokens which are then fed into a parser to build an abstract syntax tree (AST). This tree is then interpreted or compiled (language-dependent of course). Lisp is different, however. Instead of one language processor, there are two. The first directly translates text into Lisp objects and the second implements the semantics of the language using those objects. The former is called the reader and the latter is called the evaluator. They operate on different levels of syntax. The first makes sure the text being fed in consists of legal s-expressions and the second makes sure that those s-expressions are legal Lisp forms.
S-expressions consist of lists and "atoms." Lists are anything enclosed in parentheses, each of the elements being separated by whitespace. Atoms are everything else. Lisp forms are all s-expressions that aren't a list where the first element is a literal.
So it seems that yesterday when I said that I was wrong about "everything in Lisp being a list," I was partially wrong. All non-atomic forms are in fact lists. It's just that not all lists are legal Lisp forms. Many things in Lisp syntax do consist of lists. They're pretty fundamental.
There are two types of atoms in Lisp forms: symbols and everything else. A symbol is basically just a variable label. When evaluated, a symbol simply returns the value its name refers to. All other atoms are self-evaluating. So when they're evaluated, they simply return themselves.
This has some interesting implications: things like T
and NIL
are also symbols. They're built-in constant symbols referring to "true" and "false." These are self-evaluating symbols. Other self-evaluating symbols include keyword symbols. These are what we saw yesterday in property lists: symbols prefaced by :
.
So there's list forms and then there are self-evaluating atomic and symbollic forms. List forms are broken into three categories: function call forms, special forms, and macro forms. I'm only gonna cover the first two today because I only got up to the section on macros.
If a list begins with a symbol that refers to a function, the evaluator simply evaluates the remaining forms in the list and then passes them into the function.
There are also special operators. These are operators that are so basic that they can't be functions (ex. if
statements). Special rules for these operators are encoded into the evaluator so that when they're encountered as the first element of a list form, these rules are ran.
This stuff is so interesting. This is the first time I'm really peering behind the curtains at how a language is designed from its most basic building blocks. Amazing. I want to know more.