Lisp Interpreter

A Lisp interpreter is a program that reads and executes code written in a Lisp-family language. Lisp, standing for 'LISt Processor,' is characterized by its simple, uniform syntax based on S-expressions (symbolic expressions), which are either atoms (symbols or numbers) or lists of S-expressions. Building a Lisp interpreter is a classic exercise in computer science and programming language design, often used to illustrate fundamental concepts like parsing, abstract syntax trees, and evaluation.

Core Components of a Lisp Interpreter:

1. Reader (Lexer & Parser):
* Lexer (Tokenization): The first step is to break the input string (Lisp code) into a stream of meaningful tokens. For example, `(+ 1 2)` would be tokenized into `(`, `+`, `1`, `2`, `)`. This involves identifying operators, numbers, symbols, and parentheses.
* Parser: The parser takes the token stream and transforms it into a structured representation, typically an Abstract Syntax Tree (AST) or directly into S-expressions. In Lisp, S-expressions inherently serve as a direct representation of the program's structure. For `(+ 1 2)`, the parser would build a list structure where the first element is the symbol `+`, followed by the numbers `1` and `2`.

2. Evaluator (Eval-Apply Cycle):
* This is the heart of the interpreter, responsible for executing the parsed S-expressions. The evaluator recursively processes expressions.
* Atoms: If the expression is an atom (like a number or a symbol), it's either returned directly (if a number) or its value is looked up in the current environment (if a symbol/variable).
* Lists: If the expression is a list, the evaluator typically treats the first element of the list as a function or special form (like `if`, `define`, `lambda`). The remaining elements are arguments to that function or form.
* Function Application: For regular functions, the arguments are first evaluated, and then the function is 'applied' to these evaluated arguments. This recursive process is known as the "eval-apply cycle."
* Environment: The evaluator maintains an environment (a mapping of symbols to their values) that tracks variables and their scope. When a function is called, a new environment frame is often created, extending the current one.

3. Printer:
* After evaluation, the result (which is also an S-expression) is converted back into a human-readable string format for output.

Why build one?

* Understanding Language Design: It provides deep insight into how programming languages work, from syntax to semantics.
* Simplicity of Lisp Syntax: Lisp's simple, uniform, and parenthesized syntax (S-expressions) makes it an excellent candidate for learning parser and evaluator design, as the parsing step is often simpler than for languages with more complex grammars.
* Recursion and Data Structures: Lisp interpreters heavily rely on recursion and list processing, reinforcing these fundamental programming concepts.

In essence, a Lisp interpreter continuously reads Lisp code, parses it into an internal data structure, evaluates that structure, and prints the result, often in a Read-Eval-Print Loop (REPL).

Example Code

use std::collections::HashMap;
use std::io::{self, Write};

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Atom(String),
    Number(f64),
    List(Vec<Expr>),
}

// --- Lexer (Tokenization) ---
// Converts a raw Lisp-like string into a sequence of tokens.
// Example: "(+ 1 2)" -> ["(", "+", "1", "2", ")"]
fn tokenize(input: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current_token = String::new();

    for c in input.chars() {
        match c {
            '(' | ')' => {
                if !current_token.is_empty() {
                    tokens.push(current_token.clone());
                    current_token.clear();
                }
                tokens.push(c.to_string());
            }
            ' ' | '\t' | '\n' => {
                if !current_token.is_empty() {
                    tokens.push(current_token.clone());
                    current_token.clear();
                }
            }
            _ => {
                current_token.push(c);
            }
        }
    }

    if !current_token.is_empty() {
        tokens.push(current_token);
    }

    tokens
}

// --- Parser (S-expression construction) ---
// Recursively builds an S-expression (Expr) from a token stream.
fn read_from_tokens(tokens: &mut Vec<String>) -> Result<Expr, String> {
    if tokens.is_empty() {
        return Err("Unexpected end of input".to_string());
    }

    let token = tokens.remove(0); // Pop the first token

    match token.as_str() {
        "(" => {
            let mut list = Vec::new();
            while tokens.get(0).map_or(false, |t| t != ")") {
                list.push(read_from_tokens(tokens)?);
            }
            if tokens.is_empty() || tokens.remove(0) != ")" { // Pop ')'
                return Err("Unclosed parenthesis".to_string());
            }
            Ok(Expr::List(list))
        }
        ")" => Err("Unexpected ')'".to_string()),
        _ => {
            // Try to parse as a number, otherwise it's an atom
            if let Ok(num) = token.parse::<f64>() {
                Ok(Expr::Number(num))
            } else {
                Ok(Expr::Atom(token))
            }
        }
    }
}

// Combines tokenization and S-expression parsing.
fn parse(input: &str) -> Result<Expr, String> {
    let mut tokens = tokenize(input);
    if tokens.is_empty() {
        return Err("No input tokens.".to_string());
    }
    read_from_tokens(&mut tokens)
}

// --- Evaluator ---
type Env = HashMap<String, Expr>; // Simple environment for variables

// Evaluates an S-expression within a given environment.
fn eval(expr: &Expr, env: &mut Env) -> Result<Expr, String> {
    match expr {
        Expr::Number(n) => Ok(Expr::Number(*n)), // Numbers evaluate to themselves
        Expr::Atom(s) => {
            // If it's an atom, try to look it up in the environment
            env.get(s).cloned().ok_or_else(|| format!("Unbound symbol: {}", s))
        }
        Expr::List(list) => {
            if list.is_empty() {
                return Err("Empty list cannot be evaluated".to_string());
            }

            let head = &list[0];
            let tail = &list[1..];

            match head {
                Expr::Atom(op) => {
                    match op.as_str() {
                        "+" => {
                            let mut sum = 0.0;
                            for arg_expr in tail {
                                match eval(arg_expr, env)? {
                                    Expr::Number(n) => sum += n,
                                    evaluated_arg => return Err(format!("Expected number for +, got {:?}", evaluated_arg)),
                                }
                            }
                            Ok(Expr::Number(sum))
                        }
                        "-" => {
                            if tail.is_empty() {
                                return Err("Not enough arguments for -".to_string());
                            }
                            let first_num = match eval(&tail[0], env)? {
                                Expr::Number(n) => n,
                                evaluated_arg => return Err(format!("Expected number for -, got {:?}", evaluated_arg)),
                            };
                            if tail.len() == 1 {
                                Ok(Expr::Number(-first_num))
                            } else {
                                let mut result = first_num;
                                for arg_expr in &tail[1..] {
                                    match eval(arg_expr, env)? {
                                        Expr::Number(n) => result -= n,
                                        evaluated_arg => return Err(format!("Expected number for -, got {:?}", evaluated_arg)),
                                    }
                                }
                                Ok(Expr::Number(result))
                            }
                        }
                        "*" => {
                            let mut product = 1.0;
                            for arg_expr in tail {
                                match eval(arg_expr, env)? {
                                    Expr::Number(n) => product *= n,
                                    evaluated_arg => return Err(format!("Expected number for *, got {:?}", evaluated_arg)),
                                }
                            }
                            Ok(Expr::Number(product))
                        }
                        "/" => {
                            if tail.len() < 2 {
                                return Err("Not enough arguments for /".to_string());
                            }
                            let first_num = match eval(&tail[0], env)? {
                                Expr::Number(n) => n,
                                evaluated_arg => return Err(format!("Expected number for /, got {:?}", evaluated_arg)),
                            };
                            let mut result = first_num;
                            for arg_expr in &tail[1..] {
                                match eval(arg_expr, env)? {
                                    Expr::Number(n) => {
                                        if n == 0.0 {
                                            return Err("Division by zero".to_string());
                                        }
                                        result /= n;
                                    },
                                    evaluated_arg => return Err(format!("Expected number for /, got {:?}", evaluated_arg)),
                                }
                            }
                            Ok(Expr::Number(result))
                        },
                        "define" => {
                            if tail.len() != 2 {
                                return Err("define expects 2 arguments: (define symbol value)".to_string());
                            }
                            let symbol_expr = &tail[0];
                            let value_expr = &tail[1];

                            match symbol_expr {
                                Expr::Atom(s) => {
                                    let evaluated_value = eval(value_expr, env)?;
                                    env.insert(s.clone(), evaluated_value.clone());
                                    Ok(evaluated_value) // Return the defined value
                                },
                                _ => Err("define expects a symbol for the first argument".to_string())
                            }
                        },
                        _ => Err(format!("Unknown operator or special form: {}", op)),
                    }
                },
                _ => Err("First element of list must be an operator or function call".to_string()),
            }
        }
    }
}

// --- Main REPL Loop ---
fn main() {
    let mut env = HashMap::new();
    println!("Welcome to the Rust Lisp Interpreter!");
    println!("Enter Lisp expressions. Type 'exit' to quit.");

    loop {
        print!("> ");
        io::stdout().flush().unwrap();

        let mut input = String::new();
        io::stdin().read_line(&mut input).unwrap();
        let input = input.trim();

        if input == "exit" {
            break;
        }
        if input.is_empty() {
            continue;
        }

        match parse(input) {
            Ok(expr) => match eval(&expr, &mut env) {
                Ok(result) => println!("{:?}", result),
                Err(e) => eprintln!("Evaluation Error: {}", e),
            },
            Err(e) => eprintln!("Parsing Error: {}", e),
        }
    }
}

Example Code

Related Topics