Pest

Pest is a PEG (Parsing Expression Grammar) parser generator for Rust. Developers define a grammar in pest's PEG syntax, and a derive macro generates the parser from it at compile time. Pest is particularly well-suited for creating parsers for Domain-Specific Languages (DSLs), configuration files, custom data formats, or any situation where structured text needs to be interpreted.

How Pest Works:
1. Grammar Definition (.pest file): You define your language's grammar in a `.pest` file using a declarative syntax. The grammar is a PEG, a different formalism from traditional Context-Free Grammars (CFGs): choice is ordered, so alternatives are tried left to right and the first one that matches wins. As a result, a given input never has more than one parse. Each rule in the `.pest` file corresponds to a `Rule` enum variant generated by `pest_derive`.
2. Derive Macro (`pest_derive`): You apply the `#[derive(Parser)]` macro to a struct in your Rust code, pointing it to your `.pest` file (e.g., `#[grammar = "path/to/my_grammar.pest"]`). This macro reads the grammar file during compilation and generates a Rust parser implementation for your grammar.
3. Runtime (`pest` crate): The `pest` crate provides the traits and types for interacting with the generated parser. You call `Parser::parse()` with a starting rule and an input string. It returns a `Result`: on success, a `Pairs` iterator over `Pair`s, and on failure, an error describing where parsing stopped. Each `Pair` records the rule that matched and the span of input it covers, and can contain nested pairs, so the result is a parse tree that is closer to a Concrete Syntax Tree (CST) than to an Abstract Syntax Tree (AST).
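
The PEG formalism mentioned in step 1 resolves alternatives by ordered choice. The sketch below illustrates that behavior in plain Rust, without pest. The helpers `literal` and `ordered_choice` are hypothetical names chosen for this illustration, and this is not the code pest generates.

```rust
/// Try to match `lit` at the start of `input`; return the remaining input on success.
fn literal<'a>(input: &'a str, lit: &str) -> Option<&'a str> {
    input.strip_prefix(lit)
}

/// PEG-style ordered choice: return the first alternative that matches,
/// together with the input left over after it.
fn ordered_choice<'a>(input: &'a str, alternatives: &[&'a str]) -> Option<(&'a str, &'a str)> {
    alternatives
        .iter()
        .find_map(|alt| literal(input, alt).map(|rest| (*alt, rest)))
}

fn main() {
    // "<" is listed first, so it wins and "<=" is never tried.
    println!("{:?}", ordered_choice("<= 3", &["<", "<="]));
    // Listing the longer alternative first gives the intended match.
    println!("{:?}", ordered_choice("<= 3", &["<=", "<"]));
}
```

The practical consequence for grammar authors is that the order of alternatives matters: when one alternative is a prefix of another, the longer one usually needs to come first.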

Key Features:
* PEG Syntax: Intuitive and powerful grammar definition using common operators for sequencing, choice, repetition, and optional elements.
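* Zero-Cost Abstractions: The parser is generated at compile time as ordinary Rust code, and the resulting pairs borrow from the input string instead of copying it. Pest aims for performance competitive with hand-written parsers, though actual speed depends on the grammar.
* Deterministic PEG Parsing: Because choice is ordered, the first matching alternative wins and each input has at most one parse tree. Pest does not enumerate alternative parses of an ambiguous construct; the order of alternatives in the grammar decides which reading is taken.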
* Strong Error Reporting: It provides detailed error messages, including line and column numbers, making debugging grammars and malformed inputs easier.
* Integration with Rust: Seamlessly integrates with Rust's module system and type safety.
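
As a rough illustration of what line and column reporting involves, the sketch below converts a byte offset in the input into a 1-based line and column using only the standard library. The `line_col` function is a hypothetical helper written for this example and is not part of pest's API; pest reports positions through its own error and position types.

```rust
/// Convert a byte offset into a 1-based (line, column) pair.
/// Columns are counted in characters, not bytes.
fn line_col(input: &str, offset: usize) -> (usize, usize) {
    let mut line = 1;
    let mut col = 1;
    for (i, ch) in input.char_indices() {
        if i >= offset {
            break;
        }
        if ch == '\n' {
            line += 1;
            col = 1;
        } else {
            col += 1;
        }
    }
    (line, col)
}

fn main() {
    let input = "1 + 2\n3 * )";
    // Byte offset of the unexpected ')'.
    let offset = input.find(')').unwrap();
    let (line, col) = line_col(input, offset);
    println!("unexpected ')' at line {line}, column {col}");
}
```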

Advantages:
* Productivity: Rapidly develop parsers without writing complex recursive descent or state machine code by hand.
* Maintainability: Grammars are declarative and often easier to read and maintain than equivalent imperative parsing code.
* Performance: The parser is generated at compile time, so there is no grammar interpretation at runtime.
* Safety: Leverages Rust's memory safety guarantees.
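
To make the productivity point concrete, here is a hand-written recursive-descent evaluator for a small arithmetic language (numbers, `+ - * /`, and parentheses), written with the standard library only. It is a sketch for comparison, not a reference implementation: the `Calc` type and `evaluate` function are hypothetical names, and its number scanning is slightly more permissive than a strict grammar rule would be.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Hand-written recursive-descent evaluator; one method per grammar rule.
struct Calc<'a> {
    chars: Peekable<Chars<'a>>,
}

impl<'a> Calc<'a> {
    fn new(input: &'a str) -> Self {
        Calc { chars: input.chars().peekable() }
    }

    fn skip_ws(&mut self) {
        while matches!(self.chars.peek(), Some(c) if c.is_whitespace()) {
            self.chars.next();
        }
    }

    /// Consume `expected` if it is the next non-whitespace character.
    fn eat(&mut self, expected: char) -> bool {
        self.skip_ws();
        if self.chars.peek() == Some(&expected) {
            self.chars.next();
            true
        } else {
            false
        }
    }

    // expr = factor (("+" | "-") factor)*
    fn expr(&mut self) -> Option<f64> {
        let mut value = self.factor()?;
        loop {
            if self.eat('+') {
                value += self.factor()?;
            } else if self.eat('-') {
                value -= self.factor()?;
            } else {
                return Some(value);
            }
        }
    }

    // factor = term (("*" | "/") term)*
    fn factor(&mut self) -> Option<f64> {
        let mut value = self.term()?;
        loop {
            if self.eat('*') {
                value *= self.term()?;
            } else if self.eat('/') {
                value /= self.term()?;
            } else {
                return Some(value);
            }
        }
    }

    // term = number | "(" expr ")"
    fn term(&mut self) -> Option<f64> {
        if self.eat('(') {
            let value = self.expr()?;
            return if self.eat(')') { Some(value) } else { None };
        }
        // Collect digits and dots, then let f64 parsing validate the text.
        let mut text = String::new();
        while let Some(&c) = self.chars.peek() {
            if c.is_ascii_digit() || c == '.' {
                text.push(c);
                self.chars.next();
            } else {
                break;
            }
        }
        text.parse().ok()
    }
}

/// Evaluate `input`, returning `None` on a syntax error or trailing input.
fn evaluate(input: &str) -> Option<f64> {
    let mut calc = Calc::new(input);
    let value = calc.expr()?;
    calc.skip_ws();
    if calc.chars.peek().is_none() { Some(value) } else { None }
}

fn main() {
    println!("{:?}", evaluate("2 * (3 + 4) - 5 / 1"));
    println!("{:?}", evaluate("2 * (3 + )"));
}
```

Even for a language this small, whitespace handling, precedence, and end-of-input checks all have to be coded by hand, and the sketch reports failures only as `None` with no position information. A grammar file states the same rules declaratively and leaves that bookkeeping to the generator.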

Considerations:
* Learning Curve: Understanding PEG concepts and Pest's specific syntax might require some initial effort if you're new to parser generators.
* Debugging Grammars: While error reporting is good, debugging complex grammar interactions can still be challenging.

Example Code

```rust
// Cargo.toml
// [package]
// name = "pest_example"
// version = "0.1.0"
// edition = "2021"
//
// [dependencies]
// pest = "2.7.9" # Use the latest stable version
// pest_derive = "2.7.9" # Must match the pest version

// src/arithmetic.pest
// This file defines the grammar for simple arithmetic expressions.
// WHITESPACE lets spaces, tabs, and newlines be skipped between tokens.
// number parses integers and decimal numbers (atomic, so no whitespace inside).
// add_op and mul_op are named rules so that operators show up as pairs in the
// parse result; a bare string literal such as "+" produces no pair of its own.
// term handles numbers or expressions in parentheses.
// factor handles multiplication and division.
// expr handles addition and subtraction (lowest precedence).
// calculation anchors expr between SOI and EOI so that leftover input is an error.
//
// WHITESPACE = _{ " " | "\t" | "\n" }
//
// number = @{ ASCII_DIGIT+ ~ ("." ~ ASCII_DIGIT+)? }
// add_op = { "+" | "-" }
// mul_op = { "*" | "/" }
// term = { number | "(" ~ expr ~ ")" }
// factor = { term ~ (mul_op ~ term)* }
// expr = { factor ~ (add_op ~ factor)* }
// calculation = { SOI ~ expr ~ EOI }

// src/main.rs
use pest::iterators::{Pair, Pairs};
use pest::Parser; // The trait that provides `parse()`
use pest_derive::Parser; // The derive macro that generates the parser

// 1. Define the parser struct and link it to the grammar file.
// Recent pest_derive versions resolve this path relative to the crate root
// first and fall back to `src/`.
#[derive(Parser)]
#[grammar = "src/arithmetic.pest"]
struct ArithmeticParser;

// Helper function to parse a number from a Pair.
// The grammar guarantees the text is digits with an optional fraction.
fn parse_number(pair: Pair<Rule>) -> f64 {
    pair.as_str().parse().unwrap()
}

// Recursive function that evaluates the parse tree.
// This demonstrates how to walk nested Pairs.
fn eval(pair: Pair<Rule>) -> f64 {
    match pair.as_rule() {
        // calculation contains the expr pair followed by EOI; evaluate the expr.
        Rule::calculation => eval(pair.into_inner().next().unwrap()),
        Rule::expr => eval_expr(pair.into_inner()),
        Rule::factor => eval_factor(pair.into_inner()),
        Rule::term => eval_term(pair),
        Rule::number => parse_number(pair),
        _ => unreachable!(), // Operators and EOI are never passed to eval directly
    }
}

fn eval_expr(mut pairs: Pairs<Rule>) -> f64 {
    let mut result = eval(pairs.next().unwrap()); // First factor

    // The remaining pairs alternate: add_op, factor, add_op, factor, ...
    while let Some(operator) = pairs.next() {
        let rhs = eval(pairs.next().unwrap()); // Next factor
        result = match operator.as_str() {
            "+" => result + rhs,
            "-" => result - rhs,
            _ => unreachable!(),
        };
    }
    result
}

fn eval_factor(mut pairs: Pairs<Rule>) -> f64 {
    let mut result = eval(pairs.next().unwrap()); // First term

    // The remaining pairs alternate: mul_op, term, mul_op, term, ...
    while let Some(operator) = pairs.next() {
        let rhs = eval(pairs.next().unwrap()); // Next term
        result = match operator.as_str() {
            "*" => result * rhs,
            "/" => result / rhs,
            _ => unreachable!(),
        };
    }
    result
}

fn eval_term(pair: Pair<Rule>) -> f64 {
    // A term is a number or a parenthesized expression. The parentheses are
    // literals and produce no pairs, so the single inner pair is what we evaluate.
    let inner = pair.into_inner().next().unwrap();
    eval(inner)
}

fn main() {
    let expression = "2 * (3 + 4) - 5 / 1";
    println!("Parsing and evaluating: {}", expression);

    // 2. Use the generated parser to parse an input string.
    // Rule::calculation is the starting rule; it requires the whole input to match.
    let parse_result = ArithmeticParser::parse(Rule::calculation, expression);

    match parse_result {
        Ok(mut pairs) => {
            // The first pair covers the entire calculation.
            let tree = pairs.next().unwrap();
            let result = eval(tree);
            println!("Result: {}", result); // Expected: 14 - 5 = 9
        }
        Err(e) => {
            eprintln!("Error parsing expression: {}", e);
        }
    }

    let invalid_expression = "2 * (3 + )"; // Missing number after '+'
    println!("\nTrying to parse invalid expression: {}", invalid_expression);
    match ArithmeticParser::parse(Rule::calculation, invalid_expression) {
        Ok(_) => println!("Parsed an invalid expression (this should not happen)!\n"),
        Err(e) => eprintln!("Error parsing invalid expression (expected): {}\n", e),
    }

    let another_expression = "10 + 2.5 * 4 - 20 / 2";
    println!("Parsing and evaluating: {}", another_expression);
    match ArithmeticParser::parse(Rule::calculation, another_expression) {
        Ok(mut pairs) => {
            let tree = pairs.next().unwrap();
            let result = eval(tree);
            println!("Result: {}", result); // Expected: 10 + 10 - 10 = 10
        }
        Err(e) => {
            eprintln!("Error parsing expression: {}", e);
        }
    }
}
```