lalrpop is a parser generator for Rust. Given a grammar for a domain-specific language (DSL) or a programming language, it generates a type-safe LR parser (LALR(1) or LR(1)) as Rust code. It is useful for compilers, interpreters, and other applications that need robust text parsing.
How it works:
lalrpop takes a grammar definition, typically written in a `.lalrpop` file, and generates Rust source code for a parser that recognizes and processes input according to the grammar rules. LALR(1) is a widely used bottom-up parsing technique that handles a broad class of context-free grammars deterministically, using a single token of lookahead. Despite its name, lalrpop's own documentation states that it uses LR(1) by default, with LALR(1) available as an option.
Key Features:
* Rust Native: Generates Rust source code at build time through Cargo, so the parser is compiled as part of your crate.
* Type Safety: Semantic actions (the Rust code executed when a grammar rule is matched) are strongly typed, catching many errors at compile time.
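* Performance: Generates deterministic LR parsers that process the input in a single left-to-right pass without backtracking, suitable for performance-critical applications.
* Error Reporting: Generated parsers return a `Result`; on failure the error is a `lalrpop_util::ParseError` describing what went wrong (for example, an unrecognized token or an unexpected end of input).
* Custom Tokenizers: Allows integration with custom lexers/tokenizers, and also provides a built-in lexer derived from the string and regex literals used in the grammar.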
* Semantic Values: Grammar rules can return Rust values, making it easy to build Abstract Syntax Trees (ASTs) or directly compute results.
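To make the "Semantic Values" point concrete, here is a minimal sketch of AST types that grammar actions could return instead of a computed `i32`. The names `Ast`, `Op`, `eval`, and `sample_tree` are hypothetical and chosen for illustration; the tree is built by hand here, standing in for what a parser's actions would construct.

```rust
// Hypothetical AST types; these names are illustrative, not part of lalrpop.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Add,
    Sub,
    Mul,
    Div,
}

#[derive(Debug, PartialEq)]
enum Ast {
    Num(i32),
    Binary(Box<Ast>, Op, Box<Ast>),
}

// Evaluate the tree after parsing, rather than computing inside the actions.
fn eval(ast: &Ast) -> i32 {
    match ast {
        Ast::Num(n) => *n,
        Ast::Binary(left, op, right) => {
            let (l, r) = (eval(left), eval(right));
            match op {
                Op::Add => l + r,
                Op::Sub => l - r,
                Op::Mul => l * r,
                Op::Div => l / r, // panics on a zero divisor
            }
        }
    }
}

// The tree a parser would be expected to build for "1 + 2 * 3":
// multiplication binds tighter, so it forms the right-hand subtree.
fn sample_tree() -> Ast {
    Ast::Binary(
        Box::new(Ast::Num(1)),
        Op::Add,
        Box::new(Ast::Binary(
            Box::new(Ast::Num(2)),
            Op::Mul,
            Box::new(Ast::Num(3)),
        )),
    )
}

fn main() {
    println!("{}", eval(&sample_tree())); // prints 7
}
```

In a grammar, a nonterminal would then be declared with a tree type (for example `Box<Ast>`) and each action would construct a node instead of performing the arithmetic. Keeping evaluation separate from parsing makes it possible to reuse the same tree for pretty-printing or analysis.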
Usage Flow:
1. Define Grammar: Create a `.lalrpop` file (e.g., `src/my_grammar.lalrpop`) containing your grammar rules and associated Rust semantic actions.
2. `build.rs`: Set up a `build.rs` script in your project root. This script calls `lalrpop::process_root()`, which finds the `.lalrpop` files under `src/` and generates the corresponding Rust code during the build.
3. `Cargo.toml`: Add `lalrpop` as a `build-dependency` and `lalrpop-util` as a regular `dependency` to your `Cargo.toml`.
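4. Use Parser: In your application's Rust code (e.g., `src/main.rs`), pull in the generated parser module with the `lalrpop_util::lalrpop_mod!` macro. Each `pub` nonterminal in the grammar gets its own parser struct (e.g., `ExprParser` for `Expr`), which you use via `ExprParser::new().parse(input)`.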
Example Code
```rust
// Project Structure:
// lalrpop_example/
// ├── Cargo.toml
// ├── build.rs
// └── src/
//     ├── main.rs
//     └── expr.lalrpop

// --- Cargo.toml ---
// Add the following to your Cargo.toml file:
// [package]
// name = "lalrpop_example"
// version = "0.1.0"
// edition = "2021"
//
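// [dependencies]
// lalrpop-util = "0.19"
// # Assumption: if the build cannot find `lalrpop_util::lexer`, the built-in
// # lexer's feature may need enabling explicitly (check the docs for your version):
// # lalrpop-util = { version = "0.19", features = ["lexer"] }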
//
// [build-dependencies]
// lalrpop = "0.19"

// --- build.rs ---
// Create a build.rs file in your project root:
fn main() {
    lalrpop::process_root().unwrap();
}

// --- src/expr.lalrpop ---
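// Create src/expr.lalrpop with the following grammar
// (this part is lalrpop grammar syntax, not plain Rust):
use std::str::FromStr;

grammar;

// Lowest precedence: addition and subtraction (left-associative).
pub Expr: i32 = {
    <left:Expr> "+" <right:Term> => left + right,
    <left:Expr> "-" <right:Term> => left - right,
    <Term>,
};

// Higher precedence: multiplication and division (left-associative).
Term: i32 = {
    <left:Term> "*" <right:Factor> => left * right,
    <left:Term> "/" <right:Factor> => left / right,
    <Factor>,
};

// Highest precedence: parenthesized expressions and integer literals.
Factor: i32 = {
    "(" <Expr> ")",
    <int:r"[0-9]+"> => i32::from_str(int).unwrap(),
};

// --- src/main.rs ---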
// Create src/main.rs to use the generated parser:
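use lalrpop_util::lalrpop_mod;

// Pulls in the parser that lalrpop generated from src/expr.lalrpop.
// A plain `mod expr;` does not work here because process_root() writes
// the generated code to Cargo's OUT_DIR rather than into src/.
lalrpop_mod!(pub expr);
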
fn main() {
    // Example 1: Successful parsing
    let input1 = "1 + 2 * (3 - 1) / 2";
    match expr::ExprParser::new().parse(input1) {
        Ok(result) => println!("\"{}\" parsed to {}", input1, result), // Expected: "1 + 2 * (3 - 1) / 2" parsed to 3
        Err(e) => println!("Error parsing \"{}\": {:?}", input1, e),
    }

    // Example 2: Another successful parsing
    let input2 = "10 + 5 * 2 - 3";
    match expr::ExprParser::new().parse(input2) {
        Ok(result) => println!("\"{}\" parsed to {}", input2, result), // Expected: "10 + 5 * 2 - 3" parsed to 17
        Err(e) => println!("Error parsing \"{}\": {:?}", input2, e),
    }

    // Example 3: Error case (incomplete expression)
    let input_err = "1 + ";
    match expr::ExprParser::new().parse(input_err) {
        Ok(result) => println!("\"{}\" parsed to {}", input_err, result),
        Err(e) => println!("Error parsing \"{}\": {:?}", input_err, e),
    }

    // Example 4: Error case (unmatched parenthesis)
    let input_err2 = "(1 + 2";
    match expr::ExprParser::new().parse(input_err2) {
        Ok(result) => println!("\"{}\" parsed to {}", input_err2, result),
        Err(e) => println!("Error parsing \"{}\": {:?}", input_err2, e),
    }
}
```
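The semantic actions in the example grammar can panic at parse time: `i32::from_str(int).unwrap()` fails on a literal that does not fit in an `i32`, and `left / right` fails on a zero divisor. The following is a minimal sketch of non-panicking helpers that such actions could call instead. The names `parse_int` and `checked_div` and the `String` error type are assumptions made for illustration; wiring the resulting `Result` into the grammar (lalrpop's documentation describes fallible actions for this purpose) is not shown here.

```rust
use std::str::FromStr;

// Hypothetical helper: convert a matched integer literal without panicking
// when the value does not fit in an i32.
fn parse_int(text: &str) -> Result<i32, String> {
    i32::from_str(text).map_err(|e| format!("invalid integer {:?}: {}", text, e))
}

// Hypothetical helper: divide without panicking on a zero divisor
// (checked_div also returns None on overflow, e.g. i32::MIN / -1).
fn checked_div(left: i32, right: i32) -> Result<i32, String> {
    left.checked_div(right)
        .ok_or_else(|| format!("cannot divide {} by {}", left, right))
}

fn main() {
    println!("{:?}", parse_int("42"));
    println!("{:?}", parse_int("99999999999"));
    println!("{:?}", checked_div(6, 3));
    println!("{:?}", checked_div(1, 0));
}
```

Returning a `Result` from the helpers keeps the decision about how to report the failure with the caller, rather than aborting the whole program from inside the parser.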