Nom is a powerful and efficient parser combinator library written in Rust. It enables developers to build parsers for various data formats, including text, binary, network protocols, configuration files, and custom languages, in a declarative and composable manner.
Key Concepts and Features:
1. Parser Combinators: The core idea behind Nom is to break down complex parsing tasks into smaller, simpler parsers. These small parsers can then be combined, or 'composed', by higher-order functions called combinators to form larger, more intricate parsers. This approach makes parsers modular, readable, and maintainable.
2. Declarative Syntax: Nom's API often allows parsers to closely mirror the grammar or structure of the data they are designed to parse. This declarative style improves readability and reduces the cognitive load of understanding how data is processed.
3. Zero-Copy Parsing: Nom is designed for performance. It primarily operates on input slices (`&[u8]` for binary data or `&str` for text) and avoids unnecessary data copying or allocations where possible. Parsed data is often returned as slices of the original input, minimizing overhead.
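4. `IResult` Type: Parsers in Nom typically return `IResult<I, O, E>`, which is an alias for `Result<(I, O), nom::Err<E>>`. Here `I` is the input type (for example `&[u8]` or `&str`), `O` is the output type, and `E` is the error type, which defaults to `nom::error::Error<I>`. It contains: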
* `Ok((remaining_input, parsed_value))`: On success, it returns the unparsed portion of the input and the successfully parsed value.
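* `Err(error)`: On failure, it returns a `nom::Err` whose variant indicates the kind of failure: `Error` (recoverable, so a combinator such as `alt` may try another branch), `Failure` (unrecoverable), or `Incomplete` (more input is needed). The `Error` and `Failure` variants carry an error value that records where parsing stopped.
5. Streaming Support: Nom can handle incomplete input, making it suitable for parsing data streams where the entire input might not be available at once. Parsers in the `streaming` modules signal this by returning `Err::Incomplete`, while their counterparts in the `complete` modules treat the end of the input as final.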
6. Extensive Combinators: The library provides a rich set of built-in combinators for common parsing tasks, such as:
* Basic Parsers: `tag`, `char`, `digit1`, `alpha1`, `newline`.
* Sequence Combinators: `tuple`, `preceded`, `terminated`, `delimited`.
* Choice Combinators: `alt`.
* Repetition Combinators: `many0`, `many1`, `count`, `separated_list0`, `separated_list1`.
* Transformation Combinators: `map`, `map_res`, `recognize`, `value`.
* Error Handling: `cut`, `context`.
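To make the composition idea and the `(remaining_input, parsed_value)` return shape concrete, here is a minimal hand-rolled sketch that uses only the Rust standard library. It is not how Nom is implemented, and the names `ParseResult`, `literal`, `digits`, `pair`, `either`, and `signed_number` are hypothetical, chosen for this illustration. They only loosely play the roles that `tag`, `digit1`, the sequence combinators, and `alt` play in Nom.

```rust
// A hand-rolled illustration of the combinator idea, standard library only.
// None of these names are Nom's API; they are hypothetical stand-ins.

// Success carries (remaining input, parsed value); failure carries the
// input at the point where parsing stopped.
type ParseResult<'a, T> = Result<(&'a str, T), &'a str>;

// Primitive parser: match a fixed prefix and return it as a slice of the input.
fn literal<'a>(expected: &'static str) -> impl Fn(&'a str) -> ParseResult<'a, &'a str> {
    move |input| match input.strip_prefix(expected) {
        Some(rest) => Ok((rest, &input[..expected.len()])),
        None => Err(input),
    }
}

// Primitive parser: one or more ASCII digits, returned without copying.
fn digits(input: &str) -> ParseResult<'_, &str> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        Err(input)
    } else {
        Ok((&input[end..], &input[..end]))
    }
}

// Combinator: run `first`, then run `second` on whatever `first` left over.
fn pair<'a, A, B>(
    first: impl Fn(&'a str) -> ParseResult<'a, A>,
    second: impl Fn(&'a str) -> ParseResult<'a, B>,
) -> impl Fn(&'a str) -> ParseResult<'a, (A, B)> {
    move |input| {
        let (rest, a) = first(input)?;
        let (rest, b) = second(rest)?;
        Ok((rest, (a, b)))
    }
}

// Combinator: try `first`; if it fails, try `second` on the same input.
fn either<'a, T>(
    first: impl Fn(&'a str) -> ParseResult<'a, T>,
    second: impl Fn(&'a str) -> ParseResult<'a, T>,
) -> impl Fn(&'a str) -> ParseResult<'a, T> {
    move |input| first(input).or_else(|_| second(input))
}

// A larger parser built from the pieces: a sign followed by digits.
fn signed_number(input: &str) -> ParseResult<'_, (&str, &str)> {
    pair(either(literal("+"), literal("-")), digits)(input)
}
```

Calling `signed_number("-42 rest")` yields `Ok((" rest", ("-", "42")))`: both returned values are slices of the original string, and the unparsed tail is handed back so that another parser can continue from it.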
When to use Nom:
* Parsing configuration files (INI, TOML, custom formats).
* Implementing custom domain-specific languages (DSLs).
* Processing structured log files.
* Deserializing binary data formats (e.g., network packets, file headers).
* Building compilers or interpreters for simple languages.
Nom's focus on performance, composability, and strong typing (leveraging Rust's type system) makes it an excellent choice for robust and efficient parsing needs.
Example Code
```rust
// Written against the nom 7 API, where a combinator's result is called directly as `parser(input)`.
use nom::{
    character::complete::{alphanumeric1, char, newline, not_line_ending},
    combinator::{all_consuming, opt},
    multi::many0,
    sequence::{separated_pair, terminated},
    IResult,
};

// A parser for a single key=value pair.
// It returns a tuple of (&str, &str) for (key, value).
fn parse_key_value_pair(input: &str) -> IResult<&str, (&str, &str)> {
    separated_pair(
        alphanumeric1,   // Key: one or more ASCII letters or digits
        char('='),       // Separator: a single '=' character
        not_line_ending, // Value: everything up to the end of the line (or EOF)
    )(input)
}

// A parser for a complete line: a key=value pair optionally followed by a newline.
// The `opt(newline)` allows the last line in the input to omit the trailing newline.
fn parse_config_line(input: &str) -> IResult<&str, (&str, &str)> {
    terminated(
        parse_key_value_pair, // First parse the key=value pair
        opt(newline),         // Then optionally consume a newline character
    )(input)
}

// The main parser for the entire configuration: zero or more config lines.
// On its own, `many0` stops at the first line that does not match and still returns `Ok`,
// so it is wrapped in `all_consuming`, which turns any leftover input into an error.
fn parse_config(input: &str) -> IResult<&str, Vec<(&str, &str)>> {
    all_consuming(many0(parse_config_line))(input)
}

fn main() {
    let config_data = "name=John Doe\nage=30\ncity=New York";
    println!("--- Parsing valid configuration data ---");
    match parse_config(config_data) {
        Ok((_, parsed_data)) => {
            println!("Successfully parsed configuration:");
            for (key, value) in parsed_data {
                println!("  Key: '{}', Value: '{}'", key, value);
            }
        }
        Err(e) => {
            eprintln!("Failed to parse configuration: {:?}", e);
        }
    }

    let config_data_no_final_newline = "setting1=valueA\nsetting2=valueB";
    println!("\n--- Parsing valid configuration without a final newline ---");
    match parse_config(config_data_no_final_newline) {
        Ok((_, parsed_data)) => {
            println!("Successfully parsed configuration:");
            for (key, value) in parsed_data {
                println!("  Key: '{}', Value: '{}'", key, value);
            }
        }
        Err(e) => {
            eprintln!("Failed to parse configuration: {:?}", e);
        }
    }

    let bad_data = "name=Alice\nerror_field\nage=25";
    println!("\n--- Attempting to parse malformed data ---");
    match parse_config(bad_data) {
        Ok((_, _)) => {
            println!("Unexpectedly parsed bad data.");
        }
        Err(e) => {
            // The line 'error_field' does not match 'key=value', so `many0` stops there
            // and `all_consuming` reports the unparsed remainder as an error.
            eprintln!("Correctly failed to parse malformed data: {:?}", e);
        }
    }
}
```







