A Rust library for parsing and evaluating filters against JSON meta data.
use serde_json::json;
use diskann_label_filter::{parse_query_filter, eval_query_expr};
// Create a JSON label
let label = json!({
"a": 1,
"b": 2,
"specs": { "cpu": "i7" },
"tags": ["red", "blue", "green"]
});
// Create a filter that matches labels with a=1 AND b>1 AND specs.cpu="i7" AND tags contains "blue"
let filter = json!({
"$and": [
{"a": {"$eq": 1}},
{"b": {"$gt": 1}},
{"specs.cpu": {"$eq": "i7"}},
{"tags": {"$in": ["blue"]}}
]
});
// Parse the filter into an AST
let ast = match parse_query_filter(&filter) {
Ok(ast) => ast,
Err(e) => {
eprintln!("Failed to parse filter: {}", e);
return;
}
};
// Evaluate the filter against the label
let matches = eval_query_expr(&ast, &label);
assert!(matches);Parse AST and output it as simple query expression
cargo run --example print_query
Process and evaluate JSON line formatted files with:
cargo run --example jsonl_reader_example
Convert old txt based format into json based file
converter <base_input_file> <query_input_file> <base_output_file> <query_output_file>
cargo run --example converter ..\tests\data\disk_index_search\data.256.label ..\tests\data\disk_index_search\query.128.label ..\tests\data\disk_index_search\data.256.label.jsonl ..\tests\data\disk_index_search\query.128.label.jsonl
The project includes a comprehensive benchmarking suite that can be run with:
cargo benchBenchmarks are organized in modules under the benches/benchmarks/ directory:
parser_bench.rs: Evaluates the performance of parsingevaluator_bench.rs: Evaluates the query evaluation performance
The label-filter library is built around three core components:
- Abstract Syntax Tree (AST): A hierarchical representation of query filters
- Parser: Converts JSON query filters to the AST representation
- Evaluator: Evaluates the AST against JSON labels
The AST is defined in ast.rs and consists of:
pub enum ASTExpr {
And(Vec<ASTExpr>), // Logical AND of sub-expressions
Or(Vec<ASTExpr>), // Logical OR of sub-expressions
Not(Box<ASTExpr>), // Logical NOT of a sub-expression
Compare { field: String, op: CompareOp }, // Field comparison
}The CompareOp enum uses type-safe representations for different comparison operators:
pub enum CompareOp {
Eq(Value), // Equal to any JSON value
Ne(Value), // Not equal to any JSON value
Lt(f64), // Less than (numeric only)
Lte(f64), // Less than or equal (numeric only)
Gt(f64), // Greater than (numeric only)
Gte(f64), // Greater than or equal (numeric only)
In(Vec<Value>), // Value is in array
Nin(Vec<Value>), // Value is not in array
}The type-safe design ensures that each operator only accepts appropriate value types, enforcing correctness at compile time.
The parser (parser.rs) converts JSON filter specifications into the AST. Key features:
- Support for logical operators (
$and,$or,$not) - Support for comparison operators (
$eq,$ne,$lt,$lte,$gt,$gte,$in,$nin) - Automatic handling of implicit
$andfor multiple field conditions - Support for dot notation to access nested fields (
user.profile.age) - Enforced nesting depth limit
- Type checking for operators (e.g., numeric operators require numeric values)
The evaluator (evaluator.rs) applies the AST against JSON labels to determine if they match:
- Recursive traversal of the AST
- Type-aware comparison operations
- Support for array field values with
$inand$ninoperators
The library implements the Visitor pattern to enable extensible operations on the AST:
ASTVisitortrait defines the interface for visitorsPrintVisitorimplementation converts AST to human-readable format- Display implementation for easy debugging and logging