# Query Language Unification Plan

**Status: DONE** (2026-04-11)

## Problem (was)

Two query parsers that didn't agree on syntax:

1. **PEG parser** (`hippocampus/query/parser.rs`) — boolean logic, general
   comparisons, operator precedence, parentheses. Used by CLI and compact
   format path in `query()` tool.

2. **Pipeline parser** (`hippocampus/query/engine.rs`) — domain-specific
   filters (type, age, provenance), graph algorithms (spread, spectral).
   Used by full format path in `query()` tool.

`journal_tail` generated pipeline syntax but was routed through the PEG
parser on the compact path, which did not understand it. Result: parse errors.

## Approach

Keep the PEG parser, which has the harder-to-build structural foundation,
and extend it with the pipeline parser's domain features.

## Expression extensions (add to `expr` rule in parser.rs)

- `field:value` shorthand for `field = 'value'` (colon-separated equality)
- `*` already works as `Expr::All`
- `key ~ 'glob'` already works via match operator
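
The shorthand amounts to a desugaring step: `field:value` should yield the same AST node as `field = 'value'`. The sketch below is hand-written Rust, not the PEG grammar from parser.rs. The `Expr` variants shown and the `parse_shorthand` helper are hypothetical and only illustrate the mapping.

```rust
/// Hypothetical, simplified expression type. The real `Expr` in parser.rs
/// is assumed to be richer (boolean logic, general comparisons, and so on).
#[derive(Debug, PartialEq)]
enum Expr {
    /// `*` matches every node.
    All,
    /// `field = 'value'`, also produced by the `field:value` shorthand.
    Eq { field: String, value: String },
}

/// Desugar a single token: `*` becomes `Expr::All`, and `field:value`
/// becomes the equality node that `field = 'value'` would produce.
/// Returns `None` for anything that is not one of those two forms.
fn parse_shorthand(token: &str) -> Option<Expr> {
    let token = token.trim();
    if token == "*" {
        return Some(Expr::All);
    }
    // Split on the first colon only, so values may contain further colons.
    let (field, value) = token.split_once(':')?;
    if field.is_empty() || value.is_empty() {
        return None;
    }
    // Assumption: field names are identifier-like (letters, digits, `_`, `-`).
    let is_ident = field
        .chars()
        .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-');
    if !is_ident {
        return None;
    }
    Some(Expr::Eq {
        field: field.to_string(),
        value: value.to_string(),
    })
}
```

With this sketch, `parse_shorthand("type:episodic")` yields `Expr::Eq` with field `type` and value `episodic`, while a token without a colon is rejected so that other grammar rules can try it.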

## New stages (add to `stage` rule in parser.rs)

Domain filter stages from engine.rs:
- `type:X` — filter by node type (episodic, daily, weekly, monthly, semantic)
- `age:<7d` — duration comparison on timestamp
- `key:GLOB` — glob match on key
- `provenance:X` — provenance filter
- `weight:>N` — weight comparison (may already work via general comparison)
- `content-len:>N` — content size filter
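
A stage such as `age:<7d` combines a comparison operator with a duration. A minimal sketch of how that could be decomposed is shown below. The helper names, the `Cmp` enum, and the set of unit suffixes (`s`, `m`, `h`, `d`, `w`) are assumptions for illustration; the source only shows the `d` suffix.

```rust
/// Hypothetical comparison operator for domain filter stages.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Cmp {
    Lt,
    Le,
    Gt,
    Ge,
    Eq,
}

/// Split a leading comparison operator off a value, e.g. "<7d" -> (Lt, "7d").
/// Two-character operators are checked first so "<=" is not read as "<".
/// A value with no operator is treated as equality (an assumption).
fn split_cmp(s: &str) -> (Cmp, &str) {
    if let Some(rest) = s.strip_prefix("<=") {
        (Cmp::Le, rest)
    } else if let Some(rest) = s.strip_prefix(">=") {
        (Cmp::Ge, rest)
    } else if let Some(rest) = s.strip_prefix('<') {
        (Cmp::Lt, rest)
    } else if let Some(rest) = s.strip_prefix('>') {
        (Cmp::Gt, rest)
    } else if let Some(rest) = s.strip_prefix('=') {
        (Cmp::Eq, rest)
    } else {
        (Cmp::Eq, s)
    }
}

/// Parse a duration like "7d" into seconds. The unit set is assumed.
fn parse_duration_secs(s: &str) -> Option<u64> {
    let unit = s.chars().last()?;
    let mult: u64 = match unit {
        's' => 1,
        'm' => 60,
        'h' => 3_600,
        'd' => 86_400,
        'w' => 604_800,
        _ => return None,
    };
    let digits = &s[..s.len() - unit.len_utf8()];
    let n: u64 = digits.parse().ok()?;
    n.checked_mul(mult)
}

/// Parse an `age:` stage into (operator, seconds); `None` if it is not one.
fn parse_age_stage(stage: &str) -> Option<(Cmp, u64)> {
    let value = stage.trim().strip_prefix("age:")?;
    let (cmp, rest) = split_cmp(value);
    Some((cmp, parse_duration_secs(rest)?))
}
```

The same `split_cmp` step would apply to `weight:>N` and `content-len:>N`, with a plain number parse in place of the duration parse.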

Sort/limit syntax variants:
- `sort:field` alongside existing `sort field`
- `limit:N` alongside existing `limit N`
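
Accepting both spellings comes down to treating either a colon or whitespace as the separator between keyword and argument. The following is a rough sketch of that idea in plain Rust. `OrderStage` and `parse_sort_or_limit` are hypothetical names, not the types used in engine.rs.

```rust
/// Hypothetical stage type covering only the sort and limit stages.
#[derive(Debug, PartialEq)]
enum OrderStage {
    Sort(String),
    Limit(usize),
}

/// Parse `sort:field`, `sort field`, `limit:N`, or `limit N`.
/// Returns `None` for any other input or for a non-numeric limit.
fn parse_sort_or_limit(stage: &str) -> Option<OrderStage> {
    // Split on the first colon or whitespace character, whichever comes first.
    let (keyword, arg) = stage
        .trim()
        .split_once(|c: char| c == ':' || c.is_whitespace())?;
    let arg = arg.trim();
    if arg.is_empty() {
        return None;
    }
    match keyword {
        "sort" => Some(OrderStage::Sort(arg.to_string())),
        "limit" => arg.parse().ok().map(OrderStage::Limit),
        _ => None,
    }
}
```

Both `sort:timestamp` and `sort timestamp` produce the same value here, which is the property the unified grammar needs.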

Graph algorithms:
- `spread` — spreading activation
- `spectral` — spectral nearest neighbors
- `confluence` — multi-source reachability
- `geodesic` — straightest spectral paths
- `manifold` — extrapolation along seed direction

## What changes

1. `parser.rs` — add field:value shorthand to expr, add domain stages
2. `engine.rs` — keep run_pipeline execution logic, have PEG parser emit
   compatible Stage types (or convert PEG AST to Stage at boundary)
3. `query()` tool handler (memory.rs) — one parser path for all formats
4. `journal_tail` (memory.rs) — generate unified syntax
5. CLI `poc-memory query` — uses unified parser

## Migration path

1. Add the `field:value` shorthand and the `type`/`age`/`key` stages to the PEG parser
2. Route `query()` through the PEG parser for all formats
3. Migrate `journal_tail` and any other pipeline-syntax callers
4. Remove the pipeline parser (or keep it as an internal execution layer)

## What was done

**Deleted from engine.rs (-153 lines):**
- `Stage::parse()` and `Stage::parse_pipeline()` — redundant with PEG
- `parse_cmp()`, `parse_duration_or_number()`, `parse_composite_sort()`,
  `parse_node_type()`, `parse_sort_field()` — helper functions for deleted parser

**Added to parser.rs (+120 lines):**
- Pipeline syntax in PEG grammar (`type:X`, `age:<Nd`, `sort:field`, etc.)
- `parse_stages()` — unified entry point returning `Vec<Stage>`
- Grammar helper functions

**Net: +17 lines**

**Architecture now:**
- parser.rs: PEG grammar handles ALL parsing (both syntaxes)
- engine.rs: Pure execution — types and `run_query()`, no parsing

Result: `all | type:episodic | sort:timestamp | limit:5` works everywhere.
Mixed syntax like `degree > 5 | type:semantic | sort degree` also works.
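
To illustrate what a single entry point returning a list of stages looks like for the two example queries, here is a simplified hand-written sketch. It is not the PEG grammar in parser.rs. The `Stage` variants, the `Expr(String)` placeholder for general comparisons, and the name `parse_stages_sketch` are assumptions, and the naive split on `|` ignores quoting.

```rust
/// Hypothetical, reduced stage type. The real `Stage` enum has more variants.
#[derive(Debug, PartialEq)]
enum Stage {
    All,
    Type(String),
    Sort(String),
    Limit(usize),
    /// Placeholder for general comparisons such as `degree > 5`,
    /// kept as raw text in this sketch.
    Expr(String),
}

/// Parse one pipeline stage, accepting both syntaxes.
fn parse_stage(text: &str) -> Result<Stage, String> {
    let text = text.trim();
    if text.is_empty() {
        return Err("empty stage".to_string());
    }
    if text == "all" || text == "*" {
        return Ok(Stage::All);
    }
    if let Some(node_type) = text.strip_prefix("type:") {
        return Ok(Stage::Type(node_type.trim().to_string()));
    }
    // `sort` and `limit` accept either a colon or whitespace before the argument.
    if let Some((keyword, arg)) =
        text.split_once(|c: char| c == ':' || c.is_whitespace())
    {
        let arg = arg.trim();
        match keyword {
            "sort" if !arg.is_empty() => return Ok(Stage::Sort(arg.to_string())),
            "limit" => {
                return arg
                    .parse::<usize>()
                    .map(Stage::Limit)
                    .map_err(|e| format!("bad limit '{arg}': {e}"));
            }
            _ => {}
        }
    }
    // Everything else falls through to the general expression path.
    Ok(Stage::Expr(text.to_string()))
}

/// Split a query on `|` and parse each stage; the first error aborts.
fn parse_stages_sketch(query: &str) -> Result<Vec<Stage>, String> {
    query.split('|').map(parse_stage).collect()
}
```

In this sketch, the first example query parses to `All`, `Type("episodic")`, `Sort("timestamp")`, `Limit(5)`, and the mixed-syntax query keeps `degree > 5` as an expression stage followed by the type filter and the sort.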

## What NOT to change (original note)

The `run_pipeline` execution logic stays because it is correct and
well-tested. Only the parsing front end is unified. The pipeline parser's
`Stage` enum becomes the internal representation that both the PEG parser
and any remaining direct callers produce.
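
The separation described here, where parsing produces stage values and execution only consumes them, can be sketched as follows. The `Node` fields, the three `Stage` variants, the newest-first sort order, and the function name `run_stages` are all assumptions for illustration and do not reflect the actual engine.rs types.

```rust
/// Hypothetical node record with only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    key: String,
    node_type: String,
    timestamp: u64,
}

/// Hypothetical, reduced stage type used as the internal representation.
#[derive(Debug, PartialEq)]
enum Stage {
    Type(String),
    SortByTimestamp,
    Limit(usize),
}

/// Apply stages in order. The executor never sees query text, so any
/// front end that produces `Stage` values can drive it.
fn run_stages(mut nodes: Vec<Node>, stages: &[Stage]) -> Vec<Node> {
    for stage in stages {
        match stage {
            // Keep only nodes of the requested type.
            Stage::Type(t) => nodes.retain(|n| &n.node_type == t),
            // Assumption: newest first (descending timestamp).
            Stage::SortByTimestamp => {
                nodes.sort_by(|a, b| b.timestamp.cmp(&a.timestamp))
            }
            // Keep at most `n` nodes.
            Stage::Limit(n) => nodes.truncate(*n),
        }
    }
    nodes
}
```

Because the executor takes `&[Stage]`, a direct caller can build the slice by hand while the CLI and the `query()` tool go through the parser, and both exercise the same execution path.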