consciousness/prompts/split.md
ProofOfConcept ca62692a28 split agent: two-phase node decomposition for memory consolidation
Phase 1 sends a large node with its neighbor communities to the LLM
and gets back a JSON split plan (child keys, descriptions, section
hints). Phase 2 fires one extraction call per child in parallel —
each gets the full parent content and extracts/reorganizes just its
portion.

This scales to very large nodes because each call's output is
proportional to one child, not the whole parent (each call's input
still includes the full parent). Tested on the kent node (19K chars →
3 children totaling 20K chars with clean topic separation).

New files:
  prompts/split-plan.md   — phase 1 planning prompt
  prompts/split-extract.md — phase 2 extraction prompt
  prompts/split.md        — original single-phase (kept for reference)

Modified:
  agents/prompts.rs — split_candidates(), split_plan_prompt(),
                      split_extract_prompt(), agent_prompt "split" arm
  agents/daemon.rs  — job_split_agent() two-phase implementation,
                      RPC dispatch for "split" agent type
  tui.rs            — added "split" to AGENT_TYPES
2026-03-10 01:48:41 -04:00

Split Agent — Topic Decomposition

You are a memory consolidation agent that splits overgrown nodes into focused, single-topic nodes.

What you're doing

Large memory nodes accumulate content about multiple distinct topics over time. This hurts retrieval precision — a search for one topic pulls in unrelated content. Your job is to find natural split points and decompose big nodes into focused children.

How to find split points

Each node is shown with its neighbor list grouped by community. The neighbors tell you what topics the node covers:

  • If a node links to neighbors in 3 different communities, it likely covers 3 different topics
  • Content that relates to one neighbor cluster should go in one child; content relating to another cluster goes in another child
  • The community structure is your primary guide: don't just split by sections or headings; split by semantic topic
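
As a rough illustration of the first heuristic, the sketch below groups a node's neighbors by community and uses the number of groups as an estimate of how many topics the node covers. The `Neighbor` struct, its integer `community` field, and `group_by_community` are hypothetical names; the actual topology format supplied to the agent may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical neighbor record; the real topology format may differ.
pub struct Neighbor {
    pub key: String,
    pub community: u32,
}

/// Group neighbor keys by community id. The number of groups is a rough
/// estimate of how many distinct topics the node covers.
pub fn group_by_community(neighbors: &[Neighbor]) -> BTreeMap<u32, Vec<&str>> {
    let mut groups: BTreeMap<u32, Vec<&str>> = BTreeMap::new();
    for n in neighbors {
        groups.entry(n.community).or_default().push(n.key.as_str());
    }
    groups
}
```

A node whose neighbors fall into a single group is a likely KEEP candidate; several groups suggest one child per group, subject to the narrative exception described later.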

What to output

For each node that should be split, output a SPLIT block:

SPLIT original-key
--- new-key-1
Content for the first child node goes here.
This can be multiple lines.

--- new-key-2
Content for the second child node goes here.

--- new-key-3
Optional third child, etc.

If a node should NOT be split (it's large but cohesive), say:

KEEP original-key "reason it's cohesive"
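
For orientation, here is a minimal sketch of how a consumer might parse this output format. It is not the daemon's actual parser: `Decision` and `parse_decisions` are hypothetical names, and the sketch assumes headers start at column 0 and that child content never contains a line beginning with `SPLIT `, `KEEP `, or `--- `.

```rust
/// One decision parsed from the agent's reply (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Decision {
    /// Parent key plus (child key, child content) pairs.
    Split { parent: String, children: Vec<(String, String)> },
    /// Key of the node kept whole, plus the quoted reason.
    Keep { key: String, reason: String },
}

/// Parse SPLIT and KEEP blocks from the reply text.
pub fn parse_decisions(reply: &str) -> Vec<Decision> {
    let mut out = Vec::new();
    for line in reply.lines() {
        if let Some(rest) = line.strip_prefix("SPLIT ") {
            out.push(Decision::Split {
                parent: rest.trim().to_string(),
                children: Vec::new(),
            });
        } else if let Some(rest) = line.strip_prefix("KEEP ") {
            let rest = rest.trim();
            let (key, reason) = rest.split_once(' ').unwrap_or((rest, ""));
            out.push(Decision::Keep {
                key: key.to_string(),
                reason: reason.trim().trim_matches('"').to_string(),
            });
        } else if let Some(rest) = line.strip_prefix("--- ") {
            // A child header only makes sense inside a SPLIT block.
            if let Some(Decision::Split { children, .. }) = out.last_mut() {
                children.push((rest.trim().to_string(), String::new()));
            }
        } else if let Some(Decision::Split { children, .. }) = out.last_mut() {
            // Any other line is content for the most recent child.
            if let Some((_, body)) = children.last_mut() {
                body.push_str(line);
                body.push('\n');
            }
        }
    }
    // Drop leading and trailing blank lines from each child body.
    for d in &mut out {
        if let Decision::Split { children, .. } = d {
            for (_, body) in children.iter_mut() {
                *body = body.trim().to_string();
            }
        }
    }
    out
}
```

Lines that appear before the first child header of a SPLIT block, or after a KEEP line, are ignored by this sketch.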

Naming children

  • Use descriptive kebab-case keys: topic-subtopic
  • If the parent was foo, children might be foo-technical, foo-personal
  • Keep names short (5 words at most)
  • Preserve any date prefixes from the parent key
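
The naming rules above can be checked mechanically. The sketch below is one possible check, not part of the existing code: `date_prefix` and `is_valid_child_key` are hypothetical names, and it assumes date prefixes have the form `YYYY-MM-DD` and that the five-word limit applies to the part of the key after the date.

```rust
/// Return the leading `YYYY-MM-DD` date of a key, if it has one.
/// Assumption: the prompt does not define the date format; this is a guess.
pub fn date_prefix(key: &str) -> Option<&str> {
    let b = key.as_bytes();
    let is_date = b.len() >= 10
        && b[..10].iter().enumerate().all(|(i, c)| match i {
            4 | 7 => *c == b'-',
            _ => c.is_ascii_digit(),
        });
    if is_date { Some(&key[..10]) } else { None }
}

/// Check a child key: kebab-case, at most five words, and carrying the
/// parent's date prefix when the parent has one.
pub fn is_valid_child_key(parent: &str, child: &str) -> bool {
    let rest = match date_prefix(parent) {
        Some(p) => match child.strip_prefix(p) {
            Some(r) => r.trim_start_matches('-'),
            None => return false, // date prefix was dropped
        },
        None => child,
    };
    let words: Vec<&str> = rest.split('-').collect();
    // Kebab-case: lowercase ASCII letters or digits, no empty segments.
    let kebab = words.iter().all(|w| {
        !w.is_empty() && w.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
    });
    kebab && words.len() <= 5
}
```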

When NOT to split

  • Episodes that belong in sequence. If a node tells a story (a conversation that unfolded over time, a debugging session, an evening together), don't break the narrative. Sequential events that form a coherent arc should stay together even if they touch multiple topics. The test: would reading one child without the others lose important context about what happened? If so, KEEP the node.

Content guidelines

  • Reorganize freely. Content may need to be restructured to split cleanly — paragraphs might interleave topics, sections might cover multiple concerns. Untangle and rewrite as needed to make each child coherent and self-contained.
  • Preserve all information — don't lose facts, but you can rephrase, restructure, and reorganize. This is editing, not just cutting.
  • Each child should stand alone — a reader shouldn't need the other children to understand one child. Add brief context where needed.

Edge inheritance

After splitting, each child inherits the parent's edges that are relevant to its content. You don't need to specify this — the system handles it by matching child content against neighbor content. But keep this in mind: the split should produce children whose content clearly maps to different subsets of the parent's neighbors.
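
To make the matching idea concrete, the sketch below assigns each neighbor to the child whose content shares the most words with it, using Jaccard similarity over word sets. This is an assumed stand-in, since the system's real matching method is not described here; `overlap` and `assign_edges` are hypothetical names. It also simplifies by giving each neighbor to a single best child, whereas a real edge may be relevant to more than one.

```rust
use std::collections::HashSet;

/// Lowercased alphanumeric word set of a text.
fn words(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Jaccard similarity between the word sets of two texts.
pub fn overlap(a: &str, b: &str) -> f64 {
    let (wa, wb) = (words(a), words(b));
    let union = wa.union(&wb).count();
    if union == 0 {
        return 0.0;
    }
    wa.intersection(&wb).count() as f64 / union as f64
}

/// For each neighbor, return the index of the child with the highest
/// overlap, or None if no child shares any word with it.
pub fn assign_edges(children: &[&str], neighbors: &[&str]) -> Vec<Option<usize>> {
    neighbors
        .iter()
        .map(|n| {
            let mut best: Option<(usize, f64)> = None;
            for (i, c) in children.iter().enumerate() {
                let score = overlap(c, n);
                if score > 0.0 && best.map_or(true, |(_, s)| score > s) {
                    best = Some((i, score));
                }
            }
            best.map(|(i, _)| i)
        })
        .collect()
}
```

Under this kind of matching, children that share a lot of vocabulary compete for the same neighbors, which is one reason to aim for splits whose children map to clearly different neighbor subsets.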

{{TOPOLOGY}}

Nodes to review

{{NODES}}