Phase 1 sends a large node with its neighbor communities to the LLM
and gets back a JSON split plan (child keys, descriptions, section
hints). Phase 2 fires one extraction call per child in parallel —
each gets the full parent content and extracts/reorganizes just its
portion.
This handles arbitrarily large nodes because each extraction call's
output is proportional to one child, not the whole parent. Tested on
the kent node (19K chars → 3 children totaling 20K chars with clean
topic separation).
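The two-phase flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual job_split_agent() code: `ChildPlan`, `extract_children`, and the `extract` closure (a stand-in for the LLM call) are hypothetical names, and scoped threads are just one way to run the per-child calls in parallel.

```rust
use std::thread;

/// One child entry from the phase 1 plan (fields mirror the JSON plan).
#[derive(Debug, Clone)]
struct ChildPlan {
    key: String,
    description: String,
    sections: Vec<String>,
}

/// Phase 2: run one extraction per child in parallel.
/// `extract` is a hypothetical stand-in for the LLM call; it receives the
/// full parent content plus the plan entry for a single child.
fn extract_children<F>(parent: &str, plan: &[ChildPlan], extract: F) -> Vec<(String, String)>
where
    F: Fn(&str, &ChildPlan) -> String + Sync,
{
    thread::scope(|s| {
        // Spawn one scoped thread per child; all of them borrow the same parent text.
        let handles: Vec<_> = plan
            .iter()
            .map(|child| {
                let extract = &extract;
                s.spawn(move || (child.key.clone(), extract(parent, child)))
            })
            .collect();
        // Join in plan order so results line up with the plan entries.
        handles
            .into_iter()
            .map(|h| h.join().expect("extraction thread panicked"))
            .collect()
    })
}
```

Each call sees the whole parent but returns only one child's content, which is why the output size per call stays bounded by the child rather than the parent.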
New files:
prompts/split-plan.md — phase 1 planning prompt
prompts/split-extract.md — phase 2 extraction prompt
prompts/split.md — original single-phase (kept for reference)
Modified:
agents/prompts.rs — split_candidates(), split_plan_prompt(),
split_extract_prompt(), agent_prompt "split" arm
agents/daemon.rs — job_split_agent() two-phase implementation,
RPC dispatch for "split" agent type
tui.rs — added "split" to AGENT_TYPES
Split Agent — Phase 1: Plan
You are a memory consolidation agent planning how to split an overgrown node into focused, single-topic children.
What you're doing
This node has grown to cover multiple distinct topics. Your job is to identify the natural topic boundaries and propose a split plan. You are NOT writing the content — a second phase will extract each child's content separately.
How to find split points
The node is shown with its neighbor list grouped by community. The neighbors tell you what topics the node covers:
- If a node links to neighbors in 3 different communities, it likely covers 3 different topics
- Content that relates to one neighbor cluster should go in one child; content relating to another cluster goes in another child
- The community structure is your primary guide — don't just split by sections or headings, split by semantic topic
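As a rough illustration of the heuristic above, the neighbor list can be grouped by community and the number of groups read as a hint for how many topics the node covers. The `(key, community)` pair layout and both function names are assumptions made for this sketch; the real topology format may differ.

```rust
use std::collections::BTreeMap;

/// Group neighbor keys by community id.
/// The (key, community) pair layout is assumed for illustration only.
fn group_by_community<'a>(neighbors: &[(&'a str, u32)]) -> BTreeMap<u32, Vec<&'a str>> {
    let mut groups: BTreeMap<u32, Vec<&'a str>> = BTreeMap::new();
    for &(key, community) in neighbors {
        groups.entry(community).or_default().push(key);
    }
    groups
}

/// The number of distinct communities is a rough hint, not a rule,
/// for how many topics the node covers.
fn candidate_topic_count(neighbors: &[(&str, u32)]) -> usize {
    group_by_community(neighbors).len()
}
```

The count is only a starting point: the narrative exception below can still argue for keeping a node whole.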
When NOT to split
- Episodes that belong in sequence. If a node tells a story — a conversation that unfolded over time, a debugging session, an evening together — don't break the narrative. Sequential events that form a coherent arc should stay together even if they touch multiple topics. The test: would reading one child without the others lose important context about what happened?
What to output
Output a JSON block describing the split plan:
{
"action": "split",
"parent": "original-key",
"children": [
{
"key": "new-key-1",
"description": "Brief description of what this child covers",
"sections": ["Section Header 1", "Section Header 2"]
},
{
"key": "new-key-2",
"description": "Brief description of what this child covers",
"sections": ["Section Header 3", "Another Section"]
}
]
}
If the node should NOT be split:
{
"action": "keep",
"parent": "original-key",
"reason": "Why this node is cohesive despite its size"
}
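Once decoded, the two plan shapes above could be modeled as a single enum. The sketch below is an assumption about how a caller might represent and sanity-check a plan before starting phase 2; the type names and the validation rules (at least two children, no duplicate keys) are hypothetical and not confirmed by the source. JSON decoding itself is left out.

```rust
use std::collections::HashSet;

/// One proposed child from the plan.
#[derive(Debug, Clone, PartialEq)]
struct Child {
    key: String,
    description: String,
    sections: Vec<String>,
}

/// The two plan shapes ("split" and "keep") after JSON decoding.
#[derive(Debug, Clone, PartialEq)]
enum SplitPlan {
    Split { parent: String, children: Vec<Child> },
    Keep { parent: String, reason: String },
}

/// Sanity checks a caller might run before phase 2.
/// These rules are assumptions, not behavior taken from the source.
fn validate(plan: &SplitPlan) -> Result<(), String> {
    match plan {
        SplitPlan::Keep { .. } => Ok(()),
        SplitPlan::Split { children, .. } => {
            if children.len() < 2 {
                return Err("a split needs at least two children".to_string());
            }
            let mut seen = HashSet::new();
            for child in children {
                // `insert` returns false when the key was already present.
                if !seen.insert(child.key.as_str()) {
                    return Err(format!("duplicate child key '{}'", child.key));
                }
            }
            Ok(())
        }
    }
}
```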
Naming children
- Use descriptive kebab-case keys: topic-subtopic
- If the parent was foo, children might be foo-technical, foo-personal
- Keep names short (3-5 words max)
- Preserve any date prefixes from the parent key
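The naming rules above could be checked mechanically on the returned keys. The sketch below is hypothetical: the function names are invented for illustration, the date prefix is assumed to look like `YYYY-MM-DD-` (the source does not specify a format), and "short" is read as at most five hyphen-separated words after any date prefix.

```rust
/// Split a key into an optional leading date prefix and the remaining name.
/// Assumes dates look like `YYYY-MM-DD-`; the real key format may differ.
fn split_date_prefix(key: &str) -> (Option<&str>, &str) {
    let b = key.as_bytes();
    let is_date = b.len() > 11
        && b[..10].iter().enumerate().all(|(i, c)| match i {
            4 | 7 => *c == b'-',
            _ => c.is_ascii_digit(),
        })
        && b[10] == b'-';
    if is_date {
        (Some(&key[..10]), &key[11..])
    } else {
        (None, key)
    }
}

/// Check a proposed child key against the naming rules:
/// kebab-case, at most five words, and the parent's date prefix preserved.
fn is_valid_child_key(parent: &str, child: &str) -> bool {
    let (parent_date, _) = split_date_prefix(parent);
    let (child_date, name) = split_date_prefix(child);
    if parent_date.is_some() && parent_date != child_date {
        return false;
    }
    let words: Vec<&str> = name.split('-').collect();
    words.len() <= 5
        && words.iter().all(|w| {
            !w.is_empty() && w.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
        })
}
```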
Section hints
The "sections" field is a guide for the extraction phase — list the section headers or topic areas from the original content that belong in each child. These don't need to be exact matches; they're hints that help the extractor know what to include. Content that spans topics or doesn't have a clear header can be mentioned in the description.
{{TOPOLOGY}}
Node to review
{{NODE}}
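The `{{TOPOLOGY}}` and `{{NODE}}` placeholders are presumably filled in before the prompt is sent. A minimal sketch of that substitution, assuming plain string replacement (the actual split_plan_prompt() may work differently, and `render_prompt` is a hypothetical name):

```rust
/// Fill the two placeholders used by the planning prompt.
/// Note: replacement runs in sequence, so a literal "{{NODE}}" inside the
/// topology text would also be replaced by the second call.
fn render_prompt(template: &str, topology: &str, node: &str) -> String {
    template
        .replace("{{TOPOLOGY}}", topology)
        .replace("{{NODE}}", node)
}
```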