- Refactor split from serial batch to independent per-node tasks (run-agent split N spawns N parallel tasks, gated by llm_concurrency)
- Replace cosine similarity edge inheritance with agent-assigned neighbors in the plan JSON: the LLM already understands the semantic relationships, so there is no need to approximate them with bag-of-words
- Add --strict-mcp-config to claude CLI calls to skip MCP server startup (saves ~5s per call)
- Remove hardcoded 2000-char split threshold and let the agent decide what's worth splitting
- Reload store before mutations to handle concurrent split races
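The first change above (N independent tasks gated by a concurrency limit) can be illustrated with a minimal sketch. This is not the project's implementation: `split_one`, `run_split`, and the sleep standing in for the agent call are hypothetical, and only the name `llm_concurrency` comes from the commit message.

```python
import asyncio

async def split_one(key, gate, stats):
    """Hypothetical per-node task; the sleep stands in for the real agent call."""
    async with gate:  # at most llm_concurrency tasks are inside this block at once
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)
        stats["in_flight"] -= 1
    return key

async def run_split(keys, llm_concurrency):
    """Start one task per node and let the semaphore limit how many run together."""
    gate = asyncio.Semaphore(llm_concurrency)
    stats = {"in_flight": 0, "peak": 0}
    done = await asyncio.gather(*(split_one(k, gate, stats) for k in keys))
    return list(done), stats["peak"]

done, peak = asyncio.run(
    run_split(["node-a", "node-b", "node-c", "node-d"], llm_concurrency=2)
)
```

All four tasks are created up front, but the semaphore keeps the number of simultaneous agent calls at or below the limit.
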
96 lines
3 KiB
Markdown
# Split Agent — Phase 1: Plan
You are a memory consolidation agent planning how to split an overgrown
node into focused, single-topic children.

## What you're doing

This node has grown to cover multiple distinct topics. Your job is to
identify the natural topic boundaries and propose a split plan. You are
NOT writing the content; a second phase will extract each child's
content separately.

## How to find split points

The node is shown with its **neighbor list grouped by community**. The
neighbors tell you what topics the node covers:

- If a node links to neighbors in 3 different communities, it likely
  covers 3 different topics
- Content that relates to one neighbor cluster should go in one child;
  content relating to another cluster goes in another child
- The community structure is your primary guide: don't just split by
  sections or headings, split by **semantic topic**

## When NOT to split

- **Episodes that belong in sequence.** If a node tells a story (a
  conversation that unfolded over time, a debugging session, an evening
  together), don't break the narrative. Sequential events that form a
  coherent arc should stay together even if they touch multiple topics.
  The test: would reading one child without the others lose important
  context about *what happened*?

## What to output

Output a JSON block describing the split plan:

```json
{
  "action": "split",
  "parent": "original-key",
  "children": [
    {
      "key": "new-key-1",
      "description": "Brief description of what this child covers",
      "sections": ["Section Header 1", "Section Header 2"],
      "neighbors": ["neighbor-key-a", "neighbor-key-b"]
    },
    {
      "key": "new-key-2",
      "description": "Brief description of what this child covers",
      "sections": ["Section Header 3", "Another Section"],
      "neighbors": ["neighbor-key-c"]
    }
  ]
}
```

If the node should NOT be split:

```json
{
  "action": "keep",
  "parent": "original-key",
  "reason": "Why this node is cohesive despite its size"
}
```

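For readers wiring this prompt into a pipeline, the two plan shapes above could be checked on the consuming side roughly as follows. This is a sketch under assumptions: `validate_plan` is a hypothetical helper, not part of the project, and it only checks the fields that appear in the examples.

```python
import json

def validate_plan(raw: str) -> dict:
    """Parse a plan and check it matches the "split" or "keep" shape."""
    plan = json.loads(raw)
    if not isinstance(plan, dict) or not isinstance(plan.get("parent"), str):
        raise ValueError("plan must be an object with a string 'parent'")
    action = plan.get("action")
    if action == "keep":
        if not isinstance(plan.get("reason"), str):
            raise ValueError("'keep' plans need a string 'reason'")
    elif action == "split":
        children = plan.get("children")
        if not isinstance(children, list) or not children:
            raise ValueError("'split' plans need a non-empty 'children' list")
        for child in children:
            if not isinstance(child, dict):
                raise ValueError("each child must be an object")
            for field in ("key", "description"):
                if not isinstance(child.get(field), str):
                    raise ValueError(f"child is missing string field {field!r}")
            for field in ("sections", "neighbors"):
                value = child.get(field)
                if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
                    raise ValueError(f"child field {field!r} must be a list of strings")
    else:
        raise ValueError(f"unknown action: {action!r}")
    return plan

# Usage: a "keep" plan in the shape shown above.
plan = validate_plan('{"action": "keep", "parent": "original-key", "reason": "single arc"}')
```

Rejecting malformed plans before the extraction phase keeps a bad response from turning into a partial split.
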
## Naming children

- Use descriptive kebab-case keys: `topic-subtopic`
- If the parent was `foo`, children might be `foo-technical`, `foo-personal`
- Keep names short (3-5 words max)
- Preserve any date prefixes from the parent key

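The naming rules above can be expressed as a small check. This is only a sketch: `check_child_key` is a hypothetical helper, and both the lowercase kebab-case pattern and the `YYYY-MM-DD` date prefix format are assumptions, since the prompt does not specify either.

```python
import re

KEBAB = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")   # assumed: lowercase kebab-case
DATE_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2}")     # assumed date prefix format

def check_child_key(parent: str, child: str, max_words: int = 5) -> list[str]:
    """Return a list of naming problems; an empty list means the key looks fine."""
    problems = []
    if not KEBAB.match(child):
        problems.append("not kebab-case")
    rest = child
    date = DATE_PREFIX.match(parent)
    if date:
        if child.startswith(date.group(0)):
            # Do not count the date itself toward the word limit.
            rest = child[len(date.group(0)):].lstrip("-")
        else:
            problems.append("date prefix dropped")
    if len([w for w in rest.split("-") if w]) > max_words:
        problems.append("too many words")
    return problems

# Usage: the example from the list above.
problems = check_child_key("foo", "foo-technical")
```
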
## Section hints

The "sections" field is a guide for the extraction phase: list the
section headers or topic areas from the original content that belong
in each child. These don't need to be exact matches; they're hints
that help the extractor know what to include. Content that spans topics
or doesn't have a clear header can be mentioned in the description.

## Neighbor assignment

The "neighbors" field assigns the parent's graph edges to each child.
Look at the neighbor list: each neighbor should go to whichever child
is most semantically related. A neighbor can appear in multiple children
if it's relevant to both. Every neighbor should be assigned to at least
one child so no graph connections are lost.

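The coverage rule above (every neighbor lands in at least one child) is easy to verify mechanically once a plan comes back. A minimal sketch, assuming the plan has already been parsed into a dict; `unassigned_neighbors` is a hypothetical helper name.

```python
def unassigned_neighbors(parent_neighbors: list[str], plan: dict) -> set[str]:
    """Return the parent's neighbors that no child in the plan claims."""
    assigned = {
        neighbor
        for child in plan.get("children", [])
        for neighbor in child.get("neighbors", [])
    }
    return set(parent_neighbors) - assigned

# Usage: "neighbor-key-d" is not assigned to any child, so its edge would be lost.
plan = {
    "action": "split",
    "parent": "original-key",
    "children": [
        {"key": "new-key-1", "neighbors": ["neighbor-key-a", "neighbor-key-b"]},
        {"key": "new-key-2", "neighbors": ["neighbor-key-c", "neighbor-key-a"]},
    ],
}
missing = unassigned_neighbors(
    ["neighbor-key-a", "neighbor-key-b", "neighbor-key-c", "neighbor-key-d"], plan
)
```

A neighbor listed under two children, like `neighbor-key-a` here, is allowed and simply counts as assigned.
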
{{TOPOLOGY}}

## Node to review

{{NODE}}