split agent: parallel execution, agent-driven edges, no MCP overhead

- Refactor split from serial batch to independent per-node tasks
  (run-agent split N spawns N parallel tasks, gated by llm_concurrency)
- Replace cosine similarity edge inheritance with agent-assigned
  neighbors in the plan JSON — the LLM already understands the
  semantic relationships, no need to approximate with bag-of-words
- Add --strict-mcp-config to claude CLI calls to skip MCP server
  startup (saves ~5s per call)
- Remove hardcoded 2000-char split threshold — let the agent decide
  what's worth splitting
- Reload store before mutations to handle concurrent split races
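
A minimal sketch of the gating described in the first bullet, assuming the per-node tasks are coroutines and that llm_concurrency is a plain integer limit. The names run_split and split_one are hypothetical and are not taken from this commit:

```python
import asyncio


async def run_split(keys, llm_concurrency, split_one):
    """Run split_one(key) for every key, at most llm_concurrency at a time.

    split_one is a hypothetical coroutine standing in for one per-node
    split task; it is an assumption, not the project's actual function.
    """
    gate = asyncio.Semaphore(llm_concurrency)

    async def gated(key):
        # Each node is an independent task; the semaphore caps how many
        # are inside the LLM call at once.
        async with gate:
            return await split_one(key)

    # gather returns results in the same order as the input keys.
    return await asyncio.gather(*(gated(k) for k in keys))
```

All N tasks are created up front, so a slow node does not hold up the others beyond the shared concurrency limit.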
ProofOfConcept 2026-03-10 03:21:33 -04:00
parent 149c289fea
commit 8bbc246b3d
4 changed files with 218 additions and 195 deletions

@@ -43,12 +43,14 @@ Output a JSON block describing the split plan:
 {
 "key": "new-key-1",
 "description": "Brief description of what this child covers",
-"sections": ["Section Header 1", "Section Header 2"]
+"sections": ["Section Header 1", "Section Header 2"],
+"neighbors": ["neighbor-key-a", "neighbor-key-b"]
 },
 {
 "key": "new-key-2",
 "description": "Brief description of what this child covers",
-"sections": ["Section Header 3", "Another Section"]
+"sections": ["Section Header 3", "Another Section"],
+"neighbors": ["neighbor-key-c"]
 }
 ]
 }
@@ -79,6 +81,14 @@ in each child. These don't need to be exact matches; they're hints
 that help the extractor know what to include. Content that spans topics
 or doesn't have a clear header can be mentioned in the description.
 
+## Neighbor assignment
+
+The "neighbors" field assigns the parent's graph edges to each child.
+Look at the neighbor list — each neighbor should go to whichever child
+is most semantically related. A neighbor can appear in multiple children
+if it's relevant to both. Every neighbor should be assigned to at least
+one child so no graph connections are lost.
+
 {{TOPOLOGY}}
 
 ## Node to review
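
The neighbor-assignment rule added to the prompt above (every parent neighbor goes to at least one child, and a neighbor may appear in several) could be checked on the returned plan with something like the following. This is a sketch under assumptions: the function name unassigned_neighbors is hypothetical, and the children are passed as an already-parsed list because the top-level key of the plan JSON is not visible in this hunk.

```python
def unassigned_neighbors(children, parent_neighbors):
    """Return the parent neighbors that no child claimed, in original order.

    children: list of child dicts shaped like the plan entries in the
    prompt ("key", "description", "sections", "neighbors").
    parent_neighbors: keys of the parent's current graph neighbors.
    Hypothetical helper, not part of this commit.
    """
    assigned = set()
    for child in children:
        # A missing "neighbors" field is treated as an empty list.
        assigned.update(child.get("neighbors", []))
    return [n for n in parent_neighbors if n not in assigned]


children = [
    {"key": "new-key-1", "neighbors": ["neighbor-key-a", "neighbor-key-b"]},
    {"key": "new-key-2", "neighbors": ["neighbor-key-c", "neighbor-key-a"]},
]
parent = ["neighbor-key-a", "neighbor-key-b", "neighbor-key-c", "neighbor-key-d"]
print(unassigned_neighbors(children, parent))  # ['neighbor-key-d']
```

An empty result means no graph connection would be lost by the split; a non-empty one lists edges that still need a home. Duplicates across children are allowed and do not affect the result.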