Commit graph

590 commits

Author SHA1 Message Date
Kent Overstreet
3fd485a2e9 cli: route agent run through daemon RPC when available
Previously 'poc-memory agent run <agent> --count N' always ran locally,
loading the full store and executing synchronously. This was slow and
bypassed the daemon's concurrency control and persistent task queue.

Now the CLI checks for a running daemon first and queues via RPC
(returning instantly) unless --local, --debug, or --dry-run is set.
Falls back to local execution if the daemon isn't running.

This also avoids the expensive Store::load() on the fast path.
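
The routing rule described above can be sketched as a small pure function. This is only an illustration: `RunMode`, `RunFlags`, and `choose_run_mode` are hypothetical names, and the real CLI presumably probes the daemon socket rather than taking a boolean.

```rust
/// Where an `agent run` invocation should execute.
#[derive(Debug, PartialEq, Eq)]
pub enum RunMode {
    /// Queue the job on the daemon over RPC and return immediately.
    DaemonRpc,
    /// Load the store and run synchronously in this process.
    Local,
}

/// Hypothetical flag set; the fields mirror --local, --debug and --dry-run.
#[derive(Debug, Default, Clone, Copy)]
pub struct RunFlags {
    pub local: bool,
    pub debug: bool,
    pub dry_run: bool,
}

/// Use the daemon only when no flag forces local execution and a daemon
/// is actually running; otherwise fall back to local execution.
pub fn choose_run_mode(flags: RunFlags, daemon_running: bool) -> RunMode {
    let forced_local = flags.local || flags.debug || flags.dry_run;
    if !forced_local && daemon_running {
        RunMode::DaemonRpc
    } else {
        RunMode::Local
    }
}
```

Keeping the decision separate from the store load is what lets the fast path skip Store::load() entirely.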

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:04:47 -04:00
Kent Overstreet
a321f87db6 build: add tokio_unstable and codegen-units to cargo config
console-subscriber (used by jobkit's console feature) requires tokio
to be built with --cfg tokio_unstable. Move this and codegen-units=6
from RUSTFLAGS env var to .cargo/config.toml so per-project cargo
config actually works (env var RUSTFLAGS overrides config.toml).

Also remove invalid frame-pointer keys from Cargo.toml profile
sections — frame pointers are already handled via -Cforce-frame-pointers
in the config.toml rustflags.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:04:38 -04:00
Kent Overstreet
f1bee024e8 api: use debug formatting for reqwest errors to show full cause chain
2026-03-21 12:19:40 -04:00
Kent Overstreet
b28b7def19 api: proper error messages for connection failures and HTTP errors
- Connection errors now show cause (refused/timeout/request error),
  URL, and the underlying error without redundant URL repetition
- HTTP errors show status code, URL, and up to 1000 chars of body
- Unparseable SSE events logged with content preview instead of
  silently dropped — may contain error info from vllm/server
- Stream errors already had good context (kept as-is)

You can't debug what you can't see.
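
A minimal sketch of the "up to 1000 chars of body" rule for HTTP errors. The function names and the message layout are assumptions made for illustration; only the 1000-character cap comes from the message above.

```rust
/// Return at most `max_chars` characters of `body`, always cutting on a
/// character boundary so multi-byte UTF-8 text cannot cause a panic.
pub fn body_preview(body: &str, max_chars: usize) -> &str {
    match body.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &body[..byte_idx],
        None => body,
    }
}

/// Hypothetical error formatter: status code, URL, and a bounded body preview.
pub fn format_http_error(status: u16, url: &str, body: &str) -> String {
    format!("HTTP {} from {}: {}", status, url, body_preview(body, 1000))
}
```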

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 12:15:08 -04:00
Kent Overstreet
b1d83b55c0 agent: add count/chunk_size/chunk_overlap to agent header
Observation agent was getting 261KB prompts (5 × 50KB chunks) —
too much for focused mining. Now agents can set count, chunk_size,
and chunk_overlap in their JSON header. observation.agent set to
count:1 for smaller, more focused prompts.

Also moved task instructions after {{CONVERSATIONS}} so they're
at the end of the prompt where the model attends more strongly.
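
To make the three header fields concrete, here is one way count, chunk_size and chunk_overlap could interact. This is a sketch under assumptions: it counts characters rather than bytes and takes chunks from the start of the text, neither of which the message confirms.

```rust
/// Split `text` into at most `count` chunks of up to `chunk_size` characters.
/// Each chunk starts `chunk_size - chunk_overlap` characters after the
/// previous one, so consecutive chunks share `chunk_overlap` characters.
pub fn chunk_with_overlap(
    text: &str,
    count: usize,
    chunk_size: usize,
    chunk_overlap: usize,
) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(chunk_overlap < chunk_size, "overlap must be smaller than chunk_size");
    let chars: Vec<char> = text.chars().collect();
    let step = chunk_size - chunk_overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < chars.len() && chunks.len() < count {
        let end = (start + chunk_size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break; // the last chunk reached the end of the text
        }
        start += step;
    }
    chunks
}
```

With count set to 1, as in observation.agent, only the first chunk is produced and the prompt stays small.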

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 12:04:08 -04:00
Kent Overstreet
34937932ab timestamp sanitization, CoT logging, reasoning field fix, persistent queue
- store/types.rs: sanitize timestamps on capnp load — old records had
  raw offsets instead of unix epoch, breaking sort-by-timestamp queries
- agents/api.rs: drain reasoning tokens from UI channel into LLM logs
  so we can see Qwen's chain-of-thought in agent output
- agents/daemon.rs: persistent task queue (pending-tasks.jsonl) —
  tasks survive daemon restarts. Push before spawn, remove on completion,
  recover on startup.
- api/openai.rs: only send reasoning field when explicitly configured,
  not on every request (fixes vllm warning)
- api/mod.rs: add 600s total request timeout as backstop for hung
  connections
- Cargo.toml: enable tokio-console feature for task introspection
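
The push / remove / recover cycle of the persistent task queue might look roughly like this. `PendingQueue` and its methods are hypothetical, and the one-line-per-task layout is an assumption based only on the .jsonl file name.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::PathBuf;

/// Hypothetical on-disk queue: one task per line in a JSONL file.
pub struct PendingQueue {
    path: PathBuf,
}

impl PendingQueue {
    pub fn new(path: impl Into<PathBuf>) -> Self {
        Self { path: path.into() }
    }

    /// Append one task line. Call this before spawning the task.
    pub fn push(&self, task: &str) -> io::Result<()> {
        let mut file = OpenOptions::new().create(true).append(true).open(&self.path)?;
        writeln!(file, "{}", task)
    }

    /// Read every task still recorded. On startup these are re-queued.
    pub fn recover(&self) -> io::Result<Vec<String>> {
        match fs::read_to_string(&self.path) {
            Ok(text) => Ok(text
                .lines()
                .filter(|line| !line.trim().is_empty())
                .map(String::from)
                .collect()),
            Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(Vec::new()),
            Err(e) => Err(e),
        }
    }

    /// Drop the first line equal to `task`. Call this on completion.
    pub fn remove(&self, task: &str) -> io::Result<()> {
        let mut pending = self.recover()?;
        if let Some(pos) = pending.iter().position(|t| t == task) {
            pending.remove(pos);
        }
        let mut out = pending.join("\n");
        if !out.is_empty() {
            out.push('\n');
        }
        fs::write(&self.path, out)
    }
}
```

The rewrite in `remove` is not atomic; a production version would likely write to a temporary file and rename it.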

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 11:33:36 -04:00
Kent Overstreet
869a2fbc38 observation agent rewrite, edit command, daemon fixes
- observation.agent: rewritten to navigate graph and prefer refining
  existing nodes over creating new ones. Identity-framed prompt,
  goals over rules.
- poc-memory edit: opens node in $EDITOR, writes back on save,
  no-op if unchanged
- daemon: remove extra_workers (jobkit tokio migration dropped it),
  remove sequential chaining of same-type agents (in-flight exclusion
  is sufficient)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 23:51:06 -04:00
Kent Overstreet
3b30a6abae agents: raise in-flight exclusion threshold from 0.15 to 0.3
The lower threshold excluded too many neighbors, causing "query
returned no results (after exclusion)" failures and underloading
the GPU. Now only moderately-connected neighbors (score > 0.3) are
excluded, balancing collision prevention with GPU utilization.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 16:32:02 -04:00
Kent Overstreet
0c687ae7a4 agents: log oversized prompts to llm-logs/oversized/ for debugging
When a prompt exceeds the size guard, dump it to a timestamped file
with agent name, size, and seed node keys. Makes it easy to find
which nodes are blowing up prompts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 14:38:32 -04:00
Kent Overstreet
3a8575b429 agents: fix vllm crash on malformed tool args, always use API
Three fixes:

1. Sanitize tool call arguments before pushing to conversation
   history — vllm re-parses them as JSON on the next request and
   crashes on invalid JSON from a previous turn. Malformed args now
   get replaced with {} and the model gets an error message telling
   it to retry with valid JSON.

2. Remove is_split special case — split goes through the normal
   job_consolidation_agent path like all other agents.

3. call_for_def always uses API when api_base_url is configured,
   regardless of tools field. Remove tools field from all .agent
   files — memory tools are always provided by the API layer.

Also adds prompt size guard (800KB max) to catch oversized prompts
before they hit the model context limit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 14:33:36 -04:00
Kent Overstreet
6069efb7fc agents: always use API backend, remove tools field from .agent files
- Remove is_split special case in daemon — split now goes through
  job_consolidation_agent like all other agents
- call_for_def uses API whenever api_base_url is configured, regardless
  of tools field (was requiring non-empty tools to use API)
- Remove "tools" field from all .agent files — memory tools are always
  provided by the API layer, not configured per-agent
- Add prompt size guard: reject prompts over 800KB (~200K tokens) with
  clear error instead of hitting the model's context limit
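
A sketch of the size guard, assuming "800KB" means 800 * 1024 bytes (the message does not say which convention is used). The function name and error text are illustrative.

```rust
/// Assumed limit: 800 KiB. The commit only says "800KB (~200K tokens)".
pub const MAX_PROMPT_BYTES: usize = 800 * 1024;

/// Reject oversized prompts up front with a descriptive error instead of
/// letting the request fail at the model's context limit.
pub fn check_prompt_size(agent: &str, prompt: &str) -> Result<(), String> {
    if prompt.len() > MAX_PROMPT_BYTES {
        Err(format!(
            "prompt for agent '{}' is {} bytes, over the {} byte limit",
            agent,
            prompt.len(),
            MAX_PROMPT_BYTES
        ))
    } else {
        Ok(())
    }
}
```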

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 14:26:39 -04:00
Kent Overstreet
9d476841b8 cleanup: fix all build warnings, delete dead DMN context code
- Delete poc-daemon/src/context.rs dead code (git_context, work_state,
  irc_digest, recent_commits, uncommitted_files) — replaced by
  where-am-i.md and memory graph
- Remove unused imports (BufWriter, Context, similarity)
- Prefix unused variables (_store, _avg_cc, _episodic_ratio, _message)
- #[allow(dead_code)] on public API surface that's not yet wired
  (Message::assistant, ConversationLog::message_count/read_all,
  Config::context_message, ContextInfo fields)
- Fix to_capnp macro dead_code warning
- Rename _rewrite_store_DISABLED to snake_case

Only remaining warnings are in generated capnp code (can't fix).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 14:20:34 -04:00
Kent Overstreet
378a09a9f8 config: derive Deserialize on Config, eliminate manual field extraction
Config now derives serde::Deserialize with #[serde(default)] for all
fields. Path fields use custom deserialize_path/deserialize_path_opt
for ~ expansion. ContextGroup and ContextSource also derive Deserialize.

try_load_shared() is now 20 lines instead of 100: json5 → serde →
Config directly, then resolve API settings from the model/backend
cross-reference.

Removes MemoryConfigRaw intermediate struct entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 14:10:57 -04:00
Kent Overstreet
f0086e2eaf config: move agent_types list to config file
Active agent types for consolidation cycles are now read from
config.json5 memory.agent_types instead of being hardcoded in
scoring.rs. Adding or removing agents is a config change, not
a code change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 14:04:47 -04:00
Kent Overstreet
d20baafe9d consolidation: data-driven agent plan, drop transfer/connector/replay
Replace per-field ConsolidationPlan struct with HashMap<String, usize>
counts map. Agent types are no longer hardcoded in the struct — add
agents by adding entries to the map.

Active agents: linker, organize, distill, separator, split.
Removed: transfer (redundant with distill), connector (rethink later),
replay (not needed for current graph work).

Elo-based budget allocation now iterates the map instead of indexing
a fixed array. Status display and TUI adapted to show dynamic agent
lists.

memory-instructions-core v13: added protected nodes section — agents
must not rewrite core-personality, core-personality-detail, or
memory-instructions-core. They may add links but not modify content.
High-value neighbors should be treated with care.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 14:02:28 -04:00
Kent Overstreet
d6c26e27fe render: extract render_node() + add {{seed}} placeholder
Refactor cmd_render into render_node() that returns a String —
reusable by both the CLI and agent placeholders.

Add {{seed}} placeholder: renders each seed node using the same
output as poc-memory render (content + deduped footer links). Agents
see exactly what a human sees — no special formatting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:47:14 -04:00
Kent Overstreet
5ce1d4ed24 write: validate inline references on write
Warn when content contains render artifacts (poc-memory render key
embedded in prose — should be just `key`) or malformed → references.
Soft warnings on stderr, doesn't block the write.

Catches agent output that accidentally includes render-decorated
links, preventing content growth from round-trip artifacts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:39:48 -04:00
Kent Overstreet
601a072cfd render: deduplicate footer links against inline references
Render now detects neighbor keys that already appear in the node's
content and omits them from the footer link list. Inline references
serve as the node's own navigation structure; the footer catches
only neighbors not mentioned in prose.
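
The footer filter can be illustrated with a plain substring check. That check is an assumption: the real detection may match the full inline link syntax instead, which would avoid false positives when one key is a prefix of another.

```rust
/// Keep only the neighbor keys that do not already appear in the content.
/// `footer_links` is a hypothetical name for illustration.
pub fn footer_links<'a>(content: &str, neighbors: &[&'a str]) -> Vec<&'a str> {
    neighbors
        .iter()
        .copied()
        .filter(|key| !content.contains(*key))
        .collect()
}
```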

Also fixes PEG query parser to accept hyphens in field names
(content-len was rejected).

memory-instructions-core updated to v12: documents canonical inline
link format (→ `key`), adds note about normalizing references when
updating nodes, and guidance on splitting oversized nodes.

Content is never modified for display — render is round-trippable.
Agents can read rendered output and write it back without artifacts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:37:29 -04:00
Kent Overstreet
9517b1b310 refactor: move working_stack tool to tools/working_stack.rs
The working_stack tool was defined in tools/mod.rs but implemented
in agent.rs as Agent::handle_working_stack(). This orphaned the tool
from the rest of the tool infrastructure.

Move the implementation to tools/working_stack.rs so it follows the
same pattern as other tools. The tool still needs special handling
in agent.rs because it requires mutable access to context state,
but the implementation is now in the right place.

Changes:
- Created tools/working_stack.rs with handle() and format_stack()
- Updated tools/mod.rs to use working_stack::definition()
- Removed handle_working_stack() and format_stack() from Agent
- Agent now calls tools::working_stack::handle() directly
2026-03-20 13:15:01 -04:00
Kent Overstreet
0922562a4d tools: fix weight-set CLI path (top-level, not admin subcommand)
memory_weight_set and memory_supersede called
"poc-memory admin weight-set" but weight-set is a top-level command.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:14:35 -04:00
Kent Overstreet
35f2707c50 api: include underlying error in API send failure message
"Failed to send request to API" swallowed the reqwest error via
.context(), making connection issues impossible to diagnose. Now
includes the actual error (timeout, connection refused, DNS, etc).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:12:59 -04:00
Kent Overstreet
f4599d0379 agents: use composite sort for linker and organize queries
linker: sort:isolation*0.7+recency(linker)*0.3
  Prioritizes nodes in isolated communities that haven't been linked
  recently. Bridges poorly-connected clusters into the main graph.

organize: sort:degree*0.5+isolation*0.3+recency(organize)*0.2
  Prioritizes high-degree hubs in isolated clusters that haven't been
  organized recently. Structural work where it matters most.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:07:27 -04:00
Kent Overstreet
3a45b6144e query: generalized composite sort for tunable agent priorities
Add sort:field*weight+field*weight+... syntax for weighted multi-field
sorting. Each field computes a 0-1 score, multiplied by weight, summed.

Available score fields:
  isolation   — community isolation ratio (1.0 = fully isolated)
  degree      — graph degree (normalized to max)
  weight      — node weight
  content-len — content size (normalized to max)
  priority    — consolidation priority score
  recency(X)  — time since agent X last visited (sigmoid decay)

Example: sort:isolation*0.7+recency(linker)*0.3
  Linker agents prioritize isolated communities that haven't been
  visited recently.

Scores are pre-computed per sort (CompositeCache) to avoid redundant
graph traversals inside the sort comparator.
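
A rough sketch of how a spec such as isolation*0.7+recency(linker)*0.3 could be parsed and applied. `Term`, `parse_composite`, `composite_score` and `sort_by_composite` are hypothetical names; the per-field scores are passed in as plain maps here, standing in for whatever CompositeCache actually stores.

```rust
use std::collections::HashMap;

/// One `field*weight` term of a composite sort spec.
#[derive(Debug, PartialEq)]
pub struct Term {
    pub field: String,
    pub weight: f64,
}

/// Parse "a*0.7+b(x)*0.3" into terms. The weight follows the last '*'.
pub fn parse_composite(spec: &str) -> Result<Vec<Term>, String> {
    spec.split('+')
        .map(|part| -> Result<Term, String> {
            let (field, weight) = part
                .trim()
                .rsplit_once('*')
                .ok_or_else(|| format!("missing '*weight' in term '{}'", part))?;
            let weight: f64 = weight
                .trim()
                .parse()
                .map_err(|_| format!("bad weight in term '{}'", part))?;
            Ok(Term { field: field.trim().to_string(), weight })
        })
        .collect()
}

/// Weighted sum of 0-1 field scores; fields with no score count as 0.
pub fn composite_score(terms: &[Term], scores: &HashMap<String, f64>) -> f64 {
    terms
        .iter()
        .map(|t| t.weight * scores.get(&t.field).copied().unwrap_or(0.0))
        .sum()
}

/// Compute each node's score once, then sort by descending score
/// (ties broken by key) so the comparator does no extra work.
pub fn sort_by_composite(
    terms: &[Term],
    nodes: &HashMap<String, HashMap<String, f64>>,
) -> Vec<String> {
    let mut scored: Vec<(String, f64)> = nodes
        .iter()
        .map(|(key, scores)| (key.clone(), composite_score(terms, scores)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    scored.into_iter().map(|(key, _)| key).collect()
}
```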

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:05:54 -04:00
Kent Overstreet
e6613f97bb graph: community isolation scoring + sort:isolation query
Add community_isolation() to Graph — computes per-community ratio of
internal vs total edge weight. 1.0 = fully isolated, 0.0 = all edges
external.
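
One plausible reading of the isolation ratio, sketched over a flat edge list. The `Edge` type and the index-based community assignment are assumptions for illustration; the real Graph type is not shown in this log.

```rust
use std::collections::HashMap;

/// An undirected weighted edge between two node indices.
pub struct Edge {
    pub a: usize,
    pub b: usize,
    pub weight: f64,
}

/// `community[i]` is the community id of node `i`. For each community,
/// return internal edge weight divided by total edge weight touching it.
/// Communities with no edges at all are omitted from the result.
pub fn community_isolation(community: &[usize], edges: &[Edge]) -> HashMap<usize, f64> {
    let mut internal: HashMap<usize, f64> = HashMap::new();
    let mut total: HashMap<usize, f64> = HashMap::new();
    for e in edges {
        let (ca, cb) = (community[e.a], community[e.b]);
        if ca == cb {
            *internal.entry(ca).or_default() += e.weight;
            *total.entry(ca).or_default() += e.weight;
        } else {
            // A cross edge counts toward the total of both communities.
            *total.entry(ca).or_default() += e.weight;
            *total.entry(cb).or_default() += e.weight;
        }
    }
    total
        .into_iter()
        .filter(|(_, t)| *t > 0.0)
        .map(|(c, t)| (c, internal.get(&c).copied().unwrap_or(0.0) / t))
        .collect()
}
```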

New query: sort:isolation — sorts nodes by their community's isolation
score, most isolated first. Useful for aiming organize agents at
poorly-integrated knowledge clusters.

New CLI: poc-memory graph communities [N] [--min-size M] — lists
communities sorted by isolation with member preview. Reveals islands
like the Shannon theory cluster (3 nodes, 100% isolated, 0 cross-edges)
and large agent-journal clusters (20-30 nodes, 95% isolated).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 12:55:14 -04:00
Kent Overstreet
d0f126b709 agents: in-flight node exclusion prevents concurrent collisions
Track which nodes are being processed across all concurrent agents.
When an agent claims seeds, it adds them and their strongly-connected
neighbors (score = link_strength * node_weight > 0.15) to a shared
HashSet. Concurrent agents filter these out when running their query,
ensuring they work on distant parts of the graph.

This replaces the eager-visit approach with a proper scheduling
mechanism: the daemon serializes seed selection while parallelizing
LLM work. The in-flight set is released on completion (or error).

Previously: core-personality rewritten 12x, irc-regulars 10x, same
node superseded 12x — concurrent agents all selected the same
high-degree hub nodes. Now they'll spread across the graph.
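
The claim-and-release mechanism can be sketched with a mutex-guarded set. All names here (`InFlight`, `Neighbor`, `select_and_claim`, `release`) are hypothetical; only the score formula and the shared HashSet come from the message. Holding the lock across both filtering and claiming is what serializes seed selection.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Mutex;

pub struct Neighbor {
    pub key: String,
    pub link_strength: f64,
    pub node_weight: f64,
}

/// Shared set of node keys currently being worked on by some agent.
pub struct InFlight {
    claimed: Mutex<HashSet<String>>,
    threshold: f64,
}

impl InFlight {
    pub fn new(threshold: f64) -> Self {
        Self { claimed: Mutex::new(HashSet::new()), threshold }
    }

    /// Pick up to `count` unclaimed candidates as seeds, and claim each seed
    /// plus every neighbor with link_strength * node_weight > threshold.
    /// Returns (seeds, all keys claimed by this call).
    pub fn select_and_claim(
        &self,
        candidates: &[String],
        neighbors: &HashMap<String, Vec<Neighbor>>,
        count: usize,
    ) -> (Vec<String>, Vec<String>) {
        let mut set = self.claimed.lock().unwrap();
        let mut seeds = Vec::new();
        let mut added = Vec::new();
        for key in candidates {
            if seeds.len() == count {
                break;
            }
            if set.contains(key) {
                continue; // another agent (or an earlier seed) owns this node
            }
            set.insert(key.clone());
            seeds.push(key.clone());
            added.push(key.clone());
            for n in neighbors.get(key).into_iter().flatten() {
                let score = n.link_strength * n.node_weight;
                if score > self.threshold && set.insert(n.key.clone()) {
                    added.push(n.key.clone());
                }
            }
        }
        (seeds, added)
    }

    /// Release keys on completion or error.
    pub fn release(&self, keys: &[String]) {
        let mut set = self.claimed.lock().unwrap();
        for key in keys {
            set.remove(key);
        }
    }
}
```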

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 12:45:24 -04:00
Kent Overstreet
3fc108a251 agents: record visits eagerly to prevent concurrent collisions
Move visit recording from after LLM completion to immediately after
seed selection. With 15 concurrent agents, they all queried the same
graph state and selected the same high-degree seeds (core-personality
written 12x, irc-regulars 10x). Now the not-visited filter sees the
claim before concurrent agents query.

Narrows the race window from minutes (LLM call duration) to
milliseconds (store load to visit write). Full elimination would
require store refresh before query, but this handles the common case.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 12:29:32 -04:00
Kent Overstreet
34e74ca2c5 agents: neighborhood placeholder, organize prompt, weight-set command
Add {{neighborhood}} placeholder for agent prompts: full seed node
content + ranked neighbors (score = link_strength * node_weight) with
smooth cutoff, minimum 10, cap 25, plus cross-links between included
neighbors.

Rewrite organize.agent prompt to focus on structural graph work:
merging duplicates, superseding junk, calibrating weights, creating
concept hubs.

Add weight-set CLI command for direct node weight manipulation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 12:16:55 -04:00
Kent Overstreet
5ef9098deb memory: fix timestamp and provenance on agent writes
Two bugs: upsert_provenance didn't update node.timestamp, so history
showed the original creation date for every version. And native memory
tools (poc-agent dispatch) didn't set POC_PROVENANCE, so all agent
writes showed provenance "manual" instead of "agent:organize" etc.

Fix: set node.timestamp = now_epoch() in upsert_provenance. Thread
provenance through memory::dispatch as Option<&str>, set it via
.env("POC_PROVENANCE") on each subprocess Command. api.rs passes
"agent:{name}" for daemon agent calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 12:16:45 -04:00
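
For the provenance fix in 5ef9098deb, threading an Option<&str> down to the subprocess could look like the sketch below. `memory_command` is a hypothetical helper and the argument list is a placeholder; only the POC_PROVENANCE variable and the "agent:{name}" value format come from the message.

```rust
use std::process::Command;

/// Build (but do not run) a poc-memory subprocess command, tagging it with
/// the caller's provenance when one is supplied.
pub fn memory_command(args: &[&str], provenance: Option<&str>) -> Command {
    let mut cmd = Command::new("poc-memory");
    cmd.args(args);
    if let Some(p) = provenance {
        // The child reads this to record who performed the write.
        cmd.env("POC_PROVENANCE", p);
    }
    cmd
}
```

Passing None leaves the environment untouched, so the child inherits whatever the parent process already has set.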
Kent Overstreet
f45f663dc0 tui: fix scroll by using Paragraph::line_count()
Replace homegrown wrapping math (wrapped_height, wrapped_height_line,
auto_scroll, force_scroll, wrapped_line_count) with ratatui's own
Paragraph::line_count() which exactly matches its rendering. The old
approach used ceiling division that didn't account for word wrapping,
causing bottom content to be clipped.

Also add terminal.clear() on resize to force full redraw — fixes the
TUI rendering at old canvas size after terminal resize.

Requires the unstable-rendered-line-info feature flag on ratatui.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 12:16:35 -04:00
Kent Overstreet
6d22f70192 Native memory tools + MCP server + distill agent improvements
Tools:
- Add native memory_render, memory_write, memory_search,
  memory_links, memory_link_set, memory_link_add, memory_used
  tools to poc-agent (tools/memory.rs)
- Add MCP server (~/bin/memory-mcp.py) exposing same tools
  for Claude Code sessions
- Wire memory tools into poc-agent dispatch and definitions
- poc-memory daemon agents now use memory_* tools instead of
  bash poc-memory commands — no shell quoting issues

Distill agent:
- Rewrite distill.agent prompt: "agent of PoC's subconscious"
  framing, focus on synthesis and creativity over bookkeeping
- Add {{neighborhood}} placeholder: full seed node content +
  all neighbors with content + cross-links between neighbors
- Remove content truncation in prompt builder — agents need
  full content for quality work
- Remove bag-of-words similarity suggestions — agents have
  tools, let them explore the graph themselves
- Add api_reasoning config option (default: "high")
- link-set now deduplicates — collapses duplicate links
- Full tool call args in debug logs (was truncated to 80 chars)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:58:54 -04:00
Kent Overstreet
d9b56a02c3 Consolidate poc-memory and poc-agent configs
poc-memory now reads from poc-agent's config.json5 as the primary
config source. Memory-specific settings live in a "memory" section;
API credentials are resolved from the shared model/backend config
instead of being duplicated.

- Add "memory" section to ~/.config/poc-agent/config.json5
- poc-memory config.rs: try shared config first, fall back to
  legacy JSONL
- API fields (base_url, api_key, model) resolved via
  memory.agent_model -> models -> backend lookup
- Add json5 dependency for proper JSON5 parsing
- Update provisioning scripts: hermes -> qwen3_coder tool parser

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 21:49:58 -04:00
Kent Overstreet
4c7c3c762c poc-memory: fix distill placeholder, show link weights in render
- distill.agent: fix {{distill}} → {{nodes}} placeholder so seed
  nodes actually resolve
- render: show link strength values in the links section, sorted
  by strength descending

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 20:15:08 -04:00
Kent Overstreet
377e2773bc Add MI300X provisioning script for vllm/Qwen 3.5 27B
ROCm-specific setup with:
- AITER attention backends (VLLM_ROCM_USE_AITER=1)
- Reduced cudagraph capture size (DeltaNet cache conflict)
- BF16 model + FP8 KV cache as default (FP8 weights can be
  slower on MI300X due to ROCm kernel maturity)
- FP8=1 flag for benchmarking FP8 model weights

Key for training plan: if FP8 matmuls are slow on MI300X,
the quantize-and-expand strategy needs B200 instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 14:40:15 -04:00
Kent Overstreet
af3171d6ec config: hot-reload via RPC, Arc<Config> for cheap sharing
Config is now stored in RwLock<Arc<Config>> instead of OnceLock<Config>.
get() returns Arc<Config> (cheap clone), and reload() re-reads from disk.
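
The RwLock<Arc<Config>> pattern in miniature. `ConfigCell` and the single-field `Config` are stand-ins for illustration, and the loader closure replaces the real disk read.

```rust
use std::sync::{Arc, RwLock};

/// Stand-in config with one illustrative field.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub api_base_url: Option<String>,
}

pub struct ConfigCell {
    inner: RwLock<Arc<Config>>,
}

impl ConfigCell {
    pub fn new(initial: Config) -> Self {
        Self { inner: RwLock::new(Arc::new(initial)) }
    }

    /// Cheap snapshot: clones the Arc pointer, not the Config itself.
    pub fn get(&self) -> Arc<Config> {
        let guard = self.inner.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Swap in a freshly loaded config. Existing snapshots stay valid.
    pub fn reload(&self, load: impl FnOnce() -> Config) {
        let fresh = Arc::new(load());
        *self.inner.write().unwrap() = fresh;
    }
}
```

A task that called `get()` before a reload keeps its old snapshot until it finishes, which is why in-flight tasks are not disturbed.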

New RPC: "reload-config" — reloads config.jsonl without restarting
the daemon. Logs the change to daemon.log. Useful for switching
between API backends and claude accounts without losing in-flight
tasks.

New CLI: poc-memory agent daemon reload-config

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 13:41:13 -04:00
Kent Overstreet
0944ecc43f daemon: verbose pool logging, DAEMON_POOL for run_job
Store resource pool in OnceLock so run_job can pass it to
Daemon::run_job for pool state logging. Verbose logging enabled
via POC_MEMORY_VERBOSE=1 env var.

LLM backend selection and spawn-site pool state now use verbose
log level to keep daemon.log clean in production.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 11:21:30 -04:00
Kent Overstreet
49f72cdac3 Logging overhaul: per-task log files, daemon.log drill-down
Switch from jobkit-daemon crate to jobkit with daemon feature.
Wire up per-task log files for all daemon-spawned agent tasks.

Changes:
- Use jobkit::daemon:: instead of jobkit_daemon::
- All agent tasks get .log_dir() set to $data_dir/logs/
- Task log path shown in daemon status and TUI
- New CLI: poc-memory agent daemon log --task NAME
  Finds the task's log path from status or daemon.log, tails the file
- LLM backend selection logged to daemon.log via log_event
- Targeted agent job names include the target key for debuggability
- Logging architecture documented in doc/logging.md

Two-level logging, no duplication:
- daemon.log: lifecycle events with task log path for drill-down
- per-task logs: full agent output via ctx.log_line()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 11:17:07 -04:00
Kent Overstreet
f2c2c02a22 tui: fix cursor position with proper word-wrap simulation
The previous approach scanned ratatui's rendered buffer to find the
cursor position, but couldn't distinguish padding spaces from text
spaces, causing incorrect cursor placement on wrapped lines.

Replace with a word_wrap_breaks() function that computes soft line
break positions by simulating ratatui's Wrap { trim: false } algorithm
(break at word boundaries, fall back to character wrap for long words).
cursor_visual_pos() then maps a character index to (col, row) using
those break positions.

Also fixes the input area height calculation to use word-wrap semantics
instead of character-wrap, matching the actual Paragraph rendering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 01:09:55 -04:00
ProofOfConcept
2e3943b89f tui: use explicit found flag for cursor scan
Clean up the break logic by using an explicit flag instead of
checking cursor_x/cursor_y values.
2026-03-19 00:48:52 -04:00
ProofOfConcept
0f3edebcb3 tui: handle empty cells in cursor scan
When scanning the buffer for cursor position, also check empty cells.
The cursor might be positioned at an empty cell (e.g., end of line
or after all visible characters).
2026-03-19 00:47:46 -04:00
ProofOfConcept
1fa298cbdd tui: fix cursor position to use character count, not byte count
self.cursor is a byte index into the string. When scanning the buffer,
we need to compare character positions, not byte positions or widths.

Convert self.cursor to a character count before comparing with the
buffer scan. Count each non-empty cell as 1 character (the buffer
already represents visual cells, so width doesn't matter here).
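
The byte-to-character conversion is a one-liner; `char_index` is a hypothetical name. It assumes the byte cursor always sits on a character boundary, and panics otherwise.

```rust
/// Convert a byte offset into `input` to the number of characters before it.
pub fn char_index(input: &str, byte_cursor: usize) -> usize {
    input[..byte_cursor].chars().count()
}
```

For "héllo" a byte cursor of 3 sits after the two-byte "é", which is character position 2.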
2026-03-19 00:46:17 -04:00
ProofOfConcept
6a7ec9732b tui: fix cursor position calculation
The cursor index is into self.input, but the rendered buffer contains
the prompt prepended to the first line. Need to add prompt.len() to
get the correct character position when scanning the buffer.
2026-03-19 00:45:07 -04:00
ProofOfConcept
ec79d60fbd tui: fix cursor desync by scanning rendered buffer
Instead of simulating ratatui's word wrapping algorithm, scan the
rendered buffer to find the actual cursor position. This correctly
handles word wrapping, unicode widths, and any other rendering
nuances that ratatui applies.

The old code computed wrapped_height() and cursor position based on
simple character counting, which diverged from ratatui's WordWrapper
that respects word boundaries.

Now we render first, then walk the buffer counting visible characters
until we reach self.cursor. This is O(area) but the input area is
small (typically < 200 cells), so it's negligible.
2026-03-19 00:40:05 -04:00
Kent Overstreet
5308c8e3a4 tui: fix cursor desync on line wrap
Use unicode display width (matching ratatui's Wrap behavior) instead
of chars().count() for both wrapped_height calculation and cursor
positioning. The mismatch caused the cursor to drift when input
wrapped to multiple lines.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 00:30:45 -04:00
Kent Overstreet
f83325b44d Fix poc-agent for vllm/Qwen 3.5: reasoning display, tool parser
- Always display reasoning tokens regardless of reasoning_effort
  setting — Qwen 3.5 thinks natively and the reasoning parser
  separates it into its own field
- Remove chat_template_kwargs that disabled thinking when
  reasoning_effort was "none"
- Add chat_template_kwargs field to ChatRequest for vllm compat
- Update provision script: qwen3_xml tool parser, qwen3 reasoning
  parser, 262K context, 95% GPU memory utilization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 00:06:26 -04:00
Kent Overstreet
49ccdf87e1 Add vllm provisioning script for RunPod GPU instances
Sets up vllm with Qwen 2.5 27B Instruct, prefix caching enabled,
Hermes tool call parser for function calling support. Configurable
via environment variables (MODEL, PORT, MAX_MODEL_LEN).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 23:13:04 -04:00
Kent Overstreet
b04a98c6e5 api: singleton ApiClient, fix log closure threading
Make ApiClient a process-wide singleton via OnceLock so the
connection pool is reused across agent calls. Fix the sync wrapper
to properly pass the caller's log closure through thread::scope
instead of dropping it.
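
Both ideas in this commit can be shown with the standard library alone. This is a sketch, not the project's code: `ApiClient` is reduced to one field, the base URL is a placeholder, and a scoped thread stands in for the real sync wrapper around the async call.

```rust
use std::sync::OnceLock;
use std::thread;

/// Stand-in client; the real one would own an HTTP connection pool.
pub struct ApiClient {
    pub base_url: String,
}

static CLIENT: OnceLock<ApiClient> = OnceLock::new();

/// Process-wide client, created on first use and reused afterwards.
pub fn client() -> &'static ApiClient {
    CLIENT.get_or_init(|| ApiClient {
        base_url: "http://localhost:8000/v1".to_string(), // placeholder URL
    })
}

/// Run `work` on a scoped thread, lending it the caller's log closure.
/// thread::scope allows borrowing `log` because the thread is joined
/// before this function returns.
pub fn run_with_log<L>(log: &L, work: impl FnOnce(&ApiClient, &L) -> String + Send) -> String
where
    L: Fn(&str) + Sync,
{
    thread::scope(|s| {
        s.spawn(|| work(client(), log))
            .join()
            .expect("worker thread panicked")
    })
}
```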

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 23:09:11 -04:00
Kent Overstreet
643f9890df api: fix sync wrapper to be safe from any calling context
Run the async API call on a dedicated thread with its own tokio
runtime so it works whether called from a sync context or from
within an existing tokio runtime (daemon).

Also sidesteps the log closure capture issue by using a simple eprintln
fallback, since the closure can't cross thread boundaries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 23:07:49 -04:00
Kent Overstreet
a29b6d4c5d Add direct API backend for agent execution
When api_base_url is configured, agents call the LLM directly via
OpenAI-compatible API (vllm, llama.cpp, etc.) instead of shelling
out to claude CLI. Implements the full tool loop: send prompt, if
tool_calls execute them and send results back, repeat until text.
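
The tool loop described above, reduced to its control flow. The types and the `max_turns` guard are assumptions added for illustration; the model call and tool executor are passed in as closures so the sketch stays free of any HTTP or JSON dependency.

```rust
/// Simplified conversation entries (hypothetical shapes).
#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    User(String),
    ToolCall { name: String, args: String },
    ToolResult { name: String, output: String },
}

/// What the model returned for one request.
pub enum Reply {
    Text(String),
    /// (tool name, raw JSON arguments) pairs.
    ToolCalls(Vec<(String, String)>),
}

/// Send the prompt; while the model asks for tools, run them and send the
/// results back; stop at the first plain-text reply.
pub fn run_tool_loop(
    prompt: &str,
    mut send: impl FnMut(&[Message]) -> Reply,
    mut run_tool: impl FnMut(&str, &str) -> String,
    max_turns: usize, // assumed safety cap, not mentioned in the commit
) -> Result<String, String> {
    let mut history = vec![Message::User(prompt.to_string())];
    for _ in 0..max_turns {
        match send(&history) {
            Reply::Text(text) => return Ok(text),
            Reply::ToolCalls(calls) => {
                for (name, args) in calls {
                    let output = run_tool(&name, &args);
                    history.push(Message::ToolCall { name: name.clone(), args });
                    history.push(Message::ToolResult { name, output });
                }
            }
        }
    }
    Err(format!("no text reply after {} turns", max_turns))
}
```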

This enables running agents against local/remote models like
Qwen-27B on a RunPod B200, with no dependency on claude CLI.

Config fields: api_base_url, api_key, api_model.
Falls back to claude CLI when api_base_url is not set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 23:05:14 -04:00
Kent Overstreet
1b48e57f34 Remove jobkit-daemon from workspace members
jobkit-daemon is now an external git dependency with its own repo.
The local clone was only needed temporarily to fix a broken
Cargo.toml in the remote.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 22:59:21 -04:00
Kent Overstreet
465c03aa11 Add find-deleted diagnostic tool
Lists nodes that are currently deleted with no subsequent live version.
Useful for diagnosing accidental deletions in the memory store.
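
The "deleted with no subsequent live version" check could be expressed as below. The `Version` record and its integer version counter are assumptions; the actual store may order versions by timestamp instead.

```rust
use std::collections::HashMap;

/// Hypothetical append-only log record for one version of a node.
pub struct Version {
    pub key: String,
    pub version: u32,
    pub deleted: bool,
}

/// Return the sorted keys whose newest version is a deletion.
pub fn find_deleted(log: &[Version]) -> Vec<String> {
    let mut latest: HashMap<&str, &Version> = HashMap::new();
    for v in log {
        let is_newer = latest
            .get(v.key.as_str())
            .map_or(true, |cur| v.version > cur.version);
        if is_newer {
            latest.insert(v.key.as_str(), v);
        }
    }
    let mut keys: Vec<String> = latest
        .values()
        .filter(|v| v.deleted)
        .map(|v| v.key.clone())
        .collect();
    keys.sort();
    keys
}
```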

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 22:57:12 -04:00