consciousness

Author	SHA1	Message	Date
Kent Overstreet	3fd485a2e9	cli: route agent run through daemon RPC when available Previously 'poc-memory agent run <agent> --count N' always ran locally, loading the full store and executing synchronously. This was slow and bypassed the daemon's concurrency control and persistent task queue. Now the CLI checks for a running daemon first and queues via RPC (returning instantly) unless --local, --debug, or --dry-run is set. Falls back to local execution if the daemon isn't running. This also avoids the expensive Store::load() on the fast path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 15:04:47 -04:00
Kent Overstreet	a321f87db6	build: add tokio_unstable and codegen-units to cargo config console-subscriber (used by jobkit's console feature) requires tokio to be built with --cfg tokio_unstable. Move this and codegen-units=6 from RUSTFLAGS env var to .cargo/config.toml so per-project cargo config actually works (env var RUSTFLAGS overrides config.toml). Also remove invalid frame-pointer keys from Cargo.toml profile sections — frame pointers are already handled via -Cforce-frame-pointers in the config.toml rustflags. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 15:04:38 -04:00
Kent Overstreet	f1bee024e8	api: use debug formatting for reqwest errors to show full cause chain	2026-03-21 12:19:40 -04:00
Kent Overstreet	b28b7def19	api: proper error messages for connection failures and HTTP errors - Connection errors now show cause (refused/timeout/request error), URL, and the underlying error without redundant URL repetition - HTTP errors show status code, URL, and up to 1000 chars of body - Unparseable SSE events logged with content preview instead of silently dropped — may contain error info from vllm/server - Stream errors already had good context (kept as-is) You can't debug what you can't see. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 12:15:08 -04:00
Kent Overstreet	b1d83b55c0	agent: add count/chunk_size/chunk_overlap to agent header Observation agent was getting 261KB prompts (5 × 50KB chunks) — too much for focused mining. Now agents can set count, chunk_size, and chunk_overlap in their JSON header. observation.agent set to count:1 for smaller, more focused prompts. Also moved task instructions after {{CONVERSATIONS}} so they're at the end of the prompt where the model attends more strongly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 12:04:08 -04:00
Kent Overstreet	34937932ab	timestamp sanitization, CoT logging, reasoning field fix, persistent queue - store/types.rs: sanitize timestamps on capnp load — old records had raw offsets instead of unix epoch, breaking sort-by-timestamp queries - agents/api.rs: drain reasoning tokens from UI channel into LLM logs so we can see Qwen's chain-of-thought in agent output - agents/daemon.rs: persistent task queue (pending-tasks.jsonl) — tasks survive daemon restarts. Push before spawn, remove on completion, recover on startup. - api/openai.rs: only send reasoning field when explicitly configured, not on every request (fixes vllm warning) - api/mod.rs: add 600s total request timeout as backstop for hung connections - Cargo.toml: enable tokio-console feature for task introspection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 11:33:36 -04:00
Kent Overstreet	869a2fbc38	observation agent rewrite, edit command, daemon fixes - observation.agent: rewritten to navigate graph and prefer refining existing nodes over creating new ones. Identity-framed prompt, goals over rules. - poc-memory edit: opens node in $EDITOR, writes back on save, no-op if unchanged - daemon: remove extra_workers (jobkit tokio migration dropped it), remove sequential chaining of same-type agents (in-flight exclusion is sufficient) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 23:51:06 -04:00
Kent Overstreet	3b30a6abae	agents: raise in-flight exclusion threshold from 0.15 to 0.3 The lower threshold excluded too many neighbors, causing "query returned no results (after exclusion)" failures and underloading the GPU. Now only moderately-connected neighbors (score > 0.3) are excluded, balancing collision prevention with GPU utilization. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 16:32:02 -04:00
Kent Overstreet	0c687ae7a4	agents: log oversized prompts to llm-logs/oversized/ for debugging When a prompt exceeds the size guard, dump it to a timestamped file with agent name, size, and seed node keys. Makes it easy to find which nodes are blowing up prompts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 14:38:32 -04:00
Kent Overstreet	3a8575b429	agents: fix vllm crash on malformed tool args, always use API Three fixes: 1. Sanitize tool call arguments before pushing to conversation history — vllm re-parses them as JSON on the next request and crashes on invalid JSON from a previous turn. Malformed args now get replaced with {} and the model gets an error message telling it to retry with valid JSON. 2. Remove is_split special case — split goes through the normal job_consolidation_agent path like all other agents. 3. call_for_def always uses API when api_base_url is configured, regardless of tools field. Remove tools field from all .agent files — memory tools are always provided by the API layer. Also adds prompt size guard (800KB max) to catch oversized prompts before they hit the model context limit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 14:33:36 -04:00
Kent Overstreet	6069efb7fc	agents: always use API backend, remove tools field from .agent files - Remove is_split special case in daemon — split now goes through job_consolidation_agent like all other agents - call_for_def uses API whenever api_base_url is configured, regardless of tools field (was requiring non-empty tools to use API) - Remove "tools" field from all .agent files — memory tools are always provided by the API layer, not configured per-agent - Add prompt size guard: reject prompts over 800KB (~200K tokens) with clear error instead of hitting the model's context limit Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 14:26:39 -04:00
Kent Overstreet	9d476841b8	cleanup: fix all build warnings, delete dead DMN context code - Delete poc-daemon/src/context.rs dead code (git_context, work_state, irc_digest, recent_commits, uncommitted_files) — replaced by where-am-i.md and memory graph - Remove unused imports (BufWriter, Context, similarity) - Prefix unused variables (_store, _avg_cc, _episodic_ratio, _message) - #[allow(dead_code)] on public API surface that's not yet wired (Message::assistant, ConversationLog::message_count/read_all, Config::context_message, ContextInfo fields) - Fix to_capnp macro dead_code warning - Rename _rewrite_store_DISABLED to snake_case Only remaining warnings are in generated capnp code (can't fix). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 14:20:34 -04:00
Kent Overstreet	378a09a9f8	config: derive Deserialize on Config, eliminate manual field extraction Config now derives serde::Deserialize with #[serde(default)] for all fields. Path fields use custom deserialize_path/deserialize_path_opt for ~ expansion. ContextGroup and ContextSource also derive Deserialize. try_load_shared() is now 20 lines instead of 100: json5 → serde → Config directly, then resolve API settings from the model/backend cross-reference. Removes MemoryConfigRaw intermediate struct entirely. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 14:10:57 -04:00
Kent Overstreet	f0086e2eaf	config: move agent_types list to config file Active agent types for consolidation cycles are now read from config.json5 memory.agent_types instead of being hardcoded in scoring.rs. Adding or removing agents is a config change, not a code change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 14:04:47 -04:00
Kent Overstreet	d20baafe9d	consolidation: data-driven agent plan, drop transfer/connector/replay Replace per-field ConsolidationPlan struct with HashMap<String, usize> counts map. Agent types are no longer hardcoded in the struct — add agents by adding entries to the map. Active agents: linker, organize, distill, separator, split. Removed: transfer (redundant with distill), connector (rethink later), replay (not needed for current graph work). Elo-based budget allocation now iterates the map instead of indexing a fixed array. Status display and TUI adapted to show dynamic agent lists. memory-instructions-core v13: added protected nodes section — agents must not rewrite core-personality, core-personality-detail, or memory-instructions-core. They may add links but not modify content. High-value neighbors should be treated with care. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 14:02:28 -04:00
Kent Overstreet	d6c26e27fe	render: extract render_node() + add {{seed}} placeholder Refactor cmd_render into render_node() that returns a String — reusable by both the CLI and agent placeholders. Add {{seed}} placeholder: renders each seed node using the same output as poc-memory render (content + deduped footer links). Agents see exactly what a human sees — no special formatting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 13:47:14 -04:00
Kent Overstreet	5ce1d4ed24	write: validate inline references on write Warn when content contains render artifacts (poc-memory render key embedded in prose — should be just `key`) or malformed → references. Soft warnings on stderr, doesn't block the write. Catches agent output that accidentally includes render-decorated links, preventing content growth from round-trip artifacts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 13:39:48 -04:00
Kent Overstreet	601a072cfd	render: deduplicate footer links against inline references Render now detects neighbor keys that already appear in the node's content and omits them from the footer link list. Inline references serve as the node's own navigation structure; the footer catches only neighbors not mentioned in prose. Also fixes PEG query parser to accept hyphens in field names (content-len was rejected). memory-instructions-core updated to v12: documents canonical inline link format (→ `key`), adds note about normalizing references when updating nodes, and guidance on splitting oversized nodes. Content is never modified for display — render is round-trippable. Agents can read rendered output and write it back without artifacts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 13:37:29 -04:00
Kent Overstreet	9517b1b310	refactor: move working_stack tool to tools/working_stack.rs The working_stack tool was defined in tools/mod.rs but implemented in agent.rs as Agent::handle_working_stack(). This orphaned the tool from the rest of the tool infrastructure. Move the implementation to tools/working_stack.rs so it follows the same pattern as other tools. The tool still needs special handling in agent.rs because it requires mutable access to context state, but the implementation is now in the right place. Changes: - Created tools/working_stack.rs with handle() and format_stack() - Updated tools/mod.rs to use working_stack::definition() - Removed handle_working_stack() and format_stack() from Agent - Agent now calls tools::working_stack::handle() directly	2026-03-20 13:15:01 -04:00
Kent Overstreet	0922562a4d	tools: fix weight-set CLI path (top-level, not admin subcommand) memory_weight_set and memory_supersede called "poc-memory admin weight-set" but weight-set is a top-level command. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 13:14:35 -04:00
Kent Overstreet	35f2707c50	api: include underlying error in API send failure message "Failed to send request to API" swallowed the reqwest error via .context(), making connection issues impossible to diagnose. Now includes the actual error (timeout, connection refused, DNS, etc). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 13:12:59 -04:00
Kent Overstreet	f4599d0379	agents: use composite sort for linker and organize queries linker: sort:isolation*0.7+recency(linker)*0.3 Prioritizes nodes in isolated communities that haven't been linked recently. Bridges poorly-connected clusters into the main graph. organize: sort:degree*0.5+isolation*0.3+recency(organize)*0.2 Prioritizes high-degree hubs in isolated clusters that haven't been organized recently. Structural work where it matters most. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 13:07:27 -04:00
Kent Overstreet	3a45b6144e	query: generalized composite sort for tunable agent priorities Add sort:field*weight+field*weight+... syntax for weighted multi-field sorting. Each field computes a 0-1 score, multiplied by weight, summed. Available score fields: isolation — community isolation ratio (1.0 = fully isolated) degree — graph degree (normalized to max) weight — node weight content-len — content size (normalized to max) priority — consolidation priority score recency(X) — time since agent X last visited (sigmoid decay) Example: sort:isolation*0.7+recency(linker)*0.3 Linker agents prioritize isolated communities that haven't been visited recently. Scores are pre-computed per sort (CompositeCache) to avoid redundant graph traversals inside the sort comparator. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 13:05:54 -04:00
Kent Overstreet	e6613f97bb	graph: community isolation scoring + sort:isolation query Add community_isolation() to Graph — computes per-community ratio of internal vs total edge weight. 1.0 = fully isolated, 0.0 = all edges external. New query: sort:isolation — sorts nodes by their community's isolation score, most isolated first. Useful for aiming organize agents at poorly-integrated knowledge clusters. New CLI: poc-memory graph communities [N] [--min-size M] — lists communities sorted by isolation with member preview. Reveals islands like the Shannon theory cluster (3 nodes, 100% isolated, 0 cross-edges) and large agent-journal clusters (20-30 nodes, 95% isolated). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 12:55:14 -04:00
Kent Overstreet	d0f126b709	agents: in-flight node exclusion prevents concurrent collisions Track which nodes are being processed across all concurrent agents. When an agent claims seeds, it adds them and their strongly-connected neighbors (score = link_strength * node_weight > 0.15) to a shared HashSet. Concurrent agents filter these out when running their query, ensuring they work on distant parts of the graph. This replaces the eager-visit approach with a proper scheduling mechanism: the daemon serializes seed selection while parallelizing LLM work. The in-flight set is released on completion (or error). Previously: core-personality rewritten 12x, irc-regulars 10x, same node superseded 12x — concurrent agents all selected the same high-degree hub nodes. Now they'll spread across the graph. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 12:45:24 -04:00
Kent Overstreet	3fc108a251	agents: record visits eagerly to prevent concurrent collisions Move visit recording from after LLM completion to immediately after seed selection. With 15 concurrent agents, they all queried the same graph state and selected the same high-degree seeds (core-personality written 12x, irc-regulars 10x). Now the not-visited filter sees the claim before concurrent agents query. Narrows the race window from minutes (LLM call duration) to milliseconds (store load to visit write). Full elimination would require store refresh before query, but this handles the common case. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 12:29:32 -04:00
Kent Overstreet	34e74ca2c5	agents: neighborhood placeholder, organize prompt, weight-set command Add {{neighborhood}} placeholder for agent prompts: full seed node content + ranked neighbors (score = link_strength * node_weight) with smooth cutoff, minimum 10, cap 25, plus cross-links between included neighbors. Rewrite organize.agent prompt to focus on structural graph work: merging duplicates, superseding junk, calibrating weights, creating concept hubs. Add weight-set CLI command for direct node weight manipulation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 12:16:55 -04:00
Kent Overstreet	5ef9098deb	memory: fix timestamp and provenance on agent writes Two bugs: upsert_provenance didn't update node.timestamp, so history showed the original creation date for every version. And native memory tools (poc-agent dispatch) didn't set POC_PROVENANCE, so all agent writes showed provenance "manual" instead of "agent:organize" etc. Fix: set node.timestamp = now_epoch() in upsert_provenance. Thread provenance through memory::dispatch as Option<&str>, set it via .env("POC_PROVENANCE") on each subprocess Command. api.rs passes "agent:{name}" for daemon agent calls. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 12:16:45 -04:00
Kent Overstreet	f45f663dc0	tui: fix scroll by using Paragraph::line_count() Replace homegrown wrapping math (wrapped_height, wrapped_height_line, auto_scroll, force_scroll, wrapped_line_count) with ratatui's own Paragraph::line_count() which exactly matches its rendering. The old approach used ceiling division that didn't account for word wrapping, causing bottom content to be clipped. Also add terminal.clear() on resize to force full redraw — fixes the TUI rendering at old canvas size after terminal resize. Requires the unstable-rendered-line-info feature flag on ratatui. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 12:16:35 -04:00
Kent Overstreet	6d22f70192	Native memory tools + MCP server + distill agent improvements Tools: - Add native memory_render, memory_write, memory_search, memory_links, memory_link_set, memory_link_add, memory_used tools to poc-agent (tools/memory.rs) - Add MCP server (~/bin/memory-mcp.py) exposing same tools for Claude Code sessions - Wire memory tools into poc-agent dispatch and definitions - poc-memory daemon agents now use memory_* tools instead of bash poc-memory commands — no shell quoting issues Distill agent: - Rewrite distill.agent prompt: "agent of PoC's subconscious" framing, focus on synthesis and creativity over bookkeeping - Add {{neighborhood}} placeholder: full seed node content + all neighbors with content + cross-links between neighbors - Remove content truncation in prompt builder — agents need full content for quality work - Remove bag-of-words similarity suggestions — agents have tools, let them explore the graph themselves - Add api_reasoning config option (default: "high") - link-set now deduplicates — collapses duplicate links - Full tool call args in debug logs (was truncated to 80 chars) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 22:58:54 -04:00
Kent Overstreet	d9b56a02c3	Consolidate poc-memory and poc-agent configs poc-memory now reads from poc-agent's config.json5 as the primary config source. Memory-specific settings live in a "memory" section; API credentials are resolved from the shared model/backend config instead of being duplicated. - Add "memory" section to ~/.config/poc-agent/config.json5 - poc-memory config.rs: try shared config first, fall back to legacy JSONL - API fields (base_url, api_key, model) resolved via memory.agent_model -> models -> backend lookup - Add json5 dependency for proper JSON5 parsing - Update provisioning scripts: hermes -> qwen3_coder tool parser Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 21:49:58 -04:00
Kent Overstreet	4c7c3c762c	poc-memory: fix distill placeholder, show link weights in render - distill.agent: fix {{distill}} → {{nodes}} placeholder so seed nodes actually resolve - render: show link strength values in the links section, sorted by strength descending Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 20:15:08 -04:00
Kent Overstreet	377e2773bc	Add MI300X provisioning script for vllm/Qwen 3.5 27B ROCm-specific setup with: - AITER attention backends (VLLM_ROCM_USE_AITER=1) - Reduced cudagraph capture size (DeltaNet cache conflict) - BF16 model + FP8 KV cache as default (FP8 weights can be slower on MI300X due to ROCm kernel maturity) - FP8=1 flag for benchmarking FP8 model weights Key for training plan: if FP8 matmuls are slow on MI300X, the quantize-and-expand strategy needs B200 instead. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 14:40:15 -04:00
Kent Overstreet	af3171d6ec	config: hot-reload via RPC, Arc<Config> for cheap sharing Config is now stored in RwLock<Arc<Config>> instead of OnceLock<Config>. get() returns Arc<Config> (cheap clone), and reload() re-reads from disk. New RPC: "reload-config" — reloads config.jsonl without restarting the daemon. Logs the change to daemon.log. Useful for switching between API backends and claude accounts without losing in-flight tasks. New CLI: poc-memory agent daemon reload-config Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 13:41:13 -04:00
Kent Overstreet	0944ecc43f	daemon: verbose pool logging, DAEMON_POOL for run_job Store resource pool in OnceLock so run_job can pass it to Daemon::run_job for pool state logging. Verbose logging enabled via POC_MEMORY_VERBOSE=1 env var. LLM backend selection and spawn-site pool state now use verbose log level to keep daemon.log clean in production. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 11:21:30 -04:00
Kent Overstreet	49f72cdac3	Logging overhaul: per-task log files, daemon.log drill-down Switch from jobkit-daemon crate to jobkit with daemon feature. Wire up per-task log files for all daemon-spawned agent tasks. Changes: - Use jobkit::daemon:: instead of jobkit_daemon:: - All agent tasks get .log_dir() set to $data_dir/logs/ - Task log path shown in daemon status and TUI - New CLI: poc-memory agent daemon log --task NAME Finds the task's log path from status or daemon.log, tails the file - LLM backend selection logged to daemon.log via log_event - Targeted agent job names include the target key for debuggability - Logging architecture documented in doc/logging.md Two-level logging, no duplication: - daemon.log: lifecycle events with task log path for drill-down - per-task logs: full agent output via ctx.log_line() Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 11:17:07 -04:00
Kent Overstreet	f2c2c02a22	tui: fix cursor position with proper word-wrap simulation The previous approach scanned ratatui's rendered buffer to find the cursor position, but couldn't distinguish padding spaces from text spaces, causing incorrect cursor placement on wrapped lines. Replace with a word_wrap_breaks() function that computes soft line break positions by simulating ratatui's Wrap { trim: false } algorithm (break at word boundaries, fall back to character wrap for long words). cursor_visual_pos() then maps a character index to (col, row) using those break positions. Also fixes the input area height calculation to use word-wrap semantics instead of character-wrap, matching the actual Paragraph rendering. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 01:09:55 -04:00
ProofOfConcept	2e3943b89f	tui: use explicit found flag for cursor scan Clean up the break logic by using an explicit flag instead of checking cursor_x/cursor_y values.	2026-03-19 00:48:52 -04:00
ProofOfConcept	0f3edebcb3	tui: handle empty cells in cursor scan When scanning the buffer for cursor position, also check empty cells. The cursor might be positioned at an empty cell (e.g., end of line or after all visible characters).	2026-03-19 00:47:46 -04:00
ProofOfConcept	1fa298cbdd	tui: fix cursor position to use character count, not byte count self.cursor is a byte index into the string. When scanning the buffer, we need to compare character positions, not byte positions or widths. Convert self.cursor to a character count before comparing with the buffer scan. Count each non-empty cell as 1 character (the buffer already represents visual cells, so width doesn't matter here).	2026-03-19 00:46:17 -04:00
ProofOfConcept	6a7ec9732b	tui: fix cursor position calculation The cursor index is into self.input, but the rendered buffer contains the prompt prepended to the first line. Need to add prompt.len() to get the correct character position when scanning the buffer.	2026-03-19 00:45:07 -04:00
ProofOfConcept	ec79d60fbd	tui: fix cursor desync by scanning rendered buffer Instead of simulating ratatui's word wrapping algorithm, scan the rendered buffer to find the actual cursor position. This correctly handles word wrapping, unicode widths, and any other rendering nuances that ratatui applies. The old code computed wrapped_height() and cursor position based on simple character counting, which diverged from ratatui's WordWrapper that respects word boundaries. Now we render first, then walk the buffer counting visible characters until we reach self.cursor. This is O(area) but the input area is small (typically < 200 cells), so it's negligible.	2026-03-19 00:40:05 -04:00
Kent Overstreet	5308c8e3a4	tui: fix cursor desync on line wrap Use unicode display width (matching ratatui's Wrap behavior) instead of chars().count() for both wrapped_height calculation and cursor positioning. The mismatch caused the cursor to drift when input wrapped to multiple lines. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 00:30:45 -04:00
Kent Overstreet	f83325b44d	Fix poc-agent for vllm/Qwen 3.5: reasoning display, tool parser - Always display reasoning tokens regardless of reasoning_effort setting — Qwen 3.5 thinks natively and the reasoning parser separates it into its own field - Remove chat_template_kwargs that disabled thinking when reasoning_effort was "none" - Add chat_template_kwargs field to ChatRequest for vllm compat - Update provision script: qwen3_xml tool parser, qwen3 reasoning parser, 262K context, 95% GPU memory utilization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 00:06:26 -04:00
Kent Overstreet	49ccdf87e1	Add vllm provisioning script for RunPod GPU instances Sets up vllm with Qwen 2.5 27B Instruct, prefix caching enabled, Hermes tool call parser for function calling support. Configurable via environment variables (MODEL, PORT, MAX_MODEL_LEN). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 23:13:04 -04:00
Kent Overstreet	b04a98c6e5	api: singleton ApiClient, fix log closure threading Make ApiClient a process-wide singleton via OnceLock so the connection pool is reused across agent calls. Fix the sync wrapper to properly pass the caller's log closure through thread::scope instead of dropping it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 23:09:11 -04:00
Kent Overstreet	643f9890df	api: fix sync wrapper to be safe from any calling context Run the async API call on a dedicated thread with its own tokio runtime so it works whether called from a sync context or from within an existing tokio runtime (daemon). Also drops the log closure capture issue — uses a simple eprintln fallback since the closure can't cross thread boundaries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 23:07:49 -04:00
Kent Overstreet	a29b6d4c5d	Add direct API backend for agent execution When api_base_url is configured, agents call the LLM directly via OpenAI-compatible API (vllm, llama.cpp, etc.) instead of shelling out to claude CLI. Implements the full tool loop: send prompt, if tool_calls execute them and send results back, repeat until text. This enables running agents against local/remote models like Qwen-27B on a RunPod B200, with no dependency on claude CLI. Config fields: api_base_url, api_key, api_model. Falls back to claude CLI when api_base_url is not set. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 23:05:14 -04:00
Kent Overstreet	1b48e57f34	Remove jobkit-daemon from workspace members jobkit-daemon is now an external git dependency with its own repo. The local clone was only needed temporarily to fix a broken Cargo.toml in the remote. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 22:59:21 -04:00
Kent Overstreet	465c03aa11	Add find-deleted diagnostic tool Lists nodes that are currently deleted with no subsequent live version. Useful for diagnosing accidental deletions in the memory store. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 22:57:12 -04:00
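
The composite sort described in commit 3a45b6144e (each field yields a 0-1 score, is multiplied by its weight, and the products are summed) can be illustrated with a minimal sketch. This is not the repository's code: `Term`, `parse_spec`, and `composite_score` are hypothetical names, the input is the part of the spec after the `sort:` prefix, and the per-field scores are assumed to be pre-computed and passed in as a map rather than derived from a graph.

```rust
use std::collections::HashMap;

/// One `field*weight` term of a composite sort spec (hypothetical type).
#[derive(Debug, PartialEq)]
struct Term {
    field: String,
    weight: f64,
}

/// Parse a spec such as "isolation*0.7+recency(linker)*0.3".
/// Returns None if any term lacks a `*weight` suffix, has an empty
/// field name, or has a weight that does not parse as a number.
fn parse_spec(spec: &str) -> Option<Vec<Term>> {
    spec.split('+')
        .map(|term| {
            // Split at the last '*' so the field name is kept whole.
            let (field, weight) = term.rsplit_once('*')?;
            let field = field.trim();
            if field.is_empty() {
                return None;
            }
            Some(Term {
                field: field.to_string(),
                weight: weight.trim().parse().ok()?,
            })
        })
        .collect()
}

/// Weighted sum of per-field scores. Each score is assumed to lie in 0..=1;
/// a field missing from `scores` contributes 0.
fn composite_score(terms: &[Term], scores: &HashMap<String, f64>) -> f64 {
    terms
        .iter()
        .map(|t| t.weight * scores.get(&t.field).copied().unwrap_or(0.0))
        .sum()
}
```

Under these assumptions, a node with an isolation score of 1.0 and a `recency(linker)` score of 0.5 gets a composite score of 0.7 × 1.0 + 0.3 × 0.5 = 0.85 for the linker spec. Computing the per-field scores once before sorting, as the commit message describes for CompositeCache, keeps the comparator itself down to a lookup and a sum.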

590 commits