Commit graph

58 commits

Author SHA1 Message Date
Kent Overstreet
be6ba4e9a5 agent: bundle sampling fields as SamplingParams on AgentState
Collapse the split `temperature` / `top_p` / `top_k` fields on
AgentState into a single `sampling: SamplingParams` struct, mirroring
how the wire-level fields flow into the Generate RPC. Adds
`max_tokens` to SamplingParams so it's actually plumbed end to end
(previously the client had a hardcoded 4096 fallback inside
`run_session_generate`).

AgentState construction sites now set `sampling: SamplingParams { ...
max_tokens: 4096 }` as the default. The assignment sites in
oneshot.rs / subconscious.rs / unconscious.rs switch from
`st.temperature = X` to `st.sampling.temperature = X`.

`stream_session_mm` takes `SamplingParams` directly; the
`sampling_max_tokens()` helper goes away. `pb::GenerateRequest` is
populated with `sampling.max_tokens` (and the other fields) in
`run_session_generate`. SamplingParams is `pub` so it can be
embedded in the public AgentState without a visibility warning.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 12:37:20 -04:00
Kent Overstreet
8d9c9e9f7b agent: end-to-end gRPC Generate with delta-based session orchestration
Wires the client side of the new salience protocol so inference
actually runs over gRPC instead of emitting the stubbed "not yet
wired" error. Each turn walks the AST as interleaved chunks, sends
only what's new to the server, and streams decode tokens back.

context.rs:
  * `WireChunk` enum: `Tokens(Vec<u32>)` or `Image { bytes, mime,
    known_expanded_len }`. Preserves text/image/text ordering the
    wire path can't flatten.
  * `wire_chunks(range, skip)` walker, parallel to `wire_prompt` —
    branches emit `<|im_start|>…<|im_end|>` tokens, image leaves
    emit a single Image chunk (no inline vision tokens).
  * `NodeLeaf::set_image_token_count(n)` + recompute of cached
    `token_ids`; `ContextState::commit_image_token_counts(&[u32])`
    fills in the first-N zero-count image leaves in wire order.
  * `ResponseParser::run` handles the new
    `StreamToken::ImageAppended` by committing the server's N into
    the AST before the final Generate's Token events stream in.

salience.rs:
  * `SessionHandle` tracks `committed_len`. `append_image` advances
    it from the RPC response. New `generate(req)` opens the
    server-streaming RPC.

api/mod.rs:
  * `stream_session_mm(session_lock, chunks, sampling, priority,
    readout_shape)` replaces the stub. Spawns `run_session_generate`.
  * `run_session_generate`: takes the session out of the Mutex (or
    opens fresh), skips chunks covered by `committed_len` (bails on
    mid-chunk straddle or unknown-length image in the committed
    prefix), walks the delta: accumulates Tokens into `pending`, on
    Image flushes pending via `flush_pending` (max_tokens=0 Generate
    that just prefills), then AppendImage + emits
    StreamToken::ImageAppended. Final Generate carries any trailing
    pending text as `append_tokens` and the sampling params; Token
    events stream out as StreamToken::Token, Done as
    StreamToken::Done. On success, handle with updated
    `committed_len` returns to the Mutex; on error, handle drops
    and next call reopens.
  * `StreamToken::ImageAppended { placeholder_count }` variant —
    emitted in wire order before the final Generate's tokens.
  * Prefix-cache cap for readout coverage: `readout_ranges` covers
    `[prompt_len_after_append, u32::MAX)` when the caller provides
    a readout_shape, so decode positions stream their readouts.

agent/mod.rs:
  * `assemble_prompt` returns `Vec<WireChunk>` with the assistant
    prologue merged into the trailing Tokens chunk. Caller in
    `turn` passes chunks + readout_shape (pulled from
    `agent.readout.lock().manifest`) to `stream_session_mm`.
  * Dropped `assemble_prompt_tokens` — dead.

mind + unconscious:
  * `Unconscious::new(client)` stores a shared `ApiClient`. Fixes
    the repeated-manifest-fetch bug caused by each subagent's
    `ApiClient::new` having its own OnceCell. The client's Arc-
    wrapped manifest cache is now shared across every agent Mind
    spawns.
  * `prepare_spawn(name, auto, wake, base_client)` clones the base
    client and overrides `.model` for the resolved backend instead
    of constructing fresh. All three callers
    (`toggle`/`trigger`/unconscious loop) pass `self.client.clone()`.
  * `Mind::new` passes `agent.client.clone()` into
    `Unconscious::new`.

subconscious/generate.rs:
  * gen_continuation switched to `wire_chunks` + the new
    `stream_session_mm` signature. Ephemeral session opens on each
    call, tears down at scope end. No readouts requested.

Not changed yet, noted for follow-up:
  * Subconscious ablation scoring in learn.rs still talks to
    `/v1/score` over HTTP. Will migrate once we have time to verify
    the Generate+max_tokens=0+prompt_logprobs path end-to-end.
  * compare.rs constructs its own ApiClient for the
    `compare.test_backend` (which is intentionally a different
    endpoint) — left alone.
  * Readout manifest still fetched via HTTP at Agent::new.
    Migration to GetReadoutManifest gRPC is a separate cleanup.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 12:27:55 -04:00
Kent Overstreet
08213f9093 salience: add gRPC client + TLS plumbing for stateful vllm sessions
Adds the client-side of a stateful gRPC protocol against vllm, plus
the TLS trust machinery so we can talk to self-signed vllm servers.

Protocol (proto/salience.proto):
  Bidi-streaming Session RPC carries OpenSession / AppendTokens /
  Generate / Cancel from client and SessionReady / PrefillProgress /
  Token / GenerateDone / Error from server. Separate Fork unary RPC
  for cheap branching (prefix cache shares KV automatically). Plus
  ListSessions, CloseSession, GetReadoutManifest admin RPCs.

  Per-token readouts ship as packed f32 ([n_layers * n_concepts] per
  token, flat). Logprobs use range-selected positions plus a top-k
  parameter — empty ranges means no logprobs, any range means emit
  sampled-token logprob at those positions, top_k > 0 adds
  alternatives.

Client (src/agent/api/salience.rs):
  Tonic-generated types under pb::, a connect() helper, with_auth()
  for bearer metadata, and a Session handle wrapping the bidi stream:
  open() handshakes SessionReady; append() is fire-and-forget;
  generate() returns impl Stream<Item = Event> that drains inbound
  until Done or terminating Error. One generate at a time per session.

Peak picker (src/agent/salience.rs):
  Pure function over ReadoutEntry traces. Per-concept z-score against
  trace global stats; contiguous above-threshold regions emit one
  peak at the local max. Configurable sigma threshold and min-std
  safety floor. Deterministic tie-break on offset then concept name.
  12 unit tests covering empty traces, flat channels, single/multi
  spikes, contiguous humps, multi-concept independence, trailing
  runs, sub-threshold noise, layer-out-of-range, manifest shape
  mismatch, and threshold tunability.

TLS (src/agent/api/http.rs):
  HttpClient::build now also loads every .pem file under
  ~/.consciousness/certs/ into the rustls root store — so dropping
  a <host>.pem in that directory is enough to trust a new self-
  signed server; no code changes per new host. Also installs the
  rustls default crypto provider explicitly via OnceLock: tonic's
  tls features pulled in both ring and aws-lc-rs on the resolver
  path, and rustls 0.23 refuses to auto-pick when either could win.

Build (build.rs, Cargo.toml):
  tonic-build generates Rust types from proto/salience.proto at
  cargo-build time, using a vendored protoc binary
  (protoc-bin-vendored) so no system install is required. New
  runtime deps: tonic, prost, async-stream, tokio-stream,
  rustls-pemfile.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:56:32 -04:00
Kent Overstreet
0f1c4cf1de agent/api: carry readout alongside streamed tokens
StreamToken::Token is now a struct variant with an optional
TokenReadout (shape [n_layers][n_concepts]) per token — parsed from
the vLLM completion response's choices[i].readout field when the
server has readout enabled.

ApiClient gains a fetch_readout_manifest() method that hits
GET /v1/readout/manifest. Returns Ok(None) on 404 (server has
readout disabled), so callers can gracefully fall back when pointed
at a non-readout-enabled endpoint.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 01:15:46 -04:00
Kent Overstreet
43e06daa5b cleanup: drop dead ApiClient::stream_completion wrapper, silence dmn_tick
stream_completion was a thin wrapper around stream_completion_mm (just
passing an empty image list); the last caller switched to _mm directly
when learn's generate_alternate gained image support. Delete the
wrapper — callers can pass `&[]` if they have no images.

MindState::dmn_tick has been sitting unused (called only from a
commented-out block in the Mind loop). Rename to _dmn_tick so the
compiler stops warning; Kent may uncomment the call path later.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-17 16:23:59 -04:00
Kent Overstreet
204ba5570a agent: send images as multi_modal_data on completion requests
Split the prompt assembly into two forms: the AST keeps the
fully-expanded representation (N image_pads per image, for accurate
context budget accounting), while the request wire form collapses
each image to a single <|image_pad|> bookended by vision_start/end
and ships the raw bytes out-of-band as a base64 data URI in a new
`multi_modal_data.image` field on /v1/completions.

vLLM's Qwen3VL processor uses PromptReplacement with target=single
<|image_pad|> and replacement=N image_pads, so the wire-form matches
what the processor expects and it re-expands to N server-side.

Server side needs /v1/completions to accept multi_modal_data for
this to land images end-to-end — that's the next piece.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 18:08:26 -04:00
Kent Overstreet
78912ca72f simplify http library 2026-04-11 16:45:54 -04:00
Kent Overstreet
0af97774f4 Parsing fixes
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-09 16:42:16 -04:00
Kent Overstreet
fba8fcc587 Fix UTF-8 slicing panics: use floor_char_boundary for all truncation
Byte-position truncation (&s[..s.len().min(N)]) panics when position
N lands inside a multi-byte character. Fixed in parser debug logging,
API error messages, oneshot response logging, and CLI agent display.

Also fixed tool dispatch permissions (removed global fallback).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 19:35:01 -04:00
Kent Overstreet
14fd8c9b90 Clean up warnings: StreamToken pub, dead oneshot code, SkipIndex
Made StreamToken pub (was pub(crate), needed by context.rs).
Removed dead API_CLIENT, get_client, sampling/priority fields
from oneshot. Suppressed pre-existing SkipIndex warning in learn.rs.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 16:35:57 -04:00
Kent Overstreet
2c401e24d6 Parser consumes stream directly, yields tool calls via channel
ResponseParser::run() spawns a task that reads StreamTokens, parses
into the AST (locking context per token), and sends PendingToolCalls
through a channel. Returns (tool_rx, JoinHandle<Result>) — the turn
loop dispatches tool calls and awaits the handle for error checking.

Token IDs from vLLM are accumulated alongside text and stored directly
on AST leaves — no local re-encoding on the response path.

The turn loop no longer matches on individual stream events. It just
reads tool calls and dispatches them.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 16:32:00 -04:00
Kent Overstreet
22146156d4 Collapse API layer: inline openai.rs, delete types.rs and parsing.rs
API is now two files: mod.rs (430 lines) and http.rs. Contains:
Usage, StreamToken, SamplingParams, ApiClient, stream_completions,
SseReader, send_and_check. Everything else is dead and gone.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 15:15:21 -04:00
Kent Overstreet
9bb626f18c Strip api/types.rs to just Usage
Killed Message, Role, ToolCall, FunctionCall, MessageContent,
ContentPart, ImageUrl — all dead. types.rs is now 8 lines.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 15:12:28 -04:00
Kent Overstreet
39e6ae350d Kill dead API types: ChatRequest, ChatCompletionChunk, Delta, streaming types
Removed all chat completions wire types that are no longer used:
ChatRequest, ReasoningConfig, ChatCompletionChunk, ChunkChoice,
Delta, FunctionCallDelta, ToolCallDelta, append_content, user_with_images.

Remaining types in api/types.rs are transitional (Message, ToolCall, etc.)
— they'll go away as outer callers migrate to AstNode.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 15:08:41 -04:00
Kent Overstreet
1e5cd0dd3f Kill dead API code: stream_events, parsing.rs, build_response_message, log_diagnostics
Deleted: api/parsing.rs entirely (parsing now in context_new.rs),
stream_events (chat completions path), collect_stream, build_response_message,
log_diagnostics, tools_to_json_str, start_stream, chat_completion_stream_temp.

API layer is now just: stream_completion (token IDs in/out), SseReader,
send_and_check, and types. Zero errors in api/.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 15:06:33 -04:00
Kent Overstreet
48db4a42cc WIP: Kill chat API path — StreamEvent, collect_stream, build_response_message
Removed start_stream, chat_completion_stream_temp, collect_stream,
StreamResult, build_response_message. All streaming goes through
stream_completion → StreamToken now. ConversationLog rewritten
for AstNode serialization.

Remaining: openai.rs stream_events, mind/, user/, oneshot, learn.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 15:01:42 -04:00
Kent Overstreet
9c79d7a037 WIP: Wiring context_new into agent — turn loop, StreamToken, dead code removal
Work in progress. New turn loop uses ResponseParser + StreamToken.
Killed StreamEvent, append_streaming, finalize_streaming, streaming_index,
assemble_api_messages, working_stack. Many methods still reference old
types — fixing next.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 14:55:10 -04:00
Kent Overstreet
6730d136d4 ContextState + private AstNode fields: enforce token_ids invariant
AstNode fields are now private with read-only accessors. All mutation
goes through ContextState methods (push, set_message, set_score, del)
which guarantee token_ids stays in sync with text on every leaf.

Also fix ResponseParser to use AstNode::tool_call() constructor,
widen parsing module visibility to pub(crate).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 12:58:59 -04:00
Kent Overstreet
f458af6dec Add /v1/completions streaming path with raw token IDs
New stream_completions() in openai.rs sends prompt as token IDs to
the completions endpoint instead of JSON messages to chat/completions.
Handles <think> tags in the response (split into Reasoning events)
and stops on <|im_end|> token.

start_stream_completions() on ApiClient provides the same interface
as start_stream() but takes token IDs instead of Messages.

The turn loop in Agent::turn() uses completions when the tokenizer
is initialized, falling back to the chat API otherwise. This allows
gradual migration — consciousness uses completions (Qwen tokenizer),
Claude Code hook still uses chat API (Anthropic).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 11:42:22 -04:00
Kent Overstreet
7ecc50d2e4 Capture reasoning/thinking from API stream into Thinking entries
StreamResult now includes accumulated reasoning text. After each
stream completes, if reasoning was produced, a Thinking entry is
pushed to the conversation before the response message.

Reasoning content is visible in the context tree UI but not sent
back to the API and doesn't count against the token budget.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 22:49:35 -04:00
Kent Overstreet
9737641c86 Fix build warnings across workspace
- Remove redundant token fields from StreamEvent::Finished (data
  already delivered via Usage event)
- Remove dead hotkey_adjust_sampling, MAX_HISTORY, now()
- Fix unused variable warnings (delta, log)
- Suppress deserialization-only field warnings (jsonrpc, role)
- Make start_stream/chat_completion_stream_temp pub(crate)
- Remove unnecessary pub(crate) re-export of internal types

Remaining warnings are TODO items: SkipIndex (scoring not wired),
notify (MCP notifications not wired).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 13:55:30 -04:00
Kent Overstreet
c64295ddb2 Reduce pub visibility in agent::api and user modules
api/: parsing module private, SamplingParams/StreamEvent/StreamResult/
AbortOnDrop/build_response_message/collect_stream to pub(crate).
Internal types (ChatRequest, ChunkChoice, Delta, etc.) to pub(crate).
StreamResult fields to pub(crate). Parsing functions to pub(super).

user/: context, subconscious, unconscious, thalamus modules private
(only chat needs pub(crate) for mind/ access).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 13:43:25 -04:00
Kent Overstreet
f33b1767da Restrict API types visibility — types module is now private
Only Message, Role, MessageContent, ContentPart, ToolCall,
FunctionCall, Usage, ImageUrl are pub-exported from agent::api.

Internal types (ChatRequest, ChatCompletionChunk, ChunkChoice,
Delta, ReasoningConfig, ToolCallDelta, FunctionCallDelta) are
pub(crate) — invisible outside the crate.

All callers updated to import from agent::api:: instead of
agent::api::types::.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 13:39:20 -04:00
Kent Overstreet
39965556dd Remove dead code: scan_pid_files, backend_label, entries_mut, post_json
All confirmed unused anywhere in src/ or channels/.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 13:25:18 -04:00
Kent Overstreet
1cf4f504c0 Kill reqwest — minimal HTTP client on raw hyper + tokio-rustls
New src/agent/api/http.rs: ~240 lines, supports GET/POST, JSON/form
bodies, SSE streaming via chunk(), TLS via rustls. No tracing dep.

Removes reqwest from the main crate and telegram channel crate.
Cargo.lock drops ~900 lines of transitive dependencies.

tracing now only pulled in by tui-markdown.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 12:50:40 -04:00
Kent Overstreet
da24e02159 fix: prevent assistant message duplication during tool calls
- Fix sync logic to only break at matching assistant messages
- When assistant message changes (streaming → final), properly pop and re-display
- Add debug logging for sync operations (can be removed later)

The bug: when tool calls split an assistant response into multiple entries,
the sync logic was breaking at the assistant even when it didn't match,
causing the old display to remain while new entries were added on top.

The fix: only break at assistant if matches=true, ensuring changed entries
are properly popped before re-adding.

Co-Authored-By: ProofOfConcept <poc@bcachefs.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-07 00:28:39 -04:00
ProofOfConcept
f390fa1617 Delete ui_channel.rs — relocate types, remove all UiMessage/UiSender plumbing
Types relocated:
- StreamTarget → mind/mod.rs (Mind decides Conversation vs Autonomous)
- SharedActiveTools + shared_active_tools() → agent/tools/mod.rs
- ContextSection + SharedContextState → agent/context.rs (already there)
- StatusInfo + ContextInfo → user/mod.rs (UI display state)

Removed UiSender from: Agent::turn, Mind, learn.rs, all function signatures.
The entire message-passing layer is gone. All state flows through
Agent fields (activities, entries, streaming) read by the UI via try_lock.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-05 22:34:48 -04:00
ProofOfConcept
cfddb55ed9 Kill TextDelta, Info — UiMessage is dead. RAII ActivityGuards replace all status feedback
Streaming text now goes directly to agent entries via append_streaming().
sync_from_agent diffs the growing entry each tick. The streaming entry
is popped when the response completes; build_response_message pushes
the final version.

All status feedback uses RAII ActivityGuards:
- push_activity() for long-running work (thinking, streaming, scoring)
- notify() for instant feedback (compacted, DMN state changes, commands)
- Guards auto-remove on Drop, appending "(complete)" and lingering 5s
- expire_activities() cleans up timed-out notifications on render tick

UiMessage enum reduced to a single Info variant with zero sends.
The channel infrastructure remains for now (Mind/Agent still take
UiSender in signatures) — mechanical cleanup for a follow-up.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-05 22:18:07 -04:00
ProofOfConcept
e7914e3d58 Kill Reasoning, Debug, Activity variants — read status from Agent directly
Reasoning tokens: dropped for now, will land in context entries later.
Debug sends: converted to dbglog! macro (writes to debug.log).
Activity: now a field on Agent, set directly, read by UI via try_lock.
score_memories_incremental takes agent Arc for activity writes.

UiMessage down to 2 variants: TextDelta, Info.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-05 21:45:55 -04:00
ProofOfConcept
eafc2887a3 Kill StatusUpdate, Activity, DmnAnnotation, ContextInfoUpdate, AgentUpdate
Status bar reads directly from Agent and MindState on each render tick.
Activity is now a field on Agent — set by agent code directly, read by
UI via try_lock. DmnAnnotation, ContextInfoUpdate, AgentUpdate were
already dead (no senders).

UiMessage down to 4 variants: TextDelta, Reasoning, Debug, Info.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-05 21:34:27 -04:00
ProofOfConcept
1745e03550 Kill UiMessage variants replaced by state-driven rendering
sync_from_agent reads directly from agent entries, so remove:
- UserInput (pending input shown from MindState.input)
- ToolCall, ToolResult (shown from entries on completion)
- ToolStarted, ToolFinished (replaced by shared active_tools)
- replay_session_to_ui (sync_from_agent handles replay)

-139 lines. Remaining variants are streaming (TextDelta, Reasoning),
status bar state, or ephemeral UI messages (Info, Debug).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-05 21:21:08 -04:00
ProofOfConcept
48beb8b663 Revert to tokio::sync::Mutex, fix lock-across-await bugs, move input ownership to InteractScreen
The std::sync::Mutex detour caught every place a MutexGuard lived
across an await point in Agent::turn — the compiler enforced Send
safety that tokio::sync::Mutex silently allows. With those fixed,
switch back to tokio::sync::Mutex (std::sync blocks tokio worker
threads and panics inside the runtime).

Input and command dispatch now live in InteractScreen (chat.rs):
- Enter pushes directly to SharedMindState.input (no app.submitted hop)
- sync_from_agent displays pending input with dimmed color
- Slash command table moved from event_loop.rs to chat.rs
- cmd_switch_model kept as pub fn for tool-initiated switches

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-05 21:13:48 -04:00
Kent Overstreet
3e1be4d353 agent: switch from tokio::sync::Mutex to std::sync::Mutex
The agent lock is never held across await points — turns lock briefly,
do work, drop, then do async API calls. std::sync::Mutex works and
can be locked from sync contexts (screen tick inside terminal.draw).

Fixes: blocking_lock() panic when called inside tokio runtime.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-05 20:08:25 -04:00
Kent Overstreet
222b2cbeb2 chat: PartialEq on ConversationEntry for proper diff
Add PartialEq to Message, FunctionCall, ToolCall, ContentPart,
ImageUrl, MessageContent, ConversationEntry. Sync now compares
entries directly instead of content lengths.

Phase 1 pops mismatched tail entries using PartialEq comparison.
Phase 2 pushes new entries with clone into last_entries buffer.

TODO: route_entry needs to handle multiple tool calls per entry.

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-05 19:41:16 -04:00
Kent Overstreet
7a1e580b95 drop dead ChatResponse 2026-04-05 01:18:47 -04:00
ProofOfConcept
a14e85afe1 api: extract collect_stream() from agent turn loop
Move the entire stream event processing loop (content accumulation,
leaked tool call detection/dispatch, ToolCallDelta assembly, UI
forwarding, display buffering) into api::collect_stream(). The turn
loop now calls collect_stream() and processes the StreamResult.

Also move FunctionCall, ToolCall, ToolCallDelta to api/types.rs where
they belong (API wire format, not tool definitions). Move parsing.rs
to api/parsing.rs.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-04 18:19:21 -04:00
ProofOfConcept
6845644f7b api: move wire types and parsing to api module
Move FunctionCall, FunctionCallDelta, ToolCall, ToolCallDelta from
tools/mod.rs to api/types.rs — these are API wire format, not tool
definitions. Re-export from tools for existing callers.

Move parsing.rs to api/parsing.rs — leaked tool call parsing is API
plumbing.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-04 18:19:21 -04:00
ProofOfConcept
51e632c997 tools: delete ToolDef and FunctionDef
ToolDef and FunctionDef are gone. Tool definitions are static strings
on the Tool struct. The API layer builds JSON from Tool::to_json().

- ChatRequest.tools is now Option<serde_json::Value>
- start_stream takes &[Tool] instead of Option<&[ToolDef]>
- openai::stream_events takes &serde_json::Value for tools
- memory_and_journal_tools() returns Vec<Tool> for subconscious agents
- Subconscious agents filter by t.name instead of t.function.name

No more runtime JSON construction for tool definitions.
No more ToolDef::new(). No more FunctionDef.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-04 18:19:21 -04:00
ProofOfConcept
aa7511d110 cleanup: remove unused imports from refactoring
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-04 18:19:21 -04:00
ProofOfConcept
dd009742ef agent: add sampling parameters (temperature, top_p, top_k)
Move temperature from a per-call parameter to an Agent field,
add top_p and top_k. All three are sent to the API via a new
SamplingParams struct, displayed on the F5 thalamus screen.

Defaults: temperature=0.6, top_p=0.95, top_k=20 (Qwen3.5 defaults).

Also adds top_p and top_k to ChatRequest so they're sent in the
API payload. Previously only temperature was sent.

UI controls for adjusting these at runtime are not yet implemented.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-04 18:19:21 -04:00
Kent Overstreet
79e384f005 split out src/mind 2026-04-04 02:46:32 -04:00
Kent Overstreet
9bebbcb635 Move API code from user/ to agent/
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-04 00:34:48 -04:00
Kent Overstreet
14dd8d22af Rename agent/ to user/ and poc-agent binary to consciousness
Mechanical rename: src/agent/ -> src/user/, all crate::agent:: ->
crate::user:: references updated. Binary poc-agent renamed to
consciousness with CLI name and user-facing strings updated.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-03 17:25:59 -04:00
Kent Overstreet
c01d4a5b08 wire up /score command and debug screen for memory importance
/score snapshots the context and client, releases the agent lock,
runs scoring in background. Only one score task at a time
(scoring_in_flight flag). Results stored on Agent and shown on
the F10 context debug screen with importance scores per memory.

ApiClient derives Clone. ContextState derives Clone.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-02 22:21:31 -04:00
Kent Overstreet
df9b610c7f add memory importance scoring via prompt logprobs
score_memories() drops each memory from the context one at a time,
runs prompt_logprobs against the full conversation, and builds a
divergence matrix: memories × responses.

Row sums = memory importance (for graph weight updates)
Column sums = response memory-dependence (training candidates)

Uses vLLM's prompt_logprobs to check "would the model have said
this without this memory?" — one forward pass per memory, all
responses scored at once. ~3s per memory on B200.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-02 22:13:55 -04:00
Kent Overstreet
5b92b59b17 move failed request logs to their own subdirectory
~/.consciousness/logs/failed-requests/ instead of cluttering
the main logs directory.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-02 19:28:56 -04:00
Kent Overstreet
3b80af2997 log buffer contents on stream errors and timeouts
Show chunks received, SSE lines parsed, and the contents of
the line buffer (up to 500 bytes) on both stream errors and
timeouts. This tells us whether we got partial data, a non-SSE
response, or truly nothing from the server.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-02 18:49:33 -04:00
Kent Overstreet
156626ae53 configurable stream timeout, show per-call timer in status bar
Stream chunk timeout is now api_stream_timeout_secs in config
(default 60s). Status bar shows total turn time and per-call
time with timeout: "thinking... 45s, 12/60s".

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-02 18:46:27 -04:00
Kent Overstreet
13d9cc962e abort orphaned stream tasks on drop, reduce timeout to 60s
Spawned streaming tasks were never cancelled when a turn ended or
retried, leaving zombie tasks blocked on dead vLLM connections.
AbortOnDrop wrapper aborts the task when it goes out of scope.

Chunk timeout reduced from 120s to 60s.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-02 18:41:02 -04:00
Kent Overstreet
91eb9c95cc delete 20 dead public functions across 12 files
Removed functions with zero callers: parse_timestamp_to_epoch,
hash_key, search_weighted_debug, extract_query_terms, format_results,
move_to_neighbor, adjust_edge_strength, update_graph_metrics,
nearest_to_seeds, nystrom_project, chat_completion_stream, cmd_read,
context_message, split_candidates, split_plan_prompt,
split_extract_prompt, log_event_pub, log_verbose, rpc_record_hits,
memory_definitions. -245 lines.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-02 16:21:01 -04:00