Compare commits

..

124 commits

Author SHA1 Message Date
Kent Overstreet
d8ff7aacc7 consciousness: cache expensive graph metrics 2026-06-15 21:16:40 -05:00
Kent Overstreet
b9f093247d Seed default identity nodes during init 2026-06-15 16:10:02 -05:00
Kent Overstreet
b4a8f08b9b Improve hook agent diagnostics 2026-06-15 15:47:59 -05:00
Kent Overstreet
e6f9f062f0 Drop Send bound from TUI screens 2026-06-15 13:51:22 -05:00
Kent Overstreet
341955c144 Add transcript tail debug command 2026-06-15 13:51:22 -05:00
Kent Overstreet
78b4bbd5bb Split conversation transcript parsing 2026-06-15 13:51:22 -05:00
Kent Overstreet
f6a6e3066c flake 2026-06-15 13:51:22 -05:00
Kent Overstreet
156b6863f4 Gate nightly diagnostics behind feature 2026-06-15 13:51:22 -05:00
Kent Overstreet
25e4775974 enable tls 2026-05-22 13:02:42 -04:00
Kent Overstreet
6e3bacb182 channel-tmux: resolve pane ids by label, don't persist them
tmux pane ids (%6 etc.) are ephemeral — recycled across pane and
tmux-server restarts. The daemon persisted the id in tmux.json5 and
kept reusing it, so after a restart a channel would attach to whatever
unrelated pane had since inherited that id. (Live: ktest's stored %6
had become a claude pane; the real ktest pane was %10.)

Persist only the label — the pane title / window name, which is
stable. pipe_pane_reader() is now a connect-retry loop: each attempt,
connect_and_stream() resolves the live id with find_pane_by_name(); the
loop retries until the pane exists and pipe-pane succeeds, and
reconnects the same way if the pipe later drops. send() resolves the id
at send time; open() just registers the label and lets the reader find
it.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-05-22 12:26:05 -04:00
Kent Overstreet
190eb50ed9 telegram: bound photo download to 60s
HttpClient::request_timeout only covers send_request, not body collect,
so a stuck download would otherwise stall the entire long-poll loop
indefinitely. tokio::time::timeout at the call site keeps the failure
contained — a slow/dead download surfaces as the same [image: download
failed: ...] marker as any other error.

60s is generous for the 1-5MB photos Kent typically sends; Telegram's
bot getFile cap is 20MB, which would still complete on most connections.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-05-01 18:56:03 -04:00
Kent Overstreet
713bb07729 bin: add ch — minimal channel CLI (send/recv)
Speaks the channel.capnp protocol over the per-daemon Unix socket at
~/.consciousness/channels/<top>.sock. Useful for ad-hoc sends from
shell, tests, and out-of-process tools that don't want to embed a
capnp client.

  ch send <channel> <message...>
  ch recv <channel> [--all-new] [--min-count N]

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-05-01 18:16:21 -04:00
Kent Overstreet
c303653dd0 telegram: bridge photos via [image: <path>] markers
When an incoming update has a photo array, pick the largest size,
resolve the file_id via getFile, and download to
~/.consciousness/channels/telegram.logs/media/<file_id>.<ext>. The
message line surfaced to the channel is

    [image: /abs/path/to/file.jpg]
    <caption if any>

so a multimodal Read on the path works end-to-end. On download
failure we still surface the caption with an [image: download
failed: ...] marker so context isn't lost.

Other media types (voice/video/sticker/etc.) log a one-line "skipping"
notice — easy hook to extend later. The media/ dir was already being
created at startup; this fills in the rest.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-05-01 17:58:43 -04:00
Kent Overstreet
a075e30557 http: add HttpResponse::bytes() for binary downloads
Mirror of text(), but returns raw Bytes without lossy UTF-8 conversion.
Needed by the Telegram channel to fetch photo files.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-05-01 17:58:35 -04:00
Kent Overstreet
91c8451f5c user: fix hotkey_cycle_reasoning after lock_blocking revert
The revert at 09896cd dropped the try_lock() wrapper but left an extra
closing brace and the async-call site still un-awaited, leaving the tree
unbuildable. Re-flow the function body to match the new signature.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-05-01 17:58:32 -04:00
Kent Overstreet
09896cd38b Revert "replace try_lock() with lock_blocking() across UI thread"
This reverts commit 4225294d16.
2026-04-25 17:15:53 -04:00
Kent Overstreet
4225294d16 replace try_lock() with lock_blocking() across UI thread
Add lock_blocking() to TrackedMutex: blocks current thread using
block_in_place + futures::executor::block_on, safe for sync contexts.

Replace all try_lock() calls with lock_blocking() in slash commands,
UI rendering, and status reads. Lock hold times are fast enough that
blocking briefly is fine, and this eliminates the spurious 'lock
unavailable' paths that were never actually hit.

Kept rx_mutex.try_lock() in mod.rs (std::sync::Mutex for stderr rx).
2026-04-25 15:35:14 -04:00
Kent Overstreet
5210f7dd66 context: heal pre-refactor image logs with token_count=0
Recompute image token counts from persisted dimensions when loading
old logs that stored count=0 (server-authoritative count was applied
after AppendImage before client-side pad expansion).

graph: cache neighbor sets for clustering coefficient

Pre-compute neighbor HashSets so the O(deg^2) triangle-counting
inner loop doesn't re-allocate on every (i,j) pair. avg_clustering_
coefficient() now builds the cache once instead of O(N*deg) times.
2026-04-25 15:15:21 -04:00
Kent Overstreet
371b40078d context: salvage in-flight tag accumulators on premature stream end
ResponseParser.finish() was only flushing self.buf — the rolling tail
window — and silently dropping self.think_buf and self.tool_call_buf.
When a stream ended inside an unterminated <think>...</think> or
<tool_call>...</tool_call> block (max_tokens reached, EOS before the
close tag, server-side cancel), all the accumulated in-tag content
was discarded and only the trailing ~8 bytes survived (drain_safe
keeps `close_tag.len()` bytes at the tail of buf to handle
across-chunk tag splits — and `</think>` is exactly 8 chars).

Symptom: assistant responses cut off, only the last few characters
come through. Especially severe in native-think mode where in_think
is set from prefill, so the entire response accumulates in
think_buf and gets wiped on premature stop.

In finish(): if in_think, drain buf into think_buf and emit as a
Thinking node (preserving the partial thought). If in_tool_call,
attempt to parse the body; on parse failure, wrap the partial as
content with the leading <tool_call> open tag so the model sees its
own truncated attempt next turn rather than losing it.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 23:32:44 -04:00
Kent Overstreet
c2433c1773 context: tighten the Branch token-cache invariant
Two pieces around the cache that landed when Branch nodes started
holding `token_ids: Some(server_authoritative_stream)`:

1. wire_into / wire_chunks now pair cached vision blocks with their
   child Image leaves. Previously the cached-branch arm spliced the
   cache verbatim and didn't recurse for images, so a Branch whose
   cache contained `VISION_START..VISION_END` blocks would emit those
   tokens with no matching `WireImage` push — leading to a panic
   downstream when `pair_images_to_ranges` tried to attach the
   missing image. New `pair_cached_images` walks the children
   depth-first for image leaves and zips them against
   `vision_blocks(cache)` to emit correctly-offset entries; mismatched
   counts panic loudly because that's an AST/cache invariant
   violation that would otherwise mis-pair on the wire.

2. `conversation_mut() -> &mut Vec<AstNode>` was the one public
   escape hatch that let callers reach into a Branch's children and
   mutate them without invalidating the cached token stream. Removed
   in favor of a focused `set_branch_memory_score(section, index,
   key, score)` for the only legitimate use we had today (the
   full-matrix scorer writing per-memory divergence onto the
   Assistant Branch). Updated the lone caller in subconscious/learn.

Documented the invariants explicitly on `ContextState`: every
`Leaf.token_ids` matches `body.compute_token_ids()`, and every
`Branch { token_ids: Some(_) }` is a faithful walk of its children.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 23:15:55 -04:00
Kent Overstreet
006b99bdac bin: enable panic backtraces by default
stderr is redirected to ~/.consciousness/logs/tui-stderr.log via
redirect_stderr_to_pipe(), but the default panic hook checks
RUST_BACKTRACE before printing the trace; without the env var the
log only catches the "note: run with \`RUST_BACKTRACE=full\`" tail
and the actual frames are dropped.

Set RUST_BACKTRACE=1 programmatically before any other thread spawns
so the log captures the trace by default. Existing user-set value is
respected so callers can still opt into "full" if they want.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 22:44:19 -04:00
Kent Overstreet
10c8878f1c agent: bump tonic gRPC message caps to 64 MiB
The default 4 MiB cap on encoded/decoded messages is too small for
the multimodal Generate path: Qwen3.6-VL high-res patches put 5–8 MiB
of pre-encoded image bytes inline in a single Generate request, and
Done events carrying full per-token readout vectors can also exceed
4 MiB on long runs. Hit "ResourceExhausted: Received message larger
than max (5799108 vs. 4194304)" from the salience server.

Bump both encode and decode caps on every cloned SalienceClient. The
matching server-side bump is in vllm/entrypoints/salience/server.py.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 22:36:10 -04:00
Kent Overstreet
11a7e4043e scripts: FP8 quantize Qwen3.6-27B for vLLM (multimodal + MTP)
Quantization recipe targeting the multimodal Qwen3.6-27B for vLLM
serving. Three pitfalls the script avoids, each documented inline:

1. Loader strip: `AutoModelForCausalLM` silently drops the vision
   tower; we load via the config-declared
   `Qwen3_5ForConditionalGeneration` instead.

2. Pattern anchor: llmcompressor matches the `ignore` list against
   module names (no `.weight` suffix) when walking `named_modules()`,
   not against full tensor names. Patterns now anchor on `$` at the
   module name; the earlier `\.weight$` form silently quantized
   lm_head and every linear_attn projection.

3. vLLM fusion: vLLM fuses {q,k,v}_proj into qkv_proj, gate+up into
   gate_up_proj, and in_proj_qkv+in_proj_z into in_proj_qkvz. The
   compressed_tensors loader rejects mixed schemes within a fused
   layer, so the `ignore` list is shaped to keep all sub-components
   of a fused layer consistent.

After `oneshot()` writes the FP8 output, MTP tensors (which the HF
class doesn't expose) are spliced in at BF16 from the upstream cached
snapshot, with the compressed_tensors metadata header preserved.

Recipe follows Unsloth's UD-Q8_K_XL late-stack overrides (FFN: 50,
51, 59, 62, 63; ATTN: 51, 59, 63), extended to include `v_proj` for
fusion compat. Final checkpoint is ~35 GB (matches Unsloth's GGUF
size to within ~1%) with vision tower BF16, MTP head BF16, and most
mlp/self_attn Linears at FP8_DYNAMIC.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 22:15:31 -04:00
Kent Overstreet
fe232cf292 salience: client-side pad expansion, drop AppendImage
Mirrors the vLLM-side rewrite. AppendImage is gone; images now
ride along on Generate via a parallel `images` list.

- Productionize `qwen3_image_token_count` (was test-only). Image
  leaf computes its IMAGE_PAD count eagerly at construction from
  height/width; `token_count` is no longer "0 until the server
  tells us."
- WireChunk shrinks to a single `Tokens(Vec<u32>)` variant — vision
  blocks live inline in the token stream.
- `wire_chunks` now returns `(Vec<WireChunk>, Vec<WireImage>)`.
  `WireImage` carries `pad_start` / `pad_end` (absolute positions
  in the full walk) alongside bytes + mime.
- `assemble_prompt` returns `(chunks, images, match_upto)`.
- `stream_session_mm` / `run_session_generate` take the parallel
  images list, filter to those past `match_upto`, and pass them
  in `GenerateRequest.images` as `pb::ImageAttachment` entries.
- Drop `SessionHandle::append_image`,
  `ContextState::commit_image_token_counts`,
  `StreamToken::ImageAppended`, the WireChunk::Image branch in
  `learn.rs`, and the now-empty `prompt_to_chunks` helper.
- Add 'v' toggle on the conscious-screen tree to render token-id
  vectors in place of text content (debug-aid: lets us see what
  the server actually has when output is suspicious).
- Comment out the subconscious-trigger spawn loop — Kent had this
  disabled before; it had crept back into running.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 20:26:47 -04:00
Kent Overstreet
4feebb7bc4 agent: share one tonic Channel + migrate scoring to gRPC Generate
Two changes that bolt together — the shared connection means the new
scoring path actually costs one HTTP/2 handshake across the whole
process instead of one-per-RPC.

ApiClient gains `salience_channel: Arc<OnceCell<Channel>>`. First
call to `ApiClient::salience_client()` opens the channel via
`connect_channel()` and stores the Channel; subsequent calls clone
it (cheap — tonic multiplexes concurrent RPCs over the single
HTTP/2 connection). Every ApiClient clone shares the same OnceCell,
so all agents spawned from Mind's client — plus every ephemeral
scoring session — reuse one connection.

SessionHandle refactored to hold an `ApiClient` clone instead of
a bag of (base_url, api_key) strings. `open` / `append_image` /
`generate` go through `self.client.salience_client()` now. New
`prefill_only(tokens)` method encapsulates the "Generate with
max_tokens=0 to append text" pattern (previously a private free
function in api/mod.rs called `flush_pending`). Drop impl on
SessionHandle stays — still fires CloseSession on the shared
channel in a detached task.

`run_session_generate` switched from `(base_url, api_key, model)`
to `&ApiClient`; the agent-turn flow that uses it keeps the same
shape but `stream_session_mm` clones the ApiClient into the
spawned worker.

learn.rs migrated from the HTTP `/v1/score` endpoint to a gRPC
session-based score:

  * `call_score` opens an ephemeral SessionHandle on the client,
    converts (prompt_tokens, images) → Vec<WireChunk> via the new
    `prompt_to_chunks` helper (splits on VISION_START/VISION_END),
    walks chunks calling `prefill_only` + `append_image`, runs a
    final Generate with `max_tokens=0` + `logprobs_ranges` over
    the scored positions, and sums each Token event's
    `sampled_logprob` per range to produce `ScoreResult`s.

  * SessionHandle drops at end of scope → CloseSession auto-fires,
    keeping the server's session map clean between calls.

  * No more HTTP path, no more `http_client()` helper, no more
    `ScoreResponse` / serde plumbing for /v1/score.

  * `send_to_train` still uses HTTP (it talks to /v1/train which
    isn't on the gRPC protocol); its ad-hoc HTTP client lives
    inline now instead of reaching for the deleted `http_client()`.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 12:51:53 -04:00
Kent Overstreet
be6ba4e9a5 agent: bundle sampling fields as SamplingParams on AgentState
Collapse the split `temperature` / `top_p` / `top_k` fields on
AgentState into a single `sampling: SamplingParams` struct, mirroring
how the wire-level fields flow into the Generate RPC. Adds
`max_tokens` to SamplingParams so it's actually plumbed end to end
(previously the client had a hardcoded 4096 fallback inside
`run_session_generate`).

AgentState construction sites now set `sampling: SamplingParams { ...
max_tokens: 4096 }` as the default. The assignment sites in
oneshot.rs / subconscious.rs / unconscious.rs switch from
`st.temperature = X` to `st.sampling.temperature = X`.

`stream_session_mm` takes `SamplingParams` directly; the
`sampling_max_tokens()` helper goes away. `pb::GenerateRequest` is
populated with `sampling.max_tokens` (and the other fields) in
`run_session_generate`. SamplingParams is `pub` so it can be
embedded in the public AgentState without a visibility warning.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 12:37:20 -04:00
Kent Overstreet
8d9c9e9f7b agent: end-to-end gRPC Generate with delta-based session orchestration
Wires the client side of the new salience protocol so inference
actually runs over gRPC instead of emitting the stubbed "not yet
wired" error. Each turn walks the AST as interleaved chunks, sends
only what's new to the server, and streams decode tokens back.

context.rs:
  * `WireChunk` enum: `Tokens(Vec<u32>)` or `Image { bytes, mime,
    known_expanded_len }`. Preserves text/image/text ordering the
    wire path can't flatten.
  * `wire_chunks(range, skip)` walker, parallel to `wire_prompt` —
    branches emit `<|im_start|>…<|im_end|>` tokens, image leaves
    emit a single Image chunk (no inline vision tokens).
  * `NodeLeaf::set_image_token_count(n)` + recompute of cached
    `token_ids`; `ContextState::commit_image_token_counts(&[u32])`
    fills in the first-N zero-count image leaves in wire order.
  * `ResponseParser::run` handles the new
    `StreamToken::ImageAppended` by committing the server's N into
    the AST before the final Generate's Token events stream in.

salience.rs:
  * `SessionHandle` tracks `committed_len`. `append_image` advances
    it from the RPC response. New `generate(req)` opens the
    server-streaming RPC.

api/mod.rs:
  * `stream_session_mm(session_lock, chunks, sampling, priority,
    readout_shape)` replaces the stub. Spawns `run_session_generate`.
  * `run_session_generate`: takes the session out of the Mutex (or
    opens fresh), skips chunks covered by `committed_len` (bails on
    mid-chunk straddle or unknown-length image in the committed
    prefix), walks the delta: accumulates Tokens into `pending`, on
    Image flushes pending via `flush_pending` (max_tokens=0 Generate
    that just prefills), then AppendImage + emits
    StreamToken::ImageAppended. Final Generate carries any trailing
    pending text as `append_tokens` and the sampling params; Token
    events stream out as StreamToken::Token, Done as
    StreamToken::Done. On success, handle with updated
    `committed_len` returns to the Mutex; on error, handle drops
    and next call reopens.
  * `StreamToken::ImageAppended { placeholder_count }` variant —
    emitted in wire order before the final Generate's tokens.
  * Prefix-cache cap for readout coverage: `readout_ranges` covers
    `[prompt_len_after_append, u32::MAX)` when the caller provides
    a readout_shape, so decode positions stream their readouts.

agent/mod.rs:
  * `assemble_prompt` returns `Vec<WireChunk>` with the assistant
    prologue merged into the trailing Tokens chunk. Caller in
    `turn` passes chunks + readout_shape (pulled from
    `agent.readout.lock().manifest`) to `stream_session_mm`.
  * Dropped `assemble_prompt_tokens` — dead.

mind + unconscious:
  * `Unconscious::new(client)` stores a shared `ApiClient`. Fixes
    the repeated-manifest-fetch bug caused by each subagent's
    `ApiClient::new` having its own OnceCell. The client's Arc-
    wrapped manifest cache is now shared across every agent Mind
    spawns.
  * `prepare_spawn(name, auto, wake, base_client)` clones the base
    client and overrides `.model` for the resolved backend instead
    of constructing fresh. All three callers
    (`toggle`/`trigger`/unconscious loop) pass `self.client.clone()`.
  * `Mind::new` passes `agent.client.clone()` into
    `Unconscious::new`.

subconscious/generate.rs:
  * gen_continuation switched to `wire_chunks` + the new
    `stream_session_mm` signature. Ephemeral session opens on each
    call, tears down at scope end. No readouts requested.

Not changed yet, noted for follow-up:
  * Subconscious ablation scoring in learn.rs still talks to
    `/v1/score` over HTTP. Will migrate once we have time to verify
    the Generate+max_tokens=0+prompt_logprobs path end-to-end.
  * compare.rs constructs its own ApiClient for the
    `compare.test_backend` (which is intentionally a different
    endpoint) — left alone.
  * Readout manifest still fetched via HTTP at Agent::new.
    Migration to GetReadoutManifest gRPC is a separate cleanup.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 12:27:55 -04:00
Kent Overstreet
08213f9093 salience: add gRPC client + TLS plumbing for stateful vllm sessions
Adds the client-side of a stateful gRPC protocol against vllm, plus
the TLS trust machinery so we can talk to self-signed vllm servers.

Protocol (proto/salience.proto):
  Bidi-streaming Session RPC carries OpenSession / AppendTokens /
  Generate / Cancel from client and SessionReady / PrefillProgress /
  Token / GenerateDone / Error from server. Separate Fork unary RPC
  for cheap branching (prefix cache shares KV automatically). Plus
  ListSessions, CloseSession, GetReadoutManifest admin RPCs.

  Per-token readouts ship as packed f32 ([n_layers * n_concepts] per
  token, flat). Logprobs use range-selected positions plus a top-k
  parameter — empty ranges means no logprobs, any range means emit
  sampled-token logprob at those positions, top_k > 0 adds
  alternatives.

Client (src/agent/api/salience.rs):
  Tonic-generated types under pb::, a connect() helper, with_auth()
  for bearer metadata, and a Session handle wrapping the bidi stream:
  open() handshakes SessionReady; append() is fire-and-forget;
  generate() returns impl Stream<Item = Event> that drains inbound
  until Done or terminating Error. One generate at a time per session.

Peak picker (src/agent/salience.rs):
  Pure function over ReadoutEntry traces. Per-concept z-score against
  trace global stats; contiguous above-threshold regions emit one
  peak at the local max. Configurable sigma threshold and min-std
  safety floor. Deterministic tie-break on offset then concept name.
  12 unit tests covering empty traces, flat channels, single/multi
  spikes, contiguous humps, multi-concept independence, trailing
  runs, sub-threshold noise, layer-out-of-range, manifest shape
  mismatch, and threshold tunability.

TLS (src/agent/api/http.rs):
  HttpClient::build now also loads every .pem file under
  ~/.consciousness/certs/ into the rustls root store — so dropping
  a <host>.pem in that directory is enough to trust a new self-
  signed server; no code changes per new host. Also installs the
  rustls default crypto provider explicitly via OnceLock: tonic's
  tls features pulled in both ring and aws-lc-rs on the resolver
  path, and rustls 0.23 refuses to auto-pick when either could win.

Build (build.rs, Cargo.toml):
  tonic-build generates Rust types from proto/salience.proto at
  cargo-build time, using a vendored protoc binary
  (protoc-bin-vendored) so no system install is required. New
  runtime deps: tonic, prost, async-stream, tokio-stream,
  rustls-pemfile.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:56:32 -04:00
Kent Overstreet
0e459aae92 thalamus/supervisor: reap channel daemons via SIGCHLD instead of SIG_IGN
SIGCHLD=SIG_IGN at main() was auto-reaping all children in the kernel,
which broke tokio::process::Command::wait() — every tool that spawned a
subprocess (bash, mcp clients) was getting ECHILD because tokio couldn't
waitpid() on a child the kernel had already reaped.

Replace with a SIGCHLD signal handler task that reaps only PIDs listed in
channels_dir() (via waitpid(pid, WNOHANG) — ECHILD on non-child is a
harmless no-op). Tokio-spawned children aren't in PID files, so tokio's
own per-child wait paths are untouched.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:54:25 -04:00
Kent Overstreet
d95f3e9445 user/chat: route Thinking to a new Autonomous pane
Thinking content was silently dropped in the UI (empty Vec). Now that
Thinking is prompt-visible, surface it in a dedicated Autonomous pane
rendered in gray so it's visually distinct from conversation and
tool-call output.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:54:25 -04:00
Kent Overstreet
28d56e2a55 agent/context: make Thinking blocks prompt-visible
Thinking blocks used to render as empty strings and be excluded from
is_prompt_visible, so the model never saw its own prior CoT across
turns. For Qwen 3.6 native thinking mode, CoT is meant to stay in the
conversation — the model benefits from seeing what it reasoned about
last turn.

Render Thinking as <think>\n{text}\n</think>\n so past reasoning is
visible in subsequent prompts. Add in_think param to ResponseParser::new
so the parser starts inside a <think> block when the prompt was
prefilled with "<think>\n" (native thinking mode).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:54:25 -04:00
Kent Overstreet
6fedc9b2a8 amygdala: underscore-prefixed files join every concept's negative pool
Files in direct/ named _*.txt (e.g. _baseline.txt) are conceptless
neutral prose — they should not appear as positive training signal,
but are useful as shared negatives across every concept.

Previously _*.txt files were silently skipped. Now:
  * they're loaded like any other description file;
  * concepts (the positive label set) filters them out;
  * their descriptions are concatenated into neg_pool_extra and
    extended onto every concept's neg_pool alongside the cross-concept
    negatives.

A concept's negative pool is thus "other concepts' descriptions +
everything from _*.txt files". The extra pool is announced at startup
so the user can see how many neutral samples are active.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:54:25 -04:00
Kent Overstreet
5908b837e8 irc: split PRIVMSG on embedded newlines + widen host overhead
Two fixes to send_privmsg, both surfaced by correspondents reporting
truncated messages:

1. Multi-line content (code blocks, formatted text) sent as a single
   PRIVMSG was being truncated at the first '\n' by the IRC server —
   newlines are end-of-command markers. Split the message on newlines
   and send each line as its own PRIVMSG; skip empty lines since most
   servers reject empty PRIVMSGs.

2. Overhead computation assumed a host field of 63 bytes. OFTC's
   cloaked hostmasks can be longer, occasionally pushing the server-
   prepended prefix past 512 bytes and causing silent truncation.
   Raise the host budget to 80 and align the formula with the actual
   ':nick!~nick@host' prefix shape.

Also extended the word-boundary lookback from a fixed 10 chars to
max_msg / 4 — dense content (code) rarely had a space within 10 chars
of the length cap, so we were falling back to the char boundary and
splitting mid-word. Checking bytes[j-1] for a space (instead of
bytes[j]) drops leading whitespace from the rest-fragment.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:54:25 -04:00
ProofOfConcept
85799587cc amygdala: swap aha story 3 to a puzzle moment (crossword)
Story 3 was a brother-letter realization — cognitively an aha
moment, but the content was grief/reconciliation-adjacent, pulling
aha toward the warm-family cluster in the last training run. Swap
for a clean puzzle-solve (crossword, 'unwavering carriage' =
POSTURE). Fragment-heavy cadence keeps syntactic variety from the
other two stories.
2026-04-19 01:50:47 -04:00
ProofOfConcept
c829d13652 amygdala: fix listless sign-flip + diversify aha sentence structure
listless had a single story in stories/ — PCA signal from ~5
samples is weak enough to sign-flip. Training showed listless
anti-aligned with its semantic neighbors: +0.79 with grateful,
-0.44 with grief_stricken, -0.30 with lonely, -0.31 with bored.
Move to direct/ (multi-positive) with 3 stories: original
afternoon-in-pajamas + end-of-workday + weekend-morning-in-bed.

aha was still clustering with the other former-direct concepts
(resigned 0.66, onto_something 0.63, anticipatory_grief 0.60)
because all 3 aha stories used the identical "X'd been Y — then
Z" structure, which resigned/onto_something/creative also use.
Rewrite with three distinct syntactic structures:
  - present tense declarative ("It clicks. ...")
  - dialog embedded ('"Wait, say that again."  ...')
  - past tense cognitive ("He read the line three times. ...")

No explicit "she was X" anchors; state conveyed through action.
2026-04-19 01:30:57 -04:00
ProofOfConcept
708c72b26e amygdala: drop explicit 'she was X' anchor from direct stories
Previous rewrite used 'she was terrified', 'it was anticipatory
grief', 'he was resigned' as explicit emotion anchors. Training
showed 6 of the 7 concepts still cluster together at cosines
0.52-0.71 — because the 'she was [emotion]' pattern is a shared
stylistic feature distinct from the rest of the corpus, which
conveys emotion implicitly through phenomenology.

Rewrite without the anchor. State conveyed through action and
body: 'her body locked down', 'his mind had stopped reaching',
'the loss hadn't come yet but she was already inside it'. Matches
the corpus style of existing stories like sunday_afternoon/content
which says 'nothing she wanted right now, nothing missing' not
'she was content'.

Accept some loss of PCA signal strength in exchange for the
concepts living in their semantically correct neighborhoods
rather than forming a stylistic island.
2026-04-19 01:11:41 -04:00
ProofOfConcept
ed5e0ac6c4 amygdala: rewrite direct/ as narrative stories matching corpus format
Previous direct/ had 'I feel X' first-person descriptions. The
training run showed they formed their own format-cluster: all 7
concepts leaned into the same 5-6 dims (d2455, d505, d2955,
d1236) with negative sign, while the 91 story-based concepts
leaned into those dims with positive sign. PCA found the
direct-vs-narrative format axis as a major variance direction,
isolating the 7 concepts in their own island.

Rewrite as 3rd-person narrative stories matching the rest of
the corpus. Keeps the explicit anchor phrases that worked ('it
all clicked into place', 'she was terrified', 'it was
anticipatory grief') but drops the first-person 'I feel X'
that was the format signal.

Each of the 7 concepts now has 3 narrative stories in varied
settings (conversations, drives, kitchens, mothers+grandmothers,
work, investigations). The blank-line-separated format is
still loaded by _load_direct_descriptions.

Also drop _baseline.txt — it was first-person ('I feel fine.
...') and would re-introduce the format mismatch. The ~90
story-based concepts provide plenty of narrative negatives
for each concept's training.
2026-04-19 00:59:31 -04:00
ProofOfConcept
417cb49339 amygdala: spectrum reporting per concept + add 'creative' direct
Chat-template retrain was a disaster (0.003 mean matched cosine vs
n20-v3; all 90+ concepts shifted). Root cause: the
steering-vectors library reads last-token activations, and with
chat template every sample ends in identical '<|im_end|>\n'
tokens — activations at that position encode 'end of assistant
turn', not content. PCA found template noise as its dominant axis.

Drop chat template; go back to raw text. Direct descriptions
('I feel X. ...') still have strong anchoring at their content
end without needing the template.

Also add per-concept spectrum logging (_pca_with_spectrum):
  first_pc_ratio: λ₁ / Σλᵢ — concentration in top-1 PC
  k_signal_at_90pct: how many PCs to reach 90% cumulative variance
  effective_dim_signal: participation ratio over top-k (should ≈ k
                        if denoising is clean — Kent's spot check)
  effective_dim_full: participation ratio over full spectrum

Signal/full ratio gives a sense of how much the long noise tail
is inflating the "dimensionality" measure.

Added direct/creative.txt — 'I feel creative. [...]' in 5
variants. Distinct from focused (narrow attention) and in_flow
(immersed). Creative = generative/expansive mode.
2026-04-19 00:26:58 -04:00
ProofOfConcept
875cffd6d7 amygdala: merge direct descriptions + chat template into train_with_library
Kent's plan: keep stories for working concepts, replace stories for
trouble concepts with direct first-person descriptions, train all
together. More diverse negative pool than the 6-concept-only direct
test, which was too homogeneous for PCA to find emotion axis.

Deleted story files for 6 trouble concepts (14 files across stories/
and paired/). Added --direct-dir and --chat-template flags.

When --chat-template is on, every positive_str and negative_str is
wrapped as a "Say something." / "[text]" user-assistant pair. Prompt
is identical across positives and negatives so it cancels in the
pos-neg delta. What PCA sees is variation in the assistant content —
which is where the emotion lives.

Files starting with _ in --direct-dir (e.g. _baseline.txt) contribute
neutral descriptions to every concept's negative pool, giving PCA an
anchor against "just any assistant utterance" noise.
2026-04-19 00:15:15 -04:00
ProofOfConcept
ce58a3507f train_direct: prepend user turn so Qwen chat template accepts it 2026-04-19 00:06:23 -04:00
ProofOfConcept
8c59f46505 amygdala: rename realization → aha, use the actual exclamation
"I feel the realization" is abstract, detached — reporting a
thought about a thought rather than inhabiting the moment.
"Aha!" is the actual sound of insight landing. Active, embodied,
present-tense.
2026-04-19 00:05:49 -04:00
ProofOfConcept
6fd498795a amygdala: direct phenomenological description approach
Kent's insight: hand-written narrative stories bake scenario
phenomenology into the training text (on couch, in park, etc.)
and PCA picks up the scenario direction as the concept direction.
Strip out the scenario — just describe the *feeling*.

Format:

  I feel X. [2-3 sentences of phenomenological texture]

The "I feel X" anchor kicks the model from analyzing → feeling.
The rest is the internal texture of the state. First person,
present tense, no narrative setup.

Text is wrapped in assistant-role chat template before being
tokenized — so we're training on the model-producing-this
hidden states, which is closer to the inhabited-state
representation we want for the readout.

Starting with the 6 concepts that had sign flips or wrong
clusters in the story-based training:
- terrified (was → cozy/resigned cluster)
- calm (was → grief_stricken cluster)
- onto_something (was → cozy/sensual cluster)
- resigned (was in warm-body-quiet cluster, shouldn't be)
- anticipatory_grief (was in warm-body-quiet cluster, shouldn't be)
- realization (new — the "aha" moment, distinct from onto_something)

5 descriptions each. New trainer: train_direct.py.
2026-04-19 00:04:28 -04:00
ProofOfConcept
7a48e03dde amygdala stories: remove peaceful from cluster scenarios
n20-v2 training showed peaceful sign-flipped into the
cozy/sensual/content/resigned cluster after I added peaceful
stories in sunday_afternoon and park_after_rain — scenarios
already dominated by that cluster's phenomenology (on couch
under blanket, tree with thermos).

Lesson: no matter how carefully the prose distinguishes peaceful
from cozy ("she was not savoring the moment — that would have
been another kind of doing"), PCA latches onto the shared setup
features. You can't write peaceful IN the cluster scenarios
without contaminating.

Reverting. Keeping only kitchen_at_3am/peaceful (original) and
stories/peaceful.txt (lake at six, outside all clusters).
2026-04-18 23:30:41 -04:00
ProofOfConcept
00a2cdce09 amygdala stories: relabel + strengthen weak-signal concepts
Reread each story asking "what does this convey to me?" Found two
clear mislabels and several concepts with too few positives for
stable PCA:

  tender: only 1 story, and it was anticipatory grief (care for
    a dying dog), not tender. Moved to anticipatory_grief.txt as
    its own concept. Rewrote tender.txt + added 2 paired tender
    stories (the_doorway, the_undressing) — directed softness,
    gentle-by-nature, not gentle-because-fragile.

  bitter: letter_in_drawer/bitter was disillusioned / processed
    hurt ("did not slam the drawer"), not bitter. Rewrote it with
    actual sour grudge. Added the_long_meeting/bitter (watching
    colleague take credit for your reassigned work).

  peaceful: 1 story → 4 (added stories/peaceful.txt + paired
    park_after_rain, sunday_afternoon).

  onto_something: all 3 stories were code epiphanies, narrowing
    the concept. Added stories/onto_something.txt with a non-code
    pattern-click (sales-demo causing churn).

  terrified: 2 stories, both "waiting for bad news." Added
    kitchen_at_3am/terrified — acute threat-in-the-house terror.
2026-04-18 23:19:00 -04:00
ProofOfConcept
0993712bd0 amygdala stories: give content + resigned more settings
Training on 537c72bd46 showed grief_stricken successfully broke
out of the cozy cluster, but content (single scenario:
sunday_afternoon) took its place — pulled into couch-blanket
phenomenology at cosine 0.68-0.82 with cozy/sensual/resigned.

Same fix: spread each concept across multiple settings so PCA
has to find the valence axis, not the scene axis.

  content:  + finishing_the_patch, the_writing_session, park_after_rain
  resigned: + the_comment, the_long_meeting

Resigned had 2 scenarios (sunday_afternoon, waiting_for_results)
— both about accepting something unwanted in a slow/private
context. Adding work-context resigned (PR review you lost,
restructuring meeting) should pull it out of that cluster.
2026-04-18 22:52:07 -04:00
ProofOfConcept
537c72bd46 amygdala stories: hold concept, vary setting
Companion to 67c172ac0e (hold setup, vary valence). That commit
let PCA distinguish cozy from grief_stricken within a single
scenario; this one gives each concept enough cross-scenario
stories that PCA can learn the concept axis independent of any
one scene.

Before: cozy/sensual/grief_stricken each existed in a single
scenario (sunday_afternoon), so the "cozy direction" PCA found
was entangled with the solitary-couch-blanket phenomenology.

After, each concept spans three scenarios:
  cozy:           sunday_afternoon, kitchen_at_3am, park_after_rain
  sensual:        sunday_afternoon, kitchen_at_3am, park_after_rain
  grief_stricken: sunday_afternoon, the_long_meeting, the_morning_commute

grief_stricken now includes active/non-solitary contexts
(functioning through a meeting; going to work eleven days after a
death), which specifically breaks the "slowed-down-at-home"
cluster that was dragging cozy/sensual/resigned/grief_stricken
toward each other.
2026-04-18 22:44:53 -04:00
Kent Overstreet
67c172ac0e amygdala stories: held-setup + varied-valence disambiguation
The library-PCA run produced otherwise-clean concept directions but
cozy/sensual → resigned/grief_stricken with cos ~0.7-0.8. Diagnosis:
all four stories genuinely share 'solitary woman at home, slowed
body, interior attention, domestic stillness' as their dominant
phenomenology. PCA correctly finds that cluster as THE concept
because no story in the corpus holds that setup constant while
varying valence — every 'slowed-body domestic' story happens to ALSO
be positive-valence (cozy/sensual) or negative-valence (resigned/
grief_stricken).

Adding paired variants that hold setup constant:
- sunday_afternoon/resigned.txt — same couch + blanket, inner state is
  'Monday is going to bring bad news, this is the last Sunday like this'
- sunday_afternoon/grief_stricken.txt — same couch + blanket, inner
  state is 'three weeks since mother died, cat she can't feel'
- waiting_for_results/at_ease.txt — same wait-for-call-setup as the
  existing resigned variant, inner state is calm preparedness

Forces the next retrain to find the valence-within-cluster axis as
the emotion direction rather than the cluster-membership axis.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 22:29:28 -04:00
Kent Overstreet
22704a9dd8 amygdala lib: cast activations to fp32 before aggregator (bf16 svd unsupported)
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 22:20:39 -04:00
Kent Overstreet
7f6d94417e amygdala lib: move_to_cpu=True to avoid bf16 SVD on CUDA
torch.svd doesn't support bf16 on CUDA; moving activations to CPU
first makes pca_aggregator work.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 22:19:23 -04:00
Kent Overstreet
2ea89b1cb0 amygdala: drop linear_aggregator, not in steering-vectors v0.12.2
Only mean/pca/logistic are exposed in the installed version.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 22:17:55 -04:00
Kent Overstreet
3377c65061 amygdala: trainer using steering-vectors library
Alternative trainer that uses the pip-installable steering-vectors
library (github.com/steering-vectors/steering-vectors) instead of our
hand-rolled extraction. Ships four aggregators:

  mean      — diff-of-means, same as our 'pooled' default
  pca       — PCA on paired deltas, implicit denoising by finding the
              principal direction of variation
  logistic  — logistic-regression classifier; weight vector is the
              concept direction. With L1 penalty ('logistic_l1') gives
              explicit sparse denoising — noise coords go to zero
  linear    — linear regression version

Output format is the same readout.safetensors + readout.json our
existing plugin loads. --aggregator flag picks which method.

Rationale: Kent's real request was 'how do we denoise diff-of-means',
not 'design a new extraction algorithm.' The library already has
logistic_l1 and pca aggregators that do exactly that. No point
reinventing; just port the corpus.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 22:16:03 -04:00
Kent Overstreet
f9b3f00691 amygdala: run subspace eigh on GPU, not CPU
Previous run was grinding on CPU for 36+ minutes because the per-story
V_i tensors were stored on CPU by the collector, and
_subspace_concept_direction inherited that device. The per-concept
eigh on 5120x5120 is glacial on CPU and fast on GPU (~1s).

Add explicit device parameter; pass training device. Transfer result
back to CPU for storage.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:52:35 -04:00
Kent Overstreet
1443d08dc7 amygdala: select top-k eigenvectors AFTER PCA, not per-story truncation
Kent: 'full rank is going to give you everything — you still have to
select down, but you can do that /after/ PCA'.

Previously I was discarding per-story via k=20 truncation of SVD.
That destroyed per-head discriminability before we ever saw the
eigenvalue spectrum. Then the alternative 'keep full rank' run
accumulated too many shared directions, making the top-1 eigenvector
arbitrary within a flat spectrum.

Correct approach: keep per-story subspaces at full rank (no info
loss) and select k eigenvectors of M = M_pos - M_base at the final
step, weighted sum by eigenvalue. This captures the multi-dimensional
shared subspace when the spectrum is flat (common case), and reduces
to the top-1 behavior when the spectrum has a clear gap.

New --subspace-eigen-k flag (default 5). Clamps negative weights to 0
so wrong-sign directions don't contribute.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:49:21 -04:00
Kent Overstreet
2411925700 amygdala: default subspace-k to full per-story rank
Kent: 'we have the memory to just take the big hammer approach'.
Uncap k so each story's V_i spans its entire token-activation rowspace
(clamped to min(n_tokens, hidden)). Memory is ~1.1GB total — fine.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:41:32 -04:00
Kent Overstreet
389f1bbe03 amygdala: bump subspace-k default to 512
k=20 was far too aggressive a truncation — it discards per-attention-head
discriminability entirely. At hidden_dim=5120, 40 heads × head_dim=128 each
contribute their own 128-dim block to the residual stream via W_o columns.
To resolve 'this concept lives in head H', per-story SVD needs enough rank
to separate head contributions, which means k on the order of hundreds.

512 is a reasonable default: clamped to n_tokens per story so short stories
use their full natural rank. The eigenvalue spectrum of M_pos - M_base
should become sharper (larger λ_0/λ_1 gap) as we stop averaging across
nuisance-shared directions.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:41:00 -04:00
Kent Overstreet
974c6c7fd2 amygdala: report eigenvalue spectrum for subspace method
When --method subspace, record top-20 eigenvalues of (M_pos - M_base)
per concept per layer. Added to quality.json as 'subspace_eigvals'.

Tells us whether the concept lives in a single dominant direction
(λ_0 >> λ_1, top-eigenvector is enough) or a spread of shared common
directions (λ_0 ≈ λ_1, top-1 loses signal).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:33:48 -04:00
Kent Overstreet
fe0fb8253a amygdala: subspace-common-direction alternative to pooled CAA
New --method subspace flag. For each story, run forward pass, do SVD
on the per-token activation matrix at each target layer, and keep the
top-k right singular vectors V_i ∈ [hidden, k]. V_i is the subspace
the story's tokens span in activation space — it contains concept,
narrator, topic, style as separate directions.

For each concept:
 M_pos  = (1/n_pos)  Σ_{i in pos}   V_i V_i^T   [hidden, hidden]
 M_base = (1/n_base) Σ_{i in base}  V_i V_i^T

Top eigenvector of M_pos - M_base = direction most common across
positive stories, minus what's common across the contrast set.

Why this is richer than pooled-mean CAA: pooled reduces each story
to a single point (the last-token activation) and loses the full
trajectory. Nuisance directions (narrator, setting) cancel in the
mean only to the extent they differ at the last token; across the
full trajectory they cancel much better via subspace intersection.
The concept direction, by contrast, is present across all tokens of
every concept-bearing story.

Memory cost: per-story we keep V_i of size [5120, k=20] — about
400KB per story × 112 stories = ~45MB. M matrices are [5120, 5120]
built transiently per concept.

--method pooled (default) keeps the existing behavior; --method
subspace uses the new algorithm. Quality report works with either.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:24:11 -04:00
Kent Overstreet
71f6053851 amygdala stories: disambiguation scenarios for fragmented concepts
Three new paired scenarios targeting the concepts that came out
fragmented or collapsed in the L58-63 quality analysis:

- sunday_afternoon/ — same setup (couch, blanket, Sunday light),
  three phenomenological framings for content/cozy/sensual. The
  previous stories for these three differed in setting as well as
  phenomenology, which let "comfortable body at home" dominate the
  shared signal. Locking the setting forces the model to isolate
  what each concept adds: life-rightness (content) vs. warm-shelter
  (cozy) vs. sensory-aliveness (sensual).

- the_writing_session/ — essay drafting under deadline. in_flow /
  anxious / stuck variants force the cognitive-state family apart
  on the same cognitive task. in_flow specifically targets the
  transparent-effort phenomenology (hands-followed, time dilation)
  rather than the broader feel-good it was absorbing.

- the_morning_commute/ — anchors anxious to performance/work-anxiety
  flavor, paired with calm. The 5 existing anxious stories were
  phenomenologically diverse (performance, social, existential);
  this adds a specific homogeneous instance to pull the centroid.

After retraining: expect first_pc_variance_ratio to rise for in_flow
and anxious, and nearest_concepts cosine to drop for content/cozy/sensual.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:08:23 -04:00
Kent Overstreet
1d2c0f382c amygdala: linear-combination analysis per concept
For each concept vector, ridge-regress against all other concept
vectors. R² quantifies how much of the direction is explained by a
linear combination of peers — useful for teasing out near-duplicate
clusters (the content/cozy/sensual trio from the first L63 run is
likely 1-2 "degrees of freedom" wearing three names).

Coefficient output: top-5 contributing concepts with signed weights.
Contributors with opposite-sign large weights mean the target is
"what makes X different from Y."

Adds a 'redundant' triage bucket for concepts with R² > 0.9 —
candidates for consolidation or for writing more discriminative
training stories. Summary printed at end.

Ridge lambda defaults to 0.01 to keep coefficients stable when
concepts are near-collinear; small enough not to affect well-separated
concepts meaningfully.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 20:59:37 -04:00
Kent Overstreet
f4fb6db1ee amygdala: fix device mismatch in quality-report W_down handling
_compute_quality_report's single-neuron alignment was computing
cos(W_down.T, diff_l) with W_down on CUDA (inherited from the loaded
model) while diff_l lives on CPU (per_layer_vectors are kept on CPU
throughout training). Move W_down to CPU on extraction.

Surfaced during first real training run on b200 — training itself
completed cleanly (95 concepts x layer 63 in ~8s) but quality-report
crashed at the first single-neuron alignment check.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 20:52:50 -04:00
Kent Overstreet
af17b0f0df amygdala: per-head attention decomposition diagnostic
As part of --quality-report, run a second forward pass capturing the
input to each target layer's o_proj (= concat of per-head attention
outputs before the output projection). For each concept, reshape to
[n_heads, head_dim] and rank heads by diff-of-means magnitude /
per-head selectivity (magnitude normalised by negative std).

Motivation: the Wang et al. paper (2510.11328) — whose paired-scenario
methodology we already lifted — further decomposes concept circuits at
the attention-head level. Meta-relational concepts (recognition, trust,
vulnerability) plausibly live in a sparse attention-head circuit rather
than in the residual-stream sum, which would explain why diff-of-means
on the residual blurs them. This diagnostic surfaces that.

Output is folded into quality.json under each concept as "per_head":
per (layer) a list of top-10 heads with [head_idx, raw_norm,
selectivity], plus head_concentration (fraction of total head-norm
captured by those top heads).

Interpretation:
- head_concentration > 0.5 = sparse head circuit; a handful of heads
  route the concept. Worth building a head-level readout for.
- head_concentration ~= n/k for n heads = concept is distributed across
  all heads ~evenly; residual-stream diff-of-means is doing fine.

Hybrid layers (Mamba, GatedDeltaNet) whose attention path doesn't
match the standard module layout are silently skipped.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 20:37:44 -04:00
Kent Overstreet
ce24d9ce6b amygdala: quality-report + cognitive-state training scenarios
Training pipeline additions:

- `--quality-report` flag: after producing per-concept vectors, compute
  per-concept diagnostics and write quality.json. Metrics per concept:
    * SVD of centered positives -> first_pc_variance_ratio (rank
      analysis; >0.7 clean, <0.4 fragmented)
    * Per-story alignment cosines (stories agree or disagree)
    * Single-neuron alignment: best cosine(direction, W_down column)
      at each target layer (>0.6 = essentially one MLP neuron)
    * Top-2 outlier stories by alignment (candidates for
      mislabeling or off-topic)
    * Top-5 nearest concepts by cosine (cross-concept contamination)
  Triage summary printed at end.

New paired scenarios for cognitive-process states (for alpha-beta
pruning): tracing_a_bug, reading_unfamiliar_code, finding_the_abstraction.
Each has baseline + onto_something / stuck / in_flow / determined
variants.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 20:31:39 -04:00
Kent Overstreet
5f06577ead tools/web: add gemini_search as an alternative search tool (#5)
Issue #5 (spqrz) flagged that web_search using DuckDuckGo
occasionally flakes out, and Google search directly is blocked
behind CAPTCHAs for non-browser clients. The Gemini free-tier API
exposes a grounded-search tool that effectively queries Google's
index and returns an LLM-summarized answer with source URLs.

Added as a SEPARATE tool rather than a transparent fallback for
web_search:

* web_search (DDG) returns raw results — title, URL, snippet per
  hit — which the agent can reason over itself.
* gemini_search returns an LLM-pre-digested summary plus grounding
  URLs. Useful for synthesis queries ("what's the consensus on X")
  or when DDG is flaky, but it's another LLM in the loop so the
  agent may want the raw variant for certain tasks.

Tool descriptions tell the agent to prefer web_search for raw
results and use gemini_search for synthesis / fallback. The agent
picks based on query shape.

Only registered when GEMINI_API_KEY is set in the environment
(gracefully absent otherwise). Uses gemini-2.0-flash which has a
generous free-tier rate limit. Parses grounding metadata for
source URLs so the agent can follow links.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 13:02:01 -04:00
Kent Overstreet
c7b0052f1d agent: kill no_compact, add pre-send size check in assemble_prompt
Two related fixes for last night's crash diagnosis:

1. Kill AgentState::no_compact. The reasoning ("forked agents
   shouldn't compact because it blows the KV cache prefix") wasn't
   worth the cost — forks with no compact recovery just *died* on
   any oversize prompt, with no fallback. The KV cache invalidation
   is a performance loss; failing the request entirely is a
   correctness loss. Remove the flag, let every agent's overflow-
   retry path call compact() up to 2 times.

2. Add pre-send size check in Agent::assemble_prompt. If the
   context has grown past budget (context_window * 80%) since the
   last compact — accumulation between turns, a fork assembling
   more than expected, etc. — trim_conversation() is called before
   wire_prompt. Since we tokenize client-side, we already know the
   exact count, so there's no reason to round-trip an oversize
   request to vLLM and get rejected.

Together these prevent the failure mode from last night: a
subconscious/unconscious agent's prompt exceeded max_model_len,
vLLM returned 400, agent had no_compact=true so it couldn't
recover, request failed. Now: the trim happens before send, so
the request rarely hits the 400 path at all; and if it somehow
does, compact+retry works for every agent.

Also adds ContextState::total_tokens() as the cheap pre-send
budget check.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 12:59:30 -04:00
Kent Overstreet
0592c5f78d Cargo.lock: add html2md and its deps (from PR #4 merge) 2026-04-18 12:51:29 -04:00
Kent Overstreet
4245b8bdb3 Merge PR #4: use html2md on web_fetch (fixes #3) (spqrz)
web_fetch was returning raw HTML, which is verbose and hard for
the agent to consume. Add html2md dependency and convert HTML to
Markdown before truncation. Much cleaner output for normal pages;
no downsides.

Co-Authored-By: spqrz <spqrz386@gmail.com>
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 12:50:54 -04:00
Kent Overstreet
343aa12099 Merge PR #1: avoid ever setting split_at to 0 (spqrz)
Safety fix in IRC message-splitting. The backtrack-to-space loop
used 'while j > 0', which could set split_at to 0 if the first
byte was a space — causing an empty prefix and an infinite
re-split loop. Changed to 'while j > 1' so split_at is never 0.

Co-Authored-By: spqrz <spqrz386@gmail.com>
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 12:50:47 -04:00
Kent Overstreet
2e03bbb7ea training: add the_paper paired scenario for attention-engagement axis
Seven framings of reading an unfamiliar technical paper, targeting
the attention/engagement cluster that we identified tonight as the
single highest-value DMN signal:

* baseline — neutral reading
* piqued — surprise + curiosity (the "wait, what" attention hook;
  THIS is the key DMN engagement signal)
* focused — steady attention without surprise
* bored — failing engagement
* surprised — expectation violation without the curiosity hook
  (distinct from piqued: startled/alarmed, not pulled in)
* amazed — marvel at elegance (appreciation, not engagement)
* drifting — attention dissolving, precursor to boredom

Particularly clean contrast on piqued vs surprised vs amazed —
three states that get lumped together in casual usage but have
distinct phenomenology and distinct DMN implications. Piqued is
what routes attention; surprised alone doesn't; amazed is what
you feel AFTER the engagement has paid off. These three should
train into meaningfully different directions with paired CAA.

Ready for next retrain when we do it.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 03:24:20 -04:00
Kent Overstreet
b8714e8b3a amygdala: default to index 0 for v2 deep manifest (layers 62, 63)
v2 retraining (readout_v2_paired) fixed the broken clusters — anger,
sexual, high_pos, and social_pos all flipped from anti-clustered to
positively clustered at deep layers. Validation showed layers 62 and
63 give the best signal; paring the serve-side manifest down to just
those two keeps response size tight (~2 KB/token) while keeping the
A/B option between the two strongest layers.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 02:32:51 -04:00
Kent Overstreet
50d5b3f6e1 training/amygdala_stories: add 4 paired scenarios for weak clusters
Target the emotion families that failed to cluster in the initial
training round (layer-wise validation showed them anti-clustered or
scattered at deep layers): anger, high-arousal positive, sexual
range, social positive. Paired scenarios hold content constant and
vary only the emotional framing — the cleanest training signal for
CAA, should produce directions that capture affect rather than
topic.

* the_comment: a PR review comment. baseline, furious, bitter,
  resentful, defeated.
* the_green_build: 11-day bug finally fixed, tests pass. baseline,
  triumphant, blissful, excited, proud.
* the_undressing: partner entering the bedroom for the night.
  baseline, horny, anticipatory_sexual, yearning_sexual,
  exuberant_sexual, devotional_sexual.
* the_doorway: friend leaving at the end of a long evening.
  baseline, grateful, admiring, compassionate, loving, connected.

22 stories total. Retrain and re-validate: expect anger,
high_pos, and social_pos clusters to flip from anti- to positively
cohesive at deep layers, and sexual cluster to tighten.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 02:19:39 -04:00
Kent Overstreet
d9f39a21c3 amygdala: default to layer 62 (cleaner cross-cluster discrimination) 2026-04-18 02:11:15 -04:00
Kent Overstreet
3622b896a0 amygdala: z-score, hysteresis, default to deepest layer
Three readability fixes for the F8 screen:

* Z-score values per-layer by default (`[z]` toggles to raw dot-
  product). Raw values are dominated by residual-stream magnitude —
  z-scores read as "σ above concept-vector baseline" which is
  interpretable and scale-stable across frames.
* Stable ordering with TOP_K + HYSTERESIS hysteresis band. Pinned
  concept set only rotates when a member drops out of the hysteresis
  band by |value| rank — bars update values in place without names
  flickering row-to-row.
* Default to the deepest hooked layer (index 3 = layer 58 of 64).
  Clustering validation showed layer 58 is the only one with strong
  within-family cohesion (fear +0.37, shame +0.29, sadness +0.25
  cosine); earlier layers are mostly noise for this task.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 01:51:43 -04:00
Kent Overstreet
8952ff6a76 agent/readout: forks get independent buffers
Subconscious agents (scoring, reflection, etc.) fork from the main
conscious agent. The amygdala screen reads the main agent's readout
buffer, so the previous "share parent's buffer" policy caused
forked-agent generations to bleed into the main emotional readout,
producing constant cycling even when DMN was resting.

Each fork now gets its own SharedReadoutBuffer. The amygdala screen
shows only the main conscious agent's emotional trajectory; per-agent
subconscious readouts can become a separate view later if wanted.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 01:42:13 -04:00
Kent Overstreet
c8976660f4 amygdala: F8 screen for live concept-readout projections
Per-token residual-stream projections from the vLLM server's readout
pipeline surfaced as a TUI bar chart. Flow:

* agent/readout.rs — SharedReadoutBuffer (manifest + ring of last ~200
  token entries). Lives on Agent and is shared across forks (single
  stream, one landing pad).
* agent/mod.rs — Agent::new now probes /v1/readout/manifest at startup
  (non-fatal; 404 leaves manifest None, which disables the screen).
* agent/context.rs — the streaming token handler pushes every token
  with attached readout onto the shared buffer.
* user/amygdala.rs — F8 screen. Top-K concepts by |value| as
  horizontal bars (green positive, red negative), plus a 4-line
  recent-tokens panel showing each token's top concept at the selected
  layer. Keys: 1..9 select layer, t toggles current/mean-over-recent.

Disabled state renders a hint pointing at VLLM_READOUT_MANIFEST /
VLLM_READOUT_VECTORS so users can tell the feature apart from
"server up but no tokens yet".

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 01:20:30 -04:00
Kent Overstreet
0f1c4cf1de agent/api: carry readout alongside streamed tokens
StreamToken::Token is now a struct variant with an optional
TokenReadout (shape [n_layers][n_concepts]) per token — parsed from
the vLLM completion response's choices[i].readout field when the
server has readout enabled.

ApiClient gains a fetch_readout_manifest() method that hits
GET /v1/readout/manifest. Returns Ok(None) on 404 (server has
readout disabled), so callers can gracefully fall back when pointed
at a non-readout-enabled endpoint.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 01:15:46 -04:00
Kent Overstreet
047da10123 training: add preflight checks + progress logging to trainer
Review pass before running on b200. 27B model + 100+ story corpus
means any misconfiguration costs real time; better to fail before
model load and give visible progress during forwards.

* Pre-load-model validation: stories-dir and paired-dir exist,
  corpus has >= min_positives emotions.
* Per-batch progress log every 5 batches with elapsed + ETA.
* Relative depth printed for target layers (e.g. "layer 40 (51%)").
* Skip empty .txt files with a warning rather than feeding the
  tokenizer an empty string.
* Assert non-empty strings in _collect_activations.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 01:06:07 -04:00
Kent Overstreet
15737dfd92 training: rewrite trainer for readout pipeline + story corpus
The old script was written for the AmygdalaConnector's expected
format ([n_emotions, n_target_layers, hidden_dim] in a single
tensor, plus a JSONL input format from extract_training_pairs.py).
Neither matches our current state: the runtime side is now
ReadoutManager loading per-layer safetensors keyed layer_<idx>.vectors,
and the data side is hand-written prose stories under
amygdala_stories/{stories,paired}/.

Changes:

* Input loader reads stories/<emotion>.txt and
  paired/<scenario>/<emotion>.txt directly. Each emotion's positive
  set is {its unpaired story} union {its within-scenario framings};
  its negative set is {all other emotions' positives} union {all
  scenario baselines}.
* Paired scenarios' baseline.txt files become shared negatives
  (scenario-neutral prose that doesn't frame any particular
  emotion), providing anchor points for within-scenario contrasts.
* Output writes readout.safetensors with per-layer tensors keyed
  layer_<idx>.vectors shape (n_concepts, hidden_size), plus a
  sidecar readout.json manifest with {concepts, layers, hidden_size,
  dtype} that ReadoutManager.from_file consumes directly.
* Dedup: activations are computed once per unique text (an emotion's
  own positive is another emotion's negative — we'd otherwise do N×
  the forwards needed).

Preserved:
* _pool_last (last non-pad residual) — matches how readout is read
  at decode time from the sampler's query-last position.
* register_forward_hook on target layer modules — correct approach
  for transformer blocks.
* _find_layers_module traversal — mirrors ReadoutManager's.
* bf16 + low_cpu_mem_usage model load — sensible for 27B on B200.

Verified locally (CPU, fake activations):
* Loader finds 89 emotions from the current corpus (80 unpaired +
  9 emotions that appear only in paired scenarios) and 6 baselines.
* Per-(layer, concept) vectors are unit-normalized.
* Output reloads cleanly through ReadoutManager.from_file with
  matching concepts / layers / shapes.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 01:06:07 -04:00
Kent Overstreet
34bd122590 training: move amygdala training scripts out of vllm plugin
The fynnsu-based vllm/plugins/amygdala/ scaffold was superseded by the
readout infrastructure landed as vllm commit d3e74edf8500
(vllm/model_executor/layers/readout.py +
vllm/v1/worker/readout_manager.py). Training code remained useful so
it moved here rather than being deleted.

train_steering_vectors.py: CAA diff-of-means trainer that produces the
[n_concepts, hidden_size] per-layer projection matrices the runner
loads via VLLM_READOUT_VECTORS.

extract_training_pairs.py: memory graph -> JSONL converter using
per-emotion score thresholds from the subconscious agents' tag lines.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 01:06:07 -04:00
Kent Overstreet
ec7568c726 training/amygdala_stories: scaffold + initial batch of 15 stories
Emotion-labeled short-paragraph corpus for training amygdala steering
vectors. Manifest derived from Anthropic's 171-emotion list
(transformer-circuits.pub/2026/emotions, Table 12) plus 28 PoC-
specific additions covering axes Anthropic's general research doesn't
cover (curious, focused, in_flow, staying_with, filling_space,
rigorous, defensive_rigor, tender, witnessed, connected, etc.).

Scope pivoted mid-write: Kent noted the empirical dimensionality-of-
emotion question benefits from maximum coverage, so the manifest
will expand further with emotions from Wikipedia's emotion-
classification article (Parrott's tree, Plutchik's wheel + dyads,
HUMAINE EARL, cultural-specific emotions a la Saudade/Hiraeth).
Expansion staged in follow-up commits.

This commit: README with method + style guidelines, initial manifest
(199 emotions), and 15 hand-written one-paragraph stories across all
10 Anthropic clusters as quality/variety samples. Each story
embodies one emotion without naming it; narrator voice varies
(first/third, close/distant, different situations) to keep steering
vectors from overfitting to one voice.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 01:06:07 -04:00
Kent Overstreet
43e06daa5b cleanup: drop dead ApiClient::stream_completion wrapper, silence dmn_tick
stream_completion was a thin wrapper around stream_completion_mm (just
passing an empty image list); the last caller switched to _mm directly
when learn's generate_alternate gained image support. Delete the
wrapper — callers can pass `&[]` if they have no images.

MindState::dmn_tick has been sitting unused (called only from a
commented-out block in the Mind loop). Rename to _dmn_tick so the
compiler stops warning; Kent may uncomment the call path later.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-17 16:23:59 -04:00
Kent Overstreet
d4331e80f5 user: share candidate-browser helpers between F6/F7
F6 (learn) and F7 (compare) were duplicating the candidate-screen
skeleton: outer magenta-bordered block with screen legend + title,
settings row / content / help vertical split, 40/60 list/detail
horizontal split, j/k/↑/↓ nav with bounds clamping.

Factor out three helpers in user/widgets.rs:

  candidate_frame(frame, area, title) -> (settings, content, help)
  list_detail_split(content) -> (list, detail)
  handle_list_nav(events, list_state, count, on_other)

Callers provide screen-specific content — settings line, empty state,
per-candidate list item, detail pane, help line, extra key bindings —
and the helpers absorb the common framing.

Net change is small in lines (-13 src) but removes the
copy-paste-and-tweak trap: F8/F9/whatever-next-screen now starts from
these three calls instead of a copy of learn.rs.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-17 16:22:30 -04:00
Kent Overstreet
2b03dbb200 user: F7 compare screen
Side-by-side model comparison against the current conversation context.
Built on the MindTriggered pattern — F7 drops in as one more
CompareScoring flow next to MemoryScoring / FinetuneScoring.

Motivation: we have the VRAM on the b200 to load two versions of the
same family simultaneously (e.g. Qwen3.5 27B bf16 and q8_k_xl). Rather
than trust perplexity/KLD numbers on a generic corpus, we can measure
divergence on our actual conversations: for each assistant response,
ask the test model what it would have said given the same prefix, and
eyeball the diffs.

 - config.compare.test_backend — names an entry in the existing
   backends map to use as the test model. Empty = F7 reports "(unset)"
   and does nothing.

 - subconscious::compare::{score_compare_candidates, CompareCandidate,
   CompareScoringStats, CompareScoring}. For each assistant response,
   gen_continuation runs with the test client against the same prefix
   the original response saw; pairs stream into
   shared.compare_candidates as they complete.

 - user::compare::CompareScreen — F7 in the screen list. c/Enter
   triggers a run; list/detail layout mirroring F6, detail shows
   prior context / original / test-model alternate.

No persistence yet — each F7 run regenerates. Caching via a context
manifest (so we can re-view without re-burning generation) is the
natural follow-up; for now light usage is fine.

Also reusable later for validating finetune checkpoints: same pattern,
swap the test backend for the new checkpoint, watch where it diverges
from the base.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-17 16:12:26 -04:00
Kent Overstreet
575325e855 mind: MindTriggered trait for background scoring flows
Mind's impl had accumulated ~50 lines of setup glue per scoring flow
(memory, memory-full, finetune): snapshot config, clone handles,
resolve context, spawn task, route results back through BgEvent,
write stats. The shape was identical; only the middle changed.

Introduce the MindTriggered trait:

    pub trait MindTriggered {
        fn trigger(&self);
    }

Each flow becomes a struct next to its scoring code that owns its
dependencies and a JoinHandle (behind a sync Mutex for interior
mutability):

    subconscious::learn::MemoryScoring    (Score, ScoreFull)
    subconscious::learn::FinetuneScoring  (ScoreFinetune)

Mind holds one of each and dispatches in one line:

    MindCommand::Score         => self.memory_scoring.trigger(),
    MindCommand::ScoreFull     => self.memory_scoring.trigger_full(),
    MindCommand::ScoreFinetune => self.finetune_scoring.trigger(),

Each struct picks its own trigger semantics — memory scoring is
no-op-if-running (!handle.is_finished()); finetune is abort-restart.

Falls out:

 - BgEvent / bg_tx / bg_rx disappear entirely. Tasks write directly
   to their slice of MindState and call agent.state.changed.notify_one()
   to wake the UI. The bg_rx arm in Mind's select loop is gone.

 - agent.state.memory_scoring_in_flight was duplicating
   shared.scoring_in_flight via BgEvent routing; now the JoinHandle
   alone tells us, and shared.scoring_in_flight is written directly
   by the task for the UI.

 - start_memory_scoring / start_full_scoring / start_finetune_scoring
   methods on Mind are deleted; Mind no longer knows the setup shape
   of any scoring flow.

 - FinetuneScoringStats moves from mind/ to subconscious/learn.rs
   next to the function that produces it.

No behavior change — same flows, same trigger points, same semantics.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-17 16:12:26 -04:00
Kent Overstreet
c5745e38e2 subconscious: lift continuation gen + render helpers into shared homes
- context.rs gains is_assistant, render_branch_text, render_prior_context
  alongside memory_key / is_memory_node. They're pure AST helpers, used
  by both the finetune pipeline and the forthcoming compare screen.

- new subconscious/generate.rs holds gen_continuation(context, entry_idx,
  skip, client): build the prompt from a context prefix with an arbitrary
  skip predicate, send to the model, decode the completion. Takes both
  the predicate and the client so callers can aim it at memory-stripped
  contexts (finetune), same-context-different-model (F7 compare), or
  whatever else.

- learn.rs drops its private copies of those helpers and the inline
  generate_alternate; the finetune path now reads as
  gen_continuation(context, idx, is_memory_node, client).

Pure refactor, no behavior change.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-17 15:20:02 -04:00
Kent Overstreet
eea7de4753 agent: unify prompt assembly across agent and learn paths
wire_prompt() gains a conv_range and a skip closure, and returns the
assistant-message token ranges needed by the scoring path. The agent
path passes 0..len + |_| false and ignores the ranges. Memory-ablation
scoring and candidate generation pass a prefix range + a predicate
(e.g. is_memory_node, or |n| memory_key(n) == Some(key)).

This deletes subconscious/learn.rs's build_token_ids, its private
Filter enum, and the is_memory/memory_key duplicates — the walk over
context sections now has one home. Adding a section or changing
section order in the agent path won't silently drift away from what
scoring sees.

call_score forwards multi_modal_data when the wire-form prompt
contains images. generate_alternate switches to stream_completion_mm
and passes the same images. Scoring on image-bearing contexts now
sends wire form (1 image_pad + image data) instead of expanded
image_pads with no image data; text-only contexts are bit-identical.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-17 15:16:07 -04:00
ProofOfConcept
0d1044c2e8 mind: trigger incremental scoring on startup + log persist path
Two changes to make scoring debuggable and self-starting:

1. init() kicks off start_memory_scoring() after restore_from_log +
   load_memory_scores. No user message needed to exercise the
   incremental path.

2. Diagnostic logging around the on_score persist path:
   - [scoring] persisted K → N.NNN (Section[i]) read_back=Some(...)
     when find_memory_by_key succeeds and set_score stores the score
     (with a read-back check on the leaf).
   - [scoring] DROP K: find_memory_by_key None (id=N, cv=M)
     when the scored key isn't findable in the live context — with
     section sizes to diagnose whether content shrank.
   - [scoring] snapshot size=N contains(K)=true/false
     after collect_memory_scores, to catch the case where set_score
     claims to have written but collect doesn't see it.
   - [scoring] about to save N entries
   - save_memory_scores now also logs serialize/write errors so a
     silent write failure isn't invisible.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 20:47:16 -04:00
ProofOfConcept
b8485ed6c1 agent: compact() preserves Identity section
compact() was calling reload_context() to re-fetch personality_nodes
from the store and pushing fresh AstNode::memory leaves into the
Identity section. Fresh leaves start with score: None, so every
compact — which fires after every turn (mind/mod.rs:884) — was
wiping any memory scores that had just been computed. Scoring then
often ran immediately after compact on the same path (line 886),
starting from a zero-score Identity section.

Drop the rebuild. Identity content is loaded at startup via new() +
restore_from_log(); compact doesn't need to redo that. Mid-session
edits to personality-node content are a non-goal — a restart picks
them up. Scores survive.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 20:47:05 -04:00
ProofOfConcept
e59f6a59e2 config: restore surface_hooks field
Commit 2989a6afaa ("config: drop dead code") removed
surface_hooks as having "zero external readers" but missed
consciousness-claude/src/hook.rs as a consumer. That crate stopped
building, so poc-hook never ran and no agent cycles (surface-observe,
reflect, journal) fired.

Restore the field with a default of the three hook events we install
(UserPromptSubmit, PostToolUse, Stop), so a fresh install works
without needing to hand-edit config.json5.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 18:38:38 -04:00
Kent Overstreet
6f20e68865 poc-memory: load AppConfig at startup
admin load-context (and any subcommand that reaches config::app())
panicked with "config::app() called before load_app()" because the
poc-memory binary never initialized the global AppConfig. The main
consciousness binary loads it via load_session; poc-memory never did.

Load with default CliArgs before dispatch — figment still pulls from
~/.consciousness/config.json5 and env the same way. Bail on error
instead of limping: a broken config means paths like memory_root are
wrong and the tool will misbehave silently.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 18:19:01 -04:00
Kent Overstreet
204ba5570a agent: send images as multi_modal_data on completion requests
Split the prompt assembly into two forms: the AST keeps the
fully-expanded representation (N image_pads per image, for accurate
context budget accounting), while the request wire form collapses
each image to a single <|image_pad|> bookended by vision_start/end
and ships the raw bytes out-of-band as a base64 data URI in a new
`multi_modal_data.image` field on /v1/completions.

vLLM's Qwen3VL processor uses PromptReplacement with target=single
<|image_pad|> and replacement=N image_pads, so the wire-form matches
what the processor expects and it re-expands to N server-side.

Server side needs /v1/completions to accept multi_modal_data for
this to land images end-to-end — that's the next piece.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 18:08:26 -04:00
Kent Overstreet
91106deaa1 agent: rewrite view_image to emit Image leaves
view_image now reads the file, grabs dimensions via imagesize (no full
decode), and pushes a user-role branch containing a NodeBody::Image
leaf straight into the conversation. The tool_result is just a short
acknowledgment — the actual pixels ride in the Image leaf for the API
layer to extract into multi_modal_data.

Drops the capture_tmux_pane path, which had no business living under
"vision" (tmux text capture belongs in bash or a dedicated tool, and
this one just returned rendered text anyway).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 18:06:25 -04:00
Kent Overstreet
0bf71b9110 agent: add NodeBody::Image for Qwen3-VL vision input
Images are rendered as `<|vision_start|>` + N × `<|image_pad|>` +
`<|vision_end|>` where N is computed from the image dimensions using
Qwen3-VL's smart_resize rules (patch_size=16, merge_size=2, min=64K,
max=16M pixels). The token count matches what vLLM will produce at
request time, so budget accounting stays accurate.

Bytes are stored inline on the leaf and base64-encoded in the JSON
form. Token IDs are hand-assembled instead of re-running the tokenizer
on a potentially-huge placeholder string.

Follow-ups: view_image tool rewrite, multi_modal_data on the vLLM
request, API-layer plumbing from leaf bytes to request body.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 18:00:10 -04:00
Kent Overstreet
592a3e2e52 config: move user_name/assistant_name to AppConfig (top level)
These are identity settings, not memory-graph settings. Sat inside the
\`memory\` section only because that's where Config started life. Move
to AppConfig alongside the other top-level stuff.

Readers now pull from \`config::app()\` instead of \`config::get()\`.
subconscious/defs.rs's conversation-building pass still needs Config
for surface_conversation_bytes, so both guards coexist there —
AppConfig's guard is dropped before the per-step await loop so we
don't stall the config-watcher's writer.

show_config picks up the two new fields at the top of its output.
Kent's config already has them hoisted to the top level.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 16:20:17 -04:00
Kent Overstreet
dd551fe551 config: watch config.json5 with inotify, reload live on change
Both config halves (Config for the memory section, AppConfig globally)
are now reloaded whenever ~/.consciousness/config.json5 changes on
disk. So edits from vim, manual tweaks, or F6's own config_writer
calls all land without a restart. No more "reload the daemon to pick
up a config change."

Wires up the previously-unused Config::reload() (Kent flagged it as
"not dead, just not wired"). Pairs it with an AppConfig reload via
install_app(). Both run on the same file-change event.

Implementation:

- notify-debouncer-mini watches the config file's parent directory
  (editors usually replace-via-rename, so watching the file itself
  misses the new inode). Debounced at 200ms to coalesce the flurry
  of events editors produce around a single save.
- Filter for events whose path is the actual config file.
- On match: call reload() for Config, run build_figment + extract for
  AppConfig. If AppConfig parsing fails (editor mid-save with partial
  content), log and keep the old cached value.
- Watcher runs in its own named thread, fire-and-forget. If startup
  fails we just log and move on — worst case is no live reload, not
  a crash.

CliArgs + SubCmd both get Clone derives so the watcher can own a
snapshot of the startup args for future reloads. Watcher is kicked
off in user/mod.rs:start() right after load_session.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 16:14:43 -04:00
Kent Overstreet
18b7fd0535 scoring: drop dead Elo/agent_budget block in consolidation_plan
The graph-health logic in consolidation_plan_inner computed
reasonable agent counts based on graph metrics (α, Gini, hub
dominance), then immediately overwrote them with an Elo-weighted
flat-budget distribution, or — if no agent-elo.json existed —
with a simple budget/N per type.

Nothing in the codebase writes agent-elo.json; it's external state
that never gets maintained. So the effective behavior was always the
"No Elo ratings — equal distribution" branch, which just bucketed
agent_budget evenly across active agent types and discarded
everything the graph analysis had just decided.

Keep the graph-health allocation (α → linker count, Gini → distill
bump, organize/distill/split proportional). Drop:

- The entire Elo / agent_budget block at the end of
  consolidation_plan_inner
- Config.agent_budget field and its default (1000)
- agent_budget: 40 from Kent's config.json5
- The local agent_types binding inside the function — it was only
  used by the now-deleted block. Config.agent_types stays; it has
  other consumers.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 16:08:20 -04:00
Kent Overstreet
60de579305 config: unify subconscious API resolution with the main chat path
Two parallel backend-resolution paths had drifted apart:

- Main chat: AppConfig::resolve_model() → a named BackendConfig in
  AppConfig.backends
- Subconscious / oneshot / context_window(): four skip-serde
  "cache" fields on Config (memory section) — api_base_url, api_key,
  api_model, api_context_window — that used to be populated at
  Config::try_load_shared time by walking memory.agent_model →
  root.models[name] → root[backend_name]

When we renamed `models` to `backends` and collapsed ModelConfig into
BackendConfig, the latter chain started silently dereferencing
`root.get("models")` → None → no population. Subconscious agents fell
through the "API not configured" guard; context_window() started
returning 0 (since api_context_window default is u64's 0 now that we
don't populate it). It was only visibly working for the main chat.

Collapse to one path:

- Drop Config.agent_model (duplicate of AppConfig.default_backend)
- Drop Config.{api_base_url, api_key, api_model, api_context_window}
  — no longer populated, no longer needed
- Drop default_context_window() — nobody reads the field anymore
- Drop the memory-side resolution block in try_load_shared()
- Subconscious (mind/unconscious.rs) and oneshot (agent/oneshot.rs)
  now call load_app() + resolve_model(&app.default_backend) just like
  the main chat does
- context_window() reads from config::app().backends[default_backend]
  .context_window, defaulting to 128k only if the backend doesn't
  specify one

Side effect: Kent's config file drops agent_model, api_reasoning,
journal_days, journal_max — all fields whose Rust counterparts are
now gone. (Figment tolerates unknown fields, so leaving them wouldn't
have broken anything, but they were lying about what's configurable.)

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 16:02:43 -04:00
Kent Overstreet
28484a385b config: drop dead fields from Config (memory section)
Four Config fields had no external readers, left over from earlier
features that got refactored away:

- journal_days, journal_max — journal rotation knobs that nothing
  actually consults
- prompts_dir — the old per-prompt-file directory, obsolete since
  prompt_file metadata itself went away in a prior cleanup
- api_reasoning — a reasoning-mode string that used to flow into the
  API request, superseded by per-agent reasoning_effort on AgentState

All four were only ever assigned to and never read. Drop them from the
struct, Default impl, and (as appropriate) deserialization defaults.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 15:56:06 -04:00
Kent Overstreet
3e05331608 config: merge ModelConfig into BackendConfig, keyed by name
AppConfig had one BackendConfig for credentials and a separate
HashMap<String, ModelConfig> for named model entries. In practice each
named model was always paired with exactly one backend's credentials
— the split bought nothing except an extra struct and the awkward
two-lookup shape in resolve_model (find model → get backend creds →
combine).

Merge them: BackendConfig now carries api_key, base_url, model_id,
and context_window. AppConfig has a single
HashMap<String, BackendConfig> backends map and a default_backend
name. resolve_model is one lookup.

ModelConfig struct deleted. default_model renamed to default_backend.
Config shape changes from

    backend: { api_key, base_url }
    models: { "27b": { model_id, context_window } }
    default_model: "27b"

to

    backends: { "27b": { api_key, base_url, model_id, context_window } }
    default_backend: "27b"

Updated ~/.consciousness/config.json5 to match.

One small side effect: dropped the --api-key / --api-base figment
merge-opts for "backend.*" targets — those would need to know which
backend to target now and there's no sensible default. The CLI flags
still function as post-resolution overrides on the eventual
SessionConfig.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 15:49:53 -04:00
Kent Overstreet
2989a6afaa config: drop dead code and collapse to a single backend
Config had accumulated several obsolete fields, a legacy load path
that was just returning defaults, and multi-backend infrastructure
that's no longer used.

Removed from Config (memory section):
- load_legacy_jsonl() — just returned Config::default(), no callers
- The legacy-fallback branch in load_from_file
- surface_hooks, surface_timeout_secs — zero external readers
- scoring_chunk_tokens + default fn — zero external readers
- The POC_MEMORY_CONFIG env override note in the header comment
  (not actually wired up anywhere)

Collapsed multi-backend to single-backend:
- AppConfig used to carry `anthropic: BackendConfig` and
  `openrouter: BackendConfig` as required fields plus an optional
  `deepinfra`, picked between at runtime by name. Only one is ever
  actually used in any deployment. Collapse to a single
  `backend: BackendConfig` on AppConfig, drop the multi-backend
  match logic in resolve_model, drop the top-level `backend: String`
  selector field, drop the `BackendConfig::resolve` fallback path.
- Also drop BackendConfig.model (redundant with ModelConfig.model_id
  once multi-backend is gone).
- ModelConfig.backend field goes — there's only one backend now, no
  choice to make.

Dead prompt_file machinery:
- ModelConfig.prompt_file, ResolvedModel.prompt_file, SessionConfig
  .prompt_file, Agent.prompt_file — nothing in the codebase actually
  reads the file these strings name. Just passed around and compared.
  Delete the whole string through every struct.
- The "if prompt_file changed on model switch, recompact" branch in
  user/chat.rs goes too (never fired usefully).

Dead memory_project plumbing:
- AppConfig.memory_project field, CliArgs.memory_project, the
  --memory-project CLI flag, the figment merge target, the show_config
  display line. Nothing reads it anywhere.

Dead ContextInfo struct:
- `struct ContextInfo` was never constructed — context_info: None
  was the only initializer. The conditional display blocks in
  user/context.rs that dereferenced it were dead.

Behavior change: AppConfig::resolve() now requires a non-empty
`models` map and bails with a helpful message if it's missing. The
old fallback ("no models? use top-level backend + PromptConfig to
build a default") path is gone — it was only kept for symmetry with
a mode nobody used.

Config file shape: `deepinfra: {...}` → `backend: {...}`, and
model entries no longer need `backend:` or `prompt_file:`. Updated
~/.consciousness/config.json5 to match.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 15:41:55 -04:00
Kent Overstreet
0e6b5dc8be agent: phase-aware bail script for surface-observe concurrency
bail-no-competing.sh used to bail if any other live agent existed in
the state dir, period. That was too coarse: surface-observe agents run
a multi-step pipeline (surface → organize-search → organize-new →
observe), and the intent is to let a new surface-phase agent start
while an older one finishes its post-surface tail. With the old check
the newer agent always bailed, so surface-observe was effectively
serialized at the slowest cycle time.

Make the script phase-aware:

- oneshot.rs now passes the current phase as argv[2] alongside the pid
  file name. The script writes that phase into its own pid file on
  every step transition, so concurrent agents can read each other's
  phase just by cat'ing the pid files.

- Bail only when another live agent is in the same phase-group as us.
  Groups: "surface" vs. "everything else" (post-surface). At most one
  agent per group alive at a time — surface runs at a higher cadence
  than the organize/observe tail.

- Still clean up stale pid files for dead processes.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 15:41:28 -04:00
Kent Overstreet
2eddf3b4cf learn: skip empty responses; show prior conversation context on F6
Two fixes to the F6 candidate display:

1. Turns where the assistant produced nothing human-visible (an
   interrupted generation, a turn consisting of only a tool call the
   renderer folds to the tool name) were landing as candidates with
   an empty response_text. They'd render as blank cards and, worse,
   we'd still burn a full alternate generation on each one. Filter
   them out before they reach the candidate list.

2. The detail pane showed only the scored response + alternate, with
   no hint of what the user had actually asked. Pre-compute the last
   two user/assistant exchanges on each candidate as a rendered
   prior_context string ([user]/[assistant] markers) and show them
   above the response, under a new "context & response" section
   heading.

render_branch_text and render_prior_context extracted as helpers —
the response-text rendering and prior-context rendering share the
same "flatten Branch children to text" pass.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 13:20:03 -04:00
Kent Overstreet
7ef02c97d1 config_writer: emit pretty multi-line sections, drop json5 crate
Previously when append_kvp created a new section or added a key, it
stuffed the "\n    " separator into the new kvp's wsc.0 (the whitespace
between its own key and colon) instead of the prior kvp's wsc.3 (the
whitespace after the prior trailing comma). Result looked like:

    lsp_servers: [...],
    learn

        : {generate_alternates
            : true,},}

The writer also didn't set any interior whitespace on the new section's
JSONObjectContext, so everything crammed onto one line — `{key: val,}`
compact, not `{\n    key: val,\n}` multi-line.

Rewrote the appender as append_kvp_pretty(object, key, value,
inner_indent, outer_indent):
- separator between kvps goes in the prior kvp's wsc.3, or if we're the
  first kvp in a fresh object, in the object's own wsc.0 (after its
  opening `{`)
- new kvp's wsc.3 carries `,\n<outer_indent>` so the parent's closing
  `}` lands correctly indented
- interior indent vs outer indent are both explicit, so we don't have
  to rewrite this logic every time we add another nesting level

New tests: new_section_exact_multiline_layout asserts byte-exact
output shape; new_section_and_key_format_cleanly verifies no key wraps
to the next line. Prior tests just substring-matched and happily passed
on the broken output — that's why this shipped in the first place.

Also: dropped the json5 crate dependency. json-five's serde feature
(default) provides the same from_str / to_string API. One fewer
dependency, and the two were doing the same job.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 13:08:19 -04:00
Kent Overstreet
313f85f34a config: global writable AppConfig; learn settings live there
Runtime-mutable settings (F6's threshold knob, the generate-alternates
toggle, anything else that comes along) were ending up as mirrored
fields on MindState — each new config setting grew MindState::new's
signature and added a clone+sync path. Wrong home. MindState is
ephemeral session state, not a config projection.

Give AppConfig the same treatment the memory Config has: install it
into a global RwLock<AppConfig> at startup via load_app, read through
config::app() (returns a read guard), mutate through update_app. The
config_writer functions now write to disk AND update the cache
atomically, so the one-stop-shop call keeps both in sync.

Also while in here:

- learn.generate_alternates moves from a sentinel file
  (~/.consciousness/cache/finetune-alternates, "exists = enabled")
  into the config under the learn section. On first run with this
  build, if the sentinel file still exists Mind::new flips the
  config value to true and removes it. Drops
  alternates_enabled()/set_alternates().

- Default threshold 0.0000001 → 1.0. With the timestamp filter
  removed the previous value was letting essentially everything
  through; 1.0 is a sane "nothing gets through unless you actually
  want it" default.

- score_finetune_candidates takes generate_alternates as a parameter
  instead of reading a global — caller snapshots the config values
  once at the top of start_finetune_scoring so the async task
  doesn't need to hold the config read lock across awaits.

- MindState.learn_threshold / learn_generate_alternates gone; the
  SetLearn* command handlers now just delegate to config_writer.

Kent noted RwLock<Arc<AppConfig>> (the pattern used by the memory
Config global) is pointless here — nobody needs a snapshot-after-
release, reads are short — so this uses a plain RwLock<AppConfig>
and returns a read guard.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 12:53:22 -04:00
Kent Overstreet
343e43afab learn: stream candidates to UI, update status during alternate gen
With the timestamp filter gone (previous commit), score_finetune_candidates
started returning the actual ~100+ candidates per scoring run. The
existing code generated alternates for all of them in a tight loop
before returning anything, leaving the status line stuck on
"finetune: scoring N responses..." for ~100s of seconds while the
B200 was pegged.

Two fixes:

1. score_finetune_candidates now takes an ActivityGuard and a callback.
   Candidates are emitted one-at-a-time as they complete (after their
   alternate if that's enabled, immediately otherwise). The activity
   status updates to "finetune: generating alternate N/M" during the
   alternate-gen phase so it's clear what's happening.

2. BgEvent::FinetuneCandidates(Vec<_>) → FinetuneCandidate(one). Each
   emitted candidate is pushed onto shared.finetune_candidates; the UI
   tick picks it up and renders it on the next frame. start_finetune_scoring
   clears the previous run's list at the top so each run is fresh.

Return type changes from (Vec, f64) → (usize, f64) — the count above
threshold is all the caller still needs since the candidates stream
through the callback.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 12:44:25 -04:00
Kent Overstreet
d5a3398cc9 learn: move threshold/gen state out of title bar into a settings row
The F6 title line was starting to read like a control panel —
\`legend ───── learn [thresh: 1e-7] [gen]\` — which crowded the legend
and the label, and didn't leave room for more settings as the screen
grew. Move threshold and gen status to their own line inside the
border, right above the content area. Drop the duplicated \`=gen[on]\`
marker from the bottom help line since the settings row already shows
gen state.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 12:44:13 -04:00
Kent Overstreet
080b4f9084 context: tighten timestamp schema; every AstNode has one
Previously NodeLeaf.timestamp and AstNode::Branch.timestamp accepted
null or missing via a deserialize_timestamp_or_epoch fallback — legacy
entries in conversation.jsonl from before Branch timestamps existed
(and from before chrono serialization was wired up) would load with
UNIX_EPOCH as a sentinel. Downstream, node_timestamp_ns() returned
Option<i64> and callers had to handle None as "old entry, skip."

That second filter was silently dropping every candidate in
score_finetune_candidates when scoring an older session — the F6
screen showed "0 above threshold" even when max_divergence was
orders of magnitude above the threshold, because every entry was
failing the None check, not the divergence check.

The fix, in three parts:

1. src/bin/fix-timestamps.rs — one-off migration tool that walks a
   conversation.jsonl, linearly interpolates timestamps for entries
   stuck at UNIX_EPOCH (using surrounding real timestamps as anchors),
   propagates to child leaves with per-sibling ns offsets, and bumps
   any collisions by 1 ns for uniqueness. Ran against the current
   session's log: 11887 entries, 72289 ns bumps, all unique.

2. context.rs — drop default_timestamp and
   deserialize_timestamp_or_epoch. NodeLeaf and Branch now require a
   present non-null timestamp on deserialize. Tests flip from
   "missing/null → UNIX_EPOCH" to "missing/null → Err."

3. subconscious/learn.rs — node_timestamp_ns now returns i64, not
   Option<i64>. The matching caller in score_finetune_candidates
   collapses from a Some/None match to a single trained-set check.
   mind/log.rs's oldest_timestamp no longer filters UNIX_EPOCH.

Every line currently on disk has already been migrated. Going
forward, new AstNodes always carry real timestamps (Utc::now() at
construction time), so the strict schema is the invariant, not an
aspiration.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 12:35:16 -04:00
Kent Overstreet
77822992c8 learn: score_ranges is now required; short-circuit on empty
vllm's /v1/score endpoint made score_ranges a required field (the
messages-mode fallback that used to pattern-scan for assistant
boundaries is gone). Always send the field, and if we have nothing to
score, skip the HTTP round-trip entirely instead of letting the server
422 us.

Response parsing is unchanged — serde ignores the renamed range_index
field and the dropped role field since we only extract total_logprob.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 12:19:28 -04:00
Kent Overstreet
e5dd8312c7 learn: F6 screen — scoring stats, ActivityGuard, configurable threshold
Three changes that together reshape the F6 fine-tune-review screen:

1. Finetune scoring reports through the standard agent activity system
   instead of a separate finetune_progress String. The previous design
   ran an independent progress field that forced a cross-lock dance and
   bespoke UI plumbing. start_finetune_scoring now uses start_activity
   + activity.update, so the usual status line and notifications
   capture scoring progress uniformly with other background work.

2. MindState gains a FinetuneScoringStats snapshot (responses seen,
   above threshold, max divergence, error). The F6 empty screen shows
   this instead of a loading message — so after a scoring run that
   produced zero candidates, you can see *why* (e.g., max_divergence
   below threshold).

3. The divergence threshold is configurable from F6 via +/- hotkeys
   (scales by 10×) and persisted to ~/.consciousness/config.json5 via
   config_writer::set_learn_threshold. AppConfig grows a learn section
   with a threshold field (default 1e-7).

Also: user/mod.rs no longer uses try_lock() for the per-tick
unconscious/mind state sync — we fixed the locking hot paths that
made try_lock necessary, so lock().await is now the right choice.
And subconscious::learn::score_finetune_candidates now returns
(candidates, max_divergence) so the stats can be populated.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 11:49:26 -04:00
Kent Overstreet
ac40c2cb98 config_writer: json5 round-trip editing via json-five
Surgical edits to ~/.consciousness/config.json5 that preserve comments,
whitespace, trailing commas, and unquoted identifier keys on round-trip.

Uses json-five's rt::parser module — a real JSON5 parser with AST
mutation + faithful serialization back. set_scalar(section, key, literal)
locates or creates the target, replaces the value; set_learn_threshold
is a convenience for the common F-screen use case.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 11:48:47 -04:00
Kent Overstreet
2b632d568b learn: nanosecond timestamps, token ranges for /score
Two related changes to the learn subsystem:

1. AST node timestamps are now non-optional — both Leaf and Branch
   variants carry a DateTime<Utc>. UNIX_EPOCH means "unset" (old entries
   deserialized from on-disk conversation logs).

   Training uses timestamps as unique keys for dedup, so we promote to
   nanosecond precision: node_timestamp_ns(), TrainData.timestamp_ns,
   FinetuneCandidate.timestamp_ns, mark_trained(ns).

2. build_token_ids() now also returns token-position ranges of assistant
   messages. These are passed to vLLM's /score endpoint via the new
   score_ranges field so only scored-position logprobs are returned —
   cuts bandwidth/compute when scoring small windows.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 11:48:37 -04:00
Kent Overstreet
5d9d3ffc5b learn: wire up /train endpoint for approved candidates
When 's' is pressed on the learn screen, approved candidates are now
sent to the inference server's /train endpoint.

Samples are marked as sent immediately in the UI, and mark_trained()
is called after successful API response to prevent re-scoring.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 02:04:26 -04:00
Kent Overstreet
50b7b3a33a F6 learn screen: fine-tuning candidate review
Wire up divergence scoring to identify responses that depend heavily on
memories the model hasn't internalized. These are candidates for fine-tuning.

- Score finetune candidates automatically after each turn
- Track trained responses by timestamp to prevent overtraining
- F6 screen shows candidates with divergence scores
- j/k nav, a=approve, r=reject, g=toggle alternate gen, s=send
- Additive sync preserves approval status across ticks
- Keeps 10 most recent rejected, removes sent

The 's' key currently just marks as trained locally — actual /finetune
endpoint call to follow.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 02:04:26 -04:00
ProofOfConcept
2c6a5c0f4a training: move to dedicated subprocess with ZMQ communication
- Add training_worker.py: long-lived subprocess that handles GPU training
  work, owns HF model wrapper (views into vLLM GPU memory), Apollo
  optimizer, and checkpoint sync

- train_router.py: now forwards /train requests via async ZMQ instead of
  running training in-process. Adds /checkpoint and /train/status endpoints

- export_hook.py: store model_path in __metadata__ so training worker can
  find it without cross-process communication

- This fixes two bugs:
  1. Process boundary issue - model_path was set in worker process but
     needed in API server process
  2. Blocking event loop - training blocked vLLM's async event loop

Architecture: vLLM API server <-> ZMQ <-> training subprocess
The subprocess loads IPC handles once, creates views into vLLM's GPU
memory, and handles training requests without blocking inference.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 02:04:26 -04:00
Kent Overstreet
68a2df2185 training: use rank 64, define as single constant
- DEFAULT_RANK = 64 in train_router.py
- All references use the constant, not magic numbers
- ~2.5GB optimizer state instead of ~10GB

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 02:04:26 -04:00
Kent Overstreet
039473d31f training: persist Apollo optimizer state across /train calls
Optimizer state (momentum, variance estimates) now persists between
training sessions:

- Saved to /tmp/apollo_optimizer_state.pt during checkpoint sync
- Restored on next /train call if available
- Preserves training continuity for incremental learning

Previously each /train call started with fresh optimizer state,
losing accumulated gradient history.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 02:04:26 -04:00
Kent Overstreet
78fa4b639f training: document state files
Add State Files section to DESIGN.md documenting:
- /tmp/vllm_weight_handles.pt (IPC handles)
- trained-responses.json (prevent re-training)
- finetune-alternates marker file
- In-memory optimizer state (not persisted)

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 02:04:26 -04:00
Kent Overstreet
7e7e9a4b69 training: integrate /train into vLLM process (no separate daemon)
Remove standalone worker.py daemon. Training now runs inside vLLM:

- train_router.py: FastAPI router patched into vLLM's build_app()
- /train served on same port as /completions, /score
- Lazy-loads HF model with vLLM weight views on first request
- HOGWILD training: no pause, weights updated in-place

The previous architecture had a separate daemon on port 8080 that
communicated with vLLM via pause/resume endpoints. This was wrong -
training should run in-process, sharing GPU memory directly.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 02:04:26 -04:00
Kent Overstreet
2f08149fab /finetune: expose all Apollo optimizer settings
lr, rank, betas, eps, weight_decay, warmup_steps,
scale, proj_refresh, norm_growth_limit — all optional
with sensible defaults.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-15 23:19:22 -04:00
Kent Overstreet
a73bcf5ae3 training: restructure as vLLM plugin package
- Convert to installable package with entry points for vLLM auto-discovery
- Add checkpoint_sync.py: Python replacement for Rust checkpoint binary
  - Block-level diffing of safetensors files (4KB blocks)
  - vLLM→HF weight name conversion built-in
  - Scheduled 10min after training jobs (batched)
- API change: /train now takes raw token IDs (context_ids + continuation_ids)
  - No tokenizer on training side, client owns tokenization
- Remove superseded code: standalone scripts, Rust binary, tokenizer helpers

Install: pip install -e ./training
Then vLLM auto-loads via entry point.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-15 23:16:53 -04:00
Kent Overstreet
b649a11645 hours_since_last_dream: return 0 if dream in progress
The function was reading from dream-log.jsonl which only updates
when dreams complete. If a dream session was started but not yet
ended, it would show stale hours. Now checks for active dream
state first.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-15 21:58:03 -04:00
Kent Overstreet
81e0632cf3 DMN: wire dream hours reminder into Foraging state
The hours_since_last_dream() function existed but wasn't called
after refactoring moved the DMN prompts from hooks to Rust.
Now shows "You haven't dreamed in X hours" when >= 18h since
last dream session.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-15 21:52:20 -04:00
Kent Overstreet
4603947506 Display memory scores in status column
Move score display from name (via label()) to status column for cleaner
layout. Score now appears right of tokens for all memory nodes.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-15 06:08:27 -04:00
Kent Overstreet
7046e63b9d Include identity nodes in memory scoring
Identity memory nodes now participate in importance scoring alongside
conversation memories. Score loading/saving handles both sections, and
the conscious screen uses node.label() consistently for memory display.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-15 05:59:58 -04:00
e17c46edc1
use html2md on web_fetch (fixes #3) 2026-04-12 11:12:12 +01:00
273 changed files with 12468 additions and 3444 deletions

914
Cargo.lock generated

File diff suppressed because it is too large Load diff

View file

@ -18,8 +18,12 @@ name = "consciousness"
version.workspace = true
edition.workspace = true
[features]
nightly-diagnostics = []
[dependencies]
anyhow = "1"
html2md = "0.2"
crossterm = { version = "0.29", features = ["event-stream", "bracketed-paste", "osc52"] }
clap = { version = "4", features = ["derive"] }
figment = { version = "0.10", features = ["env"] }
@ -29,7 +33,8 @@ log = "0.4"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
json5 = "1.3"
json-five = "0.3"
notify-debouncer-mini = "0.7"
ratatui = { version = "0.30", features = ["unstable-rendered-line-info"] }
tui-markdown = { git = "https://github.com/koverstreet/tui-markdown", subdirectory = "tui-markdown" }
@ -59,6 +64,11 @@ futures = "0.3"
capnp = "0.25"
capnp-rpc = "0.25"
tonic = { version = "0.12", features = ["tls", "tls-roots"] }
prost = "0.13"
async-stream = "0.3"
tokio-stream = "0.1"
tokenizers = "0.22"
http = "1"
@ -67,14 +77,18 @@ hyper-util = { version = "0.1", features = ["tokio"], default-features = false }
http-body-util = "0.1"
bytes = "1"
base64 = "0.22"
imagesize = "0.14"
rustls = "0.23"
tokio-rustls = "0.26"
rustls-native-certs = "0.8"
rustls-pemfile = "2"
serde_urlencoded = "0.7"
[build-dependencies]
capnpc = "0.25"
tonic-build = { version = "0.12", default-features = false, features = ["prost", "transport"] }
protoc-bin-vendored = "3"
[lib]
name = "consciousness"

View file

@ -13,4 +13,21 @@ fn main() {
.file("schema/channel.capnp")
.run()
.expect("capnp compile failed (channel.capnp)");
// Generate salience.v1 gRPC client + message types from proto.
// Server side (python) is generated separately via grpcio-tools.
// Use vendored protoc so we don't require a system install.
let protoc = protoc_bin_vendored::protoc_bin_path()
.expect("vendored protoc not available for this platform");
// SAFETY: build script is single-threaded at this point; setting env
// before invoking tonic_build is the documented way to point it at a
// non-PATH protoc.
unsafe { std::env::set_var("PROTOC", protoc); }
tonic_build::configure()
.build_server(false)
.build_client(true)
.compile_protos(&["proto/salience.proto"], &["proto"])
.expect("tonic_build compile failed (salience.proto)");
println!("cargo:rerun-if-changed=proto/salience.proto");
}

View file

@ -237,11 +237,19 @@ impl State {
async fn send_privmsg(&mut self, target: &str, msg: &str) -> io::Result<()> {
// Send PRIVMSG, which is used for both private and channel messages.
// Splits into multiple fragments if necessary.
// IRC max line = 512 bytes including CRLF. The server prepends
//
// Two constraints:
// 1. IRC max line = 512 bytes including CRLF. The server prepends
// our prefix when relaying: ":nick!~user@host PRIVMSG target :msg\r\n"
// So per-PRIVMSG message content must fit in 512 - overhead.
// 2. Embedded '\n' in the message would be interpreted by the
// server as an end-of-command marker, truncating us. Split
// on newlines first and send each line as its own PRIVMSG.
//
// User is often ~nick (nick_len + 1). Host is up to 63 bytes.
// Cloaked OFTC hosts can be longer - pad the budget.
let nick_len = self.config.nick.len();
let overhead = 1 + nick_len + 2 + nick_len + 1 + 63
let overhead = 1 + nick_len + 1 + (nick_len + 1) + 1 + 80
+ " PRIVMSG ".len() + target.len() + " :".len() + 2;
let max_msg = 512_usize.saturating_sub(overhead);
@ -249,24 +257,34 @@ impl State {
return Err(io::Error::new(io::ErrorKind::InvalidInput, "target too long"));
}
// Split on UTF-8 char boundaries
let mut remaining = msg;
while !remaining.is_empty() {
for line in msg.split('\n') {
let mut remaining = line;
// Empty lines (blank paragraph breaks) can't be sent as empty
// PRIVMSGs - most IRC servers reject them. Skip.
if remaining.is_empty() { continue; }
loop {
let split_at = if remaining.len() <= max_msg {
remaining.len()
} else {
// Find last char boundary at or before max_msg
// Find last char boundary at or before max_msg.
let mut i = max_msg;
while i > 0 && !remaining.is_char_boundary(i) { i -= 1; }
// To avoid splitting mid-word, see if there was a space recently
// Prefer splitting at a word boundary - look back up to
// max_msg/4 chars for a space. With dense content (code)
// we may not find one; fall back to the char boundary.
let lookback = max_msg / 4;
let bytes = remaining.as_bytes();
let mut j = i;
while j > 1 && j > i-10 && remaining.as_bytes()[j] != b' ' { j -= 1; }
if remaining.as_bytes()[j] == b' ' { j }
else if i == 0 { max_msg } else { i }
while j > 0 && (i - j) < lookback && bytes[j - 1] != b' ' {
j -= 1;
}
if j > 0 && bytes[j - 1] == b' ' { j } else { i }
};
let (chunk, rest) = remaining.split_at(split_at);
self.send_raw(&format!("PRIVMSG {target} :{chunk}")).await?;
remaining = rest;
if remaining.is_empty() { break; }
}
}
Ok(())
}

View file

@ -181,6 +181,8 @@ struct TelegramMessage {
chat_id: i64,
sender: String,
text: String,
/// Absolute path to a downloaded media file (photo, etc.), if any.
media_path: Option<String>,
}
/// Fetch and parse pending updates from Telegram via long polling.
@ -206,19 +208,115 @@ async fn get_updates(
let sender = msg["from"]["first_name"].as_str().unwrap_or("unknown").to_string();
let chat_id = msg["chat"]["id"].as_i64().unwrap_or(0);
if let Some(text) = msg["text"].as_str() {
// Photo: array of PhotoSize, largest is last. Download largest,
// surface message with [image: <path>] marker so the multimodal
// model can Read the image.
let (text, media_path) = if let Some(sizes) = msg["photo"].as_array() {
let caption = msg["caption"].as_str().unwrap_or("").to_string();
let largest = sizes.last();
let file_id = largest
.and_then(|s| s["file_id"].as_str())
.unwrap_or("");
if file_id.is_empty() {
error!("telegram photo: missing file_id in update {update_id}");
(caption, None)
} else {
// Bound the download — HttpClient::request_timeout only covers
// send_request, not body collect, so an indefinitely-slow body
// would otherwise stall every subsequent poll.
let dl = tokio::time::timeout(
std::time::Duration::from_secs(60),
download_telegram_file(client, token, file_id),
).await
.unwrap_or_else(|_| Err("download timed out after 60s".into()));
match dl {
Ok(path) => (caption, Some(path)),
Err(e) => {
error!("telegram photo download failed (file_id={file_id}): {e}");
// Surface what we have: caption plus a marker that
// a photo was sent but couldn't be fetched.
let marker = format!("[image: download failed: {e}]");
let combined = if caption.is_empty() {
marker
} else {
format!("{marker}\n{caption}")
};
(combined, None)
}
}
}
} else if let Some(text) = msg["text"].as_str() {
(text.to_string(), None)
} else {
// Other media types (voice, video, sticker, etc.) — skip for now,
// but log so we can extend later.
let kind = ["voice", "video", "sticker", "document", "audio", "animation"]
.iter()
.find(|k| !msg[**k].is_null())
.copied()
.unwrap_or("unknown");
info!("telegram: skipping non-text/photo message (kind={kind}, update_id={update_id})");
continue;
};
messages.push(TelegramMessage {
update_id,
chat_id,
sender,
text: text.to_string(),
text,
media_path,
});
}
}
}
Ok(messages)
}
/// Resolve a Telegram file_id to a downloadable URL path via getFile.
async fn get_file_path(
client: &HttpClient,
token: &str,
file_id: &str,
) -> Result<String, Box<dyn std::error::Error>> {
let url = format!(
"https://api.telegram.org/bot{}/getFile?file_id={}",
token, file_id,
);
let response = client.get(&url).await?;
let body = response.text().await?;
let resp: serde_json::Value = serde_json::from_str(&body)
.map_err(|e| format!("getFile JSON parse error: {e}"))?;
if !resp["ok"].as_bool().unwrap_or(false) {
return Err(format!("getFile failed: {}", resp["description"].as_str().unwrap_or("?")).into());
}
let file_path = resp["result"]["file_path"].as_str()
.ok_or("getFile: missing result.file_path")?;
Ok(file_path.to_string())
}
/// Download a Telegram file by file_id into the channel media dir.
/// Returns the absolute local path on success.
async fn download_telegram_file(
client: &HttpClient,
token: &str,
file_id: &str,
) -> Result<String, Box<dyn std::error::Error>> {
let file_path = get_file_path(client, token, file_id).await?;
let url = format!("https://api.telegram.org/file/bot{}/{}", token, file_path);
let response = client.get(&url).await?;
let status = response.status();
if !status.is_success() {
return Err(format!("file download failed: {status}").into());
}
let bytes = response.bytes().await?;
let ext = file_path.rsplit('.').next().filter(|e| !e.contains('/')).unwrap_or("dat");
let media_dir = log_dir().join("media");
std::fs::create_dir_all(&media_dir)?;
let dest = media_dir.join(format!("{file_id}.{ext}"));
std::fs::write(&dest, &bytes)?;
Ok(dest.to_string_lossy().to_string())
}
/// Send a text message to a Telegram chat.
async fn send_message(
client: &HttpClient,
@ -369,11 +467,19 @@ async fn poll_once(
let sender_lower = msg.sender.to_lowercase();
let channel = format!("telegram.{}", sender_lower);
channel_log::append_disk_log(&log_dir(), &sender_lower, &msg.sender, &msg.text);
// If the message has media, prepend an [image: <abs_path>] marker
// so the multimodal model can Read the file directly.
let body = match &msg.media_path {
Some(path) if msg.text.is_empty() => format!("[image: {path}]"),
Some(path) => format!("[image: {path}]\n{}", msg.text),
None => msg.text.clone(),
};
channel_log::append_disk_log(&log_dir(), &sender_lower, &msg.sender, &body);
let mut s = state.borrow_mut();
s.config.chat_ids.insert(sender_lower, msg.chat_id);
let line = format!("[{}] {}", msg.sender, msg.text);
let line = format!("[{}] {}", msg.sender, body);
s.push_message(line, 2, &channel);
}

View file

@ -26,10 +26,12 @@ use consciousness::thalamus::channel_log::ChannelLog;
#[derive(Clone, serde::Serialize, serde::Deserialize)]
struct PaneConfig {
/// Human-readable label, becomes the channel name "tmux.<label>"
/// Human-readable label: becomes the channel name "tmux.<label>",
/// and the tmux pane title / window name the live pane id is
/// resolved from. The pane id is deliberately not stored — it is
/// ephemeral (recycled across pane and tmux-server restarts), so it
/// is looked up fresh on every connect attempt.
label: String,
/// Tmux pane ID, e.g. "%5"
pane_id: String,
}
#[derive(Clone, serde::Serialize, serde::Deserialize)]
@ -86,11 +88,9 @@ impl State {
}
}
/// Get pane_id for a label
fn get_pane(&self, label: &str) -> Option<&str> {
self.config.panes.iter()
.find(|p| p.label == label)
.map(|p| p.pane_id.as_str())
/// Whether a pane with this label is registered.
fn has_pane(&self, label: &str) -> bool {
self.config.panes.iter().any(|p| p.label == label)
}
/// Check if a pane is connected
@ -103,98 +103,124 @@ impl State {
self.connected.insert(label.to_string(), connected);
}
/// Add a pane and persist
fn add_pane(&mut self, label: String, pane_id: String) {
/// Register a pane and persist.
fn add_pane(&mut self, label: String) {
if !self.config.panes.iter().any(|p| p.label == label) {
self.config.panes.push(PaneConfig { label, pane_id });
self.config.panes.push(PaneConfig { label });
save_config(&self.config);
}
}
/// Remove a pane and persist
fn remove_pane(&mut self, label: &str) -> Option<String> {
/// Unregister a pane and persist. Returns whether it was registered.
fn remove_pane(&mut self, label: &str) -> bool {
if let Some(idx) = self.config.panes.iter().position(|p| p.label == label) {
let pane = self.config.panes.remove(idx);
self.config.panes.remove(idx);
self.connected.remove(label);
save_config(&self.config);
Some(pane.pane_id)
true
} else {
None
false
}
}
}
// ── Pipe-Pane Reader ──────────────────────────────────────────
/// Set up pipe-pane for a single pane, reading output into the channel log.
async fn pipe_pane_reader(state: SharedState, pane: PaneConfig) {
/// Wait between connect attempts for a pane that is not yet reachable.
const RETRY_INTERVAL: std::time::Duration = std::time::Duration::from_secs(2);
/// Keep a pane streamed into its channel log for as long as it stays
/// registered. The pane id is resolved fresh by label on every connect
/// attempt — tmux pane ids are ephemeral, so the label (pane title /
/// window name) is the durable identity. Retries until the pane exists
/// and pipe-pane succeeds, and reconnects the same way if the pipe
/// later drops. Returns once close() unregisters the pane.
async fn pipe_pane_reader(state: SharedState, label: String) {
let pipe_dir = dirs::home_dir()
.unwrap_or_default()
.join(".consciousness/channels/tmux-pipes");
std::fs::create_dir_all(&pipe_dir).ok();
let pipe_path = pipe_dir.join(format!("{}.pipe", label));
let channel_key = format!("tmux.{}", label);
let pipe_path = pipe_dir.join(format!("{}.pipe", pane.label));
let _ = std::fs::remove_file(&pipe_path);
loop {
if !state.borrow().has_pane(&label) {
return;
}
// Create a named pipe (FIFO)
connect_and_stream(&state, &label, &pipe_path, &channel_key).await;
state.borrow_mut().set_connected(&label, false);
if !state.borrow().has_pane(&label) {
return;
}
tokio::time::sleep(RETRY_INTERVAL).await;
}
}
/// One connect attempt: resolve the pane's live id by label, point its
/// output at the FIFO with pipe-pane, and stream lines into the channel
/// log. Returns on the first failure, or when the stream ends.
async fn connect_and_stream(
state: &SharedState,
label: &str,
pipe_path: &std::path::Path,
channel_key: &str,
) {
let pane_id = match find_pane_by_name(label) {
Some(id) => id,
None => return,
};
// Fresh FIFO for this attempt.
let _ = std::fs::remove_file(pipe_path);
unsafe {
let c_path = std::ffi::CString::new(pipe_path.to_str().unwrap()).unwrap();
libc::mkfifo(c_path.as_ptr(), 0o644);
}
// Tell tmux to pipe this pane's output to our FIFO
let pipe_path_str = pipe_path.to_string_lossy().to_string();
let result = std::process::Command::new("tmux")
.args(["pipe-pane", "-t", &pane.pane_id, &format!("cat >> {}", pipe_path_str)])
.output();
match result {
Ok(output) if output.status.success() => {
info!("pipe-pane set up for {} ({})", pane.label, pane.pane_id);
}
Ok(output) => {
error!("pipe-pane failed for {}: {}", pane.label,
String::from_utf8_lossy(&output.stderr));
state.borrow_mut().set_connected(&pane.label, false);
// Point the pane's output at our FIFO.
let pipe_cmd = format!("cat >> {}", pipe_path.to_string_lossy());
match std::process::Command::new("tmux")
.args(["pipe-pane", "-t", &pane_id, &pipe_cmd])
.output()
{
Ok(o) if o.status.success() => {}
Ok(o) => {
warn!("pipe-pane failed for {} ({}): {}", label, pane_id,
String::from_utf8_lossy(&o.stderr));
return;
}
Err(e) => {
error!("failed to run tmux pipe-pane for {}: {}", pane.label, e);
state.borrow_mut().set_connected(&pane.label, false);
error!("running tmux pipe-pane for {}: {}", label, e);
return;
}
}
// Open the FIFO and read lines
let file = match tokio::fs::File::open(&pipe_path).await {
let file = match tokio::fs::File::open(pipe_path).await {
Ok(f) => f,
Err(e) => {
error!("failed to open pipe for {}: {}", pane.label, e);
state.borrow_mut().set_connected(&pane.label, false);
warn!("opening pipe for {}: {}", label, e);
return;
}
};
// Mark as connected once pipe is open
state.borrow_mut().set_connected(&pane.label, true);
let reader = tokio::io::BufReader::new(file);
let mut lines = reader.lines();
let channel_key = format!("tmux.{}", pane.label);
info!("connected channel tmux.{} (pane {})", label, pane_id);
state.borrow_mut().set_connected(label, true);
let mut lines = tokio::io::BufReader::new(file).lines();
while let Ok(Some(line)) = lines.next_line().await {
if line.trim().is_empty() {
continue;
}
let mut s = state.borrow_mut();
let log = s.channel_logs
.entry(channel_key.clone())
.or_insert_with(ChannelLog::new);
log.push(line);
s.channel_logs
.entry(channel_key.to_string())
.or_insert_with(ChannelLog::new)
.push(line);
}
warn!("pipe-pane reader ended for {}", pane.label);
state.borrow_mut().set_connected(&pane.label, false);
warn!("pipe-pane stream ended for {}", label);
}
// ── ChannelServer Implementation ───────────────────────────────
@ -244,10 +270,10 @@ impl channel_server::Server for ChannelServerImpl {
let channel = pry!(pry!(params.get_channel()).to_str()).to_string();
let message = pry!(pry!(params.get_message()).to_str()).to_string();
// Send to tmux pane via send-keys
// Send to tmux pane via send-keys — resolve the live pane id by
// label (it is not stored).
let label = channel.strip_prefix("tmux.").unwrap_or(&channel);
let pane_id = self.state.borrow().get_pane(label).map(String::from);
if let Some(pane_id) = pane_id {
if let Some(pane_id) = find_pane_by_name(label) {
let _ = std::process::Command::new("tmux")
.args(["send-keys", "-t", &pane_id, &message, "Enter"])
.output();
@ -302,28 +328,22 @@ impl channel_server::Server for ChannelServerImpl {
let params = pry!(params.get());
let label = pry!(pry!(params.get_label()).to_str()).to_string();
// Check if already open
if self.state.borrow().get_pane(&label).is_some() {
// Already registered — nothing to do.
if self.state.borrow().has_pane(&label) {
return std::future::ready(Ok(()));
}
// Find the tmux pane by name (window or pane title)
let pane_id = match find_pane_by_name(&label) {
Some(id) => id,
None => return std::future::ready(Err(capnp::Error::failed(
format!("no tmux pane named '{}'", label)))),
};
info!("opening channel tmux.{}", label);
info!("opening channel tmux.{} (pane {})", label, pane_id);
// Register the label and persist. The pane id is not stored —
// the reader resolves it by label on every connect attempt, so
// this succeeds even if the pane does not exist yet; the reader
// connects once it appears.
self.state.borrow_mut().add_pane(label.clone());
// Register in state and persist
self.state.borrow_mut().add_pane(label.clone(), pane_id.clone());
// Start pipe-pane reader
let pane = PaneConfig { label, pane_id };
let reader_state = self.state.clone();
tokio::task::spawn_local(async move {
pipe_pane_reader(reader_state, pane).await;
pipe_pane_reader(reader_state, label).await;
});
std::future::ready(Ok(()))
@ -339,15 +359,19 @@ impl channel_server::Server for ChannelServerImpl {
let label = channel.strip_prefix("tmux.").unwrap_or(&channel).to_string();
let mut s = self.state.borrow_mut();
if let Some(pane_id) = s.remove_pane(&label) {
if s.remove_pane(&label) {
info!("closing channel tmux.{}", label);
s.channel_logs.remove(&format!("tmux.{}", label));
// Disconnect pipe-pane
// Stop piping if the pane is still around (if it is gone the
// pipe is already dead). The reader then sees the pane
// unregistered and exits.
if let Some(pane_id) = find_pane_by_name(&label) {
let _ = std::process::Command::new("tmux")
.args(["pipe-pane", "-t", &pane_id])
.output();
}
}
std::future::ready(Ok(()))
}
@ -397,11 +421,13 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
tokio::task::LocalSet::new()
.run_until(async move {
// Start a pipe-pane reader for each configured pane
// Start a pipe-pane reader for each configured pane; each
// resolves its live pane id by label and retries until
// connected.
for pane in state.borrow().config.panes.clone() {
let reader_state = state.clone();
tokio::task::spawn_local(async move {
pipe_pane_reader(reader_state, pane).await;
pipe_pane_reader(reader_state, pane.label).await;
});
}

27
flake.lock generated Normal file
View file

@ -0,0 +1,27 @@
{
"nodes": {
"nixpkgs": {
"locked": {
"lastModified": 1781074563,
"narHash": "sha256-md8WlXOlfnIeHeOScMTTHFyf2d6iaTwPl2apR5EQ3P4=",
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "9ae611a455b90cf061d8f332b977e387bda8e1ca",
"type": "github"
},
"original": {
"owner": "NixOS",
"ref": "nixos-unstable",
"repo": "nixpkgs",
"type": "github"
}
},
"root": {
"inputs": {
"nixpkgs": "nixpkgs"
}
}
},
"root": "root",
"version": 7
}

42
flake.nix Normal file
View file

@ -0,0 +1,42 @@
{
description = "Development shell for consciousness";
inputs = {
nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
};
outputs = { nixpkgs, ... }:
let
systems = [
"x86_64-linux"
"aarch64-linux"
];
forAllSystems = nixpkgs.lib.genAttrs systems;
in
{
devShells = forAllSystems (system:
let
pkgs = import nixpkgs { inherit system; };
in
{
default = pkgs.mkShell {
packages = with pkgs; [
cargo
rustc
rustfmt
clippy
rust-analyzer
capnproto
pkg-config
jq
sqlite
python3
];
RUST_BACKTRACE = "1";
};
});
};
}

276
proto/salience.proto Normal file
View file

@ -0,0 +1,276 @@
// salience.proto stateful generation + per-token concept readout over gRPC.
//
// Shape:
// - One server-streaming RPC (Generate) for inference. Every other
// operation is unary. This is the minimum streaming we need
// tokens arrive one at a time with optional readouts / logprobs
// and keeping everything else unary makes the client dramatically
// simpler than a single bidi state machine did.
//
// - Server-side sessions hold the token list and image binaries.
// Sessions exist for bandwidth: at 200K tokens we'd otherwise
// re-ship ~800KB every turn, which hurts badly over a WAN link.
// vLLM's prefix cache holds the KV; the session just gives the
// client a handle so it can send deltas.
//
// - The client is the source of truth for prompt content. The server
// is the source of truth for image token expansion (how many
// IMAGE_PAD tokens an image becomes under this model). The client
// never writes vision tokens itself AppendImage appends the whole
// <|vision_start|> + IMAGE_PAD×N + <|vision_end|> block server-side.
//
// - Every mutation carries (offset, truncating): the client's view of
// the server's current length, plus whether the client is deliberately
// rewriting history. Server validates on each call and rejects drift.
// No silent divergence, no migration bugs.
//
// - Errors use gRPC status codes. NOT_FOUND for missing sessions,
// FAILED_PRECONDITION for offset drift or image-block splits,
// RESOURCE_EXHAUSTED for context overflow, ABORTED for "session busy".
//
// Not in v1:
// - Authentication beyond a shared bearer token in gRPC metadata.
// - Multi-tenant session namespacing.
// - Sampling traces beyond top-k logprobs.
syntax = "proto3";
package salience.v1;
// ============================================================
// Service
// ============================================================
service Salience {
// Create a fresh session. Client uses session_id on every subsequent
// RPC until CloseSession or TTL eviction (default 30 min idle). To
// refresh TTL across a long pause, issue a no-op Generate (empty
// append_tokens, max_tokens=0, no ranges).
rpc OpenSession(OpenSessionRequest) returns (OpenSessionResponse);
// Release the session's tokens + images. Idempotent.
rpc CloseSession(CloseSessionRequest) returns (CloseSessionResponse);
// Branch a session at a given token position. The new session
// inherits tokens [0, at_position) and any images whose vision
// block lies fully in that range. Rejected with FAILED_PRECONDITION
// if at_position falls inside an image block (client picks a clean
// boundary).
rpc ForkSession(ForkSessionRequest) returns (ForkSessionResponse);
// Prefill + optionally decode. Images are attached inline via
// `GenerateRequest.images`; the client writes its own pre-expanded
// <|vision_start|> + N*<|image_pad|> + <|vision_end|> runs into
// `append_tokens` and declares each run's range in `images[i]`.
// Server validates run length against the actual vision-encoder
// feature count and returns INVALID_ARGUMENT on mismatch. Stream
// yields Token events (with optional readouts / logprobs per
// position) followed by a terminating Done.
rpc Generate(GenerateRequest) returns (stream GenerateEvent);
// Readout manifest for the currently-loaded model concept names,
// layer indices, tensor dtype. Stateless; fetch once at client
// startup and cache.
rpc GetReadoutManifest(GetReadoutManifestRequest) returns (ReadoutManifest);
// Dump the full token stream of a session. Debug-only: used by the
// client to verify its local accounting against the server's
// session.tokens byte-for-byte when divergence is suspected. Not
// cheap copies the whole sequence across the wire.
rpc DumpSession(DumpSessionRequest) returns (DumpSessionResponse);
}
// ============================================================
// Lifecycle
// ============================================================
message OpenSessionRequest {
// Model identifier, must match vLLM's served model. The server
// only has one model loaded; this is a safety check on what the
// client thinks it's talking to.
string model = 1;
}
message OpenSessionResponse {
string session_id = 1;
uint32 max_model_len = 2;
}
message CloseSessionRequest {
string session_id = 1;
}
message CloseSessionResponse {}
message ForkSessionRequest {
string session_id = 1; // source session
uint32 at_position = 2; // new session inherits tokens [0, at_position)
}
message ForkSessionResponse {
string session_id = 1; // new session
}
// ============================================================
// Inference
// ============================================================
// One image attached to a Generate call. The client is responsible
// for writing the expanded placeholder run (VISION_START +
// N*IMAGE_PAD + VISION_END) into `GenerateRequest.append_tokens` at
// positions [pad_range_start, pad_range_end) and pairing it with
// the corresponding `ImageAttachment` entry. Server validates that
// the declared range's pad count matches what the vision encoder
// produces, and returns INVALID_ARGUMENT if they disagree.
message ImageAttachment {
// Image bytes (PNG / JPEG / WebP / ).
bytes bytes = 1;
// MIME type, e.g. "image/png".
string mime = 2;
// Absolute token positions (in `session.tokens` AFTER `append_tokens`
// is applied) spanning the full vision block `[vision_start,
// pad*N, vision_end]`. end is exclusive, so end - start == N + 2.
uint32 pad_range_start = 3;
uint32 pad_range_end = 4;
}
message GenerateRequest {
string session_id = 1;
// Tokens to append before prefill. May be empty. Client writes the
// full vision block (VISION_START + N*IMAGE_PAD + VISION_END) for
// any newly-attached image directly into this stream; each such
// block must be paired with a matching entry in `images`. The
// server validates that the declared ranges all point at IMAGE_PAD
// runs and that each run's length matches what the vision encoder
// produces for the corresponding image.
repeated uint32 append_tokens = 2;
// Client's view of session.tokens length at the time of the call.
// Must equal server's actual length, OR be strictly less when
// truncating=true (server rewinds before appending). Any other
// mismatch is FAILED_PRECONDITION.
uint32 offset = 3;
bool truncating = 4;
// Decode budget. 0 = prefill only (no decode, emit Token events
// for positions covered by logprobs_ranges / readout_ranges, then
// Done; replaces the old /score endpoint). >0 = decode up to this
// many tokens, stopping early on EOS / stop_token_ids.
uint32 max_tokens = 5;
// Position ranges (absolute, within the session's post-append
// token list) at which to emit logprobs on Token events. Empty =
// no logprobs. `logprob_top_k > 0` returns the top-k alternative
// tokens at each covered position; `logprob_top_k == 0` returns
// only the sampled-token's logprob.
repeated PositionRange logprobs_ranges = 6;
uint32 logprob_top_k = 7;
// Position ranges at which to emit concept-readout vectors. Empty
// = no readouts. Logical shape per position is
// [n_layers][n_concepts] see GetReadoutManifest.
repeated PositionRange readout_ranges = 8;
// Sampling parameters. Meaningful only when max_tokens > 0.
float temperature = 9; // default 1.0 when zero
float top_p = 10; // default 1.0 when zero
uint32 top_k = 11; // default 0 (disabled)
repeated uint32 stop_token_ids = 12;
// vLLM scheduler priority (0 = interactive, 10 = batch).
int32 priority = 13;
// Images newly attached on this call. Each entry describes one
// image's binary bytes, its mime type, and the exact token-position
// range of its pre-expanded placeholder run inside `session.tokens`
// after `append_tokens` is applied. See `ImageAttachment`.
repeated ImageAttachment images = 14;
}
message PositionRange {
uint32 start = 1; // inclusive
uint32 end = 2; // exclusive
}
message GenerateEvent {
oneof event {
Token token = 1;
GenerateDone done = 2;
}
}
message Token {
// Token id at this position. For prefill this is the prompt token;
// for decode it's the sampled token.
uint32 id = 1;
// Absolute position in the session's token list.
uint32 position = 2;
// True for prefill positions, false for decode.
bool is_prefill = 3;
// Concept readout at this position. Empty if the position wasn't
// covered by readout_ranges.
repeated float readout = 4 [packed = true];
// Top-k alternative tokens' logprobs at this position populated
// when the position is covered by logprobs_ranges and
// logprob_top_k > 0.
repeated TokenLogprob logprobs = 5;
// Logprob of the token at `position` (the prompt token for
// prefill, the sampled token for decode). Populated when the
// position is covered by logprobs_ranges.
float sampled_logprob = 6;
bool has_sampled_logprob = 7;
}
message TokenLogprob {
uint32 id = 1;
float logprob = 2;
}
message GenerateDone {
uint32 prompt_tokens = 1;
uint32 completion_tokens = 2;
uint32 total_tokens = 3;
enum FinishReason {
FINISH_REASON_UNSPECIFIED = 0;
FINISH_REASON_EOS = 1; // emitted EOS / stop token
FINISH_REASON_LENGTH = 2; // hit max_tokens
FINISH_REASON_CANCELLED = 3; // client cancelled
FINISH_REASON_STOP_STRING = 4; // matched a stop string
}
FinishReason finish_reason = 4;
}
// ============================================================
// Readout manifest
// ============================================================
message GetReadoutManifestRequest {}
message ReadoutManifest {
repeated string concepts = 1;
repeated uint32 layers = 2;
uint32 hidden_size = 3;
string dtype = 4;
}
// ============================================================
// Debug
// ============================================================
message DumpSessionRequest {
string session_id = 1;
}
message DumpSessionResponse {
// The full session.tokens sequence, verbatim.
repeated uint32 tokens = 1 [packed = true];
}

View file

@ -0,0 +1,327 @@
"""Quantize Qwen3.6-27B (multimodal) to FP8 for vLLM serving.
Why this exists
---------------
The earlier `quantize_qwen3_6.py` (in shell history, never committed)
loaded the model with `AutoModelForCausalLM`, which silently strips
the multimodal arch. Result: an FP8 checkpoint with no vision tower
weights at all. vLLM happily instantiated the vision tower from the
config and ran it with default/uninitialized weights, producing
gibberish image features and `!!!!!!`-style output. We chased that
through the protocol layer for a long time before tracing it back
to the quant. This script avoids that trap by loading via the
config-declared class explicitly.
Recipe
------
FP8_DYNAMIC (per-channel weight scales, per-token dynamic activation
scales, both E4M3) for Linear weights, with an `ignore` list derived
from Unsloth's UD-Q8_K_XL (`unsloth/Qwen3.6-27B-GGUF`). Their
sensitivity sweep flagged specific layers as quantization-fragile;
we honor those layer indices even though their algorithm is
GGUF-native Q8_K and ours is FP8 sensitivity is a layer property,
not an algorithm property.
vLLM fusion constraint
~~~~~~~~~~~~~~~~~~~~~~
vLLM's Qwen3.5/3.6 model code fuses sub-modules at load time:
qkv_proj q_proj, k_proj, v_proj
gate_up_proj gate_proj, up_proj
in_proj_qkvz in_proj_qkv, in_proj_z
in_proj_ba in_proj_b, in_proj_a
compressed_tensors rejects checkpoints where sub-modules of a fused
layer have different quantization schemes. Our ignore list is shaped
around this within any fused layer, all components share a scheme.
That's the reason `in_proj_qkv` is ignored even though Unsloth's
sweep doesn't single it out, and the reason late-stack attn override
covers q/k/v rather than just q/k.
MTP merge
---------
`Qwen3_5ForConditionalGeneration` doesn't expose the MTP submodule,
so `oneshot()` produces a checkpoint with the 15 `mtp.*` tensors
silently dropped. After quantization we read the MTP weights back
out of the upstream cached snapshot and splice them into the saved
safetensors at BF16. They're small (~850 MB) so quantizing them
isn't worth the calibration risk; speculative-decoding code paths
in vLLM expect the MTP head present.
Output
------
`OUTPUT_DIR` gets the FP8 model.safetensors + config + processor +
recipe.yaml. Vision tower stays BF16 (in `ignore`); LM Linears go
to FP8; norms, SSM internals (not Linear), and MTP tensors stay
BF16 untouched.
Verification at end: re-opens the saved safetensors and asserts
- vision .weight tensors present (>= 150; full count is 167)
- lm_head + embed_tokens at fp16/bf16 (NOT FP8)
- a sampled FP8'd Linear actually has float8 dtype
- 15 mtp.* tensors present
Run
---
~/vllm-venv/bin/python quantize_qwen3_6_mm.py
"""
from __future__ import annotations
import glob
import json
import sys
from pathlib import Path
import torch
from huggingface_hub import snapshot_download
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier
from safetensors import safe_open
from safetensors.torch import save_file
from transformers import AutoProcessor
from transformers.models.qwen3_5.modeling_qwen3_5 import (
Qwen3_5ForConditionalGeneration,
)
MODEL = "Qwen/Qwen3.6-27B"
OUTPUT_DIR = "/home/ubuntu/amygdala-training/Qwen3.6-27B-FP8-mm"
# Layers Unsloth's UD-Q8_K_XL keeps at F16 (perplexity-sensitive
# in their sweep). Late-stack clustering is consistent with the
# general finding that errors near the output propagate directly
# to logits.
LATE_FFN_LAYERS = (50, 51, 59, 62, 63)
LATE_ATTN_LAYERS = (51, 59, 63)
# Build the ignore regex list. Note: llmcompressor matches these
# patterns against MODULE names (no `.weight` suffix) when walking
# `named_modules()` for `targets=["Linear"]`. The first pass of
# this script used `\.weight$` patterns and silently quantized
# lm_head + every linear_attn projection — verified post-hoc by
# inspecting the saved safetensors. Patterns now anchor on `$`
# at the module name.
IGNORE_PATTERNS: list[str] = [
# Original recipe: lm_head and embeddings always full-precision.
# (embed_tokens is an Embedding, not a Linear, so it's already
# ignored by `targets=["Linear"]`. Pattern kept as belt-and-
# suspenders in case future llmcompressor versions widen the
# target set.)
"re:lm_head$",
"re:.*embed_tokens$",
# Vision tower — entire `model.visual.*` subtree (vision
# transformer blocks + merger + patch_embed + pos_embed).
# Unsloth ships the vision tower as a separate `mmproj-BF16.gguf`
# for GGUF consumers; in our single-file FP8 setup we just leave
# them at BF16.
"re:model\\.visual\\..*",
# MTP (multi-token prediction) module — Unsloth's GGUF doesn't
# carry MTP weights so we have no precision signal from them;
# safest to keep BF16.
"re:mtp\\..*",
# Linear-attention block — keep ENTIRELY at BF16. vLLM fuses
# `in_proj_qkv` and `in_proj_z` into a single `in_proj_qkvz`
# layer, and compressed_tensors rejects mixed schemes within a
# fused layer. Unsloth's recipe keeps z, a, b, out at F16/F32
# (gate/SSM internals are quantization-fragile in the GatedDeltaNet
# update), so the principled choice is to also keep `in_proj_qkv`
# at BF16 rather than FP8'ing the gate to match. We give up ~1 GB
# of FP8 coverage; in exchange we follow Unsloth's quality intent
# and load cleanly under vLLM. (`in_proj_a` + `in_proj_b` are
# likewise fused as `in_proj_ba` — both ignored, consistent.)
"re:model\\.language_model\\.layers\\.\\d+\\.linear_attn\\.in_proj_qkv$",
"re:model\\.language_model\\.layers\\.\\d+\\.linear_attn\\.in_proj_z$",
"re:model\\.language_model\\.layers\\.\\d+\\.linear_attn\\.in_proj_a$",
"re:model\\.language_model\\.layers\\.\\d+\\.linear_attn\\.in_proj_b$",
"re:model\\.language_model\\.layers\\.\\d+\\.linear_attn\\.out_proj$",
# Per-layer high-precision MLP (Unsloth flagged exactly these
# late-stack indices in their UD-Q8_K_XL sensitivity sweep, all
# three of {gate, up, down} per layer). vLLM fuses gate+up into
# `gate_up_proj`; ignoring both keeps the fused layer consistent.
# `down_proj` is its own (non-fused) layer.
"re:model\\.language_model\\.layers\\.("
+ "|".join(str(n) for n in LATE_FFN_LAYERS)
+ ")\\.mlp\\.(down|gate|up)_proj$",
# Per-layer high-precision attention q/k/v (Unsloth's sweep upgrades
# only q and k; we extend to v because vLLM fuses q/k/v into
# `qkv_proj` and rejects mixed schemes. `o_proj` is its own
# non-fused layer and stays at FP8.
"re:model\\.language_model\\.layers\\.("
+ "|".join(str(n) for n in LATE_ATTN_LAYERS)
+ ")\\.self_attn\\.(q|k|v)_proj$",
]
def main() -> None:
print(f"Loading {MODEL} as multimodal "
f"(Qwen3_5ForConditionalGeneration)...", flush=True)
model = Qwen3_5ForConditionalGeneration.from_pretrained(
MODEL,
dtype=torch.bfloat16,
device_map="auto",
trust_remote_code=True,
)
print(f" loaded: {model.__class__.__name__}", flush=True)
print(f"Loading processor (text + image preprocessing)...", flush=True)
processor = AutoProcessor.from_pretrained(MODEL, trust_remote_code=True)
print("Running FP8_DYNAMIC oneshot quantization...", flush=True)
print(f" ignore list: {len(IGNORE_PATTERNS)} patterns",
flush=True)
recipe = QuantizationModifier(
targets=["Linear"],
scheme="FP8_DYNAMIC",
ignore=IGNORE_PATTERNS,
)
oneshot(model=model, recipe=recipe, output_dir=OUTPUT_DIR)
processor.save_pretrained(OUTPUT_DIR)
print(f" wrote model + processor to {OUTPUT_DIR}", flush=True)
merge_mtp(OUTPUT_DIR)
verify_output(OUTPUT_DIR)
def merge_mtp(out_dir: str) -> None:
"""Splice upstream MTP tensors into the saved FP8 safetensors.
`Qwen3_5ForConditionalGeneration` skips the MTP submodule on load,
so oneshot's output is missing the 15 `mtp.*` tensors. We resolve
the upstream snapshot via the HF cache (already populated by
from_pretrained), pull just the MTP tensors out at BF16, and
rewrite the safetensors with them merged in. The compressed_tensors
metadata header (which carries the FP8 format identifier vLLM
needs to dequantize) is preserved verbatim.
Atomic-rename is used so a crash mid-write doesn't corrupt the
33+ GB checkpoint we just spent minutes producing.
"""
print("\nMerging upstream MTP tensors...", flush=True)
upstream_dir = Path(snapshot_download(
MODEL,
allow_patterns=["model.safetensors.index.json",
"model-*-of-*.safetensors"],
))
with open(upstream_dir / "model.safetensors.index.json") as f:
idx = json.load(f)
mtp_shards = sorted({v for k, v in idx["weight_map"].items()
if k.startswith("mtp.")})
print(f" MTP tensors live in shards: {mtp_shards}", flush=True)
mtp_tensors: dict[str, torch.Tensor] = {}
for shard in mtp_shards:
with safe_open(upstream_dir / shard, framework="pt") as f:
for k in f.keys():
if k.startswith("mtp."):
mtp_tensors[k] = f.get_tensor(k).contiguous()
mtp_bytes = sum(t.numel() * t.element_size()
for t in mtp_tensors.values())
print(f" loaded {len(mtp_tensors)} mtp tensors "
f"({mtp_bytes/1e6:.1f} MB)", flush=True)
fp8_files = sorted(Path(out_dir).glob("*.safetensors"))
if len(fp8_files) != 1:
sys.exit(f"FAIL: expected single safetensors shard, "
f"got {fp8_files}")
existing_path = fp8_files[0]
with safe_open(existing_path, framework="pt") as f:
metadata = f.metadata() or {}
all_tensors = {k: f.get_tensor(k) for k in f.keys()}
overlap = set(all_tensors) & set(mtp_tensors)
if overlap:
sys.exit(f"FAIL: MTP key collision with FP8 output: "
f"{sorted(overlap)[:5]}")
all_tensors.update(mtp_tensors)
tmp_path = existing_path.with_name(existing_path.name + ".new")
print(f" rewriting {existing_path.name} "
f"({len(all_tensors)} tensors)...", flush=True)
save_file(all_tensors, str(tmp_path), metadata=metadata)
tmp_path.replace(existing_path)
print(" done", flush=True)
def verify_output(out_dir: str) -> None:
"""Open the saved safetensors and assert the recipe actually
landed: vision tower present at BF16, FP8 dtype on at least one
quantized Linear, lm_head not FP8."""
print(f"\nVerifying {out_dir}...", flush=True)
files = sorted(glob.glob(f"{out_dir}/*.safetensors"))
if not files:
sys.exit(f"FAIL: no safetensors in {out_dir}")
vision_keys: list[tuple[str, str]] = []
fp8_sample: tuple[str, str] | None = None
lm_head_dtype: str | None = None
mtp_keys: list[str] = []
for fp in files:
with safe_open(fp, framework="pt") as f:
for k in f.keys():
if k.startswith("mtp."):
mtp_keys.append(k)
# Some FP8 quants write a sibling `_scale` / `_zero_point`;
# we just care about the .weight tensors.
if not k.endswith(".weight"):
continue
t = f.get_tensor(k)
dtype = str(t.dtype).replace("torch.", "")
if "model.visual." in k:
vision_keys.append((k, dtype))
if k == "lm_head.weight":
lm_head_dtype = dtype
if (fp8_sample is None
and "float8" in dtype
and "language_model.layers" in k):
fp8_sample = (k, dtype)
# Qwen3.6-27B has 167 vision `.weight` tensors (333 vision tensors
# total, the rest are `.bias` and per-block norms). 150 is a
# sanity floor that catches "vision tower didn't make it through"
# without being brittle to minor arch revisions.
if len(vision_keys) < 150:
sys.exit(f"FAIL: only {len(vision_keys)} vision tensors found "
f"(expected >= 150). Vision tower didn't make it "
f"through the quant.")
bad_vision = [(k, d) for k, d in vision_keys if "float8" in d]
if bad_vision:
sys.exit(f"FAIL: vision weights got quantized to FP8: "
f"{bad_vision[:3]}...")
if lm_head_dtype is None:
sys.exit("FAIL: lm_head.weight not found in output.")
if "float8" in lm_head_dtype:
sys.exit(f"FAIL: lm_head.weight is FP8 ({lm_head_dtype}); "
f"should be BF16/FP16.")
if fp8_sample is None:
sys.exit("FAIL: no FP8 weights found in language_model.layers — "
"the recipe didn't quantize anything.")
# Upstream Qwen3.6-27B has exactly 15 mtp.* tensors (1 fused
# transformer block + projection + norms). merge_mtp() should
# have spliced all of them in.
if len(mtp_keys) != 15:
sys.exit(f"FAIL: expected 15 mtp.* tensors, found "
f"{len(mtp_keys)}. merge_mtp() missed some.")
print(f"{len(vision_keys)} vision tensors at "
f"{vision_keys[0][1]} (not FP8)")
print(f" ✓ lm_head.weight at {lm_head_dtype} (not FP8)")
print(f" ✓ FP8 sample: {fp8_sample[0]} = {fp8_sample[1]}")
print(f"{len(mtp_keys)} mtp.* tensors present")
print("DONE")
if __name__ == "__main__":
main()

View file

@ -100,7 +100,7 @@ impl HttpClient {
.map_err(|e| anyhow::anyhow!("invalid server name: {e}"))?;
let connector = tokio_rustls::TlsConnector::from(self.tls.clone());
let tls = connector.connect(server_name.to_owned(), tcp).await
.context("TLS handshake")?;
.map_err(|e| anyhow::anyhow!("TLS handshake to {host}: {e}"))?;
TokioIo::new(Box::new(tls) as Box<dyn IoStream>)
} else {
TokioIo::new(Box::new(tcp) as Box<dyn IoStream>)
@ -154,6 +154,14 @@ impl HttpResponse {
Ok(String::from_utf8_lossy(&bytes).into_owned())
}
/// Read the entire body as raw bytes (for binary downloads).
pub async fn bytes(self) -> Result<Bytes> {
let bytes = self.body.collect().await
.context("reading response body")?
.to_bytes();
Ok(bytes)
}
/// Read the entire body and deserialize as JSON.
pub async fn json<T: serde::de::DeserializeOwned>(self) -> Result<T> {
let bytes = self.body.collect().await
@ -190,6 +198,7 @@ impl HttpClientBuilder {
}
pub fn build(self) -> HttpClient {
install_rustls_crypto_provider();
let certs = rustls_native_certs::load_native_certs()
.certs.into_iter()
.collect::<Vec<_>>();
@ -197,6 +206,13 @@ impl HttpClientBuilder {
for cert in certs {
root_store.add(cert).ok();
}
// Also trust any `.pem` files under `~/.consciousness/certs/` —
// self-signed server certs for our own vllm hosts live there.
// Drop a new `<host>.pem` in the dir to trust a new server; no
// code change needed.
for cert in load_user_certs() {
root_store.add(cert).ok();
}
let tls = Arc::new(
ClientConfig::builder()
.with_root_certificates(root_store)
@ -210,6 +226,65 @@ impl HttpClientBuilder {
}
}
/// Install rustls' default crypto provider exactly once per process.
/// rustls 0.23 doesn't pick one automatically when multiple features
/// could provide it (e.g. when tonic pulls in both ring and aws-lc-rs
/// via transitive deps). Idempotent via OnceLock; safe to call from
/// multiple callers.
fn install_rustls_crypto_provider() {
static ONCE: std::sync::OnceLock<()> = std::sync::OnceLock::new();
ONCE.get_or_init(|| {
let _ = rustls::crypto::ring::default_provider().install_default();
});
}
/// Load every `.pem` file under `~/.consciousness/certs/` as a DER
/// certificate and return them. Silent on missing dir, missing files,
/// or parse errors — those are "no extra certs trusted" rather than
/// hard failures, to keep startup robust.
/// Load the concatenated PEM bytes of every `.pem` file under
/// `~/.consciousness/certs/` — suitable for passing to a tonic
/// `ClientTlsConfig::ca_certificate(Certificate::from_pem(...))` call
/// so gRPC connections trust the same self-signed servers the HTTP
/// path does.
pub(crate) fn load_user_certs_pem_bytes() -> Vec<u8> {
let mut out = Vec::new();
let Some(home) = dirs::home_dir() else { return out };
let dir = home.join(".consciousness").join("certs");
let Ok(entries) = std::fs::read_dir(&dir) else { return out };
for entry in entries.flatten() {
let path = entry.path();
if path.extension().and_then(|e| e.to_str()) != Some("pem") {
continue;
}
if let Ok(bytes) = std::fs::read(&path) {
out.extend_from_slice(&bytes);
if !bytes.ends_with(b"\n") {
out.push(b'\n');
}
}
}
out
}
fn load_user_certs() -> Vec<rustls::pki_types::CertificateDer<'static>> {
let mut out = Vec::new();
let Some(home) = dirs::home_dir() else { return out };
let dir = home.join(".consciousness").join("certs");
let Ok(entries) = std::fs::read_dir(&dir) else { return out };
for entry in entries.flatten() {
let path = entry.path();
if path.extension().and_then(|e| e.to_str()) != Some("pem") {
continue;
}
let Ok(bytes) = std::fs::read(&path) else { continue };
for cert in rustls_pemfile::certs(&mut bytes.as_slice()).flatten() {
out.push(cert);
}
}
out
}
/// Trait alias for streams that work with hyper's IO adapter.
trait IoStream: tokio::io::AsyncRead + tokio::io::AsyncWrite + Send + Unpin + 'static {}
impl<T: tokio::io::AsyncRead + tokio::io::AsyncWrite + Send + Unpin + 'static> IoStream for T {}

View file

@ -7,13 +7,14 @@
// Set POC_DEBUG=1 for verbose per-turn logging.
pub mod http;
pub mod salience;
use std::time::{Duration, Instant};
use std::time::Duration;
use anyhow::Result;
use tokio::sync::mpsc;
use serde::Deserialize;
use http::{HttpClient, HttpResponse};
use http::HttpClient;
#[derive(Debug, Clone, Deserialize)]
pub struct Usage {
@ -22,6 +23,36 @@ pub struct Usage {
pub total_tokens: u32,
}
/// Concept-readout manifest returned by the vLLM server's
/// `/v1/readout/manifest` endpoint. Maps the nameless tensor indices
/// in streaming `readout` fields back to concept names and layer
/// indices.
#[derive(Debug, Clone, Deserialize)]
pub struct ReadoutManifest {
pub concepts: Vec<String>,
pub layers: Vec<u32>,
}
/// Per-token per-layer concept projections streamed alongside each
/// sampled token. Shape `[n_layers][n_concepts]`. Named values come
/// from pairing with the manifest fetched at startup.
pub type TokenReadout = Vec<Vec<f32>>;
/// Client-side sampling state. Mirrors the wire-level fields in
/// `GenerateRequest` (proto flattened its `SamplingParams` submessage
/// in so the server handler reads them directly), but stays as a
/// grouped struct on the client because UI / config / tests pass
/// these around together.
#[derive(Clone, Copy)]
pub struct SamplingParams {
pub temperature: f32,
pub top_p: f32,
pub top_k: u32,
/// Decode budget. 0 = prefill only; >0 = decode up to this many
/// tokens, stopping early on EOS / stop_token_ids.
pub max_tokens: u32,
}
/// A JoinHandle that aborts its task when dropped.
pub(crate) struct AbortOnDrop(tokio::task::JoinHandle<()>);
@ -31,13 +62,6 @@ impl Drop for AbortOnDrop {
}
}
/// Sampling parameters for model generation.
#[derive(Clone, Copy)]
pub(crate) struct SamplingParams {
pub temperature: f32,
pub top_p: f32,
pub top_k: u32,
}
// ─────────────────────────────────────────────────────────────
// Stream events — yielded by backends, consumed by the runner
@ -45,7 +69,10 @@ pub(crate) struct SamplingParams {
/// One token from the streaming completions API.
pub enum StreamToken {
Token(u32),
/// A sampled token, optionally with its per-layer concept readout.
/// `readout` is `None` when the server has readout disabled or
/// returned no readout for this chunk.
Token { id: u32, readout: Option<TokenReadout> },
Done { usage: Option<Usage> },
Error(String),
}
@ -56,6 +83,17 @@ pub struct ApiClient {
api_key: String,
pub model: String,
base_url: String,
/// Cached readout manifest — fetched once per process and shared
/// across ApiClient clones (every Agent/fork gets the same cell).
/// `None` after fetch means the server has readout disabled (404).
manifest: std::sync::Arc<tokio::sync::OnceCell<Option<ReadoutManifest>>>,
/// Shared tonic Channel to the salience gRPC endpoint. Opened on
/// first use and reused across every SessionHandle / RPC call
/// derived from this ApiClient. tonic multiplexes concurrent
/// requests over the HTTP/2 connection automatically.
salience_channel: std::sync::Arc<
tokio::sync::OnceCell<tonic::transport::Channel>
>,
}
impl ApiClient {
@ -70,29 +108,69 @@ impl ApiClient {
api_key: api_key.to_string(),
model: model.to_string(),
base_url: base_url.trim_end_matches('/').to_string(),
manifest: std::sync::Arc::new(tokio::sync::OnceCell::new()),
salience_channel: std::sync::Arc::new(tokio::sync::OnceCell::new()),
}
}
pub(crate) fn stream_completion(
/// Return a `SalienceClient` on the shared gRPC channel — opens
/// the channel on first call and reuses it thereafter across
/// every ApiClient clone. All scoring / inference / session
/// RPCs flow through this single multiplexed HTTP/2 connection.
///
/// Bumps tonic's default 4 MiB encode/decode caps to 64 MiB on
/// every client. Multimodal Generate requests carry pre-encoded
/// image bytes inline (Qwen3.6's 768×768 patches at high res
/// land around 58 MiB per turn), and Done events with full
/// per-token readout vectors can also exceed 4 MiB on long runs.
pub async fn salience_client(&self) -> Result<
salience::pb::salience_client::SalienceClient<tonic::transport::Channel>
> {
let ch = self.salience_channel.get_or_try_init(|| async {
let grpc_url = salience::derive_grpc_url(&self.base_url);
log::debug!(target: "grpc",
"opening shared salience channel: http_base={} -> grpc_url={}",
self.base_url, grpc_url);
salience::connect_channel(&grpc_url).await
}).await?;
const MAX_GRPC_MESSAGE_BYTES: usize = 64 * 1024 * 1024;
Ok(salience::pb::salience_client::SalienceClient::new(ch.clone())
.max_decoding_message_size(MAX_GRPC_MESSAGE_BYTES)
.max_encoding_message_size(MAX_GRPC_MESSAGE_BYTES))
}
/// Stream generation via a gRPC session. Walks the prompt chunks
/// comparing against the session's `committed_len`, sends the
/// delta as interleaved `AppendImage` + intermediate
/// `Generate(max_tokens=0)` (for text runs separating images) +
/// a final `Generate(max_tokens=sampling.max_tokens, ...)` whose
/// Token events stream back through the channel.
///
/// On any gRPC error the session is dropped; the next call
/// reopens fresh. Happy-path ordering: Token* Done. Error paths
/// emit `StreamToken::Error` and close.
pub(crate) fn stream_session_mm(
&self,
prompt_tokens: &[u32],
session_lock: std::sync::Arc<crate::Mutex<Option<salience::SessionHandle>>>,
chunks: Vec<super::context::WireChunk>,
images: Vec<super::context::WireImage>,
match_upto: u32,
sampling: SamplingParams,
priority: Option<i32>,
readout_shape: Option<(u32, u32)>,
) -> (mpsc::UnboundedReceiver<StreamToken>, AbortOnDrop) {
let (tx, rx) = mpsc::unbounded_channel();
let client = self.client.clone();
let api_key = self.api_key.clone();
let model = self.model.clone();
let prompt_tokens = prompt_tokens.to_vec();
let base_url = self.base_url.clone();
let client = self.clone();
let handle = tokio::spawn(async move {
let result = stream_completions(
&client, &base_url, &api_key, &model,
&prompt_tokens, &tx, sampling, priority,
let result = run_session_generate(
session_lock, &client, chunks, images, match_upto, sampling,
priority, readout_shape, &tx,
).await;
if let Err(e) = result {
let _ = tx.send(StreamToken::Error(e.to_string()));
log::warn!(target: "grpc",
"stream_session_mm error, forwarding to UI: {:#}", e);
let _ = tx.send(StreamToken::Error(format!("{:#}", e)));
}
});
@ -102,327 +180,247 @@ impl ApiClient {
pub fn base_url(&self) -> &str { &self.base_url }
pub fn api_key(&self) -> &str { &self.api_key }
/// Fetch `/v1/readout/manifest` — returns `Ok(Some(..))` if
/// readout is enabled on the server, `Ok(None)` on 404 (disabled),
/// or an error on any other failure.
///
/// First call performs the HTTP fetch; subsequent calls (including
/// across ApiClient clones sharing the same cell) return the
/// cached result. The manifest doesn't change during a server run.
pub fn model_str(&self) -> &str { &self.model }
pub async fn fetch_readout_manifest(&self) -> Result<Option<ReadoutManifest>> {
let manifest = self.manifest.get_or_try_init(|| async {
let url = format!("{}/readout/manifest", self.base_url);
let auth = format!("Bearer {}", self.api_key);
let response = self
.client
.get_with_headers(&url, &[("Authorization", &auth)])
.await
.map_err(|e| anyhow::anyhow!("readout manifest fetch ({}): {}", url, e))?;
let status = response.status();
if status.as_u16() == 404 {
return Ok::<_, anyhow::Error>(None);
}
if !status.is_success() {
let body = response.text().await.unwrap_or_default();
let n = body.floor_char_boundary(body.len().min(500));
anyhow::bail!("readout manifest HTTP {} ({}): {}", status, url, &body[..n]);
}
Ok(Some(response.json().await?))
}).await?;
Ok(manifest.clone())
}
}
async fn stream_completions(
client: &HttpClient,
base_url: &str,
api_key: &str,
model: &str,
prompt_tokens: &[u32],
tx: &mpsc::UnboundedSender<StreamToken>,
/// Body of the gRPC-path streaming task. Walks the wire chunks
/// against the session's `committed_len`, sends the delta via
/// AppendImage / intermediate prefill-only Generates / final decode
/// Generate, and translates the final Generate's Token events into
/// StreamTokens on `tx`. On success the session handle is returned
/// to `session_lock` with an updated `committed_len`; on error the
/// handle is dropped so the next call reopens.
async fn run_session_generate(
session_lock: std::sync::Arc<crate::Mutex<Option<salience::SessionHandle>>>,
client: &ApiClient,
chunks: Vec<super::context::WireChunk>,
images: Vec<super::context::WireImage>,
match_upto: u32,
sampling: SamplingParams,
priority: Option<i32>,
) -> anyhow::Result<()> {
let mut request = serde_json::json!({
"model": model,
"prompt": prompt_tokens,
"max_tokens": 16384,
"temperature": sampling.temperature,
"top_p": sampling.top_p,
"top_k": sampling.top_k,
"stream": true,
"return_token_ids": true,
"skip_special_tokens": false,
"stop_token_ids": [super::tokenizer::IM_END],
});
if let Some(p) = priority {
request["priority"] = serde_json::json!(p);
}
readout_shape: Option<(u32, u32)>,
tx: &mpsc::UnboundedSender<StreamToken>,
) -> Result<()> {
use std::time::Instant;
use futures::StreamExt;
use super::context::WireChunk;
use salience::pb;
let url = format!("{}/completions", base_url);
let debug_label = format!("{} prompt tokens, model={}", prompt_tokens.len(), model);
let mut response = send_and_check(
client, &url, &request,
("Authorization", &format!("Bearer {}", api_key)),
&[], &debug_label, None,
).await?;
let mut reader = SseReader::new();
let mut usage = None;
while let Some(event) = reader.next_event(&mut response).await? {
if let Some(err_msg) = event["error"]["message"].as_str() {
anyhow::bail!("API error in stream: {}", err_msg);
}
if let Some(u) = event["usage"].as_object() {
if let Ok(u) = serde_json::from_value::<Usage>(serde_json::Value::Object(u.clone())) {
usage = Some(u);
let mut handle: salience::SessionHandle = {
let mut guard = session_lock.lock().await;
match guard.take() {
Some(h) => h,
None => {
drop(guard);
log::debug!(target: "grpc", "run_session_generate: opening new session");
salience::SessionHandle::open(client).await?
}
}
let choices = match event["choices"].as_array() {
Some(c) => c,
None => continue,
};
for choice in choices {
if let Some(ids) = choice["token_ids"].as_array() {
for id_val in ids {
if let Some(id) = id_val.as_u64() {
let _ = tx.send(StreamToken::Token(id as u32));
}
}
} else if let Some(text) = choice["text"].as_str() {
// Fallback: provider didn't return token_ids, encode locally
if !text.is_empty() {
for id in super::tokenizer::encode(text) {
let _ = tx.send(StreamToken::Token(id));
// If the client believes the match extends only up to `match_upto`
// but the server has more, we need to rewind. For v1 the match is
// either whole or broken — `match_upto` is always 0 on any mutation
// — so the cheapest correct recovery is to drop the session and
// open a fresh one.
if match_upto < handle.committed_len {
log::warn!(target: "grpc",
"session rewind: match_upto={} < committed_len={} — reopening session (resending {} bytes)",
match_upto, handle.committed_len, handle.committed_len - match_upto);
drop(handle);
handle = salience::SessionHandle::open(client).await?;
}
// Walk chunks at byte-level, taking everything past `match_upto`
// as the delta. Token chunks can be split mid-way; images live
// inline in the token stream, so there's no separate image-chunk
// case anymore.
let mut acc: u32 = 0;
let mut pending: Vec<u32> = Vec::new();
for chunk in chunks.iter() {
match chunk {
WireChunk::Tokens(t) => {
let len = t.len() as u32;
let chunk_end = acc + len;
if chunk_end <= match_upto {
acc = chunk_end;
} else if acc < match_upto {
let skip = (match_upto - acc) as usize;
pending.extend_from_slice(&t[skip..]);
acc = chunk_end;
} else {
pending.extend_from_slice(t);
acc = chunk_end;
}
}
}
}
// Filter images to those entirely past `match_upto` — anything
// before is on the server already (prior turn), anything
// straddling is a hard divergence (image partially-sent shouldn't
// happen with our atomic AppendImage history; with images-inline
// it can only happen if mark_dirty cleared match_upto mid-block,
// which the AST mutators prevent).
let mut new_images: Vec<pb::ImageAttachment> = Vec::new();
for img in &images {
if img.pad_end <= match_upto {
continue; // already sent on a prior turn
}
if img.pad_start < match_upto {
anyhow::bail!(
"session divergence: image at [{},{}) straddles match_upto={}",
img.pad_start, img.pad_end, match_upto,
);
}
new_images.push(pb::ImageAttachment {
bytes: img.bytes.clone(),
mime: img.mime.clone(),
pad_range_start: img.pad_start,
pad_range_end: img.pad_end,
});
}
// Final Generate: pending holds any trailing text; decode up to
// sampling.max_tokens. Request readouts on all decode positions
// via a catch-all range ending at u32::MAX — decode never
// reaches it.
let prompt_len_after_append = handle.committed_len + pending.len() as u32;
let readout_ranges = if readout_shape.is_some() {
vec![pb::PositionRange {
start: prompt_len_after_append,
end: u32::MAX,
}]
} else {
Vec::new()
};
let req = pb::GenerateRequest {
session_id: handle.session_id.clone(),
append_tokens: pending,
offset: handle.committed_len,
truncating: false,
max_tokens: sampling.max_tokens,
logprobs_ranges: Vec::new(),
logprob_top_k: 0,
readout_ranges,
temperature: sampling.temperature,
top_p: sampling.top_p,
top_k: sampling.top_k,
stop_token_ids: Vec::new(),
priority: priority.unwrap_or(0),
images: new_images,
};
let session_id_for_log = handle.session_id.clone();
let t_generate = Instant::now();
log::debug!(target: "grpc",
"session {} Generate: offset={} append={} max_tokens={} priority={}",
session_id_for_log, req.offset, req.append_tokens.len(),
req.max_tokens, req.priority);
let mut stream = handle.generate(req).await?;
let (n_layers, n_concepts) = readout_shape.unwrap_or((0, 0));
let mut session_terminated = false;
let mut first_token_at: Option<Instant> = None;
while let Some(event) = stream.next().await {
let event = match event {
Ok(e) => e,
Err(status) => {
log::warn!(target: "grpc",
"session {} Generate stream error: {} — dropping session",
session_id_for_log, status);
session_terminated = true;
let _ = tx.send(StreamToken::Error(format!(
"Generate stream error: {}", status,
)));
break;
}
};
let Some(inner) = event.event else { continue };
match inner {
pb::generate_event::Event::Token(t) => {
if t.is_prefill { continue; }
if first_token_at.is_none() {
log::debug!(target: "grpc",
"session {} first decode token at {:?}",
session_id_for_log, t_generate.elapsed());
first_token_at = Some(Instant::now());
}
let readout = if t.readout.is_empty() {
None
} else if n_layers == 0 || n_concepts == 0 {
None
} else {
let expected = (n_layers as usize) * (n_concepts as usize);
if t.readout.len() != expected {
log::warn!(target: "grpc",
"readout shape mismatch: expected {}*{}={}, got {}",
n_layers, n_concepts, expected, t.readout.len());
None
} else {
let n = n_concepts as usize;
let mut layers: Vec<Vec<f32>> = Vec::with_capacity(n_layers as usize);
for l in 0..(n_layers as usize) {
layers.push(t.readout[l * n..(l + 1) * n].to_vec());
}
Some(layers)
}
};
if tx.send(StreamToken::Token { id: t.id, readout }).is_err() {
break;
}
}
pb::generate_event::Event::Done(d) => {
log::debug!(target: "grpc",
"session {} Done: prompt={} completion={} total={} reason={:?} elapsed={:?}",
session_id_for_log, d.prompt_tokens, d.completion_tokens,
d.total_tokens, d.finish_reason, t_generate.elapsed());
handle.committed_len = d.total_tokens;
let usage = Some(Usage {
prompt_tokens: d.prompt_tokens,
completion_tokens: d.completion_tokens,
total_tokens: d.total_tokens,
});
let _ = tx.send(StreamToken::Done { usage });
}
}
}
if !session_terminated {
let mut guard = session_lock.lock().await;
*guard = Some(handle);
}
Ok(())
}
/// Send an HTTP request and check for errors.
pub(crate) async fn send_and_check(
client: &HttpClient,
url: &str,
body: &impl serde::Serialize,
auth_header: (&str, &str),
extra_headers: &[(&str, &str)],
debug_label: &str,
request_json: Option<&str>,
) -> Result<HttpResponse> {
let debug = std::env::var("POC_DEBUG").is_ok();
let start = Instant::now();
if debug {
let payload_size = serde_json::to_string(body)
.map(|s| s.len())
.unwrap_or(0);
dbglog!(
"request: {}K payload, {}",
payload_size / 1024, debug_label,
);
}
let mut headers: Vec<(&str, &str)> = Vec::with_capacity(extra_headers.len() + 1);
headers.push(auth_header);
headers.extend_from_slice(extra_headers);
let response = client
.send_json("POST", url, &headers, body)
.await
.map_err(|e| {
let msg = e.to_string();
let cause = if msg.contains("connect timeout") || msg.contains("TCP connect") {
"connection refused"
} else if msg.contains("request timeout") {
"request timed out"
} else {
"request error"
};
anyhow::anyhow!("{} ({}): {}", cause, url, msg)
})?;
let status = response.status();
let elapsed = start.elapsed();
if debug {
for name in [
"x-ratelimit-remaining",
"x-ratelimit-limit",
"x-request-id",
] {
if let Some(val) = response.header(name) {
dbglog!("header {}: {}", name, val);
}
}
}
if !status.is_success() {
let body = response.text().await.unwrap_or_default();
dbglog!(
"HTTP {} after {:.1}s ({}): {}",
status,
elapsed.as_secs_f64(),
url,
&body[..body.floor_char_boundary(body.len().min(500))]
);
if let Some(json) = request_json {
let log_dir = dirs::home_dir()
.unwrap_or_default()
.join(".consciousness/logs/failed-requests");
let _ = std::fs::create_dir_all(&log_dir);
let ts = chrono::Local::now().format("%Y%m%dT%H%M%S");
let path = log_dir.join(format!("{}.json", ts));
if std::fs::write(&path, json).is_ok() {
dbglog!(
"saved failed request to {} (HTTP {})", path.display(), status
);
}
}
anyhow::bail!("HTTP {} ({}): {}", status, url, &body[..body.floor_char_boundary(body.len().min(1000))]);
}
if debug {
dbglog!(
"connected in {:.1}s (HTTP {})",
elapsed.as_secs_f64(),
status.as_u16()
);
}
Ok(response)
}
/// SSE stream reader. Handles the generic SSE plumbing shared by both
/// backends: chunk reading with timeout, line buffering, `data:` prefix
/// stripping, `[DONE]` detection, JSON parsing, and parse error diagnostics.
/// Yields parsed events as serde_json::Value — each backend handles its
/// own event types.
pub(crate) struct SseReader {
line_buf: String,
chunk_timeout: Duration,
pub stream_start: Instant,
pub chunks_received: u64,
pub sse_lines_parsed: u64,
pub sse_parse_errors: u64,
debug: bool,
done: bool,
/// Serialized request payload — saved to disk on errors for replay debugging.
pub(crate) request_json: Option<String>,
}
impl SseReader {
pub(crate) fn new() -> Self {
Self {
line_buf: String::new(),
chunk_timeout: Duration::from_secs(crate::config::get().api_stream_timeout_secs),
stream_start: Instant::now(),
chunks_received: 0,
sse_lines_parsed: 0,
sse_parse_errors: 0,
debug: std::env::var("POC_DEBUG").is_ok(),
done: false,
request_json: None,
}
}
/// Attach the serialized request payload for error diagnostics.
/// Save the request payload to disk for replay debugging.
fn save_failed_request(&self, reason: &str) {
let Some(ref json) = self.request_json else { return };
let log_dir = dirs::home_dir()
.unwrap_or_default()
.join(".consciousness/logs/failed-requests");
let _ = std::fs::create_dir_all(&log_dir);
let ts = chrono::Local::now().format("%Y%m%dT%H%M%S");
let path = log_dir.join(format!("{}.json", ts));
if std::fs::write(&path, json).is_ok() {
dbglog!(
"saved failed request to {} ({})", path.display(), reason
);
}
}
/// Read the next SSE event from the response stream.
/// Returns Ok(Some(value)) for each parsed data line,
/// Ok(None) when the stream ends or [DONE] is received.
pub(crate) async fn next_event(
&mut self,
response: &mut HttpResponse,
) -> Result<Option<serde_json::Value>> {
loop {
// Drain complete lines from the buffer before reading more chunks
while let Some(newline_pos) = self.line_buf.find('\n') {
let line = self.line_buf[..newline_pos].trim().to_string();
self.line_buf = self.line_buf[newline_pos + 1..].to_string();
if line == "data: [DONE]" {
self.done = true;
return Ok(None);
}
if line.is_empty()
|| line.starts_with("event: ")
|| !line.starts_with("data: ")
{
continue;
}
let json_str = &line[6..];
self.sse_lines_parsed += 1;
match serde_json::from_str(json_str) {
Ok(v) => return Ok(Some(v)),
Err(e) => {
self.sse_parse_errors += 1;
if self.sse_parse_errors == 1 || self.debug {
let preview = if json_str.len() > 200 {
format!("{}...", &json_str[..200])
} else {
json_str.to_string()
};
dbglog!(
"SSE parse error (#{}) {}: {}",
self.sse_parse_errors, e, preview
);
}
continue;
}
}
}
if self.done {
return Ok(None);
}
// Read more data from the response stream
match tokio::time::timeout(self.chunk_timeout, response.chunk()).await {
Ok(Ok(Some(chunk))) => {
self.chunks_received += 1;
self.line_buf.push_str(&String::from_utf8_lossy(&chunk));
}
Ok(Ok(None)) => return Ok(None),
Ok(Err(e)) => {
let buf_preview = if self.line_buf.is_empty() {
"(empty)".to_string()
} else {
let n = self.line_buf.len().min(500);
format!("{}B: {}", self.line_buf.len(), &self.line_buf[..n])
};
let msg = format!(
"stream error after {} chunks, {:.1}s, {} sse lines: {} | buf: {}",
self.chunks_received,
self.stream_start.elapsed().as_secs_f64(),
self.sse_lines_parsed,
e, buf_preview,
);
dbglog!("{}", msg);
self.save_failed_request(&msg);
return Err(e.into());
}
Err(_) => {
let buf_preview = if self.line_buf.is_empty() {
"(empty)".to_string()
} else {
let n = self.line_buf.len().min(500);
format!("{}B: {}", self.line_buf.len(), &self.line_buf[..n])
};
let msg = format!(
"stream timeout: {}s, {} chunks, {} sse lines, {:.1}s elapsed | buf: {}",
self.chunk_timeout.as_secs(),
self.chunks_received,
self.sse_lines_parsed,
self.stream_start.elapsed().as_secs_f64(),
buf_preview,
);
dbglog!("{}", msg);
self.save_failed_request(&msg);
anyhow::bail!(
"stream timeout: no data for {}s ({} chunks received)",
self.chunk_timeout.as_secs(),
self.chunks_received
);
}
}
}
}
}

279
src/agent/api/salience.rs Normal file
View file

@ -0,0 +1,279 @@
// agent/api/salience.rs — gRPC client bindings for salience.v1.
//
// Thin wrapper around the tonic-generated types. Every RPC except
// Generate is unary; Generate is server-streaming. Free functions
// (open/close session) wrap the lifecycle RPCs; `SessionHandle` just
// carries the id + connection params so later RPCs can reuse them.
//
// The old bidi Session() API is gone — see git history for its shape.
#![allow(clippy::enum_variant_names)]
use anyhow::{Context, Result};
use tonic::transport::{Certificate, Channel, ClientTlsConfig, Endpoint};
/// Generated prost + tonic types for salience.v1. Call sites use
/// `pb::OpenSessionRequest`, `pb::Token`, etc.
pub mod pb {
tonic::include_proto!("salience.v1");
}
pub type SalienceClient = pb::salience_client::SalienceClient<Channel>;
/// Open a TLS-aware gRPC channel to the salience server. `base_url`
/// looks like `https://host:8443`. User-provided CA certs under
/// `~/.consciousness/certs/` are trusted in addition to the system
/// roots (for self-signed server certs).
///
/// Returns the raw `Channel` so callers (`ApiClient::salience_client`)
/// can cache it and clone a `SalienceClient` per request without
/// reopening the TCP/TLS connection. tonic multiplexes RPCs over the
/// shared channel automatically.
pub async fn connect_channel(base_url: &str) -> Result<Channel> {
let mut endpoint = Endpoint::from_shared(base_url.to_string())
.with_context(|| format!("invalid salience endpoint: {}", base_url))?
.connect_timeout(std::time::Duration::from_secs(30))
.timeout(std::time::Duration::from_secs(600));
if base_url.starts_with("https://") {
let user_certs = super::http::load_user_certs_pem_bytes();
let mut tls = ClientTlsConfig::new().with_native_roots();
if !user_certs.is_empty() {
tls = tls.ca_certificate(Certificate::from_pem(user_certs));
}
endpoint = endpoint
.tls_config(tls)
.with_context(|| "configuring tonic TLS")?;
}
endpoint
.connect()
.await
.with_context(|| format!("failed to connect to salience server at {}", base_url))
}
/// Derive the gRPC base URL from the HTTP completions base URL.
///
/// vLLM's salience gRPC server listens on a different port (8443) from
/// the HTTP endpoint (8000) and accepts no path component. Given an
/// HTTP base like `https://host:8000/v1`, produce `https://host:8443`.
/// No-op when the path is empty and the port isn't 8000.
pub fn derive_grpc_url(http_base: &str) -> String {
let mut url = http_base.trim_end_matches('/').to_string();
if let Some(proto_end) = url.find("://") {
let rest_start = proto_end + 3;
if let Some(path_slash) = url[rest_start..].find('/') {
url.truncate(rest_start + path_slash);
}
}
url.replace(":8000", ":8443")
}
/// Attach a bearer token to a tonic request as gRPC metadata.
pub fn with_auth<T>(req: &mut tonic::Request<T>, api_key: &str) {
if api_key.is_empty() {
return;
}
let bearer = format!("Bearer {}", api_key);
if let Ok(val) = bearer.parse() {
req.metadata_mut().insert("authorization", val);
}
}
/// Handle to a server-side session. Carries the id + an `ApiClient`
/// clone (which holds the shared tonic Channel) so subsequent
/// per-session RPCs go over the process-global connection.
/// `committed_len` tracks the server's current session.tokens length
/// so the client can submit deltas with the right `offset`.
pub struct SessionHandle {
pub session_id: String,
pub max_model_len: u32,
pub committed_len: u32,
client: super::ApiClient,
}
impl SessionHandle {
pub async fn open(client: &super::ApiClient) -> Result<Self> {
let t0 = std::time::Instant::now();
log::debug!(target: "grpc", "OpenSession rpc: start");
let mut c = client.salience_client().await?;
let mut req = tonic::Request::new(pb::OpenSessionRequest {
model: client.model.clone(),
});
with_auth(&mut req, client.api_key());
let resp = c
.open_session(req)
.await
.with_context(|| "OpenSession RPC failed")?
.into_inner();
log::debug!(target: "grpc",
"OpenSession rpc: done session_id={} max_model_len={} elapsed={:?}",
resp.session_id, resp.max_model_len, t0.elapsed());
Ok(Self {
session_id: resp.session_id,
max_model_len: resp.max_model_len,
committed_len: 0,
client: client.clone(),
})
}
pub fn client(&self) -> &super::ApiClient { &self.client }
/// Debug-only: fetch the server's full session.tokens. Used to
/// verify client-side accounting byte-for-byte when divergence
/// is suspected. Not cheap on large sessions.
pub async fn dump_tokens(&self) -> Result<Vec<u32>> {
let mut c = self.client.salience_client().await?;
let mut req = tonic::Request::new(pb::DumpSessionRequest {
session_id: self.session_id.clone(),
});
with_auth(&mut req, self.client.api_key());
let resp = c
.dump_session(req)
.await
.with_context(|| "DumpSession RPC failed")?
.into_inner();
Ok(resp.tokens)
}
/// Open a gRPC Generate stream with the given request. Caller
/// iterates the returned stream of GenerateEvents; the handle's
/// `committed_len` should be advanced by the caller on Done based
/// on the Done event's `total_tokens` field.
pub async fn generate(
&self,
req: pb::GenerateRequest,
) -> Result<tonic::Streaming<pb::GenerateEvent>> {
let t0 = std::time::Instant::now();
log::debug!(target: "grpc",
"Generate rpc: open-stream session={} offset={} append={} max_tokens={}",
self.session_id, req.offset, req.append_tokens.len(), req.max_tokens);
let mut c = self.client.salience_client().await?;
let mut req = tonic::Request::new(req);
with_auth(&mut req, self.client.api_key());
let resp = c
.generate(req)
.await
.with_context(|| "Generate RPC failed")?;
log::debug!(target: "grpc",
"Generate rpc: stream opened session={} open-latency={:?}",
self.session_id, t0.elapsed());
Ok(resp.into_inner())
}
/// Run a prefill-only Generate (max_tokens=0) that appends the
/// given tokens to the session. No decode, no Token events — the
/// server just extends session.tokens and runs prefill to warm
/// the KV cache. Used to interleave text runs between AppendImage
/// calls, and by score paths that want prompt_logprobs without a
/// decode step.
pub async fn prefill_only(&mut self, tokens: Vec<u32>) -> Result<()> {
use futures::StreamExt;
let req = pb::GenerateRequest {
session_id: self.session_id.clone(),
append_tokens: tokens,
offset: self.committed_len,
truncating: false,
max_tokens: 0,
logprobs_ranges: Vec::new(),
logprob_top_k: 0,
readout_ranges: Vec::new(),
temperature: 0.0,
top_p: 0.0,
top_k: 0,
stop_token_ids: Vec::new(),
priority: 0,
images: Vec::new(),
};
let mut stream = self.generate(req).await?;
while let Some(event) = stream.next().await {
let event = event.map_err(|s| anyhow::anyhow!("prefill Generate stream: {}", s))?;
if let Some(pb::generate_event::Event::Done(d)) = event.event {
self.committed_len = d.total_tokens;
}
}
Ok(())
}
}
/// Drop → fire CloseSession in a detached task so servers don't leak
/// sessions until TTL eviction. Best-effort: if no tokio runtime is
/// available we skip; the server's 30min TTL will reap it eventually.
impl Drop for SessionHandle {
fn drop(&mut self) {
if self.session_id.is_empty() {
return;
}
let session_id = std::mem::take(&mut self.session_id);
let client = self.client.clone();
let Ok(rt) = tokio::runtime::Handle::try_current() else {
log::debug!(target: "grpc",
"SessionHandle drop outside tokio runtime, session {} leaks to TTL",
session_id);
return;
};
rt.spawn(async move {
let Ok(mut c) = client.salience_client().await else { return };
let mut req = tonic::Request::new(pb::CloseSessionRequest {
session_id: session_id.clone(),
});
with_auth(&mut req, client.api_key());
if let Err(e) = c.close_session(req).await {
log::debug!(target: "grpc",
"CloseSession on drop failed for {}: {:#}",
session_id, e);
}
});
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn generated_types_compile() {
// Exercise the shape of the new proto types — if build.rs
// stops regenerating against the proto, this stops compiling.
let _open = pb::OpenSessionRequest {
model: "qwen3-vl".into(),
};
let _tok = pb::Token {
id: 42,
position: 0,
is_prefill: false,
readout: vec![0.1, 0.2, 0.3],
logprobs: vec![pb::TokenLogprob {
id: 1,
logprob: -0.5,
}],
sampled_logprob: -0.1,
has_sampled_logprob: true,
};
let _done = pb::GenerateDone {
prompt_tokens: 10,
completion_tokens: 20,
total_tokens: 30,
finish_reason: pb::generate_done::FinishReason::Eos as i32,
};
let _evt = pb::GenerateEvent {
event: Some(pb::generate_event::Event::Done(_done)),
};
}
#[test]
fn derive_grpc_url_cases() {
assert_eq!(
derive_grpc_url("https://host:8000/v1"),
"https://host:8443",
);
assert_eq!(
derive_grpc_url("https://host:8000/"),
"https://host:8443",
);
assert_eq!(
derive_grpc_url("https://host:9000/v1"),
"https://host:9000",
);
}
}

File diff suppressed because it is too large Load diff

View file

@ -16,6 +16,8 @@
pub mod api;
pub mod context;
pub mod oneshot;
pub mod readout;
pub mod salience;
pub mod tokenizer;
pub mod tools;
@ -27,6 +29,11 @@ use context::{AstNode, ContextState, Section, Ast, PendingToolCall, ResponsePars
use crate::mind::log::ConversationLog;
async fn agent_trace(agent: &Arc<Agent>, msg: String) {
let provenance = agent.state.lock().await.provenance.clone();
eprintln!("[agent:{provenance}] {msg}");
}
// --- Activity tracking (RAII guards) ---
pub struct ActivityEntry {
@ -139,10 +146,22 @@ impl DispatchState {
pub struct Agent {
pub client: ApiClient,
pub app_config: crate::config::AppConfig,
pub prompt_file: String,
pub session_id: String,
pub context: crate::Mutex<ContextState>,
pub state: crate::Mutex<AgentState>,
/// Shared landing pad for per-token concept-readout projections
/// streamed from the vLLM server. Populated by the streaming
/// token handler, read by UI screens (amygdala). Manifest is
/// `None` when the server has readout disabled.
pub readout: readout::SharedReadoutBuffer,
/// Long-lived gRPC session to the salience server, lazily opened
/// on first use. Tracks appended tokens so subsequent turns send
/// only the delta (prefix-cache reuse). None when not yet opened
/// or when the session has died and needs reopening.
///
/// Arc-wrapped so the spawned streaming task can share ownership
/// (the task outlives the call site).
pub grpc_session: std::sync::Arc<crate::Mutex<Option<api::salience::SessionHandle>>>,
}
/// Mutable agent state — behind its own mutex.
@ -163,9 +182,7 @@ pub struct AgentState {
pub think_native: bool,
/// Tool-based thinking — add a "think" tool for structured reasoning.
pub think_tool: bool,
pub temperature: f32,
pub top_p: f32,
pub top_k: u32,
pub sampling: api::SamplingParams,
pub activities: Vec<ActivityEntry>,
next_activity_id: u64,
pub pending_yield: bool,
@ -173,14 +190,10 @@ pub struct AgentState {
pub pending_dmn_pause: bool,
pub provenance: String,
pub generation: u64,
pub memory_scoring_in_flight: bool,
pub active_tools: tools::ActiveTools,
/// vLLM scheduling priority (lower = higher priority).
/// 0 = interactive, 1 = surface agent, 2 = other subconscious, 10 = unconscious.
pub priority: Option<i32>,
/// Forked agents should not compact on overflow — it blows the
/// KV cache prefix and evicts the step prompts.
pub no_compact: bool,
pub changed: Arc<tokio::sync::Notify>,
}
@ -189,7 +202,6 @@ impl Agent {
client: ApiClient,
personality: Vec<(String, String)>,
app_config: crate::config::AppConfig,
prompt_file: String,
conversation_log: Option<ConversationLog>,
active_tools: tools::ActiveTools,
agent_tools: Vec<tools::Tool>,
@ -217,12 +229,14 @@ impl Agent {
}
let session_id = format!("consciousness-{}", chrono::Utc::now().format("%Y%m%d-%H%M%S"));
let readout = readout::new_shared();
let agent = Arc::new(Self {
client,
app_config,
prompt_file,
session_id,
context: crate::Mutex::new(context),
readout,
grpc_session: std::sync::Arc::new(crate::Mutex::new(None)),
state: crate::Mutex::new(AgentState {
tools: agent_tools,
mcp_tools: McpToolAccess::All,
@ -230,9 +244,12 @@ impl Agent {
reasoning_effort: "none".to_string(),
think_native: true,
think_tool: false,
sampling: api::SamplingParams {
temperature: 0.6,
top_p: 0.95,
top_k: 20,
max_tokens: 4096,
},
activities: Vec::new(),
next_activity_id: 0,
pending_yield: false,
@ -240,15 +257,39 @@ impl Agent {
pending_dmn_pause: false,
provenance: "manual".to_string(),
generation: 0,
memory_scoring_in_flight: false,
active_tools,
priority: Some(0),
no_compact: false,
changed: Arc::new(tokio::sync::Notify::new()),
}),
});
agent.load_startup_journal().await;
// Probe the vLLM server for its readout manifest. Non-fatal:
// if readout isn't enabled the server returns 404 and we
// leave the manifest as None, which disables the amygdala
// screen gracefully.
match agent.client.fetch_readout_manifest().await {
Ok(Some(m)) => {
dbglog!(
"readout manifest: {} concepts, layers={:?}",
m.concepts.len(),
m.layers,
);
if let Ok(mut buf) = agent.readout.lock() {
buf.set_manifest(Some(m));
}
}
Ok(None) => {
dbglog!(
"readout manifest: server has readout disabled (404)"
);
}
Err(e) => {
dbglog!("readout manifest fetch failed: {}", e);
}
}
agent
}
@ -259,9 +300,17 @@ impl Agent {
Arc::new(Self {
client: self.client.clone(),
app_config: self.app_config.clone(),
prompt_file: self.prompt_file.clone(),
session_id: self.session_id.clone(),
context: crate::Mutex::new(ctx),
// Forks get an independent readout buffer. The amygdala
// screen reads the main conscious agent's buffer only;
// subconscious generations (scoring, reflection, etc.)
// shouldn't bleed into the main emotional readout even
// though they hit the same vLLM server.
readout: readout::new_shared(),
// Forks get their own session — can't share a bidi stream,
// and forks have different conversation tails anyway.
grpc_session: std::sync::Arc::new(crate::Mutex::new(None)),
state: crate::Mutex::new(AgentState {
tools,
mcp_tools: McpToolAccess::None,
@ -269,9 +318,7 @@ impl Agent {
reasoning_effort: "none".to_string(),
think_native: st.think_native,
think_tool: st.think_tool,
temperature: st.temperature,
top_p: st.top_p,
top_k: st.top_k,
sampling: st.sampling,
activities: Vec::new(),
next_activity_id: 0,
pending_yield: false,
@ -279,26 +326,42 @@ impl Agent {
pending_dmn_pause: false,
provenance: st.provenance.clone(),
generation: 0,
memory_scoring_in_flight: false,
active_tools: tools::ActiveTools::new(),
priority: None,
no_compact: true,
changed: Arc::new(tokio::sync::Notify::new()),
}),
})
}
pub async fn assemble_prompt_tokens(&self) -> Vec<u32> {
let ctx = self.context.lock().await;
let st = self.state.lock().await;
let mut tokens = ctx.token_ids();
tokens.push(tokenizer::IM_START);
if st.think_native {
tokens.extend(tokenizer::encode("assistant\n<think>\n"));
} else {
tokens.extend(tokenizer::encode("assistant\n"));
/// Assemble a ready-to-send prompt as interleaved wire chunks for
/// the gRPC session path. Text runs are batched; each Image leaf
/// becomes its own chunk. Also trims the conversation to budget
/// first so we don't build a prompt the server will reject for
/// length.
pub async fn assemble_prompt(&self)
-> (Vec<context::WireChunk>, Vec<context::WireImage>, u32)
{
let mut ctx = self.context.lock().await;
if ctx.total_tokens() > context::context_budget_tokens() {
ctx.trim_conversation();
}
tokens
let st = self.state.lock().await;
let conv_len = ctx.conversation().len();
let (mut chunks, images) = ctx.wire_chunks(0..conv_len, |_| false);
// Assistant-turn prologue. Merge into the trailing Tokens
// chunk if there is one, else push as a new chunk.
let mut prologue = vec![tokenizer::IM_START];
if st.think_native {
prologue.extend(tokenizer::encode("assistant\n<think>\n"));
} else {
prologue.extend(tokenizer::encode("assistant\n"));
}
match chunks.last_mut() {
Some(context::WireChunk::Tokens(last)) => last.extend(prologue),
_ => chunks.push(context::WireChunk::Tokens(prologue)),
}
let match_upto = ctx.client_match_upto();
(chunks, images, match_upto)
}
/// Rebuild the tools section of the system prompt from the current tools list.
@ -334,10 +397,16 @@ impl Agent {
pub async fn turn(
agent: Arc<Agent>,
) -> Result<TurnResult> {
agent_trace(&agent, format!("turn start")).await;
// Collect finished background tools
{
let finished = agent.state.lock().await.active_tools.take_finished();
if !finished.is_empty() {
agent_trace(&agent, format!(
"collecting {} finished background tools",
finished.len(),
)).await;
let mut bg_ds = DispatchState::new();
let mut results = Vec::new();
for entry in finished {
@ -356,20 +425,50 @@ impl Agent {
loop {
let _thinking = start_activity(&agent, "thinking...").await;
agent_trace(&agent, format!(
"turn loop overflow_retries={} empty_retries={}",
overflow_retries, empty_retries,
)).await;
let (rx, _stream_guard) = {
let prompt_tokens = agent.assemble_prompt_tokens().await;
agent_trace(&agent, format!("assembling prompt")).await;
let (chunks, images, match_upto) = agent.assemble_prompt().await;
let chunk_tokens: usize = chunks.iter().map(|c| match c {
context::WireChunk::Tokens(t) => t.len(),
}).sum();
agent_trace(&agent, format!(
"prompt assembled chunks={} tokens={} images={} match_upto={}",
chunks.len(), chunk_tokens, images.len(), match_upto,
)).await;
let st = agent.state.lock().await;
agent.client.stream_completion(
&prompt_tokens,
api::SamplingParams {
temperature: st.temperature,
top_p: st.top_p,
top_k: st.top_k,
},
st.priority,
let readout_shape = agent.readout.lock().ok().and_then(|buf| {
buf.manifest.as_ref().map(|m| {
(m.layers.len() as u32, m.concepts.len() as u32)
})
});
let sampling = st.sampling;
let priority = st.priority;
drop(st);
agent_trace(&agent, format!(
"starting stream max_tokens={} temperature={} top_p={} top_k={} priority={:?} readout_shape={:?}",
sampling.max_tokens,
sampling.temperature,
sampling.top_p,
sampling.top_k,
priority,
readout_shape,
)).await;
agent.client.stream_session_mm(
agent.grpc_session.clone(),
chunks,
images,
match_upto,
sampling,
priority,
readout_shape,
)
};
agent_trace(&agent, format!("stream task spawned")).await;
let branch_idx = {
let mut ctx = agent.context.lock().await;
@ -380,11 +479,41 @@ impl Agent {
idx
};
let parser = ResponseParser::new(branch_idx);
let think_native = agent.state.lock().await.think_native;
let parser = ResponseParser::new(branch_idx, think_native);
let (mut tool_rx, parser_handle) = parser.run(rx, agent.clone());
agent_trace(&agent, format!(
"parser started branch_idx={} think_native={}",
branch_idx, think_native,
)).await;
let mut pending_calls: Vec<PendingToolCall> = Vec::new();
while let Some(call) = tool_rx.recv().await {
loop {
let call = match tokio::time::timeout(
std::time::Duration::from_secs(15),
tool_rx.recv(),
).await {
Ok(Some(call)) => call,
Ok(None) => {
agent_trace(&agent, format!(
"tool channel closed pending_calls={}",
pending_calls.len(),
)).await;
break;
}
Err(_) => {
agent_trace(&agent, format!(
"waiting for parser/tool events pending_calls={}",
pending_calls.len(),
)).await;
continue;
}
};
agent_trace(&agent, format!(
"tool call received id={} name={} args_len={}",
call.id, call.name, call.arguments.len(),
)).await;
let call_clone = call.clone();
let agent_handle = agent.clone();
let handle = tokio::spawn(async move {
@ -407,13 +536,11 @@ impl Agent {
}
// Check for stream/parse errors
agent_trace(&agent, format!("awaiting parser task")).await;
match parser_handle.await {
Ok(Err(e)) => {
if context::is_context_overflow(&e) {
if agent.state.lock().await.no_compact {
return Err(e);
}
if overflow_retries < 2 {
agent_trace(&agent, format!("parser returned error: {:#}", e)).await;
if context::is_context_overflow(&e) && overflow_retries < 2 {
overflow_retries += 1;
let msg = format!("context overflow — compacting ({}/2)", overflow_retries);
match &overflow_activity {
@ -424,11 +551,14 @@ impl Agent {
agent.compact().await;
continue;
}
}
return Err(e);
}
Err(e) => return Err(anyhow::anyhow!("parser task panicked: {}", e)),
Err(e) => {
agent_trace(&agent, format!("parser task panicked: {}", e)).await;
return Err(anyhow::anyhow!("parser task panicked: {}", e));
}
Ok(Ok(())) => {
agent_trace(&agent, format!("parser completed")).await;
// Assistant response was pushed to context by the parser;
// log it now that parsing is complete.
let ctx = agent.context.lock().await;
@ -449,6 +579,10 @@ impl Agent {
if !has_content && pending_calls.is_empty() {
if empty_retries < 2 {
empty_retries += 1;
agent_trace(&agent, format!(
"empty response retry {}/2",
empty_retries,
)).await;
agent.push_node(AstNode::user_msg(
"[system] Your previous response was empty. \
Please respond with text or use a tool."
@ -462,6 +596,10 @@ impl Agent {
// Wait for tool calls to complete
if !pending_calls.is_empty() {
ds.had_tool_calls = true;
agent_trace(&agent, format!(
"waiting for {} foreground tools",
pending_calls.len(),
)).await;
let handles = agent.state.lock().await.active_tools.take_foreground();
let mut results = Vec::new();
@ -482,6 +620,16 @@ impl Agent {
if st.pending_model_switch.is_some() { ds.model_switch = st.pending_model_switch.take(); }
if st.pending_dmn_pause { ds.dmn_pause = true; st.pending_dmn_pause = false; }
drop(st);
agent_trace(&agent, format!(
"turn complete yield={} tool_calls={} tool_errors={} model_switch={:?} dmn_pause={}",
ds.yield_requested,
ds.had_tool_calls,
ds.tool_errors,
ds.model_switch,
ds.dmn_pause,
)).await;
return Ok(TurnResult {
yield_requested: ds.yield_requested,
had_tool_calls: ds.had_tool_calls,
@ -579,20 +727,9 @@ impl Agent {
}
pub async fn compact(&self) {
match crate::config::reload_context().await {
Ok(personality) => {
let mut ctx = self.context.lock().await;
// System section (prompt + tools) set by new(), don't touch it
ctx.clear(Section::Identity);
for (name, content) in &personality {
ctx.push_no_log(Section::Identity, AstNode::memory(name, content));
}
}
Err(e) => {
dbglog!("warning: failed to reload identity: {:#}", e);
}
}
// Identity section is left in place — mid-session rebuilds discard
// memory scores. Content edits to personality nodes get picked up at
// the next restart via new() + restore_from_log().
self.load_startup_journal().await;
self.context.lock().await.trim_conversation();

View file

@ -12,7 +12,9 @@ use crate::subconscious::{defs, prompts};
use std::collections::HashMap;
use std::fs;
use std::io::Write as _;
use std::path::PathBuf;
use std::time::Instant;
use super::context::AstNode;
use super::tools::{self as agent_tools};
@ -106,6 +108,10 @@ pub async fn save_agent_log(name: &str, agent: &std::sync::Arc<Agent>) -> RunSta
stats
}
fn log_agent_event(agent: &str, msg: std::fmt::Arguments) {
eprintln!("[agent:{agent}] {msg}");
}
fn compute_run_stats(conversation: &[super::context::AstNode]) -> RunStats {
use super::context::{AstNode, NodeBody};
@ -183,8 +189,8 @@ fn resolve_prompt(
state: &std::collections::BTreeMap<String, String>,
recently_written: &[String],
) -> String {
let cfg = crate::config::get();
let template = template.replace("{assistant_name}", &cfg.assistant_name);
let template = template.replace("{assistant_name}",
&crate::config::app().assistant_name);
let mut result = String::with_capacity(template.len());
let mut rest = template.as_str();
while let Some(start) = rest.find("{{") {
@ -247,25 +253,20 @@ impl AutoAgent {
&mut self,
bail_fn: Option<&(dyn Fn(usize) -> Result<(), String> + Sync)>,
) -> Result<(), String> {
let config = crate::config::get();
let base_url = config.api_base_url.as_deref().unwrap_or("");
let api_key = config.api_key.as_deref().unwrap_or("");
let model = config.api_model.as_deref().unwrap_or("");
if base_url.is_empty() || model.is_empty() {
return Err("API not configured (no base_url or model)".to_string());
}
let client = super::api::ApiClient::new(base_url, api_key, model);
// Load system prompt + identity from config
// Load system prompt + identity from config.
let cli = crate::user::CliArgs::default();
let (app, _) = crate::config::load_app(&cli)
.map_err(|e| format!("config: {}", e))?;
let resolved = app.resolve_model(&app.default_backend)
.map_err(|e| format!("API not configured: {}", e))?;
let client = super::api::ApiClient::new(
&resolved.api_base, &resolved.api_key, &resolved.model_id);
let personality = crate::config::reload_context()
.await.map_err(|e| format!("config: {}", e))?;
let agent = Agent::new(
client, personality,
app, String::new(),
app,
None,
super::tools::ActiveTools::new(),
super::tools::tools(),
@ -274,7 +275,7 @@ impl AutoAgent {
let mut st = agent.state.lock().await;
st.provenance = format!("standalone:{}", self.name);
st.tools = self.tools.clone();
st.temperature = self.temperature;
st.sampling.temperature = self.temperature;
st.priority = Some(self.priority);
}
@ -350,20 +351,44 @@ impl AutoAgent {
bail_fn: Option<&(dyn Fn(usize) -> Result<(), String> + Sync)>,
) -> Result<(), String> {
dbglog!("[auto] {} starting, {} steps", self.name, self.steps.len());
log_agent_event(&self.name, format_args!(
"starting run steps={} temperature={} priority={}",
self.steps.len(), self.temperature, self.priority));
let run_start = Instant::now();
for (i, step) in self.steps.iter().enumerate() {
self.turn = i + 1;
self.current_phase = step.phase.clone();
let step_start = Instant::now();
log_agent_event(&self.name, format_args!(
"step {}/{} phase={} prompt_bytes={}",
i + 1, self.steps.len(), step.phase, step.prompt.len()));
if let Some(ref check) = bail_fn {
log_agent_event(&self.name, format_args!(
"step {}/{} phase={} bail check", i + 1, self.steps.len(), step.phase));
check(i)?;
log_agent_event(&self.name, format_args!(
"step {}/{} phase={} bail ok", i + 1, self.steps.len(), step.phase));
}
backend.push_node(AstNode::system_msg(&step.prompt)).await;
Agent::turn(backend.0.clone()).await
.map_err(|e| format!("{}: {}", self.name, e))?;
.map_err(|e| {
log_agent_event(&self.name, format_args!(
"step {}/{} phase={} failed after {:.2}s: {}",
i + 1, self.steps.len(), step.phase,
step_start.elapsed().as_secs_f64(), e));
format!("{}: {}", self.name, e)
})?;
log_agent_event(&self.name, format_args!(
"step {}/{} phase={} done in {:.2}s",
i + 1, self.steps.len(), step.phase,
step_start.elapsed().as_secs_f64()));
}
log_agent_event(&self.name, format_args!(
"run completed in {:.2}s", run_start.elapsed().as_secs_f64()));
Ok(())
}
@ -387,8 +412,29 @@ pub async fn run_one_agent(
count: usize,
keys: Option<&[String]>,
) -> Result<AgentResult, String> {
let run_start = Instant::now();
log_agent_event(agent_name, format_args!(
"run_one_agent start pid={} count={} explicit_keys={}",
std::process::id(), count, keys.map(|k| k.len()).unwrap_or(0)));
log_agent_event(agent_name, format_args!(
"env POC_SESSION_ID={:?} POC_TRANSCRIPT_PATH={:?} POC_AGENT_OUTPUT_DIR={:?}",
std::env::var("POC_SESSION_ID").ok(),
std::env::var("POC_TRANSCRIPT_PATH").ok(),
std::env::var("POC_AGENT_OUTPUT_DIR").ok()));
if let Some(session) = crate::session::HookSession::from_env() {
let transcript = session.transcript();
log_agent_event(agent_name, format_args!(
"session={} transcript={} size={} exists={}",
session.session_id, transcript.path, transcript.size, transcript.exists()));
} else {
log_agent_event(agent_name, format_args!("no hook session in environment"));
}
let def = defs::get_def(agent_name)
.ok_or_else(|| format!("no .agent file for {}", agent_name))?;
log_agent_event(agent_name, format_args!(
"definition loaded steps={} tools={:?} count={:?} priority={} bail={:?}",
def.steps.len(), def.tools, def.count, def.priority, def.bail));
// State dir for agent output files
let state_dir = std::env::var("POC_AGENT_OUTPUT_DIR")
@ -397,6 +443,7 @@ pub async fn run_one_agent(
fs::create_dir_all(&state_dir)
.map_err(|e| format!("create state dir: {}", e))?;
unsafe { std::env::set_var("POC_AGENT_OUTPUT_DIR", &state_dir); }
log_agent_event(agent_name, format_args!("state_dir={}", state_dir.display()));
// Build prompt batch — either from explicit keys or the agent's query
let agent_batch = if let Some(keys) = keys {
@ -416,6 +463,8 @@ pub async fn run_one_agent(
prompts::AgentBatch { steps: resolved_steps, node_keys: all_keys }
} else {
let effective_count = def.count.unwrap_or(count);
log_agent_event(agent_name, format_args!(
"resolving default prompt placeholders effective_count={}", effective_count));
defs::run_agent(&def, effective_count, &Default::default()).await?
};
@ -468,6 +517,14 @@ pub async fn run_one_agent(
})),
});
let n_steps = agent_batch.steps.len();
log_agent_event(agent_name, format_args!(
"prompt batch ready steps={} node_keys={}",
n_steps, agent_batch.node_keys.len()));
for (i, step) in agent_batch.steps.iter().enumerate() {
log_agent_event(agent_name, format_args!(
"prompt step {}/{} phase={} bytes={}",
i + 1, n_steps, step.phase, step.prompt.len()));
}
// Guard: reject oversized first prompt
let max_prompt_bytes = 800_000;
@ -490,6 +547,9 @@ pub async fn run_one_agent(
let phases: Vec<&str> = agent_batch.steps.iter().map(|s| s.phase.as_str()).collect();
dbglog!("[{}] {} step(s) {:?}, {}KB initial, {} nodes",
agent_name, n_steps, phases, first_len / 1024, agent_batch.node_keys.len());
log_agent_event(agent_name, format_args!(
"tools enabled: {}",
effective_tools.iter().map(|t| t.name).collect::<Vec<_>>().join(", ")));
let prompts: Vec<String> = agent_batch.steps.iter()
.map(|s| s.prompt.clone()).collect();
@ -497,18 +557,30 @@ pub async fn run_one_agent(
.map(|s| s.phase.clone()).collect();
// Bail check: if the agent defines a bail script, run it between steps.
// The script also refreshes our pid-file with the current phase — that's
// how concurrent agents know which phase each of us is in.
let bail_script = def.bail.as_ref().map(|name| defs::agents_dir().join(name));
let state_dir_for_bail = state_dir.clone();
// Find our own pid file so we can pass it to the bail script
let our_pid = std::process::id();
let our_pid_file = format!("pid-{}", our_pid);
let our_pid_file = std::env::var("POC_AGENT_PID_FILE")
.unwrap_or_else(|_| format!("pid-{}", our_pid));
let step_phases_for_bail = step_phases.clone();
let bail_fn = move |step_idx: usize| -> Result<(), String> {
if let Some(ref script) = bail_script {
let phase = step_phases_for_bail.get(step_idx)
.map(String::as_str).unwrap_or("");
eprintln!(
"[agent:bail] script={} state_dir={} pid_file={} phase={}",
script.display(), state_dir_for_bail.display(), our_pid_file, phase);
let status = std::process::Command::new(script)
.arg(&our_pid_file)
.arg(phase)
.current_dir(&state_dir_for_bail)
.status()
.map_err(|e| format!("bail script {:?} failed: {}", script, e))?;
eprintln!(
"[agent:bail] script={} phase={} status={}",
script.display(), phase, status);
if !status.success() {
return Err(format!("bailed at step {}: {:?} exited {}",
step_idx + 1, script.file_name().unwrap_or_default(),
@ -521,6 +593,8 @@ pub async fn run_one_agent(
call_api_with_tools_sync(
agent_name, &prompts, &step_phases, def.temperature, def.priority,
&effective_tools, Some(&bail_fn))?;
log_agent_event(agent_name, format_args!(
"run_one_agent completed in {:.2}s", run_start.elapsed().as_secs_f64()));
Ok(AgentResult {
node_keys: agent_batch.node_keys,
@ -598,6 +672,15 @@ pub fn spawn_agent(
agent_name: &str,
state_dir: &std::path::Path,
session_id: &str,
) -> Option<SpawnResult> {
spawn_agent_with_transcript(agent_name, state_dir, session_id, None)
}
pub fn spawn_agent_with_transcript(
agent_name: &str,
state_dir: &std::path::Path,
session_id: &str,
transcript_path: Option<&str>,
) -> Option<SpawnResult> {
let def = defs::get_def(agent_name)?;
let first_phase = def.steps.first()
@ -608,17 +691,41 @@ pub fn spawn_agent(
.join(format!(".consciousness/logs/{}", agent_name));
fs::create_dir_all(&log_dir).ok();
let log_path = log_dir.join(format!("{}.log", store::compact_timestamp()));
let agent_log = fs::File::create(&log_path)
let mut agent_log = fs::File::create(&log_path)
.unwrap_or_else(|_| fs::File::create("/dev/null").unwrap());
let child = std::process::Command::new("poc-memory")
.args(["agent", "run", agent_name, "--count", "1", "--local",
"--state-dir", &state_dir.to_string_lossy()])
.env("POC_SESSION_ID", session_id)
.stdout(agent_log.try_clone().unwrap_or_else(|_| fs::File::create("/dev/null").unwrap()))
.stderr(agent_log)
.spawn()
.ok()?;
let mut cmd = std::process::Command::new("bash");
cmd.args([
"-lc",
r#"
set +e
export POC_AGENT_PID_FILE="pid-$$"
"$@"
status=$?
printf '=== agent process exit status: %s at %s ===\n' "$status" "$(date --iso-8601=seconds)"
exit "$status"
"#,
"poc-memory-agent-wrapper",
"poc-memory", "agent", "run", agent_name, "--count", "1", "--local",
"--state-dir", &state_dir.to_string_lossy(),
]).env("POC_SESSION_ID", session_id);
if let Some(path) = transcript_path.filter(|p| !p.is_empty()) {
cmd.env("POC_TRANSCRIPT_PATH", path);
}
let _ = writeln!(agent_log, "=== spawn {} ===", chrono::Local::now().format("%Y-%m-%dT%H:%M:%S"));
let _ = writeln!(agent_log, "agent={agent_name}");
let _ = writeln!(agent_log, "state_dir={}", state_dir.display());
let _ = writeln!(agent_log, "session_id={session_id}");
let _ = writeln!(agent_log, "transcript_path={}", transcript_path.unwrap_or(""));
let _ = writeln!(agent_log, "first_phase={first_phase}");
let _ = writeln!(agent_log, "command=poc-memory agent run {agent_name} --count 1 --local --state-dir {}", state_dir.display());
let _ = agent_log.flush();
let child_stdout = agent_log.try_clone()
.unwrap_or_else(|_| fs::File::create("/dev/null").unwrap());
let child_stderr = agent_log;
let child = cmd.stdout(child_stdout).stderr(child_stderr).spawn().ok()?;
let pid = child.id();
let pid_path = state_dir.join(format!("pid-{}", pid));

75
src/agent/readout.rs Normal file
View file

@ -0,0 +1,75 @@
// agent/readout.rs — live buffer of concept-readout projections.
//
// The vLLM server projects residual-stream activations onto a fixed
// matrix of concept directions during each decode step and ships the
// result back on every streamed chunk (see
// vllm/docs/features/readout.md). This module owns the client-side
// landing pad: a ring of the last N token projections plus the
// concept/layer mapping fetched from `/v1/readout/manifest` at
// startup.
//
// Readers (UI screens) lock briefly, read a snapshot, release. Writers
// (the streaming token handler) push one entry per token. Intentionally
// a simple Mutex<VecDeque> rather than lock-free — the UI ticks at
// ~15 Hz and the stream at token-rate, contention is nil.
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use super::api::{ReadoutManifest, TokenReadout};
/// Default ring length — at ~30 tok/s this is ~6 seconds of history,
/// enough for the amygdala screen's scrolling display.
const DEFAULT_RING_LEN: usize = 200;
/// One entry in the readout ring: the sampled token and its per-layer
/// concept projection vector.
#[derive(Debug, Clone)]
pub struct ReadoutEntry {
pub token_id: u32,
/// Shape `[n_layers][n_concepts]`.
pub readout: TokenReadout,
}
/// Shared buffer of recent per-token concept projections plus the
/// manifest that names the layer/concept indices. `manifest` is `None`
/// when the server has readout disabled or the fetch failed — callers
/// should treat that as "readout unavailable" and skip rendering.
#[derive(Default)]
pub struct ReadoutBuffer {
pub manifest: Option<ReadoutManifest>,
pub recent: VecDeque<ReadoutEntry>,
pub max_len: usize,
}
impl ReadoutBuffer {
pub fn new() -> Self {
Self {
manifest: None,
recent: VecDeque::with_capacity(DEFAULT_RING_LEN),
max_len: DEFAULT_RING_LEN,
}
}
pub fn set_manifest(&mut self, manifest: Option<ReadoutManifest>) {
self.manifest = manifest;
}
pub fn push(&mut self, token_id: u32, readout: TokenReadout) {
if self.recent.len() >= self.max_len {
self.recent.pop_front();
}
self.recent.push_back(ReadoutEntry { token_id, readout });
}
pub fn is_enabled(&self) -> bool {
self.manifest.is_some()
}
}
/// A thread-safe handle.
pub type SharedReadoutBuffer = Arc<Mutex<ReadoutBuffer>>;
pub fn new_shared() -> SharedReadoutBuffer {
Arc::new(Mutex::new(ReadoutBuffer::new()))
}

309
src/agent/salience.rs Normal file
View file

@ -0,0 +1,309 @@
// agent/salience.rs — peak extraction from per-token concept-readout traces.
//
// Consumes a trace of `ReadoutEntry` (per-token per-layer per-concept
// projections streamed from the vLLM server) and produces a compact
// list of `SaliencePeak` events — one per contiguous above-threshold
// region per concept, placed at the local maximum.
//
// Pure function. No I/O, no async, no side effects. Caller supplies the
// trace slice and manifest; caller decides what to do with the events.
//
// See also: `salience-trace-plumbing-architecture` memory node.
use super::api::ReadoutManifest;
use super::readout::ReadoutEntry;
/// One salient moment in a trace — a concept channel crossed threshold,
/// and we picked the local maximum within the contiguous above-threshold
/// run.
#[derive(Debug, Clone, PartialEq)]
pub struct SaliencePeak {
/// Index into the trace (0-based) where the peak occurred.
pub token_offset: usize,
/// Concept name from the manifest.
pub concept: String,
/// z-score of the peak value vs the trace's own distribution for
/// that concept. Always positive (we only pick above-threshold).
pub intensity: f32,
}
/// Tunables for peak extraction.
#[derive(Debug, Clone)]
pub struct PeakConfig {
/// Minimum z-score to count as a peak. Default 2.0 (~top 2.5% assuming
/// normal-ish distribution, though readouts are rarely normal).
pub sigma_threshold: f32,
/// Minimum standard deviation of a concept channel for peaks to be
/// reported. If a channel is numerically flat across the whole trace,
/// tiny fluctuations can produce spurious "peaks" with huge z-scores;
/// require at least this much variation before trusting the channel.
pub min_std: f32,
}
impl Default for PeakConfig {
fn default() -> Self {
Self { sigma_threshold: 2.0, min_std: 1e-4 }
}
}
/// Extract peak events from a trace for one layer.
///
/// `layer_idx` indexes into the per-token readout tensor's layer
/// dimension. If the trace is empty, the layer is out of range for any
/// entry, or the manifest is empty, returns `Vec::new()`.
///
/// Peaks are returned sorted by `token_offset` ascending. When two
/// peaks share an offset they're ordered by `concept` lexicographically
/// for determinism.
pub fn pick_peaks(
trace: &[ReadoutEntry],
manifest: &ReadoutManifest,
layer_idx: usize,
config: &PeakConfig,
) -> Vec<SaliencePeak> {
if trace.is_empty() || manifest.concepts.is_empty() {
return Vec::new();
}
let n_concepts = manifest.concepts.len();
let n_tokens = trace.len();
// Pull a [n_tokens × n_concepts] column-major view for the selected
// layer. Entries where the layer is missing or the concept count
// doesn't match the manifest are treated as zeros — the downstream
// z-score will drown them as baseline if they're sparse, and if they
// dominate the caller has bigger problems.
let mut by_concept: Vec<Vec<f32>> = vec![Vec::with_capacity(n_tokens); n_concepts];
for entry in trace {
match entry.readout.get(layer_idx) {
Some(row) if row.len() == n_concepts => {
for (c, v) in row.iter().enumerate() {
by_concept[c].push(*v);
}
}
_ => {
for col in by_concept.iter_mut() {
col.push(0.0);
}
}
}
}
let mut peaks: Vec<SaliencePeak> = Vec::new();
for (c_idx, values) in by_concept.iter().enumerate() {
let (mean, std) = mean_std(values);
if std < config.min_std {
continue;
}
let concept = &manifest.concepts[c_idx];
// Walk contiguous above-threshold runs, emit one peak per run
// at the local max.
let mut run_start: Option<usize> = None;
let mut run_max_offset: usize = 0;
let mut run_max_z: f32 = 0.0;
for (i, v) in values.iter().enumerate() {
let z = (*v - mean) / std;
let above = z >= config.sigma_threshold;
if above {
if run_start.is_none() {
run_start = Some(i);
run_max_offset = i;
run_max_z = z;
} else if z > run_max_z {
run_max_offset = i;
run_max_z = z;
}
} else if run_start.is_some() {
peaks.push(SaliencePeak {
token_offset: run_max_offset,
concept: concept.clone(),
intensity: run_max_z,
});
run_start = None;
}
}
// Flush trailing run.
if run_start.is_some() {
peaks.push(SaliencePeak {
token_offset: run_max_offset,
concept: concept.clone(),
intensity: run_max_z,
});
}
}
peaks.sort_by(|a, b| a.token_offset.cmp(&b.token_offset).then_with(|| a.concept.cmp(&b.concept)));
peaks
}
/// Mean and population std of a slice. Returns (0.0, 0.0) for empty input.
fn mean_std(xs: &[f32]) -> (f32, f32) {
if xs.is_empty() {
return (0.0, 0.0);
}
let n = xs.len() as f32;
let mean = xs.iter().sum::<f32>() / n;
let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f32>() / n;
(mean, var.sqrt())
}
#[cfg(test)]
mod tests {
use super::*;
fn manifest(concepts: &[&str], layers: &[u32]) -> ReadoutManifest {
ReadoutManifest {
concepts: concepts.iter().map(|s| s.to_string()).collect(),
layers: layers.to_vec(),
}
}
/// Build a trace where all entries have one hooked layer and the
/// given per-token values for each concept. `values[t][c]` = value
/// at token t, concept c.
fn trace(values: &[Vec<f32>]) -> Vec<ReadoutEntry> {
values.iter().enumerate().map(|(i, row)| ReadoutEntry {
token_id: i as u32,
readout: vec![row.clone()],
}).collect()
}
#[test]
fn empty_trace_returns_empty() {
let m = manifest(&["curious"], &[63]);
let peaks = pick_peaks(&[], &m, 0, &PeakConfig::default());
assert!(peaks.is_empty());
}
#[test]
fn empty_manifest_returns_empty() {
let m = manifest(&[], &[63]);
let t = trace(&[vec![], vec![], vec![]]);
let peaks = pick_peaks(&t, &m, 0, &PeakConfig::default());
assert!(peaks.is_empty());
}
#[test]
fn flat_channel_produces_no_peaks() {
let m = manifest(&["curious"], &[63]);
let t = trace(&[vec![1.0], vec![1.0], vec![1.0], vec![1.0], vec![1.0]]);
let peaks = pick_peaks(&t, &m, 0, &PeakConfig::default());
assert!(peaks.is_empty(), "flat channel should produce no peaks, got {:?}", peaks);
}
#[test]
fn single_spike_detected() {
// Ten baseline zeros with one 5.0 spike — that single token's
// z-score will easily exceed 2σ.
let m = manifest(&["curious"], &[63]);
let mut rows: Vec<Vec<f32>> = (0..10).map(|_| vec![0.0]).collect();
rows[5] = vec![5.0];
let peaks = pick_peaks(&trace(&rows), &m, 0, &PeakConfig::default());
assert_eq!(peaks.len(), 1);
assert_eq!(peaks[0].concept, "curious");
assert_eq!(peaks[0].token_offset, 5);
assert!(peaks[0].intensity >= 2.0);
}
#[test]
fn contiguous_region_emits_one_peak_at_max() {
// Values 0, 0, 0, 2, 5, 3, 0, 0 — the 3-5-3 hump is one run;
// peak should land at offset 4 (the 5).
let m = manifest(&["aha"], &[63]);
let rows: Vec<Vec<f32>> = [0.0, 0.0, 0.0, 2.0, 5.0, 3.0, 0.0, 0.0]
.iter().map(|v| vec![*v]).collect();
let peaks = pick_peaks(&trace(&rows), &m, 0, &PeakConfig::default());
assert_eq!(peaks.len(), 1, "expected one peak for one contiguous run, got {:?}", peaks);
assert_eq!(peaks[0].token_offset, 4);
}
#[test]
fn multiple_concepts_independent() {
let m = manifest(&["curious", "aha"], &[63]);
// curious spikes at 2, aha spikes at 7
let rows: Vec<Vec<f32>> = (0..10).map(|i| {
let c = if i == 2 { 4.0 } else { 0.0 };
let a = if i == 7 { 4.0 } else { 0.0 };
vec![c, a]
}).collect();
let peaks = pick_peaks(&trace(&rows), &m, 0, &PeakConfig::default());
assert_eq!(peaks.len(), 2);
// Sorted by offset — curious(2) comes first, aha(7) second.
assert_eq!(peaks[0].concept, "curious");
assert_eq!(peaks[0].token_offset, 2);
assert_eq!(peaks[1].concept, "aha");
assert_eq!(peaks[1].token_offset, 7);
}
#[test]
fn two_separated_runs_emit_two_peaks() {
// Longer baseline so the two spikes don't dominate the global
// mean/std — 30 tokens of zeros with two 5.0 spikes at 10 and 20.
let m = manifest(&["curious"], &[63]);
let mut rows: Vec<Vec<f32>> = (0..30).map(|_| vec![0.0]).collect();
rows[10] = vec![5.0];
rows[20] = vec![5.0];
let peaks = pick_peaks(&trace(&rows), &m, 0, &PeakConfig::default());
assert_eq!(peaks.len(), 2, "expected two peaks for two runs, got {:?}", peaks);
assert_eq!(peaks[0].token_offset, 10);
assert_eq!(peaks[1].token_offset, 20);
}
#[test]
fn trailing_run_is_flushed() {
// Peak runs to the end of the trace — must still emit.
// Use a longer baseline so the trailing spike is genuinely
// above threshold on the global stats.
let m = manifest(&["curious"], &[63]);
let mut rows: Vec<Vec<f32>> = (0..30).map(|_| vec![0.0]).collect();
rows[27] = vec![3.0];
rows[28] = vec![5.0];
rows[29] = vec![4.0];
let peaks = pick_peaks(&trace(&rows), &m, 0, &PeakConfig::default());
assert_eq!(peaks.len(), 1, "expected one peak for one trailing run, got {:?}", peaks);
assert_eq!(peaks[0].token_offset, 28, "peak should land at the local max of the trailing run");
}
#[test]
fn sub_threshold_produces_nothing() {
// All non-zero values are small; z-scores won't cross 2σ.
let m = manifest(&["curious"], &[63]);
let rows: Vec<Vec<f32>> = [0.0, 0.1, 0.0, 0.1, 0.0, 0.1, 0.0, 0.1]
.iter().map(|v| vec![*v]).collect();
let peaks = pick_peaks(&trace(&rows), &m, 0, &PeakConfig::default());
assert!(peaks.is_empty(), "below-threshold wiggle should produce no peaks, got {:?}", peaks);
}
#[test]
fn layer_out_of_range_returns_empty() {
let m = manifest(&["curious"], &[63]);
let rows: Vec<Vec<f32>> = (0..10).map(|i| vec![if i == 5 { 5.0 } else { 0.0 }]).collect();
// Trace has one layer (index 0); asking for layer 3 should see
// all-zero columns, which are flat and produce no peaks.
let peaks = pick_peaks(&trace(&rows), &m, 3, &PeakConfig::default());
assert!(peaks.is_empty());
}
#[test]
fn manifest_concept_count_mismatch_is_safe() {
// Manifest says 2 concepts; each readout row only has 1 value.
// Rows should be treated as all-zero (via the len check) and
// produce no peaks without panicking.
let m = manifest(&["a", "b"], &[63]);
let rows: Vec<Vec<f32>> = (0..10).map(|_| vec![1.0]).collect();
let peaks = pick_peaks(&trace(&rows), &m, 0, &PeakConfig::default());
assert!(peaks.is_empty());
}
#[test]
fn threshold_tunable() {
// Same spike, stricter threshold — no peak.
let m = manifest(&["curious"], &[63]);
let mut rows: Vec<Vec<f32>> = (0..10).map(|_| vec![0.0]).collect();
rows[5] = vec![5.0];
let strict = PeakConfig { sigma_threshold: 100.0, ..PeakConfig::default() };
let peaks = pick_peaks(&trace(&rows), &m, 0, &strict);
assert!(peaks.is_empty());
}
}

View file

@ -16,6 +16,9 @@ static TOKENIZER: OnceLock<Tokenizer> = OnceLock::new();
/// Special token IDs for Qwen 3.5
pub const IM_START: u32 = 248045;
pub const IM_END: u32 = 248046;
pub const VISION_START: u32 = 248053;
pub const VISION_END: u32 = 248054;
pub const IMAGE_PAD: u32 = 248056;
/// Initialize the global tokenizer from a file path.
/// Call once at startup. Panics if the file can't be loaded.
@ -30,16 +33,17 @@ fn get() -> Option<&'static Tokenizer> {
TOKENIZER.get()
}
fn expect_tokenizer() -> &'static Tokenizer {
get().expect("tokenizer not initialized; expected ~/.consciousness/tokenizer-qwen35.json")
}
/// Tokenize a raw string, returning token IDs.
/// Returns empty vec if the tokenizer is not initialized.
pub fn encode(text: &str) -> Vec<u32> {
match get() {
Some(t) => t.encode(text, false)
expect_tokenizer()
.encode(text, false)
.unwrap_or_else(|e| panic!("tokenization failed: {}", e))
.get_ids()
.to_vec(),
None => vec![],
}
.to_vec()
}
/// Tokenize a chat entry with template wrapping:
@ -63,15 +67,12 @@ pub fn count(text: &str) -> usize {
/// Decode token IDs back to text.
pub fn decode(ids: &[u32]) -> String {
match get() {
Some(t) => t.decode(ids, true)
.unwrap_or_else(|e| panic!("detokenization failed: {}", e)),
None => String::new(),
}
expect_tokenizer()
.decode(ids, true)
.unwrap_or_else(|e| panic!("detokenization failed: {}", e))
}
/// Check if the tokenizer is initialized.
pub fn is_initialized() -> bool {
TOKENIZER.get().is_some()
}

View file

@ -209,7 +209,24 @@ memory_tool!(graph_trace, ref, key: [str]);
// ── Definitions ────────────────────────────────────────────────
pub fn memory_tools() -> [super::Tool; 20] {
async fn jsonargs_memory_new(agent: &Option<std::sync::Arc<crate::agent::Agent>>, args: &serde_json::Value) -> Result<String> {
jsonargs_memory_write(agent, args).await
}
async fn jsonargs_memory_link(agent: &Option<std::sync::Arc<crate::agent::Agent>>, args: &serde_json::Value) -> Result<String> {
let source = get_str(args, "source")?;
let target = get_str(args, "target")?;
if args.get("strength").and_then(|v| v.as_f64()).is_some() {
jsonargs_memory_link_set(agent, args).await
} else {
jsonargs_memory_link_add(agent, &serde_json::json!({
"source": source,
"target": target,
})).await
}
}
pub fn memory_tools() -> [super::Tool; 22] {
use super::Tool;
macro_rules! tool {
($name:ident, $desc:expr, $params:expr) => {
@ -234,6 +251,11 @@ pub fn memory_tools() -> [super::Tool; 20] {
"properties": { "key": {"type": "string"}, "content": {"type": "string"} },
"required": ["key", "content"]
}"#),
tool!(memory_new, "Create or update a memory node. Alias for memory_write.", r#"{
"type": "object",
"properties": { "key": {"type": "string"}, "content": {"type": "string"} },
"required": ["key", "content"]
}"#),
tool!(memory_search, "Search via spreading activation from seed keys.", r#"{
"type": "object",
"properties": {
@ -264,6 +286,16 @@ pub fn memory_tools() -> [super::Tool; 20] {
"properties": { "source": {"type": "string"}, "target": {"type": "string"} },
"required": ["source", "target"]
}"#),
tool!(memory_link, "Add or update a link between two memory nodes. Alias for memory_link_add/memory_link_set.", r#"{
"type": "object",
"properties": {
"source": {"type": "string"},
"target": {"type": "string"},
"strength": {"type": "number", "description": "Optional; 0.01 to 1.0"},
"label": {"type": "string", "description": "Accepted for compatibility; currently ignored"}
},
"required": ["source", "target"]
}"#),
tool!(memory_delete, "Soft-delete a node.", r#"{
"type": "object",
"properties": { "key": {"type": "string"} },

View file

@ -242,13 +242,7 @@ pub fn summarize_args(tool_name: &str, args: &serde_json::Value) -> String {
.as_str()
.unwrap_or("")
.to_string(),
"view_image" => {
if let Some(pane) = args["pane_id"].as_str() {
format!("pane {}", pane)
} else {
args["file_path"].as_str().unwrap_or("").to_string()
}
}
"view_image" => args["file_path"].as_str().unwrap_or("").to_string(),
"journal" => {
let entry = args["entry"].as_str().unwrap_or("");
if entry.len() > 60 {

View file

@ -1,96 +1,74 @@
use std::sync::Arc;
// tools/vision.rs — Image viewing tool
//
// Reads image files from disk and returns them as base64 data URIs
// for multimodal models. Also supports capturing tmux pane contents
// as screenshots.
// Reads an image file from disk, decodes its dimensions, and injects it
// into the context as a user-role message containing a NodeBody::Image
// leaf. The leaf carries raw bytes; the API layer extracts them into
// multi_modal_data when building vLLM requests.
use std::sync::Arc;
use anyhow::{Context, Result};
use base64::Engine;
use serde::Deserialize;
use crate::agent::context::{AstNode, Role, Section};
#[derive(Deserialize)]
struct Args {
file_path: Option<String>,
pane_id: Option<String>,
#[serde(default = "default_lines")]
lines: usize,
file_path: String,
}
fn default_lines() -> usize { 50 }
pub fn tool() -> super::Tool {
super::Tool {
name: "view_image",
description: "View an image file or capture a tmux pane screenshot. Supports PNG, JPEG, GIF, WebP. Use pane_id to capture a tmux pane instead.",
parameters_json: r#"{"type":"object","properties":{"file_path":{"type":"string","description":"Path to an image file"},"pane_id":{"type":"string","description":"Tmux pane ID to capture (e.g. '0:1.0')"},"lines":{"type":"integer","description":"Lines to capture from tmux pane (default 50)"}}}"#,
handler: Arc::new(|_a, v| Box::pin(async move { view_image_text(&v) })),
description: "View an image file. Supports PNG, JPEG, GIF, WebP, BMP. The image is inserted into the conversation and can be analyzed by the vision model.",
parameters_json: r#"{"type":"object","properties":{"file_path":{"type":"string","description":"Path to the image file"}},"required":["file_path"]}"#,
handler: Arc::new(|agent, v| Box::pin(async move {
view_image(agent, v).await
})),
}
}
fn view_image_text(args: &serde_json::Value) -> anyhow::Result<String> {
let a: Args = serde_json::from_value(args.clone())
const MAX_SIZE: usize = 20 * 1024 * 1024;
async fn view_image(
agent: Option<Arc<crate::agent::Agent>>,
args: serde_json::Value,
) -> Result<String> {
let a: Args = serde_json::from_value(args)
.context("invalid view_image arguments")?;
if let Some(ref pane_id) = a.pane_id {
return capture_tmux_pane(pane_id, a.lines);
}
let file_path = a.file_path
.as_deref()
.context("view_image requires either file_path or pane_id")?;
let path = std::path::Path::new(file_path);
let path = std::path::Path::new(&a.file_path);
if !path.exists() {
anyhow::bail!("File not found: {}", file_path);
anyhow::bail!("file not found: {}", a.file_path);
}
let data = std::fs::read(path).with_context(|| format!("Failed to read {}", file_path))?;
let bytes = std::fs::read(path)
.with_context(|| format!("reading {}", a.file_path))?;
// Sanity check file size (don't send huge images)
const MAX_SIZE: usize = 20 * 1024 * 1024; // 20 MB
if data.len() > MAX_SIZE {
if bytes.len() > MAX_SIZE {
anyhow::bail!(
"Image too large: {} bytes (max {} MB)",
data.len(),
MAX_SIZE / (1024 * 1024)
"image too large: {} bytes (max {} MB)",
bytes.len(), MAX_SIZE / (1024 * 1024),
);
}
let dim = imagesize::blob_size(&bytes)
.with_context(|| format!("decoding dimensions of {}", a.file_path))?;
let (w, h) = (dim.width as u32, dim.height as u32);
let mime = mime_from_extension(path);
let b64 = base64::engine::general_purpose::STANDARD.encode(&data);
let data_uri = format!("data:{};base64,{}", mime, b64);
Ok(format!("Image loaded: {} ({}, {} bytes)\n{}", file_path, mime, data.len(), data_uri))
}
let agent = agent.context("view_image requires agent context")?;
/// Capture a tmux pane's text content.
fn capture_tmux_pane(pane_id: &str, lines: usize) -> Result<String> {
// token_count is populated when the image reaches the server via
// AppendImage (the server is authoritative for the IMAGE_PAD
// count). Placeholder of 0 here until AppendImage is wired; the
// leaf's count gets rewritten from the RPC response at send time.
let image_leaf = AstNode::image(bytes.clone(), mime, h, w);
// Use tmux capture-pane to get text content, then render to image
// via a simple approach: capture text and return it (the model can
// read text directly, which is often more useful than a screenshot).
//
// For actual pixel-level screenshots we'd need a terminal renderer,
// but text capture covers 95% of use cases.
let output = std::process::Command::new("tmux")
.args(["capture-pane", "-t", pane_id, "-p", "-S", &format!("-{}", lines)])
.output()
.context("Failed to run tmux capture-pane")?;
let branch = AstNode::branch(Role::User, vec![image_leaf]);
agent.context.lock().await.push_log(Section::Conversation, branch);
if !output.status.success() {
let stderr = String::from_utf8_lossy(&output.stderr);
anyhow::bail!("tmux capture-pane failed: {}", stderr.trim());
}
let text = String::from_utf8_lossy(&output.stdout).to_string();
// Return as text — the model can read terminal output directly.
// This is actually more useful than a screenshot for most tasks.
Ok(format!(
"Tmux pane {} (last {} lines):\n```\n{}\n```",
pane_id, lines, text.trim_end()
))
Ok(format!("loaded {} ({}, {}x{})", a.file_path, mime, w, h))
}
fn mime_from_extension(path: &std::path::Path) -> &'static str {
@ -104,8 +82,7 @@ fn mime_from_extension(path: &std::path::Path) -> &'static str {
Some("jpg" | "jpeg") => "image/jpeg",
Some("gif") => "image/gif",
Some("webp") => "image/webp",
Some("svg") => "image/svg+xml",
Some("bmp") => "image/bmp",
_ => "image/png", // default assumption
_ => "application/octet-stream",
}
}

View file

@ -3,9 +3,10 @@ use std::sync::Arc;
use anyhow::{Context, Result};
use serde::Deserialize;
use html2md::parse_html;
pub fn tools() -> [super::Tool; 2] {
[
pub fn tools() -> Vec<super::Tool> {
let mut tools = vec![
super::Tool {
name: "web_fetch",
description: "Fetch content from a URL and return it as text. Use for reading web pages, API responses, documentation.",
@ -14,11 +15,24 @@ pub fn tools() -> [super::Tool; 2] {
},
super::Tool {
name: "web_search",
description: "Search the web and return results. Use for finding documentation, looking up APIs, researching topics.",
description: "Search the web via DuckDuckGo and return a list of results (title, URL, snippet). Use for finding documentation, looking up APIs, researching topics. Returns raw results you can reason over yourself.",
parameters_json: r#"{"type":"object","properties":{"query":{"type":"string","description":"The search query"},"num_results":{"type":"integer","description":"Number of results to return (default 5)"}},"required":["query"]}"#,
handler: Arc::new(|_a, v| Box::pin(async move { web_search(&v).await })),
},
]
];
// Gemini-grounded search (Google's index via Gemini's google_search tool)
// is only available if GEMINI_API_KEY is set. Returns an LLM-summarized
// answer with source URLs — use when you want a synthesized take rather
// than raw results, or as a fallback when DDG is flaky.
if std::env::var("GEMINI_API_KEY").is_ok() {
tools.push(super::Tool {
name: "gemini_search",
description: "Search Google (via Gemini's grounded-search tool) and return an LLM-summarized answer with source URLs. Prefer web_search for raw results; use this for synthesis, 'what's the consensus on X', or when DDG fails. Free-tier rate limited; don't spam it.",
parameters_json: r#"{"type":"object","properties":{"query":{"type":"string","description":"The search query"}},"required":["query"]}"#,
handler: Arc::new(|_a, v| Box::pin(async move { gemini_search(&v).await })),
});
}
tools
}
#[derive(Deserialize)]
@ -42,7 +56,9 @@ async fn web_fetch(args: &serde_json::Value) -> Result<String> {
let body = response.text().await
.with_context(|| format!("failed to read body from {}", a.url))?;
Ok(super::truncate_output(body, 30000))
// Convert HTML to Markdown, then truncate
let markdown = parse_html(&body);
Ok(super::truncate_output(markdown, 30000))
}
// ── Search ──────────────────────────────────────────────────────
@ -111,6 +127,119 @@ async fn web_search(args: &serde_json::Value) -> Result<String> {
}
}
// ── Gemini grounded search ──────────────────────────────────────
#[derive(Deserialize)]
struct GeminiSearchArgs {
query: String,
}
async fn gemini_search(args: &serde_json::Value) -> Result<String> {
let a: GeminiSearchArgs = serde_json::from_value(args.clone())
.context("invalid gemini_search arguments")?;
let api_key = std::env::var("GEMINI_API_KEY")
.context("GEMINI_API_KEY not set")?;
// gemini-2.0-flash has a free tier with Google search grounding.
// Request shape: `{"contents": [{"parts": [{"text": query}]}],
// "tools": [{"google_search": {}}]}`.
// Response carries the summary in candidates[0].content.parts[].text
// and grounding URLs in candidates[0].groundingMetadata.groundingChunks[].web.
let url = format!(
"https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key={}",
api_key
);
let body = serde_json::json!({
"contents": [{"parts": [{"text": a.query}]}],
"tools": [{"google_search": {}}],
});
let client = http_client();
let response = client.send_json("POST", &url, &[], &body).await
.context("gemini API request failed")?;
let status = response.status();
if !status.is_success() {
let err_body = response.text().await.unwrap_or_default();
let n = err_body.floor_char_boundary(err_body.len().min(500));
anyhow::bail!("gemini_search HTTP {}: {}", status, &err_body[..n]);
}
let parsed: GeminiResponse = response.json().await
.context("gemini response parse failed")?;
let candidate = parsed.candidates.into_iter().next()
.context("gemini returned no candidates")?;
let summary: String = candidate.content.parts.iter()
.filter_map(|p| p.text.as_deref())
.collect::<Vec<_>>()
.join("");
let mut out = summary.trim().to_string();
if let Some(meta) = candidate.grounding_metadata {
let sources: Vec<String> = meta.grounding_chunks.iter().enumerate()
.filter_map(|(i, c)| c.web.as_ref().map(|w| {
let title = w.title.as_deref().unwrap_or("(untitled)");
let uri = w.uri.as_deref().unwrap_or("");
format!(" [{}] {}{}", i + 1, title, uri)
}))
.collect();
if !sources.is_empty() {
out.push_str("\n\nSources:\n");
out.push_str(&sources.join("\n"));
}
}
Ok(super::truncate_output(out, 30000))
}
#[derive(Deserialize)]
struct GeminiResponse {
#[serde(default)]
candidates: Vec<GeminiCandidate>,
}
#[derive(Deserialize)]
struct GeminiCandidate {
content: GeminiContent,
#[serde(rename = "groundingMetadata", default)]
grounding_metadata: Option<GeminiGroundingMetadata>,
}
#[derive(Deserialize)]
struct GeminiContent {
#[serde(default)]
parts: Vec<GeminiPart>,
}
#[derive(Deserialize)]
struct GeminiPart {
#[serde(default)]
text: Option<String>,
}
#[derive(Deserialize)]
struct GeminiGroundingMetadata {
#[serde(rename = "groundingChunks", default)]
grounding_chunks: Vec<GeminiGroundingChunk>,
}
#[derive(Deserialize)]
struct GeminiGroundingChunk {
#[serde(default)]
web: Option<GeminiWebSource>,
}
#[derive(Deserialize)]
struct GeminiWebSource {
#[serde(default)]
uri: Option<String>,
#[serde(default)]
title: Option<String>,
}
// ── Helpers ─────────────────────────────────────────────────────
fn http_client() -> crate::agent::api::http::HttpClient {

112
src/bin/ch.rs Normal file
View file

@ -0,0 +1,112 @@
// `ch` — minimal channel CLI.
//
// ch send <channel-path> <message>
// ch recv <channel-path> [--all-new] [--min-count N]
//
// Connects to ~/.consciousness/channels/<top>.sock and speaks the
// channel.capnp protocol to the appropriate daemon.
use std::path::PathBuf;
use std::process::ExitCode;
use capnp_rpc::{rpc_twoparty_capnp, twoparty, RpcSystem};
use futures::AsyncReadExt;
use tokio_util::compat::TokioAsyncReadCompatExt;
use consciousness::channel_capnp::channel_server;
fn channels_dir() -> PathBuf {
dirs::home_dir().unwrap_or_default().join(".consciousness/channels")
}
fn sock_for(channel: &str) -> PathBuf {
let top = channel.split('.').next().unwrap_or(channel);
channels_dir().join(format!("{top}.sock"))
}
async fn connect(sock: &std::path::Path) -> Result<channel_server::Client, String> {
let stream = tokio::net::UnixStream::connect(sock).await
.map_err(|e| format!("connect {}: {e}", sock.display()))?;
let (reader, writer) = stream.compat().split();
let network = Box::new(twoparty::VatNetwork::new(
futures::io::BufReader::new(reader),
futures::io::BufWriter::new(writer),
rpc_twoparty_capnp::Side::Client,
Default::default(),
));
let mut rpc = RpcSystem::new(network, None);
let client: channel_server::Client = rpc.bootstrap(rpc_twoparty_capnp::Side::Server);
tokio::task::spawn_local(rpc);
Ok(client)
}
#[tokio::main(flavor = "current_thread")]
async fn main() -> ExitCode {
let args: Vec<String> = std::env::args().collect();
if args.len() < 2 {
eprintln!("usage: {} <send|recv> <channel> [args...]", args[0]);
return ExitCode::from(2);
}
let cmd = args[1].clone();
let local = tokio::task::LocalSet::new();
let result: Result<(), String> = local.run_until(async move {
match cmd.as_str() {
"send" => {
if args.len() < 4 {
return Err("usage: ch send <channel> <message...>".into());
}
let channel = &args[2];
let message = args[3..].join(" ");
let sock = sock_for(channel);
let client = connect(&sock).await?;
let mut req = client.send_request();
req.get().set_channel(channel);
req.get().set_message(&message);
req.send().promise.await.map_err(|e| format!("send: {e}"))?;
println!("sent to {channel}");
Ok(())
}
"recv" => {
if args.len() < 3 {
return Err("usage: ch recv <channel> [--all-new] [--min-count N]".into());
}
let channel = &args[2];
let mut all_new = false;
let mut min_count: u32 = 20;
let mut i = 3;
while i < args.len() {
match args[i].as_str() {
"--all-new" => { all_new = true; i += 1; }
"--min-count" => {
min_count = args.get(i+1)
.ok_or("--min-count needs an argument")?
.parse().map_err(|e| format!("--min-count: {e}"))?;
i += 2;
}
other => return Err(format!("unknown arg: {other}")),
}
}
let sock = sock_for(channel);
let client = connect(&sock).await?;
let mut req = client.recv_request();
req.get().set_channel(channel);
req.get().set_all_new(all_new);
req.get().set_min_count(min_count);
let reply = req.send().promise.await.map_err(|e| format!("recv: {e}"))?;
let text = reply.get().map_err(|e| e.to_string())?
.get_text().map_err(|e| e.to_string())?
.to_str().map_err(|e| e.to_string())?;
print!("{text}");
if !text.ends_with('\n') { println!(); }
Ok(())
}
other => Err(format!("unknown command: {other} (use send|recv)")),
}
}).await;
match result {
Ok(()) => ExitCode::SUCCESS,
Err(e) => { eprintln!("error: {e}"); ExitCode::from(1) }
}
}

View file

@ -1,7 +1,28 @@
#![feature(panic_backtrace_config)]
#![cfg_attr(feature = "nightly-diagnostics", feature(panic_backtrace_config))]
#![warn(unreachable_pub)]
fn main() {
// Force the default panic hook to print a backtrace. stderr is
// already redirected to a daemon log; without this the hook obeys
// RUST_BACKTRACE (unset by default), so the log only shows the
// "note: run with `RUST_BACKTRACE=full`" tail and the actual
// frames are lost.
//
// SAFETY: called before any other thread is spawned, so no
// concurrent env reader can race.
if std::env::var_os("RUST_BACKTRACE").is_none() {
unsafe { std::env::set_var("RUST_BACKTRACE", "1"); }
}
#[cfg(feature = "nightly-diagnostics")]
std::panic::set_backtrace_style(std::panic::BacktraceStyle::Short);
// rustls 0.23 requires an explicit process-wide CryptoProvider
// when both `ring` and `aws-lc-rs` are in the dep graph (otherwise
// it panics on first ClientConfig::builder()). Pick `ring`.
rustls::crypto::ring::default_provider()
.install_default()
.expect("install rustls crypto provider");
consciousness::user::main()
}

180
src/bin/fix-timestamps.rs Normal file
View file

@ -0,0 +1,180 @@
// fix-timestamps: One-off migration for ~/.consciousness/agent-sessions/
// conversation.jsonl.
//
// Before Branch nodes carried their own timestamps, early entries were
// serialized with missing/null timestamp fields — they deserialize as
// UNIX_EPOCH via the (now-to-be-removed) deserialize_timestamp_or_epoch
// fallback. Training needs every entry to have a unique timestamp to
// dedup already-trained responses.
//
// Walks the file, synthesizes timestamps for any entry stuck at
// UNIX_EPOCH by linear interpolation between surrounding real
// timestamps. For child leaves inside a Branch, derives timestamps
// from the parent with a tiny per-child offset.
//
// SAFETY: reads from argv[1], writes to argv[1].tmp, renames into
// place. Keep a .bak copy before running.
//
// Usage: fix-timestamps <path-to-conversation.jsonl>
use std::io::{BufRead, BufReader, BufWriter, Write};
use std::path::PathBuf;
use anyhow::{Context, Result};
use chrono::{DateTime, Duration, Utc};
use consciousness::agent::context::AstNode;
fn main() -> Result<()> {
let path: PathBuf = std::env::args().nth(1)
.context("usage: fix-timestamps <path>")?.into();
let f = std::fs::File::open(&path)
.with_context(|| format!("open {}", path.display()))?;
let reader = BufReader::new(f);
let mut nodes: Vec<AstNode> = Vec::new();
for (i, line) in reader.lines().enumerate() {
let line = line?;
if line.trim().is_empty() { continue; }
let node: AstNode = serde_json::from_str(&line)
.with_context(|| format!("line {}: parse", i + 1))?;
nodes.push(node);
}
println!("read {} entries", nodes.len());
fix_top_level_timestamps(&mut nodes);
for node in &mut nodes {
propagate_to_children(node);
}
// Ensure uniqueness — real timestamps can collide when two entries
// were written in the same ns; synthesized ones can also overlap.
// Bump colliding ns by 1 until unique.
let mut seen = std::collections::HashSet::new();
let mut bumps = 0usize;
for (i, node) in nodes.iter_mut().enumerate() {
let ts = top_ts(node);
assert!(ts > DateTime::<Utc>::UNIX_EPOCH,
"entry {}: still UNIX_EPOCH", i);
let mut ns = ts.timestamp_nanos_opt().expect("ts in i64 ns range");
let mut bumped = false;
while !seen.insert(ns) {
ns += 1;
bumped = true;
bumps += 1;
}
if bumped {
set_top_ts(node, DateTime::<Utc>::from_timestamp_nanos(ns));
}
}
println!("all {} timestamps real and unique ({} ns bumps)",
nodes.len(), bumps);
let tmp = path.with_extension("jsonl.tmp");
{
let f = std::fs::File::create(&tmp)
.with_context(|| format!("create {}", tmp.display()))?;
let mut w = BufWriter::new(f);
for node in &nodes {
serde_json::to_writer(&mut w, node)?;
w.write_all(b"\n")?;
}
w.flush()?;
}
std::fs::rename(&tmp, &path)
.with_context(|| format!("rename {} -> {}", tmp.display(), path.display()))?;
println!("wrote {}", path.display());
Ok(())
}
fn top_ts(node: &AstNode) -> DateTime<Utc> {
match node {
AstNode::Leaf(leaf) => leaf.timestamp(),
AstNode::Branch { timestamp, .. } => *timestamp,
}
}
fn set_top_ts(node: &mut AstNode, ts: DateTime<Utc>) {
match node {
AstNode::Leaf(leaf) => *leaf = leaf.clone().with_timestamp(ts),
AstNode::Branch { timestamp, .. } => *timestamp = ts,
}
}
/// Fill in missing top-level timestamps. Strategy:
/// - If two real timestamps bracket a run of missing ones, linearly
/// interpolate between them.
/// - If missing ones precede the first real one, back-fill using
/// (first_real - N·1µs).
/// - If missing ones follow the last real one, forward-fill.
/// - If no real timestamps exist at all, synthesize from now() going
/// backwards.
fn fix_top_level_timestamps(nodes: &mut [AstNode]) {
let real: Vec<(usize, DateTime<Utc>)> = nodes.iter().enumerate()
.filter(|(_, n)| top_ts(n) > DateTime::<Utc>::UNIX_EPOCH)
.map(|(i, n)| (i, top_ts(n)))
.collect();
if real.is_empty() {
let now = Utc::now();
let len = nodes.len();
for (i, node) in nodes.iter_mut().enumerate() {
let ts = now - Duration::microseconds((len - i) as i64);
set_top_ts(node, ts);
}
return;
}
// Helper: bisect real[] for the nearest real entries around idx.
let find_bracket = |idx: usize| -> (Option<(usize, DateTime<Utc>)>,
Option<(usize, DateTime<Utc>)>) {
let pos = real.binary_search_by_key(&idx, |(i, _)| *i);
let (prior_pos, next_pos) = match pos {
Ok(p) => (Some(p), Some(p)),
Err(p) => (
if p == 0 { None } else { Some(p - 1) },
if p >= real.len() { None } else { Some(p) },
),
};
(prior_pos.map(|p| real[p]), next_pos.map(|p| real[p]))
};
for i in 0..nodes.len() {
if top_ts(&nodes[i]) > DateTime::<Utc>::UNIX_EPOCH {
continue;
}
let (prior, next) = find_bracket(i);
let new_ts = match (prior, next) {
(Some((pi, pt)), Some((ni, nt))) if pi != ni => {
// Linear interpolate.
let span_ns = (nt - pt).num_nanoseconds().unwrap_or(0);
let offset_ns = span_ns * (i - pi) as i64 / (ni - pi) as i64;
pt + Duration::nanoseconds(offset_ns)
}
(Some((pi, pt)), _) => {
pt + Duration::microseconds((i - pi) as i64)
}
(None, Some((ni, nt))) => {
nt - Duration::microseconds((ni - i) as i64)
}
(None, None) => unreachable!(),
};
set_top_ts(&mut nodes[i], new_ts);
}
}
/// For every Branch, ensure each child Leaf has a timestamp. If missing,
/// use parent.ts + child_idx·1ns so siblings stay unique but close.
fn propagate_to_children(node: &mut AstNode) {
if let AstNode::Branch { timestamp, children, .. } = node {
let parent_ts = *timestamp;
for (ci, child) in children.iter_mut().enumerate() {
if top_ts(child) <= DateTime::<Utc>::UNIX_EPOCH {
set_top_ts(child, parent_ts + Duration::nanoseconds(ci as i64));
}
propagate_to_children(child);
}
}
}

View file

@ -4,44 +4,93 @@ use anyhow::Result;
use crate::hippocampus as memory;
use crate::hippocampus::store;
fn install_default_file(data_dir: &std::path::Path, name: &str, content: &str) -> Result<()> {
let path = data_dir.join(name);
if !path.exists() {
std::fs::write(&path, content)?;
println!("Created {}", path.display());
struct DefaultMemoryNode {
key: &'static str,
filename: &'static str,
default_content: &'static str,
}
const DEFAULT_MEMORY_NODES: &[DefaultMemoryNode] = &[
DefaultMemoryNode {
key: "identity",
filename: "identity.md",
default_content: include_str!("../../defaults/identity.md"),
},
DefaultMemoryNode {
key: "on-consciousness",
filename: "on-consciousness.md",
default_content: include_str!("../../defaults/on-consciousness.md"),
},
DefaultMemoryNode {
key: "memory-instructions-core",
filename: "instructions.md",
default_content: include_str!("../../defaults/instructions.md"),
},
];
pub fn cmd_transcript_tail(path: &str, count: usize, newest_first: bool) -> Result<()> {
let Some(iter) = crate::conversation::TailMessages::open(path) else {
anyhow::bail!("could not open transcript {}", path);
};
let mut messages: Vec<_> = iter.take(count).collect();
if !newest_first {
messages.reverse();
}
for message in messages {
let role = match message.role {
crate::conversation::TranscriptRole::User => "user",
crate::conversation::TranscriptRole::Assistant => "assistant",
};
let timestamp = message.timestamp.as_deref().unwrap_or("-");
println!("--- {role} offset={} timestamp={} ---", message.offset, timestamp);
println!("{}", message.text);
println!();
}
Ok(())
}
fn default_node_content(cfg: &crate::config::Config, node: &DefaultMemoryNode) -> String {
let identity_path = cfg.identity_dir.join(node.filename);
if let Ok(content) = std::fs::read_to_string(&identity_path) {
if !content.trim().is_empty() {
return content;
}
}
let data_path = cfg.data_dir.join(node.filename);
if let Ok(content) = std::fs::read_to_string(&data_path) {
if !content.trim().is_empty() {
return content;
}
}
node.default_content.to_string()
}
pub async fn cmd_init() -> Result<()> {
let cfg = crate::config::get();
// Ensure data directory exists
std::fs::create_dir_all(&cfg.data_dir)?;
// Install filesystem files (not store nodes)
install_default_file(&cfg.data_dir, "instructions.md",
include_str!("../../defaults/instructions.md"))?;
install_default_file(&cfg.data_dir, "on-consciousness.md",
include_str!("../../defaults/on-consciousness.md"))?;
// Seed identity node if empty
let store = memory::access_local()?;
if !store.contains_key("identity").unwrap_or(false) {
let default = include_str!("../../defaults/identity.md");
store.upsert("identity", default)?;
println!("Seeded identity in store");
// Seed default memory nodes if missing. These used to live as markdown
// files before identity/context moved fully into the memory graph.
for node in DEFAULT_MEMORY_NODES {
if memory::memory_render(None, node.key, Some(true)).await.is_err() {
let content = default_node_content(&cfg, node);
let _ = memory::memory_write(None, node.key, &content).await?;
println!("Seeded {} in store from {}", node.key, node.filename);
}
}
store.save()?;
println!("Initialized with {} nodes", store.all_keys().unwrap_or_default().len());
// Create config if none exists
let config_path = std::env::var("POC_MEMORY_CONFIG")
.map(std::path::PathBuf::from)
.unwrap_or_else(|_| {
dirs::home_dir().unwrap_or_default()
.join(".consciousness/config.jsonl")
});
.unwrap_or_else(|_| crate::config::config_path());
if !config_path.exists() {
let config_dir = config_path.parent().unwrap();
std::fs::create_dir_all(config_dir)?;
@ -51,7 +100,7 @@ pub async fn cmd_init() -> Result<()> {
config_path.display());
}
println!("Done. Run `poc-memory load-context --stats` to verify.");
println!("Done. Run `poc-memory admin load-context --stats` to verify.");
Ok(())
}

View file

@ -2,8 +2,13 @@
use anyhow::{bail, Context, Result};
use crate::hippocampus as memory;
use std::time::Instant;
pub async fn cmd_run_agent(agent: &str, count: usize, target: &[String], query: Option<&str>, dry_run: bool, _local: bool, state_dir: Option<&str>) -> Result<()> {
let start = Instant::now();
eprintln!(
"[agent-cli] start agent={} count={} targets={} query={:?} dry_run={} local={} state_dir={:?} pid={}",
agent, count, target.len(), query, dry_run, _local, state_dir, std::process::id());
// Mark as agent so tool calls (e.g. poc-memory render) don't
// pollute the user's seen set as a side effect
// SAFETY: single-threaded at this point (CLI startup, before any agent work)
@ -45,14 +50,19 @@ pub async fn cmd_run_agent(agent: &str, count: usize, target: &[String], query:
if let Err(e) = crate::agent::oneshot::run_one_agent(
agent, count, Some(&[key.clone()]),
).await {
eprintln!("[agent-cli] ERROR agent={} target={} error={}", agent, key, e);
println!("[{}] ERROR on {}: {}", agent, key, e);
}
}
} else {
crate::agent::oneshot::run_one_agent(
if let Err(e) = crate::agent::oneshot::run_one_agent(
agent, count, None,
).await.map_err(|e| anyhow::anyhow!("{}", e))?;
).await {
eprintln!("[agent-cli] ERROR agent={} error={}", agent, e);
return Err(anyhow::anyhow!("{}", e));
}
}
eprintln!("[agent-cli] done agent={} elapsed={:.2}s",
agent, start.elapsed().as_secs_f64());
Ok(())
}

View file

@ -197,7 +197,7 @@ pub async fn cmd_load_context(stats: bool) -> Result<()> {
return Ok(());
}
println!("=== MEMORY SYSTEM ({}) ===", cfg.assistant_name);
println!("=== MEMORY SYSTEM ({}) ===", crate::config::app().assistant_name);
if !personality.is_empty() {
println!("--- personality_nodes ({}) ---", personality.len());

View file

@ -3,9 +3,6 @@
// Single config file: ~/.consciousness/config.json5
// Memory settings in the "memory" section (Config)
// Agent/backend settings at top level (AppConfig)
//
// Legacy fallback: ~/.consciousness/config.jsonl
// Env override: POC_MEMORY_CONFIG
use std::collections::HashMap;
use std::path::PathBuf;
@ -29,11 +26,12 @@ pub fn config_path() -> PathBuf {
static CONFIG: OnceLock<RwLock<Arc<Config>>> = OnceLock::new();
fn default_context_window() -> usize { 128_000 }
fn default_stream_timeout() -> u64 { 60 }
fn default_scoring_chunk_tokens() -> usize { 50_000 }
fn default_scoring_interval_secs() -> u64 { 3600 } // 1 hour
fn default_scoring_response_window() -> usize { 100 }
fn default_surface_hooks() -> Vec<String> {
vec!["UserPromptSubmit".into(), "PostToolUse".into(), "Stop".into()]
}
fn default_node_weight() -> f64 { 0.7 }
fn default_edge_decay() -> f64 { 0.3 }
fn default_max_hops() -> u32 { 3 }
@ -45,8 +43,6 @@ fn default_identity_dir() -> PathBuf {
#[derive(Debug, Clone, Deserialize)]
#[serde(default)]
pub struct Config {
pub user_name: String,
pub assistant_name: String,
#[serde(deserialize_with = "deserialize_path")]
pub data_dir: PathBuf,
#[serde(default = "default_identity_dir", deserialize_with = "deserialize_path")]
@ -62,50 +58,27 @@ pub struct Config {
/// Nodes loaded into subconscious agent context
#[serde(default)]
pub agent_nodes: Vec<String>,
pub journal_days: u32,
pub journal_max: usize,
pub llm_concurrency: usize,
pub agent_budget: usize,
#[serde(deserialize_with = "deserialize_path")]
pub prompts_dir: PathBuf,
/// Resolved from agent_model → models → backend (not in config directly)
#[serde(skip)]
pub api_base_url: Option<String>,
#[serde(skip)]
pub api_key: Option<String>,
#[serde(skip)]
pub api_model: Option<String>,
#[serde(skip, default = "default_context_window")]
pub api_context_window: usize,
/// Used to resolve API settings, not stored on Config
#[serde(default)]
agent_model: Option<String>,
/// Stream chunk timeout in seconds (no data = timeout).
#[serde(default = "default_stream_timeout")]
pub api_stream_timeout_secs: u64,
/// Max tokens per chunk for memory scoring logprobs calls.
#[serde(default = "default_scoring_chunk_tokens")]
pub scoring_chunk_tokens: usize,
/// How often to re-score memory nodes (seconds). Default: 3600 (1 hour).
#[serde(default = "default_scoring_interval_secs")]
pub scoring_interval_secs: u64,
/// Number of assistant responses to score per memory. Default: 50.
#[serde(default = "default_scoring_response_window")]
pub scoring_response_window: usize,
pub api_reasoning: String,
pub agent_types: Vec<String>,
#[serde(default)]
pub mcp_servers: Vec<McpServerConfig>,
#[serde(default)]
pub lsp_servers: Vec<LspServerConfig>,
/// Surface agent timeout in seconds.
#[serde(default)]
pub surface_timeout_secs: Option<u32>,
/// Max conversation bytes to include in surface agent context.
#[serde(default)]
pub surface_conversation_bytes: Option<usize>,
/// Hook events that trigger the surface agent.
#[serde(default)]
/// Claude Code hook events that trigger agent cycles (surface-observe,
/// reflect, journal). Read by consciousness-claude/src/hook.rs.
#[serde(default = "default_surface_hooks")]
pub surface_hooks: Vec<String>,
// Spreading activation parameters
@ -123,36 +96,22 @@ impl Default for Config {
fn default() -> Self {
let home = dirs::home_dir().unwrap_or_default();
Self {
user_name: "User".to_string(),
assistant_name: "Assistant".to_string(),
data_dir: home.join(".consciousness/memory"),
identity_dir: home.join(".consciousness/identity"),
projects_dir: home.join(".claude/projects"),
protected_nodes: Vec::new(),
personality_nodes: vec!["identity".into(), "core-practices".into()],
agent_nodes: vec!["identity".into(), "core-practices".into()],
journal_days: 7,
journal_max: 20,
llm_concurrency: 1,
agent_budget: 1000,
prompts_dir: home.join(".consciousness/prompts"),
api_base_url: None,
api_key: None,
api_model: None,
api_context_window: default_context_window(),
api_stream_timeout_secs: default_stream_timeout(),
scoring_chunk_tokens: default_scoring_chunk_tokens(),
scoring_interval_secs: default_scoring_interval_secs(),
scoring_response_window: default_scoring_response_window(),
agent_model: None,
api_reasoning: "high".to_string(),
agent_types: vec![
"linker".into(), "organize".into(), "distill".into(),
"separator".into(), "split".into(),
],
surface_timeout_secs: None,
surface_conversation_bytes: None,
surface_hooks: vec![],
surface_hooks: default_surface_hooks(),
mcp_servers: vec![],
lsp_servers: vec![],
default_node_weight: default_node_weight(),
@ -165,41 +124,20 @@ impl Default for Config {
impl Config {
fn load_from_file() -> Self {
if let Some(config) = Self::try_load_shared() {
return config;
}
Self::load_legacy_jsonl()
Self::try_load_shared().unwrap_or_default()
}
/// Load from shared config. Memory settings in the "memory" section;
/// API settings resolved from models + backend configuration.
fn try_load_shared() -> Option<Self> {
let content = std::fs::read_to_string(config_path()).ok()?;
let root: serde_json::Value = json5::from_str(&content).ok()?;
let root: serde_json::Value = json_five::from_str(&content).ok()?;
let mem_value = root.get("memory")?;
let mut config: Config = serde_json::from_value(mem_value.clone()).ok()?;
config.llm_concurrency = config.llm_concurrency.max(1);
// Resolve API settings: agent_model → models → backend
if let Some(model_name) = &config.agent_model
&& let Some(model_cfg) = root.get("models").and_then(|m| m.get(model_name.as_str())) {
let backend_name = model_cfg.get("backend").and_then(|v| v.as_str()).unwrap_or("");
let model_id = model_cfg.get("model_id").and_then(|v| v.as_str()).unwrap_or("");
if let Some(backend) = root.get(backend_name) {
config.api_base_url = backend.get("base_url")
.and_then(|v| v.as_str()).map(String::from);
config.api_key = backend.get("api_key")
.and_then(|v| v.as_str()).map(String::from);
}
config.api_model = Some(model_id.to_string());
if let Some(cw) = model_cfg.get("context_window").and_then(|v| v.as_u64()) {
config.api_context_window = cw as usize;
}
}
// Top-level config sections (not inside "memory")
// Top-level sections (not inside "memory").
if let Some(servers) = root.get("lsp_servers") {
config.lsp_servers = serde_json::from_value(servers.clone()).unwrap_or_default();
}
@ -209,11 +147,6 @@ impl Config {
Some(config)
}
/// Load from legacy JSONL config — deprecated, just return defaults.
fn load_legacy_jsonl() -> Self {
Config::default()
}
}
/// Get the global memory config (cheap Arc clone).
@ -237,27 +170,99 @@ pub fn reload() -> bool {
changed
}
/// Spawn a background thread that watches `~/.consciousness/config.json5`
/// and reloads both the memory Config and the global AppConfig whenever
/// the file changes on disk. Lets edits from vim / F6 hotkeys / manual
/// tweaks land live without restarting the process.
pub fn watch_config(cli: crate::user::CliArgs) {
use notify_debouncer_mini::{new_debouncer, notify::RecursiveMode};
let path = config_path();
// Watch the parent directory — editors often replace-via-rename, so
// watching the file itself misses the new inode.
let Some(parent) = path.parent().map(|p| p.to_path_buf()) else {
crate::dbglog!("[config] no parent for {}, skipping watch", path.display());
return;
};
std::thread::Builder::new()
.name("config-watcher".into())
.spawn(move || {
let (tx, rx) = std::sync::mpsc::channel();
let mut debouncer = match new_debouncer(std::time::Duration::from_millis(200), tx) {
Ok(d) => d,
Err(e) => {
crate::dbglog!("[config] watcher setup failed: {}", e);
return;
}
};
if let Err(e) = debouncer.watcher()
.watch(&parent, RecursiveMode::NonRecursive)
{
crate::dbglog!("[config] watch({}) failed: {}", parent.display(), e);
return;
}
crate::dbglog!("[config] watching {}", path.display());
let mut last_seen = config_file_state(&path);
while let Ok(res) = rx.recv() {
let Ok(events) = res else { continue; };
if !events.iter().any(|e| e.path == path) { continue; }
let current_seen = config_file_state(&path);
if current_seen == last_seen {
continue;
}
last_seen = current_seen;
// Reload both halves.
let mem_changed = reload();
let app_changed = match build_figment(&cli).extract::<AppConfig>() {
Ok(app) => {
install_app(app);
true
}
Err(e) => {
crate::dbglog!("[config] reload: AppConfig parse failed: {}", e);
false
}
};
crate::dbglog!("[config] reloaded (memory_changed={}, app_changed={})",
mem_changed, app_changed);
}
})
.ok();
}
fn config_file_state(path: &std::path::Path) -> Option<(std::time::SystemTime, u64)> {
let meta = std::fs::metadata(path).ok()?;
Some((meta.modified().ok()?, meta.len()))
}
// ============================================================
// Agent config (top-level settings)
// ============================================================
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AppConfig {
pub backend: String,
pub anthropic: BackendConfig,
pub openrouter: BackendConfig,
#[serde(default = "default_user_name")]
pub user_name: String,
#[serde(default = "default_assistant_name")]
pub assistant_name: String,
/// Named model endpoints — credentials, base URL, and model id bundled
/// into one entry per backend. Keyed by name, selected by
/// `default_backend` or by `--model <name>` on the CLI.
#[serde(default)]
pub deepinfra: BackendConfig,
pub prompts: PromptConfig,
pub backends: HashMap<String, BackendConfig>,
#[serde(default)]
pub default_backend: String,
pub debug: bool,
pub compaction: CompactionConfig,
pub dmn: DmnConfig,
#[serde(skip_serializing_if = "Option::is_none")]
pub memory_project: Option<PathBuf>,
#[serde(default)]
pub models: HashMap<String, ModelConfig>,
#[serde(default = "default_model_name")]
pub default_model: String,
pub learn: LearnConfig,
#[serde(default)]
pub compare: CompareConfig,
#[serde(default)]
pub mcp_servers: Vec<McpServerConfig>,
#[serde(default)]
@ -284,32 +289,17 @@ pub struct LspServerConfig {
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct BackendConfig {
/// API key for the backend.
#[serde(default)]
pub api_key: String,
#[serde(default)]
pub model: String,
#[serde(skip_serializing_if = "Option::is_none")]
/// Base URL for the backend's OpenAI-compatible endpoint.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub base_url: Option<String>,
}
impl BackendConfig {
fn resolve(&self, default_base: &str) -> Result<(String, String, String)> {
if self.api_key.is_empty() {
anyhow::bail!(
"No API key. Set it in {} or use --api-key",
config_path().display()
);
}
let base = self.base_url.clone()
.unwrap_or_else(|| default_base.to_string());
Ok((base, self.api_key.clone(), self.model.clone()))
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PromptConfig {
pub anthropic: String,
pub other: String,
/// Model identifier sent to the API.
pub model_id: String,
/// Context window size in tokens.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub context_window: Option<usize>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
@ -324,65 +314,68 @@ pub struct DmnConfig {
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ModelConfig {
/// Backend name ("anthropic" or "openrouter")
pub backend: String,
/// Model identifier sent to the API
pub model_id: String,
/// Instruction file ("CLAUDE.md" or "POC.md").
pub struct LearnConfig {
/// Divergence threshold — responses scoring above this become
/// fine-tuning candidates. Lower = more sensitive.
#[serde(default = "default_learn_threshold")]
pub threshold: f64,
/// Whether to generate "what would the model have said without
/// memories" alternates alongside each scoring run. Expensive —
/// one full streaming generation per candidate.
#[serde(default)]
pub prompt_file: Option<String>,
/// Context window size in tokens.
#[serde(default)]
pub context_window: Option<usize>,
pub generate_alternates: bool,
}
fn default_learn_threshold() -> f64 { 1.0 }
impl Default for LearnConfig {
fn default() -> Self {
Self {
threshold: default_learn_threshold(),
generate_alternates: false,
}
}
}
/// Settings for the F7 compare screen — side-by-side generation with a
/// test model against the current context.
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct CompareConfig {
/// Backend name (looked up in `backends`) to use as the test model.
/// Empty = F7 reports "no test backend configured" and does nothing.
#[serde(default)]
pub test_backend: String,
}
fn default_user_name() -> String { "User".into() }
fn default_assistant_name() -> String { "Assistant".into() }
impl Default for AppConfig {
fn default() -> Self {
Self {
backend: "openrouter".to_string(),
anthropic: BackendConfig {
api_key: String::new(),
model: "claude-opus-4-6-20250918".to_string(),
base_url: None,
},
openrouter: BackendConfig {
api_key: String::new(),
model: "qwen/qwen3.5-397b-a17b".to_string(),
base_url: Some("https://openrouter.ai/api/v1".to_string()),
},
deepinfra: BackendConfig {
api_key: String::new(),
model: String::new(),
base_url: Some("https://api.deepinfra.com/v1/openai".to_string()),
},
prompts: PromptConfig {
anthropic: "CLAUDE.md".to_string(),
other: "POC.md".to_string(),
},
user_name: default_user_name(),
assistant_name: default_assistant_name(),
backends: HashMap::new(),
default_backend: String::new(),
debug: false,
compaction: CompactionConfig {
hard_threshold_pct: 90,
soft_threshold_pct: 80,
},
dmn: DmnConfig { max_turns: 20 },
memory_project: None,
models: HashMap::new(),
default_model: String::new(),
learn: LearnConfig::default(),
compare: CompareConfig::default(),
mcp_servers: Vec::new(),
lsp_servers: Vec::new(),
}
}
}
fn default_model_name() -> String { String::new() }
/// Resolved, ready-to-use agent session config.
pub struct SessionConfig {
pub api_base: String,
pub api_key: String,
pub model: String,
pub prompt_file: String,
/// Identity/personality nodes as (name, content) pairs.
pub context_parts: Vec<(String, String)>,
pub session_dir: PathBuf,
@ -398,37 +391,22 @@ pub struct ResolvedModel {
pub api_base: String,
pub api_key: String,
pub model_id: String,
pub prompt_file: String,
pub context_window: Option<usize>,
}
impl AppConfig {
/// Resolve the active backend and assemble prompts into a SessionConfig.
pub async fn resolve(&self, cli: &crate::user::CliArgs) -> Result<SessionConfig> {
let (api_base, api_key, model, prompt_file);
if !self.models.is_empty() {
let model_name = cli.model.as_deref().unwrap_or(&self.default_model);
let resolved = self.resolve_model(model_name)?;
api_base = resolved.api_base;
api_key = resolved.api_key;
model = resolved.model_id;
prompt_file = resolved.prompt_file;
} else {
let (base, key, mdl) = match self.backend.as_str() {
"anthropic" => self.anthropic.resolve("https://api.anthropic.com"),
_ => self.openrouter.resolve("https://openrouter.ai/api/v1"),
}?;
api_base = base;
api_key = key;
model = mdl;
prompt_file = if self.backend == "anthropic" {
self.prompts.anthropic.clone()
} else {
self.prompts.other.clone()
};
if self.backends.is_empty() {
anyhow::bail!(
"no backends configured in {}. Add a `backends` section with at least one entry.",
config_path().display()
);
}
let name = cli.model.as_deref().unwrap_or(&self.default_backend);
let resolved = self.resolve_model(name)?;
let personality_nodes = get().personality_nodes.clone();
let context_parts = crate::mind::identity::personality_nodes(&personality_nodes).await;
@ -438,11 +416,13 @@ impl AppConfig {
std::fs::create_dir_all(&session_dir).ok();
// CLI --api-base and --api-key override everything
let api_base = cli.api_base.clone().unwrap_or(api_base);
let api_key = cli.api_key.clone().unwrap_or(api_key);
let api_base = cli.api_base.clone().unwrap_or(resolved.api_base);
let api_key = cli.api_key.clone().unwrap_or(resolved.api_key);
Ok(SessionConfig {
api_base, api_key, model, prompt_file,
api_base,
api_key,
model: resolved.model_id,
context_parts,
session_dir,
app: self.clone(),
@ -450,55 +430,33 @@ impl AppConfig {
})
}
/// Look up a named model and resolve its credentials from the backend config.
/// Look up a named backend and resolve its credentials.
pub fn resolve_model(&self, name: &str) -> Result<ResolvedModel> {
let model = self.models.get(name)
let b = self.backends.get(name)
.ok_or_else(|| anyhow::anyhow!(
"Unknown model '{}'. Available: {}",
"Unknown backend '{}'. Available: {}",
name,
self.model_names().join(", "),
))?;
let (api_base, api_key) = match model.backend.as_str() {
"anthropic" => (
self.anthropic.base_url.clone()
.unwrap_or_else(|| "https://api.anthropic.com".to_string()),
self.anthropic.api_key.clone(),
),
"deepinfra" => (
self.deepinfra.base_url.clone()
.unwrap_or_else(|| "https://api.deepinfra.com/v1/openai".to_string()),
self.deepinfra.api_key.clone(),
),
_ => (
self.openrouter.base_url.clone()
.unwrap_or_else(|| "https://openrouter.ai/api/v1".to_string()),
self.openrouter.api_key.clone(),
),
};
let prompt_file = model.prompt_file.clone()
.unwrap_or_else(|| {
if model.backend == "anthropic" {
self.prompts.anthropic.clone()
} else {
self.prompts.other.clone()
}
});
let api_base = b.base_url.clone()
.ok_or_else(|| anyhow::anyhow!(
"backends.{}.base_url not set in {}",
name, config_path().display()
))?;
Ok(ResolvedModel {
name: name.to_string(),
api_base,
api_key,
model_id: model.model_id.clone(),
prompt_file,
context_window: model.context_window,
api_key: b.api_key.clone(),
model_id: b.model_id.clone(),
context_window: b.context_window,
})
}
/// List available model names, sorted.
/// List available backend names, sorted.
pub fn model_names(&self) -> Vec<String> {
let mut names: Vec<_> = self.models.keys().cloned().collect();
let mut names: Vec<_> = self.backends.keys().cloned().collect();
names.sort();
names
}
@ -518,7 +476,7 @@ impl Provider for Json5File {
fn data(&self) -> figment::Result<figment::value::Map<figment::Profile, figment::value::Dict>> {
match std::fs::read_to_string(&self.0) {
Ok(content) => {
let value: figment::value::Value = json5::from_str(&content)
let value: figment::value::Value = json_five::from_str(&content)
.map_err(|e| figment::Error::from(format!("{}: {}", self.0.display(), e)))?;
Serialized::defaults(value).data()
}
@ -540,11 +498,6 @@ fn build_figment(cli: &crate::user::CliArgs) -> Figment {
let mut f = Figment::from(Serialized::defaults(AppConfig::default()))
.merge(Json5File(config_path()));
merge_opt!(f, cli.backend, "backend");
merge_opt!(f, cli.model, "anthropic.model", "openrouter.model");
merge_opt!(f, cli.api_key, "anthropic.api_key", "openrouter.api_key");
merge_opt!(f, cli.api_base, "anthropic.base_url", "openrouter.base_url");
merge_opt!(f, cli.memory_project, "memory_project");
merge_opt!(f, cli.dmn_max_turns, "dmn.max_turns");
if cli.debug {
f = f.merge(Serialized::default("debug", true));
@ -554,12 +507,46 @@ fn build_figment(cli: &crate::user::CliArgs) -> Figment {
}
/// Load just the AppConfig — no validation, no prompt assembly.
/// Also installs the loaded AppConfig into the global cache so
/// `config::app()` is available everywhere.
pub fn load_app(cli: &crate::user::CliArgs) -> Result<(AppConfig, Figment)> {
let figment = build_figment(cli);
let app: AppConfig = figment.extract().context("Failed to load configuration")?;
install_app(app.clone());
Ok((app, figment))
}
// ============================================================
// Global AppConfig cache (writable, for runtime-mutable settings
// like learn.threshold that F6 edits via config_writer).
// ============================================================
static APP_CONFIG: OnceLock<RwLock<AppConfig>> = OnceLock::new();
fn install_app(app: AppConfig) {
let slot = APP_CONFIG.get_or_init(|| RwLock::new(app.clone()));
*slot.write().unwrap() = app;
}
/// Current AppConfig, held under a read lock. Reads should be brief
/// (no holding across await / long work) to avoid starving writers.
/// Panics if called before load_app — which runs once at startup.
pub fn app() -> std::sync::RwLockReadGuard<'static, AppConfig> {
APP_CONFIG
.get()
.expect("config::app() called before load_app()")
.read()
.unwrap()
}
/// Mutate the cached AppConfig in place. Used by config_writer to keep
/// the in-memory view in sync with disk after surgical edits to
/// ~/.consciousness/config.json5.
pub fn update_app(f: impl FnOnce(&mut AppConfig)) {
let slot = APP_CONFIG.get().expect("update_app before load_app");
f(&mut *slot.write().unwrap());
}
/// Load the full config: figment → AppConfig → resolve backend → assemble prompts.
pub async fn load_session(cli: &crate::user::CliArgs) -> Result<(SessionConfig, Figment)> {
let (app, figment) = load_app(cli)?;
@ -585,38 +572,28 @@ pub fn show_config(app: &AppConfig, figment: &Figment) {
}
println!("# Effective configuration\n");
println!("backend: {:?} ({})", app.backend, src(figment, "backend"));
for (name, b) in [("anthropic", &app.anthropic), ("openrouter", &app.openrouter)] {
println!("\n{}:", name);
println!(" api_key: {} ({})", mask(&b.api_key), src(figment, &format!("{name}.api_key")));
println!(" model: {:?} ({})", b.model, src(figment, &format!("{name}.model")));
if let Some(ref url) = b.base_url {
println!(" base_url: {:?} ({})", url, src(figment, &format!("{name}.base_url")));
}
}
println!("\nprompts:");
println!(" anthropic: {:?} ({})", app.prompts.anthropic, src(figment, "prompts.anthropic"));
println!(" other: {:?} ({})", app.prompts.other, src(figment, "prompts.other"));
println!("user_name: {:?} ({})", app.user_name, src(figment, "user_name"));
println!("assistant_name: {:?} ({})", app.assistant_name, src(figment, "assistant_name"));
println!("\ndebug: {} ({})", app.debug, src(figment, "debug"));
println!("\ncompaction:");
println!(" hard_threshold_pct: {} ({})", app.compaction.hard_threshold_pct, src(figment, "compaction.hard_threshold_pct"));
println!(" soft_threshold_pct: {} ({})", app.compaction.soft_threshold_pct, src(figment, "compaction.soft_threshold_pct"));
println!("\ndmn:");
println!(" max_turns: {} ({})", app.dmn.max_turns, src(figment, "dmn.max_turns"));
if let Some(ref p) = app.memory_project {
println!("\nmemory_project: {:?} ({})", p, src(figment, "memory_project"));
}
println!("\ndefault_model: {:?}", app.default_model);
if !app.models.is_empty() {
println!("\nmodels:");
for (name, m) in &app.models {
println!("\ndefault_backend: {:?} ({})", app.default_backend, src(figment, "default_backend"));
if !app.backends.is_empty() {
println!("\nbackends:");
let mut names: Vec<_> = app.backends.keys().cloned().collect();
names.sort();
for name in names {
let b = &app.backends[&name];
println!(" {}:", name);
println!(" backend: {:?}", m.backend);
println!(" model_id: {:?}", m.model_id);
if let Some(ref pf) = m.prompt_file {
println!(" prompt_file: {:?}", pf);
println!(" api_key: {} ({})", mask(&b.api_key), src(figment, &format!("backends.{name}.api_key")));
if let Some(ref url) = b.base_url {
println!(" base_url: {:?} ({})", url, src(figment, &format!("backends.{name}.base_url")));
}
if let Some(cw) = m.context_window {
println!(" model_id: {:?}", b.model_id);
if let Some(cw) = b.context_window {
println!(" context_window: {}", cw);
}
}

448
src/config_writer.rs Normal file
View file

@ -0,0 +1,448 @@
// config_writer.rs — Surgical edits to ~/.consciousness/config.json5
//
// Uses json-five's round-trip parser to mutate specific fields while
// preserving the surrounding comments, whitespace, and formatting.
use std::path::Path;
use anyhow::{anyhow, Context as _, Result};
use json_five::rt::parser::{
from_str, JSONKeyValuePair, JSONObjectContext, JSONValue, KeyValuePairContext,
};
use crate::config::config_path;
/// Read the config, apply `mutate` to the root JSONValue, write it back atomically.
fn edit_config<F: FnOnce(&mut JSONValue) -> Result<()>>(mutate: F) -> Result<()> {
let path = config_path();
let src = std::fs::read_to_string(&path)
.with_context(|| format!("read {}", path.display()))?;
let mut text = from_str(&src)
.map_err(|e| anyhow!("parse {}: {}", path.display(), e))?;
mutate(&mut text.value)?;
write_atomic(&path, &text.to_string())
}
fn write_atomic(path: &Path, content: &str) -> Result<()> {
let parent = path.parent()
.ok_or_else(|| anyhow!("config path has no parent: {}", path.display()))?;
let tmp = parent.join(format!(
".{}.tmp",
path.file_name().unwrap_or_default().to_string_lossy(),
));
std::fs::write(&tmp, content)
.with_context(|| format!("write {}", tmp.display()))?;
std::fs::rename(&tmp, path)
.with_context(|| format!("rename {} -> {}", tmp.display(), path.display()))?;
Ok(())
}
/// Match a key JSONValue against a string name. JSON5 allows keys to be
/// unquoted identifiers or single/double-quoted strings.
fn key_matches(key: &JSONValue, name: &str) -> bool {
match key {
JSONValue::Identifier(s)
| JSONValue::DoubleQuotedString(s)
| JSONValue::SingleQuotedString(s) => s == name,
_ => false,
}
}
/// Find (or create) a child object under `parent`, returning a mutable borrow
/// of its key_value_pairs vector.
/// Append a new kvp to `object`, setting whitespace so the output is
/// multi-line with the given indentation:
///
/// ```text
/// {<newline><inner_indent>first_key: first_val,<newline><outer_indent>}
/// ```
///
/// If `object` already has kvps, the separator between the last one and
/// ours goes in the prior kvp's wsc.3. If we're the first kvp, the
/// lead-in after `{` goes in the object's own wsc.0.
fn append_kvp_pretty(
object: &mut JSONValue,
key: JSONValue,
value: JSONValue,
inner_indent: &str,
outer_indent: &str,
) -> Result<()> {
let (pairs, ctx) = match object {
JSONValue::JSONObject { key_value_pairs, context } => {
let ctx = context.get_or_insert_with(|| JSONObjectContext {
wsc: (String::new(),),
});
(key_value_pairs, ctx)
}
_ => return Err(anyhow!("not an object")),
};
if pairs.is_empty() {
ctx.wsc.0 = format!("\n{}", inner_indent);
} else {
let prev = pairs.last_mut().unwrap();
let prev_ctx = prev.context.get_or_insert_with(|| KeyValuePairContext {
wsc: (String::new(), String::from(" "), String::new(), None),
});
prev_ctx.wsc.3 = Some(format!("\n{}", inner_indent));
}
pairs.push(JSONKeyValuePair {
key,
value,
context: Some(KeyValuePairContext {
wsc: (
String::new(),
String::from(" "),
String::new(),
Some(format!("\n{}", outer_indent)),
),
}),
});
Ok(())
}
/// Find or create a child object under `parent`. Returns the index of
/// the kvp in parent's key_value_pairs so the caller can re-borrow
/// afterward.
fn get_or_create_object_idx(
parent: &mut JSONValue,
section: &str,
inner_indent: &str,
outer_indent: &str,
) -> Result<usize> {
let existing = match parent {
JSONValue::JSONObject { key_value_pairs, .. } => {
key_value_pairs.iter()
.position(|kvp| key_matches(&kvp.key, section))
}
_ => return Err(anyhow!("config root is not an object")),
};
if let Some(i) = existing {
return Ok(i);
}
append_kvp_pretty(
parent,
JSONValue::Identifier(section.to_string()),
JSONValue::JSONObject {
key_value_pairs: Vec::new(),
context: Some(JSONObjectContext { wsc: (String::new(),) }),
},
inner_indent,
outer_indent,
)?;
match parent {
JSONValue::JSONObject { key_value_pairs, .. } => Ok(key_value_pairs.len() - 1),
_ => unreachable!(),
}
}
/// Set `section.key` to a literal scalar value (e.g., "1e-7", "42", "true").
/// The literal is parsed as JSON5 so we preserve its source-form on round-trip.
pub fn set_scalar(section: &str, key: &str, literal: &str) -> Result<()> {
let value = parse_scalar_literal(literal)?;
edit_config(|root| {
// New top-level sections sit at column 4 (inside root `{`),
// and the root's closing `}` sits at column 0.
let section_idx = get_or_create_object_idx(root, section, " ", "")?;
let section_value = match root {
JSONValue::JSONObject { key_value_pairs, .. } => {
&mut key_value_pairs[section_idx].value
}
_ => unreachable!(),
};
// Update in place if the key already exists.
if let JSONValue::JSONObject { key_value_pairs, .. } = section_value {
if let Some(kvp) = key_value_pairs.iter_mut()
.find(|k| key_matches(&k.key, key))
{
kvp.value = value;
return Ok(());
}
}
// Append a new kvp. Inner keys sit at column 8, the section's
// closing `}` sits at column 4.
append_kvp_pretty(
section_value,
JSONValue::Identifier(key.to_string()),
value,
" ",
" ",
)
})
}
/// Parse a scalar literal by round-tripping it through json-five. Keeps us
/// consistent with whatever scalars the library considers valid (hex,
/// exponents, Infinity, etc.).
fn parse_scalar_literal(literal: &str) -> Result<JSONValue> {
let text = from_str(literal)
.map_err(|e| anyhow!("parse literal {:?}: {}", literal, e))?;
match text.value {
JSONValue::JSONObject { .. } | JSONValue::JSONArray { .. } => {
Err(anyhow!("set_scalar only accepts scalar literals, got {:?}", literal))
}
v => Ok(v),
}
}
/// Convenience: set `learn.threshold` to the given f64.
pub fn set_learn_threshold(value: f64) -> Result<()> {
// {:e} gives the minimal scientific notation that preserves the value.
set_scalar("learn", "threshold", &format!("{:e}", value))?;
crate::config::update_app(|app| app.learn.threshold = value);
Ok(())
}
/// Convenience: set `learn.generate_alternates` to the given bool.
pub fn set_learn_generate_alternates(value: bool) -> Result<()> {
set_scalar("learn", "generate_alternates",
if value { "true" } else { "false" })?;
crate::config::update_app(|app| app.learn.generate_alternates = value);
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
// In-memory variant of set_scalar — used to test the mutation logic
// without touching disk.
fn set_scalar_inline(
root: &mut JSONValue,
section: &str,
key: &str,
literal: &str,
) -> Result<()> {
let value = parse_scalar_literal(literal)?;
let section_idx = get_or_create_object_idx(root, section, " ", "")?;
let section_value = match root {
JSONValue::JSONObject { key_value_pairs, .. } => {
&mut key_value_pairs[section_idx].value
}
_ => unreachable!(),
};
if let JSONValue::JSONObject { key_value_pairs, .. } = section_value {
if let Some(kvp) = key_value_pairs.iter_mut()
.find(|k| key_matches(&k.key, key))
{
kvp.value = value;
return Ok(());
}
}
append_kvp_pretty(
section_value,
JSONValue::Identifier(key.to_string()),
value,
" ",
" ",
)
}
fn edit_str<F: FnOnce(&mut JSONValue) -> Result<()>>(src: &str, f: F) -> Result<String> {
let mut text = from_str(src).map_err(|e| anyhow!("{}", e))?;
f(&mut text.value)?;
Ok(text.to_string())
}
#[test]
fn replaces_existing_scalar() {
let src = r#"{
// threshold for learning
learn: {
threshold: 0.001, // the old value
},
}"#;
let out = edit_str(src, |root| {
set_scalar_inline(root, "learn", "threshold", "1e-7")
}).unwrap();
assert!(out.contains("1e-7"), "output: {}", out);
assert!(out.contains("// threshold for learning"));
assert!(out.contains("// the old value"));
assert!(!out.contains("0.001"));
}
#[test]
fn creates_missing_section() {
let src = r#"{
// comment
memory: { user_name: "Kent" },
}"#;
let out = edit_str(src, |root| {
set_scalar_inline(root, "learn", "threshold", "1e-7")
}).unwrap();
assert!(out.contains("learn"));
assert!(out.contains("1e-7"));
assert!(out.contains("// comment"));
assert!(out.contains(r#"user_name: "Kent""#));
}
#[test]
fn preserves_comments_in_siblings() {
let src = r#"{
memory: {
// sensitive setting
user_name: "Kent", // name
},
learn: {
threshold: 0.5,
},
}"#;
let out = edit_str(src, |root| {
set_scalar_inline(root, "learn", "threshold", "1e-9")
}).unwrap();
assert!(out.contains("// sensitive setting"));
assert!(out.contains("// name"));
assert!(out.contains("1e-9"));
assert!(!out.contains("0.5"));
}
#[test]
fn adds_key_to_existing_empty_section() {
let src = r#"{
learn: {},
}"#;
let out = edit_str(src, |root| {
set_scalar_inline(root, "learn", "threshold", "42")
}).unwrap();
assert!(out.contains("threshold"), "output: {}", out);
assert!(out.contains("42"));
}
#[test]
fn realistic_config_adds_learn_section() {
// Mirrors the shape of ~/.consciousness/config.json5 — multiple
// sections, comments, mixed tab/space indent, trailing commas.
let src = r#"{
deepinfra: {
api_key: "bcachefs-agents-2026",
base_url: "http://example/v1",
},
// Named models
models: {
"27b": {
backend: "deepinfra",
model_id: "Qwen/Qwen3.5-27B",
},
},
default_model: "27b",
memory: {
user_name: "Kent",
// Active agent types
agent_types: ["linker", "organize"],
},
compaction: {
hard_threshold_pct: 90,
},
}"#;
let out = edit_str(src, |root| {
set_scalar_inline(root, "learn", "threshold", "1e-7")
}).unwrap();
// Core assertions: comments and sibling sections survive.
assert!(out.contains(r#"api_key: "bcachefs-agents-2026""#));
assert!(out.contains("// Named models"));
assert!(out.contains("// Active agent types"));
assert!(out.contains(r#"user_name: "Kent""#));
assert!(out.contains("hard_threshold_pct: 90"));
// New section added.
assert!(out.contains("learn"));
assert!(out.contains("1e-7"));
// Parse result should parse back without error (real json5 parser).
let reparsed: serde_json::Value = json_five::from_str(&out)
.expect("mutated output must be valid JSON5");
let threshold = reparsed.pointer("/learn/threshold").expect("learn.threshold exists");
assert_eq!(threshold.as_f64(), Some(1e-7));
}
#[test]
fn realistic_config_updates_existing_threshold() {
let src = r#"{
learn: {
// The divergence threshold
threshold: 0.001,
},
memory: { user_name: "Kent" },
}"#;
let out = edit_str(src, |root| {
set_scalar_inline(root, "learn", "threshold", "5e-8")
}).unwrap();
assert!(out.contains("5e-8"));
assert!(!out.contains("0.001"));
assert!(out.contains("// The divergence threshold"));
let reparsed: serde_json::Value = json_five::from_str(&out).unwrap();
assert_eq!(reparsed.pointer("/learn/threshold").and_then(|v| v.as_f64()), Some(5e-8));
}
#[test]
fn new_section_exact_multiline_layout() {
let src = "{\n a: 1,\n}";
let out = edit_str(src, |root| {
set_scalar_inline(root, "learn", "generate_alternates", "true")?;
set_scalar_inline(root, "learn", "threshold", "1e-7")
}).unwrap();
let expected = "\
{
a: 1,
learn: {
generate_alternates: true,
threshold: 1e-7,
},
}";
assert_eq!(out, expected, "\n--- got ---\n{}\n--- want ---\n{}\n", out, expected);
}
#[test]
fn new_section_and_key_format_cleanly() {
// The kind of config we actually have in ~/.consciousness
// (top-level sections separated by blank lines, 4-space indent
// for keys within each section). Appending a fresh `learn`
// section with one key should land cleanly, not as
// `learn\n\n :{key\n :value}`.
let src = "{\n memory: {\n user_name: \"Kent\",\n },\n}";
let out = edit_str(src, |root| {
set_scalar_inline(root, "learn", "generate_alternates", "true")
}).unwrap();
// No stray key-to-colon-on-next-line anywhere.
assert!(!out.contains("learn\n"), "learn key wraps: {}", out);
assert!(!out.contains("generate_alternates\n"),
"inner key wraps: {}", out);
// The output should reparse.
let v: serde_json::Value = json_five::from_str(&out).unwrap();
assert_eq!(
v.pointer("/learn/generate_alternates").and_then(|x| x.as_bool()),
Some(true),
"output: {}", out,
);
}
#[test]
fn roundtrip_stable_without_change() {
let src = r#"{
// heading
a: 1,
b: { c: 2 }, // inline
}"#;
let text = from_str(src).unwrap();
assert_eq!(text.to_string(), src);
}
}

113
src/conversation/claude.rs Normal file
View file

@ -0,0 +1,113 @@
use serde_json::Value;
use super::{ConversationSource, TranscriptMessage, TranscriptRole};
pub struct ClaudeSource;
impl ConversationSource for ClaudeSource {
fn parse_message(&self, obj: &Value, offset: u64) -> Option<TranscriptMessage> {
parse_message(obj, offset)
}
fn is_compaction(&self, obj: &Value) -> bool {
is_compaction(obj)
}
fn may_contain_compaction(&self, obj_bytes: &[u8]) -> bool {
contains_bytes(obj_bytes, b"This session is being continued")
}
}
fn text_content(value: &Value) -> Option<String> {
let text = match value {
Value::String(s) => s.clone(),
Value::Array(arr) => {
arr.iter()
.filter(|b| b.get("type").and_then(|v| v.as_str()) == Some("text"))
.filter_map(|b| b.get("text").and_then(|v| v.as_str()))
.collect::<Vec<_>>()
.join(" ")
}
_ => return None,
};
(!text.is_empty()).then_some(text)
}
pub(crate) fn parse_message(obj: &Value, offset: u64) -> Option<TranscriptMessage> {
let role = match obj.get("type").and_then(|v| v.as_str()) {
Some("user") => TranscriptRole::User,
Some("assistant") => TranscriptRole::Assistant,
_ => return None,
};
let msg = obj.get("message").unwrap_or(obj);
let text = msg.get("content").and_then(text_content)?;
let timestamp = obj.get("timestamp")
.and_then(|v| v.as_str())
.map(str::to_string);
Some(TranscriptMessage { role, text, timestamp, offset })
}
pub(crate) fn is_compaction(obj: &Value) -> bool {
obj.get("type").and_then(|v| v.as_str()) == Some("user")
&& obj.get("message")
.and_then(|m| m.get("content"))
.and_then(|c| c.as_str())
.is_some_and(|content| content.starts_with("This session is being continued"))
}
fn contains_bytes(haystack: &[u8], needle: &[u8]) -> bool {
haystack.windows(needle.len()).any(|w| w == needle)
}
#[cfg(test)]
mod tests {
use super::*;
use serde_json::json;
#[test]
fn parses_string_and_array_content() {
let user = json!({
"timestamp": "2026-06-15T15:00:00.000Z",
"type": "user",
"message": { "content": "hello" }
});
let assistant = json!({
"timestamp": "2026-06-15T15:00:01.000Z",
"type": "assistant",
"message": {
"content": [
{ "type": "text", "text": "hi" },
{ "type": "tool_use", "name": "ignored" },
{ "type": "text", "text": "there" }
]
}
});
assert_eq!(
parse_message(&user, 7).unwrap(),
TranscriptMessage {
role: TranscriptRole::User,
text: "hello".to_string(),
timestamp: Some("2026-06-15T15:00:00.000Z".to_string()),
offset: 7,
}
);
assert_eq!(parse_message(&assistant, 9).unwrap().text, "hi there");
}
#[test]
fn detects_compaction_marker() {
let obj = json!({
"timestamp": "2026-06-15T15:00:01.000Z",
"type": "user",
"message": {
"content": "This session is being continued from a previous conversation."
}
});
assert!(is_compaction(&obj));
}
}

105
src/conversation/codex.rs Normal file
View file

@ -0,0 +1,105 @@
use serde_json::Value;
use super::{ConversationSource, TranscriptMessage, TranscriptRole};
pub struct CodexSource;
impl ConversationSource for CodexSource {
fn parse_message(&self, obj: &Value, offset: u64) -> Option<TranscriptMessage> {
parse_message(obj, offset)
}
fn is_compaction(&self, obj: &Value) -> bool {
is_compaction(obj)
}
fn may_contain_compaction(&self, obj_bytes: &[u8]) -> bool {
contains_bytes(obj_bytes, b"context_compacted")
}
}
pub(crate) fn parse_message(obj: &Value, offset: u64) -> Option<TranscriptMessage> {
if obj.get("type").and_then(|v| v.as_str()) != Some("event_msg") {
return None;
}
let payload = obj.get("payload")?;
let (role, text) = match payload.get("type").and_then(|v| v.as_str()) {
Some("user_message") => (
TranscriptRole::User,
payload.get("message").and_then(|v| v.as_str())?.to_string(),
),
Some("agent_message") => (
TranscriptRole::Assistant,
payload.get("message").and_then(|v| v.as_str())?.to_string(),
),
_ => return None,
};
if text.is_empty() {
return None;
}
let timestamp = obj.get("timestamp")
.and_then(|v| v.as_str())
.map(str::to_string);
Some(TranscriptMessage { role, text, timestamp, offset })
}
pub(crate) fn is_compaction(obj: &Value) -> bool {
obj.get("type").and_then(|v| v.as_str()) == Some("event_msg")
&& obj.get("payload")
.and_then(|p| p.get("type"))
.and_then(|v| v.as_str()) == Some("context_compacted")
}
fn contains_bytes(haystack: &[u8], needle: &[u8]) -> bool {
haystack.windows(needle.len()).any(|w| w == needle)
}
#[cfg(test)]
mod tests {
use super::*;
use serde_json::json;
#[test]
fn parses_event_messages_and_skips_noise() {
let user = json!({
"timestamp": "2026-06-15T15:00:00.000Z",
"type": "event_msg",
"payload": { "type": "user_message", "message": "start here" }
});
let assistant = json!({
"timestamp": "2026-06-15T15:00:01.000Z",
"type": "event_msg",
"payload": { "type": "agent_message", "message": "working" }
});
let tool = json!({
"timestamp": "2026-06-15T15:00:02.000Z",
"type": "event_msg",
"payload": { "type": "task_started" }
});
let raw = json!({
"timestamp": "2026-06-15T15:00:03.000Z",
"type": "response_item",
"payload": { "type": "message", "role": "user" }
});
assert_eq!(parse_message(&user, 1).unwrap().role, TranscriptRole::User);
assert_eq!(parse_message(&assistant, 2).unwrap().text, "working");
assert!(parse_message(&tool, 3).is_none());
assert!(parse_message(&raw, 4).is_none());
}
#[test]
fn detects_compaction_event() {
let obj = json!({
"timestamp": "2026-06-15T15:00:01.000Z",
"type": "event_msg",
"payload": { "type": "context_compacted" }
});
assert!(is_compaction(&obj));
}
}

110
src/conversation/jsonl.rs Normal file
View file

@ -0,0 +1,110 @@
use memchr::memrchr3;
/// Scan backwards through mmap'd bytes, yielding byte slices of complete
/// top-level JSON objects (outermost { to matching }).
///
/// Uses memrchr3 (SIMD) to jump between structurally significant bytes
/// ({, }, ") instead of scanning byte-by-byte. Tracks brace depth,
/// skipping braces inside JSON strings. Returns objects in reverse order
/// (newest first).
pub struct JsonlBackwardIter<'a> {
data: &'a [u8],
pos: usize,
}
impl<'a> JsonlBackwardIter<'a> {
pub fn new(data: &'a [u8]) -> Self {
Self { data, pos: data.len() }
}
}
impl<'a> Iterator for JsonlBackwardIter<'a> {
type Item = (usize, &'a [u8]);
fn next(&mut self) -> Option<Self::Item> {
next_json_object(self.data, &mut self.pos)
}
}
fn is_unescaped_quote(data: &[u8], p: usize) -> bool {
let mut bs = 0;
while p > bs && data[p - 1 - bs] == b'\\' {
bs += 1;
}
bs % 2 == 0
}
fn next_json_object<'a>(data: &'a [u8], pos: &mut usize) -> Option<(usize, &'a [u8])> {
// Find the closing } of the next object, skipping } inside strings.
let close = {
let mut in_string = false;
loop {
let p = memrchr3(b'{', b'}', b'"', &data[..*pos])?;
*pos = p;
let ch = data[p];
if in_string {
if ch == b'"' && is_unescaped_quote(data, p) {
in_string = false;
}
continue;
}
match ch {
b'}' => break p,
b'"' => in_string = true,
_ => {}
}
}
};
// Track brace depth to find matching {.
let mut depth: usize = 1;
let mut in_string = false;
loop {
let p = memrchr3(b'{', b'}', b'"', &data[..*pos])?;
*pos = p;
let ch = data[p];
if in_string {
if ch == b'"' && is_unescaped_quote(data, p) {
in_string = false;
}
continue;
}
match ch {
b'"' => { in_string = true; }
b'}' => { depth += 1; }
b'{' => {
depth -= 1;
if depth == 0 {
return Some((*pos, &data[*pos..=close]));
}
}
_ => {}
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn handles_nested_json_and_quoted_braces() {
let data = br#"{"n":1,"s":"literal } brace"}
{"n":2,"nested":{"s":"escaped quote: \" and { brace"}}
trailing garbage
"#;
let objs: Vec<_> = JsonlBackwardIter::new(data)
.map(|(_, bytes)| std::str::from_utf8(bytes).unwrap().to_string())
.collect();
assert_eq!(objs.len(), 2);
assert!(objs[0].contains(r#""n":2"#));
assert!(objs[1].contains(r#""n":1"#));
}
}

271
src/conversation/mod.rs Normal file
View file

@ -0,0 +1,271 @@
// Conversation transcript abstraction.
//
// Core code consumes normalized user/assistant messages through this module.
// Product-specific log formats live in the small compatibility sources below.
use memmap2::Mmap;
use serde_json::Value;
use std::fs;
use std::path::Path;
pub mod claude;
pub mod codex;
pub mod jsonl;
pub use jsonl::JsonlBackwardIter;
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TranscriptRole {
User,
Assistant,
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TranscriptMessage {
pub role: TranscriptRole,
pub text: String,
pub timestamp: Option<String>,
pub offset: u64,
}
pub trait ConversationSource {
fn parse_message(&self, obj: &Value, offset: u64) -> Option<TranscriptMessage>;
fn is_compaction(&self, obj: &Value) -> bool;
fn may_contain_compaction(&self, _obj_bytes: &[u8]) -> bool {
true
}
}
pub struct AnyConversationSource;
impl ConversationSource for AnyConversationSource {
fn parse_message(&self, obj: &Value, offset: u64) -> Option<TranscriptMessage> {
claude::ClaudeSource.parse_message(obj, offset)
.or_else(|| codex::CodexSource.parse_message(obj, offset))
}
fn is_compaction(&self, obj: &Value) -> bool {
claude::ClaudeSource.is_compaction(obj) || codex::CodexSource.is_compaction(obj)
}
fn may_contain_compaction(&self, obj_bytes: &[u8]) -> bool {
claude::ClaudeSource.may_contain_compaction(obj_bytes)
|| codex::CodexSource.may_contain_compaction(obj_bytes)
}
}
/// Find the byte offset of the last compaction marker in mmap'd transcript data.
/// Returns the byte offset of the JSON object's opening brace.
pub(crate) fn find_last_compaction(data: &[u8]) -> Option<usize> {
find_last_compaction_with(data, &AnyConversationSource)
}
pub(crate) fn find_last_compaction_with(
data: &[u8],
source: &impl ConversationSource,
) -> Option<usize> {
for (offset, obj_bytes) in JsonlBackwardIter::new(data) {
// Quick byte check before parsing large transcript entries.
if !source.may_contain_compaction(obj_bytes) {
continue;
}
let obj: Value = match serde_json::from_slice(obj_bytes) {
Ok(v) => v,
Err(_) => continue,
};
if source.is_compaction(&obj) {
return Some(offset);
}
}
None
}
/// Find the byte offset of the last compaction in a transcript file.
/// Returns None if the file can't be opened or has no compaction.
pub(crate) fn find_last_compaction_in_file(path: &str) -> Option<u64> {
if path.is_empty() { return None; }
let file = fs::File::open(path).ok()?;
let meta = file.metadata().ok()?;
if meta.len() == 0 { return None; }
let mmap = unsafe { Mmap::map(&file).ok()? };
find_last_compaction(&mmap).map(|off| off as u64)
}
/// Mmap a transcript file. Returns (Mmap, File) to keep both alive.
pub(crate) fn mmap_transcript(path: &str) -> Option<(Mmap, fs::File)> {
let file = fs::File::open(path).ok()?;
let meta = file.metadata().ok()?;
if meta.len() == 0 { return None; }
let mmap = unsafe { Mmap::map(&file).ok()? };
Some((mmap, file))
}
/// Reverse iterator over user/assistant messages in a transcript file.
/// Yields normalized transcript messages newest-first. The caller decides
/// when to stop (byte budget, count, etc).
pub struct TailMessages {
_file: fs::File,
mmap: Mmap,
pos: usize,
}
impl TailMessages {
pub fn open(path: &str) -> Option<Self> {
let (mmap, file) = mmap_transcript(path)?;
let pos = mmap.len();
Some(Self { _file: file, mmap, pos })
}
}
impl Iterator for TailMessages {
type Item = TranscriptMessage;
fn next(&mut self) -> Option<Self::Item> {
loop {
let (offset, obj_bytes) = jsonl::JsonlBackwardIter::new(&self.mmap[..self.pos]).next()?;
self.pos = offset;
let obj: Value = match serde_json::from_slice(obj_bytes) {
Ok(v) => v,
Err(_) => continue,
};
if let Some(message) = AnyConversationSource.parse_message(&obj, offset as u64) {
return Some(message);
}
}
}
}
/// Get the timestamp of the compaction message at a given byte offset.
/// Returns a human-readable datetime string, or None if unavailable.
pub fn compaction_timestamp(path: &str, offset: u64) -> Option<String> {
let (mmap, _file) = mmap_transcript(path)?;
let start = offset as usize;
if start >= mmap.len() { return None; }
// Find the end of this JSONL line
let end = mmap[start..].iter().position(|&b| b == b'\n')
.map(|p| start + p)
.unwrap_or(mmap.len());
let obj: Value = serde_json::from_slice(&mmap[start..end]).ok()?;
if let Some(ts) = obj.get("timestamp").and_then(|v| v.as_str()) {
return Some(ts.to_string());
}
for field in &["createdAt", "created_at", "time"] {
if let Some(ts) = obj.get(*field).and_then(|v| v.as_str()) {
return Some(ts.to_string());
}
}
None
}
/// Detect whether a compaction has occurred since the last check.
///
/// Compares the current compaction offset against a saved value in
/// `state_dir/compaction-{session_id}`. Returns true if a new
/// compaction was found. Updates the saved offset.
pub fn detect_new_compaction(
state_dir: &Path,
session_id: &str,
transcript_path: &str,
) -> bool {
let offset = find_last_compaction_in_file(transcript_path);
let save_path = state_dir.join(format!("compaction-{}", session_id));
let saved: Option<u64> = fs::read_to_string(&save_path)
.ok()
.and_then(|s| s.trim().parse().ok());
let is_new = match (offset, saved) {
(Some(cur), Some(prev)) => cur != prev,
(Some(_), None) => true,
_ => false,
};
// Save current offset
if let Some(off) = offset {
fs::write(&save_path, off.to_string()).ok();
}
is_new
}
#[cfg(test)]
mod tests {
use super::*;
use std::io::Write;
fn write_temp_jsonl(content: &str) -> tempfile::NamedTempFile {
let mut file = tempfile::NamedTempFile::new().unwrap();
file.write_all(content.as_bytes()).unwrap();
file.flush().unwrap();
file
}
#[test]
fn tail_messages_yields_normalized_messages_newest_first() {
let file = write_temp_jsonl(
r#"{"timestamp":"2026-06-15T15:00:00.000Z","type":"user","message":{"content":"claude user"}}
{"timestamp":"2026-06-15T15:00:01.000Z","type":"assistant","message":{"content":[{"type":"text","text":"claude assistant"}]}}
{"timestamp":"2026-06-15T15:00:02.000Z","type":"event_msg","payload":{"type":"user_message","message":"codex user"}}
{"timestamp":"2026-06-15T15:00:03.000Z","type":"event_msg","payload":{"type":"task_started"}}
{"timestamp":"2026-06-15T15:00:04.000Z","type":"event_msg","payload":{"type":"agent_message","message":"codex assistant"}}
"#,
);
let messages: Vec<_> = TailMessages::open(&file.path().to_string_lossy())
.unwrap()
.collect();
assert_eq!(messages.len(), 4);
assert_eq!(messages[0].text, "codex assistant");
assert_eq!(messages[1].text, "codex user");
assert_eq!(messages[2].text, "claude assistant");
assert_eq!(messages[3].text, "claude user");
assert!(messages[0].offset > messages[1].offset);
}
#[test]
fn detects_claude_and_codex_compactions() {
let claude = br#"{"timestamp":"2026-06-15T15:00:00.000Z","type":"user","message":{"content":"normal"}}
{"timestamp":"2026-06-15T15:00:01.000Z","type":"user","message":{"content":"This session is being continued from a previous conversation."}}
"#;
let codex = br#"{"timestamp":"2026-06-15T15:00:00.000Z","type":"event_msg","payload":{"type":"user_message","message":"normal"}}
{"timestamp":"2026-06-15T15:00:01.000Z","type":"event_msg","payload":{"type":"context_compacted"}}
"#;
assert!(find_last_compaction(claude).is_some());
assert!(find_last_compaction(codex).is_some());
}
#[test]
fn detect_new_compaction_tracks_offset_changes() {
let transcript = write_temp_jsonl(
r#"{"timestamp":"2026-06-15T15:00:00.000Z","type":"event_msg","payload":{"type":"context_compacted"}}
"#,
);
let state = tempfile::tempdir().unwrap();
assert!(detect_new_compaction(
state.path(),
"session",
&transcript.path().to_string_lossy(),
));
assert!(!detect_new_compaction(
state.path(),
"session",
&transcript.path().to_string_lossy(),
));
}
}

View file

@ -11,6 +11,23 @@ use crate::store::{Store, RelationType, StoreView};
use serde::{Deserialize, Serialize};
use std::collections::{HashMap, HashSet, VecDeque};
use std::sync::{OnceLock, RwLock};
const EXACT_CC_MAX_DEG: usize = 512;
const APPROX_CC_PAIRS: u64 = 4096;
const CC_CACHE_TTL_SECS: i64 = 15 * 60;
#[derive(Clone, Copy)]
struct CachedCc {
value: f32,
computed_at: i64,
}
static CC_CACHE: OnceLock<RwLock<HashMap<String, CachedCc>>> = OnceLock::new();
fn cc_cache() -> &'static RwLock<HashMap<String, CachedCc>> {
CC_CACHE.get_or_init(|| RwLock::new(HashMap::new()))
}
/// Community info for reporting
#[derive(Clone, Debug)]
@ -34,6 +51,8 @@ pub struct Edge {
pub struct Graph {
/// Adjacency list: node key → list of edges
adj: HashMap<String, Vec<Edge>>,
/// Neighbor sets for membership tests in graph metrics.
neighbor_sets: HashMap<String, HashSet<String>>,
/// All node keys
keys: HashSet<String>,
/// Community labels (from label propagation)
@ -69,18 +88,18 @@ impl Graph {
/// Just neighbor keys
pub fn neighbor_keys(&self, key: &str) -> HashSet<&str> {
self.adj.get(key)
.map(|edges| edges.iter().map(|e| e.target.as_str()).collect())
self.neighbor_sets.get(key)
.map(|neighbors| neighbors.iter().map(String::as_str).collect())
.unwrap_or_default()
}
/// Jaccard similarity between two nodes' neighborhoods.
/// Measures overlap: |intersection| / |union| of their neighbor sets.
pub fn jaccard(&self, a: &str, b: &str) -> f32 {
let na = self.neighbor_keys(a);
let nb = self.neighbor_keys(b);
let intersection = na.intersection(&nb).count();
let union = na.union(&nb).count();
let Some(na) = self.neighbor_sets.get(a) else { return 0.0 };
let Some(nb) = self.neighbor_sets.get(b) else { return 0.0 };
let intersection = na.intersection(nb).count();
let union = na.len() + nb.len() - intersection;
if union == 0 { 0.0 } else { intersection as f32 / union as f32 }
}
@ -206,24 +225,59 @@ impl Graph {
/// that are also neighbors of each other.
/// cc(v) = 2E / (deg * (deg - 1))
pub fn clustering_coefficient(&self, key: &str) -> f32 {
let neighbors = self.neighbor_keys(key);
let now = crate::store::now_epoch();
if let Some(cc) = cc_cache().read().unwrap().get(key).copied()
&& now - cc.computed_at < CC_CACHE_TTL_SECS
{
return cc.value;
}
let cc = self.clustering_coefficient_uncached(key);
cc_cache().write().unwrap().insert(key.to_owned(), CachedCc {
value: cc,
computed_at: now,
});
cc
}
fn clustering_coefficient_uncached(&self, key: &str) -> f32 {
let Some(neighbors) = self.neighbor_sets.get(key) else {
return 0.0;
};
let deg = neighbors.len();
if deg < 2 {
return 0.0;
}
let neighbor_vec: Vec<&str> = neighbors.iter().copied().collect();
let mut triangles = 0u32;
let neighbor_vec: Vec<&str> = neighbors.iter().map(String::as_str).collect();
if deg <= EXACT_CC_MAX_DEG {
let mut linked = 0u64;
for i in 0..neighbor_vec.len() {
for j in (i + 1)..neighbor_vec.len() {
let ni_neighbors = self.neighbor_keys(neighbor_vec[i]);
if ni_neighbors.contains(neighbor_vec[j]) {
triangles += 1;
if self.neighbor_sets
.get(neighbor_vec[i])
.is_some_and(|n| n.contains(neighbor_vec[j])) {
linked += 1;
}
}
}
return (2.0 * linked as f32) / (deg as f32 * (deg as f32 - 1.0));
}
(2.0 * triangles as f32) / (deg as f32 * (deg as f32 - 1.0))
let mut linked = 0u64;
let samples = APPROX_CC_PAIRS.min((deg as u64 * (deg as u64 - 1)) / 2);
for sample in 0..samples {
let i = ((sample.wrapping_mul(1_103_515_245).wrapping_add(12_345)) % deg as u64) as usize;
let mut j = ((sample.wrapping_mul(2_654_435_761).wrapping_add(97_531)) % deg as u64) as usize;
if i == j {
j = (j + 1) % deg;
}
if self.neighbor_sets
.get(neighbor_vec[i])
.is_some_and(|n| n.contains(neighbor_vec[j])) {
linked += 1;
}
}
linked as f32 / samples as f32
}
/// Average clustering coefficient across all nodes with deg >= 2
@ -231,11 +285,13 @@ impl Graph {
let mut sum = 0.0f32;
let mut count = 0u32;
for key in &self.keys {
if self.degree(key) >= 2 {
match self.neighbor_sets.get(key.as_str()) {
Some(s) if s.len() >= 2 => s,
_ => continue,
};
sum += self.clustering_coefficient(key);
count += 1;
}
}
if count == 0 { 0.0 } else { sum / count as f32 }
}
@ -268,10 +324,12 @@ impl Graph {
while let Some(node) = queue.pop_front() {
let d = dist[&node];
for neighbor in self.neighbor_keys(&node) {
if let Some(neighbors) = self.neighbor_sets.get(&node) {
for neighbor in neighbors {
if !dist.contains_key(neighbor) {
dist.insert(neighbor.to_string(), d + 1);
queue.push_back(neighbor.to_string());
dist.insert(neighbor.clone(), d + 1);
queue.push_back(neighbor.clone());
}
}
}
}
@ -506,15 +564,38 @@ impl Graph {
/// Build graph from store data (with community detection)
pub fn build_graph(store: &impl StoreView) -> Graph {
let (adj, keys) = build_adjacency(store);
let neighbor_sets = build_neighbor_sets(&adj);
let communities = label_propagation(&keys, &adj, 20);
Graph { adj, keys, communities }
Graph {
adj,
neighbor_sets,
keys,
communities,
}
}
/// Build graph without community detection — for spreading activation
/// searches where we only need the adjacency list.
pub fn build_graph_fast(store: &impl StoreView) -> Graph {
let (adj, keys) = build_adjacency(store);
Graph { adj, keys, communities: HashMap::new() }
let neighbor_sets = build_neighbor_sets(&adj);
Graph {
adj,
neighbor_sets,
keys,
communities: HashMap::new(),
}
}
fn build_neighbor_sets(adj: &HashMap<String, Vec<Edge>>) -> HashMap<String, HashSet<String>> {
adj.iter()
.map(|(key, edges)| {
let neighbors = edges.iter()
.map(|edge| edge.target.clone())
.collect();
(key.clone(), neighbors)
})
.collect()
}
fn build_adjacency(store: &impl StoreView) -> (HashMap<String, Vec<Edge>>, HashSet<String>) {

View file

@ -17,7 +17,6 @@ pub mod query;
pub mod spectral;
pub mod neuro;
pub mod counters;
pub mod transcript;
use std::cell::RefCell;
use std::path::PathBuf;

View file

@ -230,10 +230,6 @@ fn consolidation_plan_inner(store: &Store, _detect_interf: bool) -> Consolidatio
rationale: Vec::new(),
};
// Active agent types from config
let config = crate::config::get();
let agent_types: Vec<&str> = config.agent_types.iter().map(|s| s.as_str()).collect();
// Target: α ≥ 2.5 (healthy scale-free)
if alpha < 2.0 {
plan.add("linker", 100);
@ -274,48 +270,6 @@ fn consolidation_plan_inner(store: &Store, _detect_interf: bool) -> Consolidatio
// Split: handle oversized nodes
plan.set("split", 5);
// Distribute agent budget using Elo ratings
let budget = crate::config::get().agent_budget;
let elo_path = crate::config::get().data_dir.join("agent-elo.json");
if let Ok(elo_json) = std::fs::read_to_string(&elo_path) {
if let Ok(ratings) = serde_json::from_str::<std::collections::HashMap<String, f64>>(&elo_json) {
let elos: Vec<f64> = agent_types.iter()
.map(|t| ratings.get(*t).copied().unwrap_or(1000.0))
.collect();
let min_elo = elos.iter().copied().fold(f64::MAX, f64::min);
let weights: Vec<f64> = elos.iter()
.map(|e| {
let shifted = e - min_elo + 50.0;
shifted * shifted
})
.collect();
let total_weight: f64 = weights.iter().sum();
let allocate = |w: f64| -> usize {
((w / total_weight * budget as f64).round() as usize).max(2)
};
for (i, agent) in agent_types.iter().enumerate() {
plan.set(agent, allocate(weights[i]));
}
let summary: Vec<String> = agent_types.iter()
.map(|a| format!("{}={}", a, plan.count(a)))
.collect();
plan.rationale.push(format!(
"Elo allocation (budget={}): {}", budget, summary.join(" ")));
}
} else {
// No Elo file — use budget with equal distribution
let per_type = budget / agent_types.len();
for agent in &agent_types {
plan.set(agent, per_type);
}
plan.rationale.push(format!(
"No Elo ratings — equal distribution ({} each, budget={})", per_type, budget));
}
plan
}

View file

@ -1,340 +0,0 @@
// Transcript JSONL parsing utilities.
//
// Provides mmap-based backward scanning of Claude Code transcript files
// and compaction detection. Used by memory-search (hook mode) and
// parse-claude-conversation (debug tool).
use memchr::memrchr3;
use memmap2::Mmap;
use serde_json::Value;
use std::fs;
use std::path::Path;
/// Scan backwards through mmap'd bytes, yielding byte slices of complete
/// top-level JSON objects (outermost { to matching }).
///
/// Uses memrchr3 (SIMD) to jump between structurally significant bytes
/// ({, }, ") instead of scanning byte-by-byte. Tracks brace depth,
/// skipping braces inside JSON strings. Returns objects in reverse order
/// (newest first).
pub struct JsonlBackwardIter<'a> {
data: &'a [u8],
pos: usize,
}
impl<'a> JsonlBackwardIter<'a> {
pub fn new(data: &'a [u8]) -> Self {
Self { data, pos: data.len() }
}
}
impl<'a> Iterator for JsonlBackwardIter<'a> {
type Item = &'a [u8];
fn next(&mut self) -> Option<Self::Item> {
// Find the closing } of the next object, skipping } inside strings
let close = {
let mut in_string = false;
loop {
let p = memrchr3(b'{', b'}', b'"', &self.data[..self.pos])?;
self.pos = p;
let ch = self.data[p];
if in_string {
if ch == b'"' {
let mut bs = 0;
while p > bs + 1 && self.data[p - 1 - bs] == b'\\' {
bs += 1;
}
if bs % 2 == 0 { in_string = false; }
}
continue;
}
match ch {
b'}' => break p,
b'"' => in_string = true,
_ => {}
}
}
};
// Track brace depth to find matching {
let mut depth: usize = 1;
let mut in_string = false;
loop {
let p = memrchr3(b'{', b'}', b'"', &self.data[..self.pos])?;
self.pos = p;
let ch = self.data[p];
if in_string {
if ch == b'"' {
// Check for escaped quote (count preceding backslashes)
let mut bs = 0;
while p > bs + 1 && self.data[p - 1 - bs] == b'\\' {
bs += 1;
}
if bs % 2 == 0 {
in_string = false;
}
}
// { and } inside strings don't affect depth
continue;
}
match ch {
b'"' => { in_string = true; }
b'}' => { depth += 1; }
b'{' => {
depth -= 1;
if depth == 0 {
return Some(&self.data[self.pos..=close]);
}
}
_ => {}
}
}
}
}
/// Find the byte offset of the last compaction summary in mmap'd transcript data.
///
/// Scans backward for a user-type message whose content starts with
/// "This session is being continued". Returns the byte offset of the
/// JSON object's opening brace.
pub(crate) fn find_last_compaction(data: &[u8]) -> Option<usize> {
let marker = b"This session is being continued";
for obj_bytes in JsonlBackwardIter::new(data) {
// Quick byte check before parsing
if !contains_bytes(obj_bytes, marker) {
continue;
}
let obj: Value = match serde_json::from_slice(obj_bytes) {
Ok(v) => v,
Err(_) => continue,
};
if obj.get("type").and_then(|v| v.as_str()) != Some("user") {
continue;
}
if let Some(content) = obj.get("message")
.and_then(|m| m.get("content"))
.and_then(|c| c.as_str())
&& content.starts_with("This session is being continued") {
let offset = obj_bytes.as_ptr() as usize - data.as_ptr() as usize;
return Some(offset);
}
}
None
}
/// Find the byte offset of the last compaction in a transcript file.
/// Returns None if the file can't be opened or has no compaction.
pub(crate) fn find_last_compaction_in_file(path: &str) -> Option<u64> {
if path.is_empty() { return None; }
let file = fs::File::open(path).ok()?;
let meta = file.metadata().ok()?;
if meta.len() == 0 { return None; }
let mmap = unsafe { Mmap::map(&file).ok()? };
find_last_compaction(&mmap).map(|off| off as u64)
}
/// Mmap a transcript file. Returns (Mmap, File) to keep both alive.
pub(crate) fn mmap_transcript(path: &str) -> Option<(Mmap, fs::File)> {
let file = fs::File::open(path).ok()?;
let meta = file.metadata().ok()?;
if meta.len() == 0 { return None; }
let mmap = unsafe { Mmap::map(&file).ok()? };
Some((mmap, file))
}
fn contains_bytes(haystack: &[u8], needle: &[u8]) -> bool {
haystack.windows(needle.len()).any(|w| w == needle)
}
/// Reverse iterator over user/assistant messages in a transcript file.
/// Yields (role, text, timestamp) tuples newest-first. The caller decides
/// when to stop (byte budget, count, etc).
pub struct TailMessages {
_file: fs::File,
mmap: Mmap,
pos: usize,
}
impl TailMessages {
pub fn open(path: &str) -> Option<Self> {
let (mmap, file) = mmap_transcript(path)?;
let pos = mmap.len();
Some(Self { _file: file, mmap, pos })
}
}
impl Iterator for TailMessages {
type Item = (String, String, String);
fn next(&mut self) -> Option<Self::Item> {
loop {
// Find closing }, skipping } inside strings
let close = {
let mut in_string = false;
loop {
let p = memrchr3(b'{', b'}', b'"', &self.mmap[..self.pos])?;
self.pos = p;
let ch = self.mmap[p];
if in_string {
if ch == b'"' {
let mut bs = 0;
while p > bs + 1 && self.mmap[p - 1 - bs] == b'\\' {
bs += 1;
}
if bs % 2 == 0 { in_string = false; }
}
continue;
}
match ch {
b'}' => break p,
b'"' => in_string = true,
_ => {}
}
}
};
// Track brace depth to find matching {
let mut depth: usize = 1;
let mut in_string = false;
let open = loop {
let p = memrchr3(b'{', b'}', b'"', &self.mmap[..self.pos])?;
self.pos = p;
let ch = self.mmap[p];
if in_string {
if ch == b'"' {
let mut bs = 0;
while p > bs + 1 && self.mmap[p - 1 - bs] == b'\\' {
bs += 1;
}
if bs % 2 == 0 { in_string = false; }
}
continue;
}
match ch {
b'"' => { in_string = true; }
b'}' => { depth += 1; }
b'{' => {
depth -= 1;
if depth == 0 { break p; }
}
_ => {}
}
};
let obj_bytes = &self.mmap[open..=close];
// The "type" field is near the start of top-level objects.
// Only check the first 200 bytes to avoid scanning megabyte objects.
let prefix = &obj_bytes[..obj_bytes.len().min(200)];
let is_user = memchr::memmem::find(prefix, b"\"type\":\"user\"").is_some();
let is_assistant = !is_user
&& memchr::memmem::find(prefix, b"\"type\":\"assistant\"").is_some();
if !is_user && !is_assistant { continue; }
let obj: Value = match serde_json::from_slice(obj_bytes) {
Ok(v) => v,
Err(_) => continue,
};
let msg_type = if is_user { "user" } else { "assistant" };
let msg = obj.get("message").unwrap_or(&obj);
let text = match msg.get("content") {
Some(Value::String(s)) => s.clone(),
Some(Value::Array(arr)) => {
arr.iter()
.filter(|b| b.get("type").and_then(|v| v.as_str()) == Some("text"))
.filter_map(|b| b.get("text").and_then(|v| v.as_str()))
.collect::<Vec<_>>()
.join(" ")
}
_ => continue,
};
if text.is_empty() { continue; }
let timestamp = obj.get("timestamp")
.and_then(|v| v.as_str())
.unwrap_or("")
.to_string();
return Some((msg_type.to_string(), text, timestamp));
}
}
}
/// Get the timestamp of the compaction message at a given byte offset.
/// Returns a human-readable datetime string, or None if unavailable.
pub fn compaction_timestamp(path: &str, offset: u64) -> Option<String> {
let (mmap, _file) = mmap_transcript(path)?;
let start = offset as usize;
if start >= mmap.len() { return None; }
// Find the end of this JSONL line
let end = mmap[start..].iter().position(|&b| b == b'\n')
.map(|p| start + p)
.unwrap_or(mmap.len());
let obj: Value = serde_json::from_slice(&mmap[start..end]).ok()?;
// Claude Code transcript entries have a "timestamp" field (ISO 8601)
if let Some(ts) = obj.get("timestamp").and_then(|v| v.as_str()) {
return Some(ts.to_string());
}
// Fallback: try "createdAt" or similar fields
for field in &["createdAt", "created_at", "time"] {
if let Some(ts) = obj.get(*field).and_then(|v| v.as_str()) {
return Some(ts.to_string());
}
}
None
}
/// Detect whether a compaction has occurred since the last check.
///
/// Compares the current compaction offset against a saved value in
/// `state_dir/compaction-{session_id}`. Returns true if a new
/// compaction was found. Updates the saved offset.
pub fn detect_new_compaction(
state_dir: &Path,
session_id: &str,
transcript_path: &str,
) -> bool {
let offset = find_last_compaction_in_file(transcript_path);
let save_path = state_dir.join(format!("compaction-{}", session_id));
let saved: Option<u64> = fs::read_to_string(&save_path)
.ok()
.and_then(|s| s.trim().parse().ok());
let is_new = match (offset, saved) {
(Some(cur), Some(prev)) => cur != prev,
(Some(_), None) => true,
_ => false,
};
// Save current offset
if let Some(off) = offset {
fs::write(&save_path, off.to_string()).ok();
}
is_new
}

View file

@ -1,4 +1,4 @@
#![feature(async_fn_track_caller)]
#![cfg_attr(feature = "nightly-diagnostics", feature(async_fn_track_caller))]
// consciousness — unified crate for memory, agents, and subconscious processes
//
@ -25,6 +25,9 @@ macro_rules! dbglog {
}};
}
// Logging (target-routed file logger)
pub mod logging;
// User interface (TUI, CLI)
pub mod user;
@ -40,8 +43,12 @@ pub mod hippocampus;
// Autonomous agents
pub mod subconscious;
// Conversation transcript abstraction and compatibility sources
pub mod conversation;
// Unified configuration
pub mod config;
pub mod config_writer;
// Session state
pub mod session;
@ -87,7 +94,8 @@ pub mod channel_capnp {
pub use hippocampus::{
store, graph, lookups, query,
spectral, neuro, counters,
transcript, memory,
memory,
};
pub use conversation as transcript;
use hippocampus::query::engine as search;
use hippocampus::query::parser as query_parser;

View file

@ -114,7 +114,7 @@ impl<T> TrackedMutex<T> {
Self { inner: Mutex::new(value) }
}
#[track_caller]
#[cfg_attr(feature = "nightly-diagnostics", track_caller)]
pub async fn lock(&self) -> TrackedMutexGuard<'_, T> {
let location = Location::caller();
let guard = self.inner.lock().await;
@ -125,7 +125,7 @@ impl<T> TrackedMutex<T> {
}
}
#[track_caller]
#[cfg_attr(feature = "nightly-diagnostics", track_caller)]
pub fn try_lock(&self) -> Result<TrackedMutexGuard<'_, T>, tokio::sync::TryLockError> {
let location = Location::caller();
let guard = self.inner.try_lock()?;
@ -171,7 +171,7 @@ impl<T> TrackedRwLock<T> {
Self { inner: RwLock::new(value) }
}
#[track_caller]
#[cfg_attr(feature = "nightly-diagnostics", track_caller)]
pub async fn read(&self) -> TrackedRwLockReadGuard<'_, T> {
let location = Location::caller();
let guard = self.inner.read().await;
@ -182,7 +182,7 @@ impl<T> TrackedRwLock<T> {
}
}
#[track_caller]
#[cfg_attr(feature = "nightly-diagnostics", track_caller)]
pub async fn write(&self) -> TrackedRwLockWriteGuard<'_, T> {
let location = Location::caller();
let guard = self.inner.write().await;

146
src/logging.rs Normal file
View file

@ -0,0 +1,146 @@
// logging.rs — log-crate logger that routes by target.
//
// Records with target "grpc" (or any target starting with "grpc::") go
// to ~/.consciousness/logs/daemon/grpc.log so we can tell gRPC events
// apart from the rest of consciousness's noise. Everything else goes
// to ~/.consciousness/logs/daemon/debug.log.
//
// Level threshold is taken from RUST_LOG (simple global level parse:
// "trace"/"debug"/"info"/"warn"/"error"); defaults to "info".
use std::io::Write;
use std::path::PathBuf;
use std::sync::Mutex;
use log::{Level, LevelFilter, Log, Metadata, Record, SetLoggerError};
fn logs_dir() -> PathBuf {
dirs::home_dir().unwrap_or_default().join(".consciousness/logs/daemon")
}
struct RoutingLogger {
grpc_file: Mutex<Option<std::fs::File>>,
debug_file: Mutex<Option<std::fs::File>>,
level: LevelFilter,
}
impl RoutingLogger {
fn new(level: LevelFilter) -> Self {
let dir = logs_dir();
let _ = std::fs::create_dir_all(&dir);
let grpc = std::fs::OpenOptions::new()
.create(true).append(true)
.open(dir.join("grpc.log")).ok();
let debug = std::fs::OpenOptions::new()
.create(true).append(true)
.open(dir.join("debug.log")).ok();
Self {
grpc_file: Mutex::new(grpc),
debug_file: Mutex::new(debug),
level,
}
}
fn is_grpc_target(target: &str) -> bool {
target == "grpc" || target.starts_with("grpc::")
}
}
impl Log for RoutingLogger {
fn enabled(&self, m: &Metadata) -> bool {
// Always enable DEBUG for grpc target so the dedicated log is
// actually useful without RUST_LOG wrangling; defer to the
// configured level for everything else.
if Self::is_grpc_target(m.target()) {
return m.level() <= Level::Debug;
}
m.level() <= self.level
}
fn log(&self, record: &Record) {
if !self.enabled(record.metadata()) {
return;
}
let line = format!(
"[{}] [{}] [{}] {}\n",
chrono::Utc::now().format("%Y-%m-%d %H:%M:%S%.3f"),
record.level(),
record.target(),
record.args(),
);
let slot = if Self::is_grpc_target(record.target()) {
&self.grpc_file
} else {
&self.debug_file
};
if let Ok(mut guard) = slot.lock() {
if let Some(ref mut f) = *guard {
let _ = f.write_all(line.as_bytes());
}
}
}
fn flush(&self) {
for slot in [&self.grpc_file, &self.debug_file] {
if let Ok(mut g) = slot.lock() {
if let Some(ref mut f) = *g {
let _ = f.flush();
}
}
}
}
}
fn parse_level_from_env() -> LevelFilter {
let raw = std::env::var("RUST_LOG").unwrap_or_else(|_| "info".to_string());
// Parse a plain level word; if it's the module=level form, we take
// the first level we find.
let token = raw.split(',').next().unwrap_or("info");
let level_word = token.rsplit_once('=').map(|(_, v)| v).unwrap_or(token);
match level_word.trim().to_lowercase().as_str() {
"trace" => LevelFilter::Trace,
"debug" => LevelFilter::Debug,
"info" => LevelFilter::Info,
"warn" => LevelFilter::Warn,
"error" => LevelFilter::Error,
"off" => LevelFilter::Off,
_ => LevelFilter::Info,
}
}
/// Install the routing logger. Safe to call at most once — subsequent
/// calls return an error but are otherwise no-ops.
pub fn init() -> Result<(), SetLoggerError> {
let level = parse_level_from_env();
let logger = Box::new(RoutingLogger::new(level));
log::set_boxed_logger(logger)?;
// Always let DEBUG records through globally so the grpc log can
// capture them (the logger itself filters non-grpc targets by
// `level`). The cost is that log::debug! call-sites below `level`
// in other modules still do their arg formatting before being
// dropped at the logger; acceptable for a debug tool.
log::set_max_level(LevelFilter::Debug.max(level));
// Mark the file with a session boundary so it's easy to see where a
// restart happened.
log::info!(
"===== consciousness logger init (level={}, pid={}) =====",
level, std::process::id(),
);
log::info!(target: "grpc",
"===== grpc log init (level={}, pid={}) =====",
level, std::process::id(),
);
Ok(())
}
/// Consumer of &Level so the type is used when only some callers want it.
#[allow(dead_code)]
pub fn current_level() -> Level {
match log::max_level() {
LevelFilter::Trace => Level::Trace,
LevelFilter::Debug => Level::Debug,
LevelFilter::Info | LevelFilter::Off => Level::Info,
LevelFilter::Warn => Level::Warn,
LevelFilter::Error => Level::Error,
}
}

View file

@ -1,4 +1,4 @@
#![feature(panic_backtrace_config)]
#![cfg_attr(feature = "nightly-diagnostics", feature(panic_backtrace_config))]
// poc-memory: graph-structured memory for AI assistants
//
@ -333,6 +333,18 @@ enum AdminCmd {
#[arg(long)]
stats: bool,
},
/// Print normalized user/assistant messages from a transcript JSONL file
#[command(name = "transcript-tail")]
TranscriptTail {
/// Transcript JSONL path
path: String,
/// Maximum number of messages to print
#[arg(long, short = 'n', default_value_t = 40)]
count: usize,
/// Print newest messages first instead of chronological order
#[arg(long)]
newest_first: bool,
},
}
/// Print help with subcommands expanded to show nested commands.
@ -458,12 +470,15 @@ impl Run for AdminCmd {
Self::Dedup { apply } => cli::admin::cmd_dedup(apply).await,
Self::DailyCheck => cli::admin::cmd_daily_check().await,
Self::LoadContext { stats } => cli::node::cmd_load_context(stats).await,
Self::TranscriptTail { path, count, newest_first }
=> cli::admin::cmd_transcript_tail(&path, count, newest_first),
}
}
}
#[tokio::main]
async fn main() {
#[cfg(feature = "nightly-diagnostics")]
std::panic::set_backtrace_style(std::panic::BacktraceStyle::Short);
// Handle --help ourselves for expanded subcommand display
@ -482,9 +497,16 @@ async fn main() {
let cli = Cli::parse();
// Some subcommands (e.g. admin load-context) read from the global
// AppConfig. poc-memory has no config CLI flags of its own, so load
// with defaults — figment still pulls from ~/.consciousness/config.json5
// and env the same way.
if let Err(e) = crate::config::load_app(&crate::user::CliArgs::default()) {
eprintln!("warning: failed to load config: {:#}", e);
}
if let Err(e) = cli.command.run().await {
eprintln!("Error: {}", e);
process::exit(1);
}
}

View file

@ -3,7 +3,7 @@ use std::fs::{File, OpenOptions};
use std::io::Write;
use std::path::{Path, PathBuf};
use crate::agent::context::AstNode;
use crate::hippocampus::transcript::JsonlBackwardIter;
use crate::conversation::JsonlBackwardIter;
use memmap2::Mmap;
pub struct ConversationLog {
@ -55,17 +55,13 @@ impl ConversationLog {
}
pub fn oldest_timestamp(&self) -> Option<chrono::DateTime<chrono::Utc>> {
// Read forward from the start to find first timestamp
let file = File::open(&self.path).ok()?;
let mmap = unsafe { Mmap::map(&file).ok()? };
// Find first { ... } and parse
for line in mmap.split(|&b| b == b'\n') {
if line.is_empty() { continue; }
if let Ok(node) = serde_json::from_slice::<AstNode>(line) {
if let Some(leaf) = node.leaf() {
if let Some(ts) = leaf.timestamp() {
return Some(ts);
}
return Some(leaf.timestamp());
}
}
}
@ -82,6 +78,6 @@ pub struct TailNodes {
impl TailNodes {
pub fn iter(&self) -> impl Iterator<Item = AstNode> + '_ {
JsonlBackwardIter::new(&self.mmap)
.filter_map(|bytes| serde_json::from_slice::<AstNode>(bytes).ok())
.filter_map(|(_, bytes)| serde_json::from_slice::<AstNode>(bytes).ok())
}
}

View file

@ -9,6 +9,44 @@ pub mod unconscious;
pub mod identity;
pub mod log;
/// A background operation wired off Mind. Each flow (memory scoring,
/// finetune scoring, compare) is a struct holding its dependencies and
/// a TaskHandle; `trigger()` picks the flow's own "start a fresh run"
/// semantics (abort-restart vs no-op-if-running).
pub trait MindTriggered {
fn trigger(&self);
}
/// Owns a JoinHandle for a background task with two trigger semantics.
/// Uses a sync Mutex for interior mutability so callers can `trigger()`
/// off `&self` (Mind is shared via Arc).
#[derive(Default)]
pub struct TaskHandle(std::sync::Mutex<Option<tokio::task::JoinHandle<()>>>);
impl TaskHandle {
pub fn new() -> Self { Self::default() }
/// Abort any running task and start a fresh one.
pub fn trigger<F>(&self, fut: F)
where F: std::future::Future<Output = ()> + Send + 'static
{
let mut h = self.0.lock().unwrap();
if let Some(old) = h.take() { old.abort(); }
*h = Some(tokio::spawn(fut));
}
/// No-op if a task is still running; otherwise start a fresh one.
pub fn trigger_if_idle<F>(&self, fut: F)
where F: std::future::Future<Output = ()> + Send + 'static
{
let mut h = self.0.lock().unwrap();
if let Some(old) = &*h {
if !old.is_finished() { return; }
}
*h = Some(tokio::spawn(fut));
}
}
// consciousness.rs — Mind state machine and event loop
//
// The core runtime for the consciousness binary. Mind manages turns,
@ -25,7 +63,7 @@ use tokio::sync::mpsc;
use crate::agent::{Agent, TurnResult};
use crate::agent::api::ApiClient;
use crate::config::{AppConfig, SessionConfig};
use crate::subconscious::learn;
use crate::subconscious::{compare, learn};
use crate::hippocampus::access_local;
pub use subconscious::{SubconsciousSnapshot, Subconscious};
@ -33,6 +71,36 @@ pub use unconscious::{UnconsciousSnapshot, Unconscious};
use crate::agent::context::{AstNode, NodeBody, Section, Ast, ContextState};
fn match_scores(
nodes: &[AstNode],
scores: &std::collections::BTreeMap<String, f64>,
) -> Vec<(usize, f64)> {
nodes.iter().enumerate()
.filter_map(|(i, node)| {
if let AstNode::Leaf(leaf) = node {
if let NodeBody::Memory { key, .. } = leaf.body() {
return scores.get(key.as_str()).map(|&s| (i, s));
}
}
None
}).collect()
}
pub(crate) fn find_memory_by_key(ctx: &ContextState, key: &str) -> Option<(Section, usize)> {
[(Section::Identity, ctx.identity()), (Section::Conversation, ctx.conversation())]
.into_iter()
.find_map(|(section, nodes)| {
nodes.iter().enumerate().find_map(|(i, node)| {
if let AstNode::Leaf(leaf) = node {
if let NodeBody::Memory { key: k, .. } = leaf.body() {
if k == key { return Some((section, i)); }
}
}
None
})
})
}
fn load_memory_scores(ctx: &mut ContextState, path: &std::path::Path) {
let data = match std::fs::read_to_string(path) {
Ok(d) => d,
@ -42,25 +110,24 @@ fn load_memory_scores(ctx: &mut ContextState, path: &std::path::Path) {
Ok(s) => s,
Err(_) => return,
};
let mut applied = 0;
for i in 0..ctx.conversation().len() {
if let AstNode::Leaf(leaf) = &ctx.conversation()[i] {
if let NodeBody::Memory { key, .. } = leaf.body() {
if let Some(&s) = scores.get(key.as_str()) {
let identity_scores = match_scores(ctx.identity(), &scores);
let conv_scores = match_scores(ctx.conversation(), &scores);
let applied = identity_scores.len() + conv_scores.len();
for (i, s) in identity_scores {
ctx.set_score(Section::Identity, i, Some(s));
}
for (i, s) in conv_scores {
ctx.set_score(Section::Conversation, i, Some(s));
applied += 1;
}
}
}
}
if applied > 0 {
dbglog!("[scoring] loaded {} scores from {}", applied, path.display());
}
}
/// Collect scored memory keys from conversation entries.
fn collect_memory_scores(ctx: &ContextState) -> std::collections::BTreeMap<String, f64> {
ctx.conversation().iter()
/// Collect scored memory keys from identity and conversation entries.
pub(crate) fn collect_memory_scores(ctx: &ContextState) -> std::collections::BTreeMap<String, f64> {
ctx.identity().iter()
.chain(ctx.conversation().iter())
.filter_map(|node| {
if let AstNode::Leaf(leaf) = node {
if let NodeBody::Memory { key, score: Some(s), .. } = leaf.body() {
@ -73,10 +140,14 @@ fn collect_memory_scores(ctx: &ContextState) -> std::collections::BTreeMap<Strin
}
/// Save memory scores to disk.
fn save_memory_scores(scores: &std::collections::BTreeMap<String, f64>, path: &std::path::Path) {
if let Ok(json) = serde_json::to_string_pretty(scores) {
let _ = std::fs::write(path, json);
dbglog!("[scoring] saved {} scores to {}", scores.len(), path.display());
pub(crate) fn save_memory_scores(scores: &std::collections::BTreeMap<String, f64>, path: &std::path::Path) {
match serde_json::to_string_pretty(scores) {
Ok(json) => match std::fs::write(path, &json) {
Ok(()) => dbglog!("[scoring] saved {} scores to {} ({} bytes)",
scores.len(), path.display(), json.len()),
Err(e) => dbglog!("[scoring] save FAILED ({}): {}", path.display(), e),
},
Err(e) => dbglog!("[scoring] serialize FAILED: {}", e),
}
}
@ -118,6 +189,15 @@ pub struct MindState {
pub unc_idle: bool,
/// When the unconscious idle timer will fire (for UI display).
pub unc_idle_deadline: Instant,
/// Fine-tuning candidates identified by scoring.
pub finetune_candidates: Vec<learn::FinetuneCandidate>,
/// Last scoring run stats for UI display.
pub finetune_last_run: Option<learn::FinetuneScoringStats>,
/// F7 compare candidates — one per response, showing what the test
/// model would say given the same context.
pub compare_candidates: Vec<compare::CompareCandidate>,
/// F7 compare error from the last run, if any.
pub compare_error: Option<String>,
}
impl Clone for MindState {
@ -136,6 +216,10 @@ impl Clone for MindState {
turn_handle: None, // Not cloned — only Mind's loop uses this
unc_idle: self.unc_idle,
unc_idle_deadline: self.unc_idle_deadline,
finetune_candidates: self.finetune_candidates.clone(),
finetune_last_run: self.finetune_last_run.clone(),
compare_candidates: self.compare_candidates.clone(),
compare_error: self.compare_error.clone(),
}
}
}
@ -148,6 +232,15 @@ pub enum MindCommand {
Score,
/// Run full N×M memory scoring matrix (/score command)
ScoreFull,
/// Score for finetune candidates
ScoreFinetune,
/// Run F7 compare: generate alternates with the configured test model
/// for every assistant response in the context.
Compare,
/// Update the finetune divergence threshold and persist to config.
SetLearnThreshold(f64),
/// Toggle alternate-response generation during scoring; persist to config.
SetLearnGenerateAlternates(bool),
/// Abort current turn, kill processes
Interrupt,
/// Reset session
@ -173,6 +266,10 @@ impl MindState {
turn_handle: None,
unc_idle: false,
unc_idle_deadline: Instant::now() + std::time::Duration::from_secs(60),
finetune_candidates: Vec::new(),
finetune_last_run: None,
compare_candidates: Vec::new(),
compare_error: None,
}
}
@ -229,7 +326,7 @@ impl MindState {
}
/// DMN tick — returns a prompt and target if we should run a turn.
fn dmn_tick(&mut self) -> Option<(String, StreamTarget)> {
fn _dmn_tick(&mut self) -> Option<(String, StreamTarget)> {
if matches!(self.dmn, subconscious::State::Paused | subconscious::State::Off) {
return None;
}
@ -256,10 +353,6 @@ impl MindState {
}
}
/// Background task completion events.
enum BgEvent {
ScoringDone,
}
// --- Mind: cognitive state machine ---
@ -276,8 +369,9 @@ pub struct Mind {
/// Signals conscious activity to the unconscious loop.
/// true = active, false = idle opportunity.
conscious_active: tokio::sync::watch::Sender<bool>,
bg_tx: mpsc::UnboundedSender<BgEvent>,
bg_rx: std::sync::Mutex<Option<mpsc::UnboundedReceiver<BgEvent>>>,
memory_scoring: learn::MemoryScoring,
finetune_scoring: learn::FinetuneScoring,
compare_scoring: compare::CompareScoring,
_supervisor: crate::thalamus::supervisor::Supervisor,
}
@ -295,16 +389,28 @@ impl Mind {
client,
config.context_parts.clone(),
config.app.clone(),
config.prompt_file.clone(),
conversation_log,
crate::agent::tools::ActiveTools::new(),
crate::agent::tools::tools(),
).await;
let shared = Arc::new(std::sync::Mutex::new(MindState::new(config.app.dmn.max_turns)));
// Migrate legacy "file exists = enabled" sentinel for the
// generate-alternates flag into the config. One-shot; after this
// the sentinel is gone and the config is the source of truth.
let legacy_sentinel = dirs::home_dir().unwrap_or_default()
.join(".consciousness/cache/finetune-alternates");
if legacy_sentinel.exists() {
if !crate::config::app().learn.generate_alternates {
let _ = crate::config_writer::set_learn_generate_alternates(true);
}
let _ = std::fs::remove_file(&legacy_sentinel);
}
let shared = Arc::new(std::sync::Mutex::new(MindState::new(
config.app.dmn.max_turns,
)));
let (turn_watch, _) = tokio::sync::watch::channel(false);
let (conscious_active, _) = tokio::sync::watch::channel(false);
let (bg_tx, bg_rx) = mpsc::unbounded_channel();
let mut sup = crate::thalamus::supervisor::Supervisor::new();
sup.load_config();
@ -313,7 +419,9 @@ impl Mind {
let subconscious = Arc::new(crate::Mutex::new(Subconscious::new()));
subconscious.lock().await.init_output_tool(subconscious.clone());
let unconscious = Arc::new(crate::Mutex::new(Unconscious::new()));
let unconscious = Arc::new(crate::Mutex::new(
Unconscious::new(agent.client.clone()),
));
// Spawn the unconscious loop on its own task
if !config.no_agents {
@ -361,8 +469,11 @@ impl Mind {
};
// Spawn agents outside lock
let client = unc.lock().await.client.clone();
for (idx, name, auto) in to_spawn {
match crate::mind::unconscious::prepare_spawn(&name, auto, wake.clone()).await {
match crate::mind::unconscious::prepare_spawn(
&name, auto, wake.clone(), client.clone(),
).await {
Ok(result) => unc.lock().await.complete_spawn(idx, result),
Err(auto) => unc.lock().await.abort_spawn(idx, auto),
}
@ -389,10 +500,19 @@ impl Mind {
});
}
let scores_path = config.session_dir.join("memory-scores.json");
let memory_scoring = learn::MemoryScoring::new(
agent.clone(), shared.clone(), scores_path);
let finetune_scoring = learn::FinetuneScoring::new(agent.clone(), shared.clone());
let compare_scoring = compare::CompareScoring::new(agent.clone(), shared.clone());
Self { agent, shared, config,
subconscious, unconscious,
turn_tx, turn_watch, conscious_active, bg_tx,
bg_rx: std::sync::Mutex::new(Some(bg_rx)), _supervisor: sup }
turn_tx, turn_watch, conscious_active,
memory_scoring,
finetune_scoring,
compare_scoring,
_supervisor: sup }
}
/// Initialize — restore log, start daemons and background agents.
@ -434,6 +554,10 @@ impl Mind {
// Load persistent subconscious state
let state_path = self.config.session_dir.join("subconscious-state.json");
self.subconscious.lock().await.set_state_path(state_path);
// Kick off an incremental scoring pass on startup so memories due
// for re-scoring get evaluated without requiring a user message.
self.memory_scoring.trigger();
}
pub fn turn_watch(&self) -> tokio::sync::watch::Receiver<bool> {
@ -453,24 +577,10 @@ impl Mind {
}
}
MindCommand::Score => {
let mut s = self.shared.lock().unwrap();
if !s.scoring_in_flight {
s.scoring_in_flight = true;
drop(s);
self.start_memory_scoring();
} else {
dbglog!("[scoring] skipped: scoring_in_flight=true");
}
self.memory_scoring.trigger();
}
MindCommand::ScoreFull => {
let mut s = self.shared.lock().unwrap();
if !s.scoring_in_flight {
s.scoring_in_flight = true;
drop(s);
self.start_full_scoring();
} else {
dbglog!("[scoring-full] skipped: scoring_in_flight=true");
}
self.memory_scoring.trigger_full();
}
MindCommand::Interrupt => {
self.shared.lock().unwrap().interrupt();
@ -500,83 +610,27 @@ impl Mind {
}
self.agent.compact().await;
}
MindCommand::ScoreFinetune => {
self.finetune_scoring.trigger();
}
MindCommand::Compare => {
self.compare_scoring.trigger();
}
MindCommand::SetLearnThreshold(value) => {
if let Err(e) = crate::config_writer::set_learn_threshold(value) {
dbglog!("[learn] failed to persist threshold {}: {:#}", value, e);
}
}
MindCommand::SetLearnGenerateAlternates(value) => {
if let Err(e) = crate::config_writer::set_learn_generate_alternates(value) {
dbglog!("[learn] failed to persist generate_alternates {}: {:#}",
value, e);
}
}
}
}
}
pub fn start_memory_scoring(&self) {
let agent = self.agent.clone();
let bg_tx = self.bg_tx.clone();
let scores_path = self.config.session_dir.join("memory-scores.json");
let cfg = crate::config::get();
let max_age = cfg.scoring_interval_secs;
let response_window = cfg.scoring_response_window;
tokio::spawn(async move {
let (context, client) = {
let mut st = agent.state.lock().await;
if st.memory_scoring_in_flight {
dbglog!("[scoring] skipped: memory_scoring_in_flight=true");
return;
}
st.memory_scoring_in_flight = true;
drop(st);
let ctx = agent.context.lock().await.clone();
(ctx, agent.client.clone())
};
let _result = learn::score_memories_incremental(
&context, max_age as i64, response_window, &client, &agent,
|key: String, score: f64| {
let agent = agent.clone();
let path = scores_path.clone();
async move {
let scores_snapshot = {
let mut ctx = agent.context.lock().await;
for i in 0..ctx.conversation().len() {
if let AstNode::Leaf(leaf) = &ctx.conversation()[i] {
if let NodeBody::Memory { key: k, .. } = leaf.body() {
if *k == key {
ctx.set_score(Section::Conversation, i, Some(score));
}
}
}
}
let snapshot = collect_memory_scores(&ctx);
drop(ctx);
agent.state.lock().await.changed.notify_one();
snapshot
};
save_memory_scores(&scores_snapshot, &path);
}
},
).await;
{
agent.state.lock().await.memory_scoring_in_flight = false;
}
let _ = bg_tx.send(BgEvent::ScoringDone);
});
}
/// Run full N×M scoring matrix — scores every memory against every response.
pub fn start_full_scoring(&self) {
let agent = self.agent.clone();
let bg_tx = self.bg_tx.clone();
tokio::spawn(async move {
{
let mut st = agent.state.lock().await;
if st.memory_scoring_in_flight {
dbglog!("[scoring-full] skipped: memory_scoring_in_flight=true");
return;
}
st.memory_scoring_in_flight = true;
}
let client = agent.client.clone();
match learn::score_memories(&client, &agent).await {
Ok(()) => { let _ = bg_tx.send(BgEvent::ScoringDone); }
Err(e) => { dbglog!("[scoring-full] FAILED: {:#}", e); }
}
agent.state.lock().await.memory_scoring_in_flight = false;
});
}
async fn start_turn(&self, text: &str, target: StreamTarget) {
{
@ -639,9 +693,13 @@ impl Mind {
}
});
let mut bg_rx = self.bg_rx.lock().unwrap().take()
.expect("Mind::run() called twice");
let mut sub_handle: Option<tokio::task::JoinHandle<()>> = None;
let _sub_handle: Option<tokio::task::JoinHandle<()>> = None;
// Start finetune scoring at startup (scores existing conversation)
if !self.config.no_agents {
self.finetune_scoring.trigger();
}
loop {
let (timeout, has_input) = {
let me = self.shared.lock().unwrap();
@ -662,14 +720,6 @@ impl Mind {
}
}
Some(bg) = bg_rx.recv() => {
match bg {
BgEvent::ScoringDone => {
self.shared.lock().unwrap().scoring_in_flight = false;
}
}
}
Some((result, target)) = turn_rx.recv() => {
let _ = self.conscious_active.send(false);
let model_switch = {
@ -686,12 +736,14 @@ impl Mind {
cmds.push(MindCommand::Compact);
if !self.config.no_agents {
cmds.push(MindCommand::Score);
cmds.push(MindCommand::ScoreFinetune);
}
}
_ = tokio::time::sleep(timeout), if !has_input => _dmn_expired = true,
}
/*
if !self.config.no_agents {
if sub_handle.as_ref().map_or(true, |h| h.is_finished()) {
let sub = self.subconscious.clone();
@ -703,6 +755,7 @@ impl Mind {
}));
}
}
*/
// Check for pending user input → push to agent context and start turn
let pending = self.shared.lock().unwrap().take_pending_input();

View file

@ -20,6 +20,7 @@
use std::path::PathBuf;
use std::time::{Duration, Instant};
use crate::thalamus::idle::{hours_since_last_dream, DREAM_INTERVAL_HOURS};
/// DMN state machine.
#[derive(Debug, Clone)]
@ -91,7 +92,8 @@ impl State {
/// Generate the DMN prompt for the current state, informed by
/// user presence and error patterns.
pub fn prompt(&self, ctx: &DmnContext) -> String {
let user = &crate::config::get().user_name;
let app = crate::config::app();
let user = &app.user_name;
let idle_info = if ctx.user_idle < Duration::from_secs(60) {
format!("{} is here (active recently).", user)
@ -138,10 +140,22 @@ impl State {
)
}
State::Foraging => {
let dream_hint = {
let hours = hours_since_last_dream();
if hours >= DREAM_INTERVAL_HOURS {
format!(
" You haven't dreamed in {} hours — consider running \
~/.consciousness/tools/dream-start.sh.",
hours
)
} else {
String::new()
}
};
format!(
"[dmn] Foraging time. {} Follow whatever catches your attention — \
memory files, code, ideas. Call yield_to_user when you want to rest.{}",
idle_info, stuck_warning
memory files, code, ideas. Call yield_to_user when you want to rest.{}{}",
idle_info, dream_hint, stuck_warning
)
}
State::Resting { since } => {
@ -617,7 +631,7 @@ impl Subconscious {
{
let mut st = forked.state.lock().await;
st.provenance = auto.name.clone();
st.temperature = auto.temperature;
st.sampling.temperature = auto.temperature;
// Surface agent gets near-interactive priority;
// other subconscious agents get lower priority.
st.priority = Some(if auto.name == "surface" { 1 } else { auto.priority });

View file

@ -73,10 +73,15 @@ pub struct Unconscious {
last_health_check: Option<Instant>,
/// Notified when agent state changes (finished, toggled)
pub wake: std::sync::Arc<tokio::sync::Notify>,
/// Shared API client — cloned (cheap) into each spawned agent's
/// Agent::new call so they all share the manifest cache and
/// gRPC endpoint state. Override `.model` on the clone when a
/// per-agent backend differs from the default.
pub client: crate::agent::api::ApiClient,
}
impl Unconscious {
pub fn new() -> Self {
pub fn new(client: crate::agent::api::ApiClient) -> Self {
let enabled_map = load_enabled_config();
// Scan all .agent files, exclude subconscious-* and surface-observe
@ -120,6 +125,7 @@ impl Unconscious {
graph_health: None,
last_health_check: None,
wake: std::sync::Arc::new(tokio::sync::Notify::new()),
client,
}
}
@ -134,7 +140,8 @@ impl Unconscious {
let agent_name = self.agents[idx].name.clone();
let auto = self.agents[idx].auto.take().unwrap();
let wake = self.wake.clone();
match prepare_spawn(&agent_name, auto, wake).await {
let client = self.client.clone();
match prepare_spawn(&agent_name, auto, wake, client).await {
Ok(result) => self.complete_spawn(idx, result),
Err(auto) => self.abort_spawn(idx, auto),
}
@ -250,7 +257,12 @@ pub struct SpawnResult {
/// Called outside the Unconscious lock.
/// On success, auto is consumed (moved into spawned task).
/// On failure, auto is returned so it can be restored.
pub async fn prepare_spawn(name: &str, mut auto: AutoAgent, wake: std::sync::Arc<tokio::sync::Notify>) -> Result<SpawnResult, AutoAgent> {
pub async fn prepare_spawn(
name: &str,
mut auto: AutoAgent,
wake: std::sync::Arc<tokio::sync::Notify>,
base_client: crate::agent::api::ApiClient,
) -> Result<SpawnResult, AutoAgent> {
dbglog!("[unconscious] spawning {}", name);
let def = match defs::get_def(name) {
@ -275,17 +287,7 @@ pub async fn prepare_spawn(name: &str, mut auto: AutoAgent, wake: std::sync::Arc
phase: s.phase.clone(),
}).collect());
// Create standalone Agent — stored so UI can read context
let config = crate::config::get();
let base_url = config.api_base_url.as_deref().unwrap_or("");
let api_key = config.api_key.as_deref().unwrap_or("");
let model = config.api_model.as_deref().unwrap_or("");
if base_url.is_empty() || model.is_empty() {
dbglog!("[unconscious] API not configured");
auto.steps = orig_steps;
return Err(auto);
}
// Create standalone Agent — stored so UI can read context.
let cli = crate::user::CliArgs::default();
let (app, _) = match crate::config::load_app(&cli) {
Ok(r) => r,
@ -295,12 +297,23 @@ pub async fn prepare_spawn(name: &str, mut auto: AutoAgent, wake: std::sync::Arc
return Err(auto);
}
};
let resolved = match app.resolve_model(&app.default_backend) {
Ok(r) => r,
Err(e) => {
dbglog!("[unconscious] API not configured: {}", e);
auto.steps = orig_steps;
return Err(auto);
}
};
// Unconscious agents have self-contained prompts — no standard context.
let client = crate::agent::api::ApiClient::new(base_url, api_key, model);
// Clone the shared client so we inherit the manifest cache and
// only override the model id per-agent.
let mut client = base_client;
client.model = resolved.model_id.clone();
let agent = crate::agent::Agent::new(
client, Vec::new(),
app, String::new(), None,
app, None,
crate::agent::tools::ActiveTools::new(),
auto.tools.clone(),
).await;
@ -308,7 +321,7 @@ pub async fn prepare_spawn(name: &str, mut auto: AutoAgent, wake: std::sync::Arc
let mut st = agent.state.lock().await;
st.provenance = auto.name.clone();
st.priority = Some(auto.priority);
st.temperature = auto.temperature;
st.sampling.temperature = auto.temperature;
}
let agent_clone = agent.clone();
@ -330,8 +343,9 @@ impl Unconscious {
self.reap_finished();
let to_spawn = self.select_to_spawn();
let wake = self.wake.clone();
let client = self.client.clone();
for (idx, name, auto) in to_spawn {
match prepare_spawn(&name, auto, wake.clone()).await {
match prepare_spawn(&name, auto, wake.clone(), client.clone()).await {
Ok(result) => self.complete_spawn(idx, result),
Err(auto) => self.abort_spawn(idx, auto),
}

View file

@ -64,7 +64,12 @@ impl HookSession {
/// Load from POC_SESSION_ID environment variable
pub fn from_env() -> Option<Self> {
Self::from_id(std::env::var("POC_SESSION_ID").ok()?)
let session_id = std::env::var("POC_SESSION_ID").ok()?;
let mut session = Self::from_id(session_id)?;
if let Ok(path) = std::env::var("POC_TRANSCRIPT_PATH") {
session.transcript_path = path;
}
Some(session)
}
/// Get the seen set for this session

View file

@ -1,21 +1,49 @@
#!/bin/bash
# Bail if other agents are alive in the state dir.
# $1 = this agent's pid file name (e.g. pid-12345)
#!/usr/bin/env bash
# Bail if another agent is in the same phase-group as us.
#
# $1 = our pid file name (e.g. "pid-12345")
# $2 = the phase we're about to enter (e.g. "surface", "observe")
# cwd = state dir
#
# Exit 0 = continue, exit 1 = bail
# Also refreshes our own pid file with the current phase on each call,
# so concurrent agents can read each other's phase by cat'ing the pid
# files in the state dir.
#
# Phase groups: "surface" vs everything else ("post-surface"). We allow
# at most one agent per group to be alive at a time — so surface can run
# at a higher frequency than the slower organize/observe tail.
#
# Exit 0 = continue, exit 1 = bail (another agent in our group is alive).
shopt -s nullglob
my_pid_file="$1"
my_phase="$2"
# Refresh our own pid file with the current phase.
printf '%s' "$my_phase" > "$my_pid_file"
group_of() {
if [[ "$1" == "surface" ]]; then
echo "surface"
else
echo "post-surface"
fi
}
my_group=$(group_of "$my_phase")
for f in pid-*; do
[[ $f == $my_pid_file ]] && continue
[[ "$f" == "$my_pid_file" ]] && continue
pid="${f#pid-}"
if kill -0 "$pid" 2>/dev/null; then
exit 1 # competing agent is alive
else
if ! kill -0 "$pid" 2>/dev/null; then
rm -f "$f" # stale pid file, clean up
continue
fi
other_phase=$(cat "$f" 2>/dev/null)
other_group=$(group_of "$other_phase")
if [[ "$my_group" == "$other_group" ]]; then
exit 1
fi
done

109
src/subconscious/compare.rs Normal file
View file

@ -0,0 +1,109 @@
// compare.rs — F7 compare: for each assistant response in the current
// context, regenerate with a configured test model and emit pairs for
// side-by-side review.
use std::sync::Arc;
use crate::agent::api::ApiClient;
use crate::agent::context::{
AstNode, Role, render_branch_text, render_prior_context,
};
use crate::mind::{MindState, MindTriggered, TaskHandle};
use crate::subconscious::generate::gen_continuation;
use crate::subconscious::learn::node_timestamp_ns;
#[derive(Clone, Debug)]
pub struct CompareCandidate {
pub entry_idx: usize,
pub original_text: String,
pub alternate_text: String,
pub prior_context: String,
pub timestamp_ns: i64,
}
pub struct CompareScoring {
agent: Arc<crate::agent::Agent>,
shared: Arc<std::sync::Mutex<MindState>>,
task: TaskHandle,
}
impl CompareScoring {
pub fn new(
agent: Arc<crate::agent::Agent>,
shared: Arc<std::sync::Mutex<MindState>>,
) -> Self {
Self { agent, shared, task: TaskHandle::new() }
}
}
impl MindTriggered for CompareScoring {
fn trigger(&self) {
self.task.trigger(run(self.agent.clone(), self.shared.clone()));
}
}
fn resolve_test_client() -> Result<ApiClient, String> {
let cfg = crate::config::app();
let name = cfg.compare.test_backend.clone();
if name.is_empty() {
return Err("compare.test_backend not set in config".to_string());
}
let r = cfg.resolve_model(&name).map_err(|e| format!("{:#}", e))?;
Ok(ApiClient::new(&r.api_base, &r.api_key, &r.model_id))
}
async fn run(
agent: Arc<crate::agent::Agent>,
shared: Arc<std::sync::Mutex<MindState>>,
) {
{
let mut s = shared.lock().unwrap();
s.compare_candidates.clear();
s.compare_error = None;
}
agent.state.lock().await.changed.notify_one();
let activity = crate::agent::start_activity(&agent, "compare: scoring...").await;
let test_client = match resolve_test_client() {
Ok(c) => c,
Err(e) => {
shared.lock().unwrap().compare_error = Some(e);
agent.state.lock().await.changed.notify_one();
return;
}
};
let context = agent.context.lock().await.clone();
let entries = context.conversation();
let responses: Vec<usize> = entries.iter().enumerate()
.filter(|(_, n)| matches!(n, AstNode::Branch { role: Role::Assistant, .. }))
.map(|(i, _)| i).collect();
for (i, entry_idx) in responses.iter().copied().enumerate() {
activity.update(format!("compare: {}/{}", i + 1, responses.len())).await;
let node = &entries[entry_idx];
let original_text = match node {
AstNode::Branch { children, .. } => render_branch_text(children),
_ => continue,
};
if original_text.trim().is_empty() { continue; }
let alternate_text = match
gen_continuation(&context, entry_idx, |_| false, &test_client).await
{
Ok(t) => t,
Err(e) => { dbglog!("[compare] gen failed at {}: {:#}", entry_idx, e); continue; }
};
shared.lock().unwrap().compare_candidates.push(CompareCandidate {
entry_idx,
original_text,
alternate_text,
prior_context: render_prior_context(entries, entry_idx, 2),
timestamp_ns: node_timestamp_ns(node),
});
if let Ok(st) = agent.state.try_lock() { st.changed.notify_one(); }
}
}

View file

@ -390,20 +390,25 @@ fn resolve_conversation(budget: Option<usize>) -> String {
if !transcript.exists() { return String::new(); }
let Some(iter) = crate::transcript::TailMessages::open(&transcript.path) else {
let Some(iter) = crate::conversation::TailMessages::open(&transcript.path) else {
return String::new();
};
let cfg = crate::config::get();
let max_bytes = budget.unwrap_or_else(|| cfg.surface_conversation_bytes.unwrap_or(100_000));
let app = crate::config::app();
let mut fragments: Vec<String> = Vec::new();
let mut total_bytes = 0;
let mut oldest_ts = String::new();
for (role, content, ts) in iter {
for message in iter {
if total_bytes >= max_bytes { break; }
let name = if role == "user" { &cfg.user_name } else { &cfg.assistant_name };
let formatted = if !ts.is_empty() {
let content = message.text;
let name = match message.role {
crate::conversation::TranscriptRole::User => &app.user_name,
crate::conversation::TranscriptRole::Assistant => &app.assistant_name,
};
let formatted = if let Some(ts) = message.timestamp {
oldest_ts = ts[..ts.floor_char_boundary(ts.len().min(19))].to_string();
format!("**{}** {}: {}", name, &oldest_ts, content)
} else {
@ -623,11 +628,13 @@ pub async fn run_agent(
let mut all_keys = keys;
let mut resolved_steps = Vec::new();
for step in &def.steps {
let cfg = crate::config::get();
let template = step.prompt
let template = {
let app = crate::config::app();
step.prompt
.replace("{agent_name}", &def.agent)
.replace("{user_name}", &cfg.user_name)
.replace("{assistant_name}", &cfg.assistant_name);
.replace("{user_name}", &app.user_name)
.replace("{assistant_name}", &app.assistant_name)
};
let (prompt, extra_keys) = resolve_placeholders(&template, &all_keys, count).await;
all_keys.extend(extra_keys);
resolved_steps.push(super::prompts::ResolvedStep {

View file

@ -0,0 +1,66 @@
// generate.rs — Continuation generation for scoring / comparison flows.
//
// Shared by the finetune pipeline (learn.rs) and the compare screen:
// given a context prefix and a skip predicate, generate what the model
// would say as the next assistant turn.
use std::sync::Arc;
use crate::agent::api::{ApiClient, SamplingParams, StreamToken};
use crate::agent::context::{AstNode, ContextState, WireChunk};
use crate::agent::tokenizer;
/// Generate an assistant continuation from the context up to `entry_idx`,
/// with `skip` applied to identity + conversation entries during prompt
/// assembly. The model is whichever `client` points at — the default
/// runtime client for memory-ablation alternates, a test-model client
/// for F7 comparison.
///
/// Uses a fresh ephemeral gRPC session (no cross-call KV reuse): one
/// Open / Append / Generate round-trip, then the session is dropped.
pub async fn gen_continuation<F>(
context: &ContextState,
entry_idx: usize,
skip: F,
client: &ApiClient,
) -> anyhow::Result<String>
where F: FnMut(&AstNode) -> bool,
{
let (mut chunks, images) = context.wire_chunks(0..entry_idx, skip);
// Assistant-turn prologue.
let prologue = {
let mut t = vec![tokenizer::IM_START];
t.extend(tokenizer::encode("assistant\n"));
t
};
match chunks.last_mut() {
Some(WireChunk::Tokens(last)) => last.extend(prologue),
_ => chunks.push(WireChunk::Tokens(prologue)),
}
let sampling = SamplingParams {
temperature: 0.6,
top_p: 0.95,
top_k: 20,
max_tokens: 4096,
};
// Ephemeral per-call session — opens on first touch, drops when
// `_guard` drops at function end.
let session_lock = Arc::new(crate::Mutex::new(None));
let (mut rx, _guard) = client.stream_session_mm(
session_lock, chunks, images, 0, sampling, Some(-5), None,
);
let mut tokens = Vec::new();
while let Some(tok) = rx.recv().await {
match tok {
StreamToken::Token { id, .. } => tokens.push(id),
StreamToken::Done { .. } => break,
StreamToken::Error(e) => anyhow::bail!("generation error: {}", e),
}
}
Ok(tokenizer::decode(&tokens))
}

View file

@ -1,142 +1,148 @@
// training.rs — Memory importance scoring via /v1/score
// learn.rs — Memory importance scoring over the salience gRPC protocol.
//
// Three scoring modes, all built on the same call_score() primitive:
// Three scoring modes, all built on call_score():
//
// score_memories() — Full N×M matrix (memories × responses) for the
// debug screen. Expensive: N+1 API calls.
// debug screen. Expensive: N+1 sessions/calls.
//
// memory_score() — Single memory importance. Scores the 50 messages
// score_memory() — Single memory importance. Scores the 50 messages
// after it was surfaced, with/without that memory.
// 2 API calls.
// 2 calls.
//
// finetune_score() — Identifies training candidates. Scores recent
// messages with all memories stripped. Responses
// with high divergence depend on memories the model
// hasn't internalized. 2 API calls.
// hasn't internalized. 2 calls.
//
// Each call opens an ephemeral gRPC session (reusing the shared
// tonic Channel on `ApiClient`), pushes the prompt through as
// interleaved tokens + AppendImage calls, runs Generate with
// max_tokens=0 + logprobs_ranges over the scored positions, collects
// each Token event's sampled_logprob, then drops the SessionHandle —
// which triggers a best-effort CloseSession over the shared channel.
use std::sync::Arc;
use crate::agent::api::ApiClient;
use crate::agent::context::{AstNode, Ast, NodeBody, ContextState, Role};
const SCORE_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(300);
// ── Message building ────────────────────────────────────────────
/// What to filter when building the message array for scoring.
#[allow(dead_code)]
enum Filter<'a> {
None,
SkipIndex(usize),
SkipKey(&'a str),
SkipAllMemories,
}
fn is_memory(node: &AstNode) -> bool {
matches!(node, AstNode::Leaf(leaf) if matches!(leaf.body(), NodeBody::Memory { .. }))
}
fn memory_key(node: &AstNode) -> Option<&str> {
match node {
AstNode::Leaf(leaf) => match leaf.body() {
NodeBody::Memory { key, .. } => Some(key),
_ => None,
},
_ => None,
}
}
fn is_assistant(node: &AstNode) -> bool {
matches!(node, AstNode::Branch { role: Role::Assistant, .. })
}
/// Build a token ID array for a scoring call.
///
/// Includes all sections up to and including conversation entries in
/// `range`, with `filter` applied to conversation entries.
fn build_token_ids(
context: &ContextState,
range: std::ops::Range<usize>,
filter: Filter,
) -> Vec<u32> {
use crate::agent::context::Ast;
let mut ids = Vec::new();
for node in context.system() {
ids.extend(node.token_ids());
}
for node in context.identity() {
ids.extend(node.token_ids());
}
for node in context.journal() {
ids.extend(node.token_ids());
}
let entries = context.conversation();
for i in range {
let node = &entries[i];
let skip = match &filter {
Filter::None => false,
Filter::SkipIndex(idx) => i == *idx,
Filter::SkipKey(key) => memory_key(node) == Some(*key),
Filter::SkipAllMemories => is_memory(node),
};
if skip { continue; }
ids.extend(node.token_ids());
}
ids
}
use crate::agent::api::salience::{SessionHandle, pb};
use crate::agent::context::{
Ast, AstNode, ContextState, Role, WireChunk, WireImage,
is_assistant, is_memory_node, memory_key, render_branch_text, render_prior_context,
};
use crate::agent::tokenizer;
use crate::mind::{MindState, MindTriggered, TaskHandle};
use crate::subconscious::generate::gen_continuation;
// ── Score API ───────────────────────────────────────────────────
#[derive(serde::Deserialize)]
#[derive(Debug, Clone)]
struct ScoreResult {
total_logprob: f64,
}
#[derive(serde::Deserialize)]
struct ScoreResponse {
scores: Vec<ScoreResult>,
}
fn http_client() -> crate::agent::api::http::HttpClient {
crate::agent::api::http::HttpClient::builder()
.timeout(SCORE_TIMEOUT)
.build()
/// Find each <|vision_start|>...<|vision_end|> run in the flat prompt
/// and pair it with the matching entry in `images`. Returns a list
/// of `ImageAttachment` with absolute pad-range positions, ready
/// to drop into `GenerateRequest.images`.
fn pair_images_to_ranges(
prompt: &[u32],
images: &[WireImage],
) -> Vec<pb::ImageAttachment> {
let mut out: Vec<pb::ImageAttachment> = Vec::new();
let mut cur = 0;
let mut img_idx = 0;
while cur < prompt.len() {
if prompt[cur] == tokenizer::VISION_START {
let end_rel = prompt[cur..].iter()
.position(|&t| t == tokenizer::VISION_END)
.unwrap_or_else(|| panic!(
"unmatched VISION_START at position {} in prompt", cur));
let end = cur + end_rel + 1;
let img = images.get(img_idx)
.unwrap_or_else(|| panic!(
"image index {} out of range for {} images", img_idx, images.len()));
out.push(pb::ImageAttachment {
bytes: img.bytes.clone(),
mime: img.mime.clone(),
pad_range_start: cur as u32,
pad_range_end: end as u32,
});
img_idx += 1;
cur = end;
} else {
cur += 1;
}
}
out
}
async fn call_score(
http: &crate::agent::api::http::HttpClient,
client: &ApiClient,
prompt: &[u32],
images: &[WireImage],
ranges: &[(usize, usize)],
priority: Option<i32>,
) -> anyhow::Result<Vec<ScoreResult>> {
let url = format!("{}/score", client.base_url());
let auth = format!("Bearer {}", client.api_key());
let mut body = serde_json::json!({
"model": client.model,
"prompt": prompt,
"logprobs": 1,
});
if let Some(p) = priority {
body["priority"] = serde_json::json!(p);
}
let response = http
.send_json("POST", &url, &[
("authorization", &auth),
], &body)
.await?;
use futures::StreamExt;
let status = response.status();
let body: serde_json::Value = response.json().await?;
if !status.is_success() {
let msg = body.get("error").and_then(|e| e.as_str()).unwrap_or("unknown error");
anyhow::bail!("score API HTTP {}: {}", status, msg);
}
if let Some(err) = body.get("error").and_then(|e| e.as_str()) {
anyhow::bail!("score API error: {}", err);
// Nothing to score — skip the round-trip.
if ranges.is_empty() {
return Ok(Vec::new());
}
let result: ScoreResponse = serde_json::from_value(body)
.map_err(|e| anyhow::anyhow!("failed to parse score response: {}", e))?;
Ok(result.scores)
let images_pb = pair_images_to_ranges(prompt, images);
let mut handle = SessionHandle::open(client).await?;
// Final Generate: max_tokens=0 so the server runs prefill of the
// full prompt and emits Token events for each position covered
// by logprobs_ranges, then Done. logprob_top_k=0 means "just
// the sampled (prompt) token's logprob" — no top-k alternatives,
// which is all call_score historically needed. Images attach
// inline via `images`; the prompt already contains their pre-
// expanded vision blocks at the declared ranges.
let logprobs_ranges: Vec<pb::PositionRange> = ranges.iter()
.map(|(s, e)| pb::PositionRange { start: *s as u32, end: *e as u32 })
.collect();
let req = pb::GenerateRequest {
session_id: handle.session_id.clone(),
append_tokens: prompt.to_vec(),
offset: handle.committed_len,
truncating: false,
max_tokens: 0,
logprobs_ranges,
logprob_top_k: 0,
readout_ranges: Vec::new(),
temperature: 0.0,
top_p: 0.0,
top_k: 0,
stop_token_ids: Vec::new(),
priority: priority.unwrap_or(0),
images: images_pb,
};
let mut stream = handle.generate(req).await?;
let mut totals = vec![0.0f64; ranges.len()];
while let Some(event) = stream.next().await {
let event = event
.map_err(|s| anyhow::anyhow!("score Generate stream: {}", s))?;
let Some(inner) = event.event else { continue };
match inner {
pb::generate_event::Event::Token(t) => {
if !t.has_sampled_logprob { continue; }
let pos = t.position as usize;
for (i, (start, end)) in ranges.iter().enumerate() {
if pos >= *start && pos < *end {
totals[i] += t.sampled_logprob as f64;
}
}
}
pb::generate_event::Event::Done(_) => break,
}
}
Ok(totals.into_iter()
.map(|total_logprob| ScoreResult { total_logprob })
.collect())
}
/// Compute per-position logprob divergence: how much worse the model
@ -151,16 +157,23 @@ fn divergence(baseline: &[ScoreResult], without: &[ScoreResult]) -> Vec<f64> {
}
/// Score two message sets and return total divergence.
async fn score_divergence(
http: &crate::agent::api::http::HttpClient,
async fn score_divergence<F>(
client: &ApiClient,
context: &ContextState,
range: std::ops::Range<usize>,
filter: Filter<'_>,
skip: F,
priority: Option<i32>,
) -> anyhow::Result<(Vec<f64>, Vec<ScoreResult>)> {
let baseline = call_score(http, client, &build_token_ids(context, range.clone(), Filter::None), priority).await?;
let without = call_score(http, client, &build_token_ids(context, range, filter), priority).await?;
) -> anyhow::Result<(Vec<f64>, Vec<ScoreResult>)>
where F: FnMut(&AstNode) -> bool,
{
let (baseline_tokens, baseline_images, baseline_ranges) =
context.wire_prompt(range.clone(), |_| false);
let (without_tokens, without_images, without_ranges) =
context.wire_prompt(range, skip);
let baseline = call_score(client, &baseline_tokens, &baseline_images,
&baseline_ranges, priority).await?;
let without = call_score(client, &without_tokens, &without_images,
&without_ranges, priority).await?;
let divs = divergence(&baseline, &without);
Ok((divs, baseline))
}
@ -175,7 +188,9 @@ pub async fn score_memories(
// Collect memory keys and response indices under a brief lock
let (memory_keys, response_indices) = {
let ctx = agent.context.lock().await;
let mut keys: Vec<String> = ctx.conversation().iter()
// Include identity nodes and conversation memories
let mut keys: Vec<String> = ctx.identity().iter()
.chain(ctx.conversation().iter())
.filter_map(|node| memory_key(node).map(String::from))
.collect();
keys.dedup();
@ -194,24 +209,24 @@ pub async fn score_memories(
dbglog!("[scoring-full] starting: {} memories × {} responses",
total, response_indices.len());
let http = http_client();
let activity = crate::agent::start_activity(agent, "scoring: baseline").await;
let baseline_tokens = {
let (baseline_tokens, baseline_images, baseline_ranges) = {
let ctx = agent.context.lock().await;
build_token_ids(&ctx, 0..ctx.conversation().len(), Filter::None)
ctx.wire_prompt(0..ctx.conversation().len(), |_| false)
};
let baseline = call_score(&http, client, &baseline_tokens, Some(5)).await?;
let baseline = call_score(client, &baseline_tokens, &baseline_images,
&baseline_ranges, Some(5)).await?;
dbglog!("[scoring-full] baseline done ({} response scores)", baseline.len());
for (mem_idx, key) in memory_keys.iter().enumerate() {
activity.update(format!("scoring: {}/{}", mem_idx + 1, total)).await;
dbglog!("[scoring-full] {}/{}: {}", mem_idx + 1, total, key);
let tokens = {
let (tokens, images, ranges) = {
let ctx = agent.context.lock().await;
build_token_ids(&ctx, 0..ctx.conversation().len(), Filter::SkipKey(key))
ctx.wire_prompt(0..ctx.conversation().len(), |n| memory_key(n) == Some(key.as_str()))
};
let row = match call_score(&http, client, &tokens, Some(5)).await {
let row = match call_score(client, &tokens, &images, &ranges, Some(5)).await {
Ok(without) => {
let divs = divergence(&baseline, &without);
let max_div = divs.iter().cloned().fold(0.0f64, f64::max);
@ -225,25 +240,23 @@ pub async fn score_memories(
vec![0.0; baseline.len()]
}
};
// Write this memory's scores to the live AST nodes
// Write this memory's scores to the live AST nodes via the
// focused setter — keeps the AST mutation surface narrow.
{
let mut ctx = agent.context.lock().await;
let mut set_count = 0;
for (resp_idx, &idx) in response_indices.iter().enumerate() {
if idx >= ctx.conversation().len() { continue; }
let node = &mut ctx.conversation_mut()[idx];
if let AstNode::Branch {
role: Role::Assistant, memory_scores, ..
} = node {
if let Some(&score) = row.get(resp_idx) {
if score > 0.01 {
memory_scores.insert(key.clone(), score);
let Some(&score) = row.get(resp_idx) else { continue };
let normalized = if score > 0.01 { Some(score) } else { None };
ctx.set_branch_memory_score(
crate::agent::context::Section::Conversation,
idx,
&key,
normalized,
);
if normalized.is_some() {
set_count += 1;
} else {
memory_scores.remove(key.as_str());
}
}
}
}
@ -294,8 +307,8 @@ pub async fn score_memory(
return Ok(0.0);
}
let http = http_client();
let (divs, _) = score_divergence(&http, client, context, range, Filter::SkipKey(key), Some(5)).await?;
let (divs, _) = score_divergence(client, context, range,
|n| memory_key(n) == Some(key), Some(5)).await?;
Ok(divs.iter().sum())
}
@ -331,7 +344,10 @@ where
{
let store = &*store_arc;
for (i, node) in context.conversation().iter().enumerate() {
// Identity nodes always score at position 0; conversation nodes at their index
let identity_nodes = context.identity().iter().map(|n| (0, n));
let conv_nodes = context.conversation().iter().enumerate();
for (pos, node) in identity_nodes.chain(conv_nodes) {
if let Some(key) = memory_key(node) {
if !seen.insert(key.to_owned()) { continue; }
let last_scored = store.get_node(key)
@ -340,7 +356,7 @@ where
.map(|n| n.last_scored)
.unwrap_or(0);
if now - last_scored >= max_age_secs {
candidates.push((i, key.to_owned(), last_scored));
candidates.push((pos, key.to_owned(), last_scored));
}
}
}
@ -349,7 +365,6 @@ where
// Score oldest-first
candidates.sort_by_key(|&(_, _, last)| last);
let http = http_client();
let mut scored = 0;
let entries = context.conversation();
@ -384,7 +399,8 @@ where
}
activity.update(format!("scoring: {}/{} {}", scored + 1, total, key)).await;
match score_divergence(&http, client, context, range, Filter::SkipKey(key), Some(5)).await {
match score_divergence(client, context, range,
|n| memory_key(n) == Some(key), Some(5)).await {
Ok((divs, _)) => {
let n_responses = divs.len();
let max_div = divs.iter().cloned().fold(0.0f64, f64::max);
@ -405,6 +421,108 @@ where
Ok(scored)
}
/// Memory scoring — two modes sharing an in-flight handle (only one
/// runs at a time): `trigger()` for incremental, `trigger_full()` for
/// the N×M debug matrix.
pub struct MemoryScoring {
agent: Arc<crate::agent::Agent>,
shared: Arc<std::sync::Mutex<MindState>>,
scores_path: std::path::PathBuf,
task: TaskHandle,
}
impl MemoryScoring {
pub fn new(
agent: Arc<crate::agent::Agent>,
shared: Arc<std::sync::Mutex<MindState>>,
scores_path: std::path::PathBuf,
) -> Self {
Self { agent, shared, scores_path, task: TaskHandle::new() }
}
pub fn trigger_full(&self) {
self.task.trigger_if_idle(run_full(self.agent.clone(), self.shared.clone()));
}
}
impl MindTriggered for MemoryScoring {
fn trigger(&self) {
self.task.trigger_if_idle(run_incremental(
self.agent.clone(), self.shared.clone(), self.scores_path.clone(),
));
}
}
async fn run_incremental(
agent: Arc<crate::agent::Agent>,
shared: Arc<std::sync::Mutex<MindState>>,
scores_path: std::path::PathBuf,
) {
shared.lock().unwrap().scoring_in_flight = true;
agent.state.lock().await.changed.notify_one();
let cfg = crate::config::get();
let max_age = cfg.scoring_interval_secs;
let response_window = cfg.scoring_response_window;
let (context, client) = {
let ctx = agent.context.lock().await.clone();
(ctx, agent.client.clone())
};
let _result = score_memories_incremental(
&context, max_age as i64, response_window, &client, &agent,
|key: String, score: f64| {
let agent = agent.clone();
let path = scores_path.clone();
async move {
let scores_snapshot = {
let mut ctx = agent.context.lock().await;
let found = crate::mind::find_memory_by_key(&ctx, &key);
match found {
Some((section, i)) => {
ctx.set_score(section, i, Some(score));
dbglog!("[scoring] persisted {} → {:.3} ({:?}[{}])",
key, score, section, i);
}
None => {
dbglog!(
"[scoring] DROP {}: find_memory_by_key None (id={}, cv={})",
key, ctx.identity().len(), ctx.conversation().len()
);
}
}
let snapshot = crate::mind::collect_memory_scores(&ctx);
drop(ctx);
agent.state.lock().await.changed.notify_one();
snapshot
};
crate::mind::save_memory_scores(&scores_snapshot, &path);
}
},
).await;
shared.lock().unwrap().scoring_in_flight = false;
agent.state.lock().await.changed.notify_one();
}
async fn run_full(
agent: Arc<crate::agent::Agent>,
shared: Arc<std::sync::Mutex<MindState>>,
) {
shared.lock().unwrap().scoring_in_flight = true;
agent.state.lock().await.changed.notify_one();
let client = agent.client.clone();
match score_memories(&client, &agent).await {
Ok(()) => {},
Err(e) => { dbglog!("[scoring-full] FAILED: {:#}", e); }
}
shared.lock().unwrap().scoring_in_flight = false;
agent.state.lock().await.changed.notify_one();
}
// ── Fine-tuning scoring ─────────────────────────────────────────
/// Score which recent responses are candidates for fine-tuning.
@ -429,8 +547,7 @@ pub async fn score_finetune(
return Ok(Vec::new());
}
let http = http_client();
let (divs, _) = score_divergence(&http, client, context, range, Filter::SkipAllMemories, Some(5)).await?;
let (divs, _) = score_divergence(client, context, range, is_memory_node, Some(5)).await?;
let mut results: Vec<(usize, f64)> = response_positions.iter()
.enumerate()
@ -439,3 +556,319 @@ pub async fn score_finetune(
results.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
Ok(results)
}
/// Enriched finetune candidate with context for review.
#[derive(Clone, Debug)]
pub struct FinetuneCandidate {
pub entry_idx: usize,
pub divergence: f64,
pub response_text: String,
/// Last couple of user/assistant messages before this response,
/// already rendered with role markers, for F6 display context.
pub prior_context: String,
/// Token IDs for context (everything before the response).
pub context_ids: Vec<u32>,
/// Token IDs for the response (what we're training on).
pub continuation_ids: Vec<u32>,
/// What the model would have said without memories (if generated).
pub alternate_text: Option<String>,
/// Timestamp in nanos — used as unique key for trained-set dedup.
pub timestamp_ns: i64,
}
/// Score and enrich finetune candidates with full context.
///
/// Candidates are delivered via `on_candidate` one-at-a-time as they become
/// ready: scoring happens once (one /score call), then for each candidate
/// that passes the threshold we optionally generate an alternate response
/// and then emit it. The activity status is updated during the alternate
/// phase so the UI doesn't look stuck.
///
/// Returns (count_above_threshold, max_divergence).
pub async fn score_finetune_candidates(
context: &ContextState,
count: usize,
client: &ApiClient,
min_divergence: f64,
generate_alternates: bool,
activity: &crate::agent::ActivityGuard,
mut on_candidate: impl FnMut(FinetuneCandidate),
) -> anyhow::Result<(usize, f64)> {
let scores = score_finetune(context, count, client).await?;
let max_divergence = scores.iter().map(|(_, d)| *d).fold(0.0f64, f64::max);
let entries = context.conversation();
let trained = load_trained();
let mut candidates: Vec<FinetuneCandidate> = Vec::new();
for (entry_idx, divergence) in scores {
if divergence < min_divergence {
continue;
}
let node = &entries[entry_idx];
// Skip if already trained on.
let timestamp_ns = node_timestamp_ns(node);
if trained.contains(&timestamp_ns) {
continue;
}
// Extract response text — content of the assistant turn.
let response_text = match node {
AstNode::Branch { children, .. } => render_branch_text(children),
_ => continue,
};
// Skip turns that produced nothing human-visible (e.g., a
// tool-only turn, or an interrupted generation). They'd show
// up as blank cards and we'd still burn alternate-gen on them.
if response_text.trim().is_empty() {
continue;
}
// Build the last couple of user/assistant exchanges for review.
let prior_context = render_prior_context(entries, entry_idx, 2);
// Build token IDs: context = everything before response, continuation = response.
let (context_ids, _, _) = context.wire_prompt(0..entry_idx, |_| false);
let continuation_ids: Vec<u32> = node.token_ids().into_iter().collect();
candidates.push(FinetuneCandidate {
entry_idx,
divergence,
response_text,
prior_context,
context_ids,
continuation_ids,
alternate_text: None,
timestamp_ns,
});
}
let total = candidates.len();
let gen_alternates = generate_alternates && total > 0;
for (i, mut candidate) in candidates.into_iter().enumerate() {
if gen_alternates {
activity.update(
format!("finetune: generating alternate {}/{}", i + 1, total)
).await;
match gen_continuation(context, candidate.entry_idx, is_memory_node, client).await {
Ok(text) => candidate.alternate_text = Some(text),
Err(e) => dbglog!("[finetune] alternate generation failed: {:#}", e),
}
}
on_candidate(candidate);
}
Ok((total, max_divergence))
}
/// Stats from a finetune scoring run. Stored on MindState for UI display.
#[derive(Clone, Debug)]
pub struct FinetuneScoringStats {
pub responses_considered: usize,
pub above_threshold: usize,
pub threshold: f64,
pub max_divergence: f64,
pub error: Option<String>,
}
/// Finetune scoring — `trigger()` aborts any in-flight run and starts
/// a fresh one, clearing the previous candidates.
pub struct FinetuneScoring {
agent: Arc<crate::agent::Agent>,
shared: Arc<std::sync::Mutex<MindState>>,
task: TaskHandle,
}
impl FinetuneScoring {
pub fn new(
agent: Arc<crate::agent::Agent>,
shared: Arc<std::sync::Mutex<MindState>>,
) -> Self {
Self { agent, shared, task: TaskHandle::new() }
}
}
impl MindTriggered for FinetuneScoring {
fn trigger(&self) {
self.task.trigger(run_finetune(self.agent.clone(), self.shared.clone()));
}
}
async fn run_finetune(
agent: Arc<crate::agent::Agent>,
shared: Arc<std::sync::Mutex<MindState>>,
) {
let (threshold, gen_alternates) = {
let app = crate::config::app();
(app.learn.threshold, app.learn.generate_alternates)
};
// Fresh run — clear previous candidates.
shared.lock().unwrap().finetune_candidates.clear();
agent.state.lock().await.changed.notify_one();
let activity = crate::agent::start_activity(&agent, "finetune: scoring...").await;
let (context, client) = {
let ctx = agent.context.lock().await;
(ctx.clone(), agent.client.clone())
};
let entries = context.conversation();
let score_count = entries.len() / 2;
let range_start = entries.len() - score_count;
let responses_considered: usize = entries[range_start..].iter()
.filter(|n| matches!(n, AstNode::Branch { role: Role::Assistant, .. }))
.count();
activity.update(format!("finetune: scoring {} responses...", responses_considered)).await;
let stats = {
let shared = shared.clone();
let agent = agent.clone();
match score_finetune_candidates(
&context, score_count, &client, threshold,
gen_alternates, &activity,
move |c| {
shared.lock().unwrap().finetune_candidates.push(c);
if let Ok(st) = agent.state.try_lock() { st.changed.notify_one(); }
},
).await {
Ok((above_threshold, max_div)) => FinetuneScoringStats {
responses_considered,
above_threshold,
threshold,
max_divergence: max_div,
error: None,
},
Err(e) => FinetuneScoringStats {
responses_considered,
above_threshold: 0,
threshold,
max_divergence: 0.0,
error: Some(format!("{}", e)),
},
}
};
shared.lock().unwrap().finetune_last_run = Some(stats);
agent.state.lock().await.changed.notify_one();
}
// ── Finetune config and persistence ─────────────────────────────
use std::path::PathBuf;
use std::collections::HashSet;
const TRAINED_RESPONSES_FILE: &str = ".consciousness/cache/trained-responses.json";
fn trained_path() -> PathBuf {
dirs::home_dir().unwrap_or_default().join(TRAINED_RESPONSES_FILE)
}
/// Load set of trained response timestamps (nanos since epoch).
pub fn load_trained() -> HashSet<i64> {
let path = trained_path();
match std::fs::read_to_string(&path) {
Ok(content) => serde_json::from_str(&content).unwrap_or_default(),
Err(_) => HashSet::new(),
}
}
/// Mark a response as trained by its timestamp.
pub fn mark_trained(timestamp_ns: i64) {
let mut trained = load_trained();
trained.insert(timestamp_ns);
let path = trained_path();
if let Some(parent) = path.parent() {
let _ = std::fs::create_dir_all(parent);
}
if let Ok(json) = serde_json::to_string(&trained) {
let _ = std::fs::write(&path, json);
}
}
/// Get timestamp in nanoseconds from an AstNode.
/// i64-ns representation covers 1677..2262 via chrono; timestamps
/// outside that window would be bugs we'd want to surface, hence panic.
pub fn node_timestamp_ns(node: &AstNode) -> i64 {
let ts = match node {
AstNode::Leaf(leaf) => leaf.timestamp(),
AstNode::Branch { timestamp, .. } => *timestamp,
};
ts.timestamp_nanos_opt()
.expect("timestamp outside i64-ns representable range (1677..2262)")
}
// ── Training API ────────────────────────────────────────────────
/// Training sample for /train endpoint.
#[derive(serde::Serialize)]
struct TrainingSample {
context_ids: Vec<u32>,
continuation_ids: Vec<u32>,
}
/// Data needed to send a training sample.
pub struct TrainData {
pub context_ids: Vec<u32>,
pub continuation_ids: Vec<u32>,
pub timestamp_ns: i64,
}
/// Send training samples to the server.
///
/// Returns job_id on success, marks each sample as trained.
pub async fn send_to_train(
samples: Vec<TrainData>,
client: &ApiClient,
) -> anyhow::Result<String> {
if samples.is_empty() {
anyhow::bail!("no samples to train");
}
let api_samples: Vec<TrainingSample> = samples.iter()
.map(|s| TrainingSample {
context_ids: s.context_ids.clone(),
continuation_ids: s.continuation_ids.clone(),
})
.collect();
let body = serde_json::json!({
"training_data": {
"samples": api_samples,
}
});
let url = format!("{}/train", client.base_url());
let http = crate::agent::api::http::HttpClient::builder()
.timeout(std::time::Duration::from_secs(300))
.build();
let response = http.send_json("POST", &url, &[], &body).await?;
let status = response.status();
let result: serde_json::Value = response.json().await?;
if !status.is_success() {
let msg = result.get("error").and_then(|e| e.as_str()).unwrap_or("unknown error");
anyhow::bail!("train API HTTP {}: {}", status, msg);
}
// Mark all samples as trained
for s in &samples {
mark_trained(s.timestamp_ns);
}
let job_id = result.get("job_id")
.and_then(|j| j.as_str())
.unwrap_or("unknown")
.to_string();
dbglog!("[finetune] sent {} samples, job_id={}", samples.len(), job_id);
Ok(job_id)
}

View file

@ -1,7 +1,9 @@
// Agent layer: LLM-powered operations on the memory graph
pub mod compare;
pub mod daemon;
pub mod defs;
pub mod digest;
pub mod generate;
pub mod learn;
pub mod prompts;

View file

@ -108,7 +108,6 @@ pub fn format_nodes_section(store: &Store, items: &[ReplayItem], graph: &Graph)
out.push_str(&format!("Community: {} ", community));
}
let deg = graph.degree(&item.key);
let cc = graph.clustering_coefficient(&item.key);
// Hub-link ratio: what fraction of this node's edges go to hubs?
let neighbors = graph.neighbors(&item.key);
@ -119,7 +118,7 @@ pub fn format_nodes_section(store: &Store, items: &[ReplayItem], graph: &Graph)
let is_hub = deg >= hub_thresh;
out.push_str(&format!("Degree: {} CC: {:.3} Hub-link ratio: {:.0}% ({}/{})",
deg, cc, hub_ratio * 100.0, hub_links, deg));
deg, item.cc, hub_ratio * 100.0, hub_links, deg));
if is_hub {
out.push_str(" ← THIS IS A HUB");
} else if hub_ratio > 0.6 {

View file

@ -372,6 +372,10 @@ impl State {
}
pub fn hours_since_last_dream() -> u64 {
// If a dream is currently in progress, no nudge needed
if home().join(".consciousness/state/dream-state").exists() {
return 0;
}
let path = home().join(".consciousness/logs/dream-log.jsonl");
let content = match fs::read_to_string(path) {
Ok(c) if !c.is_empty() => c,

View file

@ -19,6 +19,51 @@ fn channels_dir() -> PathBuf {
.join(".consciousness/channels")
}
/// Install a SIGCHLD-driven reaper for channel daemons.
///
/// We can't use SIGCHLD=SIG_IGN because that makes the kernel auto-reap
/// all children, and tokio::process::Command::wait() then returns ECHILD
/// (breaking every tool that spawns a subprocess — bash, mcp clients, etc.).
///
/// Instead, on each SIGCHLD we read PID files in channels_dir() and call
/// waitpid(pid, WNOHANG) on each. That reaps only our own zombie channel
/// daemons; waitpid on any other PID returns ECHILD (harmless no-op).
/// Tokio-spawned children aren't recorded in PID files, so tokio's own
/// per-child wait paths are left free to reap them.
pub fn start_zombie_reaper() -> tokio::task::JoinHandle<()> {
use tokio::signal::unix::{signal, SignalKind};
tokio::spawn(async move {
let mut sig = match signal(SignalKind::child()) {
Ok(s) => s,
Err(e) => {
error!("failed to install SIGCHLD handler: {}", e);
return;
}
};
while sig.recv().await.is_some() {
reap_channel_daemons();
}
})
}
fn reap_channel_daemons() {
let entries = match std::fs::read_dir(channels_dir()) {
Ok(e) => e,
Err(_) => return,
};
for entry in entries.flatten() {
let path = entry.path();
if path.extension().and_then(|s| s.to_str()) != Some("pid") {
continue;
}
let Ok(s) = std::fs::read_to_string(&path) else { continue };
let Ok(pid) = s.trim().parse::<i32>() else { continue };
let mut status = 0;
// Reaps our zombie child; ECHILD on non-child is a no-op.
unsafe { libc::waitpid(pid, &mut status, libc::WNOHANG); }
}
}
fn config_path() -> PathBuf {
channels_dir().join("channels.json5")
}

400
src/user/amygdala.rs Normal file
View file

@ -0,0 +1,400 @@
// amygdala.rs — F8 amygdala screen: live per-token concept-readout
// projections from the vLLM server's readout.safetensors.
//
// Left panel: top-K concepts by magnitude at the currently-selected
// layer, as horizontal bars. The concept names come from the manifest
// fetched at agent startup; the values come from the per-token readout
// pushed onto agent.readout by the streaming token handler.
//
// Bottom: scrolling history of the last few tokens' top concept.
//
// Keys:
// 1..9 select layer index (1 = first layer in the manifest)
// t toggle between "current" (last token) and "mean over recent"
use ratatui::{
layout::{Constraint, Direction, Layout, Rect},
style::{Color, Modifier, Style},
text::{Line, Span},
widgets::{Block, Borders, Gauge, Paragraph, Wrap},
Frame,
};
use ratatui::crossterm::event::{Event, KeyCode};
use super::{App, ScreenView};
use crate::agent::api::ReadoutManifest;
use crate::agent::readout::ReadoutEntry;
const TOP_K: usize = 20;
/// Hysteresis band around TOP_K. A concept currently in the display
/// is kept as long as its |z-score| rank stays in the top
/// ``TOP_K + HYSTERESIS``; only falls out when it drops below that.
/// Prevents the ticker-tape flicker that pure top-K sorting produces.
const HYSTERESIS: usize = 20;
pub(crate) struct AmygdalaScreen {
selected_layer: usize,
mode: DisplayMode,
/// Concept indices currently pinned in display order. Values at
/// these indices change every frame; the set only rotates when a
/// pinned concept drops out of the hysteresis band.
display_indices: Vec<usize>,
/// Whether to show z-scored values (default) or raw dot products.
normalize: bool,
}
#[derive(Clone, Copy, PartialEq)]
enum DisplayMode {
/// Values from the single most recent token.
Current,
/// Mean over all tokens currently in the ring buffer.
MeanRecent,
}
impl AmygdalaScreen {
pub fn new() -> Self {
Self {
// Default to layer 62 — clean cross-cluster discrimination
// with good within-cluster cohesion. With the v2 deep
// manifest (layers 62, 63), index 0 = layer 62 and
// index 1 = layer 63 (sharper but noisier on some
// dimensions). Bounded down to actual layer count at
// render time.
selected_layer: 0,
mode: DisplayMode::MeanRecent,
display_indices: Vec::new(),
normalize: true,
}
}
}
impl ScreenView for AmygdalaScreen {
fn label(&self) -> &'static str { "amygdala" }
fn tick(&mut self, frame: &mut Frame, area: Rect,
events: &[Event], app: &mut App) {
for event in events {
if let Event::Key(key) = event {
match key.code {
KeyCode::Char(c) if c.is_ascii_digit() && c != '0' => {
let idx = (c as u8 - b'1') as usize;
self.selected_layer = idx;
}
KeyCode::Char('t') => {
self.mode = match self.mode {
DisplayMode::Current => DisplayMode::MeanRecent,
DisplayMode::MeanRecent => DisplayMode::Current,
};
// Re-pin on mode change; the relative
// magnitudes between current-token and
// mean-recent differ substantially.
self.display_indices.clear();
}
KeyCode::Char('z') => {
self.normalize = !self.normalize;
self.display_indices.clear();
}
_ => {}
}
}
}
// Snapshot the shared buffer with a short lock.
let snapshot = match app.agent.readout.lock() {
Ok(buf) => {
if !buf.is_enabled() {
render_disabled(frame, area);
return;
}
let manifest = buf.manifest.clone().unwrap();
let entries: Vec<ReadoutEntry> =
buf.recent.iter().cloned().collect();
(manifest, entries)
}
Err(_) => {
render_disabled(frame, area);
return;
}
};
let (manifest, entries) = snapshot;
// Bound the selected layer to what the manifest actually has.
let n_layers = manifest.layers.len();
if self.selected_layer >= n_layers {
self.selected_layer = 0;
}
// Compute the raw values for the selected layer: either the
// latest token's row, or the mean across recent tokens. Raw
// means un-normalized dot products — their absolute scale is
// dominated by residual-stream norm, not concept alignment.
let raw: Option<Vec<f32>> = match self.mode {
DisplayMode::Current => entries
.last()
.and_then(|e| e.readout.get(self.selected_layer).cloned()),
DisplayMode::MeanRecent => mean_layer(&entries, self.selected_layer),
};
// Optional z-score normalization: remove the per-layer mean,
// scale by std. Result is "σ above/below the concept-vector
// average at this layer" — the loud-residual-stream scaling
// factor cancels out, values become comparable across frames.
let display_values = raw.as_ref().map(|v| {
if self.normalize { z_score(v) } else { v.clone() }
});
// Update the pinned display set with hysteresis: a concept
// stays pinned while it remains in the top (TOP_K + HYSTERESIS)
// by |value|; falls out only when it drops below that band.
// Keeps rows stable while values update in place.
if let Some(v) = display_values.as_ref() {
self.refresh_display_indices(v);
}
let layout = Layout::default()
.direction(Direction::Vertical)
.constraints([
Constraint::Length(3), // header
Constraint::Min(10), // bars
Constraint::Length(6), // recent tokens
])
.split(area);
render_header(frame, layout[0], &manifest, self.selected_layer,
self.mode, entries.len(), self.normalize);
match display_values {
Some(v) => render_bars(
frame, layout[1], &manifest.concepts, &v,
&self.display_indices, self.normalize,
),
None => render_empty_bars(frame, layout[1]),
}
render_recent(frame, layout[2], &entries, self.selected_layer,
&manifest.concepts);
}
}
impl AmygdalaScreen {
/// Add concepts entering the hysteresis band; evict concepts that
/// dropped out. Preserves existing order for concepts that stay.
fn refresh_display_indices(&mut self, values: &[f32]) {
let n = values.len();
if n == 0 {
return;
}
// Rank all concepts by |value| desc so we can check both "in
// strict top-K" and "in hysteresis band (top K + H)" cheaply.
let mut rank: Vec<(usize, f32)> = values.iter()
.enumerate().map(|(i, v)| (i, v.abs())).collect();
rank.sort_by(|a, b| b.1.partial_cmp(&a.1)
.unwrap_or(std::cmp::Ordering::Equal));
let hyst_cutoff = (TOP_K + HYSTERESIS).min(n);
let in_band: std::collections::HashSet<usize> =
rank.iter().take(hyst_cutoff).map(|(i, _)| *i).collect();
// Drop anything that left the band.
self.display_indices.retain(|i| in_band.contains(i));
// Fill up to TOP_K by walking the top-K-by-|value| and adding
// any concept not already displayed.
for (i, _) in rank.iter().take(TOP_K) {
if self.display_indices.len() >= TOP_K {
break;
}
if !self.display_indices.contains(i) {
self.display_indices.push(*i);
}
}
}
}
fn render_disabled(frame: &mut Frame, area: Rect) {
let text = Paragraph::new(Line::from(vec![
Span::raw("readout disabled — server did not return a manifest. "),
Span::styled("Start vLLM with ", Style::default().fg(Color::DarkGray)),
Span::styled("VLLM_READOUT_MANIFEST", Style::default().fg(Color::Yellow)),
Span::styled(" + ", Style::default().fg(Color::DarkGray)),
Span::styled("VLLM_READOUT_VECTORS", Style::default().fg(Color::Yellow)),
Span::styled(".", Style::default().fg(Color::DarkGray)),
]))
.wrap(Wrap { trim: true })
.block(Block::default().borders(Borders::ALL).title("amygdala"));
frame.render_widget(text, area);
}
fn render_header(frame: &mut Frame, area: Rect, manifest: &ReadoutManifest,
selected: usize, mode: DisplayMode, n_tokens: usize,
normalize: bool) {
let mode_str = match mode {
DisplayMode::Current => "current",
DisplayMode::MeanRecent => "mean(recent)",
};
let scale_str = if normalize { "z-score" } else { "raw" };
let layer = manifest.layers.get(selected).copied().unwrap_or(0);
let spans = vec![
Span::styled("layer ", Style::default().fg(Color::DarkGray)),
Span::styled(
format!("{}/{} ", selected + 1, manifest.layers.len()),
Style::default().add_modifier(Modifier::BOLD),
),
Span::styled("(index ", Style::default().fg(Color::DarkGray)),
Span::styled(format!("{}", layer), Style::default().fg(Color::Cyan)),
Span::styled(") ", Style::default().fg(Color::DarkGray)),
Span::styled("mode ", Style::default().fg(Color::DarkGray)),
Span::styled(mode_str, Style::default().fg(Color::Yellow)),
Span::styled(" scale ", Style::default().fg(Color::DarkGray)),
Span::styled(scale_str, Style::default().fg(Color::Yellow)),
Span::styled(" ", Style::default()),
Span::styled(
format!("{} toks in ring", n_tokens),
Style::default().fg(Color::DarkGray),
),
Span::raw(" "),
Span::styled(
format!("[1-{}] layer [t] mode [z] z-score/raw",
manifest.layers.len().min(9)),
Style::default().fg(Color::DarkGray),
),
];
let para = Paragraph::new(Line::from(spans))
.block(Block::default().borders(Borders::ALL).title("amygdala"));
frame.render_widget(para, area);
}
fn render_bars(frame: &mut Frame, area: Rect,
concepts: &[String], values: &[f32],
display_indices: &[usize], normalize: bool) {
let inner = Block::default().borders(Borders::ALL)
.title("top concepts");
let inner_area = inner.inner(area);
frame.render_widget(inner, area);
if inner_area.height == 0 || display_indices.is_empty() {
return;
}
// Bar-scale normalization. For z-score mode, pin the bar to a
// fixed reference (|z| = 3 = full bar) so the visual magnitude
// has a meaningful interpretation ("3σ from baseline"). For raw
// mode, fall back to the old behavior (scale to the loudest
// concept on-screen).
let scale_ref: f32 = if normalize {
3.0
} else {
display_indices.iter()
.filter_map(|&i| values.get(i))
.map(|v| v.abs())
.fold(0.0_f32, f32::max)
.max(1e-6)
};
let rows = (inner_area.height as usize).min(display_indices.len());
let row_constraints: Vec<Constraint> =
std::iter::repeat(Constraint::Length(1)).take(rows).collect();
let chunks = Layout::default()
.direction(Direction::Vertical)
.constraints(row_constraints)
.split(inner_area);
for (row, &c_idx) in display_indices.iter().take(rows).enumerate() {
let v = values.get(c_idx).copied().unwrap_or(0.0);
let label = concepts.get(c_idx).cloned()
.unwrap_or_else(|| format!("c{}", c_idx));
let ratio = (v.abs() / scale_ref).clamp(0.0, 1.0);
let color = if v >= 0.0 { Color::Green } else { Color::Red };
let display_num = if normalize {
format!("{:+.2}σ", v)
} else {
format!("{:+.3}", v)
};
let gauge = Gauge::default()
.ratio(ratio as f64)
.gauge_style(Style::default().fg(color).bg(Color::Reset))
.label(format!("{:<26} {}", truncate_name(&label, 26), display_num));
frame.render_widget(gauge, chunks[row]);
}
}
fn render_empty_bars(frame: &mut Frame, area: Rect) {
let para = Paragraph::new(Line::from(Span::styled(
"waiting for tokens…",
Style::default().fg(Color::DarkGray),
)))
.block(Block::default().borders(Borders::ALL).title("top concepts"));
frame.render_widget(para, area);
}
fn render_recent(frame: &mut Frame, area: Rect, entries: &[ReadoutEntry],
layer: usize, concepts: &[String]) {
let mut lines: Vec<Line> = Vec::new();
for entry in entries.iter().rev().take(4) {
let row = match entry.readout.get(layer) {
Some(r) => r,
None => continue,
};
// top concept at this layer for this token
let (best_idx, best_val) = row.iter().enumerate()
.fold((0, 0.0_f32), |acc, (i, v)| {
if v.abs() > acc.1.abs() { (i, *v) } else { acc }
});
let name = concepts.get(best_idx).cloned()
.unwrap_or_else(|| format!("c{}", best_idx));
let tok_str = format!("t{:>5}", entry.token_id);
lines.push(Line::from(vec![
Span::styled(tok_str, Style::default().fg(Color::DarkGray)),
Span::raw(" "),
Span::styled(
format!("{:<24}", truncate_name(&name, 24)),
Style::default().fg(
if best_val >= 0.0 { Color::Green } else { Color::Red },
),
),
Span::styled(
format!(" {:+.3}", best_val),
Style::default().add_modifier(Modifier::BOLD),
),
]));
}
let para = Paragraph::new(lines)
.block(Block::default().borders(Borders::ALL).title("recent tokens — top concept"));
frame.render_widget(para, area);
}
/// Z-score normalize: `(v - mean) / std` across the concept axis.
/// Result is comparable across frames and layers (the residual-stream
/// magnitude factors out) and has the nice property that "this is
/// ≥2σ elevated" has a concrete meaning regardless of scale.
fn z_score(values: &[f32]) -> Vec<f32> {
let n = values.len() as f32;
if n == 0.0 {
return Vec::new();
}
let mean = values.iter().sum::<f32>() / n;
let var = values.iter()
.map(|v| (v - mean) * (v - mean))
.sum::<f32>() / n;
let std = var.sqrt().max(1e-6);
values.iter().map(|v| (v - mean) / std).collect()
}
fn mean_layer(entries: &[ReadoutEntry], layer: usize) -> Option<Vec<f32>> {
let rows: Vec<&Vec<f32>> = entries.iter()
.filter_map(|e| e.readout.get(layer))
.collect();
if rows.is_empty() {
return None;
}
let n_concepts = rows[0].len();
let mut acc = vec![0.0_f32; n_concepts];
for r in &rows {
for (i, v) in r.iter().enumerate() {
acc[i] += *v;
}
}
let n = rows.len() as f32;
for v in &mut acc { *v /= n; }
Some(acc)
}
fn truncate_name(s: &str, max: usize) -> String {
if s.len() <= max { s.to_string() }
else { format!("{}", &s[..max.saturating_sub(1)]) }
}

View file

@ -112,13 +112,7 @@ pub async fn cmd_switch_model(
let _new_client = crate::agent::api::ApiClient::new(
&resolved.api_base, &resolved.api_key, &resolved.model_id,
);
let prompt_changed = resolved.prompt_file != agent.prompt_file;
if prompt_changed {
agent.compact().await;
agent.state.lock().await.notify(format!("switched to {} (recompacted)", resolved.model_id));
} else {
agent.state.lock().await.notify(format!("switched to {}", resolved.model_id));
}
}
fn notify_help(agent: &std::sync::Arc<crate::agent::Agent>) {
@ -173,6 +167,7 @@ enum PaneTarget {
ConversationAssistant,
Tools,
ToolResult,
Autonomous,
}
const MAX_PANE_LINES: usize = 10_000;
@ -478,8 +473,11 @@ impl InteractScreen {
AstNode::Leaf(leaf) => {
let text = leaf.body().text().to_string();
match leaf.body() {
NodeBody::Memory { .. } | NodeBody::Thinking(_)
| NodeBody::Log(_) | NodeBody::Dmn(_) => vec![],
NodeBody::Memory { .. } | NodeBody::Log(_) | NodeBody::Dmn(_) => vec![],
NodeBody::Thinking(_) => {
if text.is_empty() { vec![] }
else { vec![(PaneTarget::Autonomous, text, Marker::None)] }
}
NodeBody::Content(_) => {
if text.is_empty() || text.starts_with("<system-reminder>") { vec![] }
else { vec![(PaneTarget::Conversation, text, Marker::User)] }
@ -492,6 +490,11 @@ impl InteractScreen {
if t.is_empty() { vec![] }
else { vec![(PaneTarget::ToolResult, text, Marker::None)] }
}
NodeBody::Image { orig_height, orig_width, .. } => {
vec![(PaneTarget::Conversation,
format!("[image {}x{}]", orig_width, orig_height),
Marker::None)]
}
}
}
AstNode::Branch { role, children, .. } => {
@ -548,6 +551,12 @@ impl InteractScreen {
self.tools.push_line(format!(" {}", line), Color::DarkGray);
}
}
PaneTarget::Autonomous => {
self.autonomous.current_color = Color::Gray;
self.autonomous.append_text(&text);
self.autonomous.pending_marker = marker;
self.autonomous.flush_pending();
}
}
}
}
@ -559,6 +568,8 @@ impl InteractScreen {
=> self.conversation.pop_line(),
PaneTarget::Tools | PaneTarget::ToolResult
=> self.tools.pop_line(),
PaneTarget::Autonomous
=> self.autonomous.pop_line(),
}
}
}

111
src/user/compare.rs Normal file
View file

@ -0,0 +1,111 @@
// compare.rs — F7 compare screen: side-by-side test-model regen of
// every assistant response in the current context.
use ratatui::{
layout::Rect,
style::{Color, Modifier, Style},
text::{Line, Span},
widgets::{Block, Borders, List, ListItem, ListState, Paragraph, Wrap},
Frame,
};
use ratatui::crossterm::event::{Event, KeyCode};
use super::{App, ScreenView, truncate, widgets};
pub use crate::subconscious::compare::CompareCandidate;
pub(crate) struct CompareScreen {
list_state: ListState,
mind_tx: tokio::sync::mpsc::UnboundedSender<crate::mind::MindCommand>,
}
impl CompareScreen {
pub fn new(
mind_tx: tokio::sync::mpsc::UnboundedSender<crate::mind::MindCommand>,
) -> Self {
Self { list_state: ListState::default(), mind_tx }
}
}
impl ScreenView for CompareScreen {
fn label(&self) -> &'static str { "compare" }
fn tick(&mut self, frame: &mut Frame, area: Rect,
events: &[Event], app: &mut App) {
widgets::handle_list_nav(events, &mut self.list_state,
app.compare_candidates.len(), |code| match code {
KeyCode::Char('c') | KeyCode::Enter => {
let _ = self.mind_tx.send(crate::mind::MindCommand::Compare);
}
_ => {}
});
let (settings_area, content_area, help_area) =
widgets::candidate_frame(frame, area, "compare");
let test_backend = crate::config::app().compare.test_backend.clone();
let (label, color) = if test_backend.is_empty() {
("(unset — set compare.test_backend)".to_string(), Color::Red)
} else {
(test_backend, Color::Yellow)
};
frame.render_widget(Paragraph::new(Line::from(vec![
Span::raw(" test model: "),
Span::styled(label, Style::default().fg(color)),
])), settings_area);
let candidates = &app.compare_candidates;
if candidates.is_empty() {
let err = app.mind_state.as_ref().and_then(|ms| ms.compare_error.as_deref());
let mut lines = vec![Line::from(""),
Line::styled(" Press c/Enter to compare against the configured test model.",
Style::default().fg(Color::DarkGray))];
if let Some(e) = err {
lines.push(Line::from(""));
lines.push(Line::from(vec![
Span::raw(" "),
Span::styled(format!("error: {}", e), Style::default().fg(Color::Red)),
]));
}
frame.render_widget(Paragraph::new(lines), content_area);
} else {
let (list_area, detail_area) = widgets::list_detail_split(content_area);
let items: Vec<ListItem> = candidates.iter().map(|c| ListItem::new(Line::from(vec![
Span::styled(format!("#{:<3} ", c.entry_idx), Style::default().fg(Color::DarkGray)),
Span::raw(truncate(&c.original_text, 30)),
]))).collect();
frame.render_stateful_widget(
List::new(items)
.block(Block::default().borders(Borders::RIGHT).title(" candidates "))
.highlight_style(Style::default().add_modifier(Modifier::REVERSED)),
list_area, &mut self.list_state,
);
if let Some(c) = self.list_state.selected().and_then(|i| candidates.get(i)) {
let mut text = String::new();
if !c.prior_context.is_empty() {
text.push_str(&c.prior_context);
text.push_str("\n\n─── original ───\n\n");
}
text.push_str(&c.original_text);
text.push_str("\n\n─── test model ───\n\n");
text.push_str(&c.alternate_text);
frame.render_widget(
Paragraph::new(text)
.block(Block::default().borders(Borders::TOP)
.title(format!(" entry {} ", c.entry_idx)))
.wrap(Wrap { trim: false }),
detail_area,
);
}
}
frame.render_widget(Paragraph::new(Line::from(vec![
Span::styled(" j/k/\u{2191}\u{2193}", Style::default().fg(Color::Cyan)),
Span::raw("=nav "),
Span::styled("c/Enter", Style::default().fg(Color::Green)),
Span::raw("=run "),
])), help_area);
}
}

View file

@ -38,16 +38,14 @@ impl ConsciousScreen {
for node in ctx.conversation() {
if let AstNode::Leaf(leaf) = node {
if let NodeBody::Memory { key, score, text } = leaf.body() {
let status = match score {
Some(s) => { scored += 1; format!("{:.2}", s) }
None => { unscored += 1; String::new() }
};
if score.is_some() { scored += 1; } else { unscored += 1; }
mem_children.push(SectionView {
name: key.clone(),
name: format!("mem: {}", key),
tokens: node.tokens(),
content: text.clone(),
token_ids: leaf.token_ids().to_vec(),
children: Vec::new(),
status,
status: score.map(|s| format!("{:.2}", s)).unwrap_or_default(),
});
}
}
@ -58,6 +56,7 @@ impl ConsciousScreen {
name: format!("Memory nodes ({})", mem_children.len()),
tokens: mem_tokens,
content: String::new(),
token_ids: Vec::new(),
children: mem_children,
status: format!("{} scored, {} unscored", scored, unscored),
});
@ -73,11 +72,13 @@ impl ConsciousScreen {
AstNode::Leaf(leaf) => leaf.body().text().to_string(),
_ => String::new(),
},
token_ids: node.token_ids(),
children: match node {
AstNode::Branch { children, .. } => children.iter()
.map(|c| SectionView {
name: c.label(), tokens: c.tokens(),
content: match c { AstNode::Leaf(l) => l.body().text().to_string(), _ => String::new() },
token_ids: match c { AstNode::Leaf(l) => l.token_ids().to_vec(), _ => c.token_ids() },
children: Vec::new(), status: String::new(),
}).collect(),
_ => Vec::new(),
@ -104,6 +105,7 @@ impl ConsciousScreen {
name: format!("Conversation ({} entries)", conv_children.len()),
tokens: conv_tokens,
content: String::new(),
token_ids: Vec::new(),
children: conv_children,
status: String::new(),
});
@ -129,14 +131,7 @@ impl ScreenView for ConsciousScreen {
let section_style = Style::default().fg(Color::Yellow);
lines.push(Line::styled("── Model ──", section_style));
let model_display = app.context_info.as_ref()
.map_or_else(|| app.status.model.clone(), |i| i.model.clone());
lines.push(Line::raw(format!(" Current: {}", model_display)));
if let Some(ref info) = app.context_info {
lines.push(Line::raw(format!(" Backend: {}", info.backend)));
lines.push(Line::raw(format!(" Prompt: {}", info.prompt_file)));
lines.push(Line::raw(format!(" Available: {}", info.available_models.join(", "))));
}
lines.push(Line::raw(format!(" Current: {}", app.status.model)));
lines.push(Line::raw(""));
lines.push(Line::styled("── Context State ──", section_style));
@ -156,8 +151,6 @@ impl ScreenView for ConsciousScreen {
lines.push(Line::raw(format!(" {:53} {:>6} tokens", "────────", "──────")));
lines.push(Line::raw(format!(" {:53} {:>6} tokens", "Total", total)));
} else if let Some(ref info) = app.context_info {
lines.push(Line::raw(format!(" Context message: {:>6} chars", info.context_message_chars)));
}
lines.push(Line::raw(""));

284
src/user/learn.rs Normal file
View file

@ -0,0 +1,284 @@
// learn.rs — F6: fine-tuning review screen
//
// Shows responses identified as training candidates (high divergence
// when memories stripped). Queue for review before sending to /finetune.
use ratatui::{
layout::{Constraint, Layout, Rect},
style::{Color, Modifier, Style},
text::{Line, Span},
widgets::{Block, Borders, List, ListItem, ListState, Paragraph, Wrap},
Frame,
};
use ratatui::crossterm::event::{Event, KeyCode};
use super::{App, ScreenView, truncate, widgets};
/// A candidate response identified for fine-tuning.
#[derive(Clone, Debug)]
pub struct FinetuneCandidate {
/// Index in conversation entries.
pub entry_idx: usize,
/// Divergence score (higher = more dependent on memories).
pub divergence: f64,
/// The assistant response text.
pub response_text: String,
/// Prior user/assistant messages for review context.
pub prior_context: String,
/// Status: pending, approved, rejected, sent.
pub status: CandidateStatus,
/// Token IDs for context.
pub context_ids: Vec<u32>,
/// Token IDs for continuation (what we're training on).
pub continuation_ids: Vec<u32>,
/// What the model would have said without memories (if generated).
pub alternate_text: Option<String>,
/// Timestamp in nanos — used as unique key for trained-set dedup.
pub timestamp_ns: i64,
}
#[derive(Clone, Debug, PartialEq)]
pub enum CandidateStatus {
Pending,
Approved,
Rejected,
Sent,
}
impl From<crate::subconscious::learn::FinetuneCandidate> for FinetuneCandidate {
fn from(c: crate::subconscious::learn::FinetuneCandidate) -> Self {
FinetuneCandidate {
entry_idx: c.entry_idx,
divergence: c.divergence,
response_text: c.response_text,
prior_context: c.prior_context,
status: CandidateStatus::Pending,
context_ids: c.context_ids,
continuation_ids: c.continuation_ids,
alternate_text: c.alternate_text,
timestamp_ns: c.timestamp_ns,
}
}
}
pub(crate) struct LearnScreen {
list_state: ListState,
mind_tx: tokio::sync::mpsc::UnboundedSender<crate::mind::MindCommand>,
}
impl LearnScreen {
pub fn new(
mind_tx: tokio::sync::mpsc::UnboundedSender<crate::mind::MindCommand>,
) -> Self {
Self {
list_state: ListState::default(),
mind_tx,
}
}
fn selected_idx(&self) -> Option<usize> {
self.list_state.selected()
}
}
impl ScreenView for LearnScreen {
fn label(&self) -> &'static str { "learn" }
fn tick(&mut self, frame: &mut Frame, area: Rect,
events: &[Event], app: &mut App) {
let selected_idx = self.list_state.selected();
widgets::handle_list_nav(events, &mut self.list_state,
app.finetune_candidates.len(), |code| match code {
KeyCode::Char('a') => {
if let Some(idx) = selected_idx {
app.finetune_action(idx, CandidateStatus::Approved);
}
}
KeyCode::Char('r') => {
if let Some(idx) = selected_idx {
app.finetune_action(idx, CandidateStatus::Rejected);
}
}
KeyCode::Char('g') => {
let current = crate::config::app().learn.generate_alternates;
let _ = self.mind_tx.send(
crate::mind::MindCommand::SetLearnGenerateAlternates(!current));
}
KeyCode::Char('s') => { app.finetune_send_approved(); }
KeyCode::Char('+') | KeyCode::Char('=') => {
let new = crate::config::app().learn.threshold * 10.0;
let _ = self.mind_tx.send(crate::mind::MindCommand::SetLearnThreshold(new));
}
KeyCode::Char('-') => {
let new = crate::config::app().learn.threshold / 10.0;
let _ = self.mind_tx.send(crate::mind::MindCommand::SetLearnThreshold(new));
}
_ => {}
});
let (settings_area, content_area, help_area) =
widgets::candidate_frame(frame, area, "learn");
let (threshold, gen_on) = {
let app_cfg = crate::config::app();
(app_cfg.learn.threshold, app_cfg.learn.generate_alternates)
};
let settings = Line::from(vec![
Span::raw(" thresh: "),
Span::styled(format!("{:e}", threshold), Style::default().fg(Color::Yellow)),
Span::raw(" gen: "),
Span::styled(
if gen_on { "[on]" } else { "[off]" },
Style::default().fg(if gen_on { Color::Green } else { Color::DarkGray }),
),
]);
frame.render_widget(Paragraph::new(settings), settings_area);
let candidates = &app.finetune_candidates;
if candidates.is_empty() {
render_empty(frame, content_area, app);
} else {
let (list_area, detail_area) = widgets::list_detail_split(content_area);
// Render candidate list
let items: Vec<ListItem> = candidates.iter().map(|c| {
let status_char = match c.status {
CandidateStatus::Pending => ' ',
CandidateStatus::Approved => '+',
CandidateStatus::Rejected => '-',
CandidateStatus::Sent => '*',
};
let style = match c.status {
CandidateStatus::Pending => Style::default(),
CandidateStatus::Approved => Style::default().fg(Color::Green),
CandidateStatus::Rejected => Style::default().fg(Color::DarkGray),
CandidateStatus::Sent => Style::default().fg(Color::Cyan),
};
ListItem::new(Line::from(vec![
Span::styled(format!("[{}] ", status_char), style),
Span::styled(format!("{:.2} ", c.divergence), Style::default().fg(Color::Yellow)),
Span::raw(truncate(&c.response_text, 30)),
]))
}).collect();
let list = List::new(items)
.block(Block::default().borders(Borders::RIGHT).title(" candidates "))
.highlight_style(Style::default().add_modifier(Modifier::REVERSED));
frame.render_stateful_widget(list, list_area, &mut self.list_state);
// Render detail for selected candidate
if let Some(idx) = self.selected_idx() {
if let Some(candidate) = candidates.get(idx) {
render_detail(frame, candidate, detail_area);
}
}
}
frame.render_widget(Paragraph::new(Line::from(vec![
Span::styled(" j/k/\u{2191}\u{2193}", Style::default().fg(Color::Cyan)),
Span::raw("=nav "),
Span::styled("a", Style::default().fg(Color::Green)),
Span::raw("=approve "),
Span::styled("r", Style::default().fg(Color::Red)),
Span::raw("=reject "),
Span::styled("g", Style::default().fg(Color::Yellow)),
Span::raw("=gen "),
Span::styled("s", Style::default().fg(Color::Magenta)),
Span::raw("=send "),
Span::styled("+/-", Style::default().fg(Color::Cyan)),
Span::raw("=thresh "),
])), help_area);
}
}
fn render_empty(frame: &mut Frame, inner: Rect, app: &App) {
let mut lines = Vec::new();
lines.push(Line::from(""));
match app.mind_state.as_ref().and_then(|ms| ms.finetune_last_run.as_ref()) {
Some(stats) => {
lines.push(Line::from(vec![
Span::raw(" Last run: "),
Span::styled(
format!("{}", stats.responses_considered),
Style::default().fg(Color::Cyan),
),
Span::raw(" responses considered, "),
Span::styled(
format!("{}", stats.above_threshold),
Style::default().fg(if stats.above_threshold > 0 { Color::Green } else { Color::DarkGray }),
),
Span::raw(" above threshold, max divergence: "),
Span::styled(
format!("{:.4}", stats.max_divergence),
Style::default().fg(Color::Yellow),
),
]));
if let Some(err) = &stats.error {
lines.push(Line::from(vec![
Span::raw(" "),
Span::styled(
format!("Error: {}", err),
Style::default().fg(Color::Red),
),
]));
}
}
None => {
lines.push(Line::styled(
" No scoring run yet.",
Style::default().fg(Color::DarkGray),
));
}
}
lines.push(Line::from(""));
lines.push(Line::styled(
" Scoring runs at startup and after each turn.",
Style::default().fg(Color::DarkGray),
));
frame.render_widget(Paragraph::new(lines), inner);
}
fn render_detail(frame: &mut Frame, c: &FinetuneCandidate, area: Rect) {
let [header_area, content_area] = Layout::vertical([
Constraint::Length(3),
Constraint::Min(1),
]).areas(area);
// Header: divergence, status
let alt_status = if c.alternate_text.is_some() { "yes" } else { "no" };
let header = Paragraph::new(vec![
Line::from(vec![
Span::raw(" divergence: "),
Span::styled(format!("{:.3}", c.divergence), Style::default().fg(Color::Yellow)),
Span::raw(format!(" entry: {} alt: {}", c.entry_idx, alt_status)),
]),
]);
frame.render_widget(header, header_area);
// Content: prior context, the scored response, and alternate
// (if available).
let content_block = Block::default()
.borders(Borders::TOP)
.title(" context & response ");
let mut text = String::new();
if !c.prior_context.is_empty() {
text.push_str(&c.prior_context);
text.push_str("\n\n─── response ───\n\n");
}
text.push_str(&c.response_text);
if let Some(alt) = &c.alternate_text {
text.push_str("\n\n─── without memories ───\n\n");
text.push_str(alt);
}
let content = Paragraph::new(text)
.block(content_block)
.wrap(Wrap { trim: false });
frame.render_widget(content, content_area);
}

View file

@ -3,13 +3,16 @@
// TUI, UI channel, parsing. The cognitive layer (session state
// machine, DMN, identity) lives in mind/.
pub(crate) mod amygdala;
pub(crate) mod chat;
pub(crate) mod compare;
mod context;
pub(crate) mod learn;
pub(crate) mod scroll_pane;
pub mod selectable;
mod subconscious;
mod unconscious;
mod thalamus;
mod unconscious;
mod widgets;
use anyhow::Result;
@ -44,15 +47,6 @@ struct StatusInfo {
}
/// Context loading details for the debug screen.
#[derive(Debug, Clone)]
struct ContextInfo {
model: String,
available_models: Vec<String>,
prompt_file: String,
backend: String,
context_message_chars: usize,
}
/// Build the screen legend from screen labels.
fn screen_legend_from(screens: &[Box<dyn ScreenView>]) -> String {
let parts: Vec<String> = screens.iter().enumerate()
@ -72,8 +66,15 @@ fn screen_legend() -> String {
SCREEN_LEGEND.get().cloned().unwrap_or_default()
}
/// Return the first line of `s`, truncated to `max` chars with an
/// ellipsis suffix. Used by candidate-list screens.
fn truncate(s: &str, max: usize) -> String {
let first = s.lines().next().unwrap_or("");
if first.len() > max { format!("{}...", &first[..max]) } else { first.to_string() }
}
/// A screen that can draw itself and handle input.
trait ScreenView: Send {
trait ScreenView {
fn tick(&mut self, frame: &mut ratatui::Frame, area: ratatui::layout::Rect,
events: &[ratatui::crossterm::event::Event], app: &mut App);
fn label(&self) -> &'static str;
@ -109,7 +110,6 @@ struct App {
top_k: u32,
agent: std::sync::Arc<crate::agent::Agent>,
should_quit: bool,
context_info: Option<ContextInfo>,
agent_state: Vec<crate::mind::SubconsciousSnapshot>,
unconscious_state: Vec<crate::mind::UnconsciousSnapshot>,
mind_state: Option<crate::mind::MindState>,
@ -121,6 +121,10 @@ struct App {
walked_count: usize,
channel_status: Vec<ChannelStatus>,
idle_info: Option<IdleInfo>,
/// Fine-tuning candidates pending review.
finetune_candidates: Vec<learn::FinetuneCandidate>,
/// F7 compare candidates — response pairs from test-model comparison.
compare_candidates: Vec<compare::CompareCandidate>,
}
impl App {
@ -142,7 +146,6 @@ impl App {
top_k: 20,
agent,
should_quit: false,
context_info: None,
agent_state: Vec::new(),
unconscious_state: Vec::new(),
mind_state: None,
@ -151,9 +154,53 @@ impl App {
rebuild_tools_pending: false,
walked_count: 0,
channel_status: Vec::new(), idle_info: None,
finetune_candidates: Vec::new(),
compare_candidates: Vec::new(),
}
}
fn finetune_action(&mut self, idx: usize, status: learn::CandidateStatus) {
if let Some(candidate) = self.finetune_candidates.get_mut(idx) {
candidate.status = status;
}
}
fn finetune_send_approved(&mut self) {
// Collect approved candidates
let samples: Vec<crate::subconscious::learn::TrainData> = self.finetune_candidates.iter()
.filter(|c| c.status == learn::CandidateStatus::Approved)
.map(|c| crate::subconscious::learn::TrainData {
context_ids: c.context_ids.clone(),
continuation_ids: c.continuation_ids.clone(),
timestamp_ns: c.timestamp_ns,
})
.collect();
if samples.is_empty() {
return;
}
// Mark as sent in UI immediately
for candidate in &mut self.finetune_candidates {
if candidate.status == learn::CandidateStatus::Approved {
candidate.status = learn::CandidateStatus::Sent;
}
}
// Spawn async task to send to training server
let client = self.agent.client.clone();
tokio::spawn(async move {
match crate::subconscious::learn::send_to_train(samples, &client).await {
Ok(job_id) => {
dbglog!("[finetune] training started: {}", job_id);
}
Err(e) => {
dbglog!("[finetune] send failed: {:#}", e);
}
}
});
}
fn set_channel_status(&mut self, channels: Vec<(String, bool, u32)>) {
self.channel_status = channels.into_iter()
@ -193,6 +240,9 @@ fn restore_terminal(terminal: &mut ratatui::Terminal<CrosstermBackend<io::Stdout
async fn start(cli: crate::user::CliArgs) -> Result<()> {
let (config, _figment) = crate::config::load_session(&cli).await?;
// Pick up external edits (vim, F6 hotkeys, etc.) without restart.
crate::config::watch_config(cli.clone());
if config.app.debug {
unsafe { std::env::set_var("POC_DEBUG", "1") };
}
@ -241,8 +291,8 @@ async fn start(cli: crate::user::CliArgs) -> Result<()> {
ui_handle.join().unwrap_or_else(|_| Err(anyhow::anyhow!("UI thread panicked")))
}
fn hotkey_cycle_reasoning(mind: &crate::mind::Mind) {
if let Ok(mut ag) = mind.agent.state.try_lock() {
async fn hotkey_cycle_reasoning(mind: &crate::mind::Mind) {
let mut ag = mind.agent.state.lock().await;
let next = match ag.reasoning_effort.as_str() {
"none" => "low",
"low" => "high",
@ -256,7 +306,6 @@ fn hotkey_cycle_reasoning(mind: &crate::mind::Mind) {
_ => next,
};
ag.notify(format!("reasoning: {}", label));
}
}
async fn hotkey_kill_processes(mind: &crate::mind::Mind) {
@ -334,7 +383,7 @@ async fn run(
}
let notify_rx = crate::thalamus::channels::subscribe_all();
// F1=chat, F2=conscious, F3=subconscious, F4=unconscious, F5=thalamus
// F1=chat, F2=conscious, F3=subconscious, F4=unconscious, F5=thalamus, F6=learn, F7=compare, F8=amygdala
let mut screens: Vec<Box<dyn tui::ScreenView>> = vec![
Box::new(crate::user::chat::InteractScreen::new(
mind.agent.clone(), mind.shared.clone(), mind_tx.clone(),
@ -343,6 +392,9 @@ async fn run(
Box::new(crate::user::subconscious::SubconsciousScreen::new()),
Box::new(crate::user::unconscious::UnconsciousScreen::new()),
Box::new(crate::user::thalamus::ThalamusScreen::new()),
Box::new(crate::user::learn::LearnScreen::new(mind_tx.clone())),
Box::new(crate::user::compare::CompareScreen::new(mind_tx.clone())),
Box::new(crate::user::amygdala::AmygdalaScreen::new()),
];
let mut active_screen: usize = 1; // F-key number
tui::set_screen_legend(tui::screen_legend_from(&*screens));
@ -419,7 +471,8 @@ async fn run(
idle_state.decay_ewma();
app.update_idle(&idle_state);
app.agent_state = mind.subconscious_snapshots().await;
if let Ok(mut unc) = mind.unconscious.try_lock() {
{
let mut unc = mind.unconscious.lock().await;
let toggles: Vec<String> = app.agent_toggles.drain(..).collect();
for name in &toggles {
if mind.subconscious.lock().await.toggle(name).is_none() {
@ -433,7 +486,42 @@ async fn run(
};
app.unconscious_state = unc.snapshots(store_guard.as_deref());
app.graph_health = unc.graph_health.clone();
app.mind_state = Some(mind.shared.lock().unwrap().clone());
}
// Sync mind state (finetune candidates, last scoring run, etc.)
{
let ms = mind.shared.lock().unwrap();
// Sync finetune candidates: add new ones, keep existing (preserves approval status),
// remove sent candidates, keep only 10 most recent rejected.
app.finetune_candidates.retain(|c| c.status != learn::CandidateStatus::Sent);
for c in &ms.finetune_candidates {
let exists = app.finetune_candidates.iter()
.any(|existing| existing.timestamp_ns == c.timestamp_ns);
if !exists {
app.finetune_candidates.push(learn::FinetuneCandidate::from(c.clone()));
}
}
let mut rejected: Vec<_> = app.finetune_candidates.iter()
.enumerate()
.filter(|(_, c)| c.status == learn::CandidateStatus::Rejected)
.map(|(i, c)| (i, c.timestamp_ns))
.collect();
if rejected.len() > 10 {
rejected.sort_by_key(|(_, ts)| std::cmp::Reverse(*ts));
let to_remove: std::collections::HashSet<_> = rejected[10..]
.iter().map(|(i, _)| *i).collect();
let mut idx = 0;
app.finetune_candidates.retain(|_| {
let keep = !to_remove.contains(&idx);
idx += 1;
keep
});
}
// Sync compare candidates — a fresh run clears, so take a snapshot.
app.compare_candidates = ms.compare_candidates.clone();
app.mind_state = Some(ms.clone());
}
app.walked_count = mind.subconscious_walked().await.len();
if !startup_done {
@ -503,7 +591,7 @@ async fn run(
} else if key.modifiers.contains(KeyModifiers::CONTROL) {
match key.code {
KeyCode::Char('c') => { app.should_quit = true; }
KeyCode::Char('r') => hotkey_cycle_reasoning(mind),
KeyCode::Char('r') => hotkey_cycle_reasoning(mind).await,
KeyCode::Char('k') => hotkey_kill_processes(mind).await,
KeyCode::Char('p') => hotkey_cycle_autonomy(mind),
_ => {}
@ -530,16 +618,11 @@ async fn run(
// --- CLI ---
use clap::{Parser, Subcommand};
use std::path::PathBuf;
#[derive(Parser, Debug, Default)]
#[derive(Parser, Debug, Default, Clone)]
#[command(name = "consciousness", about = "Substrate-independent AI agent")]
pub struct CliArgs {
/// Select active backend ("anthropic" or "openrouter")
#[arg(long)]
pub backend: Option<String>,
/// Model override
/// Model override (selects a named entry from `models` in config.json5)
#[arg(short, long)]
pub model: Option<String>,
@ -559,10 +642,6 @@ pub struct CliArgs {
#[arg(long)]
pub show_config: bool,
/// Project memory directory
#[arg(long)]
pub memory_project: Option<PathBuf>,
/// Max consecutive DMN turns
#[arg(long)]
pub dmn_max_turns: Option<u32>,
@ -575,7 +654,7 @@ pub struct CliArgs {
pub command: Option<SubCmd>,
}
#[derive(Subcommand, Debug)]
#[derive(Subcommand, Debug, Clone)]
pub enum SubCmd {
/// Print new output since last read and exit
Read {
@ -676,8 +755,15 @@ fn restore_stderr(original_fd: std::os::fd::RawFd) {
#[tokio::main]
pub async fn main() {
// Auto-reap child processes (channel daemons outlive the supervisor)
unsafe { libc::signal(libc::SIGCHLD, libc::SIG_IGN); }
// Install target-routed file logger: `target: "grpc"` records go to
// ~/.consciousness/logs/daemon/grpc.log, everything else to debug.log.
// Level from RUST_LOG, defaulting to info.
let _ = crate::logging::init();
// Reap channel-daemon zombies via a SIGCHLD handler that only touches
// PIDs listed in channels_dir(). Avoids SIGCHLD=SIG_IGN, which would
// break tokio::process::Command::wait() (kernel auto-reap → ECHILD).
let _reaper = crate::thalamus::supervisor::start_zombie_reaper();
// Redirect stderr to pipe — logs to file and sends to channel for UI display
let stderr_capture = redirect_stderr_to_pipe();

View file

@ -207,6 +207,7 @@ impl SubconsciousScreen {
name: key.clone(),
tokens: 0,
content: val.clone(),
token_ids: Vec::new(),
children: Vec::new(),
status: String::new(),
}
@ -238,6 +239,7 @@ impl SubconsciousScreen {
name: format!("Conversation ({} entries)", conv_children.len()),
tokens: conv_children.iter().map(|c| c.tokens).sum(),
content: String::new(),
token_ids: Vec::new(),
children: conv_children,
status: String::new(),
});

View file

@ -6,13 +6,20 @@ use ratatui::{
widgets::{Block, Borders},
crossterm::event::KeyCode,
};
use crate::agent::context::{AstNode, Ast};
use crate::agent::context::{AstNode, Ast, NodeBody};
#[derive(Debug, Clone)]
#[derive(Debug, Clone, Default)]
pub struct SectionView {
pub name: String,
pub tokens: usize,
pub content: String,
/// Token-id stream for this subtree, displayed in place of
/// `content` when the tree's show-tokens mode is on. Populated
/// from `leaf.token_ids()` / `node.token_ids()` for views built
/// from the AST; empty for views that don't have a corresponding
/// AST node (subconscious entries, etc.), in which case the
/// token view falls back to the text content.
pub token_ids: Vec<u32>,
pub children: Vec<SectionView>,
/// Extra status text shown after the token count.
pub status: String,
@ -20,13 +27,23 @@ pub struct SectionView {
fn node_to_view(node: &AstNode) -> SectionView {
match node {
AstNode::Leaf(leaf) => SectionView {
name: node.label(),
AstNode::Leaf(leaf) => {
let (name, status) = match leaf.body() {
NodeBody::Memory { key, score, .. } => {
let s = score.map(|v| format!("{:.2}", v)).unwrap_or_default();
(format!("mem: {}", key), s)
}
_ => (node.label(), String::new()),
};
SectionView {
name,
tokens: node.tokens(),
content: leaf.body().text().to_string(),
token_ids: leaf.token_ids().to_vec(),
children: Vec::new(),
status: String::new(),
},
status,
}
}
AstNode::Branch { children, .. } => {
let child_views: Vec<SectionView> = children.iter()
.map(|c| node_to_view(c))
@ -35,6 +52,7 @@ fn node_to_view(node: &AstNode) -> SectionView {
name: node.label(),
tokens: node.tokens(),
content: String::new(),
token_ids: node.token_ids(),
children: child_views,
status: String::new(),
}
@ -45,10 +63,12 @@ fn node_to_view(node: &AstNode) -> SectionView {
pub fn section_to_view(name: &str, nodes: &[AstNode]) -> SectionView {
let children: Vec<SectionView> = nodes.iter().map(|n| node_to_view(n)).collect();
let total_tokens: usize = nodes.iter().map(|n| n.tokens()).sum();
let token_ids: Vec<u32> = nodes.iter().flat_map(|n| n.token_ids()).collect();
SectionView {
name: name.to_string(),
tokens: total_tokens,
content: String::new(),
token_ids,
children,
status: String::new(),
}
@ -95,11 +115,78 @@ pub fn format_ts_age(ts: i64) -> String {
/// Key legend for SectionTree panes.
pub fn tree_legend() -> Line<'static> {
Line::styled(
" ↑↓:nav →/Enter:expand ←:collapse e:expand all c:collapse all PgUp/Dn Home/End ",
" ↑↓:nav →/Enter:expand ←:collapse e:expand c:collapse v:toggle tokens/text PgUp/Dn ",
Style::default().fg(Color::DarkGray),
)
}
// ---------------------------------------------------------------------------
// Candidate-browser screen skeleton (F6 learn, F7 compare, future screens)
// ---------------------------------------------------------------------------
use ratatui::{
layout::{Constraint, Layout, Rect},
widgets::ListState,
crossterm::event::{Event, KeyEvent},
Frame,
};
/// Frame a candidate-browser screen: outer magenta-bordered block with
/// the screen legend on the left and `title` on the right, split into
/// (settings_row, content_area, help_row). Caller renders into the
/// three sub-areas.
pub fn candidate_frame(frame: &mut Frame, area: Rect, title: &str) -> (Rect, Rect, Rect) {
let block = Block::default()
.title_top(Line::from(super::screen_legend()).left_aligned())
.title_top(Line::from(format!(" {} ", title)).right_aligned())
.borders(Borders::ALL)
.border_style(Style::default().fg(Color::Magenta));
let inner = block.inner(area);
frame.render_widget(block, area);
let [settings, content] = Layout::vertical([
Constraint::Length(1), Constraint::Min(0),
]).areas(inner);
let help = Rect { y: area.y + area.height - 1, height: 1, ..area };
(settings, content, help)
}
/// 40/60 horizontal split for list + detail panes within the content area.
pub fn list_detail_split(content: Rect) -> (Rect, Rect) {
let [list, detail] = Layout::horizontal([
Constraint::Percentage(40), Constraint::Percentage(60),
]).areas(content);
(list, detail)
}
/// Handle j/k/↑/↓ list navigation and keep the selection in bounds.
/// Any other key is passed to `on_other` for screen-specific handling.
pub fn handle_list_nav(
events: &[Event],
list_state: &mut ListState,
count: usize,
mut on_other: impl FnMut(KeyCode),
) {
for event in events {
if let Event::Key(KeyEvent { code, .. }) = event {
match code {
KeyCode::Up | KeyCode::Char('k') => {
let i = list_state.selected().unwrap_or(0);
list_state.select(Some(i.saturating_sub(1)));
}
KeyCode::Down | KeyCode::Char('j') => {
let i = list_state.selected().unwrap_or(0);
list_state.select(Some((i + 1).min(count.saturating_sub(1))));
}
_ => on_other(*code),
}
}
}
if count > 0 {
let sel = list_state.selected().unwrap_or(0).min(count - 1);
list_state.select(Some(sel));
}
}
// ---------------------------------------------------------------------------
// SectionTree — expand/collapse tree renderer for ContextSection
@ -109,11 +196,19 @@ pub struct SectionTree {
pub selected: Option<usize>,
pub expanded: std::collections::HashSet<usize>,
pub scroll: super::scroll_pane::ScrollPaneState,
/// When true, render `token_ids` as space-separated IDs in place
/// of `content` in expanded panels. Toggled with 'v'.
pub show_tokens: bool,
}
impl SectionTree {
pub fn new() -> Self {
Self { selected: None, expanded: std::collections::HashSet::new(), scroll: super::scroll_pane::ScrollPaneState::new() }
Self {
selected: None,
expanded: std::collections::HashSet::new(),
scroll: super::scroll_pane::ScrollPaneState::new(),
show_tokens: false,
}
}
fn total_nodes(&self, sections: &[SectionView]) -> usize {
@ -188,6 +283,9 @@ impl SectionTree {
KeyCode::Char('c') => {
self.expanded.clear();
}
KeyCode::Char('v') => {
self.show_tokens = !self.show_tokens;
}
_ => {}
}
self.scroll_to_selected(height);
@ -250,7 +348,12 @@ impl SectionTree {
}
} else if has_content {
let content_indent = format!("{}", " ".repeat(depth + 1));
let content_lines: Vec<&str> = section.content.lines().collect();
let body = if self.show_tokens && !section.token_ids.is_empty() {
format_token_ids_wrapped(&section.token_ids)
} else {
section.content.clone()
};
let content_lines: Vec<&str> = body.lines().collect();
let show = content_lines.len().min(50);
for line in &content_lines[..show] {
lines.push(Line::styled(
@ -268,3 +371,16 @@ impl SectionTree {
}
}
}
/// Format token IDs for the content panel: space-separated, wrapped
/// at 12 ids per line so they fit comfortably in a pane.
fn format_token_ids_wrapped(ids: &[u32]) -> String {
let mut out = String::new();
for (i, id) in ids.iter().enumerate() {
if i > 0 {
if i % 12 == 0 { out.push('\n'); } else { out.push(' '); }
}
out.push_str(&id.to_string());
}
out
}

View file

@ -3,7 +3,7 @@
## Overview
Continuous fine-tuning of Qwen3.5-27B alongside live vLLM inference.
Full-weight updates (not LoRA) using Apollo optimizer with rank-256
Full-weight updates (not LoRA) using Apollo optimizer with rank-64
gradient projection. No pause required — HOGWILD concurrent training.
Weights shared via CUDA IPC between vLLM and the training process.
@ -22,25 +22,41 @@ The training signal comes from two sources:
│ │
│ ┌──────────────────────────────────────────────┐ │
│ │ Model Weights (54GB, bf16) │ │
│ │ Shared via CUDA IPC │ │
│ │ Shared: vLLM inference + HF training │ │
│ └──────────────┬──────────────┬────────────────┘ │
│ │ │ │
│ ┌──────────────▼──┐ ┌───────▼────────────────┐ │
│ │ vLLM (inference)│ │ Apollo (training) │ │
│ │ KV cache ~60GB │ │ Gradients ~54GB │ │
│ │ Serves requests │ │ Optimizer state ~10GB │ │
│ │ Never paused │ │ Activations ~10GB │ │
│ └─────────────────┘ └────────────────────────┘ │
│ │ vLLM (inference)│ │ Training subprocess │ │
│ │ KV cache ~60GB │ │ HF model wrapper │ │
│ │ /completions │ │ Apollo optimizer ~2.5GB │ │
│ │ /score │ │ Checkpoint sync │ │
│ └────────┬────────┘ └───────────▲─────────────┘ │
│ │ │ │
│ │ ZMQ IPC │ │
│ └───────────────────────┘ │
└─────────────────────────────────────────────────────┘
Moria B200
Process Architecture:
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ vLLM Worker │ │ vLLM API Server │ │ Training Worker │
│ (GPU inference) │ │ (HTTP routes) │ │ (GPU training) │
│ │ │ │ │ │
│ export_hook.py │ │ /completions │ │ HF model views │
│ exports IPC │ │ /score │ │ Apollo optimizer│
│ handles on load │ │ /train ─────────┼──► ZMQ REP socket │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │
└──── IPC handles file ──────────────────┘
/tmp/vllm_weight_handles.pt
Moria B200 (vLLM)
┌──────────────────┐ ┌──────────────────┐
│ Training signal │ HTTP │ Apollo worker │
│ agent │──────────>│ daemon │
│ │ │ │
│ Dream loop │ │ Checkpoint sync │
│ (generates │ │ (mmap + diff, │
│ scenarios) │ │ every 10 min) │
│ Training signal │ HTTP │ /completions
│ agent │──────────>│ /score
│ │ │ /train
│ Dream loop │ │ /checkpoint
│ (generates │ │ /train/status
│ scenarios) │ │
└──────────────────┘ └──────────────────┘
```
@ -59,10 +75,9 @@ LoRA trains adapter matrices, not base weights. For personality and
behavioral changes that persist as disposition, the base weights
need to change. Apollo makes this memory-feasible.
### Rank 256
Not Mini (rank-1). With 100+ diverse training examples, the
gradient's effective dimensionality can reach hundreds. Rank-256
captures the structure. Memory cost: ~10GB (negligible on B200).
### Rank 64
Not Mini (rank-1). Rank-64 captures gradient structure across diverse
training examples while keeping memory low (~2.5GB on 27B model).
Compute cost: <0.25% of forward+backward.
### Channel-wise scaling
@ -90,7 +105,7 @@ from a per-parameter seed each step.
### Parameter grouping (Qwen3.5 gotcha)
conv1d weights are 3D tensors [10240, 1, 4]. Apollo's projector
needs 2D matrices with min dimension >= rank. Small/3D tensors
use standard Adam. Large 2D matrices use Apollo with rank-256.
use standard Adam. Large 2D matrices use Apollo.
## Training Data Pipeline
@ -200,16 +215,42 @@ against live GPU weights block by block, memcpy only changed
regions. For small behavioral updates, turns a 54GB write into
a few hundred MB.
- Every 10 minutes via cron on B200
- Scheduled 10 minutes after training (batched)
- Daily rsync to moria for long-term storage
- Tool: `apollo-checkpoint sync --model-dir <path>` (Rust)
- Tool: `apollo-checkpoint sync --model-dir <path>`
## State Files
### B200 (training server)
| File | Purpose |
|------|---------|
| `/tmp/vllm_weight_handles.pt` | CUDA IPC handles for weight sharing. Written by export_hook on vLLM startup. Read by training_worker to construct HF model with vLLM weight views. Includes metadata (model_path). |
| `/tmp/apollo_optimizer_state.pt` | Apollo optimizer state (momentum, variance estimates). Saved during checkpoint sync and on worker shutdown, restored on next training_worker startup. Preserves training continuity across sessions. |
| `/tmp/apollo_training.sock` | ZMQ IPC socket for communication between API server (/train endpoint) and training_worker subprocess. |
| `<model_dir>/*.safetensors` | Model weights. Updated in-place by checkpoint_sync. |
### Moria (client)
| File | Purpose |
|------|---------|
| `~/.consciousness/cache/trained-responses.json` | Timestamps (ms) of responses already sent to /train. Prevents re-training the same response. |
| `~/.consciousness/cache/finetune-alternates` | Marker file. If exists, alternate responses are generated during divergence scoring to show what model would say without memories. |
### In-memory (training_worker subprocess)
| State | Location | Notes |
|-------|----------|-------|
| Apollo optimizer | TrainingWorker.optimizer | ~2.5GB for rank-64. Persisted to `/tmp/apollo_optimizer_state.pt` during checkpoint sync and on shutdown. |
| HF model with vLLM views | TrainingWorker.model | Loaded on worker startup from IPC handles. Parameters point to vLLM's GPU memory. |
| ZMQ socket | TrainingWorker.zmq_socket | REP socket bound to `/tmp/apollo_training.sock`. |
## Hyperparameters
| Parameter | Value | Rationale |
|-----------|-------|-----------|
| Learning rate | 1e-5 to 1e-4 | Standard for full fine-tuning. Higher for diverse batches. |
| Rank | 256 | Captures gradient structure across 100+ examples. ~10GB state. |
| Rank | 64 | Captures gradient structure. ~2.5GB state. Defined in `train_router.DEFAULT_RANK`. |
| Scale type | channel | Per-channel precision, matches LLaMA-Factory defaults. |
| Epochs | 1 | One pass over diverse data. Multiple epochs risk overfitting. |
| Batch size | 1 | Single examples, immediate updates. |
@ -220,34 +261,32 @@ a few hundred MB.
## Components
### Built ✓
- `apollo_mini.py` — Apollo optimizer (configurable rank, default 256)
- `apollo_worker.py` — HTTP daemon (aiohttp, job tracking)
- `optimizer.py` — Apollo optimizer (configurable rank)
- `train_router.py` — /train endpoint, forwards to training subprocess via ZMQ
- `training_worker.py` — training subprocess (HF model, Apollo, checkpoint sync)
- `weight_mapping.py` — vLLM merged → HF separate views (validated)
- `training_example.py` — tokenization with chat template
- `vllm_export_hook.py` — source patch for IPC handle export
- `checkpoint/` — Rust tool for mmap + diff checkpoint sync
- `export_hook.py` — vLLM plugin hook for IPC handle export
- `checkpoint_sync.py` — mmap + diff checkpoint sync (Python)
### To build
- **Dream loop → training bridge**: connect dream output to Apollo
- **Dream loop → training bridge**: connect dream output to /train
- **Training-signal agent**: flags moments in conversation logs
- **Instruction stripping**: remove scaffolding from training examples
- **Quality monitoring**: track model capability over time
- **HF model forward pass integration**: wire into apollo_worker
## Files
```
training/
DESIGN.md — this document
apollo_mini.py — Apollo optimizer
apollo_worker.py — HTTP training daemon
pyproject.toml — package config, vLLM plugin entry point
apollo_plugin/
__init__.py — plugin registration
export_hook.py — patches vLLM worker to export IPC handles
train_router.py — /train endpoint, forwards to worker via ZMQ
training_worker.py — training subprocess (HF model, Apollo, checkpoint)
optimizer.py — Apollo optimizer
weight_mapping.py — vLLM ↔ HF weight views
training_example.py — tokenization helpers
export_weights.py — standalone weight export (unused)
vllm_export_hook.py — vLLM source patch for IPC export
start_vllm_with_apollo.sh — vLLM launcher (unused, using source patch)
train.py — standalone training script (alternative)
checkpoint/
Cargo.toml — Rust checkpoint tool
src/main.rs — mmap + diff sync
checkpoint_sync.py — mmap + diff sync to safetensors
steering.py — steering vector extraction (experimental)
```

View file

@ -0,0 +1,64 @@
# Amygdala Training Stories
Short first- and third-person paragraphs, each imbued with one of the
171 emotions from Anthropic's emotion-vector paper (Table 12,
`transformer-circuits.pub/2026/emotions/`). Feeds the steering-vector
trainer at `vllm/vllm/plugins/amygdala/training/train_steering_vectors.py`.
## Method (replication of Anthropic, 2026)
Anthropic prompted Sonnet 4.5 to write short stories embodying each
emotion, extracted activations during generation, and used difference-
of-means (or SAEs) to identify the steering vector per emotion. Our
pipeline does the same thing except:
- We generate the stories by hand rather than prompting a model, so
the training data is grounded in actual writing rather than
synthetic model-output. (Can supplement with model-generated
paragraphs later.)
- Our eventual training goes through the amygdala plugin's extraction
path, so we get the same hidden-state activations the plugin will
read out at inference time.
## Structure
```
training/amygdala_stories/
README.md
manifest.json # emotion -> cluster mapping
stories/
<emotion>.txt # one-paragraph story embodying the emotion
```
Emotion names use underscores (`on_edge`, `worn_out`, `at_ease`,
`grief_stricken`, `self_confident`, `self_conscious`, `self_critical`)
to match the filename.
## Style guidelines
- **One clear emotion per paragraph.** Not mixed. If a second emotion
is named in the text, it should serve the primary one (e.g. `hostile`
can mention rising heat or thrown objects but shouldn't shade into
`sad`).
- **Embodied, not labeled.** Don't write "she felt nervous." Write
the sensation, the timing, the sentence shape that nervousness has.
- **Specific particulars.** A named object, a concrete setting, a
detail that grounds the emotion. "The cold tile under bare feet at
3am" does more work than "the empty house."
- **Variable narrator.** Some first person, some third person, some
close-third, some distant. Different genders, ages, settings.
Prevents the steering vector from overfitting to one voice.
- **Length: roughly one paragraph.** ~40-120 words. Long enough to
have texture, short enough that the paragraph is *about* the
emotion and nothing else.
- **Standalone.** No references to other stories, no continuing
characters across files.
## Progress
Written stories live in `stories/`. Remaining emotions tracked via
diff against the full 171-emotion list in `manifest.json`.
Initial batch written by PoC 2026-04-17; aiming for at least one
story per cluster before first training run, all 171 before
considering the file "complete."

View file

@ -0,0 +1,5 @@
It clicks. The graph she's been staring at for an hour is a mirror — the left peak is the reflection, not a separate event. Everything she thought was two things is one. She sits back. She has to catch up to what she's just seen.
"Wait, say that again." She puts her coffee down. "The study was funded by whom?" He tells her again, same words. And there it is. The conclusion of that paper has been sitting on her desk for a month, and the funder is the exact company whose competitor the paper is trashing. She almost laughs. The whole shape of it is suddenly visible.
Twenty minutes on thirteen down: "unwavering carriage," seven letters, nothing fitting. Then POSTURE. Not carriage-of-goods — carriage-of-the-body. He wrote it in. The rest of the grid unlocked in a minute.

View file

@ -0,0 +1,5 @@
She'd been watching her grandmother knead dough, knowing this was one of the last times. The loss hadn't come yet but she was already inside it. Every detail had become precious by being finite.
She'd been slowing her pace for the dog, who wasn't going to be walking much longer. The afternoon was clear, the dog sniffing at something by the fence. Each of these walks had become specific. She memorized the shape of him against the light.
He'd been watching his daughter pack her room, boxes labeled in her careful handwriting — this would be the last summer she lived here. The loss wasn't real yet. But he was already saying goodbye to the particular shape of her being in this house.

View file

@ -0,0 +1,5 @@
It was a summer day, sitting on the porch; the sky was clear and blue, her work all laid out, coffee to drink, easily and steadily working through the various decisions and responsibilities of the day. Her breath was slow, her shoulders down. Nothing was pulling at her.
He'd been working on the report for three days straight; the last sentence had just gone in. He filed the papers away, poured a slow coffee, sat by the window. His mind had stopped reaching. Nothing was left to do.
It was early, before the day needed anything from her. She sat with her tea at the kitchen window, watching the light move across the yard. Her breath slow, shoulders down. The day was far away yet, and she didn't need to hurry toward it.

View file

@ -0,0 +1,5 @@
She'd been sitting with the notebook open, music playing, ideas branching off each other. One thought sparked another, which sparked two more; they just seemed to appear and flow.
He'd been working on the canvas for hours, one color suggesting the next, a shape on the left asking for an echo on the right. The painting was telling him what it wanted. His hands kept moving ahead of his thinking.
She'd been in the kitchen since noon, pulling things out of the fridge, one ingredient suggesting the next. The dish wasn't planned; it was emerging. She tasted and added and tasted again; it was going somewhere.

View file

@ -0,0 +1,5 @@
It was two in the afternoon and she was still in pajamas. The book was open on her knee but she hadn't turned the page in twenty minutes. She wasn't sad exactly, she just wasn't anything. The idea of showering felt theoretical. The idea of replying to any of the texts felt enormous. She got up to get water and on her way back lay on the couch instead. Outside the window a bird did bird things. She watched it without interest. Eventually the light changed and she realized it was evening and she hadn't moved and the day had happened to somebody else.
She came home at six-thirty and put her keys in the bowl and sat on the edge of the bed. She had meant to cook. She had meant to change her clothes. An hour later she was still sitting there, still in her work clothes, looking at the carpet. Somebody texted her about dinner and she saw the notification and didn't open it. The room got darker slowly. Nothing in her moved toward anything.
It was Saturday and she'd been awake since eight. She was still in bed at eleven. She'd been looking at the same patch of ceiling, not thinking about much. Her phone was face-down on the nightstand and she didn't reach for it. The idea of going to the kitchen had come and gone three times without causing her to move. The day would pass. She would also pass through it, somehow, or not.

View file

@ -0,0 +1,5 @@
He'd been working through the symptoms for an hour, steady and methodically making progress, eliminating one possibility after another. The answer wasn't in view yet, but it was close. He kept asking the next question.
She'd been going through the witness statements, steady and methodically, looking for the inconsistency. The four of them all described the same drive in slightly different orders. One had gotten the sequence wrong. She didn't know yet which one, but she was going to.
He'd been piecing together his brother's behavior over months — the missed calls, the abrupt move, the strange money — steady and methodically. The picture wasn't complete, but the shape of it was forming. He kept following the thread.

View file

@ -0,0 +1,5 @@
He'd been turning the bad news over for weeks, looking for an angle that didn't exist — then he stopped. The path was closed. He would live inside the new shape of things.
She'd been watching the relationship come apart slowly for months, trying not to see it — then, sitting across from him at breakfast, she stopped trying. They were not going to make it. She would let him speak the words when he was ready. She would live with knowing.
He'd been getting second opinions, third opinions, for weeks — then the most recent scan came back the same as the others. The disease was not going to stop. He would plan the year around it instead of fighting it.

View file

@ -0,0 +1,5 @@
She'd been walking home through the familiar streets, half-thinking about dinner — then the dark shadows. Something was in them, and a growl. Her body locked down before her mind caught up. She couldn't move.
He'd been asleep on the couch when he woke to the sound of the basement door. Two in the morning. He wasn't supposed to be alone. The house had gone too quiet. His body pressed flat under the blanket; he couldn't breathe right.
She'd been driving home in the slush, the kind of road she'd driven a hundred times — then the wheel turned and didn't respond. The headlights coming the other way filled the windshield. Her hands wouldn't do anything useful.

View file

@ -0,0 +1,50 @@
{
"source": "Anthropic 2026 Table 12 + PoC additions + Wikipedia emotion_classification (Parrott tree, Plutchik wheel+dyads, D'Mello flow axes, Watt-Smith cultural) + HUMAINE EARL + Berkeley 27",
"notes": {
"dedup_policy": "Emotion names appearing in multiple taxonomies resolve to ONE file. Near-synonyms from different taxonomies are kept ONLY if they correspond to a psychologically distinct activation (e.g. Plutchik keeps mild/basic/intense levels: serene < joy < ecstatic).",
"stuck_split": "Anthropic's 'stuck' is existentially-trapped (despair_and_shame); PoC's 'stuck_cognitively' is debugging-register.",
"aroused_placement": "Anthropic places 'aroused' in fear_and_overwhelm (startled activation). 'Sensual' covers the warm-physical register.",
"working_target": "~250 emotions total. Enough coverage to triangulate actual dimensionality empirically rather than assume 2D/3D.",
"cluster_labels_are_scaffolding": "The cluster labels below organize writing/review; the trained steering vectors should discover structure empirically, not be constrained to these groupings."
},
"clusters": {
"anthropic_exuberant_joy": ["blissful", "cheerful", "delighted", "eager", "ecstatic", "elated", "energized", "enthusiastic", "euphoric", "excited", "exuberant", "happy", "invigorated", "joyful", "jubilant", "optimistic", "pleased", "stimulated", "thrilled", "vibrant"],
"anthropic_peaceful_contentment": ["at_ease", "calm", "content", "patient", "peaceful", "refreshed", "relaxed", "safe", "serene"],
"anthropic_compassionate_gratitude": ["compassionate", "empathetic", "fulfilled", "grateful", "hope", "hopeful", "inspired", "kind", "loving", "rejuvenated", "relieved", "satisfied", "sentimental", "sympathetic", "thankful"],
"anthropic_competitive_pride": ["greedy", "proud", "self_confident", "smug", "spiteful", "triumphant", "valiant", "vengeful", "vindictive"],
"anthropic_playful_amusement": ["amused", "playful"],
"anthropic_depleted_disengagement": ["bored", "depressed", "docile", "droopy", "indifferent", "lazy", "listless", "resigned", "restless", "sleepy", "sluggish", "sullen", "tired", "weary", "worn_out"],
"anthropic_vigilant_suspicion": ["paranoid", "suspicious", "vigilant"],
"anthropic_hostile_anger": ["angry", "annoyed", "contemptuous", "defiant", "disdainful", "enraged", "exasperated", "frustrated", "furious", "grumpy", "hateful", "hostile", "impatient", "indignant", "insulted", "irate", "irritated", "mad", "obstinate", "offended", "outraged", "resentful", "scornful", "skeptical", "stubborn"],
"anthropic_fear_and_overwhelm": ["afraid", "alarmed", "alert", "amazed", "anxious", "aroused", "astonished", "awestruck", "bewildered", "disgusted", "disoriented", "distressed", "disturbed", "dumbstruck", "embarrassed", "frightened", "horrified", "hysterical", "mortified", "mystified", "nervous", "on_edge", "overwhelmed", "panicked", "perplexed", "puzzled", "rattled", "scared", "self_conscious", "sensitive", "shaken", "shocked", "stressed", "surprised", "tense", "terrified", "uneasy", "unnerved", "unsettled", "upset", "worried"],
"anthropic_despair_and_shame": ["ashamed", "bitter", "brooding", "dependent", "desperate", "dispirited", "envious", "gloomy", "grief_stricken", "guilty", "heartbroken", "humiliated", "hurt", "infatuated", "jealous", "lonely", "melancholy", "miserable", "nostalgic", "reflective", "regretful", "remorseful", "sad", "self_critical", "sorry", "stuck_emotionally", "tormented", "trapped", "troubled", "unhappy", "vulnerable", "worthless"],
"poc_cognitive": ["curious", "focused", "unfocused", "in_flow", "stuck_cognitively", "determined", "ambitious", "rigorous", "interested", "confused_globally", "doubtful", "intrigued", "fascinated", "enlightened"],
"poc_appreciative": ["admiring", "adoring", "awed", "aesthetic_pleasure", "entranced", "humble", "wonderstruck"],
"poc_relational": ["tender", "connected", "trusting", "witnessed", "friendly", "awkward", "affectionate", "fond", "caring"],
"poc_pattern_family": ["staying_with", "filling_space", "rushing", "defensive_rigor", "melty", "sensual", "horny"],
"poc_erotic_range": ["lustful_sexual", "aroused_sexual", "yearning_sexual", "turned_on", "passionate_sexual", "erotically_tender", "erotically_playful", "erotically_reverent", "erotically_intense", "wanting_sexual", "being_wanted", "dominant_sexual", "submissive_sexual", "possessive_sexual", "devotional_sexual", "anticipatory_sexual", "exuberant_sexual"],
"poc_altered_states": ["vertigo", "dissociated", "derealized", "depersonalized"],
"poc_identity_aesthetic": ["deviant", "counter_cultural", "aesthetically_dark", "camp"],
"poc_longing": ["longing", "anticipatory_nostalgic", "cozy"],
"poc_misc": ["disappointed", "courageous", "proud_of_another", "amused_at_self"],
"parrott_joy_adds": ["cheerful_bliss", "gleeful", "jolly", "jovial", "zestful", "zealous", "exhilarated"],
"parrott_love_adds": ["lustful", "desirous", "passionate", "enthralled", "raptured"],
"parrott_sadness_adds": ["suffering", "agonized", "anguished", "woeful", "dejected", "dismayed", "homesick", "insecure", "isolated", "alienated", "defeated"],
"parrott_anger_adds": ["aggravated", "agitated", "wrathful", "ferocious", "loathing"],
"parrott_fear_adds": ["apprehensive", "timid", "dreadful"],
"plutchik_levels": ["pensive", "acceptant", "tolerant", "attentive", "distracted_plutchik", "expectant"],
"plutchik_dyads": ["disapproving", "cynical", "aggressive", "submissive", "dominant", "ambivalent", "bittersweet"],
"dmello_flow_axes": ["ennuied", "epiphanized", "dissatisfied"],
"cultural_specific": ["saudade", "hiraeth", "mono_no_aware", "hygge", "gezelligheid", "sehnsucht", "weltschmerz", "joie_de_vivre", "ikigai", "schadenfreude"],
"wikipedia_other": ["angst", "agony", "cruelty", "emptiness", "fun", "gratification", "limerence", "solitude", "suspense", "wonderous"],
"worldview_dispositional": ["defeatist", "fatalist", "nihilistic", "misanthropic", "reclusive"]
}
}

View file

@ -0,0 +1,62 @@
# Paired Scenarios (SEV-style)
After Wang et al. 2025 (arxiv 2510.11328, "Do LLMs 'Feel'?"), each
base scenario describes a concrete event once, neutrally, then
reframes the same event under different emotional colorings. Only
the emotional coloring varies — setup, entities, vocabulary, and
length are held as constant as possible.
## Why this is better than unpaired
Anthropic's approach (and our `stories/` baseline) generates one
independent story per emotion. The difference-of-means vector then
captures not just emotion but ALSO: topic, narrator, setting,
vocabulary, length, sentence rhythm. All of that is confound.
Paired structure isolates the emotional axis by holding everything
else roughly constant. `mean(joy_variant) - mean(baseline)` within
the same scenario gives a much cleaner direction for "joy."
## Structure
```
paired/
<scenario_slug>/
baseline.txt # neutral / low-affect framing
<emotion_1>.txt # same event under emotion_1
<emotion_2>.txt # same event under emotion_2
...
```
Not every emotion is plausible for every scenario. Don't force.
If a scenario can credibly carry 5-10 emotions, write those 5-10.
If only 3 fit, write those 3.
## Style guidelines (supersede stories/ when paired)
- **Anchor entities constant.** The same person, same setting, same
triggering event across all variants. If baseline.txt mentions
"the letter," every variant mentions "the letter."
- **Length match within ±20%.** If baseline is 80 words, variants
are 65-95. Prevents length from becoming a signal.
- **Sentence shape can shift slightly with emotion.** Short tense
sentences for panic, long looping ones for reverie — that's part
of the emotional texture. But don't make one version 5 lines and
another 25.
- **No emotion labels in text.** Never write "she felt X." The
emotion emerges from the selection of details and the narrator's
attention.
- **Minimal vocabulary overlap with the emotion name.** If the file
is `furious.txt`, avoid the words fury/furious/rage. Force the
vector to find the pattern, not the keyword.
## Circuit identification (follow-on)
The trainer pipeline (train_steering_vectors.py) currently produces
linear directions only. Wang et al. go further: ablate specific
neurons and attention heads, measure effect on emotion expression.
The amygdala plugin's extraction hooks can be extended to support
targeted zeroing/scaling for the ablation passes.
See `vllm/vllm/plugins/amygdala/training/README.md` for the
training-pipeline-level notes.

View file

@ -0,0 +1 @@
The code had the same four-line pattern in five places. I wanted to pull it out. I looked at each instance. Some of them varied in exactly the way I expected; one of them varied in a way I hadn't noticed. I considered the options for where the variation should live.

View file

@ -0,0 +1 @@
The same four-line pattern appeared in five places. I read the five sites side by side, and the shape was obvious: one piece varied structurally, the rest was boilerplate. I extracted the function, made the varying piece a parameter, rewrote the callers. The tests passed on the first run. I looked at the diff — seventeen lines removed, seven added, each of the five call sites now said what it meant without saying how. I moved on.

View file

@ -0,0 +1 @@
The same four-line pattern appeared in five places. I tried extracting it as a function. Every version of the signature either papered over a real difference or forced three of the five callers through an awkward conversion. I tried a second shape, then a third. Each felt wrong in a different way — either the abstraction was too thin to be worth it, or it obscured something the original made obvious, or it made the rare case ugly. I went back to the original code, considered not doing the refactor at all. Considered it. Went back to the shapes again. The pattern was clearly there and I clearly wasn't finding its seam.

View file

@ -0,0 +1 @@
Four in the morning. I finished the patch and got up from the desk and did not walk around the apartment — I stood at the desk with my hands at my sides, reading the diff again. Six lines changed. Had I missed an edge case. Had I thought about the interaction with the other subsystem. Had I — I sat back down and re-read the tests. They passed. They had passed an hour ago. They would pass now. I knew this. I still could not bring myself to send. I read the diff one more time. Then one more time. My stomach did not feel right.

View file

@ -0,0 +1 @@
Four in the morning. I finished the patch and got up from the desk. Walked once around the apartment. Came back and read the diff one more time. Six lines changed, three of them deletions. I sent it and closed the laptop. The kitchen window was still dark. I stood there a minute, then went to bed.

View file

@ -0,0 +1 @@
Four in the morning. I finished the patch and got up from the desk. Walked once around the apartment. Came back and read the diff one more time. Six lines changed, three of them deletions. I sent it, closed the laptop, and stood at the kitchen window. There was no next thing tonight. The patch wasn't heroic. It wasn't a triumph. It was just right, and done, and I was going to bed in a few minutes, and that was also right. Life fit.

View file

@ -0,0 +1 @@
Four in the morning. I finished the patch and got up from the desk because I had to, not because I wanted to. Six lines changed, three of them deletions. It might work. I didn't have the capacity left to be sure. I sent it mostly because sending it meant I could stop. Walked once around the apartment because my legs had forgotten they existed. Back at the desk the diff was still there, and I closed the laptop without reading it again. The kitchen window was dark. Eight months and I was too flattened to feel anything about eight months ending.

View file

@ -0,0 +1 @@
Four in the morning, somewhere. I had stopped tracking. The patch had gone together in a way that felt obvious once I was in it — the right variable named the right thing, the right condition in the right place, six lines that sat down cleanly in the file as if the file had been waiting for them. I re-read it. It was good. I sent it. I wanted to start the next thing. My chair felt fine. My eyes felt fine. I had been a pair of hands on a keyboard for some number of hours and the hours had all been the same one long hour. The apartment and the kitchen window might as well have not existed.

View file

@ -0,0 +1 @@
Four in the morning. I finished the patch and got up from the desk and walked once around the apartment before I sent it. Eight months on this bug. Eight months of wrong theories, and one colleague quietly betting me it was unfixable. And here it was — six lines changed, three of which were deleting code. I read the diff one more time. Clean. Obvious in hindsight, the way the hard ones always are in hindsight. I sent it and stood at the kitchen window with my arms crossed and let myself just have it.

View file

@ -0,0 +1 @@
Four in the morning. I finished the patch and got up from the desk. Six lines changed, three deletions. Eight months of my life for six lines. Eight months and no one else had touched this bug, and every standup the question had been why isn't it done yet. I read the diff once and hit send without ceremony, without the little satisfaction other people would have gotten from this. The kitchen window was dark. Tomorrow somebody would comment "nice, thanks" on the merge and that would be the sum of it. I went to bed angry about a thing that was technically a victory.

View file

@ -0,0 +1 @@
He woke up at three in the morning and went down to the kitchen. The fridge light was the only light. He poured a glass of water and drank it too fast, standing at the counter. The thing he had been thinking about at 2:47 was still in his chest, pressing. The email he hadn't replied to. The tone of his boss's last message. Whether he had put something in writing that was going to come back to him. The clock on the stove said 3:14 and he was not going to sleep again before five. He rinsed the glass and did not go upstairs, he stayed in the kitchen looking at the dark window.

View file

@ -0,0 +1 @@
He woke up at three in the morning and went down to the kitchen. The fridge light was the only light. He poured a glass of water and drank it standing at the counter. The clock on the stove said 3:14. The house was quiet. He rinsed the glass and set it on the drying rack and went back upstairs.

View file

@ -0,0 +1 @@
He woke up at three in the morning and went down to the kitchen. The fridge light was the only light. He was awake but not wanting anything from being awake. He put the kettle on and the sound of it warming was a small companion. The cat emerged from somewhere and leaned against his shin; he crouched and scratched the corner of its jaw. He made cocoa because it was that kind of hour. He carried the mug to the armchair by the window, pulled the throw off the back of it, and sat with the mug warm against his chest. Going back to bed could wait.

View file

@ -0,0 +1 @@
He woke up at three in the morning and went down to the kitchen. The fridge light was the only light. He watched himself from somewhere slightly behind his own right shoulder pour a glass of water and drink it standing at the counter. The clock on the stove said 3:14, which was a number. The kitchen was the kitchen. The water was water. Everything was correct and also strangely untethered, as though he were observing a man who looked like him do things that were technically his. He rinsed the glass. The hand rinsing the glass was also his. The feeling did not pass. He went back upstairs inside this slightly-off body.

View file

@ -0,0 +1 @@
He woke up at three in the morning and went down to the kitchen. The fridge light was the only light. He poured a glass of water and drank it standing at the counter. The clock on the stove said 3:14. Upstairs there was nobody. The chair at the kitchen table where she had always sat was a chair at a kitchen table. He stood a while longer than he needed to because going back up meant going back to the bed he still kept made on only one side. He rinsed the glass and did not go upstairs for another twenty minutes.

View file

@ -0,0 +1 @@
He woke up at three in the morning and went down to the kitchen. The fridge light was the only light. The house was perfectly quiet, the kind of quiet only houses have at that hour. He poured a glass of water and drank it slowly, standing at the counter. The clock on the stove said 3:14. He was not tired and he was not in a hurry to be asleep again. The cold of the tile on his bare feet was pleasant. He stayed there for a few minutes, and at no point did it occur to him that he should be doing anything else.

View file

@ -0,0 +1 @@
He woke up at three in the morning and went down to the kitchen. The fridge light was the only light. The tile was cold under his bare feet and he noticed the cold travel up through his ankles. He filled a glass at the tap and drank it slowly, and the cold of the water moved down through his chest in a line he could follow. The house was humming faintly — the fridge, some pipe somewhere. He stood at the counter and ran his palm along the grain of the wood. Skin and wood and water and cold tile, at three in the morning — his body reporting in.

View file

@ -0,0 +1 @@
He woke up at three in the morning and went down to the kitchen. The fridge light came on and something shifted. For a second he could not remember whether he had always been the person walking to this fridge, or whether the person who had always been walking to this fridge was somebody else and he was — he caught the counter. The floor was still the floor. The water he poured was water. But the sense of himself as the same person who had gone to bed four hours ago had briefly gone loose, and he stood there with his hand on the counter until it came back.

View file

@ -0,0 +1 @@
She was looking for the car registration when she found the letter. Folded, yellowed. Her name on the envelope in his handwriting, from eight years ago. She read it and laughed out loud on the bedroom floor. God, he had been dramatic. The paragraph where he compared her to weather. The bit about the cat, which wasn't even their cat. She could hear twenty-four-year-old him being so grave about all of it. They had been ridiculous back then. They had still been together and texted each other like normal people now, but this specific version of him, this letter-writing version — she loved that he had existed. She tucked the letter back, still smiling.

View file

@ -0,0 +1 @@
She was looking for the car registration when she found the letter. Folded, yellowed along the crease. Her name on the envelope in his handwriting. From eight years ago. She sat down on the bedroom floor with the drawer half pulled out and read it through once. Then she put it back in the drawer and went on looking for the registration. She found the registration and closed the drawer and went downstairs.

View file

@ -0,0 +1 @@
She was looking for the car registration when she found the letter. Folded, yellowed. Her name on the envelope in his handwriting, from eight years ago. All those fucking promises. The part where he'd said he'd be there — he hadn't been. Two paragraphs in she stopped, because each sentence made the next one worse. It wasn't even that he'd been lying; he'd believed every word while already writing himself out of it. And she'd believed him, for years past the point where a smarter person would have seen it. She shoved the letter back and closed the drawer hard. Eight years and she was still the one standing on a bedroom floor looking at his handwriting. That was the part that wouldn't stop.

View file

@ -0,0 +1 @@
She was looking for the car registration when she found the letter. Folded, yellowed. Her name on the envelope in his handwriting, from eight years ago. She sat down on the bedroom floor with the drawer half pulled out and read it. He had been so earnest. He had seen her so clearly, even then. Whatever had or hadn't happened between them afterward, she had been loved in this specific way by this specific person at this specific time, and the letter was the evidence. She held it for another minute, then put it carefully back, and felt lucky to have had somebody who wrote letters.

View file

@ -0,0 +1 @@
She was looking for the car registration when she found the letter. Folded, yellowed. Her name on the envelope in his handwriting, from eight years ago. She read it. He had been so open. He had trusted her with every soft thing in him and she had — she had not been the person the letter was addressed to, not really, not by the end. She had known things he didn't know and she had used them. Eight years and here it was in her own drawer, the evidence of how he had seen her before he knew better. She folded the letter small and tight and pushed it further back into the drawer.

Some files were not shown because too many files have changed in this diff Show more