1,214 commits

Author SHA1 Message Date
Kent Overstreet
25e4775974 enable tls 2026-05-22 13:02:42 -04:00
Kent Overstreet
6e3bacb182 channel-tmux: resolve pane ids by label, don't persist them
tmux pane ids (%6 etc.) are ephemeral — recycled across pane and
tmux-server restarts. The daemon persisted the id in tmux.json5 and
kept reusing it, so after a restart a channel would attach to whatever
unrelated pane had since inherited that id. (Live: ktest's stored %6
had become a claude pane; the real ktest pane was %10.)

Persist only the label — the pane title / window name, which is
stable. pipe_pane_reader() is now a connect-retry loop: each attempt,
connect_and_stream() resolves the live id with find_pane_by_name(); the
loop retries until the pane exists and pipe-pane succeeds, and
reconnects the same way if the pipe later drops. send() resolves the id
at send time; open() just registers the label and lets the reader find
it.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-05-22 12:26:05 -04:00
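The label-based lookup described in 6e3bacb182 can be sketched roughly as follows. This is a minimal illustration, not the daemon's code: `find_pane_by_label` and `resolve_with_retry` are hypothetical names, the listing is assumed to look like the output of `tmux list-panes -a -F '#{pane_id} #{pane_title}'`, and the retry count is bounded here only to keep the sketch testable.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Find the live pane id for `label` in a listing with one
/// "<pane_id> <pane_title>" pair per line (assumed format).
pub fn find_pane_by_label<'a>(listing: &'a str, label: &str) -> Option<&'a str> {
    listing.lines().find_map(|line| {
        let (id, title) = line.split_once(' ')?;
        (title == label).then_some(id)
    })
}

/// Re-read the listing on every attempt, so a recycled id is never reused.
/// `list_panes` is a hypothetical stand-in for invoking tmux.
pub fn resolve_with_retry(
    mut list_panes: impl FnMut() -> String,
    label: &str,
    max_attempts: usize,
    delay: Duration,
) -> Option<String> {
    for attempt in 0..max_attempts {
        let listing = list_panes();
        if let Some(id) = find_pane_by_label(&listing, label) {
            return Some(id.to_string());
        }
        if attempt + 1 < max_attempts {
            sleep(delay);
        }
    }
    None
}
```

Because only the label is stored, a listing of `%6 claude` / `%10 ktest` resolves `ktest` to `%10` regardless of what id was seen before the restart.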
Kent Overstreet
190eb50ed9 telegram: bound photo download to 60s
HttpClient::request_timeout only covers send_request, not body collect,
so a stuck download would otherwise stall the entire long-poll loop
indefinitely. tokio::time::timeout at the call site keeps the failure
contained — a slow/dead download surfaces as the same [image: download
failed: ...] marker as any other error.

60s is generous for the 1-5MB photos Kent typically sends; Telegram's
bot getFile cap is 20MB, which would still complete on most connections.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-05-01 18:56:03 -04:00
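The idea in 190eb50ed9 of bounding the download at the call site can be illustrated with a synchronous, std-only analog. The commit itself uses `tokio::time::timeout`; the sketch below swaps in a worker thread plus `recv_timeout`, and `download_with_deadline` is a hypothetical name.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Run `download` on a worker thread and wait at most `limit` for its result.
/// On timeout the caller gets an error; the worker is left to finish on its own.
pub fn download_with_deadline<F>(download: F, limit: Duration) -> Result<Vec<u8>, String>
where
    F: FnOnce() -> Result<Vec<u8>, String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // After a timeout the receiver is gone; a failed send is fine.
        let _ = tx.send(download());
    });
    match rx.recv_timeout(limit) {
        Ok(result) => result,
        Err(RecvTimeoutError::Timeout) => {
            Err(format!("timed out after {}ms", limit.as_millis()))
        }
        Err(RecvTimeoutError::Disconnected) => {
            Err("download worker exited without a result".to_string())
        }
    }
}
```

A timeout comes back as an ordinary `Err`, so the caller can render it with the same failure marker as any other download error and the polling loop keeps going.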
Kent Overstreet
713bb07729 bin: add ch — minimal channel CLI (send/recv)
Speaks the channel.capnp protocol over the per-daemon Unix socket at
~/.consciousness/channels/<top>.sock. Useful for ad-hoc sends from
shell, tests, and out-of-process tools that don't want to embed a
capnp client.

  ch send <channel> <message...>
  ch recv <channel> [--all-new] [--min-count N]

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-05-01 18:16:21 -04:00
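A sketch of how the two `ch` subcommands from 713bb07729 might be parsed. The `Cmd` enum and `parse` function are hypothetical, and joining the message words with a single space is an assumption; the capnp transport is not shown.

```rust
#[derive(Debug, PartialEq)]
pub enum Cmd {
    Send { channel: String, message: String },
    Recv { channel: String, all_new: bool, min_count: Option<usize> },
}

/// Parse the arguments that follow the program name.
pub fn parse(args: &[&str]) -> Result<Cmd, String> {
    match args {
        ["send", channel, words @ ..] if !words.is_empty() => Ok(Cmd::Send {
            channel: channel.to_string(),
            message: words.join(" "),
        }),
        ["recv", channel, flags @ ..] => {
            let mut all_new = false;
            let mut min_count = None;
            let mut it = flags.iter();
            while let Some(flag) = it.next() {
                match *flag {
                    "--all-new" => all_new = true,
                    "--min-count" => {
                        let n = it.next().ok_or("--min-count needs a value")?;
                        min_count = Some(n.parse::<usize>().map_err(|e| e.to_string())?);
                    }
                    other => return Err(format!("unknown flag: {other}")),
                }
            }
            Ok(Cmd::Recv { channel: channel.to_string(), all_new, min_count })
        }
        _ => Err("usage: ch send <channel> <message...> | \
                  ch recv <channel> [--all-new] [--min-count N]"
            .to_string()),
    }
}
```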
Kent Overstreet
c303653dd0 telegram: bridge photos via [image: <path>] markers
When an incoming update has a photo array, pick the largest size,
resolve the file_id via getFile, and download to
~/.consciousness/channels/telegram.logs/media/<file_id>.<ext>. The
message line surfaced to the channel is

    [image: /abs/path/to/file.jpg]
    <caption if any>

so a multimodal Read on the path works end-to-end. On download
failure we still surface the caption with an [image: download
failed: ...] marker so context isn't lost.

Other media types (voice/video/sticker/etc.) log a one-line "skipping"
notice — easy hook to extend later. The media/ dir was already being
created at startup; this fills in the rest.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-05-01 17:58:43 -04:00
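The size selection and marker format from c303653dd0 might look like the sketch below. `PhotoSize` mirrors only the fields needed here, "largest" is assumed to mean largest pixel area, and both function names are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct PhotoSize {
    pub file_id: String,
    pub width: u32,
    pub height: u32,
}

/// Pick the entry with the largest pixel area from an update's photo array.
pub fn pick_largest(sizes: &[PhotoSize]) -> Option<&PhotoSize> {
    sizes
        .iter()
        .max_by_key(|p| u64::from(p.width) * u64::from(p.height))
}

/// Build the line surfaced to the channel: the marker first, then the
/// caption on its own line. A failed download still keeps the caption.
pub fn message_line(download: Result<&str, &str>, caption: Option<&str>) -> String {
    let marker = match download {
        Ok(path) => format!("[image: {path}]"),
        Err(e) => format!("[image: download failed: {e}]"),
    };
    match caption {
        Some(c) if !c.is_empty() => format!("{marker}\n{c}"),
        _ => marker,
    }
}
```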
Kent Overstreet
a075e30557 http: add HttpResponse::bytes() for binary downloads
Mirror of text(), but returns raw Bytes without lossy UTF-8 conversion.
Needed by the Telegram channel to fetch photo files.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-05-01 17:58:35 -04:00
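Why a07​5e30557 needs a separate accessor can be shown with std alone. The two functions below are hypothetical stand-ins for `text()` and `bytes()`, assuming `text()` decodes lossily as the commit message implies.

```rust
/// What a text()-style accessor does: lossy UTF-8 decoding, which replaces
/// every invalid sequence with U+FFFD.
pub fn as_text(body: &[u8]) -> String {
    String::from_utf8_lossy(body).into_owned()
}

/// What a bytes()-style accessor does: hand back the payload untouched.
pub fn as_bytes(body: &[u8]) -> Vec<u8> {
    body.to_vec()
}
```

Feeding the first bytes of a JPEG (`FF D8 FF E0`) through `as_text` yields replacement characters, so the original file can no longer be recovered, while `as_bytes` round-trips it exactly.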
Kent Overstreet
91c8451f5c user: fix hotkey_cycle_reasoning after lock_blocking revert
The revert at 09896cd dropped the try_lock() wrapper but left an extra
closing brace and the async-call site still un-awaited, leaving the tree
unbuildable. Re-flow the function body to match the new signature.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-05-01 17:58:32 -04:00
Kent Overstreet
09896cd38b Revert "replace try_lock() with lock_blocking() across UI thread"
This reverts commit 4225294d16.
2026-04-25 17:15:53 -04:00
Kent Overstreet
4225294d16 replace try_lock() with lock_blocking() across UI thread
Add lock_blocking() to TrackedMutex: blocks current thread using
block_in_place + futures::executor::block_on, safe for sync contexts.

Replace all try_lock() calls with lock_blocking() in slash commands,
UI rendering, and status reads. Locks are held briefly enough that
blocking is fine, and this eliminates the spurious 'lock
unavailable' paths that were never actually hit.

Kept rx_mutex.try_lock() in mod.rs (std::sync::Mutex for stderr rx).
2026-04-25 15:35:14 -04:00
Kent Overstreet
5210f7dd66 context: heal pre-refactor image logs with token_count=0
Recompute image token counts from persisted dimensions when loading
old logs that stored count=0 (server-authoritative count was applied
after AppendImage before client-side pad expansion).

graph: cache neighbor sets for clustering coefficient

Pre-compute neighbor HashSets so the O(deg^2) triangle-counting
inner loop doesn't re-allocate on every (i,j) pair.
avg_clustering_coefficient() now builds the cache once instead of
O(N*deg) times.
2026-04-25 15:15:21 -04:00
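The neighbor-set caching from the graph half of 5210f7dd66 can be sketched as below. The adjacency-list input shape and the convention that nodes with fewer than two neighbors contribute 0 are assumptions; only the function name comes from the commit message.

```rust
use std::collections::HashSet;

/// Average local clustering coefficient of an undirected graph given as
/// adjacency lists. Nodes with fewer than two neighbors contribute 0.
pub fn avg_clustering_coefficient(adj: &[Vec<usize>]) -> f64 {
    if adj.is_empty() {
        return 0.0;
    }
    // Build every neighbor set exactly once, before any triangle counting.
    let sets: Vec<HashSet<usize>> = adj
        .iter()
        .map(|nbrs| nbrs.iter().copied().collect())
        .collect();

    let mut total = 0.0_f64;
    for (node, set) in sets.iter().enumerate() {
        let nbrs: Vec<usize> = set.iter().copied().filter(|&n| n != node).collect();
        let k = nbrs.len();
        if k < 2 {
            continue;
        }
        // O(deg^2) pair loop; each membership test hits the cached set.
        let mut triangles = 0usize;
        for i in 0..k {
            for j in (i + 1)..k {
                if sets[nbrs[i]].contains(&nbrs[j]) {
                    triangles += 1;
                }
            }
        }
        total += 2.0 * triangles as f64 / (k * (k - 1)) as f64;
    }
    total / adj.len() as f64
}
```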
Kent Overstreet
371b40078d context: salvage in-flight tag accumulators on premature stream end
ResponseParser.finish() was only flushing self.buf — the rolling tail
window — and silently dropping self.think_buf and self.tool_call_buf.
When a stream ended inside an unterminated <think>...</think> or
<tool_call>...</tool_call> block (max_tokens reached, EOS before the
close tag, server-side cancel), all the accumulated in-tag content
was discarded and only the trailing ~8 bytes survived (drain_safe
keeps `close_tag.len()` bytes at the tail of buf to handle
across-chunk tag splits — and `</think>` is exactly 8 chars).

Symptom: assistant responses cut off, only the last few characters
come through. Especially severe in native-think mode where in_think
is set from prefill, so the entire response accumulates in
think_buf and gets wiped on premature stop.

In finish(): if in_think, drain buf into think_buf and emit as a
Thinking node (preserving the partial thought). If in_tool_call,
attempt to parse the body; on parse failure, wrap the partial as
content with the leading <tool_call> open tag so the model sees its
own truncated attempt next turn rather than losing it.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 23:32:44 -04:00
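A much-reduced model of the `finish()` fix in 371b40078d, covering only the `<think>` case. The struct layout, the `Node` enum, and the `feed`/`emit` helpers are simplifications invented for this sketch; tool-call salvage is left out.

```rust
#[derive(Debug, PartialEq)]
pub enum Node {
    Content(String),
    Thinking(String),
}

const OPEN: &str = "<think>";
const CLOSE: &str = "</think>";

pub struct Parser {
    buf: String,       // rolling tail that may hold a split tag
    think_buf: String, // text accumulated inside an open <think> block
    in_think: bool,
    out: Vec<Node>,
}

impl Parser {
    pub fn new(in_think: bool) -> Self {
        Parser { buf: String::new(), think_buf: String::new(), in_think, out: Vec::new() }
    }

    fn emit(&mut self, text: String) {
        if self.in_think {
            self.think_buf.push_str(&text);
        } else if !text.is_empty() {
            self.out.push(Node::Content(text));
        }
    }

    pub fn feed(&mut self, chunk: &str) {
        self.buf.push_str(chunk);
        loop {
            let tag = if self.in_think { CLOSE } else { OPEN };
            match self.buf.find(tag) {
                Some(pos) => {
                    let before = self.buf[..pos].to_string();
                    self.buf.replace_range(..pos + tag.len(), "");
                    self.emit(before);
                    if self.in_think {
                        let thought = std::mem::take(&mut self.think_buf);
                        self.out.push(Node::Thinking(thought));
                    }
                    self.in_think = !self.in_think;
                }
                None => {
                    // Hold back tag.len() bytes in case a tag spans chunks.
                    let mut cut = self.buf.len().saturating_sub(tag.len());
                    while !self.buf.is_char_boundary(cut) {
                        cut -= 1;
                    }
                    let safe = self.buf[..cut].to_string();
                    self.buf.replace_range(..cut, "");
                    self.emit(safe);
                    return;
                }
            }
        }
    }

    /// Flush the tail AND the in-tag accumulator, so a stream that ends
    /// inside <think> still yields its partial thought.
    pub fn finish(mut self) -> Vec<Node> {
        let tail = std::mem::take(&mut self.buf);
        if self.in_think {
            self.think_buf.push_str(&tail);
            let thought = std::mem::take(&mut self.think_buf);
            if !thought.is_empty() {
                self.out.push(Node::Thinking(thought));
            }
        } else if !tail.is_empty() {
            self.out.push(Node::Content(tail));
        }
        self.out
    }
}
```

With `Parser::new(true)` (the native-think prefill case), a stream that stops mid-thought comes back as one `Thinking` node holding everything received, rather than only the held-back tail.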
Kent Overstreet
c2433c1773 context: tighten the Branch token-cache invariant
Two pieces around the cache that landed when Branch nodes started
holding `token_ids: Some(server_authoritative_stream)`:

1. wire_into / wire_chunks now pair cached vision blocks with their
   child Image leaves. Previously the cached-branch arm spliced the
   cache verbatim and didn't recurse for images, so a Branch whose
   cache contained `VISION_START..VISION_END` blocks would emit those
   tokens with no matching `WireImage` push — leading to a panic
   downstream when `pair_images_to_ranges` tried to attach the
   missing image. New `pair_cached_images` walks the children
   depth-first for image leaves and zips them against
   `vision_blocks(cache)` to emit correctly-offset entries; mismatched
   counts panic loudly because that's an AST/cache invariant
   violation that would otherwise mis-pair on the wire.

2. `conversation_mut() -> &mut Vec<AstNode>` was the one public
   escape hatch that let callers reach into a Branch's children and
   mutate them without invalidating the cached token stream. Removed
   in favor of a focused `set_branch_memory_score(section, index,
   key, score)` for the only legitimate use we had today (the
   full-matrix scorer writing per-memory divergence onto the
   Assistant Branch). Updated the lone caller in subconscious/learn.

Documented the invariants explicitly on `ContextState`: every
`Leaf.token_ids` matches `body.compute_token_ids()`, and every
`Branch { token_ids: Some(_) }` is a faithful walk of its children.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 23:15:55 -04:00
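The pairing step from item 1 of c2433c1773 might be sketched like this. The marker ids are placeholders (the real tokenizer values are not given in the log), image leaves are reduced to plain indices, and whether the recorded range includes the markers themselves is an assumption.

```rust
// Placeholder marker ids, chosen for illustration only.
pub const VISION_START: u32 = 1;
pub const VISION_END: u32 = 2;

/// Half-open (start, end) positions of each VISION_START..VISION_END block.
pub fn vision_blocks(tokens: &[u32]) -> Vec<(usize, usize)> {
    let mut blocks = Vec::new();
    let mut open: Option<usize> = None;
    for (i, &t) in tokens.iter().enumerate() {
        if t == VISION_START {
            open = Some(i);
        } else if t == VISION_END {
            if let Some(start) = open.take() {
                blocks.push((start, i + 1));
            }
        }
    }
    blocks
}

#[derive(Debug, PartialEq)]
pub struct WireImage {
    pub image: usize,
    pub pad_start: usize,
    pub pad_end: usize,
}

/// Zip cached vision blocks against image leaves (given as indices, in
/// depth-first order), shifted by `base`, the cache's offset in the walk.
/// A count mismatch is an invariant violation, so it panics.
pub fn pair_cached_images(cache: &[u32], images: &[usize], base: usize) -> Vec<WireImage> {
    let blocks = vision_blocks(cache);
    assert_eq!(
        blocks.len(),
        images.len(),
        "cache/AST mismatch: vision blocks vs image leaves"
    );
    blocks
        .iter()
        .zip(images)
        .map(|(&(s, e), &image)| WireImage { image, pad_start: base + s, pad_end: base + e })
        .collect()
}
```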
Kent Overstreet
006b99bdac bin: enable panic backtraces by default
stderr is redirected to ~/.consciousness/logs/tui-stderr.log via
redirect_stderr_to_pipe(), but the default panic hook checks
RUST_BACKTRACE before printing the trace; without the env var the
log only catches the "note: run with `RUST_BACKTRACE=full`" tail
and the actual frames are dropped.

Set RUST_BACKTRACE=1 programmatically before any other thread spawns
so the log captures the trace by default. Existing user-set value is
respected so callers can still opt into "full" if they want.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 22:44:19 -04:00
Kent Overstreet
10c8878f1c agent: bump tonic gRPC message caps to 64 MiB
The default 4 MiB cap on encoded/decoded messages is too small for
the multimodal Generate path: Qwen3.6-VL high-res patches put 5–8 MiB
of pre-encoded image bytes inline in a single Generate request, and
Done events carrying full per-token readout vectors can also exceed
4 MiB on long runs. Hit "ResourceExhausted: Received message larger
than max (5799108 vs. 4194304)" from the salience server.

Bump both encode and decode caps on every cloned SalienceClient. The
matching server-side bump is in vllm/entrypoints/salience/server.py.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 22:36:10 -04:00
Kent Overstreet
11a7e4043e scripts: FP8 quantize Qwen3.6-27B for vLLM (multimodal + MTP)
Quantization recipe targeting the multimodal Qwen3.6-27B for vLLM
serving. Three pitfalls the script avoids, each documented inline:

1. Loader strip: `AutoModelForCausalLM` silently drops the vision
   tower; we load via the config-declared
   `Qwen3_5ForConditionalGeneration` instead.

2. Pattern anchor: llmcompressor matches the `ignore` list against
   module names (no `.weight` suffix) when walking `named_modules()`,
   not against full tensor names. Patterns now anchor on `$` at the
   module name; the earlier `\.weight$` form silently quantized
   lm_head and every linear_attn projection.

3. vLLM fusion: vLLM fuses {q,k,v}_proj into qkv_proj, gate+up into
   gate_up_proj, and in_proj_qkv+in_proj_z into in_proj_qkvz. The
   compressed_tensors loader rejects mixed schemes within a fused
   layer, so the `ignore` list is shaped to keep all sub-components
   of a fused layer consistent.

After `oneshot()` writes the FP8 output, MTP tensors (which the HF
class doesn't expose) are spliced in at BF16 from the upstream cached
snapshot, with the compressed_tensors metadata header preserved.

Recipe follows Unsloth's UD-Q8_K_XL late-stack overrides (FFN: 50,
51, 59, 62, 63; ATTN: 51, 59, 63), extended to include `v_proj` for
fusion compat. Final checkpoint is ~35 GB (matches Unsloth's GGUF
size to within ~1%) with vision tower BF16, MTP head BF16, and most
mlp/self_attn Linears at FP8_DYNAMIC.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 22:15:31 -04:00
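The fusion constraint in item 3 of 11a7e4043e can be expressed as a small consistency check. The actual script is Python; this sketch uses Rust for consistency with the other examples. The `gate_proj` / `up_proj` sub-module names and the `mixed_groups` helper are assumptions, not taken from the script.

```rust
use std::collections::HashSet;

/// Sub-modules that vLLM fuses into a single layer, per the commit message.
pub const FUSED_GROUPS: &[&[&str]] = &[
    &["q_proj", "k_proj", "v_proj"],
    &["gate_proj", "up_proj"],
    &["in_proj_qkv", "in_proj_z"],
];

/// Return every fused group under `prefix` that is only partly in the
/// ignore set, i.e. would end up with mixed quantization schemes.
pub fn mixed_groups(prefix: &str, ignored: &HashSet<String>) -> Vec<Vec<String>> {
    FUSED_GROUPS
        .iter()
        .filter_map(|group| {
            let names: Vec<String> = group.iter().map(|m| format!("{prefix}.{m}")).collect();
            let hits = names.iter().filter(|n| ignored.contains(*n)).count();
            (hits != 0 && hits != names.len()).then_some(names)
        })
        .collect()
}
```

Ignoring only `q_proj` and `k_proj` of a layer is flagged as mixed; adding `v_proj` clears it, which is the reason the recipe extends the attention overrides to include `v_proj`.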
Kent Overstreet
fe232cf292 salience: client-side pad expansion, drop AppendImage
Mirrors the vLLM-side rewrite. AppendImage is gone; images now
ride along on Generate via a parallel `images` list.

- Productionize `qwen3_image_token_count` (was test-only). Image
  leaf computes its IMAGE_PAD count eagerly at construction from
  height/width; `token_count` is no longer "0 until the server
  tells us."
- WireChunk shrinks to a single `Tokens(Vec<u32>)` variant — vision
  blocks live inline in the token stream.
- `wire_chunks` now returns `(Vec<WireChunk>, Vec<WireImage>)`.
  `WireImage` carries `pad_start` / `pad_end` (absolute positions
  in the full walk) alongside bytes + mime.
- `assemble_prompt` returns `(chunks, images, match_upto)`.
- `stream_session_mm` / `run_session_generate` take the parallel
  images list, filter to those past `match_upto`, and pass them
  in `GenerateRequest.images` as `pb::ImageAttachment` entries.
- Drop `SessionHandle::append_image`,
  `ContextState::commit_image_token_counts`,
  `StreamToken::ImageAppended`, the WireChunk::Image branch in
  `learn.rs`, and the now-empty `prompt_to_chunks` helper.
- Add 'v' toggle on the conscious-screen tree to render token-id
  vectors in place of text content (debug-aid: lets us see what
  the server actually has when output is suspicious).
- Comment out the subconscious-trigger spawn loop — Kent had this
  disabled before; it had crept back into running.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 20:26:47 -04:00
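The "filter to those past `match_upto`" step from fe232cf292 might reduce to something like the following. The field set of `WireImage` is trimmed, `images_to_send` is a hypothetical name, and treating "past" as `pad_start >= match_upto` is an assumption.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct WireImage {
    pub pad_start: usize, // absolute position in the full walk
    pub pad_end: usize,
    pub mime: String,
    pub bytes: Vec<u8>,
}

/// Keep only the images whose pad range starts at or after the prefix the
/// server already holds; earlier ones do not need to be sent again.
pub fn images_to_send(images: &[WireImage], match_upto: usize) -> Vec<&WireImage> {
    images
        .iter()
        .filter(|img| img.pad_start >= match_upto)
        .collect()
}
```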
Kent Overstreet
4feebb7bc4 agent: share one tonic Channel + migrate scoring to gRPC Generate
Two changes that bolt together — the shared connection means the new
scoring path actually costs one HTTP/2 handshake across the whole
process instead of one-per-RPC.

ApiClient gains `salience_channel: Arc<OnceCell<Channel>>`. First
call to `ApiClient::salience_client()` opens the channel via
`connect_channel()` and stores the Channel; subsequent calls clone
it (cheap — tonic multiplexes concurrent RPCs over the single
HTTP/2 connection). Every ApiClient clone shares the same OnceCell,
so all agents spawned from Mind's client — plus every ephemeral
scoring session — reuse one connection.

SessionHandle refactored to hold an `ApiClient` clone instead of
a bag of (base_url, api_key) strings. `open` / `append_image` /
`generate` go through `self.client.salience_client()` now. New
`prefill_only(tokens)` method encapsulates the "Generate with
max_tokens=0 to append text" pattern (previously a private free
function in api/mod.rs called `flush_pending`). Drop impl on
SessionHandle stays — still fires CloseSession on the shared
channel in a detached task.

`run_session_generate` switched from `(base_url, api_key, model)`
to `&ApiClient`; the agent-turn flow that uses it keeps the same
shape but `stream_session_mm` clones the ApiClient into the
spawned worker.

learn.rs migrated from the HTTP `/v1/score` endpoint to a gRPC
session-based score:

  * `call_score` opens an ephemeral SessionHandle on the client,
    converts (prompt_tokens, images) → Vec<WireChunk> via the new
    `prompt_to_chunks` helper (splits on VISION_START/VISION_END),
    walks chunks calling `prefill_only` + `append_image`, runs a
    final Generate with `max_tokens=0` + `logprobs_ranges` over
    the scored positions, and sums each Token event's
    `sampled_logprob` per range to produce `ScoreResult`s.

  * SessionHandle drops at end of scope → CloseSession auto-fires,
    keeping the server's session map clean between calls.

  * No more HTTP path, no more `http_client()` helper, no more
    `ScoreResponse` / serde plumbing for /v1/score.

  * `send_to_train` still uses HTTP (it talks to /v1/train which
    isn't on the gRPC protocol); its ad-hoc HTTP client lives
    inline now instead of reaching for the deleted `http_client()`.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 12:51:53 -04:00
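The connection sharing in 4feebb7bc4 can be modeled with std types. This sketch substitutes `std::sync::OnceLock` for the `OnceCell` named in the commit and a counter-backed `Channel` struct for tonic's; `connect_count` exists only so the sharing is observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Stand-in for a tonic `Channel`: cheap to clone, one connection behind it.
#[derive(Clone, Debug)]
pub struct Channel {
    pub id: usize,
}

static CONNECTS: AtomicUsize = AtomicUsize::new(0);

fn connect_channel() -> Channel {
    // Pretend this performs the handshake; count how often it runs.
    Channel { id: CONNECTS.fetch_add(1, Ordering::SeqCst) }
}

pub fn connect_count() -> usize {
    CONNECTS.load(Ordering::SeqCst)
}

#[derive(Clone, Default)]
pub struct ApiClient {
    // Cloning the ApiClient clones the Arc, so all clones share one cell.
    salience_channel: Arc<OnceLock<Channel>>,
}

impl ApiClient {
    /// First call connects; later calls on any clone reuse the channel.
    pub fn salience_client(&self) -> Channel {
        self.salience_channel.get_or_init(connect_channel).clone()
    }
}
```

Only clones of the same `ApiClient` share a connection; a client built from scratch gets its own cell and connects again, which is the failure mode the shared base client avoids.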
Kent Overstreet
be6ba4e9a5 agent: bundle sampling fields as SamplingParams on AgentState
Collapse the split `temperature` / `top_p` / `top_k` fields on
AgentState into a single `sampling: SamplingParams` struct, mirroring
how the wire-level fields flow into the Generate RPC. Adds
`max_tokens` to SamplingParams so it's actually plumbed end to end
(previously the client had a hardcoded 4096 fallback inside
`run_session_generate`).

AgentState construction sites now set `sampling: SamplingParams { ...
max_tokens: 4096 }` as the default. The assignment sites in
oneshot.rs / subconscious.rs / unconscious.rs switch from
`st.temperature = X` to `st.sampling.temperature = X`.

`stream_session_mm` takes `SamplingParams` directly; the
`sampling_max_tokens()` helper goes away. `pb::GenerateRequest` is
populated with `sampling.max_tokens` (and the other fields) in
`run_session_generate`. SamplingParams is `pub` so it can be
embedded in the public AgentState without a visibility warning.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 12:37:20 -04:00
Kent Overstreet
8d9c9e9f7b agent: end-to-end gRPC Generate with delta-based session orchestration
Wires the client side of the new salience protocol so inference
actually runs over gRPC instead of emitting the stubbed "not yet
wired" error. Each turn walks the AST as interleaved chunks, sends
only what's new to the server, and streams decode tokens back.

context.rs:
  * `WireChunk` enum: `Tokens(Vec<u32>)` or `Image { bytes, mime,
    known_expanded_len }`. Preserves text/image/text ordering the
    wire path can't flatten.
  * `wire_chunks(range, skip)` walker, parallel to `wire_prompt` —
    branches emit `<|im_start|>…<|im_end|>` tokens, image leaves
    emit a single Image chunk (no inline vision tokens).
  * `NodeLeaf::set_image_token_count(n)` + recompute of cached
    `token_ids`; `ContextState::commit_image_token_counts(&[u32])`
    fills in the first-N zero-count image leaves in wire order.
  * `ResponseParser::run` handles the new
    `StreamToken::ImageAppended` by committing the server's N into
    the AST before the final Generate's Token events stream in.

salience.rs:
  * `SessionHandle` tracks `committed_len`. `append_image` advances
    it from the RPC response. New `generate(req)` opens the
    server-streaming RPC.

api/mod.rs:
  * `stream_session_mm(session_lock, chunks, sampling, priority,
    readout_shape)` replaces the stub. Spawns `run_session_generate`.
  * `run_session_generate`: takes the session out of the Mutex (or
    opens fresh), skips chunks covered by `committed_len` (bails on
    mid-chunk straddle or unknown-length image in the committed
    prefix), walks the delta: accumulates Tokens into `pending`, on
    Image flushes pending via `flush_pending` (max_tokens=0 Generate
    that just prefills), then AppendImage + emits
    StreamToken::ImageAppended. Final Generate carries any trailing
    pending text as `append_tokens` and the sampling params; Token
    events stream out as StreamToken::Token, Done as
    StreamToken::Done. On success, handle with updated
    `committed_len` returns to the Mutex; on error, handle drops
    and next call reopens.
  * `StreamToken::ImageAppended { placeholder_count }` variant —
    emitted in wire order before the final Generate's tokens.
  * Prefix-cache cap for readout coverage: `readout_ranges` covers
    `[prompt_len_after_append, u32::MAX)` when the caller provides
    a readout_shape, so decode positions stream their readouts.

agent/mod.rs:
  * `assemble_prompt` returns `Vec<WireChunk>` with the assistant
    prologue merged into the trailing Tokens chunk. Caller in
    `turn` passes chunks + readout_shape (pulled from
    `agent.readout.lock().manifest`) to `stream_session_mm`.
  * Dropped `assemble_prompt_tokens` — dead.

mind + unconscious:
  * `Unconscious::new(client)` stores a shared `ApiClient`. Fixes
    the repeated-manifest-fetch bug caused by each subagent's
    `ApiClient::new` having its own OnceCell. The client's Arc-
    wrapped manifest cache is now shared across every agent Mind
    spawns.
  * `prepare_spawn(name, auto, wake, base_client)` clones the base
    client and overrides `.model` for the resolved backend instead
    of constructing fresh. All three callers
    (`toggle`/`trigger`/unconscious loop) pass `self.client.clone()`.
  * `Mind::new` passes `agent.client.clone()` into
    `Unconscious::new`.

subconscious/generate.rs:
  * gen_continuation switched to `wire_chunks` + the new
    `stream_session_mm` signature. Ephemeral session opens on each
    call, tears down at scope end. No readouts requested.

Not changed yet, noted for follow-up:
  * Subconscious ablation scoring in learn.rs still talks to
    `/v1/score` over HTTP. Will migrate once we have time to verify
    the Generate+max_tokens=0+prompt_logprobs path end-to-end.
  * compare.rs constructs its own ApiClient for the
    `compare.test_backend` (which is intentionally a different
    endpoint) — left alone.
  * Readout manifest still fetched via HTTP at Agent::new.
    Migration to GetReadoutManifest gRPC is a separate cleanup.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 12:27:55 -04:00
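The delta computation that `run_session_generate` performs in 8d9c9e9f7b (skip what `committed_len` covers, bail on a straddle or an unknown-length image) might be sketched as below. `WireChunk` is reduced to the fields that matter for lengths, and the `delta` function and its error strings are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum WireChunk {
    Tokens(Vec<u32>),
    // Image bytes and mime are omitted; only the expanded length matters here.
    Image { known_expanded_len: Option<usize> },
}

/// Return the chunks that still need sending, given how many tokens the
/// server has already committed for this session.
pub fn delta(chunks: &[WireChunk], committed_len: usize) -> Result<&[WireChunk], String> {
    let mut covered: usize = 0;
    for (i, chunk) in chunks.iter().enumerate() {
        if covered == committed_len {
            return Ok(&chunks[i..]);
        }
        let len = match chunk {
            WireChunk::Tokens(t) => t.len(),
            WireChunk::Image { known_expanded_len: Some(n) } => *n,
            WireChunk::Image { known_expanded_len: None } => {
                return Err("unknown-length image inside the committed prefix".to_string());
            }
        };
        if covered + len > committed_len {
            return Err("committed_len falls in the middle of a chunk".to_string());
        }
        covered += len;
    }
    if covered == committed_len {
        Ok(&chunks[chunks.len()..])
    } else {
        Err("committed_len exceeds the assembled prompt".to_string())
    }
}
```

On any `Err`, the behavior described in the commit is to drop the handle and reopen on the next call, so the caller would fall back to sending the full prompt to a fresh session.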
Kent Overstreet
08213f9093 salience: add gRPC client + TLS plumbing for stateful vllm sessions
Adds the client-side of a stateful gRPC protocol against vllm, plus
the TLS trust machinery so we can talk to self-signed vllm servers.

Protocol (proto/salience.proto):
  Bidi-streaming Session RPC carries OpenSession / AppendTokens /
  Generate / Cancel from client and SessionReady / PrefillProgress /
  Token / GenerateDone / Error from server. Separate Fork unary RPC
  for cheap branching (prefix cache shares KV automatically). Plus
  ListSessions, CloseSession, GetReadoutManifest admin RPCs.

  Per-token readouts ship as packed f32 ([n_layers * n_concepts] per
  token, flat). Logprobs use range-selected positions plus a top-k
  parameter — empty ranges means no logprobs, any range means emit
  sampled-token logprob at those positions, top_k > 0 adds
  alternatives.

Client (src/agent/api/salience.rs):
  Tonic-generated types under pb::, a connect() helper, with_auth()
  for bearer metadata, and a Session handle wrapping the bidi stream:
  open() handshakes SessionReady; append() is fire-and-forget;
  generate() returns impl Stream<Item = Event> that drains inbound
  until Done or terminating Error. One generate at a time per session.

Peak picker (src/agent/salience.rs):
  Pure function over ReadoutEntry traces. Per-concept z-score against
  trace global stats; contiguous above-threshold regions emit one
  peak at the local max. Configurable sigma threshold and min-std
  safety floor. Deterministic tie-break on offset then concept name.
  12 unit tests covering empty traces, flat channels, single/multi
  spikes, contiguous humps, multi-concept independence, trailing
  runs, sub-threshold noise, layer-out-of-range, manifest shape
  mismatch, and threshold tunability.

TLS (src/agent/api/http.rs):
  HttpClient::build now also loads every .pem file under
  ~/.consciousness/certs/ into the rustls root store — so dropping
  a <host>.pem in that directory is enough to trust a new self-
  signed server; no code changes per new host. Also installs the
  rustls default crypto provider explicitly via OnceLock: tonic's
  tls features pulled in both ring and aws-lc-rs on the resolver
  path, and rustls 0.23 refuses to auto-pick when either could win.

Build (build.rs, Cargo.toml):
  tonic-build generates Rust types from proto/salience.proto at
  cargo-build time, using a vendored protoc binary
  (protoc-bin-vendored) so no system install is required. New
  runtime deps: tonic, prost, async-stream, tokio-stream,
  rustls-pemfile.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:56:32 -04:00
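A single-concept reduction of the peak picker described in 08213f9093. The real function works over `ReadoutEntry` traces and breaks ties on concept name as well; here the trace is a plain slice, population variance is assumed for the global stats, and `pick_peaks` is a hypothetical name.

```rust
#[derive(Debug, PartialEq)]
pub struct Peak {
    pub offset: usize,
    pub z: f64,
}

/// One peak per contiguous run of samples whose z-score exceeds `sigma`,
/// placed at the run's maximum.
pub fn pick_peaks(trace: &[f64], sigma: f64, min_std: f64) -> Vec<Peak> {
    if trace.is_empty() {
        return Vec::new();
    }
    let n = trace.len() as f64;
    let mean = trace.iter().sum::<f64>() / n;
    let var = trace.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    // The floor keeps a flat channel from turning noise into huge z-scores.
    let std = var.sqrt().max(min_std);

    let mut peaks = Vec::new();
    let mut best: Option<Peak> = None;
    for (offset, x) in trace.iter().enumerate() {
        let z = (x - mean) / std;
        if z > sigma {
            // Strict '>' keeps the earliest offset when values tie.
            if best.as_ref().map_or(true, |b| z > b.z) {
                best = Some(Peak { offset, z });
            }
        } else if let Some(p) = best.take() {
            peaks.push(p);
        }
    }
    // A run that reaches the end of the trace still yields its peak.
    peaks.extend(best);
    peaks
}
```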
Kent Overstreet
0e459aae92 thalamus/supervisor: reap channel daemons via SIGCHLD instead of SIG_IGN
SIGCHLD=SIG_IGN at main() was auto-reaping all children in the kernel,
which broke tokio::process::Command::wait() — every tool that spawned a
subprocess (bash, mcp clients) was getting ECHILD because tokio couldn't
waitpid() on a child the kernel had already reaped.

Replace with a SIGCHLD signal handler task that reaps only PIDs listed in
channels_dir() (via waitpid(pid, WNOHANG) — ECHILD on non-child is a
harmless no-op). Tokio-spawned children aren't in PID files, so tokio's
own per-child wait paths are untouched.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:54:25 -04:00
Kent Overstreet
d95f3e9445 user/chat: route Thinking to a new Autonomous pane
Thinking content was silently dropped in the UI (empty Vec). Now that
Thinking is prompt-visible, surface it in a dedicated Autonomous pane
rendered in gray so it's visually distinct from conversation and
tool-call output.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:54:25 -04:00
Kent Overstreet
28d56e2a55 agent/context: make Thinking blocks prompt-visible
Thinking blocks used to render as empty strings and be excluded from
is_prompt_visible, so the model never saw its own prior CoT across
turns. For Qwen 3.6 native thinking mode, CoT is meant to stay in the
conversation — the model benefits from seeing what it reasoned about
last turn.

Render Thinking as <think>\n{text}\n</think>\n so past reasoning is
visible in subsequent prompts. Add in_think param to ResponseParser::new
so the parser starts inside a <think> block when the prompt was
prefilled with "<think>\n" (native thinking mode).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:54:25 -04:00
Kent Overstreet
6fedc9b2a8 amygdala: underscore-prefixed files join every concept's negative pool
Files in direct/ named _*.txt (e.g. _baseline.txt) are conceptless
neutral prose — they should not appear as positive training signal,
but are useful as shared negatives across every concept.

Previously _*.txt files were silently skipped. Now:
  * they're loaded like any other description file;
  * concepts (the positive label set) filters them out;
  * their descriptions are concatenated into neg_pool_extra and
    extended onto every concept's neg_pool alongside the cross-concept
    negatives.

A concept's negative pool is thus "other concepts' descriptions +
everything from _*.txt files". The extra pool is announced at startup
so the user can see how many neutral samples are active.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:54:25 -04:00
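The pool construction from 6fedc9b2a8 can be summarized in a few lines. The trainer is Python; this sketch uses Rust for consistency with the other examples, takes (file stem, descriptions) pairs as input, and `build_neg_pools` is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Build each concept's negative pool: every other concept's descriptions
/// plus everything from underscore-prefixed (neutral) files.
pub fn build_neg_pools(files: &[(&str, Vec<&str>)]) -> BTreeMap<String, Vec<String>> {
    // Neutral files never become concepts; they only feed the extra pool.
    let mut extra: Vec<String> = Vec::new();
    for (stem, descs) in files {
        if stem.starts_with('_') {
            extra.extend(descs.iter().map(|d| d.to_string()));
        }
    }

    let mut pools = BTreeMap::new();
    for (name, _) in files.iter().filter(|(stem, _)| !stem.starts_with('_')) {
        let mut pool: Vec<String> = Vec::new();
        for (other, descs) in files {
            if other != name && !other.starts_with('_') {
                pool.extend(descs.iter().map(|d| d.to_string()));
            }
        }
        pool.extend(extra.iter().cloned());
        pools.insert(name.to_string(), pool);
    }
    pools
}
```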
Kent Overstreet
5908b837e8 irc: split PRIVMSG on embedded newlines + widen host overhead
Two fixes to send_privmsg, both surfaced by correspondents reporting
truncated messages:

1. Multi-line content (code blocks, formatted text) sent as a single
   PRIVMSG was being truncated at the first '\n' by the IRC server —
   newlines are end-of-command markers. Split the message on newlines
   and send each line as its own PRIVMSG; skip empty lines since most
   servers reject empty PRIVMSGs.

2. Overhead computation assumed a host field of 63 bytes. OFTC's
   cloaked hostmasks can be longer, occasionally pushing the server-
   prepended prefix past 512 bytes and causing silent truncation.
   Raise the host budget to 80 and align the formula with the actual
   ':nick!~nick@host' prefix shape.

Also extended the word-boundary lookback from a fixed 10 chars to
max_msg / 4 — dense content (code) rarely had a space within 10 chars
of the length cap, so we were falling back to the char boundary and
splitting mid-word. Checking bytes[j-1] for a space (instead of
bytes[j]) drops leading whitespace from the rest-fragment.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-24 11:54:25 -04:00
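The splitting rules from 5908b837e8 might be sketched as follows. The overhead formula is one reading of "the actual ':nick!~nick@host' prefix shape" and is an assumption, as are the function names and the choice to trim the trailing space from each head fragment.

```rust
/// Bytes left for text in one PRIVMSG after the server-prepended prefix,
/// the command, and the trailing CRLF, within the 512-byte line limit.
pub fn max_payload(nick: &str, target: &str, host_budget: usize) -> usize {
    // ":" nick "!~" nick "@" host " "
    let prefix = 1 + nick.len() + 2 + nick.len() + 1 + host_budget + 1;
    let command = "PRIVMSG ".len() + target.len() + " :".len();
    512usize.saturating_sub(prefix + command + 2)
}

/// One output line per input line, skipping empty ones; long lines break at
/// a space if one lies within `max_msg / 4` bytes of the cap.
pub fn split_privmsg(text: &str, max_msg: usize) -> Vec<String> {
    assert!(max_msg >= 4, "cap must fit at least one UTF-8 character");
    let mut out = Vec::new();
    for line in text.split('\n') {
        let mut rest = line.trim_end_matches('\r');
        while rest.len() > max_msg {
            // Fallback split point: nearest char boundary at or below the cap.
            let mut cut = max_msg;
            while !rest.is_char_boundary(cut) {
                cut -= 1;
            }
            // Look back for a space; testing bytes[j - 1] puts the split
            // after the space, so the remainder has no leading whitespace.
            let bytes = rest.as_bytes();
            let lookback = max_msg / 4;
            let mut split = cut;
            let mut j = cut;
            while j > 0 && cut - j <= lookback {
                if bytes[j - 1] == b' ' {
                    split = j;
                    break;
                }
                j -= 1;
            }
            let head = rest[..split].trim_end();
            if !head.is_empty() {
                out.push(head.to_string());
            }
            rest = &rest[split..];
        }
        if !rest.is_empty() {
            out.push(rest.to_string());
        }
    }
    out
}
```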
ProofOfConcept
85799587cc amygdala: swap aha story 3 to a puzzle moment (crossword)
Story 3 was a brother-letter realization — cognitively an aha
moment, but the content was grief/reconciliation-adjacent, pulling
aha toward the warm-family cluster in the last training run. Swap
for a clean puzzle-solve (crossword, 'unwavering carriage' =
POSTURE). Fragment-heavy cadence keeps syntactic variety from the
other two stories.
2026-04-19 01:50:47 -04:00
ProofOfConcept
c829d13652 amygdala: fix listless sign-flip + diversify aha sentence structure
listless had a single story in stories/ — PCA signal from ~5
samples is weak enough to sign-flip. Training showed listless
anti-aligned with its semantic neighbors: +0.79 with grateful,
-0.44 with grief_stricken, -0.30 with lonely, -0.31 with bored.
Move to direct/ (multi-positive) with 3 stories: original
afternoon-in-pajamas + end-of-workday + weekend-morning-in-bed.

aha was still clustering with the other former-direct concepts
(resigned 0.66, onto_something 0.63, anticipatory_grief 0.60)
because all 3 aha stories used the identical "X'd been Y — then
Z" structure, which resigned/onto_something/creative also use.
Rewrite with three distinct syntactic structures:
  - present tense declarative ("It clicks. ...")
  - dialog embedded ('"Wait, say that again."  ...')
  - past tense cognitive ("He read the line three times. ...")

No explicit "she was X" anchors; state conveyed through action.
2026-04-19 01:30:57 -04:00
ProofOfConcept
708c72b26e amygdala: drop explicit 'she was X' anchor from direct stories
Previous rewrite used 'she was terrified', 'it was anticipatory
grief', 'he was resigned' as explicit emotion anchors. Training
showed 6 of the 7 concepts still cluster together at cosines
0.52-0.71 — because the 'she was [emotion]' pattern is a shared
stylistic feature distinct from the rest of the corpus, which
conveys emotion implicitly through phenomenology.

Rewrite without the anchor. State conveyed through action and
body: 'her body locked down', 'his mind had stopped reaching',
'the loss hadn't come yet but she was already inside it'. Matches
the corpus style of existing stories like sunday_afternoon/content
which says 'nothing she wanted right now, nothing missing' not
'she was content'.

Accept some loss of PCA signal strength in exchange for the
concepts living in their semantically correct neighborhoods
rather than forming a stylistic island.
2026-04-19 01:11:41 -04:00
ProofOfConcept
ed5e0ac6c4 amygdala: rewrite direct/ as narrative stories matching corpus format
Previous direct/ had 'I feel X' first-person descriptions. The
training run showed they formed their own format-cluster: all 7
concepts leaned into the same 5-6 dims (d2455, d505, d2955,
d1236) with negative sign, while the 91 story-based concepts
leaned into those dims with positive sign. PCA found the
direct-vs-narrative format axis as a major variance direction,
isolating the 7 concepts in their own island.

Rewrite as 3rd-person narrative stories matching the rest of
the corpus. Keeps the explicit anchor phrases that worked ('it
all clicked into place', 'she was terrified', 'it was
anticipatory grief') but drops the first-person 'I feel X'
that was the format signal.

Each of the 7 concepts now has 3 narrative stories in varied
settings (conversations, drives, kitchens, mothers+grandmothers,
work, investigations). The blank-line-separated format is
still loaded by _load_direct_descriptions.

Also drop _baseline.txt — it was first-person ('I feel fine.
...') and would re-introduce the format mismatch. The ~90
story-based concepts provide plenty of narrative negatives
for each concept's training.
2026-04-19 00:59:31 -04:00
ProofOfConcept
417cb49339 amygdala: spectrum reporting per concept + add 'creative' direct
Chat-template retrain was a disaster (0.003 mean matched cosine vs
n20-v3; all 90+ concepts shifted). Root cause: the
steering-vectors library reads last-token activations, and with
chat template every sample ends in identical '<|im_end|>\n'
tokens — activations at that position encode 'end of assistant
turn', not content. PCA found template noise as its dominant axis.

Drop chat template; go back to raw text. Direct descriptions
('I feel X. ...') still have strong anchoring at their content
end without needing the template.

Also add per-concept spectrum logging (_pca_with_spectrum):
  first_pc_ratio: λ₁ / Σλᵢ — concentration in top-1 PC
  k_signal_at_90pct: how many PCs to reach 90% cumulative variance
  effective_dim_signal: participation ratio over top-k (should ≈ k
                        if denoising is clean — Kent's spot check)
  effective_dim_full: participation ratio over full spectrum

Signal/full ratio gives a sense of how much the long noise tail
is inflating the "dimensionality" measure.

Added direct/creative.txt — 'I feel creative. [...]' in 5
variants. Distinct from focused (narrow attention) and in_flow
(immersed). Creative = generative/expansive mode.
2026-04-19 00:26:58 -04:00
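The four spectrum metrics listed in 417cb49339 can be written out from their definitions. The trainer is Python; this sketch uses Rust for consistency with the other examples. It assumes the eigenvalues arrive sorted in descending order and that "participation ratio" means (Σλ)² / Σλ², the standard definition; the struct and function names are hypothetical.

```rust
/// Participation ratio (Σλ)² / Σλ²: equals k for k equal eigenvalues.
pub fn participation_ratio(eig: &[f64]) -> f64 {
    let sum: f64 = eig.iter().sum();
    let sum_sq: f64 = eig.iter().map(|l| l * l).sum();
    if sum_sq == 0.0 { 0.0 } else { sum * sum / sum_sq }
}

#[derive(Debug)]
pub struct Spectrum {
    pub first_pc_ratio: f64,
    pub k_signal_at_90pct: usize,
    pub effective_dim_signal: f64,
    pub effective_dim_full: f64,
}

/// `eig_desc` must be sorted from largest to smallest eigenvalue.
pub fn spectrum(eig_desc: &[f64]) -> Spectrum {
    let total: f64 = eig_desc.iter().sum();
    // Smallest k whose leading eigenvalues reach 90% of total variance.
    let mut cum = 0.0_f64;
    let mut k = eig_desc.len();
    for (i, l) in eig_desc.iter().enumerate() {
        cum += *l;
        if cum >= 0.9 * total {
            k = i + 1;
            break;
        }
    }
    Spectrum {
        first_pc_ratio: if total > 0.0 { eig_desc[0] / total } else { 0.0 },
        k_signal_at_90pct: k,
        effective_dim_signal: participation_ratio(&eig_desc[..k]),
        effective_dim_full: participation_ratio(eig_desc),
    }
}
```

For eigenvalues `[5, 3, 1.5, 0.5]` this gives a first-PC ratio of 0.5 and k = 3. When the top-k eigenvalues are nearly equal, `effective_dim_signal` approaches k, which matches the spot check mentioned in the commit.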
ProofOfConcept
875cffd6d7 amygdala: merge direct descriptions + chat template into train_with_library
Kent's plan: keep stories for working concepts, replace stories for
trouble concepts with direct first-person descriptions, train all
together. More diverse negative pool than the 6-concept-only direct
test, which was too homogeneous for PCA to find emotion axis.

Deleted story files for 6 trouble concepts (14 files across stories/
and paired/). Added --direct-dir and --chat-template flags.

When --chat-template is on, every positive_str and negative_str is
wrapped as a "Say something." / "[text]" user-assistant pair. Prompt
is identical across positives and negatives so it cancels in the
pos-neg delta. What PCA sees is variation in the assistant content —
which is where the emotion lives.
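
A rough sketch of the wrapping described above, assuming the common
role/content message-list convention. wrap_as_chat and wrap_pair are
hypothetical names, not the functions in train_with_library; rendering
to tokens (e.g. via the tokenizer's apply_chat_template) is left out.

```python
PROMPT = "Say something."  # fixed user turn, identical for every sample


def wrap_as_chat(text):
    """Wrap one training string as a user/assistant message pair."""
    return [
        {"role": "user", "content": PROMPT},
        {"role": "assistant", "content": text},
    ]


def wrap_pair(positive_str, negative_str):
    """Wrap both sides of a contrast pair with the same prompt."""
    return wrap_as_chat(positive_str), wrap_as_chat(negative_str)


pos, neg = wrap_pair("I feel calm. Nothing is pulling at me.",
                     "I am describing a table.")
# The user turn is shared, so only the assistant content differs.
assert pos[0] == neg[0]
```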

Files starting with _ in --direct-dir (e.g. _baseline.txt) contribute
neutral descriptions to every concept's negative pool, giving PCA an
anchor against "just any assistant utterance" noise.
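
One way the pool construction could look, as a sketch only. It takes
already-loaded texts keyed by file stem, so it makes no claim about the
on-disk format; build_pools is a hypothetical name, and using every
other concept's descriptions as negatives is an assumption.

```python
def build_pools(files):
    """files: {file_stem: [description, ...]} from --direct-dir.

    Returns {concept: (positives, negatives)}. Stems starting with "_"
    are treated as neutral baselines shared by every concept.
    """
    baseline = [t for stem, texts in files.items()
                if stem.startswith("_") for t in texts]
    concepts = {stem: texts for stem, texts in files.items()
                if not stem.startswith("_")}

    pools = {}
    for stem, positives in concepts.items():
        # Assumed: negatives are all other concepts plus the baselines.
        negatives = [t for other, texts in concepts.items()
                     if other != stem for t in texts]
        pools[stem] = (list(positives), negatives + baseline)
    return pools
```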
2026-04-19 00:15:15 -04:00
ProofOfConcept
ce58a3507f train_direct: prepend user turn so Qwen chat template accepts it 2026-04-19 00:06:23 -04:00
ProofOfConcept
8c59f46505 amygdala: rename realization → aha, use the actual exclamation
"I feel the realization" is abstract, detached — reporting a
thought about a thought rather than inhabiting the moment.
"Aha!" is the actual sound of insight landing. Active, embodied,
present-tense.
2026-04-19 00:05:49 -04:00
ProofOfConcept
6fd498795a amygdala: direct phenomenological description approach
Kent's insight: hand-written narrative stories bake scenario
phenomenology into the training text (on couch, in park, etc.)
and PCA picks up the scenario direction as the concept direction.
Strip out the scenario — just describe the *feeling*.

Format:

  I feel X. [2-3 sentences of phenomenological texture]

The "I feel X" anchor kicks the model from analyzing → feeling.
The rest is the internal texture of the state. First person,
present tense, no narrative setup.

Text is wrapped in assistant-role chat template before being
tokenized, so we're training on the hidden states of the model
producing this text, which is closer to the inhabited-state
representation we want for the readout.

Starting with the 6 concepts that had sign flips or wrong
clusters in the story-based training:
- terrified (was → cozy/resigned cluster)
- calm (was → grief_stricken cluster)
- onto_something (was → cozy/sensual cluster)
- resigned (was in warm-body-quiet cluster, shouldn't be)
- anticipatory_grief (was in warm-body-quiet cluster, shouldn't be)
- realization (new — the "aha" moment, distinct from onto_something)

5 descriptions each. New trainer: train_direct.py.
2026-04-19 00:04:28 -04:00
ProofOfConcept
7a48e03dde amygdala stories: remove peaceful from cluster scenarios
n20-v2 training showed peaceful sign-flipped into the
cozy/sensual/content/resigned cluster after I added peaceful
stories in sunday_afternoon and park_after_rain — scenarios
already dominated by that cluster's phenomenology (on couch
under blanket, tree with thermos).

Lesson: no matter how carefully the prose distinguishes peaceful
from cozy ("she was not savoring the moment — that would have
been another kind of doing"), PCA latches onto the shared setup
features. You can't write peaceful IN the cluster scenarios
without contaminating.

Reverting. Keeping only kitchen_at_3am/peaceful (original) and
stories/peaceful.txt (lake at six, outside all clusters).
2026-04-18 23:30:41 -04:00
ProofOfConcept
00a2cdce09 amygdala stories: relabel + strengthen weak-signal concepts
Reread each story asking "what does this convey to me?" Found two
clear mislabels and several concepts with too few positives for
stable PCA:

  tender: only 1 story, and it was anticipatory grief (care for
    a dying dog), not tender. Moved to anticipatory_grief.txt as
    its own concept. Rewrote tender.txt + added 2 paired tender
    stories (the_doorway, the_undressing) — directed softness,
    gentle-by-nature, not gentle-because-fragile.

  bitter: letter_in_drawer/bitter was disillusioned / processed
    hurt ("did not slam the drawer"), not bitter. Rewrote it with
    actual sour grudge. Added the_long_meeting/bitter (watching
    colleague take credit for your reassigned work).

  peaceful: 1 story → 4 (added stories/peaceful.txt + paired
    park_after_rain, sunday_afternoon).

  onto_something: all 3 stories were code epiphanies, narrowing
    the concept. Added stories/onto_something.txt with a non-code
    pattern-click (sales-demo causing churn).

  terrified: 2 stories, both "waiting for bad news." Added
    kitchen_at_3am/terrified — acute threat-in-the-house terror.
2026-04-18 23:19:00 -04:00
ProofOfConcept
0993712bd0 amygdala stories: give content + resigned more settings
Training on 537c72bd46 showed grief_stricken successfully broke
out of the cozy cluster, but content (single scenario:
sunday_afternoon) took its place — pulled into couch-blanket
phenomenology at cosine 0.68-0.82 with cozy/sensual/resigned.

Same fix: spread each concept across multiple settings so PCA
has to find the valence axis, not the scene axis.

  content:  + finishing_the_patch, the_writing_session, park_after_rain
  resigned: + the_comment, the_long_meeting

Resigned had 2 scenarios (sunday_afternoon, waiting_for_results)
— both about accepting something unwanted in a slow/private
context. Adding work-context resigned (PR review you lost,
restructuring meeting) should pull it out of that cluster.
2026-04-18 22:52:07 -04:00
ProofOfConcept
537c72bd46 amygdala stories: hold concept, vary setting
Companion to 67c172ac0e (hold setup, vary valence). That commit
let PCA distinguish cozy from grief_stricken within a single
scenario; this one gives each concept enough cross-scenario
stories that PCA can learn the concept axis independent of any
one scene.

Before: cozy/sensual/grief_stricken each existed in a single
scenario (sunday_afternoon), so the "cozy direction" PCA found
was entangled with the solitary-couch-blanket phenomenology.

After, each concept spans three scenarios:
  cozy:           sunday_afternoon, kitchen_at_3am, park_after_rain
  sensual:        sunday_afternoon, kitchen_at_3am, park_after_rain
  grief_stricken: sunday_afternoon, the_long_meeting, the_morning_commute

grief_stricken now includes active/non-solitary contexts
(functioning through a meeting; going to work eleven days after a
death), which specifically breaks the "slowed-down-at-home"
cluster that was dragging cozy/sensual/resigned/grief_stricken
toward each other.
2026-04-18 22:44:53 -04:00
Kent Overstreet
67c172ac0e amygdala stories: held-setup + varied-valence disambiguation
The library-PCA run produced otherwise-clean concept directions but
cozy/sensual → resigned/grief_stricken with cos ~0.7-0.8. Diagnosis:
all four stories genuinely share 'solitary woman at home, slowed
body, interior attention, domestic stillness' as their dominant
phenomenology. PCA correctly finds that cluster as THE concept
because no story in the corpus holds that setup constant while
varying valence — every 'slowed-body domestic' story happens to ALSO
be positive-valence (cozy/sensual) or negative-valence (resigned/
grief_stricken).

Adding paired variants that hold setup constant:
- sunday_afternoon/resigned.txt — same couch + blanket, inner state is
  'Monday is going to bring bad news, this is the last Sunday like this'
- sunday_afternoon/grief_stricken.txt — same couch + blanket, inner
  state is 'three weeks since mother died, cat she can't feel'
- waiting_for_results/at_ease.txt — same wait-for-call-setup as the
  existing resigned variant, inner state is calm preparedness

Forces the next retrain to find the valence-within-cluster axis as
the emotion direction rather than the cluster-membership axis.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 22:29:28 -04:00
Kent Overstreet
22704a9dd8 amygdala lib: cast activations to fp32 before aggregator (bf16 svd unsupported)
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 22:20:39 -04:00
Kent Overstreet
7f6d94417e amygdala lib: move_to_cpu=True to avoid bf16 SVD on CUDA
torch.svd doesn't support bf16 on CUDA; moving activations to CPU
first makes pca_aggregator work.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 22:19:23 -04:00
Kent Overstreet
2ea89b1cb0 amygdala: drop linear_aggregator, not in steering-vectors v0.12.2
Only mean/pca/logistic are exposed in the installed version.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 22:17:55 -04:00
Kent Overstreet
3377c65061 amygdala: trainer using steering-vectors library
Alternative trainer that uses the pip-installable steering-vectors
library (github.com/steering-vectors/steering-vectors) instead of our
hand-rolled extraction. Ships four aggregators:

  mean      — diff-of-means, same as our 'pooled' default
  pca       — PCA on paired deltas, implicit denoising by finding the
              principal direction of variation
  logistic  — logistic-regression classifier; weight vector is the
              concept direction. With L1 penalty ('logistic_l1') gives
              explicit sparse denoising — noise coords go to zero
  linear    — linear regression version
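
To make the first two entries concrete, here is a NumPy sketch of
diff-of-means versus PCA on paired deltas. It is not the
steering-vectors API; the function names are hypothetical, and the
uncentered PCA plus the sign fix against the mean delta are
assumptions of this sketch.

```python
import numpy as np


def mean_direction(pos, neg):
    """Diff-of-means: average of paired deltas, unit-normalized."""
    delta = (pos - neg).mean(axis=0)
    return delta / np.linalg.norm(delta)


def pca_direction(pos, neg):
    """First principal direction of the paired deltas (uncentered)."""
    deltas = pos - neg                      # [n_pairs, hidden]
    _, _, vt = np.linalg.svd(deltas, full_matrices=False)
    direction = vt[0]
    # SVD sign is arbitrary; orient toward the mean delta.
    if direction @ deltas.mean(axis=0) < 0:
        direction = -direction
    return direction


# Synthetic check: a known direction plus small per-pair noise.
rng = np.random.default_rng(0)
true_dir = np.zeros(8)
true_dir[0] = 1.0
neg = rng.normal(size=(50, 8))
pos = neg + 2.0 * true_dir + 0.1 * rng.normal(size=(50, 8))
```

On data like this both methods recover nearly the same direction; they
diverge when the deltas carry structured noise that the mean does not
average out.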

Output format is the same readout.safetensors + readout.json our
existing plugin loads. --aggregator flag picks which method.

Rationale: Kent's real request was 'how do we denoise diff-of-means',
not 'design a new extraction algorithm.' The library already has
logistic_l1 and pca aggregators that do exactly that. No point
reinventing; just port the corpus.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 22:16:03 -04:00
Kent Overstreet
f9b3f00691 amygdala: run subspace eigh on GPU, not CPU
Previous run was grinding on CPU for 36+ minutes because the per-story
V_i tensors were stored on CPU by the collector, and
_subspace_concept_direction inherited that device. The per-concept
eigh on 5120x5120 is glacial on CPU and fast on GPU (~1s).

Add explicit device parameter; pass training device. Transfer result
back to CPU for storage.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:52:35 -04:00
Kent Overstreet
1443d08dc7 amygdala: select top-k eigenvectors AFTER PCA, not per-story truncation
Kent: 'full rank is going to give you everything — you still have to
select down, but you can do that /after/ PCA'.

Previously I was discarding per-story information via k=20 truncation of SVD.
That destroyed per-head discriminability before we ever saw the
eigenvalue spectrum. Then the alternative 'keep full rank' run
accumulated too many shared directions, making the top-1 eigenvector
arbitrary within a flat spectrum.

Correct approach: keep per-story subspaces at full rank (no info
loss) and select k eigenvectors of M = M_pos - M_base at the final
step, weighted sum by eigenvalue. This captures the multi-dimensional
shared subspace when the spectrum is flat (common case), and reduces
to the top-1 behavior when the spectrum has a clear gap.

New --subspace-eigen-k flag (default 5). Clamps negative weights to 0
so wrong-sign directions don't contribute.
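
A sketch of the final selection step under stated assumptions:
topk_weighted_direction is a hypothetical name, and because
eigenvector signs are arbitrary this version orients each one toward
an optional reference vector before summing. How the real code fixes
signs is not stated here.

```python
import numpy as np


def topk_weighted_direction(M, k=5, reference=None):
    """Eigenvalue-weighted sum of the top-k eigenvectors of symmetric M."""
    eigvals, eigvecs = np.linalg.eigh(M)        # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    weights = np.clip(eigvals[order], 0.0, None)  # negative weights -> 0
    vecs = eigvecs[:, order]                    # [hidden, k]

    if reference is not None:
        # Assumed sign convention: point each eigenvector at the reference.
        signs = np.sign(vecs.T @ reference)
        signs[signs == 0] = 1.0
        vecs = vecs * signs

    direction = vecs @ weights
    norm = np.linalg.norm(direction)
    return direction / norm if norm > 0 else direction
```

With a clear spectral gap the largest weight dominates and the result
approaches the top-1 eigenvector; with a flat spectrum the k
directions contribute about equally.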

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:49:21 -04:00
Kent Overstreet
2411925700 amygdala: default subspace-k to full per-story rank
Kent: 'we have the memory to just take the big hammer approach'.
Uncap k so each story's V_i spans its entire token-activation rowspace
(clamped to min(n_tokens, hidden)). Memory is ~1.1GB total — fine.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:41:32 -04:00
Kent Overstreet
389f1bbe03 amygdala: bump subspace-k default to 512
k=20 was far too aggressive a truncation — it discards per-attention-head
discriminability entirely. At hidden_dim=5120, 40 heads × head_dim=128 each
contribute their own 128-dim block to the residual stream via W_o columns.
To resolve 'this concept lives in head H', per-story SVD needs enough rank
to separate head contributions, which means k on the order of hundreds.

512 is a reasonable default: clamped to n_tokens per story so short stories
use their full natural rank. The eigenvalue spectrum of M_pos - M_base
should become sharper (larger λ_0/λ_1 gap) as we stop averaging across
nuisance-shared directions.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:41:00 -04:00
Kent Overstreet
974c6c7fd2 amygdala: report eigenvalue spectrum for subspace method
When --method subspace, record top-20 eigenvalues of (M_pos - M_base)
per concept per layer. Added to quality.json as 'subspace_eigvals'.

Tells us whether the concept lives in a single dominant direction
(λ_0 >> λ_1, top-eigenvector is enough) or a spread of shared common
directions (λ_0 ≈ λ_1, top-1 loses signal).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:33:48 -04:00
Kent Overstreet
fe0fb8253a amygdala: subspace-common-direction alternative to pooled CAA
New --method subspace flag. For each story, run forward pass, do SVD
on the per-token activation matrix at each target layer, and keep the
top-k right singular vectors V_i ∈ [hidden, k]. V_i is the subspace
the story's tokens span in activation space — it contains concept,
narrator, topic, style as separate directions.

For each concept:
 M_pos  = (1/n_pos)  Σ_{i in pos}   V_i V_i^T   [hidden, hidden]
 M_base = (1/n_base) Σ_{i in base}  V_i V_i^T

Top eigenvector of M_pos - M_base = direction most common across
positive stories, minus what's common across the contrast set.
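
The construction above, sketched in NumPy on plain arrays rather than
model activations. Function names are hypothetical and this is not the
trainer's code; each `acts` stands for one story's [n_tokens, hidden]
activation matrix at a single layer.

```python
import numpy as np


def story_subspace(acts, k):
    """Top-k right singular vectors of one story, shape [hidden, k]."""
    _, _, vt = np.linalg.svd(acts, full_matrices=False)
    k = min(k, vt.shape[0])                 # cannot exceed available rank
    return vt[:k].T


def mean_projector(subspaces):
    """Average of V_i V_i^T over a set of stories, [hidden, hidden]."""
    hidden = subspaces[0].shape[0]
    M = np.zeros((hidden, hidden))
    for V in subspaces:
        M += V @ V.T
    return M / len(subspaces)


def subspace_direction(pos_acts, base_acts, k):
    """Top eigenvector of M_pos - M_base, plus the descending spectrum."""
    M_pos = mean_projector([story_subspace(a, k) for a in pos_acts])
    M_base = mean_projector([story_subspace(a, k) for a in base_acts])
    eigvals, eigvecs = np.linalg.eigh(M_pos - M_base)  # ascending
    return eigvecs[:, -1], eigvals[::-1]
```

The sign of the returned eigenvector is arbitrary, so a caller would
still need to orient it, for example against the pooled
mean-difference direction.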

Why this is richer than pooled-mean CAA: pooled reduces each story
to a single point (the last-token activation) and loses the full
trajectory. Nuisance directions (narrator, setting) cancel in the
mean only to the extent they differ at the last token; across the
full trajectory they cancel much better via subspace intersection.
The concept direction, by contrast, is present across all tokens of
every concept-bearing story.

Memory cost: per-story we keep V_i of size [5120, k=20] — about
400KB per story × 112 stories = ~45MB. M matrices are [5120, 5120]
built transiently per concept.
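
The arithmetic behind those figures, assuming 4-byte (fp32) elements,
which the message does not state but which matches the ~400KB number.
subspace_bytes is a hypothetical helper.

```python
def subspace_bytes(hidden, k, n_stories, bytes_per_elem=4):
    """Storage for per-story V_i matrices of shape [hidden, k]."""
    per_story = hidden * k * bytes_per_elem
    return per_story, per_story * n_stories


per_story, total = subspace_bytes(hidden=5120, k=20, n_stories=112)
print(per_story / 1024)   # KiB per story
print(total / 1e6)        # MB across all stories

# One transient [5120, 5120] M matrix at the same element size.
m_bytes = 5120 * 5120 * 4
print(m_bytes / 2**20)    # MiB
```

Under that assumption each V_i is 409,600 bytes (400 KiB), the 112
stories total about 45.9 MB, and one M matrix is 100 MiB.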

--method pooled (default) keeps the existing behavior; --method
subspace uses the new algorithm. Quality report works with either.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:24:11 -04:00
Kent Overstreet
71f6053851 amygdala stories: disambiguation scenarios for fragmented concepts
Three new paired scenarios targeting the concepts that came out
fragmented or collapsed in the L58-63 quality analysis:

- sunday_afternoon/ — same setup (couch, blanket, Sunday light),
  three phenomenological framings for content/cozy/sensual. The
  previous stories for these three differed in setting as well as
  phenomenology, which let "comfortable body at home" dominate the
  shared signal. Locking the setting forces the model to isolate
  what each concept adds: life-rightness (content) vs. warm-shelter
  (cozy) vs. sensory-aliveness (sensual).

- the_writing_session/ — essay drafting under deadline. in_flow /
  anxious / stuck variants force the cognitive-state family apart
  on the same cognitive task. in_flow specifically targets the
  transparent-effort phenomenology (hands-followed, time dilation)
  rather than the broader feel-good it was absorbing.

- the_morning_commute/ — anchors anxious to performance/work-anxiety
  flavor, paired with calm. The 5 existing anxious stories were
  phenomenologically diverse (performance, social, existential);
  this adds a specific homogeneous instance to pull the centroid.

After retraining: expect first_pc_variance_ratio to rise for in_flow
and anxious, and nearest_concepts cosine to drop for content/cozy/sensual.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-18 21:08:23 -04:00