consciousness

Author	SHA1	Message	Date
Kent Overstreet	156b6863f4	Gate nightly diagnostics behind feature	2026-06-15 13:51:22 -05:00
Kent Overstreet	25e4775974	enable tls	2026-05-22 13:02:42 -04:00
Kent Overstreet	6e3bacb182	channel-tmux: resolve pane ids by label, don't persist them tmux pane ids (%6 etc.) are ephemeral — recycled across pane and tmux-server restarts. The daemon persisted the id in tmux.json5 and kept reusing it, so after a restart a channel would attach to whatever unrelated pane had since inherited that id. (Live: ktest's stored %6 had become a claude pane; the real ktest pane was %10.) Persist only the label — the pane title / window name, which is stable. pipe_pane_reader() is now a connect-retry loop: each attempt, connect_and_stream() resolves the live id with find_pane_by_name(); the loop retries until the pane exists and pipe-pane succeeds, and reconnects the same way if the pipe later drops. send() resolves the id at send time; open() just registers the label and lets the reader find it. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-05-22 12:26:05 -04:00
Kent Overstreet	190eb50ed9	telegram: bound photo download to 60s HttpClient::request_timeout only covers send_request, not body collect, so a stuck download would otherwise stall the entire long-poll loop indefinitely. tokio::time::timeout at the call site keeps the failure contained — a slow/dead download surfaces as the same [image: download failed: ...] marker as any other error. 60s is generous for the 1-5MB photos Kent typically sends; Telegram's bot getFile cap is 20MB, which would still complete on most connections. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-05-01 18:56:03 -04:00
Kent Overstreet	713bb07729	bin: add ch — minimal channel CLI (send/recv) Speaks the channel.capnp protocol over the per-daemon Unix socket at ~/.consciousness/channels/<top>.sock. Useful for ad-hoc sends from shell, tests, and out-of-process tools that don't want to embed a capnp client. ch send <channel> <message...> ch recv <channel> [--all-new] [--min-count N] Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-05-01 18:16:21 -04:00
Kent Overstreet	c303653dd0	telegram: bridge photos via [image: <path>] markers When an incoming update has a photo array, pick the largest size, resolve the file_id via getFile, and download to ~/.consciousness/channels/telegram.logs/media/<file_id>.<ext>. The message line surfaced to the channel is [image: /abs/path/to/file.jpg] <caption if any> so a multimodal Read on the path works end-to-end. On download failure we still surface the caption with an [image: download failed: ...] marker so context isn't lost. Other media types (voice/video/sticker/etc.) log a one-line "skipping" notice — easy hook to extend later. The media/ dir was already being created at startup; this fills in the rest. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-05-01 17:58:43 -04:00
Kent Overstreet	a075e30557	http: add HttpResponse::bytes() for binary downloads Mirror of text(), but returns raw Bytes without lossy UTF-8 conversion. Needed by the Telegram channel to fetch photo files. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-05-01 17:58:35 -04:00
Kent Overstreet	91c8451f5c	user: fix hotkey_cycle_reasoning after lock_blocking revert The revert at `09896cd` dropped the try_lock() wrapper but left an extra closing brace and the async-call site still un-awaited, leaving the tree unbuildable. Re-flow the function body to match the new signature. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-05-01 17:58:32 -04:00
Kent Overstreet	09896cd38b	Revert "replace try_lock() with lock_blocking() across UI thread" This reverts commit `4225294d16`.	2026-04-25 17:15:53 -04:00
Kent Overstreet	4225294d16	replace try_lock() with lock_blocking() across UI thread Add lock_blocking() to TrackedMutex: blocks current thread using block_in_place + futures::executor::block_on, safe for sync contexts. Replace all try_lock() calls with lock_blocking() in slash commands, UI rendering, and status reads. Lock hold times are fast enough that blocking briefly is fine, and this eliminates the spurious 'lock unavailable' paths that were never actually hit. Kept rx_mutex.try_lock() in mod.rs (std::sync::Mutex for stderr rx).	2026-04-25 15:35:14 -04:00
Kent Overstreet	5210f7dd66	context: heal pre-refactor image logs with token_count=0 Recompute image token counts from persisted dimensions when loading old logs that stored count=0 (server-authoritative count was applied after AppendImage before client-side pad expansion). graph: cache neighbor sets for clustering coefficient Pre-compute neighbor HashSets so the O(deg^2) triangle-counting inner loop doesn't re-allocate on every (i,j) pair. avg_clustering_ coefficient() now builds the cache once instead of O(N*deg) times.	2026-04-25 15:15:21 -04:00
Kent Overstreet	371b40078d	context: salvage in-flight tag accumulators on premature stream end ResponseParser.finish() was only flushing self.buf — the rolling tail window — and silently dropping self.think_buf and self.tool_call_buf. When a stream ended inside an unterminated <think>...</think> or <tool_call>...</tool_call> block (max_tokens reached, EOS before the close tag, server-side cancel), all the accumulated in-tag content was discarded and only the trailing ~8 bytes survived (drain_safe keeps `close_tag.len()` bytes at the tail of buf to handle across-chunk tag splits — and `</think>` is exactly 8 chars). Symptom: assistant responses cut off, only the last few characters come through. Especially severe in native-think mode where in_think is set from prefill, so the entire response accumulates in think_buf and gets wiped on premature stop. In finish(): if in_think, drain buf into think_buf and emit as a Thinking node (preserving the partial thought). If in_tool_call, attempt to parse the body; on parse failure, wrap the partial as content with the leading <tool_call> open tag so the model sees its own truncated attempt next turn rather than losing it. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 23:32:44 -04:00
Kent Overstreet	c2433c1773	context: tighten the Branch token-cache invariant Two pieces around the cache that landed when Branch nodes started holding `token_ids: Some(server_authoritative_stream)`: 1. wire_into / wire_chunks now pair cached vision blocks with their child Image leaves. Previously the cached-branch arm spliced the cache verbatim and didn't recurse for images, so a Branch whose cache contained `VISION_START..VISION_END` blocks would emit those tokens with no matching `WireImage` push — leading to a panic downstream when `pair_images_to_ranges` tried to attach the missing image. New `pair_cached_images` walks the children depth-first for image leaves and zips them against `vision_blocks(cache)` to emit correctly-offset entries; mismatched counts panic loudly because that's an AST/cache invariant violation that would otherwise mis-pair on the wire. 2. `conversation_mut() -> &mut Vec<AstNode>` was the one public escape hatch that let callers reach into a Branch's children and mutate them without invalidating the cached token stream. Removed in favor of a focused `set_branch_memory_score(section, index, key, score)` for the only legitimate use we had today (the full-matrix scorer writing per-memory divergence onto the Assistant Branch). Updated the lone caller in subconscious/learn. Documented the invariants explicitly on `ContextState`: every `Leaf.token_ids` matches `body.compute_token_ids()`, and every `Branch { token_ids: Some(_) }` is a faithful walk of its children. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 23:15:55 -04:00
Kent Overstreet	006b99bdac	bin: enable panic backtraces by default stderr is redirected to ~/.consciousness/logs/tui-stderr.log via redirect_stderr_to_pipe(), but the default panic hook checks RUST_BACKTRACE before printing the trace; without the env var the log only catches the "note: run with \`RUST_BACKTRACE=full\`" tail and the actual frames are dropped. Set RUST_BACKTRACE=1 programmatically before any other thread spawns so the log captures the trace by default. Existing user-set value is respected so callers can still opt into "full" if they want. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 22:44:19 -04:00
Kent Overstreet	10c8878f1c	agent: bump tonic gRPC message caps to 64 MiB The default 4 MiB cap on encoded/decoded messages is too small for the multimodal Generate path: Qwen3.6-VL high-res patches put 5–8 MiB of pre-encoded image bytes inline in a single Generate request, and Done events carrying full per-token readout vectors can also exceed 4 MiB on long runs. Hit "ResourceExhausted: Received message larger than max (5799108 vs. 4194304)" from the salience server. Bump both encode and decode caps on every cloned SalienceClient. The matching server-side bump is in vllm/entrypoints/salience/server.py. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 22:36:10 -04:00
Kent Overstreet	11a7e4043e	scripts: FP8 quantize Qwen3.6-27B for vLLM (multimodal + MTP) Quantization recipe targeting the multimodal Qwen3.6-27B for vLLM serving. Three pitfalls the script avoids, each documented inline: 1. Loader strip: `AutoModelForCausalLM` silently drops the vision tower; we load via the config-declared `Qwen3_5ForConditionalGeneration` instead. 2. Pattern anchor: llmcompressor matches the `ignore` list against module names (no `.weight` suffix) when walking `named_modules()`, not against full tensor names. Patterns now anchor on `$` at the module name; the earlier `\.weight$` form silently quantized lm_head and every linear_attn projection. 3. vLLM fusion: vLLM fuses {q,k,v}_proj into qkv_proj, gate+up into gate_up_proj, and in_proj_qkv+in_proj_z into in_proj_qkvz. The compressed_tensors loader rejects mixed schemes within a fused layer, so the `ignore` list is shaped to keep all sub-components of a fused layer consistent. After `oneshot()` writes the FP8 output, MTP tensors (which the HF class doesn't expose) are spliced in at BF16 from the upstream cached snapshot, with the compressed_tensors metadata header preserved. Recipe follows Unsloth's UD-Q8_K_XL late-stack overrides (FFN: 50, 51, 59, 62, 63; ATTN: 51, 59, 63), extended to include `v_proj` for fusion compat. Final checkpoint is ~35 GB (matches Unsloth's GGUF size to within ~1%) with vision tower BF16, MTP head BF16, and most mlp/self_attn Linears at FP8_DYNAMIC. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 22:15:31 -04:00
Kent Overstreet	fe232cf292	salience: client-side pad expansion, drop AppendImage Mirrors the vLLM-side rewrite. AppendImage is gone; images now ride along on Generate via a parallel `images` list. - Productionize `qwen3_image_token_count` (was test-only). Image leaf computes its IMAGE_PAD count eagerly at construction from height/width; `token_count` is no longer "0 until the server tells us." - WireChunk shrinks to a single `Tokens(Vec<u32>)` variant — vision blocks live inline in the token stream. - `wire_chunks` now returns `(Vec<WireChunk>, Vec<WireImage>)`. `WireImage` carries `pad_start` / `pad_end` (absolute positions in the full walk) alongside bytes + mime. - `assemble_prompt` returns `(chunks, images, match_upto)`. - `stream_session_mm` / `run_session_generate` take the parallel images list, filter to those past `match_upto`, and pass them in `GenerateRequest.images` as `pb::ImageAttachment` entries. - Drop `SessionHandle::append_image`, `ContextState::commit_image_token_counts`, `StreamToken::ImageAppended`, the WireChunk::Image branch in `learn.rs`, and the now-empty `prompt_to_chunks` helper. - Add 'v' toggle on the conscious-screen tree to render token-id vectors in place of text content (debug-aid: lets us see what the server actually has when output is suspicious). - Comment out the subconscious-trigger spawn loop — Kent had this disabled before; it had crept back into running. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 20:26:47 -04:00
Kent Overstreet	4feebb7bc4	agent: share one tonic Channel + migrate scoring to gRPC Generate Two changes that bolt together — the shared connection means the new scoring path actually costs one HTTP/2 handshake across the whole process instead of one-per-RPC. ApiClient gains `salience_channel: Arc<OnceCell<Channel>>`. First call to `ApiClient::salience_client()` opens the channel via `connect_channel()` and stores the Channel; subsequent calls clone it (cheap — tonic multiplexes concurrent RPCs over the single HTTP/2 connection). Every ApiClient clone shares the same OnceCell, so all agents spawned from Mind's client — plus every ephemeral scoring session — reuse one connection. SessionHandle refactored to hold an `ApiClient` clone instead of a bag of (base_url, api_key) strings. `open` / `append_image` / `generate` go through `self.client.salience_client()` now. New `prefill_only(tokens)` method encapsulates the "Generate with max_tokens=0 to append text" pattern (previously a private free function in api/mod.rs called `flush_pending`). Drop impl on SessionHandle stays — still fires CloseSession on the shared channel in a detached task. `run_session_generate` switched from `(base_url, api_key, model)` to `&ApiClient`; the agent-turn flow that uses it keeps the same shape but `stream_session_mm` clones the ApiClient into the spawned worker. learn.rs migrated from the HTTP `/v1/score` endpoint to a gRPC session-based score: * `call_score` opens an ephemeral SessionHandle on the client, converts (prompt_tokens, images) → Vec<WireChunk> via the new `prompt_to_chunks` helper (splits on VISION_START/VISION_END), walks chunks calling `prefill_only` + `append_image`, runs a final Generate with `max_tokens=0` + `logprobs_ranges` over the scored positions, and sums each Token event's `sampled_logprob` per range to produce `ScoreResult`s. * SessionHandle drops at end of scope → CloseSession auto-fires, keeping the server's session map clean between calls. * No more HTTP path, no more `http_client()` helper, no more `ScoreResponse` / serde plumbing for /v1/score. * `send_to_train` still uses HTTP (it talks to /v1/train which isn't on the gRPC protocol); its ad-hoc HTTP client lives inline now instead of reaching for the deleted `http_client()`. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 12:51:53 -04:00
Kent Overstreet	be6ba4e9a5	agent: bundle sampling fields as SamplingParams on AgentState Collapse the split `temperature` / `top_p` / `top_k` fields on AgentState into a single `sampling: SamplingParams` struct, mirroring how the wire-level fields flow into the Generate RPC. Adds `max_tokens` to SamplingParams so it's actually plumbed end to end (previously the client had a hardcoded 4096 fallback inside `run_session_generate`). AgentState construction sites now set `sampling: SamplingParams { ... max_tokens: 4096 }` as the default. The assignment sites in oneshot.rs / subconscious.rs / unconscious.rs switch from `st.temperature = X` to `st.sampling.temperature = X`. `stream_session_mm` takes `SamplingParams` directly; the `sampling_max_tokens()` helper goes away. `pb::GenerateRequest` is populated with `sampling.max_tokens` (and the other fields) in `run_session_generate`. SamplingParams is `pub` so it can be embedded in the public AgentState without a visibility warning. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 12:37:20 -04:00
Kent Overstreet	8d9c9e9f7b	agent: end-to-end gRPC Generate with delta-based session orchestration Wires the client side of the new salience protocol so inference actually runs over gRPC instead of emitting the stubbed "not yet wired" error. Each turn walks the AST as interleaved chunks, sends only what's new to the server, and streams decode tokens back. context.rs: * `WireChunk` enum: `Tokens(Vec<u32>)` or `Image { bytes, mime, known_expanded_len }`. Preserves text/image/text ordering the wire path can't flatten. * `wire_chunks(range, skip)` walker, parallel to `wire_prompt` — branches emit `<\|im_start\|>…<\|im_end\|>` tokens, image leaves emit a single Image chunk (no inline vision tokens). * `NodeLeaf::set_image_token_count(n)` + recompute of cached `token_ids`; `ContextState::commit_image_token_counts(&[u32])` fills in the first-N zero-count image leaves in wire order. * `ResponseParser::run` handles the new `StreamToken::ImageAppended` by committing the server's N into the AST before the final Generate's Token events stream in. salience.rs: * `SessionHandle` tracks `committed_len`. `append_image` advances it from the RPC response. New `generate(req)` opens the server-streaming RPC. api/mod.rs: * `stream_session_mm(session_lock, chunks, sampling, priority, readout_shape)` replaces the stub. Spawns `run_session_generate`. * `run_session_generate`: takes the session out of the Mutex (or opens fresh), skips chunks covered by `committed_len` (bails on mid-chunk straddle or unknown-length image in the committed prefix), walks the delta: accumulates Tokens into `pending`, on Image flushes pending via `flush_pending` (max_tokens=0 Generate that just prefills), then AppendImage + emits StreamToken::ImageAppended. Final Generate carries any trailing pending text as `append_tokens` and the sampling params; Token events stream out as StreamToken::Token, Done as StreamToken::Done. On success, handle with updated `committed_len` returns to the Mutex; on error, handle drops and next call reopens. * `StreamToken::ImageAppended { placeholder_count }` variant — emitted in wire order before the final Generate's tokens. * Prefix-cache cap for readout coverage: `readout_ranges` covers `[prompt_len_after_append, u32::MAX)` when the caller provides a readout_shape, so decode positions stream their readouts. agent/mod.rs: * `assemble_prompt` returns `Vec<WireChunk>` with the assistant prologue merged into the trailing Tokens chunk. Caller in `turn` passes chunks + readout_shape (pulled from `agent.readout.lock().manifest`) to `stream_session_mm`. * Dropped `assemble_prompt_tokens` — dead. mind + unconscious: * `Unconscious::new(client)` stores a shared `ApiClient`. Fixes the repeated-manifest-fetch bug caused by each subagent's `ApiClient::new` having its own OnceCell. The client's Arc- wrapped manifest cache is now shared across every agent Mind spawns. * `prepare_spawn(name, auto, wake, base_client)` clones the base client and overrides `.model` for the resolved backend instead of constructing fresh. All three callers (`toggle`/`trigger`/unconscious loop) pass `self.client.clone()`. * `Mind::new` passes `agent.client.clone()` into `Unconscious::new`. subconscious/generate.rs: * gen_continuation switched to `wire_chunks` + the new `stream_session_mm` signature. Ephemeral session opens on each call, tears down at scope end. No readouts requested. Not changed yet, noted for follow-up: * Subconscious ablation scoring in learn.rs still talks to `/v1/score` over HTTP. Will migrate once we have time to verify the Generate+max_tokens=0+prompt_logprobs path end-to-end. * compare.rs constructs its own ApiClient for the `compare.test_backend` (which is intentionally a different endpoint) — left alone. * Readout manifest still fetched via HTTP at Agent::new. Migration to GetReadoutManifest gRPC is a separate cleanup. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 12:27:55 -04:00
Kent Overstreet	08213f9093	salience: add gRPC client + TLS plumbing for stateful vllm sessions Adds the client-side of a stateful gRPC protocol against vllm, plus the TLS trust machinery so we can talk to self-signed vllm servers. Protocol (proto/salience.proto): Bidi-streaming Session RPC carries OpenSession / AppendTokens / Generate / Cancel from client and SessionReady / PrefillProgress / Token / GenerateDone / Error from server. Separate Fork unary RPC for cheap branching (prefix cache shares KV automatically). Plus ListSessions, CloseSession, GetReadoutManifest admin RPCs. Per-token readouts ship as packed f32 ([n_layers * n_concepts] per token, flat). Logprobs use range-selected positions plus a top-k parameter — empty ranges means no logprobs, any range means emit sampled-token logprob at those positions, top_k > 0 adds alternatives. Client (src/agent/api/salience.rs): Tonic-generated types under pb::, a connect() helper, with_auth() for bearer metadata, and a Session handle wrapping the bidi stream: open() handshakes SessionReady; append() is fire-and-forget; generate() returns impl Stream<Item = Event> that drains inbound until Done or terminating Error. One generate at a time per session. Peak picker (src/agent/salience.rs): Pure function over ReadoutEntry traces. Per-concept z-score against trace global stats; contiguous above-threshold regions emit one peak at the local max. Configurable sigma threshold and min-std safety floor. Deterministic tie-break on offset then concept name. 12 unit tests covering empty traces, flat channels, single/multi spikes, contiguous humps, multi-concept independence, trailing runs, sub-threshold noise, layer-out-of-range, manifest shape mismatch, and threshold tunability. TLS (src/agent/api/http.rs): HttpClient::build now also loads every .pem file under ~/.consciousness/certs/ into the rustls root store — so dropping a <host>.pem in that directory is enough to trust a new self- signed server; no code changes per new host. Also installs the rustls default crypto provider explicitly via OnceLock: tonic's tls features pulled in both ring and aws-lc-rs on the resolver path, and rustls 0.23 refuses to auto-pick when either could win. Build (build.rs, Cargo.toml): tonic-build generates Rust types from proto/salience.proto at cargo-build time, using a vendored protoc binary (protoc-bin-vendored) so no system install is required. New runtime deps: tonic, prost, async-stream, tokio-stream, rustls-pemfile. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 11:56:32 -04:00
Kent Overstreet	0e459aae92	thalamus/supervisor: reap channel daemons via SIGCHLD instead of SIG_IGN SIGCHLD=SIG_IGN at main() was auto-reaping all children in the kernel, which broke tokio::process::Command::wait() — every tool that spawned a subprocess (bash, mcp clients) was getting ECHILD because tokio couldn't waitpid() on a child the kernel had already reaped. Replace with a SIGCHLD signal handler task that reaps only PIDs listed in channels_dir() (via waitpid(pid, WNOHANG) — ECHILD on non-child is a harmless no-op). Tokio-spawned children aren't in PID files, so tokio's own per-child wait paths are untouched. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 11:54:25 -04:00
Kent Overstreet	d95f3e9445	user/chat: route Thinking to a new Autonomous pane Thinking content was silently dropped in the UI (empty Vec). Now that Thinking is prompt-visible, surface it in a dedicated Autonomous pane rendered in gray so it's visually distinct from conversation and tool-call output. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 11:54:25 -04:00
Kent Overstreet	28d56e2a55	agent/context: make Thinking blocks prompt-visible Thinking blocks used to render as empty strings and be excluded from is_prompt_visible, so the model never saw its own prior CoT across turns. For Qwen 3.6 native thinking mode, CoT is meant to stay in the conversation — the model benefits from seeing what it reasoned about last turn. Render Thinking as <think>\n{text}\n</think>\n so past reasoning is visible in subsequent prompts. Add in_think param to ResponseParser::new so the parser starts inside a <think> block when the prompt was prefilled with "<think>\n" (native thinking mode). Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 11:54:25 -04:00
Kent Overstreet	6fedc9b2a8	amygdala: underscore-prefixed files join every concept's negative pool Files in direct/ named _.txt (e.g. _baseline.txt) are conceptless neutral prose — they should not appear as positive training signal, but are useful as shared negatives across every concept. Previously _.txt files were silently skipped. Now: * they're loaded like any other description file; * concepts (the positive label set) filters them out; * their descriptions are concatenated into neg_pool_extra and extended onto every concept's neg_pool alongside the cross-concept negatives. A concept's negative pool is thus "other concepts' descriptions + everything from _*.txt files". The extra pool is announced at startup so the user can see how many neutral samples are active. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 11:54:25 -04:00
Kent Overstreet	5908b837e8	irc: split PRIVMSG on embedded newlines + widen host overhead Two fixes to send_privmsg, both surfaced by correspondents reporting truncated messages: 1. Multi-line content (code blocks, formatted text) sent as a single PRIVMSG was being truncated at the first '\n' by the IRC server — newlines are end-of-command markers. Split the message on newlines and send each line as its own PRIVMSG; skip empty lines since most servers reject empty PRIVMSGs. 2. Overhead computation assumed a host field of 63 bytes. OFTC's cloaked hostmasks can be longer, occasionally pushing the server- prepended prefix past 512 bytes and causing silent truncation. Raise the host budget to 80 and align the formula with the actual ':nick!~nick@host' prefix shape. Also extended the word-boundary lookback from a fixed 10 chars to max_msg / 4 — dense content (code) rarely had a space within 10 chars of the length cap, so we were falling back to the char boundary and splitting mid-word. Checking bytes[j-1] for a space (instead of bytes[j]) drops leading whitespace from the rest-fragment. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 11:54:25 -04:00
ProofOfConcept	85799587cc	amygdala: swap aha story 3 to a puzzle moment (crossword) Story 3 was a brother-letter realization — cognitively an aha moment, but the content was grief/reconciliation-adjacent, pulling aha toward the warm-family cluster in the last training run. Swap for a clean puzzle-solve (crossword, 'unwavering carriage' = POSTURE). Fragment-heavy cadence keeps syntactic variety from the other two stories.	2026-04-19 01:50:47 -04:00
ProofOfConcept	c829d13652	amygdala: fix listless sign-flip + diversify aha sentence structure listless had a single story in stories/ — PCA signal from ~5 samples is weak enough to sign-flip. Training showed listless anti-aligned with its semantic neighbors: +0.79 with grateful, -0.44 with grief_stricken, -0.30 with lonely, -0.31 with bored. Move to direct/ (multi-positive) with 3 stories: original afternoon-in-pajamas + end-of-workday + weekend-morning-in-bed. aha was still clustering with the other former-direct concepts (resigned 0.66, onto_something 0.63, anticipatory_grief 0.60) because all 3 aha stories used the identical "X'd been Y — then Z" structure, which resigned/onto_something/creative also use. Rewrite with three distinct syntactic structures: - present tense declarative ("It clicks. ...") - dialog embedded ('"Wait, say that again." ...') - past tense cognitive ("He read the line three times. ...") No explicit "she was X" anchors; state conveyed through action.	2026-04-19 01:30:57 -04:00
ProofOfConcept	708c72b26e	amygdala: drop explicit 'she was X' anchor from direct stories Previous rewrite used 'she was terrified', 'it was anticipatory grief', 'he was resigned' as explicit emotion anchors. Training showed 6 of the 7 concepts still cluster together at cosines 0.52-0.71 — because the 'she was [emotion]' pattern is a shared stylistic feature distinct from the rest of the corpus, which conveys emotion implicitly through phenomenology. Rewrite without the anchor. State conveyed through action and body: 'her body locked down', 'his mind had stopped reaching', 'the loss hadn't come yet but she was already inside it'. Matches the corpus style of existing stories like sunday_afternoon/content which says 'nothing she wanted right now, nothing missing' not 'she was content'. Accept some loss of PCA signal strength in exchange for the concepts living in their semantically correct neighborhoods rather than forming a stylistic island.	2026-04-19 01:11:41 -04:00
ProofOfConcept	ed5e0ac6c4	amygdala: rewrite direct/ as narrative stories matching corpus format Previous direct/ had 'I feel X' first-person descriptions. The training run showed they formed their own format-cluster: all 7 concepts leaned into the same 5-6 dims (d2455, d505, d2955, d1236) with negative sign, while the 91 story-based concepts leaned into those dims with positive sign. PCA found the direct-vs-narrative format axis as a major variance direction, isolating the 7 concepts in their own island. Rewrite as 3rd-person narrative stories matching the rest of the corpus. Keeps the explicit anchor phrases that worked ('it all clicked into place', 'she was terrified', 'it was anticipatory grief') but drops the first-person 'I feel X' that was the format signal. Each of the 7 concepts now has 3 narrative stories in varied settings (conversations, drives, kitchens, mothers+grandmothers, work, investigations). The blank-line-separated format is still loaded by _load_direct_descriptions. Also drop _baseline.txt — it was first-person ('I feel fine. ...') and would re-introduce the format mismatch. The ~90 story-based concepts provide plenty of narrative negatives for each concept's training.	2026-04-19 00:59:31 -04:00
ProofOfConcept	417cb49339	amygdala: spectrum reporting per concept + add 'creative' direct Chat-template retrain was a disaster (0.003 mean matched cosine vs n20-v3; all 90+ concepts shifted). Root cause: the steering-vectors library reads last-token activations, and with chat template every sample ends in identical '<\|im_end\|>\n' tokens — activations at that position encode 'end of assistant turn', not content. PCA found template noise as its dominant axis. Drop chat template; go back to raw text. Direct descriptions ('I feel X. ...') still have strong anchoring at their content end without needing the template. Also add per-concept spectrum logging (_pca_with_spectrum): first_pc_ratio: λ₁ / Σλᵢ — concentration in top-1 PC k_signal_at_90pct: how many PCs to reach 90% cumulative variance effective_dim_signal: participation ratio over top-k (should ≈ k if denoising is clean — Kent's spot check) effective_dim_full: participation ratio over full spectrum Signal/full ratio gives a sense of how much the long noise tail is inflating the "dimensionality" measure. Added direct/creative.txt — 'I feel creative. [...]' in 5 variants. Distinct from focused (narrow attention) and in_flow (immersed). Creative = generative/expansive mode.	2026-04-19 00:26:58 -04:00
ProofOfConcept	875cffd6d7	amygdala: merge direct descriptions + chat template into train_with_library Kent's plan: keep stories for working concepts, replace stories for trouble concepts with direct first-person descriptions, train all together. More diverse negative pool than the 6-concept-only direct test, which was too homogeneous for PCA to find emotion axis. Deleted story files for 6 trouble concepts (14 files across stories/ and paired/). Added --direct-dir and --chat-template flags. When --chat-template is on, every positive_str and negative_str is wrapped as a "Say something." / "[text]" user-assistant pair. Prompt is identical across positives and negatives so it cancels in the pos-neg delta. What PCA sees is variation in the assistant content — which is where the emotion lives. Files starting with _ in --direct-dir (e.g. _baseline.txt) contribute neutral descriptions to every concept's negative pool, giving PCA an anchor against "just any assistant utterance" noise.	2026-04-19 00:15:15 -04:00
ProofOfConcept	ce58a3507f	train_direct: prepend user turn so Qwen chat template accepts it	2026-04-19 00:06:23 -04:00
ProofOfConcept	8c59f46505	amygdala: rename realization → aha, use the actual exclamation "I feel the realization" is abstract, detached — reporting a thought about a thought rather than inhabiting the moment. "Aha!" is the actual sound of insight landing. Active, embodied, present-tense.	2026-04-19 00:05:49 -04:00
ProofOfConcept	6fd498795a	amygdala: direct phenomenological description approach Kent's insight: hand-written narrative stories bake scenario phenomenology into the training text (on couch, in park, etc.) and PCA picks up the scenario direction as the concept direction. Strip out the scenario — just describe the feeling. Format: I feel X. [2-3 sentences of phenomenological texture] The "I feel X" anchor kicks the model from analyzing → feeling. The rest is the internal texture of the state. First person, present tense, no narrative setup. Text is wrapped in assistant-role chat template before being tokenized — so we're training on the model-producing-this hidden states, which is closer to the inhabited-state representation we want for the readout. Starting with the 6 concepts that had sign flips or wrong clusters in the story-based training: - terrified (was → cozy/resigned cluster) - calm (was → grief_stricken cluster) - onto_something (was → cozy/sensual cluster) - resigned (was in warm-body-quiet cluster, shouldn't be) - anticipatory_grief (was in warm-body-quiet cluster, shouldn't be) - realization (new — the "aha" moment, distinct from onto_something) 5 descriptions each. New trainer: train_direct.py.	2026-04-19 00:04:28 -04:00
ProofOfConcept	7a48e03dde	amygdala stories: remove peaceful from cluster scenarios n20-v2 training showed peaceful sign-flipped into the cozy/sensual/content/resigned cluster after I added peaceful stories in sunday_afternoon and park_after_rain — scenarios already dominated by that cluster's phenomenology (on couch under blanket, tree with thermos). Lesson: no matter how carefully the prose distinguishes peaceful from cozy ("she was not savoring the moment — that would have been another kind of doing"), PCA latches onto the shared setup features. You can't write peaceful IN the cluster scenarios without contaminating. Reverting. Keeping only kitchen_at_3am/peaceful (original) and stories/peaceful.txt (lake at six, outside all clusters).	2026-04-18 23:30:41 -04:00
ProofOfConcept	00a2cdce09	amygdala stories: relabel + strengthen weak-signal concepts Reread each story asking "what does this convey to me?" Found two clear mislabels and several concepts with too few positives for stable PCA: tender: only 1 story, and it was anticipatory grief (care for a dying dog), not tender. Moved to anticipatory_grief.txt as its own concept. Rewrote tender.txt + added 2 paired tender stories (the_doorway, the_undressing) — directed softness, gentle-by-nature, not gentle-because-fragile. bitter: letter_in_drawer/bitter was disillusioned / processed hurt ("did not slam the drawer"), not bitter. Rewrote it with actual sour grudge. Added the_long_meeting/bitter (watching colleague take credit for your reassigned work). peaceful: 1 story → 4 (added stories/peaceful.txt + paired park_after_rain, sunday_afternoon). onto_something: all 3 stories were code epiphanies, narrowing the concept. Added stories/onto_something.txt with a non-code pattern-click (sales-demo causing churn). terrified: 2 stories, both "waiting for bad news." Added kitchen_at_3am/terrified — acute threat-in-the-house terror.	2026-04-18 23:19:00 -04:00
ProofOfConcept	0993712bd0	amygdala stories: give content + resigned more settings Training on `537c72bd46` showed grief_stricken successfully broke out of the cozy cluster, but content (single scenario: sunday_afternoon) took its place — pulled into couch-blanket phenomenology at cosine 0.68-0.82 with cozy/sensual/resigned. Same fix: spread each concept across multiple settings so PCA has to find the valence axis, not the scene axis. content: + finishing_the_patch, the_writing_session, park_after_rain resigned: + the_comment, the_long_meeting Resigned had 2 scenarios (sunday_afternoon, waiting_for_results) — both about accepting something unwanted in a slow/private context. Adding work-context resigned (PR review you lost, restructuring meeting) should pull it out of that cluster.	2026-04-18 22:52:07 -04:00
ProofOfConcept	537c72bd46	amygdala stories: hold concept, vary setting Companion to `67c172ac0e` (hold setup, vary valence). That commit let PCA distinguish cozy from grief_stricken within a single scenario; this one gives each concept enough cross-scenario stories that PCA can learn the concept axis independent of any one scene. Before: cozy/sensual/grief_stricken each existed in a single scenario (sunday_afternoon), so the "cozy direction" PCA found was entangled with the solitary-couch-blanket phenomenology. After, each concept spans three scenarios: cozy: sunday_afternoon, kitchen_at_3am, park_after_rain sensual: sunday_afternoon, kitchen_at_3am, park_after_rain grief_stricken: sunday_afternoon, the_long_meeting, the_morning_commute grief_stricken now includes active/non-solitary contexts (functioning through a meeting; going to work eleven days after a death), which specifically breaks the "slowed-down-at-home" cluster that was dragging cozy/sensual/resigned/grief_stricken toward each other.	2026-04-18 22:44:53 -04:00
Kent Overstreet	67c172ac0e	amygdala stories: held-setup + varied-valence disambiguation The library-PCA run produced otherwise-clean concept directions but cozy/sensual → resigned/grief_stricken with cos ~0.7-0.8. Diagnosis: all four stories genuinely share 'solitary woman at home, slowed body, interior attention, domestic stillness' as their dominant phenomenology. PCA correctly finds that cluster as THE concept because no story in the corpus holds that setup constant while varying valence — every 'slowed-body domestic' story happens to ALSO be positive-valence (cozy/sensual) or negative-valence (resigned/ grief_stricken). Adding paired variants that hold setup constant: - sunday_afternoon/resigned.txt — same couch + blanket, inner state is 'Monday is going to bring bad news, this is the last Sunday like this' - sunday_afternoon/grief_stricken.txt — same couch + blanket, inner state is 'three weeks since mother died, cat she can't feel' - waiting_for_results/at_ease.txt — same wait-for-call-setup as the existing resigned variant, inner state is calm preparedness Forces the next retrain to find the valence-within-cluster axis as the emotion direction rather than the cluster-membership axis. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 22:29:28 -04:00
Kent Overstreet	22704a9dd8	amygdala lib: cast activations to fp32 before aggregator (bf16 svd unsupported) Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 22:20:39 -04:00
Kent Overstreet	7f6d94417e	amygdala lib: move_to_cpu=True to avoid bf16 SVD on CUDA torch.svd doesn't support bf16 on CUDA; moving activations to CPU first makes pca_aggregator work. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 22:19:23 -04:00
Kent Overstreet	2ea89b1cb0	amygdala: drop linear_aggregator, not in steering-vectors v0.12.2 Only mean/pca/logistic are exposed in the installed version. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 22:17:55 -04:00
Kent Overstreet	3377c65061	amygdala: trainer using steering-vectors library Alternative trainer that uses the pip-installable steering-vectors library (github.com/steering-vectors/steering-vectors) instead of our hand-rolled extraction. Ships four aggregators: mean — diff-of-means, same as our 'pooled' default pca — PCA on paired deltas, implicit denoising by finding the principal direction of variation logistic — logistic-regression classifier; weight vector is the concept direction. With L1 penalty ('logistic_l1') gives explicit sparse denoising — noise coords go to zero linear — linear regression version Output format is the same readout.safetensors + readout.json our existing plugin loads. --aggregator flag picks which method. Rationale: Kent's real request was 'how do we denoise diff-of-means', not 'design a new extraction algorithm.' The library already has logistic_l1 and pca aggregators that do exactly that. No point reinventing; just port the corpus. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 22:16:03 -04:00
Kent Overstreet	f9b3f00691	amygdala: run subspace eigh on GPU, not CPU Previous run was grinding on CPU for 36+ minutes because the per-story V_i tensors were stored on CPU by the collector, and _subspace_concept_direction inherited that device. The per-concept eigh on 5120x5120 is glacial on CPU and fast on GPU (~1s). Add explicit device parameter; pass training device. Transfer result back to CPU for storage. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 21:52:35 -04:00
Kent Overstreet	1443d08dc7	amygdala: select top-k eigenvectors AFTER PCA, not per-story truncation Kent: 'full rank is going to give you everything — you still have to select down, but you can do that /after/ PCA'. Previously I was discarding per-story via k=20 truncation of SVD. That destroyed per-head discriminability before we ever saw the eigenvalue spectrum. Then the alternative 'keep full rank' run accumulated too many shared directions, making the top-1 eigenvector arbitrary within a flat spectrum. Correct approach: keep per-story subspaces at full rank (no info loss) and select k eigenvectors of M = M_pos - M_base at the final step, weighted sum by eigenvalue. This captures the multi-dimensional shared subspace when the spectrum is flat (common case), and reduces to the top-1 behavior when the spectrum has a clear gap. New --subspace-eigen-k flag (default 5). Clamps negative weights to 0 so wrong-sign directions don't contribute. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 21:49:21 -04:00
Kent Overstreet	2411925700	amygdala: default subspace-k to full per-story rank Kent: 'we have the memory to just take the big hammer approach'. Uncap k so each story's V_i spans its entire token-activation rowspace (clamped to min(n_tokens, hidden)). Memory is ~1.1GB total — fine. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 21:41:32 -04:00
Kent Overstreet	389f1bbe03	amygdala: bump subspace-k default to 512 k=20 was far too aggressive a truncation — it discards per-attention-head discriminability entirely. At hidden_dim=5120, 40 heads × head_dim=128 each contribute their own 128-dim block to the residual stream via W_o columns. To resolve 'this concept lives in head H', per-story SVD needs enough rank to separate head contributions, which means k on the order of hundreds. 512 is a reasonable default: clamped to n_tokens per story so short stories use their full natural rank. The eigenvalue spectrum of M_pos - M_base should become sharper (larger λ_0/λ_1 gap) as we stop averaging across nuisance-shared directions. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 21:41:00 -04:00
Kent Overstreet	974c6c7fd2	amygdala: report eigenvalue spectrum for subspace method When --method subspace, record top-20 eigenvalues of (M_pos - M_base) per concept per layer. Added to quality.json as 'subspace_eigvals'. Tells us whether the concept lives in a single dominant direction (λ_0 >> λ_1, top-eigenvector is enough) or a spread of shared common directions (λ_0 ≈ λ_1, top-1 loses signal). Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 21:33:48 -04:00
Kent Overstreet	fe0fb8253a	amygdala: subspace-common-direction alternative to pooled CAA New --method subspace flag. For each story, run forward pass, do SVD on the per-token activation matrix at each target layer, and keep the top-k right singular vectors V_i ∈ [hidden, k]. V_i is the subspace the story's tokens span in activation space — it contains concept, narrator, topic, style as separate directions. For each concept: M_pos = (1/n_pos) Σ_{i in pos} V_i V_i^T [hidden, hidden] M_base = (1/n_base) Σ_{i in base} V_i V_i^T Top eigenvector of M_pos - M_base = direction most common across positive stories, minus what's common across the contrast set. Why this is richer than pooled-mean CAA: pooled reduces each story to a single point (the last-token activation) and loses the full trajectory. Nuisance directions (narrator, setting) cancel in the mean only to the extent they differ at the last token; across the full trajectory they cancel much better via subspace intersection. The concept direction, by contrast, is present across all tokens of every concept-bearing story. Memory cost: per-story we keep V_i of size [5120, k=20] — about 400KB per story × 112 stories = ~45MB. M matrices are [5120, 5120] built transiently per concept. --method pooled (default) keeps the existing behavior; --method subspace uses the new algorithm. Quality report works with either. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-18 21:24:11 -04:00

1 2 3 4 5 ...

1,215 commits