SIGCHLD=SIG_IGN at main() was auto-reaping all children in the kernel,
which broke tokio::process::Command::wait() — every tool that spawned a
subprocess (bash, MCP clients) was getting ECHILD because tokio couldn't
waitpid() on a child the kernel had already reaped.
Replace with a SIGCHLD signal handler task that reaps only PIDs listed in
channels_dir() (via waitpid(pid, WNOHANG) — ECHILD on non-child is a
harmless no-op). Tokio-spawned children aren't in PID files, so tokio's
own per-child wait paths are untouched.
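
Roughly the shape of the reaper task (a sketch, not the exact code: the
PID-file read behind pids_from_channels_dir() is elided, and the nix
crate is assumed for waitpid):

    use nix::sys::wait::{waitpid, WaitPidFlag};
    use nix::unistd::Pid;
    use tokio::signal::unix::{signal, SignalKind};

    async fn reap_channel_children(pids_from_channels_dir: impl Fn() -> Vec<i32>) {
        let mut sigchld = signal(SignalKind::child()).expect("install SIGCHLD stream");
        while sigchld.recv().await.is_some() {
            for pid in pids_from_channels_dir() {
                // WNOHANG so we never block; ECHILD on a non-child (e.g. a
                // tokio-spawned process) is a harmless no-op.
                let _ = waitpid(Pid::from_raw(pid), Some(WaitPidFlag::WNOHANG));
            }
        }
    }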
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Thinking content was silently dropped in the UI (empty Vec). Now that
Thinking is prompt-visible, surface it in a dedicated Autonomous pane
rendered in gray so it's visually distinct from conversation and
tool-call output.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Thinking blocks used to render as empty strings and be excluded from
is_prompt_visible, so the model never saw its own prior CoT across
turns. For Qwen 3.6 native thinking mode, CoT is meant to stay in the
conversation — the model benefits from seeing what it reasoned about
last turn.
Render Thinking as <think>\n{text}\n</think>\n so past reasoning is
visible in subsequent prompts. Add in_think param to ResponseParser::new
so the parser starts inside a <think> block when the prompt was
prefilled with "<think>\n" (native thinking mode).
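
A minimal sketch of the two pieces (render_thinking and parser_for are
illustrative names; ResponseParser::new may take more arguments than
shown):

    // Past reasoning rendered back into the prompt.
    fn render_thinking(text: &str) -> String {
        format!("<think>\n{text}\n</think>\n")
    }

    // If the prompt was prefilled with "<think>\n" (native thinking mode),
    // start the parser already inside the think block.
    fn parser_for(prompt: &str) -> ResponseParser {
        ResponseParser::new(prompt.ends_with("<think>\n"))
    }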
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Issue #5 (spqrz) flagged that web_search using DuckDuckGo
occasionally flakes out, and Google search directly is blocked
behind CAPTCHAs for non-browser clients. The Gemini free-tier API
exposes a grounded-search tool that effectively queries Google's
index and returns an LLM-summarized answer with source URLs.
Added as a SEPARATE tool rather than a transparent fallback for
web_search:
* web_search (DDG) returns raw results — title, URL, snippet per
hit — which the agent can reason over itself.
* gemini_search returns an LLM-pre-digested summary plus grounding
URLs. Useful for synthesis queries ("what's the consensus on X")
or when DDG is flaky, but it's another LLM in the loop so the
agent may want the raw variant for certain tasks.
Tool descriptions tell the agent to prefer web_search for raw
results and use gemini_search for synthesis / fallback. The agent
picks based on query shape.
Only registered when GEMINI_API_KEY is set in the environment
(gracefully absent otherwise). Uses gemini-2.0-flash which has a
generous free-tier rate limit. Parses grounding metadata for
source URLs so the agent can follow links.
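
The registration gate is just an env check. A sketch, with ToolRegistry
and GeminiSearch as placeholders for the real registry types:

    fn register_gemini_search(tools: &mut ToolRegistry) {
        // Only offer gemini_search when a key is present; otherwise the tool
        // is simply absent from the agent's tool list.
        if let Ok(api_key) = std::env::var("GEMINI_API_KEY") {
            tools.register(GeminiSearch::new(api_key, "gemini-2.0-flash"));
        }
    }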
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Two related fixes for last night's crash diagnosis:
1. Kill AgentState::no_compact. The reasoning ("forked agents
shouldn't compact because it blows the KV cache prefix") wasn't
worth the cost — forks with no compact recovery just *died* on
any oversize prompt, with no fallback. The KV cache invalidation
is a performance loss; failing the request entirely is a
correctness loss. Remove the flag, let every agent's overflow-
retry path call compact() up to 2 times.
2. Add pre-send size check in Agent::assemble_prompt. If the
context has grown past budget (context_window * 80%) since the
last compact — accumulation between turns, a fork assembling
more than expected, etc. — trim_conversation() is called before
wire_prompt. Since we tokenize client-side, we already know the
exact count, so there's no reason to round-trip an oversize
request to vLLM and get rejected.
Together these prevent the failure mode from last night: a
subconscious/unconscious agent's prompt exceeded max_model_len,
vLLM returned 400, agent had no_compact=true so it couldn't
recover, request failed. Now: the trim happens before send, so
the request rarely hits the 400 path at all; and if it somehow
does, compact+retry works for every agent.
Also adds ContextState::total_tokens() as the cheap pre-send
budget check.
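
The pre-send check is roughly (a sketch; total_tokens() and
trim_conversation() are the names above, the receiver and constant are
approximate):

    const BUDGET_FRACTION: f64 = 0.8;

    fn enforce_budget(ctx: &mut ContextState, context_window: usize) {
        let budget = (context_window as f64 * BUDGET_FRACTION) as usize;
        // We tokenize client-side, so the count is exact; no reason to let
        // vLLM reject an oversize request with a 400 first.
        if ctx.total_tokens() > budget {
            ctx.trim_conversation();
        }
    }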
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
web_fetch was returning raw HTML, which is verbose and hard for
the agent to consume. Add html2md dependency and convert HTML to
Markdown before truncation. Much cleaner output for normal pages;
no downsides.
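
For reference, the conversion is a small wrapper around html2md (a
sketch; the truncation length is illustrative):

    fn page_to_markdown(html: &str, max_chars: usize) -> String {
        let md = html2md::parse_html(html);
        // Truncate on a char boundary so multi-byte characters can't panic.
        match md.char_indices().nth(max_chars) {
            Some((idx, _)) => md[..idx].to_string(),
            None => md,
        }
    }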
Co-Authored-By: spqrz <spqrz386@gmail.com>
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
v2 retraining (readout_v2_paired) fixed the broken clusters — anger,
sexual, high_pos, and social_pos all flipped from anti-clustered to
positively clustered at deep layers. Validation showed layers 62 and
63 give the best signal; paring the server-side manifest down to just
those two keeps response size tight (~2 KB/token) while keeping the
A/B option between the two strongest layers.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Three readability fixes for the F8 screen:
* Z-score values per-layer by default (`[z]` toggles to raw dot-
product). Raw values are dominated by residual-stream magnitude —
z-scores read as "σ above concept-vector baseline" which is
interpretable and scale-stable across frames.
* Stable ordering via a TOP_K pinned set plus a HYSTERESIS band. The
pinned concept set only rotates when a member drops out of the
hysteresis band by |value| rank — bars update values in place without
names flickering row-to-row.
* Default to the deepest hooked layer (index 3 = layer 58 of 64).
Clustering validation showed layer 58 is the only one with strong
within-family cohesion (fear +0.37, shame +0.29, sadness +0.25
cosine); earlier layers are mostly noise for this task.
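
For the z-score toggle in the first item, one plausible reading of
"σ above concept-vector baseline", as a sketch (baseline_mean and
baseline_std are assumed inputs; the real baseline statistics may be
computed differently):

    // Each concept's raw dot product normalized against a per-concept,
    // per-layer baseline, so bars read as sigma above baseline and stay
    // comparable frame to frame.
    fn z_score(raw: f32, baseline_mean: f32, baseline_std: f32) -> f32 {
        (raw - baseline_mean) / baseline_std.max(1e-6)
    }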
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Subconscious agents (scoring, reflection, etc.) fork from the main
conscious agent. The amygdala screen reads the main agent's readout
buffer, so the previous "share parent's buffer" policy caused
forked-agent generations to bleed into the main emotional readout,
producing constant cycling even when DMN was resting.
Each fork now gets its own SharedReadoutBuffer. The amygdala screen
shows only the main conscious agent's emotional trajectory; per-agent
subconscious readouts can become a separate view later if wanted.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Per-token residual-stream projections from the vLLM server's readout
pipeline surfaced as a TUI bar chart. Flow:
* agent/readout.rs — SharedReadoutBuffer (manifest + ring of last ~200
token entries). Lives on Agent and is shared across forks (single
stream, one landing pad).
* agent/mod.rs — Agent::new now probes /v1/readout/manifest at startup
(non-fatal; 404 leaves manifest None, which disables the screen).
* agent/context.rs — the streaming token handler pushes every token
with attached readout onto the shared buffer.
* user/amygdala.rs — F8 screen. Top-K concepts by |value| as
horizontal bars (green positive, red negative), plus a 4-line
recent-tokens panel showing each token's top concept at the selected
layer. Keys: 1..9 select layer, t toggles current/mean-over-recent.
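
The SharedReadoutBuffer in agent/readout.rs is roughly this shape
(field names approximate, manifest type elided):

    use std::collections::VecDeque;
    use std::sync::{Arc, Mutex};

    struct TokenEntry {
        token: String,
        readout: Vec<Vec<f32>>, // [n_layers][n_concepts]
    }

    struct ReadoutInner {
        manifest: Option<serde_json::Value>, // None: F8 screen disabled
        ring: VecDeque<TokenEntry>,          // last ~200 token entries
    }

    // Shared across forks (single stream, one landing pad).
    type SharedReadoutBuffer = Arc<Mutex<ReadoutInner>>;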
The disabled state renders a hint pointing at VLLM_READOUT_MANIFEST /
VLLM_READOUT_VECTORS so users can tell "readout not enabled" apart
from "server up but no tokens yet".
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
StreamToken::Token is now a struct variant with an optional
TokenReadout (shape [n_layers][n_concepts]) per token — parsed from
the vLLM completion response's choices[i].readout field when the
server has readout enabled.
ApiClient gains a fetch_readout_manifest() method that hits
GET /v1/readout/manifest. Returns Ok(None) on 404 (server has
readout disabled), so callers can gracefully fall back when pointed
at a non-readout-enabled endpoint.
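
The 404-to-None probe, roughly (a sketch assuming reqwest; the manifest
is left as raw JSON here rather than its real type):

    async fn fetch_readout_manifest(
        http: &reqwest::Client,
        base_url: &str,
    ) -> anyhow::Result<Option<serde_json::Value>> {
        let resp = http
            .get(format!("{base_url}/v1/readout/manifest"))
            .send()
            .await?;
        if resp.status() == reqwest::StatusCode::NOT_FOUND {
            // Server has readout disabled; caller falls back gracefully.
            return Ok(None);
        }
        Ok(Some(resp.error_for_status()?.json().await?))
    }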
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
stream_completion was a thin wrapper around stream_completion_mm (just
passing an empty image list); the last caller switched to _mm directly
when learn's generate_alternate gained image support. Delete the
wrapper — callers can pass `&[]` if they have no images.
MindState::dmn_tick has been sitting unused (called only from a
commented-out block in the Mind loop). Rename to _dmn_tick so the
compiler stops warning; Kent may uncomment the call path later.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
F6 (learn) and F7 (compare) were duplicating the candidate-screen
skeleton: outer magenta-bordered block with screen legend + title,
settings row / content / help vertical split, 40/60 list/detail
horizontal split, j/k/↑/↓ nav with bounds clamping.
Factor out three helpers in user/widgets.rs:
candidate_frame(frame, area, title) -> (settings, content, help)
list_detail_split(content) -> (list, detail)
handle_list_nav(events, list_state, count, on_other)
Callers provide screen-specific content — settings line, empty state,
per-candidate list item, detail pane, help line, extra key bindings —
and the helpers absorb the common framing.
Net change is small in lines (-13 src) but removes the
copy-paste-and-tweak trap: F8/F9/whatever-next-screen now starts from
these three calls instead of a copy of learn.rs.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Side-by-side model comparison against the current conversation context.
Built on the MindTriggered pattern — F7 drops in as one more
CompareScoring flow next to MemoryScoring / FinetuneScoring.
Motivation: we have the VRAM on the B200 to load two versions of the
same family simultaneously (e.g. Qwen3.5 27B bf16 and q8_k_xl). Rather
than trust perplexity/KLD numbers on a generic corpus, we can measure
divergence on our actual conversations: for each assistant response,
ask the test model what it would have said given the same prefix, and
eyeball the diffs.
- config.compare.test_backend — names an entry in the existing
backends map to use as the test model. Empty = F7 reports "(unset)"
and does nothing.
- subconscious::compare::{score_compare_candidates, CompareCandidate,
CompareScoringStats, CompareScoring}. For each assistant response,
gen_continuation runs with the test client against the same prefix
the original response saw; pairs stream into
shared.compare_candidates as they complete.
- user::compare::CompareScreen — F7 in the screen list. c/Enter
triggers a run; list/detail layout mirroring F6, detail shows
prior context / original / test-model alternate.
No persistence yet — each F7 run regenerates. Caching via a context
manifest (so we can re-view without re-burning generation) is the
natural follow-up; for now light usage is fine.
Also reusable later for validating finetune checkpoints: same pattern,
swap the test backend for the new checkpoint, watch where it diverges
from the base.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Mind's impl had accumulated ~50 lines of setup glue per scoring flow
(memory, memory-full, finetune): snapshot config, clone handles,
resolve context, spawn task, route results back through BgEvent,
write stats. The shape was identical; only the middle changed.
Introduce the MindTriggered trait:
pub trait MindTriggered {
fn trigger(&self);
}
Each flow becomes a struct next to its scoring code that owns its
dependencies and a JoinHandle (behind a sync Mutex for interior
mutability):
subconscious::learn::MemoryScoring (Score, ScoreFull)
subconscious::learn::FinetuneScoring (ScoreFinetune)
Mind holds one of each and dispatches in one line:
MindCommand::Score => self.memory_scoring.trigger(),
MindCommand::ScoreFull => self.memory_scoring.trigger_full(),
MindCommand::ScoreFinetune => self.finetune_scoring.trigger(),
Each struct picks its own trigger semantics — memory scoring is
no-op-if-running (!handle.is_finished()); finetune is abort-restart.
Falls out:
- BgEvent / bg_tx / bg_rx disappear entirely. Tasks write directly
to their slice of MindState and call agent.state.changed.notify_one()
to wake the UI. The bg_rx arm in Mind's select loop is gone.
- agent.state.memory_scoring_in_flight was duplicating
shared.scoring_in_flight via BgEvent routing; now the JoinHandle
alone tells us, and shared.scoring_in_flight is written directly
by the task for the UI.
- start_memory_scoring / start_full_scoring / start_finetune_scoring
methods on Mind are deleted; Mind no longer knows the setup shape
of any scoring flow.
- FinetuneScoringStats moves from mind/ to subconscious/learn.rs
next to the function that produces it.
No behavior change — same flows, same trigger points, same semantics.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- context.rs gains is_assistant, render_branch_text, render_prior_context
alongside memory_key / is_memory_node. They're pure AST helpers, used
by both the finetune pipeline and the forthcoming compare screen.
- new subconscious/generate.rs holds gen_continuation(context, entry_idx,
skip, client): build the prompt from a context prefix with an arbitrary
skip predicate, send to the model, decode the completion. Takes both
the predicate and the client so callers can aim it at memory-stripped
contexts (finetune), same-context-different-model (F7 compare), or
whatever else.
- learn.rs drops its private copies of those helpers and the inline
generate_alternate; the finetune path now reads as
gen_continuation(context, idx, is_memory_node, client).
Pure refactor, no behavior change.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
wire_prompt() gains a conv_range and a skip closure, and returns the
assistant-message token ranges needed by the scoring path. The agent
path passes 0..len + |_| false and ignores the ranges. Memory-ablation
scoring and candidate generation pass a prefix range + a predicate
(e.g. is_memory_node, or |n| memory_key(n) == Some(key)).
This deletes subconscious/learn.rs's build_token_ids, its private
Filter enum, and the is_memory/memory_key duplicates — the walk over
context sections now has one home. Adding a section or changing
section order in the agent path won't silently drift away from what
scoring sees.
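
Call-shape sketch (names and signature details are approximate;
ContextState and is_memory_node are the repo's own):

    fn examples(ctx: &mut ContextState, len: usize, entry_idx: usize) {
        // Agent path: whole conversation, skip nothing, ignore the ranges.
        let (_prompt, _ranges) = ctx.wire_prompt(0..len, |_| false);
        // Scoring path: prefix only, skip memory nodes, keep the
        // assistant-message token ranges for /score.
        let (_prompt, assistant_ranges) = ctx.wire_prompt(0..entry_idx, is_memory_node);
        let _ = assistant_ranges;
    }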
call_score forwards multi_modal_data when the wire-form prompt
contains images. generate_alternate switches to stream_completion_mm
and passes the same images. Scoring on image-bearing contexts now
sends wire form (1 image_pad + image data) instead of expanded
image_pads with no image data; text-only contexts are bit-identical.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Two changes to make scoring debuggable and self-starting:
1. init() kicks off start_memory_scoring() after restore_from_log +
load_memory_scores. No user message needed to exercise the
incremental path.
2. Diagnostic logging around the on_score persist path:
- [scoring] persisted K → N.NNN (Section[i]) read_back=Some(...)
when find_memory_by_key succeeds and set_score stores the score
(with a read-back check on the leaf).
- [scoring] DROP K: find_memory_by_key None (id=N, cv=M)
when the scored key isn't findable in the live context — with
section sizes to diagnose whether content shrank.
- [scoring] snapshot size=N contains(K)=true/false
after collect_memory_scores, to catch the case where set_score
claims to have written but collect doesn't see it.
- [scoring] about to save N entries
- save_memory_scores now also logs serialize/write errors so a
silent write failure isn't invisible.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
compact() was calling reload_context() to re-fetch personality_nodes
from the store and pushing fresh AstNode::memory leaves into the
Identity section. Fresh leaves start with score: None, so every
compact — which fires after every turn (mind/mod.rs:884) — was
wiping any memory scores that had just been computed. Scoring then
often ran immediately after compact on the same path (line 886),
starting from a zero-score Identity section.
Drop the rebuild. Identity content is loaded at startup via new() +
restore_from_log(); compact doesn't need to redo that. Mid-session
edits to personality-node content are a non-goal — a restart picks
them up. Scores survive.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Commit 2989a6afaa ("config: drop dead code") removed
surface_hooks as having "zero external readers" but missed
consciousness-claude/src/hook.rs as a consumer. That crate stopped
building, so poc-hook never ran and no agent cycles (surface-observe,
reflect, journal) fired.
Restore the field with a default of the three hook events we install
(UserPromptSubmit, PostToolUse, Stop), so a fresh install works
without needing to hand-edit config.json5.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
admin load-context (and any subcommand that reaches config::app())
panicked with "config::app() called before load_app()" because the
poc-memory binary never initialized the global AppConfig. The main
consciousness binary loads it via load_session; poc-memory never did.
Load with default CliArgs before dispatch — figment still pulls from
~/.consciousness/config.json5 and env the same way. Bail on error
instead of limping: a broken config means paths like memory_root are
wrong and the tool will misbehave silently.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Split the prompt assembly into two forms: the AST keeps the
fully-expanded representation (N image_pads per image, for accurate
context budget accounting), while the request wire form collapses
each image to a single <|image_pad|> bookended by vision_start/end
and ships the raw bytes out-of-band as a base64 data URI in a new
`multi_modal_data.image` field on /v1/completions.
vLLM's Qwen3VL processor uses PromptReplacement with target=single
<|image_pad|> and replacement=N image_pads, so the wire-form matches
what the processor expects and it re-expands to N server-side.
Server side needs /v1/completions to accept multi_modal_data for
this to land images end-to-end — that's the next piece.
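
The request body gains roughly this shape (a sketch; field names beyond
multi_modal_data.image are approximate, and one data URI per image is
an assumption about cardinality):

    #[derive(serde::Serialize)]
    struct MultiModalData {
        // base64 data URIs, shipped out-of-band of the prompt text
        image: Vec<String>,
    }

    #[derive(serde::Serialize)]
    struct CompletionRequest<'a> {
        model: &'a str,
        // Wire form: one <|image_pad|> per image, bookended by
        // vision_start/vision_end; vLLM re-expands it to N server-side.
        prompt: &'a str,
        #[serde(skip_serializing_if = "Option::is_none")]
        multi_modal_data: Option<MultiModalData>,
        // ...sampling params elided
    }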
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
view_image now reads the file, grabs dimensions via imagesize (no full
decode), and pushes a user-role branch containing a NodeBody::Image
leaf straight into the conversation. The tool_result is just a short
acknowledgment — the actual pixels ride in the Image leaf for the API
layer to extract into multi_modal_data.
Drops the capture_tmux_pane path, which had no business living under
"vision" (tmux text capture belongs in bash or a dedicated tool, and
this one just returned rendered text anyway).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Images are rendered as `<|vision_start|>` + N × `<|image_pad|>` +
`<|vision_end|>` where N is computed from the image dimensions using
Qwen3-VL's smart_resize rules (patch_size=16, merge_size=2, min=64K,
max=16M pixels). The token count matches what vLLM will produce at
request time, so budget accounting stays accurate.
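
The pad count comes out of a smart_resize-style computation, roughly
(a sketch; the rounding approximates Qwen3-VL's published rules rather
than reproducing them exactly):

    fn image_pad_count(width: u32, height: u32) -> u32 {
        const PATCH: f64 = 16.0;            // patch_size
        const MERGE: f64 = 2.0;             // merge_size
        const FACTOR: f64 = PATCH * MERGE;  // dims round to multiples of 32
        const MIN_PIXELS: f64 = 64.0 * 1024.0;          // 64K
        const MAX_PIXELS: f64 = 16.0 * 1024.0 * 1024.0; // 16M

        let (w, h) = (width as f64, height as f64);
        // Scale into the pixel budget, preserving aspect ratio.
        let area = w * h;
        let scale = if area > MAX_PIXELS {
            (MAX_PIXELS / area).sqrt()
        } else if area < MIN_PIXELS {
            (MIN_PIXELS / area).sqrt()
        } else {
            1.0
        };
        let w = (w * scale / FACTOR).round().max(1.0) * FACTOR;
        let h = (h * scale / FACTOR).round().max(1.0) * FACTOR;
        // N image_pads = patch grid after the 2x2 merge.
        ((w / PATCH) * (h / PATCH) / (MERGE * MERGE)) as u32
    }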
Bytes are stored inline on the leaf and base64-encoded in the JSON
form. Token IDs are hand-assembled instead of re-running the tokenizer
on a potentially-huge placeholder string.
Follow-ups: view_image tool rewrite, multi_modal_data on the vLLM
request, API-layer plumbing from leaf bytes to request body.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
These are identity settings, not memory-graph settings. Sat inside the
`memory` section only because that's where Config started life. Move
to AppConfig alongside the other top-level stuff.
Readers now pull from `config::app()` instead of `config::get()`.
subconscious/defs.rs's conversation-building pass still needs Config
for surface_conversation_bytes, so both guards coexist there —
AppConfig's guard is dropped before the per-step await loop so we
don't stall the config-watcher's writer.
show_config picks up the two new fields at the top of its output.
Kent's config already has them hoisted to the top level.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Both config halves (Config for the memory section, AppConfig globally)
are now reloaded whenever ~/.consciousness/config.json5 changes on
disk. So edits from vim, manual tweaks, or F6's own config_writer
calls all land without a restart. No more "reload the daemon to pick
up a config change."
Wires up the previously-unused Config::reload() (Kent flagged it as
"not dead, just not wired"). Pairs it with an AppConfig reload via
install_app(). Both run on the same file-change event.
Implementation:
- notify-debouncer-mini watches the config file's parent directory
(editors usually replace-via-rename, so watching the file itself
misses the new inode). Debounced at 200ms to coalesce the flurry
of events editors produce around a single save.
- Filter for events whose path is the actual config file.
- On match: call reload() for Config, run build_figment + extract for
AppConfig. If AppConfig parsing fails (editor mid-save with partial
content), log and keep the old cached value.
- Watcher runs in its own named thread, fire-and-forget. If startup
fails we just log and move on — worst case is no live reload, not
a crash.
CliArgs + SubCmd both get Clone derives so the watcher can own a
snapshot of the startup args for future reloads. Watcher is kicked
off in user/mod.rs:start() right after load_session.
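
Thread shape, roughly (a sketch; notify-debouncer-mini's handler types
and new_debouncer signature vary a little by version, and reload_config
stands in for the Config reload + install_app pair):

    use std::{path::PathBuf, time::Duration};
    use notify::{RecursiveMode, Watcher};
    use notify_debouncer_mini::{new_debouncer, DebounceEventResult};

    fn spawn_config_watcher(config_path: PathBuf, reload_config: impl Fn() + Send + 'static) {
        std::thread::Builder::new()
            .name("config-watch".into())
            .spawn(move || {
                let dir = match config_path.parent() {
                    Some(d) => d.to_owned(),
                    None => return,
                };
                let target = config_path.clone();
                let handler = move |res: DebounceEventResult| {
                    if let Ok(events) = res {
                        // Watch the parent dir (editors replace-via-rename),
                        // but only react to the config file itself.
                        if events.iter().any(|e| e.path == target) {
                            reload_config();
                        }
                    }
                };
                // 200ms debounce coalesces an editor's flurry of save events.
                let mut debouncer = match new_debouncer(Duration::from_millis(200), handler) {
                    Ok(d) => d,
                    Err(e) => {
                        eprintln!("config watcher not started: {e}");
                        return; // no live reload, not a crash
                    }
                };
                if let Err(e) = debouncer.watcher().watch(&dir, RecursiveMode::NonRecursive) {
                    eprintln!("config watcher not started: {e}");
                    return;
                }
                loop {
                    std::thread::park(); // keep the debouncer alive
                }
            })
            .expect("spawn config watcher thread");
    }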
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The graph-health logic in consolidation_plan_inner computed
reasonable agent counts based on graph metrics (α, Gini, hub
dominance), then immediately overwrote them with an Elo-weighted
flat-budget distribution, or — if no agent-elo.json existed —
with a simple budget/N per type.
Nothing in the codebase writes agent-elo.json; it's external state
that never gets maintained. So the effective behavior was always the
"No Elo ratings — equal distribution" branch, which just bucketed
agent_budget evenly across active agent types and discarded
everything the graph analysis had just decided.
Keep the graph-health allocation (α → linker count, Gini → distill
bump, organize/distill/split proportional). Drop:
- The entire Elo / agent_budget block at the end of
consolidation_plan_inner
- Config.agent_budget field and its default (1000)
- agent_budget: 40 from Kent's config.json5
- The local agent_types binding inside the function — it was only
used by the now-deleted block. Config.agent_types stays; it has
other consumers.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Two parallel backend-resolution paths had drifted apart:
- Main chat: AppConfig::resolve_model() → a named BackendConfig in
AppConfig.backends
- Subconscious / oneshot / context_window(): four skip-serde
"cache" fields on Config (memory section) — api_base_url, api_key,
api_model, api_context_window — that used to be populated at
Config::try_load_shared time by walking memory.agent_model →
root.models[name] → root[backend_name]
When we renamed `models` to `backends` and collapsed ModelConfig into
BackendConfig, the latter chain started silently dereferencing
`root.get("models")` → None → no population. Subconscious agents fell
through the "API not configured" guard; context_window() started
returning 0 (api_context_window is a u64 that now defaults to 0 since
nothing populates it). Only the main chat path was visibly working.
Collapse to one path:
- Drop Config.agent_model (duplicate of AppConfig.default_backend)
- Drop Config.{api_base_url, api_key, api_model, api_context_window}
— no longer populated, no longer needed
- Drop default_context_window() — nobody reads the field anymore
- Drop the memory-side resolution block in try_load_shared()
- Subconscious (mind/unconscious.rs) and oneshot (agent/oneshot.rs)
now call load_app() + resolve_model(&app.default_backend) just like
the main chat does
- context_window() reads from config::app().backends[default_backend]
.context_window, defaulting to 128k only if the backend doesn't
specify one
Side effect: Kent's config file drops agent_model, api_reasoning,
journal_days, journal_max — all fields whose Rust counterparts are
now gone. (Figment tolerates unknown fields, so leaving them wouldn't
have broken anything, but they were lying about what's configurable.)
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Four Config fields had no external readers, left over from earlier
features that got refactored away:
- journal_days, journal_max — journal rotation knobs that nothing
actually consults
- prompts_dir — the old per-prompt-file directory, obsolete since
prompt_file metadata itself went away in a prior cleanup
- api_reasoning — a reasoning-mode string that used to flow into the
API request, superseded by per-agent reasoning_effort on AgentState
All four were only ever assigned to and never read. Drop them from the
struct, Default impl, and (as appropriate) deserialization defaults.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
AppConfig had one BackendConfig for credentials and a separate
HashMap<String, ModelConfig> for named model entries. In practice each
named model was always paired with exactly one backend's credentials
— the split bought nothing except an extra struct and the awkward
two-lookup shape in resolve_model (find model → get backend creds →
combine).
Merge them: BackendConfig now carries api_key, base_url, model_id,
and context_window. AppConfig has a single
HashMap<String, BackendConfig> backends map and a default_backend
name. resolve_model is one lookup.
ModelConfig struct deleted. default_model renamed to default_backend.
Config shape changes from
backend: { api_key, base_url }
models: { "27b": { model_id, context_window } }
default_model: "27b"
to
backends: { "27b": { api_key, base_url, model_id, context_window } }
default_backend: "27b"
Updated ~/.consciousness/config.json5 to match.
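
On the Rust side that is, roughly (field optionality, serde details,
and the resolve_model return type are assumptions):

    use std::collections::HashMap;

    #[derive(Clone, serde::Deserialize)]
    struct BackendConfig {
        api_key: String,
        base_url: String,
        model_id: String,
        context_window: Option<u64>,
    }

    #[derive(Clone, serde::Deserialize)]
    struct AppConfig {
        backends: HashMap<String, BackendConfig>,
        default_backend: String,
        // ...
    }

    impl AppConfig {
        // One lookup, no model-to-backend join.
        fn resolve_model(&self, name: &str) -> Option<&BackendConfig> {
            self.backends.get(name)
        }
    }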
One small side effect: dropped the --api-key / --api-base figment
merge-opts for "backend.*" targets — those would need to know which
backend to target now and there's no sensible default. The CLI flags
still function as post-resolution overrides on the eventual
SessionConfig.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Config had accumulated several obsolete fields, a legacy load path
that was just returning defaults, and multi-backend infrastructure
that's no longer used.
Removed from Config (memory section):
- load_legacy_jsonl() — just returned Config::default(), no callers
- The legacy-fallback branch in load_from_file
- surface_hooks, surface_timeout_secs — zero external readers
- scoring_chunk_tokens + default fn — zero external readers
- The POC_MEMORY_CONFIG env override note in the header comment
(not actually wired up anywhere)
Collapsed multi-backend to single-backend:
- AppConfig used to carry `anthropic: BackendConfig` and
`openrouter: BackendConfig` as required fields plus an optional
`deepinfra`, picked between at runtime by name. Only one is ever
actually used in any deployment. Collapse to a single
`backend: BackendConfig` on AppConfig, drop the multi-backend
match logic in resolve_model, drop the top-level `backend: String`
selector field, drop the `BackendConfig::resolve` fallback path.
- Also drop BackendConfig.model (redundant with ModelConfig.model_id
once multi-backend is gone).
- ModelConfig.backend field goes — there's only one backend now, no
choice to make.
Dead prompt_file machinery:
- ModelConfig.prompt_file, ResolvedModel.prompt_file, SessionConfig
.prompt_file, Agent.prompt_file — nothing in the codebase actually
reads the file these strings name. Just passed around and compared.
Delete the whole string through every struct.
- The "if prompt_file changed on model switch, recompact" branch in
user/chat.rs goes too (never fired usefully).
Dead memory_project plumbing:
- AppConfig.memory_project field, CliArgs.memory_project, the
--memory-project CLI flag, the figment merge target, the show_config
display line. Nothing reads it anywhere.
Dead ContextInfo struct:
- `struct ContextInfo` was never constructed — context_info: None
was the only initializer. The conditional display blocks in
user/context.rs that dereferenced it were dead.
Behavior change: AppConfig::resolve() now requires a non-empty
`models` map and bails with a helpful message if it's missing. The
old fallback ("no models? use top-level backend + PromptConfig to
build a default") path is gone — it was only kept for symmetry with
a mode nobody used.
Config file shape: `deepinfra: {...}` → `backend: {...}`, and
model entries no longer need `backend:` or `prompt_file:`. Updated
~/.consciousness/config.json5 to match.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
bail-no-competing.sh used to bail if any other live agent existed in
the state dir, period. That was too coarse: surface-observe agents run
a multi-step pipeline (surface → organize-search → organize-new →
observe), and the intent is to let a new surface-phase agent start
while an older one finishes its post-surface tail. With the old check
the newer agent always bailed, so surface-observe was effectively
serialized at the slowest cycle time.
Make the script phase-aware:
- oneshot.rs now passes the current phase as argv[2] alongside the pid
file name. The script writes that phase into its own pid file on
every step transition, so concurrent agents can read each other's
phase just by cat'ing the pid files.
- Bail only when another live agent is in the same phase-group as us.
Groups: "surface" vs. "everything else" (post-surface). At most one
agent per group alive at a time — surface runs at a higher cadence
than the organize/observe tail.
- Still clean up stale pid files for dead processes.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Two fixes to the F6 candidate display:
1. Turns where the assistant produced nothing human-visible (an
interrupted generation, a turn consisting of only a tool call the
renderer folds to the tool name) were landing as candidates with
an empty response_text. They'd render as blank cards and, worse,
we'd still burn a full alternate generation on each one. Filter
them out before they reach the candidate list.
2. The detail pane showed only the scored response + alternate, with
no hint of what the user had actually asked. Pre-compute the last
two user/assistant exchanges on each candidate as a rendered
prior_context string ([user]/[assistant] markers) and show them
above the response, under a new "context & response" section
heading.
render_branch_text and render_prior_context extracted as helpers —
the response-text rendering and prior-context rendering share the
same "flatten Branch children to text" pass.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Previously when append_kvp created a new section or added a key, it
stuffed the "\n " separator into the new kvp's wsc.0 (the whitespace
between its own key and colon) instead of the prior kvp's wsc.3 (the
whitespace after the prior trailing comma). Result looked like:
lsp_servers: [...],
learn
: {generate_alternates
: true,},}
The writer also didn't set any interior whitespace on the new section's
JSONObjectContext, so everything crammed onto one line — `{key: val,}`
compact, not `{\n key: val,\n}` multi-line.
Rewrote the appender as append_kvp_pretty(object, key, value,
inner_indent, outer_indent):
- separator between kvps goes in the prior kvp's wsc.3, or if we're the
first kvp in a fresh object, in the object's own wsc.0 (after its
opening `{`)
- new kvp's wsc.3 carries `,\n<outer_indent>` so the parent's closing
`}` lands correctly indented
- interior indent vs outer indent are both explicit, so we don't have
to rewrite this logic every time we add another nesting level
New tests: new_section_exact_multiline_layout asserts byte-exact
output shape; new_section_and_key_format_cleanly verifies no key wraps
to the next line. Prior tests just substring-matched and happily passed
on the broken output — that's why this shipped in the first place.
Also: dropped the json5 crate dependency. json-five's serde feature
(default) provides the same from_str / to_string API. One fewer
dependency, and the two were doing the same job.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Runtime-mutable settings (F6's threshold knob, the generate-alternates
toggle, anything else that comes along) were ending up as mirrored
fields on MindState — each new config setting grew MindState::new's
signature and added a clone+sync path. Wrong home. MindState is
ephemeral session state, not a config projection.
Give AppConfig the same treatment the memory Config has: install it
into a global RwLock<AppConfig> at startup via load_app, read through
config::app() (returns a read guard), mutate through update_app. The
config_writer functions now write to disk AND update the cache
atomically, so the one-stop-shop call keeps both in sync.
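
The global looks roughly like this (a sketch; the real module will
differ in details, AppConfig is the repo's type):

    use std::sync::{OnceLock, RwLock, RwLockReadGuard};

    static APP: OnceLock<RwLock<AppConfig>> = OnceLock::new();

    pub fn load_app(cfg: AppConfig) {
        // Install at startup; overwrite in place if already installed.
        match APP.get() {
            Some(lock) => *lock.write().unwrap() = cfg,
            None => {
                let _ = APP.set(RwLock::new(cfg));
            }
        }
    }

    // Read guard, released at end of scope; reads are short.
    pub fn app() -> RwLockReadGuard<'static, AppConfig> {
        APP.get()
            .expect("config::app() called before load_app()")
            .read()
            .unwrap()
    }

    pub fn update_app(f: impl FnOnce(&mut AppConfig)) {
        let lock = APP.get().expect("config::app() called before load_app()");
        let mut guard = lock.write().unwrap();
        f(&mut guard);
    }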
Also while in here:
- learn.generate_alternates moves from a sentinel file
(~/.consciousness/cache/finetune-alternates, "exists = enabled")
into the config under the learn section. On first run with this
build, if the sentinel file still exists Mind::new flips the
config value to true and removes it. Drops
alternates_enabled()/set_alternates().
- Default threshold 0.0000001 → 1.0. With the timestamp filter
removed the previous value was letting essentially everything
through; 1.0 is a sane "nothing gets through unless you actually
want it" default.
- score_finetune_candidates takes generate_alternates as a parameter
instead of reading a global — caller snapshots the config values
once at the top of start_finetune_scoring so the async task
doesn't need to hold the config read lock across awaits.
- MindState.learn_threshold / learn_generate_alternates gone; the
SetLearn* command handlers now just delegate to config_writer.
Kent noted RwLock<Arc<AppConfig>> (the pattern used by the memory
Config global) is pointless here — nobody needs a snapshot-after-
release, reads are short — so this uses a plain RwLock<AppConfig>
and returns a read guard.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
With the timestamp filter gone (previous commit), score_finetune_candidates
started returning the actual ~100+ candidates per scoring run. The
existing code generated alternates for all of them in a tight loop
before returning anything, leaving the status line stuck on
"finetune: scoring N responses..." for hundreds of seconds while the
B200 was pegged.
Two fixes:
1. score_finetune_candidates now takes an ActivityGuard and a callback.
Candidates are emitted one-at-a-time as they complete (after their
alternate if that's enabled, immediately otherwise). The activity
status updates to "finetune: generating alternate N/M" during the
alternate-gen phase so it's clear what's happening.
2. BgEvent::FinetuneCandidates(Vec<_>) → FinetuneCandidate(one). Each
emitted candidate is pushed onto shared.finetune_candidates; the UI
tick picks it up and renders it on the next frame. start_finetune_scoring
clears the previous run's list at the top so each run is fresh.
Return type changes from (Vec, f64) → (usize, f64) — the count above
threshold is all the caller still needs since the candidates stream
through the callback.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The F6 title line was starting to read like a control panel —
`legend ───── learn [thresh: 1e-7] [gen]` — which crowded the legend
and the label, and didn't leave room for more settings as the screen
grew. Move threshold and gen status to their own line inside the
border, right above the content area. Drop the duplicated `=gen[on]`
marker from the bottom help line since the settings row already shows
gen state.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Previously NodeLeaf.timestamp and AstNode::Branch.timestamp accepted
null or missing via a deserialize_timestamp_or_epoch fallback — legacy
entries in conversation.jsonl from before Branch timestamps existed
(and from before chrono serialization was wired up) would load with
UNIX_EPOCH as a sentinel. Downstream, node_timestamp_ns() returned
Option<i64> and callers had to handle None as "old entry, skip."
That second filter was silently dropping every candidate in
score_finetune_candidates when scoring an older session — the F6
screen showed "0 above threshold" even when max_divergence was
orders of magnitude above the threshold, because every entry was
failing the None check, not the divergence check.
The fix, in three parts:
1. src/bin/fix-timestamps.rs — one-off migration tool that walks a
conversation.jsonl, linearly interpolates timestamps for entries
stuck at UNIX_EPOCH (using surrounding real timestamps as anchors),
propagates to child leaves with per-sibling ns offsets, and bumps
any collisions by 1 ns for uniqueness. Ran against the current
session's log: 11887 entries, 72289 ns bumps, all unique.
2. context.rs — drop default_timestamp and
deserialize_timestamp_or_epoch. NodeLeaf and Branch now require a
present non-null timestamp on deserialize. Tests flip from
"missing/null → UNIX_EPOCH" to "missing/null → Err."
3. subconscious/learn.rs — node_timestamp_ns now returns i64, not
Option<i64>. The matching caller in score_finetune_candidates
collapses from a Some/None match to a single trained-set check.
mind/log.rs's oldest_timestamp no longer filters UNIX_EPOCH.
Every line currently on disk has already been migrated. Going
forward, new AstNodes always carry real timestamps (Utc::now() at
construction time), so the strict schema is the invariant, not an
aspiration.
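
The interpolation step in fix-timestamps is roughly this, much
simplified (the real tool also propagates to child leaves with
per-sibling ns offsets; 0 stands in for UNIX_EPOCH here):

    // Runs of epoch (0) entries get values spread evenly between the
    // surrounding real timestamps; collisions are then bumped by 1 ns.
    fn interpolate_epoch_runs(ts_ns: &mut [i64]) {
        let mut i = 0;
        while i < ts_ns.len() {
            if ts_ns[i] != 0 {
                i += 1;
                continue;
            }
            let start = i;
            while i < ts_ns.len() && ts_ns[i] == 0 {
                i += 1;
            }
            // Anchors: previous and next real timestamps (degenerate at the ends).
            let lo = if start > 0 { ts_ns[start - 1] } else { ts_ns.get(i).copied().unwrap_or(0) };
            let hi = ts_ns.get(i).copied().unwrap_or(lo);
            let n = (i - start) as i64 + 1;
            for (k, slot) in ts_ns[start..i].iter_mut().enumerate() {
                *slot = lo + (hi - lo) * (k as i64 + 1) / n;
            }
        }
        for j in 1..ts_ns.len() {
            if ts_ns[j] <= ts_ns[j - 1] {
                ts_ns[j] = ts_ns[j - 1] + 1; // uniqueness bump
            }
        }
    }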
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
vllm's /v1/score endpoint made score_ranges a required field (the
messages-mode fallback that used to pattern-scan for assistant
boundaries is gone). Always send the field, and if we have nothing to
score, skip the HTTP round-trip entirely instead of letting the server
422 us.
Response parsing is unchanged — serde ignores the renamed range_index
field and the dropped role field since we only extract total_logprob.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Three changes that together reshape the F6 fine-tune-review screen:
1. Finetune scoring reports through the standard agent activity system
instead of a separate finetune_progress String. The previous design
ran an independent progress field that forced a cross-lock dance and
bespoke UI plumbing. start_finetune_scoring now uses start_activity
+ activity.update, so the usual status line and notifications
capture scoring progress uniformly with other background work.
2. MindState gains a FinetuneScoringStats snapshot (responses seen,
above threshold, max divergence, error). The F6 empty screen shows
this instead of a loading message — so after a scoring run that
produced zero candidates, you can see *why* (e.g., max_divergence
below threshold).
3. The divergence threshold is configurable from F6 via +/- hotkeys
(scales by 10×) and persisted to ~/.consciousness/config.json5 via
config_writer::set_learn_threshold. AppConfig grows a learn section
with a threshold field (default 1e-7).
Also: user/mod.rs no longer uses try_lock() for the per-tick
unconscious/mind state sync — we fixed the locking hot paths that
made try_lock necessary, so lock().await is now the right choice.
And subconscious::learn::score_finetune_candidates now returns
(candidates, max_divergence) so the stats can be populated.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Surgical edits to ~/.consciousness/config.json5 that preserve comments,
whitespace, trailing commas, and unquoted identifier keys on round-trip.
Uses json-five's rt::parser module — a real JSON5 parser with AST
mutation + faithful serialization back. set_scalar(section, key, literal)
locates or creates the target, replaces the value; set_learn_threshold
is a convenience for the common F-screen use case.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Two related changes to the learn subsystem:
1. AST node timestamps are now non-optional — both Leaf and Branch
variants carry a DateTime<Utc>. UNIX_EPOCH means "unset" (old entries
deserialized from on-disk conversation logs).
Training uses timestamps as unique keys for dedup, so we promote to
nanosecond precision: node_timestamp_ns(), TrainData.timestamp_ns,
FinetuneCandidate.timestamp_ns, mark_trained(ns).
2. build_token_ids() now also returns token-position ranges of assistant
messages. These are passed to vLLM's /score endpoint via the new
score_ranges field so only scored-position logprobs are returned —
cuts bandwidth/compute when scoring small windows.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
When 's' is pressed on the learn screen, approved candidates are now
sent to the inference server's /train endpoint.
Samples are marked as sent immediately in the UI, and mark_trained()
is called after successful API response to prevent re-scoring.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Wire up divergence scoring to identify responses that depend heavily on
memories the model hasn't internalized. These are candidates for fine-tuning.
- Score finetune candidates automatically after each turn
- Track trained responses by timestamp to prevent overtraining
- F6 screen shows candidates with divergence scores
- j/k nav, a=approve, r=reject, g=toggle alternate gen, s=send
- Additive sync preserves approval status across ticks
- Keeps 10 most recent rejected, removes sent
The 's' key currently just marks as trained locally — actual /finetune
endpoint call to follow.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The function was reading from dream-log.jsonl which only updates
when dreams complete. If a dream session was started but not yet
ended, it would show stale hours. Now checks for active dream
state first.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The hours_since_last_dream() function existed but wasn't called
after refactoring moved the DMN prompts from hooks to Rust.
Now shows "You haven't dreamed in X hours" when >= 18h since
last dream session.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Move score display from name (via label()) to status column for cleaner
layout. Score now appears right of tokens for all memory nodes.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Identity memory nodes now participate in importance scoring alongside
conversation memories. Score loading/saving handles both sections, and
the conscious screen uses node.label() consistently for memory display.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- KEY_TO_UUID now stores weight (30 bytes: uuid+type+ts+deleted+weight)
- UUID_OFFSETS changed to composite key for O(log n) max-offset lookup
- Add NODES_BY_TYPE index for efficient type+date range queries
- Add for_each_key_weight() to StoreView for index-only iteration
- match_seeds uses index-only path when content not needed
- Fix transaction consistency in ops (single txn for related updates)
- rebuild() now records all uuid→offset mappings for version history
- Backwards compatible: old index formats decoded with default weight
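
The value decode, roughly (a sketch; field widths are inferred from the
30-byte total above, and endianness plus the default weight of 1.0 are
assumptions):

    // uuid 16 + type 1 + timestamp 8 + deleted 1 + weight 4 = 30 bytes
    const OLD_VALUE_LEN: usize = 26; // pre-weight format
    const NEW_VALUE_LEN: usize = 30;

    fn decode_key_to_uuid(v: &[u8]) -> Option<([u8; 16], u8, u64, bool, f32)> {
        if v.len() < OLD_VALUE_LEN {
            return None;
        }
        let uuid: [u8; 16] = v[..16].try_into().ok()?;
        let node_type = v[16];
        let timestamp = u64::from_le_bytes(v[17..25].try_into().ok()?);
        let deleted = v[25] != 0;
        // Backwards compatible: old-format values decode with a default weight.
        let weight = if v.len() >= NEW_VALUE_LEN {
            f32::from_le_bytes(v[26..30].try_into().ok()?)
        } else {
            1.0
        };
        Some((uuid, node_type, timestamp, deleted, weight))
    }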
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>