Runtime-mutable settings (F6's threshold knob, the generate-alternates
toggle, anything else that comes along) were ending up as mirrored
fields on MindState — each new config setting grew MindState::new's
signature and added a clone+sync path. Wrong home. MindState is
ephemeral session state, not a config projection.
Give AppConfig the same treatment the memory Config has: install it
into a global RwLock<AppConfig> at startup via load_app, read through
config::app() (returns a read guard), mutate through update_app. The
config_writer functions now write to disk AND update the cache
atomically, so the one-stop-shop call keeps both in sync.
Also while in here:
- learn.generate_alternates moves from a sentinel file
(~/.consciousness/cache/finetune-alternates, "exists = enabled")
into the config under the learn section. On first run with this
build, if the sentinel file still exists Mind::new flips the
config value to true and removes it. Drops
alternates_enabled()/set_alternates().
- Default threshold 0.0000001 → 1.0. With the timestamp filter
removed the previous value was letting essentially everything
through; 1.0 is a sane "nothing gets through unless you actually
want it" default.
- score_finetune_candidates takes generate_alternates as a parameter
instead of reading a global — caller snapshots the config values
once at the top of start_finetune_scoring so the async task
doesn't need to hold the config read lock across awaits.
- MindState.learn_threshold / learn_generate_alternates gone; the
SetLearn* command handlers now just delegate to config_writer.
Kent noted RwLock<Arc<AppConfig>> (the pattern used by the memory
Config global) is pointless here — nobody needs a snapshot-after-
release, reads are short — so this uses a plain RwLock<AppConfig>
and returns a read guard.
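
A minimal sketch of the global-config pattern described above, using only
std. The struct fields, the snapshot_learn helper, and the exact signatures
of load_app/app/update_app are assumptions for illustration, not the real
definitions; disk persistence (the config_writer side) is left out.

```rust
use std::sync::{OnceLock, RwLock, RwLockReadGuard};

// Hypothetical shapes; the real AppConfig has more sections.
#[derive(Debug, Clone, PartialEq)]
pub struct LearnConfig {
    pub threshold: f64,
    pub generate_alternates: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct AppConfig {
    pub learn: LearnConfig,
}

// Plain RwLock<AppConfig>, no inner Arc: reads are short and nobody
// needs a snapshot that outlives the guard.
static APP: OnceLock<RwLock<AppConfig>> = OnceLock::new();

/// Install the config at startup (a later call replaces the cached value).
pub fn load_app(cfg: AppConfig) {
    let lock = APP.get_or_init(|| RwLock::new(cfg.clone()));
    *lock.write().unwrap() = cfg;
}

/// Read access: returns a read guard, so keep the borrow short.
pub fn app() -> RwLockReadGuard<'static, AppConfig> {
    APP.get().expect("load_app not called").read().unwrap()
}

/// Mutate the cached config in place under the write lock.
pub fn update_app(f: impl FnOnce(&mut AppConfig)) {
    let mut guard = APP.get().expect("load_app not called").write().unwrap();
    f(&mut *guard);
}

/// Copy out the values an async task needs, so the read guard is
/// dropped before any await point (hypothetical helper).
pub fn snapshot_learn() -> (f64, bool) {
    let cfg = app();
    (cfg.learn.threshold, cfg.learn.generate_alternates)
}
```

A caller such as start_finetune_scoring would take the snapshot once at the
top and pass the plain values into the spawned task.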
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
With the timestamp filter gone (previous commit), score_finetune_candidates
started returning the real candidate set, 100+ per scoring run. The
existing code generated alternates for all of them in a tight loop
before returning anything, leaving the status line stuck on
"finetune: scoring N responses..." for hundreds of seconds while the
B200 was pegged.
Two fixes:
1. score_finetune_candidates now takes an ActivityGuard and a callback.
Candidates are emitted one-at-a-time as they complete (after their
alternate if that's enabled, immediately otherwise). The activity
status updates to "finetune: generating alternate N/M" during the
alternate-gen phase so it's clear what's happening.
2. BgEvent::FinetuneCandidates(Vec<_>) → FinetuneCandidate(one). Each
emitted candidate is pushed onto shared.finetune_candidates; the UI
tick picks it up and renders it on the next frame. start_finetune_scoring
clears the previous run's list at the top so each run is fresh.
Return type changes from (Vec, f64) → (usize, f64) — the count above
threshold is all the caller still needs since the candidates stream
through the callback.
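
The shape of the streaming change, as a synchronous sketch. The real
function is async and also takes an ActivityGuard; the Candidate struct,
the score_streaming name, and the "strictly above threshold" comparison
are assumptions made for illustration.

```rust
// Hypothetical candidate record; the real FinetuneCandidate carries more.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub index: usize,
    pub divergence: f64,
}

/// Emit each candidate through `on_candidate` as soon as it is known,
/// and return only (count above threshold, max divergence seen).
pub fn score_streaming(
    divergences: &[f64],
    threshold: f64,
    mut on_candidate: impl FnMut(Candidate),
) -> (usize, f64) {
    let mut above = 0;
    let mut max_div = 0.0_f64;
    for (index, &divergence) in divergences.iter().enumerate() {
        max_div = max_div.max(divergence);
        if divergence > threshold {
            above += 1;
            // The caller sees this one immediately, before the loop finishes.
            on_candidate(Candidate { index, divergence });
        }
    }
    (above, max_div)
}
```

Because the candidates leave through the callback, the return value no
longer needs to carry a Vec.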
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Previously NodeLeaf.timestamp and AstNode::Branch.timestamp accepted
null or missing via a deserialize_timestamp_or_epoch fallback — legacy
entries in conversation.jsonl from before Branch timestamps existed
(and from before chrono serialization was wired up) would load with
UNIX_EPOCH as a sentinel. Downstream, node_timestamp_ns() returned
Option<i64> and callers had to handle None as "old entry, skip."
That None check was silently dropping every candidate in
score_finetune_candidates when scoring an older session. The F6
screen showed "0 above threshold" even when max_divergence was
orders of magnitude above the threshold, because every entry was
failing the None check, not the divergence check.
The fix, in three parts:
1. src/bin/fix-timestamps.rs — one-off migration tool that walks a
conversation.jsonl, linearly interpolates timestamps for entries
stuck at UNIX_EPOCH (using surrounding real timestamps as anchors),
propagates to child leaves with per-sibling ns offsets, and bumps
any collisions by 1 ns for uniqueness. Ran against the current
session's log: 11887 entries, 72289 ns bumps, all unique.
2. context.rs — drop default_timestamp and
deserialize_timestamp_or_epoch. NodeLeaf and Branch now require a
present non-null timestamp on deserialize. Tests flip from
"missing/null → UNIX_EPOCH" to "missing/null → Err."
3. subconscious/learn.rs — node_timestamp_ns now returns i64, not
Option<i64>. The matching caller in score_finetune_candidates
collapses from a Some/None match to a single trained-set check.
mind/log.rs's oldest_timestamp no longer filters UNIX_EPOCH.
Every line currently on disk has already been migrated. Going
forward, new AstNodes always carry real timestamps (Utc::now() at
construction time), so the strict schema is the invariant, not an
aspiration.
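
A rough sketch of the interpolate-then-bump idea behind the migration tool,
on a flat list of nanosecond timestamps where 0 stands for UNIX_EPOCH. The
fix_timestamps name, the flat-slice representation, and the handling of
leading/trailing gaps (copy the nearest anchor) are assumptions; the
propagation to child leaves with per-sibling offsets is not shown.

```rust
use std::collections::HashSet;

/// Fill in entries stuck at 0 by linear interpolation between the
/// surrounding real timestamps, then bump duplicates by 1 ns until all
/// values are unique. Returns the number of 1 ns bumps applied.
pub fn fix_timestamps(ts: &mut [i64]) -> usize {
    // Indices that already carry a real timestamp.
    let anchors: Vec<usize> = (0..ts.len()).filter(|&i| ts[i] != 0).collect();
    if anchors.is_empty() {
        return 0; // nothing to interpolate from
    }

    // Assumption: gaps before the first / after the last anchor copy it.
    let (first, last) = (anchors[0], anchors[anchors.len() - 1]);
    for k in 0..first {
        ts[k] = ts[first];
    }
    for k in last + 1..ts.len() {
        ts[k] = ts[last];
    }

    // Interior gaps: linear interpolation between neighbouring anchors.
    for w in anchors.windows(2) {
        let (i, j) = (w[0], w[1]);
        let (ta, tb) = (ts[i], ts[j]);
        for k in i + 1..j {
            let frac = (tb - ta) as i128 * (k - i) as i128 / (j - i) as i128;
            ts[k] = ta + frac as i64;
        }
    }

    // Uniqueness pass: bump collisions by 1 ns at a time.
    let mut seen = HashSet::new();
    let mut bumps = 0;
    for t in ts.iter_mut() {
        while !seen.insert(*t) {
            *t += 1;
            bumps += 1;
        }
    }
    bumps
}
```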
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
vLLM's /v1/score endpoint made score_ranges a required field (the
messages-mode fallback that used to pattern-scan for assistant
boundaries is gone). Always send the field, and if we have nothing to
score, skip the HTTP round-trip entirely instead of letting the server
422 us.
Response parsing is unchanged — serde ignores the renamed range_index
field and the dropped role field since we only extract total_logprob.
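
The "always send the field, skip the request when empty" rule could look
roughly like the following. This is a std-only sketch: the
build_score_body name and the [[start,end],...] JSON shape for
score_ranges are assumptions, and the real code presumably serializes
with serde rather than format!.

```rust
/// Build the request body for the score endpoint, or None when there is
/// nothing to score, in which case the caller skips the HTTP round-trip.
pub fn build_score_body(prompt: &[u32], score_ranges: &[(usize, usize)]) -> Option<String> {
    if score_ranges.is_empty() {
        return None; // an empty request would only earn a 422
    }
    let ids: Vec<String> = prompt.iter().map(|t| t.to_string()).collect();
    let ranges: Vec<String> = score_ranges
        .iter()
        .map(|(s, e)| format!("[{s},{e}]"))
        .collect();
    // score_ranges is always present in the body (assumed wire shape).
    Some(format!(
        "{{\"prompt\":[{}],\"score_ranges\":[{}]}}",
        ids.join(","),
        ranges.join(",")
    ))
}
```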
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Three changes that together reshape the F6 fine-tune-review screen:
1. Finetune scoring reports through the standard agent activity system
instead of a separate finetune_progress String. The previous design
ran an independent progress field that forced a cross-lock dance and
bespoke UI plumbing. start_finetune_scoring now uses start_activity
+ activity.update, so the usual status line and notifications
capture scoring progress uniformly with other background work.
2. MindState gains a FinetuneScoringStats snapshot (responses seen,
above threshold, max divergence, error). The F6 empty screen shows
this instead of a loading message — so after a scoring run that
produced zero candidates, you can see *why* (e.g., max_divergence
below threshold).
3. The divergence threshold is configurable from F6 via +/- hotkeys
(scales by 10×) and persisted to ~/.consciousness/config.json5 via
config_writer::set_learn_threshold. AppConfig grows a learn section
with a threshold field (default 1e-7).
Also: user/mod.rs no longer uses try_lock() for the per-tick
unconscious/mind state sync — we fixed the locking hot paths that
made try_lock necessary, so lock().await is now the right choice.
And subconscious::learn::score_finetune_candidates now returns
(candidates, max_divergence) so the stats can be populated.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Two related changes to the learn subsystem:
1. AST node timestamps are now non-optional — both Leaf and Branch
variants carry a DateTime<Utc>. UNIX_EPOCH means "unset" (old entries
deserialized from on-disk conversation logs).
Training uses timestamps as unique keys for dedup, so we promote to
nanosecond precision: node_timestamp_ns(), TrainData.timestamp_ns,
FinetuneCandidate.timestamp_ns, mark_trained(ns).
2. build_token_ids() now also returns token-position ranges of assistant
messages. These are passed to vLLM's /score endpoint via the new
score_ranges field so only scored-position logprobs are returned —
cuts bandwidth/compute when scoring small windows.
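
For illustration, collecting assistant token ranges while concatenating
token IDs might look like this. The Role and Entry types and the half-open
(start, end) convention are assumptions; the real build_token_ids works on
the context AST.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
}

// Hypothetical flattened entry: a role plus its already-tokenized text.
pub struct Entry {
    pub role: Role,
    pub token_ids: Vec<u32>,
}

/// Concatenate all token IDs and record the half-open token-position
/// range [start, end) of every assistant entry along the way.
pub fn build_token_ids(entries: &[Entry]) -> (Vec<u32>, Vec<(usize, usize)>) {
    let mut ids = Vec::new();
    let mut ranges = Vec::new();
    for e in entries {
        let start = ids.len();
        ids.extend_from_slice(&e.token_ids);
        if e.role == Role::Assistant {
            ranges.push((start, ids.len()));
        }
    }
    (ids, ranges)
}
```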
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
When 's' is pressed on the learn screen, approved candidates are now
sent to the inference server's /train endpoint.
Samples are marked as sent immediately in the UI, and mark_trained()
is called after successful API response to prevent re-scoring.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Wire up divergence scoring to identify responses that depend heavily on
memories the model hasn't internalized. These are candidates for fine-tuning.
- Score finetune candidates automatically after each turn
- Track trained responses by timestamp to prevent overtraining
- F6 screen shows candidates with divergence scores
- j/k nav, a=approve, r=reject, g=toggle alternate gen, s=send
- Additive sync preserves approval status across ticks
- Keeps 10 most recent rejected, removes sent
The 's' key currently just marks as trained locally — actual /finetune
endpoint call to follow.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Identity memory nodes now participate in importance scoring alongside
conversation memories. Score loading/saving handles both sections, and
the conscious screen uses node.label() consistently for memory display.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Store now has an internal Mutex for capnp appends and an AtomicU64 for
size tracking. All methods take &self. The external Arc<Mutex<Store>>
is replaced with Arc<Store>.
- Store::append_lock protects file appends
- local.rs functions take &Store (not &mut Store)
- access_local() returns Arc<Store>
- All .lock().await calls removed from callers
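
A simplified sketch of the interior-locking layout, assuming a std Mutex
and an in-memory Vec<u8> standing in for the capnp log file. Only the
append_lock name comes from this change; the method names and the memory
orderings are illustrative.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

pub struct Store {
    // Guards appends; the Vec stands in for the on-disk log (assumption).
    append_lock: Mutex<Vec<u8>>,
    // Size is readable without taking the append lock.
    size: AtomicU64,
}

impl Store {
    pub fn new() -> Self {
        Store { append_lock: Mutex::new(Vec::new()), size: AtomicU64::new(0) }
    }

    /// Append a record. Takes &self, so callers share a plain Arc<Store>.
    pub fn append(&self, record: &[u8]) -> u64 {
        let mut log = self.append_lock.lock().unwrap();
        log.extend_from_slice(record);
        let new_size = log.len() as u64;
        self.size.store(new_size, Ordering::Release);
        new_size
    }

    /// Lock-free size read.
    pub fn size(&self) -> u64 {
        self.size.load(Ordering::Acquire)
    }
}

/// Hypothetical caller: no external lock, just a shared reference.
pub fn append_from_threads(store: &Arc<Store>, threads: usize, record: &'static [u8]) {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let s = Arc::clone(store);
            std::thread::spawn(move || {
                s.append(record);
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}
```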
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Replace Result<_, String> with anyhow::Result throughout:
- hippocampus/store module (persist, ops, types, view, mod)
- CLI modules (admin, agent, graph, journal, node)
- Run trait in main.rs
Use .context() and .with_context() instead of .map_err(|e| format!(...))
patterns. Add bail!() for early error returns.
Add access_local() helper in hippocampus/mod.rs that returns
Result<Arc<Mutex<Store>>> for direct local store access.
Fix store access patterns to properly lock Arc<Mutex<Store>> before
accessing fields in mind/unconscious.rs, mind/mod.rs, subconscious/learn.rs,
and hippocampus/memory.rs.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The full matrix scorer was deleted during the AST conversion. Restore
it: /score runs score_memories() which computes divergence for every
memory × response pair, stores the MemoryScore on MindState, and
displays per-memory weights with bar charts on the F2 screen.
Both scoring paths now use ActivityGuard::update() for live progress
in the status bar instead of creating a new activity per iteration.
Also bumps score API timeout from 120s to 300s and adds progress
logging throughout.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The /score endpoint was receiving chat-format messages which had to go
through the chat template tokenizer — this was failing with "System
message must be first" errors because the AST structure doesn't map
cleanly to chat message format.
Send raw token IDs via the new `prompt` field instead, matching what
the /completions endpoint already does. The vLLM score endpoint finds
assistant boundaries by scanning for <|im_start|>assistant token
patterns, so no message-level metadata is needed.
Also includes identity and journal sections in the scored context,
matching what the model actually sees during inference.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Scoring calls the /score endpoint directly via HTTP, bypassing the
stream_completion path. These requests had no priority field, so they
could preempt interactive work. Set priority=5 (between subconscious
agents at 2 and unconscious at 10).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Made StreamToken pub (was pub(crate), needed by context.rs).
Removed dead API_CLIENT, get_client, sampling/priority fields
from oneshot. Suppressed pre-existing SkipIndex warning in learn.rs.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Bulk replaced Arc<Mutex<Agent>> with Arc<Agent> across all files.
Fixed control.rs, memory.rs tool handlers. Fixed oneshot Backend.
Remaining errors are all agent.lock() → agent.state.lock() or
agent.context.lock() in mind/, user/, and a few in mod.rs.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Add agent/tokenizer.rs with global Qwen 3.5 tokenizer that generates
actual token IDs including chat template wrapping. ContextEntry now
stores token_ids: Vec<u32> instead of tokens: usize — the count is
derived from the length.
ContextEntry::new() tokenizes automatically via the global tokenizer.
ContextSection::push_entry() takes a raw ConversationEntry and
tokenizes it. set_message() re-tokenizes without needing an external
tokenizer parameter.
Token IDs include the full chat template: <|im_start|>role\ncontent
<|im_end|>\n — so concatenating token_ids across entries produces a
ready-to-send prompt for vLLM's /v1/completions endpoint.
The old tiktoken CoreBPE is now unused on Agent (will be removed in
a followup). Token counts are now exact for Qwen 3.5 instead of the
~85-90% approximation from cl100k_base.
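
The template wrapping, shown at the string level rather than on token IDs
(the real code tokenizes with the Qwen tokenizer, which is not reproduced
here). The function names are made up for this sketch.

```rust
/// Wrap one entry in the chat template described above.
pub fn wrap_chat_template(role: &str, content: &str) -> String {
    format!("<|im_start|>{role}\n{content}<|im_end|>\n")
}

/// Concatenating wrapped entries yields a ready-to-send prompt; the same
/// holds for per-entry token_ids once each piece is tokenized.
pub fn build_prompt(entries: &[(&str, &str)]) -> String {
    entries
        .iter()
        .map(|(role, content)| wrap_chat_template(role, content))
        .collect()
}
```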
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Use cumulative token position instead of entry index for the scoring
cutoff. This reflects actual context usage — a few large entries
near the end won't skew the boundary.
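
One way to picture the difference, assuming the cutoff is expressed as a
token position and the boundary is the first entry that starts at or after
it. The cutoff_index name and that boundary rule are assumptions, not the
actual code.

```rust
/// Index of the first entry whose starting token position is at or past
/// `cutoff_tokens`; returns token_counts.len() if no entry qualifies.
pub fn cutoff_index(token_counts: &[usize], cutoff_tokens: usize) -> usize {
    let mut pos = 0; // cumulative token position of the current entry's start
    for (i, &n) in token_counts.iter().enumerate() {
        if pos >= cutoff_tokens {
            return i;
        }
        pos += n;
    }
    token_counts.len()
}
```

With entry sizes [10, 10, 500, 500], a halfway cutoff by entry index lands
at entry 2, while a halfway cutoff by tokens (510 of 1020) lands at entry 3.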
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
score_memories_incremental now takes an async callback that fires
after each memory is scored. The callback:
- Writes the score to the conversation entry via set_score()
- Persists to memory-scores.json immediately
- Notifies the UI so the context screen updates live
Scoring no longer batches — each score is visible and persisted
as it completes. Does not touch the memory store.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
New types — not yet wired to callers:
- ContextEntry: wraps ConversationEntry with cached token count and
timestamp
- ContextSection: named group of entries with cached token total.
Private entries/tokens, read via entries()/tokens().
Mutation via push(entry), set(index, entry), del(index).
- ContextState: system/identity/journal/conversation sections + working_stack
- ConversationEntry::System variant for system prompt entries
Token counting happens once at push time. Sections maintain their
totals incrementally via push/set/del. No more recomputing from
scratch on every budget check.
Does not compile — callers need updating.
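
A cut-down sketch of the incremental-total idea. ContextEntry here is
reduced to a text plus a precomputed token count (the real one wraps
ConversationEntry and a timestamp), and the name() accessor is added only
for the sketch.

```rust
pub struct ContextEntry {
    pub text: String,
    pub tokens: usize, // counted once, when the entry is built
}

pub struct ContextSection {
    name: String,
    entries: Vec<ContextEntry>,
    tokens: usize, // cached total, kept in sync by push/set/del
}

impl ContextSection {
    pub fn new(name: &str) -> Self {
        Self { name: name.to_string(), entries: Vec::new(), tokens: 0 }
    }

    pub fn name(&self) -> &str {
        &self.name
    }

    pub fn entries(&self) -> &[ContextEntry] {
        &self.entries
    }

    /// O(1): no rescan of the entries on a budget check.
    pub fn tokens(&self) -> usize {
        self.tokens
    }

    pub fn push(&mut self, entry: ContextEntry) {
        self.tokens += entry.tokens;
        self.entries.push(entry);
    }

    pub fn set(&mut self, index: usize, entry: ContextEntry) {
        self.tokens = self.tokens - self.entries[index].tokens + entry.tokens;
        self.entries[index] = entry;
    }

    pub fn del(&mut self, index: usize) -> ContextEntry {
        let removed = self.entries.remove(index);
        self.tokens -= removed.tokens;
        removed
    }
}
```

Keeping entries and tokens private is what makes the cached total
trustworthy: every mutation has to go through push/set/del.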
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Only Message, Role, MessageContent, ContentPart, ToolCall,
FunctionCall, Usage, ImageUrl are pub-exported from agent::api.
Internal types (ChatRequest, ChatCompletionChunk, ChunkChoice,
Delta, ReasoningConfig, ToolCallDelta, FunctionCallDelta) are
pub(crate) — invisible outside the crate.
All callers updated to import from agent::api:: instead of
agent::api::types::.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
New src/agent/api/http.rs: ~240 lines, supports GET/POST, JSON/form
bodies, SSE streaming via chunk(), TLS via rustls. No tracing dep.
Removes reqwest from the main crate and telegram channel crate.
Cargo.lock drops ~900 lines of transitive dependencies.
tracing now only pulled in by tui-markdown.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Streaming text now goes directly to agent entries via append_streaming().
sync_from_agent diffs the growing entry each tick. The streaming entry
is popped when the response completes; build_response_message pushes
the final version.
All status feedback uses RAII ActivityGuards:
- push_activity() for long-running work (thinking, streaming, scoring)
- notify() for instant feedback (compacted, DMN state changes, commands)
- Guards auto-remove on Drop, appending "(complete)" and lingering 5s
- expire_activities() cleans up timed-out notifications on render tick
UiMessage enum reduced to a single Info variant with zero sends.
The channel infrastructure remains for now (Mind/Agent still take
UiSender in signatures) — mechanical cleanup for a follow-up.
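
A small sketch of the RAII guard behaviour listed above, assuming a shared
Arc<Mutex<..>> list of activities. The Activity/Activities types, the
labels helper, and passing `now` into expire_activities are choices made
for this sketch; only the "(complete)" suffix and the 5s linger come from
the description.

```rust
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

pub struct Activity {
    id: u64,
    label: String,
    expires: Option<Instant>, // None while the work is still running
}

#[derive(Default)]
pub struct Activities {
    next_id: u64,
    items: Vec<Activity>,
}

pub type Shared = Arc<Mutex<Activities>>;

pub struct ActivityGuard {
    id: u64,
    shared: Shared,
}

/// Register a long-running activity; it stays listed while the guard lives.
pub fn push_activity(shared: &Shared, label: &str) -> ActivityGuard {
    let mut a = shared.lock().unwrap();
    let id = a.next_id;
    a.next_id += 1;
    a.items.push(Activity { id, label: label.to_string(), expires: None });
    ActivityGuard { id, shared: Arc::clone(shared) }
}

impl ActivityGuard {
    /// Update the label in place instead of creating a new activity.
    pub fn update(&self, label: &str) {
        let mut a = self.shared.lock().unwrap();
        if let Some(item) = a.items.iter_mut().find(|i| i.id == self.id) {
            item.label = label.to_string();
        }
    }
}

impl Drop for ActivityGuard {
    fn drop(&mut self) {
        // Mark as complete and let it linger for 5s before expiry.
        if let Ok(mut a) = self.shared.lock() {
            if let Some(item) = a.items.iter_mut().find(|i| i.id == self.id) {
                item.label.push_str(" (complete)");
                item.expires = Some(Instant::now() + Duration::from_secs(5));
            }
        }
    }
}

/// Called on the render tick: drop entries whose linger time has passed.
pub fn expire_activities(shared: &Shared, now: Instant) {
    shared.lock().unwrap().items.retain(|i| i.expires.map_or(true, |t| t > now));
}

/// Current labels, in insertion order (sketch-only helper).
pub fn labels(shared: &Shared) -> Vec<String> {
    shared.lock().unwrap().items.iter().map(|i| i.label.clone()).collect()
}
```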
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Reasoning tokens: dropped for now, will land in context entries later.
Debug sends: converted to dbglog! macro (writes to debug.log).
Activity: now a field on Agent, set directly, read by UI via try_lock.
score_memories_incremental takes agent Arc for activity writes.
UiMessage down to 2 variants: TextDelta, Info.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Status bar reads directly from Agent and MindState on each render tick.
Activity is now a field on Agent — set by agent code directly, read by
UI via try_lock. DmnAnnotation, ContextInfoUpdate, AgentUpdate were
already dead (no senders).
UiMessage down to 4 variants: TextDelta, Reasoning, Debug, Info.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>