29 commits

Author SHA1 Message Date
Kent Overstreet
343e43afab learn: stream candidates to UI, update status during alternate gen
With the timestamp filter gone (previous commit), score_finetune_candidates
started returning the actual ~100+ candidates per scoring run. The
existing code generated alternates for all of them in a tight loop
before returning anything, leaving the status line stuck on
"finetune: scoring N responses..." for hundreds of seconds while the
B200 was pegged.

Two fixes:

1. score_finetune_candidates now takes an ActivityGuard and a callback.
   Candidates are emitted one-at-a-time as they complete (after their
   alternate if that's enabled, immediately otherwise). The activity
   status updates to "finetune: generating alternate N/M" during the
   alternate-gen phase so it's clear what's happening.

2. BgEvent::FinetuneCandidates(Vec<_>) → FinetuneCandidate(one). Each
   emitted candidate is pushed onto shared.finetune_candidates; the UI
   tick picks it up and renders it on the next frame. start_finetune_scoring
   clears the previous run's list at the top so each run is fresh.

Return type changes from (Vec, f64) → (usize, f64) — the count above
threshold is all the caller still needs since the candidates stream
through the callback.
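
A minimal sketch of the streaming shape described above, not the project's actual code. The Candidate struct, the score_streaming name, and the closure-based status and emit parameters are hypothetical stand-ins for FinetuneCandidate, score_finetune_candidates, ActivityGuard and the BgEvent callback; the strict "greater than" threshold comparison is also an assumption.

```rust
/// Hypothetical candidate record; the real FinetuneCandidate carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub timestamp_ns: i64,
    pub divergence: f64,
}

/// Takes (timestamp_ns, divergence) pairs, reports progress through `status`,
/// and hands each above-threshold candidate to `emit` as soon as it is ready.
/// Returns (count above threshold, max divergence seen).
pub fn score_streaming(
    responses: &[(i64, f64)],
    threshold: f64,
    mut status: impl FnMut(String),
    mut emit: impl FnMut(Candidate),
) -> (usize, f64) {
    let max_div = responses.iter().map(|r| r.1).fold(0.0, f64::max);
    let above: Vec<_> = responses.iter().filter(|r| r.1 > threshold).collect();
    let total = above.len();
    for (n, r) in above.iter().enumerate() {
        status(format!("finetune: generating alternate {}/{}", n + 1, total));
        // Alternate generation would run here, before the candidate is emitted.
        emit(Candidate { timestamp_ns: r.0, divergence: r.1 });
    }
    (total, max_div)
}
```

Because candidates leave through the callback, the caller only needs the count and the maximum divergence from the return value.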

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 12:44:25 -04:00
Kent Overstreet
080b4f9084 context: tighten timestamp schema; every AstNode has one
Previously NodeLeaf.timestamp and AstNode::Branch.timestamp accepted
null or missing via a deserialize_timestamp_or_epoch fallback — legacy
entries in conversation.jsonl from before Branch timestamps existed
(and from before chrono serialization was wired up) would load with
UNIX_EPOCH as a sentinel. Downstream, node_timestamp_ns() returned
Option<i64> and callers had to handle None as "old entry, skip."

That None check was silently dropping every candidate in
score_finetune_candidates when scoring an older session. The F6
screen showed "0 above threshold" even when max_divergence was
orders of magnitude above the threshold, because every entry was
failing the None check, not the divergence check.

The fix, in three parts:

1. src/bin/fix-timestamps.rs — one-off migration tool that walks a
   conversation.jsonl, linearly interpolates timestamps for entries
   stuck at UNIX_EPOCH (using surrounding real timestamps as anchors),
   propagates to child leaves with per-sibling ns offsets, and bumps
   any collisions by 1 ns for uniqueness. Ran against the current
   session's log: 11887 entries, 72289 ns bumps, all unique.

2. context.rs — drop default_timestamp and
   deserialize_timestamp_or_epoch. NodeLeaf and Branch now require a
   present non-null timestamp on deserialize. Tests flip from
   "missing/null → UNIX_EPOCH" to "missing/null → Err."

3. subconscious/learn.rs — node_timestamp_ns now returns i64, not
   Option<i64>. The matching caller in score_finetune_candidates
   collapses from a Some/None match to a single trained-set check.
   mind/log.rs's oldest_timestamp no longer filters UNIX_EPOCH.

Every line currently on disk has already been migrated. Going
forward, new AstNodes always carry real timestamps (Utc::now() at
construction time), so the strict schema is the invariant, not an
aspiration.
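
The interpolation and collision-bump steps of the migration can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of fix-timestamps.rs: timestamps are plain i64 nanoseconds, 0 stands in for UNIX_EPOCH, the function names are hypothetical, leading and trailing unset runs (which have only one anchor) are left untouched, and propagation to child leaves is omitted.

```rust
use std::collections::HashSet;

/// Stand-in for the UNIX_EPOCH sentinel, as nanoseconds since the epoch.
const UNSET: i64 = 0;

/// Fills each run of UNSET values that sits between two real timestamps by
/// linear interpolation between those two anchors.
pub fn interpolate_unset(ts: &mut [i64]) {
    let mut last_real: Option<usize> = None;
    for i in 0..ts.len() {
        if ts[i] == UNSET {
            continue;
        }
        if let Some(lo) = last_real {
            let gap = (i - lo) as i128;
            let span = (ts[i] - ts[lo]) as i128;
            // Everything strictly between lo and i is UNSET by construction.
            for j in (lo + 1)..i {
                let step = (j - lo) as i128;
                ts[j] = ts[lo] + (span * step / gap) as i64;
            }
        }
        last_real = Some(i);
    }
}

/// Bumps any timestamp that was already seen by 1 ns until it is unique.
/// Returns the total number of 1 ns bumps applied.
pub fn bump_collisions(ts: &mut [i64]) -> usize {
    let mut seen = HashSet::new();
    let mut bumps = 0;
    for t in ts.iter_mut() {
        while !seen.insert(*t) {
            *t += 1;
            bumps += 1;
        }
    }
    bumps
}
```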

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 12:35:16 -04:00
Kent Overstreet
77822992c8 learn: score_ranges is now required; short-circuit on empty
vllm's /v1/score endpoint made score_ranges a required field (the
messages-mode fallback that used to pattern-scan for assistant
boundaries is gone). Always send the field, and if we have nothing to
score, skip the HTTP round-trip entirely instead of letting the server
422 us.

Response parsing is unchanged — serde ignores the renamed range_index
field and the dropped role field since we only extract total_logprob.
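
A rough sketch of the short-circuit, assuming a hypothetical ScoreRequest struct and a `send` closure standing in for the HTTP call; only the prompt and score_ranges field names come from the commit messages, everything else is illustrative.

```rust
/// Hypothetical request shape; only the two fields named in the log are modeled.
#[derive(Debug, PartialEq)]
pub struct ScoreRequest {
    pub prompt: Vec<u32>,
    pub score_ranges: Vec<(usize, usize)>,
}

/// Always includes score_ranges in the request, and skips the round-trip
/// entirely when there is nothing to score.
pub fn score_with<F>(
    prompt: Vec<u32>,
    score_ranges: Vec<(usize, usize)>,
    send: F,
) -> Result<Vec<f64>, String>
where
    F: FnOnce(ScoreRequest) -> Result<Vec<f64>, String>,
{
    if score_ranges.is_empty() {
        // Nothing to score: return an empty result without calling the server.
        return Ok(Vec::new());
    }
    send(ScoreRequest { prompt, score_ranges })
}
```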

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 12:19:28 -04:00
Kent Overstreet
e5dd8312c7 learn: F6 screen — scoring stats, ActivityGuard, configurable threshold
Three changes that together reshape the F6 fine-tune-review screen:

1. Finetune scoring reports through the standard agent activity system
   instead of a separate finetune_progress String. The previous design
   ran an independent progress field that forced a cross-lock dance and
   bespoke UI plumbing. start_finetune_scoring now uses start_activity
   + activity.update, so the usual status line and notifications
   capture scoring progress uniformly with other background work.

2. MindState gains a FinetuneScoringStats snapshot (responses seen,
   above threshold, max divergence, error). The F6 empty screen shows
   this instead of a loading message — so after a scoring run that
   produced zero candidates, you can see *why* (e.g., max_divergence
   below threshold).

3. The divergence threshold is configurable from F6 via +/- hotkeys
   (scales by 10×) and persisted to ~/.consciousness/config.json5 via
   config_writer::set_learn_threshold. AppConfig grows a learn section
   with a threshold field (default 1e-7).

Also: user/mod.rs no longer uses try_lock() for the per-tick
unconscious/mind state sync — we fixed the locking hot paths that
made try_lock necessary, so lock().await is now the right choice.
And subconscious::learn::score_finetune_candidates now returns
(candidates, max_divergence) so the stats can be populated.
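
As an illustration of the threshold hotkeys and the stats snapshot, here is a minimal sketch. The function and method names, the field names on ScoringStats, and the wording of the status line are assumptions; only the 10× scaling, the 1e-7 default, and the four pieces of information in the snapshot come from the text above.

```rust
/// Default divergence threshold named in the commit message.
pub const DEFAULT_THRESHOLD: f64 = 1e-7;

/// '+' raises the threshold tenfold, '-' lowers it tenfold, other keys are ignored.
pub fn adjust_threshold(current: f64, key: char) -> f64 {
    match key {
        '+' => current * 10.0,
        '-' => current / 10.0,
        _ => current,
    }
}

/// Hypothetical mirror of the FinetuneScoringStats snapshot.
#[derive(Debug, Clone, Default)]
pub struct ScoringStats {
    pub responses_seen: usize,
    pub above_threshold: usize,
    pub max_divergence: f64,
    pub error: Option<String>,
}

impl ScoringStats {
    /// One-line summary for the empty screen, so a run with zero candidates
    /// still shows how far max_divergence was from the threshold.
    pub fn empty_screen_line(&self, threshold: f64) -> String {
        if let Some(err) = &self.error {
            return format!("scoring failed: {err}");
        }
        format!(
            "{} responses, {} above threshold {:e}, max divergence {:e}",
            self.responses_seen, self.above_threshold, threshold, self.max_divergence
        )
    }
}
```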

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 11:49:26 -04:00
Kent Overstreet
2b632d568b learn: nanosecond timestamps, token ranges for /score
Two related changes to the learn subsystem:

1. AST node timestamps are now non-optional — both Leaf and Branch
   variants carry a DateTime<Utc>. UNIX_EPOCH means "unset" (old entries
   deserialized from on-disk conversation logs).

   Training uses timestamps as unique keys for dedup, so we promote to
   nanosecond precision: node_timestamp_ns(), TrainData.timestamp_ns,
   FinetuneCandidate.timestamp_ns, mark_trained(ns).

2. build_token_ids() now also returns token-position ranges of assistant
   messages. These are passed to vLLM's /score endpoint via the new
   score_ranges field so only scored-position logprobs are returned —
   cuts bandwidth/compute when scoring small windows.
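
A minimal sketch of how token-position ranges for assistant messages could be collected while concatenating token IDs. The function name, the (role, tokens) input shape, and the half-open (start, end) range convention are assumptions for illustration, not the real build_token_ids() signature.

```rust
/// Concatenates per-message token IDs and records the half-open
/// (start, end) position range of every assistant message.
pub fn build_token_ids_sketch(
    segments: &[(&str, Vec<u32>)],
) -> (Vec<u32>, Vec<(usize, usize)>) {
    let mut ids = Vec::new();
    let mut ranges = Vec::new();
    for (role, tokens) in segments {
        let start = ids.len();
        ids.extend_from_slice(tokens);
        if *role == "assistant" {
            ranges.push((start, ids.len()));
        }
    }
    (ids, ranges)
}
```

The ranges would then travel in the score_ranges field, so the server only needs to return logprobs for those positions.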

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 11:48:37 -04:00
Kent Overstreet
5d9d3ffc5b learn: wire up /train endpoint for approved candidates
When 's' is pressed on the learn screen, approved candidates are now
sent to the inference server's /train endpoint.

Samples are marked as sent immediately in the UI, and mark_trained()
is called after a successful API response to prevent re-scoring.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 02:04:26 -04:00
Kent Overstreet
50b7b3a33a F6 learn screen: fine-tuning candidate review
Wire up divergence scoring to identify responses that depend heavily on
memories the model hasn't internalized. These are candidates for fine-tuning.

- Score finetune candidates automatically after each turn
- Track trained responses by timestamp to prevent overtraining
- F6 screen shows candidates with divergence scores
- j/k nav, a=approve, r=reject, g=toggle alternate gen, s=send
- Additive sync preserves approval status across ticks
- Keeps 10 most recent rejected, removes sent

The 's' key currently just marks as trained locally — actual /finetune
endpoint call to follow.
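
The sync rules in the list above (additive sync, keep the 10 most recent rejected, remove sent) might look roughly like this. All names here are hypothetical, and "most recent" is assumed to mean the largest timestamp.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Status {
    Pending,
    Approved,
    Rejected,
    Sent,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub timestamp: i64,
    pub status: Status,
}

const MAX_REJECTED: usize = 10;

/// Merges freshly scored candidates (by timestamp) into the UI list.
pub fn sync_candidates(ui: &mut Vec<Item>, incoming: &[i64]) {
    // Additive: existing items keep their approval status, new ones start Pending.
    for &ts in incoming {
        if !ui.iter().any(|i| i.timestamp == ts) {
            ui.push(Item { timestamp: ts, status: Status::Pending });
        }
    }
    // Sent items are removed from the list.
    ui.retain(|i| i.status != Status::Sent);
    // Keep only the MAX_REJECTED most recent rejected items.
    let mut rejected: Vec<i64> = ui
        .iter()
        .filter(|i| i.status == Status::Rejected)
        .map(|i| i.timestamp)
        .collect();
    rejected.sort_unstable_by(|a, b| b.cmp(a));
    if let Some(&cutoff) = rejected.get(MAX_REJECTED - 1) {
        ui.retain(|i| i.status != Status::Rejected || i.timestamp >= cutoff);
    }
}
```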

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 02:04:26 -04:00
Kent Overstreet
7046e63b9d Include identity nodes in memory scoring
Identity memory nodes now participate in importance scoring alongside
conversation memories. Score loading/saving handles both sections, and
the conscious screen uses node.label() consistently for memory display.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-15 05:59:58 -04:00
Kent Overstreet
b3d0a3ab25 store: internal locking, remove Arc<Mutex<Store>> wrapper
Store now has internal Mutex for capnp appends and AtomicU64 for
size tracking. All methods take &self. The external Arc<Mutex<Store>>
is replaced with Arc<Store>.

- Store::append_lock protects file appends
- local.rs functions take &Store (not &mut Store)
- access_local() returns Arc<Store>
- All .lock().await calls removed from callers
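
A minimal sketch of the internal-locking pattern, under assumptions: it uses std::sync::Mutex rather than whatever async mutex the project uses, an in-memory Vec<u8> stands in for the capnp log file, and the method names are illustrative.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

pub struct Store {
    /// Serializes appends; the Vec stands in for the on-disk log.
    append_lock: Mutex<Vec<u8>>,
    /// Size is readable without taking the append lock.
    size: AtomicU64,
}

impl Store {
    pub fn new() -> Self {
        Store {
            append_lock: Mutex::new(Vec::new()),
            size: AtomicU64::new(0),
        }
    }

    /// Appends a record and returns the offset it was written at.
    /// Takes &self, so callers can share the store as Arc<Store>.
    pub fn append(&self, record: &[u8]) -> u64 {
        let mut log = self.append_lock.lock().unwrap();
        let offset = log.len() as u64;
        log.extend_from_slice(record);
        self.size.store(log.len() as u64, Ordering::Release);
        offset
    }

    pub fn size(&self) -> u64 {
        self.size.load(Ordering::Acquire)
    }
}
```

With the lock inside the type, callers hold an Arc<Store> and never lock the whole store just to read from it.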

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-13 21:49:54 -04:00
Kent Overstreet
af3e41f1d9 migrate more files to use index-based node access
- learn.rs, daemon.rs, graph.rs, digest.rs, prompts.rs
- Convert store.nodes.get() → store.get_node()
- Convert store.nodes.contains_key() → store.contains_key()
- Convert store.nodes.values/iter() → all_keys + get_node

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-13 19:37:11 -04:00
Kent Overstreet
b8db8754be Convert store and CLI to anyhow::Result for cleaner error handling
Replace Result<_, String> with anyhow::Result throughout:
- hippocampus/store module (persist, ops, types, view, mod)
- CLI modules (admin, agent, graph, journal, node)
- Run trait in main.rs

Use .context() and .with_context() instead of .map_err(|e| format!(...))
patterns. Add bail!() for early error returns.

Add access_local() helper in hippocampus/mod.rs that returns
Result<Arc<Mutex<Store>>> for direct local store access.

Fix store access patterns to properly lock Arc<Mutex<Store>> before
accessing fields in mind/unconscious.rs, mind/mod.rs, subconscious/learn.rs,
and hippocampus/memory.rs.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-13 18:05:04 -04:00
ProofOfConcept
58cec97e57 Restore full N×M memory scoring matrix (/score command)
The full matrix scorer was deleted during the AST conversion. Restore
it: /score runs score_memories() which computes divergence for every
memory × response pair, stores the MemoryScore on MindState, and
displays per-memory weights with bar charts on the F2 screen.

Both scoring paths now use ActivityGuard::update() for live progress
in the status bar instead of creating a new activity per iteration.

Also bumps score API timeout from 120s to 300s and adds progress
logging throughout.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-10 01:47:54 -04:00
ProofOfConcept
be65399710 Switch memory scoring from chat messages to raw token IDs
The /score endpoint was receiving chat-format messages which had to go
through the chat template tokenizer — this was failing with "System
message must be first" errors because the AST structure doesn't map
cleanly to chat message format.

Send raw token IDs via the new `prompt` field instead, matching what
the /completions endpoint already does. The vLLM score endpoint finds
assistant boundaries by scanning for <|im_start|>assistant token
patterns, so no message-level metadata is needed.

Also includes identity and journal sections in the scored context,
matching what the model actually sees during inference.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-09 21:07:00 -04:00
ProofOfConcept
67332eb55e Add vLLM priority to memory scoring requests
Scoring calls the /score endpoint directly via HTTP, bypassing the
stream_completion path. These requests had no priority field, so they
could preempt interactive work. Set priority=5 (between subconscious
agents at 2 and unconscious at 10).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-09 20:42:38 -04:00
Kent Overstreet
14fd8c9b90 Clean up warnings: StreamToken pub, dead oneshot code, SkipIndex
Made StreamToken pub (was pub(crate), needed by context.rs).
Removed dead API_CLIENT, get_client, sampling/priority fields
from oneshot. Suppressed pre-existing SkipIndex warning in learn.rs.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 16:35:57 -04:00
Kent Overstreet
1d61b091b0 WIP: Agent/AgentState — 36 errors remaining, all .lock() → .state.lock() or .context.lock()
Bulk replaced Arc<Mutex<Agent>> with Arc<Agent> across all files.
Fixed control.rs, memory.rs tool handlers. Fixed oneshot Backend.
Remaining errors are all agent.lock() → agent.state.lock() or
agent.context.lock() in mind/, user/, and a few in mod.rs.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 15:40:36 -04:00
Kent Overstreet
e587431f9a IT BUILDS: Full AST migration compiles — zero errors
All callers migrated from old context types to AstNode/ContextState.
Killed: Message, Role (api), ConversationEntry, ContextEntry,
ContextSection, working_stack, api/parsing.rs, api/types.rs,
api/openai.rs, context_old.rs.

Oneshot standalone path stubbed (needs completions API rewrite).
12 warnings remaining (dead code cleanup).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 15:29:52 -04:00
Kent Overstreet
d0d876e067 WIP: Fix mind/, dmn, UI layer — 35 errors remaining
mind/mod.rs and mind/dmn.rs fully migrated to AST types.
user/context.rs, user/widgets.rs, user/chat.rs partially migrated.
Killed working_stack tool, tokenize_conv_entry, context_old.rs.

Remaining: learn.rs (22), oneshot.rs (5), subconscious.rs (3),
chat.rs (3), widgets.rs (1), context.rs (1).

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 15:24:49 -04:00
Kent Overstreet
5e4067c04f Replace token counting with token generation via HuggingFace tokenizer
Add agent/tokenizer.rs with global Qwen 3.5 tokenizer that generates
actual token IDs including chat template wrapping. ContextEntry now
stores token_ids: Vec<u32> instead of tokens: usize — the count is
derived from the length.

ContextEntry::new() tokenizes automatically via the global tokenizer.
ContextSection::push_entry() takes a raw ConversationEntry and
tokenizes it. set_message() re-tokenizes without needing an external
tokenizer parameter.

Token IDs include the full chat template: <|im_start|>role\ncontent
<|im_end|>\n — so concatenating token_ids across entries produces a
ready-to-send prompt for vLLM's /v1/completions endpoint.

The old tiktoken CoreBPE is now unused on Agent (will be removed in
a followup). Token counts are now exact for Qwen 3.5 instead of the
~85-90% approximation from cl100k_base.
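
To illustrate why per-entry wrapping makes concatenation sufficient, here is a string-level sketch of the template. The real code works on token IDs from the tokenizer; the function names are hypothetical, and the line break between "content" and "<|im_end|>" in the description above is assumed to be message wrapping rather than part of the template.

```rust
/// Wraps one message in the chat template described above.
pub fn wrap_chat_template(role: &str, content: &str) -> String {
    format!("<|im_start|>{role}\n{content}<|im_end|>\n")
}

/// Concatenating wrapped entries yields a complete prompt.
pub fn build_prompt(entries: &[(&str, &str)]) -> String {
    entries
        .iter()
        .map(|(role, content)| wrap_chat_template(role, content))
        .collect()
}
```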

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 11:20:03 -04:00
Kent Overstreet
613704720b Score memories in first 60% of conversation by tokens
Use cumulative token position instead of entry index for the scoring
cutoff. This reflects actual context usage — a few large entries
near the end won't skew the boundary.
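
A small sketch of a token-based cutoff, with assumptions: the function name is hypothetical, the cutoff is passed as an integer percentage, and an entry counts as inside the window if it starts before the boundary (the actual rule for a straddling entry is not stated above).

```rust
/// Returns how many leading entries start within the first `percent`
/// of the conversation, measured in cumulative tokens.
pub fn scoring_cutoff(entry_tokens: &[usize], percent: usize) -> usize {
    let total: usize = entry_tokens.iter().sum();
    let mut cumulative = 0usize;
    for (i, &n) in entry_tokens.iter().enumerate() {
        // Integer form of: cumulative / total >= percent / 100
        if cumulative * 100 >= total * percent {
            return i;
        }
        cumulative += n;
    }
    entry_tokens.len()
}
```

With entry sizes [50, 30, 5, 5, 10], an index-based 60% cutoff would take three entries, while the token-based one takes two, since the first two entries already cover 80 of 100 tokens.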

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2026-04-07 21:43:59 -04:00
Kent Overstreet
fd58386951 Incremental memory scoring with per-score persistence
score_memories_incremental now takes an async callback that fires
after each memory is scored. The callback:
- Writes the score to the conversation entry via set_score()
- Persists to memory-scores.json immediately
- Notifies the UI so the context screen updates live

Scoring no longer batches — each score is visible and persisted
as it completes. Does not touch the memory store.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 21:34:14 -04:00
Kent Overstreet
62996e27d7 WIP: ContextEntry/ContextSection data structures for incremental token counting
New types — not yet wired to callers:

- ContextEntry: wraps ConversationEntry with cached token count and
  timestamp
- ContextSection: named group of entries with cached token total.
  Private entries/tokens, read via entries()/tokens().
  Mutation via push(entry), set(index, entry), del(index).
- ContextState: system/identity/journal/conversation sections + working_stack
- ConversationEntry::System variant for system prompt entries

Token counting happens once at push time. Sections maintain their
totals incrementally via push/set/del. No more recomputing from
scratch on every budget check.
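
The incremental bookkeeping can be sketched as below. This is a simplified stand-in, not the real ContextSection: the Entry type is hypothetical, and set and del panic on an out-of-range index.

```rust
#[derive(Debug, Clone)]
pub struct Entry {
    pub text: String,
    pub tokens: usize,
}

/// Group of entries with a cached token total that is kept in step
/// with every mutation, so reads never recount.
#[derive(Debug, Default)]
pub struct Section {
    entries: Vec<Entry>,
    tokens: usize,
}

impl Section {
    pub fn entries(&self) -> &[Entry] {
        &self.entries
    }

    pub fn tokens(&self) -> usize {
        self.tokens
    }

    pub fn push(&mut self, entry: Entry) {
        self.tokens += entry.tokens;
        self.entries.push(entry);
    }

    pub fn set(&mut self, index: usize, entry: Entry) {
        // Swap the old entry's contribution for the new one.
        self.tokens = self.tokens - self.entries[index].tokens + entry.tokens;
        self.entries[index] = entry;
    }

    pub fn del(&mut self, index: usize) -> Entry {
        let removed = self.entries.remove(index);
        self.tokens -= removed.tokens;
        removed
    }
}
```

Keeping the fields private is what makes the cached total trustworthy: every change has to go through push, set or del.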

Does not compile — callers need updating.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 20:48:08 -04:00
Kent Overstreet
f33b1767da Restrict API types visibility — types module is now private
Only Message, Role, MessageContent, ContentPart, ToolCall,
FunctionCall, Usage, ImageUrl are pub-exported from agent::api.

Internal types (ChatRequest, ChatCompletionChunk, ChunkChoice,
Delta, ReasoningConfig, ToolCallDelta, FunctionCallDelta) are
pub(crate) — invisible outside the crate.

All callers updated to import from agent::api:: instead of
agent::api::types::.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 13:39:20 -04:00
Kent Overstreet
1cf4f504c0 Kill reqwest — minimal HTTP client on raw hyper + tokio-rustls
New src/agent/api/http.rs: ~240 lines, supports GET/POST, JSON/form
bodies, SSE streaming via chunk(), TLS via rustls. No tracing dep.

Removes reqwest from the main crate and telegram channel crate.
Cargo.lock drops ~900 lines of transitive dependencies.

tracing now only pulled in by tui-markdown.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-07 12:50:40 -04:00
ProofOfConcept
f390fa1617 Delete ui_channel.rs — relocate types, remove all UiMessage/UiSender plumbing
Types relocated:
- StreamTarget → mind/mod.rs (Mind decides Conversation vs Autonomous)
- SharedActiveTools + shared_active_tools() → agent/tools/mod.rs
- ContextSection + SharedContextState → agent/context.rs (already there)
- StatusInfo + ContextInfo → user/mod.rs (UI display state)

Removed UiSender from: Agent::turn, Mind, learn.rs, all function signatures.
The entire message-passing layer is gone. All state flows through
Agent fields (activities, entries, streaming) read by the UI via try_lock.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-05 22:34:48 -04:00
ProofOfConcept
cfddb55ed9 Kill TextDelta, Info — UiMessage is dead. RAII ActivityGuards replace all status feedback
Streaming text now goes directly to agent entries via append_streaming().
sync_from_agent diffs the growing entry each tick. The streaming entry
is popped when the response completes; build_response_message pushes
the final version.

All status feedback uses RAII ActivityGuards:
- push_activity() for long-running work (thinking, streaming, scoring)
- notify() for instant feedback (compacted, DMN state changes, commands)
- Guards auto-remove on Drop, appending "(complete)" and lingering 5s
- expire_activities() cleans up timed-out notifications on render tick

UiMessage enum reduced to a single Info variant with zero sends.
The channel infrastructure remains for now (Mind/Agent still take
UiSender in signatures) — mechanical cleanup for a follow-up.
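
A minimal sketch of the RAII guard idea described above, under assumptions: the Activity and Activities types, the explicit id parameter, and the labels helper are hypothetical, and expire_activities takes the current time as an argument so the expiry can be exercised without sleeping.

```rust
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

pub struct Activity {
    id: u64,
    label: String,
    /// None while the work is running; set when the guard drops.
    expires: Option<Instant>,
}

pub type Activities = Arc<Mutex<Vec<Activity>>>;

pub struct ActivityGuard {
    id: u64,
    list: Activities,
}

/// Registers an activity and returns a guard that completes it on Drop.
pub fn push_activity(list: &Activities, id: u64, label: &str) -> ActivityGuard {
    list.lock().unwrap().push(Activity { id, label: label.to_string(), expires: None });
    ActivityGuard { id, list: Arc::clone(list) }
}

impl ActivityGuard {
    /// Replaces the label in place instead of creating a new activity.
    pub fn update(&self, label: &str) {
        let mut list = self.list.lock().unwrap();
        if let Some(a) = list.iter_mut().find(|a| a.id == self.id) {
            a.label = label.to_string();
        }
    }
}

impl Drop for ActivityGuard {
    fn drop(&mut self) {
        // Avoid panicking inside drop if the mutex is poisoned.
        if let Ok(mut list) = self.list.lock() {
            if let Some(a) = list.iter_mut().find(|a| a.id == self.id) {
                a.label.push_str(" (complete)");
                a.expires = Some(Instant::now() + Duration::from_secs(5));
            }
        }
    }
}

/// Called on each render tick: drops activities whose linger time has passed.
pub fn expire_activities(list: &Activities, now: Instant) {
    list.lock().unwrap().retain(|a| a.expires.map_or(true, |t| t > now));
}

pub fn labels(list: &Activities) -> Vec<String> {
    list.lock().unwrap().iter().map(|a| a.label.clone()).collect()
}
```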

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-05 22:18:07 -04:00
ProofOfConcept
e7914e3d58 Kill Reasoning, Debug, Activity variants — read status from Agent directly
Reasoning tokens: dropped for now, will land in context entries later.
Debug sends: converted to dbglog! macro (writes to debug.log).
Activity: now a field on Agent, set directly, read by UI via try_lock.
score_memories_incremental takes agent Arc for activity writes.

UiMessage down to 2 variants: TextDelta, Info.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-05 21:45:55 -04:00
ProofOfConcept
eafc2887a3 Kill StatusUpdate, Activity, DmnAnnotation, ContextInfoUpdate, AgentUpdate
Status bar reads directly from Agent and MindState on each render tick.
Activity is now a field on Agent — set by agent code directly, read by
UI via try_lock. DmnAnnotation, ContextInfoUpdate, AgentUpdate were
already dead (no senders).

UiMessage down to 4 variants: TextDelta, Reasoning, Debug, Info.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-05 21:34:27 -04:00
Kent Overstreet
390b6c6c0a more reorg
2026-04-05 01:48:11 -04:00
Renamed from src/agent/training.rs