Previously when append_kvp created a new section or added a key, it
stuffed the "\n " separator into the new kvp's wsc.0 (the whitespace
between its own key and colon) instead of the prior kvp's wsc.3 (the
whitespace after the prior trailing comma). Result looked like:
lsp_servers: [...],
learn
: {generate_alternates
: true,},}
The writer also didn't set any interior whitespace on the new section's
JSONObjectContext, so everything crammed onto one line — `{key: val,}`
compact, not `{\n key: val,\n}` multi-line.
Rewrote the appender as append_kvp_pretty(object, key, value,
inner_indent, outer_indent):
- separator between kvps goes in the prior kvp's wsc.3, or if we're the
first kvp in a fresh object, in the object's own wsc.0 (after its
opening `{`)
- new kvp's wsc.3 carries `,\n<outer_indent>` so the parent's closing
`}` lands correctly indented
- interior indent vs outer indent are both explicit, so we don't have
to rewrite this logic every time we add another nesting level
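The placement logic in miniature, using hypothetical stand-in types (the real json-five rt AST carries more whitespace slots per node; only the slots named above are modeled):

```rust
// Hypothetical miniature of the round-trip AST. For a kvp, wsc[0] is the
// whitespace between its key and colon (where the bug used to dump the
// separator); wsc[3] is the whitespace after its trailing comma.
#[derive(Default)]
struct Kvp { key: String, value: String, wsc: [String; 4] }

#[derive(Default)]
struct Obj { wsc0: String, kvps: Vec<Kvp> } // wsc0 = whitespace after `{`

fn append_kvp_pretty(obj: &mut Obj, key: &str, value: &str,
                     inner_indent: &str, outer_indent: &str) {
    let sep = format!("\n{inner_indent}");
    match obj.kvps.last_mut() {
        // separator goes after the prior kvp's trailing comma...
        Some(prev) => prev.wsc[3] = sep,
        // ...or, for the first kvp in a fresh object, after its `{`
        None => obj.wsc0 = sep,
    }
    let mut kvp = Kvp { key: key.into(), value: value.into(), ..Default::default() };
    // newline + outer indent so the parent's closing `}` lands indented
    kvp.wsc[3] = format!("\n{outer_indent}");
    obj.kvps.push(kvp);
}

fn serialize(obj: &Obj) -> String {
    let mut out = String::from("{");
    out.push_str(&obj.wsc0);
    for kvp in &obj.kvps {
        out.push_str(&kvp.key);
        out.push_str(&kvp.wsc[0]); // empty in the fixed output
        out.push_str(": ");
        out.push_str(&kvp.value);
        out.push(','); // trailing comma stays, JSON5-style
        out.push_str(&kvp.wsc[3]);
    }
    out.push('}');
    out
}

fn main() {
    let mut learn = Obj::default();
    append_kvp_pretty(&mut learn, "generate_alternates", "true", "    ", "  ");
    assert_eq!(serialize(&learn), "{\n    generate_alternates: true,\n  }");
}
```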
New tests: new_section_exact_multiline_layout asserts byte-exact
output shape; new_section_and_key_format_cleanly verifies no key wraps
to the next line. Prior tests just substring-matched and happily passed
on the broken output — that's why this shipped in the first place.
Also: dropped the json5 crate dependency. json-five's serde feature
(default) provides the same from_str / to_string API. One fewer
dependency, and the two were doing the same job.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Runtime-mutable settings (F6's threshold knob, the generate-alternates
toggle, anything else that comes along) were ending up as mirrored
fields on MindState — each new config setting grew MindState::new's
signature and added a clone+sync path. Wrong home. MindState is
ephemeral session state, not a config projection.
Give AppConfig the same treatment the memory Config has: install it
into a global RwLock<AppConfig> at startup via load_app, read through
config::app() (returns a read guard), mutate through update_app. The
config_writer functions now write to disk AND update the cache
atomically, so the one-stop-shop call keeps both in sync.
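The global has roughly this shape, sketched with a hypothetical one-field AppConfig (the real accessors live in the config module, and the real update path also writes the file to disk):

```rust
use std::sync::{OnceLock, RwLock, RwLockReadGuard};

// Hypothetical minimal AppConfig; the real struct has more sections.
#[derive(Clone, Debug, Default)]
struct AppConfig { learn_threshold: f64 }

static APP: OnceLock<RwLock<AppConfig>> = OnceLock::new();

// Install the config once at startup.
fn load_app(cfg: AppConfig) {
    APP.set(RwLock::new(cfg)).expect("load_app called twice");
}

// Read through a guard; reads are short, so no Arc snapshot needed.
fn app() -> RwLockReadGuard<'static, AppConfig> {
    APP.get().expect("config not loaded").read().unwrap()
}

// Mutate in place; config_writer would also persist to disk here.
fn update_app(f: impl FnOnce(&mut AppConfig)) {
    f(&mut APP.get().expect("config not loaded").write().unwrap());
}

fn main() {
    load_app(AppConfig { learn_threshold: 1.0 });
    update_app(|c| c.learn_threshold = 0.1);
    assert_eq!(app().learn_threshold, 0.1);
}
```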
Also while in here:
- learn.generate_alternates moves from a sentinel file
(~/.consciousness/cache/finetune-alternates, "exists = enabled")
into the config under the learn section. On first run with this
build, if the sentinel file still exists Mind::new flips the
config value to true and removes it. Drops
alternates_enabled()/set_alternates().
- Default threshold 0.0000001 → 1.0. With the timestamp filter
removed the previous value was letting essentially everything
through; 1.0 is a sane "nothing gets through unless you actually
want it" default.
- score_finetune_candidates takes generate_alternates as a parameter
instead of reading a global — caller snapshots the config values
once at the top of start_finetune_scoring so the async task
doesn't need to hold the config read lock across awaits.
- MindState.learn_threshold / learn_generate_alternates gone; the
SetLearn* command handlers now just delegate to config_writer.
Kent noted RwLock<Arc<AppConfig>> (the pattern used by the memory
Config global) is pointless here — nobody needs a snapshot that
outlives the read lock, and reads are short — so this uses a plain
RwLock<AppConfig> and returns a read guard.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
With the timestamp filter gone (previous commit), score_finetune_candidates
started returning the actual ~100+ candidates per scoring run. The
existing code generated alternates for all of them in a tight loop
before returning anything, leaving the status line stuck on
"finetune: scoring N responses..." for hundreds of seconds while the
B200 was pegged.
Two fixes:
1. score_finetune_candidates now takes an ActivityGuard and a callback.
Candidates are emitted one-at-a-time as they complete (after their
alternate if that's enabled, immediately otherwise). The activity
status updates to "finetune: generating alternate N/M" during the
alternate-gen phase so it's clear what's happening.
2. BgEvent::FinetuneCandidates(Vec<_>) → FinetuneCandidate(one). Each
emitted candidate is pushed onto shared.finetune_candidates; the UI
tick picks it up and renders it on the next frame. start_finetune_scoring
clears the previous run's list at the top so each run is fresh.
Return type changes from (Vec, f64) → (usize, f64) — the count above
threshold is all the caller still needs since the candidates stream
through the callback.
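The emit-as-you-go shape, sketched with hypothetical types (the real version also threads the ActivityGuard and runs the async alternate-generation step before each emit when that's enabled):

```rust
// Hypothetical stand-in for the real candidate struct.
struct Candidate { divergence: f64 }

// Instead of collecting a Vec, hand each candidate to the caller as
// soon as it's ready; return only (count above threshold, max seen).
fn score_candidates(divergences: &[f64], threshold: f64,
                    mut emit: impl FnMut(Candidate)) -> (usize, f64) {
    let mut above = 0usize;
    let mut max_div: f64 = 0.0;
    for &d in divergences {
        max_div = max_div.max(d);
        if d >= threshold {
            above += 1;
            emit(Candidate { divergence: d }); // UI picks this up next tick
        }
    }
    (above, max_div) // the count + max is all the caller still needs
}

fn main() {
    let mut seen = Vec::new();
    let (n, max) = score_candidates(&[0.5, 2.0, 3.0], 1.0,
                                    |c| seen.push(c.divergence));
    assert_eq!(n, 2);
    assert_eq!(max, 3.0);
    assert_eq!(seen, vec![2.0, 3.0]);
}
```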
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The F6 title line was starting to read like a control panel —
`legend ───── learn [thresh: 1e-7] [gen]` — which crowded the legend
and the label, and didn't leave room for more settings as the screen
grew. Move threshold and gen status to their own line inside the
border, right above the content area. Drop the duplicated `=gen[on]`
marker from the bottom help line since the settings row already shows
gen state.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Previously NodeLeaf.timestamp and AstNode::Branch.timestamp accepted
null or missing via a deserialize_timestamp_or_epoch fallback — legacy
entries in conversation.jsonl from before Branch timestamps existed
(and from before chrono serialization was wired up) would load with
UNIX_EPOCH as a sentinel. Downstream, node_timestamp_ns() returned
Option<i64> and callers had to handle None as "old entry, skip."
That second filter was silently dropping every candidate in
score_finetune_candidates when scoring an older session — the F6
screen showed "0 above threshold" even when max_divergence was
orders of magnitude above the threshold, because every entry was
failing the None check, not the divergence check.
The fix, in three parts:
1. src/bin/fix-timestamps.rs — one-off migration tool that walks a
conversation.jsonl, linearly interpolates timestamps for entries
stuck at UNIX_EPOCH (using surrounding real timestamps as anchors),
propagates to child leaves with per-sibling ns offsets, and bumps
any collisions by 1 ns for uniqueness. Ran against the current
session's log: 11887 entries, 72289 ns bumps, all unique.
2. context.rs — drop default_timestamp and
deserialize_timestamp_or_epoch. NodeLeaf and Branch now require a
present non-null timestamp on deserialize. Tests flip from
"missing/null → UNIX_EPOCH" to "missing/null → Err."
3. subconscious/learn.rs — node_timestamp_ns now returns i64, not
Option<i64>. The matching caller in score_finetune_candidates
collapses from a Some/None match to a single trained-set check.
mind/log.rs's oldest_timestamp no longer filters UNIX_EPOCH.
Every line currently on disk has already been migrated. Going
forward, new AstNodes always carry real timestamps (Utc::now() at
construction time), so the strict schema is the invariant, not an
aspiration.
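The interpolation in part 1, sketched over a bare timestamp array (a hypothetical simplification — the real tool walks jsonl entries and additionally applies the per-sibling ns offsets and collision bumps):

```rust
// Entries stuck at epoch (represented as 0 here) get timestamps
// linearly interpolated between the surrounding real anchors.
fn interpolate(ts: &mut [i64]) {
    let mut i = 0;
    while i < ts.len() {
        if ts[i] != 0 { i += 1; continue; }
        // find the run of epoch entries [i, j)
        let j = (i..ts.len()).find(|&k| ts[k] != 0).unwrap_or(ts.len());
        let lo = if i > 0 { ts[i - 1] } else { ts[j] }; // assumes some real anchor exists
        let hi = if j < ts.len() { ts[j] } else { lo };
        let n = (j - i + 1) as i64;
        for (off, k) in (i..j).enumerate() {
            // evenly spaced between the anchors
            ts[k] = lo + (hi - lo) * (off as i64 + 1) / n;
        }
        i = j;
    }
}

fn main() {
    let mut ts = vec![100, 0, 0, 0, 200];
    interpolate(&mut ts);
    assert_eq!(ts, vec![100, 125, 150, 175, 200]);
    // any remaining collisions would then be bumped by 1 ns for uniqueness
}
```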
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
vllm's /v1/score endpoint made score_ranges a required field (the
messages-mode fallback that used to pattern-scan for assistant
boundaries is gone). Always send the field, and if we have nothing to
score, skip the HTTP round-trip entirely instead of letting the server
422 us.
Response parsing is unchanged — serde ignores the renamed range_index
field and the dropped role field since we only extract total_logprob.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Three changes that together reshape the F6 fine-tune-review screen:
1. Finetune scoring reports through the standard agent activity system
instead of a separate finetune_progress String. The previous design
ran an independent progress field that forced a cross-lock dance and
bespoke UI plumbing. start_finetune_scoring now uses start_activity
+ activity.update, so the usual status line and notifications
capture scoring progress uniformly with other background work.
2. MindState gains a FinetuneScoringStats snapshot (responses seen,
above threshold, max divergence, error). The F6 empty screen shows
this instead of a loading message — so after a scoring run that
produced zero candidates, you can see *why* (e.g., max_divergence
below threshold).
3. The divergence threshold is configurable from F6 via +/- hotkeys
(scales by 10×) and persisted to ~/.consciousness/config.json5 via
config_writer::set_learn_threshold. AppConfig grows a learn section
with a threshold field (default 1e-7).
Also: user/mod.rs no longer uses try_lock() for the per-tick
unconscious/mind state sync — we fixed the locking hot paths that
made try_lock necessary, so lock().await is now the right choice.
And subconscious::learn::score_finetune_candidates now returns
(candidates, max_divergence) so the stats can be populated.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Surgical edits to ~/.consciousness/config.json5 that preserve comments,
whitespace, trailing commas, and unquoted identifier keys on round-trip.
Uses json-five's rt::parser module — a real JSON5 parser with AST
mutation + faithful serialization back. set_scalar(section, key, literal)
locates or creates the target, replaces the value; set_learn_threshold
is a convenience for the common F-screen use case.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Two related changes to the learn subsystem:
1. AST node timestamps are now non-optional — both Leaf and Branch
variants carry a DateTime<Utc>. UNIX_EPOCH means "unset" (old entries
deserialized from on-disk conversation logs).
Training uses timestamps as unique keys for dedup, so we promote to
nanosecond precision: node_timestamp_ns(), TrainData.timestamp_ns,
FinetuneCandidate.timestamp_ns, mark_trained(ns).
2. build_token_ids() now also returns token-position ranges of assistant
messages. These are passed to vLLM's /score endpoint via the new
score_ranges field so only scored-position logprobs are returned —
cuts bandwidth/compute when scoring small windows.
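The range collection, sketched with a hypothetical pre-tokenized message type (the real build_token_ids works from actual tokenizer output, but the bookkeeping is the same):

```rust
// Hypothetical stand-in: each message already carries its token ids.
struct Msg { role: &'static str, tokens: Vec<u32> }

// Flatten all messages into one id list, recording the [start, end)
// token-position range of every assistant message for score_ranges.
fn build_token_ids(msgs: &[Msg]) -> (Vec<u32>, Vec<(usize, usize)>) {
    let mut ids = Vec::new();
    let mut ranges = Vec::new();
    for m in msgs {
        let start = ids.len();
        ids.extend_from_slice(&m.tokens);
        if m.role == "assistant" {
            ranges.push((start, ids.len())); // positions to score
        }
    }
    (ids, ranges)
}

fn main() {
    let msgs = [
        Msg { role: "user", tokens: vec![1, 2] },
        Msg { role: "assistant", tokens: vec![3, 4, 5] },
    ];
    let (ids, ranges) = build_token_ids(&msgs);
    assert_eq!(ids, vec![1, 2, 3, 4, 5]);
    assert_eq!(ranges, vec![(2, 5)]); // only assistant positions get scored
}
```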
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
When 's' is pressed on the learn screen, approved candidates are now
sent to the inference server's /train endpoint.
Samples are marked as sent immediately in the UI, and mark_trained()
is called after successful API response to prevent re-scoring.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Wire up divergence scoring to identify responses that depend heavily on
memories the model hasn't internalized. These are candidates for fine-tuning.
- Score finetune candidates automatically after each turn
- Track trained responses by timestamp to prevent overtraining
- F6 screen shows candidates with divergence scores
- j/k nav, a=approve, r=reject, g=toggle alternate gen, s=send
- Additive sync preserves approval status across ticks
- Keeps 10 most recent rejected, removes sent
The 's' key currently just marks as trained locally — actual /finetune
endpoint call to follow.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- Add training_worker.py: long-lived subprocess that handles GPU training
work, owns HF model wrapper (views into vLLM GPU memory), Apollo
optimizer, and checkpoint sync
- train_router.py: now forwards /train requests via async ZMQ instead of
running training in-process. Adds /checkpoint and /train/status endpoints
- export_hook.py: store model_path in __metadata__ so training worker can
find it without cross-process communication
- This fixes two bugs:
1. Process boundary issue - model_path was set in worker process but
needed in API server process
2. Blocking event loop - training blocked vLLM's async event loop
Architecture: vLLM API server <-> ZMQ <-> training subprocess
The subprocess loads IPC handles once, creates views into vLLM's GPU
memory, and handles training requests without blocking inference.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- DEFAULT_RANK = 64 in train_router.py
- All references use the constant, not magic numbers
- ~2.5GB optimizer state instead of ~10GB
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Optimizer state (momentum, variance estimates) now persists between
training sessions:
- Saved to /tmp/apollo_optimizer_state.pt during checkpoint sync
- Restored on next /train call if available
- Preserves training continuity for incremental learning
Previously each /train call started with fresh optimizer state,
losing accumulated gradient history.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Remove standalone worker.py daemon. Training now runs inside vLLM:
- train_router.py: FastAPI router patched into vLLM's build_app()
- /train served on same port as /completions, /score
- Lazy-loads HF model with vLLM weight views on first request
- HOGWILD training: no pause, weights updated in-place
The previous architecture had a separate daemon on port 8080 that
communicated with vLLM via pause/resume endpoints. This was wrong -
training should run in-process, sharing GPU memory directly.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- Convert to installable package with entry points for vLLM auto-discovery
- Add checkpoint_sync.py: Python replacement for Rust checkpoint binary
- Block-level diffing of safetensors files (4KB blocks)
- vLLM→HF weight name conversion built-in
- Scheduled 10min after training jobs (batched)
- API change: /train now takes raw token IDs (context_ids + continuation_ids)
- No tokenizer on training side, client owns tokenization
- Remove superseded code: standalone scripts, Rust binary, tokenizer helpers
Install: pip install -e ./training
Then vLLM auto-loads via entry point.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The function was reading from dream-log.jsonl which only updates
when dreams complete. If a dream session was started but not yet
ended, it would show stale hours. Now checks for active dream
state first.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The hours_since_last_dream() function existed but wasn't called
after refactoring moved the DMN prompts from hooks to Rust.
Now shows "You haven't dreamed in X hours" when >= 18h since
last dream session.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Move score display from name (via label()) to status column for cleaner
layout. Score now appears right of tokens for all memory nodes.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Identity memory nodes now participate in importance scoring alongside
conversation memories. Score loading/saving handles both sections, and
the conscious screen uses node.label() consistently for memory display.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- KEY_TO_UUID now stores weight (30 bytes: uuid+type+ts+deleted+weight)
- UUID_OFFSETS changed to composite key for O(log n) max-offset lookup
- Add NODES_BY_TYPE index for efficient type+date range queries
- Add for_each_key_weight() to StoreView for index-only iteration
- match_seeds uses index-only path when content not needed
- Fix transaction consistency in ops (single txn for related updates)
- rebuild() now records all uuid→offset mappings for version history
- Backwards compatible: old index formats decoded with default weight
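The value packing in miniature, with assumed field widths (f32 weight is an assumption; the stated 30 bytes = 16 uuid + 1 type + 8 timestamp + 1 deleted + 4 weight):

```rust
// Hypothetical layout of the KEY_TO_UUID value:
// [uuid:16][node_type:1][timestamp:8][deleted:1][weight:4]
fn encode(uuid: [u8; 16], node_type: u8, ts: u64, deleted: bool, weight: f32) -> [u8; 30] {
    let mut v = [0u8; 30];
    v[..16].copy_from_slice(&uuid);
    v[16] = node_type;
    v[17..25].copy_from_slice(&ts.to_be_bytes());
    v[25] = deleted as u8;
    v[26..30].copy_from_slice(&weight.to_be_bytes());
    v
}

// Old-format values (no weight field) decode with a default weight,
// which is what keeps the index backwards compatible.
fn decode_weight(v: &[u8]) -> f32 {
    if v.len() >= 30 {
        let mut w = [0u8; 4];
        w.copy_from_slice(&v[26..30]);
        f32::from_be_bytes(w)
    } else {
        1.0 // assumed default for old-format entries
    }
}

fn main() {
    let v = encode([7u8; 16], 2, 42, false, 0.5);
    assert_eq!(v.len(), 30);
    assert_eq!(decode_weight(&v), 0.5);
    assert_eq!(decode_weight(&v[..26]), 1.0); // old format falls back
}
```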
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Deleted the directory-walking CLAUDE.md/POC.md loader. Identity now
comes entirely from personality_nodes in the memory graph.
Simplified:
- assemble_context_message() takes just personality_nodes
- Removed config_file_count/memory_file_count tracking
- reload_for_model() → reload_context() (no longer model-specific)
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Links clutter context windows. Use memory_links() to see links.
Pass raw=false explicitly if you want the footer.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
memory_delete and memory_restore are now in memory_tools() (available
via MCP for CLI). Agent tool lists support "-tool_name" to exclude.
Agents automatically exclude memory_delete and memory_restore.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Replace complex context_groups (with ContextGroup struct, ContextSource
enum, labels, keys arrays) with simple string lists:
- personality_nodes: loaded into main session context
- agent_nodes: loaded into subconscious agent context
Removed ~200 lines of code. The distinction between session and agent
context is now just which list you're in, not a per-group flag.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
Identity files migrated to memory nodes:
- identity, core-personality, reflections, where-am-i
Removed:
- ContextSource::File enum variant
- File source parsing and handling
- load_memory_file helper function
Config now only supports Store and Journal sources.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The strip_md_suffix function was removed but its usages remained,
causing lookups like `identity.md` to fail (stripped to `identity`
which didn't exist). Now keys are used as-is.
Renamed 4 nodes that had .md suffixes to canonical form:
- identity.md → identity
- promotion-work-queue.md-* → promotion-work-queue-*
- patterns.md#* → patterns-*
- practices.md#* → practices-*
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Raw terminal mode swallows stderr output, making debugging difficult.
Now redirects stderr through a pipe to:
1. Log file at ~/.consciousness/logs/tui-stderr.log (persistent)
2. Channel polled by UI thread (shown as notifications)
The reader thread ensures both destinations see every line. Original
stderr is restored on exit so post-session errors reach the terminal.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- save_agent_log: assert name is not empty (panic to find the bug)
- AutoAgent::new: assert name is not empty
- dbglog: write to daemon/ subdir instead of toplevel logs/
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- memory_delete no longer exposed to agents - use supersede instead
- memory_supersede now transfers all edges from old node to new node
(keeps whichever strength is higher if new node already has the link)
This preserves graph structure during consolidation.
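The transfer rule, sketched over a hypothetical flat edge map keyed by (source, target) uuid pair (the real edges live in the relation index, not a HashMap):

```rust
use std::collections::HashMap;

// Move every edge touching `old` onto `new`; when `new` already has
// the link, keep whichever strength is higher.
fn transfer_edges(edges: &mut HashMap<(u64, u64), f32>, old: u64, new: u64) {
    let old_edges: Vec<((u64, u64), f32)> = edges
        .iter()
        .filter(|((a, b), _)| *a == old || *b == old)
        .map(|(k, v)| (*k, *v))
        .collect();
    for ((a, b), strength) in old_edges {
        edges.remove(&(a, b));
        let na = if a == old { new } else { a };
        let nb = if b == old { new } else { b };
        if na == nb { continue; } // drop would-be self-links
        let e = edges.entry((na, nb)).or_insert(0.0);
        *e = e.max(strength); // keep the stronger link
    }
}

fn main() {
    let mut edges = HashMap::new();
    edges.insert((1, 2), 0.3); // old node 1 → node 2
    edges.insert((3, 2), 0.9); // new node 3 already links node 2
    transfer_edges(&mut edges, 1, 3);
    assert_eq!(edges.len(), 1);
    assert_eq!(edges[&(3, 2)], 0.9); // stronger existing link wins
}
```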
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Restores a deleted node to its last non-deleted content with proper
version continuity (version number continues from absolute latest,
content from last live version).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- Add fsck_full(): compares current index with rebuilt, reports zombies/missing
- Add repair_index(): rebuilds index from capnp log
- Index rebuild now uses timestamp (not version) for "latest" detection
Fixes tombstones shadowing restored nodes when version numbers reset
- Add read_node_at_offset_for_key() to handle batch writes correctly
When multiple nodes share an offset, filter by key to get the right one
- Add find_latest_by_key() and find_last_live_version() for restore support
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Remove POC_PROVENANCE env var lookup from new_relation - callers
now pass provenance explicitly. This fixes tracking when the env
var wasn't set correctly.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Two independent toggles on the thalamus screen:
- 't' toggles native Qwen <think> tags (adds <think>\n to generation prompt)
- 'T' toggles think tool (Anthropic-style structured reasoning tool)
Both can be enabled simultaneously. Native thinking is on by default.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- Store [negated_timestamp:8][key] as value for descending sort
- recent_by_provenance uses index directly, no capnp reads
- Eliminates 24k×5 capnp reads from subconscious snapshots
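The key layout in miniature — bitwise-NOT of the timestamp makes a plain ascending index scan yield newest-first order:

```rust
// [negated_timestamp:8][key]: big-endian bytes of !timestamp sort
// in the opposite order of the timestamp itself.
fn provenance_key(timestamp: u64, key: &[u8]) -> Vec<u8> {
    let mut k = Vec::with_capacity(8 + key.len());
    k.extend_from_slice(&(!timestamp).to_be_bytes());
    k.extend_from_slice(key);
    k
}

fn main() {
    let newer = provenance_key(200, b"a");
    let older = provenance_key(100, b"a");
    assert!(newer < older); // newer entries sort first in ascending scans
}
```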
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- Add all_keys() to StoreView, use in build_adjacency instead of
for_each_node (which was ignoring content/weight anyway)
- Add all_key_uuid_pairs() for single-pass uuid mapping
- Extend KEY_TO_UUID to store [uuid:16][node_type:1][timestamp:8]
- for_each_node_meta now reads from index, no capnp needed
- Add NodeType::from_u8() for unpacking
Graph health: 7s → 2s (3.5x faster)
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Store now has internal Mutex for capnp appends and AtomicU64 for
size tracking. All methods take &self. The external Arc<Mutex<Store>>
is replaced with Arc<Store>.
- Store::append_lock protects file appends
- local.rs functions take &Store (not &mut Store)
- access_local() returns Arc<Store>
- All .lock().await calls removed from callers
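The shape of the change in miniature, with a Vec standing in for the capnp log file (hypothetical simplification):

```rust
use std::sync::{Arc, Mutex, atomic::{AtomicU64, Ordering}};

// Interior locking means every method takes &self, so callers share a
// plain Arc<Store> with no outer Mutex and no .lock().await.
struct Store {
    append_lock: Mutex<Vec<u8>>, // stands in for the capnp log file
    size: AtomicU64,
}

impl Store {
    fn new() -> Arc<Self> {
        Arc::new(Store { append_lock: Mutex::new(Vec::new()), size: AtomicU64::new(0) })
    }

    // &self, not &mut self: the Mutex serializes appends internally.
    fn append(&self, bytes: &[u8]) {
        let mut log = self.append_lock.lock().unwrap();
        log.extend_from_slice(bytes);
        self.size.fetch_add(bytes.len() as u64, Ordering::Relaxed);
    }

    // Size reads never touch the append lock.
    fn size(&self) -> u64 { self.size.load(Ordering::Relaxed) }
}

fn main() {
    let store = Store::new();
    let s2 = Arc::clone(&store);
    std::thread::spawn(move || s2.append(b"abc")).join().unwrap();
    store.append(b"de");
    assert_eq!(store.size(), 5);
}
```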
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Index functions now take &WriteTransaction instead of &Database,
allowing callers to batch multiple index operations in a single
transaction. Store mutations (upsert, delete, rename, etc.) now
begin_write/commit their own transactions, ensuring atomicity.
- replay_relations uses single txn for all relation indexing
- Store::db() exposes Database for callers needing txn control
- Convenience wrappers open their own txn for simple cases
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
The relations Vec is gone from Store. dedup now iterates via
edges_for_uuid() instead of mutating in-memory Vec — removes/re-adds
edges through the index directly.
Removed load_relations_vec() and clear_relations() — no longer needed.
Added helper methods: edges_for_uuid, index_relation, remove_relation_from_index.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- Add index::clear_relations() to drop and recreate RELS table
- Add Store::reindex_relations() to rebuild index from Vec
- Call reindex_relations() at end of dedup command
This ensures index stays in sync with Vec after complex mutations
like UUID redirection in dedup. Vec mutations remain for now but
index is correctly updated afterward.
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
- fsck: use for_each_relation for dangling edge detection
(pruning deferred - needs delete_edge operation)
- dedup: use for_each_relation for edge counting
Remaining Vec uses in dedup mutation section need new index ops:
- redirect_edge: change source/target UUID
- delete_edge_by_uuid: tombstone by UUID
Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>