consciousness/src/mind/log.rs

use anyhow::{Context, Result};
use std::fs::{File, OpenOptions};
use std::io::Write;
use std::path::{Path, PathBuf};
use crate::agent::context::AstNode;
use crate::hippocampus::transcript::JsonlBackwardIter;
use memmap2::Mmap;

/// Append-only JSONL log of conversation nodes, one `AstNode` per line.
pub struct ConversationLog {
    path: PathBuf,
}

impl ConversationLog {
    /// Create a log handle, making sure the parent directory exists.
    pub fn new(path: PathBuf) -> Result<Self> {
        if let Some(parent) = path.parent() {
            std::fs::create_dir_all(parent)
                .with_context(|| format!("creating log dir {}", parent.display()))?;
        }
        Ok(Self { path })
    }

    /// Serialize `node` as a single JSON line, append it, and sync to disk
    /// before returning.
    pub fn append_node(&self, node: &AstNode) -> Result<()> {
        let mut file = OpenOptions::new()
            .create(true)
            .append(true)
            .open(&self.path)
            .with_context(|| format!("opening log {}", self.path.display()))?;

        let line = serde_json::to_string(node)
            .context("serializing node for log")?;
        writeln!(file, "{}", line)
            .context("writing to conversation log")?;
        file.sync_all()
            .context("syncing conversation log")?;
        Ok(())
    }

    /// Read nodes from the tail of the log, newest first.
    /// Caller decides when to stop (budget, count, etc).
    pub fn read_tail(&self) -> Result<TailNodes> {
        if !self.path.exists() {
            anyhow::bail!("log does not exist");
        }
        let file = File::open(&self.path)
            .with_context(|| format!("opening log {}", self.path.display()))?;
        if file.metadata()?.len() == 0 {
            anyhow::bail!("log is empty");
        }
        let mmap = unsafe { Mmap::map(&file)? };
        Ok(TailNodes { _file: file, mmap })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }

    /// Timestamp of the first parseable leaf node in the log, if any.
    pub fn oldest_timestamp(&self) -> Option<chrono::DateTime<chrono::Utc>> {
        let file = File::open(&self.path).ok()?;
        let mmap = unsafe { Mmap::map(&file).ok()? };
        for line in mmap.split(|&b| b == b'\n') {
            if line.is_empty() { continue; }
            if let Ok(node) = serde_json::from_slice::<AstNode>(line) {
                if let Some(leaf) = node.leaf() {
                    // The timestamp schema is strict: every AstNode carries a
                    // real timestamp, so UNIX_EPOCH is no longer filtered here.
                    return Some(leaf.timestamp());
                }
            }
        }
        None
    }
}

/// Iterates over conversation log nodes newest-first, using mmap + backward scan.
pub struct TailNodes {
    _file: File,
    mmap: Mmap,
}

impl TailNodes {
    pub fn iter(&self) -> impl Iterator<Item = AstNode> + '_ {
        JsonlBackwardIter::new(&self.mmap)
            .filter_map(|bytes| serde_json::from_slice::<AstNode>(bytes).ok())
    }
}
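
// Illustrative sketch only: this is NOT the real `JsonlBackwardIter`, whose
// implementation lives in `hippocampus::transcript` and is not shown here.
// It shows one simple way a newest-first scan over a newline-delimited buffer
// could work, assuming records are '\n'-terminated as `append_node` writes
// them. `lines_newest_first` is a hypothetical helper name.
#[allow(dead_code)]
fn lines_newest_first<'a>(buf: &'a [u8]) -> impl Iterator<Item = &'a [u8]> + 'a {
    // `rsplit` walks the buffer from the end, so the last record comes first.
    // The trailing newline produces one empty slice, which is filtered out
    // along with any blank lines.
    buf.rsplit(|&b| b == b'\n').filter(|line| !line.is_empty())
}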