Two related changes to the learn subsystem:
1. AST node timestamps are now non-optional — both Leaf and Branch
variants carry a DateTime<Utc>. UNIX_EPOCH means "unset" (old entries
deserialized from on-disk conversation logs).
Training uses timestamps as unique keys for dedup, so we promote to
nanosecond precision: node_timestamp_ns(), TrainData.timestamp_ns,
FinetuneCandidate.timestamp_ns, mark_trained(ns).
2. build_token_ids() now also returns token-position ranges of assistant
messages. These are passed to vLLM's /score endpoint via the new
score_ranges field so only scored-position logprobs are returned —
cuts bandwidth/compute when scoring small windows.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Wire up divergence scoring to identify responses that depend heavily on
memories the model hasn't internalized. These are candidates for fine-tuning.
- Score finetune candidates automatically after each turn
- Track trained responses by timestamp to prevent overtraining
- F6 screen shows candidates with divergence scores
- j/k nav, a=approve, r=reject, g=toggle alternate gen, s=send
- Additive sync preserves approval status across ticks
- Keeps 10 most recent rejected, removes sent
The 's' key currently just marks as trained locally — actual /finetune
endpoint call to follow.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>