Two related changes to the learn subsystem:
1. AST node timestamps are now non-optional: both the Leaf and Branch
variants carry a DateTime<Utc>. UNIX_EPOCH serves as the "unset" sentinel
for old entries deserialized from on-disk conversation logs.
Training uses timestamps as unique dedup keys, so we promote them to
nanosecond precision: node_timestamp_ns(), TrainData.timestamp_ns,
FinetuneCandidate.timestamp_ns, mark_trained(ns).
2. build_token_ids() now also returns the token-position ranges of
assistant messages. These are passed to vLLM's /score endpoint via the
new score_ranges field so that only logprobs at the scored positions are
returned, cutting bandwidth and compute when scoring small windows.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Use JsonlBackwardIter (SIMD-accelerated memrchr3) to scan the
conversation log newest-first without reading or parsing the whole file.
Iteration stops as soon as the conversation budget is full, and only the
kept nodes are retokenized and pushed into context.
18 MB log → tokenize only the ~50 nodes that fit in the budget.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>