consciousness

Author	SHA1	Message	Date
Kent Overstreet	e8c3ed3d96	switch memory scoring to /v1/score endpoint. Replace prompt_logprobs-based scoring with the new vLLM /v1/score endpoint. Much simpler: one API call per memory drop, returns per-message total_logprob directly. No chunking needed, no OOM risk: the endpoint only computes logits for scored tokens. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-03 00:40:27 -04:00
Kent Overstreet	4f19c02e50	reuse HTTP client across scoring calls for connection pooling. Single reqwest::Client shared across all prompt_logprobs calls instead of creating a new one per call. Keeps HTTP connections alive for faster sequential requests. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-02 23:11:40 -04:00
Kent Overstreet	78abf90461	fix scoring: HTTP error checking, context refresh, chunk logging. Check HTTP status from logprobs API (was silently ignoring 500s). Call publish_context_state() after storing scores so F10 screen updates. Add chunk size logging for OOM debugging. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-02 22:47:44 -04:00
Kent Overstreet	29b3aeca57	chunk scoring calls to avoid OOM on large contexts. Split conversation into ~50K token chunks (configurable via scoring_chunk_tokens in config) for prompt_logprobs calls. Each chunk ends at an assistant message boundary. Avoids the ~40GB logprobs tensor allocation that OOM'd on full contexts. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-02 22:35:29 -04:00
Kent Overstreet	19205b9bae	show scoring progress and per-response memory attribution. Status bar shows "scoring 3/7..." during scoring. Debug pane logs per-memory importance and top-5 response breakdowns. F10 context screen shows which memories were important for each assistant response as drilldown children (← memory_key (score)). Added important_memories_for_entry() to look up the matrix by conversation entry index. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-02 22:27:43 -04:00
Kent Overstreet	c01d4a5b08	wire up /score command and debug screen for memory importance. /score snapshots the context and client, releases the agent lock, and runs scoring in the background. Only one score task at a time (scoring_in_flight flag). Results stored on Agent and shown on the F10 context debug screen with importance scores per memory. ApiClient derives Clone. ContextState derives Clone. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-02 22:21:31 -04:00
Kent Overstreet	df9b610c7f	add memory importance scoring via prompt logprobs. score_memories() drops each memory from the context one at a time, runs prompt_logprobs against the full conversation, and builds a divergence matrix: memories × responses. Row sums = memory importance (for graph weight updates). Column sums = response memory-dependence (training candidates). Uses vLLM's prompt_logprobs to check "would the model have said this without this memory?", with one forward pass per memory and all responses scored at once. ~3s per memory on B200. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-02 22:13:55 -04:00

7 commits
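
The divergence matrix described in df9b610c7f can be sketched roughly as follows. This is not the code from the repository: the function names are hypothetical, and it assumes each cell is the baseline log-probability of a response minus its log-probability with one memory dropped, which the commit message does not spell out.

```rust
/// Build a memories × responses divergence matrix.
///
/// `baseline[r]` is the total logprob of response `r` with all memories present.
/// `without_memory[m][r]` is the total logprob of response `r` with memory `m` dropped.
/// ASSUMPTION: divergence is the plain difference baseline - dropped.
pub fn divergence_matrix(baseline: &[f64], without_memory: &[Vec<f64>]) -> Vec<Vec<f64>> {
    without_memory
        .iter()
        .map(|row| {
            assert_eq!(row.len(), baseline.len(), "one score per response expected");
            baseline
                .iter()
                .zip(row)
                .map(|(full, dropped)| full - dropped)
                .collect()
        })
        .collect()
}

/// Row sums: how much each memory mattered across all responses.
pub fn memory_importance(matrix: &[Vec<f64>]) -> Vec<f64> {
    matrix.iter().map(|row| row.iter().sum::<f64>()).collect()
}

/// Column sums: how much each response depended on memories overall.
pub fn response_dependence(matrix: &[Vec<f64>]) -> Vec<f64> {
    let responses = matrix.first().map_or(0, |row| row.len());
    (0..responses)
        .map(|r| matrix.iter().map(|row| row[r]).sum::<f64>())
        .collect()
}
```

With a baseline of [-1.0, -2.0] and dropped-memory scores of [-3.0, -2.0] and [-1.5, -4.0], the first memory matters only for the first response, while the second response depends mostly on the second memory. A larger positive cell means the response became less likely once that memory was removed.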