Add lock_blocking() to TrackedMutex: blocks current thread using
block_in_place + futures::executor::block_on, safe for sync contexts.
Replace all try_lock() calls with lock_blocking() in slash commands,
UI rendering, and status reads. Lock hold times are fast enough that
blocking briefly is fine, and this eliminates the spurious 'lock
unavailable' paths that were never actually hit.
Kept rx_mutex.try_lock() in mod.rs (std::sync::Mutex for stderr rx).
Creates an Agent from global config (API credentials, system prompt,
identity), overrides tools with the agent's tool set, and runs through
the standard Backend → run_with_backend → Agent::turn() path.
This enables poc-hook spawned agents (surface-observe, journal, etc.)
to work with the completions API instead of the deleted chat API.
Also added Default derive to CliArgs for config loading.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Add agent/tokenizer.rs with global Qwen 3.5 tokenizer that generates
actual token IDs including chat template wrapping. ContextEntry now
stores token_ids: Vec<u32> instead of tokens: usize — the count is
derived from the length.
ContextEntry::new() tokenizes automatically via the global tokenizer.
ContextSection::push_entry() takes a raw ConversationEntry and
tokenizes it. set_message() re-tokenizes without needing an external
tokenizer parameter.
Token IDs include the full chat template: <|im_start|>role\ncontent
<|im_end|>\n — so concatenating token_ids across entries produces a
ready-to-send prompt for vLLM's /v1/completions endpoint.
The old tiktoken CoreBPE is now unused on Agent (will be removed in
a followup). Token counts are now exact for Qwen 3.5 instead of the
~85-90% approximation from cl100k_base.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Documents the root cause of the streaming display bug —
pop removes 1 line per entry but push produces N lines
(markdown, tool results). Includes concrete fix approach
using per-entry line count tracking.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
- subconscious.rs: use .get(fork_point..) instead of direct slice
to avoid panic when fork_point > entries.len()
- dmn.rs: batch all output injections (surface, reflection, thalamus)
under a single agent lock acquisition instead of three separate ones
- dmn.rs: use Store::cached() instead of Store::load() when rendering
surfaced memories
- Add scoring persistence analysis notes
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
No more subcrate nesting — src/, agents/, schema/, defaults/, build.rs
all live at the workspace root. poc-daemon remains as the only workspace
member. Crate name (poc-memory) and all imports unchanged.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>