pub → pub(crate) for SseReader methods (used across child modules).
pub → pub(super) for openai::stream_events, tool definitions, store
helpers. pub → private for normalize_link and differentiate_hub_with_graph
(only used within their own files).
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Serialize request JSON before send_and_check so it's available
for both HTTP errors and stream errors. Extracted save logic
into save_failed_request helper on SseReader.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Delete anthropic.rs (713 lines) — we only use OpenAI-compatible
endpoints (vLLM, OpenRouter). Simplify ApiClient to store base_url
directly instead of Backend enum.
SseReader now stores the serialized request payload and saves it
to ~/.consciousness/logs/failed-request-{ts}.json on stream timeout,
so failed requests can be replayed with curl for debugging.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Thread request priority through the API call chain to vLLM's
priority scheduler. Lower value = higher priority, with preemption.
Priority is set per-agent in the .agent header:
- interactive (runner): 0 (default, highest)
- surface-observe: 1 (near-realtime, watches conversation)
- all other agents: 10 (batch, default if not specified)
Requires vLLM started with --scheduling-policy priority.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Split the streaming pipeline: API backends yield StreamEvents through
a channel, the runner reads them and routes to the appropriate UI pane.
- Add StreamEvent enum (Content, Reasoning, ToolCallDelta, etc.)
- API start_stream() spawns backend as a task, returns event receiver
- Runner loops over events, sends content to conversation pane but
suppresses <tool_call> XML with a buffered tail for partial tags
- OpenAI backend refactored to stream_events() — no more UI coupling
- Anthropic backend gets a wrapper that synthesizes events from the
existing stream() (TODO: native event streaming)
- chat_completion_stream() kept for subconscious agents, reimplemented
on top of the event stream
- Usage derives Clone
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
Tool call parsing was only in runner.rs, so subconscious agents
(poc-memory agent run) never recovered leaked tool calls from
models that emit <tool_call> as content text (e.g. Qwen via Crane).
Move the recovery into build_response_message where both code paths
share it. Leaked tool calls are promoted to structured tool_calls
and the content is cleaned, so all consumers see them uniformly.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
No more subcrate nesting — src/, agents/, schema/, defaults/, build.rs
all live at the workspace root. poc-daemon remains as the only workspace
member. Crate name (poc-memory) and all imports unchanged.
Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-03-25 00:54:12 -04:00
Renamed from poc-memory/src/agent/api/mod.rs (Browse further)