consciousness/doc/logging.md
Kent Overstreet 49f72cdac3 Logging overhaul: per-task log files, daemon.log drill-down
Switch from jobkit-daemon crate to jobkit with daemon feature.
Wire up per-task log files for all daemon-spawned agent tasks.

Changes:
- Use jobkit::daemon:: instead of jobkit_daemon::
- All agent tasks get .log_dir() set to $data_dir/logs/
- Task log path shown in daemon status and TUI
- New CLI: poc-memory agent daemon log --task NAME
  Finds the task's log path from status or daemon.log, tails the file
- LLM backend selection logged to daemon.log via log_event
- Targeted agent job names include the target key for debuggability
- Logging architecture documented in doc/logging.md

Two-level logging, no duplication:
- daemon.log: lifecycle events with task log path for drill-down
- per-task logs: full agent output via ctx.log_line()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 11:17:07 -04:00


Logging Architecture

poc-memory has multiple logging channels serving different purposes. Understanding which log to check is essential for debugging.

Log files

daemon.log — structured event log

  • Path: $data_dir/daemon.log (default: ~/.claude/memory/daemon.log)
  • Format: JSONL — {"ts", "job", "event", "detail"}
  • Written by: jobkit::daemon::event_log::log(), wrapped by log_event() in daemon.rs
  • Rotation: when the file exceeds 1 MB, it is truncated to its last half
  • Contains: task lifecycle events (started, completed, failed, progress), session-watcher ticks, scheduler events
  • View: poc-memory agent daemon log [--job NAME] [--lines N]
  • Note: the "daemon log" command reads this file and formats the JSONL as human-readable lines with timestamps. The --job filter shows only entries for a specific job name.
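
The rotation rule above can be sketched as a small pure function. This is an illustration, not the code in jobkit: the name `rotate_tail`, the reading of "1MB" as 1024 * 1024 bytes, and the choice to align the cut to a newline (so the surviving half starts on a whole JSONL record) are all assumptions.

```rust
/// Size threshold from the rotation rule (assumed to mean 1024 * 1024 bytes).
const MAX_LOG_BYTES: usize = 1024 * 1024;

/// Returns the part of `contents` that would survive rotation.
///
/// Hypothetical helper: if the log is larger than `max_bytes`, keep roughly
/// the last half, moving the cut forward to the next line start so that no
/// partial JSONL record is left at the top of the file.
fn rotate_tail(contents: &[u8], max_bytes: usize) -> &[u8] {
    if contents.len() <= max_bytes {
        return contents; // under the limit: nothing is dropped
    }
    let mid = contents.len() / 2;
    // Start the search one byte before the midpoint, so a midpoint that
    // already sits on a line start is used as-is.
    let search_from = mid.saturating_sub(1);
    match contents[search_from..].iter().position(|&b| b == b'\n') {
        Some(offset) => &contents[search_from + offset + 1..],
        // No newline in the second half: no complete record to keep.
        None => &contents[contents.len()..],
    }
}
```

A caller would read daemon.log, pass the bytes with `MAX_LOG_BYTES`, and write the returned slice back. Working on a byte slice keeps the rule easy to test without touching the filesystem.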

daemon-status.json — live snapshot

  • Path: $data_dir/daemon-status.json
  • Format: pretty-printed JSON
  • Written by: write_status() in daemon.rs, called periodically
  • Contains: current task list with states (pending/running/completed), graph health metrics, consolidation plan, uptime
  • View: poc-memory agent daemon status

llm-logs/ — per-agent LLM call transcripts

  • Path: $data_dir/llm-logs/{agent_name}/{timestamp}.txt
  • Format: plaintext sections: === PROMPT ===, === CALLING LLM ===, === RESPONSE ===
  • Written by: run_one_agent_inner() in knowledge.rs
  • Contains: full prompt sent to the LLM and full response received. One file per agent invocation. Invaluable for debugging agent quality — shows exactly what the model saw and what it produced.
  • Volume: can be large; the distill agent alone had 292 files as of 2026-03-19.
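
When reviewing many transcripts, it can help to split each file into its sections programmatically. The sketch below assumes each `=== NAME ===` marker sits on its own line; the function name `split_sections` is hypothetical and is not part of poc-memory.

```rust
/// Splits a transcript into (section name, body) pairs, in file order.
///
/// Assumption: a section starts at a line of the form `=== NAME ===` and
/// runs until the next such line. Text before the first marker is skipped.
fn split_sections(transcript: &str) -> Vec<(String, String)> {
    let mut sections: Vec<(String, Vec<&str>)> = Vec::new();
    for line in transcript.lines() {
        let header = line
            .trim()
            .strip_prefix("=== ")
            .and_then(|rest| rest.strip_suffix(" ==="));
        match header {
            // A marker line opens a new, initially empty section.
            Some(name) => sections.push((name.to_string(), Vec::new())),
            // Any other line belongs to the most recently opened section.
            None => {
                if let Some((_, body)) = sections.last_mut() {
                    body.push(line);
                }
            }
        }
    }
    sections
        .into_iter()
        .map(|(name, body)| (name, body.join("\n").trim().to_string()))
        .collect()
}
```

With the sections separated, comparing the PROMPT and RESPONSE bodies across invocations of one agent becomes a simple loop over the files in its llm-logs directory.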

retrieval.log — memory search queries

  • Path: $data_dir/retrieval.log
  • Format: plaintext, one line per search: [date] q="..." hits=N
  • Contains: every memory search query and hit count. Useful for understanding what the memory-search hook is doing and whether queries are finding useful results.
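
To check whether queries are finding results, the line format above can be parsed directly. This is a minimal sketch: `RetrievalEntry`, `parse_retrieval_line`, and `zero_hit_queries` are hypothetical names, and the date is kept as an opaque string because its exact format is not specified here.

```rust
/// One parsed line of retrieval.log (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
struct RetrievalEntry {
    date: String,
    query: String,
    hits: u64,
}

/// Parses a line of the form `[date] q="..." hits=N`.
/// Returns `None` for lines that do not match that shape.
fn parse_retrieval_line(line: &str) -> Option<RetrievalEntry> {
    let rest = line.trim().strip_prefix('[')?;
    let (date, rest) = rest.split_once("] q=\"")?;
    // Split at the last `" hits=` so quotes inside the query are tolerated.
    let (query, hits) = rest.rsplit_once("\" hits=")?;
    Some(RetrievalEntry {
        date: date.to_string(),
        query: query.to_string(),
        hits: hits.trim().parse().ok()?,
    })
}

/// Collects the queries that returned no hits, skipping unparseable lines.
fn zero_hit_queries(log: &str) -> Vec<String> {
    log.lines()
        .filter_map(parse_retrieval_line)
        .filter(|entry| entry.hits == 0)
        .map(|entry| entry.query)
        .collect()
}
```

A long list of zero-hit queries suggests the memory-search hook is asking for things the graph does not contain, which is a different problem from an agent misusing good results.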

daily-check.log — graph health history

  • Path: $data_dir/daily-check.log
  • Format: plaintext, multi-line entries with metrics
  • Contains: graph topology metrics over time (σ, α, gini, cc, fit), appended by the daily health check. Currently only ~10 entries.

In-memory state (redundant with daemon.log)

ctx.log_line() — task output log

  • Stored in: jobkit task state (last 20 lines per task)
  • Also writes to: daemon.log via log_event() (as of 2026-03-19)
  • View: daemon-status.json → task → output_log, or just tail daemon.log
  • Design note: the in-memory buffer is redundant now that progress events go to daemon.log. The status viewer should eventually just tail daemon.log filtered by job name, eliminating the in-memory state.

ctx.set_progress() — current activity string

  • Stored in: jobkit task state
  • View: shown in status display next to the task name
  • Note: overwritten by each ctx.log_line() call.
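
The interaction between the two calls can be modeled in a few lines. This is a sketch of the behavior described above, not jobkit's actual types: `TaskState` and its `progress` field are hypothetical, while `output_log` and the 20-line limit come from the description.

```rust
use std::collections::VecDeque;

/// Number of output lines kept per task, per the description above.
const OUTPUT_LOG_CAPACITY: usize = 20;

/// Hypothetical stand-in for the per-task state held in memory.
#[derive(Debug, Default)]
struct TaskState {
    /// Current activity string shown next to the task name.
    progress: String,
    /// Most recent output lines, oldest first.
    output_log: VecDeque<String>,
}

impl TaskState {
    /// Replaces the current activity string.
    fn set_progress(&mut self, activity: &str) {
        self.progress = activity.to_string();
    }

    /// Appends a line, evicting the oldest one once the buffer is full,
    /// and overwrites the progress string with the same line.
    fn log_line(&mut self, line: &str) {
        if self.output_log.len() == OUTPUT_LOG_CAPACITY {
            self.output_log.pop_front();
        }
        self.output_log.push_back(line.to_string());
        self.progress = line.to_string();
    }
}
```

In this model a value passed to `set_progress` survives only until the next `log_line` call, which is why the status display tends to show the latest log line rather than a deliberately set activity string.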

What to check when

| Problem | Check |
|---|---|
| Task not starting | daemon-status.json (task states) |
| Task failing | daemon.log (failed events) |
| Agent producing bad output | llm-logs/{agent}/{timestamp}.txt |
| Agent not finding right nodes | retrieval.log (search queries) |
| Graph health declining | daily-check.log |
| Resource pool / parallelism | currently no log; needs to be added |
| Which LLM backend is being used | daemon.log (llm-backend event) |