consciousness/doc/logging.md
Kent Overstreet 49f72cdac3 Logging overhaul: per-task log files, daemon.log drill-down
Switch from jobkit-daemon crate to jobkit with daemon feature.
Wire up per-task log files for all daemon-spawned agent tasks.

Changes:
- Use jobkit::daemon:: instead of jobkit_daemon::
- All agent tasks get .log_dir() set to $data_dir/logs/
- Task log path shown in daemon status and TUI
- New CLI: poc-memory agent daemon log --task NAME
  Finds the task's log path from status or daemon.log, tails the file
- LLM backend selection logged to daemon.log via log_event
- Targeted agent job names include the target key for debuggability
- Logging architecture documented in doc/logging.md

Two-level logging, no duplication:
- daemon.log: lifecycle events with task log path for drill-down
- per-task logs: full agent output via ctx.log_line()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 11:17:07 -04:00


# Logging Architecture
poc-memory has multiple logging channels serving different purposes.
Understanding which log to check is essential for debugging.
## Log files
### daemon.log — structured event log
- **Path**: `$data_dir/daemon.log` (default: `~/.claude/memory/daemon.log`)
- **Format**: JSONL — `{"ts", "job", "event", "detail"}`
- **Written by**: `jobkit::daemon::event_log::log()`, wrapped by `log_event()` in daemon.rs
- **Rotation**: when the file exceeds 1 MB, it is truncated to its most recent half
- **Contains**: task lifecycle events (started, completed, failed, progress),
session-watcher ticks, scheduler events
- **View**: `poc-memory agent daemon log [--job NAME] [--lines N]`
- **Note**: the `daemon log` command reads this file and formats the JSONL
as human-readable lines with timestamps. The `--job` filter shows only
entries for a specific job name.
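
The rotation rule above can be sketched as follows. This is a minimal illustration, not the daemon's actual code: the function name `rotate_if_large`, the 10^6-byte threshold, and the choice to drop the partial line at the cut point are all assumptions.

```python
import os

# Assumption: "1MB" is read as 10**6 bytes; the real daemon may use 1 << 20.
MAX_BYTES = 1_000_000


def rotate_if_large(path, max_bytes=MAX_BYTES):
    """Keep roughly the last half of the file once it exceeds max_bytes.

    Returns True if the file was truncated, False otherwise.
    """
    size = os.path.getsize(path)
    if size <= max_bytes:
        return False

    # Read everything from the midpoint to the end of the file.
    with open(path, "rb") as f:
        f.seek(size // 2)
        tail = f.read()

    # Assumption: drop the partial line at the cut so that every
    # remaining line is still a complete JSON object.
    newline = tail.find(b"\n")
    if newline != -1:
        tail = tail[newline + 1:]

    # Write to a temporary file and swap it in, so a crash mid-write
    # does not leave a half-written log behind.
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(tail)
    os.replace(tmp, path)
    return True
```

Because the cut lands at an arbitrary byte offset, anything reading daemon.log should tolerate a malformed first line in case the writer does not realign to a line boundary.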
### daemon-status.json — live snapshot
- **Path**: `$data_dir/daemon-status.json`
- **Format**: pretty-printed JSON
- **Written by**: `write_status()` in daemon.rs, called periodically
- **Contains**: current task list with states (pending/running/completed),
graph health metrics, consolidation plan, uptime
- **View**: `poc-memory agent daemon status`
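
For quick scripting against the snapshot, something like the following may help. The JSON shape is an assumption: this document does not specify the schema, so the top-level `tasks` key and the per-task `name` and `state` fields are hypothetical and should be checked against a real daemon-status.json.

```python
import json
from collections import defaultdict


def load_status(path):
    """Load the pretty-printed status snapshot as a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def tasks_by_state(status):
    """Group task names by state.

    Assumed (hypothetical) shape:
        {"tasks": [{"name": "...", "state": "pending|running|completed"}, ...]}
    """
    grouped = defaultdict(list)
    for task in status.get("tasks", []):
        grouped[task.get("state", "unknown")].append(task.get("name", "?"))
    return dict(grouped)
```

Since the file is rewritten periodically, a reader can occasionally catch it mid-write; retrying on a JSON decode error is a reasonable precaution.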
### llm-logs/ — per-agent LLM call transcripts
- **Path**: `$data_dir/llm-logs/{agent_name}/{timestamp}.txt`
- **Format**: plaintext sections: `=== PROMPT ===`, `=== CALLING LLM ===`,
`=== RESPONSE ===`
- **Written by**: `run_one_agent_inner()` in knowledge.rs
- **Contains**: full prompt sent to the LLM and full response received.
One file per agent invocation. Invaluable for debugging agent quality —
shows exactly what the model saw and what it produced.
- **Volume**: can be large — 292 files for distill alone as of Mar 19.
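
With hundreds of transcripts, it can be convenient to split a file into its sections programmatically. A minimal sketch, assuming each `=== NAME ===` marker sits on its own line (the helper name `split_transcript` is made up for this example):

```python
import re

# Matches a section marker such as "=== PROMPT ===" on a line of its own.
MARKER = re.compile(r"^=== (.+?) ===[ \t]*$", re.MULTILINE)


def split_transcript(text):
    """Map each section name (e.g. "PROMPT") to the text after its marker."""
    sections = {}
    matches = list(MARKER.finditer(text))
    for i, m in enumerate(matches):
        # A section runs until the next marker, or to the end of the file.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[m.group(1)] = text[m.end():end].strip()
    return sections
```

One caveat: if a prompt or response itself contains a line of the form `=== SOMETHING ===`, this splitter will treat it as a section boundary.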
### retrieval.log — memory search queries
- **Path**: `$data_dir/retrieval.log`
- **Format**: plaintext, one line per search: `[date] q="..." hits=N`
- **Contains**: every memory search query and hit count. Useful for
understanding what the memory-search hook is doing and whether
queries are finding useful results.
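
To spot queries that return nothing, the one-line format can be parsed with a regular expression. This is a sketch under the assumption that lines look exactly like `[date] q="..." hits=N`; the date is kept as an opaque string because its format is not specified here, and the function names are illustrative.

```python
import re

# Greedy match on the query so embedded double quotes stay inside it.
LINE = re.compile(
    r'^\[(?P<date>[^\]]*)\]\s+q="(?P<query>.*)"\s+hits=(?P<hits>\d+)\s*$'
)


def parse_line(line):
    """Return (date, query, hits) or None if the line does not match."""
    m = LINE.match(line)
    if m is None:
        return None
    return m.group("date"), m.group("query"), int(m.group("hits"))


def zero_hit_queries(lines):
    """Collect the queries that found nothing."""
    out = []
    for line in lines:
        parsed = parse_line(line.rstrip("\n"))
        if parsed is not None and parsed[2] == 0:
            out.append(parsed[1])
    return out
```

A long run of zero-hit queries is a hint that the memory-search hook is generating queries the graph cannot answer.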
### daily-check.log — graph health history
- **Path**: `$data_dir/daily-check.log`
- **Format**: plaintext, multi-line entries with metrics
- **Contains**: graph topology metrics over time (σ, α, gini, cc, fit).
Only ~10 entries so far, appended by the daily health check.
## In-memory state (redundant with daemon.log)
### ctx.log_line() — task output log
- **Stored in**: jobkit task state (last 20 lines per task)
- **Also writes to**: daemon.log via `log_event()` (as of Mar 19)
- **View**: `daemon-status.json` → task → output_log, or just tail daemon.log
- **Design note**: the in-memory buffer is redundant now that progress
events go to daemon.log. The status viewer should eventually just
tail daemon.log filtered by job name, eliminating the in-memory state.
### ctx.set_progress() — current activity string
- **Stored in**: jobkit task state
- **View**: shown in status display next to the task name
- **Note**: overwritten by each `ctx.log_line()` call.
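
The behaviour described in this section can be modelled in a few lines. `TaskState` below is a hypothetical stand-in written for illustration, not jobkit's actual type; it only mirrors the two facts stated above: the output log keeps the last 20 lines, and each `log_line()` call overwrites the progress string.

```python
from collections import deque


class TaskState:
    """Hypothetical model of the per-task in-memory state."""

    def __init__(self, max_lines=20):
        # A bounded deque drops the oldest line once it is full.
        self.output_log = deque(maxlen=max_lines)
        self.progress = ""

    def set_progress(self, text):
        self.progress = text

    def log_line(self, line):
        self.output_log.append(line)
        # Each log line also replaces the current activity string.
        self.progress = line
```

This makes the limitation concrete: anything older than the last 20 lines exists only in the on-disk logs.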
## What to check when
| Problem | Check |
|----------------------------------|------------------------------------|
| Task not starting | daemon-status.json (task states) |
| Task failing | daemon.log (failed events) |
| Agent producing bad output | llm-logs/{agent}/{timestamp}.txt |
| Agent not finding right nodes | retrieval.log (search queries) |
| Graph health declining | daily-check.log |
| Resource pool / parallelism | **currently no log** — need to add |
| Which LLM backend is being used | daemon.log (llm-backend event) |
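
Several rows above come down to filtering daemon.log by job or event. When the CLI is not at hand, the JSONL can be read directly. A minimal sketch: the field names come from the format described earlier, while the helper names and the assumption that event strings appear literally as `failed` or `llm-backend` are illustrative.

```python
import json


def read_events(lines, job=None, event=None):
    """Yield parsed daemon.log entries, optionally filtered by job and event."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a line cut in half by log rotation
        if not isinstance(entry, dict):
            continue
        if job is not None and entry.get("job") != job:
            continue
        if event is not None and entry.get("event") != event:
            continue
        yield entry


def format_entry(entry):
    """Render one entry as a single human-readable line."""
    return "{ts} [{job}] {event}: {detail}".format(
        ts=entry.get("ts", "?"),
        job=entry.get("job", "?"),
        event=entry.get("event", "?"),
        detail=entry.get("detail", ""),
    )
```

For example, `for e in read_events(open(path), event="failed"): print(format_entry(e))` lists every failure, where `path` points at daemon.log.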