# Logging Architecture
poc-memory has multiple logging channels serving different purposes.
Understanding which log to check is essential for debugging.
## Log files
### daemon.log — structured event log
- **Path**: `$data_dir/daemon.log` (default: `~/.consciousness/memory/daemon.log`)
- **Format**: JSONL — `{"ts", "job", "event", "detail"}`
- **Written by**: `jobkit_daemon::event_log::log()`, wrapped by `log_event()` in daemon.rs
- **Rotation**: truncated to its last half when the file exceeds 1 MB
- **Contains**: task lifecycle events (started, completed, failed, progress),
session-watcher ticks, scheduler events
- **View**: `poc-memory agent daemon log [--job NAME] [--lines N]`
- **Note**: the `daemon log` command reads this file and formats the JSONL
as human-readable lines with timestamps. The `--job` filter shows only
entries for a specific job name.
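
When the CLI is not at hand, the same kind of view can be approximated with a short script. The following is a minimal sketch, not the actual `daemon log` implementation: it assumes only the four documented keys (`ts`, `job`, `event`, `detail`), treats `ts` as an opaque string, and the function name `format_events` and its output layout are hypothetical.

```python
import json
from collections import deque


def format_events(lines, job=None, last=None):
    """Render JSONL event lines as 'ts [job] event: detail' strings.

    `job` keeps only entries for that job name; `last` keeps only the
    final N matching entries (None keeps all of them).
    """
    out = deque(maxlen=last)  # bounded buffer: older entries fall off the front
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            rec = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(rec, dict):
            continue
        if job is not None and rec.get("job") != job:
            continue
        # Output layout is an assumption, not the CLI's real format.
        out.append(
            f'{rec.get("ts", "?")} [{rec.get("job", "?")}] '
            f'{rec.get("event", "?")}: {rec.get("detail", "")}'
        )
    return list(out)
```

Passing an open file handle for `daemon.log` as `lines`, together with `job=` and `last=`, gives roughly what `--job NAME` and `--lines N` give on the command line.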
### daemon-status.json — live snapshot
- **Path**: `$data_dir/daemon-status.json`
- **Format**: pretty-printed JSON
- **Written by**: `write_status()` in daemon.rs, called periodically
- **Contains**: current task list with states (pending/running/completed),
graph health metrics, consolidation plan, uptime
- **View**: `poc-memory agent daemon status`
### llm-logs/ — per-agent LLM call transcripts
- **Path**: `$data_dir/llm-logs/{agent_name}/{timestamp}.txt`
- **Format**: plaintext sections: `=== PROMPT ===`, `=== CALLING LLM ===`,
`=== RESPONSE ===`
- **Written by**: `run_one_agent_inner()` in knowledge.rs
- **Contains**: full prompt sent to the LLM and full response received.
One file per agent invocation. Invaluable for debugging agent quality —
shows exactly what the model saw and what it produced.
- **Volume**: can be large — 292 files for distill alone as of Mar 19.
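
Because each transcript uses the same `=== NAME ===` markers, it can be split mechanically when comparing prompts and responses across many runs. This is a minimal sketch under two assumptions: each marker sits on a line of its own, and each marker appears at most once per file. The helper name `split_transcript` is hypothetical, and a prompt that itself contains a line of that shape would be split incorrectly.

```python
import re

# A section marker such as "=== PROMPT ===" alone on a line.
MARKER = re.compile(r"^=== (.+?) ===\s*$")


def split_transcript(text):
    """Return {section_name: body} for each '=== NAME ===' section."""
    sections = {}
    current = None
    for line in text.splitlines():
        m = MARKER.match(line)
        if m:
            current = m.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
        # Lines before the first marker are ignored.
    return {name: "\n".join(body).strip() for name, body in sections.items()}
```

For example, `split_transcript(text)["RESPONSE"]` returns only what the model produced, which is convenient for diffing two invocations of the same agent.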
### retrieval.log — memory search queries
- **Path**: `$data_dir/retrieval.log`
- **Format**: plaintext, one line per search: `[date] q="..." hits=N`
- **Contains**: every memory search query and hit count. Useful for
understanding what the memory-search hook is doing and whether
queries are finding useful results.
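
To check whether queries are finding anything, the zero-hit searches are a useful first cut. The sketch below assumes each line matches the documented shape `[date] q="..." hits=N` exactly and silently skips anything else; the date is kept as an opaque string, and the names `parse_retrieval` and `zero_hit_queries` are hypothetical.

```python
import re

# [date] q="..." hits=N ; the greedy .* lets the query itself contain quotes.
LINE = re.compile(
    r'^\[(?P<date>[^\]]*)\]\s+q="(?P<query>.*)"\s+hits=(?P<hits>\d+)\s*$'
)


def parse_retrieval(lines):
    """Yield (date, query, hits) for each line in the documented shape."""
    for line in lines:
        m = LINE.match(line.rstrip("\n"))
        if m:
            yield m.group("date"), m.group("query"), int(m.group("hits"))


def zero_hit_queries(lines):
    """Return the queries that found nothing, in log order."""
    return [query for _, query, hits in parse_retrieval(lines) if hits == 0]
```

A long run of zero-hit queries suggests the memory-search hook is asking for things the graph does not contain, or phrasing them in a way the search cannot match.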
### daily-check.log — graph health history
- **Path**: `$data_dir/daily-check.log`
- **Format**: plaintext, multi-line entries with metrics
- **Contains**: graph topology metrics over time (σ, α, gini, cc, fit).
Only ~10 entries so far, appended by the daily health check.
## In-memory state (redundant with daemon.log)
### ctx.log_line() — task output log
- **Stored in**: jobkit task state (last 20 lines per task)
- **Also writes to**: daemon.log via `log_event()` (as of Mar 19)
- **View**: `daemon-status.json` → task → `output_log`, or just tail `daemon.log`
- **Design note**: the in-memory buffer is redundant now that progress
events go to daemon.log. The status viewer should eventually just
tail daemon.log filtered by job name, eliminating the in-memory state.
### ctx.set_progress() — current activity string
- **Stored in**: jobkit task state
- **View**: shown in status display next to the task name
- **Note**: overwritten by each `ctx.log_line()` call.
## What to check when
| Problem | Check |
|----------------------------------|------------------------------------|
| Task not starting | daemon-status.json (task states) |
| Task failing | daemon.log (failed events) |
| Agent producing bad output | llm-logs/{agent}/{timestamp}.txt |
| Agent not finding right nodes | retrieval.log (search queries) |
| Graph health declining | daily-check.log |
| Resource pool / parallelism | **currently no log** — need to add |
| Which LLM backend is being used | daemon.log (llm-backend event) |