Logging overhaul: per-task log files, daemon.log drill-down
Switch from jobkit-daemon crate to jobkit with daemon feature.
Wire up per-task log files for all daemon-spawned agent tasks.

Changes:
- Use jobkit::daemon:: instead of jobkit_daemon::
- All agent tasks get .log_dir() set to $data_dir/logs/
- Task log path shown in daemon status and TUI
- New CLI: poc-memory agent daemon log --task NAME
  Finds the task's log path from status or daemon.log, tails the file
- LLM backend selection logged to daemon.log via log_event
- Targeted agent job names include the target key for debuggability
- Logging architecture documented in doc/logging.md

Two-level logging, no duplication:
- daemon.log: lifecycle events with task log path for drill-down
- per-task logs: full agent output via ctx.log_line()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent f2c2c02a22
commit 49f72cdac3
7 changed files with 192 additions and 54 deletions
76
doc/logging.md
Normal file
@@ -0,0 +1,76 @@
# Logging Architecture

poc-memory has multiple logging channels serving different purposes.
Understanding which log to check is essential for debugging.

## Log files

### daemon.log — structured event log
- **Path**: `$data_dir/daemon.log` (default: `~/.claude/memory/daemon.log`)
- **Format**: JSONL, one object per line with the keys `ts`, `job`, `event`, `detail`
- **Written by**: `jobkit_daemon::event_log::log()`, wrapped by `log_event()` in daemon.rs
- **Rotation**: truncated to its last half when the file exceeds 1MB
- **Contains**: task lifecycle events (started, completed, failed, progress),
  session-watcher ticks, scheduler events
- **View**: `poc-memory agent daemon log [--job NAME] [--lines N]`
- **Note**: the "daemon log" command reads this file and formats the JSONL
  as human-readable lines with timestamps. The `--job` filter shows only
  entries for a specific job name.
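
The rotation rule above ("truncates to last half when file exceeds 1MB") can be sketched as follows. This is a minimal illustration and not the jobkit implementation: the function name `keep_last_half`, the constant `MAX_LOG_BYTES`, and the choice to align the cut to a line boundary are all assumptions made for the example.

```rust
/// Hypothetical threshold mirroring the "exceeds 1MB" rule in the doc.
const MAX_LOG_BYTES: usize = 1024 * 1024;

/// Returns the part of `contents` that survives rotation.
///
/// At or under `max_bytes`, everything is kept. Over the limit, the first
/// half is dropped, and the cut is moved forward to the next line start so
/// the kept data does not begin with a partial JSONL record.
fn keep_last_half(contents: &[u8], max_bytes: usize) -> &[u8] {
    if contents.len() <= max_bytes {
        return contents;
    }
    let mid = contents.len() / 2;
    // Start one byte early so a midpoint that already sits on a line
    // start is recognised and that line is not thrown away.
    let search_from = mid.saturating_sub(1);
    match contents[search_from..].iter().position(|&b| b == b'\n') {
        Some(offset) => &contents[search_from + offset + 1..],
        // No newline after the midpoint: fall back to a plain byte cut.
        None => &contents[mid..],
    }
}

/// Convenience wrapper using the assumed 1MB limit.
fn rotate(contents: &[u8]) -> &[u8] {
    keep_last_half(contents, MAX_LOG_BYTES)
}
```

A caller would read the file, pass the bytes to `rotate`, and write the returned slice back if it is shorter than the input. Whether the real code aligns to line boundaries is not stated in this document.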

### daemon-status.json — live snapshot
- **Path**: `$data_dir/daemon-status.json`
- **Format**: pretty-printed JSON
- **Written by**: `write_status()` in daemon.rs, called periodically
- **Contains**: current task list with states (pending/running/completed),
  graph health metrics, consolidation plan, uptime
- **View**: `poc-memory agent daemon status`

### llm-logs/ — per-agent LLM call transcripts
- **Path**: `$data_dir/llm-logs/{agent_name}/{timestamp}.txt`
- **Format**: plaintext sections: `=== PROMPT ===`, `=== CALLING LLM ===`,
  `=== RESPONSE ===`
- **Written by**: `run_one_agent_inner()` in knowledge.rs
- **Contains**: full prompt sent to the LLM and full response received.
  One file per agent invocation. Invaluable for debugging agent quality:
  it shows exactly what the model saw and what it produced.
- **Volume**: can be large: 292 files for distill alone as of Mar 19.
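
As a rough sketch of how the path layout and section markers above fit together, the following builds a transcript path and pulls the prompt and response back out of a transcript. The helper names `transcript_path` and `split_transcript` are hypothetical, the timestamp format is not specified in this document, and the sketch assumes nothing of interest sits between `=== CALLING LLM ===` and `=== RESPONSE ===`.

```rust
use std::path::{Path, PathBuf};

/// Builds `$data_dir/llm-logs/{agent_name}/{timestamp}.txt`.
/// `timestamp` is passed through as-is; its real format is an unknown here.
fn transcript_path(data_dir: &Path, agent_name: &str, timestamp: &str) -> PathBuf {
    data_dir
        .join("llm-logs")
        .join(agent_name)
        .join(format!("{timestamp}.txt"))
}

/// Splits a transcript into (prompt, response) using the three markers.
/// Returns `None` if any marker is missing or they appear out of order.
fn split_transcript(text: &str) -> Option<(&str, &str)> {
    let after_prompt = text.split_once("=== PROMPT ===")?.1;
    let (prompt, rest) = after_prompt.split_once("=== CALLING LLM ===")?;
    let response = rest.split_once("=== RESPONSE ===")?.1;
    Some((prompt.trim(), response.trim()))
}
```

Reading a file with `std::fs::read_to_string` and passing the result to `split_transcript` is enough to compare what the model saw with what it produced.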

### retrieval.log — memory search queries
- **Path**: `$data_dir/retrieval.log`
- **Format**: plaintext, one line per search: `[date] q="..." hits=N`
- **Contains**: every memory search query and hit count. Useful for
  understanding what the memory-search hook is doing and whether
  queries are finding useful results.
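
To check whether queries are finding anything, the `[date] q="..." hits=N` line format can be parsed with a few string splits. The sketch below is illustrative only: `RetrievalEntry` and `parse_retrieval_line` are hypothetical names, and the date is kept as an opaque string because its exact format is not given here.

```rust
/// One parsed line of retrieval.log (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct RetrievalEntry {
    date: String,
    query: String,
    hits: u64,
}

/// Parses a line of the form `[date] q="..." hits=N`.
/// Returns `None` for lines that do not match that shape.
fn parse_retrieval_line(line: &str) -> Option<RetrievalEntry> {
    let rest = line.trim().strip_prefix('[')?;
    let (date, rest) = rest.split_once("] q=\"")?;
    // Split from the right so quotes inside the query do not end it early.
    let (query, hits) = rest.rsplit_once("\" hits=")?;
    Some(RetrievalEntry {
        date: date.to_string(),
        query: query.to_string(),
        hits: hits.parse().ok()?,
    })
}

/// Counts the parsed entries that returned no hits at all.
fn zero_hit_queries(log: &str) -> usize {
    log.lines()
        .filter_map(parse_retrieval_line)
        .filter(|entry| entry.hits == 0)
        .count()
}
```

A high count from `zero_hit_queries` would suggest the memory-search hook is issuing queries that match nothing, which is the situation the "Agent not finding right nodes" row in the table below points at.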

### daily-check.log — graph health history
- **Path**: `$data_dir/daily-check.log`
- **Format**: plaintext, multi-line entries with metrics
- **Contains**: graph topology metrics over time (σ, α, gini, cc, fit).
  Only ~10 entries; appended by the daily health check.

## In-memory state (redundant with daemon.log)

### ctx.log_line() — task output log
- **Stored in**: jobkit task state (last 20 lines per task)
- **Also writes to**: daemon.log via `log_event()` (as of Mar 19)
- **View**: `daemon-status.json` → task → output_log, or just tail daemon.log
- **Design note**: the in-memory buffer is redundant now that progress
  events go to daemon.log. The status viewer should eventually just
  tail daemon.log filtered by job name, eliminating the in-memory state.

### ctx.set_progress() — current activity string
- **Stored in**: jobkit task state
- **View**: shown in status display next to the task name
- **Note**: overwritten by each `ctx.log_line()` call.
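
The interaction between the two calls above (a bounded buffer of the last 20 lines, and a progress string that each logged line overwrites) can be modelled roughly like this. It is a sketch of the described behaviour, not jobkit's code: `TaskState`, its fields, and `MAX_OUTPUT_LINES` are hypothetical, and the write to daemon.log via `log_event()` is left out.

```rust
use std::collections::VecDeque;

/// Assumed cap, taken from "last 20 lines per task" in the doc.
const MAX_OUTPUT_LINES: usize = 20;

/// Hypothetical stand-in for the per-task state kept in memory.
#[derive(Default)]
struct TaskState {
    progress: String,
    output_log: VecDeque<String>,
}

impl TaskState {
    /// Sets the current activity string shown next to the task name.
    fn set_progress(&mut self, msg: &str) {
        self.progress = msg.to_string();
    }

    /// Appends a line, evicting the oldest once the cap is reached,
    /// and overwrites the progress string with the same line.
    fn log_line(&mut self, line: &str) {
        if self.output_log.len() == MAX_OUTPUT_LINES {
            self.output_log.pop_front();
        }
        self.output_log.push_back(line.to_string());
        self.progress = line.to_string();
    }
}
```

The model makes the design note concrete: anything older than the last 20 lines exists only in daemon.log, so the buffer adds nothing the event log does not already hold.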

## What to check when

| Problem                          | Check                              |
|----------------------------------|------------------------------------|
| Task not starting                | daemon-status.json (task states)   |
| Task failing                     | daemon.log (failed events)         |
| Agent producing bad output       | llm-logs/{agent}/{timestamp}.txt   |
| Agent not finding right nodes    | retrieval.log (search queries)     |
| Graph health declining           | daily-check.log                    |
| Resource pool / parallelism      | **currently no log** (to be added) |
| Which LLM backend is being used  | daemon.log (llm-backend event)     |