consciousness

Author	SHA1	Message	Date
ProofOfConcept	c959b2c964	evaluate: fix RNG — xorshift32 replaces degenerate LCG The LCG was producing only 2 distinct matchup pairs due to poor constants. Switch to xorshift32 for proper coverage of all type pairs. Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 19:57:58 -04:00
ProofOfConcept	16777924d0	evaluate: switch to Elo ratings with skillratings crate Replace sort-based ranking with proper Elo system: - Each agent TYPE has a persistent Elo rating (agent-elo.json) - Each matchup: pick two random types, grab a recent action from each, LLM compares, update ratings - Ratings persist across daily evaluations — natural recency bias from continuous updates against current opponents - K=32 for fast adaptation to prompt changes Usage: poc-memory agent evaluate --matchups 30 --model haiku Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 19:53:46 -04:00
ProofOfConcept	e2a6bc4c8b	evaluate: remove TIE option, force binary judgment TIE causes inconsistency in sort (A=B, B=C but A>C breaks ordering). Force the comparator to always pick a winner. Default to A if response is unparseable. Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 19:48:01 -04:00
ProofOfConcept	0cecfdb352	evaluate: fix agent prompt path, dedup affected nodes, add --dry-run - Use CARGO_MANIFEST_DIR for agent file path (same as defs.rs) - Dedup affected nodes extracted from reports - --dry-run shows example comparison prompt without LLM calls Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 19:44:12 -04:00
ProofOfConcept	415180eeab	evaluate: ask for reasoning in comparisons Chain-of-thought: "say which is better and why" forces clearer judgment and gives us analysis data for improving agents. Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 19:36:55 -04:00
ProofOfConcept	39e3d69e3c	evaluate: dedup agent prompt when comparing same agent type When both actions are from the same agent, show the instructions once and just compare the two report outputs + affected nodes. Saves tokens and makes the comparison cleaner. Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 19:35:11 -04:00
ProofOfConcept	b964335317	evaluate: include agent prompt + affected nodes in comparisons Each comparison now shows the LLM: - Agent instructions (the .agent prompt file) - Report output (what the agent did) - Affected nodes content (what it changed) The comparator sees intent, action, and impact — can judge whether a deletion was correct, whether links are meaningful, whether WRITE_NODEs capture real insights. Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 19:34:10 -04:00
ProofOfConcept	433d36aea8	evaluate: use rayon par_sort_by for parallel LLM comparisons Merge sort parallelizes naturally — multiple LLM comparison calls happen concurrently. Safe because merge sort terminates correctly even with non-deterministic comparators (unlike quicksort). Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 19:27:28 -04:00
ProofOfConcept	e12dea503b	agent evaluate: sort agent actions by quality using Vec::sort_by with LLM Yes, really. Rust's stdlib sort_by with an LLM pairwise comparator. Each comparison is an API call asking "which action was better?" Sample N actions per agent type, throw them all in a Vec, sort. Where each agent's samples cluster = that agent's quality score. Reports per-type average rank and quality ratio. Supports both haiku (fast/cheap) and sonnet (quality) as comparator. Usage: poc-memory agent evaluate --samples 5 --model haiku Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 19:24:07 -04:00
ProofOfConcept	99db511403	cli: move helpers to cli modules, main.rs under 1100 lines Move CLI-specific helpers to their cli/ modules: - journal_tail_entries, journal_tail_digests, extract_title, find_current_transcript → cli/journal.rs - get_group_content → cli/misc.rs - cmd_journal_write, cmd_journal_tail, cmd_load_context follow These are presentation/session helpers, not library code — they belong in the CLI layer per Kent's guidance. main.rs: 3130 → 1054 lines (66% reduction). Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 18:14:52 -04:00
ProofOfConcept	8640d50990	cli: extract journal and misc commands, complete split Move remaining extractable handlers into cli/journal.rs and cli/misc.rs. Functions depending on main.rs helpers (cmd_journal_tail, cmd_journal_write, cmd_load_context, cmd_cursor, cmd_daemon, cmd_digest, cmd_experience_mine, cmd_apply_agent) remain in main.rs — next step is moving those helpers to library code. main.rs: 3130 → 1331 lines (57% reduction). cli/ total: 1860 lines across 6 focused files. Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 18:10:22 -04:00
ProofOfConcept	f423cf22df	cli: extract agent and admin commands from main.rs Move agent handlers (consolidate, replay, digest, experience-mine, fact-mine, knowledge-loop, apply-*) into cli/agent.rs. Move admin handlers (init, fsck, dedup, bulk-rename, health, daily-check, import, export) into cli/admin.rs. Functions tightly coupled to Clap types (cmd_daemon, cmd_digest, cmd_apply_agent, cmd_experience_mine) remain in main.rs. main.rs: 3130 → 1586 lines (49% reduction). Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 18:06:27 -04:00
ProofOfConcept	aa2fddf137	cli: extract node commands from main.rs into cli/node.rs Move 15 node subcommand handlers (310 lines) out of main.rs: render, write, used, wrong, not-relevant, not-useful, gap, node-delete, node-rename, history, list-keys, list-edges, dump-json, lookup-bump, lookups. main.rs: 2518 → 2193 lines. Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 18:02:12 -04:00
ProofOfConcept	c8d86e94c1	cli: extract graph commands from main.rs into cli/graph.rs Move 18 graph subcommand handlers (594 lines) out of main.rs: link, link-add, link-impact, link-audit, link-orphans, triangle-close, cap-degree, normalize-strengths, differentiate, trace, spectral-*, organize, interference. main.rs: 3130 → 2518 lines. Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-03-14 17:59:46 -04:00
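
The Elo commit above (16777924d0) describes a persistent rating per agent type, updated after each binary LLM judgment with K=32. The arithmetic behind one such update can be sketched as follows. This is a hand-rolled illustration of the standard Elo formula, not the repository's code and not the `skillratings` crate API; the function names `expected_score` and `elo_update` and the starting rating of 1000 are assumptions made for the example.

```rust
/// Expected score (win probability) of a player rated `ra` against one rated `rb`,
/// using the standard Elo logistic curve with a 400-point scale.
fn expected_score(ra: f64, rb: f64) -> f64 {
    1.0 / (1.0 + 10f64.powf((rb - ra) / 400.0))
}

/// Apply one binary result (no ties) and return the updated
/// (winner, loser) ratings. `k` is the K-factor; the commit uses K=32.
fn elo_update(winner: f64, loser: f64, k: f64) -> (f64, f64) {
    let expected_win = expected_score(winner, loser);
    // The winner scored 1.0, so it gains k * (1 - expected);
    // the loser gives up the same amount.
    let delta = k * (1.0 - expected_win);
    (winner + delta, loser - delta)
}

fn main() {
    // Two agent types with equal (hypothetical) starting ratings.
    let (w, l) = elo_update(1000.0, 1000.0, 32.0);
    println!("{w:.1} {l:.1}"); // prints "1016.0 984.0"
}
```

Because the update is zero-sum and scaled by how surprising the result was, a larger K such as 32 moves ratings quickly after a prompt change, which matches the "fast adaptation" rationale in the commit message.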

14 commits
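
The most recent commit (c959b2c964) replaces a degenerate LCG with xorshift32 so that matchups cover all type pairs. A minimal sketch of that generator, using the common 13/17/5 shift triple, might look like the following. The `XorShift32` type, the zero-seed fallback constant, and the `pick_pair` helper are hypothetical names chosen for illustration; the log does not show how the repository seeds the generator or maps its output to agent types.

```rust
use std::collections::HashSet;

/// Minimal xorshift32 generator (shift triple 13, 17, 5).
struct XorShift32 {
    state: u32,
}

impl XorShift32 {
    /// The state must be non-zero, otherwise the generator emits zeros forever.
    /// The fallback constant here is an arbitrary non-zero value.
    fn new(seed: u32) -> Self {
        Self { state: if seed == 0 { 0x9E37_79B9 } else { seed } }
    }

    fn next_u32(&mut self) -> u32 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.state = x;
        x
    }

    /// Pick an ordered pair of two distinct indices in `0..n` (hypothetical helper).
    fn pick_pair(&mut self, n: usize) -> (usize, usize) {
        assert!(n >= 2, "need at least two types to form a matchup");
        let a = self.next_u32() as usize % n;
        // Offset by 1..n-1 so the second index never equals the first.
        let b = (a + 1 + self.next_u32() as usize % (n - 1)) % n;
        (a, b)
    }
}

fn main() {
    let mut rng = XorShift32::new(1);
    let mut seen = HashSet::new();
    for _ in 0..1000 {
        seen.insert(rng.pick_pair(4));
    }
    // With 4 types there are 12 possible ordered pairs of distinct indices.
    println!("distinct pairs seen: {} of 12", seen.len());
}
```

Counting distinct pairs this way is also a cheap check for the failure described in the commit, where the old generator produced only 2 distinct matchup pairs.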