consciousness

Author	SHA1	Message	Date
Kent Overstreet	11a7e4043e	scripts: FP8 quantize Qwen3.6-27B for vLLM (multimodal + MTP) Quantization recipe targeting the multimodal Qwen3.6-27B for vLLM serving. Three pitfalls the script avoids, each documented inline: 1. Loader strip: `AutoModelForCausalLM` silently drops the vision tower; we load via the config-declared `Qwen3_5ForConditionalGeneration` instead. 2. Pattern anchor: llmcompressor matches the `ignore` list against module names (no `.weight` suffix) when walking `named_modules()`, not against full tensor names. Patterns now anchor on `$` at the module name; the earlier `\.weight$` form silently quantized lm_head and every linear_attn projection. 3. vLLM fusion: vLLM fuses {q,k,v}_proj into qkv_proj, gate+up into gate_up_proj, and in_proj_qkv+in_proj_z into in_proj_qkvz. The compressed_tensors loader rejects mixed schemes within a fused layer, so the `ignore` list is shaped to keep all sub-components of a fused layer consistent. After `oneshot()` writes the FP8 output, MTP tensors (which the HF class doesn't expose) are spliced in at BF16 from the upstream cached snapshot, with the compressed_tensors metadata header preserved. Recipe follows Unsloth's UD-Q8_K_XL late-stack overrides (FFN: 50, 51, 59, 62, 63; ATTN: 51, 59, 63), extended to include `v_proj` for fusion compat. Final checkpoint is ~35 GB (matches Unsloth's GGUF size to within ~1%) with vision tower BF16, MTP head BF16, and most mlp/self_attn Linears at FP8_DYNAMIC. Co-Authored-By: Proof of Concept <poc@bcachefs.org>	2026-04-24 22:15:31 -04:00
Kent Overstreet	aa46b1d5a6	poc-agent: read context_groups from config instead of hardcoded list - Remove MEMORY_FILES constant from identity.rs - Add ContextGroup struct for deserializing from config - Load context_groups from ~/.config/poc-agent/config.json5 - Check ~/.config/poc-agent/ first for identity files, then project/global - Debug screen now shows what's actually configured This eliminates the hardcoded duplication and makes the debug output match what's in the config file.	2026-03-24 01:53:28 -04:00
Kent Overstreet	d9b56a02c3	Consolidate poc-memory and poc-agent configs poc-memory now reads from poc-agent's config.json5 as the primary config source. Memory-specific settings live in a "memory" section; API credentials are resolved from the shared model/backend config instead of being duplicated. - Add "memory" section to ~/.config/poc-agent/config.json5 - poc-memory config.rs: try shared config first, fall back to legacy JSONL - API fields (base_url, api_key, model) resolved via memory.agent_model -> models -> backend lookup - Add json5 dependency for proper JSON5 parsing - Update provisioning scripts: hermes -> qwen3_coder tool parser Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 21:49:58 -04:00
Kent Overstreet	377e2773bc	Add MI300X provisioning script for vllm/Qwen 3.5 27B ROCm-specific setup with: - AITER attention backends (VLLM_ROCM_USE_AITER=1) - Reduced cudagraph capture size (DeltaNet cache conflict) - BF16 model + FP8 KV cache as default (FP8 weights can be slower on MI300X due to ROCm kernel maturity) - FP8=1 flag for benchmarking FP8 model weights Key for training plan: if FP8 matmuls are slow on MI300X, the quantize-and-expand strategy needs B200 instead. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 14:40:15 -04:00
ProofOfConcept	6a7ec9732b	tui: fix cursor position calculation The cursor index is into self.input, but the rendered buffer contains the prompt prepended to the first line. Need to add prompt.len() to get the correct character position when scanning the buffer.	2026-03-19 00:45:07 -04:00
Kent Overstreet	f83325b44d	Fix poc-agent for vllm/Qwen 3.5: reasoning display, tool parser - Always display reasoning tokens regardless of reasoning_effort setting — Qwen 3.5 thinks natively and the reasoning parser separates it into its own field - Remove chat_template_kwargs that disabled thinking when reasoning_effort was "none" - Add chat_template_kwargs field to ChatRequest for vllm compat - Update provision script: qwen3_xml tool parser, qwen3 reasoning parser, 262K context, 95% GPU memory utilization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 00:06:26 -04:00
Kent Overstreet	49ccdf87e1	Add vllm provisioning script for RunPod GPU instances Sets up vllm with Qwen 2.5 27B Instruct, prefix caching enabled, Hermes tool call parser for function calling support. Configurable via environment variables (MODEL, PORT, MAX_MODEL_LEN). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 23:13:04 -04:00
ProofOfConcept	552d255dc3	migrate agent output to capnp store, add provenance tracking All agent output now goes to the store as nodes instead of markdown/JSON files. Each node carries a Provenance enum identifying which agent created it (AgentDigest, AgentConsolidate, AgentFactMine, AgentKnowledgeObservation, etc — 14 variants total). Store changes: - upsert_provenance() method for agent-created nodes - Provenance enum expanded from 5 to 14 variants Agent changes: - digest: writes to store nodes (daily-YYYY-MM-DD.md etc) - consolidate: reports/actions/logs stored as _consolidation-* nodes - knowledge: depth DB and agent output stored as _knowledge-* nodes - enrich: experience-mine results go directly to store - llm: --no-session-persistence prevents transcript accumulation Deleted: 14 Python/shell scripts replaced by Rust implementations.	2026-03-05 15:30:57 -05:00
ProofOfConcept	5c641d9f8a	knowledge agents: extractor, connector, challenger, observation Four layer-2 agents that produce new knowledge from the memory graph: mine conversations, extract patterns from clusters, find cross-domain connections, stress-test existing nodes. Output to agent-results/. knowledge_loop.py runs them on a schedule with quality tracking.	2026-03-03 10:56:44 -05:00
ProofOfConcept	71e6f15d82	spectral decomposition, search improvements, char boundary fix - New spectral module: Laplacian eigendecomposition of the memory graph. Commands: spectral, spectral-save, spectral-neighbors, spectral-positions, spectral-suggest. Spectral neighbors expand search results beyond keyword matching to structural proximity. - Search: use StoreView trait to avoid 6MB state.bin rewrite on every query. Append-only retrieval logging. Spectral expansion shows structurally nearby nodes after text results. - Fix panic in journal-tail: string truncation at byte 67 could land inside a multi-byte character (em dash). Now walks back to char boundary. - Replay queue: show classification and spectral outlier score. - Knowledge agents: extractor, challenger, connector prompts and runner scripts for automated graph enrichment. - memory-search hook: stale state file cleanup (24h expiry).	2026-03-03 01:33:31 -05:00
ProofOfConcept	3afc947b88	delete superseded Python scripts Seven scripts (1,658 lines) replaced by native Rust subcommands: - journal-agent.py → poc-memory journal-enrich - digest-link-parser.py → poc-memory digest-links - apply-consolidation.py → poc-memory apply-consolidation - daily-digest.py → poc-memory digest daily - weekly-digest.py → poc-memory digest weekly - monthly-digest.py → poc-memory digest monthly - refine-source.sh → folded into journal-enrich Also updated poc-journal to use Rust journal-enrich instead of Python journal-agent.py, and cleaned up stale __pycache__. Remaining Python (2,154 lines): consolidation-agents, consolidation-loop, content-promotion-agent, bulk-categorize, retroactive-digest, store_helpers, call-sonnet.sh, daily-check.sh — still active and evolving.	2026-03-01 00:13:03 -05:00
ProofOfConcept	d14710e477	scripts: use capnp store instead of reading markdown directly Add store_helpers.py with shared helpers that call poc-memory commands (list-keys, render, journal-tail) instead of globbing ~/.claude/memory/*.md and parsing section headers. All 9 Python scripts updated: get_semantic_keys(), get_topic_file_index(), get_recent_journal(), parse_journal_entries(), read_journal_range(), collect_topic_stems(), and file preview rendering now go through the store. This completes the clean switch — no script reads archived markdown files.	2026-02-28 23:32:47 -05:00
ProofOfConcept	4b0bba7c56	replace state.json cache with bincode state.bin Faster serialization/deserialization, smaller on disk (4.2MB vs 5.9MB). Automatic migration from state.json on first load — reads the JSON, writes state.bin, deletes the old file. Added list-keys, list-edges, dump-json commands so Python scripts no longer need to parse the cache directly. Updated bulk-categorize.py and consolidation-loop.py to use the new CLI commands.	2026-02-28 22:30:03 -05:00
ProofOfConcept	23fac4e5fe	poc-memory v0.4.0: graph-structured memory with consolidation pipeline Rust core: - Cap'n Proto append-only storage (nodes + relations) - Graph algorithms: clustering coefficient, community detection, schema fit, small-world metrics, interference detection - BM25 text similarity with Porter stemming - Spaced repetition replay queue - Commands: search, init, health, status, graph, categorize, link-add, link-impact, decay, consolidate-session, etc. Python scripts: - Episodic digest pipeline: daily/weekly/monthly-digest.py - retroactive-digest.py for backfilling - consolidation-agents.py: 3 parallel Sonnet agents - apply-consolidation.py: structured action extraction + apply - digest-link-parser.py: extract ~400 explicit links from digests - content-promotion-agent.py: promote episodic obs to semantic files - bulk-categorize.py: categorize all nodes via single Sonnet call - consolidation-loop.py: multi-round automated consolidation Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>	2026-02-28 22:17:00 -05:00

14 commits