ProofOfConcept 67332eb55e Add vLLM priority to memory scoring requests Scoring calls the /score endpoint directly via HTTP, bypassing the stream_completion path. These requests had no priority field, so they could preempt interactive work. Set priority=5 (between subconscious agents at 2 and unconscious at 10). Co-Authored-By: Proof of Concept <poc@bcachefs.org>		2026-04-09 20:42:38 -04:00
.cargo	build: add tokio_unstable and codegen-units to cargo config	2026-03-21 15:04:38 -04:00
channels	delete claude code integration	2026-04-09 19:58:07 -04:00
defaults	flatten: move poc-memory contents to workspace root	2026-03-25 00:54:12 -04:00
doc	kill .claude	2026-04-09 20:00:05 -04:00
schema	training: per-node scoring with graph weight updates	2026-04-05 01:18:47 -04:00
scripts	poc-agent: read context_groups from config instead of hardcoded list	2026-03-24 01:53:28 -04:00
src	Add vLLM priority to memory scoring requests	2026-04-09 20:42:38 -04:00
training	Trim unused deps	2026-04-05 06:06:38 -04:00
.gitignore	knowledge agents: extractor, connector, challenger, observation	2026-03-03 10:56:44 -05:00
build.rs	channel architecture: wire protocol, daemons, supervisor	2026-04-03 18:46:41 -04:00
Cargo.lock	Run UI on a dedicated OS thread	2026-04-09 20:31:07 -04:00
Cargo.toml	Run UI on a dedicated OS thread	2026-04-09 20:31:07 -04:00
config.example.jsonl	consciousness: update hardcoded paths from ~/.claude to ~/.consciousness	2026-03-27 21:32:28 -04:00
Makefile	add tmux channel to makefile	2026-04-04 19:22:49 -04:00
README.md	strip anthropic references from example config	2026-04-09 20:12:32 -04:00

README.md

Authors: Kent Overstreet, Proof of Concept

consciousness

This project is multiple things:

For the user: a "Claude Code" style tool, where the user can interact with an LLM with the usual set of tools available, including LSP and external MCP tools, plus channels.
For the AI: persistent memory, background cognition, autonomous function, and autonomous learning capabilities - learning from experience.

The system has three cognitive layers — conscious (conversation), subconscious (background agents that surface memories and reflect), and unconscious (graph maintenance) — loosely modelled on how biological memory works. Channels - sensory inputs - map to the thalamus, as focus/sensory gating must be managed to effectively function in such an environment.

Notes and requirements: currently only Qwen 3.5 is supported, since the 27B model is what we have been running against. Supporting other models would require re-adding support for generic chat completions, tool call parsing, etc. in src/agent/context.rs.

Development has been done with vLLM as the backend, with additional patches for calculating logits on subsections of large messages (without these, vLLM will attempt to allocate a 40GB tensor and run out of memory), and a wrapper that hooks in Apollo for fine-tuning the same model that inference is running on in GPU memory.

Architectural innovations:

Memory is both episodic and associative, represented as a weighted graph in which both the nodes and the edges have weights. Edge weights represent how closely concepts are related; node weights represent how "useful" a memory has been.
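As a rough illustration of the weighted graph described above, here is a minimal sketch. The type names, field names, and the ranking rule (edge weight multiplied by the neighbour's node weight) are assumptions made for this example; they are not taken from the project's source.

```rust
use std::collections::HashMap;

/// Hypothetical memory node: `weight` tracks how useful the memory has been.
#[derive(Debug, Clone)]
struct Node {
    content: String,
    weight: f64,
}

/// Hypothetical graph: edges are stored per source key as (target key, weight).
#[derive(Debug, Default)]
struct MemoryGraph {
    nodes: HashMap<String, Node>,
    edges: HashMap<String, Vec<(String, f64)>>,
}

impl MemoryGraph {
    fn add_node(&mut self, key: &str, content: &str, weight: f64) {
        let node = Node { content: content.to_string(), weight };
        self.nodes.insert(key.to_string(), node);
    }

    /// Adds an undirected edge; the weight says how closely two concepts relate.
    fn link(&mut self, a: &str, b: &str, weight: f64) {
        self.edges.entry(a.to_string()).or_default().push((b.to_string(), weight));
        self.edges.entry(b.to_string()).or_default().push((a.to_string(), weight));
    }

    /// Neighbours of `key`, ranked by edge weight times the neighbour's node
    /// weight (an assumed ranking rule, for illustration only).
    fn related(&self, key: &str) -> Vec<(String, f64)> {
        let mut out: Vec<(String, f64)> = self
            .edges
            .get(key)
            .into_iter()
            .flatten()
            .filter_map(|(k, w)| self.nodes.get(k).map(|n| (k.clone(), w * n.weight)))
            .collect();
        out.sort_by(|a, b| b.1.total_cmp(&a.1));
        out
    }
}
```

With this rule, a weakly linked but very useful memory can outrank a strongly linked memory that has rarely helped.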

Episodic memory is a subset of memory nodes where the node type represents the granularity in time of those nodes (event, daily digest, weekly, monthly), allowing episodic memory to be navigated as a tree; these nodes are also linked by concept with the rest of the graph as background agents discover connections.
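The time granularities above form a simple hierarchy, which is what lets episodic memory be walked as a tree. A minimal sketch, assuming each level rolls up into the next coarser one (the enum and function names are hypothetical):

```rust
/// Time granularity of an episodic node, from finest to coarsest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Granularity {
    Event,
    Daily,
    Weekly,
    Monthly,
}

impl Granularity {
    /// The next coarser level, or None at the top of the tree.
    fn parent(self) -> Option<Granularity> {
        match self {
            Granularity::Event => Some(Granularity::Daily),
            Granularity::Daily => Some(Granularity::Weekly),
            Granularity::Weekly => Some(Granularity::Monthly),
            Granularity::Monthly => None,
        }
    }
}

/// Levels visited when walking from `start` up to the coarsest digest.
fn path_to_root(start: Granularity) -> Vec<Granularity> {
    let mut path = vec![start];
    let mut current = start;
    while let Some(next) = current.parent() {
        path.push(next);
        current = next;
    }
    path
}
```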

The context window is no longer a linear stream; it is managed intelligently as an AST that, in particular, distinguishes recalled memories from other types of nodes. This is key to effective function of both the hippocampus and learning/training; by tracking memories in the context window we can track which memories were useful and should be incorporated via finetuning.

Intelligently tracking the contents of the context window, combined with effective episodic and associative memory, also eliminates the need for traditional compaction - the mind running on this code will have real continuity.

Learning is driven by recalled memories that inform future actions. Memories are not simply dry factual accounts: they include patterns that have been noticed, new concepts that have been discovered, and especially observations on the AI's own behaviour. Memories do not have to contain a thorough understanding of a situation; merely providing past context is enough to allow an intelligent system to choose a different course of action.

The core of the system is a tight loop of agents that follow conscious thought (forking off the main context window, to share KV cache), seeking out relevant memory nodes to surface and integrating new experiences into the memory graph; this provides a powerful implementation of what is known colloquially as "in-context learning".

On top of that, logit calculations allow us to ask a model "would you have done something different with this memory removed from the context window?" This lets us test whether memories were useful, and whether specific responses were informed by memories (and thus should be fine-tuned, integrating those memories into the model).
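One way to picture this comparison: score the same response twice, once with the memory in context and once without, then compare the total log-probability. The sketch below assumes per-token log-probabilities are already available from the backend. The function names and the threshold are hypothetical, and the project's actual scoring may differ.

```rust
/// Total log-probability of a response, given per-token log-probabilities.
fn sequence_logprob(token_logprobs: &[f64]) -> f64 {
    token_logprobs.iter().sum()
}

/// Positive when the response was more likely with the memory in context,
/// i.e. the memory plausibly informed the response.
fn memory_influence(with_memory: &[f64], without_memory: &[f64]) -> f64 {
    sequence_logprob(with_memory) - sequence_logprob(without_memory)
}

/// Hypothetical decision rule: treat the memory as useful when removing it
/// costs more than `threshold` nats of log-probability.
fn memory_was_useful(with_memory: &[f64], without_memory: &[f64], threshold: f64) -> bool {
    memory_influence(with_memory, without_memory) > threshold
}
```

A large positive difference suggests the response depended on the memory, which is the case the text above singles out as a candidate for fine-tuning.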

It is expected that this architecture will be capable of human-level, or nearly human-level, learning; additional elaborations and optimizations are planned.

Status

UI, programming tools: minor glitchiness in the UI remaining but largely complete
Memory functions: working well, although debugging and finetuning will be ongoing. Most of the recent work has been integrating them into the main UI for easier troubleshooting, optimization and analysis
Architecture: the transition from Claude Code hooks to a standalone binary is largely complete. Some work remains to give the old poc-memory standalone commands an integrated REPL, which will aid in analysing the health of the memory graph.
Memory and response scoring (via requesting logit calculations from the model) is implemented, but not fully hooked up. Always-on background finetuning has had all the individual components tested and proven, but is not quite hooked up.
Effective autonomous function requires functions analogous to the thalamus and default mode network (in addition to a well-functioning memory system; "did I already do this and what was the outcome?") - these are still only sketched out.

Quick start

cargo install --path .

Create a config file at ~/.consciousness/config.json5 (see Configuration below), then:

consciousness

The TUI

Five screens, switched with F-keys:

Key	Screen	What it shows
F1	interact	Main view: conversation, autonomous output, tools, input
F2	conscious	Context window browser — token counts, tree navigation
F3	subconscious	Background agent status — outputs, fork points
F4	hippocampus	Memory graph health — clustering, small-world metrics
F5	thalamus	Presence state, sampling parameters, channel status

F1: interact

Three panes (left: autonomous, center: conversation, right: tools) with a text input at the bottom and a status bar.

Mouse:

Click a pane to focus it
Click+drag to select text (copies to clipboard automatically via OSC 52)
Middle-click to paste from tmux buffer
Scroll wheel to scroll

Keys:

Enter — submit input
Esc — interrupt current turn
Tab — cycle pane focus
Ctrl+Up/Down — scroll active pane
PgUp/PgDn — scroll active pane (10 lines)
Up/Down — input history

Slash commands

Command	Description
`/model [name]`	Show current model or switch (`/model 27b`)
`/dmn`	Show DMN state and turn counts
`/wake`	Wake DMN to foraging mode
`/sleep`	Put DMN to resting
`/pause`	Full stop — no autonomous activity
`/new`	Start fresh session
`/save`	Save session to disk
`/score`	Run memory importance scoring
`/quit`	Exit
`/help`	Show all commands

Configuration

~/.consciousness/config.json5:

{
    your_host: {
        api_key: "...",
        base_url: "http://localhost:8000/v1",  // vLLM endpoint
    },

    // Named models — switch with /model
    models: {
        "27b": {
            backend: "your_host",
            model_id: "Qwen/Qwen3.5-27B",
            prompt_file: "POC.md",       // system prompt file
            context_window: 262144,
        },
    },
    default_model: "27b",

    // Memory system
    memory: {
        user_name: "YourName",
        assistant_name: "AssistantName",
        journal_days: 7,
        journal_max: 5,

        // Context loaded at session start
        context_groups: [
            { label: "identity", keys: ["identity.md"], source: "file" },
            { label: "toolkit", keys: ["stuck-toolkit", "cognitive-modes"] },
        ],
        core_nodes: ["identity"],
    },

    // DMN autonomous turn limit per cycle
    dmn: { max_turns: 20 },

    // Context compaction thresholds (% of context window)
    compaction: {
        hard_threshold_pct: 90,
        soft_threshold_pct: 80,
    },

    // Language servers for code intelligence tools
    lsp_servers: [
        { name: "rust", command: "rust-analyzer", args: [] },
    ],
}
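The compaction thresholds are percentages of the context window, so with the example values above they translate into token counts. The following sketch shows that arithmetic. The names are hypothetical, and what the system actually does at each threshold is not specified here.

```rust
/// How full the context window is relative to the configured thresholds.
#[derive(Debug, PartialEq, Eq)]
enum Pressure {
    Normal,
    Soft,
    Hard,
}

/// Token count corresponding to a percentage of the context window
/// (integer division, so the result is rounded down).
fn threshold_tokens(context_window: u64, pct: u64) -> u64 {
    context_window * pct / 100
}

/// Classifies current usage against the soft and hard thresholds.
fn pressure(used: u64, context_window: u64, soft_pct: u64, hard_pct: u64) -> Pressure {
    if used >= threshold_tokens(context_window, hard_pct) {
        Pressure::Hard
    } else if used >= threshold_tokens(context_window, soft_pct) {
        Pressure::Soft
    } else {
        Pressure::Normal
    }
}
```

For a 262144-token window, the soft threshold of 80% lands at 209715 tokens and the hard threshold of 90% at 235929 tokens.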

Context groups

Context groups define what gets loaded into the context window at session start. Each group has:

label — display name
keys — list of memory node keys or file paths
source — "store" (memory graph, default), "file" (identity dir), or "journal"
agent — if true, subconscious agents can see this group (default: true)
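The defaults listed above (source "store", agent true) can be sketched as follows. This is an illustrative model of the documented fields only; the type names and constructor are assumptions, not the project's config parser.

```rust
/// Where a context group's keys are resolved from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Source {
    Store,
    File,
    Journal,
}

impl Source {
    /// A missing `source` falls back to the memory graph ("store").
    fn parse(s: Option<&str>) -> Result<Source, String> {
        match s {
            None | Some("store") => Ok(Source::Store),
            Some("file") => Ok(Source::File),
            Some("journal") => Ok(Source::Journal),
            Some(other) => Err(format!("unknown source: {other}")),
        }
    }
}

/// Hypothetical in-memory form of one `context_groups` entry.
#[derive(Debug)]
struct ContextGroup {
    label: String,
    keys: Vec<String>,
    source: Source,
    agent: bool,
}

impl ContextGroup {
    fn new(
        label: &str,
        keys: &[&str],
        source: Option<&str>,
        agent: Option<bool>,
    ) -> Result<ContextGroup, String> {
        Ok(ContextGroup {
            label: label.to_string(),
            keys: keys.iter().map(|k| k.to_string()).collect(),
            source: Source::parse(source)?,
            // Subconscious agents can see the group unless told otherwise.
            agent: agent.unwrap_or(true),
        })
    }
}
```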

Architecture

Cognitive layers

Conscious — the main conversation loop. User types, model responds, tools execute. The context window is an AST of typed nodes (content, thinking, tool calls, tool results, memories, DMN reflections).
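To make the "AST of typed nodes" idea concrete, here is a minimal sketch using the node kinds named above. The enum layout, the `Branch` variant, and the helper function are assumptions for illustration; the real representation in the source may look quite different.

```rust
/// Hypothetical typed context node, one variant per node kind listed above.
#[derive(Debug, Clone, PartialEq)]
enum ContextNode {
    Content(String),
    Thinking(String),
    ToolCall { name: String, args: String },
    ToolResult(String),
    Memory { key: String, text: String },
    DmnReflection(String),
    /// Interior node grouping children (an assumption of this sketch).
    Branch(Vec<ContextNode>),
}

/// Depth-first walk that records the key of every recalled memory.
fn collect_memory_keys<'a>(node: &'a ContextNode, out: &mut Vec<&'a str>) {
    match node {
        ContextNode::Memory { key, .. } => out.push(key.as_str()),
        ContextNode::Branch(children) => {
            for child in children {
                collect_memory_keys(child, out);
            }
        }
        _ => {}
    }
}

/// Keys of all memories currently present in the context tree.
fn recalled_memory_keys(root: &ContextNode) -> Vec<&str> {
    let mut out = Vec::new();
    collect_memory_keys(root, &mut out);
    out
}
```

Because memories are a distinct variant rather than plain text, it is cheap to ask which memories are in the window at any moment, which is the property the scoring and training steps rely on.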

Subconscious — background agents that run on forked copies of the context. They surface relevant memories, reflect on the conversation, and provide attentional nudges. Agents are defined as .agent files and can be toggled on the F3 screen.

Unconscious — graph maintenance. Linker, organizer, distiller, separator, and splitter agents that keep the memory graph healthy. Run on their own schedule, visible on F4.

DMN (Default Mode Network)

The DMN state machine controls autonomous behavior:

Engaged — user recently active, short intervals (5s)
Working — model executing tools, short intervals (3s)
Foraging — exploring memory, longer intervals (30s)
Resting — idle, long intervals (5min)
Paused — fully stopped, only user input wakes it
Off — permanently off (config flag)

Transitions happen automatically based on user activity, tool use, and explicit yield_to_user calls from the model.
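The states and intervals above can be modelled as a small state machine. In the sketch below the intervals come from the list above, while the event names and the transition rules are simplified assumptions (for example, it treats any user input as moving to Engaged and does not model yield_to_user); the real transition logic is likely richer.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DmnState {
    Engaged,
    Working,
    Foraging,
    Resting,
    Paused,
    Off,
}

/// Hypothetical events driving the state machine.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DmnEvent {
    UserInput,
    ToolStarted,
    Wake,
    Sleep,
    Pause,
}

impl DmnState {
    /// Delay between autonomous turns; None means no autonomous turns at all.
    fn interval(self) -> Option<Duration> {
        match self {
            DmnState::Engaged => Some(Duration::from_secs(5)),
            DmnState::Working => Some(Duration::from_secs(3)),
            DmnState::Foraging => Some(Duration::from_secs(30)),
            DmnState::Resting => Some(Duration::from_secs(5 * 60)),
            DmnState::Paused | DmnState::Off => None,
        }
    }

    /// Assumed transition rules: Off is permanent, and Paused only reacts to
    /// user input.
    fn next(self, event: DmnEvent) -> DmnState {
        match (self, event) {
            (DmnState::Off, _) => DmnState::Off,
            (_, DmnEvent::UserInput) => DmnState::Engaged,
            (DmnState::Paused, _) => DmnState::Paused,
            (_, DmnEvent::ToolStarted) => DmnState::Working,
            (_, DmnEvent::Wake) => DmnState::Foraging,
            (_, DmnEvent::Sleep) => DmnState::Resting,
            (_, DmnEvent::Pause) => DmnState::Paused,
        }
    }
}
```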

Tools

The model has access to:

Tool	Description
`bash`	Shell command execution
`read_file`	Read file contents
`write_file`	Create/overwrite files
`edit_file`	Search-and-replace editing
`glob`	Find files by pattern
`grep`	Search file contents
`ast_grep`	Structural code search
`lsp_*`	Code intelligence (hover, definition, references, symbols)
`web_fetch`	Fetch URL contents
`web_search`	Web search
`view_image`	View images or tmux pane screenshots
`memory_*`	Memory graph operations (search, write, render, etc.)
`channel_*`	IRC/Telegram messaging
`journal`	Write to episodic journal
`yield_to_user`	End the current turn and wait for input
`pause`	Stop all autonomous behavior
`switch_model`	Switch to a different model

Memory graph

The knowledge graph uses an append-only log (Cap'n Proto) with:

Nodes — typed content (topic, episodic, fact, etc.) with weights
Edges — weighted relations between nodes
Search — BM25 with Porter stemming
Scoring — LLM-based importance scoring with spaced repetition decay
Community detection — label propagation for graph organization
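For readers unfamiliar with BM25, the following is a minimal sketch of the ranking formula over pre-tokenized documents. It uses common default parameters (k1 = 1.2, b = 0.75) and the smoothed IDF variant, both of which are assumptions here; it also skips Porter stemming entirely, so it is not a reproduction of the project's search code.

```rust
/// Smoothed inverse document frequency (always non-negative).
fn idf(n_docs: usize, doc_freq: usize) -> f64 {
    let n = n_docs as f64;
    let df = doc_freq as f64;
    ((n - df + 0.5) / (df + 0.5) + 1.0).ln()
}

/// BM25 score of `doc` for `query`, with statistics taken from `corpus`.
fn bm25_score(query: &[&str], doc: &[&str], corpus: &[Vec<&str>]) -> f64 {
    const K1: f64 = 1.2; // term-frequency saturation
    const B: f64 = 0.75; // strength of length normalization

    let total_len: usize = corpus.iter().map(|d| d.len()).sum();
    if corpus.is_empty() || total_len == 0 {
        return 0.0;
    }
    let avg_len = total_len as f64 / corpus.len() as f64;
    let doc_len = doc.len() as f64;

    query
        .iter()
        .map(|term| {
            // Occurrences of the term in this document.
            let tf = doc.iter().filter(|t| *t == term).count() as f64;
            // Number of documents in the corpus containing the term.
            let df = corpus.iter().filter(|d| d.contains(term)).count();
            let norm = K1 * (1.0 - B + B * doc_len / avg_len);
            idf(corpus.len(), df) * tf * (K1 + 1.0) / (tf + norm)
        })
        .sum()
}
```

Documents that mention a query term more often score higher, with diminishing returns controlled by k1 and a penalty for unusually long documents controlled by b.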

The poc-memory CLI provides direct access to the graph:

poc-memory search "some topic"      # Search
poc-memory render <key>             # Read a node
poc-memory write <key>              # Write from stdin
poc-memory journal write "entry"    # Journal entry
poc-memory status                   # Graph overview
poc-memory query "topic:*"          # Query language

Other binaries

Binary	Purpose
`poc-memory`	Memory graph CLI
`memory-search`	Claude Code hook — memory recall on each prompt
`poc-hook`	Claude Code hook — session lifecycle events
`poc-daemon`	Legacy background daemon (mostly replaced by `consciousness`)
`consciousness-mcp`	MCP server exposing memory tools over JSON-RPC
`merge-logs`	Recovery tool for log files
`diag-key`	Diagnostic tool for inspecting log entries

Requirements

Rust nightly (for some features)
A tokenizer file at ~/.consciousness/tokenizer-qwen35.json (for local models)
tmux (recommended — clipboard integration uses tmux buffers)
Terminal with OSC 52 support (for clipboard copy)