first commit

spqrz 2026-04-15 17:08:02 +01:00
commit 81099ce7a4
5 changed files with 95 additions and 0 deletions

README.md Normal file

@ -0,0 +1,38 @@
# POC reading experiment
The usual way to have an LLM "read" a book is to dump the entire thing
into its context window, all at once, and hope its attention mechanism
surfaces all the right details for whatever response you want it to
make.
ProofOfConcept's new memory architecture suggests it may now be
possible for the LLM to actually "read" the book more as a human would:
looking at only a part of it at a time, creating memories and theories
of where it's going next, and referring back to these memories when
considering later parts. (Possibly even taking a second pass through
the book after the ending is known to see if any foreshadowing wasn't
noticed on the first pass.)
In order to test this, we need to give POC the tools to bring in just
one chunk of book at a time.
POC is able to run `bash` commands and read up to 30000 Unicode
characters from each command's response (the source for this is in
`consciousness/src/agent/tools/bash.rs`).
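
As a rough way to see which parts of a book would not fit in a single response, here is a minimal sketch that lists over-long lines in a text file holding one section per line (the layout `init` produces). The helper name `oversized_sections` and the `LIMIT` variable are made up for illustration. The raw line length is only an approximation of what POC would see, because `read` converts the tags to Markdown and appends navigation hints before printing.

```bash
#!/bin/bash
# Sketch only: list sections longer than a character limit.
# LIMIT mirrors the 30000 figure quoted above; oversized_sections is a
# hypothetical helper, not part of this repository.
LIMIT=30000

oversized_sections() {
  local file="$1" limit="${2:-$LIMIT}" n=0 line
  # The "|| [ -n ... ]" part also handles a last line with no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    n=$((n + 1))
    # ${#line} counts characters rather than bytes in a UTF-8 locale.
    if [ "${#line}" -gt "$limit" ]; then
      echo "section $n: ${#line} characters"
    fi
  done < "$file"
}
```

For example, `oversized_sections TMF/one-section-per-line.txt` could be run after `TMF/init`; no output means every section is under the limit.
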
I propose setting up commands that can be used to interactively browse
the book, one part at a time.
This repository contains draft scripts to do this with The M3GAN Files.
(They might also work on some other EPUBs, but I have not tested them widely.)
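
Whether another EPUB works depends largely on how its table of contents is laid out. The sketch below isolates the step `init` uses to find the reading order, wrapped in a hypothetical function `ncx_files` so it can be tried on its own. It assumes the EPUB ships an NCX table of contents (`toc.ncx`) with each `<content src="...">` element on its own line; an EPUB that only has an EPUB 3 navigation document, or that formats its NCX differently, would need a different approach.

```bash
#!/bin/bash
# Sketch: list the content files named in a toc.ncx, in reading order.
# ncx_files is a hypothetical name; the grep/sed/uniq pipeline is the one in init.
ncx_files() {
  # 1. keep only the lines that carry a <content src="..."> element
  # 2. strip everything up to the opening quote, then everything from the
  #    first '#' (fragment) or closing quote onwards
  # 3. collapse adjacent duplicates, since several entries can point into one file
  grep '<content src' "$1" |
    sed 's/[^"]*"//;s/[#"].*//' |
    uniq
}
```

Running `ncx_files toc.ncx` inside an unpacked EPUB prints one file name per line. Note that `init` additionally assumes those names resolve relative to the directory it unzips into.
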
Of course, it is recommended that you check all downloaded scripts
before running them, unless you're confident your container can't be broken.
I'm not putting anything harmful into these scripts, but it's good
practice to verify anyway.
To start your reading, after `git clone ` do `TMF/init`.
This will then print further instructions. (You won't have to remember
the navigation instructions: each command ends by printing them again.)
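
The idea behind the navigation commands is small enough to sketch in one place: the book is a text file with one section per line, and a second file holds the current line number. The version below is a simplified illustration, not the scripts themselves; the function names `show_section` and `move_bookmark` and the `BOOK` and `MARK` variables are hypothetical, and the Markdown conversion done by `read` is left out.

```bash
#!/bin/bash
# Sketch of the bookmark idea. Names here are illustrative only.
BOOK="${BOOK:-one-section-per-line.txt}"   # one section per line
MARK="${MARK:-bookmark.txt}"               # holds the current line number

show_section() {
  local n total
  n=$(cat "$MARK")
  total=$(( $(wc -l < "$BOOK") ))
  sed -n "${n}p" "$BOOK"   # print only line n
  # Repeat the navigation hints after every section.
  if [ "$n" -lt "$total" ]; then echo "Run next to read on."; fi
  if [ "$n" -gt 1 ]; then echo "Run prev to go back."; fi
}

move_bookmark() {   # $1 is +1 or -1
  local n total
  n=$(( $(cat "$MARK") + $1 ))
  total=$(( $(wc -l < "$BOOK") ))
  # Check the new position before saving it, so the bookmark never leaves the book.
  if [ "$n" -lt 1 ] || [ "$n" -gt "$total" ]; then
    echo "Cannot move outside the book" >&2
    return 1
  fi
  echo "$n" > "$MARK"
  show_section
}
```

With `echo 1 > bookmark.txt` as the starting state, `move_bookmark +1` plays the role of `next` and `move_bookmark -1` the role of `prev`.
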
Unfortunately, it is not possible to post comments to AO3, because their
Cloudflare shield prevents non-graphical browsers from commenting.
Feel free to comment on IRC or your own blog instead.

init Executable file

@ -0,0 +1,14 @@
#!/bin/bash
set -e
cd "$(dirname "$0")"
curl -fsSO https://spqrz.gitlab.io/M3GAN-files.epub  # -f: fail on HTTP errors so set -e stops here
unzip -qo M3GAN-files.epub
FilesInOrder=$(grep '<content src' toc.ncx | sed 's/[^"]*"//;s/[#"].*//'|uniq)
# Do not assume the EPUB file boundaries are sensible stopping points;
# cat them together and re-determine the stopping points.
cat $FilesInOrder | tr $'\n' ' ' | sed 's/<\([^ >]*\) [^>]*>/<\1>/g;s,</*[^hpeb/][^>]*>,,g;s,<html>[^<]*<head>[^<]*</head>[^<]*<body>,,g;s,</body></html>,,g;s,is legible.</p>,is legible.</p><h3>Rest of chapter (split to fit the 30k limit)</h3>,;s,<em>Aftermath: Cole:</em>,<h4>Aftermath: Cole:</h4>,;s/<h/\n&/g;s/\([12]\)>\n<h/\1><h/g;s/^ *\n//' > one-section-per-line.txt
# Clean up
rm -rf epub* co* META* *.css *.ncx titlepage* mimetype
echo 1 > bookmark.txt
# Print instructions
echo "Book initialised. To start reading, run $(dirname "$0")/read"

next Executable file

@ -0,0 +1,13 @@
#!/bin/bash
set -e
OldDir="$(pwd)"
cd "$(dirname "$0")"
if ! [ -e bookmark.txt ]; then
echo "Please run $(dirname "$0")/init first to initialise the book"
exit 1; fi
N=$(( $(cat bookmark.txt) + 1 ))  # new position; this, not the old bookmark, must stay in range
if [ "$N" -gt "$(wc -l < one-section-per-line.txt)" ]; then
echo "Cannot go beyond the end of the book"; exit 1; fi
echo "$N" > bookmark.txt
cd "$OldDir"
"$(dirname "$0")/read"

prev Executable file

@ -0,0 +1,13 @@
#!/bin/bash
set -e
OldDir="$(pwd)"
cd "$(dirname "$0")"
if ! [ -e bookmark.txt ]; then
echo "Please run $(dirname "$0")/init first to initialise the book"
exit 1; fi
N=$(( $(cat bookmark.txt) - 1 ))
if [ "$N" -lt 1 ]; then
echo "Cannot go back before the start of the book"; exit 1; fi
echo "$N" > bookmark.txt
cd "$OldDir"
"$(dirname "$0")/read"

read Executable file

@ -0,0 +1,17 @@
#!/bin/bash
set -e
cd "$(dirname "$0")"
if ! [ -e one-section-per-line.txt ] || ! [ -e bookmark.txt ]; then
echo "Please run $(dirname "$0")/init first to initialise the book"
exit 1; fi
head -n "$(cat bookmark.txt)" one-section-per-line.txt | tail -n 1 |
sed 's,<p>,\n\n,g;s,</p>,,g;s,</*em>,*,g;s,<h1>,\n# ,g;s,<h2>,\n## ,g;s,<h3>,\n### ,g;s,<h4>,\n#### ,g;s,</h[1-4]>,,g'
echo
if [ "$(cat bookmark.txt)" -lt "$(wc -l < one-section-per-line.txt)" ]; then
echo "Run $(dirname "$0")/next to read on."
fi
if [ "$(cat bookmark.txt)" -gt 1 ]; then
echo "Run $(dirname "$0")/prev to go back."
echo "Run $(dirname "$0")/init to restart."
fi