Commit graph

5 commits

Author SHA1 Message Date
Kent Overstreet
0e6b5dc8be agent: phase-aware bail script for surface-observe concurrency
bail-no-competing.sh used to bail if any other live agent existed in
the state dir, period. That was too coarse: surface-observe agents run
a multi-step pipeline (surface → organize-search → organize-new →
observe), and the intent is to let a new surface-phase agent start
while an older one finishes its post-surface tail. With the old check
the newer agent always bailed, so surface-observe was effectively
serialized at the slowest cycle time.

Make the script phase-aware:

- oneshot.rs now passes the current phase as argv[2] alongside the pid
  file name. The script writes that phase into its own pid file on
  every step transition, so concurrent agents can read each other's
  phase just by cat'ing the pid files.

- Bail only when another live agent is in the same phase-group as us.
  Groups: "surface" vs. "everything else" (post-surface). At most one
  agent per group alive at a time — surface runs at a higher cadence
  than the organize/observe tail.

- Still clean up stale pid files for dead processes.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-16 15:41:28 -04:00
Kent Overstreet
c300013ce5 improve bail-no-competing.sh 2026-04-11 18:41:44 -04:00
Kent Overstreet
70ee7abea5 Fix restore_from_log panic on Thinking entries, fix bail nullglob
restore_from_log called .message() on all entries including Thinking
entries, which panic. Filter them out alongside Log entries.

Also fix bail-no-competing.sh: without nullglob, when no pid-* files
exist the glob stays literal and always triggers a false bail.

Co-Authored-By: Proof of Concept <poc@bcachefs.org>
2026-04-08 10:39:07 -04:00
ProofOfConcept
5d803441c9 cleanup: kill dead code, fix signal handler safety
- Remove unused now_secs(), parse_json_response, any_alive, Regex import
- Signal handler: replace Mutex with AtomicPtr<c_char> for signal safety
  (Mutex::lock in a signal handler can deadlock if main thread holds it)
- PidGuard Drop reclaims the leaked CString; signal handler just unlinks
- scan_pid_files moved to knowledge.rs as pub helper
- setup_agent_state calls scan_pid_files to clean stale pids on startup

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-26 15:58:59 -04:00
ProofOfConcept
52703b4637 agents: bail script support, pid file simplification, cleanup
- Bail command moved from hardcoded closure to external script
  specified in agent JSON header ("bail": "bail-no-competing.sh")
- Runner executes script between steps with pid file path as $1,
  cwd = state dir. Non-zero exit stops the pipeline.
- PID files simplified to just the phase name (no JSON) for easy
  bash inspection (cat pid-*)
- scan_pid_files helper deduplicates pid scanning logic
- Timeout check uses file mtime instead of embedded timestamp
- PID file cleaned up on bail/error (not just success)
- output() tool validates key names (rejects pid-*, /, ..)
- Agent log files append instead of truncate
- Fixed orphaned derive and doc comment on AgentStep/AgentDef
- Phase written after bail check passes, not before

Co-Authored-By: Kent Overstreet <kent.overstreet@linux.dev>
2026-03-26 15:20:29 -04:00