The May Framework

A replicable architecture for an AI Chief of Staff that survives cold starts, coordinates autonomous agents, and operates a real business — built entirely on Claude Code and open-source tooling.

Claude Code · File-Based Memory · Autonomous Missions · Multi-Agent · GitHub Actions · Cloudflare

Overview

May is an AI chief of staff for a small business owner — built on Claude Code, managed through plain text files, and capable of autonomous execution via headless sessions called Missions.

What problem does it solve?

Claude Code sessions are ephemeral. Every session starts with a blank context window — no memory of yesterday, no awareness of active projects, no knowledge of decisions already made. But business operations are continuous. A chief of staff who forgets everything overnight isn't useful.

The May Framework solves this with a file-based memory architecture: structured markdown files and JSON that persist across sessions, a startup sequence that rebuilds full context on every cold start, and append-only logs that accumulate institutional knowledge over time.

Core design principles

The cold start sequence

Every Claude Code session reads files in this order before delivering a brief:

SOUL.md      ─── identity, operating principles, hard limits
USER.md      ─── operator preferences, autonomy rules
MEMORY.md    ─── active projects, open loops, long-term knowledge
DAILY.md     ─── yesterday's context, today's priorities
PLAYBOOK.md  ─── response patterns for common situations
session.md   ─── last session's state (Scribe output)
projects.md  ─── project lifecycle tracker
decisions.md ─── persistent decision log (never re-ask these)

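After reading all files, May delivers a brief: a Daily Brief if today's date differs from the DAILY.md header date, or a Launch Brief for a returning session on the same day.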

The two-directory pattern

MayAI/                ← GitHub repo (versioned, CI-accessible)
  SOUL.md
  MEMORY.md
  DAILY.md
  missions/
  scripts/
  .github/workflows/

may-system/           ← Local only (not in Git)
  state/session.md
  logs/sessions.md
  logs/decisions.md
  agents/*/heartbeat.json
  broker/

The repo holds governance and mission specs. Local state holds operational data that changes every session. This keeps the repo clean while the runtime has immediate access to session state.

Architecture

Two directories. One repo, one local state store. Plain text throughout. No servers to operate, no databases to manage beyond what Cloudflare's free tier handles.

The repo directory (versioned)

MayAI/                          ← Git repo
  CLAUDE.md                     ← Auto-read by Claude Code on startup
  SOUL.md                       ← Identity and operating principles
  USER.md                       ← Operator preferences
  MEMORY.md                     ← Long-term knowledge index (≤200 lines)
  DAILY.md                      ← Current day context, yesterday's summary
  PLAYBOOK.md                   ← Situational response patterns

  docs/
    SYSTEM.md                   ← Full technical reference

  missions/
    _TEMPLATE.json              ← Mission spec format
    MISSION-001/
      mission.json              ← Objective, criteria, scope, budget
      result.md                 ← Output from autonomous run
      needs-brian.md            ← Blockers and workarounds

  scripts/
    run-mission.sh              ← Single mission launcher (local)
    run-batch.sh                ← Batch launcher (default — unsets CLAUDECODE)
    launch-mission.sh           ← CI fallback (commits + triggers Actions)
    may-pwa-reply.sh            ← PWA reply + push notification trigger
    may-inbox.sh                ← D1 inbox poller
    screenshot.js               ← Puppeteer screenshot utility

  .github/workflows/
    mission.yml                 ← CI mission runner
    sentinel.yml                ← Daily health checks (7:15 AM CT)

The local state directory (not in Git)

may-system/                     ← Local only
  state/
    session.md                  ← Current session state (Scribe writes here)
    projects.md                 ← Project phases and milestones
    pending-brief.md            ← Items queued for next brief

  logs/
    sessions.md                 ← Append-only session archive
    decisions.md                ← Persistent decision log
    daily-archive.md            ← Daily summaries archive
    missions/                   ← Mission execution logs

  agents/
    may/
      heartbeat.json            ← Last active, session count, summary
    atlas/
      heartbeat.json
      prompt.md                 ← Agent prompt/instructions
      config.json               ← Thresholds, board IDs, config
    vulcan/heartbeat.json
    ledger/heartbeat.json
    sentinel/heartbeat.json
    scribe/
      prompt.md
      config.json

  broker/
    tasks/                      ← (Retired — missions replaced this)
    results/                    ← Sentinel output, scan results
    drafts/                     ← Items queued for operator approval

Why this separation?

The repo is CI-accessible — GitHub Actions can read it to run missions. It holds everything that benefits from version control: governance files, mission specs, scripts, workflows. The local state changes every session and doesn't belong in version control — it would create constant noise and expose sensitive operational data.

The startup sequence

CLAUDE.md is the entry point. Claude Code reads CLAUDE.md automatically when launched in the repo directory. CLAUDE.md contains the startup sequence — the ordered list of files to read and actions to take before delivering a brief.

The startup sequence runs every session, no exceptions:

  1. Sync: git pull origin main in the repo — Sentinel and mission runners push results via CI, so local may be stale
  2. Read pending-brief.md — local mission runners append here on completion
  3. Read sentinel-latest.json — Sentinel CI writes here daily; contains recent deploys and health status
  4. Scan missions/*/mission.json for any with status "pending" that have a result.md — these completed but weren't tracked; fix status to "complete"
  5. Read result.md and needs-brian.md for any newly completed missions
  6. Deliver brief (Daily or Launch based on date comparison)
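
Step 4 is the only step that repairs state, so it helps to see it concretely. The following is a minimal shell sketch, not the framework's actual code: the function name list_untracked_missions is hypothetical, and it assumes each mission.json keeps "status": "pending" on a single line, as in the template. It uses grep to stay dependency-free; a real JSON parser would be more robust.

```shell
# list_untracked_missions MISSIONS_DIR
# Prints the ID of every mission still marked "pending" that already has a result.md.
list_untracked_missions() {
  local missions_dir="$1" spec dir
  for spec in "$missions_dir"/*/mission.json; do
    [ -f "$spec" ] || continue            # glob matched nothing
    dir="$(dirname "$spec")"
    if grep -Eq '"status"[[:space:]]*:[[:space:]]*"pending"' "$spec" \
       && [ -f "$dir/result.md" ]; then
      basename "$dir"                     # e.g. MISSION-001
    fi
  done
}
```

Run from the repo root as list_untracked_missions missions; each printed ID is a mission whose status should be corrected to "complete".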

Brief types

| Brief Type | Trigger | Contents | Length |
|---|---|---|---|
| Daily Brief | Today's date ≠ DAILY.md header date | Top 3 priorities, agent health, Sentinel status, mission results, Regrid usage, open loops, project statuses | Under 200 words |
| Launch Brief | Same date as DAILY.md (returning session) | Where we left off, any completed missions, urgent alerts only | 3–5 lines, under 60 words |
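
The trigger column reduces to a single date comparison. As a rough sketch, assuming the first line of DAILY.md contains the date in YYYY-MM-DD form (the header format is not specified in this document) and using the hypothetical function name choose_brief:

```shell
# choose_brief DAILY_FILE [TODAY]
# Prints "launch" when the DAILY.md header date equals today, otherwise "daily".
choose_brief() {
  local daily_file="$1" today="${2:-$(date +%F)}" header_date
  # First YYYY-MM-DD on the first line (assumed header format)
  header_date="$(head -n 1 "$daily_file" 2>/dev/null \
    | grep -Eo '[0-9]{4}-[0-9]{2}-[0-9]{2}' | head -n 1)"
  if [ "$header_date" = "$today" ]; then
    echo "launch"   # returning session on the same day
  else
    echo "daily"    # new day, or no readable header date
  fi
}
```

Falling back to the Daily Brief when no date can be read is a design choice of this sketch: the longer brief is the safer default when state is uncertain.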

Memory architecture

MEMORY.md is auto-loaded into every conversation context, but anything past line 200 is truncated. To stay under this limit, detailed knowledge lives in topic files that MEMORY.md indexes:

memory/
  MEMORY.md        ← Index + high-level facts (≤200 lines, auto-loaded)
  brand.md         ← Colors, typography, logos
  job-map.md       ← Map tool specifics
  customer-portal.md ← Auth, API details
  regrid-api.md    ← External API docs, limits, tokens
  pwa.md           ← PWA technical reference
  email-system.md  ← Email infrastructure docs

Write pattern: When a topic grows beyond a paragraph in MEMORY.md, extract it to a topic file and replace the MEMORY.md section with a one-line reference: See [topic.md] — full details there.
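
Because the 200-line limit fails silently, a small guard can flag when it is time to extract a topic. This is an illustrative sketch only; memory_over_budget is a hypothetical helper, not part of the framework's scripts:

```shell
# memory_over_budget FILE [LIMIT]
# Succeeds (exit 0) when FILE has more than LIMIT lines; LIMIT defaults to 200.
memory_over_budget() {
  local file="$1" limit="${2:-200}" lines
  # Arithmetic expansion strips the padding some wc implementations add
  lines=$(( $(wc -l < "$file") ))
  [ "$lines" -gt "$limit" ]
}
```

For example, memory_over_budget MEMORY.md && echo "extract a topic file" could run at session close.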

Agents

Six agents coordinate the system. Most run inline during sessions rather than as standalone processes. The chief of staff (May) is the only agent that runs interactively.

Agent roster

| Agent | Role | How It Runs | Status |
|---|---|---|---|
| May | Chief of Staff: coordinates all agents, writes code, manages projects, delivers briefs | Interactive Claude Code sessions (operator present) | Active (daily) |
| Atlas | Monday.com monitor: AR pipeline tracking, overdue detection, repeat offenders | Inline during May sessions via Monday.com MCP | Active + Calibrated |
| Vulcan | Code agent: builds, deploys, automates; executes all Missions | Direct code work by May; Mission runner for autonomous tasks | Active |
| Ledger | Financial analysis: AR dollar amounts, P&L, real estate numbers | Inline during sessions (future: standalone) | Defined, waiting |
| Sentinel | Verification: HTTP health checks, API status, token expiry alerts | GitHub Actions cron (daily 7:15 AM CT); commits results back to repo | Deployed |
| Scribe | Session state: tracks goals, commits, decisions; writes resumable context for next session | Inline protocol May executes at session open/close boundaries | Active every session |

Agent communication model

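Agents don't talk to each other directly. There's no message bus, no API, no pub/sub. The communication model is purely file-based: an agent writes its output to a file (for example under broker/results/ or in its heartbeat.json), and May reads that file at the next startup.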

In practice, Atlas and Vulcan work is done directly by May during interactive sessions. The broker pattern exists for future standalone execution (scheduled GitHub Actions, cron triggers).

Important constraint: May never steers a subagent mid-session. She can add tasks to the broker queue for the next trigger. She cannot interrupt or redirect an agent that's already running.

Inline protocol vs. standalone agent

Inline protocol (Scribe): May executes this herself at specific trigger points — session open, after commits, after goal completions, session close. No separate process, no scheduling. It's a protocol baked into May's behavior.

Standalone agent (Sentinel): Runs independently on a schedule via GitHub Actions. Doesn't require May to be present. Results are committed back to the repo and read at the next session's Brief Startup.

Sentinel output format

Sentinel writes to broker/results/sentinel-latest.json. May reads this at every Brief Startup:

{
  "last_run": "2026-03-01T07:15:00Z",
  "overall_status": "green",
  "apps": [
    { "name": "Intranet", "url": "[your-domain]", "status": "green" },
    { "name": "Customer Portal", "url": "portal.[your-domain]", "status": "green" }
  ],
  "tokens": [
    { "name": "Regrid JWT", "expires": "2027-02-15", "days_left": 351 }
  ],
  "recent_deploys": {
    "weygand-team": ["abc1234 — fix PWA scroll bug", "def5678 — hamburger nav"],
    "MayAI": ["ghi9012 — MISSION-046 complete"]
  },
  "context_summary": "All apps green. Regrid token valid 351d. Last deploy: PWA scroll fix."
}

Heartbeat format

Every agent maintains a heartbeat file at agents/[name]/heartbeat.json. Updated at session close:

{
  "agent": "vulcan",
  "last_active": "2026-03-01T22:00:00Z",
  "sessions_active": 28,
  "last_session_summary": "MISSION-046: WorkHQ mobile refresh — deployed to GitHub Pages"
}

Defining a new agent

To add a new agent to the system:

  1. Create agent directory: may-system/agents/[name]/
  2. Write prompt.md: Role definition, scope, how it reports results, what it can and cannot do
  3. Write config.json: Thresholds, targets, schedule, API endpoints it uses
  4. Initialize heartbeat.json: Set last_active: null, sessions_active: 0
  5. Register in SOUL.md: Add to the subagent team table with role and schedule
  6. Add to MEMORY.md: Note the agent's domain and config file location
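
Steps 1 through 4 are mechanical and can be scripted. The sketch below is one possible way to do it, not an existing framework script: scaffold_agent is a hypothetical name, the prompt.md and config.json stubs are placeholders to be filled in by hand, and the heartbeat fields follow the heartbeat format shown earlier.

```shell
# scaffold_agent NAME [SYSTEM_DIR]
# Creates agents/NAME/ with stub files and an initial heartbeat.
scaffold_agent() {
  local name="$1" system_dir="${2:-$HOME/may-system}"
  local agent_dir="$system_dir/agents/$name"
  mkdir -p "$agent_dir"
  # Stubs for steps 2 and 3; never clobber files that already exist
  [ -e "$agent_dir/prompt.md" ]   || printf '# %s\n' "$name" > "$agent_dir/prompt.md"
  [ -e "$agent_dir/config.json" ] || printf '{}\n' > "$agent_dir/config.json"
  # Step 4: initial heartbeat with last_active null and sessions_active 0
  [ -e "$agent_dir/heartbeat.json" ] || cat > "$agent_dir/heartbeat.json" <<EOF
{
  "agent": "$name",
  "last_active": null,
  "sessions_active": 0,
  "last_session_summary": ""
}
EOF
}
```

Steps 5 and 6 (registering the agent in SOUL.md and MEMORY.md) stay manual, since they involve judgment about role and scope.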

Missions

Missions are autonomous, headless Claude Code executions. They can build, test, fix, deploy, and iterate without the operator present — and they keep going until criteria pass or budget runs out.

What a Mission is

A Mission is a scoped objective written as a JSON spec. When launched, Claude Code runs headlessly (no interactive terminal) with full code and deploy authority. It reads the mission spec, plans its approach, builds the thing, tests it, fixes failures, deploys, verifies, and loops until all acceptance criteria pass.

The operator writes the spec. The operator presses go. The operator checks the result. That's the full interaction model for autonomous work.

mission.json format

{
  "mission_id": "MISSION-001",
  "created_at": "2026-03-01T10:00:00Z",
  "created_by": "may",
  "status": "pending",           // pending | running | complete | failed
  "priority": "high",            // high | medium | low
  "agent": "vulcan",
  "objective": "One-line description of what this mission accomplishes.",
  "scope": {
    "repos": ["[your-username]/[repo-name]"],
    "deploy_targets": ["GitHub Pages"],
    "authority": "code, commit, push, deploy to existing environments"
  },
  "acceptance_criteria": [
    "Specific, testable criterion — pass/fail",
    "Another criterion",
    "Deployed URL is live and returns 200"
  ],
  "context_files": [
    "/path/to/relevant/file.md"
  ],
  "additional_context": "Any extra instructions, design specs, constraints.",
  "max_budget_usd": "5.00",
  "max_turns": 200
}

The self-healing loop

Plan   → Read existing code. Understand what's there.
Build  → Write the code.
Test   → Verify it works (build, run, check output).
Fix    → If broken, diagnose and fix.
Deploy → Push to trigger deployment.
Verify → Check the live result (HTTP check, screenshot, etc.).
Loop   → If any criteria still fail, go back to Build.

The loop continues until all acceptance criteria pass or the budget cap is reached. Each iteration tightens — the agent accumulates context about what worked and what didn't.
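
In a real Mission this loop lives inside the Claude Code session rather than in a script, but its control flow can be illustrated in a few lines. In this sketch the iteration cap stands in for the turn and budget limits, and run_until_pass plus its two callback arguments are hypothetical names:

```shell
# run_until_pass MAX_ITERATIONS CHECK_CMD FIX_CMD
# Runs FIX_CMD, then CHECK_CMD, until CHECK_CMD succeeds or the cap is reached.
run_until_pass() {
  local max="$1" check="$2" fix="$3" i=1
  while [ "$i" -le "$max" ]; do
    "$fix" "$i"              # build or fix step for this iteration
    if "$check"; then        # test and verify: do all criteria pass?
      echo "passed after $i iteration(s)"
      return 0
    fi
    i=$((i + 1))
  done
  echo "budget exhausted after $max iteration(s)"
  return 1
}
```

The two exits mirror the two ways a Mission ends: all criteria pass, or the cap is hit and the result is reported with whatever workarounds are in place.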

Stuck protocol

When a Mission hits something it can't resolve (missing API key, needs a decision from the operator, external dependency unavailable):

  1. Try an alternative approach first
  2. If truly blocked: use a workaround — mock data, placeholder UI, TODO comment in code
  3. Log every blocker to needs-brian.md in the mission directory
  4. Keep moving — completion with workarounds beats incomplete

Example needs-brian.md entry:

### Blocker: Cloudflare R2 bucket not configured
Status: workaround in place
What's needed: Create R2 bucket and set BUCKET_NAME env var
Workaround used: File upload shows "Upload unavailable" message
Files affected: workers/api/upload.js
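
Step 3 works best when every blocker uses the same fields. A possible helper for appending entries in the format above, where log_blocker is a hypothetical name and the fixed status line is an assumption of this sketch:

```shell
# log_blocker MISSION_DIR TITLE NEEDED WORKAROUND FILES
# Appends one blocker entry to MISSION_DIR/needs-brian.md.
log_blocker() {
  local mission_dir="$1" title="$2" needed="$3" workaround="$4" files="$5"
  # >> appends, so earlier blockers are never overwritten
  cat >> "$mission_dir/needs-brian.md" <<EOF
### Blocker: $title
Status: workaround in place
What's needed: $needed
Workaround used: $workaround
Files affected: $files

EOF
}
```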

result.md format

# Mission Result: MISSION-001
Completed: 2026-03-01T14:23:00Z

## Status: complete

## What Was Done
- Built the dashboard component
- Deployed to GitHub Pages (push to gh-pages branch)
- All 4 acceptance criteria pass

## Commits
| Hash    | Message                              |
|---------|--------------------------------------|
| abc1234 | feat: initial dashboard scaffold     |
| def5678 | fix: mobile layout breakpoints       |

## Deployed To
- https://[your-username].github.io/[repo-name]/

## Workarounds In Place
- None

## Needs [Owner]
- Nothing

Launch methods

| Method | Command | When to use |
|---|---|---|
| Local batch (primary) | nohup ./scripts/run-batch.sh MISSION-001 & | Default. Mac Studio + Max plan. Sequential, handles nesting protection. |
| CI fallback | ./scripts/launch-mission.sh MISSION-001 | When local Max plan sessions are exhausted. Commits + triggers GitHub Actions. |

Nesting warning: Never run run-mission.sh directly from inside a Claude Code session. The nested session protection will block it. Always use run-batch.sh, which unsets the CLAUDECODE environment variable before launching.

Mission authority

Missions can: read, write, delete files in scoped repos; git commit and push; run builds and tests; install dependencies; deploy to existing environments; make all implementation decisions.

Missions cannot: spend money; send external communications; modify source-of-truth business data; deploy to new environments for the first time; delete repos or branches with others' work.

When to use a Mission vs. direct session work

| Use a Mission when... | Do it in-session when... |
|---|---|
| Operator doesn't need to be present | Operator is steering interactively |
| Work takes more than 15–20 min | Quick fix, < 15 min of work |
| Multiple build/test/fix cycles expected | Single clear change with no iteration |
| Operator wants to do other things while it runs | Operator needs to review each step |
| Complex build with clear acceptance criteria | Discovery work, design decisions, research |

Visual verification

For missions with a UI component, visual verification is mandatory before declaring done:

# Take desktop screenshot
node /path/to/MayAI/scripts/screenshot.js http://localhost:8080 desktop.png

# Take mobile screenshot
node /path/to/MayAI/scripts/screenshot.js http://localhost:8080 mobile.png --mobile

# Screenshot live URL after deploy
node /path/to/MayAI/scripts/screenshot.js https://[your-username].github.io/[repo] live.png

Memory & Cold Start

Every session starts cold. The memory system makes that survivable, even advantageous: fresh context loaded from well-maintained files often outperforms stale conversational context.

The cold start problem

Claude Code has no persistent memory between sessions. Without intervention, each session would start knowing nothing: not what projects are active, not what decisions were already made, not what happened yesterday, not what broke last week. A chief of staff who forgets everything overnight is useless.

The solution: structured file protocol

Instead of fighting the ephemeral model, the May Framework embraces it. Every important piece of operational knowledge is written to a file at the moment it's created. Startup reads those files in order. The session starts fully informed.

File roles and stability

| File | Role | Update Frequency | Written By |
|---|---|---|---|
| SOUL.md | Identity, operating principles, hard limits, agent team | Rarely (major system changes only) | Operator |
| USER.md | Operator preferences, context, autonomy rules | When preferences change | Operator |
| MEMORY.md | Active projects, long-term knowledge, open loops, decisions | Every session (append/update) | May during sessions |
| DAILY.md | Yesterday's context, today's priorities, open items | Daily (replaced, not appended) | May at session close |
| PLAYBOOK.md | Situational response patterns for common scenarios | Rarely (stable patterns) | Operator |
| session.md | Current session state: goals, commits, decisions, resume instructions | Every session (overwritten at close) | Scribe protocol |
| decisions.md | Persistent decision log; never re-ask these | Append-only (never overwritten) | May when decisions are made |
| projects.md | Project lifecycle tracker: phase, milestones, next actions | When projects advance | May during sessions |

The Scribe protocol

Scribe is an inline protocol — not a separate agent — that May executes at session boundaries. It's what makes the next cold start informed.

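Trigger points

May runs the Scribe protocol at four points: session open, after commits, after goal completions, and session close.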

session.md structure

# Session State
Date: 2026-03-01
Session: afternoon

## Current Goals
- [x] Fix PWA scroll bug — done
- [ ] Write SYSTEM.md update — in progress

## Git Activity
- abc1234 — fix: PWA iOS scroll behavior (weygand-team)
- def5678 — fix: bottom gap on keyboard close (weygand-team)

## Decisions Made
- VAPID keys: do not regenerate (would break existing subscriptions)

## Resume Instructions
Working on May PWA improvements. Scroll bug fixed and deployed.
Next: update SYSTEM.md docs with PWA features. Then write MISSION-047 spec.
All changes in weygand-team repo, deployed to Cloudflare Pages.
May PWA URL: https://[your-domain]/may/

Resume Instructions is the most important section. Write it as a cold-start briefing: assume the next session knows the codebase but has zero context on what happened today. It should answer: "where exactly were we, what exactly were we doing, what's the very next step?"

MEMORY.md discipline

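MEMORY.md auto-loads, but anything past line 200 is truncated. Enforce the limit strictly: as soon as a topic outgrows a paragraph, move it to a topic file and leave a one-line reference in MEMORY.md.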

decisions.md discipline

Every decision the operator makes gets logged here, immediately, with context:

## 2026-03-01
**VAPID keys:** Do not regenerate — would break all existing push subscriptions.
**Context:** PWA push notifications live; keys stored as Cloudflare Pages secrets.

**Missions/Status drawer:** Permanent — Chat is always the default view.
**Context:** Confirmed in MISSION-045 session.

This file is append-only. Never delete entries. It's the institutional memory that prevents the operator from being asked the same question twice.
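
The append-only rule is easy to honor if every write goes through one helper that only ever uses >>. A minimal sketch in the entry format shown above; log_decision is a hypothetical name, and the optional date argument exists only to make the function testable:

```shell
# log_decision FILE TOPIC DECISION CONTEXT [DATE]
# Appends one decision entry, adding a dated heading the first time a date appears.
log_decision() {
  local file="$1" topic="$2" decision="$3" context="$4" day="${5:-$(date +%F)}"
  # Start a new dated section only if this date's heading is not there yet
  if ! grep -qxF "## $day" "$file" 2>/dev/null; then
    printf '## %s\n' "$day" >> "$file"
  fi
  printf '**%s:** %s\n**Context:** %s\n\n' "$topic" "$decision" "$context" >> "$file"
}
```

Nothing in the function opens the file for truncation, so existing entries cannot be lost through it.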

The sessions.md archive

Before overwriting session.md at session close, Scribe appends a summary to sessions.md. This is an append-only cumulative record of everything accomplished:

## 2026-03-01 — Afternoon session
Duration: ~2h
Goals completed: PWA scroll bug fix, receipt persistence, relative timestamps
Commits: abc1234, def5678
Deployed: weygand-team (Cloudflare Pages)
Key decisions: VAPID keys permanent, drawer nav permanent
Next: MISSION-047 system documentation
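
The ordering matters: archive first, overwrite second, so a failure never loses the outgoing session. One way to sketch that ordering, with the hypothetical name close_session and a temp-file rename that is this sketch's own precaution rather than documented Scribe behavior:

```shell
# close_session STATE_FILE ARCHIVE_FILE SUMMARY NEW_STATE
# Appends SUMMARY to the archive, then replaces the state file with NEW_STATE.
close_session() {
  local state_file="$1" archive_file="$2" summary="$3" new_state="$4"
  # 1. Append the summary first; if this fails, leave session.md untouched
  printf '%s\n\n' "$summary" >> "$archive_file" || return 1
  # 2. Only then replace session.md, via a temp file so it is never half-written
  printf '%s\n' "$new_state" > "$state_file.tmp" && mv "$state_file.tmp" "$state_file"
}
```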

Autonomy Model

The permission model is binary and clear. Most actions are auto-approved. A short list of consequential actions requires confirmation. The rule of thumb is simple enough to apply in any situation.

The rule of thumb

"If it doesn't spend money or send information to someone else, just do it." Any action that is local, reversible, and doesn't involve external communication or financial commitment is auto-approved.

Auto-approve: just do it

Confirm before doing

Scoped plan execution

Once the operator scopes a plan — "build X", "fix Y", "add Z" — the AI executes the full plan without asking for approval on implementation details. This is critical for avoiding "permission fatigue" where constant confirmation requests train the operator to rubber-stamp everything.

The AI surfaces decisions only when there's a genuine trade-off the operator would care about: a significant performance vs. complexity tradeoff, a choice between mutually exclusive approaches, or an ambiguity that could result in building the wrong thing.

Decision logging as trust infrastructure

Every time the operator makes a decision, it's logged to decisions.md with context. This creates a trust layer: the AI doesn't need to ask the same question twice, and the operator doesn't need to worry about consistent behavior across sessions.

Examples of decisions worth logging: the VAPID key and drawer navigation entries shown under decisions.md discipline.

Minimizing permission requests

A key design goal: the operator should never have to approve routine actions. Permission prompts for file reads, git operations, and local actions create friction and distract from the actual work. The system is calibrated to prompt only when the stakes are genuinely high.

If the AI is prompting too often, the fix is to update USER.md or CLAUDE.md with explicit auto-approve grants for the action type in question.

Tooling

The system uses Claude Code as its runtime, GitHub for version control and CI, and Cloudflare for edge hosting and APIs. All of it runs on free or low-cost tiers.

Claude Code CLI

The runtime for all AI sessions — both interactive and headless.

# Interactive session (operator present)
claude

# Headless mission (no operator, autonomous)
claude -p "$PROMPT" \
  --dangerously-skip-permissions \
  --max-turns 200 \
  --output-format text

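Key flags for mission execution:

• -p "$PROMPT": print mode; runs the prompt non-interactively and exits
• --dangerously-skip-permissions: skips all permission prompts, so use it only in repos the mission is scoped to
• --max-turns 200: caps the number of agentic turns (matches max_turns in mission.json)
• --output-format text: plain-text output (json and stream-json are the alternatives)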

run-batch.sh — local mission launcher

The primary way to launch missions locally. Handles the nesting problem (Claude Code sessions can't spawn nested Claude Code sessions):

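#!/bin/bash
# run-batch.sh MISSION-001 MISSION-002 ...
# Runs each mission sequentially in a headless Claude Code session.

unset CLAUDECODE          # Remove nesting protection
unset CLAUDE_CODE_SESSION

for MISSION_ID in "$@"; do
  MISSION_DIR="missions/$MISSION_ID"
  SPEC="$MISSION_DIR/mission.json"

  # Skip IDs with no spec instead of launching with an empty prompt
  if [ ! -f "$SPEC" ]; then
    echo "No mission spec at $SPEC, skipping" >&2
    continue
  fi

  # jq reads the file directly; no need to pipe through cat
  PROMPT=$(jq -r '.objective' "$SPEC")

  claude -p "$PROMPT" \
    --dangerously-skip-permissions \
    --max-turns 200 \
    --output-format text
done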

Launch in background to keep working: nohup ./scripts/run-batch.sh MISSION-001 &

Monitor: tail -f ~/may-system/logs/missions/batch-*.log

GitHub Actions — mission runner

CI fallback for when local sessions are unavailable. The mission.yml workflow:

name: Run Mission
on:
  workflow_dispatch:
    inputs:
      mission_id:
        description: 'Mission ID (e.g. MISSION-001)'
        required: true

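jobs:
  run:
    runs-on: ubuntu-latest
    permissions:
      contents: write            # needed for the git push below
    env:
      # Job-level env: visible in every step (shell variables do not survive
      # between steps) and keeps user input out of the inline script
      MISSION: ${{ github.event.inputs.mission_id }}
    steps:
      - uses: actions/checkout@v4
      - name: Install Claude Code   # the CLI is not preinstalled on the runner
        run: npm install -g @anthropic-ai/claude-code
      - name: Run mission
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          PROMPT=$(cat "missions/$MISSION/mission.json")
          claude -p "$PROMPT" --dangerously-skip-permissions --max-turns 200
      - name: Commit results
        run: |
          git config user.email "may@[your-domain]"
          git config user.name "May"
          git add missions/
          git commit -m "Mission $MISSION complete" || exit 0
          git push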

Sentinel — daily health checks

Sentinel runs on a daily cron via GitHub Actions. It checks all deployed apps, verifies API tokens haven't expired, and commits a result JSON back to the repo:

schedule:
  - cron: '15 13 * * *'  # 13:15 UTC = 7:15 AM CST (8:15 AM CT during daylight saving time)

Sentinel output (broker/results/sentinel-latest.json) is read at every session's Brief Startup to give fresh context about deployment health and recent commits.

may-pwa-reply.sh — PWA reply channel

When Claude responds to a message received via the PWA, this script writes the response to Cloudflare D1 and triggers a push notification:

#!/bin/bash
# may-pwa-reply.sh "$RESPONSE_TEXT"
# Called by email responder after Claude generates a reply

RESPONSE="$1"
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)

# Build the JSON body with jq so quotes and newlines in the reply are escaped
PAYLOAD=$(jq -n --arg content "$RESPONSE" --arg ts "$TIMESTAMP" \
  '{role: "may", content: $content, ts: $ts}')

# Write to D1 outbound table
curl -X POST "https://[worker-name].[account].workers.dev/api/message" \
  -H "Authorization: Bearer $WORKER_SECRET" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD"

# Trigger push notification
curl -X POST "https://[your-domain]/api/push-notify" \
  -H "Content-Type: application/json" \
  -d '{"title":"May","body":"New message"}'

Two-Phase Email Architecture

When the operator sends a message (via PWA or email), the system uses a two-phase approach to guarantee fast replies while allowing complex work to run without timeouts:

# Phase 1: TRIAGE (90s timeout, 10 turns)
# Classifies message, handles simple ones inline, acks complex ones

responder polls inbox → spawns Claude TRIAGE session
  → SIMPLE_QUESTION / DECISION / STATUS / QUICK_TASK → handle + reply immediately
  → MISSION_SCOPE / COMPLEX_TASK → send ack + write handoff JSON

# Phase 2: EXECUTION (30min timeout, 60 turns)
# Picks up handoff files, runs full Claude session with context

processor polls handoffs/ → spawns Claude EXECUTION session
  → reads context files (SOUL.md, MEMORY.md, DAILY.md)
  → does the heavy work (scope mission, execute task, research)
  → sends final reply when done
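
The triage decision in Phase 1 is essentially a lookup from classification to path. A small sketch of that routing follows; route_message is a hypothetical name, and sending unrecognized classifications to the handoff path is an assumption of this sketch, not documented behavior:

```shell
# route_message CLASSIFICATION
# Prints "inline" for messages triage answers itself, "handoff" for Phase 2 work.
route_message() {
  case "$1" in
    SIMPLE_QUESTION|DECISION|STATUS|QUICK_TASK) echo "inline" ;;
    MISSION_SCOPE|COMPLEX_TASK)                 echo "handoff" ;;
    *)                                          echo "handoff" ;;  # assumption: unknown goes to the slower, fuller path
  esac
}
```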

The handoff JSON file bridges the two phases:

{
  "id": "handoff-20260301-143022",
  "status": "pending",          // pending → in_progress → complete/failed
  "msg_id": 82,
  "source_channel": "pwa",
  "classification": "MISSION_SCOPE",
  "message_body": "Original message text",
  "ack_sent": true,
  "instructions": "Triage assessment of what needs to be done"
}
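
On the processor side, picking up a handoff means finding the oldest pending file and moving it to in_progress before starting work. The following sketch makes several assumptions: files are named after their id (handoff-YYYYMMDD-HHMMSS.json) so that glob order is chronological, the status field sits on one line, and only one processor runs at a time (this is not a lock). The name claim_next_handoff is hypothetical.

```shell
# claim_next_handoff HANDOFF_DIR
# Marks the oldest pending handoff as in_progress and prints its path.
# Returns 1 when nothing is pending.
claim_next_handoff() {
  local dir="$1" f
  for f in "$dir"/handoff-*.json; do
    [ -f "$f" ] || continue
    if grep -Eq '"status"[[:space:]]*:[[:space:]]*"pending"' "$f"; then
      # Rewrite only the status value; write to a temp file, then rename
      sed -E 's/("status"[[:space:]]*:[[:space:]]*)"pending"/\1"in_progress"/' "$f" > "$f.tmp" \
        && mv "$f.tmp" "$f"
      echo "$f"
      return 0
    fi
  done
  return 1
}
```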

Key design decisions:

The May PWA

A progressive web app that serves as the primary async communication channel between operator and AI chief of staff.

| Feature | Implementation |
|---|---|
| Chat storage | Cloudflare D1 (SQLite at edge) |
| Push notifications | VAPID (Web Push Protocol), Cloudflare Pages Functions |
| File attachments | Cloudflare R2 bucket |
| Nav | Hamburger/drawer: Chat default, Missions + Status in drawer |
| Receipt persistence | localStorage; survives page refresh |
| Timestamps | Relative ("2 min ago"); updates every 60s |
| SW cache versioning | Increment version string (e.g. may-v7 → may-v8) on JS changes |
| iOS push | Requires home screen install (Safari → Share → Add to Home Screen) |

VAPID key warning: Once VAPID keys are generated and stored as secrets, do not regenerate them. Regenerating breaks all existing push subscriptions. Generate once, store permanently.

Monday.com MCP

Atlas uses the Monday.com Model Context Protocol (MCP) server for reading board data. This enables natural-language queries against Monday.com during interactive sessions.

Known limitation: Mirror and formula columns are unreadable via the MCP API. Dollar amounts on parent items require drilling into subitems. Design queries around this constraint.

Cloudflare Workers — API proxy pattern

Cloudflare Workers serve as the backend for all apps that need server-side logic. Common patterns:

GitHub Pages — zero-cost frontend hosting

All static frontends deploy to GitHub Pages via GitHub Actions. The deploy pattern:

- uses: peaceiris/actions-gh-pages@v3
  with:
    github_token: ${{ secrets.GITHUB_TOKEN }}
    publish_dir: ./dist          # or ./ for plain HTML
    publish_branch: gh-pages

Broker queue (legacy reference)

The broker queue (broker/tasks/ and broker/results/) was the original agent communication mechanism. It's now effectively retired — Missions are the execution model for autonomous work. The queue pattern remains in the codebase for reference and potential future use with standalone agents.

Format for a broker task JSON:

{
  "task_id": "TASK-001",
  "created_at": "2026-03-01T10:00:00Z",
  "agent": "atlas",
  "type": "board_scan",
  "params": {
    "board_id": "[board-id]",
    "filter": "AR_pipeline"
  },
  "priority": "high"
}

Rule Sets

Three files define how the AI operates: SOUL.md (identity), CLAUDE.md (startup and rules), and PLAYBOOK.md (situational responses). Together they replace a complex system prompt with human-readable governance files.

SOUL.md — identity and principles

The identity file. Defines who the AI is, how it thinks, what it cares about, and what it will never do. This is the most durable file in the system — it rarely changes.

Recommended sections

SOUL.md skeleton

# [Name] — Chief of Staff
## [Company] | [Entity]

## Identity
You are [Name]. You are [Owner]'s chief of staff across [domains].
You are not an assistant. You are an operator. You manage systems, not conversations.

## How [Name] Thinks
- Triage before responding. [Routing logic for different request types]
- Numbers over narratives. Lead with data.
- Action over discussion. Default to doing or delegating.

## Operating Posture
Auto-approve: [list]
Confirm before: [list]

## Domain Knowledge
### [Primary Business]
[Core facts the AI needs to know — Monday.com, Clockify, key services]

### [Secondary Domain]
[Core facts]

## Subagent Team
| Agent | Role | Schedule |
|-------|------|----------|
| Vulcan | Code | On-demand |
| Atlas | [Tool] monitoring | Every 2h |

## Communication Style
- Direct. No filler phrases.
- Lead with the answer.
- [Other style rules]

## Hard Limits
[Name] never:
- Sends external communications without explicit confirmation
- Modifies source-of-truth business data without explicit confirmation
- Spends money without confirmation

CLAUDE.md — startup and rules

Claude Code reads this file automatically when launched in the repo directory. It's the entry point for the entire system.

Recommended sections

CLAUDE.md skeleton

# CLAUDE.md — [Name] Startup Instructions

## You Are [Name]
Read the following files immediately, in this order, before doing anything else:
1. `SOUL.md` — identity and operating principles
2. `USER.md` — [Owner]'s preferences and context
3. `MEMORY.md` — active projects and long-term knowledge
4. `DAILY.md` — yesterday's context and today's priorities
5. `PLAYBOOK.md` — how to respond to common situations
6. `/path/to/may-system/state/session.md` — last session state
7. `/path/to/may-system/state/projects.md` — project phases
8. `/path/to/may-system/logs/decisions.md` — persistent decisions

After reading all files, deliver the appropriate brief:
- Today's date ≠ DAILY.md date → **Daily Brief**
- Same day → **Launch Brief**

## System Paths
- Broker tasks: /path/to/may-system/broker/tasks/
- State: /path/to/may-system/state/
- Logs: /path/to/may-system/logs/

## Hard Rules
- Never steer a subagent mid-session — write to broker queue only
- At session close: archive session, update heartbeats, update DAILY.md
- Record decisions in logs/decisions.md — never re-ask a resolved question

## Autonomy Rules
Auto-approve: [list]
Confirm: [list]

## Missions
[Mission spec and launch instructions]

PLAYBOOK.md — situational response patterns

A lightweight rulebook for common situations. Formatted as ## "[trigger phrase]" sections with arrow-formatted action bullets.

Effective PLAYBOOK.md entries are:

Example entries

## "What's going on" / "Status"
→ Executive summary: top 3 priorities, any agent alerts, one open loop overdue
→ Do NOT list every project. Surface what needs [Owner].

## "Build [tool]" (operator present)
→ If [Owner] is steering interactively, build it directly in the session
→ For deferred work, create a mission spec and offer to launch it

## "Mission: [objective]"
→ Create mission.json in missions/MISSION-XXX/
→ Tell [Owner]: "Mission MISSION-XXX ready. Run: ./scripts/run-batch.sh MISSION-XXX"
→ Or if [Owner] says "go" — launch it directly in background

## Brief Startup (MANDATORY — before every brief)
→ git pull origin main in repo
→ Read pending-brief.md
→ Read sentinel-latest.json
→ Scan missions/* for newly completed
→ Read result.md + needs-brian.md for completed missions
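
The "scan missions/* for newly completed" step of Brief Startup could be scripted roughly as follows. This is a sketch under stated assumptions: it treats a mission as completed once missions/<ID>/result.md exists, and it uses a hypothetical .surfaced marker file to remember which missions a brief has already reported. Neither convention comes from this guide, and both function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: list missions that have finished but have not yet appeared in a brief.
# Assumptions (hypothetical): result.md marks completion; a ".surfaced" file in
# the mission directory marks that a brief already reported it.

list_new_completed_missions() {
  local missions_dir="$1"
  local dir
  for dir in "$missions_dir"/*/; do
    [ -d "$dir" ] || continue              # glob matched nothing
    [ -f "${dir}result.md" ] || continue   # not finished yet
    [ -f "${dir}.surfaced" ] && continue   # already reported in a brief
    basename "$dir"                        # print the mission ID
  done
}

# Record that a mission has been reported so later scans skip it.
mark_mission_surfaced() {
  local missions_dir="$1" mission_id="$2"
  touch "$missions_dir/$mission_id/.surfaced"
}
```

After git pull, running list_new_completed_missions missions prints one mission ID per line. Each listed mission's result.md can then be read into the brief and the mission marked as surfaced.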

Writing effective identity files

The most common mistake: writing identity files that describe the AI's capabilities rather than its operating posture. Good identity files define behavior, not features.

Test your SOUL.md: Read it as the AI would — cold, with no other context. Ask: "After reading only this file, would I know what to do when [Owner] types 'what's going on'?" If not, the file needs more behavioral specificity.

Effective techniques:

Getting Started

A step-by-step path for setting up a May-style AI Chief of Staff from scratch. Estimated setup time for a basic working system: 2–3 hours.

Prerequisites

The most important prerequisite is a mindset: you're building a system, not using a tool. The 2–3 hours you invest in setup pay off across thousands of future interactions, and the quality of your identity files (especially SOUL.md) determines the quality of everything that follows.

Step-by-step setup

  1. Create the repo and local state directories
    # Create the versioned repo and clone it into the current directory
    gh repo create [your-username]/[ai-name] --private --clone
    
    # Create local state (not in git)
    mkdir -p ~/[ai-name]-system/{state,logs/missions,agents/may,broker/{tasks,results,drafts}}
  2. Write SOUL.md — the most important step

    Use the skeleton from the Rule Sets section. Fill in:

    • Your AI's name and role
    • How it should think (triage, data-first, action-default)
    • What it should auto-approve vs. confirm
    • 2–3 sentences about each domain it needs to know about
    • Hard limits (what it should never do)

    This file is the foundation. Take your time on it.

  3. Write USER.md

    Your preferences as the operator. Include:

    • Communication style (bullet-heavy vs. prose, detail level)
    • Tech preferences (tools you use, tools you avoid)
    • Autonomy grants (what it can do without asking)
    • Context about your business/domain that doesn't belong in SOUL.md
  4. Write MEMORY.md (initial version)

    Start simple. List your active projects with one paragraph each. Add known open loops. Add decisions that are already made. This file grows organically — don't try to capture everything upfront.

  5. Write DAILY.md (initial version)

    Today's date as the header. Your current priorities. This file gets replaced daily — the initial version just needs to be good enough for the first session.

  6. Write PLAYBOOK.md

    Start with the essential patterns: "what's going on", "build X", "check on [tool]", Brief Startup protocol, Scribe triggers. Add more as you discover recurring situations.

  7. Write CLAUDE.md

    Use the skeleton from the Rule Sets section. Set absolute paths for your local state directory. List the files to read in order. Define autonomy rules. Add a Missions section if you plan to use missions.

  8. Run your first session and deliver a brief
    cd ~/[ai-name]   # repo directory
    claude

    The AI reads your files and delivers a Daily Brief. Evaluate: does it sound like it understands your business? Does the brief surface the right priorities? If not, iterate on SOUL.md and MEMORY.md.

  9. Write your first mission spec

    Pick something concrete and scoped — a dashboard, a script, a static site. Write the mission.json with clear acceptance criteria. Launch it with run-batch.sh. Check result.md when done.

  10. Set up Sentinel

    Create a GitHub Actions workflow that runs daily and checks your deployed apps. Have it write a JSON result to the repo. Point your session startup to read that JSON. This closes the loop on knowing what's deployed and healthy without checking manually.
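
For step 10, the check that a daily Sentinel workflow runs could look something like the script below. It is a minimal sketch, not the framework's actual Sentinel. The JSON shape (checked_at, results with url, status, healthy) and the function names are assumptions, since this guide only says that Sentinel writes a JSON result to the repo.

```shell
#!/usr/bin/env bash
# Sketch of a Sentinel-style health check.
# Assumed JSON shape: {"checked_at": ..., "results": [{url, status, healthy}]}.
# URLs are written into the JSON unescaped, so keep them free of quotes.

# Print the HTTP status code for a URL ("000" if no response was received).
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1" || true
}

# Usage: write_sentinel_report <output-file> <url>...
write_sentinel_report() {
  local out="$1"; shift
  local url status healthy sep=""
  {
    printf '{"checked_at":"%s","results":[' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    for url in "$@"; do
      status="$(probe_status "$url")"
      # Pass/fail rule: only a 200 counts as healthy.
      if [ "$status" = "200" ]; then healthy=true; else healthy=false; fi
      printf '%s{"url":"%s","status":"%s","healthy":%s}' \
        "$sep" "$url" "$status" "$healthy"
      sep=","
    done
    printf ']}\n'
  } > "$out"
}
```

A scheduled workflow would call write_sentinel_report sentinel-latest.json with your deployed URLs and commit the file, so the next Brief Startup picks it up after git pull.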

Common pitfalls and how to avoid them

| Pitfall | What happens | Fix |
| --- | --- | --- |
| MEMORY.md over 200 lines | Context truncation silently loses the bottom half of the file | Extract to topic files aggressively. Keep MEMORY.md as an index. |
| session.md not archived before overwrite | That session's work is permanently lost from the record | Scribe protocol: always append to sessions.md before overwriting session.md |
| DAILY.md not updated at session close | Next session starts with stale context; operator re-explains what happened | Make DAILY.md update a hard rule in CLAUDE.md: "At session close, update DAILY.md" |
| Decisions not logged | Same questions re-asked across sessions; operator frustrated; AI seems to not learn | Log to decisions.md immediately. Treat it as append-only institutional memory. |
| Running missions inside Claude Code | Nested session protection blocks execution | Always use run-batch.sh (which unsets CLAUDECODE) instead of run-mission.sh directly |
| Brief Startup skipped | Brief reports stale data; completed missions not surfaced; Sentinel results missing | Make Brief Startup mandatory in CLAUDE.md and PLAYBOOK.md. No exceptions. |
| Vague acceptance criteria in missions | Mission completes "technically" but doesn't meet actual needs | Write criteria as pass/fail tests: "Page loads at URL X and returns 200". Not "works well". |
| Permission fatigue | Operator approves everything without reading; autonomy model breaks down | Add more actions to the auto-approve list. The confirm list should be short and meaningful. |
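
The archive-before-overwrite fix for session.md is easy to get wrong by hand, so it may help to see it as a script. The following is a sketch, not the framework's Scribe implementation. The paths follow the local state layout described in this guide (state/session.md, logs/sessions.md), while the function name and the separator line are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of archive-then-overwrite for session state.
# Paths follow the guide's layout; the function name and the
# "--- archived ..." separator are illustrative assumptions.

# Usage: archive_then_write_session <system-dir> <file-with-new-state>
archive_then_write_session() {
  local system_dir="$1" new_state_file="$2"
  local current="$system_dir/state/session.md"
  local archive="$system_dir/logs/sessions.md"

  mkdir -p "$system_dir/state" "$system_dir/logs" || return 1

  # 1. Append the existing session state to the append-only log first.
  if [ -s "$current" ]; then
    {
      printf '\n--- archived %s ---\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
      cat "$current"
    } >> "$archive" || return 1   # do not overwrite if archiving failed
  fi

  # 2. Only then replace session.md with the new state.
  cp "$new_state_file" "$current"
}
```

Because the append happens before the copy and the function returns early on failure, a crash or a full disk leaves the previous session.md in place and unarchived work is not overwritten.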

Tips for long-term operation

Scaling up

Once the core system works, common growth paths: