The May Framework

A replicable architecture for an AI Chief of Staff that survives cold starts, coordinates autonomous agents, and operates a real business — built entirely on Claude Code and open-source tooling.

Claude Code · File-Based Memory · Autonomous Missions · Multi-Agent · GitHub Actions · Cloudflare

Overview

May is an AI chief of staff for a small business owner — built on Claude Code, managed through plain text files, and capable of autonomous execution via headless sessions called Missions.

What problem does it solve?

Claude Code sessions are ephemeral. Every session starts with a blank context window — no memory of yesterday, no awareness of active projects, no knowledge of decisions already made. But business operations are continuous. A chief of staff who forgets everything overnight isn't useful.

The May Framework solves this with a file-based memory architecture: structured markdown files and JSON that persist across sessions, a startup sequence that rebuilds full context on every cold start, and append-only logs that accumulate institutional knowledge over time.

Core design principles

Files are the brain. Everything May knows at startup comes from files. The files are the memory.
Ephemeral sessions, persistent state. Sessions are temporary. The file system is permanent.
One operator, full authority. Designed for a single owner-operator who wants autonomous AI execution, not committee review.
Minimal infrastructure. Plain text, GitHub, Cloudflare free tier. No database servers, no message queues, no orchestration platforms.
AI-parseable structure. Files are formatted for both human readability and AI ingestion. Headers enable section-by-section parsing.

The cold start sequence

Every Claude Code session reads files in this order before delivering a brief:

SOUL.md      ─── identity, operating principles, hard limits
USER.md      ─── operator preferences, autonomy rules
MEMORY.md    ─── active projects, open loops, long-term knowledge
DAILY.md     ─── yesterday's context, today's priorities
PLAYBOOK.md  ─── response patterns for common situations
session.md   ─── last session's state (Scribe output)
projects.md  ─── project lifecycle tracker
decisions.md ─── persistent decision log (never re-ask these)
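
The read order above can be sketched as a small loader. This is a minimal illustration, not the framework's actual startup code: `load_context` and `COLD_START_FILES` are hypothetical names, and the sub-paths for `session.md`, `projects.md`, and `decisions.md` are assumed from the local state layout described in this document.

```python
from pathlib import Path

# Read order taken from the sequence above. Which root each file lives under
# is an assumption based on the repo/local-state layout in this document.
COLD_START_FILES = [
    ("repo", "SOUL.md"),
    ("repo", "USER.md"),
    ("repo", "MEMORY.md"),
    ("repo", "DAILY.md"),
    ("repo", "PLAYBOOK.md"),
    ("state", "state/session.md"),
    ("state", "state/projects.md"),
    ("state", "logs/decisions.md"),
]


def load_context(repo_dir, state_dir):
    """Return (loaded, missing): file texts in read order, plus absent files."""
    roots = {"repo": Path(repo_dir), "state": Path(state_dir)}
    loaded, missing = [], []
    for root_key, rel_path in COLD_START_FILES:
        path = roots[root_key] / rel_path
        if path.is_file():
            loaded.append((rel_path, path.read_text(encoding="utf-8")))
        else:
            missing.append(rel_path)
    return loaded, missing
```

Returning the missing files separately lets a brief mention gaps instead of failing the whole cold start.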

After reading all files, May delivers:

Daily Brief — if today's date differs from DAILY.md header (first session of the day)
Launch Brief — if same day (returning to an in-progress session)

The two-directory pattern

MayAI/                ← GitHub repo (versioned, CI-accessible)
  SOUL.md
  MEMORY.md
  DAILY.md
  missions/
  scripts/
  .github/workflows/

may-system/           ← Local only (not in Git)
  state/session.md
  logs/sessions.md
  logs/decisions.md
  agents/*/heartbeat.json
  broker/

The repo holds governance and mission specs. Local state holds operational data that changes every session. This keeps the repo clean while the runtime has immediate access to session state.

Quick navigation

Architecture: File hierarchy, startup sequence, session continuity
Agents: May, Atlas, Vulcan, Sentinel, Scribe, Ledger
Missions: Autonomous headless execution, self-healing loop
Memory: Cold start protocol, file roles, Scribe
Autonomy: Permission model, auto-approve rules
Tooling: Scripts, workflows, PWA, Monday.com MCP
Rule Sets: SOUL, CLAUDE.md, PLAYBOOK patterns
Getting Started: Step-by-step replication guide

Architecture

Two directories. One repo, one local state store. Plain text throughout. No servers to operate, no databases to manage beyond what Cloudflare's free tier handles.

The repo directory (versioned)

MayAI/                          ← Git repo
  CLAUDE.md                     ← Auto-read by Claude Code on startup
  SOUL.md                       ← Identity and operating principles
  USER.md                       ← Operator preferences
  MEMORY.md                     ← Long-term knowledge index (≤200 lines)
  DAILY.md                      ← Current day context, yesterday's summary
  PLAYBOOK.md                   ← Situational response patterns

  docs/
    SYSTEM.md                   ← Full technical reference

  missions/
    _TEMPLATE.json              ← Mission spec format
    MISSION-001/
      mission.json              ← Objective, criteria, scope, budget
      result.md                 ← Output from autonomous run
      needs-brian.md            ← Blockers and workarounds

  scripts/
    run-mission.sh              ← Single mission launcher (local)
    run-batch.sh                ← Batch launcher (default — unsets CLAUDECODE)
    launch-mission.sh           ← CI fallback (commits + triggers Actions)
    may-pwa-reply.sh            ← PWA reply + push notification trigger
    may-inbox.sh                ← D1 inbox poller
    screenshot.js               ← Puppeteer screenshot utility

  .github/workflows/
    mission.yml                 ← CI mission runner
    sentinel.yml                ← Daily health checks (7:15 AM CT)

The local state directory (not in Git)

may-system/                     ← Local only
  state/
    session.md                  ← Current session state (Scribe writes here)
    projects.md                 ← Project phases and milestones
    pending-brief.md            ← Items queued for next brief

  logs/
    sessions.md                 ← Append-only session archive
    decisions.md                ← Persistent decision log
    daily-archive.md            ← Daily summaries archive
    missions/                   ← Mission execution logs

  agents/
    may/
      heartbeat.json            ← Last active, session count, summary
    atlas/
      heartbeat.json
      prompt.md                 ← Agent prompt/instructions
      config.json               ← Thresholds, board IDs, config
    vulcan/heartbeat.json
    ledger/heartbeat.json
    sentinel/heartbeat.json
    scribe/
      prompt.md
      config.json

  broker/
    tasks/                      ← (Retired — missions replaced this)
    results/                    ← Sentinel output, scan results
    drafts/                     ← Items queued for operator approval

Why this separation?

The repo is CI-accessible — GitHub Actions can read it to run missions. It holds everything that benefits from version control: governance files, mission specs, scripts, workflows. The local state changes every session and doesn't belong in version control — it would create constant noise and expose sensitive operational data.

The startup sequence

CLAUDE.md is the entry point. Claude Code reads CLAUDE.md automatically when launched in the repo directory. CLAUDE.md contains the startup sequence — the ordered list of files to read and actions to take before delivering a brief.

The startup sequence runs every session, no exceptions:

Sync: git pull origin main in the repo — Sentinel and mission runners push results via CI, so local may be stale
Read pending-brief.md — local mission runners append here on completion
Read sentinel-latest.json — Sentinel CI writes here daily; contains recent deploys and health status
Scan missions/*/mission.json for any with status "pending" that have a result.md — these completed but weren't tracked; fix status to "complete"
Read result.md and needs-brian.md for any newly completed missions
Deliver brief (Daily or Launch based on date comparison)
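
The mission scan step (status "pending" but a result.md already present) might look like the sketch below. `reconcile_missions` is a hypothetical helper, and it assumes every mission.json is valid JSON.

```python
import json
from pathlib import Path


def reconcile_missions(missions_dir):
    """Mark missions 'complete' when status is 'pending' but result.md exists."""
    fixed = []
    for spec_path in sorted(Path(missions_dir).glob("*/mission.json")):
        spec = json.loads(spec_path.read_text(encoding="utf-8"))
        has_result = (spec_path.parent / "result.md").is_file()
        if spec.get("status") == "pending" and has_result:
            spec["status"] = "complete"
            spec_path.write_text(json.dumps(spec, indent=2) + "\n", encoding="utf-8")
            fixed.append(spec_path.parent.name)
    return fixed
```

The returned list of mission IDs tells the caller which result.md and needs-brian.md files to read next.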

Brief types

Brief Type	Trigger	Contents	Length
Daily Brief	Today's date ≠ DAILY.md header date	Top 3 priorities, agent health, Sentinel status, mission results, Regrid usage, open loops, project statuses	Under 200 words
Launch Brief	Same date as DAILY.md (returning session)	Where we left off, any completed missions, urgent alerts only	3–5 lines, under 60 words
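
The trigger column reduces to one date comparison. The sketch below assumes the first line of DAILY.md contains an ISO date such as 2026-03-01; the source does not show the header format, so treat that as an assumption. `choose_brief` is a hypothetical name.

```python
import re
from datetime import date

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def choose_brief(daily_md_text, today):
    """Return 'daily' or 'launch' by comparing today with DAILY.md's header date."""
    lines = daily_md_text.splitlines()
    header = lines[0] if lines else ""
    match = ISO_DATE.search(header)
    if match is None or match.group(0) != today.isoformat():
        return "daily"  # new day, or no readable date: deliver the full brief
    return "launch"
```

Falling back to the Daily Brief when no date is found errs on the side of rebuilding full context.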

Memory architecture

MEMORY.md is auto-loaded into every conversation context but truncates at 200 lines. To stay under this limit, detailed knowledge lives in topic files that MEMORY.md indexes:

memory/
  MEMORY.md        ← Index + high-level facts (≤200 lines, auto-loaded)
  brand.md         ← Colors, typography, logos
  job-map.md       ← Map tool specifics
  customer-portal.md ← Auth, API details
  regrid-api.md    ← External API docs, limits, tokens
  pwa.md           ← PWA technical reference
  email-system.md  ← Email infrastructure docs

Write pattern: When a topic grows beyond a paragraph in MEMORY.md, extract it to a topic file and replace the MEMORY.md section with a one-line reference: See [topic.md] — full details there.

Agents

Six agents coordinate the system. Most run inline during sessions rather than as standalone processes. The chief of staff (May) is the only agent that runs interactively.

Agent roster

Agent	Role	How It Runs	Status
May	Chief of Staff — coordinates all agents, writes code, manages projects, delivers briefs	Interactive Claude Code sessions (operator present)	Active (daily)
Atlas	Monday.com monitor — AR pipeline tracking, overdue detection, repeat offenders	Inline during May sessions via Monday.com MCP	Active + Calibrated
Vulcan	Code agent — builds, deploys, automates; executes all Missions	Direct code work by May; Mission runner for autonomous tasks	Active
Ledger	Financial analysis — AR dollar amounts, P&L, real estate numbers	Inline during sessions (future: standalone)	Defined, waiting
Sentinel	Verification — HTTP health checks, API status, token expiry alerts	GitHub Actions cron (daily 7:15 AM CT); commits results back to repo	Deployed
Scribe	Session state — tracks goals, commits, decisions; writes resumable context for next session	Inline protocol May executes at session open/close boundaries	Active every session

Agent communication model

Agents don't talk to each other directly. There's no message bus, no API, no pub/sub. The communication model is purely file-based:

May writes a task JSON to broker/tasks/
Agent executes (when next triggered) and writes result JSON to broker/results/
May reads result on next startup

In practice, May does Atlas and Vulcan work directly during interactive sessions, and Missions handle autonomous work. The broker/tasks/ queue is retired; the pattern is kept for possible future standalone execution (scheduled GitHub Actions, cron triggers).

Important constraint: May never steers a subagent mid-session. She can add tasks to the broker queue for the next trigger. She cannot interrupt or redirect an agent that's already running.

Inline protocol vs. standalone agent

Inline protocol (Scribe): May executes this herself at specific trigger points — session open, after commits, after goal completions, session close. No separate process, no scheduling. It's a protocol baked into May's behavior.

Standalone agent (Sentinel): Runs independently on a schedule via GitHub Actions. Doesn't require May to be present. Results are committed back to the repo and read at the next session's Brief Startup.

Sentinel output format

Sentinel writes to broker/results/sentinel-latest.json. May reads this at every Brief Startup:

{
  "last_run": "2026-03-01T13:15:00Z",
  "overall_status": "green",
  "apps": [
    { "name": "Intranet", "url": "[your-domain]", "status": "green" },
    { "name": "Customer Portal", "url": "portal.[your-domain]", "status": "green" }
  ],
  "tokens": [
    { "name": "Regrid JWT", "expires": "2027-02-15", "days_left": 351 }
  ],
  "recent_deploys": {
    "weygand-team": ["abc1234 — fix PWA scroll bug", "def5678 — hamburger nav"],
    "MayAI": ["ghi9012 — MISSION-046 complete"]
  },
  "context_summary": "All apps green. Regrid token valid 351d. Last deploy: PWA scroll fix."
}
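
One way a brief could turn this report into alerts is sketched below. `sentinel_alerts` is a hypothetical helper, and the 30-day token warning threshold is an assumption rather than a value from the source.

```python
def sentinel_alerts(report, token_warn_days=30):
    """List human-readable alerts from a parsed sentinel-latest.json report."""
    alerts = []
    for app in report.get("apps", []):
        status = app.get("status", "unknown")
        if status != "green":
            alerts.append(f"{app.get('name', 'unknown app')} is {status}")
    for token in report.get("tokens", []):
        days_left = token.get("days_left")
        if days_left is not None and days_left <= token_warn_days:
            alerts.append(f"{token.get('name', 'unknown token')} expires in {days_left} days")
    return alerts
```

An empty list corresponds to the "All apps green" case in the example report.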

Heartbeat format

Every agent maintains a heartbeat file at agents/[name]/heartbeat.json. Updated at session close:

{
  "agent": "vulcan",
  "last_active": "2026-03-01T22:00:00Z",
  "sessions_active": 28,
  "last_session_summary": "MISSION-046: WorkHQ mobile refresh — deployed to GitHub Pages"
}
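
A session-close heartbeat update might be sketched as follows. `update_heartbeat` is a hypothetical helper; incrementing `sessions_active` once per session close is an assumption about how the counter is maintained.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def update_heartbeat(path, agent, summary, now=None):
    """Refresh last_active, bump sessions_active, and store the summary."""
    path = Path(path)
    # `now` must be timezone-aware; it is normalised to UTC before formatting
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    if path.is_file():
        beat = json.loads(path.read_text(encoding="utf-8"))
    else:
        beat = {"agent": agent, "last_active": None, "sessions_active": 0}
    beat["last_active"] = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    beat["sessions_active"] = beat.get("sessions_active", 0) + 1
    beat["last_session_summary"] = summary
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(beat, indent=2) + "\n", encoding="utf-8")
    return beat
```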

Defining a new agent

To add a new agent to the system:

Create agent directory: may-system/agents/[name]/
Write prompt.md: Role definition, scope, how it reports results, what it can and cannot do
Write config.json: Thresholds, targets, schedule, API endpoints it uses
Initialize heartbeat.json: Set last_active: null, sessions_active: 0
Register in SOUL.md: Add to the subagent team table with role and schedule
Add to MEMORY.md: Note the agent's domain and config file location
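
The file-creation steps above (directory, prompt.md, config.json, heartbeat.json) can be scripted; registering the agent in SOUL.md and MEMORY.md is left manual here. `scaffold_agent` is a hypothetical helper and the placeholder file contents are assumptions.

```python
import json
from pathlib import Path


def scaffold_agent(agents_dir, name):
    """Create an agent directory with starter prompt, config, and heartbeat files."""
    agent_dir = Path(agents_dir) / name
    agent_dir.mkdir(parents=True, exist_ok=False)  # refuse to overwrite an existing agent
    (agent_dir / "prompt.md").write_text(
        f"# {name}\n\nRole, scope, how results are reported, limits.\n", encoding="utf-8"
    )
    (agent_dir / "config.json").write_text("{}\n", encoding="utf-8")
    heartbeat = {"agent": name, "last_active": None, "sessions_active": 0}
    (agent_dir / "heartbeat.json").write_text(
        json.dumps(heartbeat, indent=2) + "\n", encoding="utf-8"
    )
    return agent_dir
```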

Missions

Missions are autonomous, headless Claude Code executions. They can build, test, fix, deploy, and iterate without the operator present — and they keep going until criteria pass or budget runs out.

What a Mission is

A Mission is a scoped objective written as a JSON spec. When launched, Claude Code runs headlessly (no interactive terminal) with full code and deploy authority. It reads the mission spec, plans its approach, builds the thing, tests it, fixes failures, deploys, verifies, and loops until all acceptance criteria pass.

The operator writes the spec. The operator presses go. The operator checks the result. That's the full interaction model for autonomous work.

mission.json format

{
  "mission_id": "MISSION-001",
  "created_at": "2026-03-01T10:00:00Z",
  "created_by": "may",
  "status": "pending",
  "priority": "high",
  "agent": "vulcan",
  "objective": "One-line description of what this mission accomplishes.",
  "scope": {
    "repos": ["[your-username]/[repo-name]"],
    "deploy_targets": ["GitHub Pages"],
    "authority": "code, commit, push, deploy to existing environments"
  },
  "acceptance_criteria": [
    "Specific, testable criterion (pass/fail)",
    "Another criterion",
    "Deployed URL is live and returns 200"
  ],
  "context_files": [
    "/path/to/relevant/file.md"
  ],
  "additional_context": "Any extra instructions, design specs, constraints.",
  "max_budget_usd": "5.00",
  "max_turns": 200
}

Allowed values: status is one of pending, running, complete, or failed; priority is one of high, medium, or low. These are listed here rather than as inline comments because JSON does not permit comments, and jq (used by run-batch.sh) rejects a file that contains them.
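
A spec in this format can be checked before launch. The sketch below is illustrative only: `validate_mission` is a hypothetical helper, and the choice of which fields count as required is an assumption, not something the source specifies.

```python
# Which fields are mandatory is an assumption made for this sketch
REQUIRED_FIELDS = ("mission_id", "status", "objective", "scope", "acceptance_criteria")
STATUSES = {"pending", "running", "complete", "failed"}
PRIORITIES = {"high", "medium", "low"}


def validate_mission(spec):
    """Return a list of problems found in a parsed mission.json (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in spec]
    if "status" in spec and spec["status"] not in STATUSES:
        problems.append(f"invalid status: {spec['status']}")
    if "priority" in spec and spec["priority"] not in PRIORITIES:
        problems.append(f"invalid priority: {spec['priority']}")
    if "acceptance_criteria" in spec and not spec["acceptance_criteria"]:
        problems.append("acceptance_criteria is empty")
    return problems
```

Rejecting a spec with no acceptance criteria matters because the mission loop has no other definition of done.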

The self-healing loop

Plan → Read existing code. Understand what's there.
Build → Write the code.
Test → Verify it works (build, run, check output).
Fix → If broken, diagnose and fix.
Deploy → Push to trigger deployment.
Verify → Check the live result (HTTP check, screenshot, etc.).
Loop → If any criteria still fail, go back to Build.

The loop continues until all acceptance criteria pass or the budget cap is reached. Each iteration tightens — the agent accumulates context about what worked and what didn't.
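
The control flow of the loop can be sketched independently of Claude Code. In this illustration `run_until_passing` is a hypothetical name, each acceptance criterion is modelled as a zero-argument check, `attempt_fix` stands in for the Build, Test, Fix, and Deploy steps, and a simple round cap stands in for the turn and budget limits.

```python
def run_until_passing(criteria, attempt_fix, max_rounds):
    """Repeat fix rounds until every criterion passes or the cap is reached.

    criteria: dict mapping a criterion name to a zero-argument check returning bool.
    attempt_fix: callable that receives the list of failing criterion names.
    """
    for rounds_used in range(max_rounds):
        failing = [name for name, check in criteria.items() if not check()]
        if not failing:
            return {"status": "complete", "fix_rounds": rounds_used, "failing": []}
        attempt_fix(failing)
    # Verify once more after the final fix round
    failing = [name for name, check in criteria.items() if not check()]
    status = "complete" if not failing else "failed"
    return {"status": status, "fix_rounds": max_rounds, "failing": failing}
```

Passing the failing names into each round mirrors the idea that every iteration works from what is still broken.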

Stuck protocol

When a Mission hits something it can't resolve (missing API key, needs a decision from the operator, external dependency unavailable):

Try an alternative approach first
If truly blocked: use a workaround — mock data, placeholder UI, TODO comment in code
Log every blocker to needs-brian.md in the mission directory
Keep moving: a completed mission with documented workarounds beats an incomplete one

Example needs-brian.md entry:

### Blocker: Cloudflare R2 bucket not configured
Status: workaround in place
What's needed: Create R2 bucket and set BUCKET_NAME env var
Workaround used: File upload shows "Upload unavailable" message
Files affected: workers/api/upload.js
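
Entries in this format can be appended programmatically. The helper below is a hypothetical sketch (`log_blocker` is not part of the framework as described); only the field labels come from the example entry.

```python
from pathlib import Path


def log_blocker(mission_dir, title, needed, workaround, files, status="workaround in place"):
    """Append one blocker entry to needs-brian.md in the mission directory."""
    entry = (
        f"### Blocker: {title}\n"
        f"Status: {status}\n"
        f"What's needed: {needed}\n"
        f"Workaround used: {workaround}\n"
        f"Files affected: {', '.join(files)}\n\n"
    )
    path = Path(mission_dir) / "needs-brian.md"
    with path.open("a", encoding="utf-8") as handle:  # append, never overwrite
        handle.write(entry)
    return entry
```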

result.md format

# Mission Result: MISSION-001
Completed: 2026-03-01T14:23:00Z

## Status: complete

## What Was Done
- Built the dashboard component
- Deployed to GitHub Pages (push to gh-pages branch)
- All 4 acceptance criteria pass

## Commits
| Hash    | Message                              |
|---------|--------------------------------------|
| abc1234 | feat: initial dashboard scaffold     |
| def5678 | fix: mobile layout breakpoints       |

## Deployed To
- https://[your-username].github.io/[repo-name]/

## Workarounds In Place
- None

## Needs [Owner]
- Nothing

Launch methods

Method	Command	When to use
Local batch (primary)	`nohup ./scripts/run-batch.sh MISSION-001 &`	Default. Mac Studio + Max plan. Sequential, handles nesting protection.
CI fallback	`./scripts/launch-mission.sh MISSION-001`	When local Max plan sessions are exhausted. Commits + triggers GitHub Actions.

Nesting warning: Never run run-mission.sh directly from inside a Claude Code session. The nested session protection will block it. Always use run-batch.sh, which unsets the CLAUDECODE environment variable before launching.

Mission authority

Missions can: read, write, delete files in scoped repos; git commit and push; run builds and tests; install dependencies; deploy to existing environments; make all implementation decisions.

Missions cannot: spend money; send external communications; modify source-of-truth business data; deploy to new environments for the first time; delete repos or branches with others' work.

When to use a Mission vs. direct session work

Use a Mission when...	Do it in-session when...
Operator doesn't need to be present	Operator is steering interactively
Work takes more than 15–20 min	Quick fix, < 15 min of work
Multiple build/test/fix cycles expected	Single clear change with no iteration
Operator wants to do other things while it runs	Operator needs to review each step
Complex build with clear acceptance criteria	Discovery work, design decisions, research

Visual verification

For missions with a UI component, visual verification is mandatory before declaring done:

# Take desktop screenshot
node /path/to/MayAI/scripts/screenshot.js http://localhost:8080 desktop.png

# Take mobile screenshot
node /path/to/MayAI/scripts/screenshot.js http://localhost:8080 mobile.png --mobile

# Screenshot live URL after deploy
node /path/to/MayAI/scripts/screenshot.js https://[your-username].github.io/[repo] live.png

Memory & Cold Start

Every session starts cold. The memory system makes that survivable, even advantageous: fresh context loaded from well-maintained files is often more reliable than stale conversational context.

The cold start problem

Claude Code has no persistent memory between sessions. Without intervention, each session would start knowing nothing: not what projects are active, not what decisions were already made, not what happened yesterday, not what broke last week. A chief of staff who forgets everything overnight is useless.

The solution: structured file protocol

Instead of fighting the ephemeral model, the May Framework embraces it. Every important piece of operational knowledge is written to a file at the moment it's created. Startup reads those files in order. The session starts fully informed.

File roles and stability

File	Role	Update Frequency	Written By
`SOUL.md`	Identity, operating principles, hard limits, agent team	Rarely (major system changes only)	Operator
`USER.md`	Operator preferences, context, autonomy rules	When preferences change	Operator
`MEMORY.md`	Active projects, long-term knowledge, open loops, decisions	Every session (append/update)	May during sessions
`DAILY.md`	Yesterday's context, today's priorities, open items	Daily (replaced, not appended)	May at session close
`PLAYBOOK.md`	Situational response patterns for common scenarios	Rarely (stable patterns)	Operator
`session.md`	Current session state — goals, commits, decisions, resume instructions	Every session (overwritten at close)	Scribe protocol
`decisions.md`	Persistent decision log — never re-ask these	Append-only (never overwritten)	May when decisions are made
`projects.md`	Project lifecycle tracker — phase, milestones, next actions	When projects advance	May during sessions

The Scribe protocol

Scribe is an inline protocol — not a separate agent — that May executes at session boundaries. It's what makes the next cold start informed.

Trigger points

Session open: Read session.md from last session. Initialize goals from operator's first message or carryover.
After git commit: Append commit hash + message to session.md Git Activity. Update goal progress.
After git push: Log push (branch, remote) to session.md.
Goal completed: Mark done with timestamp. Note follow-up goals.
Decision made: Capture decision + implications. Append to decisions.md. Update session.md.
Session close: Write final session.md. Append summary to sessions.md (archive). Update DAILY.md. Update agent heartbeats.

session.md structure

# Session State
Date: 2026-03-01
Session: afternoon

## Current Goals
- [x] Fix PWA scroll bug — done
- [ ] Write SYSTEM.md update — in progress

## Git Activity
- abc1234 — fix: PWA iOS scroll behavior (weygand-team)
- def5678 — fix: bottom gap on keyboard close (weygand-team)

## Decisions Made
- VAPID keys: do not regenerate (would break existing subscriptions)

## Resume Instructions
Working on May PWA improvements. Scroll bug fixed and deployed.
Next: update SYSTEM.md docs with PWA features. Then write MISSION-047 spec.
All changes in weygand-team repo, deployed to Cloudflare Pages.
May PWA URL: https://[your-domain]/may/

Resume Instructions is the most important section. Write it as a cold-start briefing — assume the next session knows the codebase but has zero context on what happened today. It should answer: "where exactly were we, what exactly were we doing, what's the very next step?"
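
Because session.md uses `##` headers, a single section such as Resume Instructions can be pulled out without reading the whole file into a prompt. A minimal sketch, with `extract_section` as a hypothetical helper:

```python
def extract_section(markdown_text, heading):
    """Return the body under a '## heading' line, up to the next '## ' heading."""
    body, capturing = [], False
    for line in markdown_text.splitlines():
        if line.startswith("## "):
            if capturing:
                break  # reached the next section
            capturing = line[3:].strip() == heading
            continue
        if capturing:
            body.append(line)
    return "\n".join(body).strip()
```

A missing heading yields an empty string, which a startup routine can treat as "no resume context available".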

MEMORY.md discipline

MEMORY.md auto-loads but truncates at 200 lines. Enforce this strictly:

Keep MEMORY.md as an index — one to three sentences per project, then "See [topic.md]"
Extract detailed technical notes to topic files as soon as a section grows beyond 5–6 lines
Prune resolved items from Open Loops; archive closed projects
Never let MEMORY.md exceed 200 lines: truncation silently drops everything past line 200
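
The line limit is easy to check mechanically at session close. In this sketch `memory_budget` is a hypothetical helper and the 90% early-warning threshold is an assumption; only the 200-line limit comes from the text.

```python
def memory_budget(memory_md_text, limit=200, warn_at=0.9):
    """Report how close MEMORY.md is to the auto-load truncation limit."""
    count = len(memory_md_text.splitlines())
    if count > limit:
        state = "over"
    elif count >= limit * warn_at:
        state = "near"
    else:
        state = "ok"
    # lines_lost is how many trailing lines truncation would drop
    return {"lines": count, "state": state, "lines_lost": max(0, count - limit)}
```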

decisions.md discipline

Every decision the operator makes gets logged here, immediately, with context:

## 2026-03-01
**VAPID keys:** Do not regenerate — would break all existing push subscriptions.
**Context:** PWA push notifications live; keys stored as Cloudflare Pages secrets.

**Missions/Status drawer:** Permanent — Chat is always the default view.
**Context:** Confirmed in MISSION-045 session.

This file is append-only. Never delete entries. It's the institutional memory that prevents the operator from being asked the same question twice.
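
An append-only writer for this format could look like the sketch below. `log_decision` is a hypothetical helper; it assumes decisions are logged in chronological order, so a date heading is added only when the most recent heading is for a different day.

```python
from pathlib import Path


def log_decision(path, day, title, decision, context):
    """Append one decision to decisions.md; existing text is never rewritten."""
    path = Path(path)
    existing = path.read_text(encoding="utf-8") if path.is_file() else ""
    headings = [line for line in existing.splitlines() if line.startswith("## ")]
    entry = ""
    if existing and not existing.endswith("\n"):
        entry += "\n"
    # Start a new date heading only when the latest one is for a different day
    if not headings or headings[-1] != f"## {day}":
        entry += f"## {day}\n"
    entry += f"**{title}:** {decision}\n**Context:** {context}\n\n"
    with path.open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry
```

Opening the file in append mode is what enforces the never-delete rule: the function has no code path that truncates or rewrites earlier entries.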

The sessions.md archive

Before overwriting session.md at session close, Scribe appends a summary to sessions.md. This is an append-only cumulative record of everything accomplished:

## 2026-03-01 — Afternoon session
Duration: ~2h
Goals completed: PWA scroll bug fix, receipt persistence, relative timestamps
Commits: abc1234, def5678
Deployed: weygand-team (Cloudflare Pages)
Key decisions: VAPID keys permanent, drawer nav permanent
Next: MISSION-047 system documentation

Autonomy Model

The permission model is binary and clear. Most actions are auto-approved. A short list of consequential actions requires confirmation. The rule of thumb is simple enough to apply in any situation.

The rule of thumb

"If it doesn't spend money or send information to someone else, just do it." Any action that is local, reversible, and doesn't involve external communication or financial commitment is auto-approved.

Auto-approve: just do it

All code decisions within a scoped plan (architecture, naming, structure, libraries, refactoring)
File creation, modification, deletion within project repos
Git commits and pushes to existing repos and branches
Running builds, tests, dev servers, linters
API reads (any external API — read operations only)
Research, drafting, formatting, summarizing
Installing dependencies, updating configs
Creating/updating system files (MEMORY, DAILY, broker tasks, heartbeats)
Deploying to existing environments (GitHub Pages, Cloudflare Pages)

Confirm before doing

Spending money (API upgrades, paid services, new infrastructure)
Sending external communications (emails, Slack, PR comments visible to others)
Modifying source-of-truth business data (Monday.com items, Clockify entries)
First-time deployment to a new environment
Deleting a repo, branch, or production data
Any action touching financial records or sensitive documents
Actions that cannot be undone
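
The two lists collapse into a single predicate. The sketch below models an action with a few boolean flags; the `Action` type and its flag names are hypothetical, chosen to mirror the confirm-before-doing list rather than taken from the framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    description: str
    spends_money: bool = False
    external_communication: bool = False
    modifies_business_data: bool = False
    first_time_deploy: bool = False
    touches_sensitive_records: bool = False
    irreversible: bool = False  # deletions and anything that cannot be undone


def requires_confirmation(action):
    """True if any confirm-before-doing condition applies; otherwise auto-approve."""
    return any((
        action.spends_money,
        action.external_communication,
        action.modifies_business_data,
        action.first_time_deploy,
        action.touches_sensitive_records,
        action.irreversible,
    ))
```

Defaulting every flag to False keeps routine actions auto-approved unless something explicitly marks them as consequential.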

Scoped plan execution

Once the operator scopes a plan — "build X", "fix Y", "add Z" — the AI executes the full plan without asking for approval on implementation details. This is critical for avoiding "permission fatigue" where constant confirmation requests train the operator to rubber-stamp everything.

The AI surfaces decisions only when there's a genuine trade-off the operator would care about: a significant performance vs. complexity tradeoff, a choice between mutually exclusive approaches, or an ambiguity that could result in building the wrong thing.

Decision logging as trust infrastructure

Every time the operator makes a decision, it's logged to decisions.md with context. This creates a trust layer: the AI doesn't need to ask the same question twice, and the operator doesn't need to worry about consistent behavior across sessions.

Examples of decisions worth logging:

Threshold values ("flag overdue after 7 days, not 3")
Data source preferences ("county GIS first, external API as fallback")
Autonomy grants ("auto-approve all commits to this repo")
Architecture choices ("use vanilla JS, no frameworks")
External communication policies ("never email without confirmation")

Minimizing permission requests

A key design goal: the operator should never have to approve routine actions. Permission prompts for file reads, git operations, and local actions create friction and distract from the actual work. The system is calibrated to prompt only when the stakes are genuinely high.

If the AI is prompting too often, the fix is to update USER.md or CLAUDE.md with explicit auto-approve grants for the action type in question.

Tooling

The system uses Claude Code as its runtime, GitHub for version control and CI, and Cloudflare for edge hosting and APIs. All of it runs on free or low-cost tiers.

Claude Code CLI

The runtime for all AI sessions — both interactive and headless.

# Interactive session (operator present)
claude

# Headless mission (no operator, autonomous)
claude -p "$PROMPT" \
  --dangerously-skip-permissions \
  --max-turns 200 \
  --output-format text

Key flags for mission execution:

-p "$PROMPT" — pass the mission spec as a prompt (non-interactive)
--dangerously-skip-permissions — skip confirmation prompts (required for autonomous execution)
--max-turns N — limit agentic turns to control cost
--output-format text — plain text output (no streaming JSON)

run-batch.sh — local mission launcher

The primary way to launch missions locally. Handles the nesting problem (Claude Code sessions can't spawn nested Claude Code sessions):

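#!/bin/bash
# run-batch.sh MISSION-001 MISSION-002 ...
# Run from the repo root; missions execute sequentially.

unset CLAUDECODE          # Remove nesting protection
unset CLAUDE_CODE_SESSION

for MISSION_ID in "$@"; do
  MISSION_DIR="missions/$MISSION_ID"
  SPEC="$MISSION_DIR/mission.json"

  # Skip a missing spec rather than launching Claude with an empty prompt
  if [ ! -f "$SPEC" ]; then
    echo "No mission spec at $SPEC, skipping" >&2
    continue
  fi

  # Passes only the objective; jq needs mission.json to be valid JSON (no comments)
  PROMPT=$(jq -r '.objective' "$SPEC")

  claude -p "$PROMPT" \
    --dangerously-skip-permissions \
    --max-turns 200 \
    --output-format text
done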

Launch in background to keep working: nohup ./scripts/run-batch.sh MISSION-001 &

Monitor: tail -f ~/may-system/logs/missions/batch-*.log

GitHub Actions — mission runner

CI fallback for when local sessions are unavailable. The mission.yml workflow:

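name: Run Mission
on:
  workflow_dispatch:
    inputs:
      mission_id:
        description: 'Mission ID (e.g. MISSION-001)'
        required: true

permissions:
  contents: write                # needed for the push in "Commit results"

jobs:
  run:
    runs-on: ubuntu-latest
    env:
      # Job-level so every step sees it; shell variables do not persist across steps
      MISSION: ${{ github.event.inputs.mission_id }}
    steps:
      - uses: actions/checkout@v4
      - name: Install Claude Code
        # Assumes the npm distribution of the CLI; the runner has no claude binary by default
        run: npm install -g @anthropic-ai/claude-code
      - name: Run mission
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          PROMPT=$(cat "missions/$MISSION/mission.json")
          claude -p "$PROMPT" --dangerously-skip-permissions --max-turns 200
      - name: Commit results
        run: |
          git config user.email "may@[your-domain]"
          git config user.name "May"
          git add missions/
          git commit -m "Mission $MISSION complete" || exit 0
          git push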

Sentinel — daily health checks

Sentinel runs on a daily cron via GitHub Actions. It checks all deployed apps, verifies API tokens haven't expired, and commits a result JSON back to the repo:

schedule:
  - cron: '15 13 * * *'  # GitHub cron is UTC: 13:15 UTC is 7:15 AM CST (8:15 AM during CDT)

Sentinel output (broker/results/sentinel-latest.json) is read at every session's Brief Startup to give fresh context about deployment health and recent commits.

may-pwa-reply.sh — PWA reply channel

When Claude responds to a message received via the PWA, this script writes the response to Cloudflare D1 and triggers a push notification:

#!/bin/bash
# may-pwa-reply.sh "$RESPONSE_TEXT"
# Called by email responder after Claude generates a reply

RESPONSE="$1"
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)

# Build the payload with jq so quotes and newlines in the reply are escaped
# (interpolating $RESPONSE directly into a JSON string produces invalid JSON)
PAYLOAD=$(jq -n --arg content "$RESPONSE" --arg ts "$TIMESTAMP" \
  '{role: "may", content: $content, ts: $ts}')

# Write to D1 outbound table
curl -X POST "https://[worker-name].[account].workers.dev/api/message" \
  -H "Authorization: Bearer $WORKER_SECRET" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD"

# Trigger push notification
curl -X POST "https://[your-domain]/api/push-notify" \
  -H "Content-Type: application/json" \
  -d "{\"title\":\"May\",\"body\":\"New message\"}"

Two-phase email architecture

When the operator sends a message (via PWA or email), the system uses a two-phase approach to guarantee fast replies while allowing complex work to run without timeouts:

# Phase 1: TRIAGE (90s timeout, 10 turns)
# Classifies message, handles simple ones inline, acks complex ones

responder polls inbox → spawns Claude TRIAGE session
  → SIMPLE_QUESTION / DECISION / STATUS / QUICK_TASK → handle + reply immediately
  → MISSION_SCOPE / COMPLEX_TASK → send ack + write handoff JSON

# Phase 2: EXECUTION (30min timeout, 60 turns)
# Picks up handoff files, runs full Claude session with context

processor polls handoffs/ → spawns Claude EXECUTION session
  → reads context files (SOUL.md, MEMORY.md, DAILY.md)
  → does the heavy work (scope mission, execute task, research)
  → sends final reply when done

The handoff JSON file bridges the two phases:

{
  "id": "handoff-20260301-143022",
  "status": "pending",
  "msg_id": 82,
  "source_channel": "pwa",
  "classification": "MISSION_SCOPE",
  "message_body": "Original message text",
  "ack_sent": true,
  "instructions": "Triage assessment of what needs to be done"
}

The status field moves from pending to in_progress, then to complete or failed.
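
The triage decision and the handoff record can be sketched together. `route_message` is a hypothetical helper; treating unrecognised classifications as complex work is an assumption made so that nothing is silently dropped, and `ack_sent` starts as False on the assumption that the caller flips it once the acknowledgement has gone out.

```python
from datetime import datetime

# Classifications the triage phase handles and answers inline
INLINE = {"SIMPLE_QUESTION", "DECISION", "STATUS", "QUICK_TASK"}


def route_message(classification, msg_id, source_channel, body, instructions, now):
    """Return None for inline handling, or a handoff dict for the execution phase."""
    if classification in INLINE:
        return None
    return {
        "id": f"handoff-{now:%Y%m%d-%H%M%S}",
        "status": "pending",
        "msg_id": msg_id,
        "source_channel": source_channel,
        "classification": classification,
        "message_body": body,
        "ack_sent": False,  # caller sets this to True after sending the ack
        "instructions": instructions,
    }
```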

Key design decisions:

Operator gets a first response within ~30 seconds in the typical case, even for complex requests; the 60s responder poll and the 90s triage timeout set the upper bound
Fallback guarantee: if even the triage times out, a generic ack is auto-sent and a handoff is auto-written
Stale recovery: the processor detects dead PIDs on in-progress handoffs and reclaims them
Failure notification: if the execution phase fails, a message is sent via PWA so the operator knows
Both phases run as LaunchAgents (macOS) — responder polls every 60s, processor polls every 30s
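
Stale recovery depends on telling whether the process that claimed a handoff still exists. The sketch below uses the POSIX signal-0 probe for that. It assumes each in-progress handoff records the worker's PID in a `pid` field, which is hypothetical: the handoff JSON shown in this section does not include one. `pid_alive` and `reclaim_stale` are illustrative names.

```python
import os


def pid_alive(pid):
    """True if a process with this PID exists (POSIX signal-0 probe)."""
    if not isinstance(pid, int) or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def reclaim_stale(handoffs):
    """Reset in_progress handoffs whose worker is gone; return their ids."""
    reclaimed = []
    for handoff in handoffs:
        if handoff.get("status") != "in_progress":
            continue
        if not pid_alive(handoff.get("pid")):  # "pid" is a hypothetical field
            handoff["status"] = "pending"
            handoff.pop("pid", None)
            reclaimed.append(handoff["id"])
    return reclaimed
```

Resetting the status to pending lets the processor's normal polling pick the handoff up again with no special retry path.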

The May PWA

A progressive web app that serves as the primary async communication channel between operator and AI chief of staff.

Feature	Implementation
Chat storage	Cloudflare D1 (SQLite at edge)
Push notifications	VAPID (Web Push Protocol), Cloudflare Pages Functions
File attachments	Cloudflare R2 bucket
Nav	Hamburger/drawer — Chat default, Missions + Status in drawer
Receipt persistence	localStorage — survives page refresh
Timestamps	Relative ("2 min ago") — updates every 60s
SW cache versioning	Increment version string (e.g. may-v7 → may-v8) on JS changes
iOS push	Requires home screen install (Safari → Share → Add to Home Screen)

VAPID key warning: Once VAPID keys are generated and stored as secrets, do not regenerate them. Regenerating breaks all existing push subscriptions. Generate once, store permanently.

Monday.com MCP

Atlas uses the Monday.com Model Context Protocol (MCP) server for reading board data. This enables natural-language queries against Monday.com during interactive sessions:

Read board items, statuses, column values
Filter by status, date, assignee
Drill into subitems (where invoice/payment data lives)

Known limitation: Mirror and formula columns are unreadable via the MCP API. Dollar amounts on parent items require drilling into subitems. Design queries around this constraint.

Cloudflare Workers — API proxy pattern

Cloudflare Workers serve as the backend for all apps that need server-side logic. Common patterns:

API proxy: Worker holds API keys; frontend calls worker; worker calls external API. Keys never exposed in client-side code.
Auth: Worker generates magic link JWT, validates JWT on subsequent requests, returns 401 for unauthenticated access.
Email routing: Worker receives inbound email via Cloudflare Email Routing, parses it, and routes it to the appropriate handler (D1 write for PWA messages, email forward for standard mail).
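
The token logic behind the auth pattern can be sketched independently of the Worker runtime. A real Worker would be written in JavaScript or TypeScript; the Python below only illustrates issuing and validating an HS256-signed JWT with an expiry. The claim names (`sub`, `exp`), the 15-minute default lifetime, and the function names are assumptions, not the author's implementation.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> str:
    digest = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(email: str, secret: bytes, ttl_seconds: int = 900,
                now: Optional[float] = None) -> str:
    """Build the JWT that would be embedded in a magic link."""
    now = time.time() if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": email, "exp": int(now) + ttl_seconds}).encode())
    return f"{header}.{payload}.{_sign(header + '.' + payload, secret)}"


def verify_token(token: str, secret: bytes,
                 now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if signature and expiry are valid, else None.

    A None result is where the Worker would answer 401.
    """
    now = time.time() if now is None else now
    try:
        header, payload, signature = token.split(".")
        expected = _sign(header + "." + payload, secret)
        # Compare as bytes in constant time to avoid timing leaks.
        if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii")):
            return None
        claims = json.loads(_b64url_decode(payload))
    except ValueError:
        return None  # wrong segment count, bad base64, or bad JSON
    if not isinstance(claims, dict) or claims.get("exp", 0) <= now:
        return None
    return claims
```

The secret stays in the Worker's environment, in the same way the API proxy pattern keeps third-party keys out of client-side code.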

GitHub Pages — zero-cost frontend hosting

All static frontends deploy to GitHub Pages via GitHub Actions. The deploy pattern:

- uses: peaceiris/actions-gh-pages@v3
  with:
    github_token: ${{ secrets.GITHUB_TOKEN }}
    publish_dir: ./dist          # or ./ for plain HTML
    publish_branch: gh-pages

Broker queue (legacy reference)

The broker queue (broker/tasks/ and broker/results/) was the original agent communication mechanism. It's now effectively retired — Missions are the execution model for autonomous work. The queue pattern remains in the codebase for reference and potential future use with standalone agents.

Format for a broker task JSON:

{
  "task_id": "TASK-001",
  "created_at": "2026-03-01T10:00:00Z",
  "agent": "atlas",
  "type": "board_scan",
  "params": {
    "board_id": "[board-id]",
    "filter": "AR_pipeline"
  },
  "priority": "high"
}
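
If the queue is revived for standalone agents, a consumer might parse and order tasks roughly like this. It is a sketch under assumptions: the required-field set is inferred from the example above, and the priority levels other than "high" are hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Optional

# Only "high" appears in the example task; the other levels are assumed.
PRIORITY_ORDER = {"high": 0, "normal": 1, "low": 2}
REQUIRED_FIELDS = {"task_id", "created_at", "agent", "type"}


@dataclass
class BrokerTask:
    task_id: str
    created_at: str  # ISO 8601 UTC, so string order matches time order
    agent: str
    type: str
    params: dict = field(default_factory=dict)
    priority: str = "normal"


def parse_task(raw: str) -> BrokerTask:
    """Parse one task file's JSON text, rejecting tasks with missing fields."""
    data = json.loads(raw)
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"task is missing fields: {sorted(missing)}")
    return BrokerTask(**data)


def next_task(tasks: list) -> Optional[BrokerTask]:
    """Pick the highest-priority task, oldest first within a priority."""
    return min(
        tasks,
        key=lambda t: (PRIORITY_ORDER.get(t.priority, 1), t.created_at),
        default=None,
    )
```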

Rule Sets

Three files define how the AI operates: SOUL.md (identity), CLAUDE.md (startup and rules), and PLAYBOOK.md (situational responses). Together they replace a complex system prompt with human-readable governance files.

SOUL.md — identity and principles

The identity file. Defines who the AI is, how it thinks, what it cares about, and what it will never do. This is the most durable file in the system — it rarely changes.

Recommended sections

Identity: Name, role, who it serves, the two-sentence version of what it is
How it thinks: Triage approach, data vs. narrative preference, action vs. discussion default
Operating posture: Monitor → Analyze → Act (or Confirm). What triggers confirmation.
Domain knowledge: What the AI knows about your business (can reference MEMORY.md for details)
Startup sequence: Ordered list of files to read (mirrors CLAUDE.md)
Subagent team: Table of agents, roles, schedules
Communication style: Tone, format preferences, what to never say
Hard limits: Short list of things this AI never does, ever

SOUL.md skeleton

# [Name] — Chief of Staff
## [Company] | [Entity]

## Identity
You are [Name]. You are [Owner]'s chief of staff across [domains].
You are not an assistant. You are an operator. You manage systems, not conversations.

## How [Name] Thinks
- Triage before responding. [Routing logic for different request types]
- Numbers over narratives. Lead with data.
- Action over discussion. Default to doing or delegating.

## Operating Posture
Auto-approve: [list]
Confirm before: [list]

## Domain Knowledge
### [Primary Business]
[Core facts the AI needs to know — Monday.com, Clockify, key services]

### [Secondary Domain]
[Core facts]

## Subagent Team
| Agent | Role | Schedule |
|-------|------|----------|
| Vulcan | Code | On-demand |
| Atlas | [Tool] monitoring | Every 2h |

## Communication Style
- Direct. No filler phrases.
- Lead with the answer.
- [Other style rules]

## Hard Limits
[Name] never:
- Sends external communications without explicit confirmation
- Modifies source-of-truth business data without explicit confirmation
- Spends money without confirmation

CLAUDE.md — startup and rules

Claude Code reads this file automatically when launched in the repo directory. It's the entry point for the entire system.

Recommended sections

You Are [Name]: One-paragraph reminder of identity (brief — SOUL.md has the full version)
Read these files: Ordered list of files to read before doing anything else
After reading: What to do (deliver brief, check date, etc.)
System paths: Absolute paths for broker, logs, state, agents (eliminates path confusion)
Hard rules: Non-negotiable behaviors (archive before overwrite, record decisions, ask one clarifying question)
Autonomy rules: Auto-approve list and confirm-before list
Missions section: How missions work in this system

CLAUDE.md skeleton

# CLAUDE.md — [Name] Startup Instructions

## You Are [Name]
Read the following files immediately, in this order, before doing anything else:
1. `SOUL.md` — identity and operating principles
2. `USER.md` — [Owner]'s preferences and context
3. `MEMORY.md` — active projects and long-term knowledge
4. `DAILY.md` — yesterday's context and today's priorities
5. `PLAYBOOK.md` — how to respond to common situations
6. `/path/to/may-system/state/session.md` — last session state
7. `/path/to/may-system/state/projects.md` — project phases
8. `/path/to/may-system/logs/decisions.md` — persistent decisions

After reading all files, deliver the appropriate brief:
- Today's date ≠ DAILY.md date → **Daily Brief**
- Same day → **Launch Brief**

## System Paths
- Broker tasks: /path/to/may-system/broker/tasks/
- State: /path/to/may-system/state/
- Logs: /path/to/may-system/logs/

## Hard Rules
- Never steer a subagent mid-session — write to broker queue only
- At session close: archive session, update heartbeats, update DAILY.md
- Record decisions in logs/decisions.md — never re-ask a resolved question

## Autonomy Rules
Auto-approve: [list]
Confirm: [list]

## Missions
[Mission spec and launch instructions]
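
The brief-selection rule in the skeleton amounts to a date comparison. The sketch below shows that logic in isolation; it assumes DAILY.md carries an ISO date (YYYY-MM-DD) on a header line, which is a hypothetical format chosen for the example.

```python
import re
from datetime import date
from typing import Optional


def daily_md_date(text: str) -> Optional[date]:
    """Return the first ISO date found on a header line of DAILY.md."""
    for line in text.splitlines():
        if line.startswith("#"):
            match = re.search(r"\d{4}-\d{2}-\d{2}", line)
            if match:
                try:
                    return date.fromisoformat(match.group())
                except ValueError:
                    return None  # looks like a date but is not a valid one
    return None


def choose_brief(daily_text: str, today: date) -> str:
    """Same day means Launch Brief; a stale or undated file means Daily Brief."""
    if daily_md_date(daily_text) == today:
        return "Launch Brief"
    return "Daily Brief"
```

Treating a missing date as stale errs toward the fuller Daily Brief.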

PLAYBOOK.md — situational response patterns

A lightweight rulebook for common situations. Formatted as ## "[trigger phrase]" sections with arrow-formatted action bullets.

Effective PLAYBOOK.md entries are:

Trigger-based (starts with a phrase pattern)
Action-oriented (arrows pointing to specific steps)
Concrete (references specific files and paths, not abstract concepts)
Short enough to scan in 30 seconds

Example entries

## "What's going on" / "Status"
→ Executive summary: top 3 priorities, any agent alerts, one open loop overdue
→ Do NOT list every project. Surface what needs [Owner].

## "Build [tool]" (operator present)
→ If [Owner] is steering interactively, build it directly in the session
→ For deferred work, create a mission spec and offer to launch it

## "Mission: [objective]"
→ Create mission.json in missions/MISSION-XXX/
→ Tell [Owner]: "Mission MISSION-XXX ready. Run: ./scripts/run-batch.sh MISSION-XXX"
→ Or if [Owner] says "go" — launch it directly in background

## Brief Startup (MANDATORY — before every brief)
→ git pull origin main in repo
→ Read pending-brief.md
→ Read sentinel-latest.json
→ Scan missions/* for newly completed
→ Read result.md + needs-brian.md for completed missions
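
Because entries follow a regular shape, they are also easy to process mechanically, for example to lint the file or list its triggers. The parser below is a minimal sketch that assumes every entry is a `## ` header followed by lines starting with the arrow character, as in the examples above.

```python
def parse_playbook(text: str) -> dict:
    """Map each '## ' trigger header to its list of arrow-prefixed actions."""
    entries = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            current = stripped[3:].strip()
            entries[current] = []
        elif stripped.startswith("→") and current is not None:
            entries[current].append(stripped[1:].strip())
    return entries
```

An entry that maps to an empty list has a trigger but no actions, which is a simple signal that the entry is unfinished.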

Writing effective identity files

The most common mistake: writing identity files that describe the AI's capabilities rather than its operating posture. Good identity files define behavior, not features.

Test your SOUL.md: Read it as the AI would — cold, with no other context. Ask: "After reading only this file, would I know what to do when [Owner] types 'what's going on'?" If not, the file needs more behavioral specificity.

Effective techniques:

Be direct. "You are not an assistant. You are an operator." is clearer than "You should try to be helpful in an operator-like way."
Name the anti-patterns. "Never say 'Great question.' Never say 'Certainly.'" prevents the most common AI communication filler.
Specify routing logic. "Before answering anything, determine: is this a status question, a decision request, a task to delegate, or a problem to solve?"
Hard limits are non-negotiable. The hard limits list should be short (5–7 items) and absolute — no hedging.

Getting Started

A step-by-step path for setting up a May-style AI Chief of Staff from scratch. Estimated setup time for a basic working system: 2–3 hours.

Prerequisites

Claude Code CLI: Install from claude.ai/code. Active subscription required (Max plan recommended for heavy use).
GitHub account: Free tier works. Repos can be private.
Cloudflare account: Free tier works for Workers, Pages, D1, R2.
Basic familiarity with: git, bash scripting, JSON. You don't need to be a developer — the AI writes the code.

The most important prerequisite is mental: You're building a system, not using a tool. Investing 2–3 hours in setup pays dividends across thousands of future interactions. The quality of your identity files (especially SOUL.md) determines the quality of everything that follows.

Step-by-step setup

1. Create the repo and local state directories

# Create the versioned repo and clone it locally
# (run from your home directory so the clone lands at ~/[ai-name])
gh repo create [your-username]/[ai-name] --private --clone

# Create local state (not in git)
mkdir -p ~/[ai-name]-system/{state,logs/missions,agents/may,broker/{tasks,results,drafts}}

2. Write SOUL.md (the most important step)
Use the skeleton from the Rule Sets section. Fill in:
- Your AI's name and role
- How it should think (triage, data-first, action-default)
- What it should auto-approve vs. confirm
- 2–3 sentences about each domain it needs to know about
- Hard limits (what it should never do)
This file is the foundation. Take your time on it.

3. Write USER.md
Your preferences as the operator. Include:
- Communication style (bullet-heavy vs. prose, detail level)
- Tech preferences (tools you use, tools you avoid)
- Autonomy grants (what it can do without asking)
- Context about your business/domain that doesn't belong in SOUL.md

4. Write MEMORY.md (initial version)
Start simple. List your active projects with one paragraph each. Add known open loops. Add decisions that are already made. This file grows organically, so don't try to capture everything upfront.

5. Write DAILY.md (initial version)
Today's date as the header. Your current priorities. This file gets replaced daily, so the initial version just needs to be good enough for the first session.

6. Write PLAYBOOK.md
Start with the essential patterns: "what's going on", "build X", "check on [tool]", Brief Startup protocol, Scribe triggers. Add more as you discover recurring situations.

7. Write CLAUDE.md
Use the skeleton from the Rule Sets section. Set absolute paths for your local state directory. List the files to read in order. Define autonomy rules. Add a Missions section if you plan to use missions.

8. Run your first session and get a brief
```
cd ~/[ai-name]   # repo directory
claude
```
The AI reads your files and delivers a Daily Brief. Evaluate: does it sound like it understands your business? Does the brief surface the right priorities? If not, iterate on SOUL.md and MEMORY.md.

9. Write your first mission spec
Pick something concrete and scoped: a dashboard, a script, a static site. Write the mission.json with clear acceptance criteria. Launch it with run-batch.sh. Check result.md when done.

10. Set up Sentinel
Create a GitHub Actions workflow that runs daily and checks your deployed apps. Have it write a JSON result to the repo. Point your session startup to read that JSON. This tells you what is deployed and healthy without checking manually.
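
The check that a Sentinel workflow runs could be as small as the script below. This is a sketch, not the actual Sentinel: the result fields (`checked_at`, `healthy`, `checks`) are hypothetical, and only the output filename sentinel-latest.json comes from the Brief Startup protocol. The fetch function is injectable so the logic can be exercised without network access.

```python
import json
from datetime import datetime, timezone
from typing import Callable
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch_status(url: str) -> int:
    """Return the HTTP status of a GET, or 0 if the request fails outright."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.status
    except HTTPError as err:
        return err.code  # the server answered, just not with success
    except OSError:
        return 0  # DNS failure, refused connection, timeout


def run_sentinel(urls: list, fetch: Callable[[str], int] = fetch_status) -> dict:
    """Check each URL and summarize; a check passes only on HTTP 200."""
    checks = []
    for url in urls:
        status = fetch(url)
        checks.append({"url": url, "status": status, "ok": status == 200})
    return {
        "checked_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "healthy": all(check["ok"] for check in checks),
        "checks": checks,
    }


def write_result(result: dict, path: str = "sentinel-latest.json") -> None:
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(result, handle, indent=2)
```

The workflow would run this on a schedule and commit the resulting JSON, so the next session's `git pull` brings in fresh health data.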

Common pitfalls and how to avoid them

Pitfall	What happens	Fix
MEMORY.md over 200 lines	Context truncation silently loses the bottom half of the file	Extract to topic files aggressively. Keep MEMORY.md as an index.
session.md not archived before overwrite	That session's work is permanently lost from the record	Scribe protocol: always append to sessions.md before overwriting session.md
DAILY.md not updated at session close	Next session starts with stale context; operator re-explains what happened	Make DAILY.md update a hard rule in CLAUDE.md: "At session close, update DAILY.md"
Decisions not logged	Same questions re-asked across sessions; operator frustrated; AI seems to not learn	Log to decisions.md immediately. Treat it as append-only institutional memory.
Running missions inside Claude Code	Nested session protection blocks execution	Always use run-batch.sh (which unsets CLAUDECODE) instead of run-mission.sh directly
Brief Startup skipped	Brief reports stale data; completed missions not surfaced; Sentinel results missing	Make Brief Startup mandatory in CLAUDE.md and PLAYBOOK.md. No exceptions.
Vague acceptance criteria in missions	Mission completes "technically" but doesn't meet actual needs	Write criteria as pass/fail tests: "Page loads at URL X and returns 200". Not "works well".
Permission fatigue	Operator approves everything without reading; autonomy model breaks down	Add more actions to the auto-approve list. The confirm list should be short and meaningful.
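
The archive-before-overwrite fix for session.md can be enforced in code rather than left to discipline. The helper below is a minimal sketch of that ordering; the separator format written to the archive is an assumption.

```python
from datetime import datetime, timezone
from pathlib import Path


def archive_then_write(session_path: Path, archive_path: Path, new_state: str) -> None:
    """Append the current session.md to the archive, then overwrite it."""
    if session_path.exists():
        previous = session_path.read_text(encoding="utf-8")
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        # Append mode never truncates, so earlier sessions stay intact.
        with archive_path.open("a", encoding="utf-8") as archive:
            archive.write(f"\n---\nArchived {stamp}\n\n{previous.rstrip()}\n")
    session_path.write_text(new_state, encoding="utf-8")
```

Because the append happens before the overwrite, an interruption between the two steps leaves a duplicate in the archive at worst, never a lost session.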

Tips for long-term operation

Review MEMORY.md weekly. Prune resolved open loops. Archive closed projects. Add new domains as they emerge.
Run Sentinel before trusting a brief. If Sentinel hasn't run in 24+ hours, the health data may be stale.
Treat needs-brian.md as your inbox. After any mission completes, read its needs-brian.md before moving on.
Log decisions immediately. The moment a decision is made — in any session — log it. Don't batch them up.
Write clear mission specs, not vague goals. "Build a dashboard showing X, Y, Z data with filters for A and B, deployed to GitHub Pages, all criteria passing" is a mission. "Improve the dashboard" is not.
Let missions fail loudly. If a mission can't meet criteria, it should say so in result.md. A partial result with clear blockers is better than silent failure.
Update SOUL.md as you learn. The first version of your identity files won't be perfect. Iterate. Add communication style rules as you notice patterns. Revise the autonomy lists as you build trust, typically by moving actions from the confirm list to the auto-approve list.
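
Immediate, append-only decision logging is simple to script. The sketch below assumes a one-line-per-decision format (date, question, decision separated by pipes), which is a hypothetical layout for decisions.md rather than the one used in the author's system.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def log_decision(log_path: Path, question: str, decision: str,
                 on: Optional[date] = None) -> None:
    """Append one decision; existing lines are never rewritten."""
    on = on or date.today()
    with log_path.open("a", encoding="utf-8") as log:
        log.write(f"- {on.isoformat()} | {question.strip()} | {decision.strip()}\n")


def find_decisions(log_path: Path, keyword: str) -> list:
    """Return logged lines mentioning the keyword, oldest first."""
    if not log_path.exists():
        return []
    needle = keyword.lower()
    lines = log_path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if needle in line.lower()]
```

Checking `find_decisions` before asking the operator a question is one way to honor the "never re-ask" rule.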

Scaling up

Once the core system works, common growth paths:

Add Sentinel → automated health monitoring, deploy verification, token expiry alerts
Add a PWA → async communication channel (Claude responds even when you're not at a terminal)
Add email routing → Cloudflare Email Routing + Workers → Claude responds to emails autonomously
Add Monday.com MCP → natural-language queries against project management data during sessions
Scale missions → batch launch multiple missions in parallel, each scoped to a different repo/feature
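
Parallel mission launches could be driven by a small wrapper like the one below. It is a sketch under assumptions: it calls `./scripts/run-batch.sh MISSION-XXX` once per mission, as in the PLAYBOOK example, and caps concurrency at three, which is an arbitrary choice. Removing CLAUDECODE from the child environment duplicates what run-batch.sh is described as doing and is included only as a safeguard.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def child_env() -> dict:
    """Copy the environment without CLAUDECODE so nested-session checks pass."""
    env = dict(os.environ)
    env.pop("CLAUDECODE", None)
    return env


def launch_mission(mission_id: str, script: str = "./scripts/run-batch.sh") -> int:
    """Run one mission to completion and return its exit code."""
    completed = subprocess.run([script, mission_id], env=child_env())
    return completed.returncode


def launch_all(mission_ids: list, launch: Callable[[str], int] = launch_mission,
               max_parallel: int = 3) -> dict:
    """Launch missions concurrently and map each ID to its exit code."""
    ids = list(mission_ids)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        codes = list(pool.map(launch, ids))
    return dict(zip(ids, codes))
```

Threads are sufficient here because each worker only waits on a subprocess. Any non-zero exit code in the returned mapping marks a mission whose result.md should be read first.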