The May Framework
A replicable architecture for an AI Chief of Staff that survives cold starts, coordinates autonomous agents, and operates a real business — built entirely on Claude Code and open-source tooling.
Overview
May is an AI chief of staff for a small business owner — built on Claude Code, managed through plain text files, and capable of autonomous execution via headless sessions called Missions.
What problem does it solve?
Claude Code sessions are ephemeral. Every session starts with a blank context window — no memory of yesterday, no awareness of active projects, no knowledge of decisions already made. But business operations are continuous. A chief of staff who forgets everything overnight isn't useful.
The May Framework solves this with a file-based memory architecture: structured markdown files and JSON that persist across sessions, a startup sequence that rebuilds full context on every cold start, and append-only logs that accumulate institutional knowledge over time.
Core design principles
- Files are the brain. Everything May knows at startup comes from files. The files are the memory.
- Ephemeral sessions, persistent state. Sessions are temporary. The file system is permanent.
- One operator, full authority. Designed for a single owner-operator who wants autonomous AI execution, not committee review.
- Minimal infrastructure. Plain text, GitHub, Cloudflare free tier. No database servers, no message queues, no orchestration platforms.
- AI-parseable structure. Files are formatted for both human readability and AI ingestion. Headers enable section-by-section parsing.
The cold start sequence
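Every Claude Code session reads files in a fixed order before delivering a brief: SOUL.md, USER.md, MEMORY.md, DAILY.md, PLAYBOOK.md, then the local session.md, projects.md, and decisions.md.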
After reading all files, May delivers:
- Daily Brief — if today's date differs from DAILY.md header (first session of the day)
- Launch Brief — if same day (returning to an in-progress session)
The two-directory pattern
The repo holds governance and mission specs. Local state holds operational data that changes every session. This keeps the repo clean while the runtime has immediate access to session state.
Architecture
Two directories. One repo, one local state store. Plain text throughout. No servers to operate, no databases to manage beyond what Cloudflare's free tier handles.
The repo directory (versioned)
MayAI/ ← Git repo
  CLAUDE.md ← Auto-read by Claude Code on startup
  SOUL.md ← Identity and operating principles
  USER.md ← Operator preferences
  MEMORY.md ← Long-term knowledge index (≤200 lines)
  DAILY.md ← Current day context, yesterday's summary
  PLAYBOOK.md ← Situational response patterns
  docs/
    SYSTEM.md ← Full technical reference
  missions/
    _TEMPLATE.json ← Mission spec format
    MISSION-001/
      mission.json ← Objective, criteria, scope, budget
      result.md ← Output from autonomous run
      needs-brian.md ← Blockers and workarounds
  scripts/
    run-mission.sh ← Single mission launcher (local)
    run-batch.sh ← Batch launcher (default; unsets CLAUDECODE)
    launch-mission.sh ← CI fallback (commits + triggers Actions)
    may-pwa-reply.sh ← PWA reply + push notification trigger
    may-inbox.sh ← D1 inbox poller
    screenshot.js ← Puppeteer screenshot utility
  .github/workflows/
    mission.yml ← CI mission runner
    sentinel.yml ← Daily health checks (7:15 AM CT)
The local state directory (not in Git)
may-system/ ← Local only
  state/
    session.md ← Current session state (Scribe writes here)
    projects.md ← Project phases and milestones
    pending-brief.md ← Items queued for next brief
  logs/
    sessions.md ← Append-only session archive
    decisions.md ← Persistent decision log
    daily-archive.md ← Daily summaries archive
    missions/ ← Mission execution logs
  agents/
    may/
      heartbeat.json ← Last active, session count, summary
    atlas/
      heartbeat.json
      prompt.md ← Agent prompt/instructions
      config.json ← Thresholds, board IDs, config
    vulcan/heartbeat.json
    ledger/heartbeat.json
    sentinel/heartbeat.json
    scribe/
      prompt.md
      config.json
  broker/
    tasks/ ← (Retired; missions replaced this)
    results/ ← Sentinel output, scan results
    drafts/ ← Items queued for operator approval
Why this separation?
The repo is CI-accessible — GitHub Actions can read it to run missions. It holds everything that benefits from version control: governance files, mission specs, scripts, workflows. The local state changes every session and doesn't belong in version control — it would create constant noise and expose sensitive operational data.
The startup sequence
The startup sequence runs every session, no exceptions:
- Sync: run git pull origin main in the repo (Sentinel and mission runners push results via CI, so local may be stale)
- Read pending-brief.md (local mission runners append here on completion)
- Read sentinel-latest.json — Sentinel CI writes here daily; contains recent deploys and health status
- Scan missions/*/mission.json for any with status "pending" that have a result.md — these completed but weren't tracked; fix status to "complete"
- Read result.md and needs-brian.md for any newly completed missions
- Deliver brief (Daily or Launch based on date comparison)
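The mission-reconciliation step in the sequence above can be sketched as follows. This is a minimal sketch, not the framework's actual code: it assumes each mission.json parses as strict JSON (no comments) with a top-level status field, and the function names needs_status_fix and reconcile_missions are hypothetical.

```python
import json
from pathlib import Path


def needs_status_fix(spec: dict, has_result: bool) -> bool:
    """True when a mission finished (result.md exists) but is still marked pending."""
    return spec.get("status") == "pending" and has_result


def reconcile_missions(missions_root: str) -> list[str]:
    """Mark untracked completions as complete and return their mission IDs."""
    fixed = []
    # Matches missions/MISSION-001/mission.json; _TEMPLATE.json at the root is skipped
    for spec_path in sorted(Path(missions_root).glob("*/mission.json")):
        spec = json.loads(spec_path.read_text(encoding="utf-8"))
        has_result = (spec_path.parent / "result.md").exists()
        if needs_status_fix(spec, has_result):
            spec["status"] = "complete"
            spec_path.write_text(json.dumps(spec, indent=2) + "\n", encoding="utf-8")
            fixed.append(spec_path.parent.name)
    return fixed
```

Keeping the decision in a small pure function makes the rule easy to check without touching the file system; the wrapper only handles reading and writing.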
Brief types
| Brief Type | Trigger | Contents | Length |
|---|---|---|---|
| Daily Brief | Today's date ≠ DAILY.md header date | Top 3 priorities, agent health, Sentinel status, mission results, Regrid usage, open loops, project statuses | Under 200 words |
| Launch Brief | Same date as DAILY.md (returning session) | Where we left off, any completed missions, urgent alerts only | 3–5 lines, under 60 words |
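The date comparison that picks the brief type could look like the sketch below. It assumes the DAILY.md header is the first non-empty line and contains an ISO date (YYYY-MM-DD); the source does not specify the header format, so treat that and the function names as assumptions.

```python
import re
from datetime import date
from typing import Optional

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def header_date(daily_md: str) -> Optional[date]:
    """Return the first ISO date on the first non-empty line, if any."""
    for line in daily_md.splitlines():
        if line.strip():
            match = ISO_DATE.search(line)
            return date.fromisoformat(match.group()) if match else None
    return None


def choose_brief(daily_md: str, today: date) -> str:
    """Launch Brief when DAILY.md is already dated today, Daily Brief otherwise."""
    return "launch" if header_date(daily_md) == today else "daily"
```

A missing or unparseable header falls back to the Daily Brief, which is the safer default because it is the fuller of the two.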
Memory architecture
MEMORY.md is auto-loaded into every conversation context but truncates at 200 lines. To stay under this limit, detailed knowledge lives in topic files that MEMORY.md indexes:
memory/
  MEMORY.md ← Index + high-level facts (≤200 lines, auto-loaded)
  brand.md ← Colors, typography, logos
  job-map.md ← Map tool specifics
  customer-portal.md ← Auth, API details
  regrid-api.md ← External API docs, limits, tokens
  pwa.md ← PWA technical reference
  email-system.md ← Email infrastructure docs
Write pattern: When a topic grows beyond a paragraph in MEMORY.md, extract it to a topic file and replace the MEMORY.md section with a one-line reference: See [topic.md] — full details there.
Agents
Six agents coordinate the system. Most run inline during sessions rather than as standalone processes. The chief of staff (May) is the only agent that runs interactively.
Agent roster
| Agent | Role | How It Runs | Status |
|---|---|---|---|
| May | Chief of Staff — coordinates all agents, writes code, manages projects, delivers briefs | Interactive Claude Code sessions (operator present) | Active (daily) |
| Atlas | Monday.com monitor — AR pipeline tracking, overdue detection, repeat offenders | Inline during May sessions via Monday.com MCP | Active + Calibrated |
| Vulcan | Code agent — builds, deploys, automates; executes all Missions | Direct code work by May; Mission runner for autonomous tasks | Active |
| Ledger | Financial analysis — AR dollar amounts, P&L, real estate numbers | Inline during sessions (future: standalone) | Defined, waiting |
| Sentinel | Verification — HTTP health checks, API status, token expiry alerts | GitHub Actions cron (daily 7:15 AM CT); commits results back to repo | Deployed |
| Scribe | Session state — tracks goals, commits, decisions; writes resumable context for next session | Inline protocol May executes at session open/close boundaries | Active every session |
Agent communication model
Agents don't talk to each other directly. There's no message bus, no API, no pub/sub. The communication model is purely file-based:
- May writes a task JSON to broker/tasks/
- Agent executes (when next triggered) and writes result JSON to broker/results/
- May reads result on next startup
In practice, Atlas and Vulcan work is done directly by May during interactive sessions. The broker pattern exists for future standalone execution (scheduled GitHub Actions, cron triggers).
Inline protocol vs. standalone agent
Inline protocol (Scribe): May executes this herself at specific trigger points — session open, after commits, after goal completions, session close. No separate process, no scheduling. It's a protocol baked into May's behavior.
Standalone agent (Sentinel): Runs independently on a schedule via GitHub Actions. Doesn't require May to be present. Results are committed back to the repo and read at the next session's Brief Startup.
Sentinel output format
Sentinel writes to broker/results/sentinel-latest.json. May reads this at every Brief Startup:
{
"last_run": "2026-03-01T13:15:00Z",
"overall_status": "green",
"apps": [
{ "name": "Intranet", "url": "[your-domain]", "status": "green" },
{ "name": "Customer Portal", "url": "portal.[your-domain]", "status": "green" }
],
"tokens": [
{ "name": "Regrid JWT", "expires": "2027-02-15", "days_left": 351 }
],
"recent_deploys": {
"weygand-team": ["abc1234 — fix PWA scroll bug", "def5678 — hamburger nav"],
"MayAI": ["ghi9012 — MISSION-046 complete"]
},
"context_summary": "All apps green. Regrid token valid 351d. Last deploy: PWA scroll fix."
}
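One way a startup step might turn this report into brief-ready alerts is sketched below. The field names come from the example above; the 30-day token threshold and the function name sentinel_alerts are assumptions for illustration.

```python
def sentinel_alerts(report: dict, min_days_left: int = 30) -> list[str]:
    """List apps that are not green and tokens that expire soon."""
    alerts = []
    for app in report.get("apps", []):
        status = app.get("status", "unknown")
        if status != "green":
            alerts.append(f"{app['name']} is {status}")
    for token in report.get("tokens", []):
        days = token.get("days_left")
        if days is None:
            alerts.append(f"{token['name']} has no days_left value")
        elif days < min_days_left:  # threshold is an assumed default
            alerts.append(f"{token['name']} expires in {days} days")
    return alerts
```

An empty list means nothing from Sentinel needs to appear under urgent alerts in the brief.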
Heartbeat format
Every agent maintains a heartbeat file at agents/[name]/heartbeat.json. Updated at session close:
{
"agent": "vulcan",
"last_active": "2026-03-01T22:00:00Z",
"sessions_active": 28,
"last_session_summary": "MISSION-046: WorkHQ mobile refresh — deployed to GitHub Pages"
}
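The session-close update can be sketched as a pure function over the heartbeat dictionary. It assumes sessions_active is a running count incremented once per closed session, which the source implies but does not state; close_heartbeat is a hypothetical name.

```python
from datetime import datetime, timezone


def close_heartbeat(heartbeat: dict, summary: str, now: datetime) -> dict:
    """Return a new heartbeat reflecting one more completed session."""
    return {
        **heartbeat,
        # Same UTC "Z" timestamp style as the example file
        "last_active": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "sessions_active": heartbeat.get("sessions_active", 0) + 1,
        "last_session_summary": summary,
    }
```

Returning a new dictionary rather than mutating the input keeps the previous heartbeat available if the write to disk fails.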
Defining a new agent
To add a new agent to the system:
- Create agent directory: may-system/agents/[name]/
- Write prompt.md: Role definition, scope, how it reports results, what it can and cannot do
- Write config.json: Thresholds, targets, schedule, API endpoints it uses
- Initialize heartbeat.json: Set last_active: null, sessions_active: 0
- Register in SOUL.md: Add to the subagent team table with role and schedule
- Add to MEMORY.md: Note the agent's domain and config file location
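The file-creation steps above can be scripted. The sketch below covers only the directory, prompt.md, config.json, and heartbeat.json; registering the agent in SOUL.md and MEMORY.md stays manual. The starter contents of prompt.md and config.json are placeholders, and both function names are hypothetical.

```python
import json
from pathlib import Path


def agent_files(name: str, role: str) -> dict[str, str]:
    """Map each file in a new agent directory to its initial contents."""
    heartbeat = {"agent": name, "last_active": None, "sessions_active": 0}
    return {
        "prompt.md": f"# {name}\n\nRole: {role}\n",        # placeholder prompt
        "config.json": json.dumps({}, indent=2) + "\n",     # thresholds go here
        "heartbeat.json": json.dumps(heartbeat, indent=2) + "\n",
    }


def scaffold_agent(agents_root: str, name: str, role: str) -> Path:
    """Create agents/[name]/ with starter files, refusing to overwrite an agent."""
    agent_dir = Path(agents_root) / name
    agent_dir.mkdir(parents=True, exist_ok=False)
    for filename, content in agent_files(name, role).items():
        (agent_dir / filename).write_text(content, encoding="utf-8")
    return agent_dir
```

Passing exist_ok=False makes the script fail loudly if the agent directory already exists, so an existing heartbeat is never reset by accident.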
Missions
Missions are autonomous, headless Claude Code executions. They can build, test, fix, deploy, and iterate without the operator present — and they keep going until criteria pass or budget runs out.
What a Mission is
A Mission is a scoped objective written as a JSON spec. When launched, Claude Code runs headlessly (no interactive terminal) with full code and deploy authority. It reads the mission spec, plans its approach, builds the thing, tests it, fixes failures, deploys, verifies, and loops until all acceptance criteria pass.
The operator writes the spec. The operator presses go. The operator checks the result. That's the full interaction model for autonomous work.
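mission.json format
Allowed values: status is one of pending, running, complete, or failed; priority is one of high, medium, or low. (JSON has no comment syntax, so these are listed here rather than inline.)
{
"mission_id": "MISSION-001",
"created_at": "2026-03-01T10:00:00Z",
"created_by": "may",
"status": "pending",
"priority": "high",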
"agent": "vulcan",
"objective": "One-line description of what this mission accomplishes.",
"scope": {
"repos": ["[your-username]/[repo-name]"],
"deploy_targets": ["GitHub Pages"],
"authority": "code, commit, push, deploy to existing environments"
},
"acceptance_criteria": [
"Specific, testable criterion — pass/fail",
"Another criterion",
"Deployed URL is live and returns 200"
],
"context_files": [
"/path/to/relevant/file.md"
],
"additional_context": "Any extra instructions, design specs, constraints.",
"max_budget_usd": "5.00",
"max_turns": 200
}
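A launcher could check a spec before spending any budget on it. The sketch below is one possible validator; which fields count as required is an assumption made here, not something the format above mandates, and validate_mission is a hypothetical name.

```python
VALID_STATUS = {"pending", "running", "complete", "failed"}
VALID_PRIORITY = {"high", "medium", "low"}
# Assumed minimum set of fields a runnable mission needs
REQUIRED = ("mission_id", "status", "objective", "acceptance_criteria", "max_turns")


def validate_mission(spec: dict) -> list[str]:
    """Return a list of problems; an empty list means the spec looks runnable."""
    errors = [f"missing field: {key}" for key in REQUIRED if key not in spec]
    if "status" in spec and spec["status"] not in VALID_STATUS:
        errors.append(f"invalid status: {spec['status']}")
    if "priority" in spec and spec["priority"] not in VALID_PRIORITY:
        errors.append(f"invalid priority: {spec['priority']}")
    if "acceptance_criteria" in spec and not spec["acceptance_criteria"]:
        errors.append("acceptance_criteria is empty")
    if "max_budget_usd" in spec:
        try:
            # The example stores the budget as a string, so convert before comparing
            if float(spec["max_budget_usd"]) <= 0:
                errors.append("max_budget_usd must be positive")
        except (TypeError, ValueError):
            errors.append("max_budget_usd is not a number")
    return errors
```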
The self-healing loop
The loop continues until all acceptance criteria pass or the budget cap is reached. Each iteration starts better informed than the last, because the agent accumulates context about what worked and what didn't.
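The control flow of that loop can be illustrated in a few lines. In the real system the loop runs inside the headless Claude Code session; this sketch only models its shape, uses a turn cap as a stand-in for the budget cap, and all names in it are hypothetical.

```python
from typing import Callable


def run_until_done(
    attempt: Callable[[list[str]], None],
    failing_criteria: Callable[[], list[str]],
    max_turns: int,
) -> tuple[str, int]:
    """Repeat build/test/fix until no criteria fail or the turn cap is hit."""
    failures = failing_criteria()
    turns = 0
    while failures and turns < max_turns:
        attempt(failures)              # build or fix, targeting what still fails
        turns += 1
        failures = failing_criteria()  # re-verify after every attempt
    return ("complete" if not failures else "failed", turns)
```

The point of the shape is that verification happens after every attempt, so the exit condition is always the acceptance criteria themselves rather than the agent's belief that it is finished.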
Stuck protocol
When a Mission hits something it can't resolve (missing API key, needs a decision from the operator, external dependency unavailable):
- Try an alternative approach first
- If truly blocked: use a workaround — mock data, placeholder UI, TODO comment in code
- Log every blocker to needs-brian.md in the mission directory
- Keep moving: completion with workarounds beats incomplete
Example needs-brian.md entry:
### Blocker: Cloudflare R2 bucket not configured
Status: workaround in place
What's needed: Create R2 bucket and set BUCKET_NAME env var
Workaround used: File upload shows "Upload unavailable" message
Files affected: workers/api/upload.js
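Entries in this shape are easy to generate consistently. The sketch below formats one entry and appends it to needs-brian.md; the field labels mirror the example above, while the function names and the default status string are assumptions.

```python
from pathlib import Path


def format_blocker(title: str, needed: str, workaround: str, files: list[str],
                   status: str = "workaround in place") -> str:
    """Render one blocker entry in the needs-brian.md layout."""
    lines = [
        f"### Blocker: {title}",
        f"Status: {status}",
        f"What's needed: {needed}",
        f"Workaround used: {workaround}",
        f"Files affected: {', '.join(files)}",
    ]
    return "\n".join(lines) + "\n"


def log_blocker(mission_dir: str, entry: str) -> None:
    """Append (never overwrite) so earlier blockers survive."""
    with open(Path(mission_dir) / "needs-brian.md", "a", encoding="utf-8") as fh:
        fh.write(entry + "\n")
```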
result.md format
# Mission Result: MISSION-001
Completed: 2026-03-01T14:23:00Z
## Status: complete
## What Was Done
- Built the dashboard component
- Deployed to GitHub Pages (push to gh-pages branch)
- All 4 acceptance criteria pass
## Commits
| Hash | Message |
|---------|--------------------------------------|
| abc1234 | feat: initial dashboard scaffold |
| def5678 | fix: mobile layout breakpoints |
## Deployed To
- https://[your-username].github.io/[repo-name]/
## Workarounds In Place
- None
## Needs [Owner]
- Nothing
Launch methods
| Method | Command | When to use |
|---|---|---|
| Local batch (primary) | nohup ./scripts/run-batch.sh MISSION-001 & | Default. Mac Studio + Max plan. Sequential, handles nesting protection. |
| CI fallback | ./scripts/launch-mission.sh MISSION-001 | When local Max plan sessions are exhausted. Commits + triggers GitHub Actions. |
Warning: do not run run-mission.sh directly from inside a Claude Code session. The nested session protection will block it. Always use run-batch.sh, which unsets the CLAUDECODE environment variable before launching.
Mission authority
Missions can: read, write, delete files in scoped repos; git commit and push; run builds and tests; install dependencies; deploy to existing environments; make all implementation decisions.
Missions cannot: spend money; send external communications; modify source-of-truth business data; deploy to new environments for the first time; delete repos or branches with others' work.
When to use a Mission vs. direct session work
| Use a Mission when... | Do it in-session when... |
|---|---|
| Operator doesn't need to be present | Operator is steering interactively |
| Work takes more than 15–20 min | Quick fix, < 15 min of work |
| Multiple build/test/fix cycles expected | Single clear change with no iteration |
| Operator wants to do other things while it runs | Operator needs to review each step |
| Complex build with clear acceptance criteria | Discovery work, design decisions, research |
Visual verification
For missions with a UI component, visual verification is mandatory before declaring done:
# Take desktop screenshot
node /path/to/MayAI/scripts/screenshot.js http://localhost:8080 desktop.png
# Take mobile screenshot
node /path/to/MayAI/scripts/screenshot.js http://localhost:8080 mobile.png --mobile
# Screenshot live URL after deploy
node /path/to/MayAI/scripts/screenshot.js https://[your-username].github.io/[repo] live.png
Memory & Cold Start
Every session starts cold. The memory system makes that survivable, even advantageous: fresh context loaded from well-maintained files is usually more reliable than stale conversational context.
The cold start problem
Claude Code has no persistent memory between sessions. Without intervention, each session would start knowing nothing: not what projects are active, not what decisions were already made, not what happened yesterday, not what broke last week. A chief of staff who forgets everything overnight is useless.
The solution: structured file protocol
Instead of fighting the ephemeral model, the May Framework embraces it. Every important piece of operational knowledge is written to a file at the moment it's created. Startup reads those files in order. The session starts fully informed.
File roles and stability
| File | Role | Update Frequency | Written By |
|---|---|---|---|
| SOUL.md | Identity, operating principles, hard limits, agent team | Rarely (major system changes only) | Operator |
| USER.md | Operator preferences, context, autonomy rules | When preferences change | Operator |
| MEMORY.md | Active projects, long-term knowledge, open loops, decisions | Every session (append/update) | May during sessions |
| DAILY.md | Yesterday's context, today's priorities, open items | Daily (replaced, not appended) | May at session close |
| PLAYBOOK.md | Situational response patterns for common scenarios | Rarely (stable patterns) | Operator |
| session.md | Current session state: goals, commits, decisions, resume instructions | Every session (overwritten at close) | Scribe protocol |
| decisions.md | Persistent decision log (never re-ask these) | Append-only (never overwritten) | May when decisions are made |
| projects.md | Project lifecycle tracker: phase, milestones, next actions | When projects advance | May during sessions |
The Scribe protocol
Scribe is an inline protocol — not a separate agent — that May executes at session boundaries. It's what makes the next cold start informed.
Trigger points
- Session open: Read session.md from last session. Initialize goals from operator's first message or carryover.
- After git commit: Append commit hash + message to session.md Git Activity. Update goal progress.
- After git push: Log push (branch, remote) to session.md.
- Goal completed: Mark done with timestamp. Note follow-up goals.
- Decision made: Capture decision + implications. Append to decisions.md. Update session.md.
- Session close: Write final session.md. Append summary to sessions.md (archive). Update DAILY.md. Update agent heartbeats.
session.md structure
# Session State
Date: 2026-03-01
Session: afternoon
## Current Goals
- [x] Fix PWA scroll bug — done
- [ ] Write SYSTEM.md update — in progress
## Git Activity
- abc1234 — fix: PWA iOS scroll behavior (weygand-team)
- def5678 — fix: bottom gap on keyboard close (weygand-team)
## Decisions Made
- VAPID keys: do not regenerate (would break existing subscriptions)
## Resume Instructions
Working on May PWA improvements. Scroll bug fixed and deployed.
Next: update SYSTEM.md docs with PWA features. Then write MISSION-047 spec.
All changes in weygand-team repo, deployed to Cloudflare Pages.
May PWA URL: https://[your-domain]/may/
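The "after git commit" trigger amounts to inserting a line at the end of one section of session.md. A minimal sketch, assuming sections are delimited by "## " headings as in the example above; append_under_heading is a hypothetical helper name.

```python
def append_under_heading(markdown: str, heading: str, entry: str) -> str:
    """Insert entry as the last line of the section that starts with heading."""
    lines = markdown.splitlines()
    try:
        start = lines.index(heading)
    except ValueError:
        # Section does not exist yet: create it at the end of the file
        return markdown.rstrip("\n") + f"\n{heading}\n{entry}\n"
    end = start + 1
    while end < len(lines) and not lines[end].startswith("## "):
        end += 1
    # Step back over blank lines so the entry stays attached to the section body
    while end > start + 1 and not lines[end - 1].strip():
        end -= 1
    lines.insert(end, entry)
    return "\n".join(lines) + "\n"
```

The same helper works for Decisions Made or Current Goals by passing a different heading.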
MEMORY.md discipline
MEMORY.md auto-loads but truncates at 200 lines. Enforce this strictly:
- Keep MEMORY.md as an index — one to three sentences per project, then "See [topic.md]"
- Extract detailed technical notes to topic files as soon as a section grows beyond 5–6 lines
- Prune resolved items from Open Loops; archive closed projects
- Never let MEMORY.md exceed 200 lines: truncation means everything past line 200 is silently dropped
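These rules can be checked mechanically at session close. The sketch below reports the line count, what a 200-line cutoff would drop, and which sections have outgrown the index; the 6-line section threshold follows the guideline above, and both function names are hypothetical.

```python
def memory_report(text: str, max_lines: int = 200) -> dict:
    """Summarize whether MEMORY.md fits inside the auto-load limit."""
    lines = text.splitlines()
    return {
        "lines": len(lines),
        "over_limit": len(lines) > max_lines,
        "truncated": lines[max_lines:],  # what the cutoff would silently drop
    }


def long_sections(text: str, max_body_lines: int = 6) -> list[str]:
    """Headings whose body is long enough to move into a topic file."""
    counts: dict[str, int] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("#"):
            current = line.lstrip("#").strip()
            counts[current] = 0
        elif current is not None and line.strip():
            counts[current] += 1
    return [name for name, n in counts.items() if n > max_body_lines]
```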
decisions.md discipline
Every decision the operator makes gets logged here, immediately, with context:
## 2026-03-01
**VAPID keys:** Do not regenerate — would break all existing push subscriptions.
**Context:** PWA push notifications live; keys stored as Cloudflare Pages secrets.
**Missions/Status drawer:** Permanent — Chat is always the default view.
**Context:** Confirmed in MISSION-045 session.
This file is append-only. Never delete entries. It's the institutional memory that prevents the operator from being asked the same question twice.
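Append-only logging in this format can be sketched as below. It assumes the log is chronological, so a heading for today, if present, is always the last heading in the file; the function names are hypothetical.

```python
from datetime import date


def decision_entry(topic: str, decision: str, context: str) -> str:
    """Render one decision in the bold-label format used by decisions.md."""
    return f"**{topic}:** {decision}\n**Context:** {context}\n"


def append_decision(log: str, day: date, entry: str) -> str:
    """Return the log with entry appended, adding the date heading once per day."""
    heading = f"## {day.isoformat()}"
    body = log if log.endswith("\n") or not log else log + "\n"
    if heading not in body.splitlines():
        body += heading + "\n"
    # Existing text is never rewritten; new content only goes at the end
    return body + entry
```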
The sessions.md archive
Before overwriting session.md at session close, Scribe appends a summary to sessions.md. This is an append-only cumulative record of everything accomplished:
## 2026-03-01 — Afternoon session
Duration: ~2h
Goals completed: PWA scroll bug fix, receipt persistence, relative timestamps
Commits: abc1234, def5678
Deployed: weygand-team (Cloudflare Pages)
Key decisions: VAPID keys permanent, drawer nav permanent
Next: MISSION-047 system documentation
Autonomy Model
The permission model is binary and clear. Most actions are auto-approved. A short list of consequential actions requires confirmation. The rule of thumb is simple enough to apply in any situation.
The rule of thumb
Auto-approve: just do it
- All code decisions within a scoped plan (architecture, naming, structure, libraries, refactoring)
- File creation, modification, deletion within project repos
- Git commits and pushes to existing repos and branches
- Running builds, tests, dev servers, linters
- API reads (any external API — read operations only)
- Research, drafting, formatting, summarizing
- Installing dependencies, updating configs
- Creating/updating system files (MEMORY, DAILY, broker tasks, heartbeats)
- Deploying to existing environments (GitHub Pages, Cloudflare Pages)
Confirm before doing
- Spending money (API upgrades, paid services, new infrastructure)
- Sending external communications (emails, Slack, PR comments visible to others)
- Modifying source-of-truth business data (Monday.com items, Clockify entries)
- First-time deployment to a new environment
- Deleting a repo, branch, or production data
- Any action touching financial records or sensitive documents
- Actions that cannot be undone
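The two lists reduce to a default-allow rule with a short deny list. The sketch below expresses that as a set intersection; the tag names are invented for illustration, and in the framework itself this judgment is made by the model from SOUL.md and CLAUDE.md rather than by code.

```python
# Hypothetical tags, one per item in the "Confirm before doing" list
CONFIRM = {
    "spend_money",
    "external_communication",
    "modify_business_data",
    "first_time_deploy",
    "delete_repo_branch_or_prod_data",
    "financial_or_sensitive_records",
    "irreversible",
}


def requires_confirmation(action_tags: set[str]) -> bool:
    """Anything not on the confirm list is auto-approved."""
    return bool(action_tags & CONFIRM)
```

An action carrying several tags needs confirmation if any one of them is on the list, which matches the intent that a single consequential aspect is enough to pause.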
Scoped plan execution
Once the operator scopes a plan — "build X", "fix Y", "add Z" — the AI executes the full plan without asking for approval on implementation details. This is critical for avoiding "permission fatigue" where constant confirmation requests train the operator to rubber-stamp everything.
The AI surfaces decisions only when there's a genuine trade-off the operator would care about: a significant performance vs. complexity tradeoff, a choice between mutually exclusive approaches, or an ambiguity that could result in building the wrong thing.
Decision logging as trust infrastructure
Every time the operator makes a decision, it's logged to decisions.md with context. This creates a trust layer: the AI doesn't need to ask the same question twice, and the operator doesn't need to worry about consistent behavior across sessions.
Examples of decisions worth logging:
- Threshold values ("flag overdue after 7 days, not 3")
- Data source preferences ("county GIS first, external API as fallback")
- Autonomy grants ("auto-approve all commits to this repo")
- Architecture choices ("use vanilla JS, no frameworks")
- External communication policies ("never email without confirmation")
Minimizing permission requests
A key design goal: the operator should never have to approve routine actions. Permission prompts for file reads, git operations, and local actions create friction and distract from the actual work. The system is calibrated to prompt only when the stakes are genuinely high.
If the AI is prompting too often, the fix is to update USER.md or CLAUDE.md with explicit auto-approve grants for the action type in question.
Tooling
The system uses Claude Code as its runtime, GitHub for version control and CI, and Cloudflare for edge hosting and APIs. All of it runs on free or low-cost tiers.
Claude Code CLI
The runtime for all AI sessions — both interactive and headless.
# Interactive session (operator present)
claude
# Headless mission (no operator, autonomous)
claude -p "$PROMPT" \
--dangerously-skip-permissions \
--max-turns 200 \
--output-format text
Key flags for mission execution:
- -p "$PROMPT": pass the mission spec as a prompt (non-interactive)
- --dangerously-skip-permissions: skip confirmation prompts (required for autonomous execution)
- --max-turns N: limit agentic turns to control cost
- --output-format text: plain text output (no streaming JSON)
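For launchers written outside the shell, the same invocation can be assembled as an argument list. This is a sketch under assumptions: it passes the whole mission spec as the prompt and reads max_turns from the spec, neither of which the source prescribes, and the function names are hypothetical. Only the flags shown above are used.

```python
import json
import subprocess


def mission_command(prompt: str, max_turns: int = 200) -> list[str]:
    """Build the headless claude invocation using the flags listed above."""
    return [
        "claude", "-p", prompt,
        "--dangerously-skip-permissions",
        "--max-turns", str(max_turns),
        "--output-format", "text",
    ]


def run_mission(spec_path: str) -> int:
    """Run one mission and return the CLI's exit code."""
    with open(spec_path, encoding="utf-8") as fh:
        spec = json.load(fh)
    cmd = mission_command(json.dumps(spec), spec.get("max_turns", 200))
    # An argument list (no shell) avoids quoting problems in the prompt text
    return subprocess.run(cmd, check=False).returncode
```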
run-batch.sh — local mission launcher
The primary way to launch missions locally. Handles the nesting problem (Claude Code sessions can't spawn nested Claude Code sessions):
#!/bin/bash
# run-batch.sh MISSION-001 MISSION-002 ...
unset CLAUDECODE # Remove nesting protection
unset CLAUDE_CODE_SESSION
# Missions run sequentially, in the order given
for MISSION_ID in "$@"; do
  MISSION_DIR="missions/$MISSION_ID"
  # Only the objective field is passed as the prompt here
  PROMPT=$(jq -r '.objective' "$MISSION_DIR/mission.json")
  claude -p "$PROMPT" \
    --dangerously-skip-permissions \
    --max-turns 200 \
    --output-format text
done
Launch in background to keep working: nohup ./scripts/run-batch.sh MISSION-001 &
Monitor: tail -f ~/may-system/logs/missions/batch-*.log
GitHub Actions — mission runner
CI fallback for when local sessions are unavailable. The mission.yml workflow:
name: Run Mission
on:
  workflow_dispatch:
    inputs:
      mission_id:
        description: 'Mission ID (e.g. MISSION-001)'
        required: true
jobs:
  run:
    runs-on: ubuntu-latest
    env:
      # Job-level env: shell variables do not persist between steps
      MISSION: ${{ github.event.inputs.mission_id }}
    steps:
      - uses: actions/checkout@v4
      - name: Install Claude Code
        # Assumption: the CLI is installed from npm; the original workflow omits this step
        run: npm install -g @anthropic-ai/claude-code
      - name: Run mission
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          PROMPT=$(cat "missions/$MISSION/mission.json")
          claude -p "$PROMPT" --dangerously-skip-permissions --max-turns 200
      - name: Commit results
        run: |
          git config user.email "may@[your-domain]"
          git config user.name "May"
          git add missions/
          git commit -m "Mission $MISSION complete" || exit 0
          git push
Sentinel — daily health checks
Sentinel runs on a daily cron via GitHub Actions. It checks all deployed apps, verifies API tokens haven't expired, and commits a result JSON back to the repo:
on:
  schedule:
    - cron: '15 13 * * *' # 13:15 UTC: 7:15 AM CST, 8:15 AM during CDT (GitHub cron runs in UTC)
Sentinel output (broker/results/sentinel-latest.json) is read at every session's Brief Startup to give fresh context about deployment health and recent commits.
may-pwa-reply.sh — PWA reply channel
When Claude responds to a message received via the PWA, this script writes the response to Cloudflare D1 and triggers a push notification:
#!/bin/bash
# may-pwa-reply.sh "$RESPONSE_TEXT"
# Called by email responder after Claude generates a reply
RESPONSE="$1"
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)
# Build the JSON body with jq so quotes and newlines in the reply are escaped
PAYLOAD=$(jq -n --arg content "$RESPONSE" --arg ts "$TIMESTAMP" \
  '{role: "may", content: $content, ts: $ts}')
# Write to D1 outbound table
curl -X POST "https://[worker-name].[account].workers.dev/api/message" \
  -H "Authorization: Bearer $WORKER_SECRET" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD"
# Trigger push notification
curl -X POST "https://[your-domain]/api/push-notify" \
  -H "Content-Type: application/json" \
  -d '{"title":"May","body":"New message"}'
Two-Phase Email Architecture
When the operator sends a message (via PWA or email), the system uses a two-phase approach to guarantee fast replies while allowing complex work to run without timeouts:
# Phase 1: TRIAGE (90s timeout, 10 turns)
# Classifies message, handles simple ones inline, acks complex ones
responder polls inbox → spawns Claude TRIAGE session
→ SIMPLE_QUESTION / DECISION / STATUS / QUICK_TASK → handle + reply immediately
→ MISSION_SCOPE / COMPLEX_TASK → send ack + write handoff JSON
# Phase 2: EXECUTION (30min timeout, 60 turns)
# Picks up handoff files, runs full Claude session with context
processor polls handoffs/ → spawns Claude EXECUTION session
→ reads context files (SOUL.md, MEMORY.md, DAILY.md)
→ does the heavy work (scope mission, execute task, research)
→ sends final reply when done
The handoff JSON file bridges the two phases. Its status moves from pending to in_progress, then to complete or failed:
{
"id": "handoff-20260301-143022",
"status": "pending",
"msg_id": 82,
"source_channel": "pwa",
"classification": "MISSION_SCOPE",
"message_body": "Original message text",
"ack_sent": true,
"instructions": "Triage assessment of what needs to be done"
}
Key design decisions:
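- Operator gets a first response (an answer or an ack) quickly, even for complex requests: the target is ~30 seconds, with the 60s responder poll plus the 90s triage timeout as the worst case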
- Fallback guarantee: if even the triage times out, a generic ack is auto-sent and a handoff is auto-written
- Stale recovery: the processor detects dead PIDs on in-progress handoffs and reclaims them
- Failure notification: if the execution phase fails, a message is sent via PWA so the operator knows
- Both phases run as LaunchAgents (macOS) — responder polls every 60s, processor polls every 30s
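The triage decision and the handoff record can be sketched as two small functions. The classification labels and handoff fields come from the text above; sending unknown labels down the handoff path is an assumption modeled on the fallback guarantee, and the function names are hypothetical.

```python
from datetime import datetime

# Labels that triage handles inline, per the Phase 1 description
INLINE = {"SIMPLE_QUESTION", "DECISION", "STATUS", "QUICK_TASK"}


def route(classification: str) -> str:
    """Decide whether triage replies now or acks and hands off."""
    if classification in INLINE:
        return "reply_now"
    # MISSION_SCOPE, COMPLEX_TASK, and anything unrecognized go to Phase 2
    return "ack_and_handoff"


def make_handoff(msg_id: int, channel: str, classification: str,
                 body: str, instructions: str, now: datetime) -> dict:
    """Build the handoff record that the execution phase picks up."""
    return {
        "id": now.strftime("handoff-%Y%m%d-%H%M%S"),
        "status": "pending",
        "msg_id": msg_id,
        "source_channel": channel,
        "classification": classification,
        "message_body": body,
        "ack_sent": True,
        "instructions": instructions,
    }
```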
The May PWA
A progressive web app that serves as the primary async communication channel between operator and AI chief of staff.
| Feature | Implementation |
|---|---|
| Chat storage | Cloudflare D1 (SQLite at edge) |
| Push notifications | VAPID (Web Push Protocol), Cloudflare Pages Functions |
| File attachments | Cloudflare R2 bucket |
| Nav | Hamburger/drawer — Chat default, Missions + Status in drawer |
| Receipt persistence | localStorage — survives page refresh |
| Timestamps | Relative ("2 min ago") — updates every 60s |
| SW cache versioning | Increment version string (e.g. may-v7 → may-v8) on JS changes |
| iOS push | Requires home screen install (Safari → Share → Add to Home Screen) |
Monday.com MCP
Atlas uses the Monday.com Model Context Protocol (MCP) server for reading board data. This enables natural-language queries against Monday.com during interactive sessions:
- Read board items, statuses, column values
- Filter by status, date, assignee
- Drill into subitems (where invoice/payment data lives)
Known limitation: Mirror and formula columns are unreadable via the MCP API. Dollar amounts on parent items require drilling into subitems. Design queries around this constraint.
Cloudflare Workers — API proxy pattern
Cloudflare Workers serve as the backend for all apps that need server-side logic. Common patterns:
- API proxy: Worker holds API keys; frontend calls worker; worker calls external API. Keys never exposed in client-side code.
- Auth: Worker generates magic link JWT, validates JWT on subsequent requests, returns 401 for unauthenticated access.
- Email routing: Worker receives inbound email via Cloudflare Email, parses it, routes to appropriate handler (D1 write for PWA, email forward for standard).
GitHub Pages — zero-cost frontend hosting
All static frontends deploy to GitHub Pages via GitHub Actions. The deploy pattern:
- uses: peaceiris/actions-gh-pages@v3
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./dist # or ./ for plain HTML
publish_branch: gh-pages
Broker queue (legacy reference)
The broker queue (broker/tasks/ and broker/results/) was the original agent communication mechanism. It's now effectively retired — Missions are the execution model for autonomous work. The queue pattern remains in the codebase for reference and potential future use with standalone agents.
Format for a broker task JSON:
{
"task_id": "TASK-001",
"created_at": "2026-03-01T10:00:00Z",
"agent": "atlas",
"type": "board_scan",
"params": {
"board_id": "[board-id]",
"filter": "AR_pipeline"
},
"priority": "high"
}
Rule Sets
Three files define how the AI operates: SOUL.md (identity), CLAUDE.md (startup and rules), and PLAYBOOK.md (situational responses). Together they replace a complex system prompt with human-readable governance files.
SOUL.md — identity and principles
The identity file. Defines who the AI is, how it thinks, what it cares about, and what it will never do. This is the most durable file in the system — it rarely changes.
Recommended sections
- Identity: Name, role, who it serves, the two-sentence version of what it is
- How it thinks: Triage approach, data vs. narrative preference, action vs. discussion default
- Operating posture: Monitor → Analyze → Act (or Confirm). What triggers confirmation.
- Domain knowledge: What the AI knows about your business (can reference MEMORY.md for details)
- Startup sequence: Ordered list of files to read (mirrors CLAUDE.md)
- Subagent team: Table of agents, roles, schedules
- Communication style: Tone, format preferences, what to never say
- Hard limits: Short list of things this AI never does, ever
SOUL.md skeleton
# [Name] — Chief of Staff
## [Company] | [Entity]
## Identity
You are [Name]. You are [Owner]'s chief of staff across [domains].
You are not an assistant. You are an operator. You manage systems, not conversations.
## How [Name] Thinks
- Triage before responding. [Routing logic for different request types]
- Numbers over narratives. Lead with data.
- Action over discussion. Default to doing or delegating.
## Operating Posture
Auto-approve: [list]
Confirm before: [list]
## Domain Knowledge
### [Primary Business]
[Core facts the AI needs to know — Monday.com, Clockify, key services]
### [Secondary Domain]
[Core facts]
## Subagent Team
| Agent | Role | Schedule |
|-------|------|----------|
| Vulcan | Code | On-demand |
| Atlas | [Tool] monitoring | Every 2h |
## Communication Style
- Direct. No filler phrases.
- Lead with the answer.
- [Other style rules]
## Hard Limits
[Name] never:
- Sends external communications without explicit confirmation
- Modifies source-of-truth business data without explicit confirmation
- Spends money without confirmation
CLAUDE.md — startup and rules
Claude Code reads this file automatically when launched in the repo directory. It's the entry point for the entire system.
Recommended sections
- You Are [Name]: One-paragraph reminder of identity (brief — SOUL.md has the full version)
- Read these files: Ordered list of files to read before doing anything else
- After reading: What to do (deliver brief, check date, etc.)
- System paths: Absolute paths for broker, logs, state, agents (eliminates path confusion)
- Hard rules: Non-negotiable behaviors (archive before overwrite, record decisions, ask one clarifying question)
- Autonomy rules: Auto-approve list and confirm-before list
- Missions section: How missions work in this system
CLAUDE.md skeleton
# CLAUDE.md — [Name] Startup Instructions
## You Are [Name]
Read the following files immediately, in this order, before doing anything else:
1. `SOUL.md` — identity and operating principles
2. `USER.md` — [Owner]'s preferences and context
3. `MEMORY.md` — active projects and long-term knowledge
4. `DAILY.md` — yesterday's context and today's priorities
5. `PLAYBOOK.md` — how to respond to common situations
6. `/path/to/may-system/state/session.md` — last session state
7. `/path/to/may-system/state/projects.md` — project phases
8. `/path/to/may-system/logs/decisions.md` — persistent decisions
After reading all files, deliver the appropriate brief:
- Today's date ≠ DAILY.md date → **Daily Brief**
- Same day → **Launch Brief**
## System Paths
- Broker tasks: /path/to/may-system/broker/tasks/
- State: /path/to/may-system/state/
- Logs: /path/to/may-system/logs/
## Hard Rules
- Never steer a subagent mid-session — write to broker queue only
- At session close: archive session, update heartbeats, update DAILY.md
- Record decisions in logs/decisions.md — never re-ask a resolved question
## Autonomy Rules
Auto-approve: [list]
Confirm: [list]
## Missions
[Mission spec and launch instructions]
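The date comparison that selects the brief can be made mechanical. The following is a minimal sketch in shell, assuming the first line of DAILY.md carries the date as YYYY-MM-DD. The function name `choose_brief` and that header format are assumptions for illustration, not part of the framework as described.

```shell
# Hypothetical helper: decide which brief to deliver.
# Assumes the first line of DAILY.md contains the date as YYYY-MM-DD;
# adapt the pattern if your header uses a different format.
choose_brief() {
  local daily_file="$1"
  local today="${2:-$(date +%F)}"   # second argument overrides "today" for testing
  local header_date
  # Pull the first YYYY-MM-DD string out of the header line, if there is one.
  header_date="$(head -n 1 "$daily_file" 2>/dev/null \
    | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}' | head -n 1)"
  if [ "$header_date" = "$today" ]; then
    echo "Launch Brief"
  else
    # Different date, missing file, or unparseable header.
    echo "Daily Brief"
  fi
}
```

Falling back to the Daily Brief when the header cannot be parsed is a deliberate choice in this sketch: a fuller brief is the safer default when the state of DAILY.md is unknown.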
PLAYBOOK.md — situational response patterns
A lightweight rulebook for common situations. Each entry is a ## "[trigger phrase]" section followed by action bullets that begin with an arrow (→).
Effective PLAYBOOK.md entries are:
- Trigger-based (starts with a phrase pattern)
- Action-oriented (arrows pointing to specific steps)
- Specific (they reference actual files and paths, not abstract concepts)
- Short enough to scan in 30 seconds
Example entries
## "What's going on" / "Status"
→ Executive summary: top 3 priorities, any agent alerts, one overdue open loop
→ Do NOT list every project. Surface what needs [Owner]'s attention.
## "Build [tool]" (operator present)
→ If [Owner] is steering interactively, build it directly in the session
→ For deferred work, create a mission spec and offer to launch it
## "Mission: [objective]"
→ Create mission.json in missions/MISSION-XXX/
→ Tell [Owner]: "Mission MISSION-XXX ready. Run: ./scripts/run-batch.sh MISSION-XXX"
→ Or if [Owner] says "go" — launch it directly in background
## Brief Startup (MANDATORY — before every brief)
→ git pull origin main in repo
→ Read pending-brief.md
→ Read sentinel-latest.json
→ Scan missions/* for newly completed
→ Read result.md + needs-brian.md for completed missions
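The "scan missions/* for newly completed" step can be sketched as a small shell function. This version assumes a mission counts as completed once its directory contains result.md, and it tracks what has already been surfaced with a hypothetical `.briefed` marker file. Both the marker and the function name `list_new_completed` are assumptions, not something the framework defines.

```shell
# Hypothetical helper: print the IDs of missions that have a result.md
# but have not yet been surfaced in a brief.
# The ".briefed" marker file is an assumption made for this sketch.
list_new_completed() {
  local missions_dir="$1" dir
  for dir in "$missions_dir"/*/; do
    [ -f "${dir}result.md" ] || continue   # no result yet: still running or never launched
    [ -f "${dir}.briefed" ] && continue    # already surfaced in an earlier brief
    basename "$dir"                        # e.g. MISSION-001
  done
}
```

After the brief has covered a mission's result.md and needs-brian.md, creating the marker (for example `touch missions/MISSION-XXX/.briefed`) keeps it from being reported again.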
Writing effective identity files
The most common mistake: writing identity files that describe the AI's capabilities rather than its operating posture. Good identity files define behavior, not features.
Effective techniques:
- Be direct. "You are not an assistant. You are an operator." is clearer than "You should try to be helpful in an operator-like way."
- Name the anti-patterns. "Never say 'Great question.' Never say 'Certainly.'" prevents the most common AI communication filler.
- Specify routing logic. "Before answering anything, determine: is this a status question, a decision request, a task to delegate, or a problem to solve?"
- Hard limits are non-negotiable. The hard limits list should be short (5–7 items) and absolute — no hedging.
Getting Started
A step-by-step path for setting up a May-style AI Chief of Staff from scratch. Estimated setup time for a basic working system: 2–3 hours.
Prerequisites
- Claude Code CLI: Install from claude.ai/code. Active subscription required (Max plan recommended for heavy use).
- GitHub account: Free tier works. Repos can be private.
- Cloudflare account: Free tier works for Workers, Pages, D1, R2.
- Basic familiarity with: git, bash scripting, JSON. You don't need to be a developer — the AI writes the code.
Step-by-step setup
-
Create the repo and local state directories
# Create the versioned repo
gh repo create [your-username]/[ai-name] --private
# Create local state (not in git)
mkdir -p ~/[ai-name]-system/{state,logs/missions,agents/may,broker/{tasks,results,drafts}}
-
Write SOUL.md — the most important step
Use the skeleton from the Rule Sets section. Fill in:
- Your AI's name and role
- How it should think (triage, data-first, action-default)
- What it should auto-approve vs. confirm
- 2–3 sentences about each domain it needs to know about
- Hard limits (what it should never do)
This file is the foundation. Take your time on it.
-
Write USER.md
Your preferences as the operator. Include:
- Communication style (bullet-heavy vs. prose, detail level)
- Tech preferences (tools you use, tools you avoid)
- Autonomy grants (what it can do without asking)
- Context about your business/domain that doesn't belong in SOUL.md
-
Write MEMORY.md (initial version)
Start simple. List your active projects with one paragraph each. Add known open loops. Add decisions that are already made. This file grows organically — don't try to capture everything upfront.
-
Write DAILY.md (initial version)
Today's date as the header. Your current priorities. This file gets replaced daily — the initial version just needs to be good enough for the first session.
-
Write PLAYBOOK.md
Start with the essential patterns: "what's going on", "build X", "check on [tool]", Brief Startup protocol, Scribe triggers. Add more as you discover recurring situations.
-
Write CLAUDE.md
Use the skeleton from the Rule Sets section. Set absolute paths for your local state directory. List the files to read in order. Define autonomy rules. Add mission section if you plan to use missions.
-
Run your first session and deliver a brief
cd ~/[ai-name] # repo directory
claude
The AI reads your files and delivers a Daily Brief. Evaluate: does it sound like it understands your business? Does the brief surface the right priorities? If not, iterate on SOUL.md and MEMORY.md.
-
Write your first mission spec
Pick something concrete and scoped: a dashboard, a script, or a static site. Write the mission.json with clear acceptance criteria. Launch it with run-batch.sh. Check result.md when done.
-
Set up Sentinel
Create a GitHub Actions workflow that runs daily and checks your deployed apps. Have it write a JSON result to the repo. Point your session startup to read that JSON. Each brief can then report what is deployed and healthy without manual checks.
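For the Sentinel step, the script that a workflow job runs could look roughly like the sketch below. It is not the framework's actual Sentinel: the function names `probe` and `sentinel_json`, the JSON field names, and the rule that only HTTP 200 counts as healthy are all assumptions chosen for illustration.

```shell
# Hypothetical Sentinel helpers; the JSON field names are assumptions.

# Print the HTTP status code for a URL ("000" when the request fails).
probe() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1" || true
}

# Read "url status" pairs on stdin and print one JSON document.
# $1 is the check time in seconds since the Unix epoch.
# URLs are assumed to contain no characters that need JSON escaping.
sentinel_json() {
  local checked_at="$1" url status healthy sep=""
  printf '{"checked_at_epoch":%s,"apps":[' "$checked_at"
  while read -r url status; do
    [ -n "$url" ] || continue
    if [ "$status" = "200" ]; then healthy=true; else healthy=false; fi
    printf '%s{"url":"%s","http_status":"%s","healthy":%s}' \
      "$sep" "$url" "$status" "$healthy"
    sep=","
  done
  printf ']}\n'
}

# Example wiring inside a workflow step (not executed here):
#   for url in https://app.example.com; do echo "$url $(probe "$url")"; done \
#     | sentinel_json "$(date +%s)" > sentinel-latest.json
```

Keeping the network call (`probe`) separate from the formatting (`sentinel_json`) means the JSON output can be checked locally without reaching any deployed app.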
Common pitfalls and how to avoid them
| Pitfall | What happens | Fix |
|---|---|---|
| MEMORY.md over 200 lines | Context truncation silently loses the bottom half of the file | Extract to topic files aggressively. Keep MEMORY.md as an index. |
| session.md not archived before overwrite | That session's work is permanently lost from the record | Scribe protocol: always append to sessions.md before overwriting session.md |
| DAILY.md not updated at session close | Next session starts with stale context; operator re-explains what happened | Make DAILY.md update a hard rule in CLAUDE.md: "At session close, update DAILY.md" |
| Decisions not logged | Same questions re-asked across sessions; operator frustrated; AI seems to not learn | Log to decisions.md immediately. Treat it as append-only institutional memory. |
| Running missions inside Claude Code | Nested session protection blocks execution | Always use run-batch.sh (which unsets CLAUDECODE) instead of run-mission.sh directly |
| Brief Startup skipped | Brief reports stale data; completed missions not surfaced; Sentinel results missing | Make Brief Startup mandatory in CLAUDE.md and PLAYBOOK.md. No exceptions. |
| Vague acceptance criteria in missions | Mission completes "technically" but doesn't meet actual needs | Write criteria as pass/fail tests: "Page loads at URL X and returns 200". Not "works well". |
| Permission fatigue | Operator approves everything without reading; autonomy model breaks down | Add more actions to the auto-approve list. The confirm list should be short and meaningful. |
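Two of the fixes in the table lend themselves to small guard functions. The sketch below shows one possible shape, assuming the caller passes in the file paths; the function names, the separator line written into the archive, and the warning text are assumptions rather than the framework's own Scribe implementation.

```shell
# Hypothetical guards for two pitfalls from the table above.

# Append the current session file to the archive before it is overwritten.
# Usage: archive_session <session.md path> <sessions.md path>
archive_session() {
  local session_file="$1" archive_file="$2"
  [ -s "$session_file" ] || return 0   # nothing to archive
  {
    printf '\n--- archived %s ---\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    cat "$session_file"
  } >> "$archive_file"                 # append only, never truncate
}

# Warn on stderr and return non-zero when a file exceeds its line budget.
# Usage: check_line_budget <file> [max_lines, default 200]
check_line_budget() {
  local file="$1" max_lines="${2:-200}" count
  count="$(wc -l < "$file" | tr -d ' ')"
  if [ "$count" -gt "$max_lines" ]; then
    echo "WARN: $file has $count lines (limit $max_lines); extract topics into separate files" >&2
    return 1
  fi
}
```

Calling `archive_session` at the start of the session-close routine, before anything writes a new session.md, keeps the ordering that the Scribe protocol requires.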
Tips for long-term operation
- Review MEMORY.md weekly. Prune resolved open loops. Archive closed projects. Add new domains as they emerge.
- Run Sentinel before trusting a brief. If Sentinel hasn't run in 24+ hours, the health data may be stale.
- Treat needs-brian.md as your inbox. After any mission completes, read its needs-brian.md before moving on.
- Log decisions immediately. The moment a decision is made — in any session — log it. Don't batch them up.
- Write clear mission specs, not vague goals. "Build a dashboard showing X, Y, Z data with filters for A and B, deployed to GitHub Pages, all criteria passing" is a mission. "Improve the dashboard" is not.
- Let missions fail loudly. If a mission can't meet criteria, it should say so in result.md. A partial result with clear blockers is better than silent failure.
- Update SOUL.md as you learn. The first version of your identity files won't be perfect. Iterate. Add communication style rules as you notice patterns. Adjust the autonomy lists as you build trust, for example by moving routine actions from confirm to auto-approve.
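The 24-hour staleness rule for Sentinel can be checked before a brief is trusted. The sketch below assumes the Sentinel JSON carries a numeric `checked_at_epoch` field (seconds since the Unix epoch). That field name and the function name `sentinel_is_stale` are assumptions; file modification time is avoided here because a `git pull` resets it.

```shell
# Hypothetical staleness check for the Sentinel result file.
# Assumes a numeric "checked_at_epoch" field in the JSON (an assumption).
# Returns 0 (true) when the data is stale, 1 when it is fresh.
# Usage: sentinel_is_stale <json file> [now_epoch] [max_age_seconds]
sentinel_is_stale() {
  local json_file="$1" now="${2:-$(date +%s)}" max_age="${3:-86400}" checked
  # Extract the digits that follow the field name, without needing a JSON parser.
  checked="$(grep -oE '"checked_at_epoch"[[:space:]]*:[[:space:]]*[0-9]+' "$json_file" 2>/dev/null \
    | grep -oE '[0-9]+$' | head -n 1)"
  [ -n "$checked" ] || return 0          # missing file or field: treat as stale
  [ $(( now - checked )) -ge "$max_age" ]
}
```

A session could use it as `if sentinel_is_stale sentinel-latest.json; then echo "Sentinel data is stale"; fi` and flag the health section of the brief accordingly.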
Scaling up
Once the core system works, common growth paths:
- Add Sentinel → automated health monitoring, deploy verification, token expiry alerts
- Add a PWA → async communication channel (Claude responds even when you're not at a terminal)
- Add email routing → Cloudflare Email + Workers → Claude responds to emails autonomously
- Add Monday.com MCP → natural-language queries against project management data during sessions
- Scale missions → batch launch multiple missions in parallel, each scoped to a different repo/feature