LightReach · Engineering Workspace FAQ Standard / v1.0

Engineering Standard

How we write code at LightReach.

At LightReach, the engineered artifact is the prompt, and code is what falls out of it. This document explains how we work: what we expect from a prompt, how we review prompts in spec.lightreach.io, how captures and .prompts files should behave, and how activity metrics are computed — plus why no implementation reaches GitHub before its prompt has been read, challenged, and approved by another engineer.


§ 1

Why we review prompts, not code.

For most of the history of software, code review existed because engineers wrote almost every line of code by hand. Reviewing code was reviewing the work itself. That assumption no longer holds for us.

At LightReach, the engineering work happens before the keyboard touches an editor. It happens when an engineer decides what a system should do, what it must not do, what constraints apply, and how we will know the result is correct. That work lives in the prompt. The implementation that follows is a compilation of that prompt — useful, but downstream of the part where the thinking happened.

So we shifted the gate. Prompts are reviewed in spec.lightreach.io. Code is stored in GitHub. We treat code as a versioned artifact of an approved prompt, the way a traditional team treats a binary as the artifact of approved source.

A prompt is not a chat message. A prompt is an engineering specification that happens to compile to software. Treat it that way and the rest of this document follows.

§ 2

Principles.

1. The prompt is the artifact under review.

When a teammate sends you a change, look at the prompt first. If the prompt is unclear, the generated code does not need to be reviewed yet — it needs to be regenerated from a better prompt.

2. Variance is a signal, not a footnote.

A prompt is well-engineered when two reasonable compilations of it converge on the same design, behavior, and risk profile. If they diverge, the prompt has hidden assumptions or missing constraints, and we fix the prompt before fixing anything downstream.

3. The evaluation plan is the test spec.

Acceptance criteria, edge cases, and failure modes are written into the prompt itself, and the tests that ship with the artifact must describe the same behavior. The answer to “is this correct?” is decided before a line of code is generated.

4. Spec gates the work. GitHub stores it.

Spec is where we agree on what the system should be. GitHub is where the resulting implementation lives. There is no second review on GitHub — the prompt was the review.

§ 3

What a strong prompt looks like.

A strong prompt is something another engineer could pick up tomorrow and produce roughly the same system from. It is not a request — it is a specification. At minimum, it tells the reader (human or model) the answers to the following:

Each element below names something the prompt should answer:

  • Objective: Why this change exists, in terms of a user or system outcome.
  • Context: What product area, prior decisions, files, and behaviors are relevant.
  • Scope and non-goals: What must change, and what must explicitly remain untouched.
  • Constraints: Architecture, data, security, performance, and compatibility boundaries.
  • Acceptance criteria: Observable behavior that distinguishes correct from incorrect output.
  • Evaluation plan: Tests, manual checks, and the evidence that proves it compiled correctly.
  • Variance target: What must remain stable across multiple compilations of the prompt.

If a prompt is missing any of these for a non-trivial change, it is not yet review-ready. Send it back to drafting; do not begin implementation.
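As a concrete illustration, here is the skeleton of a review-ready prompt for a small change. Both the change and the field names are invented for this example; the element list above, not this layout, is the contract:

```yaml
# Hypothetical example: field names mirror the elements above,
# not a mandated on-disk schema.
objective: >
  Timed-out card charges currently double-bill; after this change a
  timeout is retried at most once and never produces two ledger entries.
context: services/billing/charge.py and the existing idempotency-key helper.
scope: Retry logic in charge.py only.
non_goals: No changes to the public billing API or invoice rendering.
constraints: Postgres only; p99 charge latency stays under 300 ms.
acceptance_criteria:
  - A timeout during capture yields exactly one ledger entry.
  - Retrying the same request within 24 h is a no-op.
evaluation_plan: Unit tests on the retry path plus a replayed production trace.
variance_target: >
  Retry count, idempotency-key derivation, and ledger schema must be
  identical across compilations.
```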

§ 4

How we review prompts in Spec.

The review surface is spec.lightreach.io, shaped like a pull request: the branch, not an individual prompt, is the unit of review. Reviewers read the spec diffs together with the captured sessions that produced them as a single unit of intent, then approve or reject as they would a PR.

How it runs today. On a feature branch, edit the bundle. With spec init hooks installed, each commit mirrors your git index into Spec staging and runs capture; new sessions append to prompts/<branch-slug>.prompts (one file per branch). Sign in once with spec login. spec push or git push uploads the staged bytes; on a non-default branch, that push opens or refreshes a branch review unless you pass --no-review. Use --reviewer <email> (repeatable) to seed reviewers. Once approvals.required from spec.yaml is met, Merge to <default-branch> promotes the branch to trunk in Cloud. You still compile and push the implementation from your machine (spec compile hands off to Claude Code by default; --via api uses the separate compiler package).

  1. Branch locally; capture writes into prompts/<branch-slug>.prompts alongside spec edits.
  2. spec push uploads branch-tagged revisions; they are visible in Cloud but not on trunk yet.
  3. Non-trunk push opens the review (or use Open review on Branches); add reviewers or use --reviewer.
  4. Reviewer: diff vs. trunk left, sessions right; Approve or Reject.
  5. Enough approvals → merge onto the bundle default branch in Cloud; branch history on disk is unchanged.
  6. Compile from approved trunk prompts, verify, push code to GitHub — no second PR-style review there.

Capture & hooks.

The CLI reads Claude Code and Cursor session stores (see spec prompts capture --source), appending idempotently into the branch file. Your conscious moves are git add, commit, and push when Cloud should see the branch.

Git hooks (installed by spec init).

In a git checkout, spec init installs Spec hook segments under .git/hooks/ for pre-commit, pre-push, and post-merge. (commit-msg and post-commit files are also installed but are deprecated no-op stubs — they exist only so a previous Spec install whose hook code lived there gets cleanly superseded; they are safe to remove if you really want to.) Refresh them anytime with spec git-hooks install. Cloned bundles that predate hooks: run spec git-hooks install once from the bundle root. To remove Spec’s blocks only (other hook logic in the same files is left intact), run spec git-hooks uninstall. Standard escape hatches: git commit --no-verify and git push --no-verify skip hooks; SKIP_SPEC_PUSH=1 skips only the Spec upload inside git push. Multi-bundle monorepos should set SPEC_BUNDLE_ROOT to the bundle directory.

  • pre-commit — Does two things in one pass, before git locks the tree for the commit:
    1. Captures prompts. Reads new turns from your local Cursor / Claude Code / Codex session stores, redacts them, and merges into prompts/<branch-slug>.prompts. The capture file is git add-ed back into the index so it ships in the same commit your code change does — no “commit twice” dance. On a successful capture you’ll see a one-line prompts capture · N session snapshot(s) → prompts/<branch>.prompts.
    2. Mirrors the git index into Spec staging. Walks git diff --cached --name-status and runs spec add / spec unstage for every bundle-eligible path. Renames unstage the old path and add the new; copies add the new path only. Auxiliary Markdown like README.md is skipped the same way spec add . skips it. On any change you’ll see a one-line spec: +N staged · -K unstaged (mirrored from git index) summary on stderr so you can see Spec doing its job inside git commit.
  • pre-push — On git push of branch refs (not tag-only pushes), runs spec push so Cloud receives the same branch + commit SHA you are publishing. Set SKIP_SPEC_PUSH=1 to skip just this hook for one push.
  • post-merge — After git merge (or the merge half of git pull) lands on the bundle’s default branch, rolls any non-trunk prompts/<slug>.prompts files into the trunk file so the canonical narrative stays on trunk after a feature branch merges.
  • commit-msg / post-commit — Installed for compatibility with older Spec versions (which ran capture from commit-msg); both are no-op stubs today. Capture moved to pre-commit in 0.2 because git locks the tree before commit-msg fires, which made captures land outside the in-flight commit.
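The mirroring half of pre-commit can be sketched as a pure translation from git's name-status codes to Spec staging operations. This is an illustrative reimplementation, not the shipped hook; the "add"/"unstage" labels stand in for the real spec add / spec unstage calls:

```python
# Illustrative reimplementation of the index mirror: translate
# `git diff --cached --name-status` lines into Spec staging operations.
# "add"/"unstage" stand in for the real `spec add` / `spec unstage` calls.

def mirror_index(name_status):
    """Return ('add' | 'unstage', path) pairs for each staged change."""
    ops = []
    for line in name_status.splitlines():
        if not line.strip():
            continue
        parts = line.split("\t")
        status = parts[0]
        if status.startswith("R"):      # rename: unstage old path, add new
            ops.append(("unstage", parts[1]))
            ops.append(("add", parts[2]))
        elif status.startswith("C"):    # copy: add the new path only
            ops.append(("add", parts[2]))
        elif status == "D":             # deletion: drop from Spec staging
            ops.append(("unstage", parts[1]))
        else:                           # A / M / T: (re)stage current bytes
            ops.append(("add", parts[1]))
    return ops
```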

Upload semantics. spec push uploads the current bytes on disk for each staged path, labeled with the git branch and HEAD at push time. If a file changed since your last spec add, the CLI warns before sending. The root spec.yaml is folded into every non-empty push automatically when it exists on disk, so a snapshot stays valid after a prior upload cleared it from Spec staging (git hooks only mirror paths you committed).

$ git checkout -b feature/billing-rewrite
$ git commit   # hooks: staging sync + capture
$ spec push --reviewer teammate@company.com
# or: git push  # pre-push hook runs spec push

Shell integration (installed by the CLI installer).

The git hooks above only fire after a bundle exists. The shell integration covers the two remaining gaps, bundle creation and daemon startup:

  1. git init → spec init — running git init (or git init <dir>) in any directory also runs spec init in the same place, so a brand-new repo gets a Spec bundle the moment it is created.
  2. Autostart of spec watch — the rc-file block also wires a per-prompt hook (zsh precmd, bash PROMPT_COMMAND, fish --on-event fish_prompt) that runs spec live ensure --quiet the first time you prompt inside a spec init’d folder each shell session. It walks up from $PWD looking for spec.yaml, bails in ~50µs when there is none, and otherwise spawns the watcher daemon in the background. Idempotent: a second prompt in the same bundle is a no-op.

Install once, the rest of your workflow stays exactly the same. The curl installer writes both pieces; set SPEC_NO_SHELL_INTEGRATION=1 before running the installer to skip the rc-file block entirely. The autostart hook also honours SPEC_NO_AUTOSTART=1 in your environment and the spec live autostart off per-machine preference (see §5).

The wrapper is a git() shell function that calls command git "$@" first, preserves git’s exit code, and only triggers spec init when: the subcommand was init, git returned 0, the target directory does not already contain a spec.yaml, and the invocation isn’t --bare / --shared=*. bash, zsh, and fish are all supported; the shell is detected from $SHELL.
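The bundle detection underneath the autostart hook is a walk-up for spec.yaml. The real check runs in-shell for speed; this Python sketch shows the same logic:

```python
# The same walk the autostart hook does in-shell: climb from the current
# directory toward / and stop at the first spec.yaml. Nearest bundle wins.
from pathlib import Path

def find_bundle_root(start):
    for candidate in [Path(start), *Path(start).parents]:
        if (candidate / "spec.yaml").is_file():
            return candidate
    return None
```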

Manage the wrapper directly with the spec shell command group:

  • spec shell install — (re)install both wrapper and autostart hook into ~/.zshrc, ~/.bashrc, or ~/.config/fish/config.fish. Idempotent — rerunning replaces the Spec-managed block in place. --shell {bash|zsh|fish} forces a flavour; --rc-file PATH targets a different file.
  • spec shell uninstall — remove the Spec block from your rc file. Surrounding content (your aliases, exports, etc.) is left untouched.
  • spec shell snippet — print the wrapper to stdout. Useful for review (“what is this CLI putting in my rc?”) or for installing into a config-management system instead of an rc file.

The whole rc-file block is bracketed by # >>> spec shell integration >>> / # <<< spec shell integration <<< sentinels, so deleting those lines by hand also works.

How reviewers find out.

  • In-product queue. Listed reviewers see “Awaiting your review” on Branches; no integration required.
  • Share link. Copy review link deep-links the review (e.g. ?branch=).
  • Slack (opt-in). Set cloud.notifications.slack_webhook in spec.yaml for review.opened, review.approved, review.merged, review.closed. Posts run after the API responds; one retry on transient 5xx or connection errors, then best-effort drop. Email as a product channel is still out of scope — use the share link.

§ 5

Spec Live: real-time team feed and edit presence.

Review (§ 4) is the asynchronous gate: prompts wait for a teammate to read, challenge, and approve before the implementation reaches GitHub. Spec Live is the synchronous layer underneath it — while you and a teammate are simultaneously prompting Cursor / Claude Code / Codex against the same bundle, both of you can see each other’s prompts as they happen, and the AI IDEs in your terminal know which files the other person is currently editing.

Autostart. The watcher daemon “flicks on” the moment you prompt inside a spec init’d folder. Three independent triggers do this without you remembering to start anything:

  • Shell prompt hook. The rc-file block from § 4 runs spec live ensure --quiet on every shell prompt that lands inside a bundle. First entry → daemon starts; subsequent prompts → ~50µs no-op.
  • Claude Code UserPromptSubmit hook. Wired into .claude/settings.json by spec init. Every time you submit a prompt inside Claude Code, the same idempotent spec live ensure --quiet runs — so even users who never open a terminal get the live feed.
  • Manual. spec live start or spec watch --background if you want to flip it on yourself (foreground users get spec watch).

Daemon state lives in .spec/watch.pid (PID, host, log path) and .spec/watch.log (rolling stdout/stderr); both are gitignored under the existing .spec/ rule. The daemon refuses to spawn a duplicate when one is already running, so racing autostart triggers are safe.

Two layers, one daemon (spec watch):

  • Prompt feed. Every new turn in any local agent session is redacted and broadcast to the rest of the team within a few seconds. Receivers render it in their terminal; with --mirror, peer turns also append to prompts/captured/peers/<handle>/<branch>.prompts for local grep.
  • Edit presence. Every ≈15 seconds, the CLI runs git diff --numstat HEAD and broadcasts the dirty file list with per-file +/− line counts. Receivers maintain .spec/team-presence.json, which AI IDEs read before file edits.
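A sketch of the broadcast side, assuming only what the bullet above states (numstat output becomes per-file +/− counts; the key names here are illustrative, not the wire schema):

```python
# Parse `git diff --numstat HEAD` output into the per-file counts the
# watcher broadcasts every ~15 s. Key names are illustrative.
def parse_numstat(output):
    files = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        added, removed, path = line.split("\t", 2)
        files[path] = {
            # binary files report "-" for both columns
            "lines_added": int(added) if added != "-" else 0,
            "lines_removed": int(removed) if removed != "-" else 0,
        }
    return files
```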

How AI IDEs use presence.

spec init wires three integration vectors so every AI editing surface in the bundle warns — or blocks — before touching a file a teammate is in:

  • Claude Code. A PreToolUse hook in .claude/settings.json runs spec hooks claude-pre-tool-use before every Edit / MultiEdit / Write / NotebookEdit / StrReplace / Delete. If a teammate is editing the target file, Claude shows the warning inline. Re-install with spec hooks install-claude --block to make Claude refuse the call instead of just warning.
  • Cursor. .cursor/rules/spec-team-presence.mdc with alwaysApply: true tells the model to call spec locks check <path> before destructive edits (stale-mirror safe). Run spec init --upgrade-rules in an existing bundle after a CLI upgrade to refresh this file and the Claude hook block.
  • Any other AI agent. AGENTS.md documents the same contract: before modifying a file, run spec locks check <path>; exit code 0 = clear, 2 = a teammate is likely editing it. The human-readable mirror .spec/team-editing-brief.md is updated alongside team-presence.json whenever spec watch refreshes the JSON.

When spec watch isn’t running, team-presence.json is missing and every consumer fails open (exit 0, silent). We never block work because the daemon is off. spec locks check additionally ignores a stale mirror (no fresh updated_at) so zombie JSON on disk does not warn forever. The exit-code contract is the universal substrate — any tool that can run a subprocess and check exit codes can plug in.

What we mean by “the lock file”.

Two on-disk artefacts work together as the bundle’s edit-coordination substrate. Neither is a hard mutex — both are advisory mirrors that AI agents read before they touch a file:

  • .spec/team-presence.json — machine-readable. Schema is stable (schema: 1) and includes a pre-built files_index for O(1) path lookup. Updated by spec watch on every incoming presence event.
  • .spec/team-editing-brief.md — the same data rendered as plain English (“@alice is editing auth.py — +12/−3 lines”). The brief is what a model with no JSON-parsing budget can drop straight into its context window.

For programmatic use, prefer spec locks check <path> --json. It returns a single line of JSON with a stable shape so any agent can decide without parsing the mirror itself. Exit codes match: 0 = clear, 2 = at least one teammate is dirty in that path with a fresh mirror.

# Clear (no live data, outside bundle, stale mirror, or genuinely clear):
$ spec locks check src/auth.py --json
{"clear": true, "path": "src/auth.py", "holders": [], "pull_alerts": []}

# Conflict — one or more teammates are touching that path right now:
$ spec locks check src/auth.py --json
{"clear": false, "path": "src/auth.py", "holders": [
  {"handle": "alice", "lines_added": 12, "lines_removed": 3, "untracked": false}
], "pull_alerts": []}

# A teammate just ran `spec push` on the same branch — your local branch
# is now behind. ``pull_alerts`` carries that hint regardless of the path:
$ spec locks check src/auth.py --json
{"clear": true, "path": "src/auth.py", "holders": [],
 "pull_alerts": [{"handle": "alice", "branch": "main",
                  "short_commit": "bbbb222", "self_short": "aaaa111"}]}
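Any consumer that prefers the JSON over the exit code can branch on the documented fields. A minimal sketch (the returned strings are this example's, not part of the CLI contract):

```python
# Branch on the documented fields of `spec locks check <path> --json`.
# The returned strings are this example's convention, not the CLI's.
import json

def interpret(line):
    result = json.loads(line)
    if not result["clear"]:
        who = ", ".join(h["handle"] for h in result["holders"])
        return "conflict: " + who
    if result["pull_alerts"]:
        return "pull needed"
    return "clear"
```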

Pull-needed hint after a teammate pushes.

When a teammate runs spec push on the same branch you’re on, your local HEAD is now behind — and any AI IDE about to write into the working tree should git pull first. Spec broadcasts the new head commit as part of the post-push presence event, so peers learn within an RTT instead of waiting for the regular 15 s watcher tick.

The signal surfaces in three places at once:

  • spec locks pull-status [--json] — tiny dedicated command. Exit 0 when in sync (or mirror missing/stale); exit 2 when a same-branch peer is ahead. Designed for pre-edit hooks: cheap, parseable, fails open on no-live-data.
  • spec locks check <path> --json always carries pull_alerts alongside the per-path holders, so existing hooks pick up the signal for free.
  • .spec/team-editing-brief.md renders a “Pull needed” section above the dirty-files list whenever any same-branch peer is at a different head_commit — the brief is the single readable file every IDE and rule points at, so the signal lands without changing any consumer.

$ spec locks pull-status
⚠ pull needed — teammate(s) pushed commits ahead of your branch:
  · @alice on `main` at `bbbb222` (you: `aaaa111`)
  Run `git pull` before continuing to edit.

$ spec locks pull-status --json
{"clear": false, "alerts": [
  {"handle": "alice", "branch": "main",
   "short_commit": "bbbb222", "self_short": "aaaa111"}
]}

Why same-branch only? Cross-branch divergence is normal — a teammate on a feature branch is expected to sit at a different SHA from main. The pull-needed signal is for the narrow case where your branches match but their commit doesn’t, which is exactly the post-push window.
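The rule itself fits in a few lines. A sketch using the field names from the JSON above (the real watcher's internals may differ):

```python
# The same-branch rule: alert only when a peer's branch matches yours
# but their head commit differs. Cross-branch divergence is ignored.
def pull_alerts(self_branch, self_commit, peers):
    return [
        p for p in peers
        if p["branch"] == self_branch and p["short_commit"] != self_commit
    ]
```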

Single-user, multi-agent locks (cross-tool coordination on one machine).

team-presence.json coordinates across teammates; a separate file, .spec/active-edits.json, coordinates across your own AI agents. The motivating case: you run Claude Code in a terminal, have Cursor’s Agent open in the same repo, and a Codex Desktop window doing a background review. Git can’t tell those three apart — from its perspective, all the edits are yours. Without a separate lock layer, two of your own agents will happily rewrite the same lines at the same time.

Each entry in active-edits.json is a short-lived lock keyed by (agent, session_id, paths) with a TTL (default 5 min, capped at 60 min). The flow:

  1. Before a write tool call, the agent’s PreToolUse hook runs spec locks acquire <path> --agent claude_code --session abc. Same agent + session re-acquires are renewals; cross-agent overlaps surface as a conflicts entry in the JSON output and (with --block) exit 2.
  2. After the tool call, PostToolUse runs spec locks release <id> (the matching claude-post-tool-use hook does this automatically; spec hooks install-claude wires both hooks for you).
  3. Every consumer that reads locks — spec locks check, Cursor’s rule, team-editing-brief.md — merges active-edit holders with team-presence holders into one holders[] array (rows from this layer carry kind: "active_edit" so renderers can disambiguate).

$ spec locks acquire src/auth.py --agent cursor --session 8c2 --json
{"acquired": true, "lock_id": "11ab…", "paths": ["src/auth.py"],
 "agent": "cursor", "session_id": "8c2", "conflicts": []}

# meanwhile, in another terminal:
$ spec locks check src/auth.py --json
{"clear": false, "path": "src/auth.py", "pull_alerts": [],
 "holders": [{"kind": "active_edit", "agent": "cursor",
              "session_id": "8c2", "intent": null,
              "handle": "you (cursor)", "self": true,
              "lock_id": "11ab…", "expires_at": "…"}]}

$ spec locks list
11ab2c3d  · cursor · session 8c2 · pid 4711 · expires 2026-05-12T18:33Z
  paths: `src/auth.py`

A crashed agent never holds a lock past the TTL — the on-read filter treats expires_at ≤ now as released. Use spec locks prune to physically clean expired rows when you want the file tidy. The lock file is local-only: it is never pushed to Cloud or shared with teammates, since the whole point is intra-machine coordination.
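The on-read filter described above amounts to a timestamp comparison. A sketch, assuming expires_at is ISO 8601 with an explicit offset:

```python
# On-read TTL filtering: a lock whose expires_at is at or before "now"
# is treated as released, even if the row is still physically in the
# file (what `spec locks prune` would later delete).
from datetime import datetime, timezone

def live_locks(rows, now=None):
    now = now or datetime.now(timezone.utc)
    return [r for r in rows if datetime.fromisoformat(r["expires_at"]) > now]
```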

Commands you’ll actually run.

Day-to-day, you do not need to run anything — the autostart hooks above ensure the daemon is alive whenever you’re prompting inside a bundle. The commands here are for inspection and explicit control:

# inspection
spec live status                  # daemon up? broadcasting on? why?
spec team                         # last 20 prompt events for this bundle
spec team --org                   # workspace-wide feed (all bundles you can see on Cloud)
spec team watch                   # live SSE tail across the whole workspace (Matrix-style continuous feed)
spec team request-push jc -m "need your branch"   # git handoff → .spec/team-push-requests.yaml (merged by spec watch)
spec team flag <event_id> -k warning -m "race condition risk"   # flag a teammate's prompt in real time
spec team --user alice           # narrow to one teammate (substring on handle / name)
spec presence show                # who’s editing what right now
spec presence check path/to/file  # exit 0 clear, exit 2 conflict (legacy; no stale guard)
spec locks check path/to/file     # same exit codes; ignores stale team-presence.json
spec locks pull-status            # exit 2 when a same-branch peer pushed ahead
spec locks acquire path/to/file --agent claude_code --session abc   # same-machine multi-agent lock
spec locks release <lock_id>       # drop a per-agent active-edit lock
spec locks list                   # show every active-edit lock for this bundle
spec locks prune                  # housekeeping: remove expired active-edit locks
spec locks show-brief             # print .spec/team-editing-brief.md
spec bundle doctor                # compare manifest name vs git origin vs cloud slug
spec journal sync                 # per-day markdown under docs/spec-journal/
spec journal rollup               # one weekly-style markdown rollup (CI-friendly)

# explicit lifecycle (rarely needed thanks to autostart)
spec live start                   # background daemon for this bundle
spec live stop                    # SIGTERM → SIGKILL after grace
spec live restart                 # bounce after upgrading the CLI
spec watch                        # foreground watcher (debugging / power use)
spec watch --background           # same as `spec live start`

# policy toggles
spec live off                     # disable broadcasting for this bundle (spec.yaml)
spec live mute                    # silence broadcasting on this machine
spec live autostart off           # disable shell-hook autostart on this machine

Continuous workspace stream.

spec team watch is the dedicated pane for live review. One long-lived SSE connection to GET /api/me/prompt-stream, every turn across every bundle you can see, in real time. Dropped connections reconnect with backoff; Last-Event-ID resumes from the last frame, so a network blip never costs a turn; idle pings keep the terminal from looking frozen.
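The resume mechanics follow the standard SSE wire format. A minimal parser sketch showing how the last frame id is carried forward (not the CLI's actual implementation):

```python
# Minimal SSE bookkeeping: remember the id of the last dispatched frame.
# On reconnect that id goes out as the Last-Event-ID request header so
# the server replays anything missed. Comment lines (": ping") fall through.
def parse_sse(stream_text):
    events, data, last_id = [], [], None
    for line in stream_text.splitlines():
        if line.startswith("id:"):
            last_id = line[3:].strip()
        elif line.startswith("data:"):
            data.append(line[5:].strip())
        elif line == "" and data:   # blank line dispatches the pending event
            events.append({"id": last_id, "data": "\n".join(data)})
            data = []
    return events, last_id
```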

What every frame tells you.

› USER @bayocotjc · prompt to claude_code · main · 12:08:11 · acme/widgets
  cwd ~/code/widgets  touched billing.py, auth.py, +1 more  session 8a13c2
  please refactor billing.py to use the new auth helper

Each header carries the badge (USER mint, AI cyan, ERROR red), the source (Claude Code purple, Codex lime, Cursor cyan), the branch, the bundle, and the time. The indented context row underneath shows three optional chips when the broadcaster ships them:

  • cwd — teammate’s working directory, shortened to ~. Reveals which repo someone is in when a teammate has multiple bundles open in parallel.
  • touched — basenames of files this turn changed (first two plus an overflow marker). A cheap diff-visibility proxy that lets you spot “why is the AI writing to auth.py?” before the diff lands.
  • session — the first six characters of the upstream session id. Two concurrent sessions from the same teammate stay distinguishable.

Verbose by default.

Assistant turns ship with their full text body so reviewers see the AI’s actual output, not just a summary. Two opt-outs: set cloud.prompt_stream.verbose: false in spec.yaml for a team-wide quiet feed, or run spec team watch --no-verbose for a per-session switch. User prompts are never stripped — stripping the prompt body would defeat the point of a review feed.

Tool-only AI turns.

When an agent runs a long Edit / Read / Bash chain with no prose, the watcher synthesises a deterministic one-liner (ran 3 tools: Edit billing.py, Bash "pytest", Read auth.py) and routes it through the auto-critic first. Quiet by default; --show-tool-runs opts back into the noisy view.

Error frames.

Adapters can ship role: error when an agent fails on its own — tool error, refused request, rate limit, timeout. The watcher renders these with a red ERROR badge so “agent in trouble” never looks like “agent quiet”. Wire shape is stable; adapter coverage is a rolling opt-in.

In-pane slash commands.

Type while the stream scrolls. No TUI, no new pane — just slash, command, Enter. Scrollback, mouse-copy, and multiplexer integration keep working. Disable with --no-commands.

  • /help — list every command.
  • /summarize <n>{h,m} — dump the last window as a structured block for the agent already running in this terminal to synthesise. No spec-cli API call, no LLM cost — the running Cursor / Claude Code / Codex agent is the summariser.
  • /flag <event_id> <kind> [note…] — post a flag without leaving the pane. Kinds: warning · question · block · ack.
  • /focus <handle> · /focus off — one teammate only, until cleared.
  • /mute <handle> · /unmute <handle> — additive suppression.
  • /replay <n>{h,m} — re-emit the last window through the notifier; critic and flag rendering re-run.
  • /search <term> — grep the in-memory event buffer. Matches bodies, summaries, paths, handles, the session id, and event ids; up to 25 newest-first hits.
  • /critic on · /critic off — toggle the auto-critic at runtime.
  • /status — who is active right now, by source and bundle, from the in-memory buffer.
  • /pair — force-print the pending user+assistant pair (uses Cloud tail merge when needed).
  • /turn [<session-chip>] · /full [<session-chip>] — open the latest turn or the whole session in your system pager (less / PAGER).
  • /push <handle> [message…] · /push@handle [message…] — from a bundle cwd, record a git-push handoff in .spec/team-push-requests.yaml (same as spec team request-push; merged into team-presence.json / team-editing-brief.md on the next spec watch mirror tick).

The buffer is bounded (last 500 events). To act on something older, drop back to spec team flag <event_id> from another terminal.
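The buffer semantics are a plain ring buffer. A sketch, with hypothetical event and search shapes:

```python
# A bounded ring of events: the oldest frame silently falls off past 500,
# and search walks newest-first with a hit cap. Shapes are illustrative.
from collections import deque

class EventBuffer:
    def __init__(self, maxlen=500):
        self.events = deque(maxlen=maxlen)

    def add(self, event):
        self.events.append(event)

    def search(self, term, limit=25):
        hits = [e for e in reversed(self.events) if term in e["body"]]
        return hits[:limit]
```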

Git push handoffs.

When you need a teammate’s commits on the remote now, ask them by Spec handle. The CLI appends a time-bounded row to .spec/team-push-requests.yaml. With spec watch running, active rows are merged into .spec/team-presence.json (push_requests) and summarized in .spec/team-editing-brief.md so every AI tool that already reads the lock mirror sees the handoff. Every slash command (pager behaviour, /flag, /search, &c.) is documented in docs/team-watch-slash-commands.md in the spec-cli source tree.

Flagging a teammate’s prompt in near real time.

While you’re watching the stream, you can flag any prompt event by id and the flag fans out over the same SSE channel so every connected watcher sees it within an RTT. Closed enum of kinds so receivers can render them with stable glyphs: warning · question · block · ack. Event ids are visible in spec team output (prefixed #) so you can copy them straight into the flag command.

# Heads-up that something looks off (warns the room):
spec team flag 4711 --kind warning --note "race condition risk"

# Hard stop — “do not let the agent run this”:
spec team flag 4712 --kind block --note "destructive migration"

# Quick acknowledgement on a peer’s prompt:
spec team flag 4713 --kind ack

Flags are server-stamped (the flagger’s identity is unspoofable), bounded (max 500 chars on the note), idempotent per author per kind, and deletable only by their author. The REST list endpoint at GET /api/projects/{id}/prompt-events/{event_id}/flags is the canonical source for any UI rendering a history.

Catching AI mistakes in real time.

A streaming feed is only useful if it helps you act. Three review aids run inside spec team watch. No LLM round-trips, no extra services.

1. Auto-critic.

Every user prompt and every assistant turn is matched against a catalogue of “blast radius” rules. Each firing rule prints one suggestion indented under the event, paired with the exact spec team flag command to escalate. Toggle at runtime with /critic off; disable for the whole session with --no-critic.

rule · severity · catches
destructive-verb · block · rm -rf, git reset --hard, DROP TABLE, shutil.rmtree, …
test-bypass · block · “disable / skip / comment out the failing tests”, --no-verify, “disable CI”
secret-in-prompt · block · Stripe / GitHub / AWS / Slack tokens, private RSA blocks, JWTs
vague-intent · warning · verb with no scope: “fix this”, “clean up”
trust-handoff · warning · “you decide”, “make it good”
multi-task · info · “… and also …” compound prompts

Assistant-side coverage inspects tool summaries too, so an agent that runs a destructive command surfaces with the same severity as a teammate who types one.
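Mechanically, the catalogue is regex matching with a severity per rule. A sketch with a few illustrative patterns (not the shipped list):

```python
# Regex rules with a severity each, run over user prompts and assistant
# tool summaries alike. These patterns are illustrative stand-ins.
import re

RULES = [
    ("destructive-verb", "block",   re.compile(r"rm -rf|git reset --hard|DROP TABLE")),
    ("test-bypass",      "block",   re.compile(r"--no-verify|skip the failing tests")),
    ("vague-intent",     "warning", re.compile(r"^\s*(fix this|clean up)\s*$", re.I)),
    ("multi-task",       "info",    re.compile(r"\band also\b")),
]

def critique(text):
    return [(name, sev) for name, sev, rx in RULES if rx.search(text)]
```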

2. Notify on block hits.

Pass --notify when you can’t keep eyes on the pane. Every block-severity critic hit rings the terminal bell; on macOS, the watcher additionally fires an osascript display notification banner with the rule, the event id, and the author. Default is off — this is the “I’m in another window when someone pastes a secret” switch, not the everyday review mode.

3. No-reply hint.

If a user prompt has been visible for 90 s and no assistant turn from the same session has arrived, the watcher prints ⏳ no-reply. Usually means the teammate’s broadcaster is in summary-only mode and the empty summary got dropped — ask them to flip cloud.prompt_stream.verbose: true.

Visual cues.

Every frame carries a chunky badge: USER, AI, ERROR. Source adapters are color-coded too — claude_code, codex, cursor — so concurrent sessions on different tools stay separable at a glance.

Roadmap.

  • Diff visibility. Ship a compact unified diff (first ~20 lines) alongside Edit / Write tool calls so reviewers see what changed, not just that something changed.
  • Adapter-side errors. Wire role: error emission into Claude Code / Codex / Cursor so agent failures are first-class frames instead of implicit silence.
  • Server-side critic. A system flag author so every watcher sees the same critique without re-running rules locally.
  • LLM second pass. Optional, for the rules regex cannot catch.

Journal rollup (CI and Cloud).

To automate spec journal rollup weekly, copy examples/github-actions/journal-weekly.yml from the spec-cli repository into your bundle’s .github/workflows/ and add a SPEC_ACCESS_TOKEN repository secret (same JWT as in ~/.spec/credentials after spec login). The CLI also honors SPEC_ACCESS_TOKEN / SPEC_API / SPEC_USER_HANDLE in the environment so CI never has to write a credentials file.

On Spec Cloud (spec.lightreach.io), the API stores the same style of rollup as markdown — deterministic listing from recent Spec Live events, no LLM. Authenticated project members: GET https://spec.lightreach.io/api/projects/{project_id}/journal/rollup returns the latest snapshot (empty until the first refresh or scheduled worker run); POST https://spec.lightreach.io/api/projects/{project_id}/journal/rollup/refresh rebuilds it if you have write access. Production runs a background worker (SPEC_JOURNAL_ROLLUP_ENABLED=1; interval SPEC_JOURNAL_ROLLUP_INTERVAL_SECS, default 86400 seconds).

Privacy posture.

  • Secrets are redacted from every outbound payload — same _SECRET_PATTERNS as .prompts files on disk. Bearer tokens, OAuth keys, and friends never reach the wire.
  • Assistant turns are summary-only by default. Full assistant text is only shipped when you explicitly opt in with cloud.prompt_stream.verbose: true in spec.yaml or --verbose-out on the command line.
  • The author block on every event is server-stamped from your bearer token. Nobody can spoof your handle on the feed.
  • Read access gates everything — only project members can post or read. Outsiders get a 400.
  • Three opt-out layers, all surfaced through spec live: spec live off writes the per-bundle broadcasting setting (spec.yaml, committed so the team agrees on policy); spec live mute writes a per-machine broadcasting override (~/.spec/preferences.json); spec live autostart off writes a per-machine override that suppresses the shell hook’s automatic start of the daemon (you can still spec live start by hand). The env var SPEC_NO_AUTOSTART=1 is the same suppression for one shell session.

What’s deferred.

File-level presence is what ships today. Within-a-file cursor / line position, per-prompt line diffs for forensics, and authoritative server-side locks with TTL still need Cloud + CLI work beyond the local mirror. The SSE channel is the substrate. The +/− line counts answer “is this teammate in auth.py right now, and how much have they touched?”; they do not yet answer “which lines.” The hook contract above will keep working once those land — consumers only widen, never narrow.

§ 6

Install and update the spec CLI.

You need the CLI for spec login, capture, hooks, and spec push. One-liner (macOS, Linux, WSL) — installs uv if missing, then the tool into ~/.local/bin:

curl -LsSf https://spec.lightreach.io/install.sh | sh

The script runs uv tool install so spec lives in an isolated environment; if system Python is older than 3.9, uv can pull a compatible runtime. It also runs spec shell install to wire git init → spec init into your interactive shell rc file; set SPEC_NO_SHELL_INTEGRATION=1 to opt out. Then run spec --help. If your shell cannot find spec, run uv tool update-shell and open a new terminal.

spec --help
spec --version

Update to the latest main (safe to repeat anytime; same as re-running the one-liner):

uv tool install --force git+https://github.com/Unit237/specforge-cli.git

Uninstalling the CLI removes the tool environment only — bundles and ~/.spec stay until you delete them. (spec-cli is the distribution name; uv tool list shows the exact label if yours differs.)

uv tool uninstall spec-cli

The installer script is hosted at /install.sh as plain text so you can read it before running it.

Manual install (no curl | sh).

Same install in three steps — reasonable if you don’t want to pipe a script into sh:

# 1. Get uv (one-time; skip if you already have it).
curl -LsSf https://astral.sh/uv/install.sh | sh

# 2. Install spec from source.
uv tool install git+https://github.com/Unit237/specforge-cli.git

# 3. Wire `git init` → `spec init` into your shell. Skip if you don't want it.
spec shell install

Pin a tag with ...specforge-cli.git@v0.1.0 once released; on main you track trunk. Step 3 is what the one-liner installer does for you automatically — see shell integration.

From a clone (CLI contributors).

git clone https://github.com/Unit237/specforge-cli.git
cd specforge-cli
uv tool install --editable .
uv sync --extra dev
uv run pytest

Claude Code and compile.

By default, spec compile writes .spec/compile-prompt.md for an existing Claude Code session (detected via claude --version). The opt-in API path is spec compile --via api, plus ANTHROPIC_API_KEY and the separate compiler package.

Environment variables.

The CLI targets the hosted Cloud by default; override when needed:

Variable — Purpose
SPEC_REF — Git ref for install.sh (default main).
SPEC_REPO — Git URL for install.sh (default the official CLI repo).
SPEC_NO_PATH — Set to 1 so install.sh skips the post-install PATH suggestion.
SPEC_NO_SHELL_INTEGRATION — Set to 1 so install.sh does not wire git init → spec init into your rc file. See shell integration.
SPEC_API — Cloud API origin (paths live under /api/; do not append /api).
SPEC_HOME — Credentials directory (default ~/.spec).
SPEC_BUNDLE_ROOT — Bundle directory for git hooks in a monorepo.
SKIP_SPEC_PUSH — Set to 1 so the pre-push hook skips spec push.
SPEC_HOOK_PUSH_EXTRA_ARGS — Extra arguments forwarded to spec push from the pre-push hook.
CLAUDE_HOME — Claude Code store location (default ~/.claude).
CURSOR_HOME — Cursor application-data root for capture (see CLI defaults per OS).

If something goes wrong.

command not found: spec — Run uv tool update-shell and open a new terminal.

uv: command not found — Open a new terminal after uv install, or export PATH="$HOME/.local/bin:$PATH".

Don’t use pip install spec-cli for this tool — the PyPI name may point elsewhere; use uv tool as above.

Corporate proxy — HTTPS_PROXY / HTTP_PROXY are honored by curl and uv.

§ 7

Variance: how we know a prompt is well-engineered.

Variance is the difference between two valid compilations of the same prompt. A prompt with low variance produces consistent designs and behavior across runs. A prompt with high variance produces meaningfully different systems each time, which means the prompt — not the model — is the problem.

When the change is non-trivial, reviewers should compare two compilations across four dimensions before approving the prompt:

  • Behavior. Aligned: same observable outcome for the same input. Diverged: different user-facing or API behavior.
  • Architecture. Aligned: same boundaries, ownership, and data flow. Diverged: design choices that would be hard to reconcile later.
  • Safety. Aligned: same permissions, data exposure, and operational risk profile. Diverged: one compilation introduces a risk the other avoids.
  • Evaluation. Aligned: same definition of “correct.” Diverged: the compilations disagree on what proves correctness.

If any dimension diverges, the prompt is not done. The author tightens constraints, adds context, or sharpens acceptance criteria, and we compile again. Variance is meant to be argued about — we deliberately do not reduce it to a single score, because the point is for two engineers to look at two compilations and agree, in writing, about what would and would not be acceptable to ship.

If you cannot defend the prompt against its own variance, the prompt is not ready. Push it back into drafting; do not push code into GitHub.
§ 8

Evaluation: the prompt is the test spec.

We do not write tests as a separate step. We write evaluation plans, and the tests fall out of them.

A prompt with sharp acceptance criteria already describes the tests that should exist. The prompt says what “correct” looks like; compiling the prompt produces both the implementation and the tests that prove it; the reviewer checks that the generated tests describe the same behavior the acceptance criteria did.

The bar is not “the change has tests.” Plenty of changes have tests that do not actually verify anything important. The bar is that the prompt’s evaluation plan and the artifact’s tests describe the same thing. If they do not, the prompt was vague, the implementation drifted, or both — and we fix the prompt.

If you cannot point to the line in the prompt that a test came from, the prompt was incomplete. Add it, re-approve it, and regenerate.
§ 9

A working prompt template.

Use this as a starting point in spec.lightreach.io. Fill in every field. If a field does not apply, say so explicitly — do not leave it blank.

Prompt — v1

Objective. What user or system outcome should this change produce?

Context. What product area, prior decisions, files, and behaviors matter?

Requirements. What must the implementation do?

Non-goals. What must remain untouched?

Constraints. Architecture, data, security, performance, compatibility.

Variance target. What must remain stable across compilations?

Acceptance criteria. Observable outcomes that prove correctness.

Evaluation plan. Tests, checks, or review evidence required.

GitHub artifact. The code artifact produced from this prompt, once approved.

§ 10

Notes for reviewers.

Reviewing a prompt is harder than reviewing a diff, because there is no compiler error to hide behind. A few things that help:

  • Read the prompt as if you knew nothing about the conversation that led to it. If you cannot implement it from the prompt alone, the prompt is incomplete.
  • Ask what the prompt does not say. Most defects in generated code trace back to something the prompt left implicit.
  • If you can imagine two reasonable compilations producing different behavior, name the divergence out loud and ask the author to remove it.
  • Approve intent, not output. Saying yes in Spec means you agree with what is being built, not that you have validated every line of generated code.
  • When in doubt, send it back. A returned prompt is cheaper than a reverted change.
§ 11

Exceptions.

There are real cases where prompt review must be skipped: a production incident, a security rollback, a hotfix gating revenue. In those cases, the engineer pushes the fix and writes the retrospective prompt in Spec within twenty-four hours. The prompt is reviewed after the fact, and the system absorbs the same lesson it would have absorbed before. Skipping the prompt entirely is not an option; only its order changes.

Outside of incidents, there are no exceptions. If a change feels small enough to skip, write the prompt anyway. It will take ten minutes and it will save the next engineer who touches that code an afternoon of guessing what we meant.


§ 12

Capture workflow we expect.

Spec treats prompts as source. The machine-readable record lives in prompts/<branch-slug>.prompts (trunk is usually prompts/main.prompts). If you follow a few habits, your sessions stay attributable, reviewable, and fair in the metrics below.

  1. Use a branch for real work. Capture appends to the current branch’s file. Trunk stays the canonical narrative after review and merge.
  2. Let capture run. After you commit, spec prompts capture (often via git hook) should append new [[sessions]] blocks. That is how turns and git context land in the file.
  3. Commit when you have something to anchor. A session’s optional [sessions.commit] block carries commit_sha for the commit the capture ran against. Regular commits give downstream metrics an honest link between conversation and shipped bytes.
  4. Push when you want Cloud to see it. Nothing leaves your laptop until you choose to push; the same rule applies to how aggregates appear on profiles and bundle views.

How .prompts files travel through review.

On main (or whatever your bundle’s default branch is), capture writes to prompts/main.prompts. On a feature branch feature/x, capture writes to prompts/feature-x.prompts (the branch slug is lowercased, and runs of characters outside [a-z0-9._-] collapse to -). The original branch name survives as [commit].branch inside the file, so the slug only has to be a filesystem-safe handle.
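
The slug rule is compact enough to state as code — a sketch, not the CLI’s actual implementation:

```python
import re


def branch_slug(branch: str) -> str:
    """Lowercase the branch name and collapse each run of characters
    outside [a-z0-9._-] into a single '-'."""
    return re.sub(r"[^a-z0-9._-]+", "-", branch.lower())
```

So feature/x becomes feature-x, and a branch like Fix/API v2 becomes fix-api-v2.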

When the branch review is approved and merged in Cloud, every session in prompts/feature-x.prompts is appended into prompts/main.prompts with merged_from, merged_at, and approved_by stamped on each session. The merge is append-only and deduplicated by session id; trunk’s prompts file never has merge conflicts. The branch’s own feature-x.prompts stays in branch history under its slug for forensic reading. This is why trunk’s main.prompts ends up the canonical narrative of the bundle — the merger of every approved branch’s sessions, ordered by started_at, with the green-dot review signal next to the ones that arrived through governance.
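
A minimal sketch of that merge, with session dicts standing in for [[sessions]] blocks (the merged_at and approved_by stamps are elided for brevity):

```python
def merge_sessions(trunk: list[dict], branch: list[dict], *, merged_from: str) -> list[dict]:
    """Append branch sessions into trunk, deduplicated by session id,
    with the result ordered by started_at."""
    seen = {s["id"] for s in trunk}
    merged = list(trunk)
    for s in branch:
        if s["id"] in seen:
            continue                                 # append-only: duplicates are dropped
        merged.append({**s, "merged_from": merged_from})
        seen.add(s["id"])
    return sorted(merged, key=lambda s: s["started_at"])
```

Because the merge only ever appends new session ids, trunk’s prompts file cannot develop merge conflicts.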

Which .md files Spec treats as bundle content.

Every Markdown file in the worktree is evaluated against a 6-step ladder, first match wins. The intent: capture every page of English that shapes the build and ignore the ones written for humans.

  1. Frontmatter override. A spec: true / spec: false at the top of the file wins over everything else — explicit beats default, always.
  2. spec.exclude match. Globs in spec.yaml are honored.
  3. Explicit spec.include match. If you typed it into spec.include, it’s in — even if it would otherwise hit the human-doc denylist (e.g. you can opt docs/CHANGELOG.md in by listing it).
  4. Agent-instruction allowlist. AGENTS.md, CLAUDE.md, GEMINI.md, llms.txt, llms-full.txt, and .github/copilot-instructions.md are recognized at any depth, case-insensitive. These are the files whose explicit purpose is to instruct an AI agent or human implementer about the codebase — they belong in the bundle regardless of where they sit.
  5. Human-doc denylist. README.md, CHANGELOG.md, CONTRIBUTING.md, CODE_OF_CONDUCT.md, SECURITY.md, LICENSE(.md/.txt), NOTICE, HISTORY.md, and ROADMAP.md are excluded by default. They’re for humans browsing the repo, not for the build. Step 1 or step 3 above can pull any of them back in for a specific bundle if needed.
  6. Default include glob. When spec.include is unset, docs/**/*.md is the implicit catch-all. Move your design docs and architectural notes into docs/ and they ship with the spec for free.
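
A sketch of the ladder as a single predicate — simplified in two ways: it checks bare filenames rather than full any-depth paths for most allowlist entries, and it approximates the ** glob with fnmatch, whose * crosses /:

```python
from fnmatch import fnmatch
from typing import Optional

AGENT_ALLOWLIST = {"agents.md", "claude.md", "gemini.md", "llms.txt", "llms-full.txt"}
HUMAN_DENYLIST = {
    "readme.md", "changelog.md", "contributing.md", "code_of_conduct.md",
    "security.md", "license.md", "license.txt", "notice", "history.md", "roadmap.md",
}


def is_bundle_content(path: str, frontmatter_spec: Optional[bool],
                      exclude: list[str], include: Optional[list[str]]) -> bool:
    """Walk the 6-step ladder; first match wins."""
    name = path.rsplit("/", 1)[-1].lower()
    if frontmatter_spec is not None:                        # 1. frontmatter override
        return frontmatter_spec
    if any(fnmatch(path, g) for g in exclude):              # 2. spec.exclude
        return False
    if include and any(fnmatch(path, g) for g in include):  # 3. explicit spec.include
        return True
    if name in AGENT_ALLOWLIST or path.lower() == ".github/copilot-instructions.md":
        return True                                         # 4. agent-instruction allowlist
    if name in HUMAN_DENYLIST:                              # 5. human-doc denylist
        return False
    return fnmatch(path, "docs/*.md")                       # 6. default include glob
```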

Anything that falls through is auxiliary — spec status renders it as ignored, the same row as a .png. To opt a tree of prose-relevant Markdown outside docs/ into the bundle, add an include: line to spec.yaml:

spec:
  include:
    - docs/**/*.md
    - architecture/**/*.md
    - notes/decisions/*.md

You do not need to write intent paragraphs or extra metadata for the metrics in §§ 13–16. They are derived from turns and timestamps already in the file.

§ 13

Activity metrics (three aggregates).

We start with numbers that are cheap to explain and hard to game accidentally:

Name — Meaning
Contribution heatmap — How many user turns you sent per calendar day (attributed to you).
Iteration depth — How many turns appear in a single [[sessions]] block (same session, ordered turns).
Abandonment rate — Among sessions that are eligible (finished for counting, with the N-day grace period elapsed), the fraction that still have no linked git commit_sha.

Profiles show your heatmap. Bundle and team views roll the same signals up across everyone who contributes prompts files on that bundle (Cloud stores every .prompts path for the bundle; metrics dedupe sessions by id and aggregate).

In the web UI, iteration depth is summarized as the median and p90 of turns per session (both total turns and user-only turns) — not a single raw count.

Teams. The team rollup uses the same math but foregrounds linked session rate (eligible sessions that already have a commit_sha) alongside abandonment counts — both views read the same underlying eligibility rules in § 16. Per-member rows attribute sessions with the same operator / commit-email fallback described in § 14, keyed to each member’s email.

§ 14

Contribution heatmap.

Each [[sessions.turns]] entry with role = "user" counts as one prompt for the heatmap. We bucket by the turn’s timestamp (the at field when present, otherwise the session start). Attribution uses the session’s operator when present; when it is missing (common on older captures), the UI falls back to [sessions.commit].author_email, and then to the file-level [commit].author_email, so heatmaps still line up with git identity.
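
A sketch of the bucketing and fallback, assuming session dicts shaped like the .prompts schema (UTC day buckets are an assumption of this sketch; the UI’s timezone handling is not specified here):

```python
from collections import Counter
from datetime import datetime, timezone


def heatmap_counts(sessions: list[dict], file_author_email: str) -> Counter:
    """Count user turns per (identity, UTC calendar day)."""
    counts: Counter = Counter()
    for s in sessions:
        # operator -> [sessions.commit].author_email -> file-level [commit].author_email
        who = (s.get("operator")
               or s.get("commit", {}).get("author_email")
               or file_author_email)
        for turn in s.get("turns", []):
            if turn.get("role") != "user":
                continue                       # assistant/error turns never count
            ts = turn["at"] if "at" in turn else s["started_at"]
            day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
            counts[(who, day)] += 1
    return counts
```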

The visualization is intentionally similar to GitHub’s contribution graph: it answers “how often did this engineer steer the model on this calendar day?” It is not a score for creativity or correctness.

§ 15

Iteration depth.

Iteration depth is the number of [[sessions.turns]] entries in that session, in file order. The Spec Cloud UI reports median and p90 of that count across sessions (and the same for user-only turns). If you export raw data, you can recompute either definition; document which one a chart uses.
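
For example, given exported per-session turn counts, the two summary numbers can be recomputed with the standard library (the interpolation method below is one reasonable choice; the UI’s exact percentile method is not specified):

```python
import statistics


def depth_summary(turn_counts: list[int]) -> tuple[float, float]:
    """Median and p90 of turns-per-session across sessions."""
    med = statistics.median(turn_counts)
    # quantiles(n=10) yields the 9 deciles; the last one is p90.
    p90 = statistics.quantiles(turn_counts, n=10, method="inclusive")[-1]
    return med, p90
```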

§ 16

Abandonment — and why open sessions do not count.

A session is abandoned for metrics when all of the following are true:

  1. The session is finished for counting (see below).
  2. After it became finished, there is still no commit_sha under that session’s [sessions.commit] table.
  3. A grace period of N whole days has passed since the session became finished. The hosted product currently uses a fixed N = 14 days everywhere; per-org controls are not exposed in the UI yet.

What “finished for counting” means.

We exclude in-flight sessions from abandonment entirely. Silence while you are thinking, in a meeting, or overnight is not abandonment.

A session becomes finished for counting when either:

  • The capture records an ended_at timestamp on the session, or
  • There has been no new turn for the quiet window W. The hosted product currently fixes W = 36 hours; it is not yet configurable per organization. The clock runs from the latest turn timestamp in that session.

If you stepped away mid-session, you are not “penalized” until the session has gone quiet for at least the quiet window. Only then does the grace clock for abandonment start.

Parameters (constants today).

Symbol — Role
W — Quiet window. Minimum idle time with no new turns before a session without ended_at is treated as finished for counting. Fixed at 36 hours in the production UI today.
N — Grace days. After a session is finished for counting, wait this many whole days before counting it abandoned if commit_sha is still missing. Fixed at 14 days today.

Abandonment rate is then abandoned_sessions / eligible_sessions, where eligible sessions are those finished for counting whose grace window has also elapsed. Sessions still active (new turns inside W) sit in a third bucket: in progress — visible if we show counts, but never the denominator.
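
The whole rule fits in a short sketch (“in grace” and “linked” are labels of this sketch for the unnamed intermediate bucket and the non-abandoned eligible sessions; timestamps are unix seconds):

```python
HOUR, DAY = 3600, 86400
W_SECS = 36 * HOUR   # quiet window W
N_SECS = 14 * DAY    # grace period N


def classify(session: dict, now: float) -> str:
    """Bucket one session: 'in progress', 'in grace', 'linked', or 'abandoned'."""
    last_turn = max(t["at"] for t in session["turns"])
    ended = session.get("ended_at")
    if ended is None:
        if now - last_turn < W_SECS:
            return "in progress"             # still inside the quiet window
        finished_at = last_turn + W_SECS     # went quiet: finished for counting
    else:
        finished_at = ended
    if now - finished_at < N_SECS:
        return "in grace"                    # finished, grace clock still running
    has_commit = bool(session.get("commit", {}).get("commit_sha"))
    return "linked" if has_commit else "abandoned"


def abandonment_rate(sessions: list[dict], now: float) -> float:
    buckets = [classify(s, now) for s in sessions]
    eligible = sum(b in ("linked", "abandoned") for b in buckets)
    return buckets.count("abandoned") / eligible if eligible else 0.0
```

In-progress and in-grace sessions never enter the denominator, which is exactly why stepping away mid-session cannot hurt the number.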

Honest limits: exploratory spikes and pairing sessions may never produce a commit; they can look abandoned under this rule even when they were valuable. That is why we publish the rule in plain language; when per-org tuning ships, these defaults remain the starting point.

LightReach Engineering · How we write code · v1.0

Maintained by the LightReach engineering team. Suggestions, corrections, and disagreements are welcome — file them in spec.lightreach.io and we will discuss them the same way we discuss any other prompt. Install the CLI via § 6 on this page; the clean URL /install redirects here.