Skills

Five skills are embedded in the binary and installable into Claude Code, Cursor, OpenCode, and Codex via backlog install-skills. They turn the CLI into an agentic loop, from a single enhanced task to a fully judged multi-checkpoint goal, with persistent project memory.

What skills are

Skills are markdown files, no code or binary, that become part of an AI coding assistant's context when invoked. They teach the assistant how to drive the backlog CLI and how to layer richer workflows on top of it.

The Backlog repo ships five skills. The base skill (backlog) is the canonical CLI reference; the other four layer workflows and project memory on top of it.

All five skills are embedded in the backlog binary. backlog install-skills detects Claude Code, Cursor, OpenCode, and Codex on your machine and writes the right file in each tool's native format. There's no repo to clone and nothing to copy by hand.

Skills inventory

Skill | Purpose | Invocation
backlog | Base skill. Full CLI reference: every command, flag, ID format, enum, JSON shape, common agent workflow. | /backlog <request>
backlog-enhance-tasks | Rewrites a task's title and description for clarity. Adds structured sections (Context / Acceptance criteria / Implementation hints). Optionally attaches an implementation plan. | /backlog-enhance-tasks TASK-N
backlog-loop | Picks up one task and iterates implementation → verification → Judge sub-agent, up to 5 attempts, until the Judge approves, then exits. Designed for headless claude -p execution. | /backlog-loop <project>
backlog-goal | End-to-end goal workflow. PREP mode does exhaustive intake and seeds a checkpoint-based board. RUN mode executes via Scout/Judge/Worker sub-agents with parallel safety and a mandatory final audit. | /backlog-goal <goal>
backlog-memory | Project memory in one skill: learn (load tasks, plans, docs, and memory into the session) and store (synthesize the project's state into persistent memory). Auto-picks the mode (learn on a fresh session, store after work) and asks when ambiguous. | /backlog-memory

Each workflow skill declares its dependency on the base backlog skill at the top of its file, so an agent loading backlog-goal (for example) is reminded to read backlog/skill.md first for CLI semantics.

Installation

backlog install-skills

Scans $HOME for supported AI coding tools and writes every embedded skill into each one in the format that tool expects. Re-running is safe: existing files are skipped unless you pass --force.

What gets written where

Tool | Detected by | Skill file
Claude Code | ~/.claude/ | ~/.claude/skills/<name>/skill.md
Cursor | ~/.cursor/ | ~/.cursor/rules/<name>.mdc (with frontmatter)
OpenCode | ~/.config/opencode/ | ~/.config/opencode/skills/<name>/skill.md
Codex | ~/.codex/ | ~/.codex/skills/<name>/SKILL.md
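The detection rule in the table above can be approximated in a few lines of shell. This is only a sketch: the function name detect_targets is hypothetical, and the real installer may apply additional checks beyond the presence of each config directory.

```shell
# Sketch: report which install targets would be detected under $HOME.
# The directory-to-tool mapping is copied from the table above; the
# function name is hypothetical and this is not the CLI's own code.
detect_targets() {
  [ -d "$HOME/.claude" ] && echo "Claude Code"
  [ -d "$HOME/.cursor" ] && echo "Cursor"
  [ -d "$HOME/.config/opencode" ] && echo "OpenCode"
  [ -d "$HOME/.codex" ] && echo "Codex"
  return 0
}
```

Calling detect_targets prints one tool name per line for each directory that exists, which is a quick way to predict where a plain backlog install-skills run would write.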

Useful flags

Flag | Effect
--all | Install into every supported target even if its config dir does not exist yet.
--force | Overwrite existing skill files instead of skipping.
--dry-run | Print what would be written without touching the filesystem.
--skill <name> | Install only the named skill (repeatable; default: all five).

You can also keep skills at the project level by leaving skills/ in the repo. Claude Code auto-discovers project-scoped skills when invoked inside the repo.
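A common pattern is to preview with --dry-run before writing anything. The sketch below builds one --skill pair per requested skill. BACKLOG_BIN and install_selected are hypothetical names introduced here so the wrapper can be exercised without the real binary; only the install-skills flags themselves come from the list above, and combining them this way is an assumption.

```shell
# BACKLOG_BIN is a hypothetical override (defaults to the real binary).
BACKLOG_BIN=${BACKLOG_BIN:-backlog}

# install_selected MODE_FLAG SKILL...
# MODE_FLAG is one of the documented flags, e.g. --dry-run or --force.
install_selected() {
  mode_flag=$1
  shift
  skill_args=""
  for skill in "$@"; do
    skill_args="$skill_args --skill $skill"   # --skill is repeatable
  done
  # Word splitting is intentional: skill names contain no spaces.
  $BACKLOG_BIN install-skills $mode_flag $skill_args
}
```

For example, install_selected --dry-run backlog backlog-loop previews the two skills, and the same call with --force overwrites existing copies.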

The workflow-heavy skills (backlog-loop, backlog-goal) work best in Claude Code where sub-agent dispatch is available. They will work in Cursor, OpenCode, and Codex but will not get parallel sub-agent execution.

backlog (base)

The canonical reference for driving the CLI. Every other skill assumes the agent has loaded this one.

What it covers

  • Workspace, profile, project, task, plan, comment, label, memory, doc, attachment commands
  • Global flags: --profile, --db, --json, --as, --quiet
  • ID formats: TASK-N, bare integer, full ULID
  • DB resolution order: --db → $BACKLOG_DB → --profile → default profile
  • Enum values for type, status, priority, actor.kind
  • JSON response shapes for every list/show command
  • Import-findings file format for bulk task creation
  • MCP tool surface (when backlog mcp serve is running)
  • Common agent workflows: triage findings, pick up a task, revise a plan, capture a decision as memory
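To make the DB resolution order concrete, here is a small sketch of the precedence, assuming the order reads: --db first, then $BACKLOG_DB, then --profile, then the default profile. The function resolve_db_source is hypothetical and only illustrates which source wins; it is not how the CLI is implemented.

```shell
# resolve_db_source FLAG_DB FLAG_PROFILE
# Prints the source that would win under the assumed precedence.
# Pass an empty string for a flag that was not given.
resolve_db_source() {
  flag_db=$1
  flag_profile=$2
  if [ -n "$flag_db" ]; then
    echo "--db $flag_db"
  elif [ -n "${BACKLOG_DB:-}" ]; then
    echo "BACKLOG_DB $BACKLOG_DB"
  elif [ -n "$flag_profile" ]; then
    echo "--profile $flag_profile"
  else
    echo "default profile"
  fi
}
```

Under this reading, an exported BACKLOG_DB overrides --profile, which is worth remembering when a command seems to target the wrong database.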

Usage

/backlog add a task to fix the login timeout in the auth service
/backlog list all open P1 vulnerabilities in the api project
/backlog show TASK-12 and its plan history
/backlog import the findings from scan-2026-05.json into project api

The skill instructs the assistant to translate the natural-language request into the matching CLI invocation, parse --json output, and present results in a readable form.

backlog-enhance-tasks

Improves a single task's title, description, and (optionally) attaches an implementation plan.

Invocation

/backlog-enhance-tasks TASK-N              # rewrite title and description
/backlog-enhance-tasks TASK-N --build-plan  # plus attach a plan

What it does

  1. Fetches the task with backlog task show TASK-N --json.
  2. Rewrites the title to an imperative verb, specific subject, under 80 chars. Scope is unchanged.
  3. Expands the description into structured markdown:
    ## Context
    <1-2 sentences on why this matters>
    
    ## Acceptance criteria
    - [ ] <specific, testable criterion>
    
    ## Implementation hints
    <file paths, function names, only if clearly known>
  4. Writes back with backlog task update.
  5. If --build-plan is passed, generates an implementation plan with numbered steps + testing section and attaches it via backlog plan add.
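The title rule and the description template from steps 2 and 3 can be sketched as two shell helpers. Both function names are hypothetical. The skill itself performs the rewrite in the assistant; this only shows the shape of the output that would then be written back with backlog task update.

```shell
# title_ok TITLE: succeeds when the title is under 80 characters.
title_ok() {
  [ "${#1}" -lt 80 ]
}

# render_description CONTEXT HINTS CRITERION...
# Prints the three-section markdown body described in step 3.
render_description() {
  context=$1
  hints=$2
  shift 2
  printf '## Context\n%s\n\n' "$context"
  printf '## Acceptance criteria\n'
  for criterion in "$@"; do
    printf '%s\n' "- [ ] $criterion"   # one checkbox per criterion
  done
  printf '\n## Implementation hints\n%s\n' "$hints"
}
```

Each criterion passed to render_description becomes its own unchecked box, which is what later lets a Judge evaluate them one by one.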

When to use

  • Before running /backlog-loop on a task that's too vague to be judgeable.
  • Before handing a backlog item to a sub-agent or a human teammate.
  • As part of triage, to convert a rough one-liner into something actionable.

backlog-loop

Single-task primitive with a built-in Judge gate. Picks one task, iterates implement → verify → judge → fix → re-judge until the Judge approves or 5 attempts have failed, then exits. Never picks up a second task.

Invocation

/backlog-loop <project>          # next highest-priority todo task
/backlog-loop <project> TASK-N   # a specific task
/backlog-loop help               # print the headless-execution guide

Workflow

  1. Select the first task by --status todo --sort priority, or take the explicit ref.
  2. Judgeability check. If the description has no verifiable criteria, refuse to pick up and suggest /backlog-enhance-tasks first. A loop without a Judge gate is the failure mode this skill exists to prevent.
  3. Move to doing, comment "attempt 1 of max 5".
  4. Attach a plan if non-trivial (more than one file, more than one concern, non-obvious criteria).
  5. Iterate (max 5 attempts):
    • Implement (from scratch on attempt 1, from the prior Judge's next_fix_hint on attempts 2–5).
    • Run verification commands.
    • Dispatch the Judge sub-agent with a strict [JUDGE RECEIPT] schema. The Judge reads the task description, runs every verification command, and rejects proxy signals (tests pass ≠ feature works, files changed ≠ behavior changed, plan written ≠ implemented, build green ≠ correct output).
    • Post the receipt as a comment. PASS → break. FAIL → next attempt.
  6. On PASS. Completion comment, task move --status done, exit.
  7. On 5 failures or hard blocker. Diagnostic comment (what kept failing, what was tried, what's needed to unblock), task move --status todo, exit.
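The control flow of steps 5 to 7 reduces to a bounded retry loop gated by the Judge's verdict. The sketch below models only that skeleton: run_attempts and the judge callback are hypothetical stand-ins, the implementation and verification work is elided, and the real skill posts receipts and moves the task through the CLI rather than printing.

```shell
MAX_ATTEMPTS=5

# run_attempts JUDGE_CMD
# JUDGE_CMD is called with the attempt number; exit 0 means PASS.
run_attempts() {
  judge=$1
  attempt=1
  while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
    echo "attempt $attempt of max $MAX_ATTEMPTS"
    # Implementation and verification would run here (omitted).
    if "$judge" "$attempt"; then
      echo "PASS -> done"
      return 0
    fi
    attempt=$((attempt + 1))
  done
  # All attempts failed: the task goes back to todo with a diagnostic.
  echo "FAIL -> todo"
  return 0
}
```

Both outcomes return 0 on purpose, mirroring the rule that a failed task is a normal terminal state rather than an error.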

Headless execution

backlog-loop is designed for claude -p (non-interactive print mode). All three terminal states (done, blocked, no-work) exit with code 0; non-zero means a harness error, not a task failure.

claude -p "/backlog-loop <project>" \
  --permission-mode acceptEdits \
  --output-format stream-json \
  --max-turns 80

Drain a whole project by wrapping the invocation in a shell loop:

while [ "$(./backlog --profile default task list --project myproj --status todo --json | jq '.tasks | length')" -gt 0 ]; do
  claude -p "/backlog-loop myproj" --permission-mode acceptEdits --max-turns 80
done

The skill's help invocation prints the full headless guide including the drain script, a cron entry, and a GitHub Actions example.
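Because a task that fails all five attempts is moved back to todo, the todo count may never reach zero, and non-zero exit codes signal a harness error. A bounded variant of the drain loop that handles both cases might look like the sketch below. The function names and the MAX_RUNS argument are hypothetical; the two default commands are copied from the examples above.

```shell
# Defaults call the real tools; redefine either function to test.
count_todo() {
  ./backlog --profile default task list --project "$1" --status todo --json |
    jq '.tasks | length'
}
run_loop() {
  claude -p "/backlog-loop $1" --permission-mode acceptEdits --max-turns 80
}

# drain PROJECT MAX_RUNS
# Stops when no todo tasks remain, after MAX_RUNS runs, or on a
# harness error (non-zero exit from the loop invocation).
drain() {
  project=$1
  max_runs=$2
  runs=0
  while [ "$runs" -lt "$max_runs" ] && [ "$(count_todo "$project")" -gt 0 ]; do
    if ! run_loop "$project"; then
      echo "harness error after $runs runs" >&2
      return 1
    fi
    runs=$((runs + 1))
  done
  echo "runs=$runs"
}
```

A call such as drain myproj 20 caps the session at twenty loop invocations, so a permanently blocked task cannot keep the script running forever.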

When to use

  • Clearing one well-scoped item off the backlog with verification.
  • As the inner step of a drain script that empties a project task by task.
  • As a scheduled cron / CI job that keeps a long backlog moving.

backlog-goal

End-to-end goal pursuit. Maps a stated goal to a backlog project, decomposes it into checkpoints with verifiable acceptance criteria, and executes them via Scout/Judge/Worker sub-agents behind a strict completion gate.

Invocation

/backlog-goal <goal>        # PREP mode (default)
/backlog-goal prep <goal>   # explicit PREP mode
/backlog-goal run <slug>    # execute the prepared board
/backlog-goal status <slug> # read-only board summary
/backlog-goal pause <slug>  # move all doing tasks back to todo, stop
/backlog-goal clear <slug>  # archive all tasks; brief and plan preserved

Two strict modes

PREP asks, classifies, seeds, stops. It never starts work.

  1. Intake compiler (private). Extracts input_shape, domain, audience, authority, proof_type, completion_proof, likely_misfire, what_bad_looks_like, recurring_blind_spots, reference_patterns, anti_patterns, existing_plan_facts.
  2. Diagnostic ladder (interactive). AskUserQuestion in batches across 13 categories. Vague input gets one question per batch with 2 to 4 options plus a recommended default; the agent reflects on each answer before asking the next.
  3. Agent-generated clarifiers. After the canned categories, the agent reads relevant code and asks 3 to 5 of its own questions.
  4. Brief. Written to disk and stored as a versioned backlog doc.
  5. Plan with 3 to 7 checkpoints. Each has quantified acceptance criteria plus runnable verification commands. Two PREP gates apply:
    • Quantification gate. Every criterion reduces to a number, exit code, strict equality, counted artifact, or presence/absence.
    • Verifiability gate. Every criterion has a runnable verifier, or is explicitly accepted as manual review.
  6. Seed the board. Project, role labels (checkpoint, scout, worker, judge, final-audit), checkpoint tasks, per-checkpoint Scout/Worker/Judge tasks, and a mandatory final-audit Judge task.
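The two PREP gates are easiest to picture with the exit-code form of a criterion: a command plus the exit status it must produce. The helper below is a hypothetical sketch of such a verifier, not part of the skill, and covers only the exit-code case among the criterion types listed above.

```shell
# check_criterion EXPECTED_EXIT CMD [ARGS...]
# Runs the verification command and compares its exit status with
# the expected one. Prints PASS or FAIL and returns 0 or 1.
check_criterion() {
  expected=$1
  shift
  actual=0
  "$@" >/dev/null 2>&1 || actual=$?
  if [ "$actual" -eq "$expected" ]; then
    echo "PASS exit=$actual"
  else
    echo "FAIL exit=$actual expected=$expected"
    return 1
  fi
}
```

A criterion like "the test suite passes" would then be expressed as check_criterion 0 followed by the project's test command, which satisfies both gates because it is a number and it is runnable.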

RUN executes the board. The PM (main loop) is the only thing that selects tasks, dispatches sub-agents, and moves the board.

The Continuation invariant is restated at the top of every RUN iteration: do not accept proxy signals as completion. Tests passing ≠ feature working. Files changed ≠ behavior changed. Plan written ≠ implemented. The Judge is the only path to done.

Sub-agent roles

Role | Access | Effort | Returns
Scout | read-only | low | Evidence map, candidates, contradictions
Worker | write inside allowed_files | low–medium | Changed files, verify commands + exit codes
Judge | read-only | high | PASS/FAIL with file:line evidence per criterion
PM (main loop) | board control | | Only role that moves task status

Parallel execution

Default sequential. Parallel is opt-in per dispatch and PM-controlled:

  • Always safe: multiple Scouts; multiple Judges on disjoint inputs.
  • Conditionally safe: multiple Workers when allowed_files are provably disjoint and verify commands don't collide.
  • Never parallel: two checkpoint Judges on the same checkpoint; the final-audit Judge alongside anything.
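The "provably disjoint" condition for parallel Workers can be checked mechanically when allowed_files is a list of exact paths. The sketch below assumes that form (one path per line, no globs or directories) and uses a hypothetical function name; overlapping patterns would need a stricter check than this.

```shell
# files_disjoint LIST_A LIST_B
# Each list is a newline-separated set of exact file paths.
# Succeeds only when no path appears in both lists.
files_disjoint() {
  list_a=$(mktemp) || return 2
  printf '%s\n' "$1" > "$list_a"
  # -F fixed strings, -x whole-line match, -f patterns from file
  overlap=$(printf '%s\n' "$2" | grep -Fxf "$list_a")
  rm -f "$list_a"
  [ -z "$overlap" ]
}
```

If the function fails, the safe default from the rules above applies: dispatch the two Workers sequentially.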

Memory writes

Every meaningful state transition writes a backlog memory entry. Memory is the inter-turn carrier. Receipts are per-task, the activity log is per-event, and memory is the only place cross-task and cross-session knowledge lives.

Tag | Written when
goal,prep-complete | After PREP seeds the board
goal,learning | After any receipt that surfaces durable knowledge: patterns, gotchas, decisions
goal,checkpoint,checkpoint-pass | Every Judge approval of a checkpoint
goal,checkpoint,checkpoint-fail | Every Judge rejection
goal,blocker | Any task blocked needing user input
goal,decision | PM/Judge design call worth preserving
goal,completed | Final audit passes

When to use

  • Multi-hour or multi-day coding objectives.
  • Open-ended improvement goals where you don't yet know the exact slices.
  • Migrations and refactors that need staged verification.
  • Anything where the agent's own claim of "done" wouldn't be trustworthy.

For one well-scoped item, use /backlog-loop instead.

backlog-memory

Project memory in a single skill. It does two jobs and picks the right one automatically.

Invocation

/backlog-memory                # auto: learn on a fresh session, store after work
/backlog-memory <alias>        # same, for a specific project
/backlog-memory learn <alias>  # force read-into-context
/backlog-memory store <alias>  # force persist synthesized memory

Two modes

  • learn — reads tasks, plans, docs, and memory entries into the session. Read-only.
  • store — synthesizes the project's state into memory entries grouped by theme (arch, decisions, open-work, done-work, context), idempotently.

With no explicit mode it auto-selects: learn at the start of a fresh session, store after work has been done this session, and it asks when that's ambiguous.

Common conventions

Actor attribution

Every write must be attributed:

--as ai:claude-opus-4-7
--as ai:claude-sonnet-4-6
--as ai:claude-code
--as human:alice

Defaults to human:$USER when --as is omitted, but agents must pass it explicitly. Filter by actor with backlog task list --actor-kind ai or --actor-name claude-code.
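One way for an agent harness to enforce explicit attribution is a thin wrapper that refuses to run a write without a well-formed actor. In this sketch, backlog_write and BACKLOG_BIN are hypothetical names, and the check accepts only the two kinds shown above (ai and human); the CLI may define others.

```shell
# BACKLOG_BIN is a hypothetical override (defaults to the real binary).
BACKLOG_BIN=${BACKLOG_BIN:-backlog}

# backlog_write ACTOR ARGS...
# Appends --as ACTOR to the command, or fails if ACTOR is malformed.
backlog_write() {
  actor=$1
  shift
  case $actor in
    ai:?*|human:?*) ;;   # kind:name with a non-empty name
    *)
      echo "actor must be ai:<name> or human:<name>" >&2
      return 2
      ;;
  esac
  $BACKLOG_BIN "$@" --as "$actor"
}
```

With this in place, backlog_write ai:claude-code task move TASK-12 --status done always carries the actor, and a forgotten or mistyped actor fails loudly instead of defaulting to human:$USER.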

Profile flag in the backlog repo

Inside the backlog repo itself, always pass --profile default when running the CLI. The reason: go test can leave an e2etest profile registered, and if it becomes the default the agent will silently target the wrong workspace. Outside the repo, normal profile resolution applies.

JSON output

Always pass --json when piping to jq or otherwise programmatically parsing output.

Completion protocol

Across these skills, the completion sequence for a task is:

  1. Do the work.
  2. Post a comment summarizing what was done (Scout/Worker/Judge receipt for the loop/goal skills; plain summary for the base skill).
  3. Move the task to done: backlog task move TASK-N --status done --as "ai:<model>".

backlog-loop and backlog-goal additionally gate this behind a Judge sub-agent. The base backlog skill and backlog-enhance-tasks do not.

Dependency graph

backlog-goal   backlog-loop   backlog-enhance-tasks   backlog-memory
     │              │                  │                     │
     └──────────────┴──────────────────┴─────────────────────┘
                              │
                              ▼
                       backlog  (canonical CLI reference)

Every workflow skill depends on the base backlog skill for command surface, flag semantics, and JSON shapes. Each declares the dependency at the top of its skill file so an agent loading the workflow skill is reminded to read the base skill first.