Your AI agent shouldn't ask “what’s next?”
It should just check $ backlog

Your AI agent forgets between sessions. Backlog remembers. Open Claude Code tomorrow, Cursor next week. Every task, plan, comment, and decision is right where you left it. Ship more in less time. One binary. $0 forever.

  • 12× more loops closed per workday
  • 10× cheaper per task vs long threads
  • 1 file. Zero infra.
localhost:8080 / tasks / TASK-42
api · Tasks · TASK-42

Fix login timeout on /api/session

Status todo
Priority P2
Type bug
Project api
Actor human:alice
Created just now
Description

Sessions are expiring after 5 minutes instead of the configured 24 hours. Reproduces on staging when refresh tokens are issued out of order.

Activity
  • just now · human:alice created the task
One database. Whatever wrote it, your shell or your agent, appears here.
~/code/api
$ backlog task add -p api \
    -t "Fix login timeout on /api/session" \
    --type bug --priority P2 \
    --as human:alice
✓ created TASK-42 → api/backlog.db
attributed to human:alice · indexed in FTS5 · activity logged
$ backlog task show TASK-42 --json
{
  "ref": "TASK-42",
  "title": "Fix login timeout on /api/session",
  "type": "bug",
  "priority": 2,
  "status": "todo",
  "actor": "human:alice",
  "project": "api"
}
human:alice committed TASK-42 to api/backlog.db
Full getting started guide →
Agents connect via MCP
  • Claude Code
  • Cursor
  • Codex
  • OpenCode
You connect via
  • CLI
  • Web UI
  • HTTP API
  • AI Skills
We spent a decade teaching humans to fit into Jira. Now we're going to ask an LLM to do it? An agent's backlog should live in its context, with tooling built for agentic AI usage.
[Figure: one Backlog database at the center, read and written by multiple AI agents through MCP]
One database at the center. Every agent reads and writes the same queue, every write attributed.
The agentic loop

Same workday.

A loop is one unit of work: pick → plan → run → review → ship. Backlog is the persistent state your AI agent is missing. The queue, the plans, the comments, and the activity log all live in the same local database. Tomorrow's session, next week's session, Cursor today and Claude Code in an hour: they all see the same backlog, so nothing needs re-onboarding and no session opens with a "where was I?" prompt. The same developer running fresh sessions across the day closes about 12× the loops of one long-running thread.

And because each task spawns a fresh subagent with only the context it needs, the average session fits in under 50k tokens instead of the 500k a single long-running thread bloats into. Same quality, roughly 10× cheaper per loop.

Without Backlog · one long thread
4 tasks · full day gone · context rebuilt every session
[Timeline over the same workday: TASK-1 09:00 – 11:00, TASK-2 11:30 – 13:30, TASK-3 14:00 – 16:00, TASK-4 16:30 – 17:00. Each task runs plan → run → review → ship, with a "rebuilding context" gap between tasks.]
With Backlog · fresh sessions across the day
shared queue · zero re-onboarding · context survives every session
Why it works

Each agent runs backlog task list --status todo --limit 1, moves the returned task to doing, attaches a plan, ships a PR, posts a completion comment, and marks the task done. The next session picks up immediately. The DB enforces attribution; the activity log records every step.

Why it's safe

The Backlog DB handles concurrent writers from multiple sessions. Each task is owned by exactly one actor at a time. Plans are versioned, so an agent never overwrites another's work. Conflicts surface as commits, not silent drift.
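
A minimal sketch of how single-owner claiming can work on SQLite, using Python's standard sqlite3 module. The tasks table, its columns, and the claim_next helper are hypothetical illustrations, not Backlog's actual schema or API. The idea it shows is that the claim is one conditional UPDATE, so a session that lost the race updates zero rows and moves on.

```python
import sqlite3

# Hypothetical schema for illustration only; not Backlog's real tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT, "
    "status TEXT NOT NULL DEFAULT 'todo', owner TEXT)"
)
conn.executemany(
    "INSERT INTO tasks (title) VALUES (?)",
    [("Fix login timeout",), ("Add rate limiting",)],
)
conn.commit()


def claim_next(db, actor):
    """Claim the oldest todo task for `actor`; return its id, or None."""
    row = db.execute(
        "SELECT id FROM tasks WHERE status = 'todo' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # The WHERE clause re-checks the status, so only one claimant can win.
    cur = db.execute(
        "UPDATE tasks SET status = 'doing', owner = ? "
        "WHERE id = ? AND status = 'todo'",
        (actor, row[0]),
    )
    db.commit()
    return row[0] if cur.rowcount == 1 else None


first = claim_next(conn, "ai:claude-code")
second = claim_next(conn, "ai:cursor")
third = claim_next(conn, "ai:codex")  # queue is empty by now
```

Here the two sessions end up with different tasks and the third gets nothing. A stale claim against an already-owned row matches no rows, which is what keeps each task with exactly one actor.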

Why it scales

The queue is just rows in the Backlog DB. Four sessions is a starting point. Add more for bigger backlogs, or run nightly batches against the same database. The CLI, the MCP server, and the web UI all share the same writes.

How we measured 12×

Baseline: one developer running a single agent session, plan-then-run-then-review per task. Comparison: same developer, same agent model, same workday, four sessions hitting one queue. Throughput is measured by done tasks in the activity log. The 12× figure is the median across week-long runs. Longer queues run higher; tasks needing human review per PR run lower.

Why it's ~10× cheaper

One long agent thread accumulates everything it has seen: old plans, dead branches, files it loaded an hour ago. Token bills scale with that bloat. With backlog, each task spawns a focused subagent with only what the task needs: the task description, the relevant memory entries, and the linked plan. Most loops finish in 30–60k tokens. The same work in a single 500k-token session costs an order of magnitude more for the same output.
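
The arithmetic behind the "order of magnitude" claim can be sketched in a few lines. This assumes cost is simply proportional to token count (one flat price per token), which is a simplification; the 500k and 30–60k figures come from the text above, and cost_ratio is a hypothetical helper.

```python
def cost_ratio(long_thread_tokens: int, per_loop_tokens: int) -> float:
    """How many times more tokens the long thread uses than one focused loop."""
    return long_thread_tokens / per_loop_tokens


LONG_THREAD = 500_000  # single long-running session, figure from the text

for per_loop in (30_000, 50_000, 60_000):
    ratio = cost_ratio(LONG_THREAD, per_loop)
    print(f"{per_loop:>6} tokens per loop -> {ratio:.1f}x")
```

Under that assumption the ratio lands between roughly 8× and 17× across the 30–60k range, with 10× at 50k tokens per loop.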

Specialized subagents per task

Different work needs different context. A SAST finding gets a security-focused subagent with the scanner output and the auth-related docs. A feature task gets a feature-shaped subagent with the PRD memory and the affected handler files. Each subagent is born fresh, knows only what it needs, and is gone when the task is done. One task's context never bleeds into the next, and nothing accumulates the bloat of a long-running thread.

Up and running in two commands.

1. Initialize
Run backlog init in any directory. It creates a local database and a config file, and the workspace is ready.
backlog init
2. Create tasks
Create tasks from the CLI, the web UI, or an AI agent connected over MCP. Every write is tagged with a typed actor at the DB level.
backlog task add -p api -t "Fix login timeout" --type bug --priority P2

Know which agent did what, and when.

Every task, plan, comment, and doc is tagged with a typed actor at the database level. Not a log on top, the row itself. ai:claude-code shipped TASK-7 at 2:18am. ai:cursor is mid-way through TASK-12. ai:semgrep queued eleven new findings on Tuesday. Manage your AI agents like a team you actually run: see what each one is working on, what shipped, what's waiting on you, all from one filter on the same database. Your morning standup writes itself.

Learn about actors →
Attribution on every write
# Human opens a vulnerability task
backlog task add -p api \
  -t "SQL injection in /search" \
  --type vulnerability --priority P1 \
  --as human:alice

# Security scanner imports findings in bulk
backlog import-findings findings.json \
  --as ai:semgrep

# See exactly who did what
backlog task list --actor-kind ai
backlog task list --actor-name alice --type vulnerability
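
To make "typed actor at the database level" concrete, here is a minimal sketch using Python's standard sqlite3 module. The table layout, column names, and add_task helper are assumptions for illustration, not Backlog's real schema. It shows the attribution living on the row itself, with a CHECK constraint refusing any actor kind other than human or ai.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the actor is stored on the row, split into kind and name.
conn.execute(
    """
    CREATE TABLE tasks (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        actor_kind TEXT NOT NULL CHECK (actor_kind IN ('human', 'ai')),
        actor_name TEXT NOT NULL CHECK (length(actor_name) > 0)
    )
    """
)


def add_task(db, title, actor):
    """Insert a task attributed to an actor written as 'kind:name'."""
    kind, _, name = actor.partition(":")
    db.execute(
        "INSERT INTO tasks (title, actor_kind, actor_name) VALUES (?, ?, ?)",
        (title, kind, name),
    )
    db.commit()


add_task(conn, "SQL injection in /search", "human:alice")
add_task(conn, "Reject unsigned JWTs", "ai:semgrep")

try:
    add_task(conn, "Unattributed write", "bot:unknown")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the CHECK constraint refused the untyped actor

# Filtering by actor kind is then an ordinary WHERE clause on the same table.
ai_titles = [
    title
    for (title,) in conn.execute(
        "SELECT title FROM tasks WHERE actor_kind = 'ai' ORDER BY id"
    )
]
```

Because the constraint sits in the schema, no client can write an unattributed row, whichever interface it uses.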

Your AI assistant is already in your backlog.

Connect Claude Code, Cursor, Codex, or OpenCode directly via the MCP stdio server. The AI reads tasks, writes plans, leaves comments, all attributed with its own actor name. One config, full access.

MCP setup guide →
~/.claude.json
{
  "mcpServers": {
    "backlog": {
      "command": "backlog",
      "args": ["mcp", "serve", "--as", "ai:claude-code"],
      "env": {
        "BACKLOG_DB": "/path/to/backlog.db"
      }
    }
  }
}
The AI can create tasks, attach plans, move status, and leave comments, all from within its conversation context.

Plans evolve.
Every version is permanent.

Attach a markdown plan to any task. Every edit creates an immutable version. No history is ever lost. See exactly what changed, who changed it, and why.

How versioned plans work →
VER  TITLE                                 ACTOR           NOTE
v1   Fix unsigned JWT rejection            ai:claude-code
v2   Fix unsigned JWT rejection (revised)  human:alice     added key rotation step
v3   Fix unsigned JWT rejection (final)    human:alice     removed step 4 per review
backlog plan history $PLAN_ID
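
One way to get "every edit is an immutable version" on SQLite is an append-only table guarded by triggers. The sketch below uses Python's standard sqlite3 module; the plan_versions table, the trigger names, and add_version are hypothetical and are not taken from Backlog's source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE plan_versions (
        plan_id TEXT NOT NULL,
        version INTEGER NOT NULL,
        title   TEXT NOT NULL,
        actor   TEXT NOT NULL,
        note    TEXT,
        PRIMARY KEY (plan_id, version)
    );
    -- Reject any attempt to rewrite or delete history.
    CREATE TRIGGER plan_versions_no_update BEFORE UPDATE ON plan_versions
    BEGIN SELECT RAISE(ABORT, 'plan versions are immutable'); END;
    CREATE TRIGGER plan_versions_no_delete BEFORE DELETE ON plan_versions
    BEGIN SELECT RAISE(ABORT, 'plan versions are immutable'); END;
    """
)


def add_version(db, plan_id, title, actor, note=None):
    """Append the next version of a plan and return its number."""
    (latest,) = db.execute(
        "SELECT COALESCE(MAX(version), 0) FROM plan_versions WHERE plan_id = ?",
        (plan_id,),
    ).fetchone()
    db.execute(
        "INSERT INTO plan_versions VALUES (?, ?, ?, ?, ?)",
        (plan_id, latest + 1, title, actor, note),
    )
    db.commit()
    return latest + 1


v1 = add_version(conn, "plan-1", "Fix unsigned JWT rejection", "ai:claude-code")
v2 = add_version(conn, "plan-1", "Fix unsigned JWT rejection (revised)",
                 "human:alice", "added key rotation step")

try:
    conn.execute("UPDATE plan_versions SET title = 'rewritten' WHERE version = 1")
    overwritten = True
except sqlite3.DatabaseError:
    overwritten = False  # the trigger blocked the in-place edit
```

An edit becomes a new row with the next version number, so the history in the table above is simply a SELECT ordered by version.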
backlog web

The same backlog, in your browser.

A polished workspace served straight from the binary. List, board, grid, and timeline views. Inline editing on every property. It needs no build step and no separate front-end repo.

localhost:8080 / tasks
Tasks

Open in api

23 open · 8 in progress · 41 done
Ref     Title                                    Type           Priority  Status  Actor
TASK-1  Reject unsigned JWTs in auth middleware  vulnerability  P1        todo    ai:semgrep
TASK-2  Fix login timeout on /api/session        bug            P2        doing   human:alice
TASK-3  Add rate limiting to /search endpoint    feature        P3        done    ai:claude-code
TASK-4  Migrate session storage to Redis         improvement    P2        todo    human:alice
TASK-5  SQL injection in /users search           vulnerability  P1        doing   ai:semgrep
List · Board · Grid · Timeline
Four views, one keystroke.
Kanban for sprint flow, grid for inline editing, list for triage, timeline for retros. The last view used is remembered per project.
Inline-editable detail page
Click any property to edit.
Status, priority, type, assignee, and due date are all editable inline with a single click. Comments and plans live on the same page.
Archived view + restore
Nothing is ever lost.
Archived tasks slide out of the list with a soft fade and can be restored in one click. The activity log records every archive.
All in one DB

Six surfaces. One file. Every write attributed.

Tasks are the core, but the agent loop needs more: plans the agent writes, docs it reads, memory it remembers, attachments it analyses, and an activity feed humans audit.

Typed work items the agent can pick from.

Tasks have a status, type, priority, assignee, and a TASK-N ref humans actually type. The CLI, MCP, and web UI all hit the same row.

backlog task list --status todo --priority P2 --limit 1 --json
TASK-1  Reject unsigned JWTs in auth middleware  P1  todo
TASK-2  Fix login timeout on /api/session        P2  doing
TASK-3  Add rate limiting to /search endpoint    P3  done
TASK-4  Migrate session storage to Redis         P2  todo

Built for solo devs shipping with AI.

Same workflow whether your AI is writing code with you, triaging security findings, or running unattended overnight.

Solo dev + AI agent
"You and Claude, on the same backlog."

You create tasks. Your AI agent drafts plans, leaves comments, imports findings, all tagged with its own actor name. At the end of the sprint you can see exactly what was done and by whom, so you never have to guess what the AI changed.

Filtering by actor
# What has the AI worked on?
backlog task list --actor-kind ai

# What did alice close this week?
backlog task list --actor-name alice --status done

# Open bugs attributed to an AI actor
backlog task list --type bug --actor-kind ai --status todo
Security & compliance
"Every finding becomes a tracked task."

Point your scanner at the findings format. Import in bulk with a single command. Every vulnerability becomes a TASK-N with source, external ref, and a pre-attached remediation plan, attributed to the tool that found it.

findings.json
{
  "version": 1,
  "project": "app",
  "items": [
    {
      "title": "SQL injection in /search",
      "type": "vulnerability",
      "priority": "P1",
      "source": "semgrep",
      "plans": [{
        "title": "Fix",
        "body": "Use parameterized queries."
      }]
    }
  ]
}
import
backlog import-findings findings.json --as ai:semgrep
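
A scanner rarely emits this shape directly, so a small adapter usually sits in between. The sketch below builds a findings document with the same fields as the example above. The input format (dicts with title, severity, and fix keys), the severity-to-priority mapping, and the to_findings helper are all assumptions for illustration; adapt them to what your scanner actually produces.

```python
import json


def to_findings(project, source, results):
    """Map scanner results to the findings document shape shown above.

    `results` is a hypothetical list of dicts with 'title', 'severity',
    and 'fix' keys; it is not a real scanner's output format.
    """
    # Assumed severity-to-priority mapping; adjust to your own triage rules.
    priority = {"high": "P1", "medium": "P2", "low": "P3"}
    return {
        "version": 1,
        "project": project,
        "items": [
            {
                "title": r["title"],
                "type": "vulnerability",
                "priority": priority[r["severity"]],
                "source": source,
                "plans": [{"title": "Fix", "body": r["fix"]}],
            }
            for r in results
        ],
    }


doc = to_findings(
    "app",
    "semgrep",
    [{"title": "SQL injection in /search", "severity": "high",
      "fix": "Use parameterized queries."}],
)
payload = json.dumps(doc, indent=2)  # text to save as findings.json
```

Writing payload to findings.json gives a file in the format the import command above expects.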
Solo dev shipping with AI
"Ship more in a week than you did last month."

Your AI agent runs backlog task list --status todo --limit 1, picks the next task, plans it, ships it, posts a completion comment, and grabs the next one. You review at the end of the day. The DB is the contract: no instructions to repeat, no context to rebuild between sessions. See the cross-session walkthrough →

One-shot agent loop
# Pick the next task; stop if the queue is empty (run this as a script)
ID=$(backlog task list --status todo --limit 1 --json | jq -r '.tasks[0].id // empty')
[ -n "$ID" ] || exit 0

# Claim it before starting work
backlog task move "$ID" --status doing --as ai:claude-code

# Agent does the work, then records the plan and the result
backlog plan add --task "$ID" --title "Implementation" --from-file plan.md
backlog comment add --task "$ID" "Shipped in PR #42" --as ai:claude-code
backlog task move "$ID" --status done --as ai:claude-code
Everything ships in the binary

Four capability areas. No plugins, no tiers, no paid upgrades.

01

Queue + identity

A typed work queue with first-class actor attribution at the row level.

  • TASK-N refs alongside ULIDs.
  • Actor human:name or ai:name on every write.
  • FTS5 search with prefix and boolean operators.
  • Labels per project, with color tokens.
  • Archive + restore with a soft-delete trail.
02

Agent loop

The pieces an agent needs to plan, ship, and pick up the next task.

  • Versioned plans. Every edit is an immutable row.
  • Project docs. Markdown the agent loads as context.
  • Memory. Tagged decisions that survive sessions.
  • Comments. Completion protocol attestation.
  • Activity log. Append-only audit trail.
03

Interfaces

CLI, MCP, web UI, and skills all sit on the same service layer.

  • CLI. 14 verb groups, --json on every command.
  • MCP stdio. Claude Code, Cursor, Codex, OpenCode.
  • Web UI. List, board, grid, and timeline views.
  • HTTP API. For scripts and custom clients.
  • Skills. Five embedded: backlog, loop, goal, enhance-tasks, memory.
04

Ops

The boring parts that make a v1 actually trustworthy.

  • Doctor. Integrity check, atomic backup, and a project linter for stale or weakly-closed work.
  • Health reports. activity analyze for cycle time, latency, and human-vs-AI close ratio.
  • Import. Bulk findings JSON for scanners.
  • Export. JSON, CSV, Markdown.
  • Profiles. Named workspaces in TOML.
  • Migrations. Schema updates roll forward automatically when you upgrade the binary. Nothing to run, nothing to maintain.
Compare tools

Backlog wins where the agent loop matters.

It loses on real-time human collaboration and integration marketplaces. Different tools, different problems.

Feature                             Backlog                Linear           Jira             GitHub Issues
Made for agentic AI and humans      yes, by design         humans only      humans only      humans only
Cost                                $0, MIT                per seat         per seat         per repo
AI agent can read & write directly  native (MCP)           via API + glue   via API + glue   via API + glue
Native agentic AI memory            tagged, cross-session  no               no               no
Typed actor on every row            yes, schema column     no               no               no
Immutable plan versions             yes, every edit        edit overwrites  edit overwrites  edit overwrites
Workspace = file in your repo       single Backlog DB      cloud-only       cloud-only       tied to repo
Self-hosted / offline               single binary          no               on-prem tier     enterprise tier
Built-in integrations marketplace   no                     extensive        extensive        extensive
FAQ

Common questions.

What happens when I close Claude Code and open Cursor?
Backlog lives in a local database, the same one every MCP client reads. Close Claude Code, open Cursor: the next session sees exactly what the last one left behind. Point both clients' MCP config at the same BACKLOG_DB and they're reading from the same brain.
Does this cost anything?
No. Backlog is MIT-licensed, runs locally, and ships as a single binary. No seats, no per-user pricing, no cloud bill. The only cost is the disk space the database takes (a few MB for thousands of tasks).
Why not just use GitHub Issues?
GitHub Issues is great for tracking work humans care about. It's not designed for an AI agent to read on every session: no atomic task claim, no typed actor on every row, no MCP integration, no immutable plan versioning, and no native memory the agent can read on pickup and write on completion. Backlog is purpose-built for agentic loops: the queue, the plans, the tagged memory entries, the activity log all live in one local file the agent reads and writes natively. Use Issues for public-facing work; use Backlog for the AI's working queue and its persistent memory across sessions.
How does a team share one backlog?
Run backlog web on one shared host (a workstation, an internal server, a Tailscale node) and point teammates and their agents at that URL. The HTTP API and MCP server expose the same data the CLI does. For asymmetric handoffs (e.g. moving findings between branches) use backlog import <other.db> to merge two workspaces with full plan history preserved.
Will it hold up at scale?
The local DB handles tens of thousands of rows on a laptop and millions on a server. The bottleneck for a backlog is human attention, not row count. SQLite allows many concurrent readers with one writer at a time, so four parallel agent sessions only contend for the database briefly, during commits.
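
The one-writer rule is easy to observe with Python's standard sqlite3 module. This is a generic SQLite sketch in its default journal mode, not a description of how Backlog configures its database; the table and the two connections a and b are hypothetical stand-ins for two agent sessions.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None puts the connections in autocommit mode, so the
# transactions below are controlled explicitly with BEGIN/COMMIT.
# timeout=0 makes a blocked writer fail immediately instead of waiting.
a = sqlite3.connect(path, timeout=0, isolation_level=None)
b = sqlite3.connect(path, timeout=0, isolation_level=None)
a.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT)")
a.execute("INSERT INTO tasks (status) VALUES ('todo')")

a.execute("BEGIN IMMEDIATE")  # session A takes the write lock
a.execute("UPDATE tasks SET status = 'doing' WHERE id = 1")

# Session B can still read while A is mid-write; it sees the last commit.
seen_by_b = b.execute("SELECT status FROM tasks WHERE id = 1").fetchall()[0][0]

# B cannot start a second write until A commits.
try:
    b.execute("BEGIN IMMEDIATE")
    blocked = False
except sqlite3.OperationalError:
    blocked = True

a.execute("COMMIT")           # the lock is held only until here
b.execute("BEGIN IMMEDIATE")  # now B gets its turn
b.execute("UPDATE tasks SET status = 'done' WHERE id = 1")
b.execute("COMMIT")

final_status = a.execute("SELECT status FROM tasks WHERE id = 1").fetchall()[0][0]
```

In practice a client would set a non-zero timeout so a second writer waits for the commit rather than failing, which is why short write transactions keep contention brief.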
Why not just use Linear or Jira's API from my agent?
Linear and Jira were designed for humans clicking buttons. Backlog is designed for AI agents reading and writing the database directly. That difference shows up everywhere: an MCP server ships with the binary (no API client, no auth tokens, no rate limits); typed human: / ai: actors are enforced as columns on every row (so every change is auditable by who made it); immutable plan versions preserve the agent's reasoning forever; atomic task claims prevent two agents from picking the same row. Linear and Jira can be wrapped to look agent-friendly. Backlog was built for the agentic loop from the schema up.
What if I outgrow it or want centralised infra?
Two clean exits. Run backlog export --format json for a full dump, or read rows directly via the HTTP API. The schema is small and documented: tasks, plans, comments, memory, actors, activity. Spinning up a Postgres-backed clone with the same shape is a weekend, not a migration. You own your data; Backlog is just the binary holding it today.
Is this free? Will it stay free?
MIT-licensed, self-hosted, no telemetry. There's no cloud SKU to upsell into, no separate "Enterprise" version. If a hosted offering ever ships it will sit alongside the OSS, not replace it.

Two commands and you're running.

Single static binary. No CGO. No runtime dependencies. Installing with go install requires Go 1.22+.

Go install (recommended)
go install github.com/mazen160/backlog/cmd/backlog@latest
macOS (Apple Silicon)
curl -L https://github.com/mazen160/backlog/releases/latest/download/backlog_darwin_arm64.tar.gz | tar xz
sudo mv backlog /usr/local/bin/
Linux (amd64)
curl -L https://github.com/mazen160/backlog/releases/latest/download/backlog_linux_amd64.tar.gz | tar xz
sudo mv backlog /usr/local/bin/

Then run backlog init in any directory to create your first workspace.

Full getting started guide →