Penling
01

Bring your own agent.

Penling connects over MCP — no wrapper, no proprietary SDK. The agent you already use just connects.

One Penling server. Any MCP-compatible runtime.

Drop the server config into Claude Code, Cursor, VS Code, or your own agent framework. Penling exposes four tools your agent uses to claim the build, stream progress, resolve checks, and ask a human when it needs to.

  • No lock-in. One Penling server works with any MCP-compatible agent — bring the tools your team already runs.
  • Builds run in your runtime. Your machine, your cloud. Penling is the coordination layer, not the executor.
  • Clarifications pause the build. When the agent needs a human decision, the build waits. Nothing proceeds on a guess.
mcp_config.json · via MCP
{
  "mcpServers": {
    "penling": {
      "command": "npx",
      "args": [
        "@penling/mcp-server",
        "--build", "19"
      ],
      "env": {
        "PENLING_TOKEN": "${PENLING_TOKEN}"
      }
    }
  }
}

// Four tools your agent gets:
// claim_build        - lock the build before writing code
// report_event       - stream progress back to Penling
// resolve_check      - close a spec check with evidence
// open_clarification - pause and ask a human a question
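
As a rough sketch of what those four tools might look like from the agent side, the TypeScript below builds MCP `tools/call` requests for each one. The JSON-RPC envelope (`method: "tools/call"`, `params.name`, `params.arguments`) follows the MCP specification; the argument names (`buildId`, `kind`, `message`, `checkId`, `evidence`, `question`) are assumptions for illustration, not Penling's documented schema.

```typescript
// Hypothetical argument shapes: Penling's real tool schemas may differ.
type PenlingToolArgs = {
  claim_build: { buildId: string };
  report_event: { buildId: string; kind: string; message: string };
  resolve_check: { buildId: string; checkId: string; evidence: string };
  open_clarification: { buildId: string; question: string };
};

type ToolName = keyof PenlingToolArgs;

// JSON-RPC 2.0 envelope that MCP uses for tool invocations.
interface ToolCallRequest<N extends ToolName> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: N; arguments: PenlingToolArgs[N] };
}

let nextId = 1;

// Build a typed tools/call request; the compiler rejects mismatched arguments.
function toolCall<N extends ToolName>(name: N, args: PenlingToolArgs[N]): ToolCallRequest<N> {
  return { jsonrpc: "2.0", id: nextId++, method: "tools/call", params: { name, arguments: args } };
}

// One plausible order of calls for build 19 (the build id from the config above).
const buildId = "19";
const calls = [
  toolCall("claim_build", { buildId }),
  toolCall("report_event", { buildId, kind: "commit", message: "added rate-limit.ts" }),
  toolCall("open_clarification", { buildId, question: "What rate limit should apply per API key?" }),
  toolCall("resolve_check", { buildId, checkId: "01", evidence: "rate-limit.test.ts:18" }),
];
```

In practice an MCP client library in the agent runtime would send these requests; the point here is only that each tool is an ordinary named call with structured arguments.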
02

Human control, by design.

Plans are published, not auto-run. A human shapes and approves every plan before the agent touches code — and that decision is on the record.

The plan is where intent becomes mandate.

AI drafts an implementation plan from the spec. You edit it, approve it, and publish it. The agent can't start without a published plan — and the version it runs is locked to the build.

  • Published, not auto-run. You approve before the agent touches code — no build starts without a human sign-off.
  • Human edits are tracked. Every change you make to a plan is attributed and visible in the plan history.
  • Versioned. A focus area can carry several plans over its lifetime. Only the published one runs.
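
The publish gate described above can be sketched as a small state model. This is a minimal illustration, not Penling's implementation: the `Plan` and `FocusArea` shapes, the `superseded` status, and the function names are all assumptions.

```typescript
// Hypothetical model of the plan lifecycle; names are illustrative only.
type PlanStatus = "draft" | "published" | "superseded";

interface Plan {
  version: number;
  status: PlanStatus;
  steps: string[];
  approvedBy?: string; // the human who signed off
}

interface FocusArea {
  id: string;
  plans: Plan[];
}

// Publishing requires a named human approver and retires any earlier published plan.
function publishPlan(area: FocusArea, version: number, approver: string): Plan {
  const plan = area.plans.find((p) => p.version === version);
  if (!plan) throw new Error(`Plan v${version} not found on ${area.id}`);
  if (plan.status !== "draft") throw new Error(`Plan v${version} is not a draft`);
  for (const p of area.plans) {
    if (p.status === "published") p.status = "superseded";
  }
  plan.status = "published";
  plan.approvedBy = approver;
  return plan;
}

// A build can only start from the published plan, and records the version it ran.
function startBuild(area: FocusArea): { focusAreaId: string; planVersion: number } {
  const published = area.plans.find((p) => p.status === "published");
  if (!published) throw new Error(`${area.id} has no published plan; build cannot start`);
  return { focusAreaId: area.id, planVersion: published.version };
}

const area: FocusArea = {
  id: "FA-014",
  plans: [{ version: 3, status: "draft", steps: ["Add <button>Add ToDo</button> with type=\"submit\""] }],
};
// Calling startBuild(area) at this point would throw: nothing is published yet.
publishPlan(area, 3, "Paul");
const build = startBuild(area); // locked to plan version 3
```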
FA-014 · Plans · Plan v3 · draft
Draft · Plan v3 · GENERATED 8 MIN AGO · 2 HUMAN EDITS

Implementation plan

01 · Insert <!DOCTYPE html> with <head>, <body>, and <h1>Paul's Todo App</h1> · AI
02 · Add styled text input with maxlength=25 and placeholder attribute · AI + Paul · Edited: set maxlength cap
03 · Add <button>Add ToDo</button> with type="submit" · AI
04 · Add aria-label attributes to pass accessibility checks · AI
05 · Run tests/page.unit.test.js and confirm all assertions pass · AI + Paul · Edited: added test gate
06 · Confirm all 4 checks resolve and close the build · AI
Publishing notifies your MCP agents
03

Structured specs, not tickets.

From a messy brief to a four-part spec a human confirms and an agent can act on — without losing anything along the way.

Paste anything. Penling extracts the signal.

Drop in a brief, a Slack thread, a transcript, or a rough idea. Penling reads the whole thing and surfaces outcomes, constraints, out-of-scope items, and open questions as confirmable chips — nothing is assumed.

  • Nothing assumed. Every extracted signal is confirmable — edit or dismiss before it becomes spec input.
  • Open questions surface early. Unresolved unknowns are flagged before they reach the agent.
  • Out-of-scope is first-class. The agent knows where the boundaries are before it touches a file.
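
One way to picture confirmable signals is as records that carry an explicit review status. The sketch below is an assumption-laden illustration: the field names, the status values, and the helper functions are hypothetical, and the sample data borrows from the to-do brief used on this page.

```typescript
// Illustrative data model; field names are assumptions, not Penling's schema.
type SignalKind = "outcome" | "constraint" | "out_of_scope" | "open_question";
type SignalStatus = "suggested" | "confirmed" | "dismissed";

interface Signal {
  kind: SignalKind;
  text: string;
  status: SignalStatus;
}

// Only signals a human has confirmed become spec input.
function specInput(signals: Signal[]): Signal[] {
  return signals.filter((s) => s.status === "confirmed");
}

// Open questions that nobody has dismissed are flagged before the agent starts.
function unresolvedQuestions(signals: Signal[]): Signal[] {
  return signals.filter((s) => s.kind === "open_question" && s.status !== "dismissed");
}

const signals: Signal[] = [
  { kind: "outcome", text: "Add, complete, and delete tasks", status: "confirmed" },
  { kind: "outcome", text: "Data persists between sessions", status: "confirmed" },
  { kind: "constraint", text: "No accounts, no sync, no backend", status: "suggested" },
  { kind: "open_question", text: "What offline storage strategy?", status: "suggested" },
];

const confirmed = specInput(signals);          // the two confirmed outcomes
const questions = unresolvedQuestions(signals); // the storage question
```

The unconfirmed constraint stays out of `confirmed` until a human accepts it, which is the "nothing assumed" behaviour in miniature.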
Penling · New initiative · Analyze brief
Analyzing · PASTED 612 WORDS · 4 SIGNALS FOUND

Build a simple to-do app that works offline. Users should be able to add tasks, mark them complete, and delete them. Data should persist between sessions. Keep it minimal - no accounts, no sync, no backend...

Extracted signals · confirm to keep · AI-suggested
Outcome: Add, complete, and delete tasks
Outcome: Data persists between sessions
Constraint: No accounts, no sync, no backend
Out of scope: Real-time collaboration
Open question: What offline storage strategy - localStorage, IndexedDB, or Service Worker?
NOTHING IS SAVED UNTIL YOU CONFIRM · EDIT ANY SIGNAL INLINE

Every goal becomes a four-part spec.

Penling proposes focus areas from your signals — you accept, edit, or replace them. Each becomes a spec with a definition, expected results, conditions, and explicit boundaries.

  • Specs, not tickets. Each focus area has a definition, expected results, conditions, and explicit boundaries.
  • Results become checks. Expected results flow downstream as the checks the agent must satisfy.
  • Boundaries are explicit. The agent knows the edges of each focus area before writing a single line.
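
To make "results become checks" concrete, here is a minimal sketch of a four-part spec and a function that derives one check per expected result. The interface and function names are hypothetical. The expected results are taken from the rate-limit example on the build canvas; the definition, conditions, and boundaries are placeholders invented for illustration.

```typescript
// Hypothetical shape of a four-part spec: definition, results, conditions, boundaries.
interface FocusAreaSpec {
  id: string;
  definition: string;
  expectedResults: string[];
  conditions: string[];
  boundaries: string[];
}

interface Check {
  number: string; // "01", "02", ...
  focusAreaId: string;
  description: string;
  status: "pending" | "passed";
}

// Each expected result flows downstream as one check the agent must satisfy.
function checksFromSpec(spec: FocusAreaSpec): Check[] {
  return spec.expectedResults.map((result, i): Check => ({
    number: (i + 1 < 10 ? "0" : "") + (i + 1),
    focusAreaId: spec.id,
    description: result,
    status: "pending",
  }));
}

const spec: FocusAreaSpec = {
  id: "API-7",
  definition: "Per-key rate limiting for the API", // placeholder wording
  expectedResults: [
    "Returns 429 when limit exceeded",
    "Retry-After header is set",
    "Limits reset on the minute boundary",
    "Authenticated keys have higher limits",
  ],
  conditions: ["100 req/min per API key"],
  boundaries: ["No changes outside the rate-limit middleware"], // placeholder wording
};

const checks = checksFromSpec(spec);
```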
Penling · Todo App · Suggested focus areas
Suggested · pick what to spec · 3 suggested

Penling broke the goal into focus areas.

Task model & persistence · RWTA-1 · 4-part spec
Add / complete / delete UI · RWTA-2 · 4-part spec
Empty & loading states · RWTA-3 · 4-part spec
Add · Write your own focus area
EACH BECOMES A 4-PART SPEC: DEFINITION · RESULTS · CONDITIONS · BOUNDARIES
The build canvas · live

Watch the build happen. The agent works, you steer.

One screen, three live columns — files, events, and checks filling in with evidence.

Penling · API-7 · Build #1
Building · live
Working tree
src/
api/
rate-limit.ts · +64
middleware/auth.ts · +18
tests/
rate-limit.test.ts · new
Event stream
Sarah resolved clarification CL-07: rate limit threshold · just now
Penling AI committed rate-limit.ts: per-key rate limiter · 12s ago
Penling AI running spec check 03 · now
Clarification: What rate limit should apply per API key?
Sarah → 100 req/min, return 429 with Retry-After.
Spec checks · 2 / 4
01 · Returns 429 when limit exceeded · rate-limit.test.ts:18
02 · Retry-After header is set · rate-limit.test.ts:34
03 · Limits reset on the minute boundary · running test…
04 · Authenticated keys have higher limits · awaiting build
EVIDENCE ATTACHES AS EACH CHECK PASSES
Real-time

Every commit, test run, and decision streams over the Penling API.

Collaborative

Human and AI on the same canvas — answer a clarification and the build resumes.

Traceable

Every check is backed by evidence — file, test run, or human resolution.
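
The pause-and-resume behaviour and the evidence requirement can be sketched as a small state model. Everything here is an assumption made for illustration: the `BuildState` shape, the function names, and the rule that checks cannot resolve while the build is paused.

```typescript
// Evidence kinds named in the text: a file, a test run, or a human resolution.
type Evidence =
  | { type: "file"; path: string }
  | { type: "test_run"; location: string }
  | { type: "human"; resolvedBy: string; note: string };

interface BuildState {
  status: "building" | "paused" | "done";
  openClarifications: string[];             // clarification ids awaiting a human
  checks: Record<string, Evidence | null>;  // check number -> evidence once passed
}

// Opening a clarification pauses the build.
function openClarification(b: BuildState, id: string): BuildState {
  return { ...b, status: "paused", openClarifications: [...b.openClarifications, id] };
}

// Answering the last open clarification resumes the build.
function answerClarification(b: BuildState, id: string): BuildState {
  const remaining = b.openClarifications.filter((c) => c !== id);
  return { ...b, openClarifications: remaining, status: remaining.length === 0 ? "building" : "paused" };
}

// A check only resolves with evidence attached, and never while the build is paused.
function resolveCheck(b: BuildState, check: string, evidence: Evidence): BuildState {
  if (b.status === "paused") throw new Error("Build is paused on a clarification");
  const checks: Record<string, Evidence | null> = { ...b.checks, [check]: evidence };
  const allPassed = Object.keys(checks).every((k) => checks[k] !== null);
  return { ...b, checks, status: allPassed ? "done" : "building" };
}

function progress(b: BuildState): string {
  const keys = Object.keys(b.checks);
  const passed = keys.filter((k) => b.checks[k] !== null).length;
  return `${passed} / ${keys.length}`;
}

let canvas: BuildState = {
  status: "building",
  openClarifications: [],
  checks: { "01": null, "02": null, "03": null, "04": null },
};
canvas = openClarification(canvas, "CL-07");   // build pauses
canvas = answerClarification(canvas, "CL-07"); // build resumes
canvas = resolveCheck(canvas, "01", { type: "test_run", location: "rate-limit.test.ts:18" });
canvas = resolveCheck(canvas, "02", { type: "test_run", location: "rate-limit.test.ts:34" });
const shown = progress(canvas); // "2 / 4"
```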

04

Every decision, on the record.

Every initiative, every action, every build — in one place. What needs your attention is always one glance away.

The dashboard that actually tells you what's happening.

Every initiative, its readiness, what's building, paused, or shipped — with a full exportable history. “Needs you” is always one glance away.

  • Readiness you can trust. Status reflects the spec, not guesswork.
  • “Needs you” is explicit. Open clarifications surface to the top so nothing waits silently.
  • Full history. Every actor and every decision is recorded, and the full log exports as CSV or JSON.
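
As an illustration of what an exportable history could look like, the sketch below serialises a list of records to CSV and JSON. The record fields and the timestamps are invented for the example; the CSV quoting follows the usual RFC 4180 convention of doubling embedded quotes.

```typescript
// Hypothetical record shape for one entry in the history.
interface HistoryRecord {
  timestamp: string; // ISO 8601; the values below are made up
  actor: string;     // human name or agent id
  action: string;
  detail: string;
}

// Quote a CSV field when it contains a comma, quote, or line break.
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

function toCsv(records: HistoryRecord[]): string {
  const header = ["timestamp", "actor", "action", "detail"];
  const rows = records.map((r) =>
    [r.timestamp, r.actor, r.action, r.detail].map(csvField).join(",")
  );
  return [header.join(","), ...rows].join("\r\n");
}

function toJson(records: HistoryRecord[]): string {
  return JSON.stringify(records, null, 2);
}

const history: HistoryRecord[] = [
  {
    timestamp: "2025-01-01T10:00:00Z",
    actor: "Sarah",
    action: "resolved clarification CL-07",
    detail: "100 req/min, return 429 with Retry-After.",
  },
  {
    timestamp: "2025-01-01T10:00:12Z",
    actor: "Penling AI",
    action: "committed rate-limit.ts",
    detail: "per-key rate limiter",
  },
];

const csv = toCsv(history);
const json = toJson(history);
```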
Penling · My Penling
Dashboard · this week
3 initiatives active · 11 checks verified · 2 need you
Real working ToDo App · Building · 3 actions · 1 live
Billing portal v2 · Paused · clarification open
Search re-index job · Shipped · PR #218 merged
At a glance

Every capability has an actor, an artifact, and a record.

Nothing happens off the record. Here's who does what, and what it leaves behind.

Capability              Actor                        Artifact produced   Recorded
Signal extraction       AI suggests, human confirms  structured signals  yes
Focus area suggestions  AI suggests, human picks     four-part specs     yes
Plan generation         AI drafts, human edits       published plan      yes
MCP build               agent builds, human steers   commits + PR        yes
Clarifications          AI asks, human replies       Q&A log             yes
Spec checks             AI verifies                  check + evidence    yes

See the whole arc, end to end.

Bring a brief and an agent. Watch Penling turn it into a spec, a plan, and a PR traced to its reasoning — every decision on the record.

Start free trial · See how the workflow fits together