Multiple agents that replace your toolchain

A tour of the specialist agents that plan, design, build, test, and ship software — and how they hand off to each other inside one graph.

Everyone asks: "why lots of agents and not one?" The short answer is specialisation beats generalisation for handoff-heavy work. The long answer is this post.

The cast

Each agent owns one slice of the delivery pipeline. Here they are, in the order they typically fire.

1. Requirements

Takes a one-line prompt, draws out the hidden questions, produces an approvable structured spec. Its output is a document — entities, pages, integrations, acceptance criteria — not a chat transcript.

"We need a bulk-delete workflow for admins." → 30 seconds later → Structured spec with 2 entities, 2 pages, 1 integration, 4 clarifying questions.

2. Tasks

Turns the approved spec into a typed task graph. Each task is routed to the specialist that handles it (schema, page, integration) and carries a depth (quick / balanced / deep) so a small fix doesn't get five rounds of clarifying.
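
To make "typed task graph" concrete, here is a minimal sketch of what one node might look like. The routing table, the cap on clarifying rounds, and every name below are illustrative assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class Depth(Enum):
    QUICK = "quick"
    BALANCED = "balanced"
    DEEP = "deep"


# Hypothetical routing table: task kind -> specialist agent.
ROUTES = {"schema": "schema", "page": "pages", "integration": "integration"}

# Hypothetical cap on clarifying rounds per depth (the numbers are made up).
MAX_CLARIFY_ROUNDS = {Depth.QUICK: 0, Depth.BALANCED: 2, Depth.DEEP: 5}


@dataclass(frozen=True)
class Task:
    id: str
    kind: str                          # "schema", "page" or "integration"
    depth: Depth = Depth.BALANCED
    depends_on: tuple[str, ...] = ()   # ids of tasks that must finish first

    @property
    def agent(self) -> str:
        """Specialist that handles this task; raises KeyError for unknown kinds."""
        return ROUTES[self.kind]

    @property
    def clarify_rounds(self) -> int:
        """How many clarifying rounds this task is allowed."""
        return MAX_CLARIFY_ROUNDS[self.depth]


tasks = [
    Task("orders-table", "schema", Depth.DEEP),
    Task("orders-page", "page", depends_on=("orders-table",)),
    Task("fix-label", "page", Depth.QUICK),
]
```

In this sketch the depth lives on the task rather than on the agent, so the same page agent can treat `fix-label` as a quick fix and `orders-page` as a balanced build.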

3. Design

Produces an opinionated design system as tokens — palette, typography, motion. Every downstream page agent consumes these tokens. No separate designer handoff.

4. Schema

Drafts PostgreSQL tables with foreign keys and indexes. Emits DDL the Migrations agent ships — idempotent, transactional, with rollback attached.

5. Migrations

Ships the DDL in the mode each environment picked. Managed applies + records; export-only packages a signed bundle for your CI; reconcile reads your existing ledger and highlights drift. Same ledger, different trust contracts.
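
As a rough illustration of what reconcile mode has to do, the sketch below compares a ledger of recorded checksums against the migrations on disk. The ledger shape (migration name mapped to a SHA-256 of the DDL) and the three drift buckets are assumptions made for this example, not the platform's real format.

```python
import hashlib


def checksum(ddl: str) -> str:
    """Stable fingerprint of a migration's DDL text."""
    return hashlib.sha256(ddl.strip().encode("utf-8")).hexdigest()


def find_drift(ledger: dict[str, str], migrations: dict[str, str]) -> dict[str, list[str]]:
    """Compare recorded checksums (ledger) with the DDL on disk (migrations)."""
    return {
        # Recorded as applied, but the DDL text has changed since.
        "modified": sorted(
            name for name, ddl in migrations.items()
            if name in ledger and ledger[name] != checksum(ddl)
        ),
        # Present locally, never recorded in the ledger.
        "unapplied": sorted(name for name in migrations if name not in ledger),
        # Recorded in the ledger, but no matching migration exists locally.
        "missing": sorted(name for name in ledger if name not in migrations),
    }


migrations = {
    "001_create_orders": "CREATE TABLE IF NOT EXISTS orders (id BIGINT PRIMARY KEY);",
    "002_add_status": "ALTER TABLE orders ADD COLUMN IF NOT EXISTS status TEXT;",
}
ledger = {
    "001_create_orders": checksum(migrations["001_create_orders"]),
    "000_bootstrap": "deadbeef",  # recorded long ago, file no longer present
}
drift = find_drift(ledger, migrations)
```

The `IF NOT EXISTS` guards in the sample DDL are one common way to keep statements idempotent in PostgreSQL; the drift check itself only reads, it never applies anything.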

6. Pages

Generates Next.js pages from a manifest — component, layout, hooks, and the API routes each page depends on. Composed from shadcn/ui primitives. Pages run in parallel once the schema is stable.

7. Integration

Turns a required integration — Stripe, HubSpot, an internal API — into a typed client with aliased secrets. Other agents + pages call the typed client; secrets never enter prompts.
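
One way to picture "aliased secrets" is a handle that carries only a name and resolves the real value at call time. The sketch below assumes an environment-variable convention (`SECRET_` plus the upper-cased alias) and a hypothetical `BillingClient`; neither comes from the platform or from any vendor SDK.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True, repr=False)
class SecretAlias:
    """A name that stands in for a secret; safe to show in prompts and logs."""
    alias: str

    def resolve(self) -> str:
        # Hypothetical convention: alias "billing_key" -> env var "SECRET_BILLING_KEY".
        return os.environ["SECRET_" + self.alias.upper()]

    def __repr__(self) -> str:
        # Printing the handle never reveals the underlying value.
        return f"<secret:{self.alias}>"


@dataclass(frozen=True)
class Refund:
    charge_id: str
    amount_cents: int


class BillingClient:
    """Typed wrapper (hypothetical): callers pass Refund objects, never raw keys."""

    def __init__(self, api_key: SecretAlias):
        self._api_key = api_key

    def build_refund_request(self, refund: Refund) -> dict:
        # The real value is resolved only here, at call time.
        return {
            "headers": {"Authorization": f"Bearer {self._api_key.resolve()}"},
            "body": {"charge": refund.charge_id, "amount": refund.amount_cents},
        }
```

Anything that serialises the client's configuration into a prompt sees `<secret:billing_key>`, while the outgoing request still carries the real value.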

8. Debug

Reads runtime stack traces and static compiler errors, proposes targeted patches scoped to the minimum file surface. No regenerating a page for a null-check.

9. Tests

Reads the spec and the pages generated from it, then emits unit, integration, and e2e tests that assert the described behaviour. The tests are runnable in CI before release, and a dry run catches trivial green-on-empty tests.

10. Security

Scans every generated page + API route + integration for OWASP top-10 risks, auth-bypass patterns, and accidental PII exposure. Findings are severity-tagged; releases can be gated on zero high-severity findings.
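
Gating a release on severity-tagged findings comes down to a small predicate. The severity scale, the `Finding` fields, and the sample rule names below are illustrative assumptions.

```python
from dataclasses import dataclass

# Assumed severity scale, least to most severe.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


@dataclass(frozen=True)
class Finding:
    rule: str       # e.g. an OWASP category or an internal rule id
    severity: str   # one of SEVERITY_ORDER
    location: str   # file or route the finding points at


def release_allowed(findings: list[Finding], block_at: str = "high") -> bool:
    """True when no finding is at or above the blocking severity."""
    threshold = SEVERITY_ORDER.index(block_at)
    return not any(
        SEVERITY_ORDER.index(f.severity) >= threshold for f in findings
    )


findings = [
    Finding("missing-rate-limit", "medium", "app/api/orders/route.ts"),
    Finding("auth-bypass", "high", "app/admin/page.tsx"),
]
```

With these two sample findings, `release_allowed(findings)` is false until the auth-bypass finding is resolved; the medium finding alone would not block.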

11. Review

Orchestrates approval gates between phases. Every artifact gets a human decision before the next phase runs. Feedback is structured — not PR threads.

12. Release notes

Auto-drafts the customer-facing changelog when an iteration closes. Schema changes included. Themed by what shipped — new / fixed / changed — not by ticket IDs.

13. Master merge

Folds approved change-request satellites back into the master requirement document, with a structured diff the PM can spot-check.

Why specialists, not a generalist

The single-agent, "one prompt one pipeline" approach has three problems:

A network of specialist agents fixes all three:

  • Each agent has a tight system prompt focused on its one job.
  • Handoffs are typed contracts — design_spec is a JSON schema, page manifest is a JSON schema, migration output is a typed file format. You can swap the LLM behind any agent and the contract stays.
  • Each agent has its own tier + escalation in agent config — we use Haiku for the cheap stuff and escalate to Sonnet or Opus only where it matters.
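
The last two points combine naturally: if every handoff has a contract, a failed contract check is a cheap signal to escalate. The sketch below assumes that trigger (the post does not say what causes escalation), uses a stand-in function instead of a real model call, and checks a made-up page manifest shape.

```python
from typing import Callable

# Tiers named in the post, cheapest first.
TIERS = ["haiku", "sonnet", "opus"]


def run_with_escalation(
    call_model: Callable[[str], dict],
    is_valid: Callable[[dict], bool],
    start: str = "haiku",
    max_tier: str = "opus",
) -> tuple[str, dict]:
    """Try each tier from start to max_tier until the output passes the contract."""
    for tier in TIERS[TIERS.index(start): TIERS.index(max_tier) + 1]:
        output = call_model(tier)
        if is_valid(output):
            return tier, output
    raise RuntimeError(f"no tier up to {max_tier!r} produced a valid output")


def fake_model(tier: str) -> dict:
    # Stand-in for an LLM call: here only the larger tiers return a complete manifest.
    if tier == "haiku":
        return {"route": "/orders"}
    return {"route": "/orders", "components": ["Table"]}


def is_page_manifest(output: dict) -> bool:
    # Hypothetical contract: a manifest needs both of these keys.
    return isinstance(output, dict) and {"route", "components"} <= output.keys()


tier, manifest = run_with_escalation(fake_model, is_page_manifest)
```

In this run the cheap tier's output fails the contract, so the call lands on `sonnet`. Because the contract check knows nothing about the model, swapping what sits behind `call_model` leaves the rest untouched.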

The graph between them

                    ┌──→ Design ──────────┐
  Requirements ────→│                     ├──→ Pages ──→ Tests ────┐
                    └──→ Schema ──→ Migrations               ↘    │
                                                               ↘   ↓
                                            Integration ──→ Security ──→ Release notes
                                                                 ↑
                                                Review (at every phase)
  • Requirements fans out into parallel Design + Schema + Integration work.
  • Pages spawn from the Schema manifest — one task per route, parallel.
  • Tests + Security run on the built surface before Release notes.
  • Review runs at each phase boundary.
  • Master merge runs on a different cadence — whenever the PM folds accumulated change-requests back into the source of truth.

What this replaces

In practice, a team shipping a single ops workflow with this graph replaces the following:

What they used before                         | Agent
Notion for specs                              | Requirements
Linear / Jira for tickets                     | Tasks
Figma + designer                              | Design
DBA + Liquibase XML                           | Schema
Flyway + ops pipeline                         | Migrations
Engineer + Next.js + shadcn/ui                | Pages
Engineer hand-rolling API clients             | Integration
Sentry + engineer triage                      | Debug
Test engineer writing unit + e2e              | Tests
Security review before launch                 | Security
PR review threads                             | Review
Someone writing release notes after the fact  | Release notes
Quarterly spec refresh meetings               | Master merge

The goal isn't to replace the humans — it's to stop the humans from being glue code between a dozen tools.

Running yours

Agents are composable. Every delivery workflow on AlgorithmShift is just a specific sequence of agent invocations, expressed as YAML:

id: ship-a-page
steps:
  - id: spec
    agent: requirements
    approval: human
  - id: schema
    agent: schema
    depends_on: [spec]
  - id: design
    agent: design
    depends_on: [spec]
  - id: page
    agent: pages
    depends_on: [schema, design]
  - id: tests
    agent: tests
    depends_on: [page]
  - id: security
    agent: security
    depends_on: [page]

If the built-in agents don't cover you, the agent builder lets you author your own (with your prompts, your tools, and your approval policy) and drop them into the same graph.

Try it

Start free. A free workspace gets all agents, enough generation jobs to build something real, and managed-mode migrations in dev. Bring one requirement you've been meaning to scope and see how it comes out.

