By Niels & Kris, August 2025



What the Hell Is an Agentic Coding Pipeline?

Think jazz band meets assembly line. The jazz part is you: improvisation, taste, judgment. The line is the system: issues flow in, plans harden, code appears, tests run, PRs open, clients get meaningful updates, and you decide what ships.

Your agents aren’t “AI gods.” They’re specialists you brief like adults and constrain like toddlers. Give them a lane, a rulebook, and a baton pass. They’ll run all night.

https://youtu.be/o-XdyWpivNI

The pipeline lives across four realities: a tracker (Linear/Jira/Trello), a codebase (GitHub/GitLab), an AI runtime (Claude, GPT-4, or an open model), and your terminal/editor. Your Main Agent reads tickets and decides whether to plan or build. A Planning Agent writes a precise plan. A Coding Agent executes the plan into a branch and opens a PR. A QA Agent pokes holes. A Client Summarizer translates the nerd-speak into “here’s what’s happening” without dumping raw guts on the client. You approve merges, adjust scope, and keep the groove.


Core Architecture — The Beginner Blueprint

Start small. Keep your hands on the wheel until the loop feels trustworthy. Your minimum stack is simple: Linear for truth, GitHub for code, an LLM you can hit from the command line, and your editor (Cursor or VS Code) for surgical tweaks. That’s it.

I like a root agents directory next to your code: a place where every agent has a manifest and a prompt template. Treat it like you treat code. Version it. Review it. Improve it. The Main Agent gets a short job description—what decisions it’s allowed to make, when to escalate, and which subagent to call. The Planning Agent gets strict output expectations: numbered steps, clear acceptance criteria, migrations spelled out, API signatures defined. The Coding Agent gets one power only: implement the approved plan in a dedicated branch and create a PR—no merging.

Keep early runs manual. You call each agent one step at a time from the CLI. You watch what it does. You read the artifacts. You ship only when you’re smiling, not when you’re tired.


Roles & Agents — The Cast of Characters

Let’s name the band and give them charts.

Main Agent (Conductor). Reads the ticket, checks for an existing plan, and routes work. The rule is simple: if there’s no plan, produce one; if there is, hand it to coding. If the plan is vague, send it back to Planning with a critique. The Main Agent never writes code. It guards the loop.
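That routing rule is small enough to sketch as a tiny state machine. The ticket shape and return values below are illustrative, not a real Linear API:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Ticket:
    id: int
    title: str
    plan: Optional[str] = None           # attached plan text, if any
    plan_critique: Optional[str] = None  # filled in when a plan came back vague

def route(ticket: Ticket) -> str:
    """Main Agent: decide the next baton pass. It never writes code itself."""
    if ticket.plan is None:
        return "planning"   # no plan yet -> ask the Planning Agent
    if ticket.plan_critique:
        return "planning"   # vague plan -> bounce back with the critique attached
    return "coding"         # concrete, approved plan -> hand it to the builder

print(route(Ticket(41, "Add Dark Mode Toggle")))                 # planning
print(route(Ticket(41, "Add Dark Mode Toggle", plan="1. ...")))  # coding
```

The point of keeping this logic dumb is that it stays auditable: you can read the whole decision surface in ten seconds.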

Planning Agent (Architect). Turns a short human issue into a buildable plan. It writes: scope, dependencies, a step-by-step build order, acceptance tests, and a rollback note. It also writes a simple, non-technical summary so your client or PM can understand the intention. Plans get attached to the Linear issue. Nothing moves until you approve.

Coding Agent (Builder). Reads the approved plan and implements exactly that. It creates a feature branch, writes code, updates tests, and opens a PR with a crisp summary that references the ticket. It does not “get creative.” It follows the plan and leaves breadcrumbs in the commit messages so you can audit the mind of the machine later.

QA Agent (Tester). Runs unit tests, lints, and a small scenario script that mirrors the acceptance criteria. It comments on the PR with failures and suggested fixes. It can annotate diffs but cannot push commits unless explicitly permitted.

Code Analyzer (Editor). Reviews the diff for security smells, performance traps, and style drift. It suggests focused patches, not rewrites. You can grant it the ability to push a “fixup” commit if the feedback is trivial.

Client Summarizer (Translator). Consumes the plan and the PR description and writes a short status update: what’s happening, why it matters, and what changes the user will notice. No jargon. No hidden drama. This is your trust engine.


The Beginner Workflow — Manual but Reliable

Let’s walk a ticket end-to-end. Keep hands on everything the first few times. You’re teaching the band your tempo.

Create the issue. In Linear, open a new ticket titled “Add Dark Mode Toggle to Dashboard.” The description is one paragraph: where the toggle lives, any constraints, and your gut on acceptance criteria. Keep it short but unambiguous.

Call the Main Agent. From your terminal, ask: “Read Linear ticket 41. If no plan exists, request a plan; if it exists, route to coding.” The agent fetches the ticket, notices the blank plan, and pings the Planning Agent.
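Under the hood, "fetches the ticket" is one small API call. Linear exposes a GraphQL endpoint; the query below uses real field names from its public schema, but treat the whole thing as a sketch and check the current docs before wiring it in:

```python
import json
import urllib.request

LINEAR_URL = "https://api.linear.app/graphql"

def build_issue_query(issue_id: str) -> dict:
    """GraphQL payload asking Linear for a ticket's title and description."""
    return {
        "query": """
            query Issue($id: String!) {
              issue(id: $id) { title description }
            }
        """,
        "variables": {"id": issue_id},
    }

def fetch_issue(issue_id: str, api_key: str) -> dict:
    """POST the query with your Linear API key; returns the issue fields."""
    req = urllib.request.Request(
        LINEAR_URL,
        data=json.dumps(build_issue_query(issue_id)).encode(),
        headers={"Content-Type": "application/json", "Authorization": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["data"]["issue"]

payload = build_issue_query("TICKET-41")
print(payload["variables"]["id"])  # TICKET-41
```

The agent's "read the ticket" step is just this plus the routing rule; no magic.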

Planning happens. The Planning Agent returns a concrete build guide: add a toggle component in Settings, store preference on the user profile, wire CSS variables for themes, persist on reload, include a config flag for rollout, and list the exact files it expects to touch. It writes a non-technical summary: “We’re adding a theme toggle to the Settings screen so users can switch between light and dark. We’ll remember their choice and apply it everywhere.” It attaches both to the Linear ticket.

You review. Ten seconds to scan the plan, thirty to tune a detail. Maybe you add “ship behind a feature flag for 10% of users.” Approve the plan in Linear. The Main Agent sees the status change and wakes the Coding Agent.

Coding executes. The Coding Agent creates a branch feature/dark-mode-toggle, implements the steps in order, updates tests, and opens a PR titled “Dark mode toggle (Linear #41)” with a short body that includes the plan and a checklist of acceptance criteria.

QA runs. The QA Agent runs the test suite, lints, and a small script that toggles the theme, reloads, and confirms persistence. It comments on the PR with results. If there’s a break, it links the exact line and suggests a minimal patch.
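A QA comment like that can come out of a small formatter. The result shape here is made up; wire it to whatever your test runner actually emits:

```python
def format_qa_comment(results: list[dict]) -> str:
    """Turn raw test results into a PR comment: failures first, each with a
    file:line pointer and a suggested minimal fix when one exists."""
    failures = [r for r in results if not r["passed"]]
    if not failures:
        return "QA: all checks passed."
    lines = [f"QA: {len(failures)} failing check(s)."]
    for f in failures:
        lines.append(f"- `{f['name']}` at {f['file']}:{f['line']}")
        if f.get("suggestion"):
            lines.append(f"  Suggested fix: {f['suggestion']}")
    return "\n".join(lines)

comment = format_qa_comment([
    {"name": "theme persists on reload", "passed": False,
     "file": "settings.test.ts", "line": 42,
     "suggestion": "write the preference before calling reload()"},
])
print(comment)
```

Keeping the formatter deterministic means the LLM only decides *what* failed and *why*; the comment shape never drifts.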

You polish and merge. Open the branch in Cursor, take a final pass, merge if you’re satisfied, and move the ticket to Done. The Client Summarizer posts a status update to your Slack or client portal.

That’s the whole loop. The work feels lighter because you’re saying “yes/no/more like this,” not “give me five hours.”


The Intermediate Pipeline — Semi-Autonomous Flow

Once the basic loop feels trustworthy, you invite more automation—but you still set guardrails.

First, graduate your ad-hoc prompts into real agent manifests. Each agent gets a small YAML (or JSON) file with a name, model, purpose, allowed tools, and strict output formats. Put them under version control. Treat changes like code. You can even open PRs to your agent prompts and review them with teammates, so your system evolves without devolving.
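Once manifests are files, you can lint them like files. Here's a minimal validator over the fields named above (the field names and the example model id are illustrative, not a standard):

```python
REQUIRED_FIELDS = {"name", "model", "purpose", "allowed_tools", "output_format"}

def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of problems; an empty list means the manifest is usable."""
    problems = [f"missing field: {f}"
                for f in sorted(REQUIRED_FIELDS - manifest.keys())]
    if "merge" in manifest.get("allowed_tools", []):
        # House rule from this pipeline: no agent ever merges.
        problems.append("no agent is ever allowed to merge")
    return problems

coding_agent = {
    "name": "coding-agent",
    "model": "claude-sonnet",  # illustrative model id
    "purpose": "implement the approved plan in a dedicated branch",
    "allowed_tools": ["git_branch", "git_commit", "open_pr"],
    "output_format": "pr_with_checklist",
}
print(validate_manifest(coding_agent))  # []
```

Run this in CI on your agents directory and a bad prompt change fails review the same way a bad code change does.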

Second, add a pre-flight critique. Before the Coding Agent starts, a tiny Analyzer reviews the plan for missing dependencies or risky migrations. If it finds holes, it posts a comment on the Linear ticket and bounces the plan back to Planning with a concrete fix list. That five-minute feedback saves five hours of teardown later.
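The pre-flight critique doesn't need to be smart to be useful. A few structural checks catch most holes before any model gets involved (the required sections here mirror what Planning must produce; adjust to your own plan template):

```python
def preflight_critique(plan: str) -> list[str]:
    """Cheap structural checks run before the Coding Agent is allowed to start."""
    required = ["acceptance", "rollback", "dependencies"]
    holes = [f"plan never mentions {word!r}"
             for word in required if word not in plan.lower()]
    if "drop table" in plan.lower() or "delete from" in plan.lower():
        holes.append("destructive migration detected: needs explicit human approval")
    return holes

plan = ("1. Add toggle. 2. Store preference. "
        "Acceptance: theme persists. Rollback: revert flag. Dependencies: none.")
print(preflight_critique(plan))  # []
```

Anything the cheap checks flag goes back to Planning as a concrete fix list; only clean plans earn a model-driven review on top.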

Third, begin a synthetic backlog. When you approve “Add dark mode,” your system can draft related tickets: “Persist theme choice server-side,” “Email dark-mode availability to beta users,” “Add theme toggle to mobile view.” You don’t auto-build these yet. You plant them. This becomes your option pool—features half-seeded, waiting on human prioritization.

Fourth, formalize the client transparency layer. Every approved plan generates two artifacts: a technical plan for your repo and an executive summary for humans. If your client portal supports comments, the Client Summarizer can post updates there. If your client prefers email, your bot sends a weekly digest with plain-language milestones and next intent. This isn’t performative. It’s trust compounding.

Finally, add light policy checks. The QA Agent can verify that secrets are read from environment variables, that logs avoid personal data, and that the PR doesn’t introduce new public endpoints without an auth note. Small rules, big safety.
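"Small rules" really can be small. A sketch of diff-level policy checks, with deliberately crude regexes you'd tune for your stack:

```python
import re

POLICY_RULES = [
    # (rule name, pattern that should NOT appear in a diff)
    ("hardcoded secret",
     re.compile(r"(api[_-]?key|secret|password)\s*=\s*['\"][^'\"]+['\"]", re.I)),
    ("personal data in logs",
     re.compile(r"log(ger)?\.\w+\(.*\b(email|ssn)\b", re.I)),
]

def check_policies(diff_text: str) -> list[str]:
    """Return the names of every policy rule the diff violates."""
    return [name for name, pattern in POLICY_RULES if pattern.search(diff_text)]

bad = 'API_KEY = "sk-live-123"'
good = 'api_key = os.environ["API_KEY"]'
print(check_policies(bad))   # ['hardcoded secret']
print(check_policies(good))  # []
```

The QA Agent posts violations as PR comments; merging stays blocked until a human overrides or the diff is fixed.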


Agent Config in the Wild — How They Think

Here’s how I shape the Planning Agent’s brain—in spirit rather than strict syntax:

  • You are a senior engineer who plans like a chess player: no vague verbs, no hidden steps.
  • Input is a Linear ticket (title + description).
  • Output is a numbered plan with file-level suggestions, a clear test list, and a rollback paragraph.
  • You write a one-paragraph non-technical summary for stakeholders.
  • You never write code. You do not assume libraries that aren’t in package.json/pyproject.toml.
  • If anything is ambiguous, list clarifying questions and stop.

The Coding Agent’s manifesto is even tighter:

  • You implement the approved plan exactly.
  • You create a new branch and keep changes scoped.
  • You update or write tests.
  • You include a PR body that restates acceptance criteria and a checklist that maps to them.
  • You never merge.
  • If a step is impossible, you stop and explain in the PR.

The QA Agent speaks rules, not vibes:

  • You run the test suite and lints.
  • You run a short script that executes the acceptance criteria.
  • You comment with failures and actionable suggestions.
  • You don’t guess at fixes outside the plan.

This seems fussy until it saves your bacon twice in one week.


Examples, Tactics, and Traps

Let’s widen the lane with real textures.

Prompt versioning is non-negotiable. Treat prompts like code. Give them semantic versions. When something goes sideways you’ll want to diff the brain, not just the output. Commit messages for prompts should read like doctor’s notes: “v1.2 — tightened DB migration rules; forbid destructive changes without explicit approval.”

Keep branches small. Agents thrive on tight scopes. A good plan fits in a screen. A good branch fits in a coffee break review. If the plan spills into “phase two” talk, split it now. Your future self will thank you.

Sandbox the dangerous bits. If a plan touches auth, payments, or PII, force a manual review between Planning and Coding. Also, ensure your agents never access production credentials. Use ephemeral dev containers (Codespaces, local Docker) and ephemeral keys. Airgap anything that would ruin your month.

Teach your agents your house style. If you’re a typed Python shop or a strict ESLint/Prettier team, feed those rules to your agents. Show them your favorite commit message style. Teach them how you structure tests. Machines learn your rituals faster than interns—if you document them.

Be specific with migrations. Planning should name the table, the column, the default value, and a rollback step. If a migration deletes data, Planning must ask for human approval. That’s not bureaucracy. That’s survival.
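You can encode that survival rule as a gate. The migration-step shape below is invented for illustration; map it onto whatever your migration tool (Alembic, Rails, raw SQL files) actually produces:

```python
DESTRUCTIVE = ("drop column", "drop table", "delete from", "truncate")

def migration_needs_human(step: dict) -> bool:
    """A step must carry a rollback; anything destructive additionally
    requires an explicit human approval flag before Coding may run it."""
    sql = step["up"].lower()
    missing_rollback = not step.get("down")
    destructive = any(word in sql for word in DESTRUCTIVE)
    return missing_rollback or (destructive and not step.get("human_approved", False))

safe = {
    "up": "ALTER TABLE users ADD COLUMN theme TEXT DEFAULT 'light'",
    "down": "ALTER TABLE users DROP COLUMN theme",
}
risky = {"up": "DROP TABLE legacy_sessions", "down": "-- irreversible"}

print(migration_needs_human(safe))   # False
print(migration_needs_human(risky))  # True
```

Note the safe example names the table, the column, and the default, exactly as the plan should.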

Don’t overexpose the guts to clients. Transparency doesn’t mean streaming your terminal. Clients get intent, milestones, and expected outcomes. You keep the messy middle. The Summarizer’s job is to translate, not terrify.

Know when to stop. Agents can spiral—refactor rabbit holes, endless “optimizations.” Cap runtime, cap token budgets, and cap scope. Your pipeline should breathe, not hyperventilate.


Transparency Layer — Show the System, Not the Sausage

Clients pay you to build systems that keep delivering, not one-off heroics. When a request comes in—say “export analytics to CSV with filters”—your pipeline turns it into a plan, a non-technical summary, and a PR in motion. You share the summary first: what changes, where it lives, and how it helps. You include a gentle note about rollout and testing. You update them when the PR is ready for review and again after merge with a “what to verify” checklist.

This rhythm matters. Clients stop asking “is it done yet?” because the system answers before they have to. They also begin to grasp that what you’ve built for them isn’t just a feature. It’s a feature factory you maintain, tune, and evolve. Pricing shifts naturally from “billable hours” to “stewardship of the machine that builds.” That’s grown-up work.

A bonus play: give stakeholders a “read-only roadmap” that reflects your synthetic backlog. Keep it curated—no raw spitballing—just the top ten likely moves with one-line explanations. Now your client can see the near future and shape it with you, not after you.


Security, Ethics, and the Boring Stuff That Saves You

Keep code and secrets separate. Your Coding Agent doesn’t get production environment variables, full stop. If your agent framework allows tools (shell, HTTP), whitelist the exact commands and domains. Audit logs should tell you who did what, when, with which prompt version.

Bias check your Summarizer. If the client is non-technical, it must not condescend. If they are technical, it must not bluff. Calibrate with a few golden examples and keep them in the agent’s “voice pack.”

Respect data boundaries. Support tickets and analytics can seed a synthetic backlog, but don’t slurp personal info into your prompt context. Build small anonymizers. Teach your Summarizer to redact by default. If you’re working with regulated data, fence the entire AI runtime in a compliant environment or keep sensitive tickets human-only.
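A "small anonymizer" can be one function sitting between your tracker and your prompt context. These two regexes are a starting point, not a compliance tool; real PII detection needs more patterns and review:

```python
import re

PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Redact by default before a support ticket enters any prompt context."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} redacted]", text)
    return text

out = redact("Customer jane@acme.com (call +1 415-555-0100) wants exports.")
print(out)  # Customer [email redacted] (call [phone redacted]) wants exports.
```

Run it on everything inbound; the Summarizer should never have seen the raw value in the first place.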

Ethics isn’t a side quest. If your pipeline starts scraping the universe for answers, cite sources, avoid licensing traps, and don’t ship other people’s work as your own. That’s not “edgy.” That’s lazy.


The Garden, Not the Assembly Line

Once this engine hums, your repo stops feeling like a graveyard of TODOs and starts feeling alive. Tickets don’t die—they compost into better plans. Half-built branches wait politely for your taste check. A script runs every Friday to groom the synthetic backlog, close stale ideas, and nudge promising ones. You become the curator of momentum, not the firefighter of chaos.

This is where vibecoding shines. You can sit with a client, hear the shape of a request, and “play it” into the system like a melody: seed the title, vibe the intent, let Planning flesh the harmony, let Coding lay down the track, and let QA keep time. You’re not hoping the machine makes art. You’re using it to keep your own art moving.


Concrete Walkthrough — From Vibe to Merge

Let’s do a realistic run with some twists.

A client messages: “We need scheduled CSV exports for weekly sales by region.” You throw a short title in Linear and a description with three lines: where the export lives (Analytics → Exports), the fields they care about, and the schedule (Mondays 8am PT). You add one acceptance criterion: “CSV is downloadable and emailed to admins.”

Main Agent checks for a plan—none exists—and calls Planning. Planning proposes: add a server-side job that queries sales by region, paginate results to avoid memory blowups, stream to CSV, upload to your storage bucket with a seven-day TTL, email a signed link to admins, and add an Exports tab listing history with “download” and “rerun” buttons. It notes a new exports table with filename, size, status, and requested_by. It suggests a feature flag and a small rate limit. It includes tests: unit tests for CSV shape, an integration test for the job, and a UI test to see the last five exports. It also writes a friendly summary for the client: what will exist, where to find it, what happens on Mondays.

You tweak two details: change the schedule to 7am and limit the first iteration to “sales by region,” not “sales by region and product.” Approve.

Coding spins a branch, writes the job and the UI, updates tests, and opens a PR. QA runs, fails on a memory test because the initial query was unbounded. It comments with the failure and suggests a batched approach. Coding amends with a streaming cursor and a limit. Tests pass. The Analyzer flags that the CSV link should expire within 24 hours even though the file TTL is seven days; Coding adds a signed URL with 24-hour expiry. Now you review, smile, merge, and mark the ticket done.
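The batched fix QA asked for is a classic shape: pull rows in bounded chunks, write the CSV as you go, keep memory flat. A sketch with a stand-in cursor (swap `fetch_batches` for your real query, e.g. keyset pagination):

```python
import csv
import io
from typing import Iterable, Iterator

def fetch_batches(batch_size: int = 1000) -> Iterator[list[tuple]]:
    """Stand-in for a batched DB cursor; replace with the real sales query."""
    rows = [("west", 1200), ("east", 950), ("south", 780)]
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]

def stream_csv(batches: Iterable[list[tuple]], out) -> int:
    """Write the header once, then each batch as it arrives; memory use does
    not grow with the number of rows the query returns."""
    writer = csv.writer(out)
    writer.writerow(["region", "weekly_sales"])
    written = 0
    for batch in batches:
        writer.writerows(batch)
        written += len(batch)
    return written

buf = io.StringIO()
count = stream_csv(fetch_batches(batch_size=2), buf)
print(count)  # 3
```

In production `out` would be a file handle streaming straight to the storage bucket, which is what makes the seven-day TTL upload cheap.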

The Summarizer posts: “Exports land Mondays at 7am in Analytics → Exports. You’ll get an email with a one-day download link. We kept scope to ‘by region’ so you can test quickly; ‘by product’ is pre-planned for next sprint.” The client replies with a thumbs-up and—more importantly—trust.


Patterns That Compound

Three patterns will quietly change your life.

Pause points. Bake in natural brakes. After Planning: human approval required. After Coding: human review required. After QA: human merges only. Agents run the hustle; you guard the soul.

Golden tickets. Keep a tiny set of exemplar tickets and their artifacts: a perfect plan, a perfect PR, a perfect client update. Feed them to your agents as “this is what great looks like.” Machines learn faster with taste anchors.

Friday reset. Every Friday, run a ritual: a short script that audits open PRs, stale branches, and noisy prompts. It proposes closures, cleanup, and prompt tweaks. You accept most of it with a grin and go into the weekend lighter.
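The stale-branch half of the Friday reset is a pure function once you've pulled branch metadata out of git. This version only *proposes* closures; a human still deletes:

```python
from datetime import date, timedelta

def stale_branches(branches: dict[str, date], today: date,
                   max_age_days: int = 14) -> list[str]:
    """Friday reset helper: name every branch whose last commit is older
    than the cutoff, sorted so the report is stable week to week."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, last in branches.items() if last < cutoff)

branches = {
    "feature/dark-mode-toggle": date(2025, 8, 20),
    "spike/csv-library": date(2025, 7, 1),
}
print(stale_branches(branches, today=date(2025, 8, 22)))  # ['spike/csv-library']
```

Feeding it real data is one `git for-each-ref` away; keeping the decision logic separate from the git plumbing is what makes the ritual auditable.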


What to Automate Next

Once you trust the loop, let it breathe.

Let Main trigger Planning automatically when a new Linear ticket enters “Ready.” Let Planning tag you only if it has clarifying questions. Let Coding run only when a plan’s status flips to “Approved.” Let QA run on every PR from an agent branch. Let Summarizer send a weekly roll-up on Fridays. You’re not chasing notifications anymore; the system narrates itself.

You can also let agents draft docs: updating the changelog, bumping the README, and writing a minimal “How to use dark mode” snippet for your help center. Keep docs short and interlinked. The job is to cut support tickets by answering obvious questions before they’re asked.

Finally, invite exploration. Give your Coding Agent a sandbox branch where it’s allowed to spike experimental ideas from your synthetic backlog without touching main. Maybe it drafts a proof-of-concept for role-based theming or a migration to a more efficient CSV library. You harvest what’s good and prune the rest.


Why This Works (and When It Doesn’t)

This works because you split work into the pieces machines are great at—repeatable planning, scoped generation, rote testing—and you keep the pieces only humans are great at—taste, trade-offs, ethics, relationship. You also keep tempo: small plans, small branches, fast feedback.

It fails when you let agents free-solo big features, when you pretend a vague plan is “creative freedom,” or when you expose raw agent chatter to clients. It fails when your secrets leak or your policies don’t exist. It fails when you ask the machine to lead instead of to assist.

The fix is always the same: tighten the plan, shrink the scope, strengthen the rulebook, and slow the loop just enough to steer again. You’re not racing anyone. You’re building a garden.


Next Moves — What You Do After Reading This

Spin up three agents—Main, Planning, Coding—and run three small tickets end-to-end. Read every artifact like a hawk. Once it feels right, add QA and Summarizer. Install pause points and policy checks. Start the synthetic backlog. Schedule the Friday reset. The whole thing is a weekend build if you keep your scope tight.

Then, and only then, automate trigger points. You want the system to move without you, but never against you.


Final Challenger Idea

Don’t sell features. Sell the system that makes features inevitable. Clients will come for dark mode. They’ll stay for the feeling that their product is always moving—planned clearly, built cleanly, tested honestly, and explained like they matter. That’s not AI hype. That’s craft with a motor.

Before you build, sketch the "Linear → Agents → Git → Client" loop as a one-page systems diagram: draw each baton pass and mark where the pause points live. Seeing the flow before you ride it is what makes the whole thing click.