All You Need Is a Monorepo and Guardrails
AI agents make mistakes that humans wouldn't. They over-engineer simple features, add unnecessary abstractions, and misunderstand business logic edge cases. I ship most of PgBeam's code through AI agents anyway, because the right constraints make these failures rare and the productivity gain is too large to ignore.
The setup is a monorepo with guardrails. The monorepo gives agents full context. The guardrails keep their output consistent. Each layer builds on the previous one: the monorepo enables the rules file, the rules file enables the knowledge base, the knowledge base enables domain-specific skills, and code generation makes entire classes of bugs structurally impossible.
The monorepo is the interface
PgBeam is a PostgreSQL proxy SaaS with ~140,000 lines of code: a Go data plane, a Go control plane, a Next.js dashboard, a TypeScript SDK, a CLI, and Pulumi infrastructure across six regions. All of it lives in one repository.
This matters because agents need context. When an agent adds a new API endpoint, it reads the OpenAPI spec, generates Go server types, implements the handler, updates the TypeScript SDK, and adds the dashboard page. One session, one repository. No cross-repo coordination, no guessing at interfaces.
A single pnpm generate command runs the full code generation pipeline. A single pnpm test validates everything. The feedback loop is tight: an agent makes a change and immediately knows whether it broke something. One CLAUDE.md governs the entire codebase.
CLAUDE.md is the constitution
CLAUDE.md is a rules file that every AI agent session loads before doing anything. It contains the key constraints that make unsupervised code generation safe:
- Contract-first: OpenAPI and protobuf are the source of truth. Generate code, don't write it.
- Never hand-edit generated code: specific directories are off-limits.
- Type safety: no `interface{}` in Go, no `any` in TypeScript (with rare, documented exceptions).
- Structured logging: `log/slog` everywhere, no `fmt.Println`.
- Schema in sync: update the schema file with every migration.
- RSC conventions: `page.tsx` is a server component, `client.tsx` is a client component, `actions.tsx` is a server action.
- Docs-first: read relevant knowledge base files before implementing, update after changes.
These aren't suggestions. They're hard rules that agents follow on every change. Across 800+ commits, the codebase stays consistent because every contributor follows the same rules, whether human or AI.
The brain vault
AI agents are stateless. Each session starts fresh with no memory of previous work. PgBeam solves this with a brain/ directory: a knowledge vault that persists across sessions.
It contains architecture docs, development conventions, infrastructure details, billing logic, security assessments, and runbooks. Before an agent touches code, it reads the relevant files. After making changes, it updates them. When an agent needs to add a new proxy region, it reads brain/infrastructure/regions.md and follows the exact steps. When it needs to understand the billing model, it reads brain/billing/plans.md.
The brain isn't just for code. It includes marketing positioning, writing guidelines for blog posts, competitive research, and incident response procedures. This means agents can draft blog posts that match the project's voice, write documentation that follows established patterns, and prepare incident reports with the right structure.
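Concretely, the vault is just files in the repository. A sketch of the layout, using the paths named above; the `conventions/` and `runbooks/` names are hypothetical stand-ins for the other areas mentioned:

```
brain/
├── infrastructure/
│   └── regions.md      # steps for adding a proxy region
├── billing/
│   └── plans.md        # billing model
├── marketing/          # positioning, voice, competitive research
├── plans/              # validated implementation plans
├── conventions/        # development conventions (hypothetical name)
└── runbooks/           # incident response procedures (hypothetical name)
```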
The intended workflow is that knowledge stays current because agents update brain files as part of their task. In practice, this requires periodic human pruning. Outdated brain files are worse than no brain files.
Skills as domain expertise
Generic AI agents write generic code. PgBeam has over 60 skills: packaged sets of guidelines and patterns that agents load based on the task.
Some are language-specific. Go skills encode patterns from Google's and Uber's style guides: error wrapping with %w, sentinel error usage, concurrency primitives, defensive programming. Frontend skills cover React performance, Next.js conventions, and TypeScript type safety from Vercel's engineering patterns.
Some are workflow-specific. `execute` takes a backlog item, scopes the work, creates a git worktree, implements the change, runs tests, and prepares it for review. `adversarial-review` spawns a separate agent session on a competing model to challenge the code before a PR is opened. `commit` formats commit messages. `unslop` removes AI writing patterns from prose.
The non-code skills matter more than you'd expect. `frontend-design` produces production-grade UI. `interaction-design` handles motion and microinteractions. `unslop` reviews any public-facing text and strips the telltale AI patterns: em dashes, "dive into", hedging, filler. This blog post went through it.
Code generation as a safety net
The strongest guardrail is not a rule agents follow. It's code they can't write.
PgBeam's API surface is entirely generated from OpenAPI YAML. The Go server types, route registration, TypeScript SDK types, and operation maps are all machine-generated and committed to specific directories that agents are instructed never to touch. SQL queries go through a type-safe generator that produces parameterized Go code from SQL. RPC types are generated from protobuf definitions.
A large class of bugs is structurally impossible:
- Type mismatches between server and client: both are generated from the same spec.
- Missing route handlers: the Go compiler rejects incomplete interface implementations.
- SQL injection via string concatenation: the generator produces parameterized queries.
- Stale SDK types: regenerating updates everything at once.
When an agent adds an API endpoint, it edits the OpenAPI YAML and runs pnpm generate. If the spec is malformed, the generator fails. If the new endpoint conflicts with an existing one, the bundler fails. If the handler signature doesn't match, the Go compiler fails. The agent gets feedback at every step, and the feedback is immediate.
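The "compiler rejects incomplete implementations" step relies on a standard Go idiom. In oapi-codegen-style output, the generated interface plus a one-line assertion turns a missing handler into a compile error; the interface and method names here are hypothetical:

```go
package main

import "fmt"

// ServerInterface stands in for the machine-generated interface:
// one method per operation in the OpenAPI spec.
type ServerInterface interface {
	ListProxies() string
	CreateProxy(name string) string
}

// Handler is the hand-written implementation the agent edits.
type Handler struct{}

func (Handler) ListProxies() string             { return "[]" }
func (Handler) CreateProxy(name string) string  { return "created " + name }

// The compile-time assertion: if the spec gains an operation and Handler
// doesn't implement it, this line fails to compile, so a missing route
// handler can never reach runtime.
var _ ServerInterface = Handler{}

func main() {
	fmt.Println(Handler{}.CreateProxy("eu-central"))
}
```

The assertion costs nothing at runtime; it exists purely to move the failure from production to `go build`.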
Adversarial review
Code that passes tests and compiles can still be wrong. PgBeam uses adversarial reviews before opening pull requests: the system spawns one to three separate agent sessions on a competing model to review the change from distinct critical perspectives.
One reviewer focuses on correctness: what inputs or states will break this? Another on structural fitness: where are the coupling points? A third on necessity: what can be deleted? They operate independently, report back with concrete findings, and the implementing agent fixes the real problems before the PR is created.
This catches mistakes that self-review misses. An agent that just wrote a database migration won't notice that it forgot to add an index for a query pattern. A fresh reviewer, loaded with database and performance skills, will.
The adversarial review isn't limited to code. Blog posts, documentation, and marketing copy go through the same process. This post was reviewed by three adversarial lenses before publication. They found factual errors in the original draft (wrong line count, wrong commit count) that would have shipped otherwise.
Interviews before implementation
Not every task starts with a clear spec. When I need to make a decision (a billing model change, an infrastructure migration, a new feature's scope), I don't just tell the agent what to build. I tell it to investigate the topic and interview me.
The agent reads the relevant brain files, researches the problem space, and comes back with structured questions. Each question has curated options with trade-offs explained, and the agent marks which option it recommends based on the codebase context. I pick answers, sometimes write in my own, and the agent refines its understanding. After a few rounds, it produces a plan grounded in both its research and my decisions.
This works better than writing a spec upfront because the agent surfaces questions I wouldn't have thought to answer. When I asked it to plan read replica routing for the proxy, it asked about failover behavior, connection draining strategies, and whether lag thresholds should be per-query or per-connection. These were decisions I needed to make, but wouldn't have written down in an issue.
The interview pattern also forces explicit trade-off decisions. Instead of the agent silently picking an approach and hoping it's right, every significant choice gets a question with options. The result is a plan I've actually validated, not one the agent guessed at.
Most of the proposals in brain/plans/ started as interviews. The agent investigates, asks, listens, and only then writes the plan. Implementation comes after. Critically, the validated plan gets committed to the brain vault before any code is written. This means subagents spawned during implementation can read the plan and understand the full context of what was decided and why, without the orchestrating agent needing to re-explain it.
Subagents, worktrees, and parallelism
A single agent session works sequentially. That's fine for a focused task, but slow when a feature touches multiple independent areas. PgBeam uses subagents to parallelize work: the main agent spawns specialized child agents that handle isolated subtasks concurrently.
Each subagent runs in its own git worktree, a lightweight copy of the repository that shares the same .git directory but has an independent working tree. This means two agents can edit different files at the same time without stepping on each other. One subagent adds the API endpoint and generates the Go types while another builds the dashboard page. They work in parallel, each in their own worktree, and the results get merged back.
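The worktree mechanics can be reproduced with plain git commands. A sketch using a throwaway repository; the paths and branch names are illustrative:

```shell
# Throwaway repo just to demonstrate the mechanics.
git init demo
cd demo
git -c user.name=demo -c user.email=demo@example.com commit --allow-empty -m "init"

# Each subagent gets its own worktree: one shared .git database,
# independent working trees that can be edited concurrently.
git worktree add ../demo-api -b feature/api-endpoint
git worktree add ../demo-ui -b feature/dashboard-page

# Both trees now appear alongside the main checkout.
git worktree list
```

Inside each worktree, `.git` is a small file pointing back at the shared object database, which is what makes spawning one per subagent cheap.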
The monorepo makes this practical. Every worktree gets the same CLAUDE.md, the same brain vault, the same code generation pipeline. A subagent doesn't need special setup or context. It checks out a worktree, reads the plan from brain/plans/, and starts working. The guardrails apply everywhere because they live in the repository itself.
Subagents also handle research. When an agent needs to investigate a library, check a competitor's approach, or audit a specific subsystem, it can spawn a subagent to do the research without losing its own place. The subagent returns findings, and the main agent continues with the results. This keeps the primary context window clean and focused on the implementation at hand.
The parallelism isn't unlimited. Tasks that touch the same files need to run sequentially. Code generation can't run in two worktrees at once because it writes to the same output directories. But for the common case of independent frontend and backend work, or parallel research tasks, subagents cut wall-clock time significantly.
Beyond code
The setup extends past the codebase. The same agents that write Go handlers also:
- Write and review blog posts: drafting from brain files that contain positioning, competitive research, and writing guidelines. The `unslop` skill strips AI patterns from the output.
- Generate documentation: API reference docs are generated from OpenAPI, but conceptual guides are written by agents following the docs-first convention in the brain vault.
- Produce videos and demos: scripting, storyboarding, and editing workflows use the same brain context for consistent messaging.
- Draft emails and announcements: product updates, changelog entries, and outreach follow the voice and positioning stored in `brain/marketing/`.
- Manage operations: runbooks, incident responses, and infrastructure changes are all documented in the brain and executable by agents.
The monorepo makes this possible because everything shares context. An agent writing a blog post about query caching can read the actual cache implementation, the benchmark results, and the competitive analysis, then produce a post grounded in real numbers.
What this actually looks like
A typical workflow: a GitHub issue describes a feature. An orchestrator picks it up, reads the issue, loads the relevant brain files, and starts an execution session in an isolated git worktree. The agent implements the change step by step, running code generation and tests along the way. When done, adversarial reviewers challenge the work. The agent fixes real issues, and a PR is opened linking back to the issue.
The human reviews the PR. Sometimes it's approved as-is. Sometimes there are comments that the agent addresses. CI deploys to staging, and post-deploy checks verify the feature works end-to-end.
The human's role shifts from writing every line to defining what should be built, reviewing what was built, and maintaining the guardrails that keep quality high. The guardrails are the leverage.
The trade-offs
The maintenance cost is real. 60+ skills need updating as patterns evolve. The brain vault needs pruning. CLAUDE.md evolves with the codebase. When conventions change, you update the guardrails first, then let agents follow the new rules.
The economics work for a small team running a complex product. The alternative is hiring specialists to cover Go systems programming, TypeScript frontend, infrastructure, DevOps, and content. The monorepo-plus-guardrails approach lets one person direct a fleet of specialized agents across all of these domains.
But the guardrails are more important than the model. Strong constraints on a capable model produce better results than an unconstrained model left to its own judgment. The monorepo makes the constraints enforceable. Everything else builds on top.
If you're running PostgreSQL across regions and want to see what this setup produces, try PgBeam. Check the live benchmarks to see latency numbers from 20 global regions.