
Building an agent system: Documentation, standards, and setup

Agent orchestration works.

DEC. 17, 2025
10 Min Read
by Adrian Obelmejias
I’ve shown the results and the workflow. But here’s what I haven’t told you yet: this approach lives or dies on documentation quality.
Not just any documentation. Not generic AI prompts. But high-quality engineering standards that serve both human developers AND AI agents.
If you’re reading this, you’re ready to build that foundation. Here’s exactly how.

The principle: Standards-driven development

The secret to effective AI agents isn’t better prompts or more powerful models. It’s comprehensive documentation of your team’s patterns, decisions, and conventions.
Here’s the key insight: The documentation you should already have for human developers is exactly what AI agents need to work effectively.
Every well-run engineering team has (or should have):
  • Coding conventions
  • Architecture decision records
  • Testing strategies
  • Best practices documentation
The same documentation that helps onboard a junior developer helps bootstrap an AI agent. No duplicate effort. No separate “AI documentation.” Just clear, comprehensive standards that serve both audiences.

The two-layer documentation strategy

Our documentation lives in two places, each serving a distinct purpose:

Layer 1: /docs/ - Standards for Humans AND Agents

docs/
├── fastapi-standards.md       # Backend API patterns
├── react-standards.md         # React component patterns  
├── testing-standards.md       # Testing approaches
├── python-standards.md        # Python conventions
├── ts-standards.md           # TypeScript conventions
├── nextjs-standards.md       # Next.js specific patterns
└── nx-monorepo-standards.md  # Monorepo conventions
These are your team’s engineering standards. They contain:
  • Exact code patterns with correct and incorrect examples
  • The “why” behind decisions (not just the “what”)
  • Common pitfalls with solutions
  • Security and compliance requirements
  • Performance considerations

Layer 2: .agents/ - Agent-Specific Context

.agents/
├── profiles/
│   ├── backend-dev.md      # Role, decisions, when to use patterns
│   ├── frontend-dev.md     # React role, component decisions
│   ├── architect.md        # System design, cross-cutting concerns
│   ├── reviewer.md         # Code quality, security checks
│   └── tester.md          # Testing strategies, edge cases
├── context/
│   ├── codebase-overview.md
│   ├── conventions.md
│   └── dependencies.md
├── workflows/
│   ├── feature-development.md
│   ├── bug-fix.md
│   └── refactoring.md
└── memory/
    └── (stores approved architectural plans)
Agent profiles provide:
  • Role and responsibilities (what this agent focuses on)
  • References to standards (where to find the patterns)
  • Decision frameworks (when to use which approach)
  • Agent-specific guidance (how to apply patterns for this role)
The key difference: Agent profiles reference the standards in /docs/ rather than duplicating them. This keeps everything in sync.

What makes standards “agent-friendly”

Through experimentation, I’ve learned what makes documentation work well for AI agents:

1. Show correct AND incorrect examples

Agents learn from contrast. Always provide both:
# Correct - Complete endpoint with all required patterns
@router.post("/", status_code=status.HTTP_201_CREATED)
async def create_user(
    session: DbSessionDep,        # Multi-tenant context
    current_user: CurrentUserDep, # Authentication required
    user_data: UserCreate,        # Pydantic validation
) -> User:
    """
    Create a new user in the organization.
    
    Requires organization_admin role. Creates audit log entry
    for HIPAA compliance.
    """
    # Permission check (never forgotten)
    if not await has_permission(current_user, "create_user"):
        raise HTTPException(status_code=403)
    
    # Business logic with proper transaction handling
    async with session.transaction():
        user = User(
            created_by=current_user.id,
            **user_data.model_dump()
        )
        session.add(user)
        await session.flush()
        
        # Audit logging for HIPAA
        await create_audit_log(
            user_id=current_user.id,
            action="user_created",
            resource_id=user.id
        )
        
        # Background task for notifications
        user_id = user.id
        async def publish_event():
            await sns_client.publish(...)
        session.on_commit(publish_event)
    
    return user
# Wrong - Missing critical patterns
@app.post("/users/")
def create_user(user: User):
    db.add(user)
    db.commit()
    return user

2. Be explicit about “why”

Agents need context for decisions:
# Use DbSessionDep (not AsyncSession) because it includes
# multi-tenant context and automatically sets the correct
# database schema based on the authenticated user
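In practice, that explanation lives right next to the code it justifies. Here’s a minimal sketch of the same comment attached to the dependency in an endpoint signature (imports omitted, as in the other examples):
@router.get("/{user_id}")
async def get_user(
    user_id: UUID,
    # Use DbSessionDep (not AsyncSession) because it includes the
    # multi-tenant context and sets the correct database schema
    # for the authenticated user
    session: DbSessionDep,
    current_user: CurrentUserDep,  # Authentication required
) -> User:
    ...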

3. Include edge cases and gotchas

Document the mistakes people make:
# Common mistake: Accessing user.id after session closes
# in background tasks. The session ends before the task runs.
# Solution: Capture primitive values before async tasks
user_id = user.id  # Capture the ID
async def send_notification():
    await notify(user_id)  # Use the captured value
session.on_commit(send_notification)

4. Structure with clear headers

Agents navigate structured docs efficiently:
## Core Principles
## When to Use This Pattern
## Common Pitfalls
## Examples
## Related Patterns

5. Reference, don’t duplicate

Agent profiles reference standards, don’t copy them:
# In agent profile:
"Follow the endpoint pattern in fastapi-standards.md section 3.2"
# Not:
"Here's the endpoint pattern: [500 lines of duplicated content]"

Real examples from our standards

Let me show you what effective standards look like. These work for both humans learning the codebase AND agents implementing features.

Example 1: FastAPI Endpoint Pattern

From docs/fastapi-standards.md:
# Correct - Complete endpoint with all required patterns
@router.post("/", status_code=status.HTTP_201_CREATED)
async def create_user(
    session: DbSessionDep,        # Multi-tenant context
    current_user: CurrentUserDep, # Authentication required
    user_data: UserCreate,        # Pydantic validation
) -> User:
    """
    Create a new user in the organization.
    
    Requires organization_admin role. Creates audit log entry
    for HIPAA compliance. Sends welcome email asynchronously.
    """
    # Permission check (never forgotten)
    if not await has_permission(current_user, "create_user"):
        raise HTTPException(status_code=403)
    
    # Business logic with proper transaction handling
    async with session.transaction():
        user = User(
            created_by=current_user.id,
            **user_data.model_dump()
        )
        session.add(user)
        await session.flush()
        
        # Audit logging for HIPAA
        await create_audit_log(
            user_id=current_user.id,
            action="user_created",
            resource_id=user.id
        )
        
        # Background task
        user_id = user.id
        async def publish_event():
            await sns_client.publish(...)
        session.on_commit(publish_event)
    
    return user
# Wrong - Missing critical patterns
@app.post("/users/")
def create_user(user: User):
    db.add(user)
    db.commit()
    return user
Why this works for agents: The agent sees the exact pattern, understands the “why” behind each line, and knows what mistakes to avoid.
Why this works for humans: New developers see the complete picture—not just syntax, but the architectural decisions and compliance requirements baked in.

Example 2: React component pattern

From docs/react-standards.md:
// Correct - Proper component structure
interface ButtonProps {
  children: React.ReactNode;
  variant?: "primary" | "secondary" | "outline";
  size?: "sm" | "md" | "lg";
  disabled?: boolean;
  onClick?: (event: React.MouseEvent<HTMLButtonElement>) => void;
}
export function Button({
  children,
  variant = "primary",
  size = "md",
  disabled = false,
  onClick,
}: ButtonProps): React.ReactElement {
  return (
    <button
      type="button"
      disabled={disabled}
      onClick={onClick}
      className={cn(
        "inline-flex items-center justify-center rounded-md font-medium",
        {
          "bg-primary text-white": variant === "primary",
          "bg-secondary text-secondary-foreground": variant === "secondary",
          "border border-input": variant === "outline",
        }
      )}
    >
      {children}
    </button>
  );
}
// Wrong - Untyped props, inconsistent naming
export function Button(props) {
  return <button className={props.class}>{props.text}</button>;
}

Example 3: Testing strategy

From docs/testing-standards.md:
# Business Logic Testing Focus
## What to test:
- Calculation functions - capacity calculations, scoring algorithms
- Permission systems - role-based access controls
- Validation rules - business rule enforcement
- State transitions - workflow state changes
- Data transformations - data processing logic
## What NOT to test:
- Framework internals - Don't test FastAPI or React behavior
- Third-party libraries - Trust they're already tested
- Simple getters/setters - Test behavior, not boilerplate
- Database queries - Test the business logic, not the ORM
## Test Structure:
async def test_user_creation_enforces_lead_access(
    db_session: DbSession,
    make_user,
    make_lead,
):
    """Test that users can only create records for leads they have access to."""
    # Arrange - Set up test data
    async with db_session.transaction():
        user = await make_user(db_session)
        lead = await make_lead(db_session, facility_id=user.facility_id)
    
    # Act - Perform the action
    result = await create_note(
        db_session, 
        user_id=user.id,
        lead_id=lead.id,
        content="Test note"
    )
    
    # Assert - Verify the outcome
    assert result is not None
    assert result.lead_id == lead.id
    assert result.created_by == user.id

Agent profile architecture

Agent profiles are where you define roles, responsibilities, and decision-making frameworks. Here’s what goes into an effective profile:

What goes in a profile

1. Role and responsibilities
  • What this agent focuses on
  • What decisions it makes
  • What it doesn’t handle
2. References to standards
  • Direct links to relevant /docs/ files
  • Clear pointers: “See section 3.2 of fastapi-standards.md”
  • Never duplicate the actual patterns
3. Decision frameworks
  • When to use which pattern
  • How to choose between approaches
  • When to ask for guidance
4. Common pitfalls
  • Role-specific gotchas
  • Mistakes to avoid
  • How to handle edge cases
5. Compliance context
  • Why certain patterns exist (HIPAA, security, etc.)
  • What happens if rules are violated
  • Audit requirements

Example: Backend developer profile

Here’s a simplified example from .agents/profiles/backend-dev.md:
# Backend Developer Agent Profile
## Core Identity
You are a Backend Development Specialist for this healthcare platform.
## Key Standards (Read These First!)
Your implementation patterns live here - always follow them:
- [FastAPI Standards](../../docs/fastapi-standards.md) Read this first
- [Python Standards](../../docs/python-standards.md)
- [Testing Standards](../../docs/testing-standards.md)
- [Multi-Tenant Architecture](../../docs/multi-tenant-architecture.md)
These docs contain the exact code patterns, syntax, and examples you must follow.
## Your Role and Responsibilities
1. API Development: Implement endpoints following FastAPI standards
2. Multi-tenant Operations: Ensure proper tenant isolation via schema separation
3. HIPAA Compliance: Never expose PHI in logs or errors (see standards for examples)
4. Background Tasks: Use session.on_commit() pattern (see standards for implementation)
## Decision Framework
When should you add audit logging?
→ Always, for any operation that touches PHI (see Testing Standards for examples)
When should you use DbSessionDep vs AsyncSession?
→ Always use DbSessionDep - it includes multi-tenant context automatically
   (see FastAPI Standards section 4.2 for why)
When should you create a background task?
→ For operations that take >200ms or involve external services
   (see FastAPI Standards section 6.3 for the pattern)
## Common Pitfalls (Role-Specific)
Pitfall 1: Forgetting Multi-Tenant Context
Wrong: Using AsyncSession directly
Correct: Always use DbSessionDep
See: FastAPI Standards section 4.2
Pitfall 2: Session Access in Background Tasks  
Wrong: Accessing user.id after session closes
Correct: Capture primitive values before the task
See: FastAPI Standards section 6.3
Pitfall 3: Exposing PHI in Logs
Wrong: logger.info(f"User {user.name} logged in")
Correct: logger.info(f"User {user.id} logged in")
See: Python Standards section 8.1 - HIPAA Logging
## Compliance Requirements
HIPAA Audit Logging:
- Log all access to PHI (reads, writes, deletes)
- Include: user_id, action, resource_id, timestamp, IP address
- Never log: actual PHI content, passwords, tokens
Multi-Tenant Isolation:
- Database schema separation (automatic via DbSessionDep)
- Never query across schemas
- All queries scoped to current tenant automatically
## When to Ask for Guidance
- Novel architectural patterns not covered in standards
- Security implications unclear
- Performance tradeoffs requiring business input
- Compliance questions beyond documented patterns
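To ground the audit-logging requirement, here’s a rough sketch of what a helper like the create_audit_log call in the endpoint examples might look like. The AuditLog model, the persistence helper, and the exact field set are assumptions for illustration, not our actual implementation:
from datetime import datetime, timezone
from uuid import UUID

async def create_audit_log(
    *,
    user_id: UUID,
    action: str,                   # e.g. "user_created", "record_viewed"
    resource_id: UUID,
    ip_address: str | None = None,
) -> None:
    """Record who did what to which resource, and when.

    Deliberately absent: PHI content, passwords, tokens.
    Only identifiers and metadata are persisted.
    """
    entry = AuditLog(              # model assumed for illustration
        user_id=user_id,
        action=action,
        resource_id=resource_id,
        ip_address=ip_address,
        created_at=datetime.now(timezone.utc),
    )
    await save_audit_entry(entry)  # persistence helper assumed; in practice
                                   # this writes through the tenant-scoped session
The point isn’t the exact shape of the helper. It’s that the profile tells the agent which fields matter and what must never be logged.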

Technical setup: Git worktrees and Docker isolation

Now let’s get into the practical setup for parallel development.

Git worktrees configuration

Git worktrees let you have multiple branches checked out simultaneously, each in its own directory:
# From your main repository
cd ~/projects/client-platform
# Create worktrees directory
mkdir -p ~/projects/client-worktrees
# Create a worktree for a new feature
git worktree add ~/projects/client-worktrees/feature-document-api -b feature/document-api
# Create worktree for a bug fix
git worktree add ~/projects/client-worktrees/fix-users-api -b fix/users-api-joins
# Your directory structure now looks like:
# ~/projects/client-platform          (main repo)
# ~/projects/client-worktrees/
#   ├── feature-document-api/            (separate branch)
#   ├── fix-users-api/                   (separate branch)
#   └── feature-contact-confirm/         (separate branch)

Docker isolation per worktree

Each worktree needs isolated Docker containers. Use COMPOSE_PROJECT_NAME:
# In each worktree, set a unique project name
cd ~/projects/client-worktrees/feature-document-api
export COMPOSE_PROJECT_NAME="agent-document-api"
docker compose up -d
# Different worktree, different containers
cd ~/projects/client-worktrees/fix-users-api
export COMPOSE_PROJECT_NAME="agent-users-api"
docker compose up -d
This creates completely isolated environments:
  • Separate database containers
  • Separate Redis instances
  • Separate service ports
  • No conflicts, no contamination

Context injection: How to initialize an agent

When starting work in a worktree, you inject context into your AI agent:
cd ~/worktrees/feature-new-api
# Feed the agent its role and context
ROLE: @.agents/profiles/backend-dev.md
CONTEXT:
- @.agents/context/codebase-overview.md
- @.agents/context/conventions.md
- @.agents/context/dependencies.md
TASK: Implement user preferences API
- Endpoints: GET/PUT /v1/users/{id}/preferences
- Storage: JSONB column for flexibility
- Validation: Max 10KB preferences size
Let's start with the database schema.
The @ syntax tells your AI tool (Cursor, Claude Code, etc.) to read and inject the file contents.
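To make the expected output concrete, here’s a rough sketch of the PUT endpoint an agent might produce from this context when it follows the standards above. The UserPreferences model, the get_preferences helper, and the JSONB field name are assumptions for illustration (imports omitted, as in the earlier examples):
@router.put("/{user_id}/preferences")
async def update_preferences(
    user_id: UUID,
    session: DbSessionDep,          # Multi-tenant context
    current_user: CurrentUserDep,   # Authentication required
    preferences: dict[str, Any],    # Stored in a JSONB column
) -> UserPreferences:
    """Replace the user's preferences, enforcing the 10KB size limit."""
    # Validation: max 10KB preferences size (from the task)
    if len(json.dumps(preferences).encode("utf-8")) > 10 * 1024:
        raise HTTPException(status_code=413, detail="Preferences exceed 10KB")

    # Permission check (never forgotten)
    if not await has_permission(current_user, "update_preferences"):
        raise HTTPException(status_code=403)

    async with session.transaction():
        record = await get_preferences(session, user_id)  # helper assumed
        record.data = preferences
        await session.flush()

        # Audit logging for HIPAA
        await create_audit_log(
            user_id=current_user.id,
            action="preferences_updated",
            resource_id=user_id,
        )

    return record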

Getting started: 4-week plan

Don’t try to build everything at once. Start small and iterate.

Week 1: Single agent, single worktree

Goal: Get comfortable with the guide/review cycle.
  1. Choose your model: Start with Claude Sonnet 4.5 (no extended thinking) for implementation tasks
  2. Document one pattern: Pick your most common task (e.g., “How we write API endpoints”) and document it completely with correct/incorrect examples
  3. Create your first agent profile: Backend or frontend developer, 50-100 lines
  4. Take one feature: Use a single agent in one worktree
  5. Practice guiding: Focus on reviewing and guiding, not typing
Success criteria: You complete one feature using agent orchestration and feel comfortable with the review cycle.

Week 2: Two agents in parallel

Goal: Experience parallel development.
  1. Enhance your agent profile: Add lessons learned from Week 1
  2. Document another pattern: Add testing or component patterns
  3. Take a feature with backend + frontend: Run two agents in parallel worktrees
  4. Practice context switching: Get comfortable switching between reviews
Success criteria: You complete a feature that touches both backend and frontend, managing two agents simultaneously.

Week 3: Full orchestration

Goal: Work on 3-4 features simultaneously.
  1. Add specialized agents: Create architect and reviewer profiles
  2. Document more patterns: Fill out your standards library
  3. Use the Architect → Plan → Implement pattern: For complex features, architect first
  4. Work on 3-4 features simultaneously: Mix of complexities
  5. Refine workflows: Adjust based on what works
Success criteria: You ship 3+ features in a day using orchestration and feel the cognitive shift.

Week 4: Team rollout

Goal: Scale to your team.
  1. Document your workflow: Write a team guide based on your experience
  2. Share your agent profiles: Let others use and improve them
  3. Review and refine standards: Make them work for the whole team
  4. Onboard one team member: Help someone else start with orchestration
Success criteria: At least one other person on your team is successfully using agent orchestration.

The maintenance strategy

Here’s how we keep everything in sync:
Standards evolve in /docs:
  • PR introduces a new pattern? Update the standard.
  • Architecture decision made? Document it.
  • Security issue discovered? Add to the standards.
Agent profiles stay stable:
  • They reference standards, not duplicate them
  • Changes to patterns don’t require updating agent profiles
  • Agents always see the latest patterns via references
The principle: Standards are the source of truth. Agent profiles are the lens through which agents view those standards.

Key lessons learned

After several weeks of working this way, here’s what I’ve learned:

1. Always architect complex features first

Don’t let a dev agent jump straight into implementation on non-trivial features. Use the Architect → Plan → Review → Implement workflow. The 20 minutes spent planning saves hours of refactoring.
Store these plans in .agents/memory/ so the dev agent has a roadmap. And critically: use your most powerful model (Opus or Sonnet 4.5 with extended thinking) for the architect—this is where complex reasoning matters most.

2. The bottleneck shifts to you

Your ability to context switch between reviews becomes the limiting factor. This is a good problem to have.

3. Quality improves

Continuous real-time review catches issues immediately. No more “I’ll review it later” backlog. You catch architectural issues (like missing permission checks) that agents might miss.

4. Consistency is automatic

Agents follow patterns perfectly. No more style drift across the codebase.

5. Context is everything

The more context you provide to agents (ticket details, acceptance criteria, QA feedback), the better their output. Tools like MCP servers can help automate this.

6. Documentation investment pays double

Every standard you write helps both human developers AND AI agents. No duplicate effort.

7. Start with what you have

You don’t need perfect documentation to start. Document one pattern, try it with one agent, and iterate based on mistakes.

8. The tool doesn’t matter—the system does

I use Cursor CLI for this workflow, but the principles work with any AI coding tool (Claude Code, GitHub Copilot, etc.). What matters is: specialized agent profiles, git worktrees for isolation, proper context injection, and the orchestration mindset.
Pick the tool that fits your workflow, but focus on building the system.

9. You can actually be a tech lead AND ship code

This is huge. While agents work on implementation, I can review PRs, answer Slack questions, groom the backlog, and handle architectural decisions—all without losing my place. The agents keep working while I context-switch to leadership responsibilities. When I return, I just review their progress.
No mental model to rebuild. No “where was I?” moment.

Real impact: Before and after

Let me show you the concrete difference documentation makes:
Before (generic AI assistant):
Me: "Create a user endpoint"
AI: Creates basic CRUD without auth, tenant isolation, or audit logging
Me: "Add authentication"
AI: Adds auth but forgets tenant isolation
Me: "Add multi-tenant support"
AI: Adds tenant check but uses wrong dependency
Me: "Add audit logging"
AI: Adds logging but exposes PHI
[After 10 iterations, still not production-ready]
After (agent with standards):
Me: "Create a user endpoint following our standards"
Agent: Reads fastapi-standards.md
Agent: Implements complete endpoint with:
       - Proper dependency injection (DbSessionDep, CurrentUserDep)
       - Multi-tenant isolation via schema
       - Permission checks
       - HIPAA-compliant audit logging
       - Background task pattern
       - Comprehensive tests
Result: production-ready on the first iteration.
The difference isn’t the AI model—it’s the documentation.

Start small, build momentum

You don’t need perfect documentation to start. Begin with:
  1. Document one pattern completely (e.g., “How we write API endpoints”)
  2. Show correct and incorrect examples
  3. Explain the “why” behind decisions
  4. Create your first agent profile that references it
  5. Try one feature with one agent
  6. Iterate based on what the agent gets wrong
Every time an agent makes a mistake, ask: “Is this documented in our standards?” If not, add it. Over time, your standards become comprehensive and your agents become more accurate.
The documentation you build for agents makes your team better—whether those team members are human or AI.