Building an Autonomous AI Agent Loop with Ralph and GitHub Copilot CLI

January 24, 2026 (14 min read)
Tags: AI Agents, Copilot CLI, Automation, Developer Tools, PRD

How we used the Ralph pattern to build Rekoma-Context from PRD to production in a single session

Quick Start: Copy This to Your AI Agent

Want to set up Ralph in your project right now? Copy and paste this prompt into Copilot CLI:

 Set up the Ralph autonomous agent loop in this project for autonomous AI-driven development.

   Reference the implementation pattern from https://github.com/soderlind/ralph

   Create the following files:

     - scripts/ralph/prd.json - A PRD template:
       {
         "project": "PROJECT_NAME",
         "description": "PROJECT_DESCRIPTION",
         "userStories": [
           {
             "id": "US-001",
             "title": "First feature",
             "description": "As a user, I need...",
             "acceptanceCriteria": ["criteria 1", "criteria 2"],
             "priority": 1,
             "passes": false,
             "notes": ""
           }
         ]
       }
     - progress.txt - Initialize with:
       # Ralph Progress Log
       Started: [TODAY'S DATE]
       Project: [PROJECT_NAME]
       ---
     - prompts/default.txt - The iteration prompt:
       Work in the current repo. Use these files as your source of truth:
       - The PRD JSON file provided as context (scripts/ralph/prd.json)
       - progress.txt

       1. Find the next incomplete feature (passes: false) to work on and work only on that feature.
          Work on features in priority order (lowest priority number that has passes: false).
       2. Implement the feature following the acceptance criteria.
       3. Run tests if applicable to verify the implementation works.
       4. Update the PRD JSON file with the work that was done (set passes: true, add notes).
       5. Append your progress to progress.txt.
          Use this to leave a note for the next person working in the codebase.
       6. Make a git commit of that feature with message: feat(US-XXX): <title>

       ONLY WORK ON A SINGLE FEATURE PER ITERATION.
       If, while implementing the feature, you notice the PRD is complete (all passes: true), output <promise>COMPLETE</promise>.
     - ralph.ps1 (Windows) or ralph.sh (Mac/Linux) - Loop script using the soderlind/ralph pattern:
       - Parameters: -Iterations (default 25), -Model (default gpt-5.2), -AllowProfile (safe/dev/locked), -DryRun
       - Build combined context file per iteration containing: PRD JSON + progress.txt + prompt
       - Execute via Copilot CLI with context attachment syntax:
         copilot --add-dir $ProjectRoot --model $Model --no-color --stream off --silent -p "@$contextFile Follow the attached prompt." [tool-args]
       - Tool permissions: use --allow-tool / --deny-tool flags; always deny shell(rm) and shell(git push)
       - Check output for <promise>COMPLETE</promise> to stop early
       - Clean up temp context file after each iteration

   The pattern: AI reads PRD → implements one feature → verifies → updates PRD → commits → repeats until done.

What is Ralph?

Ralph is an autonomous agent loop pattern (inspired by soderlind/ralph) that turns AI coding assistants into self-directed developers. Instead of manually prompting an LLM for each task, Ralph reads a structured PRD (Product Requirements Document), identifies the next incomplete feature, implements it, verifies the work, and commits, then repeats until every story is done.

The key insight: AI agents work better with clear, structured requirements and explicit completion criteria.
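
To make the control flow concrete, here is a minimal Python sketch of the loop itself. It is an illustration, not the script used in this post: `run_loop` and `fake_agent` are hypothetical names, and the real agent call (Copilot CLI) is replaced by a stand-in function so the example runs on its own.

```python
from typing import Callable, Optional

COMPLETE_SIGNAL = "<promise>COMPLETE</promise>"


def run_loop(run_agent: Callable[[int], str], max_iterations: int = 25) -> Optional[int]:
    """Call the agent once per iteration until it emits the completion signal.

    Returns the iteration number at which the signal appeared,
    or None if the iteration budget ran out first.
    """
    for i in range(1, max_iterations + 1):
        output = run_agent(i)  # one iteration = one feature
        if COMPLETE_SIGNAL in output:
            return i
    return None


def fake_agent(iteration: int) -> str:
    """Hypothetical stand-in for the real agent: reports completion on call 3."""
    if iteration == 3:
        return "All stories pass. " + COMPLETE_SIGNAL
    return f"Implemented one story in iteration {iteration}"


finished_at = run_loop(fake_agent)
print(finished_at)
```

The iteration cap matters as much as the completion signal: it bounds the run if the agent never reports that it is done.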

Why Ralph + Copilot CLI?

Traditional AI coding workflows look like this:

Human: "Add a config system"
AI: [writes code]
Human: "Now add tests"
AI: [writes tests]
Human: "Tests fail, fix them"
AI: [fixes]
Human: "Now commit"
...repeat 50 times...

With Ralph, it becomes:

Human: ./ralph.ps1 -Iterations 25
[goes to lunch]
AI: [completes 19 user stories, each with tests, types, and commits]

The improvements:

| Aspect             | Manual Prompting       | Ralph Loop                        |
|--------------------|------------------------|-----------------------------------|
| Context continuity | Lost between prompts   | Maintained via PRD + progress.txt |
| Task scoping       | Vague, scope creep     | Explicit acceptance criteria      |
| Verification       | Manual "does it work?" | Automated: pytest + mypy          |
| Progress tracking  | Mental model           | progress.txt log                  |
| Commits            | Often forgotten        | Automatic per-feature             |

The Ralph Architecture

┌─────────────────────────────────────────────────────────┐
│                       ralph.ps1                          │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────┐  │
│  │  PRD JSON   │  │ progress.txt│  │  prompts/*.txt  │  │
│  │  (truth)    │  │  (history)  │  │  (instructions) │  │
│  └──────┬──────┘  └──────┬──────┘  └────────┬────────┘  │
│         │                │                   │           │
│         └────────────────┼───────────────────┘           │
│                          ▼                               │
│              ┌───────────────────────┐                   │
│              │  Combined Context     │                   │
│              │  (.ralph-context.tmp) │                   │
│              └───────────┬───────────┘                   │
│                          ▼                               │
│              ┌───────────────────────┐                   │
│              │    Copilot CLI        │                   │
│              │  (autonomous agent)   │                   │
│              └───────────┬───────────┘                   │
│                          ▼                               │
│              ┌───────────────────────┐                   │
│              │  Code + Tests + Commit│                   │
│              └───────────────────────┘                   │
└─────────────────────────────────────────────────────────┘
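
The "Combined Context" box in the diagram is plain string concatenation. As a rough Python sketch of that step (the function name `build_context` is an assumption; the section headings mirror the ones the loop script writes into its temp file):

```python
def build_context(prd_json: str, progress: str, prompt: str) -> str:
    """Join the three inputs into one document: PRD, then history, then instructions."""
    return (
        "# Context\n\n"
        "## PRD (scripts/ralph/prd.json)\n"
        f"{prd_json}\n\n"
        "## progress.txt\n"
        f"{progress}\n\n"
        "# Prompt\n\n"
        f"{prompt}\n"
    )


context = build_context('{"userStories": []}', "# Ralph Progress Log", "Work in the current repo.")
print(context)
```

Putting the prompt last means the instructions are the final thing the agent reads, after the PRD and the history.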

Complete Setup Files

1. Install Copilot CLI

Ralph requires the standalone Copilot CLI (not gh copilot):

# Windows (winget)
winget install GitHub.CopilotCLI

# Verify installation
copilot --version

2. Project Structure

your-project/
├── scripts/ralph/prd.json    # The PRD with user stories
├── progress.txt              # Running log of completed work  
├── prompts/default.txt       # Instructions for each iteration
└── ralph.ps1                 # The loop script

3. PRD Template (scripts/ralph/prd.json)

Copy this complete template:

{
  "project": "my-project",
  "branchName": "ralph/feature-v1",
  "description": "Brief description of what this PRD delivers",
  "userStories": [
    {
      "id": "US-001",
      "title": "Initialize project structure",
      "description": "As a developer, I need a proper project structure so I can start building.",
      "acceptanceCriteria": [
        "Create src/ directory with __init__.py",
        "Create pyproject.toml with project metadata",
        "Create tests/ directory",
        "Run pip install -e . successfully",
        "Run pytest tests/ successfully (even if no tests yet)"
      ],
      "priority": 1,
      "passes": false,
      "notes": ""
    },
    {
      "id": "US-002",
      "title": "Implement core feature",
      "description": "As a user, I need the core feature so I can do X.",
      "acceptanceCriteria": [
        "Create src/core.py with main function",
        "Add type hints to all functions",
        "Create tests/test_core.py with >85% coverage",
        "Run python -m mypy src/ successfully",
        "Run pytest tests/ successfully"
      ],
      "priority": 2,
      "passes": false,
      "notes": ""
    }
  ]
}

Key fields explained:

  • priority: A lower number means higher priority. Ralph works on the incomplete story with the lowest number first.
  • passes: Set to true when complete. Ralph skips completed stories.
  • acceptanceCriteria: Explicit checklist. Include verification commands!
  • notes: Ralph writes implementation notes here after completing.
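
The selection rule behind priority and passes fits in a few lines. A minimal Python sketch, assuming the PRD shape from the template above (`next_story` is a hypothetical helper, not part of Ralph):

```python
import json
from typing import Optional


def next_story(prd: dict) -> Optional[dict]:
    """Return the incomplete story with the lowest priority number, or None if all pass."""
    pending = [s for s in prd["userStories"] if not s["passes"]]
    if not pending:
        return None
    return min(pending, key=lambda s: s["priority"])


prd = json.loads("""
{
  "userStories": [
    {"id": "US-001", "title": "Initialize project structure", "priority": 1, "passes": true},
    {"id": "US-003", "title": "Later feature", "priority": 3, "passes": false},
    {"id": "US-002", "title": "Implement core feature", "priority": 2, "passes": false}
  ]
}
""")

story = next_story(prd)
print(story["id"])
```

Here US-001 is skipped because it already passes, and US-002 is chosen over US-003 because its priority number is lower.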

4. Prompt File (prompts/default.txt)

Copy this complete prompt:

Work in the current repo. Use these files as your source of truth:
- The PRD JSON file provided as context (scripts/ralph/prd.json)
- progress.txt

Follow these steps:

1. Find the next incomplete feature (passes: false) and work ONLY on that feature.
   Work in priority order (lowest priority number that has passes: false).

2. Implement the feature according to its acceptance criteria.
   - Write clean, well-typed code
   - Add comprehensive tests
   - Follow existing code patterns in the repo

3. Verify your work:
   - Run the project's type checker (e.g., python -m mypy src/)
   - Run the project's tests (e.g., pytest tests/)
   - Fix any failures before proceeding

4. Update the PRD JSON file:
   - Set passes: true for the completed story
   - Add implementation notes to the notes field

5. Append your progress to progress.txt:
   - Date and user story ID
   - Brief summary of what was implemented
   - Any important decisions or caveats

6. Make a git commit with message: "feat(US-XXX): <title>"

IMPORTANT RULES:
- ONLY work on ONE feature per iteration
- Do NOT skip ahead to other features
- Do NOT mark passes: true until ALL acceptance criteria are met
- If you notice ALL user stories have passes: true, output: <promise>COMPLETE</promise>

5. Progress File (progress.txt)

Initialize with:

# Ralph Progress Log
Started: YYYY-MM-DD
Project: my-project
---

## Codebase Patterns
- [Add patterns as you discover them]
- [e.g., "Use pytest for testing"]
- [e.g., "Type hints required on all functions"]

---
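
The prompt asks for the date, the story ID, a summary, and any caveats in each log entry. The exact layout is up to the agent. As one possible shape, here is a hypothetical Python formatter (the `## date - id` heading and the field labels are assumptions for illustration):

```python
from datetime import date
from typing import Optional


def format_progress_entry(story_id: str, summary: str,
                          caveats: str = "", day: Optional[date] = None) -> str:
    """Build one progress.txt entry: date and story ID, summary, optional caveats."""
    day = day or date.today()
    lines = [f"## {day.isoformat()} - {story_id}", f"- Summary: {summary}"]
    if caveats:
        lines.append(f"- Caveats: {caveats}")
    return "\n".join(lines) + "\n"


entry = format_progress_entry(
    "US-001",
    "Created project scaffold",
    caveats="pytest runs with no tests yet",
    day=date(2026, 1, 24),
)
print(entry)
```

A consistent entry format makes the log easier for the next iteration to scan, since each run starts without memory of the previous one.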

6. Loop Script (ralph.ps1)

Copy this complete script:

#!/usr/bin/env pwsh
<#
.SYNOPSIS
    Ralph: Autonomous agent loop using Copilot CLI (soderlind/ralph pattern).
    
.DESCRIPTION
    This script runs the Ralph loop for autonomous code generation.
    Based on soderlind/ralph pattern - uses Copilot CLI for autonomous operation.
    
.PARAMETER Iterations
    Number of Ralph iterations to run. Default is 25.
    
.PARAMETER Model
    AI model to use. Default is gpt-5.2.
    
.PARAMETER Prd
    Path to the PRD JSON file. Default is scripts/ralph/prd.json.
    
.PARAMETER AllowProfile
    Tool permission profile: safe | dev | locked. Default is dev.
    
.PARAMETER DryRun
    If set, shows what would be run without executing.
    
.EXAMPLE
    .\ralph.ps1 -Iterations 10
    
.EXAMPLE
    .\ralph.ps1 -Prd scripts/ralph/my-feature.json
    
.EXAMPLE
    .\ralph.ps1 -Model "gpt-5.1" -AllowProfile safe
#>

[CmdletBinding()]
param(
    [int]$Iterations = 25,
    [string]$Model = "gpt-5.2",
    [ValidateSet("safe", "dev", "locked")]
    [string]$AllowProfile = "dev",
    [string[]]$AllowTools = @(),
    [string[]]$DenyTools = @(),
    [string]$Prd,
    [switch]$DryRun
)

$ErrorActionPreference = "Stop"
$ScriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
$ProjectRoot = $ScriptDir

# Configuration
$PRD_FILE = if ($Prd) { Join-Path $ProjectRoot $Prd } else { Join-Path $ProjectRoot "scripts\ralph\prd.json" }
$PROGRESS_FILE = Join-Path $ProjectRoot "progress.txt"
$PROMPT_FILE = Join-Path $ProjectRoot "prompts\default.txt"

# Colors for output
function Write-Info { param([string]$msg) Write-Host "INFO: $msg" -ForegroundColor Cyan }
function Write-Success { param([string]$msg) Write-Host "OK: $msg" -ForegroundColor Green }
function Write-Warn { param([string]$msg) Write-Host "WARN: $msg" -ForegroundColor Yellow }
function Write-Err { param([string]$msg) Write-Host "ERROR: $msg" -ForegroundColor Red }

# Build tool args based on profile
function Get-CopilotToolArgs {
    $toolArgList = @()
    
    # Always deny dangerous commands
    $toolArgList += @("--deny-tool", "shell(rm)")
    $toolArgList += @("--deny-tool", "shell(git push)")
    
    if ($AllowTools.Count -eq 0) {
        switch ($AllowProfile) {
            "dev" {
                $toolArgList += @("--allow-all-tools")
            }
            "safe" {
                $toolArgList += @("--allow-tool", "write")
                $toolArgList += @("--allow-tool", "shell(pip:*)")
                $toolArgList += @("--allow-tool", "shell(python:*)")
                $toolArgList += @("--allow-tool", "shell(pytest:*)")
                $toolArgList += @("--allow-tool", "shell(npm:*)")
                $toolArgList += @("--allow-tool", "shell(node:*)")
                $toolArgList += @("--allow-tool", "shell(git:*)")
            }
            "locked" {
                $toolArgList += @("--allow-tool", "write")
            }
        }
    }
    
    foreach ($tool in $AllowTools) { $toolArgList += @("--allow-tool", $tool) }
    foreach ($tool in $DenyTools) { $toolArgList += @("--deny-tool", $tool) }
    
    return $toolArgList
}

# Check prerequisites
function Test-Prerequisites {
    $copilotPath = Get-Command copilot -ErrorAction SilentlyContinue
    if ($null -eq $copilotPath) {
        Write-Err "Copilot CLI not found!"
        Write-Info "Install with: winget install GitHub.CopilotCLI"
        throw "Copilot CLI not installed"
    }
    
    if (-not (Test-Path $PROMPT_FILE)) { throw "Prompt file not found: $PROMPT_FILE" }
    if (-not (Test-Path $PRD_FILE)) { throw "PRD file not found: $PRD_FILE" }
    if (-not (Test-Path $PROGRESS_FILE)) { throw "Progress file not found: $PROGRESS_FILE" }
    
    return $true
}

# Get current PRD status
function Get-PrdStatus {
    $prd = Get-Content $PRD_FILE -Raw | ConvertFrom-Json
    # Wrap in @() so .Count also works when zero or one story matches
    $complete = @($prd.userStories | Where-Object { $_.passes }).Count
    $total = @($prd.userStories).Count
    $next = $prd.userStories | Where-Object { -not $_.passes } | Sort-Object priority | Select-Object -First 1
    
    return @{
        Complete = $complete
        Total = $total
        NextTask = $next
        IsComplete = ($complete -eq $total)
    }
}

# Main loop
function Start-RalphLoop {
    Write-Host ""
    Write-Host "+---------------------------------------------------------------+" -ForegroundColor Blue
    Write-Host "|                    RALPH (Copilot CLI Loop)                   |" -ForegroundColor Blue
    Write-Host "+---------------------------------------------------------------+" -ForegroundColor Blue
    Write-Host "|  Model: $($Model.PadRight(52))|" -ForegroundColor Blue
    Write-Host "|  Iterations: $($Iterations.ToString().PadRight(47))|" -ForegroundColor Blue
    Write-Host "|  Profile: $($AllowProfile.PadRight(50))|" -ForegroundColor Blue
    Write-Host "+---------------------------------------------------------------+" -ForegroundColor Blue
    Write-Host ""
    
    # Discard the $true return value so it is not echoed to the console
    $null = Test-Prerequisites
    
    $status = Get-PrdStatus
    if ($status.IsComplete) {
        Write-Success "PRD is already complete! Nothing to do."
        return
    }
    
    Write-Info "Progress: $($status.Complete)/$($status.Total) user stories complete"
    Write-Info "Next task: $($status.NextTask.id) - $($status.NextTask.title)"
    Write-Host ""
    
    $toolArgs = Get-CopilotToolArgs
    
    for ($i = 1; $i -le $Iterations; $i++) {
        Write-Host ""
        Write-Host "----------------------------------------------------------------" -ForegroundColor Cyan
        Write-Host " Iteration $i/$Iterations  $(Get-Date -Format 'HH:mm:ss')" -ForegroundColor Cyan
        Write-Host "----------------------------------------------------------------" -ForegroundColor Cyan
        
        # Build combined context file
        $contextFile = [System.IO.Path]::Combine($ProjectRoot, ".ralph-context.$i.tmp")
        
        $contextContent = @"
# Context

## PRD (scripts/ralph/prd.json)
$(Get-Content $PRD_FILE -Raw)

## progress.txt
$(Get-Content $PROGRESS_FILE -Raw)

# Prompt

$(Get-Content $PROMPT_FILE -Raw)
"@
        Set-Content -Path $contextFile -Value $contextContent -Encoding UTF8
        
        if ($DryRun) {
            Write-Warn "[DRY RUN] Would execute:"
            Write-Host "  copilot --add-dir `"$ProjectRoot`" --model $Model --no-color --stream off --silent -p `"@$contextFile Follow the attached prompt.`" $($toolArgs -join ' ')"
            Remove-Item $contextFile -Force -ErrorAction SilentlyContinue
            continue
        }
        
        try {
            $result = & copilot --add-dir $ProjectRoot --model $Model `
                --no-color --stream off --silent `
                -p "@$contextFile Follow the attached prompt." `
                @toolArgs 2>&1
            
            $exitCode = $LASTEXITCODE
            $resultText = $result -join "`n"
            
            Write-Host $resultText
            
            if ($resultText -match "<promise>COMPLETE</promise>") {
                Write-Host ""
                Write-Success "PRD COMPLETE after $i iterations!"
                Remove-Item $contextFile -Force -ErrorAction SilentlyContinue
                return
            }
            
            if ($exitCode -ne 0) {
                Write-Warn "Copilot exited with status $exitCode; continuing to next iteration."
            }
        }
        catch {
            Write-Warn "Error in iteration ${i}: $_"
        }
        finally {
            Remove-Item $contextFile -Force -ErrorAction SilentlyContinue
        }
    }
    
    Write-Host ""
    Write-Warn "Finished $Iterations iterations without completion signal."
    Write-Info "Run again to continue from where you left off."
}

# Entry point
try {
    Start-RalphLoop
}
catch {
    Write-Err "Fatal error: $_"
    exit 1
}

Running Ralph

Basic Run

.\ralph.ps1

Custom PRD

.\ralph.ps1 -Prd scripts/ralph/my-feature.json

Dry Run (preview without executing)

.\ralph.ps1 -DryRun

Fewer Iterations

.\ralph.ps1 -Iterations 5

Different Model

.\ralph.ps1 -Model "gpt-5.1"

Tool Permission Profiles

| Profile | Description                      |
|---------|----------------------------------|
| dev     | Full access (default)            |
| safe    | Write + common dev commands only |
| locked  | Write only, no shell access      |

.\ralph.ps1 -AllowProfile safe

Every profile passes --deny-tool for shell(rm) and shell(git push), regardless of what else it allows.
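
For readers porting the loop to another language, the profile-to-flags mapping can be sketched in Python as follows. The flag names and tool patterns are copied from the PowerShell script above; `tool_args` is a hypothetical function name.

```python
from typing import List

# Denied in every profile, mirroring the script
ALWAYS_DENY = ["shell(rm)", "shell(git push)"]

# Tools the "safe" profile allows, mirroring the script
SAFE_ALLOW = [
    "write",
    "shell(pip:*)", "shell(python:*)", "shell(pytest:*)",
    "shell(npm:*)", "shell(node:*)", "shell(git:*)",
]


def tool_args(profile: str = "dev") -> List[str]:
    """Translate a profile name into the CLI flag list used by the loop script."""
    args: List[str] = []
    for tool in ALWAYS_DENY:
        args += ["--deny-tool", tool]
    if profile == "dev":
        args.append("--allow-all-tools")
    elif profile == "safe":
        for tool in SAFE_ALLOW:
            args += ["--allow-tool", tool]
    elif profile == "locked":
        args += ["--allow-tool", "write"]
    else:
        raise ValueError(f"unknown profile: {profile}")
    return args


print(tool_args("locked"))
```

The deny flags are built first and unconditionally, so no profile can drop them by accident.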

Real Results: Rekoma-Context

We used Ralph to build Rekoma-Context, a context optimization middleware for AI agents. Here's what the experience looked like:

The Run

+---------------------------------------------------------------+
|                    RALPH (Copilot CLI Loop)                   |
+---------------------------------------------------------------+
|  Model: gpt-5.2                                               |
|  Iterations: 25                                               |
|  Profile: dev                                                 |
+---------------------------------------------------------------+

Progress: 0/19 user stories complete
Next task: US-001 - Initialize Python project structure

----------------------------------------------------------------
 Iteration 1/25  14:32:17
----------------------------------------------------------------
[creates project scaffold, verifies pip install -e .]
[commits: "feat(US-001): Initialize Python project structure"]

----------------------------------------------------------------
 Iteration 2/25  14:35:42
----------------------------------------------------------------
[implements core types with framework adapters]
[adds tests with 12 test cases]
[commits: "feat(US-002): Implement Common Types"]

----------------------------------------------------------------
 Iteration 3/25  14:39:18
----------------------------------------------------------------
[implements configuration system]
[commits: "feat(US-003): Implement Configuration System"]

...

----------------------------------------------------------------
 Iteration 19/25  16:48:03
----------------------------------------------------------------
[adds end-to-end integration tests]
[verifies >90% overall coverage]
[commits: "feat(US-019): Run Integration Tests"]

OK: PRD COMPLETE after 19 iterations!

Final Stats

| Metric                 | Value                |
|------------------------|----------------------|
| User stories completed | 19/19                |
| Total runtime          | ~2.5 hours           |
| Tests written          | 109                  |
| Test coverage          | >85% per module      |
| Type coverage          | 100% (mypy strict)   |
| Manual intervention    | Zero                 |
| Git commits            | 19 (one per feature) |

What Made This Work

  1. Detailed acceptance criteria - Each story had 5-8 specific checkboxes, not vague descriptions

  2. Verification commands in criteria - "Run pytest tests/ successfully" gave Ralph a way to self-check

  3. Dependency ordering - Foundation first, features that depend on it later

  4. progress.txt patterns - Ralph learned codebase conventions and applied them consistently

  5. Single-feature focus - One story per iteration prevented scope creep and kept commits atomic

Tips for Writing Good PRDs

  1. Be specific in acceptance criteria. "Add tests" is vague. "Create tests/test_config.py with >85% coverage" is actionable.

  2. Include verification commands. "Run pytest tests/ successfully" tells Ralph how to check its work.

  3. One feature per story. Keep stories small and focused.

  4. Order by dependency. Put foundational work (types, config) before features that depend on them.

  5. Include the "done" signal. The <promise>COMPLETE</promise> pattern lets Ralph know when to stop.
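
Some of these tips can be checked mechanically before starting a run. The following is a rough Python sketch of a PRD linter, not part of Ralph. The function name `lint_prd` and the heuristic that a verification command is a criterion starting with "Run " are assumptions for illustration.

```python
from typing import List


def lint_prd(prd: dict) -> List[str]:
    """Return a list of human-readable problems found in the PRD (empty if none)."""
    problems: List[str] = []
    seen_ids = set()
    for story in prd.get("userStories", []):
        sid = story.get("id", "<missing id>")
        if sid in seen_ids:
            problems.append(f"{sid}: duplicate id")
        seen_ids.add(sid)

        criteria = story.get("acceptanceCriteria", [])
        if not criteria:
            problems.append(f"{sid}: no acceptance criteria")
        elif not any(c.lower().startswith("run ") for c in criteria):
            # Heuristic only: treats "Run ..." criteria as verification commands
            problems.append(f"{sid}: no criterion starts with 'Run '")
    return problems


sample = {
    "userStories": [
        {"id": "US-001", "acceptanceCriteria": ["Create src/", "Run pytest tests/ successfully"]},
        {"id": "US-002", "acceptanceCriteria": ["Add tests"]},
    ]
}
for problem in lint_prd(sample):
    print(problem)
```

In this sample, US-002 is flagged because "Add tests" gives the agent no command to verify its own work.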

Common Issues

"Copilot CLI not found"

winget install GitHub.CopilotCLI

Ralph keeps working on the same story

Ensure the prompt instructs Ralph to set passes: true after completion.
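
One way to confirm this symptom is to compare the PRD before and after an iteration. A small Python sketch, with hypothetical helper names, assuming the PRD shape from the template:

```python
def pending_ids(prd: dict) -> set:
    """IDs of all stories that are not yet marked as passing."""
    return {s["id"] for s in prd["userStories"] if not s["passes"]}


def made_progress(before: dict, after: dict) -> bool:
    """True if the iteration reduced the number of incomplete stories."""
    return len(pending_ids(after)) < len(pending_ids(before))


before = {"userStories": [{"id": "US-001", "passes": False}, {"id": "US-002", "passes": False}]}
stuck = {"userStories": [{"id": "US-001", "passes": False}, {"id": "US-002", "passes": False}]}
moved = {"userStories": [{"id": "US-001", "passes": True}, {"id": "US-002", "passes": False}]}

print(made_progress(before, stuck))
print(made_progress(before, moved))
```

If several iterations in a row report no progress, the agent is probably not writing passes: true back to the PRD file.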

Tests fail but Ralph continues

Add explicit acceptance criteria: "All tests pass via pytest tests/"
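
If you would rather not rely on the agent's own report, the same checks can be run outside the loop. A minimal Python sketch of such a gate (the name `all_checks_pass` is an assumption; the commented-out commands mirror the mypy and pytest invocations from the PRD template):

```python
import subprocess
import sys
from typing import List, Sequence


def all_checks_pass(commands: Sequence[List[str]]) -> bool:
    """Run each command in order; return False as soon as one exits non-zero."""
    for cmd in commands:
        if subprocess.run(cmd).returncode != 0:
            return False
    return True


# For the Python project in this post, the checks would look like:
# CHECKS = [
#     [sys.executable, "-m", "mypy", "src/"],
#     [sys.executable, "-m", "pytest", "tests/"],
# ]

# Self-contained demo using trivial commands instead of real test suites
passing = [[sys.executable, "-c", "pass"]]
failing = [[sys.executable, "-c", "raise SystemExit(1)"]]
print(all_checks_pass(passing))
print(all_checks_pass(failing))
```

Running a gate like this after each iteration gives an independent signal that does not depend on the agent reading its own test output correctly.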

Too many iterations

Start with -Iterations 5 and increase as needed.

Conclusion

Ralph transforms AI coding assistants from reactive tools into autonomous agents. The key ingredients:

  1. Structured requirements (PRD with acceptance criteria)
  2. Progress persistence (progress.txt survives context resets)
  3. Verification gates (tests + types before marking complete)
  4. Single-feature focus (prevents scope creep)
  5. Automatic commits (preserves working states)

The result: you define what to build, Ralph figures out how.

Next Steps

Ready to try Ralph yourself?

  1. Check out the original soderlind/ralph repo for the pattern
  2. Copy the files above into your project
  3. Write your PRD with clear acceptance criteria
  4. Run .\ralph.ps1 -DryRun to preview, then remove -DryRun to execute

Happy automating!


Rekoma-Context was built using the Ralph autonomous agent pattern.