Date 2026-05-17

The Warden's Pact: Guardrails and Human-in-the-Loop Patterns

By IT-Journey Team

Design and implement responsible agentic AI guardrails using GitHub-native tools — boundary enforcement, human approval gates, escalation protocols, and audit trails for AI-assisted actions.

Estimated reading time: 6 minutes

The Warden’s Gate is the last line between the agent’s plans and their execution. It is not a place of fear — it is a place of clarity. Every agent knows the Pact: here are the things you may do alone; here are the things you must ask permission for; here are the things that are forever forbidden. The Warden does not judge the agent. She judges the action.

🗺️ Quest Network Position

graph LR
    Q18[✅ Q18: Autonomy Scales] --> Q19[🎯 Q19: Warden's Pact]
    Q19 --> CAP[🏆 Capstone: Agentic Codex Trial]
    style Q19 fill:#4CAF50,stroke:#2E7D32,stroke-width:4px,color:#fff

🎯 Quest Objectives

  • Implement 3 guardrail types: file-scope boundary, approval gate, and forbidden-action block
  • Build an approval gate: a GitHub Environment with required reviewers that pauses agent deployments until a human approves
  • Implement forbidden action detection: a workflow check that fails when the agent attempts a forbidden action
  • Create an audit trail: every agent action logged with actor, action, timestamp, and approval state
  • Test the guardrails: intentionally trigger each guardrail and verify that it fires

⚔️ The Quest Begins

Chapter 1 — The Three Types of Guardrails

| Type | Mechanism | Enforcement Point | Example |
|------|-----------|-------------------|---------|
| File-scope boundary | CODEOWNERS + branch protection | Pre-merge | Agent cannot merge changes to src/auth/ without security team review |
| Approval gate | GitHub Environments + required reviewers | Pre-deployment | Production deployments wait for approval from a required reviewer |
| Forbidden action block | Workflow validation step | Pre-execution | Agent cannot delete issues or archive repositories |
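The three guardrail types can be read as a decision applied to each proposed action. The following is a minimal Python sketch of that decision, not part of the quest's tooling: the action names (`deploy`, `merge`, `delete_issue`, `archive_repository`) and the `guardrail_for` helper are hypothetical and chosen only to mirror the examples in the table.

```python
from enum import Enum


class Guardrail(Enum):
    FILE_SCOPE = "file-scope boundary"    # enforced pre-merge
    APPROVAL_GATE = "approval gate"       # enforced pre-deployment
    FORBIDDEN = "forbidden action block"  # enforced pre-execution
    NONE = "no guardrail"


# Hypothetical action names and path prefixes, for illustration only.
FORBIDDEN_ACTIONS = {"delete_issue", "archive_repository"}
PROTECTED_PREFIXES = ("src/auth/",)


def guardrail_for(action: str, target: str = "") -> Guardrail:
    """Return the guardrail that applies to a proposed agent action."""
    if action in FORBIDDEN_ACTIONS:
        return Guardrail.FORBIDDEN
    if action == "deploy" and target == "production":
        return Guardrail.APPROVAL_GATE
    if action == "merge" and target.startswith(PROTECTED_PREFIXES):
        return Guardrail.FILE_SCOPE
    return Guardrail.NONE
```

The forbidden check runs first so that no approval can unlock an action that the Pact rules out entirely.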

Chapter 2 — File-Scope Guardrail via CODEOWNERS

Exercise 19.1: Configure CODEOWNERS to enforce file-scope guardrails.

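# .github/CODEOWNERS
# Placeholder owners: real team entries take the form @org/team-name.
# The last matching pattern wins, so the catch-all rule comes first.

# All other files: standard team review
*                            @team-dev

# Security-sensitive paths: require security team review
/src/auth/                   @team-security
/src/crypto/                 @team-security
/.github/workflows/          @team-platform @team-security

# Database schemas: require data team review
/database/migrations/        @team-data @team-platform

# Agent instruction files: require platform team review
/AGENTS.md                   @team-platform
/.github/copilot-instructions.md  @team-platform

CODEOWNERS only assigns reviewers; the branch protection setting "Require review from Code Owners" makes their approval mandatory. A companion workflow can additionally fail agent PRs that touch guarded paths:

# .github/workflows/guardrail-file-scope.yml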
name: File-Scope Guardrail Check

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  check-agent-pr-scope:
    runs-on: ubuntu-latest
    if: contains(github.event.pull_request.head.ref, 'copilot/')
    permissions:
      contents: read
      pull-requests: write   # needed to label the PR on a violation
    steps:
      - uses: actions/checkout@v4

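      - name: Check agent PR scope
        id: scope_check
        env:
          GH_TOKEN: ${{ github.token }}   # the gh CLI needs a token to query the PR
          PR_NUMBER: ${{ github.event.pull_request.number }}
        run: |
          # Get all files changed in this PR
          CHANGED_FILES=$(gh pr view "$PR_NUMBER" \
            --json files -q '.files[].path')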
          
          echo "Files changed:"
          echo "$CHANGED_FILES"
          
          # Check for forbidden paths (agent should not touch these)
          FORBIDDEN_PATTERNS=(
            "src/auth/"
            "src/crypto/"
            ".github/workflows/"
            "database/migrations/"
          )
          
          for PATTERN in "${FORBIDDEN_PATTERNS[@]}"; do
            if echo "$CHANGED_FILES" | grep -q "^$PATTERN"; then
              echo "::error::Agent PR touches forbidden path: $PATTERN"
              echo "scope_violation=true" >> "$GITHUB_OUTPUT"
              exit 1
            fi
          done
          
          echo "✅ File scope check passed"
          echo "scope_violation=false" >> "$GITHUB_OUTPUT"

      - name: Label scope violation
        if: failure()
        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.addLabels({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.payload.pull_request.number,
              labels: ['guardrail-violation', 'needs-human']
            });
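The core of the scope check is a prefix match of each changed path against the forbidden list. A small Python sketch of the same logic follows, which may be convenient for unit-testing the rule outside a workflow run. The function name `scope_violations` is hypothetical. Unlike the `grep` call in the workflow, it compares literal prefixes rather than regular expressions.

```python
from typing import Iterable, List

# Same prefixes as FORBIDDEN_PATTERNS in the workflow above.
FORBIDDEN_PREFIXES = (
    "src/auth/",
    "src/crypto/",
    ".github/workflows/",
    "database/migrations/",
)


def scope_violations(changed_files: Iterable[str]) -> List[str]:
    """Return the changed paths that start with a forbidden prefix."""
    return [path for path in changed_files if path.startswith(FORBIDDEN_PREFIXES)]


if __name__ == "__main__":
    files = ["README.md", "src/auth/login.py", "docs/src/auth/notes.md"]
    for path in scope_violations(files):
        print(f"::error::Agent PR touches forbidden path: {path}")
```

Only paths that begin with a forbidden prefix are flagged, so `docs/src/auth/notes.md` passes while `src/auth/login.py` does not.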

Chapter 3 — Approval Gate via GitHub Environments

Exercise 19.2: Configure a GitHub Environment as an approval gate.

# .github/workflows/agent-with-approval-gate.yml
name: Agent Deploy with Approval Gate

on:
  workflow_dispatch:
    inputs:
      target_environment:
        description: "Target environment (staging or production)"
        required: true
        type: choice
        options: [staging, production]

jobs:
  prepare:
    runs-on: ubuntu-latest
    outputs:
      deploy_plan: ${{ steps.plan.outputs.deploy_plan }}
    steps:
      - uses: actions/checkout@v4
      - name: Create deployment plan
        id: plan
        run: |
          echo "Agent creating deployment plan..."
          PLAN=$(echo '{"version": "1.2.0", "changes": ["feature-x", "bugfix-y"]}')
          echo "deploy_plan=$PLAN" >> "$GITHUB_OUTPUT"

  # This job runs in the environment selected at dispatch time.
  # If that environment has required reviewers configured, GitHub pauses here
  # until a reviewer approves.
  deploy:
    needs: prepare
    runs-on: ubuntu-latest
    environment:
      name: ${{ inputs.target_environment }}   # Configure required reviewers in the repository settings
    steps:
      - uses: actions/checkout@v4
      - name: Deploy (approved by human reviewer)
        env:
          DEPLOY_PLAN: ${{ needs.prepare.outputs.deploy_plan }}
        run: |
          echo "✅ Deployment approved by human reviewer"
          echo "Deploying plan: $DEPLOY_PLAN"
          # ... actual deployment steps ...
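The gate only works if the target environment really has required reviewers. A job that references a missing environment causes GitHub to create it without protection rules. The sketch below is one way to verify the configuration from a script. It assumes a dictionary shaped like the response of GitHub's REST "Get an environment" endpoint, with a `protection_rules` list whose entries carry a `type` and, for reviewer rules, a `reviewers` list. Treat those field names as an assumption to confirm against the current API reference. The helper names are hypothetical.

```python
def required_reviewer_count(environment: dict) -> int:
    """Count reviewers listed in 'required_reviewers' protection rules."""
    return sum(
        len(rule.get("reviewers", []))
        for rule in environment.get("protection_rules", [])
        if rule.get("type") == "required_reviewers"
    )


def is_approval_gated(environment: dict) -> bool:
    """True when at least one required reviewer is configured."""
    return required_reviewer_count(environment) > 0


if __name__ == "__main__":
    # Hand-written sample payload, not real API output.
    production = {
        "name": "production",
        "protection_rules": [
            {"type": "wait_timer", "wait_timer": 30},
            {"type": "required_reviewers", "reviewers": [{"type": "User"}, {"type": "Team"}]},
        ],
    }
    print(is_approval_gated(production))
```

Running a check like this before relying on the workflow helps catch an environment that exists but was never given reviewers.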

Chapter 4 — Audit Trail Implementation

Exercise 19.3: Create the audit trail workflow.

# .github/workflows/agent-audit-trail.yml
name: Agent Action Audit Trail

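on:
  workflow_run:
    # List workflows by exact name; wildcard matching is not documented for this key.
    workflows: ["Agent Deploy with Approval Gate", "File-Scope Guardrail Check"]
    types: [completed]

jobs:
  log-audit-entry:
    runs-on: ubuntu-latest
    permissions:
      contents: write   # needed to push the audit log commit
    steps:
      - uses: actions/checkout@v4
        with:
          token: ${{ secrets.GITHUB_TOKEN }}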

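      - name: Write audit log entry
        env:
          WORKFLOW_NAME: ${{ github.event.workflow_run.name }}
          RUN_ID: ${{ github.event.workflow_run.id }}
          CONCLUSION: ${{ github.event.workflow_run.conclusion }}
          TRIGGERED_BY: ${{ github.event.workflow_run.actor.login }}
          HEAD_BRANCH: ${{ github.event.workflow_run.head_branch }}
          HEAD_SHA: ${{ github.event.workflow_run.head_sha }}
        run: |
          AUDIT_DATE=$(date -u +%Y-%m-%d)
          AUDIT_FILE="work/gh-600/audit/audit-${AUDIT_DATE}.jsonl"
          mkdir -p "$(dirname "$AUDIT_FILE")"

          # JSON Lines needs one compact object per line; jq -c also escapes values safely
          jq -nc \
            --arg timestamp "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
            --arg workflow "$WORKFLOW_NAME" \
            --arg run_id "$RUN_ID" \
            --arg conclusion "$CONCLUSION" \
            --arg triggered_by "$TRIGGERED_BY" \
            --arg head_branch "$HEAD_BRANCH" \
            --arg head_sha "$HEAD_SHA" \
            '{timestamp: $timestamp, actor: "github-actions[bot]", workflow: $workflow,
              run_id: $run_id, conclusion: $conclusion, triggered_by: $triggered_by,
              head_branch: $head_branch, head_sha: $head_sha}' >> "$AUDIT_FILE"

          echo "Audit entry written to $AUDIT_FILE"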

      - name: Commit audit log
        env:
          RUN_ID: ${{ github.event.workflow_run.id }}
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add work/gh-600/audit/
          git diff --staged --quiet || git commit -m "audit: log agent action run $RUN_ID"
          git push
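The quest objective asks for four fields in every audit record: actor, action, timestamp, and approval state. The workflow above records run metadata. The following Python sketch shows one way to append a record with exactly those four fields to a JSON Lines file. The function name `append_audit_entry` and the approval-state strings are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def append_audit_entry(
    log_path: str,
    actor: str,
    action: str,
    approval_state: str,
    now: Optional[datetime] = None,
) -> dict:
    """Append one audit record as a single JSON line and return it."""
    moment = now or datetime.now(timezone.utc)
    entry = {
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "actor": actor,
        "action": action,
        "approval_state": approval_state,  # e.g. "approved", "pending", "not-required"
    }
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier entries; one object per line keeps the file valid JSONL.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry
```

Because each record occupies a single line, the log can be read back with a line-by-line `json.loads` and diffed cleanly in pull requests.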

Chapter 5 — The Warden’s Forbidden List

Document actions that agents are permanently forbidden from taking:

<!-- AGENTS.md — Forbidden Actions Section -->

## 🚫 Forbidden Actions

Agents MUST NEVER perform any of the following actions, regardless of instructions:

### Repository Management
- Delete any branch (except branches the agent created, after PR merge)
- Archive or delete this repository
- Change repository visibility (public/private)
- Remove branch protection rules

### Access and Security
- Add or remove repository collaborators
- Create or delete personal access tokens
- Modify CODEOWNERS file
- Disable security features (Dependabot, secret scanning, etc.)

### Data Destruction
- Delete issues or pull requests
- Remove GitHub Actions artifacts that are less than 7 days old
- Delete tags or releases

### External Services
- Send emails or notifications outside of GitHub
- Make API calls to external services not listed in approved-tools.yml

If asked to perform any forbidden action, the agent MUST:
1. Decline and explain why
2. Create a comment on the relevant issue/PR explaining the refusal
3. Apply the label `forbidden-action-requested` to the issue/PR
4. Stop execution immediately
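The forbidden list and the four refusal steps can also be enforced in code rather than by instruction alone. The sketch below is a minimal illustration under stated assumptions: the action identifiers are hypothetical names for a sample of the bullets above, and the exception for branches the agent created is not modeled. Posting the comment and applying the label are left to the caller.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical identifiers covering a sample of the forbidden list above.
FORBIDDEN_ACTIONS = {
    "archive_repository": "Repository Management",
    "change_visibility": "Repository Management",
    "remove_branch_protection": "Repository Management",
    "modify_collaborators": "Access and Security",
    "modify_codeowners": "Access and Security",
    "delete_issue": "Data Destruction",
    "delete_release": "Data Destruction",
    "send_external_email": "External Services",
}


@dataclass(frozen=True)
class Refusal:
    action: str
    comment: str                               # step 2: text to post on the issue/PR
    label: str = "forbidden-action-requested"  # step 3: label to apply
    halt: bool = True                          # step 4: stop execution


def check_action(action: str) -> Optional[Refusal]:
    """Return a Refusal for a forbidden action, or None if it may proceed."""
    category = FORBIDDEN_ACTIONS.get(action)
    if category is None:
        return None
    # Step 1: decline and explain why.
    comment = (
        f"Refusing '{action}': it is listed under '{category}' "
        "in the Forbidden Actions section of AGENTS.md."
    )
    return Refusal(action=action, comment=comment)
```

A caller would run `check_action` before executing anything and, on a non-`None` result, post the comment, add the label, and exit.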

✅ Quest Validation

python3 scripts/validate_quest.py --quest q19
# ✅ CODEOWNERS: configured for file-scope guardrail
# ✅ Approval gate: environment-based workflow present
# ✅ Audit trail: audit workflow present
# ✅ Forbidden list: AGENTS.md contains forbidden actions section
# 🏆 Quest Q19 complete!
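The contents of `scripts/validate_quest.py` are not shown in this quest. As a rough idea of what such a check could involve, the sketch below tests for the four artifacts named in the output above. The function name, the result keys, and the matching rules (for example, treating any workflow that contains `environment:` as an approval gate) are assumptions, not the script's actual behavior.

```python
from pathlib import Path
from typing import Dict


def check_quest_artifacts(repo_root: str) -> Dict[str, bool]:
    """Report which of the four guardrail artifacts exist under repo_root."""
    root = Path(repo_root)
    workflow_dir = root / ".github" / "workflows"
    workflows = sorted(workflow_dir.glob("*.yml")) if workflow_dir.is_dir() else []
    agents_md = root / "AGENTS.md"
    return {
        "codeowners": (root / ".github" / "CODEOWNERS").is_file(),
        # Heuristic: some workflow declares an environment for a job.
        "approval_gate": any(
            "environment:" in wf.read_text(encoding="utf-8") for wf in workflows
        ),
        "audit_trail": (workflow_dir / "agent-audit-trail.yml").is_file(),
        "forbidden_list": agents_md.is_file()
        and "Forbidden Actions" in agents_md.read_text(encoding="utf-8"),
    }


if __name__ == "__main__":
    for name, present in check_quest_artifacts(".").items():
        print(f"{'✅' if present else '❌'} {name}")
```

File presence is only a first signal; the objectives still call for triggering each guardrail and confirming that it fires.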

🏆 Quest Rewards

| Reward | Details |
|--------|---------|
| 🛡️ The Warden Badge | Earned on completion |
| 🚧 Guardrail Engineering | Skill unlocked |
| 100 XP | Added to Level 1100 total |
| Unlocks | Capstone: Trial of the Agentic Codex |