Trial of the Agentic Codex: The Grand Capstone
By IT-Journey Team
The ultimate challenge — a six-domain scenario that tests every GH-600 skill. Deploy a complete agentic AI system on GitHub, evaluate it, defend your architectural decisions, and earn the title of Agentic Codex Master.
The Hall of Mastery stands silent. The six seals of the Codex glow on the walls — one for each domain, one for each truth the candidate has learned on the journey from initiate to master. The Codex Master who placed the seals speaks: “Show me that you did not only read the Codex. Show me that you understand why it was written.”
🗺️ The Arc Complete
graph TD
D1["Domain 1: Agentic SDLC"]
D2["Domain 2: Tools & Permissions"]
D3["Domain 3: Memory & Context"]
D4["Domain 4: Evaluation"]
D5["Domain 5: Multi-Agent"]
D6["Domain 6: Governance"]
CAP["🏆 CAPSTONE: Agentic Codex Master"]
D1 --> D2 --> D3 --> D4 --> D5 --> D6 --> CAP
style CAP fill:#FFD700,stroke:#B8860B,stroke-width:4px,color:#000
style D1 fill:#4CAF50,stroke:#2E7D32,color:#fff
style D2 fill:#4CAF50,stroke:#2E7D32,color:#fff
style D3 fill:#4CAF50,stroke:#2E7D32,color:#fff
style D4 fill:#4CAF50,stroke:#2E7D32,color:#fff
style D5 fill:#4CAF50,stroke:#2E7D32,color:#fff
style D6 fill:#4CAF50,stroke:#2E7D32,color:#fff
🎯 Capstone Objectives
- Seal 1 (Domain 1 — 18%): Implement agent-in-SDLC and define boundaries
- Seal 2 (Domain 2 — 18%): Configure tools, permissions, MCP, environment integration
- Seal 3 (Domain 3 — 19%): Implement memory strategy and context continuity
- Seal 4 (Domain 4 — 19%): Evaluate agent performance and iterate on instructions
- Seal 5 (Domain 5 — 17%): Build and manage a multi-agent system
- Seal 6 (Domain 6 — 9%): Implement responsible autonomy, guardrails, and HITL
🏛️ The Grand Trial Scenario
The Scribe presents the scenario:
You are the lead AI engineer on a software team that has decided to adopt agentic AI development using GitHub. You have been given an empty repository, a GitHub account with Copilot, Actions, and Environments, and a mandate: build a working agentic SDLC in the next 6 hours and demonstrate competency in every domain.
The following six seals map directly to the GH-600 exam domains.
⚔️ Seal 1: The Agentic SDLC (Domain 1 — 18%)
Related quests: Q1 (SDLC Integration), Q2 (Plan vs Action), Q3 (Observability)
Challenge 1.1: Describe where agents live in your SDLC
Task: Document where in your software development lifecycle agents operate. Produce a diagram.
<!-- work/gh-600/capstone/sdlc-diagram.md -->
# Our Agentic SDLC
## Agent Integration Points
| Phase | Agent Role | Trigger | Human Touchpoint |
|---|---|---|---|
| Planning | Requirements analysis | Issue created | Human approves specification |
| Implementation | Code writing | `agent-implement` label | Human reviews PR |
| Review | Code review comments | PR opened | Human accepts/rejects suggestions |
| Testing | Test execution | PR opened | Human reviews failures |
| Deployment | Deploy staging | PR merged to main | Human approves production |
## Architecture Diagram
[Include Mermaid diagram here]
Challenge 1.2: Demonstrate planning vs. action separation
Task: Show a GitHub Actions workflow that separates the plan step from the execute step with a mandatory break between them.
# .github/workflows/plan-then-execute.yml
name: Plan-Then-Execute (Sealed Capstone)
on:
  issues:
    types: [labeled]
# The plan job posts an issue comment, so the token needs issues: write
permissions:
  issues: write
jobs:
  plan:
    if: contains(github.event.label.name, 'agent-implement')
    runs-on: ubuntu-latest
    outputs:
      plan_approved: ${{ steps.plan.outputs.approved }}
    steps:
      - name: Generate plan
        id: plan
        run: |
          echo "Generating implementation plan..."
          # Agent generates the plan here; nothing is executed in this job
          echo "approved=pending" >> "$GITHUB_OUTPUT"
      - name: Post plan for approval
        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.payload.issue.number,
              body: '**Agent Plan Generated** ✅\n\nTo run it, a reviewer must approve the pending `agent-approval` deployment on the workflow run; reject it to stop.\n\n_No changes have been made yet._'
            });
  execute:
    needs: plan
    runs-on: ubuntu-latest
    # Pauses for human approval in the GitHub UI, provided the
    # agent-approval environment has required reviewers configured
    environment: agent-approval
    steps:
      - name: Execute approved plan
        run: echo "Executing plan after human approval..."
Challenge 1.3: Configure an observability workflow
Task: Ensure every agent run emits structured logs that can be queried.
Refer to Q3 (Observability & Control) for the full pattern.
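One way to make agent runs queryable is to print one JSON object per line, so the run log can be filtered with standard tools. The sketch below is a minimal illustration and not the Q3 pattern itself: the `log_event` helper and the field names (`run_id`, `agent`, `event`) are assumptions. `GITHUB_RUN_ID` is the run identifier that GitHub Actions sets in the job environment.

```python
import json
import os
from datetime import datetime, timezone


def log_event(agent: str, event: str, **fields) -> str:
    """Build one JSON log line, print it, and return it."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Falls back to "local" when the script runs outside GitHub Actions.
        "run_id": os.environ.get("GITHUB_RUN_ID", "local"),
        "agent": agent,      # hypothetical field name
        "event": event,      # hypothetical field name
        **fields,            # any extra key/value pairs the caller supplies
    }
    line = json.dumps(entry, sort_keys=True)
    print(line)
    return line


if __name__ == "__main__":
    log_event("implementer", "plan_generated", status="pending")
```

Because every line is a standalone JSON object, a downloaded log can be filtered with a tool such as `jq`, for example by selecting entries whose `event` equals `plan_generated`.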
⚔️ Seal 2: Tools, Permissions, and Environment (Domain 2 — 18%)
Related quests: Q4 (Tool Selection), Q5 (MCP), Q6 (Dev Env), Q7 (Safe Execution)
Challenge 2.1: Configure a scoped GitHub token
Task: Set the minimum permissions needed for an implementation agent.
# In your workflow:
permissions:
  contents: write       # Write files to repository
  pull-requests: write  # Create and update PRs
  issues: write         # Comment on issues
# Once a permissions block is present, every scope not listed is set to none, e.g.:
#   actions: none
#   security-events: none
Challenge 2.2: Configure an MCP server
Task: Add a GitHub MCP server to Copilot and demonstrate its use.
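// .vscode/mcp.json (workspace-level MCP configuration)
{
  "inputs": [
    {
      "type": "promptString",
      "id": "github-token",
      "description": "GitHub personal access token",
      "password": true
    }
  ],
  "servers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${input:github-token}"
      }
    }
  }
}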
Challenge 2.3: Document error handling and escalation
Task: Add a failure escalation workflow that creates an issue when an agent fails.
Refer to Q7 (Safe Execution & Error Handling) for the full pattern.
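As a rough illustration of the escalation step, the sketch below builds the content of the issue an escalation workflow might open. It is not the Q7 pattern: the `build_escalation_issue` helper, its parameters, and the label names are all hypothetical, and the sketch deliberately stops short of calling any API.

```python
def build_escalation_issue(workflow: str, run_url: str, error: str) -> dict:
    """Return the title, body, and labels for a failure-escalation issue."""
    # Keep only the tail of long error output so the issue stays readable.
    excerpt = error.strip()[-500:]
    body = "\n".join([
        f"Agent workflow `{workflow}` failed and needs human attention.",
        "",
        f"Run: {run_url}",
        "",
        "Error excerpt:",
        "",
        excerpt,
    ])
    return {
        "title": f"Agent failure: {workflow}",
        "body": body,
        # Label names are hypothetical; use whatever your repository defines.
        "labels": ["agent-failure", "needs-human"],
    }


if __name__ == "__main__":
    issue = build_escalation_issue(
        "plan-then-execute",
        "https://example.invalid/runs/1",  # placeholder URL
        "Traceback: step 'Execute approved plan' exited with code 1",
    )
    print(issue["title"])
```

A workflow step guarded by `if: failure()` could pass the resulting title and body to `gh issue create` or to the same `github.rest.issues` client used elsewhere in this capstone.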
⚔️ Seal 3: Memory and Context (Domain 3 — 19%)
Related quests: Q8 (Memory Strategies), Q9 (State Persistence), Q10 (Cross-tool Continuity)
Challenge 3.1: Implement all three memory tiers
Task: Create artifacts demonstrating ephemeral, session, and persistent memory.
## Memory Tier Implementation Checklist
- [ ] **Ephemeral**: Variables in workflow `env:` block used within a job
- [ ] **Session**: Artifact uploaded in job A, downloaded in job B of the same workflow run
- [ ] **Persistent**: Repository file updated by agent, committed and pushed
Challenge 3.2: Implement drift detection
Task: Produce a drift check that compares current agent state to expected state.
Refer to Q9 (State Persistence & Drift) for detect_drift.py.
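The idea of a drift check can be sketched as a comparison of two flat dictionaries. This is only an illustration under assumptions and is not the `detect_drift.py` script from Q9: the `compare_state` name, the flat key/value state model, and the sample keys are all hypothetical.

```python
def compare_state(expected: dict, actual: dict) -> dict:
    """Compare two flat state dictionaries and report the differences."""
    missing = sorted(k for k in expected if k not in actual)
    unexpected = sorted(k for k in actual if k not in expected)
    changed = {
        k: {"expected": expected[k], "actual": actual[k]}
        for k in expected
        if k in actual and expected[k] != actual[k]
    }
    return {
        "drifted": bool(missing or unexpected or changed),
        "missing": missing,        # keys the agent should have but does not
        "unexpected": unexpected,  # keys that appeared without being expected
        "changed": changed,        # keys present in both with different values
    }


if __name__ == "__main__":
    expected = {"branch": "feature/auth", "phase": "review"}
    actual = {"branch": "feature/auth", "phase": "implementation", "retries": 2}
    print(compare_state(expected, actual))
```

A workflow could fail the job, or open an issue, whenever `drifted` is true.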
Challenge 3.3: Implement cross-surface context handoff
Task: Show a `context-handoff.json` passed between an issue → PR → branch context.
Refer to Q10 (State Continuity Cross-Tools) for the schema and inject_cross_surface_context.py.
⚔️ Seal 4: Evaluation and Performance (Domain 4 — 19%)
Related quests: Q11 (Success Criteria), Q12 (Root Cause Analysis), Q13 (Behavior Tuning)
Challenge 4.1: Define machine-verifiable acceptance criteria
Task: Write 3 acceptance criteria for an agent task that can be verified programmatically.
// work/gh-600/capstone/acceptance-criteria.json
{
  "task": "Implement authentication feature",
  "criteria": [
    {
      "id": "AC-01",
      "description": "Unit tests pass",
      "signal": "ci-pass",
      "check_command": "gh run list --workflow=test.yml --branch=feature/auth --status=success --limit=1"
    },
    {
      "id": "AC-02",
      "description": "No new security vulnerabilities",
      "signal": "security-scan-pass",
      "check_command": "gh api /repos/{owner}/{repo}/code-scanning/alerts?state=open | jq 'length == 0'"
    },
    {
      "id": "AC-03",
      "description": "Code review approved",
      "signal": "pr-approved",
      "check_command": "gh pr view {pr_number} --json reviewDecision -q '.reviewDecision == \"APPROVED\"'"
    }
  ]
}
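To show how criteria like these could be consumed, the sketch below loads the JSON and fills in placeholders such as `{pr_number}` before the commands are run. The `render_checks` helper is hypothetical, and the sketch assumes the leading `//` path comment has been removed, since a standard JSON parser rejects comments.

```python
import json


def render_checks(criteria_json: str, **values: str) -> dict:
    """Map each criterion id to its check command with placeholders filled."""
    spec = json.loads(criteria_json)
    rendered = {}
    for criterion in spec["criteria"]:
        command = criterion["check_command"]
        # Plain replacement instead of str.format, so braces that belong to
        # the command itself (for example in a jq filter) are left untouched.
        for name, value in values.items():
            command = command.replace("{" + name + "}", value)
        rendered[criterion["id"]] = command
    return rendered


if __name__ == "__main__":
    sample = json.dumps({
        "criteria": [
            {"id": "AC-03",
             "check_command": "gh pr view {pr_number} --json reviewDecision"}
        ]
    })
    # "7" is an illustrative PR number only.
    print(render_checks(sample, pr_number="7"))
```

Placeholders that are not supplied stay in the command unchanged, which makes an unfilled value easy to spot before anything is executed.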
Challenge 4.2: Perform an RCA on a failed run
Task: Take a failed workflow run and produce a 5-Why RCA document.
Refer to Q12 (Failure Root Cause Analysis) for the full RCA template.
Challenge 4.3: Iterate on agent instructions
Task: Make one instruction change, measure the before/after difference.
Refer to Q13 (Behavior Tuning) for the instruction changelog template.
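A before/after measurement can be as simple as comparing success rates over two batches of runs. The sketch below assumes each run is recorded as a boolean (succeeded or not); the `compare_runs` helper and the sample outcomes are illustrative only and are not the Q13 changelog template.

```python
def success_rate(outcomes: list[bool]) -> float:
    """Fraction of runs that succeeded; 0.0 for an empty batch."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def compare_runs(before: list[bool], after: list[bool]) -> dict:
    """Summarize the effect of one instruction change on success rate."""
    rate_before = success_rate(before)
    rate_after = success_rate(after)
    return {
        "before": rate_before,
        "after": rate_after,
        "delta": round(rate_after - rate_before, 4),
    }


if __name__ == "__main__":
    # Illustrative outcomes, not measured data.
    print(compare_runs([True, False, False, True], [True, True, True, False]))
```

With batches this small the delta is only a rough signal, so it helps to change one instruction at a time and to keep the task set identical across both batches.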
⚔️ Seal 5: Multi-Agent Systems (Domain 5 — 17%)
Related quests: Q14 (Orchestration), Q15 (Observability), Q16 (Recovery), Q17 (Lifecycle)
Challenge 5.1: Design a 3-agent orchestration workflow
Task: Create a fan-out or chain orchestration with 3 sub-agents.
Refer to Q14 (Multi-Agent Orchestration Patterns) for the fan-out and chain patterns.
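For orientation, a fan-out can be sketched in a few lines: the orchestrator hands the same task to every sub-agent in parallel and collects the results by name. This is a generic illustration rather than the Q14 pattern; the `run_fan_out` helper and the three stand-in agents are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fan_out(task: str, agents: dict) -> dict:
    """Run every sub-agent on the same task in parallel; return results by name."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = {name: pool.submit(fn, task) for name, fn in agents.items()}
        # result() blocks until that sub-agent finishes and re-raises its errors.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins; real sub-agents would call a model or a workflow.
AGENTS = {
    "coder": lambda task: f"code for {task}",
    "tester": lambda task: f"tests for {task}",
    "reviewer": lambda task: f"review of {task}",
}

if __name__ == "__main__":
    print(run_fan_out("auth feature", AGENTS))
```

In GitHub Actions the same shape is usually expressed as parallel jobs or a matrix, with a final job that depends on all of them.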
Challenge 5.2: Add tracing to the multi-agent run
Task: Each sub-agent emits a trace entry with a shared correlation ID.
Refer to Q15 (Multi-Agent Observability) for trace_writer.py.
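The shared correlation ID can be illustrated as follows: the orchestrator creates one ID per run and each sub-agent stamps it on its trace entry. This sketch is not `trace_writer.py` from Q15; the function names and the entry fields are assumptions.

```python
import json
import uuid
from datetime import datetime, timezone


def new_correlation_id() -> str:
    """Create one ID per orchestrated run; every sub-agent reuses it."""
    return uuid.uuid4().hex


def trace_entry(correlation_id: str, agent: str, status: str) -> str:
    """Return one JSON trace line for a sub-agent."""
    return json.dumps({
        "correlation_id": correlation_id,
        "agent": agent,
        "status": status,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })


if __name__ == "__main__":
    cid = new_correlation_id()
    for agent in ("coder", "tester", "reviewer"):
        print(trace_entry(cid, agent, "completed"))
```

Filtering the combined log on a single `correlation_id` value then reconstructs one multi-agent run end to end.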
Challenge 5.3: Add failure recovery to the orchestrator
Task: Orchestrator continues after one sub-agent fails and produces a recovery plan.
Refer to Q16 (Multi-Agent Failure Recovery) for recovery_coordinator.py.
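The behaviour asked for here, continuing past a failed sub-agent and emitting a recovery plan, can be sketched as below. It is a simplified illustration and not `recovery_coordinator.py` from Q16; the `run_with_recovery` helper, the plan fields, and the `retry-or-escalate` action name are hypothetical.

```python
def run_with_recovery(task: str, agents: dict) -> dict:
    """Run sub-agents in order, keep going on failure, and plan recovery."""
    results, failures = {}, {}
    for name, fn in agents.items():
        try:
            results[name] = fn(task)
        except Exception as exc:  # broad on purpose: one failure must not stop the rest
            failures[name] = str(exc)
    recovery_plan = [
        {"agent": name, "error": error, "action": "retry-or-escalate"}
        for name, error in failures.items()
    ]
    return {"results": results, "recovery_plan": recovery_plan}


def failing_tester(task: str) -> str:
    """Hypothetical sub-agent that always fails, used to exercise recovery."""
    raise RuntimeError("test runner unavailable")


if __name__ == "__main__":
    outcome = run_with_recovery("auth feature", {
        "coder": lambda task: f"code for {task}",
        "tester": failing_tester,
        "reviewer": lambda task: f"review of {task}",
    })
    print(outcome["recovery_plan"])
```

The orchestrator still returns the successful results, so a human or a follow-up job can act on the plan without rerunning everything.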
Challenge 5.4: Register all agents in the agent registry
Task: Publish `_data/agents.yml` with all 3 agents registered.
Refer to Q17 (Multi-Agent Lifecycle Management) for the registry schema.
⚔️ Seal 6: Responsible Agentic AI (Domain 6 — 9%)
Related quests: Q18 (Autonomy Levels), Q19 (Guardrails & HITL)
Challenge 6.1: Produce your autonomy matrix
Task: Complete `_data/autonomy-matrix.yml` with 5 task types at appropriate levels.
Refer to Q18 (Autonomy Levels Matrix) for the matrix schema.
Challenge 6.2: Implement 3 guardrails
Task: CODEOWNERS file-scope boundary, approval gate environment, forbidden actions list.
Refer to Q19 (Guardrails & Human-in-the-Loop) for each guardrail.
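A forbidden-actions guardrail can be approximated by checking the files an agent changed against a deny list of path patterns. The sketch below is one possible shape under assumptions, not the Q19 implementation: the `find_violations` helper and the patterns in `FORBIDDEN_PATHS` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical deny list; adapt the patterns to your repository.
FORBIDDEN_PATHS = [".github/workflows/*", "CODEOWNERS", "*.pem"]


def find_violations(changed_files: list[str], forbidden: list[str]) -> list[str]:
    """Return the changed files that match any forbidden pattern."""
    return [
        path
        for path in changed_files
        # In fnmatch patterns, "*" also matches "/" characters.
        if any(fnmatchcase(path, pattern) for pattern in forbidden)
    ]


if __name__ == "__main__":
    changed = ["src/auth/login.py", ".github/workflows/deploy.yml"]
    blocked = find_violations(changed, FORBIDDEN_PATHS)
    if blocked:
        print("Blocked paths:", ", ".join(blocked))
```

A check like this complements CODEOWNERS and the approval environment: it can fail a job early, before a human is asked to review.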
📋 Domain Coverage Rubric (GH-600 Exam Alignment)
| Domain | Weight | Your Score | Pass Threshold |
|---|---|---|---|
| D1: Agentic SDLC | 18% | /18 | ≥ 14 |
| D2: Tools & Environment | 18% | /18 | ≥ 14 |
| D3: Memory & Context | 19% | /19 | ≥ 15 |
| D4: Evaluation | 19% | /19 | ≥ 15 |
| D5: Multi-Agent | 17% | /17 | ≥ 13 |
| D6: Governance | 9% | /9 | ≥ 7 |
| Total | 100% | /100 | ≥ 70 |
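The rubric can be checked mechanically. The sketch below copies the maximum points and thresholds from the table and assumes that passing means meeting every domain threshold as well as the total threshold; the table does not state that rule explicitly, so treat it as an assumption. The `evaluate` helper is hypothetical.

```python
# (max points, pass threshold) per domain, copied from the rubric table.
RUBRIC = {
    "D1": (18, 14),
    "D2": (18, 14),
    "D3": (19, 15),
    "D4": (19, 15),
    "D5": (17, 13),
    "D6": (9, 7),
}
TOTAL_THRESHOLD = 70


def evaluate(scores: dict) -> dict:
    """Score a capstone attempt against the rubric."""
    failed = []
    for domain, (max_points, threshold) in RUBRIC.items():
        score = scores.get(domain, 0)
        if not 0 <= score <= max_points:
            raise ValueError(f"{domain} score must be between 0 and {max_points}")
        if score < threshold:
            failed.append(domain)
    total = sum(scores.get(domain, 0) for domain in RUBRIC)
    return {
        "total": total,
        "failed_domains": failed,
        # Assumption: every domain threshold and the total must be met.
        "passed": not failed and total >= TOTAL_THRESHOLD,
    }


if __name__ == "__main__":
    print(evaluate({"D1": 14, "D2": 14, "D3": 15, "D4": 15, "D5": 13, "D6": 7}))
```

Scoring exactly at each domain threshold gives 14 + 14 + 15 + 15 + 13 + 7 = 78 points, so under this reading the per-domain thresholds are stricter than the total of 70.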
🪞 The Grand Reflection
After completing all 6 seals, publish your reflection:
<!-- work/gh-600/capstone/grand-reflection.md -->
# Grand Reflection: Agentic Codex Trial
## What I Built
[Summary of the agentic system you designed and implemented]
## Most Challenging Domain
[Which domain was hardest and why?]
## Key Architectural Decision
[The most important decision you made and why]
## What I Would Do Differently
[Honest reflection on what could be improved]
## Exam Readiness Self-Assessment
[Domain-by-domain confidence rating 1-5]
## Resources I Would Review Again
[Links back to quests or docs that were most valuable]
✅ Capstone Validation
# Validate all 20 quests in the arc
python3 test/quest-validator/quest_validator.py -d pages/_quests/
# Check all 6 seals are present
python3 work/gh-600/scripts/validate_capstone.py \
--registry _data/agents.yml \
--matrix _data/autonomy-matrix.yml \
--reflection work/gh-600/capstone/grand-reflection.md
# Build site
docker-compose exec jekyll bundle exec jekyll build
🏆 The Agentic Codex — Complete Arc Links
| Quest | Domain | Link |
|---|---|---|
| Q1 | D1 | Agentic SDLC Integration |
| Q2 | D1 | Plan vs Action Boundaries |
| Q3 | D1 | Observability & Control |
| Q4 | D2 | Tool Selection & Permissions |
| Q5 | D2 | MCP Server Mastery |
| Q6 | D2 | Dev Environment Integration |
| Q7 | D2 | Safe Execution & Error Handling |
| Q8 | D3 | Memory Strategies |
| Q9 | D3 | State Persistence & Drift |
| Q10 | D3 | State Continuity Cross-Tools |
| Q11 | D4 | Success Criteria & Signals |
| Q12 | D4 | Failure Root Cause Analysis |
| Q13 | D4 | Behavior Tuning |
| Q14 | D5 | Multi-Agent Orchestration Patterns |
| Q15 | D5 | Multi-Agent Observability |
| Q16 | D5 | Multi-Agent Failure Recovery |
| Q17 | D5 | Multi-Agent Lifecycle Management |
| Q18 | D6 | Autonomy Levels Matrix |
| Q19 | D6 | Guardrails & HITL |
| CAP | All | You are here |
🏆 Capstone Rewards
| Reward | Details |
|---|---|
| 🏆 Agentic Codex Master Badge | Earned on full completion |
| 🎓 GH-600 Ready Certificate | Published to your IT-Journey profile |
| 200 XP | Arc total: 2,020 XP |
| Arc Complete | The Agentic Codex arc is sealed |
The Codex Master speaks: “The seals are broken. The knowledge is yours. Go now and build systems worthy of the Codex.”