Error Handling: Status Codes, Problem Details, and Retries
Learn to design clear API errors with status codes, problem+json, idempotency, and retries
Greetings, brave adventurer! Anyone can build an API that works when everything goes right. The mark of a true craftsperson is an API that fails gracefully - one that tells the caller exactly what went wrong, lets them recover, and never duplicates an order when a request is retried. This quest, Error Handling, forges that craft.
You will learn to pick the right status code, return machine-readable error bodies in the standard problem+json format, report validation failures field by field, and design idempotent endpoints so a nervous client can retry safely. A good error message is a gift to every developer who calls your API.
📖 The Legend Behind This Quest
Networks are unreliable. Requests time out, packets drop, servers restart mid-write. When a client sees no response, it cannot tell whether the work succeeded or failed - so it retries. If your “create payment” endpoint is not idempotent, that retry charges the customer twice. The legends of production outages are full of duplicate orders and corrupted data born from careless retries.
This quest teaches the discipline that prevents those disasters: clear, consistent error responses and idempotent design. Master it and your API becomes trustworthy under the messy, failing conditions of the real internet.
🎯 Quest Objectives
By the time you complete this journey, you will have mastered:
Primary Objectives (Required for Quest Completion)
- Status Code Selection - Choose the correct 4xx or 5xx code for any failure
- Problem Details (RFC 9457) - Return errors as standard
application/problem+json - Validation Errors - Report field-level problems clearly and completely
- Idempotency - Design endpoints that tolerate safe retries
- Retries with Backoff - Implement exponential backoff with jitter on the client
Secondary Objectives (Bonus Achievements)
- Idempotency Keys - Make even POST safely retryable
- Error Consistency - Use one error envelope across an entire API
- Retry-After - Honor the server’s instruction on when to retry
Mastery Indicators
You’ll know you’ve truly mastered this quest when you can:
- Map any failure to the correct status code without guessing
- Write a problem+json body a frontend can parse and display
- Explain why retrying a non-idempotent POST is dangerous
- Implement backoff with jitter and explain why jitter matters
🗺️ Quest Prerequisites
📋 Knowledge Requirements
- You have completed API Fundamentals or know status codes and JSON
- You can make requests with curl
- You understand HTTP methods and which are idempotent
🛠️ System Requirements
- Modern operating system (Windows 10+, macOS 10.14+, or Linux)
curlavailable in your terminal- Optional: Python 3 for the retry lab
- An internet connection
🧠 Skill Level Indicators
This 🟡 Medium quest expects:
- You can already read a status code and a response body
- You are ready to design responses, not just consume them
- Ready for 75-90 minutes of focused learning
🌍 Choose Your Adventure Platform
Error handling is platform-independent. These setups give you a tool to provoke and observe HTTP errors on demand.
🍎 macOS Kingdom Path
Click to expand macOS instructions
```bash # httpbin lets you request any status code on demand brew install jq curl -s -o /dev/null -w "%{http_code}\n" https://httpbin.org/status/404 ```🪟 Windows Empire Path
Click to expand Windows instructions
```powershell winget install jqlang.jq # Ask httpbin to return a specific error code curl.exe -s -o NUL -w "%{http_code}`n" https://httpbin.org/status/503 ```🐧 Linux Territory Path
Click to expand Linux instructions
```bash sudo apt update && sudo apt install -y curl jq # Provoke a 429 to practice handling it curl -s -o /dev/null -w "%{http_code}\n" https://httpbin.org/status/429 ```☁️ Cloud Realms Path
Click to expand Cloud/Container instructions
```bash # Run your own httpbin to experiment offline docker run --rm -d -p 8080:80 kennethreitz/httpbin curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/status/418 ```🧙♂️ Chapter 1: Choosing the Right Status Code
The status code is your first and most important error signal. Choosing it well lets clients react correctly without parsing the body.
⚔️ Skills You’ll Forge in This Chapter
- Mapping failures to 4xx and 5xx codes
- Distinguishing client mistakes from server faults
🏗️ The Error Code Cheat Sheet
| Situation | Code | Meaning |
|---|---|---|
| Malformed request, bad JSON | 400 Bad Request |
The request itself is broken |
| Missing or invalid credentials | 401 Unauthorized |
You are not authenticated |
| Authenticated but not allowed | 403 Forbidden |
You may not do this |
| Resource does not exist | 404 Not Found |
Nothing lives at this URL |
| Method not allowed on resource | 405 Method Not Allowed |
Wrong verb for this URL |
| Edit conflict / version mismatch | 409 Conflict |
State conflicts with your request |
| Valid JSON but invalid values | 422 Unprocessable Entity |
Syntax fine, semantics wrong |
| Too many requests | 429 Too Many Requests |
You are being rate limited |
| Unhandled server fault | 500 Internal Server Error |
The server broke |
| Upstream dependency down | 503 Service Unavailable |
Temporarily unable to serve |
The key instinct, again: 4xx means the client must change something; 5xx means the server failed and the client may retry. Provoke a few codes to feel them:
# 400 for a bad request, 503 for a server that is temporarily down
curl -s -o /dev/null -w "got %{http_code}\n" https://httpbin.org/status/400
curl -s -o /dev/null -w "got %{http_code}\n" https://httpbin.org/status/503
🔍 Knowledge Check: Status Codes
- What is the difference between
401and403? - When would you return
422instead of400? - Which code tells the client it is safe to retry later?
⚡ Quick Wins and Checkpoints
- Provoked an error: You requested and received a specific error code
- Mapped a failure: You picked the right code for a scenario
🧙♂️ Chapter 2: Problem Details and Validation Errors
A status code alone is not enough. Clients need a machine-readable body. The IETF standardized one: problem+json (RFC 9457), so every API can speak the same error dialect.
⚔️ Skills You’ll Forge in This Chapter
- The problem+json structure
- Reporting validation errors field by field
- Keeping one error envelope across the API
🏗️ The problem+json Format
A problem document is JSON with a Content-Type of application/problem+json and these standard members: type (a URI naming the error), title, status, detail, and instance. You may add your own fields:
{
"type": "https://api.example.com/errors/insufficient-funds",
"title": "Insufficient funds",
"status": 422,
"detail": "Your balance of $30.00 cannot cover the $50.00 transfer.",
"instance": "/transfers/abc-123",
"balance": 30.0,
"requested": 50.0
}
For validation errors, report every invalid field at once so the user does not fix them one frustrating round-trip at a time:
{
"type": "https://api.example.com/errors/validation",
"title": "Your request did not pass validation",
"status": 422,
"errors": [
{ "field": "email", "message": "must be a valid email address" },
{ "field": "age", "message": "must be greater than or equal to 18" }
]
}
Always set the matching content type so generic clients route the error correctly:
# A well-behaved API labels error bodies as problem+json
curl -s -i https://api.example.com/transfers/abc-123 | grep -i "content-type"
# -> Content-Type: application/problem+json
🔍 Knowledge Check: Problem Details
- What
Content-Typedoes a problem+json body carry? - Why report all validation errors at once instead of one at a time?
- Which problem+json members are standard?
🧙♂️ Chapter 3: Idempotency and Safe Retries
When a client does not hear back, it retries. Idempotency guarantees that retrying does no extra harm. GET, PUT, and DELETE are naturally idempotent; POST is not - so we make it safe with an idempotency key.
⚔️ Skills You’ll Forge in This Chapter
- Making POST safely retryable with idempotency keys
- Exponential backoff with jitter
- Honoring
Retry-After
🏗️ Idempotency Keys
The client generates a unique key per logical operation and sends it in a header. The server records the key with the result; if the same key arrives again, it returns the stored result instead of doing the work twice:
# Two identical requests with the same key create only ONE payment
curl -s -X POST https://api.example.com/payments \
-H "Idempotency-Key: 7f3c-1b2a-create-payment-42" \
-H "Content-Type: application/json" \
-d '{"amount": 5000, "currency": "usd"}'
This is exactly how Stripe and other payment APIs make “charge the card” safe to retry.
Now the client side: retry only on retryable failures (429, 5xx, network errors), with exponential backoff and jitter so a thundering herd of clients does not all retry in lockstep:
import time, random, requests
def call_with_retry(url, max_attempts=5):
for attempt in range(max_attempts):
resp = requests.get(url, timeout=5)
# Success or a non-retryable client error: stop immediately
if resp.status_code < 500 and resp.status_code != 429:
return resp
# Honor Retry-After if the server sent one, else back off exponentially
retry_after = resp.headers.get("Retry-After")
if retry_after:
delay = float(retry_after)
else:
delay = (2 ** attempt) + random.uniform(0, 1) # backoff + jitter
time.sleep(delay)
return resp # exhausted attempts; surface the last response
Never retry a 400 or 422 - the request is wrong and will fail forever. Always cap attempts so you fail loudly instead of hammering a struggling server.
🔍 Knowledge Check: Idempotency & Retries
- Why is POST not naturally idempotent, and how does a key fix it?
- Why add random jitter to backoff delays?
- Which status codes should you never retry?
🎮 Mastery Challenges
🟢 Novice Challenge: Pick the Codes
Objective: Assign status codes to a list of failures.
Requirements:
- Choose codes for: bad JSON, missing token, forbidden action, missing record, rate limit hit
- State for each whether the client should retry
- Provoke at least two of them via httpbin
Validation: Every choice matches the cheat sheet and retry logic.
🟡 Intermediate Challenge: Design an Error Body
Objective: Write a problem+json validation error for a signup form.
Requirements:
- Include the standard
type,title,statusmembers - Report at least two field-level errors at once
- State the
Content-Typethe response must carry
Validation: A frontend could render your errors next to the right fields.
🔴 Advanced Challenge: Safe Retrying Client
Objective: Implement an idempotent, retrying client call.
Requirements:
- Retry only on
429and5xx, never on4xxvalidation errors - Use exponential backoff with jitter and honor
Retry-After - Send an idempotency key on a POST so a retry cannot duplicate work
Validation: A duplicated request produces exactly one result.
🏆 Quest Rewards & Achievements
🎖️ Badges Earned:
- 🏆 Herald of Failure - You designed errors that help instead of confuse
- 🔁 Master of the Retry - You implemented safe, idempotent retries
🛠️ Skills Unlocked:
- Problem Details Error Design - Speak the standard error dialect
- Idempotent Retry Strategy - Survive an unreliable network
🔓 Unlocked Quests:
- Rate Limiting - Return and honor
429correctly - API Versioning - Evolve without breaking error contracts
📊 Progression Points: +60 XP
🗺️ Next Steps in Your Journey
Continue the Main Story:
- 🎯 Rate Limiting - The other side of
429
Explore Side Adventures:
- ⚔️ API Versioning - Evolve without breaking clients
- ⚔️ API Authentication - The source of your
401s
Character Class Recommendations
💻 Software Developer: Continue to Rate Limiting
🏗️ System Engineer: Explore API Versioning
🛡️ Security Specialist: Check out API Authentication
📚 Resources
Official Documentation
- RFC 9457: Problem Details for HTTP APIs - The problem+json standard
- MDN: HTTP response status codes - Every code defined
- MDN: Retry-After header - Telling clients when to come back
Community Resources
- Stripe: Idempotent requests - Idempotency keys in production
- AWS: Exponential backoff and jitter - Why jitter matters
- httpbin - The error-provoking service used here
Learning Materials
- Google API Design: Errors - A production error model
- MDN: Idempotent methods - The idempotency definition
- Zalando RESTful API Guidelines - Error and retry conventions
🤝 Quest Completion Checklist
- ✅ Completed all primary objectives
- ✅ Designed a problem+json error body
- ✅ Answered all knowledge check questions
- ✅ Completed at least one mastery challenge
- ✅ Explored the resource library
- ✅ Identified your next quest in the journey
🕸️ Knowledge Graph
Structured wiki-links connect this quest to the IT-Journey knowledge graph. Open the Obsidian Graph View to explore connections.
Level hub: [[Level 0111 - API Development]] Overworld: [[🏰 Overworld - Master Quest Map]] Prerequisites: [[API Fundamentals: HTTP, Requests, and JSON]] Unlocks: [[Rate Limiting: Token Buckets, 429s, and Quotas]] · [[API Versioning: URI, Headers, and Backward Compatibility]] Obsidian docs: [[Obsidian Knowledge Graph and Wiki Links]]
🎁 Rewards
Badges
- 🏆 Herald of Failure - Designed errors that help instead of confuse
- 🔁 Master of the Retry - Implemented safe, idempotent retries
Skills unlocked
- 🛠️ Problem Details Error Design
- 🧠 Idempotent Retry Strategy
Features unlocked
- Access to the Rate Limiting and API Versioning quests
🕸️ Quest Network
Click a node to open the quest · ⌘/Ctrl-click for a new tab · drag to reposition · scroll to zoom.
Referenced by
- Loading…