Documentation

Learn how to interpret results and master system design principles

Understanding Your Results

Capacity (RPS)

The maximum requests per second your system can sustain before it saturates. It is determined by the lowest-capacity component on the request path, which acts as the bottleneck for everything upstream of it.

💡 Tip: If you're below the required RPS, look for the component with the lowest capacity in your path. Consider adding replicas, caching, or optimization.
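As a rough illustration of the bottleneck rule, the sketch below treats the capacity of a request path as the minimum of its component capacities. The component names, the RPS figures, and the assumption that replicas scale capacity linearly are all hypothetical and only for illustration.

```python
def path_capacity(component_rps):
    """Return (bottleneck_name, capacity) for a single request path."""
    bottleneck = min(component_rps, key=component_rps.get)
    return bottleneck, component_rps[bottleneck]


def with_replicas(component_rps, name, replicas):
    """Copy the path with one component scaled out.

    Assumes capacity grows linearly with replica count, which is an
    idealization; real systems usually scale less than linearly.
    """
    scaled = dict(component_rps)
    scaled[name] = component_rps[name] * replicas
    return scaled


# Hypothetical path: every request passes through all three components.
path = {"load_balancer": 50_000, "app_server": 2_000, "database": 1_200}

print(path_capacity(path))                               # ('database', 1200)
print(path_capacity(with_replicas(path, "database", 3)))  # ('app_server', 2000)
```

Note that scaling out the database moves the bottleneck to the app server rather than removing it, so it is worth re-checking the whole path after each change.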

Latency (P95)

The 95th percentile response time: 95% of requests complete at or below this value. P95 is more meaningful than the average for understanding user experience, because an average can hide a slow tail of requests.

💡 Tip: Network latency between components adds up quickly. Use CDNs, edge computing, and regional replicas to reduce geographic latency.
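To see why P95 and the average can disagree, here is a minimal sketch that computes both for a made-up latency sample. It uses the nearest-rank definition of a percentile, which is one of several common definitions and not necessarily the one this tool uses.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical sample: 18 fast requests and 2 slow ones, in milliseconds.
latencies_ms = [20] * 18 + [400, 900]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p95_ms = percentile(latencies_ms, 95)

print(mean_ms)  # 83.0
print(p95_ms)   # 400
```

In this sample the average suggests an 83 ms experience, while P95 shows that the slowest requests take 400 ms or more.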

Outcome States

✓ Pass: Both metrics met

Partial: One metric met

✗ Fail: Neither met

⚡ Chaos Fail: Component crashed
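The outcome states can be read as a small decision rule. The sketch below is one possible reading, not the tool's actual logic: it assumes the two metrics are capacity and P95 latency, that thresholds are inclusive, and that a crashed component takes precedence over the metric checks. The function and parameter names are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "Pass"
    PARTIAL = "Partial"
    FAIL = "Fail"
    CHAOS_FAIL = "Chaos Fail"


def classify(capacity_rps, required_rps, p95_ms, max_p95_ms,
             component_crashed=False):
    """Map measured results to an outcome state (assumed rules)."""
    if component_crashed:
        # Assumption: a crash overrides whatever the metrics say.
        return Outcome.CHAOS_FAIL
    metrics_met = int(capacity_rps >= required_rps) + int(p95_ms <= max_p95_ms)
    return {2: Outcome.PASS, 1: Outcome.PARTIAL, 0: Outcome.FAIL}[metrics_met]


print(classify(1500, 1000, 180, 200))   # Outcome.PASS
print(classify(1500, 1000, 350, 200))   # Outcome.PARTIAL
print(classify(800, 1000, 350, 200))    # Outcome.FAIL
print(classify(1500, 1000, 180, 200, component_crashed=True))  # Outcome.CHAOS_FAIL
```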

Hints System

After failing a scenario, you'll get progressive hints to guide your solution. The more you struggle with a scenario, the more specific the guidance becomes.

💡 Tip: Don't rush to look at hints! The learning comes from discovering solutions yourself. Use hints when you're truly stuck.

Essential Reading

📚 Fundamentals

🏗️ Architecture Patterns

⚡ Performance & Scaling

🔧 Tools & Technologies

Common Design Patterns

Cache-Aside Pattern

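The application checks the cache first. On a miss, it reads from the database and stores the result in the cache for later reads. On a write, it updates the database and then updates or invalidates the cached entry.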

Use when: Read-heavy workloads, tolerance for slightly stale data, a need for faster reads than the database provides
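A minimal in-memory sketch of cache-aside follows. The class name, the callback parameters, and the TTL are assumptions made for illustration. On writes, this variant invalidates the cached entry rather than updating it, which is one common way to handle writes; a dictionary stands in for the real database.

```python
import time


class CacheAside:
    """Read-through-on-miss cache that sits beside a data store."""

    def __init__(self, load_from_db, save_to_db, ttl_seconds=60.0,
                 clock=time.monotonic):
        self._load = load_from_db
        self._save = save_to_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._cache.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]                      # cache hit
        value = self._load(key)                  # miss: go to the database
        self._cache[key] = (value, self._clock() + self._ttl)
        return value

    def put(self, key, value):
        self._save(key, value)                   # write to the database first
        self._cache.pop(key, None)               # then invalidate the entry


# Hypothetical "database" and a log of the reads that reach it.
db = {"user:1": "Ada"}
db_reads = []


def load(key):
    db_reads.append(key)
    return db[key]


def save(key, value):
    db[key] = value


cache = CacheAside(load, save)
cache.get("user:1")           # miss, reads the database
cache.get("user:1")           # hit, served from the cache
cache.put("user:1", "Grace")  # write invalidates the cached entry
cache.get("user:1")           # miss again, reads the new value
```

The TTL bounds how stale an entry can become if another writer changes the database without going through this cache.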

Circuit Breaker

Stop calling a failing service to prevent cascading failures. After a cooldown period, let a trial request through and resume normal calls once the service responds successfully.

Use when: External service dependencies, network failures, timeout handling
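The following is a minimal, single-threaded sketch of a circuit breaker with closed, open, and half-open behavior. The class name, the thresholds, and the injectable clock are assumptions chosen to keep the example short and testable. A production version would also need locking and per-exception policies.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; call skipped")
            # Cooldown elapsed: half-open, allow this one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            half_open = self._opened_at is not None
            if half_open or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open
            raise
        self._failures = 0       # success closes the circuit
        self._opened_at = None
        return result


# Demo with a fake clock so no real waiting is needed.
now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0,
                         clock=lambda: now[0])


def flaky():
    raise ConnectionError("service down")


def healthy():
    return "ok"
```

After two consecutive failures of flaky, breaker.call raises CircuitOpenError without invoking the function until ten seconds have passed on the fake clock.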

Database Sharding

Split a database across multiple servers, assigning each record to a shard by consistent hashing or range-based partitioning of its key.

Use when: High write throughput, large datasets, horizontal scaling needed
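As a sketch of the consistent-hashing option, the ring below maps keys to shards so that adding a shard relocates only the keys that land on the new shard. The shard names, the number of virtual nodes, and the use of SHA-256 are illustrative assumptions; range-based partitioning would instead look up the key in a sorted list of range boundaries.

```python
import bisect
import hashlib


def _hash(value):
    """Stable 64-bit hash (Python's built-in hash() varies per process)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ConsistentHashRing:
    def __init__(self, shards, vnodes=100):
        # Each shard gets several points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"user:{i}" for i in range(1000)]
before = ConsistentHashRing(["db-0", "db-1", "db-2"])
after = ConsistentHashRing(["db-0", "db-1", "db-2", "db-3"])

moved = [k for k in keys if before.shard_for(k) != after.shard_for(k)]
print(f"{len(moved)} of {len(keys)} keys moved, all to db-3")
```

With a plain modulo scheme (hash of the key modulo the shard count), going from three to four shards would relocate most keys instead of roughly a quarter.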

Event Sourcing

Store all state changes as events. Rebuild state by replaying events.

Use when: Audit trails needed, temporal queries, complex business logic
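A minimal sketch of event sourcing, using a hypothetical bank account as the domain: the event log is the only stored data, and the balance is derived by replaying it. The event names and the account example are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrew"
    amount: int


def apply(balance, event):
    """Pure function: current state plus one event gives the next state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrew":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, upto=None):
    """Rebuild the balance from the first `upto` events (all if None)."""
    balance = 0
    for event in events[:upto]:
        balance = apply(balance, event)
    return balance


# The append-only log is the source of truth; it doubles as an audit trail.
log = [Event("deposited", 100), Event("withdrew", 30), Event("deposited", 5)]

print(replay(log))          # 75  (current state)
print(replay(log, upto=2))  # 70  (state after the second event)
```

Replaying a prefix of the log is what makes temporal queries possible: any past state can be reconstructed without having stored it.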
