Community
Every lab scenario on DevLeep is a YAML file in a public repository. The community writes, reviews, and improves them. If you have been through a real incident, you can turn it into a lab that helps thousands of engineers who haven't.
github.com/devleep/labs
YAML lab definitions · MIT Licence · PRs welcome
Submit a new lab based on a real incident
Turn an outage you have been through into a lab. The most effective scenarios come from real production failures — not synthetic exercises. Write the YAML, test the validation checks, include all three hint levels.
Improve existing hint quality
Many labs have sparse Level 2 and Level 3 hints. If you solved a lab the hard way and found the hints were not helpful, you are exactly the right person to improve them. Open the YAML, make it clearer, submit a PR.
Review open pull requests
Every new lab needs a technical review. Read the YAML, provision the lab, and verify that the validation checks actually catch what they should. A confirmed test result in the PR is worth more than a comment.
Propose new scenario ideas
If you have an incident pattern that would make a good lab but don't have time to implement it yet, open an issue describing the scenario. Include: what broke, what the operator would need to do, and what a validation check might look like.
The complete field reference is in the documentation. A good lab meets these criteria:
Based on a real failure pattern
Not a contrived exercise. There should be a class of production incidents that this lab teaches the operator to recognise and resolve.
Validation tests the outcome
Checks should verify the fix worked — service is running, config is correct, resource is available — not just that a file exists.
All three hint levels written
Level 1 directional, Level 2 specific, Level 3 full walkthrough. Someone completely stuck at 2 AM should be able to get unstuck.
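A reviewer could confirm that all three hint levels are present before provisioning anything. The sketch below is illustrative only: it assumes the hints have already been loaded from the lab YAML into a dict keyed by level number, and both that shape and the name missing_hint_levels are hypothetical, not part of the DevLeep schema.

```python
from typing import Dict, List, Optional

# Levels named in the guideline above: 1 directional, 2 specific, 3 walkthrough.
REQUIRED_LEVELS = (1, 2, 3)


def missing_hint_levels(hints: Dict[int, Optional[str]]) -> List[int]:
    """Return the hint levels that are absent or contain only whitespace.

    `hints` is a hypothetical in-memory form of a lab's hints,
    mapping level number to hint text.
    """
    return [
        level
        for level in REQUIRED_LEVELS
        if not (hints.get(level) or "").strip()
    ]
```

An empty result means every level has some text. It says nothing about whether the text is useful, which still needs a human review.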
Scenario is realistic
The broken state should look like something that could realistically end up in production — misconfiguration, version drift, resource exhaustion.
Objectives are specific
List the specific skill the operator will practise, not a vague task. 'Diagnose and repair a misconfigured systemd unit' beats 'Fix the service'.
Validation is idempotent
Running the validation twice should give the same result. Checks should not depend on timing or side effects from a previous check.
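The difference between checking for a file and checking the outcome, and what idempotent means for a check, can be sketched as follows. This is a minimal illustration in Python, not the DevLeep validation format: the function names are hypothetical, and "the service is running" is approximated here by "something accepts TCP connections on the expected port".

```python
import socket
from pathlib import Path


def config_file_exists(path: str) -> bool:
    """Weak check: passes even if the service never started."""
    return Path(path).exists()


def port_accepts_connections(host: str, port: int, timeout: float = 1.0) -> bool:
    """Outcome check: passes only if something is listening on host:port.

    The check is read-only. It opens a connection and closes it again,
    so running it twice against the same system gives the same answer.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Because the second function changes nothing on the host and does not rely on an earlier check having run, it can be repeated or reordered freely.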
These are the tracks planned or in progress. Labs within each track are prioritised by how commonly the failure pattern appears in production.
If you want to accelerate a particular area, open an issue or submit a lab.
A scenario is the broken environment — an EC2 instance set up in a specific failed state. A lab is the task within that scenario.
One scenario can host multiple labs at different difficulty levels. For example, a scenario with scenario_id: nginx-config-error might host a beginner lab that asks the operator to find and fix a syntax error, and an intermediate lab that adds a failing upstream and a missing SSL certificate to the same environment.
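The one-scenario, many-labs relationship can be modelled roughly as below. Only scenario_id and the nginx-config-error example come from the text above. The Lab class, its other fields, the lab_id values, and labs_for are hypothetical names chosen for illustration, and the real YAML fields may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Lab:
    lab_id: str        # hypothetical identifier
    scenario_id: str   # the broken environment this lab runs in
    difficulty: str
    task: str


def labs_for(scenario_id: str, labs: List[Lab]) -> List[Lab]:
    """Return every lab hosted by the given scenario, in input order."""
    return [lab for lab in labs if lab.scenario_id == scenario_id]


LABS = [
    Lab("nginx-syntax", "nginx-config-error", "beginner",
        "Find and fix a syntax error"),
    Lab("nginx-upstream-ssl", "nginx-config-error", "intermediate",
        "Fix a failing upstream and a missing SSL certificate"),
]
```

Here labs_for("nginx-config-error", LABS) returns both labs, which share one environment but set different tasks.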