[SEV-1] CRITICAL INCIDENT ACTIVE
T-MINUS 00:14:22
CPU: 100%
DISK: 98.2%
STATUS: DEGRADED

Real DevOps.
Real Infrastructure.

Every other platform teaches commands.

We teach judgment.

When production breaks at 2:47 AM, you don’t get a tutorial. You get a problem. And a terminal.

Runs in YOUR AWS
Real EC2 Infra
Free forever
No credit card
02:47 ALERT_TRIGGERED
02:49 SSH_ACCESSED
02:53 ROOT_CAUSE_FOUND
INCIDENT_ACTIVE

The pager fires.
You're the one who picks up.

No runbook covers this. No senior is online. Just you, a terminal, and something broken in ways you haven't seen before.

prod-api-01.internal — SSH session
SEV-1 ACTIVE // 3 engineers paged
root@prod-api-01:~# systemctl --failed
nginx.service
Failed with result 'exit-code'
postgresql.service
Failed with result 'signal'
docker.service
Start request repeated too quickly
root@prod-api-01:~# df -h
Filesystem   Size  Used  Avail  Use%
/dev/xvda1    99G   99G      0  100%
root@prod-api-01:~# journalctl -f --since "5 min ago"
02:47:01 kernel: EXT4-fs error (device xvda1): No space left on device
02:47:02 dockerd[892]: Error response from daemon: no space left on device
02:47:03 nginx[2201]: [emerg] bind() to 0.0.0.0:80 failed (98: Address in use)
02:47:04 systemd[1]: nginx.service: Failed with result 'exit-code'
02:47:05 postgres[3341]: FATAL: could not write to lock file
02:47:06 kernel: Out of memory: Killed process 4521 (node) total-vm:1.2G
02:47:07 systemd[1]: docker.service: Start request repeated too quickly
02:47:08 audit[1]: AVC apparmor=DENIED operation=mknod profile=docker
02:47:09 kubelet[1023]: Node condition DiskPressure set to True
02:47:10 systemd[1]: Reached target basic.system - awaiting recovery

No hints. No walkthrough. Either you fix it or you don't.
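
For orientation only: the session above shows a root filesystem at 100%, and a first look at that kind of failure often starts by asking where the space went. The sketch below is one possible starting point, not the platform's expected solution. The helper names `largest_dirs` and `disk_use_percent` are made up for this example, and the `-x` and `-d` options assume GNU `du`, as shipped on Ubuntu.

```shell
#!/usr/bin/env bash
# Sketch: first-pass triage for a full filesystem.
# Function names are illustrative, not part of any tool.

# Print the N largest entries one level below a directory, sizes in KiB.
# The first row is the directory's own total; the rest are its children.
# -x keeps du on a single filesystem, so other mounts are not counted.
largest_dirs() {
    local dir="${1:-/}" count="${2:-10}"
    du -xk -d 1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Print the use percentage (digits only) of the filesystem holding a path.
# -P forces one line per filesystem, so the awk column is stable.
disk_use_percent() {
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example usage (run as root on the affected host):
#   disk_use_percent /
#   largest_dirs /var 5
```

On a Docker host, `docker system df` and `journalctl --disk-usage` report how much of that space is held by images, containers, volumes, and journal logs.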

Browse incidents

SYS_ARCH // Provisioning Flow

Connecting your account. Deploying the chaos.

AWS Account
Cross-Account IAM Role created by you.
Devleep Engine
Validates access, selects scenario payload.
EC2 + VPC Provisioned
Broken state injected at boot.
You Fix It
SSH access granted. Time starts now.
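
The first step in the flow, a cross-account IAM role that you create, usually comes down to a trust policy naming the account allowed to assume the role. The sketch below shows the general shape of such a policy. The account ID, the external ID, and the role name `devleep-lab-access` are placeholders invented for this example; the text above does not state the real values or whether an external ID is used.

```shell
#!/usr/bin/env bash
# Sketch: generate a trust policy for a cross-account IAM role.
# Both arguments are placeholders, not values published by the platform.
write_trust_policy() {
    local trusted_account_id="$1" external_id="$2"
    cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::${trusted_account_id}:root" },
      "Action": "sts:AssumeRole",
      "Condition": { "StringEquals": { "sts:ExternalId": "${external_id}" } }
    }
  ]
}
EOF
}

# Example usage (hypothetical role name, needs the AWS CLI and credentials):
#   write_trust_policy 111122223333 example-external-id > trust.json
#   aws iam create-role --role-name devleep-lab-access \
#       --assume-role-policy-document file://trust.json
```

The trust policy only controls who may assume the role. What the role can do once assumed is set by separate permission policies attached to it.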
AWS_RESOURCE_MAP

Not a Sandbox.
Your Account.

We don't simulate environments. We provision actual AWS resources into your account. When you fix the network partition, you're fixing real AWS routing tables.

Account ID: 1234-5678-9012
Region: us-east-1
vpc-devleep-lab
subnet-public-1a
i-0a1b2c3d4e5f
t3.micro (Ubuntu 22.04)
STATE: DEGRADED
sg-lab-access
Inbound 22 0.0.0.0/0
Inbound 80 TIMEOUT
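
A timeout on port 80 while port 22 answers is the kind of symptom you can confirm from your own machine before touching any security group. A minimal sketch, assuming bash and GNU `timeout` are available; `port_open` is a name made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: check whether a TCP port accepts connections within a time limit.
# Relies on bash's /dev/tcp pseudo-device, so it needs bash, not plain sh.
# Returns 0 on a successful connect, non-zero on refusal or timeout.
port_open() {
    local host="$1" port="$2" seconds="${3:-3}"
    timeout "$seconds" bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example usage (replace the host with your instance's public address):
#   port_open 203.0.113.10 22 && echo "22 reachable"
#   port_open 203.0.113.10 80 || echo "80 not reachable"
```

A refused connection fails immediately, while a silently dropped packet runs until the timeout. That difference hints at whether a service is down or a firewall rule is in the way.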
MISSION_CATALOG

Choose your incident.

INC-101 SEV-2

Disk Exhaustion

$ df -h
/dev/xvda1 98% /
PROVISION_ENV
INC-204 SEV-1

OOM Killer on the Loose

$ dmesg | tail
Out of memory: Killed process...
PROVISION_ENV
INC-342 SEV-1

K8s CrashLoop

$ kubectl get pods
payment-api CrashLoopBackOff
PROVISION_ENV
SKILL_TREE_PROGRESSION
/linux-core
  ├── filesystem_permissions
  ├── systemd_services
  └── iptables_routing
/containers
  ├── image_optimization
  ├── runtime_security
  └── volume_mounts
/kubernetes
  ├── rbac_authorization
  ├── ingress_controllers
  └── stateful_sets (active)

Resource Allocation

Traditional Training
$299
per course
Simulated Env
Real AWS
Devleep Ops Center
$0
Free forever. No platform fee.
Access all community labs
Run in your AWS account
INITIALIZE
$ ssh ubuntu@incident-01.prod.internal
ALERT
Production API returning 500s.
Investigate immediately.