Infrastructure Resiliency Manager lets you assess the resiliency of Service Group resources by simulating zone outages on individual resources. You can evaluate how well cross-zone resiliency solutions perform for your applications and identify resources that need resiliency improvements to support application continuity.
The Availability Zone Down Drill templates provide Azure-recommended faults for supported resource types and let you override them with custom logic through Azure Runbooks. After fault injection, you can perform failover and reprotection for resources configured with active-passive solutions by using integrated Recovery Plans. During drill execution, you can also measure application downtime and monitor Service Group and resource health in real time through integrated metrics.
Key components for Availability Zone Down Drill
The following table lists the core components you use in Availability Zone Down Drills:
| Component | Description |
|---|---|
| Service Group | A logical group of Azure resources that represent an application or workload. |
| Zone Down Drill | A template that simulates an availability zone outage on Service Group resources to evaluate cross-zone resiliency. |
| Fault Injection | The process of introducing controlled failures to simulate zone outages. |
| Recovery Plan | A defined sequence of Failover and Reprotection operations to recover resources after fault injection. |
| Fault Designer | The interface used to review and edit the faults applied to each resource in the drill. |
| Health Monitoring | An integrated metrics experience for tracking resource health in real time during drill execution. |
Drill execution lifecycle
The following sequence outlines each stage in the drill lifecycle and the action performed at each stage:
1. Fault Injection: Apply controlled faults to resources in the selected availability zone.
2. Failover: Trigger failover for resources configured with active-passive solutions by using the associated Recovery Plan.
3. Reprotect: Enable replication for failed-over resources to re-establish redundancy.
4. Failover (Reverse): Fail resources back from the target zone to the source zone.
5. Reprotect (Reverse): Re-enable replication in the original direction to restore the baseline configuration.
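
To make the ordering of these stages concrete, the following Python sketch models the lifecycle as a simple ordered sequence. It is only an illustration and does not call any Azure API: the `DrillStage`, `next_stage`, and `run_drill` names are hypothetical, and stopping at the first failed stage is an assumption for this example rather than documented product behavior.

```python
from enum import Enum


class DrillStage(Enum):
    """Drill lifecycle stages, defined in the order they run."""
    FAULT_INJECTION = "Fault Injection"
    FAILOVER = "Failover"
    REPROTECT = "Reprotect"
    FAILOVER_REVERSE = "Failover (Reverse)"
    REPROTECT_REVERSE = "Reprotect (Reverse)"


# Enum members iterate in definition order, which gives the stage sequence.
STAGE_ORDER = list(DrillStage)


def next_stage(current):
    """Return the stage that follows `current`, or None after the last stage."""
    index = STAGE_ORDER.index(current)
    if index + 1 < len(STAGE_ORDER):
        return STAGE_ORDER[index + 1]
    return None


def run_drill(handlers):
    """Run each stage's handler in order and return the completed stages.

    `handlers` maps every DrillStage to a zero-argument callable that
    returns True on success. These callables are hypothetical stand-ins
    for the real drill actions. Assumption for this sketch: the drill
    stops at the first stage whose handler reports failure.
    """
    completed = []
    stage = STAGE_ORDER[0]
    while stage is not None:
        if not handlers[stage]():
            break
        completed.append(stage)
        stage = next_stage(stage)
    return completed
```

In this model, the two reverse stages run only after the forward failover and reprotect stages succeed, which mirrors how the lifecycle returns resources to the source zone and restores the baseline configuration at the end of a drill.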