Business-critical databases are at the heart of organizations. High-performance Disaster Recovery, an organization's method of regaining access to and functionality of its IT infrastructure after a disaster, is therefore necessary to guarantee database integrity and availability.
Disaster Recovery solutions are evaluated on four key metrics: data consistency, testing capability, data loss, and recovery time.
Data Consistency: The technology must ensure a recovered environment is always consistent no matter the failure.
Verification Frequency: Continuous verification of the DR solution is preferred, to ensure recovery at any stage.
Recovery Point Objective (RPO): The amount of data loss should be minimized.
Recovery Time Objective (RTO): The recovery time should be minimized.
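To make the last two metrics concrete, here is a minimal sketch of how the data-loss window and the recovery time achieved in a single incident could be derived from three timestamps. The function names and the example times are illustrative assumptions, not part of any ASR API.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_recovery_point: datetime, failure_time: datetime) -> timedelta:
    """Data-loss window: everything written after the last recovery point is lost."""
    return failure_time - last_recovery_point


def achieved_rto(failure_time: datetime, service_restored: datetime) -> timedelta:
    """Downtime: elapsed time between the failure and the service being usable again."""
    return service_restored - failure_time


# Hypothetical incident timeline (illustrative values only).
failure = datetime(2024, 1, 1, 12, 0)
rpo = achieved_rpo(datetime(2024, 1, 1, 11, 56), failure)
rto = achieved_rto(failure, datetime(2024, 1, 1, 12, 14))
print(f"Data-loss window: {rpo}, downtime: {rto}")
```

The objectives (RPO and RTO) are the upper bounds an organization sets for these two measured values.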
Azure Site Recovery (ASR) can manage replication for a variety of configurations, including VMs between Azure regions, on-premises VMs and physical servers.
ASR is designed to provide business continuity for applications and workloads by replicating physical and virtual machines from a primary site to a secondary site. It enables failover when an outage occurs, and failback once the primary location is available again.
ASR provides two levels of data replication, crash-consistent and application-consistent, by taking snapshots of the VMs and storing them in Azure storage. As with other VM replication technologies, the two methods trade off database consistency against solution simplicity.
Crash-consistent snapshots capture the data on disk, but nothing that is in memory. The result is equivalent to the on-disk data left behind when a VM crashes unexpectedly (e.g. during a power outage). There is no guarantee of data consistency for the operating system or for any applications running on the VM. For database environments, this adds risk, because it relies on the database engine to resolve any data consistency issues.
Application-consistent snapshots provide a full snapshot of on-disk data, as well as all data that is in memory and in-progress transactions. This is achieved using Volume Shadow Copy Service (VSS). Application-consistent snapshots are more complex and time-consuming than crash-consistent snapshots and may affect the performance of the applications and databases running on the primary VM.
Snapshot type | RPO | RTO |
Crash-consistent snapshots | 5 min | 15 min |
Application-consistent snapshots | 60 min | 15 min |
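The figures in the table can be checked against a workload's own requirements. The following is a minimal sketch that encodes the table values above and returns the snapshot types that satisfy a required RPO and RTO. The dictionary and function names are hypothetical, and the required values in the usage example are illustrative.

```python
from datetime import timedelta

# RPO/RTO figures copied from the table above.
SNAPSHOT_TARGETS = {
    "crash-consistent": {"rpo": timedelta(minutes=5), "rto": timedelta(minutes=15)},
    "application-consistent": {"rpo": timedelta(minutes=60), "rto": timedelta(minutes=15)},
}


def snapshot_types_meeting(required_rpo: timedelta, required_rto: timedelta) -> list[str]:
    """Return the snapshot types whose RPO and RTO are both within the requirement."""
    return [
        name
        for name, target in SNAPSHOT_TARGETS.items()
        if target["rpo"] <= required_rpo and target["rto"] <= required_rto
    ]


# Example: a workload that tolerates 10 minutes of data loss and 30 minutes of downtime.
print(snapshot_types_meeting(timedelta(minutes=10), timedelta(minutes=30)))
```

A check like this only compares timeframes. It says nothing about whether the recovered database is consistent, which is the other half of the trade-off described above.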
ASR enables non-disruptive DR testing via a test failover. This creates a copy of the VM in Azure, with no impact on the production environment or on ongoing replication. Test VMs are cleaned up after the process is completed.
For database systems, which are typically I/O intensive, it is important to understand whether the performance profile of the database environment is within the ASR limits. The following article covers the Azure Site Recovery limits.
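As a rough way to approach that comparison, the sketch below summarizes a series of measured disk write rates against a limit supplied by the caller. The function name, the sample values, and the limit of 10 MB/s are all hypothetical placeholders. The real limits must be taken from the Azure Site Recovery documentation for the configuration in use.

```python
from statistics import mean


def churn_summary(write_mb_per_s: list[float], limit_mb_per_s: float) -> dict:
    """Summarize measured write rates (MB/s) against a caller-supplied limit."""
    if not write_mb_per_s:
        raise ValueError("at least one sample is required")
    over = [s for s in write_mb_per_s if s > limit_mb_per_s]
    return {
        "average": mean(write_mb_per_s),
        "peak": max(write_mb_per_s),
        "share_over_limit": len(over) / len(write_mb_per_s),
        "within_limit": not over,
    }


# Hypothetical samples and limit, for illustration only.
summary = churn_summary([4.0, 6.0, 12.0, 2.0], limit_mb_per_s=10.0)
print(summary)
```

Looking at the peak and the share of samples over the limit, rather than the average alone, matters for databases, where write activity tends to be bursty.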
ASR provides a reliable and solid DR solution for VMs and non-database environments, where there is less risk of data consistency issues arising. Its RPO and RTO timeframes are moderate, which may be a limiting factor for mission-critical applications. For these reasons, customers with business-critical databases should look at combining ASR with database-specific Disaster Recovery to achieve improved performance.
In blog two, we will look at the specialist database Disaster Recovery options that are available to complement Azure Site Recovery.
If you have any questions or would like to discuss how Dbvisit StandbyMP could fit within your organizational needs, contact us, and one of our technical specialists will reach out to you.