Azure Site Recovery Test Failover – Windows Server 2022 VM Fails to Boot Unless Replication Is Reinitialized

Ghulam Abbas 211 Reputation points
2026-05-06T09:02:44.5766667+00:00

We are experiencing a recurring issue with Azure Site Recovery (ASR) test failovers for a business‑critical file server VM. The server originated as Windows Server 2012 R2 and was upgraded in place to 2019 and then to Windows Server 2022. It is a Gen 1 (BIOS‑based) Azure VM with a 128 GB OS disk and an 8 TB NTFS data disk. In production, the VM operates normally: monthly OS reboots succeed, and Veeam backups and in‑region restores work without issue. However, during ASR test failovers to another region (ASR is configured between two regions, and we run a test failover twice a year), the VM consistently fails to boot on the first attempt: it briefly enters a running state and then powers off with no boot diagnostics output, even when the data disk is detached. We have observed this behavior twice, approximately six months apart.

In both cases, the issue was temporarily resolved only after disabling and re‑enabling ASR replication and completing a full re‑sync (which took several days), after which the test failover succeeded. Because this workaround would not be feasible in a real regional outage, we are concerned about first‑attempt failover reliability. We would appreciate Microsoft’s official advice on whether this is a known risk for long‑lived Gen 1 VMs that have undergone multiple in‑place OS upgrades, and whether Microsoft recommends rebuilding the OS on a fresh Windows Server 2022 Gen 2 (UEFI) VM as the long‑term remediation. If rebuilding is the recommended approach, we would also welcome guidance on best practices for safely reintroducing a new server with the same hostname and ensuring DNS, file share definitions, NTFS permissions, and client mappings are preserved without disruption, as well as whether there are any Microsoft‑recommended methods to validate these aspects prior to a production cutover. Thanks

Azure Site Recovery

An Azure native disaster recovery service. Previously known as Microsoft Azure Hyper-V Recovery Manager.


1 answer

  1. Suchitra Suregaunkar 14,420 Reputation points Microsoft External Staff Moderator
    2026-05-06T09:50:53.8033333+00:00

    Hello @Ghulam Abbas

    Thank you for posting your query on the Microsoft Q&A platform.

    This behavior is consistent with what can happen on long-lived Gen 1 (BIOS/MBR) Azure VMs that have undergone multiple successive in-place OS upgrades. While the ASR support matrix confirms that "Windows OS upgrade without disable replication is supported", this does not guarantee that the accumulated boot configuration artifacts from a 2012 R2 → 2019 → 2022 upgrade chain will remain fully compatible with the ASR failover process over time.

    • Boot Configuration Data (BCD) and MBR accumulation: Each in-place OS upgrade layers new boot entries, HAL configurations, and legacy driver remnants onto the existing MBR-based boot structure. In production, the source VM's hypervisor environment tolerates these because it has been booting this same disk continuously. However, when ASR creates a target VM in the DR region from replicated disk data, the target VM is essentially "cold booting" this disk for the first time in a different host environment, and the accumulated boot complexity can cause it to fail silently (briefly entering a "running" state, then powering off with no boot diagnostics).
    • Why disabling/re-enabling replication fixes it: When you disable and re-enable ASR replication, it performs a fresh full initial replication of the disk. This creates clean recovery points and eliminates any potential replication metadata inconsistencies or stale delta chain issues that may have built up over months of continuous replication through OS-level changes. The fact that test failover works immediately after a fresh resync, but fails after months of continued replication, strongly points to a replication stream integrity issue compounded by the complex boot layout.
    • No boot diagnostics output: This is characteristic of a BIOS-level boot failure on Gen 1 VMs, where the OS never gets far enough into the boot sequence to produce serial console output. On a Gen 2 (UEFI) VM, you would typically see more diagnostic information even in a boot failure scenario.
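
    To get a rough picture of how much boot configuration has accumulated on the source VM, something like the following could be run there in an elevated PowerShell session. This is only an inspection sketch: it assumes English-language bcdedit output, and the helper name Get-BootLoaderEntryCount is made up for illustration. A high count proves nothing by itself, but leftover loader entries from older OS versions are worth reviewing.

    ```powershell
    # Hypothetical helper: counts "Windows Boot Loader" sections in the text
    # produced by "bcdedit /enum all" (assumes English output).
    function Get-BootLoaderEntryCount {
        param([string[]]$BcdText)
        @($BcdText | Where-Object { $_ -match '^Windows Boot Loader\s*$' }).Count
    }

    # Usage on the source VM (elevated prompt):
    # $bcd = bcdedit /enum all
    # Get-BootLoaderEntryCount -BcdText $bcd
    # Get-Disk | Select-Object Number, PartitionStyle   # a Gen 1 OS disk is expected to show MBR
    ```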

    There is no explicit Microsoft KB article stating "Gen 1 VMs with multiple in-place upgrades will fail ASR test failovers." However, the ASR support matrix does call out that changes to a VM's OS or configuration during active replication can require replication to be disabled and re-enabled, and the in-place upgrade documentation explicitly warns that Azure capabilities such as Auto guest patching and Auto OS image upgrades may not function correctly after in-place upgrades because "the source image information in the VM properties, including the publisher, offer, and plan, remains unchanged."

    Reference: https://learn.microsoft.com/en-us/azure/virtual-machines/windows-in-place-upgrade

    This mismatch between the actual OS on disk and the VM metadata is a contributing factor in ASR's ability to correctly provision and boot the target VM.
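
    To see whether this mismatch applies to your VM, the image reference recorded in the VM properties can be compared with the OS reported inside the guest. The sketch below assumes the Az.Compute module is installed and you are signed in; the resource group and VM names are placeholders, and Get-VmImageSummary is a hypothetical helper name.

    ```powershell
    # Hypothetical helper: summarizes the image reference stored on a VM object
    # as returned by Get-AzVM (Az.Compute module).
    function Get-VmImageSummary {
        param($Vm)
        $img = $Vm.StorageProfile.ImageReference
        if ($null -eq $img) { return 'no image reference' }
        '{0} / {1} / {2}' -f $img.Publisher, $img.Offer, $img.Sku
    }

    # Usage (placeholder names):
    # $vm = Get-AzVM -ResourceGroupName 'rg-files' -Name 'FILESVR01'
    # Get-VmImageSummary -Vm $vm
    # Inside the guest, for comparison:
    # (Get-CimInstance Win32_OperatingSystem).Caption
    ```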

    Long-Term Remediation: Rebuild on a Fresh Gen 2 VM: Rebuilding the OS on a fresh Windows Server 2022 Gen 2 (UEFI) VM is the recommended long-term approach for the following reasons:

    1. UEFI boot architecture is more resilient and provides better diagnostics during failover. Gen 2 VMs produce meaningful boot diagnostics output even when boot failures occur, making troubleshooting far simpler. https://charbelnemnom.com/migrate-gen1-to-gen2-vms-on-azure/
    2. Trusted Launch support (vTPM, Secure Boot) is only available on Gen 2 VMs, which is increasingly becoming a security requirement.
    3. Clean boot configuration eliminates the legacy MBR artifacts from the 2012 R2 → 2019 → 2022 upgrade chain, giving ASR a clean, single-generation disk to replicate.
    4. ASR replication reliability: A fresh Gen 2 VM with a clean Windows Server 2022 install will have metadata (publisher, offer, SKU) that matches the actual OS, eliminating the metadata mismatch issue.

    Note on Gen 1 → Gen 2 in-place upgrade: Microsoft introduced an in-place Gen 1 to Gen 2 upgrade capability (preview as of Feb 2025) that involves converting the OS disk from MBR to GPT and enabling UEFI. However, for your scenario where the goal is to eliminate accumulated boot artifacts and ensure clean ASR failover reliability, a clean rebuild is preferable over an in-place generation conversion, because the conversion would still carry forward the layered OS upgrade artifacts on the disk.

    File Server Rebuild: Preserving Hostname, DNS, Shares, NTFS Permissions & Client Mappings: The recommended approach is to build the new Gen 2 VM with a temporary hostname, migrate data and configuration, then swap the hostname to match the original server. This ensures zero client-side changes.

    1. Pre-Migration: Export Everything from the Current Server:

    Before decommissioning the old VM, capture the full file server configuration:

    • Share definitions : Export all SMB share configurations:
        Get-SmbShare | Where-Object {$_.Special -eq $false} | Export-Clixml C:\Migration\ShareConfig.xml
      
    • NTFS permissions : Use icacls to save ACLs for all shared directories:
        icacls "D:\Shares" /save C:\Migration\ntfs_permissions.txt /T /C
      
    • DNS records : Document all A records, CNAME records, and any aliases pointing to this server's hostname or IP.
    • DFS Namespace targets (if applicable) : Document the DFS folder targets pointing to this server.
    • Group Policy drive mappings — Identify any GPOs or login scripts referencing the server by hostname or IP. https://4sysops.com/archives/migrate-a-file-server-to-windows-server-2025/
    2. Data Migration: Use Robocopy with Full Fidelity

    Robocopy is the recommended tool for preserving NTFS permissions, timestamps, and ownership:

    robocopy "D:\Shares" "\\NEWSERVER-TEMP\D$\Shares" /E /COPYALL /DCOPY:DAT /R:3 /W:5 /MT:16 /LOG:C:\Migration\robocopy_log.txt
    

    Key flags:

    • /COPYALL — copies Data, Attributes, Timestamps, Security (NTFS ACLs), Owner, and Auditing info
    • /DCOPY:DAT — preserves directory timestamps
    • /MT:16 — multi-threaded for the 8 TB data disk

    Run an initial seed copy, then a final delta sync just before cutover.
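
    The seed-then-delta approach could be scripted roughly as follows. This is a sketch, not a tested migration script: Invoke-SharePass and Test-RobocopySucceeded are hypothetical helper names, and the -Purge switch (which adds robocopy's /PURGE so that files deleted at the source are also removed from the destination) is an assumption about what you want on the final pass. Robocopy exit codes below 8 indicate no copy failures.

    ```powershell
    # Robocopy exit codes are a bit mask; 8 or higher means at least one failure.
    function Test-RobocopySucceeded {
        param([int]$ExitCode)
        $ExitCode -lt 8
    }

    # Hypothetical wrapper for one copy pass. Run it once for the seed,
    # then again (optionally with -Purge) for the final delta.
    function Invoke-SharePass {
        param([string]$Source, [string]$Destination, [string]$LogPath, [switch]$Purge)
        $opts = @('/E', '/COPYALL', '/DCOPY:DAT', '/R:3', '/W:5', '/MT:16', "/LOG+:$LogPath")
        if ($Purge) { $opts += '/PURGE' }   # also delete destination files that no longer exist at the source
        & robocopy $Source $Destination @opts
        if (-not (Test-RobocopySucceeded -ExitCode $LASTEXITCODE)) {
            throw "robocopy reported failures (exit code $LASTEXITCODE); see $LogPath"
        }
    }

    # Usage:
    # Invoke-SharePass -Source 'D:\Shares' -Destination '\\NEWSERVER-TEMP\D$\Shares' -LogPath 'C:\Migration\robocopy_log.txt'
    # Invoke-SharePass -Source 'D:\Shares' -Destination '\\NEWSERVER-TEMP\D$\Shares' -LogPath 'C:\Migration\robocopy_log.txt' -Purge
    ```

    Because robocopy skips files that are unchanged by default, the second pass only transfers what changed since the seed.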

    3. Recreate Share Definitions on the New Server: Re-import the share definitions captured earlier:
    $shares = Import-Clixml C:\Migration\ShareConfig.xml
    foreach ($share in $shares) {
        New-SmbShare -Name $share.Name -Path $share.Path -Description $share.Description
        # Then re-apply share-level permissions as needed
    }
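
    For the share-level permissions mentioned in the comment above, one possible approach is to save the access entries on the old server and replay them on the new one. The sketch below assumes a CSV file at C:\Migration\ShareAccess.csv and uses a hypothetical helper name, Restore-ShareAccess. Entries with a "Custom" access right are not handled and would need manual review.

    ```powershell
    # On the OLD server, before cutover (commented out here; run it there):
    # Get-SmbShare | Where-Object { -not $_.Special } | Get-SmbShareAccess |
    #     Select-Object Name, AccountName,
    #         @{ n = 'AccessControlType'; e = { "$($_.AccessControlType)" } },
    #         @{ n = 'AccessRight';       e = { "$($_.AccessRight)" } } |
    #     Export-Csv C:\Migration\ShareAccess.csv -NoTypeInformation

    # Hypothetical helper: replays the saved entries on the NEW server.
    function Restore-ShareAccess {
        param([object[]]$Entries)
        foreach ($e in $Entries) {
            if ($e.AccessControlType -eq 'Allow') {
                Grant-SmbShareAccess -Name $e.Name -AccountName $e.AccountName -AccessRight $e.AccessRight -Force | Out-Null
            } else {
                Block-SmbShareAccess -Name $e.Name -AccountName $e.AccountName -Force | Out-Null
            }
        }
    }

    # Usage on the new server, after the shares exist:
    # Restore-ShareAccess -Entries (Import-Csv C:\Migration\ShareAccess.csv)
    ```

    New-SmbShare normally adds a default read entry for Everyone when no access parameters are given, so it is worth reviewing the result with Get-SmbShareAccess afterwards.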
    
    4. Restore NTFS Permissions (if needed): If Robocopy with /COPYALL was used, permissions are already in place.

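    To restore from the saved ACLs, run icacls against the parent of the saved folder, because the entries in the saved file are stored relative to that parent (they begin with Shares):

    icacls D:\ /restore C:\Migration\ntfs_permissions.txt /C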
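
    To verify that permissions arrived intact, the security descriptors on both sides can be compared for a sample of items. The following is a rough sketch: Compare-TreeAcl is a hypothetical helper name, it expects absolute root paths, and the default sample size of 500 items is an arbitrary assumption. It needs an account that can read ACLs on both servers.

    ```powershell
    # Hypothetical helper: reports items that are missing on the destination
    # or whose security descriptor (SDDL string) differs from the source.
    function Compare-TreeAcl {
        param([string]$SourceRoot, [string]$DestRoot, [int]$MaxItems = 500)
        $SourceRoot = $SourceRoot.TrimEnd('\')
        Get-ChildItem -LiteralPath $SourceRoot -Recurse -Force |
            Select-Object -First $MaxItems |
            ForEach-Object {
                $relative = $_.FullName.Substring($SourceRoot.Length).TrimStart('\')
                $target   = Join-Path $DestRoot $relative
                if (-not (Test-Path -LiteralPath $target)) {
                    [pscustomobject]@{ Path = $relative; Issue = 'missing on destination' }
                } elseif ((Get-Acl -LiteralPath $_.FullName).Sddl -ne (Get-Acl -LiteralPath $target).Sddl) {
                    [pscustomobject]@{ Path = $relative; Issue = 'ACL differs' }
                }
            }
    }

    # Usage:
    # Compare-TreeAcl -SourceRoot 'D:\Shares' -DestRoot '\\NEWSERVER-TEMP\D$\Shares'
    ```

    No output means that no differences were found in the sampled items.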
    
    5. Hostname Swap (Zero Client Disruption): This is the critical step to preserve all client mappings:
      a. Rename the old VM (e.g., FILESVR-OLD) and shut it down, or shut it down and remove its computer account from AD.
      b. Rename the new Gen 2 VM to the original hostname: Rename-Computer -NewName "ORIGINAL-HOSTNAME" -DomainCredential (Get-Credential) -Restart
      c. Assign the same static private IP to the new VM's NIC (if the old VM used a static IP).
      d. DNS A records will update automatically via AD-integrated DNS registration. Verify with nslookup. https://en.ittrip.xyz/windows-server/dc-cutover-same-ip-name
    6. If Using DFS Namespaces: If the new server keeps a different hostname, the cutover only requires updating the DFS folder targets from \\oldserver\share to \\newserver\share. If you perform the hostname swap above, no DFS changes are needed at all because the UNC paths remain identical.
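
    As a way to validate the new server before and after the cutover, the saved share export can be compared with the shares that actually exist on it, followed by a basic name resolution and SMB port check from a client. This is a sketch under assumptions: Compare-ShareList is a hypothetical helper name, and NEWSERVER-TEMP and ORIGINAL-HOSTNAME are the placeholder names used in the steps above.

    ```powershell
    # Hypothetical helper: lists shares from the saved export that are missing
    # on the new server or that point at a different path there.
    function Compare-ShareList {
        param([object[]]$Expected, [object[]]$Actual)
        foreach ($e in $Expected) {
            $match = $Actual | Where-Object { $_.Name -eq $e.Name }
            if (-not $match) {
                [pscustomobject]@{ Name = $e.Name; Issue = 'missing' }
            } elseif ($match.Path -ne $e.Path) {
                [pscustomobject]@{ Name = $e.Name; Issue = "path is $($match.Path), expected $($e.Path)" }
            }
        }
    }

    # Usage before the hostname swap:
    # $expected = Import-Clixml C:\Migration\ShareConfig.xml
    # $actual   = Get-SmbShare -CimSession 'NEWSERVER-TEMP' | Where-Object { -not $_.Special }
    # Compare-ShareList -Expected $expected -Actual $actual
    #
    # After the swap, from a client machine:
    # Resolve-DnsName 'ORIGINAL-HOSTNAME' -Type A
    # Test-NetConnection 'ORIGINAL-HOSTNAME' -Port 445
    ```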

    Thanks,
    Suchitra.
