
TCP and ICMP, over the same IPsec SA, with our gateway-level packet capture showing ESP frames egressing to the partner peer but zero return ESP frames during the test window

Navel Chikwanda 0 Reputation points
2026-05-27T14:36:25.78+00:00

We have a route-based Azure VPN Gateway with NAT rules connected via IPsec/IKEv2 to a partner network (Ellkay). Traffic from partner → Azure works correctly. Traffic from Azure → partner fails for both TCP and ICMP, over the same IPsec SA, with our gateway-level packet capture showing ESP frames egressing to the partner peer but zero return ESP frames during the test window.

We have exhausted all client-side troubleshooting (NSG, route table, VM firewall, NAT rules, partner-side review) and would like Microsoft Networking to validate the VPN Gateway data plane and NAT-rule behavior in the egress direction.

Environment

| Field | Value |
|---|---|
| Subscription ID | PII removed |
| Resource Group | PII removed |
| VPN Gateway | PII removed (route-based, active-active) |
| Gateway public IPs | PII removed (PII removed) / PII removed (PII removed) |
| VPN Connection | PII removed |
| Local Network Gateway | PII removed, peer IP PII removed |
| Partner address space | |
| IPsec policy | IKEv2, AES256, SHA256, DH14, lifetime 28800 s |
| policyBasedTrafficSelectors | false (route-based, no TS restrictions on our side) |
| Connection state | Connected (verified az network vpn-connection show) |

VPN Gateway NAT rules (linked to the connection)

| Rule | Mode | External | Internal |
|---|---|---|---|
| | EgressSnat | xx.xx.xx.xx/32 | xx.xx.xx.xx/32 |
| nat-ingress-gpsl-orders-prod | IngressSnat | xx.xx.xx.xx/32 | xx.xx.xx.xx/32 |

Both rules are linked to vpnconn-ellkay-prod (verified via az network vpn-connection show → ingressNatRules / egressNatRules).

The 20.10.104.81 Public IP exists as a standalone Microsoft.Network/publicIPAddresses resource (PII removed, Standard SKU, unattached) — used purely as a NAT-mapping label.

Affected VM

| Field | Value |
|---|---|
| VM | PII removed |
| Private IP | xx.xx.xx.xx |
| NSG (NIC) | PII removed (verified: outbound permits TCP/ICMP to xx.xx.xx.xx/32) |
| Effective route for xx.xx.xx.xx/32 | next-hop type VirtualNetworkGateway, state Active |
| Windows Firewall | Allow-Ellkay-Results-Outbound enabled, no Block rules enabled |

Working flow - partner → Azure (Orders, TCP/8445)

  • Partner xx.xx.xx.xx → xx.xx.xx.xx
  • IngressSnat rewrites to xx.xx.xx.xx:8445
  • TCP handshake succeeds, application traffic flows
  • This proves the IPsec SA carries traffic for the selector pair xx.xx.xx.xx ↔ xx.xx.xx.xx

Failing flow — Azure → partner (Results, TCP/16731 and ICMP)

From xx.xx.xx.xx (server: PII removed):

ICMP ping xx.xx.xx.xx (4 packets, 2026-05-26 18:39 UTC)

Packets: Sent = 4, Received = 0, Lost = 4 (100% loss)

TCP Test-NetConnection xx.xx.xx.xx -Port 16731 (×3)

18:39:42 UTC  Open=False  Elapsed=26.5s

18:40:08 UTC  Open=False  Elapsed=26.1s

18:40:34 UTC  Open=False  Elapsed=26.0s
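
The three results above only report Open=False. It can also help to know whether each attempt timed out (nothing came back) or was actively refused (a reset came back through the tunnel). The following is a minimal Python sketch of such a probe, not part of our original test procedure. The function name `probe`, the outcome labels, and the injectable `connector` parameter are all illustrative assumptions.

```python
import socket
import time


def probe(host, port, timeout=5.0, connector=socket.create_connection):
    """Attempt one TCP connect and classify the outcome.

    Returns (outcome, elapsed_seconds), where outcome is one of
    "open", "refused", "timeout", or "error".
    `connector` is injectable so the classification can be exercised
    without real network access (an assumption made for this sketch).
    """
    start = time.monotonic()
    try:
        conn = connector((host, port), timeout)
    except socket.timeout:
        # No SYN-ACK and no RST arrived before the deadline.
        outcome = "timeout"
    except ConnectionRefusedError:
        # The remote side (or a device in the path) answered with a reset.
        outcome = "refused"
    except OSError:
        # Any other socket-level failure, e.g. no route to host.
        outcome = "error"
    else:
        conn.close()
        outcome = "open"
    return outcome, time.monotonic() - start
```

A "timeout" outcome would be consistent with the capture findings of no return traffic, whereas "refused" would mean something is replying on the return path.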

Gateway packet-capture findings (taken earlier today)

We ran a gateway-level packet capture on PII removed via:

az network vnet-gateway packet-capture start ...

az network vnet-gateway packet-capture stop ... (we used the ARM REST API directly due to a SAS-URL ampersand-parsing issue in the Azure CLI)

tshark protocol-hierarchy and conversation analysis of the captured .cap:

  • ESP frames egress from gateway PIP to xx.xx.xx.xx: present for the test window.
  • ESP frames ingress to gateway PIP from xx.xx.xx.xx during the same window: zero.
  • IKE control plane: healthy, no rekey events during the test.
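
For anyone repeating the direction count above, here is a rough Python sketch of the tally. It assumes the ESP frames were first exported as tab-separated source and destination addresses, for example with `tshark -r capture.cap -Y esp -T fields -e ip.src -e ip.dst` (the file name is a placeholder). The function name and the documentation-range addresses in the usage note are hypothetical, not values from our environment.

```python
def count_esp_directions(lines, gateway_ips, peer_ip):
    """Count ESP frames per direction from 'src<whitespace>dst' lines.

    gateway_ips: addresses the capture shows for the gateway instances.
    peer_ip:     the partner's tunnel endpoint address.
    """
    gateway_ips = set(gateway_ips)
    counts = {"egress": 0, "ingress": 0, "other": 0}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        src, dst = parts
        if src in gateway_ips and dst == peer_ip:
            counts["egress"] += 1
        elif src == peer_ip and dst in gateway_ips:
            counts["ingress"] += 1
        else:
            counts["other"] += 1
    return counts
```

For example, calling it with `gateway_ips=["203.0.113.10", "203.0.113.11"]` and `peer_ip="198.51.100.20"` on the exported lines gives one egress and one ingress count to compare for the test window.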

Connection byte counters (from az network vpn-connection show):

  • egressBytesTransferred increments steadily during outbound tests.
  • ingressBytesTransferred increments only at keepalive/DPD cadence (no per-test correlated jump).
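
The counter comparison above can be made repeatable by saving the JSON output of az network vpn-connection show before and after a test run and diffing the two counters. The Python sketch below shows one way to do that. The function names and the `keepalive_allowance` threshold are assumptions for illustration; the threshold in particular is an arbitrary placeholder, not a documented figure.

```python
import json

COUNTERS = ("egressBytesTransferred", "ingressBytesTransferred")


def counter_deltas(before_json, after_json):
    """Return the growth of each byte counter between two JSON snapshots."""
    before = json.loads(before_json)
    after = json.loads(after_json)
    return {key: int(after[key]) - int(before[key]) for key in COUNTERS}


def looks_one_way(deltas, keepalive_allowance=2000):
    """Heuristic: egress grew during the test, ingress stayed near idle.

    keepalive_allowance is a hypothetical byte budget for keepalive/DPD
    traffic during the window; tune it against an idle baseline.
    """
    return (deltas["egressBytesTransferred"] > keepalive_allowance
            and deltas["ingressBytesTransferred"] <= keepalive_allowance)
```

Taking an idle baseline over the same duration first gives a more defensible value for the allowance than a guessed constant.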

What we have verified and ruled out

  • ✅ NSG on the source NIC permits outbound TCP/ICMP to xx.xx.xx.xx
  • ✅ Effective route table sends xx.xx.xx.xx/32 to VirtualNetworkGateway (Active)
  • ✅ Windows Firewall on the VM has no Block rules; the Allow rule for outbound is enabled
  • ✅ NAT rule mappings are symmetric and linked to the connection
  • ✅ IPsec SA is up and carries inbound traffic for the same xx.xx.xx.xx ↔ xx.xx.xx.xx pair (Orders works)
  • ✅ Partner has performed internal review and stated no policy is blocking and that they "see ingress traffic but not egress"
  • ❌ Both ICMP and TCP fail outbound from Azure — i.e., not a port- or listener-specific issue

Can you please check the following from the Microsoft side?

  1. Egress data plane validation: Please verify on the gateway control/data planes that ESP frames generated for the SA with xx.xx.xx.xx are leaving the gateway PIPs with the correct inner source (post-SNAT xx.xx.xx.xx) and that no Azure-side egress filter is dropping them.
  2. Return path: Please confirm whether the gateway is receiving any ESP frames from xx.xx.xx.xx for this SA during a fresh test window (we will run one on request). If frames are received but not decapsulated/forwarded due to NAT or SA mismatch, please surface the reason from gateway logs.
  3. NAT-rule semantics for asymmetric flows: For a route-based gateway with paired ingress/egress SNAT rules, is there any known scenario in which the egress SNAT applies on outbound but the corresponding return-path translation fails to match — for example if the partner replies from a slightly different source IP, port, or selector than the egress rule expects?
  4. Active-active behavior: With two PIPs (xx.xx.xx.xx, xx.xx.xx.xx), can return ESP from the partner arrive on PIP A while the SA state lives on the instance behind PIP B? If yes, is there mitigation guidance?
  5. Gateway health: Please confirm the gateway SKU, build, and instance health are nominal and there are no current incidents impacting NAT-rule processing in our region.
  6. Diagnostic logs: Please pull GatewayDiagnosticLog, TunnelDiagnosticLog, RouteDiagnosticLog, and IKEDiagnosticLog for xx.xx.xx.xx covering the last 24 hours, focusing on the SA to peer xx.xx.xx.xx and on NAT rule xx.xx.xx.xx.
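
To make the concern in item 3 concrete, here is a toy Python model of a static 1:1 NAT mapping. It is only a sketch of the general technique and makes no claim about how the Azure gateway implements its NAT rules. The class name, method names, and the addresses in the usage note are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticNatRule:
    """Toy 1:1 mapping between an internal and an external prefix."""
    internal: ipaddress.IPv4Network
    external: ipaddress.IPv4Network

    def _map(self, addr, from_net, to_net):
        ip = ipaddress.ip_address(addr)
        if ip not in from_net:
            return None  # no rule match, so no translation happens
        offset = int(ip) - int(from_net.network_address)
        return str(to_net.network_address + offset)

    def translate_outbound_source(self, addr):
        """Outbound: rewrite an internal source to its external address."""
        return self._map(addr, self.internal, self.external)

    def translate_return_destination(self, addr):
        """Return path: rewrite an external destination back to internal."""
        return self._map(addr, self.external, self.internal)
```

With a rule built from `ipaddress.ip_network("10.0.0.4/32")` and `ipaddress.ip_network("198.51.100.10/32")`, an outbound source of 10.0.0.4 maps to 198.51.100.10, while a reply addressed to any destination other than 198.51.100.10 returns None in this model. That is the kind of return-path mismatch we are asking about.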


Azure VPN Gateway

An Azure service that enables the connection of on-premises networks to Azure through site-to-site virtual private networks.


1 answer

  1. Sina Salam 29,846 Reputation points Volunteer Moderator
    2026-05-28T13:28:11.67+00:00

    Hello Navel Chikwanda,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you are experiencing TCP connectivity issues over an Azure Site-to-Site IPsec VPN tunnel while ICMP traffic is working successfully through the same Security Association (SA).

    My analysis shows that this is not caused by Azure VPN Gateway limitations, because Azure VPN Gateway fully supports TCP, UDP, and ICMP traffic over the same IPsec SA. The actual problem is typically related to MTU/MSS fragmentation, asymmetric routing, firewall state inspection, NAT handling, or incorrect Phase-2 traffic selector configuration between VPN peers.

    What you can do:

    • Configure no-NAT rules for VPN-protected traffic
    • Configure TCP MSS clamping to prevent IPsec fragmentation
    • Validate Path MTU (PMTU) and fragmentation behavior
    • Ensure symmetric routing across both VPN endpoints
    • Verify Phase-2 traffic selectors are configured as Protocol = ANY
    • Perform simultaneous packet captures on both VPN peers
    • Temporarily disable deep packet inspection or TCP normalization on VPN interfaces if enabled

    After correcting MTU/MSS handling, eliminating asymmetric routing, validating traffic selectors, and bypassing NAT for protected traffic, TCP traffic should traverse the same IPsec SA alongside ICMP without connectivity issues. Refer to the official Microsoft and standards references for detailed guidance and implementation steps.

    I hope this is helpful! Do not hesitate to let me know if you have any other questions, steps or clarifications.


    Please don't forget to close the thread by upvoting this answer and accepting it if it is helpful.
