Your company has three sites, each with a dedicated border router: R1, R2 and R3.
Site-1 (R1) and Site-2 (R2) have their own Internet uplinks, but Site-3 (R3) reaches the Internet via R2. A GRE tunnel is built between R2 and R3 and is configured with an MTU of 1440 because of constraints in the transit network between them.
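
To get a feel for what a 1440-byte tunnel MTU means for TCP traffic, here is a minimal sketch of the arithmetic. It assumes IPv4 and TCP headers without options (20 bytes each) and hosts sitting on standard 1500-byte Ethernet; neither is stated in the quiz, and the function names are made up for illustration.

```python
# Assumed header sizes (not given in the quiz): IPv4 and TCP without options.
IPV4_HEADER = 20  # bytes
TCP_HEADER = 20   # bytes

TUNNEL_MTU = 1440    # from the quiz: MTU configured on the GRE tunnel
ETHERNET_MTU = 1500  # assumption: default MTU on the hosts' LAN


def max_tcp_payload(mtu: int) -> int:
    """Largest TCP payload that fits in a single IP packet of size `mtu`."""
    return mtu - IPV4_HEADER - TCP_HEADER


def needs_fragmentation(ip_packet_len: int, mtu: int) -> bool:
    """True if an IP packet of this length does not fit the link MTU."""
    return ip_packet_len > mtu


print(max_tcp_payload(TUNNEL_MTU))                    # 1400
print(max_tcp_payload(ETHERNET_MTU))                  # 1460
print(needs_fragmentation(ETHERNET_MTU, TUNNEL_MTU))  # True
```

Under these assumptions, a host that fills a 1500-byte packet sends more than the tunnel can carry in one piece, so such packets have to be fragmented (or dropped, if they cannot be).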

Here are the details of the network configuration:

  • for backup purposes, a backdoor link exists between R1 and R2
  • R2 performs NAT for all internal addresses of Site-2 and Site-3 (172.16.0.0/12 & 192.168.0.0/16) for traffic that is sent toward the Internet
  • the main server, 1.1.1.10, which runs in Site-1 (behind R1), hosts two applications that use TCP 1001 and TCP 1002
  • since the TCP 1001 application consumes a lot of bandwidth, Policy Based Routing (PBR) was configured on R2 to forward TCP 1001 over the backdoor link (so that Internet access for users in Site-2 is not impacted)
  • traffic for the TCP 1002 application (and for other potential applications) will be NAT-ed and sent over the Internet toward the server in HQ
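
The intended behavior of R2 described in the list above can be sketched as a small decision function. This is only a model of the stated intent, not the actual router configuration (which is in the figure): the match on destination address and port, the `Flow` type and the return labels are all assumptions made for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Values taken from the quiz text.
INTERNAL_NETS = [ip_network("172.16.0.0/12"), ip_network("192.168.0.0/16")]
SERVER = ip_address("1.1.1.10")
PBR_PORT = 1001


@dataclass(frozen=True)
class Flow:
    """Hypothetical description of a packet's relevant header fields."""
    src: str
    dst: str
    protocol: str
    dst_port: int


def is_internal(addr: str) -> bool:
    """True if the address belongs to the ranges R2 performs NAT for."""
    a = ip_address(addr)
    return any(a in net for net in INTERNAL_NETS)


def r2_forwarding_decision(flow: Flow) -> str:
    """Model of the intended policy on R2 (assumed match criteria)."""
    # PBR: TCP 1001 toward the server is steered over the backdoor link.
    if (flow.protocol == "tcp"
            and ip_address(flow.dst) == SERVER
            and flow.dst_port == PBR_PORT):
        return "backdoor"
    # Everything else from internal sources is NAT-ed toward the Internet.
    if is_internal(flow.src):
        return "internet-nat"
    return "internet"


print(r2_forwarding_decision(Flow("172.16.1.10", "1.1.1.10", "tcp", 1001)))
print(r2_forwarding_decision(Flow("172.16.1.10", "1.1.1.10", "tcp", 1002)))
```

With the Site-3 user address from the quiz, the first call returns `backdoor` and the second `internet-nat`, which matches the two paths described above.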

quiz-22 PBR Problem or Not?

After you applied the configuration in the figure above, the users in Site-3 (172.16.1.10) tried to upload data to the application server and sent you the following feedback:

  • TCP 1001 works OK, using the backdoor link
  • TCP 1002 does not work: connections from Site-3 to server 1.1.1.10 get established, but the data transfer stalls soon afterwards and eventually times out

You check the PBR configured on R2 and everything looks all right:

  • TCP 1001 is forwarded over the backdoor link and works fine
  • TCP 1002 does not match the PBR, so it gets NAT-ed and forwarded to the server over the Internet (which is what you want/expect)

As a last resort, you installed a sniffer and captured all incoming traffic on R2 sent by R3. Your findings were the following:

  • the TCP session (SYN/SYN-ACK/ACK) gets established for both TCP 1001 and TCP 1002
  • you notice a lot of fragments and retransmissions in the data transfers of both applications, TCP 1001 and TCP 1002
  • TCP 1001 finishes the data transfer with FIN/FIN-ACK (and the customer confirms that TCP 1001 works OK)
  • TCP 1002 transfers get stuck and there is no FIN/FIN-ACK (and the customer complains that TCP 1002 is not working)

Here are some snapshots of the captured traffic:

TCP 1001 (working)
TCP 1002 (not working)

What is the problem and how can you solve it?

Post your answer in the 'Comments' section below and subscribe to this blog to get the detailed solution and more interesting quizzes.