This post represents the solution and explanation for quiz-21.
Have a look at it to test your knowledge. ☺
Quiz Review
A large enterprise consisting of multiple remote sites uses a private MPLS cloud, with EIGRP as the PE-CE protocol and MPLS L3 VPNs to achieve the necessary connectivity.
Of particular interest, Site-A and Site-B have a backdoor link between them.
Everything works as desired until a new request reaches the network department: a new Site-ABC will be connected to PE-2 and users in this site will mostly connect to resources behind Site-A / CE-1 (192.168.1.55).
The requirement is to make sure that this traffic (from PE-2 to CE-1) uses the backdoor link instead of the MPLS cloud.
A simple investigation shows that in the current setup, traffic from PE-2 to CE-1 goes via the MPLS cloud:
PE-2#traceroute vrf CUST_A 192.168.1.55
Type escape sequence to abort.
Tracing the route to 192.168.1.55
  1 10.0.0.6 [MPLS: Labels 16/19 Exp 0] 60 msec 60 msec 40 msec
  2 192.168.1.1 [MPLS: Label 19 Exp 0] 36 msec 36 msec 40 msec
  3 192.168.1.2 44 msec * 20 msec
PE-2#
Problem Statement
The network engineer tries to understand the current routing status for destination 192.168.1.55 and finds out that PE-2 prefers the BGP path versus the EIGRP one:
PE-2#sh ip route vrf CUST_A 192.168.1.55
Routing entry for 192.168.1.55/32
  Known via "bgp 100", distance 200, metric 156160, type internal
  Redistributing via eigrp 100
  Advertised by eigrp 100 metric 100000 10 255 1 1500
                bgp 100 (self originated)
  Last update from 10.255.255.1 00:22:30 ago
  Routing Descriptor Blocks:
  * 10.255.255.1 (Default-IP-Routing-Table), from 10.255.255.1, 00:22:30 ago
      Route metric is 156160, traffic share count is 1
      AS Hops 0
PE-2#
PE-2#sh bgp vpnv4 uni all 192.168.1.55
BGP routing table entry for 100:1:192.168.1.55/32, version 22
Paths: (1 available, best #1, table CUST_A)
  Not advertised to any peer
  Local
    10.255.255.1 (metric 3) from 10.255.255.1 (10.255.255.1)
      Origin incomplete, metric 156160, localpref 100, valid, internal, best
      Extended Community: RT:100:1 Cost:pre-bestpath:128:156160
        0x8800:32768:0 0x8801:100:130560 0x8802:65281:25600 0x8803:65281:1500
      mpls labels in/out nolabel/19
PE-2#
He tries to influence the BGP path selection by setting a high local preference on the redistributed EIGRP routes, but unfortunately PE-2 still chooses the prefix received over MPLS as the best path:
ip access-list standard PE1_LOOPBACK
 permit 192.168.1.55
!
route-map SET_LP_500 permit 10
 match ip address PE1_LOOPBACK
 set local-preference 500
route-map SET_LP_500 permit 999
!
router bgp 100
 address-fam ipv4 vrf CUST_A
  redistribute eigrp 100 route-map SET_LP_500
PE-2#sh bgp vpnv4 uni all 192.168.1.55
BGP routing table entry for 100:1:192.168.1.55/32, version 22
Paths: (1 available, best #1, table CUST_A)
  Not advertised to any peer
  Local
    10.255.255.1 (metric 3) from 10.255.255.1 (10.255.255.1)
      Origin incomplete, metric 156160, localpref 100, valid, internal, best
      Extended Community: RT:100:1 Cost:pre-bestpath:128:156160
        0x8800:32768:0 0x8801:100:130560 0x8802:65281:25600 0x8803:65281:1500
      mpls labels in/out nolabel/19
!
! the prefix received over MPLS (with default LP = 100) is still chosen as best !
! although the redistributed one has LP = 500 !
As most of you already answered in the quiz, the reason Local Preference cannot influence the BGP best path selection here is the presence of the Cost Community, as seen in this line: Cost:pre-bestpath:128:156160
What's that ?
Pre-bestpath Cost Community
Pre-bestpath is a non-transitive extended community that Cisco introduced to influence the BGP best path selection in an arbitrary fashion: the decision can be taken on local criteria after part of the normal process has been computed, or even before it starts. In some cases, especially in situations with backdoor links, this can also help against routing loops.
This is not (yet, and probably never will be) part of an RFC standard, as the proposed document is still in draft status. The draft is called "BGP Custom Decisions".
To achieve such a custom decision, this community uses a Point of Insertion (POI):
- POI = 128, use Cost Community before anything else
- POI = 129, use Cost Community after the IGP cost to next-hop has been compared
- POI = 130, use Cost Community after the paths advertised by BGP speakers in a neighboring autonomous system (if any) have been selected
- POI = 131, use Cost Community after BGP IDs have been compared
Out of all these, Cisco implemented only POI = 129 (IGP), which is the default, and POI = 128, which represents the ABSOLUTE_VALUE.
EIGRP and Cost Community (Pre-bestpath)
Before presenting the solutions for the quiz, let's review some of the characteristics of EIGRP used as PE-CE protocol in relation with the pre-bestpath cost community:
- by default, EIGRP routes redistributed into BGP automatically get the Cost Community with POI 128 => this means that the cost value is compared before any other path attribute (including weight). The community-ID is also 128.
- the value/cost of the pre-bestpath community is the composite metric of the redistributed EIGRP route
- routes without this cost community are evaluated as if they had a cost value of 2147483647, which represents half of the maximum possible value
- MP-BGP uses another set of extended communities to transport the EIGRP metric values from one PE to another:
- 0x8800 = Route Flag and Tag
- 0x8801 = AS Number and Delay
- 0x8802 = Reliability, Next Hop, and Bandwidth
- 0x8803 = Reserve, Load and MTU
- 0x8804 = (for external routes) Remote AS Number and Remote ID
- 0x8805 = (for external routes) Remote Protocol and Remote Metric
- the MP-BGP cloud is interpreted as a metric zero (0)
For example, 0x8801 AS Number determines if the prefix will be redistributed as internal (same AS number) or external (different AS numbers).
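The relationship between these communities and the pre-bestpath cost can be checked with a few lines of Python. This is only a sketch: it assumes default EIGRP K-values (so the composite metric is just scaled bandwidth plus scaled delay) and that the last number in 0x8801 and 0x8802 carries the delay and the bandwidth respectively. The function name `composite_metric` is made up for illustration; the two input strings are copied from the PE-2 outputs in this post.

```python
import re


def composite_metric(ext_communities: str) -> int:
    """Rebuild the EIGRP composite metric from the 0x88xx communities.

    Assumption: default K-values (K1 = K3 = 1, the rest 0), so the
    composite metric is simply scaled bandwidth + scaled delay.
    """
    fields = {}
    for code, first, second in re.findall(r"(0x88\d\d):(\d+):(\d+)", ext_communities):
        fields[code] = (int(first), int(second))
    delay = fields["0x8801"][1]      # 0x8801 = AS number : delay
    bandwidth = fields["0x8802"][1]  # 0x8802 = reliability/next hop : bandwidth
    return bandwidth + delay


# Path received from PE-1 over MP-BGP
via_mpls = "0x8800:32768:0 0x8801:100:130560 0x8802:65281:25600 0x8803:65281:1500"
# Path learned from CE-2 and redistributed locally on PE-2
via_backdoor = "0x8800:32768:0 0x8801:100:133120 0x8802:65282:25600 0x8803:65281:1500"

print(composite_metric(via_mpls))      # 156160
print(composite_metric(via_backdoor))  # 158720
```

Both results match the cost values that show up in Cost:pre-bestpath:128:156160 and Cost:pre-bestpath:128:158720.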
Now let's put it all together and reveal what happens behind the scenes. As you can see in the picture below, PE-2 has the following information in its BGP table and will try to find the best path:
- prefix 192.168.1.55 received from PE-1 over the MP-BGP with a pre-bestpath community of 128:156160 - this value represents the composite metric of the EIGRP route at the moment it was redistributed from EIGRP to BGP on PE-1
- the MPLS cloud does not modify this cost (MPLS cloud is transparent)
- prefix 192.168.1.55 received from CE-2 over the EIGRP gets redistributed into BGP and immediately receives a pre-bestpath community of 128:158720 - this value represents the composite EIGRP metric at this point
- because of the pre-bestpath community, the MP-BGP path is selected as the best path (its cost of 156160 is lower than 158720), even though the locally redistributed one has a weight of 32768 (the default weight for all locally originated routes) - as explained, weight does not count when pre-bestpath exists
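The decision described in the list above can be modelled roughly as follows. This is a simplified sketch, not the real BGP implementation: only the pre-bestpath cost, weight and local preference are compared, and the `Path` class and `best_path` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_COST = 2147483647  # assumed for paths that carry no cost community


@dataclass
class Path:
    name: str
    weight: int
    local_pref: int
    pre_bestpath_cost: Optional[int] = None  # cost at POI 128, if present


def best_path(paths):
    def key(p):
        cost = DEFAULT_COST if p.pre_bestpath_cost is None else p.pre_bestpath_cost
        # Lowest cost wins first; only on a tie do the higher weight
        # and then the higher local preference matter.
        return (cost, -p.weight, -p.local_pref)
    return min(paths, key=key)


mpls = Path("via MP-BGP (PE-1)", weight=0, local_pref=100, pre_bestpath_cost=156160)
backdoor = Path("via EIGRP (CE-2)", weight=32768, local_pref=500, pre_bestpath_cost=158720)

print(best_path([mpls, backdoor]).name)  # via MP-BGP (PE-1)
```

Even with weight 32768 and local preference 500, the backdoor path loses because the comparison never gets past the cost value.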
Quiz Solutions
Now, knowing that the Pre-bestpath Cost Community modifies the normal BGP best path selection process by considering the value of this community (the cost) before anything else is compared (due to the ABSOLUTE point of insertion of 128), it becomes obvious that modifying any of the "classic" path attributes, such as Local Preference, AS_PATH, MED or even Weight, will not help.
The solutions would have to find a way to modify the pre-bestpath cost or to disable this community. Let's see them in action !
1. Change pre-bestpath on PE-2
One method to get the result we want is to modify the pre-bestpath community on PE-2 during redistribution from EIGRP into MP-BGP. Since we cannot use the same community-ID of 128 (because it gets overwritten by the redistribution process), I will use a lower community-ID (1 in the example below) and an arbitrary cost value (9999999), which is lower than the default of 2147483647 assumed for the MP-BGP path that does not carry this community-ID. According to the RFC draft: "the Cost Community with the lowest Community-ID is considered first".
PE-2#sh run | s access-list|route-map|router bgp
ip access-list standard CE1_LOOPBACK
 permit 192.168.1.55
!
route-map SET_EXT_COST_COMMUNITY permit 10
 match ip address CE1_LOOPBACK
 set extcommunity cost pre-bestpath 1 9999999
route-map SET_EXT_COST_COMMUNITY permit 99
!
!
router bgp 100
 address-family ipv4 vrf CUST_A
  redistribute eigrp 100 route-map SET_EXT_COST_COMMUNITY
PE-2#sh bgp vpnv4 uni all 192.168.1.55
BGP routing table entry for 100:1:192.168.1.55/32, version 8
Paths: (1 available, best #1, table CUST_A)
  Advertised to update-groups:
     1
  Local
    192.168.2.2 from 0.0.0.0 (10.255.255.2)
      Origin incomplete, metric 158720, localpref 100, weight 32768, valid, sourced, best
      Extended Community: RT:100:1 Cost:pre-bestpath:1:9999999
        Cost:pre-bestpath:128:158720 0x8800:32768:0 0x8801:100:133120
        0x8802:65282:25600 0x8803:65281:1500
      mpls labels in/out 33/nolabel
PE-2#
PE-2#traceroute vrf CUST_A 192.168.1.55
Type escape sequence to abort.
Tracing the route to 192.168.1.55
  1 192.168.2.2 64 msec 28 msec 12 msec
  2 192.168.12.1 44 msec * 24 msec
PE-2#
Note that the pre-bestpath:128:<eigrp_metric> community also gets added during redistribution.
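Why community-ID 1 wins here can be sketched in Python. The model assumes the behaviour quoted from the draft (lowest community-ID considered first) together with the default cost of 2147483647 for a path that lacks a given community-ID; the `prefer` function and the dictionary layout are illustrative only.

```python
DEFAULT_COST = 2147483647  # cost assumed when a path lacks a given community-ID


def prefer(a: dict, b: dict) -> str:
    """Compare two paths given as {community_id: cost} at the same POI.

    Community-IDs are walked from lowest to highest; the first ID where
    the costs differ decides, and the lower cost wins.
    """
    for community_id in sorted(set(a) | set(b)):
        cost_a = a.get(community_id, DEFAULT_COST)
        cost_b = b.get(community_id, DEFAULT_COST)
        if cost_a != cost_b:
            return "a" if cost_a < cost_b else "b"
    return "tie"


local_redistributed = {1: 9999999, 128: 158720}  # path via the backdoor link
received_over_mpls = {128: 156160}               # path from PE-1

print(prefer(local_redistributed, received_over_mpls))  # a
```

The comparison stops at community-ID 1, where 9999999 beats the assumed default, so the ID 128 values are never looked at.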
2. Change pre-bestpath on PE-1
A similar solution to the one above, but this time we play with the pre-bestpath cost community between the BGP peers:
PE-1#sh run | s access-list|route-map|router b
ip access-list standard CE1_LOOPBACK
 permit 192.168.1.55
!
route-map SET_EXT_COMM permit 10
 match ip address CE1_LOOPBACK
 set extcommunity cost pre-bestpath 128 7777777
route-map SET_EXT_COMM permit 999
!
!
router bgp 100
 address-family vpnv4
  neighbor 10.255.255.2 route-map SET_EXT_COMM out
PE-2#sh bgp vpnv4 uni all 192.168.1.55
BGP routing table entry for 100:1:192.168.1.55/32, version 36
Paths: (2 available, best #2, table CUST_A)
Flag: 0x820
  Advertised to update-groups:
     1
  Local
    10.255.255.1 (metric 3) from 10.255.255.1 (10.255.255.1)
      Origin incomplete, metric 156160, localpref 100, valid, internal
      Extended Community: RT:100:1 Cost:pre-bestpath:128:7777777
        0x8800:32768:0 0x8801:100:130560 0x8802:65281:25600 0x8803:65281:1500
      mpls labels in/out 24/20
  Local
    192.168.2.2 from 0.0.0.0 (10.255.255.2)
      Origin incomplete, metric 158720, localpref 100, weight 32768, valid, sourced, best
      Extended Community: RT:100:1 Cost:pre-bestpath:128:158720
        0x8800:32768:0 0x8801:100:133120 0x8802:65282:25600 0x8803:65281:1500
      mpls labels in/out 24/nolabel
PE-2#
Note that pre-bestpath:128:7777777 overwrites the initial community, as you cannot have two communities with the same point of insertion (128) and the same community-ID (128).
In this case, the comparison is done between the same POI and community-ID (128), and the EIGRP redistributed route wins because it has a lower cost (158720) than the one received over MP-BGP (7777777).
3. Increase metrics using Offset-lists
Since the MPLS cloud is transparent for the EIGRP metric carried from PE-1 to PE-2, another solution to the quiz would be to modify the composite metric just before entering BGP, on PE-1, with an offset-list:
PE-1#sh run | s access-list|router eigrp
ip access-list standard CE1_LOOPBACK
 permit 192.168.1.55
!
!
router eigrp 1
 address-family ipv4 vrf CUST_A
  offset-list CE1_LOOPBACK in 1000000 FastEthernet0/0
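The expected effect of the offset can be estimated with simple arithmetic. The sketch below assumes that the offset is added as-is to the composite metric PE-1 computes for the prefix, and that this new metric becomes the pre-bestpath cost when the route is redistributed into BGP; the resulting number was not captured in the outputs of this post, so treat it as an estimate.

```python
def metric_after_offset(metric: int, offset: int) -> int:
    """Composite metric after an inbound offset-list (assumed to be purely additive)."""
    return metric + offset


pe1_cost = metric_after_offset(156160, 1000000)  # what PE-1 would redistribute into BGP
pe2_local_cost = 158720                          # PE-2's own redistributed EIGRP route

print(pe1_cost)                   # 1156160
print(pe1_cost > pe2_local_cost)  # True
```

With the higher cost on the MP-BGP path, the locally redistributed route on PE-2 would have the lower pre-bestpath cost and be preferred.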
4. Disabling the Pre-bestpath Behaviour
The last solution to this quiz is to disable the pre-bestpath behaviour. The command "bgp bestpath cost-community ignore" tells the router to ignore the presence of the pre-bestpath community and to follow the normal best path selection process.
This is the least recommended solution because you have to apply this command on all BGP speakers, which is not scalable.
PE-1(config)#router bgp 100
PE-1(config-router)#bgp bestpath cost-community ignore
PE-1(config-router)#^Z
!
!
PE-2(config)#router bgp 100
PE-2(config-router)#bgp bestpath cost-community ignore
PE-2(config-router)#^Z
PE-2#sh bgp vpnv4 uni all 192.168.1.55
BGP routing table entry for 100:1:192.168.1.55/32, version 3
Paths: (2 available, best #2, table CUST_A)
Flag: 0x820
  Advertised to update-groups:
     1
  Local
    10.255.255.1 (metric 3) from 10.255.255.1 (10.255.255.1)
      Origin incomplete, metric 156160, localpref 100, valid, internal
      Extended Community: RT:100:1 Cost:pre-bestpath:128:156160
        0x8800:32768:0 0x8801:100:130560 0x8802:65281:25600 0x8803:65281:1500
      mpls labels in/out 19/18
  Local
    192.168.2.2 from 0.0.0.0 (10.255.255.2)
      Origin incomplete, metric 158720, localpref 100, weight 32768, valid, sourced, best
      Extended Community: RT:100:1 Cost:pre-bestpath:128:158720
        0x8800:32768:0 0x8801:100:133120 0x8802:65282:25600 0x8803:65281:1500
      mpls labels in/out 19/nolabel
PE-2#traceroute vrf CUST_A 192.168.1.55
Type escape sequence to abort.
Tracing the route to 192.168.1.55
  1 192.168.2.2 24 msec 16 msec 20 msec
  2 192.168.12.1 44 msec * 52 msec
PE-2#
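The effect of the ignore knob can be illustrated with a toy model in Python. It is a deliberately reduced view of best path selection (cost community, then weight, then local preference), and the `best` function with its `ignore_cost_community` flag is a hypothetical stand-in for the router behaviour, not an actual API.

```python
def best(paths, ignore_cost_community=False):
    """Return the name of the preferred path in a reduced best-path model."""
    def key(p):
        normal_process = (-p["weight"], -p["local_pref"])  # higher values win
        if ignore_cost_community:
            return normal_process
        return (p["cost"],) + normal_process  # lower cost is checked first
    return min(paths, key=key)["name"]


paths = [
    {"name": "MP-BGP", "cost": 156160, "weight": 0, "local_pref": 100},
    {"name": "backdoor", "cost": 158720, "weight": 32768, "local_pref": 100},
]

print(best(paths))                              # MP-BGP
print(best(paths, ignore_cost_community=True))  # backdoor
```

Once the cost is out of the picture, the weight of 32768 on the locally sourced route decides, which matches the output above where both paths still carry their cost communities but path #2 is best.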
This brings the end to another veeeery long post.
Thank you for all your comments and inputs in the quiz !