BGP configuration tutorial

Advantages of the BGP routing protocol, basic BGP configuration on a Cisco router, and the BGP aggregate-address and network statements
ghanshyam, India, Professional
Published Date: 21-07-2017
Troubleshooting BGP
BRKRST-3320 14702_05_2008_x1 © 2008 Cisco Systems, Inc. All rights reserved. Cisco Public

Overview

• Troubleshooting Peers
• BGP Convergence
• High Utilization
• BGP Routing Problems

Troubleshooting Peers

BGP Speakers Won’t Peer

• This can be difficult to troubleshoot if you can only see one side of the connection
• Start with the simple things: check for common mistakes
    Is it supposed to be configured for eBGP multihop?
    Are the AS numbers right?
• Next, try pinging the peering address
    If the ping fails, there’s likely a connectivity problem

BGP Speakers Won’t Peer

• Try some alternate ping options
• Is the local peering address the actual peering interface?
    If not, use extended ping to source from the loopback or actual peering address
    If this fails, there is an underlying routing problem
    The other router may not know how to reach your peering interface

    Router>enable
    Router#ping
    Protocol [ip]:
    Target IP address: 192.168.40.1
    Repeat count [5]:
    Datagram size [100]:
    Timeout in seconds [2]:
    Extended commands [n]: y
    Source address or interface: 172.16.23.2
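The order of the first checks above can be read as a small decision procedure. The following Python sketch is illustrative only: the function name and its three boolean inputs are assumptions made for this example, not part of any Cisco tool, and the operator is expected to fill them in by hand after running the checks.

```python
def first_peering_check(config_ok: bool, plain_ping_ok: bool, sourced_ping_ok: bool) -> str:
    """Suggest where to look next, following the order of checks on the slides.

    All three inputs are hypothetical flags supplied by the operator:
      config_ok       - AS numbers and eBGP multihop settings were verified
      plain_ping_ok   - an ordinary ping to the peering address succeeded
      sourced_ping_ok - an extended ping sourced from the local peering
                        address (for example the loopback) succeeded
    """
    if not config_ok:
        return "fix configuration (AS numbers, eBGP multihop)"
    if not plain_ping_ok:
        return "connectivity problem to the peering address"
    if not sourced_ping_ok:
        return "underlying routing problem: peer may lack a route back to the local peering address"
    return "IP reachability looks fine: check MTU, TCP, then BGP itself"


# Plain ping works, but the ping sourced from the peering address fails.
print(first_peering_check(config_ok=True, plain_ping_ok=True, sourced_ping_ok=False))
```

The checks are ordered from cheapest to most involved, which is why configuration is examined before any ping result.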
BGP Speakers Won’t Peer

• Try extended ping to sweep a range of possible MTUs
    Note the MTU at which the ping starts to fail
    Make certain the interface is configured for that MTU size
• If these all fail
    None of the pings work no matter how you try....
    It’s likely a transport problem
    Drop back and punt

    Router>enable
    Router#ping
    Protocol [ip]:
    Target IP address: 192.168.40.1
    Repeat count [5]:
    Datagram size [100]:
    Timeout in seconds [2]:
    Extended commands [n]: y
    Source address or interface:
    Type of service [0]:
    Set DF bit in IP header? [no]:
    Validate reply data? [no]:
    Data pattern [0xABCD]:
    Loose, Strict, Record, Timestamp, Verbose[none]:
    Sweep range of sizes [n]: y
    Sweep min size [36]: 100
    Sweep max size [18024]: 2500
    Sweep interval [1]: 100
    ....

BGP Speakers Won’t Peer

• Remember that BGP runs on top of IP, and can be affected by:
    Rate limiting
    Traffic shaping
    Tunneling problems
    IP reachability problems (the underlying routing isn’t working)
    TCP problems
    Etc.

BGP Speakers Won’t Peer: Useful Peer Troubleshooting Commands

    show tcp brief all
    TCB       Local Address    Foreign Address   (state)
    64316F14  1.1.1.1.12345    2.2.2.2.179       ESTAB
    6431BA8C  *.179            2.2.2.2.*         LISTEN
    62FFDEF4  *.*              *.*               LISTEN

    show tcp statistics
    Rcvd: 7005 Total, 10 no port
          0 checksum error, 0 bad offset, 0 too short
          ....
          0 out-of-order packets (0 bytes)
          ....
          4186 ack packets (73521 bytes)
          ....
    Sent: 9150 Total, 0 urgent packets
          4810 control packets (including 127 retransmitted)
          2172 data packets (71504 bytes)
          ....
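For the MTU sweep described earlier in this section, the useful number is the largest datagram size that still gets through. The sketch below shows one way to pick it out of recorded sweep results. The result dictionary is hypothetical data typed in by hand for illustration; nothing here talks to a router.

```python
from typing import Dict, Optional


def largest_passing_size(sweep: Dict[int, bool]) -> Optional[int]:
    """Return the largest datagram size below the first failure, or None.

    `sweep` maps datagram size in bytes to whether that ping succeeded.
    """
    largest = None
    for size in sorted(sweep):
        if not sweep[size]:
            break  # the first failing size marks where the path MTU is exceeded
        largest = size
    return largest


# Hypothetical results for a sweep from 100 to 2500 bytes in steps of 100,
# on a path where nothing larger than 1500 bytes gets through.
sweep_results = {size: size <= 1500 for size in range(100, 2600, 100)}
print(largest_passing_size(sweep_results))  # prints 1500
```

The interface MTU on the peering link should then be checked against that value.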
BGP Speakers Won’t Peer: Useful Peer Troubleshooting Commands

    debug ip tcp transactions

    R1#sh log | i TCP0:
    TCP0: state was ESTAB -> FINWAIT1 [12345 -> 2.2.2.2(179)]
    TCP0: sending FIN
    TCP0: state was FINWAIT1 -> FINWAIT2 [12345 -> 2.2.2.2(179)]
    TCP0: FIN processed
    TCP0: state was FINWAIT2 -> TIMEWAIT [12345 -> 2.2.2.2(179)]
    TCP0: Connection to 2.2.2.2:179, advertising MSS 1460
    TCP0: state was CLOSED -> SYNSENT [12346 -> 2.2.2.2(179)]
    TCP0: state was SYNSENT -> ESTAB [12346 -> 2.2.2.2(179)]
    TCP0: tcb 6430DCDC connection to 2.2.2.2:179, received MSS 1460, MSS is 1460

• This can be very chatty, so be careful with this debug

BGP Speakers Won’t Peer

• If the connectivity is good, the next step is to check BGP itself
• debug ip bgp
    Use with caution
    Configure so the output goes to the log, rather than the console
      logging buffered <size>
      no logging console
    It’s easier to find the problem points this way
      router#show log | i NOTIFICATION

BGP Speakers Won’t Peer

• show ip bgp neighbor 1.1.1.1 | include last reset
    This should give you the resets for a peer
    The same information as is shown through debug ip bgp
• bgp log-neighbor-changes
    Provides much of the same information as debug ip bgp, as well
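Once the log has been filtered for NOTIFICATION lines, the neighbor, error code, and subcode can be pulled out mechanically. The sketch below is a minimal example that assumes the message layout used in this deck (`%BGP-3-NOTIFICATION: sent to neighbor <address> <code>/<subcode> (<reason>) ...`). Accepting a `received from` variant is an assumption of this example, and the names `Notification` and `parse_notification` are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Pattern for the NOTIFICATION log layout shown in this deck. The
# "received from" alternative is an assumption, not taken from the slides.
NOTIFICATION_RE = re.compile(
    r"%BGP-3-NOTIFICATION: (?P<direction>sent to|received from) neighbor "
    r"(?P<neighbor>\S+) (?P<code>\d+)/(?P<subcode>\d+) \((?P<reason>[^)]*)\)"
)


class Notification(NamedTuple):
    direction: str
    neighbor: str
    code: int
    subcode: int
    reason: str


def parse_notification(line: str) -> Optional[Notification]:
    """Extract the fields of one NOTIFICATION log line, or None if no match."""
    m = NOTIFICATION_RE.search(line)
    if m is None:
        return None
    return Notification(
        m["direction"], m["neighbor"], int(m["code"]), int(m["subcode"]), m["reason"]
    )


line = "%BGP-3-NOTIFICATION: sent to neighbor 1.1.1.1 4/0 (hold time expired) 0 bytes"
print(parse_notification(line))
```

Keeping the numeric code and subcode alongside the text makes it easy to count resets per cause across a long log.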
BGP Speakers Won’t Peer: Source/Destination Address Matching

[Diagram: R1 (loopback 1.1.1.1, link 10.1.1.1) connected to R2 (link 10.1.1.2, loopback 2.2.2.2)]

• Both sides must agree on source and destination addresses
• R1 and R2 do not agree on what addresses to use
• BGP will tear down the TCP session due to the conflict
    Points out configuration problems and adds some security

    R1:
      neighbor 2.2.2.2 remote-as 100
      neighbor 2.2.2.2 update-source loopback 0
    R2:
      neighbor 10.1.1.1 remote-as 100
      neighbor 10.1.1.1 update-source loopback 0

BGP Speakers Won’t Peer: Source/Destination Address Matching

• R2 attempts to open a session to R1
    BGP: 10.1.1.1 open active, local address 2.2.2.2
• R1 denies the session because of the address mismatch
• debug ip bgp on R1 shows
    BGP: 2.2.2.2 passive open to 10.1.1.1
    BGP: 2.2.2.2 passive open failed - 10.1.1.1 is not update-source Loopback0's address (1.1.1.1)

BGP Speakers Won’t Peer: Active vs. Passive Peer

• Active Session
    If the TCP session initiated by R1 is the one used between R1 & R2 then R1 “actively” established the session.
• Passive Session
    For the same scenario R2 “passively” established the session.
• R1 actively opened the session
• R2 passively accepted the session
• Can be configured
    neighbor x.x.x.x transport connection-mode {active | passive}

    R1:
      neighbor 2.2.2.2 remote-as 100
      neighbor 2.2.2.2 transport connection-mode active
    R2:
      neighbor 10.1.1.1 remote-as 100
      neighbor 10.1.1.1 transport connection-mode passive

BGP Speakers Won’t Peer: Active vs. Passive Peer

• Use show ip bgp neighbor to determine if a router actively or passively established a session

    R1#show ip bgp neighbors 2.2.2.2
    BGP neighbor is 2.2.2.2, remote AS 200, external link
      BGP version 4, remote router ID 2.2.2.2
    [snip]
    Local host: 1.1.1.1, Local port: 12343
    Foreign host: 2.2.2.2, Foreign port: 179

• TCP open from R1 to R2’s port 179 established the session
• Tells us that R1 actively established the session

BGP Speakers Won’t Peer: Session Collisions

• Both speakers initiate their sessions at the same time
• The active session established by the peer with the highest router-ID is the winner
    This rarely happens
    Not an issue if this does occur

    R1:
      neighbor 2.2.2.2 remote-as 100
      neighbor 2.2.2.2 transport connection-mode active
    R2:
      neighbor 10.1.1.1 remote-as 100
      neighbor 10.1.1.1 transport connection-mode passive

BGP Speakers Won’t Peer: Time to Live

[Diagram: R1 in AS65000 and R2 in AS65001, comparing the default TTL with a configured TTL]

• BGP uses a TTL of 1 for eBGP peers
• For eBGP peers that are more than 1 hop away a larger TTL must be used
• neighbor x.x.x.x ebgp-multihop <2-255>

    R1#show ip bgp neighbors 2.2.2.2 | inc External BGP
    [snip]
    External BGP neighbor may be up to 1 hops away.
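The active/passive and collision rules described earlier in this section are simple enough to express in a few lines. The sketch below is a hedged illustration, not router code: `session_role` reads the local and foreign TCP ports as they appear in `show ip bgp neighbors`, and `collision_winner` compares router IDs numerically, as the "highest router-ID wins" rule requires. Both function names are made up for this example.

```python
import ipaddress

BGP_PORT = 179  # well-known TCP port that BGP listens on


def session_role(local_port: int, foreign_port: int) -> str:
    """Classify how the local router established the session."""
    if foreign_port == BGP_PORT and local_port != BGP_PORT:
        return "active"   # we connected to the peer's port 179
    if local_port == BGP_PORT and foreign_port != BGP_PORT:
        return "passive"  # the peer connected to our port 179
    return "unknown"


def collision_winner(router_id_a: str, router_id_b: str) -> str:
    """Return the router ID whose actively opened session survives a collision.

    Router IDs are compared as 32-bit numbers, not as strings.
    """
    a = ipaddress.IPv4Address(router_id_a)
    b = ipaddress.IPv4Address(router_id_b)
    return router_id_a if a > b else router_id_b


# Values from the R1 output above: local port 12343, foreign port 179.
print(session_role(12343, 179))
print(collision_winner("1.1.1.1", "2.2.2.2"))
```

Comparing the IDs as numbers matters: a plain string comparison would rank "9.255.255.255" above "10.0.0.1".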
BGP Speakers Won’t Peer: Bad Messages

    %BGP-3-NOTIFICATION: sent to neighbor 2.2.2.2 2/2 (peer in wrong AS) 2 bytes 00C8 FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 002D 0104 00C8 00B4 0202 0202 1002 0601 0400 0100 0102 0280 0002 0202 00

    Subcode                          Meaning
    unknown subcode                  The peer open notification subcode isn’t known
    incompatible BGP version         The version of BGP the peer is running isn’t compatible with the local version of BGP
    peer in wrong AS                 The AS this peer is locally configured for doesn’t match the AS the peer is advertising
    BGP identifier wrong             The BGP router ID is the same as the local BGP router ID
    unsupported optional parameter   There is an option in the packet which the local BGP speaker doesn’t recognize
    authentication failure           The MD5 hash on the received packet does not match the correct MD5 hash
    unacceptable hold time           The remote BGP peer has requested a BGP hold time which is not allowed (too low)
    unsupported/disjoint capability  The peer has asked for support for a feature which the local router does not support

BGP Speaker Flap Case Study

• Here we see a message from bgp log-neighbor-changes telling us the hold timer expired
• We can double check this by looking at show ip bgp neighbor x.x.x.x | include last reset

    %BGP-5-ADJCHANGE: neighbor 10.1.1.1 Down BGP Notification sent
    %BGP-3-NOTIFICATION: sent to neighbor 1.1.1.1 4/0 (hold time expired) 0 bytes
    R2#show ip bgp neighbor 10.1.1.1 | include last reset
    Last reset 00:01:02, due to BGP Notification sent, hold time expired

BGP Speaker Flap Case Study

• There are lots of possibilities here
    R1 has a problem sending keepalives?
    The keepalives are lost in the cloud?
    R2 has a problem receiving the keepalive?

BGP Speaker Flap Case Study

• Did R1 build and transmit a keepalive for R2?
    debug ip bgp keepalive
    show ip bgp neighbor
• When did we last send or receive data with the peer?

    R2#show ip bgp neighbors 1.1.1.1
    BGP neighbor is 1.1.1.1, remote AS 100, external link
      BGP version 4, remote router ID 1.1.1.1
      BGP state = Established, up for 00:12:49
      Last read 00:00:45, last write 00:00:44, hold time is 180, keepalive interval is 60 seconds

• If R1 did not build and transmit a KA
    How is R1 on memory?
    What is R1’s CPU load?
    Is R2’s TCP window open?

BGP Speaker Flap Case Study

    R2#show ip bgp sum | begin Neighbor
    Neighbor  V  AS  MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd
    2.2.2.2   4  2   53       284      10167   0    97    00:02:15  0

    (samples taken at least one BGP keepalive interval apart)

    R2#show ip bgp summary | begin Neighbor
    Neighbor  V  AS  MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd
    2.2.2.2   4  2   53       284      10167   0    98    00:03:04  0

• The number of packets generated is increasing (OutQ goes from 97 to 98)
• But the number of packets transmitted is not increasing (MsgSent stays at 284)
• The keepalives aren’t leaving R2

BGP Speaker Flap Case Study

• Go back to square one and check the IP connectivity
    This is a layer 2 or 3 transport issue, etc.

    R1#ping 10.2.2.2
    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 16/21/24 ms

    R1#ping ip
    Target IP address: 10.2.2.2
    Repeat count [5]:
    Datagram size [100]: 1500
    Timeout in seconds [2]:
    Extended commands [n]:
    Sweep range of sizes [n]:
    Type escape sequence to abort.
    Sending 5, 1500-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
    .....
    Success rate is 0 percent (0/5)

BGP Convergence

BGP Slow Convergence

• Hey, who are you calling slow?
    Slow is a relative term....
    BGP probably won’t ever converge as fast as any of the IGPs
• Two general convergence situations
    Initial startup between peers
    Route changes between existing peers

BGP Slow Convergence: Initial Convergence

• Initial convergence is limited by
    The number of packets required to transfer the entire BGP database
      The number of routes
      The ability of BGP to pack routes into a small number of packets
      The number of peer specific policies
    TCP transport issues
      How often does TCP go into slow start?
      How much can TCP put into one packet?

BGP Slow Convergence: Initial Convergence

• BGP starts a packet by building an attribute set
• It then packs as many destinations (NLRIs) as it can into the packet
    Only destinations with the same attribute set can be placed in the packet
    Destinations can only be put into the packet until it’s full
• First rule of thumb: to increase convergence speed, decrease unique sets of attributes
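The first rule of thumb can be made concrete with a rough count of UPDATE messages. The sketch below groups routes by attribute set and assumes, purely for illustration, that every message holds a fixed number of NLRIs (`nlri_per_update`). Real capacity depends on message size and prefix lengths, so treat the parameter and the function name as hypothetical.

```python
from collections import defaultdict
from math import ceil
from typing import Dict, Hashable, Iterable, Tuple


def updates_needed(routes: Iterable[Tuple[str, Hashable]], nlri_per_update: int) -> int:
    """Estimate how many UPDATE messages are needed to advertise `routes`.

    Each route is (prefix, attribute_set). Only prefixes that share an
    attribute set can travel in the same message.
    """
    per_attribute_set: Dict[Hashable, int] = defaultdict(int)
    for _prefix, attrs in routes:
        per_attribute_set[attrs] += 1
    return sum(ceil(count / nlri_per_update) for count in per_attribute_set.values())


prefixes = [f"10.{i // 256}.{i % 256}.0/24" for i in range(1000)]

# Every route shares one attribute set: packing works well.
shared = [(p, ("AS100", "igp")) for p in prefixes]
# Every route has its own attribute set (for example a unique community).
unique = [(p, ("AS100", "igp", i)) for i, p in enumerate(prefixes)]

print(updates_needed(shared, nlri_per_update=100))
print(updates_needed(unique, nlri_per_update=100))
```

With the same 1000 prefixes, the shared case needs a handful of messages while the unique case needs one message per prefix, which is the effect the rule of thumb warns about.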
[Diagram: less efficient packing carries one NLRI per attribute set; more efficient packing carries several NLRIs behind a single attribute set]

BGP Slow Convergence: Initial Convergence

• The larger the packet BGP can build, the more destinations it can put in the packet
    The more you can put in a single packet, the less often you have to repeat the same attributes
• Second rule of thumb: allow BGP to use the largest packets possible

BGP Slow Convergence: Initial Convergence

• BGP must create packets based on the policies towards each peer
• Third rule of thumb: minimize the number of unique policies towards eBGP peers

[Diagram: less efficient and more efficient combinations of NLRIs and attributes, per packet and per peer policy]

BGP Slow Convergence: Initial Convergence

• TCP Interactions
    Each time a TCP packet is dropped, the session goes into slow start
    It takes a good deal of time for a TCP session to come out of slow start
• Fourth rule of thumb: try to reduce the circumstances under which a TCP segment will be dropped during initial convergence
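For the third rule of thumb, a quick way to see how much work unique policies create is to group peers by their outbound policy: each distinct group needs its own set of packets. The sketch below is a simplified model. The peer addresses and route-map names are invented for the example, and representing a policy as a tuple of names is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Policy = Tuple[str, ...]


def group_by_policy(peer_policies: Dict[str, Policy]) -> Dict[Policy, List[str]]:
    """Group peers that share an identical outbound policy."""
    groups: Dict[Policy, List[str]] = defaultdict(list)
    for peer, policy in sorted(peer_policies.items()):
        groups[policy].append(peer)
    return dict(groups)


# Hypothetical eBGP peers and outbound route-map names.
peers = {
    "192.0.2.1": ("RM-CUSTOMER-OUT",),
    "192.0.2.2": ("RM-CUSTOMER-OUT",),
    "198.51.100.1": ("RM-PEER-OUT",),
}

groups = group_by_policy(peers)
print(len(groups), "distinct policies for", len(peers), "peers")
```

The fewer groups there are relative to the number of peers, the more often one set of packets can be reused.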
BGP Slow Convergence: Initial Convergence

• Bottom line:
    Hold down the number of unique attributes per route
      Don’t send communities if you don’t need to, etc.
    Hold down the number of policies towards eBGP peers
      Try to find a small set of common policies, rather than individualizing policies per peer
    Stop TCP segment drops
      Increase input queues
      Increase SPD thresholds
      Make certain links are clean

BGP Slow Convergence: Initial Convergence

• Here we see the results of setting up maximum sized input queues
    A single router running 12.0(18)S
    100 to 500 peers in a single peer group
    Sending 100,000+ routes to each peer
• Increasing the input queue sizes
    Reduced the input queue drops by a factor of 10
    Reduced convergence time by 50%

[Chart, 12.0(18)S: convergence time (minutes, 0 to 20) and input queue drops (0 to 250K) versus peer group members (100 to 500)]

BGP Slow Convergence: Initial Convergence

• TCP path MTU discovery allows BGP to use the largest packets possible
• Without PMTU discovery, we can support 100 peers with 120,000 routes each
• With PMTU discovery, we can support 175 peers with 120,000 routes each
• Note this is 12.0(18)S; Cisco IOS Software can support more than this now

[Chart: supported peers (0 to 350) versus routes (80K to 120K)]

BGP Slow Convergence: Route Change Convergence

• There are two elements to route change convergence for BGP
    How long does it take to see the failure?
    How long does it take to propagate information about the failure?
• For faster peer down detection, there are several tools you can use
    Fast layer two down detection
    Fast external fallover for directly connected eBGP peers
    Faster keepalive and dead interval timers
      Values as low as 3 and 9 seconds are commonly used today

BGP Slow Convergence: Route Change Convergence

• Fast Session Deactivation
    The address of each peer is registered with the Address Tracking Filter (ATF) system
    When the state of the route changes, ATF notifies BGP
    BGP tears down the peer impacted
    BGP does not wait on the hold timer to expire

[Diagram: eBGP multihop session. A link fails and the IGP converges; ATF notifies BGP; BGP tears down the eBGP session]

BGP Slow Convergence: Route Change Convergence

• Very dangerous for iBGP peers
    IGP may not have a route to a peer for a split second
    FSD would tear down the BGP session
    Imagine if you lose your IGP route to your RR (Route Reflector) for just 100ms
• Off by default
    neighbor x.x.x.x fall-over

BGP Slow Convergence: Route Change Convergence

• ATF can also be used to track changes in next hops
    iBGP recurses onto an IGP next hop to find a path through the local AS
    Changes in the IGP cost or reachability are normally seen only by the BGP scanner
    Since the scanner runs every 60 seconds, by default, this means iBGP convergence can take up to 60 seconds on an IGP change....
BGP Slow Convergence: Route Change Convergence

• BGP Next Hop Tracking
    Enabled by default
      [no] bgp nexthop trigger enable
• BGP registers all nexthops with ATF
    Hidden command will let you see a list of nexthops
      show ip bgp attr nexthop
• ATF will let BGP know when a route change occurs for a nexthop
• ATF notification will trigger a lightweight “BGP Scanner” run
    Bestpaths will be calculated
    None of the other “Full Scan” work will happen

BGP Slow Convergence: Route Change Convergence

• Once an ATF notification is received BGP waits 5 seconds before triggering NHT scan
    bgp nexthop trigger delay <0-100>
    May lower default value as we gain experience
• Event driven model allows BGP to react quickly to IGP changes
    No longer need to wait as long as 60 seconds for BGP to scan the table and recalculate bestpaths
    Tuning your IGP for fast convergence is recommended
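The difference between the periodic scanner and next hop tracking comes down to simple timing arithmetic. The sketch below compares when BGP would react to an IGP change under each model. It assumes, for illustration only, that periodic scans fall on exact multiples of the scan interval measured from time zero; the function names are made up for this example, and the 60 second and 5 second defaults are the values quoted on the slides.

```python
import math


def scanner_reaction_time(change_at: float, scan_interval: float = 60.0) -> float:
    """Time of the first periodic scan at or after the IGP change.

    Assumes scans run at 0, scan_interval, 2 * scan_interval, ...
    """
    return math.ceil(change_at / scan_interval) * scan_interval


def nht_reaction_time(change_at: float, trigger_delay: float = 5.0) -> float:
    """Time at which the event-driven next hop tracking scan starts."""
    return change_at + trigger_delay


change = 61.0  # the IGP change happens just after a periodic scan
print(scanner_reaction_time(change) - change)  # seconds waited with the scanner
print(nht_reaction_time(change) - change)      # seconds waited with NHT
```

With the periodic scanner the wait depends on where the change falls within the scan cycle, while with next hop tracking it is always the configured trigger delay.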