Monday, 20 August 2012

BGP Diverse-Path for a faster convergence


The BGP implementation in Junos is event-driven, while in IOS it is timer-based and requires the scan process to walk through the BGP table and select the best path to install into the RIB. The bgp scan-time command controls this interval, with a default value of 60 seconds.
In a large-scale BGP scenario, where route reflectors are usually involved, this means that in the worst case the convergence time can be up to 120 seconds, because the route reflector must converge and send its BGP update before the client has a consistent BGP table and can compute the new best path and update the RIB.
This is because route reflectors distribute only the best path to their clients.
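The 120-second figure is simple arithmetic, and a minimal sketch makes the assumption behind it explicit: in the worst case each router in the chain (route reflector, then client) waits a full scan interval, one after the other. The function name and parameters are illustrative only.

```python
def worst_case_convergence(scan_time_s: int = 60, sequential_scans: int = 2) -> int:
    """Worst-case delay when each router in the chain waits a full scan interval."""
    return scan_time_s * sequential_scans


# Route reflector scan followed by client scan.
print(worst_case_convergence())
# Only the client scan is on the critical path if the client already holds a backup path.
print(worst_case_convergence(sequential_scans=1))
```

With the default scan time this gives 120 seconds for the two-step case and 60 seconds when the route reflector is no longer on the critical path.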

In a layer-3 MPLS VPN scenario this problem is solved by using different Route Distinguishers, which create non-comparable entries on the route reflectors and so allow both routes to be reflected to all clients. This moves the best-path selection process to the clients, eliminating the intermediate convergence step on the route reflectors.
But how can the problem be solved in the global routing table?
Different approaches have been proposed, and a wonderful discussion can be found here:

http://blog.ine.com/2010/11/22/understanding-bgp-convergence/

I already use the Add-Path extension ( http://tools.ietf.org/html/draft-ietf-idr-add-paths-07 ), which permits multiple next hops for the same prefix. This allows load balancing in addition to fast convergence thanks to direct next-hop tracking, but the approach requires support for this new BGP capability and usually MPLS encapsulation on the backbone, to prevent IP lookups and possible routing loops on transit nodes.
BGP Diverse-Path ( http://tools.ietf.org/html/draft-ietf-grow-diverse-bgp-path-dist-08 ) is not a new capability: it relies on knowledge of the topology and uses the existing attributes of a typical RR BGP cluster. One cluster member is selected as a "shadow" route reflector and, instead of reflecting the best path (which is reflected by the other route reflectors in the cluster), it announces the backup path to its clients. It is also important to note that, like all other routers in the backbone, it still installs the best path into its own RIB for traffic forwarding.

Now all backbone routers have at least two iBGP peering sessions to the RR cluster: the first to the regular route reflector and the other to the shadow RR.
The BGP table on the RR clients now contains both the best and the backup path, allowing a local calculation of the best path. This removes the need to wait for the route reflector to converge, halving the total convergence time.
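The division of labour between the regular and the shadow route reflector can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `rank` field is a stand-in for the whole BGP decision process, and the `Path` class and function names are hypothetical. The prefix and next hops are the ones used in the lab further down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    prefix: str
    next_hop: str
    rank: int  # lower is better; a stand-in for the real BGP decision process


def ranked(paths, alive):
    """Paths whose next hop is reachable, best first."""
    return sorted((p for p in paths if p.next_hop in alive), key=lambda p: p.rank)


def regular_rr_advertises(paths, alive):
    r = ranked(paths, alive)
    return r[0] if r else None


def shadow_rr_advertises(paths, alive):
    r = ranked(paths, alive)
    return r[1] if len(r) > 1 else None


def client_best(received, alive):
    r = ranked([p for p in received if p is not None], alive)
    return r[0] if r else None


paths = [Path("FD00:5::/64", "FD00::2", 0), Path("FD00:5::/64", "FD00::4", 1)]
alive = {"FD00::2", "FD00::4"}

# The client learns one path from each reflector and keeps both in its BGP table.
received = [regular_rr_advertises(paths, alive), shadow_rr_advertises(paths, alive)]

before = client_best(received, alive)
# The primary next hop fails: the client picks the backup without any new update.
after = client_best(received, alive - {"FD00::2"})
print(before.next_hop, "->", after.next_hop)
```

The point of the sketch is the last step: the client switches to the backup path using only what it already received, without waiting for either reflector to recompute.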

This behavior is not new: in the past it was achieved with IGP metric manipulation on the shadow route reflector (because in these cases the tie-breaker of the best-path selection process is the IGP metric), but some IOS images now include support for building this architecture in a simple manner.
The last step to speed up convergence is to eliminate the scan time and tie reconvergence to next-hop availability. This is done with the next-hop tracking feature, which tracks the reachability of the next hop in the IGP and triggers an immediate reconvergence. In recent IOS versions this function is enabled by default.
Take care: having such different convergence times (from a few ms to 120 sec) in different parts of the backbone can lead to traffic loops and a high sensitivity to flapping links. Building a fast-converging, high-capacity backbone requires a careful analysis of all components (and the possible involvement of MPLS, LFA and TE), not just enabling some fancy feature.

Testing Lab

This is the complete lab scenario to test this capability:

In the lab only IPv6 addresses from the ULA (Unique Local Address) range are used. There is a single level-2 IS-IS area, with all internal point-to-point links left to automatic link-local addresses. Loopbacks are numbered as /128 IPv6 addresses and an aggregate prefix is generated at the peering point.
The complete addressing and IGP configuration of R2 looks like this:

!
interface Loopback0
 no ip address
 ipv6 address FD00::2/128
!
interface FastEthernet0/0
 no ip address
!
interface FastEthernet0/0.201
 description ---- to R1 ----
 encapsulation dot1Q 102
 ipv6 enable
 ipv6 router isis
 isis network point-to-point
!
interface FastEthernet0/0.203
 description ---- to R3 ----
 encapsulation dot1Q 203
 ipv6 enable
 ipv6 router isis
 isis network point-to-point
!
interface FastEthernet0/0.205
 description ---- to R5 - ASN2 ----
 encapsulation dot1Q 205
 ipv6 address FD00:25::2/64
!
router isis
 net 49.0000.0000.0002.00
 is-type level-2-only
 metric-style wide
 no hello padding
 passive-interface Loopback0
!


R1 is chosen as the shadow route reflector.
To configure BGP Diverse Path on the shadow router, four steps are required:


1) Disable the IGP metric tie-break in best-path selection (optional and topology dependent)
bgp bestpath igp-metric ignore
2) Allow the identification of the backup path
bgp additional-paths select backup
3) Permit the announcement of the backup path
bgp additional-paths send
4) Select the route-reflector clients enabled for the update (the Clients peer-group)
neighbor Clients advertise diverse-path backup


The complete BGP configuration of the shadow RR (R1):


!
router bgp 1
 bgp router-id 100.0.0.1
 bgp cluster-id 1
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor Clients peer-group
 neighbor Clients remote-as 1
 neighbor Clients update-source Loopback0
 neighbor FD00::2 peer-group Clients
 neighbor FD00::3 peer-group Clients
 neighbor FD00::4 peer-group Clients
 !
 address-family ipv4
 exit-address-family
 !
 address-family ipv6
  bgp additional-paths select backup
  bgp additional-paths send
  bgp bestpath igp-metric ignore
  neighbor Clients route-reflector-client
  neighbor Clients advertise diverse-path backup
  neighbor FD00::2 activate
  neighbor FD00::3 activate
  neighbor FD00::4 activate
 exit-address-family
!


Check the BGP status:


R1#sh bgp all summary
...
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
FD00::2 4 1 20 24 5 0 0 00:14:47 1
FD00::3 4 1 22 28 5 0 0 00:18:56 0
FD00::4 4 1 24 27 5 0 0 00:18:55 2


The BGP table identifies the best path for "FD00:5::/64" through R2 (installed into the RIB) and the possible "backup-path" through R4:


R1#sh bgp ipv6 unicast
BGP table version is 5, local router ID is 100.0.0.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

Network Next Hop Metric LocPrf Weight Path
*>i FD00::/64 FD00::4 0 100 0 i
*>i FD00:5::/64 FD00::2 0 100 0 2 i
*bi FD00::4 0 100 0 2 i


This backup path is now sent to R3:

R1#sh bgp ipv6 unicast neighbors FD00::3 advertised-routes
BGP table version is 5, local router ID is 100.0.0.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

Network Next Hop Metric LocPrf Weight Path
*>i FD00::/64 FD00::4 0 100 0 i
*biaFD00:5::/64 FD00::4 0 100 0 2 i

Total number of prefixes 2


On R3 the next-hop trigger is enabled with a delay of 1 second for the IPv6 address family:


router bgp 1
 bgp router-id 100.0.0.3
 no bgp default ipv4-unicast
 bgp log-neighbor-changes
 neighbor FD00::1 remote-as 1
 neighbor FD00::1 update-source Loopback0
 neighbor FD00::2 remote-as 1
 neighbor FD00::2 update-source Loopback0
 !
 address-family ipv6
  bgp nexthop trigger enable
  bgp nexthop trigger delay 1
  neighbor FD00::1 activate
  neighbor FD00::2 activate
 exit-address-family
!


On R3 two exit points for FD00:5::/64 are now present: the best path still selects R2 as the primary, but the backup path is already present in the BGP table:

R3#sh bgp ipv6 unicast
BGP table version is 8, local router ID is 100.0.0.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path
* iFD00::/64 FD00::4 0 100 0 i
*>i FD00::4 0 100 0 i
*>iFD00:5::/64 FD00::2 0 100 0 2 i
* i FD00::4 0 100 0 2 i


A traceroute confirms that the complete path is correct:

R3#traceroute ipv6 fd00:5::5

Type escape sequence to abort.
Tracing the route to FD00:5::5

1 FD00::2 12 msec 8 msec 8 msec
2 FD00:5::5 24 msec 84 msec 84 msec


As a simple test, during a continuous ping to R5 from R3, the R2 loopback was forced down, triggering the backup path selection without any packet loss.


R3#ping fd00:5::5 repeat 1000

Type escape sequence to abort.
Sending 10000, 100-byte ICMP Echos to FD00:5::5, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

*Mar 1 01:00:05.675: %BGP-3-NOTIFICATION: received from neighbor FD00::2 4/0 (hold time expired) 0 bytes
*Mar 1 01:00:05.675: %BGP-5-ADJCHANGE: neighbor FD00::2 Down BGP Notification received
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!



Conclusion

With this solution the convergence time of BGP is now comparable to that of the IGP, even with route reflectors.
BGP Add-Path is obviously the more powerful option, but it requires the specific capability on most of the BGP speakers and is therefore recommended for new deployments, while Diverse-Path can help improve global convergence time without requiring any new capability on legacy devices. MPLS is not strictly required for either solution, but take my advice and always adopt it.

Feature availability:

This feature is primarily available in IOS XR and was recently implemented in IOS 15.2(3)T and 15.2(4)S.

The complete lab configuration is here

Sunday, 21 November 2010

Talking about CCIE

I recently gave a talk about my experience of becoming CCIE certified.
This was a good opportunity to share experience and meet old and new friends.
A shot from my presentation:

The official URL for the event and my presentation (in Italian)

Monday, 1 February 2010

6VPE - IPv6 MPLS VPN between Cisco and Junos

The spread of IPv6 in our networks is slow, but to refresh my IPv6 knowledge and reuse the LSPs configured in the previous post, I decided to configure an IPv6 MPLS VPN between two sites.
This solution is called 6VPE, and it introduces a new VRF configuration syntax on IOS.




Building the control plane

The operations are very similar to an IPv4 MPLS VPN: VPN prefixes and labels are signaled by mBGP, while LDP and/or RSVP provide the label signaling needed to forward MPLS packets through the backbone.

The lab uses a virtual CE, implemented simply as a loopback on each of the two PEs involved. This avoids the entire PE-CE configuration, since the purpose of the lab is to exercise the MPLS component.

Peering mBGP

As vpnv6 signaling uses mBGP, we need to configure an iBGP peering between the two PEs. The proposed configuration is minimal and allows only the signaling of the vpnv6 AF (Address Family).

J2 Juniper PE

The BGP configuration requires the use of a group (equivalent to a Cisco peer-group) and the explicit identification of the peering type (internal vs. external). As previously mentioned, only the inet6-vpn unicast AF is enabled:
protocols {
    bgp {
        group iBGP {
            type internal;
            family inet6-vpn {
                unicast;
            }
            neighbor 10.0.9.6 {
                local-address 10.0.6.2;
            }
        }           
    }               
}                              

routing-options {   
    autonomous-system 1;
}
In Junos it is also mandatory to explicitly enable the transport of IPv6 traffic over MPLS:
[edit logical-systems J2]
nick@zion# show protocols | find mpls 
mpls {
    ipv6-tunneling;   /* enable ipv6 transport */
    label-switched-path J2-to-R6 {
        to 10.0.9.6;
    }
}

R6 PE Cisco

The IOS PE requires ipv6 unicast-routing and ipv6 cef to be enabled globally. In the BGP section the exchange of IPv4 prefixes is disabled, while vpnv6 peering with J2 is enabled with full community exchange (standard and extended).
!
ipv6 unicast-routing
ipv6 cef
!
router bgp 1
  no bgp default ipv4-unicast
  neighbor 10.0.6.2 remote-as 1
  neighbor 10.0.6.2 update-source Loopback0
!
address-family vpnv6
  neighbor 10.0.6.2 activate
  neighbor 10.0.6.2 send-community both
exit-address-family
!
!

BGP session

It is always interesting to look at the capabilities negotiated by BGP. The Open messages include the Multiprotocol Extensions capability with AFI 2 and SAFI 128, as indicated in the RFCs:

Multiprotocol Extensions for BGP-4 (RFC 2858)
Carrying Label Information in BGP-4 (RFC 3107)
and as assigned in the IANA SAFI namespace
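To make the capability exchange concrete, here is a minimal sketch of how the Multiprotocol Extensions capability is laid out on the wire, assuming the standard format (capability code 1, length 4, then a 16-bit AFI, a reserved byte and an 8-bit SAFI) and the IANA value 128 for the labeled-VPN SAFI. The helper name `mp_capability` is hypothetical; this is not taken from the capture.

```python
import struct


def mp_capability(afi: int, safi: int) -> bytes:
    """Encode a BGP Multiprotocol Extensions capability (code 1, length 4)."""
    # Layout: code (8 bits), length (8), AFI (16), reserved (8), SAFI (8).
    return struct.pack("!BBHBB", 1, 4, afi, 0, safi)


# VPNv6: AFI 2 (IPv6), SAFI 128 (MPLS-labeled VPN address).
vpnv6 = mp_capability(afi=2, safi=128)
print(vpnv6.hex())
```

Comparing these six bytes with the capabilities section of an Open message in a capture is a quick way to check which address families a peer is actually offering.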



The complete capture of BGP communication is here

VRF Configurations on Junos

Until now we have used Junos logical systems, isolating in their own stanza all the configuration statements of each virtual router.
The VRF configuration, instead, intersects with the main routing instance (or logical system) that provides connectivity. A VRF is a particular routing-instance type in which the routing features are configured and the interfaces are assigned; the complete interface configuration, however, remains global.

On J2 the IPv6 address fec0:cc1e:1::1/128 is configured on the new lo0 unit 102.


set logical-systems J2 interfaces lo0 unit 102 family inet6 address fec0:cc1e:1::1/128


Then the "red" VRF routing instance is created, with route distinguisher 2:2 and route target 100:100.

set logical-systems J2 routing-instances red instance-type vrf
set logical-systems J2 routing-instances red route-distinguisher 2:2
set logical-systems J2 routing-instances red vrf-target target:100:100


Finally, the newly configured unit 102 of the lo0 interface is associated with the routing instance:

set logical-systems J2 routing-instances red interface lo0.102


For those used to working with IOS: there is no need to explicitly configure the "redistribute connected" that would be required to put the prefix in the mBGP table. All the prefixes of the routing instance are automatically exported into mBGP.

This is the complete configuration in the usual form:
[edit logical-systems J2]
nick@zion# show interfaces lo0 
    unit 2 {
        family inet {
            address 10.0.6.2/32;
        }
        family iso {
            address 49.0000.0000.0002.00;
        }
    }
    unit 102 {
        family inet6 {
            address fec0:cc1e:1::1/128;
        }
    }

[edit logical-systems J2]
nick@zion# show | find routing-instance 
routing-instances {
    red {
        instance-type vrf;
        interface lo0.102;
        route-distinguisher 2:2;
        vrf-target target:100:100;
        vrf-table-label;
    }
}

The "vrf-table-label" statement is not required in this case, but it ensures that MPLS VPN works on Olive, where forwarding is not PFE assisted.

VRF Definition on IOS

The 6VPE feature introduces a new configuration syntax, with address-family keywords inside the VRF definition. In this case RD 1:1 is used, and it is necessary to enable the redistribution of connected routes in the BGP process:

R6#sh run vrf red
Building configuration...

Current configuration : 354 bytes
!
vrf definition red
  rd 1:1
!
address-family ipv6
  route-target export 100:100
  route-target import 100:100
exit-address-family
!
!
router bgp 1
!
address-family ipv6 vrf red
  redistribute connected metric 1
  no synchronization
exit-address-family
!
interface Loopback106
  vrf forwarding red
  no ip address
  ipv6 address FEC0:CC1E:6::6/128
end 

Check if everything works as expected

In Junos, the routing tables of a routing instance are prefixed with the instance name, so the remote prefix fec0:cc1e:6::6/128 is installed in the table named “red.inet6.0”.

With a single command I can get all necessary label details:

Top Label = 16 - first hop Transport allocated with RSVP
Inner Label = 32 - VPN Label allocated by mBGP

nick@zion> show route logical-system J2 protocol bgp table red.inet6.0 detail

red.inet6.0: 3 destinations, 3 routes (3 active, 0 holddown, 0 hidden)
fec0:cc1e:6::6/128 (1 entry, 1 announced)
*BGP Preference: 170/-101
Route Distinguisher: 1:1
Next hop type: Indirect
Next-hop reference count: 3
Source: 10.0.9.6
Next hop type: Router, Next hop index: 885
Next hop: 10.0.4.9 via fxp2.204 weight 0x1, selected
Label-switched-path J2-to-R6
Label operation: Push 32, Push 16(top)
Protocol next hop: ::ffff:10.0.9.6
Push 32

Indirect next hop: 8d98330 131070
State:
Local AS: 1 Peer AS: 1
Age: 27:26 Metric: 1 Metric2: 30
Task: BGP_1.10.0.9.6+20615
Announcement bits (1): 0-KRT
AS path: ?
Communities: target:100:100
Import Accepted
VPN Label: 32

Localpref: 100
Router ID: 10.0.9.6
Primary Routing Table bgp.l3vpn-inet6.0
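The "Push 32, Push 16(top)" operation shown in the output can be turned into bytes with a short sketch. It assumes the standard 32-bit label stack entry (20-bit label, 3-bit EXP, 1-bit bottom-of-stack flag, 8-bit TTL); the TTL of 64 and the function name are illustrative choices, not values read from the lab.

```python
import struct


def label_stack(labels, exp=0, ttl=64):
    """Encode MPLS labels, top label first, as 32-bit label stack entries."""
    out = b""
    for i, label in enumerate(labels):
        bottom = 1 if i == len(labels) - 1 else 0  # S bit is set only on the last entry
        out += struct.pack("!I", (label << 12) | (exp << 9) | (bottom << 8) | ttl)
    return out


# Transport label 16 on top, VPN label 32 at the bottom of the stack.
stack = label_stack([16, 32])
print(stack.hex())
```

The first four bytes carry the RSVP transport label and the last four the VPN label with the bottom-of-stack bit set, which is the order a capture on the J2 to R4 link would show.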

On the IOS side several commands are needed to collect all the label information, sometimes with misleading results. This is mainly because we use only RSVP here, while the solutions usually proposed by Cisco involve LDP:

R6#sh bgp vpnv6 unicast vrf red FEC0:CC1E:1::1/128
BGP routing table entry for [1:1]FEC0:CC1E:1::1/128, version 11
Paths: (1 available, best #1, table red)
Not advertised to any peer
Local, imported path from [2:2]FEC0:CC1E:1::1/128
::FFFF:10.0.6.2 (metric 30) from 10.0.6.2 (10.0.6.2)
Origin IGP, localpref 100, valid, internal, best
Extended Community: RT:100:100
mpls labels in/out nolabel/16

R6#sh ipv6 cef vrf red FEC0:CC1E:1::1/128 detail
FEC0:CC1E:1::1/128, epoch 0
recursive via 10.0.6.2 label 16
nexthop 10.0.6.2 Tunnel0

R6#sh mpls traffic-eng tunnels tunnel 0 | i Label
InLabel : -
OutLabel : FastEthernet0/0, 299776

The powerful “hidden” CEF command in IOS

IOS has a hidden command, “show ipv6 cef internal”, that shows a set of information very useful for troubleshooting and for understanding how the solution works.

R6#show ipv6 cef vrf red internal | b FEC0:CC1E:1::1/128
FEC0:CC1E:1::1/128, epoch 0, RIB[B], refcount 3, per-destination sharing
sources: RIB
feature space:
LFD: FEC0:CC1E:1::1/128 0 local labels
contains path extension list
label switch chain 0x66BA9888
IPRM: 0x00018000
ifnums: (none)
path 659EAE98, path list 659E05F0, share 1/1, type recursive, for IPv6, flags must-be-labelled
MPLS short path extensions: MOI flags = 0x4 label 16
recursive via 10.0.6.2[IPv4:Default] label 16, fib 65A2F590, 1 terminal fib
path 659EAF0C, path list 659E063C, share 1/1, type attached nexthop, for IPv4
MPLS short path extensions: MOI flags = 0x1 label implicit-null
nexthop 10.0.6.2 Tunnel0, adjacency IP midchain out of Tunnel0 65EF2C60
output chain: label 16 label implicit-null TAG midchain out of Tunnel0 65EF2AE0 label 299792 TAG adj out of FastEthernet0/0, addr 10.0.8.6 65EF2DE0

IPv4-mapped IPv6 address

In both cases there is an odd-looking NEXT-HOP address, shown as ::FFFF:10.0.6.2 or ::FFFF:10.0.9.6. This is because the next-hop address required for the IPv6 AFI/SAFI must still be an IPv6 address. In our case the next hop is an IPv4 address (the PE loopback), so it is replaced with the corresponding IPv4-mapped IPv6 address.
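The mapping is mechanical: the IPv4 address occupies the low 32 bits, preceded by 16 bits set to one. A minimal sketch with Python's standard `ipaddress` module shows both directions; the helper names `to_mapped` and `from_mapped` are hypothetical.

```python
import ipaddress


def to_mapped(ipv4: str) -> ipaddress.IPv6Address:
    """Build the ::ffff:a.b.c.d address for an IPv4 address."""
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address((0xFFFF << 32) | int(v4))


def from_mapped(ipv6: str):
    """Return the embedded IPv4 address, or None if the address is not IPv4-mapped."""
    return ipaddress.IPv6Address(ipv6).ipv4_mapped


print(from_mapped("::ffff:10.0.6.2"))
print(to_mapped("10.0.9.6") == ipaddress.IPv6Address("::ffff:10.0.9.6"))
```

This is why the next hop still resolves over the IPv4 LSP: the router only has to strip the ::ffff: prefix to get back the PE loopback.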

And in the end the ping...
All this effort just to run a ping... let's hope it at least works:
nick@zion> ping logical-system J2 routing-instance red fec0:cc1e:6::6
PING6(56=40+8+8 bytes) fec0:cc1e:1::1 --> fec0:cc1e:6::6
16 bytes from fec0:cc1e:6::6, icmp_seq=0 hlim=64 time=16.438 ms
16 bytes from fec0:cc1e:6::6, icmp_seq=1 hlim=64 time=17.256 ms
16 bytes from fec0:cc1e:6::6, icmp_seq=2 hlim=64 time=11.247 ms
16 bytes from fec0:cc1e:6::6, icmp_seq=3 hlim=64 time=11.361 ms
^C
--- fec0:cc1e:6::6 ping6 statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/std-dev = 11.247/14.076/17.256/2.787 ms
On the Cisco side:
R6#ping vrf red ipv6 fec0:cc1e:1::1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to FEC0:CC1E:1::1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/12/20 ms
R6#

Conclusions


MPLS and BGP are confirmed as two key tools for providing simple and efficient solutions, allowing the transport of any type of traffic without changes to the backbone infrastructure.

BGP session capture here
Final Juniper configuration here
Final R4 configuration here
Final R6 Configuration here

Tuesday, 5 January 2010

IS-IS and MPLS Integration between Junos and IOS devices

In recent years I have concentrated my work and study on Cisco's MPLS platform but, wanting to deepen my knowledge of Junos and test the interoperability between the two platforms, I decided to incorporate a couple of Cisco routers into my lab. I replaced J4 and J6 with two Cisco 7200 routers running IOS version 12.2 SRC.



It is necessary to configure IS-IS and form the adjacencies, enable the Traffic Engineering extensions, enable RSVP and LSP signaling (LSPs are often called Traffic Engineering Tunnels in Cisco terminology) and finally enable the forwarding of MPLS labeled packets.

Starting from my JNCIP topology, I delete the two logical systems J4 and J6 on my Olive router "Zion":
[edit] 
nick@zion# delete logical-systems J4  

[edit] 
nick@zion# delete logical-systems J6  
and proceed with the configuration of the two new Cisco routers:

R4: IP address and IS-IS Routing

The IS-IS configuration is very simple: enable IS-IS on the interfaces and set them as point-to-point links (to avoid DIS election, speed up adjacency formation and reduce the database).
In the IS-IS process, specify the System ID (the "net" entry), define the router as level-2-only, use the "wide" metric style that is mandatory for TE operation, and finally declare the loopback as passive to include it in the topology.
interface Loopback0
 ip address 10.0.3.4 255.255.255.255
!
interface FastEthernet0/0
 ip address 10.0.4.9 255.255.255.252
 ip router isis    
 isis network point-to-point   
!
interface FastEthernet0/1
 ip address 10.0.2.6 255.255.255.252
 ip router isis 
 isis network point-to-point 
!         
interface FastEthernet1/0
 ip address 10.0.2.10 255.255.255.252
 ip router isis 
 isis network point-to-point 
!
router isis
 net 49.0000.0000.0004.00
 is-type level-2-only   
 metric-style wide    
 passive-interface Loopback0   
!
But the adjacencies never come up and remain in the INIT state:
R4#sh clns neighbors 
System Id      Interface   SNPA                State  Holdtime  Type Protocol 
0000.0000.0002 Fa0/0       0050.8be3.eb2c      Init   24        L2   IS-IS 
0000.0000.0003 Fa0/1       0050.8be3.eb2d      Init   23        L2   IS-IS 
0000.0000.0005 Fa1/0       0050.8be3.eb2c      Init   20        L2   IS-IS 
There is an MTU mismatch: the Zion interface has a CLNS (ISO) MTU of 1493:
nick@zion> show interfaces fxp2.204 
  Logical interface fxp2.204 (Index 82) (SNMP ifIndex 143) 
    Description: ------- link ptp J2 <-> J4 -- 
    Flags: SNMP-Traps VLAN-Tag [ 0x8100.204 ]  Encapsulation: ENET2 
    Bandwidth: 0 
    Input packets : 6673 
    Output packets: 6931 
    Protocol inet, MTU: 1496 
      Flags: None 
      Addresses, Flags: Is-Preferred Is-Primary 
        Destination: 10.0.4.8/30, Local: 10.0.4.10, Broadcast: 10.0.4.11 
    Protocol iso, MTU: 1493 
      Flags: None 
    Protocol mpls, MTU: 1484 
      Flags: None 
This is because the Zion interfaces use 802.1Q and Olive uses a fixed MTU of 1500 bytes on FXP interfaces, which reduces the effective packet length by 4 bytes, while the Cisco routers are connected to the switch in access mode...
The adjacencies fail because of the hello padding introduced for early detection of MTU problems.
The proposed solution is to reduce the MTU just for CLNS packets on the Cisco side (my hardware does not support changing the interface MTU):
R4#conf t 
Enter configuration commands, one per line.  End with CNTL/Z. 
R4(config)#int fast 0/0 
R4(config-if)#clns mtu 1493 
R4(config-if)#int fast 0/1 
R4(config-if)#clns mtu 1493 
R4(config-if)#int fast 1/0 
R4(config-if)#clns mtu 1493 
R4(config-if)#^Z 
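The MTU values in the Zion interface output can be reproduced with a little arithmetic. The sketch below starts from the 1500-byte limit and the 4-byte 802.1Q tag mentioned in the text; the 3-byte LLC header for ISO and the 12 bytes reserved for labels under MPLS are assumptions chosen because they match the output shown, and applying the same overheads to the untagged Cisco side is likewise an assumption.

```python
ETHERNET_PAYLOAD = 1500  # fixed MTU on the FXP interfaces, as stated in the text
DOT1Q_TAG = 4            # 802.1Q tag carried inside that fixed limit
LLC_HEADER = 3           # assumption: LLC header in front of IS-IS PDUs
MPLS_HEADROOM = 12       # assumption: room reserved for three 4-byte labels


def protocol_mtus(tagged: bool) -> dict:
    """Per-protocol MTUs for a tagged or untagged Ethernet interface."""
    inet = ETHERNET_PAYLOAD - (DOT1Q_TAG if tagged else 0)
    return {"inet": inet, "iso": inet - LLC_HEADER, "mpls": inet - MPLS_HEADROOM}


zion = protocol_mtus(tagged=True)    # Olive side, 802.1Q subinterface
cisco = protocol_mtus(tagged=False)  # Cisco side, access port

print(zion)
# Padded hellos sized for the larger MTU do not fit on the smaller side.
print(cisco["iso"] > zion["iso"])
```

Under these assumptions the Cisco side would pad its hellos up to 1497 bytes while Zion accepts at most 1493, which is consistent with the adjacencies stuck in Init and with the "clns mtu 1493" fix.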
Alternatively you can use the command "no hello padding" and all its variants,

for example:
R4(config)#router isis 
R4(config-router)#no hello padding point-to-point 
The adjacency is now established and the database is populated...
R4#sh clns neighbors 
System Id      Interface   SNPA                State  Holdtime  Type Protocol 
zion-J2        Fa0/0       0050.8be3.eb2c      Up     25        L2   IS-IS 
zion-J3        Fa0/1       0050.8be3.eb2d      Up     22        L2   IS-IS 
zion-J5        Fa1/0       0050.8be3.eb2c      Up     20        L2   IS-IS 
R4#sh ip route isis | B ^Ga
Gateway of last resort is not set 

      10.0.0.0/8 is variably subnetted, 22 subnets, 3 masks 
i L2     10.0.2.0/30 [115/20] via 10.0.2.9, FastEthernet1/0 
                     [115/20] via 10.0.2.5, FastEthernet0/1 
i L2     10.0.3.3/32 [115/10] via 10.0.2.5, FastEthernet0/1 
...
R4: MPLS and IS-IS TE database integration

You must also enable the extensions defined in RFC 3784, which are needed for the exchange of TE information, then bandwidth allocation and label signaling via RSVP, and finally the processing of MPLS labeled packets:

Globally enable LSP allocation:
mpls traffic-eng tunnels
Enable RSVP and MPLS packet processing on all PE-facing interfaces, for example:
interface FastEthernet0/0 
 ip rsvp bandwidth    
 mpls traffic-eng tunnels
Establish the router-id (TLV 134) and allow the necessary TE TLV exchange on all level-2 adjacencies:
router isis
 mpls traffic-eng router-id Loopback0 
 mpls traffic-eng level-2

This is the complete R4 configuration:
!
hostname R4
!
mpls traffic-eng tunnels
!
interface Loopback0
 ip address 10.0.3.4 255.255.255.255
!
interface FastEthernet0/0
 ip address 10.0.4.9 255.255.255.252
 ip router isis     
 mpls traffic-eng tunnels  
 clns mtu 1493    
 isis network point-to-point   
 ip rsvp bandwidth   
!
interface FastEthernet0/1
 ip address 10.0.2.6 255.255.255.252
 ip router isis 
 mpls traffic-eng tunnels
 clns mtu 1493
 isis network point-to-point 
 ip rsvp bandwidth
!         
interface FastEthernet1/0
 ip address 10.0.2.10 255.255.255.252
 ip router isis 
 mpls traffic-eng tunnels
 clns mtu 1493
 isis network point-to-point 
 ip rsvp bandwidth
!
router isis
 net 49.0000.0000.0004.00
 is-type level-2-only   
 metric-style wide   
 passive-interface Loopback0   
 mpls traffic-eng router-id Loopback0 
 mpls traffic-eng level-2  
!
The R6 configuration is similar and available at the end of this post.

LSP Setup

To verify the effective integration of the two platforms, let's configure two LSPs (or TE Tunnels), one from J2 to R6 and the symmetric one from R6 to J2, remembering that LSPs are always unidirectional.
We will not use any constraint, so the LSPs will be allocated according to the best IGP metric. The result should be:
LSP1 : J2 → R4 → J5 → R6
LSP2 : R6 → J5 → R4 → J2



The J2 configuration is simple:
protocols {
    mpls {
        label-switched-path J2-to-R6 {
            to 10.0.9.6;
        }
    }
}
This configuration requests an LSP to the address 10.0.9.6, including resource allocation and label signaling. If the whole process succeeds, Junos immediately creates an entry for the destination address in the inet.3 table, which is normally used to resolve BGP next hops and takes precedence over inet.0.

The LSP is active:
nick@zion> show mpls lsp ingress logical-system J2              
Ingress LSP: 1 sessions 
To              From            State Rt P     ActivePath       LSPname 
10.0.9.6        10.0.6.2        Up     0 *                      J2-to-R6 
Total 1 displayed, Up 1, Down 0 
The destination is installed in inet.3:
nick@zion> show route 10.0.9.6 logical-system J2                

inet.0: 21 destinations, 21 routes (21 active, 0 holddown, 0 hidden) 
+ = Active Route, - = Last Active, * = Both 

10.0.9.6/32        *[IS-IS/18] 00:51:24, metric 30 
                    > to 10.0.4.1 via fxp2.203 
                      to 10.0.4.9 via fxp2.204 

inet.3: 1 destinations, 1 routes (1 active, 0 holddown, 0 hidden) 
+ = Active Route, - = Last Active, * = Both 

10.0.9.6/32        *[RSVP/7] 00:49:32, metric 30 
                    > to 10.0.4.9 via fxp2.204, label-switched-path J2-to-R6 
The label allocated and signaled by R4 for this LSP is 16, the first non-reserved label:
nick@zion> show route 10.0.9.6 logical-system J2 table inet.3 extensive | match Label  
                Label-switched-path J2-to-R6 
                Label operation: Push 16    
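The relationship between the two tables can be sketched as a lookup order. This is a deliberately simplified model of what the text describes (inet.3 is consulted first when resolving a BGP next hop, then inet.0): it uses exact-match dictionaries instead of longest-prefix matching, and the function name and route descriptions are illustrative only.

```python
def resolve_bgp_next_hop(next_hop, inet3, inet0):
    """Resolve a BGP next hop, preferring inet.3 (LSPs) over inet.0 (IGP)."""
    if next_hop in inet3:
        return ("inet.3", inet3[next_hop])
    if next_hop in inet0:
        return ("inet.0", inet0[next_hop])
    return None  # unresolved next hop: the BGP route stays unusable


# Entries modelled on the show route output for 10.0.9.6.
inet0 = {"10.0.9.6": "IS-IS via fxp2.203"}
inet3 = {"10.0.9.6": "RSVP LSP J2-to-R6, push 16"}

print(resolve_bgp_next_hop("10.0.9.6", inet3, inet0))
# Without the LSP the same next hop falls back to the plain IGP route.
print(resolve_bgp_next_hop("10.0.9.6", {}, inet0))
```

Plain IP traffic to 10.0.9.6 still follows inet.0; only routes that resolve their next hop through BGP pick up the LSP.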
The minimum configuration of R6 is:
interface Tunnel0 
 ip unnumbered Loopback0 
 tunnel destination 10.0.6.2 
 tunnel mode mpls traffic-eng 
 tunnel mpls traffic-eng autoroute announce 
 tunnel mpls traffic-eng path-option 10 dynamic 
!
In IOS an LSP is usually called a “TE Tunnel” and is configured as a tunnel interface. Some notes about the configuration:
in this type of tunnel there is no “tunnel source”
“autoroute announce” installs the destination in the routing table through the tunnel
“path-option 10 dynamic” uses just the IGP metric, without constraints
R6#sh mpls traffic-eng tunnels brief | b ^TU 
TUNNEL NAME                      DESTINATION      UP IF     DOWN IF   STATE/PROT 
R6_t0                            10.0.6.2         -         Fa0/0     up/up     
J2-to-R6                         10.0.9.6         Fa0/0     -         up/up     
Displayed 1 (of 1) heads, 0 (of 0) midpoints, 1 (of 1) tails 
The output shows 2 tunnels “UP/UP”, one head and one tail.

IOS has just one table, and this output can be disorienting (or maybe it is the different Junos tables that are :-) )

R6#sh ip route 10.0.6.2                          
Routing entry for 10.0.6.2/32 
  Known via "isis", distance 115, metric 30, type level-2 
  Redistributing via isis 
  Last update from 10.0.6.2 on Tunnel0, 00:55:05 ago 
  Routing Descriptor Blocks: 
  * 10.0.6.2, from 10.0.6.2, via Tunnel0 
      Route metric is 30, traffic share count is 1 
The label used, allocated and signaled by J5, is in the typical Junos range for this type of traffic:
R6#sh mpls traffic-eng tunnels tunnel 0 | i Label 
  InLabel  :  - 
  OutLabel : FastEthernet0/0, 299888
The allocated LSPs can also be checked on the transit routers.
On Junos:
nick@zion> show mpls lsp logical-system J5    
Ingress LSP: 0 sessions 
Total 0 displayed, Up 0, Down 0 

Egress LSP: 0 sessions 
Total 0 displayed, Up 0, Down 0 

Transit LSP: 2 sessions 
To              From            State   Rt Style Labelin Labelout LSPname 
10.0.6.2        10.0.9.6        Up       1  1 SE  299888       17 R6_t0 
10.0.9.6        10.0.6.2        Up       1  1 FF  299872        0 J2-to-R6 
Total 2 displayed, Up 2, Down 0 
and on Cisco:
R4#sh mpls traffic-eng tunnels brief | b ^TU 
TUNNEL NAME                      DESTINATION      UP IF     DOWN IF   STATE/PROT 
J2-to-R6                         10.0.9.6         Fa0/0     Fa1/0     up/up     
R6_t0                            10.0.6.2         Fa1/0     Fa0/0     up/up     
Displayed 0 (of 0) heads, 2 (of 2) midpoints, 0 (of 0) tails 

R4#sh mpls traffic-eng tunnels | i Tunnel|Label      
LSP Tunnel J2-to-R6 is signalled, connection is up 
  InLabel  : FastEthernet0/0, 16 
  OutLabel : FastEthernet1/0, 299872 
LSP Tunnel R6_t0 is signalled, connection is up 
  InLabel  : FastEthernet1/0, 17 
  OutLabel : FastEthernet0/0, implicit-null
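As a quick sanity check, the labels in the outputs above can be chained hop by hop: the label a router sends out must be the label its downstream neighbour expects in. What follows is only a minimal Python sketch of that check. The label values are copied from the outputs, the hop order is inferred from them, and the function name `broken_links` is purely illustrative.

```python
# Per-hop (router, in_label, out_label), copied from the outputs above.
# The hop order is an assumption inferred from those outputs.
LSPS = {
    "R6_t0": [
        ("R6", None, "299888"),         # head end, no incoming label
        ("J5", "299888", "17"),
        ("R4", "17", "implicit-null"),
    ],
    "J2-to-R6": [
        ("R4", "16", "299872"),
        ("J5", "299872", "0"),
    ],
}


def broken_links(hops):
    """Return (upstream, downstream) pairs whose labels do not match."""
    bad = []
    for (up, _, out_label), (down, in_label, _) in zip(hops, hops[1:]):
        if out_label != in_label:
            bad.append((up, down))
    return bad


for name, hops in LSPS.items():
    problems = broken_links(hops)
    print(name, "consistent" if not problems else problems)
```

Both LSPs come out consistent with the values shown; a mistyped label on any hop would be reported as a pair of router names.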
In both cases the special label "implicit-null" is used in place of PHP (Penultimate Hop Popping), because TE operations usually rely on the EXP bits for traffic classification, and after a reclassification these may differ from the IP Precedence of the transported packet (and, obviously, the traffic may not be IP at all...)

One further note on the Cisco platform, which can be misleading:
analysing the LFIB (Label Forwarding Information Base) used for MPLS packet forwarding, R4 shows "Pop Label":
R4#sh mpls forwarding-table 
Local  Outgoing      Prefix            Bytes Label   Outgoing   Next Hop    
Label  Label or VC   or Tunnel Id      Switched      interface              
16     299872        10.0.6.2 12053 [1]   \ 
                                       0             Fa1/0      10.0.2.9    
17     Pop Label     10.0.9.6 0 [248]  0             Fa0/0      10.0.4.10 

but this means (from the Cisco documentation):
No Label - Means that there is no label for the destination from the next hop or that label switching is not enabled on the outgoing interface.
** Pop Label ** - Means that the next hop advertised an implicit NULL label for the destination and that the router popped the top label.
Aggregate - Means there are several prefixes for one local label. This entry is used when IPv6 is configured on edge routers to transport IPv6 traffic over an IPv4 MPLS network.


Nothing particularly exciting so far, only a couple of suggestions such as IS-IS hello padding and verifying that the two LSPs are set up properly. It is now time to think about what to do with these two LSPs...

final Zion Configuration here
final R4 Configuration here
final R6 Configuration here

Thursday, 17 December 2009

Configuration Groups in Junos

0 comments
Sometimes part of the configuration is repeated, or it is necessary to ensure that some statements are always applied to interfaces, protocols or other portions of the configuration.

With Junos you can collect these sets of statements in a group and then apply it to portions of the configuration. In programming terms a group resembles a subroutine or, better still, the "inheritance" principle of object-oriented programming.
The result is a shorter configuration and less room for typing errors or oversights. Moreover, changing a parameter in the group is immediately reflected in the configuration of every element to which the group is applied.

When defining groups, wildcards can be used to specify which portions of the configuration the group statements apply to.

The official reference for this statement is
http://www.juniper.net/techpubs/en_US/junos9.6/information-products/topic-collections/swconfig-cli/id-11139566.html#id-11139566


Why did I introduce groups? Most of the fxp interfaces on all the routers of the JNCIP/JNCIE lab topology proposed in the previous post use "family mpls" and "family iso", so why not save lots of typing and practise using configuration groups?

Start by defining the group:

[edit]
nick@zion# show groups | no-more 
isis-mpls {
    logical-systems {
        <*> {
            interfaces {
                <fxp*> {
                    unit <*> {
                        family iso;
                        family mpls;
                    }
                }
            }
        }
    }
}

and then apply the group to the whole system:


[edit]
nick@zion# set apply-groups isis-mpls 
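To get a feel for what a wildcard such as `<fxp*>` selects, the matching can be approximated with shell-style globbing. This is only a rough Python sketch. It assumes that `fnmatch` semantics are close enough to the Junos wildcard rules for simple patterns, and both the helper name `group_matches` and the interface list are hypothetical.

```python
from fnmatch import fnmatchcase


def group_matches(pattern, name):
    """True if a group wildcard such as '<fxp*>' matches a configured name."""
    if pattern.startswith("<") and pattern.endswith(">"):
        # Angle brackets mark a wildcard; glob matching is an approximation.
        return fnmatchcase(name, pattern[1:-1])
    return pattern == name  # no angle brackets: literal name


# Hypothetical interface names, for illustration only.
interfaces = ["fxp0", "fxp1", "lo0", "fe-0/0/0"]

selected = [name for name in interfaces if group_matches("<fxp*>", name)]
print(selected)
```

With this list only the two fxp interfaces are selected, while `<*>` would match every name, which is why it is used for the logical systems and the units in the group.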

To display the effect of the configuration group, it is necessary to pipe the show command through "display inheritance", as follows:


show logical-systems J1 interfaces | display inheritance    
fxp1 {
    unit 102 {
        description "------- LAN  J1-J2 ----------";
        vlan-id 102;
        family inet {
            address 10.0.5.1/24;
        }
        ##
        ## 'iso' was inherited from group 'isis-mpls'
        ##
        family iso;
        ##
        ## 'mpls' was inherited from group 'isis-mpls'
        ##
        family mpls;
    }
    unit 103 {
        description "------- link ptp J1 <-> J3 --";
        vlan-id 103;
        family inet {
            address 10.0.4.14/30;
        }
        ##
        ## 'iso' was inherited from group 'isis-mpls'
        ##
        family iso;
        ##
        ## 'mpls' was inherited from group 'isis-mpls'
        ##
        family mpls;
    }
...

or, in a more concise form, simply skipping the lines containing "#":

nick@zion# show logical-systems J1 interfaces | display inheritance | except #
fxp1 {
    unit 102 {
        description "------- LAN  J1-J2 ----------";
        vlan-id 102;
        family inet {
            address 10.0.5.1/24;
        }
        family iso;
        family mpls;
    }
    unit 103 {
        description "------- link ptp J1 <-> J3 --";
        vlan-id 103;
        family inet {
            address 10.0.4.14/30;
        }
        family iso;
        family mpls;
    }
...

Other elements of the configuration are repetitive too, and so they find an ideal home in the group definition, whose final configuration is:


[edit]
nick@zion# show groups | no-more  
isis-mpls {
    logical-systems {
        <*> {
            interfaces {
                <fxp*> {
                    unit <*> {
                        family iso;
                        family mpls;
                    }
                }
            }
            protocols {
                rsvp {
                    interface all;
                }
                mpls {
                    interface all;
                }
                isis {
                    level 1 disable;
                    level 2 wide-metrics-only;
                    interface all {
                        point-to-point;
                    }
                }
            }
        }
    }
}

Some elements have different names in each logical router, so those specific statements must be configured directly in the respective stanzas:

[edit]
nick@zion# show logical-systems J3 protocols | no-more    
isis {
    interface lo0.3 {
        passive;
    }
}

The result, as in the interfaces portion, is the union of both sets of statements:

nick@zion# show logical-systems J3 | find protocols | display inheritance | except ##
protocols {
    rsvp {
        interface all;
    }
    mpls {
        interface all;
    }
    isis {
        level 1 disable;
        level 2 wide-metrics-only;
        interface lo0.3 {                      
            passive;
        }
        interface all {
            point-to-point;
        }
    }
}
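The union shown above can be pictured as a recursive merge in which the statements configured locally are kept and the group fills in whatever is missing. The Python sketch below is a simplified model and not how Junos implements it. The nested dictionaries and the function name `inherit` are assumptions made for illustration, and the group is taken with its wildcards already resolved for J3.

```python
import copy


def inherit(local, group):
    """Return the local config completed with group statements (local wins)."""
    merged = copy.deepcopy(local)
    for key, value in group.items():
        if key not in merged:
            merged[key] = copy.deepcopy(value)
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = inherit(merged[key], value)
        # otherwise the locally configured statement is kept as it is
    return merged


# Protocols part of the group, wildcards already resolved for J3.
group_protocols = {
    "rsvp": {"interface all": {}},
    "mpls": {"interface all": {}},
    "isis": {
        "level 1": "disable",
        "level 2": "wide-metrics-only",
        "interface all": {"point-to-point": True},
    },
}

# What is configured directly under logical-systems J3 protocols.
j3_protocols = {"isis": {"interface lo0.3": {"passive": True}}}

result = inherit(j3_protocols, group_protocols)
print(sorted(result["isis"]))
```

The merged isis stanza contains both "interface lo0.3" from the local configuration and "interface all" plus the two level statements from the group, which matches the output above.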

If you are not comfortable using "display inheritance", or working without seeing some portions of the configuration, you can always take my starting configuration with the apply-groups, save the output of "display inheritance" to a file and then replace the original configuration with it. In this case it is better to use a regular expression, to avoid stripping the hashed password data (which is also marked with '##').

nick@zion# show | display inheritance | except "^\ *#" | save Jncip-Logical-System_L2_isis.confg  
Wrote 486 lines of output to 'Jncip-Logical-System_L2_isis.confg'
[edit]
nick@zion# load override Jncip-Logical-System_L2_isis.confg 
load complete
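The difference between the anchored expression and a plain "except #" is easy to see on a few sample lines. Here is a small Python sketch of the same filtering idea. The password value is a made-up placeholder, and Python's `re` module is used only to illustrate the pattern, on the assumption that the CLI regex behaves the same way for this simple case.

```python
import re

# Same idea as: except "^\ *#"  (optional leading spaces, then '#')
INHERITANCE_COMMENT = re.compile(r"^ *#")

sample = [
    "        ##",
    "        ## 'iso' was inherited from group 'isis-mpls'",
    "        ##",
    "        family iso;",
    # Placeholder hash, not a real one.
    '    encrypted-password "HYPOTHETICAL-HASH"; ## SECRET-DATA',
]

# Anchored filter: drops only the lines that start with '#'.
kept = [line for line in sample if not INHERITANCE_COMMENT.search(line)]

# Naive filter, like a plain "except #": drops any line containing '#'.
naive = [line for line in sample if "#" not in line]

print(len(kept), len(naive))
```

The anchored filter keeps both the "family iso;" line and the password line, while the naive one loses the password line because of its trailing "## SECRET-DATA" marker.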

I promised complex scenarios and not just some simple CLI tricks, but it is necessary to start with something solid to work on...

The complete configuration is available here