Saturday, June 20, 2015

MPLS Part1

VRF-lite:

VRFs create separate instances of the routing table.
A VRF used inside a single router, or without MPLS, is called VRF-Lite.

We can create VRFs in two ways

Legacy method—supports only IPv4
R6(config)#ip vrf VPN_A
R6(config)#int fa0/0
R6(config-if)#ip vrf forwarding VPN_A
When applied, this removes only the IPv4 address attached to the interface. The IPv6 address of the interface remains part of the global routing table, while the IPv4 address becomes part of the corresponding VRF table.
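The removed IPv4 address is not restored automatically; it has to be re-entered once the interface is in the VRF. A quick sketch, with the address itself being an assumption:
R6(config-if)#ip address 10.1.1.1 255.255.255.0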

Newer method—supports IPv4 and IPv6; the address families must be specified with address-family commands under the VRF
R6(config)#vrf definition VPN_B
R6(config-vrf)#address-family ipv4
R6(config-vrf)#address-family ipv6
R6(config)#int fa0/0
R6(config-if)#vrf forwarding VPN_B
When applied, this removes both the IPv4 and IPv6 addresses attached to the interface.

Each VRF instance has its own RIB and FIB.
An interface in VRF instance A1 cannot ping an interface in VRF instance A2.

To facilitate inter-VRF reachability, we have two options (see the sketch below):
·         ip route vrf VRF_Name prefix mask [interface] [next-hop] --> the interface can be in any VRF
·         Use the "global" keyword at the end of the route statement to instruct the router to look up the next hop in the global routing table
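A minimal sketch of both options, using hypothetical prefixes and next hops:
R6(config)#ip route vrf VPN_A 10.2.2.0 255.255.255.0 FastEthernet0/1 10.1.1.2
R6(config)#ip route vrf VPN_A 0.0.0.0 0.0.0.0 192.0.2.1 global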



Some useful show commands:
show vrf
show run vrf

LDP:
LDP advertises its router-id as the transport address in the hello discovery messages.
So make sure the router-id is reachable; there must be an exact match for the router-id in the routing table.

The hello messages are sent to 224.0.0.2 on UDP port 646.
After discovering a neighbor, a TCP connection is established on port 646 and labels are exchanged.

We can change the transport address
R1(config)#int fa0/0
R1(config-if)#mpls ldp discovery transport-address interface

The TCP session will be re-established when the above command is given.
The TCP connection can be authenticated using an MD5 hash option.
The hashing key is defined per neighbor using the command mpls ldp neighbor <IP> password <password>.
The IP address here is the neighbor's LDP router-id. To make the use of passwords mandatory, we need the global command mpls ldp password required.
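A quick sketch of both commands, assuming a neighbor whose LDP router-id is 2.2.2.2 and an example key:
R1(config)#mpls ldp neighbor 2.2.2.2 password MYSECRET
R1(config)#mpls ldp password required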

When an LDP session is established, the hold time used for the session is the lower of the values configured on the two routers.
R1(config)#mpls ldp holdtime 45

To change the neighbor discovery interval and hold time
R1(config)#mpls ldp discovery hello interval 15
R1(config)#mpls ldp discovery hello holdtime 45

To change the router-id:
R1(config)#mpls ldp router-id lo0 force --> if 'force' is not used, the new router-id does not take effect until LDP reselects the router-id (e.g., after a reload)
'Force' will reset the TCP session immediately


Normally, LDP advertises 'implicit-null' (i.e., label 3) for connected routes, so the PHP router will pop the label before sending the packet.
If the packet carries QoS markings and we don't want the PHP router to pop the top label, we can configure the router to advertise 'explicit-null' for connected routes.
In such a case, the router will receive packets with label 0 for connected routes.

R1(config)#mpls ldp explicit-null [for <prefix-acl>] [to <peer-acl>]

A normal traceroute from a customer router to the other customer site:
BB1#traceroute 1.1.1.1
  1 10.1.67.6 72 msec 80 msec 60 msec
  2 10.1.56.5 [MPLS: Label 16 Exp 0] 156 msec 148 msec 152 msec
  3 10.1.35.3 [MPLS: Label 16 Exp 0] 152 msec 148 msec 128 msec
  4 10.1.23.2 [MPLS: Label 27 Exp 0] 104 msec 108 msec 104 msec
  5 10.1.12.1 160 msec 132 msec 132 msec

The network is
R1=====R2-----R3-----R5-----R6=====BB1


In the above output, the customer is able to see the routers and transit links in the provider's network.
If we want to hide these details from the customer, we should configure the following on the edge router (not required on the P routers):
R6(config)#mpls ip propagate-ttl
R6(config)#no mpls ip propagate-ttl forwarded --> the TTL is not copied from the IP header into the MPLS label for forwarded traffic only; locally generated traffic works as normal
So a traceroute from the PE routers will show all the transit links, while for the CE they are hidden.


The traceroute outputs from the CE and PE routers will then look like this:
BB1#traceroute 1.1.1.1
  1 10.1.67.6 84 msec 72 msec 72 msec
  2 10.1.23.2 [MPLS: Label 27 Exp 0] 124 msec 120 msec 124 msec
  3 10.1.12.1 152 msec 132 msec 124 msec

R6(config)#do traceroute 1.1.1.1
  1 10.1.56.5 [MPLS: Label 16 Exp 0] 120 msec 168 msec 140 msec
  2 10.1.35.3 [MPLS: Label 16 Exp 0] 104 msec 112 msec 112 msec
  3 10.1.23.2 [MPLS: Label 27 Exp 0] 80 msec 92 msec 84 msec
  4 10.1.12.1 120 msec 108 msec 104 msec

R6(config)#mpls ip propagate-ttl
R6(config)#no mpls ip propagate-ttl local --> the TTL is not copied from the IP header into the MPLS label for locally generated traffic only; forwarded traffic works as normal
So a traceroute from the CE routers will show all the transit links, while for the PE router they are hidden.

R6(config)#do traceroute 1.1.1.1
  1 10.1.23.2 [MPLS: Label 27 Exp 0] 120 msec 84 msec 140 msec
  2 10.1.12.1 132 msec 160 msec 108 msec
R6(config)#

BB1#traceroute 1.1.1.1
  1 10.1.67.6 60 msec 56 msec 56 msec
  2 10.1.56.5 [MPLS: Label 16 Exp 0] 172 msec 156 msec 152 msec
  3 10.1.35.3 [MPLS: Label 16 Exp 0] 280 msec 148 msec 124 msec
  4 10.1.23.2 [MPLS: Label 27 Exp 0] 140 msec 112 msec 104 msec
  5 10.1.12.1 124 msec 128 msec 128 msec



LDP targeted hellos:
·         Used to establish an LDP adjacency with devices that are not directly connected
·         Hellos are unicast
·         Normally used in MPLS TE for the LDP session between tunnel endpoints
·         When enabled between directly connected devices, they may improve convergence by retaining the labels even when the link to the neighbor is down (see the sketch below)
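A minimal sketch of enabling a targeted session, assuming a peer whose LDP router-id is 2.2.2.2:
R1(config)#mpls ldp neighbor 2.2.2.2 targeted ldp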


By default, LDP will generate and advertise labels for every prefix found in the local routing table.
If we want to change this behavior and generate labels only for specific prefixes, we can use an access-list to select the prefixes eligible for label generation.
R4(config)#no mpls ldp advertise-labels --> this command must be entered for the change to take effect
R4(config)#mpls ldp advertise-labels for 10
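Access-list 10 is not shown in the original notes; a hypothetical definition that limits label generation to a single prefix:
R4(config)#access-list 10 permit 10.1.67.0 0.0.0.255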


A sample traceroute in a network where LDP is not turned on everywhere (note the unlabeled hops):
R1#traceroute 10.1.67.7
  1 10.1.12.2 [MPLS: Label 26 Exp 0] 72 msec 52 msec 52 msec
  2 10.1.23.3 48 msec 56 msec 68 msec
  3 10.1.35.5 [MPLS: Label 25 Exp 0] 100 msec 100 msec 44 msec
  4 10.1.56.6 104 msec 120 msec 68 msec
  5 10.1.67.7 120 msec 132 msec 128 msec

Some useful show commands:
show mpls ldp bindings 10.1.67.0 24 --> to check the LIB
show mpls forwarding-table 10.1.67.0 24 --> to check the LFIB
show mpls ldp discovery detail
show mpls ldp neighbor
show mpls ldp parameters


Wednesday, June 17, 2015

DMVPN-Part2

Routing protocols in Phase 1:

The next hop will always be the HUB.

In Phase1, the control plane should be kept as simple as possible, because the data plane is always going to be point-to-point hub-and-spoke tunnels regardless of the next hop and routing protocol.

EIGRP:

On enabling EIGRP, the spokes can establish adjacency with the hub.
They can't establish adjacency with other spokes, as they cannot replicate multicast traffic directly between themselves (in all three phases of DMVPN).

To establish connectivity between spokes, we have two options

The first one is to advertise a default route:
R5(config)#int tun0
R5(config-if)#ip summary-address eigrp 100 0.0.0.0 0.0.0.0

The second one is to disable split horizon. The spokes will learn the routes from the other spokes, but the next hop will be the HUB.
R5(config)#int tun0
R5(config-if)#no ip split-horizon eigrp 100

ODR:
ODR is based on CDP.
CDP is enabled by default from IOS 15.x; just make sure CDP is running on the tunnel interfaces.
R5#show cdp neighbors

Steps to run ODR
First enable cdp on the tunnel interface,
R5(config)#cdp run
R5(config)#int tun0
R5(config-if)#cdp enable

Enable ODR. This must be done on the hub; the hub will announce a default route to the spokes, and the spokes will send their connected-link information to the HUB in CDP messages.
R5(config)#router odr

If any other routing protocol is enabled, ODR will not run.
ODR exchanges routing information without enabling any routing protocol.

BGP:

One of the major advantages of DMVPN is that we can easily add a spoke without changing any configuration on the existing devices.

When using BGP, this advantage is broken: we may have to make BGP configuration and policy changes.
We can use dynamic BGP neighbor configuration as a workaround (see the sketch below).

We can use iBGP or eBGP to speak to the HUB.
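A minimal sketch of dynamic BGP neighbors on the hub, assuming iBGP in AS 65000 over the 14.1.1.0/24 tunnel subnet used later in these notes:
R5(config)#router bgp 65000
R5(config-router)#neighbor SPOKES peer-group
R5(config-router)#neighbor SPOKES remote-as 65000
R5(config-router)#bgp listen range 14.1.1.0/24 peer-group SPOKES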


RIP:
Normal configuration commands to enable rip.

We can send a default route to the spokes as follows:
R5(config)#route-map DEFAULT permit 10
R5(config-route-map)#set interface Tunnel0
R5(config)#router rip
R5(config-router)#default-information originate route-map DEFAULT
Or
disable split horizon:
R5(config)#int Tu0
R5(config-if)#no ip split-horizon

OSPF:

When OSPF is configured over GRE tunnel interfaces, the OSPF network type defaults to point-to-point.
This is not supported in a DMVPN design, because the hub must maintain multiple adjacencies on the same interface, one for each remote spoke.

In DMVPN Phase 1 with OSPF, the OSPF network type is set to point-to-multipoint on the hub at a minimum. With the hub being OSPF network type point-to-multipoint and the spokes being OSPF network type point-to-point, adjacency is supported, as long as the timer values match.
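A minimal sketch of this; the spoke-side timer change is an assumption to match the hub's point-to-multipoint defaults (hello 30, dead 120) against the spokes' point-to-point defaults (hello 10, dead 40):
On the hub:
R5(config)#int tun0
R5(config-if)#ip ospf network point-to-multipoint
On the spokes (left at point-to-point):
R4(config)#int tun0
R4(config-if)#ip ospf hello-interval 30
R4(config-if)#ip ospf dead-interval 120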

DMVPN PHASE2:

The main problem with phase1 is that all spoke-to-spoke traffic must pass through the HUB, putting huge stress on the hub's resources.
This limitation was primarily due to configuring the spokes as point-to-point GRE tunnels rather than multipoint GRE tunnels.

Phase2 permits spoke-to-spoke tunnels; for this we need to configure the spokes as multipoint GRE tunnels.

The only configuration change we need to make is on all the spokes:
R4(config)#int tun0
R4(config-if)#no tunnel destination --> removing the point-to-point tunnel setting
R4(config-if)#tunnel mode gre multipoint --> enabling the multipoint GRE tunnel on the spokes

No configuration changes on the hub.

Routing tables in phase2:
We now know all the networks behind the spokes, with the spokes as next hops.

R4#sh ip route
Gateway of last resort is not set

      14.0.0.0/8 is variably subnetted, 2 subnets, 2 masks
C        14.1.1.0/24 is directly connected, Tunnel0
L        14.1.1.4/32 is directly connected, Tunnel0
      150.1.0.0/32 is subnetted, 5 subnets
D        150.1.1.1 [90/28288000] via 14.1.1.1, 00:00:16, Tunnel0
D        150.1.2.2 [90/28288000] via 14.1.1.2, 00:00:16, Tunnel0
D        150.1.3.3 [90/28288000] via 14.1.1.3, 00:00:16, Tunnel0
C        150.1.4.4 is directly connected, Loopback0
D        150.1.5.5 [90/27008000] via 14.1.1.5, 00:00:33, Tunnel0

The implications are:
·        Summarization is not allowed on the hub --> if summarized, all the traffic will take the path spoke-hub-spoke
·        The next hop must always be preserved by the hub


Routing protocols in phase2:

EIGRP:
The following configuration must be done on the hub:
R5(config)#int tun0
R5(config-if)#no ip split-horizon eigrp 100 --> to advertise the networks behind the spokes
R5(config-if)#no ip next-hop-self eigrp 100 --> to preserve the spokes as next hops

OSPF:
One of the main requirements in phase2 is that the routing protocol must preserve the next hop.
We need to use an OSPF network type that preserves the next hop, so OSPF network type point-to-multipoint is not supported in phase2.

Routing table with OSPF network type point-to-multipoint:
R2#show ip route ospf
O        150.1.1.1 [110/2001] via 14.1.1.5, 00:00:27, Tunnel0
O        150.1.3.3 [110/2001] via 14.1.1.5, 00:00:27, Tunnel0
O        150.1.4.4 [110/2001] via 14.1.1.5, 00:00:34, Tunnel0

To run OSPF in phase2, we need to use the network type broadcast or NBMA, which preserves the next hop.
This means:
·        we need to configure the spokes so that they never become DR or BDR (direct spoke-to-spoke flooding is not possible; the spokes are all in the same layer 3 subnet but not in the same layer 2 domain)
·        no more than two hubs are permitted: one DR and one BDR

By default the network type on a tunnel interface is point-to-point, so the following configuration must be done on the spokes:
R4(config)#int tun0
R4(config-if)#ip ospf priority 0 --> so that the spokes never attempt to claim the DR/BDR role, because the hub cannot preempt them once they think they are DR/BDR
R4(config-if)#ip ospf network broadcast

On the hub:
R5(config)#int tun0
R5(config-if)#ip ospf network broadcast

The routing table with OSPF network type broadcast/NBMA:
R2#show ip route ospf
O        150.1.1.1 [110/2001] via 14.1.1.1, 00:00:27, Tunnel0
O        150.1.3.3 [110/2001] via 14.1.1.3, 00:00:27, Tunnel0
O        150.1.4.4 [110/2001] via 14.1.1.4, 00:00:34, Tunnel0

The next hop is preserved, and when R2 wants to communicate with 150.1.3.3, a spoke-to-spoke tunnel will be established between R2 and R3.

A nice and simple explanation of spoke-to-spoke tunnel creation,

Here are the steps,
  1. R2 gets a packet with a next hop R3. There is no NHRP map entry for R3, so an NHRP resolution request is sent to the hub.
  2. The request from R2 will also have the NBMA address of R2. The hub relays the request to R3.
  3. R3 receives the request, adds its own address mapping to it and sends it as an NHRP reply directly to R2.
  4. R3 then sends its own request to the hub that relays it to R2.
  5. R2 receives the request from R3 via the hub and replies by adding its own mapping to the packet and sending it directly to R3
Technically, the requests themselves provide enough information to build a spoke to spoke tunnel but the replies accomplish two things. They acknowledge to the other spoke that the request was received and also verify that spoke to spoke NBMA reachability exists.
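To verify that the resolution worked and the spoke-to-spoke tunnel exists, the standard NHRP/DMVPN show commands can be used (output omitted):
R2#show dmvpn
R2#show ip nhrp
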
DMVPN PHASE3:

The problem with phase2 is scalability:

·        Summarization is not allowed at the hub; as a result, all the spokes must have routes to all the subnets. This results in huge routing tables/updates.
·        Scalability suffers as the number of devices increases; a very good explanation is provided at the following link

Phase3 solves the main issue of phase1 in a different way.
When a spoke forwards a packet to the hub, the hub will check whether the destination is reachable via the same tunnel, and in such a case will redirect the spoke to the spoke the destination is attached to.

This is how phase3 works (figure omitted; in the example, 23.1.1.0/24 is a subnet behind spoke R2):

1.     R1 and R2 announce their attached subnets to the hub.
2.     The hub can be configured to advertise a default route to the spokes.
3.     Now, when R1 needs to send traffic to 23.1.1.1, it sends the packet to the hub.
4.     The hub sees that the destination is reachable via the same tunnel, so it sends an NHRP redirect packet to R1.
5.     R1 sends an NHRP resolution request for the IP 23.1.1.1. The hub relays this NHRP request to R2.
6.     R2 sends the NHRP reply directly to R1 (the NHRP request packet carries the NBMA address of R1).
7.     R1 installs a route in its routing table for the prefix 23.1.1.0/24 via 22.1.1.1 with an AD of 250.

In phase3, a new route is installed in the routing table that tells the spoke how to reach the remote spoke.
We can do summarization and use default routes at the hub.

Configuration:

On the hub,
R5(config-if)#int tun0
R5(config-if)#ip nhrp redirect

On the spokes,
R4(config-if)#int tun0
R4(config-if)#ip nhrp shortcut --> make sure the tunnel mode is gre multipoint


OSPF:
In phase2, OSPF network type point-to-multipoint is not supported, as the hub will not preserve the next hop and will always set itself as the next hop.
In phase3, we can use the point-to-multipoint network type, as the hub can send a redirect message for spoke-to-spoke traffic.


Wednesday, June 10, 2015

DMVPN-Part1

DMVPN solves some of the scalability issues with GRE tunnels:
·         Highly scalable
·         Easy configuration

DMVPN Phase1:
In phase1 of DMVPN, the hub is a multipoint GRE tunnel and the spokes are point-to-point GRE tunnels. This means:
·         Spoke-to-spoke traffic must go through the hub
·         Simplified routing: just a default route on the spokes will do
·         Summarization and default routing can be used on the hub
·         The next hop is always changed by the hub

Here is the sample topo for discussion (topology figure omitted).

The IP addresses used are just to illustrate that a spoke only needs an Internet connection to be part of the DMVPN, and each spoke can be in any arbitrary network as long as connectivity is available.

For our discussions and configurations:
NBMA address – 169.254.100.xx
Overlay DMVPN network – 14.1.1.xx

We do ping tests on the loopbacks of the devices; the IP addresses are 150.1.x.x.

Configuration:

Spoke Configuration,

R3#    sh run int tun0
Building configuration...

Current configuration : 352 bytes
!
interface Tunnel0
 ip address 14.1.1.3 255.255.255.0
 ip mtu 1400
 ip nhrp authentication NHRPAUTH
 ip nhrp group INE
 ip nhrp map multicast 169.254.100.5
 ip nhrp map 14.1.1.5 169.254.100.5
 ip nhrp network-id 1
 ip nhrp nhs 14.1.1.5 --> overlay address of the HUB
 ip tcp adjust-mss 1360
 tunnel source GigabitEthernet0/0.100
 tunnel destination 169.254.100.5 --> NBMA/public/underlay address of the HUB
 tunnel key 2
end

HUB Configuration

R5#sh run int tun0
Building configuration...

Current configuration : 293 bytes
!
interface Tunnel0
 ip address 14.1.1.5 255.255.255.0
 no ip redirects
 no ip split-horizon eigrp 100
 ip nhrp authentication NHRPAUTH
 ip nhrp group INE
 ip nhrp map multicast dynamic
 ip nhrp network-id 1
 tunnel source GigabitEthernet0/0.100
 tunnel mode gre multipoint --> mode must be multipoint on the HUB; the spokes specify a destination instead
 tunnel key 2
end

Once the DMVPN network is established and a routing protocol, say EIGRP, is enabled, the routers will form adjacencies as if they were connected to a LAN.
The spokes will multicast the hello packets to the HUB.
The hub will multicast the hello packets to the spokes; it's basically replication of the packets.

The routes learned on a router (routing-table figure omitted).

A ping from spoke R1 to spoke R3 will go as follows (packet captures omitted):
The destination is reachable via the DMVPN tunnel, so the ICMP echo request gets GRE-encapsulated.
The ICMP echo request is sent to the hub.
The hub will re-encapsulate it and send it to spoke R3.
Spoke R3 will send the reply to the hub.
The hub will re-encapsulate the reply and send it to spoke R1.


Tuesday, June 9, 2015

IPSec-Part2

In this post we will focus on 
GRE over IPSec
IPSec over GRE

GRE over IPSec falls under the category of route-based VPNs.

The following are the limitations of policy-based VPNs:
  • They do not support multicast or non-IP traffic.
  • The interesting traffic must be defined through an ACL, which increases configuration complexity and maintenance.

GRE Over IPSec (IPSEC is transport):

It's basically:

[L2 Header][Outer IP][ESP][GRE][Inner IP][Data]


When doing GRE over IPsec, what really changes compared with normal IPsec encryption is WHAT MUST BE ENCRYPTED.

The decision of whether traffic is encrypted depends on the routing protocols:
if an OSPF route points to a tunnel and the tunnel is running encryption, that particular traffic is encrypted.

This is how the limitation of policy-based VPNs is overcome, and complex/frequent ACL changes are not required.

Configuration-wise it's the same as the traditional way of setting up an IPsec tunnel.

What changes is the way we define the proxy ACL.
In GRE over IPsec, the proxy ACL matches just the endpoints of the GRE tunnel:
'permit gre host A host B' (or 'permit gre any any')

Here the crypto map is applied under the physical interface that the GRE tunnel uses. So GRE encapsulation happens first and encryption second.

If we apply the crypto map to the GRE tunnel interface, it becomes IPsec over GRE, where encryption happens first and encapsulation second.


Eg:

R1----------R2==============R3-----------R4

R2==R3 --> GRE over IPSec tunnel.
R1 and R4 are end hosts that run TCP and ping applications.


Configuration Steps:
Create a GRE tunnel between R2 and R3.
Create ISAKMP policy.
Create crypto map and associate it with the physical interface that the tunnel will use.
R2(config)#ip access-list ext GRE
R2(config-ext-nacl)#permit gre any any

R2(config)#crypto map GRE_O_IPSEC 50 ipsec-isakmp
R2(config-crypto-map)#match address GRE
R2(config-crypto-map)#set peer 4.4.4.4
R2(config-crypto-map)#set transform-set 3DES_MD5
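The ISAKMP policy and transform set referenced above are not shown in the original notes; a minimal sketch with assumed values (3DES/MD5, a pre-shared key, and fa0/1 as the physical interface facing the peer):
R2(config)#crypto isakmp policy 10
R2(config-isakmp)#encryption 3des
R2(config-isakmp)#hash md5
R2(config-isakmp)#authentication pre-share
R2(config)#crypto isakmp key CISCO address 4.4.4.4
R2(config)#crypto ipsec transform-set 3DES_MD5 esp-3des esp-md5-hmac
R2(config)#int fa0/1
R2(config-if)#crypto map GRE_O_IPSEC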

Transport mode is negotiated only when the traffic is from one router to the other router (i.e., sourced locally and destined to the other endpoint).
This is controlled by the proxy ACL.
For traffic going through the router --> tunnel mode is always negotiated, irrespective of the configuration.
For traffic going to the router --> tunnel or transport mode, as per the configuration in the crypto map.


If the proxy ACL on R2 is configured as 'permit gre any any' --> the IPsec mode will be tunnel, irrespective of the crypto map config.
If the proxy ACL on R2 is configured as 'permit gre host 10.2.2.2 host 10.4.4.4' and transport mode is set in the transform set, the IPsec tunnel will come up in transport mode.


Because of the GRE and ESP headers, the effective MTU is reduced. So if traffic is sent with the default MTU, the routers at the tunnel endpoints have to do fragmentation, resulting in higher CPU usage.

So, to avoid fragmentation, set the MTU to lower values.
If the hosts don't run PMTUD, set the MSS in the TCP SYN and SYN-ACK packets.
On R2 and R3:
int tunnel0
ip tcp adjust-mss 1400

For UDP, fragmentation has to be handled on the end hosts themselves.

On R1 and R4:
ip tcp mss 1450 --> this applies when the TCP session is from the router itself; affects BGP, MSDP, ...




IPSec VTI:

Conceptually the same as GRE over IPSec, but without the additional GRE header overhead.

Static VTI --> used for site-to-site
Dynamic VTI --> used for remote access


A comparison of GRE over IPSec and VTI:

Overhead:
·        GRE over IPSec: more overhead, but negligible (4 bytes). We use GRE over IPsec because a crypto map cannot define an interface in the routing table, so a dynamic routing protocol couldn't run without the tunnel interface.
·        VTI: saves the 4 bytes of GRE overhead. With IPsec VTI we have an interface in the routing table, which removes the need for the extra GRE IP header encapsulation.

Protocol support:
·        GRE over IPSec: multiprotocol encapsulation (IPv4, IPv6, IS-IS, etc.).
·        VTI: single protocol (IPv4 only over an IPv4 IPsec tunnel; IPv6 only over an IPv6 IPsec tunnel).

Line protocol:
·        GRE over IPSec: line protocol is based on the route to the destination.
·        VTI: line protocol status is accurate, based on the IPsec phase 2 negotiation.

Configuration:
·        GRE over IPSec:
R4(config)#int tun0
R4(config-if)#tunnel mode gre ip
R4(config-if)#tunnel protection ipsec profile PROFILE1
·        VTI:
R4(config)#int tun0
R4(config-if)#tunnel mode ipsec ipv4
R4(config-if)#tunnel protection ipsec profile PROFILE1

Frame format:
·        GRE over IPSec: [Eth Header][IP Header][GRE][Data]
·        VTI: [Eth Header][IP Header][ESP header][Data][ESP trailer]

Modes:
·        GRE over IPSec: supports both tunnel and transport modes.
·        VTI: supports only tunnel mode.

Path MTU discovery:
·        GRE over IPSec: the DF bit is not carried up to the ESP header, so applications cannot do path MTU discovery.
·        VTI: the DF bit is carried up to the ESP header. Applications can do path MTU discovery, and we need not configure 'ip mtu' under the tunnel interface. We can still configure 'ip tcp adjust-mss' for applications that can't do path MTU discovery.

Order of operations:
·        GRE over IPSec: tunnel first, then encrypt.
·        VTI: encrypt, then tunnel.


VTI configuration:
Phase 1 is the same as in a crypto map based tunnel.

For phase 2:
The tunnel itself defines who the endpoint is (the tunnel destination).
The tunnel also implicitly defines the traffic (effectively 'ip any any').

We just need to configure how the traffic must be protected, using a 'crypto ipsec profile'.
An IPsec profile just specifies the transform set to be used to protect the data plane.

R2(config)#crypto ipsec profile PROFILE2
R2(ipsec-profile)#set transform-set 3DES_MD5

The profile can be applied to both a GRE tunnel and an IPsec VTI tunnel.
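A minimal end-to-end sketch of a static VTI using the profile above; the addresses and source interface are assumptions:
R2(config)#int tunnel0
R2(config-if)#ip address 10.0.0.1 255.255.255.252
R2(config-if)#tunnel source fa0/1
R2(config-if)#tunnel destination 192.0.2.2
R2(config-if)#tunnel mode ipsec ipv4
R2(config-if)#tunnel protection ipsec profile PROFILE2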


Some platforms may not do hardware switching of GRE packets.

IPSec over DMVPN:

DMVPN is a point-to-multipoint layer 3 overlay VPN.
Logical hub-and-spoke topology; direct spoke-to-spoke traffic is supported.

DMVPN is an mGRE-based routing technique.

Order of operations:
Crypto first
NHRP second
Routing third

So if the crypto IPsec tunnel configuration is wrong, DMVPN will not work.

The configuration is the same as in GRE over IPSec.
The peer address to use in the ISAKMP policy is the NBMA address; this is important to understand, and not to be confused with the tunnel private address (10.1.100.x in this case).
The crypto process is the first thing to start: IF IPSEC IS NOT COMPLETED, TUNNELS WILL NOT GO UP.
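A minimal sketch of attaching IPsec to the mGRE tunnel on the hub; the profile name and transform set are assumptions:
R5(config)#crypto ipsec profile DMVPN_PROF
R5(ipsec-profile)#set transform-set 3DES_MD5
R5(config)#int tun0
R5(config-if)#tunnel protection ipsec profile DMVPN_PROF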

show crypto ipsec sa | i pkts|peer
show ip traffic | i Frag|frag


IPSec over GRE :

It's basically:

[L2 Header][Outer IP][GRE][ESP][Inner IP][Data]

Apply the crypto map under the tunnel interface.
The proxy ACL has to match the end-to-end entities (the actual host-to-host traffic).

Encryption happens first, and then GRE tunnel encapsulation, as sketched below.
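A minimal sketch with all values hypothetical (end-host subnets, peer, and transform set):
R2(config)#ip access-list ext E2E
R2(config-ext-nacl)#permit ip 10.1.1.0 0.0.0.255 10.4.4.0 0.0.0.255
R2(config)#crypto map IPSEC_O_GRE 10 ipsec-isakmp
R2(config-crypto-map)#match address E2E
R2(config-crypto-map)#set peer 10.3.3.3
R2(config-crypto-map)#set transform-set 3DES_MD5
R2(config)#int tun0
R2(config-if)#crypto map IPSEC_O_GRE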