Understanding and troubleshooting internet routing issues are crucial for IT teams tasked with maintaining workplace operations.

But troubleshooting these routing problems is tricky because the usual tools, like ping and traceroute, don’t always tell you what you need to know.

This guide provides a comprehensive look at internet routing issues, including common causes, useful tips, must-have tools and more. With a clear understanding of these aspects, you can better navigate and resolve the challenges posed by internet routing issues, ensuring a smoother and more reliable online experience for clients, employees, and end-users.

What is an internet routing issue?

An internet routing issue refers to a problem in the network that hinders the proper delivery of data packets from their source to their intended destination.

Routing is the process by which data, often in the form of packets, is forwarded from one network to another until it reaches its target location. This process relies on a complex system of routers, switches, and protocols that determine the best path for the data to travel.

When an internet routing issue occurs, it can manifest in various forms, such as:

  • Delayed packet delivery
  • Packet loss
  • Inaccessible websites or services
  • Incorrect data path

5 common causes of internet routing issues

Internet routing issues can stem from a variety of causes, each impacting the efficiency and reliability of data transmission across networks.

Here are five common causes:

  1. Hardware failures: Routers, switches, and other networking hardware are crucial for routing internet traffic. Hardware failures, whether due to aging equipment, manufacturing defects, or external factors like power surges, can disrupt the routing paths, leading to connectivity issues.
  2. Software bugs: Firmware and software that govern networking devices are not immune to bugs and glitches. Software issues can lead to incorrect routing decisions, loops, or even complete network outages. Regular updates and patch management are essential to mitigate these risks.
  3. Configuration errors: Human error in configuring network devices can lead to major routing problems. Misconfigured routers might send traffic down inefficient or non-existent paths, causing delays or loss of connectivity. This includes errors in setting up routing protocols, access control lists (ACLs), or incorrect subnetting.
  4. Network overloads and congestion: High traffic volumes can overwhelm network resources, leading to delays and packet loss. This is especially common in networks without adequate capacity planning or in scenarios where there is a sudden spike in traffic.
  5. BGP (Border Gateway Protocol) misconfigurations or hijacks: BGP is crucial for routing traffic between different autonomous systems on the internet. Misconfigurations or malicious hijacking of BGP routes can lead to traffic being routed inefficiently or even maliciously intercepted, which can disrupt global internet traffic.

Identifying which of these causes is at work is the first step in troubleshooting an internet routing issue.

How packets are routed

Let’s start here with the basics of how a packet is routed through a network, which illuminates critical subtleties that are useful when troubleshooting.

The originating device puts three important parameters into the IP packet header:

  • The source IP address, which is the address of the device itself
  • The destination IP address, which is where the packet is going
  • The IP protocol, such as TCP, UDP, or ICMP

In the case of UDP and TCP, there are two additional numbers, both of which are important: the source and destination port numbers. The destination IP address is what we normally think of in routing, but actually, the network can route the packet using any combination of these values.

Another parameter called time to live (TTL) governs how far away the destination can be. The name is deceptive because it doesn’t really have anything to do with time. TTL is a hop counter: each router that forwards the packet decrements it by one, and a packet whose TTL reaches zero is discarded, which keeps packets from circulating forever in a routing loop.

The first task of the originating device is to look up the destination address in its own internal routing table. On Windows, you can view this table with the “route print” command.

C:\Users\Kevin>route print
[…]
===========================================================================
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
          0.0.0.0          0.0.0.0       10.10.80.1       10.10.80.2     25
       10.10.80.0    255.255.255.0         On-link        10.10.80.2    281
       10.10.80.2  255.255.255.255         On-link        10.10.80.2    281
     10.10.80.255  255.255.255.255         On-link        10.10.80.2    281
        127.0.0.0        255.0.0.0         On-link         127.0.0.1    306
        127.0.0.1  255.255.255.255         On-link         127.0.0.1    306
  127.255.255.255  255.255.255.255         On-link         127.0.0.1    306
        224.0.0.0        240.0.0.0         On-link         127.0.0.1    306
        224.0.0.0        240.0.0.0         On-link        10.10.80.2    281
  255.255.255.255  255.255.255.255         On-link         127.0.0.1    306
  255.255.255.255  255.255.255.255         On-link        10.10.80.2    281
=========================================================================== 

This example shows a lot of destination networks, but really only two of them matter. The first line is the default route. Network 0.0.0.0 with mask 0.0.0.0 matches any destination. This default route points to a next-hop device, my router, 10.10.80.1.

The other entry in this routing table that matters is the second one, for 10.10.80.0 with a mask of 255.255.255.0. This matches any destination between 10.10.80.0 and 10.10.80.255, my local network segment, which includes my router.
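
The lookup the PC performs can be sketched in a few lines of Python. This is only an illustration, not how Windows implements it: `ROUTES` and `lookup` are names made up for this example, the rows are copied from the table above, 192.0.2.10 is an arbitrary documentation address standing in for a remote host, and ties between equally specific routes (which the real table breaks using the metric) are ignored.

```python
import ipaddress

# (network, netmask, gateway) rows copied from the "route print" output above.
ROUTES = [
    ("0.0.0.0", "0.0.0.0", "10.10.80.1"),
    ("10.10.80.0", "255.255.255.0", "On-link"),
    ("10.10.80.2", "255.255.255.255", "On-link"),
    ("10.10.80.255", "255.255.255.255", "On-link"),
]


def lookup(destination: str) -> str:
    """Return the gateway of the most specific route matching destination."""
    addr = ipaddress.ip_address(destination)
    best_net, best_gateway = None, None
    for network, mask, gateway in ROUTES:
        net = ipaddress.ip_network(f"{network}/{mask}")
        # Longest prefix wins: a /24 beats the /0 default route.
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_gateway = net, gateway
    if best_gateway is None:
        raise LookupError(f"no route to {destination}")
    return best_gateway


print(lookup("192.0.2.10"))   # only the default route matches
print(lookup("10.10.80.77"))  # the local /24 is more specific than the default
```

The default route matches everything, so it only wins when nothing more specific does. That is why a missing specific route tends to fail silently: the packet still goes somewhere.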

Based on this table, my PC knows to forward this packet to my router. To do so, it wraps the IP packet in an Ethernet frame with the router’s Ethernet MAC address in the destination field and its own Ethernet MAC address in the source field so that the router knows how to forward return packets.

The router strips off the Ethernet frame and looks in its own routing table to know how to reach the destination IP address.

Router>show ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, * - candidate default
U - per-user static route, o - ODR

10.0.0.0/8 is variably subnetted, 3 subnets, 2 masks
C 10.10.80.0/24 is directly connected, Ethernet0
C 10.10.1.0/24 is directly connected, Ethernet1
C 10.25.8.1/32 is directly connected, Loopback0
S 0.0.0.0/0 [1/0] via 10.10.1.4
[…]

This routing table uses CIDR “slash” notation instead of separate network and mask, but it conveys similar information to the PC’s “route print” command. Look for the entry that matches the destination IP address. Let’s assume it’s the default route, 0.0.0.0/0. The router sees that it points to a “next hop”, 10.10.1.4, which is another router.

The router then creates a new Ethernet frame using the MAC address of this next-hop router for the destination address and wraps it around the original IP packet. The only important change it makes to that original packet is to decrease the TTL value by one.
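
A single forwarding step can be modeled roughly as follows. This is a simplified sketch: the `Packet` and `Frame` classes, the `forward` function, and the MAC addresses are all invented for illustration, and a real router also recomputes the IP header checksum and sends an ICMP message when it drops an expired packet.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    ttl: int


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    payload: Packet


def forward(frame: Frame, router_mac: str, next_hop_mac: str) -> Optional[Frame]:
    """Strip the old frame, decrement the TTL, and re-wrap for the next hop."""
    packet = frame.payload                # the Ethernet frame is discarded
    if packet.ttl <= 1:
        return None                       # TTL would reach zero: drop the packet
    packet = replace(packet, ttl=packet.ttl - 1)
    # The IP addresses are untouched; only the MAC addresses are new.
    return Frame(src_mac=router_mac, dst_mac=next_hop_mac, payload=packet)


incoming = Frame("aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02",
                 Packet("10.10.80.2", "192.0.2.10", ttl=64))
outgoing = forward(incoming, "aa:aa:aa:aa:aa:03", "aa:aa:aa:aa:aa:04")
print(outgoing)
```

Note that the source and destination IP addresses survive every hop unchanged, while the MAC addresses are only meaningful on one network segment.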

The whole process is repeated until the packet is delivered to the destination.

Must-have tools for troubleshooting internet routing issues

The standard go-to tools for troubleshooting routing problems are ping and traceroute.

Ping is a very simple-minded tool. It sends an Internet Control Message Protocol (ICMP) “echo request” packet to the destination device, which sends back an “echo reply” packet. ICMP is a special IP protocol, different from either TCP or UDP. ICMP packets don’t contain source or destination ports, just a “type,” such as “echo request” or “echo reply.” That’s it. If the request can get all the way through to the destination and the reply can get all the way back, then you know you have Layer 3 connectivity.

The problem with ping should be obvious from its description: It tells you nothing about where the failure is if you can’t reach the destination. This is where traceroute comes in. Traceroute also sends packets (either UDP or ICMP depending on the implementation) to the destination IP address and looks for a response, but it actually tries several times while manipulating the TTL field that I mentioned earlier.

The first time it tries, traceroute sends the packet with a TTL value of 1. The first router decrements the TTL to 0, and no router is supposed to forward a packet whose TTL has reached zero. So the router drops the packet and sends back a special “TTL exceeded” type ICMP packet to the source. Traceroute reports the source IP address of that ICMP packet. Now you know the first hop. It will generally do this three times per hop, which also shows whether the route is stable.

Then traceroute increments the initial TTL value and sends another packet. This time the first router decrements the TTL from 2 to 1 and forwards the packet to the next-hop router, which decrements it to 0, drops the packet, and sends back an ICMP message. Traceroute displays the IP address of that router. This process repeats with initial TTL values of 3, 4, 5, and so on until the destination is reached.
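
The TTL trick is easier to see in a small simulation. The sketch below assumes probes start at a TTL of 1, as common traceroute implementations do, and models the network as a plain list of addresses. The function names `probe` and `traceroute` and the destination 192.0.2.10 are made up for this example; no packets are actually sent.

```python
def probe(path, ttl):
    """Send one simulated probe along path and return who answers it.

    Every entry except the last is a router; the last is the destination.
    Each router decrements the TTL and answers "ttl-exceeded" if it hits zero.
    """
    for router in path[:-1]:
        ttl -= 1
        if ttl == 0:
            return router, "ttl-exceeded"
    return path[-1], "reached"


def traceroute(path, max_hops=30):
    """Raise the initial TTL one step at a time until the destination answers."""
    hops = []
    for initial_ttl in range(1, max_hops + 1):
        responder, status = probe(path, initial_ttl)
        hops.append((initial_ttl, responder))
        if status == "reached":
            break
    return hops


# PC's router, the next-hop router from the example above, then a hypothetical host.
PATH = ["10.10.80.1", "10.10.1.4", "192.0.2.10"]
for ttl, address in traceroute(PATH):
    print(ttl, address)
```

Each initial TTL value reveals exactly one device: the one where the counter runs out.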

Traceroute will often show you several hops, followed by line after line of “* * *”, which means that it didn’t get back the “TTL exceeded” message. This usually means that the last device you saw explicitly listed is the last one that has a good route to the destination. Whoever it forwarded that packet to didn’t know what to do with it.

But it’s also possible that the packet was forwarded, but you just didn’t get the “TTL exceeded” message. Sometimes firewalls in particular will refuse to send this message. And sometimes firewalls will actively block these packets from all downstream devices. So it’s not conclusive, but it gives you an idea of where to start looking for trouble.

The other interesting thing you sometimes see in a traceroute session is multiple next-hop IP addresses for the same TTL value. This tells you that there are actually multiple paths to the destination, all with the same routing cost. This is only a problem if there are firewalls in the path. A firewall will generally object to forwarding response packets back to a source if it didn’t see the original packet going the other way. To the firewall, this looks like a protocol violation, so it will usually drop the unexpected packet.
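
Why equal-cost paths and firewalls clash can be shown with a toy model of connection tracking. This is a deliberately minimal sketch: the `StatefulFirewall` class is hypothetical, it tracks flows by address and protocol only, and real firewalls also track ports, sequence numbers, and timeouts.

```python
class StatefulFirewall:
    """Forward replies only for flows this firewall saw leaving."""

    def __init__(self):
        self.flows = set()

    def outbound(self, src, dst, protocol):
        self.flows.add((src, dst, protocol))   # remember the original packet
        return "forward"

    def inbound(self, src, dst, protocol):
        # A reply swaps the source and destination of the original packet.
        return "forward" if (dst, src, protocol) in self.flows else "drop"


# Two firewalls, one on each of two equal-cost paths.
firewall_a, firewall_b = StatefulFirewall(), StatefulFirewall()

firewall_a.outbound("10.10.80.2", "192.0.2.10", "tcp")          # request leaves via A
print(firewall_a.inbound("192.0.2.10", "10.10.80.2", "tcp"))    # reply returns via A
print(firewall_b.inbound("192.0.2.10", "10.10.80.2", "tcp"))    # reply returns via B
```

The reply is identical in both cases; only the path differs, and firewall B has no record of the request.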

How to use routing loops to troubleshoot routing issues

The other thing that traceroute will sometimes show you is a loop. Somewhere along the path, you’ll see an IP address that you’ve already seen. That is, you’ll see a path that goes from router A to B, C, D, C, D, C, and so on. This tells you that router C is forwarding the packet to router D, which forwards it back to C.

This is actually the reason the TTL field exists. The source device sets an initial TTL, commonly 64 or 128 depending on the operating system, and never more than the field’s maximum of 255. In a loop, the TTL value will eventually decrement to 0 and the packet will be dropped. There are no infinite loops in IP, but it’s still a bad thing because your packets aren’t getting through and congestion problems could result.
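
Spotting a loop in long traceroute output by eye is error-prone, so here is a rough sketch of automating it. It assumes you have already parsed the output into a list with one address per TTL and `None` for each “* * *” line; the `find_loop` function and the router names A through D are illustrative only.

```python
def find_loop(hops):
    """Return the repeating cycle of addresses in a hop list, or None."""
    first_seen = {}
    for index, address in enumerate(hops):
        if address is None:          # no reply for this TTL, nothing to compare
            continue
        if address in first_seen:
            # Everything from the first sighting up to here is one trip
            # around the loop.
            return hops[first_seen[address]:index]
        first_seen[address] = index
    return None


print(find_loop(["A", "B", "C", "D", "C", "D", "C"]))
print(find_loop(["A", "B", "C", "D"]))
```

A repeated address is a strong hint rather than proof, since paths with multiple equal-cost next hops can also produce unusual hop lists.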

If you see a loop, you’ll need to figure out what the path is supposed to be and adjust the routing tables of the looping devices. Typically you’ll see loops in situations where a dynamic routing table is in conflict with a static route on one or both of the routers in question. This could happen, for example, if you have a static default route on one of the devices pointing to the other one. Then if the more specific route to your destination disappears for any reason, the router will use the default route and send the packet back to where it came from.

How to use protocol filters and policy routing to troubleshoot routing issues

Suppose ping and traceroute say everything is fine, but your application packets still aren’t getting through. This is typically due to either a filter of some kind or policy routing.

Protocol filters are also called access control lists (ACLs). You can find these filters on Cisco routers, switches, and firewalls by searching the configuration file for “access-group” commands, which apply the ACL to an interface.

An ACL can allow one type of traffic and block another type. For example, you might find that the ICMP ping packets are allowed but your application traffic is not. In this case, the routing tables will look right and the ping and traceroute tests will work, but you won’t be able to run the application.
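
The situation just described, where ping works but the application does not, can be reproduced with a toy first-match filter. The `Rule` class, the `evaluate` function, and the specific rules and port numbers are assumptions for illustration; the one behavior borrowed from real Cisco ACLs is that traffic matching no rule is denied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                      # "permit" or "deny"
    protocol: str                    # "icmp", "tcp", "udp", or "ip" for any
    dst_port: Optional[int] = None   # None matches any port


def evaluate(acl, protocol, dst_port=None):
    """Return the action of the first rule that matches the packet."""
    for rule in acl:
        if rule.protocol not in ("ip", protocol):
            continue
        if rule.dst_port is not None and rule.dst_port != dst_port:
            continue
        return rule.action
    return "deny"                    # implicit deny at the end of the list


# Hypothetical ACL: ping and HTTPS are allowed, nothing else is listed.
ACL = [Rule("permit", "icmp"), Rule("permit", "tcp", 443)]

print(evaluate(ACL, "icmp"))         # the ping test
print(evaluate(ACL, "tcp", 8080))    # an application on an unlisted port
```

Because the ping test and the application are judged by different rules, a successful ping says nothing about whether the application's port is open.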

Policy routing (also called policy-based routing or PBR) can cause even stranger problems if it goes awry. Policy routing means the router will override the routing table when making its forwarding decisions. Instead, it might make its decisions based on the source IP address, protocol, or port number. So the router could be forwarding the ping packets through one path and the application traffic in a completely different way.

If you suspect that policy routing is causing your problems, the first thing to do is to look at the router configuration files for an interface configuration block that includes an “ip policy” command. This command will refer to a route map, which in turn will define how the packets are to be routed.

interface Ethernet0
 ip address 10.10.5.1 255.255.255.0
 ip policy route-map FUNKYROUTING
!
route-map FUNKYROUTING
 match ip address 100 
 set ip next-hop 10.10.6.1
!

In this example, the policy will override whatever is in the routing table for those packets that match ACL number 100 and always forward them to the specified next-hop router. The ACL could identify these packets based on source or destination addresses or ports, or any combination.
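
The decision order that the route map imposes can be sketched like this. The contents of ACL 100 are not shown in the configuration above, so the match in `matches_acl_100` (TCP port 443 from 10.10.5.0/24) is purely an assumption, as are the function names and the one-entry routing table.

```python
import ipaddress

POLICY_NEXT_HOP = "10.10.6.1"            # from "set ip next-hop" in the route map
ROUTES = {"0.0.0.0/0": "10.10.1.4"}      # assumed routing table: default route only


def matches_acl_100(src, protocol, dst_port):
    """Hypothetical ACL 100: TCP from 10.10.5.0/24 to port 443."""
    in_subnet = ipaddress.ip_address(src) in ipaddress.ip_network("10.10.5.0/24")
    return in_subnet and protocol == "tcp" and dst_port == 443


def table_lookup(dst):
    """Ordinary longest-prefix lookup on the destination address."""
    addr = ipaddress.ip_address(dst)
    matches = [ipaddress.ip_network(prefix) for prefix in ROUTES
               if addr in ipaddress.ip_network(prefix)]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[str(best)]


def next_hop(src, dst, protocol, dst_port=None):
    # Policy routing is checked first and wins over the routing table.
    if matches_acl_100(src, protocol, dst_port):
        return POLICY_NEXT_HOP
    return table_lookup(dst)


print(next_hop("10.10.5.20", "192.0.2.10", "icmp"))       # ping follows the table
print(next_hop("10.10.5.20", "192.0.2.10", "tcp", 443))   # application is diverted
```

Both packets have the same source and destination, yet they leave through different next hops, which is why ping and traceroute can look healthy while the application path is broken.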

Whenever PBR is configured in your network, be extremely careful when troubleshooting routing problems: the routing table alone no longer tells you where a given packet will go.

How to use VPNs to troubleshoot routing issues

Another important place to look when troubleshooting routing problems is virtual private network (VPN) configuration. Many companies interconnect their remote offices using VPNs through the internet. Sometimes the VPN is a backup link in case a primary private circuit or MPLS service goes down, and sometimes the VPN is the only link. IPsec VPNs are typically used for interconnecting networks.

The critical thing to watch out for in the VPN configuration is the “interesting traffic list.” This is an ACL that defines what packets may use the VPN link, generally identifying both source and destination networks. Watch out for mismatches between the ACL on the devices on both ends of the VPN, as well as possibly missing networks.
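
One way to check for such mismatches is to compare the two lists as mirror images of each other. The sketch below assumes you have extracted each ACL by hand into a set of (source network, destination network) pairs; the `mirror_mismatches` function and the 10.20.0.0/24 remote network are made up for this example.

```python
def mirror_mismatches(local_acl, remote_acl):
    """Compare two interesting-traffic lists that should mirror each other.

    Each ACL is a set of (source_network, destination_network) pairs.
    Both results are written from the local side's point of view.
    """
    remote_mirrored = {(dst, src) for src, dst in remote_acl}
    missing_on_remote = sorted(local_acl - remote_mirrored)
    missing_on_local = sorted(remote_mirrored - local_acl)
    return missing_on_remote, missing_on_local


SITE_A = {
    ("10.10.80.0/24", "10.20.0.0/24"),
    ("10.10.1.0/24", "10.20.0.0/24"),
}
SITE_B = {
    ("10.20.0.0/24", "10.10.80.0/24"),   # no entry back to 10.10.1.0/24
}

missing_on_remote, missing_on_local = mirror_mismatches(SITE_A, SITE_B)
print("defined locally but not mirrored remotely:", missing_on_remote)
print("defined remotely but not mirrored locally:", missing_on_local)
```

This compares entries as exact strings, so a /24 on one side and two /25 entries on the other would be reported as a mismatch even though they cover the same addresses.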

  1. Yogesh

    I usually confused about routing troubleshooting and this is really useful. I appreciate your efforts..Thanks

  2. Ed

    Kevin, I have been a Network Engineer for 30+ years. You did a spectacular job explaining routing/gateways/route-map/ACL and made it so simple. For years I have been discussing this basic rule to several jr. net-workers yet they stumble in these three areas. TTL, UDP and echo reply just flow in your Troubleshooting document dated July 2 2014.
    I would like to use your explanation in the future, I know this will not be the last time I need to discuss gateways, protocols and route-maps. Such a simple idea explained simple! I will stop using my idea “be the packet”
    thanks.

  3. Geoff

    Kevin, this may be an old post but it is still very useful. Even I understood it! Thanks for clarifying these concepts so well.
