Let me start by saying that spanning tree is a Good Thing. It saves you from loops, which will completely shut down a network. But it has to be configured properly to work properly. I can’t count the number of times I’ve had a client call me, desperate with a terribly broken network, and I’ve responded, “Sounds like a spanning tree problem.”
There are many ways things can go wrong with spanning tree. In this article, I’ve collected six of the recurring themes.
1. Not configuring spanning tree at all
As I said, spanning tree is a good thing. But for some reason, a lot of switch vendors disable it by default. So out of the box, you might have to enable the protocol.
Sometimes people deliberately disable spanning tree. The most common reason for disabling spanning tree is that the original 802.1D Spanning Tree Protocol (STP) goes through a fairly lengthy wait period from the time a port becomes electrically active to when it starts to pass traffic. This waiting period, typically 30 seconds with default timers (15 seconds each in the listening and learning states), is long enough that DHCP can give up trying to get an IP address for this new device.
One solution to the problem is to simply disable spanning tree on the switch. This is the wrong solution.
The right solution is to configure a feature called PortFast on Cisco switches. (Most switch vendors have a similar feature.) You configure the command “spanning-tree portfast” on all the ports connecting to end devices like workstations. They then automatically bypass the waiting period and DHCP works properly.
It’s important to only configure this command on ports that connect to end devices though. Ports connecting to other switches need to exchange spanning tree information.
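On a Cisco IOS switch, a minimal sketch looks something like the following (the interface names and descriptions are just examples; other vendors' edge-port features use different syntax):

```
! Edge port facing a workstation: skip the listening/learning delay
interface GigabitEthernet1/0/10
 description workstation port
 spanning-tree portfast
!
! Uplink to another switch: leave spanning tree alone so BPDUs
! are exchanged and loops can still be detected
interface GigabitEthernet1/0/48
 description uplink to distribution switch
 no spanning-tree portfast
```

Many IOS platforms also support the global command `spanning-tree portfast default`, which enables PortFast on all non-trunking (access) ports at once, which can be less error-prone than configuring it port by port.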
2. Letting the network pick your root bridge
As the name suggests, spanning tree resolves loops in your network by creating a logical tree structure between the switches. One switch becomes the root of the tree, and is called the root bridge. All other switches then figure out the best path to get to the root bridge.
If there are multiple paths, then on each switch, spanning tree selects the best path and puts all the other ports into a blocking state. In this way, there’s a single path between any two devices on the network, although it might be rather circuitous.
Every switch taking part in spanning tree has a bridge priority. The switch with the lowest priority becomes the root bridge. If there’s a tie, then the switch with the lowest bridge ID number wins. The ID number is typically derived from a MAC address on the switch.
The problem is that, by default, every switch has the same priority value (32768). So if you don’t manually configure a better (lower) bridge priority value on a particular switch, the network will simply select a root for you. Then Murphy’s Law applies. The resulting root bridge could be some tiny edge switch with slow uplinks and limited backplane resources.
To make matters worse, a bad choice of root bridge can make the network less stable. If there’s a connectivity problem that takes any random switch off the network, spanning tree heals rather quickly. But if the root bridge goes down, or if the failure means that some switches no longer have a path to the root bridge, this constitutes a major topology change. A new root bridge needs to be selected. The entire network will freeze during this time and no packets can be forwarded.
I always recommend making the core switch the root bridge. I also like to select a backup root bridge. If there are dual redundant core switches, then one is the root bridge and the other becomes my backup.
Set the bridge priority on the primary root bridge to the best possible value—4096—and the backup root bridge to the next best value—8192. Why these funny numbers? Well, that’s a long story that we don’t have space for here, but the lower order bits in the priority field have another purpose, so they aren’t available for use as priorities.
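On Cisco switches running per-VLAN spanning tree, that recommendation translates to something like the following (the VLAN range is an example; adjust it to the VLANs you actually carry):

```
! Core Switch A: primary root bridge for all VLANs
spanning-tree vlan 1-4094 priority 4096
!
! Core Switch B: backup root bridge
spanning-tree vlan 1-4094 priority 8192
```

IOS only accepts multiples of 4096 here because the low 12 bits of the 16-bit priority field carry the extended system ID (the VLAN number), leaving only the top 4 bits for the priority itself. There is also a convenience macro, `spanning-tree vlan <n> root primary`, but explicit priority values make the intended design more obvious when you read the configuration later.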
3. Using legacy 802.1D
The first open standard for spanning tree is called 802.1D. It’s one of the earliest standards in the IEEE 802 series of standards that includes the specifications for every type of Ethernet and Wi-Fi as well as a bunch of other protocols. It works well despite its age, and you’ll find this type of spanning tree on just about every switch. Any switch that doesn’t support 802.1D is only useful in small isolated environments, and should never be connected to any other switches.
But there have been several important advancements to spanning tree since 802.1D. These improvements allow sub-second convergence following a link failure, as well as the ability to scale to larger networks and the ability to actually have different spanning tree topologies and different root bridges for different VLANs. So it makes a whole lot of sense to use them.
Most modern Cisco switches default to a protocol called Per-VLAN RSTP (Rapid PVST+), where RSTP stands for Rapid Spanning Tree Protocol. It automatically operates a separate spanning tree domain with a separate root bridge on every VLAN. In practice, it’s common to make the same switch the root bridge on all or most of the VLANs, though.
The rapid part of RSTP is what you’ll probably find most useful. This allows the network to recover from most failures in times on the order of 1 to 2 seconds. Multiple Spanning Tree, or MST (the protocol is sometimes called MSTP), is similar to RSTP. The main difference is that you can designate groups of VLANs that are all part of the same tree structure with a single common root bridge. However, I recommend using Per-VLAN RSTP in most cases because it’s easier to configure. Also, I’ve encountered some interoperability problems with MSTP between different switch vendors.
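On Cisco IOS, moving a switch from legacy per-VLAN 802.1D to Rapid PVST+ is a one-line global change:

```
! Enable Rapid PVST+ globally; existing per-VLAN priorities carry over
spanning-tree mode rapid-pvst
```

You can confirm the running mode afterward with `show spanning-tree summary`. Be aware that changing the mode triggers a brief topology recalculation, so do it in a maintenance window rather than mid-day.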
4. Mixing spanning tree types
It should be pretty clear from the descriptions of 802.1D, RSTP, and MST in the previous section that mixing them could get messy. The RSTP and MST protocols have rules for how to deal with this mixing, and in general, it involves creating separate zones within the network for groups of switches running different flavors of spanning tree. This rarely results in the most efficient paths being selected between devices.
The only really valid reason to mix spanning tree types is to allow the inclusion of legacy equipment that doesn’t support the more modern protocols. As time goes by, there should be fewer and fewer of these legacy devices, and the number of places where it makes sense to mix the protocols should become smaller.
I recommend picking one, preferably RSTP or MST, and just using that in a consistent manner across all of your switches.
5. Using MST with pruned trunks
Because MST allows a single spanning tree structure that supports multiple VLANs, you need to be extremely careful about your inter-switch trunks.
I once had a client with a large complicated network involving many switches and many VLANs. They were running MST. For simplicity, they had designated a single MST instance, meaning that all VLANs were controlled by the same root bridge.
The problem for this client arose when they decided that certain VLANs should only exist on certain switches for security reasons. All perfectly reasonable. So they removed the VLAN from the main inter-switch trunks and added new special trunks just for these secure VLANs. And everything broke.
MST considered all VLANs to be part of the same tree, and it selected which trunks to block and which to forward based on that assumption. But in this case, because some VLANs were only present on some trunks and other VLANs were present on the other trunks, blocking a trunk meant only passing some of the VLANs. Blocking the other trunk meant only passing the other set of VLANs. For the blocked VLANs there was simply no path to the root bridge at all.
So, if you’re going to use MST, you need to either ensure that all VLANs are passed on all trunks, or you need to carefully and manually create different MST instances for each group of VLANs with special topological requirements. In other words, you have to do careful analysis and design the network properly. Or you could take the easy way out and run Per-VLAN RSTP.
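If you do go the MST route, the instance-to-VLAN mapping is configured explicitly, and it must be identical (same region name, revision number, and mappings) on every switch in the region, or the switches will treat each other as separate regions. A sketch on Cisco IOS, with invented region name and VLAN groups:

```
spanning-tree mode mst
spanning-tree mst configuration
 name CAMPUS
 revision 1
 instance 1 vlan 10,20,30
 instance 2 vlan 100,200
exit
!
! A different root bridge can be designated per instance
spanning-tree mst 1 priority 4096
spanning-tree mst 2 priority 8192
```

Any VLAN not explicitly mapped falls into instance 0, which is exactly how the single-instance trap described above tends to happen.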
6. Conflicting root bridge and HSRP/VRRP
Another common topological problem with spanning tree networks involves the way that Layer 2 and 3 redundancy mechanisms sometimes interact.
Suppose I have a network core consisting of two Layer 3 switches. On each segment I want these core switches to act as redundant default gateways. And I want to connect all of the downstream switches redundantly to both core switches and make spanning tree remove the loops.
In this scenario, the spanning tree root bridge for a particular VLAN might be on one of these core switches, while the HSRP/VRRP master default gateway is on the other. Then an Ethernet frame originating on one of the downstream switches destined to the default gateway will need to take an extra hop, going first to the root bridge, and then to the secondary core switch that currently owns the default gateway IP.
Normally this isn’t a problem, but imagine that I’m passing packets between two VLANs, both with Core Switch A as the root bridge and Core Switch B as the default gateway. Every packet must go up to Core Switch A, and cross the backbone link to get routed on Core Switch B.
Then it has to cross the backbone link again to go back to Core Switch A to be delivered to its destination. All of the return packets must also cross the backbone link twice. This creates a massive traffic burden on the backbone link where every packet in both directions must cross twice. It also incurs a latency penalty as every packet needs to be serialized and transmitted twice. Even on 10Gbps links, this will typically cost a couple of microseconds in both directions, which could add up for particularly sensitive applications.
Suppose instead that the default gateway was on the same switch as the root bridge. Now the packet goes up to the root bridge, Core Switch A, gets routed between the VLANs, and immediately switched out to the downstream device. It doesn’t cross the backbone at all in either direction.
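In practice, this means configuring the best spanning tree priority and the winning HSRP priority on the same core switch for each VLAN. A sketch for Core Switch A on Cisco IOS (the addresses, VLAN, and group numbers are invented for illustration):

```
! Core Switch A: root bridge for VLAN 10...
spanning-tree vlan 10 priority 4096
!
! ...and also the HSRP active gateway for VLAN 10
interface Vlan10
 ip address 10.1.10.2 255.255.255.0
 standby 10 ip 10.1.10.1
 standby 10 priority 110
 standby 10 preempt
```

Core Switch B would get the mirror-image configuration: spanning tree priority 8192 and an HSRP priority lower than 110, so both roles fail over to it together.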
Spanning tree is a terrifically important protocol. It allows us to build redundancy into inter-switch connections. It saves us from catastrophic loops when somebody accidentally connects things they shouldn’t.
It’s true spanning tree can be misconfigured with bad consequences, but this possibility shouldn’t discourage you from using it. The solution is to be careful and deliberate about your network design.
Great assist Kevin!!! Tons of websites out there telling me how to set up the spanning tree, some convoluted sites explaining why to and the risks of without, but this truly encompasses my ideas that my network admin definitely didn’t set this up right. Thanks!
I happened to see this article via a coworker. Coming from a CCIE, this article contains many valid points and concepts that people forget, but it also leaves out many more, like root guard, BPDU guard, loop guard, UDLD, and the reasons some vendors disable spanning tree by default. There are enough other technologies these days that one shouldn’t be relying on spanning tree for its redundancy aspects within core/distribution infrastructure, or even from distribution to the edge: multi-chassis EtherChannel, VSS, VLT, vPC, backup ports, and routing protocols, for example. If you are using technology that isn’t compatible in any way with these protocols, and you are trying to achieve critical redundancy, then you are mixing consumer-grade solutions with enterprise-grade solutions.

Spanning tree’s main job these days is just to prevent outside factors from inadvertently tanking the network. Even at that job, though, it is still only, say, 80% effective. Loops can be caused simply by looping a local switch with itself and then connecting it into the upstream infrastructure. Upstream spanning tree won’t detect it, because from its perspective there is no loop in the upstream. Yet the broadcasts will still forward upstream, causing significant performance issues there. This is where storm control comes in. Arguably, storm control makes a case, in the right configuration, for disabling spanning tree even at the edge.

In the end, there are many cases where you can spend more time protecting a spanning tree implementation from being abused than the value you’ll ever receive out of it. These are the reasons why spanning tree gets turned off by default by some vendors. There are simply better options, and one needs to be aware of how they intend to use it, the value they’ll receive from it, and know how to configure it properly anyway. It still has value on user-exposed edge ports, even with PortFast turned on, but you can easily get away with it disabled elsewhere with proper designs and feature implementations.
Again, this doesn’t mean I am running to turn it off or do away with it, but I definitely use it more to protect myself from end-users rather than anything else.
Thanks for your excellent post, really great.
On your point 6: can you please clarify and confirm whether the data must always pass through the root bridge? Is the root bridge only used to manage STP, or does all the user/data traffic also pass through the root bridge? I hope it’s clear what I am asking. Thanks.
First of all, thank you for your post, it is really good!
I have a question regarding the priority values in your second point. Could you explain a little more about why we should use 4096 and 8192, and not start from 0 for the root bridge and 4096 for the backup root?
Spanning tree is considered a management protocol. The reason it is disabled by default is that it should really only be enabled on ports that are connected to other switches or that need management privileges. If it is on every port, then someone can relatively easily plug into your network and brute-force the VLAN domain or the configuration if you have not used strong passwords. The issue is that when you are talking about VoIP convergence, these Layer 2 management protocols have to be enabled. This is where the importance of Bridge Protocol Data Units (BPDUs) comes into play. This protocol allows us to limit which switches are actually members of the VTP domain.
Food for thought.
This is not true. In the case of Cisco switches, you should enable a feature called ‘bpduguard’ on ports connecting to hosts, along with ‘portfast’. So, portfast will allow the port to go into forwarding mode immediately. But bpduguard acts to protect the network from external loops, or adding ‘unauthorized’ switches. All ports send out bpdus, which hosts ignore. However, when a port with bpduguard configured receives a bpdu, it assumes that either a loop has been created, or that a switch has been connected to the port. When this happens, the port will immediately shut down, in error-disable mode, to protect the network.
Hello Kevin, like everybody else, thanks so much for your article. It was clear, and you nailed it on points 2 and 6. I make my core switch, which is L3, the root bridge. But regarding point 6 (both core switches sharing a virtual IP), what about a complex large ring topology, with more than 20 switches managed by these two core switches? The main concern is what happens in the event that one of those 20 switches loses power, due to outages, etc. Will the whole ring break, or should RSTP keep the ring alive? (All switches have the same speed, full duplex/1Gbps over fiber.) Thank you very much!
Hi, in case I add a new switch to an existing STP topology, what will the impact be, and how should I plan to minimize it?
My thought is to increase the priority value in the new switch so that it never becomes root bridge…is that right?
Kevin! Nice article. I have a question relating to your example where you have 2 core switches 4096 and 8192. What bridge priority should the other distribution switches connecting to the core be? The distribution switches are configured similar to the core, in that there are two of them and all hosts are bonded.
Also, should I increment the bridge priority as I add more distribution switches? I have done this and have reached the max bridge priority allowed. heh
I am thinking the distribution switch pairs should have the same bridge priorities, say 12288 and 16384?
However, a bigger issue: I screwed up and made core switch 1’s bridge priority 0. Will need to fix that.
Hi Kevin, I’m struggling to find info on a LAN with multiple switches but no redundancy, running PVST+ spanning tree. If my root bridge gets removed, will I see uplink interfaces stop transmitting frames while a new root bridge is elected?
I have a query about point number 5; let me explain my topology.
We have 2 core switches interconnected with a single Gigabit interface, and below the core, 2 L2 switches, both of which are connected to Core 1 and Core 2 with TenGig ports.
MSTP is configured with the default instance 0, and the following is the MST configuration on all 4 switches (2 core and 2 L2 switches):
```
#sh spanning-tree mst configuration
Revision 0   Instances configured 1
Instance  Vlans mapped
```
The following L2 VLANs are present on the respective switches:
Core switches: 32,33,35,36,38,39,40,41,136,138,139,142,1000,2000 (one of the core switches has priority 0 for MST instance 0, so that is the root bridge)
First L2 switch: 32,33,35,40,41,142
Second L2 switch: 32,33,36,38,39,41,138,139,142
So my query is:
1. Am I supposed to pass all the unique VLANs on all the interconnected trunk ports (between the cores and between core and L2), even if they are not used on some of the switches?
2. Since the two core switches are connected via a Gigabit port, traffic from Core 2 to Core 1 is passing via one of the L2 switches, which uses TenGig ports. How can I resolve this issue? Should I manually configure a lower cost for the port connecting the two cores? If so, how do I do it? I need help with the commands.
Thank you for your post. I have a quick question about migrating the root bridge from an old switch running STP to a new core switch running RSTP.
Can you tell me how long the convergence will take when I change the priority on the new switch? Will it be 2 seconds or up to 45 seconds?