Understanding Spanning Tree Protocol -- the Fundamental Bridging Algorithm
03/30/2001
Welcome to the fifth installment in the series, Networking as a 2nd Language. We're going to wrap up our layer 2 hardware discussion with an important topic: Spanning Tree Protocol bridging. Following in the tone of the series, we're going to investigate this powerful bridging algorithm in our fictitious Sprockets corporate network. Let's look in on our favorite VP of IT and see what network problem she's solving today.
Fault tolerance through redundancy
Momma Sprocket identified a critical network connection between the manufacturing floor robot segment and the data center. To keep these two network segments up 24 hours a day, 7 days a week, all year round, she placed two LAN switches in her topology to connect them. The idea behind the two-switch design is to eliminate a single point of failure on the network. If one switch fails, the other switch can maintain network connectivity. Remember, Momma Sprocket's goal is to keep these network segments up year-round.
Network broadcast loops
The day after Momma Sprocket's team implemented the dual switch architecture, an interesting event occurred in the Sprockets network. At 0600, a manufacturing floor robot controller sent a network broadcast to request a boot loader.
The broadcast was first received by switch sw1 on port 2/1. Following the bridging protocol, the sw1 switching hardware makes a copy of the broadcast frame from port 2/1 and forwards the copied frame out port 2/2 of switch sw1.
The topology is redundantly connected; therefore, switch sw2 receives the broadcast frame as well on port 2/1. Switch sw2 will make a copy of the frame and forward it out its port 2/2. About this time, switch sw2 is also receiving a copy of the broadcast frame forwarded to the LAN segment from port 2/2 of switch sw1.
Switch sw2 then copies the frame and forwards it out on port 2/1. The same event occurs on switch sw1, port 2/2, when it receives the frame forwarded from port 2/2 of switch sw2.
Didn't we start out with one frame from a controller device on the network? Now, in a small fraction of the time, we have four copies. Because a layer 2 frame has no time-to-live field, the copies circle the loop indefinitely, and the problem compounds with every new broadcast until the network bandwidth is saturated. Devices on both networks are experiencing large numbers of collisions. The network has become unusable. Wasn't network redundancy supposed to be a good idea? What went wrong?
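The loop described above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: the segment names "A" and "B" are hypothetical labels for the robot segment and the data center segment, each pass delivers every frame on a segment to every switch except the one that sent it, and nothing else on the network is modeled.

```python
# Each switch joins segment "A" (port 2/1) and segment "B" (port 2/2).
SWITCHES = ["sw1", "sw2"]
OTHER_SEGMENT = {"A": "B", "B": "A"}


def simulate_broadcast(rounds):
    """Return the cumulative count of forwarded copies after each pass."""
    # A frame in flight is (segment it is on, switch that sent it).
    # The original broadcast comes from a host, so its sender is None.
    in_flight = [("A", None)]
    total = 0
    history = []
    for _ in range(rounds):
        next_flight = []
        for segment, sender in in_flight:
            for switch in SWITCHES:
                if switch == sender:
                    continue  # a switch does not receive its own transmission
                # No spanning tree: the switch floods the broadcast
                # out its port on the other segment.
                next_flight.append((OTHER_SEGMENT[segment], switch))
        total += len(next_flight)
        history.append(total)
        in_flight = next_flight
    return history


copies = simulate_broadcast(5)
```

After two passes the sketch has produced the four copies counted in the story, and the total keeps climbing on every further pass because nothing ever removes a frame from the loop.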
Spanning Tree Protocol
Many corporate networks have bridging loops similar to the one in the Sprockets corporate network. Redundancy eliminates a single point of hardware failure in a network, but whenever switch redundancy is present, there is a loop. The trick is to allow multiple bridges while permitting only a single active path between any two segments. The Spanning Tree Protocol (STP) algorithm provides the missing component the Sprockets network needs to implement redundant hardware without redundant traffic paths.
Remember from our earlier discussion on bridges that a switch and a bridge are both layer 2 devices? This means the switch relies on MAC addresses to identify network devices. A switch, essentially a complex bridge, uses bridging tables, which are collections of MAC addresses associated with bridge interfaces or, in the case of a switch, port numbers.
Creating the tree
In an STP bridged environment, the switches exchange information amongst themselves using layer 2 frames called bridge protocol data units (BPDUs). Each switch listens on its ports for BPDUs. When a bridge or a switch is turned on, it sends out a BPDU on all ports. A BPDU is not forwarded off the segment by the receiving switch.
When a bridge is turned on, it automatically assumes that it is the root bridge in the STP tree. The bridges then compare the BPDUs they exchange and elect a single root bridge. Once the root bridge is elected, each of the other bridges calculates its redundant paths back to the root. Each port on every bridge in the STP tree is assigned a weighted metric called a path cost. The path cost is used to determine which port provides the best path for data flow.
In the event of hardware failure of a root bridge in a redundant environment, a new root would be elected and port paths would be recalculated. Once a tree is built and the BPDU exchange has settled down, the network is said to be in a state of convergence.
Bridges exchange information with other bridges on the network using configuration messages. By default, configuration messages are sent onto the network every 2 seconds. The configuration message contains information regarding which device is the root bridge; the ID of the sending bridge, called a bridge ID; and the distance between the root and sender, called the path cost. The port ID is also included in the message, identifying the port on the bridge from which the configuration message originated.
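The fields listed above can be pictured as a small record. The sketch below is not the on-the-wire BPDU layout; the class name, field names, and the sample bridge ID are hypothetical, chosen only to mirror the four pieces of information the text describes and the rule that a newly started bridge assumes it is the root.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigMessage:
    root_id: int         # bridge ID the sender believes is the root
    root_path_cost: int  # distance between the root and the sender
    bridge_id: int       # bridge ID of the sender
    port_id: int         # port the message was sent from


def initial_messages(bridge_id, port_ids):
    """Build the messages a freshly started bridge sends on each port.

    The bridge has heard from nobody yet, so it names itself as the
    root and advertises a path cost of 0.
    """
    return [ConfigMessage(bridge_id, 0, bridge_id, port) for port in port_ids]


# Hypothetical bridge ID and port numbers, for illustration only.
messages = initial_messages(0x8000_0010_7BAA_0001, [1, 2])
```

As better information arrives from neighbors, a bridge would replace the root_id and root_path_cost it advertises; that update logic is left out of this sketch.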
Bridge ID and path cost
The bridge ID (BID) in a configuration message is an 8-byte field. The six low-order bytes are the MAC address of the switch. The two high-order bytes (an unsigned 16-bit integer) are the bridge priority number. The root bridge is the bridge with the lowest BID.
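A minimal sketch of that layout and of the lowest-ID-wins election follows. The function names, the MAC addresses, and the priority values are all assumptions made up for illustration; 32768 is used because it is a commonly seen default priority.

```python
def bridge_id(priority, mac):
    """Pack a 2-byte priority and a 6-byte MAC into an 8-byte bridge ID."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    # Priority occupies the high-order bytes, so it dominates the comparison.
    return priority.to_bytes(2, "big") + mac_bytes


def elect_root(bridges):
    """Return the name of the bridge with the lowest bridge ID."""
    return min(bridges, key=lambda name: bridge_id(*bridges[name]))


# Hypothetical priorities and MAC addresses for the two Sprockets switches.
bridges = {
    "sw1": (32768, "00:10:7b:aa:00:02"),
    "sw2": (32768, "00:10:7b:aa:00:01"),
}
root = elect_root(bridges)
```

With equal priorities the lower MAC address decides the election, so sw2 wins here. Because the priority sits in the high-order bytes, giving sw1 a lower priority value would make it the root regardless of its MAC address.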
All switches use an algorithm to determine how close they are to the root bridge. This metric is called the path cost. The lower the cost, the closer the switch is to the root. The idea is to traverse the tree using the lowest costs. What happens if two devices offer identical path costs at a node of the tree? The device with the lowest bridge ID wins the tiebreaker; when the priorities are equal, that is the device with the lowest MAC address.
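The selection rule can be sketched as a single comparison. This example assumes each candidate path is described by a hypothetical dictionary holding the local port, the path cost, and the neighbor's bridge ID; comparing full bridge IDs reduces to comparing MAC addresses whenever the priorities are equal. The costs and IDs are invented for the example.

```python
def best_path(candidates):
    """Pick the lowest-cost path; break ties on the neighbor's bridge ID."""
    return min(candidates, key=lambda c: (c["cost"], c["neighbor_id"]))


# Two hypothetical paths with identical cost through different neighbors.
candidates = [
    {"port": "2/1", "cost": 19, "neighbor_id": 0x8000_0010_7BAA_0002},
    {"port": "2/2", "cost": 19, "neighbor_id": 0x8000_0010_7BAA_0001},
]
chosen = best_path(candidates)
```

Both paths cost 19, so the path through the neighbor with the lower ID, reached through port 2/2, is chosen. Had the costs differed, the neighbor IDs would never have been consulted.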
The flat earth loop-free network
Many corporate topologies rely heavily on layer 2 switches and STP to mesh networks together. These meshed topologies joined together using layer 2 switches are typically called flat earth network topologies. In this design, one switch is the root bridge and all other bridges point back to the root. Each of the other bridges in the tree then selects a root port. This is the port on a non-root bridge that is believed to be on the closest segment back to the root bridge. The root path cost is the cumulative metric measuring how far away the root bridge is located in the network. This information is exchanged with configuration messages in the BPDU. Ports that would complete a loop are placed in a blocking state, so once the spanning tree is established, the network is a loop-free environment.
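To make the cumulative root path cost concrete, here is a sketch that computes it for a small made-up topology. Real bridges arrive at these values in a distributed way by exchanging configuration messages; the centralized shortest-path search below is a stand-in that yields the same costs. The switch names, port names, and link costs are hypothetical, and tie-breaking between equal-cost paths is ignored.

```python
import heapq


def root_path_costs(links, root):
    """Return (costs, root_ports) for every bridge reachable from the root."""
    neighbors = {}
    for (a, a_port), (b, b_port), cost in links:
        # Store (peer, peer's port on this link, link cost) in both directions.
        neighbors.setdefault(a, []).append((b, b_port, cost))
        neighbors.setdefault(b, []).append((a, a_port, cost))

    costs = {root: 0}
    root_ports = {}
    queue = [(0, root)]
    while queue:
        cost, bridge = heapq.heappop(queue)
        if cost > costs[bridge]:
            continue  # stale entry; a cheaper path was already found
        for peer, peer_port, link_cost in neighbors.get(bridge, []):
            new_cost = cost + link_cost
            if new_cost < costs.get(peer, float("inf")):
                costs[peer] = new_cost
                root_ports[peer] = peer_port  # peer's port facing the root
                heapq.heappush(queue, (new_cost, peer))
    return costs, root_ports


# Hypothetical three-switch mesh: ((bridge, port), (bridge, port), link cost).
links = [
    (("swA", "1/1"), ("swB", "1/1"), 19),
    (("swA", "1/2"), ("swC", "1/1"), 100),
    (("swB", "1/2"), ("swC", "1/2"), 19),
]
costs, root_ports = root_path_costs(links, "swA")
```

In this example swC has a direct link to the root, but it costs 100, while the two-hop path through swB costs 19 + 19 = 38. swC therefore selects port 1/2 as its root port, which shows that the root port follows the lowest cumulative cost rather than the fewest hops.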
Sprocket's redundant network
Momma Sprocket had the right idea. She just needed to enable the Spanning Tree Protocol in her network. This implementation immediately solved the broadcast flooding that plagued the redundant network. This should hold for the time being. But Momma Sprocket needs to set up a domain for Sprockets on the Internet. This will require a layer 3 device called a router. Fortunately, Nanna Sprocket was bored at the senior home and she decided to go for her Internet engineer certification. In our next installment we'll see how Nanna tackles this problem of integrating switches and routers.
Michael J. Norton is a software engineer at Cisco Systems.