Networks are constantly evolving: you implement something new, you take out something old. I’ve noticed that most network engineers are pretty good at planning and implementing new things, but we tend to fall down a bit when it comes to removing the old.
There are a lot of reasons for this. Often you need to keep the old gear or the old circuit around for a week or two, just in case the new infrastructure doesn’t work as planned. But then you get busy with other projects and there’s very little incentive to remove the old.
Sometimes you can’t remove the old because you’re not actually certain there isn’t some strange, undocumented but absolutely critical legacy application that needs to use it at infrequent intervals like month-end or year-end.
The problem is that legacy pieces of equipment or, worse, legacy circuits hang around unused for years. They take up expensive rack space. And, particularly in the case of circuits, they cost money every month. This kind of waste is not at all uncommon. I’ve personally seen a case where a bank was still paying thousands of dollars each month for a circuit into a location they’d closed a decade earlier.
It’s even worse for disaster recovery circuits. A disaster recovery circuit is supposed to sit idle most of the time, so you can’t rely on the old trick of putting a sniffer on the line to see whether it’s still in use. And a proper, up-to-date inventory is rare (unless you’re using a tool to automate it).
So let’s talk about the general best practices for removing pieces of network infrastructure.
First, get the new infrastructure in place
Assuming that you’re replacing the old infrastructure with something new, get the new element in place before you touch the old one.
Get it racked and powered up. Get it configured, and then test it as thoroughly as you can. Sometimes there just isn’t a practical way of testing, but you should be able to validate core functionality and make sure the new gear has all of the appropriate licenses for all the features you plan to use.
During this phase, it’s a good idea to make sure you fully understand the existing traffic flows on the existing infrastructure. Probably the easiest way to do this on IP networks is to deploy a protocol analyzer. Personally, I’m a big fan of Wireshark. It’s free, flexible, and easy to use.
Run some very long captures and dig into the traffic flows. Doing this is particularly important when you can’t test, or when the new gear is from a different manufacturer or has significantly different functionality than the old. Then your “testing” phase can include just looking at the most important flows one after another, and making sure the configuration of the new equipment will forward those packets appropriately.
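As a rough illustration of digging into the flows, here is a minimal Python sketch that totals bytes per flow and lists the biggest talkers. It assumes you’ve already exported your capture to CSV with columns named src, dst, dst_port, and bytes. Those column names and the sample addresses are hypothetical, so adjust them to whatever your export actually contains.

```python
import csv
import io
from collections import Counter


def top_flows(csv_text, n=5):
    """Sum bytes per (src, dst, dst_port) flow and return the n largest."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["src"], row["dst"], int(row["dst_port"]))
        totals[key] += int(row["bytes"])
    return totals.most_common(n)


# Hypothetical export: the column names and addresses are made up.
SAMPLE = """src,dst,dst_port,bytes
10.1.1.5,10.2.0.10,443,1200
10.1.1.5,10.2.0.10,443,800
10.1.1.9,10.2.0.20,1521,5000
10.1.1.7,10.2.0.10,22,300
"""

for (src, dst, port), total in top_flows(SAMPLE):
    print(f"{src} -> {dst}:{port}  {total} bytes")
```

The resulting list is a reasonable starting checklist: walk down it and confirm that the new configuration forwards each flow.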
Then, migrate traffic to the new infrastructure
Once you have the new infrastructure in place, you can start migrating traffic from the old to the new. Sometimes the only way to do this is a flash cut, where everything moves at once. This approach carries a higher risk, of course, but at least it’s done quickly.
If I can figure out a way to do it, though, I tend to prefer to migrate traffic in phases, usually starting with something of fairly low business criticality. That way I can make sure the new infrastructure really does work the way I expect, and that it can handle the load.
This migration stage can drag out if the political will to implement the project evaporates, as sometimes happens. To prevent this, schedule all of your migration phases up front and press on with implementing them.
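One way to pin the phases down up front is to order the services from least to most business-critical and give each one a fixed date. The sketch below does that in Python. The service names, the 1-to-5 criticality scale, and the one-week gap between phases are all assumptions for illustration.

```python
from datetime import date, timedelta


def schedule_phases(services, start, gap_days=7):
    """Order services from least to most critical and give each a fixed date."""
    ordered = sorted(services, key=lambda s: s[1])
    return [
        (name, start + timedelta(days=i * gap_days))
        for i, (name, _criticality) in enumerate(ordered)
    ]


# Hypothetical services, with criticality from 1 (low) to 5 (high).
services = [("payments", 5), ("guest-wifi", 1), ("file-shares", 3)]

for name, when in schedule_phases(services, date(2024, 3, 4)):
    print(when.isoformat(), name)
```

Putting the dates in writing on day one makes it harder for the later phases to quietly slip.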
Next, remove the old infrastructure
Get that crusty old gear out of there as fast as you can! Make sure the project doesn’t suddenly end without actually finishing. As I pointed out at the beginning, removal is particularly important for circuits because they cost more money the longer they sit there. Get rid of them.
If you’ve done a phased migration, there’s rarely a compelling reason to keep old gear around as a fallback. You know the new infrastructure is working.
When you do a flash cut, it makes sense to keep the old infrastructure around for a week or perhaps two, just in case you encounter problems and need to roll back. But needing to roll back is actually very rare if you’ve done your planning and testing well.
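The rule of thumb above can be written down as a small Python sketch: a phased migration needs no fallback window, while a flash cut gets up to two weeks. The second function shows the arithmetic of letting a circuit linger. The 14-day default and the $2,000 monthly figure are illustrative assumptions, not numbers from a real contract.

```python
from datetime import date, timedelta


def removal_deadline(cutover, flash_cut, fallback_days=14):
    """Return the date by which the old gear or circuit should be gone."""
    if flash_cut:
        # Keep the old path briefly in case a rollback is needed.
        return cutover + timedelta(days=fallback_days)
    # A phased migration has already proven the new infrastructure.
    return cutover


def idle_cost(monthly_cost, months_idle):
    """Total spent on a circuit that nobody is using."""
    return monthly_cost * months_idle


print(removal_deadline(date(2024, 3, 18), flash_cut=True))
print(idle_cost(2000, 120))  # hypothetical $2,000/month for ten years
```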
Finally, update the documentation
I said up front that the thing people often don’t do well is removing the old infrastructure, but the most common failing of network engineers is actually neglecting to update the documentation.
No network is entirely self-documenting. Even if you have awesome diagramming tools, they never describe the business functions of network infrastructure. In my documentation, I like to keep a log of decision points: Why did I choose this particular type of gear? What’s the intended function of each firewall rule?
How many times have you looked at some network element or design and thought, “What were they thinking?” Write it down.
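A decision log doesn’t need special tooling. As a minimal sketch, assuming you’re happy to keep it as structured records, something like the following Python would do. The field names, the device name, and the sample entry are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Decision:
    when: date
    element: str  # device, circuit, or rule the decision applies to
    choice: str   # what was done
    reason: str   # the business function it serves


# Hypothetical entry; the device and rule number are made up.
log = [
    Decision(
        when=date(2024, 3, 4),
        element="fw-edge-01 rule 120",
        choice="permit tcp/1521 from app subnet to db subnet",
        reason="month-end reporting job needs database access",
    ),
]


def why(entries, element):
    """Return the recorded reasons for anything matching the element name."""
    return [d.reason for d in entries if element in d.element]


print(why(log, "rule 120"))
```

The point is the reason field: it answers “What were they thinking?” for whoever looks at that rule next.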
And now you’re done.