The time has come, friends…
We asked you for your most frightening tales of networks gone amok!
Of cabling that keeps you awake at night!
Of mysterious errors and hair-pulling glitches!
And you delivered.
Just in time for Halloween, welcome to our second compendium of network nightmares submitted by you from around the web.
Shall we begin?
Here are some tales from our very own team at Auvik to start things off:
Steve Petryschuk, Product Strategy Director
For me, it dates back to when I was a co-op student, and I brought down an entire server farm that the development team was using to run simulations and tests against integrated circuit designs. A mistake I made in the configuration was the culprit.
Spent what felt like all night getting it back online, but it was ready to go the next morning!
Lawrence Popa, Sales Engineer
Fellow Auvik colleague (Solutions Consultant Diana Kenaya) and I had a machine literally burst into flames in front of us when we were working at a local college. Power supply caught fire. Flames were shooting out the back. She bravely unplugged it while I ran away.
IT took it away and sent it back to Dell. They nicknamed the machine “Sparky.”
Stewart Mahoney, Solutions Consultant
Sharing one of the scariest setups I’ve ever worked on. Both pictures are from a lavish private school. Picture 1 is of their main routers. Notice the thin fiber cable that was, prior to the photo, holding up the entire power bank.
Picture 2 is of a switch I spent hours trying to find. I had to use a probe, which found it behind a false wall next to their disti board.
From MSPGeek & our partners
Our community of friends at MSPGeek, as well as our partners, were also happy to share some bone-chilling stories and anecdotes.
[Had] a client that used a Class A public CIDR for their internal network. This single subnet spanned multiple locations across fiber, Wi-Fi, LTE, and microwave. Thousands of devices, all in the same broadcast zone. ‘Controls’ equipment that uses multicast was constantly spamming everything. Too many issues to even list. And they are of course a 24/7 operation with mission-critical systems that can be life threatening… as in possible giant explosions/poison gas release if things go bad!
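For a sense of scale, a single flat “/8” subnet puts every device into one broadcast domain of over sixteen million addresses. A quick sketch with Python’s standard `ipaddress` module shows the arithmetic (the `10.0.0.0/8` block here is a hypothetical stand-in — the network in the story actually used public address space):

```python
import ipaddress

# Hypothetical stand-in for the network described: one flat /8
# (Class A) subnet acting as a single broadcast domain.
net = ipaddress.ip_network("10.0.0.0/8")

total = net.num_addresses   # 2**24 addresses in a /8
usable = total - 2          # minus the network and broadcast addresses
print(total, usable)        # 16777216 16777214
```

Every ARP request, every multicast frame from that “controls” equipment, potentially hitting thousands of hosts at once — across fiber, Wi-Fi, LTE, and microwave links.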
From Jason Slagel, CNWR*
We have a local public safety network here that allows access to wants/warrants data for all of northwest Ohio. Not only do they use the IP address space of a local university… but there’s another particular story. They built a giant new 911 center for our entire area. Everything was good, until they opened, and no one could reach the wants/warrants data. Apparently no one had tested/requested networking to the public agency to get that data.
A couple of days went by, and it made the local paper. They called me because I had done some consulting on their Cisco systems. Looked at what we had available. I had access to an IPX network I could see at the 911 location, as well as the public safety building… It was a county Novell network!
I ended up tunneling IP over IPX for a month or so, until they could arrange connectivity on the county wide dark fiber network. “A local network consultant” saved the day in the paper, and I’m pretty sure it took weeks for the stink of doing that to wash off.
*edited for style, not for content
From Iain McMullen, Birmingham Consulting
One of my most frustrating issues was at a site with multiple buildings. The switches were managed, but did not have PoE, so the access points all had their own PoE injectors.
When we updated the firmware on the switches, they rebooted. But the APs stayed online, and then decided to auto-mesh together! Once the switches came back online, there was an instant loop, and RSTP disabled the main fiber uplink, leaving all data going over the mesh wireless… total nightmare.
Auvik’s notification timeline showed exactly what happened and helped point to the source of the problem (editor’s note: Go Auvik!). This issue also wouldn’t have happened if the APs were being powered by the switch, as they would have rebooted at the same time. We’ve since updated our policy to disable the “mesh” feature on all of our APs unless specifically required.
From Reddit & Twitter
Our readers pointed us at this Reddit roundup of IT mistakes, and there were some doozies. Here are some of the highest-voted disasters they shared, as well as a few comments sent directly to our Twitter account.
[My] 8-foot ladder caught something behind me and caused me to misstep, taking out the fiber connection. Later, one of my colleagues added a custom Visio icon to our network map that was my face over a ladder, sitting right in between the ISP and our router.
I once had a rookie kicking off a firmware upgrade without backing up first. The longest afternoon ever, but thankfully it turned out ok (for a change!).
As a junior admin, I wrote a password sync script that, due to a mistake in my code, wiped the /etc/passwd file of EVERY production server for a large chemical manufacturing company. Trains stopped rolling, factories shut down, millions of dollars in cost, primarily because they could not print labels and MSDS sheets for transporting goods.
I did not get fired. Why? The reviewer documented they had tested my code and certified there were no issues, and it was ready to push to production. He never actually tested it, which came out during the postmortem when a different reviewer ran the submitted code in that environment. He was escorted out that afternoon, after they found that all of his code reviews took less than 15 seconds regardless of complexity.
So, the lessons:
- Always do your actual job, even if it is menial testing!
- Change control may be a [pain in the butt], but it may one day save your a**.
Was told, “Make a copy of the DB from prod to dev.” I was just a few months into the position, and explained I had very little SQL experience and I shouldn’t be the one doing any work on the production system. Got told, “You’re the sysadmin, you should be able to figure this out.”
Well, I exported the database correctly, but accidentally disconnected the production database instead of the dev database. Took down the entire org, our website, call center, etc.
Didn’t get asked to make SQL changes anymore.
We used to have an office in our MDF. Over time the floor got dirty. He [colleague] decided to vacuum. He also decided to plug the vacuum into one of the PDUs off our UPS. 20-amp PDUs that were, due to my predecessor’s complete lack of care for anything, loaded to 16–17 amps each. I know some of you see where this is going.
Proceeds to turn on the vacuum, overloading the PDU, which immediately throws the full load to the other one, which causes that one to go as well.
You know the scariest sound you can hear in your server room? Silence. Complete silence. Brought the entire school district down in the middle of the day for around 30 minutes while everything started back up.
Technically, this isn’t IT networking, but it’s too good not to include.
While working on the backend of the site, this little plugin I had forgotten all about changed a CloudFront-hosted script, and saved it as buttfront upon running in the browser. And for two glorious hours an IT software company’s website was a perverted paragon of prodigious proportions.
Have your own terrifying tale to share? We’re putting together our next edition now, so add your network nightmares to the comments below, or email us at social[at]auvik2020.wpengine.com!