“Bro, I ain’t got flow” isn’t only heard at your local hip hop mic night.
It’s a gripe from many network administrators who have inherited small environments, networks with lower-end gear, or who are in the trenches dealing with a time-sensitive issue and need to dig deep—now.
NetFlow is a protocol for exporting records of IP traffic flows that, over time, allows administrators to see how much traffic is being generated, by whom, and where that traffic is going. But in a typical deployment, where a collector ingests data from devices like Layer 3 switches and firewalls, every one of those devices needs to support NetFlow (or one of its relatives, like sFlow or IPFIX) to provide that data.
If you’re designing a network from scratch with enough money, then it’s relatively easy to ensure you can capture flow from key devices.
But maybe such a goal was only recently discussed. Or the powers that hold the purse strings are refusing to replace that 10-year-old switch with something a little more modern.
How can you still answer questions like:
- Where’s all the traffic on the network happening? Who’s generating it? Where’s it going?
- We keep getting intermittent connectivity and performance problems on the network. What could be causing them?
The good news is that even without NetFlow you can still get answers to these questions. And you can get them using protocols and tools you probably already have in your arsenal or can easily obtain.
Track the envelopes as they flow through the network
Start by enabling SNMP
The Simple Network Management Protocol (SNMP) is a standards-based protocol available across the vast majority of networking gear. Vendors typically expose both common and application-specific metrics through “encyclopedias” called management information bases (or MIBs). MIBs can be loaded into network management solutions that periodically poll for metrics of interest.
Standard SNMP monitoring can provide a ton of valuable information when:
- You customize a tool to grab specific metrics of interest and then manually define alerting conditions and workflows to get buzzed, or
- You use a product that does the heavy lifting of SNMP-based discovery and data collection automatically and comes pre-configured with dozens of alerts based on best practices.
SNMP’s core strength with network devices is that it effectively grabs three different data types: biodata (basic identifying details about the device), counters, and tables. Let’s spend some time on the latter two.
Remember playing Battleship as a kid? Every time you hit an opponent’s vessel, you’d mark it with a red peg instead of the standard white. That was a counter.
Similarly, in the very related field of network management, counters are used to track metrics like bytes in and out, packet error and discard counts, packet counts by type (e.g., broadcast, multicast, or unicast), and much more.
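Counters only become useful once you compare two polls. Here is a minimal sketch of turning two readings of an octet counter (such as the standard IF-MIB ifInOctets) into an average bit rate. The function names are hypothetical, and the sketch assumes the counter wrapped at most once between polls and that the device did not reboot in between.

```python
def counter_delta(previous: int, current: int, counter_bits: int = 32) -> int:
    """Return how much an ever-increasing counter grew between two polls.

    Assumption: if the current reading is smaller than the previous one,
    the counter wrapped around exactly once at 2**counter_bits.
    """
    if current >= previous:
        return current - previous
    return (2 ** counter_bits - previous) + current


def average_bits_per_second(previous_octets: int, current_octets: int,
                            interval_seconds: float,
                            counter_bits: int = 32) -> float:
    """Convert two octet-counter readings into an average bit rate."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    octets = counter_delta(previous_octets, current_octets, counter_bits)
    return octets * 8 / interval_seconds


# Two hypothetical polls taken five minutes (300 seconds) apart.
rate = average_bits_per_second(1_000_000, 4_750_000, 300)
```

On fast links a 32-bit octet counter can wrap more than once between polls, which is why the 64-bit "HC" counters are usually the safer choice where the device offers them. Pass counter_bits=64 in that case.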
SNMP also lets you periodically poll common tables such as the ARP, MAC address, and wireless client tables. It’s good to keep an eye on these for key devices. Know the upper limit of each, as well as what a normal, acceptable range looks like. Having these max out because of device growth or a network issue can cause disruption.
Digging deeper allows you to determine the root cause and fix it faster, for example by upgrading the switch or isolating the endpoint causing a broadcast storm.
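The advice about knowing each table’s upper limit and normal range can be sketched as a simple check. Everything here is illustrative: the function name, the 80% warning ratio, and the sample numbers are assumptions, and the real capacity has to come from your device’s documentation.

```python
def table_status(entries: int, capacity: int, normal_range: tuple,
                 warn_ratio: float = 0.8) -> str:
    """Classify a polled table size (ARP, MAC address, wireless clients).

    capacity     -- the device's documented maximum number of entries
    normal_range -- (low, high) entry counts you consider normal
    warn_ratio   -- fraction of capacity treated as "too close to full"
                    (0.8 is an arbitrary illustrative default)
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    low, high = normal_range
    if entries / capacity >= warn_ratio:
        return "near-capacity"
    if entries < low or entries > high:
        return "outside-normal"
    return "ok"


# Hypothetical switch with an 8,192-entry MAC address table that
# normally holds between 500 and 3,000 entries.
status = table_status(7000, 8192, (500, 3000))
```

Checking the capacity ratio first means a table that is about to overflow is always reported as such, even if your "normal" range was set too generously.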
Use data exposed by endpoints, too
Some endpoints support SNMP (or WMI, in the Windows world) and can therefore expose data from the perspective of an employee’s workstation. It’s valuable to be able to compare metrics like broadcast packet spikes from the perspective of both an endpoint and a switch (along with its interfaces, which might be re-broadcasting that traffic) to spot a reflection attack, for example.
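As a rough sketch of that comparison, the snippet below looks for polling intervals in which both the endpoint and the switch interface report a broadcast spike. It assumes you already have per-interval broadcast packet counts keyed by timestamp from each source. The function names and the "three times the median" spike rule are illustrative choices, not a standard.

```python
from statistics import median


def spike_intervals(samples: dict, factor: float = 3.0) -> set:
    """Return timestamps whose value exceeds factor x the series median."""
    baseline = median(samples.values())
    return {ts for ts, value in samples.items() if value > factor * baseline}


def correlated_spikes(endpoint_samples: dict, switch_samples: dict,
                      factor: float = 3.0) -> list:
    """Return timestamps at which both perspectives saw a spike."""
    both = (spike_intervals(endpoint_samples, factor)
            & spike_intervals(switch_samples, factor))
    return sorted(both)


# Hypothetical broadcast packets per 60-second interval (key = seconds).
endpoint = {0: 10, 60: 12, 120: 500, 180: 11, 240: 9}
switch_port = {0: 100, 60: 110, 120: 4000, 180: 900, 240: 105}
shared = correlated_spikes(endpoint, switch_port)
```

A spike that shows up only on the switch side points toward other ports, while one that appears from both perspectives in the same interval is a good place to start digging.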
Some network management products have a troubleshooting view that can show you correlated events within a specified historical timeframe.
Open the envelope and examine the payload
Mirror a port on your switch
Switches often include a feature called port mirroring. It copies all packets sent and received on one port to another port. For example, you could mirror packets on a server-connected port to another port with a sniffer appliance connected to it.
Mirroring is typically used when you want to observe a stream of packets from a device under investigation without directly interacting with it. Data is passively collected.
Some switches also support mirroring an entire VLAN to a port for packet capture and analysis (Cisco, whose name for port mirroring is SPAN, calls this VLAN-based SPAN).
Keep a lookout for patterns and oddities
Tony Fortunato of The Technology Firm has a great video on how he eyeballs 1,000 captured packets on a client’s network to infer network issues and potential configuration enhancements across endpoints, peripherals like printers, and network devices like switches.
Make use of Wireshark’s Protocol Hierarchy and Endpoints statistics windows to identify your top talkers and the protocols they are using. If a workstation is sending and receiving a lot of IRC, BitTorrent, newsgroups, or FTP traffic, you may want to knock on the door of your client’s CIO and show your findings.
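If you prefer to work on exported packet summaries rather than in the GUI, the same top-talker question can be answered with a few lines of code. This is a minimal sketch that assumes each packet has already been reduced to a (source, protocol, length in bytes) tuple. That record shape and the function name are hypothetical, not a Wireshark export format.

```python
from collections import Counter, defaultdict


def top_talkers(packets, limit: int = 3) -> list:
    """Rank sources by bytes sent and report each one's dominant protocol.

    packets -- iterable of (source, protocol, length_in_bytes) tuples
    Returns a list of (source, total_bytes, dominant_protocol) tuples.
    """
    bytes_by_host = Counter()
    protocols_by_host = defaultdict(Counter)
    for source, protocol, length in packets:
        bytes_by_host[source] += length
        protocols_by_host[source][protocol] += length
    ranked = []
    for host, total in bytes_by_host.most_common(limit):
        dominant = protocols_by_host[host].most_common(1)[0][0]
        ranked.append((host, total, dominant))
    return ranked


# Hypothetical packet summaries.
sample = [
    ("10.0.0.5", "BitTorrent", 1500),
    ("10.0.0.5", "HTTP", 400),
    ("10.0.0.9", "DNS", 80),
    ("10.0.0.5", "BitTorrent", 1500),
    ("10.0.0.7", "FTP", 900),
]
ranking = top_talkers(sample, limit=2)
```

The sketch counts only bytes sent by each source. To cover received traffic as well, you would add the same length to the destination’s totals.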
Laura Chappell shows how you can use Wireshark to identify packets that don’t conform to known protocol and port assignments. Seeing traffic classified as “data” underneath a transport protocol like TCP or UDP is a glaring hint, for example, and should warrant some additional investigation.
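The idea can be sketched on per-conversation summaries. The snippet below assumes each record carries the destination port and the highest-layer protocol label your analyzer assigned, with "data" meaning nothing recognized the payload. The record fields, the function name, and the tiny port map are all illustrative assumptions.

```python
# Illustrative subset of well-known port assignments; extend as needed.
EXPECTED_BY_PORT = {53: "dns", 80: "http", 443: "tls"}


def suspicious_conversations(records) -> list:
    """Flag conversations whose payload does not match its port.

    records -- iterable of dicts with "src", "dst_port" and "protocol"
               keys (a hypothetical summary format)
    Returns a list of (src, reason) tuples.
    """
    flagged = []
    for record in records:
        label = record["protocol"].lower()
        expected = EXPECTED_BY_PORT.get(record["dst_port"])
        if label == "data":
            flagged.append((record["src"], "payload not recognized"))
        elif expected is not None and label != expected:
            reason = f"expected {expected} on port {record['dst_port']}"
            flagged.append((record["src"], reason))
    return flagged


# Hypothetical conversation summaries.
conversations = [
    {"src": "10.0.0.5", "dst_port": 80, "protocol": "HTTP"},
    {"src": "10.0.0.8", "dst_port": 443, "protocol": "data"},
    {"src": "10.0.0.9", "dst_port": 53, "protocol": "SSH"},
    {"src": "10.0.0.7", "dst_port": 6667, "protocol": "IRC"},
]
flagged = suspicious_conversations(conversations)
```

A hit here is a prompt for a closer look, not proof of wrongdoing, since plenty of legitimate applications run on non-standard ports.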
There are even patterns to look for in identifying a botnet. For example, sending the same stream of packets to a sequential set of IPs is highly suspicious.
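The sequential-destination pattern is easy to check mechanically. Here is a minimal sketch that assumes the capture has been reduced to (source, destination) address pairs. The function names and the minimum run length of five are hypothetical, and the sketch looks only at destination addresses, not at whether the packets themselves are identical.

```python
import ipaddress
from collections import defaultdict


def longest_sequential_run(addresses) -> int:
    """Length of the longest run of consecutive IP addresses."""
    values = sorted({int(ipaddress.ip_address(a)) for a in addresses})
    if not values:
        return 0
    longest = current = 1
    for previous, following in zip(values, values[1:]):
        current = current + 1 if following == previous + 1 else 1
        longest = max(longest, current)
    return longest


def sequential_scanners(packets, min_run: int = 5) -> dict:
    """Map each source to its longest run, if it reaches min_run."""
    destinations = defaultdict(set)
    for source, destination in packets:
        destinations[source].add(destination)
    suspects = {}
    for source, seen in destinations.items():
        run = longest_sequential_run(seen)
        if run >= min_run:
            suspects[source] = run
    return suspects


# Hypothetical capture: one host walks 192.168.1.10 through 192.168.1.17.
pairs = [("10.0.0.66", f"192.168.1.{i}") for i in range(10, 18)]
pairs += [("10.0.0.5", "8.8.8.8"), ("10.0.0.5", "1.1.1.1")]
suspects = sequential_scanners(pairs)
```

Legitimate tools such as network discovery scanners produce the same pattern, so compare any hits against the systems you expect to be sweeping the network.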
You can apply similar techniques to identify sources of broadcast storms—like a rogue DHCP server wreaking havoc on the network and triggering a denial of service.
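For the rogue DHCP case, a quick check is to compare the servers answering in your capture against the ones you actually run. This sketch assumes you have already pulled the server address out of each DHCP offer (the DHCP server identifier, option 54). The function name and the addresses are hypothetical.

```python
from collections import Counter


def rogue_dhcp_servers(offer_server_ids, authorized_servers) -> dict:
    """Return {server: offer_count} for servers not on the approved list.

    offer_server_ids   -- server address taken from each captured offer
    authorized_servers -- addresses of the DHCP servers you operate
    """
    authorized = set(authorized_servers)
    counts = Counter(offer_server_ids)
    return {server: seen for server, seen in counts.items()
            if server not in authorized}


# Hypothetical capture: one approved server and one unexpected responder.
offers = ["10.0.0.1", "10.0.0.1", "192.168.1.1", "10.0.0.1", "192.168.1.1"]
rogues = rogue_dhcp_servers(offers, ["10.0.0.1"])
```

An unexpected responder is often something mundane, like a consumer router plugged into a wall jack, but it is worth tracing to a switch port either way.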
Wrapping things up and further reading
NetFlow is great for gaining deeper visibility at Layer 3 and beyond—but not everyone has the luxury of putting it to use. Thankfully, there are tools within reach that work for networks both large and small, and on gear that costs $500 or $5,000.
Those tools, like SNMP and Wireshark, can help you identify specific areas of concern that may later turn into a larger crisis. They can also help you get to the root cause a lot quicker when you get an initial heads-up from your network management system or a client call.