Packet loss occurs when data sent from one networked device to another fails to arrive, and it can have a variety of causes. The first step in troubleshooting is to isolate where the loss is occurring. The ping and traceroute (tracert on Windows) tools included in most operating systems are very useful for this. This article works through an example of isolating packet loss encountered by a Windows PC attempting to reach 188.8.131.52.
Determining where packet loss is occurring over routed links
To confirm if packet loss is occurring:
Open a command prompt on a client PC by searching for "cmd" in the Start Menu.
Type "ping -n 20 188.8.131.52".
This will ping the address 188.8.131.52 20 times. Substitute 188.8.131.52 with whatever address needs to be tested.
Once the command has run, a summary will be presented indicating if loss occurred.
If no loss occurred, try increasing the "-n" value to something higher (such as 100) to test for a longer period of time.
Note: This only detects packet loss that affects ICMP or all traffic. Protocol-specific loss may not be reflected.
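When the ping test is repeated across many clients, the summary can be read programmatically instead of by eye. The following is a minimal sketch, assuming the English-language Windows ping summary line ("Packets: Sent = ..., Received = ..., Lost = ... (...% loss)"). The function name parse_ping_summary and the sample text are hypothetical illustrations, and 192.0.2.1 is a placeholder documentation address, not a real target.

```python
import re

# Matches the "Packets:" summary line printed by English-language Windows ping.
_SUMMARY = re.compile(
    r"Sent = (\d+), Received = (\d+), Lost = (\d+) \((\d+)% loss\)"
)


def parse_ping_summary(output):
    """Return (sent, received, lost, loss_percent), or None if no summary is found."""
    match = _SUMMARY.search(output)
    if match is None:
        return None
    sent, received, lost, percent = (int(group) for group in match.groups())
    return sent, received, lost, percent


# Illustrative sample text, not captured from a real test.
sample = """
Ping statistics for 192.0.2.1:
    Packets: Sent = 20, Received = 18, Lost = 2 (10% loss),
"""

print(parse_ping_summary(sample))  # prints (20, 18, 2, 10)
```

In practice the text could come from something like subprocess.run(["ping", "-n", "20", address], capture_output=True, text=True).stdout, though that part is left out here so the sketch runs without network access.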
If packet loss was seen, the next step is to identify where the packet loss begins to occur. 'tracert' can be used to check each layer 3 device along the path to the destination:
Open a command prompt on a client PC by searching for "cmd" in the Start Menu.
Type "tracert -d 188.8.131.52".
This will perform a trace route to 188.8.131.52 and present each hop as an IP address. Substitute 188.8.131.52 with whatever address needs to be tested.
Wait for the trace to complete, or press CTRL+C if multiple lines ending with "Request timed out" are encountered.
A lack of response is represented by an asterisk (*), which may indicate packet loss or that the device is configured not to respond. The test may need to be run multiple times to identify where loss is occurring. If packet loss is frequently encountered beginning at a particular hop, the issue is most likely with that device or between it and the previous hop. This screenshot illustrates a tracert clear of packet loss. The only device that does not respond (hop 11) is likely configured not to, as there is no packet loss after it.
In this next screenshot, packet loss is regularly encountered beginning with hop 2. This indicates there may be an issue with the ISP gateway, or the link(s) between the client gateway and ISP gateway. It is recommended to test from multiple clients at different locations in the network to help rule out client-specific issues and to find what the clients experiencing the problem have in common.
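The reasoning applied to the two traces can be sketched in code: run the trace several times, compute the loss percentage at each hop, and look for the hop where loss begins and then continues through to the destination. This is a simplified illustration, not how any particular tool works. It assumes each trace run has been reduced to one True/False value per hop (True for a reply, False for a timeout), and the names loss_by_hop and first_persistent_loss are hypothetical.

```python
def loss_by_hop(traces):
    """Return the loss percentage for each hop.

    traces is a list of runs; each run is a list with one entry per hop,
    True if the hop replied and False if it timed out (*).
    """
    hop_count = len(traces[0])
    result = []
    for hop in range(hop_count):
        missed = sum(1 for run in traces if not run[hop])
        result.append(100.0 * missed / len(traces))
    return result


def first_persistent_loss(loss, threshold=0.0):
    """Return the 1-based hop where loss starts and continues to the last hop.

    Returns None when the final hop is clean: a silent hop in the middle of
    an otherwise clean path is probably configured not to respond.
    """
    if loss[-1] <= threshold:
        return None
    start = len(loss)
    # Walk backwards from the destination while hops keep showing loss.
    for index in range(len(loss) - 1, -1, -1):
        if loss[index] > threshold:
            start = index + 1
        else:
            break
    return start


# Made-up results for four runs over a four-hop path.
runs = [
    [True, True, True, True],
    [True, False, True, False],
    [True, True, False, True],
    [True, False, False, False],
]
loss = loss_by_hop(runs)
print(loss)                         # prints [0.0, 50.0, 50.0, 50.0]
print(first_persistent_loss(loss))  # prints 2
```

With only a handful of runs the percentages are coarse, so a larger number of traces gives a more trustworthy answer.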
As a more robust test, the tool MTR can be used to perform a continuous series of traces and present the percentage of loss at each hop in the path, to more clearly identify where the loss is occurring. Output for the above scenario would appear similar to:
Determining where packet loss is occurring in a wireless/switched network
Tracert only provides information for layer 3 devices in the path, such as routers. However, when packet loss is occurring at the first hop, and traffic must pass through a wireless access point and switch to get there, additional testing is required to isolate the problem. In this case, testing will need to be done multiple times, getting progressively closer to the layer 3 device each time. The following steps are illustrated in the image below:
Ping the access point to test wireless quality. If using a Cisco Meraki AP, ping my.meraki.com.
If loss begins occurring here, refer to the knowledge base article on troubleshooting wireless performance.
Ping a client connected to the same VLAN (if configured) on the switch that the wireless client is connected to. If multiple switches exist along the path, repeat this step as needed.
If loss begins occurring here, the issue is most likely:
- Duplex/speed settings mismatch on the link between the AP and the switch, or switch and wired client
- Bad cable between the AP and switch, or switch and wired client
Connect a client directly to the router/firewall, on the same VLAN as the wireless client, and ping it from the wireless client.
If loss begins occurring here, the issue is most likely:
- Duplex/speed settings mismatch on the link between the switch and the router/firewall, or router/firewall and wired client
- Bad cable between the switch and router/firewall, or router/firewall and wired client
If testing with Cisco Meraki devices, it is also possible to ping the first MS switch in the path at switch.meraki.com and the MX security appliance at wired.meraki.com.
Note: MX devices on firmware version 12.0+ can track loss and latency to an IP, which can help with isolating issues.
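The progressive testing described in this section amounts to walking outward from the wireless client and stopping at the first target that shows loss. A minimal sketch of that decision logic follows. The loss figures are assumed to have been measured separately with ping, and the STEPS table and locate_loss function are hypothetical names introduced only for illustration.

```python
# Test targets ordered from closest to the wireless client to farthest,
# each paired with the likely problem area if loss first appears there.
STEPS = [
    ("access point",
     "wireless quality (see the wireless performance article)"),
    ("wired client on the same switch and VLAN",
     "duplex/speed mismatch or bad cable between the AP and switch, "
     "or switch and wired client"),
    ("client connected directly to the router/firewall",
     "duplex/speed mismatch or bad cable on the link to the "
     "router/firewall, or between the router/firewall and wired client"),
]


def locate_loss(loss_percent_by_step):
    """Return (target, likely_cause) for the first step showing loss, else None.

    loss_percent_by_step holds the measured loss (%) for each entry in STEPS,
    in the same order.
    """
    for (target, cause), loss in zip(STEPS, loss_percent_by_step):
        if loss > 0:
            return target, cause
    return None


# Made-up measurements: clean to the AP and switch, lossy beyond that.
print(locate_loss([0, 0, 12]))
```

If every step is clean, the function returns None, which suggests the loss is occurring beyond the first layer 3 device.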
Common causes of packet loss
There are many potential causes for packet loss. This article will outline some of the more common reasons and what can be done about them.
Duplex/speed settings mismatch
This occurs when two ends of a link are using different speed/duplex settings, such as 100Mbps/half-duplex and 1000Mbps/full-duplex. When this occurs, some or all traffic will be lost on the link. To correct this, ensure both sides of the link have identical settings. Ideally, both ends of the connection should be set to "Auto" for both speed and duplex. If a speed or duplex setting must be manually set on one end, ensure that it has been set to the same values on the other end as well.
Link congestion (too much traffic)
This occurs when more traffic is attempting to cross a network link than the link can support, such as 60Mbps of traffic passing over a 20Mbps link. This creates a bottleneck, resulting in some traffic being dropped.
There are multiple ways to solve this, including:
Increase the capacity of the link being overwhelmed to allow for all traffic.
Apply traffic shaping rules on MX security appliances or MR access points to limit the volume of traffic, particularly undesirable traffic.
Apply traffic shaping rules on MX security appliances or MR access points to prioritize more important traffic.
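To put a number on the 60Mbps over 20Mbps example, the following sketch estimates the share of traffic that cannot fit through the link. It is a deliberate simplification: it assumes senders keep transmitting at a constant rate (as some UDP streams do) and ignores buffering and TCP slowing down in response to loss, so real-world loss will usually be lower. The function name drop_fraction is hypothetical.

```python
def drop_fraction(offered_mbps, capacity_mbps):
    """Fraction of offered traffic that exceeds the link capacity."""
    if offered_mbps <= capacity_mbps:
        return 0.0
    return (offered_mbps - capacity_mbps) / offered_mbps


# 60 Mbps offered to a 20 Mbps link: 40 of every 60 Mbps cannot pass.
print(f"{drop_fraction(60, 20):.0%}")  # prints 67%
```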
Firewall blocking certain traffic
Even if packet loss isn't occurring for all types of traffic, an upstream firewall may be filtering certain types of traffic. This can result in some websites loading and others failing, or some services being accessible, while others are not. If a firewall exists between two devices/locations experiencing these symptoms, ensure that the firewall is not blocking the traffic that is experiencing the problem.
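Because ping only exercises ICMP, checking whether a specific TCP port can be reached is one way to spot this kind of selective filtering. The sketch below uses Python's standard socket module; the function name tcp_port_reachable is hypothetical, and the host and port in the usage comment are placeholders. A failed connection does not prove a firewall is responsible, since the service itself may simply be down.

```python
import socket


def tcp_port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (placeholder host and port):
# tcp_port_reachable("example.com", 443)
```

Comparing results for several ports on the same host (for example one that works and one that fails) helps show whether only certain types of traffic are affected.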
Bad cable or loose connection
A cable that has been poorly/incorrectly terminated or damaged can result in an incomplete or inaccurate electrical signal passing between devices. Swapping a cable with a new one, or performing a cable test on the one in question, can help to eliminate this as a possibility. Similarly, a cable that has not been fully seated in the port, or has been seated in a port with dust or other non-conductive debris on the pins, can result in an incomplete electrical signal. Be sure to keep all ports free of dust or build-up and ensure cables are securely connected.