Network Connectivity Troubleshooting: Layer-by-Layer Diagnostics
Knowledge Overview
Prerequisites
- Basic Linux command line knowledge and terminal navigation
- Understanding of fundamental networking concepts (IP addresses, subnets, gateways)
- Root or sudo access to a Linux system for diagnostic commands
- Familiarity with basic text editors (vim, nano) for configuration file inspection
- General understanding of the TCP/IP protocol stack
What You'll Learn
- How to diagnose network problems using the layer-by-layer OSI methodology
- Essential diagnostic commands like ping, traceroute, ip, ss, and tcpdump
- Step-by-step troubleshooting from physical connections to application protocols
- DNS resolution testing and debugging techniques
- Firewall configuration analysis and port connectivity testing
- Performance analysis methods for identifying latency and bandwidth issues
- Advanced scenarios including IPv6, wireless, and container networking
Tools Required
- Core utilities: ping, traceroute, ip, ss (included in most distributions)
- DNS tools: dig, nslookup, host (install dnsutils or bind-utils package)
- Analysis tools: tcpdump, netstat, ethtool, mtr
- Network utilities: nc (netcat), curl, wget
- Optional tools: iperf3 (bandwidth testing), nmap (port scanning), wireshark (GUI packet analysis)
- Root/sudo privileges: Required for most diagnostic commands and packet capture
Time Investment
10 minutes reading time
20-30 minutes hands-on practice
Guide Content
What commands are essential for network connectivity troubleshooting?
Use a systematic layer-by-layer approach to diagnose any network connectivity problem. Starting from physical connections and working up to application protocols, this troubleshooting methodology lets you identify and resolve network problems efficiently using proven diagnostic commands such as ping, traceroute, ip, ss, and tcpdump.
Table of Contents
- How Does the Network Connectivity Troubleshooting Methodology Work?
- What Are the Essential Network Diagnostic Commands?
- How to Verify Physical Layer Connectivity?
- How to Troubleshoot Data Link Layer Issues?
- How to Diagnose Network Layer Problems?
- How to Resolve Transport Layer Issues?
- How to Debug Application Layer Problems?
- What DNS Troubleshooting Techniques Are Most Effective?
- How to Analyze Network Performance Issues?
- What Are Common Firewall Configuration Problems?
- Advanced Network Connectivity Troubleshooting Scenarios
- Frequently Asked Questions
- Troubleshooting Common Error Messages
- Additional Resources
How Does the Network Connectivity Troubleshooting Methodology Work?
Network connectivity troubleshooting follows a systematic layer-by-layer approach based on the OSI model. This methodology starts at the physical layer and progressively moves up through the data link, network, transport, and application layers, isolating problems at each level before proceeding to the next.
The systematic troubleshooting process delivers more efficient problem resolution than random testing. By following the OSI model from bottom to top, you eliminate entire categories of potential issues at each step, dramatically reducing diagnostic time. Experienced system administrators can often resolve complex network problems in minutes using this structured approach instead of hours of trial-and-error testing.
Understanding the OSI Model Layers for Network Connectivity Troubleshooting
The Open Systems Interconnection (OSI) model provides a conceptual framework that separates network functionality into seven distinct layers. For practical Linux troubleshooting, we focus on five key layers where most problems occur.
Layer 1 (Physical): This fundamental layer encompasses all physical network connections, including Ethernet cables, fiber optics, network interface cards (NICs), switches, and routers. Physical layer problems manifest as completely non-functional network interfaces, typically showing "NO-CARRIER" status in interface reports. Common physical issues include disconnected cables, faulty NICs, damaged ports, or power failures affecting network equipment.
Layer 2 (Data Link): The data link layer handles MAC addressing, frame transmission, and local network segment communication. Problems at this layer often involve incorrect VLAN configurations, MAC address conflicts, switch port issues, or ARP table corruption. The network interface configuration guide from kernel.org provides detailed documentation on low-level interface settings that affect data link operation.
Layer 3 (Network): Operating at the IP level, this layer manages routing between different networks, subnet addressing, and packet forwarding. Network layer troubleshooting involves verifying IP address configuration, examining routing tables, testing gateway connectivity, and confirming proper subnet masks. The majority of connectivity problems occur at this layer due to misconfigured IP settings or routing issues.
Layer 4 (Transport): This layer provides end-to-end communication services through TCP and UDP protocols. Transport layer problems include firewall blocking specific ports, service binding failures, connection timeouts, or port conflicts. Tools like ss and netstat help diagnose transport layer issues by revealing active connections and listening ports.
Layer 7 (Application): The application layer represents actual network services and protocols such as HTTP, SSH, DNS, and SMTP. Application layer troubleshooting requires service-specific knowledge and often involves examining application logs, testing protocol-specific commands, and verifying service configurations.
The Systematic Troubleshooting Workflow
Effective network connectivity troubleshooting follows a proven sequential process:
# Step 1: Verify physical connectivity
ip link show
# Step 2: Check IP configuration
ip addr show
# Step 3: Test local gateway
ping -c 4 <gateway_ip>
# Step 4: Test remote connectivity
ping -c 4 8.8.8.8
# Step 5: Test DNS resolution
ping -c 4 google.com
# Step 6: Analyze routing paths
traceroute google.com
# Step 7: Check listening services
ss -tulpn
This workflow systematically eliminates potential problems at each layer. If the interface shows "UP" with carrier detection, you've confirmed the physical layer works. If pinging the gateway succeeds, you've verified data link and basic network connectivity. Successful remote pings confirm proper routing and gateway functionality.
The beauty of this approach is its efficiency. When a report states that an interface is down, you immediately know to focus on the physical and data link layers without wasting time checking DNS or application settings. Conversely, when local pings work but external connections fail, routing or firewall issues are the likely culprits.
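These checks are easy to script for a quick first pass. The sketch below is a minimal example rather than a definitive tool: the interface name eth0 and the test targets (8.8.8.8, google.com) are assumptions to adjust for your environment.
#!/bin/bash
# First-pass triage sketch; interface name and test targets are assumptions
iface=eth0
ip link show "$iface" | grep -q "LOWER_UP" || { echo "Layer 1/2: no carrier on $iface"; exit 1; }
ip -4 addr show "$iface" | grep -q "inet " || { echo "Layer 3: no IPv4 address on $iface"; exit 1; }
gw=$(ip route show default | awk '{print $3; exit}')
[ -n "$gw" ] || { echo "Layer 3: no default route configured"; exit 1; }
ping -c 2 -W 2 "$gw" >/dev/null || { echo "Layer 2/3: default gateway $gw unreachable"; exit 1; }
ping -c 2 -W 2 8.8.8.8 >/dev/null || { echo "Layer 3: no external IP connectivity"; exit 1; }
ping -c 2 -W 2 google.com >/dev/null || { echo "DNS: name resolution failing"; exit 1; }
echo "Basic connectivity checks passed through DNS resolution"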
What Are the Essential Network Diagnostic Commands?
Linux provides a comprehensive toolkit of network diagnostic commands that every system administrator must master. These utilities range from basic connectivity testing to advanced packet analysis, enabling thorough investigation of network problems at all layers.
Core Connectivity Testing Tools
The ping command represents the most fundamental network diagnostic tool. It sends ICMP echo request packets to specified hosts and reports response times, packet loss, and reachability status. While simple, ping provides immediate insight into basic network connectivity:
# Test basic connectivity with limited packet count
ping -c 4 google.com
# Test with specific packet size (useful for MTU testing)
ping -c 4 -s 1472 google.com
# Continuous ping with timestamp
ping -D google.com
# Ping with specified interval (0.2 seconds between packets)
ping -i 0.2 google.com
# Set maximum wait time for response
ping -W 2 -c 4 192.168.1.1
The -c option limits packet count, preventing indefinite testing. The -s option adjusts packet size, crucial for diagnosing Maximum Transmission Unit (MTU) problems. Adding timestamps with -D helps correlate network issues with specific times, particularly useful when investigating intermittent problems.
The traceroute command maps the complete path packets take to reach their destination, revealing each router hop along the route. This diagnostic tool proves invaluable when determining where connectivity breaks down between your system and a remote host:
# Basic traceroute to destination
traceroute google.com
# Use ICMP instead of UDP (may bypass some firewalls)
traceroute -I google.com
# Specify maximum hop count
traceroute -m 15 google.com
# Set initial TTL value
traceroute -f 5 google.com
# Display AS numbers for each hop
traceroute -A google.com
Understanding traceroute output requires recognizing normal and problematic patterns. Asterisks (* * *) indicate routers that don't respond to traceroute probes, which may be normal firewall behavior or actual connectivity problems. Dramatically increasing latency at a specific hop suggests congestion or routing issues at that point in the network path.
Interface Configuration and Status Commands
The modern ip command suite from iproute2 has replaced older tools like ifconfig and route, providing comprehensive network interface management:
# Display all network interfaces with detailed information
ip addr show
# Show specific interface details
ip addr show dev eth0
# Display interface link status
ip link show
# View routing table
ip route show
# Display neighbor (ARP) table
ip neigh show
# Show network statistics
ip -s link show
# Display multicast addresses
ip maddr show
The command ip addr show reveals complete interface configuration including IP addresses, subnet masks, broadcast addresses, MAC addresses, and interface status. The output format clearly distinguishes between interface state ("UP" or "DOWN") and carrier detection ("LOWER_UP" indicates physical connection detected).
Examining routing tables with ip route show exposes how your system forwards packets to different destinations, which makes it a central step in network connectivity troubleshooting. The default route (default via gateway_ip) determines how packets reach networks not explicitly defined in the routing table. A missing or incorrect default route prevents the system from reaching external networks even when local connectivity works.
Connection and Port Analysis Tools
The ss (socket statistics) command provides detailed information about network connections, listening ports, and socket states. It supersedes the older netstat command with improved performance and more detailed output:
# Show all TCP sockets
ss -t
# Show all UDP sockets
ss -u
# Show listening sockets only
ss -l
# Show process information for sockets (requires root)
ss -p
# Show socket memory usage
ss -m
# Combine options: TCP listening sockets with process info
ss -tlp
# Display all sockets with numeric addresses (no DNS resolution)
ss -tulpn
# Show socket statistics summary
ss -s
# Filter connections to specific port
ss -tn sport = :80
# Show established connections
ss -to state established
The most common command combination ss -tulpn displays all TCP and UDP listening sockets with process information and numeric addresses. This prevents slow DNS lookups while troubleshooting and clearly shows which processes bind to which ports. According to Red Hat's networking guide, understanding socket states proves critical when diagnosing connection problems or investigating security issues.
DNS Resolution Testing
DNS problems account for a large percentage of reported "network connectivity issues." Testing DNS resolution separately from basic connectivity helps isolate these problems:
# Query A record using system resolver
nslookup google.com
# Detailed DNS query with dig
dig google.com
# Query specific DNS server
dig @8.8.8.8 google.com
# Short answer format
dig +short google.com
# Reverse DNS lookup
dig -x 8.8.8.8
# Query specific record types
dig google.com MX
dig google.com TXT
dig google.com AAAA
# Trace DNS delegation path
dig +trace google.com
# Check DNSSEC validation
dig +dnssec google.com
The dig command provides comprehensive DNS information including query time, responding DNS server, authoritative nameservers, and detailed record data. Slow query times indicate DNS server performance problems, while SERVFAIL responses suggest DNS configuration errors or upstream DNS issues.
Comparing results between different DNS servers helps identify whether problems are local (your configured DNS) or global (authoritative DNS servers). Testing with public DNS servers like Google's 8.8.8.8 or Cloudflare's 1.1.1.1 provides reliable comparison points.
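To compare resolvers quickly, loop over them with dig and note the reported query time. In the sketch below, 192.168.1.1 stands in for your locally configured DNS server and is purely an assumption.
# Compare query times across several DNS servers (192.168.1.1 is an assumed local resolver)
for server in 192.168.1.1 8.8.8.8 1.1.1.1; do
    echo "== $server =="
    dig @"$server" google.com +noall +stats | grep "Query time"
done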
Advanced Packet Capture and Analysis
For deep network connectivity troubleshooting, tcpdump captures and analyzes actual network packets, revealing exact protocol-level communication:
# Capture packets on specific interface
tcpdump -i eth0
# Capture and write to file
tcpdump -i eth0 -w capture.pcap
# Capture specific number of packets
tcpdump -i eth0 -c 100
# Display ASCII content
tcpdump -i eth0 -A
# Capture only TCP traffic
tcpdump -i eth0 tcp
# Capture traffic on specific port
tcpdump -i eth0 port 80
# Capture traffic to/from specific host
tcpdump -i eth0 host 192.168.1.100
# Capture with detailed protocol information
tcpdump -i eth0 -vv
# Read from capture file
tcpdump -r capture.pcap
# Filter by network
tcpdump -i eth0 net 192.168.1.0/24
Packet captures reveal precise network behavior invisible to higher-level tools. You can observe whether packets actually leave your system, whether responses arrive, exact timing of network events, and detailed protocol interactions. The Wireshark documentation provides extensive guidance on analyzing captured packet data, though tcpdump offers simpler command-line analysis for quick diagnostics.
How to Verify Physical Layer Connectivity?
Physical layer problems manifest as interfaces showing "DOWN" status or "NO-CARRIER" detection. These fundamental connectivity issues prevent all network communication and must be resolved before investigating higher-layer problems.
Checking Interface Status and Link Detection
The first troubleshooting step always involves verifying interface physical status:
# Check interface status
ip link show
# Sample output interpretation:
# 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP
# link/ether 00:0c:29:3a:2f:1b brd ff:ff:ff:ff:ff:ff
# Enable interface if down
sudo ip link set eth0 up
# Display detailed interface statistics
ip -s link show eth0
# Check for physical errors
ethtool -S eth0 | grep -i error
# View interface driver and hardware information
ethtool -i eth0
Interface status flags reveal critical information. The UP flag indicates the interface is administratively enabled, while LOWER_UP confirms physical carrier detection (cable connected and link established). An interface showing UP without LOWER_UP indicates physical connectivity problems.
The state field shows the operational status: UP means functioning, DOWN means disabled or disconnected, and UNKNOWN suggests driver or hardware issues. MTU (Maximum Transmission Unit) affects packet size, with 1500 bytes being standard for Ethernet.
Diagnosing Cable and Port Issues
Physical connectivity problems often stem from simple cable or port failures:
Check cable connections: Ensure Ethernet cables are firmly seated in both the NIC and switch/router ports. Loose connections cause intermittent problems more frustrating than complete failures. Look for bent pins in the Ethernet jack and examine cables for visible damage or kinks.
Verify link lights: Most network interfaces include LED indicators showing link status and activity. No link light indicates cable, port, or interface failure. Consult your hardware documentation for specific LED meanings, as colors and patterns vary between manufacturers.
Test with known-good cable: Cable failures occur more frequently than administrators expect, especially in high-use environments. Swapping a suspected bad cable with a known working one quickly eliminates or confirms cable problems. Professional network testing tools can verify cable integrity and pinout correctness.
Check switch port status: If your interface shows no carrier despite good cables, investigate the switch port. Most managed switches allow checking port status, detecting errors, or revealing administrative disabling. Port security features on switches might automatically disable ports when detecting certain conditions.
Troubleshooting Network Interface Card Issues
# List all network interfaces including disabled ones
ip link show
# Check if interface is recognized by kernel
lspci | grep -i ethernet
lspci | grep -i network
# Display detailed PCI information
lspci -vv -s 02:00.0
# Check for interface driver issues in system logs
dmesg | grep -i eth0
dmesg | grep -i network
# Verify driver is loaded
lsmod | grep <driver_name>
# Reload network interface driver
sudo modprobe -r <driver_name>
sudo modprobe <driver_name>
# Check hardware error counters
ethtool -S eth0 | grep -E "error|drop|collision"
If the interface doesn't appear in ip link show output at all, the system hasn't detected the NIC. This suggests hardware failure, BIOS disabling, driver issues, or improper hardware installation. The lspci command should show the NIC in PCI device listings even when not functioning properly.
Driver problems often appear in dmesg output as error messages during system boot or interface initialization. According to kernel.org networking documentation, many driver issues result from firmware incompatibilities or missing driver modules.
Verifying Power Management Settings
Modern network interfaces support power management features that sometimes cause connectivity problems:
# Check current power management status
ethtool eth0 | grep -i wake
# Disable Wake-on-LAN (can cause link issues)
sudo ethtool -s eth0 wol d
# Check if interface is in power-saving mode
cat /sys/class/net/eth0/device/power/control
# Disable power management for interface
echo on | sudo tee /sys/class/net/eth0/device/power/control
# Check for any USB power management (for USB NICs)
cat /sys/bus/usb/devices/*/power/control
Some systems aggressively power-manage network interfaces, potentially causing link drops or initialization failures. Disabling power management for network interfaces often resolves mysterious connectivity problems, especially on laptops or systems with aggressive power-saving configurations.
How to Troubleshoot Data Link Layer Issues?
Data link layer problems involve MAC addressing, ARP resolution, and local network segment communication. These issues prevent communication even when physical connectivity functions properly.
Examining ARP Tables and MAC Address Resolution
The Address Resolution Protocol (ARP) translates IP addresses to MAC addresses for local network communication. ARP problems prevent communication with devices on the same subnet:
# Display current ARP table
ip neigh show
# Show ARP table with detailed status
arp -a
# Clear specific ARP entry (requires root)
sudo ip neigh del 192.168.1.1 dev eth0
# Flush entire ARP cache
sudo ip neigh flush dev eth0
# Add static ARP entry
sudo ip neigh add 192.168.1.1 lladdr 00:11:22:33:44:55 dev eth0
# Monitor ARP activity in real-time
sudo tcpdump -i eth0 arp -n
# Check for duplicate IP addresses
sudo arping -I eth0 -D 192.168.1.100
The ARP table shows three common states: REACHABLE (entry valid and recently verified), STALE (entry exists but hasn't been verified recently), and FAILED (ARP resolution failed). Multiple FAILED entries for your gateway indicate serious local network problems requiring immediate attention.
Duplicate IP addresses cause erratic connectivity as multiple devices respond to the same IP address. The arping command detects duplicates by sending ARP requests and counting responses. More than one response to an ARP probe confirms duplicate address configuration.
Diagnosing VLAN Configuration Problems
Virtual LANs (VLANs) segment networks logically, and misconfiguration causes complete communication failure despite functioning physical connections:
# Display VLAN interfaces
ip -d link show
# Show VLAN configuration details
cat /proc/net/vlan/config
# Create VLAN interface (VLAN ID 10)
sudo ip link add link eth0 name eth0.10 type vlan id 10
sudo ip addr add 192.168.10.1/24 dev eth0.10
sudo ip link set eth0.10 up
# Remove VLAN interface
sudo ip link delete eth0.10
# Check if traffic is VLAN-tagged
sudo tcpdump -i eth0 -e -n
VLAN mismatches occur when your interface expects untagged traffic but receives VLAN-tagged packets, or vice versa. Network switches must have port configurations matching your interface VLAN settings. According to IEEE 802.1Q standard documentation, VLAN IDs range from 1 to 4094, with VLAN 1 typically being the default native (untagged) VLAN.
Resolving MAC Address Conflicts and Spoofing Issues
# Display current MAC address
ip link show eth0 | grep link/ether
# Change MAC address temporarily
sudo ip link set dev eth0 down
sudo ip link set dev eth0 address 00:11:22:33:44:55
sudo ip link set dev eth0 up
# Check for MAC address conflicts on network
sudo tcpdump -i eth0 -e | grep "00:11:22:33:44:55"
# View MAC-to-port mappings (if switch supports SNMP)
snmpwalk -v2c -c public <switch_ip> 1.3.6.1.2.1.17.4.3.1.2
MAC address conflicts occur when multiple interfaces use identical hardware addresses, causing severe network confusion. This rarely happens accidentally with genuine network cards due to manufacturer-assigned unique MAC addresses, but virtual machines, improperly cloned systems, or manual MAC configuration can create conflicts.
Testing Local Segment Connectivity
Before troubleshooting beyond your local network segment, confirm communication with other local devices:
# Ping another device on local subnet
ping -c 4 192.168.1.50
# Test connectivity to default gateway
ping -c 4 $(ip route | grep default | awk '{print $3}')
# Check if gateway responds to ARP
sudo arping -I eth0 -c 4 192.168.1.1
# Verify switch port is receiving/transmitting
ethtool -S eth0 | grep -E "rx_|tx_"
# Check for network errors and drops
ip -s link show eth0
If you cannot ping other devices on the same subnet (same network, not crossing a router), but physical layer tests pass, suspect data link problems like ARP failures, VLAN mismatches, or switch port issues. Communication within the local subnet doesn't require routing, so routing problems are eliminated when local connectivity fails.
How to Diagnose Network Layer Problems?
Network layer diagnostics focus on IP addressing, routing, and subnet configuration. Most connectivity problems manifest at this layer through misconfigured addresses, incorrect subnet masks, or routing failures.
Verifying IP Address Configuration
Correct IP configuration forms the foundation for all network communication:
# Display current IP configuration
ip addr show
# Check specific interface configuration
ip addr show dev eth0
# View IPv4 addresses only
ip -4 addr show
# View IPv6 addresses only
ip -6 addr show
# Configure static IP address
sudo ip addr add 192.168.1.100/24 dev eth0
# Remove IP address
sudo ip addr del 192.168.1.100/24 dev eth0
# Configure secondary IP address
sudo ip addr add 192.168.1.101/24 dev eth0
# Verify broadcast address configuration
ip addr show dev eth0 | grep brd
The subnet mask determines which IP addresses exist on your local network versus which require routing through gateways. Incorrect subnet masks cause bizarre connectivity patterns where some addresses work while others fail unexpectedly. A /24 mask (255.255.255.0) allows 254 usable host addresses in the same subnet, while a /25 mask splits that into two separate subnets of 126 hosts each.
Broadcast addresses enable network-wide communication to all hosts on the subnet. An incorrect broadcast address prevents proper DHCP operation and breaks some network protocols relying on broadcast communication.
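As a worked example of how the mask decides what counts as local, the bash sketch below computes the network address of two hosts under a /25 mask; the addresses are illustrative only.
# Two hosts under a /25 mask land in different subnets (addresses are illustrative)
mask=25
for ip in 192.168.1.100 192.168.1.200; do
    IFS=. read -r a b c d <<< "$ip"
    num=$(( (a << 24) + (b << 16) + (c << 8) + d ))
    net=$(( num & ~((1 << (32 - mask)) - 1) & 0xFFFFFFFF ))
    printf '%s/%s -> network %d.%d.%d.%d\n' "$ip" "$mask" \
        $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) $(( (net >> 8) & 255 )) $(( net & 255 ))
done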
Testing Gateway and Router Connectivity
The default gateway provides access to networks beyond your local subnet:
# Display routing table
ip route show
# Show default gateway
ip route show default
# Test gateway connectivity
ping -c 4 $(ip route | grep default | awk '{print $3}')
# Add default gateway
sudo ip route add default via 192.168.1.1
# Add route to specific network
sudo ip route add 10.0.0.0/8 via 192.168.1.1
# Delete route
sudo ip route del 10.0.0.0/8
# Display routing decisions for destination
ip route get 8.8.8.8
# Check if packets actually reach gateway
sudo tcpdump -i eth0 -n host <gateway_ip>
A functioning gateway responds to pings and appears in the ARP table. If pinging the gateway fails but other local devices work, suspect gateway problems, firewall issues blocking ICMP, or gateway interface failures.
The ip route get command shows exactly how the system will route packets to a specific destination, revealing which interface, gateway, and source address the kernel will use. This diagnostic tool quickly identifies routing misconfigurations causing traffic to use wrong interfaces or gateways.
Analyzing Subnet Configuration Problems
# Calculate network and broadcast addresses
ipcalc 192.168.1.100/24
# Compute the network for a second address and compare it with the first (ipcalc options vary by distribution)
ipcalc 192.168.1.200/24
# View routing table in detail
ip route show table all
# Display policy routing rules
ip rule show
# Check MTU configuration
ip link show | grep mtu
# Test MTU path discovery
ping -c 4 -M do -s 1472 google.com
MTU mismatches cause mysterious connectivity problems where small packets work but large transfers fail. The standard Ethernet MTU is 1500 bytes, but VPN tunnels, PPPoE connections, or jumbo frames change this. RFC 1191 describes Path MTU Discovery, which automatically determines appropriate packet sizes, but misconfigured firewalls blocking ICMP can break this mechanism.
Troubleshooting Routing Problems
# Trace route to destination
traceroute -n google.com
# Use ICMP instead of UDP for traceroute
traceroute -I google.com
# Trace path with maximum hops
traceroute -m 20 google.com
# Display route with MTR (combines ping and traceroute)
mtr google.com
# Check for asymmetric routing
sudo tcpdump -i any -n icmp
# Verify packet forwarding is enabled (for routers)
sysctl net.ipv4.ip_forward
# Enable IP forwarding temporarily
sudo sysctl -w net.ipv4.ip_forward=1
Traceroute reveals the complete path packets take to reach destinations. Timeouts (* * *) at specific hops indicate routing problems, firewalls blocking traceroute probes, or network congestion. Dramatically increasing latency at particular hops suggests network congestion requiring investigation.
The mtr command combines ping and traceroute functionality, continuously monitoring the route and providing statistics on packet loss and latency at each hop. This tool proves particularly valuable when investigating intermittent connectivity problems related to specific network paths.
Testing Network Connectivity Without DNS
Isolating DNS problems from network connectivity requires testing with IP addresses:
# Test connectivity to Google DNS (doesn't require DNS resolution)
ping -c 4 8.8.8.8
# Test connectivity to Cloudflare DNS
ping -c 4 1.1.1.1
# Trace route using IP address only
traceroute -n 8.8.8.8
# Test HTTP connectivity without DNS
curl -I http://93.184.216.34 # example.com IP
# Fetch webpage using IP address
wget http://93.184.216.34/index.html
If pinging IP addresses succeeds but domain names fail, you've isolated the problem to DNS resolution rather than network connectivity. This critical distinction prevents wasting time investigating routing or firewall issues when DNS configuration is the actual culprit.
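That distinction can be captured in a short check. The sketch below uses getent, which goes through the system resolver, so the result reflects what ordinary applications see; the targets are examples.
# Distinguish DNS failure from loss of IP connectivity (targets are examples)
if ping -c 2 -W 2 8.8.8.8 >/dev/null 2>&1; then
    if getent hosts example.com >/dev/null; then
        echo "IP connectivity and DNS resolution both work"
    else
        echo "IP connectivity works but DNS resolution fails - check /etc/resolv.conf"
    fi
else
    echo "No IP connectivity - check routing, gateway, and physical layers first"
fi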
How to Resolve Transport Layer Issues?
Transport layer troubleshooting focuses on TCP/UDP connectivity, port accessibility, and firewall configuration. These problems prevent specific applications from communicating despite functioning network connectivity.
Identifying Port Connectivity Problems
# Test if specific port is open and accepting connections
nc -zv google.com 80
nc -zv google.com 443
# Scan for open ports on remote host
nmap -p 80,443,22 example.com
# Test UDP port
nc -zuv example.com 53
# Check if local port is listening
ss -tulpn | grep :80
# Test connection with timeout
timeout 5 bash -c "</dev/tcp/example.com/80" && echo "Port 80 open"
# Display all listening ports
ss -tulpn
# Show established connections
ss -tunp | grep ESTAB
The netcat (nc) command provides simple port connectivity testing. When testing succeeds to the IP address but fails to the domain name, DNS resolution problems exist. When testing fails to both, firewall blocking, service not running, or network routing problems are likely causes.
Port scanning with nmap reveals which ports respond on remote systems, but use this tool only on systems you own or have explicit permission to test. Unauthorized port scanning often violates network acceptable use policies and may be illegal in some jurisdictions. The nmap documentation provides comprehensive guidance on proper usage.
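When several ports need checking at once, a small nc loop keeps the output compact. The host and port list below are placeholders; only test systems you are authorized to probe.
# Probe a list of TCP ports on one host (host and ports are placeholders)
host=example.com
for port in 22 80 443 3306; do
    if nc -z -w 3 "$host" "$port" 2>/dev/null; then
        echo "Port $port: open"
    else
        echo "Port $port: closed or filtered"
    fi
done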
Diagnosing Firewall Configuration Issues
# Check firewalld status
sudo firewall-cmd --state
# List all firewalld rules
sudo firewall-cmd --list-all
# Check iptables rules
sudo iptables -L -n -v
# Display iptables with line numbers
sudo iptables -L --line-numbers
# Check nftables rules
sudo nft list ruleset
# Test if firewall is blocking specific port
sudo tcpdump -i any port 80
# Temporarily disable firewalld (testing only!)
sudo systemctl stop firewalld
# Temporarily flush iptables rules (testing only!)
sudo iptables -F
Firewall rules can block outgoing or incoming connections on specific ports. Testing connectivity with firewall temporarily disabled (on test systems only!) quickly determines if firewall rules cause the problem. According to Red Hat's firewall documentation, firewalld provides zone-based firewall management simpler than raw iptables rules.
Analyzing Connection States and Timeouts
# Display TCP connection states
ss -tan | awk '{print $1}' | sort | uniq -c
# Monitor connection establishment
watch -n 1 'ss -tan | grep :80'
# Check TIME_WAIT sockets
ss -tan | grep TIME_WAIT | wc -l
# Adjust TCP timeout values (advanced)
sysctl net.ipv4.tcp_fin_timeout
sysctl net.ipv4.tcp_keepalive_time
# Display connection errors
ss -tie
# Monitor dropped connections
nstat TcpAttemptFails TcpRetransSegs
# Check for connection queue overflows
cat /proc/net/sockstat
Connection states reveal transport layer health. Large numbers of connections stuck in SYN_SENT indicate problems reaching remote services. Excessive TIME_WAIT connections suggest high connection turnover but generally don't indicate problems. CLOSE_WAIT states accumulating over time indicate application bugs not properly closing connections.
Testing Service Availability and Response
# Test HTTP service response
curl -I http://example.com
# Test HTTPS with detailed connection info
curl -Iv https://example.com
# Test with specific timeout
curl --connect-timeout 5 http://example.com
# Test SMTP service
nc -v mail.example.com 25
# Test SSH service
ssh -v user@host
# Check service listening status
ss -tulpn | grep :22
# Verify service is running
systemctl status sshd
Services must be running and bound to the correct IP addresses and ports. The ss -tulpn output shows which address a service binds to: 0.0.0.0 means listening on all interfaces, 127.0.0.1 means local connections only, and specific IP addresses mean listening on particular interfaces only.
Resolving TCP Window and Buffer Issues
# Display current TCP window settings
sysctl net.ipv4.tcp_window_scaling
sysctl net.core.rmem_max
sysctl net.core.wmem_max
# Show TCP memory usage
cat /proc/net/sockstat
# Monitor TCP statistics
nstat -az | grep -i tcp
# Check for TCP retransmissions
netstat -s | grep retransmit
# Optimize TCP for high-latency connections
sudo sysctl -w net.ipv4.tcp_window_scaling=1
sudo sysctl -w net.core.rmem_max=16777216
sudo sysctl -w net.core.wmem_max=16777216
TCP window sizing affects throughput on high-latency or high-bandwidth connections. Small buffers limit transfer speeds even on fast networks. The kernel automatically adjusts TCP windows when window scaling is enabled, but manual tuning might be necessary for optimal performance in specific situations.
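Values set with sysctl -w are lost at reboot. If tuning proves useful, persisting it in a drop-in file under /etc/sysctl.d/ is the usual approach; the file name below is arbitrary.
# Persist TCP buffer tuning across reboots (file name is arbitrary)
sudo tee /etc/sysctl.d/90-tcp-tuning.conf << 'EOF'
net.ipv4.tcp_window_scaling = 1
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
EOF
# Apply all configured sysctl files without rebooting
sudo sysctl --system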
How to Debug Application Layer Problems?
Application layer troubleshooting requires protocol-specific knowledge and often involves examining service logs and testing protocol implementations. Problems at this layer occur despite functioning network connectivity at lower layers.
Testing HTTP/HTTPS Services
# Basic HTTP GET request
curl http://example.com
# Display response headers only
curl -I http://example.com
# Follow redirects automatically
curl -L http://example.com
# Test HTTPS with certificate verification
curl -v https://example.com
# Ignore SSL certificate errors (testing only!)
curl -k https://example.com
# Test with custom headers
curl -H "User-Agent: Custom" http://example.com
# POST data to endpoint
curl -X POST -d "param=value" http://example.com/api
# Test HTTP/2 support
curl --http2 -I https://example.com
# Show detailed timing information
curl -w "@curl-format.txt" -o /dev/null -s http://example.com
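The "@curl-format.txt" reference above expects a local file of curl write-out variables. A plausible minimal version is sketched here; adjust the labels and variables to taste.
# Sample curl-format.txt built from curl's write-out variables
cat > curl-format.txt << 'EOF'
    namelookup:  %{time_namelookup}s
       connect:  %{time_connect}s
 tls handshake:  %{time_appconnect}s
 starttransfer:  %{time_starttransfer}s
         total:  %{time_total}s
EOF
curl -w "@curl-format.txt" -o /dev/null -s http://example.com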
HTTP response codes indicate service health: 200 (OK), 301/302 (redirect), 400 (bad request), 403 (forbidden), 404 (not found), 500 (server error), 502 (bad gateway), 503 (service unavailable). Each code category provides clues about where problems exist.
Diagnosing SSH Connection Issues
# Test SSH with verbose output
ssh -v user@host
# Test with extreme verbosity
ssh -vvv user@host
# Use specific SSH key
ssh -i ~/.ssh/specific_key user@host
# Test SSH port connectivity
nc -zv host 22
# Check SSH server status
systemctl status sshd
# View SSH server logs
sudo journalctl -u sshd -f
# Test SSH configuration
sudo sshd -t
# Debug SSH client configuration
ssh -G user@host
SSH verbosity reveals exactly where connection establishment fails: DNS resolution, network connectivity, SSH protocol negotiation, authentication, or authorization. Connection refusal indicates the SSH service isn't running, while authentication failures point to credential or permission problems.
Troubleshooting Email Service Problems
# Test SMTP service
telnet mail.example.com 25
# Test with proper SMTP commands
nc mail.example.com 25 << EOF
HELO client.example.com
MAIL FROM:<sender@example.com>
RCPT TO:<recipient@example.com>
DATA
Subject: Test
Test message
.
QUIT
EOF
# Check email queue
mailq
# View mail server logs
sudo tail -f /var/log/mail.log
# Test DNS MX records
dig example.com MX
# Verify SPF records
dig example.com TXT
Email problems often stem from DNS configuration (incorrect MX records), authentication issues, spam filtering, or recipient server rejection. Testing with direct SMTP commands isolates whether problems exist with your mail server configuration or with email client software.
Debugging Database Connectivity
# Test MySQL/MariaDB connection
mysql -h dbhost -u user -p
# Test PostgreSQL connection
psql -h dbhost -U user -d database
# Check if database port is accessible
nc -zv dbhost 3306 # MySQL
nc -zv dbhost 5432 # PostgreSQL
# Test connection timeout issues
time mysql -h dbhost -u user -p
# Monitor database connections
ss -tunp | grep :3306
Database connection problems often involve network connectivity (firewall blocking ports), authentication (incorrect credentials), or authorization (user lacks necessary privileges). Testing from the database server itself versus remote clients helps distinguish network problems from configuration issues.
Analyzing Application Logs
# View system log
sudo journalctl -f
# Filter logs for specific service
sudo journalctl -u apache2 -f
# View last 100 log entries
sudo journalctl -n 100
# Show logs since specific time
sudo journalctl --since "1 hour ago"
# View kernel messages
dmesg | tail -50
# Monitor Apache access log
tail -f /var/log/apache2/access.log
# Monitor Apache error log
tail -f /var/log/apache2/error.log
# Search logs for errors
sudo journalctl -p err -b
Application logs provide definitive information about service behavior, error conditions, and failed operations. Most network-related application problems leave clear log evidence revealing the exact failure point.
What DNS Troubleshooting Techniques Are Most Effective?
DNS problems disguise themselves as general network connectivity issues, but systematic DNS testing quickly identifies resolution failures. Effective DNS troubleshooting requires understanding the resolution process and testing at multiple points.
Testing DNS Resolution with dig
The dig command provides comprehensive DNS debugging information:
# Basic DNS query
dig example.com
# Query specific DNS server
dig @8.8.8.8 example.com
# Short answer format
dig +short example.com
# Query specific record type
dig example.com A
dig example.com AAAA
dig example.com MX
dig example.com NS
dig example.com TXT
dig example.com SOA
# Trace DNS delegation path
dig +trace example.com
# Reverse DNS lookup
dig -x 8.8.8.8
# Check DNSSEC validation
dig +dnssec example.com
# Query with recursion disabled (RD bit off; the resolver returns only cached or authoritative data)
dig +norecurse example.com
# Display query timing
dig example.com | grep "Query time"
DNS query timing reveals performance problems. Queries completing in under 50ms indicate healthy DNS, while queries taking seconds suggest DNS server problems, network congestion, or misconfiguration. The +trace option shows every step of DNS resolution from root servers through authoritative nameservers, exposing where resolution fails or becomes slow.
Analyzing DNS Configuration Files
# Display current DNS resolver configuration
cat /etc/resolv.conf
# Check local hosts file
cat /etc/hosts
# View systemd-resolved status
systemd-resolve --status
# Display DNS cache statistics
systemd-resolve --statistics
# Flush DNS cache
sudo systemd-resolve --flush-caches
# Test DNS resolution method
getent hosts example.com
# Check nsswitch configuration
cat /etc/nsswitch.conf
# Test resolver library directly
host example.com
The /etc/resolv.conf file configures system DNS servers. Multiple nameserver entries provide fallback when the primary server fails. The search directive automatically appends domain suffixes to unqualified hostnames, simplifying internal network naming.
Modern systems often use systemd-resolved for DNS resolution, which maintains its own DNS cache and configuration. The /etc/resolv.conf file becomes a symbolic link pointing to systemd-resolved's stub resolver, causing confusion when administrators expect traditional resolver configuration.
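To see at a glance which situation applies, check whether /etc/resolv.conf is a symlink and where it points; the sketch below assumes a systemd-based distribution.
# Determine whether /etc/resolv.conf is a stub managed by systemd-resolved
if [ -L /etc/resolv.conf ]; then
    echo "symlink -> $(readlink -f /etc/resolv.conf)"
else
    echo "regular file (static or manually managed)"
fi
grep ^nameserver /etc/resolv.conf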
Diagnosing DNS Cache Issues
# Check if system uses DNS caching
systemctl status systemd-resolved
# Display cached DNS entries
sudo systemd-resolve --statistics
# Flush system DNS cache
sudo systemd-resolve --flush-caches
# Clear browser DNS cache varies by browser
# Chrome: chrome://net-internals/#dns
# Monitor DNS queries in real-time
sudo tcpdump -i any -n port 53
# Check DNS response times with cache cold vs warm
time dig example.com
time dig example.com
Stale DNS cache entries cause confusion when DNS records change but systems continue using old cached values. The DNS TTL (Time To Live) value determines cache duration, with lower values enabling faster updates but increasing DNS query load.
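One way to confirm caching behavior is to watch the TTL column (the second field) in successive answers; on a caching resolver the value counts down between queries until the record expires.
# Observe the TTL counting down on a caching resolver
dig +noall +answer example.com
sleep 10
dig +noall +answer example.com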
Troubleshooting Split DNS and Internal Resolvers
# Test external DNS resolution
dig @8.8.8.8 example.com
# Compare internal vs external resolution
dig @<internal_dns> internal.example.com
dig @8.8.8.8 internal.example.com
# Check DNS search domain configuration
cat /etc/resolv.conf | grep search
# Test FQDN vs short name resolution
dig server01
dig server01.internal.example.com
# Verify VPN DNS configuration
resolvectl dns
Split DNS configurations use different DNS servers for internal versus external queries, common in corporate environments. VPN connections often inject additional DNS servers for accessing internal resources, but misconfiguration causes either loss of internal name resolution or inability to resolve public DNS names.
Resolving DNSSEC Validation Problems
# Check if DNSSEC validation is enabled
dig +dnssec example.com
# Verify DNSSEC chain of trust
dig +dnssec +multiline example.com
# Test specific DNSSEC-enabled domain
dig +dnssec cloudflare.com
# Check for AD (authenticated data) flag
dig example.com | grep flags
# Disable DNSSEC validation on a link temporarily (testing only)
sudo systemd-resolve --interface=eth0 --set-dnssec=no
# Verify DNSSEC configuration
cat /etc/systemd/resolved.conf | grep DNSSEC
DNSSEC provides DNS security by cryptographically signing DNS records, but misconfigurations or stale signatures cause validation failures preventing name resolution. The AD flag in dig output indicates successful DNSSEC validation, while SERVFAIL responses often indicate DNSSEC problems.
How to Analyze Network Performance Issues?
Network performance problems manifest as slow connections, high latency, or intermittent connectivity. Systematic performance analysis identifies bottlenecks and capacity issues distinct from complete connectivity failures.
Measuring Latency and Packet Loss
# Basic latency test
ping -c 100 example.com
# Calculate latency statistics
ping -c 100 example.com | tail -2
# Test with specific packet sizes
ping -c 100 -s 1400 example.com
# Continuous latency monitoring with MTR
mtr --report --report-cycles 100 example.com
# Test latency to multiple hops
traceroute -n example.com
# Monitor for packet loss patterns
ping -i 0.2 example.com | grep -v "time="
Latency (round-trip time) under 50ms indicates excellent local connectivity, 50-100ms represents good performance for most applications, and latency over 200ms causes noticeable degradation for interactive services. Packet loss above 1% seriously impacts TCP performance and causes user-perceptible problems.
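For unattended checks, the loss percentage can be pulled from ping's summary line and compared against a threshold. The target and the 1% threshold below are examples mirroring the rule of thumb above.
# Flag packet loss above 1% (target and threshold are examples)
loss=$(ping -c 50 -q example.com | grep -o '[0-9.]*% packet loss' | cut -d'%' -f1)
echo "Measured packet loss: ${loss}%"
awk -v l="$loss" 'BEGIN { if (l > 1) print "WARNING: packet loss above 1% will degrade TCP throughput" }'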
Testing Bandwidth and Throughput
# Install iperf3 for bandwidth testing
sudo apt install iperf3 # Debian/Ubuntu
sudo yum install iperf3 # RHEL/CentOS
# Run iperf3 server on target system
iperf3 -s
# Test TCP throughput from client
iperf3 -c <server_ip>
# Test with parallel streams
iperf3 -c <server_ip> -P 4
# Test UDP throughput
iperf3 -c <server_ip> -u -b 100M
# Test for specific duration
iperf3 -c <server_ip> -t 60
# Test reverse direction
iperf3 -c <server_ip> -R
# Generate JSON output for analysis
iperf3 -c <server_ip> -J
Bandwidth testing with iperf3 reveals actual network throughput compared to theoretical capacity. Significantly lower throughput than expected indicates network congestion, misconfigured interfaces, or physical layer problems. The iperf3 documentation provides comprehensive guidance on interpreting results.
Identifying Network Congestion
# Monitor interface statistics
watch -n 1 'cat /proc/net/dev'
# Display detailed interface counters
ip -s link show eth0
# Check for TX/RX errors and drops
ethtool -S eth0 | grep -E "error|drop"
# Monitor bandwidth usage with vnstat
vnstat -l -i eth0
# Display real-time bandwidth usage
iftop -i eth0
# Monitor per-process network usage
nethogs eth0
# Check for interface buffer overruns
ifconfig eth0 | grep overruns
Interface drops, errors, or overruns indicate problems handling traffic volume. TX (transmit) errors often suggest physical problems or duplex mismatches, while RX (receive) errors might indicate incoming traffic exceeding processing capacity.
Analyzing TCP Performance Issues
# Display TCP statistics
nstat -az | grep -i tcp
# Check for retransmission issues
netstat -s | grep retrans
# Monitor connection queue lengths
ss -tlni
# Check TCP window scaling
sysctl net.ipv4.tcp_window_scaling
# Display TCP memory usage
cat /proc/net/sockstat
# Check for window size problems
ss -tie dst <remote_ip>
# Monitor TCP state machine
watch -n 1 'ss -tan | cut -d" " -f1 | tail -n+2 | sort | uniq -c'
High retransmission rates degrade TCP performance dramatically. More than 1% retransmission rate indicates network problems, congestion, or lossy links. According to RFC 2525, TCP performance depends heavily on proper window sizing and avoiding unnecessary retransmissions.
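A rough retransmission rate can be derived from the kernel's cumulative TCP counters. The sketch below reads OutSegs and RetransSegs from /proc/net/snmp; because the counters accumulate since boot, compare two samples taken a few minutes apart for a current rate.
# Approximate TCP retransmission rate from cumulative counters in /proc/net/snmp
awk '/^Tcp:/ {
    if (!seen) { for (i = 1; i <= NF; i++) col[$i] = i; seen = 1 }
    else {
        out = $(col["OutSegs"]); ret = $(col["RetransSegs"])
        printf "retransmitted %d of %d segments (%.2f%%)\n", ret, out, 100 * ret / out
    }
}' /proc/net/snmp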
Diagnosing MTU Path Discovery Problems
# Test maximum MTU without fragmentation
ping -M do -s 1472 example.com
# Find optimal MTU for path
tracepath example.com
# Test with larger packet sizes
ping -M do -s 1500 example.com
ping -M do -s 1400 example.com
ping -M do -s 1300 example.com
# Check interface MTU configuration
ip link show | grep mtu
# Set interface MTU
sudo ip link set dev eth0 mtu 1400
# Monitor fragmentation
netstat -s | grep -i frag
MTU problems cause mysterious failures where small transfers work but large data transfers fail. VPN tunnels, PPPoE connections, and certain network configurations require smaller MTU values than the standard 1500 bytes. Path MTU Discovery automatically determines appropriate packet sizes, but firewalls blocking ICMP "fragmentation needed" messages break this mechanism.
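To narrow down the largest payload a path carries without fragmentation, a simple descending sweep with ping -M do works well. Remember that ICMP and IP headers add 28 bytes, so a 1472-byte payload corresponds to a 1500-byte MTU; the target below is an example.
# Sweep payload sizes downward until one passes without fragmentation (target is an example)
for size in 1472 1452 1432 1400 1372; do
    if ping -c 1 -W 2 -M do -s "$size" example.com >/dev/null 2>&1; then
        echo "Largest working payload: $size bytes (path MTU roughly $(( size + 28 )) bytes)"
        break
    fi
done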
What Are Common Firewall Configuration Problems?
Firewall misconfigurations rank among the most common causes of network connectivity failures. Understanding firewall rule evaluation and systematic testing prevents hours of frustrating troubleshooting.
Diagnosing Firewalld Issues
# Check firewalld status
sudo firewall-cmd --state
# Display default zone
sudo firewall-cmd --get-default-zone
# List all configured zones
sudo firewall-cmd --get-zones
# Show active zones
sudo firewall-cmd --get-active-zones
# Display all rules in default zone
sudo firewall-cmd --list-all
# Check specific zone configuration
sudo firewall-cmd --zone=public --list-all
# Verify service is allowed
sudo firewall-cmd --list-services
# Check allowed ports
sudo firewall-cmd --list-ports
# Test if specific port is open
sudo firewall-cmd --query-port=80/tcp
# Add service temporarily
sudo firewall-cmd --add-service=http
# Make rule permanent
sudo firewall-cmd --permanent --add-service=http
sudo firewall-cmd --reload
Firewalld uses zones to apply different rule sets to different network interfaces or connection types. The default zone applies to interfaces not assigned to specific zones. Traffic not explicitly allowed by rules is blocked, so missing allow rules cause connection failures.
Troubleshooting iptables Rules
# Display all iptables rules
sudo iptables -L -n -v
# Show rules with line numbers
sudo iptables -L --line-numbers
# Display NAT table rules
sudo iptables -t nat -L -n -v
# Display mangle table rules
sudo iptables -t mangle -L -n -v
# Check INPUT chain rules
sudo iptables -L INPUT -n -v
# Check OUTPUT chain rules
sudo iptables -L OUTPUT -n -v
# Test specific rule matching
sudo iptables -I INPUT 1 -p tcp --dport 80 -j LOG --log-prefix "HTTP: "
sudo journalctl -kf
# Save current rules
sudo iptables-save > /tmp/iptables.rules
# Restore rules
sudo iptables-restore < /tmp/iptables.rules
Iptables processes rules sequentially from top to bottom, with first match winning. A broad DENY rule before specific ALLOW rules blocks traffic despite ALLOW rules existing. The counters (packet and byte counts) in verbose output reveal which rules match traffic, helping identify where blocking occurs.
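Because ordering decides the outcome, unblocking a service usually means inserting the ACCEPT above the offending DROP rather than appending it to the end. In the sketch below, the rule position (3) and port are hypothetical.
# Find the position of the broad DROP, then insert the ACCEPT above it (position 3 is hypothetical)
sudo iptables -L INPUT -n --line-numbers
sudo iptables -I INPUT 3 -p tcp --dport 443 -j ACCEPT
# Confirm the new rule's packet counters increase when traffic arrives
sudo iptables -L INPUT -n -v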
Testing with Firewall Temporarily Disabled
# Stop firewalld temporarily (for testing only!)
sudo systemctl stop firewalld
# Test connectivity with firewall disabled
ping -c 4 example.com
# Re-enable firewall
sudo systemctl start firewalld
# Alternatively, test with iptables flushed (DANGEROUS!)
sudo iptables -P INPUT ACCEPT
sudo iptables -P FORWARD ACCEPT
sudo iptables -P OUTPUT ACCEPT
sudo iptables -F
# Restore firewall rules
sudo systemctl restart firewalld
WARNING: Disabling firewalls even temporarily creates security risks. Only perform these tests on non-production systems or with appropriate security measures. If connectivity works with firewall disabled but fails with firewall enabled, you've definitively identified firewall configuration as the problem.
Analyzing Connection Tracking Issues
# Display connection tracking table
sudo cat /proc/net/nf_conntrack
# Show connection tracking statistics
sudo conntrack -S
# Display tracked connections
sudo conntrack -L
# Monitor new connections
sudo conntrack -E
# Check connection tracking table size
sudo sysctl net.netfilter.nf_conntrack_max
# View current table usage
sudo wc -l /proc/net/nf_conntrack
# Clear connection tracking table (DISRUPTIVE!)
sudo conntrack -F
Connection tracking (conntrack) maintains state for network connections, enabling stateful firewall rules. Exhausted connection tracking tables cause new connection failures despite firewall rules permitting the traffic. High-traffic servers require increased nf_conntrack_max values to accommodate connection volume.
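A quick way to judge whether table exhaustion is plausible is to compare current usage with the configured limit; this assumes the nf_conntrack module is loaded.
# Compare current conntrack usage with the table limit
count=$(sysctl -n net.netfilter.nf_conntrack_count)
max=$(sysctl -n net.netfilter.nf_conntrack_max)
echo "conntrack entries: $count of $max ($(( 100 * count / max ))% used)"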
Debugging Port Forwarding and NAT
# Display NAT rules
sudo iptables -t nat -L -n -v
# Add port forwarding rule
sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to-destination 192.168.1.100:8080
# Enable IP forwarding
sudo sysctl -w net.ipv4.ip_forward=1
# Test port forwarding
nc -zv <external_ip> 80
# Monitor NAT translations
sudo conntrack -L | grep NAT
# Check masquerading rules
sudo iptables -t nat -L POSTROUTING -n -v
Network Address Translation (NAT) problems prevent port forwarding and private network communication through gateways. Missing MASQUERADE or SNAT rules in the POSTROUTING chain prevent reply packets from returning correctly. The Netfilter documentation provides comprehensive NAT configuration guidance.
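For a Linux gateway forwarding a private LAN, the typical minimum is IP forwarding plus a MASQUERADE rule on the outbound interface. The subnet and interface name below are assumptions, not a complete firewall configuration.
# Typical minimum for source NAT on a gateway (subnet and interface are assumptions)
sudo sysctl -w net.ipv4.ip_forward=1
sudo iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth0 -j MASQUERADE
# Verify the rule and watch its counters increment as clients generate traffic
sudo iptables -t nat -L POSTROUTING -n -v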
Advanced Network Connectivity Troubleshooting Scenarios
Complex network problems often require combining multiple diagnostic techniques and understanding subtle protocol interactions. These advanced scenarios build on fundamental troubleshooting skills.
Diagnosing Intermittent Connectivity Issues
Intermittent problems frustrate administrators because symptoms disappear before diagnosis completes:
# Continuous monitoring with timestamps
ping -D example.com | while read line; do echo "$(date +%H:%M:%S) $line"; done
# Log connectivity over extended period
while true; do
date >> /tmp/connectivity.log
ping -c 10 -W 1 example.com >> /tmp/connectivity.log 2>&1
sleep 60
done
# Monitor with MTR for pattern detection
mtr --report --report-cycles 1000 example.com
# Track interface status changes
ip monitor link
# Monitor for interface errors
watch -n 1 'ethtool -S eth0 | grep -E "error|drop|collision"'
# Log network events
sudo journalctl -f -u NetworkManager
Intermittent issues often correlate with specific times (network backup schedules, batch jobs, peak usage periods) or events (interface flapping, wireless interference, temperature-dependent hardware failures). Extended monitoring reveals patterns invisible during short-term testing.
Troubleshooting Asymmetric Routing
Asymmetric routing occurs when outbound and inbound packets follow different paths, potentially causing connection failures:
# Trace outbound path
traceroute -n example.com
# Capture packets to verify inbound path
sudo tcpdump -i any -n icmp
# Check for multiple default gateways
ip route show
# Display policy routing rules
ip rule show
# Monitor routing table changes
ip monitor route
# Test with specific source address
ping -I eth0 example.com
ping -I eth1 example.com
# Verify reverse path filtering settings
sysctl net.ipv4.conf.all.rp_filter
Asymmetric routing itself doesn't necessarily cause problems, but combined with strict firewall rules or connection tracking, it can break connectivity. Reverse path filtering (rp_filter) validates that packets arrive on expected interfaces based on routing tables, potentially dropping legitimately routed packets in asymmetric scenarios.
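When asymmetric paths are legitimate, switching reverse path filtering from strict (1) to loose (2) mode often restores connectivity without disabling the check entirely; the interface name below is an assumption.
# Relax reverse path filtering to loose mode (interface name is an assumption)
sudo sysctl -w net.ipv4.conf.all.rp_filter=2
sudo sysctl -w net.ipv4.conf.eth0.rp_filter=2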
Diagnosing IPv6 Connectivity Problems
# Check IPv6 configuration
ip -6 addr show
# Test IPv6 connectivity
ping6 -c 4 google.com
# Trace IPv6 route
traceroute6 google.com
# Test IPv6 DNS
dig AAAA google.com
# Display IPv6 routing table
ip -6 route show
# Check neighbor discovery table
ip -6 neigh show
# Test IPv6 to specific address
ping6 -c 4 2001:4860:4860::8888 # Google DNS
# Verify IPv6 forwarding
sysctl net.ipv6.conf.all.forwarding
# Check for IPv6 privacy extensions
ip -6 addr show | grep temporary
Many systems have IPv6 enabled but misconfigured, causing connection attempts to fail slowly before falling back to IPv4. The happy eyeballs algorithm (RFC 6555) attempts both IPv4 and IPv6 simultaneously, using whichever connects first, but misconfigurations can still cause delays.
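Forcing curl onto one address family at a time quickly shows whether a slow or failing connection is specific to IPv6; the URL below is an example.
# Compare the same request over IPv4 and IPv6
curl -4 -sS -o /dev/null -w "IPv4: %{time_total}s\n" https://example.com
curl -6 -sS -o /dev/null -w "IPv6: %{time_total}s\n" https://example.com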
Troubleshooting Wireless Network Issues
# Display wireless interface status
iw dev wlan0 info
# Scan for available networks
sudo iw dev wlan0 scan | grep -E "SSID|signal"
# Check wireless link quality
iw dev wlan0 link
# Display wireless statistics
cat /proc/net/wireless
# Monitor wireless events
iw event
# Check for interference
sudo iw dev wlan0 survey dump
# View wireless driver information
ethtool -i wlan0
# Check wireless interface errors
ip -s link show wlan0
Wireless connectivity problems stem from weak signal strength, interference, incorrect authentication credentials, or driver issues. Signal strength below -70 dBm indicates marginal connectivity, while values above -50 dBm represent strong signals.
Investigating Container Network Issues
# List Docker networks
docker network ls
# Inspect specific network
docker network inspect bridge
# Check container networking namespace
docker exec <container> ip addr show
# Test connectivity from container
docker exec <container> ping -c 4 google.com
# Verify DNS from container
docker exec <container> nslookup google.com
# Display container port mappings
docker port <container>
# Check iptables rules created by Docker
sudo iptables -t nat -L -n | grep -A 10 DOCKER
# Monitor container network traffic
docker stats --no-stream
Container networking adds another layer of complexity with virtual interfaces, bridges, and NAT. Docker creates iptables rules for port mappings and network isolation, which can conflict with existing firewall configurations. The Docker networking documentation explains these mechanisms comprehensively.
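Testing from a throwaway container separates application-image problems from Docker's own networking. The sketch below assumes the busybox image is available or can be pulled.
# Test connectivity and DNS from a clean, disposable container (assumes the busybox image)
docker run --rm busybox ping -c 4 8.8.8.8
docker run --rm busybox nslookup google.com
# Repeat on a user-defined network to compare against the default bridge
docker network create testnet
docker run --rm --network testnet busybox ping -c 4 8.8.8.8
docker network rm testnet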
Frequently Asked Questions about Network Connectivity Troubleshooting
Why does ping work but web browsing fails?
When ping succeeds but web browsing fails, the problem exists at the application or transport layer rather than basic network connectivity. This typically indicates DNS resolution failures, firewall blocking HTTP/HTTPS ports, proxy configuration problems, or web server issues. Test DNS separately with dig example.com, check firewall rules for ports 80 and 443, and verify no proxy settings interfere with connections. Use curl -Iv https://example.com to test HTTP connectivity independently of browser configuration.
How do I test if a firewall is blocking my connection?
First, verify basic connectivity works by pinging the remote host's IP address. Then test specific ports using nc -zv <host> <port> or telnet <host> <port>. Compare results from the affected system versus a known-working system to isolate whether local firewall, network firewall, or remote firewall causes blocking. Packet capture with sudo tcpdump -i any port <port> reveals whether outbound packets leave your system and whether responses arrive. If packets leave but no responses return, intermediate or remote firewalls likely block the connection.
What causes "Destination Host Unreachable" errors?
"Destination Host Unreachable" errors have different causes depending on which device generates the message. When your own system reports this immediately, it means no local route exists to the destination network - check your routing table with ip route show. Or your gateway router reports this after several seconds, it indicates the gateway cannot route to the destination network - verify your default gateway is correct. When intermediate routers report this, the problem exists somewhere along the network path - use traceroute to identify where routing breaks down.
How can I tell if network slowness is local or remote?
Use mtr or extended traceroute to identify where latency increases along the network path. Run ping tests to local gateway, ISP first hop, and final destination to isolate where delays occur. Test bandwidth with iperf3 if you control both endpoints, or use online speed tests to compare with expected performance. Monitor local interface statistics with ip -s link show looking for errors, drops, or overruns suggesting local issues. Compare latency from multiple source systems - if all experience problems, the issue likely exists remotely or in shared network infrastructure.
Why does DNS work sometimes but not always?
Intermittent DNS problems often stem from fallback nameserver issues when the primary DNS server fails. Check all configured nameservers in /etc/resolv.conf individually with dig @<nameserver_ip> example.com. DNS caching at various levels (system, resolver, browser) can mask problems temporarily - clear caches and retest. Network problems causing packet loss affect UDP-based DNS queries more than TCP connections. ISP DNS hijacking or intercepting DNS queries can cause unexpected behavior - test with alternative DNS servers like 8.8.8.8 to verify.
How do I troubleshoot VPN connectivity issues?
VPN troubleshooting requires systematic testing of underlying connectivity before VPN-specific issues. Verify basic internet connectivity without the VPN active. Determine whether authentication fails outright or authentication succeeds but tunnel traffic does not flow. Examine VPN client logs for specific error messages. Test whether a firewall blocks the VPN protocol ports (typically UDP 1194 for OpenVPN, UDP 500/4500 for IPsec). Verify routing table changes when the VPN connects with ip route show. Some networks block VPN protocols entirely - try different VPN protocols or ports.
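A rough checklist in command form (OpenVPN is assumed here; adjust ports and unit names for your VPN software):
# Underlying reachability of the VPN server
ping -c 3 <vpn_server>
# UDP port probe - note that UDP results from nc are only indicative
nc -zvu <vpn_server> 1194
# Routing table before and after the tunnel comes up
ip route show
# Client-side logs (unit name varies by distribution)
sudo journalctl -u 'openvpn*' --since '10 minutes ago'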
What causes high packet loss but normal latency?
This counter-intuitive situation often indicates queuing problems or packet filtering rather than network congestion. Congestion typically increases both latency and packet loss together. Check for selective packet filtering based on protocol, port, or packet characteristics. Examine QoS (Quality of Service) configurations that might prioritize ICMP ping packets but drop other traffic under load. Test with different packet sizes - if small packets work but large packets fail, suspect MTU issues. Monitor for buffer overruns on interfaces suggesting capacity problems.
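A simple MTU probe with Linux iputils ping (1472 bytes of payload plus 28 bytes of headers equals a 1500-byte frame):
# Prohibit fragmentation and send a full-size packet
ping -c 3 -M do -s 1472 <destination>
# If that fails, step the size down until it succeeds
ping -c 3 -M do -s 1400 <destination>
# Check the interface for drops and overruns
ip -s link show <interface>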
Why does traceroute show asterisks for some hops?
Asterisks in traceroute output indicate routers that don't respond to traceroute probes, which may be completely normal. Many routers deprioritize or ignore traceroute ICMP or UDP packets as security measures without affecting normal traffic. If traceroute shows asterisks but connectivity works fine, ignore them. If asterisks appear where routing actually breaks down, use alternative traceroute methods: traceroute -I for ICMP, traceroute -T for TCP, or mtr which often gets responses where standard traceroute fails.
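For example:
# ICMP echo probes instead of the default UDP
sudo traceroute -I <destination>
# TCP SYN probes to port 443, often permitted where UDP/ICMP are filtered
sudo traceroute -T -p 443 <destination>
# mtr frequently gets responses where plain traceroute does not
mtr -rwc 50 <destination>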
How do I diagnose why SSH connections timeout?
SSH timeout issues have multiple potential causes requiring systematic elimination. First verify the SSH service is running on the remote system and that port 22 (or the custom port) is open. Test port connectivity with nc -zv <host> 22 to distinguish network problems from SSH-specific issues. Check firewall rules on both client and server systems. If connections succeed from some networks but not others, suspect network-level firewalls. Use ssh -vvv for detailed debugging output revealing exactly where connection attempts fail.
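As a compact sequence (replace <host> and <user> accordingly):
# Is the SSH port reachable within five seconds?
nc -zv -w 5 <host> 22
# Verbose client output shows exactly where the handshake stalls
ssh -vvv -o ConnectTimeout=10 <user>@<host>
# On the server: is sshd listening, and on which addresses?
sudo ss -tlnp | grep sshd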
What tools help identify bandwidth-hogging processes?
Several tools identify which processes consume network bandwidth. Install and use nethogs to display per-process bandwidth usage in real time, much like top does for processes. Use iftop to show bandwidth usage between hosts without process details. The ss command with the -p option shows which processes own specific connections. Note that iotop measures disk I/O rather than network traffic, so it does not help here. For extended monitoring, consider tools like vnstat that track per-interface bandwidth usage over time.
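Typical invocations (nethogs, iftop, and vnstat usually need to be installed first, and vnstat only reports data its daemon has already collected):
# Per-process bandwidth on one interface
sudo nethogs <interface>
# Per-host bandwidth without process details
sudo iftop -i <interface>
# Which process owns a given connection
ss -tpn | grep <port>
# Historical per-interface totals
vnstat -i <interface>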
Network connectivity troubleshooting Common Error Messages
"Network is unreachable"
This error occurs when the kernel determines no route exists to the destination network. Verify the routing table with ip route show and confirm a default gateway exists. Check that the interface assigned the default route is UP: ip link show. Ensure the destination IP address doesn't require a specific route missing from the table. If connecting to a host on the local subnet, verify the correct subnet mask is configured - an incorrect mask can make local addresses appear to be on a different subnet, forcing traffic through the gateway.
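For example:
# Routing table and the route the kernel would choose for the destination
ip route show
ip route get <destination_ip>
# Interface state
ip link show <interface>
# Add a missing default route temporarily (not persistent across reboots)
sudo ip route add default via <gateway_ip> dev <interface>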
"Connection refused"
This message indicates successful network communication to the target system, but no service is listening on the specific port. Verify the service is actually running with systemctl status <service>. Check if the service binds to the expected IP address and port using ss -tulpn | grep <port>. Services binding to 127.0.0.1 only accept local connections, while 0.0.0.0 accepts connections from any interface. "Connection refused" itself proves traffic reached the system: either nothing listens on the port, or a firewall REJECT rule answered with a TCP reset.
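For example:
# Is the service running at all?
systemctl status <service>
# What address and port is it actually listening on?
sudo ss -tlnp | grep <port>
# 127.0.0.1:<port> accepts only local clients; 0.0.0.0:<port> accepts all interfaces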
"No route to host"
Similar to "network unreachable" but indicates that while a route exists, something prevents reaching the specific host. Common causes include: the remote host is down, local firewall blocking traffic, remote host firewall rejecting packets, or ARP resolution failure. Check if you can ping other hosts on the same subnet to distinguish subnet-wide problems from specific host issues. Use ip neigh show to verify ARP entries exist for the target host.
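For example (the nft command applies only to nftables-based systems):
# Can other hosts on the same subnet be reached?
ping -c 3 <other_host_on_subnet>
# Is there an ARP entry for the target?
ip neigh show | grep <target_ip>
# Local firewall rules that might be rejecting traffic
sudo iptables -L -nv
sudo nft list ruleset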
"Connection timed out"
Timeouts indicate packets never receive responses, unlike "connection refused" which represents active rejection. This suggests packets get lost in transit, firewalls silently drop packets, or the remote service hangs without responding. Distinguish between connection setup timeouts (SYN packets never acknowledged) versus established connection timeouts (application-layer responsiveness issues). Use tcpdump to verify whether packets leave your system and whether any responses arrive.
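A quick way to make that distinction:
# Watch whether SYNs leave and whether any SYN-ACK or RST returns
sudo tcpdump -ni any host <target_ip> and port <port>
# Bound the connection attempt so the test fails fast instead of hanging
nc -zv -w 5 <target_ip> <port>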
"Name or service not known"
This DNS-specific error means hostname resolution failed entirely. Test DNS explicitly with dig <hostname> or nslookup <hostname> to confirm. Verify /etc/resolv.conf contains valid nameserver entries. Check if DNS servers are reachable with ping <nameserver_ip>. Try with known-good DNS servers like 8.8.8.8 to distinguish local DNS problems from authoritative DNS issues. Verify /etc/nsswitch.conf specifies "files dns" for hosts lookup enabling both /etc/hosts and DNS resolution.
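For example:
# Does the name resolve at all?
dig <hostname>
# Are the configured nameservers sane and reachable?
cat /etc/resolv.conf
ping -c 2 <nameserver_ip>
# Compare with a public resolver
dig @8.8.8.8 <hostname>
# The hosts line should include both files and dns
grep ^hosts /etc/nsswitch.conf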
"Cannot assign requested address"
This error indicates attempting to bind a socket to an IP address not assigned to any local interface. Check your local IP configuration with ip addr show. Verify you're not trying to bind to a specific IP address that doesn't exist on your system. For servers binding to specific addresses, ensure interfaces are UP and addresses fully configured before starting services. Some applications attempt binding before DHCP completes, causing temporary failures.
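For example (the sysctl shown is an optional workaround mainly used in failover setups):
# Which addresses are actually assigned locally?
ip addr show
ip addr show | grep <bind_ip>
# Setting this to 1 permits binding to addresses not currently present locally
sysctl net.ipv4.ip_nonlocal_bind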
"Permission denied"
Permission errors on network operations typically involve insufficient privileges for the requested action. Binding to ports below 1024 requires root privileges or the CAP_NET_BIND_SERVICE capability on Linux systems. Creating raw sockets for ping or traceroute requires the CAP_NET_RAW capability. Check file permissions on Unix domain sockets in /var/run or /tmp. For SSH, verify home directory and .ssh directory permissions don't allow group/other write access, which SSH strictly enforces.
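A couple of illustrative fixes (the binary path is a placeholder):
# Let an unprivileged binary bind ports below 1024 instead of running it as root
sudo setcap 'cap_net_bind_service=+ep' /path/to/binary
getcap /path/to/binary
# SSH refuses keys if these permissions are too open
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys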
"Too many open files"
This limit error affects both file descriptors and network sockets since Linux treats sockets as file descriptors. Check current limits with ulimit -n and system-wide limits in /proc/sys/fs/file-max. Increase per-process limits in /etc/security/limits.conf or systemd service files using LimitNOFILE=. For high-connection services like web servers, dramatically increase limits beyond defaults. Monitor actual usage with lsof | wc -l or per-process with ls /proc/<pid>/fd | wc -l.
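For example:
# Current per-process and system-wide limits
ulimit -n
cat /proc/sys/fs/file-max
# Open descriptors for one process
ls /proc/<pid>/fd | wc -l
# For a systemd service, raise the limit in a drop-in:
sudo systemctl edit <service>
#   [Service]
#   LimitNOFILE=65536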
Additional Resources
Official Documentation and Standards
Linux Kernel Networking Documentation - Comprehensive documentation of Linux networking subsystem, including driver development, protocol implementation, and advanced configurations from kernel.org.
RFC 1122 - Requirements for Internet Hosts - Fundamental TCP/IP specification defining requirements for Internet hosts, essential reading for understanding network protocol behavior.
RFC 793 - Transmission Control Protocol - Original TCP specification explaining connection establishment, data transmission, and connection termination mechanics.
Red Hat Networking Guide - Enterprise-focused networking configuration and troubleshooting documentation covering NetworkManager, firewalld, and advanced configurations.
Network Diagnostic Tools Documentation
Wireshark User Guide - Complete documentation for the most powerful network protocol analyzer, including capture filters, display filters, and protocol-specific analysis.
tcpdump and libpcap - Official documentation for tcpdump packet capture utility and libpcap library underlying most packet capture tools.
iperf3 Documentation - Network bandwidth testing tool documentation explaining test methodologies, result interpretation, and advanced options.
nmap Network Scanning - Comprehensive guide to network discovery and security auditing, but remember to only scan systems you own or have explicit permission to test.
Linux Networking Administration Resources
Arch Wiki - Network Configuration - Exceptionally detailed networking configuration guide applicable beyond Arch Linux, covering modern tools and practices.
Ubuntu Server Guide - Networking - Ubuntu-specific networking documentation including Netplan configuration and systemd-networkd usage.
Debian Network Configuration - Debian networking documentation explaining traditional /etc/network/interfaces configuration still used in many environments.
Related LinuxTips.pro Articles
#21: Linux Network Configuration - Static vs DHCP - Learn fundamental network configuration methods before advanced troubleshooting.
#22: SSH Server Setup and Security Hardening - Secure remote access essential for managing Linux systems, with troubleshooting guidance.
#23: Firewall Configuration with iptables and firewalld - Master firewall configuration to prevent self-inflicted connectivity problems.
#24: DNS Configuration and Troubleshooting - Deep dive into DNS setup and diagnosis complementing this article's DNS troubleshooting section.
#25: Network Troubleshooting Tools - ping, traceroute, netstat - Detailed examination of essential network diagnostic commands.
#91: Linux Boot Process Troubleshooting - Network issues sometimes stem from boot-time configuration problems.
#92: File System Corruption Recovery - System problems including network issues can result from underlying file system damage.
#94: Performance Issue Diagnosis - Network performance troubleshooting extends into comprehensive system performance analysis.
Conclusion
Mastering network connectivity troubleshooting requires understanding both the systematic methodology and the specific diagnostic tools at each layer. By following the layer-by-layer approach starting from physical connectivity through application protocols, you can efficiently isolate and resolve network problems that might otherwise consume hours of unproductive investigation.
Remember that effective troubleshooting combines technical knowledge with logical problem-solving. Document your findings as you progress through the troubleshooting steps; patterns often emerge that reveal root causes which were not immediately obvious. The commands and techniques presented here form your foundation, but real expertise develops through experience applying these tools to diverse problems in production environments.
Continue building your Linux networking expertise by practicing these network connectivity troubleshooting techniques in lab environments before production incidents occur. The muscle memory developed through practice enables faster diagnosis when critical systems experience connectivity issues.