I've been debugging weird, transient networking dropouts. In the same timeframe that this issue started (the last 2 months), I replaced all of my office hardware with Ubiquiti gear (a UAP-AC-PRO and an ERLite-3) and upgraded my Comcast connection. I was suspicious of Comcast until today.
The way things are wired is easy to describe:
Comcast Router <-> ERLite-3 <-> Switch <-> LAN hosts/Wifi APs
My new suspect is the ERLite-3, but I have no good theory about what is wrong or how to fix it. Two days ago, I started running mtr (a traceroute+ping combo tool) against two external targets (4.2.2.2 and comcast.net), both from my LAN and from a box directly connected to the Comcast router, "outside" of the ERLite-3. Today the glitch finally happened again, and it *only* affected the machine on the LAN. That would seem to indicate the ERLite-3 is somehow to blame... I think? Any help debugging this is appreciated!
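A minimal sketch of how this kind of collection could be scripted, so that both machines produce timestamped reports that can be lined up later. The log file name, the 60-second interval and the helper names are assumptions made for illustration; only mtr's report-mode options (`--report`, `--report-wide`, `--report-cycles`) are relied on.

```python
import datetime
import subprocess
import time

TARGETS = ["4.2.2.2", "comcast.net"]  # the two external targets being probed


def mtr_command(target, cycles=10):
    """Build an mtr invocation that prints a one-shot report and exits."""
    return ["mtr", "--report", "--report-wide",
            "--report-cycles", str(cycles), target]


def run_once(log_path="mtr.log"):
    """Append one timestamped report per target to log_path (hypothetical name)."""
    with open(log_path, "a") as log:
        for target in TARGETS:
            stamp = datetime.datetime.now().isoformat(timespec="seconds")
            result = subprocess.run(mtr_command(target),
                                    capture_output=True, text=True)
            log.write(f"=== {stamp} {target} ===\n{result.stdout}\n")


def collect(interval_seconds=60):
    """Run forever, writing a fresh set of reports every interval (assumed value)."""
    while True:
        run_once()
        time.sleep(interval_seconds)
```

Calling `collect()` on both the LAN machine and the box outside the ERLite-3 would give two logs whose timestamps can be compared around a bad period.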
The "bad" period lasted for about 2-3 minutes from 12:02 to 12:05. During each period of {before, during, after} the log messages are effectively the same.
Example mtr output of "LAN + good" (this is what it looks like before and after the bad period):
Start: Thu Jun 9 12:00:54 2016
HOST: MacBook-Air Loss% Snt Last Avg Best Wrst StDev
1.|-- 192.168.1.1 40.0% 10 6.2 4.1 1.4 7.9 2.4
2.|-- REDACTED-static.hfc.comcastbusiness.net 10.0% 10 257.1 36.1 3.1 257.1 83.0
3.|-- 96.120.62.229 0.0% 10 125.0 58.1 15.3 190.7 62.1
4.|-- te-0-5-0-7-sur02.pittsburgh.pa.pitt.comcast.net 0.0% 10 45.2 54.1 12.8 124.0 34.5
5.|-- te-0-0-0-0-ar01.mckeesport.pa.pitt.comcast.net 0.0% 10 35.1 32.5 17.0 53.6 10.4
6.|-- 4.68.71.133 30.0% 10 25.6 26.0 18.1 35.5 5.1
7.|-- ae-1-3501.ear1.washington12.level3.net 90.0% 10 1678. 1678. 1678. 1678. 0.0
8.|-- b.resolvers.level3.net 0.0% 10 24.4 27.8 15.8 39.5 7.0
Example mtr output of "LAN + bad". Note the 100% packet loss on the final hop. The puzzle for me is that the packet loss appears to be OUTSIDE of my network. That makes me suspicious of the firewall on the ERLite, but I have no particular theory:
Start: Thu Jun 9 12:04:19 2016
HOST: MacBook-Air Loss% Snt Last Avg Best Wrst StDev
1.|-- 192.168.1.1 40.0% 10 1.2 4.2 1.2 10.5 3.6
2.|-- REDACTED-static.hfc.comcastbusiness.net 40.0% 10 62.1 18.5 3.2 62.1 22.6
3.|-- 96.120.62.229 0.0% 10 16.3 38.4 11.7 161.5 45.1
4.|-- te-0-5-0-7-sur02.pittsburgh.pa.pitt.comcast.net 0.0% 10 24.3 34.9 19.5 72.7 16.1
5.|-- te-0-0-0-0-ar01.mckeesport.pa.pitt.comcast.net 0.0% 10 21.8 39.8 14.1 94.2 28.6
6.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
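One way to read these reports is to look mainly at the last line. Routers commonly rate-limit the ICMP replies that traceroute-style tools depend on, so loss shown at a transit hop that does not carry through to the destination is usually not real forwarding loss. The sketch below parses the two LAN reports above and separates "did the destination answer" from "which transit hops looked lossy". The function names are made up for illustration, and treating a final `???` line as "destination not reached" is an assumption about how mtr renders an unreachable target.

```python
LAN_GOOD = """\
1.|-- 192.168.1.1 40.0% 10 6.2 4.1 1.4 7.9 2.4
2.|-- REDACTED-static.hfc.comcastbusiness.net 10.0% 10 257.1 36.1 3.1 257.1 83.0
3.|-- 96.120.62.229 0.0% 10 125.0 58.1 15.3 190.7 62.1
4.|-- te-0-5-0-7-sur02.pittsburgh.pa.pitt.comcast.net 0.0% 10 45.2 54.1 12.8 124.0 34.5
5.|-- te-0-0-0-0-ar01.mckeesport.pa.pitt.comcast.net 0.0% 10 35.1 32.5 17.0 53.6 10.4
6.|-- 4.68.71.133 30.0% 10 25.6 26.0 18.1 35.5 5.1
7.|-- ae-1-3501.ear1.washington12.level3.net 90.0% 10 1678. 1678. 1678. 1678. 0.0
8.|-- b.resolvers.level3.net 0.0% 10 24.4 27.8 15.8 39.5 7.0
"""

LAN_BAD = """\
1.|-- 192.168.1.1 40.0% 10 1.2 4.2 1.2 10.5 3.6
2.|-- REDACTED-static.hfc.comcastbusiness.net 40.0% 10 62.1 18.5 3.2 62.1 22.6
3.|-- 96.120.62.229 0.0% 10 16.3 38.4 11.7 161.5 45.1
4.|-- te-0-5-0-7-sur02.pittsburgh.pa.pitt.comcast.net 0.0% 10 24.3 34.9 19.5 72.7 16.1
5.|-- te-0-0-0-0-ar01.mckeesport.pa.pitt.comcast.net 0.0% 10 21.8 39.8 14.1 94.2 28.6
6.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
"""


def parse_hop(line):
    """Parse one hop line of an mtr report into (hop_number, host, loss_percent)."""
    fields = line.split()
    hop = int(fields[0].split(".")[0])   # "7.|--" -> 7
    host = fields[1]
    loss = float(fields[2].rstrip("%"))  # "90.0%" or "100.0" -> float
    return hop, host, loss


def summarize(report):
    """Separate end-to-end reachability from loss seen only at transit hops."""
    hops = [parse_hop(line) for line in report.strip().splitlines()]
    _, last_host, last_loss = hops[-1]
    reached = last_host != "???"  # assumption: trailing "???" means no answer
    return {
        "reached": reached,
        "end_to_end_loss": last_loss if reached else 100.0,
        "lossy_transit_hops": [hop for hop, _, loss in hops[:-1] if loss > 0],
    }


print(summarize(LAN_GOOD))  # reached, 0% end-to-end loss, noisy hops 1, 2, 6, 7
print(summarize(LAN_BAD))   # not reached, noisy hops 1, 2
```

Read this way, the "good" report has plenty of transit-hop noise but no end-to-end loss, while the "bad" report differs in the one way that matters: the destination stops answering.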
Example mtr output of "outside LAN but inside Comcast router" that correlates with the "LAN + good" run above:
Start: Thu Jun 9 12:00:46 2016
HOST: Middle-Earth Loss% Snt Last Avg Best Wrst StDev
1.|-- 10.1.10.1 40.0% 10 65.1 20.8 3.2 65.1 24.8
2.|-- 96.120.62.229 0.0% 10 65.9 35.7 11.4 117.1 32.5
3.|-- te-0-5-0-7-sur02.pittsburgh.pa.pitt.comcast.net 0.0% 10 16.1 26.6 12.5 69.2 20.0
4.|-- te-0-0-0-0-ar01.mckeesport.pa.pitt.comcast.net 0.0% 10 15.3 19.5 14.4 27.5 3.7
5.|-- 4.68.71.133 20.0% 10 17.8 17.0 12.1 20.8 2.7
6.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
7.|-- b.resolvers.Level3.net 0.0% 10 17.6 22.1 17.6 26.5 2.4
Example mtr output of "outside LAN but inside Comcast router" that correlates with the "LAN + bad" run above:
Start: Thu Jun 9 12:04:18 2016
HOST: Middle-Earth Loss% Snt Last Avg Best Wrst StDev
1.|-- 10.1.10.1 0.0% 10 24.9 17.5 2.5 73.3 21.9
2.|-- 96.120.62.229 0.0% 10 59.6 47.2 12.5 119.9 37.8
3.|-- te-0-5-0-7-sur02.pittsburgh.pa.pitt.comcast.net 0.0% 10 16.0 23.6 12.8 84.6 21.8
4.|-- te-0-0-0-0-ar01.mckeesport.pa.pitt.comcast.net 0.0% 10 11.3 16.7 8.6 32.7 6.3
5.|-- 4.68.71.133 40.0% 10 16.2 34.5 13.1 130.0 46.8
6.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
7.|-- b.resolvers.Level3.net 0.0% 10 18.9 22.2 18.9 27.2 2.2
Here is "LAN good/bad" for comcast.net:
Start: Thu Jun 9 12:01:14 2016
HOST: MacBook-Air Loss% Snt Last Avg Best Wrst StDev
1.|-- 192.168.1.1 0.0% 10 1.4 2.3 1.1 4.3 1.1
2.|-- REDACTED-static.hfc.comcastbusiness.net 0.0% 10 8.2 7.1 4.6 10.3 1.6
3.|-- 96.120.62.229 0.0% 10 24.9 18.6 13.8 24.9 3.8
4.|-- te-0-5-0-6-sur02.pittsburgh.pa.pitt.comcast.net 0.0% 10 24.3 27.5 15.7 50.4 10.2
5.|-- te-0-0-0-0-ar01.mckeesport.pa.pitt.comcast.net 0.0% 10 148.7 39.1 15.2 148.7 39.4
6.|-- be-7016-cr02.ashburn.va.ibone.comcast.net 0.0% 10 71.5 37.5 21.3 71.5 18.1
7.|-- be-7922-ar03.newcastle.de.panjde.comcast.net 0.0% 10 29.8 51.2 24.6 179.6 46.3
8.|-- ae100-ur11-d.newcastlerdc.de.panjde.comcast.net 0.0% 10 27.8 71.7 26.5 228.3 70.0
9.|-- urlrw01.cable.comcast.com 0.0% 10 24.8 56.7 23.8 149.1 45.1
Start: Thu Jun 9 12:04:41 2016
HOST: MacBook-Air Loss% Snt Last Avg Best Wrst StDev
1.|-- 192.168.1.1 10.0% 10 9.2 8.6 1.5 30.2 9.3
2.|-- REDACTED-static.hfc.comcastbusiness.net 30.0% 10 119.0 23.7 5.0 119.0 42.1
3.|-- 96.120.62.229 0.0% 10 15.6 33.3 15.6 76.7 22.4
4.|-- te-0-5-0-6-sur02.pittsburgh.pa.pitt.comcast.net 0.0% 10 37.2 37.5 18.6 105.4 26.2
5.|-- te-0-0-0-0-ar01.mckeesport.pa.pitt.comcast.net 0.0% 10 25.2 26.5 15.4 61.9 13.5
6.|-- be-7016-cr02.ashburn.va.ibone.comcast.net 0.0% 10 24.0 30.8 24.0 47.0 6.4
7.|-- be-7922-ar03.newcastle.de.panjde.comcast.net 0.0% 10 37.1 32.7 23.8 54.9 9.4
8.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
Here is "outside LAN but inside Comcast router" for comcast.net, correlating with the above (unlike with 4.2.2.2, the path is *slightly* different):
Start: Thu Jun 9 12:01:11 2016
HOST: Middle-Earth Loss% Snt Last Avg Best Wrst StDev
1.|-- 10.1.10.1 20.0% 10 3.0 7.8 1.3 31.7 11.1
2.|-- 96.120.62.229 0.0% 10 17.6 18.7 10.4 35.2 7.1
3.|-- te-0-4-0-6-sur01.pittsburgh.pa.pitt.comcast.net 0.0% 10 14.3 18.9 11.7 30.8 6.1
4.|-- te-0-0-0-2-ar01.pittsburgh.pa.pitt.comcast.net 0.0% 10 23.1 17.4 11.8 23.1 3.9
5.|-- hu-0-16-0-1-ar01.mckeesport.pa.pitt.comcast.net 0.0% 10 18.1 22.4 13.9 74.1 18.4
6.|-- be-7016-cr02.ashburn.va.ibone.comcast.net 0.0% 10 17.0 67.4 17.0 197.6 72.4
7.|-- be-7922-ar03.newcastle.de.panjde.comcast.net 0.0% 10 22.8 60.6 22.8 119.8 39.5
8.|-- ae100-ur11-d.newcastlerdc.de.panjde.comcast.net 0.0% 10 22.9 29.0 22.9 43.7 6.9
9.|-- urlrw01.cable.comcast.com 0.0% 10 26.4 27.6 22.5 36.0 4.1
Start: Thu Jun 9 12:04:32 2016
HOST: Middle-Earth Loss% Snt Last Avg Best Wrst StDev
1.|-- 10.1.10.1 50.0% 10 2.7 2.3 1.5 3.1 0.0
2.|-- 96.120.62.229 0.0% 10 15.0 16.0 12.0 22.2 3.3
3.|-- te-0-4-0-6-sur01.pittsburgh.pa.pitt.comcast.net 0.0% 10 12.5 54.0 12.5 256.1 81.3
4.|-- te-0-0-0-2-ar01.pittsburgh.pa.pitt.comcast.net 0.0% 10 15.1 68.9 12.7 186.2 70.9
5.|-- hu-0-16-0-1-ar01.mckeesport.pa.pitt.comcast.net 0.0% 10 22.2 39.9 13.3 152.9 43.1
6.|-- be-7016-cr02.ashburn.va.ibone.comcast.net 0.0% 10 26.9 28.6 19.3 55.9 11.2
7.|-- be-7922-ar03.newcastle.de.panjde.comcast.net 0.0% 10 45.1 31.8 24.0 45.1 6.2
8.|-- ae100-ur11-d.newcastlerdc.de.panjde.comcast.net 0.0% 10 26.9 29.2 24.6 36.9 4.2
9.|-- urlrw01.cable.comcast.com 0.0% 10 42.5 30.8 25.0 42.5 6.3
My config (with some slight edits for security, mostly changing things to REDACTED):
firewall {
    all-ping enable
    broadcast-ping disable
    ipv6-receive-redirects disable
    ipv6-src-route disable
    ip-src-route disable
    log-martians enable
    name WAN_IN {
        default-action drop
        description "Packets from internet to LAN & WLAN"
        enable-default-log
        rule 1 {
            action accept
            description "allow established sessions"
            log disable
            protocol all
            state {
                established enable
                invalid disable
                new disable
                related enable
            }
        }
        rule 2 {
            action drop
            description "drop invalid state"
            log disable
            protocol all
            state {
                established disable
                invalid enable
                new disable
                related disable
            }
        }
    }
    name WAN_LOCAL {
        default-action drop
        description "Packets from the internet to the router itself (like local management)"
        enable-default-log
        rule 1 {
            action accept
            description "allow established sessions"
            log disable
            protocol all
            state {
                established enable
                invalid disable
                new disable
                related enable
            }
        }
        rule 2 {
            action drop
            description "drop invalid state"
            log disable
            protocol all
            state {
                established disable
                invalid enable
                new disable
                related disable
            }
        }
    }
    receive-redirects disable
    send-redirects enable
    source-validation disable
    syn-cookies enable
}
interfaces {
    ethernet eth0 {
        address REDACTED/30
        description "WLAN (Comcast)"
        duplex auto
        firewall {
            in {
                name WAN_IN
            }
            local {
                name WAN_LOCAL
            }
        }
        speed auto
    }
    ethernet eth1 {
        address 192.168.1.1/24
        description "LAN"
        duplex auto
        speed auto
    }
    ethernet eth2 {
        address 192.168.2.1/24
        description "LAN (Guest)"
        duplex auto
        speed auto
    }
    loopback lo {
    }
}
protocols {
    static {
    }
}
service {
    dhcp-server {
        disabled false
        hostfile-update enable
        shared-network-name LAN-1-DHCP {
            authoritative disable
            subnet 192.168.1.0/24 {
                default-router 192.168.1.1
                dns-server 192.168.1.1
                lease 86400
                start 192.168.1.10 {
                    stop 192.168.1.254
                }
                unifi-controller 192.168.1.15
            }
        }
        shared-network-name LAN-2-DHCP {
            authoritative disable
            subnet 192.168.2.0/24 {
                default-router 192.168.2.1
                dns-server 192.168.2.1
                lease 86400
                start 192.168.2.10 {
                    stop 192.168.2.254
                }
            }
        }
    }
    dns {
        forwarding {
            cache-size 150
            listen-on eth1
            listen-on eth2
        }
    }
    gui {
        https-port 443
    }
    nat {
        rule 5000 {
            description "Masquerade for WAN (Comcast)"
            log disable
            outbound-interface eth0
            type masquerade
        }
    }
    ssh {
        port 22
        protocol-version v2
    }
}
system {
    conntrack {
        expect-table-size 4096
        hash-size 4096
        table-size 32768
        tcp {
            half-open-connections 512
            loose enable
            max-retrans 3
        }
    }
    gateway-address REDACTED
    host-name edgerouter
    login {
        REDACTED
    }
    name-server 75.75.75.75
    name-server 75.75.76.76
    ntp {
        server 0.ubnt.pool.ntp.org {
        }
        server 1.ubnt.pool.ntp.org {
        }
        server 2.ubnt.pool.ntp.org {
        }
        server 3.ubnt.pool.ntp.org {
        }
    }
    offload {
        ipsec enable
        ipv4 {
            forwarding enable
        }
        ipv6 {
            forwarding disable
        }
    }
    syslog {
        global {
            facility all {
                level notice
            }
            facility protocols {
                level debug
            }
        }
    }
    time-zone America/New_York
    traffic-analysis {
        dpi enable
        export enable
    }
}
traffic-control {
    smart-queue CS_QoS {
        download {
            ecn enable
            flows 1024
            fq-quantum 1514
            limit 10240
            rate 75mbit
        }
        upload {
            ecn enable
            flows 1024
            fq-quantum 1514
            limit 10240
            rate 15mbit
        }
        wan-interface eth0
    }
}
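Since the config sets `conntrack table-size 32768`, one cheap thing to rule out is the connection-tracking table filling up during a bad period. This is a minimal sketch, not a diagnosis: it assumes the router exposes the standard Linux netfilter counters under `/proc/sys/net/netfilter` and that a Python 3 interpreter is available there, neither of which is confirmed for this ERLite-3 firmware. The same two files can simply be read by hand over SSH instead.

```python
from pathlib import Path

# Standard Linux location for the netfilter conntrack counters (assumed present).
PROC_DIR = Path("/proc/sys/net/netfilter")


def conntrack_usage(proc_dir=PROC_DIR):
    """Return (count, maximum, fraction_used) for the conntrack table."""
    count = int((Path(proc_dir) / "nf_conntrack_count").read_text())
    maximum = int((Path(proc_dir) / "nf_conntrack_max").read_text())
    return count, maximum, count / maximum
```

If the fraction stays far below 1.0 while a dropout is in progress, a full conntrack table is unlikely to be the cause and can be crossed off the list.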