Hi,
First I'd like to apologize, I'm not a native english speaker. Please tell me if my post is hard to understand.
I own an ERLite-3, running the 1.9.0 firmare. I'm running a dual WAN configuration, one on a broadband connection and another on a DSL one. I've defined two load-balance groups, one defaulting on the broadband for devices needing mostly download or upload speed, an one defaulting on the DSL for devices needing a lower average latency.
I was running the default ping test for the watchdog until I ran into issues with the broadband provider. Ping would still work, but TCP connections were just slow as hell and randomly crashing.
I tried to experiment with a custom script, fetching the google.com homepage with curl.
Now is the weird issue. I'm reading conflicting situations with the "show load-balance status", which shows that both of my Internet connections are up, and the "show load-balance watchdog" which says one of my connection is failing the watchdog test.
show load-balance status
Group LB-LAN
interface : eth1
carrier : up
status : active
gateway :
route table : 201
weight : 100%
flows
WAN Out : 31557
WAN In : 39
Local Out : 33456
interface : eth2
carrier : up
status : failover
gateway :
route table : 202
weight : 0%
flows
WAN Out : 0
WAN In : 35
Local Out : 0
Group LB-LowPing
interface : eth2
carrier : up
status : failover
gateway :
route table : 204
weight : 0%
flows
WAN Out : 0
WAN In : 0
Local Out : 0
interface : eth1
carrier : up
status : active
gateway :
route table : 203
weight : 100%
flows
WAN Out : 0
WAN In : 0
Local Out : 0
show load-balance watchdog
Group LB-LAN
eth1
status: Running
pings: 7466
fails: 8
run fails: 0/3
route drops: 0
test script : /config/scripts/connectivity.sh - OK
eth2
status: Waiting on recovery (0/5)
failover-only mode
pings: 3
fails: 3
run fails: 3/3
route drops: 1
test script : /config/scripts/connectivity.sh - FAIL
last route drop : Thu Aug 25 10:12:01 2016
Group LB-LowPing
eth1
status: Running
failover-only mode
pings: 5983
fails: 12
run fails: 0/3
route drops: 1
test script : /config/scripts/connectivity.sh - OK
last route drop : Thu Aug 25 16:27:07 2016
last route recover: Thu Aug 25 16:28:10 2016
eth2
status: Waiting on recovery (0/5)
pings: 3
fails: 3
run fails: 3/3
route drops: 1
test script : /config/scripts/connectivity.sh - FAIL
last route drop : Thu Aug 25 10:12:03 2016
Here's my load-balance configuration :
load-balance {
group LB-LAN {
interface eth1 {
route-test {
count {
failure 3
success 5
}
initial-delay 60
interval 15
type {
script /config/scripts/connectivity.sh
}
}
}
interface eth2 {
failover-only
route-test {
count {
failure 3
success 5
}
initial-delay 60
interval 15
type {
script /config/scripts/connectivity.sh
}
}
}
lb-local enable
}
group LB-LowPing {
interface eth1 {
failover-only
route-test {
count {
failure 3
success 5
}
initial-delay 60
interval 15
type {
script /config/scripts/connectivity.sh
}
}
}
interface eth2 {
route-test {
count {
failure 3
success 5
}
initial-delay 60
interval 15
type {
script /config/scripts/connectivity.sh
}
}
}
lb-local enable
}
}
Here's the connectivity.sh script source :
#!/bin/bash
GROUP=$1
INTF=$2
STATUS=$3
CHECKDOMAIN="http://google.com"
case "$(curl -s --max-time 2 --interface ${INTF} -I ${CHECKDOMAIN} | sed 's/^[^ ]* *\([0-9]\).*/\1/; 1q')" in
[23]) echo "HTTP connectivity is up"; exit 0;;
5) echo "The web proxy won't let us through";exit 1;;
*)echo "Something is wrong with HTTP connections. Go check it."; exit 1;;
esac
exit 0