I have found a way to consistently crash an ERX, or rather several of them at the same time.
By crash I mean that the ERX does not respond on any port to ping or other probes. The link lights continue to flash, but nothing short of power cycling the devices will bring them back.
The crash happens when hwnat offloading is enabled and does not happen when it is disabled.
Only tested with V1.9. (Note, all ERXs have the updated bootloader.)
The rest of this post provides context, details and instructions to recreate the crash.
However, since I don't want to redact my configs, I am not posting them. If someone from Ubnt wants them, please let me know where to send them.
- - - -
Before making some major upgrades to my tower config I wanted to recreate it on the bench so I could test the planned changes.
The following image shows the test setup. While technically a logical network map, it differs from the physical network map only in that there is a single physical switch, using untagged VLANs to keep things separate. OSPF is used to manage the routes between all of the devices.
The large, dark gray box represents the equipment at the tower. The two iperf3 boxes are linux servers.
For the ERXs, the IP address inside the blue box is assigned to switch0. Other IPs are assigned to the indicated ports.
iperf3 servers are started on each end with:
iperf3 -s -i 1
iperf3 clients are started with:
On 10.0.0.10: iperf3 -c 10.1.0.10 -i 1 -P 4
On 10.1.0.10: iperf3 -c 10.0.0.10 -i 1 -P 4
With hwnat offloading disabled, iperf3 clients can be started at each end and all is fine. Total traffic is around 650 Mbps with minor retransmissions.
When hwnat offloading is enabled:
An iperf3 client running on only one end (it doesn't matter which end) can run unthrottled without crashing the ERXs. Total bandwidth is over 800 Mbps.
However, when iperf3 clients are started at each end and are not throttled, the ERXs crash after a short period of time, typically well under 5 minutes.
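To put a number on "a short period of time", the moment of the crash could be caught with a simple ping watchdog. The following is only a sketch, not what I actually ran: the function names `probe` and `wait_for_failure` are made up for illustration, and the `ping -c 1 -W 1` flags assume the Linux (iputils) ping found on the iperf3 boxes.

```shell
#!/bin/sh
# probe HOST: succeed if HOST answers a single ping within 1 second.
# (Assumes Linux iputils ping; other pings use different timeout flags.)
probe() { ping -c 1 -W 1 "$1" >/dev/null 2>&1; }

# wait_for_failure HOST LIMIT: probe HOST about once per second.
# Prints how many probes succeeded before the first failure and returns 0,
# or returns 1 if HOST is still answering after LIMIT probes.
wait_for_failure() {
    host=$1
    limit=$2
    n=0
    while [ "$n" -lt "$limit" ]; do
        if ! probe "$host"; then
            echo "$host stopped responding after $n successful probes"
            return 0
        fi
        n=$((n + 1))
        sleep 1
    done
    echo "$host still responding after $limit probes"
    return 1
}
```

For example, running `wait_for_failure 10.1.0.10 600` on 10.0.0.10 right after starting both iperf3 clients would watch the path through the ERXs for roughly ten minutes. A single ping lost on a saturated link would also trip it, so a failure should be confirmed by hand before calling it a crash.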
To test if the crashing was due to the amount of traffic I tried a number of tests, starting with fairly low bandwidth and increasing until a crash occurred.
iperf3 -c 10.0.0.10 -i 1 -P 4 -b 50M
iperf3 -c 10.1.0.10 -i 1 -P 4 -b 50M
I started at 50 Mbps from each end and increased the bandwidth in 50 Mbps steps until things crashed.
Note, the “-P 4” means 4 parallel streams, and “-b” applies to each stream. So 4 x 50 Mbps => 200 Mbps from one client.
So with two clients running with “-b 50M” a total of 400 Mbps is generated.
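The same arithmetic for every step of the test can be tabulated with a few lines of shell. This is just a sketch: `offered_load` is a helper name invented here, and the figures are the rates requested from iperf3, not measured throughput.

```shell
#!/bin/sh
# offered_load PER_STREAM_MBPS STREAMS CLIENTS: print the total requested rate in Mbps.
offered_load() { echo $(( $1 * $2 * $3 )); }

# Tabulate the requested load for each "-b" value, with "-P 4" streams per client.
for b in 50 100 150 200; do
    echo "-b ${b}M: $(offered_load "$b" 4 1) Mbps per client, $(offered_load "$b" 4 2) Mbps total"
done
```

This prints 200/400, 400/800, 600/1200 and 800/1600 Mbps (per client / total) for the four steps.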
Things are fine with both clients using “-b 150M”.
But the ERXs crash when “-b 200M” is used on both clients, i.e. when each client is asked for 4 x 200 Mbps = 800 Mbps. Anything higher, or running the two iperf3 clients unthrottled, also causes the ERXs to crash.
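For anyone repeating the ramp, the client command for each step can be generated rather than typed. Again only a sketch: `ramp_commands` is a hypothetical helper, and it prints the commands instead of running them, so each step can be started by hand on both ends and left running for as long as needed.

```shell
#!/bin/sh
# ramp_commands SERVER START STEP MAX: print one iperf3 client command per
# bandwidth step, from START to MAX Mbps per stream in increments of STEP.
ramp_commands() {
    server=$1
    b=$2
    step=$3
    max=$4
    while [ "$b" -le "$max" ]; do
        echo "iperf3 -c $server -i 1 -P 4 -b ${b}M"
        b=$((b + step))
    done
}

# Commands to run on 10.0.0.10; swap the address for the other end.
ramp_commands 10.1.0.10 50 50 200
```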
A single unthrottled iperf3 client can generate over 800 Mbps and not crash the ERXs.
It is only when a) hwnat offloading is enabled, and b) there are two iperf3 clients running with sufficient traffic that the ERXs crash.
Let me know if someone at Ubnt wants the configs to recreate this setup.
Cheers
Mark