I have recently set up an EdgeRouter ER-8 (SW v1.8.0, config.boot included) with:
- LAN on eth0, hosting a number of machines, one of which is used as an IPv4 server
- load-balancing across WAN ADSL boxes on eth1..eth6, including three on eth2..eth4 (different models, two ISPs) that each have a static global IPv4 address and the EdgeRouter as the DMZ of their NAT, which I use to provide access to the servers on the LAN using DNAT performed by the ER-8.
That works, often for a few consecutive days, but I'm facing an issue where some of the WAN boxes on eth2..eth4 no longer work, in the sense that
- access to the server from the global internet through their IPv4 address is no longer possible;
- the corresponding EdgeMax DNAT rules no longer trigger;
- the ADSL box no longer shows the EdgeRouter as connected;
- the EdgeMax dashboard shows the boxes as Connected, but with a permanent 0 in the Tx column for that box, and low Rx.
According to a single attempt at packet capture using the EdgeRouter, the residual traffic consists of unanswered ARP requests from the ADSL box for the static IPv4 address assigned to the DMZ, i.e. the EdgeRouter; another similar attempt did not catch any packets.
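In case it helps anyone reproduce that check, here is a minimal Python sketch of how such a capture could be scanned for ARP requests that never get a reply. It assumes the capture was saved as text from something like `tcpdump -n -i eth2 arp`; the function name and the sample lines (documentation addresses, not real ones) are made up for illustration.

```python
import re

# Patterns for the text tcpdump prints for ARP over Ethernet/IPv4 when run
# with -n (numeric addresses). Assumption: the default one-line output format.
REQUEST = re.compile(r"ARP, Request who-has (\S+)(?: \(\S+\))? tell (\S+),")
REPLY = re.compile(r"ARP, Reply (\S+) is-at (\S+),")


def unanswered_arp_targets(lines):
    """Return {target_ip: request_count} for who-has requests with no reply.

    A target counts as answered if any "Reply <ip> is-at ..." line for it
    appears anywhere in the capture.
    """
    requested = {}
    answered = set()
    for line in lines:
        match = REQUEST.search(line)
        if match:
            target = match.group(1)
            requested[target] = requested.get(target, 0) + 1
            continue
        match = REPLY.search(line)
        if match:
            answered.add(match.group(1))
    return {ip: count for ip, count in requested.items() if ip not in answered}


# Hypothetical capture lines: 192.0.2.1 stands in for the ADSL box and
# 192.0.2.2 for the DMZ address assigned to the EdgeRouter.
sample = [
    "10:00:00.000000 ARP, Request who-has 192.0.2.2 tell 192.0.2.1, length 46",
    "10:00:01.000000 ARP, Request who-has 192.0.2.2 tell 192.0.2.1, length 46",
    "10:00:02.000000 ARP, Request who-has 192.0.2.1 tell 192.0.2.2, length 28",
    "10:00:02.000100 ARP, Reply 192.0.2.1 is-at 00:00:5e:00:53:01, length 46",
]

print(unanswered_arp_targets(sample))
```

A non-empty result for the DMZ address would match what I saw: the box keeps asking who has that address and the EdgeRouter never answers.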
Power-cycling the ADSL boxes consistently fixes the issue, but it recurs.
Briefly unplugging and replugging the Ethernet cable does not fix it (on one occasion the ADSL box did see the EdgeRouter after that, but other functionality was not restored).
These same three boxes with similar NAT/DMZ setup have been solid behind an SRX5308.
I had the problem 8 times over a week, and it has always been on one of the 3 boxes used for inbound traffic, not on the 2 others.
How could I diagnose/fix that? Thanks in advance for any help!