We are designing a new location for one of our customers and would like to have a pair of ES-12F switches at its core. We will create a LAG between the two, and the access switches will have uplinks to both.
My question is how to provide a good, redundant uplink to the EdgeRouter. The way I see it, we have three options, and none of them is optimal as things stand.
- Create a bridge group with spanning tree and uplink into each ES-12F
  - This gets us an active-passive link that would take ~32 seconds to fail over if one switch fails. Not terrible, especially since the chances of a switch failing are pretty small.
- Create a bonding interface
  - active-passive: same as before, but with immediate failover. One downside is that if one of the 12F switches is isolated from the rest of the switches but is still online and linked up, traffic to and from the ER may be stranded on that switch.
  - active-active: either TLB or ALB mode. This would provide connectivity over both interfaces, ideally with better load balancing, but it still suffers from the isolation problem described for active-passive (in fact it is now guaranteed, since both links carry traffic).
- Two EdgeRouters with VRRP
  - I know in theory this would work, but the logistics seem unreasonable. Putting aside people not quite understanding how VRRP works, and the possibility of someone on our staff making a config change that either doesn't get replicated to the other ER or just straight up breaks VRRP, I'm not certain the ER pair would be protected against a switch being isolated.
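As a rough sanity check on the ~32 second figure for the bridge option, here is a small sketch of the classic spanning tree timer arithmetic. It assumes the standard 802.1D defaults (max age 20 s, forward delay 15 s); I have not confirmed that the ER bridge uses these values, and the function names are only illustrative.

```python
# Classic 802.1D timer defaults (assumed; not confirmed for the ER bridge).
MAX_AGE = 20         # seconds before stored BPDU information expires
FORWARD_DELAY = 15   # seconds spent in each of listening and learning


def direct_failure_outage(forward_delay=FORWARD_DELAY):
    """Active port loses link: the blocked port walks through
    listening and learning before it starts forwarding."""
    return 2 * forward_delay


def indirect_failure_outage(max_age=MAX_AGE, forward_delay=FORWARD_DELAY):
    """Failure is not seen as link-down: the stored BPDU information has
    to age out first, then the listening/learning walk begins."""
    return max_age + 2 * forward_delay


print(direct_failure_outage())    # 30
print(indirect_failure_outage())  # 50
```

Under those assumptions a direct link failure costs about 30 seconds, which is in line with the ~32 seconds quoted above, while a failure that has to wait for max age to expire would be closer to 50 seconds.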
Perhaps I'm focusing too much on the possibility of one switch being isolated, as the only way that could happen is if the LAG between the 12F switches goes down (the switches are in the same rack, so that should be very unlikely).
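To make the isolation worry concrete, here is a minimal connectivity sketch. It is a simplified model, not a simulation of EdgeOS: the node names and the count of two access switches are assumptions, spanning tree is ignored, and the bond is assumed to pick its active uplink from link state alone.

```python
from collections import deque

# Hypothetical node names: one ER, two core ES-12F switches joined by a
# LAG, and two access switches (count assumed) dual-homed to both cores.
LAG = frozenset({"core1", "core2"})
LINKS = {
    frozenset({"er", "core1"}), frozenset({"er", "core2"}), LAG,
    frozenset({"core1", "access1"}), frozenset({"core2", "access1"}),
    frozenset({"core1", "access2"}), frozenset({"core2", "access2"}),
}


def reachable(links, src, dst):
    """Breadth-first search over the undirected links that are still up."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return False


def active_uplink(links, order=("core1", "core2")):
    """Pick the first ER uplink whose link is up (link-state check only)."""
    for core in order:
        if frozenset({"er", core}) in links:
            return core
    return None


def er_reaches(links, dst):
    """Active-passive bond: only the active uplink carries ER traffic."""
    core = active_uplink(links)
    if core is None:
        return False
    uplink = frozenset({"er", core})
    usable = {l for l in links if "er" not in l or l == uplink}
    return reachable(usable, "er", dst)


# core1 loses the LAG and its access uplinks, but its ER port stays up.
isolated = LINKS - {LAG, frozenset({"core1", "access1"}),
                    frozenset({"core1", "access2"})}

print(er_reaches(LINKS, "access1"))          # True
print(er_reaches(isolated, "access1"))       # False
print(reachable(isolated, "er", "access1"))  # True
```

In this model the bond keeps using core1 because its link is still up, so the ER is cut off even though a working path through core2 exists. Once the ER-to-core1 link itself drops, the bond moves to core2 and connectivity returns.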
So with all that said, neither bridge nor bond groups are hardware offloaded. I've tested and gotten about 200 Mbps max on an ER Pro. This might be acceptable, as we don't have anywhere near that on our WAN, but I'd rather not tie our hands. If I had a choice, I'd say offloading the bond group would be ideal, as that provides load balancing and immediate failover, but I can see how bridge groups would apply more universally to others using EdgeRouters.
I'd like to open a discussion on the merits of each method and perhaps get a feature request for hardware offloading of one of the first two options.
For reference, I have provided an image of our proposed topology (containing just one ER).