I love a good troubleshooting challenge. But damn, this one had me stumped until I could pull back the covers of my Orbi system to discover the root cause.
I spun up a Cisco FTDv NGFW image, set the outside interface to 192.168.2.100 and the default gateway to the 192.168.2.1 address on the Ubiquiti router. Ok, we are all fired up, let’s test basic connectivity.
Hmmm. Pings from the FTDv to the Orbi went unanswered. I dug into some packet captures and debugs on the Ubiquiti, and as far as I could tell, the packet was being forwarded on VLAN 1 and delivered to the Orbi, which should have responded with an echo reply. I could ping the Orbi from my workstation on the 192.168.1.0 subnet, but not from the 192.168.2.0 subnet. Of course, I have a static route in the Orbi to reach the 192.168.2.0 subnet, so routing did not look like the issue. AND, from the 192.168.2.0 subnet I could ping my workstation or any other IP address on the 192.168.1.0 subnet EXCEPT the Orbi.
I did a little Google action and discovered that there are some backdoors in the Orbi for advanced troubleshooting. The trick is to go to: https://192.168.1.1/debug.htm. From this screen you can enable debug captures, which I tried first, then download them and view them offline in Wireshark. It’s actually a pretty robust set of data that the debug tool returns.
Sure enough, I saw my ICMP echo requests reaching the Orbi just fine, but still, no responses from the Orbi.
Ok, so what is the Orbi doing with these packets? Enter the next nice little hidden feature of your Orbi: Console access. By enabling telnet from the debug.htm page, I was able to remote into the Orbi shell, which is just BusyBox embedded Linux.
It only took me about 30 seconds to discover the problem.
Bingo! Yup, there is a local interface on the Orbi, tun0, that is already using the 192.168.2.0 subnet. That means the replies to my ICMP requests were being sent out the tun0 interface and not back to my Ubiquiti router, because a directly connected route has a metric that is preferred over a static route to the same subnet.
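To make that route selection concrete, here is a minimal Python sketch of how a Linux-style lookup picks between the two entries: longest prefix first, then lowest metric. This is a simplified model, not the Orbi's actual code. The `br0` interface name, the `192.168.1.2` next hop for the Ubiquiti, and the metric values are assumptions made up for illustration; only tun0 and the subnets come from what I saw on the box.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    network: ipaddress.IPv4Network
    interface: str
    metric: int
    gateway: Optional[str] = None  # None means directly connected


def select_route(routes: List[Route], destination: str) -> Optional[Route]:
    """Pick a route for destination: longest prefix first, then lowest metric."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in r.network]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (-r.network.prefixlen, r.metric))


# Hypothetical routing table modeled on the situation described above.
ROUTES = [
    # LAN side of the Orbi (interface name "br0" is an assumption).
    Route(ipaddress.ip_network("192.168.1.0/24"), "br0", 0),
    # Directly connected route created by the tun0 interface.
    Route(ipaddress.ip_network("192.168.2.0/24"), "tun0", 0),
    # My static route toward the Ubiquiti (next hop and metric are assumptions).
    Route(ipaddress.ip_network("192.168.2.0/24"), "br0", 1, gateway="192.168.1.2"),
]

chosen = select_route(ROUTES, "192.168.2.100")
print(chosen.interface)  # prints "tun0"
```

Both 192.168.2.0/24 entries match the FTDv address with the same prefix length, so the tie is broken by metric and the connected tun0 route wins. The static route never gets a chance.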
TL;DR: You can’t use the 192.168.2.0/24 subnet behind an Orbi, because the Orbi already uses it on a local interface.
Once I changed my VLAN interface to the 192.168.20.0 subnet, full connectivity was achieved.
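If you want to sanity check a subnet before you build a VLAN on it, a quick overlap test against the networks the Orbi already owns would have caught this. The sketch below uses Python's standard `ipaddress` module. The `ORBI_LOCAL_NETWORKS` table is something you would fill in by hand from the Orbi shell, and the `br0` name and the /24 masks are assumptions on my part.

```python
import ipaddress

# Networks already in use on the Orbi, keyed by interface.
# "br0" and the /24 masks are assumptions; tun0 is the interface I found.
ORBI_LOCAL_NETWORKS = {
    "br0": "192.168.1.0/24",
    "tun0": "192.168.2.0/24",
}


def find_conflicts(candidate, local_networks):
    """Return the interfaces whose network overlaps the candidate subnet."""
    wanted = ipaddress.ip_network(candidate)
    return [
        name
        for name, cidr in local_networks.items()
        if ipaddress.ip_network(cidr).overlaps(wanted)
    ]


print(find_conflicts("192.168.2.0/24", ORBI_LOCAL_NETWORKS))   # prints ['tun0']
print(find_conflicts("192.168.20.0/24", ORBI_LOCAL_NETWORKS))  # prints []
```

An empty list means the subnet does not collide with anything in the table, which is the case for 192.168.20.0/24.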