On Wed, 6 Mar 2002, Mike Andrews wrote:
The netmask might spontaneously change if you have two routes in your routing table with the same net number but a different size -- most notably large aggregate Null0 routes you might have for BGP.
For example if you have something on your Cisco like
    router bgp 65535
     network 192.168.1.0 mask 255.255.240.0
    ip route 192.168.1.0 255.255.240.0 Null0 250
    !
    interface fastethernet0/0
     ip address 192.168.1.1 255.255.255.224
    interface fastethernet0/1
     ip address 192.168.1.33 255.255.255.224
...where you've got a /20 that's subnetted into chunks of /27 or whatever. You'd end up with this in the routing table of the Cisco:
    192.168.1.0/20     null0
    192.168.1.0/27     fastethernet0/0
    192.168.1.32/27    fastethernet0/1
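To spot this kind of overlap before it bites, something like the following rough Python sketch could be run against a dump of the routing table. It only groups routes by net number exactly as they are written in the table above; the function name and the input format (a list of "net/length" strings) are assumptions made for illustration, not anything the Cisco or the ARC provides.

```python
from collections import defaultdict


def find_net_number_collisions(routes):
    """Return net numbers that appear with more than one prefix length.

    `routes` is a list of strings such as "192.168.1.0/27", copied as-is
    from the routing table (hypothetical input format).
    """
    lengths_by_net = defaultdict(set)
    for route in routes:
        net, _, length = route.partition("/")
        lengths_by_net[net].add(int(length))
    # Keep only net numbers that show up with two or more different sizes.
    return {
        net: sorted(lengths)
        for net, lengths in lengths_by_net.items()
        if len(lengths) > 1
    }


routes = ["192.168.1.0/20", "192.168.1.0/27", "192.168.1.32/27"]
collisions = find_net_number_collisions(routes)
print(collisions)  # {'192.168.1.0': [20, 27]}
```

Any net number that shows up in the result is a subnet you would not want an ARC sitting on.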
If your ARC is hanging off of fa0/1 in this example, everything's cool. If your ARC is hanging off of fa0/0 in this example, you'll have problems. Big problems. The ARC will get confused when it sees both 192.168.1.0/20 and 192.168.1.0/27 in the routing table, and will use the wrong one.
If your ARC happens to be located in 192.168.1.0/27, it actually screws up to the point of deciding that your ethernet interface's netmask is /20 instead of /27. Annoying but not deadly -- until your DR or BDR reboots, at which point it becomes extremely toxic because your network melts down into a little puddle as everyone hangs in ExStart. (Turn on OSPF debugging on the Cisco and you'll see that they won't negotiate because the netmasks don't match.)
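To make the mismatch concrete: OSPF Hello packets on a broadcast segment carry the sender's network mask, and neighbors are expected to agree on it. The sketch below just converts the two prefix lengths from the example into dotted masks and compares them. It is an illustration of the comparison only, not how either box implements it, and the helper names are made up.

```python
import ipaddress


def dotted_mask(prefix_len):
    """Convert a prefix length such as 27 into a dotted netmask string."""
    return str(ipaddress.ip_network(f"0.0.0.0/{prefix_len}").netmask)


def masks_agree(local_prefix_len, neighbor_prefix_len):
    """True if both sides of the segment would be using the same netmask."""
    return dotted_mask(local_prefix_len) == dotted_mask(neighbor_prefix_len)


cisco_mask = dotted_mask(27)  # what the Cisco interface is configured with
arc_mask = dotted_mask(20)    # what the confused ARC decides to use

print(cisco_mask, arc_mask, masks_agree(27, 20))
```

With the ARC on /20 and the Cisco on /27, the comparison fails, which matches what the OSPF debugging on the Cisco reports.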
Thanks, this explanation sounds very similar to the problem we've had.
(You can recover without rebooting the ARC if you can get in and do a 'reconfigure ip network ip' to reset the netmask. I think. It's been 2 years or so.)
When the problem happens we can usually telnet to the console command prompt so this might be an additional useful workaround.
If your ARC is in 192.168.1.32/27, then everything works just dandy, because there's not another route with the same net number for it to get confused by. That's the workaround for the bug -- move the ARC to another subnet.
3Com's known about this bug for years. (It was bug ID MR12019 once upon a time.) Check the list archives; I brought it up numerous times. I got tired of waiting for a fix and renumbered my ARCs into a different segment 2 years ago. I don't know if they fixed it in 5.1.99 or not... I worked around it and my ARCs work great now for everything else, so, shrug...
Well, we're running 5.1.99 now :(. We plan to load the recent TCS 4.5 code into the ARC soon - maybe that'll solve the problem, but I'll look through the release notes too. If it doesn't fix things, we may have to revert to RIP.