I currently administer a pool of 8 TC1000 chassis, and have been pulling
my hair out trying to stabilize OSPF. What we are seeing is that some TCs
refuse to ack some LSA reannouncements from the DR, causing it to
eventually drop and restart OSPF sessions at random times. Every now
and then this 'corrective' action taken by the DR will trip up a batch
of TCs, and 5 or 6 will all drop OSPF and reload their sessions. These
hiccups are quite noticeable to our dual channel ISDN customers (who
also seem to trigger the activity).
Our setup is as follows:
8 TC ARCs, all in the same OSPF area (not the backbone), with one
additional OSPF neighbor, a Cisco router that acts as the DR and ABR to
the backbone area.
TC7 is the MPIP server for the group; all chassis are clients with
each other.
Dual channel callers are just as likely to land across chassis as they
are to land with both links on one chassis.
All callers get subnet/route info, if any (99% just use a single link
and get an IP from the dynamic pool), from PPP; we do not do any routing
exchange (OSPF, RIP, BGP, etc.) with any customers via dialup.
This problem manifested when we turned up ISDN on this network. Before
that, there were no multi-link customers dialing in, and the setup was
fairly stable. Since turning up MPIP, however, we have multiple OSPF
session failures a day, and our customers are getting quite annoyed with
the pauses or unusable connections they cause.
Can anyone point me to a good walkthrough on a 'proper' MPIP setup so I
can verify that I'm in compliance there, and does anyone have any other
suggestions on how to fix this mess? We're currently running 4.5.
Joshua Coombs