On Sat, 26 Apr 2003, Joel - Fox Computers wrote:
As of today, tried both settings on all of the DSPs, several reboots, and the B's are all out of service and won't come back.
Make sure you're doing a hardware reset and not just a reboot when you're changing the switch type. This bit me with a few clients as well. Also consider that you could have multiple problems: switch type and bad ARC code. What works fine with one chassis config could fail miserably with the same code under a completely different config. Try downgrading to some known-stable code; even 4.x.x. I still run that for some customers if there's no pressing need to be on 5.
Charles
-----Original Message-----
From: Paul Farber [mailto:farber@admin.f-tech.net]
Sent: Saturday, April 26, 2003 1:22 PM
To: Discussion relating to the 3Com/US Robotics Total Control modem systems.
Subject: Re: [USR-TC] Urgent Problem - Any Ideas? Major WTF here.
I had the same sort of issue, at two separate telcos (I was colocated at the switch) with 2 separate chassis.
Let me yell it from the mountaintops: SWITCH TYPE
At telco 1, chassis 1, I was told that 5ESS was the switch type for 16 PRIs. Dropped B's, Local Out of Service, etc. I never had a D-channel alarm, and I find it very hard to believe that the switch tech needed help with what 'Local out of Service' means. Didn't spend too much time on it because my lease was up in 3 weeks. I limped along babysitting and rebooting DSPs as they went local out of service. My guess is that the telco made some sort of change after an upgrade or something, because for 2.5 years I was rock solid at the switch... NOTHING ever went wrong. In the span of 3 weeks the above happened.
New telco, new chassis. Set up switch type as 5ESS. EXACT SAME THING HAPPENED. Since the switch techs were literally 10 feet from my gear, they couldn't ignore the problem when I dialed in and got fast busies. They looped and ran stress tests... nothing. They asked me what my signaling/switch type was. My answer: 5ESS, ESF, B8ZS (isn't that what ALL switches use?). They responded: DOH! We forgot to tell you... we use NI2. TCM'ed into the DSPs, changed switch types to NI2, saved, rebooted, and TADA! No dropped calls, no local out of services, no fast busies.
That was two months ago.
Out of curiosity (and to verify whether I had a broken chassis) I started putting PRI's into the older chassis from the first telco. FLAWLESS. No Local out of Service, no fast busies, etc.
Triple check the switch type and line coding. 5ESS will work somewhat with an NI2... mine did... but had the same problems as you did till the correct switch type was configured.
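One way to make that triple check mechanical is to write down what the telco says it provisioned next to what the chassis is actually set to, and diff the two. The following is only a minimal sketch: the `PriSettings` class and `find_mismatches` helper are hypothetical names made up for illustration, not part of TCM or any 3Com tool, and the values are the ones from the story above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PriSettings:
    """Hypothetical record of the three settings discussed in this thread."""
    switch_type: str
    framing: str
    line_coding: str


def find_mismatches(telco, chassis):
    """Return {setting: (telco_value, chassis_value)} for every setting that differs.

    Comparison ignores case and surrounding whitespace, so "esf" matches "ESF".
    """
    diffs = {}
    for f in fields(PriSettings):
        want = getattr(telco, f.name)
        have = getattr(chassis, f.name)
        if want.strip().upper() != have.strip().upper():
            diffs[f.name] = (want, have)
    return diffs


# What the second telco eventually admitted to, versus what was configured.
telco = PriSettings(switch_type="NI2", framing="ESF", line_coding="B8ZS")
chassis = PriSettings(switch_type="5ESS", framing="esf", line_coding="b8zs")

for name, (want, have) in find_mismatches(telco, chassis).items():
    print(f"{name}: telco says {want}, chassis has {have}")
```

The telco-side values still have to come from the switch tech in writing; the sketch only catches a mismatch once both sides are on paper.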
Also, have them stress test the lines (all 1's is a good stress test). You may have to pay for a site visit... but $1k on line testing is small potatoes when you have to rely on them for your business.
-- Paul Farber Farber Technology farber@admin.f-tech.net Ph 570-628-5303 Fax 570-628-5545
On Fri, 25 Apr 2003, Joel - Fox Computers wrote:
Ok - here's the story.
Just set up a chassis that has 3.5.109 on the DSP's, 5.3.110 on the ARC, 8.6.3 on the NMC. Running 8 PRI's the telco says are configured on a DMS100.
Loaded factory defaults on everything and configured it up. The unit started taking calls, everything looked good. Got up to about 100 users on. Walked away from it.
After about 2 hours, came back to it, and no callers. In Session Monitor in TCM, most of the B-channels on all 8 PRI's showed "Local out of service". The exception was the 8th PRI, where one caller remained connected. I also noticed the D-ALM light intermittently going red on some of the PRI's. As I watched, the B-channels slowly started changing to "Idle" and "in service". But not all at once: on any given PRI, I'd have the D up and 19 of the B's, with any 4 of the B's still "local out of service".
Talked to the Telco, they said everything was up and the B-channels were taken down on our end.
At this time, I noticed a message on the console port of the ARC that said: At 17:48:04, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 9
Now, I've seen similar behavior when the switch type was not correct. Without help from the Telco, I tried changing the switch type on the trunk settings to NI2 (which turned out to be the correct answer last time something like this happened). Then rebooted all of the DSP's.
Once I did this - the D-channels stopped dropping and all the B-channels were back, callers started getting on again. Looked good.
2 1/2 hours later, checked on the box. Only 3 DSP's had callers. At this point, noticed on the terminal, that the ARC had rebooted. Scrolling up in HyperTerminal revealed that some type of exception had happened, and the ARC had rebooted. (Turned on text capture on HyperTerminal at this point).
15 minutes later, all of the callers disappeared. Checked, and all B-channels were down. Thinking it might be ARC related, I gave the ARC a reboot command. Immediately on giving it the reboot - this is what I saw:
HiPer>> reboot
You have requested to Reboot the system
Please confirm the request.(No/Yes):y

Rebooting....
At 21:07:44, Facility "Configurator", Level "INFORMATION":: Received a CFG_SERVICE_CLOSED_MSG message. Administrator Network Service telnetd has been disabled.
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 1
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 1
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 1
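Since HyperTerminal text capture is now on, the capture file can be tallied instead of scrolled through. The following is a rough sketch that assumes every console message follows the `At HH:MM:SS, Facility "...", Level "...":: text` shape seen above; the function names are made up for illustration, and nothing here interprets what a given `callstate` value means.

```python
import re
from collections import Counter

# Assumed message shape, taken from the console output quoted in this thread.
LOG_RE = re.compile(
    r'At (?P<time>\d{2}:\d{2}:\d{2}), '
    r'Facility "(?P<facility>[^"]+)", '
    r'Level "(?P<level>[^"]+)"::\s*'
    # Message text runs until the next "At hh:mm:ss, Facility" header or end of input.
    r'(?P<message>.*?)(?=\s*(?:At \d{2}:\d{2}:\d{2}, Facility "|\Z))',
    re.DOTALL,
)


def parse_console(text):
    """Split captured console text into dicts with time, facility, level, message."""
    return [
        {key: value.strip() for key, value in m.groupdict().items()}
        for m in LOG_RE.finditer(text)
    ]


def count_critical(entries):
    """Count CRITICAL messages, grouped by (facility, message)."""
    return Counter(
        (e["facility"], e["message"]) for e in entries if e["level"] == "CRITICAL"
    )


capture = '''Rebooting....
At 21:07:44, Facility "Configurator", Level "INFORMATION":: Received a CFG_SERVICE_CLOSED_MSG message. Administrator Network Service telnetd has been disabled.
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 1
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 1
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 1
'''

for (facility, message), n in count_critical(parse_console(capture)).items():
    print(f"{n}x {facility}: {message}")
```

Run against a full capture, a tally like this would show whether the CRITICAL messages cluster around the times the B-channels drop, which is worth having in hand before calling the telco again.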
Immediately after this, I pulled up session monitor on one of the DSPs. And about 10 seconds later all of the B-channels changed from "local out of service" to "in service".
Once the ARC rebooted, I was getting fast busies dialing in. Checked the other DSP's and they were all "local out of service" again.
On the 1st DSP, did an Actions/Commands->Software->Restore and all of its B-channels went "in-service".
Did a restore on the rest of the DSP's, and all the B-channels went in-service everywhere, but still getting fast-busy. Gave it about 3 minutes, still fast-busy.
On a hunch, did a hardware reset on the first 4 DSP's. Almost instantly, the 7th and 8th DSP's started taking calls. (?!) Interestingly, though, I still get a fast busy when I dial it myself, and the first DSP just got two utilization lights lit and then rebooted itself. Shortly thereafter, most of the B-channels went out of service again and the callers dropped off. But the ARC isn't giving any errors at the console port at this moment.
So, for lack of any ability to get and keep users connected, I just reset all the DSP's to the DMS100 switch type, and rebooted the cards. Immediately, callers started connecting. Session Monitor shows EVERY B-channel is up. But for how long?
Is the switch-type so far wrong that it just can't function continuously? Is the ARC fried? Is this chassis haunted?
Nothing I can say but: WTF?
Anyone have any ideas?
Thanks,
Joel
_______________________________________________ USR-TC mailing list USR-TC@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/usr-tc