RE: [USR-TC] Urgent Problem - Any Ideas? Major WTF here.
Too funny - but thanks for the vote of confidence - this one has me stumped - but I'm certain I don't have 8 bad DSP's, that I'm sure of.

-----Original Message-----
From: matthew@the-spa.com [mailto:matthew@the-spa.com]
Sent: Friday, April 25, 2003 11:23 PM
To: usr-tc@mailman.xmission.com
Subject: RE: [USR-TC] Urgent Problem - Any Ideas? Major WTF here.

i had something similar but it was only on one card, and that card would show funny lights when you booted it up from day one and it just turned out to be a bad card. you have a whole bunch of cards all doing this, so i don't know what it could be. if i had this i would call you and just have you figure it out :)

matthew

---- Original Message ----
From: jfox@foxcomputers.com
To: usr-tc@mailman.xmission.com
Subject: RE: [USR-TC] Urgent Problem - Any Ideas? Major WTF here.
Date: Fri, 25 Apr 2003 22:06:49 -0500
Ok - here's the story.
Just set up a chassis that has 3.5.109 on the DSP's, 5.3.110 on the ARC, 8.6.3 on the NMC. Running 8 PRI's the telco says are configured on a DMS100.
Loaded factory defaults on everything and configured it up. The unit started taking calls, everything looked good. Got up to about 100 users on. Walked away from it.
After about 2 hours, came back to it, and no callers. In Session Monitor in TCM, most of the B-channels on all 8 PRI's showed "Local out of service". The only exception was on the 8th PRI, where one caller remained connected. I also noticed the D-ALM light intermittently going red on some of the PRI's. As I watched, the B-channels slowly started changing to "Idle" and "in service". But not all at once: on any given PRI, I'd have the D up and 19 of the B's, with any 4 of the B's still "local out of service".
Talked to the Telco, they said everything was up and the B-channels were taken down on our end.
At this time, I noticed a message on the console port of the ARC that said:

At 17:48:04, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 9
Now, I've seen similar behavior when the switch type was not correct. Without help from the Telco, I tried changing the switch type on the trunk settings to NI2 (which turned out to be the correct answer last time something like this happened). Then rebooted all of the DSP's.
Once I did this - the D-channels stopped dropping and all the B-channels were back, callers started getting on again. Looked good.
2 1/2 hours later, checked on the box. Only 3 DSP's had callers. At this point, noticed on the terminal, that the ARC had rebooted. Scrolling up in HyperTerminal revealed that some type of exception had happened, and the ARC had rebooted. (Turned on text capture on HyperTerminal at this point).
15 minutes later, all of the callers disappeared. Checked, and all B-channels were down. Thinking it might be ARC related, I gave the ARC a reboot command. Immediately on giving it the reboot - this is what I saw:
HiPer>> reboot
You have requested to Reboot the system
Please confirm the request.(No/Yes):y

Rebooting....
At 21:07:44, Facility "Configurator", Level "INFORMATION":: Received a CFG_SERVICE_CLOSED_MSG message. Administrator Network Service telnetd has been disabled.
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 1
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 1
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 1
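For anyone sifting through a HyperTerminal text capture like this one, here is a rough Python sketch that pulls the console entries apart and counts them by facility and level. It assumes every entry follows the 'At HH:MM:SS, Facility "...", Level "...":: message' shape seen in the lines quoted above; the function names and the sample string are made up for illustration and are not part of any ARC or TCM tooling.

```python
import re
from collections import Counter

# Entry shape assumed from the console lines quoted in this message:
#   At HH:MM:SS, Facility "<name>", Level "<level>":: <message>
# Entries may run together on one line in a text capture, so the message is
# matched lazily up to the next 'At HH:MM:SS, Facility "' or the end of input.
ENTRY = re.compile(
    r'At (?P<time>\d{2}:\d{2}:\d{2}), '
    r'Facility "(?P<facility>[^"]+)", '
    r'Level "(?P<level>[^"]+)"::\s*'
    r'(?P<message>.*?)(?=\s*At \d{2}:\d{2}:\d{2}, Facility "|\Z)',
    re.DOTALL,
)


def parse_entries(text):
    """Return one dict per console entry found in the captured text."""
    entries = []
    for match in ENTRY.finditer(text):
        entry = match.groupdict()
        # Collapse line breaks and repeated spaces inside the message.
        entry["message"] = " ".join(entry["message"].split())
        entries.append(entry)
    return entries


def count_by_level(entries):
    """Count entries per (facility, level) pair."""
    return Counter((e["facility"], e["level"]) for e in entries)


# Sample text copied from the reboot output quoted in this message.
SAMPLE = '''Rebooting.... At 21:07:44, Facility "Configurator", Level "INFORMATION":: Received a CFG_SERVICE_CLOSED_MSG message. Administrator Network Service telnetd has been disabled. At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 1
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 1
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL":: callline->callstate = 1
'''

for (facility, level), count in count_by_level(parse_entries(SAMPLE)).items():
    print(f"{facility} / {level}: {count}")
```

Running the same functions over a full capture file would show at a glance whether the CRITICAL callstate messages cluster around the times the B-channels dropped.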
Immediately after this, I pulled up session monitor on one of the DSPs. And about 10 seconds later all of the B-channels changed from "local out of service" to "in service".
Once the ARC rebooted, getting fast-busy's dialing in. Checked other DSP's and they were all again "local out of service".
On the 1st DSP, did an Actions/Commands->Software->Restore and all of its B-channels went "in-service".
Did a restore on the rest of the DSP's, and all the B-channels went in-service everywhere, but still getting fast-busy. Gave it about 3 minutes, still fast-busy.
On a hunch - did a Hardware reset on the first 4 DSP's. Almost instantly, the 7th and 8th DSP's started taking calls. (?!) Interestingly though - I still get a fast busy when I dial it myself, and the first DSP just got two utilization lights lit and then rebooted itself. Shortly thereafter, most of the B-channels went out of service again and the callers dropped off. But the ARC isn't giving any errors at the console port at this moment.
So, for lack of any ability to get and keep users connected, I just reset all the DSP's to the DMS100 switch type, and rebooted the cards. Immediately, callers started connecting. Session Monitor shows EVERY B-channel is up. But for how long?
Is the switch-type so far wrong that it just can't function continuously? Is the ARC fried? Is this chassis haunted?
Nothing I can say but: WTF?
Anyone have any ideas?
Thanks,
Joel
_______________________________________________ USR-TC mailing list USR-TC@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/usr-tc
Joel. You gotta be kidding me. My e-mail says it's 11:57 P.M. Please tell me you aren't still messing around with Total Control boxes at this hour. (Actually, it's 1:45 A.M. here in Dallas, I'm up feeding junior (6-month old son)). The internet will survive, take a break!!

If you need a spare/benchmark unit out there, let me know and I'll get one going your way. All I ask is that you pick up the freight on this "beast". You must not have got that unit from us, we never have any problems with the units WE send out.

Go to Bed!

Shannan Young
SouthWest Data Technology, Inc.
sky1@airmail.net
www.swdt.com
972-739-7010
972-739-7013 (Fax)