----- Original Message -----
Sent: Saturday, April 26, 2003 9:43 AM
Subject: *MORE INFO* RE: [USR-TC] Urgent Problem - Any Ideas? Major WTF here.
Yep, I was working that late - wish this was one of Shannan's boxes.

This morning, the ARC on this unit rebooted again, and absolutely
every B-channel was local out-of-service. At the reboot, the ARC said
this:
EXCEPTION 0300 CRASH DUMP:
GPRs:
R0: 0x0065EC88 R1: 0x07F55E88 R2: 0x000F3200 R3: 0x41820114
R4: 0x07F55F68 R5: 0x00000003 R6: 0x00000006 R7: 0x0000000F
R8: 0x00000000 R9: 0x00000000 R10: 0x0000000F R11: 0x00000006
R12: 0x0000003C R13: 0x000FD248 R14: 0x0099D9FC R15: 0x0099D9E0
R16: 0x0099D9E8 R17: 0x0099D9D4 R18: 0x0099D9DC R19: 0x0099D9EC
R20: 0xFFFFFEFF R21: 0x00CC5FF0 R22: 0x00306F34 R23: 0x00C10220
R24: 0x00000000 R25: 0x014D9364 R26: 0x01E4F2B4 R27: 0x00000000
R28: 0x009D9070 R29: 0x00000000 R30: 0x000000FF R31: 0x01342E44
SPRs:
CR: 0x20000400 XER: 0x20000000 LR: 0x0065EC88 CTR: 0x005171BC
SRR0: 0x00661FC0 SRR1: 0x00009030
DSISR: 0x40000000 DAR: 0x41820157
DMISS: 0x41820150 DCMP: 0x80000006 HASH1: 0x00010800 HASH2: 0x0001F7C0
IMISS: 0x00000000 ICMP: 0x00000000
RPA: 0x00000000 IABR: 0x00000000
82660 Registers:
Err Status 1: 0x00, Err Status 2: 0x00, CPU Err: 0x14, PCI Err: 0x06
CPU/PCI Addr: 0x00061EFC, Sys Error Addr: 0x0006E960
Call Stack:
0x00661FC0 (Exception return address - SRR0)
0x0065EC88
0x0063B3F8
0x00638594
0x00306F88
0x007F6BB4
0x007F6E74
0x002008D4
0x0020024C
0x0020009C
0x000A95A0
The first 2000 bytes in the Stack follows
0x83C10010 0x7C0803A6 0x8001001C
0x906CC040 0x38600001 0x3D80009A 0x48000010
0x38600000 0x4182000C 0x2C030000 0x90610008
0x483E4875 0x38A00040 0x8084BFC0 0x3C80009A
0x387F0000 0x907E0004 0x483F5CC1 0x9143001C
0xB17E0000 0x614A4354 0x816B19C8 0x3D404143
0x83C30008 0x3D600089 0x4800005C 0x38600000
0x4182000C 0x2C0C0000 0x81810008 0x41820010
0x7C7F1B79 0x483F1605 0x9001001C 0x38A00040
0x93E10014 0x60845030 0x38C10008 0x93C10010
0x38600014 0x3C805544 0x9421FFE8 0x7C0802A6
0x4E800020 0x38210040 0x7C0803A6 0xBB010020
0x80010044 0x7D99C12E 0x39800000 0x4182000C
0x2C0B0000 0x7D79C02E 0x3B393514 0x3F200089
0x4181FF28 0x2C0C0000 0x81990010 0x483EA8ED
0x387E0000 0x483EA8F5 0x807E0008 0x483F1801
0x38800000 0x807F0000 0x40820018 0x2C0B0000
0x897F0004 0x83FE0008 0x4BFFC5E1 0x38A00002
0x7C6C5A14 0x81830008 0x389F0000 0x8163000C
0x4082001C 0x7C1D6000 0x83AB0004 0x816B0000
0x618C6104 0x817E0008 0x3D80554D 0x994C024A
0x394A00FF 0x894C024A 0x7D8C5A14 0x8163000C
0x81830008 0x41820050 0x7C7F1B79 0x483FEF9D
0x809D0000 0x83AA0008 0x814A0000 0x387C0000
0x81430008 0x996C0004 0x396B00FF 0x896C0004
0x81830008 0x418200D4 0x2C030000 0x408200DC
0x2C0A0000 0x8141000C 0x3BC30000 0x483E609D
0x38800000 0x38790000 0x38A1000C 0x3B3E0000
0x3B9C64A0 0x3F8000CE 0x408100EC 0x2C0A0000
0x815E0010 0x7FCC5A14 0x398C6550 0x3D8000CE
0x1D6B0024 0x572B063E 0x4180FFCC 0x2C1C0100
0x3B9C0001 0x4BFFF40D 0x387F0000 0x38810008
0x9B810008 0x40820014 0x7C0BD800 0x7D6BD838
0x7FBB6030 0x7D7E58AE 0x578C077E 0x578BEEFE
0x3BBC0000 0x7FC96214 0x3BFB0000 0x39292A9C
0x3B800001 0x3D200089 0x1D9B0530 0x573B063E
0x4181FFA0 0x7C0AF840 0x815D02DC 0x3BFF0001
0x3BDE0138 0x7D3CC12E 0x39200000 0x483F1961
0x38800000 0x7C7CC02E 0x483F7F15 0x38800020
0x41820020 0x2C030000 0x7C7CC02E 0x7F8C5A14
0x8163000C 0x81830008 0x41820038 0x2C030000
0x483FF0C1 0x387A0000 0x808C00F8 0x7D8CF214
0x819B0000 0x3B7BD150 0x3B5A64A0 0x3BBC0000
0x3BDF0000 0x3F60009A 0x3F4000CE 0x41820080
0x2C0B0000 0x817C02DC 0x3BE00000 0x4800020C
0x4181FF74 0x7C0AD840 0x815D02DC 0x3B7B0001
0x3BFF0138 0x48000711 0x91410014 0xB161001A
0x7D4A602E 0x39600000 0x91610010 0x38610010
0x1D9E001C 0xB1810018 0x7D5D5214 0x1D4A0054
0x816B00F8 0x572A063E 0x7D6BFA14 0x817A0000
0x572C063E 0x91840028 0x39800000 0x4082004C
0x2C0A0000 0x7D44C02E 0x7C8C5A14 0x8163000C
0x81830008 0x41820064 0x2C030000 0x483FF17D
0x387C0000 0x808C00F8 0x7D8CFA14 0x819A0000
0x3B5AD
BOOT PROM Version 1.16 (Built on June 9th, 1998 at 12:24:24)
Loading kernel ... OK
Joel.
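
A note for anyone comparing dumps like the one above: the registers are easier to work with once they are pulled out of the wrapped capture text. The following is only a rough sketch, not anything from USR. The function names `parse_registers` and `nearest_gpr` are made up for illustration, and it assumes the ARC's CPU follows the usual PowerPC convention where exception 0300 is a data storage fault and DAR holds the address that faulted. It reads "NAME: 0xVALUE" pairs even when a line break falls between the name and the value, then reports which general-purpose register sits closest to DAR.

```python
import re

# A few lines copied from the dump above, wrapped the way the capture wrapped them.
DUMP = """\
R0: 0x0065EC88 R1: 0x07F55E88
R2: 0x000F3200 R3: 0x41820114
R4:
0x07F55F68 R5: 0x00000003 R6: 0x00000006
R7: 0x0000000F
LR: 0x0065EC88 CTR:
0x005171BC
SRR0: 0x00661FC0 SRR1: 0x00009030
DSISR: 0x40000000 DAR: 0x41820157
"""

def parse_registers(text):
    """Return {name: value} for every 'NAME: 0xHEX' pair in the text.

    The whitespace between the colon and the value may include a newline,
    so wrapped captures are handled as well as clean ones.
    """
    pairs = re.findall(r"\b([A-Z]+\d*):\s*(0x[0-9A-Fa-f]+)", text)
    return {name: int(value, 16) for name, value in pairs}

def nearest_gpr(regs, addr):
    """Return (register name, addr - register value) for the closest GPR."""
    gprs = {n: v for n, v in regs.items() if re.fullmatch(r"R\d+", n)}
    name = min(gprs, key=lambda n: abs(gprs[n] - addr))
    return name, addr - gprs[name]

if __name__ == "__main__":
    regs = parse_registers(DUMP)
    name, offset = nearest_gpr(regs, regs["DAR"])
    print(f"DAR is {name} {offset:+#x}")
```

On the values in this dump the closest register is R3, 0x43 bytes below DAR. That hints the faulting access may have gone through R3, but it does not prove it, and it says nothing about why R3 held that value.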
You gotta be kidding me. My e-mail says it's 11:57 P.M. Please tell me
you aren't still messing around with Total Control boxes at this hour.
(Actually, it's 1:45 A.M. here in Dallas; I'm up feeding junior, my
6-month-old son.) The internet will survive, take a break!!

If you need a spare/benchmark unit out there, let me know and I'll get
one going your way. All I ask is that you pick up the freight on this
"beast".

You must not have got that unit from us; we never have any problems
with the units WE send out.

Go to Bed!

Shannan Young
----- Original Message -----
Sent: Friday, April 25, 2003 11:54 PM
Subject: RE: [USR-TC] Urgent Problem - Any Ideas? Major WTF here.

Too funny - but thanks for the vote of confidence. This one has me
stumped, but I'm certain I don't have 8 bad DSP's, that I'm sure of.
-----Original Message-----
From: matthew@the-spa.com [mailto:matthew@the-spa.com]
Sent: Friday, April 25, 2003 11:23 PM
To: usr-tc@mailman.xmission.com
Subject: RE: [USR-TC] Urgent Problem - Any Ideas? Major WTF here.
i had something similar but it was only on one card, and that card
would show funny lights when you booted it up from day one and it
just turned out to be a bad card.

you have a whole bunch of cards all doing this, so i don't know what
it could be.

if i had this i would call you and just have you figure it out :)

matthew
---- Original Message ----
From: jfox@foxcomputers.com
To: usr-tc@mailman.xmission.com
Subject: RE: [USR-TC] Urgent Problem - Any Ideas? Major WTF here.
Date: Fri, 25 Apr 2003 22:06:49 -0500
>
>Ok - here's the story.
>
>Just set up a chassis that has 3.5.109 on the DSP's, 5.3.110 on the
>ARC, 8.6.3 on the NMC. Running 8 PRI's the telco says are configured
>on a DMS100.
>
>Loaded factory defaults on everything and configured it up. The unit
>started taking calls, everything looked good. Got up to about 100
>users on. Walked away from it.
>
>After about 2 hours, came back to it, and no callers. In Session
>Monitor in TCM, most of the B-channels on all 8 PRI's showed "Local
>out of service". All except on the 8th PRI, where one caller remained
>connected. I also noticed the D-ALM light intermittently going red
>on some of the PRI's. As I watched, slowly the B-channels all
>started changing to "Idle" and "in service". But not all at once: on
>any given PRI, I'd have the D up, and 19 of the B's, with any 4 of the
>B's still "local out of service".
>
>Talked to the Telco, they said everything was up and the B-channels
>were taken down on our end.
>
>At this time, I noticed a message on the console port of the ARC that
>said:
>At 17:48:04, Facility "GWC Modem Driver", Level "CRITICAL"::
>callline->callstate = 9
>
>Now, I've seen similar behavior when the switch type was not correct.
>Without help from the Telco, I tried changing the switch type on the
>trunk settings to NI2 (which turned out to be the correct answer last
>time something like this happened). Then rebooted all of the DSP's.
>
>Once I did this - the D-channels stopped dropping and all the
>B-channels were back, callers started getting on again. Looked good.
>
>2 1/2 hours later, checked on the box. Only 3 DSP's had callers. At
>this point, noticed on the terminal that the ARC had rebooted.
>Scrolling up in HyperTerminal revealed that some type of exception
>had happened, and the ARC had rebooted. (Turned on text capture on
>HyperTerminal at this point.)
>
>15 minutes later, all of the callers disappeared. Checked, and all
>B-channels were down. Thinking it might be ARC related, I gave the
>ARC a reboot command. Immediately on giving it the reboot - this is
>what I saw:
>
>HiPer>> reboot
>You have requested to Reboot the system
>Please confirm the request.(No/Yes):y
>
>Rebooting....
>At 21:07:44, Facility "Configurator", Level "INFORMATION":: Received
>a CFG_SERVICE_CLOSED_MSG message. Administrator Network Service
>telnetd has been disabled.
>At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL"::
>callline->callstate = 1
>
>At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL"::
>callline->callstate = 1
>
>At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL"::
>callline->callstate = 1
>
>Immediately after this, I pulled up Session Monitor on one of the
>DSPs. And about 10 seconds later all of the B-channels changed from
>"local out of service" to "in service".
>
>Once the ARC rebooted, getting fast-busy's dialing in. Checked other
>DSP's and they were all again "local out of service".
>
>On the 1st DSP, did an Actions/Commands->Software->Restore and all of
>its B-channels went "in-service".
>
>Did a restore on the rest of the DSP's, and all the B-channels went
>in-service everywhere, but still getting fast-busy. Gave it about 3
>minutes, still fast-busy.
>
>On a hunch - did a Hardware reset on the first 4 DSP's. Almost
>instantly, the 7th and 8th DSP's started taking calls. (?!)
>Interestingly though - I still get a fast busy when I dial it
>myself, and the first DSP just got two utilization lights lit and
>then rebooted itself. Shortly thereafter, most of the B-channels
>went out of service again and the callers dropped off. But the ARC
>isn't giving any errors at the console port at this moment.
>
>So, for lack of any ability to get and keep users connected, I just
>reset all the DSP's to the DMS100 switch type, and rebooted the
>cards. Immediately, callers started connecting. Session Monitor
>shows EVERY B-channel is up. But for how long?
>
>Is the switch-type so far wrong that it just can't function
>continuously? Is the ARC fried? Is this chassis haunted?
>
>Nothing I can say but: WTF?
>
>Anyone have any ideas?
>
>Thanks,
>
>Joel
>
>_______________________________________________
>USR-TC mailing list
>USR-TC@mailman.xmission.com
>http://mailman.xmission.com/cgi-bin/mailman/listinfo/usr-tc
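
For anyone sifting through a long HyperTerminal capture like the one quoted above, the ARC console events can be tallied by facility and level to see which messages cluster around a reboot. This is a minimal sketch that assumes every event follows the 'At HH:MM:SS, Facility "...", Level "..."::' shape seen in this thread. The names `parse_console` and `EVENT` are hypothetical, and other firmware versions may format their messages differently.

```python
import re
from collections import Counter

# Three events copied from the capture quoted in this thread.
SAMPLE = '''At 21:07:44, Facility "Configurator", Level "INFORMATION":: Received
a CFG_SERVICE_CLOSED_MSG message. Administrator Network Service
telnetd has been disabled.
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL"::
callline->callstate = 1
At 21:07:44, Facility "GWC Modem Driver", Level "CRITICAL"::
callline->callstate = 1
'''

# One event: header fields, then message text up to the next header or the end.
EVENT = re.compile(
    r'At (\d\d:\d\d:\d\d), Facility "([^"]+)", Level "([^"]+)"::\s*(.*?)'
    r'(?=\s*At \d\d:\d\d:\d\d, Facility|$)'
)

def parse_console(text):
    """Return a list of (time, facility, level, message) tuples."""
    # Terminal captures wrap long lines, so collapse all whitespace first.
    flat = " ".join(text.split())
    return [m.groups() for m in EVENT.finditer(flat)]

if __name__ == "__main__":
    events = parse_console(SAMPLE)
    counts = Counter((facility, level) for _, facility, level, _ in events)
    for (facility, level), n in counts.most_common():
        print(f"{n:3d}  {level:12s} {facility}")
```

Counting by (facility, level) is only a first pass. Keeping the timestamps would also let the events be lined up against the times the B-channels dropped.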