Re: Stuck W1 port

Brian Moore (bem@cmc.net)
Tue, 8 Oct 1996 17:04:34 -0700 (PDT)

> On Tue, 8 Oct 1996, Brian Moore wrote:
> >
> > I have occasional, or more accurately, sporadic problems of a similar
> > nature on one line. Some nights it will refuse to acknowledge anything
> > 3-4 times,
>
> 'Refuse to acknowledge'? I'm confused; do you mean the W1 port doesn't go to
> established after you have reset it?

No, I mean that it just seems to not see any data coming towards it. 'reset
w1' always fixes it. Some nights it's 3-4 resets, but the past couple days it
has been behaving. (And another line has been losing protocol for 20-second
spots... just enough to page and annoy me.. but that's almost surely a phone
issue.)

> > though
> > it's been almost 36 hours since it last had one of its fits, which
> > confuses me. We have multiple leased lines here, and only one is being
> > the problem child. (Though another one was acting up on Sunday, but that
> > may just be local telco problems... it seems to have fixed itself, and it's
> > behaved fine for several months).
>
> So this one line has had continuing problems? Is it PPP or frame relay?
> What is the router on the other end of the link?

This, like (most of) our other lines, is PPP over a T1. It's a Cisco 4500 at
the hub. There are indications that it's a hardware or telco issue: two other
identically set up systems have precisely no such problems. But... the
solution makes it reek of software, and it sounds a LOT like the old 3.3.1 bug
of W1 just tiring out and not seeing anything.

> > I'm baffled as to what the problem is. Likewise, we are seeing a large
> > number of framing errors on this one PM, though I don't know if it's the
> > PM or line quality. According to the CSU, the quality on the line is
> > dubious at times, but the hits don't coincide with the periods of the PM
> > silence.
>
> Framing errors indicate line problems. I'm not sure how a CSU measures line
> quality; maybe you can fill me in on that aspect.

Ah, nice Datasmart CSUs log (in 15-minute blocks) the number of Bipolar
violations, Code Violations, Controlled Slip, etc. Nifty little toys, though
the menu system gets annoying after a while.
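
For anyone wanting to check the "hits don't coincide with the silence" claim less by eyeball, here is a rough sketch of the comparison. It assumes the 15-minute error counts have been copied by hand into a dict keyed by block start time; that format, the function names, and the sample numbers are all made up for illustration and are not anything the CSU exports.

```python
from datetime import datetime, timedelta

def block_start(ts):
    """Round a timestamp down to the start of its 15-minute block."""
    return ts - timedelta(minutes=ts.minute % 15,
                          seconds=ts.second,
                          microseconds=ts.microsecond)

def outages_in_error_blocks(error_blocks, outages):
    """Return the outage times whose 15-minute block logged any errors.

    error_blocks: dict of block start time -> error count (hypothetical format)
    outages: list of datetimes at which the port was found silent
    """
    return [t for t in outages if error_blocks.get(block_start(t), 0) > 0]

# Made-up sample data, purely to show the shape of the inputs.
error_blocks = {
    datetime(1996, 10, 6, 21, 0): 12,
    datetime(1996, 10, 6, 21, 15): 0,
}
outages = [datetime(1996, 10, 6, 21, 7, 30), datetime(1996, 10, 6, 21, 20)]
hits = outages_in_error_blocks(error_blocks, outages)
```

If `hits` stays empty across many nights, that would support the impression that the line errors and the silent spells are separate problems.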

> What CSU/DSU are you using?
>
> Periods of silence? Do you mean no traffic through the W1 port as
> indicated by the input and output counters on the W1 port?

Well, no sign of the Cisco anyway... I'm usually in too much of a hurry to get
things back up to see what it's claiming. Next time it dies I'll see what
it has to say for a bit before resetting it.

> > It has me baffled as well. It -seems- to be a software problem from the
> > ability to reset the port, though the framing error count bothers me.
> > This is ComOS 3.3.2 (no c2, but as I understand it, that's mainly ISDN
> > fixes, and we're not doing that at the moment).
>
> If you are getting framing errors, then there is most likely a line
> problem.

Perhaps... which is one of the theories and why I haven't pestered you about
it. But since there are a few people on this list with a lot of telephony
knowhow, and it was brought up.... maybe they have a guess. :)

> I'm not sure what you mean by a problem with the ability to reset the
> port? What are you expecting the command to do and what are you seeing?
> Maybe I can explain or fix the problem.

I dunno, it just seems weird to me that a line hit could so easily kill
protocol. Now, I could understand a certain period of exceptionally crappy
lines making the PM and/or the Cisco give up on PPP for a while... but the line
when it cleans itself up should happily try to bring PPP back up.

We once had a PM2ER with the W1 programmed and not attached to anything... it
sat and spammed about how hard it was trying to get LCP going on it, so I have
no idea why in this case it doesn't seem to try very hard at all. (With debug
to 0x51 I see NOTHING, so it hasn't even noticed the protocol went down.
Perhaps that's the problem? Should a 'keep-alive' sort of signal be a
ComOS x.x.x feature? Cisco has it, and it at least seems to notice the loss of
protocol with it.)
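
To be concrete about what I mean by a keep-alive, here is a minimal sketch of the logic: probe the far end once per interval and call the link down after a few consecutive misses. This is only an illustration; it is not how ComOS or the Cisco actually implements it, and the class name and the threshold of 3 are my own assumptions. (PPP itself does define LCP Echo-Request/Echo-Reply, which is the sort of probe that could feed this.)

```python
class KeepaliveMonitor:
    """Declare a link down after `max_missed` consecutive unanswered keepalives."""

    def __init__(self, max_missed=3):
        self.max_missed = max_missed  # assumed threshold, not a vendor default
        self.missed = 0
        self.link_up = True

    def tick(self, reply_received):
        """Call once per keepalive interval; returns True while the link counts as up."""
        if reply_received:
            # Any reply clears the miss counter and brings the link back up.
            self.missed = 0
            self.link_up = True
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                self.link_up = False
        return self.link_up

monitor = KeepaliveMonitor(max_missed=3)
states = [monitor.tick(reply) for reply in [True, False, False, False, True]]
```

The point is that once the link is marked down, the box knows it should start trying to renegotiate, instead of sitting there believing everything is fine.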

> > In a sick way I am glad to see I'm not the only one with this problem.
>
> I feel bad that you guys are having a problem, how can I help?

Um, give me two or three PM-3s, and I'll be happy. :)

Seriously, I don't know if it's a Livingston thing, a Kentrox thing, an MCI
thing, a Cisco thing, or what. I'm leaning toward telco, since it -seems- to
happen in the evenings more than any other time.. about the time things cool
off outside and temperatures change. (We're still dealing with copper for
local loops, after all...)

> If you want more than email assistance, call Livingston tech support at
> 800-458-9966.

Nah, just need to know if there are any particular guesses on what to check to
figure this out more.

It all looks as happy and peppy as the other systems.... but blech, the errors!
If you say that's definitely an MCI/USWorst/GTE issue, then if my pager starts
annoying me again, I'll pester them to check the line.

Command> version
Livingston Enterprises PortMaster Version 3.3.2
System uptime is 37 days 3 hours 37 minutes
Command> show w1
----------------------- Current Status - Port W1 ---------------------------
Status:        ESTABLISHED
Input:         3260608823        Abort Errors:    5606/154
Output:        1598551026        CRC Errors:      24
Pending:       0                 Overrun Errors:  0
TX Errors:     0                 Frame Errors:    11653/2
Modem Status:  DCD+ CTS+

               Active Configuration      Default Configuration
               --------------------      ---------------------
Port Type:     Netwrk                    Netwrk (Hardwired)
Line Speed:    Ext 1536K                 Ext Clock
Modem Control: on                        on
Remote Host:   cisco-s5.cmc.net          cisco-s5.cmc.net
Local Address: pm-1-w1.edmonds.cmc.net   pm-1-w1.edmonds.cmc.net
Netmask:       255.255.255.252           255.255.255.252
Interface:     ptpW1 (PPP,Listen)        (PPP,Quiet)
Mtu:           1500                      0
Dial Group:    0
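
For the "see what it has to say before resetting it" plan, a small sketch of how two captures of the `show w1` output above could be compared, to see whether Input stalls while the error counters keep climbing. The helper names are my own, the second capture below is invented for illustration, and only the number before the slash is read, since I don't know what the figure after it means.

```python
import re

COUNTERS = ("Input", "Output", "Abort Errors", "CRC Errors", "Frame Errors")

def parse_counters(text):
    """Pull the first integer following each known counter label."""
    counters = {}
    for name in COUNTERS:
        match = re.search(re.escape(name) + r":\s*(\d+)", text)
        if match:
            counters[name] = int(match.group(1))
    return counters

def diff_counters(before, after):
    """Per-counter change between two captures (negative means a reset or wrap)."""
    return {name: after[name] - before[name] for name in before if name in after}

# First capture: counter lines taken from the output in this message.
before = parse_counters("""
Input: 3260608823 Abort Errors: 5606/154
Output: 1598551026 CRC Errors: 24
TX Errors: 0 Frame Errors: 11653/2
""")

# Second capture: hypothetical values, as if taken during a silent spell.
after = parse_counters("""
Input: 3260608823 Abort Errors: 5606/154
Output: 1598552000 CRC Errors: 24
TX Errors: 0 Frame Errors: 11660/2
""")

delta = diff_counters(before, after)
```

An Input delta of zero alongside a growing Output count would suggest the port is still sending but hearing nothing, which is the symptom described.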

>
> JGT
> --
> John G. Thompson Livingston Enterprises Inc. Phone: (800) 458-9966
> JOAT(MON) 6920-220 Koll Centre Pkwy. Fax: (510) 426-8951
> support@livingston.com Pleasanton, CA 94566 http://www.livingston.com