Briefly, here's the core problem:
o A user dials in (with Win 3.1, Win 95, or a Mac), establishes a
PPP connection, [usually] gets an IP address assigned dynamically,
then is unable to access www, pop, smtp, news, telnet, ftp, etc.
Pinging them from a remote site works. In some instances, the
inability to access services does not occur immediately. There are
numerous cases where they will be able to work fine for up to 15-20
minutes before the connection seems to lock up (usually with Win 95).
o Other (maybe unrelated) problems are seemingly frequent loss of
connection to the modems and moderately frequent failure in
authentication of the user.
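
For anyone who wants to reproduce the first symptom from the dial-in
side, here is a minimal sketch (not something we have run on our
network) that tries a plain TCP connect to each of the services named
above and reports whether the attempt succeeded, was refused, or timed
out. The port numbers are the standard well-known ones; the function
names and the 5 second timeout are just assumptions for illustration.

```python
import socket

# Standard well-known TCP ports for the services listed above.
SERVICES = {"ftp": 21, "telnet": 23, "smtp": 25,
            "www": 80, "pop": 110, "news": 119}


def probe(host, port, timeout=5.0):
    """Attempt one TCP connect and classify the outcome as a string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout: consistent with replies
        # being lost somewhere on the path back to the caller.
        return "timeout"
    except OSError as exc:
        return "error: %s" % exc


def probe_all(host, services=SERVICES, timeout=5.0):
    """Probe every service on one host; returns {service name: outcome}."""
    return {name: probe(host, port, timeout)
            for name, port in sorted(services.items())}
```

Calling probe_all() with the address of the server under test, once
from a dial-up session that is working and once from one that has
locked up, would show whether every service times out together (which
points at the path) or only some do (which points at the server).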
The equipment:
Although we have sites all over (which all seem to be experiencing
the problem to some degree), we isolated a small sub-set of the
network and the problem still seems to occur with regularity. The
layout (after isolation of other equipment) is:
o A pm2e30 (ComOS 3.3.1c1) with a combination of MultiTech 28.8,
Supra Fax 28.8, AT&T Paradyne 28.8, AT&T Paradyne 14.4 and a Motorola
BitSurfer Pro ISDN modem connected to 10BaseT. The 10BaseT is
connected to an Ethernet hub. Also connected to the hub is a
486DX4-100 running Linux 1.2.13. The Linux box has two Boca 16 port
serial boards with a bunch of AT&T Paradyne 14.4 modems.
o The Linux box runs radiusd and is set up as the primary
authentication server and the primary accounting server. It is also
the name server for the network (named) and the www server (httpd).
Some details:
o Running tcpdump on eth0 of the Linux box while someone tries to
telnet yields some packet traffic. There appears to be a packet
from the user's nterm port to the Linux box's telnet port, followed
by what appears to be a reply packet. It would appear that the
reply packet is not getting processed because the Linux box keeps
sending what appear to be identical packets at intervals. The same
results can be seen when the user tries to access a local www page
(http port, not telnet). With tracing turned on in Trumpet TCPMan,
the packets don't seem to show up back at the user's machine.
o Some users seem to be able to work reliably.
o Dialing in to a modem on one of the boca boards seems to bypass the
problem.
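
The "identical packets at intervals" pattern in the tcpdump
observation above looks like ordinary TCP retransmission, which is
what a sender does when its segment is never acknowledged. As a rough
sketch of how a capture could be checked for it, the code below groups
segments by source, destination, sequence number and length, and
reports any that were sent more than once along with the gaps between
sends. It assumes the capture has already been parsed into records;
the Segment type and its field names are made up for this example and
are not tcpdump's own output format.

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    """One captured TCP segment (hypothetical, already-parsed form)."""
    time: float   # seconds since the start of the capture
    src: str      # sender as "host:port"
    dst: str      # receiver as "host:port"
    seq: int      # TCP sequence number
    length: int   # payload bytes carried


def find_retransmissions(segments):
    """Return {(src, dst, seq, length): [gaps in seconds]} for every
    segment that appears more than once in the capture."""
    seen = defaultdict(list)
    for seg in segments:
        seen[(seg.src, seg.dst, seg.seq, seg.length)].append(seg.time)

    repeats = {}
    for key, times in seen.items():
        if len(times) > 1:
            times.sort()
            # Gaps between successive sends of the same segment.
            repeats[key] = [round(b - a, 3)
                            for a, b in zip(times, times[1:])]
    return repeats
```

Gaps that grow from one send to the next would fit retransmission
backoff. Note that zero-length segments (bare ACKs) can legitimately
repeat with the same sequence number, so entries with length 0 need a
closer look before being counted as retransmissions.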
Other data can be provided if it would be of any help. We have been
hunting this critter for weeks and have accumulated a huge amount of
observations.
================================================
Bottom line. We are about at the end of our rope. The question I am
asking is whether there are any generous souls out there who can offer
some suggestions as to how to proceed. We are looking for tools to
assist in troubleshooting and/or advice on where to look. Although a
couple of people at Livingston Tech Support have really tried to assist
us, the people there who really understand the guts of the PortMaster
have not only been less than helpful, they have lied to us, avoided
our calls after promising to help, and generally treated us with
disrespect. We are at the point where we believe the problem may be
on the Linux box (which doesn't excuse the crappy treatment from some
of the people at Livingston), but can't seem to make any more
progress on nailing it down. Any assistance at all would be
appreciated. Please reply to wizard@magick.net Thanx...
Warren D. 'Cal' Calhoun <psyop@magick.net>
System Administrator for Magick Net, Inc.
233 N.E. "B" Street, Grants Pass, OR 97526
[TEL] +1-541-471-2542 [FAX] +1-541-471-7168