Re: Strange Radius Messages (fwd) REVISITED

John R. Jamerson (jamerson@coastalnet.com)
Fri, 28 Feb 1997 21:16:57 -0500

>Once upon a time Robin Dua shaped the electrons to say...
>>Is there a fix to this & could this be causing random disconnects at all?
>
>1. That means the network and/or RADIUS server is slow enough that the PM
>isn't getting a reply within 3 seconds so it asked again.
>2. No. This message has to do with calls that aren't yet connected and means
>nothing for connected users.
>
>>Secondly, we are getting the following console message on our
>>authentication server:
>>
>>Jan 17 17:43:33 tdcore radius[131]: sending SIGHUP signal to unresponsive
>>child process 29391
>>
>>What does this mean and what can we do to fix this problem?
>
>It means the RADIUS server lost communication with one of the forked off
>child processes. So it killed the forker.
>
>This is normal for a UNIX daemon that cleans up after itself. It is
>only a problem if you get a lot of them. And then the issue is usually
>the machine isn't quick enough or has too high a loading.
>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>-MZ

Alright, MZ. I've got a brand new, shiny Sparc Ultra with Solaris 2.5.1 (and
all the patches) that's putting "unresponsive child" msgs in the logfile
approximately 200 times/hr.
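
For anyone who wants to put a number on the "unresponsive child" rate on
their own box, here is a minimal sketch in Python that tallies the msgs per
hour. It assumes each entry sits on one line in the syslog format quoted
above (month, day, time, host, radius[pid]); the function name
count_per_hour and the regular expression are illustrative only, not
anything that ships with Radius.

```python
import re
from collections import Counter

# Assumed line format (one entry per line), e.g.:
# Jan 17 17:43:33 tdcore radius[131]: sending SIGHUP signal to unresponsive child process 29391
LINE_RE = re.compile(
    r"^(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}"
    r"\s+\S+\s+radius\[\d+\]:.*unresponsive\s+child"
)


def count_per_hour(lines):
    """Return a Counter keyed by (month, day, hour) of 'unresponsive child' entries."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            key = (m.group("mon"), int(m.group("day")), int(m.group("hour")))
            counts[key] += 1
    return counts
```

Pass it any iterable of lines, for example an open logfile handle; hours
with no matching entries simply come back as 0.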

It's not highly loaded (just brought it on line today -- nothing on it yet);
current load at primetime (9PM) Friday night is 0.02, 0.05, 0.05. Have 30 (+/-)
portmasters using it as Authentication and Accounting Primary host. And it
_sure_ ain't too slow. Running Radius 2.0 in dbm mode.

Symptoms:

Radius would start, go for about 5 minutes, then the logfile would show rejects
for supposedly unknown users (even though they were in the passwd file and
were valid users), then start spewing unresponsive child msgs, then die.

Restarting Radius produced the same results, time after time.

Finally dug out a msg from the PM list by Brian Elfert (sent 2/19/97) that
told how to change the nscd.conf file in Solaris 2.5.1 to stop (not solve)
the problem.

Did that. Restarted Radius, it no longer rejects valid users as "unknown", and
no longer dies. However, it does continue to fill the logfile with
"unresponsive child" msgs.

I've got 5 other Sparcs here that I'm upgrading to Solaris from SunOS 4.1.x this
weekend. Will apply the nscd.conf changes to each as we bring 'em up.

Is this gonna be a "gotcha"??

John Jamerson | Internet Access
Systems Administrator/WAN Mgr | In this business, yesterday is
CoastalNet/Global Info Exchange | history. The day before is
http://www.coastalnet.com | archaeology. (Me: 1995)