> We have a customer who runs a PM-25 and RADIUS 2.0 (on Solaris).
> Recently, he gets many repeating error messages all over his console
> screen, which resembles the following:
>
> "Mar 20 11:15:39 hn radius [22313]:Sending SIGHUP signal to
> unresponsive child process 22328."
>
> Anyone knows what the possible cause is?
In my opinion this problem is caused by a bug in RADIUS 2.0.
This is how RADIUS is expected to work:
1. parent process forks a child (function rad_spawn_child)
2. parent process registers the child process in a data structure
called 'authreq'
3. child process handles the request and terminates
4. parent process catches SIGCLD, the signal handler finds the
child process registered in the mentioned data structure and
clears the entry.
On some systems the process scheduler runs the child process
before the parent process. Because the two are not synchronised,
the sequence can become:
1. parent process forks a child
2. child process handles the request and terminates
3. parent process gets SIGCLD, the handler does not (!) find the
child process registered in the data structure so it does
not do anything.
4. parent process registers the child process in the data structure
(now it is too late!). The entry is never cleared and causes
the annoying message later.
I have also notified Livingston customer support, but have received
no answer yet.
Vlado Potisk
vlado_potisk@tempest.sk