[PLUG] Query related to Process Table overflow

Pranav Peshwe pranavpeshwe at gmail.com
Fri Oct 9 04:31:11 PDT 2009


On Fri, Oct 9, 2009 at 4:34 PM, Mandar Vaze <mandarvaze at gmail.com> wrote:

> Hi,
>
> We are running into an issue where users are unable to login to the
> RHEL 4.7 machine even via console.
> Both username and password are accepted, and then after a while users
> are returned to the login prompt
>
> Here is my "theory" :
> I suspect that this may be the case of Process Table overflow, where
> OS is unable to create a new process.
> In order to login the user, after password verification, OS needs to
> provide an interactive shell to the user
> If the Process Table is full, no new process can be created including a
> shell.
> But all the existing processes would continue to run - hence the login
> prompt.
>
> Query:
> 1. How do I prove or disprove my theory ? The big problem is that I
> can't get "into" the system to troubleshoot, and the problem goes away
> after a reboot.
>

Hi,
     After the reboot, log in to the machine and keep that session open.
Then wait for your problem to occur.
When you find that users can no longer log in, try creating any new process
from the shell you already have on the machine. If you can create a process,
then process table overflow (PTO) is not the problem and you can safely skip
reading the rest of this reply :)
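
The simplest check is to run any external command from the open shell. If you
would rather see the exact error, the check above can be sketched in Python as
below. This is only an illustration: the function name can_create_process is
made up here, and the interpreter must already be running before the problem
starts, because launching it needs a new process too.

```python
import errno
import os


def can_create_process():
    """Try a single fork and report whether the kernel allowed it."""
    try:
        pid = os.fork()
    except OSError as exc:
        # EAGAIN: a process limit (per-user nproc or system-wide) was hit.
        # ENOMEM: the kernel could not allocate memory for the new process.
        if exc.errno in (errno.EAGAIN, errno.ENOMEM):
            return False
        raise
    if pid == 0:
        # Child: exit immediately without running any cleanup handlers.
        os._exit(0)
    # Parent: reap the child so it does not linger as a zombie.
    os.waitpid(pid, 0)
    return True


if __name__ == "__main__":
    if can_create_process():
        print("fork ok")
    else:
        print("fork failed: process limit or memory exhausted")
```

If this reports a failure while logins are also failing, that supports the
theory. If the fork succeeds, the cause is probably elsewhere.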


> 2. How do I "catch" a rogue process (which may be forking too many
> processes resulting into PT overflow)
>
> 3. How do I prevent this from happening in future ?
>
>
Setting an nproc limit in /etc/security/limits.conf can be a good start.
You can then raise the limit for users one by one. If the system becomes
unusable after you raise it for a particular user, you have found the rogue
user. Then you can narrow it down to the process that user runs.
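
To see which user is accumulating processes before the machine locks up, a
rough sketch like the one below could be run periodically from a session
opened after the reboot. It assumes that the owner of each numeric directory
under /proc is the user running that process, which is normally the case on
Linux. The function name count_processes_by_uid is hypothetical. Note that the
nproc limit also counts threads, so these numbers are only an approximation of
what the limit sees.

```python
import os
from collections import Counter


def count_processes_by_uid(proc_root="/proc"):
    """Return a Counter mapping UID -> number of process directories."""
    counts = Counter()
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            uid = os.stat(os.path.join(proc_root, name)).st_uid
        except OSError:
            continue  # the process exited between listdir() and stat()
        counts[uid] += 1
    return counts


if __name__ == "__main__":
    # Print the five UIDs that currently own the most processes.
    for uid, n in count_processes_by_uid().most_common(5):
        print(uid, n)
```

A UID whose count keeps climbing between runs is the first place to look.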

Also, check that your maximum pid limit (/proc/sys/kernel/pid_max) is not
set unusually low. It is 32768 on my machine running Ubuntu 9.04.
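
Reading the value is a one-liner with cat, but if you are already collecting
numbers from a script, a minimal sketch could look like this. The helper name
read_pid_max is made up for illustration. The path is the one mentioned above.

```python
def read_pid_max(path="/proc/sys/kernel/pid_max"):
    """Return the kernel's maximum PID value as an integer."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    print("pid_max =", read_pid_max())
```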

HTH.

People, kindly CMIIW.

- P

