Christophe Casalegno

How to automatically take an action if the load average is too high

When you manage servers, it can be useful to take a number of actions automatically (restart a service, add a firewall rule, kill a process, send an alert, etc.) when the load average is too high. The problem with many shell scripts written for this is that if the load average comes down only slowly after the action, the script may fire again and run another kill, restart, etc., which can cause service downtime.

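The solution is to check two states of the load average before running your commands: the script reads the load twice, a few seconds apart, and acts only if the first reading is at or above the limit and the second reading is not lower than the first.

Here is the code: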

#!/bin/sh
tsleep=2   # seconds to wait between the two checks
llimit=8   # load limit before action
alert=monitor@christophe-casalegno.com   # put your monitoring alert email here
host=$(hostname -f)
# Integer part of the 1-minute load average, read now
load=$(awk '{print $1}' /proc/loadavg | cut -d "." -f1)
sleep "$tsleep"
# Same reading, taken again after tsleep seconds
load2=$(awk '{print $1}' /proc/loadavg | cut -d "." -f1)
if test "$load" -ge "$llimit"
then
        # Act only if the load has not dropped since the first reading
        if test "$load2" -ge "$load"
        then
                date=$(date)
                echo "The Load Average has reached $load and $load2 on $host" | mail -s "$host : High Load Average Alert" "$alert"
                echo "$date : The Load Average has reached $load and $load2 on $host" >> /var/log/loadavg.log
        else
                echo "ok" 1>&2
        fi
else
        sleep 1
fi
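
To make the two-check rule easier to reuse for other actions (restart, kill, firewall rule), the decision can be isolated in a small function. This is only a sketch: the name should_act and its argument order are hypothetical and are not part of the original script. It is exercised here with sample values instead of live readings from /proc/loadavg.

```shell
#!/bin/sh
# Hypothetical helper (not in the original script).
# Succeeds (exit status 0) only when the first reading is at or above the
# limit AND the second reading has not dropped below the first one.
# Arguments: first reading, second reading, limit (all integers).
should_act() {
        first=$1
        second=$2
        limit=$3
        [ "$first" -ge "$limit" ] && [ "$second" -ge "$first" ]
}

# Sample values with a limit of 8, in place of live load readings:
should_act 9 10 8 && echo "9 then 10: act"
should_act 9 7 8 || echo "9 then 7: load is falling, skip"
should_act 3 5 8 || echo "3 then 5: below the limit, skip"
```

With the variables of the script above, the call would look like should_act "$load" "$load2" "$llimit", followed by whatever action you want to run.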

Now you just have to run this script from your crontab, every minute for example, and say good night to your server ;). You can also download this script directly from: https://www.christophe-casalegno.com/tools/loadcheck.sh
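
As an illustration of the crontab step, the snippet below builds a crontab line that runs the script every minute. The install path /usr/local/bin/loadcheck.sh is an assumption; adjust it to wherever you saved the script. The snippet only prints the line and does not modify your crontab.

```shell
#!/bin/sh
# Assumed install path for the script; change it to your own location.
script_path=/usr/local/bin/loadcheck.sh

# Five "*" fields mean: every minute of every hour, every day.
cron_entry="* * * * * $script_path"

# Print the line so it can be pasted into "crontab -e".
echo "$cron_entry"

# To append it to the current user's crontab instead, one option is:
#   ( crontab -l 2>/dev/null; echo "$cron_entry" ) | crontab -
```

Since the script writes to /var/log/loadavg.log, the crontab it runs from needs write access to that file, which in practice usually means root's crontab.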
