Health check script Linux

Below I present a script that checks the load average on a Linux server and sends a report if it becomes too high.
It uses the Linux command 'uptime', which prints the server's uptime as well as its load average:

  # uptime
  21:53:51 up 43 days,  2:19,  3 users,  load average: 1.63, 0.70, 8.29

In my script below I use not the one-minute load average but the five-minute one (the second value).
If you are not familiar with it, a load average of "1.63, 0.70, 7.89" on a single-CPU system can be interpreted as:

– during the last minute, the system was overloaded by 63% on average (1.63 runnable processes, so 0.63 processes had to wait for a turn on a single-CPU system on average).

– during the last 5 minutes, the CPU was idle 30% of the time on average.

– during the last 15 minutes, the system was overloaded by 689% on average (7.89 runnable processes, so 6.89 processes had to wait for a turn on a single-CPU system on average).
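
The arithmetic behind this interpretation can be sketched in a few lines of shell. This is only an illustration: the helper name `describe_load` and its optional CPU-count argument are my own assumptions and are not part of the script presented in this article.

```shell
#!/bin/bash
# describe_load LOAD [CPUS]: hypothetical helper, not part of the original script.
# Prints how far a load average is above or below the capacity of CPUS
# processors (CPUS defaults to 1, matching the single-CPU example).
describe_load() {
    awk -v load="$1" -v cpus="${2:-1}" 'BEGIN {
        pct = (load / cpus - 1) * 100
        if (pct > 0)
            printf "overloaded by %.0f%%\n", pct
        else
            printf "idle %.0f%% of the time\n", -pct
    }'
}

describe_load 1.63   # overloaded by 63%
describe_load 0.70   # idle 30% of the time
describe_load 7.89   # overloaded by 689%
```

On a machine with more than one CPU the same load means less pressure, which is why the sketch divides by the CPU count before comparing with 1.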

I chose the five-minute interval because sending mail every minute is too aggressive when the server is under load. A one-minute spike may also be too short to matter and can be handled by the server without any notification.

  #!/bin/bash
  # Here you set the maximum Load Average you want to be notified on.
  # It depends on the number of processes and the usual server load.
  max_loadavge=3

  # In case of load average: 3.63, 5.70, 3.29 - will extract 5.70
  loadavg=$(uptime | awk -F'load average:' '{print $2}' | awk '{gsub(",",""); print $2}')

  # Prints the load average with current time stamp into log file.
  # It will save according to the
  # You may comment this line out in case you do not need it
  echo "Load Average:$loadavg on $(date)" >> /var/log/load.log

  # Set the e-mail address you will be notified at here.
  EMAIL1="email@domain.com"

  # Optional secondary e-mail
  #EMAIL2="email2@domain.com"

  # Checks if the load average from the 'uptime' command is greater than the maximum.
  # awk compares the values as numbers; [[ "$a" > "$b" ]] would compare them
  # as strings, so that e.g. 10.50 would not count as greater than 3.
  if awk -v load="$loadavg" -v max="$max_loadavge" 'BEGIN { exit !(load + 0 > max + 0) }'
  then

      # Creates notification message subject
      SUBJECT="$(hostname) LOAD AVERAGE ALERT $loadavg(>$max_loadavge)"

      # Both lines below will be the body of the message. It will contain the current
      # load average with time stamp as well as a list of the current processes
      # running on the server, sorted by CPU usage
      ETEXT1="Load Average:$loadavg on $(date)\n\n"
      ETEXT2="USER     PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND\n$(ps aux | sort -rgk3)"

      echo -e "$ETEXT1$ETEXT2" | mail -s "$SUBJECT" "$EMAIL1"

      # If you are going to send to more than one e-mail the line will be:
      # echo -e "$ETEXT1$ETEXT2" | mail -s "$SUBJECT" "$EMAIL1" "$EMAIL2"

      # Here you can also execute scripts for restarting particular services e.g. Apache, MySQL

  fi

  exit
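
One detail of the threshold check deserves a closer look. Load averages are decimal values, and bash's `[[ a > b ]]` compares strings lexicographically, so "10.20" sorts before "3". A minimal sketch of a numeric check follows; the helper name `load_exceeds` is hypothetical and only serves the illustration.

```shell
#!/bin/bash
# load_exceeds LOAD MAX: hypothetical helper, not part of the original script.
# Succeeds (exit status 0) only when LOAD is numerically greater than MAX.
# Adding 0 forces awk to treat both values as numbers.
load_exceeds() {
    awk -v load="$1" -v max="$2" 'BEGIN { exit !(load + 0 > max + 0) }'
}

if load_exceeds 10.20 3; then
    echo "10.20 is above the threshold of 3"
fi

# The string comparison gets the same case wrong:
if [[ "10.20" > "3" ]]; then
    echo "string comparison agrees"
else
    echo "string comparison says 10.20 is not greater than 3"
fi
```

I use awk here because it is available on practically every Linux system; `bc` would work as well if it is installed.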

* Note that some settings may need tuning because of changed/different command output
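
If the format of the 'uptime' output is a concern, one alternative on Linux is to read the values from /proc/loadavg, whose first three fields are the 1, 5 and 15 minute load averages. The sketch below is not what the script above does; the helper name `five_minute_load` and its optional file argument are assumptions made for the example.

```shell
#!/bin/bash
# five_minute_load [FILE]: hypothetical helper, not part of the original script.
# Prints the second field of FILE (default: /proc/loadavg), which on Linux
# holds the five-minute load average.
five_minute_load() {
    local one five fifteen rest
    read -r one five fifteen rest < "${1:-/proc/loadavg}"
    echo "$five"
}

# Only call it without arguments where /proc/loadavg actually exists.
if [ -r /proc/loadavg ]; then
    five_minute_load
fi
```

The optional file argument is there only so that the function can be tried against a saved sample line.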

Once the script is ready you can set it up as a cron job. Mine is set to run every five minutes:

  # .---------------- minute (0 - 59)
  # |  .------------- hour (0 - 23)
  # |  |  .---------- day of month (1 - 31)
  # |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
  # |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
  # |  |  |  |  |
  # *  *  *  *  * command to be executed
  */5 * * * * /root/scrpts/health.check.sh > /dev/null 2>&1
