nagios-event-trigger-AutoRecover
Approach:
Use NRPE to execute the necessary commands on the remote hosts.
To adapt the scheme from the Nagios docs so that it also works on remote servers, three things need to be done:
1. The command executed by the event handler script must be changed to use NRPE.
2. On the remote machine, the nagios user (under which the NRPE service runs) must be given sudo rights so that it is actually allowed to start a service.
3. The NRPE configuration on the remote machine must be changed to include the new command(s) for starting services.
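The article does not show the sudo rule needed for step 2. As a minimal sketch, the helper below prints a sudoers fragment that lets the nagios user run only the two restart commands used later in nrpe.cfg, without a password. The function name `print_sudoers_fragment` is hypothetical, and the `!requiretty` line is an assumption that only matters on systems where `requiretty` is enabled.

```shell
#!/bin/sh
# Hypothetical helper: print a sudoers fragment for the NRPE user.
# $1 is the user name; every further argument is one command line
# that the user may run as root without a password.
print_sudoers_fragment() {
    user="$1"
    shift
    # NRPE runs without a terminal, so requiretty (if enabled) must be off
    printf 'Defaults:%s !requiretty\n' "$user"
    for cmd in "$@"; do
        printf '%s ALL=(root) NOPASSWD: %s\n' "$user" "$cmd"
    done
}

# The two restart commands that appear in nrpe.cfg in this article
print_sudoers_fragment nagios \
    "/etc/init.d/gmond restart" \
    "/usr/local/nagios/libexec/restart_mysqld"
```

The output can be redirected to a file and checked with `visudo -cf <file>` before it is installed on the remote machine. Listing exact command lines rather than granting `ALL` keeps the nagios user from running anything else as root.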
1. On the Nagios management server
(1) vi localhost.cfg
define service{
    use                     generic-service
    host_name               test2.bigdata.com
    service_description     gmond
    check_command           check_nrpe_eventhandler!check_gmond
    notifications_enabled   1
    notification_interval   0
    max_check_attempts      4
    event_handler           restart-service!gmond
}

define service{
    use                     generic-service
    host_name               test2.bigdata.com
    service_description     mysqld
    check_command           check_nrpe_eventhandler!check_mysqld
    notifications_enabled   1
    notification_interval   0
    max_check_attempts      5
    event_handler           restart-service!mysqld
}
(2) vi commands.cfg
define command{
    command_name    check_nrpe_eventhandler
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 40 -c $ARG1$
}

define command{
    command_name    restart-service
    command_line    $USER1$/eventhandlers/event_handler_script.sh $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$ $ARG1$ $SERVICEDESC$
}
The restart-service command calls the generic event handler script /usr/local/nagios/libexec/eventhandlers/event_handler_script.sh (listed in step 3) and passes it the name of the monitored service through $ARG1$.
2. On the remote machine running NRPE
(1) vi nrpe.cfg
command[check_gmond]=/usr/local/nagios/libexec/check_gmond
command[restart_gmond]=/usr/bin/sudo /etc/init.d/gmond restart
command[check_mysqld]=/usr/local/nagios/libexec/check_mysqld
command[restart_mysqld]=/usr/bin/sudo /usr/local/nagios/libexec/restart_mysqld
(2) Write your own service management scripts on the remote machine:
/usr/local/nagios/libexec/check_mysqld
/usr/local/nagios/libexec/restart_mysqld
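The article does not show the contents of these two scripts. As a rough sketch of what a check script such as check_mysqld might look like, the function below follows the standard Nagios plugin convention (one status line on stdout, exit status 0 for OK and 2 for CRITICAL). The function name `check_process` and the use of `pgrep -x` to detect the daemon are assumptions, not the author's implementation.

```shell
#!/bin/sh
# Hypothetical check plugin sketch: report whether a process with the
# given exact name is running, using Nagios plugin return codes.
check_process() {
    name="$1"
    if pgrep -x "$name" >/dev/null 2>&1; then
        echo "OK - $name is running"
        return 0
    else
        echo "CRITICAL - $name is not running"
        return 2
    fi
}

# Demo call. A real plugin file would instead end with:
#   check_process mysqld; exit $?
check_process mysqld || echo "(plugin would exit with status $?)"
```

A matching restart_mysqld script could be as small as a wrapper around the service's init script (for example `/etc/init.d/mysqld restart`, which is an assumed path), since the sudo rule and the nrpe.cfg entry only need one fixed command to call.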
3. The generic event handler script /usr/local/nagios/libexec/eventhandlers/event_handler_script.sh contains the following
#!/bin/sh
#
# Event handler script for restarting a service on a remote machine via NRPE
#
# Note: This script will only restart the service if the check has been
#       retried 3 times (in a "soft" state) or if the service somehow
#       manages to fall into a "hard" error state.
#
# Arguments: $1 = $SERVICESTATE$   $2 = $SERVICESTATETYPE$   $3 = $SERVICEATTEMPT$
#            $4 = $HOSTADDRESS$    $5 = $ARG1$ (service name) $6 = $SERVICEDESC$
#
# update 2015/10/23
# version: 0.2

date=`date`

# What state is the service in?
case "$1" in
OK)
    # The service just came back up, so don't do anything...
    ;;
WARNING)
    # We don't really care about warning states, since the service is probably still running...
    ;;
UNKNOWN)
    # We don't know what might be causing an unknown error, so don't do anything...
    ;;
CRITICAL)
    # Aha! The service appears to have a problem - perhaps we should restart it...
    # Is this a "soft" or a "hard" state?
    case "$2" in

    # We're in a "soft" state, meaning that Nagios is in the middle of retrying the
    # check before it turns into a "hard" state and contacts get notified...
    SOFT)
        # What check attempt are we on? We don't want to restart the service on the first
        # check, because it may just be a fluke!
        case "$3" in

        # Wait until the check has been tried 3 times before restarting the service.
        # If the check still fails once max_check_attempts is reached, the state
        # type will turn to "hard" and contacts will be notified of the problem.
        # Hopefully this will restart the service successfully, so the next check will
        # result in a "soft" recovery. If that happens no one gets notified because we
        # fixed the problem!
        3)
            printf 'Restarting service %s (3rd soft critical state)...\n' "$6"
            # Call NRPE to restart the service on the remote machine
            /usr/local/nagios/libexec/check_nrpe -H "$4" -c "restart_$5"
            echo "$date -restart $6 on server $4 -at retry $3 times -SOFT" >> /tmp/eventhandlers
            ;;
        esac
        ;;

    # The service somehow managed to turn into a hard error without getting fixed.
    # It should have been restarted by the code above, but for some reason it didn't.
    # Let's give it one last try, shall we?
    # Note: Contacts have already been notified of a problem with the service at this
    # point (unless you disabled notifications for this service)
    HARD)
        printf 'Restarting %s service...\n' "$6"
        echo "$date -restart $6 on server $4 -at retry $3 times -HARD" >> /tmp/eventhandlers
        # Call NRPE to restart the service on the remote machine
        /usr/local/nagios/libexec/check_nrpe -H "$4" -c "restart_$5"
        ;;
    esac
    ;;
esac
exit 0
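To make the handler's decision rule easier to check without touching a real service, the sketch below isolates it in a small function and dry-runs it against a few state combinations. The function name `should_restart` is hypothetical. It only mirrors the case logic of the script (restart on the 3rd SOFT CRITICAL attempt or on any HARD CRITICAL state) and does not call NRPE.

```shell
#!/bin/sh
# Hypothetical dry-run helper mirroring the event handler's decision rule.
# Arguments: $1 = service state, $2 = state type, $3 = check attempt.
# Returns 0 when the handler would trigger a restart, non-zero otherwise.
should_restart() {
    state="$1"
    state_type="$2"
    attempt="$3"
    # Only CRITICAL states ever lead to a restart
    [ "$state" = "CRITICAL" ] || return 1
    case "$state_type" in
        SOFT) [ "$attempt" -eq 3 ] ;;   # restart only on the 3rd soft attempt
        HARD) return 0 ;;               # last try once the state is hard
        *)    return 1 ;;
    esac
}

# Print the decision for a few sample inputs
for args in "CRITICAL SOFT 1" "CRITICAL SOFT 3" "CRITICAL HARD 4" "OK HARD 1"; do
    # Intentionally unquoted: split the sample into three arguments
    set -- $args
    if should_restart "$@"; then
        echo "$args -> restart"
    else
        echo "$args -> no action"
    fi
done
```

Because the restart fires on attempt 3, max_check_attempts must be at least 4 for the service to get a chance to recover in a soft state before contacts are notified, which matches the values 4 and 5 used in localhost.cfg.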