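Cloudera Manager supports three ways of delivering alerts: email, SNMP trap forwarding, and custom alert scripts. This article shows how to publish CM alerts through a custom alert script.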

1. Create the directory

Create the directory on the node that runs the Alert Publisher service:

mkdir /opt/cloudera/alert_script

2. Create the scripts

The Alert Publisher service runs the custom script through bash, so first write an alert.sh script that receives the raw alert data:

vim alert.sh
#!/bin/bash

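# Path of the raw alert file (JSON), passed by Alert Publisher as the first argument
OG_ALERT_FILE=$1

# Feed the alert data to the custom Python script for parsing.
# alert.py uses Python 3 syntax, so `python` must resolve to a Python 3 interpreter.
python /opt/cloudera/alert_script/alert.py < "$OG_ALERT_FILE"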

alert.sh passes the data to the Python script for parsing. The following function from alert.py parses the alert data:

def get_alert_data():
    """
    Parse the raw alert data received on stdin.
    """
    alert_data = json.load(sys.stdin)

    alert_data_list = []
    for item in alert_data:
        alert = item["body"]["alert"]['attributes']
        cluster = alert['CLUSTER'][0]
        hostname = alert.get('HOSTS', [None])[0]
        # epochMs is in milliseconds; convert to whole seconds before formatting
        _time = int(item["body"]["alert"]['timestamp']['epochMs']) // 1000
        _time = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(_time))
        health_test = alert['HEALTH_TEST_RESULTS'][0]
        alert_service = health_test['testName']
        alert_event_code = health_test['eventCode']
        alert_severity = health_test['severity']
        alert_content = health_test['content']

        # msg = f"Cluster: {cluster}, host: {hostname}, severity: {alert_severity}, time: {_time}, " \
        #       f"service: {alert_service}, event code: {alert_event_code}, details: {alert_content}"
        msg = [cluster, hostname, alert_severity, _time, alert_service, alert_event_code, alert_content]

        alert_data_list.append(msg)

    return alert_data_list
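
To make the expected input shape concrete, the sketch below runs a hand-written sample through the same field lookups. The sample payload is an assumption reconstructed from the keys the function reads; real Alert Publisher output carries more attributes, and the cluster name, host name, test name, and event code here are made up. `parse_alerts` is a hypothetical stand-in that takes a string instead of reading stdin.

```python
import json
import time

# Hypothetical sample: only the keys read by get_alert_data() are included.
SAMPLE = json.dumps([
    {
        "body": {
            "alert": {
                "timestamp": {"epochMs": 1600000000123},
                "attributes": {
                    "CLUSTER": ["cluster1"],
                    "HOSTS": ["node01.example.com"],
                    "HEALTH_TEST_RESULTS": [
                        {
                            "testName": "EXAMPLE_HEALTH_TEST",
                            "eventCode": "EV_EXAMPLE",
                            "severity": "CRITICAL",
                            "content": "The health test result is bad.",
                        }
                    ],
                },
            }
        }
    }
])


def parse_alerts(raw_json):
    """Return one row per alert, in the same column order as get_alert_data()."""
    rows = []
    for item in json.loads(raw_json):
        alert = item["body"]["alert"]
        attrs = alert["attributes"]
        result = attrs["HEALTH_TEST_RESULTS"][0]
        # epochMs is in milliseconds; time.localtime() expects seconds.
        seconds = alert["timestamp"]["epochMs"] // 1000
        when = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(seconds))
        rows.append([
            attrs["CLUSTER"][0],
            attrs.get("HOSTS", [None])[0],
            result["severity"],
            when,
            result["testName"],
            result["eventCode"],
            result["content"],
        ])
    return rows


rows = parse_alerts(SAMPLE)
print(rows[0])
```

The formatted time depends on the local time zone of the machine running the script, so it is worth checking that the Alert Publisher node uses the zone you expect.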

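Once the data is parsed, alerts can be delivered by email, a DingTalk robot, a WeCom (企业微信) robot, and so on. This article uses email as the example. The complete alert.py script follows: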

#!/usr/bin/env python
# coding=utf-8
# Author: Yujichang
# desc: custom alert script for CDH

import json
import sys
import time
import smtplib
from email.mime.text import MIMEText


class SendMail(object):
    """Send alert emails through a custom SMTP account."""
    def __init__(self):
        self.mail_host = 'smtp.mxhichina.com'
        self.mail_user = 'yujichang@deepexi.com'
        self.mail_passwd = '*****'
        self.mail_name = 'yujichang@deepexi.com'
        self.mail_to_user = ['yujichang@deepexi.com']

    def send_mail(self, content):
        sender = "CDHMonitor"+"<"+self.mail_name+">"
        msg = MIMEText(content, 'html', 'utf-8')
        msg['Subject'] = 'CDH service status check'
        msg['From'] = sender
        msg['To'] = ",".join(self.mail_to_user)
        try:
            # Pass host and port to the constructor: calling connect() on a bare
            # SMTP_SSL() fails on some Python 3 versions.
            s = smtplib.SMTP_SSL(self.mail_host, 465)
            s.login(self.mail_user, self.mail_passwd)
            s.sendmail(sender, self.mail_to_user, msg.as_string())
            s.quit()
            return True
        except Exception as e:
            print(e)
            return False


def get_alert_data():
    """
    Parse the raw alert data received on stdin.
    """
    alert_data = json.load(sys.stdin)

    alert_data_list = []
    for item in alert_data:
        alert = item["body"]["alert"]['attributes']
        cluster = alert['CLUSTER'][0]
        hostname = alert.get('HOSTS', [None])[0]
        # epochMs is in milliseconds; convert to whole seconds before formatting
        _time = int(item["body"]["alert"]['timestamp']['epochMs']) // 1000
        _time = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(_time))
        health_test = alert['HEALTH_TEST_RESULTS'][0]
        alert_service = health_test['testName']
        alert_event_code = health_test['eventCode']
        alert_severity = health_test['severity']
        alert_content = health_test['content']

        # msg = f"Cluster: {cluster}, host: {hostname}, severity: {alert_severity}, time: {_time}, " \
        #       f"service: {alert_service}, event code: {alert_event_code}, details: {alert_content}"
        msg = [cluster, hostname, alert_severity, _time, alert_service, alert_event_code, alert_content]

        alert_data_list.append(msg)

    return alert_data_list


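def conversion_to_html(data: list):
    """
    Convert the parsed alert data to HTML so the alert email can be sent in HTML format.
    """
    import pandas as pd

    # Disable column-width truncation so long alert text is not cut off.
    # pandas < 1.0 expects -1 here; pandas >= 1.0 deprecates -1 in favor of None.
    pd.set_option('display.max_colwidth', -1)
    columns = ['Cluster', 'Host', 'Severity', 'Time', 'Service', 'Event code', 'Details']
    alert_data_html = pd.DataFrame(data, columns=columns).to_html(escape=False)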

    head = \
        """
        <head>
            <meta charset="utf-8">
            <STYLE TYPE="text/css" MEDIA=screen>
                table.dataframe {
                    border-collapse: collapse;
                    border: 2px solid #a19da2;
                    /* center the whole table */
                    margin: auto;
                }
                table.dataframe thead {
                    border: 2px solid #91c6e1;
                    background: #f1f1f1;
                    padding: 10px 10px 10px 10px;
                    color: #333333;
                }
                table.dataframe tbody {
                    border: 2px solid #91c6e1;
                    padding: 10px 10px 10px 10px;
                }
                table.dataframe tr {
                }
                table.dataframe th {
                    vertical-align: top;
                    font-size: 14px;
                    padding: 10px 10px 10px 10px;
                    color: #105de3;
                    font-family: arial;
                    text-align: center;
                }
                table.dataframe td {
                    text-align: center;
                    padding: 10px 10px 10px 10px;
                }
                body {
                    font-family: 宋体;
                }
                h1 {
                    color: #5db446
                }
                div.header h2 {
                    color: #0002e3;
                    font-family: 黑体;
                }
                div.content h2 {
                    text-align: center;
                    font-size: 28px;
                    text-shadow: 2px 2px 1px #de4040;
                    color: #fff;
                    font-weight: bold;
                    background-color: #008eb7;
                    line-height: 1.5;
                    margin: 20px 0;
                    box-shadow: 10px 10px 5px #888888;
                    border-radius: 5px;
                }
                h3 {
                    font-size: 22px;
                    background-color: rgba(0, 2, 227, 0.71);
                    text-shadow: 2px 2px 1px #de4040;
                    color: rgba(239, 241, 234, 0.99);
                    line-height: 1.5;
                }
                h4 {
                    color: #e10092;
                    font-family: 楷体;
                    font-size: 20px;
                    text-align: center;
                }
                td img {
                    /*width: 60px;*/
                    max-width: 300px;
                    max-height: 300px;
                }
            </STYLE>
        </head>
        """
    body = \
        """
        <body>
        <div align="center" class="header">
            <!-- Title section -->
            <h1 align="center">CDH Cluster Service Alert</h1>
        </div>
        <hr>
        <div class="content">
            <!-- Body content -->
            <h2> </h2>
            <div>
                <h4></h4>
                {df_html}
            </div>
            <hr>
            <p style="text-align: center">
            </p>
        </div>
        </body>
        """.format(df_html=alert_data_html)

    html_msg = "<html>" + head + body + "</html>"
    html_msg = html_msg.replace('\n', '').encode("utf-8")

    return html_msg


if __name__ == '__main__':
    alert_data = get_alert_data()
    alert_data_html = conversion_to_html(alert_data)

    s = SendMail()
    s.send_mail(alert_data_html)
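
The script above covers email only. For the chat-robot channels mentioned earlier, a minimal sketch could look like the following. Everything here is an assumption rather than part of the original script: `WEBHOOK_URL` is a placeholder, `build_text_payload` and `post_json` are hypothetical helper names, and the `{"msgtype": "text", "text": {"content": ...}}` payload shape should be checked against the documentation of the robot you actually use.

```python
import json
import urllib.request

# Placeholder: replace with the webhook URL issued for your own robot.
WEBHOOK_URL = "https://example.invalid/robot/send?access_token=REPLACE_ME"


def build_text_payload(rows):
    """Format parsed alert rows as a plain-text robot message body."""
    lines = []
    for cluster, host, severity, when, service, event_code, content in rows:
        lines.append(
            "[{0}] cluster={1} host={2} time={3} test={4} event={5}: {6}".format(
                severity, cluster, host, when, service, event_code, content))
    # Assumed payload shape; verify it against your robot's documentation.
    return {"msgtype": "text", "text": {"content": "\n".join(lines)}}


def post_json(url, payload, timeout=10):
    """POST a JSON document and return the HTTP status code."""
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


# Made-up row in the column order produced by get_alert_data().
demo_rows = [["cluster1", "node01.example.com", "CRITICAL",
              "2020-09-13 12:26:40", "EXAMPLE_HEALTH_TEST", "EV_EXAMPLE",
              "The health test result is bad."]]
payload = build_text_payload(demo_rows)
print(json.dumps(payload, ensure_ascii=False))
# post_json(WEBHOOK_URL, payload)  # enable once WEBHOOK_URL points at a real robot
```

Keeping the payload builder separate from the HTTP call makes it possible to check the message text without network access from the Alert Publisher node.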

Note: the email alert approach above requires the pandas module to be installed for the Python interpreter that runs alert.py.
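
If installing pandas on the Alert Publisher node is not practical, the table can also be built with the standard library alone. The sketch below is one possible substitute, not the author's method: `rows_to_html_table` is a hypothetical helper, the sample row is made up, and unlike `to_html(escape=False)` it escapes every value and omits the index column.

```python
import html

COLUMNS = ["Cluster", "Host", "Severity", "Time", "Service", "Event code", "Details"]


def rows_to_html_table(rows, columns=COLUMNS):
    """Render alert rows as an HTML table using only the standard library."""
    def cell(tag, value):
        # Escape every value so alert text cannot inject markup.
        return "<{0}>{1}</{0}>".format(tag, html.escape(str(value)))

    header = "<tr>" + "".join(cell("th", c) for c in columns) + "</tr>"
    body = "".join(
        "<tr>" + "".join(cell("td", v) for v in row) + "</tr>" for row in rows)
    # class="dataframe" keeps the table.dataframe CSS rules in alert.py applicable.
    return ('<table class="dataframe"><thead>{0}</thead>'
            '<tbody>{1}</tbody></table>').format(header, body)


table = rows_to_html_table(
    [["cluster1", "node01.example.com", "CRITICAL", "2020-09-13 12:26:40",
      "EXAMPLE_HEALTH_TEST", "EV_EXAMPLE", "disk usage > 90%"]])
print(table)
```

The returned string can be substituted for the `{df_html}` placeholder in the body template in the same way as the pandas output.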

3. Change ownership and permissions

chown cloudera-scm:cloudera-scm -R /opt/cloudera/alert_script
chmod +x /opt/cloudera/alert_script/alert.sh
chmod +x /opt/cloudera/alert_script/alert.py

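4. Configure the CM alert service

In CM, go to Cloudera Management Service > Configuration, search for alert.script.path, and set it to the script created above (/opt/cloudera/alert_script/alert.sh).

Then restart the Alert Publisher service.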

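5. Verify

To verify, kill a relevant process at the operating-system level. For this article the Hue Load Balancer process was killed manually, and the corresponding alert email arrived shortly afterwards.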
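
Killing a service can be disruptive on a shared cluster. As a lighter check of the script chain itself, the sketch below writes a hand-made alert file that alert.sh can be run against directly. The payload is an assumption containing only the keys alert.py reads, with made-up values, and `write_sample_alert_file` is a hypothetical helper. It exercises alert.sh and alert.py (including the real email send) but not Alert Publisher itself.

```python
import json
import os
import tempfile


def write_sample_alert_file(directory=None):
    """Write a one-alert JSON file and return its path."""
    sample = [{
        "body": {
            "alert": {
                "timestamp": {"epochMs": 1600000000123},
                "attributes": {
                    "CLUSTER": ["cluster1"],
                    "HOSTS": ["node01.example.com"],
                    "HEALTH_TEST_RESULTS": [{
                        "testName": "EXAMPLE_HEALTH_TEST",
                        "eventCode": "EV_EXAMPLE",
                        "severity": "CRITICAL",
                        "content": "Sample alert for a local test.",
                    }],
                },
            }
        }
    }]
    fd, path = tempfile.mkstemp(suffix=".json", dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(sample, fh)
    return path


path = write_sample_alert_file()
print("Run: /opt/cloudera/alert_script/alert.sh " + path)
```

Running the printed command as the cloudera-scm user also confirms that the ownership and execute permissions set in step 3 are sufficient.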