Grafana for service/host SLA graphs

sla
ido-mysql
grafana

(Carsten Köbke) #1

Author: @Carsten

Revision: v0.1

Tested with:

  • Icinga 2 v2.8.4
  • Icinga Web 2 v2.5.3
  • Grafana 5.1.1
  • MariaDB 10.2.14

Introduction

This howto will explain you to setup MariaDB/Mysql & Grafana to calculate your defined & 24x7 Service-Level-Agreement based on data Icinga2 writes into the IDO-DB. Thanks to Thomas Gelf for providing the needed procedures & functions for MariaDB/MySQL.

Requirements

  • Icinga2 with MySQL IDO enabled
  • Grafana 5.1+ (it will not work with Grafana version 4 or lower)
  • Root access to the IDO database
  • A User with enough rights (select & run stored procedures) to the IDO-DB for Grafana
  • SQL schemafiles download from here
  • Dashboard download from here

Installation

  1. Import the mysql schema files with your root account into the icinga2 ido database (replace icinga_ido_database with your database name)
for i in *.sql ; do mysql -u root -p icinga_ido_database < $i ; done
  1. Create a MySQL user with enopugh rights to do selects and run procedures (replace username, password, host and databases)
CREATE USER 'test123'@'localhost' IDENTIFIED BY 'GEHEIM'; 
GRANT SELECT, EXECUTE ON `icinga`.* TO 'test123'@'localhost';
  1. Create MySQL datasource in Grafana

  2. Import the dashboard into Grafana, choosing the datasource you just created

Refreshing tables

Before you can use the dashboard we have to refresh the sla_timeperiod tables. Open mysql as root user and run

CALL icinga_refresh_slaperiods();

You have to do this everytime you change,add or remove a timeperiod in icinga.

Usage

Open the Dashbaord and you can choose the host & service with a timeperiod you want to use as SLA timeperiod.

If you want to see only SLA for 24x7 you need to change value column under the graph options from sla_state0 to state0

Known Bugs/Limitations

For me with a MariaDB 10.2 the table refresh doesnt work as it should. So i guess i can only use 24x7 calculation. I found the problem, if you have less then 2194 objects in the table icinga.objects it will not generate the whole range for 6 years. For MariaDB >= 10.0 you can use this refresh_slaperiods-procedure.sql

DROP PROCEDURE IF EXISTS icinga_refresh_slaperiods;

DELIMITER $$

CREATE PROCEDURE icinga_refresh_slaperiods()
    SQL SECURITY INVOKER
BEGIN
  DECLARE t_start DATETIME;
  DECLARE t_end DATETIME;
  DECLARE tp_id, tpo_id BIGINT UNSIGNED;
  DECLARE fake_result INT UNSIGNED;

  DECLARE done INT DEFAULT FALSE;

  DECLARE cursor_tp CURSOR FOR SELECT
          tpo.object_id,
          tp.timeperiod_object_id
        FROM icinga_timeperiods tp
        JOIN icinga_objects tpo ON tp.timeperiod_object_id = tpo.object_id
            AND tpo.is_active = 1;

  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;


  SET SESSION binlog_format = ROW;

  START TRANSACTION;

  TRUNCATE TABLE icinga_sla_periods;

  SELECT
      CAST(DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 4 YEAR), '%Y-01-01 00:00:00') AS DATETIME),
      CAST(DATE_FORMAT(DATE_ADD(NOW(), INTERVAL 1 YEAR), '%Y-12-31 23:59:59') AS DATETIME)
    INTO t_start, t_end;

  OPEN cursor_tp;

  tp_loop: LOOP
    FETCH cursor_tp INTO tp_id, tpo_id;
    IF done THEN
      LEAVE tp_loop;
    END IF;


    SET @tp_lastend := NULL,
        @tp_lastday := NULL,
        @day_offset := NULL;

    INSERT
      INTO icinga_sla_periods SELECT
        tpo_id,
        DATE_ADD(CAST(monthly.date AS DATETIME), INTERVAL finaltps.start_sec SECOND) AS start_time,
        DATE_ADD(CAST(monthly.date AS DATETIME), INTERVAL finaltps.end_sec SECOND) AS end_time

      FROM (
        SELECT
          DATE_ADD(DATE(t_start), INTERVAL @day_offset := @day_offset + 1 DAY) AS date,
          DAYOFWEEK(DATE_ADD(DATE(t_start), INTERVAL @day_offset DAY)) - 1 AS weekday

      --  FROM icinga_objects o
        FROM seq_1_to_2194

        JOIN (SELECT @day_offset := -1) day_offset

      --  ORDER BY object_id
      --  LIMIT 2194
      ) monthly JOIN (
        SELECT
          NULL AS day,
          NULL as start_sec,
          NULL AS end_sec
        FROM DUAL
        WHERE (@tp_lastday := NULL) IS NOT NULL
          AND ((@tp_lastend := 0) + (@day_offset := -1)) = 1

        UNION ALL

          SELECT
            day,
            start_sec AS start_sec,
            end_sec AS end_sec
          FROM icinga_timeperiod_timeranges tpr

          JOIN icinga_timeperiods tp ON tp.timeperiod_id = tpr.timeperiod_id
          WHERE tp.timeperiod_object_id = tpo_id

      ) finaltps ON finaltps.day = monthly.weekday
      WHERE DATE_ADD(CAST(monthly.date AS DATETIME), INTERVAL finaltps.end_sec - 1 SECOND) <= t_end
      ORDER BY monthly.date, finaltps.start_sec, finaltps.end_sec
      ;

  END LOOP tp_loop;

  CLOSE cursor_tp;

  COMMIT;
  SET SESSION binlog_format = STATEMENT;



  SELECT 0 INTO fake_result FROM icinga_objects LIMIT 1;
END;
$$
DELIMITER ;

FAQ

The author will take no responsibility for the correctness of the calculations (iam too stupid to understand the sql statements :wink: )


How to gather data for SLA reporting