
Future plans

Prometheus Long term service

In August 2023, MONIT launched a pilot Prometheus service, based on Grafana Mimir, for centralising metrics. The pilot is scheduled to run for a maximum of one year, and its primary goal is to validate the suitability of this system as a data store for Prometheus-based monitoring metrics and alarms.

  • During the pilot, the focus will be on integrating "short-term" metrics (40 days) while we learn to operate and tune the service. This use case is expected to cover "real-time" service monitoring needs for debugging and alarming.
  • Storing longer-term metrics (more than 40 days) is at this stage a secondary objective and is on our roadmap for the second part of the pilot. In any case, if you have a use case ready to be integrated, please send us a SNOW ticket so we can discuss it.

Grafana 10

On the 10th of October we upgraded our Grafana deployment to version 10 (see OTG).

Migration to unified alerting will happen on the 13th of November (see OTG).

  • Until early next year (to cover the Christmas break), service managers will receive duplicate alerts unless they take action to validate and remove the old alerts from monit-grafana-old.

Data centre monitoring

Virtual machines and physical nodes

MONIT bundle

As of October 2023, the MONIT agent bundle for data centre monitoring consists of Collectd as the metrics collector and Apache Flume as the forwarding agent for metrics and system logs.

The MONIT team is actively working on a new implementation of the MONIT agent bundle with two main changes: 1) replacing the forwarding agent with Fluent Bit; 2) adding the option to use Node Exporter as the metrics collector (replacing Collectd or running in parallel with it).

The target for general availability of this new bundle is Q1 2024.

Storage backend

Currently, data produced by Collectd and integrated in MONIT is kept in InfluxDB. With the move to the Prometheus ecosystem, and due to InfluxDB limitations, this is intended to change in the future.

  • Tentatively during 2025, MONIT will stop storing Collectd data in InfluxDB and will integrate it with the Prometheus LTS instead.
  • This will have an impact on users, since dashboards based on InfluxDB will need to be adapted.
  • Further information, procedures and tooling will be introduced during 2024 in preparation for this change.
  • Data produced by Node Exporter will not be integrated in InfluxDB at any point; from the beginning it will use the Prometheus LTS directly.
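
To give a rough idea of what reading from a Prometheus-compatible store looks like compared with InfluxDB, the sketch below builds an instant-query URL for the standard Prometheus HTTP API (`/api/v1/query`). It is only an illustration under stated assumptions: the base URL and the `instance` label value are hypothetical placeholders, and the actual MONIT endpoints, authentication and metric naming are not described here. `node_load1` is a metric exposed by Node Exporter.

```python
from urllib.parse import urlencode

# Hypothetical base URL: the real Prometheus LTS endpoint is not given
# in this document and will differ.
BASE_URL = "https://prometheus-lts.example.org/prometheus"


def build_instant_query_url(promql: str, base_url: str = BASE_URL) -> str:
    """Return the URL of a Prometheus HTTP API instant query for `promql`."""
    # urlencode percent-encodes braces, quotes and other special characters.
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


# node_load1 is a standard Node Exporter metric; the instance label value
# below is made up for illustration.
url = build_instant_query_url('node_load1{instance="myhost.example.org:9100"}')
print(url)
```

In Grafana the same PromQL expression would be entered directly in a panel that uses a Prometheus data source, so adapting a dashboard mostly means rewriting each InfluxDB query as an equivalent PromQL expression.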

Kubernetes clusters

MONIT bundle

Monitoring of Kubernetes clusters is currently not officially covered by the MONIT team.

  • An effort will be made to provide Helm charts with the monitoring agent configuration (based on Fluent Bit) so that it can be deployed in any Kubernetes-compatible environment (ETA 2024).