Notification Targets

ServiceNow target

It is possible to create ServiceNow incidents from alarms integrated in the MONIT infrastructure.

To enable it, set the following parameters (default values shown):

  • cerncollectd::config::snow_alarms_enabled: true
  • cerncollectd::config::snow_fe: populated from local FE fact
  • cerncollectd::config::snow_se: the default SE associated to the FE
  • cerncollectd::config::snow_assignment_level: 3
  • cerncollectd::config::snow_grouping: true (deprecated in favour of hostgroup_grouping)
  • cerncollectd::config::snow_hostgroup_grouping: true
  • cerncollectd::config::snow_auto_closing: false (only when hostgroup_grouping is false)
  • cerncollectd::config::snow_fe_category: undef
  • cerncollectd::config::snow_watchlist: undef

These defaults can be overridden per hostgroup or environment using Hiera (see "Overriding defaults" below).
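
As an illustration, hostgroup-level Hiera data overriding some of the parameters listed above might look like the sketch below. The FE and SE names are placeholders, not real ServiceNow entries, and the file this data lives in depends on your own Hiera hierarchy.

```yaml
# Hypothetical hostgroup-level Hiera data; the FE/SE names are placeholders.
cerncollectd::config::snow_alarms_enabled: true
cerncollectd::config::snow_fe: 'My SNOW FE'
cerncollectd::config::snow_se: 'My SNOW SE'
# Switch from hostgroup grouping to grouping by "alarm_name" and "entity".
cerncollectd::config::snow_hostgroup_grouping: false
```

Parameters that are not listed keep the defaults shown above.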

Grouping

All alarms are grouped by default. Alarms can be grouped either by "entity" or by "hostgroup"; use the snow_hostgroup_grouping parameter to choose between the two:

cerncollectd::config::snow_hostgroup_grouping: true

The grouping of alarms follows these rules:

  • If the "hostgroup_grouping" field is missing or set to false, events are grouped into an incident by "alarm_name" and "entity".
  • If "hostgroup_grouping" is set to true but there is no "submitter_hostgroup", events are grouped by "alarm_name" and "entity".
  • If "hostgroup_grouping" is set to true and "submitter_hostgroup" is present, events are grouped by "alarm_name" and "submitter_hostgroup".

Auto closing

Incidents created in SNOW can be closed automatically when an OK event is received that matches an open ticket by "alarm_name" and "entity".

cerncollectd::config::snow_auto_closing: true

These OK events are only sent to SNOW when the "hostgroup_grouping"/"grouping" options are disabled (set to false), since it is not possible to know when a ticket containing multiple entities should be closed.
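
Putting the two requirements together, a Hiera sketch for enabling auto closing could look like the following. Setting the deprecated snow_grouping parameter explicitly is an assumption made here for safety; it may not be needed if only snow_hostgroup_grouping is honoured in your setup.

```yaml
# Sketch: auto closing only takes effect when grouping is disabled.
cerncollectd::config::snow_hostgroup_grouping: false
# Deprecated option, disabled here as well (assumed to be harmless).
cerncollectd::config::snow_grouping: false
cerncollectd::config::snow_auto_closing: true
```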

Email target

It is possible to generate emails from alarms integrated in the MONIT infrastructure.

To enable it, set the following parameters (default values shown):

  • cerncollectd::config::email_alarms_enabled: false
  • cerncollectd::config::email_to: [] - List of email recipients

These defaults can be overridden per hostgroup or environment using Hiera (see "Overriding defaults" below).
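
Since the email target is disabled by default, a minimal Hiera sketch to turn it on for all alarms would set both parameters. The recipient addresses below are the placeholder addresses used elsewhere in this page.

```yaml
# Sketch: enable the email target and define the recipients.
cerncollectd::config::email_alarms_enabled: true
cerncollectd::config::email_to:
  - someone@cern.ch
  - someone.else@cern.ch
```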

Disable targets (Roger)

All alarms are shipped with the flag "roger_alarmed", based on the roger state of the machine for the specific alarm type. If an alarm is shipped with "roger_alarmed" set to false, all notification targets are ignored and no action is taken (for example, no ticket is created in SNOW).

The value of this flag is decided with the following steps, moving to the next step only if the previous one failed for some reason:

  1. Check the roger cache file in the host
  2. Ask roger for actual information
  3. Check the parameter "alarmed_default" in the cerncollectd::alarms::config class (default: false)

To change the value of the alarmed_default parameter use Hiera and write:

cerncollectd::alarms::config::alarmed_default: true

Overriding defaults

All parameters described in the sections above can be overridden using Hiera.

All alarms

cerncollectd::config::snow_se: 'My SNOW SE'

This configuration will be applied to all the alarms by default.

Single alarm

The best way is to use the custom_targets parameter provided by the alarm definition. An example for the boot_full alarm would be:

cerncollectd_contrib::alarm::boot_full::custom_targets:
  snow:
    functional_element: "My FE"
  email: 
    to:
      - someone@cern.ch
      - someone.else@cern.ch
    send_ok: true

Inside the snow Hash, you can set custom values for the following parameters. If omitted, they will be set to the global defaults:

  • disabled: true/false
  • functional_element
  • service_element
  • assignment_level
  • grouping (deprecated, use hostgroup_grouping)
  • hostgroup_grouping
  • watchlist
  • auto_closing

Inside the email Hash, you can set the following parameters:

  • disabled: true/false
  • to
  • send_ok: true/false
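
As a further sketch using only the keys listed above, the following Hiera data would disable the snow target for the boot_full alarm and keep email notifications without OK messages. The recipient is a placeholder, and it is assumed here that disabled: true on one target leaves the other target untouched.

```yaml
# Sketch: no SNOW tickets for boot_full, email only (placeholder recipient).
cerncollectd_contrib::alarm::boot_full::custom_targets:
  snow:
    disabled: true
  email:
    to:
      - someone@cern.ch
    send_ok: false
```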

Usually that's all you have to do, but there could be cases where this parameter can't be used, or where you need finer granularity. In these cases you can use the cerncollectd::alarms::extra resource directly: define a new resource in your Puppet manifest with the collectd namespace and the target fields to override. The email target can be configured in the same way as the snow target:

::cerncollectd::alarms::extra {'df':
  ctd_namespace => 'df',
  targets       => {
    snow => {
      disabled  => false,
      watchlist => ['someone@cern.ch','someone.else@cern.ch'],
    },
    email => {
      disabled => false,
      to       => ['someone@cern.ch'],
    },
  }
}

::cerncollectd::alarms::extra {'df_root':
  ctd_namespace => 'df_root',
  targets       => {
    snow => { functional_element => 'newfe' },
    email => {
      disabled => false,
      to       => ['someone@cern.ch','someone.else@cern.ch'],
      send_ok  => true,
    },
  }
}

This will generate several configuration files inside "/etc/collectd.d/alarms/":

  • df.yaml: contains the specific SNOW configuration for the df plugin.
  • df_root.yaml: contains the specific SNOW configuration for the df plugin and root instance.

The priorities are always driven by the granularity of the definition, so more specific definitions will take over more generic ones (in this case: df_root > df > default).

So the result of the example above will be:

  • SNOW alarms will be enabled for all the notifications coming from plugin df, and they will be sent to the default FE.
  • In the case of df_root, the notifications will be sent to the FE newfe.

The example also shows how to set a list of recipients for the email target and how to send OK alarms.