Service metrics

The following options are provided:

Please note the considerations in the sections below.

Data sent via Collectd

  • Collectd is the recommended method for ingesting data into the infrastructure
  • Before choosing any other method, check first whether Collectd covers your use case

Data sent via AMQ/HTTP

  • The AMQ source is recommended for producers outside CERN
  • The following fields are "reserved": _id, availability_zone, environment, event_timestamp, host, hostgroup, hostname, json, monit_hdfs_path, producer, submitter_environment, submitter_hostgroup, toplevel_hostgroup, timestamp, type, type_prefix, version
  • You can send either a flat JSON document or a document split into "data" and "metadata"
  • Anything that you send inside "data" is kept there
  • Any other field that is one of the "reserved" fields is promoted to "metadata"
  • Any other field that is not one of the "reserved" fields is promoted to "data"
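
The promotion rules above can be sketched in Python. This is only an illustration, not the actual ingestion code: the function name split_document is hypothetical, and two behaviours are assumptions that the rules do not spell out, namely that keys sent under "metadata" are handled like top-level keys and that a key already present in "data" is never overwritten.

```python
# Reserved field names, as listed above.
RESERVED_FIELDS = frozenset({
    "_id", "availability_zone", "environment", "event_timestamp", "host",
    "hostgroup", "hostname", "json", "monit_hdfs_path", "producer",
    "submitter_environment", "submitter_hostgroup", "toplevel_hostgroup",
    "timestamp", "type", "type_prefix", "version",
})


def split_document(doc):
    """Return the document arranged into "data" and "metadata" (hypothetical helper)."""
    # Anything sent inside "data" is kept there, reserved name or not.
    data = dict(doc.get("data", {}))
    metadata = {}

    # Assumption: keys under "metadata" follow the same rules as top-level keys.
    loose = {k: v for k, v in doc.items() if k not in ("data", "metadata")}
    loose.update(doc.get("metadata", {}))

    for key, value in loose.items():
        if key in RESERVED_FIELDS:
            metadata[key] = value
        else:
            # Assumption: content already present in "data" wins on a name clash.
            data.setdefault(key, value)

    return {"data": data, "metadata": metadata}


flat = {"producer": "myservice", "type": "latency", "value": 12.5}
print(split_document(flat))
```

With the flat document in the example, "producer" and "type" end up under "metadata" and "value" under "data".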

Data stored in OpenSearch

When documents are inserted into OpenSearch, a mapping is generated for their fields. The mapping is driven either by a predefined template or by OpenSearch inferring the type to use. Several types are available (e.g. boolean, keyword, ip), so have a look at the mapping types.

In addition, there is a set of mapping parameters that may come in handy depending on your data; the enabled parameter deserves particular attention.
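
As an illustration of the enabled parameter, here is a minimal sketch of a mapping body written as a Python dictionary. The field names "status" and "payload" are hypothetical and do not come from any existing template; the point is only that an object field with enabled set to false is kept in the document source but is neither parsed nor indexed.

```python
import json

# Hypothetical mapping body; "status" and "payload" are made-up field names.
mapping = {
    "mappings": {
        "properties": {
            "status": {"type": "keyword"},
            # enabled=False: the object stays in _source, but its content is
            # not parsed or indexed, so it cannot be searched or aggregated.
            "payload": {"type": "object", "enabled": False},
        }
    }
}

print(json.dumps(mapping, indent=2))
```

This can reduce the index size for large nested payloads that only need to be retrieved, never queried.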

As a general rule, all "string" fields are mapped as "keyword" with ignore_above set to 256 characters, which means that longer strings are still kept in the stored document but are not indexed (and therefore cannot be searched). The other option is to use a type from the "text" family.
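
The general rule can be expressed as an OpenSearch dynamic template. The sketch below writes one as a Python dictionary; it is an assumption about what such a template could look like, not the template actually deployed, and the names strings_as_keywords and is_indexed are hypothetical.

```python
import json

IGNORE_ABOVE = 256  # limit in characters, as stated in the general rule

# Sketch of a dynamic template mapping every string field to "keyword".
index_template_body = {
    "mappings": {
        "dynamic_templates": [
            {
                "strings_as_keywords": {  # hypothetical template name
                    "match_mapping_type": "string",
                    "mapping": {"type": "keyword", "ignore_above": IGNORE_ABOVE},
                }
            }
        ]
    }
}


def is_indexed(value, ignore_above=IGNORE_ABOVE):
    """Return True if a keyword value of this length would be indexed."""
    return len(value) <= ignore_above


print(json.dumps(index_template_body, indent=2))
print(is_indexed("a" * 256))  # at the limit: indexed
print(is_indexed("a" * 257))  # over the limit: kept in the document, not indexed
```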

Users can request changes to these types or mapping parameters for their index patterns through a SNOW ticket. Depending on the document rate and size, such a request may not be approved by default and may require further discussion, since it can have a large impact on the stored size of the documents.