Alertmanager: OpsGenie notify fails with 422 status code

Created on 15 Nov 2017 · 12 comments · Source: prometheus/alertmanager

What did you do?

I have a simple rule on a test instance:

groups:
- name: example
  rules:
  - alert: BlaAlert
    expr: up == 1
    for: 0s
    labels: []
    annotations:
      severity: critical
      message: Something wrong
      summary: OpsMeOut

I logged the request it sends:

{"alias":"1d1d7ea4d3921fc7909a53d3738caeb62f0e9e5c6affccbf5f3323d2ee1667c9","message":"[FIRING:3] BlaAlert (critical)","description":"OpsMeOut
Alerts Firing:
Labels:
 - alertname = BlaAlert
 - instance = localhost:9090
 - job = prometheus
 - severity = critical
Annotations:
 - summary = OpsMeOut
Source: http://prometheus-2:9090/graph?g0.expr=up+%3D%3D+1\\u0026g0.tab=1
Labels:
 - alertname = BlaAlert
 - instance = localhost:9100
 - job = node
 - severity = critical
Annotations:
 - summary = OpsMeOut
Source: http://prometheus-2:9090/graph?g0.expr=up+%3D%3D+1\\u0026g0.tab=1
Labels:
 - alertname = BlaAlert
 - instance = localhost:9187
 - job = postgres
 - severity = critical
Annotations:
 - summary = OpsMeOut
Source: http://prometheus-2:9090/graph?g0.expr=up+%3D%3D+1\\u0026g0.tab=1

","details":{},"source":"http://prometheus20:9093/#/alerts?receiver=opsgenie","teams":"test-prometheus"}

What did you expect to see?

I expected a successful 200 response from OpsGenie; instead I got a 422 response, which means "Semantic errors in request body".

But I don't see any problem with the request, because it has the message field, which is the only required field as far as I can see in the OpsGenie API documentation.

Environment

System information:

Linux prometheus20 3.10.0-693.2.2.el7.x86_64 #1 SMP Tue Sep 12 22:26:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Prometheus version:

prometheus, version 2.0.0 (branch: HEAD, revision: 0a74f98628a0463dddc90528220c94de5032d1a0)
  build user:       root@615b82cb36b6
  build date:       20171108-07:11:59
  go version:       go1.9.2

Alertmanager version:

alertmanager, version 0.10.0 (branch: HEAD, revision: 133c888ef3644b47a52acbaeffb09f4cc637df1b)
  build user:       root@01302b7cd08a
  build date:       20171109-15:34:53
  go version:       go1.9.2

Alertmanager configuration file:

global:
  smtp_smarthost: 'localhost:25'
  smtp_from: '[email protected]'

templates: []

route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h 
  receiver: sink
  routes:
  - match_re:
      service: ^.*$
    receiver: sink
    routes:
    - match:
        severity: critical
      receiver: opsgenie

receivers:
  - name: 'sink'
    email_configs:
      - to: '[email protected]'
  - name: 'opsgenie'
    opsgenie_configs:
      - api_key: 'foo-valid-opsgenie-key'
        teams: 'test-prometheus'

Prometheus configuration file:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
       - localhost:9093
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "*_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'postgres'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9187']

Logs:

level=debug ts=2017-11-15T14:16:46.144797798Z caller=notify.go:600 component=dispatcher msg="Notify attempt failed" attempt=1 integration=opsgenie err="unexpected status code 422"

Source

XooR

All 12 comments

@Tom-Fawcett created the PR to move to V2, perhaps he has some insight

stuartnelson3 on 15 Nov 2017

Hey @XooR, this is a bug caused by the API upgrade. The problem is that 'teams' should be sent in a different structure. For now, you can make it work by removing the 'teams' option from your configuration. I'll have a PR to fix it by EOD.

josedonizetti on 15 Nov 2017
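
For illustration, the structural change mentioned in the comment above can be sketched in Python. This is a minimal sketch, not the code from the fix: it assumes the OpsGenie v2 alert API expects `teams` as a JSON array of objects with a `name` key, whereas the logged request sent it as a plain string. The helper name `teams_field` is hypothetical.

```python
import json


def teams_field(teams_config: str) -> list:
    """Turn a comma-separated team string into a list of {"name": ...} objects.

    Assumption: the v2 API wants a list of objects rather than a string.
    """
    names = (part.strip() for part in teams_config.split(","))
    return [{"name": name} for name in names if name]


# Fragment of the payload as logged in the report (rejected with 422).
rejected = {"message": "[FIRING:3] BlaAlert (critical)", "teams": "test-prometheus"}

# The same fragment with 'teams' restructured.
restructured = dict(rejected, teams=teams_field(rejected["teams"]))

print(json.dumps(restructured))
```

With the value from the report, `teams` becomes a one-element list holding `{"name": "test-prometheus"}` instead of the bare string.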

Yep sorry guys :disappointed: I caused this bug.

@XooR for now I would advise either not using the teams config, or rolling back a version.

@stuartnelson3 @josedonizetti I identified this issue a few days ago and created #1101; however, it hasn't progressed.

Tom-Fawcett on 15 Nov 2017

👍1

@Tom-Fawcett Awesome, the PR exists already.

josedonizetti on 15 Nov 2017

I would like to chime in that I have the same problem with Alertmanager 0.10.0, but it's the 'tags' field causing my problems, not the 'teams' field (I don't use the teams field in Alertmanager for OpsGenie). When I remove the 'tags' field from my config, OpsGenie successfully gets the POST.

fatalsaint on 15 Nov 2017
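
The same kind of mismatch can be checked for before sending. The sketch below assumes that the v2 API expects `tags` as a JSON array of strings and `teams` as an array as well; the thread does not show the failing tags payload, so the tag names, the alert message, and both helper names are made up for illustration.

```python
def split_csv(value: str) -> list:
    """Split a comma-separated config value into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def string_typed_list_fields(payload: dict, list_fields=("teams", "tags")) -> list:
    """Return the list-valued fields that are still plain strings in the payload."""
    return [name for name in list_fields if isinstance(payload.get(name), str)]


# Hypothetical payload fragment; the tag names and message are invented.
payload = {"message": "[FIRING:1] BlaAlert (critical)", "tags": "prod,database"}

# Fields that would need restructuring before the request is sent.
print(string_typed_list_fields(payload))

# After splitting, no string-typed list fields remain.
payload["tags"] = split_csv(payload["tags"])
print(payload["tags"], string_typed_list_fields(payload))
```

A check like this only flags the shape of the fields; it says nothing about whether the team or tag names exist on the OpsGenie side.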

@fatalsaint apologies, I'll fix that.

Tom-Fawcett on 15 Nov 2017

@josedonizetti @fatalsaint #1108

Tom-Fawcett on 15 Nov 2017

👍2

@fatalsaint @XooR are either of you able to build from head and confirm the behavior is fixed for you?

stuartnelson3 on 15 Nov 2017

My setup utilizes the Docker containers. I can try when the master tag in hub gets updated. The Dockerfile in the repo looks like it requires a pre-built binary.

fatalsaint on 15 Nov 2017

Tested it on the account I've used to debug, and it worked perfectly. Both tags and teams.
[screenshot, 2017-11-15 8:57:50 pm]

josedonizetti on 15 Nov 2017

👍1

Master tag updated, so I tested tags, and it worked.

Thanks for the fast response all.

fatalsaint on 16 Nov 2017

I can also confirm that it works with both the tags and teams options.

XooR on 16 Nov 2017
