monitoring(postgres): fix panels, adjust transaction alerts

Warren Gifford requested to merge postgres-fix into main

Created by: bobheadxi

While working on alerts and monitoring issues recently, I noticed that the postgres alerts are always firing in dogfood and that the dashboards are a bit incomplete. This PR:

  • Relaxes the transaction duration critical alert to 10m. It fires often in k8s.sgdev.org, but always seems to be a spike of just around 10m, which might just be a deploy thing
  • Adds my best-guess Interpretation to some undocumented alerts. I'm unsure where to go with pg_exporter_err right now and could use some help, @daxmc99 (it seems to have been active in k8s.sgdev.org for a while)
  • Moves the first group of observables to the standardized General group (which makes it a top-level panel)
  • Improves naming on some alerts
  • Adds panel options so we don't show a bunch of values. I'm not sure I'm using the best label here, though: it's currently mostly the Kubernetes app label, but postgres_exporter may have a better label we should use instead. Whatever label we choose, I think we should also sum by that label (see the screenshot below; app is not unique atm)
  • Removes per-database container monitoring (see the databaseContainerNames docstring)
  • This seems to be available only for k8s, so adds a note in the description
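
To make the label question above concrete, here is a rough sketch of what "sum by the chosen label" plus a 10m critical threshold could look like. It is not the monitoring generator's actual API: the `Observable` class, the helper names, the metric name `pg_stat_activity_max_tx_duration`, and the choice of "at or above" for the threshold are all assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observable:
    """Hypothetical stand-in for a dashboard observable (not the real generator type)."""
    name: str
    query: str                    # PromQL expression backing the panel
    legend: str                   # Grafana-style legend template
    critical_at_or_above: float   # critical threshold, in seconds


def sum_by(label: str, metric: str) -> str:
    """Build a PromQL expression that aggregates `metric` per value of `label`."""
    return f"sum by ({label}) ({metric})"


def transaction_durations(label: str = "app") -> Observable:
    """Sketch of the relaxed alert: critical at 10 minutes, one series per label value."""
    return Observable(
        name="transaction_durations",
        # Metric name is an assumption; substitute whatever the exporter exposes.
        query=sum_by(label, "pg_stat_activity_max_tx_duration"),
        legend="{{" + label + "}}",
        critical_at_or_above=10 * 60,
    )


def is_critical(obs: Observable, value_seconds: float) -> bool:
    """Assumed semantics: the alert fires when the value reaches the threshold."""
    return value_seconds >= obs.critical_at_or_above
```

Swapping the label later (say, to an exporter-provided one) would then be a one-argument change, e.g. `transaction_durations(label="instance")`, with the legend following along.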

before:

image

after:

image