Evaluate managed observability tools (GCP Prometheus, Grafana Cloud)

Created by: danieldides

We currently run a self-hosted instance of Prometheus alongside each Managed Instance. This service is essential to our observability and alerting stack: it drives all our alerts, our Grafana dashboards, and some core functionality of our services. Evaluate the cost and feasibility of moving to a hosted solution for metric aggregation and visualization.

Candidates include (but are not limited to):

  • GCP Prometheus
  • Grafana Cloud

Considerations:

  • Increase or decrease in cost per instance (weighed against the VM and storage cost of hosting our own)
  • Accessibility and integration with existing tools (proxied Grafana instances, Alertmanager, tracing, logging, etc.)
  • Ease of provisioning (Terraform support)
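
For the cost consideration above, a rough per-instance comparison could be sketched as follows. This is only an illustration: the class names, the fields, and the per-sample pricing model are assumptions, not figures or billing rules from any vendor, and real pricing (for example per active series) may need a different formula.

```python
from dataclasses import dataclass


@dataclass
class SelfHostedCost:
    """Hypothetical monthly cost of running our own Prometheus for one instance."""
    vm_monthly: float       # VM cost per month
    storage_monthly: float  # disk/storage cost per month

    def total(self) -> float:
        return self.vm_monthly + self.storage_monthly


@dataclass
class ManagedCost:
    """Hypothetical managed-service cost, ASSUMING per-sample ingestion pricing."""
    samples_per_month: float
    price_per_million_samples: float

    def total(self) -> float:
        return self.samples_per_month / 1_000_000 * self.price_per_million_samples


def monthly_delta(self_hosted: SelfHostedCost, managed: ManagedCost) -> float:
    """Return managed minus self-hosted cost; positive means managed costs more."""
    return managed.total() - self_hosted.total()
```

Plugging in measured sample volumes and actual VM and storage bills per instance would give the increase or decrease to weigh against the integration and provisioning considerations.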