
Tracking issue - Health Status Tooling

Created by: caugustus-sourcegraph

Plan

SDD

We want to offer admins a simple way to evaluate the health of a Sourcegraph deployment.

Requirements

  • Define a list of technical health checks that validate that a deployment is healthy
  • Ship a prototype health status tool that validates that everything is set up properly. The prototype should assess the following:
    • Can services talk to each other as expected?
    • Is the disk configured as expected (e.g., does each replica have a separate location defined)?
    • Are the disk and network performing as expected?
    • Can the instance communicate with the code hosts?
  • The tool should work for Kubernetes deployments (w/ or w/o Helm)
  • The tool should render health status in a visually simple way via the CLI (think stoplight-colored updates). For an example, please see this Reddit post.
  • Ship documentation on remediation steps and troubleshooting best practices. The documentation should be descriptive enough to help customers/CEs debug at each level of the health check, and it should address both Yellow and Red output statuses.
  • Share this tool with CE to validate with select customers
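
To make the requirements above concrete, here is a minimal sketch of how a list of checks could be run and rendered as colored CLI output. It is not the planned implementation: the `Status`, `Result`, `run_checks`, `overall`, and `render` names are hypothetical, the Green level is assumed (the requirements name only Yellow and Red), and the three example checks are stubs that return canned values rather than probing a real deployment.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Status(Enum):
    """Assumed severity levels, ordered from healthy to failing."""
    GREEN = 0
    YELLOW = 1
    RED = 2


# Standard ANSI foreground color codes for terminal output.
_COLORS = {Status.GREEN: "\033[32m", Status.YELLOW: "\033[33m", Status.RED: "\033[31m"}
_RESET = "\033[0m"


@dataclass
class Result:
    name: str
    status: Status
    detail: str = ""


def run_checks(checks: List[Callable[[], Result]]) -> List[Result]:
    """Run every check; a check that raises is reported as RED instead of aborting the run."""
    results = []
    for check in checks:
        try:
            results.append(check())
        except Exception as exc:
            name = getattr(check, "__name__", "check")
            results.append(Result(name, Status.RED, str(exc)))
    return results


def overall(results: List[Result]) -> Status:
    """The deployment is only as healthy as its worst check."""
    return max((r.status for r in results), key=lambda s: s.value, default=Status.GREEN)


def render(results: List[Result], color: bool = True) -> str:
    """Format one line per check, optionally wrapping the status label in ANSI color."""
    lines = []
    for r in results:
        label = r.status.name
        if color:
            label = f"{_COLORS[r.status]}{label}{_RESET}"
        suffix = f" ({r.detail})" if r.detail else ""
        lines.append(f"[{label}] {r.name}{suffix}")
    return "\n".join(lines)


# Stub checks (hypothetical): real ones would query services, disks, and code hosts.
def services_reachable() -> Result:
    return Result("service connectivity", Status.GREEN)


def disk_layout() -> Result:
    return Result("disk configuration", Status.YELLOW, "two replicas share a volume")


def code_host_access() -> Result:
    raise ConnectionError("timed out")


if __name__ == "__main__":
    results = run_checks([services_reachable, disk_layout, code_host_access])
    print(render(results))
    print("overall:", overall(results).name)
```

Treating an exception as Red keeps one broken check from hiding the results of the others, and taking the worst status as the overall result gives admins a single signal to act on.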

Tracked issues

Legend

  • 👩 Customer issue
  • 🐛 Bug
  • 🧶 Technical debt
  • 🎩 Quality of life
  • 🛠 Roadmap
  • 🕵 Spike
  • 🔒 Security issue
  • 🙆 Stretch goal