distribution: 3.15 tracking issue

Created by: beyang

Work

@Unassigned

  • distribution: 3.15 tracking issue #9605

@beyang: 3.00d

  • Install Jaeger by default in all Sourcegraph deployment options #9300 3d

@efritz

  • Federate precise-code-intel-worker metrics in pure docker environments #9552 :shipit:

@ggilmore

  • Merge deploy-sourcegraph latest into k8s.sgdev.org #9759
  • Set ephemeral storage resource requests and limits in deploy-sourcegraph #9604
  • Write sourcegraph/server -> docker compose migration guide #9602
  • k8s: upgrade deprecated apis #9148
  • ci: add enable-profiling/ branch prefix #9186 :shipit:

@slimsag

  • reduce src_graphql_field_seconds cardinality #9895 🐛
    • monitoring: reduce src_graphql_field_seconds cardinality substantially #9898 :shipit:
  • search alerting threshold should be lowered #9894
    • observability: relax search latency alerts to match real world #9901 :shipit:
  • multiple gitserver alerts need DataMayNotExist=true #9856 🐛
    • observability: minor improvements + upgrade git (fixes CI) #9871 :shipit:
  • merge and test deploy-sourcegraph-docker#93 #9839
  • observability: NaN values can leak into alert_count metric #9832 🐛
    • observability: improved NaN handling #9835 :shipit:
  • Remove line color dividing warning/critical threshold #9830
    • observability: remove line color dividing warning/critical thresholds #9831 :shipit:
  • observability: frontend dashboard: P90/P99 search latency should clarify it covers successful searches only #9819 🐛
    • observability: minor improvements + upgrade git (fixes CI) #9871 :shipit:
  • "resolve_revision_duration_slow" flaky; threshold too aggressive #9751 🐛
    • observability: convert Zoekt Index Server dashboard into generated form #9758 :shipit:
  • Symbols -> frontend-internal connection should be monitored #9732 🐛
    • observability: convert Symbols dashboard into generated form #9733 :shipit:
  • Symbols dashboard "Store fetch queue size" appears to be able to go negative #9731 🐛
    • observability: convert Symbols dashboard into generated form #9733 :shipit:
  • Symbols dashboard cardinality too high to be useful #9730 🐛
    • observability: convert Symbols dashboard into generated form #9733 :shipit:
  • "Search errors on unindexed repositories" cardinality is too high to be useful #9670 🐛
    • observability: convert Searcher dashboard to generated #9672 :shipit:
  • "Non-200 frontend responses over 5m" should have less cardinality #9668 🐛
    • observability: convert Query Runner dashboard to generated #9669 :shipit:
  • Frontend dashboard "Hard search errors" should be uncompacted #9660 🐛
    • observability: convert frontend dashboard to generated #9661 :shipit:
  • Syntect server dashboard should be uncompacted #9525 🐛
    • generate Grafana dashboards and Prometheus alert rules from single source of truth #9529 :shipit:
  • syntect-server dashboard "Worker timeouts" should not show one value per frontend instance #9524 🐛
    • generate Grafana dashboards and Prometheus alert rules from single source of truth #9529 :shipit:
  • syntect-server dashboard "Worker timeouts" can appear to go negative #9523 🐛
    • generate Grafana dashboards and Prometheus alert rules from single source of truth #9529 :shipit:
  • sourcegraph.com searcher errors regularly high, does it include context timeouts? #9360 🐛
    • observability: convert Searcher dashboard to generated #9672 :shipit:
  • saved searches should be excluded from search panels/alerts where possible #9358 🐛
    • monitoring: distinguish between browser and API requests #9892 :shipit:
  • frontend dashboard: "hard errors" can show as two entries due to multiple frontend instances #9356 🐛
    • observability: convert frontend dashboard to generated #9661 :shipit:
  • gitserver dashboard missing panel to show # concurrent execs #9354 🐛
    • observability: convert Git Server dashboard into generated form #9781 :shipit:
  • when alerts are firing, they are over-counted due to # of instances #9353 🐛
    • observability: do not over-count firing alerts due to number of service replicas #9826 :shipit:
  • gitserver dashboard should show total available disk space and percentages #9352 🐛
    • observability: convert Git Server dashboard into generated form #9781 :shipit:
  • gitserver critical disk space alert should be 15% #9351 🐛
    • observability: convert Git Server dashboard into generated form #9781 :shipit:
  • RFC 128: Implement consistent Docker image versioning #9251
    • docker-images: republish docker images using Sourcegraph versioning and canonical names #9847 :shipit:
  • Improved search monitoring and alerting #8580
  • Missing prometheus metrics #6428
  • observability: frontend dashboard: generic GraphQL monitoring #9797
  • observability: searcher: add alerting for archive cache #9796
  • monitoring for container restarts #9793
  • monitoring for containers being down #9792
  • monitoring for CPU and memory usage #9791
  • Alert when search indexing is failing #9783
  • Alert when repository updates are failing #9782
  • observability: canonical wording breakdown #9773
  • Frontend dashboard: "non-200 indexed search responses every 5m" should be by (category,code) #9744 🐛
  • syntax highlighting sometimes doesn't work, refreshing causes it to #9557 🐛
  • RFC 128: Investigate and propose a solution for how the Slack /release command could be implemented behind the scenes #9252
  • Determine missing alerts between alertmanager and grafana #7528
  • Setup new Grafana alerting on k8s.sgdev.org and sourcegraph.com to go to Slack #7527
  • Identify further inter-service communication metrics missing from new Grafana dashboards + alerts #7525
  • Identify further service-specific metrics missing from new Grafana dashboards + alerts #7524
  • metrics/alerts for gitserver slow execs and fetches/clones #6675
  • observability: add container monitoring in docker-compose / pure docker deployments #9869 :shipit:
  • docker-images: fix build script permissions #9850 :shipit:
  • enterprise/dev/ci: build republished images #9848 :shipit:
  • docker-images: cleanup folder structure, use build scripts when appropriate #9845 :shipit:
  • build Grafana and Prometheus images on CI #9843 :shipit:
  • observability: do not average histogram quantiles #9840 :shipit:
  • observability: use more color-blind friendly threshold colors #9821 :shipit:
  • docker-images: fix building and publish new versions of grafana,prometheus #9801 :shipit:
  • observability: add support for alerting on values that are LessOrEqual #9780 :shipit:
  • observability: searcher: correct query for "unindexed search request errors" #9772 :shipit:
  • observability: add better validation of wording #9771 :shipit:
  • observability: convert Zoekt Web Server dashboard into generated form #9770 :shipit:
  • observability: do not cut off panels at the bottom of home dashboard #9769 :shipit:
  • observability: correctly sort firing alerts to the tops of dashboards #9767 :shipit:
  • observability: do not mark dashboards as non-editable #9765 :shipit:
  • observability: convert Replacer dashboard into generated form #9749 :shipit:
  • Distribution tracking issue syncer: include PRs #9710 :shipit:
  • observability: convert LSIF Server dashboard into generated form #9682 :shipit:
  • observability: convert Repo Updater dashboard to generated #9667 :shipit:
  • observability: convert GitHub Proxy dashboard to generated #9666 :shipit:
  • observability: more clear grouping support #9642 :shipit:
  • fix regression in GraphQL field monitoring #9619 :shipit:
  • metrics FAQ cleanups #9618 :shipit:
  • doc/admin/observability/metrics: add FAQ entries #9617 :shipit:
  • observability: add support for specifying unit type of graphs #9539 :shipit:
  • observability: remove default values #9537 :shipit:

@uwedeportivo

  • jaeger images must be versioned alongside sourcegraph #9893
  • Move e2e tests into regression test suite #9814
  • Update docs for how to upgrade sourcegraph.com #9607
  • Enforce pod security policies #9603
  • regression e2e search tests: Problems connecting to searcher #9323 🐛
  • 3.15 release tracking issue #9214
  • grafana admin: editing data source field did not have any effect #8531 🐛
  • Placeholder: Continuous release process #7950

Legend

  • 👩 Customer issue
  • 🐛 Bug
  • 🧶 Technical debt
  • 🛠 Roadmap
  • 🕵 Spike
  • 🔒 Security issue
  • :shipit: Pull Request