DevX: Q2B2 Improve actionability of CI
Created by: burmudar
Problem
Devs often don’t know what the pipeline is doing or why their pipeline has failed.
We (DevX) are often asked to help diagnose the actual problem, because we have in-depth knowledge of the pipeline and how it works.
The pipeline needs to be less of a black box and surface useful information that allows developers to investigate further on their own.
Boundaries
Scope
Our primary concern is with the pipeline and its communication channels. As it stands the pipeline has 3 communication channels:
- Buildkite UI
- Slack notifications
- sg ci status
Each channel can be enhanced to provide more contextual information and easy on-ramps to other tools that help with failure diagnosis.
Improving pipeline stability also falls under improving pipeline communication. A false positive or bug communicates the wrong thing and confuses the end user. Therefore, addressing bugs is in scope for this bet.
Out of scope
- Nice to haves = eye candy
- Speed improvements
- Test out Buildkite tracing / OpenTelemetry
- Better test output can help with making a broken pipeline more actionable
- https://github.com/mfridman/tparse
- https://github.com/gotestyourself/gotestsum
- This requires investigation, as both support CI-like features
- Does showing a test failure better improve actionability?
- Less test verbosity leads to less pipeline noise
- Everyone hates verbosity until they need it
Definition of Done
- Document our various annotations and what each link on an annotation does.
- Make everyone aware of the new on-ramps to tools that help them diagnose their pipeline.
- All pipeline failures generate a unified failure annotation.
Payout:
- Annotations provide more contextual help
- All job failures have relevant annotations
- Some integration / client tests don’t produce annotations, so we have to add them
- Details of build failures are clearly communicated and next steps to investigate are provided.
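As a rough illustration of what a "unified failure annotation" could look like, the sketch below renders one Markdown body per failed job and pipes it to `buildkite-agent annotate` with an `error` style and a per-job context. The `JobFailure` type, its fields, the `failure-<job key>` context naming, and the example link are all hypothetical and only serve the illustration; this is not how the pipeline currently produces annotations.

```python
import subprocess
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class JobFailure:
    """Hypothetical summary of one failed job; the field names are illustrative."""
    job_key: str
    exit_code: int
    summary: str
    links: Dict[str, str] = field(default_factory=dict)  # label -> URL


def render_annotation(failure: JobFailure) -> str:
    """Render the Markdown body that every failure annotation would share."""
    lines = [
        f"### {failure.job_key} failed (exit code {failure.exit_code})",
        "",
        failure.summary,
    ]
    if failure.links:
        # The links act as on-ramps to other tools for further investigation.
        lines += ["", "Next steps:"]
        lines += [f"- [{label}]({url})" for label, url in failure.links.items()]
    return "\n".join(lines)


def annotate_command(failure: JobFailure) -> List[str]:
    """Build the CLI invocation; the context naming scheme is an assumption."""
    return [
        "buildkite-agent", "annotate",
        "--style", "error",
        "--context", f"failure-{failure.job_key}",
    ]


def publish(failure: JobFailure) -> None:
    """Send the annotation body on stdin; only works on a Buildkite agent."""
    subprocess.run(
        annotate_command(failure),
        input=render_annotation(failure),
        text=True,
        check=True,
    )
```

Keeping the rendering separate from the CLI call means the annotation format can be checked locally without a Buildkite agent, and every job type goes through the same template.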
Tracked issues
@unassigned
- https://github.com/sourcegraph/sourcegraph/issues/26083
- https://github.com/sourcegraph/sourcegraph/issues/28286
- https://github.com/sourcegraph/sourcegraph/issues/29643
- https://github.com/sourcegraph/sourcegraph/issues/26638
Completed
- ( 🏁 161 days ago) https://github.com/sourcegraph/sourcegraph/issues/26182
- ( 🏁 1 day ago) https://github.com/sourcegraph/sourcegraph/issues/27088
- ( 🏁 1 day ago) https://github.com/sourcegraph/sourcegraph/issues/26304
@burmudar: 3.00d
- https://github.com/sourcegraph/sourcegraph/issues/37616
- ( 🏁 23 days ago) https://github.com/sourcegraph/sourcegraph/issues/37613
- https://github.com/sourcegraph/sourcegraph/issues/38043
Completed: 3.00d
- ( 🏁 57 days ago) https://github.com/sourcegraph/sourcegraph/issues/34608 3.00d
- ( 🏁 52 days ago) https://github.com/sourcegraph/sourcegraph/issues/37758
- ( 🏁 52 days ago) https://github.com/sourcegraph/sourcegraph/issues/34470 🐛
- ( 🏁 23 days ago) https://github.com/sourcegraph/sourcegraph/issues/37613
Legend
- 👩 Customer issue
- 🐛 Bug
- 🧶 Technical debt
- 🎩 Quality of life
- 🛠️ Roadmap
- 🕵️ Spike
- 🔒 Security issue
- 🙆 Stretch goal