Automation Backend: Tech Debt, Developer UX and Ideas for Architecture & Design
Created by: mrnugget
This is a braindump of all the things we ran into while working on #6085 (closed). The items on this list range from "nice to have" to "we should do it" to "we need to do this, sooner rather than later"
a8n.Store
- Can/should we combine
repos.Store
anda8n.Store
?- What about enterprise/OSS divide here?
- The tests for
a8n.Store
are in one large file and have global state - The tests are slow because a database needs to be dropped and created for every package in which a
dbtest
database is used - In
a8n.Store
: the column names are duplicated in every query string. We could probably substrings and include them in the other queries. -
UpdateCampaignJob
needsnulltimeColumn
forStartedAt
andStartedAt
Limit: -1
instead of Limit: $number-we-think-should-be-high-enough
Allow fetching all rows with Right now we sometimes fetch all rows of a given table by specifying something like Limit: 10000
in the hopes of never having to fetch more than that. See the search results here: https://k8s.sgdev.org/search?q=repo:%5Egithub%5C.com/sourcegraph/sourcegraph%24+Limit%5C:%5Cs%2B%5Cd%7B3%2C%7D%2C+file:enterprise/internal/a8n/&patternType=regexp
Wha we already do for ChangesetEvents
is to allow specifying Limit: -1
:
Architecture and Design of a8n code
- We have
a8n.Service
andrun.Runner
— they are really similar - Should we combine them? Do we keep them separate and have two separate services?
- Do we need the
run
package? - We could change the
run.Runner
to take adone
channel as last parameter inRun
on which it sends a message when it's finished executing the jobs. That would allow us to get rid of theWaitgroup
and theWait()
method.
Use a persistent queue
Extracted into https://github.com/sourcegraph/sourcegraph/issues/6723
Execute ChangesetJobs in parallel
Extracted into https://github.com/sourcegraph/sourcegraph/issues/6722
gitserver
when applying diff fails
Error responses in Extracted into separate issue: https://github.com/sourcegraph/sourcegraph/issues/6717
Changesets are never cleaned up
Right now a user can create multiple changesets with createChangesets
and never attach them to a campaign and they'll persist.
The same happens when a user deletes a campaign: the changesets will stay around and will be synced.
We can probably find some heuristic when it's safe to clean up a changeset, i.e. when a changeset is older than 5 hours and hasn't been attached to a campaign, we delete it.
Enterprise and OSS
- We have two
a8n
packages, one in ininternal
and one inenterprise/pkg
- That always requires a named import
- We can't just have only one enterprise package because we need to access the type definitions in
internal/a8n
fromcmd/repo-updater/repos
becauserepo-updater
has theSources
that can load and createa8n.Changesets
CampaignType.searchQuery
Executing Right now we use our own wrapper around searchResolver
called graphqlbackend.RepoSearch
. Is it maybe better to use zoekt.Searcher.List
in the a8n.Runner
than graphqlbackend.RepoSearch
? See the code here @keegancsmith suggested this in https://github.com/sourcegraph/sourcegraph/pull/6309#issuecomment-548691096 (See also https://github.com/sourcegraph/sourcegraph/issues/6627 for more context on this.)
We probably want to go through the user-facing search code path for matchTemplate
(in the case of a comby
campaign) and use structural search there (once it's ready). But do we also need to do the same for searchScope
?
repos
package
Naming of and in - Naming of and in
repos
package- By now, the
repos
package should be calledexternalservice
(orcodehost
), since it does a lot more than justrepos
. It could also probably be extracted fromrepo-updater
-
repos.Source
needs to be calledClient
, because it's more than a "source of repositories" by now
- By now, the
GraphQL layer and a8n
- In order to create a new GraphQL query/mutation for
a8n
you need to:- Define it in
schema.graphql
- Define an interface in
./cmd/frontend/graphqlbackend/a8n.go
that will be implemented by the (enterprise-only)a8n
package - Implement the interface in
./enterprise/pkg/a8n/resolvers/
- Write a test in
./enterprise/pkg/a8n/resolvers
which again defines structs that reflect the GraphQL schema to which we can unmarshall the resolver response
- Define it in
- The GraphQL tests in
./enterprise/pkg/a8n/resolvers/resolver_test.go
are really verbose and heavy- They all share the same setup
- They require a lot of setup (admin user, ext service, repos just to have a
RepoID
somewhere) - They are slow due to the heavy setup
- They require inline type definitions to unmarshal GraphQL responses, duplicated across tests
Inconsistencies in type definitions
- Repo IDs are sometimes
uint32
, sometimesint32
, sometimesint64
- Inconsistent usage of
api.RepoName
andapi.CommitID
— some use it, some don't
Developer UX when dealing with external services, repos and talking to code hosts
- No predefined set of common
httpcli
middlewares inhttpcli
- If you have a repo, it's cumbersome to get to the right external service client.
go-diff
Fix multi-file diffs without extended header in go-diff
has a bug where it doesn't parse multi-file diffs correctly that have no headers between diffs.