Skip to content

Automation Backend: Tech Debt, Developer UX and Ideas for Architecture & Design

Created by: mrnugget

This is a braindump of all the things we ran into while working on #6085 (closed). The items on this list range from "nice to have" to "we should do it" to "we need to do this, sooner rather than later"

a8n.Store

  • Can/should we combine repos.Store and a8n.Store?
    • What about enterprise/OSS divide here?
  • The tests for a8n.Store are in one large file and have global state
  • The tests are slow because a database needs to be dropped and created for every package in which a dbtest database is used
  • In a8n.Store: the column names are duplicated in every query string. We could probably substrings and include them in the other queries.
  • UpdateCampaignJob needs nulltimeColumn for StartedAt and StartedAt

Allow fetching all rows with Limit: -1 instead of Limit: $number-we-think-should-be-high-enough

Right now we sometimes fetch all rows of a given table by specifying something like Limit: 10000 in the hopes of never having to fetch more than that. See the search results here: https://k8s.sgdev.org/search?q=repo:%5Egithub%5C.com/sourcegraph/sourcegraph%24+Limit%5C:%5Cs%2B%5Cd%7B3%2C%7D%2C+file:enterprise/internal/a8n/&patternType=regexp

Wha we already do for ChangesetEvents is to allow specifying Limit: -1:

https://github.com/sourcegraph/sourcegraph/blob/8248533108ae3f47bd5fcb451b3912a4f0152e0a/enterprise/internal/a8n/store.go#L609-L618

Architecture and Design of a8n code

  • We have a8n.Service and run.Runner — they are really similar
  • Should we combine them? Do we keep them separate and have two separate services?
  • Do we need the run package?
  • We could change the run.Runner to take a done channel as last parameter in Run on which it sends a message when it's finished executing the jobs. That would allow us to get rid of the Waitgroup and the Wait() method.

Use a persistent queue

Extracted into https://github.com/sourcegraph/sourcegraph/issues/6723

Execute ChangesetJobs in parallel

Extracted into https://github.com/sourcegraph/sourcegraph/issues/6722

Error responses in gitserver when applying diff fails

Extracted into separate issue: https://github.com/sourcegraph/sourcegraph/issues/6717

Changesets are never cleaned up

Right now a user can create multiple changesets with createChangesets and never attach them to a campaign and they'll persist.

The same happens when a user deletes a campaign: the changesets will stay around and will be synced.

We can probably find some heuristic when it's safe to clean up a changeset, i.e. when a changeset is older than 5 hours and hasn't been attached to a campaign, we delete it.

Enterprise and OSS

  • We have two a8n packages, one in in internal and one in enterprise/pkg
  • That always requires a named import
  • We can't just have only one enterprise package because we need to access the type definitions in internal/a8n from cmd/repo-updater/repos because repo-updater has the Sources that can load and create a8n.Changesets

Executing CampaignType.searchQuery

Right now we use our own wrapper around searchResolver called graphqlbackend.RepoSearch. Is it maybe better to use zoekt.Searcher.List in the a8n.Runner than graphqlbackend.RepoSearch? See the code here @keegancsmith suggested this in https://github.com/sourcegraph/sourcegraph/pull/6309#issuecomment-548691096 (See also https://github.com/sourcegraph/sourcegraph/issues/6627 for more context on this.)

We probably want to go through the user-facing search code path for matchTemplate (in the case of a comby campaign) and use structural search there (once it's ready). But do we also need to do the same for searchScope?

Naming of and in repos package

  • Naming of and in repos package
    • By now, the repos package should be called externalservice (or codehost), since it does a lot more than just repos. It could also probably be extracted from repo-updater
    • repos.Source needs to be called Client, because it's more than a "source of repositories" by now

GraphQL layer and a8n

  • In order to create a new GraphQL query/mutation for a8n you need to:
    • Define it in schema.graphql
    • Define an interface in ./cmd/frontend/graphqlbackend/a8n.go that will be implemented by the (enterprise-only) a8n package
    • Implement the interface in ./enterprise/pkg/a8n/resolvers/
    • Write a test in ./enterprise/pkg/a8n/resolvers which again defines structs that reflect the GraphQL schema to which we can unmarshall the resolver response
  • The GraphQL tests in ./enterprise/pkg/a8n/resolvers/resolver_test.go are really verbose and heavy
    • They all share the same setup
    • They require a lot of setup (admin user, ext service, repos just to have a RepoID somewhere)
    • They are slow due to the heavy setup
    • They require inline type definitions to unmarshal GraphQL responses, duplicated across tests

Inconsistencies in type definitions

  • Repo IDs are sometimes uint32, sometimes int32, sometimes int64
  • Inconsistent usage of api.RepoName and api.CommitID — some use it, some don't

Developer UX when dealing with external services, repos and talking to code hosts

  • No predefined set of common httpcli middlewares in httpcli
  • If you have a repo, it's cumbersome to get to the right external service client.

Fix multi-file diffs without extended header in go-diff

go-diff has a bug where it doesn't parse multi-file diffs correctly that have no headers between diffs.

See this piece of code