insights: retroactively update repo names in TimescaleDB

Created by: slimsag

Since we expect insights to be filtered by repo name (or by a regex over the repo name, e.g. to find insights for a specific org), the DB stores three fields (dbschema):

  • The repo_id, which is the ID of the repo (regardless of any renames) as known by the main app DB.
  • The repo_name string, which is the name the repo had at the time the data point was recorded. We will use this to regexp-search for data points whose repo_name matches some regexp. The idea is that there would be a background worker which goes through the DB and asks the main app DB (via RepoStore.GetByID()) what the current name of the repo_id is and updates this field retroactively, thus it is possible to query based on the current name of the repo (generally speaking.)
  • The original_repo_name string, which is exactly the same as repo_name except that it will not be retroactively updated. This is useful because you might wish to see that e.g. an insight's data changed substantially as part of a major renaming effort. In that case, some data points would show the old repo name and some would show the new repo name (because original_repo_name is the name of the repo at the time the data point was recorded).

Everything described above is implemented, and all the fields described above are being recorded - but the background worker which updates repo_name to match the latest-known name for the repo is not:

The idea is that there would be a background worker which goes through the DB and asks the main app DB (via RepoStore.GetByID()) what the current name of the repo_id is and updates this field retroactively, thus it is possible to query based on the current name of the repo (generally speaking.)

We should implement that or, if we don't care about repo renames, ditch it and just have a single repo_name field.