codeintel: Rewrite janitor

Administrator requested to merge ef/janitor-refactor into main

Created by: efritz

This is an attempt to clean up the janitor to better fit Postgres as the data store. It is partially a cleanup effort following the migration (RFC 235), which left a lot of cruft around this piece. The janitor now has seven distinct responsibilities, some of which will be permanent:

  • Remove all upload files on disk older than a certain age
  • Remove all upload part files on disk older than a certain age
  • Soft delete lsif_uploads records that are uploading longer than a given time period
  • Soft delete lsif_uploads records with no matching repository
  • Soft delete lsif_uploads records that are not visible at tip and are older than a given time period
  • Hard delete lsif_uploads records transactionally with the data associated with them in the codeintel db
  • (temporary) Find all data in the codeintel db that has no matching lsif_uploads record and delete it. This is to catch stuff that falls through the cracks and should be subsumed by the task above in normal operation.

The first two are the only file management tasks that remain in code, and they can be taken out entirely if we use S3/GCS with object lifecycle management.

This started as an effort to update our monitoring for the bundle manager, but I pulled on too much of the noodle and this meatball fell out.

5 second rule for reviewers starts NOW.

To review: Read the new version of enterprise/cmd/precise-code-intel-bundle-manager/internal/janitor/janitor.go and compare to the list above.

This closes https://github.com/sourcegraph/sourcegraph/issues/12168 and covers any outstanding details of RFC 205.
