Implement cleanup of repos that are cloned on the wrong shard

Warren Gifford requested to merge rg/delete_missharded_repos into main

Created by: rafax

Adds Janitor logic to delete repositories that are cloned on disk but not assigned to the current gitserver shard. The feature is disabled by default. Once enabled, it deletes up to a configurable number of repositories per run; beyond that limit it keeps collecting stats without deleting anything. The actual on-disk deletion is performed by the corrupt-repos check.

[gitserver-0] INFO Server server/cleanup.go:178 removing repo cloned on the wrong shard {"repo": "/Users/rafal/.sourcegraph/repos/0/github.com/rafax/gojam/.git", "target-shard": "127.0.0.1:3502", "size-bytes": 991134, "deleted-this-run": 1, "deleted-limit": 10}
[gitserver-0] INFO Server server/cleanup.go:213 removing corrupt repo {"repo": "/Users/rafal/.sourcegraph/repos/0/github.com/rafax/gojam/.git", "reason": "missing-head"}
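
For reviewers who want the gist without reading the diff, here is a rough sketch of the per-run limit described above. It is not the code in this PR: `repoOnDisk`, `cleanupStats`, `cleanupMissharded`, and the `shardFor` and `markForRemoval` callbacks are hypothetical names, and treating a limit of 0 as "stats only" is an assumption about how the disabled state behaves.

```go
package main

import "fmt"

// repoOnDisk is a hypothetical record for a repository found in the local repos directory.
type repoOnDisk struct {
	Name      string
	SizeBytes int64
}

// cleanupStats summarizes one janitor run (hypothetical shape).
type cleanupStats struct {
	WrongShardCount int      // every repo found on the wrong shard, marked or not
	WrongShardBytes int64    // total size of those repos
	Marked          []string // repos marked for removal in this run
}

// cleanupMissharded visits every repo on disk and marks at most deleteLimit
// of the repos that belong to another shard. Repos beyond the limit are only
// counted. A deleteLimit of 0 is assumed to mean "collect stats, delete nothing".
func cleanupMissharded(
	repos []repoOnDisk,
	self string,
	deleteLimit int,
	shardFor func(name string) string, // assumed lookup: repo name -> owning shard address
	markForRemoval func(name string), // assumed hook; the later corrupt-repos check does the disk delete
) cleanupStats {
	var stats cleanupStats
	for _, r := range repos {
		if shardFor(r.Name) == self {
			continue // repo is where it should be
		}
		stats.WrongShardCount++
		stats.WrongShardBytes += r.SizeBytes
		if len(stats.Marked) < deleteLimit {
			markForRemoval(r.Name)
			stats.Marked = append(stats.Marked, r.Name)
		}
	}
	return stats
}

func main() {
	repos := []repoOnDisk{
		{Name: "github.com/example/a", SizeBytes: 100},
		{Name: "github.com/example/b", SizeBytes: 200},
		{Name: "github.com/example/c", SizeBytes: 300},
	}
	// Toy assignment for the demo: only repo "a" belongs to this shard.
	shardFor := func(name string) string {
		if name == "github.com/example/a" {
			return "127.0.0.1:3501"
		}
		return "127.0.0.1:3502"
	}
	mark := func(name string) { fmt.Println("marking for removal:", name) }

	stats := cleanupMissharded(repos, "127.0.0.1:3501", 1, shardFor, mark)
	fmt.Printf("wrong shard: %d repos, %d bytes, marked this run: %d\n",
		stats.WrongShardCount, stats.WrongShardBytes, len(stats.Marked))
}
```

Keeping the counters outside the limit check means the stats stay complete even when the limit is reached, which matches the "keep collecting stats without deletes beyond that" behavior.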

Test plan

  • Unit tests
  • Disabled by default, will enable once we have enough baseline metrics
  • Tested locally by running two gitserver replicas and copying repos into their GITDIR folders; both replicas correctly cleaned up the repos that did not belong to them
