Fix GitHub public repos to properly tag archived repos
Created by: varsanojidan
Fixes https://github.com/sourcegraph/customer/issues/1285
The gist of this change is that the GitHub API for listing public repos doesn't include an archived field (see this issue), so we have no way of knowing which repos are archived.
I had previously made a change so that if {"exclude":[{"archived": true}]} was set, it would use the GitHub search API instead, which can tell us all of the public non-archived repos.
My oversight was that it's still important to know whether a repo is archived even when we aren't outright excluding archived repos (see linked issue up top), so I have changed how listing all public repos works for the GitHub code host.
First, it uses the search API to query for all of the public repos that are archived and builds a hashmap from the results. Then it queries the regular GitHub API to list all of the public repos and, before sending the result back, checks each repo against the hashmap to get its true archived state.
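The two-step merge described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Repo` type, the `archivedSet` and `applyArchivedState` helpers, and the choice of `NameWithOwner` as the lookup key are all assumptions made for the example, and the API calls themselves are replaced by in-memory slices.

```go
package main

import "fmt"

// Repo is a hypothetical, trimmed-down repository record used only for
// this sketch; the real type in the code host client has more fields.
type Repo struct {
	NameWithOwner string
	Archived      bool
}

// archivedSet builds a lookup set from the repos returned by the search
// query for public archived repos. Keying by NameWithOwner is an assumption.
func archivedSet(archived []Repo) map[string]struct{} {
	set := make(map[string]struct{}, len(archived))
	for _, r := range archived {
		set[r.NameWithOwner] = struct{}{}
	}
	return set
}

// applyArchivedState returns a copy of the listed repos with Archived set
// according to membership in the lookup set, since the list endpoint does
// not report that field itself.
func applyArchivedState(listed []Repo, set map[string]struct{}) []Repo {
	out := make([]Repo, len(listed))
	for i, r := range listed {
		_, r.Archived = set[r.NameWithOwner]
		out[i] = r
	}
	return out
}

func main() {
	// Stand-ins for the two API responses.
	fromSearch := []Repo{{NameWithOwner: "org/old-tool", Archived: true}}
	fromList := []Repo{{NameWithOwner: "org/old-tool"}, {NameWithOwner: "org/active"}}

	for _, r := range applyArchivedState(fromList, archivedSet(fromSearch)) {
		fmt.Printf("%s archived=%t\n", r.NameWithOwner, r.Archived)
	}
}
```

Building the set once keeps the per-repo check constant time, so the extra cost over a plain listing is a single pass over the search results.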
Test plan
Tested locally against the dogfood GitHub instance to validate that search now properly filters archived public GitHub repos.
Config:
{
  "GITHUB": [
    {
      "url": "https://ghe.sgdev.org",
      "token": "redacted",
      "repositoryQuery": [
        "public"
      ]
    }
  ]
}