github: automatic repositoryQuery refinement
Created by: tsenart
This PR extends our GitHubSource to automatically refine repositoryQuery queries that hit the 1000-result limit enforced by GitHub's Search API. It does so by ANDing the original query with a created: qualifier whose range initially spans from before GitHub was created until time.Now(). It then narrows that range until the limit is no longer hit, and traverses the search space by moving the window that created: matches until it reaches the time before GitHub was created (i.e. minCreated).
Existing customers that use the manual version of this work-around should not need to change anything. This code path only kicks in when we hit the search API limit.
This is needed for us to continuously sync all repos with more than 1 star, with a site-level external service.
Fixes #2562 (closed)