campaigns: Introduce heuristic syncing of changesets

Warren Gifford requested to merge heuristic-syncer into master

Created by: ryanslade

This changes our syncer from syncing all changesets every 2 minutes to one that syncs each changeset based on a combination of when it was last synced and when it last changed. We are also able to inject "high priority" items for immediate syncing.

For now, the heuristics used are very simple and more of a linear backoff than anything else. We can extend them later if needed.

The syncer operates in the following way:

A schedule is generated on a set interval, currently every two minutes. The schedule is computed by fetching sync data from the database:

  • Last sync with codehost
  • Last remote change, i.e. the codehost "updated_at" value when we last synced
  • Latest event (this could be newer than either of the above due to webhooks)

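From the above, we work out when to sync next by taking the difference between when we last synced and max(remote change, latest event) and applying linear backoff: the longer a changeset had been unchanged at the time of the last sync, the longer we wait before syncing it again.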

We cap the interval between syncs at a maximum of 8 hours and a minimum of 2 minutes.

For example:

  • Last sync 10:00, latest change 9:55. Next sync 10:05.
  • Last sync 10:00, latest change 9:50. Next sync 10:10.
  • Last sync 10:00, latest change 10:01. Next sync 10:02. (The sync happened before the latest event, but we never sync more frequently than every 2 minutes.)

Once we have the schedule we use it to trigger our syncs in chronological order.

Every two minutes the schedule is replaced. This is less efficient, but we need to fetch new data from the db anyway, and being able to throw away the old schedule and use a new one simplifies the code.

Still TODO in followup PRs:

  • Instrumentation / dashboards
  • Shared, configurable rate limiter
  • Sync now UI
  • Concurrency. We only process one sync at a time for now
  • We could potentially calculate the next sync time in the db and return a limited subset of items that will sync soon, but I've left that work in code for simplicity for now

Closes: https://github.com/sourcegraph/sourcegraph/issues/6388
