usage-data: maintain state of progress when scraping events
Created by: coury-clark
Closes https://github.com/sourcegraph/sourcegraph/issues/39089
Captures the progress of the scraping job. Adds a new table, event_logs_scrape_state, that records a bookmark: the highest event_id that was successfully sent. If no state is found, a new state row is initialized at the current event. Older events are not backfilled.
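The bookmark behavior described above can be sketched roughly as follows. This is an in-memory illustration, not the code in this PR: Event, ScrapeState, maxEventID, eventsAfter, and scrapeOnce are hypothetical names, the event log is a plain slice rather than a database table, and it assumes that the first run only initializes the bookmark and that a failed send leaves the bookmark unchanged.

```go
package main

// Event is a hypothetical stand-in for a row in the event log.
type Event struct {
	ID int
}

// ScrapeState is a hypothetical in-memory stand-in for the row stored in
// event_logs_scrape_state. Found is false until a bookmark has been recorded.
type ScrapeState struct {
	BookmarkID int
	Found      bool
}

// maxEventID returns the highest ID among events, or 0 if there are none.
func maxEventID(events []Event) int {
	highest := 0
	for _, e := range events {
		if e.ID > highest {
			highest = e.ID
		}
	}
	return highest
}

// eventsAfter returns the events whose ID is strictly greater than bookmark.
func eventsAfter(events []Event, bookmark int) []Event {
	var out []Event
	for _, e := range events {
		if e.ID > bookmark {
			out = append(out, e)
		}
	}
	return out
}

// scrapeOnce runs one pass of the (hypothetical) job and returns the number
// of events sent. The send callback stands in for publishing to the topic.
func scrapeOnce(state *ScrapeState, log []Event, send func([]Event) error) (int, error) {
	if !state.Found {
		// No state yet: start at the current event. Older events are not backfilled.
		state.BookmarkID = maxEventID(log)
		state.Found = true
	}
	batch := eventsAfter(log, state.BookmarkID)
	if len(batch) == 0 {
		return 0, nil
	}
	if err := send(batch); err != nil {
		// Assumption: on failure the bookmark stays put so the batch is retried.
		return 0, err
	}
	// Advance the bookmark only after the batch was sent successfully.
	state.BookmarkID = maxEventID(batch)
	return len(batch), nil
}
```

With a log holding events 1 to 3 and no stored state, the first call to scrapeOnce sets the bookmark to 3 and sends nothing. Once events 4 and 5 arrive, the next call sends those two and moves the bookmark to 5.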
Test plan
To test locally:
Set up GCP credentials
gcloud auth application-default login
Start sg
GOOGLE_APPLICATION_CREDENTIALS="$HOME/.config/gcloud/application_default_credentials.json" sg start ...
Add configuration to the site config
"exportUsageTelemetry": {
  "enabled": true,
  "topicProjectName": "sourcegraph-dogfood",
  "topicName": "usage-data-testing"
},
Log lines will indicate job progress
[ worker] INFO worker.export-usage-telemetry telemetry/telemetry_job.go:127 fetching events from bookmark {"bookmark_id": 21333}
[ worker] INFO worker.export-usage-telemetry telemetry/telemetry_job.go:138 telemetryHandler executed {"event count": 5, "maxId": 21338}
Check the bookmark table
select bookmark_id from event_logs_scrape_state order by id limit 1;