Add an index to the repository creation timestamp
Created by: flying-robot
The repositories
GraphQL endpoint supports sorting via either REPOSITORY_NAME
or REPOSITORY_CREATED_AT
. Using the timestamp doesn't hit any indexes, resulting in a much more expensive query plan:
localhost sg@sg=# EXPLAIN SELECT * FROM repo ORDER BY name DESC LIMIT 5;
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Limit (cost=0.56..1.28 rows=5 width=530) │
│ -> Index Scan Backward using repo_name_unique on repo (cost=0.56..517029.02 rows=3576655 width=530) │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(2 rows)
localhost sg@sg=# EXPLAIN SELECT * FROM repo ORDER BY created_at DESC LIMIT 5;
┌───────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├───────────────────────────────────────────────────────────────────────────────────────────┤
│ Limit (cost=394376.44..394377.04 rows=5 width=530) │
│ -> Gather Merge (cost=394376.44..822626.06 rows=3576656 width=530) │
│ Workers Planned: 4 │
│ -> Sort (cost=393376.38..395611.79 rows=894164 width=530) │
│ Sort Key: created_at DESC │
│ -> Parallel Seq Scan on repo (cost=0.00..378524.64 rows=894164 width=530) │
└───────────────────────────────────────────────────────────────────────────────────────────┘
(6 rows)
The difference in wall-clock time is very noticeable as a result:
localhost sg@sg=# SELECT id FROM repo ORDER BY name DESC LIMIT 5;
┌──────────┐
│ id │
├──────────┤
│ 41271131 │
│ 41270656 │
│ 41618978 │
│ 41164256 │
│ 42024836 │
└──────────┘
(5 rows)
Time: 60.597 ms
localhost sg@sg=# SELECT id FROM repo ORDER BY created_at DESC LIMIT 5;
┌──────────┐
│ id │
├──────────┤
│ 42090777 │
│ 42090776 │
│ 42090775 │
│ 42090774 │
│ 42090773 │
└──────────┘
(5 rows)
Time: 701.684 ms