
doc/dev: document API docs search indexing architecture, tradeoffs, scaling, etc.

Warren Gifford requested to merge sg/apidocs-search-architecture into main

Created by: slimsag

In #24928 I have been experimenting with a way to index API docs for search integration, and I now have an approach using Postgres FTS that I am happy with for now.

This PR begins the process of landing all of that by documenting:

  • Tradeoffs (why we're using Postgres FTS instead of e.g. Zoekt, and that I consulted the search team on this decision)
  • What is indexed? (only the most recent upload of the default branch)
  • Scaling estimation (good enough for now, though maybe not the best long-term solution; I've done due diligence to make sure we're not breaking the DB, and we can roll this out slowly and gradually)
  • Limiting the search index size (I've added an escape hatch to limit how much data will be in the search index)
  • Public vs. private repositories (how we're handling them: we will index both immediately, but may only query public repos to begin with)

Helps #25193

Signed-off-by: Stephen Gutekanst [email protected]
