team/codintel - implemented symbol-pagerank POC
Created by: steveyegge
This CL implements a simple version of symbol-pagerank.
The intuition is that if you treat symbols as web pages, and calls and references to symbols as hyperlinks, then PageRank would provide a good estimate of the "importance" of each symbol.
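To make the intuition concrete, here is a minimal sketch of PageRank over a symbol reference graph. It is plain deterministic power iteration written for illustration, not the PageRank library this CL uses; the Edge type, the PageRank function, the damping value, and the symbol names are all assumptions.

```go
package main

import "fmt"

// Edge is a hypothetical "hyperlink": a call or reference from one symbol to another.
type Edge struct{ From, To string }

// PageRank runs power iteration over the edges. damping is the probability of
// following a link rather than jumping to a uniformly chosen symbol.
func PageRank(edges []Edge, damping float64, iterations int) map[string]float64 {
	out := map[string][]string{}
	nodes := map[string]bool{}
	for _, e := range edges {
		out[e.From] = append(out[e.From], e.To)
		nodes[e.From], nodes[e.To] = true, true
	}
	rank := make(map[string]float64, len(nodes))
	if len(nodes) == 0 {
		return rank
	}
	n := float64(len(nodes))
	for node := range nodes {
		rank[node] = 1 / n
	}
	for i := 0; i < iterations; i++ {
		// Symbols with no outgoing links spread their rank evenly over all symbols.
		dangling := 0.0
		for node := range nodes {
			if len(out[node]) == 0 {
				dangling += rank[node]
			}
		}
		base := (1-damping)/n + damping*dangling/n
		next := make(map[string]float64, len(nodes))
		for node := range nodes {
			next[node] = base
		}
		// Every other symbol splits its rank equally among its outgoing links.
		for from, targets := range out {
			share := damping * rank[from] / float64(len(targets))
			for _, to := range targets {
				next[to] += share
			}
		}
		rank = next
	}
	return rank
}

func main() {
	edges := []Edge{
		{"main", "ParseConfig"},
		{"main", "Log"},
		{"ParseConfig", "Log"},
		{"Serve", "Log"},
	}
	for sym, r := range PageRank(edges, 0.85, 50) {
		fmt.Printf("%-12s %.4f\n", sym, r)
	}
}
```

In this toy graph Log is referenced from three places and ends up with the highest rank, which matches the "importance" reading described above.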
We do not store much symbol scope information, so I have squashed all the symbols to be file-scoped. Source files inherit the incoming links (and corresponding page rank) for all symbols they contain.
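As a rough illustration of the squashing step, the sketch below turns symbol references into file-to-file links by pointing each reference at the file that defines the symbol. The Reference and FileEdge types, the definedIn map, and the choice to drop same-file references are assumptions made for this example, not the data structures in this CL.

```go
package main

import "fmt"

// Reference is a hypothetical record: a use of Symbol that occurs in FromFile.
type Reference struct {
	FromFile string
	Symbol   string
}

// FileEdge is a hypothetical file-to-file link.
type FileEdge struct{ From, To string }

// FileEdges squashes symbol references to file scope: each reference becomes a
// link from the referencing file to the file that defines the symbol.
func FileEdges(refs []Reference, definedIn map[string]string) []FileEdge {
	var edges []FileEdge
	for _, r := range refs {
		target, ok := definedIn[r.Symbol]
		// Assumption: skip symbols with no known definition, and skip
		// references within the defining file so files do not link to themselves.
		if !ok || target == r.FromFile {
			continue
		}
		edges = append(edges, FileEdge{From: r.FromFile, To: target})
	}
	return edges
}

func main() {
	definedIn := map[string]string{"Log": "log.go", "ParseConfig": "config.go"}
	refs := []Reference{
		{"main.go", "ParseConfig"},
		{"main.go", "Log"},
		{"config.go", "Log"},
		{"log.go", "Log"},      // same-file reference, dropped
		{"main.go", "Unknown"}, // no definition known, dropped
	}
	for _, e := range FileEdges(refs, definedIn) {
		fmt.Printf("%s -> %s\n", e.From, e.To)
	}
}
```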
It's implemented as a command-line tool in the lib/codeintel tree. It relies heavily on the code that handles LSIF uploads, and it constructs the symbol reference graph from the in-memory LSIF representation before that representation is chunked out.
I also provide an option for including the "implements" graph as file links--that is to say, edges between implementable things (such as interfaces or abstract classes) and their implementation sites. You can think of these as just another kind of symbol reference. Depending on the indexer, these edges may represent implementations, method overrides, or other specific symbol relationships. This information is available in the index, so I threw it in. However, although it is clearly effective at raising the prominence of implementable types, it still suffers from various issues and is not yet ready to be a default.
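One way to picture the option is as a flag that appends extra edges to the reference graph before ranking. The sketch below assumes the edge points from the implementation site to the implementable type, since that is the direction that would raise the type's rank; the Implementation type, the BuildEdges function, and the flag name are hypothetical.

```go
package main

import "fmt"

// FileEdge is a hypothetical file-to-file link.
type FileEdge struct{ From, To string }

// Implementation is a hypothetical record pairing the file of an implementation
// site with the file of the interface or abstract class it implements.
type Implementation struct{ ImplFile, InterfaceFile string }

// BuildEdges returns the reference edges, plus one edge per implementation
// when includeImplements is set.
func BuildEdges(refs []FileEdge, impls []Implementation, includeImplements bool) []FileEdge {
	edges := append([]FileEdge(nil), refs...) // copy so the caller's slice is not modified
	if !includeImplements {
		return edges
	}
	for _, im := range impls {
		// Assumed direction: implementation site -> implementable type.
		edges = append(edges, FileEdge{From: im.ImplFile, To: im.InterfaceFile})
	}
	return edges
}

func main() {
	refs := []FileEdge{{"main.go", "store.go"}}
	impls := []Implementation{{"memstore.go", "store.go"}}
	fmt.Println(len(BuildEdges(refs, impls, false)), "edges without implements")
	fmt.Println(len(BuildEdges(refs, impls, true)), "edges with implements")
}
```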
Overall, the implementation suffers from various shortcomings and should not be considered production-ready. The PageRank implementation it uses is MIT-licensed Go code from GitHub; it needs a security and legal review, as well as a review for accuracy and performance.
The output of the tool is a list of all filenames in the LSIF index, each paired with its page rank. It writes text lines to stdout for now. You can run it on any LSIF index to see how the files rank. Note that the algorithm is nondeterministic because it involves random jumps, so the list will be different each time it runs. However, page ranks do not usually vary widely from run to run.
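For a sense of what consuming the result looks like, here is a small sketch that formats file ranks as text lines. The line format (rank, tab, filename), the descending sort, and the FormatRanks name are assumptions for illustration; the tool's actual output format may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// FormatRanks returns one "rank<TAB>filename" line per file, highest rank first.
// Ties are broken by filename so the ordering is stable for a given input.
func FormatRanks(ranks map[string]float64) []string {
	files := make([]string, 0, len(ranks))
	for f := range ranks {
		files = append(files, f)
	}
	sort.Slice(files, func(i, j int) bool {
		if ranks[files[i]] != ranks[files[j]] {
			return ranks[files[i]] > ranks[files[j]]
		}
		return files[i] < files[j]
	})
	lines := make([]string, 0, len(files))
	for _, f := range files {
		lines = append(lines, fmt.Sprintf("%.6f\t%s", ranks[f], f))
	}
	return lines
}

func main() {
	// Made-up ranks, not real output from the tool.
	ranks := map[string]float64{"log.go": 0.6, "config.go": 0.25, "main.go": 0.15}
	for _, line := range FormatRanks(ranks) {
		fmt.Println(line)
	}
}
```

Because ranks shift slightly between runs, comparing two runs by rank order is likely to be more stable than comparing the raw values.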