Low-memory symbol indexing with bloom filters
The latest Metals release introduces three new in-memory indexes to implement the features "find symbol references" and "fuzzy symbol search". Indexes are important to provide fast response times for user requests but they come at the price of higher memory usage. To keep memory usage low, Metals uses a data structure called bloom filters that implements space-efficient sets. Thanks to bloom filters, the three new indexes added in the last release use only a few megabytes of memory even for large projects with >500k lines of code.
In this post, we look into how Metals uses bloom filters for fast indexing with small memory footprint. We explain what bloom filters are and how we can encode problems like fuzzy searching to take advantage of the nice properties of bloom filters. Finally, we evaluate these new features on a real-world project: the Akka build.