Sparse retrieval advances refer to improvements in information retrieval techniques that use sparse representations. BM25+ refines the classic BM25 ranking function's term-frequency saturation and document-length normalization. SPLADE (Sparse Lexical and Expansion Model) uses neural networks to generate sparse, expanded term representations, improving matching accuracy. Learned sparse models more broadly use deep learning to produce sparse vectors for documents and queries, enabling efficient search. In Retrieval-Augmented Generation (RAG), these methods supply generative models with relevant retrieved context.
What is sparse retrieval?
A retrieval approach that uses sparse representations (mostly zeros) for queries and documents, enabling fast, scalable search with inverted indexes and term-level signals.
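The idea above can be sketched with a toy inverted index: each term maps to the documents containing it, so a query only touches postings for its own terms. This is a minimal illustration in pure Python; the documents, tokenization, and scoring (summed term frequencies) are simplified assumptions, not any particular library's behavior.

```python
from collections import defaultdict

# Toy corpus; real systems would tokenize, lowercase, and stem.
docs = {
    "d1": "sparse retrieval uses inverted indexes",
    "d2": "neural models learn dense vectors",
    "d3": "inverted indexes enable fast sparse search",
}

# Build the inverted index: term -> {doc_id: term frequency}.
index = defaultdict(dict)
for doc_id, text in docs.items():
    for term in text.split():
        index[term][doc_id] = index[term].get(doc_id, 0) + 1

def search(query):
    """Score documents by summed term frequencies of matching query terms."""
    scores = defaultdict(float)
    for term in query.split():
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    return sorted(scores.items(), key=lambda kv: -kv[1])
```

Because only documents sharing at least one query term are ever scored, search cost scales with postings touched, not collection size; this is why sparse methods stay fast on large corpora.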
What is BM25+?
A refined version of BM25 that adds a small constant delta to lower-bound each matching term's contribution, preventing very long documents from being over-penalized by length normalization, while keeping a simple, fast scoring formula.
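A hedged sketch of the per-term BM25+ score as described by Lv and Zhai (2011): standard BM25 term-frequency saturation and length normalization, plus a delta that floors the contribution of any matching term. Parameter defaults (k1, b, delta) are common choices, not prescriptions.

```python
import math

def bm25_plus(tf, doc_len, avg_dl, n_docs, df, k1=1.2, b=0.75, delta=1.0):
    """Per-term BM25+ score for a term with frequency tf in a document.

    tf: term frequency in the document (must be > 0 for a match)
    doc_len / avg_dl: document length and collection average length
    n_docs / df: collection size and the term's document frequency
    delta: additive floor that keeps long documents from being over-penalized
    """
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    norm = tf + k1 * (1.0 - b + b * doc_len / avg_dl)
    return idf * (tf * (k1 + 1.0) / norm + delta)
```

The delta term guarantees that a document containing a query term always scores above idf * delta for that term, no matter how long the document is; plain BM25 lacks this lower bound.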
What is SPLADE?
A method that learns sparse representations of queries and documents to enable effective lexical matching with neural models, while remaining compatible with traditional inverted-index search.
What are learned sparse models?
Models trained to produce sparse, term-like representations from text, combining neural understanding with fast, indexable search.
How do these advances help at scale?
They improve ranking accuracy while preserving fast retrieval on large collections by using sparse indices that support both neural signals and lexical matching.
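Concretely, once queries and documents are learned sparse weight vectors (term to weight), ranking is a dot product computed by traversing postings, exactly like classic lexical retrieval. A minimal sketch, with illustrative document vectors and weights:

```python
from collections import defaultdict

# Hypothetical learned sparse vectors: term -> weight per document.
doc_vectors = {
    "d1": {"sparse": 1.2, "index": 0.8},
    "d2": {"dense": 1.0, "vector": 0.9},
    "d3": {"sparse": 0.5, "search": 1.1},
}

# Invert into postings: term -> list of (doc_id, weight).
postings = defaultdict(list)
for doc_id, vec in doc_vectors.items():
    for term, w in vec.items():
        postings[term].append((doc_id, w))

def retrieve(query_vec, k=2):
    """Accumulate query_weight * doc_weight over shared terms; return top-k."""
    scores = defaultdict(float)
    for term, qw in query_vec.items():
        for doc_id, dw in postings.get(term, ()):
            scores[doc_id] += qw * dw
    return sorted(scores.items(), key=lambda kv: -kv[1])[:k]

top = retrieve({"sparse": 1.0, "search": 0.5})
```

Because the scoring loop only visits postings for nonzero query terms, the same inverted-index machinery that serves BM25 serves neural weights unchanged, which is the key to scaling these models to large collections and to RAG pipelines.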