ColBERT (Contextualized Late Interaction over BERT) and the late-interaction paradigm are retrieval techniques frequently used in Retrieval-Augmented Generation (RAG) pipelines to improve search over large document collections. ColBERT uses BERT to represent queries and documents as sets of token-level embeddings. Late interaction means these embeddings are compared at a fine-grained, token level at scoring time, rather than being compressed into a single vector during encoding. The approach balances retrieval accuracy and efficiency, making it effective for tasks like question answering and document search.
What is ColBERT?
ColBERT (Contextualized Late Interaction over BERT) is an information retrieval model that uses BERT to produce contextualized token embeddings for queries and documents, and scores relevance using a late-interaction mechanism.
What does 'late interaction' mean in ColBERT?
Late interaction means query and document embeddings are computed independently first; the final relevance score is formed later by comparing their token embeddings, rather than through joint cross-attention during encoding.
How is the ColBERT relevance score computed?
For each query token, ColBERT takes the maximum similarity (cosine or dot product) with any document token embedding (the MaxSim operator) and sums these maxima across all query tokens to obtain the final relevance score.
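The MaxSim scoring described above can be sketched in a few lines of NumPy. This is a minimal illustration, not ColBERT's actual implementation: the random matrices stand in for BERT token embeddings, and we assume rows are L2-normalized so dot products equal cosine similarities.

```python
import numpy as np

def maxsim_score(q_emb: np.ndarray, d_emb: np.ndarray) -> float:
    """ColBERT-style MaxSim: for each query token embedding, take the
    highest similarity with any document token embedding, then sum
    those maxima over all query tokens.

    q_emb: (num_query_tokens, dim); d_emb: (num_doc_tokens, dim).
    Both are assumed L2-normalized, so dot products are cosines."""
    sim = q_emb @ d_emb.T                 # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())   # max over doc tokens, sum over query tokens

def normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Toy stand-ins for BERT outputs (a real system would encode actual text).
rng = np.random.default_rng(0)
q = normalize(rng.normal(size=(4, 8)))    # 4 query tokens, dim 8
d = normalize(rng.normal(size=(20, 8)))   # 20 document tokens, dim 8
print(maxsim_score(q, d))
```

Note that scoring a document against itself yields exactly the number of query tokens (each token's best match is itself with cosine 1), which is a handy sanity check.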
Why is ColBERT suitable for large collections?
Because document token embeddings can be precomputed and indexed offline, candidate documents can be found via approximate nearest-neighbor search over those token embeddings, and only the candidates need full MaxSim scoring; this avoids running expensive cross-attention between the query and every document at search time.