Active learning risks and label drift are challenges that arise when a model selects which unlabeled data points to send for labeling in order to improve performance with fewer annotations. Risks include introducing bias or overfitting if the selected samples are not representative of the broader data distribution. Label drift occurs when the meaning or distribution of labels changes over time, producing inconsistencies that erode model accuracy. Managing both is essential for keeping active learning systems reliable and robust.
What is active learning?
Active learning is a training approach in which the model selects which unlabeled data points should be sent to an annotator for labeling, aiming to learn efficiently from fewer labeled examples.
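To make the query loop concrete, here is a minimal sketch of pool-based active learning with uncertainty sampling. The dataset, model, batch size, and number of rounds are illustrative assumptions, not a prescribed setup.

```python
# Minimal pool-based active learning sketch with uncertainty sampling.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Start with a small random labeled seed; the rest is the unlabeled pool.
labeled = np.random.default_rng(0).choice(len(X), size=20, replace=False)
pool = np.setdiff1d(np.arange(len(X)), labeled)

model = LogisticRegression(max_iter=1000)
for round_ in range(5):
    model.fit(X[labeled], y[labeled])
    # Query the pool points the model is least certain about
    # (smallest maximum class probability).
    proba = model.predict_proba(X[pool])
    uncertainty = 1.0 - proba.max(axis=1)
    query = pool[np.argsort(uncertainty)[-10:]]   # 10 most uncertain points
    labeled = np.concatenate([labeled, query])    # the "oracle" labels them
    pool = np.setdiff1d(pool, query)
    print(f"round {round_}: {len(labeled)} labeled examples")
```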
What are the main risks of active learning?
Risks include sampling bias if labeled data isn’t representative, potential overfitting to the queried subset, reduced generalization to new data, and noise from imperfect labels.
What is label drift and why does it matter?
Label drift happens when the distribution of labels changes over time; the closely related concept drift happens when the relationship between inputs and labels changes. Either way, if it is not addressed, the model's accuracy can degrade and retraining becomes necessary.
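A basic way to monitor for this kind of drift is to compare the label distribution of a recent window against a reference window with a chi-squared test. The sketch below uses synthetic windows and an assumed significance level of 0.01.

```python
# Minimal label-drift check: chi-squared test between two label windows.
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
labels_reference = rng.choice(3, size=1000, p=[0.6, 0.3, 0.1])  # older window
labels_recent = rng.choice(3, size=1000, p=[0.4, 0.3, 0.3])     # newer window

counts = np.vstack([
    np.bincount(labels_reference, minlength=3),
    np.bincount(labels_recent, minlength=3),
])
stat, p_value, dof, expected = chi2_contingency(counts)
if p_value < 0.01:
    print(f"label distribution shift detected (p={p_value:.4g}); consider retraining")
else:
    print(f"no significant shift detected (p={p_value:.4g})")
```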
How can you mitigate active learning risks and label drift?
Use diverse or stratified sampling, mix in some random labeling, ensure high-quality annotations, monitor for drift, and retrain with up-to-date data.
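As a sketch of the "mix in some random labeling" point, the snippet below blends uncertainty-based queries with a random fraction (an epsilon-greedy style strategy) so the labeled set does not collapse onto the current decision boundary. The epsilon value and batch size are assumptions, and the function name is illustrative.

```python
# Epsilon-greedy query selection: mostly uncertain points, plus random ones.
import numpy as np

def select_queries(uncertainty, pool_indices, batch_size=10, epsilon=0.2, seed=0):
    """Pick a batch of pool indices: most-uncertain points plus a random slice."""
    rng = np.random.default_rng(seed)
    n_random = int(round(epsilon * batch_size))
    n_uncertain = batch_size - n_random

    order = np.argsort(uncertainty)                    # ascending uncertainty
    uncertain_picks = pool_indices[order[-n_uncertain:]]
    remaining = np.setdiff1d(pool_indices, uncertain_picks)
    random_picks = rng.choice(remaining, size=n_random, replace=False)
    return np.concatenate([uncertain_picks, random_picks])

# Example usage with toy values.
pool_indices = np.arange(100)
uncertainty = np.random.default_rng(1).random(100)
print(select_queries(uncertainty, pool_indices))
```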