Model extraction and membership inference risk mitigations refer to strategies designed to protect machine learning models from adversaries who attempt to steal model parameters (model extraction) or determine if specific data was used in training (membership inference). These mitigations may include techniques such as limiting API access, adding noise to outputs, using differential privacy, monitoring for suspicious queries, and applying regularization to reduce overfitting, thereby enhancing the security and privacy of deployed models.
What is model extraction?
Model extraction is an attack in which an adversary queries a deployed model to infer or clone its parameters and behavior, potentially creating a surrogate model that mimics the target.
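As a rough illustration, the sketch below trains a stand-in "victim" locally so the example runs end to end; in a real attack, query_target would be a call to a remote prediction API and the attacker would never see the victim's internals. All names and numbers here are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression

# Stand-in "victim": a model the attacker can only query, not inspect.
# (Trained locally here just so the sketch runs end to end.)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
victim = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                       random_state=0).fit(X, y)

def query_target(batch):
    # In a real attack this would be a remote prediction API call.
    return victim.predict(batch)

# Attacker samples inputs roughly matching the expected input distribution,
# labels them with the victim's own predictions, and fits a surrogate.
rng = np.random.default_rng(1)
queries = rng.normal(size=(5000, 20))
stolen_labels = query_target(queries)
surrogate = LogisticRegression(max_iter=1000).fit(queries, stolen_labels)

# Agreement on fresh inputs measures how faithfully the surrogate
# mimics the target's decision boundary.
test = rng.normal(size=(1000, 20))
agreement = (surrogate.predict(test) == victim.predict(test)).mean()
print(f"surrogate/victim agreement: {agreement:.2%}")
```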
What is membership inference?
Membership inference attempts to determine whether a specific data record was part of the model's training data by analyzing the model's outputs or confidence scores; it exploits the tendency of models, especially overfit ones, to respond more confidently to examples they were trained on.
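A minimal sketch of the classic confidence-threshold attack, assuming the attacker can obtain probability scores. The deliberately overfit model and the threshold tau are illustrative; in practice attackers tune thresholds, for example using shadow models.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# Train a deliberately overfit model so the member/non-member gap is visible.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_in, y_in)

def max_confidence(batch):
    # Highest predicted class probability: tends to be larger on training data.
    return model.predict_proba(batch).max(axis=1)

# Simple threshold attack: guess "member" when confidence exceeds tau.
tau = 0.9  # assumption: an attacker would tune this, e.g., via shadow models
guess_in = max_confidence(X_in) > tau    # true members
guess_out = max_confidence(X_out) > tau  # true non-members

# Balanced attack accuracy above 50% indicates membership leakage.
accuracy = 0.5 * (guess_in.mean() + (1 - guess_out.mean()))
print(f"membership inference accuracy: {accuracy:.2%}")
```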
What are common mitigations against model extraction?
Limit exposure with authentication, rate limits, and anomaly detection; restrict outputs (e.g., return only top-k labels or coarsened confidence scores); monitor query patterns for extraction-like behavior; and consider model watermarking or keeping weights server-side rather than shipping models to clients.
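As a rough sketch of the first two of these mitigations, the hypothetical GuardedModel wrapper below enforces a per-client rate limit and releases only the top-1 label with a coarsened confidence score. The specific limit and rounding precision are illustrative, not recommendations.

```python
import time
from collections import defaultdict

class GuardedModel:
    """Deployment wrapper sketch: rate limiting plus restricted outputs.

    Assumes `model` exposes predict_proba(batch) (e.g., scikit-learn).
    """

    def __init__(self, model, max_queries_per_minute=60, round_to=1):
        self.model = model
        self.max_qpm = max_queries_per_minute
        self.round_to = round_to
        self.history = defaultdict(list)  # client_id -> query timestamps

    def predict(self, client_id, batch):
        # Sliding one-minute window per client; excess queries are refused.
        now = time.time()
        recent = [t for t in self.history[client_id] if now - t < 60]
        if len(recent) >= self.max_qpm:
            raise RuntimeError("rate limit exceeded")  # or queue / alert
        self.history[client_id] = recent + [now]

        proba = self.model.predict_proba(batch)
        labels = proba.argmax(axis=1)
        # Release only the top-1 label and a rounded confidence score,
        # starving extraction attacks of precise probability vectors.
        confidence = proba.max(axis=1).round(self.round_to)
        return labels, confidence
```

Coarsening outputs trades a little client-side utility for making both extraction and membership inference harder, since both attacks lean on precise scores.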
What are common mitigations against membership inference?
Use privacy-preserving training (e.g., differential privacy via DP-SGD), reduce overfitting with regularization and early stopping, limit or perturb exposed outputs, and apply data minimization and strict access controls to reduce what an adversary can learn from queries.
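The sketch below shows the core mechanics of one DP-SGD-style update, the standard way differential privacy is applied at training time: clip each per-example gradient, average, and add calibrated Gaussian noise. It is a minimal illustration under assumed hyperparameters; a real implementation (e.g., Opacus or TensorFlow Privacy) also tracks the cumulative (epsilon, delta) budget with a privacy accountant, which is omitted here.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.1, rng=None):
    """One DP-SGD-style update (a minimal sketch, no privacy accountant).

    Per-example gradients are clipped to `clip_norm`, averaged, and
    Gaussian noise calibrated to the clip norm is added before the step.
    """
    rng = rng or np.random.default_rng()
    n = len(per_example_grads)

    # Clip each example's gradient so no single record dominates the update.
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    mean_grad = np.mean(clipped, axis=0)

    # Noise scaled to the clipping norm masks any individual contribution.
    noise = rng.normal(scale=noise_multiplier * clip_norm / n,
                       size=mean_grad.shape)
    return params - lr * (mean_grad + noise)
```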