Prompt Sensitivity and Robustness Probes in LLM Evaluations (evals) are systematic tests designed to assess how large language models respond to slight variations or perturbations in input prompts. These probes evaluate a model's consistency, reliability, and ability to produce stable outputs when prompts are reworded, reordered, or slightly altered. They help researchers identify vulnerabilities, biases, and unexpected behaviors, and ultimately improve the robustness and trustworthiness of language models in real-world applications.
What is prompt sensitivity in language models?
Prompt sensitivity measures how much a model's output changes when the prompt is slightly altered (wording, order, or supplied examples). High sensitivity means small changes can yield different answers; low sensitivity means outputs stay more stable.
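One way to quantify this is the fraction of prompt variants whose output differs from a baseline. Below is a minimal sketch; `query_model` is a hypothetical stand-in for a real LLM call, made deliberately wording-dependent to mimic a sensitive model.

```python
def query_model(prompt: str) -> str:
    # Toy stand-in for an LLM call: the answer depends on the exact
    # wording, mimicking a prompt-sensitive model.
    if "capital" in prompt.lower():
        return "Paris"
    return "unsure"

def sensitivity_rate(variants: list[str]) -> float:
    """Fraction of prompt variants whose output differs from the first."""
    outputs = [query_model(v) for v in variants]
    baseline = outputs[0]
    diffs = sum(o != baseline for o in outputs[1:])
    return diffs / max(len(outputs) - 1, 1)

variants = [
    "What is the capital of France?",
    "Name France's capital city.",
    "France's chief city of government is?",  # avoids the word 'capital'
]
rate = sensitivity_rate(variants)  # 1 of 2 variants diverges -> 0.5
```

A rate near 0 indicates stable behavior across rewordings; a rate near 1 indicates high sensitivity.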
What are robustness probes in prompt evaluation?
Robustness probes are tests that vary prompts in systematic ways (paraphrases, synonyms, formatting changes, or adversarial prompts) to see if the model's answers stay accurate and consistent.
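Simple string-level transforms (casing, whitespace, punctuation) are common low-cost probes; the sketch below generates a few such variants from one prompt. Paraphrase variants would normally come from a separate model or a human-written set, which is omitted here.

```python
def perturbations(prompt: str) -> list[str]:
    """Generate simple surface-level variants of a prompt."""
    return [
        prompt,                       # original
        prompt.lower(),               # casing change
        prompt.rstrip("?").strip(),   # punctuation removed
        "  " + prompt + "  ",         # extra surrounding whitespace
        prompt.replace("?", "??"),    # formatting/emphasis change
    ]

probes = perturbations("What is 2 + 2?")
```

Each variant preserves the question's meaning, so a robust model should answer all of them the same way.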
How do you design prompt probes for assessing robustness?
Create multiple prompt variants that preserve meaning, including paraphrases and different contexts. Run the model on each variant and compare the outputs for consistency and correctness.
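The comparison step can be sketched as follows. Given the outputs collected from each variant, this hypothetical helper reports mutual consistency (share of outputs agreeing with the most common answer) and accuracy against an expected answer.

```python
from collections import Counter

def evaluate_variants(outputs: list[str], expected: str) -> tuple[float, float]:
    """Return (consistency, accuracy) over a set of variant outputs."""
    counts = Counter(outputs)
    _, freq = counts.most_common(1)[0]
    consistency = freq / len(outputs)                # agreement with the mode
    accuracy = outputs.count(expected) / len(outputs)  # exact-match accuracy
    return consistency, accuracy

# Outputs gathered from four paraphrased prompts:
outs = ["4", "4", "four", "4"]
consistency, accuracy = evaluate_variants(outs, expected="4")  # 0.75, 0.75
```

Note that exact string matching undercounts semantically equivalent answers ("four" vs "4"); normalizing outputs before comparison is a common refinement.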
What metrics help evaluate prompt robustness?
Look at output consistency (agreement across variants), factual accuracy across prompts, semantic similarity of responses, and stability of confidence or scoring when applicable.
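For semantic similarity of responses, production evals typically use embedding models; as a lightweight, dependency-free stand-in, token-level Jaccard similarity can be sketched like this.

```python
def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two responses (0.0 to 1.0)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # two empty responses count as identical
    return len(ta & tb) / len(ta | tb)

sim = jaccard("The capital is Paris", "Paris is the capital")  # 1.0
```

Identical token sets score 1.0 regardless of word order; disjoint responses score 0.0. Embedding-based cosine similarity handles paraphrases that share no tokens, which this stand-in cannot.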
How can I improve prompt robustness in practice?
Use clear, explicit instructions; include a diverse set of examples; test with several rephrasings during development; identify prompts that cause inconsistent results and revise them accordingly.
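The "identify prompts that cause inconsistent results" step can be automated in development. The sketch below flags any prompt whose rephrasings produce disagreeing outputs; `run` is a hypothetical deterministic stub in place of a real model call.

```python
def run(prompt: str) -> str:
    # Deterministic toy stand-in for an LLM call.
    return "yes" if "please" in prompt.lower() else "no"

# Each named prompt with a set of rephrasings to test:
prompt_sets = {
    "polite": ["Please confirm.", "please confirm"],
    "mixed": ["Please confirm.", "Confirm."],
}

# Flag prompts whose variants yield more than one distinct output.
flagged = [
    name for name, variants in prompt_sets.items()
    if len({run(v) for v in variants}) > 1
]
# flagged -> ["mixed"]: its two rephrasings disagree, so revise it.
```

Flagged prompts are candidates for clearer instructions or added examples before deployment.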