Perturbations
Overview
H2O Eval Studio provides several perturbation methods to test the robustness of RAG systems, LLMs, and predictive (NLP) models. The perturbations are applied to the text input, and the model output on the original input is then compared to the model output on the perturbed input. If the model output changes significantly, the model is considered to be not robust.
The following perturbation methods are supported:
- Random Character Perturbation
- QWERTY Perturbation
- Comma Perturbation
- Word Swap Perturbation
- Synonym Perturbation
- Antonym Perturbation
Use cases
Chatbots and virtual assistants: Ensure that the chatbot is robust to typos, abbreviations written in lower case, missing punctuation, and other common mistakes. Perturbing prompts can help to ensure that the chatbot is robust and responds as expected.
Social media datasets analysis: Social media text data contains a significant number of abbreviations, slang, emojis, and other non-standard language. Perturbing the text can help to ensure that the model is robust to these variations.
Summarization and translation: Ensure that the model is robust to diverse writing styles, synonyms, typographical errors, and abbreviations.
Question answering: Ensure that the model is robust to rephrased questions, different writing styles, synonyms, and common mistakes.
Perturbations guide
Perturbations can be used to test the robustness of a model using the following steps:
- You can perturb a test case, test, or test suite using a sequence of perturbations.
- Test cases with perturbed prompts are added to the corresponding tests, and relationships are used to link perturbed prompts to the original prompts. Categories describing the perturbation methods and their intensity are added to the test case categories.
- You can then run any evaluator, and it will detect flips of the metric(s) calculated by the evaluator.
- A metric flip is defined as follows:
  - Every metric has a threshold value, which is configurable using the evaluator parameters.
  - Each metric calculated by the evaluator has associated metadata that include information like "higher is better/worse".
  - If the metric value for the original prompt is below the threshold, while the value of the same metric is above the threshold for the perturbed prompt, then it is reported as a flip (and vice versa: original above, perturbed below).
- Metric flips are reported in the evaluation report as robustness problems.
Flip example:
- For instance, the hallucination metric value for a prompt might be 0.3 before perturbation and 0.8 after perturbation, with a threshold of 0.75.
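The flip rule described above can be sketched in a few lines of Python. This is only an illustration of the definition, not Eval Studio code: the function name `is_flip` is hypothetical, and treating a value exactly equal to the threshold as "not below" is an assumption, since the text does not say how ties are handled.

```python
def is_flip(original: float, perturbed: float, threshold: float) -> bool:
    """Return True when the two metric values fall on opposite sides of the threshold.

    Assumption: a value exactly equal to the threshold counts as "not below".
    """
    return (original < threshold) != (perturbed < threshold)


# Values from the flip example: 0.3 before, 0.8 after, threshold 0.75.
print(is_flip(0.3, 0.8, 0.75))  # True: the value crossed the threshold
print(is_flip(0.3, 0.5, 0.75))  # False: both values stay below the threshold
```

The check is symmetric, so it also covers the "vice versa" case where the original value is above the threshold and the perturbed value is below it.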
Random Character Perturbation
Perturbator that replaces random characters in a sentence. Five types of character perturbation are currently supported:
- Random character replacement (default: 'random_replacement'): Randomly replace p percent of the characters in the input text with other characters.
- Random keyboard typos ('random_keyboard_typos'): Randomly replace p percent of the characters with their neighboring characters on the QWERTY keyboard, e.g., "a" with "q" or "s" with "a".
- Random character insertion ('random_insert'): Randomly insert characters, amounting to p percent of the text length, into the input text.
- Random character deletion ('random_delete'): Randomly delete p percent of the characters from the input text and replace each with "X".
- Random OCR ('random_OCR'): Randomly replace p percent of the characters with common OCR errors.
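As a rough illustration of two of these modes, the sketch below implements replacement and deletion in plain Python. The function names mirror the mode names but are hypothetical helpers, not the Eval Studio API. Taking p as a fraction between 0 and 1, using a fixed seed, and drawing replacements from lowercase ASCII letters are all assumptions made for the example.

```python
import random
import string


def random_replacement(text: str, p: float, seed: int = 0) -> str:
    """Replace roughly a fraction p of the characters with random lowercase letters."""
    rng = random.Random(seed)
    chars = list(text)
    n = round(len(chars) * p)
    for i in rng.sample(range(len(chars)), n):
        # The random letter may coincide with the original character.
        chars[i] = rng.choice(string.ascii_lowercase)
    return "".join(chars)


def random_delete(text: str, p: float, seed: int = 0) -> str:
    """Delete roughly a fraction p of the characters, leaving "X" in their place."""
    rng = random.Random(seed)
    chars = list(text)
    n = round(len(chars) * p)
    for i in rng.sample(range(len(chars)), n):
        chars[i] = "X"
    return "".join(chars)


print(random_replacement("What is the capital of France?", 0.1))
print(random_delete("What is the capital of France?", 0.1))
```

Both helpers keep the text length unchanged, which makes it easy to compare the perturbed prompt with the original character by character.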
QWERTY Perturbation
Perturbator that replaces 'y' with 'z', or conversely, replaces 'z' with 'y'.
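A minimal sketch of this swap uses a character translation table. The helper name `qwerty_perturb` is hypothetical, and swapping every occurrence as well as handling uppercase letters are assumptions, since the description does not specify either.

```python
# Translation table that maps y <-> z (and Y <-> Z, as an assumption).
_YZ_SWAP = str.maketrans("yzYZ", "zyZY")


def qwerty_perturb(text: str) -> str:
    """Swap every 'y' with 'z' and every 'z' with 'y'."""
    return text.translate(_YZ_SWAP)


print(qwerty_perturb("lazy yak"))  # layz zak
```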
Comma Perturbation
Perturbator that adds a comma after some words. It mimics a common mistake in English writing and/or typos.
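One way to picture this perturbation is the sketch below, which appends a comma to each word with some probability. The helper name `comma_perturb`, the probability parameter `p`, the seed, and the choice to skip the last word and words that already end in punctuation are all assumptions for illustration.

```python
import random


def comma_perturb(text: str, p: float = 0.2, seed: int = 0) -> str:
    """Append a comma to each eligible word with probability p.

    Note: splitting and re-joining normalizes runs of whitespace to single spaces.
    """
    rng = random.Random(seed)
    words = text.split()
    out = []
    for i, word in enumerate(words):
        is_last = i == len(words) - 1
        # Skip the last word and words that already end in punctuation.
        if not is_last and word[-1].isalnum() and rng.random() < p:
            word += ","
        out.append(word)
    return " ".join(out)


print(comma_perturb("the quick brown fox", p=1.0))  # the, quick, brown, fox
```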
Word Swap Perturbation
Perturbator that swaps two words in a sentence.
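The sketch below shows one possible reading of this perturbation: pick two distinct word positions at random and exchange them. The helper name `word_swap` and the seed are hypothetical, and the description does not say whether the swapped words must be adjacent, so arbitrary positions are an assumption.

```python
import random


def word_swap(text: str, seed: int = 0) -> str:
    """Swap two randomly chosen words; return the text unchanged if it has fewer than two."""
    rng = random.Random(seed)
    words = text.split()
    if len(words) < 2:
        return text
    i, j = rng.sample(range(len(words)), 2)  # two distinct positions
    words[i], words[j] = words[j], words[i]
    return " ".join(words)


print(word_swap("hello world"))  # world hello
print(word_swap("what is the capital of France"))
```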
Synonym Perturbation
Perturbator that replaces words with their synonyms.
Antonym Perturbation
Perturbator that replaces words with their antonyms.
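Both the synonym and the antonym perturbation can be pictured as a word-level lookup. The sketch below uses two tiny hand-written tables, `SYNONYMS` and `ANTONYMS`, which are purely hypothetical; how Eval Studio actually selects synonyms and antonyms is not described here.

```python
# Toy lookup tables, invented for this example only.
SYNONYMS = {"big": "large", "quick": "fast"}
ANTONYMS = {"big": "small", "quick": "slow"}


def replace_words(text: str, table: dict[str, str]) -> str:
    """Replace each word found in the table; leave all other words as they are.

    Lookup is case-insensitive, and words with attached punctuation are not matched.
    """
    return " ".join(table.get(word.lower(), word) for word in text.split())


print(replace_words("a big quick dog", SYNONYMS))  # a large fast dog
print(replace_words("a big quick dog", ANTONYMS))  # a small slow dog
```

A synonym replacement should ideally leave the expected answer unchanged, whereas an antonym replacement usually changes the meaning of the prompt, so the two are likely to call for different interpretations of a metric flip.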