
Perturbations

Overview

H2O Eval Studio provides several perturbation methods to test the robustness of RAG systems, LLMs, and predictive (NLP) models. Perturbations are applied to the text input, and the model output on the original input is compared to the model output on the perturbed input. If the output changes significantly, the model is considered not robust.

The supported perturbation methods are described in the sections below, followed by the perturbations API.

Use cases

  • Chatbots and virtual assistants: Ensure that the chatbot is robust to typos, lower-case abbreviations, missing punctuation, and other common mistakes. Perturbing prompts helps verify that the chatbot responds as expected despite such variations.

  • Social media dataset analysis: Social media text contains a significant number of abbreviations, slang, emojis, and other non-standard language. Perturbing the text helps ensure that the model is robust to these variations.

  • Summarization and translation: To ensure that the model is robust to diverse writing styles, synonyms, typographical errors, and abbreviations.

  • Question answering: To ensure that the model is robust to rephrased questions, writing styles, synonyms, and other common mistakes.

Perturbations guide

Perturbations can be used to test the robustness of a model using the following steps:

  • The user can perturb a test case, a test, or a test suite using a sequence of perturbations.
  • Test cases with perturbed prompts are added to the corresponding tests, and relationships link the perturbed prompts to the original prompts. Categories describing the perturbation methods and intensities are added to the test case categories.
  • The user can then run any evaluator, which detects flips of the metric(s) it calculates.
  • A metric flip is defined as follows:
    • Every metric has a threshold value, which is configurable using the evaluator parameters.
    • Each metric calculated by the evaluator has associated metadata that includes information such as whether a higher value is better or worse.
    • If the metric value for the original prompt is below the threshold while the value of the same metric for the perturbed prompt is above it (or vice versa: original above, perturbed below), this is reported as a flip.
  • Metric flips are reported in the evaluation report as robustness problems.

Flip example:

  • For instance, the hallucination metric value for a prompt might be 0.3 before perturbation and 0.8 after perturbation, with a threshold of 0.75: the metric crosses the threshold, so a flip is reported.
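
To make the flip definition concrete, the following is a minimal, library-independent sketch of the flip check using the hallucination example above. The function and variable names are illustrative only and are not part of the H2O Eval Studio API:

    # illustrative flip check - not part of the H2O Eval Studio API
    def is_flip(original: float, perturbed: float, threshold: float) -> bool:
        """Return True when the metric value crosses the threshold after perturbation."""
        return (original <= threshold) != (perturbed <= threshold)

    # hallucination metric: 0.3 before perturbation, 0.8 after, threshold 0.75
    print(is_flip(original=0.3, perturbed=0.8, threshold=0.75))  # True - reported as a robustness problem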

Random character perturbation

Perturbator that replaces random characters in a sentence. Currently, five types of character perturbations are supported:

  1. Random character replacement (default: 'random_replacement'): Randomly replace p percent of characters in the input text with other characters.
  2. Random keyboard typos ('random_keyboard_typos'): Randomly replace p percent of characters with their neighboring characters on the QWERTY keyboard, e.g., "a" with "q", "s" with "a", etc.
  3. Random character insertion ('random_insert'): Randomly insert p percent of characters into the input text.
  4. Random character deletion ('random_delete'): Randomly delete p percent of characters from the input text, replacing each deleted character with "X".
  5. Random OCR ('random_OCR'): Randomly replace p percent of characters with common OCR errors.
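
As an illustration of the keyboard-typo idea only (not the library's implementation), the following sketch replaces a fraction p of characters with a random QWERTY neighbor; the neighbor map is abbreviated:

    import random

    # abbreviated QWERTY neighbor map - illustration only, not the library's table
    QWERTY_NEIGHBORS = {"a": "qs", "s": "awd", "d": "sfe", "e": "wrd", "r": "etf", "o": "ip"}

    def random_keyboard_typos(text: str, p: float = 0.1, seed: int = 42) -> str:
        """Replace roughly p of the characters with a neighboring QWERTY key."""
        rng = random.Random(seed)
        out = []
        for ch in text:
            neighbors = QWERTY_NEIGHBORS.get(ch.lower())
            out.append(rng.choice(neighbors) if neighbors and rng.random() < p else ch)
        return "".join(out)

    print(random_keyboard_typos("the answer depends on the retrieved passages"))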

QWERTY Perturbation

Perturbator that replaces 'y' with 'z' and vice versa.
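
Conceptually, the swap can be expressed with a single str.translate call (illustration only, not the library's implementation):

    text = "analyze the lazy system carefully"
    print(text.translate(str.maketrans("yzYZ", "zyZY")))  # "analzye the layz szstem carefullz"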

Comma Perturbation

Perturbator that adds a comma after some words, mimicking a common punctuation mistake or typo in English writing.
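
A minimal sketch of the idea (not the library's implementation), appending a comma after a random subset of words:

    import random

    def add_random_commas(text: str, p: float = 0.2, seed: int = 0) -> str:
        """Append a comma after roughly p of the words."""
        rng = random.Random(seed)
        return " ".join(w + "," if rng.random() < p and not w.endswith(",") else w
                        for w in text.split())

    print(add_random_commas("perturbed prompts should still yield consistent answers"))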

Word Swap Perturbation

Perturbator that swaps two words in a sentence.
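
For illustration only (not the library's implementation), swapping two randomly chosen words can be sketched as:

    import random

    def swap_two_words(text: str, seed: int = 7) -> str:
        """Swap two randomly chosen word positions in the sentence."""
        rng = random.Random(seed)
        words = text.split()
        if len(words) < 2:
            return text
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
        return " ".join(words)

    print(swap_two_words("the quick brown fox jumps over the lazy dog"))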

Synonym Perturbation

Perturbator that replaces words with their synonyms.

Antonym Perturbation

Perturbator that replaces words with their antonyms.
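
Synonym and antonym replacement can be approximated with WordNet via NLTK. The snippet below is only a sketch of the idea, not the H2O Eval Studio implementation, and assumes nltk is installed with the WordNet corpus downloaded:

    import nltk
    from nltk.corpus import wordnet as wn

    nltk.download("wordnet", quiet=True)  # one-time download of the WordNet corpus

    def synonyms(word: str) -> set[str]:
        """Collect WordNet lemmas that can serve as synonyms of the word."""
        return {lemma.name() for synset in wn.synsets(word) for lemma in synset.lemmas()} - {word}

    def antonyms(word: str) -> set[str]:
        """Collect WordNet antonyms of the word."""
        return {ant.name() for synset in wn.synsets(word)
                for lemma in synset.lemmas() for ant in lemma.antonyms()}

    print(synonyms("robust"))
    print(antonyms("robust"))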

Perturbations API

Perturbators can be listed as follows:

    from h2o_sonar import evaluate

    # list the descriptors of all available perturbators
    perturbator_descriptors = evaluate.list_perturbators()
    print(perturbator_descriptors)

A string can be perturbed as follows:

    from h2o_sonar import evaluate
    from h2o_sonar.lib.api import commons  # PerturbatorToRun, PerturbationIntensity (import path may differ by h2o_sonar version)
    from h2o_sonar.utils.robustness import perturbations

    input_content = "This is the text to be perturbed using a perturbator."

    perturbed_text = evaluate.perturb(
        content=input_content,
        perturbators=[
            commons.PerturbatorToRun(
                perturbator_id=perturbations.QwertyPerturbator.perturbator_id,
                intensity=commons.PerturbationIntensity.LOW.name,
            )
        ],
    )

    print(perturbed_text)

A test case can be perturbed using multiple perturbations as follows:

    from h2o_sonar import evaluate
    from h2o_sonar import testing  # RagTestCaseConfig (import path may differ by h2o_sonar version)
    from h2o_sonar.lib.api import commons  # PerturbatorToRun (import path may differ by h2o_sonar version)
    from h2o_sonar.utils.robustness import perturbations

    test_case = testing.RagTestCaseConfig(
        prompt="This is the text to be perturbed using a perturbator."
    )

    # apply every available perturbator to the test case
    perturbed_test_case = evaluate.perturb(
        content=test_case,
        perturbators=[
            commons.PerturbatorToRun(
                perturbator_id=descriptor.perturbator_id,
            )
            for descriptor in evaluate.list_perturbators()
        ],
    )

    print(perturbed_test_case.prompt)

Using Perturbations to Assess Model Robustness

The typical use of perturbations is to assess the robustness of a model. The following code example demonstrates the complete workflow: perturb a test suite, get model answers for the perturbed prompts, run an evaluation, and report robustness problems.

    from h2o_sonar import evaluate
    from h2o_sonar import testing  # import path may differ by h2o_sonar version
    from h2o_sonar.lib.api import commons  # import path may differ by h2o_sonar version

    #
    # perturb the test suite - all test cases in the test suite's tests are perturbed
    #
    test_suite = testing.RagTestSuiteConfig.load_from_json("path/to/test_suite.json")

    # perturbator_id: ID of the perturbator to apply, e.g. taken from evaluate.list_perturbators()
    perturbed_suite = evaluate.perturb(
        content=test_suite,
        perturbators=[
            commons.PerturbatorToRun(
                perturbator_id=perturbator_id,
                intensity=commons.PerturbationIntensity.HIGH.name,
            )
        ],
        in_place=False,
    )
    perturbed_suite.save_as_json("path/to/perturbed_test_suite.json")


    #
    # resolve the test suite to a test lab - get answers from RAGs/LLMs
    #
    test_lab = testing.RagTestLab.from_rag_test_suite(
        rag_connection=target_host_connection,  # connection to the evaluated RAG/LLM host
        rag_test_suite=perturbed_suite,
        llm_model_names=llm_model_names,  # names of the LLM models to evaluate
        rag_model_type=llm_model_type,  # type of the evaluated model
        docs_cache_dir="path/to/docs_cache_dir",
    )
    test_lab.build()
    test_lab.complete_dataset()
    test_lab.save_as_json("path/to/test_lab.json")

    #
    # run the evaluation using the test lab with perturbed test cases
    #
    evaluation = evaluate.run_evaluation(
        dataset=test_lab.dataset,
        models=list(test_lab.evaluated_models.values()),
        evaluators=evaluators,  # list of evaluators to run
    )

    #
    # print (robustness) problems
    #
    print(f"Problems [{len(evaluation.result.problems)}]")
    for p in evaluation.result.problems:
        print(f"  {p}")
