Adversarial attack
Overview
An adversarial attack is a type of model attack that alters a model's expected predictions by modifying its input data in a particular way. During an adversarial attack, H2O Model Security systematically alters the input data when scoring new data against your deployed model (in H2O MLOps).
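To make the idea concrete, the sketch below illustrates the mechanics in miniature, assuming a generic scikit-learn model rather than anything from H2O Model Security: one feature of an input row is nudged step by step until the model's prediction flips.

```python
# Minimal sketch of an adversarial input modification: shift one feature
# of a scored row until the model's prediction changes. The model and
# data are synthetic stand-ins, not part of H2O Model Security.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

def find_flip(model, x, feature=0, max_shift=5.0, step=0.1):
    """Search both directions along one feature for a prediction flip."""
    original = model.predict([x])[0]
    for shift in np.arange(step, max_shift, step):
        for direction in (1, -1):
            x_adv = x.copy()
            x_adv[feature] += direction * shift
            if model.predict([x_adv])[0] != original:
                return x_adv, direction * shift
    return None, None

x_adv, shift = find_flip(model, X[0].copy())
if x_adv is not None:
    print(f"Prediction flipped after shifting feature 0 by {shift:+.1f}")
```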
- To learn how to create an adversarial attack, see Create an adversarial attack.
- See Settings: Adversarial attack to learn about all the settings for an adversarial attack.
- See Metrics: Adversarial attack to learn about all the metrics for an adversarial attack.

Note: H2O Model Security offers an array of metrics in the form of charts, stats cards, and confusion matrices to help you understand an adversarial attack.
Adversarial attacks in production (and the need for H2O Model Security)
Because Machine Learning (ML) models are typically nonlinear and rely on high-degree interactions to increase accuracy, it is always possible that some combination of input data leads to unexpected model outputs. As a result, adversarial attacks are common in production: strange or subtle combinations of input data that cause your model to return the prediction an attacker wants, without the attacker ever having access to the internals of your model.
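For intuition, here is a hedged sketch of such a black-box attack, assuming only query access to predictions; the `score` function below is a hypothetical stand-in for a deployed scoring endpoint, not a real H2O MLOps API.

```python
# Black-box attack sketch: the attacker queries predictions only and
# randomly perturbs an input until the model returns the output they
# want -- no access to model internals is needed.
import numpy as np

rng = np.random.default_rng(0)

def score(row):
    # Hypothetical stand-in for a deployed model's scoring endpoint.
    return int(row.sum() > 2.0)

x = np.array([0.5, 0.4, 0.6, 0.3])
wanted = 1 - score(x)  # the prediction the attacker is after

for _ in range(10_000):
    candidate = x + rng.normal(scale=0.2, size=x.shape)  # subtle change
    if score(candidate) == wanted:
        print("Adversarial input found:", np.round(candidate, 2))
        break
```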
With the above in mind, H2O Model Security can help by letting you trick your own model: it shows you the model's outcomes on many different combinations of input data values. In particular, H2O Model Security enables you to observe how your model would react to an adversarial attack, so you can find solutions before the model is implemented in production.
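The sketch below shows the spirit of that probing, not H2O Model Security's actual API: it scores a trained model on a grid of many input-value combinations so you can inspect how the predictions shift across the input space.

```python
# Probe a model with many combinations of input values and tally the
# predictions. A generic scikit-learn model stands in for a deployment.
import itertools
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=2, n_redundant=0,
                           random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Build a grid spanning each feature's observed range, then score every
# combination of grid values.
grid = [np.linspace(X[:, i].min(), X[:, i].max(), 25) for i in range(2)]
combos = np.array(list(itertools.product(*grid)))
preds = model.predict(combos)

labels, counts = np.unique(preds, return_counts=True)
print(f"Scored {len(combos)} input combinations:",
      dict(zip(labels.tolist(), counts.tolist())))
```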