Model inversion attack
Overview
A model inversion attack trains a surrogate model to imitate an original model. The surrogate learns by submitting inputs to the original model and using the returned predictions as training labels. Through a model inversion attack, H2O Model Security therefore highlights the likelihood that a surrogate model can resemble your original model.
- To learn how to create a model inversion attack, see Create a model inversion attack.
- See Settings: Model inversion attack to learn about all the settings for a model inversion attack.
- See Metrics: Model inversion attack to learn about all the metrics for a model inversion attack.
Note
H2O Model Security offers an array of metrics in the form of charts, stats cards, and confusion matrices to help you understand a model inversion attack.
Model inversion attacks in production (and the need for H2O Model Security)
Due to a lack of security, or through a distributed attack on your model API in production, hackers can simulate data, submit it, receive predictions, and train a surrogate model on their simulated data and your model's predictions (see the sketch after this list). This surrogate can:
- Expose your proprietary business logic, a practice known as "model stealing"
- Reveal sensitive information based on your training data
- Be the first stage of a membership inference attack
- Be a test-bed for adversarial example attacks
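To make the attack flow concrete, here is a minimal sketch in Python. It uses scikit-learn, and the `victim` model, its features, and the fidelity check are hypothetical stand-ins: in a real attack the victim would be a scoring API the attacker can only query, and this sketch does not represent H2O Model Security's own implementation.

```python
# Minimal sketch of a surrogate-model (model inversion) attack, assuming
# only black-box access to the original model's predictions.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(seed=0)

# Stand-in for the victim model behind a production API; in a real attack
# this would be an HTTP call to the scoring endpoint, not a local object.
victim = GradientBoostingClassifier().fit(
    rng.normal(size=(500, 4)), rng.integers(0, 2, size=500)
)

# 1. Simulate plausible input data (the attacker only needs the schema).
X_sim = rng.normal(size=(10_000, 4))

# 2. Submit it and collect the original model's predictions as labels.
y_sim = victim.predict(X_sim)

# 3. Train a surrogate on (simulated inputs, returned predictions).
surrogate = DecisionTreeClassifier(max_depth=5).fit(X_sim, y_sim)

# The surrogate now approximates the victim's decision boundary and can
# expose its logic or seed the follow-on attacks listed above.
X_test = rng.normal(size=(1_000, 4))
fidelity = (surrogate.predict(X_test) == victim.predict(X_test)).mean()
print(f"Surrogate agreement with original model: {fidelity:.1%}")
```

The higher the surrogate's agreement with the original model, the more of the original's proprietary logic the attacker has effectively extracted.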
With the above in mind, H2O Model Security can help by highlighting areas of the model that increase the probability of a surrogate model copying your original model, thereby protecting your model's proprietary logic.