Below are some of the key features available in Driverless AI.
Flexibility of Data and Deployment
Driverless AI works with a variety of data sources, including Hadoop HDFS, Amazon S3, and more. It can be deployed in the cloud (Microsoft Azure, AWS, Google Cloud) or on premises. It runs on any system but is ideally suited to systems with GPUs, including IBM Power 9 systems with built-in GPUs.
NVIDIA GPU Acceleration
Driverless AI is optimized to take advantage of GPU acceleration to achieve up to 40X speedups for automatic machine learning. It includes multi-GPU algorithms for XGBoost, GLM, K-Means, and more. GPUs allow for thousands of iterations of model features and optimizations.
Automatic Data Visualization
For each dataset, Driverless AI can generate visualizations automatically, selecting the plots that are most relevant from a statistical perspective based on the dataset's summary statistics. These plots help users get a quick understanding of their data before starting the model building process. See Visualizing Datasets for more information.
Automatic Feature Engineering
Feature engineering is the secret weapon that advanced data scientists use to extract the most accurate results from algorithms. H2O Driverless AI employs a library of algorithms and feature transformations to automatically engineer new, high-value features for a given dataset. See Driverless AI Transformations for more information.
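To make the idea of a feature transformation concrete, the following is a minimal sketch of one common transformation, frequency encoding of a categorical column. The function name and the sample data are hypothetical and chosen only for illustration; this is not the Driverless AI implementation, and the transformations the product actually applies are described in Driverless AI Transformations.

```python
from collections import Counter


def frequency_encode(values):
    """Replace each category with its relative frequency in the column.

    Hypothetical helper for illustration only; not part of Driverless AI.
    """
    counts = Counter(values)
    n = len(values)
    return [counts[v] / n for v in values]


# Illustrative data: a categorical column with four rows.
cities = ["NYC", "SF", "NYC", "LA"]
city_freq = frequency_encode(cities)  # one numeric feature per row
```

The resulting numeric column can be fed to algorithms that cannot consume raw category labels. An automated system would typically try many such transformations and keep those that improve validation accuracy.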
Machine Learning Interpretability (MLI)
Driverless AI provides robust interpretability of machine learning models to explain modeling results in a human-readable format. In the MLI view, Driverless AI employs a host of different techniques and methodologies for interpreting and explaining the results of its models. A number of charts are generated automatically, including K-LIME, Shapley, Variable Importance, Decision Tree Surrogate, Partial Dependence, Individual Conditional Expectation, and more. Additionally, you can download a CSV of LIME and Shapley reason codes from this view. See Model Interpretation for more information.
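As an illustration of one of the techniques named above, the sketch below computes a partial dependence curve by hand: the feature of interest is forced to each grid value in every row, and the model's predictions are averaged. The `toy_model` function, the row data, and the helper name are assumptions made for this example; Driverless AI generates these charts automatically, and its internal implementation may differ.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        # Copy each row with the chosen feature overwritten, then score it.
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append(sum(preds) / len(preds))
    return curve


def toy_model(row):
    """Hypothetical stand-in for a trained model."""
    return 2.0 * row["age"] + row["income"]


rows = [{"age": 30, "income": 10.0}, {"age": 50, "income": 20.0}]
pd_curve = partial_dependence(toy_model, rows, "age", [20, 40])
```

Keeping the per-row predictions instead of averaging them gives the Individual Conditional Expectation curves for the same feature.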
Time Series Forecasting
Driverless AI delivers superior time series capabilities to optimize for almost any prediction time window. Driverless AI incorporates data from numerous predictors, handles structured character data and high-cardinality categorical variables, and handles gaps in time series data and other missing values. See Time Series in Driverless AI for more information.
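To show why gaps in a time series need explicit handling, here is a minimal sketch of a calendar-based lag feature. It looks up the value from a fixed number of days earlier and returns `None` when that day is missing, instead of silently shifting by row position. The function name and the sales data are hypothetical, and this is not a description of how Driverless AI builds its lag features.

```python
from datetime import date, timedelta


def add_lag(series, lag_days):
    """Map each date to the value observed `lag_days` earlier, or None if absent."""
    return {d: series.get(d - timedelta(days=lag_days)) for d in sorted(series)}


# Illustrative daily series with a gap: January 3 has no observation.
sales = {
    date(2024, 1, 1): 10,
    date(2024, 1, 2): 12,
    date(2024, 1, 4): 15,
}
lag1 = add_lag(sales, 1)
```

A row-position shift would have assigned the January 2 value to January 4; the calendar-based lookup instead marks it as missing so that a later step can impute or ignore it.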
NLP with TensorFlow
Text data can contain critical information to inform better predictions. Driverless AI automatically converts short text strings into features using powerful techniques like TF-IDF. With TensorFlow, Driverless AI can also process larger text blocks and build models using all available data to solve business problems like sentiment analysis, document classification, and content tagging. See NLP in Driverless AI for more information.
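For readers unfamiliar with TF-IDF, the sketch below shows the basic computation on a few short strings: each term's weight is its frequency within a document multiplied by the log of the inverse document frequency. It uses whitespace tokenization and the plain `tf * log(N / df)` formula, both of which are simplifying assumptions; the exact tokenization and weighting that Driverless AI uses are not specified here and may differ.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        rows.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return rows


# Illustrative short text strings.
reviews = ["great product", "terrible product", "great great service"]
features = tfidf(reviews)
```

Terms that appear in only one document, such as "terrible", receive a higher weight than terms shared across documents, such as "product", which is what makes the resulting features useful for tasks like sentiment analysis.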
Automatic Scoring Pipelines
For completed experiments, Driverless AI automatically generates both Python scoring pipelines and new ultra-low-latency automatic scoring pipelines. The new automatic scoring pipeline is a unique technology that packages all feature engineering and the winning machine learning model as highly optimized, low-latency, production-ready Java code that can be deployed anywhere. See The Driverless AI Scoring Pipelines for more information.
Custom Recipe Support
Driverless AI allows you to import custom recipes for MLI algorithms, feature engineering (transformers), scorers, and configuration. You can use your custom recipes in combination with or instead of all built-in recipes. This allows you to have greater influence over the Driverless AI Automatic ML pipeline and gives you control over the optimization choices that Driverless AI makes. See Appendix A: Custom Recipes for more information.