Reproducibility in Driverless AI

Reproducibility is the ability to reproduce or replicate the results of a machine learning model or experiment. It is essential in machine learning because it lets you validate your results (by confirming that your findings are not the product of chance or of implementation errors) and lets others build on them. Reproducibility also promotes transparency and accountability, which help ensure that machine learning models are safe, ethical, and trustworthy. Driverless AI lets you run experiments with a random seed so that repeated runs produce reproducible results.

Enable reproducibility

When setting up an experiment, use the Reproducible toggle to specify whether to enable reproducibility. This toggle is disabled by default. When enabled, it works in tandem with the reproducibility_level config option, which selects a specific level of reproducibility. Note that when the Reproducible toggle is enabled, reproducibility_level defaults to 1.

Reproducible toggle on the experiment setup page
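
The interaction between the toggle and the config option can be sketched in a few lines of Python. This is an illustration only, not Driverless AI code: the ReproducibilitySettings class and its effective_level method are hypothetical names, and the sketch assumes that reproducibility_level has no effect while the toggle is off, which the description above implies but does not state outright.

```python
from dataclasses import dataclass
from typing import Optional

# Levels 1 to 4 are the values accepted by reproducibility_level.
VALID_LEVELS = (1, 2, 3, 4)


@dataclass
class ReproducibilitySettings:
    """Hypothetical container for the two settings; not a Driverless AI class."""

    reproducible: bool = False      # the Reproducible toggle, disabled by default
    reproducibility_level: int = 1  # the config option, 1 by default

    def effective_level(self) -> Optional[int]:
        """Return the level in force, or None when the toggle is disabled."""
        if self.reproducibility_level not in VALID_LEVELS:
            raise ValueError(
                f"reproducibility_level must be one of {VALID_LEVELS}, "
                f"got {self.reproducibility_level!r}"
            )
        if not self.reproducible:
            # Assumption: the level is ignored while the toggle is off.
            return None
        return self.reproducibility_level
```

With these defaults, enabling only the toggle gives level 1, and setting reproducibility_level alone changes nothing until the toggle is also enabled.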

The following section describes the different levels of reproducibility in more detail.

Reproducibility levels

Use the reproducibility_level config option to manually select one of the four available levels of reproducibility. The levels differ in how closely the environment must match for an experiment to produce the same results:

  • 1 (default): Same experiment results on the same operating system, the same CPU(s), and the same GPU(s).

  • 2: Same experiment results on the same operating system, the same CPU architecture, and the same GPU architecture.

  • 3: Same experiment results on the same operating system and the same CPU architecture. Note that this reproducibility level excludes GPUs.

  • 4: Same experiment results on the same operating system. This level is a best-effort approximation.
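
One way to read the list above is as a table of which environment attributes must match between two runs at each level. The following Python sketch encodes that reading. It is a hedged illustration rather than anything Driverless AI exposes: the Environment class, its field names, the same_results_expected function, and the hardware labels are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Tuple

# Attributes that must be identical between two runs at each level,
# transcribed from the list above. Field names are hypothetical.
REQUIRED_MATCHES = {
    1: ("os", "cpus", "gpus"),          # same OS, same CPU(s), same GPU(s)
    2: ("os", "cpu_arch", "gpu_arch"),  # same OS, same CPU and GPU architecture
    3: ("os", "cpu_arch"),              # same OS and CPU architecture; GPUs excluded
    4: ("os",),                         # same OS only (best-effort approximation)
}


@dataclass(frozen=True)
class Environment:
    """Hypothetical description of the machine an experiment ran on."""

    os: str
    cpus: Tuple[str, ...]  # CPU models, one entry per CPU
    cpu_arch: str
    gpus: Tuple[str, ...]  # GPU models, one entry per GPU
    gpu_arch: str


def same_results_expected(level: int, a: Environment, b: Environment) -> bool:
    """Return True if the list above promises matching results for a and b."""
    if level not in REQUIRED_MATCHES:
        raise ValueError(f"unknown reproducibility level: {level!r}")
    return all(getattr(a, name) == getattr(b, name) for name in REQUIRED_MATCHES[level])


# Two machines with the same OS and architectures but different device models.
first = Environment("os-x", ("cpu-model-a",), "cpu-arch-1", ("gpu-model-a",), "gpu-arch-1")
second = Environment("os-x", ("cpu-model-b",), "cpu-arch-1", ("gpu-model-b",), "gpu-arch-1")

print(same_results_expected(1, first, second))  # False: CPU and GPU models differ
print(same_results_expected(2, first, second))  # True: OS and architectures match
```

Because the device lists are compared as whole tuples, a machine with a different number of GPUs also fails the level 1 check in this sketch.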

Notes:

  • Experiments are only reproducible when run on the same hardware (that is, with the same number and type of GPUs/CPUs and the same architecture). For example, an experiment run on a GPU machine will not produce the same results when rerun on a CPU-only machine or on a machine with a different number or type of GPUs.

  • Experiments that use TensorFlow with multiple cores cannot be reproduced.

  • LightGBM is more reproducible with 64-bit floats, so Driverless AI switches to 64-bit floats for LightGBM.

  • Enabling reproducibility automatically disables all of the Feature Brain expert settings; specifically: