Install on RHEL

This section describes how to install the Driverless AI Docker image on RHEL. The installation steps vary depending on whether your system has GPUs or is CPU-only.

Environment

Operating System	GPUs?	Minimum Memory
RHEL with GPUs	Yes	64 GB
RHEL with CPUs	No	64 GB
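
One way to check the 64 GB minimum before installing is to read MemTotal from /proc/meminfo. The following is a minimal sketch, not part of the Driverless AI tooling; the function names meets_min_mem and host_mem_kb are hypothetical, and 1 GB is assumed to be 1024 * 1024 kB.

```shell
# Hypothetical helper: succeed when total_kb (in kB) is at least min_gb GB.
# Assumes 1 GB = 1024 * 1024 kB.
meets_min_mem() {
    total_kb=$1
    min_gb=$2
    [ "$total_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Hypothetical helper: print the host's total memory in kB (Linux only).
host_mem_kb() {
    awk '/^MemTotal:/ {print $2}' /proc/meminfo
}

# Example usage:
#   if meets_min_mem "$(host_mem_kb)" 64; then echo "memory OK"; fi
```

MemTotal is usually slightly lower than the installed RAM because the kernel reserves some memory, so a machine sold as 64 GB can fail a strict comparison. Treat a narrow miss as a prompt to look closer rather than as a hard failure.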

Install on RHEL with GPUs

Note: Refer to the following links for more information about using RHEL with GPUs. These links describe how to disable automatic updates and updates to specific packages. This is necessary to prevent a mismatch between the NVIDIA driver and the kernel, which can lead to GPU failures.

https://access.redhat.com/solutions/2372971

https://www.rootusers.com/how-to-disable-specific-package-updates-in-rhel-centos/
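
One common way to hold back specific packages is an exclude= line in the [main] section of /etc/yum.conf. The following is a minimal sketch under that assumption; the helper name add_yum_exclude is hypothetical, the kernel* pattern is only an example, and the helper takes the file path as an argument so that it can be tried on a copy first.

```shell
# Hypothetical helper: append an exclude= line to a yum configuration file
# unless the file already contains one. Assumes [main] is the last (or only)
# section in the file, which is typical for /etc/yum.conf.
add_yum_exclude() {
    conf_file=$1
    patterns=$2
    if grep -q '^exclude=' "$conf_file"; then
        echo "exclude= already set in $conf_file; edit it by hand" >&2
        return 1
    fi
    printf 'exclude=%s\n' "$patterns" >> "$conf_file"
}

# Example (try on a copy before touching /etc/yum.conf):
#   cp /etc/yum.conf /tmp/yum.conf.test
#   add_yum_exclude /tmp/yum.conf.test 'kernel*'
```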

Watch the installation video here. Note that some of the images in this video may change between releases, but the installation steps remain the same.

Note

As of this writing, Driverless AI has been tested on RHEL versions 7.4, 8.3, and 8.4.

Open a Terminal and ssh to the machine that will run Driverless AI. Once you are logged in, perform the following steps.

Retrieve the Driverless AI Docker image from https://www.h2o.ai/download/.
Install and start Docker EE on RHEL (if not already installed). Follow the instructions on https://docs.docker.com/engine/installation/linux/docker-ee/rhel/.

Alternatively, you can run on Docker CE.

sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum makecache
sudo yum -y install docker-ce
sudo systemctl start docker

Install the NVIDIA Container Toolkit (if not already installed). More information is available at NVIDIA Container Toolkit Installation Guide.

# Add the nvidia-docker yum repository (RHEL uses yum, not apt)
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | \
  sudo tee /etc/yum.repos.d/nvidia-docker.repo

# Install the NVIDIA Container Toolkit
sudo yum install -y nvidia-docker2

Note: If you would like the nvidia-docker service to start automatically when the server is rebooted, run the following command. Otherwise, you must start the nvidia-docker service manually after each reboot, or the GPUs will not appear as available.

sudo systemctl enable nvidia-docker

Alternatively, if you installed Docker CE above, you can install the NVIDIA Container Toolkit with:

curl -s -L https://nvidia.github.io/nvidia-docker/centos7/x86_64/nvidia-docker.repo | \
  sudo tee /etc/yum.repos.d/nvidia-docker.repo
sudo yum install nvidia-docker2

Note: As of Docker version 19.03, the nvidia-docker wrapper is no longer supported. Instead, use the --gpus all flag with Docker to enable GPU support.

For the best performance, including GPU support, use the following command:

docker run --gpus all <image_name>

For a lower-performance experience without GPUs, you can use:

docker run <image_name>

Verify that the NVIDIA driver is up and running. If it is not, go to http://www.nvidia.com/Download/index.aspx?lang=en-us to get the latest NVIDIA Tesla V/P/K series driver.

docker run --gpus all --rm nvidia/cuda nvidia-smi

Set up a directory for the version of Driverless AI on the host machine:

# Set up directory with the version name
mkdir dai-2.1.0

Change directories to the new folder, then load the Driverless AI Docker image inside the new directory:

# cd into the new directory
cd dai-2.1.0

# Load the Driverless AI docker image
docker load < dai-docker-ubi8-x86_64-2.1.0.tar.gz

Enable persistence of the GPU. Note that this command must be run once after every reboot. Refer to the following for more information: http://docs.nvidia.com/deploy/driver-persistence/index.html.

sudo nvidia-smi -pm 1

Set up the data, log, license, and tmp directories on the host machine (within the new directory):

# Set up the data, log, license, and tmp directories on the host machine
mkdir data
mkdir log
mkdir license
mkdir tmp
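
The four mkdir calls above can also be written as a loop that is safe to re-run. This is a minimal sketch; the function name setup_dai_dirs is hypothetical.

```shell
# Hypothetical helper: create the data, log, license, and tmp directories
# under base_dir (defaults to the current directory). mkdir -p does not fail
# when a directory already exists, so the call can be repeated.
setup_dai_dirs() {
    base_dir=${1:-.}
    for d in data log license tmp; do
        mkdir -p "$base_dir/$d" || return 1
    done
}

# Example: run from inside the version directory
#   setup_dai_dirs .
```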

At this point, you can copy data into the data directory on the host machine. The data will be visible inside the Docker container.
Run docker images to find the image tag.
Start the Driverless AI Docker image, replacing the image tag in the commands below with your own. Use docker run --gpus all on Docker 19.03 or later, or docker run --runtime=nvidia (or the nvidia-docker wrapper) on versions earlier than 19.03. Note that from version 1.10, the DAI Docker image runs with an internal tini that is equivalent to using --init from Docker. If both are enabled in the launch command, tini prints a (harmless) warning message. GPU users also see this warning, because NVML requires --pid=host, which prevents tini from running as PID 1. The warning is still harmless.

We recommend --shm-size=2g --cap-add=SYS_NICE --ulimit nofile=131071:131071 --ulimit nproc=16384:16384 in the docker launch command. If you plan to build image auto models extensively, --shm-size=4g is recommended instead.

Note: Use docker version to check which version of Docker you are using.
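
The choice between the two GPU options can be derived from the Docker version string. The following is a minimal sketch; the function name gpu_flag_for_docker is hypothetical, and it assumes a version string of the form MAJOR.MINOR.PATCH.

```shell
# Hypothetical helper: print the GPU option for a given Docker version string.
# --gpus all needs Docker 19.03 or later; earlier versions use --runtime=nvidia.
gpu_flag_for_docker() {
    version=$1
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    # Strip one leading zero so that "03" compares as 3.
    minor=${minor#0}
    if [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "${minor:-0}" -ge 3 ]; }; then
        echo "--gpus all"
    else
        echo "--runtime=nvidia"
    fi
}

# Example, on a host with Docker installed:
#   gpu_flag_for_docker "$(docker version --format '{{.Server.Version}}')"
```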

# Start the Driverless AI Docker image (Docker earlier than 19.03)
docker run --runtime=nvidia \
   --pid=host \
   --rm \
   --shm-size=2g --cap-add=SYS_NICE --ulimit nofile=131071:131071 --ulimit nproc=16384:16384 \
   -u `id -u`:`id -g` \
   -p 12345:12345 \
   -v `pwd`/data:/data \
   -v `pwd`/log:/log \
   -v `pwd`/license:/license \
   -v `pwd`/tmp:/tmp \
   h2oai/dai-ubi8-x86_64:2.1.0-cuda11.8.0.xx

# Start the Driverless AI Docker image (Docker 19.03 or later)
docker run --gpus all \
   --pid=host \
   --rm \
   --shm-size=2g --cap-add=SYS_NICE --ulimit nofile=131071:131071 --ulimit nproc=16384:16384 \
   -u `id -u`:`id -g` \
   -p 12345:12345 \
   -v `pwd`/data:/data \
   -v `pwd`/log:/log \
   -v `pwd`/license:/license \
   -v `pwd`/tmp:/tmp \
   h2oai/dai-ubi8-x86_64:2.1.0-cuda11.8.0.xx

Driverless AI will begin running:

--------------------------------
Welcome to H2O.ai's Driverless AI
---------------------------------

- Put data in the volume mounted at /data
- Logs are written to the volume mounted at /log/20180606-044258
- Connect to Driverless AI on port 12345 inside the container
- Connect to Jupyter notebook on port 8888 inside the container

Connect to Driverless AI with your browser at http://Your-Driverless-AI-Host-Machine:12345.

Install on RHEL with CPUs

This section describes how to install and start the Driverless AI Docker image on RHEL. Note that this uses docker and not nvidia-docker.

Watch the installation video here. Note that some of the images in this video may change between releases, but the installation steps remain the same.

Note

As of this writing, Driverless AI has been tested on RHEL versions 7.4, 8.3, and 8.4.

Open a Terminal and ssh to the machine that will run Driverless AI. Once you are logged in, perform the following steps.

Install and start Docker EE on RHEL (if not already installed). Follow the instructions on https://docs.docker.com/engine/installation/linux/docker-ee/rhel/.

Alternatively, you can run on Docker CE.

sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum makecache
sudo yum -y install docker-ce
sudo systemctl start docker

On the machine that is running Docker EE, retrieve the Driverless AI Docker image from https://www.h2o.ai/download/.
Set up a directory for the version of Driverless AI on the host machine:

# Set up directory with the version name
mkdir dai-2.1.0

Change directories to the new folder, then load the Driverless AI Docker image inside the new directory:

# cd into the directory associated with your version of Driverless AI
cd dai-2.1.0

# Load the Driverless AI Docker image
docker load < dai-docker-ubi8-x86_64-2.1.0.tar.gz

Set up the data, log, license, and tmp directories (within the new directory):

# Set up the data, log, license, and tmp directories on the host machine
mkdir data
mkdir log
mkdir license
mkdir tmp

Copy data into the data directory on the host. The data will be visible inside the Docker container at /data.
Run docker images to find the image tag.
Start the Driverless AI Docker image. GPU support will not be available. Note that from version 1.10, the DAI Docker image runs with an internal tini that is equivalent to using --init from Docker. If both are enabled in the launch command, tini prints a (harmless) warning message.

We recommend --shm-size=2g --cap-add=SYS_NICE --ulimit nofile=131071:131071 --ulimit nproc=16384:16384 in the docker launch command. If you plan to build image auto models extensively, --shm-size=4g is recommended instead.
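
To switch between the two --shm-size recommendations without editing the launch command by hand, the size can be kept in a variable. The following is a minimal sketch; the variable name DAI_SHM_SIZE and the helper name dai_resource_flags are hypothetical and are not part of Driverless AI.

```shell
# Hypothetical helper: print the recommended resource flags, using
# DAI_SHM_SIZE (an assumed variable name) when set, otherwise the 2g default.
dai_resource_flags() {
    shm_size=${DAI_SHM_SIZE:-2g}
    echo "--shm-size=$shm_size --cap-add=SYS_NICE --ulimit nofile=131071:131071 --ulimit nproc=16384:16384"
}

# Example: set DAI_SHM_SIZE=4g for heavy image auto model use, then leave the
# substitution unquoted so the shell splits it into separate arguments:
#   docker run $(dai_resource_flags) ... <image>:<tag>
```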

docker run \
  --pid=host \
  --rm \
  --shm-size=2g --cap-add=SYS_NICE --ulimit nofile=131071:131071 --ulimit nproc=16384:16384 \
  -u `id -u`:`id -g` \
  -p 12345:12345 \
  -v `pwd`/data:/data \
  -v `pwd`/log:/log \
  -v `pwd`/license:/license \
  -v `pwd`/tmp:/tmp \
  -v /etc/passwd:/etc/passwd:ro \
  -v /etc/group:/etc/group:ro \
  h2oai/dai-ubi8-x86_64:2.1.0-cuda11.8.0.xx

Driverless AI will begin running:

--------------------------------
Welcome to H2O.ai's Driverless AI
---------------------------------

- Put data in the volume mounted at /data
- Logs are written to the volume mounted at /log/20180606-044258
- Connect to Driverless AI on port 12345 inside the container
- Connect to Jupyter notebook on port 8888 inside the container

Connect to Driverless AI with your browser at http://Your-Driverless-AI-Host-Machine:12345.

Stopping the Docker Image

To stop the Driverless AI Docker image, press Ctrl + C in the terminal window that is running the Driverless AI Docker image.

Upgrading the Docker Image

This section provides instructions for upgrading Driverless AI versions that were installed in a Docker container. These steps ensure that existing experiments are saved.

WARNING: Experiments, MLIs, and MOJOs reside in the Driverless AI tmp directory and are not automatically upgraded when Driverless AI is upgraded.

Build MLI models before upgrading.

Build MOJO pipelines before upgrading.

Stop Driverless AI and make a backup of your Driverless AI tmp directory before upgrading.

If you did not build MLI on a model before upgrading Driverless AI, then you will not be able to view MLI on that model after upgrading. Before upgrading, be sure to run MLI jobs on models that you want to continue to interpret in future releases. If that MLI job appears in the list of Interpreted Models in your current version, then it will be retained after upgrading.

If you did not build a MOJO pipeline on a model before upgrading Driverless AI, then you will not be able to build a MOJO pipeline on that model after upgrading. Before upgrading, be sure to build MOJO pipelines on all desired models and then back up your Driverless AI tmp directory.

Note: Stop Driverless AI if it is still running.
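
The backup of the tmp directory can be taken as a timestamped archive. The following is a minimal sketch; the helper name backup_dai_tmp, the archive naming scheme, and the example paths are all hypothetical.

```shell
# Hypothetical helper: archive <dai_dir>/tmp into backup_dir with a timestamp
# in the file name, and print the archive path. Stop Driverless AI first.
backup_dai_tmp() {
    dai_dir=$1
    backup_dir=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$backup_dir/dai-tmp-$stamp.tar.gz"
    mkdir -p "$backup_dir" || return 1
    # -C makes the paths stored in the archive relative to dai_dir.
    tar -czf "$archive" -C "$dai_dir" tmp || return 1
    echo "$archive"
}

# Example:
#   backup_dai_tmp "$PWD/dai_rel_1.4.2" "$HOME/dai-backups"
```

Listing the archive with tar -tzf afterwards is a cheap way to confirm that the experiments were captured before the upgrade proceeds.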

Requirements

We recommend having NVIDIA driver >= 471.68 installed (GPU only) in your host environment for a seamless experience on all architectures, including Ampere. Driverless AI ships with CUDA 11.8.0 for GPUs, but the driver must exist in the host environment.

Go to the NVIDIA driver download page to get the latest NVIDIA Tesla A/T/V/P/K series drivers. For reference on CUDA Toolkit and Minimum Required Driver Versions and CUDA Toolkit and Corresponding Driver Versions, see the NVIDIA CUDA Toolkit release notes.

Note

If you are using K80 GPUs, the minimum required NVIDIA driver version is 450.80.02.
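
The installed driver version can be compared against these minimums with a version-aware sort. The following is a minimal sketch; the helper name version_ge is hypothetical, and it relies on sort -V from GNU coreutils.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to
# version $2. The smaller of the two sorts first under sort -V.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example, on a host with the NVIDIA driver installed:
#   installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   version_ge "$installed" 471.68 && echo "driver OK"
```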

Upgrade Steps

SSH into the IP address of the machine that is running Driverless AI.
Set up a directory for the version of Driverless AI on the host machine:

# Set up directory with the version name
mkdir dai-2.1.0

# cd into the new directory
cd dai-2.1.0

Retrieve the Driverless AI package from https://www.h2o.ai/download/ and add it to the new directory.
Load the Driverless AI Docker image inside the new directory:

# Load the Driverless AI docker image
docker load < dai-docker-ubi8-x86_64-2.1.0.tar.gz

Copy the data, log, license, and tmp directories from the previous Driverless AI directory to the new Driverless AI directory:

# Copy the data, log, license, and tmp directories on the host machine
# (run from inside dai-2.1.0; assumes dai_rel_1.4.2 is a sibling directory)
cp -a ../dai_rel_1.4.2/data data
cp -a ../dai_rel_1.4.2/log log
cp -a ../dai_rel_1.4.2/license license
cp -a ../dai_rel_1.4.2/tmp tmp

At this point, your experiments from the previous versions will be visible inside the Docker container.
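
The copy step can be wrapped in a loop that refuses to copy into a directory that already exists. The following is a minimal sketch; the helper name copy_dai_dirs is hypothetical, and the directory names in the example are the ones used in the steps above.

```shell
# Hypothetical helper: copy data, log, license, and tmp from old_dir to
# new_dir with cp -a, which preserves timestamps and permissions. A directory
# that already exists in new_dir is skipped, because cp -a would otherwise
# nest the copy inside it (for example new_dir/data/data).
copy_dai_dirs() {
    old_dir=$1
    new_dir=$2
    for d in data log license tmp; do
        if [ -e "$new_dir/$d" ]; then
            echo "skipping $d: $new_dir/$d already exists" >&2
            continue
        fi
        cp -a "$old_dir/$d" "$new_dir/$d" || return 1
    done
}

# Example, run from the parent of both version directories:
#   copy_dai_dirs dai_rel_1.4.2 dai-2.1.0
```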

Use docker images to find the new image tag.
Start the Driverless AI Docker image.
Connect to Driverless AI with your browser at http://Your-Driverless-AI-Host-Machine:12345.