Enabling Notifications

Driverless AI can be configured to trigger a user-defined script at the beginning and end of an experiment. This functionality can be used to send notifications to services like Slack or to trigger a machine shutdown.

The config.toml file exposes the following variables:

listeners_experiment_start: Registers an absolute location of a script that gets executed at the start of an experiment.
listeners_experiment_done: Registers an absolute location of a script that gets executed when an experiment is finished successfully.

Driverless AI accepts any executable as a script. (For example, a script can be implemented in Bash or Python.) There are only two requirements:

The specified script can be executed (that is, the file has the executable flag set).
The script accepts command-line parameters.
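Before registering a script, it can help to confirm that it meets the first requirement. The following is a minimal sketch, not part of Driverless AI: the function name `check_listener` is hypothetical, and the absolute-path check reflects the description of the config.toml variables above.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for a notification script.
# Not shipped with Driverless AI; shown only as an illustration.
check_listener() {
    local script="${1}"

    # The config.toml variables expect an absolute location.
    case "${script}" in
        /*) ;;
        *)
            echo "not an absolute path: ${script}" >&2
            return 1
            ;;
    esac

    # The file must exist and have the executable flag set.
    if [ ! -f "${script}" ] || [ ! -x "${script}" ]; then
        echo "missing or not executable: ${script} (try: chmod +x ${script})" >&2
        return 1
    fi

    echo "ok: ${script}"
}
```

For example, calling `check_listener /opt/h2oai/dai/scripts/on_start.sh` prints an `ok:` line when the file exists and is executable, and returns a non-zero status otherwise.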

Script Interfaces

When Driverless AI executes a script, it passes the following parameters on the script's command line, in this order:

Application ID: A unique identifier of a running Driverless AI instance.
User ID: The identification of the user who is running the experiment.
Experiment ID: A unique identifier of the experiment.
Experiment Path: The location of the experiment results.
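As an illustration of how a script might consume these four parameters, the following sketch posts a message to a Slack incoming webhook when an experiment finishes. It rests on several assumptions: `SLACK_WEBHOOK_URL` is an environment variable you would have to provide to the Driverless AI process yourself (it is not a Driverless AI setting), the function names are hypothetical, and `curl` is assumed to be installed.

```shell
#!/usr/bin/env bash
# Sketch of a listener script that notifies Slack.
# SLACK_WEBHOOK_URL is an assumed environment variable, not a Driverless AI option.

build_payload() {
    # Positional order: application ID, user ID, experiment ID, experiment path.
    local text="Experiment ${3} (user ${2}, instance ${1}) finished. Results: ${4}"
    # Escape backslashes and double quotes so the JSON string stays valid.
    text="$(printf '%s' "${text}" | sed 's/\\/\\\\/g; s/"/\\"/g')"
    printf '{"text": "%s"}' "${text}"
}

notify_slack() {
    if [ -z "${SLACK_WEBHOOK_URL:-}" ]; then
        echo "SLACK_WEBHOOK_URL is not set; skipping notification" >&2
        return 0
    fi
    # Slack incoming webhooks accept a JSON body with a "text" field.
    curl --silent --show-error --fail \
        -X POST -H 'Content-type: application/json' \
        --data "$(build_payload "$@")" \
        "${SLACK_WEBHOOK_URL}"
}

# When Driverless AI runs this file, forward the four parameters it passes.
if [ "$#" -ge 4 ]; then
    notify_slack "$@"
fi
```

Skipping the notification when the webhook URL is missing keeps the script from failing in environments where Slack is not configured.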

Example

The following example demonstrates how to use notification scripts to shut down an EC2 machine that is running Driverless AI after all launched experiments are finished. The example shows how to use a notification script in a Docker container and with native installations. The idea is to maintain a simple counter (the number of files in a directory) that tracks the number of running experiments. When the counter reaches 0, the specified action is performed.

In this example, we use the AWS command line utility to shut down the machine; however, the same functionality can be achieved by executing sudo poweroff (if the user has password-less sudo configured) or poweroff (if the poweroff binary has the setuid bit set together with the executable bit). For more information, see https://unix.stackexchange.com/questions/85663/poweroff-or-reboot-as-normal-user.

The `on_start` Script

This script increases the counter of running experiments.

#!/usr/bin/env bash

app_id="${1}"
experiment_id="${3}"
tmp_dir="${TMPDIR:-/tmp}/${app_id}"
exp_file="${tmp_dir}/${experiment_id}"

mkdir -p "${tmp_dir}"
touch "${exp_file}"

The `on_done` Script

This script decreases the counter and shuts down the machine when the counter reaches 0.

#!/usr/bin/env bash

app_id="${1}"
experiment_id="${3}"
tmp_dir="${TMPDIR:-/tmp}/${app_id}"
exp_file="${tmp_dir}/${experiment_id}"

# Remove this experiment's marker file if it exists.
if [ -f "${exp_file}" ]; then
    rm -f "${exp_file}"
fi

# Count the remaining marker files, one per running experiment.
running_experiments=$(ls -1 "${tmp_dir}" | wc -l)

if [ "${running_experiments}" -gt 0 ]; then
    echo "There are still ${running_experiments} running experiments!"
else
    echo "No experiments running! Machine is going to shut down!"
    # Use the instance metadata API to get the instance ID, then use the AWS CLI to stop the machine.
    # This expects that the AWS CLI is properly configured and is permitted to stop instances.
    aws ec2 stop-instances --instance-ids "$(curl -s http://169.254.169.254/latest/meta-data/instance-id)"
fi
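The counter logic of the two scripts can be tried locally before it is wired into Driverless AI. The following dry-run sketch mirrors the scripts above as shell functions and replaces the shutdown with a message. The function names are hypothetical, and the IDs used to call them are made-up values.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the file-based counter; nothing here stops a machine.
# Function names are hypothetical and mirror the on_start/on_done scripts.

counter_dir() {
    # One directory per application ID, as in the scripts above.
    echo "${TMPDIR:-/tmp}/${1}"
}

on_start() {
    local dir
    dir="$(counter_dir "${1}")"
    mkdir -p "${dir}"
    # One marker file per running experiment (third parameter).
    touch "${dir}/${3}"
}

on_done() {
    local dir remaining
    dir="$(counter_dir "${1}")"
    mkdir -p "${dir}"
    rm -f "${dir}/${3}"
    # Count remaining marker files; $(( )) strips padding that wc may add.
    remaining=$(( $(find "${dir}" -mindepth 1 -maxdepth 1 -type f | wc -l) ))
    if [ "${remaining}" -gt 0 ]; then
        echo "running: ${remaining}"
    else
        echo "idle: would shut down now"
    fi
}
```

Calling `on_start` for two experiment IDs and then `on_done` for each of them in turn reports one running experiment after the first call and the idle message after the second.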

For Docker installs, copy the config.toml file from inside the Docker image to your local filesystem. (The command below uses docker run --gpus all, which is for GPU environments. For non-GPU environments, use docker run.)

# In your Driverless AI folder (for example, dai_1.5.1),
# make config and scripts directories
mkdir config
mkdir scripts

# Copy the config.toml file to the new config directory.
docker run --gpus all \
  --pid=host \
  --rm \
  -u `id -u`:`id -g` \
  -v `pwd`/config:/config \
  --entrypoint bash \
  h2oai/dai-ubi8-x86_64:2.2.1-cuda11.8.0.xx \
  -c "cp /etc/dai/config.toml /config"

Edit the Notification scripts section in the config.toml file and save your changes. Note that in this example, the scripts are saved to a dai_VERSION/scripts folder.

# Notification scripts
# - the variable points to a location of script which is executed at given event in experiment lifecycle
# - the script should have executable flag enabled
# - use of absolute path is suggested
# The on experiment start notification script location
listeners_experiment_start = "dai_VERSION/scripts/on_start.sh"
# The on experiment finished notification script location
listeners_experiment_done = "dai_VERSION/scripts/on_done.sh"

Start Driverless AI with the DRIVERLESS_AI_CONFIG_FILE environment variable set. Make sure it points to the location of the edited config.toml file so that the software finds the configuration file. (The command below uses docker run --gpus all, which is for GPU environments. For non-GPU environments, use docker run.)

docker run --gpus all \
  --pid=host \
  --rm \
  -u `id -u`:`id -g` \
  -e DRIVERLESS_AI_CONFIG_FILE="/config/config.toml" \
  -v `pwd`/config:/config \
  -v `pwd`/data:/data \
  -v `pwd`/log:/log \
  -v `pwd`/license:/license \
  -v `pwd`/tmp:/tmp \
  -v `pwd`/scripts:/scripts \
  h2oai/dai-ubi8-x86_64:2.2.1-cuda11.8.0.xx

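For native installs, export the DRIVERLESS_AI_CONFIG_FILE environment variable so that it points to the Driverless AI config.toml file, or add the export to ~/.bashrc. For example: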

# DEB and RPM
export DRIVERLESS_AI_CONFIG_FILE="/etc/dai/config.toml"

# TAR SH
export DRIVERLESS_AI_CONFIG_FILE="/path/to/your/unpacked/dai/directory/config.toml"

Edit the Notification scripts section in the config.toml file to point to the new scripts. Save your changes when you are done.

# Notification scripts
# - the variable points to a location of script which is executed at given event in experiment lifecycle
# - the script should have executable flag enabled
# - use of absolute path is suggested
# The on experiment start notification script location
listeners_experiment_start = "/opt/h2oai/dai/scripts/on_start.sh"
# The on experiment finished notification script location
listeners_experiment_done = "/opt/h2oai/dai/scripts/on_done.sh"

Start Driverless AI. Note that the command used to start Driverless AI varies depending on your install type.

# Deb or RPM with systemd (preferred for Deb and RPM):
# Start Driverless AI.
sudo systemctl start dai

# Deb or RPM without systemd:
# Start Driverless AI.
sudo -H -u dai /opt/h2oai/dai/run-dai.sh

# Tar.sh
# Start Driverless AI
./run-dai.sh
