Install on RHEL

This section describes how to install the Driverless AI Docker image on RHEL. The installation steps differ depending on whether your system has GPUs or is CPU only.

Environment

Operating System	GPUs?	Min Mem
RHEL with GPUs	Yes	64 GB
RHEL with CPUs	No	64 GB
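
The 64 GB minimum in the table can be checked before installing. The following is a minimal sketch, assuming a Linux host that exposes MemTotal in /proc/meminfo; the function name meets_min_memory is illustrative and is not part of Driverless AI.

```shell
# Hypothetical helper (not part of Driverless AI): succeed if a memory size
# given in kB meets a minimum given in GB. 1 GB is treated as 1024 * 1024 kB.
meets_min_memory() {
  total_kb=$1
  min_gb=${2:-64}
  [ "$total_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Read MemTotal (reported in kB) from /proc/meminfo when it is available.
if [ -r /proc/meminfo ]; then
  host_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  if [ -n "$host_kb" ] && meets_min_memory "$host_kb" 64; then
    echo "Memory check passed: ${host_kb} kB"
  else
    echo "Memory check failed: ${host_kb} kB is below 64 GB"
  fi
fi
```

MemTotal is usually slightly lower than the installed RAM because the kernel reserves some memory, so treat a narrow miss as a prompt to check the hardware manually rather than as a hard failure.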

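Install on RHEL with GPUs

Note: Refer to the following links for more information about using RHEL with GPUs. These links describe how to disable automatic updates and specific package updates. This is necessary to prevent a mismatch between the NVIDIA driver and the kernel, which can lead to GPU failures.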

https://access.redhat.com/solutions/2372971

https://www.rootusers.com/how-to-disable-specific-package-updates-in-rhel-centos/

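Watch the installation video here. Some of the images in this video may change between releases, but the installation steps remain the same.

Note

As of this writing, Driverless AI has been tested on RHEL versions 7.4, 8.3, and 8.4.

Open a terminal and ssh to the machine that will run Driverless AI. After you log in, perform the following steps.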

Retrieve the Driverless AI Docker image from https://www.h2o.ai/download/.
Install Docker EE on RHEL (if it is not already installed). Follow the instructions at https://docs.docker.com/engine/installation/linux/docker-ee/rhel/.

Alternatively, you can run on Docker CE.

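sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# "fast" is accepted by yum on RHEL 7; on RHEL 8, where yum is provided by dnf, run "sudo yum makecache" instead
sudo yum makecache fast
sudo yum -y install docker-ce
sudo systemctl start docker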
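
The result of the Docker installation steps above can be confirmed with a short check. This is a minimal sketch, assuming a systemd-based host; check_cmd is a hypothetical helper name, and the script only reports status without changing anything.

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found"
  else
    echo "missing"
  fi
}

echo "docker: $(check_cmd docker)"

# If both docker and systemctl are present, report the daemon state.
# `systemctl is-active` prints a state such as "active" or "inactive".
if [ "$(check_cmd docker)" = "found" ] && [ "$(check_cmd systemctl)" = "found" ]; then
  echo "docker service: $(systemctl is-active docker 2>/dev/null)"
fi
```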

Install nvidia-docker2 (if it is not already installed). More information is available at https://github.com/NVIDIA/nvidia-docker/blob/master/README.md.

# NOTE: apt-key and apt-get are Debian/Ubuntu tools and are not available on RHEL.
# On RHEL, use the yum-based nvidia-docker commands shown further below.
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt-get install -y nvidia-docker2
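Note: To have the nvidia-docker service start automatically when the server reboots, run the following command. If you do not run this command, you must start the nvidia-docker service manually after each reboot; otherwise the GPUs will not appear as available.
sudo systemctl enable nvidia-docker
Alternatively, if you installed Docker CE above, you can install nvidia-docker as follows.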
curl -s -L https://nvidia.github.io/nvidia-docker/centos7/x86_64/nvidia-docker.repo | \
sudo tee /etc/yum.repos.d/nvidia-docker.repo
sudo yum install nvidia-docker2

Verify that the NVIDIA driver is running. If the driver is not running, go to http://www.nvidia.com/Download/index.aspx?lang=ko-KR to download the latest NVIDIA Tesla V/P/K series driver.

nvidia-docker run --rm nvidia/cuda nvidia-smi

Set up a directory for the version of Driverless AI on the host machine.

# Set up directory with the version name
mkdir dai-1.10.1.2

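Change directories to the new folder, then load the Driverless AI Docker image. The commands below assume that the image archive you downloaded earlier has been placed in this directory.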

# cd into the new directory
cd dai-1.10.1.2

# Load the Driverless AI docker image
docker load < dai-docker-centos7-x86_64-1.10.1.2.tar.gz

Enable persistence of the GPU. This must be run once after every reboot. For more information, see http://docs.nvidia.com/deploy/driver-persistence/index.html.

sudo nvidia-smi -pm 1
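
Because persistence mode has to be enabled again after every reboot, it can be useful to check its current state. The following is a minimal sketch that relies on the persistence_mode field of nvidia-smi --query-gpu, which reports Enabled or Disabled for each GPU; all_persistence_enabled is a hypothetical helper name.

```shell
# Hypothetical helper: read one persistence-mode value per line on stdin and
# succeed only if there is at least one line and every line is "Enabled".
all_persistence_enabled() {
  awk '{ n++; if ($0 != "Enabled") bad = 1 } END { if (n == 0 || bad) exit 1 }'
}

# Only query the GPUs when nvidia-smi is installed on this host.
if command -v nvidia-smi >/dev/null 2>&1; then
  if nvidia-smi --query-gpu=persistence_mode --format=csv,noheader 2>/dev/null \
      | all_persistence_enabled; then
    echo "Persistence mode is enabled on all GPUs"
  else
    echo "Persistence mode is not enabled on every GPU; run: sudo nvidia-smi -pm 1"
  fi
fi
```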

Set up the data, log, license, and tmp directories on the host machine (within the new directory).

# Set up the data, log, license, and tmp directories on the host machine
mkdir data
mkdir log
mkdir license
mkdir tmp
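
The four mkdir commands above fail if a directory already exists, for example when the step is repeated. A minimal sketch of an idempotent variant follows, using mkdir -p; make_dai_dirs is a hypothetical helper name, and the demonstration runs in a scratch directory rather than in your version directory.

```shell
# Hypothetical helper: create the four host directories that the launch
# command mounts into the container. mkdir -p does not fail if they exist.
make_dai_dirs() {
  base=${1:-.}
  for d in data log license tmp; do
    mkdir -p "$base/$d" || return 1
  done
}

# Demonstrated in a scratch directory so that nothing in the current
# directory is touched; in practice you would pass your version directory.
scratch=$(mktemp -d)
make_dai_dirs "$scratch"
ls "$scratch"
```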

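At this point, you can copy data into the data directory on the host machine. The data will be visible inside the Docker container.
Run docker images to find the image tag.
Start the Driverless AI Docker image, replacing the tag in the commands below with your image tag. Depending on your installed version of Docker, use either the docker run --runtime=nvidia command (Docker 19.03 or later) or the nvidia-docker command (Docker versions earlier than 19.03). As of version 1.10, the Driverless AI Docker image runs with an internal tini, which is equivalent to using --init from Docker. If both are enabled in the launch command, tini prints a (harmless) warning message. For GPU users, the GPU needs --pid=host for nvml, which prevents tini from running as pid 1, so the same (still harmless) warning message appears.

We recommend --shm-size=256m in the docker launch command. If you plan to build image auto models extensively, we recommend --shm-size=2g instead.

Note: Use docker version to check which version of Docker you are using.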

# Start the Driverless AI Docker image (Docker 19.03 or later)
docker run --runtime=nvidia \
   --pid=host \
   --rm \
   --shm-size=256m \
   -u `id -u`:`id -g` \
   -p 12345:12345 \
   -v `pwd`/data:/data \
   -v `pwd`/log:/log \
   -v `pwd`/license:/license \
   -v `pwd`/tmp:/tmp \
   h2oai/dai-centos7-x86_64:1.10.1-cuda11.2.2.xx

# Start the Driverless AI Docker image (Docker versions earlier than 19.03)
nvidia-docker run \
   --pid=host \
   --rm \
   --shm-size=256m \
   -u `id -u`:`id -g` \
   -p 12345:12345 \
   -v `pwd`/data:/data \
   -v `pwd`/log:/log \
   -v `pwd`/license:/license \
   -v `pwd`/tmp:/tmp \
   h2oai/dai-centos7-x86_64:1.10.1-cuda11.2.2.xx
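
The choice between the two launch commands above depends only on the Docker version, so it can be sketched as a small version comparison. This sketch assumes GNU sort (for the -V version sort); version_ge and launcher_for are hypothetical helper names.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: print the launch command prefix for a Docker version.
launcher_for() {
  if version_ge "$1" "19.03"; then
    echo "docker run --runtime=nvidia"
  else
    echo "nvidia-docker run"
  fi
}

# Ask the local daemon for its version when Docker is installed and running.
if command -v docker >/dev/null 2>&1; then
  server_version=$(docker version --format '{{.Server.Version}}' 2>/dev/null)
  if [ -n "$server_version" ]; then
    echo "Use: $(launcher_for "$server_version")"
  fi
fi
```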

Driverless AI will begin running:

--------------------------------
Welcome to H2O.ai's Driverless AI
---------------------------------

- Put data in the volume mounted at /data
- Logs are written to the volume mounted at /log/20180606-044258
- Connect to Driverless AI on port 12345 inside the container
- Connect to Jupyter notebook on port 8888 inside the container

Connect to Driverless AI with your browser at http://Your-Driverless-AI-Host-Machine:12345.
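
Before opening the browser, you may want to confirm from the command line that the port answers. The following is a minimal sketch, assuming curl is installed; dai_url and wait_for_dai are hypothetical helper names, and the port defaults to the 12345 mapping used in the launch command.

```shell
# Hypothetical helper: build the URL for a host and an optional port.
dai_url() {
  echo "http://$1:${2:-12345}"
}

# Hypothetical helper: poll the URL until curl receives any HTTP response or
# the attempts run out. Returns 0 on a response, 1 otherwise.
wait_for_dai() {
  url=$1
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -s -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  return 1
}

echo "Driverless AI URL: $(dai_url localhost)"
```

On the host machine, wait_for_dai "$(dai_url localhost)" && echo "Driverless AI is answering" waits for up to about a minute, plus any time curl spends on each attempt, for the server to come up.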

Install on RHEL with CPUs

This section describes how to install and start the Driverless AI Docker image on RHEL. Note that this uses docker and not nvidia-docker.

Watch the installation video here. Some of the images in this video may change between releases, but the installation steps remain the same.

Note

As of this writing, Driverless AI has been tested on RHEL versions 7.4, 8.3, and 8.4.

Open a terminal and ssh to the machine that will run Driverless AI. After you log in, perform the following steps.

Install Docker EE on RHEL (if it is not already installed). Follow the instructions at https://docs.docker.com/engine/installation/linux/docker-ee/rhel/.

Alternatively, you can run on Docker CE.

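sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# "fast" is accepted by yum on RHEL 7; on RHEL 8, where yum is provided by dnf, run "sudo yum makecache" instead
sudo yum makecache fast
sudo yum -y install docker-ce
sudo systemctl start docker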

On the machine that is running Docker EE, retrieve the Driverless AI Docker image from https://www.h2o.ai/download/.
Set up a directory for the version of Driverless AI on the host machine.

# Set up directory with the version name
mkdir dai-1.10.1.2

Load the Driverless AI Docker image. The command below assumes that the image archive is in your current working directory.

# Load the Driverless AI Docker image
docker load < dai-docker-centos7-x86_64-1.10.1.2.tar.gz

Set up the data, log, license, and tmp directories (within the new directory):

# cd into the directory associated with your version of Driverless AI
cd dai-1.10.1.2

# Set up the data, log, license, and tmp directories on the host machine
mkdir data
mkdir log
mkdir license
mkdir tmp

Copy data into the data directory on the host. The data will be visible inside the Docker container at /<user-home>/data.
Run docker images to find the image tag.
Start the Driverless AI Docker image. GPU support will not be available. As of version 1.10, the Driverless AI Docker image runs with an internal tini, which is equivalent to using --init from Docker. If both are enabled in the launch command, tini prints a (harmless) warning message.

We recommend --shm-size=256m in the docker launch command. If you plan to build image auto models extensively, we recommend --shm-size=2g instead.
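
The tag lookup step above (docker images) can be narrowed to Driverless AI images only. This is a minimal sketch that uses the --format option of docker images; filter_dai_images is a hypothetical helper name, and the h2oai/dai prefix is taken from the example image name used in this guide.

```shell
# Hypothetical helper: keep only Driverless AI images from a list of
# "repository:tag" lines on stdin. Prints nothing when there is no match.
filter_dai_images() {
  grep '^h2oai/dai' || true
}

# List local images as repository:tag when Docker is installed.
if command -v docker >/dev/null 2>&1; then
  docker images --format '{{.Repository}}:{{.Tag}}' 2>/dev/null | filter_dai_images
fi
```

Each line that is printed can be used as the image name in the launch command.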

# Start the Driverless AI Docker image
docker run \
  --pid=host \
  --rm \
  --shm-size=256m \
  -u `id -u`:`id -g` \
  -p 12345:12345 \
  -v `pwd`/data:/data \
  -v `pwd`/log:/log \
  -v `pwd`/license:/license \
  -v `pwd`/tmp:/tmp \
  -v /etc/passwd:/etc/passwd:ro \
  -v /etc/group:/etc/group:ro \
  h2oai/dai-centos7-x86_64:1.10.1-cuda11.2.2.xx

Driverless AI will begin running:

--------------------------------
Welcome to H2O.ai's Driverless AI
---------------------------------

- Put data in the volume mounted at /data
- Logs are written to the volume mounted at /log/20180606-044258
- Connect to Driverless AI on port 12345 inside the container
- Connect to Jupyter notebook on port 8888 inside the container

Connect to Driverless AI with your browser at http://Your-Driverless-AI-Host-Machine:12345.

Stopping the Docker Image

To stop the Driverless AI Docker image, type Ctrl + C in the terminal window that is running the Driverless AI Docker image (for example, Terminal on Mac OS X or PowerShell on Windows 10).

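Upgrading the Docker Image

This section provides instructions for upgrading Driverless AI versions that were installed in a Docker container. These steps ensure that existing experiments are saved.

WARNING: Experiments, MLIs, and MOJOs reside in the Driverless AI tmp directory and are not automatically upgraded when Driverless AI is upgraded.

Build MLI models before upgrading.

Build MOJO pipelines before upgrading.

Stop Driverless AI and make a backup of your Driverless AI tmp directory before upgrading.

If you did not build MLI on a model before upgrading Driverless AI, then you will not be able to view MLI on that model after upgrading. Before upgrading, be sure to run MLI jobs on the models that you want to continue to interpret in future releases. If an MLI job appears in the list of Interpreted Models in your current version, then it will be retained after upgrading.

If you did not build a MOJO pipeline on a model before upgrading Driverless AI, then you will not be able to build a MOJO pipeline on that model after upgrading. Before upgrading, be sure to build MOJO pipelines on all desired models, and then back up your Driverless AI tmp directory.

Note: Stop Driverless AI if it is still running.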
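
The backup of the tmp directory recommended above can be made with tar once Driverless AI has been stopped. The following is a minimal sketch; backup_dai_tmp, the archive naming scheme, and the scratch directory used for the demonstration are all illustrative assumptions.

```shell
# Hypothetical helper: archive the tmp directory of a Driverless AI version
# directory into a gzip-compressed tarball and print the archive path.
backup_dai_tmp() {
  dai_dir=$1
  out_dir=${2:-.}
  archive_path="$out_dir/dai-tmp-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
  tar -czf "$archive_path" -C "$dai_dir" tmp && echo "$archive_path"
}

# Demonstrated on a scratch directory that stands in for a real install.
demo=$(mktemp -d)
mkdir -p "$demo/dai/tmp"
echo "placeholder" > "$demo/dai/tmp/experiment.txt"
archive=$(backup_dai_tmp "$demo/dai" "$demo")

# List the archive contents to confirm what was captured.
tar -tzf "$archive"
```

In practice you would pass your current version directory as the first argument and a location outside that directory as the second.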

Requirements

We recommend NVIDIA driver version 471.68 or later in your host environment (GPU only) for a seamless experience on all NVIDIA architectures, including Ampere. Driverless AI ships with CUDA 11.2.2 for GPUs, but the driver must exist in the host environment.

Go to NVIDIA download driver to get the latest NVIDIA Tesla A/T/V/P/K series driver. For reference on CUDA Toolkit and Minimum Required Driver Versions and CUDA Toolkit and Corresponding Driver Versions, see here.
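
The driver recommendation above can be checked on the host before upgrading. This is a minimal sketch, assuming GNU sort (for the -V version sort) and the driver_version field of nvidia-smi --query-gpu; version_ge and driver_ok are hypothetical helper names.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: compare a driver version with the recommended
# minimum, which defaults to 471.68 as stated in the requirements.
driver_ok() {
  version_ge "$1" "${2:-471.68}"
}

if command -v nvidia-smi >/dev/null 2>&1; then
  # The query prints one line per GPU; every GPU uses the same host driver,
  # so the first line is enough.
  driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)
  if [ -n "$driver" ] && driver_ok "$driver"; then
    echo "Driver $driver meets the recommended minimum"
  else
    echo "Driver '$driver' is missing or older than 471.68"
  fi
fi
```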

Upgrade Steps

SSH into the IP address of the machine that is running Driverless AI.
Set up a directory for the version of Driverless AI on the host machine.

# Set up directory with the version name
mkdir dai-1.10.1.2

# cd into the new directory
cd dai-1.10.1.2

Retrieve the Driverless AI package from https://www.h2o.ai/download/ and add it to the new directory.
Load the Driverless AI Docker image inside the new directory.

# Load the Driverless AI docker image
docker load < dai-docker-centos7-x86_64-1.10.1.2.tar.gz

Copy the data, log, license, and tmp directories from the previous Driverless AI directory to the new Driverless AI directory.

# Copy the data, log, license, and tmp directories on the host machine.
# Run from inside dai-1.10.1.2; dai_rel_1.4.2 stands for the previous
# version's directory and is assumed to sit next to the new one.
cp -a ../dai_rel_1.4.2/data ./data
cp -a ../dai_rel_1.4.2/log ./log
cp -a ../dai_rel_1.4.2/license ./license
cp -a ../dai_rel_1.4.2/tmp ./tmp

At this point, your experiments from the previous version will be visible inside the Docker container.
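
The copy step above can also be written as a loop that reports any directory missing from the old installation instead of failing partway. The following is a minimal sketch; copy_dai_dirs is a hypothetical helper name, and the demonstration uses a scratch directory with made-up contents rather than real Driverless AI directories.

```shell
# Hypothetical helper: copy data, log, license, and tmp from an old version
# directory into a new one, skipping any that the old directory lacks.
copy_dai_dirs() {
  old=$1
  new=$2
  mkdir -p "$new" || return 1
  for d in data log license tmp; do
    if [ -d "$old/$d" ]; then
      # cp -a preserves ownership, permissions, and timestamps.
      cp -a "$old/$d" "$new/" || return 1
    else
      echo "skipping missing directory: $old/$d" >&2
    fi
  done
}

# Demonstrated on a scratch layout that stands in for two version directories.
demo=$(mktemp -d)
mkdir -p "$demo/old/data" "$demo/old/tmp"
echo "rows" > "$demo/old/data/train.csv"
copy_dai_dirs "$demo/old" "$demo/new"
ls "$demo/new"
```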

Use docker images to find the new image tag.
Start the Driverless AI Docker image.
Connect to Driverless AI with your browser at http://Your-Driverless-AI-Host-Machine:12345.