
An Idiot's Guide to Testing ResNet50 Performance (Nvidia Docker/Ubuntu)


Testing ResNet50 performance is easy with the Hewlett Packard Deep Learning Benchmarking Suite (DLBS). This guide goes from start to finish, assuming a fresh OS installation. The benchmark recipe used here is tailored to GPU scaling. In other words, it starts with 1 GPU and then adds GPUs, up to 7, rerunning the benchmark each time.
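To picture what "starts with 1 and then adds GPUs" means, here is a minimal sketch that prints a growing list of GPU indices, one list per run. The function name gpu_sets is made up for illustration, and stepping up one GPU at a time is my assumption; the recipe's own configuration decides the real progression.

```shell
# Hypothetical helper: print one comma-separated GPU list per line,
# "0", then "0,1", and so on, up to the requested number of GPUs.
gpu_sets() {
    max=$1
    list=""
    i=0
    while [ "$i" -lt "$max" ]; do
        list="${list:+$list,}$i"   # add a comma only when the list is non-empty
        printf '%s\n' "$list"
        i=$((i + 1))
    done
}

# Show the progression for a 4 GPU box.
gpu_sets 4
```

On a machine with fewer than 7 GPUs you would expect the progression to stop at whatever is installed.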

First attach an Nvidia GPU.

Start by installing the required applications. You may not need all of the Python modules, but it's good to have them installed :

sudo apt install git

sudo apt install python

sudo apt install python-pip

sudo apt install curl

pip install numpy

pip install pandas

pip install matplotlib

pip install portpicker
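If you want to confirm the tools landed on your PATH before moving on, something like the following should do. It is only a sketch: missing_tools is a name I made up, and it checks the command-line tools only, not the Python modules.

```shell
# Hypothetical helper: print every command from the argument list
# that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The tools installed with apt above.
missing=$(missing_tools git python pip curl)
if [ -n "$missing" ]; then
    echo "Still missing: $missing"
else
    echo "All required tools found."
fi
```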

You're also going to need to set up the Nvidia drivers and CUDA :

Open the blacklist file with sudo vi /etc/modprobe.d/blacklist.conf and add the line blacklist nouveau
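If you would rather not edit the file by hand, the line can be appended from the shell. The sketch below only adds it when it is not already there, so running it twice does no harm. ensure_line is a hypothetical helper, and the demo runs against a scratch file because the real file needs root.

```shell
# Hypothetical helper: append LINE to FILE unless an identical line exists.
ensure_line() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demo on a scratch file. For the real thing you need root, for example:
#   echo "blacklist nouveau" | sudo tee -a /etc/modprobe.d/blacklist.conf
scratch=$(mktemp)
ensure_line "$scratch" "blacklist nouveau"
ensure_line "$scratch" "blacklist nouveau"   # second call changes nothing
cat "$scratch"
```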

Download the latest Nvidia CUDA installer from the page below, selecting the OS and OS version you're using.

https://developer.nvidia.com/cuda-downloads

Since this guide was built using Ubuntu 18.04, the following example is tailored for that.

Navigate to /home/username/Downloads

wget https://developer.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.168_418.67_linux.run

sudo chmod 755 cuda_10.1.168_418.67_linux.run

sudo ./cuda_10.1.168_418.67_linux.run

Install everything and reboot.

Check your installation using nvidia-smi. If you do not see a GPU listed, run lsmod | grep nouveau. If nouveau is loaded, you need to make sure it's blacklisted. On some occasions, blacklisting nouveau manually will conflict with the Nvidia blacklist. If this happens, comment out your manual entry in /etc/modprobe.d/blacklist.conf
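The same check can be rolled into a few lines of shell. This is a rough sketch, not an official diagnostic: module_listed is a name I made up, and it just looks for a module name in the first column of lsmod output.

```shell
# Hypothetical helper: succeed if the module named $1 shows up in
# lsmod-style output read from stdin.
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

if command -v lsmod >/dev/null 2>&1 && lsmod | module_listed nouveau; then
    echo "nouveau is loaded: check the blacklist and reboot"
elif command -v nvidia-smi >/dev/null 2>&1; then
    # -L lists the GPUs the Nvidia driver can see
    nvidia-smi -L || echo "nvidia-smi is installed but could not talk to the driver"
else
    echo "nvidia-smi not found: the driver is probably not installed"
fi
```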

The benchmark suite requires docker and nvidia-docker2

Install Docker :

First set up apt to use HTTPS:

sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common

Now add the GPG key for docker :

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

Now add the docker repo for Ubuntu:
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"

Update the apt package index :

sudo apt-get update

Install docker :

sudo apt-get install docker-ce docker-ce-cli containerd.io

Now install Nvidia-Docker2 :

Set up the GPG key and record your distribution string :

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

Add the repo :
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
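In case the distribution variable looks like magic: it sources /etc/os-release and glues the ID and VERSION_ID fields together, and that string picks the right repo list. Here is a small sketch that shows the result without touching your system. distribution_of is a hypothetical helper, and the scratch file just mimics the two fields Ubuntu 18.04 reports.

```shell
# Hypothetical helper: print ID followed by VERSION_ID from an
# os-release style file (defaults to /etc/os-release). The subshell
# keeps the sourced variables out of the calling shell.
distribution_of() {
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Demo with a scratch file holding Ubuntu 18.04 values.
demo=$(mktemp)
printf 'ID=ubuntu\nVERSION_ID="18.04"\n' > "$demo"
distribution=$(distribution_of "$demo")
echo "https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list"
rm -f "$demo"
```

With those values the URL ends in ubuntu18.04/nvidia-docker.list.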

Update the apt package index :

sudo apt-get update

Install Nvidia-Docker 2 and reload the Docker daemon :

sudo apt-get install -y nvidia-docker2

sudo pkill -SIGHUP dockerd
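A quick way to see whether the install took is to look for the nvidia runtime in Docker's daemon config. This sketch assumes the nvidia-docker2 package registers the runtime in /etc/docker/daemon.json, which is where I would expect it but may differ on your setup. has_nvidia_runtime is a made-up name.

```shell
# Hypothetical helper: succeed if the given Docker daemon config
# mentions an "nvidia" runtime (defaults to /etc/docker/daemon.json,
# an assumed location).
has_nvidia_runtime() {
    grep -q '"nvidia"' "${1:-/etc/docker/daemon.json}" 2>/dev/null
}

if has_nvidia_runtime; then
    echo "nvidia runtime is registered with Docker"
else
    echo "nvidia runtime not found in /etc/docker/daemon.json"
fi
```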

Now install the benchmarking framework :

Navigate to the user's home directory and make a new directory named mlbenchmarks.

Enter mlbenchmarks and clone the repository :

git clone https://github.com/HewlettPackard/dlcookbook-dlbs dlbs

Now navigate to dlbs/tutorials/recipes/multi_gpu_compute_scaling (relative to mlbenchmarks) and start the benchmark :

sudo ./run

This will check for the Nvidia Docker image, and if it's not there it will prompt you to download it. Copy and paste the command it gives you and allow the image to download.

sudo nvidia-docker pull nvcr.io/nvidia/tensorflow:18.04-py3
