Work notes from building an AMD-based desktop for deep learning projects with Ubuntu 18.04, CUDA, TensorFlow, and PyTorch
Monday Feb 17th ( President’s Day ) started at 10:30a
- added the CPU
- Add memory
- Ripjaws memory won’t fit under the Noctua fan, so have to rip open the covers

- add M.2
- Screw motherboard to case
- Add graphics card
- Hook up power
- Hook up connections
- Power on test
- Power supply just went on standby. After about 20 minutes of debugging, it turned out the on/off connections weren’t secure.
- Hooked up HD
- Power on

Finished 3:17p ( including lunch )
Saturday Feb 22nd (9:09pm)
- Adding to this post from the newly built PC.
- Downloaded Windows 10 Pro, transferred to USB key, installed and activated. So far so good
- The only complication to getting to this point from the last post ( Feb 17th) was the USB key not getting recognized. Switched to an older/slower USB ( black vs blue ) and it installed fine.
- Next thing is to install Ubuntu 18.xx
- Outstanding items
- Only saw 32GB of memory, had installed 48GB
- 3TB HD not recognized. Fixed March 1st: may have been a faulty connector, but got it recognized in the disk management utility.
- Created a new volume and named it D
- Ubuntu Install
- Installed via USB key ( see references below )
- Complication 1: have to unmount the /cdrom to proceed
- umount -l -r -f /cdrom
- <picture>
- https://ubuntuforums.org/showthread.php?t=1237721&highlight=the+installer+needs+to+commit+changes
- Complication 2: Install taking a long time
- Mirrors may be unstable – downloads take a long time
- https://discourse.ubuntu.com/t/check-why-it-takes-too-long-to-install/12792
- Install Nvidia drivers ( Ubuntu uses the open-source nouveau driver for Nvidia cards by default )
- https://www.linuxbabe.com/ubuntu/install-nvidia-driver-ubuntu-18-04
- Installed Nvidia driver 435, so will be using CUDA 10.1 alongside 9.2: https://medium.com/@IsaacJK/setting-up-a-ubuntu-18-04-1-lts-system-for-deep-learning-and-scientific-computing-fab19f7ca39d
- Went with 430 drivers instead of 435
Finished around midnight
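A quick way to confirm which driver actually ended up loaded is to ask nvidia-smi. The sketch below is not part of the original install steps; it assumes nvidia-smi is on the PATH once the driver is installed, and the helper name query_nvidia_driver is made up for illustration. It returns an empty list when the tool is missing or fails.

```python
import shutil
import subprocess


def query_nvidia_driver():
    """Return [(gpu_name, driver_version), ...], or [] if nvidia-smi is unavailable."""
    # Assumption: the Nvidia driver package put nvidia-smi on the PATH.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    rows = []
    for line in result.stdout.strip().splitlines():
        # Each line looks like "<gpu name>, <driver version>".
        name, _, driver = line.partition(",")
        rows.append((name.strip(), driver.strip()))
    return rows


if __name__ == "__main__":
    found = query_nvidia_driver()
    if not found:
        print("nvidia-smi not found or no GPU reported")
    for name, driver in found:
        print(f"{name}: driver {driver}")
```

On this build the driver column should start with 430 if the downgrade from 435 took effect.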
Final notes
- 500GB of the 1TB Samsung drive formatted as ext4 – /dev/nvme0n1p5
- Dual boot into Ubuntu first
- Formatted 8GB microSD card to hold Ubuntu bootable disk and installation
- Installs
- Nvidia driver – 430
- CUDA 10.1
- Anaconda
- Python 3.7
- TF 2.1.0 – conda activate tf-gpu
- PyTorch – conda activate pytorch
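The notes only show a GPU check for TensorFlow, so here is a comparable sketch for the PyTorch environment. It is an illustration rather than something run during the build: the function name torch_cuda_report is hypothetical, and it falls back to an "not installed" result if torch is missing from the active environment.

```python
import importlib.util


def torch_cuda_report():
    """Summarise whether PyTorch is installed and whether it can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        # torch is not installed in the active environment.
        return {"torch": None, "cuda_available": False, "device": None}
    import torch

    available = torch.cuda.is_available()
    # Only ask for the device name when CUDA is usable, otherwise the call raises.
    device = torch.cuda.get_device_name(0) if available else None
    return {"torch": torch.__version__, "cuda_available": available, "device": device}


if __name__ == "__main__":
    print(torch_cuda_report())
```

Run it after conda activate pytorch; on this machine the device entry would be expected to name the RTX 2060 if the driver and CUDA runtime line up.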
References
- Ubuntu install
- https://wordpress.com/post/numbersandcode.wordpress.com/179
- https://medium.com/@IsaacJK/setting-up-a-ubuntu-18-04-1-lts-system-for-deep-learning-and-scientific-computing-fab19f7ca39d
- https://linuxconfig.org/how-to-install-ubuntu-18-04-bionic-beaver
- https://linuxconfig.org/how-to-create-a-bootable-ubuntu-18-04-bionic-usb-stick-on-ms-windows
- https://unetbootin.github.io/
- conda install tensorflow-gpu – this also installs the CUDA toolkit and cuDNN into the conda environment (see cudatoolkit and cudnn in the package list):
The following NEW packages will be INSTALLED:
_libgcc_mutex: 0.1-main
_tflow_select: 2.1.0-gpu
absl-py: 0.9.0-py37_0
asn1crypto: 1.3.0-py37_0
astor: 0.8.0-py37_0
blas: 1.0-mkl
blinker: 1.4-py37_0
c-ares: 1.15.0-h7b6447c_1001
ca-certificates: 2020.1.1-0
cachetools: 3.1.1-py_0
certifi: 2019.11.28-py37_0
cffi: 1.14.0-py37h2e261b9_0
chardet: 3.0.4-py37_1003
click: 7.0-py_0
cryptography: 2.8-py37h1ba5d50_0
cudatoolkit: 10.1.243-h6bb024c_0
cudnn: 7.6.5-cuda10.1_0
cupti: 10.1.168-0
gast: 0.2.2-py37_0
google-auth: 1.11.2-py_0
google-auth-oauthlib: 0.4.1-py_2
google-pasta: 0.1.8-py_0
grpcio: 1.27.2-py37hf8bcb03_0
h5py: 2.10.0-py37h7918eee_0
hdf5: 1.10.4-hb1b8bf9_0
idna: 2.8-py37_0
intel-openmp: 2020.0-166
keras-applications: 1.0.8-py_0
keras-preprocessing: 1.1.0-py_1
ld_impl_linux-64: 2.33.1-h53a641e_7
libedit: 3.1.20181209-hc058e9b_0
libffi: 3.2.1-hd88cf55_4
libgcc-ng: 9.1.0-hdf63c60_0
libgfortran-ng: 7.3.0-hdf63c60_0
libprotobuf: 3.11.4-hd408876_0
libstdcxx-ng: 9.1.0-hdf63c60_0
markdown: 3.1.1-py37_0
mkl: 2020.0-166
mkl-service: 2.3.0-py37he904b0f_0
mkl_fft: 1.0.15-py37ha843d7b_0
mkl_random: 1.1.0-py37hd6b4f25_0
ncurses: 6.1-he6710b0_1
numpy: 1.18.1-py37h4f9e942_0
numpy-base: 1.18.1-py37hde5b4d6_1
oauthlib: 3.1.0-py_0
openssl: 1.1.1d-h7b6447c_4
opt_einsum: 3.1.0-py_0
pip: 20.0.2-py37_1
protobuf: 3.11.4-py37he6710b0_0
pyasn1: 0.4.8-py_0
pyasn1-modules: 0.2.7-py_0
pycparser: 2.19-py_0
pyjwt: 1.7.1-py37_0
pyopenssl: 19.1.0-py37_0
pysocks: 1.7.1-py37_0
python: 3.7.6-h0371630_2
readline: 7.0-h7b6447c_5
requests: 2.22.0-py37_1
requests-oauthlib: 1.3.0-py_0
rsa: 4.0-py_0
scipy: 1.4.1-py37h0b6359f_0
setuptools: 45.2.0-py37_0
six: 1.14.0-py37_0
sqlite: 3.31.1-h7b6447c_0
tensorboard: 2.1.0-py3_0
tensorflow: 2.1.0-gpu_py37h7a4bb67_0
tensorflow-base: 2.1.0-gpu_py37h6c5654b_0
tensorflow-estimator: 2.1.0-pyhd54b08b_0
tensorflow-gpu: 2.1.0-h0d30ee6_0
termcolor: 1.1.0-py37_1
tk: 8.6.8-hbc83047_0
urllib3: 1.25.8-py37_0
werkzeug: 1.0.0-py_0
wheel: 0.34.2-py37_0
wrapt: 1.11.2-py37h7b6447c_0
xz: 5.2.4-h14c3975_4
zlib: 1.2.11-h7b6447c_3
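To pull the GPU-related entries out of a long listing like the one above, a small parser is enough. This is a sketch that assumes each line has the form "name: version-build" as in the conda output; the function names and the keyword list are illustrative choices, and the sample lines are copied from the listing.

```python
def parse_conda_line(line):
    """Split 'name: version-build' into (name, version, build)."""
    name, _, rest = line.partition(":")
    # The build string follows the last hyphen; versions here contain none.
    version, _, build = rest.strip().rpartition("-")
    return name.strip(), version, build


def gpu_related(lines, keywords=("cuda", "cudnn", "cupti", "tensorflow")):
    """Keep only packages whose name contains one of the keywords."""
    parsed = [parse_conda_line(l) for l in lines if ":" in l]
    return [p for p in parsed if any(k in p[0] for k in keywords)]


sample = [
    "cudatoolkit: 10.1.243-h6bb024c_0",
    "cudnn: 7.6.5-cuda10.1_0",
    "cupti: 10.1.168-0",
    "numpy: 1.18.1-py37h4f9e942_0",
    "tensorflow-gpu: 2.1.0-h0d30ee6_0",
]

for name, version, build in gpu_related(sample):
    print(f"{name:16s} {version:10s} {build}")
```

The filtered result makes it easy to see that the environment carries its own CUDA 10.1 toolkit and cuDNN 7.6.5 next to TensorFlow 2.1.0.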
- Test TensorFlow and python3 install
(tf-gpu) dan@dan-X399-AORUS-PRO:~/dev/tensorflow-cnn-tutorial$ python3
Python 3.7.6 (default, Jan 8 2020, 19:59:22)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.__version__
'2.1.0'
>>> tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2020-02-23 01:48:20.149384: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-02-23 01:48:20.175975: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3493005000 Hz
2020-02-23 01:48:20.176961: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55cc220e6780 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-02-23 01:48:20.176985: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-02-23 01:48:20.178028: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-02-23 01:48:20.351783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:41:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.71GHz coreCount: 30 deviceMemorySize: 5.79GiB deviceMemoryBandwidth: 312.97GiB/s
2020-02-23 01:48:20.352047: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-23 01:48:20.353750: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-23 01:48:20.355686: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-02-23 01:48:20.355959: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-02-23 01:48:20.357749: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-02-23 01:48:20.358979: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-02-23 01:48:20.363192: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-23 01:48:20.364707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-02-23 01:48:20.364766: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-23 01:48:20.665490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-23 01:48:20.665531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-02-23 01:48:20.665540: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-02-23 01:48:20.667506: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 5220 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:41:00.0, compute capability: 7.5)
2020-02-23 01:48:20.669697: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55cc24b5ece0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-02-23 01:48:20.669713: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2060, Compute Capability 7.5
True
>>>
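The deprecation warning in the session above points to tf.config.list_physical_devices('GPU') as the replacement for tf.test.is_gpu_available(). A minimal sketch of the same check using that call follows; the wrapper name tf_gpu_report is made up for illustration, and it reports "not installed" if TensorFlow is missing from the active environment.

```python
import importlib.util


def tf_gpu_report():
    """Return the TensorFlow version and the names of the GPUs it can see."""
    if importlib.util.find_spec("tensorflow") is None:
        # TensorFlow is not installed in the active environment.
        return {"tensorflow": None, "gpus": []}
    import tensorflow as tf

    # Non-deprecated replacement for tf.test.is_gpu_available() (TF 2.1 and later).
    gpus = tf.config.list_physical_devices("GPU")
    return {"tensorflow": tf.__version__, "gpus": [gpu.name for gpu in gpus]}


if __name__ == "__main__":
    report = tf_gpu_report()
    print(report)
    print("GPU available:", bool(report["gpus"]))
```

A non-empty gpus list corresponds to the True printed at the end of the session.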
- Run MNIST test using this tutorial: https://github.com/dragen1860/TensorFlow-2.x-Tutorials
- The Cole Murray tutorial used previously wasn’t compatible with the TF 2.1 used here: https://github.com/ColeMurray/tensorflow-cnn-tutorial
(tf-gpu) dan@dan-X399-AORUS-PRO:~/dev/TensorFlow-2.x-Tutorials/03-Play-with-MNIST$ python3 main.py
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 1s 0us/step
datasets: (60000, 28, 28) (60000,) 0 255
2020-02-23 01:52:33.707955: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-02-23 01:52:33.720688: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:41:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.71GHz coreCount: 30 deviceMemorySize: 5.79GiB deviceMemoryBandwidth: 312.97GiB/s
2020-02-23 01:52:33.720807: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-23 01:52:33.721751: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-23 01:52:33.722790: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-02-23 01:52:33.722949: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-02-23 01:52:33.723951: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-02-23 01:52:33.724572: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-02-23 01:52:33.726953: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-23 01:52:33.727906: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-02-23 01:52:33.728194: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-02-23 01:52:33.751744: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3493005000 Hz
2020-02-23 01:52:33.752694: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5620d9fcbb40 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-02-23 01:52:33.752717: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-02-23 01:52:33.753560: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:41:00.0 name: GeForce RTX 2060 computeCapability: 7.5
coreClock: 1.71GHz coreCount: 30 deviceMemorySize: 5.79GiB deviceMemoryBandwidth: 312.97GiB/s
2020-02-23 01:52:33.753608: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-23 01:52:33.753625: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-23 01:52:33.753639: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-02-23 01:52:33.753654: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-02-23 01:52:33.753668: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-02-23 01:52:33.753682: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-02-23 01:52:33.753697: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-23 01:52:33.755060: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-02-23 01:52:33.755099: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-23 01:52:33.834700: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-23 01:52:33.834733: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-02-23 01:52:33.834740: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-02-23 01:52:33.836021: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5113 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:41:00.0, compute capability: 7.5)
2020-02-23 01:52:33.837739: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5620dd752280 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-02-23 01:52:33.837752: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2060, Compute Capability 7.5
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                multiple                  200960
_________________________________________________________________
dense_1 (Dense)              multiple                  65792
_________________________________________________________________
dense_2 (Dense)              multiple                  65792
_________________________________________________________________
dense_3 (Dense)              multiple                  2570
=================================================================
Total params: 335,114
Trainable params: 335,114
Non-trainable params: 0
_________________________________________________________________
2020-02-23 01:52:34.579333: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
0 loss: 1.2610700130462646 acc: 0.15625
200 loss: 0.43644407391548157 acc: 0.6821875
400 loss: 0.35265296697616577 acc: 0.8464062
600 loss: 0.30810198187828064 acc: 0.870625
800 loss: 0.2214876413345337 acc: 0.90234375
1000 loss: 0.29607510566711426 acc: 0.89453125
1200 loss: 0.2684360444545746 acc: 0.9134375
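As a sanity check on the model summary in the run above, the parameter counts can be reproduced by hand. A Dense layer has inputs × outputs weights plus one bias per output. The layer widths below (784 → 256 → 256 → 256 → 10) are an assumption inferred from the reported counts and the 28×28 MNIST input, not read from the tutorial's source.

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) for one fully connected layer."""
    return n_in * n_out + n_out


# Assumed widths: flattened 28x28 input, three hidden layers of 256, 10 classes.
layer_sizes = [28 * 28, 256, 256, 256, 10]

counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)

print(counts)  # [200960, 65792, 65792, 2570]
print(total)   # 335114
```

These match the dense, dense_1, dense_2, and dense_3 rows and the 335,114 total in the summary.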
















