AMD deep learning desktop build

Work notes from building an AMD-based desktop for deep learning projects, running Ubuntu 18.04 with CUDA, TensorFlow, and PyTorch.

Monday, Feb 17th (Presidents' Day), started at 10:30am

  • Added the CPU
  • Added memory
    • The Ripjaws memory wouldn't fit under the Noctua fan, so I had to pull the heat-spreader covers off
  • Added the M.2 drive
  • Screwed the motherboard to the case
  • Added the graphics card
  • Hooked up power
  • Hooked up connections
  • Power-on test

    • The power supply went straight to standby. After about 20 minutes of debugging, it turned out the power-switch connections weren't secure.

  • Hooked up the HD
  • Powered on

Finished at 3:17pm (including lunch)

Saturday Feb 22nd (9:09pm)

Finished around midnight

Final notes

  • 500GB partition of the 1TB Samsung NVMe drive, formatted ext4 – /dev/nvme0n1p5
  • Dual boot, booting into Ubuntu first
  • Formatted an 8GB microSD card to hold the bootable Ubuntu installer
  • Installs
    • Nvidia driver – 430
    • CUDA 10.1
    • Anaconda
    • Python 3.7
    • TF 2.1.0 – conda activate tf-gpu
    • Pytorch – conda activate pytorch
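A quick way to confirm the PyTorch environment actually sees the GPU (a sketch; run inside the pytorch conda env from the notes, assuming a CUDA build of PyTorch was installed):

```python
# Sanity check that PyTorch's CUDA build can see the RTX 2060.
import torch

print(torch.__version__)
print(torch.cuda.is_available())          # True when a CUDA device is visible
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "GeForce RTX 2060"
```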

References

  • Ubuntu install
  • Conda install tensorflow-gpu – this will also install CUDA (the cudatoolkit and cudnn packages).
    • The following NEW packages will be INSTALLED:

      _libgcc_mutex: 0.1-main
      _tflow_select: 2.1.0-gpu
      absl-py: 0.9.0-py37_0
      asn1crypto: 1.3.0-py37_0
      astor: 0.8.0-py37_0
      blas: 1.0-mkl
      blinker: 1.4-py37_0
      c-ares: 1.15.0-h7b6447c_1001
      ca-certificates: 2020.1.1-0
      cachetools: 3.1.1-py_0
      certifi: 2019.11.28-py37_0
      cffi: 1.14.0-py37h2e261b9_0
      chardet: 3.0.4-py37_1003
      click: 7.0-py_0
      cryptography: 2.8-py37h1ba5d50_0
      cudatoolkit: 10.1.243-h6bb024c_0
      cudnn: 7.6.5-cuda10.1_0
      cupti: 10.1.168-0
      gast: 0.2.2-py37_0
      google-auth: 1.11.2-py_0
      google-auth-oauthlib: 0.4.1-py_2
      google-pasta: 0.1.8-py_0
      grpcio: 1.27.2-py37hf8bcb03_0
      h5py: 2.10.0-py37h7918eee_0
      hdf5: 1.10.4-hb1b8bf9_0
      idna: 2.8-py37_0
      intel-openmp: 2020.0-166
      keras-applications: 1.0.8-py_0
      keras-preprocessing: 1.1.0-py_1
      ld_impl_linux-64: 2.33.1-h53a641e_7
      libedit: 3.1.20181209-hc058e9b_0
      libffi: 3.2.1-hd88cf55_4
      libgcc-ng: 9.1.0-hdf63c60_0
      libgfortran-ng: 7.3.0-hdf63c60_0
      libprotobuf: 3.11.4-hd408876_0
      libstdcxx-ng: 9.1.0-hdf63c60_0
      markdown: 3.1.1-py37_0
      mkl: 2020.0-166
      mkl-service: 2.3.0-py37he904b0f_0
      mkl_fft: 1.0.15-py37ha843d7b_0
      mkl_random: 1.1.0-py37hd6b4f25_0
      ncurses: 6.1-he6710b0_1
      numpy: 1.18.1-py37h4f9e942_0
      numpy-base: 1.18.1-py37hde5b4d6_1
      oauthlib: 3.1.0-py_0
      openssl: 1.1.1d-h7b6447c_4
      opt_einsum: 3.1.0-py_0
      pip: 20.0.2-py37_1
      protobuf: 3.11.4-py37he6710b0_0
      pyasn1: 0.4.8-py_0
      pyasn1-modules: 0.2.7-py_0
      pycparser: 2.19-py_0
      pyjwt: 1.7.1-py37_0
      pyopenssl: 19.1.0-py37_0
      pysocks: 1.7.1-py37_0
      python: 3.7.6-h0371630_2
      readline: 7.0-h7b6447c_5
      requests: 2.22.0-py37_1
      requests-oauthlib: 1.3.0-py_0
      rsa: 4.0-py_0
      scipy: 1.4.1-py37h0b6359f_0
      setuptools: 45.2.0-py37_0
      six: 1.14.0-py37_0
      sqlite: 3.31.1-h7b6447c_0
      tensorboard: 2.1.0-py3_0
      tensorflow: 2.1.0-gpu_py37h7a4bb67_0
      tensorflow-base: 2.1.0-gpu_py37h6c5654b_0
      tensorflow-estimator: 2.1.0-pyhd54b08b_0
      tensorflow-gpu: 2.1.0-h0d30ee6_0
      termcolor: 1.1.0-py37_1
      tk: 8.6.8-hbc83047_0
      urllib3: 1.25.8-py37_0
      werkzeug: 1.0.0-py_0
      wheel: 0.34.2-py37_0
      wrapt: 1.11.2-py37h7b6447c_0
      xz: 5.2.4-h14c3975_4
      zlib: 1.2.11-h7b6447c_3
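For reference, the environment above can be recreated with something like the following (the env name matches the (tf-gpu) prompt in the transcripts below; the exact versions conda resolves will drift over time):

```shell
# Create a fresh conda env; conda pulls in cudatoolkit/cudnn alongside TF,
# so no separate CUDA install is needed for the conda-packaged TensorFlow.
conda create -n tf-gpu python=3.7
conda activate tf-gpu
conda install tensorflow-gpu   # resolved to TF 2.1.0 + cudatoolkit 10.1 in Feb 2020
```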

  • Test the TensorFlow and Python 3 install
    • (tf-gpu) dan@dan-X399-AORUS-PRO:~/dev/tensorflow-cnn-tutorial$ python3
      Python 3.7.6 (default, Jan 8 2020, 19:59:22)
      [GCC 7.3.0] :: Anaconda, Inc. on linux
      Type "help", "copyright", "credits" or "license" for more information.
      >>> import tensorflow as tf
      >>> tf.__version__
      '2.1.0'
      >>> tf.test.is_gpu_available()
      WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
      Instructions for updating:
      Use `tf.config.list_physical_devices('GPU')` instead.
      2020-02-23 01:48:20.149384: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
      2020-02-23 01:48:20.175975: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3493005000 Hz
      2020-02-23 01:48:20.176961: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55cc220e6780 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
      2020-02-23 01:48:20.176985: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
      2020-02-23 01:48:20.178028: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
      2020-02-23 01:48:20.351783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
      pciBusID: 0000:41:00.0 name: GeForce RTX 2060 computeCapability: 7.5
      coreClock: 1.71GHz coreCount: 30 deviceMemorySize: 5.79GiB deviceMemoryBandwidth: 312.97GiB/s
      2020-02-23 01:48:20.352047: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
      2020-02-23 01:48:20.353750: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
      2020-02-23 01:48:20.355686: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
      2020-02-23 01:48:20.355959: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
      2020-02-23 01:48:20.357749: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
      2020-02-23 01:48:20.358979: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
      2020-02-23 01:48:20.363192: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
      2020-02-23 01:48:20.364707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
      2020-02-23 01:48:20.364766: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
      2020-02-23 01:48:20.665490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
      2020-02-23 01:48:20.665531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
      2020-02-23 01:48:20.665540: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
      2020-02-23 01:48:20.667506: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 5220 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:41:00.0, compute capability: 7.5)
      2020-02-23 01:48:20.669697: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55cc24b5ece0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
      2020-02-23 01:48:20.669713: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2060, Compute Capability 7.5
      True
      >>>
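As the deprecation warning in the transcript suggests, the non-deprecated way to check for a GPU in TF 2.x is `tf.config.list_physical_devices` — a minimal sketch:

```python
# Preferred GPU check in TF 2.x (tf.test.is_gpu_available is deprecated).
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
print(len(gpus) > 0)     # True when at least one GPU is visible
for gpu in gpus:
    print(gpu.name)      # e.g. "/physical_device:GPU:0"
```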

  • Ran the MNIST test from this tutorial: https://github.com/dragen1860/TensorFlow-2.x-Tutorials
    • The Cole Murray tutorial used previously (https://github.com/ColeMurray/tensorflow-cnn-tutorial) wasn't compatible with the TF 2.1 used here.
    • (tf-gpu) dan@dan-X399-AORUS-PRO:~/dev/TensorFlow-2.x-Tutorials/03-Play-with-MNIST$ python3 main.py
      Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
      11493376/11490434 [==============================] - 1s 0us/step
      datasets: (60000, 28, 28) (60000,) 0 255
      2020-02-23 01:52:33.707955: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
      2020-02-23 01:52:33.720688: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
      pciBusID: 0000:41:00.0 name: GeForce RTX 2060 computeCapability: 7.5
      coreClock: 1.71GHz coreCount: 30 deviceMemorySize: 5.79GiB deviceMemoryBandwidth: 312.97GiB/s
      2020-02-23 01:52:33.720807: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
      2020-02-23 01:52:33.721751: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
      2020-02-23 01:52:33.722790: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
      2020-02-23 01:52:33.722949: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
      2020-02-23 01:52:33.723951: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
      2020-02-23 01:52:33.724572: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
      2020-02-23 01:52:33.726953: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
      2020-02-23 01:52:33.727906: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
      2020-02-23 01:52:33.728194: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
      2020-02-23 01:52:33.751744: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3493005000 Hz
      2020-02-23 01:52:33.752694: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5620d9fcbb40 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
      2020-02-23 01:52:33.752717: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
      2020-02-23 01:52:33.753560: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
      pciBusID: 0000:41:00.0 name: GeForce RTX 2060 computeCapability: 7.5
      coreClock: 1.71GHz coreCount: 30 deviceMemorySize: 5.79GiB deviceMemoryBandwidth: 312.97GiB/s
      2020-02-23 01:52:33.753608: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
      2020-02-23 01:52:33.753625: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
      2020-02-23 01:52:33.753639: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
      2020-02-23 01:52:33.753654: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
      2020-02-23 01:52:33.753668: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
      2020-02-23 01:52:33.753682: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
      2020-02-23 01:52:33.753697: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
      2020-02-23 01:52:33.755060: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
      2020-02-23 01:52:33.755099: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
      2020-02-23 01:52:33.834700: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
      2020-02-23 01:52:33.834733: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
      2020-02-23 01:52:33.834740: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
      2020-02-23 01:52:33.836021: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5113 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:41:00.0, compute capability: 7.5)
      2020-02-23 01:52:33.837739: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5620dd752280 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
      2020-02-23 01:52:33.837752: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2060, Compute Capability 7.5
      Model: "sequential"
      _________________________________________________________________
      Layer (type) Output Shape Param #
      =================================================================
      dense (Dense) multiple 200960
      _________________________________________________________________
      dense_1 (Dense) multiple 65792
      _________________________________________________________________
      dense_2 (Dense) multiple 65792
      _________________________________________________________________
      dense_3 (Dense) multiple 2570
      =================================================================
      Total params: 335,114
      Trainable params: 335,114
      Non-trainable params: 0
      _________________________________________________________________
      2020-02-23 01:52:34.579333: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
      0 loss: 1.2610700130462646 acc: 0.15625
      200 loss: 0.43644407391548157 acc: 0.6821875
      400 loss: 0.35265296697616577 acc: 0.8464062
      600 loss: 0.30810198187828064 acc: 0.870625
      800 loss: 0.2214876413345337 acc: 0.90234375
      1000 loss: 0.29607510566711426 acc: 0.89453125
      1200 loss: 0.2684360444545746 acc: 0.9134375
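A stripped-down version of the same experiment, using Keras's built-in fit loop instead of the tutorial's manual training loop (a sketch, not the tutorial's code; the layer sizes are chosen to reproduce the 335,114 parameters in the model summary above):

```python
# Minimal TF 2.x MNIST classifier: four Dense layers, matching the
# parameter counts in the summary above (200960 + 65792 + 65792 + 2570).
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(10),                    # logits for 10 digit classes
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# Train briefly on a subset just to confirm the GPU path works end to end.
model.fit(x_train[:512], y_train[:512], batch_size=128, epochs=1, verbose=0)
```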

Meetup – Computer vision on mobile device & Multi-stage ML for document understanding

https://www.meetup.com/SF-Big-Analytics/events/258514786/

Went to a very informative meetup at GoPro headquarters in San Mateo. The takeaways came mainly from the first speaker, from Facebook, on image recognition on mobile, and from the various participants regarding the positions they were hiring for.

Computer Vision

Sam walked through various techniques and papers detailing the work needed to train and run inference from models deployed on mobile devices. Following are some of the papers outlined in the talk.

DSD: Dense-Sparse-Dense Training for Deep Neural Networks

https://arxiv.org/abs/1607.04381


Value-aware Quantization for Training and Inference of Neural Networks

ECCV 2018: Eunhyeok_Park_Value-aware_Quantization_for_ECCV_2018_paper.pdf


FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search

https://arxiv.org/abs/1812.03443


ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation

https://arxiv.org/abs/1812.08934


Document Understanding

The basis of the talk from Henry and Vivek was: where is the text, what is the text, and what does the text mean? They didn't offer any papers or insights, but demonstrated the feature and how it's used within the Workday process.

They compared MXNet (with consulting help from Amazon) against Tensorflow and felt that the former was faster to train and more accurate.

Hiring

Salesforce was hiring. The representative was from the infrastructure group, and they were hiring software and data engineers with some work experience. They are a Java/Scala shop.

Facebook was hiring all positions in all locations with various amounts of experience, pretty much anywhere and anything.

Workday is another Java/Scala shop. They do hire Python people for data science. Previously they would translate Python models into Java/Scala, but that isn't scalable from a product point of view, so now they just containerize the Python model.

GoPro is also hiring, but as with Salesforce, the representative was from the infrastructure/platform team, so they were looking primarily for software and data engineers.

Overview of the Convolution Neural Networks course from deeplearning.ai


I just finished the fourth course of the deeplearning.ai series, and it was immensely enjoyable. I have to admit that, with the advent of Hinton's capsule networks, the motivation to start this set on Convolutional Neural Networks was a little harder to find than for the previous three. Hinton and other bloggers have already outlined shortcomings of CNNs, and the thought at the back of my mind was whether it was worth spending the time learning something that may become obsolete.

Nevertheless it was worth it, and as I learned from attending NIPS 2017 last week and through a Deep Learning Study Group, there is still much work to be done with capsule networks. More on that later.

As per the previous three deeplearning.ai courses, this course had the following characteristics:

  • Clear and concise
    • Andrew Ng went over the concepts meticulously
    • The exercises were clearly documented and easy to follow
    • Grading went without a hitch except for one instance (see Caveats below)
  • Touched on a wide range of concepts and use cases
    • CNNs, Residual and Inception networks
    • Object detection, neural style transfer, and face recognition
  • Afforded lots of opportunities for further study
    • Many ways to practice Keras and Tensorflow
    • You could easily complete the exercises and leave it at that, but then you would be shortchanging yourself. There are so many other avenues to explore based on the code written that you owe it to yourself to put in some additional effort and discover something new.
  • There are some caveats
    • As in previous courses, the exercises were structured well and documented extensively. They were, however, much more challenging than those in the previous three courses.
    • Grading for the last exercise of the fourth week, dealing with triplet loss, is still flawed as of this writing. There is a workaround if you look in the discussion forums.
    • There is a flaw when trying to use the model in the Week 2 programming assignment to perform your own predictions.

The course encompasses four weeks, each culminating in a quiz and one or two programming assignments. An understanding of Python, Keras, and Tensorflow would be helpful but is not necessary, as there are tutorials in the previous courses. Following is a synopsis of each week's content.

Week 1

Andrew Ng first starts with the motivation for CNNs and what makes them powerful. The initial lectures on edge detection take a step-by-step approach to showing how different filters perform different functions by detecting different basic patterns. In later lectures, you'll see how filters in deeper layers detect more complex shapes (Week 4, What are deep ConvNets learning?).

The middle set of lectures deals with the technical guts of CNNs, namely filter size, stride, padding, and pooling, culminating in an example of a single-layer CNN. He finally ends with a lecture on the advantages of CNNs over fully connected networks, namely:

    • Fewer parameters to train
    • Parameter sharing – the same filters can be used in different areas of the image
    • Sparsity of connections – cells in resultant layers depend only on a small subset of the previous layer, hence less prone to overfitting
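The parameter savings are easy to make concrete. For a 28×28 grayscale input, one dense layer to a same-sized output needs millions of weights, while a conv layer's count depends only on filter size and channel counts (illustrative numbers, not from the course):

```python
# Parameter counts: fully connected vs convolutional layer on a 28x28 image.
h, w, c_in, c_out, k = 28, 28, 1, 8, 3

# Dense: every input pixel connects to every output cell, plus biases.
dense_params = (h * w * c_in) * (h * w * c_out) + (h * w * c_out)
# Conv: one k x k filter per output channel, shared across all positions.
conv_params = (k * k * c_in) * c_out + c_out

print(dense_params)  # 4923520
print(conv_params)   # 80
```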


Week 2

This week's lectures deal with case studies of different architectures. They delve into the history of CNNs and their evolution, which helps show how different architectures influence results and ultimately gives you better intuition for building better CNNs yourself.

Residual networks help train deeper networks and mitigate the vanishing/exploding gradient problem by having the output of a layer skip the next layer (a skip connection) and feed into the one after that. This helps with performance and stability of the parameters during training. One advantage of the skip connection is that the network learns the identity function more easily than one without a skip connection, and hence the extra layers don't hurt performance as much.
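A skip connection is just an addition of the block's input to its output before the final activation. A minimal Keras sketch of a residual block with an identity shortcut (sizes are illustrative, not from the course exercises):

```python
# Minimal residual block: output = relu(F(x) + x), where F is two conv layers.
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters):
    shortcut = x                          # identity skip connection
    y = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.Add()([y, shortcut])       # add the input back in
    return layers.ReLU()(y)

inputs = tf.keras.Input(shape=(32, 32, 16))
outputs = residual_block(inputs, 16)
model = tf.keras.Model(inputs, outputs)   # same spatial/channel shape out
```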

The pooling layer is useful for reducing the height and width dimensions of a CNN, whereas the 1×1 convolution (aka Network in Network) is useful for reducing the channel dimension. This becomes important when talking about the Inception network.

The Inception network combines results from all the different filter sizes. This produces a layer with a very large channel dimension, and hence the 1×1 convolution comes into play to reduce it.
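The channel-reduction role of the 1×1 convolution is easy to see in code: here a 1×1 conv shrinks a 28×28×192 volume to 28×28×16 before an expensive 5×5 conv, the standard Inception bottleneck pattern (sizes are illustrative):

```python
# 1x1 convolution as a channel-dimension bottleneck (Inception-style).
import tensorflow as tf
from tensorflow.keras import layers

x = tf.keras.Input(shape=(28, 28, 192))
bottleneck = layers.Conv2D(16, 1, activation='relu')(x)  # 192 -> 16 channels
y = layers.Conv2D(32, 5, padding='same', activation='relu')(bottleneck)
model = tf.keras.Model(x, y)
print(model.output_shape)  # (None, 28, 28, 32)
```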


Week 3

This was probably the most interesting section because it dealt with object detection.  The main concepts here are:

    • Sliding windows of various sizes to determine whether an object is detected
    • Bounding boxes, determined by the network, to outline an object
    • Intersection over Union (IoU) calculations and non-max suppression to determine the best anchor box and de-duplicate boxes that detect the same object
    • Anchor boxes – boxes of various dimensions related to the objects you want to detect
    • YOLO (You Only Look Once) – an algorithm that performs object detection quickly. I confirmed with a Waymo engineer at the NIPS 2017 conference that the company has more advanced algorithms than YOLO, but all the other concepts are still relevant.
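Intersection over Union from the lectures is just a few lines of arithmetic — a sketch with boxes given as (x1, y1, x2, y2) corners:

```python
# Intersection over Union for two axis-aligned boxes (x1, y1, x2, y2).
def iou(box_a, box_b):
    # Corners of the overlap rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)   # zero if boxes don't overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7, about 0.143
```

Non-max suppression then keeps the highest-confidence box and discards any remaining box whose IoU with it exceeds a threshold.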


Week 4

Face verification vs face recognition

Face recognition problems have the issue of not having enough data to train a traditional CNN. In addition, you have what is called a one-shot learning problem: performing recognition based on a single image. So how is this problem solved?

For face recognition, you train a siamese network (two CNNs with the same parameters) to encode faces into an n-vector. Then you use a difference function d(img1, img2) that says yes or no based on the similarity of the two encodings. How is this network trained, and what is the objective function?

The objective function used to train a siamese network is called a triplet loss, which utilizes an anchor image, a positive (similar) image, and a negative (dissimilar) image.
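The triplet loss penalizes the anchor-positive distance for not being at least a margin alpha smaller than the anchor-negative distance — a NumPy sketch over precomputed embeddings (toy 2-d vectors, not real face encodings):

```python
# Triplet loss: max(||a - p||^2 - ||a - n||^2 + alpha, 0)
import numpy as np

def triplet_loss(anchor, positive, negative, alpha=0.2):
    pos_dist = np.sum((anchor - positive) ** 2)   # want this small
    neg_dist = np.sum((anchor - negative) ** 2)   # want this large
    return max(pos_dist - neg_dist + alpha, 0.0)

a = np.array([0.0, 1.0])
p = np.array([0.1, 0.9])   # similar to the anchor
n = np.array([1.0, 0.0])   # dissimilar to the anchor
print(triplet_loss(a, p, n))  # 0.0: the negative is already far enough away
```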


Neural Style Transfer

  • Calculate similarity scores between activations within a certain layer as a way to capture style. You can also extend this to activations between layers.
  • Apply the similarity scores to the generated image to transfer the style.
  • Instead of updating weights, the algorithm updates the pixels of the generated image.
  • Each iteration produces a better rendering of the style over the new content than the previous one.
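Those "similarity scores between activations" are the Gram matrix of a layer's feature maps — a NumPy sketch of the style representation (shapes are illustrative):

```python
# Gram matrix: channel-by-channel correlations of one layer's activations,
# used as the "style" representation in neural style transfer.
import numpy as np

def gram_matrix(features):
    # features: (height, width, channels) activations from one layer
    h, w, c = features.shape
    flat = features.reshape(h * w, c)   # each column is one channel
    return flat.T @ flat                # (channels, channels) similarities

acts = np.random.rand(4, 4, 3)
g = gram_matrix(acts)
print(g.shape)  # (3, 3): entry (i, j) correlates channel i with channel j
```

The style loss is then the difference between the Gram matrices of the style image and the generated image, summed over the chosen layers.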

Deeplearning.ai's course on CNNs is a good overview of the concepts and use cases around Convolutional Neural Networks. The explanations were clear and concise, and except for a grading hiccup in one of the programming exercises, the quizzes and assignments definitely helped reinforce the ideas in the lessons. I'm definitely looking forward to taking the fifth installment of the series – Sequence Models – which starts Dec 18th.

Deep Learning – Playing with Neural Style using Torch and Tensorflow

Deep Learning is the hot topic in artificial intelligence circles right now, and with AlphaGo's Go victories and other deep learning advancements, a lot of attention has focused on platforms that make deep learning accessible. Two of those platforms are Torch and Tensorflow. I spent a weekend trying them out; here are some preliminary thoughts.

My point of comparison was the Neural Style project implemented in both platforms by Justin Johnson and Anish Athalye. Neural Style is a deep learning implementation that tries to derive artistic styles from pictures and applies them to a candidate image. The result is a mashup of the original picture in the style (or styles) set at input time.

The first thing to notice is the complexity of the setup. Tensorflow is easier, as there aren't as many components and steps involved. With Torch there are quite a few moving parts, and even though there are scripts that allow one-step installation, I can see how this can become problematic as libraries and models get updated.

For instance, with Tensorflow I could get the basic Neural Style command running right after installation, but with Torch I encountered errors related to missing libraries or incompatibilities. I eventually opted for a more complex command line that allowed me to bypass those issues. More on this in a later post.

However, one thing Torch shines at is execution time. The Torch implementation of Neural Style ran orders of magnitude faster than Tensorflow. In less than an hour (on a MacBook Pro) it went through a thousand iterations, whereas Tensorflow took more than two days.

Case in point: the following examples. The first is a 100-iteration run on Tensorflow that took about half a day. As you can see, you can discern the outlines of the candidate image in the result, but it's far from done.

Original image:

[image: img_9619]

Tensorflow after 100 iterations, which took about six hours.

[image: mix_100]

With Torch on the other hand, I ran the image through several styles in less time than it took Tensorflow to render the previous example.

Same image using Torch after 1000 iterations, using the style of Munch's The Scream:

[images: scream_img_9619, the_scream]

Picasso

[image: face_img_96192-style1]

And of course, Van Gogh:

[images: vg_img_9619, starry_night]