Advanced Introduction to C++, Scientific Computing and Machine Learning




Claudius Gros, WS 2019/20

Institut für theoretische Physik
Goethe-University Frankfurt a.M.

Deep Learning

simple vs. complex problems







simple problems

backpropagation fails for simple problems

complex problems

given enough (labeled) training data, large
classes of complex problems are 'solvable'

deep networks





pruning
removing : weak links;
$|w_{ij}|$ well below average
reduces : network complexity;
overfitting
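A minimal NumPy sketch of magnitude-based pruning as described above; the cutoff at half the mean magnitude of $|w_{ij}|$ is an arbitrary illustrative choice, not a value from the lecture.

import numpy as np

def prune_weak_links(W, cutoff=0.5):
    """remove links whose |w_ij| is well below the average magnitude"""
    mean_magnitude = np.mean(np.abs(W))
    mask = np.abs(W) >= cutoff * mean_magnitude      # keep only the strong links
    return W * mask, mask

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))                          # toy weight matrix of one layer
W_pruned, mask = prune_weak_links(W)
print("removed", mask.size - np.count_nonzero(mask), "of", mask.size, "links")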


data preprocessing
whitening : covariance matrix
$\to$ identity matrix
: all data equally relevant
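A sketch of PCA whitening in NumPy: after centering, rotating onto the principal axes and rescaling each axis to unit variance, the sample covariance matrix becomes (numerically) the identity. The toy data and the regularizer eps are illustrative choices.

import numpy as np

def whiten(X, eps=1e-8):
    """transform the data so that its covariance matrix becomes the identity"""
    Xc = X - X.mean(axis=0)                          # center the data
    cov = np.cov(Xc, rowvar=False)                   # sample covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)             # eigen-decomposition (cov is symmetric)
    return Xc @ (eigvec / np.sqrt(eigval + eps))     # rotate and rescale to unit variance

X = np.random.default_rng(1).normal(size=(500, 3)) * [1.0, 5.0, 0.2]   # anisotropic toy data
print(np.round(np.cov(whiten(X), rowvar=False), 3))                    # ~ identity matrix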

batch learning

'online' learning

offline learning

deep belief nets (DBN)



stacked RBMs

data availability

semi-supervised learning

train a net of stacked RBMs with unlabelled data
add a final output node connected to top hidden layer
use backpropagation on labelled data
to fine-tune connection weights
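A compressed sketch of this semi-supervised pipeline, pre-training the RBMs with single-step contrastive divergence (CD-1); the layer sizes (256, 64), the 2000-sample subset and the ten-unit softmax output layer are illustrative choices, not the lecture's specification.

import numpy as np
import tensorflow as tf

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=1, lr=0.05, seed=0):
    """train one RBM with single-step contrastive divergence (CD-1)"""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    W = 0.01 * rng.normal(size=(n_visible, n_hidden))
    b_vis, b_hid = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        for v0 in data:                                    # one sample at a time, for clarity
            p_h0 = sigmoid(v0 @ W + b_hid)                 # positive phase
            h0 = (rng.random(n_hidden) < p_h0).astype(float)
            p_v1 = sigmoid(h0 @ W.T + b_vis)               # one Gibbs step: reconstruction
            p_h1 = sigmoid(p_v1 @ W + b_hid)
            W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
            b_vis += lr * (v0 - p_v1)
            b_hid += lr * (p_h0 - p_h1)
    return W, b_hid

# unsupervised pre-training of two stacked RBMs on (treated-as) unlabelled data
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x = x_train[:2000].reshape(-1, 784) / 255.0                # small subset to keep the demo fast
W1, b1 = train_rbm(x, 256)
W2, b2 = train_rbm(sigmoid(x @ W1 + b1), 64)               # second RBM sees first hidden layer

# supervised fine-tuning: add an output layer, initialise the hidden layers from the RBMs
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(256, activation='sigmoid', input_shape=(784,)),
    tf.keras.layers.Dense(64, activation='sigmoid'),
    tf.keras.layers.Dense(10, activation='softmax')        # final output layer
])
model.layers[0].set_weights([W1, b1])
model.layers[1].set_weights([W2, b2])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x, y_train[:2000], epochs=3)                     # backpropagation on labelled data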

autoencoder









dimensionality reduction

autoencoders generate low-dimensional
representations of the data
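A minimal Keras autoencoder on MNIST; the 32-dimensional code and the hidden-layer sizes are arbitrary illustrative choices.

import tensorflow as tf

encoder = tf.keras.models.Sequential([                     # 784 pixels -> 32-dimensional code
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu')           # low-dimensional representation
])
decoder = tf.keras.models.Sequential([                     # 32-dimensional code -> 784 pixels
    tf.keras.layers.Dense(128, activation='relu', input_shape=(32,)),
    tf.keras.layers.Dense(784, activation='sigmoid'),
    tf.keras.layers.Reshape((28, 28))
])
autoencoder = tf.keras.models.Sequential([encoder, decoder])
autoencoder.compile(optimizer='adam', loss='mse')          # target = input: reconstruction error

(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
autoencoder.fit(x_train, x_train, epochs=5, validation_data=(x_test, x_test))

codes = encoder.predict(x_test)                            # low-dimensional codes of the test digits

For a denoising autoencoder one would train on corrupted inputs (e.g. the images plus Gaussian noise) while keeping the clean images as targets.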

denoising

stacked autoencoders

deep learning building blocks



$$ \begin{array}{cccc} \text{autoencoder} & \text{restricted Boltzmann machine} & \text{recurrent network} & \text{convolution network} \\ \hline \text{feedforward} & \text{undirected} & \text{recurrent} & \text{hierarchical feedforward} \end{array} $$

backpropagation through time

$$ \fbox{$\phantom{\big|} \mathbf{y}(t+1) \phantom{\big|}$} \quad\leftarrow\quad \fbox{$\phantom{\big|} \mathbf{y}(t) \phantom{\big|}$} \quad\leftarrow\quad \fbox{$\phantom{\big|} \mathbf{y}(t-1) \phantom{\big|}$} \quad\leftarrow\quad \fbox{$\phantom{\big|} \mathbf{y}(t-2) \phantom{\big|}$} \quad\leftarrow\quad\dots $$
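A minimal sketch of the unrolling behind backpropagation through time, using TensorFlow's gradient tape; the cell $\mathbf{y}(t+1)=\tanh\big(W\,\mathbf{y}(t)+V\,\mathbf{x}(t)\big)$, the dimensions and the toy quadratic loss are illustrative choices.

import tensorflow as tf

T, dim_x, dim_y = 5, 3, 4
W = tf.Variable(tf.random.normal((dim_y, dim_y), stddev=0.1))
V = tf.Variable(tf.random.normal((dim_y, dim_x), stddev=0.1))

x = tf.random.normal((T, dim_x))            # input sequence x(0), ..., x(T-1)
target = tf.random.normal((dim_y,))         # toy target for the final state

with tf.GradientTape() as tape:
    y = tf.zeros((dim_y,))
    for t in range(T):                      # explicit unrolling:  y(t+1) <- y(t) <- ...
        y = tf.tanh(tf.linalg.matvec(W, y) + tf.linalg.matvec(V, x[t]))
    loss = tf.reduce_sum((y - target) ** 2)

# the gradients flow backwards through all T copies of the cell: backpropagation through time
dW, dV = tape.gradient(loss, [W, V])
print(dW.shape, dV.shape)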

receptive fields as convolutions




receptive fields

convolution scanning of 2D data
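A NumPy sketch of scanning a small receptive field (kernel) across a 2D image; as is common in deep-learning usage, the 'convolution' is implemented as a cross-correlation. The toy image and the vertical-edge kernel are illustrative.

import numpy as np

def convolve2d(image, kernel):
    """scan a small kernel (receptive field) across a 2D image ('valid' range)"""
    H, W = image.shape
    h, w = kernel.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + h, j:j + w]        # local receptive field
            out[i, j] = np.sum(patch * kernel)     # one entry of the feature map
    return out

image = np.zeros((6, 6)); image[:, 3:] = 1.0       # toy image: dark left, bright right
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])                 # vertical-edge detector
print(convolve2d(image, kernel))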


convolution networks

convolution nets

extended set of kernels
$\qquad\Rightarrow\qquad$
rastering
$\qquad\Rightarrow\qquad$
data convolution

pooling

$\qquad$
  • convolution $\ \to \ $ feature map
  • pooling
    : subsampling
    : dimensionality reduction
    : e.g. max-pooling
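A Keras sketch of the convolution / feature-map / max-pooling pipeline listed above; the number of kernels and the layer sizes are illustrative choices.

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, kernel_size=(3, 3), activation='relu',
                           input_shape=(28, 28, 1)),         # 16 kernels -> 16 feature maps
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),          # subsampling: 26x26 -> 13x13
    tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.summary()                                              # shows the shrinking feature maps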

what makes it work





convolution net - illustration


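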











fooling deep networks





adversarial perturbations


"panda" (57.7% confidence) + 0.007 × adversarial noise ("nematode", 8.2% confidence) = "gibbon" (99.3% confidence)
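The perturbation in this example is generated with the fast gradient sign method (FGSM): the input is shifted by $\varepsilon$ times the sign of the gradient of the loss with respect to the input pixels. A minimal TensorFlow sketch, assuming a trained Keras classifier `model` (for instance the MNIST model of the TensorFlow example below); $\varepsilon = 0.007$ matches the figure.

import tensorflow as tf

def fgsm_perturbation(model, image, label, eps=0.007):
    """fast gradient sign method: shift the input by eps in the direction
       that increases the classification loss the most"""
    x = tf.cast(tf.convert_to_tensor(image)[None, ...], tf.float32)   # add batch dimension
    y = tf.convert_to_tensor([label])
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
    with tf.GradientTape() as tape:
        tape.watch(x)                                # we need d(loss)/d(input pixels)
        loss = loss_fn(y, model(x))
    grad = tape.gradient(loss, x)
    x_adv = x + eps * tf.sign(grad)                  # x' = x + eps * sign(grad)
    return tf.clip_by_value(x_adv, 0.0, 1.0)[0]      # keep pixel values in [0, 1]

# usage (with the MNIST model of the TensorFlow example below):
#   x_adv = fgsm_perturbation(model, x_test[0], y_test[0])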

performance / confidence

attacking deep networks



original vs. tampered

cyber security

image datasets

tampering with the training data

misclassification rates (in %) induced by training-data tampering

$$ \begin{array}{lcccc} \hline & \rlap{\text{baseline}} & & \rlap{\text{tampered}} \\ & \text{CIFAR} & \text{SVHN} & \text{CIFAR} & \text{SVHN} \\ \hline \text{optimal case} & 0 & 0 & 100 & 100 \\ \hline \text{BCNN} & 28.7 & 12.9 & 87.2 & 91.4 \\ \text{AlexNet} & 11.1 & 5.5 & 83.7 & 97 \\ \text{VGG-16} & 5.3 & 3.7 & 90.1 & 98.9 \\ \text{ResNet-18} & 23.8 & 3.6 & 42.4 & 40.9 \\ \text{SIRRN} & 4.7 & 3.9 & 74.1 & 89.5 \\ \text{DenseNet-121} & 2.6 & 2.6 & 60.7 & 68.1 \\ \hline \end{array} $$

TensorFlow


import tensorflow as tf             # import the TensorFlow library as "tf"
mnist = tf.keras.datasets.mnist     # MNIST handwritten-digits dataset, shipped with Keras

(x_train, y_train), (x_test, y_test) = mnist.load_data()    # load the data as NumPy arrays
x_train, x_test = x_train / 255.0, x_test / 255.0            # rescale pixel values to [0, 1]

model = tf.keras.models.Sequential([                        # define the DL model
  tf.keras.layers.Flatten(input_shape=(28, 28)),            # 28x28 image -> vector of 784 pixels
  tf.keras.layers.Dense(512, activation=tf.nn.relu),        # hidden layer of ReLU neurons
  tf.keras.layers.Dropout(0.2),                             # drop 20% of the units during training
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)       # 10 output classes (digits 0-9)
])
model.compile(optimizer='adam',                             # configure the model for training
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)                       # train the model
model.evaluate(x_test, y_test)                              # evaluate the model on the test set

AlphaGo Zero



game of Go

a short history of the future