Advanced Introduction to C++, Scientific Computing and Machine Learning
Claudius Gros, WS 2021/22
Institut für theoretische Physik
Goethe-University Frankfurt a.M.
Where is ML heading?
Efficient learning
Blog:
Towards Data Science
Efficient Inference in Deep Learning - Where is the Problem?
(2020, by Amnon Geifman)
Petaflop/s: $10^{15}$ floating-point operations per second
CPU: $\sim 0.1\cdot 10^{12}$ FLOP/s
GPU: $\sim (1\!-\!10)\cdot 10^{12}$ FLOP/s
$\mathrm{day}\ \hat{=}\ 86.4\cdot10^3\,\mathrm{sec}$
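A quick back-of-the-envelope check of what these throughputs mean per day of continuous compute; a minimal sketch using the order-of-magnitude figures quoted above (not measurements):

```cpp
#include <cstdio>

int main() {
    const double cpuFlops  = 0.1e12;  // CPU: ~0.1 * 10^12 FLOP/s (estimate above)
    const double gpuFlops  = 10.0e12; // GPU: upper estimate, ~10 * 10^12 FLOP/s
    const double secPerDay = 86.4e3;  // one day = 86400 s

    // floating-point operations accumulated over one day
    std::printf("CPU: %.2e FLOP/day\n", cpuFlops * secPerDay); // ~8.6e15
    std::printf("GPU: %.2e FLOP/day\n", gpuFlops * secPerDay); // ~8.6e17
    return 0;
}
```

Even a fast GPU thus stays below $10^{18}$ floating-point operations per day; one petaflop's worth ($10^{15}$) of operations takes it on the order of $100\,\mathrm{s}$.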
Moore's law: hardware performance doubles every 1.8-2 years
ML compute requirements: demand doubles every 3-4 months
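How quickly the gap opens up can be read off directly; a minimal sketch comparing the two growth rates (doubling times set to 24 months for hardware and 3.5 months for ML demand, mid-range of the figures above):

```cpp
#include <cmath>
#include <cstdio>

// growth factor after t months for a process doubling every T months
double growth(double t, double T) { return std::pow(2.0, t / T); }

int main() {
    const double hwDoubling = 24.0; // Moore's law: ~2 years
    const double mlDoubling = 3.5;  // ML compute demand: ~3.5 months
    const double horizons[] = {12.0, 24.0, 48.0}; // 1, 2, 4 years

    for (double t : horizons)
        std::printf("after %2.0f months: hardware x%6.1f, ML demand x%8.0f\n",
                    t, growth(t, hwDoubling), growth(t, mlDoubling));
    return 0;
}
```

After two years hardware has doubled once, while demand has grown by a factor of roughly $2^{24/3.5}\approx 100$; the shortfall must be covered by parallelism and more efficient algorithms.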
Performance
Blog:
Medium
Computational Complexity of Deep Learning: Solution Approaches
(2021, by Vijay S. Agneeswaran)
performance as a function of resources (time, data, complexity)
ImageNet-1k
complexity: # of model parameters
complexity barrier
old ML: hard barrier (performance saturates); new ML: soft barrier (performance keeps improving with added parameters)
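Since complexity is measured here as the number of model parameters, it is worth seeing where that count comes from; a minimal sketch for a fully connected network (the layer sizes are made-up illustrations, not an actual ImageNet model):

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// parameter count of a fully connected network:
// every layer contributes (inputs + 1 bias) * outputs weights
long long paramCount(const std::vector<int>& layers) {
    long long n = 0;
    for (std::size_t i = 1; i < layers.size(); ++i)
        n += static_cast<long long>(layers[i - 1] + 1) * layers[i];
    return n;
}

int main() {
    // hypothetical classifier head: 2048 features -> 512 hidden -> 1000 classes
    std::printf("%lld parameters\n", paramCount({2048, 512, 1000})); // 1562088
    return 0;
}
```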
Scaling
Kaplan, Jared, et al., "Scaling laws for neural language models" (2020).
$$ \begin{array}{rcl} N &:& \text{number of model parameters} \\ C &:& \text{computing resources (training compute)} \\ D &:& \text{dataset size} \end{array} $$
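For context, the central empirical result of Kaplan et al.: when the other two resources are not the bottleneck, the test loss $L$ falls off as a power law in each of $N$, $D$ and $C$ separately,

$$ L(N) \simeq \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad L(D) \simeq \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad L(C) \simeq \left(\frac{C_c}{C}\right)^{\alpha_C}, $$

with exponents of roughly $\alpha_N \approx 0.076$, $\alpha_D \approx 0.095$, $\alpha_C \approx 0.05$ (approximate values as reported in the paper); the small exponents mean that large factors in resources buy only modest reductions in loss.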
Competitive games
two-player knapsack
maximize the total value when packing items with given (value, weight) pairs into a knapsack of limited capacity
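For orientation, the underlying single-player problem is the classic 0/1 knapsack, solvable exactly by dynamic programming; a minimal sketch (item values, weights and the capacity are made up; the two-player competitive variant studied here adds an opponent on top of this core):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <vector>

// 0/1 knapsack: maximal total value within the weight capacity
int knapsack(const std::vector<int>& value,
             const std::vector<int>& weight, int capacity) {
    std::vector<int> best(capacity + 1, 0); // best[w]: max value at weight budget w
    for (std::size_t i = 0; i < value.size(); ++i)
        for (int w = capacity; w >= weight[i]; --w) // reverse: each item used once
            best[w] = std::max(best[w], best[w - weight[i]] + value[i]);
    return best[capacity];
}

int main() {
    // made-up items as (value, weight) pairs, capacity 10
    std::printf("max value: %d\n", knapsack({6, 10, 12}, {1, 2, 3}, 10)); // 28
    return 0;
}
```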
$\alpha$-zero (state-of-the-art algorithm)
players A/B with $N_A$/$N_B$ neurons
$P_{A\to B}\,$: probability that A beats B
$\displaystyle\hspace{10ex} P_{A\to B} = \frac{N_A}{N_A+N_B} $
[O. Neumann, C. Gros (2021)]
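A minimal numerical illustration of this relation (the function name and the neuron counts are illustrative, not taken from the paper):

```cpp
#include <cstdio>

// empirical relation quoted above: P(A beats B) = N_A / (N_A + N_B)
double winProb(double nA, double nB) { return nA / (nA + nB); }

int main() {
    const double nB = 1000.0;                            // fixed opponent size
    const double sizesA[] = {500.0, 1000.0, 2000.0, 4000.0, 8000.0};

    for (double nA : sizesA)                             // doubling N_A each step
        std::printf("N_A = %5.0f, N_B = %4.0f -> P = %.3f\n",
                    nA, nB, winProb(nA, nB));
    return 0;
}
```

Each doubling of $N_A$ moves $P_{A\to B}$ from $1/3$ to $1/2$, $2/3$, $4/5$, $8/9$, ...: the returns of pure scaling diminish as $P_{A\to B}\to 1$.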