Advanced Introduction to C++, Scientific Computing and Machine Learning
Claudius Gros, WS 2021/22
Institut für theoretische Physik
Goethe-University Frankfurt a.M.
Where is ML heading?
Efficient learning
Blog:
Towards Data Science
Efficient Inference in Deep Learning - Where is the Problem?
(2020, by Amnon Geifman)
Petaflop/s: $10^{15}$ floating-point operations per second
CPU: $\sim 0.1\cdot 10^{12}$ FLOP/s
GPU: $\sim (1\!-\!10)\cdot 10^{12}$ FLOP/s
$\mathrm{day}\ \hat{=}\ 86.4\cdot10^3\,\mathrm{sec}$
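A quick back-of-the-envelope check of what these throughputs mean per day of continuous compute; a minimal sketch using the order-of-magnitude figures quoted above (not measurements):

```cpp
#include <cstdio>

int main() {
    const double cpuFlops  = 0.1e12;  // CPU: ~0.1 * 10^12 FLOP/s (estimate above)
    const double gpuFlops  = 10.0e12; // GPU: upper estimate, ~10 * 10^12 FLOP/s
    const double secPerDay = 86.4e3;  // one day = 86400 s

    // floating-point operations accumulated over one day
    std::printf("CPU: %.2e FLOP/day\n", cpuFlops * secPerDay); // ~8.6e15
    std::printf("GPU: %.2e FLOP/day\n", gpuFlops * secPerDay); // ~8.6e17
    return 0;
}
```

Even a fast GPU thus stays below $10^{18}$ floating-point operations per day; one petaflop's worth ($10^{15}$) of operations takes it on the order of $100\,\mathrm{s}$.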
Moore's law: hardware performance doubles every 1.8-2 years
ML compute requirements: demand doubles every 3-4 months
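How quickly the gap opens up can be read off directly; a minimal sketch comparing the two growth rates (doubling times set to 24 months for hardware and 3.5 months for ML demand, mid-range of the figures above):

```cpp
#include <cmath>
#include <cstdio>

// growth factor after t months for a process doubling every T months
double growth(double t, double T) { return std::pow(2.0, t / T); }

int main() {
    const double hwDoubling = 24.0; // Moore's law: ~2 years
    const double mlDoubling = 3.5;  // ML compute demand: ~3.5 months
    const double horizons[] = {12.0, 24.0, 48.0}; // 1, 2, 4 years

    for (double t : horizons)
        std::printf("after %2.0f months: hardware x%6.1f, ML demand x%8.0f\n",
                    t, growth(t, hwDoubling), growth(t, mlDoubling));
    return 0;
}
```

After two years hardware has doubled once, while demand has grown by a factor of roughly $2^{24/3.5}\approx 100$; the shortfall must be covered by parallelism and more efficient algorithms.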
Performance
Blog:
Medium
Computational Complexity of Deep Learning: Solution Approaches
(2021, by Vijay S. Agneeswaran)
performance as a function of resources (time, data, complexity)
ImageNet-1k
complexity: # of model parameters
complexity barrier
old ML: hard barrier (performance saturates); new ML: soft barrier (performance keeps improving with added parameters)
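Since complexity is measured here as the number of model parameters, it is worth seeing where that count comes from; a minimal sketch for a fully connected network (the layer sizes are made-up illustrations, not an actual ImageNet model):

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// parameter count of a fully connected network:
// every layer contributes (inputs + 1 bias) * outputs weights
long long paramCount(const std::vector<int>& layers) {
    long long n = 0;
    for (std::size_t i = 1; i < layers.size(); ++i)
        n += static_cast<long long>(layers[i - 1] + 1) * layers[i];
    return n;
}

int main() {
    // hypothetical classifier head: 2048 features -> 512 hidden -> 1000 classes
    std::printf("%lld parameters\n", paramCount({2048, 512, 1000})); // 1562088
    return 0;
}
```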
Scaling
Kaplan, Jared, et al., "Scaling laws for neural language models" (2020).
$$ \begin{array}{rcl} N &:& \text{number of model parameters} \\ C &:& \text{computing resources (training compute)} \\ D &:& \text{dataset size} \end{array} $$
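For context, the central empirical result of Kaplan et al.: when the other two resources are not the bottleneck, the test loss $L$ falls off as a power law in each of $N$, $D$ and $C$ separately,

$$ L(N) \simeq \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad L(D) \simeq \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad L(C) \simeq \left(\frac{C_c}{C}\right)^{\alpha_C}, $$

with exponents of roughly $\alpha_N \approx 0.076$, $\alpha_D \approx 0.095$, $\alpha_C \approx 0.05$ (approximate values as reported in the paper); the small exponents mean that large factors in resources buy only modest reductions in loss.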
Competitive games
two-player knapsack
maximize the total value when packing items with given (value, weight) pairs into a knapsack of limited capacity
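For orientation, the underlying single-player problem is the classic 0/1 knapsack, solvable exactly by dynamic programming; a minimal sketch (item values, weights and the capacity are made up; the two-player competitive variant studied here adds an opponent on top of this core):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <vector>

// 0/1 knapsack: maximal total value within the weight capacity
int knapsack(const std::vector<int>& value,
             const std::vector<int>& weight, int capacity) {
    std::vector<int> best(capacity + 1, 0); // best[w]: max value at weight budget w
    for (std::size_t i = 0; i < value.size(); ++i)
        for (int w = capacity; w >= weight[i]; --w) // reverse: each item used once
            best[w] = std::max(best[w], best[w - weight[i]] + value[i]);
    return best[capacity];
}

int main() {
    // made-up items as (value, weight) pairs, capacity 10
    std::printf("max value: %d\n", knapsack({6, 10, 12}, {1, 2, 3}, 10)); // 28
    return 0;
}
```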
$\alpha$-zero (state-of-the-art algorithm)
players A/B with $N_A$/$N_B$ neurons
$P_{A\to B}\,$: probability that A beats B
$\displaystyle\hspace{10ex} P_{A\to B} = \frac{N_A}{N_A+N_B} $
[O. Neumann, C. Gros (2021)]
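A minimal numerical illustration of this relation (the function name and the neuron counts are illustrative, not taken from the paper):

```cpp
#include <cstdio>

// empirical relation quoted above: P(A beats B) = N_A / (N_A + N_B)
double winProb(double nA, double nB) { return nA / (nA + nB); }

int main() {
    const double nB = 1000.0;                            // fixed opponent size
    const double sizesA[] = {500.0, 1000.0, 2000.0, 4000.0, 8000.0};

    for (double nA : sizesA)                             // doubling N_A each step
        std::printf("N_A = %5.0f, N_B = %4.0f -> P = %.3f\n",
                    nA, nB, winProb(nA, nB));
    return 0;
}
```

Each doubling of $N_A$ moves $P_{A\to B}$ from $1/3$ to $1/2$, $2/3$, $4/5$, $8/9$, ...: the returns of pure scaling diminish as $P_{A\to B}\to 1$.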