Programmierpraktikum

Claudius Gros, SS2012

Institut für theoretische Physik
Goethe-University Frankfurt a.M.

statistical evaluation of integrals

• evaluation of multi-dimensional integrals
• $$I \ =\ \int \, f(x_1,x_2,\dots,x_N)\,dx_1\,dx_2\cdots dx_N$$
• e.g. $1000$ trapezoids per dimension
• $$\left(10^3\right)^N \ =\ 10^{3N} \qquad\quad \mbox{function evaluations}$$
• not feasible for large $N$
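The $10^{3N}$ scaling can be made concrete with a short estimate; the assumed rate of $10^9$ function evaluations per second is illustrative:

```java
/** Why grid-based quadrature fails in high dimensions: with 10^3
 *  points per axis, an N-dimensional grid needs (10^3)^N function
 *  evaluations. Wall-clock sketch, assuming (hypothetically)
 *  10^9 evaluations per second.
 */
public class GridCost {
    public static double evaluations(int N) {
        return Math.pow(1000.0, N);              // (10^3)^N grid points
    }
    public static double years(int N) {
        double seconds = evaluations(N) / 1.0e9; // at 10^9 evaluations/s
        return seconds / (3600.0 * 24.0 * 365.0);
    }
    public static void main(String[] args) {
        for (int N : new int[]{3, 6, 10})
            System.out.printf("N = %2d : %8.1e evaluations, %8.1e years\n",
                              N, evaluations(N), years(N));
    }
}
```

Already for $N=10$ the grid would take of the order of $10^{13}$ years.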

Monte Carlo sampling

• evaluate the integral by statistically sampling the integrand $\ f(x_1,x_2,\dots,x_N)$

Metropolis algorithm

• evaluate the integral by statistically sampling the
most important contributions to the integrand $\ f(x_1,x_2,\dots,x_N)$
(importance sampling)

integration as averaging

$\hspace{10ex} \displaystyle I \ =\ \int_a^b \, f(x)\, dx \ =\ \sum_{i=1}^N f(x_i)\,\Delta x$
• box width $\ \Delta x = \frac{b-a}{N}$
$\hspace{10ex} \displaystyle I \ =\ (b-a) \left(\frac{1}{N} \sum_i\, f(x_i) \right) \ =\ (b-a)\, \langle f\rangle$
• expectation value (average) $\ \ \ \langle f\rangle$
of integrand
$$I\ =\ V\, \langle f\rangle$$
• volume $\ V$
• expectation values may be evaluated statistically
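The relation $I=V\,\langle f\rangle$ can be sketched for the one-dimensional test integral $\int_0^1 x^2\,dx = 1/3$; the test function and the fixed seed are illustrative choices, not part of the lecture material:

```java
import java.util.Random;

/** Sketch: I = V * <f> evaluated statistically for the
 *  one-dimensional test integral \int_0^1 x^2 dx = 1/3.
 */
public class AverageIntegral {
    public static double estimate(int N, long seed) {
        Random rng = new Random(seed);     // seeded for reproducibility
        double sum = 0.0;
        for (int i = 0; i < N; i++) {
            double x = rng.nextDouble();   // uniform draw in [0,1)
            sum += x * x;                  // f(x) = x^2
        }
        return 1.0 * sum / N;              // I = V * <f>, volume V = 1
    }
    public static void main(String[] args) {
        System.out.printf("estimate : %8.4f (exact 1/3)\n",
                          estimate(1000000, 42L));
    }
}
```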

area integration

$\hspace{10ex} \displaystyle I\ = \ \int_{-a}^a f(x,y)\,dx\,dy$

with

$\hspace{10ex} \displaystyle f(x,y)\ =\ \left\{ \begin{array}{ccl} 1 && \mbox{inside}\\ 0 && \mbox{outside} \end{array} \right.$

• evaluation of $\ \pi$
/** Evaluating \pi with simple Monte Carlo evaluation.
*/
public class MonteCarlo {
    static final double x0 = 0.0, x1 = 1.0;    // x-bounds
    static final double y0 = 0.0, y1 = 1.0;    // y-bounds

    public static void main(String[] args) {
        int N = 1000000;                       // number of draws
        int deltaN = N/10;                     // printout intervals
        double x, y, sum = 0;

        // --- loop over all random evaluations
        for (int i=0; i<N; i++)
        {
            x = Math.random();                 // random draws
            y = Math.random();
            sum += MonteCarlo.myFunction(x,y);
            if ((i+1)%deltaN==0)
                System.out.printf("%8d %8.4f\n",i+1,4.0*sum/(i+1.0));
        }  // end of loop over all Monte Carlo draws
    }  // end of MonteCarlo.main()

    /** The function to be evaluated.
     * Here the circle.
     */
    public static double myFunction(double x, double y) {
        if ( (x<MonteCarlo.x0)||(x>MonteCarlo.x1)||
             (y<MonteCarlo.y0)||(y>MonteCarlo.y1) )
        {
            System.out.println("x or y value out of range");
            return 0.0;
        }
        if (y*y<=1.0-x*x)     // x*x+y*y=1 is the circle equation
            return 1.0;
        else
            return 0.0;
    } // end of MonteCarlo.myFunction()

} // end of class MonteCarlo


basic statistics

statistics

• mean (expectation value)
• $$\langle f\rangle \ =\ \frac{1}{N}\sum_{i=1}^N f(x_i)$$
• variance $\ \sigma^2$
standard deviation $\ \sigma$
• $$\sigma^2 \ =\ \frac{1}{N}\sum_{i=1}^N \Big(f(x_i)-\langle f\rangle \Big)^2$$
• note: $\ \sigma\$ is the standard deviation of the process,
not an estimate for the accuracy of $\ \langle f\rangle$
• question: how do we estimate the accuracy of the measurement $\ \langle f\rangle$?

Monte Carlo - convergence

data binning

• divide the $\ N\$ measurements into $\ k=1,\dots,K\$ bins,
each containing $\ N_k=N/K\$ data points
• $$\langle f\rangle_k \ =\ \frac{1}{N_k} \sum_{i=(k-1)N_k+1}^{k N_k} f(x_i), \qquad\quad k=1,\dots,K$$
• example: $\ N_k=3$
• $$\underbrace{f(x_1)\ \ f(x_2)\ \ f(x_3)}_{\langle f\rangle_1}\ \ \underbrace{f(x_4)\ \ f(x_5)\ \ f(x_6)}_{\langle f\rangle_2}\ \ \underbrace{f(x_7)\ \ f(x_8)\ \ f(x_9)}_{\langle f\rangle_3}\ \ \dots$$
• the accuracy of $\ \langle f\rangle\$ is given
by the standard deviation of $\ \langle f\rangle_k$
• $$\Delta \langle f\rangle \ =\ \sqrt{\frac{1}{K}\sum_{k=1}^K \Big(\langle f\rangle_k-\langle f\rangle\Big)^2}$$
• error estimate: $\ \Delta \langle f\rangle$
• usually $\ K\sim 10,\, 20$
$N_k\$ large

law of large numbers

$$\Delta \langle f\rangle \ \sim\ \frac{1}{\sqrt{N}}$$
• convergence very slow
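The binning error estimate from the data-binning section, applied to plain uniform draws, illustrates the slow convergence: increasing $N$ by a factor 100 reduces $\Delta\langle f\rangle$ only by about a factor 10. A sketch; the uniform test data and the seed are illustrative:

```java
import java.util.Random;

/** Sketch of the 1/sqrt(N) law: the statistical error of the mean
 *  of N uniform draws, estimated via K=20 bins, shrinks roughly
 *  by a factor 10 when N grows by a factor 100.
 */
public class ErrorScaling {
    public static double binError(int N, long seed) {
        Random rng = new Random(seed);
        int K = 20, nK = N / K;                  // bins and data per bin
        double[] data = new double[N];
        double mean = 0.0;
        for (int i = 0; i < N; i++) {
            data[i] = rng.nextDouble();          // f(x_i): plain uniform draws
            mean += data[i];
        }
        mean /= N;
        double sigma = 0.0;
        for (int k = 0; k < K; k++) {            // standard deviation of bin means
            double mb = 0.0;
            for (int n = 0; n < nK; n++)
                mb += data[k * nK + n];
            mb /= nK;
            sigma += (mb - mean) * (mb - mean);
        }
        return Math.sqrt(sigma / K);
    }
    public static void main(String[] args) {
        System.out.printf("N = %6d : error %8.5f\n", 1000,   binError(1000,   1L));
        System.out.printf("N = %6d : error %8.5f\n", 100000, binError(100000, 1L));
    }
}
```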

Monte Carlo with error estimation

• evaluation of $\ \pi$
/** Evaluating \pi with simple Monte Carlo evaluation,
* calculating variance and standard deviation (error).
* On input: number of measurements N.
*/
public class MonteCarloVariance {
    static final double x0 = 0.0, x1 = 1.0;          // x-bounds
    static final double y0 = 0.0, y1 = 1.0;          // y-bounds

    public static void main(String[] args) {

        // --- determine the number of draws from input
        int N = 0;
        if (args.length==1)
            N = Integer.parseInt(args[0]);
        else
        {
            System.out.println("Wrong number of input parameters");
            return;
        }

        // --- other parameters and variables
        int K = 20;                                  // number of bins
        int nK = N/K;                                // number of data per bin
        double[] data = new double[N];               // storing all measurements
        double[] meanBin = new double[K];            // means of bins
        double mean = 0.0, sigma = 0.0;              // result: mean and error
        double x, y;                                 // variables

        // --- loop over all random evaluations
        for (int i=0; i<N; i++)
        {
            x = Math.random();                       // random draws
            y = Math.random();
            data[i] = MonteCarloVariance.myFunction(x,y);  // storing measurement
        }  // end of loop over all Monte Carlo draws

        // --- evaluate the overall mean and the mean of the bins
        for (int i=0; i<N; i++)
            mean += data[i];
        mean = mean/N;

        for (int k=0; k<K; k++)
        {
            meanBin[k] = 0.0;
            for (int n=0; n<nK; n++)                 // nK data points in each bin
                meanBin[k] += data[k*nK+n];
            meanBin[k] = meanBin[k]/nK;
        }  // end of loop over all bins

        // --- evaluate the error: variance of means of bins
        for (int k=0; k<K; k++)
            sigma += (meanBin[k]-mean)*(meanBin[k]-mean);
        sigma = Math.sqrt(sigma/K);

        // --- printout of result
        System.out.printf(" number of measurements : %d\n",N);
        System.out.printf(" number of bins         : %d\n",K);
        System.out.printf(" data points per bin    : %d\n",nK);
        System.out.printf(" estimate for pi        : %8.4f\n",4.0*mean);
        System.out.printf(" estimate for accuracy  : %8.4f\n",4.0*sigma);

    }  // end of MonteCarloVariance.main()

    /** The function to be evaluated.
     * Here the circle.
     */
    public static double myFunction(double x, double y) {
        if ( (x<MonteCarloVariance.x0)||(x>MonteCarloVariance.x1)||
             (y<MonteCarloVariance.y0)||(y>MonteCarloVariance.y1) )
        {
            System.out.println("x or y value out of range");
            return 0.0;
        }
        if (y*y<=1.0-x*x)
            return 1.0;
        else
            return 0.0;
    } // end of MonteCarloVariance.myFunction()

} // end of class MonteCarloVariance


integrals over distribution functions

• probability distribution function $\ \rho(x)$
• $$I \ =\ \int g(x)\,\rho(x)\,dx, \qquad\quad \rho(x)\ \ge\ 0, \qquad\quad \int\rho(x)\,dx\ =\ 1$$
• example: Boltzmann factor $\ e^{-\beta E_\lambda}$
probability that a state with energy $\ E_\lambda$
is thermally occupied
• $$\rho \ =\ \frac{e^{-\beta E_\lambda}}{Z}, \qquad\quad Z \ =\ \sum_\alpha\, e^{-\beta E_\alpha}$$
• inverse temperature $\ \beta = 1/(k_B T)$
partition function $\ Z$
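The normalization by the partition function can be sketched for a toy two-level system; the energies $E_0=0$, $E_1=1$ (in units of $k_B T$, i.e. $\beta=1$) are illustrative, not from the lecture:

```java
/** Sketch: normalized Boltzmann weights rho = exp(-beta E)/Z
 *  for a toy system with a few energy levels.
 */
public class Boltzmann {
    public static double[] weights(double beta, double[] E) {
        double Z = 0.0;                              // partition function
        for (double e : E)
            Z += Math.exp(-beta * e);
        double[] rho = new double[E.length];
        for (int i = 0; i < E.length; i++)
            rho[i] = Math.exp(-beta * E[i]) / Z;     // rho = exp(-beta E)/Z
        return rho;
    }
    public static void main(String[] args) {
        double[] rho = weights(1.0, new double[]{0.0, 1.0});
        System.out.printf("rho_0 = %6.4f, rho_1 = %6.4f\n", rho[0], rho[1]);
    }
}
```

The weights sum to one by construction; the lower-energy state is occupied more probably.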

importance sampling

• the integral may be dominated by small regions of
phase space with large values of $\ \rho(x)$
$\hspace{6ex}\displaystyle \rho(x) \ =\ \frac{1}{y_n}\frac{x^n}{x+a}, \quad\quad y_n\ =\ \int_0^1\, \frac{x^n}{x+a}\, dx$
• importance sampling
Monte Carlo sampling of phase space with probability $\rho(x)$ to sample $\ x$

random walk through configuration space

• random walk
• $$x_1, \ \ x_2,\ \ x_3,\ \ \dots, \ \ x_N$$
• probability to visit a given phase space point $\ x$
• $$P_N(x) \ =\ \frac{1}{N}\sum_{n=1}^N\delta(x-x_n), \qquad\quad \int P_N(x)dx = 1$$
• construct walk with correct limiting behaviour
• $$\lim_{N\to\infty} P_N(x)\ =\ \rho(x)$$
• we then have
• \begin{eqnarray} I& =& \int g(x)\rho(x)dx \ =\ \lim_{N\to\infty} \int g(x)P_N(x)dx\\ & =& \lim_{N\to\infty} \frac{1}{N}\sum_{n=1}^N\int g(x)\delta(x-x_n)dx \end{eqnarray}
• and
• $$I\ =\ \lim_{N\to\infty} \frac{1}{N}\sum_{n=1}^N g(x_n)$$
• this is the basis of importance sampling:
the important parts of phase space are visited (sampled) often,
the unimportant parts seldom,
by the random walk $\ \{x_n\}$

Metropolis algorithm

• method to generate a random walk $\ \{x_n\}$
with the correct limiting behaviour
• $$P_N(x) \ =\ \frac{1}{N}\sum_{n=1}^N\delta(x-x_n), \qquad\quad \lim_{N\to\infty} P_N(x)\ =\ \rho(x)$$
• Markov chain
random walk where the probability to select $\ x_{n+1}\$
depends only on $\ x_n$
(and not on $\ x_{n-1},\ x_{n-2},\ \dots\$: no memory)

transition rules

• select a candidate $\ \tilde x_{n+1}\$ randomly
• draw a random number $\ r\in[0,1]$
• $$\mbox{if} \quad \left\{ \begin{array}{c} {\rho(\tilde x_{n+1})}/{\rho(x_n)} \ge r \\ {\rho(\tilde x_{n+1})}/{\rho(x_n)} < r \end{array} \right\} \quad\mbox{then}\quad \left\{ \begin{array}{rrcl} \mbox{accept candidate:} & x_{n+1}&=&\tilde x_{n+1} \\ \mbox{reject candidate:} & x_{n+1}&=& x_{n} \end{array} \right.$$
• repeat
• transition probability from state $\ x\$ to state $\ y$
• $$P(x\to y) \ =\ \min\left(1,\ \frac{\rho(y)}{\rho(x)}\right)$$
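The accept/reject rule above can be sketched as a reusable transition step; the test density $\rho(x)=2x$ on $[0,1]$ (with $\langle x\rangle = 2/3$), the seed, and all names are illustrative choices:

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

/** Sketch of one Metropolis transition x_n -> x_{n+1} with a
 *  uniform candidate on [0,1): accept if rho(candidate)/rho(x) >= r.
 */
public class MetropolisStep {
    public static double step(double x, DoubleUnaryOperator rho, Random rng) {
        double candidate = rng.nextDouble();          // random candidate in [0,1)
        double r = rng.nextDouble();                  // acceptance threshold r
        return (rho.applyAsDouble(candidate) / rho.applyAsDouble(x) >= r)
               ? candidate                            // accept candidate
               : x;                                   // reject: keep old point
    }

    /** Mean of a chain of N steps for the test density rho(x)=2x. */
    public static double chainMean(int N, long seed) {
        Random rng = new Random(seed);
        DoubleUnaryOperator rho = z -> 2.0 * z;       // rho(x)=2x, <x> = 2/3
        double x = 0.5, sum = 0.0;
        for (int n = 0; n < N; n++) {
            x = step(x, rho, rng);
            sum += x;
        }
        return sum / N;
    }

    public static void main(String[] args) {
        System.out.printf("<x> = %8.4f (exact 2/3 for rho(x)=2x)\n",
                          chainMean(200000, 7L));
    }
}
```

The chain average of $x$ converges to the mean of $\rho$, as the detailed-balance argument of the next section guarantees.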

detailed balance

• Markov chain $\ \{x_n\}\$ with transition probability
• $$P(x_n\to x_{n+1}) \ =\ \min\left(1,\ \frac{\rho(x_{n+1})}{\rho(x_n)}\right)$$
• two states $\ x\$ and $\ y\$ in phase space
with $\ \rho(y)>\rho(x)\$
• (relative) number of transitions $\ x\to y$ $$T(x\to y) \ =\ P_N(x)\,P(x\to y) \ =\ P_N(x)$$ since $\ \rho(y)/\rho(x)>1$
• (relative) number of transitions $\ y\to x$ $$T(y\to x) \ =\ P_N(y)\,P(y\to x) \ =\ P_N(y)\rho(x)/\rho(y)$$ since $\ \rho(x)/\rho(y)<1$

detailed balance

• for a large number of walkers (random walks),
or for $\ N\to\infty$,
the probability $\ P_N(x)\$ will be determined
by the steady-state condition (detailed balance) $$T(y\to x) \ =\ T(x\to y)$$ otherwise probability would accumulate indefinitely in some states
• $$P_N(x) \ =\ P_N(y)\rho(x)/\rho(y), \qquad\quad \frac{P_N(x)}{P_N(y)} \ =\ \frac{\rho(x)}{\rho(y)}$$
• the Metropolis algorithm hence correctly yields
• $$\lim_{N\to\infty} P_N(x)\ =\ \rho(x)$$

evaluation of relative properties

$$y_n \ =\ \int_0^1 \frac{x^n}{x+a}\,dx,\qquad\quad y_{n+1} \ =\ \int_0^1 \frac{x^{n+1}}{x+a}\,dx \ =\ \int_0^1 x\,\frac{x^{n}}{x+a}\,dx$$
• with the Metropolis algorithm only relative quantities can be evaluated
• $$\frac{y_{n+1}}{y_n} \ =\ \int x\,\rho(x)\,dx,\qquad\quad \rho(x)\ =\ \frac{1}{y_n}\frac{x^n}{x+a}$$
• use statistics (error estimate) from Monte Carlo program
/** Evaluating y_{n+1}/y_n, with y_n = \int_0^1 x^n/(a+x) dx
* with Monte Carlo (Metropolis) algorithm.
* On input: number of Monte Carlo steps N.
*/
public class Metropolis {
    static final double aPar = 10.0;                 // parameter
    static final double nPow = 3.0;                  // power

    public static void main(String[] args) {

        // --- determine the number of draws from input
        int N = 0;
        if (args.length==1)
            N = Integer.parseInt(args[0]);
        else
        {
            System.out.println("Wrong number of input parameters");
            return;
        }

        // --- other parameters and variables
        int K = 20;                                  // number of bins
        int nK = N/K;                                // number of data per bin
        double[] data = new double[N];               // storing all measurements
        double[] meanBin = new double[K];            // means of bins
        double mean = 0.0, sigma = 0.0;              // result: mean and error
        double x = Math.random();                    // random starting value
        double y, ratio;                             // variables
        data[0] = x;                                 // first measurement

        // --- loop over all Monte Carlo steps
        for (int i=1; i<N; i++)
        {
            y = Math.random();                       // random candidate
            ratio = Metropolis.myDensity(y)/Metropolis.myDensity(x);
            if (ratio>Math.random())
                x = y;                               // accept candidate
            data[i] = x;                             // storing measurement
        }  // end of loop over all Monte Carlo draws

        // --- evaluate the overall mean and the mean of the bins
        for (int i=0; i<N; i++)
            mean += data[i];
        mean = mean/N;

        for (int k=0; k<K; k++)
        {
            meanBin[k] = 0.0;
            for (int n=0; n<nK; n++)                 // nK data points in each bin
                meanBin[k] += data[k*nK+n];
            meanBin[k] = meanBin[k]/nK;
        }  // end of loop over all bins

        // --- evaluate the error: variance of means of bins
        for (int k=0; k<K; k++)
            sigma += (meanBin[k]-mean)*(meanBin[k]-mean);
        sigma = Math.sqrt(sigma/K);

        // --- printout of result
        System.out.printf(" a  in  x^n/(x+a)       : %8.4f\n",aPar);
        System.out.printf(" n  in  x^n/(x+a)       : %8.4f\n",nPow);
        System.out.printf(" number of measurements : %d\n",N);
        System.out.printf(" number of bins         : %d\n",K);
        System.out.printf(" data points per bin    : %d\n",nK);
        System.out.printf(" y_{n+1}/y_n  estimate  : %8.4f\n",mean);
        System.out.printf(" estimate for accuracy  : %8.4f\n",sigma);
        System.out.printf("\n");
        System.out.printf(" ratio y_4/y_3 (exact)  : %8.4f\n",0.797490089);

    }  // end of Metropolis.main()

    /** The probability density.
     */
    public static double myDensity(double x) {
        return Math.pow(x,nPow)/(x+aPar);
    } // end of Metropolis.myDensity()

} // end of class Metropolis
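The quoted exact ratio $y_4/y_3 = 0.7975$ for $a=10$ can be cross-checked deterministically, since the one-dimensional integrals $y_n$ are cheap to evaluate by quadrature; this sketch (not part of the lecture code) uses the trapezoidal rule:

```java
/** Cross-check: y_4/y_3 for a = 10 via deterministic trapezoidal
 *  quadrature of y_n = \int_0^1 x^n/(x+a) dx.
 */
public class RatioCheck {
    static double yn(double n, double a, int steps) {
        double h = 1.0 / steps, sum = 0.0;
        for (int i = 0; i <= steps; i++) {
            double x = i * h;
            double f = Math.pow(x, n) / (x + a);
            sum += (i == 0 || i == steps) ? 0.5 * f : f;  // trapezoid weights
        }
        return sum * h;
    }
    public static double ratio() {
        return yn(4.0, 10.0, 100000) / yn(3.0, 10.0, 100000);
    }
    public static void main(String[] args) {
        System.out.printf("y_4/y_3 = %10.6f\n", ratio());
    }
}
```

Unlike the Metropolis run, this check has no statistical error, which makes it a useful reference when validating the Monte Carlo estimate.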