#### Andrew Garcia 26 Dec 2020

# What are Monte Carlo methods

Monte Carlo methods are a broad set of computer simulations based on random sampling, which are used to estimate the outcomes of observable events. These methods were developed in the United States in the years around the end of World War II, in tandem with early electronic computers such as the ENIAC and the MANIAC, to work on complex thermonuclear problems and the decision making associated with them.

The movie WarGames features a fictional supercomputer that runs war simulations and learns to make decisions over time. Such simulations would likely have to be of the Monte Carlo variety.

Monte Carlo simulations have been applied to a diverse spectrum of problems, from making predictions on the stock market to modeling complex biological systems. These simulations typically fall under the following 3 classifications, listed in order of increasing complexity (the ordering is subjective):

1. Simple Monte Carlo -- Typically used in industry (Six Sigma, Design of Experiments)
2. Metropolis-Hastings Monte Carlo (a Markov chain Monte Carlo, or MCMC, method) -- Can be used to measure physical properties at equilibrium, such as thermodynamic observables
3. Kinetic Monte Carlo (kMC) -- Can be used to simulate systems dynamically (time evolution)

Though I specialize in Kinetic Monte Carlo simulations, I am quite familiar with all of these. Because it is the easiest to use as an instructional example, I will present an algorithm from the Simple Monte Carlo branch, using an industrial example I came up with. To begin, I will introduce some basics of this Simple Monte Carlo algorithm.

## Simple Monte Carlo Basics

One can draw numerical samples from a probability density function, such as a Gaussian, to obtain meaningful information about a system. Suppose we draw 500 values from a Gaussian with a mean of 5 and a standard deviation of 0.2. Our sample map and corresponding histogram may look like this:

[Figure: sample map of the 500 draws and the corresponding histogram]

Note that the histogram is just another way of representing the whole set of 500 samples in terms of probability.
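As a minimal sketch of the sampling described above, the snippet below draws 500 values from a Gaussian with mean 5 and standard deviation 0.2 using NumPy and bins them into a normalized histogram. The seed, the number of bins (20), and the variable names are my own illustrative choices, not part of the original example.

```python
import numpy as np

# Seeded generator so the run is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(seed=0)

# Draw 500 samples from a Gaussian with mean 5 and standard deviation 0.2.
samples = rng.normal(loc=5.0, scale=0.2, size=500)

# Bin the samples; density=True scales the bar heights so that the
# histogram integrates to 1, i.e. it approximates the probability density.
counts, bin_edges = np.histogram(samples, bins=20, density=True)

print(f"sample mean = {samples.mean():.3f}")
print(f"sample standard deviation = {samples.std(ddof=1):.3f}")
```

The sample mean and standard deviation should land close to, but not exactly on, 5 and 0.2, since 500 draws only approximate the underlying distribution.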

## Optimizing Candy Softness with Monte Carlo

Let's assume we have a taffy-making machine based on a puller mechanism.

[Figure: the action of the taffy puller patented by Richards (1905), as shown in Thiffeault's study]

We find that candy softness (Y) depends on 2 candy-making variables, the amount of salt (X1) and the rotating paddle speed (X2), in the following way:

Y = 0.6 X1 + 1.4 X2

### Statistics

We find that the rotating paddle speed (X2) varies by up to 10 speed units above and below its X2 input, with every speed in that range equally likely. We also notice there is human error in the amount of salt added (X1): the amount of salt added follows a Gaussian distribution centered at the X1 input, with a standard deviation of 3 units.

This information can be used to set the dispersion metrics of our candy-making variables. If we set our input for X1 to 18 salt mass units and our input for X2 to 40 speed units, the distribution statistics for these variables will be:

amount of salt (X1):
mean = 18; standard dev. = 3     <-- Gaussian (normal) distribution
rotating paddle speed (X2):
range = [40 - 10, 40 + 10]             <-- uniform distribution
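Before simulating anything, these statistics already allow a quick sanity check on what to expect for Y. The derivation below is a sketch that assumes X1 and X2 are independent (the text does not state this explicitly) and uses the standard variance of a uniform distribution on [a, b], which is (b - a)^2 / 12:

$$
\begin{aligned}
E[Y] &= 0.6\,E[X_1] + 1.4\,E[X_2] = 0.6(18) + 1.4(40) = 66.8 \\
\mathrm{Var}(X_2) &= \frac{(50 - 30)^2}{12} \approx 33.33 \\
\mathrm{Var}(Y) &= 0.6^2\,\mathrm{Var}(X_1) + 1.4^2\,\mathrm{Var}(X_2) = 0.36(9) + 1.96(33.33) \approx 68.57 \\
\sigma_Y &= \sqrt{\mathrm{Var}(Y)} \approx 8.28
\end{aligned}
$$

These closed-form values only give the mean and spread of Y. The shape of its distribution, which matters when comparing against specification limits, is what the simulation provides.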

### Monte Carlo algorithm

With this information we can estimate the statistical distribution of our output (candy softness) for any input value with the following Simple Monte Carlo (sMC) algorithm:

1. A sample is randomly drawn from X1's distribution (first figure below)
2. A sample is randomly drawn from X2's distribution (second figure below)
3. These two samples are plugged into the function for Y above
4. Steps 1-3 are repeated N times to form a distribution for Y (third figure below)
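The steps above can be sketched in a few lines of Python with NumPy. This is an illustrative version rather than the code behind the original figures: the function name `simulate_softness`, its parameter names, the seed, and the choice of N = 100,000 are all assumptions of mine. It also draws all N samples at once as arrays instead of looping N times, which gives the same result as repeating steps 1-3 one sample at a time.

```python
import numpy as np

def simulate_softness(n_samples, x1_mean=18.0, x1_std=3.0,
                      x2_center=40.0, x2_half_width=10.0, seed=None):
    """Simple Monte Carlo estimate of the softness distribution Y."""
    rng = np.random.default_rng(seed)
    # Step 1: salt amounts drawn from a Gaussian centered at the X1 input.
    x1 = rng.normal(x1_mean, x1_std, n_samples)
    # Step 2: paddle speeds drawn uniformly around the X2 input.
    x2 = rng.uniform(x2_center - x2_half_width,
                     x2_center + x2_half_width, n_samples)
    # Step 3: evaluate the softness model for every (X1, X2) pair.
    y = 0.6 * x1 + 1.4 * x2
    # Step 4 is implicit: the arrays hold all N repetitions.
    return x1, x2, y

x1, x2, y = simulate_softness(100_000, seed=42)
print(f"mean softness = {y.mean():.2f}")
print(f"standard deviation of softness = {y.std(ddof=1):.2f}")
```

With the inputs used here, the sample mean of Y should come out close to 0.6(18) + 1.4(40) = 66.8.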

Let's also assume our customer needs the candy to have softness ratings Y between 55 and 75. We can place these requirements in our histogram for softness as dashed red lines.

### Making the process Six-Sigma

You may have noticed from the figures above that, with the current settings of the process variables, a sizable share of our taffy DID NOT meet our customer's specifications on candy softness, as it fell beyond the dashed red lines. For a company this would translate into wasted money. Moreover, our design did not meet Six Sigma requirements, which here means that the part of the candy softness population within 3 standard deviations of the mean in both directions (99.73% of the population for a normal distribution) should be within the range of customer requirements.
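To put rough numbers on this, the sketch below reruns the simulation with the original settings and compares the result against the 55 to 75 specification. It is an illustration under my own assumptions (seed, sample count, variable names), and it uses the criterion as stated in this post: the interval of mean plus or minus 3 standard deviations of Y must fit inside the specification limits.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_samples = 200_000

# Original process: X1 ~ Gaussian(18, 3), X2 ~ uniform on [30, 50].
x1 = rng.normal(18.0, 3.0, n_samples)
x2 = rng.uniform(30.0, 50.0, n_samples)
y = 0.6 * x1 + 1.4 * x2

lower_spec, upper_spec = 55.0, 75.0

# Share of simulated candy that falls inside the customer's limits.
fraction_in_spec = float(np.mean((y >= lower_spec) & (y <= upper_spec)))

# Interval covering 3 standard deviations on each side of the mean.
three_sigma_low = y.mean() - 3.0 * y.std(ddof=1)
three_sigma_high = y.mean() + 3.0 * y.std(ddof=1)
meets_requirement = bool(three_sigma_low >= lower_spec
                         and three_sigma_high <= upper_spec)

print(f"fraction within spec = {fraction_in_spec:.3f}")
print(f"3-sigma interval = [{three_sigma_low:.1f}, {three_sigma_high:.1f}]")
print(f"meets requirement: {meets_requirement}")
```

For this setup the 3-sigma interval is far wider than the 20-unit specification window, so the check fails by a large margin.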

Through an optimization procedure, omitted here to keep this post short, we find that our machine's rotating paddles must be replaced with paddles whose speed stays within 2.5 speed units of 40, i.e. within the range [37.5, 42.5]. We thus modify variable X2 and run the sMC simulations once again. The candy softness distribution now meets our customer specifications.
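The optimization procedure itself is not shown in this post, so the sketch below only verifies a candidate paddle range rather than searching for one. The helper `check_paddle_range` is a hypothetical name of mine, as are its parameters, the seed, and the sample count. It runs the same simulation for a given half-width around 40 and reports whether the 3-sigma interval of Y fits inside the 55 to 75 limits.

```python
import numpy as np

LOWER_SPEC, UPPER_SPEC = 55.0, 75.0

def check_paddle_range(half_width, n_samples=1_000_000, seed=0):
    """Simulate softness for paddle speeds in 40 +/- half_width."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(18.0, 3.0, n_samples)
    x2 = rng.uniform(40.0 - half_width, 40.0 + half_width, n_samples)
    y = 0.6 * x1 + 1.4 * x2

    # Interval covering 3 standard deviations on each side of the mean.
    low = float(y.mean() - 3.0 * y.std(ddof=1))
    high = float(y.mean() + 3.0 * y.std(ddof=1))
    in_spec = float(np.mean((y >= LOWER_SPEC) & (y <= UPPER_SPEC)))
    return {
        "three_sigma": (low, high),
        "fraction_in_spec": in_spec,
        "meets_three_sigma": low >= LOWER_SPEC and high <= UPPER_SPEC,
    }

# Compare the original paddles (+/- 10) with the replacement (+/- 2.5).
for half_width in (10.0, 2.5):
    result = check_paddle_range(half_width)
    print(f"half-width {half_width}: {result}")
```

A large sample count is used because the narrowed range passes the 3-sigma check with little room to spare on the upper limit, so a small run could be swayed by sampling noise.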

This is perhaps the simplest class under the umbrella of Monte Carlo methods. I encourage you to explore further on your own, as these methods can be quite interesting and have a wide array of applications spanning beyond science and industry.