Introduction to Monte Carlo Simulation Samik Raychaudhuri, Ph.D. Senior Member of Technical Staff Crystal Ball Global Business Unit

Agenda • Introduction • Deterministic Modeling • Monte-Carlo Simulation Method and Steps • Identify Input Distribution • Distribution fitting • Correlation between distributions • Random Number Generation • Analysis and Decision Making • Application of Monte Carlo Simulation • Monte Carlo Simulation Software • Conclusion

Introduction • Monte Carlo Simulation • A type of simulation which relies on repeated random sampling and statistical analysis to compute results • Methodical way of automating what-if analysis • Provides an uncertainty dimension to otherwise static mathematical models

• What’s in the name? • Monte Carlo is a district of Monaco, famous for its casino • The use of randomness and the repetitive nature of the process are analogous to the activities conducted at a casino

Mathematical Models • Mathematical models are used in natural sciences, social sciences, finance and engineering • Depend on a number of input parameters • When processed through the mathematical formulas in the model, the inputs result in one or more outputs

Models and Simulation • Models are an attempt to capture behavior and performance of business processes and products. • Simulation is the application of models to predict future outcomes with known and uncertain inputs.

[Figure: a model, e.g. F = m∗a or Y = f(x), takes control inputs and noise variables; simulation turns them into outcome predictions]

Difference from Discrete Event Simulation

• Discrete event simulation models the process flow of physical systems; Monte Carlo simulation models stochastic systems with uncertainty • In discrete event simulation, events occur at discrete points in time; in Monte Carlo simulation, each occurrence of an event is a random number generated from a distribution (there is no time dimension)

Deterministic Modeling • Input parameters for a model depend on various external factors • Realistic models are subject to risk from systematic variation or uncertainty of the input parameters • Base-case Scenario: Deterministic model which does not consider these variations • Experimenters develop several versions of the model: base-case, best-case, worst-case

Deterministic Modeling: Disadvantages • It might be difficult to evaluate the best and worst case scenarios for each input variable • All the input variables may not be at their best or worst levels at the same time • As an experimenter increases the number of cases to consider, model versioning and storing becomes difficult

Monte Carlo Simulation Method In Monte Carlo Simulation: • We identify a statistical distribution which we can use as the source for each of the input parameters • We draw random samples from each distribution, which then represent the values of the input variables • For each set of input parameters, we evaluate the model and get a set of output parameters • The value of each output parameter is one particular outcome scenario in the simulation run • We collect such output values from a number of simulation runs • Finally, we perform statistical analysis on the values of the output parameters, to make decisions about the course of action (whatever it may be)
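The method described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the model (`price * quantity - cost`) and the three input distributions are hypothetical and chosen only to show the loop of sampling inputs, evaluating the model, and collecting outputs.

```python
import random
import statistics


def model(price, quantity, cost):
    """Deterministic model (hypothetical): one set of inputs gives one output."""
    return price * quantity - cost


def run_simulation(n_trials, seed=0):
    """Draw inputs from assumed distributions and evaluate the model per trial."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_trials):
        # One random sample per input variable (all distributions are assumptions).
        price = rng.normalvariate(10.0, 1.0)
        quantity = rng.uniform(80.0, 120.0)
        cost = rng.triangular(600.0, 900.0, 700.0)  # low, high, mode
        outputs.append(model(price, quantity, cost))
    return outputs


outputs = run_simulation(10_000)
# Statistical analysis of the collected output values.
print("mean output  :", statistics.mean(outputs))
print("std deviation:", statistics.stdev(outputs))
print("P(output < 0):", sum(o < 0 for o in outputs) / len(outputs))
```

Fixing the seed makes a run repeatable, which helps when comparing model versions.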

[Figure: Monte Carlo Simulation Method, from input variables to outputs]

Monte Carlo Simulation Steps • Following are the important steps for Monte Carlo simulation: 1. Deterministic model generation 2. Input distribution identification 3. Random number generation 4. Analysis and decision making

Deterministic Model Generation • MC Simulation starts off with developing a deterministic model • The model should resemble the real scenario as closely as possible • In this model, most likely values or the base-case values of the input variables are used • It is very important to develop, analyze and validate a good deterministic model


Identification of Input Distribution • Probability distributions are used for: • Representing variation: if the value or level of a variable varies with time, distance or any other such measurable factor • Describing uncertainty: if the value or level of the variable is uncertain over time, distance or any other such measurable factor

• If historical data for a particular input parameter is available, we can use numerical methods to fit the data to one of the theoretical discrete or continuous distributions • If there is no historical data for one or more input parameter(s), we can use expert opinion to model the corresponding input distribution

Identification of Input Distribution: Distribution Fitting • A mathematical method to identify probability distribution from historical data • Typical methods: • Method of maximum likelihood • Method of moments • Nonlinear optimization

• The task is to evaluate the parameters of a distribution which uniquely identify it • Typically a few distributions are fitted to the historical data • Discussing each method is beyond the scope of this presentation

Distribution Fitting: Goodness-of-fit • After we identify the parameters of the distributions, we need to compare them to identify the best one • A few statistics are used to compare fitted distributions. They are called goodness-of-fit statistics • The most common are: • Chi-square • Kolmogorov-Smirnov statistics • Anderson-Darling statistics

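• These statistics measure the discrepancy between the fitted theoretical distribution and the raw data; Kolmogorov-Smirnov and Anderson-Darling do so through the difference between the theoretical CDF of the distribution and the empirical CDF from the raw data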
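As an illustration of a goodness-of-fit statistic, the Kolmogorov-Smirnov statistic can be computed directly as the largest gap between the empirical CDF and a candidate theoretical CDF. The sketch below assumes a normal candidate; the helper names `normal_cdf` and `ks_statistic` are illustrative and not taken from the presentation.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(data, cdf):
    """Largest vertical gap between the empirical CDF of data and cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d
```

A smaller value means the fitted distribution tracks the data more closely, so the statistic can be used to rank several fitted candidates, e.g. `ks_statistic(data, lambda x: normal_cdf(x, mu, sigma))`.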

Example of distribution fitting • Weight data of 30 students in pounds • Calculation for mean and standard deviation:

• The MLE estimates of the parameters of a normal distribution, the mean and standard deviation, are the sample mean and the sample standard deviation (strictly, the version that divides by n) • So, our fit to the normal distribution is N(149.16, 25.97)
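A minimal sketch of the maximum-likelihood fit for a normal distribution follows. The weight data from the slide is not reproduced in this text, so the function takes any list of numbers; `fit_normal_mle` and `normal_log_likelihood` are illustrative names. For 30 data points, the divisor-n and divisor-(n − 1) standard deviations differ by less than 2%.

```python
import math
import statistics


def fit_normal_mle(data):
    """Return (mu, sigma) maximizing the normal likelihood of data."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data, mu)  # population form: divides by n
    return mu, sigma


def normal_log_likelihood(data, mu, sigma):
    """Log-likelihood of data under N(mu, sigma), sigma a standard deviation."""
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - ss / (2.0 * sigma ** 2)
```

Evaluating `normal_log_likelihood` at the fitted parameters and at nearby values is a quick check that the fit really is the maximum.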

Example of distribution fitting (contd.) • Now we will fit the same data to a 2-parameter lognormal distribution • Finally we will compare the goodness-of-fit statistics

Use Excel

Input Distribution: Correlation Between Random Variables • Correlations between variables are an important part of Monte Carlo simulations, as they occur naturally in various circumstances • When the values of two variables depend upon one another in part or completely, they are considered correlated • Correlation does not imply causation

Input Distribution: Correlation Between Random Variables (contd.) • Causality is reflected by Response Surface models ( Y=f(x) ) • Direct Relation between Independent Variables (Inputs, Assumptions) and Dependent Variables (Outputs, Forecasts) • Must use statistical proofs (ANOVA) or expert knowledge to develop direct relationship

• Correlations • Indirect Relation (Association) between Independent Variables (only between Inputs, Assumptions) • Magnitude and Sign indicate rough associative behavior between two Independent Variables

Input Distribution: Correlation Between Random Variables (contd.) • Do Shark Attacks cause Ice Cream Sales? • This is Correlation, not Cause-and-Effect

[Figure: chart labeled ICE CREAM SALES, by month from JAN to DEC]

Input Distribution: Correlation Between Variables: Correlation Coefficient • Correlation is described using a correlation coefficient • Ranges from -1 to 0 for negative correlation • Ranges from 0 to 1 for positive correlation

• Two popular correlation coefficients: • Pearson’s Correlation Coefficient • Spearman’s Rank Correlation Coefficient

• When dealing with different distributions, it is better to use rank correlation coefficient, since it is distribution free • Rank correlation is also preferred when dealing with qualitative characteristics which cannot be measured quantitatively but can be arranged serially • With multiple data sets, we can use formulas to calculate the correlation between data series. This information should be used when generating random numbers from input distributions
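Both coefficients can be computed from paired data series with a few lines of code. The sketch below is a plain-Python illustration (function names are illustrative); Spearman's coefficient is obtained here as Pearson's coefficient applied to the ranks, with tied values sharing their average rank.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient of the ranks."""
    return pearson(ranks(x), ranks(y))
```

For a monotonic but nonlinear relation such as `y = x ** 3`, `spearman` returns 1 while `pearson` stays below 1, which is why the rank coefficient is described as distribution free.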

Input Distribution: Correlation Between Variables: Effects of Correlation • Adding positive correlations between input variables into a model will generally increase the standard deviation of the output variables • Without correlations in the model, there is the risk of underestimating the variation of the output variable • Six Sigma metrics calculated may be wrong

[Figure: without correlations the predicted risk is 8%; with correlations the risk is really 17%]


Random Number Generation • After we have identified the underlying distributions for the input parameters of a simulation model, we generate random numbers from these distributions • The generated random numbers represent specific values of the variable • We will discuss: • Most common method of generating random numbers, called inverse transformation • Generating correlated random numbers • Generating random numbers without distribution

Random Number Generation: Inverse Transformation • Provides a direct route to generate a random sample from a distribution • Uses the inverse of the CDF, which is derived from the PDF for continuous distributions and from the PMF for discrete distributions • Converts a random number between 0 and 1 to a random value for the specific distribution • Mathematically: • Let X be a continuous r.v. with PDF f. • Let F^{-1} be the inverse of the CDF of f. • The following two steps generate a random number x from f: • Generate U~U(0,1) • Return X = F^{-1}(U) • Since 0 ≤ U ≤ 1, F^{-1}(U) always exists. • This method can also be used when f is discrete.
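As a concrete case of the two steps, consider the exponential distribution, chosen here only because its inverse CDF has a closed form: F(x) = 1 − exp(−rate · x), so F^{-1}(u) = −ln(1 − u) / rate. The sketch below is illustrative; the function names are not from the presentation.

```python
import math
import random


def exponential_inverse_cdf(u, rate):
    """F^{-1}(u) for F(x) = 1 - exp(-rate * x), x >= 0."""
    return -math.log(1.0 - u) / rate


def sample_exponential(n, rate, seed=0):
    """Inverse transformation: U ~ U(0,1), then X = F^{-1}(U)."""
    rng = random.Random(seed)
    # rng.random() returns U in [0, 1), so 1 - U is never zero.
    return [exponential_inverse_cdf(rng.random(), rate) for _ in range(n)]
```

The sample mean of `sample_exponential(n, rate)` should approach `1 / rate` as `n` grows.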


Random Number Generation: Inverse Transformation (contd.) • Advantages: • This method can be used for generating random numbers from truncated distributions • Can be used for any general type of distribution function, including mixture function of discrete and continuous distributions

• Disadvantages: • This method uses the inverse CDF, so it becomes difficult to implement if there is no closed-form inverse CDF for a distribution • If F(x), the CDF, can be calculated easily, an iterative method (like bisection or Newton-Raphson) can be used • The iterative approach has finite-precision numerical error and tolerance error
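When the CDF can be evaluated but not inverted in closed form, bisection gives a simple numerical inverse. The sketch below uses the normal CDF as the example; the bracket `[lo, hi]`, the tolerance, and the function names are assumptions made for illustration.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, evaluated with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def inverse_cdf_bisection(cdf, u, lo, hi, tol=1e-10, max_iter=200):
    """Find x in [lo, hi] with cdf(x) close to u; cdf must be non-decreasing."""
    if not cdf(lo) <= u <= cdf(hi):
        raise ValueError("u is outside the bracket [cdf(lo), cdf(hi)]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid  # the root lies in the upper half
        else:
            hi = mid  # the root lies in the lower half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

The `tol` argument is the tolerance error mentioned above: the returned value is within about `tol` of the true quantile, at the cost of roughly `log2((hi - lo) / tol)` CDF evaluations per sample.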

• Other methods: • Acceptance-rejection method • Composition method • Convolution method

Random Number Generation: Acceptance-Rejection Method • This method is used for cases when the form of f makes it difficult to sample directly (e.g., when F^{-1} has no closed form) • Used for more complicated models, e.g., MC sampling for diffusion models (like Brownian motion) • An example: Suppose it is desired to generate a random point within the unit circle. Generate a candidate point (x,y) where x and y are independent and uniformly distributed between −1 and 1. If it so happens that x^2 + y^2 ≤ 1, then the point is within the unit circle and should be accepted; otherwise it is rejected and a new candidate is generated.
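The unit-circle example translates almost directly into code. The sketch below is illustrative (function names are assumptions): candidates are drawn in the square [−1, 1] × [−1, 1] and kept only when they fall inside the circle.

```python
import random


def sample_unit_disc(rng):
    """Return a point (x, y) uniformly distributed inside the unit circle."""
    while True:
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:  # accept; otherwise draw a new candidate
            return x, y


def acceptance_rate(n_candidates, seed=0):
    """Fraction of candidate points that land inside the unit circle."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(n_candidates):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            accepted += 1
    return accepted / n_candidates
```

The acceptance rate approaches π/4 (the ratio of the circle's area to the square's), so about 21% of candidates are discarded. That wasted work is the price of avoiding an inverse CDF.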

Random Number Generation: Correlated Random Variables • We have talked about correlations before • Assume that we have discovered correlation between two (or more) input variables • Now we want to generate samples from the distributions of those variables • Involves complicated computation: • We will need the correlation matrix: a matrix which contains correlation coefficient between each pair of distributions • Next we decide on using one of the many copula functions available in the literature, and convert the correlation matrix into a rotation matrix • This matrix is then used to impart correlations on streams of uniform random numbers • Finally, inverse CDF of each distribution is used to get back the random numbers
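For two variables, the steps above can be sketched with a Gaussian copula, one of several possible copula choices and not necessarily the one used by any particular software package. In the sketch, the 2×2 Cholesky factor of the correlation matrix plays the role of the matrix that imparts correlation, and the two marginal distributions (exponential and uniform) are hypothetical.

```python
import math
import random

LARGEST_BELOW_ONE = 1.0 - 2.0 ** -53  # keeps log(1 - u) finite


def std_normal_cdf(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def correlated_uniforms(n, rho, seed=0):
    """Pairs of U(0,1) numbers linked by a Gaussian copula with parameter rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Cholesky factor of [[1, rho], [rho, 1]] applied to (z1, z2).
        c2 = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        pairs.append((std_normal_cdf(z1), std_normal_cdf(c2)))
    return pairs


def correlated_inputs(n, rho, seed=0):
    """Apply each marginal's inverse CDF (both marginals are hypothetical)."""
    samples = []
    for u1, u2 in correlated_uniforms(n, rho, seed):
        u1 = min(u1, LARGEST_BELOW_ONE)
        x1 = -math.log(1.0 - u1) / 0.5  # exponential with rate 0.5
        x2 = 10.0 + 20.0 * u2           # uniform on [10, 30]
        samples.append((x1, x2))
    return samples
```

Because the inverse CDFs are monotonic, the rank correlation of the final samples equals that of the correlated uniforms. It is close to, but not exactly, `rho`.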

Random Number Generation from a Dataset • Now, let’s suppose that we could not obtain an underlying distribution for an input variable. All we have is some data • All is not lost yet: we can use the available data to generate random samples. This method is called bootstrapping • In bootstrapping, we do not really generate random variates. Instead, we repeatedly sample the original dataset to choose one of the data points from the set (choose a number with replacement) • One still has to use a uniform RNG, specifically an RNG to generate integer random numbers among the indices of an array, which is being used for storing the original dataset
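Bootstrapping as described above needs only a uniform integer RNG over the array indices. The following sketch is illustrative; the function names are assumptions, and any list of observed values can be passed as `data`.

```python
import random
import statistics


def bootstrap_sample(data, rng):
    """Resample data with replacement using uniform random array indices."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]


def bootstrap_means(data, n_resamples, seed=0):
    """Mean of each bootstrap resample, e.g. to gauge the spread of the mean."""
    rng = random.Random(seed)
    return [statistics.fmean(bootstrap_sample(data, rng)) for _ in range(n_resamples)]
```

In a simulation, a single input value per trial can be drawn the same way with `data[rng.randrange(len(data))]`.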

Random Number Generation from a Dataset (contd.) • Bootstrapped simulation can be a highly effective tool in the absence of a parametric distribution for a set of data • It does not provide general finite sample guarantees though, and has a tendency to be overly optimistic • The apparent simplicity may conceal the important assumptions: samples are independently drawn • Difficult to use correlation information, although possible • Intrinsic correlation in repeated observations must be taken into account to draw valid scientific inference


Analysis and Decision Making • The result of the Monte Carlo simulation of a model is typically subjected to statistical analysis • For each set of random numbers (or trials) generated for each of the input variables, we use the model formula to arrive at a trial value for the output variable(s) • When the trials are complete, the stored values are analyzed • Different types of analysis are possible: • Histogram and other charts • Measures of central tendency: average or mean, median, mode • Measures of dispersion: range, mean absolute deviation, variance • Percentiles • Higher order moments like skewness and kurtosis

• One can also compute capability statistics in case of six-sigma based simulations • Sensitivity analysis: identifies the input variables that account for most of the variation in the values of the output parameter of interest.
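Most of the listed measures can be computed from the stored trial values with the Python standard library. The sketch below is one possible summary under stated assumptions: the choice of the 5th and 95th percentiles is arbitrary, skewness and kurtosis use the simple moment definitions, and the values are assumed not to be all identical.

```python
import statistics


def summarize(values):
    """Descriptive statistics for the stored trial values of one output."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values, mu=mean)  # population form, divisor n
    # quantiles(..., n=100) returns the 99 cut points P1..P99.
    pct = statistics.quantiles(values, n=100)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "mean_abs_dev": sum(abs(v - mean) for v in values) / n,
        "variance": statistics.variance(values, xbar=mean),  # divisor n - 1
        "p5": pct[4],
        "p95": pct[94],
        "skewness": sum((v - mean) ** 3 for v in values) / n / sd ** 3,
        "kurtosis": sum((v - mean) ** 4 for v in values) / n / sd ** 4,
    }
```

The interval between `p5` and `p95` gives a range that contains about 90% of the simulated outcomes, which is often easier to communicate than the variance.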

Result Analysis: Sensitivity • An important question while analyzing results is: how much does a given input distribution affect the result? • In other words, this is the sensitivity of the output variable to each of the input variables • The overall sensitivity of an output variable to an input variable depends on: • Uncertainty in the input variable • Dependence between the model and the input variable

• One can use correlation between the target output variable and the input variable as a measure of sensitivity between the two variables • Again, rank correlation would be better here, since it is distribution independent • High positive or negative correlation indicates that the specific input variable has a significant effect on the output variable

• Be careful when using correlations as sensitivity though. Calculations might be inaccurate if the input variables are correlated, or the relationship between an input and output is non-monotonic, or some of the input or output variables are discrete
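The sketch below illustrates rank-correlation sensitivity and one of the caveats above. The model `y = 4*x1 + x2 + x3**2` and its input distributions are hypothetical: `x1` dominates the output, `x2` has a small effect, and `x3` enters non-monotonically, so its rank correlation is near zero even though it does influence the output. The simple rank formula assumes no tied values, which holds for continuous inputs.

```python
import random


def ranks(values):
    """1-based ranks, assuming no tied values."""
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r


def spearman(x, y):
    """Spearman's rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def sensitivity(n_trials=5000, seed=0):
    """Rank correlation between each input and the output of a toy model."""
    rng = random.Random(seed)
    x1 = [rng.uniform(0.0, 1.0) for _ in range(n_trials)]
    x2 = [rng.uniform(0.0, 1.0) for _ in range(n_trials)]
    x3 = [rng.uniform(-1.0, 1.0) for _ in range(n_trials)]
    y = [4.0 * a + b + c ** 2 for a, b, c in zip(x1, x2, x3)]  # hypothetical model
    return {"x1": spearman(x1, y), "x2": spearman(x2, y), "x3": spearman(x3, y)}
```

Sorting the returned coefficients by absolute value gives the ordering typically shown in a sensitivity chart.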

Application of Monte Carlo Simulation • Finance: • Portfolio analysis • Options and real options analysis • Personal financial planning

• Reliability analysis and six-sigma • MC simulation in mathematics, statistical physics and other physical sciences • Engineering • Civil and construction • Electronics and computers

Monte Carlo Simulation Software • Use high-level programming languages like C/C++/Java/.NET • Develop a computer program for generating uniform random numbers • Can be tailor-made for a specific situation • Various software libraries are available to facilitate MC simulation code

• Use general purpose math tools • Matlab, R, Scilab • Have to program in tool-specific language

• General purpose simulation software packages • Model an industry-specific problem, generate random numbers, and perform output analysis • Goldsim, Vanguard Systems, SimCad are some examples

Monte Carlo Simulation Software (contd.)

• Can also be performed using add-ins to popular spreadsheet software like Microsoft Excel • One typically starts by developing a deterministic model for the problem in a spreadsheet • Then one defines distributions for the input variables which contain uncertainty • Add-ins are capable of generating charts and graphs and help with various types of analysis • Examples: • Crystal Ball from Oracle • @RISK from Palisade • Risk Solver from Frontline Systems

Conclusion • Monte Carlo simulation is a very useful mathematical technique for analyzing uncertain scenarios and providing probabilistic analysis of different situations • The basic principle for applying MC analysis is simple and easy to grasp • Various software packages have accelerated the adoption of MC simulation in different domains including mathematics, engineering, finance etc.

Questions • Question time !!

Hands On Example: Futura Apartments • You are a potential purchaser of the Futura Apartments complex • Because there is some uncertainty surrounding the number of units you can rent each month and the monthly expenses, you need to simulate your potential profit or loss per month • This knowledge will help you to determine whether or not this complex is worth purchasing • Your research has led you to make the following assumptions: • $500 per month is the going rent for the area • The number of units rented during any given month will be somewhere between 30 and 40 • Operating costs will average around $15,000 per month for the entire complex, but might vary slightly from month to month

Hands On Example: Futura Apartments (contd.) • Run a single step of the simulation • Complete the simulation • Look at forecast chart, and the statistics • Look at the certainty of certain values of profit or loss • Look at sensitivity chart
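Outside a spreadsheet, the Futura Apartments model can be sketched as follows. The rent of $500 and the range of 30 to 40 units come from the assumptions above; the choice of a discrete uniform distribution for units, a normal distribution for costs, and a cost standard deviation of $1,000 are assumptions made here for illustration, since the slides only say costs "might vary slightly".

```python
import random
import statistics

RENT_PER_UNIT = 500.0  # going rent per month, from the example


def monthly_profit(units_rented, operating_cost):
    """Deterministic model: rental income minus operating cost."""
    return units_rented * RENT_PER_UNIT - operating_cost


def simulate_profit(n_trials, cost_sd=1000.0, seed=0):
    """Monthly profit or loss for n_trials simulated months."""
    rng = random.Random(seed)
    profits = []
    for _ in range(n_trials):
        units = rng.randint(30, 40)                 # assumed discrete uniform
        cost = rng.normalvariate(15000.0, cost_sd)  # assumed normal, sd is a guess
        profits.append(monthly_profit(units, cost))
    return profits


profits = simulate_profit(10_000)
print("mean profit:", statistics.fmean(profits))
print("P(loss)    :", sum(p < 0 for p in profits) / len(profits))
```

The base case, `monthly_profit(35, 15000.0)`, gives $2,500; the simulation adds the probability of a loss in any given month, which the base case alone cannot show.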

Hands On Example: Discounted Cash Flow Analysis • Your pharmaceutical company is very interested in acquiring AllergyGone, a potential new anti-allergy drug with no known side effects • You have been asked to produce a Discounted Cash Flow (DCF) analysis of AllergyGone over a five-year period to determine if this product is worth acquiring. Because of the uncertainty in the product pricing, demands, and costs, your company has decided to simulate the Net Present Value (NPV) and Internal Rate of Return (IRR) prior to negotiations • Assumptions: • Discount rate is 10%, which is also your company’s hurdle rate • Tax rate is 32% • Cost of revenue is 75% of gross revenue, but varies • Operating cost is 10% of gross income, but varies

• Deterministic Analysis: • The measure of success is the IRR and NPV calculations • The NPV is approximately $500,000 with an IRR of 15% • Initial investment of $3,400,000 in Year 0
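A rough sketch of the NPV part of this analysis follows. The discount rate, tax rate, cost percentages, and initial investment are taken from the assumptions above; everything else is hypothetical, including the yearly revenue (passed in as an argument because the slides do not state it), the normal distributions, their standard deviations, and the reading of "gross income" as revenue minus cost of revenue. The results are therefore not expected to reproduce the $500,000 figure.

```python
import random

DISCOUNT_RATE = 0.10              # from the example
TAX_RATE = 0.32                   # from the example
INITIAL_INVESTMENT = 3_400_000.0  # Year 0, from the example


def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs in Year 0."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def yearly_net_cash_flow(revenue, cost_of_revenue_frac, operating_cost_frac):
    """After-tax cash flow for one year (simplified, hypothetical structure)."""
    gross_income = revenue * (1.0 - cost_of_revenue_frac)
    pre_tax_income = gross_income * (1.0 - operating_cost_frac)
    return pre_tax_income * (1.0 - TAX_RATE)


def simulate_npv(annual_revenue, n_trials, cost_sd=0.02, opex_sd=0.01,
                 years=5, seed=0):
    """NPV per trial, with cost fractions redrawn every year (assumed normal)."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        flows = [-INITIAL_INVESTMENT]
        for _year in range(years):
            cor = rng.normalvariate(0.75, cost_sd)
            opex = rng.normalvariate(0.10, opex_sd)
            flows.append(yearly_net_cash_flow(annual_revenue, cor, opex))
        results.append(npv(DISCOUNT_RATE, flows))
    return results
```

The fraction of trials with a negative NPV is the kind of certainty figure the deterministic analysis cannot provide.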

Hands On Example: Discounted Cash Flow Analysis (contd.) • Run a single step of the simulation • Complete the simulation • Look at forecast charts, and the statistics • Look at sensitivity chart
