Reliability of global sensitivity indices

Chonggang Xu (1), George Z. Gertner (2)

1 - Department of Entomology and Center for Quantitative Sciences in Biomedicine, North Carolina State University
2 - Department of Natural Resources and Environmental Sciences, University of Illinois at Urbana-Champaign

In Press. doi:10.1080/00949655.2010.509317
Abstract: Uncertainty and sensitivity analysis is an essential ingredient of model development and applications. For many uncertainty and sensitivity analysis techniques, sensitivity indices are calculated based on a relatively large sample to measure the importance of parameters in their contributions to uncertainties in model outputs. To statistically compare their importance, it is necessary that uncertainty and sensitivity analysis techniques provide standard errors of estimated sensitivity indices. In this paper, a delta method is used to analytically approximate standard errors of estimated sensitivity indices for a popular sensitivity analysis method, the Fourier Amplitude Sensitivity Test (FAST). Standard errors estimated based on the delta method were compared to those estimated based on 20 sample replicates. We found that the delta method can provide a good approximation of the standard errors of both first-order and higher-order sensitivity indices. Finally, based on the standard error approximation, we also propose a method to determine the minimum sample size needed to achieve a desired estimation precision for a specified sensitivity index. The standard error estimation method presented in this paper can make FAST analysis computationally much more efficient for complex models.

Keywords: Fourier Amplitude Sensitivity Test, Sensitivity analysis, Uncertainty analysis, Standard error, Simple random sampling, Random balance design sampling
1 Introduction

Uncertainty and sensitivity analysis is an essential ingredient of numerical model development and applications [1]. It can help in model application by providing model users with information on the reliability of model predictions, and can help in model development by providing judging criteria for model identification, calibration and corroboration. Uncertainty and sensitivity analysis attempts to quantify the uncertainties in model outputs (generally resulting from uncertainties in model parameters) and the importance of parameters in their contributions to those uncertainties. In accordance with the tradition within the uncertainty and sensitivity analysis community, the parameters in this paper refer to the numerical model input variables (e.g., the daily survival rate describing mosquito survival in a mosquito population model), instead of the quantities describing a probability distribution as used in statistics (e.g., the mean and standard deviation of a normal distribution). Many uncertainty and sensitivity analysis techniques are now available [2,3]. A large group of them are variance-based methods, which attempt to decompose the variance of a model output y into partial variances contributed by individual model parameters. Their main concern is to calculate the total variance of a model output y [i.e., V(y)] and the conditional variance of the expected value of the model output y given a specific set of parameters [4]. The ratios of the conditional variances (also termed partial variances) to the total variance of the model output (referred to as sensitivity indices) are used to measure the importance of parameters in their contributions to the uncertainty in model outputs. Since variance-based sensitivity indices are generally calculated by incorporating uncertainties in all parameters simultaneously, they are also referred to as global sensitivity indices in the literature [3].
One of the most popular variance-based uncertainty and sensitivity analysis techniques is the Fourier Amplitude Sensitivity Test (FAST) [5,6,7,8]. For a review of other variance-based methods, please refer to Borgonovo [9]. FAST uses a periodic sampling approach to introduce signals for all parameters of interest and a Fourier transformation to decompose the variance of a model output into partial variances contributed by different model parameters. The ratios of the partial variances to the total variance (referred to as sensitivity indices) are used to measure the importance of the main effects of parameters (or the interaction effects among parameters) in their contributions to the variance of the model output. Different sampling approaches can be used to estimate the partial variances in FAST, including the traditional sampling by a search curve in the parameter space using an auxiliary variable [5,6,8], random balance design sampling [10], and simple random sampling [11]. Xu and Gertner [11] compared the performance of different sampling algorithms for FAST analysis. FAST was originally developed for models with independent parameters. In order to extend FAST to models with dependent parameters, Xu and Gertner [12,13] introduced a random reordering approach to account for rank correlations among parameters. Until now, FAST analysis has mainly been used to estimate the partial variances contributed by the main effects of parameters. For the traditional sampling based on the search curve, it has been heuristically shown or realized that FAST can be used to estimate partial variances contributed by interactions among parameters using linear combinations of the characteristic frequencies assigned to individual parameters [14]. However, there has been a lack of theoretical proof and understanding for the calculation of partial variances contributed by interactions among parameters. Xu and Gertner [11] theoretically showed that FAST analysis can be used to estimate partial variances contributed by both main effects and interaction effects of model parameters for different sampling approaches, including sampling based on the search curve, simple random sampling, and random balance design sampling.
Sensitivity indices, determined by the ratios of partial variances to the total variance of a model output in FAST, can be used to measure the importance of parameters in their contributions to uncertainty in the model output. However, due to the random errors introduced by sampling, it could be misleading to judge the relative importance of parameters based only on the estimated sensitivity indices, especially if the sensitivity indices of two parameters are close. In view of the random errors introduced by sampling, it is important that we know the reliability of estimated sensitivity indices. In statistics, the reliability of a statistic (i.e., a sensitivity index in this paper) is generally measured by its standard error. A common approach for standard error estimation is to conduct the FAST analysis using many (e.g., m) replicates of its periodic sample. For each replicate, a new periodic sample with the relatively large sample size N (N > 1000) required by FAST is drawn. We then need to run the model on each newly drawn periodic sample and calculate the sensitivity indices for the different model parameters. Finally, the standard error of an estimated sensitivity index can be calculated based on the standard deviation of the m estimated sensitivity indices. This method is effective and simple to implement for a simple model (which takes a very short time to run a single simulation). However, for a complex model which takes a long time (e.g., hours) to run a single simulation, the above replicated sample method for standard error estimation could be computationally expensive or infeasible.
Nowadays more complex models (especially spatial models) have been developed to simulate complicated systems. For example, many forest landscape models have been developed to predict potential forest landscape changes under future global warming (generally including inter-species competition, seed dispersal and seedling recruitment, tree growth, and fire disturbances or harvesting), which may take a number of hours to run a single simulation for a large study area (e.g., millions of hectares) [15,16,17]. For vector-borne diseases (e.g., dengue fever or malaria), population dynamics models of disease vectors (e.g., mosquitoes) and human movement are important for a better understanding of vector dynamics and more efficient disease control. In view of the importance of mosquito dispersal to population dynamics in heterogeneous spatial domains and to the success of vector control for dengue fever [18], new spatial models have been developed to account for mosquito dispersal [19,20]. Since such a spatial model needs to simulate biological development within water containers (e.g., weight growth, physiological development, pupation), effects of environmental factors (e.g., temperature and moisture), spatial dispersal of individual mosquito cohorts, and even gene flow for the release of genetically modified mosquitoes to reduce the virus-borne mosquito population, it can take the model several hours to run a single simulation for a single city with thousands of houses [20]. Uncertainty and sensitivity analysis is an important tool to help us understand the reliability of landscape or population dynamics predicted by those complex models. However, for complex models, the replicate-based method of standard error estimation for sensitivity indices could be difficult or infeasible given the large number (m × N) of model runs required.
Two potentially efficient approaches available to calculate standard errors of sensitivity indices are the bootstrap method and the delta method. For the bootstrap method, m replicates of samples drawn from the original periodic sample can be used to estimate the standard errors of sensitivity indices calculated by FAST. As such, running the model for each of the m replicates is not required with the bootstrap method. However, since the FAST analysis depends on periodic samples to form signals for parameters (see eq. (2)), the resampling can potentially destroy the periodic signal, which would lead to an unreliable estimate of the standard error. Therefore, in this paper, the delta method is used to analytically calculate the standard errors of sensitivity indices based on a single periodic sample of the size N required by FAST. Based on the derived standard error calculation formula, we also propose a method of determining the minimum sample size required to achieve a desired estimation precision for a specified sensitivity index.
The paper is organized as follows: Section 2.1 gives a general background on FAST analysis; Section 2.2 derives the standard errors of estimated sensitivity indices for simple random sampling; Section 2.3 derives the standard errors of estimated sensitivity indices for random balance design sampling; and Section 2.4 provides a summary of FAST analysis with standard error estimation by the delta method. Section 3 provides two test models. Section 4 presents their results. Section 5 discusses the potential use of the derived standard errors for the determination of a minimum sample size to achieve a desired estimation precision, and extends the proposed method to models with correlated parameters.
2 Method

2.1 Review of FAST

Consider a computer model $y = g(x_1, \dots, x_n)$, where $x_i$ ($i = 1, \dots, n$) is a model parameter with probability density function $f_i(x_i)$ and $n$ is the total number of parameters of interest. The parameter space is defined as

$K_x^n = \{x_1, \dots, x_n \mid x_i \sim f_i(x_i),\ i = 1, \dots, n\}$.   (1)

The traditional version of FAST assumes independence among model parameters. The main idea of FAST is to use a periodic sampling approach to introduce a signal for each parameter and then use a Fourier transformation to decompose the total variance of the model output (y) into partial variances contributed by different model parameters. In order to generate periodic samples for a specific parameter $x_i$, FAST uses a periodic search function (a function to search or explore the parameter space) as follows [11,14,21,22],

$x_i = G(\theta_i) = F_i^{-1}\!\left(\frac{1}{2} + \frac{1}{\pi}\arcsin(\sin(\theta_i))\right)$   (2)

where $\theta_i$ is a random variable uniformly distributed between 0 and $2\pi$, and $F_i^{-1}(\cdot)$ is the inverse cumulative distribution function of parameter $x_i$. For the sampled values of $\theta_i$, $G(\theta_i)$ is a function used to generate the corresponding samples for parameter $x_i$, which will follow the pre-specified probability density function $f_i(x_i)$. Using eq. (2), the parameter space can now be explored by samples in the $\theta$-space defined as follows,

$K_\theta^n = \{\theta_1, \dots, \theta_n \mid 0 \le \theta_i \le 2\pi,\ i = 1, \dots, n\}$.   (3)
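As a concrete illustration of eq. (2), the following Python sketch draws $\theta_i$ uniformly on $[0, 2\pi)$ and pushes it through the search function. The choice of a standard normal parameter, the sample size, and all function names here are illustrative assumptions, not part of the FAST specification.

```python
import numpy as np
from statistics import NormalDist

def search_function(theta, inv_cdf):
    """Search function of eq. (2): x = F^{-1}(1/2 + arcsin(sin(theta))/pi).
    theta uniform on [0, 2*pi) yields x distributed according to F."""
    u = 0.5 + np.arcsin(np.sin(theta)) / np.pi   # uniform on (0, 1)
    return np.array([inv_cdf(v) for v in u])

rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=20000)
# Standard normal parameter: F^{-1} is the normal quantile function.
x = search_function(theta, NormalDist(mu=0.0, sigma=1.0).inv_cdf)
print(round(float(x.mean()), 2), round(float(x.std()), 2))
```

The sample mean and standard deviation should be close to 0 and 1, confirming that the angles are mapped onto the pre-specified distribution.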
A main concern of variance-based uncertainty and sensitivity analysis is to calculate the total variance of the model output y [i.e., V(y)] and the conditional variance of the expected value of y given a specific set of parameters [4]. The conditional variances are used to measure the importance of parameters in their contributions to uncertainty in the model output. For example, a relatively large conditional variance of the expected value of y given $x_i$ [i.e., $V(E(y \mid x_i))$] will indicate that a relatively high amount of uncertainty in the model output is contributed by this parameter $x_i$. Similarly, a relatively large conditional variance of the expected value of y given a specific set of parameters $x_{sub}$ [i.e., $V(E(y \mid x_{sub}))$] will indicate that a relatively large amount of uncertainty in the model output is contributed by this set of parameters. Using the Strong Law of Large Numbers, it can be shown that the conditional variance of the expected value of y in the parameter space can be calculated in the $\theta$-space (see Xu and Gertner [11] for details). Namely,

$V^{(x_i)} = V_x(E_x(y \mid x_i)) = V_\theta(E_\theta(y \mid \theta_i))$
$V^{(x_i, x_j)} = V_x(E_x(y \mid x_i, x_j)) = V_\theta(E_\theta(y \mid \theta_i, \theta_j)),\ i \ne j$
$V^{(x_i, x_j, x_k)} = V_x(E_x(y \mid x_i, x_j, x_k)) = V_\theta(E_\theta(y \mid \theta_i, \theta_j, \theta_k)),\ i \ne j \ne k$
$\dots$
$V^{(\{x_1, \dots, x_n\} \setminus x_i)} = V_x(E_x(y \mid \{x_1, \dots, x_n\} \setminus x_i)) = V_\theta(E_\theta(y \mid \{\theta_1, \dots, \theta_n\} \setminus \theta_i))$
$V^{(x_1, \dots, x_n)} = V_x(g(x_1, \dots, x_n)) = V_\theta(g(G(\theta_1), \dots, G(\theta_n)))$   (4)

where $V_x(\cdot)$ and $V_\theta(\cdot)$ are the conditional variances calculated in the parameter space and $\theta$-space, respectively; $E_x(\cdot)$ and $E_\theta(\cdot)$ are the expected values calculated in the parameter space and $\theta$-space, respectively; and $\{x_1, \dots, x_n\} \setminus x_i$ represents all parameters except $x_i$. For a subset $x_{sub}$ of all parameters $\{x_1, \dots, x_n\}$, $V^{(x_{sub})}$ represents the partial variance in the model output y due to the uncertainty in the subset parameters $x_{sub}$. $V^{(x_1, \dots, x_n)}$ represents the variance of the model output y resulting from uncertainties in all model parameters. Namely,

$V^{(x_1, \dots, x_n)} = V(y)$.   (5)
Following the decomposition of variance in analysis of variance (ANOVA), and assuming independence among parameters [4], we define

$V_{x_i} = V^{(x_i)}$
$V_{x_i x_j} = V^{(x_i, x_j)} - V_{x_i} - V_{x_j},\ i \ne j$
$\dots$
$V_{x_1, \dots, x_n} = V^{(x_1, \dots, x_n)} - \sum_{s=1}^{n-1} \sum_{1 \le i_1 < i_2 < \dots < i_s \le n} V_{x_{i_1} \dots x_{i_s}}$   (6)

as the partial variances of the model output contributed by the first-order (or main) effects, the second-order interaction effects, the third-order interaction effects, and so on, until the nth-order interaction effects of the model parameters. Summing all the left and right terms in eq. (6), we get the variance decomposition as follows,

$V(y) = \sum_{i=1}^{n} V_{x_i} + \sum_{i < j} V_{x_i x_j} + \sum_{i < j < k} V_{x_i x_j x_k} + \dots + V_{x_1, \dots, x_n}$   (7)

which suggests that the total variance resulting from parameter uncertainties can be decomposed into partial variances contributed by the first-order effects, the second-order interaction effects, the third-order interaction effects, and so on, until the nth-order interaction effects of parameters. Dividing both sides of eq. (7) by $V^{(x_1, \dots, x_n)}$, we get

$1 = \sum_{i=1}^{n} S_{x_i} + \sum_{i < j} S_{x_i x_j} + \sum_{i < j < k} S_{x_i x_j x_k} + \dots + S_{x_1, \dots, x_n}$   (8)

where

$S_{x_i} = \dfrac{V_{x_i}}{V^{(x_1, \dots, x_n)}}$, $S_{x_i x_j} = \dfrac{V_{x_i x_j}}{V^{(x_1, \dots, x_n)}}$, $\dots$, $S_{x_1 \dots x_n} = \dfrac{V_{x_1 \dots x_n}}{V^{(x_1, \dots, x_n)}}$   (9)

represent the first-order, second-order, and so on, until the nth-order sensitivity indices, respectively.
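The meaning of the conditional-variance ratios in eqs. (8)-(9) can be checked numerically. The brute-force sketch below uses a toy additive model of our own choosing (not from this paper) and estimates $V(E(y \mid x_i))$ by a double Monte Carlo loop; because the model has no interactions, the two first-order indices sum to one.

```python
import numpy as np

rng = np.random.default_rng(1)

def g(x1, x2):
    return x1 + 2.0 * x2   # additive toy model: no interaction effects

n_outer, n_inner = 2000, 2000
x1 = rng.standard_normal(n_outer)
x2 = rng.standard_normal(n_outer)

# E(y | x_i) for each fixed value, averaging over fresh draws of the other input
cond_mean_1 = np.array([g(v, rng.standard_normal(n_inner)).mean() for v in x1])
cond_mean_2 = np.array([g(rng.standard_normal(n_inner), v).mean() for v in x2])

vy = 1.0 + 4.0                 # analytic V(y) for this model
s1 = cond_mean_1.var() / vy    # first-order index of x1 (analytic value 0.2)
s2 = cond_mean_2.var() / vy    # first-order index of x2 (analytic value 0.8)
print(round(float(s1), 2), round(float(s2), 2))
```

This double-loop estimator is far too expensive for a real model; FAST replaces it with the single periodic sample described above.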
Using search functions of the form of eq. (2), the model output becomes a multiple periodic function of ($\theta_1, \dots, \theta_n$). Thus, we can apply a multiple Fourier transformation to the model $y = g(G(\theta_1), \dots, G(\theta_n))$ over all variables in the $\theta$-space. Namely, we have

$g(G(\theta_1), \dots, G(\theta_n)) = \sum_{r_1, \dots, r_n} C^{(\theta)}_{r_1 \dots r_n} e^{i(r_1 \theta_1 + \dots + r_n \theta_n)}$   (10)

where

$C^{(\theta)}_{r_1 \dots r_n} = \left(\dfrac{1}{2\pi}\right)^n \int_0^{2\pi} \dots \int_0^{2\pi} g(G(\theta_1), \dots, G(\theta_n)) e^{-i(r_1 \theta_1 + \dots + r_n \theta_n)} d\theta_1 \dots d\theta_n$.   (11)

It is notable that $C^{(\theta)}_{r_1 \dots r_n}$ is the expected value of $g(G(\theta_1), \dots, G(\theta_n)) e^{-i(r_1 \theta_1 + \dots + r_n \theta_n)}$, in view that $\theta_1, \dots, \theta_n$ are independently and uniformly distributed in the $\theta$-space. Namely,

$C^{(\theta)}_{r_1 \dots r_n} = E_\theta\left(g(G(\theta_1), \dots, G(\theta_n)) e^{-i(r_1 \theta_1 + \dots + r_n \theta_n)}\right)$   (12)

which can be estimated based on N samples of $\theta_1, \dots, \theta_n$ as follows,

$\hat{C}^{(\theta)}_{r_1 \dots r_n} = \dfrac{1}{N} \sum_{j=1}^{N} g(G(\theta_1^{(j)}), \dots, G(\theta_n^{(j)})) e^{-i(r_1 \theta_1^{(j)} + \dots + r_n \theta_n^{(j)})}$   (13)

where $\theta_i^{(j)}$ represents the jth sample for $\theta_i$. For notational convenience, we define the cosine Fourier coefficient $a_{r_1 \dots r_n}$ and sine Fourier coefficient $b_{r_1 \dots r_n}$ as follows,

$a_{r_1 \dots r_n} = E_\theta[g(G(\theta_1), \dots, G(\theta_n)) \cos(r_1 \theta_1 + \dots + r_n \theta_n)]$,
$b_{r_1 \dots r_n} = E_\theta[g(G(\theta_1), \dots, G(\theta_n)) \sin(r_1 \theta_1 + \dots + r_n \theta_n)]$,   (14)

with

$C^{(\theta)}_{r_1 \dots r_n} = a_{r_1 \dots r_n} - i b_{r_1 \dots r_n}$.   (15)

Based on eq. (14), the cosine Fourier coefficient $a_r$ and sine Fourier coefficient $b_r$ can be estimated as follows,

$\hat{a}_r = \dfrac{1}{N} \sum_{j=1}^{N} g(\theta_1^{(j)}, \dots, \theta_n^{(j)}) \cos(r_1 \theta_1^{(j)} + \dots + r_n \theta_n^{(j)})$,
$\hat{b}_r = \dfrac{1}{N} \sum_{j=1}^{N} g(\theta_1^{(j)}, \dots, \theta_n^{(j)}) \sin(r_1 \theta_1^{(j)} + \dots + r_n \theta_n^{(j)})$,   (16)

where $\theta_i^{(j)}$ represents the jth sample for $\theta_i$. Using the multiple Fourier transformation, the partial variances in eq. (6) can be estimated by the sums of the Fourier amplitudes (i.e., $|C_{r_1 \dots r_n}|^2$) at different frequencies (see Xu and Gertner [11] for a proof),

$V_{x_i} = \sum_{|r_i| \ge 1} |C^{(\theta)}_{0 \dots 0 r_i 0 \dots 0}|^2$
$V_{x_i x_j} = \sum_{|r_i|, |r_j| \ge 1} |C^{(\theta)}_{0 \dots 0 r_i \dots r_j \dots 0}|^2$
$V_{x_i x_j x_k} = \sum_{|r_i|, |r_j|, |r_k| \ge 1} |C^{(\theta)}_{0 \dots 0 r_i \dots r_j \dots r_k \dots 0}|^2$
$\dots$
$V_{x_1 \dots x_n} = \sum_{|r_1|, \dots, |r_n| \ge 1} |C^{(\theta)}_{r_1 \dots r_n}|^2$.   (17)

Eq. (17) shows that the Fourier amplitude $|C^{(\theta)}_{0 \dots 0 r_i 0 \dots 0}|^2$ results from the main effects of parameters, the Fourier amplitude $|C^{(\theta)}_{0 \dots 0 r_i \dots r_j \dots 0}|^2$ results from the second-order interaction effects, the Fourier amplitude $|C^{(\theta)}_{0 \dots 0 r_i \dots r_j \dots r_k \dots 0}|^2$ results from the third-order interaction effects, and the Fourier amplitude $|C^{(\theta)}_{r_1 \dots r_n}|^2$ results from the nth-order interaction effects. Thus, to calculate the partial variances, we only need to estimate the Fourier coefficients based on eq. (13). Summing all the terms in eq. (17) based on eq. (7), it is easy to show that

$V^{(x_1, \dots, x_n)} = \sum_{(r_1, \dots, r_n) \ne (0, \dots, 0)} |C^{(\theta)}_{r_1 \dots r_n}|^2$.   (18)
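To make eqs. (13)-(17) concrete, the sketch below uses an illustrative toy model (our own assumption: $g(x_1, x_2) = x_1$ with uniform parameters, so the first-order partial variance of $x_1$ equals $V(y) = 1/12$). It estimates the cosine and sine coefficients from a simple random $\theta$-sample and sums the squared amplitudes over harmonics $(r_1, 0)$.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 50000, 6
theta = rng.uniform(0.0, 2.0 * np.pi, size=(N, 2))   # simple random theta-sample
x = 0.5 + np.arcsin(np.sin(theta)) / np.pi           # search function, eq. (2)
y = x[:, 0]                                          # toy model g(x1, x2) = x1

def fourier_coeffs(y, theta, r):
    """Sample estimates of the cosine/sine coefficients a_r, b_r of eq. (16)
    for a harmonic vector r = (r_1, ..., r_n)."""
    phase = theta @ np.asarray(r, dtype=float)
    return float(y @ np.cos(phase)) / len(y), float(y @ np.sin(phase)) / len(y)

# Partial variance of x1: sum a_r^2 + b_r^2 over harmonics (r1, 0)
# with |r1| = 1..M, counting both +r1 and -r1.
v_x1 = 0.0
for r1 in range(1, M + 1):
    for sign in (+1, -1):
        a, b = fourier_coeffs(y, theta, (sign * r1, 0))
        v_x1 += a * a + b * b
print(round(v_x1, 4), round(float(y.var()), 4))   # both should approach 1/12
```

For this model the triangle-wave signal of $x_1$ decays quickly with harmonic order, so a small M already captures essentially all of the partial variance.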
It is notable that there is only one period for the sampled parameter values using search functions of the form of eq. (2). Thus, there are strong signals for parameters in the Fourier amplitudes (i.e., $|C^{(\theta)}_{r_1 \dots r_n}|^2$) when the fundamental frequency is equal to one (i.e., $r_1 = 1$, or $r_2 = 1$, ..., or $r_n = 1$). The signals decrease at higher harmonics, which are integer (termed the harmonic order) multiples of the fundamental frequency [6]. The signals in the Fourier amplitudes become close to zero if any of $r_1, \dots, r_n$ are relatively large (i.e., at a relatively high harmonic order). The harmonic order at which the Fourier amplitudes become close to zero is defined as the maximum harmonic order, which is commonly four or six in practice and could be greater for highly nonlinear models [6]. With a specified maximum harmonic order M (please see the end of Section 2.2 for a description of the selection of M), the partial variances contributed by the main effects and interaction effects can be calculated as

$V_{x_i} = \sum_{|r_i| = 1}^{M} |C^{(\theta)}_{0 \dots 0 r_i 0 \dots 0}|^2$
$V_{x_i x_j} = \sum_{|r_i|, |r_j| = 1}^{M} |C^{(\theta)}_{0 \dots 0 r_i \dots r_j \dots 0}|^2$
$V_{x_i x_j x_k} = \sum_{|r_i|, |r_j|, |r_k| = 1}^{M} |C^{(\theta)}_{0 \dots 0 r_i \dots r_j \dots r_k \dots 0}|^2$
$\dots$
$V_{x_1 \dots x_n} = \sum_{|r_1|, \dots, |r_n| = 1}^{M} |C^{(\theta)}_{r_1 \dots r_n}|^2$.   (19)
Different sampling methods can be used to estimate the partial variances contributed by different model parameters [11]. The sampling methods include the traditional sampling by a search curve using an auxiliary variable s [5,6,8], random balance design sampling [10], and simple random sampling [11]. For the traditional sampling using an auxiliary variable s, in addition to sampling errors, there are also interference errors resulting from the selection of frequencies for the different model parameters (see Xu and Gertner [11] for details). The sampling errors are substantially smaller than the interference errors. In addition, for the traditional search-curve based sampling, it is difficult or infeasible to conduct FAST analysis for models with many parameters (e.g., >20 parameters) due to the large sample size required [10]. Therefore, in this paper, we are mainly concerned with simple random sampling and random balance design sampling.
2.2 Standard errors of sensitivity indices for simple random sampling

For simple random sampling, we draw a random sample of size N in the $\theta$-space, $\{\theta^{(j)} = (\theta_1^{(j)}, \dots, \theta_n^{(j)}),\ j = 1, \dots, N\}$, to estimate the Fourier coefficients using eq. (13). The first-order or higher-order sensitivity indices can then be estimated, using eq. (9) and eq. (19), based on a sample in the $\theta$-space as follows,

$\hat{S} = \dfrac{\sum_{r \in H(M)} |\hat{C}^{(\theta)}(r)|^2}{\hat{\sigma}^2}$   (20)

where $r \in H(M)(x_i) = \{(r_1, \dots, r_n),\ |r_i| = 1, \dots, M,\ r_j = 0 \text{ for } j \in \{1, \dots, n\} \setminus i\}$ for the calculation of the first-order sensitivity index of a specific parameter $x_i$ [i.e., $S_{x_i}$ in eq. (9)]; $r \in H(M)(x_i, x_j) = \{(r_1, \dots, r_n),\ |r_i|, |r_j| = 1, \dots, M,\ r_k = 0 \text{ for } k \in \{1, \dots, n\} \setminus \{i, j\}\}$ for the calculation of the second-order sensitivity index resulting from interactions of two parameters $x_i$ and $x_j$ [i.e., $S_{x_i x_j}$ in eq. (9)]; and $r \in H(M)(x_1, \dots, x_n) = \{(r_1, \dots, r_n),\ |r_1|, \dots, |r_n| = 1, \dots, M\}$ for the calculation of the nth-order sensitivity index resulting from interactions of all parameters [i.e., $S_{x_1 \dots x_n}$ in eq. (9)]. We define $H(M)$ as a general set representing one of the harmonic sets $H(M)(x_i)$, $H(M)(x_i, x_j)$, ..., or $H(M)(x_1, \dots, x_n)$ for the calculation of the first-order, second-order, and so on, until the nth-order sensitivity indices. M is a user-selected maximum harmonic order (generally four or six; we will discuss selection details at the end of this section after we introduce the necessary basics). $\hat{\sigma}^2$ is the sample variance of the model output y,

$\hat{\sigma}^2 = \dfrac{1}{N-1} \sum_{j=1}^{N} [g(\theta_1^{(j)}, \dots, \theta_n^{(j)}) - \bar{g}(\theta_1, \dots, \theta_n)]^2$   (21)

where $\bar{g}(\theta_1, \dots, \theta_n)$ is the sample mean of the model output y. Replacing $\hat{C}^{(\theta)}(r)$ in eq. (20) by the estimated cosine and sine Fourier coefficients in eq. (15), the sensitivity indices can be calculated as follows,

$\hat{S} = \dfrac{\sum_{r \in H(M)} (\hat{a}_r^2 + \hat{b}_r^2)}{\hat{\sigma}^2}$.   (22)
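The estimator of eqs. (20)-(22) can be sketched end to end for the first-order case. The toy additive model and all names below are illustrative assumptions; the estimator sums squared Fourier coefficients over the harmonic set $H(M)(x_i)$ and divides by the sample variance of y.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 50000, 6
theta = rng.uniform(0.0, 2.0 * np.pi, size=(N, 2))
x = 0.5 + np.arcsin(np.sin(theta)) / np.pi       # search function, eq. (2)
y = x[:, 0] + 0.5 * x[:, 1]                      # toy additive model

def first_order_index(y, theta, i, M):
    """Eqs. (20)-(22) for H(M)(x_i): sum a_r^2 + b_r^2 over |r_i| = 1..M
    (all other harmonics zero), divided by the sample variance of y."""
    total = 0.0
    for r in range(1, M + 1):
        phase = r * theta[:, i]
        a = float(y @ np.cos(phase)) / len(y)
        b = float(y @ np.sin(phase)) / len(y)
        total += 2.0 * (a * a + b * b)   # harmonics +r and -r contribute equally
    return total / float(np.var(y, ddof=1))

s1 = first_order_index(y, theta, 0, M)   # analytic value 0.8 for this model
s2 = first_order_index(y, theta, 1, M)   # analytic value 0.2
print(round(s1, 2), round(s2, 2))
```

A single $\theta$-sample thus yields all first-order indices at once; the standard error question addressed next is how reliable each of these single-sample estimates is.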
Based on eq. (22), we can easily see that both the estimation errors for the Fourier coefficients $\hat{a}_r$ and $\hat{b}_r$, and the estimation error for the sample variance $\hat{\sigma}^2$ of the model output, lead to estimation error in a sensitivity index $\hat{S}$. If we can estimate the standard errors for $\hat{a}_r$, $\hat{b}_r$, and $\hat{\sigma}^2$, we are able to approximate the standard error of the sensitivity index $\hat{S}$ using the delta method (see Appendix A for details of the delta method). For a large sample size N, we assume that $\hat{a}_r$, $\hat{b}_r$, and $\hat{\sigma}^2$ approximately follow a multivariate normal distribution with mean vector $(\{a_r\}|_{r \in H(M)}, \{b_r\}|_{r \in H(M)}, \sigma^2)$ and covariance matrix $\Sigma$, where $\{a_r\}|_{r \in H(M)}$ represents the list of cosine Fourier coefficients and $\{b_r\}|_{r \in H(M)}$ represents the list of sine Fourier coefficients for all $r \in H(M)$. Let

$\Lambda = \dfrac{\partial S}{\partial (\{a_r\}|_{r \in H(M)}, \{b_r\}|_{r \in H(M)}, \sigma^2)} = \left(\left\{\dfrac{2 a_r}{\sigma^2}\right\}\Big|_{r \in H(M)}, \left\{\dfrac{2 b_r}{\sigma^2}\right\}\Big|_{r \in H(M)}, -\dfrac{\sum_{r \in H(M)} (a_r^2 + b_r^2)}{(\sigma^2)^2}\right)$;   (23)

based on the delta method (see Appendix A for details), $\hat{S}$ will approximately follow a normal distribution as follows,

$\hat{S} \sim N(S, \Lambda \Sigma \Lambda^T)$   (24)

for a large sample size N. In order to estimate the variance of $\hat{S}$, we need to estimate the covariance matrix $\Sigma$ (i.e., the variances of and covariances among $\hat{a}_r$, $\hat{b}_r$, and $\hat{\sigma}^2$).
Based on the Central Limit Theorem, Xu and Gertner [11] showed that the estimated cosine Fourier coefficient $\hat{a}_r$ and sine Fourier coefficient $\hat{b}_r$ in eq. (16) approximately follow normal distributions for a large sample size N. Namely,

$\hat{a}_r \sim N(a_r, V(\hat{a}_r))$,
$\hat{b}_r \sim N(b_r, V(\hat{b}_r))$,   (25)

where $V(\hat{a}_r)$ and $V(\hat{b}_r)$ are the variances of $\hat{a}_r$ and $\hat{b}_r$, respectively. In view that $\{\theta_1^{(j)}, \dots, \theta_n^{(j)},\ j = 1, \dots, N\}$ are random samples drawn in the $\theta$-space, $V(\hat{a}_r)$ and $V(\hat{b}_r)$ can be empirically estimated based on eq. (16) as follows,

$\hat{V}(\hat{a}_r) = \dfrac{1}{N^2} \sum_{j=1}^{N} [g^2(G(\theta_1^{(j)}), \dots, G(\theta_n^{(j)})) \cos^2(r_1 \theta_1^{(j)} + \dots + r_n \theta_n^{(j)})] - \dfrac{1}{N} \hat{a}_r^2$,
$\hat{V}(\hat{b}_r) = \dfrac{1}{N^2} \sum_{j=1}^{N} [g^2(G(\theta_1^{(j)}), \dots, G(\theta_n^{(j)})) \sin^2(r_1 \theta_1^{(j)} + \dots + r_n \theta_n^{(j)})] - \dfrac{1}{N} \hat{b}_r^2$.   (26)

In order to estimate the standard errors of sensitivity indices, we also need to estimate the covariance between any two Fourier coefficients. Using eq. (16), we can estimate the covariance between a cosine Fourier coefficient $\hat{a}_r$ and a sine Fourier coefficient $\hat{b}_{r'}$ (where $r = (r_1, \dots, r_n)$ and $r'$ is another vector of harmonics, which could be different from or the same as r) as follows (see Appendix B for details),

$Cov(\hat{a}_r, \hat{b}_{r'}) = \dfrac{1}{N} E[g^2(\theta_1, \dots, \theta_n) \cos(r \theta^T) \sin(r' \theta^T)] - \dfrac{1}{N} a_r b_{r'}$.   (27)

Based on the above equation, an empirical estimate of the covariance between the cosine Fourier coefficient $\hat{a}_r$ and the sine Fourier coefficient $\hat{b}_{r'}$ can be derived,

$\widehat{Cov}(\hat{a}_r, \hat{b}_{r'}) = \dfrac{1}{N^2} \sum_{j=1}^{N} g^2(\theta_1^{(j)}, \dots, \theta_n^{(j)}) \cos(r (\theta^{(j)})^T) \sin(r' (\theta^{(j)})^T) - \dfrac{1}{N} \hat{a}_r \hat{b}_{r'}$.   (28)
Similarly, the covariance between two cosine Fourier coefficients $\hat{a}_r$ and $\hat{a}_{r'}$ ($r \ne r'$) and the covariance between two sine Fourier coefficients $\hat{b}_r$ and $\hat{b}_{r'}$ ($r \ne r'$) can be calculated by the following two equations,

$\widehat{Cov}(\hat{a}_r, \hat{a}_{r'}) = \dfrac{1}{N^2} \sum_{j=1}^{N} g^2(\theta_1^{(j)}, \dots, \theta_n^{(j)}) \cos(r (\theta^{(j)})^T) \cos(r' (\theta^{(j)})^T) - \dfrac{1}{N} \hat{a}_r \hat{a}_{r'}$,   (29)

$\widehat{Cov}(\hat{b}_r, \hat{b}_{r'}) = \dfrac{1}{N^2} \sum_{j=1}^{N} g^2(\theta_1^{(j)}, \dots, \theta_n^{(j)}) \sin(r (\theta^{(j)})^T) \sin(r' (\theta^{(j)})^T) - \dfrac{1}{N} \hat{b}_r \hat{b}_{r'}$,   (30)
respectively. For the standard error of $\hat{\sigma}^2$: based on eq. (21) and the Central Limit Theorem, $\hat{\sigma}^2$ approximately follows a normal distribution for a large sample size N. Namely,

$\hat{\sigma}^2 \sim N\left(\sigma^2, \dfrac{1}{N-1} V\{[g(\theta_1, \dots, \theta_n) - \bar{g}(\theta_1, \dots, \theta_n)]^2\}\right)$.   (31)

Therefore, an empirical estimate of the variance of $\hat{\sigma}^2$ [i.e., $V(\hat{\sigma}^2)$] can be calculated as

$\hat{V}(\hat{\sigma}^2) = \dfrac{1}{N-1} \hat{V}\{[g(\theta_1, \dots, \theta_n) - \bar{g}(\theta_1, \dots, \theta_n)]^2\} = \dfrac{1}{N-1} \left\{\dfrac{1}{N} \sum_{j=1}^{N} [g(\theta_1^{(j)}, \dots, \theta_n^{(j)}) - \bar{g}(\theta_1, \dots, \theta_n)]^4 - (\hat{\sigma}^2)^2\right\} = \dfrac{1}{N-1} [\hat{u}^{(4)} - (\hat{\sigma}^2)^2]$,   (32)

where $\hat{u}^{(4)}$ is the estimated fourth central moment of $g(\theta_1, \dots, \theta_n)$. Finally, the covariance between a cosine Fourier coefficient and $\hat{\sigma}^2$ can be estimated as follows (see Appendix C for details),

$Cov(\hat{a}_r, \hat{\sigma}^2) = \dfrac{1}{N-1} E\{g(\theta_1, \dots, \theta_n) \cos(r \theta^T) [g(\theta_1, \dots, \theta_n) - \bar{g}(\theta_1, \dots, \theta_n)]^2\} - \dfrac{a_r \sigma^2}{N}$.   (33)

Thus, we can get an empirical estimate of the covariance between a cosine Fourier coefficient and $\hat{\sigma}^2$ with the following equation,

$\widehat{Cov}(\hat{a}_r, \hat{\sigma}^2) = \dfrac{1}{N(N-1)} \sum_{j=1}^{N} \{g(\theta_1^{(j)}, \dots, \theta_n^{(j)}) \cos(r (\theta^{(j)})^T) [g(\theta_1^{(j)}, \dots, \theta_n^{(j)}) - \bar{g}(\theta_1, \dots, \theta_n)]^2\} - \dfrac{\hat{a}_r \hat{\sigma}^2}{N}$.   (34)

Similarly, we can derive an empirical estimate of the covariance between a sine Fourier coefficient and $\hat{\sigma}^2$ as follows,

$\widehat{Cov}(\hat{b}_r, \hat{\sigma}^2) = \dfrac{1}{N(N-1)} \sum_{j=1}^{N} \{g(\theta_1^{(j)}, \dots, \theta_n^{(j)}) \sin(r (\theta^{(j)})^T) [g(\theta_1^{(j)}, \dots, \theta_n^{(j)}) - \bar{g}(\theta_1, \dots, \theta_n)]^2\} - \dfrac{\hat{b}_r \hat{\sigma}^2}{N}$.   (35)
With the estimated covariance matrix $\hat{\Sigma}$ among the Fourier coefficients and the sample variance of the output y, based on eq. (27), eq. (28), eq. (29), eq. (30), eq. (32), eq. (34), and eq. (35), we have an empirical estimate of the standard error of the sensitivity index $\hat{S}$ as follows,

$\hat{se}(\hat{S}) = \sqrt{\hat{\Lambda} \hat{\Sigma} \hat{\Lambda}^T}$   (36)

where

$\hat{\Lambda} = \left(\left\{\dfrac{2 \hat{a}_r}{\hat{\sigma}^2}\right\}\Big|_{r \in H(M)}, \left\{\dfrac{2 \hat{b}_r}{\hat{\sigma}^2}\right\}\Big|_{r \in H(M)}, -\dfrac{\sum_{r \in H(M)} (\hat{a}_r^2 + \hat{b}_r^2)}{(\hat{\sigma}^2)^2}\right)$.

Based on eq. (36), we can see that the standard errors of sensitivity indices depend on three components: 1) the magnitude of the Fourier amplitudes, which depends on parametric importance in terms of uncertainty contribution (i.e., the true values of the sensitivity indices); 2) the magnitude of the model outputs and the variance of the model outputs; and 3) the random error measured by $\hat{\Sigma}$, which depends on the sample size N.
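The delta-method computation of eqs. (23), (24) and (36) can be sketched as follows. This is an illustrative implementation with a toy model (all names are ours): for compactness it estimates the full covariance matrix $\hat{\Sigma}$ of the sample means with np.cov, which agrees with the element-wise formulas of eqs. (26)-(35) up to O(1/N) factors.

```python
import numpy as np

rng = np.random.default_rng(4)
N, M = 20000, 4
theta = rng.uniform(0.0, 2.0 * np.pi, size=(N, 2))
x = 0.5 + np.arcsin(np.sin(theta)) / np.pi
y = x[:, 0] + 0.5 * x[:, 1]                    # toy additive model

# Per-sample terms whose means are a_r and b_r for r in H(M)(x1), plus the
# squared deviations whose mean approximates sigma^2.  The covariance of
# these sample means plays the role of Sigma in eq. (24).
harmonics = [s * r for s in (+1, -1) for r in range(1, M + 1)]
cols = [y * np.cos(r * theta[:, 0]) for r in harmonics]        # means -> a_r
cols += [y * np.sin(r * theta[:, 0]) for r in harmonics]       # means -> b_r
cols.append((y - y.mean()) ** 2)                               # mean  -> sigma^2
Z = np.column_stack(cols)

est = Z.mean(axis=0)
cov = np.cov(Z, rowvar=False) / N              # covariance matrix of the means
k = len(harmonics)
a, b, s2 = est[:k], est[k:2 * k], est[-1]

s_hat = float((a @ a + b @ b) / s2)            # sensitivity index, eq. (22)
grad = np.concatenate([2.0 * a / s2, 2.0 * b / s2,
                       [-(a @ a + b @ b) / s2**2]])    # Lambda, eq. (23)
se = float(np.sqrt(grad @ cov @ grad))         # delta-method SE, eq. (36)
print(f"S_x1 = {s_hat:.3f} +/- {se:.4f}")
```

The standard error comes from the same single sample used to estimate the index itself, which is what makes the approach attractive for expensive models.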
The FAST method gives a biased estimator of the partial variances contributed by parameters, due to the random noise introduced by sampling [11]. Thus, the selection of the maximum harmonic order (M) in eq. (19) is very important for the estimation of sensitivity indices. If we select a very large M, it may cause overestimation by incorporating much noise from the Fourier amplitudes at high harmonics. If we select a low M, it may cause too much underestimation if the model is highly nonlinear (in which case there will still be signals at high harmonics). Xu and Gertner [11] recommended using a reasonably large M (specifically, M < 50 for first-order sensitivity indices, M < 5 for second-order sensitivity indices, and M < 3 for third-order sensitivity indices) and estimating the Fourier amplitudes (i.e., $|\hat{C}^{(\theta)}_{r_1 \dots r_n}|^2 = \hat{a}_{r_1 \dots r_n}^2 + \hat{b}_{r_1 \dots r_n}^2$) using only those sine and cosine Fourier coefficients significantly larger or smaller than zero based on hypothesis tests. The hypothesis tests are based on z-score statistics for $\hat{a}_r$ and $\hat{b}_r$ as follows,

$Z_a = \dfrac{\hat{a}_r}{\sqrt{\hat{V}(\hat{a}_r)}}$,
$Z_b = \dfrac{\hat{b}_r}{\sqrt{\hat{V}(\hat{b}_r)}}$,   (37)

which will follow standard normal distributions (assuming a relatively large sample size N). The incorporation of only those Fourier coefficients (i.e., $\hat{a}_r$ and $\hat{b}_r$) significantly larger or smaller than zero will not affect the proposed method of standard error estimation. Namely, we only need to calculate the standard errors based on those Fourier coefficients significantly larger or smaller than zero.
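The screening of eq. (37) can be sketched as follows. The toy model, the 1.96 cutoff, and the use of harmonics up to M = 49 for a first-order index are our illustrative assumptions following the recommendation quoted above; coefficients whose z-scores do not pass the test are simply dropped from the sum in eq. (22).

```python
import numpy as np

rng = np.random.default_rng(5)
N = 20000
theta = rng.uniform(0.0, 2.0 * np.pi, size=N)
x = 0.5 + np.arcsin(np.sin(theta)) / np.pi     # search function, eq. (2)
y = x + rng.normal(0.0, 0.1, size=N)           # one signal parameter plus noise

kept = 0.0
for r in range(1, 50):                         # first-order harmonics, M < 50
    c, s = np.cos(r * theta), np.sin(r * theta)
    a, b = float(y @ c) / N, float(y @ s) / N
    # eq. (26): empirical variances of the estimated coefficients
    va = float(y**2 @ c**2) / N**2 - a * a / N
    vb = float(y**2 @ s**2) / N**2 - b * b / N
    # eq. (37): keep only coefficients significantly different from zero
    if abs(a) / np.sqrt(va) > 1.96:
        kept += a * a
    if abs(b) / np.sqrt(vb) > 1.96:
        kept += b * b

s1 = 2.0 * kept / float(np.var(y, ddof=1))     # factor 2: harmonics +r and -r
print(round(s1, 2))
```

Only the few low-order harmonics that carry real signal survive the test, so the large M costs little extra noise.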
2.3 Standard errors of sensitivity indices for random balance design sampling

For random balance design sampling, we first draw N grid samples for each parameter in the $\theta$-space,

$\{\theta_i^{(1)}, \dots, \theta_i^{(N)} \mid \theta_i^{(j)} = \theta_0 + \dfrac{2\pi}{N}(j - 1),\ j = 1, \dots, N\}$   (38)

where $\theta_0$ is randomly drawn from $[0, \frac{2\pi}{N}]$ so that the grid sample for $\theta_i$ can ergodically explore between 0 and $2\pi$. Then the samples are randomly permuted to form N samples in the $\theta$-space. In this way, random balance design sampling draws random samples in the high-dimensional $\theta$-space similar to simple random sampling. Therefore, estimation errors for Fourier coefficients using random balance design will be similar to those based on simple random sampling. However, by using grid samples for each individual parameter, the estimation errors for the Fourier coefficients used to calculate the partial variances contributed by main effects are different from those based on simple random sampling. Based on the definition of the Fourier coefficients $C^{(\theta)}_{0 \dots 0 r_i 0 \dots 0}$ in eq. (11) and eq. (12), we have

$C^{(\theta)}_{0 \dots 0 r_i 0 \dots 0} = E_\theta(g(G(\theta_1), \dots, G(\theta_n)) e^{-i r_i \theta_i}) = \left(\dfrac{1}{2\pi}\right)^n \int_0^{2\pi} \dots \int_0^{2\pi} g(G(\theta_1), \dots, G(\theta_n)) e^{-i r_i \theta_i} d\theta_1 \dots d\theta_n = \dfrac{1}{2\pi} \int_0^{2\pi} E(g(G(\theta_1), \dots, G(\theta_n)) \mid \theta_i) e^{-i r_i \theta_i} d\theta_i = \dfrac{1}{2\pi} \int_0^{2\pi} u(\theta_i) e^{-i r_i \theta_i} d\theta_i = E_{\theta_i}(u(\theta_i) e^{-i r_i \theta_i})$   (39)

where $u(\theta_i) = E(g(G(\theta_1), \dots, G(\theta_n)) \mid \theta_i)$.
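The grid-and-permute construction of eq. (38) can be sketched as follows. The toy model, the per-parameter random offsets, and all names are our illustrative assumptions; the point is that each column of the design still holds exactly one full period, so first-order signals sit at harmonics of the corresponding angle.

```python
import numpy as np

rng = np.random.default_rng(6)
N, n = 4096, 3

# Random balance design, eq. (38): for each parameter, a grid of N equally
# spaced angles with a random offset in [0, 2*pi/N], independently permuted.
theta0 = rng.uniform(0.0, 2.0 * np.pi / N, size=n)
grid = theta0 + (2.0 * np.pi / N) * np.arange(N)[:, None]    # shape (N, n)
theta = np.column_stack([rng.permutation(grid[:, i]) for i in range(n)])

x = 0.5 + np.arcsin(np.sin(theta)) / np.pi    # search function, eq. (2)
y = x[:, 0] + 0.5 * x[:, 1] + 0.25 * x[:, 2]  # toy additive model

# Each column of theta still covers exactly one full period, so the
# first-order signal of x1 sits at harmonics r_1 = 1, 2, ... of theta_1.
a = float(y @ np.cos(theta[:, 0])) / N
b = float(y @ np.sin(theta[:, 0])) / N
amp = 2.0 * (a * a + b * b)                   # |C|^2 summed over r_1 = +1, -1
print(round(amp, 4))
```

The independent permutations destroy any joint structure among columns, which is why the interaction terms behave like simple random sampling while the main-effect terms benefit from the grid.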
Let the model output be

$y = u(\theta_i) + \varepsilon_i$   (40)

where $\varepsilon_i$ is a random error independent of $\theta_i$ with $E(\varepsilon_i) = 0$. Using eq. (40), it can easily be shown that

$V(\varepsilon_i) = V(y) - V\{E[g(G(\theta_1), \dots, G(\theta_n)) \mid \theta_i]\} = V(y) - V_{x_i}$.   (41)

Based on eq. (40), the Fourier coefficients can be estimated as

$\hat{C}^{(\theta)}_{0 \dots r_i \dots 0} = \dfrac{1}{N} \sum_{j=1}^{N} g(G(\theta_1^{(j)}), \dots, G(\theta_n^{(j)})) e^{-i r_i \theta_i^{(j)}} = \dfrac{1}{N} \sum_{j=1}^{N} u(\theta_i^{(j)}) e^{-i r_i \theta_i^{(j)}} + \dfrac{1}{N} \sum_{j=1}^{N} \varepsilon_i^{(j)} e^{-i r_i \theta_i^{(j)}} = \hat{C}^{(\theta, u)}_{0 \dots r_i \dots 0} + \hat{C}^{(\theta, \varepsilon)}_{0 \dots r_i \dots 0}$   (42)

where

$\hat{C}^{(\theta, u)}_{0 \dots r_i \dots 0} = \dfrac{1}{N} \sum_{j=1}^{N} u(\theta_i^{(j)}) e^{-i r_i \theta_i^{(j)}}$,
$\hat{C}^{(\theta, \varepsilon)}_{0 \dots r_i \dots 0} = \dfrac{1}{N} \sum_{j=1}^{N} \varepsilon_i^{(j)} e^{-i r_i \theta_i^{(j)}}$.
Eq. (42) indicates that the standard error of the Fourier coefficient \hat{C}^{(\theta)}_{(0\dots r_i\dots 0)} results from the estimation errors of both C^{(\theta,u)}_{(0\dots r_i\dots 0)} and C^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)}. For \hat{C}^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)} = \hat{a}^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)} - i\,\hat{b}^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)} with

\hat{a}^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)} = \frac{1}{N}\sum_{j=1}^{N} \varepsilon_i^{(j)}\cos(r_i\theta_i^{(j)}), \qquad
\hat{b}^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)} = \frac{1}{N}\sum_{j=1}^{N} \varepsilon_i^{(j)}\sin(r_i\theta_i^{(j)}),

based on the Central Limit Theorem we have

\hat{a}^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)} \sim N\Big(0,\; \frac{V(\varepsilon_i)}{2N}\Big), \qquad
\hat{b}^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)} \sim N\Big(0,\; \frac{V(\varepsilon_i)}{2N}\Big),   (43)

for a relatively large sample size N. This indicates that the estimation error of C^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)} decreases at an order of 1/\sqrt{N} with an increasing sample size N. For the estimation of C^{(\theta,u)}_{(0\dots r_i\dots 0)} (where C^{(\theta,u)}_{(0\dots r_i\dots 0)} = E[u(\theta_i)e^{-i r_i \theta_i}]), since we use grid samples for each individual parameter, the estimation error decreases at an order of 1/N^2 with an increasing sample size N (see Xu and Gertner [11] for details). This suggests that the estimation error for C^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)} is substantially larger than that for C^{(\theta,u)}_{(0\dots r_i\dots 0)} for a relatively large sample size N. Thus, we can ignore the estimation error for C^{(\theta,u)}_{(0\dots r_i\dots 0)}. Finally, for \hat{C}^{(\theta)}_{(0\dots r_i\dots 0)} = \hat{a}_{(0\dots r_i\dots 0)} - i\,\hat{b}_{(0\dots r_i\dots 0)}, we have

\hat{a}_{(0\dots r_i\dots 0)} \sim N\Big(a_{(0\dots r_i\dots 0)},\; \frac{V(\varepsilon_i)}{2N}\Big), \qquad
\hat{b}_{(0\dots r_i\dots 0)} \sim N\Big(b_{(0\dots r_i\dots 0)},\; \frac{V(\varepsilon_i)}{2N}\Big),   (44)
in view that C^{(\theta,u)}_{(0\dots r_i\dots 0)} = C^{(\theta)}_{(0\dots r_i\dots 0)} by eq. (39). To estimate V(\varepsilon_i) based on eq. (41), we first need to estimate V_{x_i}. Following Xu and Gertner [11], a conservative preliminary estimate of V_{x_i} can be calculated as

V_{x_i} \approx \sum_{|r_i|=1}^{M} \big|\hat{C}^{(\theta)}_{(0\dots r_i\dots 0)}\big|^2   (45)

where M is a relatively small harmonic order (e.g., 4), under which the Fourier amplitudes have relatively small proportions of errors. In order to improve the estimation accuracy of V_{x_i}, the statistical tests with eq. (37) for simple random sampling can be used to select those cosine and sine Fourier coefficients significantly larger or smaller than zero for the estimation of the Fourier amplitudes (i.e., |\hat{C}^{(\theta)}_{(0\dots r_i\dots 0)}|^2 = (a_{(0\dots r_i\dots 0)})^2 + (b_{(0\dots r_i\dots 0)})^2). Finally, V(\varepsilon_i) can be estimated as

\hat{V}(\varepsilon_i) = \hat{V}(y) - V_{x_i}.   (46)
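As an illustration, eq. (45) and eq. (46) can be written directly as discrete Fourier sums over the sampled θ_i values. This is a minimal sketch under our own naming, with the output centered by its sample mean and the significance screening of eq. (37) omitted for brevity.

```python
import numpy as np

def partial_variance(y, theta_i, M=4):
    """V_xi per eq. (45): sum of squared Fourier amplitudes over
    harmonics |r_i| = 1..M along the sample theta_i."""
    yc = y - y.mean()                                  # center the output
    v = 0.0
    for r in range(1, M + 1):
        c = np.mean(yc * np.exp(-1j * r * theta_i))    # coefficient at harmonic r
        v += 2.0 * np.abs(c) ** 2                      # harmonics +r and -r contribute equally
    return v

def residual_variance(y, theta_i, M=4):
    """V(eps_i) per eq. (46): total output variance minus V_xi."""
    return np.var(y) - partial_variance(y, theta_i, M)
```

For y = cos(θ_i) on a full-period grid, the main effect captures the entire variance of 1/2, so the residual variance estimate is essentially zero.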
It can be shown that the covariance between the cosine Fourier coefficient a_{(0\dots r_i\dots 0)} and the sine Fourier coefficient b_{(0\dots r_i'\dots 0)} is zero (where the harmonic r_i can be the same as or different from r_i', with |r_i|, |r_i'| = 1,\dots,M; see Appendix D for details). We can also show that the covariance between a_{(0\dots r_i\dots 0)} and a_{(0\dots r_i'\dots 0)}, and the covariance between b_{(0\dots r_i\dots 0)} and b_{(0\dots r_i'\dots 0)}, are both zero (where r_i \neq r_i', |r_i|, |r_i'| = 1,\dots,M; see Appendix D for details).

For the covariance between the cosine Fourier coefficient \hat{a}_{(0\dots r_i\dots 0)} and the sample variance \hat{\sigma}^2, as in simple random sampling for first-order sensitivity indices, based on eq. (42) we have

Cov(\hat{a}_{(0\dots r_i\dots 0)}, \hat{\sigma}^2) = Cov(\hat{a}^{(\theta,u)}_{(0\dots r_i\dots 0)}, \hat{\sigma}^2) + Cov(\hat{a}^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)}, \hat{\sigma}^2).   (47)

Notice that the covariance estimate Cov(\hat{a}_{(0\dots r_i\dots 0)}, \hat{\sigma}^2) based on eq. (33) depends mainly on the signal of the model output y for \theta_i at harmonic r = (0,\dots,0,r_i,0,\dots,0) with r_i = 1,\dots,M. Since \hat{a}^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)} is estimated from the part of the model output y with no signal for \theta_i (i.e., \varepsilon_i, which is independent of \theta_i; see eq. (40) for details), the second term on the right-hand side of eq. (47) [i.e., Cov(\hat{a}^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)}, \hat{\sigma}^2)] will be close to zero. Therefore, we have

Cov(\hat{a}_{(0\dots r_i\dots 0)}, \hat{\sigma}^2) \approx Cov(\hat{a}^{(\theta,u)}_{(0\dots r_i\dots 0)}, \hat{\sigma}^2).

At the same time, for a grid sample of parameter x_i in random balance design sampling, \hat{a}^{(\theta,u)}_{(0\dots r_i\dots 0)} has a substantially lower estimation error than \hat{a}^{(\theta,\varepsilon)}_{(0\dots r_i\dots 0)} [11]. Thus, we can treat \hat{a}^{(\theta,u)}_{(0\dots r_i\dots 0)} as a constant for a relatively large sample size N, and we have

Cov(\hat{a}_{(0\dots r_i\dots 0)}, \hat{\sigma}^2) \approx Cov(\hat{a}^{(\theta,u)}_{(0\dots r_i\dots 0)}, \hat{\sigma}^2) \approx 0.

Please see Appendix E for a proof of the above equation. Similarly, we can also show that Cov(\hat{b}_{(0\dots r_i\dots 0)}, \hat{\sigma}^2) \approx 0.

Finally, the covariance matrix \hat{\Sigma} becomes a diagonal matrix, and the standard error of the first-order sensitivity index \hat{S}_{x_i} can be estimated based on the delta method as follows (see Appendix F for details),

\hat{\sigma}^2(\hat{S}_{x_i}) = \frac{2}{N}(1-\hat{S}_{x_i})\hat{S}_{x_i} + \frac{\hat{S}_{x_i}^2}{N(\hat{\sigma}^2)^2}\big[\hat{u}^{(4)} - (\hat{\sigma}^2)^2\big].   (48)
Eq. (48) shows that the standard errors of sensitivity indices calculated by FAST analysis depend on four factors: 1) the magnitude of the sensitivity index, 2) the sample variance (\hat{\sigma}^2) of the model output, 3) the fourth central moment (\hat{u}^{(4)}) of the model output, and 4) the sample size N.
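A direct reading of eq. (48) gives a one-function standard error estimate from a single sample. The sketch below uses our own naming and estimates the two moments from the same model outputs used for the index.

```python
import numpy as np

def delta_stderr_first_order(S_i, y):
    """Delta-method standard error of a first-order index under random
    balance design sampling, per eq. (48): the square root of
    2*(1-S)*S/N + S^2 * (u4 - var^2) / (N * var^2)."""
    N = len(y)
    var = np.var(y)                      # sample variance of the output
    u4 = np.mean((y - y.mean()) ** 4)    # fourth central moment of the output
    v = 2.0 * (1.0 - S_i) * S_i / N + S_i ** 2 * (u4 - var ** 2) / (N * var ** 2)
    return np.sqrt(v)
```

For a roughly Gaussian output, u4 is close to 3 var^2, so the second term reduces to about 2 S^2 / N and the variance of the index estimate is approximately 2 S / N.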
2.4 Procedures

In this section, we summarize the procedure for FAST analysis and the corresponding standard error estimation for the calculated sensitivity indices.

FAST procedure using simple random sampling
a. Draw independent random samples in the θ-space (K^n = \{\theta_1,\dots,\theta_n \mid 0 \le \theta_i \le 2\pi,\; i=1,\dots,n\}).
b. Generate corresponding random samples in the parameter space using the search functions in eq. (2) and the sample in the θ-space.
c. Run the model using the parameter samples from b.
d. Calculate the Fourier coefficients based on eq. (16) and the corresponding sensitivity indices with eq. (22). Statistical tests with eq. (37) are used to select Fourier coefficients significantly larger or smaller than zero.
e. Calculate the covariance matrix \hat{\Sigma} among the Fourier coefficients and the sample variance of output y based on eq. (27), eq. (28), eq. (29), eq. (30), eq. (32), eq. (34), and eq. (35).
f. Estimate standard errors of sensitivity indices using eq. (36).

FAST procedure using random balance design sampling
a. Draw N grid samples for \{\theta_i\} using eq. (38), which are then randomly permuted to form a random sample in the θ-space.
b. Generate a corresponding random sample in the parameter space using the search functions in eq. (2) and the sample in the θ-space.
c. Run the model using the parameter samples from b.
d. Calculate the Fourier coefficients based on eq. (16) and the corresponding sensitivity indices with eq. (22). Statistical tests with eq. (44) are used to select Fourier coefficients significantly larger or smaller than zero for the first-order sensitivity index calculation, and statistical tests with eq. (37) are used to select Fourier coefficients significantly larger or smaller than zero for the higher-order sensitivity index calculation.
e. Calculate the covariance matrix \hat{\Sigma} among the Fourier coefficients and the sample variance of output y: for first-order sensitivity indices use eq. (44) and eq. (32), with \hat{\Sigma} equal to a diagonal matrix; for higher-order sensitivity indices use eq. (27), eq. (28), eq. (29), eq. (30), eq. (32), eq. (34), and eq. (35).
f. Estimate standard errors of first-order and higher-order sensitivity indices using eq. (48) and eq. (36), respectively.
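The random balance design procedure (steps a to d, with the first-order index taken as the ratio of the partial variance to the output variance) can be prototyped end to end for a toy model. This is an illustrative sketch under our own assumptions: uniform parameters on [0, 1] via the common RBD search function x = 1/2 + arcsin(sin θ)/π, harmonic order M = 4, and no significance screening; all names are ours, and the toy model is not from the paper.

```python
import numpy as np

def rbd_first_order(model, n_params, N=4096, M=4, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    # a. grid samples in the theta-space (eq. (38)), permuted per parameter
    theta = np.empty((N, n_params))
    for i in range(n_params):
        grid = rng.uniform(0.0, 2 * np.pi / N) + 2 * np.pi * np.arange(N) / N
        theta[:, i] = rng.permutation(grid)
    # b. search function: theta -> Uniform(0, 1) samples (a common RBD choice)
    x = 0.5 + np.arcsin(np.sin(theta)) / np.pi
    # c. run the model on the parameter sample
    y = model(x)
    yc = y - y.mean()
    V = yc.var()
    # d. Fourier coefficients along each theta_i; index = V_xi / V(y)
    S = np.empty(n_params)
    for i in range(n_params):
        amps = np.array([np.mean(yc * np.exp(-1j * r * theta[:, i]))
                         for r in range(1, M + 1)])
        S[i] = 2.0 * np.sum(np.abs(amps) ** 2) / V
    return S

# additive toy model y = x1 + 2*x2: analytic first-order indices are 0.2 and 0.8
S = rbd_first_order(lambda x: x[:, 0] + 2.0 * x[:, 1], n_params=2,
                    rng=np.random.default_rng(2))
```

Because the model is additive, the two first-order indices should sum to roughly one; the small shortfall comes from truncating the Fourier series at M harmonics and from the residual-noise terms discussed above.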
3 Applications

In order to test how well the delta method approximates the standard errors of sensitivity indices, we use a simple test model and a realistic complex model. The simple test model is

y = \sum_{i=1}^{3} (i\,x_i)^2 + x_1 x_2 + 2 x_2 x_3 + 3 x_1 x_3 + x_1 x_2 x_3   (49)

where x_1, x_2, and x_3 are three independent parameters of the model. We assume all parameters follow standard normal distributions. Although the model is simple, it is representative since it is nonlinear and non-monotonic. The analytical sensitivity indices for this model have been derived by Xu and Gertner [11].
For the complex model, we applied the delta method to a world population model, World 3 [23]. The World 3 model is a computer program for simulating the interactions among population, industrial growth, food production, and limits in the ecosystems of the Earth. The model consists of six main systems: the food system, agriculture system, industrial system, population system, nonrenewable resources system, and pollution system. In this paper, we are concerned with the industrial system, which provides products for the world population [23]. At the same time, this industrial system creates pollution, which reduces land productivity. For details, please refer to [23]. The seven parameters of interest in the model are shown in Table 1. The output of interest is the world human population on a 5-year basis. For real application models, the ranges, distributions, and correlations among parameters can be derived from empirical or historical data [24,25]. However, to simplify our test case, we assume a uniform distribution for each parameter. The bounds for each parameter are assumed to be a 10% deviation from the central value (Table 1). Since FAST analysis does not allow the calculation of higher-order sensitivity indices for correlated parameters, we assume independence among the parameters in this paper.
Xu and Gertner [11] showed that, for partial variances contributed by the main effects of parameters in simple random sampling, and for partial variances contributed by higher-order interactions in random balance design sampling, the estimation bias for the Fourier amplitude |C_{(r_1\dots r_n)}|^2 is related to the expected square of the model output, E[g(G(\theta_1),\dots,G(\theta_n))]^2 / N. Since a sensitivity index is calculated as the ratio of a sum of Fourier amplitudes |C_{(r_1\dots r_n)}|^2 to the variance of the model output \sigma^2 (see eq. (9) for details), it is possible for the calculated sensitivity index to be larger than one if E[g(G(\theta_1),\dots,G(\theta_n))]^2 / N is much larger than \sigma^2 in a model. To overcome this problem, Xu and Gertner [11] recommended that the model output be centered by its sample mean and/or scaled by its standard deviation, so that E[g(G(\theta_1),\dots,G(\theta_n))]^2 / N is generally smaller than the variance \sigma^2. For the World 3 model, the variance of the model output is much smaller than E[g(G(\theta_1),\dots,G(\theta_n))]^2 / N, especially at the beginning of the model simulation. In order to overcome this potentially large estimation bias, we subtracted the sample mean from the model output.
In order to test the reliability of the standard errors derived with the delta method, we used the standard deviations of the sensitivity indices based on 20 sample replicates as references. In our preliminary analysis, we found that 20 replicates are enough for a reasonable approximation of the standard errors (i.e., increasing the number of replicates does not substantially affect the standard error estimates).
4 Results

For the simple test model, our results showed that the delta method provides a good approximation of the standard errors of both first-order and higher-order sensitivity indices in simple random sampling (Figure 1) as well as random balance design sampling (Figure 2). The approximation is better for larger sample sizes, since the delta method relies on a large-sample assumption. We can also see that the standard errors of first-order sensitivity indices for random balance design sampling are substantially reduced compared to those based on simple random sampling. Finally, the standard errors are generally higher for larger sensitivity indices. However, the corresponding coefficients of variation (i.e., the precision relative to the mean values of the sensitivity indices) are much lower (Figure 1a and Figure 3).
For the World 3 model, parameters x2, x3, and x5 have relatively high first-order sensitivity indices before year 2000. After that, the interaction between x2 and x3 and the interaction between x3 and x5 become important (Figure 4). We derived the sensitivity indices for all first-order, second-order, and third-order interactions based on a single sample of size N (N = 1000 or N = 5000). For convenience, we did not plot sensitivity indices less than 0.05. The standard errors of first-order and higher-order sensitivity indices estimated with the delta method reasonably approximate those based on 20 sample replicates for both simple random sampling (Figure 4 and Figure 5) and random balance design sampling (Figure 6 and Figure 7). Increasing the sample size from 1000 to 5000 decreases the estimated standard errors for both simple random sampling (Figure 4 and Figure 5) and random balance design sampling (Figure 6 and Figure 7). However, the difference in standard errors for first-order sensitivity indices between random balance design sampling and simple random sampling is less evident than for the test model of eq. (49). There are two reasons for this. First, the centered output values reduce the magnitude of possible random errors by reducing the bias term E[g(G(\theta_1),\dots,G(\theta_n))]^2 / N. Second, there is only a small amount of variation in the model output at the beginning of the simulation, so that random balance design sampling does not gain much sampling efficiency over simple random sampling. After that (about 100 years after the initial simulation year), random balance design sampling does slightly increase the precision (i.e., reduce the standard errors) of the first-order sensitivity indices of parameters x2 and x3 (compare Figure 4 with Figure 6, and Figure 5 with Figure 7).
5 Discussion

Due to the random errors introduced by sampling, it is important that uncertainty and sensitivity analysis techniques provide a measure of the reliability of estimated sensitivity indices (commonly the standard errors of the estimated indices). For complex models that take a long time to run a single simulation, it is computationally expensive or infeasible to estimate standard errors of sensitivity indices based on replicates of a relatively large sample (e.g., 10 or 20 replicates). The delta method can provide a good approximation of the standard errors of sensitivity indices at no additional cost in model simulations. This can substantially increase the efficiency of FAST analysis for complex
models. In addition, the delta method provides the user with a general approach to obtain the minimum sample size needed to achieve a desired precision level for a specified sensitivity index using random balance design sampling. For a model with sample variance of output y (\hat{\sigma}^2) and fourth central moment (\hat{u}^{(4)}) estimated from a preliminary sample, if we want a desired standard error \hat{\sigma}(\hat{S}_{x_i}) for a specified sensitivity index (e.g., \hat{\sigma}(\hat{S}_{x_i}) less than 50% of a sensitivity index of 0.02), the minimum sample size \hat{N}_{\min} can be approximately estimated from eq. (48) as

\hat{N}_{\min} = \frac{2}{\hat{\sigma}^2(\hat{S}_{x_i})}(1-\hat{S}_{x_i})\hat{S}_{x_i} + \frac{\hat{S}_{x_i}^2}{\hat{\sigma}^2(\hat{S}_{x_i})(\hat{\sigma}^2)^2}\big[\hat{u}^{(4)} - (\hat{\sigma}^2)^2\big].   (50)

Eq. (50) gives the model user an operational approach to determine the minimum sample size needed to achieve a desired precision level for a specified sensitivity index. Namely, the model user can first obtain preliminary estimates of \hat{\sigma}^2 and \hat{u}^{(4)} based on a relatively small sample (its size dependent on the simulation time of the model), which takes little computational time. Then, the user can define a desired standard error for a sensitivity index of interest. For example, for the last simulation year in the World 3 model, we have \hat{\sigma}^2 = 1.605 \times 10^{18} and \hat{u}^{(4)} = 6.518 \times 10^{36}. In order to achieve a standard error of 0.03 for a sensitivity index of 0.5, a minimum sample size of 981 is required based on eq. (50) (see Figure 8). As far as we know, this is the first statistical approach proposed to determine the sample size for an uncertainty and sensitivity analysis method based on random balance design sampling.
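Eq. (50) is simple enough to evaluate directly. The sketch below (the function name is ours) reproduces the worked example from the text: with a variance of 1.605e18, a fourth central moment of 6.518e36, and a target standard error of 0.03 for an index of 0.5, the minimum sample size comes out to 981.

```python
import math

def min_sample_size(S_i, target_se, var, u4):
    """Minimum sample size per eq. (50):
    N_min = [2*(1-S)*S + S^2 * (u4 - var^2) / var^2] / target_se^2."""
    v_per_n = 2.0 * (1.0 - S_i) * S_i + S_i ** 2 * (u4 - var ** 2) / var ** 2
    return math.ceil(v_per_n / target_se ** 2)

# worked example from the text (World 3, last simulation year)
n_min = min_sample_size(0.5, 0.03, 1.605e18, 6.518e36)  # 981
```

Tightening the target standard error increases the required sample size quadratically, since the per-sample variance term is divided by the squared target.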
We need to point out that the sample size from eq. (50) is only a minimum sample size from the standard error perspective. One key assumption in FAST is that the model output y becomes a multiply periodic function of the parameters in the θ-space. For models with many parameters, if a parameter has a relatively low contribution to the uncertainty in a model output, its periodic signal may easily be contaminated by the random noise (e.g., \varepsilon_i in eq. (40)) due to other parameters. In order to detect and distinguish the signals of different parameters, a relatively large sample size is required. Currently we do not have an analytical form to calculate the minimum sample size required for that. Based on our experience, a sample size of 1000 is generally good for models with fewer than 20 parameters. A statistically robust method to achieve a desired precision level for estimated sensitivity indices is especially important for complex models, since trial-and-error approaches (i.e., trying many replicates at different sample sizes to reach the desired precision) are computationally inefficient. Thus, our proposed method is a first step toward a precise determination of sample size for FAST applications in complex models.
The proposed standard error estimation for the sensitivity indices is mainly based on the assumption that the model parameters are independent. In practice, it is common for parameters to be correlated. Xu and Gertner [12,13] proposed a simple random reordering approach to generate samples for parameters with a specified rank correlation structure. The main idea is that, after generating samples for (\theta_1,\dots,\theta_n) [using the simple random sampling approach or the random balance design sampling approach] and the corresponding samples for the parameters (x_1,\dots,x_n) with the search function of eq. (2), the sampled parameter values are reordered to honor a specified correlation structure using Iman and Conover's method [26]. Samples of (\theta_1,\dots,\theta_n) are reordered correspondingly based on the order of the parameter samples. For first-order sensitivity indices, it is straightforward to extend our proposed approaches for standard error estimation to models with correlated parameters. For a model y = g(x_1,\dots,x_n) with dependent parameters, the model output can be decomposed as follows for a specific model parameter x_i,

y = u(x_i) + \varepsilon_{(x_{-i})}   (51)

where

u(x_i) = E(g(x_1,\dots,x_n) \mid x_i)   (52)

and \varepsilon_{(x_{-i})} is the random error that arises from the uncertainty in the parameters other than x_i, with

E(\varepsilon_{(x_{-i})}) = 0, \qquad V(\varepsilon_{(x_{-i})}) = V(y) - V\big(E(g(x_1,\dots,x_n) \mid x_i)\big).   (53)

With the search function of eq. (2), we can see from eq. (51) that the model output y becomes a periodic function of \theta_i. Based on a Fourier transformation over \theta_i, the partial variance contributed by parameter x_i can be calculated as [11]

V_{x_i} = \sum_{|r_i|=1}^{M} \big|C^{(u_i)}_{r_i}\big|^2,   (54)

where \hat{C}^{(u_i)}_{r_i} = \frac{1}{N}\sum_{j=1}^{N} g(G(\theta_1^{(j)}),\dots,G(\theta_n^{(j)}))\,e^{-i r_i \theta_i^{(j)}}. With the sine and cosine Fourier coefficients of \hat{C}^{(u_i)}_{r_i}, we can estimate the first-order sensitivity indices of all parameters using eq. (22). It is noteworthy that the decomposition of y in the parameter space with eq. (51) is equivalent to the decomposition in the θ-space with eq. (40). Therefore, we can still estimate the standard errors of first-order sensitivity indices using eq. (48) for models with correlated parameters based on random balance design sampling. The decomposition in the θ-space with eq. (40) is a general decomposition and is also valid for simple random sampling for first-order sensitivity indices. Therefore, eq. (36) can still be used to estimate the standard errors of first-order sensitivity indices for models with correlated parameters based on simple random sampling. Due to the difficulty of distinguishing correlated effects from interaction effects, FAST analysis has not yet been proposed for the calculation of higher-order sensitivity indices in models with correlated parameters.
6 Acknowledgements

This study was supported by U.S. Department of Agriculture McIntire-Stennis funds (MS 875-359) and NIH grant R01-AI54954-0IA2. We thank two anonymous reviewers for their very helpful comments, which substantially improved this paper.
References:
1. A. Saltelli, S. Tarantola, and F. Campolongo, Sensitivity analysis as an ingredient of modeling. Statistical Science 15 (2000), pp. 377-395.
2. A. Saltelli, M. Ratto, S. Tarantola, and F. Campolongo, Sensitivity analysis for chemical models. Chem Rev 105 (2005), pp. 2811-2826.
3. A. Saltelli, K. Chan, and M. Scott, Sensitivity Analysis. John Wiley and Sons, West Sussex, 2000.
4. A. Saltelli and S. Tarantola, On the relative importance of input factors in mathematical models: safety assessment for nuclear waste disposal. J Am Stat Assoc 97 (2002), pp. 702-709.
5. R.I. Cukier, J.H. Schaibly, and K.E. Shuler, Study of the sensitivity of coupled reaction systems to uncertainties in rate coefficients. III. Analysis of the approximations. J Chem Phys 63 (1975), pp. 1140-1149.
6. R.I. Cukier, H.B. Levine, and K.E. Shuler, Nonlinear sensitivity analysis of multiparameter model systems. Journal of Computational Physics 26 (1978), pp. 1-42.
7. J.H. Schaibly and K.E. Shuler, Study of the sensitivity of coupled reaction systems to uncertainties in rate coefficients. II. Applications. J Chem Phys 59 (1973), pp. 3879-3888.
8. R.I. Cukier, C.M. Fortuin, K.E. Shuler, A.G. Petschek, and J.H. Schaibly, Study of the sensitivity of coupled reaction systems to uncertainties in rate coefficients. I. Theory. J Chem Phys 59 (1973), pp. 3873-3878.
9. E. Borgonovo, Measuring uncertainty importance: investigation and comparison of alternative approaches. Risk Anal 26 (2006), pp. 1349-1361.
10. S. Tarantola, D. Gatelli, and T.A. Mara, Random balance designs for the estimation of first order global sensitivity indices. Reliab Eng Syst Safe 91 (2006), pp. 717-727.
11. C. Xu and G.Z. Gertner, Understanding and comparisons of different sampling approaches for the Fourier Amplitudes Sensitivity Test (FAST). Comput Stat Data An (2010), in press.
12. C. Xu and G.Z. Gertner, Extending a global sensitivity analysis technique to models with correlated parameters. Comput Stat Data An 51 (2007), pp. 5579-5590.
13. C. Xu and G.Z. Gertner, A general first-order global sensitivity analysis method. Reliab Eng Syst Safe 93 (2008), pp. 1060-1071.
14. A. Saltelli, S. Tarantola, and K.P.S. Chan, A quantitative model-independent method for global sensitivity analysis of model output. Technometrics 41 (1999), pp. 39-56.
15. R. Scheller and D. Mladenoff, An ecological classification of forest landscape simulation models: tools and strategies for understanding broad-scale forested ecosystems. Landsc Ecol 22 (2007), pp. 491-505.
16. H.S. He, Forest landscape models: definitions, characterization, and classification. For Ecol Manag 254 (2008), pp. 484-498.
17. C. Xu, G.Z. Gertner, and R.M. Scheller, Uncertainties in the response of a forest landscape to global climatic change. Glob Change Biol 15 (2009), pp. 116-131.
18. P. Reiter, Oviposition, dispersal, and survival in Aedes aegypti: implications for the efficacy of control strategies. Vector Borne Zoonot Dis 7 (2007), pp. 261-274.
19. M. Otero, N. Schweigmann, and H.G. Solari, A stochastic spatial dynamical model for Aedes aegypti. Bull Math Biol 70 (2008), pp. 1297-1325.
20. K. Magori, M. Legros, M.E. Puente, D.A. Focks, T.W. Scott, et al., Skeeter Buster: a stochastic, spatially-explicit modeling tool for studying Aedes aegypti population replacement and population suppression strategies. Plos Neglect Trop Dis 3 (2009), p. e508.
21. S.F. Fang, G.Z. Gertner, S. Shinkareva, G.X. Wang, and A. Anderson, Improved generalized Fourier Amplitude Sensitivity Test (FAST) for model assessment. Stat Comput 13 (2003), pp. 221-226.
22. Y. Lu and S. Mohanty, Sensitivity analysis of a complex, proposed geologic waste disposal system using the Fourier Amplitude Sensitivity Test method. Reliab Eng Syst Safe 72 (2001), pp. 275-291.
23. D.H. Meadows, D.L. Meadows, and J. Randers, Beyond the Limits. Chelsea Green Publishing Company, Post Mills, Vermont, 1992.
24. R.L. Iman, M.E. Johnson, and T.A. Schroeder, Assessing hurricane effects. Part 1. Sensitivity analysis. Reliab Eng Syst Safe 78 (2002), pp. 131-145.
25. A. Kanso, G. Chebbo, and B. Tassin, Application of MCMC-GSA model calibration method to urban runoff quality modeling. Reliab Eng Syst Safe 91 (2006), pp. 1398-1405.
26. R.L. Iman and W.J. Conover, A distribution-free approach to inducing rank correlation among input variables. Commun Stat-Simul C 11 (1982), pp. 311-334.
27. E.L. Lehmann, Elements of Large-Sample Theory. Springer, New York, 1999.
Tables and Figures

Table 1 Parameter specifications of uniform distributions for the World 3 model

Parameter  Label                                                             Lower bound   Upper bound
x1         industrial output per capita desired                              315           385
x2         industrial capital output ratio before 1995                       2.7           3.3
x3         fraction of industrial output allocated to consumption
           before 1995                                                       0.387         0.473
x4         fraction of industrial output allocated to consumption
           after 1995                                                        0.387         0.473
x5         average life of industrial capital before 1995                    12.6          15.4
x6         average life of industrial capital after 1995                     16.2          19.8
x7         initial industrial capital                                        1.89 x 10^11  2.31 x 10^11
[Figure 1: bar charts of sensitivity indices (panels a and c) and standard errors (panels b and d) for parameters x1, x2, x3 and their interactions.]

Figure 1 Estimated sensitivity indices and their standard errors based on simple random sampling. The left column [panel (a) for a sample size of 1000 and panel (c) for a sample size of 5000] shows the sensitivity indices, and the right column [panel (b) for a sample size of 1000 and panel (d) for a sample size of 5000] shows the standard errors. For the sensitivity indices in (a) and (c), the white bars represent the FAST-based sensitivity indices using a randomly selected replicate, while the gray bars represent the analytically derived sensitivity indices. For the standard errors in (b) and (d), the white bars represent estimated standard errors based on the delta method. The gray bars represent the standard errors of sensitivity indices calculated based on 20 sample replicates.
[Figure 2: bar charts of sensitivity indices (panels a and c) and standard errors (panels b and d) for parameters x1, x2, x3 and their interactions.]

Figure 2 Estimated sensitivity indices and their standard errors based on random balance design sampling. The left column [panel (a) for a sample size of 1000 and panel (c) for a sample size of 5000] shows the sensitivity indices, and the right column [panel (b) for a sample size of 1000 and panel (d) for a sample size of 5000] shows the standard errors. For the sensitivity indices in (a) and (c), the white bars represent the FAST-based sensitivity indices using a randomly selected replicate, while the gray bars represent the analytically derived sensitivity indices. For the standard errors in (b) and (d), the white bars represent estimated standard errors using the delta method. The gray bars represent the standard errors of sensitivity indices calculated based on 20 sample replicates.
[Figure 3: bar chart of coefficients of variation for parameters x1, x2, x3 and their interactions.]

Figure 3 Coefficients of variation for the test model in eq. (49) based on simple random sampling with a sample size of 5000.
[Figure 4: time series (years 1850 to 2150) of sensitivity indices (left column) and their standard errors (right column) for x2, x3, x5, (x2, x3), and (x3, x5).]

Figure 4 Estimated sensitivity indices (left column) and their standard errors (right column) using simple random sampling with a sample size of 1000. The solid lines indicate sensitivity indices and standard errors estimated based on 20 sample replicates. The dotted lines indicate sensitivity indices and standard errors estimated using a single sample of size 1000. The standard errors are estimated using the delta method.
[Figure 5: time series (years 1850 to 2150) of sensitivity indices (left column) and their standard errors (right column) for x2, x3, x5, (x2, x3), and (x3, x5).]

Figure 5 Estimated sensitivity indices (left column) and their standard errors (right column) using simple random sampling with a sample size of 5000. The solid lines indicate sensitivity indices and standard errors estimated based on 20 sample replicates. The dotted lines indicate sensitivity indices and standard errors estimated using a single sample of size 5000. The standard errors are estimated using the delta method.
[Figure 6 graphic: paired panels for x2, x3, x5, (x2, x3) and (x3, x5); x-axes: Year (1850-2150); y-axes: sensitivity index (left column) and standard error (right column).]
Figure 6 Estimated sensitivity indices (left column) and their standard errors (right column) using random balance design sampling with a sample size of 1000. The solid lines indicate sensitivity indices and standard errors estimated based on 20 sample replicates. The dotted lines indicate sensitivity indices and standard errors estimated based on a single sample of size 1000. The standard errors are estimated using the delta method.
[Figure 7 graphic: paired panels for x2, x3, x5, (x2, x3) and (x3, x5); x-axes: Year (1850-2150); y-axes: sensitivity index (left column) and standard error (right column).]
Figure 7 Estimated sensitivity indices (left column) and their standard errors (right column) using random balance design sampling with a sample size of 5000. The solid lines indicate sensitivity indices and standard errors estimated based on 20 sample replicates. The dotted lines indicate sensitivity indices and standard errors estimated based on a single sample. The standard errors are estimated using the delta method.
[Figure 8 graphic: minimum sample size (y-axis, 0-10000) versus standard error (x-axis, 0.00-0.06).]
Figure 8 Minimum sample size required for specified standard errors for a sensitivity index of 0.5 for the last simulation year of the World3 model.
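The shape of the curve in Figure 8 can be reproduced approximately by inverting the standard-error expression derived in Appendix F. A minimal sketch (the function name is ours, and we assume the simplified form $\hat\sigma(\hat S_{x_i}) \approx \sqrt{2S(1-S)/N}$, dropping the variance-of-the-variance term, which slightly underestimates the required sample size):

```python
import math

def min_sample_size(s_index, target_se):
    """Smallest N such that sqrt(2 * S * (1 - S) / N) <= target_se.

    Simplified Appendix F approximation: the variance-of-the-variance
    term is dropped, so this is a slight underestimate.
    """
    return math.ceil(2.0 * s_index * (1.0 - s_index) / target_se ** 2)

# For a sensitivity index of 0.5, as in Figure 8:
print(min_sample_size(0.5, 0.01))  # 5000 model runs
print(min_sample_size(0.5, 0.05))  # 200 model runs
```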
Appendix A: Delta method

For a statistic $T_N$ with $E(T_N)=\theta$ and $V(T_N)=\sigma^2$,
$$\sqrt{N}\,(T_N-\theta)\ \xrightarrow{D}\ N(0,\sigma^2),$$
where $\xrightarrow{D}$ indicates convergence in distribution as $N\to\infty$, and $N(0,\sigma^2)$ indicates a normal distribution with mean zero and variance $\sigma^2$. Then, for a function $g(\theta)$ whose derivative $g'(\theta)$ is not zero, based on the delta method we have
$$\sqrt{N}\,\big(g(T_N)-g(\theta)\big)\ \xrightarrow{D}\ N\big(0,[g'(\theta)]^2\sigma^2\big).$$
If $T_N$ is a multivariate vector with covariance matrix $\Sigma$, and
$$\sqrt{N}\,(T_N-\theta)\ \xrightarrow{D}\ N(0,\Sigma),$$
then for a function $g(\theta)$ with $\theta=(\theta_1,\ldots,\theta_m)$, the delta method states that
$$\sqrt{N}\,\big(g(T_N)-g(\theta)\big)\ \xrightarrow{D}\ N\big(0,\nabla^{T}\Sigma\,\nabla\big),$$
where
$$\nabla=\left(\frac{\partial g(\theta)}{\partial\theta_1},\ldots,\frac{\partial g(\theta)}{\partial\theta_m}\right)\neq 0.$$
The delta method is based on a first-order Taylor series expansion. For a detailed proof, please refer to Lehmann [27].
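The univariate statement above is easy to check numerically. A minimal sketch (the choice of statistic and transformation is ours, for illustration only): with $T_N$ the mean of $N$ Exp(1) draws (so $\theta=\sigma^2=1$) and $g(t)=t^2$, the delta method predicts $V(g(T_N))\approx[g'(\theta)]^2\sigma^2/N$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, reps = 1000, 4000

# Replicate estimates of T_N (the sample mean), then apply g(t) = t^2.
T = rng.exponential(scale=1.0, size=(reps, N)).mean(axis=1)
empirical_var = np.square(T).var()

# Delta-method approximation [g'(theta)]^2 * sigma^2 / N with g'(t) = 2t.
delta_var = (2.0 * 1.0) ** 2 * 1.0 / N

print(empirical_var, delta_var)  # the two values agree to within a few percent
```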
Appendix B: Covariance between estimated Fourier coefficients for simple random sampling

The covariance between the cosine Fourier coefficient $\hat a_r^{(\omega)}$ and the sine Fourier coefficient $\hat b_{r'}^{(\omega)}$ (where $r=(r_1,\ldots,r_n)$ is a vector of harmonics and $r'$ is another vector of harmonics, which may be the same as or different from $r$) can be calculated as
$$
\begin{aligned}
\operatorname{Cov}(\hat a_r^{(\omega)},\hat b_{r'}^{(\omega)})
&=E\Big[\frac{1}{N}\sum_{j=1}^{N}g(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})\cos\big(r(\omega)^T\varphi^{(j)}\big)\cdot\frac{1}{N}\sum_{j=1}^{N}g(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})\sin\big(r'(\omega)^T\varphi^{(j)}\big)\Big]-a_r^{(\omega)}b_{r'}^{(\omega)}\\
&=\frac{1}{N^2}\sum_{j=1}^{N}E\big[g^2(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})\cos\big(r(\omega)^T\varphi^{(j)}\big)\sin\big(r'(\omega)^T\varphi^{(j)}\big)\big]\\
&\quad+\frac{1}{N^2}\sum_{j,k=1,\,j\neq k}^{N}E\big[g(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})\cos\big(r(\omega)^T\varphi^{(j)}\big)\,g(\varphi_1^{(k)},\ldots,\varphi_n^{(k)})\sin\big(r'(\omega)^T\varphi^{(k)}\big)\big]-a_r^{(\omega)}b_{r'}^{(\omega)},
\end{aligned}
\qquad\text{(B1)}
$$
where $\varphi^{(j)}=(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})$ and $(\cdot)^T$ represents the transpose of a vector. The first term of the above equation can be simplified as
$$
\frac{1}{N^2}\sum_{j=1}^{N}E\big[g^2(\varphi^{(j)})\cos\big(r(\omega)^T\varphi^{(j)}\big)\sin\big(r'(\omega)^T\varphi^{(j)}\big)\big]
=\frac{1}{N}E\big[g^2(\varphi_1,\ldots,\varphi_n)\cos\big(r(\omega)^T\varphi\big)\sin\big(r'(\omega)^T\varphi\big)\big].
\qquad\text{(B2)}
$$
Since $\varphi^{(j)}=(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})$ is drawn independently of $\varphi^{(k)}=(\varphi_1^{(k)},\ldots,\varphi_n^{(k)})$, the second term of eq. (B1) can be simplified as follows:
$$
\begin{aligned}
\frac{1}{N^2}\sum_{j,k=1,\,j\neq k}^{N}E\big[g(\varphi^{(j)})\cos\big(r(\omega)^T\varphi^{(j)}\big)\,g(\varphi^{(k)})\sin\big(r'(\omega)^T\varphi^{(k)}\big)\big]
&=\frac{1}{N^2}\sum_{j,k=1,\,j\neq k}^{N}E\big[g(\varphi^{(j)})\cos\big(r(\omega)^T\varphi^{(j)}\big)\big]\,E\big[g(\varphi^{(k)})\sin\big(r'(\omega)^T\varphi^{(k)}\big)\big]\\
&=\frac{1}{N^2}\,N(N-1)\,a_r^{(\omega)}b_{r'}^{(\omega)}
=\frac{N-1}{N}\,a_r^{(\omega)}b_{r'}^{(\omega)}.
\end{aligned}
\qquad\text{(B3)}
$$
Finally, based on eqs. (B1), (B2) and (B3), we have
$$
\begin{aligned}
\operatorname{Cov}(\hat a_r^{(\omega)},\hat b_{r'}^{(\omega)})
&=\frac{1}{N}E\big[g^2(\varphi)\cos\big(r(\omega)^T\varphi\big)\sin\big(r'(\omega)^T\varphi\big)\big]+\frac{N-1}{N}a_r^{(\omega)}b_{r'}^{(\omega)}-a_r^{(\omega)}b_{r'}^{(\omega)}\\
&=\frac{1}{N}\Big\{E\big[g^2(\varphi)\cos\big(r(\omega)^T\varphi\big)\sin\big(r'(\omega)^T\varphi\big)\big]-a_r^{(\omega)}b_{r'}^{(\omega)}\Big\}.
\end{aligned}
$$
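The covariance formula derived in this appendix is easy to verify by Monte Carlo in one dimension. A sketch (the test function $g$ and the harmonics are arbitrary choices of ours, not the World3 model): with $\varphi$ uniform on $[0,2\pi)$, the covariance of the replicate estimates of $\hat a_r$ and $\hat b_{r'}$ should match $\frac{1}{N}\{E[g^2\cos(r\varphi)\sin(r'\varphi)]-a_r b_{r'}\}$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, reps, r, rp = 200, 20000, 1, 2
g = lambda phi: np.sin(phi) + 0.5 * np.cos(2.0 * phi) ** 2  # arbitrary test g

# Empirical covariance of the two coefficient estimators over many replicates.
phi = rng.uniform(0.0, 2.0 * np.pi, size=(reps, N))
a_hat = (g(phi) * np.cos(r * phi)).mean(axis=1)
b_hat = (g(phi) * np.sin(rp * phi)).mean(axis=1)
empirical_cov = np.cov(a_hat, b_hat)[0, 1]

# Analytical value, with the expectations evaluated on one very large sample.
big = rng.uniform(0.0, 2.0 * np.pi, size=2_000_000)
a_r = (g(big) * np.cos(r * big)).mean()
b_rp = (g(big) * np.sin(rp * big)).mean()
cross = (g(big) ** 2 * np.cos(r * big) * np.sin(rp * big)).mean()
analytic_cov = (cross - a_r * b_rp) / N

print(empirical_cov, analytic_cov)  # both close to (1/16)/N ≈ 3.1e-4
```

For this particular $g$, the cross expectation works out to $1/16$ and $a_r=b_{r'}=0$, so the two estimates should agree to within a few percent.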
Appendix C: Covariance between estimated Fourier coefficients and sample variance for simple random sampling

The covariance between the estimated cosine Fourier coefficient $\hat a_r$ and the sample variance $\hat\sigma^2$ can be estimated as
$$
\begin{aligned}
\operatorname{Cov}(\hat a_r,\hat\sigma^2)
&=E\Big[\frac{1}{N}\sum_{j=1}^{N}g(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})\cos\big(r(\omega)^T\varphi^{(j)}\big)\cdot\frac{1}{N-1}\sum_{j=1}^{N}\big[g(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})-\bar g\big]^2\Big]-a_r\sigma^2\\
&=\frac{1}{N(N-1)}\sum_{j=1}^{N}E\Big\{g(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})\cos\big(r(\omega)^T\varphi^{(j)}\big)\big[g(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})-\bar g\big]^2\Big\}\\
&\quad+\frac{1}{N(N-1)}\sum_{j,k=1,\,j\neq k}^{N}E\Big\{g(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})\cos\big(r(\omega)^T\varphi^{(j)}\big)\big[g(\varphi_1^{(k)},\ldots,\varphi_n^{(k)})-\bar g\big]^2\Big\}-a_r\sigma^2,
\end{aligned}
\qquad\text{(C1)}
$$
where $\bar g$ denotes the sample mean $\frac{1}{N}\sum_{j=1}^{N}g(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})$. The first term of the above equation can be simplified as
$$
\frac{1}{N(N-1)}\sum_{j=1}^{N}E\Big\{g(\varphi^{(j)})\cos\big(r(\omega)^T\varphi^{(j)}\big)\big[g(\varphi^{(j)})-\bar g\big]^2\Big\}
=\frac{1}{N-1}E\Big\{g(\varphi_1,\ldots,\varphi_n)\cos\big(r(\omega)^T\varphi\big)\big[g(\varphi_1,\ldots,\varphi_n)-\bar g\big]^2\Big\}.
$$
Let $u=E[g(\varphi_1,\ldots,\varphi_n)]$. Based on the Central Limit Theorem, for a large sample size $N$ we have
$$\bar g\sim N\Big(u,\frac{\sigma^2}{N}\Big)\quad\text{and}\quad E(\bar g-u)^2=\frac{\sigma^2}{N}.$$
The second term of eq. (C1) can be simplified as
$$
\begin{aligned}
\frac{1}{N(N-1)}\sum_{j,k=1,\,j\neq k}^{N}E\Big\{g(\varphi^{(j)})\cos\big(r(\omega)^T\varphi^{(j)}\big)\big[g(\varphi^{(k)})-\bar g\big]^2\Big\}
&\approx a_r\,E\big[g(\varphi_1,\ldots,\varphi_n)-\bar g\big]^2\\
&=a_r\,E\big[(g(\varphi_1,\ldots,\varphi_n)-u)-(\bar g-u)\big]^2\\
&=a_r\Big\{E\big[g(\varphi)-u\big]^2+E\big[\bar g-u\big]^2-2E\big[(g(\varphi)-u)(\bar g-u)\big]\Big\}\\
&=a_r\Big(\sigma^2+\frac{\sigma^2}{N}-\frac{2\sigma^2}{N}\Big)
=a_r\sigma^2-\frac{a_r\sigma^2}{N}.
\end{aligned}
$$
Finally, we have
$$
\begin{aligned}
\operatorname{Cov}(\hat a_r,\hat\sigma^2)
&=\frac{1}{N-1}E\Big\{g(\varphi)\cos\big(r(\omega)^T\varphi\big)\big[g(\varphi)-\bar g\big]^2\Big\}+a_r\sigma^2-\frac{a_r\sigma^2}{N}-a_r\sigma^2\\
&=\frac{1}{N-1}E\Big\{g(\varphi)\cos\big(r(\omega)^T\varphi\big)\big[g(\varphi)-\bar g\big]^2\Big\}-\frac{a_r\sigma^2}{N}.
\end{aligned}
$$
Similarly, we can show that
$$
\operatorname{Cov}(\hat b_r,\hat\sigma^2)
=\frac{1}{N-1}E\Big\{g(\varphi)\sin\big(r(\omega)^T\varphi\big)\big[g(\varphi)-\bar g\big]^2\Big\}-\frac{b_r\sigma^2}{N}.
$$
Appendix D: Covariance between estimated Fourier coefficients for random balance design sampling

The covariance between the cosine Fourier coefficient $\hat a_{00..r_i..00}$ and the sine Fourier coefficient $\hat b_{00..r_i'..00}$ (where the harmonic $r_i$ can be the same as or different from $r_i'$, with $|r_i|,|r_i'|=1,\ldots,M$) can be calculated as
$$
\begin{aligned}
\operatorname{Cov}(\hat a_{00..r_i..00},\hat b_{00..r_i'..00})
&=\operatorname{Cov}\Big[\frac{1}{N}\sum_{j=1}^{N}g(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})\cos\big(r_i\varphi_i^{(j)}\big),\ \frac{1}{N}\sum_{j=1}^{N}g(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})\sin\big(r_i'\varphi_i^{(j)}\big)\Big]\\
&=\operatorname{Cov}\Big[\frac{1}{N}\sum_{j=1}^{N}\big(u(\varphi_i^{(j)})+\varepsilon_i^{(j)}\big)\cos\big(r_i\varphi_i^{(j)}\big),\ \frac{1}{N}\sum_{j=1}^{N}\big(u(\varphi_i^{(j)})+\varepsilon_i^{(j)}\big)\sin\big(r_i'\varphi_i^{(j)}\big)\Big]\\
&=\operatorname{Cov}\Big[\frac{1}{N}\sum_{j=1}^{N}u(\varphi_i^{(j)})\cos\big(r_i\varphi_i^{(j)}\big)+\frac{1}{N}\sum_{j=1}^{N}\varepsilon_i^{(j)}\cos\big(r_i\varphi_i^{(j)}\big),\\
&\qquad\quad\ \frac{1}{N}\sum_{j=1}^{N}u(\varphi_i^{(j)})\sin\big(r_i'\varphi_i^{(j)}\big)+\frac{1}{N}\sum_{j=1}^{N}\varepsilon_i^{(j)}\sin\big(r_i'\varphi_i^{(j)}\big)\Big],
\end{aligned}
$$
where $u(\varphi_i^{(j)})=E\big[g(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})\mid\varphi_i^{(j)}\big]$ and $\varepsilon_i^{(j)}$ is the residual, which is independent of $\varphi_i^{(j)}$ (see Section 2.2 for details).

By using a grid sampling for $\varphi_i$, the estimation error of $\frac{1}{N}\sum_{j=1}^{N}u(\varphi_i^{(j)})\cos(r_i\varphi_i^{(j)})$ is substantially smaller than that of $\frac{1}{N}\sum_{j=1}^{N}\varepsilon_i^{(j)}\cos(r_i\varphi_i^{(j)})$ for a relatively large sample size (see Section 2.2 for details). Similarly, the estimation error of $\frac{1}{N}\sum_{j=1}^{N}u(\varphi_i^{(j)})\sin(r_i'\varphi_i^{(j)})$ is substantially smaller than that of $\frac{1}{N}\sum_{j=1}^{N}\varepsilon_i^{(j)}\sin(r_i'\varphi_i^{(j)})$. Thus, we treat $\frac{1}{N}\sum_{j=1}^{N}u(\varphi_i^{(j)})\cos(r_i\varphi_i^{(j)})$ and $\frac{1}{N}\sum_{j=1}^{N}u(\varphi_i^{(j)})\sin(r_i'\varphi_i^{(j)})$ as constants. In addition, based on eq. (44), we have
$$E\big[\varepsilon_i\cos(r_i\varphi_i)\big]=E\big[\hat a^{(\omega,\varepsilon)}_{00..r_i..00}\big]=0,\qquad E\big[\varepsilon_i\sin(r_i'\varphi_i)\big]=E\big[\hat b^{(\omega,\varepsilon)}_{00..r_i'..00}\big]=0.$$
Thus, $\operatorname{Cov}(\hat a_{00..r_i..00},\hat b_{00..r_i'..00})$ is simplified as
$$
\begin{aligned}
\operatorname{Cov}(\hat a_{00..r_i..00},\hat b_{00..r_i'..00})
&=E\Big[\frac{1}{N}\sum_{j=1}^{N}\varepsilon_i^{(j)}\cos\big(r_i\varphi_i^{(j)}\big)\cdot\frac{1}{N}\sum_{j=1}^{N}\varepsilon_i^{(j)}\sin\big(r_i'\varphi_i^{(j)}\big)\Big]\\
&=\frac{1}{N^2}\sum_{j=1}^{N}E\big[(\varepsilon_i^{(j)})^2\cos\big(r_i\varphi_i^{(j)}\big)\sin\big(r_i'\varphi_i^{(j)}\big)\big]
+\frac{1}{N^2}\sum_{j,k=1,\,j\neq k}^{N}E\big[\varepsilon_i^{(j)}\cos\big(r_i\varphi_i^{(j)}\big)\,\varepsilon_i^{(k)}\sin\big(r_i'\varphi_i^{(k)}\big)\big]\\
&=\frac{1}{N}E\big[\varepsilon_i^2\cos(r_i\varphi_i)\sin(r_i'\varphi_i)\big]+\frac{N-1}{N}E\big[\varepsilon_i\cos(r_i\varphi_i)\big]\,E\big[\varepsilon_i\sin(r_i'\varphi_i)\big].
\end{aligned}
$$
Finally, in view that $\varepsilon_i$ is independent of $\varphi_i$, we have
$$
\operatorname{Cov}(\hat a_{00..r_i..00},\hat b_{00..r_i'..00})
=\frac{1}{N}E[\varepsilon_i^2]\,E\big[\cos(r_i\varphi_i)\sin(r_i'\varphi_i)\big]+\frac{N-1}{N}E^2[\varepsilon_i]\,E\big[\cos(r_i\varphi_i)\big]\,E\big[\sin(r_i'\varphi_i)\big].
$$
Using the fact that
$$E\big[\cos(r_i\varphi_i)\sin(r_i'\varphi_i)\big]=\frac{1}{2\pi}\int_0^{2\pi}\cos(r_i\varphi_i)\sin(r_i'\varphi_i)\,d\varphi_i=0$$
and
$$E[\varepsilon_i]=0,$$
we have
$$\operatorname{Cov}(\hat a_{00..r_i..00},\hat b_{00..r_i'..00})=0.$$
Similarly, we can show that the covariance between $\hat a_{00..r_i..00}$ and $\hat a_{00..r_i'..00}$ ($r_i\neq r_i'$) is
$$\operatorname{Cov}(\hat a_{00..r_i..00},\hat a_{00..r_i'..00})=\frac{1}{N}E[\varepsilon_i^2]\,E\big[\cos(r_i\varphi_i)\cos(r_i'\varphi_i)\big]=0,$$
in view that
$$E\big[\cos(r_i\varphi_i)\cos(r_i'\varphi_i)\big]=\frac{1}{2\pi}\int_0^{2\pi}\cos(r_i\varphi_i)\cos(r_i'\varphi_i)\,d\varphi_i=0,\quad\text{for }r_i\neq r_i'.$$
The covariance between $\hat b_{00..r_i..00}$ and $\hat b_{00..r_i'..00}$ can be calculated as
$$\operatorname{Cov}(\hat b_{00..r_i..00},\hat b_{00..r_i'..00})=\frac{1}{N}E[\varepsilon_i^2]\,E\big[\sin(r_i\varphi_i)\sin(r_i'\varphi_i)\big]=0,$$
in view that
$$E\big[\sin(r_i\varphi_i)\sin(r_i'\varphi_i)\big]=\frac{1}{2\pi}\int_0^{2\pi}\sin(r_i\varphi_i)\sin(r_i'\varphi_i)\,d\varphi_i=0,\quad\text{for }r_i\neq r_i'.$$
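The trigonometric orthogonality facts used in this appendix are easy to confirm numerically; a quick sketch over one full period:

```python
import numpy as np

# Averages of the cross products over a uniform grid on [0, 2*pi);
# all three vanish for distinct harmonics r != r'.
phi = np.linspace(0.0, 2.0 * np.pi, 100_000, endpoint=False)
r, rp = 2, 3
cos_sin = np.mean(np.cos(r * phi) * np.sin(rp * phi))
cos_cos = np.mean(np.cos(r * phi) * np.cos(rp * phi))
sin_sin = np.mean(np.sin(r * phi) * np.sin(rp * phi))

print(cos_sin, cos_cos, sin_sin)  # all ~ 0 (up to floating-point error)
```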
Appendix E: Covariance between estimated Fourier coefficients and sample variance of model output for random balance design sampling

Using eq. (42), the covariance between the cosine Fourier coefficient $\hat a_{00..r_i..00}$ and the sample variance $\hat\sigma^2$ can be calculated as
$$\operatorname{Cov}(\hat a_{00..r_i..00},\hat\sigma^2)=\operatorname{Cov}\big(\hat a^{(\omega,u)}_{00..r_i..00}+\hat a^{(\omega,\varepsilon)}_{00..r_i..00},\ \hat\sigma^2\big).\qquad\text{(E1)}$$
For random balance design sampling, the cosine coefficient
$$\hat a^{(\omega,u)}_{00..r_i..00}=\frac{1}{N}\sum_{j=1}^{N}\cos\big(r_i\varphi_i^{(j)}\big)\,u(\varphi_i^{(j)})$$
is estimated based on a grid sample of $\varphi_i$ (see eq. (38) for details), which has a much lower estimation error (the estimation error decreases at an order of $1/N^2$ with increasing sample size $N$) compared to that of $\hat a^{(\omega,\varepsilon)}_{00..r_i..00}$, which is based on simple random sampling (the estimation error decreases at an order of $1/N$ with increasing sample size $N$). Therefore, we treat $\hat a^{(\omega,u)}_{00..r_i..00}$ as a constant, and we have
$$\operatorname{Cov}(\hat a_{00..r_i..00},\hat\sigma^2)=\operatorname{Cov}\big(\hat a^{(\omega,\varepsilon)}_{00..r_i..00},\hat\sigma^2\big).$$
Using eq. (21) and eq. (42), we have
$$
\begin{aligned}
\operatorname{Cov}\big(\hat a^{(\omega,\varepsilon)}_{00..r_i..00},\hat\sigma^2\big)
&=E\Big\{\frac{1}{N}\sum_{j=1}^{N}\varepsilon_i^{(j)}\cos\big(r_i\varphi_i^{(j)}\big)\cdot\frac{1}{N}\sum_{j=1}^{N}\big[g\big(G(\varphi_1^{(j)}),\ldots,G(\varphi_n^{(j)})\big)-\bar g\big]^2\Big\}\\
&=\frac{1}{N}E\Big\{\varepsilon_i\cos(r_i\varphi_i)\big[g\big(G(\varphi_1),\ldots,G(\varphi_n)\big)-\bar g\big]^2\Big\}\\
&\quad+\frac{N(N-1)}{N^2}\,E\big(\varepsilon_i^{(j)}\cos\big(r_i\varphi_i^{(j)}\big)\big)\,E\big[g\big(G(\varphi_1^{(k)}),\ldots,G(\varphi_n^{(k)})\big)-\bar g\big]^2,
\end{aligned}
$$
where $\bar g$ denotes the sample mean of the model outputs $g\big(G(\varphi_1^{(j)}),\ldots,G(\varphi_n^{(j)})\big)$. Since $E\big(\varepsilon_i^{(j)}\cos(r_i\varphi_i^{(j)})\big)=E\big(\hat a^{(\omega,\varepsilon)}_{00..r_i..00}\big)=0$ (see eq. (43) for details), we have
$$\operatorname{Cov}\big(\hat a^{(\omega,\varepsilon)}_{00..r_i..00},\hat\sigma^2\big)=\frac{1}{N}E\Big\{\varepsilon_i\cos(r_i\varphi_i)\big[g\big(G(\varphi_1),\ldots,G(\varphi_n)\big)-\bar g\big]^2\Big\}.\qquad\text{(E2)}$$
Let
$$\zeta(\varphi_1,\ldots,\varphi_n)=\big[g\big(G(\varphi_1),\ldots,G(\varphi_n)\big)-\bar g\big]^2$$
and
$$u^{(\zeta)}(\varphi_i)=E\big[\zeta(\varphi_1,\ldots,\varphi_n)\mid\varphi_i\big];$$
we have
$$\zeta(\varphi_1,\ldots,\varphi_n)=u^{(\zeta)}(\varphi_i)+\varepsilon^{(\zeta)}_i,\qquad\text{(E3)}$$
where $\varepsilon^{(\zeta)}_i$ is a random error independent of $\varphi_i$ with $E\big(\varepsilon^{(\zeta)}_i\big)=0$. Using eq. (E3), the covariance in eq. (E2) can be simplified as follows:
$$
\begin{aligned}
\operatorname{Cov}\big(\hat a^{(\omega,\varepsilon)}_{00..r_i..00},\hat\sigma^2\big)
&=\frac{1}{N}E\big[\varepsilon_i\cos(r_i\varphi_i)\,u^{(\zeta)}(\varphi_i)\big]+\frac{1}{N}E\big[\varepsilon_i\cos(r_i\varphi_i)\,\varepsilon^{(\zeta)}_i\big]\\
&=\frac{1}{N}E[\varepsilon_i]\,E\big[\cos(r_i\varphi_i)\,u^{(\zeta)}(\varphi_i)\big]+\frac{1}{N}E\big[\varepsilon_i\varepsilon^{(\zeta)}_i\big]\,E\big[\cos(r_i\varphi_i)\big].
\end{aligned}
$$
In view that $E[\varepsilon_i]=0$ and $E\big[\cos(r_i\varphi_i)\big]=\frac{1}{2\pi}\int_0^{2\pi}\cos(r_i\varphi_i)\,d\varphi_i=0$, we have
$$\operatorname{Cov}(\hat a_{00..r_i..00},\hat\sigma^2)=\operatorname{Cov}\big(\hat a^{(\omega,\varepsilon)}_{00..r_i..00},\hat\sigma^2\big)=0.$$
Similarly, we can show that
$$\operatorname{Cov}(\hat b_{00..r_i..00},\hat\sigma^2)=\operatorname{Cov}\big(\hat b^{(\omega,\varepsilon)}_{00..r_i..00},\hat\sigma^2\big)=0.$$
Appendix F: Standard errors of first-order sensitivity indices for random balance design sampling

Standard errors of first-order sensitivity indices can be estimated based on the delta method and eq. (44) as follows:
$$
\hat\sigma(\hat S_{x_i})=\sqrt{\hat\nabla^{T}\hat\Sigma\,\hat\nabla}
=\sqrt{\sum_{\substack{|r_i|=1\\ \hat a_{00..r_i..00}\neq0}}^{M}\frac{4\hat a_{00..r_i..00}^{2}}{(\hat\sigma^{2})^{2}}\cdot\frac{\hat V(\varepsilon_i)}{2N}
+\sum_{\substack{|r_i|=1\\ \hat b_{00..r_i..00}\neq0}}^{M}\frac{4\hat b_{00..r_i..00}^{2}}{(\hat\sigma^{2})^{2}}\cdot\frac{\hat V(\varepsilon_i)}{2N}
+\Bigg[\frac{\sum_{\substack{|r_i|=1\\ \hat a_{00..r_i..00}\neq0,\,\hat b_{00..r_i..00}\neq0}}^{M}\big(\hat a_{00..r_i..00}^{2}+\hat b_{00..r_i..00}^{2}\big)}{(\hat\sigma^{2})^{2}}\Bigg]^{2}\hat V[\hat\sigma^{2}]},
$$
where $\hat a_{00..r_i..00}\neq0$ and $\hat b_{00..r_i..00}\neq0$ indicate that the cosine Fourier coefficient $\hat a_{00..r_i..00}$ and the sine Fourier coefficient $\hat b_{00..r_i..00}$, respectively, are significantly larger or smaller than zero based on the hypothesis test in eq. (37). Since we have
$$\hat V(\varepsilon_i)=\hat V(y)-\hat V_{x_i}=\hat\sigma^{2}-\hat S_{x_i}\hat\sigma^{2}=\hat\sigma^{2}(1-\hat S_{x_i}),$$
the estimated standard error $\hat\sigma(\hat S_{x_i})$ can be simplified, using $\sum\big(\hat a_{00..r_i..00}^{2}+\hat b_{00..r_i..00}^{2}\big)=\hat V_{x_i}=\hat S_{x_i}\hat\sigma^{2}$, as
$$
\begin{aligned}
\hat\sigma(\hat S_{x_i})
&=\sqrt{\frac{2(1-\hat S_{x_i})}{N\hat\sigma^{2}}\Big[\sum_{\hat a_{00..r_i..00}\neq0}\hat a_{00..r_i..00}^{2}+\sum_{\hat b_{00..r_i..00}\neq0}\hat b_{00..r_i..00}^{2}\Big]+\Big(\frac{\hat S_{x_i}}{\hat\sigma^{2}}\Big)^{2}\hat V[\hat\sigma^{2}]}\\
&=\sqrt{\frac{2}{N}(1-\hat S_{x_i})\hat S_{x_i}+\Big(\frac{\hat S_{x_i}}{\hat\sigma^{2}}\Big)^{2}\hat V[\hat\sigma^{2}]}.
\end{aligned}
$$
Based on the estimation of $\hat V[\hat\sigma^{2}]$ by eq. (32), we have
$$
\hat\sigma(\hat S_{x_i})=\sqrt{\frac{2}{N}(1-\hat S_{x_i})\hat S_{x_i}+\frac{\hat S_{x_i}^{2}}{\hat\sigma^{4}}\cdot\frac{1}{N-1}\big[\hat u^{(4)}-(\hat\sigma^{2})^{2}\big]}.
$$
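The closing expression of this appendix translates directly into code. A sketch (the function name and interface are ours; `y` is the vector of model outputs from a single random balance design sample and `s_hat` is the estimated first-order index):

```python
import numpy as np

def rbd_index_se(s_hat, y):
    """Delta-method standard error of a first-order sensitivity index,
    sqrt( 2*S*(1-S)/N + (S/sigma2)^2 * (u4 - sigma2^2) / (N - 1) ),
    where sigma2 is the sample variance and u4 the fourth central moment.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    sigma2 = y.var(ddof=1)                     # sigma_hat^2
    u4 = np.mean((y - y.mean()) ** 4)          # u_hat^(4)
    var_sigma2 = (u4 - sigma2 ** 2) / (n - 1)  # V_hat[sigma_hat^2]
    return float(np.sqrt(2.0 * s_hat * (1.0 - s_hat) / n
                         + (s_hat / sigma2) ** 2 * var_sigma2))

# Example: for Gaussian outputs u4 ≈ 3*sigma2^2, so with S = 0.5 and
# N = 1000 the standard error is roughly sqrt(1/N) ≈ 0.032.
y = np.random.default_rng(2).normal(size=1000)
print(rbd_index_se(0.5, y))
```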