Journal of Vegetation Science 4: 441-452, 1993


Spatial models for spatial statistics: some unification

Ver Hoef, Jay M.1,2*, Cressie, Noel A. C.1 & Glenn-Lewin, David C.2

1Department of Statistics, Iowa State University, Ames, IA 50011, USA; 2Department of Botany, Iowa State University, Ames, IA 50011, USA; *Current address: Alaska Department of Fish and Game, 1300 College Road, Fairbanks, AK 99701, USA; Fax +1 9074526410; E-mail (BITNET) FFJMV@ALASKA, (INTERNET) [email protected]

Abstract. A general statistical framework is proposed for comparing linear models of spatial process and pattern. A spatial linear model for nested analysis of variance can be based on either fixed effects or random effects. Greig-Smith (1952) originally used a fixed effects model, but there are also examples of random effects models in the soil science literature. Assuming intrinsic stationarity for a linear model, the expectations of a spatial nested ANOVA and two term local variance (TTLV, Hill 1973) are functions of the variogram, and several examples are given. Paired quadrat variance (PQV, Ludwig & Goodall 1978) is a variogram estimator which can be used to approximate TTLV, and we provide an example from ecological data. Both nested ANOVA and TTLV can be seen as weighted lag-1 variogram estimators that are functions of support, rather than distance. We show that there are two unbiased estimators for the variogram under aggregation, and computer simulation shows that the estimator with smaller variance depends on the process autocorrelation.

Keywords: Autocorrelation; First-order autoregressive model; Nested ANOVA; Paired-quadrat variance; Pattern analysis; Two Term Local Variance; Variogram.

Introduction

Ecologists concerned with spatial pattern first centered their interest on testing the hypothesis that a species exhibited a completely spatially random pattern. Many indices have been proposed (for a review, see Goodall & West 1979; Cressie 1991, sect. 8.2). It soon became apparent that very few organisms exhibit complete spatial randomness (csr) at all scales, so attention has turned to the description of non-csr pattern. Ecologists often define spatial pattern as the non-random horizontal spatial abundance of organisms (Greig-Smith 1979; McIntosh 1985; Kershaw & Looney 1985), although Pielou (1977) allows for random pattern. There are several features of pattern. Grain is the scale of pattern in which differences in abundance occur, the size of patches, for instance. Intensity is the extent to which abundance differs over an area (Pielou 1977).

Interest in inference about intensity and scale of pattern has generated many statistical methods (e.g. Greig-Smith 1952; Hill 1973; Usher 1975; Ludwig & Goodall 1978; Ripley 1978; Galiano 1983; Dale & MacIsaac 1989; and Orlóci & Orlóci 1990). Comparisons of some of the earlier methods (Ludwig 1979; Carpenter & Chaney 1983) recommended Two Term Local Variance (TTLV, Hill 1973) and Paired Quadrat Variance (PQV, Ludwig & Goodall 1978) for estimating grain and intensity. Several recent studies have employed TTLV (e.g. Dale & MacIsaac 1989; Ver Hoef, Glenn-Lewin & Werger 1989; Dale & Blundon 1990).

In general, statistics may be used in several ways. Two common uses of statistics are for a descriptive summary of the data and for estimating parameters. For example, consider the sample mean. It can always be calculated for a data set, and it provides a good description of the data's central tendency. In this sense it is a descriptive statistic. If we further postulate that the data come independently from a model, such as a normal distribution $N(\mu, \sigma^2)$, then the sample mean is also an estimator of the parameter $\mu$. So the sample mean can be used for both description and estimation. The development and use of TTLV and PQV in ecology has been largely descriptive. The original intent of the research reported in this paper was to see if these statistics also estimate any model parameters, and if so, what they are. Secondarily, we found that there are some interesting relationships between the statistics, and that TTLV can be approximated by PQV.

In this paper, we show some relationships between Nested Analysis of Variance (nested ANOVA), first used for pattern analysis by Greig-Smith (1952), TTLV (a modification of nested ANOVA), and PQV. Nested ANOVA and TTLV are statistics that aggregate or block data from contiguous plots; PQV is based on the distance between plots. Is there some mathematical relationship between the aggregation- and distance-based methods? To answer this requires a mathematical framework, so we begin with definitions of process and pattern by placing them in the context of a statistical model.


Let the ecological measure of interest (e.g. biomass, abundance, etc.) be a random variable. Then a spatial process is a collection of these random variables, $Z(s)$, indexed by spatial location vectors $s$, e.g. $s = (x, y)'$ coordinates, which we shall denote as,

$$\{Z(s) : s \in D\}, \qquad (1)$$

where $D$ is some subset of $R^d$, which is a Euclidean space with dimension $d$ = 1, 2, or 3. For example, $D$ is usually defined by the boundaries of the study area. The model (1) is the generating mechanism, or process; an outcome of the model (also called a realization or data) from (1) is referred to as the pattern. For example, suppose we wish to collect biomass samples along a transect. Then $D \subset R^1$ is the transect itself, $s = 1, 2, \ldots, n$ is the position on the transect, and $Z(s)$ is a random variable for biomass at location $s$. Prior to the growing season, $Z(s)$ is a random variable and may take on one of many values, so the whole set of random variables $\{Z(s)\}$ is thought of as a process. Once the data have been collected, the set of values $\{z(s)\}$ is the observed pattern.

The process (1) is very general. More specifically, consider a process $Z(\cdot)$ consisting of a linear model of fixed and random effects,

$$Z(s) = \sum_{i=1}^{p_1} a_i \mu_i(s) + \sum_{j=1}^{p_2} \sigma_j \delta_j(s), \qquad (2)$$

where $\{\mu_i(s);\ i = 1, \ldots, p_1\}$ are fixed or non-random effects (parameters) in the mean structure with coefficients $\{a_i;\ i = 1, \ldots, p_1\}$; these fixed effects are the deterministic, ecological effects. In the example of biomass along a transect, $\mu_1$ might be soil moisture, $\mu_2$ might be soil nitrogen, etc. Randomness is contained in the set $\{\delta_j(s);\ j = 1, \ldots, p_2\}$, which are random variables with zero mean and unit variance with coefficients $\{\sigma_j;\ j = 1, \ldots, p_2\}$; $\sigma_j^2$ is the variance associated with $\delta_j$ because $\mathrm{Var}[\sigma_j \delta_j] = \sigma_j^2$. Again, using the example of biomass along a transect, $\sum_{j=1}^{p_2} \sigma_j \delta_j(s)$ contains the uncertainty of the biomass values due to different sources; e.g. genotypic variation, unknown diseases, and other unmeasurable influences. That is, even when sampling soil moisture, soil nitrogen, etc., we cannot predict the value of $Z(s)$ with certainty. In words, (2) can be written: $Z(\cdot)$ = deterministic mean structure + random error.

Equation (2) generalizes the model of Morris (1987) by making it spatially explicit, and it allows for more combinations of fixed and random effects. The model (2) has several distinct advantages over a verbal definition of pattern as, say, non-random spatial abundance. First, the model makes the definition and concept of pattern mathematically explicit. Second, as a special case of (2), where,

$$\sum_{i=1}^{p_1} a_i \mu_i(s) = \mu \quad \text{and} \quad \sum_{j=1}^{p_2} \sigma_j \delta_j(\cdot) = \sigma \delta(\cdot),$$

with $\delta(\cdot)$ a standard white noise process, pattern may indeed be spatially random. Finally, this model allows for the description of pattern in a positive sense. That is, rather than defining pattern as not completely random, it is seen as some combination of mean structure and possibly spatially-dependent random error.

The model (2) is the theoretical framework for the comparison of nested ANOVA, TTLV, and PQV. In particular, we show that the spatial pattern methods of nested ANOVA and TTLV can estimate an aggregated variogram, and PQV is a variogram estimator. Through the variogram, we shall show that the aggregation approach (nested ANOVA and TTLV) and the distance approach (PQV) are related. Geostatistical methods, which use the variogram, have had a major role in spatial description, modeling, and prediction for the geological sciences, but they have only recently been adopted by the ecological sciences for analyzing spatial data (e.g. Robertson 1987; Robertson et al. 1988; Legendre & Fortin 1989).
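The distinction between a process and a pattern in model (2) can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the random effects are taken to be independent standard normal white noise, and the function name `simulate_linear_process`, the two mean effects and all numerical values are hypothetical choices made for illustration, not quantities from the paper.

```python
import numpy as np

def simulate_linear_process(mean_effects, coeffs, sigmas, rng):
    """Draw one realization (a 'pattern') from a model-(2)-style process.

    mean_effects: array (p1, n), fixed effects mu_i(s) at s = 1..n
    coeffs:       array (p1,), coefficients a_i
    sigmas:       array (p2,), standard deviations sigma_j
    Assumption of this sketch: each delta_j(s) is independent standard
    normal white noise (zero mean, unit variance).
    """
    mean_effects = np.asarray(mean_effects, dtype=float)
    coeffs = np.asarray(coeffs, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    n = mean_effects.shape[1]
    deterministic = coeffs @ mean_effects            # sum_i a_i mu_i(s)
    deltas = rng.standard_normal((sigmas.size, n))   # delta_j(s)
    random_error = sigmas @ deltas                   # sum_j sigma_j delta_j(s)
    return deterministic + random_error

rng = np.random.default_rng(0)
n = 64
soil_moisture = np.linspace(0.0, 1.0, n)   # hypothetical mu_1(s)
soil_nitrogen = np.ones(n)                 # hypothetical mu_2(s)
# One observed pattern z(1), ..., z(n) drawn from the process.
z = simulate_linear_process([soil_moisture, soil_nitrogen],
                            [2.0, 10.0], [1.0, 0.5], rng)
```

Calling the function again with the same arguments gives a different pattern from the same process, which is the sense in which the data are one realization of the model.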

Historical development of Nested ANOVA and TTLV in vegetation studies

Nested ANOVA

Greig-Smith (1952) initially applied nested ANOVA to data in 2-dimensional space. Kershaw (1957) modified nested ANOVA for line-transect data of contiguous quadrats; we shall consider the transect case. Suppose there are $n = 2^k$ random variables $Z(s)$ (see footnote 1) in a line transect consisting of contiguous sample rectangles of equal size (henceforth called quadrats).

Footnote 1: It has been more usual in the ecological literature to use subscripts on $Z$, but since $Z$ is a function of location, we retain the $Z(\cdot)$ notation.

A useful alternative notation is: Let $Y(ab \ldots cd) = Z(s)$, where $ab \ldots cd$ denotes the binary representation (of length $k$) of $s - 1$; $s = 1, \ldots, n = 2^k$. That is,

a = 0 if Z is in the 1st half of the transect; a = 1 if Z is in the 2nd half of the transect;
b = 0 if Z is in the 1st half of the a-th half; b = 1 if Z is in the 2nd half of the a-th half.

For example, take $n = 8$ and $Z(1), Z(2), Z(3), \ldots, Z(8)$ in a transect of contiguous quadrats. Then a nested spatial structure denoted BS($l$), $l$ = 1, 2, 4, 8, can be imposed with the equivalent $Y(\cdot)$ notation,

BS(8):  Ȳ(AAA)
BS(4):  Ȳ(0AA) | Ȳ(1AA)
BS(2):  Ȳ(00A) | Ȳ(01A) | Ȳ(10A) | Ȳ(11A)
BS(1):  Y(000) | Y(001) | Y(010) | Y(011) | Y(100) | Y(101) | Y(110) | Y(111)
        Z(1)   | Z(2)   | Z(3)   | Z(4)   | Z(5)   | Z(6)   | Z(7)   | Z(8)

where $\bar{Y}(aA \ldots A)$ is the average over all $Y$ values for which $a$ is given; $\bar{Y}(abA \ldots A)$ is the average over all $Y$ values for some $ab$ combination, etc. Under the $Y(\cdot)$ notation, calculation of nested ANOVA is given in Table 1.

In plant ecology, Greig-Smith (1952) originally made use of nested ANOVA for spatial data. In terms of the linear model (2), Greig-Smith's analysis was based on the following model (using the $Y(\cdot)$ notation):

$$Y(ab \ldots cd) = \mu_k + \mu_{k-1}(a) + \mu_{k-2}(ab) + \ldots + \mu_1(ab \ldots c) + \sigma_0 \delta_0(ab \ldots cd). \qquad (3)$$

In (3), $n = 2^k$, each $\mu_t$ is a fixed effect in the mean structure due to the block size $2^t$, and $\delta_0$ is independent random error. The obvious estimator for $\mu_k$ is $\bar{Y}(AA \ldots A)$, for $\mu_{k-1}(0)$ is $\bar{Y}(0A \ldots A)$, for $\mu_{k-1}(1)$ is $\bar{Y}(1A \ldots A)$, etc. The deviations of BS(1) from the averages at BS(2) serve to estimate the random error $\sigma_0$. Table 1 may be used in performing an F-test of the equality of mean effects in model (3) with the following assumptions: all $\delta_0$ are independent Gaussian random variables with unit variance. Due to the probable violation of these assumptions, Greig-Smith's original usage was quickly criticized by Thompson (1955, 1958) and others.

A purely random effects model could also be chosen, where the model, in terms of (2), now becomes,

$$Y(ab \ldots cd) = \mu_k + \sigma_{k-1} \delta_{k-1}(a) + \sigma_{k-2} \delta_{k-2}(ab) + \ldots + \sigma_0 \delta_0(ab \ldots cd). \qquad (4)$$

In (4), $n = 2^k$, $\mu_k$ is some overall mean and each $\delta_t$ is a random effect due to the block size indicated by its subscript $t$. The expectations for mean squares in Table 1, based on model (4), are given in Table 2.

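Table 1. Nested ANOVA table for spatial data along a line transect.

Source | d.f. | Sum of squares (S.S.) | Mean square
BS($2^{k-1}$) | $1$ | $2^{k-1} \sum_{a=0}^{1} [\bar{Y}(aA \ldots A) - \bar{Y}(AA \ldots A)]^2$ | S.S./d.f.
BS($2^{k-2}$) | $2$ | $2^{k-2} \sum_{a=0}^{1} \sum_{b=0}^{1} [\bar{Y}(abA \ldots A) - \bar{Y}(aA \ldots A)]^2$ | S.S./d.f.
... | ... | ... | ...
BS(2) | $2^{k-2}$ | $2 \sum_{a=0}^{1} \sum_{b=0}^{1} \cdots \sum_{c=0}^{1} [\bar{Y}(ab \ldots cA) - \bar{Y}(ab \ldots AA)]^2$ | S.S./d.f.
BS(1) | $2^{k-1}$ | $\sum_{a=0}^{1} \sum_{b=0}^{1} \cdots \sum_{d=0}^{1} [Y(ab \ldots cd) - \bar{Y}(ab \ldots cA)]^2$ | S.S./d.f.
Total | $2^k - 1$ | $\sum_{a=0}^{1} \sum_{b=0}^{1} \cdots \sum_{d=0}^{1} [Y(ab \ldots cd) - \bar{Y}(AA \ldots AA)]^2$ | S.S./d.f.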


Table 2. Expected mean square column for nested ANOVA (Table 1) under the random effects model in equation (4).

Source | d.f. | Mean square | Expected mean square
BS(2) | $2^{k-2}$ | S.S./d.f. | $\sigma_0^2 + 2\sigma_1^2$
BS(1) | $2^{k-1}$ | S.S./d.f. | $\sigma_0^2$
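The mean squares of Table 1 can be computed by repeatedly averaging adjacent pairs of blocks along the transect. The following is a minimal sketch, not the authors' code; the function name `nested_anova_mean_squares` is hypothetical, and the transect length is assumed to be a power of two.

```python
import numpy as np

def nested_anova_mean_squares(z):
    """Mean squares for sources BS(1), BS(2), ..., BS(2**(k-1)) of Table 1.

    z holds Z(1), ..., Z(n) with n = 2**k. Returns {block_size: mean_square}.
    """
    z = np.asarray(z, dtype=float)
    n = z.size
    if n < 2 or n & (n - 1):
        raise ValueError("transect length must be a power of two (n >= 2)")
    mean_squares = {}
    level = z.copy()      # block means at the current block size
    size = 1              # current block size
    while level.size > 1:
        pairs = level.reshape(-1, 2)             # adjacent blocks, paired
        parent = pairs.mean(axis=1)              # means at block size 2*size
        # S.S.: block size times squared deviations from the parent mean.
        ss = size * np.sum((pairs - parent[:, None]) ** 2)
        df = parent.size                         # one d.f. per pair
        mean_squares[size] = ss / df
        level = parent
        size *= 2
    return mean_squares

ms = nested_anova_mean_squares([1.0, 3.0, 2.0, 6.0])
```

For the four-quadrat transect in the usage line, the sums of squares for BS(1) and BS(2) are 10 and 4, which add to the total sum of squares of 14 about the overall mean.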

Here it is assumed that all random variables $\{\delta_t\}$ are independent with zero mean and unit variance. Under this model, Table 2 can be used to partition total variance by solving for each spatially nested random effect, $\sigma_t^2$, as has been done in several soil studies (Youden & Mehlich 1937; Webster & Butler 1976; Nortcliff 1978).

However, due to possible correlation between nearby quadrats, nested ANOVA is likely to be inappropriate for testing hypotheses for contiguous quadrats. Despite this limitation, it was subsequently adopted as a graphical description of vegetation (see, e.g. Greig-Smith 1983, p. 95; Kershaw & Looney 1985, p. 129), based on the following statistic:

$$\mathrm{MSBS}(2^r) = \frac{1}{2^{r+1}} \operatorname{ave}\left\{ \left[ \sum_{i=j+1}^{j+2^r} Z(i) - \sum_{i=j+2^r+1}^{j+2^{r+1}} Z(i) \right]^2 \right\}, \qquad (5)$$

where ave$\{\cdot\}$ is the average of all terms where $j = (q-1)2^{r+1}$; $q = 1, \ldots, 2^{k-r-1}$ for a given $r$, $r = 0, 1, \ldots, k-1$. Equation (5) is a function of aggregation, or block size, where MSBS($2^r$) denotes mean square at block size $2^r$. Here, $r$ indexes the geometric progression of aggregation or blocking, $j$ is the starting position for each pair of blocks to be differenced along the transect, and $q$ is a sequential numbering of the block pairs (after aggregation). Equation (5) is a re-expression, in the $Z(\cdot)$ notation, of each mean square due to each source (block size) in a spatially nested ANOVA (Table 1), and is established in App. 1.

Formal inference based on nested ANOVA and statistical hypothesis-testing was quickly dropped in favor of the descriptive practice of plotting the block size mean square MSBS($2^r$) against block size $2^r$ or against $r$, and looking for a 'peak'. The location of the peak is reported to indicate the 'grain', or scale of pattern, while its height indicates 'intensity' (Ludwig & Goodall 1978). However, even as a description of pattern, there still appear to be difficulties with the nested ANOVA approach; for a review see Goodall (1974) and Pielou (1977). One problem is that the geometrical progression of $2^r$ in (5) causes larger and larger gaps where pattern may occur. Secondly, results depend on the starting position relative to the pattern (for examples, see Usher 1969; and Errington 1973).

TTLV

Hill (1973) noticed that, in (5), choosing a fixed starting position $j$ for each block omitted many similar terms, so he proposed the following two term local variance (TTLV) statistic:

$$\mathrm{TTLV}(m) = \frac{1}{2m} \operatorname{ave}\left\{ \left[ \sum_{i=j}^{j+m-1} Z(i) - \sum_{i=j+m}^{j+2m-1} Z(i) \right]^2 \right\}, \qquad (6)$$

where $j = 1, 2, \ldots, n-2m+1$ and $m$ is an integer; $m \le n/2$. This statistic is interpreted in the same way as MSBS in the nested ANOVA; i.e. one is looking for a 'peak' when plotting TTLV($m$) against $m$.

The variogram

General description

Return to the definition of process and pattern in (1). In the case of a univariate distribution, a statistical distribution is characterized by parameters, and it is often the estimation of these parameters that concerns the ecologist or statistician. Where there are many random variables, such as in the process (1), parameters specifying the dependence among the random variables are typically needed (e.g. all pairwise correlations). Another set of parameters that captures the spatial dependence is given by the variogram. Very generally, the variogram $2\gamma(\cdot,\cdot)$ is defined as, $2\gamma(u,v) \equiv \mathrm{var}(Z(u) - Z(v))$, where $u$ and $v$ are spatial location vectors in the study area $D$, and var($\cdot$) denotes the variance. It is assumed that $\mathrm{var}[Z(s)] < \infty$ for all $s \in D$. The process $Z(\cdot)$ is said to be intrinsically stationary if it satisfies,

(i) $E[Z(s)] = \mu$ for all $s \in D$, and,

(ii) $\mathrm{var}[Z(u) - Z(v)] = 2\gamma(u - v)$, for all $u, v \in D$;

where $E(\cdot)$ denotes expectation. The condition (i) emphasizes a constant mean throughout the study region $D$, and is called mean stationarity. The condition (ii) states that the variogram depends only on the distance and direction between two points, not their exact locations, which allows $2\gamma(u,v)$ to be written as $2\gamma(u - v)$. Intrinsic stationarity is necessary to allow the variogram to be estimated from the data.

There are several properties of the variogram. Notice that when $u - v = 0$, then $2\gamma(0) = 0$. The variogram is called isotropic if $\mathrm{var}[Z(u) - Z(v)] = 2\gamma(\|u - v\|)$ for all $u, v \in D$, where $\|u - v\| = \|h\|$ is the Euclidean distance between $u$ and $v$. Thus, the variogram now depends only on the distance (direction is no longer important) separating any two points in $D$. Under intrinsic stationarity, the linear model (2) can be written as,

$$Z(s) = \mu + \delta(s), \qquad (7)$$

where $\mu$ is some overall mean, and $\delta(\cdot)$ has mean zero and variogram $2\gamma(\cdot)$. The variogram contains information on the variance and covariance among the $Z$'s; independence of $Z$'s is included as a special case where $\gamma(h) = \sigma^2$ for $h \ne 0$.

Autocorrelation and the variogram

Probably more familiar to ecologists are the autocovariance and autocorrelation functions (e.g. Sokal & Oden 1978; Cliff & Ord 1981), which are also used extensively in time series. Define the autocovariance function $C(u,v) \equiv \mathrm{cov}(Z(u), Z(v))$, where cov($\cdot,\cdot$) denotes the covariance. If $C(u,v) \ne 0$, then we do not have independence between data at sites $u$ and $v$. Along with mean stationarity, assuming $C(u,v) = C(u - v) = C(h)$ for all $u$ and $v$ is termed second-order stationarity. The autocorrelation function then is $C(h)/C(0)$; $h \in R^d$. Notice that, in terms of variances and covariances,

$$2\gamma(u,v) \equiv \mathrm{var}(Z(u) - Z(v)) = \mathrm{var}(Z(u)) + \mathrm{var}(Z(v)) - 2\,\mathrm{cov}(Z(u), Z(v)).$$

Thus, the relationship between the variogram and autocovariance function is $2\gamma(u,v) = C(u,u) + C(v,v) - 2C(u,v)$. Assuming second-order stationarity, we obtain $\gamma(h) = \sigma^2 - C(h)$, where $\sigma^2 = C(0)$.

Relationships among the variogram, PQV, nested ANOVA and TTLV

The aggregated variogram for quadrat data

Now consider the case of a belt transect or contiguous quadrats. Usually, the variogram is defined where the process $Z(\cdot)$ consists of random variables at points. However, quadrats have non-zero area. Thus, the distance between any two quadrats is ambiguous. We can, however, adopt the following conventions. Assume that the data from the belt transect are written as $Z(1), \ldots, Z(n)$. We shall specify any two adjacent $Z(s)$ and $Z(s+1)$ to be 1 unit apart, $Z(s)$ and $Z(s+2)$ to be 2 units apart, and, in general, $Z(s)$ and $Z(t)$ to be $|s - t|$ units apart. Intrinsic stationarity for the $Z(\cdot)$ random variables over the discretized regions (contiguous quadrats) then is:

(i) $E[Z(s)] = \mu$; $s = 1, 2, \ldots, n$;

(ii) $\mathrm{var}[Z(s) - Z(t)] = 2\gamma^b(|s - t|)$; $s = 1, 2, \ldots, n$; $t = 1, 2, \ldots, n$,

where the superscript $b$ indicates a variogram for a belt transect. The result in App. 2 allows a variogram for aggregated (= blocked) quadrat data to be written,

$$2\gamma^a(h;m) = -\frac{1}{m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} 2\gamma^b(|i - j|) + \frac{1}{m^2} \sum_{i=1}^{m} \sum_{j=hm+1}^{(h+1)m} 2\gamma^b(|i - j|), \qquad (8)$$

where $2\gamma^b(\cdot)$ is the unaggregated variogram, $|i - j|$ is the distance between the quadrats $i$ and $j$, $m$ is the number of contiguous plots aggregated, and $h$ is the distance, in terms of $m$, after aggregation. The superscript $a$ indicates the aggregated variogram.

Now that a spatially discrete version of the variogram is available for any aggregation level and distance (8), it is possible to derive the expected values of nested ANOVA and TTLV in terms of the variogram. From the derivation of (8) in App. 2 and (5) and (6), it is easy to obtain the expected values of MSBS($2^r$) and TTLV($m$):

$$E[\mathrm{MSBS}(2^r)] = \frac{2^r}{2} \{2\gamma^a(1;2^r)\} \qquad (9)$$

and

$$E[\mathrm{TTLV}(m)] = \frac{m}{2} \{2\gamma^a(1;m)\}. \qquad (10)$$

In fact, an unbiased method of moments estimator of (8), for $h = 1$, is:

$$2\hat{\gamma}^a(1;m) = \frac{1}{m^2} \operatorname{ave}\left\{ \left[ \sum_{i=j}^{j+m-1} Z(i) - \sum_{i=j+m}^{j+2m-1} Z(i) \right]^2 \right\}, \quad j = 1, 2, \ldots, n-2m+1. \qquad (11)$$

Then Hill's (1973) statistic (6) has the following relationship to (11);

$$\mathrm{TTLV}(m) = \frac{m}{2} \{2\hat{\gamma}^a(1;m)\}, \qquad (12)$$

which is an $m/2$ weighted estimator of the aggregated variogram.

Matheron (1963) proposed an estimator for the variogram $2\gamma^b(h)$ which, for the transect case, is,

$$2\hat{\gamma}^b(h) = \frac{1}{n-h} \sum_{i=1}^{n-h} (Z(i) - Z(i+h))^2. \qquad (13)$$

A second unbiased estimator of $2\gamma^a(1;m)$ will now be developed, based on PQV. Goodall (1974) introduced randomly-paired quadrats variance, which was subsequently modified to all-paired quadrats variance PQV (Ludwig & Goodall 1978). Although the authors did not realize it, PQV is exactly the variogram estimator in (13); i.e.,

$$\mathrm{PQV}(h) = 2\hat{\gamma}^b(h). \qquad (14)$$

Now, replace $2\gamma^b(h)$ in (8) with (13) to obtain,

$$2\tilde{\gamma}^a(1;m) = -\frac{1}{m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} 2\hat{\gamma}^b(|i - j|) + \frac{1}{m^2} \sum_{i=1}^{m} \sum_{j=m+1}^{2m} 2\hat{\gamma}^b(|i - j|). \qquad (15)$$

Thus, $2\hat{\gamma}^a(1;m)$ given by (11) and $2\tilde{\gamma}^a(1;m)$ given by (15) both estimate $2\gamma^a(1;m)$. Because $\mathrm{TTLV}(m) = \frac{m}{2} 2\hat{\gamma}^a(1;m)$, it is natural to consider an approximation to TTLV based on (15), namely,

$$\widetilde{\mathrm{TTLV}}(m) \equiv \frac{m}{2} \{2\tilde{\gamma}^a(1;m)\}. \qquad (16)$$

Because $2\tilde{\gamma}^a(1;m)$ depends on PQV through (14) and (15), the relationship between TTLV and PQV is established. Specifically, (14), (15), and (16) show that TTLV can be estimated with a linear combination of PQV values.

Comparison of $2\hat{\gamma}^a(1;m)$ and $2\tilde{\gamma}^a(1;m)$

Due to the difficulties of computing the exact variances, a simulation experiment was conducted to examine the relative efficiency of the two estimators of $2\gamma^a(1;m)$, namely, $2\hat{\gamma}^a(1;m)$ in (11) and $2\tilde{\gamma}^a(1;m)$ in (15). The relative efficiency is defined as the ratio of the variances of the two estimators.

Suppose we have a set of normally-distributed random variables $Z(i)$; $i = 1, 2, \ldots, 40$, where $i$ denotes transect location. From the following first-order autoregressive model: $E[Z(i)] = 0$, $\mathrm{var}[Z(i)] = 1$, and $\mathrm{cov}[Z(i), Z(j)] = \rho^{|i-j|}$; $i, j = 1, 2, \ldots, 40$, independent transect realizations (1000 of them) were generated on a computer for each $\rho$ value; $\rho = -0.9, -0.8, \ldots, 0.8, 0.9$. The data were simulated using the Cholesky decomposition method (Cressie 1991, p. 201). Both $2\hat{\gamma}^a(1;m)$ and $2\tilde{\gamma}^a(1;m)$ were calculated for each level of aggregation, $m = 1, 2, \ldots, 20$, for each transect. Both estimators are unbiased for $2\gamma^a(1;m)$. The variance of each statistic for each $m$, $\rho$ combination was estimated by taking the average (over the 1000 transects) squared difference between the calculated value and the true value. Fig. 1 shows the estimated efficiency for all $m$, $\rho$ combinations. It can be seen that, over positive values of $\rho$, the variance of $2\hat{\gamma}^a(1;m)$ is as much as 1.5 times the variance of $2\tilde{\gamma}^a(1;m)$, with the efficiency of $2\tilde{\gamma}^a(1;m)$ increasing with $m$. However, over negative values of $\rho$, the variance of $2\tilde{\gamma}^a(1;m)$ is as much as 10 times the variance of $2\hat{\gamma}^a(1;m)$, and $2\hat{\gamma}^a(1;m)$ also does relatively better than $2\tilde{\gamma}^a(1;m)$ with increasing $m$.
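The two estimators and one cell of the simulation can be sketched as follows. This is an illustrative reading of equations (6), (8), (11), (13) and (15) and of the Cholesky approach, not the authors' program; all function names are hypothetical, and the choices $\rho = 0.5$, $m = 4$ and the random seed are arbitrary.

```python
import numpy as np

def pqv(z, h):
    """PQV(h): the classical variogram estimator at lag h, eqs. (13)-(14)."""
    z = np.asarray(z, dtype=float)
    return np.mean((z[h:] - z[:-h]) ** 2)

def ttlv(z, m):
    """Two term local variance, eq. (6), over all start positions j."""
    z = np.asarray(z, dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(z)))
    block = csum[m:] - csum[:-m]        # sums of m contiguous quadrats
    diff = block[:-m] - block[m:]       # adjacent block pairs, j = 1..n-2m+1
    return np.mean(diff ** 2) / (2 * m)

def aggregate_lag1(two_gamma, m):
    """Eq. (8) with h = 1; two_gamma[lag] is 2*gamma^b(lag), lag = 0..2m-1."""
    i = np.arange(m)
    within = two_gamma[np.abs(i[:, None] - i[None, :])].sum()
    between = two_gamma[np.abs(i[:, None] - (i[None, :] + m))].sum()
    return (between - within) / m ** 2

def agg_variogram_blocks(z, m):
    """Estimator (11), obtained from TTLV through relation (12)."""
    return 2.0 * ttlv(z, m) / m

def agg_variogram_pqv(z, m):
    """Estimator (15): plug PQV values into eq. (8) with h = 1."""
    g = np.zeros(2 * m)                 # g[0] = 0 because 2*gamma(0) = 0
    for lag in range(1, 2 * m):
        g[lag] = pqv(z, lag)
    return aggregate_lag1(g, m)

def simulate_ar1(n, rho, n_rep, rng):
    """Gaussian transects with cov[Z(i), Z(j)] = rho**|i-j| via Cholesky."""
    idx = np.arange(n)
    cov = rho ** np.abs(idx[:, None] - idx[None, :])
    chol = np.linalg.cholesky(cov)
    return rng.standard_normal((n_rep, n)) @ chol.T

rng = np.random.default_rng(1)
rho, m = 0.5, 4
z = simulate_ar1(40, rho, 1000, rng)
# True aggregated variogram from 2*gamma^b(h) = 2*(1 - rho**h).
truth = aggregate_lag1(2.0 * (1.0 - rho ** np.arange(2 * m)), m)
mse_blocks = np.mean([(agg_variogram_blocks(t, m) - truth) ** 2 for t in z])
mse_pqv = np.mean([(agg_variogram_pqv(t, m) - truth) ** 2 for t in z])
ratio = mse_blocks / mse_pqv   # one (m, rho) cell of a Fig. 1-style surface
```

Repeating the last five lines over a grid of $m$ and $\rho$ values would give a surface of the kind summarized in Fig. 1; which variance goes in the numerator of `ratio` is an assumption of this sketch.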

Fig. 1. Efficiency of two estimators. Isopleth lines for the relative efficiency of $2\hat{\gamma}^a(1;m)$ versus $2\tilde{\gamma}^a(1;m)$, where $m$ is the level of aggregation and $\rho$ is autocorrelation in a first-order autoregressive model. (Horizontal axis: autocorrelation value, from -0.9 to 0.9.)

Examples

Consider the example of spatial independence. In this case the variogram looks like Fig. 2a; $\sigma^2 = 5$ was arbitrarily chosen as the variance of all the $Z(i)$. (The variance is one-half the 'sill', the constant height that the variogram attains.) Under this variogram model, $2\gamma^a(1;m)$ in (8) and the expectation of TTLV($m$) in (10) can be seen in Fig. 2b. This shows that the weighting factor ($m/2$) keeps TTLV($m$) invariant to aggregation when there is spatial independence.

Several common variogram models are often fit to actual data, a few of which are shown here; for the mathematical formulae and fuller discussion, see Journel & Huijbregts (1978). Fig. 3a shows a spherical variogram model. Here, variables close together have a relatively high correlation, but reach zero correlation at higher lags (arbitrarily chosen to be $h = 5$ in this example), where the variogram flattens out to the value $2\sigma^2 = 10$. (Again, the variance for all $Z(i)$ was arbitrarily chosen to be $\sigma^2 = 5$.) The expectations, $2\gamma^a(1;m)$ and $E[\mathrm{TTLV}(m)]$, for the spherical variogram are given in Fig. 3b. Fig. 4a shows the 'hole-effect', or wave variogram model, which often occurs under periodic correlation structure. Fig. 4b shows the corresponding values of $2\gamma^a(1;m)$ and $E[\mathrm{TTLV}(m)]$. Notice that for both the spherical and wave models, $E[\mathrm{TTLV}(m)]$ increases to values above the true variance of $\sigma^2 = 5$. Also, although not plotted on the figures, $E[\mathrm{MSBS}(2^r)]$ is the same as $E[\mathrm{TTLV}(m)]$ for those abscissa values where $m = 2^r$; $r = 0, 1, \ldots, k-1$.

For real data, we took a set which appeared to have no trend (Fig. 5a). These data consist of percent vertical cover in an igneous glade (Nelson 1985) in the Ozarks of southeastern Missouri (Shannon County). Igneous glades in this area occur on a bedrock of rhyolite. These glades are grassy openings in a mixed oak-hickory (Quercus spp. and Carya texana, mainly) forest matrix. The herbaceous nature of glades in this area is due to the shallow and extremely droughty nature of the soils. The glade flora is dominated by Schizachyrium scoparium and other grasses in areas with deep soil (greater than 15 cm) and by Crotonopsis elliptica and other annuals in areas with shallow soil (less than 15 cm). Along a 30-m transect, each 50-cm segment was photographed from ground level through a 10-cm wide strip of vegetation. Each segment was backed by a white screen with a vertical scale on it. The 60 photographs were each digitized, and image analysis was used to compute the percentage of non-white pixels in a vertical column 1.5 m tall by 10 cm wide, making a total of 300 estimates of % vertical cover along the transect. This method was adapted from Roebertsen, Heil & Bobbink (1988) and used by Ver Hoef, Glenn-Lewin & Werger (1989).

The raw data are given in Fig. 5a. We calculated the estimator $2\hat{\gamma}^b(h)$, given in (13), of the variogram $2\gamma^b(h)$ (Fig. 5b). Next we estimated $2\gamma^a(1;m)$ with both estimators: $2\hat{\gamma}^a(1;m)$ in (11) and $2\tilde{\gamma}^a(1;m)$ in (15) (Fig. 5c). After scaling each by ($m/2$), TTLV($m$) and $\widetilde{\mathrm{TTLV}}(m) \equiv (m/2)\,2\tilde{\gamma}^a(1;m)$ are obtained (Fig. 5d). It can be seen that the approximation using $\widetilde{\mathrm{TTLV}}(m)$ is very close to TTLV($m$), and at least retains all of the features of TTLV($m$).
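The expected values discussed in these examples follow directly from equations (8) and (10) once a variogram model is chosen. The sketch below evaluates them for the independence model and for a spherical model with $\sigma^2 = 5$ and range 5, as in the text. The function names are hypothetical, and the spherical formula used is the standard textbook form, assumed here to match the model of Fig. 3a.

```python
import numpy as np

def aggregated_variogram(two_gamma, h, m):
    """Eq. (8): 2*gamma^a(h; m), given a function two_gamma(lag) = 2*gamma^b(lag)."""
    i = np.arange(1, m + 1)
    j = np.arange(h * m + 1, (h + 1) * m + 1)
    within = two_gamma(np.abs(i[:, None] - i[None, :])).sum()
    between = two_gamma(np.abs(i[:, None] - j[None, :])).sum()
    return (between - within) / m ** 2

def expected_ttlv(two_gamma, m):
    """Eq. (10): E[TTLV(m)] = (m/2) * 2*gamma^a(1; m)."""
    return 0.5 * m * aggregated_variogram(two_gamma, 1, m)

def independence(sigma2):
    """Variogram under spatial independence: 0 at lag 0, 2*sigma2 otherwise."""
    return lambda lag: np.where(lag == 0, 0.0, 2.0 * sigma2)

def spherical(sigma2, a):
    """Standard spherical variogram with sill 2*sigma2 and range a (assumed form)."""
    def two_gamma(lag):
        r = np.minimum(np.asarray(lag, dtype=float) / a, 1.0)
        return 2.0 * sigma2 * (1.5 * r - 0.5 * r ** 3)
    return two_gamma

ms = range(1, 51)
flat = [expected_ttlv(independence(5.0), m) for m in ms]     # constant in m
curve = [expected_ttlv(spherical(5.0, 5.0), m) for m in ms]  # rises with m
```

Under independence every entry of `flat` equals the variance, which is the invariance property attributed to the ($m/2$) weighting; under the spherical model the entries of `curve` rise above the variance as $m$ grows.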

Discussion and Conclusions The statisttcal model (2) was used to show the quantities that the pattern statistics are estimating and to show their relationships. For nested ANOVA, it has been usual to assume spalially nes!ed mean effects. given by (3). (e.g. Greig-Smith 1952):orsp3lially nested random effccts, given tly (4), (e.g. Youden & Mehlich 1937), wilh independence among random variables. In contrast. the usual assumptions for a variogram analysis are intrinsic stationarity, namely, a constant mean and spatial dependence among random variables. or a variogram. which is a function only of the displacement h between the spatial locations under consideration. Assuming the variogram model, or intrinsic statio-narity. the aggregation of data by TTLV led naturally to a definition of the slalistical quantity (mI2)2t ():m). which is a variogram as a function of aggregation. Given intrinsic stationarity, the expected values ofocsted ANOVA (9) and TILV (10) are (ml2)2t (I:m). In contrast. Miesch (1975) assumed nested random effccts and then showed that the classical variogram cstimator (13) is a function of the mean square estimates from nested ANOV A. However, the result of Miesch is only true for the rather special sampling design he specifics. and would not hold for a transect of contiguous quad-

"'IS. Although Tn..V has been used previously to estimate the 'scale of pattern', a definilion of 'scale of

Ver Hoef. J. M.• Cressie, N. A. C. &

448

Glenn~Lewin.

D. C.

(.)

(·1

" to

_ ••_ •••_ •••••••••_

.__

•••

8

'i;

__

.

5

..

ol--~-~-~-~-----'

o

20

10

30 I. . b

oL_~_~_~_~_-'

o

50

10

20

30

.0

1111 h

('I 10

('I 20

6

E('M'LV(m»

T····

16

• ..................................................

.......................................

I(rn.V(m))





5

2

. .... ~.

2'y0Cl;ItI)

o

10

20 30 ..,p-e,IlUOD m

40

50

spatial randomness.

.• -.. .; ....•

' .:

>

~

:

...

..•. ..

"! , ... '

,

~

(., :;.

,•

,I

.' ",

.

,• "••

"

'

.~

1.t;

,~

,~

,

...



:a;-" ~,\.

'"

o l:..__~...:.::~: ..~.-:::..:-:t:~.,,::'::i=;:,,"" o 10 20 30 .0 50 ..arele.Uon m

1011- 2. (a) Variogram model under complete spatial randomness, where the variance was arbitrarily chosen as S. (b) Aggregated vanogram and EtTTLV(m») under complete

~

10

,~

.•.

: ,~

Fig. 3. (a) Spherical model for the vanogram, where the variance was arbitrarily chosen as 5. (b) Aggregated variogram and E[TIl.V(rn») for the spherical variogram model.

.....

..,

~(I:m)

••••• 27'(t;JU)

,.,

,., ,

~

'00

aal"'1.UoD. m

Qu.dr..t Sooq.. ftI"" I

("

("

I.'

("

"

"

" " ,• •

..... m.V(m)

·····m.V(...)

~

..".....tl_

100 III.

Flc- !!i. (a) Percent vertical cover data for igne
- Spatial models for spatial s!a!istics (.)

15

- _ _.

......... . e ... ~ . 10

5

01-_-'-_-'--_-'-_......._ - '

o

10

20

30

.0

50

'" h (0)

........•....... ...................... E(mV(m»)

25

"

20

15 10

5



o L··:~=·_.._·_~··_··=·:·_:;::·_=:·::L:::"':::·±(~',.:::=)=

o

10

20

30

4(J

50

a,ure,,,UoD m

Fi.. 4. (a) Wave model for the Yariogram. where: the Yariance was arbirrarily chosen as S. (b) Aggregated Yanagram and E [TTl.V(m») for the wave variogram model.

pattern', and moreover a statistical quantitity that embodies it, are difficult to find in the ecological literature. On Ihe other hand. rn.. V estimates the parametric quantity (mn)2t (I; m). which is of interest sincc it is invariant to aggregation m when there is spatial independence among the I'llJldom variables. We have shown that there are two estimaton of(mI2)2t (1 ;m), namely: (mI2)2yo (!;m) (=TTI.V(m».nd (mI2)2Y' (':m) (= TTI.V (m». BecauseTIL V(m)is a functionofPQV(h)(= 2ib (h», liLV can be approximated from knowledge of PQV. As Fig. 5d shows. they may often yield similar results. From our simul.:ttion slUdy. in order 10 estimate (mJ 2)2y- (l;m), Tn... V(m) can be recommended when there is negative autocorrelation, and rn..V(m) is recommended when there is positive autocorrelation. 1lle variance ofTTlV(m) is much smaller than 111..V(m) for negative valueso{ p, while the varianceoflTL Vern) is only moderately smaller than TTLVern) over positive values of p. Therefore. the 'safe' estimator to usc is n'LV(m). However, negative spatial dependence appears to be rare in nature, so in general TIl. V(m) has

449

some advantage. TIle direclion (and strength) of dependence can be checked by estimating the variogram [c.g. with equation 13. or using some of the estimaton from time series; see e.g. 8Qx & Jenkins (1976); or Pankratz (1983)]. Positive autocorrelation will give an exponential·type variogmrn model. We expect that lhe relative efficiencies, calculated for a Simple autoregressive model. have the same qualitative features for more: general models. An alternative, which has not been considered here, is spectral analysis (e.g. Ripley 1978). In ecological uses of spectral analysis. the deterministic mean Structure is modeled as periodic, a sum of sine or cosine waves, and the residuals arc usually modeled as independent random CrTOrs. Ripley (1978) points out that TILV and nested ANOV A for transect data can be seen as a type of spectral analysis with 'square waves'. Of practical concern 10 the ecologisl is the use of the variogram, TTl... V, nested ANOV A, or spectral analysis. among other methods, for analyzing spatial data. While they may all be used to explore and describe the data, ultimately one wants a model. Then, the decision depends on how one wants to model the dc=romposition: Z(.) = detenninistic mean strUcture + random enor. It is important to noc:e that this decomposition is not unique. 1berefore, the choice depends on whether the investigator wants to model spatial heterogeneity as variation in nested blocks of detenninistic mean struc· ture and assume independent residuals(nested ANOVA). 10 model the spatial heterogeneity as periodic mean structure with independent or dependent residuals (spectral analysis), or to model spatial heterogeneity as autocorrclated random error with constant mean structure (variogram). This decision will depend in part on Ihe ohjectives of Ihe invcsligalor, and in pan on which model assumptions seem biologically reasonable. Additionally, the methods are not mutually exclusive. 
It is possible to consider models such as spatially nested effects, where the random errors are independent among blocks, but are correlated within blocks. Intuitively, the spatially nested effects could be removed with nested ANOVA, and the residuals analyzed by a variogram analysis. The properties of such approaches need further study.

Acknowledgements. The authors would like to thank M. B. Dale, P. Dixon, M. W. Palmer, one anonymous reviewer, and the editor R. K. Peet for their helpful comments. This work was supported by funds from the Whitehall Foundation, the North Atlantic Treaty Organization (NATO), the National Park Service, and the National Science Foundation, grant number DMS-9001862.


References

Box, G. E. P. & Jenkins, G. M. 1976. Time Series Analysis: Forecasting and Control. Holden-Day, Oakland, CA.
Carpenter, S. R. & Chaney, J. E. 1983. Scale of spatial patterns: four methods compared. Vegetatio 53: 153-169.
Cliff, A. D. & Ord, J. K. 1981. Spatial Processes: Models and Applications. Pion, London.
Cressie, N. 1991. Statistics for Spatial Data. John Wiley and Sons, New York, NY.
Dale, M. R. T. & Blundon, D. J. 1990. Quadrat variance analysis and pattern development during primary succession. J. Veg. Sci. 1: 153-164.
Dale, M. R. T. & MacIsaac, D. A. 1989. New methods for the analysis of spatial pattern in vegetation. J. Ecol. 77: 78-91.
Errington, J. C. 1973. The effect of regular and random distributions on the analysis of pattern. J. Ecol. 61: 99-105.
Galiano, E. F. 1983. Detection of multi-species patterns in plant populations. Vegetatio 53: 129-138.
Goodall, D. W. 1974. A new method for the analysis of spatial pattern by random pairing of qua...
Greig-Smith, P. 1952. The use of random and contiguous quadrats in the study of structure in plant communities. Ann. Bot. N. S. 16: 293-316.
Greig-Smith, P. 1979. Pattern in vegetation. J. Ecol. 67: 755-779.
Greig-Smith, P. 1983. Quantitative Plant Ecology. University of California Press, Berkeley, CA.
Hill, M. O. 1973. The intensity of spatial pattern in plant communities. J. Ecol. 61: 225-235.
Journel, A. G. & Huijbregts, C. J. 1978. Mining Geostatistics. Academic Press, London.
Kershaw, K. A. 1957. The use of cover and frequency in the detection of pattern in plant communities. Ecology 38: 291-299.
Kershaw, K. A. & Looney, J. H. H. 1985. Quantitative and Dynamic Plant Ecology. Edward Arnold, London.
Legendre, P. & Fortin, M. 1989. Spatial pattern and ecological analysis. Vegetatio 80: 107-138.
Ludwig, J. A. 1979. A test of different quadrat variance methods for the analysis of spatial patterns. In: Cormack, R. M. & Ord, J. K. (eds.) Spatial and Temporal Analysis in Ecology. Int. Coop. Publishing House, Fairfield, MD.
Ludwig, J. A. & Goodall, D. W. 1978. A comparison of paired- with blocked-quadrat variance methods for the analysis of spatial pattern. Vegetatio 38: 49-59.
Matheron, G. 1963. Principles of geostatistics. Econ. Geol. 58: 1246-1266.
McIntosh, R. P. 1985. The Background of Ecology. Cambridge University Press, Cambridge.
Miesch, A. T. 1975. Variograms and variance components in geochemistry and ore evaluation. Geol. Soc. Am. Mem. 42: 333-340.
Morris, D. W. 1987. Ecological scale and habitat use. Ecology 68: 362-369.
Nelson, P. W. 1985. The Terrestrial Natural Communities of Missouri. Missouri Dept. of Conservation, Jefferson City, MO.
Nortcliff, S. 1978. Soil variability and reconnaissance soil mapping: a statistical study in Norfolk. J. Soil Sci. 29: 403-418.
Orlóci, L. & Orlóci, M. 1990. Locating discontinuities along ecological gradients. J. Veg. Sci. 1: 311-324.
Pankratz, A. 1983. Forecasting with Univariate Box-Jenkins Models. John Wiley and Sons, New York, NY.
Pielou, E. C. 1977. Mathematical Ecology. John Wiley and Sons, New York, NY.
Ripley, B. D. 1978. Spectral analysis and the analysis of pattern in plant communities. J. Ecol. 66: 965-981.
Robertson, G. P. 1987. Geostatistics in ecology: interpolating with known variance. Ecology 68: 744-748.
Robertson, G. P., Huston, M. A., Evans, F. C. & Tiedje, J. M. 1988. Spatial variability in a successional plant community: patterns of nitrogen availability. Ecology 69: 1517-1524.
Roebertsen, H., Heil, G. W. & Bobbink, R. 1988. Digital picture processing: a new method to measure vegetation structure. Acta Bot. Neerl. 37: 187-192.
Sokal, R. R. & Oden, N. L. 1978. Spatial autocorrelation in biology 1. Methodology. Biol. J. Linn. Soc. 10: 199-228.
Thompson, H. R. 1955. Spatial point processes with application to ecology. Biometrika 42: 102-115.
Thompson, H. R. 1958. The statistical study of plant distribution patterns using a grid of quadrats. Aust. J. Bot. 6: 322-342.
Usher, M. B. 1969. The relation between mean square and block size in the analysis of similar patterns. J. Ecol. 57: 505-514.
Usher, M. B. 1975. Analysis of pattern in real and artificial plant populations. J. Ecol. 63: 569-586.
Ver Hoef, J. M., Glenn-Lewin, D. C. & Werger, M. J. A. 1989. Relationship between horizontal pattern and vertical structure in a chalk grassland. Vegetatio 83: 147-155.
Webster, R. & Butler, B. E. 1976. Soil classification and survey studies at Ginninderra. Aust. J. Soil Res. 14: 1-24.
Youden, W. J. & Mehlich, A. 1937. Selection of efficient methods for soil sampling. Contr. Boyce Thompson Inst. Plant Res. 9: 59-70.

Received 4 June 1991; Revision received 28 December 1992; Accepted 15 January 1993.


App. 1. Mean square as a function of block size. The mean square for each block (aggregation) effect in a spatially nested ANOVA of a transect of contiguous quadrats can be written as:

$$\mathrm{MSBS}(2^{r}) = \frac{1}{2(2^{r})}\,\mathrm{ave}\left[\left(\sum_{i=j+1}^{j+2^{r}} Z(i) - \sum_{i=j+2^{r}+1}^{j+2^{r+1}} Z(i)\right)^{2}\right],$$

where $j = (q-1)2^{r+1}$, $q = 1, \ldots, 2^{k-r-1}$ and $r = 0, 1, \ldots, k-1$.

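Proof. First, consider the algebra for calculating the mean square of block size 1, MSBS(1). From Table 1, change from Y(·) notation to Z(·) notation to obtain,

$$\begin{aligned}
\mathrm{MSBS}(1) &= \frac{1}{2^{k-1}}\left[\left(Z(1) - \frac{Z(1)+Z(2)}{2}\right)^{2} + \left(Z(2) - \frac{Z(1)+Z(2)}{2}\right)^{2} + \left(Z(3) - \frac{Z(3)+Z(4)}{2}\right)^{2} + \left(Z(4) - \frac{Z(3)+Z(4)}{2}\right)^{2} + \cdots\right] \\
&= \frac{1}{2^{k-1}}\,\frac{1}{2}\left[(Z(1)-Z(2))^{2} + (Z(3)-Z(4))^{2} + \cdots + (Z(n-1)-Z(n))^{2}\right].
\end{aligned} \tag{A1.1}$$

Notice that there are exactly $2^{k-1}$ terms summed in (A1.1); so write,

$$\mathrm{MSBS}(1) = \frac{1}{2}\,\mathrm{ave}\left[(Z(1)-Z(2))^{2},\ (Z(3)-Z(4))^{2},\ \ldots,\ (Z(n-1)-Z(n))^{2}\right]. \tag{A1.2}$$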

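For block size 2, let

$$W(1) = \frac{Z(1)+Z(2)}{2}, \quad W(2) = \frac{Z(3)+Z(4)}{2}, \quad \text{etc.}$$

Then, calculate the mean square for block size 2, MSBS(2), as was done for MSBS(1) in (A1.1):

$$\mathrm{MSBS}(2) = \frac{1}{2^{k-2}}\,\frac{2}{2}\left[(W(1)-W(2))^{2} + (W(3)-W(4))^{2} + \cdots + \left(W\!\left(\tfrac{n}{2}-1\right) - W\!\left(\tfrac{n}{2}\right)\right)^{2}\right]. \tag{A1.3}$$

In (A1.3), there are exactly $2^{k-2}$ terms summed; so write:

$$\begin{aligned}
\mathrm{MSBS}(2) &= \mathrm{ave}\left[(W(1)-W(2))^{2},\ (W(3)-W(4))^{2},\ \ldots,\ \left(W\!\left(\tfrac{n}{2}-1\right) - W\!\left(\tfrac{n}{2}\right)\right)^{2}\right] \\
&= \mathrm{ave}\left[\left(\frac{Z(1)+Z(2)}{2} - \frac{Z(3)+Z(4)}{2}\right)^{2},\ \ldots,\ \left(\frac{Z(n-3)+Z(n-2)}{2} - \frac{Z(n-1)+Z(n)}{2}\right)^{2}\right] \\
&= \frac{1}{4}\,\mathrm{ave}\left[(Z(1)+Z(2)-Z(3)-Z(4))^{2},\ \ldots,\ (Z(n-3)+Z(n-2)-Z(n-1)-Z(n))^{2}\right].
\end{aligned} \tag{A1.4}$$

Generalizing from (A1.1) through (A1.4), we obtain,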


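$$\begin{aligned}
\mathrm{MSBS}(2^{r}) &= \frac{2^{r}}{2}\,\mathrm{ave}\left[\frac{1}{(2^{r})^{2}}\left(Z(1)+\cdots+Z(2^{r}) - Z(2^{r}+1) - \cdots - Z(2^{r+1})\right)^{2},\ \ldots\right] \\
&= \frac{1}{2(2^{r})}\,\mathrm{ave}\left[\left(Z(1)+\cdots+Z(2^{r}) - Z(2^{r}+1) - \cdots - Z(2^{r+1})\right)^{2},\ \left(Z(2(2^{r})+1)+\cdots+Z(3(2^{r})) - Z(3(2^{r})+1) - \cdots - Z(4(2^{r}))\right)^{2},\ \text{etc.}\right]
\end{aligned}$$

so,

$$\mathrm{MSBS}(2^{r}) = \frac{1}{2(2^{r})}\,\mathrm{ave}\left[\left(\sum_{i=j+1}^{j+2^{r}} Z(i) - \sum_{i=j+2^{r}+1}^{j+2^{r+1}} Z(i)\right)^{2}\right],$$

where ave[·] is the average of all terms $j = (q-1)2^{r+1}$; $q = 1, \ldots, 2^{k-r-1}$ for a given r; $r = 0, 1, \ldots, k-1$.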
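The result of App. 1 can be checked numerically. The following Python sketch is a minimal illustration, not the authors' code: the function names `msbs` and `msbs_anova` are hypothetical, the transect is a plain list of length 2^k, and indices are 0-based, so the offset j = (q-1)2^(r+1) is the start of each pair of adjacent blocks. The first function evaluates the closed form above; the second computes the same mean square directly as a nested-ANOVA sum of squares (blocks of size 2^r within blocks of size 2^(r+1)), assuming the usual weighting of squared deviations of block means by block size.

```python
def msbs(z, r):
    """Mean square for block size 2**r from the closed form of App. 1.

    z is a transect of contiguous quadrats whose length is a multiple
    of 2**(r+1) (in the paper, n = 2**k).
    """
    n = len(z)
    block = 2 ** r
    if n % (2 * block) != 0:
        raise ValueError("transect length must be a multiple of 2**(r+1)")
    squares = []
    # j runs over the starting offsets of each pair of adjacent blocks
    for j in range(0, n, 2 * block):
        first = sum(z[j:j + block])               # Z(j+1) + ... + Z(j+2^r)
        second = sum(z[j + block:j + 2 * block])  # Z(j+2^r+1) + ... + Z(j+2^(r+1))
        squares.append((first - second) ** 2)
    return sum(squares) / len(squares) / (2 * block)


def msbs_anova(z, r):
    """Same mean square computed as a nested-ANOVA sum of squares."""
    n = len(z)
    block = 2 ** r
    ss = 0.0
    for j in range(0, n, 2 * block):
        parent_mean = sum(z[j:j + 2 * block]) / (2 * block)
        for start in (j, j + block):
            child_mean = sum(z[start:start + block]) / block
            # each child block contributes block-size times its squared deviation
            ss += block * (child_mean - parent_mean) ** 2
    degrees_of_freedom = n // (2 * block)  # one per pair of adjacent blocks
    return ss / degrees_of_freedom
```

For the made-up transect `[1, 3, 2, 6, 4, 4, 7, 1]`, `msbs(z, 0)` gives 7.0 and `msbs(z, 1)` gives 2.0, and both functions agree for every admissible r.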

App. 2. The variogram for discrete contiguous quadrats for any level of aggregation. We wish to aggregate the Z(·) by averaging them into groups of size m. Let

$$Z(s;m) \equiv \frac{1}{m}\sum_{i=s}^{s+m-1} Z(i),$$

where s is the starting position. Then, assuming intrinsic stationarity, the variogram under aggregation is also defined; it is:

$$\mathrm{var}\left[Z(s;m) - Z(t;m)\right] \equiv 2\gamma^{a}(s,t;m) = E\left[\frac{1}{m}\sum_{i=s}^{s+m-1} Z(i) - \frac{1}{m}\sum_{i=t}^{t+m-1} Z(i)\right]^{2}. \tag{A2.1}$$

Then, for the case of the transect of contiguous quadrats that we are considering, the following expression of $2\gamma^{a}(s,t;m)$ in terms of $2\gamma^{b}$ can be easily derived:

$$2\gamma^{a}(s,t;m) = -\frac{1}{m^{2}}\sum_{i=s}^{s+m-1}\sum_{j=s}^{s+m-1}\gamma^{b}(|i-j|) - \frac{1}{m^{2}}\sum_{i=t}^{t+m-1}\sum_{j=t}^{t+m-1}\gamma^{b}(|i-j|) + \frac{2}{m^{2}}\sum_{i=s}^{s+m-1}\sum_{j=t}^{t+m-1}\gamma^{b}(|i-j|). \tag{A2.2}$$

Now, after aggregation to size m, define the lag between Z(s+hm; m) and Z(s; m) as h. That is, we are emphasizing that no matter what the scale of aggregation m, two adjacent Z(s+m; m) and Z(s; m) are one unit apart, and the size of that unit depends on m. Assuming intrinsic stationarity, from (A2.2) we can express the variogram $2\gamma^{a}(s,t;m)$ as both a function of aggregation m and distance h conditional on the level of aggregation: Define $2\gamma^{a}(h;m) \equiv 2\gamma^{a}(s,s+hm;m)$, and, from (A2.2), we obtain,

$$2\gamma^{a}(h;m) = -\frac{1}{m^{2}}\sum_{i=1}^{m}\sum_{j=1}^{m} 2\gamma^{b}(|i-j|) + \frac{1}{m^{2}}\sum_{i=1}^{m}\sum_{j=hm+1}^{hm+m} 2\gamma^{b}(|i-j|), \tag{A2.3}$$

where m is the number of adjacent quadrats which are averaged, and h is the distance (in units of length m) after aggregation. After some algebra, (A2.3) can be simplified to a computing formula involving only one summation,

$$2\gamma^{a}(h;m) = \frac{2\gamma^{b}(hm)}{m} + \frac{1}{m^{2}}\sum_{i=1}^{m}\left\{2(i-m)\,2\gamma^{b}(i) + (m-i)\left[2\gamma^{b}(hm+i) + 2\gamma^{b}(hm-i)\right]\right\}.$$
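As a numerical check on the algebra of App. 2, the following Python sketch evaluates the aggregated variogram both from the double sums in (A2.3) and from a single-sum form. The function names are hypothetical, and the basic-level variogram used for illustration, 2γ^b(h) = 2σ²(1 - ρ^|h|) for a stationary first-order autoregressive process, is an assumption made here rather than a computation taken from the paper. The single sum is run over i = 1, ..., m-1, because the weights (i-m) and (m-i) vanish at i = m.

```python
def gamma2_ar1(lag, rho=0.5, sigma2=1.0):
    """Assumed basic-level variogram 2*gamma^b(lag) of a stationary AR(1) process."""
    return 2.0 * sigma2 * (1.0 - rho ** abs(lag))


def agg_variogram_double(gamma2_b, h, m):
    """Aggregated variogram 2*gamma^a(h; m) from the double sums of (A2.3)."""
    within = sum(gamma2_b(abs(i - j))
                 for i in range(1, m + 1)
                 for j in range(1, m + 1))
    between = sum(gamma2_b(abs(i - j))
                  for i in range(1, m + 1)
                  for j in range(h * m + 1, h * m + m + 1))
    return (between - within) / m ** 2


def agg_variogram_single(gamma2_b, h, m):
    """Aggregated variogram 2*gamma^a(h; m) from a single summation."""
    total = gamma2_b(h * m) / m
    for i in range(1, m):  # the i = m term has zero weight
        total += (2 * (i - m) * gamma2_b(i)
                  + (m - i) * (gamma2_b(h * m + i) + gamma2_b(h * m - i))) / m ** 2
    return total


def ttlv_target(gamma2_b, m):
    """Parametric quantity (m/2) * 2*gamma^a(1; m) discussed in the text."""
    return (m / 2.0) * agg_variogram_single(gamma2_b, 1, m)
```

With ρ = 0 (spatial independence) and σ² = 1, `ttlv_target` returns 1 for every m up to rounding error, which is the invariance to aggregation noted in the Discussion; with ρ ≠ 0 the value changes with m.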
