Estimating sampling errors for major and trace elements in geological materials using a propagation of variance approach

Clifford R. Stanley
Department of Geology, Acadia University, Wolfville, Nova Scotia, B4P 2R6 Canada (e-mail: cliff[email protected])

ABSTRACT: Sampling errors produced when geological materials (rocks, soils, tills, drainage sediments) are collected have been estimated empirically using variance decomposition methods or theoretically using Poisson or binomial statistics. Unfortunately, historical distribution-based approaches assume that the element of interest occurs in only one mineral. Although this may be true in some cases, most major oxide and many trace elements reside in more than one mineral in most geological materials. As a result, historical distribution-based approaches do not estimate sampling errors correctly. An alternative theoretical approach to sampling error estimation is proposed that employs both Poisson and hypergeometric statistics, depending on whether the elements of interest reside in rare or common grains. It is intended for use in advance of sampling to ensure that samples in a survey will be collected in sufficient size to achieve a desired level of sampling precision. This method requires estimates of the proportions, sizes and compositions of the minerals making up the geological material, and thus is based on information readily available from a few (orientation) samples of the material to be sampled. This approach accommodates cases where more than one mineral contains an element of interest. It involves first estimating the sampling error for the minerals present in the geological material. Then, the mineral sampling errors are used to make estimates of the sampling error of all elements within these minerals simultaneously using a simple propagation of variance approach. An EXCEL spreadsheet is provided that undertakes the relevant calculations, and this can be adapted to consider any suite of minerals and elements in geological materials.

KEYWORDS: Sampling error, sample preparation, binomial distribution, hypergeometric distribution, Poisson distribution, variance propagation

INTRODUCTION

Geochemical variations attributable to sampling and preparation (or sub-sampling) errors have been considered by a variety of authors (Wickman 1962; Wilson 1964; Kleeman 1967; Visman 1969; Engels & Ingamells 1970; Ingamells & Switzer 1973; Ingamells 1974a, b, 1981; Gy 1974, 1979; Cheng 1995; Clifton et al. 1969; Stanley 1998). These authors generally conclude that in many cases, sampling errors may represent the majority of error inherent in a geochemical analysis, and these errors may be so large as to obscure the underlying controls on the geochemical composition. Field-based studies comparing the magnitude of sampling and analytical error (e.g. Baird et al. 1967; Morton et al. 1969; Cameron et al. 1979) confirm these conclusions. More recently, the new analytical methods that have significantly reduced analytical error (Hall 1993) have made sampling error an even larger proportion of the total measurement error. Because geochemical analysis typically involves collecting a (hopefully) representative sample and reducing the size of that sample to a mass convenient for instrumental analysis, knowledge of the actual magnitude of sampling and preparation error is critical to developing sampling and preparation strategies that do not introduce significant error.

Geochemistry: Exploration, Environment, Analysis, Vol. 3 2003, pp. 169–178

The fundamental controls that determine the magnitude of sampling error are: (1) sample size; (2) grain size(s) of the mineral(s) hosting the element of interest; (3) the abundance(s) of the mineral(s) containing the element of interest; and (4) the concentration(s) of the element of interest in the host mineral(s). The last three of these factors are characteristics of the geological material sampled, and typically determine the size of the sample necessary to obtain a desired level of sampling or preparation precision. Essentially, the ‘law of large numbers’ dictates that the larger the sample, the smaller the sampling error. However, collecting too large a sample will introduce additional sample preparation costs, so determining how large a sample to collect to achieve, and not over-achieve, a specified level of sampling error is a critical objective in geochemical survey design. Two fundamental approaches have considered these four factors to produce methods that quantify the magnitude of sampling or sample preparation error: empirical approaches (Visman 1969; Gy 1974, 1979) and theoretical approaches (Wickman 1962; Wilson 1964; Kleeman 1967; Clifton et al. 1969; Engels & Ingamells 1970; Ingamells & Switzer 1973; Ingamells 1974a, b, 1981; Cheng 1995; Stanley 1998).

1467-7873/03/$15.00 © 2003 AEG/Geological Society of London


The most popular empirical approach involves Pierre Gy’s sampling theory (Gy 1974, 1979). This employs a number of empirical constants to describe the physical, mineralogical and geochemical characteristics of the samples under consideration. This approach can be applied to both rare and common mineral grain sampling scenarios, and has proven to be a useful method of determining sampling and preparation error in many geochemical samples (Gy 1974, 1979; Ottley 1983; Sketchley 1997; Francois-Bongarcon 1998). However, Gy’s method does require the estimation of a variety of empirical factors generally determined from prior sampling results (Sketchley 1997; Francois-Bongarcon 1998). As a result, Gy’s sampling technique (1974, 1979) cannot readily be used to estimate appropriate sample size in advance of sampling.

In contrast, theoretical approaches utilize statistical distributions to quantify the magnitude of sampling and preparation error a priori. As such, they require a simple sampling scenario so that the underlying statistical assumptions can be met. In order to achieve this simple scenario, the ‘equant grain model’ (Stanley 1998), or equivalent assumptions, have typically been invoked (Clifton et al. 1969; Visman 1969; Ingamells & Switzer 1973; Cheng 1995; Ingamells 1974a, 1974b, 1981). The ‘equant grain model’ assumes that: (1) all of the mineral grains have the same size; (2) the element of interest resides in only one mineral; (3) the concentration of the element of interest in that mineral is constant; and (4) the compositions of all other (gangue) mineral grains are the same. If the mineral containing the element of interest is a rare grain (nugget), then the Poisson distribution can be used effectively to quantify the sampling error (Wickman 1962; Ingamells & Switzer 1973; Ingamells 1974a, b, 1981; Clifton et al. 1969; Stanley 1998).

In contrast, if the mineral grain containing the element of interest is common (making up a significant proportion of the mineral grains), then the binomial distribution has been invoked in an attempt to quantify the sampling error (Wilson 1964; Kleeman 1967; Engels & Ingamells 1970; Ingamells & Switzer 1973; Cheng 1995). Unfortunately, use of the binomial distribution in this instance is not statistically valid. This is because the binomial distribution applies to trials where there is random sampling with replacement (Spiegel 1975). The replacement ensures that the probabilities do not change from trial to trial. Geological materials, however, are aggregates of a large number of mineral grains, so sampling a geological material can be considered as a series of many random selections of mineral grains. Once a grain is selected as part of the sample, it is not replaced. Thus, the probabilities of collecting mineral grains of interest change during this conceptual series of random selections, and the appropriate distribution to model the scenario is the hypergeometric distribution, which applies to cases of random sampling without replacement (Spiegel 1975).

Furthermore, virtually all theoretical approaches estimating sampling or preparation error involving Poisson or binomial statistics assume that the element of interest is contained within only one mineral. Theoretical approaches using binomial statistics have been developed to consider cases where the element of interest is contained in trace amounts in the gangue mineral (Engels & Ingamells 1970; Cheng 1995), but the mathematics have generally proven to be unwieldy. Whereas the element of interest may reside in one mineral in some cases (e.g. Au in native gold, Zr in zircon), this is not true in general. As a result, the sampling error estimation approaches developed to date cannot rigorously make accurate estimates of sampling errors for most elements in geological materials.
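The distinction between sampling with and without replacement can be illustrated numerically. The following is a minimal sketch rather than part of the original method: it enumerates the exact probability of drawing x target grains in a sample of n grains, for a small hypothetical population of N grains containing K target grains (these numbers and the function names are illustrative assumptions), and compares the resulting variances.

```python
from math import comb

def variance_with_replacement(N, K, n):
    """Exact variance of the target-grain count when each drawn grain is put back (binomial case)."""
    p = K / N  # probability of a target grain stays constant from draw to draw
    pmf = {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}
    mean = sum(x * w for x, w in pmf.items())
    return sum((x - mean) ** 2 * w for x, w in pmf.items())

def variance_without_replacement(N, K, n):
    """Exact variance of the target-grain count when drawn grains are not put back (hypergeometric case)."""
    lo, hi = max(0, n - (N - K)), min(K, n)  # feasible numbers of target grains
    total = comb(N, n)  # number of equally likely samples of n grains
    pmf = {x: comb(K, x) * comb(N - K, n - x) / total for x in range(lo, hi + 1)}
    mean = sum(x * w for x, w in pmf.items())
    return sum((x - mean) ** 2 * w for x, w in pmf.items())

# Hypothetical population: 20 grains, 5 of them the mineral of interest; sample of 10 grains.
N, K, n = 20, 5, 10
print(variance_with_replacement(N, K, n))
print(variance_without_replacement(N, K, n))
```

With these numbers the variance without replacement is smaller than the variance with replacement by the factor (N − n)/(N − 1), here 10/19, so treating a finite population as if grains were replaced overstates the sampling error.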

The following theoretical approach to sampling error estimation overcomes these two problems. It can be employed using either or both Poisson and hypergeometric statistics to simultaneously address sampling errors associated with rare and common grains. Furthermore, it does not assume that only one mineral contains the element of interest. Finally, because different elements reside in different minerals that have different grain sizes and proportions, different sample sizes may be necessary to achieve a specific level of sampling error for different elements. The theoretical approach developed below allows calculation of the magnitude of sampling error for all elements simultaneously, and thus also avoids this further complication. Input required to estimate sampling error includes only the proportions, sizes and compositions of the minerals making up the geological material. As a result, this technique can be employed in advance of a geochemical survey using information derived from a small number of (orientation) samples.

APPROACH

In order to develop a sampling error estimation method with the above characteristics, the relationship between the composition of a geological material and its mineral proportions must be formalized. Consider a geological material that contains a suite of several (p) minerals. The concentration of an element in the geological material can be determined from the relative proportions and compositions of the contained minerals, according to the equation:

$$X_i = \sum_{j=1}^{p} N_j X_{ij} \qquad (1)$$

where X is a mass concentration, N is a mineral proportion (in mass terms), j is a mineral index and i is an element index. An equation of this form exists for every (major, minor and trace) element in a geological material, and these equations establish the relationship between a geological material’s geochemical and modal composition. For example, consider a granite rock composed of quartz, albite, potassium feldspar and phlogopite. The composition of SiO2 in this granite is thus governed by:

$$X_{\mathrm{SiO_2}} = N_{\mathrm{QTZ}} X_{\mathrm{SiO_2,QTZ}} + N_{\mathrm{ALB}} X_{\mathrm{SiO_2,ALB}} + N_{\mathrm{KSP}} X_{\mathrm{SiO_2,KSP}} + N_{\mathrm{PHL}} X_{\mathrm{SiO_2,PHL}}$$

Similar equations govern the concentrations of other major, minor and trace elements in the rock. For example:

$$X_{\mathrm{K_2O}} = N_{\mathrm{QTZ}} X_{\mathrm{K_2O,QTZ}} + N_{\mathrm{ALB}} X_{\mathrm{K_2O,ALB}} + N_{\mathrm{KSP}} X_{\mathrm{K_2O,KSP}} + N_{\mathrm{PHL}} X_{\mathrm{K_2O,PHL}}$$

Thus, with knowledge of the geological material’s mineral proportions and compositions, the bulk composition of a geological material can be deduced from the set of relations represented by Equation 1.

Variance propagation

If the element concentration measurement errors associated with sampling a geological material are desired, then simple error variance propagation can be used to estimate these errors, provided the errors in the mineral proportions and element concentrations are known. The equation to estimate these sampling errors can be obtained from a generative formula

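derived from a Taylor series first order expansion (Meyer 1975; Rees 1984; Stanley 1990; Vacher 2001):

$$\sigma^2_{f(X)} = \sum_{j=1}^{p} \left(\frac{\partial f}{\partial X_j}\right)^2 \sigma^2_{X_j} + 2 \sum_{k=1}^{p-1} \sum_{j=k+1}^{p} \left(\frac{\partial f}{\partial X_j}\right) \left(\frac{\partial f}{\partial X_k}\right) \sigma_{X_j X_k} \qquad (2)$$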
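As a worked step, assuming Equation 1 plays the role of f in Equation 2, with the mineral proportions $N_j$ and the mineral compositions $X_{ij}$ as its variables, the required partial derivatives are:

```latex
\frac{\partial X_i}{\partial N_j} = X_{ij},
\qquad
\frac{\partial X_i}{\partial X_{ij}} = N_j
```

Squaring each derivative and multiplying it by the variance of its own variable gives one term in $\sigma^2_{N_j}$ and one term in $\sigma^2_{X_{ij}}$ for every mineral j.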

Because the mineral proportions and mineral compositions are independent, the second summation term in Equation 2 cancels (because all $\sigma_{X_j X_k} = 0$), and the sampling error propagation equation associated with Equation 1 reduces to:

$$\sigma^2_{X_i} = \sum_{j=1}^{p} \left( \sigma^2_{N_j} X_{ij}^2 + N_j^2 \sigma^2_{X_{ij}} \right) \qquad (3)$$

where the $\sigma^2$ terms are the variances of the associated mineral proportions and mineral compositions. To solve Equation 3 and determine the sampling errors for any element in a geological material requires knowledge of both the mineral proportions and mineral compositions in that material, as well as the measurement errors associated with each mineral proportion and mineral composition. Unfortunately, Equation 3 becomes complicated if significant compositional variability exists in some minerals. As a result, let us assume, for the moment, that every mineral in the geological material has an invariant composition, and that we either know this composition, can assume it as ideal, or can estimate it based on petrological principles. Note that below we will consider a strategy that will accommodate variations in composition, so that the more general case involving mineral composition variations can be considered. If the compositions of all minerals are invariant, we can substantially simplify Equation 3:

$$\sigma^2_{X_i} = \sum_{j=1}^{p} \sigma^2_{N_j} X_{ij}^2 \qquad (4)$$

because all $\sigma^2_{X_{ij}} = 0$. As a result, the error in the mineral proportion estimates and the concentrations of the elements in each mineral are all that are required to estimate the sampling error in each element concentration in the geological material. Using measured, assumed or constrained mineral compositions to determine the element concentrations in each mineral, the element sampling errors become functions of only the errors on the modal estimates of the minerals:

$$\sigma^2_{X_{\mathrm{SiO_2}}} = \sigma^2_{N_{\mathrm{QTZ}}} (100.00\%)^2 + \sigma^2_{N_{\mathrm{ALB}}} (68.74\%)^2 + \sigma^2_{N_{\mathrm{KSP}}} (64.76\%)^2 + \sigma^2_{N_{\mathrm{PHL}}} (41.00\%)^2$$

and:

$$\sigma^2_{X_{\mathrm{K_2O}}} = \sigma^2_{N_{\mathrm{QTZ}}} (0.00\%)^2 + \sigma^2_{N_{\mathrm{ALB}}} (0.00\%)^2 + \sigma^2_{N_{\mathrm{KSP}}} (16.92\%)^2 + \sigma^2_{N_{\mathrm{PHL}}} (10.71\%)^2$$

Thus, determining the mineral proportion sampling errors is all that is required to determine the magnitude of sampling error on the bulk composition for the geological material.

Common mineral proportion sampling errors

In order to estimate the sampling error for a geological material, the equant grain model must be invoked. However, because we wish to consider cases where more than one mineral contains the element of interest and where minerals have different sizes and shapes, we need to consider several equant grain models, one for each mineral of different size and shape (Fig. 1). As a result, we create several equant grain models, each involving equant grains with a size and shape equal to the size and shape of the target mineral, as determined from prior hand sample or thin section examination. Thus, for a large mineral, the geological material is conceptually broken up into equant grains the size of the large mineral. Consequently, the large mineral will conceptually occur fully liberated in the equant grains, but the smaller minerals will be composited into the larger equant grains (Fig. 2). For a small mineral, the geological material is conceptually broken up in a different way such that the equant grains have a size equal to the small mineral. Consequently, the small mineral will occur conceptually fully liberated in individual grains, but the larger minerals will be broken up into the smaller equant grains (Fig. 3). Finally, for a medium-sized mineral, the geological material is conceptually broken up such that the equant grains have a size equal to the medium-sized mineral. In this case, the smaller minerals will be conceptually composited into the medium-sized equant grains, whereas the larger minerals will be conceptually broken up into the medium-sized equant grains (Fig. 4).

Fig. 1. Schematic geological material containing six different minerals of differing size, shape, colour and volume proportion.

Once these equant grain models are invoked, the mineral proportion sampling errors for minerals that are common in the sample can be estimated for each mineral using the hypergeometric distribution, whose variance is:

$$\sigma^2 = np(1 - p) \left( \frac{N - n}{N - 1} \right) \qquad (5)$$

In this application, $\sigma^2$ is the variance of the hypergeometric distribution and has units of number of grains squared, n is the effective number of equant grains in the sample (and must generally be greater than 20 for the hypergeometric variance formula to apply), p is the proportion of the mineral grain of interest in the sample, and N is the effective number of equant grains in the population. The effective number of equant grains in a sample can be determined for each mineral from the given sample size and the grain size of the mineral of interest using:

$$n = \frac{M}{m} \qquad (6)$$

(Clifton et al. 1969; Stanley 1998) where M is the mass of the sample and m is the mass of a grain of the mineral of interest. The proportion of the mineral grain of interest can be estimated from prior hand sample or thin section examination. As a result, the only additional parameter necessary to estimate is N, the effective number of grains in the population.

Fig. 2. Schematic geological material conceptually broken up into equant grains the size of the large dark gray square mineral of Figure 1 (now darkened for illustrative purposes). Note that in order to invoke the ‘equant grain model’, the other equant grains of the same size are composites of several other minerals. The large dark gray square mineral comprises 17.36% (25 of 144 equant grains) of the geological material.

Fig. 3. Schematic geological material conceptually broken up into equant grains the size of the small dark gray square mineral of Figure 1 (now darkened for illustrative purposes). Note that the larger grains have been broken up into equant grain size to be able to invoke the ‘equant grain model’. The small dark gray square mineral comprises 14.93% (86 of 576 equant grains) of the geological material.

Fig. 4. Schematic geological material conceptually broken up into equant grains the size of the elongate white mineral of Figure 1 (now darkened for illustrative purposes). Note that, in order to be able to invoke the ‘equant grain model’, the smaller minerals have been composited into equant grains, whereas the larger minerals have been broken up into equant grains. The elongate white mineral comprises 15.10% (29 of 192 equant grains) of the geological material.

The effective number of grains in the population is ideally a function of the geometry and sampling density of the geochemical survey undertaken. For example, if the geochemical survey is a one-dimensional traverse across continuous bedrock exposure where samples are collected at random within 25 m intervals, and if the sample is a 10 cm × 10 cm × 10 cm rock (c. 2.7 kg in size), the population from which each sample is collected can be considered to be a strip 10 cm wide, 10 cm deep and 25 m long. Although this definition of the population might at first appear arbitrary, if the survey is truly a traverse, then samples should be collected along a line. As a result, this definition of the population is entirely consistent with the survey design. Given the above sample and population, the volume of the actual sample is 1/250th of the volume of the population. Consequently, N/n = 250, and this value can be thought of as the inverse of the one-dimensional sampling density (d = 1/250; a unitless quantity). Alternatively, if the sample is 30 cm × 30 cm × 30 cm in size (c. 72.9 kg in size), then the population can be considered as a similar strip 30 cm wide, 30 cm deep, and 25 m long. As a result, N/n is lower (= 83.33; d = 1/83.33) because the sample size is larger.

Alternative scenarios where samples are not universally available can also be considered. For example, if the outcrop exposure for the above survey was only 25%, the population size would drop accordingly, and N/n for 30 cm × 30 cm × 30 cm samples would drop to 20.83. Finally, if the geochemical survey involves collection of 1 m long sections of continuous half diamond (NQ) drillcore, then the sample mass will be c. 2.5 kg. However, because only two samples can be collected from each meter of drill core (one half of the core, or the other half), the one-dimensional sample density is large (d = 1/2; N/n = 2).

In a similar manner, if the geochemical survey is a two-dimensional 50 m square soil grid, and the samples are 10 cm × 10 cm × 10 cm in size (c. 2 kg in mass), then the population from which a sample can be collected is a square right prism 50 m by 50 m wide by 10 cm thick. In this case, N/n will equal 250,000 and the two-dimensional sampling density will be 1/250,000 (and is again unitless). In geochemistry, the size of the population from which a sample is collected has not historically been conceived in the

above manner. Nevertheless, the sample density and sample size exert direct control on the N/n ratio, and this ratio can be used to quantify the size of the underlying population for each sample (N) by simple multiplication of the ratio by n (from Equation 6).

If population size is to be conceived in this way, then in order to use the hypergeometric distribution in the sampling error variance calculation (Equation 5), the grains that make up the sample should be collected individually and at random grain by grain (Fig. 5, top). The 36 scattered 1 × 1 small shaded squares in Figure 5 graphically represent these individual grains. Obviously, while ideal, this is not logistically possible for virtually all geochemical surveys, where single grab samples or composites consisting of a small number of sub-samples are typically collected. As a result, most geochemical samples are derived from one or a few locations (as composites) from within the population area (Fig. 5, bottom). The large 6 × 6 shaded square in Figure 5 illustrates how the grains are typically derived from one spot. Thus, technically, sampling errors cannot be estimated for most geochemical surveys.

Fig. 5. Schematic diagram to illustrate geochemical sample formats. Each small square represents a mineral grain, and the two large grids represent two grain populations.

However, if the population area is compositionally homogeneous, then the collection of a sample composed of individual grains from a variety of random locations, and the collection of a sample where all the grains are derived from a randomly chosen single point or small set of points, are equivalent from a probability point of view (Dr. Paul Cabilio, Acadia University, pers. comm. 2002). As a result, the use of the hypergeometric variance to estimate sampling error is possible if population homogeneity can be assumed and the sample design involves a random-stratified geometry. If the population is not homogeneous, then random composite sampling is required to ensure that the resulting sample is representative of the entire population. Only then will the resulting sample exhibit equivalent sampling error to an idealized sample collected grain by grain. The extent of composite sampling necessary is dependent on the level of heterogeneity within the population, and a larger number of smaller sub-samples would necessarily have to be collected from extremely heterogeneous populations.

Once N/n has been determined, and n has been calculated from the grain size and sample size (Equation 6), N can be determined through simple multiplication of N/n by n. Thus, provided that: (1) volume-based mineral proportion estimates derived from hand sample or thin section inspection are converted into mass-based mineral proportion values; (2) mass-based mineral compositions are known, assumed or constrained; and (3) the sample size and mineral sizes are known, all necessary parameters required for determination of the hypergeometric mineral sampling variances are available and the overall element sampling variance can be calculated using Equation 4. One should note that if N/n is very large, then Equation 5 converges to:

$$\sigma^2 = np(1 - p) \qquad (7)$$

because $(N - n)/(N - 1) \to 1$. This equation describes the variance of a binomial distribution. As a result, the convergence of the hypergeometric variance on the binomial variance as $N/n \to \infty$ illustrates that the use of binomial statistics by previous authors (Wilson 1964; Kleeman 1967; Engels & Ingamells 1970; Ingamells & Switzer 1973; Cheng 1995) to estimate sampling error was not a bad assumption for reconnaissance-type surveys, and was a substantial improvement over previous practice. However, because the possible population sizes (N) for commonly used geochemical surveys (e.g. drill core, soil grids) can vary significantly relative to n, the binomial variance approach is not generally valid. Nevertheless, it can be shown that, provided N is several orders of magnitude larger than n, the estimated variances from Equations 5 and 7 are virtually identical. As a result, in most two-dimensional geochemical surveys (e.g. soil grid and drainage sediment surveys; particularly reconnaissance surveys) which commonly have large N/n ratios (say > 10 000), the binomial variance formula can be used without significant bias. However, for most one-dimensional traverses (e.g. drill core sampling programs or soil/bedrock traverses), the hypergeometric variance formula must be used.

For example, if we can assume that the synthetic rock sample presented in Figure 1 was collected as part of a two-dimensional rock survey with sample density and sample size such that N/n = 10 000, then the mineral sampling variances can be calculated using either hypergeometric or binomial variance formulae (Equations 5 or 7). Table 1 presents the concentrations and calculated binomial and hypergeometric errors (they are equal) for each mineral in the synthetic rock under this assumption. However, if the synthetic rock sample of Figure 1 was collected as part of a one-dimensional survey with N/n = 10, then the mineral sampling variances can be calculated accurately using only the hypergeometric variance formula (Equation 5). Table 2 presents the corresponding results under this assumption. Significant differences exist in the estimated sampling errors for these two surveys, and these differences are solely a function of the geometry and sample density of the geochemical survey.
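The N/n ratios quoted for the traverse and soil-grid scenarios, and the size of the finite-population factor (N − n)/(N − 1) in Equation 5, can be checked with a short sketch. The function names, and the simplification that the population is a strip or slab with the same cross-section as a cubic sample, are assumptions made for illustration.

```python
import math

def ratio_traverse(interval_m, edge_m, exposure=1.0):
    """N/n for a 1-D traverse: strip length over sample edge, scaled by fractional exposure."""
    return exposure * interval_m / edge_m

def ratio_grid(cell_m, edge_m):
    """N/n for a 2-D grid: area of one grid cell over the footprint of a cubic sample."""
    return (cell_m / edge_m) ** 2

def sd_ratio(ratio, n):
    """Hypergeometric standard deviation divided by the binomial one (Equations 5 and 7)."""
    N = ratio * n
    return math.sqrt((N - n) / (N - 1))

print(ratio_traverse(25, 0.10))        # 10 cm cube, 25 m interval
print(ratio_traverse(25, 0.30))        # 30 cm cube, 25 m interval
print(ratio_traverse(25, 0.30, 0.25))  # same, with 25% outcrop exposure
print(ratio_grid(50, 0.10))            # 10 cm cube on a 50 m soil grid

# Effect on the standard deviation for a sample of n = 144 equant grains
for ratio in (2, 10, 250, 10_000):
    print(ratio, round(sd_ratio(ratio, 144), 4))
```

The printed factors show how quickly the two formulae converge: the correction is substantial at N/n = 2 (half drill core) and negligible at N/n = 10 000.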


Table 1. Calculated one standard deviation coefficients of variation for the six minerals in the synthetic rock of Figure 1, assuming N/n = 10 000 (consistent with a reconnaissance lithogeochemical survey).

Size    Shape     Color       N          n    x    p (%)  σ_x   σ_p (%)  CV%   Figure
large   square    dark gray   1,440,000  144  25   17.4   4.6   3.2      18.4  2
small   square    dark gray   5,760,000  576  86   14.9   8.6   1.5      10.0  3
small   square    light gray  5,760,000  576  41   7.1    6.2   1.1      15.1  -
small   square    white       5,760,000  576  205  35.6   11.5  2.0      5.6   -
medium  elongate  white       1,920,000  192  29   15.1   5.0   2.6      17.2  4
medium  elongate  light gray  1,920,000  192  19   9.9    4.1   2.2      21.8  -

N – number of equant grains in the population for each model; n – number of equant grains in the sample; x – number of grains of the mineral of interest in the sample; p – proportion of the mineral of interest in the sample, p = x/n; σ_x – one standard deviation sampling error on the number of grains of the mineral of interest in the sample based on hypergeometric statistics; σ_p – one standard deviation sampling error on the proportion of grains of the mineral of interest in the sample based on hypergeometric statistics; CV% – coefficient of variation for the number and proportion of grains of the mineral of interest in the sample (they are equal).

Table 2. Calculated one standard deviation coefficients of variation for the six minerals in the synthetic rock of Figure 1, assuming N/n = 10 (consistent with a detailed lithogeochemical survey). See Table 1 caption for explanation of table headings.

Size    Shape     Color       N      n    x    p (%)  σ_x   σ_p (%)  CV%   Figure
large   square    dark gray   1,440  144  25   17.4   4.3   3.0      17.2  2
small   square    dark gray   5,760  576  86   14.9   8.1   1.4      9.4   3
small   square    light gray  5,760  576  41   7.1    5.9   1.0      14.3  -
small   square    white       5,760  576  205  35.6   10.9  1.9      5.3   -
medium  elongate  white       1,920  192  29   15.1   4.7   2.5      16.3  4
medium  elongate  light gray  1,920  192  19   9.9    3.9   2.0      20.3  -
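The entries in Tables 1 and 2 can be recomputed directly from n, x and N/n. The sketch below assumes Equation 5 for the hypergeometric case and Equation 7 for the binomial case; the function names are illustrative, and recomputed values may differ from the printed ones in the last digit because of rounding.

```python
import math

def binomial_sd(n, x):
    """One standard deviation on the grain count x from Equation 7."""
    p = x / n
    return math.sqrt(n * p * (1 - p))

def hypergeometric_sd(n, x, N):
    """One standard deviation on the grain count x from Equation 5."""
    return binomial_sd(n, x) * math.sqrt((N - n) / (N - 1))

# (description, n, x) for the six minerals of the synthetic rock
minerals = [
    ("large dark gray square", 144, 25),
    ("small dark gray square", 576, 86),
    ("small light gray square", 576, 41),
    ("small white square", 576, 205),
    ("medium white elongate", 192, 29),
    ("medium light gray elongate", 192, 19),
]

for ratio in (10_000, 10):  # N/n for Tables 1 and 2
    print(f"N/n = {ratio}")
    for name, n, x in minerals:
        sd_x = hypergeometric_sd(n, x, ratio * n)
        sd_p = 100 * sd_x / n  # error on the proportion, in percent
        cv = 100 * sd_x / x    # coefficient of variation, in percent
        print(f"  {name:28s} sd_x={sd_x:5.1f} sd_p={sd_p:4.1f}% CV={cv:5.1f}%")
```

Because N enters only through the factor (N − n)/(N − 1), changing the survey design from N/n = 10 000 to N/n = 10 rescales every row by nearly the same amount.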

A ‘rule of thumb’ that provides insight into which distribution variance formula to use to estimate sampling error is that reconnaissance surveys generally have large N/n, and thus can have their sampling variances estimated using either the hypergeometric or binomial variance formula. However, detailed and follow-up surveys typically have low N/n, and thus require use of the hypergeometric variance formula. Note that because $(N - n)/(N - 1)$ is always < 1, the hypergeometric sampling errors will always be smaller than the binomial sampling errors. This bias produces some rather undesirable effects when one attempts to reduce sampling error.

Reducing sampling error requires that the size of the sample be increased. This is because the relative sampling error (measured using the coefficient of variation, in percent; CV%) is a function of n, the effective number of equant grains in the sample. For the hypergeometric distribution:

$$CV\% = 100\% \times \sqrt{\frac{(1 - p)(N - n)}{np(N - 1)}} \qquad (8)$$

and for the binomial distribution:

$$CV\% = 100\% \times \sqrt{\frac{(1 - p)}{np}} \qquad (9)$$

In both of these equations, CV% is equal to the sampling error standard deviation divided by the mean (observed) number of target mineral grains (x = np), expressed in percent. Thus, increasing the size of the sample (n) decreases the relative sampling error. By rearranging Equations 8 and 9, one can determine the size of the sample necessary to obtain a desired level of relative sampling error using:

$$n = \frac{(1 - p) N (100\%)^2}{p (CV\%)^2 (N - 1) + (1 - p)(100\%)^2} \qquad (10)$$

and:

$$n = \frac{(1 - p)}{p (CV\%)^2} \times (100\%)^2 \qquad (11)$$

for hypergeometric and binomial distributions, respectively. Consequently, if the sampling error is over-estimated because the binomial variance formula was used, then calculations to determine the appropriate sample size to achieve a specific level of sampling precision will over-estimate the sample size required. Although many would consider this desirable because it would produce a ‘conservative’ result, this philosophy is actually misguided. Collecting a larger than necessary sample may require additional costs in terms of time and manpower, leading to a less cost efficient and less effective geochemical survey. Furthermore, over-estimation of sampling error could negatively impact subsequent data analysis. Statistical tests on the geochemical results may indicate that some of the real but subtle geochemical variations observed are not statistically significant solely because they do not exceed the over-estimated measurement errors. As a result, critical geological information may be lost by over-estimating sampling error simply because the statistical critical value was ‘conservatively’ set too high. For example, if the sampling error for Si is conservatively estimated at 1 wt. % (when it is in fact 0.25 wt. %), subtle variations with magnitudes less than 1 wt. % but greater than 0.25 wt. % would not be considered significant, even though they would be significant with a correctly estimated sampling error. Thus, this potential loss of information would also result in a less efficient and effective geochemical survey. As a result, over-estimating sampling error can produce results that are as bad as underestimating sampling error (which could result in the collection of unrepresentative samples and/or data analysis conclusions that are not supported by the data). Thus, the use of the hypergeometric distribution variance formula is recommended to produce accurate sampling error estimates.
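The sample-size calculation can be sketched as follows, assuming CV% is expressed in percent, N is held fixed by the survey design, and Equations 8 and 9 are the relations being inverted. The function names and the target values are illustrative, and the hypergeometric expression is written here as the direct algebraic inverse of Equation 8.

```python
import math

def cv_hypergeometric(n, p, N):
    """Relative sampling error in percent (Equation 8)."""
    return 100.0 * math.sqrt((1 - p) * (N - n) / (n * p * (N - 1)))

def cv_binomial(n, p):
    """Relative sampling error in percent (Equation 9)."""
    return 100.0 * math.sqrt((1 - p) / (n * p))

def n_hypergeometric(p, cv, N):
    """Equant grains needed for a target CV% in a population of N grains (inverse of Equation 8)."""
    return (1 - p) * N * 100.0**2 / (p * cv**2 * (N - 1) + (1 - p) * 100.0**2)

def n_binomial(p, cv):
    """Equant grains needed for a target CV% under the binomial formula (inverse of Equation 9)."""
    return (1 - p) * 100.0**2 / (p * cv**2)

# Hypothetical target: 10% relative error for a mineral making up 17.4% of the grains,
# drawn from a population of 1440 equant grains.
p, cv, N = 0.174, 10.0, 1440
print(math.ceil(n_hypergeometric(p, cv, N)))
print(math.ceil(n_binomial(p, cv)))
```

For the same target precision the binomial formula returns the larger sample size, which is the over-estimation discussed above; multiplying the resulting n by the grain mass m (Equation 6) converts it to a sample mass.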

Rare mineral proportion sampling errors

The above discussion has focused on sampling errors associated with mineral grains that are common in samples of a geological material. If the mineral of interest is not a common grain, then a similar approach may be undertaken using a different statistical distribution to estimate the level of sampling error. Rare minerals, also known as nuggets, are typically governed by Poisson statistics (Clifton et al. 1969; Stanley 1998). Fortunately, the formula describing the sampling variance for the Poisson distribution is much simpler than the formula for the hypergeometric distribution:

$$\sigma_x^2 = x \qquad (12)$$

where x is the number of nuggets in the sample (x = np). Obviously, if one knows the proportion (p) and size of the rare mineral grain, and the effective number of equant grains in the sample, one can determine how many of the rare mineral grains reside in the sample by simple multiplication. If x is small (< 20), the mineral is a candidate for modeling its sampling variance using the Poisson distribution, and the mineral sampling variance can be determined using Equation 12. The relative error associated with sampling a rare grain can be determined using Equation 13, and the size of a sample necessary to achieve a specific relative sampling error can be determined using Equation 14:

$$\mathrm{CV}\,\% = \frac{100\%}{\sqrt{x}} \qquad (13)$$

$$n = \frac{(100\%)^2}{p\,(\mathrm{CV}\,\%)^2} \qquad (14)$$

Once the sampling variance for the rare grain has been determined, Equation 4 can then be used to assess the geochemical sampling error. In this way, using hypergeometric and Poisson variance approaches, both common and rare mineral grains containing an element of interest can be simultaneously addressed by this propagation of variances approach.

Variations in mineral sizes and compositions

The above discussion has assumed that the compositions of the minerals do not vary. As a result, the second term in the summation of Equation 3 equals zero, and the calculation of the sampling error employs the simplified Equation 4. This assumption of constant composition can be avoided by considering the observed range in mineral composition as a suite of minerals with different compositions. For example, if the plagioclase composition in a sample was observed to be bimodal, with 30% having a composition of AN50–60 and 70% having a composition of AN10–20, then plagioclase could be considered as two different minerals, one with a composition of AN55 and one with a composition of AN15. The proportions of each plagioclase composition would be adjusted to reflect the 30:70 proportion of plagioclase with AN50–60 and AN10–20 compositions. In this way, mineral composition variations can be addressed, and the level of detail used to describe the mineral compositional variations can be made commensurate with the precision required for the sampling error estimate and the ultimate geochemical survey objectives.

Similarly, if there is a range in grain sizes for a mineral, the mineral could be considered as a suite of minerals with different equant grain sizes. For example, potassium feldspar could exist as both a phenocryst and a groundmass mineral in a volcanic rock. These two potassium feldspars could be considered as two different minerals with identical compositions but different grain sizes. Again, the proportion of each mineral size could be made to reflect the distribution of grain sizes, and the degree of detail required to model the expected sampling errors can be accommodated. In summary, variations in mineral compositions and grain sizes are better and more easily addressed by considering the different compositions or grain sizes of a mineral as completely different minerals rather than attempting to statistically incorporate these variations in the sampling error formula of Equation 3.
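As a numerical illustration of the Poisson relationships in Equations 12 to 14, the following sketch computes the relative sampling error for a given number of nuggets and the number of equant grains required to reach a target relative error. The function names, the nugget proportion (one grain in a million) and the target relative error are hypothetical values chosen for illustration; they are not taken from the example in this paper.

```python
import math


def poisson_variance(x):
    """Equation 12: the sampling variance of the nugget count equals x."""
    return x


def poisson_cv_percent(x):
    """Equation 13: relative sampling error (CV%) for x nuggets."""
    return 100.0 / math.sqrt(x)


def grains_required(p, cv_percent):
    """Equation 14: total equant grains n needed to reach a target CV%."""
    return 100.0 ** 2 / (p * cv_percent ** 2)


p = 1.0e-6          # hypothetical proportion of nugget grains
target_cv = 25.0    # hypothetical target relative error, in percent

n = grains_required(p, target_cv)   # total equant grains needed
x = n * p                           # expected number of nuggets in that sample

print(f"grains required n = {n:.3g}")
print(f"expected nuggets x = {x:.1f}")
print(f"variance = {poisson_variance(x):.1f}, CV% = {poisson_cv_percent(x):.1f}")
```

Because n varies with the inverse square of CV% in Equation 14, halving the target relative error quadruples the number of grains that must be collected.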

SOFTWARE

In order to facilitate calculation of geological material sampling errors using the approach described above, an EXCEL spreadsheet has been prepared that makes all of the relevant calculations. This spreadsheet is flexible, in that it can easily be expanded to consider additional minerals and additional elements. Furthermore, it not only calculates the sampling error on every element, but also estimates the geochemical composition of the geological material under consideration, and its expected bulk density, molar mass and molar volume, given the mineral proportions, sizes, and compositions entered by the geologist. The program automatically calculates Poisson, binomial or hypergeometric variance estimates, as requested by the geochemist. The program also converts volume-based mineral proportions into the mass-based mineral proportions necessary for the calculation of the sampling error. Readers interested in obtaining a copy of this spreadsheet software (Sample_Size.XLS) can download it from the author’s website at: http://ace.acadiau.ca/~cstanley/software.html.

EXAMPLE

The EXCEL spreadsheet (Sample_Size.XLS) contains the petrologic information derived from a typical sample of the Salmontail Lake Pluton, South Mountain Batholith, Nova Scotia (Fig. 6; MacDonald & Horne 1987; Horne et al. 1989; MacDonald et al. 1989; 1992a, b; MacDonald 2001). This megacrystic granodiorite was sampled by the Nova Scotia Dept. of Natural Resources during a regional lithogeochemical survey in the late 1980s and early 1990s. The overall coarse grain size and orthoclase phenocrysts make it particularly difficult to collect representative samples of this intrusion, especially for some of the major oxide elements. As a result, this lithology provides an excellent example of how the hypergeometric sampling error calculation is made. Samples collected as part of this reconnaissance lithogeochemical survey had a sample density of c.
1 sample per 9 square km (MacDonald et al. 1992b; MacDonald 2001), but exposure is poor (on the order of 0.25%; pers. comm. Mike MacDonald, NSDNR 2002). Twenty 20 kg samples were collected, and these had a size of c. 20 cm × 20 cm × 20 cm (ideally 21.6 kg in size). As a result, N/n is c. 225,000,000 × 0.25% = 562,500 (i.e. 9 km² divided by the 0.04 m² sample footprint, scaled by the 0.25% exposure), and the hypergeometric distribution variance formula should thus be used for all but the most trace minerals (e.g. zircon).

Petrologic information necessary to determine the sampling error on the major oxide analyses in the Salmontail Lake Pluton is presented in Table 3. Information used by the spreadsheet includes: the sample size (in g), the N/n ratio, the type of variance to calculate (hypergeometric, binomial or Poisson), the mineral proportions, mineral sizes (3 axis dimensions), mineral compositions and mineral densities. Note that two potassium feldspar minerals are used in this example to represent both the phenocryst and groundmass grain sizes. Note also that zircon is the only trace mineral modelled as a nugget because all other trace minerals (monazite, apatite, rutile, magnetite, ilmenite, and titanite) occur as common inclusions in biotite, and thus are both abundant and relatively uniformly distributed in the granite.

Fig. 6. Photograph of Salmontail Lake Pluton granodiorite of the South Mountain Batholith, Nova Scotia. Note megacrystic texture and orthoclase megaphenocrysts.

Table 3. Mineral proportions, sizes and compositions of average Salmontail Lake Pluton granodiorite, South Mountain Batholith used in an example calculation of sampling error. Letters in the ‘Distn’ column refer to the distribution used to estimate the sampling variances (H = hypergeometric; P = Poisson); numbers in the ‘Pptn’ column are the mineral proportions. Zircon is the only mineral modeled as a nugget because all other trace minerals occur as common inclusions in biotite.

| Mineral | Composition | Distn | Pptn (%) | Size X (mm) | Size Y (mm) | Size Z (mm) | Density (g/cc) |
|---|---|---|---|---|---|---|---|
| Quartz | SiO2 | H | 32 | 5 | 5 | 5 | 2.65 |
| Plagioclase | Na4/5Ca1/5Al6/5Si14/5O8 | H | 29 | 5 | 2 | 2 | 2.65 |
| K-Spar Megacryst | K17/20Na3/20AlSi3O8 | H | 10 | 25 | 10 | 10 | 2.60 |
| K-Spar | K17/20Na3/20AlSi3O8 | H | 12 | 5 | 2 | 2 | 2.60 |
| Biotite | KAl1.4Fe1.9Mg0.9Si2.8O10(OH)2 | H | 11 | 3 | 3 | 0.3 | 2.85 |
| Muscovite | KAl3Si3O10(OH)2 | H | 2.4 | 0.2 | 0.2 | 0.02 | 2.80 |
| Chlorite | Mg7Fe3Al4Si6O20(OH)16 | H | 0.75 | 3 | 3 | 0.3 | 2.75 |
| Andalusite | Al2SiO5 | H | 1.0 | 0.5 | 0.5 | 0.5 | 3.18 |
| Garnet | Fe2MgAl2Si3O12 | H | 0.5 | 0.1 | 0.1 | 0.1 | 4.00 |
| Ilmenite | FeTiO3 | H | 0.1 | 0.05 | 0.05 | 0.05 | 4.70 |
| Magnetite | Fe3O4 | H | 0.1 | 0.05 | 0.05 | 0.05 | 5.18 |
| Rutile | TiO2 | H | 0.1 | 0.05 | 0.05 | 0.05 | 4.20 |
| Monazite | CePO4 | H | 0.025 | 0.05 | 0.05 | 0.05 | 4.75 |
| Titanite | CaTiSiO5 | H | 0.5 | 0.05 | 0.05 | 0.05 | 3.55 |
| Zircon | ZrSiO4 | P | 0.025 | 0.05 | 0.05 | 0.05 | 4.68 |
| Apatite | Ca5(PO4)3(OH) | H | 0.5 | 0.06 | 0.02 | 0.02 | 2.39 |

Results from the spreadsheet calculation are presented in Tables 4 and 5. The sample size chosen to assess sampling error reflects that of the survey (20 kg). However, even with this very large sample size, the very coarse-grained rock was still sampled with substantial error. In fact, the large sample size represented a compromise between survey cost and representativity (Mike MacDonald, NSDNR, pers. comm. 2002). Note that the sampling error on K2O (1 standard deviation relative error = 2.86%) is larger than those on SiO2 and Al2O3 (0.94% and 1.16%, respectively). This is because very few orthoclase phenocrysts are collected even in a 20 kg sample (x = 111), producing significant sampling error in K2O, Al2O3 and SiO2. However, because the rest of the granodiorite is mostly quartz and plagioclase, substitution of orthoclase for quartz or plagioclase replaces the SiO2 in orthoclase with SiO2 in quartz and plagioclase, and replaces the Al2O3 in orthoclase with Al2O3 in plagioclase. As a result, the sampling errors for SiO2 are lower than the sampling errors for Al2O3, and both of these are substantially lower than the sampling errors for K2O. All other elements exhibit less than 1% relative error, largely because they occur in minerals that have large numbers of grains in the sample.

Table 4. Results of example calculation of sampling error for Salmontail Lake Pluton granodiorite, South Mountain Batholith for 20 kg sample and N/n = 562,500. The various numbers of target mineral equant grains (x), total equant grains (n), error on the number of target equant grains (σx), and wt. % errors and relative errors on target mineral abundances are presented. Note that a large number of target mineral grains occur, except for the K-spar megacrysts, in a 20 kg sample.

| Mineral | # of equant, target mineral grains (x) | Total # of equant grains (n) | Sampling error on # of mineral grains (σx) | Mineral wt. % error | Mineral relative wt. % error |
|---|---|---|---|---|---|
| Quartz | 7.09 × 10^3 | 2.2 × 10^4 | 69 | 0.309 | 0.980 |
| Plagioclase | 4.01 × 10^4 | 1.38 × 10^5 | 169 | 0.120 | 0.421 |
| K-Spar Megacryst | 1.11 × 10^2 | 1.11 × 10^3 | 10 | 0.872 | 9.015 |
| K-Spar | 1.66 × 10^4 | 1.38 × 10^5 | 121 | 0.084 | 0.728 |
| Biotite | 1.13 × 10^5 | 1.03 × 10^6 | 317 | 0.0328 | 0.281 |
| Muscovite | 8.31 × 10^7 | 3.46 × 10^9 | 9,003 | 0.000271 | 0.011 |
| Chlorite | 7.69 × 10^3 | 1.03 × 10^6 | 87 | 0.00872 | 1.136 |
| Andalusite | 1.38 × 10^4 | 2.77 × 10^6 | 117 | 0.00631 | 0.848 |
| Garnet | 2.21 × 10^7 | 2.21 × 10^10 | 4,704 | 0.000037 | 0.021 |
| Ilmenite | 2.21 × 10^7 | 2.21 × 10^10 | 4,704 | 0.000041 | 0.021 |
| Magnetite | 2.21 × 10^7 | 2.21 × 10^10 | 4,704 | 0.000033 | 0.021 |
| Rutile | 5.54 × 10^6 | 2.21 × 10^10 | 2,353 | 0.000019 | 0.042 |
| Monazite | 1.11 × 10^7 | 2.21 × 10^10 | 10,497 | 0.000063 | 0.009 |
| Titanite | 5.54 × 10^6 | 2.21 × 10^10 | 2,353 | 0.000018 | 0.042 |
| Zircon | 5.77 × 10^7 | 1.15 × 10^11 | 23,956 | 0.000018 | 0.004 |
| Apatite | 7.09 × 10^7 | 2.21 × 10^4 | 69 | 0.309 | 0.980 |

Table 5. Results of example calculation of sampling error for Salmontail Lake Pluton granodiorite, South Mountain Batholith for 20 kg sample and N/n = 562,500. The bulk density of the rock is 2.69 g/ml, its bulk molar volume is 49.03 ml/mole, and its bulk molar mass is 131.78 g/mole. The calculated volume for a 20 kg sample is 7,441 ml.

| Element | Element Composition (wt. %) | 1 SD Sampling Error on Element Composition (wt. %) | 1 SD Relative Sampling Error on Element Composition (CV%) |
|---|---|---|---|
| SiO2 | 69.771 | 0.655 | 0.939 |
| TiO2 | 0.517 | 0.000046 | 0.009 |
| Al2O3 | 14.147 | 0.164 | 1.161 |
| Fe2O3 | 0.133 | 0.000028 | 0.021 |
| FeO | 3.708 | 0.00934 | 0.252 |
| MgO | 1.079 | 0.00306 | 0.284 |
| CaO | 1.641 | 0.00508 | 0.310 |
| Na2O | 3.048 | 0.0185 | 0.609 |
| K2O | 4.452 | 0.127 | 2.856 |
| P2O5 | 1.245 | 0.00302 | 0.243 |
| H2O | 0.193 | 0.000008 | 0.004 |
| Zr (ppm) | 215 | 0.088 | 0.042 |
| Ce (ppm) | 307 | 0.128 | 0.042 |

CONCLUSIONS

The propagation of variance approach allows determination of sampling errors for major, minor and trace elements in geological materials. The approach first estimates the sampling error for each mineral in a geological material using the equant grain model, and then propagates this variance into sampling error on the elements in the sample. The approach is flexible, as it employs several statistical distributions to deal with rare and common grain sampling errors. Furthermore, it is statistically valid, and requires no simplifying assumptions about mineral sizes, mineral compositions, or how many minerals contain an element. The propagation of variance approach to sampling error estimation can be used to estimate the magnitude of sampling error in geological materials using only information

that can be readily obtained by simple hand sample or thin section inspection. This research was supported by an NSERC discovery grant to the author. It benefited from a number of philosophical discussions years ago about sampling errors with Dr. Alastair Sinclair, and from more recent discussions with Dr. Paul Cabilio (Acadia University), and Dr. Mike MacDonald and Ms. Linda Ham (NSDNR).
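The K2O result discussed in the example can be checked approximately with a short script. The sketch below assumes that, for minerals of constant composition, the element sampling variance is the sum over minerals of the squared product of the element's weight fraction in the mineral and the mineral's wt. % sampling error, which is the usual propagation of variance for a sum of independent terms and is assumed here to be the form taken by Equation 4. The mineral formulas are those of Table 3, the mineral wt. % errors are those of Table 4, and the atomic masses are standard rounded values; all function and variable names are hypothetical.

```python
import math

# Standard atomic masses (g/mol), rounded.
ATOMIC_MASS = {"K": 39.098, "Na": 22.990, "Al": 26.982, "Si": 28.086,
               "Fe": 55.845, "Mg": 24.305, "O": 15.999, "H": 1.008}

# K-bearing minerals: (atoms per formula unit from Table 3,
#                      mineral wt. % sampling error from Table 4).
MINERALS = {
    "K-spar megacryst": ({"K": 0.85, "Na": 0.15, "Al": 1, "Si": 3, "O": 8}, 0.872),
    "K-spar": ({"K": 0.85, "Na": 0.15, "Al": 1, "Si": 3, "O": 8}, 0.084),
    "Biotite": ({"K": 1, "Al": 1.4, "Fe": 1.9, "Mg": 0.9, "Si": 2.8,
                 "O": 12, "H": 2}, 0.0328),
    "Muscovite": ({"K": 1, "Al": 3, "Si": 3, "O": 12, "H": 2}, 0.000271),
}

K2O_MASS = 2 * ATOMIC_MASS["K"] + ATOMIC_MASS["O"]


def k2o_weight_fraction(formula):
    """Mass of K2O per unit mass of mineral (each K atom yields half a K2O)."""
    molar_mass = sum(ATOMIC_MASS[el] * count for el, count in formula.items())
    return 0.5 * formula["K"] * K2O_MASS / molar_mass


def propagate(minerals):
    """Square root of the summed squares of (weight fraction x mineral error)."""
    variance = sum((k2o_weight_fraction(formula) * error) ** 2
                   for formula, error in minerals.values())
    return math.sqrt(variance)


sigma_k2o = propagate(MINERALS)
print(f"1 SD sampling error on K2O = {sigma_k2o:.3f} wt. %")
# 4.452 wt. % is the K2O content listed in Table 5.
print(f"relative error = {100 * sigma_k2o / 4.452:.2f} %")
```

Under these assumptions the result is close to the 0.127 wt. % (2.86% relative) reported for K2O in Table 5, with almost all of the variance contributed by the K-spar megacrysts.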

REFERENCES

BAIRD, A.K., MCINTYRE, D.B., WELDAY, E.E. & MORTON, D.M. 1967. A test of chemical variability and field sampling methods. Lakeview Mountain tonalite, Lakeview Mountains, southern California batholith. In: Short Contributions to California Geology. Special Report. California Division of Mines and Geology, 11–19.

CAMERON, E.M., ERMANOVICS, I.F. & GOSS, T.I. 1979. Sampling methods and geochemical composition of Archean rocks in southeastern Manitoba. Precambrian Research, 9, 35–55.

CHENG, X. 1995. Mass changes during hydrothermal alteration, Silver Queen epithermal deposit, Owen Lake, central British Columbia. Ph.D. thesis. University of British Columbia, Canada.

CLIFTON, H.E., HUNTER, R.E., SWANSON, F.J. & PHILLIPS, R.L. 1969. Sample size and meaningful gold analysis. U.S. Geological Survey Professional Paper, 625, C1–C17.

ENGELS, J.C. & INGAMELLS, C.O. 1970. Effect of sampling inhomogeneity in K-Ar dating. Geochimica et Cosmochimica Acta, 34, 1007–1017.

FRANCOIS-BONGARCON, D. 1998. Extensions to the demonstration of Gy’s formula. Exploration and Mining Geology, 7, 149–154.

GY, P. 1974. The sampling of broken ores; a review of principle and practice. In: Geological, Mining and Metallurgical Sampling. Transactions, Section B: Applied Earth Sciences. Institution of Mining and Metallurgy, 194–205.

GY, P. 1979. Sampling of Particulate Ores; Theory and Practice. Developments in Geomathematics Series. Elsevier Scientific Publishing Company, Amsterdam.

HALL, G.E.W. 1993. GeoAnalysis 90, An International Symposium on the Analysis of Geological Materials. Bulletin. Geological Survey of Canada.

HORNE, R.J., COREY, M.C., HAM, L.J. & MACDONALD, M.A. 1989. Lithogeochemical variation in granodiorite and biotite monzogranite of the South Mountain Batholith, Nova Scotia. In: BROWN, Y. & MACDONALD, D.R. (eds) Mines and Minerals Branch, Report of Activities, 1988, Part B. Nova Scotia Department of Mines and Energy, Canada.

INGAMELLS, C.O. 1981. Evaluation of skewed exploration data – the nugget effect. Geochimica et Cosmochimica Acta, 45, 1209–1216.

INGAMELLS, C.O. 1974a. Control of geochemical error through sampling and subsampling diagrams. Geochimica et Cosmochimica Acta, 38, 1225–1237.

INGAMELLS, C.O. 1974b. New approaches to geochemical analysis and sampling. Talanta, 21, 141–155.

INGAMELLS, C.O. & SWITZER, P. 1973. A proposed sampling constant for use in geochemical analysis. Talanta, 20, 547–568.

KLEEMAN, A.W. 1967. Sampling error in the chemical analysis of rocks. Journal of the Geological Society of Australia, 14, 43–47.

MACDONALD, M.A. 2001. Geology of the South Mountain Batholith, Southwestern Nova Scotia. Open File Report no. 2000-2. Nova Scotia Dept. of Natural Resources.

MACDONALD, M.A. & HORNE, R.J. 1987. Petrographical and geochemical aspects of a zoned pluton within the South Mountain Batholith. In: BATES, J.L. & MACDONALD, D.R. (eds) Mines and Minerals Branch, Report of Activities, 1987. Nova Scotia Department of Mines and Energy, Canada, 121–127.

MACDONALD, M.A., COREY, M.C., HAM, L.J. & HORNE, R.J. 1989. Petrographic and geochemical aspects of the South Mountain Batholith. In: MACDONALD, D.R. & MILLS, K.A. (eds) Mines and Minerals Branch, Report of Activities, 1988, Part A. Nova Scotia Department of Mines and Energy, Canada, 75–80.

MACDONALD, M.A., HORNE, R.J., COREY, M.C. & HAM, L.J. 1992a. Geological and geochemical features of the 370 Ma, peraluminous South Mountain Batholith, Meguma Terrane, Nova Scotia. In: MACDONALD, D.R. (ed.) Mines and Minerals Branch, Report of Activities, 1991. Nova Scotia Department of Mines and Energy, Canada.

MACDONALD, M.A., HORNE, R.J., COREY, M.C. & HAM, L.J. 1992b. An overview of recent bedrock mapping and follow-up petrological studies of the South Mountain Batholith, Nova Scotia, Canada. Atlantic Geology, 28, 7–28.

MEYER, S.L. 1975. Data Analysis for Scientists and Engineers. John Wiley and Sons, New York.

MORTON, D.M., BAIRD, A.K. & BAIRD, K.W. 1969. The Lakeview Mountains pluton, southern California batholith; part II, chemical composition and variation. Geological Society of America Bulletin, 80, 1553–1563.

OTTLEY, D.J. 1983. Calculation of sample requirements and the development of preparation procedures for reliable analysis of particulate materials. Nevada Bureau of Mines and Geology Report, 36, 132–144.

REES, C.E. 1984. Error propagation calculations. Geochimica et Cosmochimica Acta, 48, 2309–2311.

SKETCHLEY, D.A. 1997. Case history guidelines for establishing sampling protocols and monitoring quality control. Exploration and Mining Geology, 6, 384.

SPIEGEL, M.R. 1975. Probability and Statistics. Schaum’s Outline Series in Mathematics. McGraw-Hill Book Company.

STANLEY, C.R. 1990. Error propagation and regression on Pearce element ratio diagrams. In: RUSSELL, J.K. & STANLEY, C.R. (eds) Theory and Application of Pearce Element Ratios to Geochemical Data Analysis. Short Course Notes, 8. Geological Association of Canada, Canada, 179–215.

STANLEY, C.R. 1998. Nugget: PC-software to calculate parameters for samples and elements affected by the nugget effect. Exploration and Mining Journal, 7, 139–147.

VACHER, H.L. 2001. The Taylor series and error propagation. Journal of Geoscience Education, 49, 305–313.

VISMAN, J. 1969. A general sampling theory. Materials Research and Standards, 9, 8–13.

WICKMAN, F.E. 1962. The amount of material necessary for a trace element analysis. Arkiv för Mineralogi Och Geologi, 10, 131–139.

WILSON, A.D. 1964. The sampling of silicate rock powders for chemical analysis. The Analyst, 89, 18–30.
