Statistical Basis for Demonstrating the Absence of Soil Contamination

By Charles Holbert

February 8, 2026

1.0 Introduction and Decision Context

This document presents the statistical basis commonly used in environmental site investigations to demonstrate the absence of soil contamination relative to applicable action levels (ALs). Regulatory decisions often require quantitative statements about confidence—such as demonstrating that contaminant concentrations are unlikely to exceed ALs across a site, or that any exceedances are limited in extent and significance. In practice, the objective is to demonstrate, with a specified confidence level (typically 95%), that concentrations across the site are below ALs or that any exceedances are limited to an acceptably small fraction of the site.

The methods summarized here support those decisions by linking sampling results to probabilistic statements about exceedance frequency, population means, upper percentiles, and hotspot detectability. Emphasis is placed on transparent assumptions, defensible confidence levels, and practical sample size planning. Although the equations are general, their appropriate use critically depends on sampling design, decision unit definition, and data quality considerations, which are discussed alongside the statistical formulations.

Three complementary statistical frameworks are addressed:

  • Exceedance-based evaluation
  • Mean-based evaluation
  • Percentile (upper-tail) evaluation

Hotspot detection is treated separately as a spatial coverage problem rather than a parameter-estimation problem.

2.0 Exceedance-Based Evaluation

A common approach evaluates whether the true exceedance probability p (i.e., the fraction of the site exceeding the AL) is small. Each sample result is classified as an exceedance (above AL) or non-exceedance (at or below AL). Under a binomial model, when n independent and representative samples yield zero exceedances, the one-sided \((1-\alpha)\) upper confidence bound on p is:

\begin{equation}\tag{1} p_u = 1 - \alpha^{1/n} \end{equation}

where \(\alpha\) is the significance level (e.g., \(\alpha = 0.05\) for 95% confidence) and \(n\) is the number of samples.
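Equation (1) is straightforward to evaluate in code. The sketch below is a minimal Python illustration; the sample count of 59 is a hypothetical example value:

```python
import math

def exceedance_upper_bound(n: int, alpha: float = 0.05) -> float:
    """One-sided (1 - alpha) upper confidence bound on the exceedance
    fraction p when n samples yield zero exceedances (Equation 1)."""
    return 1.0 - alpha ** (1.0 / n)

# Example: 59 samples, none exceeding the AL, 95% confidence.
p_u = exceedance_upper_bound(59)
print(f"95% upper bound on exceedance fraction: {p_u:.4f}")
```

With 59 clean samples the bound falls just under 0.05, i.e., one can state with 95% confidence that less than 5% of the decision unit exceeds the AL.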

2.1 Sample Size Planning for Exceedance Control

If the project objective is to demonstrate, with confidence \((1-\alpha)\), that the true exceedance fraction does not exceed a specified value \(p^*\), assuming zero observed exceedances, then select n such that:

\begin{equation}\tag{2} p_u \le p^* \end{equation}

Substituting (1) into (2) gives:

\begin{equation}\tag{3} 1 - \alpha^{1/n} \le p^* \end{equation}

which can be rearranged to:

\begin{equation}\tag{4} n \ge \frac{\ln(\alpha)}{\ln(1-p^*)} \end{equation}

This calculation is widely used for planning investigations where the criterion is “no exceedances observed.” It assumes the sampling locations are representative of the decision unit.
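A short Python sketch of Equation (4), rounding up to the next whole sample; the target exceedance fractions shown are illustrative:

```python
import math

def n_for_exceedance_fraction(p_star: float, alpha: float = 0.05) -> int:
    """Smallest n such that zero observed exceedances demonstrate, with
    (1 - alpha) confidence, that the exceedance fraction is at most p_star
    (Equation 4)."""
    return math.ceil(math.log(alpha) / math.log(1.0 - p_star))

# Example targets: at most 5% or 10% of the decision unit above the AL.
print(n_for_exceedance_fraction(0.05))  # 59 samples
print(n_for_exceedance_fraction(0.10))  # 29 samples
```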

2.2 Probability of Finding Contamination

In some cases, the objective is framed in terms of detection power: the probability of observing at least one exceedance if the true exceedance rate is p. Under the binomial model, the probability of observing at least one exceedance in n samples is:

\begin{equation}\tag{5} P(\text{at least one exceedance}) = 1 - (1-p)^n \end{equation}

To achieve the desired power \(1-\beta\), the required sample size satisfies:

\begin{equation}\tag{6} (1-p)^n \le \beta \end{equation}

which can be rearranged to:

\begin{equation}\tag{7} n \ge \frac{\ln(\beta)}{\ln(1-p)} \end{equation}

This framing supports statements such as: “If at least a fraction p of the decision unit exceeds the AL, then n samples provide at least a \(1-\beta\) chance of finding an exceedance.”
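Equation (7) can be sketched the same way; the rate and power values below are hypothetical:

```python
import math

def n_for_detection_power(p: float, beta: float = 0.10) -> int:
    """Smallest n giving probability >= 1 - beta of at least one observed
    exceedance when a fraction p of the decision unit exceeds the AL
    (Equation 7)."""
    return math.ceil(math.log(beta) / math.log(1.0 - p))

# Example: 90% power to find an exceedance if 10% of the unit exceeds the AL.
n = n_for_detection_power(0.10, beta=0.10)
print(n)  # 22
```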

3.0 Mean-Based Evaluation

An alternative or complementary approach evaluates whether the true mean concentration for the decision unit is below the AL. This approach is typically framed as a one-sided hypothesis test, where the null hypothesis represents an unacceptable (contaminated) condition and the alternative represents an acceptable (clean) condition.

For approximately normally distributed data, the one-sided \((1-\alpha)\) upper confidence limit (UCL) on the mean is given by:

\begin{equation}\tag{8} UCL = \bar{x} + t_{1-\alpha,\,n-1}\,\frac{s}{\sqrt{n}} \end{equation}

where \(\bar{x}\) is the sample mean, s is the sample standard deviation, and \(t_{1-\alpha,\,n-1}\) is the \((1-\alpha)\) quantile of the Student t distribution with n-1 degrees of freedom. If the UCL is less than or equal to the AL, one can conclude with \((1-\alpha)\) confidence that the true mean concentration is below the AL.
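Equation (8) can be sketched in Python without external libraries by tabulating the needed t critical values. The concentration data below are hypothetical, and the one-sided 95% t values are taken from standard tables:

```python
import math
import statistics

# One-sided 95% Student t critical values, t_{0.95, df}, from standard
# tables, for the degrees of freedom used in this example.
T95 = {7: 1.895, 8: 1.860, 9: 1.833, 14: 1.761, 19: 1.729}

def ucl_mean_95(data):
    """One-sided 95% UCL on the mean for approximately normal data (Eq. 8)."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    return xbar + T95[n - 1] * s / math.sqrt(n)

# Hypothetical soil concentrations (mg/kg) from one decision unit:
results = [12.1, 9.8, 14.3, 11.0, 10.5, 13.2, 9.4, 12.7, 11.6]
ucl = ucl_mean_95(results)  # compare this value against the AL
```

In routine work a statistical package would supply the t quantile for arbitrary degrees of freedom; the small lookup table here simply keeps the sketch self-contained.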

3.1 Sample Size Planning for Mean Comparison

Sample size planning for mean-based evaluations focuses on achieving adequate power to conclude that the mean is below the AL when it truly is. To establish the parameters for the hypothesis test and determine the required sample size, we define the following variables and assumptions:

  • Null hypothesis: \(H_0:\ \mu \ge \mathrm{AL}\)
  • Alternative hypothesis: \(H_1:\ \mu \lt \mathrm{AL}\)
  • Margin: \(\Delta>0\), the distance below the AL (true mean \(\mu = \mathrm{AL}-\Delta\)) at which power \(1-\beta\) is required
  • Standard deviation estimate: \(s\) (from prior data or pilot sampling)

Using a one-sided normal approximation for planning, an approximate required sample size is:

\begin{equation}\tag{9} n \approx \left(\frac{z_{1-\alpha}+z_{1-\beta}}{\Delta/s}\right)^2 \end{equation}

where:

  • \(\alpha\) is the Type I error probability, which sets the confidence level for concluding “clean” when the site is not,
  • \(\beta\) is the Type II error probability, i.e., the probability of failing to conclude “clean” when the site truly is clean (power = \(1-\beta\)),
  • \(z_{q}\) is the standard normal quantile of order q (often used for planning; the final analysis may use the t distribution).

Planning is typically performed using normal quantiles, with the final analysis conducted using the t-distribution. Larger variance or smaller margins require larger sample sizes.
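Equation (9) can be sketched with the standard library's normal quantile function; the margin and standard deviation below are hypothetical planning inputs:

```python
import math
from statistics import NormalDist

def n_for_mean_test(delta: float, s: float,
                    alpha: float = 0.05, beta: float = 0.10) -> int:
    """Approximate sample size for the one-sided test of the mean against
    the AL (Equation 9), using normal quantiles for planning."""
    z_a = NormalDist().inv_cdf(1.0 - alpha)  # z_{1-alpha}
    z_b = NormalDist().inv_cdf(1.0 - beta)   # z_{1-beta}
    return math.ceil(((z_a + z_b) / (delta / s)) ** 2)

# Example: margin of 5 mg/kg with an estimated s of 10 mg/kg.
n = n_for_mean_test(delta=5.0, s=10.0)
print(n)  # 35
```

Because n scales with \((s/\Delta)^2\), halving the margin or doubling the variability roughly quadruples the required sample size.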

4.0 Percentile and Upper Tail Evaluation

In some regulatory contexts, decisions are based on upper-tail behavior rather than the mean, such as ensuring that the 95th percentile of concentrations is below the AL. This approach addresses concerns about localized higher concentrations that may not affect the mean.

4.1 Nonparametric Sample Size for Percentile Bound

Let p be the target percentile (e.g., p = 0.95) and \(X_{(n)}\) be the sample maximum (largest observed value). Without assuming any distribution, the number of observations falling below the true p-quantile \(q_p\) follows a binomial distribution, so the probability that all n samples fall below \(q_p\) is:

\begin{equation}\tag{10} \Pr\left(X_{(n)} < q_p\right) = p^n \end{equation}

Equivalently, the probability that the sample maximum is a valid upper bound on \(q_p\) is \(1 - p^n\). If all observed samples are below the AL, n is chosen so that this coverage probability is at least \((1-\alpha)\):

\begin{equation}\tag{11} 1 - p^n \ge 1-\alpha \end{equation}

which can be rearranged to:

\begin{equation}\tag{12} n \ge \frac{\ln(\alpha)}{\ln(p)} \end{equation}

With this choice of n, there is \((1-\alpha)\) confidence that the p-th percentile is below the AL (using the maximum as a conservative bound). For p = 0.95 and \(\alpha = 0.05\), this gives the familiar requirement of \(n \ge 59\) samples.
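The maximum-based percentile bound translates directly into a sample size rule; a minimal Python sketch (the standard 95/95 case reproduces the well-known requirement of 59 samples):

```python
import math

def n_for_percentile_bound(p: float = 0.95, alpha: float = 0.05) -> int:
    """Smallest n such that, if all n results are below the AL, the p-th
    percentile is below the AL with (1 - alpha) confidence."""
    return math.ceil(math.log(alpha) / math.log(p))

print(n_for_percentile_bound(0.95, 0.05))  # 59
print(n_for_percentile_bound(0.90, 0.05))  # 29
```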

4.2 More General Nonparametric Percentile Bound

More generally, a one-sided upper confidence bound for \(q_p\) can be formed using the k-th order statistic \(X_{(k)}\), where k is selected so that:

\begin{equation}\tag{13} \Pr\left(X_{(k)} \ge q_p\right) \ge 1-\alpha \end{equation}

This probability can be expressed via the binomial distribution:

\begin{equation}\tag{14} \Pr\left(X_{(k)} \ge q_p\right)=\sum_{j=0}^{k-1}\binom{n}{j}p^{\,j}(1-p)^{n-j} \end{equation}

Planning is done by selecting n (and then determining k) so the sum is at least \((1-\alpha)\). In practice, \(k=n\) (the maximum) is often used for simplicity and conservatism, especially for screening.
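Equation (14) can be evaluated directly with the binomial probability mass function to find a workable order statistic; a Python sketch with illustrative n values:

```python
import math

def coverage(n: int, k: int, p: float) -> float:
    """Pr(X_(k) >= q_p): confidence that the k-th order statistic bounds
    the p-th percentile from above (Equation 14)."""
    return sum(math.comb(n, j) * p**j * (1.0 - p) ** (n - j)
               for j in range(k))

def smallest_k(n: int, p: float, alpha: float = 0.05):
    """Smallest k whose order statistic achieves (1 - alpha) confidence,
    or None if even the sample maximum (k = n) is insufficient."""
    for k in range(1, n + 1):
        if coverage(n, k, p) >= 1.0 - alpha:
            return k
    return None

# With 59 samples only the maximum works for a 95/95 bound; 58 is too few.
print(smallest_k(59, 0.95))  # 59
print(smallest_k(58, 0.95))  # None
```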

5.0 Treatment of Non-Detects and Analytical Constraints

For exceedance-based methods, a non-detect can be treated as a non-exceedance only when the reporting limit is below the AL; otherwise, the result may be indeterminate and require lower reporting limits or resampling. For mean- or percentile-based analyses involving censored data, appropriate censored-data methods (e.g., Kaplan-Meier product limit estimator or regression on order statistics) should be applied. The validity of statistical conclusions depends on proper treatment of non-detects.

6.0 Role of Investigation Area in Sample Size Determination

The statistical sample size calculations presented above do not explicitly include the physical area of the region under investigation. This is not an omission, but a direct consequence of the statistical assumptions underlying these methods. In the exceedance-based, mean-based, and percentile-based frameworks, sample size is governed by probability, variability, and decision risk, rather than by the absolute size of the site.

The methods treat the site (or a defined decision unit within the site) as a statistical population from which samples are drawn. Each sample location is assumed to represent a realization from the same underlying distribution of soil concentrations for that decision unit. Under this framework, the relevant questions are:

  • What fraction of the population exceeds an action level?
  • What is the population mean or upper percentile?
  • With what confidence and power do we want to make these statements?

These questions depend on:

  • the acceptable probability of error (\(\alpha\) and \(\beta\)),
  • the expected variability of concentrations (\(\sigma\)),
  • the magnitude of the decision threshold (action level),
  • and the tolerable exceedance fraction or effect size.

Although area does not explicitly appear in the formulas, it enters the analysis implicitly through the assumptions required for statistical inference:

  1. Representativeness. Larger or more heterogeneous sites may require more samples to ensure that the assumption of random, representative sampling is reasonable. This increases sample size by design, not by formula.

  2. Decision Unit Definition. Large sites are typically subdivided into smaller, more homogeneous decision units. Sample size calculations apply to each decision unit independently, rather than to total site area.

  3. Hotspot Resolution. Area influences the spatial resolution of sampling (e.g., grid spacing). To ensure that contamination of a given minimum size would be detected, sampling density may need to increase as area increases. This is a spatial coverage consideration, not a statistical confidence calculation.

  4. Heterogeneity and Variance. Larger areas often encompass greater soil variability, which increases the estimated variance (\(\sigma\)). Increased variance directly increases required sample size through the formulas, but area itself remains absent.

It is important to distinguish between:

  • Statistical confidence, which addresses uncertainty in estimating population parameters, and
  • Spatial coverage, which addresses the risk of missing localized contamination.

The formulas previously presented address statistical confidence. Spatial coverage concerns are addressed through sampling design (grid spacing, stratification, targeted sampling), not by modifying the statistical equations. As a result, two sites of very different physical size can require similar sample sizes if:

  • they are internally homogeneous,
  • the same decision criteria apply,
  • and the same confidence and power requirements are used.

Conversely, a smaller but more heterogeneous site may require more samples than a larger, more uniform site.

Therefore, the absence of area from the sample size calculations reflects standard statistical practice and does not imply that site size is irrelevant—only that it influences sampling design and variance estimation, rather than the probability models used to quantify confidence.

7.0 Hotspot Detection as a Spatial Coverage Problem

The exceedance-, mean-, and percentile-based calculations above quantify statistical confidence about population parameters (e.g., exceedance fraction, mean). A different but complementary question for site screening is: “How many samples (and what spacing) are needed to be confident that a hotspot of at least a specified size would be detected, if it exists?”

This is a spatial coverage problem. Here, the area of the investigation region explicitly matters, because detection depends on whether at least one sample falls within the hotspot footprint. Samples are assumed to be placed using a design that is approximately random with respect to the hotspot location (or a systematic grid with a random start, which is often treated similarly for planning).

To determine the required sample size, we define the following variables:

  • A = total area of the decision unit (region under investigation)
  • a = predetermined hotspot area of concern (minimum hotspot size to detect)
  • n = number of sample locations placed within A
  • \(1-\beta\) = desired detection confidence (power), i.e., probability of detecting the hotspot if it exists

7.1 Random or Randomized Systematic Sampling

If a hotspot of area a exists somewhere within A, and n sample locations are independent and uniformly distributed over A, then the probability that a given sample misses the hotspot is:

\begin{equation}\tag{15} P(\text{miss}) = 1-\frac{a}{A} \end{equation}

The probability that all n samples miss the hotspot is:

\begin{equation}\tag{16} P(\text{all miss}) = \left(1-\frac{a}{A}\right)^n \end{equation}

Therefore, the probability of detecting the hotspot (at least one sample lands in it) is:

\begin{equation}\tag{17} P(\text{detect}) = 1-\left(1-\frac{a}{A}\right)^n \end{equation}

The required sample size to achieve the detection confidence \(1-\beta\) is given by:

\begin{equation}\tag{18} 1-\left(1-\frac{a}{A}\right)^n \ge 1-\beta \end{equation}

Solving for n yields:

\begin{equation}\tag{19} n \ge \frac{\ln(\beta)}{\ln\left(1-\frac{a}{A}\right)} \end{equation}

This formula explicitly includes area through the ratio a/A, which is the fraction of the decision unit occupied by the hotspot. As A increases for a fixed hotspot size a, the ratio a/A decreases and the required n increases to maintain the same detection confidence.

For small hotspots (when \(a \ll A\)), applying the approximation \(\ln(1-x) \approx -x\) for small x gives:

\begin{equation}\tag{20} n \approx -\ln(\beta)\,\frac{A}{a} = \frac{A}{a}\,\ln\!\left(\frac{1}{\beta}\right) \end{equation}

This approximation highlights the intuitive scaling: required samples increase roughly in proportion to A/a.
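The exact and approximate sample sizes from Equations (19) and (20) can be compared numerically; the areas below are hypothetical:

```python
import math

def n_hotspot(A: float, a: float, beta: float = 0.05) -> int:
    """Samples needed so a hotspot of area a is hit with probability
    >= 1 - beta under uniform random placement (Equation 19)."""
    return math.ceil(math.log(beta) / math.log(1.0 - a / A))

def n_hotspot_approx(A: float, a: float, beta: float = 0.05) -> int:
    """Small-hotspot approximation, n ~ (A / a) ln(1 / beta) (Equation 20)."""
    return math.ceil((A / a) * math.log(1.0 / beta))

# Hypothetical 10,000 m^2 decision unit, 100 m^2 minimum hotspot, 95% power.
print(n_hotspot(10_000.0, 100.0))         # 299
print(n_hotspot_approx(10_000.0, 100.0))  # 300
```

The approximation is slightly conservative (it never understates n), which is usually acceptable for planning.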

7.2 Systematic Grid Sampling

In practice, hotspot detection is often implemented via a systematic grid rather than purely random points, because grid spacing directly controls the maximum “gap” between samples. If the decision unit has area A and n samples are placed on an approximately uniform grid, the average area represented per sample is:

\begin{equation}\tag{21} A_{cell} \approx \frac{A}{n} \end{equation}

For a square grid, the approximate grid spacing is:

\begin{equation}\tag{22} s \approx \sqrt{\frac{A}{n}} \end{equation}

Area therefore enters by linking sample count to spatial spacing.

For a minimum hotspot area \(a\), a characteristic length scale can be defined by approximating the hotspot as a circle:

\begin{equation}\tag{23} a = \pi r^2 \end{equation}

which can be rearranged to:

\begin{equation}\tag{24} r = \sqrt{\frac{a}{\pi}} \end{equation}

A common design concept is to set grid spacing s so that a hotspot of radius r is unlikely to “fit between” samples. This is not purely statistical (it depends on hotspot shape and alignment), but it can be used to translate area into a spatial sampling density in the field.
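The grid-spacing relationships in Equations (22) and (24) are simple to compute; the example values below are hypothetical:

```python
import math

def grid_spacing(A: float, n: int) -> float:
    """Approximate square-grid spacing for n samples over area A (Eq. 22)."""
    return math.sqrt(A / n)

def hotspot_radius(a: float) -> float:
    """Radius of a circular hotspot of area a (Equation 24)."""
    return math.sqrt(a / math.pi)

# Hypothetical 10,000 m^2 unit sampled at 100 locations.
s = grid_spacing(10_000.0, 100)   # 10 m spacing
r = hotspot_radius(100.0)         # about 5.64 m
```

Comparing s to 2r gives a quick field check of whether a circular hotspot of the minimum size could slip between adjacent grid nodes.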

7.3 Combining Hotspot Detection with Concentration Exceedance

Hotspot detection often assumes that any sample within the hotspot would register as an exceedance (i.e., hotspot concentrations are above the AL and detectable). If that assumption is uncertain (e.g., partial exceedance within the hotspot or analytical non-detect concerns), the detection probability can be expressed as a product of spatial hit probability and exceedance probability within the hotspot. A conservative way to represent this is:

\begin{equation}\tag{25} Pr(\text{detect}) = 1-\left(1-\theta\frac{a}{A}\right)^n \end{equation}

where \(\theta\) is the probability that a sample taken inside the hotspot exceeds the AL (accounting for within-hotspot variability, measurement error, etc.).

The required sample size for detection confidence \(1-\beta\) becomes:

\begin{equation}\tag{26} n \ge \frac{\ln(\beta)}{\ln\left(1-\theta\frac{a}{A}\right)} \end{equation}

When \(\theta=1\), this reduces to the simpler hotspot hit model in (19).
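A sketch of Equation (26) with hypothetical inputs; setting theta = 1 recovers the simpler model of Equation (19):

```python
import math

def n_hotspot_theta(A: float, a: float, theta: float,
                    beta: float = 0.05) -> int:
    """Sample size when a sample inside the hotspot registers an
    exceedance only with probability theta (Equation 26)."""
    return math.ceil(math.log(beta) / math.log(1.0 - theta * a / A))

# Hypothetical: 10,000 m^2 unit, 100 m^2 hotspot, 95% detection confidence.
print(n_hotspot_theta(10_000.0, 100.0, theta=1.0))  # 299 (matches Eq. 19)
print(n_hotspot_theta(10_000.0, 100.0, theta=0.8))  # 373
```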

In hotspot detection design, the decision is not about estimating a population parameter (mean, percentile, exceedance fraction). It is about geometric coverage: ensuring samples are dense enough across a region of size A that a hotspot of area a is likely to be intersected. The key driver is the fractional area a/A, which directly determines the chance any given sample lands in the hotspot. That is why the region area appears explicitly in the hotspot sample size equations and in grid-spacing calculations.

7.4 Assumptions and Limitations of Hotspot Detection Analysis

The hotspot detection analysis assumes that a hotspot of at least the specified minimum area exists entirely within the defined decision unit and that sample locations are placed using a random or randomized systematic design such that their placement is independent of the unknown hotspot location. Systematic grid sampling is treated as equivalent to random sampling for planning purposes when the grid origin is randomly selected; however, alignment between a regular grid and an elongated or regularly shaped hotspot could reduce detection probability relative to the random-placement assumption.

The analysis further assumes that any sample collected within the hotspot would identify contamination, meaning that concentrations within the hotspot exceed the applicable action level and are detectable given laboratory reporting limits and sampling methods. If concentrations within the hotspot are spatially variable, intermittently below action levels, or subject to significant analytical uncertainty, the probability of detection may be lower than predicted unless additional conservatism is incorporated.

Hotspot geometry is simplified for analytical purposes (e.g., assuming a compact or approximately circular area). Irregular, linear, or fragmented hotspots may be more difficult to detect at a given sampling density. Edge effects may reduce detection probability for hotspots located near the boundaries of a decision unit, particularly for systematic grids, and this potential reduction in coverage should be considered when establishing decision unit boundaries or selecting grid spacing.

The calculations assume that sampling locations are reasonably well distributed across the decision unit. Clustered or non-uniform sampling reduces effective spatial coverage and can overstate detection confidence if treated as independent trials. When stratified sampling is employed, hotspot detection calculations should be applied separately within each stratum using the stratum area rather than the total site area.

Targeted or judgmental samples placed in areas of known or suspected releases increase the practical likelihood of hotspot detection but do not conform to the random-placement assumptions underlying the probabilistic equations. As such, targeted samples are considered a qualitative enhancement to detection confidence and are not directly credited in the statistical calculations unless incorporated into a formal stratified or model-based design.

8.0 Conclusions

Statistical demonstrations of the absence of soil contamination rely on clearly defined decision objectives, appropriate sampling designs, and explicit control of decision risk. The methods summarized in this document provide complementary tools for addressing different regulatory questions, including exceedance frequency, mean concentration, upper-tail behavior, and the likelihood of detecting localized hotspots.

Key conclusions include:

  • Sample size requirements are driven by confidence levels, acceptable error rates, and variability, not directly by site area, when the objective is statistical inference about a decision unit.
  • Physical area becomes relevant when the objective shifts from parameter estimation to spatial coverage, such as ensuring detection of hotspots of a specified minimum size.
  • Exceedance-based, mean-based, and percentile-based approaches address different aspects of site risk and are often used together to provide a robust weight of evidence.
  • The validity of statistical conclusions depends as much on sampling design, representativeness, and laboratory detection limits as on the numerical calculations themselves.

When applied consistently and transparently, these statistical frameworks support defensible environmental decisions and provide a clear linkage between sampling effort, uncertainty, and regulatory confidence.
