Blog

Data analytics, statistics, and more

Shaded Relief Basemap Using ggplot2

Shaded relief of surface elevation depicts the shape of the terrain realistically by showing how the three-dimensional surface would be illuminated by a point light source. This post illustrates the use of geom_relief(), a new geom (geometric object) layer for creating a shaded relief basemap using ggplot2.

July 25, 2022

Examination of England’s Surface Water Quality

This post provides tools and insights for anyone interested in analyzing the ecological and chemical status of water bodies across England. Information and data about the river basin management water environment can be accessed from the Catchment Data Explorer. Classifications indicate where the quality of the environment is good, where it may need improvement, and what may need to be improved.

July 18, 2022

County Drought Levels Throughout the United States

The U.S. Drought Monitor is updated each Thursday to show the location and intensity of drought across the country using a five-category system that ranges from Abnormally Dry (D0) to Exceptional Drought (D4). Using these data and the R statistical programming language, we can visualize drought severity across the United States for various time periods as static maps or as an animated map.

July 3, 2022

PCA, t-SNE, and UMAP Classification of Vegetable Oils

In this post, we explore three dimensionality reduction techniques specifically used for data exploration and visualization: principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP).

June 5, 2022

Proportional Odds Ordinal Logistic Regression

In this post, we will use ordinal logistic regression to provide general contrasts on the log odds ratio scale as an alternative to nonparametric ANOVA. Proportional odds ordinal logistic regression is a generalization of the Wilcoxon and Kruskal-Wallis tests that extends to multiple covariates and interactions.

April 17, 2022

Nonparametric Two-Way ANOVA

In this post, we will evaluate whether sample depth and/or site location affect arsenic concentrations measured in soil. To address non-normality and heteroscedasticity, two-way ANOVA will be performed on the rank-transformed data values.

April 16, 2022

Mann-Kendall Power Analysis Revisited

Detection of a long-term, temporal trend in environmental data is affected by a number of factors, including the size of the trend to be detected, the time span of the data, and the magnitude of variability and autocorrelation of the noise in the data. This post evaluates the power of the Mann-Kendall test to identify a trend for various combinations of trend, variability, and sample size using Monte Carlo simulation.

April 5, 2022

Sample Size Requirement for One-Sample t-Test

This post computes the sample size necessary to achieve a specified power for a one-sample t-test, given the ratio of means, coefficient of variation, and significance level. Calculations are based on the USEPA’s 1996 Soil Screening Guidance Document, which discusses sample size calculations for determining whether soil at a potentially contaminated site needs to be investigated for possible remedial action.

April 2, 2022

Plume Moment Analysis Using Thiessen Polygons

Mass-based analyses of groundwater contaminants provide complementary information not readily quantified using single-well analytics. This post describes methods that can be used to evaluate contaminant concentrations measured in wells to determine how plume mass and plume center-of-mass change through time.

April 2, 2022