Blog

Data analytics, statistics, and more

Shaded Relief Basemap Using rayshader

This post illustrates the use of rayshader, an R package that uses elevation data in a base R matrix and a combination of raytracing, hillshading algorithms, and overlays to generate 2D and 3D maps. We will render a surface relief map from digital elevation data using rayshader and ggplot2.

September 6, 2022

Rain Tomorrow Stacked Ensemble Model

For this post, we will evaluate rainfall in Australia using daily weather observations from multiple Australian weather stations. We will build a stacked ensemble classification model using the H2O machine learning platform for use in predicting if there will be rain tomorrow.

September 5, 2022

Rain Tomorrow

For this post, we will evaluate rainfall in Australia using daily weather observations from multiple Australian weather stations. We will build several machine learning models using the tidymodels framework for use in predicting if there will be rain tomorrow.

August 28, 2022

Shaded Relief Basemap Using ggplot2

Shaded relief of surface elevation illustrates the shape of the terrain in a realistic fashion by showing how the three-dimensional surface would be illuminated from a point light source. This post illustrates the use of geom_relief(), a new geom (geometric object) layer, for creating a shaded relief basemap using ggplot2.

July 25, 2022

Examination of England’s Surface Water Quality

This post provides tools and insights for anyone interested in analyzing the ecological and chemical status of water bodies across England. Information and data about the river basin management water environment can be accessed from the Catchment Data Explorer. Classifications indicate where the quality of the environment is good, where it may need improvement, and what may need to be improved.

July 18, 2022

County Drought Levels Throughout the United States

The U.S. Drought Monitor is updated each Thursday to show the location and intensity of drought across the country. It uses a five-category system, from Abnormally Dry (D0) to Exceptional Drought (D4). Using these data and the R statistical programming language, we can visualize drought severity across the United States for various time periods as static maps or even as an animated map.

July 3, 2022

PCA, t-SNE, and UMAP Classification of Vegetable Oils

In this post, we explore three dimensionality reduction techniques specifically used for data exploration and visualization: principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP).

June 5, 2022

Proportional Odds Ordinal Logistic Regression

In this post, we will use ordinal logistic regression to provide general contrasts on the log odds ratio scale as an alternative to nonparametric ANOVA. Proportional odds ordinal logistic regression is a generalization of the Wilcoxon and Kruskal-Wallis tests that extends to multiple covariates and interactions.

April 17, 2022

Nonparametric Two-Way ANOVA

In this post, we will evaluate whether sample depth and/or site location affect arsenic concentrations measured in soil. To address non-normality and heteroscedasticity, two-way ANOVA will be performed on the rank-transformed data values.

April 16, 2022