Intro to R

How to use this guide

1. Each chapter introduces one concept with self-contained examples in code blocks. Click Run to execute the code, and output appears below the cell.

2. Every exercise has an empty editor where you write your own code. Use Just run to test how it behaves, or Run & check to find out whether your answer is correct.

3. Click Reveal solution when you're ready to compare. Try the exercise yourself first!

⚠️ A note on errors: you will get error messages, and that's normal! Read them carefully — they're R telling you where it got confused. Resources for interpreting errors include Stack Overflow, GeeksforGeeks, or even the R Documentation itself.

Chapter 01

Welcome to R

R is a free programming language built for working with data. In watershed ecology, it can be used for almost everything: analyzing stream chemistry, estimating discharge, modeling nutrient loads, examining macroinvertebrate community composition, and producing publication-quality figures from monitoring data. By the end of this guide, you'll be able to do real watershed data analysis in R.

You don't need any programming experience. We'll go step by step, and there are exercises throughout to help you practice. Each exercise is followed by a worked solution so you can check your answer and understand the reasoning.

Two ways to use R

You have two options for working with R. Both run the same R language; the difference is where the code lives.

Right here in your browser. Every gray code cell on this page is a tiny live R session. Click Run and the code executes immediately, with output shown below the cell. You don't have to install anything. This is great for learning the language and working through this guide.

On your own computer with R and RStudio. For real work — projects, assignments, research — you'll want R installed locally so you can save your work, manage data files, and use the full suite of R packages. The rest of this chapter walks you through installing both pieces and explains how the desktop environment compares to the in-browser cells you'll use here.

Installing R and RStudio

You need two separate downloads. R is the language itself. RStudio is the editor you use to write and run R code. Install R first, then RStudio.

Step 1 — Install R

Go to cran.r-project.org (the Comprehensive R Archive Network). Pick the link that matches your operating system:

Windows: Click "Download R for Windows" → "base" → "Download R-X.X.X for Windows". Run the installer with default settings.
macOS: Click "Download R for macOS" and download the .pkg file matching your processor. Apple Silicon Macs (M1/M2/M3/M4) use the "arm64" build; older Intel Macs use the "x86_64" build. If you're not sure, click the Apple menu → About This Mac to check.
Linux: Click "Download R for Linux" and follow the instructions for your distribution. On Ubuntu/Debian, the simplest command is sudo apt install r-base; on Fedora, sudo dnf install R. CRAN also has more recent builds maintained directly by the R project — check the page for your distribution.

Step 2 — Install RStudio

Go to posit.co/download/rstudio-desktop. (RStudio is now made by a company called Posit, but the software is still called RStudio.) The page should auto-detect your OS and show the right download. Run the installer.

Once both are installed, you only ever open RStudio — it finds and uses the R installation behind the scenes. You generally don't open R directly.

What RStudio looks like

When you open RStudio for the first time, you'll see a window divided into four panes. The screenshot below shows the typical layout:

[Figure: RStudio's four-pane layout, showing the source editor, console, environment, and files panes.]

Each pane has a distinct job. Here's what they do:

Source editor (top-left) — where you write and save R scripts. A script is just a text file ending in .R that contains R code. You type code here and run it line-by-line (with Ctrl+Enter on Windows/Linux, Cmd+Return on Mac), or run the whole script at once.
Environment / History (top-right) — shows everything currently in memory. Variables, vectors, data frames you've created — all listed here with their values or dimensions. Think of it as a live inventory of your R session. Clicking a data frame opens it in a spreadsheet-style viewer, which is really helpful for inspecting datasets. The History tab shows every command you've ever typed in the current session.
Console (bottom-left) — the live R prompt. Anything you type here is executed immediately and the result appears right below. The console is what's actually running your R commands; the Source editor pane is just a place to write code that you then send to the console. When you click "Run" on a code cell on this webpage, you can think of it as sending that cell's contents to a hidden console.
Files / Plots / Help / Packages / Viewer (bottom-right) — a multi-purpose pane.
- Files browses your project folder.
- Plots displays figures you create with plot(), ggplot(), etc. — you can flip back through every plot you've made.
- Help shows R's built-in documentation. Type ?mean in the console and the help page for mean() appears here.
- Packages lists installed packages and lets you install new ones.
- Viewer shows HTML output (knit reports, interactive widgets).

Projects and working directories

Once you start doing real analyses you'll have a folder full of related files: your R scripts, raw data, figures, and notes. Two RStudio concepts help you keep them organized.

The working directory. R has a notion of a "current folder" — the place it looks for files when you say things like read.csv("nitrate.csv"). That folder is called the working directory. You can ask R where it currently is with getwd(), and you can change it with setwd("/path/to/folder"). If R can't find a file you're trying to read, the most common reason is that your working directory is somewhere else and R is looking in the wrong place.
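A quick way to diagnose a "file not found" problem is to ask R where it is and what it can see. The sketch below uses only base R functions. The file name nitrate.csv is the example from the paragraph above and probably does not exist on your machine, so expect FALSE unless you have created it.

```r
# Where is R currently looking for files?
wd <- getwd()
wd

# Is the file visible from the working directory?
# ("nitrate.csv" is the example name from the text; it may not exist for you.)
found <- file.exists("nitrate.csv")
found

# Which CSV files can R see in the working directory?
csv_files <- list.files(pattern = "\\.csv$")
csv_files
```

If found is FALSE but you know the file exists, compare the folder printed by getwd() with the folder that actually holds the file.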

RStudio Projects. An RStudio Project is just a folder with a small .Rproj file inside it. When you open a project (File → Open Project, or by double-clicking the .Rproj file), three useful things happen:

Your working directory is automatically set to the project folder. No more setwd() with hardcoded paths.
RStudio remembers which scripts you had open last time, so you pick up where you left off.
Variables you defined in your last session can optionally be restored.

The recommended workflow is: one project per analysis. For a class assignment, make a folder called something like watershed-assessment, create a new RStudio Project in it (File → New Project → New Directory), and put all your scripts and data inside. When you reopen the project a week later, you won't have to remember anything about where files live — R already knows.

A typical project folder might look like this:

watershed-assessment/ — the project root
- watershed-assessment.Rproj — the project file
- data/ — raw data files (CSVs, shapefiles)
- scripts/ — your .R analysis scripts
- figures/ — plots you generate
- README.md — a brief note describing the project

With this structure and a project file, references like read.csv("data/nitrate.csv") work the same on your laptop, your collaborator's laptop, and the lab computer. No absolute paths needed.
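As a rough sketch of how such a layout could be created and referenced from R itself, the block below builds the folder skeleton inside a temporary directory (so it won't touch your real files) and constructs a relative path with file.path(). The temporary location is an assumption made for this demo; in practice the project root is wherever your .Rproj file lives.

```r
# Demo location: a throwaway folder inside R's temporary directory
project_root <- file.path(tempdir(), "watershed-assessment")

# Create the three subfolders from the layout above
dir.create(file.path(project_root, "data"),    recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project_root, "scripts"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project_root, "figures"), recursive = TRUE, showWarnings = FALSE)

# Confirm what was created
list.files(project_root)

# A relative path: R resolves it against the working directory,
# which an RStudio Project sets to the project root for you
nitrate_path <- file.path("data", "nitrate.csv")
nitrate_path
```

Building paths with file.path() rather than pasting strings together keeps the code readable and avoids typing separators by hand.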

The R authoring environment

So far we've talked about R (the language) and RStudio (the editor). But when you start writing real analyses, you'll encounter a few more terms: IDE, R scripts, and R Markdown / Quarto documents. These are the three things that make up your day-to-day R workflow, and it's worth a quick tour before we head into the actual coding chapters.

RStudio is an IDE

RStudio is what's called an integrated development environment, or IDE for short. The name is more intimidating than the idea: an IDE is just a single application that bundles together the tools you need to write code. You've already seen the four panes — the source editor, the console, the environment viewer, and the files/plots/help pane. Each of those is a separate tool, but RStudio integrates them into one window so you don't have to juggle multiple programs.

Other languages have their own IDEs (VS Code, PyCharm, Eclipse, Xcode). For R, the dominant choice is RStudio. You could write R code in any plain text editor and run it from a terminal, but RStudio's tight integration of the console, plots, and help pages makes it dramatically more pleasant.

R scripts: the standard file format

An R script is a plain text file with the extension .R. It contains nothing but R code (and comments). When you run a script, R executes the lines from top to bottom as if you'd typed them into the console one at a time. Here's what a small script might look like:

Example: bear_creek_analysis.R

# Bear Creek nitrate analysis — May 2026
# Author: J. Doe

library(ggplot2)

# Load and summarize the data (simulated here for demonstration)
nitrate <- c(0.42, 1.85, 0.31, 2.94, 1.12, 0.58, 3.21, 0.27, 2.05, 1.43)
mean(nitrate)
sd(nitrate)

Scripts are the right tool when your output is the analysis itself — a script that loads data, fits a model, and saves a figure to disk. They're the workhorses of reproducible research: anyone with the same data can run your .R file and get the same results. Most published analysis pipelines are built from one or more R scripts.

R Markdown and Quarto: writing with code

Sometimes the deliverable isn't just the analysis but a document — a lab report, a methods write-up, a thesis chapter. For these, plain R scripts aren't quite right because you also want explanations, headings, equations, figures, and tables, all woven together with the code that produced them.

That's what R Markdown (.Rmd files) and Quarto (.qmd files) are for. They're two closely-related formats — Quarto is Posit's newer, more flexible successor to R Markdown — that let you mix writing and code in one file. Quarto is increasingly the modern default, but the two work so similarly that you can treat them as one topic for now. We'll just call them "Rmd/Quarto" here.

Inside one of these files, prose is written as ordinary text (with markdown formatting for things like bold and headings), and R code lives in code chunks: small blocks fenced off by triple-backticks with {r} at the top. When you press Knit (for Rmd) or Render (for Quarto) in RStudio, the file is processed end-to-end: each code chunk runs, its output is captured (numbers, tables, figures), and everything is woven into a polished output document — usually HTML, PDF, or Word.

A typical chunk inside an Rmd/Quarto file looks like this:

Example chunk inside a .Rmd or .qmd document

# Above this chunk, prose like "Figure 1 shows the nitrate gradient."
# (Imagine the prose written in normal text, not in this code cell.)

# The chunk itself runs as R code:
library(ggplot2)
gradient_data <- data.frame(
  distance = c(0.1, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0),
  nitrate  = c(4.8, 4.2, 3.5, 3.1, 2.6, 1.9, 1.5, 1.1)
)
ggplot(gradient_data, aes(x = distance, y = nitrate)) +
  geom_point() + geom_smooth(method = "lm")

# When the document is knitted, the figure appears here in place of the code,
# embedded between the surrounding paragraphs.

For example: a watershed assessment lab report might have an introduction paragraph, a code chunk that loads the field data, another paragraph describing the analysis, a chunk that fits a linear model and prints the summary, then a chunk that produces a final figure — all in one .Rmd or .qmd file. When you knit it, the result is a single PDF or HTML document with the prose flowing naturally around the figures and tables generated by your code. No copy-pasting between R and Word. If the data updates, you just re-knit and everything regenerates.

How this webpage compares to all of the above

The code cells you'll use throughout this guide sit in an interesting middle ground. Conceptually, they're closest to Rmd/Quarto chunks — prose and runnable R code interleaved on a page — except that this page is fixed and you don't render anything. Compared to working in RStudio with real .R or .qmd files, the code cells here are intentionally simpler:

Persistence between cells. Each cell on this webpage runs in a fresh R environment, so variables you create in one cell are not automatically available in the next — that's why some exercises ask you to recreate a data frame at the top of your answer. In RStudio (script or notebook), your variables persist for the entire session.
Saving your work. Code typed into a cell here lives only as long as the browser tab is open. .R, .Rmd, and .qmd files all save to your computer where you can edit, version-control with git, and share them.
Packages. Only a handful of R packages are pre-loaded in your browser (we have ggplot2, but not the full tidyverse). In RStudio you can install any of the ~20,000 packages on CRAN with a single command.
Data files. The cells here can't read CSVs or other files from your hard drive. In RStudio you'll routinely use read.csv(), read_csv(), or specialized functions like dataRetrieval::readNWISdv() to pull discharge records straight from the USGS.
Knitting / rendering. This page is hand-written HTML; nothing gets rendered from a source document. With Rmd/Quarto, the file you write is the source — you knit it whenever you want to produce the final output.
Speed. The browser version downloads R-compiled-to-WebAssembly the first time you visit, which takes ~30 seconds. R running locally through RStudio is at native speed once installed.

The code cells here are good enough to work through this guide — you can focus on learning R without worrying about installation, file paths, or knitting. Once you're ready to do your own analyses, you'll likely use R scripts for the analysis itself and .Rmd/.qmd documents to write up the results. The beauty is that the R syntax you learn here works identically in all three.

Chapter 02

Terminology

Before you write your first line of R, a few vocabulary words will save you a lot of confusion. This short chapter introduces four terms — code, comments, functions, and operators — that show up on every page of this guide.

Code and Comments

Code is just text: instructions written for the computer to follow. Each line tells R to do something specific. R reads your code top to bottom, one line at a time, and does exactly what you say. If R does something unexpected, it's because the code told it to. Reading the line carefully is almost always the first step in fixing a bug (a problem in the code).

Anything from a # to the end of a line is a comment: a note for humans that R ignores completely. A comment can fill a whole line or follow code on the same line. Comments are how you explain to your future self (or classmate/professor/colleague) what your code is doing:

# This whole line is a comment. R ignores it.
2 + 2          # The 2 + 2 part runs; everything after # is a comment.

Functions

A function is a named procedure that takes one or more inputs and gives back a result. You can think of a function like a labeled machine: you put something in, the machine does its job, and something comes out. R comes with thousands of functions built in, and you'll learn many of them throughout this guide.

You "call" (run) a function by writing its name followed by parentheses. Whatever you put inside the parentheses is the input (sometimes called an argument). For example, sqrt() is the square-root function:

sqrt(16)        # input: 16, output: 4
sqrt(2)         # input: 2, output: about 1.414

# Some functions take multiple inputs separated by commas
round(3.14159, 2)   # input: a number and a digit count; output: 3.14

The parentheses are essential. sqrt(16) calls the function on 16, but sqrt by itself just refers to the function as an object — like saying the word "calculator" instead of actually using one. Forgetting parentheses, or having them in the wrong place, is a common source of error for beginners.
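To make the difference concrete, here is a small sketch using base R only. It calls sqrt() with parentheses, then inspects sqrt without them to show that the bare name is a function object rather than a result.

```r
# With parentheses: the function is called and returns a value
root <- sqrt(16)
root

# Without parentheses: the name refers to the function object itself
class(sqrt)
is.function(sqrt)
```

If you ever see a function's definition printed in the console when you expected a number, missing parentheses are a likely cause.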

You'll see this same pattern everywhere in R. mean(x) takes a vector x and gives back its average. length(x) gives you how many items are in x. read.csv("nitrate.csv") reads a CSV file. The pattern is always function-name followed by parentheses around the input.

Not every function requires arguments (e.g., getwd() returns your current working directory with nothing inside the parentheses). Some arguments have default values, so you don't have to specify them each time. If you're curious about the arguments a function accepts, type ?function_name in the console (for example, ?round) to view its documentation.
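As an illustration of default values, the sketch below uses round(), whose digits argument defaults to 0 in base R. It also shows that arguments can be passed by name, and that args() lists what a function accepts.

```r
# digits is not supplied, so the default (0) is used
rounded_default <- round(3.14159)
rounded_default

# Supplying digits by name overrides the default
rounded_two <- round(3.14159, digits = 2)
rounded_two

# args() prints a function's arguments and any default values
args(round)
```

Naming arguments (digits = 2) is optional here, but it makes the call easier to read when a function takes several inputs.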

Operators

An operator is a special shortcut symbol that acts like a tiny function. You've used them in math your whole life: +, -, *, /. R has the same arithmetic operators, plus others such as <- (used to store values in variables, which we'll see in the next chapter) and == (which checks whether two things are equal).

Operators don't use parentheses around their inputs — they sit between them. So 2 + 3 is an operator (+) acting on two numbers, while sum(2, 3) is a function (sum) acting on two numbers. Both give 5; the syntax is just different.
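The idea that an operator "acts like a tiny function" can be taken literally in R. As a brief sketch, wrapping an operator in backticks lets you call it with parentheses like any other function. You would rarely write code this way; it is shown only to make the connection visible.

```r
# The usual operator syntax
2 + 3

# A regular function doing the same job
sum(2, 3)

# The + operator called as an ordinary function, using backticks
`+`(2, 3)
```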

That's the whole vocabulary you need to start reading R code: code is what you type, comments are notes for humans, functions are named procedures called with parentheses, and operators are shortcut symbols.

Chapter 03

Variable Assignment: Storing Values

A variable is a named container that holds a value. Once you store something in a variable, you can refer to it by name in later calculations. This saves typing and makes your code easier to read.

In R, we assign values using the arrow operator <- or the equals sign =. Read it as "gets":

# Store a measurement of stream discharge (in cubic meters per second)
discharge_cms <- 2.34

# Store a stream name (text values go in quotes)
stream_name <- "Bear Creek"

# Store a count of macroinvertebrate taxa observed
taxa_richness = 23

# To see what's in a variable, just type its name
discharge_cms
stream_name
taxa_richness

You can use variables in calculations:

# From the previous cell
discharge_cms <- 2.34

# Convert that discharge from cubic meters per second to cubic feet per second
# (1 cms = 35.3147 cfs)
discharge_cfs <- discharge_cms * 35.3147
discharge_cfs

Naming rules

Variable names should be descriptive. nitrate_mg_per_L is far better than x because anyone reading your code (including future-you) will know what it means. A few rules:

Names can contain letters, numbers, dots, and underscores
Names must start with a letter (or a dot that isn't followed by a number); they can't start with a number or an underscore
R is case-sensitive: Discharge and discharge are different variables
Avoid using names of built-in functions like mean, sum, or c
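The rules above can be tried out directly. The following is a small sketch with made-up values: it creates a few valid names, demonstrates case sensitivity, and uses the base R function make.names() to show how R would repair names that break the rules.

```r
# Valid, descriptive names (values are illustrative)
nitrate_mg_per_L <- 1.85
site.2 <- "Trib_A"

# Case matters: these are two separate variables
Discharge <- 2.34
discharge <- 4.10
Discharge == discharge

# make.names() converts invalid names into valid ones,
# which reveals which of these R considers invalid
make.names(c("2nd_site", "_temp", "nitrate_mg_per_L"))
```

Names that come back from make.names() unchanged were already valid; names that come back altered were not.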

Exercise 3.1 Velocity-area discharge

A stream's cross-section has a measured width of 4.2 meters and an average depth of 0.35 meters. The mean current velocity is 0.68 meters per second. Create three variables to store these values, then compute discharge using the velocity-area method:

Q = w × d × v

where Q is discharge (m³/s), w is width (m), d is mean depth (m), and v is mean velocity (m/s). Store the answer in a variable called discharge and print it.

Your answer

Reveal solution

R · Solution

width_m <- 4.2
depth_m <- 0.35
velocity_ms <- 0.68
discharge <- width_m * depth_m * velocity_ms
discharge

Explanation. We stored each measurement in a clearly named variable, then translated the equation directly. The velocity-area method is the most basic field technique for measuring stream discharge, and it's the foundation for more sophisticated methods you'll encounter later (like the midsection method that integrates across multiple verticals). A discharge of about 1 m³/s indicates a small to mid-sized headwater stream.

Chapter 04

Vectors: Lists of Values

In watershed ecology, you almost never make just one measurement — you sample across sites, dates, or depths. A vector is an ordered list of values of the same type (all numbers, or all text). You build vectors with the c() function, which stands for "combine."

# Daily mean discharge (m³/s) over a week
weekly_q <- c(2.1, 2.4, 8.7, 5.3, 3.2, 2.6, 2.3)
weekly_q

# Dissolved oxygen readings (mg/L) at 8 sites along a stream
do_mg_L <- c(9.2, 8.8, 7.5, 6.1, 5.8, 7.2, 8.4, 9.0)

# Site names from a watershed survey
site_names <- c("Headwater", "Trib_A", "Below_Trib_A", "Mid_reach", "Pasture", "Forest_in")
site_names

Math on vectors

R applies math operations to every element of a vector at once. This is called vectorization, and it's one of the most powerful features of R:

# From the previous cell
do_mg_L <- c(9.2, 8.8, 7.5, 6.1, 5.8, 7.2, 8.4, 9.0)

# Convert all DO readings from mg/L to micrograms per liter
do_mg_L * 1000

# Summary statistics for the DO data
mean(do_mg_L)
median(do_mg_L)
min(do_mg_L)
max(do_mg_L)
sd(do_mg_L)
length(do_mg_L)

⚠ Watch out: length() returns the number of items in a vector, not a physical length such as the length of a stream reach. R uses the word in its general sense.
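Vectorization also applies when both sides of an operator are vectors of the same length: R pairs the elements up position by position. As a small sketch, the block below applies the velocity-area formula Q = w × d × v from Exercise 3.1 to three cross-sections at once. The first cross-section uses the exercise's values; the other two are made-up numbers for illustration.

```r
# Measurements at three cross-sections
# (first values from Exercise 3.1; the rest are hypothetical)
width_m     <- c(4.2, 3.8, 5.1)
depth_m     <- c(0.35, 0.42, 0.30)
velocity_ms <- c(0.68, 0.55, 0.74)

# Element-by-element multiplication: one discharge per cross-section
discharge_cms <- width_m * depth_m * velocity_ms
discharge_cms

# One result for each cross-section
length(discharge_cms)
```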

Pulling out specific values

Use square brackets [ ] to access individual elements. R indexes from 1 (not 0):

weekly_q <- c(2.1, 2.4, 8.7, 5.3, 3.2, 2.6, 2.3)

# The first day's discharge
weekly_q[1]

# The third day's discharge — note the storm pulse!
weekly_q[3]

# Days 2 through 4 (covering the storm and immediate recession)
weekly_q[2:4]

# Just days 1, 4, and 7
weekly_q[c(1, 4, 7)]

You can also pull out values that meet a condition. This is called logical subsetting:

do_mg_L <- c(9.2, 8.8, 7.5, 6.1, 5.8, 7.2, 8.4, 9.0)

# Which sites had DO below 6 mg/L (a common stress threshold for many fish)?
do_mg_L < 6

# Pull out only the readings below 6 mg/L
do_mg_L[do_mg_L < 6]
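The comparison above answers "which sites?" with a TRUE/FALSE vector. If you would rather have the site positions themselves, base R's which() converts the logical vector into index numbers. A minimal sketch using the same DO readings:

```r
do_mg_L <- c(9.2, 8.8, 7.5, 6.1, 5.8, 7.2, 8.4, 9.0)

# Positions (site numbers) where DO is below 6 mg/L
low_do_sites <- which(do_mg_L < 6)
low_do_sites

# The readings at those positions
do_mg_L[low_do_sites]
```

Only the fifth reading (5.8) is below 6, so which() returns a single position. Note that 6.1 does not qualify.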

Exercise 4.1 Nitrate monitoring

You measured nitrate concentrations (mg N/L) at 10 sites across an agricultural watershed: 0.42, 1.85, 0.31, 2.94, 1.12, 0.58, 3.21, 0.27, 2.05, 1.43.

(a) Store these in a vector called nitrate.
(b) Store the mean in nitrate_mean and the standard deviation in nitrate_sd.
(c) The EPA recommends a threshold of 1.0 mg/L for ecologically meaningful nitrate enrichment. Store the number of sites exceeding this threshold in n_high.

Your answer

Reveal solution

R · Solution

# (a)
nitrate <- c(0.42, 1.85, 0.31, 2.94, 1.12, 0.58, 3.21, 0.27, 2.05, 1.43)

# (b)
nitrate_mean <- mean(nitrate)
nitrate_sd   <- sd(nitrate)
nitrate_mean
nitrate_sd

# (c)
n_high <- sum(nitrate > 1.0)
n_high

Explanation. Part (c) uses a clever trick. The expression nitrate > 1.0 returns a vector of TRUE/FALSE values — one for each site. When you call sum() on logical values, R counts the TRUEs. So sum(nitrate > 1.0) is just "how many sites exceeded the threshold?" You'll use this pattern constantly when screening monitoring data against water quality criteria.
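The same trick extends from counts to proportions. Because R treats TRUE as 1 and FALSE as 0 in arithmetic, mean() on a logical vector gives the fraction of TRUE values. A short sketch with the exercise data:

```r
nitrate <- c(0.42, 1.85, 0.31, 2.94, 1.12, 0.58, 3.21, 0.27, 2.05, 1.43)

# TRUE where the site exceeds the 1.0 mg/L threshold
exceeds <- nitrate > 1.0

# Number of sites above the threshold
n_high <- sum(exceeds)
n_high

# Fraction of sites above the threshold
frac_high <- mean(exceeds)
frac_high
```

Six of the ten sites exceed the threshold, so the fraction is 0.6.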

Chapter 05

Data Frames: The R Spreadsheet

A data frame is R's version of a spreadsheet. It has rows (typically one per sample, site, or sampling date) and columns (one per measurement or variable). Data frames are the workhorse of data analysis.

Let's build a small data frame from a hypothetical synoptic stream survey:

stream_data <- data.frame(
  site_id     = 1:10,
  land_use    = c("Forest", "Forest", "Pasture", "Pasture", "Urban",
                  "Urban", "Forest", "Pasture", "Urban", "Forest"),
  temp_C      = c(11.2, 10.8, 14.6, 15.3, 18.7, 19.4, 11.5, 14.9, 17.8, 10.5),
  do_mg_L     = c(9.5, 9.7, 7.8, 7.2, 5.9, 5.5, 9.4, 7.5, 6.1, 9.8),
  nitrate_mgL = c(0.15, 0.18, 1.45, 1.82, 2.35, 2.61, 0.21, 1.67, 2.14, 0.13),
  ept_taxa    = c(18, 17, 9, 8, 4, 3, 19, 7, 5, 20)
)

stream_data

The ept_taxa column counts the number of Ephemeroptera, Plecoptera, and Trichoptera (mayfly, stonefly, and caddisfly) taxa — a common bioindicator where higher numbers generally suggest better water quality.

Inspecting a data frame

Real datasets often have hundreds or thousands of rows, so you usually want a quick look rather than the whole thing:

stream_data <- data.frame(
  site_id     = 1:10,
  land_use    = c("Forest", "Forest", "Pasture", "Pasture", "Urban",
                  "Urban", "Forest", "Pasture", "Urban", "Forest"),
  temp_C      = c(11.2, 10.8, 14.6, 15.3, 18.7, 19.4, 11.5, 14.9, 17.8, 10.5),
  do_mg_L     = c(9.5, 9.7, 7.8, 7.2, 5.9, 5.5, 9.4, 7.5, 6.1, 9.8),
  nitrate_mgL = c(0.15, 0.18, 1.45, 1.82, 2.35, 2.61, 0.21, 1.67, 2.14, 0.13),
  ept_taxa    = c(18, 17, 9, 8, 4, 3, 19, 7, 5, 20)
)

head(stream_data)
tail(stream_data, 3)
str(stream_data)
nrow(stream_data)
ncol(stream_data)
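A few other base R functions are handy for a first look. This sketch is an optional addition to the functions shown above: dim() reports rows and columns together, names() lists the column names, and summary() prints a per-column summary.

```r
stream_data <- data.frame(
  site_id     = 1:10,
  land_use    = c("Forest", "Forest", "Pasture", "Pasture", "Urban",
                  "Urban", "Forest", "Pasture", "Urban", "Forest"),
  temp_C      = c(11.2, 10.8, 14.6, 15.3, 18.7, 19.4, 11.5, 14.9, 17.8, 10.5),
  do_mg_L     = c(9.5, 9.7, 7.8, 7.2, 5.9, 5.5, 9.4, 7.5, 6.1, 9.8),
  nitrate_mgL = c(0.15, 0.18, 1.45, 1.82, 2.35, 2.61, 0.21, 1.67, 2.14, 0.13),
  ept_taxa    = c(18, 17, 9, 8, 4, 3, 19, 7, 5, 20)
)

dim(stream_data)      # number of rows, then number of columns
names(stream_data)    # column names
summary(stream_data)  # min, quartiles, mean, and max for numeric columns
```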

Accessing columns and filtering rows

Use the dollar sign $ to grab a single column:

stream_data <- data.frame(
  site_id     = 1:10,
  land_use    = c("Forest", "Forest", "Pasture", "Pasture", "Urban",
                  "Urban", "Forest", "Pasture", "Urban", "Forest"),
  temp_C      = c(11.2, 10.8, 14.6, 15.3, 18.7, 19.4, 11.5, 14.9, 17.8, 10.5),
  do_mg_L     = c(9.5, 9.7, 7.8, 7.2, 5.9, 5.5, 9.4, 7.5, 6.1, 9.8),
  nitrate_mgL = c(0.15, 0.18, 1.45, 1.82, 2.35, 2.61, 0.21, 1.67, 2.14, 0.13),
  ept_taxa    = c(18, 17, 9, 8, 4, 3, 19, 7, 5, 20)
)

# All the temperatures
stream_data$temp_C

# Average nitrate across all sites
mean(stream_data$nitrate_mgL)

# All forest sites
stream_data[stream_data$land_use == "Forest", ]

# Urban sites with nitrate above 2 mg/L
stream_data[stream_data$land_use == "Urban" & stream_data$nitrate_mgL > 2, ]

The & symbol means "AND" (both conditions must be true). The | symbol means "OR" (either can be true).
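Since the example above only demonstrates &, here is a matching sketch for |. To keep it short, the data frame below contains just three of the columns from stream_data; the values are the same as in the example.

```r
# A trimmed copy of stream_data (three columns only, same values)
stream_small <- data.frame(
  site_id  = 1:10,
  land_use = c("Forest", "Forest", "Pasture", "Pasture", "Urban",
               "Urban", "Forest", "Pasture", "Urban", "Forest"),
  temp_C   = c(11.2, 10.8, 14.6, 15.3, 18.7, 19.4, 11.5, 14.9, 17.8, 10.5)
)

# Sites that are urban OR warmer than 15 °C (either condition is enough)
warm_or_urban <- stream_small[stream_small$land_use == "Urban" |
                              stream_small$temp_C > 15, ]
warm_or_urban
```

Site 4 is a pasture site, but it is kept because its temperature (15.3 °C) satisfies the second condition.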

Exercise 5.1 Subsetting a survey

Using the stream_data data frame from the example above (you'll need to recreate it in your answer cell — code persists between runs in the same cell, but each cell starts fresh):

(a) Store the average EPT richness at the forested sites in forest_ept_mean.
(b) Store the count of urban sites in n_urban.
(c) Store a data frame of sites with temperature below 15°C in cool_sites.

Your answer

Reveal solution

R · Solution

stream_data <- data.frame(
  site_id     = 1:10,
  land_use    = c("Forest", "Forest", "Pasture", "Pasture", "Urban",
                  "Urban", "Forest", "Pasture", "Urban", "Forest"),
  temp_C      = c(11.2, 10.8, 14.6, 15.3, 18.7, 19.4, 11.5, 14.9, 17.8, 10.5),
  do_mg_L     = c(9.5, 9.7, 7.8, 7.2, 5.9, 5.5, 9.4, 7.5, 6.1, 9.8),
  nitrate_mgL = c(0.15, 0.18, 1.45, 1.82, 2.35, 2.61, 0.21, 1.67, 2.14, 0.13),
  ept_taxa    = c(18, 17, 9, 8, 4, 3, 19, 7, 5, 20)
)

# (a)
forest_ept_mean <- mean(stream_data$ept_taxa[stream_data$land_use == "Forest"])
forest_ept_mean

# (b)
n_urban <- sum(stream_data$land_use == "Urban")
n_urban

# (c)
cool_sites <- stream_data[stream_data$temp_C < 15, ]
cool_sites

Explanation. In part (a), the inner part stream_data$land_use == "Forest" gives us a TRUE/FALSE vector marking which rows are forest sites. We use that to subset stream_data$ept_taxa, keeping only the EPT counts from forest sites. Reading from the inside out is a useful habit when code looks complicated. Notice the dramatic difference: forest sites average around 18 EPT taxa, while the urban sites average around 4 — exactly the pattern bioassessment programs are designed to detect.
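To check the forest-versus-urban comparison without subsetting each group by hand, one option is base R's tapply(), which applies a function to a vector split by groups. This is a sketch that goes slightly beyond what the exercise asks for; the data frame is a two-column copy of stream_data with the same values.

```r
# A trimmed copy of stream_data (two columns only, same values)
ept_data <- data.frame(
  land_use = c("Forest", "Forest", "Pasture", "Pasture", "Urban",
               "Urban", "Forest", "Pasture", "Urban", "Forest"),
  ept_taxa = c(18, 17, 9, 8, 4, 3, 19, 7, 5, 20)
)

# Mean EPT richness for each land-use category
ept_by_land_use <- tapply(ept_data$ept_taxa, ept_data$land_use, mean)
ept_by_land_use
```

The result has one named value per land-use category: 18.5 for Forest, 8 for Pasture, and 4 for Urban.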

Chapter 06

Operators: R's Vocabulary

Arithmetic operators

10 + 3   # addition
10 - 3   # subtraction
10 * 3   # multiplication
10 / 3   # division
10 ^ 2   # exponent (10 squared)

Comparison operators

These return TRUE or FALSE:

5 > 3
5 < 3
5 == 5   # equal to (note the DOUBLE equals!)
5 != 4   # not equal to
5 >= 5
5 <= 4

The = vs == trap. A single = assigns a value (Chapter 03), while a double == tests whether two values are equal and returns TRUE or FALSE.
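A short sketch of the difference, reusing the taxa_richness variable from Chapter 03:

```r
# Single =: assignment. Nothing is printed; the value is stored.
taxa_richness = 23

# Double ==: comparison. The variable is not changed.
taxa_richness == 23
taxa_richness == 20

# The stored value is still 23 after both comparisons
taxa_richness
```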

Logical operators

TRUE & FALSE   # AND: both must be TRUE
TRUE | FALSE   # OR: at least one must be TRUE
!TRUE          # NOT: flips TRUE to FALSE

Exercise 6.1 Compliance check

You're screening a stream site for compliance with two water quality criteria: DO must be at least 6.0 mg/L AND temperature must be below 20°C for it to be considered suitable habitat for a sensitive coldwater species. Given the measurements below, write a single line of code that stores TRUE in a variable called is_suitable if the site meets both criteria, and FALSE otherwise.

Your answer

site_do <- 7.4
site_temp <- 17.8

# Your one-line answer below:

Reveal solution

R · Solution

site_do <- 7.4
site_temp <- 17.8

is_suitable <- site_do >= 6.0 & site_temp < 20
is_suitable

Explanation. A two-part water quality criterion translates directly to a logical AND. Both parts have to be true: DO at or above the minimum AND temperature below the maximum. Try changing site_do to 4.5, then site_temp to 22, and re-running to see how the result changes. This same logic pattern scales directly to filtering whole datasets: the same & and the same comparison operators, just applied to columns instead of single values.
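As a sketch of how the check scales up, the block below runs the identical expression on vectors of four sites instead of single values. The first site uses the exercise's measurements; the other three are hypothetical values chosen to trip each criterion.

```r
# Site 1 from the exercise; sites 2 to 4 are made-up examples
site_do   <- c(7.4, 4.5, 7.0, 8.1)     # mg/L
site_temp <- c(17.8, 16.2, 22.0, 19.9) # °C

# Same one-line test, now evaluated for every site at once
is_suitable <- site_do >= 6.0 & site_temp < 20
is_suitable
```

Site 2 fails on DO, site 3 fails on temperature, and sites 1 and 4 pass both criteria.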

Chapter 07

Translating Equations into Code

Watershed ecology is full of equations — runoff models, nutrient load calculations, dilution equations, hydraulic geometry relationships. A core R skill is taking an equation off a textbook page and turning it into working code.

Manning's equation for open-channel flow

v = (1/n) · R^(2/3) · S^(1/2)

where v is mean velocity (m/s), n is Manning's roughness coefficient, R is hydraulic radius (m), and S is the channel slope (m/m).

# Parameters for a small cobble-bedded stream
n <- 0.045    # Manning's n for cobble/boulder channel
R <- 0.28     # hydraulic radius (m)
S <- 0.012    # slope (m/m)

v <- (1/n) * R^(2/3) * S^(1/2)
v

Pollutant load calculation

L = C × Q × k

where L is load (kg/day), C is concentration (mg/L), Q is discharge (m³/s), and k is the unit conversion factor 86.4.

nitrate_conc <- 1.85   # mg/L
discharge    <- 4.2    # m³/s
k            <- 86.4   # unit conversion

nitrate_load <- nitrate_conc * discharge * k
nitrate_load   # kg N per day
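The factor 86.4 is not arbitrary; it falls out of the unit conversion. The sketch below derives it step by step (mg/L times m³/s, converted to kg/day) so you can see where the number comes from. The variable names are chosen for this illustration.

```r
seconds_per_day <- 60 * 60 * 24   # 86,400 seconds in a day
liters_per_m3   <- 1000           # 1 m³ = 1000 L
mg_per_kg       <- 1e6            # 1 kg = 1,000,000 mg

# (mg/L) * (m³/s) * (L/m³) * (s/day) / (mg/kg) = kg/day
k <- seconds_per_day * liters_per_m3 / mg_per_kg
k

# Same load calculation as above, using the derived factor
nitrate_load <- 1.85 * 4.2 * k
nitrate_load   # kg N per day
```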

Tip: When an equation has nested operations, build it up from the inside. Follow PEMDAS and use parentheses generously!
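Fractional exponents are a common place for precedence to go wrong. In R, ^ is evaluated before / and before a leading minus sign, so leaving out parentheses changes the result. A brief sketch using the hydraulic radius from the Manning's example:

```r
R <- 0.28

# Without parentheses: R is squared first, then divided by 3
wrong <- R^2/3
wrong

# With parentheses: R is raised to the two-thirds power
right <- R^(2/3)
right

# The same rule affects negative numbers
-2^2     # the exponent is applied first, then the minus sign
(-2)^2   # the negative number is squared
```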

Exercise 7.1 Streeter-Phelps

The Streeter-Phelps deficit equation describes how dissolved oxygen recovers downstream of a pollution source. A simplified version for the DO deficit at a given travel time is:

D_t = D₀ · e^(−k₂ · t)

where D₀ is the initial DO deficit (mg/L), k₂ is the reaeration rate constant (per day), and t is travel time (days). In R, exp(x) computes e raised to the power of x.

A point source produces an initial DO deficit of 4.5 mg/L. The stream has a reaeration rate of 0.35 per day. Compute the DO deficit 3 days downstream and store the result in a variable called Dt.

Your answer

Reveal solution

R · Solution

D0 <- 4.5
k2 <- 0.35
t  <- 3

Dt <- D0 * exp(-k2 * t)
Dt

Explanation. This is a classic textbook calculation in stream chemistry. The result (~1.57 mg/L) means the deficit has decayed substantially over those three days: about 65% of the initial deficit is gone, or roughly two-thirds of the way back toward saturation. Storing inputs as named variables (rather than typing the numbers directly into the equation) makes it trivial to recalculate D_t for a different travel time or reaeration rate. This is also the foundation of the more complete Streeter-Phelps model that you'll see in water quality modeling courses.
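Because exp() is vectorized like the arithmetic operators, the same line of code can evaluate the deficit at several travel times at once. In this sketch the travel times are arbitrary illustrative values, and the variable is named t_days (a name chosen here) so that it doesn't share a name with R's built-in t() function.

```r
D0 <- 4.5    # initial DO deficit (mg/L)
k2 <- 0.35   # reaeration rate constant (per day)

# Several travel times (days); values chosen for illustration
t_days <- c(0, 1, 2, 3, 5)

# One deficit value per travel time
Dt <- D0 * exp(-k2 * t_days)
round(Dt, 2)
```

At t = 0 the deficit equals D0, and each later value is smaller than the one before it, which is the exponential recovery the equation describes.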

Chapter 08

Calculations Across Data Frame Columns

Because columns are vectors, you can do math across an entire column in one line — no loops needed. R automatically applies the operation row by row.

stream_data <- data.frame(
  site_id     = 1:10,
  land_use    = c("Forest", "Forest", "Pasture", "Pasture", "Urban",
                  "Urban", "Forest", "Pasture", "Urban", "Forest"),
  temp_C      = c(11.2, 10.8, 14.6, 15.3, 18.7, 19.4, 11.5, 14.9, 17.8, 10.5),
  do_mg_L     = c(9.5, 9.7, 7.8, 7.2, 5.9, 5.5, 9.4, 7.5, 6.1, 9.8),
  nitrate_mgL = c(0.15, 0.18, 1.45, 1.82, 2.35, 2.61, 0.21, 1.67, 2.14, 0.13)
)

# Temperature in Fahrenheit
stream_data$temp_F <- stream_data$temp_C * 9/5 + 32

# DO impairment flag (TRUE if below 6 mg/L)
stream_data$do_impaired <- stream_data$do_mg_L < 6

# DO at saturation as a function of temperature (empirical formula)
stream_data$do_sat <- 14.652 - 0.41022 * stream_data$temp_C +
                      0.0079910 * stream_data$temp_C^2 -
                      0.000077774 * stream_data$temp_C^3

# Percent DO saturation
stream_data$do_pct_sat <- (stream_data$do_mg_L / stream_data$do_sat) * 100

stream_data

With four assignments, we computed four new variables for all ten sites at once. If the dataset had 10,000 sites, the code would be exactly the same.
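To make the scaling claim concrete, here is a small sketch that applies two of the same column expressions to a simulated data frame with 10,000 rows. The data frame name big_survey and its random temperatures are made up for illustration.

```r
# Hypothetical large dataset: 10,000 sites with random temperatures
set.seed(42)
big_survey <- data.frame(
  site_id = 1:10000,
  temp_C  = round(runif(10000, min = 5, max = 25), 1)
)

# Same column-wise expressions as the ten-site example
big_survey$temp_F <- big_survey$temp_C * 9/5 + 32
big_survey$do_sat <- 14.652 - 0.41022 * big_survey$temp_C +
                     0.0079910 * big_survey$temp_C^2 -
                     0.000077774 * big_survey$temp_C^3

nrow(big_survey)
head(big_survey, 3)
```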

Exercise 8.1 SC → TDS conversion

The relationship between specific conductance (SC) and total dissolved solids (TDS) is widely used in watershed monitoring. A common rule-of-thumb conversion is:

TDS (mg/L) = 0.65 × SC (μS/cm)

The EPA's general guidance is that TDS values above 500 mg/L are concerning for many beneficial uses. Using the data frame in your answer cell:

(a) Add a tds_mgL column with the estimated TDS for each site.
(b) Add a tds_concern column that is TRUE if TDS exceeds 500 mg/L.
(c) Filter the data frame to a new data frame called concern_sites containing only the rows where tds_concern is TRUE.

Your answer

sc_data <- data.frame(
  site_id = 1:8,
  reach = c("Upper", "Upper", "Mid", "Mid", "Lower", "Lower", "Trib", "Trib"),
  sc_uScm = c(185, 210, 425, 510, 845, 920, 295, 340)
)

# Your code below…

Reveal solution

R · Solution

sc_data <- data.frame(
  site_id = 1:8,
  reach = c("Upper", "Upper", "Mid", "Mid", "Lower", "Lower", "Trib", "Trib"),
  sc_uScm = c(185, 210, 425, 510, 845, 920, 295, 340)
)

# (a)
sc_data$tds_mgL <- 0.65 * sc_data$sc_uScm

# (b)
sc_data$tds_concern <- sc_data$tds_mgL > 500

# (c)
concern_sites <- sc_data[sc_data$tds_concern, ]
concern_sites

Explanation. Each line does the calculation for every site in the dataset at once — that's vectorization at work. In part (c), we used the new logical column directly inside the brackets. The conversion factor of 0.65 is itself a simplification — the true ratio depends on the dominant ions in solution and ranges from about 0.55 (NaCl-dominated) to 0.9 (CaSO₄-dominated). A real analysis would calibrate the factor for your specific watershed.
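The row filter can be written in a few equivalent ways in base R. The sketch below rebuilds a trimmed copy of sc_data (without the reach column) and compares three of them; the object names by_column, by_condition, and by_subset are just labels for this example.

```r
sc_data <- data.frame(
  site_id = 1:8,
  sc_uScm = c(185, 210, 425, 510, 845, 920, 295, 340)
)
sc_data$tds_mgL <- 0.65 * sc_data$sc_uScm
sc_data$tds_concern <- sc_data$tds_mgL > 500

# Three ways to keep only the rows of concern
by_column    <- sc_data[sc_data$tds_concern, ]    # stored logical column
by_condition <- sc_data[sc_data$tds_mgL > 500, ]  # condition written inline
by_subset    <- subset(sc_data, tds_concern)      # base R helper function

by_subset
```

All three return the same rows. Which form to use is mostly a readability choice.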

Chapter 09

For Loops: Repeating Over a Set of Values

A for loop runs a block of code once for each item in a sequence. Use it when you know in advance what set of values you want to iterate over — a list of sites, a series of years, a vector of measurements. (Use a while loop, which we'll see in the next chapter, when you don't know how many iterations you'll need.)

The syntax reads almost like English: for (each item in my_list) do something with item:

# Loop through the numbers 1 through 5 and print each one
for (i in 1:5) {
  print(i)
}

Inside the parentheses, i is a variable name we chose — it takes on each value of the sequence 1:5 in turn. You can name the loop variable anything you like (often i or j for indices, or something descriptive like site or year).

A watershed example: looping over sites

Suppose you have a vector of stream site names and you want to print a quick status report for each one:

site_names <- c("Headwater", "Mid_reach", "Confluence", "Outlet")

for (site in site_names) {
  print(paste("Processing site:", site))
}

The loop variable site takes on each character string in turn. The paste() function glues strings together with a space between them.
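When you need the position of each item as well as its value (for example, to look up a matching entry in a second vector), one common approach is to loop over indices with seq_along(). The temperatures in this sketch are hypothetical values added for illustration.

```r
site_names  <- c("Headwater", "Mid_reach", "Confluence", "Outlet")
site_temp_C <- c(10.5, 12.8, 14.1, 15.6)  # hypothetical temperatures, one per site

# seq_along(site_names) produces 1, 2, 3, 4
for (i in seq_along(site_names)) {
  print(paste("Site", i, "is", site_names[i], "at", site_temp_C[i], "deg C"))
}
```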

Building up a result inside a loop

A common pattern is to start with an empty container, then accumulate values as the loop runs. Here we compute a running total of monthly precipitation, recording the cumulative total after each month:

# Monthly precipitation (cm) for one year at a watershed
monthly_precip <- c(8.2, 6.5, 9.1, 7.8, 5.4, 3.2,
                    2.1, 2.9, 4.6, 7.3, 9.8, 8.7)

# Pre-allocate a vector of zeros to hold the running totals
running_total <- numeric(length(monthly_precip))

# Set the first month by hand, then loop through the remaining months
running_total[1] <- monthly_precip[1]
for (m in 2:length(monthly_precip)) {
  running_total[m] <- running_total[m - 1] + monthly_precip[m]
}

running_total
paste("Annual total:", running_total[12], "cm")

Notice the pattern: we pre-allocated running_total as a numeric vector of zeros with the right length, set the first element by hand, then used the loop to fill in each subsequent element by adding that month's precipitation to the previous total. This kind of iterative accumulation is the bread and butter of for loops.

Tip — vectorization first: R's vectorized operations are usually faster and more idiomatic than loops. The example above could be written in one line as running_total <- cumsum(monthly_precip). For loops are still useful when each iteration depends on a previous result, when you're calling a complicated function on each item, or when you want explicit step-by-step control. As you write more R, you'll develop a feel for which approach fits the situation.
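A quick way to convince yourself that the two approaches agree is to compute both and compare them. This sketch uses all.equal(), which compares numbers with a small tolerance for floating-point rounding; the names loop_total and vec_total are chosen just for this comparison.

```r
monthly_precip <- c(8.2, 6.5, 9.1, 7.8, 5.4, 3.2,
                    2.1, 2.9, 4.6, 7.3, 9.8, 8.7)

# Loop version
loop_total <- numeric(length(monthly_precip))
loop_total[1] <- monthly_precip[1]
for (m in 2:length(monthly_precip)) {
  loop_total[m] <- loop_total[m - 1] + monthly_precip[m]
}

# Vectorized version
vec_total <- cumsum(monthly_precip)

# TRUE if the two vectors match within rounding tolerance
all.equal(loop_total, vec_total)
```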

Exercise 9.1 Counting storm-flow days

You have 14 days of mean daily discharge (m³/s) from a small stream:

discharge <- c(2.1, 2.4, 8.7, 5.3, 3.2, 2.6, 2.3, 1.9, 2.0, 12.4, 7.8, 4.1, 2.5, 2.2)

A "storm flow day" is any day where discharge exceeds 5.0 m³/s. Use a for loop to count the number of storm-flow days. Store your answer in a variable called storm_days and print it.

Hint: start with storm_days <- 0, then loop through each value in discharge and use an if statement inside the loop to check if it exceeds 5.0.

Your answer

discharge <- c(2.1, 2.4, 8.7, 5.3, 3.2, 2.6, 2.3,
               1.9, 2.0, 12.4, 7.8, 4.1, 2.5, 2.2)

# Your for-loop here…

Reveal solution

R · Solution

discharge <- c(2.1, 2.4, 8.7, 5.3, 3.2, 2.6, 2.3,
               1.9, 2.0, 12.4, 7.8, 4.1, 2.5, 2.2)

storm_days <- 0
for (q in discharge) {
  if (q > 5.0) {
    storm_days <- storm_days + 1
  }
}
storm_days

Explanation. The loop variable q takes on each discharge value in turn. We test each value against the threshold and increment the counter when it exceeds 5.0. Four days qualify: 8.7, 5.3, 12.4, and 7.8 m³/s. As mentioned in the tip above, the vectorized one-liner sum(discharge > 5.0) would give the same answer in less code, but the loop version makes the logic explicit, which is often what you want when teaching or debugging. As your loops get more complex (for instance, accumulating multiple statistics per site), the loop approach scales naturally; the vectorized approach can become harder to read.
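For comparison, here is a sketch of the vectorized route, extended slightly to show which days were storm-flow days. The intermediate name is_storm is an assumption made for readability.

```r
discharge <- c(2.1, 2.4, 8.7, 5.3, 3.2, 2.6, 2.3,
               1.9, 2.0, 12.4, 7.8, 4.1, 2.5, 2.2)

is_storm <- discharge > 5.0   # logical vector, one TRUE/FALSE per day

sum(is_storm)         # TRUE counts as 1, so this is the number of storm-flow days
which(is_storm)       # positions (day numbers) of the storm-flow days
discharge[is_storm]   # the discharge values on those days
```

Running this alongside the loop solution is a handy check that storm_days matches sum(is_storm).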

Chapter 10

While Loops: Repeating Until a Condition is Met

A while loop runs a block of code repeatedly as long as a condition stays true. It's the right tool when you don't know in advance how many iterations you'll need — you just know when to stop.

# Count from 1 to 5
i <- 1
while (i <= 5) {
  print(i)
  i <- i + 1   # IMPORTANT: update the counter or you'll loop forever!
}

Infinite loops: if you forget to update the variable that the condition depends on, R will run forever. If your code cell hangs, just refresh the page — the WebR engine will reset.
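One defensive habit, sketched below, is to add an iteration cap so that a while loop stops even if its main condition never becomes false. The names max_iter and n_iter are hypothetical, and the cap of 1000 is arbitrary.

```r
i <- 1
max_iter <- 1000   # hypothetical safety cap on the number of iterations
n_iter <- 0

# The loop continues only while BOTH conditions are TRUE
while (i <= 5 && n_iter < max_iter) {
  print(i)
  i <- i + 1
  n_iter <- n_iter + 1
}

n_iter
```

If you accidentally deleted the line that updates i, this version would stop after 1000 iterations instead of hanging the cell.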

A watershed example: simulating a reservoir filling

A small reservoir starts at 40% capacity. Inflows raise its level by 4 percentage points per day during a wet period. How many days until it reaches at least 95%?

percent_full <- 40
daily_rise <- 4
days <- 0

while (percent_full < 95) {
  percent_full <- percent_full + daily_rise
  days <- days + 1
}

paste("The reservoir reached", percent_full, "% capacity after", days, "days")
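Because the daily rise is constant here, the loop's answer can be cross-checked with simple arithmetic: divide the gap by the daily rise and round up. This is only a sanity check under that constant-rise assumption; the names start_pct and target_pct are introduced for this sketch.

```r
start_pct  <- 40
target_pct <- 95
daily_rise <- 4

# Number of whole days needed to close a 55-point gap at 4 points per day
days_needed <- ceiling((target_pct - start_pct) / daily_rise)
days_needed   # 14

# Level reached after that many days
start_pct + days_needed * daily_rise   # 96
```

A while loop becomes more useful than the formula once the daily rise varies, for example when it is read from a vector of forecast inflows.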

Exercise 10.1 First-order decay

A closed pond receives a one-time pulse of a contaminant at a concentration of 12.0 mg/L. The contaminant decays at a first-order rate of 8% per day, meaning each day the concentration drops to 92% of its previous value. Use a while loop to determine how many days it will take for the concentration to fall below 1.0 mg/L. Store the day counter in a variable called days and the running concentration in a variable called concentration. Print both at the end.

Your answer

Reveal solution

R · Solution

concentration <- 12.0
decay_factor <- 0.92    # multiplier per day (1 - 0.08)
days <- 0

while (concentration >= 1.0) {
  concentration <- concentration * decay_factor
  days <- days + 1
}

paste("Concentration fell below 1.0 mg/L after", days,
      "days. Final concentration:", round(concentration, 3), "mg/L")

Explanation. The structure mirrors the reservoir example, but we're shrinking a value rather than growing it. The condition concentration >= 1.0 keeps the loop running as long as the contaminant is at or above the threshold. Each iteration applies one day of first-order decay (multiplying by 0.92). The final value is slightly below 1.0 because decay happens in discrete daily steps — a useful reminder that simulations make choices about time steps.
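The loop result can also be checked analytically. After n days the concentration is 12.0 × 0.92^n, so solving 12.0 × 0.92^n < 1.0 for n with logarithms gives the first whole day below the threshold. The sketch below assumes the same inputs as the exercise; c0, threshold, and days_analytic are names introduced here.

```r
c0           <- 12.0   # initial concentration (mg/L)
threshold    <- 1.0    # target concentration (mg/L)
decay_factor <- 0.92   # multiplier per day

# Solve c0 * decay_factor^n = threshold for n, then round up to a whole day
n_exact <- log(threshold / c0) / log(decay_factor)
days_analytic <- ceiling(n_exact)

n_exact
days_analytic
c0 * decay_factor^days_analytic   # concentration on that day
```

Note that if n_exact happened to be a whole number, the concentration on that day would equal the threshold rather than fall below it, so the loop remains the more direct statement of the stopping rule.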

Chapter 11

Linear Models: Fitting Relationships in Data

A linear model describes the relationship between a response variable and one or more predictor variables. R fits linear models with the lm() function. The syntax y ~ x reads as "y is modeled as a function of x."

Stage-discharge rating curve

The relationship between stream stage and discharge typically follows a power law Q = a · h^b, which becomes linear after taking logs of both sides:

log(Q) = log(a) + b · log(h)

# Simulate 30 paired stage-discharge measurements
set.seed(123)
sim_stage <- runif(30, min = 0.2, max = 1.8)
sim_q     <- 6.5 * sim_stage^2.4 * exp(rnorm(30, 0, 0.1))

gauge_data <- data.frame(
  stage_m = sim_stage,
  q_cms   = sim_q
)

# Fit the linear model on log-transformed data
rating_model <- lm(log(q_cms) ~ log(stage_m), data = gauge_data)
summary(rating_model)

# Back-transform coefficients
coefs <- coef(rating_model)
a_estimate <- exp(coefs[1])
b_estimate <- coefs[2]
paste("Estimated a =", round(a_estimate, 3))
paste("Estimated b =", round(b_estimate, 3))
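Once the rating curve is fitted, a typical next step is estimating discharge for new stage readings. The sketch below uses predict() on the log scale and back-transforms with exp(). The stage values in new_stage are hypothetical, and the simple exp() back-transform ignores the small bias that log-transformation can introduce, so treat it as a first approximation.

```r
# Rebuild the simulated data and model so this cell runs on its own
set.seed(123)
sim_stage <- runif(30, min = 0.2, max = 1.8)
sim_q     <- 6.5 * sim_stage^2.4 * exp(rnorm(30, 0, 0.1))
gauge_data <- data.frame(stage_m = sim_stage, q_cms = sim_q)
rating_model <- lm(log(q_cms) ~ log(stage_m), data = gauge_data)

# Hypothetical new stage readings (m); the column name must match the model
new_stage <- data.frame(stage_m = c(0.5, 1.0, 1.5))

# predict() returns log(Q); exp() converts back to discharge
log_q_pred <- predict(rating_model, newdata = new_stage)
q_pred <- exp(log_q_pred)
round(q_pred, 2)
```

At a stage of exactly 1 m, log(h) is zero, so the predicted discharge equals the estimated coefficient a.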

Exercise 11.1 Longitudinal nitrate

You measured nitrate concentration along a longitudinal gradient downstream of an agricultural field, recording the distance (km) from the field edge and the resulting concentration (mg N/L).

Combine these into a data frame called gradient_data.
Fit a linear model predicting nitrate from distance and store it in nitrate_mod.
Use coef() to extract the slope, store it in a variable called slope, and interpret what it means in terms of the watershed processes acting on nitrate.

Your answer

distance_km <- c(0.1, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0)
nitrate_mgL <- c(4.8, 4.2, 3.5, 3.1, 2.6, 1.9, 1.5, 1.1)

# Your code below…

Reveal solution

R · Solution

distance_km <- c(0.1, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0)
nitrate_mgL <- c(4.8, 4.2, 3.5, 3.1, 2.6, 1.9, 1.5, 1.1)

# (a)
gradient_data <- data.frame(
  distance = distance_km,
  nitrate  = nitrate_mgL
)

# (b)
nitrate_mod <- lm(nitrate ~ distance, data = gradient_data)
summary(nitrate_mod)

# (c)
slope <- coef(nitrate_mod)[2]
slope

Explanation. The slope of approximately −0.73 means nitrate decreases by about 0.73 mg N/L for every additional kilometer downstream of the source. Mechanistically, this attenuation reflects a combination of in-stream processes: dilution from groundwater and tributary inputs, biological uptake by algae and biofilms, denitrification in hyporheic and riparian zones, and physical dispersion. A linear model isn't always ideal for this kind of decay (a first-order exponential decay is often more mechanistically appropriate), but for a short reach it's a reasonable first approximation.
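For readers curious about the exponential alternative mentioned in the explanation, here is one possible sketch. It assumes the form C(x) = C0 · exp(−k · x), which becomes linear after taking the log of concentration, the same trick used for the rating curve. The names decay_mod, C0_est, and k_est are introduced for this example.

```r
distance_km <- c(0.1, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0)
nitrate_mgL <- c(4.8, 4.2, 3.5, 3.1, 2.6, 1.9, 1.5, 1.1)
gradient_data <- data.frame(distance = distance_km, nitrate = nitrate_mgL)

# First-order decay: log(C) = log(C0) - k * x, which is linear in distance
decay_mod <- lm(log(nitrate) ~ distance, data = gradient_data)

C0_est <- exp(coef(decay_mod)[1])   # estimated concentration at the field edge (mg N/L)
k_est  <- -coef(decay_mod)[2]       # estimated decay coefficient (per km)

C0_est
k_est
```

Comparing summary(decay_mod) with the linear model's summary is one way to judge which form describes the data better.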

Chapter 12

Plotting in Base R

R has plotting tools built right into the language. Plots produced in code cells appear below the cell as images.

set.seed(123)
sim_stage <- runif(30, min = 0.2, max = 1.8)
sim_q     <- 6.5 * sim_stage^2.4 * exp(rnorm(30, 0, 0.1))
gauge_data <- data.frame(stage_m = sim_stage, q_cms = sim_q)

plot(gauge_data$stage_m, gauge_data$q_cms,
     main = "Stage-Discharge Rating Curve",
     xlab = "Stage (m)",
     ylab = "Discharge (m³/s)",
     pch = 19, col = "steelblue")

Boxplots compare distributions across groups

stream_data <- data.frame(
  land_use = c("Forest", "Forest", "Pasture", "Pasture", "Urban",
               "Urban", "Forest", "Pasture", "Urban", "Forest"),
  do_mg_L  = c(9.5, 9.7, 7.8, 7.2, 5.9, 5.5, 9.4, 7.5, 6.1, 9.8)
)

boxplot(do_mg_L ~ land_use, data = stream_data,
        main = "Dissolved Oxygen by Land Use",
        xlab = "Land Use",
        ylab = "Dissolved Oxygen (mg/L)",
        col = c("forestgreen", "khaki", "gray60"))

Exercise 12.1 Scatterplot with regression line

Using the gradient_data from Exercise 11.1, make a scatterplot of nitrate vs. distance and add the fitted regression line using abline(). Hint: when both axes are on the original scale, you can pass the model directly to abline().

Your answer

distance_km <- c(0.1, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0)
nitrate_mgL <- c(4.8, 4.2, 3.5, 3.1, 2.6, 1.9, 1.5, 1.1)
gradient_data <- data.frame(distance = distance_km, nitrate = nitrate_mgL)
nitrate_mod <- lm(nitrate ~ distance, data = gradient_data)

# Your plot code below…

Reveal solution

R · Solution

distance_km <- c(0.1, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0)
nitrate_mgL <- c(4.8, 4.2, 3.5, 3.1, 2.6, 1.9, 1.5, 1.1)
gradient_data <- data.frame(distance = distance_km, nitrate = nitrate_mgL)
nitrate_mod <- lm(nitrate ~ distance, data = gradient_data)

plot(gradient_data$distance, gradient_data$nitrate,
     main = "Downstream Attenuation of Nitrate",
     xlab = "Distance from Source (km)",
     ylab = "Nitrate (mg N/L)",
     pch = 19, col = "darkgreen")
abline(nitrate_mod, col = "red", lwd = 2)

Explanation. abline() automatically extracts the intercept and slope from a one-predictor model and draws the line. This wouldn't work for a log-log model because the line is straight only on the log scale — there you'd use predict() and lines() instead.
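As a sketch of the predict() and lines() approach for a log-log model, the code below refits the rating curve from Chapter 11, predicts discharge over a grid of stage values, back-transforms with exp(), and draws the resulting curve on the original scale. The grid of 100 points and the name stage_grid are arbitrary choices for illustration.

```r
# Rebuild the simulated data and model so this cell runs on its own
set.seed(123)
sim_stage <- runif(30, min = 0.2, max = 1.8)
sim_q     <- 6.5 * sim_stage^2.4 * exp(rnorm(30, 0, 0.1))
gauge_data <- data.frame(stage_m = sim_stage, q_cms = sim_q)
rating_model <- lm(log(q_cms) ~ log(stage_m), data = gauge_data)

# Evenly spaced stage values spanning the observed range
stage_grid <- data.frame(
  stage_m = seq(min(gauge_data$stage_m), max(gauge_data$stage_m),
                length.out = 100)
)

# Predictions come back on the log scale; exp() returns them to m³/s
q_fit <- exp(predict(rating_model, newdata = stage_grid))

plot(gauge_data$stage_m, gauge_data$q_cms,
     xlab = "Stage (m)", ylab = "Discharge (m³/s)",
     pch = 19, col = "steelblue")
lines(stage_grid$stage_m, q_fit, col = "red", lwd = 2)
```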

Chapter 13

Plotting in ggplot2

ggplot2 uses a layered approach — you start with a blank canvas, map data to visual elements, then add geoms (points, lines, bars), labels, and themes with the + operator.

library(ggplot2)

stream_data <- data.frame(
  land_use    = c("Forest", "Forest", "Pasture", "Pasture", "Urban",
                  "Urban", "Forest", "Pasture", "Urban", "Forest"),
  nitrate_mgL = c(0.15, 0.18, 1.45, 1.82, 2.35, 2.61, 0.21, 1.67, 2.14, 0.13),
  ept_taxa    = c(18, 17, 9, 8, 4, 3, 19, 7, 5, 20)
)

ggplot(stream_data, aes(x = nitrate_mgL, y = ept_taxa, color = land_use)) +
  geom_point(size = 4) +
  labs(
    title = "EPT Richness Declines with Nitrate Enrichment",
    x = "Nitrate (mg N/L)",
    y = "EPT Taxa Richness",
    color = "Land Use"
  ) +
  theme_minimal()

Inside aes() vs outside: use aes() for data-driven aesthetics (color = a column), and put fixed values like color = "steelblue" outside. This trips up everyone at first.
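The difference is easiest to see side by side. This sketch uses five rows taken from stream_data and builds one plot with a data-driven color and one with a fixed color; the object names p_mapped and p_fixed are just labels for the example.

```r
library(ggplot2)

# Five rows taken from the stream_data example above
stream_data <- data.frame(
  land_use    = c("Forest", "Pasture", "Urban", "Forest", "Urban"),
  nitrate_mgL = c(0.15, 1.45, 2.35, 0.21, 2.14),
  ept_taxa    = c(18, 9, 4, 19, 5)
)

# Data-driven color: land_use is a column, so it goes inside aes()
p_mapped <- ggplot(stream_data,
                   aes(x = nitrate_mgL, y = ept_taxa, color = land_use)) +
  geom_point(size = 4)

# Fixed color: "steelblue" is a constant, so it goes outside aes()
p_fixed <- ggplot(stream_data, aes(x = nitrate_mgL, y = ept_taxa)) +
  geom_point(color = "steelblue", size = 4)

p_mapped
p_fixed
```

The classic mistake is writing aes(color = "steelblue"): ggplot2 then treats the text "steelblue" as a one-level category, picks its own default color, and adds a legend for it.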

Exercise 13.1 ggplot scatter with smoother

Using gradient_data, make a ggplot scatterplot of nitrate (y) vs. distance (x), add a linear smoother with geom_smooth(method = "lm"), and include an informative title and axis labels with units.

Your answer

library(ggplot2)

distance_km <- c(0.1, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0)
nitrate_mgL <- c(4.8, 4.2, 3.5, 3.1, 2.6, 1.9, 1.5, 1.1)
gradient_data <- data.frame(distance = distance_km, nitrate = nitrate_mgL)

# Your ggplot code below…

Reveal solution

R · Solution

library(ggplot2)

distance_km <- c(0.1, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0)
nitrate_mgL <- c(4.8, 4.2, 3.5, 3.1, 2.6, 1.9, 1.5, 1.1)
gradient_data <- data.frame(distance = distance_km, nitrate = nitrate_mgL)

ggplot(gradient_data, aes(x = distance, y = nitrate)) +
  geom_point(color = "darkgreen", size = 4) +
  geom_smooth(method = "lm", color = "firebrick", fill = "mistyrose") +
  labs(
    title = "Downstream Attenuation of Nitrate",
    x = "Distance from Source (km)",
    y = "Nitrate (mg N/L)"
  ) +
  theme_minimal()

Explanation. ggplot2 gives us a confidence ribbon "for free" and cleaner default styling. Many of the watershed and aquatic ecology articles you'll read use ggplot2, so it's worth the investment to learn it well. Always specify whether you mean mg-N/L (just the nitrogen) or mg-NO₃/L (the whole molecule), because they differ by a factor of about 4.4.
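The factor of roughly 4.4 comes from the ratio of the molar mass of nitrate (NO₃) to that of nitrogen alone. A short sketch of the conversion follows, using rounded atomic masses and three concentrations from gradient_data; the variable names are chosen for this example.

```r
# Approximate atomic masses (g/mol)
mass_N <- 14.007
mass_O <- 15.999

# Nitrate is one nitrogen plus three oxygens
mass_NO3 <- mass_N + 3 * mass_O
n_to_no3 <- mass_NO3 / mass_N
n_to_no3

# Convert a few concentrations from mg N/L to mg NO3/L
nitrate_as_N   <- c(4.8, 2.6, 1.1)
nitrate_as_NO3 <- nitrate_as_N * n_to_no3
round(nitrate_as_NO3, 2)
```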

Chapter 14

Putting It All Together

Let's combine everything you've learned into a mini analysis. We'll simulate a synoptic stream survey, compute derived metrics, fit a stressor-response model, and visualize the results.

R · Capstone

library(ggplot2)

# 1. Simulate 50 stream sites across a gradient of agricultural cover
set.seed(2026)
n_sites <- 50

watershed_survey <- data.frame(
  site_id = 1:n_sites,
  pct_ag  = round(runif(n_sites, 0, 90), 1),
  temp_C  = round(rnorm(n_sites, mean = 14, sd = 3), 1)
)

# 2. Nitrate increases with agricultural cover
watershed_survey$nitrate_mgL <- round(
  0.3 + 0.05 * watershed_survey$pct_ag + rnorm(n_sites, 0, 0.6), 2
)
watershed_survey$nitrate_mgL <- pmax(watershed_survey$nitrate_mgL, 0.05)

# 3. EPT richness declines with nitrate
watershed_survey$ept_taxa <- round(
  pmax(20 - 4 * watershed_survey$nitrate_mgL + rnorm(n_sites, 0, 2), 0)
)

# 4. Flag sites of concern with a while loop
i <- 1
flagged_count <- 0
while (i <= nrow(watershed_survey)) {
  # && compares single values, which is what if() expects
  if (watershed_survey$nitrate_mgL[i] > 2 && watershed_survey$ept_taxa[i] < 8) {
    flagged_count <- flagged_count + 1
  }
  i <- i + 1
}
print(paste("Flagged", flagged_count, "sites of concern"))

# 5. Fit stressor-response model
ept_model <- lm(ept_taxa ~ nitrate_mgL, data = watershed_survey)
slope <- coef(ept_model)[2]

# 6. Visualize
ggplot(watershed_survey, aes(x = nitrate_mgL, y = ept_taxa, color = pct_ag)) +
  geom_point(size = 3, alpha = 0.8) +
  geom_smooth(method = "lm", color = "black", fill = "gray80") +
  scale_color_viridis_c(name = "% Agricultural\nCover") +
  labs(
    title = "Macroinvertebrate Community Response to Nitrate",
    subtitle = paste("Slope =", round(slope, 2),
                     "EPT taxa per mg N/L | n =", nrow(watershed_survey)),
    x = "Nitrate (mg N/L)",
    y = "EPT Taxa Richness"
  ) +
  theme_minimal()

This short analysis touches nearly every concept from the guide: variables, vectors, data frames, operators, equations, column-wise math, while loops, linear models, and plotting. Real watershed analyses are just longer versions of this same pattern, and the stressor-response framework you see here is essentially the foundation of state and federal assessment programs.
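The flagging step is a good place to practice moving between loops and vectorized code. As a sketch, the cell below rebuilds the simulated survey, counts flagged sites with a single vectorized comparison, and then repeats the count with a for loop to confirm the two agree. The flagged column and loop_count variable are additions for this example.

```r
# Rebuild the simulated survey so this cell runs on its own
set.seed(2026)
n_sites <- 50
watershed_survey <- data.frame(
  site_id = 1:n_sites,
  pct_ag  = round(runif(n_sites, 0, 90), 1),
  temp_C  = round(rnorm(n_sites, mean = 14, sd = 3), 1)
)
watershed_survey$nitrate_mgL <- pmax(
  round(0.3 + 0.05 * watershed_survey$pct_ag + rnorm(n_sites, 0, 0.6), 2),
  0.05
)
watershed_survey$ept_taxa <- round(
  pmax(20 - 4 * watershed_survey$nitrate_mgL + rnorm(n_sites, 0, 2), 0)
)

# Vectorized flag: & compares the two columns element by element
watershed_survey$flagged <- watershed_survey$nitrate_mgL > 2 &
                            watershed_survey$ept_taxa < 8
sum(watershed_survey$flagged)

# For-loop version of the same count, for comparison
loop_count <- 0
for (i in seq_len(nrow(watershed_survey))) {
  if (watershed_survey$nitrate_mgL[i] > 2 && watershed_survey$ept_taxa[i] < 8) {
    loop_count <- loop_count + 1
  }
}
loop_count == sum(watershed_survey$flagged)
```

Storing the flag as a column also makes it available for filtering or for coloring points in a plot.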

Chapter 15

Where to Go Next

Tidyverse

dplyr + ggplot2

The dplyr package makes data manipulation much cleaner than base R brackets. Pair it with ggplot2 and you have a modern analysis workflow.

USGS Data

dataRetrieval

Connects directly to NWIS and the Water Quality Portal — pull discharge records and water quality samples for any USGS gauge with a single function call.

Spatial

sf + terra

sf handles vector GIS data (watershed boundaries, stream networks); terra handles rasters (DEMs, NLCD). Together they make R a capable GIS.

⌘

Programming is a skill that rewards practice far more than it rewards reading. Open RStudio, type the examples in this document yourself, change the numbers, break things, and fix them. After a few weeks of regular use, R will feel less like a foreign language and more like a tool you reach for instinctively whenever data shows up.

— Welcome to watershed science in the modern age.