Overview
Please provide a reproducible .Rmd script that answers all questions
below and produces all of your analysis and plots. Some reminders:
- Reproducible: This means when I run your .Rmd file,
it will run all analyses and create all plots without errors, without me
having to reset my working directory, and without forcing me to install
anything on my machine (i.e., use the require(librarian); shelf(your
packages, lib = tempdir()) approach as in the code block below). This
also means there should be NO ERRORS when I run it!
- Format: Answer each question with text and code,
plus figures/tables where appropriate.
Background
“Coral Bleaching” by NOAA’s National Ocean
Service is marked with Public domain mark 1.0.
For our first homework assignment, we will be modeling the
relationship between sea surface temperature anomalies and coral
bleaching events. This data is derived from a paper on the regional
severity of coral bleaching events triggered by high ocean
temperatures:
McWilliams, J et al (2005) Accelerating impacts of
temperature-induced coral bleaching in the Caribbean. Ecology
86(5):2055-2060.
Do not use the analysis presented in the paper, although you can
refer to the paper for the ecological background.
Data column descriptions:
- SST: sea surface temperature “anomalies.” That is, the difference
between the average sea surface temperature for the Caribbean between
Aug and October, and the long run average from 1961 to 1990. Positive
values represent years in which the water was warmer than average;
negative values represent cool years.
- LOGBLEACH: the log10 transformed percentage of 1 degree
latitude/longitude cells reporting at least one coral bleaching event in
a given year.
- MASS: an indicator marking those years which were considered “Mass
Bleaching Events”. 1 = mass bleaching event; 0 = not a mass bleaching
event.
# List of packages necessary to run this script:
require(librarian, quietly = TRUE)
shelf(tidyverse, # for tidying data
mgcv, # for qq.gam
quiet = TRUE,
lib = tempdir())
# Load data:
bleach <-
read.csv("https://github.com/LivingLandscapes/Course_EcologicalModeling/raw/master/data/bleach.csv")
Homework questions
- Create two models:
- a linear regression model ( function = lm() ) relating LOGBLEACH to
SST, with LOGBLEACH being the response variable.
- a generalized linear model ( function = glm() ) relating MASS to
SST, with MASS being the response variable.
- NOTE: for this model, you will need to decide on the family and link
function.
- Assess model assumptions (i.e., check model diagnostics) for both
models.
- Linear regression: revisit the Linear models and probability
distributions lab for information on how to do this.
- Generalized linear model: depending on your family/link choices,
model diagnostics that assume normality may be useless. Instead, check
out the mgcv::qq.gam function documentation and examples. Also, check
out Ben
Bolker’s logistic regression walkthrough, especially page 5.
- Summarize results from both models, as well as your model
diagnostics. Be careful to differentiate the linear regression results
from the generalized linear model results.
- The authors log transformed the bleaching variable “… to generate
linear relationships with temperature.” In R, back transform LOGBLEACH
to the original percentage scale, and plot against SST (No need to
re-run the model). Does a log-transformation make ecological sense,
given the type of process and the data?
- At what range of values does the linear regression model (still
using the LOGBLEACH response variable) predict 50% bleaching will occur?
Provide an estimate of uncertainty for this prediction (Hint: use the
predict.lm() function with options newdata and interval=”prediction”.
See help(predict.lm) ).
- Given the generalized linear model results, at what value of SST is
there a >50% chance of a mass bleaching event occurring? Provide an
estimate of uncertainty for this prediction. (Hint: check out the
differences between ?predict.lm() and ?predict.glm(). I suggest setting
‘type = link’, ‘se.fit = TRUE’, and transforming the resultant
predictions into probabilities using the plogis() function).
- Choose your favorite model and create a figure showing the
relationship between the response and predictor. Include confidence
intervals. There are multiple ways to create predicted effects plots
(e.g., ggeffects R package, use the base R package ‘predict’ function).
But be careful that the confidence intervals are
correct (see hint in question #6)!