Learning Objectives

Following this assignment students should be able to:

  • import, view properties, and plot a raster
  • perform simple raster math
  • extract points from a raster using a shapefile
  • evaluate a time series of rasters

Reading

Lecture Notes


Exercises

  1. Canopy Height from Space (30 pts)

    The National Ecological Observatory Network has invested in high-resolution airborne imaging of its field sites. Elevation models generated from LiDAR can be used to map the topography and vegetation structure at the sites. These data become especially powerful when you can compare ecological processes across sites. Download the elevation models for the Harvard Forest (HARV) and San Joaquin Experimental Range (SJER) and the plot locations for each of these sites. Plots within a site are often used as representative samples of the larger site; they act as reference areas for obtaining more detailed information and for checking the accuracy of satellite imagery (i.e., ground truth).

    1. Map the digital surface model for SJER.
    2. Create and map the Canopy Height Model using raster math (chm = dsm - dtm) for the SJER site.
    3. Create a map that combines the Canopy Height Model from step 2 with the corresponding plot locations from the plot_locations folder.
    4. Extract the canopy heights at each plot location for SJER and display the values.
    5. Extract the maximum canopy heights (using fun = max) within a buffer of 10 around each plot location at the HARV and SJER sites. Create a single dataframe with two columns, one for each site, holding the maximum height values at each plot_id.
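
    The raster math and extraction steps above can be tried on a small synthetic raster before working with the real files. The following is a minimal sketch, not the assignment solution: the objects toy_dtm, toy_dsm, toy_chm and toy_plots, the grid size, the UTM projection string, and the buffer of 1.5 are all made up for illustration.

    ```r
    library(raster)

    # Hypothetical 10 x 10 terrain model on a 1 m grid in a projected CRS
    # (a stand-in for a real DTM file; all values here are made up).
    toy_dtm <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10,
                      crs = "+proj=utm +zone=11 +datum=WGS84")
    values(toy_dtm) <- rep(100, ncell(toy_dtm))

    # Hypothetical surface model: canopy height rises from 1 m in the
    # first column to 10 m in the last column.
    toy_dsm <- toy_dtm
    values(toy_dsm) <- 100 + rep(1:10, times = 10)

    # Raster math works cell by cell: surface minus terrain gives canopy height
    toy_chm <- toy_dsm - toy_dtm

    # Two hypothetical plot locations, given as x/y coordinates
    toy_plots <- data.frame(plot_id = c("A", "B"),
                            x = c(2.5, 7.5),
                            y = c(5.5, 5.5))
    xy <- as.matrix(toy_plots[, c("x", "y")])

    # Height in the cell under each point, then the maximum within a buffer
    point_height <- raster::extract(toy_chm, xy)
    max_height <- raster::extract(toy_chm, xy, buffer = 1.5, fun = max)

    toy_heights <- data.frame(plot_id = toy_plots$plot_id,
                              point_height = point_height,
                              max_height = max_height)
    print(toy_heights)
    ```

    With real data the points would usually come from a shapefile rather than a coordinate matrix, and the buffer is interpreted in the raster's map units when the data are projected.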
  2. Species Occurrences Map (40 pts)

    A colleague of yours is working on a project on banner-tailed kangaroo rats (Dipodomys spectabilis) and is interested in what elevations these rodents tend to occupy in the continental United States. You offer to help them out by getting some coordinates for specimens of this species and looking up the elevation of these coordinates.

    Start by getting banner-tailed kangaroo rat occurrences from GBIF, the Global Biodiversity Information Facility, using the spocc R package, which is designed to retrieve species occurrence data from various openly available data resources. Use the following code to do so:

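    ```
    library(spocc)

    # Query GBIF for up to 1000 occurrence records that have coordinates
    dipo_df = occ(query = "Dipodomys spectabilis",
                  from = "gbif",
                  limit = 1000,
                  has_coords = TRUE)
    # Flatten the nested GBIF result into a single data frame
    dipo_df = data.frame(dipo_df$gbif$data)
    ```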
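
    The column names used in the steps below carry a Dipodomys_spectabilis. prefix. A small base R sketch shows where such a prefix can come from, assuming the occurrence data arrive as a named list holding one data frame per query (toy_data and its values are made up for illustration):

    ```r
    # Toy stand-in for the nested result: a named list holding one data frame
    # per query. The coordinates are made up.
    toy_data <- list(
      Dipodomys_spectabilis = data.frame(
        latitude = c(32.1, 34.5),
        longitude = c(-109.2, -106.8)
      )
    )

    # Flattening the list with data.frame() prefixes each column
    # with the name of the list element it came from.
    toy_df <- data.frame(toy_data)
    names(toy_df)
    # "Dipodomys_spectabilis.latitude" "Dipodomys_spectabilis.longitude"
    ```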
    
    1. Clean up the data as follows:
      • Filter the data to include only specimens whose Dipodomys_spectabilis.basisOfRecord is PRESERVED_SPECIMEN and whose Dipodomys_spectabilis.countryCode is US
      • Remove points with values of 0 for Dipodomys_spectabilis.latitude or Dipodomys_spectabilis.longitude
      • Use select to remove all of the columns from the dataset except Dipodomys_spectabilis.latitude and Dipodomys_spectabilis.longitude, and rename these columns to latitude and longitude. You can rename while selecting columns using a format like select(new_column_name = old_column_name)
      • Use the head() function to show the top few rows of this cleaned dataset
    2. Do the following to display the locations of these points on a map of the United States:
      • Get data for a US map using usmap = map_data("usa")
      • Plot it using geom_polygon. In the aesthetic, use group = group to avoid stray lines crossing your graph. Use fill = "white" and color = "black".
      • Plot the kangaroo rat locations
      • Use coord_quickmap() to automatically use a reasonable spatial projection
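
    The cleaning and layering pattern described above can be sketched on a toy table. This is an illustration under assumptions, not the assignment solution: toy_records, its Genus_species. column prefix, the coordinates, and the rectangular toy_outline standing in for a map outline are all hypothetical.

    ```r
    library(dplyr)
    library(ggplot2)

    # Made-up occurrence records with prefixed column names
    toy_records <- data.frame(
      Genus_species.basisOfRecord = c("PRESERVED_SPECIMEN", "HUMAN_OBSERVATION",
                                      "PRESERVED_SPECIMEN", "PRESERVED_SPECIMEN"),
      Genus_species.countryCode = c("US", "US", "MX", "US"),
      Genus_species.latitude = c(33.0, 34.1, 29.5, 0),
      Genus_species.longitude = c(-108.5, -107.2, -105.0, 0)
    )

    toy_clean <- toy_records %>%
      # Keep preserved specimens recorded in the US
      filter(Genus_species.basisOfRecord == "PRESERVED_SPECIMEN",
             Genus_species.countryCode == "US") %>%
      # Drop records whose coordinates were entered as 0
      filter(Genus_species.latitude != 0, Genus_species.longitude != 0) %>%
      # Keep two columns and rename them in the same step
      select(latitude = Genus_species.latitude,
             longitude = Genus_species.longitude)

    # Hypothetical rectangular outline standing in for real map data
    toy_outline <- data.frame(long = c(-110, -104, -104, -110),
                              lat = c(31, 31, 36, 36),
                              group = 1)

    # Layers draw in the order they are added: outline first, points on top
    toy_map <- ggplot() +
      geom_polygon(data = toy_outline,
                   aes(x = long, y = lat, group = group),
                   fill = "white", color = "black") +
      geom_point(data = toy_clean, aes(x = longitude, y = latitude)) +
      coord_quickmap()
    ```

    Printing toy_map draws the outline with the single remaining point on top of it.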
  3. Species Occurrences Elevation Histogram (30 pts)

    This is a follow up to Species Occurrences Map.

    Now that you’ve mapped some species occurrence data you want to understand how environmental factors influence the species distribution.

    1. The raster package provides access to some datasets, including one of global elevations, that can be retrieved with the getData function as follows:

       elevation = getData("alt", country = "US")
       elevation = elevation[[1]]
      

      Create a new version of the map from Species Occurrences Map that shows the elevation data as well. Plotting the elevation data may take a while because there are a lot of data points in the dataset. Pay attention to the order that the geom_ objects are plotted in. The name of the elevation variable is USA1_msk_alt. If the website is down you can download a copy from the course site by downloading http://www.datacarpentry.org/semester-biology/data/wc10.zip, unzipping it into your home directory (/Users/username on Mac, /home/username on Linux, C:\Users\username\Documents on Windows), and using the command elevation = getData("alt", country = "US", path = ".")

    2. Turn the dipo_df dataframe from Species Occurrences Map into a SpatialPointsDataFrame, making sure that its projection matches that of the elevation dataset, and extract the elevation values for all of the kangaroo rat occurrences. Turn this subset of elevation values into a dataframe and plot a histogram of the elevations.

    3. Part 2 showed us the elevations where banner-tailed kangaroo rats occur, but without context it’s hard to tell how important elevation is. Make a new graph that shows histograms for all elevations in the US in gray and the kangaroo rat elevations in red. Plot the kangaroo rat elevations on top of the full elevations and make them transparent so that you can see the overlap. To get the histograms on the same scale we need to plot the density of points instead of the total number of points. This can be done in ggplot using code like:

       ggplot() +
         geom_histogram(data = elevations, aes(x = USA1_msk_alt, y = ..density..))
      

      Label the x axis elevation and add the title “Kangaroo rat habitat elevation relative to background”.
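
    The workflow in this exercise (points to SpatialPointsDataFrame, extraction, overlaid density histograms) can be rehearsed on synthetic data. This is a rough sketch under stated assumptions, not the assignment solution: toy_elev, toy_points, the elev layer name, the extent, and the random elevation values are all made up.

    ```r
    library(raster)   # attaches sp, which provides SpatialPointsDataFrame
    library(ggplot2)

    # Hypothetical elevation raster in longitude/latitude with made-up values
    toy_elev <- raster(nrows = 20, ncols = 20,
                       xmn = -110, xmx = -100, ymn = 30, ymx = 40,
                       crs = "+proj=longlat +datum=WGS84")
    set.seed(1)
    values(toy_elev) <- runif(ncell(toy_elev), min = 0, max = 3000)
    names(toy_elev) <- "elev"

    # Hypothetical occurrence coordinates
    toy_points <- data.frame(longitude = c(-108.5, -105.2, -102.7),
                             latitude = c(33.0, 35.4, 38.1))

    # Build spatial points that reuse the raster's projection
    toy_spdf <- SpatialPointsDataFrame(
      coords = toy_points[c("longitude", "latitude")],
      data = toy_points,
      proj4string = crs(toy_elev)
    )

    # Elevation under each point, and every elevation in the raster
    point_elev <- data.frame(elev = raster::extract(toy_elev, toy_spdf))
    all_elev <- data.frame(elev = values(toy_elev))

    # Background in gray, occurrences in semi-transparent red on top.
    # ..density.. matches the snippet above; newer ggplot2 releases
    # prefer after_stat(density).
    toy_hist <- ggplot() +
      geom_histogram(data = all_elev, aes(x = elev, y = ..density..),
                     fill = "gray", bins = 30) +
      geom_histogram(data = point_elev, aes(x = elev, y = ..density..),
                     fill = "red", alpha = 0.5, bins = 30) +
      labs(x = "elevation")
    ```

    Using density on the y axis puts the two histograms on a comparable scale even though one holds 400 values and the other only 3.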


Assignment submission & checklist