Learning Objectives
Following this assignment students should be able to:
- practice basic syntax and usage of for loops
- use for loops to automate function operations
- understand how to decompose complex problems
Reading
Lecture Notes
Exercises
DNA or RNA Iteration (25 pts)
This is a follow-up to DNA or RNA.
Write a function, dna_or_rna(sequence), that determines if a sequence of base pairs is DNA, RNA, or if it is not possible to tell given the sequence provided. Since all the function will know about the material is the sequence, the only way to tell the difference between DNA and RNA is that RNA has the base Uracil ("u") instead of the base Thymine ("t"). Have the function return one of three outputs: "DNA", "RNA", or "UNKNOWN".
- Use the function and a for loop to print the type of the sequences in the following list.
- Use the function and sapply to print the type of the sequences in the following list.
sequences = c("ttgaatgccttacaactgatcattacacaggcggcatgaagcaaaaatatactgtgaaccaatgcaggcg", "gauuauuccccacaaagggagugggauuaggagcugcaucauuuacaagagcagaauguuucaaaugcau", "gaaagcaagaaaaggcaggcgaggaagggaagaagggggggaaacc", "guuuccuacaguauuugaugagaaugagaguuuacuccuggaagauaauauuagaauguuuacaacugcaccugaucagguggauaaggaagaugaagacu", "gauaaggaagaugaagacuuucaggaaucuaauaaaaugcacuccaugaauggauucauguaugggaaucagccggguc")
Optional: For a little extra challenge, make your function work with both upper and lower case letters, or even strings with mixed capitalization.
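One possible shape for this exercise is sketched below; it is not the official solution. It assumes that a sequence containing both "u" and "t", or neither, should be reported as "UNKNOWN", and it lowercases the input to cover the optional mixed-case challenge. The vector example_sequences is made up for illustration and is not the assignment data.

```r
# Sketch only: classify a sequence by which of "u" / "t" it contains.
# Assumption: both present, or neither present, counts as "UNKNOWN".
dna_or_rna <- function(sequence) {
  sequence <- tolower(sequence)  # optional part: accept any capitalization
  has_u <- grepl("u", sequence, fixed = TRUE)
  has_t <- grepl("t", sequence, fixed = TRUE)
  if (has_u && !has_t) {
    "RNA"
  } else if (has_t && !has_u) {
    "DNA"
  } else {
    "UNKNOWN"
  }
}

# Made-up short sequences (hypothetical, not the assignment list)
example_sequences <- c("ttgaatgcc", "gauuauucc", "gaaagcaag", "GAUUACA")

# Version 1: a for loop that prints one result per sequence
for (s in example_sequences) {
  print(dna_or_rna(s))
}

# Version 2: sapply applies the function to every element at once
types <- sapply(example_sequences, dna_or_rna, USE.NAMES = FALSE)
print(types)
```

The for loop makes the iteration explicit, while sapply returns all of the results as a single character vector that can be stored or printed in one step.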
Cocoli Data Exploration (optional)
Understanding the spatial distribution of ecological phenomena is central to the study of natural systems. A group of scientists has collected a dataset on the size, location, and species identity of all of the trees in a 4 ha site in Panama called “Cocoli”.
Download the Cocoli Data and explore the following spatial properties.
- Make a single plot showing the location of each tree for all species with more than 100 individuals. Each species should be in its own subplot (i.e., facet). Label the subplots with the genus and species names, not the species code. Scale the size of the point by its stem diameter (use dbh1) so that larger trees display as larger points. Have the code save the plot in a figures folder in your project.
- Basal area is a common measure in forest management and ecology. It is the sum of the cross-sectional areas of all of the trees occurring in some area and can be calculated as the sum of 0.00007854 * DBH^2 over all of the trees. To look at how basal area varies across the site, divide the site into 100 m^2 sample regions (10 x 10 m cells) and determine the total basal area in each region. I.e., take all of the trees in a grid cell where x is between 0 and 10 and y is between 0 and 10 and determine their basal area. Do the same thing for x between 0 and 10 and y between 10 and 20, and so on. You can do this using two “nested” for loops to subset the data and calculate the basal area in that region. Make a plot that shows how the basal area varies spatially. Since the calculation is for a square region, plot it that way using geom_tile() with the center of the tile at the center of the region where basal area was calculated. Have the code save the plot in a figures folder in your project.
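The nested-loop calculation of basal area per grid cell could look roughly like the sketch below. It runs on a tiny made-up data frame (trees, covering a hypothetical 20 x 20 m area) rather than the Cocoli file, and the column names x and y are assumptions; only dbh1 and the 0.00007854 * DBH^2 formula come from the exercise. Cells are treated as half-open intervals (lower edge included, upper edge excluded), which is also an assumption.

```r
# Hypothetical toy data standing in for the Cocoli trees
trees <- data.frame(
  x    = c(1, 5, 12, 15, 3, 18),
  y    = c(2, 8, 3, 14, 16, 19),
  dbh1 = c(100, 200, 150, 300, 50, 250)
)

cell_size <- 10
x_starts <- seq(0, 10, by = cell_size)  # lower-left corners of the cells
y_starts <- seq(0, 10, by = cell_size)

basal_area <- data.frame()
for (x0 in x_starts) {
  for (y0 in y_starts) {
    # Trees whose coordinates fall inside the current 10 x 10 m cell
    in_cell <- trees$x >= x0 & trees$x < x0 + cell_size &
               trees$y >= y0 & trees$y < y0 + cell_size
    # Sum of 0.00007854 * DBH^2 over the trees in this cell
    cell_ba <- sum(0.00007854 * trees$dbh1[in_cell]^2)
    # Store the cell center so each result can later be drawn as a tile
    basal_area <- rbind(
      basal_area,
      data.frame(x_center = x0 + cell_size / 2,
                 y_center = y0 + cell_size / 2,
                 basal_area = cell_ba)
    )
  }
}
print(basal_area)
```

Storing the cell centers alongside each total means the resulting data frame has one row per tile, with x_center and y_center as the position and basal_area as the fill value, which is the layout geom_tile() expects.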
Length of Floods (optional)
You are interested in studying the influence of the timing and length of small-scale flood events on an ecosystem. To do this, you need to determine when floods occurred and how long they lasted based on stream gauge data.
Download the stream gauge data for USGS stream gauge site 02236000 on the St. Johns River in Florida. Find the continuous pieces of the time-series where the stream level is above the flood threshold of 2.26 feet and store the information on the start date and length of each flood in a data frame.
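One way to find continuous stretches above the threshold is to walk through the series with a for loop and keep a running count, as in the sketch below. The data frame gauge, its column names date and height, and its values are made up for illustration and do not reflect the layout of the USGS file. The sketch also assumes one record per day, so the flood length is simply the number of consecutive records above 2.26.

```r
# Hypothetical toy series; the real USGS data will look different
gauge <- data.frame(
  date   = as.Date("2020-01-01") + 0:9,
  height = c(2.0, 2.3, 2.5, 2.1, 2.0, 2.4, 2.6, 2.7, 2.2, 2.3)
)
threshold <- 2.26

floods <- data.frame()
flood_length <- 0  # number of consecutive records above the threshold so far
for (i in seq_len(nrow(gauge))) {
  if (gauge$height[i] > threshold) {
    if (flood_length == 0) {
      start_date <- gauge$date[i]  # first record of a new flood
    }
    flood_length <- flood_length + 1
  }
  # A flood ends when the level drops back down or the series runs out
  ends_here <- gauge$height[i] <= threshold || i == nrow(gauge)
  if (ends_here && flood_length > 0) {
    floods <- rbind(
      floods,
      data.frame(start_date = start_date, length_days = flood_length)
    )
    flood_length <- 0
  }
}
print(floods)
```

With the toy data this records three floods, starting on 2020-01-02, 2020-01-06, and 2020-01-10, with lengths of 2, 3, and 1 days. The final check on i == nrow(gauge) is there so that a flood still in progress at the end of the record is not silently dropped.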