Question #1

Using the built-in iris dataset:

  1. Write a for() loop that tests whether each numeric column in the iris dataset is normally distributed using the shapiro.test() function, and print the p-value from each test. You must write a for loop to get credit for this question.
  2. Make a histogram of each numeric variable in iris.
  3. Calculate the mean and median of the squared values of iris$Sepal.Length (computed as iris$Sepal.Length^2).
  4. Explain in words why these mean and median values differ.
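
As a starting point for step 1, here is a minimal sketch of looping over the numeric columns of a data frame and collecting one Shapiro-Wilk p-value per column. The names numeric.cols and p.values are illustrative choices, not required by the question, and this is only one of several reasonable ways to structure the loop.

```r
# Names of the numeric columns in the built-in iris data frame
numeric.cols <- names(iris)[sapply(iris, is.numeric)]

# Named vector to hold one p-value per numeric column
# (p.values is an illustrative name, not part of the assignment)
p.values <- numeric(length(numeric.cols))
names(p.values) <- numeric.cols

for (col in numeric.cols) {
  # shapiro.test() returns a list; the p-value is stored in $p.value
  result <- shapiro.test(iris[[col]])
  p.values[col] <- result$p.value
  print(paste(col, "p-value:", signif(result$p.value, 3)))
}
```

Storing the p-values in a named vector, in addition to printing them, makes it easy to compare columns afterwards. A call to hist(iris[[col]], main = col) placed inside the same loop would cover step 2.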

Question #2

One very common problem in biostatistics is that many characters are not quite normally distributed. To see how this arises, work through the following steps:

  1. Generate a vector called body.sizes for 1000 individuals sampled from a normal distribution with mean = 10 grams and sd = 3 grams. Save a copy of this in a variable called body.sizes.original so we can always go back and look at the original.

  2. Simulate body growth, assuming that each individual grows by a fixed percentage each year \(t\), so that \(\text{size}(t+1) = g \cdot \text{size}(t)\). A reasonable value of \(g\) might be 1.15 (15% growth per year). Write a loop representing 10 years of growth, in which you update the vector body.sizes by multiplying it by the annual growth rate once per year. Save the result to a new variable called body.sizes.growth.

  3. Next, model inter-individual variation in growth rates. Create a vector of growth rates for the 1000 individuals, drawn from a normal distribution with mean 1.15 and standard deviation 0.2. Reset body.sizes to its original values. Again simulate 10 years of growth, this time applying each individual’s specific growth rate. Note: an individual’s growth rate stays the same from year to year, but each individual should have a different growth rate. Save the results to a new variable called body.sizes.vargrowth.

  4. Finally, see what happens when environmental stochasticity is added. Reset body.sizes to its original values. Repeat the variable growth scenario, this time adding yearly fluctuations. For each year, multiply ALL individuals by a single scaling factor representing the productivity of that particular year. Draw that scaling factor from a normal distribution with mean 1 and standard deviation 0.2. Save the results to body.sizes.environmentalgrowth.

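  5. Do a Shapiro-Wilk test for normality on all four distributions (body.sizes.original, body.sizes.growth, body.sizes.vargrowth, and body.sizes.environmentalgrowth). Which ones are normal, and which are not? Produce Q-Q plots of all four. Describe in words the deviations from normality.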

  6. To conclude, answer the following questions:

    1. Does growth of a normally-distributed population lead to non-normal distributions?
    2. Does variation in growth rates among individuals cause non-normality?
    3. Does environmental stochasticity reduce or amplify body size variation?
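
The three growth scenarios in Question #2 could be organised roughly as in the sketch below. It assumes the parameter values stated in the steps above. The helper names n.individuals, n.years, g, growth.rates and year.factor, and the call to set.seed(), are illustrative additions and not part of the question. Treat it as one possible layout rather than a model answer.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible

n.individuals <- 1000   # illustrative helper names, not required by the question
n.years <- 10
g <- 1.15               # constant annual growth rate from step 2

# Step 1: initial sizes, with a copy kept for resetting later
body.sizes <- rnorm(n.individuals, mean = 10, sd = 3)
body.sizes.original <- body.sizes

# Step 2: every individual grows by the same rate each year
for (year in 1:n.years) {
  body.sizes <- body.sizes * g
}
body.sizes.growth <- body.sizes

# Step 3: one growth rate per individual, fixed across years
growth.rates <- rnorm(n.individuals, mean = 1.15, sd = 0.2)
body.sizes <- body.sizes.original
for (year in 1:n.years) {
  body.sizes <- body.sizes * growth.rates
}
body.sizes.vargrowth <- body.sizes

# Step 4: individual rates plus one shared environmental factor per year
body.sizes <- body.sizes.original
for (year in 1:n.years) {
  year.factor <- rnorm(1, mean = 1, sd = 0.2)  # same value for ALL individuals
  body.sizes <- body.sizes * growth.rates * year.factor
}
body.sizes.environmentalgrowth <- body.sizes

# Step 5: Shapiro-Wilk p-values for the four distributions
scenarios <- list(original = body.sizes.original,
                  growth = body.sizes.growth,
                  vargrowth = body.sizes.vargrowth,
                  environmentalgrowth = body.sizes.environmentalgrowth)
shapiro.p <- sapply(scenarios, function(x) shapiro.test(x)$p.value)
print(shapiro.p)
```

For the Q-Q plots, calling qqnorm() followed by qqline() on each of the four vectors is one straightforward option. Drawing year.factor inside the loop with rnorm(1, ...) is what makes the environmental effect shared across individuals within a year but different between years.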