## Problem # 1

Using this fake dataset on lemur feeding rates, run a two-factor ANOVA to determine whether feeding rate is influenced by sex and by the group each individual belongs to.

1. Make a boxplot of feeding rate by sex.
2. Make a boxplot of feeding rate by group.
3. Show the ANOVA table for the two-way ANOVA.
4. Make an interaction plot for your two-way ANOVA.
5. Explain whether or not there is an interaction between the two factors.
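
The steps above can be sketched on a toy data frame. This is only an illustration under stated assumptions: the column names `sex`, `group`, and `feedingRate`, the group labels, and the simulated values are all made up here, and the real dataset may use different names.

```r
set.seed(1)

# Toy stand-in for the lemur data (hypothetical column names and values)
toy <- expand.grid(sex = c("F", "M"), group = c("A", "B", "C"), rep = 1:5)
toy$feedingRate <- rnorm(nrow(toy), mean = 10, sd = 2)

# Steps 1 and 2: boxplots of the response by each factor
boxplot(feedingRate ~ sex, data = toy)
boxplot(feedingRate ~ group, data = toy)

# Step 3: two-way ANOVA; sex * group expands to both main effects
# plus the sex:group interaction
fit <- aov(feedingRate ~ sex * group, data = toy)
anovaTable <- summary(fit)[[1]]
print(anovaTable)

# Step 4: interaction plot (mean response per group, one line per sex)
with(toy, interaction.plot(x.factor = group, trace.factor = sex,
                           response = feedingRate))
```

In the interaction plot, roughly parallel lines suggest little interaction, while lines that cross or diverge suggest one. The `sex:group` row of the ANOVA table gives the formal test.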

Note: the first column of this dataset contains row numbers, so it doesn’t have a column heading. This is a common situation, and `read.table()` has an option to deal with it: the `row.names` argument, which indicates which column number contains the row names (in this case the names are numbers!). The following code will work: `read.table("https://hompal-stats.wabarr.com/datasets/lemurfeeding.txt", header = TRUE, row.names = 1)`
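
To see what `row.names = 1` does without downloading anything, here is a small sketch that reads a few lines of inline text laid out the same way (a header with one fewer field than the data rows). The column names and values are invented for illustration and are not taken from the real file.

```r
# Hypothetical miniature file: the header names three columns, but each
# data row has four fields because the first field is a row number
rawText <- "sex group feedingRate
1 F A 10.2
2 M A 9.8
3 F B 11.5"

toy <- read.table(text = rawText, header = TRUE, row.names = 1)

# The row numbers become row names rather than a data column
print(toy)
print(rownames(toy))
```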

## Problem # 2

Write a function to simulate the kind of data suitable for ANOVA, with two groups and some difference in means between the groups. You can simulate the data using the equation you saw in the ANOVA slide show.

$Y_{ij}=\mu + A_i + \epsilon_{ij}$

Assume that the design is balanced (groups have same sample size).

Your function should take the following arguments and default values:

• `sampleSize`: default 20. The total sample size of both groups combined.
• `grandMean`: default 50. The grand mean of the data (ignoring group).
• `errorSD`: default 5. The standard deviation of the error term.
• `meanDiff`: default 3. The difference between the means of the two treatment groups.
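
One way to connect these arguments to the model equation is to split the difference symmetrically around the grand mean. This is an assumption for illustration, not the only valid parameterisation:

$A_1 = -\frac{\text{meanDiff}}{2}, \quad A_2 = +\frac{\text{meanDiff}}{2}, \quad \epsilon_{ij} \sim N(0, \text{errorSD}^2)$

With the default values, the two group means are then

$\mu + A_1 = 50 - 1.5 = 48.5, \quad \mu + A_2 = 50 + 1.5 = 51.5$

with 10 observations in each group. Because the design is balanced, the average of the two group means is still the grand mean of 50.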

Your function should do the following:

• create the error term
• create the grouping vector
• simulate the y variable using the supplied parameters
• make a boxplot with the groups on the x axis and the value of y on the y axis (you may have to explicitly print your plot if you use ggplot2 from within your function)
• do the ANOVA
• return the p-value from the ANOVA
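
A minimal sketch of how these steps might fit together is shown below. The function name `simulateANOVA`, the group labels `"A"` and `"B"`, and the choice to give the groups effects of minus and plus half of `meanDiff` are all assumptions made here for illustration. It uses base graphics, so no explicit `print()` of the plot is needed.

```r
# Hypothetical function name; arguments and defaults follow the list above
simulateANOVA <- function(sampleSize = 20, grandMean = 50,
                          errorSD = 5, meanDiff = 3) {
  # A balanced two-group design needs an even total sample size
  stopifnot(sampleSize %% 2 == 0)
  nPerGroup <- sampleSize / 2

  # Error term: one normal draw per observation
  errorTerm <- rnorm(sampleSize, mean = 0, sd = errorSD)

  # Grouping vector (labels "A" and "B" are arbitrary)
  group <- factor(rep(c("A", "B"), each = nPerGroup))

  # Treatment effects (assumption: symmetric split of meanDiff)
  effect <- ifelse(group == "A", -meanDiff / 2, meanDiff / 2)

  # Y_ij = mu + A_i + epsilon_ij
  y <- grandMean + effect + errorTerm

  # Boxplot with groups on the x axis and y on the y axis
  boxplot(y ~ group, xlab = "group", ylab = "y")

  # One-way ANOVA; the first row of the table is the group effect
  fit <- aov(y ~ group)
  summary(fit)[[1]][["Pr(>F)"]][1]
}

set.seed(42)
simulateANOVA()
```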

### End by answering the following questions:

1. What effect (if any) does increasing the sample size have on the p-value for a given set of parameters?
2. What about increasing the grand mean?
3. What about increasing the standard deviation of the error term?
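
Because a single simulated p-value is noisy, one way to explore these questions is to repeat the simulation many times per parameter setting and compare a summary such as the median. The sketch below is only one possible approach: `simulatePValue` and `medianP` are hypothetical helper names, the plot is left out to keep the loop fast, and the symmetric split of `meanDiff` between the groups is an assumption.

```r
# Hypothetical helper: the simulation described above, without the plot
simulatePValue <- function(sampleSize = 20, grandMean = 50,
                           errorSD = 5, meanDiff = 3) {
  group <- factor(rep(c("A", "B"), each = sampleSize / 2))
  # Assumption: effects of -meanDiff/2 and +meanDiff/2
  effect <- ifelse(group == "A", -meanDiff / 2, meanDiff / 2)
  y <- grandMean + effect + rnorm(sampleSize, mean = 0, sd = errorSD)
  summary(aov(y ~ group))[[1]][["Pr(>F)"]][1]
}

# Median p-value over nReps simulated datasets for one parameter setting
medianP <- function(nReps = 200, ...) {
  median(replicate(nReps, simulatePValue(...)))
}

set.seed(123)
results <- c(
  baseline        = medianP(),
  largerSample    = medianP(sampleSize = 200),
  largerGrandMean = medianP(grandMean = 500),
  largerErrorSD   = medianP(errorSD = 15)
)
print(round(results, 4))
```

Comparing each entry against `baseline` gives a starting point for answering the three questions in your own words.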

Make sure to show me a run of your function to prove that it works!