For all problems, unless otherwise indicated, assume a Type 1 error rate = .05.
An emotion researcher regularly uses different emotional stimuli within a given category (e.g., a set of happiness-inducing film clips) to elicit emotional responses among subjects. Let's say that she is interested in the degree to which variability in responses to happiness-inducing stimuli is due to variability among the stimuli themselves. She conducts an experiment in which 50 subjects are randomly assigned to one of 5 stimulus conditions (ni = 10). Subjects in each stimulus condition are shown one film clip designed to elicit happiness. A different film clip is used in each condition. The 5 clips can be considered a random sample from a much larger population of potentially happiness-inducing clips. Self-report responses to each film clip are recorded on a 1 to 100 scale (higher scores indicate greater happiness).
The data are as follows:
Clip | Obs 1 | Obs 2 | Obs 3 | Obs 4 | Obs 5 | Obs 6 | Obs 7 | Obs 8 | Obs 9 | Obs 10 |
---|---|---|---|---|---|---|---|---|---|---|
Clip #1 | 51 | 62 | 67 | 56 | 60 | 63 | 65 | 60 | 65 | 34 |
Clip #2 | 50 | 52 | 33 | 59 | 51 | 54 | 68 | 51 | 60 | 43 |
Clip #3 | 56 | 53 | 38 | 41 | 33 | 49 | 64 | 49 | 50 | 53 |
Clip #4 | 66 | 73 | 74 | 71 | 51 | 49 | 57 | 60 | 68 | 54 |
Clip #5 | 46 | 65 | 60 | 53 | 42 | 61 | 57 | 64 | 40 | 55 |
Indicate the null and alternative hypotheses.
H0: The amount of happiness induced by a film clip is a function solely of the viewer watching the clip and not of the clip itself. Essentially, these five clips are drawn at random from a population of clips that are all equally happiness-inducing.
If any clip from this population were shown to everyone in the world, the average response would be the same:
$$H_0: \mu_j = \mu \quad \text{for every clip } j \text{ in the population}$$
Another way of saying this is that the variability due to differences among the clips is zero:
$$H_0: \sigma^2_\alpha = 0$$
where $\sigma^2_\alpha$ is the variance of the clip effects in the population of clips.
H1: The amount of happiness induced by watching a film clip is a function of both the viewer and the clip being watched:
$$H_1: \sigma^2_\alpha > 0$$
Recall that a variance can't be less than zero, so the alternative is one-sided.
Really, what is being tested by this experiment is whatever method is being used to pick clips "designed to elicit happiness." The ANOVA doesn't seem to be the appropriate test, however.
I'm assuming that if I designed a method for picking clips that elicit happiness, I would like to prove that it works. Suppose instead I design a method that I believe won't work, such as alternating pictures of kittens and surgery. I can test whether it fails, but showing that it fails would not be a surprising finding.
All that the ANOVA can do is fail to reject the hypothesis that the clips are equivalent. Failing to reject the null is not the same thing as supporting it. Instead of the ANOVA, a test is needed whose null hypothesis is that the means are unequal.
A situation where my experimental hypothesis might genuinely be that the means are unequal is one where these five clips were used in another experiment and I am hypothesizing that they were responsible for some of the variability in that experiment. In that situation, though, the clips are not random; they're fixed.
I suppose if the researcher has a very large sample of clips that are used to elicit happiness and she is spot checking the whole set by selecting five and testing them, then this experiment makes some sense. She would likely want to do contrasts then if the ANOVA failed.
Another option would be a set of clips that would be assumed to elicit similar amounts of happiness. Maybe I have five clips of puppies playing and I want to assess how much of the variability comes from the concept of "puppy" as opposed to the specific characteristics of the puppy playing. That makes sense as a research question.
Conduct a one-way ANOVA on these data. You can either use SAS or a hand computation. Indicate the value of your observed F and whether it is statistically significant. Interpret the results.
For this data, R produces:
Source | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
---|---|---|---|---|---|
Clip_Number | 4 | 1139.7 | 284.9 | 3.2678 | 0.01954 |
Residuals | 45 | 3923.6 | 87.2 | | |
The observed F is F(4, 45) = 3.27, p = .020, which is statistically significant at the .05 level. The null hypothesis is rejected, and the experimenter's belief that the clips are not equal in the amount of happiness they elicit is supported.
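Since the problem also allows a hand computation, here is a minimal sketch of that route in plain Python. It is only an illustration using the standard library, not the R call that produced the output above, and the variable names (`clips`, `ss_between`, and so on) are my own.

```python
# One-way ANOVA by hand for the five film clips (n = 10 subjects per clip).
clips = {
    1: [51, 62, 67, 56, 60, 63, 65, 60, 65, 34],
    2: [50, 52, 33, 59, 51, 54, 68, 51, 60, 43],
    3: [56, 53, 38, 41, 33, 49, 64, 49, 50, 53],
    4: [66, 73, 74, 71, 51, 49, 57, 60, 68, 54],
    5: [46, 65, 60, 53, 42, 61, 57, 64, 40, 55],
}

scores = [y for ys in clips.values() for y in ys]
grand_mean = sum(scores) / len(scores)

# Between-clip SS: squared deviation of each clip mean from the grand mean,
# weighted by the number of subjects who saw that clip.
ss_between = sum(
    len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in clips.values()
)
# Within-clip SS: squared deviation of each score from its own clip mean.
ss_within = sum(
    (y - sum(ys) / len(ys)) ** 2 for ys in clips.values() for y in ys
)

df_between = len(clips) - 1
df_within = len(scores) - len(clips)
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(f"SS between = {ss_between:.1f}, SS within = {ss_within:.1f}")
print(f"MS between = {ms_between:.1f}, MS within = {ms_within:.1f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.4f}")
```

The sums of squares and the F ratio should agree with the R output to rounding. The p-value is left to software because it needs the F distribution.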
Indicate the expected values for the Mean Squares that you used to compute the F ratio of interest in the ANOVA that you conducted in part b.
For a one-way random effects design:
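Effect | E(Mean Square) | H0 | $E(\text{Mean Square} \mid H_0)$ |
---|---|---|---|
Random (clip) | $\sigma^2_\varepsilon + n\sigma^2_\alpha$ | $\sigma^2_\alpha = 0$ | $\sigma^2_\varepsilon$ |
Within | $\sigma^2_\varepsilon$ | | $\sigma^2_\varepsilon$ |

Here $n = 10$ is the number of subjects per clip, $\sigma^2_\alpha$ is the variance of the clip effects, and $\sigma^2_\varepsilon$ is the within-clip error variance. Under H0 both mean squares estimate $\sigma^2_\varepsilon$, which is why the F ratio is the Random mean square divided by the Within mean square.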
Based on these sample data, compute an unbiased estimate of the proportion of the total variability in self-report responses in the population that is due to variations in film clip content.
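The proportion of the total variability that comes from the random clip factor rather than from the subjects is the intraclass correlation, $\rho_I$:
$$\rho_I = \frac{\sigma^2_\alpha}{\sigma^2_\alpha + \sigma^2_\varepsilon}$$
This quantity requires an unbiased estimate of the variance from the factor:
$$\hat\sigma^2_\alpha = \frac{MS_{\text{Random}} - MS_{\text{Within}}}{n}$$
For this data, R calculates:
$$\hat\sigma^2_\alpha = \frac{284.9 - 87.2}{10} = 19.77, \qquad \hat\rho_I = \frac{19.77}{19.77 + 87.2} = 0.185$$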
This means that 18.5% of the overall variability comes from a difference in the clips rather than from the subjects.
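The arithmetic behind that percentage can be checked with a short sketch. It takes the sums of squares and degrees of freedom from the ANOVA table for part b as given; the variable names are mine, and this is not the R code used for the original calculation.

```python
# Variance-component and intraclass-correlation estimates from the
# sums of squares reported in the one-way ANOVA table.
ms_between = 1139.7 / 4    # Sum Sq / Df for Clip_Number
ms_within = 3923.6 / 45    # Sum Sq / Df for Residuals
n_per_clip = 10

# Estimate of the clip variance component: (MS_between - MS_within) / n.
var_clip = (ms_between - ms_within) / n_per_clip
# MS_within is itself the estimate of the error variance.
var_error = ms_within

# Estimated proportion of total variance attributable to clips.
icc = var_clip / (var_clip + var_error)

print(f"clip variance component = {var_clip:.2f}")
print(f"intraclass correlation  = {icc:.3f}")
```

If the between mean square were smaller than the within mean square, the variance estimate would come out negative. The usual convention is then to report it as zero.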
Why would we be unlikely to follow up this analysis with contrasts testing for differences between the means of the five film clips?
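A contrast of these specific means would only tell us whether these particular clips differ in how much happiness they induce. Because the five clips are a random sample from a much larger population of clips, differences among these specific clips are not a generalizable research question.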
A pharmaceutical company is interested in comparing the efficacy of three drugs (A, B, and C) for the treatment of panic disorder. In this context, drug can be considered a fixed effect. Four hospitals are recruited for the study. At each hospital, three patients within each of the three drug conditions are treated. The post-treatment scores on an inventory of panic symptoms are as follows (lower scores indicate fewer symptoms and better treatment effects):
Hospital | Drug A | Drug B | Drug C |
---|---|---|---|
LA General | 81 | 109 | 106 |
LA General | 91 | 93 | 105 |
LA General | 67 | 95 | 109 |
Chicago VA | 86 | 105 | 111 |
Chicago VA | 75 | 111 | 106 |
Chicago VA | 79 | 95 | 102 |
Des Moines Baptist | 89 | 106 | 115 |
Des Moines Baptist | 95 | 115 | 117 |
Des Moines Baptist | 99 | 102 | 106 |
Nashville Centennial | 106 | 115 | 111 |
Nashville Centennial | 111 | 117 | 118 |
Nashville Centennial | 103 | 106 | 114 |
Analyses of these data will focus on the mean of each hospital/drug cell. The cell and marginal means are:
Hospital | Drug A | Drug B | Drug C | Marginal |
---|---|---|---|---|
LA General | 79.667 | 99.000 | 106.667 | 95.111 |
Chicago VA | 80.000 | 103.667 | 106.333 | 96.667 |
Des Moines Baptist | 94.333 | 107.667 | 112.667 | 104.889 |
Nashville Centennial | 106.667 | 112.667 | 114.333 | 111.222 |
Marginal | 90.167 | 105.750 | 110.000 | 101.972 |
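The cell and marginal means can be recomputed directly from the raw scores as a check on the table. The sketch below uses only the Python standard library; the dictionary layout and names are my own choice for illustration.

```python
from statistics import mean

# Post-treatment panic scores: hospital -> drug -> three patients.
scores = {
    "LA General":           {"A": [81, 91, 67],    "B": [109, 93, 95],   "C": [106, 105, 109]},
    "Chicago VA":           {"A": [86, 75, 79],    "B": [105, 111, 95],  "C": [111, 106, 102]},
    "Des Moines Baptist":   {"A": [89, 95, 99],    "B": [106, 115, 102], "C": [115, 117, 106]},
    "Nashville Centennial": {"A": [106, 111, 103], "B": [115, 117, 106], "C": [111, 118, 114]},
}
drugs = ["A", "B", "C"]

# Mean of each hospital/drug cell.
cell_means = {
    h: {d: mean(ys) for d, ys in by_drug.items()} for h, by_drug in scores.items()
}
# Marginal means pool every score in a row (hospital) or column (drug).
hospital_means = {
    h: mean(y for ys in by_drug.values() for y in ys) for h, by_drug in scores.items()
}
drug_means = {
    d: mean(y for by_drug in scores.values() for y in by_drug[d]) for d in drugs
}
grand_mean = mean(y for by_drug in scores.values() for ys in by_drug.values() for y in ys)

for h in scores:
    row = "  ".join(f"{cell_means[h][d]:8.3f}" for d in drugs)
    print(f"{h:22s}{row}  {hospital_means[h]:8.3f}")
col = "  ".join(f"{drug_means[d]:8.3f}" for d in drugs)
print(f"{'Marginal':22s}{col}  {grand_mean:8.3f}")
```

Because the design is balanced (three patients in every cell), pooling the raw scores gives the same marginal means as averaging the cell means.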
Let’s assume that these four clinics are the sole customers of the pharmaceutical company. Thus, the company representatives who are supervising the study are not interested in generalizing the results beyond these four clinics.
What would be the expected values of each of the four relevant mean squares?
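For this section a new notation will be used: $\alpha_j$ is the effect of drug $j$ ($a = 3$ drugs), $\beta_k$ is the effect of hospital $k$ ($b = 4$ hospitals), $(\alpha\beta)_{jk}$ is their interaction, $n = 3$ is the number of patients per cell, and $MS_W$ is the within-cell mean square.
In this situation hospital, like drug, is a fixed factor and the analysis is simply a 4×3 fixed effects ANOVA. The characteristics of the mean squares are:
Effect | E(Mean Square) | H0 | $E(\text{Mean Square} \mid H_0)$ | F-ratio |
---|---|---|---|---|
Fixed α (drug) | $\sigma^2_\varepsilon + nb\sum_j \alpha_j^2/(a-1)$ | all $\alpha_j = 0$ | $\sigma^2_\varepsilon$ | $MS_\alpha / MS_W$ |
Fixed β (hospital) | $\sigma^2_\varepsilon + na\sum_k \beta_k^2/(b-1)$ | all $\beta_k = 0$ | $\sigma^2_\varepsilon$ | $MS_\beta / MS_W$ |
α × β Interaction | $\sigma^2_\varepsilon + n\sum_j\sum_k (\alpha\beta)_{jk}^2/[(a-1)(b-1)]$ | all $(\alpha\beta)_{jk} = 0$ | $\sigma^2_\varepsilon$ | $MS_{\alpha\beta} / MS_W$ |
Within | $\sigma^2_\varepsilon$ | | $\sigma^2_\varepsilon$ | |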
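What would be the null hypotheses for the main effects of drug and of hospital?
With both factors fixed, each null hypothesis states that the corresponding effects (the drug effects $\alpha_j$ or the hospital effects $\beta_k$) are all zero, which is the same as saying that the marginal population means are equal:
$$H_0(\text{drug}): \alpha_1 = \alpha_2 = \alpha_3 = 0 \qquad H_0(\text{hospital}): \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$$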
Now, let’s assume a different situation. Let’s say that these four hospitals can be considered a random sample of the extremely large number of hospitals around the world that the company sells to.
What would be the expected values of each of the four relevant mean squares?
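Effect | E(Mean Square) | H0 | $E(\text{Mean Square} \mid H_0)$ | F-ratio |
---|---|---|---|---|
Fixed α (drug) | $\sigma^2_\varepsilon + n\sigma^2_{\alpha\beta} + nb\sum_j \alpha_j^2/(a-1)$ | all $\alpha_j = 0$ | $\sigma^2_\varepsilon + n\sigma^2_{\alpha\beta}$ | $MS_\alpha / MS_{\alpha\beta}$ |
Random β (hospital) | $\sigma^2_\varepsilon + na\sigma^2_\beta$ | $\sigma^2_\beta = 0$ | $\sigma^2_\varepsilon$ | $MS_\beta / MS_W$ |
α × β Interaction | $\sigma^2_\varepsilon + n\sigma^2_{\alpha\beta}$ | $\sigma^2_{\alpha\beta} = 0$ | $\sigma^2_\varepsilon$ | $MS_{\alpha\beta} / MS_W$ |
Within | $\sigma^2_\varepsilon$ | | $\sigma^2_\varepsilon$ | |

These are the expected mean squares under the usual restricted mixed model, with $n = 3$ patients per cell, $a = 3$ drugs, $b = 4$ hospitals, $\sigma^2_\beta$ the variance of the hospital effects, and $\sigma^2_{\alpha\beta}$ the variance of the drug × hospital interaction effects. Under the unrestricted model, the expected mean square for hospital would also contain $n\sigma^2_{\alpha\beta}$, and hospital would be tested against the interaction instead.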
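One way to see why the error term for drug changes once hospitals are random is to compare ratios of expected mean squares. The sketch below assumes the restricted mixed model and uses my own shorthand: $\theta^2_\alpha = \sum_j \alpha_j^2/(a-1)$ for the fixed drug effects, $\sigma^2_{\alpha\beta}$ for the variance of the drug × hospital interaction effects, $\sigma^2_\varepsilon$ for the within-cell error variance, $n$ patients per cell, and $b$ hospitals.

$$
\frac{E(MS_\alpha)}{E(MS_{\alpha\beta})} = \frac{\sigma^2_\varepsilon + n\sigma^2_{\alpha\beta} + nb\,\theta^2_\alpha}{\sigma^2_\varepsilon + n\sigma^2_{\alpha\beta}},
\qquad
\frac{E(MS_\alpha)}{E(MS_W)} = \frac{\sigma^2_\varepsilon + n\sigma^2_{\alpha\beta} + nb\,\theta^2_\alpha}{\sigma^2_\varepsilon}
$$

The first ratio equals 1 exactly when $\theta^2_\alpha = 0$, so it isolates the drug effect. The second ratio exceeds 1 whenever $\sigma^2_{\alpha\beta} > 0$, even if the drugs do not differ, so testing drug against the within-cell mean square would be too liberal in the mixed design.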
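What would be the null hypotheses for the main effects of drug and of hospital?
Drug is still fixed, so its null hypothesis is that all drug effects are zero, now averaged over the whole population of hospitals. Hospital is random, so its null hypothesis is that the hospital effects have no variance:
$$H_0(\text{drug}): \alpha_1 = \alpha_2 = \alpha_3 = 0 \qquad H_0(\text{hospital}): \sigma^2_\beta = 0$$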
Let’s assume that the assumption articulated in part b holds (i.e., hospitals as random samples).
Using SAS or other software, conduct an omnibus ANOVA that tests each of the main effects and the interaction. Summarize your results and conclusions.
It turns out doing this is not easily explained in R, so I did it in SAS in the interests of surviving the end of the semester:
Source | DF | Type I SS | Mean Square | F Value | Pr > F |
---|---|---|---|---|---|
drug | 2 | 2617.055556 | 1308.527778 | 30.83 | <.0001 |
hosp | 3 | 1523.638889 | 507.879630 | 11.97 | <.0001 |
drug*hosp | 6 | 441.611111 | 73.601852 | 1.73 | 0.1562 |
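Both the hospital and the drug being used have a significant effect on the outcome of treatment of panic disorder. There is no evidence that a particular hospital does especially well or poorly with a particular drug. Note that the F values printed above use the within-cell mean square as the error term. With hospitals treated as random, the drug effect should instead be tested against the drug × hospital mean square: F(2, 6) = 1308.53 / 73.60 = 17.78, p ≈ .003, so the conclusion for drug does not change.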
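For anyone without SAS, the sums of squares and both versions of the drug test can be reproduced by hand. The following is a minimal sketch in plain Python, not the SAS program used above. It assumes the restricted mixed model, in which the fixed drug effect is tested against the drug × hospital mean square, and the closed-form p-value it uses is valid only because the numerator has exactly 2 degrees of freedom.

```python
# Two-way ANOVA sums of squares for the drug x hospital design (n = 3 per cell).
scores = {
    "LA General":           {"A": [81, 91, 67],    "B": [109, 93, 95],   "C": [106, 105, 109]},
    "Chicago VA":           {"A": [86, 75, 79],    "B": [105, 111, 95],  "C": [111, 106, 102]},
    "Des Moines Baptist":   {"A": [89, 95, 99],    "B": [106, 115, 102], "C": [115, 117, 106]},
    "Nashville Centennial": {"A": [106, 111, 103], "B": [115, 117, 106], "C": [111, 118, 114]},
}
drugs = ["A", "B", "C"]
hospitals = list(scores)
a, b, n = len(drugs), len(hospitals), 3


def avg(values):
    values = list(values)
    return sum(values) / len(values)


grand = avg(y for h in hospitals for d in drugs for y in scores[h][d])
drug_mean = {d: avg(y for h in hospitals for y in scores[h][d]) for d in drugs}
hosp_mean = {h: avg(y for d in drugs for y in scores[h][d]) for h in hospitals}
cell_mean = {(h, d): avg(scores[h][d]) for h in hospitals for d in drugs}

ss_drug = n * b * sum((drug_mean[d] - grand) ** 2 for d in drugs)
ss_hosp = n * a * sum((hosp_mean[h] - grand) ** 2 for h in hospitals)
ss_cells = n * sum((cell_mean[h, d] - grand) ** 2 for h in hospitals for d in drugs)
ss_inter = ss_cells - ss_drug - ss_hosp
ss_within = sum(
    (y - cell_mean[h, d]) ** 2 for h in hospitals for d in drugs for y in scores[h][d]
)

df_inter = (a - 1) * (b - 1)
ms_drug = ss_drug / (a - 1)
ms_hosp = ss_hosp / (b - 1)
ms_inter = ss_inter / df_inter
ms_within = ss_within / (a * b * (n - 1))

# Drug tested against within-cell error (fixed-effects test, as printed by SAS).
f_drug_fixed = ms_drug / ms_within
# Drug tested against the interaction (mixed-model test, hospitals random).
f_drug_mixed = ms_drug / ms_inter
# For an F with 2 numerator df, P(F > x) = (1 + 2x/df2) ** (-df2 / 2).
p_drug_mixed = (1 + 2 * f_drug_mixed / df_inter) ** (-df_inter / 2)

print(f"SS drug = {ss_drug:.4f}, SS hosp = {ss_hosp:.4f}, SS drug*hosp = {ss_inter:.4f}")
print(f"F drug vs within      = {f_drug_fixed:.2f}")
print(f"F drug vs interaction = {f_drug_mixed:.2f}, p = {p_drug_mixed:.4f}")
```

Keeping both F ratios side by side makes it easy to see how much the choice of error term matters for the drug effect in this data set.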
Using SAS or other software, conduct a planned contrast (two-tailed) comparing the average of the A and B drug groups to drug C. Summarize your results and conclusions.
Tests of Hypotheses Using the Type III MS for drug*hosp as an Error Term

Contrast | DF | Contrast SS | Mean Square | F Value | Pr > F |
---|---|---|---|---|---|
drug 1&2 vs 3 | 1 | 1160.013889 | 1160.013889 | 15.76 | 0.0074 |
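The contrast is statistically significant, F(1, 6) = 15.76, p = .0074. Because lower scores indicate fewer symptoms and Drug C has the highest mean score, Drug C is less effective than the average of Drugs A and B.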
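The contrast sum of squares and F ratio can also be checked by hand. This is a rough sketch rather than the SAS contrast statement itself: the weights (0.5, 0.5, -1) are my assumption about how "drug 1&2 vs 3" was coded (any proportional set gives the same sum of squares), and the interaction mean square is simply copied from the omnibus SAS output.

```python
# Planned contrast: average of Drugs A and B versus Drug C.
drug_scores = {
    "A": [81, 91, 67, 86, 75, 79, 89, 95, 99, 106, 111, 103],
    "B": [109, 93, 95, 105, 111, 95, 106, 115, 102, 115, 117, 106],
    "C": [106, 105, 109, 111, 106, 102, 115, 117, 106, 111, 118, 114],
}
weights = {"A": 0.5, "B": 0.5, "C": -1.0}   # assumed coding of the contrast
ms_interaction = 73.601852                   # drug*hosp mean square from the SAS output

means = {d: sum(ys) / len(ys) for d, ys in drug_scores.items()}

# Contrast estimate: weighted sum of the drug means.
psi = sum(weights[d] * means[d] for d in weights)
# Contrast SS = psi^2 / sum(c_j^2 / n_j), where n_j is the number of scores per drug.
ss_contrast = psi ** 2 / sum(weights[d] ** 2 / len(drug_scores[d]) for d in weights)

# With hospitals random, the contrast is tested against the interaction MS (1 and 6 df).
f_contrast = ss_contrast / ms_interaction

print(f"psi = {psi:.4f}")
print(f"contrast SS = {ss_contrast:.4f}")
print(f"F(1, 6) = {f_contrast:.2f}")
```

A negative value of `psi` means that the Drug C mean is above the average of the A and B means, which is the direction of poorer treatment outcome on this inventory.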