For all problems, unless otherwise indicated, assume a Type 1 error rate = .05.

  1. A cognitive psychologist is studying the effects of different stimulus-exposure durations on visual search performance. He is also interested in whether such effects differ in middle-aged vs. older individuals. There are 3 levels of the Exposure factor (short, medium, long) and 2 levels of the Age factor (middle-age/old-age). There are 8 participants per group. The means and variances (unbiased estimates of population variance) on the visual search performance measure for the 6 cells of this design are as follows (higher scores on this measure indicate better performance):

                        Exposure
                  Short   Medium   Long     Marginal
    Age  Middle   29      33       44       35.333
         Old      25      29       36       30
    Marginal      27      31       40       32.666

    s^2 (unbiased)      Exposure Factor
                         Short   Medium   Long
    Age Factor  Middle   105     102      108
                Old      110     94       111
    1. Fill in the marginal means in the first table above.

      This assignment will use a new notational shorthand to represent summations and averages:

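      $$y_{\cdot j} = \sum_{i=1}^{n_i} y_{ij} \qquad \bar{y}_{\cdot j} = \frac{\sum_{i=1}^{n_i} y_{ij}}{n_i} \qquad \bar{y}_{\cdot\cdot} = \frac{y_{\cdot\cdot}}{n_i n_j} = \frac{\sum_{i=1}^{n_i} \sum_{j=1}^{n_j} y_{ij}}{n_i n_j} = \frac{\sum_{ij} y_{ij}}{n_i n_j}$$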

      For two factors, A and B, the table below summarizes the set of samples, where $y_{ijk}$ is the ith sample from the jth group of factor A and the kth group of factor B. Frequently, in the data table, the subscript for i is omitted.

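      $$\begin{array}{l|cccccc|c}
      \text{Factor A} \backslash \text{Factor B} & B_1 & B_2 & \cdots & B_k & \cdots & B_{n_k} & \text{Marginal} \\
      \hline
      A_1 & \bar{y}_{11} & \bar{y}_{12} & \cdots & \bar{y}_{1k} & \cdots & \bar{y}_{1 n_k} & \bar{y}_{1\cdot} \\
      A_2 & \bar{y}_{21} & \bar{y}_{22} & \cdots & \bar{y}_{2k} & \cdots & \bar{y}_{2 n_k} & \bar{y}_{2\cdot} \\
      \vdots & & & & & & & \vdots \\
      A_j & \bar{y}_{j1} & \bar{y}_{j2} & \cdots & \bar{y}_{jk} & \cdots & \bar{y}_{j n_k} & \bar{y}_{j\cdot} \\
      \vdots & & & & & & & \vdots \\
      A_{n_j} & \bar{y}_{n_j 1} & \bar{y}_{n_j 2} & \cdots & \bar{y}_{n_j k} & \cdots & \bar{y}_{n_j n_k} & \bar{y}_{n_j \cdot} \\
      \hline
      \text{Marginal} & \bar{y}_{\cdot 1} & \bar{y}_{\cdot 2} & \cdots & \bar{y}_{\cdot k} & \cdots & \bar{y}_{\cdot n_k} & \bar{y}_{\cdot\cdot}
      \end{array}$$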
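
      As a minimal sketch, the marginal means in the first table can be reproduced in R. The object names (`cell.means`, `age.marginal`, `exposure.marginal`, `grand.mean`) are illustrative and not part of the assignment. The sketch relies on the design being balanced, so each marginal mean is the unweighted average of its cell means.

```r
# Cell means from the first table: rows are Age, columns are Exposure.
cell.means = matrix(c(29, 33, 44,
                      25, 29, 36),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(Age = c("Middle", "Old"),
                                    Exposure = c("Short", "Medium", "Long")))

# The design is balanced (8 per cell), so each marginal mean is the
# unweighted average of the cell means in its row or column.
age.marginal = rowMeans(cell.means)
exposure.marginal = colMeans(cell.means)
grand.mean = mean(cell.means)

print(round(age.marginal, 3))
print(round(exposure.marginal, 3))
print(round(grand.mean, 3))
```

      Note that `round` rounds to the nearest value, whereas the table shows truncated decimals, so the last digit can differ by one.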
    2. Find the effects corresponding to each of the marginal means of the Age factor. That is, if we considered Age to be "Factor A" here and Exposure to be "Factor B", find a1 and a2. Include the formula that you are using to determine the effects. Verify that the effects sum to 0.

      Consider two population quantities:

      • $\bar{\mu}_{\text{Middle}}$: the average mean for middle-aged subjects across different levels of exposure.
      • $\bar{\mu}_{\text{Old}}$: the average mean for old subjects across different levels of exposure.

      To say that there is a "main effect" for age is to say that these two quantities differ, so that age plays some part in the response to the visual stimulus.

      Note that this is not the only way for age to play a part. The two averages could be equal while the pattern of means across exposure levels differs between the age groups. That situation would be captured by interaction effects.

      If there is an actual population level effect for age, it is represented for each level of the factor (e.g. "Middle" and "Old") by a deviation from the grand mean:

      $$\alpha_j = \bar{\mu}_{j\cdot} - \bar{\mu}_{\cdot\cdot}$$

      An unbiased estimate of this quantity is simply derived from the sample means:

      $$\hat{\alpha}_j = a_j = \bar{y}_{j\cdot} - \bar{y}_{\cdot\cdot}$$

      For the factor "Age," these quantities are:

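      • $a_{\text{Middle}} = \bar{y}_{\text{Middle}} - \bar{y}_{\cdot\cdot} = 35.333 - 32.666 \approx 2.666$
      • $a_{\text{Old}} = \bar{y}_{\text{Old}} - \bar{y}_{\cdot\cdot} = 30 - 32.666 \approx -2.666$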

      The set of effects for a factor will always sum to 0 (since they are simply the set of unweighted deviations from the mean). This can be easily verified with these two:

      $$\sum_{j \in A} a_j = a_{\text{Middle}} + a_{\text{Old}} = 2.666 + (-2.666) = 0$$
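
      The Age effects can be checked with a short R sketch. The names `cell.means`, `grand.mean`, and `a` are illustrative choices, and the cell means are re-entered from the problem statement so that the snippet runs on its own.

```r
# Cell means: rows are Age (Factor A), columns are Exposure (Factor B).
cell.means = matrix(c(29, 33, 44,
                      25, 29, 36),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("Middle", "Old"),
                                    c("Short", "Medium", "Long")))
grand.mean = mean(cell.means)

# Estimated Age effects: row marginal mean minus grand mean.
a = rowMeans(cell.means) - grand.mean
print(round(a, 3))

# The effects are deviations from their own unweighted mean, so they sum
# to 0 (up to floating-point error).
print(isTRUE(all.equal(sum(a), 0)))
```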
    3. Find the effects corresponding to each of the marginal means of the Exposure factor. Include the formula that you are using to determine the effects. Verify that the effects sum to 0.

      The effects of the row-wise factor, called "Factor A," are represented with the variable α. The effects for the column-wise factor, called "Factor B," are conceptually identical, but represented with the variable β. Likewise, the estimate of the effect of factor A is a and the estimate of the effect of factor B is b.

      For exposure, the estimates of these effects are:

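      • $b_{\text{Short}} = \bar{y}_{\text{Short}} - \bar{y}_{\cdot\cdot} = 27 - 32.666 = -5.666$
      • $b_{\text{Medium}} = \bar{y}_{\text{Medium}} - \bar{y}_{\cdot\cdot} = 31 - 32.666 = -1.666$
      • $b_{\text{Long}} = \bar{y}_{\text{Long}} - \bar{y}_{\cdot\cdot} = 40 - 32.666 \approx 7.333$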

      As expected, these effects also sum to zero:

      $$\sum_{k \in B} b_k = b_{\text{Short}} + b_{\text{Medium}} + b_{\text{Long}} = -5.666 + (-1.666) + 7.333 = 0$$
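
      The same check for the Exposure effects, again as a self-contained sketch with illustrative names (`cell.means`, `grand.mean`, `b`):

```r
# Cell means: rows are Age (Factor A), columns are Exposure (Factor B).
cell.means = matrix(c(29, 33, 44,
                      25, 29, 36),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("Middle", "Old"),
                                    c("Short", "Medium", "Long")))
grand.mean = mean(cell.means)

# Estimated Exposure effects: column marginal mean minus grand mean.
b = colMeans(cell.means) - grand.mean
print(round(b, 3))

# As with Factor A, the effects sum to 0 up to floating-point error.
print(isTRUE(all.equal(sum(b), 0)))
```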
    4. Find the interaction effects, that is each of the (ab)j,k terms. Include the formula that you are using to determine the effects. Verify that the effects sum to 0 within each column and each row.

      The main effects model described above measures the relationship of each factor-wise group to the entire population: for example, the effect of being old ($\alpha_{\text{Old}}$, a main effect for age) or the effect of short exposure ($\beta_{\text{Short}}$). What the main effects model doesn't capture is the situation where being both old and given short exposure, as opposed to, say, old and given long exposure, has a particular effect on the outcome.

      If exposure time and age are completely unrelated, then to predict something about an old short-exposure person one would only need the effect of being old and the effect of short exposure. If, however, there is some connection between these two characteristics, then the factors are said to "interact." In order to properly make an estimate about old short-exposure people, one would then need to know an additional "interaction effect": $(\alpha\beta)_{\text{Old},\text{Short}}$. This term is what remains of the cell mean after the grand mean and both main effects are removed:

      $$(\alpha\beta)_{jk} = \mu_{jk} - \left( \bar{\mu}_{\cdot\cdot} + \alpha_j + \beta_k \right)$$

      The estimate of this term is (ab)j,k, and for this particular set of data, those estimates are:

                          Exposure
                    Short    Medium   Long     Sum
      Age  Middle   -0.666   -0.666    1.333   0
           Old       0.666    0.666   -1.333   0
      Sum            0        0        0       0

      Note that these values sum to zero both row-wise and column-wise.
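
      A minimal R sketch of the interaction estimates, under the same balanced-design assumption. The names `a`, `b`, and `ab` are illustrative; `outer(a, b, "+")` builds the additive prediction a_j + b_k for every cell at once.

```r
cell.means = matrix(c(29, 33, 44,
                      25, 29, 36),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("Middle", "Old"),
                                    c("Short", "Medium", "Long")))
grand.mean = mean(cell.means)
a = rowMeans(cell.means) - grand.mean   # Age effects
b = colMeans(cell.means) - grand.mean   # Exposure effects

# Interaction estimate: cell mean minus the additive (main-effects) prediction.
ab = cell.means - (grand.mean + outer(a, b, "+"))
print(round(ab, 3))

# Row and column sums are 0 up to floating-point error.
print(round(rowSums(ab), 10))
print(round(colSums(ab), 10))
```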

    5. Express the mean of the old age/long exposure cell as a linear combination of the grand mean, the two marginal effects, and the interaction effect and verify that this model reproduces the cell mean.

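      $$\bar{y}_{\cdot\cdot} + a_{\text{Old}} + b_{\text{Long}} + (ab)_{\text{Old},\text{Long}} = 32.666 + (-2.666) + 7.333 + (-1.333) = 36 = \bar{y}_{\text{Old},\text{Long}}$$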
    6. Compute the following sums of squares:

      1. SSAge

        $$SS_{\text{Age}} = \sum_{j \in \text{Age}} n_j a_j^2 = K n \sum_{j \in \text{Age}} a_j^2 = 3 \cdot 8 \left[ 2.666^2 + (-2.666)^2 \right] \approx 341.163$$
      2. SSExposure

        $$SS_{\text{Exposure}} = \sum_{k \in \text{Exposure}} n_k b_k^2 = J n \sum_{k \in \text{Exposure}} b_k^2 \approx 1418.664$$
      3. SS(Age × Exposure)

        $$SS_{\text{Age} \times \text{Exposure}} = \sum_{j \in \text{Age}} \sum_{k \in \text{Exposure}} n_{jk} (ab)_{jk}^2 = n \sum_{jk} (ab)_{jk}^2 \approx 42.6664$$
      4. SSWithin (remember that the contribution of each cell to the SSW can be computed as $(n_{jk} - 1)s_{jk}^2$)

        $$SS_W = \sum_{jk} (n_{jk} - 1) \hat{s}_{jk}^2 = (n - 1) \sum_{jk} \hat{s}_{jk}^2 = 4410$$
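
      The four sums of squares can be recomputed from the cell means and variances alone. This is a sketch with illustrative object names; it keeps full precision, so its results (about 341.333, 1418.667, and 42.667) differ slightly in the later digits from hand calculations that start from truncated effects.

```r
n = 8   # participants per cell
cell.means = matrix(c(29, 33, 44,
                      25, 29, 36), nrow = 2, byrow = TRUE)
cell.vars = matrix(c(105, 102, 108,
                     110,  94, 111), nrow = 2, byrow = TRUE)
J = nrow(cell.means)   # levels of Age
K = ncol(cell.means)   # levels of Exposure

grand.mean = mean(cell.means)
a = rowMeans(cell.means) - grand.mean
b = colMeans(cell.means) - grand.mean
ab = cell.means - (grand.mean + outer(a, b, "+"))

ss.age = K * n * sum(a^2)          # each Age mean is based on K * n scores
ss.exposure = J * n * sum(b^2)     # each Exposure mean is based on J * n scores
ss.interaction = n * sum(ab^2)
ss.within = (n - 1) * sum(cell.vars)

print(round(c(age = ss.age, exposure = ss.exposure,
              interaction = ss.interaction, within = ss.within), 3))
```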
      5. Conduct a two-way ANOVA (alpha = .05, 2-tailed) on these data testing for all relevant main effects and interactions. Show your results in an ANOVA source table. What do you conclude about the effects of Exposure and Age on visual search performance here?

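        $$MS_W = \frac{SS_W}{df_{\text{Within}}} = \frac{SS_W}{JK(n-1)} = \frac{4410}{2 \cdot 3 \cdot (8-1)} = 105$$

        $$MS_{\text{Age}} = \frac{SS_{\text{Age}}}{df_{\text{Age}}} = \frac{SS_{\text{Age}}}{J-1} \approx \frac{341.163}{2-1} = 341.163$$

        $$MS_{\text{Exposure}} = \frac{SS_{\text{Exposure}}}{df_{\text{Exposure}}} = \frac{SS_{\text{Exposure}}}{K-1} \approx \frac{1418.664}{3-1} = 709.332$$

        $$MS_{\text{Age} \times \text{Exposure}} = \frac{SS_{\text{Age} \times \text{Exposure}}}{df_{\text{Age} \times \text{Exposure}}} = \frac{SS_{\text{Age} \times \text{Exposure}}}{(J-1)(K-1)} \approx \frac{42.6664}{(2-1)(3-1)} \approx 21.333$$

        $$\frac{MS_{\text{Age}}}{MS_W} \sim F_{J-1,\ JK(n-1)}, \qquad P\left( F_{2-1,\ 2 \cdot 3 \cdot (8-1)} \ge \frac{341.163}{105} \right) \approx 7.86\%$$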

        The same form is used for all the results and produces the following table:

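        Term          DoF  Sum of Squares  Mean Square  F Value    P(<F)
        age             1       341.33333    341.33333  3.2507937  0.9214333
        exposure        2      1418.66667    709.33333  6.7555556  0.9971407
        age:exposure    2        42.66667     21.33333  0.2031746  0.1830669
        Within         42      4410          105

        The last column is the cumulative probability below the observed F, so the usual p-value is one minus that entry, and an effect is significant at α = .05 when the entry exceeds 0.95.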

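        These results support only a significant main effect for exposure (p ≈ .003). Neither the main effect for age (p ≈ .079) nor the age × exposure interaction (p ≈ .82) reaches significance. Looking at the marginal means (27, 31, 40), this suggests that longer exposure to the visual stimulus improved performance.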
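
        As a sketch of how the source table follows from the sums of squares, the F ratios and tail probabilities can be recomputed with `pf`. The object names are illustrative. The probabilities printed in the table above are cumulative (lower-tail) values, so the sketch reports both those and the conventional upper-tail p-values.

```r
# Sums of squares and degrees of freedom from the source table.
ss = c(age = 341.33333, exposure = 1418.66667, "age:exposure" = 42.66667)
df = c(age = 1, exposure = 2, "age:exposure" = 2)
ss.within = 4410
df.within = 42

ms = ss / df
ms.within = ss.within / df.within   # pooled within-cell variance
f.value = ms / ms.within

# Cumulative probability (as in the table) and the upper-tail p-value.
p.lower = pf(f.value, df, df.within)
p.upper = pf(f.value, df, df.within, lower.tail = FALSE)

print(data.frame(df, ss, ms, f.value, p.lower, p.upper))
print(p.upper < 0.05)   # which effects are significant at alpha = .05
```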

  2. Consider the data from Maxwell and Delaney problem #9 on pp. 346-347. These data are from a study assessing the effects of 3 different treatments for phobias (Desensitization, Implosion, and Insight) and 2 differing levels of fear severity (mild and severe) on Behavioral Avoidance test scores among a group of phobics.

    You can load in the data in R using the following statements:

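    snake.phobia.n = 8  # participants per therapy-by-severity cell
    # Assign the result so that later calls (aov, TukeyHSD) can refer to it by name.
    snake.phobia.frame = data.frame(
      avoidance = c(16, 13, 12, 15, 11, 12, 14, 13, 16, 10, 11, 12, 6, 8, 14, 12,
                    14, 16, 17, 15, 13, 17, 15, 16, 13, 7, 3, 10, 4, 2, 4, 9,
                    15, 15, 12, 14, 13, 11, 11, 12, 15, 10, 11, 7, 5, 12, 6, 8),
      therapy = rep(c("Desensitization", "Implosion", "Insight"),
                    each = snake.phobia.n * 2),
      # Mild and Severe blocks of 8, repeated once for each of the 3 therapies.
      phobia = rep(c("Mild", "Severe"), each = snake.phobia.n, times = 3),
      stringsAsFactors = TRUE)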

    Use R for the questions below that involve computations. At some points you may have to go beyond the R print-out and do hand calculations.

    1. Conduct a two-way ANOVA on these data. Indicate whether each effect tested is statistically significant.

      To start off with a general understanding of the data, a means table is useful:

                                  Type of Therapy
                          Desensitization  Implosion  Insight   Marginal
      Degree of   Mild    13.250           15.375     12.875    13.833
      Phobia      Severe  11.125            6.500      9.250     8.958
      Marginal            12.187           10.937     11.062    11.396

      The higher the score, the less a person is afraid of snakes.

      Looking at the data, we can guess at a general model for the types of therapies:

      • Desensitization brings everyone to about the same point. For severe phobias it is the most effective.
      • Implosion therapy works particularly well for people with mild phobias, but for people with severe phobias it is not effective at all.
      • Insight therapy seems to be less effective overall, but the difference is possibly simply due to the particular samples taken.

      The answers to some of these questions depend on the variances:

                                  Type of Therapy
                          Desensitization  Implosion  Insight   Marginal
      Degree of   Mild     2.786            1.982      2.696     3.536
      Phobia      Severe  10.125           15.143     11.357    14.911
      Marginal             7.229           28.996     10.062     7.348
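
      The cell means and variances can be reproduced with `tapply`. This is a sketch: the data frame is rebuilt here so that the snippet runs on its own, and the names `cell.means`, `cell.vars`, `therapy.means`, and `phobia.means` are illustrative.

```r
snake.phobia.n = 8
snake.phobia.frame = data.frame(
  avoidance = c(16, 13, 12, 15, 11, 12, 14, 13, 16, 10, 11, 12, 6, 8, 14, 12,
                14, 16, 17, 15, 13, 17, 15, 16, 13, 7, 3, 10, 4, 2, 4, 9,
                15, 15, 12, 14, 13, 11, 11, 12, 15, 10, 11, 7, 5, 12, 6, 8),
  therapy = rep(c("Desensitization", "Implosion", "Insight"),
                each = snake.phobia.n * 2),
  phobia = rep(c("Mild", "Severe"), each = snake.phobia.n, times = 3),
  stringsAsFactors = TRUE)

# Cell summaries: rows are severity, columns are therapy.
cell.means = with(snake.phobia.frame,
                  tapply(avoidance, list(phobia, therapy), mean))
cell.vars = with(snake.phobia.frame,
                 tapply(avoidance, list(phobia, therapy), var))

# Marginal means for each factor.
therapy.means = with(snake.phobia.frame, tapply(avoidance, therapy, mean))
phobia.means = with(snake.phobia.frame, tapply(avoidance, phobia, mean))

print(round(cell.means, 3))
print(round(cell.vars, 3))
print(round(therapy.means, 3))
print(round(phobia.means, 3))
```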

      R produces the following output for the ANOVA:

                     Df  Sum Sq Mean Sq F value    Pr(>F)
      therapy         2  15.167   7.583  1.0320  0.365152
      phobia          1 285.188 285.188 38.8104 1.855e-07
      therapy:phobia  2 100.500  50.250  6.8384  0.002686
      Residuals      42 308.625   7.348
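
      A sketch of a call that would produce a table of this form. The name `snake.phobia.frame` is taken from the Tukey output later in this problem; `fit` and `anova.table` are illustrative names, and the data frame is rebuilt so the snippet runs on its own.

```r
snake.phobia.n = 8
snake.phobia.frame = data.frame(
  avoidance = c(16, 13, 12, 15, 11, 12, 14, 13, 16, 10, 11, 12, 6, 8, 14, 12,
                14, 16, 17, 15, 13, 17, 15, 16, 13, 7, 3, 10, 4, 2, 4, 9,
                15, 15, 12, 14, 13, 11, 11, 12, 15, 10, 11, 7, 5, 12, 6, 8),
  therapy = rep(c("Desensitization", "Implosion", "Insight"),
                each = snake.phobia.n * 2),
  phobia = rep(c("Mild", "Severe"), each = snake.phobia.n, times = 3),
  stringsAsFactors = TRUE)

# therapy * phobia expands to both main effects plus their interaction.
fit = aov(avoidance ~ therapy * phobia, data = snake.phobia.frame)
anova.table = summary(fit)[[1]]
print(anova.table)
```

      Because the design is balanced, the order in which the two factors enter the formula does not change the sums of squares.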

      There are three experimental questions addressed here:

      • Main Effect for Therapy: Does the type of therapy have an effect on the effectiveness of treating a phobia of snakes?

        The p-value of 0.365 means that, if the three therapies had equal population means, differences at least as large as those observed would arise from sampling error alone about 36.5% of the time. Because this is well above .05, the researcher cannot reject the null hypothesis that the therapies do not differ.

      • Main Effect for Phobia: Does the severity of a phobia affect how strongly an individual will react to therapy?

        The p-value is extremely small (about 1.9 × 10^-7): if severity made no difference, a gap this large between mild and severe phobics would almost never arise from sampling error alone. So the experimenter is able to reject the null hypothesis and support the experimental belief that severity of phobia affects the avoidance score. Because Phobia has only two levels, no follow-up contrast is needed to locate this difference. Whether the difference varies across the three types of therapy is a question for the interaction, which does require additional analysis, perhaps a contrast.

      • Interaction Effect for Therapy × Phobia: Are certain therapies more effective for people who have stronger or weaker phobias?

        The p-value of 0.0027 is below .05, so the data support an interaction between the type of therapy and the strength of the phobia. It will take further analysis to determine specifically how this interaction takes place.

    2. Conduct a Tukey HSD test assessing whether each of the three pairwise comparisons among the marginal means of the Therapy factor are significantly different. Set the familywise alpha for this set of contrasts at .05. Indicate the minimum mean difference necessary for a comparison to be judged significant and indicate whether each comparison is statistically significant.

      R produces the following output:

        Tukey multiple comparisons of means
          95% family-wise confidence level
          factor levels have been ordered
      
      Fit: aov(formula = avoidance ~ 1 + therapy + phobia + therapy:phobia,
               data = snake.phobia.frame)
      
      $therapy
                                 diff       lwr      upr     p adj
      Insight-Implosion         0.125 -2.203422 2.453422 0.9906676
      Desensitization-Implosion 1.250 -1.078422 3.578422 0.4007739
      Desensitization-Insight   1.125 -1.203422 3.453422 0.4751470

      All of the confidence intervals contain 0, so none of the pairwise comparisons among the therapy marginal means is significant.

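      The minimum mean difference confirms these findings. Each Therapy marginal mean averages over both severity levels, so it is based on Kn = 2 · 8 = 16 scores:

      $$\left| \bar{y}_{\max} - \bar{y}_{\min} \right| \ge q_{1-\alpha;\ J,\ JK(n-1)} \sqrt{\frac{MS_W}{Kn}} = q_{0.95;\ 3,\ 42} \sqrt{\frac{7.348}{16}} \approx 3.436 \times 0.678 \approx 2.33$$

      This agrees with the half-width of the intervals in the R output (2.328). The largest observed difference, 1.25, falls well short of it.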
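
      As a cross-check, the minimum significant difference can be computed with `qtukey`. This sketch assumes 16 observations behind each Therapy marginal mean (2 severity levels × 8 participants), which is the value that reproduces the half-width of the intervals in the R output above (about 2.328). The object names are illustrative.

```r
ms.within = 7.348    # residual mean square from the ANOVA table
df.within = 42
n.per.mean = 2 * 8   # each Therapy marginal mean pools 2 severities x 8 scores

# Studentized range critical value for 3 means at a familywise alpha of .05.
q.crit = qtukey(0.95, nmeans = 3, df = df.within)
min.diff = q.crit * sqrt(ms.within / n.per.mean)
print(round(min.diff, 3))

# Compare every pairwise difference of marginal means with the threshold.
therapy.means = c(Desensitization = 12.1875, Implosion = 10.9375,
                  Insight = 11.0625)
pairwise.diff = abs(outer(therapy.means, therapy.means, "-"))
print(pairwise.diff > min.diff)
```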
    3. In general, would the Tukey HSD test be the optimal multiple comparison procedure for the planned pairwise comparisons you did in part b? If not, what would be the optimal procedure?

      For a factor with three levels (two degrees of freedom), the Fisher LSD is permissible and will be the most powerful procedure. For these data the Fisher LSD does not allow any pairwise contrasts on the type of therapy, because the main effect for Therapy in the ANOVA was not significant.

      In general the Tukey should not be expected to succeed when the Fisher LSD fails, regardless of the degrees of freedom. The reason for avoiding the Fisher LSD with more than three groups is not that other tests are more powerful, but that it does not adequately control the Type I error rate. The ANOVA has already said that there is no reason to believe that any of these means differ. To then use the Tukey to determine which of these non-differing means is different will, not surprisingly, generally fail.

    4. Using the Bonferroni procedure, conduct 3 planned interaction comparisons assessing whether the difference between Desensitization and Implosion is conditional upon severity, whether the difference between Desensitization and Insight is conditional upon severity, and whether the difference between Implosion and Insight is conditional upon severity. Set the familywise alpha level for this set of contrasts at .05. Indicate your relevant critical values t or F values for each comparison and whether each comparison is statistically significant.

      The question "is the difference between desensitization and implosion therapies conditional upon the severity of the phobia?" can be restated as "does the difference between desensitization and implosion for mild phobics differ from the same difference for severe phobics?" Or, mathematically:

      $$H_0 : \mu_{\text{Implo},\text{Mild}} - \mu_{\text{Desen},\text{Mild}} = \mu_{\text{Implo},\text{Severe}} - \mu_{\text{Desen},\text{Severe}}$$

      $$(1)\mu_{\text{Implo},\text{Mild}} + (-1)\mu_{\text{Desen},\text{Mild}} + (-1)\mu_{\text{Implo},\text{Severe}} + (1)\mu_{\text{Desen},\text{Severe}} = 0$$

      To make these contrasts easier to represent in a table, the combination of factors will be listed as a 1×6 matrix rather than a 2×3:

      Contrast ("... is conditional on severity")   Desen:Mild  Desen:Sev  Implo:Mild  Implo:Sev  Insight:Mild  Insight:Sev
      desensitization v. implosion                   -1           1          1          -1          0             0
      desensitization v. insight                     -1           1          0           0          1            -1
      implosion v. insight                            0           0         -1           1          1            -1

      Each of these comparisons is done using a two-dimensional general form of the unbiased estimate:

      $$\hat{\Psi} = \sum_{jk} c_{jk} \bar{y}_{jk} \qquad \sigma_{\hat{\Psi}}^2 = MS_W \sum_{jk} \frac{c_{jk}^2}{n_{jk}}$$

      The distribution of the contrast remains the same:

      $$\frac{\hat{\Psi}}{\sigma_{\hat{\Psi}}} \sim t_{df_{MS_W}} = t_{N-JK}$$

      For the contrast of "the effectiveness of desensitization v. implosion is conditional on severity of phobia" these calculations are:

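      $$\hat{\Psi} = c_{\text{Desen},\text{Mild}} \bar{y}_{\text{Desen},\text{Mild}} + c_{\text{Desen},\text{Severe}} \bar{y}_{\text{Desen},\text{Severe}} + \cdots + c_{\text{Insight},\text{Severe}} \bar{y}_{\text{Insight},\text{Severe}}$$

      $$\hat{\Psi} = (-1)(13.25) + (1)(11.125) + (1)(15.375) + (-1)(6.5) + (0)(12.875) + (0)(9.25) = 6.75$$

      $$\sigma_{\hat{\Psi}}^2 = MS_W \frac{c_{\text{Desen},\text{Mild}}^2 + c_{\text{Desen},\text{Severe}}^2 + \cdots + c_{\text{Insight},\text{Severe}}^2}{n} = 7.348 \cdot \frac{(-1)^2 + 1^2 + 1^2 + (-1)^2 + 0^2 + 0^2}{8} = 3.674$$

      $$\frac{\hat{\Psi}}{\sigma_{\hat{\Psi}}} = \frac{\hat{\Psi}}{\sqrt{\sigma_{\hat{\Psi}}^2}} = \frac{6.75}{\sqrt{3.674}} \approx 3.522$$

      $$P\left( t_{N-JK} \le \frac{\hat{\Psi}}{\sigma_{\hat{\Psi}}} \right) = P\left( t_{48 - 2 \cdot 3} \le 3.522 \right) \approx 0.999$$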

      Using this process, R produces the following result:

      Contrast ("... is conditional on severity")   P(< t)   P(< F)
      desensitization v. implosion                   0.9995   0.9990
      desensitization v. insight                     0.7809   0.5617
      implosion v. insight                           0.0045   0.9910

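      Recall that $t_{df}^2 = F_{1,df}$. Also note that the t-test is two-tailed whereas the F-test is one-tailed: the t probabilities above are cumulative over a signed statistic, while squaring folds both tails of the t distribution into the upper tail of the F distribution. This is why the implosion v. insight comparison, whose t statistic is large and negative, has an extremely small cumulative t probability and an extremely high cumulative F probability.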
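
      The relationship between the two columns can be illustrated in R for the implosion v. insight contrast. This is a sketch with illustrative names; it uses the rounded MSW of 7.348, so the last digits may differ slightly from the table.

```r
df.within = 42
ms.within = 7.348
n = 8

# Implosion v. insight interaction contrast, from the cell means in the text.
psi = (-1) * 15.375 + 1 * 6.5 + 1 * 12.875 + (-1) * 9.25
se = sqrt(ms.within * 4 / n)   # four nonzero coefficients, each +/- 1
t.stat = psi / se

p.t = pt(t.stat, df.within)        # cumulative probability of the signed t
p.F = pf(t.stat^2, 1, df.within)   # cumulative probability of F = t^2

print(round(c(t = t.stat, p.t = p.t, p.F = p.F), 4))

# The lower-tail area of F equals 1 minus both tails of the t distribution.
print(isTRUE(all.equal(p.F, 1 - 2 * pt(-abs(t.stat), df.within))))
```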

      Because multiple comparisons are being performed it is necessary to correct for multiplicity. Recall that the Bonferroni simply reduces the per-comparison error rate sufficiently to bring the familywise error rate to the desired level. Specifically for c comparisons:

      $$\alpha_{pc} = \frac{\alpha_{fw}}{c} = \frac{.05}{3} \approx 0.01667$$

      This means that a test will be considered significant if the cumulative probability for the F-score is greater than 0.9833. Two of these contrasts are significant: "desensitization v. implosion is conditional on severity" and "implosion v. insight is conditional on severity."
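
      The three interaction contrasts and the Bonferroni decision can be sketched together in R. The coefficient rows follow the contrast table above; all object names are illustrative, and the rounded MSW of 7.348 is assumed.

```r
# Cell means in the order Desen:Mild, Desen:Sev, Implo:Mild, Implo:Sev,
# Insight:Mild, Insight:Sev.
cell.means = c(13.25, 11.125, 15.375, 6.5, 12.875, 9.25)
coefs = rbind("desen v. implo"   = c(-1, 1,  1, -1, 0,  0),
              "desen v. insight" = c(-1, 1,  0,  0, 1, -1),
              "implo v. insight" = c( 0, 0, -1,  1, 1, -1))
ms.within = 7.348
df.within = 42
n = 8

psi = as.vector(coefs %*% cell.means)          # contrast estimates
se = sqrt(ms.within * rowSums(coefs^2) / n)    # standard errors
f.stat = (psi / se)^2
p.F = pf(f.stat, 1, df.within)                 # cumulative probability

# Bonferroni: split the familywise alpha evenly over the comparisons.
alpha.pc = 0.05 / nrow(coefs)
f.crit = qf(1 - alpha.pc, 1, df.within)        # critical F per comparison
significant = p.F > 1 - alpha.pc

print(data.frame(psi, f.stat, p.F, significant))
print(f.crit)
```

      Comparing `f.stat` with `f.crit` gives the same decisions as comparing the cumulative probability with 0.9833.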

      Given the tests performed so far, what specific experimental beliefs are supported?

      • Main Effects:
        • There is no support for the belief that, in treating ophidiophobia, a specific therapy is more effective for a randomly chosen person.
        • When treating ophidiophobia, subjects with an initially mild phobia will finish the therapy with a lower degree of phobia than those that began with a severe phobia. (This seems relatively intuitive.)
      • Interaction Effect:
        • When treating ophidiophobia, the severity of the subject's phobia matters in determining which therapy is the most effective.
      • Tukey Pairwise Contrasts on Type of Therapy:
        • There is no support for the belief that a particular type of therapy is more effective than any other when treating ophidiophobia in general. (This is simply an affirmation of the main effect for therapy.)
      • Interaction Contrasts:
        • When comparing desensitization therapy to implosion therapy for treating ophidiophobia, the difference between the two methods depends on severity: implosion scores 2.125 points higher for mild phobics, while desensitization scores 4.625 points higher for severe phobics.
        • When comparing desensitization therapy to insight therapy for treating ophidiophobia, there is no support for the idea that the severity of the subject's phobia is meaningful in choosing the most likely effective treatment.
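        • When comparing implosion therapy to insight therapy for treating ophidiophobia, the difference between the two methods likewise depends on severity: implosion scores 2.5 points higher for mild phobics, while insight scores 2.75 points higher for severe phobics.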

      Consider the contrast "the effectiveness of desensitization v. implosion is conditional on severity of phobia" while examining the cell means:

                                  Type of Therapy
                          Desensitization  Implosion  Insight   Marginal
      Degree of   Mild    13.250           15.375     12.875    13.833
      Phobia      Severe  11.125            6.500      9.250     8.958
      Marginal            12.187           10.937     11.062    11.396

      All that the contrast says is that the degree of phobia affects the difference between these two therapies. For these particular numbers we can look at Desensitization:Severe (11.125) and Implosion:Severe (6.500) and say that there is likely a significant difference there. Perhaps there is one for mild phobics as well, or perhaps it is simply sampling error. Specifically, the contrast only really tells the experimenter that a difference exists dependent on severity. It does not say whether a particular therapy is more effective.

    5. Conduct two simple-effects analyses testing for overall (i.e., omnibus) differences among the three treatment groups at each of the two levels of severity. Indicate your relevant critical values and whether each simple-effect analysis is statistically significant.

      Simple effects analysis is an alternative method for exploring interaction effects. It takes each of the two strengths of phobia in turn and tests whether the three therapies differ significantly at that level.

      Because there are two degrees of freedom and the difference based on strength of phobia was supported by the ANOVA, these comparisons can be done with α = 0.05 by the reasoning of the Fisher LSD.

      The sum of squares for a simple effect is:

      $$SS_{\text{simp}(k)} = n \sum_{j} \left( \bar{y}_{jk} - \bar{y}_{\cdot k} \right)^2$$

      For mild phobics, this is:

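      $$SS_{\text{simp}(\text{Mild})} = n \left[ \left( \bar{y}_{\text{Desen},\text{Mild}} - \bar{y}_{\text{Mild}} \right)^2 + \left( \bar{y}_{\text{Implo},\text{Mild}} - \bar{y}_{\text{Mild}} \right)^2 + \left( \bar{y}_{\text{Insight},\text{Mild}} - \bar{y}_{\text{Mild}} \right)^2 \right]$$

      $$SS_{\text{simp}(\text{Mild})} = 8 \left[ (13.250 - 13.833)^2 + (15.375 - 13.833)^2 + (12.875 - 13.833)^2 \right] \approx 29.083$$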

      This is essentially an ANOVA done on these numbers. However, to get a better estimate of the population variance, the mean square within from the entire design (all six cells) is used rather than one from only the cells being considered. Given this, the following distribution makes sense:

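      $$MS_{\text{simp}(k)} = \frac{SS_{\text{simp}(k)}}{df_{\text{simp}(k)}} = \frac{SS_{\text{simp}(k)}}{J-1} \qquad \frac{MS_{\text{simp}(k)}}{MS_W} \sim F_{df_{\text{simp}(k)},\ df_{MS_W}} = F_{J-1,\ N-JK}$$

      Here J = 3 is the number of therapies being compared at each level of severity, so each simple effect has two degrees of freedom.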

      For phobics, the two simple effects are:

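      $$\frac{SS_{\text{simp}(\text{Mild})} / (J-1)}{MS_W} = \frac{29.083 / 2}{7.348} \approx 1.979, \qquad P\left( F_{2,42} \le 1.979 \right) \approx 0.849$$

      $$\frac{SS_{\text{simp}(\text{Severe})} / (J-1)}{MS_W} = \frac{86.582 / 2}{7.348} \approx 5.892, \qquad P\left( F_{2,42} \le 5.892 \right) \approx 0.994$$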
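
      A sketch of both simple-effects tests in R, computed from the cell means and the pooled MSW of 7.348 (rounded, as in the ANOVA table). The object names are illustrative. The critical value comes from `qf` with 2 and 42 degrees of freedom.

```r
# Cell means: rows are severity, columns are therapy.
cell.means = matrix(c(13.25, 15.375, 12.875,
                      11.125, 6.5,    9.25),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("Mild", "Severe"),
                                    c("Desensitization", "Implosion", "Insight")))
n = 8
ms.within = 7.348
df.within = 42
df.simple = ncol(cell.means) - 1   # three therapies, so 2 df

# Sum of squares among the three therapy means within each severity level.
# Subtracting rowMeans() recycles down the columns, so each row is centered
# on its own mean.
ss.simple = n * rowSums((cell.means - rowMeans(cell.means))^2)
f.simple = (ss.simple / df.simple) / ms.within

f.crit = qf(0.95, df.simple, df.within)
p.lower = pf(f.simple, df.simple, df.within)   # cumulative probability

print(round(rbind(ss.simple, f.simple, p.lower), 4))
print(f.simple > f.crit)   # significant at alpha = .05?
```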

      So, this decomposition breaks down a general belief that the strength of phobia affects the effectiveness of treatment into a more specific one: for severe phobics there is a difference in effectiveness among the types of therapy, whereas for mild phobics there is no evidence that this is the case.