Statistical Inference — Practice Exam #3

Suppose you are performing a 2-sample, independent sample t-test, and you use the 2-sided rejection points with α = 0.05. For which of the following situations would the actual α be highest when $σ_{1}^{2} = 10$ , $σ_{2}^{2} = 20$ ?
1. $N_{1} = 20, N_{2} = 10$ ✓
2. $N_{1} = 20, N_{2} = 20$
3. $N_{1} = 30, N_{2} = 30$
4. $N_{1} = 10, N_{2} = 20$
The standard error of a linear combination of means is:
$\begin{matrix} E_{\sum c_{i} {\overline{x}}_{i}} & = & \sqrt{\sum \frac{c_{i}^{2} σ_{i}^{2}}{n_{i}}} \end{matrix}$
Error decreases with an increase in N, so the only real options are 1 and 4 (labeled A and D below), and for this example:
$\begin{matrix} \sqrt{\frac{N_{A 1}}{σ_{A 1}^{2}} + \frac{N_{A 2}}{σ_{A 2}^{2}}} & = & \sqrt{\frac{20}{10} + \frac{10}{20}} & > & \sqrt{\frac{10}{10} + \frac{20}{20}} & = & \sqrt{\frac{N_{D 1}}{σ_{D 1}^{2}} + \frac{N_{D 2}}{σ_{D 2}^{2}}} \end{matrix}$
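A rough numerical check of this answer, offered as a sketch rather than the exam's own method: it compares the true variance of the mean difference with the value the pooled-variance t-test expects, then converts the ratio into an approximate realized α using the normal distribution instead of the t distribution. The function names and the large-sample approximation are assumptions introduced here.

```python
from statistics import NormalDist


def variance_ratio(n1, n2, var1, var2):
    """True variance of the mean difference divided by the pooled-test expectation."""
    true_var = var1 / n1 + var2 / n2
    # Expected value of the pooled variance estimate under unequal variances.
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    assumed_var = pooled * (1 / n1 + 1 / n2)
    return true_var / assumed_var


def approx_actual_alpha(n1, n2, var1, var2, alpha=0.05):
    """Normal-theory approximation (an assumption) to the realized two-sided alpha."""
    z = NormalDist()
    nominal_cut = z.inv_cdf(1 - alpha / 2)
    ratio = variance_ratio(n1, n2, var1, var2)
    return 2 * (1 - z.cdf(nominal_cut / ratio ** 0.5))


# Answer options: option number -> (N1, N2), with variances 10 and 20.
options = {1: (20, 10), 2: (20, 20), 3: (30, 30), 4: (10, 20)}
approx_alphas = {
    k: approx_actual_alpha(n1, n2, 10, 20) for k, (n1, n2) in options.items()
}
highest = max(approx_alphas, key=approx_alphas.get)
print(highest, approx_alphas)
```

Under this approximation the realized α is inflated when the smaller sample has the larger variance (option 1), deflated in the reverse case (option 4), and close to nominal when the sample sizes are equal.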
The 2-sample independent sample t-test makes several statistical assumptions. Which of the following assumptions is most likely to cause severe problems if it is substantially violated?
1. Independence of observations within group ✓
2. Normally distributed populations
3. Homogeneity of variances when sample sizes are equal
Suppose you somehow knew that the population standard deviations are 10 in two populations, and that the population distributions are normal. You wish to test the null hypothesis that $μ_{1} - μ_{2} = 0$ , using the 2-Sample Z-Test of the form:
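$\begin{matrix} Z & = & \frac{{\overline{X}}_{• 1} - {\overline{X}}_{• 2}}{\sqrt{(\frac{1}{N} + \frac{1}{N}) σ^{2}}} & = & \sqrt{\frac{N}{2}} \frac{{\overline{X}}_{• 1} - {\overline{X}}_{• 2}}{σ} \end{matrix}$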
What minimum sample size N do you need for each group to ensure that power is at least 0.95 if α = 0.05 and $μ_{1} - μ_{2} = 5.0$ ?
1. 104 ✓
2. 121
3. 115
4. 101
5. 116
6. 106
7. 112
8. 100
Power, in general, is a function of several quantities:
- Φ(x): the standard normal cumulative distribution function, which gives the proportion of a normal curve lying below the point x standard deviations above μ.
  - Φ(1.645) = 0.95
  - Φ(1.96) = 0.975
  - Φ(2.33) = 0.99
  - Φ(2.58) = 0.995
  - Φ(z) = 1 - Φ(-z)
  - Φ^-1(α) = -Φ^-1(1 - α)
- E_s: the standardized effect, i.e. the mean difference in standard deviation units: $\begin{matrix} E_{s} & = & \frac{Δ μ}{σ} \end{matrix}$
- R: the rejection point, which for a two-sided test with α = 0.05 is Φ^-1(97.5%) ≈ 1.96
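The two-sample z-test is of the form:
$\begin{matrix} Z & = & \frac{{\overline{X}}_{• 1} - {\overline{X}}_{• 2}}{\sqrt{(\frac{1}{N} + \frac{1}{N}) σ^{2}}} & = & \sqrt{\frac{N}{2}} \frac{{\overline{X}}_{• 1} - {\overline{X}}_{• 2}}{σ} \end{matrix}$
When the true mean difference is Δμ, Z is normally distributed with standard deviation 1 and mean $\sqrt{\frac{N}{2}} E_{s}$ .
The null hypothesis is not rejected when:
$\begin{matrix} |Z| & \leq & R & = & Φ^{-1} (1 - \frac{α}{2}) \end{matrix}$
Power then is (ignoring the negligible chance of rejecting in the opposite tail):
$\begin{matrix} 1 - β & = & Φ (\sqrt{\frac{N}{2}} E_{s} - R) \end{matrix}$
The minimum sample size then is:
$\begin{matrix} N & = & 2 (\frac{Φ^{-1} (1 - β) + R}{E_{s}})^{2} \\  & = & 2 (\frac{Φ^{-1} (1 - β) + Φ^{-1} (1 - \frac{α}{2})}{\frac{Δ μ}{σ}})^{2} \\  & = & 2 (\frac{Φ^{-1} (0.95) + Φ^{-1} (0.975)}{\frac{5.0}{10}})^{2} \\  & = & 2 (\frac{1.645 + 1.96}{0.5})^{2} \\  & ≊ & 103.968 \end{matrix}$
Rounding up to the next whole observation gives N = 104.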
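
The sample-size calculation above can be sketched in a few lines of Python. This is an illustration under the same simplification as the derivation (the rejection probability in the opposite tail is ignored); the function name `min_n_per_group` is introduced here and is not part of the exam.

```python
import math
from statistics import NormalDist


def min_n_per_group(delta_mu, sigma, alpha=0.05, power=0.95):
    """Smallest equal group size for a two-sided two-sample Z-test."""
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    rejection_point = std_normal.inv_cdf(1 - alpha / 2)  # R
    z_power = std_normal.inv_cdf(power)  # inverse normal CDF at 1 - beta
    effect = delta_mu / sigma  # standardized effect E_s
    n_exact = 2 * ((z_power + rejection_point) / effect) ** 2
    return math.ceil(n_exact)  # round up to a whole observation


n_needed = min_n_per_group(delta_mu=5.0, sigma=10.0)
print(n_needed)
```

Because the quantiles are computed to full precision rather than read from a table, the unrounded value differs slightly from 103.968, but it still rounds up to 104.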

You wish to test the null hypothesis that the population proportion p is less than or equal to .50, using the version of the Z-statistic that incorporates the null hypothesis in the denominator. You obtain a sample proportion of $\hat{p} = 0.65$ , based on a sample size of 100. In this case:

	Obtained Value of the Z-statistic	the null hypothesis is rejected
	2.7	true
	2.4	true
	3.14485	false
	3.3	true
✓	3.0	true
	2.4	false
	3.0	false
	3.14485	true

\begin{matrix} Z & = & \frac{\hat{p} - a}{\sqrt{\frac{a (1 - a)}{n}}} \\  & = & \frac{0.65 - 0.5}{\sqrt{\frac{0.5 (1 - 0.5)}{100}}} \\  & = & \frac{0.15}{\frac{\sqrt{0.25}}{10}} \\  & = & 3 \end{matrix}

So, unless the rejection point was greater than 3 (Φ(3) ≊ 99.87%), the hypothesis would be rejected. Another verification of this is to construct the 95% confidence interval:

Assuming the sampling distribution of $\hat{p}$ is reasonably normal, i.e.:

np > 5
n(1 - p) > 5

Then the confidence interval is:

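\begin{matrix} \hat{p} & \pm & Φ^{-1} (1 - \frac{α}{2}) \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}} \\ 0.65 & \pm & Φ^{-1} (0.975) \sqrt{\frac{0.65 (1 - 0.65)}{100}} \\ 0.65 & \pm & 1.96 \frac{\sqrt{0.2275}}{10} \\ 0.65 & \pm & 0.09349 \end{matrix}

The interval runs from about 0.5565 to 0.7435 and excludes 0.5, which agrees with rejecting the null hypothesis.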
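
The Z computation for this proportion test can be sketched as follows. The question does not state α, so the one-sided α = 0.05 used for the critical value is an assumption, as is the function name.

```python
import math
from statistics import NormalDist


def one_proportion_z(p_hat, p_null, n):
    """Z-statistic with the null proportion in the standard error."""
    standard_error = math.sqrt(p_null * (1 - p_null) / n)
    return (p_hat - p_null) / standard_error


z_value = one_proportion_z(p_hat=0.65, p_null=0.50, n=100)
# One-sided critical value, assuming alpha = 0.05 (not stated in the question).
critical = NormalDist().inv_cdf(0.95)
reject_null = z_value > critical
print(round(z_value, 4), reject_null)
```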

You wish to perform a 2-sample matched sample t-test of equality of means on a group of N = 91 people who were all measured on 2 occasions. Unfortunately, you do not have the raw data, i.e., two columns of scores representing the repeated measurements. However, you do have the mean difference, ${\overline{X}}_{• 1} - {\overline{X}}_{• 2} = 10.0$ . You also have the variances at time 1 and time 2, and the covariance between the two columns of numbers. These are $s_{1}^{2} = 100$ , $s_{2}^{2} = 144$ and $s_{1, 2} = 29.0$ . Using your knowledge of linear combinations, use this information to compute the variance of the difference scores, $s_{D}^{2}$ , and then compute the matched sample t statistic. In this case, you obtain:

	T-Statistic Value	Degrees of Freedom	Critical Value of the t Distribution with α = 0.05
	7.69408	90	1.98667
	6.99462	90	1.66196
	0.512871	90	1.98667
✓	6.99462	90	1.98667
	6.50581	90	1.98667
	5.4893	90	1.98667
	6.99462	91	1.98638
	6.29516	90	1.98667

In general:

\begin{matrix} σ_{a x + b y}^{2} & = & a^{2} σ_{x}^{2} + b^{2} σ_{y}^{2} + 2 a b σ_{x, y} \end{matrix}

So:

\begin{matrix} S_{D}^{2} & = & S_{1}^{2} + S_{2}^{2} - 2 S_{1, 2} & = & 100 + 144 - 2 (29) & = & 186 \end{matrix}

The appropriate t-statistic then is:

\begin{matrix} t_{n - 1} & = & \frac{\overline{x_{1}} - \overline{x_{2}}}{\sqrt{\frac{S_{D}^{2}}{n}}} & = & \frac{Δ \overline{x}}{\sqrt{\frac{S_{D}^{2}}{n}}} & = & \frac{10}{\sqrt{\frac{186}{91}}} & ≊ & 6.99462 \end{matrix}

$t_{n - 1} = t_{90}$ has 90 degrees of freedom, and its critical value is looked up in a t table (1.98667 for a two-sided test with α = 0.05).
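
As a check on the arithmetic, the matched-sample t can be computed from the summary statistics alone. This is a minimal sketch; the function name is introduced here for illustration.

```python
import math


def matched_t_from_summary(mean_diff, var_1, var_2, cov_12, n):
    """Matched-sample t from summary statistics; returns (t, df)."""
    # Variance of D = X1 - X2 via the linear-combination rule (a = 1, b = -1).
    var_diff = var_1 + var_2 - 2 * cov_12
    t_stat = mean_diff / math.sqrt(var_diff / n)
    return t_stat, n - 1


t_stat, df = matched_t_from_summary(10.0, 100.0, 144.0, 29.0, 91)
print(round(t_stat, 5), df)
```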

You perform an experiment designed to test whether a persuasive message can effectively change opinion on a political issue. You measure a group of people on two occasions, and use McNemar’s Z-test to assess the null hypothesis that p₁ = p₂, where p₁ and p₂ are the proportions of people who said "yes" to the pollster on the two occasions. The data are summarized in a 2 × 2 table, where n₁₀ is the number of people who said "yes" at time 1 but "no" at time 2, and n₀₁ is the number of people who said "no" at time 1 and "yes" at time 2. Suppose we have n₀₁ = 70 and n₁₀ = 44. The absolute value of the Z-statistic is:
1. 2.11271
2. 22.3572
3. 44.7145
4. 2.19161
5. 2.73002
6. 2.67864
7. 0.22807
8. 2.43512 ✓
$\begin{matrix} Z & = & \frac{n_{10} - n_{01}}{\sqrt{n_{10} + n_{01}}} & = & \frac{44 - 70}{\sqrt{44 + 70}} & ≊ & -2.4351 \end{matrix}$ $\begin{matrix} |Z| & ≊ & 2.4351 \end{matrix}$
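
The same McNemar calculation as a short sketch; only the two discordant counts enter the statistic. The function name is an assumption made for this example.

```python
import math


def mcnemar_z(n_10, n_01):
    """McNemar Z from the two discordant cell counts."""
    return (n_10 - n_01) / math.sqrt(n_10 + n_01)


z_mcnemar = mcnemar_z(n_10=44, n_01=70)
print(round(abs(z_mcnemar), 5))
```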

You observe a sample correlation of 0.46 based on a sample of N = 60.0 independent observations from a bivariate normal distribution. You test the hypothesis that ρ = 0 using the t-statistic. You calculate:

	Value of t	Degrees of Freedom
	3.94547	59
	4.34001	59
	3.55092	58
	4.67932	58
	4.34001	58
✓	3.94547	58
	3.59432	58
	3.94547	60

For H₀: ρ = 0 there is a special form of the test:

\begin{matrix} t_{n - 2} & = & t_{58} & = & \frac{r}{\sqrt{\frac{1 - r^{2}}{n - 2}}} & = & \frac{0.46}{\sqrt{\frac{1 - {.46}^{2}}{60 - 2}}} & ≊ & 3.94547 \end{matrix}
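
A minimal sketch of this t-statistic for H₀: ρ = 0; the function name is introduced here for illustration.

```python
import math


def correlation_t(r, n):
    """t-statistic and degrees of freedom for testing rho = 0."""
    t_stat = r / math.sqrt((1 - r ** 2) / (n - 2))
    return t_stat, n - 2


t_corr, df_corr = correlation_t(r=0.46, n=60)
print(round(t_corr, 5), df_corr)
```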

You test the hypothesis that ρ = 0.5 using the Fisher transform. The sample correlation you observe is r = 0.53. The sample size is N = 56. The Z-statistic value is:
1. 0.31417
2. 0.228247
3. 0.561921
4. 0.165573
5. 0.297313 ✓
6. 0.281050
7. 0.442996
8. 0.359748
The Fisher transform is simply:
$\begin{matrix} φ (x) & = & {tanh}^{-1} (x) & = & \frac{1}{2} ln (\frac{1 + x}{1 - x}) \end{matrix}$
The z-statistic for H₀: ρ = a can be computed using:
$\begin{matrix} Z & = & \frac{φ (r) - φ (a)}{\sqrt{\frac{1}{n - 3}}} & = & \frac{φ (0.53) - φ (0.5)}{\sqrt{\frac{1}{56 - 3}}} & ≊ & 0.29731 \end{matrix}$
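The Fisher-transform test can be sketched with `math.atanh`, which is the transform φ given above. The function name `fisher_z_test` is an assumption made for this example.

```python
import math


def fisher_z_test(r, rho_null, n):
    """Z-statistic for H0: rho = rho_null using the Fisher transform."""
    standard_error = math.sqrt(1 / (n - 3))
    return (math.atanh(r) - math.atanh(rho_null)) / standard_error


z_fisher = fisher_z_test(r=0.53, rho_null=0.5, n=56)
print(round(z_fisher, 5))
```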
Suppose you have two independent groups of size N = 200. These groups represent random samples from two populations. If 62 people in group 1 and 149 people in group 2 can perform a behavior, construct a 95% confidence interval on p₂ - p₁, the population difference in the proportions of people who can perform the behavior. The endpoints of the interval are:
1. 0.327913 ; 0.546562
2. 0.279343 ; 0.593168
3. 0.314036 ; 0.5667
4. 0.324614 ; 0.545386
5. 0.338117 ; 0.531883
6. 0.334851 ; 0.551008
7. 0.346924 ; 0.523076 ✓
8. 0.346924 ; 0.537076
The relevant proportions are:
$\begin{matrix} p_{1} & = & \frac{62}{200} & = & 0.31 \end{matrix}$ $\begin{matrix} p_{2} & = & \frac{149}{200} & = & 0.745 \end{matrix}$
The confidence interval then is:
$\begin{matrix} p_{2} - p_{1} & \pm & Φ^{-1} (1 - \frac{α}{2}) \sqrt{\frac{p_{1} (1 - p_{1})}{n} + \frac{p_{2} (1 - p_{2})}{n}} \end{matrix}$ $\begin{matrix} .745 - 0.31 & \pm & Φ^{-1} (1 - \frac{0.05}{2}) \sqrt{\frac{0.31 (1 - 0.31)}{200} + \frac{0.745 (1 - 0.745)}{200}} \end{matrix}$ $\begin{matrix} .435 & \pm & 1.96 \sqrt{\frac{0.2139}{200} + \frac{0.189975}{200}} & ≊ & .435 \pm 0.088077 \end{matrix}$
The interval therefore runs from approximately 0.3469 to 0.5231.
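A short sketch of this interval for two equal-sized groups; the function name and the `confidence` parameter are introduced here for illustration.

```python
import math
from statistics import NormalDist


def two_proportion_ci(x1, x2, n, confidence=0.95):
    """Interval on p2 - p1 for two independent groups of equal size n."""
    p1, p2 = x1 / n, x2 / n
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit * math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
    diff = p2 - p1
    return diff - half_width, diff + half_width


lower_p, upper_p = two_proportion_ci(x1=62, x2=149, n=200)
print(round(lower_p, 6), round(upper_p, 6))
```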
Suppose you obtain random samples of N₁ = 70 males and N₂ = 51 females, and obtain a correlation r₁ = 0.54 between two variables for the male participants, and r₂ = .40 for the female participants. If you test the null hypothesis that the two population correlations are equal with the standard Z-statistic, what value should you obtain?
1. 1.80411
2. 0.902344
3. 1.15502
4. 1.42229
5. 0.732814
6. 1.00868
7. 0.531593
8. 0.954558 ✓
$\begin{matrix} Z & = & \frac{φ (r_{1}) - φ (r_{2})}{\sqrt{\frac{1}{n_{1} - 3} + \frac{1}{n_{2} - 3}}} & = & \frac{φ (0.54) - φ (0.4)}{\sqrt{\frac{1}{70 - 3} + \frac{1}{51 - 3}}} & ≊ & 0.954558 \end{matrix}$
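The comparison of two independent correlations can be sketched the same way; the function name is an assumption made for this example.

```python
import math


def compare_correlations_z(r1, n1, r2, n2):
    """Z-statistic for H0: rho1 = rho2 from two independent samples."""
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (math.atanh(r1) - math.atanh(r2)) / standard_error


z_two_r = compare_correlations_z(r1=0.54, n1=70, r2=0.40, n2=51)
print(round(z_two_r, 6))
```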
Suppose you obtain a random sample of N = 74 individuals, and obtain a correlation r = 0.53 between two variables. Suppose you construct a 90% confidence interval for the population correlation. What are the endpoints of the confidence interval?
1. 0.340001 ; 0.710460
2. 0.362537 ; 0.690787
3. 0.375609 ; 0.655769 ✓
4. 0.300487 ; 0.918076
5. 0.195316 ; 0.685213
6. 0.357579 ; 0.685213
7. 0.302440 ; 0.743642
The confidence interval on φ(r) is:
$\begin{matrix} φ (r) & \pm & Φ^{-1} (1 - \frac{α}{2}) \sqrt{\frac{1}{n - 3}} \end{matrix}$
The confidence interval on ρ then is:
$\begin{matrix} φ^{-1} (φ (r) - Φ^{-1} (1 - \frac{α}{2}) \sqrt{\frac{1}{n - 3}}) & \leq & ρ & \leq & φ^{-1} (φ (r) + Φ^{-1} (1 - \frac{α}{2}) \sqrt{\frac{1}{n - 3}}) \end{matrix}$ $\begin{matrix} 0.375594 & = & φ^{-1} (φ (0.53) - Φ^{-1} (1 - \frac{0.1}{2}) \sqrt{\frac{1}{74 - 3}}) & \leq & ρ & \leq & φ^{-1} (φ (0.53) + Φ^{-1} (1 - \frac{0.1}{2}) \sqrt{\frac{1}{74 - 3}}) & = & 0.655779 \end{matrix}$
These values use Φ^-1(0.95) ≈ 1.645; carrying more digits (1.64485) reproduces the marked endpoints 0.375609 and 0.655769.
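A minimal sketch of this interval, using `math.atanh` and `math.tanh` for the Fisher transform and its inverse. The function name and the `confidence` parameter are introduced here for illustration.

```python
import math
from statistics import NormalDist


def correlation_ci(r, n, confidence=0.90):
    """Confidence interval for a population correlation via the Fisher transform."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit / math.sqrt(n - 3)
    center = math.atanh(r)  # Fisher transform of the sample correlation
    # Transform the endpoints back to the correlation scale.
    return math.tanh(center - half_width), math.tanh(center + half_width)


lower_r, upper_r = correlation_ci(r=0.53, n=74)
print(round(lower_r, 6), round(upper_r, 6))
```

The interval is symmetric on the transformed scale but not around r itself, which is why the endpoints are not equidistant from 0.53.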
The standard F-test for comparing two variances for equality:
1. is robust to non-normality
2. is a one-tailed test
3. has high power
4. None of the above is correct ✓
Suppose X has a $χ_{8}^{2}$ distribution. What is the variance of X?
1. 4
2. 16 ✓
3. 13
4. 9
For a chi-square distribution, $χ_{ν}^{2}$ , with ν degrees of freedom:
$\begin{matrix} E (χ_{ν}^{2}) & = & ν \end{matrix}$ $\begin{matrix} Var (χ_{ν}^{2}) & = & 2 ν \end{matrix}$
Suppose you take a sample of size 48, and observe a sample variance of 60. The endpoints for the 95% confidence interval on σ² are:
1. 41.5803 ; 94.1375 ✓
2. 50.0 ; 70.0
3. 42.4649 ; 96.1404
4. 44.1916 ; 87.0141
$\begin{matrix} \frac{(n - 1) S^{2}}{{}_{1 - \frac{α}{2}}χ_{n - 1}^{2}} & \leq & σ^{2} & \leq & \frac{(n - 1) S^{2}}{{}_{\frac{α}{2}}χ_{n - 1}^{2}} \end{matrix}$
The values for χ² are looked up in a table.

You obtain samples of size N₁ = 30 and N₂ = 40, and observe sample variances of 127 and 29. Test the null hypothesis H₀: $σ_{1}^{2} = σ_{2}^{2}$ using the standard F statistic. The results are:

	Test Statistic Value	Degrees of Freedom - 1	Degrees of Freedom - 2	Null Hypothesis
	4.37931	29	39	not rejected
	4.37931	30	40	not rejected
	5.37931	30	40	not rejected
✓	4.37931	29	39	rejected
	4.37931	30	40	rejected
	3.37931	29	39	rejected

When N = 12, and the sample is independent and random from a normal distribution with standard deviation 41.0, the sample standard deviation S has an expected value of 40.0889. We know that the expected value of S² is σ² , i.e., 1681.0. Using this information, and a well known formula for the variance of a random variable, the sampling variance of S is:
1. 82.9683
2. 59.1048
3. 56.8884
4. 73.8810 ✓
5. 104.320
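The "well known formula" here is Var(S) = E(S²) - [E(S)]². A quick sketch of the arithmetic, using only the values given in the question (the function name is an assumption):

```python
def sampling_variance_of_s(expected_s_squared, expected_s):
    """Var(S) = E(S^2) - [E(S)]^2."""
    return expected_s_squared - expected_s ** 2


var_of_s = sampling_variance_of_s(expected_s_squared=1681.0, expected_s=40.0889)
print(round(var_of_s, 2))
```

With the rounded E(S) = 40.0889 supplied in the question, this gives roughly 73.88, which agrees with the marked answer to two decimal places.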
Suppose you know the population variance, and it is σ² = 182. You take 3 samples from the (normally distributed) populations, and all of them are of size N = 11. The sample means are 18.63, 21.82, and 33.27. Compute a chi-square statistic for testing the null hypothesis that all 3 populations have the same mean. The value of the statistic is:

	χ²	Degrees of Freedom
✓	7.16427	2
	35.8213	3
	35.8213	2
	7.16427	3
	7.16427	30
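
With σ² known, the statistic is the sum of squared deviations of the group means from their grand mean, scaled by N/σ², with one fewer degree of freedom than the number of groups. A minimal sketch, assuming equal group sizes (the function name is introduced here):

```python
def chi_square_for_means(means, n_per_group, sigma_sq):
    """Chi-square statistic for equal means when sigma^2 is known (equal n)."""
    grand_mean = sum(means) / len(means)
    ss_means = sum((m - grand_mean) ** 2 for m in means)
    chi_sq = n_per_group * ss_means / sigma_sq
    return chi_sq, len(means) - 1


chi_sq_means, df_means = chi_square_for_means([18.63, 21.82, 33.27], 11, 182.0)
print(round(chi_sq_means, 5), df_means)
```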

Suppose you sample N = 43 independent observations from a normal distribution, and observe a sample variance of S² = 220.2. You test the null hypothesis that σ² = 100 with α = .05.

	Observed Value of the χ² Statistic	Degrees of Freedom	Critical Value of the Test Statistic
	94.686	42.0	58.124
✓	92.484	42.0	61.7768
	101.732	42.0	61.7768
	92.484	42.0	59.3035
	92.484	43.0	61.7768
	184.968	42.0	61.7768
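
The observed statistic is (N - 1)S²/σ₀², with N - 1 degrees of freedom. A short sketch of that arithmetic (the function name is an assumption; the critical value still comes from a χ² table):

```python
def variance_chi_square(sample_var, n, null_var):
    """Chi-square statistic and df for testing sigma^2 = null_var."""
    return (n - 1) * sample_var / null_var, n - 1


chi_sq_var, df_var = variance_chi_square(sample_var=220.2, n=43, null_var=100.0)
print(round(chi_sq_var, 3), df_var)
```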

You have the following data from 3 groups:

	Group 1	Group 2	Group 3
Mean	11	37	48
Variance	257	287	299
Sample Size (N)	18	18	18

You perform a 1-Way Analysis of variance. You obtain the following results:

	F-statistic	Degrees of Freedom (numerator)	Degrees of Freedom (denominator)	Critical Value from the F distribution with α = .05
✓	23.1246	2	51	3.17880
	23.1246	2	51	2.78623
	23.1246	3	51	3.17880
	27.9807	2	51	3.17880
	34.6868	2	51	3.17880
	20.8121	3	51	3.17880
	34.6868	3	51	3.17880
	25.437	2	51	3.17880
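
With equal group sizes, the F-statistic can be recovered from the summary table alone: the mean square between is N times the sample variance of the group means, and the mean square within is the average of the group variances. A minimal sketch under that equal-N assumption (the function name is introduced here):

```python
def anova_from_summary(means, variances, n_per_group):
    """One-way ANOVA F from group means and variances (equal n only)."""
    k = len(means)
    grand_mean = sum(means) / k
    # Sample variance of the group means, scaled by n.
    ms_between = n_per_group * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
    # Equal n, so a plain average pools the within-group variances.
    ms_within = sum(variances) / k
    return ms_between / ms_within, k - 1, k * (n_per_group - 1)


f_summary, df_num, df_den = anova_from_summary([11, 37, 48], [257, 287, 299], 18)
print(round(f_summary, 4), df_num, df_den)
```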

You observe the following results for 3 independent groups. You compute the 1-Way ANOVA for unequal N. The test is performed with α = .05.

Group 1	Group 2	Group 3
4	1	5
11	7	12
7	3	15
	3	9
		16

	SS_between	SS_within	df_between	df_within	MS_between	MS_within	F_observed	F_critical
✓	139.383	124.867	2	9	69.6917	13.8741	5.02316	4.25649
	139.383	129.867	2	9	69.6917	13.8741	5.52547	4.25649
	135.383	124.867	2	9	69.6917	13.8741	5.02316	4.25649
	139.383	124.867	2	9	69.6917	13.8741	5.02316	5.71471
	139.383	124.867	2	9	69.6917	8.91905	5.02316	4.25649
	139.383	124.867	2	9	69.6917	13.8741	5.02316	8.02152
	139.383	124.867	2	9	46.4611	13.8741	5.02316	4.25649
	139.383	124.867	2	9	69.6917	13.8741	1.11626	4.25649
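
For unequal group sizes the sums of squares are computed from the raw scores. The sketch below reads the data table column by column (Group 1 has 3 scores, Group 2 has 4, Group 3 has 5); the function name is an assumption, and the critical value still comes from an F table.

```python
def one_way_anova(groups):
    """Return (SS_between, SS_within, df_between, df_within, F) for raw data."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Each group's squared deviation from the grand mean is weighted by its size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_observed = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_observed


data = [[4, 11, 7], [1, 7, 3, 3], [5, 12, 15, 9, 16]]
ss_b, ss_w, df_b, df_w, f_obs = one_way_anova(data)
print(round(ss_b, 3), round(ss_w, 3), df_b, df_w, round(f_obs, 5))
```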

Suppose statistic A has a χ² distribution with 90 degrees of freedom, and statistic B has a χ² distribution with 93 degrees of freedom. If A and B are independent, then what is the distribution of A/B?
1. A χ² with 183 degrees of freedom
2. 8370 times an F_90,93 distribution
3. $\frac{30}{31}$ times an F_93,90 distribution
4. $\frac{30}{31}$ times an F_90,93 distribution ✓
5. $\frac{31}{30}$ times an F_90,93 distribution
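
The reasoning behind this answer: an F statistic is a ratio of independent χ² variables each divided by its degrees of freedom, so (A/90)/(B/93) has an F_90,93 distribution and A/B equals 90/93 times that ratio. A small sketch of the scaling constant (the function name is an assumption):

```python
from fractions import Fraction


def chi_square_ratio_scale(df_num, df_den):
    """Constant c such that A/B = c * F(df_num, df_den) for independent chi-squares."""
    return Fraction(df_num, df_den)


scale = chi_square_ratio_scale(90, 93)
print(scale)  # 30/31
```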

PSY 310: Statistical Inference

Will Holcomb

Practice Exam #3

Due: Wed., 10 December 2007