Incorrectly accepting a false statistical null hypothesis is to commit:
A "null hypothesis," represented by H0, is the hypothesis presented by the researcher for examination. Together with the "alternative hypothesis," represented by H1, it partitions the parameter space (and the sample space which is a subset of the parameter space). It is assumed that a given hypothesis, H0, is either true or false in reality. The researcher then finds the hypothesis either true or false. The possible combinations are:
Experimental H0 | Actual H0 True | Actual H0 False |
---|---|---|
True | Correct Acceptance 1 - α | Type II Error β |
False | Type I Error α (Significance) | Correct Rejection 1 - β (Power) |
Suppose you have a binomial process based on 9 trials, with probability of success equal to What is the probability of obtaining exactly 4 successes in this situation?
In a Bernoulli process with probability of success p and number of trials n, the probability of exactly r successes is:

$$P(r) = \binom{n}{r} p^r (1 - p)^{n - r}$$
In Reject-Support testing, a Type II Error represents
See the question on error types for the pertinent table. In Reject-Support testing the researcher structures their experiment such that the null hypothesis, H0, is contrary to their belief. To find experimental evidence for their belief, then, they must disprove H0. Since a Type II error is an incorrect acceptance of H0, the experimenter's theory is incorrectly rejected.
Given the following probability distribution for the random variable:
x | Px(x) |
---|---|
1 | 0.1 |
2 | 0.18 |
3 | 0.2 |
4 | 0.2 |
5 | 0.32 |
The variance of X is:
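The variance is, in general, the probability-weighted average squared deviation from the expected value. For this data, the expected value is:

$$E(X) = \sum_x x \, P_X(x) = 1(0.1) + 2(0.18) + 3(0.2) + 4(0.2) + 5(0.32) = 3.46$$

The variance then is:

$$\mathrm{Var}(X) = \sum_x (x - 3.46)^2 \, P_X(x) = 1.8484$$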
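The calculation can be checked numerically. The following is a minimal sketch that takes the probability table from the question as a dictionary; the variable names are illustrative only.

```python
# Probability table from the question: value -> P(X = value)
pmf = {1: 0.10, 2: 0.18, 3: 0.20, 4: 0.20, 5: 0.32}

# Expected value: probability-weighted average of the values
mean = sum(x * p for x, p in pmf.items())

# Variance: probability-weighted average squared deviation from the mean
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(round(mean, 4), round(variance, 4))  # 3.46 1.8484
```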
Suppose that the sex of a child is completely random, i.e., boys and girls occur in independent sequences with probability .50 each. What percentage of families with 5 children will have all 5 the same sex?
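For a Bernoulli process with p = .5, every sequence of n outcomes is equally likely, so:

$$P(r) = \binom{n}{r} \left(\tfrac{1}{2}\right)^n$$

For this situation, bear in mind that the number of ways to choose b things from a is the same as the number of ways to fail to choose the remaining (a - b) things, so:

$$\binom{5}{5} = \binom{5}{0} = 1$$

Given that, the probability that all 5 children are the same sex is:

$$P(0) + P(5) = 2 \cdot \left(\tfrac{1}{2}\right)^5 = \tfrac{1}{16} = 6.25\%$$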
You have a pegboard with a line of 9 holes. How many distinctly different sequences can you construct that have 5 black pegs and 4 white pegs?
This is simply the number of ways to pick the four white pegs from the set of nine:

$$\binom{9}{4} = \frac{9!}{4! \, 5!} = 126$$
Suppose that there is no such thing as being "on a hot streak" in basketball, i.e., when a player shoots, a binomial process is a good model for whether the player makes or misses a shot. Suppose that a player is "a 50% shooter," that is, the player has a probability of success of .5. What is the probability that, if he takes 16 shots, he will make exactly 8?
Because p = .5, the degenerate case may be used:
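As a numerical check, the general binomial formula can be evaluated directly. This is a small sketch; `binomial_pmf` is a helper name introduced here for illustration, not something from the original notes.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

# 16 shots by a 50% shooter, exactly 8 made
p_eight = binomial_pmf(8, 16, 0.5)
print(round(p_eight, 4))  # 0.1964
```

With p = .5 the expression reduces to the number of favorable sequences, 12870, divided by the 65536 equally likely sequences.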
Suppose you play a game where you know your probability of winning is 0.8. You play the game with even odds, i.e., if you win, you make $50, if you lose, you lose $50. Suppose you play the game 10 times. Which of the following is closest to the probability you will win money, i.e., win more than 5 of the games?
Since this is a significant amount of math, it would be nice if there were a quicker way to find the solution. One possibility is the normal approximation of the binomial. According to the Central Limit Theorem, the sum of a large number of independent, identically distributed variables with finite variance is approximately normally distributed. The basic idea is:
For this example, an intuitive formulation of the range would be:
There are two issues however:
To accommodate moving from a discrete function to a continuous one, a better approximation will be produced by expanding the range covered by a half of the fundamental unit in each direction. In this case the result would be:
5.5 - 10.5 is a two-tailed interval. It includes the probability that the number of games won could be greater than 10. This is not possible and a better formulation would be:
This number now needs to be converted to a z-score so that a distribution table can be used to approximate the area under the normal curve to the left of that point. Converting to a z-score requires a mean, μ, and a standard deviation, σ, since:

$$z = \frac{x - \mu}{\sigma}$$

For a binomial distribution:

$$\mu = np, \qquad \sigma = \sqrt{np(1 - p)}$$
This particular interval is:
The distribution table is then used to derive:
This is within 2.8% of 0.96721, so that is a safe choice for the answer.
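To see how the approximation compares with the exact binomial tail, both can be computed side by side. This is only a sketch: it assumes the one-sided, continuity-corrected form P(X > 5.5) described above and uses Python's `statistics.NormalDist` in place of a printed table, so its intermediate values may differ slightly from table lookups.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 10, 0.8  # games played, probability of winning each game

# Exact binomial tail: P(X >= 6), i.e. more than 5 wins
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(6, n + 1))

# Normal approximation with the half-unit continuity correction: P(X > 5.5)
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx = 1 - NormalDist(mu, sigma).cdf(5.5)

print(round(exact, 5))   # 0.96721
print(round(approx, 4))  # close to, but not equal to, the exact value
```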
You deal 5 cards without replacement from a shuffled poker deck. What is the probability that there will be exactly one ace, and that it will be drawn on card number 2?
If a family has 3 children, what is the probability that they have exactly 2 girls?
This is the same reasoning as the question on Bernoulli processes with .5 probability.
Samantha has 7 vases that she wishes to arrange on a shelf in her kitchen. How many different ways can she order the vases?
The number of permutations of r objects chosen from n is:

$$P(n, r) = \frac{n!}{(n - r)!}$$

When r = n:

$$P(n, n) = n!$$

For this example:

$$7! = 5040$$
Statistic A has a sampling variance of 58 and statistic B has a sampling variance of 86. The relative efficiency of A relative to B is
Efficiency of an estimator is:
The efficiency of A relative to B is:
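A short numerical sketch follows. It assumes the common convention that efficiency is inversely proportional to sampling variance, so that the efficiency of A relative to B is Var(B) / Var(A); if the course uses the reciprocal convention, the ratio should be inverted.

```python
var_a = 58.0  # sampling variance of statistic A
var_b = 86.0  # sampling variance of statistic B

# Assumed convention: efficiency is inversely proportional to sampling variance,
# so the efficiency of A relative to B is Var(B) / Var(A).
relative_efficiency = var_b / var_a
print(round(relative_efficiency, 3))  # 1.483
```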
The sampling distribution of the sample mean based on N independent observations
Other factors remaining constant, which of the following factors would increase power?
Power is the probability of a correct rejection. It is represented by 1 - β, where β is the probability of a Type II error (incorrect acceptance).
A fundamental concept in calculations of power is the effect size which represents the deviation from the actual value. If, for example, the real μ for a population is 30 and the μ being tested by H0, μ0, is 40, then the effect size would be -10 (μ - μ0).
In general, it is not useful to discuss pure effect size because it varies across tests with different units, sample sizes and variances. A more useful measure is the effect size scaled by the standard deviation, known as the standardized effect size:

$$\frac{\mu - \mu_0}{\sigma}$$
Using this form, it is possible to make a statement like, "the effect size is half a standard deviation," which is meaningful for any test.
In general, the amount of error possible for an estimation of the mean is:
This is from:
So, for a given experimentally derived , what is the probability that it will be correctly rejected if it is not within a confidence interval of α precision around the actual mean, μ?
Suppose you somehow knew that the population standard deviation σ is 10, and that the population distribution is normal. You wish to test the null hypothesis that μ = 0, using the 1-Sample Z-test of the form:
What is the statistical power if α = .05, N = 34, and μ = 4.0?
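The form of the test statistic is not reproduced above, so the following is only a sketch of how power could be computed for a 1-sample Z-test. The function `z_test_power` and its `two_tailed` switch are assumptions introduced for illustration; whether the intended test is one- or two-tailed has to be taken from the original problem.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(mu_true, mu_null, sigma, n, alpha, two_tailed=True):
    """Power of a 1-sample Z-test when the true mean is mu_true."""
    std_normal = NormalDist()
    # Distance between true and hypothesized means, in standard-error units
    shift = (mu_true - mu_null) / (sigma / sqrt(n))
    if two_tailed:
        crit = std_normal.inv_cdf(1 - alpha / 2)
        # Reject when Z > crit or Z < -crit
        return (1 - std_normal.cdf(crit - shift)) + std_normal.cdf(-crit - shift)
    crit = std_normal.inv_cdf(1 - alpha)
    # Upper-tailed test: reject when Z > crit
    return 1 - std_normal.cdf(crit - shift)

power_two = z_test_power(4.0, 0.0, 10.0, 34, 0.05, two_tailed=True)
power_one = z_test_power(4.0, 0.0, 10.0, 34, 0.05, two_tailed=False)
print(round(power_two, 3), round(power_one, 3))
```

The one-tailed test has higher power for the same α because its rejection region is concentrated on the side where the true mean lies.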
Suppose you observe a sample of size 40 independent observations from a normal distribution, and obtain a sample mean of . If the population standard deviation is σ = 21.0, then a 95% confidence interval for the population mean μ has endpoints:
A 95% confidence level means that 5% of the curve lies outside the interval, so the interval is bounded by the 2.5% and 97.5% points of the distribution. The z-scores for those points are ±1.96. In general, the confidence interval at significance level α is defined as:

$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{N}}$$
Which, for this problem is:
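Because the sample mean is not reproduced in the question text, the sketch below computes only the half-width (margin) of the interval; the endpoints are the sample mean minus and plus this margin. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, confidence = 21.0, 40, 0.95

# Two-sided critical value: the z-score with 2.5% of the curve above it
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Half-width of the interval; the endpoints are sample_mean -/+ margin
margin = z_crit * sigma / sqrt(n)
print(round(z_crit, 2), round(margin, 2))  # 1.96 6.51
```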
If E(X) = 4.0, E(Y) = 2.0, and E(XY) = 12.0, then the covariance of the random variables X and Y is necessarily equal to
Covariance is defined as:

$$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = 12.0 - (4.0)(2.0) = 4.0$$
If A = {1, 2, 3, 4, 5} and B = {2, 5, 6}, then A ∩ B =
A ∩ B is the set of elements that A and B have in common, here {2, 5}.
For any two events A and B, we can say that the probability of A ∪ B is always:
When joining A and B, the area A ∩ B gets included twice: once with A and once with B. It is therefore necessary to remove it once in computing the probability, giving P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
There are 2,500,000 adult people in the large metropolitan area where you live. There is a serial killer on the loose who can be assumed to be a local adult resident, and otherwise all that is known about the serial killer is that:
It is also known that, in the general population,
Assuming these behaviors are independent, the probability that a person has all 3 of these behaviors is .00001. Suppose you have just started dating someone, and they have blonde hair. They arrange to pick you up for your first date, and they drive up in a red sports car. The evening goes well, and you decide to go for a romantic drive by the lake. At that point, your date puts on a CD of "Tiptoe Through the Tulips," by Tiny Tim. Knowing probability as well as you do, you
The likelihood that someone has all three behaviors is 0.00001 (1 in 100,000). However, the prior likelihood that my date is the killer is 1 in 2,500,000 (since there is only one killer in the city). The probabilities specified in the problem are:
First, consider the probability that my date is the killer:
The odds then are:
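The Bayes' theorem calculation can be sketched numerically. The sketch assumes that there is exactly one killer among the adults and that the killer shows all three behaviors with certainty; both are assumptions made here to fill in facts not reproduced above.

```python
population = 2_500_000   # adults in the metropolitan area
p_traits = 0.00001       # chance a random adult shows all three behaviors

# Assumptions: exactly one killer, who shows all three behaviors with certainty
p_killer = 1 / population
p_traits_given_killer = 1.0

# Bayes' theorem: P(killer | traits)
p_traits_total = p_traits_given_killer * p_killer + p_traits * (1 - p_killer)
p_killer_given_traits = p_traits_given_killer * p_killer / p_traits_total

print(round(p_killer_given_traits, 4))  # 0.0385
```

Put another way, about 25 innocent adults in the city are expected to share all three behaviors, so the date is roughly one of 26 candidates.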
New spark plugs have just been installed in a small airplane with a 4-cylinder engine. For each spark plug, the probability that it is defective and will fail during the first 20 minutes of flight is 0.0001. Assume that spark plugs fail independently of each other. What is the probability that at least one of the spark plugs will fail during the first 20 minutes of flight?
This is the probability that one, two, three or four plugs will fail.
Or, rather than computing all of those, it is the complement of the probability that none will fail:

$$P(\text{at least one fails}) = 1 - (1 - 0.0001)^4$$
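A minimal sketch of the complement calculation, with illustrative variable names:

```python
p_fail = 0.0001  # chance a single plug fails in the first 20 minutes
plugs = 4

# Complement rule: 1 - P(no plug fails), using independence
p_at_least_one = 1 - (1 - p_fail) ** plugs
print(f"{p_at_least_one:.8f}")  # 0.00039994
```

The result is just under 4 × 0.0001 because the cases where more than one plug fails would otherwise be counted more than once.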
In a recent election, 55.0% of the voters were Republican and 45.0% were not. Of the Republicans, 80.0% voted for Candidate X, and of the non-Republicans, 10.0% voted for Candidate X. Consider a randomly selected voter. What is the probability that the voter is Republican and voted for candidate X?
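Party and vote are not independent here, so the general multiplication rule applies:

$$P(R \cap X) = P(R) \, P(X \mid R) = 0.55 \times 0.80 = 0.44$$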
Convergence to a normal sampling distribution occurs
In the casino game of roulette, a gambler can bet on which of 38 numbers will be selected by the spin of a wheel. On a $2 bet, the gambler gains $70 for picking the correct number, but loses the $2 otherwise. Let X be the amount won or lost on a roll. What is E(X)?
The distribution table for this situation looks like:
x | Px(x) |
---|---|
$70 | 1/38 |
-$2 | 37/38 |
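The expected value follows from weighting each outcome by its probability. The sketch below assumes each of the 38 numbers is equally likely and uses exact fractions to avoid rounding.

```python
from fractions import Fraction

# Outcome -> probability for a $2 bet on one of 38 numbers
outcomes = {70: Fraction(1, 38), -2: Fraction(37, 38)}

# Expected value: sum of outcome times probability
expected = sum(x * p for x, p in outcomes.items())
print(expected, float(expected))  # -2/19, roughly -0.105 dollars per bet
```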
If the random variable X has an expected value of 100 and a variance of 49, then the random variable (X - 2) / 7 has a mean and variance of:
The mean is straightforward:
The standard deviation requires the application of the variance heuristic:
Generate the square of the desired quantity:
In the squared quantity drop all variables that are not squares of a variable.
In this quantity replace the squared variables with the variance of the given value or the covariance if it is a function of multiple variables:
So for this example, through the variance heuristic, we get:
Another option is a change table:
Variable | μ | σ | σ² |
---|---|---|---|
x | 100 | 7 | 49 |
x - 2 | 98 | 7 | 49 |
(x - 2) / 7 | 14 | 1 | 1 |
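The last row of the change table can be checked with the standard rules for a linear transformation, E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X). The helper `linear_transform` is a name introduced here for illustration.

```python
from math import sqrt

def linear_transform(mean, variance, a, b):
    """Mean and variance of a*X + b, given the mean and variance of X."""
    return a * mean + b, a**2 * variance

# (X - 2) / 7 is the same as (1/7) * X - 2/7
new_mean, new_var = linear_transform(100, 49, 1 / 7, -2 / 7)
print(round(new_mean, 6), round(new_var, 6), round(sqrt(new_var), 6))  # 14.0 1.0 1.0
```

Subtracting the constant shifts only the mean, while dividing by 7 scales the mean by 1/7 and the variance by 1/49.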
Suppose you wish to test the null hypothesis H0 : μ = 100 against the alternative H1 : μ ≠ 100. What kind of test is this?
The hypothesis is a single point in the parameter space, so it is a point hypothesis. The alternative hypothesis allows for the data to be either greater than or less than the given point, so the test is 2-tailed.
How many distinctly different subsets, including the null set, can be composed from 12 objects?
In general, this would be the number of subsets of size 0 through n, or:

$$\sum_{k=0}^{n} \binom{n}{k}$$

However, each subset can be represented by an n-length bit field in which bit x_i is 0 if element i is absent and 1 if it is present. The question is then the number of possible values of that bit field, which is:

$$2^n = 2^{12} = 4096$$
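Both counting arguments can be compared directly in a few lines; this sketch simply confirms that they agree for n = 12.

```python
from math import comb

n = 12

# Count subsets by size: 0 elements, 1 element, ..., n elements
by_size = sum(comb(n, k) for k in range(n + 1))

# Count bit fields: each of the n elements is independently in or out
by_bits = 2**n

print(by_size, by_bits)  # 4096 4096
```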
Suppose an exam has 12 multiple choice items, and each has 4 alternatives. Suppose that a student who knows absolutely nothing about the course material guesses on all 12 items. On average, how many items would you expect the student to get right?
For a binomial:

$$E(X) = np = 12 \times 0.25 = 3$$
Given the following probability distribution for the random variable:
x | Px(x) |
---|---|
1 | a |
2 | 0.1 |
3 | 0.2 |
4 | 0.15 |
5 | 0.4 |
The expected value of X is:
First it is necessary to find a. This is straightforward knowing that:
The expected value is simply:
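The two steps, solving for a and then taking the weighted sum, can be sketched as follows. The dictionary layout is an illustrative choice.

```python
# Known probabilities from the table; the entry for x = 1 is the unknown a
known = {2: 0.10, 3: 0.20, 4: 0.15, 5: 0.40}

# Probabilities must sum to 1
a = 1 - sum(known.values())

# Full distribution, then the probability-weighted sum of the values
pmf = {1: a, **known}
expected = sum(x * p for x, p in pmf.items())
print(round(a, 2), round(expected, 2))  # 0.15 3.55
```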
Suppose that the sex of a child is completely random, i.e., boys and girls occur in independent sequences with probability .50 each. What percentage of families with 5 children will have all 5 the same sex?
This is identical to the question on Bernoulli processes with .5 probability.
You have a pegboard with a line of 13 holes. How many distinctly different sequences can you construct that have 3 black pegs and 10 white pegs?
The reasoning is the same as the previous question on combinations.
Suppose that σ is known to be 15, and you gather data and construct a 95% confidence interval for μ, using the standard technique. If this confidence interval ranges from 25 to 37.8, what would be the outcome of a hypothesis test, performed at the α = 0.05 significance level, of the statistical null hypothesis that μ = 30?
The H0 being tested is for a μ that falls within the confidence interval, so the hypothesis is accepted.
At 3 a.m. after the annual graduate student party (at which you consumed 5 beers, a jar of salsa, 8 ounces of potato chips, and 4 chocolate twinkies), you awaken with a splitting headache and stumble to the medicine cabinet in your dingy apartment. You grab four painkillers and gulp them down with a glass of water. You are about to head back to bed when you notice that your roommate has left a bottle of "rat-be-gone" tablets just to the left of the painkillers. (Your apartment has been overrun by rats recently, and your roommate decided to take action.) It suddenly hits you: you might have accidentally swallowed rat-be-gone tablets! Just then, you develop stomach cramps. You call the student hospital, but they put you on hold for 30 minutes. You don’t own a car, and your roommate has not returned, so you realize there is no chance of obtaining help. At that point, you remember your outstanding Psych. 310 training, and your probabilistic thinking skills kick in. It is known that: Since you reach with your right hand, the prior probability that you grabbed the painkiller bottle is .90. The probability that you will have stomach cramps, given that you swallowed rat-be-gone, is .99. The probability that you will have stomach cramps, given that you did not swallow rat-be-gone, is .50. The probability that you will survive the night if you did indeed swallow rat-be-gone is .40. The probability that you will survive the night if you did not swallow rat-be-gone is .99999. What is the probability that you will survive the night?
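One way to set this up is to update the prior with Bayes' theorem using the observed cramps, then apply the law of total probability to survival. The sketch below treats the situation as a simple either/or (painkillers or rat-be-gone) and assumes that survival depends only on what was swallowed, not on the cramps themselves; both are simplifying assumptions rather than part of the problem statement.

```python
p_rat = 0.10                 # prior: grabbed the rat-be-gone bottle
p_pain = 0.90                # prior: grabbed the painkiller bottle
p_cramps_rat = 0.99          # P(cramps | rat-be-gone)
p_cramps_pain = 0.50         # P(cramps | no rat-be-gone)
p_survive_rat = 0.40         # P(survive | rat-be-gone)
p_survive_pain = 0.99999     # P(survive | no rat-be-gone)

# Bayes' theorem: update the prior using the observed cramps
p_cramps = p_cramps_rat * p_rat + p_cramps_pain * p_pain
p_rat_given_cramps = p_cramps_rat * p_rat / p_cramps

# Total probability of survival, assuming survival depends only on what was swallowed
p_survive = (p_survive_rat * p_rat_given_cramps
             + p_survive_pain * (1 - p_rat_given_cramps))
print(round(p_rat_given_cramps, 4), round(p_survive, 4))  # 0.1803 0.8918
```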