Final Exam
Review Questions Stat 101
Year | Time | Year | Time
1959 | 143 | 1970 | 131
1960 | 141 | 1971 | 139
1961 | 144 | 1972 | 136
1962 | 144 | 1973 | 136
1963 | 139 | 1974 | 134
1964 | 140 | 1975 | 130
1965 | 137 | 1976 | 140
1966 | 137 | 1977 | 135
1967 | 136 | 1978 | 130
1968 | 142 | 1979 | 129
1969 | 134 | 1980 | 132
1.
The
a) By how much on average did the winning time improve per year during this period?
b) Use the regression line to predict the winning time in 1990. Is this prediction trustworthy? (The actual 1990 winning time was 128 minutes.)
Answer: regression line: Time^ = 1221.1 - 0.5505 Year, so
a) 0.5505 minutes per year
b) 125.487 min (calculator). Not trustworthy: 1990 is far outside the 1959-1980 range of the data.
Note: I used 4-digit years and did this both with calculator and Excel.
The Excel prediction was slightly different from the calculator’s.
The linear fit is moderately strong: R^2 = .59926, so the line explains about 60% of the variation in winning times.
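The slope, prediction, and R^2 in this answer can be checked by coding the least-squares formulas directly. The sketch below is one way to do that in plain Python. The function name `least_squares` is made up for this illustration, and the data are typed in from the table above.

```python
# Winning times (minutes) for 1959-1980, copied from the table above.
years = list(range(1959, 1981))
times = [143, 141, 144, 144, 139, 140, 137, 137, 136, 142, 134,
         131, 139, 136, 136, 134, 130, 140, 135, 130, 129, 132]


def least_squares(x, y):
    """Return (intercept, slope, r_squared) for the line y = a + b*x.

    Hypothetical helper written for this example only.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


intercept, slope, r_squared = least_squares(years, times)
pred_1990 = intercept + slope * 1990  # extrapolation beyond the data range

print(f"Time^ = {intercept:.1f} + ({slope:.4f}) Year")
print(f"R^2 = {r_squared:.5f}")
print(f"Predicted 1990 time = {pred_1990:.3f} min")
```

Centering x and y before summing keeps the arithmetic stable even with 4-digit years.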
Group | n | mean | s
1 | 73 | 3.41 | 3.62
2 | 302 | 2.20 | 2.67
2. A study of the effect of eating sweetened cereals on tooth decay in children compared 73 children (Group 1) who ate such cereals regularly with 302 children (Group 2) who did not. After three years the number of new cavities was measured for each child. The summary statistics are as shown in the table. The researchers suspected that sweetened cereals increase the mean number of cavities. Do the data support this suspicion? Formulate and perform the hypothesis test. Also give a 95% confidence interval for the difference of the means.
Answer:
H0: μ1 = μ2
HA: μ1 > μ2
2-sample t-test: t = 2.6848, P = .004306. Low P-value, so reject H0. Strong evidence that sweetened cereals increase the mean number of cavities.
2-sample t-interval for μ1 - μ2: (.3149, 2.1051)
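For readers without a calculator that has a 2-sample t-test built in, the statistic can be reproduced from the summary table. The sketch below assumes the unpooled (Welch) version of the test and approximates the one-sided P-value by numerically integrating the t density with Simpson's rule. Both function names are hypothetical helpers written for this example.

```python
import math


def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Unpooled two-sample t statistic, Welch-Satterthwaite df, and standard error."""
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3


# Group 1 ate sweetened cereals regularly; Group 2 did not.
t, df, se = welch_t(3.41, 3.62, 73, 2.20, 2.67, 302)
p_value = t_upper_tail(t, df)  # one-sided, matching HA: mu1 > mu2

print(f"t = {t:.4f}, df = {df:.1f}, one-sided P = {p_value:.6f}")
```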
3. In the article "The Impact of Cover Design and First Questions on Response Rates for a Mail Survey of Skydivers" in Leisure Sciences, 1991, pp. 67-76, researchers reported on the results of an experiment with cover design. Of 420 skydivers, 207 were randomly selected to receive a survey with a plain cover, and the remaining 213 received a cover with a picture of a skydiver. The outcome of the experiment is shown below:
Group | Number-sent | Number-returned
Plain | 207 | 104
Skydiver | 213 | 109
The researchers are interested in seeing if there is any difference between the response rates to the survey for plain versus skydiver covers.
a) What are the null and alternative hypotheses?
b) Calculate the value of the test statistic for the hypothesis test.
c) Using a significance level of .05, is there evidence of a detectable difference between response rates for plain and skydiver covers?
d) Give a 95% confidence interval for the difference between the proportion of skydivers who respond to plain covers and the proportion of skydivers who respond to a skydiver cover. Use both the standard method and the plus 4 method.
Answers:
a) H0: p1 = p2, HA: p1 ≠ p2, where p1 and p2 are the response rates for plain and skydiver covers.
b) 2-proportion z-test: z = -.1910
c) P = .8485 > .05, virtually no evidence against H0, so fail to reject it. No evidence of a detectable difference between response rates for plain and skydiver covers.
d) 2-proportion z-interval: standard (-.105, .08631), plus 4 (-.1039, .08558)
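The test statistic, P-value, and standard interval above can be reproduced with Python's standard library. This is a minimal sketch: the variable names are illustrative, and only the standard (not the plus 4) interval is computed.

```python
import math
from statistics import NormalDist

x1, n1 = 104, 207   # plain cover: returned, sent
x2, n2 = 109, 213   # skydiver cover: returned, sent

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# The test statistic uses the pooled proportion, because H0 says p1 = p2.
p_pool = (x1 + x2) / (n1 + n2)
se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = diff / se_pool
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided alternative

# The standard 95% interval uses the unpooled standard error.
z_star = NormalDist().inv_cdf(0.975)
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci = (diff - z_star * se, diff + z_star * se)

print(f"z = {z:.4f}, P = {p_value:.4f}")
print(f"95% CI for p1 - p2: ({ci[0]:.4f}, {ci[1]:.5f})")
```

The pooled standard error belongs to the test and the unpooled one to the interval, which is why the two calculations are kept separate.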
Final Exam Stat 101 2007 summer Name__________
1. (25) Here are data on the years of schooling completed x, and annual income y (in thousands of dollars), for a sample of nine 40-year-old men.
Years | 10 | 16 | 12 | 6 | 12 | 12 | 16 | 16 | 18
Income | 48 | 58 | 36 | 33 | 45 | 50 | 55 | 47 | 48
a) Find the equation of the regression line.
b) Give a scatterplot of the 9 points and graph the regression line.
c) By how much on average does income increase per year of education for this sample?
d) What proportion of the variation in income is explained by the regression line?
f) Use the regression line to predict the income of a 40-year-old man with 14 years of schooling.
Answers:
a) income^ = 27.232 + 1.4823 years
b)
c) $1,482
d) 48.07%
f) $47,984
2. (25) The water diet requires one to drink two cups of water every half hour from when one gets up until one goes to bed, but otherwise allows one to eat whatever one likes. Four adult volunteers agree to test the diet. They are weighed prior to beginning the diet and after six weeks on the diet. The weights (in pounds) are
Person 1 2 3 4
Weight before the diet 180 125 240 150
Weight after six weeks 170 130 215 152
For the population of all adults, assume that the weight loss after six weeks on the diet (weight before beginning the diet minus weight after six weeks on the diet) is normally distributed with mean μ.
a) Test whether the diet leads to weight loss. State hypotheses, calculate a test statistic, give a P-value (by calculator or as exact as the tables in the text allow), and state your conclusion in words.
b) Find a 90% confidence interval for μ.
Answers:
a) H0: μ = 0, HA: μ > 0. Paired t-test: t-test on the differences before - after: 10, -5, 25, -2. t = 1.0265, P = .1901. Insufficient evidence for weight loss; fail to reject H0.
b) (-9.048, 23.048) (not requested in problem, but df = 3; ME = 16.048)
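The paired analysis reduces to a one-sample t procedure on the four differences. The sketch below recomputes the test statistic and the 90% interval. The critical value 2.3534 (the 0.95 quantile of the t distribution with 3 degrees of freedom) is entered as a table lookup, and the variable names are illustrative.

```python
import math
from statistics import mean, stdev

before = [180, 125, 240, 150]
after = [170, 130, 215, 152]
diffs = [b - a for b, a in zip(before, after)]  # weight loss per person

n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)           # sample standard deviation (n - 1 divisor)
se = s_d / math.sqrt(n)
t = d_bar / se               # test statistic for H0: mu = 0, df = n - 1

T_STAR_90_DF3 = 2.3534       # table value for 90% confidence, df = 3
margin = T_STAR_90_DF3 * se
ci = (d_bar - margin, d_bar + margin)

print(f"differences = {diffs}")
print(f"t = {t:.4f} with df = {n - 1}")
print(f"90% CI for mu: ({ci[0]:.3f}, {ci[1]:.3f}), ME = {margin:.3f}")
```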
3. (25) An SRS of 25 male faculty members at a large university found that 10 felt that the university was supportive of female and minority faculty. An independent SRS of 20 female faculty found that 5 felt that the university was supportive of female and minority faculty. Let p1 and p2 represent the proportion of all male and female faculty, respectively, at the university who felt that the university was supportive of female and minority faculty at the time of the survey.
a) Test whether significantly more male faculty than female faculty felt that the university was supportive of female and minority faculty. State hypotheses, calculate a test statistic, give a P-value (by calculator or as exact as the tables in the text allow), and state your conclusion in words.
b) Find a 98% plus-four confidence interval for p1 – p2.
Answers:
a) H0: p1 = p2, HA: p1 > p2 ; 2-proportion z-test: z = 1.0607, P = .1444. Little evidence that significantly more male faculty than female faculty felt that the university was supportive of female and minority faculty; fail to reject H0. Also unreliable since 5 are too few successes in group 2 (should be 10 or more).
b) 98% plus-four 2-proportion z-interval (the plus-four method compensates for only 5 successes in group 2): (-.1809, .4252)
4. (25) Answer each of the following questions. (No explanation is needed—just a short answer.)
a) You are reading an article in your field that reports several statistical analyses. The article says that the P-value for a significance test is 0.045. Is this result significant at the 5% significance level? YES
b) Is the result with P-value 0.045 significant at the 1% significance level? NO
c) For another significance test, the article says only that the result was significant at the 1% level. Are such results always, sometimes, or never significant at the 5% level? ALWAYS
d) The article contains a 95% confidence interval. Would the margin of error in a 99% confidence interval computed from the same data be less, the same, or greater? GREATER
e) Reaction times of a subject to a stimulus are often strongly skewed to the right because of a few slow reaction times. You wish to test H0: μ1 = μ2 where μ1 is the mean reaction time for Stimulus 1, and μ2 for Stimulus 2. You have two independent samples, 8 observations for Stimulus 1 and 10 for Stimulus 2. Which, if any, of the tests that we have covered in this course can be used to test this? NONE (t-test needs normality of population, esp. for small samples.) A more advanced course would cover non-parametric methods that could be used here.
Excel Problems
1. Here are the number of pieces of mail received at a school office for 36 days.
123 | 80 | 52 | 112 | 118 | 95
151 | 100 | 66 | 143 | 110 | 115
70 | 78 | 103 | 92 | 118 | 131
115 | 128 | 135 | 100 | 75 | 105
90 | 72 | 138 | 93 | 106 | 59
97 | 130 | 76 | 88 | 60 | 85
(1) Draw a histogram with the given bin range.
65 | 80 | 95 | 110 | 125 | 140 | 155
(2) Compute the five-number summary, the mean and standard deviation.
Answers:
(1)
(2)
Min | 52
Q1 | 79
Median | 100
Q3 | 118
Max | 151
Mean | 100.25
SD | 25.5392
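The summary statistics can be recomputed from the 36 values as printed. The sketch below uses the textbook quartile rule (Q1 and Q3 are the medians of the lower and upper halves of the sorted data); spreadsheet quartile functions interpolate and may give slightly different quartiles. The histogram counts assume each bin value is an inclusive upper limit, which is an assumption about how the bins are meant to be read. `five_number_summary` is a hypothetical helper.

```python
from statistics import mean, median, stdev

mail = [123, 80, 52, 112, 118, 95, 151, 100, 66, 143, 110, 115,
        70, 78, 103, 92, 118, 131, 115, 128, 135, 100, 75, 105,
        90, 72, 138, 93, 106, 59, 97, 130, 76, 88, 60, 85]


def five_number_summary(values):
    """Min, Q1, median, Q3, max using medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2          # excludes the middle value when n is odd
    return (data[0], median(data[:half]), median(data),
            median(data[-half:]), data[-1])


summary = five_number_summary(mail)
mail_mean = mean(mail)
mail_sd = stdev(mail)              # sample standard deviation

# Histogram counts: a value v falls in a bin when previous limit < v <= limit.
bins = [65, 80, 95, 110, 125, 140, 155]
counts = []
previous = float("-inf")
for limit in bins:
    counts.append(sum(previous < v <= limit for v in mail))
    previous = limit

print("five-number summary:", summary)
print(f"mean = {mail_mean:.4f}, SD = {mail_sd:.4f}")
print("bin counts:", dict(zip(bins, counts)))
```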
2. The table shows the number of live births per 1000 women aged 15-44 years in the
(1) Make a scatter plot with a trend line on it.
(2) Find the equation of the regression line, and R^2.
(3) Predict what the birth rate will be in 2010.
Year | Rate
1965 | 19.4
1970 | 18.4
1975 | 14.8
1980 | 15.9
1985 | 15.6
1990 | 16.4
1995 | 14.8
2000 | 14.4
2005 | 14.0
Answer:
3. The data below show the sugar content (as a percentage of weight) of two different brands of cereals. Do these data provide evidence that the two brands differ in sugar content?
Brand A | Brand B
40.3 | 40
45.7 | 42.2
43.3 | 36.5
43 | 41.4
44.2 | 38.4
44 | 41.1
44 | 40.3
37.8 | 40.9
45.9 | 39.8
44.1 | 41
43.7 | 41.2
Answer:
H0: μ1 = μ2
HA: μ1 ≠ μ2
2-sample t-test (unequal variances, i.e. not pooled)
t Stat | 3.540701414
P(T<=t) two-tail | 0.002335713
Strong evidence that the two brands differ in sugar content. Reject H0.
4. One Thursday, researchers gave students a set of 50 new Spanish vocabulary words to memorize. On Friday the students took a vocabulary test. On the following Monday, they were retested without advance warning. Both sets of test scores of the 25 students are shown below. Are the scores on Friday and the following Monday different?
Friday | Monday
42 | 36
44 | 44
45 | 46
48 | 38
44 | 40
43 | 38
41 | 37
35 | 31
43 | 32
48 | 37
43 | 41
45 | 32
47 | 44
50 | 47
34 | 34
38 | 31
43 | 40
39 | 41
46 | 32
37 | 36
40 | 31
41 | 32
48 | 39
37 | 31
36 | 41
Answer:
H0: μ = 0
HA: μ ≠ 0
Paired t-test: t-test on the differences Friday - Monday.
t Stat | 5.166581919
P(T<=t) two-tail | 2.72715E-05
Very strong evidence that the two scores differ. Reject H0.
Stat 101 Final Exam Thulin - 2009 Autumn
Sex | Height | Shoe
F | 60 | 6
F | 62 | 5.5
F | 62 | 6
F | 62 | 7
F | 64 | 9
F | 64 | 5.5
F | 64 | 6.5
F | 65 | 7
F | 66 | 9
F | 67 | 9
F | 68 | 9
F | 69 | 10
M | 67 | 12
M | 67 | 10
M | 67 | 10.5
M | 69 | 8
M | 71 | 10.5
M | 74 | 11
A Stat 101 class was polled, and the table above records height and shoe size for the students. Use it to answer questions 1-4.
1. Draw a histogram for Height of females only with the given bins.
60-61 | 62-63 | 64-65 | 66-67 | 68-69
2. Draw side by side boxplots for the female heights and male heights.
3. Do these data support the assertion that average height of male Stat 101 students is greater than that of female Stat 101 students? Perform the appropriate hypothesis test, stating the null and alternative hypotheses, and report the t and P values.
4. Do linear regression for the males only: height (explanatory) vs. shoe size (response). Find:
a) a scatterplot with regression line
b) the equation of the regression line
c) the proportion of variation in shoe size explained by the regression line
d) the predicted shoe size for a height of 70 inches.
5. Five randomly chosen subjects endured the UltraFast diet for 4 weeks. The before and after weights were:
Before | 160 | 200 | 210 | 150 | 140
After | 150 | 199 | 190 | 160 | 125
Do these data support the efficacy of the diet? Perform the appropriate hypothesis test, stating the null and alternative hypotheses, and report the t and P values.
6. Suppose that scores on an exam follow the normal model with mean 70 and standard deviation 10. Determine the probability that:
a) a randomly chosen examinee has a score between 80 and 90
b) 4 randomly chosen examinees have a mean score between 80 and 90.
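As a way to check answers to this kind of question, the two probabilities can be computed with `statistics.NormalDist`. The sketch assumes the four examinees are independent, so the sample mean is normal with standard deviation 10/sqrt(4) = 5; the variable names are illustrative.

```python
import math
from statistics import NormalDist

MU, SIGMA = 70, 10

# a) One examinee: X ~ N(70, 10).
single = NormalDist(mu=MU, sigma=SIGMA)
p_single = single.cdf(90) - single.cdf(80)

# b) Mean of n = 4 independent examinees: x-bar ~ N(70, 10 / sqrt(4)).
n = 4
sample_mean = NormalDist(mu=MU, sigma=SIGMA / math.sqrt(n))
p_mean = sample_mean.cdf(90) - sample_mean.cdf(80)

print(f"P(80 < X < 90)     = {p_single:.4f}")
print(f"P(80 < x-bar < 90) = {p_mean:.4f}")
```

The second probability is much smaller because averaging shrinks the spread, so a mean of 80 is 2 standard deviations above 70 rather than 1.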
7. A survey of 18 Stat 101 students revealed that 7 were politically moderate. Assume this was a random sample of Stat 101 students. Construct a 90% confidence interval for the proportion of moderate Stat 101 students. Be sure to check the conditions, and if one is not satisfied, use a method that does not require it. Hint: plus 4.
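The hint can be followed as in the sketch below. It assumes the usual one-sample plus-four recipe (add 2 successes and 2 failures, then use the ordinary z-interval formula) and takes "at least 10 successes and 10 failures" as the large-sample condition being checked. The variable names are illustrative.

```python
import math
from statistics import NormalDist

x, n = 7, 18          # moderates, sample size
confidence = 0.90

# Condition check (assumed rule: at least 10 successes and 10 failures).
large_sample_ok = x >= 10 and (n - x) >= 10

# Plus-four estimate: add 2 successes and 2 failures.
p_tilde = (x + 2) / (n + 4)
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
se = math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))
ci = (p_tilde - z_star * se, p_tilde + z_star * se)

print(f"large-sample condition met: {large_sample_ok}")
print(f"p-tilde = {p_tilde:.4f}")
print(f"90% plus-four CI: ({ci[0]:.4f}, {ci[1]:.4f})")
```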
Group | n | mean | s
women | 153 | 5.41 | 3.65
men | 127 | 7.20 | 4.02
8. Researchers are interested in whether there is a difference between men and women in response to a certain drug. The summary statistics are as shown in the table. Formulate and perform the appropriate hypothesis test. Also give a 95% confidence interval for the difference of the means.
9. Answer each of the following questions. (No explanation is needed, just a short answer.)
a) You are reading an article in your field that reports several statistical analyses. The article says that the P-value for a significance test is 0.005. Is this result significant at the 5% significance level?
b) Is the result with P-value 0.005 significant at the 1% significance level?
c) For another significance test, the article says only that the result was significant at the 10% level. Are such results always, sometimes, or never significant at the 5% level?
d) The article contains a 98% confidence interval. Would the margin of error in a 90% confidence interval computed from the same data be less, the same, or greater?
e) A 95% confidence interval is computed from a given sample. Additional data are gathered so that the new sample is 4 times the size of the old. The sample standard deviation of the new sample turns out to be very close to that of the original. A new 95% confidence interval is computed from the new sample. The new interval will be about ___ as wide as the old one.
10. A poll indicated the following support for candidate X in various parts of a city. Is there evidence that support for candidate X differs among parts of the city at the a) .05 b) .01 significance levels? Report the degrees of freedom, chi-square and P values.
Opinion | N | E | W | S
Support | 22 | 15 | 11 | 23
Oppose | 15 | 17 | 14 | 29
No opinion | 11 | 22 | 19 | 13
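A chi-square test on a two-way table like this one can be sketched in plain Python. Expected counts are (row total x column total) / grand total. For the P-value, the sketch uses the closed form of the chi-square upper tail that holds when the degrees of freedom are even, which is the case here; `chi2_upper_tail_even_df` is a hypothetical helper written for this example.

```python
import math

# Rows: Support, Oppose, No opinion. Columns: N, E, W, S.
observed = [
    [22, 15, 11, 23],
    [15, 17, 14, 29],
    [11, 22, 19, 13],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!."""
    if df % 2:
        raise ValueError("this closed form only applies to even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))


p_value = chi2_upper_tail_even_df(chi2, df)

print(f"chi-square = {chi2:.3f}, df = {df}, P = {p_value:.4f}")
print("significant at .05:", p_value < 0.05)
print("significant at .01:", p_value < 0.01)
```

For odd degrees of freedom this shortcut does not apply, and a table or a statistics package would be needed instead.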