Final Exam Review Questions                  Stat 101

 

Year    Time    Year    Time
1959    143     1970    131
1960    141     1971    139
1961    144     1972    136
1962    144     1973    136
1963    139     1974    134
1964    140     1975    130
1965    137     1976    140
1966    137     1977    135
1967    136     1978    130
1968    142     1979    129
1969    134     1980    132

1. The Boston Marathon is one of the world’s best-known foot races. The winning time in the Boston Marathon has decreased as runners get faster. The table shows the time of the winning man, in minutes, for the years from 1959 to 1980.

a) By how much on average did the winning time improve per year during this period?

b) Use the regression line to predict the winning time in 1990. Is this prediction trustworthy? (The actual 1990 winning time was 128 minutes.)

 

Answer: regression line: Time^ = 1221.1 - 0.5505 Year, so

a) 0.5505 minutes

b) 125.487 min. (calculator). Not trustworthy: 1990 is far outside the range of years in the data (extrapolation).

Note: I used 4-digit years and did this both with calculator and Excel.

The Excel prediction was slightly different from the calculator’s.

This is a fairly strong linear model since R² = .59926.
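
The regression answer above can be checked with a short script. This is a minimal sketch using only the Python standard library and the textbook least-squares formulas; the variable names are illustrative, and it assumes the table pairs each year from 1959 to 1980 with the time listed next to it.

```python
# Least-squares fit of winning time (minutes) on 4-digit year.
years = list(range(1959, 1981))
times = [143, 141, 144, 144, 139, 140, 137, 137, 136, 142, 134,
         131, 139, 136, 136, 134, 130, 140, 135, 130, 129, 132]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(times) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in years)
syy = sum((y - mean_y) ** 2 for y in times)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, times))

slope = sxy / sxx                      # change in minutes per year
intercept = mean_y - slope * mean_x
r_squared = sxy ** 2 / (sxx * syy)     # proportion of variation explained
pred_1990 = intercept + slope * 1990   # extrapolation beyond the data

print(f"Time^ = {intercept:.1f} + ({slope:.4f}) Year")
print(f"R^2 = {r_squared:.5f}")
print(f"Predicted 1990 time = {pred_1990:.3f} min")
```

The fit itself says nothing about whether the trend continues after 1980, which is why the 1990 prediction should be read with caution.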

 

 

 

 

Group    n      mean    s
1        73     3.41    3.62
2        302    2.20    2.67

2. A study of the effect of eating sweetened cereals on tooth decay in children compared 73 children (Group 1) who ate such cereals regularly with 302 children (Group 2) who did not. After three years the number of new cavities was measured for each child. The summary statistics are as shown in the table. The researchers suspected that sweetened cereals increase the mean number of cavities. Do the data support this suspicion? Formulate and perform the hypothesis test. Also give a 95% confidence interval for the difference of the means.

 

Answer:

H0: μ1 = μ2

HA: μ1 > μ2

2-sample t-test: t = 2.6848, P = .004306. Low P-value, so reject H0. Strong evidence that sweetened cereals increase the mean number of cavities.

2-sample t-interval for μ1 - μ2:  (.3149, 2.1051)
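
The t statistic can be reproduced from the summary table alone. The sketch below assumes the unpooled (Welch) two-sample procedure and uses a hypothetical helper name, welch_from_summary; it stops at the statistic and the approximate degrees of freedom, since the P-value needs a t-distribution CDF from a calculator or a statistics library.

```python
from math import sqrt

def welch_from_summary(mean1, s1, n1, mean2, s2, n2):
    """Unpooled two-sample t statistic, Welch df, and standard error."""
    v1 = s1 ** 2 / n1          # squared standard error of mean 1
    v2 = s2 ** 2 / n2          # squared standard error of mean 2
    se = sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se

# Group 1: sweetened cereal eaters; Group 2: others.
t, df, se = welch_from_summary(3.41, 3.62, 73, 2.20, 2.67, 302)
print(f"t = {t:.4f}, df = {df:.1f}, SE = {se:.4f}")
```

The confidence interval is then (3.41 - 2.20) plus or minus t* times SE, with t* taken from the t distribution with the df printed above.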

 

 

3. In the article "The Impact of Cover Design and First Questions on Response Rates for a Mail Survey of Skydivers" in Leisure Sciences, 1991, pp. 67--76, researchers reported on the results of an experiment with cover design.  Of 420 skydivers, 207 were randomly selected to receive a survey with a plain cover, and the remaining 213 received a cover with a picture of a skydiver.   The outcome of the experiment is shown below:

Group       Number-sent    Number-returned
Plain       207            104
Skydiver    213            109

 

The researchers are interested in seeing if there is any difference between the response rates to the survey for plain versus skydiver covers.

a)  What are the null and alternative hypotheses?

b)  Calculate the value of the test statistic for the hypothesis test.

c)  Using a significance level of .05, is there evidence of a detectable difference between response rates for plain and skydiver covers?

d)  Give a 95% confidence interval for the difference between the proportion of skydivers who respond to plain covers and the proportion of skydivers who respond to a skydiver cover. Use both the standard method and the plus 4 method.

 

Answers:

a) H0: p1 = p2, HA: p1 ≠ p2

b) 2-proportion z-test: z = -.1910

c) P = .8485 > .05, virtually no evidence against H0, so fail to reject it. No evidence of a detectable difference between response rates for plain and skydiver covers.

d) 2-proportion z-interval: standard (-.105, .08631), plus 4 (-.1039, .08558)
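
Parts b) through d) can be checked with the standard library. This is a sketch of the usual two-proportion z procedures: a pooled standard error for the test and an unpooled one for the standard interval. The variable names are illustrative, and the plus 4 interval is left out because textbooks differ on how many successes and trials to add.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 104, 207   # plain cover: returned, sent
x2, n2 = 109, 213   # skydiver cover: returned, sent

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Test of H0: p1 = p2 uses the pooled proportion.
pooled = (x1 + x2) / (n1 + n2)
se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = diff / se_pooled
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided

# Standard 95% interval uses the unpooled standard error.
se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z_star = NormalDist().inv_cdf(0.975)
ci = (diff - z_star * se_unpooled, diff + z_star * se_unpooled)

print(f"z = {z:.4f}, P = {p_value:.4f}")
print(f"95% CI for p1 - p2: ({ci[0]:.4f}, {ci[1]:.4f})")
```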

 

 

 

 

Final Exam          Stat 101               2007 summer       Name__________

 

 

 

1. (25) Here are data on the years of schooling completed x, and annual income y (in thousands of dollars), for a sample of nine 40-year-old men.

 

Years     10   16   12    6   12   12   16   16   18
Income    48   58   36   33   45   50   55   47   48

 

a) Find the equation of the regression line.

 

b) Give a scatterplot of the 9 points and graph the regression line.

 

c) By how much on average does income improve per year of education for this sample?

 

d) What percentage of the variation in income is explained by the regression line?

 

f) Use the regression line to predict the income of a 40-year-old man with 14 years of schooling.

 

Answers:

a) income^ = 27.232 + 1.4823 years

b)

 

c) $1,482

d) 48.07%

f) $47,984

 

 

 

 

 

2. (25) The water diet requires one to drink two cups of water every half hour from when one gets up until one goes to bed, but otherwise allows one to eat whatever one likes.  Four adult volunteers agree to test the diet.  They are weighed prior to beginning the diet and after six weeks on the diet.  The weights (in pounds) are

 

Person                                       1          2          3          4     

Weight before the diet               180      125      240      150

Weight after six weeks              170      130      215      152

 

For the population of all adults, assume that the weight loss after six weeks on the diet (weight before beginning the diet minus weight after six weeks on the diet) is normally distributed with mean μ.

 

a) Test whether the diet leads to weight loss. State hypotheses, calculate a test statistic, give a P-value (by calculator or as exact as the tables in the text allow), and state your conclusion in words.

 

b) Find a 90% confidence interval for μ.

 

Answers:

a) H0: μ = 0, HA: μ > 0. Paired t-test: t-test on the differences before - after: 10, -5, 25, -2. t = 1.0265, P = .1901. Insufficient evidence for weight loss; fail to reject H0.

b) (-9.048, 23.048) (not requested in problem, but df = 3; ME = 16.048)
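
The paired calculation is a one-sample t procedure on the four differences. The sketch below uses only the standard library; the critical value 2.353 is the t-table entry for df = 3 at 90% confidence and is typed in by hand as an assumption, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

before = [180, 125, 240, 150]
after = [170, 130, 215, 152]

# Weight loss = before - after, one difference per person.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

d_bar = mean(diffs)
se = stdev(diffs) / sqrt(n)       # sample SD of differences over sqrt(n)
t = d_bar / se                    # test statistic for H0: mean loss = 0

T_STAR_90_DF3 = 2.353             # assumed t-table value: 90% CI, df = 3
margin = T_STAR_90_DF3 * se
ci = (d_bar - margin, d_bar + margin)

print(f"differences = {diffs}")
print(f"t = {t:.4f} with df = {n - 1}")
print(f"90% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Because the table value is rounded to three decimals, the interval endpoints can differ from a calculator's in the third decimal place.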

 

 



3. (25) An SRS of 25 male faculty members at a large university found that 10 felt that the university was supportive of female and minority faculty.  An independent SRS of 20 female faculty found that 5 felt that the university was supportive of female and minority faculty.  Let p1 and p2 represent the proportion of all male and female faculty, respectively, at the university who felt that the university was supportive of female and minority faculty at the time of the survey.

 

a) Test whether significantly more male faculty than female faculty felt that the university was supportive of female and minority faculty. State hypotheses, calculate a test statistic, give a P-value (by calculator or as exact as the tables in the text allow), and state your conclusion in words.

 

b) Find a 98% plus-four confidence interval for p1 - p2.

 

Answers:

a) H0: p1 = p2, HA: p1 > p2 ; 2-proportion z-test: z = 1.0607, P = .1444. Little evidence that significantly more male faculty than female faculty felt that the university was supportive of female and minority faculty; fail to reject H0. Also unreliable since 5 are too few successes in group 2 (should be 10 or more).

b) 2-proportion z-interval: (plus 4 method compensates for only 5 successes in group 2): (-.1809, .4252)

 

 

 

4. (25) Answer each of the following questions. (No explanation is needed—just a short answer.)

 

a) You are reading an article in your field that reports several statistical analyses. The article says that the P-value for a significance test is 0.045. Is this result significant at the 5% significance level? YES

 

b) Is the result with P-value 0.045 significant at the 1% significance level? NO

 

c) For another significance test, the article says only that the result was significant at the 1% level. Are such results always, sometimes, or never significant at the 5% level? ALWAYS

 

d) The article contains a 95% confidence interval. Would the margin of error in a 99% confidence interval computed from the same data be less, the same, or greater? GREATER

 

e) Reaction times of a subject to a stimulus are often strongly skewed to the right because of a few slow reaction times. You wish to test H0: μ1 = μ2 where μ1 is the mean reaction time for Stimulus 1, and μ2 for Stimulus 2. You have two independent samples, 8 observations for Stimulus 1 and 10 for Stimulus 2. Which, if any, of the tests that we have covered in this course can be used to test this? NONE (t-test needs normality of population, esp. for small samples.) A more advanced course would cover non-parametric methods that could be used here.

 

 

 

 

 

 

 

 

 

Excel Problems

 

1. Here are the number of pieces of mail received at a school office for 36 days.

123     80     52    112    118     95
151    100     66    143    110    115
 70     78    103     92    118    131
115    128    135    100     75    105
 90     72    138     93    106     59
 97    130     76     88     60     85

 

(1)   Draw a histogram with the given bin range.

  

Bin Range: 65, 80, 95, 110, 125, 140, 155

 

(2)  Compute the five-number summary, the mean and standard deviation.

 

Answers:

(1)

 

(2)

Min       52
Q1        79
Median    100
Q3        118
Max       151

 

 

Mean    100.25
SD      25.5392
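
The summary statistics for the 36 mail counts can be recomputed with the standard library. This sketch takes the quartiles as medians of the lower and upper halves of the sorted data, which is an assumption about the method; Excel's QUARTILE function interpolates and can give slightly different quartiles. The helper name five_number_summary is illustrative.

```python
from statistics import mean, median, stdev

mail = [123, 80, 52, 112, 118, 95, 151, 100, 66, 143, 110, 115,
        70, 78, 103, 92, 118, 131, 115, 128, 135, 100, 75, 105,
        90, 72, 138, 93, 106, 59, 97, 130, 76, 88, 60, 85]

def five_number_summary(values):
    """Min, Q1, median, Q3, max with quartiles as medians of the halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]      # for odd n the middle value is left out
    upper = data[-half:]
    return data[0], median(lower), median(data), median(upper), data[-1]

print("five-number summary:", five_number_summary(mail))
print("mean:", mean(mail))
print("sample SD:", round(stdev(mail), 4))
```

Checking that len(mail) is 36 before reading off the results guards against the most common spreadsheet slip, a range that misses a row.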

 

 

 

 

2. The table shows the number of live births per 1000 women aged 15-44 years in the United States, starting in 1965.

(1)   Make a scatter plot with trend line on it.

(2)   Find the equation of the regression line, and R².

(3)   Predict what the birth rate will be in 2010.

 
 


Year    Rate
1965    19.4
1970    18.4
1975    14.8
1980    15.9
1985    15.6
1990    16.4
1995    14.8
2000    14.4
2005    14.0

 

Answer:

 

 

 

 

 

3. The data below show the sugar content (as a percentage of weight) of two different brands of cereals. Do these data provide evidence that the two brands differ in sugar content?

Brand A    Brand B
40.3       40
45.7       42.2
43.3       36.5
43         41.4
44.2       38.4
44         41.1
44         40.3
37.8       40.9
45.9       39.8
44.1       41
43.7       41.2

 

Answer:

H0: μ1 = μ2

HA: μ1 ≠ μ2

2-sample t-test (unequal variances, i.e. not pooled)

t Stat                 3.540701414
P(T<=t) two-tail       0.002335713

Strong evidence that the two brands differ in sugar content. Reject H0.

 

 

 

 

4. One Thursday, researchers gave students a set of 50 new Spanish vocabulary words to memorize. On Friday the students took a vocabulary test. On the following Monday, they were retested without advance warning. Both sets of test scores of the 25 students are shown below. Are the scores on Friday and the following Monday different?

Friday    Monday
42        36
44        44
45        46
48        38
44        40
43        38
41        37
35        31
43        32
48        37
43        41
45        32
47        44
50        47
34        34
38        31
43        40
39        41
46        32
37        36
40        31
41        32
48        39
37        31
36        41

 

Answer:

H0: μ = 0

HA: μ ≠ 0

Paired t-test: t-test on the differences Friday - Monday.

t Stat                 5.166581919
P(T<=t) two-tail       2.72715E-05

Very strong evidence that the two scores differ. Reject H0.

 

 

 

Stat 101 Final Exam     Thulin - 2009 Autumn

 

 

 

A Stat 101 class was polled, and the table below records sex, height, and shoe size for the students. Use it to answer questions 1-4.

Sex    Height    Shoe
F      60        6
F      62        5.5
F      62        6
F      62        7
F      64        9
F      64        5.5
F      64        6.5
F      65        7
F      66        9
F      67        9
F      68        9
F      69        10
M      67        12
M      67        10
M      67        10.5
M      69        8
M      71        10.5
M      74        11

 

 

1. Draw a histogram for Height of females only with the given bins.

 

60-61    62-63    64-65    66-67    68-69

 

 

2.  Draw side by side boxplots for the female heights and male heights.

 

3. Do these data support the assertion that average height of male Stat 101 students is greater than that of female Stat 101 students? Perform the appropriate hypothesis test, stating the null and alternative hypotheses, and report the t and P values.

 

4. Do linear regression for the males only, with height as the explanatory variable and shoe size as the response. Find:

a) a scatterplot with regression line

b) the equation of the regression line

c) the proportion of variation in shoe size explained by the regression line

d) the predicted shoe size for a height of 70 inches.

 

5. Five randomly chosen subjects endured the UltraFast diet for 4 weeks. The before and after weights were:

 

Before    160    200    210    150    140
After     150    199    190    160    125

 

Do these data support the efficacy of the diet? Perform the appropriate hypothesis test, stating the null and alternative hypotheses, and report the t and P values.

 

6. Suppose that scores on an exam follow the normal model with mean 70 and standard deviation 10. Determine the probability that:

a)      a randomly chosen examinee has a score between 80 and 90

b)      4 randomly chosen examinees have a mean score between 80 and 90.

 

7. A survey of 18 Stat 101 students revealed that 7 were politically moderate. Assume this was a random sample of Stat 101 students. Construct a 90% confidence interval for the proportion of moderate Stat 101 students. Be sure to check the conditions, and if one is not satisfied, use a method that does not require it. Hint: plus 4.

8. Researchers are interested in whether there is a difference between men and women in response to a certain drug. The summary statistics are as shown in the table. Formulate and perform the appropriate hypothesis test. Also give a 95% confidence interval for the difference of the means.

Group    n      mean    s
women    153    5.41    3.65
men      127    7.20    4.02

 

9. Answer each of the following questions. (No explanation is needed—just a short answer.)

a) You are reading an article in your field that reports several statistical analyses. The article says that the P-value for a significance test is 0.005. Is this result significant at the 5% significance level?

b) Is the result with P-value 0.005 significant at the 1% significance level?

c) For another significance test, the article says only that the result was significant at the 10% level. Are such results always, sometimes, or never significant at the 5% level?

d) The article contains a 98% confidence interval. Would the margin of error in a 90% confidence interval computed from the same data be less, the same, or greater?

e) A 95% confidence interval is computed from a given sample. Additional data is gathered so that the new sample is 4 times the size of the old. The sample standard deviation of the new sample turns out to be very close to that of the original. A new 95% confidence interval is computed from the new sample. The new interval will be about ___ as wide as the old one.

 

10. A poll indicated the following support for candidate X in various parts of a city. Is there evidence that support for candidate X differs across parts of the city at the a) .05 b) .01 significance level? Report the degrees of freedom, chi-square, and P values.

 

 

              N     E     W     S
Support       22    15    11    23
Oppose        15    17    14    29
No opinion    11    22    19    13