Final Project

The Final Project is due no later than the end of Unit 9.

These are the Final Project questions:

Problem 1)

A book inventory record contains the following information:

a) Title: More Mysteries

b) Author: Roger Mortimer

c) Date of publication: 1998

d) List price: $25.00

e) Number in stock: 6

For the information (a) to (e) list the highest level of measurement as ratio, interval, ordinal, or nominal.

Answer:

a) Nominal

b) Nominal

c) Interval

d) Ratio

e) Ratio

Problem 2)

What technique for gathering data (sampling, experiment, simulation or census) do you think was used in each of the following studies?

a) A computer program was used to model global weather patterns and to produce long-range weather forecasts for a rural agricultural region.

b) A random sample of 1,000 residents of a major metropolitan area was surveyed to determine the level of support for a new sports complex among all residents of the area.

c) To determine the effect of a new fertilizer on productivity of tomato plants one group of plants is treated with the new fertilizer while a second group is grown without such treatment. The number of ripe tomatoes produced by each group is counted.

d) A study was done regarding the number of home runs scored by major league baseball teams playing at altitudes over 5,000 feet. Data for all major league baseball games played at this altitude was used in the study.

Answer:

a) Simulation

b) Sampling

c) Experiment

d) Census

Problem 3)

To determine monthly rental prices of apartment units in the San Francisco Bay area, samples were constructed in the following ways. Identify the technique used to produce each sample (cluster, convenience, random, stratified, systematic):

a) Number all the units in the area and use a random number table to select the apartments to include in the sample.

b) Classify the apartment units according to the number of bedrooms and then take a random sample from each of the classes.

c) Classify the apartments according to zip code and take a random sample from each of the zip code regions.

d) Look in the newspaper and choose the first apartments you find that list rents.

Answer:

a) Random

b) Stratified

c) Cluster

d) Convenience

Problem 4)

The golf scores for the 20 members of a country club were as follows:

81, 76, 107, 95, 119, 92, 83, 74, 108, 88

95, 74, 83, 76, 97, 82, 79, 91, 93, 89

Create a relative frequency histogram using ten-point intervals to show the distribution of the scores.

Answer:

Minimum score = 74; Maximum score = 119; Number of classes = 10

Class Width = (119 – 74)/10 = 4.5, rounded up to 5

Therefore, following is the frequency distribution table for the given data:

Class Limits    Class Boundaries    Midpoint    Frequency (f)    Relative Frequency (f/n)
72 – 76         71.5 – 76.5         74          4                0.20
77 – 81         76.5 – 81.5         79          2                0.10
82 – 86         81.5 – 86.5         84          3                0.15
87 – 91         86.5 – 91.5         89          3                0.15
92 – 96         91.5 – 96.5         94          4                0.20
97 – 101        96.5 – 101.5        99          1                0.05
102 – 106       101.5 – 106.5       104         0                0.00
107 – 111       106.5 – 111.5       109         2                0.10
112 – 116       111.5 – 116.5       114         0                0.00
117 – 121       116.5 – 121.5       119         1                0.05
Total                                           n = 20           1.00
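The class counts in the table can be checked with a short script. This is only a sketch: the helper name relative_frequencies and its parameters are illustrative and not part of the assignment.

```python
# Golf scores copied from the problem statement.
scores = [81, 76, 107, 95, 119, 92, 83, 74, 108, 88,
          95, 74, 83, 76, 97, 82, 79, 91, 93, 89]


def relative_frequencies(data, start, width, n_classes):
    """Return (lower, upper, frequency, relative frequency) for each class."""
    rows = []
    for k in range(n_classes):
        lower = start + k * width
        upper = lower + width - 1  # inclusive upper class limit
        f = sum(lower <= x <= upper for x in data)
        rows.append((lower, upper, f, f / len(data)))
    return rows


# Classes of width 5 starting at 72, matching the class limits in the table.
table = relative_frequencies(scores, start=72, width=5, n_classes=10)
for lower, upper, f, rel in table:
    print(f"{lower:3d} - {upper:3d}   f = {f}   f/n = {rel:.2f}")
```

Calling relative_frequencies(scores, start=70, width=10, n_classes=5) instead groups the same scores into the ten-point intervals 70 to 79, 80 to 89, and so on, which is the grouping named in the question.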

The relative frequency histogram of the golf scores is presented below in figure 1:

Problem 5)

The data below show the average daily high temperature for Chicago, Illinois, for twelve recent spring and summer months. Construct a stem-and-leaf diagram for the data:

64, 61, 57, 68, 78, 75, 50, 83, 71, 58, 62, 80

Answer: The stem-and-leaf diagram of the average daily high temperature data is presented below:

Stem Leaves

5 0 7 8

6 1 2 4 8

7 1 5 8

8 0 3
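A stem-and-leaf diagram like the one above can be produced mechanically by splitting each value into its tens digit (stem) and units digit (leaf). The function name stem_and_leaf is an illustrative choice, and the sketch assumes two-digit whole-number data.

```python
from collections import defaultdict

# Average daily high temperatures copied from the problem statement.
temps = [64, 61, 57, 68, 78, 75, 50, 83, 71, 58, 62, 80]


def stem_and_leaf(data):
    """Map each tens digit (stem) to the sorted list of units digits (leaves)."""
    plot = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)


plot = stem_and_leaf(temps)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```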

Problem 6)

Black Hole Pizza Parlor instructs its cooks to put a “handful” of cheese on each large pizza. A random sample of six such handfuls was weighed. The weights to the nearest ounce were:

3 2 3 4 3 5

a) Find the mode, the median and the mean weight of the handfuls of cheese.

b) Find the range and the standard deviation of the weights.

Answer:

a) Mean = (3 + 2 + 3 + 4 + 3 + 5)/6 = 20/6 ≈ 3.33 ounces

Median = 3

Mode = 3

b) Range = Max – Min = 5 – 2 = 3

Sample Standard Deviation s = √[Σ(x – mean)²/(n – 1)] = √(5.333/5) ≈ 1.03 ounces
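These summary statistics can be reproduced with Python's standard statistics module. The sketch assumes the sample (n – 1) form of the standard deviation, since the six handfuls are described as a random sample.

```python
import statistics

# Weights in ounces, copied from the problem statement.
weights = [3, 2, 3, 4, 3, 5]

mean = statistics.mean(weights)
median = statistics.median(weights)
mode = statistics.mode(weights)
data_range = max(weights) - min(weights)
s = statistics.stdev(weights)  # sample standard deviation, divides by n - 1

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
print(f"range = {data_range}, s = {s:.2f}")
```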

Problem 7)

The cost of one serving of peanut butter (in cents) for a random sample of 19 jars of peanut butter was found to be:

22 27 32 26 26 19 16

26 14 21 20 21 20 17

12 32 17 9 16

a) Give the five-number summary (low, Q1, median, Q3, high)

b) Calculate the IQR.

Answer:

a) Five number summary of the given data is:

Low = 9

Q1 = 16

Median = 20

Q3 = 26

High = 32

b) IQR = Q3 – Q1 = 26 – 16 = 10
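The five-number summary can be checked with a few lines of Python. The sketch assumes the textbook convention in which Q1 and Q3 are the medians of the lower and upper halves of the ordered data, with the overall median excluded when the sample size is odd; other quartile conventions can give slightly different values. The function name five_number_summary is illustrative.

```python
import statistics

# Cost per serving in cents, copied from the problem statement.
costs = [22, 27, 32, 26, 26, 19, 16,
         26, 14, 21, 20, 21, 20, 17,
         12, 32, 17, 9, 16]


def five_number_summary(data):
    """Return (low, Q1, median, Q3, high) using the median-of-halves rule."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]    # values below the median position
    upper = xs[-half:]   # values above the median position
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])


low, q1, med, q3, high = five_number_summary(costs)
iqr = q3 - q1
print(low, q1, med, q3, high, iqr)
```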

Problem 8)

In a random sample of eight military contracts involving cost overruns, the following information was obtained: x = bid price of the contract (in millions of dollars) and y = cost of overrun (expressed as a percent of the bid price).

x     6    10     3     5     9    18    16    21

y    31    25    39    35    29    12    17     8

a) Draw the scatter diagram for this data.

b) Find the slope, b, and the intercept, a, for the least-squares line. Write the equation of the least-squares line.

c) Graph the least-squares line on your scatter diagram.

d) If an overrun contract was bid at 12 million dollars, what does the least-squares line predict for the cost of overrun (as a percent of bid price)?

Answer:

a) The scatter diagram between the bid price of the contract (in million dollars) and the cost overrun (% of the bid price) is given below in figure 2:

b) Equation of the best fit line is y = 42.99 – 1.6809x

Intercept a = 42.99 and Slope b = -1.6809

c) The least-squares line is drawn on the scatter diagram in figure 3 below:

d) For bid price of contract x = 12 million dollars

Cost of overrun y = 42.99 – 1.6809x = 42.99 – 1.6809*12 = 22.82% of the bid price
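The slope, intercept and prediction can be verified from the usual least-squares formulas b = Σ(x – x̄)(y – ȳ)/Σ(x – x̄)² and a = ȳ – b*x̄. The following is a minimal sketch; the function name least_squares is illustrative.

```python
# Bid price (millions of dollars) and cost overrun (% of bid price).
x = [6, 10, 3, 5, 9, 18, 16, 21]
y = [31, 25, 39, 35, 29, 12, 17, 8]


def least_squares(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


a, b = least_squares(x, y)
predicted = a + b * 12  # predicted overrun for a 12 million dollar bid
print(f"a = {a:.2f}, b = {b:.4f}, prediction = {predicted:.2f}%")
```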

Problem 9)

Mary Sue wants to know if there is a connection between attendance at craft fairs and the number of exhibitors who have booths at the fair. For a random sample of seven local craft fairs, she chose a random day of the fair and recorded the number of exhibitors. In the data below, x represents the number of exhibitors and y represents the attendance in hundreds of people.

x     35     55     75     95    100    135    150

y    1.2    2.1    4.2    5.4    5.8    6.2    9.5

a) Draw the scatter diagram for the data.

b) Calculate the sample correlation coefficient, r.

c) Calculate the coefficient of determination, r².

d) What does the coefficient of determination tell you about the variation in attendance and the variation in the number of exhibitors?

Answer:

a) The scatter diagram for the given data along with the least square fit line is given below in figure 4:

b) Correlation coefficient r ≈ 0.9614

c) Coefficient of determination r² = 0.9243

d) Coefficient of determination r² = 0.9243 implies that about 92.43% of the variation in attendance (y) is explained by the variation in the number of exhibitors (x) through the least-squares line. The remaining 7.57% of the variation is unexplained.
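The correlation coefficient and coefficient of determination can be checked directly from the definition r = Σ(x – x̄)(y – ȳ)/√[Σ(x – x̄)² * Σ(y – ȳ)²]. A minimal sketch, with an illustrative function name:

```python
import math

# Number of exhibitors (x) and attendance in hundreds (y).
exhibitors = [35, 55, 75, 95, 100, 135, 150]
attendance = [1.2, 2.1, 4.2, 5.4, 5.8, 6.2, 9.5]


def correlation(xs, ys):
    """Return the sample (Pearson) correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    syy = sum((yi - y_bar) ** 2 for yi in ys)
    return sxy / math.sqrt(sxx * syy)


r = correlation(exhibitors, attendance)
r_squared = r ** 2
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```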

Problem 10)

You roll two fair dice, one red and one green.

a) What is the probability of getting a number less than 5 on both?

b) What is the probability of getting a sum of 9 on the two dice?

c) What is the probability of getting a 5 on both?

Answer:

If two fair dice are rolled, there are 36 equally likely pairs of numbers, each with probability 1/36. Let n1 and n2 be the numbers on the red and green dice, respectively.

a) A number less than 5 appears on both dice in 4×4 = 16 of these equally likely pairs.

Therefore, P(n1<5, n2<5) = P(n1<5)*P(n2<5) = (4/6)*(4/6) = 4/9

b) There are four equally likely pairs of numbers that add to 9: (3,6), (4,5), (5,4) and (6,3); therefore,

P(n1+n2 = 9) = 4/36 = 1/9

c) There is only one pair of numbers that corresponds to 5 on both dice.

Therefore, P(n1=n2=5) = 1/36
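Because the sample space has only 36 outcomes, each probability can be confirmed by listing every (red, green) pair and counting the favorable ones. This is a brute-force sketch; the helper name probability is illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (red, green) outcomes.
outcomes = list(product(range(1, 7), repeat=2))


def probability(event):
    """Probability of an event given as a predicate on a (red, green) pair."""
    favorable = sum(1 for roll in outcomes if event(roll))
    return Fraction(favorable, len(outcomes))


p_both_less_than_5 = probability(lambda roll: roll[0] < 5 and roll[1] < 5)
p_sum_is_9 = probability(lambda roll: sum(roll) == 9)
p_both_5 = probability(lambda roll: roll == (5, 5))

print(p_both_less_than_5, p_sum_is_9, p_both_5)
```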

Problem 11)

An urn contains 8 balls identical in every respect except color. There are 4 blue balls, 3 red balls, and 1 white ball.

a) If you draw one ball from the urn what is the probability that it is blue or white?

b) If you draw two balls without replacing the first one, what is the probability that the first ball is red and the second ball is white?

c) If you draw two balls without replacing the first one, what is the probability that one ball is red and the other is white?

Answer:

a) P(B or W) = P(B) + P(W) = 4/8 + 1/8 = 5/8

b) P(1st R and 2nd W) = P(1st R)*P(2nd W | 1st R) = (3/8)*(1/7) = 3/56

c) P(R and W) = P(1st R and 2nd W or 1st W and 2nd R)

= P(1st R and 2nd W) + P(1st W and 2nd R)

= (3/8)*(1/7) + (1/8)*(3/7) = 6/56 = 3/28
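The urn probabilities can also be confirmed by enumerating every ordered draw of two balls without replacement (8*7 = 56 equally likely ordered pairs). A minimal sketch, with illustrative variable names:

```python
from fractions import Fraction
from itertools import permutations

# 4 blue, 3 red and 1 white ball.
urn = ["B"] * 4 + ["R"] * 3 + ["W"]

# Part a: a single draw.
p_blue_or_white = Fraction(sum(ball in ("B", "W") for ball in urn), len(urn))

# Parts b and c: ordered pairs drawn without replacement.
draws = list(permutations(urn, 2))
p_red_then_white = Fraction(
    sum(draw == ("R", "W") for draw in draws), len(draws))
p_one_red_one_white = Fraction(
    sum(sorted(draw) == ["R", "W"] for draw in draws), len(draws))

print(p_blue_or_white, p_red_then_white, p_one_red_one_white)
```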

Problem 12)

Evaluate:

a) P6,4

b) C7,2

c) P4,4

d) C9,0

Answer:

a) P6,4 = 6!/(6 – 4)! = 6*5*4*3 = 360

b) C7,2 = 7!/(2!*5!) = (7*6)/2 = 21

c) P4,4 = 4!/0! = 24

d) C9,0 = 9!/(0!*9!) = 1
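Permutations Pn,r = n!/(n – r)! and combinations Cn,r = n!/(r!(n – r)!) are available in Python 3.8 and later as math.perm and math.comb, which gives a quick check:

```python
from math import comb, perm

p_6_4 = perm(6, 4)  # ordered selections of 4 items from 6
c_7_2 = comb(7, 2)  # unordered selections of 2 items from 7
p_4_4 = perm(4, 4)  # all orderings of 4 items
c_9_0 = comb(9, 0)  # exactly one way to choose nothing

print(p_6_4, c_7_2, p_4_4, c_9_0)
```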

Problem 13)

Laura is training for a week-long mountain cycling tour. She has 12 short hilly routes from which to choose mid-week rides.

a) How many ways can she choose 4 different rides from the list for the first week’s training if order matters?

b) How many ways can she choose 4 different rides if order does not matter?

c) If she has chosen the first week’s rides, how many ways can she choose four more different rides for the second week? Assume that order does not matter.

Answer:

a) The required number will be P12,4 = 12!/8! = 12*11*10*9 = 11,880

b) The required number will be C12,4 = 12!/(4!*8!) = 495

c) Once she has chosen the four rides for the first week, only 8 different routes are left to choose from for the second week’s rides.

The required number will be C8,4 = 8!/(4!*4!) = 70
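The three counts for Laura's routes follow the same permutation and combination formulas and can be checked as below. The variable names are illustrative.

```python
from math import comb, perm

routes = 12
rides_per_week = 4

ordered_first_week = perm(routes, rides_per_week)    # order matters
unordered_first_week = comb(routes, rides_per_week)  # order does not matter
# After 4 routes are used, 12 - 4 = 8 routes remain for the second week.
unordered_second_week = comb(routes - rides_per_week, rides_per_week)

print(ordered_first_week, unordered_first_week, unordered_second_week)
```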

Problem 14)

Identify each of the random variables as continuous or discrete.

a) Speed of an automobile

b) The number of doughnuts left in the pantry

c) The air temperature of a public park

d) The weight of a professional wrestler

e) The number of restaurant patrons

Answer:

a) Continuous

b) Discrete

c) Continuous

d) Continuous

e) Discrete

Problem 15)

Richard has just been given a ten-question multiple choice test in his history class. Each question has five answers only one of which is correct. Since Richard has not attended class recently, he does not know any of the answers. Assume that Richard guesses randomly on all ten questions.

a) Find the probability that he will answer all 10 questions correctly.

b) Find the probability that he will answer 5 or more questions correctly.

c) Find the probability that he will answer none of the questions correctly.

d) Find the probability that he will answer at least 3 questions correctly.


Answer:

The probability distribution will be a binomial distribution with

n = 10

p = 1/5; this is the probability that a random guess is correct, and

q = 1 – (1/5) = 4/5; this is the probability that a random guess is incorrect

a) P(10 correct) = C10,10*(p)^10*(q)^0 = 1*(1/5)^10*1 = 1.024*10^-7

b) P(≥ 5 correct) = C10,10*(p)^10*(q)^0 + C10,9*(p)^9*(q)^1 + C10,8*(p)^8*(q)^2 + C10,7*(p)^7*(q)^3

+ C10,6*(p)^6*(q)^4 + C10,5*(p)^5*(q)^5

= C10,10*(1/5)^10*(4/5)^0 + C10,9*(1/5)^9*(4/5)^1 + C10,8*(1/5)^8*(4/5)^2

+ C10,7*(1/5)^7*(4/5)^3 + C10,6*(1/5)^6*(4/5)^4 + C10,5*(1/5)^5*(4/5)^5

= (1 + 40 + 720 + 7680 + 53760 + 258048)*(1/5)^10

= 320249*1.024*10^-7

≈ 0.0328

c) P(none correct) = C10,0*(p)^0*(q)^10 = 1*(1/5)^0*(4/5)^10 ≈ 0.1074

d) P(≥ 3 correct) = C10,10*(p)^10*(q)^0 + C10,9*(p)^9*(q)^1 + C10,8*(p)^8*(q)^2 + C10,7*(p)^7*(q)^3

+ C10,6*(p)^6*(q)^4 + C10,5*(p)^5*(q)^5 + C10,4*(p)^4*(q)^6 + C10,3*(p)^3*(q)^7

= (1 + 40 + 720 + 7680 + 53760 + 258048 + 860160 + 1966080)*(1/5)^10

= 3146489*1.024*10^-7

≈ 0.3222

As a check, P(≥ 3 correct) = 1 – P(0) – P(1) – P(2) = 1 – 0.1074 – 0.2684 – 0.3020 = 0.3222
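The binomial sums are easy to get wrong by hand, so an exact check with rational arithmetic is useful. The sketch below evaluates P(r) = Cn,r * p^r * q^(n – r) with Fraction so that no rounding occurs; the function name binomial_pmf is illustrative.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p, r):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


n, p = 10, Fraction(1, 5)

p_all_correct = binomial_pmf(n, p, 10)
p_at_least_5 = sum(binomial_pmf(n, p, r) for r in range(5, n + 1))
p_none_correct = binomial_pmf(n, p, 0)
# Complement rule: 1 - P(0) - P(1) - P(2).
p_at_least_3 = 1 - sum(binomial_pmf(n, p, r) for r in range(3))

for label, value in [("all 10", p_all_correct), ("5 or more", p_at_least_5),
                     ("none", p_none_correct), ("3 or more", p_at_least_3)]:
    print(f"{label}: {float(value):.6g}")
```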


Problem 16)

Long-term history has shown that 65% of all elected offices in a rural county have been won by Republican candidates. This year there are 5 offices up for public election in the county. Let r be the number of public offices won by Republicans.

a) Find P(r) for r=0,1,2,3,4, and 5

b) Make a histogram for the r probability distribution.

c) What is the expected number of Republicans who will win office in the coming election?

d) What is the standard deviation of r?

Answer:

The probability distribution will be a binomial distribution with

n = 5

p = 0.65; this is the probability that a Republican candidate wins an office, and

q = 1 – 0.65 = 0.35; this is the probability that a Republican candidate does not win an office

a) P(r = 0) = C5,0 (p)^0 (q)^5 = 1*(0.65)^0*(0.35)^5 = 1*1*0.005252 = 5.252*10^-3

P(r = 1) = C5,1 (p)^1 (q)^4 = 5*(0.65)^1*(0.35)^4 = 4.877*10^-2

P(r = 2) = C5,2 (p)^2 (q)^3 = 10*(0.65)^2*(0.35)^3 = 1.8115*10^-1

P(r = 3) = C5,3 (p)^3 (q)^2 = 10*(0.65)^3*(0.35)^2 = 3.3642*10^-1

P(r = 4) = C5,4 (p)^4 (q)^1 = 5*(0.65)^4*(0.35)^1 = 3.1239*10^-1

P(r = 5) = C5,5 (p)^5 (q)^0 = 1*(0.65)^5*(0.35)^0 = 1.1603*10^-1

b) The probability distribution is presented in the following histogram in figure 5.

c) Expected number of Republicans who will win office in the coming election = np = 5*0.65 = 3.25

Rounded to the nearest whole number, Republicans are expected to win about 3 of the 5 offices.

d) Standard Deviation = √(npq) = √(5*0.65*0.35) = √1.1375 ≈ 1.07
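The full distribution, its mean and its standard deviation can be tabulated in a few lines, which also confirms that the six probabilities add to 1. A minimal sketch with illustrative variable names:

```python
from math import comb, sqrt

n, p = 5, 0.65
q = 1 - p

# P(r) for r = 0, 1, ..., 5.
pmf = [comb(n, r) * p**r * q ** (n - r) for r in range(n + 1)]

expected = n * p          # mean of a binomial distribution
sigma = sqrt(n * p * q)   # standard deviation of a binomial distribution

for r, prob in enumerate(pmf):
    print(f"P(r = {r}) = {prob:.4f}")
print(f"expected = {expected:.2f}, sigma = {sigma:.4f}")
```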


Problem 17)

Lewis earned 85 on his biology midterm and 81 on his history midterm. In the biology class the mean score was 79 with standard deviation 5. In the history class the mean score was 76 with standard deviation 3.

a) Convert each midterm score to a standard z score.

b) On which test did he do better compared to the rest of the class?

Answer:

a) The z-score is defined as z = (x – μ)/σ

Therefore, for biology: z = (85 – 79)/5 = 1.20

For history: z = (81 – 76)/3 ≈ 1.67

b) The z-score expresses an individual’s score relative to the class mean, in units of the class standard deviation. Because Lewis’s z-score is higher in history (1.67) than in biology (1.20), he did better in history than in biology relative to the rest of the class.
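The comparison can be written as a small calculation using the standard definition z = (x – μ)/σ. The function name z_score is illustrative.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above the mean."""
    return (x - mean) / std_dev


z_biology = z_score(85, mean=79, std_dev=5)
z_history = z_score(81, mean=76, std_dev=3)
better_subject = "history" if z_history > z_biology else "biology"

print(f"biology z = {z_biology:.2f}, history z = {z_history:.2f}")
print("better relative performance:", better_subject)
```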


Problem 18)

Let x be a random variable that represents the length of time it takes a student to write a term paper for Dr. Adam’s Sociology class. After interviewing many students, it was found that x has an approximately normal distribution with mean μ = 6.8 hours and standard deviation σ = 2.1 hours.

Convert each of the following x intervals to standardized z intervals:

a) x ? 7.5

b) 5 ≤ x ≤ 8

c) x ? 4

Convert each of the following z intervals to raw score x intervals:

d) z ? -2

e) 0 ≤ z ≤ 2

f) z ? 3

Answer:

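Because z = (x – μ)/σ, and therefore x = μ + z*σ, each endpoint converts as follows. Since σ is positive, every inequality keeps its original direction.

a) x = 7.5 corresponds to z = (7.5 – 6.8)/2.1 ≈ 0.33

b) x = 5 corresponds to z = (5 – 6.8)/2.1 ≈ -0.86 and x = 8 corresponds to z = (8 – 6.8)/2.1 ≈ 0.57, so the z interval runs from -0.86 to 0.57

c) x = 4 corresponds to z = (4 – 6.8)/2.1 ≈ -1.33

d) z = -2 corresponds to x = 6.8 + (-2)*2.1 = 2.6 hours

e) z = 0 corresponds to x = 6.8 hours and z = 2 corresponds to x = 6.8 + 2*2.1 = 11.0 hours, so the x interval runs from 6.8 to 11.0 hours

f) z = 3 corresponds to x = 6.8 + 3*2.1 = 13.1 hours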

Problem 19)

Researchers at a pharmaceutical company have found that the effective time duration of a safe dosage of a pain relief drug is normally distributed with mean 2 hours and standard deviation 0.3 hour. For a patient selected at random:

a) What is the probability that the drug will be effective for 2 hours or less?

b) What is the probability that the drug will be effective for 1 hour or less?

c) What is the probability that the drug will be effective for 3 hours or more?

Answer:

a) For x = 2 hrs, z = (x – μ)/σ = (2 – 2)/0.3 = 0

Therefore, x < 2 hrs implies z < 0

This corresponds to 50% area under the normal curve.

Therefore, desired probability is 0.5 or 50%.

b) For x = 1 hr, z = (x – μ)/σ = (1 – 2)/0.3 ≈ -3.33

Therefore, x < 1 hr implies z < -3.33

This corresponds to 0.043% area under the normal curve.

Therefore, desired probability is 0.00043 or 0.043%.

c) For x = 3 hrs, z = (x – μ)/σ = (3 – 2)/0.3 ≈ 3.33

Therefore, x > 3 hrs implies z > 3.33

This corresponds to 0.043% area under the normal curve.

Therefore, desired probability is 0.00043 or 0.043%.
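The table look-ups can be cross-checked with statistics.NormalDist from the Python standard library (Python 3.8 or later). Because the code does not round z to two decimals before looking up the area, its tail probabilities can differ from the table values in the last digit.

```python
from statistics import NormalDist

# Effective duration in hours: mean 2, standard deviation 0.3.
duration = NormalDist(mu=2, sigma=0.3)

p_two_or_less = duration.cdf(2)
p_one_or_less = duration.cdf(1)
p_three_or_more = 1 - duration.cdf(3)

print(f"P(x <= 2) = {p_two_or_less:.4f}")
print(f"P(x <= 1) = {p_one_or_less:.5f}")
print(f"P(x >= 3) = {p_three_or_more:.5f}")
```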

Problem 20)

Roger has read a report that the weights of adult male Siberian tigers have a distribution which is approximately normal with mean μ = 390 lb and standard deviation σ = 65 lb.

a) Find the probability that an individual male Siberian tiger will weigh more than 450 lb.

b) Find the probability that a random sample of 4 male Siberian tigers will have a sample mean weight more than 450 lb.

Answer:

a) For x = 450 lb, z = (x – μ)/σ = (450 – 390)/65 ≈ 0.92

Therefore, x > 450 lb implies z > 0.92

This corresponds to 17.88% area under the normal curve.

Therefore, desired probability is 0.1788 or 17.88%.

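b) For a random sample of n = 4 tigers, the sample mean x̄ is normally distributed with mean 390 lb and standard deviation σ/√n = 65/√4 = 32.5 lb.

For x̄ = 450 lb, z = (450 – 390)/32.5 ≈ 1.85

Therefore, x̄ > 450 lb implies z > 1.85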

This corresponds to 3.22% area under the normal curve.

Therefore, desired probability is 0.0322 or 3.22%.
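The difference between the distribution of one tiger's weight and the distribution of a sample mean can be seen by building both normal distributions explicitly. This sketch uses statistics.NormalDist; its results differ slightly from the table values because z is not rounded to two decimals first.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 390, 65, 4

# One tiger: standard deviation sigma.
single_tiger = NormalDist(mu, sigma)
p_single = 1 - single_tiger.cdf(450)

# Mean of n tigers: standard deviation sigma / sqrt(n).
sample_mean = NormalDist(mu, sigma / sqrt(n))
p_mean = 1 - sample_mean.cdf(450)

print(f"P(x > 450) = {p_single:.4f}")
print(f"P(sample mean > 450) = {p_mean:.4f}")
```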

Problem 21)

A biologist has found the average weight of 12 randomly selected mud turtles to be 8.7 lb with standard deviation 3.6 lb. Find a 90% confidence interval for the population mean weight of all such turtles.

Answer:

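Given: sample mean x̄ = 8.7 lb; sample standard deviation s = 3.6 lb and

Sample size n = 12

Because the population standard deviation is unknown and the sample is small (n < 30), the Student’s t distribution with df = n – 1 = 11 is used, assuming the weights are approximately normally distributed. For a 90% confidence interval, t = 1.796

Therefore, the 90% confidence interval for the population mean is given by x̄ ± t*s/√n = 8.7 ± 1.796*3.6/√12 = 8.7 ± 1.87

The desired interval is 6.83 < μ < 10.57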
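A small-sample confidence interval for a mean has the form x̄ ± t*s/√n. The sketch below assumes the Student's t critical value 1.796 (df = 11, 90% confidence) read from a standard t table, because the Python standard library has no t distribution; the function and constant names are illustrative.

```python
from math import sqrt


def t_confidence_interval(x_bar, s, n, t_critical):
    """Return (lower, upper) limits of x_bar +/- t_critical * s / sqrt(n)."""
    margin = t_critical * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Assumed table value: t for df = 11 and 90% confidence.
T_CRITICAL_DF11_90 = 1.796

lower, upper = t_confidence_interval(8.7, 3.6, 12, T_CRITICAL_DF11_90)
print(f"{lower:.2f} < mu < {upper:.2f}")
```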

Problem 22)

How tall are college hockey players? The average height has been 68.3 inches. A random sample of 14 hockey players gave a mean height of 69.1 inches. We may assume that x has a normal distribution with σ = 0.9 inch. Does this indicate that the population mean height is different from 68.3 inches? Use 5% level of significance.

a) State the null and the alternate hypothesis.

b) Identify the sampling distribution to be used: the standard normal distribution or the Student’s t distribution. Find the critical value(s).

c) Compute the z or t value of the sample test statistic.

d) Find the P value or an interval containing the P value for the sample test statistic.

e) Based on your answers to a through d, decide whether or not to reject the null hypothesis at the given significance level. Explain your conclusion in the context of the problem.

Answer:

a) Null Hypothesis is H0: μ = 68.3

Alternate Hypothesis is H1: μ ≠ 68.3

b) Because x is normally distributed and the population standard deviation σ = 0.9 inch is known, the standard normal distribution is used as the sampling distribution, even though the sample size n = 14 is small.

Level of significance α = 5% = 0.05

Critical values for the two-tailed test are z = ±1.96

c) z = (x̄ – μ)/(σ/√n) = (69.1 – 68.3)/(0.9/√14) ≈ 3.33

d) P-value = 2*P(z > 3.33) = 2*0.0004 = 0.0008

e) Based on the findings from a) through d): because |z| = 3.33 > 1.96 (equivalently, the P-value 0.0008 is less than α = 0.05),

the null hypothesis is rejected at the 5% level of significance. The sample data indicate that the population mean height of college hockey players is different from 68.3 inches.
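Because the problem states that x is normally distributed with a known σ = 0.9 inch, the test can also be sketched as a two-tailed z test using only the standard library. The variable names are illustrative, and the P-value differs slightly from a table look-up because z is not rounded first.

```python
from math import sqrt
from statistics import NormalDist

mu_0, x_bar, sigma, n, alpha = 68.3, 69.1, 0.9, 14, 0.05

standard_normal = NormalDist()  # mean 0, standard deviation 1

z = (x_bar - mu_0) / (sigma / sqrt(n))
z_critical = standard_normal.inv_cdf(1 - alpha / 2)  # two-tailed cutoff
p_value = 2 * (1 - standard_normal.cdf(abs(z)))
reject_null = abs(z) > z_critical

print(f"z = {z:.2f}, critical = +/-{z_critical:.2f}, P-value = {p_value:.4f}")
print("reject H0" if reject_null else "do not reject H0")
```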

Problem 23)

Recently the national average yield on municipal bonds has been μ = 4.19%. A random sample of 16 Arizona municipal bonds gave an average yield of 5.11% with a sample standard deviation s = 1.15%. Does this indicate that the population mean yield for all Arizona municipal bonds is greater than the national average? Use α = 0.05. Assume x is normally distributed.

a) State the null and the alternate hypothesis.

b) Identify the sampling distribution to be used: the standard normal distribution or the Student’s t distribution. Find the critical value(s).

c) Compute the z or t value of the sample test statistic.

d) Find the P value or an interval containing the P value for the sample test statistic.

e) Based on your answers to a through d, decide whether or not to reject the null hypothesis at the given significance level. Explain your conclusion in the context of the problem.

Answer:

a) Null Hypothesis is H0: μ = 4.19

Alternate Hypothesis is H1: μ > 4.19

b) Because the population standard deviation is unknown and the sample size n = 16 is small (n < 30), the Student’s t distribution is used as the sampling distribution.

Degrees of freedom df = n – 1 = 16 – 1 = 15

Level of significance α = 5% = 0.05

Critical value for the right-tailed test: t(df = 15, α = 0.05) = 1.7531

c) t = (x̄ – μ)/(s/√n) = (5.11 – 4.19)/(1.15/√16) = 0.92/0.2875 = 3.20

d) For t = 3.20 with df = 15, the one-tailed P-value lies in the interval 0.0005 < P-value < 0.005, so it is less than 0.01.

e) Based on the findings from a) through d): because t = 3.20 > 1.7531 (equivalently, the P-value is less than α = 0.05),

the null hypothesis is rejected at the 5% level of significance. The sample data indicate that the population mean yield for all Arizona municipal bonds is greater than the national average.
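The test statistic and the decision can be reproduced with a short calculation. The Python standard library has no Student's t distribution, so this sketch takes the critical value 1.7531 (df = 15, one tail, α = 0.05) from the answer above as an assumed table value rather than computing it.

```python
from math import sqrt

mu_0, x_bar, s, n = 4.19, 5.11, 1.15, 16

# Assumed table value: one-tailed t critical value for df = 15, alpha = 0.05.
T_CRITICAL = 1.7531

standard_error = s / sqrt(n)
t = (x_bar - mu_0) / standard_error
reject_null = t > T_CRITICAL  # right-tailed test

print(f"t = {t:.2f}, critical = {T_CRITICAL}")
print("reject H0" if reject_null else "do not reject H0")
```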