Module # 5 assignment

Issaiah Jennings

Question 1:

The director of manufacturing at a cookie company needs to determine whether a new machine produces a particular type of cookie according to the manufacturer's specifications, which indicate that the cookies should have a mean breaking strength of 70 pounds and a standard deviation of 3.5 pounds. A sample of 49 cookies reveals a sample mean breaking strength of 69.1 pounds.

A. State the null and alternative hypotheses: H₀: μ = 70, H₁: μ ≠ 70

B. Is there evidence that the machine is not meeting the manufacturer's specifications for average strength? Use a 0.05 level of significance

z = (x̄ − μ₀) / (σ / √n) = (69.1 − 70) / (3.5 / √49) = −0.9 / 0.5 = −1.8

Population mean (μ₀) = 70
Sample mean (x̄) = 69.1
Sample size (n) = 49
Standard deviation (σ) = 3.5
Significance level (α) = 0.05

> z <- -1.8

> p_value <- 2 * pnorm(-abs(z))

> p_value

[1] 0.07186064

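Because the p-value (0.0719) is greater than the significance level (0.05), we fail to reject the null hypothesis. There is insufficient evidence that the machine is not meeting the manufacturer's specification for average breaking strength.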
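The z statistic and the decision in (B) can also be computed from the given values instead of typing -1.8 by hand. The following is a minimal sketch; the variable names (`mu0`, `xbar`, `sigma`, `n`, `alpha`, `z_crit`, `reject`) are chosen here for illustration and are not part of the assignment.

```r
# Given values from the problem statement
mu0   <- 70    # hypothesized population mean
xbar  <- 69.1  # sample mean
sigma <- 3.5   # population standard deviation
n     <- 49    # sample size
alpha <- 0.05  # significance level

# One-sample z statistic: (xbar - mu0) / (sigma / sqrt(n))
z <- (xbar - mu0) / (sigma / sqrt(n))

# Two-sided p-value and the matching two-sided critical value
p_value <- 2 * pnorm(-abs(z))
z_crit  <- qnorm(1 - alpha / 2)

# Reject H0 only if |z| exceeds the critical value
reject <- abs(z) > z_crit

z
p_value
z_crit
reject
```

Comparing |z| with the critical value and comparing the p-value with α are equivalent decision rules for this test, so both lead to the same conclusion.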

C. Compute the p value and interpret its meaning:

> z <- -1.8

> p_value <- 2 * pnorm(-abs(z))

> p_value

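[1] 0.07186064

Interpretation: if the true mean breaking strength were 70 pounds, there would be about a 7.2% chance of observing a sample mean at least as far from 70 as 69.1. Because this probability exceeds 0.05, the result is not statistically significant.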

D. What would be your answer in (B) if the standard deviation were specified as 1.75 pounds?

z = (x̄ − μ₀) / (σ / √n) = (69.1 − 70) / (1.75 / √49) = −0.9 / 0.25 = −3.6

> z <- -3.6

> p_value <- 2 * pnorm(-abs(z))

> p_value

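[1] 0.0003182172

Because the p-value (0.0003) is less than 0.05, we reject the null hypothesis. With σ = 1.75 there is evidence that the machine is not meeting the specification.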

E. What would be your answer in (B) if the sample mean were 69 pounds and the standard deviation is 3.5 pounds?

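z = (x̄ − μ₀) / (σ / √n) = (69 − 70) / (3.5 / √49) = −1 / 0.5 = −2

> z <- -2

> p_value <- 2 * pnorm(-abs(z))

> p_value

[1] 0.04550026

Because the p-value (0.0455) is less than 0.05, we reject the null hypothesis. With a sample mean of 69 pounds there is evidence that the machine is not meeting the specification.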
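Parts (B), (D), and (E) repeat the same calculation with different inputs, so they can be checked side by side. The sketch below uses a hypothetical helper, `z_test`, written for this illustration (it is not a base R function). It assumes a two-sided test with known σ, as in the problem.

```r
# Hypothetical helper: two-sided one-sample z test with known sigma
z_test <- function(xbar, mu0, sigma, n, alpha = 0.05) {
  z <- (xbar - mu0) / (sigma / sqrt(n))
  p <- 2 * pnorm(-abs(z))
  data.frame(z = z, p_value = p, reject_H0 = p < alpha)
}

# One row per scenario from the assignment
results <- rbind(
  B = z_test(xbar = 69.1, mu0 = 70, sigma = 3.5,  n = 49),
  D = z_test(xbar = 69.1, mu0 = 70, sigma = 1.75, n = 49),
  E = z_test(xbar = 69.0, mu0 = 70, sigma = 3.5,  n = 49)
)
results
```

With these inputs, only scenario (B) fails to reject H₀ at α = 0.05. Halving σ, or moving the sample mean slightly further from 70, is enough to push the p-value below 0.05.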

Question 2:

If x̅ = 85, σ = standard deviation = 8, and n=64, set up 95% confidence interval estimate of the population mean μ.

Sample mean (x̄) = 85
Standard deviation (σ) = 8
Sample size (n) = 64
Confidence level = 95%

> xbar <- 85

> sigma <- 8

> n <- 64

> error <- qnorm(0.975) * (sigma / sqrt(n))

> lower <- xbar - error

> upper <- xbar + error

> c(lower, upper)

[1] 83.04004 86.95996

The 95% confidence interval estimate of μ is therefore approximately 83.04 to 86.96.
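The interval calculation can be wrapped in a small function to show how the confidence level changes the width. This is a sketch only: `z_interval` is a hypothetical helper name, and the 99% level is an extra illustration that the question does not ask for.

```r
# Hypothetical helper: z-based confidence interval for a mean with known sigma
z_interval <- function(xbar, sigma, n, conf_level = 0.95) {
  alpha  <- 1 - conf_level
  margin <- qnorm(1 - alpha / 2) * sigma / sqrt(n)  # margin of error
  c(lower = xbar - margin, upper = xbar + margin)
}

ci95 <- z_interval(xbar = 85, sigma = 8, n = 64)                     # the question's interval
ci99 <- z_interval(xbar = 85, sigma = 8, n = 64, conf_level = 0.99)  # wider, for comparison
ci95
ci99
```

A higher confidence level uses a larger quantile from `qnorm`, so the 99% interval is wider than the 95% interval for the same data.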

Question 3, using Correlation Analysis

The correlation coefficient analysis formula:

r = [nΣxy − (Σx)(Σy)] / √([nΣx² − (Σx)²][nΣy² − (Σy)²])

r: The correlation coefficient is denoted by the letter r.

n: Number of values. If we had five people we were calculating the correlation coefficient for, the value of n would be 5.

x: This is the first data variable.

y: This is the second data variable.

Σ: The Sigma symbol (Greek) tells us to calculate the “sum of” whatever is tagged next to it.
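The formula can be checked term by term against R's built-in `cor()`. The sketch below uses the Girls' goals and grades values that appear later in this document as `x` and `y`; the names `numerator`, `denominator`, `r_manual`, and `r_builtin` are illustrative.

```r
# Two variables with n = 3 observations each
x <- c(4, 5, 6)     # goals
y <- c(49, 50, 69)  # grades
n <- length(x)

# Numerator: n * sum(xy) - sum(x) * sum(y)
numerator <- n * sum(x * y) - sum(x) * sum(y)

# Denominator: sqrt([n * sum(x^2) - sum(x)^2] * [n * sum(y^2) - sum(y)^2])
denominator <- sqrt((n * sum(x^2) - sum(x)^2) * (n * sum(y^2) - sum(y)^2))

r_manual  <- numerator / denominator
r_builtin <- cor(x, y)  # Pearson by default

c(manual = r_manual, builtin = r_builtin)
```

Both values agree, which confirms that the bracketed numerator is divided by the whole square-root term.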

Using the dataset downloadable below (i.e. the data you will use to create the vectors are located in the download link below), complete these tasks in Rstudio:

x1 <- c(your data) e.g. girls_goals <- c(data1, data2, data3)

x2 <- c(your data) e.g. girls_time<- c(data1, data2, data3)

y1<- c(your data) e.g. boys_goals ..........

y2<- c(your data) e.g. boys_time............

Note: recall from the past couple of classes/assignments that x = c(data1, data2, data3) creates a vector of 3 data points and assigns it to the variable x.

Merge all in a dataframe

df<-data.frame(x1, x2, y1, y2)

Plot:

cor(df)

cor(df,method="pearson") #As pearson correlation

cor(df, method="spearman") #As spearman correlation

Use corrgram( ) to plot correlograms. (Note: you may have to install the corrgram package and call the library)
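One possible way to draw the correlogram is sketched below. It assumes the `corrgram` package is available and, for brevity, uses only the Girls' values from the answer in this document; the data frame name `girls` and the choice of panel functions are illustrative, not required by the assignment.

```r
# Install once if needed: install.packages("corrgram")

# Girls' data as a small data frame
girls <- data.frame(
  Goals   = c(4, 5, 6),
  Grades  = c(49, 50, 69),
  Popular = c(24, 36, 38),
  Time    = c(19, 22, 28)
)

if (requireNamespace("corrgram", quietly = TRUE)) {
  library(corrgram)
  corrgram(girls,
           lower.panel = panel.shade,  # shaded cells below the diagonal
           upper.panel = panel.pie,    # pie glyphs above the diagonal
           main = "Girls' data (Pearson correlations)")
} else {
  message("Package 'corrgram' is not installed; skipping the plot.")
}
```

With only three observations per variable, every correlation is strongly influenced by a single point, so the plot is best read as a check on the `cor()` output rather than as evidence of a relationship.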

> # Girls' data

> x1 <- c(4, 5, 6) # Goals

> x2 <- c(49, 50, 69) # Grades

> y1 <- c(24, 36, 38) # Popular

> y2 <- c(19, 22, 28) # Time spent on assignment

> # Boys' data

> x3 <- c(4, 5, 6) # Goals

> x4 <- c(46.1, 54.2, 67.7) # Grades

> y3 <- c(26.9, 31.6, 39.5) # Popular

> y4 <- c(18.9, 22.2, 27.8) # Time spent on assignment

> # Merge vectors into a data frame

> df <- data.frame(Girls_Goals = x1, Girls_Grades = x2, Girls_Popular = y1, Girls_Time = y2,

+ Boys_Goals = x3, Boys_Grades = x4, Boys_Popular = y3, Boys_Time = y4)

> # Calculate correlations

> cor(df) # Default Pearson correlation

              Girls_Goals Girls_Grades Girls_Popular Girls_Time Boys_Goals Boys_Grades Boys_Popular Boys_Time
Girls_Goals     1.0000000    0.8873565     0.9244735  0.9819805  1.0000000   0.9897433    0.9894203 0.9890517
Girls_Grades    0.8873565    1.0000000     0.6445509  0.9585035  0.8873565   0.9441243    0.9448614 0.9456833
Girls_Popular   0.9244735    0.6445509     1.0000000  0.8357661  0.9244735   0.8605276    0.8593826 0.8580918
Girls_Time      0.9819805    0.9585035     0.8357661  1.0000000  0.9819805   0.9989061    0.9990085 0.9991175
Boys_Goals      1.0000000    0.8873565     0.9244735  0.9819805  1.0000000   0.9897433    0.9894203 0.9890517
Boys_Grades     0.9897433    0.9441243     0.8605276  0.9989061  0.9897433   1.0000000    0.9999975 0.9999887
Boys_Popular    0.9894203    0.9448614     0.8593826  0.9990085  0.9894203   0.9999975    1.0000000 0.9999968
Boys_Time       0.9890517    0.9456833     0.8580918  0.9991175  0.9890517   0.9999887    0.9999968 1.0000000

> cor(df, method = "pearson") # Pearson correlation