Module # 7 (includes assignment)

 Issaiah Jennings 



Questions

1. In this segment of the assignment, we will use the following regression equation: Y = a + bX + e
Where:
Y is the value of the Dependent variable (Y), what is being predicted or explained

a or Alpha, a constant; equals the value of Y when the value of X=0

b or Beta, the coefficient of X; the slope of the regression line; how much Y changes for each one-unit change in X.

X is the value of the Independent variable (X), what is predicting or explaining the value of Y

e is the error term; the error in predicting the value of Y, given the value of X (it is not displayed in most regression equations).

A reminder about the lm() function:

lm([target variable] ~ [predictor variables], data = [data source])

The data in this assignment:

x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)

y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)


1.1 Define the relationship model between the predictor (x) and the response (y) variable:

> x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)

> y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

> # Using lm() function  to define the model 

> model <- lm(y ~ x)

> # Display the model summary

> summary(model)


Call:

lm(formula = y ~ x)


Residuals:

    Min      1Q  Median      3Q     Max 

-11.435  -7.406  -4.608   6.681  16.834 


Coefficients:

            Estimate Std. Error t value Pr(>|t|)  

(Intercept)   19.206     15.691   1.224   0.2558  

x              3.269      1.088   3.006   0.0169 *

---

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Residual standard error: 10.48 on 8 degrees of freedom

Multiple R-squared:  0.5303, Adjusted R-squared:  0.4716 

F-statistic: 9.033 on 1 and 8 DF,  p-value: 0.01693
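The individual figures in the summary can also be pulled out by name, which helps when reporting them. The sketch below refits the same model and extracts R-squared, the p-value of the slope, and the residual standard error. The variable names r_squared, slope_p, and resid_se are my own choices for illustration.

```r
# Same data as above
x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

model <- lm(y ~ x)
s <- summary(model)

# Share of the variance in y explained by x
r_squared <- s$r.squared

# p-value for the slope, taken from the coefficient table
slope_p <- coef(s)["x", "Pr(>|t|)"]

# Residual standard error (typical size of a prediction error)
resid_se <- s$sigma

print(c(r_squared = r_squared, slope_p = slope_p, resid_se = resid_se))
```

Read against the summary above: x explains about 53% of the variation in y, and the slope is significant at the 0.05 level but not at the 0.01 level.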


1.2 Calculate the coefficients.

> coefficients(model)

(Intercept)           x 

  19.205597    3.269107 

>
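As a cross-check, the two coefficients can be computed by hand from the least-squares formulas b = cov(x, y) / var(x) and a = mean(y) - b * mean(x). This is only a sketch to show where the numbers come from; lm() remains the normal way to fit the model.

```r
x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Slope: covariance of x and y divided by the variance of x
b <- cov(x, y) / var(x)

# Intercept: the fitted line passes through (mean(x), mean(y))
a <- mean(y) - b * mean(x)

# Coefficients from lm() for comparison
fitted_coefs <- coef(lm(y ~ x))

print(c(a = a, b = b))
print(fitted_coefs)
```

Both routes should agree with the intercept and slope reported by coefficients(model).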

2. The following question was posted by Chi Yau, the author of R Tutorial With Bayesian Statistics Using Stan, in his blog post on regression analysis.

Problem -

Apply the simple linear regression model (see the above formula) to the data set called "visit" (see below), and estimate the discharge duration if the waiting time since the last eruption has been 80 minutes. Note: the full dataset ships with R and can be loaded with data(faithful).
> head(visit)
  discharge  waiting
1     3.600      79
2     1.800      54
3     3.333      74
4     2.283      62
5     4.533      85
6     2.883      55 

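Use the formula discharge ~ waiting with data = visit. (In faithful, the discharge column is named eruptions, so the equivalent formula there is eruptions ~ waiting.)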

2.1 Define the relationship model between the predictor and the response variable.


 lm(eruptions ~ waiting, data = faithful)


Call:

lm(formula = eruptions ~ waiting, data = faithful)


Coefficients:

(Intercept)      waiting  

   -1.87402      0.07563  
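The problem statement names the data "visit" with a column called discharge, while the built-in data frame is faithful with a column called eruptions. One possible way to reconcile the two, assuming visit is simply faithful with that column renamed, is sketched below. The object names visit and visit_model are illustrative.

```r
data(faithful)

# Assumption: "visit" is faithful with eruptions renamed to discharge
visit <- faithful
names(visit)[names(visit) == "eruptions"] <- "discharge"

# Fit the model using the formula given in the problem statement
visit_model <- lm(discharge ~ waiting, data = visit)
print(coef(visit_model))
```

Renaming a column does not change the fit, so the coefficients should match the ones from lm(eruptions ~ waiting, data = faithful).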



2.2 Extract the parameters of the estimated regression equation with the coefficients function.


> # Load the dataset

> data(faithful)

> # Fit the linear model

> model <- lm(eruptions ~ waiting, data = faithful)

> # Extract the coefficients (intercept and slope)

> coef(model)

(Intercept)     waiting 

-1.87401599  0.07562795 



2.3 Determine the fit of the eruption duration using the estimated regression equation.

> # Predict discharge when waiting time is 80 minutes

> predict(model, newdata = data.frame(waiting = 80))

      1 

4.17622 

The estimated eruption duration for a waiting time of 80 minutes is about 4.18 minutes.
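A single new waiting time should yield a single predicted value. The sketch below computes it two ways: directly from the estimated equation (intercept + slope * 80) and with predict(). The names by_hand and by_predict are illustrative.

```r
data(faithful)
model <- lm(eruptions ~ waiting, data = faithful)
coefs <- coef(model)

# Plug waiting = 80 into the estimated regression equation
by_hand <- coefs[["(Intercept)"]] + coefs[["waiting"]] * 80

# Same prediction through predict(); newdata must use the predictor's name
by_predict <- predict(model, newdata = data.frame(waiting = 80))

print(by_hand)
print(by_predict)
```

If predict() returns more than one value here, the model object in the workspace is probably left over from an earlier fit, or the column name in newdata does not match the predictor.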



3.  Multiple regression

We will use a very famous dataset in R called mtcars. This dataset was extracted from the 1974 Motor Trend US magazine and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).

This data frame contains 32 observations on 11 (numeric) variables.

[, 1]  mpg   Miles/(US) gallon
[, 2]  cyl   Number of cylinders
[, 3]  disp  Displacement (cu.in.)
[, 4]  hp    Gross horsepower
[, 5]  drat  Rear axle ratio
[, 6]  wt    Weight (1000 lbs)
[, 7]  qsec  1/4 mile time
[, 8]  vs    Engine (0 = V-shaped, 1 = straight)
[, 9]  am    Transmission (0 = automatic, 1 = manual)
[,10]  gear  Number of forward gears
[,11]  carb  Number of carburetors

To call the mtcars data in R
R comes with several built-in data sets, which are generally used as demo data for playing with R functions. One of those built-in datasets is mtcars.
In this question, we will use 4 of the variables found in mtcars, selected with the following code:

input <- mtcars[,c("mpg","disp","hp","wt")]

print(head(input))

R will display the first six rows of the four selected columns.

3.1 Examine the multiple regression model stated above and its coefficients, using 4 variables from mtcars (mpg, disp, hp and wt).
Report the result and explain what the multiple regression model and its coefficients tell us about the data.


input <- mtcars[,c("mpg","disp","hp","wt")] 

lm(formula = mpg ~ disp + hp + wt, data = input) 

> input <- mtcars[,c("mpg","disp","hp","wt")]

> print(head(input))

                   mpg disp  hp    wt

Mazda RX4         21.0  160 110 2.620

Mazda RX4 Wag     21.0  160 110 2.875

Datsun 710        22.8  108  93 2.320

Hornet 4 Drive    21.4  258 110 3.215

Hornet Sportabout 18.7  360 175 3.440

Valiant           18.1  225 105 3.460

> input <- mtcars[,c("mpg","disp","hp","wt")]  

> lm(formula = mpg ~ disp + hp + wt, data = input) 


Call:

lm(formula = mpg ~ disp + hp + wt, data = input)


Coefficients:

(Intercept)         disp           hp           wt  

  37.105505    -0.000937    -0.031157    -3.800891  
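Each coefficient estimates the change in mpg for a one-unit increase in that predictor while the other two are held fixed. In the output above that is roughly -3.8 mpg per extra 1000 lbs of weight, -0.031 mpg per extra horsepower, and about -0.0009 mpg per extra cubic inch of displacement. The sketch below shows how the fitted equation turns into a prediction. The car in new_car is a made-up example, and the names mr_model, new_car, pred, and by_hand are illustrative.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]
mr_model <- lm(mpg ~ disp + hp + wt, data = input)
coefs <- coef(mr_model)

# Hypothetical car (not taken from the source): 200 cu.in., 120 hp, 3000 lbs
new_car <- data.frame(disp = 200, hp = 120, wt = 3)

# Prediction through predict()
pred <- predict(mr_model, newdata = new_car)

# Same prediction written out as intercept + sum of coefficient * value
by_hand <- coefs[["(Intercept)"]] +
  coefs[["disp"]] * new_car$disp +
  coefs[["hp"]]   * new_car$hp +
  coefs[["wt"]]   * new_car$wt

print(pred)
print(by_hand)
```

Running summary(mr_model) would additionally show standard errors and p-values, which indicate how much weight each of these coefficients can bear.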




4.  From our textbook, p. 124, 6.5 Exercises, # 6.1
With the rmr data set, plot metabolic rate versus body weight. Fit a linear regression to the relation. According to the fitted model, what is the predicted metabolic rate for a body weight of 70 kg?
The data set rmr is in the ISwR package that accompanies the book, so make sure to install that package first. After installing ISwR, here is a simple illustration of how to set up the problem:


library(ISwR)

plot(metabolic.rate~body.weight,data=rmr)

> library(ISwR)

> # Load the dataset

> data("rmr")

> # Fit a linear model

> fit <- lm(metabolic.rate ~ body.weight, data = rmr)

> # Predict metabolic rate for 70 kg

> predict(fit, newdata = data.frame(body.weight = 70))

       1 

1305.394 

> # 95% confidence intervals for the intercept and slope

> confint(fit)

                 2.5 %   97.5 %

(Intercept) 655.883819 966.5695

body.weight   5.086656   9.0324
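The exercise also asks for a plot, so it may help to draw the fitted line over the scatter plot and mark the prediction at 70 kg. The sketch below assumes the ISwR package is installed and does nothing otherwise. The axis labels and the name pred70 are my own choices.

```r
# Assumption: ISwR is installed; the guard skips the example if it is not
if (requireNamespace("ISwR", quietly = TRUE)) {
  data("rmr", package = "ISwR")

  fit <- lm(metabolic.rate ~ body.weight, data = rmr)

  # Scatter plot of the raw data
  plot(metabolic.rate ~ body.weight, data = rmr,
       xlab = "Body weight (kg)", ylab = "Metabolic rate")

  # Overlay the fitted regression line
  abline(fit)

  # Mark the predicted metabolic rate at 70 kg with a filled point
  pred70 <- predict(fit, newdata = data.frame(body.weight = 70))
  points(70, pred70, pch = 19)
}
```

The filled point lies on the line by construction, which gives a quick visual check that the predicted value for 70 kg is in a plausible range for the data.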



