Module # 7 (includes assignment)
Issaiah Jennings
Questions
1. In this segment of the assignment, we will use the following regression equation: Y = a + bX + e
Where:
Y is the value of the Dependent variable (Y), what is being predicted or explained
a or Alpha, a constant; equals the value of Y when the value of X=0
b or Beta, the coefficient of X; the slope of the regression line; how much Y changes for each one-unit change in X.
X is the value of the Independent variable (X), what is predicting or explaining the value of Y
e is the error term; the error in predicting the value of Y, given the value of X (it is not displayed in most regression equations).
A reminder about the lm() function:
lm([target variable] ~ [predictor variables], data = [data source])
The data in this assignment:
x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
1.1 Define the relationship model between the predictor (x) and the response (y) variable:
>
> x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
> y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
>
> # Using lm() function to define the model
> model <- lm(y ~ x)
>
> # Display the model summary
> summary(model)
Call:
lm(formula = y ~ x)
Residuals:
    Min      1Q  Median      3Q     Max
-11.435  -7.406  -4.608   6.681  16.834
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   19.206     15.691   1.224   0.2558
x              3.269      1.088   3.006   0.0169 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 10.48 on 8 degrees of freedom
Multiple R-squared:  0.5303, Adjusted R-squared:  0.4716
F-statistic: 9.033 on 1 and 8 DF,  p-value: 0.01693
1.2 Calculate the coefficients.
> coefficients(model)
(Intercept)           x
  19.205597    3.269107
>
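As a cross-check on coefficients(model), the slope and intercept of a simple regression can be computed by hand from the least-squares formulas b = Sxy / Sxx and a = mean(y) - b * mean(x). The sketch below assumes only the x and y vectors from this assignment; the names sxx, sxy, a, and b are illustrative and not part of the assignment.

```r
# Data from the assignment
x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Sum of squares of x and cross-products of x and y around their means
sxx <- sum((x - mean(x))^2)
sxy <- sum((x - mean(x)) * (y - mean(y)))

# Least-squares slope and intercept
b <- sxy / sxx
a <- mean(y) - b * mean(x)

# Compare the hand-computed values with the ones reported by lm()
model <- lm(y ~ x)
print(c(intercept = a, slope = b))
print(all.equal(unname(coef(model)), c(a, b)))
```

If the two agree, the hand calculation confirms that lm() is fitting the line Y = a + bX by ordinary least squares.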
2. The following question was posted by Chi Yau, the author of R Tutorial With Bayesian Statistics Using Stan, in his blog post on regression analysis.
Problem -
Apply the simple linear regression model (see the formula above) to the data set called "visit" (see below), and estimate the discharge duration if the waiting time since the last eruption has been 80 minutes. Note: the full data set ships with R and can be loaded with data(faithful).
> head(visit)
  discharge waiting
1     3.600      79
2     1.800      54
3     3.333      74
4     2.283      62
5     4.533      85
6     2.883      55
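Employ the formula discharge ~ waiting with data = visit. (In the built-in faithful data set, the discharge column is named eruptions, so the model below is written as eruptions ~ waiting.)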
2.1 Define the relationship model between the predictor and the response variable.
> lm(eruptions ~ waiting, data = faithful)
Call:
lm(formula = eruptions ~ waiting, data = faithful)
Coefficients:
(Intercept)      waiting
   -1.87402      0.07563
2.2 Extract the parameters of the estimated regression equation with the coefficients function.
> # Load the dataset
> data(faithful)
>
> # Fit the linear model
> model <- lm(eruptions ~ waiting, data = faithful)
>
> # Extract the coefficients (intercept and slope)
> coef(model)
(Intercept) waiting
-1.87401599 0.07562795
2.3 Determine the fit of the eruption duration using the estimated regression equation.
> # Predict discharge when waiting time is 80 minutes
> predict(model, newdata = data.frame(waiting = 80))
      1
4.17622
>
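The same estimate can be obtained by plugging waiting = 80 into the fitted equation from 2.2: -1.87401599 + 0.07562795 * 80, which is about 4.18 minutes. The sketch below does this from the stored coefficients and compares it with predict(); the names cf, manual, and pred are illustrative. Note that the column name in newdata has to match the predictor name used in the formula (here, waiting).

```r
# faithful ships with base R (datasets package)
data(faithful)
model <- lm(eruptions ~ waiting, data = faithful)

# Plug waiting = 80 into the fitted line: intercept + slope * 80
cf <- coef(model)
manual <- unname(cf["(Intercept)"] + cf["waiting"] * 80)

# Same estimate via predict(); newdata must contain a column named "waiting"
pred <- unname(predict(model, newdata = data.frame(waiting = 80)))

print(manual)
print(all.equal(manual, pred))
```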
3. Multiple regression
We will use a very famous dataset in R called mtcars. This dataset was extracted from the 1974 Motor Trend US magazine and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).
This data frame contains 32 observations on 11 (numeric) variables.
To call the mtcars data in R:
R comes with several built-in data sets, which are generally used as demo data for playing with R functions. One of the data sets built into R is mtcars.
In this question, we will use 4 of the variables found in mtcars, selected with the following code:
input <- mtcars[,c("mpg","disp","hp","wt")]
print(head(input))
R will display the first six rows of these four columns.
3.1 Examine the multiple regression model stated above and its coefficients, using 4 variables from mtcars (mpg, disp, hp, and wt).
Report the result and explain what the multiple regression model and its coefficients tell us about the data.
input <- mtcars[,c("mpg","disp","hp","wt")]
lm(formula = mpg ~ disp + hp + wt, data = input)
> input <- mtcars[,c("mpg","disp","hp","wt")]
> print(head(input))
                   mpg disp  hp    wt
Mazda RX4         21.0  160 110 2.620
Mazda RX4 Wag     21.0  160 110 2.875
Datsun 710        22.8  108  93 2.320
Hornet 4 Drive    21.4  258 110 3.215
Hornet Sportabout 18.7  360 175 3.440
Valiant           18.1  225 105 3.460
> lm(formula = mpg ~ disp + hp + wt, data = input)
Call:
lm(formula = mpg ~ disp + hp + wt, data = input)
Coefficients:
(Intercept) disp hp wt
37.105505 -0.000937 -0.031157 -3.800891
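Read as an equation, the coefficients give mpg = 37.105505 - 0.000937 * disp - 0.031157 * hp - 3.800891 * wt. Holding the other two predictors fixed, each additional unit of wt (1000 lb in mtcars) is associated with about 3.8 fewer miles per gallon, each additional horsepower with about 0.03 fewer, and each additional cubic inch of displacement with almost no change. The coefficients alone do not say how reliable these estimates are; a minimal sketch for inspecting standard errors and overall fit, using only base R, might look like this (the name fit_summary is illustrative):

```r
# Select the four variables used in the assignment
input <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Fit mpg on displacement, horsepower, and weight
model <- lm(mpg ~ disp + hp + wt, data = input)
print(coef(model))

# Standard errors, t-tests, and p-values for each coefficient
fit_summary <- summary(model)
print(fit_summary$coefficients)

# Share of the variance in mpg explained by the three predictors
print(fit_summary$r.squared)
```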
4. From our textbook, p. 124, 6.5 Exercises, # 6.1
With the rmr data set, plot metabolic rate versus body weight. Fit a linear regression to the relation. According to the fitted model, what is the predicted metabolic rate for a body weight of 70 kg?
The rmr data set is available in R; make sure to install the book's R package, ISwR. After installing the ISwR package, here is a simple illustration of how to set up the problem.
library(ISwR)
plot(metabolic.rate~body.weight,data=rmr)
> library(ISwR)
>
> # Load the dataset
> data("rmr")
>
> # Fit a linear model
> fit <- lm(metabolic.rate ~ body.weight, data = rmr)
>
> # Predict metabolic rate for 70 kg
> predict(fit, newdata = data.frame(body.weight = 70))
1
1305.394
>
> # 95% confidence intervals for the intercept and slope
> confint(fit)
                 2.5 %   97.5 %
(Intercept) 655.883819 966.5695
body.weight   5.086656   9.0324
>
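The exercise also asks for a plot of metabolic rate versus body weight. A minimal sketch that draws the scatterplot and overlays the fitted line follows, assuming the ISwR package is installed; the requireNamespace() guard, the axis labels, and the name pred70 are additions for illustration.

```r
# Assumes the ISwR package is installed: install.packages("ISwR")
if (requireNamespace("ISwR", quietly = TRUE)) {
  data("rmr", package = "ISwR")

  # Scatterplot of metabolic rate against body weight
  plot(metabolic.rate ~ body.weight, data = rmr,
       xlab = "Body weight (kg)", ylab = "Metabolic rate")

  # Fit the regression and overlay the fitted line on the plot
  fit <- lm(metabolic.rate ~ body.weight, data = rmr)
  abline(fit)

  # Predicted metabolic rate at 70 kg, computed from the coefficients
  pred70 <- unname(coef(fit)[1] + coef(fit)[2] * 70)
  print(pred70)
}
```

The hand-computed value should agree with the predict() result for a body weight of 70 kg.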