---
title: "Assignment Memo Example"
author: "Sean van der Merwe"
date: "29 November 2019"
output:
  html_document:
    df_print: paged
params:
  st: 2001234567
---
## Currently marking student `r params$st`
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
# Mark processor: adds newmark to the running total and returns a label for inline use
mrks <- 0
mrk <- function(newmark) {
  mrks <<- mrks + newmark
  if (newmark == 1) '1 mark' else paste(newmark, 'marks')
}
```
```{r generate, eval=FALSE, include=FALSE}
library(openxlsx)
students <- c(2001234567,2012345678,2000000123)
nStudents <- length(students)
n <- 100
datasets <- vector('list',nStudents)
for (i in seq_len(nStudents)) {
  x1 <- rnorm(n, 4, 1)
  x2 <- rgamma(n, 4, 2)
  y <- 20 + 2*x1 + rnorm(n)
  datasets[[i]] <- data.frame(y, x1, x2)
}
names(datasets) <- paste0('St',students)
write.xlsx(datasets, file = "Practical1data.xlsx")
```
## Instructions
Your individual task in this assignment is to build a linear model and discuss it.

1. Read in the data from the sheet named after your student number and verify that it was read in correctly.
1. Fit the standard linear regression model using main effects only, and give the summary.
1. Test the global hypothesis and then, if appropriate, test the individual coefficients for significance.
1. Add a scatterplot of $y$ versus $x_1$ with the regression line added.

The model is:

$$y_i\sim N(\beta_0+\beta_1x_{1i}+\beta_2x_{2i},\ \sigma^2)$$

## Memorandum
#### Q1: Read data
```{r getdata}
library(openxlsx)
# Each student's data sits on a sheet named 'St' followed by the student number
mydata <- read.xlsx('Practical1data.xlsx', sheet = paste0('St', params$st))
head(mydata)
```
**|| `r mrk(1)` for reading in their own data. `r mrk(1)` for checking it. ||**
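
Beyond `head()`, a few structural checks make the verification step explicit. The sketch below is illustrative only: `check_data` is a hypothetical simulated stand-in for the data frame returned by `read.xlsx()`, and the expected size (100 rows, columns `y`, `x1`, `x2`) is an assumption taken from the data generation code above.

```r
# Hypothetical stand-in for the data frame returned by read.xlsx()
set.seed(1)
check_data <- data.frame(
  y  = rnorm(100, 28, 2),
  x1 = rnorm(100, 4, 1),
  x2 = rgamma(100, 4, 2)
)

str(check_data)  # column names and types at a glance

# Structural checks: size, column names, missing values, numeric columns
dims_ok     <- identical(dim(check_data), c(100L, 3L))
names_ok    <- identical(names(check_data), c("y", "x1", "x2"))
no_missing  <- !anyNA(check_data)
all_numeric <- all(vapply(check_data, is.numeric, logical(1)))

c(dims_ok = dims_ok, names_ok = names_ok,
  no_missing = no_missing, all_numeric = all_numeric)
```

Any comparable check (for example `summary()` or `dim()`) should earn the mark; the point is that the student looked at what was read in.
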
#### Q2: Linear model
```{r LM}
model1 <- lm(y ~ x1 + x2, data = mydata)
(s1 <- summary(model1))
```
#### Q3: Model discussion
To test the global null hypothesis that all slope coefficients are zero, we look at the p-value of the F test in the model summary. If it is greater than $\alpha$ we fail to reject the global null hypothesis and stop there. If it is less than $\alpha$ we reject the global null hypothesis and go on to test the individual hypotheses.
**|| `r mrk(2)` for having a global conclusion, and discussing it. ||**
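
The F test p-value is printed in the summary but is not stored in it directly; it can be recomputed from the stored F statistic. The following is a minimal sketch on simulated data with the same structure as the assignment data. The names `sim`, `fit`, `global_p` and the choice $\alpha = 0.05$ are assumptions made for illustration.

```r
# Simulated stand-in data with the same structure as the assignment data (assumption)
set.seed(1)
sim <- data.frame(x1 = rnorm(100, 4, 1), x2 = rgamma(100, 4, 2))
sim$y <- 20 + 2*sim$x1 + rnorm(100)

fit <- lm(y ~ x1 + x2, data = sim)
s <- summary(fit)

alpha <- 0.05        # assumed significance level
f <- s$fstatistic    # named vector: value, numdf, dendf

# Upper tail probability of the F distribution gives the global p-value
global_p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
reject_global <- global_p < alpha
reject_global
```

If `reject_global` is `FALSE`, the discussion stops at the global conclusion; otherwise it continues with the individual coefficients.
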
The individual hypotheses are $H_0:\beta_i=0$ versus $H_1:\beta_i\ne 0$ for each coefficient. We reject $H_0$ for each term whose original (unadjusted) p-value is less than $\alpha$, and conclude that these terms contribute significantly to predicting $y$.
**|| `r mrk(4)` for discussing individual terms if appropriate. ||**
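
The individual p-values can be read from the coefficient table of the summary. The sketch below again uses simulated stand-in data; `sim`, `p_values`, `significant` and $\alpha = 0.05$ are illustrative assumptions, not part of the assignment.

```r
# Simulated stand-in data with the same structure as the assignment data (assumption)
set.seed(1)
sim <- data.frame(x1 = rnorm(100, 4, 1), x2 = rgamma(100, 4, 2))
sim$y <- 20 + 2*sim$x1 + rnorm(100)

s <- summary(lm(y ~ x1 + x2, data = sim))

alpha <- 0.05  # assumed significance level

# Coefficient table: one row per term, last column holds the t test p-values
p_values <- coef(s)[, "Pr(>|t|)"]
significant <- p_values < alpha
significant
```

Each `TRUE` entry corresponds to a term for which $H_0:\beta_i=0$ is rejected at the chosen level.
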
#### Q4: Plot
```{r Plot}
plot(mydata$x1, mydata$y, pch=4, col='purple', main = '', xlab = expression(x[1]), ylab = expression(y))
# The line uses the fitted intercept and x1 slope only, i.e. the fit at x2 = 0
abline(coef(model1)[1:2], col='blue', lwd=2, lty=4)
```
**|| `r mrk(2)` for neat plot with line. ||**
**|| Total: `r mrks` marks ||**