Introduction
Why?
- Do you ever work through an analysis example in class with code and then never get around to typing it up nicely?
- Do you ever get students copy-pasting assignments from their classmates?
- Do you ever type up a piece of analysis by copy-pasting graph after graph or table after table, only to realise there’s a problem with the data and you have to redo everything?
- Do you ever need to include code from one or more languages and struggle to get the syntax highlighted nicely?
- Have you ever been asked by a reviewer to make your work more reproducible?
What?
- Notebooks are all the rage
- Instead of micro-managing your document's formatting, you use simple markup codes to indicate headings, links, or emphasis
- You type your document and code in a single editor
- In the end you press a button and it compiles your document for you, running the code and embedding the output along the way
- It’s the modern workflow
- Markdown is a popular notebook format
- R Markdown is what I use
- Everything discussed here applies to other popular packages too
How?
- I’m going to focus on RStudio
- Click File -> New File -> R Markdown
- Choose whether you want a Word document, a website, a PDF, a slideshow, etc.
- Save
- Click Knit
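For those who prefer code to menus, the steps above can be sketched in R. This is only a minimal sketch, assuming the rmarkdown package and pandoc are installed; the file name first_notebook.Rmd and its contents are made up for illustration. It writes a tiny R Markdown file (a heading, emphasis, a link, and one code chunk) and compiles it with rmarkdown::render(), the function the Knit button calls.

```r
# Sketch: write a minimal R Markdown file, then knit it from code.
# File name and contents are illustrative assumptions.
fence <- strrep("`", 3)  # the three backticks that open and close a code chunk

rmd_lines <- c(
  "---",
  'title: "My first notebook"',
  "output: html_document",
  "---",
  "",
  "# A heading",
  "",
  "Some *emphasised* text and a [link](https://www.r-project.org).",
  "",
  paste0(fence, "{r}"),
  "summary(cars)  # runs at compile time; the output is embedded",
  fence
)

rmd_file <- file.path(tempdir(), "first_notebook.Rmd")
writeLines(rmd_lines, rmd_file)

# Compile only if rmarkdown and pandoc are available on this machine
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(out_file)  # path of the compiled HTML file
}
```

Changing the output line to word_document or pdf_document would give the other formats mentioned above (PDF output additionally needs a LaTeX installation).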
Where?
- I use it and nothing else for ALL my consultation projects
- I use it when I teach my classes
- I teach it to the students to use for their assignments and tests
  - With Markdown there cannot be a disconnect between code, output, and discussion
- I use it for research projects
- I use it for myself
- And yes, this presentation was made in RStudio
Examples
Assignments
- To enable good assessment, students must be able to answer with code, maths, graphs, and written explanations interspersed
- Thus we need submissions in a modern shareable and mark-able format such as .pdf or .docx
- Until recently that meant that students would:
  - Make a Word document
  - Search how to do the thing they were supposed to study
  - Copy their code to the document
  - Run their code and realise there’s a problem, fix the problem and run again
  - Copy the output (text and graphs) to the document
  - Forget to copy the changed code, creating inconsistencies, and so on…
How to reduce copying
- To reduce copying I give each student different data
- This is not more work (usually)
- I generate data randomly based on some properties I need them to learn about
- And repeat for each student number
- How does Markdown factor into this?
  - It doesn’t; Markdown comes in with the memo
- R Markdown allows for document parameters (like student number) to be entered when compiling to create a unique document just for the relevant student
- I include marking guidance in the markdown file and give it to a competent marking assistant
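The document parameters mentioned above might be used roughly as follows. This is a sketch under assumptions, not my actual memo: the template name memo_template.Rmd, the parameter name student, and the output file names are all hypothetical, and it needs the rmarkdown package and pandoc to compile. The params field in the header declares the parameter with a default value, and rmarkdown::render() overrides it once per student.

```r
# Sketch: compile one memo per student from a single parameterised template.
# Template name, parameter name and output names are hypothetical.
fence <- strrep("`", 3)  # the three backticks that open and close a code chunk

template <- c(
  "---",
  'title: "Practical 1 memo"',
  "output: html_document",
  "params:",
  '  student: "2001234567"',   # default value, replaced at compile time
  "---",
  "",
  "Memo for student `r params$student`.",
  "",
  paste0(fence, "{r}"),
  "sheet <- paste0('St', params$student)  # worksheet holding this student's data",
  "sheet",
  fence
)

template_file <- file.path(tempdir(), "memo_template.Rmd")
writeLines(template, template_file)

students <- c(2001234567, 2012345678, 2000000123)
output_names <- paste0("memo_St", students, ".html")

# Compile only if rmarkdown and pandoc are available on this machine
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  for (i in seq_along(students)) {
    rmarkdown::render(
      template_file,
      params      = list(student = as.character(students[i])),
      output_file = output_names[i],
      output_dir  = tempdir(),
      quiet       = TRUE
    )
  }
}
```

Student numbers are passed as text here so that they appear in the memo exactly as typed, with no number formatting applied.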
Data generating example
- In this example I generate regression data with 1 significant variable and 1 insignificant variable to see if the students can differentiate
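library(openxlsx)  # provides write.xlsx()
# One dataset per student number
students <- c(2001234567, 2012345678, 2000000123)
nStudents <- length(students)
n <- 100  # observations per dataset
datasets <- vector('list', nStudents)
for (i in seq_len(nStudents)) {
  x1 <- rnorm(n, 4, 1)           # the predictor that drives y
  x2 <- rgamma(n, 4, 2)          # generated independently of y
  y <- 20 + 2*x1 + rnorm(n)      # y depends on x1 only
  datasets[[i]] <- data.frame(y, x1, x2)
}
# A named list is written as one worksheet per student
names(datasets) <- paste0('St', students)
write.xlsx(datasets, file = "Practical1data.xlsx")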
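Because the data are random, x2 will come out significant at the 5% level for roughly 1 student in 20 purely by chance. A check along the following lines could flag those datasets before they go out. It is only a sketch: it regenerates data in the same way as above rather than reading the Excel file, and the seed, the helper make_data() and the 5% cut-off are my assumptions for illustration.

```r
# Sketch: regenerate data as in the example above, then check that each
# student's dataset shows x1 as significant and x2 as not (5% level).
set.seed(2024)  # assumed seed, only so that this sketch is repeatable
students <- c(2001234567, 2012345678, 2000000123)
n <- 100

# Hypothetical helper wrapping the body of the loop in the example above
make_data <- function(n) {
  x1 <- rnorm(n, 4, 1)
  x2 <- rgamma(n, 4, 2)
  y  <- 20 + 2 * x1 + rnorm(n)  # x2 plays no role in y
  data.frame(y, x1, x2)
}

datasets <- lapply(students, function(s) make_data(n))
names(datasets) <- paste0("St", students)

# p-values of the two slope coefficients, one row per student
p_values <- t(sapply(datasets, function(d) {
  coef(summary(lm(y ~ x1 + x2, data = d)))[c("x1", "x2"), "Pr(>|t|)"]
}))
print(round(p_values, 4))

# Students whose data do not show the intended pattern; regenerate these
needs_redo <- rownames(p_values)[p_values[, "x1"] >= 0.05 | p_values[, "x2"] < 0.05]
print(needs_redo)
```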
Memo example
- Will be shown in RStudio
- Available for download here: