# R

## Simulation attempt for Andréhette Verster

### Introduction and disclaimer

The data simulated here pertains to a specific set of assumptions, and we should not try to extend the results beyond those assumptions without seriously considering and accounting for any systematic differences between that situation and any broader situation. The data analysed here is inherently random: if the study were repeated, the results would differ. While the computer software used is tried and tested, the analysis involves multiple human elements.

## Time Series Prediction via simulated paths

### Observed data

We begin by reading in data. We will use the old FTSE stock exchange index for this example. We will try to reach stationarity by calculating log returns.

    data(EuStockMarkets)
    Index <- EuStockMarkets[,'FTSE']
    n <- length(Index)
    LogReturns <- diff(log(Index))
    n1 <- length(LogReturns)
    par(mfrow=c(1,2),bg=rgb(0,0,0),fg=rgb(1,1,1),col.axis='white',col.main='white',col.lab='white',mar=c(4,4,0.6,0.2))
    plot(Index,col='#2A9FD6',lwd=2)
    plot(LogReturns,col='#2A9FD6')

Next we consider any residual correlation.

    source('autocor.r')
    par(mfrow=c(1,2),bg=rgb(0,0,0),fg=rgb(1,1,1),col.axis='white',col.main='white',col.lab='white',mar=c(4,4,0.6,0.2))
    autocor(LogReturns)

    ##            Autocor Partial Autocor
    ## [1,]  0.0920293254     0.092029325
    ## [2,] -0.0080311473    -0.016641487
    ## [3,] 0.
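The `autocor` function used above lives in the author's `autocor.r`, which is not reproduced here. As a rough stand-in, the following sketch tabulates the same two quantities with base R's `acf` and `pacf`. The choice of five lags and the names `max_lag` and `tab` are assumptions made for illustration, and the numbers need not match `autocor` exactly if that function uses a different estimator.

```r
# Log returns of the FTSE series, as in the text above
data(EuStockMarkets)
LogReturns <- diff(log(EuStockMarkets[, "FTSE"]))

max_lag <- 5  # assumed number of lags to tabulate

# plot = FALSE returns the estimates instead of drawing a correlogram
ac  <- acf(LogReturns,  lag.max = max_lag, plot = FALSE)
pac <- pacf(LogReturns, lag.max = max_lag, plot = FALSE)

# acf() reports lag 0 (always 1) first, so drop it to line up with pacf()
tab <- cbind("Autocor"         = as.vector(ac$acf)[-1],
             "Partial Autocor" = as.vector(pac$acf))
print(tab)
```

Calling `acf(LogReturns)` and `pacf(LogReturns)` without `plot = FALSE` draws the corresponding correlograms.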

## Miscellaneous useful code

### R code

Download code freely, but please remember where you got it. NB: Right click -> Save As

- Draw nice correlograms of a time series.
- Infinite hypothesis test problem generator.
- Function for finding the shortest interval from bootstraps or HPD interval from simulations.
- Function for calculating the sample skewness.
- Function that calculates the symmetric matrix square root of a positive definite matrix (such as covariance matrices).
- Example of using the manipulate function on a 2D graph.
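The downloadable files themselves are not reproduced here. To give a flavour of one of them, here is a minimal sketch of a symmetric matrix square root computed through the eigendecomposition. The function name `sym_sqrt` and the example matrix `A` are hypothetical, and the author's own function may be written differently.

```r
# Symmetric square root of a symmetric positive definite matrix A:
# if A = V diag(d) V', then S = V diag(sqrt(d)) V' satisfies S %*% S = A.
sym_sqrt <- function(A) {
  e <- eigen(A, symmetric = TRUE)
  stopifnot(all(e$values > 0))  # positive definiteness check
  e$vectors %*% diag(sqrt(e$values), nrow = length(e$values)) %*% t(e$vectors)
}

# Small illustrative covariance-like matrix
A <- matrix(c(4, 1,
              1, 3), nrow = 2, byrow = TRUE)
S <- sym_sqrt(A)
print(S %*% S)  # should reproduce A up to rounding error
```

Unlike the Cholesky factor returned by `chol`, this root is itself symmetric, which is convenient when the root has to be treated like a covariance-type matrix.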

## Univariate Time Series Overview

### Why do we care?

Understanding and predicting time series can help us make better decisions. Better decisions can lead to more profit, fewer losses, less waste of natural resources, and other benefits.

### How can we predict the unpredictable?

By breaking the problem down into parts we know how to deal with. We know how to deal with independent and identically distributed values: draw a histogram and get summary statistics.
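As a small illustration of the histogram-and-summary step, the sketch below uses the FTSE log returns from the built-in `EuStockMarkets` data. Treating these returns as roughly independent and identically distributed is an assumption made only for this example, and the variable names are illustrative.

```r
# Log returns of the FTSE index (built-in data set)
data(EuStockMarkets)
x <- as.numeric(diff(log(EuStockMarkets[, "FTSE"])))

# Histogram of the values, treated here as an i.i.d. sample
hist(x, breaks = 50, main = "", xlab = "Log return")

# Basic summary statistics
stats <- c(summary(x), SD = sd(x))
print(stats)
```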

## Useful R code for the Dirichlet distribution

### Simulation

To simulate a Dirichlet sample quickly we use the method on the Wikipedia page for the Dirichlet distribution.

    rdirichlet <- function(N=1,K=c(1,1)) {
      # Simulations from the Dirichlet Distribution, according to the method of Wikipedia
      lk <- length(K)
      sim <- matrix(0,N,lk)
      gams <- matrix(0,N,lk)
      # One column of independent gamma draws per Dirichlet parameter
      for (i in 1:lk) {
        gams[,i] <- matrix(rgamma(N,K[i]),N,1)
      }
      # Divide each row by its total so that every row sums to one
      gamtotal <- matrix(rowSums(gams),N,lk)
      sim <- gams/gamtotal
      return(sim)
    }

### Frequentist Parameter estimation

The simplest method of parameter estimation is the method of moments:
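As a hedged sketch of what moment matching can look like for the Dirichlet distribution, the code below uses one common variant, which is not necessarily the estimator intended here. It relies on $E[X_i] = \alpha_i/\alpha_0$ and $\mathrm{Var}[X_i] = m_i(1-m_i)/(\alpha_0+1)$, where $m_i = E[X_i]$ and $\alpha_0 = \sum_i \alpha_i$. The names `rdirichlet_sketch` and `dirichlet_mom`, and the decision to average the implied $\alpha_0$ over components, are assumptions made for illustration.

```r
# Compact gamma-normalisation sampler (same idea as rdirichlet above)
rdirichlet_sketch <- function(N, alpha) {
  # Column j holds N gamma draws with shape alpha[j]
  gams <- matrix(rgamma(N * length(alpha), shape = rep(alpha, each = N)),
                 nrow = N)
  gams / rowSums(gams)  # each row is divided by its own total
}

# Method-of-moments estimate of the Dirichlet parameters
dirichlet_mom <- function(X) {
  m <- colMeans(X)        # sample means estimate alpha_i / alpha_0
  v <- apply(X, 2, var)   # sample variances
  # Solve Var[X_i] = m_i (1 - m_i) / (alpha_0 + 1) for alpha_0,
  # then average the component-wise solutions
  alpha0 <- mean(m * (1 - m) / v - 1)
  m * alpha0
}

set.seed(1)
X <- rdirichlet_sketch(20000, c(2, 3, 5))
est <- dirichlet_mom(X)
print(est)  # estimates should lie near the simulated parameters 2, 3, 5
```

Averaging over components is only one way to pin down $\alpha_0$; using a single component's variance is also seen and gives slightly different estimates.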

## Multiple lines on a graph example

### What are we doing and why?

We are just going to draw a graph in R with multiple lines on one graph. This is interesting because the way base R draws graphs is a bit strange to people who are used to other packages. Some explanation is useful.

### Specific example

In this example we are going to use the lengths of the 25 most popular movies of each year from 1931 to 2013, as explained here by Randy Olson.
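The movie-length data is not included here, so the sketch below shows the general base R pattern with the built-in `EuStockMarkets` data as a stand-in: `plot` opens the graph and fixes the axes, and each further series is added with `lines`. The colours and variable names are arbitrary choices for illustration.

```r
# Stand-in data: four stock indices from the built-in EuStockMarkets set
data(EuStockMarkets)
yrs  <- as.numeric(time(EuStockMarkets))
cols <- c("#2A9FD6", "#77B300", "#CC0000", "#FF8800")  # arbitrary colours

# Axis limits are fixed by the first plot() call, so compute a range
# that covers every series before drawing anything
ylim <- range(EuStockMarkets)

# The first series creates the plot region and the axes
plot(yrs, as.numeric(EuStockMarkets[, 1]), type = "l", col = cols[1],
     ylim = ylim, xlab = "Year", ylab = "Index level")

# The remaining series are drawn on top of the existing plot
for (j in 2:ncol(EuStockMarkets)) {
  lines(yrs, as.numeric(EuStockMarkets[, j]), col = cols[j])
}

legend("topleft", legend = colnames(EuStockMarkets),
       col = cols, lty = 1, bty = "n")
```

The step people tend to miss is the `ylim` computation: without it, any series that leaves the range of the first one is silently clipped.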