0.1 Stock Market Data
- Load and explore (through numerical and graphical summaries) the Smarket data (this is in the ISLR package).
These data contain percentage returns for the S&P 500 stock index over 1,250 days (2001 - 2005). For each date, we have the percentage returns for each of the five previous days (Lag1 through Lag5), the number of shares traded on the previous day in billions (Volume), the percentage return on the date in question (Today), and Direction (whether the market was Up or Down on this date).
0.2 Logistic Regression
1. Fit a logistic regression model to predict Direction using Lag1 through Lag5 and Volume. Describe your results.
2. Create a confusion matrix for the training data.
3. What is the overall error rate of the model?
4. Create two data sets, train and test, that correspond to the observations from 2001 to 2004 (train) and 2005 (test).
5. Repeat 1-3, but obtain the test confusion matrix and error rate.
6. Repeat 5, but with a model of Direction on Lag1 and Lag2 only.
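The bookkeeping behind steps 2-5 can be sketched independently of the model fit. The snippet below is an illustration in plain Python, not the lab's R solution: the function names, the 0.5 probability threshold, the `Year` key, and the toy probabilities and labels are all assumptions made for the example, not values from the Smarket data.

```python
from collections import Counter

def split_by_year(rows, last_train_year=2004):
    """Split rows (dicts with an assumed 'Year' key) into train and test."""
    train = [r for r in rows if r["Year"] <= last_train_year]
    test = [r for r in rows if r["Year"] > last_train_year]
    return train, test

def classify(probs_up, threshold=0.5):
    """Turn predicted probabilities of 'Up' into class labels."""
    return ["Up" if p > threshold else "Down" for p in probs_up]

def confusion_matrix(actual, predicted, labels=("Down", "Up")):
    """Nested dict indexed as cm[predicted][actual]."""
    counts = Counter(zip(predicted, actual))
    return {p: {a: counts[(p, a)] for a in labels} for p in labels}

def error_rate(actual, predicted):
    """Fraction of observations whose predicted label is wrong."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)

# Made-up rows, true labels, and fitted probabilities, for illustration only.
rows = [{"Year": 2003}, {"Year": 2004}, {"Year": 2005}]
train, test = split_by_year(rows)

actual = ["Up", "Down", "Up", "Up", "Down", "Down"]
probs_up = [0.61, 0.55, 0.52, 0.48, 0.40, 0.51]
predicted = classify(probs_up)

cm = confusion_matrix(actual, predicted)
err = error_rate(actual, predicted)
```

The error rate is the sum of the off-diagonal cells of the confusion matrix divided by the number of observations. Computing it on the data used to fit the model gives the training error, while computing it on the held-out 2005 observations gives the test error.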
0.3 LDA
1. Fit a linear discriminant analysis model to the train data set you created in the previous section with Direction as the response and Lag1 and Lag2 as the predictors.
2. What are the values for \(\hat{\pi}_1\) and \(\hat{\pi}_2\)?
3. Create a confusion matrix for the test data.
4. What is the test error rate?
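As a reminder of what \(\hat{\pi}_1\) and \(\hat{\pi}_2\) represent: in LDA the estimated prior for class \(k\) is the fraction of training observations in that class, \(\hat{\pi}_k = n_k / n\). A minimal Python sketch follows, using made-up labels rather than the real train set. The function name and the alphabetical ordering of classes (Down first, then Up) are assumptions for the example.

```python
from collections import Counter

def class_priors(labels):
    """Estimate prior probabilities as class proportions, n_k / n."""
    n = len(labels)
    counts = Counter(labels)
    # Sort class names so the ordering is deterministic (assumed alphabetical).
    return {k: counts[k] / n for k in sorted(counts)}

# Made-up training labels, for illustration only.
train_direction = ["Up", "Down", "Up", "Up", "Down", "Up", "Up", "Down"]
priors = class_priors(train_direction)
```

The estimated priors always sum to 1, which is a quick sanity check on whatever values the fitted model reports.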
0.4 QDA
1. Fit a quadratic discriminant analysis model to the train data set you created in the Logistic Regression section with Direction as the response and Lag1 and Lag2 as the predictors.
2. Create a confusion matrix for the test data.
3. What is the test error rate?
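For reference when comparing this fit with LDA, the standard QDA discriminant function is given here. The symbols \(\mu_k\) and \(\Sigma_k\) (the class mean vector and class covariance matrix) are notation introduced for this note. An observation \(x\) is assigned to the class with the largest value of

\[
\delta_k(x) = -\frac{1}{2}(x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k) - \frac{1}{2}\log|\Sigma_k| + \log \pi_k .
\]

LDA is the special case in which every class shares one covariance matrix, \(\Sigma_k = \Sigma\). The terms that are quadratic in \(x\) are then identical across classes and cancel when classes are compared, which leaves a linear decision boundary.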
0.5 KNN
1. Fit a KNN model with \(K = 1\) to the train data set you created in the Logistic Regression section with Direction as the response and Lag1 and Lag2 as the predictors.
2. Create a confusion matrix for the test data.
3. What is the test error rate?
4. Repeat 1.-3. with \(K = 3\) and \(K = 5\).
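To make the role of \(K\) concrete, here is a minimal from-scratch sketch of KNN classification in Python. It is an illustration under stated assumptions, not the routine the lab expects you to call: the function name and the toy (Lag1, Lag2) points are made up, distances are Euclidean, and distance ties are broken by training-row order, whereas library implementations may break ties differently (for example, at random).

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k):
    """Predict the label of `query` by majority vote among its k nearest
    training points (Euclidean distance)."""
    # Pair each distance with its row index, then sort from nearest to farthest.
    dists = sorted((math.dist(x, query), i) for i, x in enumerate(train_X))
    votes = Counter(train_y[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]

# Made-up (Lag1, Lag2) points and labels, for illustration only.
train_X = [(-1.0, -1.0), (-0.5, -0.8), (0.2, 0.1), (1.0, 0.9), (0.8, 1.2)]
train_y = ["Up", "Up", "Down", "Down", "Down"]
test_X = [(-0.7, -0.9), (0.9, 1.0), (0.0, 0.0)]

preds_k1 = [knn_predict(train_X, train_y, q, k=1) for q in test_X]
preds_k3 = [knn_predict(train_X, train_y, q, k=3) for q in test_X]
```

With two classes, an odd \(K\) guarantees that the vote cannot tie. Small values of \(K\) follow the training data very closely, while larger values average over more neighbors and give a smoother decision boundary.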
Of all the models you fit today, which would you pick to predict values of Direction and why?