University of The Cumberlands Overfitting in Statistical Machine Learning Questions
This project is about overfitting and is based on Chapter 6, ‘Statistical Machine Learning’, of ‘Practical Statistics for Data Scientists’.
Cover the following in the project:
1. Explain the data from figure P4p4F1.pdf.
2. Explain the differences in (a) and (b) parts in figure P4p4F2.pdf.
3. Using R or Octave, try to recreate the data from figure P4p4F1.pdf as closely as possible. The functions needed are runif (R) or rand (Octave) for the uniform distribution, and rnorm (R) or randn (Octave) for the normal distribution.
a. Explain how you can recreate P4p4F1.pdf.
b. Compare and discuss my P4p4F3.pdf with the figure you created.
4. Based on P4p4F3.pdf, or on the data you created, explain how you would build a decision tree to classify ‘+’ and ‘o’, similar to the left tree in P4p2F2.pdf.
5. In your opinion, why is it practical or useful to simulate the data for the classification?
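As a rough illustration of items 3 and 4, the sketch below simulates two-class data from uniform and normal draws and then searches for the single axis-aligned split with the fewest misclassifications, which is the kind of decision made at the root of a decision tree. It is written in Python rather than R or Octave: random.uniform stands in for runif/rand and random.gauss for rnorm/randn. The labelling rule (a point is ‘+’ when x + y plus normal noise exceeds 1), the noise level, and the function names simulate_points and best_split are all assumptions made for illustration; they are not read off P4p4F1.pdf or P4p4F3.pdf.

```python
import random


def simulate_points(n=100, noise_sd=0.1, seed=0):
    """Draw n labelled points in the unit square.

    Assumed rule (hypothetical, not taken from the figures): a point is '+'
    when x + y + e > 1, where e ~ Normal(0, noise_sd); otherwise it is 'o'.
    """
    rng = random.Random(seed)  # fixed seed so the data can be regenerated
    points = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)     # analogous to runif(1) in R, rand() in Octave
        y = rng.uniform(0.0, 1.0)
        e = rng.gauss(0.0, noise_sd)  # analogous to rnorm(1, 0, sd) in R, sd * randn() in Octave
        label = "+" if x + y + e > 1.0 else "o"
        points.append((x, y, label))
    return points


def best_split(points, axis):
    """Return (threshold, errors) for the best single split on one axis.

    axis is 0 for x or 1 for y. Candidate thresholds are the midpoints
    between consecutive sorted values; each side may predict either class.
    """
    values = sorted(p[axis] for p in points)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in candidates:
        # Errors when predicting '+' above the threshold and 'o' below it.
        wrong = sum((p[axis] > t) != (p[2] == "+") for p in points)
        # The reversed prediction misclassifies exactly the other points.
        wrong = min(wrong, len(points) - wrong)
        if best is None or wrong < best[1]:
            best = (t, wrong)
    return best


if __name__ == "__main__":
    data = simulate_points(n=200, noise_sd=0.1, seed=1)
    for axis, name in enumerate(("x", "y")):
        threshold, errors = best_split(data, axis)
        print(f"best split on {name}: threshold={threshold:.3f}, errors={errors}")
```

Because the generating rule is known, the splits found by a tree can be checked against it, and raising noise_sd shows how a deeper tree begins to fit noise rather than the underlying boundary.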