Supervised Machine Learning
Teacher(s): Patrick Groenen, Pieter Schoonees
Dates: Period 2 - Oct 25, 2021 to Dec 17, 2021
Statistical learning methods arising from statistics, machine learning, and data science have become more widely available. These methods can be split into supervised learning, which aims to predict a response variable, and unsupervised learning, which aims to describe the relations between all variables simultaneously. This course focuses on a selection of supervised learning methods. Its goal is that the student obtains a thorough technical understanding of these techniques, can implement them in the high-level language R, and can write a report about an application of them. The follow-up course Machine Learning II covers additional supervised and unsupervised learning methods.
The book by Hastie, Tibshirani, and Friedman (2001, 1st edition) has been a milestone in connecting statistical ideas with machine learning techniques. Parts of the second edition of this book (2009) form the basis of this course. The techniques and ideas to be treated are:
- linear methods for regression,
- linear methods for classification,
- basis expansions and regularization,
- model assessment and selection,
- classification and regression trees,
- bootstrap aggregation and random forests.
This course is a 3-credit field course in the econometrics major of the Tinbergen Institute program.
The following book is considered essential for your learning experience and is part of the examined material. Changes to the reading list will be communicated via Canvas.
Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning (2nd edition). Springer. Available at https://web.stanford.edu/~hastie/Papers/ESLII.pdf.