Machine learning is one of the main concepts behind data science and artificial intelligence (AI). The term machine learning, or statistical learning, refers to the science of automated detection of patterns in data. It has been widely used in tasks that require extracting information from large data sets. Examples include spam detection, fraudulent credit card transaction detection, face recognition by digital cameras, and voice command recognition by personal assistants on smartphones. Machine learning is also widely used in scientific domains such as bioinformatics, medicine, and astronomy. A common characteristic of all these applications is that a human developer cannot provide an explicit and detailed specification of how these tasks should be executed, due to the complexity of the patterns that need to be detected.
This course aims to introduce the main concepts underlying machine learning, including (a) what learning is, (b) how a machine can learn, (c) what kinds of problems can be solved with a machine learning approach, (d) how to formalize such problems as machine learning problems, and (e) how to compare and evaluate the performance of different machine learning methods. We will focus on methods that are successfully used in practice, including regression, supervised and unsupervised techniques, and neural networks.
- Introduction to machine learning
- Supervised learning: linear regression, logistic regression, linear discriminant analysis, k-nearest neighbors (kNN), decision trees, support vector machines (SVM), and boosting
- Unsupervised learning: k-means, principal component analysis (PCA)
- Model selection and validation
- Neural networks
Grades will be based on:
- Lab assignments: 10 pts
- Final project report: 40 pts -- (Guidelines)
- Written exam: 50 pts
An assignment will be set after each lecture. One point per assignment will be awarded for a reasonable solution to the proposed problem; a reasonable solution is one that shows you have genuinely attempted to solve it.
Students must organize into groups (up to 4 members) and implement a machine learning project that (a) is relevant to the topics discussed in class, (b) requires a significant effort from all team members, and (c) is unique (i.e., two groups may not choose the same project topic). The projects will vary in both scope and topic, but they must satisfy these criteria. Students are encouraged to begin thinking early about projects that interest them. If a group is unable to come up with its own project idea, it may ask the instructor to suggest interesting topics.
The exam will be a written, closed-book exam with questions based on the mandatory readings and the topics discussed in class.
- Hastie, T., Tibshirani, R., and Friedman, J. (2016). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2nd edition
- Valiant, L. (2013). Probably Approximately Correct: Nature's Algorithms for Learning and Prospering in a Complex World. Basic Books, Inc
- Daumé III, H. (2017). A Course in Machine Learning. Self-published, 2nd edition
- James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning. Springer
- Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press
- Azencott, C.-A. (2018). Introduction au Machine Learning. Dunod
- Google Dataset Search
- UC Irvine Machine Learning Repository
- Kaggle datasets
- Google Colaboratory
- Jupyter Notebook
- scikit-learn
-
Introduction slides; assignment 01
This lecture introduces machine learning, its applications, and the kinds of problems to which it can be applied. It also presents related concepts such as supervised and unsupervised learning, and generalization.
-
Linear and logistic regression lecture notes; lab 01: introduction, lab 02: regression; assignment 02
This lecture introduces parametric approaches to supervised learning and linear models. Linear regression is expressed as a maximum likelihood estimation problem. The discussed concepts include: (a) parametric methods, (b) maximum likelihood estimates, (c) linear regression, and (d) logistic regression.
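As a minimal illustrative sketch (not the lab code), both models can be fit with scikit-learn, one of the tools listed above; the synthetic data and parameter values here are assumptions chosen only for demonstration:

```python
# Minimal sketch: linear and logistic regression with scikit-learn
# on synthetic data (the data and parameters are illustrative only).
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)

# Linear regression: y = 3x + 2 + noise. Under Gaussian noise, the
# least-squares fit coincides with the maximum likelihood estimate.
X = rng.uniform(-1, 1, size=(100, 1))
y = 3 * X.ravel() + 2 + rng.normal(scale=0.1, size=100)
lin = LinearRegression().fit(X, y)
print(lin.coef_, lin.intercept_)        # close to [3.] and 2

# Logistic regression: binary labels from a linear decision boundary.
labels = (X.ravel() > 0).astype(int)
logit = LogisticRegression().fit(X, labels)
print(logit.predict([[0.5], [-0.5]]))   # -> [1 0]
```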
References:
- Tsanas A, Xifara A. Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools. Energy and Buildings. 2012 Jun 1;49:560-7.
- Linear Models
-
Dimension reduction slides; lab 03: dimension reduction
This lecture discusses how to tackle high-dimensional learning problems and how to reduce dimensionality with the principal component analysis (PCA) method.
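A minimal sketch of PCA with scikit-learn appears below; the digits dataset and the choice of two components are illustrative assumptions, not part of the lab:

```python
# Minimal sketch: projecting 64-dimensional digit images onto their
# first 2 principal components (the component count is illustrative).
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)        # shape (1797, 64)
pca = PCA(n_components=2).fit(X)
X2 = pca.transform(X)                      # shape (1797, 2)
print(X2.shape, pca.explained_variance_ratio_)
```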
-
Model evaluation and selection slides; lab 04: model assessment
This lecture discusses model assessment for supervised machine learning. The discussed topics include: (a) training and test sets, (b) cross-validation, (c) the bootstrap, (d) metrics of model complexity, and (e) performance metrics for classification and regression.
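The sketch below shows, with scikit-learn, a held-out test set versus k-fold cross-validation; the model, dataset, and k=5 are illustrative choices only:

```python
# Minimal sketch: train/test split and 5-fold cross-validation
# (model, dataset, and k are illustrative assumptions).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit on the training set, evaluate on the held-out test set.
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))

# Cross-validation averages over several splits for a stabler estimate.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("CV accuracy:", scores.mean())
```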
-
Regularized linear regression and nearest-neighbors methods slides; lab 05: kNN
This lecture introduces the concept of regularization as a means of controlling the complexity of the hypothesis space and applies it to linear models. Furthermore, non-parametric methods are illustrated with nearest-neighbors approaches. The discussed topics are: lasso, ridge regression, structured regularization, non-parametric learning, and k-nearest neighbors.
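As a rough sketch with scikit-learn, ridge and lasso regularization and a kNN regressor can be contrasted as follows; the synthetic data and the alpha and k values are assumptions for illustration:

```python
# Minimal sketch: regularized linear models (ridge, lasso) and a
# non-parametric k-nearest-neighbors regressor (alpha and k illustrative).
import numpy as np
from sklearn.linear_model import Ridge, Lasso
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
# Only the first two of the ten features are informative.
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0).fit(X, y)   # shrinks all coefficients toward 0
lasso = Lasso(alpha=0.1).fit(X, y)   # drives irrelevant ones to exactly 0
print(np.round(lasso.coef_, 2))      # sparse coefficient vector

# Non-parametric alternative: predict from the 5 nearest training points.
knn = KNeighborsRegressor(n_neighbors=5).fit(X, y)
print(knn.predict(X[:3]))
```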
References:
- Linear Models
- Nearest Neighbors
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction
  - Ridge regression: section 3.4.1
  - Lasso: section 3.4.2
  - Regularization: section 10.12
-
Tree-based methods slides; lab 06.1: decision trees; lab 06.2: tree-based methods
This lecture discusses decision tree approaches and shows how to combine simple classifiers to yield state-of-the-art predictors.
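A minimal sketch of this idea with scikit-learn is given below, comparing a single decision tree against a random forest ensemble; the dataset and hyperparameters are illustrative assumptions:

```python
# Minimal sketch: a single decision tree vs. a random forest ensemble
# (dataset and hyperparameters are illustrative).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0)

# Averaging many decorrelated trees typically beats one deep tree.
print("tree  :", cross_val_score(tree, X, y, cv=5).mean())
print("forest:", cross_val_score(forest, X, y, cv=5).mean())
```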
References:
- Quinlan, J. Ross. Induction of decision trees. Machine Learning 1, no. 1 (1986): 81-106
- Breiman, Leo. Random forests. Machine Learning 45, no. 1 (2001): 5-32
- Schapire, Robert E. The strength of weak learnability. Machine Learning 5, no. 2 (1990): 197-227
- Breiman, Leo. Bagging predictors. Machine Learning 24, no. 2 (1996): 123-140
- An Introductory Tutorial on kd-trees
- Voronoi Tessellation
- A complete tutorial on tree-based modeling
- How to visualize decision trees
-
Support vector machines
This lecture introduces support vector machines from their principles in the case of linearly separable data and shows how positive-definite kernels can be used to extend the approach to non-linear separating functions.
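The sketch below illustrates the kernel idea with scikit-learn on data that is not linearly separable; the two-moons dataset and the kernel settings are assumptions for demonstration:

```python
# Minimal sketch: linear-kernel vs. RBF-kernel SVM on non-linearly
# separable data (dataset and parameters are illustrative).
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

# The RBF kernel implicitly maps points to a space where a linear
# separator exists, yielding a non-linear boundary in the input space.
print("linear:", cross_val_score(SVC(kernel="linear"), X, y, cv=5).mean())
print("rbf   :", cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean())
```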
References:
- Burges, Christopher JC. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery 2, no. 2 (1998): 121-167
-
Clustering slides; lab 08: k-means
This lecture introduces clustering, a common unsupervised learning problem. Its concepts are illustrated through hierarchical clustering, k-means, and DBSCAN.
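As a minimal sketch with scikit-learn, k-means can be run as follows; the synthetic blobs and the choice k=3 are assumptions (here k is known because we generated the data; in practice it must be chosen):

```python
# Minimal sketch: k-means on synthetic blobs (k=3 is an assumption
# that happens to match how the data was generated).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)    # one centroid per cluster
print(km.labels_[:10])        # cluster assignment for each point
```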