CS565600 Deep Learning

Fundamentals of machine learning, deep learning, and AI.

Description

This class introduces the concepts and practices of deep learning. The course consists of three parts. In the first part, we give a quick introduction to classical machine learning and review some key concepts required to understand deep learning. In the second part, we discuss how deep learning differs from classical machine learning and explain why it is effective in dealing with complex problems such as image and natural language processing. Various CNN and RNN models will be covered. In the third part, we introduce deep reinforcement learning and its applications.

This course also includes coding labs. We will use Python 3 as the main programming language throughout the course. Some popular machine learning libraries such as Scikit-learn and TensorFlow 2.0 will be used and explained in detail.

Syllabus

Teaching Assistants

Yung-Cheng Chen
陳永承*

Yen-Ting Wang
王彥婷

Jou-Hsuan Yang
楊柔暄

Sheng-Wei Cheng
鄭勝偉

Wei-Hung Chang
張維紘


Mail: dl2024@datalab.cs.nthu.edu.tw


Time & Location

  • Tue. 3:30-5:20pm at Delta 107
  • Thu. 3:30-4:20pm at Delta 107
  • Office hour: Thu. 4:20pm-5:10pm at Delta 723

Grading Policy

  • Quiz: 10%
  • Lab Assignments: 20%
  • Competition: 45%
  • Final: 25%
  • Bonus: 10%

Prerequisites

This course is intended for senior undergraduate and junior graduate students who have a proper understanding of

  • Python Programming Language
  • Calculus
  • Linear Algebra
  • Probability Theory
Although it would be helpful, knowledge about classical machine learning is NOT required.

IMPORTANT

Students will form groups of 2 to 4 people. This class requires each group to prepare a GPU card to perform the necessary computing. You can follow this link to decide which GPU card to go for. NO GPU CARD IS PROVIDED IN THE CLASS.

Announcement

Curriculum

If you have any feedback, feel free to contact: shwu [AT] cs.nthu.edu.tw

Lecture 01

Introduction

What's ML? | What's Deep Learning? | About This Course | FAQ

Slides Notation

Scientific Python 101 (No Assignment)

This lab guides you through the setup of a scientific Python environment and provides useful references for self-reading.

Notebook

Lecture 02

Linear Algebra

Span & Linear Dependence | Norms | Eigendecomposition | Singular Value Decomposition | Traces & Determinant

Video Slides

Data Exploration & PCA (Bonus)

This lab guides you through the process of Exploratory Data Analysis (EDA) and discusses how you can leverage Principal Component Analysis (PCA) to visualize and understand high-dimensional data.

Notebook

Lecture 03

Probability & Information Theory

Random Variables & Probability Distributions | Multivariate & Derived Random Variables | Bayes’ Rule & Statistics | Principal Components Analysis | Technical Details of Random Variables | Common Probability Distributions | Common Parametrizing Functions | Information Theory | Decision Trees & Random Forest

Video Slides

Decision Trees & Random Forest (Bonus)

In this lab, we will apply the Decision Tree and Random Forest algorithms to the classification and dimension reduction problems using the Wine dataset.

Notebook

Lecture 04

Numerical Optimization

Numerical Computation | Optimization Problems | Gradient Descent | Newton's Method | Stochastic Gradient Descent | Perceptron | Adaline | Constrained Optimization | Linear & Polynomial Regression | Generalizability & Regularization | Duality

Video Slides

Perceptron & Adaline (Bonus)

In this lab, we will guide you through the implementation of Perceptron and Adaline, two of the first machine learning algorithms for the classification problem. We will also discuss how to train these models using the optimization techniques.
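
As a rough illustration of the perceptron learning rule mentioned above, here is a minimal sketch in plain Python. It is not taken from the lab notebook: the toy data (logical AND), the learning rate, and the function names `predict` and `train_perceptron` are all assumptions made for this example.

```python
# Minimal perceptron sketch. Data, names, and hyperparameters are illustrative
# assumptions, not the lab's actual code.

def predict(weights, bias, x):
    """Return +1 if the net input is non-negative, otherwise -1."""
    net = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if net >= 0 else -1


def train_perceptron(samples, labels, lr=1.0, epochs=20):
    """Perceptron rule: update the weights only on misclassified samples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            if predict(weights, bias, x) != y:
                # Move the decision boundary toward the misclassified sample.
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
    return weights, bias


# Logical AND with labels in {-1, +1}; this data is linearly separable.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, -1, -1, 1]
w, b = train_perceptron(X, y)
predictions = [predict(w, b, x) for x in X]
```

Adaline differs in that it updates on the continuous net input (before thresholding) for every sample, which makes the objective differentiable and suitable for gradient descent.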

Notebook

Regression (Bonus)

This lab guides you through linear and polynomial regression using the Housing dataset. We will also extend the Decision Tree and Random Forest classifiers to solve the regression problem.

Notebook

Lecture 05

Learning Theory & Regularization

Learning Theory | Point Estimation | Bias & Variance | Consistency | Decomposing Generalization Error | Weight Decay | Validation

Slides

Regularization

In this lab, we will guide you through some common regularization techniques such as weight decay, sparse weight, and validation.
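
To make the idea of weight decay concrete, the following sketch fits a one-parameter linear model by gradient descent, with and without an L2 penalty on the weight. The data, the penalty strength, and the helper name `fit_slope` are assumptions chosen for illustration; the lab itself may use a different model and library.

```python
# Weight decay sketch: minimize mean squared error + weight_decay * w**2.
# All names and numbers here are illustrative assumptions.

def fit_slope(xs, ys, weight_decay=0.0, lr=0.01, steps=2000):
    """Fit y ~ w * x by gradient descent with an optional L2 penalty on w."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_mse = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        grad_penalty = 2 * weight_decay * w  # gradient of weight_decay * w**2
        w -= lr * (grad_mse + grad_penalty)
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
w_plain = fit_slope(xs, ys)                    # no regularization
w_decay = fit_slope(xs, ys, weight_decay=5.0)  # penalty shrinks w toward zero
```

The penalized fit ends up with a smaller weight than the unpenalized one, which is the shrinkage effect that weight decay is meant to produce.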

Notebook Slides

Competition 01

Predicting News Popularity

In this competition, you are provided with a supervised dataset consisting of the raw content and binary popularity of news articles. What you need to do is to learn a function that is able to predict the popularity of an unseen news article.

Notebook Slides Kaggle

Lecture 06

Probabilistic Models

Probabilistic Models | Maximum Likelihood Estimation | Linear Regression | Logistic Regression | Maximum A Posteriori Estimation | Bayesian Estimation and Inference | Gaussian Process

Slides

Logistic Regression & Metrics

In this lab, we will guide you through the practice of Logistic Regression. We will also introduce some common evaluation metrics other than the "accuracy" that we have been using so far.
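
As a small illustration of why metrics beyond accuracy matter, the sketch below computes accuracy, precision, recall, and F1 for a binary classifier from scratch. The labels and the helper name `binary_metrics` are made up for this example; in practice a library such as Scikit-learn provides equivalent functions.

```python
# Binary classification metrics computed from confusion-matrix counts.
# The example labels are an illustrative assumption.

def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Imbalanced toy data: the classifier finds only one of three positives.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

Here the accuracy looks reasonable even though most positive examples are missed, which is exactly the situation where recall and F1 are more informative.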

Notebook Slides

Neural Networks from Scratch (No Assignment)

In this tutorial, you will learn the fundamentals of building neural networks with NumPy alone, without the help of deep learning frameworks.

Notebook

TensorFlow 101 (No Assignment)

We are going to use TensorFlow as our framework in the following lectures. In this lab, you will learn how to install TensorFlow and get a better understanding by implementing a classical deep learning algorithm.

Notebook Slides

Lecture 07

Non-Parametric Methods & SVMs (Suggested Reading)

KNNs | Parzen Windows | Local Models | Support Vector Classification (SVC) | Slacks | Nonlinear SVC | Dual Problem | Kernel Trick

Slides

SVMs & Scikit-Learn Pipelines (Bonus)

In this lab, we will classify nonlinearly separable data using the KNN and SVM classifiers. We will show how to pack multiple data preprocessing steps into a single Pipeline in Scikit-learn to simplify the training workflow.

Notebook

Lecture 08

Cross Validation & Ensembling (Suggested Reading)

Cross Validation | How Many Folds? | Voting | Bagging | Boosting | Why AdaBoost Works?

Slides

CV & Ensembling (Bonus)

In this lab, we will guide you through the cross validation technique for hyperparameter selection. We will also practice and compare some ensemble learning techniques.
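
The core of cross validation is the way the data is split into folds. The sketch below shows one possible k-fold split in plain Python; the helper name `k_fold_indices` is hypothetical, and the split is done without shuffling for simplicity (in practice the indices are usually shuffled first, or a library routine is used).

```python
# k-fold split sketch: each sample appears in exactly one validation fold.
# The function name and the unshuffled split are illustrative assumptions.

def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validation:", val_idx)
```

For hyperparameter selection, one would train a model on each `train_idx`, score it on the matching `val_idx`, average the scores per hyperparameter setting, and keep the setting with the best average.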

Notebook

Lecture 09

Neural Networks: Design

NN Basics | Learning the XOR | Back Propagation | Cost Function & Output Neurons | Hidden Neurons | Architecture Design & Tuning

Slides

Lecture 10

Neural Networks: Optimization & Regularization

Momentum & Nesterov Momentum | AdaGrad & RMSProp | Batch Normalization | Continuation Methods & Curriculum Learning | NTK-based Initialization | Cyclic Learning Rates | Weight Decay | Data Augmentation | Dropout | Manifold Regularization | Domain-Specific Model Design

Slides

Word2Vec

In this lab, we will introduce a neural network, called the word2vec, that embeds words into a dense vector space where semantically similar words are mapped to nearby points.

Notebook Slides

Lecture 11

Convolutional Neural Networks

Convolution Layers | Pooling Layers | Variants & Case Studies | Visualizing Activations | Visualizing Filters/Kernels | Visualizing Gradients | Dreaming and Style Transfer | Segmentation and Localization | Object Detection | More Applications

Slides

Convolutional Neural Networks & Data Pipelines

In this lab, we will introduce two datasets, MNIST and CIFAR-10, and then discuss how to implement CNN models for them using TensorFlow. We will also offer a guide that illustrates a typical input pipeline in TensorFlow 2.0.

Notebook Slides

Visualization, Style Transfer & Save and Load Models

This lab shows how to load and use a pretrained VGG19 model and how to visualize what CNNs have learned in selected layers. It also introduces an interesting technique called "Style Transfer" and displays galleries of its creative outputs. Last but not least, we will demonstrate how to save and load models during training and briefly explain the TensorFlow family.

Notebook Slides

Competition 02

Object Detection & Localization

In this competition, you should design a model to detect multiple objects in an image. Object detection is a multi-task learning problem, which means the model has to localize and classify several objects simultaneously.

Notebook Slides

Lecture 12

Recurrent Neural Networks & Transformers

Vanilla RNNs | Design Alternatives | Backprop through Time (BPTT) | Optimization Techniques | Optimization-Friendly Models & LSTM | Parallelism & Teacher Forcing | Attention | Explicit Memory | Adaptive Computation Time (ACT) | Visualization | Memory Networks | Google Neural Machine Translation | Transformers | Subword Tokenization

Slides

Seq2Seq Learning for Machine Translation

This lab shows how to use recurrent neural networks to model continuous sequences such as natural language, and how to apply them not only to article comprehension but also to word generation.

Notebook

Image Caption

In this lab, we introduce how to design a model that takes an image as input and generates a suitable caption describing it. To accomplish this, you'll use an attention-based model, which enables us to see what parts of the image the model focuses on as it generates a caption.

Notebook Slides

Lecture 13

Unsupervised Learning & Generative AI

Text Models & Image Models | Clustering | Factorization | Dimension Reduction | ChatGPT | Autoencoders & Manifold Learning | Variational Autoencoders (VAE) | Flow-based Models | Diffusion Models | Generative Adversarial Networks (GANs)

Slides

Diffusion Models

In this lab, we are going to introduce Diffusion Models.

Notebook Slides

Competition 03

Reverse Image Caption

In this competition, given a set of texts, your task is to generate suitable images to illustrate each of the texts. We will guide you to use GANs to complete this competition.

Notebook Slides

Lecture 14

Reinforcement Learning

Markov Decision Process (MDP) | Model-Free RL using Monte Carlo Estimation | Temporal-Difference Estimation and SARSA | Exploration Strategies | Q-Learning

Slides

Q-learning

In this lab, we will introduce temporal-difference learning and then use Q-learning to train an agent to play the "Flappy Bird" game.
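
To give a feel for the Q-learning update before tackling a full game, here is a minimal tabular sketch on a hypothetical five-state chain, where the agent starts at the left end and receives a reward of 1 for reaching the right end. The environment, the hyperparameters, and all function names are assumptions made for this example and are unrelated to the lab's actual "Flappy Bird" setup.

```python
# Tabular Q-learning sketch on a toy chain environment (an assumption for
# illustration; the lab uses a different environment).
import random

N_STATES = 5       # states 0..4; state 4 is terminal
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic chain: reward 1.0 only when the terminal state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy_action(q_row, rng):
    """Pick a best-valued action, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy behavior policy.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q[state], rng)
            next_state, reward, done = step(state, action)
            # Temporal-difference target bootstraps from the best next action.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy for the non-terminal states.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

After training, the greedy policy moves right in every non-terminal state, and the learned values for moving right decay by a factor of `gamma` per step of distance from the goal.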

Notebook Slides

Lecture 15

Deep Reinforcement Learning

Introduction | Deep Q-Network | Double DQN | Prioritized Replay | Dueling Network | NoisyNet and Scalable Implementations | Policy Gradient Methods & DDPG | Episodic Policy Gradient & REINFORCE | Reducing Variance | Baseline Subtraction | Function Approximation, Actor-Critic, and A3C

Slides

PPOxGAE

In this lab, we will introduce PPOxGAE and use it to train frame-based and state-based agents to play the "Flappy Bird" game.

Notebook Slides

Resources

The following are links to some useful online resources. If this course starts your ML journey, don't stop here. Enroll yourself in advanced courses (shown below) to learn more.

Other Course Materials

For more course materials (such as assignments, score sheets, etc.) and the online forum, please refer to the eeclass system.

eeclass System

Reference Books

  • Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, MIT Press, 2016, ISBN: 0262035618

  • Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, Springer, 2009, ISBN: 0387848576

  • Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006, ISBN: 0387310738

  • Sebastian Raschka, Python Machine Learning, Packt Publishing, 2015, ISBN: 1783555130

Online Courses