This talk introduces Continual Learning in general and takes a deep dive into the CVPR 2020 paper "Conditional Channel Gated Networks for Task-Aware Continual Learning".
Wiki: https://wiki.continualai.org
Arxiv: https://arxiv.org/abs/2004.00070
References to everything covered in the lecture: https://www.reddit.com/r/2D3DAI/comments/js69za/references_from_lecture_introduction_to_continual/
00:00 Intro
04:32 What is Continual Learning (CL)?
06:16 Catastrophic forgetting
08:07 Tasks (usually, each task takes the form of a classification dataset)
12:59 Continual learning: timeline
13:58 Continual Learning desiderata
16:12 Related works
17:37 Progressive Neural Networks
23:29 Learning without Forgetting
27:58 Elastic Weight Consolidation
29:00 Regularization approaches
30:01 Gradient Episodic Memory (GEM)
39:52 PackNet
40:05 Parameter isolation methods
40:55 Conditional Channel Gated Networks for Task-Aware Continual Learning (CVPR 2020)
42:26 Motivation
43:16 Joint prediction of task and class
45:05 Task-incremental learning of class labels
48:51 Sparsity dynamics
01:02:00 Class-incremental learning of task labels
01:09:10 Episodic or generative memory?
01:14:20 Limitations
01:20:14 Conclusions and Future directions for continual learning
01:23:00 Discussion
[Chapters were auto-generated using our proprietary software - contact us if you are interested in access to the software]
Lecture abstract:
Neural networks struggle to learn continuously and suffer catastrophic forgetting when optimized on a sequence of learning problems: whenever the training distribution shifts, they overwrite old knowledge to fit the current examples. Continual Learning (CL) is the research area addressing this forgetting problem, and it has inspired a plethora of approaches and evaluation settings. This talk will discuss several successful strategies as well as some of their drawbacks. Moreover, we will introduce a CL model based on conditional computation: by equipping each layer with task-specific gating modules, the network can autonomously select which units to apply to a given input. By monitoring the activation patterns of these gates, we can identify the units that matter for the current task and freeze them before moving to the next one, ensuring no loss in performance. An extension will also be discussed, capable of dealing with the more general case in which, at test time, examples do not come with associated task labels.
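To make the gating idea more concrete, here is a minimal PyTorch-style sketch of a conv block whose output channels are selected by a small task-specific gating module. It is an illustrative assumption based on the abstract above, not the authors' code; names such as GatedConvBlock and the simple sigmoid/threshold relaxation are hypothetical.

# Minimal sketch (assumption, not the paper's implementation): a conv block
# whose output channels are gated by a lightweight per-task MLP.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch, num_tasks):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        # One gating MLP per task: globally pooled input features -> per-channel logits.
        self.gates = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_ch, 16), nn.ReLU(), nn.Linear(16, out_ch))
             for _ in range(num_tasks)]
        )

    def forward(self, x, task_id):
        h = F.relu(self.conv(x))
        # Gate logits conditioned on the input via global average pooling.
        logits = self.gates[task_id](x.mean(dim=(2, 3)))
        if self.training:
            # Relaxed (soft) gates during training; the paper uses a
            # differentiable binary relaxation, omitted here for brevity.
            g = torch.sigmoid(logits)
        else:
            g = (logits > 0).float()  # hard on/off channel selection at test time
        return h * g.unsqueeze(-1).unsqueeze(-1)

# Usage: after training task t, channels whose gates fire frequently are treated
# as relevant to that task and their kernels are frozen before task t+1.
block = GatedConvBlock(in_ch=3, out_ch=8, num_tasks=2)
y = block(torch.randn(4, 3, 32, 32), task_id=0)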
Presenter BIO:
Davide Abati is a machine learning researcher at Qualcomm AI Research, based in Amsterdam. He holds a master's degree in computer engineering from the University of Modena and Reggio Emilia, where he also pursued his Ph.D. in computer vision advised by Prof. Rita Cucchiara. His research spans several areas of computer vision, from human attention prediction to novelty detection and continual learning. His work has been published in top-tier conferences and journals such as CVPR, NeurIPS, and TPAMI, and he regularly serves as a reviewer for several IEEE Transactions journals.
His website: https://davideabati.info
————————-
Find us at:
Newsletter for updates about more events ➜ http://eepurl.com/gJ1t-D
Sub-reddit for discussions ➜ https://www.reddit.com/r/2D3DAI/
Discord server for, well, discord ➜ https://discord.gg/MZuWSjF
Blog ➜ https://2d3d.ai
We are the people behind the AI consultancy Abelians ➜ https://abelians.com/