COMP 6211I Trustworthy Machine Learning [Spring 2023]
Monday, 13:30-14:50 @ Room 6591
Friday, 9:00-10:20 @ Room 6591
Overview
- Instructor: Minhao Cheng (minhaocheng@cse.ust.hk)
- Office hours: Tuesday 13:00-14:00 or by appointment @ Room 2542
- Teaching assistant:
- Zeyu Qin’s (zqinao@connect.ust.hk) office hours: Wednesday 10:00-11:00
- Canvas: COMP6211I
Announcements
- 2023-02-17: Please sign up for the paper presentation using this link.
- 2023-02-13: The exam will be held on Feb 20th during class time.
- 2023-02-06: Welcome to COMP 6211I!
Description
This is an intensive graduate seminar on trustworthy machine learning. The course covers topics in emerging research areas related to the broader study of security and privacy in machine learning. Students will learn about attacks against computer systems that leverage machine learning, as well as defense techniques to mitigate such attacks.
Prerequisites
The course assumes students already have a basic understanding of machine learning. Students will familiarize themselves with the emerging body of literature from different research communities investigating these questions. The class is designed to help students explore new research directions and applications. Most of the course readings will come from both seminal and recent papers in the field.
Grading Policy
- Paper presentation (25%)
- Paper summaries (10%)
- Class notes & participation (15%)
- Exam (15%)
- Research project (35%)
Assignments
A one-page summary of the assigned reading is due each class (from week 2 onwards). A physical copy should be turned in before the beginning of class. The summary should address the following: (a) what did the papers do well?, (b) where did the papers fall short?, (c) what did you learn from these papers?, and (d) what questions do you have about the papers?
- Paper presentation: starting from week 2, a team of students will present the papers assigned for reading each week. The team may choose an appropriate format (e.g., slides, interactive demos or code tutorials, …) for the presentation, with the only requirements being that the presentation (a) involves the class in active discussion, (b) covers all papers assigned for reading, and (c) lasts no more than 1 hour 30 minutes, including discussion.
- Class notes: another team of students will be responsible for writing notes that synthesize the content of the presentation and the class discussion.
Research Projects
Students are required to complete a project in this class. The goal of the course project is to give students an opportunity to explore research directions in trustworthy machine learning. The project should be related to the course content. An expected project consists of:
- A novel and sound solution to an interesting problem
- Comprehensive literature review and discussion
- Thorough theoretical/experimental evaluation and comparisons with existing approaches
Tentative Schedule and Material
Date | Topic | Slides | Readings & Links | Assignments |
---|---|---|---|---|
Fri 3/2 | Overview of Trustworthy Machine Learning | lecture_0 | | |
Mon 6/2 | Machine learning basics part 1 | lecture_1 | | |
Fri 10/2 | Machine learning basics part 2 | lecture_2 | | |
Mon 13/2 | Machine learning basics part 3 | lecture_3 | | |
Fri 17/2 | Machine learning basics part 4 | lecture_4 | | |
Mon 20/2 | Exam | | | |
Fri 24/2 | Test-time integrity (attack) | slides | White-box attack: • Goodfellow et al., Explaining and Harnessing Adversarial Examples • Carlini and Wagner, Towards Evaluating the Robustness of Neural Networks • Moosavi-Dezfooli et al., Universal adversarial perturbations. Hard-label black-box attack: • Brendel et al., Decision-based adversarial attacks: reliable attacks against black-box machine learning models • Cheng et al., Query-efficient hard-label black-box attack: an optimization-based approach • Chen et al., HopSkipJumpAttack: A Query-Efficient Decision-Based Attack | |
Mon 27/2 | Test-time integrity (defense) | slides | • Madry et al., Towards Deep Learning Models Resistant to Adversarial Attacks • Wong et al., Fast is better than Free: Revisiting Adversarial Training • Zhang et al., Theoretically Principled Trade-off between Robustness and Accuracy | |
Fri 3/3 | Training-time integrity (backdoor attack) | slides | • Liu et al., Trojaning Attack on Neural Networks • Shafahi et al., Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks • Gu et al., BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain | |
Mon 6/3 | Training-time integrity (defense) | slides | • Wang et al., Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks • Wang et al., Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases | |
Fri 10/3 | Test-time integrity (verification) part 1 | slides | • Wong and Kolter, Provable defenses against adversarial examples via the convex outer adversarial polytope • Zhang et al., Efficient Neural Network Robustness Certification with General Activation Functions Optional: • Zhang et al., General Cutting Planes for Bound-Propagation-Based Neural Network Verification | |
Mon 13/3 | Test-time integrity (verification) part 2 | slides | • Cohen et al., Certified Adversarial Robustness via Randomized Smoothing | |
Fri 17/3 | Training-time integrity (poisoning attack) | | • Koh and Liang, Understanding Black-box Predictions via Influence Functions • Carlini and Terzis, Poisoning and Backdooring Contrastive Learning • Carlini, Poisoning the Unlabeled Dataset of Semi-Supervised Learning | |
Mon 20/3 | Confidentiality (data) attack | slides | • Carlini et al., Extracting Training Data from Large Language Models • Kahla et al., Label-Only Model Inversion Attacks via Boundary Repulsion | |
Fri 24/3 | Privacy attacks | slides | • Shokri et al., Membership Inference Attacks against Machine Learning Models • Fredrikson et al., Model inversion attacks that exploit confidence information and basic countermeasures • Choquette-Choo et al., Label-Only Membership Inference Attacks | |
Mon 27/3 | Confidentiality (model) | slides | • Jagielski et al., High Accuracy and High Fidelity Extraction of Neural Networks • Tramer et al., Stealing Machine Learning Models via Prediction APIs | |
Fri 31/3 | Confidentiality defense | slides | • Huang et al., Unlearnable Examples: Making Personal Data Unexploitable • Maini et al., Dataset Inference: Ownership Resolution in Machine Learning | |
Mon 3/4 | Fairness | slides | • Zhao et al., Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints • Dwork et al., Fairness Through Awareness • Caliskan et al., Semantics derived automatically from language corpora contain human-like biases | |
Fri 7/4 | Study break | | | |
Mon 10/4 | Study break | | | |
Fri 14/4 | Differential privacy part I | slides | • Dwork et al., Calibrating Noise to Sensitivity in Private Data Analysis • Abadi et al., Deep Learning with Differential Privacy | |
Mon 17/4 | Differential privacy part II | slides | • Papernot et al., Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data • Mironov, Rényi Differential Privacy | |
Fri 21/4 | Interpretability (XAI) part 1 | slides | • Simonyan et al., Deep inside convolutional networks: Visualising image classification models and saliency maps • Selvaraju et al., Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization | |
Mon 24/4 | Interpretability (XAI) part 2 | slides | • Ribeiro et al., “Why Should I Trust You?”: Explaining the Predictions of Any Classifier • Lundberg and Lee, A unified approach to interpreting model predictions | |
Fri 28/4 | Safety | | • Athalye et al., Synthesizing Robust Adversarial Examples • Xu et al., Adversarial T-shirt! Evading Person Detectors in A Physical World | |
Mon 1/5 | Labor day | | | |
Fri 5/5 | Uncertainty | slides | • Guo et al., On Calibration of Modern Neural Networks • Minderer et al., Revisiting the Calibration of Modern Neural Networks | |
Mon 8/5 | Project Presentation | | | |
References
There is no required textbook for this course. Some recommended readings are
- Deep Learning (by Ian Goodfellow, Yoshua Bengio, Aaron Courville)
- Adversarial Robustness for Machine Learning (by Pin-Yu Chen and Cho-Jui Hsieh)