COMP 6211I Trustworthy Machine Learning [Spring 2023]

Monday, 13:30-14:50 @ Room 6591

Friday, 9:00-10:20 @ Room 6591

Overview

Instructor: Minhao Cheng (minhaocheng@cse.ust.hk)
- Office hours: Tuesday 13:00-14:00 or by appointment @ Room 2542
Teaching assistant:
- Zeyu Qin (zqinao@connect.ust.hk), office hours: Wednesday 10:00-11:00
Canvas: COMP6211I

Announcements

2023-02-17: Please sign up for the paper presentation using this link.
2023-02-13: The exam will be held on Feb 20th during class time.
2023-02-06: Welcome to COMP 6211I!

Description

This is an intensive graduate seminar on trustworthy machine learning. The course covers emerging research areas in the broader study of security and privacy in machine learning. Students will learn about attacks against computer systems that leverage machine learning, as well as defense techniques to mitigate such attacks.

Prerequisites

The course assumes students already have a basic understanding of machine learning. Students will familiarize themselves with the emerging body of literature from different research communities investigating these questions. The class is designed to help students explore new research directions and applications. Most of the course readings will come from both seminal and recent papers in the field.

Grading Policy

Paper presentation (25%)
Paper summaries (10%)
Class notes & participation (15%)
Exam (15%)
Research project (35%)

Assignments

Reading summary:

A one-page summary of the assigned reading is due each class, starting from week 2. A physical copy should be turned in before the beginning of class. The summary should cover the following: (a) what did the papers do well?, (b) where did the papers fall short?, (c) what did you learn from these papers?, and (d) what questions do you have about the papers?

Paper presentation: Starting from week 2, a team of students will present the papers assigned for reading each week. The team may choose an appropriate format (e.g., slides, interactive demos, or code tutorials) for this presentation. The only requirements are that the presentation should (a) involve the class in active discussions, (b) cover all papers assigned for reading, and (c) last no more than 1 hour 30 minutes including discussions.
Class notes: Another team of students will be charged with writing notes synthesizing the content of the presentation and class discussion.

Research Projects

Students are required to do a project in this class. The goal of the course project is to give students an opportunity to explore research directions in trustworthy machine learning. The project should be related to the course content. An expected project consists of:

A novel and sound solution to an interesting problem
Comprehensive literature review and discussion
Thorough theoretical/experimental evaluation and comparisons with existing approaches

Tentative Schedule and Material

Date	Topic	Slides	Readings & links
Fri 3/2	Overview of Trustworthy Machine Learning	lecture_0
Mon 6/2	Machine learning basics part 1	lecture_1
Fri 10/2	Machine learning basics part 2	lecture_2
Mon 13/2	Machine learning basics part 3	lecture_3
Fri 17/2	Machine learning basics part 4	lecture_4
Mon 20/2	Exam
Fri 24/2	Test-time integrity (attack)	slides	White-box attack: • Goodfellow et al., Explaining and Harnessing Adversarial Examples • Carlini and Wagner, Towards Evaluating the Robustness of Neural Networks • Moosavi-Dezfooli et al., Universal adversarial perturbations Hard-label black-box attack: • Brendel et al., Decision-based adversarial attacks: reliable attacks against black-box machine learning models • Cheng et al., Query-efficient hard-label black-box attack: an optimization-based approach • Chen et al., HopSkipJumpAttack: A Query-Efficient Decision-Based Attack
Mon 27/2	Test-time integrity (defense)	slides	• Madry et al., Towards Deep Learning Models Resistant to Adversarial Attacks • Wong et al., Fast is better than Free: Revisiting Adversarial Training • Zhang et al., Theoretically Principled Trade-off between Robustness and Accuracy
Fri 3/3	Training-time integrity (backdoor attack)	slides	• Liu et al., Trojaning Attack on Neural Networks • Shafahi et al., Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks • Gu et al., BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain
Mon 6/3	Training-time integrity (defense)	slides	• Wang et al., Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks • Wang et al., Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases
Fri 10/3	Test-time integrity (verification) part 1	slides	• Wong and Kolter, Provable defenses against adversarial examples via the convex outer adversarial polytope • Zhang et al., Efficient Neural Network Robustness Certification with General Activation Functions Optional: • Zhang et al., General Cutting Planes for Bound-Propagation-Based Neural Network Verification
Mon 13/3	Test-time integrity (verification) part 2	slides	• Cohen et al., Certified Adversarial Robustness via Randomized Smoothing
Fri 17/3	Training-time integrity (poisoning attack)		• Koh and Liang, Understanding Black-box Predictions via Influence Functions • Carlini and Terzis, Poisoning and Backdooring Contrastive Learning • Carlini, Poisoning the Unlabeled Dataset of Semi-Supervised Learning
Mon 20/3	Confidentiality (data) attack	slides	• Carlini et al., Extracting Training Data from Large Language Models • Kahla et al., Label-Only Model Inversion Attacks via Boundary Repulsion
Fri 24/3	Privacy attacks	slides	• Shokri et al., Membership Inference Attacks against Machine Learning Models • Fredrikson et al., Model inversion attacks that exploit confidence information and basic countermeasures • Choquette-Choo et al., Label-Only Membership Inference Attacks
Mon 27/3	Confidentiality (model)	slides	• Jagielski et al., High Accuracy and High Fidelity Extraction of Neural Networks • Tramer et al., Stealing Machine Learning Models via Prediction APIs
Fri 31/3	Confidentiality defense	slides	• Huang et al., Unlearnable Examples: Making Personal Data Unexploitable • Maini et al., Dataset Inference: Ownership Resolution in Machine Learning
Mon 3/4	Fairness	slides	• Zhao et al., Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints • Dwork et al., Fairness Through Awareness • Caliskan et al., Semantics derived automatically from language corpora contain human-like biases
Fri 7/4	Study break
Mon 10/4	Study break
Fri 14/4	Differential privacy part I	slides	• Dwork et al., Calibrating Noise to Sensitivity in Private Data Analysis • Abadi et al., Deep Learning with Differential Privacy
Mon 17/4	Differential privacy part II	slides	• Papernot et al., Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data • Mironov, Rényi Differential Privacy
Fri 21/4	Interpretability (XAI) part 1	slides	• Simonyan et al., Deep inside convolutional networks: Visualising image classification models and saliency maps • Selvaraju et al., Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Mon 24/4	Interpretability (XAI) part 2	slides	• Ribeiro et al., “Why Should I Trust You?”: Explaining the Predictions of Any Classifier • Lundberg and Lee, A unified approach to interpreting model predictions
Fri 28/4	Safety		• Athalye et al., Synthesizing Robust Adversarial Examples • Xu et al., Adversarial T-shirt! Evading Person Detectors in A Physical World
Mon 1/5	Labor Day
Fri 5/5	Uncertainty	slides	• Guo et al., On Calibration of Modern Neural Networks • Minderer et al., Revisiting the Calibration of Modern Neural Networks
Mon 8/5	Project Presentation

References

There is no required textbook for this course. Some recommended readings are:

Deep Learning (by Ian Goodfellow, Yoshua Bengio, Aaron Courville)
Adversarial Robustness for Machine Learning (by Pin-Yu Chen and Cho-Jui Hsieh)