IFML Seminar: Data Security in Federated Learning and Generative Models

Tom Goldstein, Associate Professor of Computer Science, Director of Machine Learning, University of Maryland

12:15 - 1:00 pm

The University of Texas at Austin
United States

Abstract: Machine learning systems are built using large troves of training data that may contain private or copyrighted content. In this talk, I'll survey a number of security issues that arise when sensitive data is used. I'll begin by talking about attack methods that extract private training data from federated learning protocols. Then, I'll discuss data privacy issues that arise when using generative models. These models are often created using a training objective that explicitly promotes their ability to regenerate their training data, causing a host of issues. I'll discuss how diffusion models can reproduce their training data, leading to potential legal issues. I'll also discuss methods for watermarking large language models, and I'll explore ways in which the ability to reproduce training data complicates our ability to detect LLM-produced text.

Bio: Tom Goldstein is the Perotto Associate Professor of Computer Science at the University of Maryland and director of the Maryland Center for Machine Learning. His research lies at the intersection of machine learning and optimization, and targets applications in computer vision and signal processing. Professor Goldstein has received several awards, including SIAM’s DiPrima Prize, a DARPA Young Faculty Award, a JP Morgan Faculty Award, an Amazon Research Award, and a Sloan Fellowship.
