Past seminars & Recordings
Recorded seminars can be replayed below.
Abstract: Standard machine learning produces models that are accurate on average but degrade dramatically when the test distribution of interest deviates from the training distribution. We consider three settings where this happens: when test inputs are subject to adversarial attacks, when we are concerned with performance on minority subpopulations, and when the world simply changes (classic domain shift). Our aim is to produce methods that are provably robust to such deviations. In this talk, I will (attempt to) summarize all the work my group has done on this topic over the last three years. We have found many surprises in our quest for robustness: for example, that the "more data" and "bigger models" strategy that works so well for average accuracy sometimes fails out-of-domain. On the other hand, we have found that certain tools such as analysis of linear regression and use of unlabeled data (e.g., robust self-training) have reliably delivered promising results across a number of different settings.
Bio: Percy Liang is an Associate Professor of Computer Science at Stanford University (B.S. from MIT, 2004; Ph.D. from UC Berkeley, 2011). His two research goals are (i) to make machine learning more robust, fair, and interpretable; and (ii) to make computers easier to communicate with through natural language. His awards include the Presidential Early Career Award for Scientists and Engineers (2019), IJCAI Computers and Thought Award (2016), an NSF CAREER Award (2016), a Sloan Research Fellowship (2015), and a Microsoft Research Faculty Fellowship (2014).
Zoom registration: https://us02web.zoom.us/webinar/register/WN_-M5Y9REHTMS2tf7e1X1-4w
YouTube live-stream and recording: https://www.youtube.com/watch?v=jCEo8PRJ9NA
Twitter thread to continue the conversation: https://twitter.com/trustworthy_ml/status/1321863535144529926?s=20
Irene's Abstract: Machine learning (ML) has demonstrated the potential to fundamentally improve healthcare because of its ability to find latent patterns in large observational datasets and scale insights rapidly. However, the use of ML in healthcare also raises numerous ethical concerns, especially as models can amplify existing health inequities. In this talk, I briefly outline two approaches to characterize inequality in ML and adapt models for patients without reliable access to healthcare. First, I decompose cost-based metrics of discrimination in supervised learning into bias, variance, and noise, and propose actions aimed at estimating and reducing each term. Second, I describe a deep generative model for disease subtyping while correcting for patient misalignment in disease onset time. I conclude with a pipeline for ethical machine learning in healthcare, ranging from problem selection to post-deployment considerations, and recommendations for future research.
Irene's Bio: Irene Chen is a computer science PhD student at MIT, advised by David Sontag. Her research focuses on machine learning methods to improve clinical care and deepen our understanding of human health, with applications in areas such as heart failure and intimate partner violence. Her work has been published in both machine learning conferences (NeurIPS) and medical journals (Nature Medicine, AMA Journal of Ethics), and covered by media outlets including MIT Tech Review, NPR/WGBH, and Stat News. Prior to her PhD, Irene received her AB in applied math and SM in computational engineering from Harvard University.
Arpita's Abstract: Major B2C eCommerce websites (such as Amazon and Spotify) are two-sided platforms, with customers on one side and producers on the other. Traditionally, recommendation protocols of these platforms are customer-centric, focusing on maximizing customer satisfaction by tailoring the recommendation according to the personalized preferences of individual customers. However, this may lead to unfair distribution of exposure among the producers and adversely impact their well-being. As more and more people depend on such platforms to earn a living, it is important to strike a balance between fairness among the producers and customer satisfaction. The problem of two-sided fairness in recommendation can be formulated as a hierarchically constrained fair allocation problem. This problem naturally captures a number of other resource-allocation applications, including budgeted course allocation and allocation of cloud computing resources. Our main contribution is to develop a polynomial time algorithm for the problem. In this talk, I’ll discuss the constrained fair allocation problem, and show how the solution can be applied to ensure a two-sided fair recommendation.
Arpita's Bio: Arpita Biswas completed her Ph.D. in the Department of Computer Science and Automation, Indian Institute of Science (IISc). During her Ph.D., she was a recipient of a Google Ph.D. Fellowship. Her Ph.D. dissertation provides algorithms and provable guarantees for fair decision making in resource allocation, recommendation, and classification domains. After completing her Ph.D., she joined Google Research as a Visiting Researcher, where she worked closely with a non-profit organization that aims at improving maternal health among low-income households in India by carrying out a free call-based program for spreading maternal care information. She is joining Harvard University as a Postdoctoral Research Fellow, starting from November 2020. Her primary areas of interest include Algorithmic Game Theory, Optimization, and Machine Learning, in particular multi-agent learning, incentive mechanisms, market algorithms, and scheduling. Thus far, she has worked on problems arising from real-world scenarios such as online crowd-sourcing, resource allocation, healthcare, dynamic pricing in transportation, and ride-sharing.
Zoom registration: https://us02web.zoom.us/webinar/register/WN_mOjYGQYcS2y8y9qYiPpNIQ
YouTube live-stream and recording: https://youtu.be/KM2vwajbasU
Twitter thread to continue the conversation: https://twitter.com/trustworthy_ml/status/1326944468122132480?s=20
Abstract: At 27, Dr. Ayanna Howard was hired by NASA to lead a team designing a robot for future Mars exploration missions that could “think like a human and adapt to change.” Her accomplishments since then include being named as one of 2015’s most powerful women engineers in the world and as one of Forbes’ 2018 U.S. Top 50 Women in Tech. From creating robots to studying the impact of global warming on the Antarctic ice shelves to founding a company that develops STEM education and therapy products for children and those with varying needs, Professor Howard focuses on our role in being responsible global citizens. In this talk, Professor Howard will delve into the implications of recent advances in robotics and AI and explain the critical importance of ensuring diversity and inclusion at all stages to reduce the risk of unconscious bias and ensuring robots are designed to be accessible to all. Throughout the talk, Professor Howard will weave in her own experience on developing new AI technologies through her technical leadership roles at NASA, Georgia Tech, and in technology startups.
Bio: Dr. Ayanna Howard is Chair of the School of Interactive Computing at the Georgia Institute of Technology. She also serves on the Board of Directors for Autodesk and the Partnership on AI. Prior to Georgia Tech, Dr. Howard was at NASA's Jet Propulsion Laboratory where she functioned as Deputy Manager in the Office of the Chief Scientist. To date, Dr. Howard’s unique accomplishments have been highlighted through a number of awards and articles, including being recognized as one of the 23 most powerful women engineers in the world by Business Insider and one of the Top 50 U.S. Women in Tech by Forbes. She regularly advises on issues concerning robotics, AI, and workforce development. Howard also serves on the board of CRA-WP, a nonprofit dedicated to broadening participation in computing research and education, as well as AAAS COOS, a board-appointed committee with the mandate to advise the Association on matters related to diversity in science, engineering, and related fields.
Zoom registration: https://us02web.zoom.us/webinar/register/WN_hlQKlJ52S0qCNMT_mXnDCg
YouTube live-stream and recording: This seminar will not be live-streamed or recorded.
Twitter thread to continue the conversation: https://twitter.com/trustworthy_ml/status/1329481200340123648?s=20
Dec 3, 2020: Jenn Wortman Vaughan, Microsoft Research
Intelligibility Throughout the Machine Learning Life Cycle
Abstract: People play a central role in the machine learning life cycle. Consequently, building machine learning systems that are reliable, trustworthy, and fair requires that relevant stakeholders—including developers, users, and the people affected by these systems—have at least a basic understanding of how they work. Yet what makes a system “intelligible” is difficult to pin down. Intelligibility is a fundamentally human-centered concept that lacks a one-size-fits-all solution. I will explore the importance of evaluating methods for achieving intelligibility in context with relevant stakeholders, ways of empirically testing whether intelligibility techniques achieve their goals, and why we should expand our concept of intelligibility beyond machine learning models to other aspects of machine learning systems, such as datasets and performance metrics.
Bio: Jenn Wortman Vaughan is a Senior Principal Researcher at Microsoft Research, New York City. Her research background is in machine learning and algorithmic economics. She is especially interested in the interaction between people and AI, and has often studied this interaction in the context of prediction markets and other crowdsourcing systems. In recent years, she has turned her attention to human-centered approaches to transparency, interpretability, and fairness in machine learning as part of MSR's FATE group and co-chair of Microsoft’s Aether Working Group on Transparency. Jenn came to MSR in 2012 from UCLA, where she was an assistant professor in the computer science department. She completed her Ph.D. at the University of Pennsylvania in 2009, and subsequently spent a year as a Computing Innovation Fellow at Harvard. She is the recipient of Penn's 2009 Rubinoff dissertation award for innovative applications of computer technology, a National Science Foundation CAREER award, a Presidential Early Career Award for Scientists and Engineers (PECASE), and a handful of best paper awards. In her "spare" time, Jenn is involved in a variety of efforts to provide support for women in computer science; most notably, she co-founded the Annual Workshop for Women in Machine Learning, which has been held each year since 2006.
YouTube live-stream and recording: https://youtu.be/bogHfN-RkaA
Twitter thread to continue the conversation: https://twitter.com/trustworthy_ml/status/1334554412673544194?s=20
Dec 17, 2020: Pin-Yu Chen, IBM Research
Practical Backdoor Attacks and Defenses in Machine Learning Systems
Abstract: A backdoor attack is a practical adversarial threat to modern machine learning systems, especially deep neural networks. It is a training-time adversarial attack that embeds Trojan patterns into a well-trained model to gain the ability to manipulate machine decision-making at the testing phase. In this talk, I will start by providing a comprehensive overview of adversarial robustness in the lifecycle of machine learning systems. Then, I will delve into recent backdoor attacks and practical defenses in different scenarios, including standard training and federated learning. The defenses include methods to detect and repair backdoored models. I will also cover a novel application of transfer learning with access-limited models based on the lessons learned from backdoor attacks.
Bio: Dr. Pin-Yu Chen is a research staff member at IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA. He is also the chief scientist of RPI-IBM AI Research Collaboration and PI of ongoing MIT-IBM Watson AI Lab projects. Dr. Chen received his Ph.D. degree in electrical engineering and computer science from the University of Michigan, Ann Arbor, USA, in 2016. Dr. Chen’s recent research focuses on adversarial machine learning and robustness of neural networks. His long-term research vision is building trustworthy machine learning systems. He has published more than 30 papers related to trustworthy machine learning at major AI and machine learning conferences, given tutorials at CVPR’20, ECCV’20, ICASSP’20, KDD’19, and Big Data’18, and organized several workshops for adversarial machine learning. He received a NeurIPS 2017 Best Reviewer Award, and was also the recipient of the IEEE GLOBECOM 2010 GOLD Best Paper Award.
Zoom registration: https://us02web.zoom.us/webinar/register/WN_3wXz3lgJTcSvGtC0re7NKA
YouTube live-stream and recording: https://youtu.be/RY8j_2zIvPY
Twitter thread to continue the conversation: Check back again after the seminar.
Lizzie's Abstract: As the public seeks greater accountability and transparency from machine learning algorithms, the research literature on methods to explain algorithms and their outputs has rapidly expanded. Feature importance, or the practice of assigning quantitative importance values to the input features of a machine learning model, forms a popular class of such methods. Much of the research on feature importance rests on formalizations that attempt to capture universally desirable properties. We investigate the ways in which epistemic values are implicitly embedded in these methods and analyze the ways in which they conflict with ideas from feminist philosophy. We offer some suggestions on how to conduct research on explanations that respects feminist epistemic values, taking into account the importance of social context and the epistemic privileges of subjugated knowers, and adopting more interactional ways of knowing.
Lizzie's Bio: Lizzie Kumar is a second-year Computing Ph.D. student advised by Suresh Venkatasubramanian at the University of Utah where her work has previously been supported by the ARCS Foundation. She is interested in the practice of analyzing the social impact of machine learning systems and developing responsible AI law and policy. Previously, she developed risk models on the Data Science team at MassMutual while completing her M.S. in Computer Science at the University of Massachusetts, and also holds a B.A. in Mathematics from Scripps College.
Amirata's Abstract: As data becomes the fuel driving technological and economic growth, a fundamental challenge is how to quantify the value of data in algorithmic predictions and decisions. For example, in healthcare and consumer markets, it has been suggested that individuals should be compensated for the data that they generate, but it is not clear what is an equitable valuation for individual data. In this talk, we discuss a principled framework to address data valuation in the context of supervised machine learning. Given a learning algorithm trained on a number of data points to produce a predictor, we propose data Shapley as a metric to quantify the value of each training datum to the predictor performance. Data Shapley value uniquely satisfies several natural properties of equitable data valuation. We introduce Monte Carlo and gradient-based methods to efficiently estimate data Shapley values in practical settings where complex learning algorithms, including neural networks, are trained on large datasets. We then briefly discuss the notion of distributional Shapley, where the value of a point is defined in the context of underlying data distribution.
Amirata's Bio: Amirata Ghorbani is a fifth-year Ph.D. student at Stanford University working with James Zou. His research is focused on problems in machine learning including equitable methods for data valuation, algorithms to interpret machine learning models, ways to make existing ML predictors more interpretable and fair, and creating ML systems for healthcare applications such as cardiology and dermatology. He has also worked as a research intern at Google Brain, Google Brain Medical, and Salesforce Research. Before joining Stanford, he received his bachelor's degree in Electrical Engineering from Sharif University of Technology after doing some work in Signal Processing and Game Theory.
Zoom registration: https://us02web.zoom.us/webinar/register/WN_mwddSBuHRROcDmuqDl-q7A
YouTube live-stream and recording: https://youtu.be/_vL1Gy_6m-A
Twitter thread to continue the conversation: Check back again after the seminar.
Jan 21, 2021: Zachary Lipton, Carnegie Mellon University
Prediction and Data-Driven Decision-Making in Real World Environments
Zack's Abstract: Most machine learning methodology is developed to address prediction problems under restrictive assumptions and applied to drive decisions in environments where those assumptions are violated. This disconnect between our methodological frameworks and their application has caused confusion both among researchers (who often lack the right formalism to tackle these problems coherently) and practitioners (who have developed a folk tradition of ad hoc practices for deploying and monitoring systems). In this talk I'll discuss some of the critical disconnects plaguing the application of machine learning and our fledgling efforts to bridge some of these gaps.
Zack's Bio: Zachary Chase Lipton is the BP Junior Chair Assistant Professor of Operations Research and Machine Learning at Carnegie Mellon University and a Visiting Scientist at Amazon AI. His research spans core machine learning methods and their social impact and addresses diverse application areas, including clinical medicine and natural language processing. Current research focuses include robustness under distribution shift, breast cancer screening, the effective and equitable allocation of organs, and the intersection of causal thinking and the messy high-dimensional data that characterizes modern deep learning applications. He is the founder of the Approximately Correct blog (approximatelycorrect.com) and a co-author of Dive Into Deep Learning, an interactive open-source book drafted entirely through Jupyter notebooks. Find him on Twitter (@zacharylipton) or GitHub (@zackchase).
Zoom registration: https://us02web.zoom.us/webinar/register/WN_HCip_6VzQSOucLtL97Crng
YouTube live-stream and recording: https://youtu.be/fvL6MSzsQ6Q
Twitter thread to continue the conversation: Check back again after the seminar.