Hello! 👋
COS324. I'm excited to have you in my class this semester! Please see Canvas for the best ways to reach the course staff.
Research/IW/thesis advising. I enjoy working with Princeton students! Unfortunately, I am not accepting new students this academic year (2023-2024). As such, there's no need to email me about research/IW/thesis opportunities.
- For a list of most AIML faculty members in the COS department, see here.
- For a list of all COS faculty who are available for IW/thesis advising, see here.
If you're a non-Princeton student (including prospective applicants), there's also no need to email; I am not accepting non-Princeton students at this time, so please reach out only after you've been admitted to Princeton.
Other Princeton things. Engaging with students is one of my favorite parts of the job. If you'd like to reach me about other Princeton-related things (e.g. participating in a student event), shoot me an email!
|
News 🗞
- Our CHI 2023 paper, "Help Me Help the AI": Understanding How Explainability Can Support Human-AI Interaction, received an honorable mention award 🏆! Congrats to Sunnie S. Y. Kim and our co-authors 🎉! arXiv | program link
- Along with my co-editors, our book on "xxAI - Beyond Explainable AI" is now available: link
- Along with my co-instructors, I introduced an open-ended final project to COS126 (i.e. Princeton's intro CS course); here's the online gallery of the amazing projects students created!
- I am excited to announce that I am joining Princeton's CS department as a teaching faculty member starting July 2021.
- My PhD thesis on "Understanding Convolutional Neural Networks" can be found here. For readers with less technical background, all chapters except chapters 3-6 were written with accessibility in mind.
- We have a new report out on "Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims." webpage | arXiv
- We just released TorchRay, a PyTorch interpretability library. The initial release focuses on attribution and re-implements popular methods and benchmarks to encourage reproducible research; a minimal usage sketch is included below. Resources: tutorial slides | colab notebook
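For reference, here is a minimal Grad-CAM attribution sketch with TorchRay, adapted from the library's example usage; treat the exact helper functions and layer name as assumptions and check them against the current README.
```python
# Minimal TorchRay usage sketch: Grad-CAM attribution on a bundled example image.
# Adapted from the library's example usage; verify names against the current README.
from torchray.attribution.grad_cam import grad_cam
from torchray.benchmark import get_example_data, plot_example

# Load a pretrained model and an example image/category pair shipped with TorchRay.
model, x, category_id, _ = get_example_data()

# Compute a Grad-CAM saliency map at an intermediate convolutional layer.
saliency = grad_cam(model, x, category_id, saliency_layer='features.29')

# Visualize the input alongside its saliency map.
plot_example(x, saliency, 'grad-cam', category_id)
```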
|
Talks 👩🏻‍🏫
- Explainability in Machine Learning workshop 2023 — "Directions in Interpretability": slides | workshop webpage
- HEIBRiDS Lecture Series 2022 — "Directions in Interpretability": slides | series webpage
- MICCAI 2022 — "Directions in Interpretability" at the iMIMIC workshop (Interpretability of Machine Intelligence in Medical Image Computing): slides | workshop webpage
- CVPR 2022 — "Directions in Interpretability" at the Human-Centered AI for Computer Vision tutorial: video | slides | tutorial webpage
- CVPR 2020 — "Understanding Deep Neural Networks" at the Interpretable Machine Learning for Computer Vision tutorial: video | slides | tutorial webpage
- Oxford VGG 2019 — Interpretability tutorial: slides
|
Looking Glass Lab 👤
Members
Adam Kelch
Sai Rachumalla
Indu Panigrahi
Collaborators
Nicole Meister
Dora Zhao
Sunnie S. Y. Kim
Ryan Manzuk
Vikram V. Ramaswamy
Angelina Wang
Dr. Elizabeth Anne Watkins
Prof. Olga Russakovsky
Prof. Andrés Monroy-Hernández
Prof. Adam C. Maloof
[image attribution]
Awards
- Sunnie S. Y. Kim, CHI Honorable Mention Paper Award, 2023.
- Devon Ulrich, Tau Beta Pi, 2023.
- Alexis Sursock, Sigma Xi, 2023.
- Indu Panigrahi, Sigma Xi, 2023.
- Indu Panigrahi, Outstanding Computer Science Senior Thesis Prize, 2023.
- Indu Panigrahi, NSF Graduate Fellowship Award Honorable Mention, 2023.
- Indu Panigrahi, Computing Research Association (CRA) Outstanding Undergraduate Research Award Nominee, 2022.
- Indu Panigrahi, Outstanding Independent Work Award, 2022.
- Indu Panigrahi, Princeton Research Day Orange & Black Undergraduate Presentation Award, 2022.
- Ruth Fong, Open Philanthropy AI Fellowship, 2018.
- Ruth Fong, Rhodes Scholarship, 2015.
Alumni
- Creston Brooks '23, senior thesis, Optimizations towards AI-based Travel Recommendation (started CS MS at Princeton in 2023).
- Alexis Sursock '23, senior thesis, Stravl: The World's First Large-Scale, AI-based Travel Designer.
- Indu Panigrahi '23, senior thesis, A Semi-supervised Model for Fine-grain, Serial Image Instance Segmentation (started CS MS at Princeton in 2023).
- Devon Ulrich '23, senior thesis, Investigating the Fairness of Computer Vision Models for Medical Imaging.
- Icey Siyi '24 and Fatima Zohra Boumhaout '24, research (summer 2022), Interactive Perturbation Visualization Tool.
- Frelicia Tucker '22, senior thesis, The Virtual Black Hair Experience: Evaluating Hairstyle Transform Generative Adversarial Networks on Black Women.
- Vedant Dhopte '22, senior thesis, Holistically Interpreting Deep Neural Networks via Channel Ablation.
|
Research 🧪
My research interests are in computer vision, machine learning (ML), and human-computer interaction (HCI), with a particular focus on explainable AI and ML fairness.
Most of my work focuses on developing novel techniques for understanding AI models post-hoc, designing new AI models that are interpretable-by-design, and/or introducing paradigms for finding and correcting existing failure points in AI models.
See Google Scholar for the most updated list of papers.
* denotes equal contribution; ^ denotes peer-reviewed, non-archival work (e.g. accepted to non-archival workshop).
|
|
Humans, AI, and Context: Understanding End-Users' Trust in a Real-World Computer Vision Application
Sunnie S. Y. Kim,
Elizabeth Anne Watkins,
Olga Russakovsky,
Ruth Fong,
Andrés Monroy-Hernández
FAccT, 2023
arXiv |
project page |
bibtex
We study how end-users trust AI in a real-world context. Concretely, we describe multiple aspects of trust in AI and how human, AI, and context-related factors influence each.
|
|
UFO: A Unified Method for Controlling Understandability and Faithfulness Objectives in Concept-based Explanations for CNNs
Vikram V. Ramaswamy,
Sunnie S. Y. Kim,
Ruth Fong,
Olga Russakovsky
arXiv, 2023
arXiv |
bibtex
We introduce UFO, a novel concept-based explanation framework for CNNs that controls the understandability and faithfulness of concept-based explanations using well-defined objective functions for the two qualities.
|
|
Improving Data-Efficient Fossil Segmentation via Model Editing
Indu Panigrahi,
Ryan Manzuk,
Adam Maloof,
Ruth Fong
CVPR Workshop on Learning with Limited Labelled Data for Image and Video Understanding, 2023
arXiv |
bibtex
We explore how to improve a model for segmenting coral reef fossils by first understanding its systematic failures and then "editing" the model to mitigate those failures.
|
|
"Help Me Help the AI": Understanding How Explainability Can Support Human-AI Interaction
Sunnie S. Y. Kim,
Elizabeth Anne Watkins,
Olga Russakovsky,
Ruth Fong,
Andrés Monroy-Hernández
CHI, 2023 (Honorable Mention award 🏆)
arXiv |
supp |
30-sec video |
10-min video |
bibtex
We explore how explainability can support human-AI interaction by interviewing 20 end-users of a real-world AI application.
Specifically, we study (1) what XAI needs people have, (2) how people intend to use XAI explanations, and (3) how people perceive existing XAI methods.
|
|
Overlooked Factors in Concept-based Explanations: Dataset Choice, Concept Salience, and Human Capability
Vikram V. Ramaswamy,
Sunnie S. Y. Kim,
Ruth Fong,
Olga Russakovsky
CVPR, 2023
arXiv |
bibtex
We analyze three commonly overlooked factors in concept-based explanations: (1) the choice of the probe dataset, (2) the saliency of concepts in the probe dataset, and (3) the number of concepts used in explanations. We also make suggestions for the future development and analysis of concept-based interpretability methods.
|
|
Interactive Visual Feature Search^
Devon Ulrich and
Ruth Fong
NeurIPS Workshop on XAI in Action: Past, Present, and Future Applications, 2023
arXiv |
code |
bibtex
We present an interactive visualization tool that allows you to perform a reverse image search for similar image regions using intermediate activations.
|
|
Gender Artifacts in Visual Datasets
Nicole Meister*,
Dora Zhao*,
Angelina Wang,
Vikram V. Ramaswamy,
Ruth Fong,
Olga Russakovsky
ICCV, 2023
arXiv |
project page |
bibtex
We demonstrate the pervasiveness of gender artifacts in popular computer vision datasets (e.g. COCO and OpenImages). We find that all of the following (and more) are gender artifacts: the mean value of color channels (i.e. mean RGB), the pose and location of people, and most co-located objects.
|
|
ELUDE: Generating Interpretable Explanations via a Decomposition into Labelled and Unlabelled Features
Vikram V. Ramaswamy,
Sunnie S. Y. Kim,
Nicole Meister,
Ruth Fong,
Olga Russakovsky
arXiv, 2022
arXiv |
bibtex
We present ELUDE, a novel explanation framework that decomposes a model's prediction into two components: (1) one based on labelled, semantic attributes (e.g. fur, paw, etc.) and (2) one based on an unlabelled, low-rank feature space.
|
|
HIVE: Evaluating the Human Interpretability of Visual Explanations
Sunnie S. Y. Kim,
Nicole Meister,
Vikram V. Ramaswamy,
Ruth Fong,
Olga Russakovsky
ECCV, 2022
arXiv |
project page |
extended abstract |
code |
2-min video |
bibtex
We introduce HIVE, a novel human evaluation framework for diverse interpretability methods in computer vision, and develop metrics that measure achievement on two desiderata for explanations used to assist human decision making: (1) Explanations should allow users to distinguish between correct and incorrect predictions. (2) Explanations should be understandable to users.
|
|
Interactive Similarity Overlays^
Ruth Fong,
Alexander Mordvintsev,
Andrea Vedaldi,
Chris Olah
VISxAI, 2021
interactive article |
code |
bibtex
We introduce a novel interactive visualization that allows machine learning practitioners and researchers to easily observe, explore, and compare how a neural network perceives different image regions.
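To illustrate the underlying computation (a simplified sketch, not the paper's implementation), one can compare the activation vector at a selected image location against the activations at every other location; the model, layer, and query location below are arbitrary placeholders.
```python
# Sketch of a similarity overlay: cosine similarity between the activation vector
# at a chosen spatial location and the activations at every other location.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet50().eval()  # placeholder model; load pretrained weights in practice
activations = {}
model.layer3.register_forward_hook(lambda mod, inp, out: activations.update(feat=out))

image = torch.rand(1, 3, 224, 224)  # stand-in for a preprocessed input image
with torch.no_grad():
    model(image)

feat = activations["feat"][0]            # (C, H, W) activation map
C, H, W = feat.shape
query = feat[:, H // 2, W // 2]          # activation vector at a selected location
sims = F.cosine_similarity(feat.reshape(C, -1), query[:, None], dim=0).reshape(H, W)

# Upsample the similarity map to image resolution to overlay it on the input.
overlay = F.interpolate(sims[None, None], size=image.shape[-2:],
                        mode="bilinear", align_corners=False)
```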
|
|
On Compositions of Transformations in Contrastive Self-Supervised Learning
Mandela Patrick*,
Yuki M. Asano*,
Polina Kuznetsova,
Ruth Fong,
João F. Henriques,
Geoffrey Zweig, and
Andrea Vedaldi
ICCV, 2021
arXiv |
code |
bibtex
We give transformations the prominence they deserve by introducing a systematic framework suitable for contrastive learning, achieving state-of-the-art video representation learning by learning (in)variances systematically.
|
|
Quantifying Learnability and Describability of Visual Concepts Emerging in Representation Learning
Iro Laina,
Ruth Fong, and
Andrea Vedaldi
NeurIPS, 2020
arxiv |
supp |
bibtex
We introduce two novel human evaluation metrics for quantifying the interpretability of clusters discovered via self-supervised methods. We also outline how to partially approximate one of the metrics using a group captioning model.
|
|
Debiasing Convolutional Neural Networks via Meta Orthogonalization^
Kurtis Evan David,
Qiang Liu, and
Ruth Fong
NeurIPS Workshop on Algorithmic Fairness through the Lens of Causality and Interpretability (AFCI), 2020
arxiv |
supp |
poster |
bibtex
We introduce a novel paradigm for debiasing CNNs by encouraging salient concept vectors to be orthogonal to class vectors in the activation space of an intermediate CNN layer (e.g., orthogonalizing gender and oven concepts in conv5).
|
|
Contextual Semantic Interpretability
Diego Marcos,
Ruth Fong,
Sylvain Lobry,
Rémi Flamary,
Nicolas Courty, and
Devis Tuia
ACCV, 2020
arxiv |
supp |
code |
bibtex
We introduce an interpretable-by-design machine vision model that learns sparse groupings of interpretable concepts and demonstrate the utility of our novel architecture on scenicness prediction.
|
|
There and Back Again: Revisiting Backpropagation Saliency Methods
Sylvestre-Alvise Rebuffi*,
Ruth Fong*,
Xu Ji*, and
Andrea Vedaldi
CVPR, 2020
arxiv |
code |
bibtex
We outline a novel framework that unifies many backpropagation saliency methods. Furthermore, we introduce NormGrad, a saliency method that considers the spatial contribution of the gradients of convolutional weights. We also systematically study the effects of combining saliency maps at different layers. Finally, we introduce a class-sensitivity metric and a meta-learning inspired technique that can be applied to any saliency method to improve class sensitivity.
|
|
Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims
Miles Brundage*,
Shahar Avin*,
Jasmine Wang*,
Haydn Belfield*,
Gretchen Krueger*, … ,
Ruth Fong, et al.
arXiv, 2020
arxiv |
project page |
bibtex
This report suggests various steps that different stakeholders can take to make it easier to verify claims made about AI systems and their associated development processes. The authors believe the implementation of such mechanisms can help make progress on one component of the multifaceted problem of ensuring that AI development is conducted in a trustworthy fashion.
|
|
Understanding Deep Networks via Extremal Perturbations and Smooth Masks
Ruth Fong*, Mandela Patrick*, and Andrea Vedaldi
ICCV, 2019 (Oral)
arxiv |
supp |
poster |
code (TorchRay) |
4-min video |
bibtex
We introduce extremal perturbations, a novel attribution method that highlights "where" a model is "looking." We improve upon Fong and Vedaldi, 2017 by separating the regularization on the size and smoothness of a perturbation mask from the attribution objective of learning a mask that maximally affects a model's output; we also extend our work to intermediate channel representations.
|
|
Occlusions for Effective Data Augmentation in Image Classification
Ruth Fong and Andrea Vedaldi
ICCV Workshop on Interpreting and Explaining Visual Artificial Intelligence Models, 2019
paper |
bibtex |
code (coming soon)
We introduce a simple paradigm based on batch augmentation for leveraging input-level occlusions (both stochastic and saliency-based) to improve ImageNet image classification. We also demonstrate the necessity of batch augmentation and quantify the robustness of different CNN architectures to occlusion via ablation studies.
|
|
Net2Vec: Quantifying and Explaining how Concepts are Encoded by Filters in Deep Neural Networks
Ruth Fong and Andrea Vedaldi
CVPR, 2018 (Spotlight)
arxiv |
supp |
bibtex |
code |
4-min video |
slides
We investigate how human-interpretable visual concepts (i.e., textures, objects, etc.) are encoded across hidden units of a convolutional neural network (CNN) layer as well as across CNN layers.
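As a rough illustration of the idea (not the paper's released code), one can learn a single weight per filter, i.e. a 1x1 convolution, that combines a layer's filters to segment a concept; the activation shapes and data below are random placeholders.
```python
# Sketch of learning a Net2Vec-style concept vector: a linear combination of a
# layer's filters trained to segment a concept. Data below are random placeholders.
import torch
import torch.nn as nn

acts = torch.rand(32, 512, 14, 14)                    # precomputed conv activations (N, C, H, W)
masks = (torch.rand(32, 1, 14, 14) > 0.9).float()     # binary concept segmentation masks

combine = nn.Conv2d(512, 1, kernel_size=1)            # one learnable weight per filter
optimizer = torch.optim.Adam(combine.parameters(), lr=1e-2)
criterion = nn.BCEWithLogitsLoss()

for _ in range(200):
    loss = criterion(combine(acts), masks)            # predict the concept mask
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The learned 512-d weight vector serves as the concept's embedding.
concept_vector = combine.weight.detach().squeeze()
```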
|
|
Using Human Brain Activity to Guide Machine Learning
Ruth Fong, Walter Scheirer, and David Cox
Scientific Reports, 2018
arxiv |
supp |
Harvard thesis |
bibtex
We introduce a biologically-informed machine learning paradigm for object classification that biases models to better match the learned, internal representations of the visual cortex.
|
|
Interpretable Explanations of Black Box Algorithms by Meaningful Perturbation
Ruth Fong and Andrea Vedaldi
ICCV, 2017
arxiv |
supp |
bibtex |
code |
book chapter (extended) |
chapter bibtex
We develop a theoretical framework for learning "explanations" of black box functions like CNNs, as well as saliency methods for identifying "where" a computer vision algorithm is looking.
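To give a flavor of the approach (a simplified sketch, not the released implementation), one can learn a smooth, low-resolution mask whose blurred-out region suppresses the target class score; the model, image, class index, and loss weights below are placeholders.
```python
# Sketch of learning a perturbation mask: optimize a low-resolution mask so that
# blurring the masked region suppresses the target class score, while keeping the
# blurred region small and smooth. Model, image, target, and weights are placeholders.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet50().eval()  # placeholder; use a pretrained classifier in practice
for p in model.parameters():
    p.requires_grad_(False)

image = torch.rand(1, 3, 224, 224)                       # stand-in preprocessed image
target = 282                                             # stand-in class index
blurred = F.avg_pool2d(image, 11, stride=1, padding=5)   # crude "deletion" reference

mask = torch.zeros(1, 1, 28, 28, requires_grad=True)     # low-res mask, upsampled for smoothness
optimizer = torch.optim.Adam([mask], lr=0.1)

for _ in range(100):
    m = torch.sigmoid(mask)
    m_up = F.interpolate(m, size=image.shape[-2:], mode="bilinear", align_corners=False)
    perturbed = (1 - m_up) * image + m_up * blurred      # m_up ~ 1 means "blur this region out"
    score = torch.softmax(model(perturbed), dim=1)[0, target]
    area = m_up.mean()                                    # keep the deleted region small
    tv = (m_up[..., 1:, :] - m_up[..., :-1, :]).abs().mean() + \
         (m_up[..., :, 1:] - m_up[..., :, :-1]).abs().mean()  # keep the mask smooth
    loss = score + 0.5 * area + 0.2 * tv                  # placeholder regularization weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```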
|
|