Mansi Sakarvadia

Hello! I am a Computer Science Ph.D. student at the University of Chicago, where I am co-advised by Ian Foster and Kyle Chard.

I develop machine learning interpretability methods. My research aims to systematically reverse engineer neural networks to interpret their weights. For example, much of my work focuses on localizing sources of model failure within weight-space and developing efficient methods to correct model behavior. My work is supported by a Department of Energy Computational Science Graduate Fellowship.

Prior to my Ph.D., I completed my Bachelors in Computer Science and Mathematics with a minor in Environmental Science at the University of North Carolina, Chapel Hill.

news

Sep 1, 2025	Had a great time at Lawrence Berekley National Laboratory’s ML and Analytics group this summer studying the limits of machine-learned operators for modeling PDEs.
Jul 16, 2025	Presented my poster “The False Promise of Super-Resolution of Machine-Learned Operators” at the CSGF Program Review in Washington, DC.
Apr 15, 2025	Was honored to have given a talk on my recent work on Mitigating Memorization in Language Models at the Midwest Speech and Language Days at University of Notre Dame!
Mar 1, 2025	Recent work on Mitigating Memorization in Language Models was accepted as a Top 5% Spotlight paper at International Conference on Learning Representations (ICLR) 2025! Check out a 5 min video summary of the work.
Jan 15, 2025	I was interviewed by the Department of Energy Science in Parallel podcast about the recent Nobel prizes in Physics and Chemistry and their implications for ML and the domain sciences.

selected publications

Preprint
The False Promise of Zero-Shot Super-Resolution in Machine-Learned Operators

Mansi Sakarvadia, Kareem Hegazy, Amin Totounferoush, and 4 more authors

2025

Abs arXiv Bib Blog

A core challenge in scientific machine learning, and scientific computing more generally, is modeling continuous phenomena which (in practice) are represented discretely. Machine-learned operators (MLOs) have been introduced as a means to achieve this modeling goal, as this class of architecture can perform inference at arbitrary resolution. In this work, we evaluate whether this architectural innovation is sufficient to perform “zero-shot super-resolution,” namely to enable a model to serve inference on higher-resolution data than that on which it was originally trained. We comprehensively evaluate both zero-shot sub-resolution and super-resolution (i.e., multi-resolution) inference in MLOs. We decouple multi-resolution inference into two key behaviors: 1) extrapolation to varying frequency information; and 2) interpolating across varying resolutions. We empirically demonstrate that MLOs fail to do both of these tasks in a zero-shot manner. Consequently, we find MLOs are not able to perform accurate inference at resolutions different from those on which they were trained, and instead they are brittle and susceptible to aliasing. To address these failure modes, we propose a simple, computationally-efficient, and data-driven multi-resolution training protocol that overcomes aliasing and that provides robust multi-resolution generalization.
@article{sakarvadia2025false, title = {The False Promise of Zero-Shot Super-Resolution in Machine-Learned Operators}, author = {Sakarvadia, Mansi and Hegazy, Kareem and Totounferoush, Amin and Chard, Kyle and Yang, Yaoqing and Foster, Ian and Mahoney, Michael W.}, year = {2025}, }
Preprint
Topology-Aware Knowledge Propagation in Decentralized Learning

Mansi Sakarvadia, Nathaniel Hudson, Tian Li, and 2 more authors

2025

Abs arXiv Bib Blog

Decentralized learning enables collaborative training of models across naturally distributed data without centralized coordination or maintenance of a global model. Instead, devices are organized in arbitrary communication topologies, in which they can only communicate with neighboring devices. Each device maintains its own local model by training on its local data and integrating new knowledge via model aggregation with neighbors. Therefore, knowledge is propagated across the topology via successive aggregation rounds. We study, in particular, the propagation of out-of-distribution (OOD) knowledge. We find that popular decentralized learning algorithms struggle to propagate OOD knowledge effectively to all devices. Further, we find that both the location of OOD data within a topology, and the topology itself, significantly impact OOD knowledge propagation. We then propose topology-aware aggregation strategies to accelerate (OOD) knowledge propagation across devices. These strategies improve OOD data accuracy, compared to topology-unaware baselines, by 123% on average across models in a topology.
@article{sakarvadia2025topology, title = {Topology-Aware Knowledge Propagation in Decentralized Learning}, author = {Sakarvadia, Mansi and Hudson, Nathaniel and Li, Tian and Foster, Ian and Chard, Kyle}, year = {2025}, }
ICLR
Mitigating Memorization In Language Models

Mansi Sakarvadia, Aswathy Ajith, Arham Khan, and 6 more authors

2025

Spotlight (top 5%)

Abs arXiv Bib Blog

Language models (LMs) can "memorize" information, i.e., encode training data in their weights in such a way that inference-time queries can lead to verbatim regurgitation of that data. This ability to extract training data can be problematic, for example, when data are private or sensitive. In this work, we investigate methods to mitigate memorization: three regularizer-based, three finetuning-based, and eleven machine unlearning-based methods, with five of the latter being new methods that we introduce. We also introduce TinyMem, a suite of small, computationally-efficient LMs for the rapid development and evaluation of memorization-mitigation methods. We demonstrate that the mitigation methods that we develop using TinyMem can successfully be applied to production-grade LMs, and we determine via experiment that: regularizer-based mitigation methods are slow and ineffective at curbing memorization; fine-tuning-based methods are effective at curbing memorization, but overly expensive, especially for retaining higher accuracies; and unlearning-based methods are faster and more effective, allowing for the precise localization and removal of memorized information from LM weights prior to inference. We show, in particular, that our proposed unlearning method BalancedSubnet outperforms other mitigation methods at removing memorized information while preserving performance on target tasks.
@article{sakarvadia2024Mitigating, title = {Mitigating Memorization In Language Models}, author = {Sakarvadia, Mansi and Ajith, Aswathy and Khan, Arham and Hudson, Nathaniel and Geniesse, Caleb and Chard, Kyle and Yang, Yaoqing and Foster, Ian and Mahoney, Michael W.}, year = {2025}, note = {Spotlight (top 5\%)}, publisher = {International Conference on Learning Representations, 2025}, }
Preprint
SoK: On Finding Common Ground in Loss Landscapes Using Deep Model Merging Techniques

Arham Khan, Todd Nief, Nathaniel Hudson, and 6 more authors

2024

Abs arXiv Bib

Understanding neural networks is crucial to creating reliable and trustworthy deep learning models. Most contemporary research in interpretability analyzes just one model at a time via causal intervention or activation analysis. Yet despite successes, these methods leave significant gaps in our understanding of the training behaviors of neural networks, how their inner representations emerge, and how we can predictably associate model components with task-specific behaviors. Seeking new insights from work in related fields, here we survey literature in the field of model merging, a field that aims to combine the abilities of various neural networks by merging their parameters and identifying task-specific model components in the process. We analyze the model merging literature through the lens of loss landscape geometry, an approach that enables us to connect observations from empirical studies on interpretability, security, model merging, and loss landscape analysis to phenomena that govern neural network training and the emergence of their inner representations. To systematize knowledge in this area, we present a novel taxonomy of model merging techniques organized by their core algorithmic principles. Additionally, we distill repeated empirical observations from the literature in these fields into characterizations of four major aspects of loss landscape geometry: mode convexity, determinism, directedness, and connectivity. We argue that by improving our understanding of the principles underlying model merging and loss landscape geometry, this work contributes to the goal of ensuring secure and trustworthy machine learning in practice.
@article{khan2024sok, title = {SoK: On Finding Common Ground in Loss Landscapes Using Deep Model Merging Techniques}, author = {Khan, Arham and Nief, Todd and Hudson, Nathaniel and Sakarvadia, Mansi and Grzenda, Daniel and Ajith, Aswathy and Pettyjohn, Jordan and Chard, Kyle and Foster, Ian}, year = {2024}, }
BlackboxNLP
Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models

Mansi Sakarvadia, Aswathy Ajith, Arham Khan, and 5 more authors

2023

Work accepted to BlackBoxNLP 2023.

Abs arXiv Bib

Answering multi-hop reasoning questions requires retrieving and synthesizing information from diverse sources. Large Language Models (LLMs) struggle to perform such reasoning consistently. Here we propose an approach to pinpoint and rectify multi-hop reasoning failures through targeted memory injections on LLM attention heads. First, we analyze the per-layer activations of GPT-2 models in response to single and multi-hop prompts. We then propose a mechanism that allows users to inject pertinent prompt-specific information, which we refer to as "memories," at critical LLM locations during inference. By thus enabling the LLM to incorporate additional relevant information during inference, we enhance the quality of multi-hop prompt completions. We show empirically that a simple, efficient, and targeted memory injection into a key attention layer can often increase the probability of the desired next token in multi-hop tasks, by up to 424%.
@article{sakarvadia2023memory, title = {Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models}, author = {Sakarvadia, Mansi and Ajith, Aswathy and Khan, Arham and Grzenda, Daniel and Hudson, Nathaniel and Bauer, André and Chard, Kyle and Foster, Ian}, year = {2023}, note = {Work accepted to BlackBoxNLP 2023.}, }