Soham Gadgil

Hi! I am a first-year CSE PhD student at the University of Washington in the AIMS Lab, co-advised by Dr. Su-In Lee and Dr. Linda Shapiro. Prior to starting my PhD, I spent a year as a data engineer at Microsoft on the Windows Experience team.

I completed my Master's at Stanford in Computer Science with a depth in AI, where I was a research assistant in the Computational Neuroimage Science Lab (CNSLAB), advised by Dr. Kilian Pohl. I was also part of the AI for Healthcare (AIHC) bootcamp in the Stanford ML Group, advised by Dr. Pranav Rajpurkar and Dr. Andrew Ng. I finished my Bachelor's at Georgia Institute of Technology, with a major in Computer Engineering and a minor in Computer Science.

I am also a part-time instructor at Persolv, teaching AI fundamentals to high school students. In my spare time, I like playing tennis, hiking, exploring different cuisines, and watching movies (especially legal thrillers).

Email  /  CV  /  Google Scholar  /  Github  /  Twitter  /  LinkedIn


The research problems I want to work on lie at the intersection of Artificial Intelligence and Healthcare. Some of my current research interests include:

  • Clinical AI: Using multi-modal data (images, text, etc.) with deep learning to support the diagnosis and treatment of various ailments.
  • Explainability: Developing AI techniques to increase model interpretability in settings where features are not trivial to obtain (like Emergency Medicine), and auditing AI models to make them more trustworthy by exploring causal relationships between inputs and predictions.

Discovering mechanisms underlying medical AI prediction of protected attributes
Soham Gadgil *, Alex J. DeGrave *, Roxana Daneshjou, Su-In Lee
CVPR 2024 Data Curation and Augmentation in Medical Imaging Workshop (Oral)

Recent advances in Artificial Intelligence (AI) have started disrupting the healthcare industry, especially medical imaging, and AI devices are increasingly being deployed into clinical practice. Such classifiers have previously demonstrated the ability to discern a range of protected demographic attributes (like race, age, sex) from medical images with unexpectedly high performance, a sensitive task which is difficult even for trained physicians. Focusing on the task of predicting sex from dermoscopic images of skin lesions, we train high-performing classifiers that achieve a ROC-AUC score of ~0.78. We highlight how incorrect use of these demographic shortcuts can have a detrimental effect on the performance of a clinically relevant downstream task like disease diagnosis under a domain shift. Further, we employ various explainable AI (XAI) techniques to identify specific signals which can be leveraged to predict sex. Finally, we introduce a technique to quantify how much a signal contributes to the classification performance. Using this technique and the signals identified, we are able to explain ~44% of the total performance. This analysis not only underscores the importance of cautious AI application in healthcare but also opens avenues for improving the transparency and reliability of AI-driven diagnostic tools.

Estimating Conditional Mutual Information for Dynamic Feature Selection
Soham Gadgil *, Ian Covert *, Su-In Lee
ICLR 2024
[Paper] [Code]

Dynamic feature selection, where we sequentially query features to make accurate predictions with a minimal budget, is a promising paradigm to reduce feature acquisition costs and provide transparency into the prediction process. The problem is challenging, however, as it requires both making predictions with arbitrary feature sets and learning a policy to identify the most valuable selections. Here, we take an information-theoretic perspective and prioritize features based on their mutual information with the response variable. The main challenge is learning this selection policy, and we design a straightforward new modeling approach that estimates the mutual information in a discriminative rather than generative fashion. Building on our learning approach, we introduce several further improvements: allowing variable feature budgets across samples, enabling non-uniform costs between features, incorporating prior information, and exploring modern architectures to handle partial input information. We find that our method provides consistent gains over recent state-of-the-art methods across a variety of datasets.
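As a rough illustration of the sequential querying described above, the sketch below greedily acquires one feature at a time until a budget is reached. It is not the paper's implementation: `greedy_select` and `estimate_cmi` are hypothetical names, and `estimate_cmi` stands in for whatever learned estimator scores a candidate feature given the features observed so far.

```python
def greedy_select(x, estimate_cmi, budget):
    """Sequentially acquire up to `budget` features of the sample `x`.

    `estimate_cmi(observed, j)` is a hypothetical scoring function that
    returns an estimate of how informative feature j is about the
    response, given the already observed features (a dict index -> value).
    """
    observed = {}
    for _ in range(budget):
        candidates = [j for j in range(len(x)) if j not in observed]
        if not candidates:
            break
        # Query the feature with the highest estimated score.
        best = max(candidates, key=lambda j: estimate_cmi(observed, j))
        observed[best] = x[best]
    return observed


# Toy usage with fixed scores (an assumption made only for this example).
scores = [0.1, 0.9, 0.5]
selected = greedy_select([10, 20, 30], lambda observed, j: scores[j], budget=2)
```

In this toy run the scores ignore the observed features, so the loop simply picks the two highest-scoring indices in order; a learned estimator would instead change its scores as features are revealed.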

Fostering transparent medical image AI via an image-text foundation model grounded in medical literature
Chanwoo Kim, Soham Gadgil, Alex J. DeGrave, Zhuo Ran Cai, Roxana Daneshjou, Su-In Lee
Nature Medicine 2024
[Paper] [Code]

Building trustworthy and transparent image-based medical AI systems requires the ability to interrogate data and models at all stages of the development pipeline: from training models to post-deployment monitoring. Ideally, the data and associated AI systems could be described using terms already familiar to physicians, but this requires medical datasets densely annotated with semantically meaningful concepts. Here, we present a foundation model approach, named MONET (Medical cONcept rETriever), which learns how to connect medical images with text and generates dense concept annotations to enable tasks in AI transparency from model auditing to model interpretation. Dermatology provides a demanding use case for the versatility of MONET, due to the heterogeneity in diseases, skin tones, and imaging modalities. We trained MONET on the basis of 105,550 dermatological images paired with natural language descriptions from a large collection of medical literature. MONET can accurately annotate concepts across dermatology images as verified by board-certified dermatologists, outperforming supervised models built on previously concept-annotated dermatology datasets. We demonstrate how MONET enables AI transparency across the entire AI development pipeline from dataset auditing to model auditing to building inherently interpretable models.

CheXseg: Combining Expert Annotations with DNN-generated Saliency Maps for X-ray Segmentation
Soham Gadgil *, Mark Endo *, Emily Wen *, Andrew Y. Ng, Pranav Rajpurkar
MIDL 2021
[Paper] [Code]

Medical image segmentation models are typically supervised by expert annotations at the pixel-level, which can be expensive to acquire. In this work, we propose a method that combines the high quality of pixel-level expert annotations with the scale of coarse DNN-generated saliency maps for training multi-label semantic segmentation models. We demonstrate the application of our semi-supervised method, which we call CheXseg, on multi-label chest X-ray interpretation. We find that CheXseg improves upon the performance (mIoU) of fully-supervised methods that use only pixel-level expert annotations by 9.7% and weakly-supervised methods that use only DNN-generated saliency maps by 73.1%. Our best method is able to match radiologist agreement on three out of ten pathologies and reduces the overall performance gap by 57.2% as compared to weakly-supervised methods.
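For readers unfamiliar with the mIoU metric mentioned above, here is a minimal sketch of how it can be computed for multi-label binary masks. This is not the CheXseg evaluation code; the function names, the toy masks, and the convention that two empty masks count as a perfect match are all assumptions for illustration.

```python
def iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_iou(preds, targets):
    """Average the per-class IoU, with one binary mask per class label."""
    per_class = [iou(preds[label], targets[label]) for label in targets]
    return sum(per_class) / len(per_class)


# Toy 2x2 masks for two hypothetical class labels.
preds = {"a": [[1, 1], [0, 0]], "b": [[0, 1], [0, 1]]}
targets = {"a": [[1, 0], [0, 0]], "b": [[0, 1], [0, 1]]}
score = mean_iou(preds, targets)
```

Class "a" overlaps on one of two covered pixels and class "b" matches exactly, so the per-class scores are 0.5 and 1.0.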

Spatio-Temporal Graph Convolution for Resting-State fMRI Analysis
Soham Gadgil *, Qingyu Zhao *, Adolf Pfefferbaum *, Edith V. Sullivan, Ehsan Adeli, Kilian M. Pohl
[Paper] [Code]

The Blood-Oxygen-Level-Dependent (BOLD) signal of resting-state fMRI (rs-fMRI) records the temporal dynamics of intrinsic functional networks in the brain. However, existing deep learning methods applied to rs-fMRI either neglect the functional dependency between different brain regions in a network or discard the information in the temporal dynamics of brain activity. To overcome those shortcomings, we propose to formulate functional connectivity networks within the context of spatio-temporal graphs. We train a spatio-temporal graph convolutional network (ST-GCN) on short sub-sequences of the BOLD time series to model the non-stationary nature of functional connectivity. Simultaneously, the model learns the importance of graph edges within ST-GCN to gain insight into the functional connectivities contributing to the prediction. In analyzing the rs-fMRI of the Human Connectome Project (HCP, N = 1,091) and the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA, N = 773), ST-GCN is significantly more accurate than common approaches in predicting gender and age based on BOLD signals. Furthermore, the brain regions and functional connections significantly contributing to the predictions of our model are important markers according to the neuroscience literature.

Solving The Lunar Lander Problem under Uncertainty using Reinforcement Learning
Soham Gadgil, Yunfeng Xin, Chengzhe Xu
IEEE SouthEastCon 2020
[Paper] [Code]

Reinforcement Learning (RL) is an area of machine learning concerned with enabling an agent to navigate an environment with uncertainty in order to maximize some notion of cumulative long-term reward. In this paper, we implement and analyze two different RL techniques, Sarsa and Deep Q-Learning, on OpenAI Gym's LunarLander-v2 environment. We then introduce additional uncertainty to the original problem to test the robustness of the mentioned techniques. With our best models, we are able to achieve average rewards of 170+ with the Sarsa agent and 200+ with the Deep Q-Learning agent on the original problem. We also show that these techniques are able to overcome the additional uncertainties and achieve positive average rewards of 100+ with both agents. We then perform a comparative analysis of the two techniques to conclude which agent performs better.
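The core difference between the two techniques is the bootstrap target used in the temporal-difference update, which the sketch below shows in its simplest tabular form. This is not the paper's implementation (the Deep Q-Learning agent there uses a neural network rather than a table); the function names and the values of `gamma` and `alpha` are illustrative assumptions.

```python
def sarsa_target(reward, q_next, next_action, gamma=0.99):
    """On-policy target: bootstrap from the action actually taken next."""
    return reward + gamma * q_next[next_action]


def q_learning_target(reward, q_next, gamma=0.99):
    """Off-policy target: bootstrap from the greedy (max-value) next action."""
    return reward + gamma * max(q_next)


def td_update(q_value, target, alpha=0.1):
    """Move the current estimate a fraction `alpha` toward the target."""
    return q_value + alpha * (target - q_value)


# Toy values: next-state action values and an exploratory next action (index 0).
q_next = [1.0, 2.0]
sarsa = sarsa_target(1.0, q_next, next_action=0, gamma=0.5)
q_learn = q_learning_target(1.0, q_next, gamma=0.5)
updated = td_update(0.0, q_learn, alpha=0.5)
```

When the next action is exploratory rather than greedy, the Sarsa target is lower than the Q-learning target, which is one reason the two agents can behave differently under added uncertainty.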

  • President of Georgia Tech IEEE, leading the largest IEEE student branch in the US with over 800 members.
  • International Liaison for the Student Alumni Association at Georgia Tech.
  • Peer leader in freshmen and senior student dormitories.

Teaching Assistantships

University of Washington

Introduction to AI (CSE 473)

The course covers the principal ideas and developments in artificial intelligence: problem solving and search, game playing, knowledge representation and reasoning, uncertainty, machine learning, and natural language processing. I held weekly office hours and assisted in preparing and grading the homework assignments.


Stanford

Computer Organization and Systems (CS 107)

TA for CS 107, one of the largest introductory undergraduate courses at Stanford with over 150 students. I led two hour-long lab sessions each week along with office hours and assisted the professor in grading homework and designing exams. Topics included the C programming language, data representation, machine-level code, computer arithmetic, elements of code compilation, optimization of memory and runtime performance, and memory organization and management.

Trustworthy Machine Learning (CS 329T)

TA for the first course offering of CS 329T. I co-developed and led the lab sections with ~25 students. I also helped the instructors design some of the lecture slides, homework assignments, and the final project.

Georgia Tech

Linear Algebra (MATH 1554)

As a TA for linear algebra, I led two 50-minute recitation sessions with 25 students each week. Concepts included eigenvalues, eigenvectors, applications to linear systems, least squares, diagonalization, and quadratic forms.

Computer Architecture (CS 3056)

Guided over 60 students through homework and projects in computer architecture. Held weekly office hours and exam review sessions, and collaborated with the instructor on grading and project ideation. Topics included the basic organizational principles of the major components of a processor: the core, memory hierarchy, and the I/O subsystem.

(Design and CSS courtesy: Jon Barron and Amlaan Bhoi)