I'm a research fellow at Sankara Eye Hospital and Microsoft Research India, working with Dr. Mohit Jain, Dr. Vineeth N B, and Prof. Thomas Schultz.
My research spans computer vision and its applications to healthcare. I study task-transfer dynamics in vision-language models and work on scene understanding for cataract surgery. Earlier, I built expert-in-the-loop LLM chatbots for patients and community health workers.
I received my B.Tech. in Computer Science from Delhi Technological University (DTU) in 2023. As an undergraduate, I interned at Prime Video (Amazon) and Kroop AI.
We investigate how fine-tuning on one task affects zero-shot performance on other tasks across the Qwen2.5-VL model family. We introduce the Perfection Gap Factor (PGF) to characterize task-to-task transfer, and we use the resulting task interactions to identify effective proxy datasets.
We propose the first framework for intra-operative complication detection in cataract surgery. Using SAM 2 for object tracking and a complication-specific risk-scoring module, the system flags high-risk video segments, which are then passed to a VLM for reasoning. We also release CataComp, the first dataset annotated for three intra-operative complications.
We present the first comprehensive dataset for tool segmentation in manual small-incision cataract surgery (MSICS) and introduce ToolSeg, a phase-informed method that uses surgical phase information to improve segmentation accuracy.
We present CataractBot, an LLM-powered chatbot that answers questions about cataract surgery from a curated knowledge base. Its responses are verified by experts, and it supports multimodal and multilingual input.