About me
I am a PhD candidate at UdeM/Mila, co-supervised by Eilif Muller and Irina Rish. I aim to understand the conditions under which models lose prior capabilities and gain new ones, in order to predict and influence observable phenomena such as catastrophic forgetting and loss of plasticity across scale.
By leveraging various properties of large language models, I demonstrate how diverse reasoning patterns can be surfaced as a model continues to learn. I study how learned representations and output distributions drift as tasks shift, enabling principled interventions to mitigate forgetting and promote plasticity. This work is preliminary, with a preprint on the way.
Previously, I led work on benchmarking knowledge conflicts with prior coding knowledge, contributed to work on continual pre‑training strategies, and explored test‑time learning. Across all of these projects, the goal has been consistent: find ways to improve models over time and to evaluate them better as they change. Looking forward, if we want the resources spent on AI to yield new capabilities, then the better we understand these phenomena, the more strategically we can allocate compute toward the most pressing problems.
