Andrei Mircea

I’m Andrei, a first-year PhD student at the University of Montreal and Mila, supervised by Irina Rish, working on mechanistic interpretability and continual learning of language models for scientific texts. Previously, I was a Master’s student in Jackie Cheung’s group at McGill University and Mila, where I wrote my thesis on augmenting language model pretraining with lexical semantics. I also collaborate with Renee Sieber’s group (applying NLP to social media to support the information needs of crisis managers and affected people during extreme weather events) and Yaoyao Fiona Zhao’s group (building a scientific information extraction system that leverages LLMs to support researchers in literature reviews).

My current research interests center on how language models, and machine learning more broadly, can effectively support scientists in their research and enable scientific progress. I’m approaching this from three related angles: (1) mechanistic interpretability to gain insights into how LLMs handle scientific knowledge; (2) inductive biases to improve LLM robustness in scientific domains; and (3) HCI to better understand how LLMs can support scientists in practice. If this interests you, please reach out!

mirandrom+ghp@pm.me

Timeline

Oct 2024	🧻 Our paper Language model scaling laws and zero-sum learning got accepted into the NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning!
Jun 2024	🧑‍🔬 Applied research internship at Capital One for the summer, working on understanding and improving LLM scaling from the perspective of per-example gradient interactions.
Jun 2024	🧻 Our paper Gradient Dissent in Language Model Pretraining and Saturation got accepted into the ICML 2024 Workshop on High-dimensional learning dynamics!
May 2024	💸 Received NSERC ($40,000/yr) and FRQNT ($25,000/yr) grants for my PhD.
Sep 2023	🎉 Started a PhD with Irina Rish at the University of Montreal and Mila.
Jun 2023	🧑‍🔬 Summer research with the Additive Design and Manufacturing Lab at McGill under the supervision of Yaoyao Fiona Zhao, building a human-centered scientific information extraction system with large language models and Next.js.
May 2023	🎓 Completed my M.Sc. in Computer Science supervised by Jackie Cheung at McGill University and Mila. Thesis: Language model pretraining with lexical semantics.