Hey! I’m Roman, currently a PhD student in the ALMAnaCH team at Inria Paris, a research team focused on Natural Language Processing.
Before that, I completed the MVA master’s degree at ENS Paris-Saclay and was an engineering student at Ecole des Ponts ParisTech. I also interned at Naver Labs Europe. You can find my resume here.
My work
Currently, my main interest lies in improving existing multilingual language models. In particular, I study how the tokenization process impacts performance on low-resource languages, and how to mitigate its negative effects.
I am also interested in studying the temporal dynamics of language models:
- Can we speed up model training, or improve training when less data is available?
- How does the model evolve after training? Can we make sure a model stays relevant in an ever-evolving world?
Feel free to reach out for a chat at: roman [dot] castagne [at] gmail [dot] com!