Hey! I’m Roman, currently a PhD student in the ALMAnaCH team at Inria Paris, a research team focused on Natural Language Processing.
My main interest lies in improving existing multilingual language models. In particular, I study how the tokenization process impacts performance on low-resource languages, and how to mitigate its negative effects.
I am also interested in studying the time dynamics of language models:
- Can we speed up model training, or improve training when less data is available?
- How does a model evolve after training? Can we make sure it stays relevant in an ever-evolving world?
Feel free to reach out for a chat at:
roman [dot] castagne [at] gmail [dot] com!