Here are some of the projects I have been working on recently. You can find my full resume here.
- MANTa Findings EMNLP (to appear) 2022
- MANTa is a differentiable tokenization module that learns to segment input sequences end-to-end with the language modeling objective.
- BLOOM 2022
- BLOOM is a Large Language Model resulting from a collaborative, open-source effort. I actively participated in creating its tokenizer.
- Hands-on CamemBERT June 2022
- We gave a three-hour tutorial on how to use and fine-tune CamemBERT, a French language model, and turned it into a blog post (in French).
- Active Learning from Demonstrations MVA RL course 2021
- We designed an agent robust to imperfect demonstrations and evaluated it in discrete and continuous environments.
- SinGAN for Inpainting MVA Computer Vision course 2021
- We adapted SinGAN to inpainting using partial convolutions.
- Domain Shift in Disaster Tweets Classification MVA Deep Learning course 2021
- We studied the impact of domain shift on disaster tweet classifiers and explored ways to mitigate the resulting performance degradation.