Diversity & Inclusion
My journey in NLP started with my interest in my native Moroccan Darija. It is an indigenous language spoken by over 30 million people, yet it remains underrepresented in language technologies because it has very few resources. My Bachelor’s thesis focused on creating linguistic resources for Moroccan Darija.
I am committed to increasing the visibility of North African people in NLP and language technology.
North Africans in NLP
We are an affinity group for people who are of North African descent and/or based in North Africa, and who share an interest in NLP research. At EMNLP 2020, we organized a panel discussion on NLP research as a North African (watch it here!) with amazing North African figures in NLP and AI. We also organized social events at COLING 2020, EACL 2021, and NAACL 2021.
We have a Slack channel for North Africans in NLP. Please email me and I can add you.
I had the honour of being the first invited speaker of the Morocco.AI Webinar series. My talk, given on Wednesday, February 10th, 2021, covered my EMNLP 2020 paper on syntactic parsing. You can find a recording of the talk here. I highly recommend this webinar series: Morocco AI has hosted some of the brightest Moroccan researchers in AI, based both in Morocco and abroad.
I have started collaborating with the Morocco AI team on projects related to Moroccan Darija:
Moroccan Darija WordNet: I am looking for volunteers to validate bilingual English-Darija word pairs that were automatically crawled.
Moroccan Darija Language Model: We are collaborating with co-founders of the Moroccan Darija Wikipedia to build a Moroccan Darija language model. Let me know if you have plain-text data in Moroccan Darija, or if you want to help us!