Massively Multilingual NLU 2022
A Workshop Colocated with EMNLP 2022 in Abu Dhabi and Online Dec 7, 2022
Let’s scale natural language understanding technology to every language on Earth!
By 2023 there will be over 8 billion virtual assistants worldwide, the majority of which will be on smartphones. Additionally, over 100 million smart speakers have been sold, most of which exclusively use a voice interface and require Natural Language Understanding (NLU) during every user interaction in order to function. However, even as we approach the point at which there will be more virtual assistants than people in the world, major virtual assistants still support only a small fraction of the world’s languages. This limitation is driven by the lack of labeled data, the expense of human-based quality assurance, the cost of maintaining and updating models, and more. Innovation is how we will jump these hurdles. The vision of this workshop is to help propel natural language understanding technology into the 50-language, 100-language, and even the 1,000-language regime, both for production systems and for research endeavors.
News
- 26 Oct: We are pleased to declare Maxime De Bruyn, Ehsan Lotfi, Jeska Buhmann, and Walter Daelemans of the bolleke team as the winners of the Organizers’ Choice Award! Please come to our workshop to hear more about their model and their associated paper, Machine Translation for Multilingual Intent Detection and Slots Filling.
- 12 Aug: We welcome submissions until Sep 2nd for the MMNLU-22 Organizers’ Choice Award, as well as direct paper submissions until Sep 7th. The Organizers’ Choice Award is based primarily on our assessment of the promise of an approach, not only on the evaluation scores. To be eligible, please (a) make a submission on eval.ai to either MMNLU-22 task and (b) send a brief (<1 page) writeup of your approach to mmnlu-22@amazon.com describing the following:
  - Your architecture,
  - Any changes to training data, use of non-public data, or use of public data,
  - How dev data was used and what hyperparameter tuning was performed,
  - Model input and output formats,
  - What tools and libraries you used, and
  - Any additional training techniques you used, such as knowledge distillation.
- 12 Aug: We are pleased to declare the HIT-SCIR team as the winner of the MMNLU-22 Competition Full Dataset Task. Congratulations to Bo Zheng, Zhuoyang Li, Fuxuan Wei, Qiguang Chen, Libo Qin, and Wanxiang Che from the Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology. The team has been invited to speak at the MMNLU-22 workshop on Dec 7th, where you can learn more about their approach.
- 12 Aug: We are pleased to declare the FabT5 team as the winner of the MMNLU-22 Competition Zero-Shot Task. Congratulations to Massimo Nicosia and Francesco Piccinno from Google. They have been invited to speak at the MMNLU-22 workshop on Dec 7th, where you can learn more about their approach.
- 30 Jul: Based on compelling feedback, we have updated our rules as follows: Contestants for the top-scoring model awards must submit their predictions on the evaluation set by the original deadline of Aug 8th. Contestants for the “organizers’ choice award” can submit their predictions until Sep 2nd. The organizers’ choice award will be based primarily on the promise of the approach, but we will also consider evaluation scores.
- 29 Jul: (Outdated – see above) We have extended the deadline for MMNLU-22 evaluation to Sep 2nd. Additionally, besides the winners of the “full dataset” and “zero-shot” categories, we plan to select one team (“organizers’ choice award”) to present their findings at the workshop. This choice will be made based on the promise of the approach, not just on model evaluation scores.
- 25 Jul: The unlabeled evaluation data for our shared task is now live. See instructions in the alexa/massive repo.
- 7 Jul: A Slack workspace is now available.
- 30 Jun: Paper submissions are now being accepted.
- 20 Apr: The MASSIVE dataset and the associated paper were released publicly. Anyone can now start modeling on the data in preparation for the release of the MMNLU-22 evaluation set on July 25th.
Important Dates
Note: We accept both (a) direct submissions through OpenReview and (b) ARR commitments.
- Apr 20th: Release of the MASSIVE dataset (training, validation, test splits) and paper
- July 15th (revised from Aug 15th): ACL Rolling Review (ARR) submission deadline
- Jul 25th: Release of the MMNLU-22 Competition evaluation set
- Aug 8th: Competition deadline for the top-scoring model awards
- Sep 2nd: Competition deadline for the organizers’ choice award and end of MMNLU-22 Competition
- Sep 7th: OpenReview submission deadline
- Oct 2nd: ARR commitment deadline
- TBA (revised from Oct 9th): Acceptance notifications
- Oct 26th (revised from Oct 16th): Camera ready deadline
- Dec 7th: Massively Multilingual NLU 2022 Workshop
Invited Speakers
Heng Ji, UIUC, USA
Géraldine Damnati, Orange Labs, France
Mahdi Namazifar, Amazon Alexa, USA
Anna Rumshisky, UMass Lowell, USA
Sebastian Ruder, Google, UK
David Yarowsky, JHU, USA
Workshop Organizers
Jack FitzGerald, Amazon Alexa, USA
Kay Rottmann, Amazon Alexa, Germany
Julia Hirschberg, Columbia University, USA
Anna Rumshisky, UMass Lowell, USA
Mohit Bansal, UNC, USA
Charith Peris, Amazon Alexa, USA
Christopher Hench, Amazon Alexa, USA
Competition (Shared Tasks) Organizers
Charith Peris, Amazon Alexa, USA
Jack FitzGerald, Amazon Alexa, USA