Humour remains one of the most difficult aspects of intercultural communication. Understanding humour often involves understanding implicit cultural references and/or double meanings, especially in the case of wordplay, which raises not only the question of its (un)translatability, but also that of how to detect and classify instances of this complex phenomenon. The goal of the JOKER track is to bring together linguists, translators, and computer scientists to create reusable test collections for benchmarking and to explore new evaluation metrics in order to foster work on automatic methods for wordplay interpretation, generation, and translation.
JOKER lies at the intersection of multiple fields, including natural language processing, machine translation (MT), and human-computer interaction, as well as linguistics, philosophy, and psychology. In the 2022 edition, we focused on machine translation and constructed a unique English–French parallel corpus of wordplay with 5K parallel one-liner puns and 1.5K parallel instances of wordplay in named entities. We also saw runs based on our corpus for an unshared task on pun generation, aimed at improving interlocutor engagement in dialogue systems. Recent developments in machine learning and artificial intelligence have greatly improved the quality of MT, but puns are often held to be untranslatable, particularly by statistical or neural MT, which cannot robustly deal with texts that deliberately disregard or subvert linguistic conventions.
A few monolingual humour corpora do exist, including the datasets created for shared tasks of the International Workshop on Semantic Evaluation (SemEval): #HashtagWars: Learning a Sense of Humor, Detection and Interpretation of English Puns, Assessing Humor in Edited News Headlines, and HaHackathon: Detecting and Rating Humor and Offense. Mihalcea et al. collected 16,000 humorous sentences and an equal number of negative samples from news titles, proverbs, the British National Corpus, and the Open Mind Common-Sense dataset. Another dataset contains 2,400 puns and non-puns from news sources, Yahoo! Answers, and proverbs. Most datasets are in English, with some notable exceptions in Italian, Russian, and Spanish. To the best of our knowledge, the corpus we constructed within the framework of JOKER Task 3 is the first one for wordplay detection in French.
Wordplay is a recurrent feature of literature, advertising, movies, and social conversations. It is therefore vitally important that natural language processing applications operating on these discourse types be capable of recognising and appropriately dealing with instances of wordplay. As we mentioned before, preserving wordplay in translation might be crucial to understanding the humorous aspect of a scene. Machine translation of wordplay is thus especially important in subtitling. As we demonstrated previously, machine translation (including popular engines like DeepL¹) is successful in only 13% of cases. Although it is impossible to resolve such a complex problem at once, we identified three main steps that could bring us closer to the automation of wordplay analysis: wordplay detection, interpretation, and translation. Wordplay detection and interpretation are also important in dialogue systems, where they allow a virtual agent to react appropriately to the interlocutor's cues.
H. Ardi, M. A. Hafizh, I. Rezqi, and R. Tuzzikriah, “Can Machine Translations Translate Humorous Texts?”, Humanus, 2022, doi: 10.24036/humanus.v21i1.115698.
F. Regattin, “Traduction automatique et jeux de mots : l’incursion (ludique) d’un inculte”. Brest, Université de Bretagne occidentale, March 2021. URL
T. Miller, “The Punster’s Amanuensis: The Proper Place of Humans and Machines in the Translation of Wordplay”, in Proceedings of the Second Workshop on Human-Informed Translation and Interpreting Technology (HiT-IT 2019), Sept. 2019, pp. 57–64. doi: 10.26615/issn.2683-0078.2019_007.
P. Potash, A. Romanov, and A. Rumshisky, “SemEval-2017 Task 6: #HashtagWars: Learning a Sense of Humor”, in Proceedings of the 11th International Workshop on Semantic Evaluation, Aug. 2017, pp. 49–57. doi: 10.18653/v1/S17-2004.
N. Hossain, J. Krumm, M. Gamon, and H. Kautz, “SemEval-2020 Task 7: Assessing Humor in Edited News Headlines”, in Proceedings of the Fourteenth Workshop on Semantic Evaluation, Dec. 2020, pp. 746–758. [Online]. Available: https://aclanthology.org/2020.semeval-1.98
J. A. Meaney, S. Wilson, L. Chiruzzo, A. Lopez, and W. Magdy, “SemEval-2021 Task 7: HaHackathon, Detecting and Rating Humor and Offense”, in Proceedings of the 15th International Workshop on Semantic Evaluation, Aug. 2021, pp. 105–119. doi: 10.18653/v1/2021.semeval-1.9.
R. Mihalcea and C. Strapparava, “Making Computers Laugh: Investigations in Automatic Humor Recognition”, in Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, British Columbia, Canada, Oct. 2005, pp. 531–538. [Online]. Available: https://www.aclweb.org/anthology/H05-1067
A. Cattle and X. Ma, “Recognizing Humour using Word Associations and Humour Anchor Extraction”, in Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, New Mexico, USA, Aug. 2018, pp. 1849–1858. [Online]. Available: https://www.aclweb.org/anthology/C18-1157
D. Yang, A. Lavie, C. Dyer, and E. Hovy, “Humor Recognition and Humor Anchor Extraction”, in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, Sept. 2015, pp. 2367–2376. doi: 10.18653/v1/D15-1284.
A. Reyes, D. Buscaldi, and P. Rosso, “An Analysis of the Impact of Ambiguity on Automatic Humour Recognition”, in Text, Speech and Dialogue, Berlin, Heidelberg, 2009, pp. 162–169. doi: 10.1007/978-3-642-04208-9_25.
V. Blinov, V. Bolotova-Baranova, and P. Braslavski, “Large Dataset and Language Model Fun-Tuning for Humor Recognition”, in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 2019, pp. 4027–4032. doi: 10.18653/v1/P19-1394.
A. Ermilov, N. Murashkina, V. Goryacheva, and P. Braslavski, “Stierlitz Meets SVM: Humor Detection in Russian”, in Artificial Intelligence and Natural Language, Cham, 2018, pp. 178–184. doi: 10.1007/978-3-030-01204-5_17.
S. Castro, L. Chiruzzo, A. Rosá, D. Garat, and G. Moncecchi, “A Crowd-Annotated Spanish Corpus for Humor Analysis”, in Proceedings of the Sixth International Workshop on Natural Language Processing for Social Media, Melbourne, Australia, July 2018, pp. 7–11. doi: 10.18653/v1/W18-3502.
L. Ermakova et al., “CLEF Workshop JOKER: Automatic Wordplay and Humour Translation”, in Advances in Information Retrieval, vol. 13186, M. Hagen, S. Verberne, C. Macdonald, C. Seifert, K. Balog, K. Nørvåg, and V. Setty, Eds. Cham: Springer International Publishing, 2022, pp. 355–363. doi: 10.1007/978-3-030-99739-7_45.
L. Ermakova et al., “Overview of the CLEF 2022 JOKER Task 3: Pun Translation from English into French”, in Proceedings of the Working Notes of CLEF 2022: Conference and Labs of the Evaluation Forum, 2022.
1: DeepL translator. Accessed on 17th July 2022. URL
If you extend or use this work, please cite the paper where it was introduced:
Liana Ermakova, Tristan Miller, Fabio Regattin, Anne-Gwenn Bosser, Claudine Borg, Élise Mathurin, Gaëlle Le Corre, Sílvia Araújo, Radia Hannachi, Julien Boccou, Albin Digue, Aurianne Damoy & Benoît Jeanjean, 2022. Overview of JOKER@CLEF 2022: Automatic Wordplay and Humour Translation workshop. In International Conference of the Cross-Language Evaluation Forum for European Languages (pp. 447-469). Springer, Cham.
This project has received a government grant managed by the National Research Agency under the program "Investissements d'avenir" integrated into France 2030, with the Reference ANR-19-GURE-0001.
JOKER is supported by the Human Science Institute in Brittany (MSHB).