In this talk, we introduce the Social Media Mining for Health Applications 2020 & 2021 shared tasks and the first Russian corpus of tweets annotated for adverse drug reactions.
The vast amount of data on social media presents both significant opportunities and significant challenges for health informatics. The aim of the Social Media Mining for Health Applications (#SMM4H) shared tasks is to take a community-driven approach to the natural language processing challenges of using social media data in this field, including informal, colloquial expressions of clinical concepts, noise, data sparsity, ambiguity, and multilingual posts.
In this talk, Elena Tutubalina and Ilseyar Alimova introduce the SMM4H 2020 & 2021 shared tasks and the first Russian adverse drug reaction corpus of tweets. Elena describes three tasks on mining adverse drug effects using annotated datasets, focusing on current challenges and the imbalanced nature of the datasets. Ilseyar describes the creation of the Russian dataset, focusing on the data collection and annotation process. At the end of the talk, they discuss participants' results in the SMM4H shared tasks on the classification of Russian tweets. This is joint work with the University of Pennsylvania, USA.