Using AI for Multi-Country Automated Protest Event Collection

November 27, 2020

GLODEM AI & CSS Seminar Series

Title: Using AI for Multi-Country Automated Protest Event Collection

Date: Thursday, 3 December 2020

Time: 18:00-19:30 (GMT+3)

Place: This seminar will be conducted on Zoom


Speakers:

Dr. Erdem Yörük, Assoc. Prof. of Sociology, Koç University

Dr. Çağrı Yoltar, Postdoctoral Researcher, Koç University

Dr. Ali Hürriyetoğlu, Postdoctoral Researcher, Koç University

Fırat Duruşan, PhD Candidate, Ankara University

Moderator: Dr. Merih Angın, Director of MA-CSSL, Assistant Professor of International Relations, Koç University


Abstract: Protest event analysis is the most common approach among social movement scholars: it is an unobtrusive and context-sensitive technique that can convert unstructured material into large volumes of data suitable for cross-national, cross-time and cross-issue comparison. The use of digitized news sources and automated approaches to collect protest event information accelerated in the 2000s, and several very large projects, such as GDELT, ICEWS, EMBERS, SPEED, POLCON, and MMAD, were formed to apply fully or semi-automated methods to collect contentious political event data from news sources. In this short presentation, we will describe the potential and challenges of using artificial intelligence to understand social movements and then present the new bottom-up methodological approach adopted by the GLOCON (Global Contentious Politics Database) Project, which is part of our European Research Council (ERC) funded project Emerging Welfare. We look for the best way of creating a gold standard corpus for training a deep learning system designed to collect protest information automatically in a cross-country context. We show that creating a gold standard corpus for training and testing machine learning models from randomly chosen news articles in news archives yields better performance than selecting news articles by keyword filtering, which is the most prevalent method currently used in automated event coding. We advance this new bottom-up approach to ensure generalizability and reliability in cross-country comparative protest event collection from international and local news across different countries, languages, sources and time periods, which entails a large variety of event types, actors, and targets.
We present the results of comparing our random-sample approach to keyword filtering and show that machine learning algorithms, particularly state-of-the-art deep learning tools, perform much better when trained on a gold standard corpus built from a randomly selected set of news articles from China, India and South Africa. Finally, we present our approach to overcoming the major ethical issues that are intrinsic to protest event coding.
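
To make the contrast between the two selection strategies concrete, the following is a minimal sketch in Python. It is not the GLOCON pipeline: the toy archive, its labels, the keyword list, and all function names are hypothetical and invented purely for illustration. It shows how a keyword filter can only ever select articles containing the chosen terms, whereas a random draw is made without regard to wording.

```python
import random

# Hypothetical toy archive of (text, is_protest) pairs. The labels stand in
# for human annotation and are invented for illustration only.
ARCHIVE = [
    ("Workers protest outside the factory gates", True),
    ("Students march to parliament over fees", True),
    ("Villagers block the highway demanding water", True),
    ("Minister opens new railway line", False),
    ("Fans protest referee decision on social media", False),
    ("Central bank holds interest rates", False),
]

KEYWORDS = ("protest", "strike", "riot")  # assumed keyword list


def keyword_sample(archive, keywords):
    """Select only articles whose text contains at least one keyword."""
    return [doc for doc in archive
            if any(k in doc[0].lower() for k in keywords)]


def random_sample(archive, k, seed=0):
    """Select k articles uniformly at random, ignoring their content."""
    rng = random.Random(seed)
    return rng.sample(archive, k)


def protest_recall(selected, archive):
    """Share of all protest articles in the archive present in the selection."""
    total = sum(1 for _, is_protest in archive if is_protest)
    found = sum(1 for _, is_protest in selected if is_protest)
    return found / total


if __name__ == "__main__":
    by_keyword = keyword_sample(ARCHIVE, KEYWORDS)
    by_chance = random_sample(ARCHIVE, k=3)
    print("keyword-filtered:", [text for text, _ in by_keyword])
    print("randomly drawn:  ", [text for text, _ in by_chance])
    print("protest recall of keyword filter:",
          protest_recall(by_keyword, ARCHIVE))
```

In this toy archive the keyword filter retrieves one genuine protest report and one non-protest article that merely uses the word "protest", and it cannot retrieve the march or the road blockade, which are described without any of the keywords. A corpus built this way would never expose a classifier to such reports, which is one plausible reading of why the abstract favors random selection.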

Advance registration is required for this event.