Serbian Voices

Research & Data Collection Project

The project’s goal was to assist Duke University’s DevLab in gathering data for a machine learning system it had developed. Marko Galjak was in charge of the project’s overall design, which involved developing a system for collecting news articles and data from the social networking site Twitter. A tool for examining networks within Serbia’s Twittersphere was developed as part of that effort.

The premise of the project was that enough local context and understanding can be incorporated into a machine learning model to potentially predict shifts in civic spaces more effectively than traditional voices on a country-by-country level. Catalyst Balkans set out to implement an innovative set of activities that added value to the INSPIRES forecasting model in three ways: gathering a broader and more diverse range of local voices, providing context and interpretation of both domestic news and local Twitter activity, and building a local validation dataset that helped the machine learning algorithm produce a more accurate forecast for Serbia.

Serbian Twittersphere network of positive and negative communication over one week.

This project was supported by the INSPIRES Consortium within Catalyst Balkans. The technical infrastructure for the project was built by Vladimir Sivčević, an extremely competent Data Science Backend Engineer.