One of my projects is automated misinformation detection. We have applied it to COVID-19 misinformation detection on Twitter, with applications deployed on EDNA. EDNA-based misinformation detection is also a popular project among students in the Enterprise Computing and Real-Time Systems courses at Georgia Tech.
The findings in [^1] suggest a need for automated misinformation and disinformation detection. Misinformation spreads exponentially fast, and identifying it quickly by hand requires significant investment in labor and analysis. Automated detection can improve response times to rapidly spreading misinformation by identifying potential new correlated keywords as they emerge. It can also complement manual detection: integrating periodic manual review with automated detection yields a best-of-both-worlds scenario, where automated detection scales across varieties of known misinformation and manual detection helps identify new types.
Such an automated misinformation filter requires dynamic trend models to identify misinformation keywords, as well as language models to detect fake news and textual markers of misinformation and disinformation. These models must also be adaptive, because misinformation changes rapidly: disinformation agents discard old narratives and replace them with new ones over time, so the models need continuous updates to handle concept drift in the streaming data. This in turn requires automated data generation to create training and evaluation sets. An end-to-end system would integrate data generation, model training, deployment, and updates. We present such a system in EDNA-COVID (note: it is still a work in progress).
This is our end-to-end system for dynamic misinformation tracking.
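The continuous-update requirement above can be sketched with a simple drift monitor: track the keywords seen in a recent window of the stream and trigger a retrain when too many of them fall outside the trained vocabulary. This is a minimal illustration under assumed semantics, not EDNA-COVID code; the class name, the window size, and the drift threshold are all hypothetical.

```python
from collections import deque


class SlidingWindowRetrainer:
    """Illustrative drift monitor: retrain a keyword-trend model when the
    keyword distribution in the recent window drifts from the trained one."""

    def __init__(self, window_size=1000, drift_threshold=0.3):
        self.window = deque(maxlen=window_size)  # recent keyword sets
        self.drift_threshold = drift_threshold
        self.vocabulary = set()  # keywords the current model was trained on

    def observe(self, keywords):
        """Record the keywords extracted from one incoming message."""
        self.window.append(set(keywords))

    def drift_score(self):
        """Fraction of recently seen keywords absent from the trained vocabulary."""
        recent = set().union(*self.window) if self.window else set()
        if not recent:
            return 0.0
        return len(recent - self.vocabulary) / len(recent)

    def maybe_retrain(self):
        """Refit on the current window when drift exceeds the threshold.
        In a deployed system this would instead launch a training job and
        redeploy the updated model."""
        if self.drift_score() > self.drift_threshold:
            self.vocabulary = set().union(*self.window)
            return True
        return False
```

A real deployment would use a statistical drift test rather than raw vocabulary overlap, but the control flow (observe, score, conditionally retrain) is the same.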
The misinformation model is an ensemble of experts: component models that detect misinformation in a variety of ways, from trend-based keyword tracking to language-model analysis of text.
When any member of the ensemble of experts detects misinformation, the system filters it out of the stream. The detected misinformation is recorded as training data for the other members of the ensemble, and we periodically trigger an update of the component models.
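The filtering step can be sketched as a generator over the stream; this is a hypothetical illustration, not the EDNA-COVID implementation, and `filter_stream`, the expert callables, and `training_log` are assumed names.

```python
def filter_stream(messages, experts, training_log):
    """Yield only messages that no expert flags as misinformation.

    `experts` is a list of callables returning True when a message looks
    like misinformation. Flagged messages are dropped from the stream and
    appended to `training_log` with the index of the expert that caught
    them, so they can later serve as training data for the other experts.
    """
    for msg in messages:
        flagged_by = next(
            (i for i, expert in enumerate(experts) if expert(msg)), None
        )
        if flagged_by is None:
            yield msg
        else:
            training_log.append((flagged_by, msg))
```

For example, with two toy keyword experts, a three-message stream would pass through only the unflagged message, while the other two land in the training log for the periodic model update.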
[^1]: Some Like it Hoax: Automated Fake News Detection in Social Networks. https://arxiv.org/abs/1704.07506