Researchers claim to have trained a machine learning system to identify social media posts that aim to manipulate political events. The automated system, they say, flags such posts based on their content.
A rising number of political events are thought to be targeted by foreign activity on social media networks. The researchers found that these foreign manipulation campaigns can be identified by looking at the timing of posts, the URLs they contain, and their length.
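As a rough illustration of the kind of signals described above, the following Python sketch turns a single post into timing, URL and length features. It is not the researchers' code: the function name `extract_features`, the feature names and the sample post are all assumptions made for this example.

```python
import re
from datetime import datetime

# Simple pattern for links in a post (an assumption; real URL matching is messier).
URL_PATTERN = re.compile(r"https?://\S+")


def extract_features(text, posted_at):
    """Return illustrative timing, URL and length features for one post."""
    urls = URL_PATTERN.findall(text)
    return {
        "hour_of_day": posted_at.hour,        # timing signal
        "day_of_week": posted_at.weekday(),   # Monday is 0
        "url_count": len(urls),               # URL signal
        "char_length": len(text),             # length signals
        "word_count": len(text.split()),
    }


# Hypothetical post and timestamp, made up for illustration.
example = extract_features(
    "Read this now http://example.com/story #news",
    datetime(2020, 7, 22, 14, 30),
)
print(example)
```

Features like these could then be passed to a standard classifier; which model and which exact features the study used is not stated here.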
Princeton University’s Dr Meysam Alizadeh, one of the co-authors of the research, said in an article in The Guardian: “We can use machine learning to automatically identify the content of troll postings and track an online information operation without human intervention.”
The research team reports in the journal Science Advances exactly how the work was carried out, using content from four well-known social media campaigns that targeted the US. As a comparison, the researchers also used data from American Twitter accounts, including both average users and those engaged in the country’s politics, as well as posts from Reddit accounts that weren’t associated with any of these campaigns.
Once the system had been trained, the researchers explored whether it could distinguish between normal user activity and trolls. The results were positive: posts flagged by the technology were generally those made by trolls. However, whilst the system identified some of the troll posts, it did not pick up on all of them.
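One common way to express the difference between "flagged posts are mostly troll posts" and "not every troll post is flagged" is with precision and recall. The sketch below uses made-up labels, not data from the study, and the function name `precision_recall` is an assumption for this example.

```python
def precision_recall(true_labels, flagged):
    """Compute precision and recall from parallel lists of booleans."""
    pairs = list(zip(true_labels, flagged))
    true_pos = sum(1 for is_troll, was_flagged in pairs if is_troll and was_flagged)
    false_pos = sum(1 for is_troll, was_flagged in pairs if not is_troll and was_flagged)
    false_neg = sum(1 for is_troll, was_flagged in pairs if is_troll and not was_flagged)
    # Guard against division by zero when nothing is flagged or no trolls exist.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Made-up example: four troll posts and four normal posts.
is_troll = [True, True, True, True, False, False, False, False]
flagged = [True, True, True, False, False, False, False, False]

precision, recall = precision_recall(is_troll, flagged)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case every flagged post is a troll post (high precision), but one troll post goes undetected (lower recall), which mirrors the pattern described above.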
Prof Martin Innes, director of Cardiff University’s Crime and Security Research Institute, commented: “This is an important, interesting and sometimes intriguing piece of analysis.”
“That machine learning algorithms should be able to identify similar content from within bounded datasets is perhaps to be expected, as after all there were already signals in the data that enabled them to be connected. But as the authors quite correctly clarify, there is a gap still to be bridged in terms of applying these approaches ‘in the wild’ to identify ‘live’ operations.”
This approach, the team claim, is different from detecting bots – an important point to note as these social media campaigns very often involve posts made by humans.
The research also found that the technology’s ability to detect campaigns varied by country, with Dr Alizadeh stating that the system’s performance was “near-perfect; close to 99% accurate” for Venezuelan campaigns. Chinese activity was also claimed to be easier to detect than Russian activity.
However, Dr Alizadeh notes that this does not mean certain countries are better at disguising themselves as regular US users; there are many other reasons why some campaigns are more difficult to spot. “For example, the Venezuelans always talk about politics. The Russian trolls, some of them never talk about politics – they engage in hashtag games or share links to download music. Why are Russian trolls doing that? One answer could be to build their own audience.”