Between January and April 2016 I undertook a study of the use of Twitter for public engagement among members of staff at the University of Nottingham. The project was run under the auspices of CaSMa – Citizen-Centric Approaches to Social Media Analysis – a research team at the Horizon Digital Economy Research Institute that explores methods of performing social media research that respect social media users' rights to privacy and to ownership of their personal data. Consenting study participants provided data from their Twitter feed by exporting it from a web tool that was designed primarily to allow users to monitor and manage their Twitter activity. Using the graph visualisation software Gephi, I created an image of the network of interactions formed by the tweeting, retweeting, quoting, mentioning, favouriting and following events in the data, and analysed hashtag propagation to look for signs of successful public engagement.
It was a challenging project. Designing the data collection in line with CaSMa’s citizen-centric ethos required meeting with each participant in person (and consequently much to-ing and fro-ing between the University of Nottingham’s various campuses and partner organisations), talking them through the web tool and the data collection process, and obtaining their written consent to analyse their data. Once the collection procedure was in place, I had to work out what to do with the data: it was delivered to me as JSON-formatted text files, and rendering complete data sets required much sorting and parsing of the data structures. The text files presented a series of events: tweets, retweets, favouriting, following, etc. I needed to work out, for example, where in the hierarchical text structure the ID of the initiator of a particular event was stored – the person who had written a tweet, or liked a tweet, or retweeted a tweet. In the latter two cases I also wanted to know the ID of the person whose tweet had been liked or retweeted. This information was not nested in exactly the same place in every event type, and a considerable amount of time was spent establishing the necessary paths within each event type. Once these were established, I used Python to retrieve the data and compile it into uniform data sets. I had not previously done any programming, and getting to grips with the language was a real learning experience.
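To give a flavour of the problem, here is a minimal Python sketch of the "different path per event type" idea. It is not the code used in the project: the event type names, the field names (`user`, `source`, `retweeted_status`, `target_object`) and the sample records are all invented for illustration, since the real export format of the web tool is not reproduced here.

```python
import json

# ASSUMPTION: these key paths are hypothetical. The real export nested the
# IDs differently, and finding those paths was the time-consuming part.
INITIATOR_PATHS = {
    "tweet": ("user", "id"),
    "retweet": ("user", "id"),
    "favourite": ("source", "id"),
    "follow": ("source", "id"),
}
TARGET_PATHS = {
    "retweet": ("retweeted_status", "user", "id"),
    "favourite": ("target_object", "user", "id"),
    "follow": ("target", "id"),
}


def dig(record, path):
    """Follow a sequence of keys into nested dicts; None if anything is missing."""
    if path is None:
        return None
    for key in path:
        if not isinstance(record, dict) or key not in record:
            return None
        record = record[key]
    return record


def to_row(event):
    """Flatten one event into a uniform row, whatever its type."""
    kind = event.get("type")
    return {
        "event": kind,
        "primary_user": dig(event, INITIATOR_PATHS.get(kind)),
        "secondary_user": dig(event, TARGET_PATHS.get(kind)),
    }


# Invented sample data standing in for one participant's exported file.
sample = json.loads("""[
  {"type": "tweet", "user": {"id": "A"}},
  {"type": "retweet", "user": {"id": "B"},
   "retweeted_status": {"user": {"id": "A"}}},
  {"type": "favourite", "source": {"id": "C"},
   "target_object": {"user": {"id": "A"}}}
]""")

rows = [to_row(event) for event in sample]
for row in rows:
    print(row)
```

Keeping the paths in lookup tables rather than in a chain of if-statements means that a newly discovered event type only needs a new table entry, and an unknown type simply yields empty user fields instead of crashing.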
The code compiled primary user, secondary user, and mentioned user data, and with this I created a network graph visualisation. The final product looked like this:
The layout is determined by a mathematical algorithm, and the colours are the result of a modularity analysis carried out by the software to identify discrete communities based on interactions. Unsurprisingly, most of the communities in the image above are centred on my participants, although the blue, purple and black communities subsume more than one individual participant, and not, in all cases, by the conscious design of the users themselves. Outlying coloured dots that seem to have ‘escaped’ their neighbours represent individuals who bridge two communities (and are consequently located equidistant between the two).
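For anyone curious how compiled user data of the kind described above might be handed to Gephi, the following is a rough sketch rather than the project's actual code. It assumes each compiled row carries a primary user, an optional secondary user and a list of mentioned users (the field names and the sample rows are made up), counts how often each pair interacts, and writes a CSV edge list with `Source`, `Target` and `Weight` columns, which Gephi's spreadsheet import can read.

```python
import csv
import io
from collections import Counter

# Invented rows standing in for the compiled, uniform data set.
rows = [
    {"primary_user": "B", "secondary_user": "A", "mentioned_users": ["D"]},
    {"primary_user": "C", "secondary_user": "A", "mentioned_users": []},
    {"primary_user": "B", "secondary_user": "A", "mentioned_users": []},
    {"primary_user": "A", "secondary_user": None, "mentioned_users": ["B", "D"]},
]


def build_edges(rows):
    """Count directed interactions from each primary user to every other user."""
    weights = Counter()
    for row in rows:
        source = row["primary_user"]
        targets = [row.get("secondary_user")] + list(row.get("mentioned_users", []))
        for target in targets:
            # Skip events with no second party, and self-interactions.
            if target is not None and target != source:
                weights[(source, target)] += 1
    return weights


def write_edge_csv(weights, handle):
    """Write the edge list with the column headers Gephi looks for."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight])


edges = build_edges(rows)
buffer = io.StringIO()  # swap for open("edges.csv", "w", newline="") to save a file
write_edge_csv(edges, buffer)
print(buffer.getvalue())
```

Collapsing repeated interactions into a single weighted edge is a design choice: it keeps the file small and lets the layout and modularity calculations treat frequent interaction as a stronger tie.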
Combining this approach with an analysis of hashtags suggested that successful uptake of a hashtag-denoted topic or event can be aided by recruiting partners to help spread the message. However, detecting true public engagement proved challenging. Due to the data collection method, full profile data were only collected on users tweeting or retweeting, and not on users favouriting or following, resulting in profile data for only 60% of users. Consequently, it was not possible to robustly classify users as ‘inside’ or ‘outside’ the academic community, or to assess to what extent the message was reaching a general ‘public’ rather than circulating around a more specialised audience. In fact, this consideration raised questions of who constitutes the ‘public’ in public engagement, and whether the concept of a demarcated ‘academia’ is a valid proposition (apologies for all the air-quotes).
Further research could look at finding computational methods to process profile descriptions and produce judgements of the likely affiliation of an individual. However, this would again raise ethical questions, which are going to become more and more salient in future social media research.