I built a chat room for a PhD research project.
Any write-up of this in the PhD thesis itself will never be able to do justice to the drawn-out, agonising, Herculean labour that this simple, whimsical statement turned out to involve, so I thought I’d do that here instead!
So my PhD involves studying the experiences of people using English as a second language in text-based online communication: chat rooms, blogs, comment threads, instant messaging, social media, even emails, you name it. In order to do this, I thought a good first step would be to talk to these people and ask them about their experiences, bringing them together in small focus groups.
Then I thought: why not do it online? Why not use an online chat room to talk about online communication? This has massive benefits. Unlike face-to-face focus groups, there is no need to convert a video recording into a text transcript (which you can expect to take four hours for each hour of recorded material), and participants can take part from wherever they have an internet-connected computer, meaning that a) they don’t have to travel to the study and b) they can be located pretty much anywhere in the world, multi-time-zone scheduling permitting.
Next question: what platform to use? I considered a number of possibilities. WhatsApp looked like a good option for several reasons: it gives you the option to email yourself the chat log of any conversation you’re having, providing an instant transcript, plus conversations are encrypted as they wing their way through the wires of the web. But you need people’s phone numbers, which I thought was unnecessarily invasive. Other obvious contenders like Facebook Messenger lacked an option to download the chat (it is technically possible, but it involves downloading your entire Facebook dataset, which seemed a bit cumbersome; ditto with Google Hangouts). For research purposes, however, a major flaw of all commercial and web-based options was that you can’t be sure where your data is going, where it is being stored, and what is being done with it.
Ethics is a pretty big deal in academic research these days, and in anything to do with internet research, data security is an important ethical consideration. To satisfy ethics review boards you need to show you’ve given thought to the whereabouts of data collected from research participants at all times – that’s the physical location, so if it’s “in the Cloud” you need to know exactly what that means: which company’s particular subsection of “the Cloud” it is, where the actual computers storing your data are, and what laws apply to data storage in that location.
So, reasoned my treacherous brain, we can avoid all of these problematic ethical issues and ensure that the platform has all necessary features like exportable transcripts etc. if we simply… do it ourselves. You know, make a chat room and host it on a University of Nottingham server.
As I look back on this last sentence I’m not sure what on earth I thought I was thinking.
I should mention at this point that I have no background at all in computer science. I did a GCSE in IT in 1998. I had spent some time over the first year of my PhD learning the basics of programming using the excellent Codecademy and had successfully learned enough Python to process Twitter data. This experience had shown me that with perseverance and a great deal of searching on programming advice sites (Stack Overflow, W3Schools, DigitalOcean, many others…), it’s possible to figure stuff out and do things with computers that initially seem impossible. So I thought I should be able to do this.
And I was right. Just. But every step of the way threw up new and seemingly intractable challenges that not only required the finding of a solution, but often the hasty acquisition and understanding of some fundamental aspect of computer science and web science. Sometimes it took hours of painstaking online research just to understand what question I needed to ask.
Possibly it was not the most productive use of my time. But I’ve finally arrived at a point now where I have a cast-iron online platform on which I can do some (hopefully) interesting and (definitely) ethically-sound research.
In part II of this post I’ll describe the process, partly as an aide-mémoire for myself (lest we forget…) and partly in the hope that it might be of some use to anyone equally daft enough to try doing it for themselves. It will be full of dreadfully dull technical details, but hopefully I can explain it in such a way as to make it accessible to non-CS people like me.