More than pretty?

Whilst out for a run this week, I was catching up on my podcast listening. On my playlist was Episode 91 of Data Stories, in which the creators of RAW were sharing what it is, what it does and how it came into being. RAW claims to be ‘The missing link between spreadsheets and data visualization.’ Back when I wrote my research proposal, I thought that social network analysis (SNA) would be one technique I might use to learn more about teacher learning on Twitter. There is a raft of tools that can help with this, which exist on a spectrum from those which rely on having expertise in coding, to those (like TAGS and NodeXL) which are usable by novices like me. In addition to gathering tweets, they often allow you to produce visualisations of the connections between those tweets:

“NodeXL Twitter Network Graphs: CHI2010” flickr photo by Marc_Smith shared under a Creative Commons (BY) license

The visualisation produced often shows users as nodes, where the connections between them are produced as a result of one user @ing another. The tools which produce the visualisations invariably provide access to the connective data which underpins the visualisation – in-degree, out-degree, betweenness and centrality for example – which in turn allow you to begin to understand the types of relationships between those involved in the exchange. This video by Martin Everett provides an accessible introduction to social network analysis.
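To make the in-degree and out-degree measures mentioned above a little more concrete, here is a minimal Python sketch. It is not how NodeXL or TAGS calculate anything; the sample tweets, the handles and the `degree_counts` function are all hypothetical, and it assumes a mention is simply any `@` followed by word characters.

```python
import re
from collections import Counter

# Hypothetical sample data: (sender, tweet text) pairs, not from a real chat.
tweets = [
    ("ann", "@bob great point about feedback"),
    ("bob", "@ann thanks! @cat what do you think?"),
    ("cat", "Joining late, hello everyone"),
    ("ann", "@cat welcome"),
]

# Assumption: a mention is '@' followed by letters, digits or underscores.
# This would also pick up the domain part of an email address.
MENTION = re.compile(r"@(\w+)")

def degree_counts(tweets):
    """Return (out_degree, in_degree), counting each sender -> recipient link once."""
    edges = set()
    for sender, text in tweets:
        for target in MENTION.findall(text):
            if target != sender:  # ignore self-mentions
                edges.add((sender, target))
    out_degree = Counter(sender for sender, _ in edges)
    in_degree = Counter(target for _, target in edges)
    return out_degree, in_degree

out_degree, in_degree = degree_counts(tweets)
print("out-degree:", dict(out_degree))
print("in-degree:", dict(in_degree))
```

Out-degree here is the number of different people someone has addressed, and in-degree the number of different people who have addressed them. Betweenness needs the whole network rather than simple counts, so it is left to the dedicated tools.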

And as Nicholas Christakis ‘explains why individual actions are inextricably linked to sociological pressures,’ we learn why SNA can be so important across a range of contexts. However, SNA was a direction I chose not to take. Although learning the principles would have taken time (and becoming adept with the tools even more), it might have been worth it had it produced insights which informed my study. But the largely quantitative data produced and the interpretations they enabled weren’t coherent with the direction I wanted to take. I wasn’t after the big picture; I didn’t want to describe and explain the interplay between large (or small) numbers of individuals so that I could say, for example, that this is a tight-knit community with strong social ties which rarely interacts with this other community. I’m more interested in describing and understanding what the individuals themselves are doing and the effects it’s having on themselves, in addition to those with whom they’re connected. So I walked away from SNA … but not visualisations.

Visual depictions have always been important in my own learning and when I was teaching others. Whether it was the classic experiment report diagram, a circuit diagram to render a simpler(?) version of the wires and components we connected together, or a chart to present the data we had collected, these forms of visualisation (arguably) helped convey meaning; they did semiotic work. So whilst I may have left SNA behind for now, I never dropped visualisations, sure that they might yet have something to contribute to my research, whether in helping me to better understand my data, or in reaching out to others to help explain to them what I had been finding and what I thought it was telling me. Which brings me back to RAW.

The original version I accessed a while ago required a download from GitHub and some technical jiggery-pokery to get it configured on your local machine. I lacked the nous and the time to get to grips with it then. As I learned in the podcast, some support funding has enabled RAW to be made available as a web app, and boy is it simple and straightforward to use. You simply paste in data, from a spreadsheet for example, which RAW then processes in your browser (the data never leaves your computer, so confidentiality is not an issue). A click or two later, you can test the various standard visualisations available through the app to see which best helps you understand and tell your data’s stories.

“Chat Vis” flickr photo by IaninSheffield shared under a Creative Commons (BY-NC-SA) license

Producing this visualisation took no more than a few minutes. The data were collected as a mini corpus of tweets from a recent hashtag chat on Twitter. In collecting the tweets, I wanted to undertake a qualitative study of the tweet contents; the other data which came down with the corpus – Twitter handles, hashtags, @ replies, Likes, RTs – were largely superfluous to my requirements and I wouldn’t need them for my analysis. However, the corpus was in spreadsheet form, so when I was introducing myself to the RAW web app, I wondered what RAW might be able to do with it. It was with some surprise and delight that the data were instantly recognised and parsed by RAW (that hardly ever happens with data!) and I was able to test which of the standard visualisations might help me to better visualise what was going on in the #chat.

What did the visualisation help me with?

It’s perhaps important to say that whether the RAW visualisations are useful will depend on the nature of your data: are they textual, numeric, temporal, etc.? I tried several different visualisations with different strands of my data, several of which showed promise and to which I may return to explore further. I settled, for the moment, on the one above, largely for its simplicity coupled with the new understanding it afforded me. (To follow along, you might find it easier to click on the image to take you to a larger version where you can see more detail. All names have been anonymised.)

Down the left and right are listed the Twitter handles of various participants in the chat. A ‘ribbon’ connecting someone on the left with someone on the right represents an ‘@’ tweet, directed to that person. The wider the ribbon, the more tweets have been sent to them. Knowing that allows us to consider what the visualisation shows. We can see immediately those people who have sent tweets to others; any tweet with no ‘@’ in it is represented by a ribbon from the person to the top right bar which has no Twitter handle. It looks like about half of tweets sent are directed at specific people. The most active participants appear to be LenD, StephanieB and KerryW, who all also seem to direct around half of their tweets at others. MartinA and MichaelM, who are quite prolific, tend to send most of their tweets to other people and few undirected ones; KarenM is the opposite.
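The counting behind those ribbons can be sketched in a few lines of Python. This is only an illustration of the idea, not what RAW does internally: the `ribbon_weights` function, the `(no @)` label for the unnamed bar and the sample tweets are all hypothetical, and each tweet is attributed to the first handle it mentions.

```python
import re
from collections import Counter

# Assumption: a mention is '@' followed by letters, digits or underscores.
MENTION = re.compile(r"@(\w+)")
UNDIRECTED = "(no @)"  # hypothetical label for the bar with no Twitter handle

def ribbon_weights(tweets):
    """Count tweets per (sender, first-mentioned handle) pair.

    Tweets with no '@' are counted against the UNDIRECTED bucket.
    """
    counts = Counter()
    for sender, text in tweets:
        match = MENTION.search(text)  # first mention only
        target = match.group(1) if match else UNDIRECTED
        counts[(sender, target)] += 1
    return counts

# Hypothetical sample data: (sender, tweet text) pairs.
tweets = [
    ("ann", "@bob agreed"),
    ("ann", "Q1: what does good feedback look like?"),
    ("ann", "@bob and @cat, any examples?"),
    ("cat", "@ann yes, plenty"),
]

weights = ribbon_weights(tweets)
directed = sum(n for (_, target), n in weights.items() if target != UNDIRECTED)
share = directed / sum(weights.values())

for (sender, target), n in sorted(weights.items()):
    print(f"{sender} -> {target}: {n}")
print(f"share of tweets directed at someone: {share:.0%}")
```

Each count corresponds to the width of one ribbon, and the final figure is the kind of 'about half are directed' estimate that can otherwise only be judged by eye.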

The visualisation also allows us to quickly see whether people are sending tweets to a variety of people or just a few. We can also take a view from the recipients’ perspectives; are they ‘popular,’ receiving tweets from a variety of people? Maybe there are individuals who have been pulled into the chat by a tweet but are not participating, as indicated by people who appear on the right as recipients but are not also listed among the senders?
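Those questions can also be put to the data directly. The sketch below is a hedged illustration using made-up handles; the `pairs` list, the `all_senders` set and both function names are hypothetical, and it assumes the sender and recipient of every directed tweet have already been extracted.

```python
from collections import defaultdict

def reach_by_sender(pairs):
    """Map each sender to the set of different people they addressed."""
    reach = defaultdict(set)
    for sender, recipient in pairs:
        reach[sender].add(recipient)
    return dict(reach)

def silent_recipients(pairs, all_senders):
    """Handles that were addressed in the chat but never tweeted in it."""
    recipients = {recipient for _, recipient in pairs}
    return recipients - set(all_senders)

# Hypothetical data: one (sender, recipient) pair per directed tweet.
pairs = [("ann", "bob"), ("ann", "cat"), ("bob", "ann"), ("bob", "dee"), ("ann", "bob")]
# Everyone who tweeted at all, including those who only sent undirected tweets.
all_senders = {"ann", "bob", "eve"}

reach = reach_by_sender(pairs)
silent = silent_recipients(pairs, all_senders)

print({sender: sorted(targets) for sender, targets in reach.items()})
print(sorted(silent))
```

Passing in the full list of senders matters: someone who only posted undirected tweets is still a participant, so should not be counted among those pulled in but silent.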

When used as a comparative tool with visualisations produced for the same #chats over a period of time, it might be possible to explore which chats were more discursive with people interacting with one another, and which were more about posting into the ether. It would also be possible to compare one chat with a different chat and make similar comparisons.

So what?

Having elected not to take the SNA route, my reason for collecting a bunch of tweets from a #chat or shoutout is to enable me to analyse them in detail. The additional data which are also collected are peripheral and I haven’t used them, other than the url to get back to individual tweets when necessary. What RAW allowed me to do is to expand the possibilities and consider more of the data, and in a different way. I can experiment and explore with minimal time cost and quickly see whether rendering the data in different ways can produce different insights. This also opens up the possibility of revisiting the way I’ve been coding tweets thus far, by offering different ‘targets’ for analysis. It makes visible people engaged in particular types of activity which might not otherwise have been apparent if looking at a corpus on a tweet by tweet basis.

What this little experiment has shown me is the power of the ‘@.’ It’s the ‘@’ and its function as a marker of addressivity which connects people to one another and has the capacity to change a Twitter session from an ideas-sharing fest to a discussion where people might be engaging with one another’s ideas. I was also obliged to consider that the visualisation produced only a simplified version of the exchange; people listed on the right were simply the first @mentioned in tweets from those on the left. It’s quite possible a particular tweet was addressed to multiple people; something which would not have been apparent without examining the contents of the tweet in detail. The visualisation also lumped together all addressed tweets; that is, initial tweets sent to someone, together with @replies. There are clearly shortcomings which I need to dig into further.
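One way those two shortcomings might be addressed before the data reach the visualisation is sketched below. It is an assumption-laden illustration rather than something I have run on the corpus: the `expand_mentions` function and the record fields (`sender`, `text`, `reply_to`) are hypothetical names, and it presumes the export has some column marking whether a tweet was a reply.

```python
import re

# Assumption: a mention is '@' followed by letters, digits or underscores.
MENTION = re.compile(r"@(\w+)")

def expand_mentions(records):
    """Return one (sender, recipient, kind) row for every handle mentioned.

    'kind' is "reply" when the hypothetical 'reply_to' field is filled in,
    and "initial" otherwise.
    """
    rows = []
    for record in records:
        kind = "reply" if record.get("reply_to") else "initial"
        for target in MENTION.findall(record["text"]):
            rows.append((record["sender"], target, kind))
    return rows

# Hypothetical records standing in for rows of the spreadsheet.
records = [
    {"sender": "ann", "text": "@bob @cat what did you try?", "reply_to": None},
    {"sender": "bob", "text": "@ann peer marking, mostly", "reply_to": "ann"},
    {"sender": "cat", "text": "Just listening in tonight", "reply_to": None},
]

rows = expand_mentions(records)
for row in rows:
    print(row)
```

Rows produced this way could be pasted back into a spreadsheet, giving every addressee its own ribbon and allowing replies to be separated from initial tweets. Undirected tweets drop out here, so they would need counting separately.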

I’m minded to reflect that it was also the ‘@’ which produced the above visualisation; without its presence in the tweets, this visualisation would perhaps not even have been possible. The presence of the @ allowed the original data to be partitioned in such a way as to produce different data fields and subsequently allowed those fields to be set against or compared with one another in the visualisation itself.

I’ve only scratched the surface with RAW and need to experiment more and think more about whether the visualisations it produces can add to the story I want to share.

