SM&Society Day 2: Challenging Social Media Analytics

flickr photo by Steve Burt shared under a Creative Commons (BY-SA) license

Susan Halford provided the opening keynote and reminded us that ‘data never sleeps’ and is being generated at ever increasing scales in real time and over time. Whilst this may constitute an ‘unexpected gift’, it’s meant we’re also ‘building the boats as we row’ in terms of the way we’re gathering and analysing those data.

Susan challenged us to consider three questions:

  1. What are social media data?
  2. Where are the data produced?
  3. Why does this matter?

There is genuine concern that much of the current evangelism around big data may have done more harm than good, leading to inflated expectations about what is possible and what we can learn. If we’re not careful, our reliance on the platforms through which we access the data may unduly influence what we find, in a host of different ways, and in ways which vary over time. Demographic and geographic data especially need treating with caution, or at least with care and in full knowledge of their limitations. Perhaps we should go beyond demographics and make a virtue of the biases, limitations and specificities inherent in the data.

I hope in a sense that is what my research is doing, where I’m focusing on a particular, self-selecting sample, engaged in a specific activity. For me, the demographics are in some senses pre-defined – teachers who use Twitter. Their gender, religion or ethnicity will be of no consequence since I’ll not be classifying my results using those criteria. Or at least I never intended to, until I thought about location. I’ve assumed my participants will be teachers drawn from a global population, though due to my linguistic limitations, from the English-speaking world. The keynote has encouraged me to revisit my thinking; in different places (with different cultures?), might teachers have a different view of, and approach to, professional learning using Twitter?

Susan asked Les Carr, her colleague from Southampton, to join her on stage. Amongst other things, Les pointed out that vivas inevitably ask us to justify our methods and the data they generate, and how they are appropriate for the research questions we pose. I was grateful for that reminder as I begin to think about my RF2 submission. Duly noted!


Social Media Mining

flickr photo by ianguest shared under a Creative Commons (BY-NC-SA) license

Social Media Mining starts from the premise that with social media being so pervasive, with such a large proportion of our population engaged in it and with the ease of posting, enormous quantities of data are being generated. This provides both opportunities and challenges.

Social Media Mining is the process of representing, analyzing and extracting actionable patterns from social media data.

The challenges include gathering the data, whilst ensuring they are in a format which can be processed; devising procedures which will render the data in a form in which they can be interpreted; and bringing to bear appropriate frameworks which allow meaning to emerge.

On page five we’re informed that those “with a basic computer science background and knowledge of data structures, search and graph algorithms will find this book easily accessible” and “having a data mining or machine learning background is a plus.” Hmm, that didn’t bode well. However, nothing ventured, as they say. The book takes you through the essentials of processing and representing the data in ways which facilitate interpretation. To illustrate how this might be achieved, it draws on set theory, linear algebra and calculus, and constructs algorithms. These are not areas in which I’m experienced (the undergraduate maths I used was mostly that demanded by classical physics), but if it proved necessary, I could probably get to grips with them. The question is whether I will need to. If social network analysis does indeed prove to be a useful method, then my first instinct would be to look for tools already available; there’s no point investing time developing an application if someone’s already done that heavy lifting. If I need to pursue a line of enquiry for which nothing is currently available, then my next step might be to explore potential partnerships with someone who has the expertise. Are there any undergraduates studying in this area who are looking for a real-world project to fulfill their course criteria, for example? If not, and the outcome of my research depended on it, then it would just be a matter of rolling up my sleeves and getting stuck in.

Where I found the book really useful was in introducing some of the concepts and terminology which would be needed to represent and interpret the data. Nodes, edges, centrality, transitivity, reciprocity and assortativity are useful ideas when discussing concepts embodied within social media. The book goes on to discuss how we can identify communities, how they form, how they are interconnected and how information travels within and across them. Both emic (explicit) and etic (implicit) communities are discussed, though there’s little exploration of what the intent or purpose of communities might be. I’m left pondering the significance of communities within my study. Facebook, LinkedIn & Google+ all have group features, whereas Twitter does not. Curious then that despite Twitter apparently failing to provide the means through which communities might self-organise, it appears to be the dominant SNS to which teachers have been attracted. Why might that be? Although only discussed briefly in the preceding post, the hashtag seems to be an important actor here and a potential candidate around which communities might assemble. Perhaps too, the hashtag community (if it exists) cocks a snook at the emic/etic dichotomy?
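To make a few of those terms concrete, here is a minimal sketch in plain Python. The names and the ‘who mentions whom’ edges are invented purely for illustration, and the three measures are simplified textbook versions rather than the book’s own formulations; an existing network analysis library would be the sensible choice for real data.

```python
from itertools import combinations

# Hypothetical "who mentions whom" edges (source, target); names are invented.
edges = {
    ("ann", "bob"), ("bob", "ann"),
    ("ann", "cat"), ("cat", "bob"),
    ("dan", "ann"),
}

# Nodes are simply every account that appears in at least one edge.
nodes = sorted({n for edge in edges for n in edge})

def in_degree_centrality(nodes, edges):
    """Share of the other nodes that point at each node."""
    return {
        n: sum(1 for _, target in edges if target == n) / (len(nodes) - 1)
        for n in nodes
    }

def reciprocity(edges):
    """Fraction of directed edges whose reverse edge also exists."""
    return sum(1 for (u, v) in edges if (v, u) in edges) / len(edges)

def transitivity(nodes, edges):
    """Fraction of connected triples that close into a triangle (direction ignored)."""
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    triples = closed = 0
    for n in nodes:
        # Every pair of neighbours of n forms a triple centred on n.
        for a, b in combinations(sorted(neighbours[n]), 2):
            triples += 1
            if b in neighbours[a]:
                closed += 1
    return closed / triples if triples else 0.0

print(in_degree_centrality(nodes, edges))
print(reciprocity(edges))
print(transitivity(nodes, edges))
```

In this toy network only the ann/bob pair mention each other, so reciprocity is low, while ann, bob and cat form the single closed triangle that drives the transitivity figure.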

Through explorations of ‘friendship’ networks, we learn not only how to recognise ‘influentials,’ but how to measure their potency through the notions of ‘influence’ and ‘homophily.’ The ‘twitterati’ certainly exist, even within the education-oriented subset of Twitter users. What role then, if any, do they have in the context of professional learning? Are they hubs around which actions occur? Are they champions or in any sense leaders? Connected with influence and homophily is ‘confounding’: the environment’s effect on making people similar. Assuming similarity might be an important factor in promoting interaction (by no means a given), how does the environment encourage that? By introducing a non-human actor, I acknowledge that I’m well on the way to recruiting actor-network theory.

I may or may not turn to social media mining as a method, but whatever the outcome, many of the underlying principles might prove useful. What I definitely need to resolve is whether to hold on to the notion of communities and explore how they might integrate with an actor-network theory interpretation, or whether I should lose that terminology and translate those ideas into solely ANTish terms and ANTish analyses.