“People Want Touch and Keyboard on Clamshell Devices” flickr photo by IntelFreePress shared under a Creative Commons (BY) license

'Human subject', a term still found in articles discussing ethics, and 'participant observation', from the ethnographic literature, hint at the source of some of my troubles this week. There have been a host of different people who have wittingly or otherwise become involved as my research has unfolded. How should I refer to them in my thesis? Subjects? Participants? Respondents? Informants? We are not short of terms and can go on from there to include interviewees, co-researchers, collaborators and many more. To some extent, it depends upon the research tradition within which your research is located. For example, the British Sociological Association ethics guidance refers to research participants, as indeed does that from the British Psychological Society. The move away from talking about research 'subjects' acknowledges the agency that someone invited to participate in research has in determining their level of involvement and respects the contribution they make to the research endeavour. However, the term 'human subject' persists in many disciplines and still forms one of the criteria used in decision-making processes when considering one's ethical approach: 'Does the research involve human subjects?'

Updating my ethics

“Earth Science Applications Showcase (201408050002HQ)” by NASA HQ PHOTO is licensed under CC BY-NC-ND

During the last couple of weeks, I've been involved in quite a number of exchanges on Twitter, as part of my participant observation. There have been a number of occasions when I was moved to consider the ethics of a particular situation, as indeed a researcher should always do. Developing your ethical sensibility doesn't end the moment a Research Ethics Committee has signed off your submission. Instead it should be an ongoing critical process of reflection and renegotiation (Fileborn, 2015), a fluid dialogue interwoven with the fabric of your research endeavours (Madge, 2007). Whilst that sounds rather grand, for me it means being continually alert to the ways that you conduct exchanges and being sensitive to situations which unfold that you may never have anticipated. Let's take a look at some of the issues which arose.

Weighing Anchor

“Anchor” flickr photo by MarcieLew shared under a Creative Commons (BY-SA) license

During a recent interview, Joe Dale mentioned a useful new app he'd found which offered some potential in the context of professional sharing – Anchor. It's a free (as of Jan 2017) smartphone app (Android & Apple) through which you can create a two-minute audio posting (a 'Wave') which others can listen to, then respond, again in audio. Joe (with Rachel Smith) had experimented with it by posting a question posed by one of the #mfltwitterati, then crowdsourcing responses from Anchor users. The final combined thread is then presented as a single, stitched audio stream, where the question and responses form a coherent whole.

Green light

flickr photo by My Buffo shared under a Creative Commons (BY-SA) license

If potential research participants gave their permission, what would be the implications of posting interview recordings online? That was essentially the theme of the preceding post. I wasn’t so sure of which way to jump, but the encouraging and supportive comments I received there and on Twitter prompted me to take the trickier route of writing a new ethics submission. In addition to rewriting the University pro forma submission document, I had to rewrite a couple of consent forms and their associated participant information sheets, in order to accommodate the possibility that participants might give their permission to ‘publish’ their recording. I also had to write a consent form and participant information sheet for a new, additional method I want to use. I then had to amend and extend the matrix I composed which summarises the ethical issues for each method. Finally, but perhaps most importantly, I felt it was important to attempt to justify the rather radical notion that interview recordings might be posted as podcasts. Here then is that supplement to my ethics submission:

Why am I proposing a change?

The usual position is that interview participants are afforded confidentiality and anonymity – that the data they provide will only be available to those specified, and that all features which might identify them will be removed before making the findings more public. In the interests of speed and given the small scale of my pilot study, I adopted the aforementioned approach. As I move forward into my main study, I would like to propose a different stance, building on those issues discussed in Appendix 02(?): Anonymity. This also contributes to the University’s and wider Open Access policies.

The arena from which potential participants will be drawn is highly participatory, where members generally adopt a performative approach. The norms of the space include a sense of sharing what you have and what you know; where people acknowledge and give credit to those who have supported or helped them. I’d like to suggest that this participatory space invites a more participatory research approach. As Grinyer (2002) noted, researchers have to balance the need to protect participants from harm by hiding their identity, whilst preventing loss of ownership ‘on an individual basis with each respondent.’ This is manageable, provided the sample size is small, as it will be in this study.

It’s perhaps helpful at this stage to reiterate that the topic of this research is not sensitive, participants are not vulnerable and the data they share will not be ‘sensitive personal data.’

How this differs from the interviews in the pilot study

In the pilot study, the participant was assured of confidentiality and anonymity, that the transcript would be deleted at the end of the study, and that the findings would not be reported (only used to inform the next stage of research).

For the main study I propose a shift in emphasis from ‘human subject’ to ‘authored text.’ This would be achieved by allowing interviews to contribute to the participatory agenda, by releasing the interview recordings as podcasts (streamed online audio files), if participants give their permission. Links to the audio files would be embedded in a web page associated with the research project; the interviewees would be named and their contribution credited. This represents an attempt to move beyond the notion that participants are merely sources of data to be mined. In Corden and Sainsbury’s (2006) study, participants responded positively when offered a copy of the audio recording of their interviews and were given the option to amend their responses, though few chose to exercise that control.

This is a very different approach to that found in most studies, but is not without precedent. The ‘edonis’ project, part of an EdD study by David Noble, included a series of interviews with teachers on the theme of leadership in educational technologies. The interviews from those people who gave permission were posted online. It could be argued that this proposed approach is only one step further on from conducting ‘interviews’ in visible online public spaces like blog comments, forums, and some chat rooms.

Risks and benefits

Once participants’ identities are no longer disguised, both potential risks and benefits become more significant. Table xxx summarises possible risks and benefits:

Risks | Benefits
Loss of privacy which could lead to exposure to ridicule and/or embarrassment. | Direct: Increase in participant agency, moving beyond the notion of participants merely as sources from which researchers abstract data.
Change in future circumstances which causes what participants originally said to be viewed in a less positive light. | Direct: Makes provision for participants to amend or extend what they said in the original interview.
 | Indirect: Increasing the awareness and understanding of the wider community of issues associated with professional learning and social media.
Risk or benefit: Increased attention through increased exposure. This could be perceived as either a risk or benefit and would depend on the participant’s preferred online behaviours.


As with conventional approaches, in order to make an informed decision, potential participants would need to be made fully aware of:

  1. Purpose and potential consequences of the research
  2. Possible benefits and harms
  3. The right to withdraw
  4. Anticipated uses of the data
  5. How the data will be stored and secured and preserved for the longer term.

With items 4 and 5 the circumstances will be different, depending on whether participants accede to their interview recording being released. This distinction needs to be made absolutely clear at the outset so participants are able to decide whether to be involved at all and whether they want to take that additional step.

At the start of an interview, participants who have agreed to allow their interviews to be posted will be reminded of the above once more and their verbal consent captured in the recording. In the debriefing after the interview is complete, participants will be asked whether they wish to change their minds, and reminded how they can make their views known should they change their minds subsequently.


As in the pilot study, potential participants would be provided with a participant information sheet, but one extended to include the additional considerations (see Appendix xxx). The form through which they provide their consent will also be amended to offer options for the different levels of involvement (see Appendix xxx) and whether they are prepared to allow the recording to be released under a Creative Commons license (see next section).

Given the small number of interviewees (<5), coping with different levels of involvement should be a manageable process.

Copyright and Intellectual Property

These issues will also need to be made clear to participants through the participant information sheet.

…for data collected via interviews that are recorded and/or transcribed, the researcher holds the copyright of recordings and transcripts but each speaker is an author of his or her recorded words in the interview.

(Padfield, 2010).

Rather than seeking formal copyright release from participants, it is proposed that the interview recordings will be released with Creative Commons, Attribution – NonCommercial – ShareAlike 4.0 International licensing. Participants will be asked at the point of providing consent to state whether they agree to that release; if they don’t, then the recording would not be released. Once more, potential participants are likely to be familiar with the principles of CC licensing; many of them release their own materials under these licenses.

Eynden et al (2014) recommend the use of Open Data Commons licenses for data released through research; however, this licensing system is more appropriate where data is stored in databases and the database itself needs licensing separately from the content. CC licensing was chosen since the content will not be wrapped within a database; at least not one which the public will be able to manipulate (copy, remix, redistribute).


flickr photo by mherzber shared under a Creative Commons (BY-SA) license

I’m delighted to be able to report that my revised submission has passed the ethics review process. It’s highly unusual for interviews to be allowed to be published in this way; standard practice is to afford anonymity to interviewees. Perhaps it’s indicative of the need to make our research more open, or the more performative behaviours of potential participants … or perhaps a bit of both. Whatever the case, I’m chuffed to bits, as we’d say up here in the ‘North.’ Now all that remains is to find participants sufficiently confident and generous to give it a shot. Know anyone …?

The only way is ethics

flickr photo by cybass shared under a Creative Commons (BY-NC) license

Right from the outset, one of the options I’ve tried to keep in mind is that of ‘publishing’ those data that are amenable. Publishing in this sense refers to sharing interview recordings, as podcasts, back with the community. This feels like the right thing to do; when teachers experiment with new techniques that someone else showed or explained to them, they often share those insights more widely. If that is the norm, why should my research study, conducted within this environment, be any different? Well, there are a number of reasons, mostly arising as a result of a researcher’s ethical sensitivities and obligations towards potential participants.

The default ethical stance is to maintain participants’ anonymity and confidentiality; with an interview transcript, this isn’t too difficult. If on the other hand, the audio file of the interview is shared, the potential for the participant to be identified is so much greater, even if personal identifiers are edited out of the audio. However, it could be argued (as I began to discuss here) that in the online performative space with which participants are comfortable, anonymising what they have created actually does them a disservice. Much better to acknowledge their co-authorship and give credit where it’s due. I wonder how many researchers conducting interviews as one of their methods, discuss the issue of ownership, copyright or intellectual property with their interviewees, beyond explaining where their data will be stored and how it will be used. In fact ‘for data collected via interviews that are recorded and/or transcribed, the researcher holds the copyright of recordings and transcripts but each speaker is an author of his or her recorded words in the interview.’ So I find myself speculating what the implications and potential consequences of that are? As Van den Eynden et al (2011) explain, an author could at some time in the future assert their rights over the words they provided and you would be obliged to comply. It is possible however for the researcher to have sought ‘transfer of copyright or a licence to use the data’ ideally at the outset of the project. There are even templates available through the Data Archive to make things easier. I wonder though whether taking the route towards Creative Commons licensing might provide a route forward? Potential participants are likely to be familiar with it; many will indeed use it with their own material. 
But that then has me wondering whether that’s permissible under the University regulations for PhD research (which of course I could doubtless find out), but also what the implications might be if you subsequently wish to publish your research through conventional commercial channels.

My work this morning has been with the apparently less sticky technical issues – where would the audio files be stored, how would they be served/streamed etc. In the past I’ve used the free versions of various podcast services like AudioBoom, SoundCloud, Spreaker etc, but they’re of course limited in some way and would not be adequate for several hour-long podcasts. Paying for upgrades is an option, but I don’t fancy picking up the tab of tens of pounds per annum, just for this project. Online storage can be bought for a much more manageable outlay through services like Amazon S3, or perhaps more ethically(?) through Reclaim Hosting, though these of course demand a higher level of technical capability to configure, manage and maintain the site. I probably have enough background to cope with that, especially if supplemented by online tutorials … and I have been considering securing a new domain name anyway. But then what happens in the longer term? How long will I need to maintain the site and content?

I can’t help but be drawn back to ethical principles, specifically those of non-maleficence and beneficence. Would sharing podcasts of interviews be likely to result in any harm befalling participants and are there ways in which they might benefit? It is not easy to speculate what harms an interviewee might incur, but they are perhaps not dissimilar to those from potentially any online activity. In most cases (assuming the material is not inflammatory or illegal) the most harm is likely to be reputational damage from an inappropriate or ill-judged comment. It is possible that potential future employers might be put off by opinions or ideas expressed – if, as a teacher, you expressed particular pedagogical approaches you favoured and they were at odds with the views of a potential employer who heard your interview, then s/he might be less inclined to offer an interview. Again though, if you hold a particular set of values and have an online presence, it’s likely you’ll have already burnt that bridge. This can of course be flipped and work in your favour as it did for Daniel Needlestone – a benefit? For those who share widely and seek exposure and an audience, being provided with an opportunity for that through an interview might indeed be considered to be in their interests. And of course, as for many research participants, but perhaps particularly for teachers, there is the sense that their participation is contributing to the pool of knowledge from which we all sip … or gulp.

I’m obliged to also ask myself why I might want to do this; what do I stand to gain? Am I being selfish and actually seeking kudos from the community? Am I attempting to follow in the spirit of making research more open and more accessible? Am I attempting to be more faithful to my participants in seeking to ensure their voice is not lost through my transcription? Is this one way in which I can be more transparent about my analysis and interpretation? Is this an additional channel through which I can make my research accessible to a wider audience? Perhaps a little of all of the above?

So which way do I go? My easy route is to stick with the ethical issues I’ve already had ratified for my pilot study and go with participant anonymity. The difficult route, for all of the aforementioned reasons, is to seek to ‘publish’ the data and therefore have to write a new ethics submission incorporating all those issues and explaining how I would address them. That might be time consuming (both in the composition and in the approval process), but is not impossible; the edonis project by David Noble has already set a precedent in fact. Which option would you choose if a) you were me, and b) you were a potential participant – what would your preference be?

Ethics submission draft – feedback

flickr photo by rynde shared under a Creative Commons (BY-ND) license

Today I received detailed feedback from my supervisors on my draft submission for ethical approval. It’s reassuring to have those wiser and more experienced clearly isolate and highlight some of the issues that felt ill-defined or less well-articulated, not to mention point out the inconsistencies or lack of alignment that I never spotted. The latter are relatively easily corrected; the former need a little more thought.

In particular I need to reconsider the toolkit of methods I’m proposing for my pilot study. I’m clearly asking a lot of myself and in my submission haven’t clearly expressed whether what I’m proposing is achievable or desirable. I could reduce the breadth by prioritising and rationalising my choices, or I could aim for less depth and, rather than a full thematic analysis, write research and substantive memos reflecting on the methods. Another way I might explore the demands these choices will make on me is to break down the time required for each of the methods and map that out against the time I’ll have available in the following months.

Another area where my submission could be strengthened is by providing a supplement in which I justify my ethical decisions by referencing the relevant literature. This could of course be literature specific to ethical issues, or more generic methodology literature in which ethical issues are referenced. Although I included with my submission the appendix I mentioned previously, it didn’t include any references to the literature which had informed my thinking. I made the choice not to include references for simple practical reasons – in an attempt to get all the information in a single summarising sheet. Like my supervisors, the ethical review panel won’t have the time to pore over the posts I wrote, so I need to summarise that thinking into a more succinct form.

The final issue I need to address is one I knew might be problematic; that of preserving anonymity for the participants. If verbatim quotes are not required in any published materials, ensuring anonymity for participants should be possible. As I discussed before though, I’m concerned whether those involved would want to remain anonymous; I think I need to make a case for why citing participants might actually be more ethical, by drawing on those studies where more learned people than me have actually done just that.

Over the next few posts I’ll attempt to address these issues.

Ethics 8 – So what?

flickr photo by PaoloMazzoleni shared under a Creative Commons (BY) license

That’s the question researchers have to continually ask themselves – why is what I’ve said/written/discovered important or why does it matter? The preceding seven posts have covered various topics around the ethics of conducting research on the Internet. So what? What matters in this instance is how that will help to ensure my research study is ethically sound. In this concluding post I’ll try to frame that learning within this context.

It’s worth reiterating the context within which my study is located: a specific group of teachers engaged in a particular activity – those using Twitter for professional learning. These are well educated people engaged in an activity (of their own volition) which would not be considered ‘sensitive.’ Charged with supporting young people in using technology and social media wisely, they are more likely to be aware of the consequences of engaging in the use of social media.

The pilot methods I outlined in a previous post are restated here:

  1. Immersion in my Twitter stream for 24 hours (possibly over three shifts) – ‘deep hanging out’. This will provide a snapshot of activity from a self-selecting sample of the two thousand plus educators I follow.
  2. Closely following the twitterstream of a teacher for a limited period, chosen from those who have made claims regarding the efficacy of Twitter. This is to investigate whether focusing on an individual might yield more informative data.
  3. Attempting informal interviews using the commenting feature on blogs. This will be across a small number of blog posts in which the authors make claims of how useful they found Twitter for professional learning.
  4. Conducting a single semi-structured interview with one of the more evangelical of those educators making claims for Twitter. This should tease out areas and themes to explore in more depth.
  5. Seeking permission, then attempting a focus group interview within a Twitter #edchat. This may push the boundaries of what constitutes a focus group, or the depth of discussion possible in a #chat.
  6. Using an automated routine to collect tweets over one month which reference a particular term e.g. “professional learning.” This will access the general Twitter stream and therefore a wider sample, offering the potential for unanticipated outcomes to emerge.
  7. Attempting to open dialogue within Twitter (or elsewhere) with anyone who makes claims about Twitter in relation to their professional learning. A ‘naive’ stance will be taken whilst attempting to draw out further information.
  8. Small-scale social network analysis of a topic or hashtag to explore the interconnections which are forming. The focus here is not on the content of the tweets, nor the people who are connected, but the ways they are connected with each other and the information flows between them.

In the following table, I highlight how the significant themes discussed in this ethics strand of posts apply to my proposed methods:

These are of course only my interpretations, based on being a user of Twitter for professional learning for the last seven years. It would be imprudent however to assume I am at liberty to speak for all teachers on Twitter. Ideally I should seek to verify the alignment of my perceptions with those of the potential participants. I could survey or interview people, as Beninger et al (2014) and Hudson and Bruckman (2010) did. But I suspect the outcome would be far from definitive, and I would find people expressing the same range of views from ‘this is an open platform and people should know what they’re doing’ to ‘I’m OK with people using my information so long as they ask first.’ The difficulty then is in being sensitive to and addressing the wishes of all participants. Is this even possible? It’s important therefore to behave in a way that responds to context; if a tweet or the content of a blog leans more towards personal reflection than open debate, I would be less inclined to intrude. As Roberts et al (2004) advise – “expectation of privacy overrides the distinction between public and private spaces.” Better then to consider the micro-context very carefully.

Some factors in the summary (like ‘degree of interaction’) are less subjective, whilst other issues are open to interpretation. Views about what is public and what private often differ in degree, as indeed does the need to seek consent. Where possible, it would seem sensible to be guided by precedent and what has been deemed acceptable practice by researchers who have gone before.

My greatest dilemma is regarding anonymity. Based on experience and personal preference, I tend to concur with Roberts et al (2004) & Sixsmith & Murray (2001), and would prefer to seek to empower participants by offering them the choice of having their authorship recognised and being credited in published works. What I feel I’ve done, as can be seen in the table, is opt for the safer and simpler option of anonymising the data, rather than acknowledging the contributions of participants. Wrong decision? What would you prefer?

Despite all the dilemmas, dichotomies and disagreements, the touchstone to which I’ll always return is to ensure minimal impact/harm for participants and to maximise beneficence.

Ethics 7 – “When first we practise to deceive”

With one of the fundamental principles of research being to minimise harm, perhaps we should be aiming to tread as lightly as possible in the field? Minimise disturbance to participants? Could it be argued therefore that we should aim to be as unobtrusive as possible?


If it is important to gather data based on naturalistic behaviour, as untainted as possible by researcher presence, then being unobtrusive becomes a primary goal. The intention is to avoid observer effects by acting as an ‘overhearer’ (D’Arcy, 2012), rather than a participant who may influence outcomes. However it is a fine line between being unobtrusive and covert; between minimising influence and hiding from view. As Hine (2011) observed:

although we might be able to easily access data using unobtrusive methods, this does not make this ‘ethically available’

The ethical issues do not so much prevent us from using unobtrusive methods as remind us of our obligations to participants … and authors.

Unobtrusive research methods predate the Internet, but they have certainly become easier through online channels. Some techniques, like content or sentiment analysis of large data sets, which provide summaries rather than specific, identifiable details, come with less ethical baggage. Entering a chatroom or monitoring the twitterstream without announcing your presence is also unobtrusive but, some might say, crosses an ethical line. This is now covert behaviour, albeit in a ‘public’ place. Lurking, as it is known, is a legitimate online activity however and is common when people enter a new environment for the first time; it allows them to become familiar with norms and conventions.

flickr photo by fudj shared under a Creative Commons (BY-ND) license

An ethical case can be made for unobtrusive, even covert research on the grounds of the reduced impact it has on participants. They’re not required to give up any time, to fit appointments into their schedule or to worry whether what they’ve said is helpful/useful to the researcher. Once more we’re confronted with shades of grey rather than definitive answers; however, Whitty (2004) draws a line in the sand for us:

“While it might be unclear as to how ethical it is for lurkers to collect data on the Internet, there is less doubt as to whether it is acceptable to deceive others online in order to conduct social research,…”


Being unobtrusive tips over into deception if researchers deliberately conceal their purpose, do not fully disclose relevant information to participants, or provide false information (Madge, 2007; Frankel and Siang, 1999). Whether lurking constitutes deception is open for debate. Perhaps we need to return to some of the issues discussed in earlier posts, like privacy.

There are circumstances however where deception might pass scrutiny from an ethical review panel: situations where the research could not otherwise be undertaken, for example, but only if participants (and researcher) are protected from harm and participants are debriefed after the research.

In some online arenas, deception (withholding information, pretending to be someone other) might be the norm; MMORPGs and virtual worlds for example, where a player might take on the role of a character or choose an alternate identity. Without good reason, a researcher should avoid such behaviour, instead opting to find the means to disclose to fellow participants that they are conducting research (Eynon, 2009; Krotowski, 2010).


How you disclose your status as a researcher to the group with whom you are participating online will depend on the conventions in that space. Providing details in your profile together with a link to an institutional website ‘can increase the credibility of the researcher’s claimed identity and shows respect and courtesy to members of the newsgroup.’ (Madge, 2007). Although some groups are openly hostile to the presence of researchers (Hudson and Bruckman, 2004), a respectful approach and involvement might not only grant access, but also pay dividends:

“Such efforts to establish cultural membership and disclose research aims were foundational to creating relations of caring and trust with group participants.” (Walstrom, 2004)

Porr & Ployhart (2004) consider this rendered even more powerfully through the disclosure-reciprocity effect – “we reveal more to those who have been open to us.” By being completely open and transparent with our participants, it is likely that they will reciprocate.

Since they were conducting observation-only research in a public space and did not need to interact with the participants (using interviews, surveys or experiment), Coughlan and Perryman (2015) felt justified in not disclosing their status. However they also took great care in anonymising the data they gathered, even going so far as to break Facebook’s terms of service by altering the screenshots they captured; a practice regularly undertaken by many reputable institutions and other academics.

Perhaps it is our ethical sensitivity that makes us feel uneasy with unobtrusive, covert or even deceptive research. That is right and proper, but we should also take care that we do not sacrifice a potential contribution to knowledge by playing it too safe. Under the right circumstances, we are granted ethical latitude:

“If research requires any kind of deception, then only by the clear demonstration of the benefits of the research can it be justified.” (SRA, 2003)

“Education researchers do not use deceptive techniques unless they have determined that their use poses no more than minimal risk to research participants; that their use is justified by the study’s prospective scientific, scholarly, educational, or applied value; and that equally effective alternative procedures that do not use deception are not feasible.” (AERA, 2011)

“Researchers must therefore avoid deception or subterfuge unless their research design specifically requires it to ensure that the appropriate data is collected or that the welfare of the researchers is not put in jeopardy.” (BERA, 2011)


Ethics 6 – Human subject/authored text?

flickr photo by Scott Smith (SRisonS) shared under a Creative Commons (BY-NC-ND) license

Research participants have a right to anonymity, confidentiality and being able to provide informed consent. This would appear to be ethically unproblematic, setting aside the issues discussed in the previous post for a moment. To incorporate data drawn from a blog into a report or article would therefore require anonymising any details which might identify the author. But here’s the thing: what if they don’t want to be anonymous? What if their livelihood or reputation is based on the success of their posts, as distinguished by how widely they’re shared and reposted? To unpick this tension, we need to consider two different perspectives: the online world can be viewed as a cultural realm wherein people perform and interact, or what we see online can be viewed instead as authored texts.

A social world

The default (and safe) option is to assume that the activity we view online is intrinsically linked with the people who produced it and consequently we should adopt a ‘human subject’ ethical stance. Originating in medical research, the foundations can be traced back to the Belmont Report and further. Walther (2002) provides the specifics:

Human subjects research is that in which there is any intervention or interaction with another person for the purpose of gathering information, or in which information is recorded by the researcher in such a way that a person can be identified directly or indirectly with it.

The question then is whether a human subject approach is still valid where the research process involves neither biomedical procedures nor interaction with people. Markham and Buchanan (2012), in the guidance from the Association of Internet Researchers, caution that:

Because all digital information at some point involves individual persons, consideration of principles related to research on human subjects may be necessary even if it is not immediately apparent how and where persons are involved in the research data.

but acknowledge that the Internet may call into question the notion of personhood, asking whether the digital traces we produce online can be considered an ‘extension of self.’ Although the data a researcher collects might be linked with a person, if we interact with that text, are we still interacting with a human subject?

A textual domain

Some argue for a different sensibility where the data we access on the Internet ought to be considered as ‘authored texts’ (Basset and O’Riordan, 2002; Walther, 2002). Whether a web page, blog post, forum thread or tweet, these textual artefacts have been inscribed onto the world wide web and persist over time. More akin to books, newspaper articles and company reports, they are divorced from the originator and require us to draw from a different set of ethical principles. We think instead of issues around ownership, copyright and attribution, where the text may have been authored by someone, but does not constitute a part of them; the two need not be conflated.

This perspective does not mean we are at liberty to drop the fundamental ethical principles of autonomy and beneficence; we simply uphold them in a different way. Rather than provide anonymity and confidentiality, we acknowledge authorship and attribute it in the same way we would when quoting more traditional works. This assumes of course that we remain alert to issues of privacy and the sensitivity of the subject matter, though Wilkinson & Thelwall (2011) point out we need not feel obliged to be drawn too far back towards the human subject stance:

Although web texts can be treated as documentary research sources or cultural artefacts, they deserve special consideration because they are less obviously public than a published book and often contain personal information. This is an issue of privacy rather than consent, however.

I’m not sure I’d agree that online texts are less ‘public’ than a published book, when (assuming the text was not written to a private online space) visibility of that text may be no more than a click or two away. A book has to be first bought or borrowed through a conscious act; online texts can be encountered by chance, at any time, by anyone.

We must remember that if someone is publishing in a performative space, where the notion of public audience (and possibly recognition) is the norm, then by anonymising rather than attributing their contribution, we may be doing them a disservice. A harm, rather than a benefit, or as Basset and O’Riordan (2002) put it, a diminution of ‘the cultural capital of those engaging in cultural production through Internet technologies.’

Shades of grey

Simple, accessible tools have become available which enable researchers to hoover up large corpora of data from social media. Big data research brings with it a set of ethical concerns of its own, but within the context of this post, we should ask to what extent the people who generated the texts within these corpora are still present. This becomes even more pertinent if the findings of the research are published in aggregate form, rather than being attributed to (anonymised) individuals. This introduces the ‘distance principle’ (Lomborg, 2013); the degree of conceptual and experiential separation between the researcher and the participant/author, or between the research object and the person who produced it. The smaller the distance, the more vividly the ‘human’ comes into view and therefore the more likely the research ought to be classified as human subject research. Greater distance can be achieved when the participant identity is less distinct, or when there is less interaction between researcher and participant.

In the following graphic, on a continuum between human subject and authored text, I’ve added a distance dimension. The examples are notional online research situations, but their precise position will depend on context.

flickr photo by ianguest shared under a Creative Commons (BY-NC-SA) license

The examples coloured red involve close interaction between researcher and participant and high proximity between the object being analysed (interview ‘transcript’) and the producer of the object. The green examples show much greater separation or distance. Red examples will undoubtedly require informed consent; green perhaps not. Would you argue for any of the examples being moved? Can you think of other examples which could be included?


