Ethics 6 – Human subject/authored text?

flickr photo by Scott Smith (SRisonS) shared under a Creative Commons (BY-NC-ND) license

Research participants have a right to anonymity, confidentiality and informed consent. This would appear to be ethically unproblematic, setting aside the issues discussed in the previous post for a moment. To incorporate data drawn from a blog into a report or article would therefore require anonymising any details which might identify the author. But here’s the thing: what if they don’t want to be anonymous? What if their livelihood or reputation is based on the success of their posts, as measured by how widely they’re shared and reposted? To unpick this tension, we need to consider two different perspectives: the online world can be viewed as a cultural realm wherein people perform and interact, or what we see online can be viewed instead as authored texts.

A social world

The default (and safe) option is to assume that the activity we view online is intrinsically linked with the people who produced it and consequently we should adopt a ‘human subject’ ethical stance. Originating in medical research, the foundations can be traced back to the Belmont Report and further. Walther (2002) provides the specifics:

Human subjects research is that in which there is any intervention or interaction with another person for the purpose of gathering information, or in which information is recorded by the researcher in such a way that a person can be identified directly or indirectly with it.

The question then is whether a human subject approach is still valid where the research process involves neither biomedical procedures nor interaction with people. Markham and Buchanan (2012), in the guidance from the Association of Internet Researchers, caution that:

Because all digital information at some point involves individual persons, consideration of principles related to research on human subjects may be necessary even if it is not immediately apparent how and where persons are involved in the research data.

but acknowledge that the Internet may call into question the notion of personhood, asking whether the digital traces we produce online can be considered an ‘extension of self.’ Although the data a researcher collects might be linked with a person, if we interact with that text, are we still interacting with a human subject?

A textual domain

Some argue for a different sensibility where the data we access on the Internet ought to be considered as ‘authored texts’ (Bassett and O’Riordan, 2002; Walther, 2002). Whether a web page, blog post, forum thread or tweet, these textual artefacts have been inscribed onto the world wide web and persist over time. More akin to books, newspaper articles and company reports, they are divorced from the originator and require us to draw on a different set of ethical principles. We think instead of issues around ownership, copyright and attribution, where the text may have been authored by someone, but does not constitute a part of them; the two need not be conflated.

This perspective does not mean we are at liberty to drop the fundamental ethical principles of autonomy and beneficence; we simply ensure them in a different way. Rather than providing anonymity and confidentiality, we acknowledge authorship and attribute it in the same way we would when quoting more traditional works. This assumes of course that we remain sensitive to issues of privacy and sensitivity of subject, though Wilkinson & Thelwall (2011) point out we need not feel obliged to be drawn too far back towards the human subject stance:

Although web texts can be treated as documentary research sources or cultural artefacts, they deserve special consideration because they are less obviously public than a published book and often contain personal information. This is an issue of privacy rather than consent, however.

I’m not sure I’d agree that online texts are less ‘public’ than a published book, when (assuming the text was not written to a private online space) visibility of that text may be no more than a click or two away. A book has to be first bought or borrowed through a conscious act; online texts can be encountered by chance, at any time, by anyone.

We must remember that if someone is publishing in a performative space, where the notion of public audience (and possibly recognition) is the norm, then by anonymising rather than attributing their contribution, we may be doing them a disservice. A harm, rather than a benefit, or as Bassett and O’Riordan (2002) put it, a diminution of ‘the cultural capital of those engaging in cultural production through Internet technologies.’

Shades of grey

Simple, accessible tools have become available which enable researchers to hoover up large corpora of data from social media. Big data research brings with it a set of ethical concerns of its own, but within the context of this post, we should ask to what extent the people who generated the texts within these corpora are still present. This becomes even more pertinent if the findings of the research are published in aggregate form, rather than being attributed to (anonymised) individuals. This introduces the ‘distance principle’ (Lomborg, 2013): the degree of conceptual and experiential separation between the researcher and the participant/author, or between the research object and the person who produced it. The smaller the distance, the more vividly the ‘human’ comes into view and therefore the more likely it is that the work ought to be classified as human subject research. Greater distance can be achieved when the participant identity is less distinct, or when there is less interaction between researcher and participant.

In the following graphic, on a continuum between human subject and authored text, I’ve added a distance dimension. The examples are notional online research situations, but their precise position will depend on context.

flickr photo by ianguest shared under a Creative Commons (BY-NC-SA) license

The examples coloured red involve close interaction between researcher and participant and high proximity between the object being analysed (interview ‘transcript’) and the producer of the object. The green examples show much greater separation or distance. Red examples will undoubtedly require informed consent; green perhaps not. Would you argue for any of the examples being moved? Can you think of other examples which could be included?


BASSETT, Elizabeth H. and O’RIORDAN, Kate (2002). Ethics of Internet research: Contesting the human subjects research model. Ethics and information technology, 4 (3), 233-247.
LOMBORG, Stine (2013). Personal internet archives and ethics. Research ethics, 9 (1), 20-31.
MARKHAM, Annette and BUCHANAN, Elizabeth (2012). Ethical Decision-Making and Internet Research: Version 2.0. [online]. Association of Internet Researchers.
WALTHER, Joseph B. (2002). Research ethics in Internet-enabled research: Human subjects issues and methodological myopia. [online]. Ethics and information technology, 4 (3), 205-216.
WILKINSON, David and THELWALL, Mike (2011). Researching personal information on the public web: Methods and ethics. Social science computer review, 29 (4), 387-401.
