Note: This is a repost of a blog post about the Facebook emotional contagion experiment that I wrote on People Pattern’s blog.
This is the first in a series of posts responding to the controversial Facebook study on Emotional Contagion.
The past two weeks have seen a great deal of discussion around the recent computational social science study by Kramer, Guillory and Hancock (2014), “Experimental evidence of massive-scale emotional contagion through social networks”. I encourage you to read the published paper before getting caught up in the maelstrom of commentary. The wider issues are critical to address, and I have summarized the often conflicting but thoughtful perspectives below. These issues strike close to home, given our company’s expertise in computational linguistics and reliance on social media.
In this post, I provide a brief description of the original paper itself along with a synopsis of the many perspectives that have been put forth in the past two weeks. This post sets the stage for two posts to follow tomorrow and Tuesday next week that provide our take on the study plus our own Facebook-external opt-in version of the experiment, which anyone currently using Facebook can participate in.
Summary of the study
Kramer, Guillory and Hancock’s paper provides evidence that emotional states as expressed in social media posts are contagious, in that they affect whether readers of those posts reflect similar positive or negative emotional states in their own later posts. The evidence is based on an experiment involving about 700,000 Facebook users over a one-week period in January 2012. These users were split into four groups: a group that had a reduction in positive messages in their Facebook feed, another that had a reduction in negative messages, a control group that had an overall 5% reduction in posts, and a second control group that had a 2% reduction. Positivity and negativity were determined by using the LIWC word lists. LIWC, which was created and is maintained by my University of Texas at Austin colleague James Pennebaker, is a standard resource for psychological studies of emotional expression in language. Over the past two decades, it has been applied to language from varying sources, including speech, essays, and social media.
The study found a small but statistically significant difference in emotional expression between the positive suppression group and its control, and between the negative suppression group and its control. Basically, users who had positive posts suppressed produced slightly lower rates of positive word usage and slightly higher rates of negative word usage, and the mirror image of this was found for the negative suppression group (check out the plot for these). (This description of the study is short; see Nitin Madnani’s description for more detail and analysis.)
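To make “small but statistically significant” concrete, here is a minimal Python sketch of how a tiny difference in word-usage rates can be undetectable in a small sample yet clearly significant in a very large one. The rates and sample sizes are hypothetical numbers chosen for illustration, not figures from the paper, and the simple two-proportion z-test (which treats every word as an independent observation) is a stand-in, not the analysis the authors actually ran.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Return (z, two-sided p-value) for a difference between two proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical rates of positive words: 5.20% (treated) vs 5.25% (control).
treated_rate, control_rate = 0.0520, 0.0525

# Same tiny difference, two very different (hypothetical) sample sizes.
z_small, p_small = two_proportion_z(treated_rate, 1_000, control_rate, 1_000)
z_large, p_large = two_proportion_z(treated_rate, 10_000_000, control_rate, 10_000_000)

print(f"n = 1,000 words per group:      z = {z_small:.2f}, p = {p_small:.3g}")
print(f"n = 10,000,000 words per group: z = {z_large:.2f}, p = {p_large:.3g}")
```

The difference between the groups is identical in both runs; only the amount of data changes. Statistical significance here says the difference is unlikely to be noise, not that it is large.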
Objections to the study and the infrastructure that made it possible have come from many sources. The two major complaints have to do with ethical considerations and research flaws.
The first major criticism is that the study was unethical. The key problem is that there was no informed consent. Facebook users had no idea that they were part of this study and had no opportunity to opt out of it. An important aspect of this is that the study conforms to the Facebook terms of service: Facebook has the right to experiment with feed filtering algorithms as part of improving its service. However, because Jeff Hancock is a Cornell University professor, many argue it should have gone through Cornell’s IRB process. Furthermore, many feel that Facebook should obtain consent from users when running such experiments, whether for eventual publication or for in-company studies to improve the service. The editors of PNAS itself have issued an editorial expression of concern over the lack of informed consent and opt-out for subjects of the study. We agree this is an issue, so in our third post, we’ll introduce a way this can be achieved through an opt-in version of the study.
The second type of criticism is that the research is flawed or otherwise unconvincing. The most obvious issue is that the effect sizes are small. A subtler problem familiar to anyone who has done anything with sentiment analysis is that counting positive and negative words is a highly imperfect means for judging the positivity/negativity of a text (e.g. it does the wrong thing with negations and sarcasm — see Pang and Lee’s overview). Furthermore, the finding that reducing positive words seen leads to fewer positive words produced does not mean that the user’s actual mood was affected. We will return to this last point in tomorrow’s post.
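The weakness of word counting is easy to see in a minimal Python sketch. The word lists here are tiny hypothetical stand-ins, not the LIWC lexicon, and the scoring rule (compare raw counts of positive and negative words) is a simplification for illustration rather than the paper’s exact procedure.

```python
import re

# Tiny illustrative word lists; hypothetical stand-ins, NOT the LIWC lexicon.
POSITIVE = {"good", "great", "happy", "love", "wonderful"}
NEGATIVE = {"bad", "sad", "awful", "hate", "terrible"}

def count_emotion_words(text):
    """Return (positive_count, negative_count) by simple word-list lookup."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return pos, neg

def label(text):
    """Label a text by whichever word list matched more often."""
    pos, neg = count_emotion_words(text)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

examples = [
    "I love this wonderful day",   # genuinely positive
    "I am not happy about this",   # negation: counted as positive anyway
    "Oh great, another Monday",    # sarcasm: counted as positive anyway
]
for text in examples:
    print(label(text), "<-", text)
```

All three examples come out “positive”, because the counter sees “happy” and “great” but has no notion of “not” or of tone.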
Support for the study
In response, several authors have joined the discussion to support the study and others similar to it, or to refute some aspects of the criticism leveled at it.
Several commentators have made unequivocal statements that the study would never have obtained IRB approval. This is in fact a misperception: Michelle Meyer provides a great overview of many aspects of IRB approval and concludes that this particular study could legitimately have passed the IRB process. A key point for her is that had an IRB approved the study, it would probably have been the right decision. She concludes: “We can certainly have a conversation about the appropriateness of Facebook-like manipulations, data mining, and other 21st-century practices. But so long as we allow private entities freely to engage in these practices, we ought not unduly restrain academics trying to determine their effects.”
Another defense is that many concerns expressed about the study are misplaced. Tal Yarkoni argues in “In Defense of Facebook” that many critics have inappropriately framed the experimental procedure as injecting positive or negative content into feeds, when in fact it was removal of content. He also notes that Facebook already manipulates users’ feeds, and that this study is essentially business as usual in this respect. Yarkoni adds that it is a good thing that Facebook publishes such research: “by far the most likely outcome of the backlash Facebook is currently experiencing is that, in future, its leadership will be less likely to allow its data scientists to publish their findings in the scientific literature.” They will do the work regardless, but the public will have less visibility into the kinds of questions Facebook can ask and the capabilities they can build based on the answers they find.
Duncan Watts takes this to another level, saying that companies like Facebook actually have a moral obligation to conduct such research. He writes in the Guardian that the existence of social networks like Facebook gives us an amazing new platform for social science research, akin to the advent of the microscope. He argues that companies like Facebook, as the gatekeepers of such networks, must perform and disseminate research into questions such as how users are affected by the content they see.
Finally, such collaborations between industry and academia should be encouraged. Kate Niederhoffer and James Pennebaker argue that both industry and academia are best served through such collaborations and that the discussion around this study provides an excellent case study. In particular, the backlash against the study highlights the need for more rigor, awareness and openness about the research methods and more explicit informed consent among clients or customers.
Wider issues raised by the study and the backlash against it
The backlash and the above responses have furthermore provided fertile ground for other observations and arguments based on subtler issues and questions that the study and the response to it have revealed.
One of my favorites is the observation that IRBs do not perform ethical oversight. danah boyd argues that the IRB review process itself is mistakenly viewed by many as a mechanism for ensuring research is ethical. She makes an insightful, non-obvious argument: that the main function of an IRB is to ensure a university is not liable for the activities of a given research project, and that focusing on questions of IRB approval for the Facebook study is beside the point. Furthermore, the real source of the backlash for her is public misunderstanding of, and growing negative sentiment toward, the practice of collecting and analyzing data about people using the tools of big data.
Another point is that the ethical boundaries and considerations between industry and academia are difficult to reconcile. Ed Felten writes that though the study conforms to Facebook’s terms of service, it clearly is inconsistent with the research community’s ethical standards. On one hand, this gap could lead to fewer collaborations between companies and university researchers, while on the other hand it could enable some university researchers to side-step IRB requirements by working with companies. Note that opportunities for these sorts of collaborations arise naturally and reasonably frequently; for example, they often arise when a professor’s student graduates and joins such a company, and the two continue working together.
Zeynep Tufekci escalates the discussion to a much higher level: she argues that companies like Facebook are effectively engineering the public. According to Tufekci, this study isn’t the problem so much as it is symptomatic of the wider issue of how a corporate entity like Facebook has the power to target, model and manipulate users in very subtle ways. In a similar, though less polemical, vein, Tarleton Gillespie notes the disconnect between Facebook’s promise to deliver a better experience to its users and how users perceive the role and ability of such algorithms. He notes that this leads to “a deeper discomfort about an information environment where the content is ours but the selection is theirs.”
In a follow-up post responding to criticism of his “In Defense of Facebook” post, Tal Yarkoni points out that the real problem is the lack of regulations and frameworks for what can be done with online data, especially data collected by private entities like Facebook. He suggests the best thing is to reserve judgment with respect to questions of ethics for this particular paper, but that the incident does certainly highlight the need for “a new set of regulations that provide a unitary code for dealing with consumer data across the board–i.e., in both research and non-research contexts.”
Perhaps the most striking thing about the Kramer, Guillory and Hancock paper is how the ensuing discussion has highlighted many deep and important aspects of the ethics of research in computational social science from both industry and university perspectives, and the subtleties that lie therein.
A standard blithe rejoinder to users of services like Facebook who express concern, or even horror, about studies like this is to say “Don’t you see that when you use a service you don’t pay for, you are not the customer, you are the product?” This is certainly true in many ways, and it merits repeating again and again. However, it of course doesn’t absolve corporations from the responsibility to treat their users with respect and regard for their well-being.
I don’t think either the researchers or Facebook itself has been grossly negligent with respect to this study, but nonetheless the study is in an ethical gray zone. Our second post will touch on other activities, such as A/B testing in ad placement and content, that are arguably in that same gray zone, but which have not created a public outcry even after years of being practiced. It will also say more about how the linguistic framing of the study itself essentially primed the extreme backlash that was observed, and how the study is in many ways more innocuous than its own wording would suggest.
Our third post will introduce our own opt-in version of the study, which we think is a reasonable way to explore the questions posed in the study. We’d love to get plenty of folks to try it out, and we’ll even let participants guess whether they were in the positive or negative group. Stay tuned!