I just finished Sinclair Lewis’s fascinating novel Arrowsmith, published in 1925 (plain text here). If you are a scientist or are very interested in science, you should consider reading it. Arrowsmith plots the trajectory from youth to middle age of Martin Arrowsmith, a medical doctor turned researcher, and it touches on many of the daily topics a researcher encounters, as well as the personal and societal impact and questions of their work. In no small part, the force of this book comes from the fact that it was co-authored by Paul de Kruif, who had worked both as a professor at the University of Michigan and as a researcher at the Rockefeller Institute. In addition to the tremendous attention to scientific detail, it is likely that many of the characters and situations arose from his personal experience.

The book influenced many people, especially in increasing their appreciation of carefully controlled clinical trials and their skepticism of quack remedies and “scientific” therapies that were rushed to market before they’d been properly vetted. The book goes into great depth about the pressures put on a scientist (Arrowsmith) to rush to publication and creation of remedies. Lewis touches on these topics both in the context of a university (in the fictional mid-western state of Winnemac) and a top-flight research laboratory in New York City. These topics and many other aspects of the book remain relevant today.
The plotting is a bit slow at times, but the writing is delightful. The brief and wonderful descriptions of incidental characters especially stand out. Here’s a great example:

Watters’s house was new, and furnished in a highly built-in and leaded-glass manner. He had in three years of practice already become didactic and incredibly married; he had put on weight and infallibility; and he had learned many new things about which to be dull. Having been graduated a year earlier than Martin and having married an almost rich wife, he was kind and hospitable with an emphasis which aroused a desire to do homicide.

As someone raised in Michigan who did my undergraduate degree in Ohio, I love the critiques of the boosterism and conformity in small to mid-sized towns in the context of bland sameness across the mid-west. As an atheist, I also appreciate the way Lewis brought up the explicit religious overtones of the mid-west and navigated his essentially agnostic/atheist main characters through that without hammering on it too much (which probably would have been unthinkable for a novel in the 1920s anyway). It was interesting for me to discover that de Kruif was born in Zeeland, Michigan and died in Holland, Michigan. Zeeland is less than an hour’s drive from my hometown of Rockford, and when I was growing up, we used to joke that God had his address in Zeeland (because it was such a religious town). In general, West Michigan is overwhelmingly and stridently Christian of the you-will-go-to-hell-if-you-don’t-believe-in-our-version-of-Jesus variety. This became annoying and tiresome for me growing up as a non-Christian in that area, so I personally appreciated the well-placed satirical points on organized religion in Arrowsmith.

As a scientist, I love the description of the joys of research and the tensions between doing research, earning a living, and having time/headspace for the other things in life. Here’s a great passage about Arrowsmith’s burning passion for research at a time when he is working as an intern en route to becoming a doctor:

But on night duty, alone, he had to face the self he had been afraid to uncover, and he was homesick for the laboratory, for the thrill of uncharted discoveries, the quest below the surface and beyond the moment, the search for fundamental laws which the scientist (however blasphemously and colloquially he may describe it) exalts above temporary healing as the religious exalts the nature and terrible glory of God above pleasant daily virtues. With this sadness there was envy that he should be left out of things, that others should go ahead of him, ever surer in technique, more widely aware of the phenomena of biological chemistry, more deeply daring to explain laws at which the pioneers had but fumbled and hinted.

I have often felt this fear of missing out while devoting myself to other things (teaching, family, etc.), and this passage captures that brilliantly. Arrowsmith eventually returns to research, after a circuitous path through being a physician and public health worker. Though I myself have chosen quite different priorities than Arrowsmith does, I’ve experienced the same scientific thrills and motivations, and I know plenty of people who tend more toward the pained but satisfied scientific asceticism that Arrowsmith ultimately reaches. A review of Arrowsmith by Noortje Jacobs puts it well: “the novel in many ways also presents its readers with a bleak vision on the possibility of having a scientific life while remaining a sociable human being.” I think it is fair to say that pretty much everybody engaged in serious scientific research navigates this tension: when research is going really well, it is an amazing experience of flow that begs for more and provides further rewards if you give it; however, we are also social animals that must nurture the relationships we choose to (or must) keep. Arrowsmith provides a detailed window into a person who chooses to live for his pure research, and it highlights the costs of that choice for others, without getting sentimental about it.

So much has changed, but so much has stayed the same. While I was reading the book, I found myself already working out the storyline for a modern day Arrowsmith, with an emphasis on artificial intelligence rather than biology and clinical medicine. We have lots of ethical issues to sort out on this front, and as an atheist mid-westerner who has worked in academia as a research professor and in industry as both a consultant and co-founder and chief scientist of a startup and who cares a lot about what we do with machine learning, I’m perhaps particularly well-suited to do that someday.

Every year, my brother Justin and I each pick our favorite 100 songs that we listened to throughout the year. They aren’t necessarily tracks released that year, but are things that resonated during the year (and generally got listened to a lot). This year, my list is, for me, a very satisfying mix of indie, rock, hiphop, rap, and random weird stuff that I like. (The Sound of Animals Fighting, anyone?)

I listen mainly while I program, so these are the tunes that help me be productive (and many of which I would expect to do the opposite for most people, even keeping them from being productive). This year’s list has perhaps less low key stuff than lists from past years—somehow, I found myself wanting more hard driving songs in general. I guess events like Sandra Bland’s death and the killing of Tamir Rice, and sadly many others, had me raging against the machine and looking for some musical release.

So, here’s the list:

Note: the preview of the list shown here lists 111 songs, which is certainly more than 100. I can still count, but it seems that 11 songs disappeared from Spotify’s inventory over the year, so I haven’t been seeing them while finalizing the list (though they hang around in the preview). It’s also a reminder of how annoyed I was that Divine Styler’s new album was on Spotify in January and then disappeared.

I actually got to the 100 top tracks from a starting list of over 350 songs, which is more than I’ve ever had in five years of doing top 100 lists. This is in large part due to the excellent Spotify Discover feature that Chris Johnson and his team built—I found tons of new music that way, and I’m going to keep listening to what it gives me every week this year. It’s fun to see personalized recommendations working so well, providing novel and good stuff, and to benefit from them myself.

Here’s the list of runners-up, which has lots of great tracks on it too (especially because I had a one-track-per-album constraint for the top 100, which often meant making hard choices about which song stays and which gets cut).

If you are into music, and into these genres, enjoy! Please let me know if there are things you’d recommend I check out. And if you are among those who assert that all the great music was from the 1960s, or from the 1800s, or whatever, you’re just… old. ;-P


I also curate some music to play in the People Pattern office, and I generally try to fill it with songs that most people can listen to (which is definitely not true of the majority of the songs in my personal favorites). So if you’d like something a bit more generally palatable, here are 299 songs you might find more appealing.

I’ve been reading papers about deep learning for several years now, but until recently hadn’t dug in and implemented any models using deep learning techniques for myself. To remedy this, I started experimenting with Deeplearning4J a few weeks ago, but with limited success. I read more books, primers and tutorials, especially the amazing series of blog posts by Chris Olah and Denny Britz. Then, with incredible timing for me, Google released TensorFlow to much general excitement. So, I figured I’d give it a go, especially given Delip Rao’s enthusiasm for it—he even said that the move from Theano to TensorFlow felt like changing from “a Honda Civic to a Ferrari.”

Here’s a quick prelude before getting to my initial simple explorations with TensorFlow. As most people (hopefully) know, deep learning encompasses ideas going back many decades (done under the names of connectionism and neural networks) that only became viable at scale in the past decade with the advent of faster machines and some algorithmic innovations. I was first introduced to them in a class taught by my PhD advisor, Mark Steedman, at the University of Pennsylvania in 1997. He was especially interested in how they could be applied to language understanding, which he wrote about in his 1999 paper “Connectionist Sentence Processing in Perspective.” I wish I understood more about that topic (and many others) back then, but then again that’s the nature of being a young grad student. Anyway, Mark’s interest in connectionist language processing arose in part from being on the dissertation committee of James Henderson, who completed his thesis “Description Based Parsing in a Connectionist Network” in 1994. James was a post-doc in the Institute for Research in Cognitive Science at Penn when I arrived in 1996. As a young grad student, I had little idea of what connectionist parsing entailed, and my understanding from more senior (and far more knowledgeable) students was that James’s parsers were really interesting but that he had trouble getting the models to scale to larger data sets—at least compared to the data-driven parsers that others like Mike Collins and Adwait Ratnaparkhi were building at Penn in the mid-1990s. (Side note: for all the kids using logistic regression for NLP out there, you probably don’t know that Adwait was the one who first applied LR/MaxEnt to several NLP problems in his 1998 dissertation “Maximum Entropy Models for Natural Language Ambiguity Resolution”, in which he demonstrated how amazingly effective it was for everything from classification to part-of-speech tagging to parsing.)

Back to TensorFlow and the present day. I flew from Austin to Washington DC last week, and the morning before my flight I downloaded TensorFlow, made sure everything compiled, downloaded the necessary datasets, and opened up a bunch of tabs with TensorFlow tutorials. My goal was, while on the airplane, to run the tutorials, get a feel for the flow of TensorFlow, and then implement my own networks for doing some made-up classification problems. I came away from the exercise extremely pleased. This post explains what I did and gives pointers to the code to make it happen. My goal is to help out people who could use a bit more explicit instruction and guidance using a complete end-to-end example with easy to understand data. I won’t give lots of code examples in this post as there are several tutorials that already do that quite well—the value here is in the simple end-to-end implementations, the data to go with them, and a bit of explanation along the way.

As a preliminary, I recommend going to the excellent TensorFlow documentation, downloading it, and running the first example. If you can do that, you should be able to run the code I’ve provided to go along with this post in my try-tf repository on Github.

Simulated data

As a researcher who works primarily on empirical methods in natural language processing, my usual tendency is to try new software and ideas out on language data sets, e.g. text classification problems and the like. However, after hanging out with a statistician like James Scott for many years, I’ve come to appreciate the value of using simulated datasets early on to reduce the number of unknowns while getting the basics right. So, when sitting down with TensorFlow, I wanted to try three simulated data sets: linearly separable data, moon data and saturn data. The first is data that linear classifiers can handle easily, while the latter two require the introduction of non-linearities enabled by models like multi-layer neural networks. Here’s what they look like, with brief descriptions.

The linear data has two clusters that can be separated by a diagonal line from top left to bottom right:


Linear classifiers like perceptrons, logistic regression, linear discriminant analysis, support vector machines and others do well with this kind of data because learning these lines (hyperplanes) is exactly what they do.

The moon data has two clusters in crescent shapes that are tangled up such that no line can keep all the orange dots on one side without also including blue dots.


Note: see Implementing a Neural Network from Scratch in Python for a discussion of working with the moon data using Theano.

The saturn data has a core cluster representing one class and a ring cluster representing the other.

With the saturn data, a line is catastrophically bad. Perhaps the best one can do is draw a line that has all the orange points to one side. This ensures a small, entirely blue side, but it leaves the majority of blue dots in orange territory.

Example data has been generated in try-tf/simdata for each of these datasets, including a training set and test set for each. These are for the two-dimensional cases visualized above, but you can use the scripts in that directory to generate data with other parameters, including more dimensions, greater variances, etc. See the commented-out code for help visualizing the outputs, or adapt plot_data.R, which visualizes 2-d data in CSV format. See the README for instructions.
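The generation scripts themselves aren’t reproduced here, but the idea is simple. Here’s a minimal NumPy sketch of how saturn-style data might be produced (illustrative only; the actual try-tf scripts may use different parameters and output formats):

```python
import numpy as np

def saturn_data(n, seed=42):
    """Generate n core points (class 0) and n ring points (class 1)."""
    rng = np.random.RandomState(seed)
    # Core cluster: a 2-d Gaussian blob around the origin.
    core = rng.normal(0.0, 1.0, size=(n, 2))
    # Ring cluster: radius drawn from N(5, 0.5), angle uniform in [0, 2*pi).
    radius = rng.normal(5.0, 0.5, size=n)
    angle = rng.uniform(0.0, 2.0 * np.pi, size=n)
    ring = np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])
    X = np.vstack([core, ring])
    y = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])
    return X, y

X, y = saturn_data(100)
```

Because the ring fully surrounds the core, no single line can separate the classes, which is exactly why this dataset defeats the linear model below.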

Related: check out Delip Rao’s post on learning arbitrary lambda expressions.

Softmax regression

Let’s start with a network that can handle the linear data, which I’ve written in softmax.py. The TensorFlow page has pretty good instructions for how to define a single layer network for MNIST, but no end-to-end code that defines the network, reads in data (consisting of label plus features), trains and evaluates the model. I found writing this to be a good way to familiarize myself with the TensorFlow Python API, so I recommend trying it yourself before looking at my code and then referring to it if you get stuck.
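Before (or after) writing your own version, it may help to see the model’s core computation stripped of TensorFlow specifics. Here’s a minimal NumPy sketch of softmax regression itself, with made-up weights (this is not the actual softmax.py code, just the math it implements):

```python
import numpy as np

def softmax(z):
    # Subtract the row max before exponentiating for numerical stability.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def predict(X, W, b):
    # Scores are an affine map of the inputs; softmax turns each row of
    # scores into a probability distribution over the classes.
    return softmax(X @ W + b)

# Tiny illustration: two 2-d points, two classes, hand-picked weights.
X = np.array([[0.5, -1.0], [-0.5, 1.0]])
W = np.array([[1.0, -1.0], [2.0, -2.0]])
b = np.array([0.0, 0.0])
probs = predict(X, W, b)
```

Each row of `probs` sums to one; training just adjusts W and b so the probability mass lands on the correct class for each example.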

Let’s run it and see what we get.

$ python softmax.py --train simdata/linear_data_train.csv --test simdata/linear_data_eval.csv
Accuracy: 0.99

This performs one pass (epoch) over the training data, so each example triggered just one parameter update. 99% is good held-out accuracy, but allowing two training epochs gets us to 100%.

$ python softmax.py --train simdata/linear_data_train.csv --test simdata/linear_data_eval.csv --num_epochs 2
Accuracy: 1.0

There’s a bit of code in softmax.py to handle options and read in data. The most important lines are the ones that define the input data, the model, and the training step. I simply adapted these from the MNIST beginners tutorial, but softmax.py puts it all together and provides a basis for transitioning to the network with a hidden layer discussed later in this post.
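For intuition about what the training step is doing, here’s a hedged NumPy sketch of one epoch of per-example stochastic gradient descent for softmax regression, using the standard cross-entropy gradient (the actual code delegates this to TensorFlow’s optimizer, so the details differ):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sgd_epoch(X, y, W, b, lr=0.1):
    """One pass over the data, updating the parameters after every example."""
    for x_i, y_i in zip(X, y):
        probs = softmax(x_i @ W + b)
        # Gradient of cross-entropy w.r.t. the scores: probs minus one-hot target.
        grad = probs.copy()
        grad[y_i] -= 1.0
        W -= lr * np.outer(x_i, grad)
        b -= lr * grad
    return W, b

# Two trivially separable points, two classes, starting from zero weights.
X = np.array([[1.0, 0.0], [-1.0, 0.0]])
y = np.array([0, 1])
W, b = np.zeros((2, 2)), np.zeros(2)
for _ in range(5):
    W, b = sgd_epoch(X, y, W, b)
```

After a few epochs the weights separate the two points, which mirrors how a second epoch pushed the linear data from 99% to 100% above.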

To see a little more, let’s turn on the verbose flag and run for 5 epochs.

$ python softmax.py --train simdata/linear_data_train.csv --test simdata/linear_data_eval.csv --num_epochs 5 --verbose True

0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19
20 21 22 23 24 25 26 27 28 29
30 31 32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47 48 49

Weight matrix.
[[-1.87038445 1.87038457]
[-2.23716712 2.23716712]]

Bias vector.
[ 1.57296884 -1.57296848]

Applying model to first test instance.
Point = [[ 0.14756215 0.24351828]]
Wx+b = [[ 0.7521798 -0.75217938]]
softmax(Wx+b) = [[ 0.81822371 0.18177626]]

Accuracy: 1.0

Consider first the weights and bias. Intuitively, the classifier should find a separating hyperplane between the two classes, and it probably isn’t immediately obvious how W and b define one. For now, consider only the first column, with w1=-1.87038445, w2=-2.23716712 and b=1.57296884. Recall that w1 is the parameter for the `x` dimension and w2 is for the `y` dimension. The separating hyperplane satisfies Wx+b=0, from which we get the standard y=mx+b form:

Wx + b = 0
w1*x + w2*y + b = 0
w2*y = -w1*x - b
y = (-w1/w2)*x - b/w2

For the parameters learned above, we have the line:

y = -0.8360504*x + 0.7031074
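That arithmetic is easy to sanity-check in Python, plugging in the first-column values from the verbose output:

```python
# First-column parameters from the verbose run above.
w1, w2, b = -1.87038445, -2.23716712, 1.57296884

# Rearranging w1*x + w2*y + b = 0 into slope/intercept form.
slope = -w1 / w2
intercept = -b / w2
print(slope, intercept)
```

The printed values match the slope and intercept of the line given above to within floating-point rounding.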

Here’s the plot with the line, showing it is an excellent fit for the training data.


The second column of weights and bias separates the data points at the same place as the first, but mirrored 180 degrees. Strictly speaking, it is redundant to have two output nodes since a multinomial distribution with n outputs can be represented with n-1 parameters (see section 9.3 of Andrew Ng’s notes on supervised learning for details). Nonetheless, it’s convenient to define the network this way.

Finally, let’s try the softmax network on the moon and saturn data.

$ python softmax.py --train simdata/moon_data_train.csv --test simdata/moon_data_eval.csv --num_epochs 2
Accuracy: 0.856

$ python softmax.py --train simdata/saturn_data_train.csv --test simdata/saturn_data_eval.csv --num_epochs 2
Accuracy: 0.45

As expected, it doesn’t work very well!

Network with a hidden layer

The program hidden.py implements a network with a single hidden layer, and you can set the size of the hidden layer from the command line. Let’s try first with a two-node hidden layer on the moon data.
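For reference, the forward computation of such a network is just two affine maps with a nonlinearity between them. Here’s a NumPy sketch with random placeholder weights (hidden.py builds the equivalent out of TensorFlow ops, so this is illustrative only):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward(X, W1, b1, W2, b2):
    # Hidden layer: affine map followed by a tanh nonlinearity.
    h = np.tanh(X @ W1 + b1)
    # Output layer: affine map followed by softmax over the classes.
    return softmax(h @ W2 + b2)

# Shapes for 2-d inputs, a 3-node hidden layer, and 2 classes.
rng = np.random.RandomState(0)
W1, b1 = rng.randn(2, 3), np.zeros(3)
W2, b2 = rng.randn(3, 2), np.zeros(2)
probs = forward(rng.randn(5, 2), W1, b1, W2, b2)
```

The tanh layer is what lets the model bend its decision boundary, which is exactly the non-linearity the moon and saturn data require.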

$ python hidden.py --train simdata/moon_data_train.csv --test simdata/moon_data_eval.csv --num_epochs 100 --num_hidden 2
Accuracy: 0.88

So, that was an improvement over the softmax network. Let’s run it again, exactly the same way.

$ python hidden.py --train simdata/moon_data_train.csv --test simdata/moon_data_eval.csv --num_epochs 100 --num_hidden 2
Accuracy: 0.967

Very different! What we are seeing is the effect of random initialization, which has a large effect on the learned parameters given the small, low-dimensional data we are dealing with here. (The network uses Xavier initialization for the weights.) Let’s try again but using three nodes.

$ python hidden.py --train simdata/moon_data_train.csv --test simdata/moon_data_eval.csv --num_epochs 100 --num_hidden 3
Accuracy: 0.969

If you run this several times, the results don’t vary much and hover around 97%. The additional node increases the representational capacity and makes the network less sensitive to initial weight settings.

Adding more nodes doesn’t change results much—see the WildML post using the moon data for some nice visualizations of the boundaries being learned between the two classes for different hidden layer sizes.

So, a hidden layer does the trick! Let’s see what happens with the saturn data.

$ python hidden.py --train simdata/saturn_data_train.csv --test simdata/saturn_data_eval.csv --num_epochs 50 --num_hidden 2
Accuracy: 0.76

With just two hidden nodes, we already have a substantial boost from the 45% achieved by softmax regression. With 15 hidden nodes, we get 100% accuracy. There is considerable variation from run to run (due to random initialization). As with the moon data, there is less variation as nodes are added. Here’s a plot showing the increase in performance from 1 to 15 nodes, including ten accuracy measurements for each node count.


The line through the middle is the average accuracy measurement for each node count.

Initialization and activation functions are important

My first attempt at doing a network with a hidden layer was to merge what I had done in softmax.py with the network in mnist.py, provided with TensorFlow tutorials. This was a useful exercise to get a better feel for the TensorFlow Python API, and helped me understand the programming model much better. However, I found that I needed upwards of 25 hidden nodes in order to reliably get >96% accuracy on the moon data.

I then looked back at the WildML moon example and figured something was quite wrong since just three hidden nodes were sufficient there. The differences were that the MNIST example initializes its hidden layers with truncated normals instead of normals divided by the square root of the input size, initializes biases at 0.1 instead of 0 and uses ReLU activations instead of tanh. By switching to Xavier initialization (using Delip’s handy function), 0 biases, and tanh, everything worked as in the WildML example. I’m including my initial version in the repo as truncnorm_hidden.py so that others can see the difference and play around with it. (It turns out that what matters most is the initialization of the weights.)
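To make the comparison concrete, here’s a sketch of the two weight-initialization schemes being contrasted (assuming “Xavier” here means normal draws scaled by the square root of the input size, as described above, and approximating tf.truncated_normal by redrawing samples beyond two standard deviations; the exact variants in the tutorials may differ):

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng):
    # Scale the normal draws by 1/sqrt(fan_in) so the variance of the
    # pre-activations stays roughly constant regardless of layer width.
    return rng.normal(0.0, 1.0 / np.sqrt(fan_in), size=(fan_in, fan_out))

def truncated_normal_init(fan_in, fan_out, rng, stddev=0.1):
    # Redraw any sample that falls more than two standard deviations
    # from the mean, roughly mimicking tf.truncated_normal.
    w = rng.normal(0.0, stddev, size=(fan_in, fan_out))
    mask = np.abs(w) > 2 * stddev
    while mask.any():
        w[mask] = rng.normal(0.0, stddev, size=mask.sum())
        mask = np.abs(w) > 2 * stddev
    return w

rng = np.random.RandomState(0)
w_xavier = xavier_init(100, 50, rng)
w_trunc = truncated_normal_init(100, 50, rng)
```

Note that the Xavier scale adapts to the layer width (std 0.1 for 100 inputs, 0.01 for 10,000), whereas the truncated-normal scheme uses a fixed standard deviation, which is one plausible reason the two behave so differently on this small network.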

This is a simple example of what is often discussed with deep learning methods: they can work amazingly well, but they are very sensitive to initialization and choices about the sizes of layers, activation functions, and the influence of these choices on each other. They are a very powerful set of techniques, but they (still) require finesse and understanding, compared to, say, many linear modeling toolkits that can effectively be used as black boxes these days.


I walked away from this exercise very encouraged! I’ve been programming in Scala mostly for the last five years, so it required dusting off my Python (which I taught in my classes at UT Austin from 2005-2011, e.g. Computational Linguistics I and Natural Language Processing), but I found it quite straightforward. Since I work primarily with language processing tasks, I’m perfectly happy with Python since it’s a great language for munging language data into the inputs needed by packages like TensorFlow. Also, Python works well as a DSL for working with deep learning (it seems like there is a new Python deep learning package announced every week these days). It took me less than four hours to go through initial examples, and then build the softmax and hidden networks and apply them to the three data sets. (And a bunch of that time was me remembering how to do things in Python.)

I’m now looking forward to trying deep learning models, especially convnets and LSTMs, on language and image tasks. I’m also going to go back to my Scala code for trying out Deeplearning4J to see if I can get these simulation examples to run as I’ve shown here with TensorFlow. (I would welcome pull requests if someone else gets to that first!) As a person who works primarily on the JVM, it would be very handy to be able to work with DL4J as well.

After that, maybe I’ll write out the recurring rant going on in my head about deep learning not removing the need for feature engineering (as many backpropagandists seem to like to claim), but instead changing the nature of feature engineering, as well as providing a really cool set of new capabilities and tricks.

The united colors of Baldridge: my hand, my wife’s hand, and our boys’ hands.

This is a long and personal post about racism in the USA. It’s an outpouring of some of what I’ve felt this past year, with an appeal for us all, whether white, black, Hispanic, Asian, mixed, decidedly undeclared, or whatever, to not give up, to keep working to make this country, this world, a better place.

The catalyst for me to write this was a series of tweets by Shaun King (@shaunking) several months ago. King has emerged as one of the leaders of #BlackLivesMatter, a movement to document and address racism in the USA, and especially focus on police misconduct and brutality. In those tweets, King noted his acceptance of a pessimistic view that racism is a permanent feature of American society. It’s not an unreasonable perspective, but it deeply saddens me. As a white husband of a black woman and father of biracial children, I desperately want to remain optimistic. I need to remain optimistic. My family lives between two worlds, and we can’t pick sides. In this post, I want to give some support for embracing a more optimistic perspective. But first, let’s establish why there is good cause to be pessimistic.

It’s been a hell of a few years for race relations in the United States of America. From Trayvon Martin through to the Charleston shootings to Sam Dubose and Corey Jones, black people have been disproportionately killed. Included in the body count are far too numerous instances of police misconduct and brutality. This is violence meted out by the state, and the individuals are disproportionately people of color. It’s been going on for years and it’s nothing new to the black community. Nearly ubiquitous video cameras and social media are now finally making it less easy for the wider community to ignore.

As sad, frustrating and angering as this all is, this moment presents a tremendous opportunity. To put it simply: systemic racism can’t be addressed effectively without white Americans being aware of it and acting to reduce it. Until recently, most white people in the country seem to have been living under the convenient but false perception that racism is more or less a problem of the past. Now, white Americans see racism as a national problem, but generally don’t think it is a major problem in their own communities. In general, it seems white Americans tell themselves that perhaps there is some discrimination that we still need to address, but it’s not violent, really serious stuff. Maybe there are some backward people down south who are real racists, but by and large we’ve gotten past it, at least in our own communities. Unfortunately, that’s wishful thinking. Ignorance may at times be bliss, but that only really holds for the privileged. And, anyway, there are outright racist people, and they aren’t just in the south.

My wife is African-American. Our nine years together have been a crash course in race relations for me. There is so much I could never have guessed about the black experience in the United States without being with her. To learn at the age of five that there were people who wanted you dead because of your skin color, and furthermore, to learn this from a six-year-old friend. To wish at the age of seven that you are actually a white girl so that you could avoid the burden of being black (this is not uncommon, and Whoopi Goldberg has a powerful performance about it in her 1985 standup show Direct From Broadway). To hear your mom talk of seeing the severed head of a black man rolling down the street in the 1960s. To ask your husband not to stop in Vidor, Texas—even though you are in a traffic jam, pregnant and really needing to pee, because during college you saw “Nigger, don’t let the sun go down on you in this town” written on a wall there. To fear interactions with the police, even though you are a law-abiding, upstanding citizen with graduate degrees from Harvard and Yale. Just last week, she was driving down a road in Austin, at the speed limit, and a police officer in an SUV pulled up beside her, eyed her and matched her pace for some time—nothing happened, but it felt very threatening. Frankly, I didn’t really get her concerns about the police until last year. Now it is all too clear, and it was really driven home by what happened to Sandra Bland, right here in our home state of Texas, and in a city we often pass on our drive between Houston and Austin.

My wife and I have watched the events of the past year with sadness and horror. We have two bright and beautiful sons. Like any parents, we have huge dreams for them and want to set them up to live the happiest, most fulfilled lives they can possibly realize. Yet, we live in a country and time where not only black men and women are killed without justifiable cause or with extremely fast judgment, but even children like Trayvon Martin and Tamir Rice. Where a 14-year-old girl in a bikini is thrown to the ground and sat on by a police officer for several minutes—an officer who also pulled a gun out to threaten two other teens who were concerned for her and who swore repeatedly at other teens at the same incident in McKinney, Texas. (I wrote to the chief of police to ask that officer be dismissed.) Where Chris Lollie, a man waiting to pick up his kids in the St. Paul skyway, is apprehended without cause, tasered, and body searched by the police (and being polite didn’t help in the least). Where a 7-year-old boy is handcuffed for an hour for being unruly in class. It goes on and on—far too many to enumerate here. And because of all this, my wife and I frequently find ourselves watching and listening to the advice that many thoughtful people are giving about raising black children in the USA (e.g., Greta Gardner, Clint Smith, W. Kamau Bell). These are all concerns that were foreign to me and played no part in my own upbringing.

This July, my family took a vacation road trip from Texas to DC to Michigan and back. You learn a lot about the different parts of the US as a biracial family on such a trip. We nearly always stop at McDonald’s for bathroom breaks because we know there are cameras more consistently than in gas stations. We are quite accustomed to the hate stares directed at us, especially in poorer regions in the south. We also get disapproving looks from many black people, especially in black neighborhoods in cities like Houston and DC. Though it is usually just looks and stares, one white woman in a North Carolina rest stop loudly stated that she found our family “disgusting”. We planned our driving so that we wouldn’t have to stay the night in Missouri because of the recent racial tensions highlighted in Ferguson. (There is great irony in this, of course—our own state of Texas has its own poor track record with racism and police brutality, including recently the McKinney pool incident and Sandra Bland’s wrongful arrest and death and more.)

There was only one time on our trip that we felt real fear of more than looks and words. We were low on gas at one point and exited the highway to refuel, only to find the station we’d spotted was no longer operational—however, there were several trucks idling around this otherwise abandoned gas station. We immediately started to go, but our six-year-old declared he needed to pee, so I took him to the forest line—during which time more trucks started to show up. I hurried back as quickly as I could, and my wife had already hopped into the driver’s seat. We got out and back onto the highway fast. It may have been nothing, but it felt like something was possible. When I looked the location up later, I found out that it is a small township that hosts a chapter of the KKK. (I’m now definitely going to map the locations of such chapters out before we go on such a road trip again.)

So, we’ve thankfully only experienced mild discomfort as a family (my wife has experienced much more on her own, including being called a nigger by two white men in a car while walking on Harvard Square), but there is lots of stuff that is pretty bad going on out there. Shortly after our road trip, a similar biracial family on a long drive was stopped and cross-examined by police in a very intimidating manner. And there are plenty of people having rallies for the Confederate flag, and they don’t know their history, so let us admit it is not about “heritage”. They even show up at kids’ birthday parties and threaten people. They definitely don’t seem to like black people.

So… where’s the room for optimism? My best guess is that the availability heuristic is playing a big role here, in multiple ways. If you can, go read Daniel Kahneman’s book “Thinking, Fast and Slow” to learn about the availability heuristic and much more. But you probably can’t do that right now, so here it is briefly: the availability heuristic is a shortcut used by the human mind to evaluate a topic by using examples that are readily retrieved from memory. As an example, consider the question “is the world more violent today than it was in the past?” Perhaps a majority of people would respond yes—it is certainly easy to come to that conclusion if you watch the news. However, Steven Pinker carefully argues in his excellent book “The Better Angels of Our Nature: Why Violence Has Declined” that the data points convincingly toward the opposite conclusion. In fact, he spends a large portion of a large book getting the reader past their own sense of the problem as biased by the availability heuristic. As it turns out, there has in fact never been a time when the probability of a given individual dying violently has been lower. But sex and violence are what sell news, so that’s what we hear about. Then, when we consider the question, the availability heuristic brings those examples quickly to mind. It’s much harder to think about the billions of people just boringly living their lives. There are obviously many pockets of the world and our society where these trends are not as encouraging, so it isn’t time to sit back and say all is well.

It seems quite likely that when someone like Shaun King considers a question like “is racism a permanent feature of American society?”, examples like the ones I’ve mentioned above easily come to mind and dominate the mental computation. Frankly, it happens to me too—it gets me angry and upset and I find myself listening more regularly to artists like Killer Mike, The Roots and even going back to Rage Against the Machine. And, this is not to say “yes” isn’t the right answer. It is just to say that we need to consider the availability heuristic’s potential role in arriving at that answer. I believe we need more data and perspectives before we truly give up hope. The other thing is that it is notoriously hard to make predictions, especially about the future. As just one related example, I heard one family member lament—just a year before Obama’s candidacy—that we’d never have a black president.

As another example, consider American slavery in the decade before the Civil War. It would have been reasonable then to feel that slavery would be a permanent feature of American society. In the concluding chapter of “The Slavery Question”, from the 1850s, the author, John Lawrence, writes:

Are there any prospects that the long and dreary night of American despotism will speedily end in a joyous morning?

If we turn our eye towards the political horizon we shall find it overspread with heavy clouds portentous of evil to the oppressed. The government of the United States is intensely pro-slavery. The great political parties, with which the masses of the people act, vie with each other in their supple and obsequious devotion to the slaveocracy. The wise policy of the fathers of the Republic to confine slavery within very narrow limits, so that it would speedily die out and be supplanted by freedom, has been abandoned; the whole spirit of our policy has been reversed, and our national government seems chiefly concerned for the honor, perpetuation and extension of slavery.

Lawrence goes on to make further points of how dire the situation is, and quotes Frederick Douglass. But his book is called “The Slavery *Question*”, so he of course isn’t giving up. In fact, he flips it with excellent rhetorical flourish.

But dark as is this picture, there is still hope. The exorbitant demands of the slave power, the extreme measures it adopts, the deep humiliation to which it subjects political aspirants, will produce a reaction.

Inflated with past success it is throwing off its mask and revealing its hideous proportions. It is now proving itself the enemy of all freedom. The extreme servility of the popular churches is opening the eyes of many earnest people to the importance of taking a bolder position. They are finding out that it is a duty to come out from churches which sanction the vilest iniquity that ever existed, or exhaust their zeal for the oppressed in tame resolves, never to be executed.

The truth is gaining ground that slaveholding is a great sin, that slaveholders are great sinners, and that he who apologises for the system is a participator in the guilt and shame.

In other words, it’s a systemic problem, and not taking a position against slavery is to be complicit in its evils. In his concluding paragraph, he declares “The day of deliverance is not distant.” It took a bloody war, but a decade later, slavery was abolished.

And this brings us to what can be so frustrating about discussing current race relations with white Americans—namely that they have a very hard time discussing it. In fact, there is now a term, “white fragility,” that describes the odd sensitivity that nearly all white people have when discussing race. We just aren’t very good at it and it’s for a pretty obvious reason: we aren’t required to navigate race to function in our society, while any person of color must. There is also plenty of ambiguity to deal with since race itself is a social construct with very fluid boundaries, and a frequent white response is the well-intentioned but ultimately naive and counter-productive statement “I don’t see color”. One side of this leads to awkward, relatively harmless everyday encounters that can even be made light of — see “What if black people said the stuff white people say” (see also the videos for Latinos and Asians). But there is a deeper problem of systemic racial disparities that disproportionately benefit white Americans (for a very effective analogy, see this post comparing it to being a bicyclist on the road). The tricky nature of these benefits is that few white Americans realize and admit they are receiving them. They are working hard, dealing with their own successes, failures, pleasures and pains, and it sounds crazy to them that they are privileged. And in fact, this is a natural conclusion to reach when you rely on the availability heuristic to consider the topic.

Another dynamic here is that so few white people have close black friends. I don’t mean your co-worker or a person you see from time-to-time. I mean deep personal connections that allow true sharing and sympathetic understanding of another person’s life and experiences. It’s not uncommon for a black American to be THE black friend for many white people, and they are probably keeping a good share of themselves out of reach. My wife learned to do that after even simple comments led some friends and acquaintances to go into conniptions. One man asked my wife “is the singing in black churches as good as they say?”, to which she responded “the singing is great in all the black churches I’ve been to.” He became hysterical and declared that this was a racist thing for her to say. She tried to continue the conversation by contextualizing it more specifically, saying she hadn’t been to every black church and every white church, and that she was just stating her own experience. He just became more irate, and it really seemed that he just wanted to validate his existing prejudices. After exchanges like this and many others like it, it’s often easier just to avoid racial topics altogether.

It is also just common for white Americans to lack deep experiences with black Americans. Until I started dating my wife, I also was similarly removed. I grew up in Rockford, Michigan. We had just a few black students in our high school and I didn’t know any of them. My eyes were opened to a number of things by listening to rap in the late 1980s, especially Willie Dee’s album Controversy, which included songs like “Fuck the KKK” (and many unfortunate misogynistic songs on the second side). My freshman college roommate at the University of Toledo was black and we got along great, but we didn’t hang out together much outside the dorm. I recognized that there were many problems for black Americans living in the inner cities, but I had little knowledge or appreciation for the day-to-day hurdles that black Americans faced regardless of their social status and location (often referred to as “paying the black tax”). It was never through any personal desire to be distanced, but it just didn’t happen until I fell in love with my amazing, wonderful wife in 2006. (Side note: we actually knew each other as students in Toledo in the 1990s. I had a crush on her, but considered her out of my league and didn’t do anything about it at the time. Doh!)

Much of the nation, it seems, expressed huge outrage about the killing of Cecil the Lion. At the same time, we had footage of a police officer shooting Sam Dubose in the head—and it hardly even seems to register outside the black community. I’m not setting up a false dilemma here: it’s fine to be upset about both killings; however, I’m highlighting the apparent higher proportion of the white population that is moved to express outrage by the former and what that says about priorities (especially when considering that much big game hunting is supporting nature preserves and endangered animal populations). Regardless, what I actually appreciate most about contrasting the two killings is how Cecil provided a platform for humorous, but serious, comparisons—most importantly, to highlight how every killing of an unarmed black person turns into an analysis of their character and actions and how those led or contributed to their being killed (as if it’s okay for police to be executioners). Doing the same for Cecil highlights the absurdity of this. Don’t forget that #AllLionsMatter, and can we also please have a serious discussion about lion on lion crime?

In case it isn’t obvious, many of the common defenses of police violence meted out to black Americans are not much different from blaming a rape victim because she wore a particular skirt, flirted too much, drank too much, was out too late, and so on. If you don’t believe me, go back and watch the videos of Chris Lollie, Sandra Bland, and Sam Dubose. Consider that for the latter two, the officers’ statements about the stops were contradicted by the video evidence. Then consider the many cases where people have died at the hands of the police and there was no video to check the veracity of their version of events—the police are always cleared of wrongdoing. In the case of Sandra Bland, consider that there has been tons of focus on whether she committed suicide or was murdered, but let’s not forget it started with a completely ridiculous traffic stop. She should not have died in that cell because she should never have been there in the first place.

We need the police, but we need them to do their job right. That means serving and protecting all citizens, regardless of race, religion, sexual preference, etc. I hope that efforts in community-oriented and evidence-based policing will start to improve matters. It makes a lot of sense, but the data is still inconclusive as to whether it actually reduces crime and improves public perceptions of the police. I’m also encouraged that many police departments are adopting data-driven methodologies that have the potential to help reduce racial profiling and identify problem officers. We must also analyze and evaluate the potential for both improved policing and even worse racial profiling offered by new algorithms—a topic I wrote about in my article “Machine Learning and Human Bias”. Getting policing into better shape in this country will nonetheless require sustained efforts such as Justice Together and Campaign Zero, and those have a greater chance of success if white people are agitating for change as well as black people.

My family at the Lincoln Memorial.


I am optimistic that we can get to a better place as a society. My family’s road trip brought us to Washington DC, and we went to the Lincoln Memorial. It’s a powerful place, especially for a family like ours. The words of Lincoln’s second inaugural address are on the wall. At that time, the nation was nearing the end of its greatest existential crisis, but Lincoln showed tremendous restraint and forward-thinking, concluding:

With malice toward none, with charity for all, with firmness in the right as God gives us to see the right, let us strive on to finish the work we are in, to bind up the nation’s wounds, to care for him who shall have borne the battle and for his widow and his orphan, to do all which may achieve and cherish a just and lasting peace among ourselves and with all nations.

They did it—they actually defeated slavery and kept the nation together. One hundred and fifty years later, we are still working through the divisions created by that vile institution, including how we view that time and institution itself now. It’s hard, but we must remain optimistic as well as realistic.

It’s easy to feel overwhelmed by the scale of racism in the USA. But we can’t just throw up our hands. It’s not enough to be well-meaning and hold good intentions. All of us, black, white and more, must own our part of the solution. There is much that white Americans can do to understand and help. Talk to your kids explicitly about race and racism. My mom has gotten through to white friends who dismiss #BlackLivesMatter by talking about her black daughter-in-law and grandsons and how events impact them directly. Even if you have no strong personal connections to black Americans, you can start by reading books like “Between the World and Me” by Ta-Nehisi Coates to get a better sense of what it means to grow up black in the USA. It’s the best book I’ve read this year. I particularly like it because he states things starkly, with no sugar-coating: he puts forth a grounded, atheist viewpoint that doesn’t romanticize. Coates discusses what is done to black bodies, not black spirits, hopes and dreams.

“The spirit and soul are the body and brain, which are destructible—that is precisely why they are so precious.” – Ta-Nehisi Coates

The focus on the body allows him to dissociate the cultural from the perceived biological components of race, and remind us that white people aren’t white people, but are “people who believe they are white”. That’s an important, powerful distinction.

Actions such as legislation against racism (possibly limiting free speech) are unlikely to improve things and will likely make other things much worse. Support policies that seek to diminish our out-of-control prison system, which includes locations like Rikers Island and Homan Square where people have been held without being charged, sometimes for years. These places breathe life into Kafka’s book “The Trial”, and they destroy actual lives. Perhaps the biggest payoff on societal issues like racism comes from supporting policies that truly improve educational and economic opportunities for all Americans (no easy problem, I know—the important thing is to realize this is surely more important than symbolic actions). The more that each of us, regardless of our background, can fulfill our potential, the better our chances of getting along.

Black lives matter, and ALL our lives depend on that. Spread love, not hate, and work for justice and equality of opportunity for all. My family wouldn’t have been possible if others hadn’t done the same. Meaningful change generally takes a long time, but it can come relatively rapidly too. Consider that couples like my wife and me could not legally marry in Texas and many other states until 1967—just seven years before we were born (many thanks to the Lovings and others of their generation!). Consider that it wasn’t until 1993 that marital rape became illegal in all 50 states. Consider that gay couples gained the right to legally marry in all 50 states, well, this very year.

We can do this. We must do this.

I gave a talk at IZEAfest about the science of sharing. I wove together a narrative based on recent research in social network analysis and some work we’ve done at People Pattern. It was fun preparing and delivering it!

The livestream of the talk is here [NOTE: the video is currently unavailable and I’m working on finding a new link], and here are the slides.


In preparing for the talk, I read and referred to a lot of papers. Here’s the list of papers referenced in the talk! Each paper is followed by a link to a PDF or site where you can get the PDF. A huge shout-out to the fantastic work by these scholars—I was really impressed by the social network analysis work and the questions and methods they have been pursuing.

Chen et al. (2015). “Making Use of Derived Personality: The Case of Social Media Ad Targeting.” – http://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10508

Cheng et al. (2015). “Antisocial Behavior in Online Discussion Communities.” – http://arxiv.org/abs/1504.00680

Friggeri et al. (2015). “Rumor Cascades.” – http://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/view/8122

Goel et al. (2015). “The Structural Virality of Online Diffusion.” – https://5harad.com/papers/twiral.pdf

Gomez-Rodriguez et al. (2014). “Quantifying Information Overload in Social Media and its Impact on Social Contagions.” – http://arxiv.org/abs/1403.6838

Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.” – http://www.mpi-sws.org/~manuelgr/pubs/S2050124214000034a.pdf

Iacobelli et al. (2015). “Large Scale Personality Classification of Bloggers.” – http://www.research.ed.ac.uk/portal/files/12949424/Iacobelli_Gill_et_al_2011_Large_scale_personality_classification_of_bloggers.pdf

Kang and Lerman (2015). “User Effort and Network Structure Mediate Access to Information in Networks.” – http://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10483

Kooti et al. (2015). “Evolution of Conversations in the Age of Email Overload.” – http://arxiv.org/abs/1504.00704

Kulshrestha et al. (2015). “Characterizing Information Diets of Social Media Users.” – https://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/viewFile/10595/10505

Lerman et al. (2015). “The Majority Illusion in Social Networks.” – http://arxiv.org/abs/1506.03022

Leskovec et al. (2009). “Meme-tracking and the Dynamics of the News Cycle.” – http://www.memetracker.org/quotes-kdd09.pdf

Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns.” – http://snap.stanford.edu/quotus/

Weng et al. (2014). “Predicting Successful Memes using Network and Community Structure.” – http://arxiv.org/abs/1403.6199

Yarkoni (2010). “Personality in 100,000 Words: A large-scale analysis of personality and word use among bloggers.” – http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2885844/

Shaun King has been one of the most visible and vocal leaders of the Black Lives Matter movement over the past year. He’s done a great deal to raise awareness of police misconduct and brutality, with a particular emphasis on the disproportionate targeting of black Americans. (Though it is worth noting that he and others have also called attention to cases where people of other races have been killed by police, even while the supposed #AllLivesMatter folks seemed oddly silent.)

An unsurprising development regarding Black Lives Matter is that its leaders are coming under character attacks. There is a long tradition of privileged segments of society and even the government doing this, including to the now revered and respected Martin Luther King. And Shaun King has recently come under a very odd sort of attack from the conservative media: they are saying he is not truly black and accusing him of being duplicitous, like Rachel Dolezal.

This issue has a particular resonance for me because my family is a tangible example of the complexities of the concept of race. I’m white and my wife is black. We have two sons, both our biological children. The picture on the right is of our four hands. Our older son has darker skin and dark curly hair. He’s absolutely beautiful. Most people see him and think of him as “black”. Our younger son has light skin and blonde hair with just a hint of curl. He’s absolutely beautiful. Most people see him and think of him as “white”. In fact, when my wife is in public with our younger son (and without me), most people assume she’s his nanny. (And white people monitor her to make sure she’s treating him well, but that’s another story.)

So, we have these two children who are perceived very differently by others. Are you to tell me that the younger one isn’t “black” or is less “black” than his older brother? Just like Shaun King isn’t black because his skin is too light? What if both of my sons strongly identify with their black heritage and become leaders of some future “black” movement that seeks to reduce racial disparities? Would my younger son be attacked for not being “black” enough? With his older brother standing right by his side and no one questioning his blackness? One gets to speak for the black community because the genetic dice gave him the darker skin and hair, while the other is unsuitable? That would be pure and utter bullshit.

Let’s step back for a moment. It’s important to consider what “race” means and how any given individual might define it differently from others. And that one’s own notion of racial categories might shift over time, as applied to others or even to oneself. Can we even operationalize racial categories? It’s rather tricky. I wrote about this in the context of machine learning, and there’s good recent academic work on figuring out what the notion of race fundamentally encompasses. As Sen and Wasow argue in their article “Race as a bundle of sticks”, we should look at race as a multi-faceted group of properties, some of which are immutable (like genes) and many of which are mutable (such as location, religion, diet, etc). The very notion of racial categorization shifts over time—for example, there was a time not long ago when southern Europeans were not considered “white”. All this is not to say that race isn’t a thing, but that it is very very complicated. In fact, it is far more complicated than most people have ever stopped to really consider.

Returning to the attacks on Shaun King, here’s the thing: I personally don’t care if he is “black” or not, or is somewhat “black” or not. He could be Asian or white and it wouldn’t matter. I think he is doing what he’s doing because he is a caring human being who believes it is right and necessary. He wants to raise awareness of and reduce police violence and reduce racial disparities. That’s a laudable goal no matter who you are, no matter what race you identify with, no matter what. Period.

To me, this is clearly an ad hominem attack based on the flawed premise that race is a concept that we can clearly and objectively delineate. It has nothing to do with the facts and arguments that surround questions of racism in the USA, police conduct and related issues. There is plenty to debate there and, for what it’s worth, I don’t agree with Shaun King on many things. We all must do our best to learn, consider and reflect on the information we have. Ideally, we also seek new perspectives and keep an open mind while doing so. As it is, this attack is a distraction designed to deflect attention away from the real issues. It’s just smoke and mirrors.

And if you think there aren’t real issues here… Ask yourself if you think our country should support a truly Kafka-esque institution like Rikers Island. Ask yourself if you are comfortable with the Sandra Bland traffic stop (even FOX news and Donald Trump aren’t, as Larry Wilmore noted). Ask yourself if people should be threatened by the police when they are in their own driveway, hooking up their own boat to their own car. Ask whether the police should be outfitted with military-grade vehicles and weapons (see also John Oliver’s serious/humorous take on this). These are just a few (important) examples, and there are unfortunately many more. They do not reflect the United States of America that I believe in—a great country based on a civil society that protects the rights of individuals without prejudice for their race, religion, political affiliation, etc. You are ignoring much evidence if you think there isn’t a problem. Pay attention, please.

This is a horrible video of police in McKinney, Texas treating a bunch of kids — I stress, KIDS — at a pool party in a very heavy-handed way, way out of proportion to the situation (the “incident”). One officer, Eric Casebolt, pulls his gun as a threat (and he is now on leave because of it). Kids who had nothing to do with the situation are handcuffed, yelled at, and called motherfuckers. I can’t imagine this happening at a similar party in my (almost entirely white) hometown of Rockford, Michigan.

For more context, see this article.

I find this all very upsetting, and I took up Joshua Dubois’ suggestion to write to the police chief. My letter is below.

Dear Police Chief Conley,

I’m writing to express my extreme disapproval and concern regarding the incident in McKinney involving very heavy-handed behavior by police, and in particular Corporal Eric Casebolt, against a group of teens.

I have reviewed the videos and read many different reports on the matter, and I realize that there may be more information yet to come to light. Regardless of how things transpired prior to the police force arriving, the actions of Corporal Casebolt are incredibly disturbing: yanking a 14-year-old girl by her hair, pinning her to the ground, chasing other teens with a gun, and swearing and cursing at teens. Many of the teens were interacting very respectfully, yet he tells them to “sit your asses down on the ground”. Many of the other teens appear incredibly scared — wanting to help their friends, but not wanting to escalate the situation (which is probably wise given recent events in the country and Corporal Casebolt’s disposition and his brandishing of his gun).

This is not behavior befitting an officer of the law. I fully realize that the police have an important and difficult job to do, and I’m thankful to those who serve and keep the peace. I believe a big part of that job is to show respect to the people that the police serve, and to apply rules and force consistently, regardless of the age, race, or socio-economic status of the individuals involved. Sadly, recent events in the country, including Saturday’s incident in McKinney, indicate that this is far from the case currently.

I’m not writing this just as a concerned citizen from afar. I live in Austin, Texas. My wife is African-American and we have two biracial sons, currently two and six years old. My six-year-old likes dinosaurs, tennis, and math. He’s going to do amazing things, but I fear that society—including the authorities—will view him as a threat by the time he becomes a teenager in 2022. My wife has family who live in Lewisville, less than 30 minutes from McKinney. If my son goes to a pool party with his cousin in seven years, should I worry that he will be handcuffed just for being present? And that no matter how polite and respectful he is, he’ll be told to sit his ass down? I certainly hope not, but seven years isn’t very much time. I sincerely hope that you and others in similar positions will do whatever you can to help reduce the likelihood of these sorts of incidents and to ensure that the members of the police force are respectful of the rights of all citizens. A good start would be for you to dismiss Corporal Casebolt.


Dr. Jason Baldridge

Associate Professor of Computational Linguistics, The University of Texas at Austin

Co-founder and Chief Scientist, People Pattern

I’m not at all sure it will do any good, but it’s a start to trying to effect some change. If you feel the same, please consider writing, and getting involved. Follow Shaun King and DeRay Mckesson for much, much more on what is going on with the police and racism. We need to find a better way forward, as a society.

Bozhidar Bozhanov wrote a blog post titled “The Low Quality of Academic Code”, in which he observed that most academic software is poorly written. He makes plenty of fair points, e.g.:

… there are too many freshman mistakes – not considering thread-safety, cryptic, ugly and/or stringly-typed APIs, lack of type-safety, poorly named variables and methods, choosing bad/slow serialization formats, writing debug messages to System.err (or out), lack of documentation, lack of tests.

But, here’s the thing — I would argue that this lack of engineering quality in academic software is a feature, not a bug. For academics, there is basically little to no incentive to produce high quality software, and that is how it should be. Our currency is ideas and publications based on them, and those are obtained not by creating wonderful software, but by having great results. We have limited time, and that time is best put into thinking about interesting models and careful evaluation and analysis. The code is there to support that, and is fine as long as it is correct.

The truly important metric for me is whether the code supports replication of the results in the paper it accompanies. The code can be as ugly as you can possibly imagine as long as it does this. Unfortunately, a lot of academic software doesn’t make replication easy. Nonetheless, having the code open sourced makes it at least possible to hack with it to try to replicate previous results. In the last few years, I’ve personally put a lot of effort into making my work and my students’ work easy to replicate. I’m particularly proud of how I put code, data and documentation together for a paper I did on topic model evaluation with James Scott for AISTATS in 2013, “A recursive estimate for the predictive likelihood in a topic model.” That was a lot of work, but I’ve already benefited from it myself (in terms of being able to get the data and run my own code). Check out the “code” links in some of my other papers for other examples that my students have done for their research.

Having said the above, I think it is really interesting to see how people who have made their code easy to use (though not always well-engineered) have benefited from doing so in the academic realm. A good example is word2vec: the software released for it generated tons of interest in industry as well as academia, and probably led to much wider dissemination of that work and to more follow-on work. Academia itself doesn’t reward that directly, nor should it. That’s one reason you see it coming out of companies like Google, but it might be worth it to some researchers in some cases, especially PhD students who seek industry jobs after they defend their dissertations.

I read a blog post last year in which the author encouraged people to open source their code and not worry about how crappy it was. (I wish I could remember the link, so if you have it, please add it in a comment. Here is the post: “It’s okay for your open source library to be a bit shitty.”) I think this is a really important point. We should be careful not to get overly critical about code that people have made available to the world for free—not because we don’t want to damage their fragile egos, but because we want to make sure that people generally feel comfortable open sourcing. This is especially important for academic code, which is often the best recipe, no matter how flawed it might be, that future researchers can use to replicate results and produce new work that meaningfully builds on or compares to that work.

Update: Adam Lopez pointed out this very nice related article by John Regehr, “Producing good software from academia”.

Addendum: When I was a graduate student at the University of Edinburgh, I wrote a software package called OpenNLP Maxent (now part of the OpenNLP toolkit, which I also started then and which is still widely used today). While I was still a student, a couple of companies paid me to improve aspects of the code and documentation, which really helped me make ends meet at the time and made the code much better. I highly encourage this model — if there is an academic open source package that you think your company could benefit from, consider hiring the grad student author to make it better for the things that matter for your needs! (Or do it yourself and submit a pull request, which is much easier today with GitHub than it was in 2000 with SourceForge.)

Update: Thanks to the commenters below for providing the link to the post I couldn’t remember, “It’s okay for your open source library to be a bit shitty”! As a further note, the author surprisingly connects this topic to feminism in a cool way.

I’m a longtime fan of Chris Manning and Hinrich Schütze’s “Foundations of Statistical Natural Language Processing” — I’ve learned from it, I’ve taught from it, and I still find myself thumbing through it from time to time. Last week, I wrote a blog post on SXSW titles that involved looking at n-grams of different lengths, including unigrams, bigrams, trigrams and … well, what do we call the next one up? Manning and Schütze devoted an entire paragraph to it on page 193 that I absolutely love and thought would be fun to share for those who haven’t seen it.

Before continuing with model-building, let us pause for a brief interlude on naming. The cases of n-gram language models that people usually use are for n=2,3,4, and these alternatives are usually referred to as a bigram, a trigram, and a four-gram model, respectively. Revealing this will surely be enough to cause any Classicists who are reading this book to stop, and leave the field to uneducated engineering sorts: “gram” is a Greek root and so should be put together with Greek number prefixes. Shannon actually did use the term “digram”, but with the declining levels of education in recent decades, this usage has not survived. As non-prescriptive linguists, however, we think that the curious mix of English, Greek, and Latin that our colleagues actually use is quite fun. So we will not try to stamp it out. (1)

And footnote (1) follows this up with a note on four-grams.

1. Rather than “four-gram”, some people do make an attempt at appearing educated by saying “quadgram”, but this is not really correct use of a Latin number prefix (which would be “quadrigram”, cf. “quadrilateral”), let alone correct use of a Greek number prefix, which would give us “a tetragram model.”

In part to be cheeky, I went with “quadrigram” in my post, which was obviously a good choice as it has led to the term being the favorite word of the week for Ken Cho, my People Pattern cofounder, and the office in general. (“Hey Jason, got any good quadrigrams in our models?”)

If you want to try out some n-gram analysis, check out my followup blog post on using Unix, Mallet, and BerkeleyLM for analyzing SXSW titles. You can call 4+-grams whatever you like.
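If you just want to play with n-grams of whatever order (and whatever name) before reaching for those tools, the core idea fits in a few lines. Here’s a minimal Python sketch — the function name and example title are mine, not from any of the tools above:

```python
from collections import Counter

def ngrams(tokens, n):
    """Slide a window of size n over the token list and collect the tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

title = "the future of the future of social media".split()

# Unigrams, bigrams, trigrams, and — why not — quadrigrams.
for n in (1, 2, 3, 4):
    counts = Counter(ngrams(title, n))
    print(n, counts.most_common(2))
```

For a real corpus you’d tokenize more carefully and count across many titles, but the windowing trick is all there is to extracting the n-grams themselves.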

Note: This is a repost of a blog post about the Facebook emotional contagion experiment that I wrote on People Pattern’s blog.

This is the first in a series of posts responding to the controversial Facebook study on emotional contagion.

The past two weeks have seen a great deal of discussion around the recent computational social science study of Kramer, Guillory and Hancock (2014), “Experimental evidence of massive-scale emotional contagion through social networks”. I encourage you to read the published paper before getting caught up in the maelstrom of commentary. The wider issues are critical to address, and I have summarized the often conflicting but thoughtful perspectives below. These issues strike close to home, given our company’s expertise in computational linguistics and reliance on social media.

In this post, I provide a brief description of the original paper itself along with a synopsis of the many perspectives that have been put forth in the past two weeks. This post sets the stage for two posts to follow tomorrow and Tuesday next week that provide our take on the study plus our own Facebook-external opt-in version of the experiment, which anyone currently using Facebook can participate in.

Summary of the study

Kramer, Guillory and Hancock’s paper provides evidence that emotional states as expressed in social media posts are contagious, in that they affect whether readers of those posts reflect similar positive or negative emotional states in their own later posts. The evidence is based on an experiment involving about 700,000 Facebook users over a one-week period in January 2012. These users were split into four groups: a group that had a reduction in positive messages in their Facebook feed, another that had a reduction in negative messages, a control group that had an overall 5% reduction in posts, and a second control group that had a 2% reduction. Positivity and negativity were determined by using the LIWC word lists. LIWC, which was created and is maintained by my University of Texas at Austin colleague James Pennebaker, is a standard resource for psychological studies of emotional expression in language. Over the past two decades, it has been applied to language from varying sources, including speech, essays, and social media.

The study found a small but statistically significant difference in emotional expression between the positive suppression group and the control, and between the negative suppression group and the control. Basically, users who had positive posts suppressed produced slightly lower rates of positive word usage and slightly higher rates of negative word usage, and the mirror image of this was found for the negative suppression group (check out the plot for these). (This description of the study is short — see Nitin Madnani’s description for more detail and analysis.)

The study was published in PNAS, and then the shit hit the fan.

Objections to the study

Objections to the study and the infrastructure that made it possible have come from many sources. The two major complaints have to do with ethical considerations and research flaws.

The first major criticism is that the study was unethical. The key problem is that there was no informed consent: Facebook users had no idea that they were part of this study and had no opportunity to opt out of it. An important aspect of this is that the study conforms to the Facebook terms of service: Facebook has the right to experiment with feed filtering algorithms as part of improving its service. However, because Jeff Hancock is a Cornell University professor, many argue it should have passed Cornell’s IRB process. Furthermore, many feel that Facebook should obtain consent from users when running such experiments, whether for eventual publication or for in-company studies to improve the service. The editors of PNAS itself have issued an editorial expression of concern over the lack of informed consent and opt-out for subjects of the study. We agree this is an issue, so in our third post, we’ll introduce a way this can be achieved through an opt-in version of the study.

The second type of criticism is that the research is flawed or otherwise unconvincing. The most obvious issue is that the effect sizes are small. A subtler problem familiar to anyone who has done anything with sentiment analysis is that counting positive and negative words is a highly imperfect means for judging the positivity/negativity of a text (e.g. it does the wrong thing with negations and sarcasm — see Pang and Lee’s overview). Furthermore, the finding that reducing positive words seen leads to fewer positive words produced does not mean that the user’s actual mood was affected. We will return to this last point in tomorrow’s post.
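To make the word-counting critique concrete, here is a toy sketch of the approach — emphatically not LIWC itself; the word lists and function name are invented for illustration:

```python
# Tiny stand-in word lists; the real LIWC categories are far larger.
POSITIVE = {"happy", "great", "love", "wonderful"}
NEGATIVE = {"sad", "awful", "hate", "terrible"}

def emotion_rates(text):
    """Return (positive, negative) word rates as fractions of all tokens."""
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return pos / len(tokens), neg / len(tokens)

print(emotion_rates("I love this great day"))   # high positive rate
print(emotion_rates("I am not happy at all"))   # still scores as positive!
```

The second example shows the problem in one line: “not happy” registers as positive, because simple counting is blind to negation (and to sarcasm, scope, and much else).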

Support for the study

In response, several authors have joined the discussion to support the study and others similar to it, or to refute some aspects of the criticism leveled at it.

Several commentators have made unequivocal statements that the study would never have obtained IRB approval. This is in fact a misperception: Michelle Meyer provides a great overview of many aspects of IRB approval and concludes that this particular study could actually have legitimately passed the IRB process. A key point for her is that, had an IRB approved the study, it probably would have been the right decision. She concludes: “We can certainly have a conversation about the appropriateness of Facebook-like manipulations, data mining, and other 21st-century practices. But so long as we allow private entities freely to engage in these practices, we ought not unduly restrain academics trying to determine their effects.”

Another defense is that many concerns expressed about the study are misplaced. Tal Yarkoni argues “In Defense of Facebook” that many critics have inappropriately framed the experimental procedure as injecting positive or negative content into feeds, when in fact it was removal of content. Secondly, he notes that Facebook already manipulates users’ feeds, and this study is essentially business-as-usual in this respect. Yarkoni notes that it is a good thing that Facebook publishes such research: “by far the most likely outcome of the backlash Facebook is currently experiencing is that, in future, its leadership will be less likely to allow its data scientists to publish their findings in the scientific literature.” They will do the work regardless, but the public will have less visibility into the kinds of questions Facebook can ask and the capabilities they can build based on the answers they find.

Duncan Watts takes this to another level, saying that companies like Facebook actually have a moral obligation to conduct such research. He writes in the Guardian that the existence of social networks like Facebook gives us an amazing new platform for social science research, akin to the advent of the microscope. He argues that companies like Facebook, as the gatekeepers of such networks, must perform and disseminate research into questions such as how users are affected by the content they see.

Finally, such collaborations between industry and academia should be encouraged. Kate Niederhoffer and James Pennebaker argue that both industry and academia are best served through such collaborations and that the discussion around this study provides an excellent case study. In particular, the backlash against the study highlights the need for more rigor, awareness and openness about research methods, and for more explicit informed consent among clients or customers.

Wider issues raised by the study and the backlash against it

The backlash and the above responses have furthermore provided fertile ground for other observations and arguments based on subtler issues and questions that the study and the response to it have revealed.

One of my favorites is the observation that IRBs do not perform ethical oversight. danah boyd argues that the IRB review process itself is mistakenly viewed by many as a mechanism for ensuring research is ethical. She makes an insightful, non-obvious argument: that the main function of an IRB is to ensure a university is not liable for the activities of a given research project, and that focusing on questions of IRB approval for the Facebook study is beside the point. Furthermore, the real source of the backlash for her is public misunderstanding of, and growing negative sentiment toward, the practice of collecting and analyzing data about people using the tools of big data.

Another point is that the ethical boundaries and considerations between industry and academia are difficult to reconcile. Ed Felten writes that though the study conforms to Facebook’s terms of service, it is clearly inconsistent with the research community’s ethical standards. On one hand, this gap could lead to fewer collaborations between companies and university researchers, while on the other hand it could enable some university researchers to side-step IRB requirements by working with companies. Note that the opportunity for these sorts of collaborations arises naturally and reasonably frequently; for example, a professor’s student graduates and joins such a company, and they continue working together.

Zeynep Tufekci escalates the discussion to a much higher level—she argues that companies like Facebook are effectively engineering the public. According to Tufekci, this study isn’t the problem so much as it is symptomatic of the wider issue of how a corporate entity like Facebook has the power to target, model and manipulate users in very subtle ways. In a similar, though less polemical vein, Tarleton Gillespie notes the disconnect between Facebook’s promise to deliver a better experience to its users and how users perceive the role and ability of such algorithms. He notes that this leads to “a deeper discomfort about an information environment where the content is ours but the selection is theirs.”

In a follow up post responding to criticism of his “In Defense of Facebook” post, Tal Yarkoni points out that the real problem is the lack of regulations/frameworks for what can be done with online data, especially that collected by private entities like Facebook. He suggests the best thing is to reserve judgment with respect to questions of ethics for this particular paper, but that the incident does certainly highlight the need for “a new set of regulations that provide a unitary code for dealing with consumer data across the board–i.e., in both research and non-research contexts.”

Perhaps the most striking thing about the Kramer, Guillory and Hancock paper is how the ensuing discussion has highlighted many deep and important aspects of the ethics of research in computational social science from both industry and university perspectives, and the subtleties that lie therein.

Summing up

A standard blithe rejoinder to users of services like Facebook who express concern, or even horror, about studies like this is to say “Don’t you see that when you use a service you don’t pay for, you are not the customer, you are the product?” This is certainly true in many ways, and it merits repeating again and again. However, it of course doesn’t absolve corporations from the responsibility to treat their users with respect and regard for their well-being.

I don’t think either the researchers or Facebook itself was grossly negligent with respect to this study, but the study is nonetheless in an ethical gray zone. Our second post will touch on other activities, such as A/B testing in ad placement and content, that are arguably in that same gray zone but have not created a public outcry even after years of being practiced. It will also say more about how the linguistic framing of the study itself essentially primed the extreme backlash that was observed, and how the study is in many ways more innocuous than its own wording would suggest.

Our third post will introduce our own opt-in version of the study, which we think is a reasonable way to explore the questions posed in the study. We’d love to get plenty of folks to try it out, and we’ll even let participants guess whether they were in the positive or negative group. Stay tuned!