S04E20: The Herb Garden Germination

Tonight is our first-ever guest post. It is by my close friend Kristina Lerman. Kristina and I met the first week of freshman year, when we were both physics majors. We spent many years together working on problem sets, which is how physicists like to spend their twenties. After getting her Ph.D. in physics from the University of California, Santa Barbara, Kristina became an expert in the mathematics of networks, especially online networks, long before “social networking” became a buzzword. So when there was a line in tonight’s script on meme theory by Amy Farrah Fowler, I immediately called Kristina for help. Now she’s been kind enough to explain to us the science behind tonight’s episode. So without further ado…

Tonight’s guest blogger: Prof. Kristina Lerman

——————————–

(By Kristina Lerman)

AMY:  Meme theory suggests items of gossip are like living organisms that seek to reproduce using humans as their hosts.

In this episode, Sheldon and Amy discover that memes, or items of gossip and other information, are like infectious organisms that reproduce themselves using humans as hosts. They engage in a bit of “memetic epidemiology” as they conduct social experiments on their friends to test the theory that tantalizing pieces of gossip make stronger, more virulent memes that spread faster and farther among their friends than mundane pieces of information.

The idea that information moves through a social group like an infectious disease has itself proved to be a powerful meme. This analogy has informed sociologists’ attempts to understand many diverse phenomena, including adoption of innovations, the spread of fads and fashion, word-of-mouth recommendations, and social media campaigns. The analogy becomes even stronger when social interactions are encoded within a friendship graph, the so-called social network. In a social epidemic each informed, or “infected,” individual infects her network neighbors with some probability given by the transmissibility, which measures how contagious the infection is. Understanding social epidemics is crucial to identifying influential people, predicting how far epidemics will spread, and identifying methods to enhance or impede their progress. Advertisers and social media consultants have been busy devising “viral” marketing strategies. Much like an epidemiologist might advise people on ways to reduce the transmissibility of a virus (wash hands), or if that fails, figure out who should be vaccinated to limit its spread (kindergarten teachers in many cases), marketing types are interested in identifying individuals who will generate the greatest buzz if they receive free products and other incentives.

Though theoretical progress has been brisk, until recently, empirical studies of epidemics were limited to taking case histories of sick people and attempting to trace their contacts. The advent of social media has changed that. People are joining social media sites such as Twitter, Digg, Flickr, and YouTube to find interesting content and connect with friends and like-minded people through online social networks. Traces of human activity that are exposed by the sites have given scientists treasure troves of data about individual and group behavior. This data has given social science an empirical grounding that many physicists find irresistible. As a result, physicists (author included) have flooded the field, much to the chagrin of practicing social scientists. In the culture wars of science, physicists often come off as arrogant, like Sheldon, but that is the price of being right.

The detailed data about human behavior on social media sites has allowed us to quantitatively study the dynamics of social epidemics. In my own work I study how information spreads on Digg and Twitter. These sites allow users to add friends to their social network whose activities they want to follow. A user becomes infected by voting for (digging) or tweeting a story and exposes her network neighbors to it. Each neighbor may in turn become infected (i.e., vote or retweet), exposing her own neighbors to it, and so on. This way interest in a story cascades through the network. This data enables us to trace the flow of information along social links. We found that social epidemics look and spread very differently from diseases on networks. Contrary to our expectations, the vast majority of information cascades grew slowly and failed to reach “epidemic” proportions. In fact, on Digg, these cascades reached fewer than 1% of users.

There are a number of factors that could explain this observation. Perhaps users modulate the transmissibility of stories to stay within a narrow range near the threshold to prevent information overload. Perhaps the structure of the network (e.g., clustering or communities) limits the spread of information. Or it could be that the mechanism of social contagion, in other words, how people decide to vote for a story once their friends have voted for it, prevents interest in stories from growing. We examined these hypotheses through simulations of epidemic processes on networks and empirical study of real information cascades.

We found that while network structure somewhat limits the growth of cascades, a far more dramatic effect comes from the social contagion mechanism. Unlike the standard models of disease spread used in previous works on epidemics, repeated exposure to the same story does not make the user more likely to vote for it. We defined an alternative contagion mechanism that fits empirical observations and showed that it reproduces the observed properties of real information cascades on Digg.

(Longer version:  Specifically, we simulated the independent cascade model that is widely used to study epidemics on networks.   Each simulated cascade began with a single seed node who voted for a story. By analogy with epidemic processes, we call this node infected. The susceptible followers of the seed node decide to vote on the story with some probability given by the transmissibility, λ (lambda). Every node can vote for the story once, so at this point the seed node is removed, and we repeat the process with the newly infected nodes. A node who is following n voting nodes has n independent chances to decide to vote. Intuitively, this assumption implies that you are more likely to become infected if many of your friends are infected. )
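The procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code used in the study: the function name `independent_cascade`, the `followers` dictionary format, and the four-node toy graph are all hypothetical, and the real simulations ran on the Digg follower graph.

```python
import random
from collections import deque

def independent_cascade(followers, seed, lam, rng):
    """Simulate one cascade and return the set of nodes that voted.

    followers: dict mapping a node to the list of nodes that follow it
               (hypothetical input format chosen for this sketch)
    seed:      the single node that votes first
    lam:       transmissibility, the chance that one voting friend
               triggers a vote in a susceptible follower
    rng:       any object with a random() method, e.g. random.Random
    """
    infected = {seed}
    frontier = deque([seed])
    while frontier:
        # Each voter exposes its followers exactly once, then is removed.
        voter = frontier.popleft()
        for follower in followers.get(voter, []):
            # A follower of n voters is reached n times by this loop,
            # so it gets n independent chances to vote.
            if follower not in infected and rng.random() < lam:
                infected.add(follower)
                frontier.append(follower)
    return infected

# Hypothetical toy graph: node 0 is followed by 1 and 2, and so on.
toy_followers = {0: [1, 2], 1: [2, 3], 2: [3], 3: []}

rng = random.Random(42)
sizes = [len(independent_cascade(toy_followers, 0, 0.5, rng)) for _ in range(1000)]
print("mean cascade size at lambda = 0.5:", sum(sizes) / len(sizes))
```

The cascade size is just the size of the returned set. At λ = 0 only the seed votes, and at λ = 1 every node reachable from the seed votes.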

Cascade size as a function of transmissibility λ (lambda) for simulated cascades on the Digg graph and the randomized graph with the same degree distribution. Heterogeneous mean field predicts cascade size as a fraction of the nodes affected. The line (hmf) reports these predictions multiplied by the total number of nodes in the Digg network.

After some time, no new nodes are infected, and the cascade stops. The final number of infected nodes gives the cascade size. These are shown in the figure above, where each point represents a single cascade with the y-axis giving the final cascade size and the x-axis giving the transmissibility, λ. Blue dots represent cascades on the original Digg graph, while pink dots represent cascades on a randomized version of the Digg graph, and the gold line gives theoretical predictions. In both simulations, there exists a critical value of λ, the epidemic threshold, below which cascades quickly die out and above which they spread to a significant fraction of the graph.

Comparing the theoretical and simulation results to real cascades presents a puzzle. Why are cascades so small? According to our cascade model, only transmissibilities in a very narrow range near the threshold produce cascades of the appropriate size of ~500 votes. Clearly, the network structure is not enough to explain the difference. To delve deeper, we looked at the contagion mechanism itself. We measured the probability that a Digg user votes for a story given that n of his friends have voted. We found that the independent cascade model grossly overestimates the probability of a vote even with 2 or 3 voting friends. In fact, we found that multiple exposures to a story only marginally increase the probability of voting for it.
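To see why the independent cascade model predicts a rapid rise with the number of voting friends, note that n independent chances, each succeeding with probability λ, give a vote probability of 1 − (1 − λ)^n. The short sketch below just evaluates that formula; the function name and the value λ = 0.1 are illustrative assumptions, not numbers measured on Digg.

```python
def vote_probability_independent(lam, n):
    """Chance of voting after n independent exposures, each with chance lam."""
    # The user fails to vote only if all n independent chances fail.
    return 1.0 - (1.0 - lam) ** n

# Illustrative value only; not a transmissibility measured on Digg.
lam = 0.1
for n in (1, 2, 3):
    print(n, "voting friends:", round(vote_probability_independent(lam, n), 3))
```

With this illustrative λ, the model's probability nearly doubles from one voting friend to two (0.1 to 0.19) and reaches 0.271 with three, whereas the measured probability on Digg rose only marginally with additional exposures.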

Cascade size vs inferred transmissibility for simulated and real cascades on the Digg graph. HMF prediction of cascade size is shown for reference.

After simulating information cascades using the new contagion mechanism, we found that their size is an order of magnitude smaller than before, as shown in the figure above. The size of the real Digg cascades is similar to the simulated cascades, giving us confidence that we have uncovered the mechanism that limits the spread of information. These findings underscore the fundamental difference between the spread of information and disease: despite multiple opportunities for infection within a social group, people are less likely to become spreaders of information with repeated exposure.
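One simple way to picture the difference between the two contagion mechanisms is the sketch below. It is an assumption-laden illustration, not the mechanism fitted in the study: it takes the statement that repeated exposure does not raise the chance of voting to its extreme, giving each user a single chance on first exposure. The function name, the `repeat_exposures_count` flag, and the toy graph are hypothetical.

```python
import random
from collections import deque

def simulate_cascade(followers, seed, lam, rng, repeat_exposures_count):
    """Return the set of voters in one simulated cascade.

    repeat_exposures_count=True  -> independent cascade: every voting
                                    friend gives a fresh chance to vote.
    repeat_exposures_count=False -> simplified alternative (an assumption
                                    of this sketch): only the first
                                    exposure gives a chance to vote.
    """
    voted = {seed}
    exposed = {seed}
    frontier = deque([seed])
    while frontier:
        voter = frontier.popleft()
        for follower in followers.get(voter, []):
            if follower in voted:
                continue
            if not repeat_exposures_count and follower in exposed:
                continue  # later exposures add nothing in this variant
            exposed.add(follower)
            if rng.random() < lam:
                voted.add(follower)
                frontier.append(follower)
    return voted

# Hypothetical toy graph in which node 3 follows both node 1 and node 2.
toy_followers = {0: [1, 2], 1: [3], 2: [3], 3: []}

rng = random.Random(0)
for flag in (True, False):
    sizes = [len(simulate_cascade(toy_followers, 0, 0.5, rng, flag))
             for _ in range(2000)]
    print("repeat exposures count:", flag, "mean size:", sum(sizes) / len(sizes))
```

In the single-chance variant a user with several voting friends is no more likely to vote than a user with one, so cascades can only come out the same size or smaller than under the independent cascade model for the same sequence of first exposures.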

34 Responses to “S04E20: The Herb Garden Germination”

  1. Randall Says:

    Or to put it in layman’s terms, you can’t force a meme. (Note: Link goes to Encyclopedia Dramatica, and while this particular entry is safe for work, the site as a whole is not.)

  2. Tejaswy Says:

    How about the author who was giving the introduction of his book at the beginning?

  3. Work featured on the Big Blog theory « Apparent Horizons Says:

    […] science consultant for the show asked my colleague Kristina Lerman to write about the topic for the Big Bang Theory blog. She mentions our recent paper (previous post). Unfortunately, although she gave the science […]

  4. watcher Says:

    I read that the show’s creators read some of Brian Greene’s books to find out about physics. I might check out some of Greene’s books at the library!

    • comrade_bazarov Says:

      @Watcher: His first one (Elegant Universe) is better than Fabric of Cosmos. Overall, he does tend to toot his own horn a bit too much and his books are definitely not at par with some of the others in the trade right now (like Stephen Hawking, for example)

  5. watcher Says:

    @comrade_bazarov, thanks for the tip.

    Also, Greene’s “The Hidden Reality,” about possible parallel universes sounds interesting.

  6. Zig zag foiled Says:

    awww… it showed the clips! The surprise is ruined! Doubt is the fun!!!!

  7. Pat Mächler Says:

    “In the culture wars of science, physicists often come off as arrogant […] but that is the price of being right.”

    I hope that’s irony in there…(?)
    Just to name a few occasions where the physicist community agreed to revise their world view, after some time:
    – classical mechanics
    – Lorentz ether theory
    – the comparison between steam engines and female biology, brought forward by a group of European thermodynamic scientists around 1900 as an argument to prohibit women from studying (look up the publications of Dorit Heinsohn)

  8. feldfrei Says:

    This sound just created by another virus?

    (please forgive me – but JSB is just great 🙂

  9. Andrew Kazyrevich Says:

    Thanks, interesting findings!

    I’d expect that one has equal probability of catching a disease from any of his infected friends – however, the probability of passing over a meme depends on the concrete person you get the meme from.

    I wonder how that had been addressed in the research!

    • Kristina Says:

      That’s a good question. Of course, fundamentally, the probability of infecting a friend with a meme depends on details of friend’s interest in the meme, and ultimately maybe even what he had for breakfast that morning, what the weather was and how the stock market did. The beauty of the type of statistical approaches that we are using (and the basis for criticism for them as well), is that they average over the details like this to produce a description of “typical” behavior. It is supposed to get the “average” behavior right – in other words, if we observed many memes with similar transmissibility, on average, they would spread as far as the model predicts. In practice, we stretch the predictions and apply them to individual memes, where the models don’t perform as well. But, if we make the models more complex to include more details, for example, the topics friends are interested in, we will get more accurate predictions for the propagation of individual memes. This, at least, is the direction I am taking with my research.

      • Andrew Kazyrevich Says:

        Hi Kristina,

        I was making a point that one will be probably more inclined to pass over say, some news about XYZ, if heard them from XYZ specialist (rather than from a mere neighbor).

        ..which I assumed your research didn’t account to, and you just confirmed that.

        On second thought, however, I started to doubt that the average behavior is like I described above 🙂 The average behavior seems to be passing over any news that seems exciting/fun/weird/etc regardless of credibility of news source. So your statistical route is probably the right take 🙂

        (I’ve replaced the concept of “meme” with “news” which I assume is a safe bet!)

        All the best, thanks for taking time to answer my question.
        Andrew

  10. Peter Cullen Says:

    This is outstanding. I can’t believe in a few short years the US has produced such interesting stuff! It seemed only a few years ago that to approach life intelligently and passionately meant hiding oneself away from a mainstream of codswallop. Son of a climate-change oceanographer, and a PhD in economic history, myself, growing up always meant questioning the validity of the background information presented TV shows – usually beginning with Pa’s disgusted grunt at some impossible stretch of physics or chemistry, followed by my Mum’s request that he not ruin the show for the rest of us while my brother and I went to find books to explain it. Before internet, we didn’t really have tonnes of books on physics at home – but anyway.
    I do have a question/comment, however. In “The Zazzy Substitution” – wouldn’t Sheldon have a trump case to make by citing Kandel’s (and many others – including Leon Cooper’s) work on neural plasticity and the molecular physics of memory – satisfying his need for field superiority? It seems to me that neurological mapping of any of Sheldon’s theories would, in itself, imply using a great deal of molecular physics and chemistry. Waddayatink?
    Anyway – THANK you for making this an engaging as well as enjoyable show! – 1 step closer to building a better America! All the best to you!
    Pete Cullen
    Language and Culture for Business
    Faculty of Foreign Languages and Literature
    Piazza Rinascimento 7
    61029 Urbino (PU)
    Italy

  11. Nira Says:

    Dr. Saltzberg,
    I have a question regarding a whiteboard in the pilot episode. I cannot clearly read the joke/spoof of the Born-Oppenheimer Approximation. It is hard to see, but I think the symbols below the CKM matrix represent, “CP-violating phase doesn’t equal 0 =>” I can’t make out what symbol(s) comes after the “=>” arrow. Could you please explain what symbols are written under the matrix? This has been driving me crazy trying to figure the joke out. I’m a medical student, not a physicist, but am having fun trying to learn more physics. Sorry if this is the wrong place for this post; I’m unsure where to ask about this. Thank you!!!

    • David Saltzberg Says:

      Alas, that was the pilot episode, before we had all the kinks worked out of the system. The whiteboard equation that went with that line was on a different board that was moved at the last minute. That would never happen now. (It was actually correct in the first version of the pilot, but that was never aired.) What was actually on the correct board in the unaired pilot, let’s leave as a deeply hidden Easter Egg.

  12. MNC Says:

    Hi there, big fan of your blog! I was just wondering if there will be an update for the last episodes of this season or if we have to wait for the next season to see your blog return? (not to seem pushy, or anything.. I just really enjoy reading your posts)

    Somewhat unrelated… I feel like the science played too small a role in the last episodes of this season (and instead we got a lot more sitcom-staple story lines). Does anyone share this perception?

  13. leo Says:

    So when is the cast going on holiday to “Monster Camp” to further explore the cascading effects of the nerd cultural meme?

  14. david Says:

    is this blog dead?

    • David Saltzberg Says:

      No. I didn’t get to the last four episodes. When the next season starts up again in September I will start.

  15. steve Says:

    updates?

  16. steve Says:

    updates? there have been new shows since april….

  17. Li Voon Says:

    Looking forward to reading more science behind the show:)

  18. La explicación de la primera pizarra de Sheldon en la serie “Big Bang” « Francis (th)E mule Science's News Says:

    […] que tiene un blog en el que explica la física de la serie “The Big Blog Theory.” El 25 de mayo de 2011 una tal Nira le preguntó por la última línea de la pizarra de Sheldon y David le […]

  19. Gil Silberman Says:

    Physicists… have flooded the field [of social network analysis], much to the chagrin of practicing social scientists. In the culture wars of science, physicists often come off as arrogant… but that is the price of being right.

    Classic.

  20. Liz Pullen (@nwjerseyliz) Says:

    The fact that you can consider drawing a strict analogy between a virus spreading an epidemic among a population and a meme spreading information among a social group shows that you need a rudimentary education in social behavior. One does not choose to be vulnerable to a virus that one comes into contact with. But individuals do choose when, where, how and to whom to share a news story, photo or anecdote. If, for example, I hear a funny news story worth sharing, I could tell the people I work with (verbal communication), send it to my family from a newspaper site (email), ReBlog a blog post about it on Tumblr, Like it when I see someone posting the same story the next day on Facebook and/or a week later, ReTweet a link to the news story when I see it come across my Tweetstream. Different modes of communication, same result – sharing a story. Sometimes I initiate the spread and direct my version of the story to particular individuals in specific areas of my life, other times I merely give my nod of approval to someone else’s version of the story and recirculate it among a wide audience. Different social circles, different ways of communication, different time frames, differing levels of involvement on my part (active vs. passive). Quite a few more variables than merely being exposed to a virus in one instance.

    And the selection of Digg, an insular, parochial, dying social network, instead of a vital, international community like Facebook or Twitter was a fatal judgment call. Most of the people I know who were most active on Digg abandoned it several years ago and its community has shriveled.


