“Troubling Oddities” In A Social Psychology Data Set

By Neuroskeptic | February 6, 2016 12:16 pm

A potential case of data manipulation has been uncovered in a psychology paper. The suspect article is ‘Why money meanings matter in decisions to donate time and money’ (2013) from psychologists Promothesh Chatterjee, Randall L. Rose, and Jayati Sinha.

This study fell into the genre of ‘social priming’, specifically ‘money priming’. The authors reported that making people think about cash reduces their willingness to help others, while thinking of credit cards has the opposite effect.

Now, a critical group of researchers led by Hal Pashler allege “troubling oddities” in the data. Pashler et al.’s paper is followed by three responses, one from each of the original authors (Chatterjee, Rose, Sinha), and finally by a summing-up from the critics. Pashler et al. recently published a failure to replicate several money priming effects.

Pashler et al. focus on Chatterjee et al.’s Study #3, the last of the three experiments reported in the paper; they report having some concerns about the other two studies as well, but they don’t go into much detail.

The “odd” data in Study #3 comes from a word completion task. In this paradigm, participants are shown ‘word stems’ and asked to complete them with the first word they think of. For example, the stem might be BR___; I, being a neuroscientist, might write BRAIN, while you, feeling hungry, might write BRUNCH.

Pashler et al. say that 20 participants (out of 94 who completed Study #3) gave a strikingly similar pattern of word-stem responses. Specifically, these 20 participants tended to give the same answers to nine ‘filler’ items, which were chosen so as not to be affected by the money vs. credit card priming. Here are the raw responses:


The sets of words are not identical, but most of them differ in only one or two words from the “consensus” answers within the block. Pashler et al. say that this is extremely unlikely to have happened by chance, and they raise the possibility that these 20 participants were “reduplicated” – essentially, copy-pasted. The few differences may then have been added manually, to make the data look less suspicious.

Couldn’t it be that people just tend to complete these stems the same way? Pashler et al. say that this is prima facie unlikely – is “SPOOK” really the obvious completion for “SPO_”? More importantly, they also show that the degree of overlap is far higher than would be expected by chance: the other 74 participants, who Pashler et al. seem to accept as real, gave much more divergent answers.
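Just how far beyond chance this is can be sketched with a quick back-of-envelope calculation. The per-stem match probability below (0.1, i.e. roughly ten plausible completions per stem) is invented for illustration; the real value would have to be estimated from the observed response frequencies:

```python
from math import comb

# Assumed probability that two independent participants give the SAME
# completion for a given filler stem (invented for illustration; the
# real value would come from the observed response frequencies).
p_match = 0.1
n_stems = 9        # filler items
n_subjects = 94    # participants in Study #3

# P(a given pair of participants agrees on at least 8 of the 9 stems)
p_pair = sum(comb(n_stems, k) * p_match**k * (1 - p_match)**(n_stems - k)
             for k in range(8, n_stems + 1))

# Expected number of such near-duplicate pairs among all 94 participants
n_pairs = comb(n_subjects, 2)
expected_pairs = n_pairs * p_pair
print(f"Expected near-duplicate pairs by chance: {expected_pairs:.5f}")
```

Under these (admittedly invented) assumptions, far fewer than one near-duplicate pair per dataset would arise by chance, whereas Pashler et al. report a block of 20 such participants.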

What makes Pashler et al.’s point more remarkable is that these 20 participants were not selected (or cherry-picked) on the basis of their similar responses to the filler items. They were selected for an entirely different reason – because they form two subgroups who showed a dramatic response to the priming manipulation.

Priming response in the word completion task was defined by the number of ‘cost’ and ‘benefit’ word stems completed in a particular way; there were 8 stems of each kind. If we plot the number of ‘cost’ and ‘benefit’ completions per participant, we get a scatter plot. Two outlying subgroups are apparent: these groups show a strong priming effect in the expected direction (i.e., they provide evidence that money priming works).


These are the same 20 participants in whom the responses to the filler items are extremely similar. Hmm.

So this is the main “troubling oddity”. Pashler et al. also report other strange features, such as a number of subjects who made the same invalid responses: for example, six participants wrote “SURGERY” for the stem “SUPP__”. This, they say, could be evidence that someone manually changed copy-pasted responses, forgetting what the stem was.

In my view, Pashler et al. are right: these data are extremely odd. True, there is no proof of misconduct here, or even of honest error. These data could be real. It seems extremely improbable, however.

That said, it’s hard to say exactly how unlikely these results are. The authors, in their various rebuttals, raise the possibility that people who are highly susceptible to priming (i.e. the 20 “odd” participants) are psychologically similar to one another, and therefore tend to give similar word completions, even to filler words. Pashler et al. dispute this defense, saying that the ‘priming susceptibility’ effect would have to be enormous in order to account for the data, but it is impossible to rule it out completely.

Overall, I think we’re faced with a situation similar to that of Jens Förster. Förster is a German psychologist who in 2014 was shown to have published papers containing extremely improbable data. Many of these papers have since been retracted, but Förster denies any wrongdoing, and he has defended himself by saying that some unknown mechanism could have generated the odd statistical patterns.

In this case, none of the authors have confessed to wrongdoing. They have, however, reportedly agreed to retract Study #3, and two of them have now disclaimed any involvement in handling the data for that study. According to Pashler et al. in their summing up:

Shortly after our paper was accepted for publication, we learned that all of the original authors had apparently decided amongst themselves that Study 3 should be “retracted.” As far as we know, they have not explained precisely what that means or exactly why they wish this partial retraction to take place, beyond referring to alleged “coding errors”…

From the authors’ commentaries on our paper, it seemed to us that two of the three authors (Rose and Sinha) wish it to be known that they had no personal involvement in the data analysis. Sinha stated that the first author (Chatterjee) was exclusively responsible for “data merging,” data coding, and data analysis. Rose goes further to say that he had no involvement in either data collection or data analysis.

Pashler, H., Rohrer, D., Abramson, I., Wolfson, T., & Harris, C. (2016). A Social Priming Data Set With Troubling Oddities. Basic and Applied Social Psychology, 38(1), 3-18. DOI: 10.1080/01973533.2015.1124767

  • smut clyde

    This looks like a job for cluster analysis or multidimensional scaling… if you define a measure of dissimilarity between completion data for each pair of subjects, then clusters of similar subjects would stand out immediately in the MDS solution.
    Yes, I do a lot of MDS.

    Pashler &c also point out that the purported effect sizes are preposterous.

    And purportedly, subjects in the “credit-primed” condition gave away 3/4 of their (token) reward for the task, to a fake charity offered by the experimenters (compared to the unprimed group, who only gave back half their reward, and the cash-primed group, who gave back 1/4). I’m sorry, if my employer invited me to give back 73% of my earnings to the employer’s own slush-fund, I would not be so generous.

    • Neuroskeptic

      “This looks like a job for cluster analysis”

      I thought that this was such a good idea that I implemented it in MATLAB. There are 94 participants in the dataset (data is available here). For each pair of participants I defined the dissimilarity between them as the number of words for which they didn’t give an identical answer. There are 9 filler/neutral words so dissimilarity ranges from 0 to 9.

      I calculated the dissimilarity matrix and performed cluster analysis using MATLAB’s linkage() and cluster() commands.
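      For readers without MATLAB, here is a rough Python equivalent of that pipeline using SciPy; the four toy response rows below are made up, standing in for the real 94-participant dataset:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Toy stand-in for the real data: each row is one participant's
# answers to the 9 filler word stems (invented for illustration).
responses = [
    ["SPOOK", "BREAD", "STOP", "READ", "LAMP", "TRAIN", "CUP", "DOOR", "FISH"],
    ["SPOOK", "BREAD", "STOP", "READ", "LAMP", "TRAIN", "CUP", "DOOR", "FISH"],
    ["SPORT", "BRAIN", "STEP", "ROAD", "LIMP", "TRAIL", "CAP", "DEER", "FAST"],
    ["SPOON", "BRAND", "STAR", "RAID", "LOOP", "TRACK", "COP", "DUST", "FORK"],
]

n = len(responses)
# Dissimilarity = number of stems (out of 9) answered differently.
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        d = sum(a != b for a, b in zip(responses[i], responses[j]))
        dist[i, j] = dist[j, i] = d

# Hierarchical clustering on the condensed distance matrix, analogous
# to MATLAB's linkage()/cluster() commands.
Z = linkage(squareform(dist), method="average")
labels = fcluster(Z, t=3, criterion="distance")  # cut tree at dissimilarity 3
print(labels)
```

With the toy data, the two identical participants fall into one cluster and the two divergent ones become singletons; on the real dataset the same procedure would surface the clusters described below.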

      Here’s the preliminary result:

      There are three major clusters and several minor ones. Two of the major clusters are quite closely related to each other – these represent the (0,5) and (5,0) groups from Pashler et al. The third corresponds to the (0,0) cluster, defined by the word “SURGERY”, also noted by Pashler et al.

      The minor clusters are, I think, a new discovery. Most of them consist of just two participants, but they often have extremely high similarity. For instance, two participants matched on 9/9 words, including two misspellings, “RREAD” and “SOPT”. Clearly something has gone wrong here.

      These “twin” participants might be nothing more than a data entry error but there are a lot of them and often they match on 7/9 or 8/9 words, rather than 9/9, so they can’t be a simple row duplication error…

      Could these twin participants be the “8 coding errors” which Chatterjee et al. claim to have discovered and which formed the basis for the retraction?

      Possibly – I count more than 8 participants affected but Chatterjee et al. might not have found all of them.

      • D Samuel Schwarzkopf

        Those replicated typos are really troubling. Of course, some typos are quite common but are these? And even if they are, the odds of several appearing in the same two people may still be low.

        • Neuroskeptic

          Yeah and there’s several other examples of the same thing.

          In my view the most charitable explanation is that the typos were introduced by whoever entered the data into Excel, rather than by the participants. So we’d only have one person making the same typo, not two.

          But this wouldn’t explain all the other similarities between the “pairs”. Unless we assume that the data entry person was also copy pasting participants for some reason!

          • David_in_Oregon

            If the data was hand-typed into a computer, could autocorrect be the evildoer?

          • reed1v

            Data entry into an Excel spreadsheet does not have “autocorrect” for numerics. How could it?

      • smut clyde

        I love heat maps.

        • Neuroskeptic

          The heat is on… will this paper get cooked?

          Using cluster analysis, I believe I have discovered concerning new abnormalities in the Study #2 data. Things are heating up.

  • Ibn

    Here’s my problem with social priming and the like: the complete lack of a sense of proportion. It’s rarely acknowledged how grandiose the claim is that everyday experiences, which we generally regard as having very minor and indirect effects on our decisions, in fact have a huge effect on how we act. This is close to what quantum entanglement is to the classical view of physics – a riddle that’s about 80 years old, which has had an unbelievable epistemological machinery thrown at it, meticulously eliminating logical fallacies one by one, and still there are strong critical voices within physics saying that QM is not properly understood. In comparison, the methodological basis of social priming amounts to little more than a guess, the theory is not underdeveloped but nonexistent, and the people working on it are about as well trained in formal epistemology as random high school students. Given this, data falsification is the least of the problems.

  • Michael Milburn

    Is it common for authors of papers to not be involved with verifying data integrity and/or collection? If so, should it be?

  • andrew oh-willeke

    This study is also notable because, even if everything was as claimed, it wouldn’t be all that powerful. A twenty-person word completion test, with background chance probabilities that are so ill-established, forming just part of a study in a low-profile journal, just doesn’t tell you very much. It wouldn’t merit submission as a paper on its own, and that is part of what makes me inclined to believe that the original error may have been honest, even if the response was less than fully forthcoming. But the non-random deviations from pure cut-and-paste cast some doubt on that.

    This is not like the data tweaking in Mendel’s famous pea plant study, which later statistical analysis established was almost certainly doctored to some degree. That paper was making a revolutionary new claim in a ground-breaking article, and the real data probably did support his hypothesis, just not so unequivocally (and no statistical test in wide use at the time could have detected his data tweaks). This paper, in contrast, just doesn’t seem to justify the risk, since far more people now have the ability to review and analyze the data; but I guess people can get tunnel vision.


About Neuroskeptic

Neuroskeptic is a British neuroscientist who takes a skeptical look at his own field, and beyond. His blog offers a look at the latest developments in neuroscience, psychiatry and psychology through a critical lens.
