Colin F. Camerer
(comments entirely personal, coauthors do not endorse)
This post is about the editorial process on “General Economic Principles of Bargaining and Trade: Evidence from 2,000 Classroom Experiments” that just came out in Nature Human Behaviour on August 3rd, 2020. We were actually generally happy about the entire three-stop process, except for a short unhelpful review from PNAS (it’s reprinted in full below). Even that, however, is a good teaching example of “How not to peer-review a paper” which early-career researchers and some schadenfreude-havers may find useful.
Stop one: No shoes, no shirt, no incentives? No service!
The first place we submitted was AER: Insights (September 15 to November 4, 2018 = 50 days; not bad!). It is a newish short-form journal similar to Science, Psychological Science, etc. The conclusions of our paper are easily conveyed in a small number of compact graphs. I have a lot of experience publishing in both economics journals and short-form general science journals (probably more than anyone except Ernst Fehr and perhaps some of his coauthors) and thought this would fit the short form.
Referees were all unenthusiastic, for slightly different reasons that were not too surprising to us. The main issue is that double auctions (DAs) and ultimatum games have been studied a lot, and there are some cross-country and cross-cultural comparisons. But it is easy to judge a paper either by whether its conclusion is brand new or by whether it adds important evidence to a conclusion that is thought to hold.
The key to our paper is that there has never been a large series of DA experiments with a common design (trading rules, and induced supply and demand), across lab settings and populations. Never. The most knowledgeable experimental economists might be surprised I am making this claim.
But it’s true, and here’s why. Until relatively recently (c. 2005?), lab experiments with many subjects (e.g., 10-20) interacting in a market were run in brick-and-mortar labs. The first wave of pioneering experiments in the 1970s and 1980s (following Smith, 1962) used relatively small samples of sessions, and design features were often a little idiosyncratic. After establishing surprisingly good and fast convergence in centralized DAs, experimental economists quickly moved on to study market power, monopoly, practices facilitating collusion, different trading rules (e.g., posted pricing), and so on. Nobody bothered to do the basic, uncreative work of just replicating a single DA design many, many times. There was also no professional incentive to do so, if you are trying to build a career by showing creativity and asking interesting new questions.
As a result, if you look in Davis and Holt’s beautiful encyclopedic textbook Experimental Economics (and others), you will not see graphs like ours that compile hundreds of identical results. Instead, you will see an intriguing series of graphs showing individual experiments on the effect of different trading rules.
With that background, here are some highlights of what referees had to say:
Referee #1 says “There is not much new reported but for the massive number of classroom games reported on.”
S/he also says “There is some testing of models for the double auction market; e.g., zero intelligence buyers and sellers. But I don’t know anyone who took this model seriously as an explanation for behavior in DA markets other than to suggest that it’s the institution that can, and quite likely does, generate equilibrium outcomes.” People attending this upcoming conference at Yale in October 2020 [https://som.yale.edu/event/2020/10/the-first-conference-on-zerominimal-intelligence-agents-now-virtual] might beg to differ about taking zero-intelligence seriously (although the referee’s point is understandable).
Referee #2 thought we tried to do too much including both double auction (DA) and ultimatum data in one paper. That is a very reasonable criticism, and a helpfully constructive one.
S/he also notes “The discussion is a bit superficial at times, presumably because the authors were constrained by the limits that AERI’s short paper format imposes.”
The problem here is that if you are not in the business of reviewing short-form papers, such papers do seem superficial (because they are!). There is simply not enough space to describe as much detail as in a longer-form paper.
Referee #3 also says the main results have been “known for decades”. Depending what is meant by “known”, there is some truth to that (see my preface comment above).
S/he also says “Economics experiments without financial motivations are generally not publishable in refereed journals, and I am not convinced that the large quantity of data, in this case, justifies an exception.” The justification for rejection here is that you literally cannot publish economics experiments without financial motivations (although exceptions might be made).
Stop two: PNAS and some alleged shark-jumping
Then we tried PNAS (January 12 to March 20, 2019 = 67 days. A little slow for PNAS, but we cannot complain at all about speed). Here is one of the two reviews at PNAS. The Editor-in-Chief also noted, “I read the paper myself and with some reluctance sided with referee #2.”
[Referee #2 report] This paper is one of several that now looks at replication. Many well done ones by some of these very authors. I agree with the movement and I have refereed several of the papers that have been published in the top general interest journals. At that point I strongly urged publication. But I think we have now “jumped the proverbial shark.” What is next, sharing games? Centipede games? I urge the authors to keep doing these exercises, but I don’t see these as a general contribution at this point to merit PNAS pages.
Here is what PNAS says to reviewers [https://www.pnas.org/page/authors/reviewers]:
“Besides giving authors insight into deficiencies in the submitted work, reviewer comments should acknowledge positive aspects of the material under review, present negative aspects constructively, and indicate the improvements needed. Reviewers should explain and support their judgment so that editors and authors may understand the basis of the comments.”
It seems to me that Reviewer #2 managed to break all the guidelines in only 93 words. That’s a feat! There is nothing very constructive (besides “I urge the authors to keep doing these exercises…”).
The “negative aspect” is that we allegedly “jumped the shark”.
Do you know what that means? No?
Reviewer #2 doesn’t either.
The phrase “jumped the shark” was coined to describe an episode of the popular show “Happy Days”. The character Fonzie literally jumps on water-skis over a shark confined in an ocean area, after a challenge to his manhood. Wikipedia is reliable here: “Jumping the shark is an idiom used to describe a moment when something that was once widely popular, but has since grown less popular, makes a misguided attempt at generating publicity that instead only serves to highlight its irrelevance.”
The PNAS referee grants that previous replication efforts were worthwhile and says he (it’s not a she) “strongly urged publication” of them. But the referee implies this study is a misguided attempt to generate publicity.
It was not.
Read the paper: we explain there why we thought this type of replication was especially useful. And note that none of the other referees described our paper in this way.
Final stop: Nature Human Behaviour
The third and final stop was NHB. For economists and others who are not familiar, this is a newish journal in the Nature “family”. From their site (https://www.nature.com/nathumbehav/about):
“Launched in January 2017, Nature Human Behaviour is an online-only monthly journal dedicated to the best research into human behaviour from across the social and natural sciences.”
Timeline: First round (May 30-Aug 22, 2019=84 days).
The editor notes, “Please accept my sincere apologies for the delay in getting back to you with a decision. I know how important timeliness is, and I am very sorry I failed to provide you with a decision and reviewer feedback sooner.” (Contrast the AER policy: don’t bother to contact us until 6 months = 182 days have passed.)
There were four reviewers. NHB identifies them by area of expertise, which is interesting (e.g., “Reviewer 1 (experimental economics, price formation, trade)”). I am linking to the verbatim reports and our responses here so ECRs can get an even deeper look behind the scenes.
There were two other, shorter rounds, which is common at NHB, but they were very fast (favorable referees often sign off quickly, in days or weeks rather than months, on small changes to papers they have basically accepted). There were some small headaches, some common in academic publishing and some special to this particular paper. We had to get an Open Access waiver from Caltech (a routine, one-click process). We needed a signed release from MobLab. A medium headache: we used a map to show the different places in the world in which experiments had been done. The editors noted, “We ask that you revise the figure using a map under a different license (ones which do not contain SA or NC (ShareAlike/Non-Commercial).” So we had to find a free way to make the maps.