IOP Publishing's discovery that researchers are split down the middle on the merits of using AI in peer review is not surprising given the complexity of the issue.
The publisher's August survey of just under 350 physics researchers found that 41 per cent were positive about the use of AI in peer review and 37 per cent were negative.
The case for using AI is obvious. With more than 30,000 journals in existence, the process is highly human-intensive. To illustrate the point: if the average journal published 50 papers per year, with each submitted paper reviewed by two independent referees and each review requiring four hours of effort, that translates into 12 million hours of peer-review work annually.
Moreover, that figure does not include the work of the editorial boards that oversee the review process or the editorial staff who process the papers, nor the time editors spend finding suitable reviewers willing to take on a manuscript. It also ignores the fact that many rejected papers are resubmitted, and re-reviewed, elsewhere, creating a multiplier effect that may increase the annual reviewing burden by a factor of five or more.
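The arithmetic above can be checked with a few lines of code. All the inputs are the article's illustrative assumptions, not measured data:

```python
# Back-of-envelope estimate of the annual peer-review burden, using the
# article's illustrative figures (every value here is an assumption).
journals = 30_000          # rough count of scholarly journals in existence
papers_per_journal = 50    # average papers published per journal per year
reviewers_per_paper = 2    # independent referees per submission
hours_per_review = 4       # effort per review, in hours

annual_hours = journals * papers_per_journal * reviewers_per_paper * hours_per_review
print(f"{annual_hours:,} hours per year")  # 12,000,000 hours per year

# Many rejected papers are resubmitted and re-reviewed elsewhere; a
# multiplier of five would push the total toward 60 million hours.
print(f"{annual_hours * 5:,} hours including resubmissions")
```

Even before counting editorial labour, the base figure alone is equivalent to roughly 6,000 people working full-time on nothing but refereeing.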
Replacing human referees with generative AI would therefore ease a reviewing burden that is commonly agreed to be verging on unsustainable.
Then there is the issue of speed. Many journals promise rapid review, yet speed must be balanced against the quality and depth of the reviews provided. With AI-generated reviews, time would no longer be an issue.
Furthermore, if AI peer reviewing became widely available, researchers could use it to evaluate their manuscripts before submitting to a journal. Incorporating such a system into preprint repositories would facilitate this, making peer review part of the research process itself by offering suggestions that shape the final product.
But, of course, there are also challenges to adopting AI. The purpose of peer review is to answer three questions about a manuscript. Is the research new? Are the results correct? And do they add intellectual value to a field or provide benefits within a discipline or beyond?
Most research incrementally builds on existing results, and AI systems are well suited to making such evaluations. They can respond to an informal checklist of measures that capture what a manuscript builds on and how well it has used the scientific method to achieve its objectives. This is no different from how a human peer reviewer would proceed.
However, the answers to the second and third questions are highly dependent on the field of study and the type of research conducted. Although the scientific method likely underpins most research and discovery in STEM disciplines and some of the social sciences, variations in theoretical, experimental and data-analytics research make a one-size-fits-all approach to AI peer review problematic.
In addition, if the research breaks new ground, delivering a quantum shift in thinking, such evaluations would be more difficult because the existing literature would provide no foundation against which to judge the new ideas.
Perhaps the most difficult role for AI would be assessing the third question, pertaining to the value and benefits of the research. Although such evaluation is highly subjective, it often provides the insight that is at the core of peer review's value.
Donald Trump's recent executive order calls for the adoption of "unbiased peer review" to improve the research process, including how research is disseminated and evaluated. That could be read as an implicit call for the adoption of AI peer review. But, of course, bias is always in the eye of the beholder. While Trump and his MAGA allies might see research on gender or climate change as being of little value, others will disagree. AIs are no more "unbiased" than humans in that sense, as Elon Musk's Grok AI has aptly demonstrated.
Moreover, AI systems must be trained with data, which itself may, depending on your opinion, be biased or contaminated with information that is demonstrably false. Although AI systems look smart, they do nothing more than regurgitate what they learned when trained. As the data-modelling adage goes: "garbage in, garbage out".
To return to the issue of recognising the value of groundbreaking research, it is possible that an AI's training data could inadvertently produce a "groupthink" assessment, uprating research that methodically builds on existing knowledge while failing to recognise the benefits of out-of-the-box ideas, potentially disincentivising research creativity.
In my view, AI is likely to fall short when it comes to assessing the value and significance of research. But we should not rely on hunches. To test this possible limitation, an AI peer-review process should be run in parallel with human peer review, with the human reviewers allowed to see the AI review only after they complete their own assessments. It may well turn out that humans and AIs agree with each other much more frequently in some fields than in others, with the former fields more suited to a switch to AI reviewing.
Ultimately, AI's most appropriate role might be to support human peer review rather than replace it, picking up the more perfunctory issues while the human reviewer connects the dots and has the final say. But we don't know. And the bottom line is that we must proceed with caution until we do.
Let's hold off on implementing AI peer review until, no pun intended, it can itself be peer-reviewed to ensure it meets the very standards that authors and editors rightly expect.
is founder professor in computer science at the University of Illinois Urbana-Champaign.