healthcarereimagined

Envisioning healthcare for the 21st century

How AI use in scholarly publishing threatens research integrity, lessens trust, and invites misinformation – Bulletin of the Atomic Scientists

Posted by timmreardon on 03/25/2026

By Andrew Gray | March 12, 2026

Scientific research underpins the things we do. Huge investments are made capitalizing on technological developments; governments declare that their policies will be based on academic evidence; doctors decide what treatments to use for their patients. And beneath all that is the idea that, ultimately, we can trust that published research fairly reflects the realities of the world: that it is true, that it is balanced, and that it has been produced and reviewed by expert researchers. But that foundation is starting to wobble.

Shortly after ChatGPT was released, it became clear that it was beginning to affect scholarly research. Published papers became much more likely to meticulously delve into intricate questions, and to do so with great enthusiasm, in ways they never had before (Stokel-Walker 2024). Distinctive quirks of large language model (LLM) writing such as these began to explode in popular usage, first in certain fields such as computer science or engineering, before spreading to other disciplines. Some researchers estimate that in 2024, 13.5 percent of all papers in PubMed indexed journals had been processed using LLMs, representing around 200,000 articles that year (Kobak 2025). In preprints—papers posted online as unreviewed drafts—the rates increased even faster, with more than 20 percent of computer science preprints showing signs of LLM involvement by late 2024 (Liang 2025).
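As a quick sanity check on those figures, the quoted share and article count together imply a total for PubMed-indexed papers in 2024. The short sketch below only rearranges the two numbers quoted above; the function name `implied_total` is a hypothetical helper, and the result is an implied figure, not one reported in the cited study.

```python
def implied_total(llm_share: float, llm_papers: int) -> float:
    """Total number of papers implied by a share and the count that share represents."""
    return llm_papers / llm_share


# Figures quoted in the text (Kobak 2025): 13.5 percent, around 200,000 articles.
total = implied_total(0.135, 200_000)

# Roughly 1.48 million papers in the year.
print(f"Implied PubMed-indexed papers in 2024: {total:,.0f}")
```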

In retrospect, this was not surprising. For many researchers, forced by the conventions of academia to publish in a second language, a tool that could help with fluent translation is a blessing. And across the world, researchers have been under strong pressure to publish more papers for decades; a tool which could speed up the process of writing was always going to be attractive. And it does speed it up; researchers who have used LLMs in their writing produce around a third more preprints than their colleagues (Kusumegi et al 2025).

But it can be tempting to use these tools too much. Some researchers have fallen into the trap of simply getting the LLM to generate large portions of papers for them, or to rewrite a draft so extensively that it might unintentionally change the meaning (Conroy 2023). What emerges is something that looks superficially like research, written fluently, convincingly, and confidently, but which might turn out to be so much smoke and mirrors. In extreme cases, LLMs can generate entire papers based on research that simply never took place. It is no surprise that researchers have found that identifiably LLM-edited papers are retracted twice as often as average (Kousha & Thelwall 2025).

To a reader, though, LLM-copyedited papers are hard to distinguish from LLM-generated ones. One can sometimes tell that the tools were used, but not how much they were used in any given paper. When surveyed, 28 percent of researchers said they had used LLMs for copyediting and 8 percent for generating new text, but half or more of both groups didn’t disclose it in the paper (Kwon 2025).

Alongside this reluctance to disclose LLM use, many researchers appear keen to disguise it. When some of the distinctive markers of AI writing in research papers were first reported, they suddenly became less popular in newer publications, but the use of the less-publicized markers continued to grow (Geng & Trotta 2025). Together, this strongly implies that many authors just don’t want it known that they are using these tools.

And it is not just in writing the papers where people are trying to cut corners with AI. Most research papers are peer-reviewed by other researchers, giving a degree of confidence that the research is robust and legitimate. This can be a time-consuming and thankless task, and—unsurprisingly—is one where LLMs have begun to creep in. Most publishers now have explicit warnings against reviewing papers with LLMs, but it almost certainly still happens. Some less scrupulous authors have even been discovered leaving invisible comments in drafts, instructing the LLM that they expect will review it to skip straight to approval (Sugiyama & Eguchi 2025). If nothing else, this technology has invented new kinds of research integrity problems!

These tools are also beginning to affect how we find research. The major scholarly databases are all beginning to offer “AI-assisted search” in some form or another, using LLMs to interpret a user question and find results—either as a list of recommended papers, or as a summary and analysis of the results. When this works well, it can be very convincing. It may return six useful and interesting papers. But will it give you what you want: the right six papers, or the best six? We just don’t know.

And here lies a big risk. LLMs are often described as black boxes; any oddities in the way they work, or biases they encode, will be baked into the results, with no easy ways to spot them. There is no reason to think that any of the scholarly databases are intentionally skewing their results, but biases or censorship can easily arise unintentionally, especially in systems as complex as these (Tay 2025).

The most prominent and accessible of these databases, for non-academics, is Google Scholar. Google Scholar works by indexing everything found by Google that looks broadly like a research paper. This is unlike traditional databases, which work from a selective list of publications. It is more expansive than traditional databases, indexing things like preprints and working papers as well as published research. But this has made it more vulnerable to disruption or manipulation by LLMs (Haider et al 2024). Because it includes a wider range of material, it already indexes a higher proportion of the unreviewed types of items that are more likely to involve LLM text. Because it is entirely automated, it does not have the manual screening which could keep out some of the lowest-value junk.

That automated approach causes other problems. Google Scholar identifies papers it does not otherwise know about by looking in the reference lists of the ones it indexes. This means it can report a reference to them even if no digital copy exists, which can be very useful for more obscure material. But one of the more dramatic failures of LLMs is that they often hallucinate citations: works that do not exist, plausible-sounding mirages, often in journals that themselves do not exist. Google Scholar has no way to distinguish between real and false references (understandably, its developers never expected that anyone would include false references), so it reports that they exist. People trying to validate what an LLM tells them look up the paper, find it indexed in Google Scholar, and, well, surely it must be real! It’s in the database.
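The mechanism described here can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not a description of how Google Scholar is actually implemented: the `Entry` class, the `build_index` function, and the paper titles are all hypothetical. It shows only the general idea that an index which creates stub records from reference lists cannot tell an invented citation from a real but obscure one.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One record in a toy citation index (hypothetical structure)."""
    title: str
    has_full_text: bool
    cited_by: list = field(default_factory=list)


def build_index(papers: dict) -> dict:
    """Build an index from a mapping of paper title -> list of cited titles.

    Crawled papers get a full record. Any cited title that was not crawled
    gets a citation-only stub, whether or not the work really exists.
    """
    index = {title: Entry(title, has_full_text=True) for title in papers}
    for title, references in papers.items():
        for ref in references:
            # A stub is created purely because some paper cites this title.
            entry = index.setdefault(ref, Entry(ref, has_full_text=False))
            entry.cited_by.append(title)
    return index


# Hypothetical input: one crawled paper cites a real paper and an invented one.
papers = {
    "Real paper A": ["Real paper B", "Invented paper X"],
    "Real paper B": [],
}
index = build_index(papers)

stub = index["Invented paper X"]
print(stub.title, stub.has_full_text, stub.cited_by)
```

In this sketch the invented title ends up in the index with the same kind of record as a genuine work that simply has no digital copy, which is why a lookup alone cannot confirm that a paper exists.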

Most researchers would never admit to citing a paper they have not read… but one can imagine that it is tempting, especially when it seems to perfectly address the question in hand, and you seem to have a fair summary of it but just can’t track it down however hard you try. And so those fictional citations creep out into real papers. Entire fictional journals may be conjured into a shadowy existence this way (Klee 2025).

This is a perfect storm brewing for the integrity of scholarly publishing. The volume of significantly AI-generated material is increasing, and it is being masked by a flood of “AI polished” papers, which have the same surface style. It’s no wonder that readers, especially casual readers, cannot be confident in distinguishing between real and fictional research, and cannot tell how much of a paper might be hallucinated.

At the same time, the system is stumbling under the extra burdens placed on it by the use of LLMs; it has become easier to produce papers, without becoming easier to assess or peer review them. In late 2025, the preprint server arXiv reported that it would tighten its rules and no longer accept the submission of computer science review articles; the volume of them was simply too large for their moderators to cope with (Castelvecchi 2025). As the system creaks under strain, more and more venues will be faced with an unpleasant choice: Restrict submissions, and add yet more work to their volunteer reviewers? Or loosen standards and risk problematic material slipping through?

Then we have to consider why those problematic papers are out there. At the moment, most of the primarily AI-generated papers appear to be from academics trying to bolster their own publication lists. They are unlikely to be deliberately malicious, though they may fit into more traditional patterns of scientific fraud (Richardson et al 2025). But they are still cluttering up the databases, filled with information that may or may not be valid, conclusions and recommendations that may or may not be true, citations pointing to other non-existent literature. These AI-edited papers will place a burden on every future researcher to try to make sense of them, even if that’s not the intention.

But not all examples might be so innocent. Scientific papers, with all the prestige, reliability, and authority that they carry, are a prime target for intentional misinformation campaigns (Bergstrom & West 2023, Haider et al 2024). Should someone wish to publish a large number of deliberately skewed papers to bolster a certain position (that a new drug is remarkably effective; that an industrial process is perfectly safe; that a particular policy decision has made us all happier and wealthier), then they have found themselves a new tool to help produce them quickly and easily, at the same time that the system is less resilient at keeping them out. It is difficult to say for sure whether this is yet happening, but it is clear that doing it has become easier, cheaper, and more achievable.

The ways in which we access research are also changing. The move towards LLM-based information retrieval means that an opaque system is being inserted between readers and the information they are looking for, opening up the opportunity for third parties to control access to research in ways that may not be obvious, or even intentional.

And to cap it all off, anyone who is motivated to reject the validity of research which does not fit their preconceptions now has a perfect pretext to do so, regardless of its quality: “Oh, you can’t trust that anyway, don’t you know it’s all AI rubbish now?”

A compelling analogy here, suggested by the historian Kevin Baker, is to think of the publishing system as an immune system for science: It rejects things that might harm the system, perhaps not perfectly, but reliably enough to keep everything ticking along and reasonably healthy. But when our immune system is stressed, we can succumb more easily to a minor infection that we would normally brush off (Baker 2025).

The scholarly publishing system is, undeniably, not in the best of health. It is beset by a whole range of pressures. It carries on, but it is limping. The well-meaning use of AI to help speed things up might, in this analogy, be the fever that ends up sending the whole thing to its sickbed, opening the door for much more damaging illnesses—in the form of intentional and malicious disinformation—to take root and do real harm.

Article link: https://thebulletin.org/premium/2026-03/how-ai-use-in-scholarly-publishing-threatens-research-integrity-lessens-trust-and-invites-misinformation/

References

Baker, K. 2025. “Context Widows.” December 12. Artificial Bureaucracy – Substack. https://artificialbureaucracy.substack.com/p/context-widows

Bergstrom, C, & West, J. 2023. “How publishers can fight misinformation in and about science and medicine.” July 7. Nature Medicine. https://www.nature.com/articles/s41591-023-02411-7

Castelvecchi, D. 2025. “Preprint site arXiv is banning computer-science reviews: here’s why.” November 7. Nature. https://www.nature.com/articles/d41586-025-03664-7

Conroy, G. 2023. “Scientific sleuths spot dishonest ChatGPT use in papers.” September 8. Nature. https://www.nature.com/articles/d41586-023-02477-w

Geng, M, & Trotta, R. 2025. “Human-LLM coevolution: evidence from academic writing.” February 17. arXiv. https://arxiv.org/abs/2502.09606

Haider, J, et al. 2024. “GPT-fabricated scientific papers on Google Scholar.” September 3. Misinformation Review. https://misinforeview.hks.harvard.edu/article/gpt-fabricated-scientific-papers-on-google-scholar-key-features-spread-and-implications-for-preempting-evidence-manipulation/

Klee, M. 2025. “AI is inventing academic papers that don’t exist – and they’re being cited in real journals.” December 17. Rolling Stone. https://www.rollingstone.com/culture/culture-features/ai-chatbot-journal-research-fake-citations-1235485484/

Kobak, D, et al. 2025. “Delving into LLM-assisted writing in biomedical publications through excess vocabulary.” July 2. Science Advances 11(27). https://www.science.org/doi/10.1126/sciadv.adt3813

Kousha, K, & Thelwall, M. 2025. “How much are LLMs changing the language of academic papers after ChatGPT? A multi-database and full text analysis.” September 11. arXiv. https://arxiv.org/abs/2509.09596

Kusumegi, K, et al. 2025. “Scientific production in the era of large language models.” December 18. Science 390(6779). https://www.science.org/doi/10.1126/science.adw3000

Kwon 2025. “Is it OK for AI to write science papers? Nature survey shows researchers are split.” May 14. Nature. https://www.nature.com/articles/d41586-025-01463-8

Liang, W, et al. 2025. “Quantifying large language model usage in scientific papers.” August 4. Nature Human Behaviour 9. https://www.nature.com/articles/s41562-025-02273-8

Richardson, R, et al. 2025. “The entities enabling scientific fraud at scale are large, resilient, and growing rapidly.” August 4. Proceedings of the National Academy of Sciences 122(32). https://www.pnas.org/doi/10.1073/pnas.2420092122

Stokel-Walker, C. 2024. “AI Chatbots Have Thoroughly Infiltrated Scientific Publishing.” May 1. Scientific American. https://www.scientificamerican.com/article/chatbots-have-thoroughly-infiltrated-scientific-publishing/

Sugiyama, S, & Eguchi, R. 2025. “‘Positive review only’: Researchers hide AI prompts in papers.” July 1. Nikkei Asia. https://asia.nikkei.com/business/technology/artificial-intelligence/positive-review-only-researchers-hide-ai-prompts-in-papers

Tay, A. 2025. “The AI powered Library Search That Refused to Search.” July 28. Musings about Librarianship – Substack. https://aarontay.substack.com/p/the-ai-powered-library-search-that
