Combat paper mills with slow science, not warring AIs

Might industrial-scale scientific misconduct kill the publish-or-perish culture that spawned it, asks John Whitfield

Call me nostalgic, but scientific fraud isn’t what it once was.

Twenty years ago, unmasked academic fraudsters were unusual and generally a big deal. They had ambition—publishing in Science and Nature, making claims that promised to revolutionise fields, even being tipped for Nobels. When they fell, they fell from a height.

Today, reports of fraud are depressingly common and mundane. Your average crooked researcher seems content producing something that crawls just far enough over the threshold of coherence to get published in a remote corner of the literature. Often they pay for the pleasure of publication via slightly dubious predatory journals.

Many don’t even fake their own work, outsourcing the task to paper mills that generate bogus text, data and figures on their behalf. These people aren’t just dishonest; they’re lazy.

Fears that a substantial proportion of published research is unreliable—what’s sometimes called the reproducibility crisis—are well over a decade old. But until recently, it was widely hoped that outright fraud only accounted for a small number of dodgy papers. Most problems seemed instead to stem from what are known as questionable research practices: sloppy methods, selective reporting, overprocessed statistics.

Now it increasingly looks as if science could be awash with entirely fictitious publications.

Perhaps the only thing holding fraud back was the lack of a scalable business model.

Misconduct has gone from being the occasional crash of a falling idol to an ever-present drone that threatens to make entire disciplines unintelligible.

Reading and writing

In a widely reported preprint posted last week, a German team used a simple system to red-flag papers that combined a lack of international co-authors, use of private email addresses, affiliation with a hospital and citations of papers with the same characteristics.

Looking for these markers in more than 15,000 papers in the PubMed database and calibrating their method against two groups of papers, one of known fakes and the other of known validity, the team estimated that about 20 per cent of biomedical articles could be fraudulent.
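
For readers curious about the mechanics, a marker-based screen of this kind can be sketched in a few lines of Python. This is an illustration only, not the German team's method: the field names, the simple counting rule and the threshold of three are all assumptions made for the example, and the preprint's actual scoring and calibration may well differ.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical fields for illustration; none of these names come from the preprint.
    has_international_coauthor: bool
    uses_private_email: bool
    hospital_affiliation: bool
    cites_flagged_papers: bool


def indicator_count(paper: Paper) -> int:
    """Count how many of the four markers mentioned in the article are present."""
    return sum([
        not paper.has_international_coauthor,
        paper.uses_private_email,
        paper.hospital_affiliation,
        paper.cites_flagged_papers,
    ])


def is_red_flagged(paper: Paper, threshold: int = 3) -> bool:
    """Flag a paper when at least `threshold` markers are present (threshold is assumed)."""
    return indicator_count(paper) >= threshold


def calibrate(known_fakes, known_genuine, threshold: int = 3):
    """Return (share of known fakes flagged, share of known genuine papers flagged)."""
    sensitivity = sum(is_red_flagged(p, threshold) for p in known_fakes) / len(known_fakes)
    false_positive_rate = (
        sum(is_red_flagged(p, threshold) for p in known_genuine) / len(known_genuine)
    )
    return sensitivity, false_positive_rate
```

Running calibrate on one set of known fakes and one set of papers known to be genuine shows how often the rule catches the former and wrongly flags the latter. Any estimate of the share of suspect papers in a wider sample depends on both numbers.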

There is already at least one app that uses machine learning to flag potentially suspect research. Given that artificial intelligence is also surely used to produce such work, you don’t need to have read too many cyberpunk novels to envision the scholarly literature becoming a battleground for warring AI bots.

But the most powerful technology for rooting out bad science, whether sloppy or shady, is much older. It’s called reading.

The issue, of course, is that reading doesn’t scale. Part of this is about peer review and the difficulty of properly scrutinising papers before they are published. But the problem of reproducibility is exacerbated by the diminished role of reading in science more generally, as researchers are judged on what they produce and publishers shift their business models from selling subscriptions to readers to attracting publishing fees from authors.

When research skews so much towards writing and away from reading, it’s little wonder that papers exist to be published rather than read. In research, paying to publish your own work is called ‘gold open access’. Other parts of the writing world call it ‘vanity publishing’.

Slow science

Last week also saw the release of a report on reproducibility from the House of Commons Science, Innovation and Technology Committee. Its most novel and interesting suggestions focus on time: “protected research time” for academics, a minimum postdoc duration of three years, and UK Research and Innovation setting up “a trial funding programme with an emphasis on ‘slower’ science”.

The MPs are silent on the wider importance of scientists as consumers and critics of each other’s work, although they have a few things to say about peer review, such as recommending publication of reviewers’ comments. But some of that protected time needs to be set aside for reading, as a contribution to researchers’ own development and as an act of academic citizenship.

Good luck with all that, you might say. Calls for a slow science movement go back to at least 2010. If the revolution is coming, it is practising what it preaches and taking its own sweet time.

Industrial-scale fraud, though, might contain the seeds of its own demise. If the proportion of unreliable papers becomes too high, scientists will not believe anything they haven’t run the rule over themselves and discussed with trusted colleagues.

Once humans find themselves outmatched by computers in a game of publish or perish, there will be little point in playing any more.

John Whitfield is opinion editor at Research Professional News

This article also appeared in Research Fortnight