Inside arXiv—the Most Transformative Platform in All of Science
NEWS | 29 March 2025
“Just when I thought I was out, they pull me back in!” With a sly grin that I’d soon come to recognize, Paul Ginsparg quoted Michael Corleone from The Godfather. Ginsparg, a physics professor at Cornell University and a certified MacArthur genius, may have little in common with Al Pacino’s mafia don, but both are united by the feeling that they were denied a graceful exit from what they’ve built.
Nearly 35 years ago, Ginsparg created arXiv, a digital repository where researchers could share their latest findings—before those findings had been systematically reviewed or verified. Visit arXiv.org today (it’s pronounced like “archive”) and you’ll still see its old-school Web 1.0 design, featuring a red banner and the seal of Cornell University, the platform’s institutional home. But arXiv’s unassuming facade belies the tectonic reconfiguration it set off in the scientific community. If arXiv were to stop functioning, scientists from every corner of the planet would suffer an immediate and profound disruption. “Everybody in math and physics uses it,” Scott Aaronson, a computer scientist at the University of Texas at Austin, told me. “I scan it every night.”
Every industry has certain problems universally acknowledged as broken: insurance in health care, licensing in music, standardized testing in education, tipping in the restaurant business. In academia, it’s publishing. Academic publishing is dominated by for-profit giants like Elsevier and Springer. Calling their practice a form of thuggery isn’t so much an insult as an economic observation. Imagine if a book publisher demanded that authors write books for free and, instead of employing in-house editors, relied on other authors to edit those books, also for free. And not only that: The final product was then sold at prohibitively expensive prices to ordinary readers, and institutions were forced to pay exorbitant fees for access.
The “free editing” academic publishers facilitate is called peer review, the process by which fellow researchers vet new findings. This can take months, even a year. But with arXiv, scientists could post their papers—known, at this unvetted stage, as preprints—for instant and free access to everyone. One of arXiv’s great achievements was “showing that you could divorce the actual transmission of your results from the process of refereeing,” said Paul Fendley, an early arXiv moderator and now a physicist at All Souls College, Oxford. During crises like the Covid pandemic, time-sensitive breakthroughs were disseminated quickly—particularly by bioRxiv and medRxiv, platforms inspired by arXiv—potentially saving, by one study’s estimate, millions of lives.
While arXiv submissions aren’t peer-reviewed, they are moderated by experts in each field, who volunteer their time to ensure that submissions meet basic academic standards and follow arXiv’s guidelines: original research only, no falsified data, sufficiently neutral language. Submissions also undergo automated checks for baseline quality control. Without these, pseudoscientific papers and amateur work would flood the platform.
In 2021, the journal Nature declared arXiv one of the “10 computer codes that transformed science,” praising its role in fostering scientific collaboration. (The article is behind a paywall—unlock it for $199 a year.) By a recent count, arXiv hosts more than 2.6 million papers, receives 20,000 new submissions each month, and has 5 million monthly active users. Many of the most significant discoveries of the 21st century have first appeared on the platform. The “transformers” paper that launched the modern AI boom? Uploaded to arXiv. Same with the solution to the Poincaré conjecture, one of the seven Millennium Prize problems, famous for their difficulty and $1 million rewards. Just because a paper is posted on arXiv doesn’t mean it won’t appear in a prestigious journal someday, but it’s often where research makes its debut and stays openly available. The transformers paper is still routinely accessed via arXiv.
For scientists, imagining a world without arXiv is like the rest of us imagining one without public libraries or GPS. But a look at its inner workings reveals that it isn’t a frictionless utopia of open-access knowledge. Over the years, arXiv’s permanence has been threatened by everything from bureaucratic strife to outdated code to even, once, a spy scandal. In the words of Ginsparg, who usually redirects interview requests to an FAQ document—on arXiv, no less—and tried to talk me out of visiting him in person, arXiv is “a child I sent off to college but who keeps coming back to camp out in my living room, behaving badly.”
Ginsparg and I met over the course of several days last spring in Ithaca, New York, home of Cornell University. I’ll admit, I was apprehensive ahead of our time together. Geoffrey West, a former supervisor of Ginsparg’s at Los Alamos National Laboratory, once described him as “quite a character” who is “infamous in the community” for being “quite difficult.” He also said he was “extremely funny” and a “great guy.” In our early email exchanges, Ginsparg told me, upfront, that stories about arXiv never impress him: “So many articles, so few insights,” he wrote.
Author: Will Knight. Sheon Han. Tiffany Ng. Jason Kehe. Claire L. Evans. Samantha Spengler. Amit Katwala. Adam Kucharski. Steven Levy. Ramin Skibba.