Paul Ginsparg on arXiv, OA, and the future

Paul Ginsparg, As we may read, Journal of Neuroscience, September 20, 2006. (I thought I had blogged this earlier but just discovered that I hadn’t.) Excerpt:

The e-print arXiv, initiated in August 1991, has effectively transformed the research communication infrastructure of multiple fields of physics and could play a prominent role in a unified set of global resources for physics, mathematics, and computer science. It has grown to contain >375,000 articles (as of July 2006), with >50,000 new submissions expected in calendar year 2006 and >40,000,000 full-text downloads per year. It is an international project, with dedicated mirror sites in 17 countries and collaborations with United States and foreign professional societies and other international organizations, and it has also provided a crucial lifeline for isolated researchers in developing countries…

The arXiv is entirely scientist driven: articles are deposited by researchers when they choose (either before, simultaneous with, or after peer review), and the articles are immediately available to researchers throughout the world. As a pure dissemination system, it operates at a factor of 100–1000 times lower in cost than a conventionally peer-reviewed system (Ginsparg, 2001). This is the real lesson of the move to electronic formats and distribution: not that everything should somehow be free, but that with many of the production tasks automatable or off-loadable to the authors, the editorial costs will then dominate the costs of an unreviewed distribution system by many orders of magnitude….

The methodology works within copyright law, as long as the depositor has the authority to deposit the materials and assign a nonexclusive license to distribute at the time of deposition, because such a license takes precedence over any subsequent copyright assignment….

From the outset, arXiv.org relied on a variety of heuristic screening mechanisms, including a filter on institutional affiliation of submitter, to ensure insofar as possible that submissions are at least “of refereeable quality.” …

A form of open access appears to be happening by a backdoor route: using standard search engines, more than one-third of the high-impact journal articles in a sample of biological/medical journals published in 2003 were found at nonjournal Web sites (Wren, 2005). To assess the extent of this phenomenon less systematically in the neuroscience community, I looked up the publications posted at [Brain Mapping Studies]…. The result is striking: at least 75% of the publications listed were freely available either via direct links from the above Web page or via a straightforward Web search for the article title. If indeed this is representative, then the neuroscience community may already be farther along in the direction of open access than most realize….

Because the current generation of undergraduates, and the next generation of researchers, already takes for granted that such materials should be readily accessible from anywhere, it is more than likely that this percentage will only increase over time and that the publishing community will need to adapt to the reality of some form of open access, regardless of the outcome of the government mandate debate.

There is more to open access, however, than just the free access assessed above. True open access permits any third party to aggregate and data mine the articles, themselves treated as computable objects, linkable and interoperable with associated databases. We are still just scratching the surface of what can be done with large and comprehensive full-text aggregations. A forward-looking example is provided by the PubMed Central database, operated in conjunction with GenBank and other biological databases at the United States National Library of Medicine….

The enormously powerful sorts of data mining and number crunching that are already taken for granted as applied to the open-access genomics databases can be applied to the full text of the entirety of the biology and life sciences literature and will have just as great a transformative effect on the research done with it….



This work is licensed under a Creative Commons License.