Open Access, Academia.edu, and why I’m all-in on Zenodo.org

Note: The majority of this post and the migration framework (academia-migrate) were authored about a year ago, but placed on the back burner while other projects demanded my time. Between the revelation that Academia.edu has implemented banner ads on some profiles and Sarah Bond’s article in Forbes, I have been motivated to finally push this project into production.

Many scholars throughout the world use Academia.edu to broaden access to their own research, which includes not only published documents, but unpublished manuscripts or presentation materials (such as PowerPoint slideshows) that would otherwise never be submitted to peer-reviewed journals. Academia.edu bills itself as a “disruptive” service that takes a shot at the increased commercialization (and resulting access restrictions) of academic publication. For scholars who want their research to be made available to the widest possible audience, peer-reviewed journals are falling short. Peer review offers a certain cachet required by university administrators for considerations toward tenured professorship, but more and more journals are owned and distributed by fewer and fewer publishers. University libraries are strapped with increasing costs to subscribe to journals, and unaffiliated scholars are on the outside looking in with regard to access to current scholarship, unless they would like to pay as much as $50 to acquire a single article. Academia.edu has changed this somewhat. With HTML microdata and pathways for search robots to crawl full-text articles, researchers are able to find relevant articles through Google, and Google’s algorithms tend to favor Academia.edu over other harder-to-crawl sources.

On the surface, this seems great for scholars. And it was good in the beginning, but this has changed over the last year. Despite its domain name, Academia.edu is a commercial venture. It is beholden to investors, not the scholarly community it serves, nor universities, governments, or taxpayers. Recently, an Academia.edu developer approached a scholar about his willingness to participate in a pay-to-play system. I won’t go into great detail, as the initial exchange and subsequent outrage on Twitter have already been covered thoroughly. But what does paying for a recommendation mean? Aside from sacrificing a certain intellectual honesty, a recommendation essentially enhances visibility and access to your work. By definition, though, not paying for a recommendation thus reduces visibility and access to your work. If the Academia.edu developers alter the metadata provided to robots to improve search relevance for those who pay for their publications to be promoted, this necessarily reduces relevance for non-paying users. As a result, access declines, which reduces the likelihood of citation, and may even negatively impact administrative reviews of faculty output.

Furthermore, it appears that Academia.edu is now experimenting with banner advertisements. They do not yet appear to be a permanent fixture, but I believe we are seeing the beginning of overt attempts at generating income on top of research that scholars have published to the site in good faith that it is free and open.

So is there a solution?

Yes. It is Zenodo.org.

Zenodo.org is a truly open access scholarly publication framework that is capable of replacing Academia.edu. Zenodo is open to “research outputs from across all fields of science,” including the humanities and social sciences. Like Academia, users may upload journal articles, conference papers, posters, and presentations, but may also upload raw research data. Zenodo is developed by CERN, which has long demonstrated its devotion to open science and the web. It is backed by funding from the European Union. Moreover, Zenodo has a well-documented API for publishing and harvesting content via well-known open web standards. This is in stark contrast to Academia.edu, which goes to great lengths to prevent users from harvesting publication metadata and makes it impossible to download documents without registering for an account (which also inflates their userbase). Academia.edu prides itself on being disruptive, but it too needs to be disrupted.
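To give a sense of what harvesting via open web standards can look like, here is a minimal sketch that builds an OAI-PMH request URL. OAI-PMH is one such standard; the base URL, the default oai_dc metadata prefix, and the function name list_records_url are assumptions for illustration and should be checked against Zenodo's own documentation before use. The sketch only constructs the URL and does not contact any server.

```python
from urllib.parse import urlencode

# Assumed OAI-PMH endpoint for Zenodo; verify against the official docs.
OAI_BASE = "https://zenodo.org/oai2d"


def list_records_url(metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords URL (hypothetical helper).

    metadata_prefix: metadata format to request (Dublin Core by default).
    set_spec: optional OAI-PMH set name used to narrow the harvest.
    from_date: optional ISO date; only records changed since then are listed.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    return f"{OAI_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    # "user-example" is a made-up set name, not a real Zenodo community.
    print(list_records_url(set_spec="user-example", from_date="2016-01-29"))
```

Any standard OAI-PMH client could then fetch that URL and page through the results, with no account or login required.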

Migrating from Academia.edu to Zenodo.org

I fully advocate leaving Academia.edu, but what purpose does it serve to simply delete your account? You are removing publications that are, at the very least, freely and openly available at the moment. Essentially, the best decision is to migrate documents to Zenodo.org, and allow at least one week for Google to fully index migrated content before deleting the Academia.edu account. My MA thesis entitled “Recent Advances in Roman Numismatics,” about the application of Linked Open Data methodologies toward Roman numismatics with Nomisma.org and Online Coins of the Roman Empire, had been available in both the ANS Digital Library and Academia.edu as of January 28, 2016. Due to our superior use of microdata and full-text indexing, the ANS Digital Library version surpassed Academia days after it was published. I uploaded my thesis to Zenodo.org January 29, 2016, and it was already on the first page of Google three days later.

Many of us have uploaded a substantial number of documents to Academia.edu, and it might be tedious to re-upload these documents into a new system, especially with regard to re-entering publication metadata. I have sought to rectify this by facilitating a more efficient migration system. I have developed a framework that is capable of parsing metadata from an Academia.edu profile (although not all publications are listed when the profile page loads), accepting re-uploaded documents (since these cannot be extracted from Academia.edu directly), and uploading these contents into Zenodo.org. This framework itself is open source and available on GitHub. I will save the technical discussion for a different venue.
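For readers curious about the general shape of such a migration, the sketch below maps one parsed publication record onto the kind of JSON metadata a Zenodo deposit expects. This is not the code of the framework described above. The ParsedPublication class, the to_zenodo_metadata function, and the sample values are hypothetical, and the field names and license identifier reflect my reading of Zenodo's deposit documentation, so verify them before relying on them. The sketch builds the payload only and does not upload anything.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ParsedPublication:
    """Hypothetical record for one publication parsed from a profile page."""
    title: str
    authors: List[str]      # names in "Family, Given" form
    abstract: str
    date: str               # ISO 8601 date, e.g. "2016-01-29"
    kind: str = "article"   # assumed to match a Zenodo publication_type value


def to_zenodo_metadata(pub: ParsedPublication, license_id: str = "cc-by-4.0") -> dict:
    """Build a deposit payload; field names are assumptions to be verified."""
    return {
        "metadata": {
            "title": pub.title,
            "upload_type": "publication",
            "publication_type": pub.kind,
            "description": pub.abstract,
            "creators": [{"name": name} for name in pub.authors],
            "publication_date": pub.date,
            "access_right": "open",
            "license": license_id,
        }
    }


if __name__ == "__main__":
    # Placeholder data, not a real publication.
    sample = ParsedPublication(
        title="Example Paper",
        authors=["Doe, Jane"],
        abstract="An example abstract.",
        date="2016-01-29",
    )
    print(json.dumps(to_zenodo_metadata(sample), indent=2))
```

The resulting JSON would be sent to Zenodo's deposit endpoint along with an access token, followed by a separate file upload for the re-uploaded document.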

Screen capture of one of Terhi Nurmikko-Fuller's papers after being parsed from Academia.edu
This system took about a week to develop, but I hope that this migration process might save each user several minutes per publication. I hope that this work will encourage more scholars to consider migrating to Zenodo.org from Academia. Migration ultimately enhances the value of these publications, as they can be harvested en masse by members of the general public, who might be able to use them for statistical analyses, to enhance them with named entity recognition or improved interlinking between publications (via Library of Congress Subject Headings, which are incorporated into Zenodo’s metadata entry system), or to simply read them without the obstacle of registering for an account. It is time to accept that Academia.edu is seeking to shift the academic publishing paradigm from one commercial provider to another.

Help the ANS! TEI Specialist Needed (short-term contract)

The ANS is looking for a TEI specialist for a short-term, part-time (c. 250-hour) project at $20.00/hour. TEI proficiency preferred. Numismatics knowledge helpful but not required. The successful candidate will add to the research value of TEI XML-encoded ebooks by enhancing linking to American Numismatic Society projects and external resources, which will also facilitate broader dissemination through other cultural heritage aggregation projects, such as Social Networks and Archival Context (SNAC) and Pelagios. The TEI specialist will verify existing tags in the current files for ca. 90 books (some are quite short), and will supplement tagging with:

  • specific references to coins in the ANS collection
  • all of the hoards published on coinhoards.org
  • specific coin types published in Coinage of the Roman Republic Online (CRRO) and Pella
  • any other place name or personal name that is featured or most relevant to the subject matter of a particular section in a book

The TEI specialist will also ensure that the illustrations referenced by the TEI XML files are linked to the correct images on the project server.

These tasks may be done outside of the ANS’s New York office. The project is perfect for graduate students in the Digital Humanities, and provides the opportunity to work with a world leader in linked open data, open source, and open access at the ANS with its Director of Data Science, Ethan Gruber.

Send CV/resume to Ethan Gruber by September 1. Project deadline is December 1.

ANS Scanning Project Passes 30,000 Pages

John Graffeo works the controls of the Table Top Scribe at the ANS.
In December 2015, the American Numismatic Society began a joint project with the Newman Numismatic Portal, an online numismatic resource, in order to enable greater access to American numismatic research material on both the ANS Digital Library and Newman Portal websites.

One of the many rare catalogues scanned by the ANS so far.
To date, the ANS has scanned nearly 500 early American auction catalogues for this project, totaling over 30,000 pages. These catalogues include those of Frossard, Woodward, Chapman, Elder, and other notable names in the field.

Scanned ANS auction catalogues as they appear online now.
John Graffeo, the scanner operator, trained first with ANS librarian/archivist David Hill throughout 2015 in the care and handling of the Society’s rare books, and later trained in Princeton with the Internet Archive on how to use the revolutionary Table Top Scribe scanner. The scanning process includes matching an auction catalogue with its metadata (information about the book) in the ANS’s library catalogue, taking a test image, and then proceeding to scan the rest of the volume.

Pages of a catalogue are ready to be scanned on the Table Top Scribe.

The scanner itself consists of a metal carriage that cradles each book so as not to affect the spine. Graffeo uses a foot-pedal to raise the book to a pair of glass panes, after which he clicks a button to activate two cameras, which photograph the book’s spread.

John Graffeo checks the quality of the scans before proceeding to the next auction catalogue.

After the catalogue has been photographed, Graffeo checks for image quality, tags each page (front matter, interior pages, back matter, cover), and then crops each image, being sure to maintain all of the data on the page including handwritten marginalia such as sale prices and buyer names. Once the volume is complete, Graffeo uploads it to the Internet Archive for a spot-check on quality, after which it goes live online at the Newman Numismatic Portal as part of the ANS’s collection of online auction catalogues. Some of the auction catalogues are exceedingly fragile, so scanning them in this fashion helps to preserve their contents without destroying or damaging the books.

Sample scanned pages as they appear online. All of these scans may be searched and also saved in a variety of digital formats.

“This is a great project,” Graffeo said between scans. “It’s extending the wealth of numismatic information and sharing it for the public good.” All of the scans are available immediately as Open Access for free use by anyone for any reason. The content is shared, and the artifact of the book is preserved. Numismatics has been an interdisciplinary subject since its inception, and making all of the ANS’s holdings publicly available remains central to the Society’s mission.

The Newman Numismatic Portal, sponsored by a $2 million grant from the Eric P. Newman Numismatic Education Society to the Washington University Libraries, began operations in December 2014. It has digitized over 1,000 documents to date, including a unique set of bid books from the firms of Samuel and Henry Chapman, which were generously loaned by ANS Trustee Dan Hamelberg.

Administered through Washington University Libraries in St. Louis, the Newman Portal contracted with Internet Archive, which has provided equipment, training, and staff for the scanning operation at the ANS Library. Internet Archive is a non-profit organization dedicated to digital preservation of all media.

“We are thrilled to have this opportunity to begin distributing some of the Library’s research collections on a large scale in the same way that much of the Society’s coin collections are being made available through online research tools like PELLA, OCRE, and MANTIS,” David Hill, supervisor of the onsite operations, said. “When you think about all of the materials in the Library’s Rare Book Room, which include unique archival collections such as dealer and collector correspondence, you really begin to realize what an impact a project like this can have.”

(all photos by Alan Roche, American Numismatic Society)

On Open Access

Why the American Numismatic Society is Open Access . . . and why your institution, learned society, publisher, etc., should be, too

Academic and scholarly publication is at a crossroads as publishers, authors, and institutions of research and higher learning consider both the financial and ‘moral’ implications of publishing new scholarship as Open Access. The American Numismatic Society (ANS) has adopted what some would consider a progressive approach, while others would find these points to simply be common sense and good manners. As you read the points below, I challenge you to formulate arguments against each one that do not include money. Profit and loss in academic publishing is a very real concern, but it can be demonstrated (and has been in my nine years of experience as an academic publisher) that publishing niche scholarship is (and likely always will be) a money-losing venture. Publication is often built into the mission statements of learned societies, and funding needs to be sought from sources beyond book sales and journal subscriptions to keep the publishing enterprise sustainable.

The ANS has addressed each of the following problems in its efforts to make published research open without taking a hit financially.

Problem: Gold Open Access

One method some publishers use to offset production costs is to charge those authors (or their institutions) who wish to make their research freely available online immediately upon publication instead of waiting some contractually agreed amount of time before being given permission to post the work on the web or via a university repository. These costs often range from the hundreds into the low thousands of dollars (e.g., Maney Publishing’s “Article Publishing Charge” (APC) for immediate Open Access publication). Charging authors for Open Access creates an economic barrier to scholars, some of whom cannot afford the fee, and whose institutions may not have budgeted for such costs. Unaffiliated and independent scholars are especially affected by these fees, which they have to pay out of pocket and which may even require securing a loan.

What the ANS is Doing About It: It is our opinion that authors (and their institutions) should never be charged to make their own research available to the world immediately upon publication.

Problem: Embargo Periods

Going hand-in-glove with “gold” Open Access is the common practice of an embargo period, which is the time (anywhere from one to five years in most cases) between when research is published and when an author can make that work freely available. The point of the embargo period is to allow the publisher to recover the production costs of that publication prior to making it available as Open Access. Authors are forbidden to post more than a citation or abstract, and their work is often locked behind a paywall until the embargo expires. Timely research becomes less so as long as the embargo period lasts, except to those readers who opt for early access. Scholars who wish to access that author’s work must either pay to access the publication, wait until the embargo ends, ask the author for a PDF offprint (which is normally forbidden) or their login credentials to a paywalled platform (even more forbidden). As with file-sharing of other media, many people tend to look for the free version of something they would otherwise have to pay for, thereby short-circuiting the embargo period and the paywall, which nets both the publisher and paywall provider nothing, i.e., the same amount they would make by giving away the published work.

What the ANS is Doing About It: Authors of ANS publications may place their published work wherever they like upon publication, and may assign to it whichever Creative Commons license that they are the most comfortable using. A brief word on the types of Creative Commons licenses follows below.

Problem: Paywalls

As stated above regarding embargos on published research, paywalls do little to discourage the exchange of files between colleagues, and also place a barrier in the way of scientific progress. Platforms such as JSTOR can strike a happy medium in curating content into packages to which institutional libraries may subscribe, thereby providing a revenue stream for publishers. That same content can be shared with individuals on a non-commercial basis provided the publisher has successfully negotiated a content-sharing agreement.

What the ANS is Doing About It: The ANS has such an agreement with JSTOR, and is making some of its publications available on that platform for library subscribers, while also making those same publications available for free to individuals via the HathiTrust Digital Library and via our own Digital Library.


Problem: “Predatory” Publishers

Following the paywall model is the usury of so-called “predatory” publishers that charge libraries and individuals hundreds and even thousands of dollars to access newly published research. Authors should be wary of publishing in journals owned by these companies as their work will reach a limited set of eyes. If most authors found other journals in which to publish, the dearth of content would force predatory publishers to either change their business model or to close entirely. Libraries can also choose not to subscribe to those journals, favoring instead those with a more reasonable Open Access policy.

What the ANS is Doing About It: The ANS has no intention of partnering with any of the large publishing companies that choose to lock current research behind paywalls with formidable access costs.

Problem: Geography-Based Access

Some Open Access content is not globally available. Sometimes this is a technical issue, and, for some publishers, this is a conscious decision based on their understanding and implementation of copyright. Actively choosing to limit access to content that is otherwise open deprives international scholars of their ability to read that work freely, at which point they must resort to paying for access, or to bending the rules and asking colleagues for a free copy or access to something.

What the ANS is Doing About It: The ANS makes every effort to ensure that its Open Access content is available worldwide. Much of it is hosted via numismatics.org and various subdomains. Agreements signed with partners such as HathiTrust make sure that the content is available globally without restriction.

ANS, 0000.999.52801
Problem: Profit-Based Publishing

One of the greatest mistakes a learned society or institution can make is to become focused on making its publications turn a profit. Scholarly publications typically cater to a niche market and sell dozens or occasionally hundreds of copies over a period of three years. Sales beyond three years of the original publication date are rare. If an organization recognizes the fact that it will realize little (or no) profit from the sale of what it publishes, it can strategize how to pay the not inconsiderable production costs. These costs can be built into annual budgets, can be inserted into grant applications for projects, and can be sought in the form of subventions. Basing choices of what to publish by what the publisher (or Board) thinks will sell can be a mistake, especially when what is to be published fulfills the mission of the parent institution.

What the ANS is Doing About It: The ANS favors a mission-based approach to publishing. It understands that some publications will never recover their production costs, but nevertheless that the content is exceedingly important in fulfilling the Society’s stated goals for research and dissemination of that research.

Problem: “Commercial” Publications

Non-profit, academic institutions historically have published scholarship as non-commercial ventures. As stated above, the publication of journals and monographs is hardly a money-making enterprise. Books and subscriptions are sold in order to recover some production costs. Recently one major international rights-holder updated its Terms of Service regarding the reproduction of its images in scholarly publications, classing journals and scholarly monographs as “commercial”, which then allows charging for image permissions. Typically a reciprocal relationship exists between institutions where no permissions fees are charged for non-commercial, scholarly, short-run publications. In switching the Terms of Service to “commercial”, the budget for publishing books or articles featuring images from one of these rights-holders expands by hundreds if not thousands of dollars. This charge represents another barrier to scholarship; publishers will simply go elsewhere for similar images. This also actually hurts the rights holder, in effect limiting wider access to its own holdings and hiding them behind a self-inflicted paywall.

What the ANS is Doing About It: The ANS will never class scholarly publications as “commercial,” and will not charge reproduction fees for the use of its images in scholarly publications.

Problem: Permissions Charges

Most academic publishers ask the authors to pay for their own image permissions. The publishers cannot themselves afford to pay the fees, so the charges get passed to the author. For many authors, however, many of their images can be used without any permissions fees because of the non-commercial nature of their work. Should an institution opt to charge an author for an image, it is possible that the author will opt to find a similar image elsewhere, or will choose not to use an image at all. Either way, the rights holder receives no revenue, and also loses whatever additional exposure it would have otherwise received via a credit line in the publication. Charging authors for image permissions further limits access to content that would otherwise be freely available.

What the ANS is Doing About It: The ANS will not charge authors for the use of its images in non-commercial publications.

ANS, 1916.192.368
Problem: Print-Only Publishing

Arguably the biggest roadblock to Open Access research is publishing solely in print. Publishing in print restricts access to the content locked on the pages and favors those readers with library access or the ability to purchase the publication. Print editions of scholarship, while useful to many, are themselves silos of information, unable to interact with anything other than the active reader. This is the opposite of Open Access. Making print editions available online as digital editions unlocks that content, making it searchable, and perhaps more importantly, gives the content the ability to link to any other data available openly online, as well as making itself available to be linked to from other online sources.

What the ANS is Doing About It: The ANS will continue to produce print editions of scholarship, but it will make digital editions of all of its publications past, present, and future available online as Open Access. Doing so allows the ANS to play well with others, to be a good academic citizen, and to contribute to the work of others. Sharing publications openly guarantees that multiple copies will be made and circulated, thereby preventing loss of that content should something happen to the original publisher.

A Word on Creative Commons Licensing

There are several varieties of Creative Commons (CC) licensing available to authors and publishers that both protect and promote content on the Internet and elsewhere. Anything published as Open Access must have a CC license attached to it; otherwise the content is not free to use. Most Open Access publications carry a CC-BY license (users must cite the source) or a CC-BY-NC license (citation required, and use must be for non-commercial purposes only). On rare occasions, the most open CC license, CC0 (content may be used for any purpose, commercial or otherwise, with or without citation), is used. The ANS’s Open Access publications online are posted under a CC license, usually CC-BY or CC-BY-NC. Its publications on HathiTrust are posted as CC0. The ANS works with its authors to determine which CC license they are most comfortable with prior to posting their work online.

ANS, 1927.64.5
If Open Access publication of content is not part of your institution’s/society’s/publisher’s strategy, it should be. As authors and as consumers of content, it is within your rights to ask (and in some cases demand) that your research (or the scholarship you need) be made openly available online. Open Access does not require the cessation of the sale of that same content. Many readers still prefer to read printed books and journals, and will pay for them (or will ask their libraries to pay for them). Most readers prefer a suite of media with which to work, using print in concert with digital as they produce new scholarship. The end goal of the production of that scholarship should not be to make money, but instead to advance the humanities, arts, and sciences. The best way to do that is to make that scholarship available immediately to the world upon publication. Openly. The ANS hopes that other institutions, learned societies, and publishers will share in this approach to placing published work online without cumbersome restrictions. The Internet is genetically predisposed to facilitate such sharing, which makes it the greatest enabler of advancing our collective intellectual enterprise.

Andrew Reinhard