Crossing a significant threshold: more than one billion citations now available in COCI!
"The competitive benefits of closing access to citation data diminish with each new citation released to the public domain, but the benefits of open data remain. Going forward, citation data is almost completely public domain".
With these words, from the article “A tipping point for open citations data” (July 15, 2021), Ian Hutchins celebrated the crossing, in February 2021, of the threshold of one billion citations available in public-domain databases.
Now, a new significant milestone has been reached. We are excited to announce that COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations, has just been extended with 334 million additional citations. Its most recent release, the COCI July 2021 release, now contains a total of 1.09 billion DOI-to-DOI citation links derived from open references within Crossref. This release includes the references of articles deposited or opened in Crossref between November 2020 and January 2021.
These numbers make us proud, and they confirm the essential value of the Initiative for Open Citations (I4OC). Since 2018, the mission of I4OC has been to persuade publishers to provide open citation data by means of the Crossref platform. I4OC's untiring commitment has led the major academic publishers to a progressive change of heart regarding open citations, and the scholarly community to a deeper interest in this openness.
These factors contributed to the creation in 2018 of COCI, the first open citation index created by OpenCitations, in which we applied the concept of citations as first-class data entities (Heibi I., Peroni S., Shotton D., 2019). Over the last three years, COCI has been extended in a series of releases by harvesting citations, mostly from Crossref data dumps, starting from an initial coverage of 300 million citations in its first release.
A crucial event that preceded (and delayed!) this latest COCI release was Elsevier’s endorsement of DORA, the Declaration on Research Assessment, in December 2020, in connection with which it committed to making “reference lists for all articles published in Elsevier journals openly available via Crossref so they can be available for reuse. This means other important initiatives like I4OC can draw on this metadata”. As described in our previous post, Elsevier’s welcome commitment led to the opening of many previously closed references that its numerous academic journals had submitted to Crossref. Now, after an extended period of data ingestion and processing, all these newly opened Elsevier references are available at OpenCitations within COCI.
Elsevier’s involvement has both practical and symbolic value. Although publishing more than one billion citations is a thrilling achievement and, as Hutchins wrote, we are now at a tipping point with regard to open citations data, this milestone is not the last stop. Together with the other organizations and projects that participate in the Initiative for Open Citations, we will keep urging the remaining academic publishers to join our cause, and we will keep sharing our values with the whole academic community, so that all existing citation data becomes open and freely accessible. To recall what Dario Taraborelli wrote in the conclusion of his article “The citation graph is one of humankind's most important intellectual achievements”: “the world is waiting for the citation graph to become a public good”.