INTRODUCTION
Digital libraries that store scientific publications are becoming increasingly central to the research process. They are not only used for traditional tasks, such as finding and storing research outputs, but also as a source for discovering new research trends or evaluating
research excellence. With the current growth of scientific publications deposited in digital libraries, it is no longer sufficient to provide only access to content. To aid research it is especially important to improve the process of how research is being done.
Recent developments in natural language processing, information retrieval and the semantic web make it possible to transform the way we work with scientific publications. However, in order to be able to improve these technologies and carry out experiments, researchers
need to be able to easily access and use large databases of scientific publications.
This workshop aims to bring together people from different backgrounds who:
- are interested in analysing and mining databases of scientific publications
- develop systems that enable such analysis and mining of scientific databases or
- develop novel technologies that improve the way research is being done.
TOPICS
The topics of the workshop will be organised around the following themes:
1. The whole ecosystem of infrastructures including repositories, aggregators, text- and data-mining facilities, impact monitoring tools, datasets, services and APIs that enable analysis of large volumes of scientific publications.
2. Semantic enrichment of scientific publications by means of text-mining, crowdsourcing or other methods.
3. Analysis of large databases of scientific publications to identify research trends, high impact, cross-fertilisation between disciplines, research excellence etc.
Topics of interest relevant to theme 1 include, but are not limited to:
- Infrastructures including repositories, aggregators, text- and data-mining facilities, impact monitoring tools, datasets, services and APIs for accessing scientific publications and/or research data. The existence of datasets, services, systems and APIs (in particular those that are open) providing access to large volumes of scientific publications and research data is an essential prerequisite for researching and developing new technologies that can transform the way people do research.
Topics of interest relevant to theme 2 include, but are not limited to:
- Novel information extraction and text-mining approaches to semantic enrichment of publications.
- Automatic categorization and clustering of scientific publications.
- New methods and models for connecting and interlinking scientific publications.
- Models for semantically representing and annotating publications.
- Semantically enriching/annotating publications by crowdsourcing.
Topics of interest relevant to theme 3 include, but are not limited to:
- New methods, models and innovative approaches for measuring impact of publications.
- New methods for measuring performance of researchers.
- Evaluating impact of research groups.
- Methods for identifying research trends and cross-fertilization between research disciplines.
- Applications and case studies of mining from scientific databases and publications.
- Improving the infrastructure of repositories to support the development and integration of new impact and performance metrics.
SPECIAL OPEN PUBLICATIONS DATASET TRACK
This year we would like to invite the workshop participants to make use of the CORE publications dataset, which contains a large volume of research publications from a wide variety of research areas. The dataset contains not only full texts but also an enriched version of the publications' metadata. This dataset provides a framework for developing and testing methods and tools addressing the workshop topics. Use of the dataset is not mandatory, but it is encouraged.
A new data dump will be made available here.
EXPECTED AUDIENCE
The workshop on Mining Scientific Publications aims to bring together researchers, digital library developers and practitioners from government and industry to address the current challenges in the domain of mining scientific publications.
SUBMISSION FORMAT
We invite submissions related to the workshop's topics. Long papers should not exceed 8 pages and short papers should not exceed 4 pages in the ACM style. Furthermore, we welcome demo presentations of systems or methods. A demonstration submission should consist of a description, of at most two pages, of the system, method or tool to be demonstrated.
IMPORTANT DATES
July 13, 2014 - Submission deadline
August 11, 2014 - Notification of acceptance
August 25, 2014 - Camera-ready
September 12, 2014 - Workshop
At this stage the dates are indicative only and may change.
Articles presented at this workshop will be published in the November issue of D-Lib (http://www.dlib.org/). Proceedings from the previous workshops are also available here.
Petr Knoth, Knowledge Media Institute, The Open University, UK
Zdenek Zdrahal, Knowledge Media Institute, The Open University, UK
Stelio Piperidis, Institute for Language and Speech Processing (META-SHARE), Athena Research Center, Greece
Nuno Freire, The European Library, The Netherlands
Kris Jack, Mendeley Ltd., United Kingdom
Drahomira Herrmannova, Knowledge Media Institute, The Open University, UK
Lucas Anastasiou, Knowledge Media Institute, The Open University, UK
More details are available on the workshop website.