ChinaXiv.org: Chinese Academy of Sciences Preprint Platform for Scientific Papers

By submission date

2022
16

By subject

Integration Theory in Computer Science
16

By author

By institution

Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
3
Data Science Institute, University of Virginia, Charlottesville, VA 22903-1738, USA
2
Department of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
2
International Institute of Social Studies, Erasmus University, 29776 2502 LT The Hague, The Netherlands
2
Liacs Institute of Advanced Computer Science, Leiden University, Leiden, The Netherlands
2
San Diego Supercomputer Center, University of California San Diego, San Diego CA 92093, USA
2
Tilburg School of Humanities and Digital Sciences, Tilburg University, 90153 5000 LE Tilburg, The Netherlands
2
CSIRO, Kensington, WA 6151, Canberra, Australia
1
California Digital Library, Oakland, California 94612-2901, USA
1
Centro de Biotecnología y Genómica de Plantas UPM-INIA (CBGP), Autopista M-40 (Km 38) 28223-Pozuelo de Alarcón (Madrid), Comunidad de Madrid 28040, Spain
1
DataCite, Welfengarten 1B, Hannover 30167, Germany
1
Deutsches Klimarechenzentrum, Bundesstrasse 45a, Hamburg 20146, Germany
1
Development Sciences Informatics, Genentech Inc., South San Francisco, CA 94080-4990, USA
1
Development Sciences OMNI-Biomarker Development, Genentech Inc., South San Francisco, CA 94080-4990, USA
1
European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, CB10 1SD, UK
1
Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, Moscow 119333, Russia
1
Figshare, Crinan Street, London, N1 9XW, UK
1
GO FAIR International Support & Coordination Office (GFISCO), Leiden, The Netherlands
1
GO FAIR international Support & Coordination Office (GFISCO), Leiden, The Netherlands
1
Gesellschaft für Wissenschaftliche Datenverarbeitung Göttingen, Am Faßberg 11, 37077 Göttingen, Germany
1
Great Zimbabwe University, 1235 Masvingo Zimbabwe Harare, Zimbabwe
1
Indiana University Bloomington, Bloomington, IN 47405, USA
1
Informatics Institute, University of Amsterdam, Amsterdam 1090 GH, The Netherlands
1
Institute of Data Science, Maastricht University, Universiteitssingel 60, Maastricht 6229 ER, The Netherlands
1
Institute of Interdisciplinary Studies, Mbarara University of Science and Technology, 1410 Mbarara, Uganda
1
Learning and Research Resources Centre (CRAI), Universitat de Barcelona, 08007 Barcelona, Spain
1
Rancho BioSciences LLC., San Diego, CA 92127, USA; Development Sciences Informatics, Genentech Inc., South San Francisco, CA 94080-4990, USA; Development Sciences OMNI-Biomarker Development, Genentech Inc., South San Francisco, CA 94080-4990, USA
1
US National Academy of Sciences, Washington DC 20418, USA
1
Wiley, 9600 Garsington Road, Oxford OX4 2DQ, UK
1

16 results in total


1. ChinaXiv:202211.00191
A Generic Workflow for the Data FAIRification Process

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-16 | Partner journal: Data Intelligence

Jacobsen, Annika; Kaliyaperumal, Rajaram; Santos, Luiz Olavo Bonino da Silva; Mons, Barend; Schultes, Erik; Roos, Marco; Thompson, Mark

Abstract: The FAIR guiding principles aim to enhance the Findability, Accessibility, Interoperability and Reusability of digital resources such as data, for both humans and machines. The process of making data FAIR (FAIRification) can be described in multiple steps. In this paper, we describe a generic step-by-step FAIRification workflow to be performed in a multidisciplinary team guided by FAIR data stewards. The FAIRification workflow should be applicable to any type of data and has been developed and used for Bring Your Own Data (BYOD) workshops, as well as for the FAIRification of, e.g., rare disease resources. The steps are: 1) identify the FAIRification objective, 2) analyze data, 3) analyze metadata, 4) define semantic model for data (4a) and metadata (4b), 5) make data (5a) and metadata (5b) linkable, 6) host FAIR data, and 7) assess FAIR data. For each step we describe how the data are processed, what expertise is required, which procedures and tools can be used, and which FAIR principles they relate to.

Views: 1070 | Downloads: 389 | Comments: 0
2. ChinaXiv:202211.00170
How to (Easily) Extend the FAIRness of Existing Repositories

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-16 | Partner journal: Data Intelligence

Hahnel, Mark; Valen, Dan

Abstract: Data repository infrastructures for academics have appeared in waves since the dawn of Web technology. These waves are driven by changes in societal needs, archiving needs and the development of cloud computing resources. As such, the data repository landscape has many flavors when it comes to sustainability models, target audiences and feature sets. One thing that links all data repositories is a desire to make the content they host reusable, building on the core principles of cataloging content for economical and research speed efficiency. The FAIR principles are a common goal for all repository infrastructures to aim for. No matter what discipline or infrastructure, the goal of reusable content, for both humans and machines, is a common one. This is the first time that repositories can work toward a common goal that ultimately lends itself to interoperability. The idea that research can move further and faster as we un-silo these fantastic resources is an achievable one. This paper investigates the steps that existing repositories need to take in order to remain useful and relevant in a FAIR research world.

Views: 1224 | Downloads: 390 | Comments: 0
3. ChinaXiv:202211.00185
FAIR Computational Workflows

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-16 | Partner journal: Data Intelligence

Goble, Carole; Cohen-Boulakia, Sarah; Soiland-Reyes, Stian; Garijo, Daniel; Gil, Yolanda; Crusoe, Michael R.; Peters, Kristian; Schober, Daniel

Abstract: Computational workflows describe the complex multi-step methods that are used for data collection, data preparation, analytics, predictive modelling, and simulation that lead to new data products. They can inherently contribute to the FAIR data principles: by processing data according to established metadata; by creating metadata themselves during the processing of data; and by tracking and recording data provenance. These properties aid data quality assessment and contribute to secondary data usage. Moreover, workflows are digital objects in their own right. This paper argues that FAIR principles for workflows need to address their specific nature in terms of their composition of executable software steps, their provenance, and their development.

Views: 1515 | Downloads: 476 | Comments: 0
4. ChinaXiv:202211.00194
Unique, Persistent, Resolvable: Identifiers as the Foundation of FAIR

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-16 | Partner journal: Data Intelligence

Juty, Nick; Wimalaratne, Sarala M.; Soiland-Reyes, Stian; Kunze, John; Goble, Carole A.; Clark, Tim

Abstract: The FAIR principles describe characteristics intended to support access to and reuse of digital artifacts in the scientific research ecosystem. Persistent, globally unique identifiers, resolvable on the Web, and associated with a set of additional descriptive metadata, are foundational to FAIR data. Here we describe some basic principles and exemplars for their design, use and orchestration with other system elements to achieve FAIRness for digital research objects.

Views: 869 | Downloads: 304 | Comments: 0
5. ChinaXiv:202211.00408
Open Science and the Hype Cycle

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-27 | Partner journal: Data Intelligence

Strawn, George

Abstract: The introduction of a new technology or innovation is often accompanied by ups and downs in its fortunes. Gartner Inc. defined a so-called hype cycle to describe a general pattern that many innovations experience: technology trigger, peak of inflated expectations, trough of disillusionment, slope of enlightenment, and plateau of productivity. This article will compare the ongoing introduction of Open Science (OS) with the hype cycle model and speculate on the relevance of that model to OS. Lest the title of this article mislead the reader, be assured that the author believes that OS should happen and that it will happen. However, I also believe that the path to OS will be longer than many of us had hoped. I will give a brief history of today's semi-open science, define what I mean by OS, define the hype cycle and where OS is now on that cycle, and finally speculate what it will take to traverse the cycle and rise to its plateau of productivity (as described by Gartner).

Views: 987 | Downloads: 317 | Comments: 0
6. ChinaXiv:202211.00189
FAIR Data Reuse - the Path through Data Citation

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-16 | Partner journal: Data Intelligence

Groth, Paul; Cousijn, Helena; Clark, Tim; Goble, Carole

Abstract: One of the key goals of the FAIR guiding principles is defined by its final principle: to optimize data sets for reuse by both humans and machines. To do so, data providers need to implement and support consistent machine-readable metadata to describe their data sets. This can seem like a daunting task for data providers, whether it is determining what level of detail should be provided in the provenance metadata or figuring out what common shared vocabularies should be used. Additionally, for existing data sets it is often unclear what steps should be taken to enable maximal, appropriate reuse. Data citation already plays an important role in making data findable and accessible, providing persistent and unique identifiers plus metadata on over 16 million data sets. In this paper, we discuss how data citation and its underlying infrastructures, in particular associated metadata, provide an important pathway for enabling FAIR data reuse.

Views: 1230 | Downloads: 416 | Comments: 0
7. ChinaXiv:202211.00193
Making Data and Workflows Findable for Machines

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-16 | Partner journal: Data Intelligence

Weigel, Tobias; Schwardmann, Ulrich; Klump, Jens; Bendoukha, Sofiane; Quick, Robert

Abstract: Research data currently face a huge increase of data objects with an increasing variety of types (data types, formats) and variety of workflows by which objects need to be managed across their lifecycle by data infrastructures. Researchers desire to shorten the workflows from data generation to analysis and publication, and the full workflow needs to become transparent to multiple stakeholders, including research administrators and funders. This poses challenges for research infrastructures and user-oriented data services in terms of not only making data and workflows findable, accessible, interoperable and reusable, but also doing so in a way that leverages machine support for better efficiency. One primary need to be addressed is that of findability, and achieving better findability has benefits for other aspects of data and workflow management. In this article, we describe how machine capabilities can be extended to make workflows more findable, in particular by leveraging the Digital Object Architecture, common object operations and machine learning techniques.

Views: 1001 | Downloads: 348 | Comments: 0
8. ChinaXiv:202211.00209
The Open Data Challenge: An Analysis of 124,000 Data Availability Statements and an Ironic Lesson about Data Management Plans

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-18 | Partner journal: Data Intelligence

Graf, Chris; Flanagan, Dave; Wylie, Lisa; Silver, Deirdre

Abstract: Data availability statements can provide useful information about how researchers actually share research data. We used unsupervised machine learning to analyze 124,000 data availability statements submitted by research authors to 176 Wiley journals between 2013 and 2019. We categorized the data availability statements, and looked at trends over time. We found expected increases in the number of data availability statements submitted over time, and marked increases that correlate with policy changes made by journals. Our open data challenge becomes to use what we have learned to present researchers with relevant and easy options that help them to share and make an impact with new research data.

Views: 1362 | Downloads: 439 | Comments: 0
9. ChinaXiv:202211.00212
Considerations for the Conduction and Interpretation of FAIRness Evaluations

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-18 | Partner journal: Data Intelligence

Azevedo, Ricardo de Miranda; Dumontier, Michel

Abstract: The FAIR principles were received with broad acceptance in several scientific communities. However, there is still some degree of uncertainty on how they should be implemented. Several self-report questionnaires have been proposed to assess the implementation of the FAIR principles. Moreover, the FAIRmetrics group released 14 general-purpose maturity indicators for representing FAIRness. Initially, these metrics were conducted as open-answer questionnaires. Recently, these metrics have been implemented in software that can automatically harvest metadata from metadata providers and generate a principle-specific FAIRness evaluation. With so many different approaches for FAIRness evaluations, we believe that further clarification on their limitations and advantages, as well as on their interpretation and interplay, should be considered.

Views: 1470 | Downloads: 434 | Comments: 0
10. ChinaXiv:202211.00198
FAIR Practices in Africa

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-18 | Partner journal: Data Intelligence

van Reisen, Mirjam; Stokmans, Mia; Mawere, Munyaradzi; Basajja, Mariam; Ong'ayo, Antony Otieno; Nakazibwe, Primrose; Kirkpatrick, Christine; Chindoza, Kudakwashe

Abstract: This article investigates expansion of the Internet of FAIR Data and Services (IFDS) to Africa, through the three GO FAIR pillars: GO CHANGE, GO BUILD and GO TRAIN. Introduction of the IFDS in Africa has a focus on digital health. Two examples of introducing FAIR are compared: a regional initiative for digital health by governments in the East Africa Community (EAC) and an initiative by a local health provider (Solidarmed) in collaboration with Great Zimbabwe University in Zimbabwe. The obstacles to introducing FAIR are identified as the current underrepresentation of data from Africa in IFDS, the lack of explicit recognition of the situational context of research in FAIR at present, and the lack of acceptability of FAIR, seen as a foreign and European invention, which affects acceptance. It is envisaged that FAIR can make an important contribution to resolving fragmentation in digital health in Africa, and that any obstacles concerning African participation, context relevance and acceptance of IFDS need to be removed. This will require involvement of African researchers and ICT developers so that it is driven by local ownership. Assessment of ecological validity in FAIR principles would ensure that the context specificity of research is reflected in the FAIR principles. This will help enhance the acceptance of the FAIR Guidelines in Africa and will help strengthen digital health research and services.

Views: 1090 | Downloads: 350 | Comments: 0
11. ChinaXiv:202211.00179
Helping the Consumers and Producers of Standards, Repositories and Policies to Enable FAIR Data

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-16 | Partner journal: Data Intelligence

McQuilton, Peter; Batista, Dominique; Beyan, Oya; Granell, Ramon; Coles, Simon; Izzo, Massimiliano; Lister, Allyson L.; Pergl, Robert; Rocca-Serra, Philippe; Schaap, Ben; Shanahan, Hugh; Thurston, Milo; Sansone, Susanna-Assunta

Abstract: Thousands of community-developed (meta)data guidelines, models, ontologies, schemas and formats have been created and implemented by several thousand data repositories and knowledge-bases, across all disciplines. These resources are necessary to meet government, funder and publisher expectations of greater transparency and access to and preservation of data related to research publications. This obligates researchers to ensure their data is FAIR, share their data using the appropriate standards, store their data in sustainable and community-adopted repositories, and to conform to funder and publisher data policies. FAIR data sharing also plays a key role in enabling researchers to evaluate, re-analyse and reproduce each other's work. We can map the landscape of relationships between community-adopted standards and repositories, and the journal publisher and funder data policies that recommend their use. In this paper, we show how the work of the GO-FAIR FAIR Standards, Repositories and Policies (StRePo) Implementation Network serves as a central integration and cross-fertilisation point for the reuse of FAIR standards, repositories and data policies in general. Pivotal to this effort is FAIRsharing, an endorsed flagship resource of the Research Data Alliance that maps the landscape of relationships between community-adopted standards and repositories, and the journal publisher and funder data policies that recommend their use. Lastly, we highlight a number of activities around FAIR tools, services and educational efforts to raise awareness and encourage participation.

Views: 1419 | Downloads: 401 | Comments: 0
12. ChinaXiv:202211.00188
Making FAIR Easy with FAIR Tools: From Creolization to Convergence

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-16 | Partner journal: Data Intelligence

Thompson, Mark; Burger, Kees; Kaliyaperumal, Rajaram; Roos, Marco; Santos, Luiz Olavo Bonino da Silva

Abstract: Since their publication in 2016 we have seen a rapid adoption of the FAIR principles in many scientific disciplines where the inherent value of research data and, therefore, the importance of good data management and data stewardship, is recognized. This has led to many communities asking "What is FAIR?" and "How FAIR are we currently?", questions which were addressed respectively by a publication revisiting the principles and the emergence of FAIR metrics. However, early adopters of the FAIR principles have already run into the next question: "How can we become (more) FAIR?" This question is more difficult to answer, as the principles do not prescribe any specific standard or implementation. Moreover, there does not yet exist a mature ecosystem of tools, platforms and standards to support human and machine agents to manage, produce, publish and consume FAIR data in a user-friendly and efficient (i.e., easy) way. In this paper we will show, however, that there are already many emerging examples of FAIR tools under development. This paper puts forward the position that we are likely already in a creolization phase where FAIR tools and technologies are merging and combining, before converging in a subsequent phase to solutions that make FAIR feasible in daily practice.

Views: 1070 | Downloads: 386 | Comments: 0
13. ChinaXiv:202211.00205
Towards the Tipping Point for FAIR Implementation

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-18 | Partner journal: Data Intelligence

van Reisen, Mirjam; Stokmans, Mia; Basajja, Mariam; Ong'ayo, Antony Otieno; Kirkpatrick, Christine; Mons, Barend

Abstract: This article explores the global implementation of the FAIR Guiding Principles for scientific management and data stewardship, which provide that data should be findable, accessible, interoperable and reusable. The implementation of these principles is designed to lead to the stewardship of data as FAIR digital objects and the establishment of the Internet of FAIR Data and Services (IFDS). If implementation reaches a tipping point, IFDS has the potential to revolutionize how data is managed by making machine and human readable data discoverable for reuse. Accordingly, this article examines the expansion of the implementation of FAIR Guiding Principles, especially how and in which geographies (locations) and areas (topic domains) implementation is taking place. A literature review of academic articles published between 2016 and 2019 on the use of FAIR Guiding Principles is presented. The investigation also includes an analysis of the domains

Views: 1526 | Downloads: 431 | Comments: 0
14. ChinaXiv:202211.00429
A Semantic Approach to Workflow Management and Reuse for Research Problem Solving

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-28 | Partner journal: Data Intelligence

Skvortsov, Nikolay A.; Stupnikov, Sergey A.

Abstract: The investigation proposes the application of an ontological semantic approach to describing workflow control patterns, research workflow step patterns, and the meaning of the workflows in terms of domain knowledge. The approach can provide wide opportunities for semantic refinement, reuse, and composition of workflows. Automatic reasoning allows verifying those compositions and implementations and provides machine-actionable workflow manipulation and problem-solving using workflows. The described approach can take into account the implementation of workflows in different workflow management systems, the organization of workflow collections in data infrastructures and the search for them, the semantic approach to the selection of workflows and resources in the research domain, the creation of research step patterns and their implementation reusing fragments of existing workflows, and the possibility of automating problem-solving based on the reuse of workflows. The application of the approach to CWFR conceptions is proposed.

Views: 3764 | Downloads: 743 | Comments: 0
15. ChinaXiv:202211.00203
Implementation of the FAIR Data Principles for Exploratory Biomarker Data from Clinical Trials

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-18 | Partner journal: Data Intelligence

Arefolov, Alexander; Adam, Laura; Brown, Shoshana; Budovskaya, Yelena; Chen, Cong; Das, Diya; Farhy, Chen; Ferguson, Rebecca; Huang, Hongmei; Kanigel, Kimberly; Lu, Christina; Polesskaya, Oksana; Staton, Tracy; Tajhya, Rajeev; Whitley, Maryann; Wong, Jee-Yeon; Zeng, Xiangpei; McCreary, Mark

Abstract: The FAIR data guiding principles have been recently developed and widely adopted to improve the Findability, Accessibility, Interoperability, and Reuse of digital assets in the face of an exponential increase of data volume and complexity. The FAIR data principles have been formulated on a general level and the technological implementation of these principles remains up to the industries and organizations working on maximizing the value of their data. Here, we describe the data management and curation methodologies and best practices developed for FAIRification of clinical exploratory biomarker data collected from over 250 clinical studies. We discuss the data curation effort involved, the resulting output, and the business and scientific impact of our work. Finally, we propose prospective planning for FAIR data to optimize data management efforts and maximize data value.

Views: 1324 | Downloads: 341 | Comments: 0
16. ChinaXiv:202211.00169
Licensing FAIR Data for Reuse

Subjects: Computer Science >> Integration Theory in Computer Science | Submitted: 2022-11-16 | Partner journal: Data Intelligence

Labastida, Ignasi; Margoni, Thomas

Abstract: The last letter of the FAIR acronym stands for Reusability. Data and metadata should be made available with a clear and accessible usage license. But what are the choices? How can researchers share data and allow reusability? Are all the licenses available for sharing content suitable for data? Data can be covered by different layers of copyright protection, making the relationship between data and copyright particularly complex. Some research data can be considered as a work and therefore covered by full copyright, while other data can be in the public domain due to their lack of originality. Moreover, a collection of data can be protected by special rights in Europe to acknowledge the investment in time and money in obtaining, presenting, arranging or verifying the data. The need to use a license when sharing data comes from the fact that, under current copyright laws, when rights exist, the absence of any legal notice must be understood as the default "all rights reserved" regime. Unless an exception applies, the authorisation of right holders is necessary for reuse. Right holders could use any text to state the reusability of data, but it is advisable to use some of the existing licenses, especially the ones that are suitable for data and databases. We hope that with this paper we can bring some clarity in relation to the rights involved when sharing research data.

Views: 1216 | Downloads: 420 | Comments: 0