分类: 计算机科学 >> 计算机科学的集成理论 提交时间: 2022-11-18 合作期刊: 《数据智能(英文)》
摘要: The findable, accessible, interoperable, reusable (FAIR) principles for scientific data management and stewardship aim to facilitate data reuse at scale by both humans and machines. Research and development (RD) in the pharmaceutical industry is becoming increasingly data driven, but managing its data assets according to FAIR principles remains costly and challenging. To date, little scientific evidence exists about how FAIR is currently implemented in practice, what its associated costs and benefits are, and how decisions are made about the retrospective FAIRification of data sets in pharmaceutical RD. This paper reports the results of semi-structured interviews with 14 pharmaceutical professionals who participate in various stages of drug RD in seven pharmaceutical businesses. Inductive thematic analysis identified three primary themes of the benefits and costs of FAIRification, and the elements that influence the decision-making process for FAIRifying legacy data sets. Participants collectively acknowledged the potential contribution of FAIRification to data reusability in diverse research domains and the subsequent potential for cost-savings. Implementation costs, however, were still considered a barrier by participants, with the need for considerable expenditure in terms of resources, and cultural change. How decisions were made about FAIRification was influenced by legal and ethical considerations, management commitment, and data prioritisation. The findings have significant implications for those in the pharmaceutical RD industry who are engaged in driving FAIR implementation, and for external parties who seek to better understand existing practices and challenges.
分类: 计算机科学 >> 计算机科学的集成理论 提交时间: 2022-11-16 合作期刊: 《数据智能(英文)》
摘要: The FAIR principles have been widely cited, endorsed and adopted by a broad range of stakeholders since their publication in 2016. By intention, the 15 FAIR guiding principles do not dictate specific technological implementations, but provide guidance for improving Findability, Accessibility, Interoperability and Reusability of digital resources. This has likely contributed to the broad adoption of the FAIR principles, because individual stakeholder communities can implement their own FAIR solutions. However, it has also resulted in inconsistent interpretations that carry the risk of leading to incompatible implementations. Thus, while the FAIR principles are formulated on a high level and may be interpreted and implemented in different ways, for true interoperability we need to support convergence in implementation choices that are widely accessible and (re)-usable. We introduce the concept of FAIR implementation considerations to assist accelerated global participation and convergence towards accessible, robust, widespread and consistent FAIR implementations. Any self-identified stakeholder community may either choose to reuse solutions from existing implementations, or when they spot a gap, accept the challenge to create the needed solution, which, ideally, can be used again by other communities in the future. Here, we provide interpretations and implementation considerations (choices and challenges) for each FAIR principle.
分类: 计算机科学 >> 计算机科学的集成理论 提交时间: 2022-11-16 合作期刊: 《数据智能(英文)》
摘要: Computational workflows describe the complex multi-step methods that are used for data collection, data preparation, analytics, predictive modelling, and simulation that lead to new data products. They can inherently contribute to the FAIR data principles: by processing data according to established metadata; by creating metadata themselves during the processing of data; and by tracking and recording data provenance. These properties aid data quality assessment and contribute to secondary data usage. Moreover, workflows are digital objects in their own right. This paper argues that FAIR principles for workflows need to address their specific nature in terms of their composition of executable software steps, their provenance, and their development.
分类: 计算机科学 >> 计算机科学的集成理论 提交时间: 2022-11-16 合作期刊: 《数据智能(英文)》
摘要: One of the key goals of the FAIR guiding principles is defined by its final principle to optimize data sets for reuse by both humans and machines. To do so, data providers need to implement and support consistent machine readable metadata to describe their data sets. This can seem like a daunting task for data providers, whether it is determining what level of detail should be provided in the provenance metadata or figuring out what common shared vocabularies should be used. Additionally, for existing data sets it is often unclear what steps should be taken to enable maximal, appropriate reuse. Data citation already plays an important role in making data findable and accessible, providing persistent and unique identifiers plus metadata on over 16 million data sets. In this paper, we discuss how data citation and its underlying infrastructures, in particular associated metadata, provide an important pathway for enabling FAIR data reuse.