To celebrate the launch of our upcoming course, Research Integrity for Experienced Researchers, Dr. Nushrat Khan, author of the module “Data management and open data”, shares her thoughts on the significance of sharing and reusing open data in today’s research environment.
Dr. Nushrat Khan is a Senior Research Fellow at the GOS Institute of Child Healthcare, University College London.
While completing my Ph.D. research back in 2018, I wanted to explore how open data was being reused in biodiversity research. Although identifying and obtaining this data from a repository was a hurdle, the experience demonstrated that the value of open data is ever-growing. Furthermore, when I surveyed over 3,000 researchers from across the world (from Europe to Australia), they consistently told me about the challenges they face when it comes to sharing their data and reusing existing data. I concluded from this that there needs to be a far better understanding of meaningful data sharing. The “Data management and open data” module, part of Epigeum’s Research Integrity for Experienced Researchers, is an attempt to fill this gap and support researchers in their data-sharing journey, as well as help them to successfully locate existing data.
From lab-based to participatory research, the collection and creation of data often underpins research outcomes. In recent years there has been a major push from both funders and journal publishers to make research data openly available. Sharing research data with other users enables reproducibility, helps to build trust in research, and supports overall research integrity. Moreover, open data creates new opportunities for research collaboration, allowing researchers either to ask new research questions or to combine newly collected data with existing datasets. However, this can only be achieved through meaningful data sharing that keeps data accessible in the long term.
If you are a researcher in the Arts and Humanities, you might be wondering whether the practice of open data – and therefore this module – is relevant to scientific research only. The simple answer is no: research data can take many forms, and the concept applies to a whole range of data types. What does vary across disciplines and data types is the process of archiving and sharing data. If your research involves human participants, there will of course be additional ethical considerations for the lawful sharing of data, and the concepts of data anonymisation will be useful in determining the right approach.
With adequate advance planning, researchers can structure the process of preparing their data and identify appropriate research data repositories in which to share it upon completion of a research project. Depending on the discipline, funders may have different requirements for the data retention period. Whatever the type of research, it is essential to have a clear idea of these requirements and to find the appropriate platform for sharing data. Research Integrity for Experienced Researchers is therefore specifically designed for a cross-disciplinary approach, with discipline-specific icons and menu options throughout the course linking aspects of the programme to researchers’ backgrounds and experiences.
The “Data management and open data” module discusses the research data lifecycle and introduces many of the useful concepts highlighted above, including data documentation, anonymisation, data protection, and licensing. It also introduces the FAIR principles of data sharing (findability, accessibility, interoperability, and reusability) and offers some practical tips to make your data more “FAIR”. In addition, the module recommends some resources on finding and reusing existing data that could be used as a first step toward finding relevant data.
Despite a significant shift towards data sharing, finding useful data remains challenging for third-party data users. When open data is reused for new research, it is imperative that the original data creators/collectors are given credit appropriately – by citing the dataset in the article and including it in a data availability statement and references.
I hope to see better recognition, across all disciplines and stakeholders, of the need to acknowledge data creators and to value the outputs derived from open data, as this will accelerate meaningful data sharing. This is how we can truly support open science. I also hope that institutions will benefit from the inclusion of this topical module within Research Integrity for Experienced Researchers.
Discover more about “Data management and open data”, and how Research Integrity for Experienced Researchers can benefit your institution more generally, by visiting the course web page today!