Open Research Case Studies
Discover how researchers at Keele are advancing their fields through open research practices. These case studies provide real-world examples of the benefits and approaches to open research. Explore their experiences, learn from their insights, and consider how open research can fit into your own work.
"The workshop at Keele sought to consider the School approach to the Open Research agenda."
A workshop was held at Keele on Open Research in the Humanities (and to some degree Social Sciences). The workshop sought to capitalise on and to consolidate a burgeoning interest in digital humanities scholarship in English, History, and Media at Keele. It featured external and internal speakers, both staff and students, and provided an unprecedented opportunity to share experiences and methods, plan collaborative projects, and consider the School approach to the open research agenda.
The workshop provided a platform to discuss issues raised by Open Research including funding, ethics, external collaboration, technological possibilities and limitations, open access publishing, enduring (global) inequalities, and career status.
Speakers presented ongoing and recently-completed work that included open sharing of data from social media, transcribed manuscripts, creative writing, historical records and other sources.
The following presentations were given during the day:
· Molly Drummond, Ed de Quincey, and Elizabeth Poole (Keele), ‘Sharing big data: Practices and ethics’.
· Brodie Waddell (Birkbeck), ‘What can you do with 2,526 petitions? Creating, using and sharing a new digital corpus of transcribed manuscripts’.
· Joanna Taylor (Manchester), ‘Deep [m]apps and spatial narratives: Walking with Dorothy Wordsworth’.
· Rachel Bright (Keele), ‘Uneven ‘Open Access’ in Global Perspective’.
· Ceri Morgan (Keele), ‘Creative writing as (open) research practice’.
· Alannah Tomkins (Keele), ‘Generating and distributing open-access Data: Capturing the Past and Zenodo’.
· Peter Collinge (Keele), ‘Open access and early career research: Providing for the Poor’.
· Paul Carter (National Archives), ‘Digital representations in “Teaching the Voices of the Victorian Poor”’.
The discussions have contributed to School planning and research strategy, so that colleagues continue to feel confident and inspired about undertaking open research themselves, knowing which Humanities professors are currently leading large projects in these areas, how they were designed and managed, and the institutional support for this work.
"Open data sources have greatly benefitted my research."
I’ve recently published two papers on Neotropical bird ecology and conservation. The first investigates the impacts of land use change on the distribution of Hyacinth Macaw in Brazil over different time scales and spatial extents. The other study looks at how likely it is that the Purple-winged Ground-dove is now extinct by using the history of its past occurrence records.
For both projects, I used open source software for data analysis and data from open databases made up largely of citizen science records. R software is script-based, which provides a clear record of every step in the analysis process, including the production of the results, tables and figures in the final papers. The code is stored on an open platform, GitHub, which allows both collaboration between researchers during the analysis process and sharing of the final code in a permanent repository after the study has finished. See code for the Hyacinth Macaw project, and Purple-winged Ground-dove project.
Open data sources have greatly benefitted my research. For both the above projects, we used online data repositories of bird occurrence records. These allow species presence and, to some extent, absence information to be used in the analysis of probability of extinction, or probability of occurrence across the species’ geographic ranges. Over approximately the last 20 years, open databases such as these have led to huge increases in the understanding of animal occurrences, especially those of birds: for example, migration patterns, other movements, habitat requirements, threats, and more.
"Open research will contribute to furthering contemporary research, keeping aspiring researchers engaged and knowledgeable about contemporary research, and foster a collaborative atmosphere in academia. "
I am supervised by an excellent team of international experts in cardiovascular disease and epidemiology comprised of Dr Andrija Matetić, Professor Christian Mallen and Professor Mamas Mamas. Our research uses national electronic health record datasets from the United States. With this data, we are able to investigate trends with great statistical power, precision and generalisability. We have been able to disseminate this research at international conferences and publish studies in reputable peer-reviewed journals.
The most recent project was an investigation into the association of frailty status with the causes, characteristics and outcomes of patients attending the emergency department and admitted to hospital with cardiovascular disease. This project formed part of an intercalated MPhil with the School of Medicine and led to several presentations and publications in open access journals.
I have been empowered by my supervisors, past and present, to contribute to open research principles. Firstly, I was introduced to open research during my first ever project with the School of Medicine by my supervisors Dr Sara Muller and Professor Samantha Hider. I was able to publish this project in an open access journal which had an open access agreement with Keele. Liaising with the Keele library was fundamental to identifying journals in which to publish work I had completed as part of my medicine course. Following this, I began my work with Dr Andrija Matetić and Professor Mamas Mamas. Overall, I have published the majority of the research I have contributed to as open access (with pre-prints) in reputable journals:
- Association of Frailty Status on the Causes and Outcomes of Patients Admitted With Cardiovascular Disease
- Management and outcomes of patients admitted with type 2 myocardial infarction with and without standard modifiable risk factors
- Treatment and Outcomes of Acute Myocardial Infarction in Patients With Polymyalgia Rheumatica With and Without Giant Cell Arteritis
- Acute cerebellar ischaemic stroke secondary to arterial thoracic outlet syndrome
These can be accessed either through the journal websites themselves or through websites I am subscribed to that hold conference abstracts and manuscripts.
Secondly, much of the work we complete uses publicly available datasets, and we make an active effort to ensure our methods are transparent and reproducible for readers (Dataset accessibility information for NIS and for NEDS). Whilst this should be the case for all articles, I have often found articles lacking in key information, which hinders early career researchers who would like to learn and exposes them to poor practice. We hope transparency in research allows researchers to learn and take inspiration for their own research.
Thirdly, I am learning how to use the free, open source software R. I aim to transition to using R to further contribute to open research by sharing methods.
Finally, I enjoy collaborating on research and teaching with students and staff. I commonly teach students research skills and aim to supervise on projects. With support from the Faculty of Medicine and Health Science, there are plans to continue to pilot ‘research sessions’ for all students and postgraduate researchers. The aim of these sessions is to teach attendees the fundamentals of research and inspire future researchers. Furthermore, this presents a fantastic opportunity to introduce the initiative of open research throughout. I hope these aspects of my work contribute to further open research.
Because my research, which contributes to patient care and outcomes, is openly accessible, people are able to read it and contact the authors about it if needed. Accessible research garners more interest and encourages future researchers to do the same, with the goal of allowing students to keep up with the ever-changing field of medicine and healthcare. This research can help students reading these articles to pursue research and academia themselves and to adopt open research practices. By attending the proposed research sessions, students will be enthused to collaborate with each other openly, fostering a true multi-disciplinary environment and culture within research. Improved collaboration will help to counter the perceived competitive culture of academia and further advances in all fields of research.
All researchers should be encouraged to adopt open research practices. Open research will contribute to furthering contemporary research, keeping aspiring researchers engaged and knowledgeable about contemporary research, and foster a collaborative atmosphere in academia. All of this will, if nothing else, benefit the end point the research is trying to effect, which in this case is the patient.
"The [open] availability of the Guidelines gives researchers an additional reason to access our resources and drives interest in our research topic. "
Elizabeth Poole, Professor of Media and Communications; Ed de Quincey, Professor of Computer Science; Molly Drummond, PhD
Our project investigated best practice for open research data sharing in cross-disciplinary big data/social science projects and developed guidelines for researchers working in this area. We were particularly interested in approaches to sharing sensitive Twitter data in relation to legal, ethical and practical constraints around these practices. The project sought to answer two questions:
- In what form can Twitter data be shared so it both complies legally with Twitter’s policies and adheres to best ethical practice?
- How can such data be shared so it has a high degree of usefulness and therefore re-use for researchers?
We found that most projects uploaded data to repositories in ‘unhydrated’ form (containing the Tweet ID but not its content) to comply with Twitter’s legal standards. Researchers also applied the FAIR (Findable, Accessible, Interoperable, and Reusable) principles but sometimes adapted them based on the sensitivity of the data and ethical principles of the researchers.
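As a minimal sketch of what sharing in ‘unhydrated’ form can look like in practice, the Python example below strips a small set of records down to their Tweet IDs before writing them out for deposit. The record layout, field names and function names are hypothetical and chosen for illustration only; they are not taken from the project’s own tooling or from Twitter’s API.

```python
import csv
import io


def dehydrate(tweets):
    """Keep only the Tweet ID of each record, dropping text and user fields."""
    return [str(tweet["id"]) for tweet in tweets]


def write_ids(ids, handle):
    """Write one Tweet ID per row under a single 'tweet_id' header."""
    writer = csv.writer(handle)
    writer.writerow(["tweet_id"])
    for tweet_id in ids:
        writer.writerow([tweet_id])


# Hypothetical records standing in for a collected dataset.
sample_tweets = [
    {"id": 1001, "text": "example text A", "user": "user_a"},
    {"id": 1002, "text": "example text B", "user": "user_b"},
]

tweet_ids = dehydrate(sample_tweets)

# An in-memory buffer stands in for the file that would be deposited.
buffer = io.StringIO()
write_ids(tweet_ids, buffer)
dehydrated_csv = buffer.getvalue()
```

The resulting file contains identifiers only, so anyone re-using it would need to retrieve the content themselves, subject to the platform’s terms at that time.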
Further ethical concerns relating to the sharing of data were raised around data loss, potential misuse, user agency, access, infrastructural/institutional support, cross-disciplinary practices, and the sustainability of data management including cataloguing. Participants emphasised the need for transparency in sharing ethical approaches particularly given the constantly evolving legal, technological, and ethical context.
Our own datasets will be shared on Zenodo, in unhydrated form and compliant with FAIR principles.
The Guidelines for researchers sharing large data sets that we developed are freely available online with a Creative Commons licence.
As a result of the project, we have developed contacts with researchers and experts working in data sharing. The availability of the Guidelines gives researchers an additional reason to access our resources and drives interest in our research topic.
The project provided an opportunity for a Research Assistant to work with an established team and acquire new skills.
"I have found the process rewarding in that we have received peer-review comments on our study design, materials, and proposed analyses rather than on the results themselves and whether they align with our predictions. Given the size and cost of the present project, submitting as a registered report has removed the risk that the results would impact its publication likelihood."
The project, led by Dr Cillian McHugh (University of Limerick), is a collaboration between myself, Dr McHugh, Dr Jim Everett (University of Kent) and Dr Shane Timmons (Economic and Social Research Institute, Ireland). As researchers in moral psychology, we have an interest in understanding the mechanisms that explain moral decision-making and this project aims to do just that.
In the classic trolley dilemma, individuals must decide whether it is morally acceptable to divert a train car away from five people towards one person instead, thus sacrificing one life in order to save five. Dual process approaches are the commonly accepted default for understanding moral judgments and can predict typical responses in these trolley-type moral dilemmas (e.g., Greene, 2001). Yet these approaches have been criticised because they can fail to account for critical components of moral judgment. Railton (2017) proposed an alternative way of understanding the complexity of moral judgment using a social learning approach, whereby many factors interact to guide implicit and explicit moral learning. To date, Railton’s theory has only been tested on small samples of his own students (max n = 45) and using adapted trolley-type cases that are not controlled for known confounds. In this project, we aim to replicate Railton’s original studies:
1) using more robust materials that account for confounding factors,
2) in an experimental design that supports the disentangling of the constructs of interest, and
3) on a large and diverse sample of 2,200 participants to increase statistical power and generalisability.
Step 1: We conducted Study 1 initially and pre-registered the basic predictions at https://aspredicted.org/N85_2JV. This involved completing a straightforward online form to preregister our predictions, as well as stating how we would determine our sample size and what approach we would take to data analysis. It can take as little as half an hour to complete this form. Once preregistered, you can provide the link to the preregistration to reviewers. It’s important to note that the preregistration will remain private until you click publish, which will make it a public document accessible to everyone. But you decide when to do that. Because Aspredicted.org has a tight word count, in the next step we made all our study materials available through the Open Science Framework (OSF).
Step 2: OSF allows researchers to upload their preregistration form, such as the one from Aspredicted.org, as well as all their study materials, including data sets and analysis files. Note that once you upload things to OSF you can choose whether they remain private or become public. If you decide to keep the materials private, please be aware that after a certain period of time OSF will publish them.
As for our project, all materials for Study 1, together with all anonymised data and the scripts for simulations and analyses, can be found on the project’s page on the OSF: https://osf.io/59quk/?view_only=18414ad4433a4145a718f7015c012e36.
Step 3: In the final step, we submitted an application for a registered report.
Registered reports are a form of empirical article in which the study or experiment proposal is peer-reviewed before the research is conducted. Proposals that are deemed high in rigor and scientific standards are then provisionally accepted before the results of the study are known. This separates the merit of the proposed research from the results obtained and so reduces publication bias.
To ensure full transparency and rigor in the project, we submitted the project overview, results of Study 1, and planned experimental design and planned analyses for Study 2 as a registered report to the Journal of Experimental Social Psychology. In recognition of the contribution of the proposed project, it has been accepted in-principle after Stage 1 peer-review. This means that we are now collecting data following feedback.
I engage in pre-registration and sharing of anonymised datasets, full materials, and analysis pipelines across the majority of research projects that I work on. However, this has been my first experience of submitting a registered report to a journal.
I have found the process rewarding in that we have received peer-review comments on our study design, materials, and proposed analyses rather than on the results themselves and whether they align with our predictions. Given the size and cost of the present project, submitting as a registered report has removed the risk that the results would impact its publication likelihood.
"The single largest benefit of Open Research to my own research lay with the fact that others implementing good Open Research practices meant that I could easily access scripts for analysis and experimental design. This allowed me to develop my own skills in programming and subsequently implement these good Open Research practices myself."
I was introduced to Open Research at the outset of my doctoral research programme. Hence, I applied Open Research to the empirical studies that contributed towards my doctoral thesis.
I used a range of different Open Research practices to ensure that my research was transparent and easily verifiable. Primarily, this involved making all data and analysis scripts openly available, allowing independent researchers to easily reproduce and verify results. The analysis scripts were all written using the free and open-source programming language R (see r-project.org), with both data and analysis scripts made available on the Open Science Framework (OSF; osf.io). The OSF is a professional online repository where researchers can make projects, data, and materials available to the wider research community and the public. For example, my entire PhD thesis can be found at this link: osf.io/dnqxz/?view_only=e68968defbcc471dbcd1b1561ec515f6, with my public profile being found here: osf.io/8se7g, which also includes a pre-registered study I was a collaborator on (see Röer et al., 2022).
In addition, I became involved in the development of Open Software during my PhD. This stemmed from being offered the opportunity to collaborate on a paper detailing an R package developed by my supervisor Dr Jim Grange and colleagues. This was my first insight into package development, which I then used to develop my own, somewhat basic, R package (see github.com/s-b-moore/sdtt).
Initially, my reasoning for implementing Open Research practices within my own work was to allow independent researchers to verify and reproduce my results with ease. Transparency in relation to data and statistical analysis is crucial to ensure the integrity of the research being conducted, guarding against the possibility of data manipulation or inappropriate analytical practices (e.g., p-hacking). However, throughout my PhD study, my reasons for implementing Open Research practices developed. Given that I was still a relative newcomer to R, I ensured that my code was well commented, clear, and simple enough to understand for novice users. Therefore, rather than simply providing some transparency to my research, as in my initial view, I hoped that by sharing my analysis scripts new or less experienced R users could use these as resources to help develop their skills and, in turn, implement Open Research practices in their own research.
This reasoning also ties with my interest in Open Software. For instance, implementation of mixture modelling in visual short-term memory research prior to development of the “mixtur” package typically required either an institutional or personal subscription to proprietary software, something which may not be possible for smaller research institutions or independent researchers. The free and open-source nature of R (as well as some other programming languages) offers the ability for researchers to explore different topics—such as the application of mixture modelling—that may have previously been unavailable to them due to the high cost of proprietary software. By increasing the ability of researchers to explore these areas of research, this could bring with it a range of new ideas and innovation.
There are a multitude of reasons why Open Research benefitted my programme of research, from increasing the transparency of my research, to ensuring results were easily reproducible and verifiable. However, I feel that the single largest benefit of Open Research to my own research lay with the fact that others implementing good Open Research practices meant that I could easily access scripts for analysis and experimental design. This allowed me to develop my own skills in programming and subsequently implement these good Open Research practices myself. As I mentioned before, I was a newcomer to programming when starting my PhD. I believe that having access to scripts written by others with more experience allowed me to develop my skills at a much faster rate than I may have done otherwise, giving me more time to focus on producing quality research, while also allowing me to explore more complex analyses and experimental design.
Perhaps the biggest challenge I faced when attempting to implement good Open Research practices within my PhD research was in developing the necessary skills. For instance, prior to beginning doctoral study, I had a very limited understanding of programming, with no practical experience at all. However, through a combination of guidance and support from my supervisor Dr Jim Grange, as well as lots of independent study, I was able to develop a level of skill which allowed me to write analysis scripts in R for all my PhD experiments. Eventually, I was also able to develop documentation and write manuscripts using R Markdown and the R package “papaja” (see Aust & Barth, 2022) which allows code and statistics to be embedded within text, ultimately using a combination of these to write my entire PhD thesis. Furthermore, this understanding of R helped me to then develop skills in other programming languages such as Python and JavaScript which allowed me to program experiments using PsychoPy (see Peirce et al., 2019) and Gorilla Experiment Builder (see Anwyl-Irvine et al., 2019).
Anwyl-Irvine, A. L., Massonié, J., Flitton, A., Kirkham, N. Z., & Evershed, J. K. (2019). Gorilla in our midst: An online behavioural experiment builder. Behavior Research Methods. doi: 10.3758/s13428-019-01237-x
Aust, F., & Barth, M. (2022). papaja: Prepare reproducible APA journal articles with R Markdown. R package version 0.1.1, https://github.com/crsh/papaja.
Grange, J. A., & Moore, S. B. (2022). mixtur: An R package for designing, analysing, and modelling continuous report visual short-term memory studies. Behavior Research Methods, 54, 2071–2100. doi: 10.3758/s13428-021-01688-1
Peirce, J. W., Gray, J. R., Simpson, S., MacAskill, M. R., Höchenberger, R., Sogo, H., Kastman, E., & Lindeløv, J. (2019). PsychoPy2: Experiments in behavior made easy. Behavior Research Methods. doi: 10.3758/s13428-018-01193-y
Röer, J. P., Bell, R., Buchner, A., Saint-Aubin, J., Sonier, R. P., Marsh, J. E., Moore, S. B., Kershaw, M. B. A., Ljung, R., & Arnström, S. (2022). A multilingual preregistered replication of the semantic mismatch effect on serial recall. Journal of Experimental Psychology: Learning, Memory, and Cognition, 48(7), 966–974. doi: 10.1037/xlm0001066
"My work makes it straightforward for all researchers within the CHEOPS science team and beyond to prepare, access and analyse CHEOPS observations using state-of-the-art techniques. The software makes it easy for users to openly share their own results and data."
I have developed a software package for the analysis of data from the European Space Agency’s CHEOPS mission. This mission uses ultrahigh-precision photometry obtained from low-Earth orbit to study the properties of planets orbiting bright stars beyond the solar system (exoplanets). The software (pycheops) is open-source [1] and fully described in a recently published paper [2]. The paper includes my analysis using pycheops of the CHEOPS “Early Science” observations.
CHEOPS makes very precise measurements of the small dip in brightness when an exoplanet passes between the Earth and its host star. With a suitable model, we can analyse this dip in brightness to obtain an accurate measurement of the exoplanet’s properties. The very high quality of the data from CHEOPS recently enabled us to make the first ever measurement of the shape of a planet outside the solar system.
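To give a sense of how small this dip is, the sketch below uses the standard first-order approximation that the fractional drop in brightness is roughly the square of the planet-to-star radius ratio. This is a deliberately simplified illustration that treats the star as a uniform disc and ignores limb darkening; it is not the model implemented in pycheops, and the function names are chosen here for illustration only.

```python
# Nominal radii in kilometres (IAU nominal values).
R_SUN_KM = 695_700.0
R_JUPITER_KM = 71_492.0
R_EARTH_KM = 6_378.1


def transit_depth(planet_radius, star_radius):
    """Approximate fractional dip in brightness: (R_planet / R_star) squared.

    Assumes a dark planet fully in front of a uniformly bright stellar disc.
    """
    if planet_radius <= 0 or star_radius <= 0:
        raise ValueError("radii must be positive")
    return (planet_radius / star_radius) ** 2


def radius_ratio(depth):
    """Invert the approximation: radius ratio implied by a measured depth."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return depth ** 0.5


# A Jupiter-sized planet dims a Sun-like star by about 1 per cent;
# an Earth-sized planet by less than 0.01 per cent.
jupiter_depth = transit_depth(R_JUPITER_KM, R_SUN_KM)
earth_depth = transit_depth(R_EARTH_KM, R_SUN_KM)
```

The difference of roughly two orders of magnitude between the two cases illustrates why very precise photometry is needed to study small planets.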
I taught myself how to write and distribute multi-platform numerical analysis software in Python. I developed a fast and accurate model for the transit of a Sun-like star by an exoplanet. I used feedback from members of the CHEOPS science team to make pycheops user-friendly and reliable.
I had to develop a new algorithm that was accurate enough to model ultra-high precision light curves while being fast enough to be accessible to users without access to large and expensive computing facilities. I also had to learn how to write and distribute an open-source, multi-platform software package to a professional standard.
My work makes it straightforward for all researchers within the CHEOPS science team and beyond to prepare, access and analyse CHEOPS observations using state-of-the-art techniques. The software makes it easy for users to openly share their own results and data, e.g. to export their results in a format suitable for upload to a public database.
Working with the CHEOPS Science Team enables me to attract high-quality postgraduate students to help me with the analysis of these data. The software has also enabled undergraduates on the Year-3 Astrophysics group project module to analyse data from NASA’s TESS mission to find suitable targets for me to follow-up with more detailed observations.
[1] https://github.com/pmaxted/pycheops
[2] Maxted et al. Analysis of Early Science observations with the CHaracterising ExOPlanets Satellite (CHEOPS) using PYCHEOPS. Monthly Notices of the Royal Astronomical Society, in press. arxiv:2111.08828