This module covers making data for AI/ML FAIR (Findable, Accessible, Interoperable, and Reusable).
Development of these FAIR AI/ML training modules was supported by a supplement to NIH/NIGMS T34GM118272 (D. Julian, PI; 08/01/2021-05/31/2023).
Session 1
Before the session
Before starting this session, please take this quick pre-assessment.
Hands-on exercise
Exercise 1: A hands-on exploration of data in spreadsheets.
Objectives
By reviewing data sheets that are organized in a common but inappropriate way, students recognize the importance of organizing data so that others can understand it.
Learning Outcomes
- Students can explain why standardizing descriptions and organization of data is important.
- Students can strategize about the best ways to organize data.
Session 2
Before the session
Prior to session 2, students should complete the activity for Exercise 2: Searching the literature for published datasets.
Objectives
Students recognize the value of accessibly archived data by experiencing the challenges of accessing data from published papers.
Learning Outcomes
- Students can explain why accessible data archiving is valuable.
- Students can provide strategies for getting data from published papers and anticipate challenges to accessing the data.
- Students can define FAIR and identify the components of FAIR data.
- Students can summarize steps involved in FAIR data collection, management, and deposition.
- Students can define metadata and summarize key components of metadata.
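To make "key components of metadata" concrete, here is a minimal sketch of a metadata record and a completeness check. The field names, the `REQUIRED_FIELDS` list, and the `missing_fields` helper are illustrative assumptions, not a formal metadata standard or part of the course materials; real repositories define their own required fields.

```python
# Hypothetical set of metadata fields; an assumption for illustration only.
REQUIRED_FIELDS = (
    "title",
    "creator",
    "description",
    "date_collected",
    "license",
    "identifier",
    "variables",
)


def missing_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Made-up example record; none of these values describe a real dataset.
record = {
    "title": "Example heart-rate measurements",
    "creator": "A. Student",
    "description": "Resting heart rate recorded once per participant.",
    "date_collected": "2022-03-01",
    "license": "CC-BY-4.0",
    "identifier": "",  # no persistent identifier assigned yet
    "variables": {
        "heart_rate": {"unit": "beats per minute", "type": "integer"},
    },
}

print(missing_fields(record))  # ['identifier']
```

The empty `identifier` is flagged because a dataset without a persistent identifier is harder to find and cite, which is the "Findable" part of FAIR.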
Presentation
Additional Resources
- For a closer look at the Cardiovascular Disease Ontology, check out the .obo file here: https://github.com/OpenLHS/CVDO
- For a nice, graphical interface to an ontology, check out the Disease Ontology.
- Another good example is the Plant Phenology Ontology.
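For readers curious about what an .obo file contains, the sketch below reads `[Term]` stanzas from a small OBO-style text using only the Python standard library. The sample terms and the `EX:` identifiers are made up for illustration and do not come from the Cardiovascular Disease Ontology; `parse_terms` is a hypothetical helper that ignores many details of the full OBO format (escapes, trailing modifiers, other stanza types), so a dedicated ontology library is a better choice for real work.

```python
# Made-up OBO-style text; the EX: identifiers are not real ontology terms.
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: EX:0000001
name: example disease
def: "A made-up term used only for illustration." []

[Term]
id: EX:0000002
name: example heart disease
is_a: EX:0000001 ! example disease
"""


def parse_terms(text):
    """Collect the tag-value pairs of each [Term] stanza as a dict of lists."""
    terms = []
    current = None  # the stanza being filled, or None outside a [Term] stanza
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are kept.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            # Drop the human-readable comment that follows " ! ".
            value = value.split(" ! ", 1)[0]
            current.setdefault(tag, []).append(value)
    return terms


for term in parse_terms(SAMPLE_OBO):
    print(term["id"][0], term["name"][0], term.get("is_a", []))
```

Each term carries a stable identifier, a human-readable name, and `is_a` links to parent terms. Those links are what let software treat "example heart disease" as a kind of "example disease" when searching or combining datasets.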
Session 3
Exercise 3
Exercise 3 will introduce you to data repositories.
Objectives
- By searching data repositories for datasets, students recognize the value of shared data and develop skills in searching for data.
- Students learn about domain-specific and generalist repositories.
- Students evaluate metadata associated with data found in repositories.
Presentation
Additional Resources
- DataONE Data Sharing and Research Data Management handout.
- Higman, R., Bangert, D. and Jones, S., 2019. Three camps, one destination: the intersections of research data management, FAIR and Open. Insights, 32(1), p.18. DOI: http://doi.org/10.1629/uksg.468
- European Commission, Directorate-General for Research and Innovation, 2018. Turning FAIR into reality: final report and action plan from the European Commission expert group on FAIR data. Publications Office. https://data.europa.eu/doi/10.2777/54599
View the source repository on GitHub
Instructors wishing to use these resources can apply for access to the instructor repositories, which contain additional lesson plans and discussion prompts.