Back to THEME 3- Data-Driven Approaches session page
Session Lead: Bill Lytton
IMAG Moderators: Elebeoba (Chi-Chi) May (NSF), Ken Wilkins (NIDDK)
Breakout Session Notes:
- Introductions (name and affiliation): interest stated verbally
- Session 1 (with some Session 2 additions):
- Bill Lytton - SUNY Downstate
- William Barnett - Georgia State
- Ted Dick - Case Western Reserve U
- Chase Cockrell - Univ. Vermont
- Ashlee Ford Versypt - OSU
- Reinhard Laubenbacher - UConn School Med / Jackson Labs
- Jacob Barhak - Independent
- Ahmet Erdemir - Cleveland Clinic
- Yiling Fan - Massachusetts Institute of Technology
- Mark Palmer - Medtronic
- Kristofer Bouchard -
- Haroon Anwar -
- Mohamed Sherif - Yale
- Elsje Pienaar - Purdue
- J. L-Duke - UVa
- Amy Gryshuk – Lawrence Livermore National Laboratory
- Amanda
- Susan Wright - NIDA
- Tong Song - USC
- Simon Giszter - Drexel
- Gunnar Cedersund
- Mark Alber - UC-Riverside
- Andrzej Przekwas - CFD Research Corporation
- Amir Barati Farimani - Carnegie Mellon
- Dong Song - U. Southern California
- General Comments
- AFV- Point of semantics and how “Data Driven” approaches are viewed by agencies.
- BL - related to the connection to MSM, certainly agent based applies
- Discussion of meaning/why data driven
- BL - Contrast with theory driven (vs Data Driven)
- Jacob - what type of data is available
- -- TD - described data type producer
- --WB - as consumer of data, discussed goals
- --BL - What percent of NIH funded projects generate public data
- -- Discussion of usable data and formatting it in a way that is usable
- -- Ken - discussion of FAIR principles for generating database repositories that are actually used by the community; want to also share models as well as data
- --Jacob - discussed things that deter use of public databases - manipulating data into a usable form
- -- Ken - points to metadata issues (mentioned NLM efforts to make things interoperable) / To add a link / Discussion of standards (e.g. FHIR)
- --SW - NIH announcement related to FHIR
- --BL - Discussion of ModelDB. Issues there related to metadata for models
- DATA DRIVEN vs THEORY DRIVEN
- CC - draw distinction between data driven and statistics driven models
- BL - ML tends to be iterative
- KW - task at hand and what you're trying to drive towards and how that determines methods used; different methods
- ABF-- Models for molecular dynamics for protein modeling
- LOOSE ENDS FROM THE ODE Day 1 Discussion
- AFV - ABM simplified to ODE, but others using it for spatial scale resolution and more comparable to PDE
- MA - Emphasize stochastic aspects of ABM
- AFV - stochastic based methods - how ML connects more to methods
- ABF - ML connected to generative vs ___ models; learning stochastic PDEs using ML
- MA - getting access to medical data. Huge challenge.
- KW - not only those that hold the data but those that are contributing the data - they have a stake and responsibility to that group as well.
- Jacob - restrictions on medical data (HIPAA) and messy data in clearinghouse sites.
- KW - discussed some NLM efforts to investigate data issues related to clinical trials/studies (to add link)
- Build on current state of the art Data-Driven Models
- BL/KW - early stage; what we need next is better databases and better integration of DBs
- KW - How to integrate animal model data with human model; particularly where animal data informs
- MA - Digital twin for Animal
- GC - animal is 4th M in M4 - have rodent version of model; also working on organ on a chip
- MA - do we move from microfluidic to animal to blood?
- GC - modular approach - sort pieces based on functional panel then use iterative methods to characterize in various species then can move to species comparison;
- GC - feature specific comparison for well characterized features
- GC - need to solve reproducibility not just in models but on the experimental side; using systems approach to assist with this
- KBouchard - what is the incentive to enabling cooperativity in considering reproducibility
- KW - journal/publication based motivation
- ABF - re transferability of models (animal-human-patient) - (methods to address) transfer learning methods, multi-task learning methods; design architecture where the transferability features can be associated with params
- Jacob - govt role in reproducibility
- KW - need not just "talk" but culture shift to address reproducibility
- GC - re the incentive portion - once systems level becomes standard for integrating data/if that becomes the expected (scientific norm) then reproducibility will solve itself
- KB - What we mean by a useful ML model needs to be considered. In general, ML models do not give insight into the structure of the data but reproduce it very well. Is that satisfactory? Have we learned about bio
- Build on current state of the art models for Digital Twins
- KW - Methods involved in digital twin
- GC - requirements for data and use of data is quite different for mechanistic models vs ML methods;
- GC - once a phenotypic observation in data, that is sufficient to start mechanistic model (very little data as long as your observations are reproducible)
- GC - ML requires significantly more data b/f getting started
- KB - at what level of resolution are we defining digital twins
- MA referenced and summarized discussion from ODE breakout on Day1
- BL - application dependent - referenced hypertension
- Jacob - question of resolution is related to how accurate do you want to be - and question to agencies is what is good enough -
- Comment - the accuracy question has many stakeholders
- MA - the question of dynamics is impt - b/c digital human under some type of stress/condition/treatment
- GC - setting the expectation in the right frame is key. Need to avoid the fallacy that, because a model cannot be 100% accurate, there is no point in trying. However current clinical practice is oftentimes pedagogics
- KW - true for several diseases that are highly associated with lifestyle changes
- GC - problem of adherence but use of simulation could motivate adherence
- MA - may not be as effective with elderly. Discussed behavior and that is a key factor
- - Why would a patient respect the authority of a digital twin vs a doctor
- BL - frequency of digital twin updates and reminder vs less frequent interaction with doctors
- GC - Digital twin can help motivate discussion around behavior that impacts health and less expensive and more accessible than doctor/health specialist. Leads to participatory medicine.
- KB - If public not as in tune with science and questioning fact-based studies, how much will they believe the predictions
- GC - trust will build over time
- KW - think of weather forecasting -- population relatively trusts
- MSherif - how we use the digital twin
- GC - should not dismiss because of not having 100% accuracy (refer to discussion from yesterday)
- Simon - US implementation may be more problematic (vs Sweden), as may protection of information from insurance companies, etc. Security and management of data related to DT is an issue.
- MA - car insurance companies and use of ML and how that relates to decisions that affect individuals
- ML-MSM integration opportunities
- MA - digital twin for research vs clinical would be different; digital twin has been created for specific conditions; doctors/clinicians are conservative
- MA - gap between clinical community and comp modeling community
- MA/KW - there are challenges to bring in DT into clinical practice (ethical and legal issues, etc.) - need champions within clinical realm
- KW - issue of getting access to data for underrep populations
- BL - DT as electronic diary that informs clinicians (MA - useful for monitoring elderly)
- GC - As we have more personal monitoring devices, the public will expect clinicians to take that into account. DT can provide a summary of that data and provide a state variable describing patient status
- -- Why not build model that replaces reliance on clinician (KW - expert systems)
- Challenges ML-MSM modelers should address
- MSherif - physician may use ML and multiscale modeling to help inform decisions
- KB-- Google satisfied with black box model; what would be great is to be able to have predictive model and be able to extract that knowledge
- KB-- explainable AI is different; want to know the representation that the network is learning - intermediate representation constraining those
- Some resources mentioned during discussion
- Data sharing/re-use, to enable Data-Driven aspects of approaches
- Data repositories:
- Clearinghouse lists of data repositories: re3data.org
- Biomedical-focused repositories affiliated with NIH: https://www.nlm.nih.gov/NIHbmic/nih_data_sharing_repositories.html
- FAIR: Findable, Accessible, Interoperable, Re-usable Data principles: https://www.go-fair.org/fair-principles/
- Digital Object Identifiers (DOIs)
- now available for NIH-funded research via nih.figshare.com
- Common Data Elements and other emerging standards via, e.g., NLM common data element repository under https://cde.nlm.nih.gov/
- Model sharing/re-use, to inform Digital Twin approaches
- Research Resource Identifiers (RRIDs) https://www.rrids.org/
- MSM-integration opportunities
- Challenges cited