Model Repositories - Next Steps


 

Session Description:  

Goals of this MSM breakout session

  1. Determine the interest in the community for PubMed to add links to a model repository.
    1. Do we think model repositories are useful? 
    2. What kinds of models would you like to see in a repository?
    3. Archiving of tutorial material for modeling fields. Should every model be represented by a published paper, a tutorial paper, or a textbook (with many models)?
  2. Canvass support to write a white paper to submit to NLM in support of such functionality
    1. Need infrastructure and funding for curation and long-term sustainability
    2. Establish a group to produce the white paper by the FNLM meeting in summer 2019
  3. Sketch out white paper ideas, which could include:
    1. Define standards and methods of maintaining standards for the several aspects of modeling in research

2017 MSM Webinar Follow-up from Dr. Patti Brennan, Director of NLM (YouTube link at the top)

Barbara, 7 March

Different levels of curation

Need transparency. Jos, Zonk: software deposit, quick check, gets a DOI?

Model compatibility: different software, hardware.

Data in models may be more useful than the code itself in some instances. Need a method that will allow citing data separately from the model.

Thanks to Herb for moderating!

Speaker Bios and Abstracts:

https://ods.od.nih.gov/About/Barbara_Sorkin.aspx

Orlando Lopez, PhD, Program Director, Dental Materials & Biomaterials Program, National Institute of Dental and Craniofacial Research (NIDCR) / NIH, orlando.lopez@nih.gov

Interactive Discussion (please put your name before your comments):

BCS, added 6 March: From Herb Sauro

Here are a number of desirable attributes for a model repository. This list is based on experience gleaned from BioModels, which is a well-established model repository. If the US does choose to create a repository, it should be as good as or better than BioModels; otherwise there is no point in spending the tax dollars and ending up with something inferior. If that were the case, we should use BioModels, which has permanent funding.

1. Are the models curated, i.e., do they work as published?

2. How easy is it for someone to download the model and get it running on their computer?

3. Are the models reusable, i.e., can parts of the model be separated out, or can the model be used to build bigger models?

4. Are the models readable (i.e., they are not just a mass of impenetrable computer code)?

5. Is the model in a form that would allow it to be reproduced, i.e., is it possible to recreate the results of the model based on the given biological description?

6. Is the repository fully searchable?

7. Are the models annotated in any way?

8. What metadata other than annotation is provided with the models?

9. Are any data or modeling standards used or is it just ad hoc code and data?

The other thing to consider, which I didn't mention, is long-term funding. In the last 15 years or so there have been at least two attempts to create a model repository in the US, one funded by the NIH and the second funded by the NSF. Neither survived once funding stopped. There may be others I am unaware of. To avoid this, the repository would have to be funded as a long-term resource, much like BioModels.
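As a rough illustration only, the per-model questions in the list above could be recorded as a simple checklist attached to each repository entry. The sketch below is not taken from BioModels or any existing repository: the class name, field names, and helper function are all hypothetical, and question 6 (searchability) is left out because it describes the repository as a whole rather than a single model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelRecord:
    """Hypothetical repository entry; every field name here is an assumption."""
    title: str
    curated: bool = False            # Q1: verified to work as published
    runnable_download: bool = False  # Q2: easy to download and run locally
    reusable: bool = False           # Q3: parts can be separated or composed
    readable: bool = False           # Q4: not just impenetrable code
    reproducible: bool = False       # Q5: results can be recreated from the description
    annotations: List[str] = field(default_factory=list)     # Q7
    metadata: Dict[str, str] = field(default_factory=dict)   # Q8
    standards: List[str] = field(default_factory=list)       # Q9


# Fields checked by the helper below, in the order of the questions.
CRITERIA = [
    "curated", "runnable_download", "reusable", "readable",
    "reproducible", "annotations", "metadata", "standards",
]


def unmet_criteria(record: ModelRecord) -> List[str]:
    """Return the names of criteria that are False or empty for this record."""
    return [name for name in CRITERIA if not getattr(record, name)]


# Example values are made up for illustration.
example = ModelRecord(
    title="Example model",
    curated=True,
    readable=True,
    standards=["SBML"],
)
print(unmet_criteria(example))
```

A checklist like this would also give a concrete way to ask which criteria are critical and which could be relaxed for a first version of a repository.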

Barbara C Sorkin comments:

One thing that we’ve considered in the repository discussions I’ve listened to (chemical structure data and models) is what the goal is (the ideal repository) and what is a good-enough step that is feasible now. If the two are the same, super! If not, something that is feasible (to curate and maintain, and to get people to contribute their models!!!), and that can be refined over time towards the ideal, is better than waiting for the ideal (not letting the perfect be the enemy of the good).

Which of those criteria are critical, and which are dispensable, if need be, in the interests of having something?

Also critical to consider, how will models and metadata be uploaded, and how will search results and model code and other info be downloaded?

How much time will it take for a modeler to upload their model? What will incentivize that time investment?

Perhaps it would be wise for future repository planners to include sustainability planning as part of the effort.

Herbert Sauro: The funding issue is one of my biggest worries, and I understand the reluctance of agencies to commit long term. A sustainability plan is good in theory but hard to devise in practice. It might be possible to charge for certain services, but one would have to look at the economics to see if it were viable. This is perhaps the main reason why I would also suggest talking to EBI/BioModels and teaming up with them to add value to what already exists. Make it a joint US/European venture. If US funding were to falter at a future date, it wouldn't be such an issue because BioModels will continue, since it is an EBI service. I have some BioModels usage statistics from July 2018: 880,000 page views per month from around 18,000 unique hosts visiting the EBI instance of BioModels, so it is a heavily used resource. I don't have download numbers, which would be better, but I bet they are huge. The mirror at Caltech has about half that number, still substantial. If we did team up with EBI, the question then would be: what would we contribute? Perhaps a more substantial mirror site with a small curation team based in the US for dealing with models that aren't currently catered for by BioModels. These would include most spatial models as well as multicellular models of tissues and organs.

 

Orlando Lopez (comments):

Consider establishing a framework for standardizing the format and types of information entered into a modeling repository:

Ken Wilkins (comments):
Here's the follow-up to my comment on being aware of the funding/economic context for your intended white paper (happy to join the discussions/paper-prep sessions that follow this well-moderated breakout session; I can't represent NIH, however):

https://www8.nationalacademies.org/pa/projectview.aspx?key=51436 

A link to the National Academies project initiated by the National Library of Medicine (NLM) on "Forecasting Costs for Preserving, Archiving, and Promoting Access to Biomedical Data", where you can sign up to hear more discussions between the panel and NLM leadership in coming months [soonest being Tuesday, March 12th at 10am on a Zoom livestream]. Looking forward to contributing to and following developments of this breakout group...also will benefit in setting up a modest pilot repository of data-meets-models for machine learning in patient-centered outcomes research (PCOR); for more on the relevant trans-HHS efforts in building data/modeling capacity in PCOR, see reports and FAQs under https://aspe.hhs.gov/patient-centered-outcomes-research-trust-fund-reports/

Jacob Barhak (comments):

Barbara Sorkin asked that this information be made public. Here are some references on why government-supported research products should be made publicly available:

1. The new NIH strategic plan for data science. Online: https://datascience.nih.gov/sites/default/files/NIH_Strategic_Plan_for_Data_Science_Final_508.pdf That document states: “The goal of creating a more competitive marketplace, in which open-source programs, workflows, and other applications can be provided directly to users, could also allow direct linkages to key data resources for real-time data analysis.”

Moreover, this approach of making government-funded research products available to the public is supported by older government policies:

* http://blogs.nature.com/news/2013/02/us-white-house-announces-open-access-policy.html
* Fair Access to Science and Technology Research (FASTR) Act: Cornyn, Wyden Introduce Bill to Increase Access to Taxpayer-Funded Research. Online: https://www.cornyn.senate.gov/content/news/cornyn-wyden-introduce-bill-increase-access-taxpayer-funded-research

Submitted by Anonymous (not verified) on Thu, 03/07/2019 - 15:09

Jacob Barhak (comments):

Please add me to the white paper sharing list. Thanks for the great moderation.

Submitted by Anonymous (not verified) on Thu, 03/07/2019 - 15:35
