Date
Attendees
Committee Member | Present | Absent |
---|---|---|
X | ||
X | ||
Pihl, Todd (NIH/NCI) [C] | X | |
X | ||
X | ||
Gregory Tawa | X | |
William Hendriks | X | |
Matthew Breen | X | |
Roel Verhaak | X | |
X |
Goals
- Review, discuss and prioritize ICDC study submission proposals
Outstanding Action items
- Kuffel, Gina (NIH/NCI) [C] to send out an email containing the candidate publication list.
- Kim, Erika (NIH/NCI) [E] to evaluate GitHub pipeline
- Musk, Philip (NIH/NCI) [C] & Kuffel, Gina (NIH/NCI) [C] to discuss and develop an approach for obtaining permission from the original authors of data used for re-analysis.
Discussion items
Item | Who | Notes |
---|---|---|
Candidate Publications/Data for ICDC | | |
New Data Submission | Kuffel, Gina (NIH/NCI) [C] | |
Study Updates | | |
Feature Updates | | |
Material Transfer Agreement (MTA) | Kuffel, Gina (NIH/NCI) [C] | |
RNA-Seq broad normalization | | |
Minutes (Not Verbatim)
GT- From a high level, my recommendation would be studies 1 & 7 from the candidate publication list. The other studies have limited data available.
WH- Thinking back to a year or more ago, there was a prior, similar effort. We did some reaching out to authors about submitting data; where is the documentation for that effort? David Adams from Wellcome is one example, among others.
TP- Prior to me stepping in.
WK- Regarding the first paper, the comparative genomics paper: they have human methylation data. Should we represent that data?
WH- These data would broaden the types of data in ICDC, but I wonder whether the focus should be more on omics-type data.
GT- So the Osteo paper would be the right type of study to bring in then?
WH- There has been so much change in ICDC that we need more continuity before proceeding.
WK- Slight amendment: it may not be redundant. ICDC is in a much different place now, so we could circle back with the authors who were previously contacted.
WK- The harmonization pipeline is valuable; we should send an acknowledgement to the authors who derived the data before proceeding. We should import only the harmonized data; we don't need to bring in the original data.
GT- The harmonized data is good to have; maybe the original BAM files would be good to bring in as well for reproducibility purposes.
RV- A slightly different perspective: if we look at Pan-Cancer studies in the human space, they provide aggregate data in portals and use cBioPortal and others to distribute other files. I am opposed to having BAM files in multiple places.
GT- You could pull the raw data from NCBI, but we need the ability to perform queries across multiple datasets. If data is stored outside of ICDC, will we still be able to run queries effectively? I'm interested in cohort-building tooling and queries.
WK- The important part would be the cohort-building query.
WK- With the careful approach of not bringing in the original data, we should just make sure to reach out to the original authors.
WH- Agreed that there is great value in being able to point to existing datasets, but let's not add noise to the system. Shaying's pipeline has not necessarily been validated by the community; it was developed by a single group.
GT- It would be great to launch a harmonization pipeline that could run in an automated fashion without much effort.
Action items
- Kuffel, Gina (NIH/NCI) [C] to compile list of data that is NOT publicly accessible elsewhere for Pan-Cancer study submission
- Kuffel, Gina (NIH/NCI) [C] to generate stock email for authors of original data for Pan-Cancer study submission
- Kuffel, Gina (NIH/NCI) [C] to locate previous candidate publications list and revisit communications with authors