
Date

Attendees



Goals

Discussion items

Time | Item | Who | Notes
Review of current study status | Philip

Notes from today:
Reviewed current status of studies
-Glioma done from case point of view, migrating data from SRA to ICDC bucket.
-COTC007b reannotated and reloaded.
-NCATS resubmitting with annotations per new guidelines and getting BAM files to replace FASTA files.
-Vemurafenib - working on aligning data with guidelines. Coming along nicely. Sequences already in GEO - need to be transferred to ICDC.
-Knapp cross-sectional study - starting alignment with guidelines. Sequences already in GEO - need to be transferred to ICDC.
-Shaying - not started yet.


Discussion of TGN-TFT-OSU-OSA01 | All

Maps nicely to existing data model. Uses existing sample-level annotations that were anticipated but not previously used (primary tumor vs. metastatic tumor). WGS, WES and RNA-seq data for 59 dogs. Very similar to the Glioma study. Cancer type: osteosarcoma.
Matthew Breen is interested in quality scores since Osteosarcoma is notoriously difficult to work with.
Do these samples come from CCOGC? Tawa/Breen. Ohio/Colorado - Canine Comparative Oncology and Genomics Consortium. $2M project to collect 7 types of cancer. Specimens submitted to Frederick repository. Samples given to Ohio/Colorado to analyze (P30 grantees). Data may be duplicated if multiple sites analyzed the same samples. Submissions must include CCOGC identifiers so that unique samples can be tracked.

From Matthew Breen:

The CCOGC collected and stored biological specimens from almost 2,000 dogs, each with one of seven different cancers

lymphoma
osteosarcoma
melanoma
mast cell tumor
pulmonary carcinoma
hemangiosarcoma
soft tissue sarcoma

For every specimen collected, a unique identifier was provided to identify the animal, beginning CCB———— and the recipients of all CCOGC specimens were asked to maintain this identifier at all times to retain the opportunity to allow data from the same patient to be merged at some point in time. The CCOGC provided numerous specimens to investigators across the world, and importantly provided NIH-P30 recipients with specimens at no charge. 

The issue we may see is that while a CCOGC collection site was collecting and submitting patient specimens to the CCOGC repository, the sites often set aside some of the specimens for their own use and stored these locally. It is possible that an investigator submitting data to ICDC may have data from a dog that was assigned an institutional identifier, but that the same dog was also assigned a CCOGC identifier. It is important for us to be able to connect these identifiers to maximize the impact of the data and to have full awareness of any data duplication.

The NCATS samples should all have a CCOGC designated CCB identifier as that was the source of the specimens. However, it is possible that data from other sites, especially CCOGC collection sites, may have used their own institutional identifier and not the CCOGC identifier, since the actual specimen they used never left their site. It is therefore possible that one dog may have two or more data sets submitted to ICDC and so these cases need to be identified.

It is likely that the investigators from each data generating site are fully aware if any sample was also replicated as a CCOGC specimen, but we need to double check and cross reference. I suggest that in addition to asking each submitter if any of the biological samples they processed were provided to them by CCOGC (in which case they should provide the CCOGC’s CCB number), we also ask if any of their biological samples were from a dog for which a replicate specimen was ALSO submitted to CCOGC, even if the specimen they used was not sent to CCOGC.

Several major vet schools also house their own biospecimen repositories and some make these specimens available for others. We should also cross reference these instances to identify duplicate samples. 
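
Illustration of the cross-referencing described above: a minimal Python sketch, assuming each submitted record carries a study name, an institutional identifier, and (when known) a CCB identifier, and that submitters can supply a crosswalk from institutional identifier to CCB identifier. The function name, field names, study names, and identifier values are all hypothetical placeholders; they do not reflect the ICDC data model or the real CCB number format.

```python
from collections import defaultdict


def find_duplicate_dogs(records, crosswalk=None):
    """Return (duplicates, unresolved) for a list of submitted records.

    records:   dicts with hypothetical keys 'study', 'institutional_id',
               and 'ccb_id' (None when the submitter did not supply one).
    crosswalk: optional dict mapping (study, institutional_id) -> ccb_id,
               e.g. collected from the submission questionnaire.
    """
    crosswalk = crosswalk or {}
    studies_by_ccb = defaultdict(set)
    unresolved = []
    for rec in records:
        # Prefer the CCB identifier on the record; fall back to the crosswalk.
        ccb = rec.get("ccb_id") or crosswalk.get(
            (rec["study"], rec["institutional_id"])
        )
        if ccb is None:
            # No way to tell yet whether this dog also appears elsewhere.
            unresolved.append(rec)
        else:
            studies_by_ccb[ccb].add(rec["study"])
    # A dog is a potential duplicate if its CCB identifier spans several studies.
    duplicates = {
        ccb: sorted(studies)
        for ccb, studies in studies_by_ccb.items()
        if len(studies) > 1
    }
    return duplicates, unresolved


# Made-up example data; identifiers are placeholders only.
records = [
    {"study": "STUDY_A", "institutional_id": "A-001", "ccb_id": "CCB_X1"},
    {"study": "STUDY_A", "institutional_id": "A-002", "ccb_id": "CCB_X2"},
    {"study": "STUDY_B", "institutional_id": "B-17", "ccb_id": None},
    {"study": "STUDY_B", "institutional_id": "B-18", "ccb_id": None},
]
crosswalk = {("STUDY_B", "B-17"): "CCB_X1"}

duplicates, unresolved = find_duplicate_dogs(records, crosswalk)
print(duplicates)
print(unresolved)
```

Records that land in `unresolved` are the ones where the submitter would need to be asked for the source of the sample.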


***Need to ask Heather if these are CCOGC samples.
Lymphoma, Melanoma, Osteosarcoma.
Loop in Amy/Christina to compare against who got what. Provide this list to submitters to compare against. Need to know from where the samples originated.
SRA does not show other identifiers.
***Ask Heather and Will if they know if these samples are from CCOGC. Matthew Breen will formulate question and email me.

publication: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6642146/


Approved for prioritization.


Prioritization | All

Does this supersede Shaying's study?
Potential duplication in Osteosarcoma complicates the issues.

**Need to go back to previous projects and see if their samples could possibly be in other repositories as well.
Maybe we should prioritize this to sort out the CCOGC issues (Tawa). Go back to other studies and request other identifiers (Kibbe).
Shaying may also have samples out of CCOGC.
Matthew Breen will contact Christina Mazcko and see if she can make available the CCB numbers for CCOGC. CCB numbers are supposed to be permanently attached to samples to be able to track.

Prioritization tabled until next meeting to give us time to answer the questions that came up today.

**Add question to submission questionnaire - what is the source of these samples?

Action items

  • Matt to email Heather Gardner regarding CCOGC sample IDs.  (Done 5/12/2020)
  • Matt to create Study Prioritization Listing (Done 5/12/2020)
  • Matt to propose edits to DGAB Guidelines to include SourceIDs (Done 5/11/2020)