NIH | National Cancer Institute | NCI Wiki  


Document Information

Author:  Craig Stancl, Scott Bauer, Cory Endle
Email: craig.stancl2@nih.gov, scott.bauer@nih.gov,  cory.endle@nih.gov
Team:  LexEVS
Contract:   16X237
Client:  NCI CBIIT
National Institutes of Health
US Department of Health and Human Services


This document records the details of the December 2017 technical face-to-face meeting between the NCI and the LexEVS team.

2017  December Face-to-Face Meeting Notes 

Tuesday, December 5th, 2017

Time: 9:00 AM - 11:00 AM
Location: 4-W-034
Topic: EVS Status and Future Direction

Discuss EVS current state, trends, and future directions

  • Larry to give brief overview of EVS infrastructure, resources, and services.
  • Review overall technical workflow and architecture.
  • Group discussion of future possible directions and priorities.

Participants: Broad cross-EVS participation
Resources: EVS Project Architecture

Attendees: 

Jason Lucas, Scott Bauer, Larry Wright, Cory Endle, Kim Ong, Tracy Safran, Rob Wynn, Gilberto Fragoso, Margaret Haber, Kumar, Sherri De Coronado, John Campbell, Bron, Luba, Shamine, Craig Stancl

 

Discussion Points:

  • Housekeeping items
    • Reviewed agenda and approved - unless there are changes along the way.
    • WebEx will be live all day.
    • Goal is to record key tasks and wikis
  • Goal is to set context for the rest of the meetings this week and to start identifying the issues to be addressed.  
  • Larry would like to start completing the full EVS Project Architecture (including LexEVS).
  • General workflow for architecture:
    • Gather terminology content → Protege and MEME → loaded into LexEVS terminology service → accessed via Java API, REST service API, browsers
    • Architecture now needs to include the triple store database and its usage (REST API and native REST API). This service supports clinical trials (CTRP). The ability to make changes to the production service was a driving factor in moving to the TripleStore architecture. Loading into the TripleStore can be done nightly if needed; currently the loads are nightly.
    • OBO is currently being looked at as a third delivery channel.
    • Expected to have expanded use of services and downloads
      • Adverse events were the most-downloaded content, which is then used and built into other systems (CDISC, FDA, etc.).
    • Report Writer extracts value sets from LexEVS
      • Current work happening to create a SPARQL based report writer. Planned for early February.
        • This would be only on QA team.
        • External use would require a security layer (doesn't exist today)
        • Gilberto noted that report writer cannot currently take a search and return result set (with preferred names).  Noted this is "simple search" and could also be part of the term browser.
        • Existing templates should be able to be run without authentication.
  • The TripleStore still cannot provide all the terminology data needed for EVS; the remaining data is stored in LexEVS.
  • Mappings - need to determine how to capture and allow access to mappings.  The LexEVS model and triple store model do not provide the needed flexibility today.
  • Synchronization of data sources and coordination of distribution of data is an open issue.  Consideration is needed to provide an umbrella API (Federated) that serves both LexEVS and TripleStore content.
  • Gilberto noted that CTS2 services should be revisited and he'd like to review missing functionality and complexity (noted by CTRP).
  • LexEVS has been based on standards since the inception of the tooling.
    • As a terminology service, all the content has been loaded into the LexGrid data model.
    • Focus has shifted from standards to providing usable services to end users.
    • There are possibilities for enhancing the service today that still provide interoperability.
    • Thesaurus based use cases should be considered when determining goals.
    • CTRP's reason for using the TripleStore was speed in loading content. Loading the transitive table in LexEVS is a long process. Tracy noted that LexEVS could start to use TripleStore technology to remove that bottleneck when loading content.
    • Noted that existing applications may not be ready to transition to new services.
    • Primary goal of EVS is to provide terminology content to NCI customers and users to support the sciences.
      • Noted that no interest in re-designing
    • LexEVS provides a consistent model. 
  • Mappings
    • Currently default to the LexGrid XML format.   The other is RRF loader mappings.
    • No plans to load mappings into SPARQL.
    • ICDO3 Map from Meta - Kim looked at performance - not the best results.
    • Mappings in LexEVS in coding schemes.  
    • FTP will be the main distribution channel for Maps. (Tab-delimited text)
    • Review of CTS2 mapping support will help decide if additional functionality can be added.
    • No support for contextual mapping currently exists.
  • Systems Priorities
    • The LexEVS team has spent time this past year utilizing Docker. This will provide efficient deployments.
      • Current deployments are completed by following a written deployment document.
      • Containers will reduce the middle man needed for deployments.
      • Systems team has been trying to create an "approved" NCI container that the LexEVS team can use.
      • Will still need to continue running tests.
      • Need to talk with systems about environment.
      • Docker is operational at NCI.
      • Docker usage for data
        • Gilberto suggested promotion of data using docker.
          • Provide the "database" container to the systems team.  
          • This would help as it is deployed up the tiers.  
      • Docker distributions to be used for end users.
        • ASU was working on this type of container.  
        • Current scripts do 90% of what is needed.  Need to change configuration so it doesn't remove the services.  
  • QA
    • Currently there are test scripts created by Kim and Shamime(?)
      • Covers the majority of the usage.
      • Test scripts still being developed.  
      • Tin is still available for reference.
      • Tests should be available in Jenkins. (LexEVS team is currently doing this)
      • Docker shouldn't cause concern for QA.
    • Tech stack support
      • Confirm that Node.js and other technologies are supported.
  • Security Scans
    • 508 needs to be addressed.
      • Heroku and Swagger pages may need to be reviewed.
  • Lucene, Solr, and Elasticsearch
    • Jason noted that there might be need for discussion during the API focused discussion on Thursday.
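
The transitive table bottleneck noted above can be illustrated with a minimal Python sketch. It assumes the transitive table is a precomputed closure of the hierarchy, that is, one row for every (descendant, ancestor) pair. This is not LexEVS code; the function name and the (child, parent) edge representation are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def transitive_closure(edges: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Given (child, parent) edges, return every (descendant, ancestor) pair."""
    parents: Dict[str, Set[str]] = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    closure: Set[Tuple[str, str]] = set()
    for start in list(parents):
        # Walk upward from each concept, recording every ancestor reached.
        stack = list(parents[start])
        seen: Set[str] = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue  # already visited (also guards against cycles)
            seen.add(node)
            closure.add((start, node))
            stack.extend(parents.get(node, ()))
    return closure


if __name__ == "__main__":
    # Hypothetical four-concept hierarchy: D has parents B and C, both under A.
    hierarchy = [("D", "B"), ("D", "C"), ("B", "A"), ("C", "A")]
    for pair in sorted(transitive_closure(hierarchy)):
        print(pair)
```

The number of pairs grows much faster than the number of direct edges in a deep hierarchy, which is one plausible reason the load is slow. A graph or triple store can often answer ancestor queries at query time instead of precomputing them.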

Decision Points:

  • Action Items
    • Mapping to be discussed further with the editors.
    • Capture the Architecture to describe the workflow.
    • LexEVS to look at utilizing the triplestore to speed up load (remove the need to load transitive table)
    • LexEVS REST (CTS2) services should be revisited and reviewed for missing functionality and complexity
    • Investigate the use of Docker to deploy data (Database container).
    • Investigate the use of Docker containers to end users.
    • Docker environment updates from systems team. 
    • Consider 508 compliance going forward.
    • Confirm tech stack status for use of Node.js and other related technologies.

 

Time: 11:00 AM - 12:00 PM
Location: 4-W-034
Topic: EVS Technical Infrastructure, Issues, and Options

Flesh out architecture and workflow diagrams, identify key areas of discussion

  • Review overall architecture and expand/update
  • Identify areas where current infrastructure is changing or problematic

Participants: Primarily EVS technical team members (several have conflicting clinical trials meeting)
Resources: EVS Project Architecture

Attendees: Jason Lucas, Scott Bauer, Kim Ong, Rob Wynn, Gilberto Fragoso, Cory Endle, Craig Stancl, Shamine, Sherri de Coronado

Discussion Points:

  • Mapping
    • LexEVS doesn't allow for ability to map into Meta
      • Only saves code to code mapping and qualifiers
      • Synonym information is not saved.  Currently you need to call into the coding scheme to get additional information.
      • There is no coding scheme loaded for ICDO3 so cannot get the additional information.  
        • Sherri to determine if ICDO3 could be loaded.
        • Gilberto suggested having ICDO3 as an active coding scheme but not displayed in the browser. Kim would need to do this.
      • Scott suggested investigating the possibility of mapping for an independent term to a source in Meta; or another loaded terminology; or 2 sources in Meta.
      • Rob suggested MetamorphoSys usage, but would need further investigation.
      • Excel spreadsheet (provided by Steph) provides ICDO code to Meta mappings.
      • SwissProt mapping
        • Not a coding scheme.
        • Just a tab delimited file from website.
        • When loading into LexEVS, there is no target coding scheme
        • Scott suggested loading target entities that allows users to look elsewhere for resolution.
        • Tracy suggested the use of URL Resolver - but doesn't need to be resolvable.
        • UniProt resources
  • Architecture and Workflow
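
The mapping limitation discussed above (only code-to-code mappings and qualifiers are saved, so any further detail must be looked up in a loaded coding scheme) can be sketched in Python as follows. This is an illustration only, not LexEVS code: `MappingEntry`, `target_name`, the scheme names, and the codes are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MappingEntry:
    """Hypothetical mapping record: codes and qualifiers only, no names or synonyms."""
    source_code: str
    target_code: str
    qualifiers: Dict[str, str] = field(default_factory=dict)


def target_name(entry: MappingEntry,
                target_scheme: str,
                loaded_schemes: Dict[str, Dict[str, str]]) -> Optional[str]:
    """Look up the target's display name in a loaded coding scheme.

    Returns None when the target scheme is not loaded or lacks the code.
    """
    scheme = loaded_schemes.get(target_scheme)
    if scheme is None:
        return None
    return scheme.get(entry.target_code)


if __name__ == "__main__":
    # Only SCHEME_A is "loaded"; SCHEME_B stands in for a target that is not.
    loaded = {"SCHEME_A": {"A-1": "Example term"}}
    entry = MappingEntry("SRC-1", "A-1", {"rel": "mapped_to"})
    print(target_name(entry, "SCHEME_A", loaded))
    print(target_name(entry, "SCHEME_B", loaded))
```

When the target scheme is not loaded, the lookup has nothing to resolve against, which mirrors the ICDO3 and SwissProt cases in the notes.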

Decision Points:

  • Action Items:
    • Sherri to determine if ICDO3 could be loaded into LexEVS
    • Investigate the possibility of mapping for independent term to a source in meta; or another loaded terminology; or 2 sources in meta.

 

 

Time: 1:00 PM - 2:00 PM
Location: 5-W-032
Topic: User Group Discussion - caDSR

User teams to share how they are using EVS and discuss requirements/priorities for the future.

  • APIs: Java, REST (CTS2 or 3-store), SPARQL, FTP
  • Backwards compatibility of server/client/data releases
  • Incl: Java/jar file issues and future
  • Incl: New terminology server API/content/other needs.

Participants: caDSR contact - Denise, Philippa, developers

Attendees: Vikram, Natalia, Luba, Larry Wright, Scott Bauer, Jason Lucas, Cory Endle, Craig Stancl, Tracy Safran, Phillipa Barnes, Margaret Haber, John Campbell, Rob Wynn, Bron Kessler, Kim Ong, Sherri de Coronado, Denise Warzel, Sana Din, Liz

Discussion Points:

  • Currently uses the API via the jar file and its dependencies.
    • Currently uses the LexEVS Java API
    • No use of REST API except for limited use.
  • Denise noted issues when data model or data have changed
    • Curation tool and SIW - no plan to change/update.
      • No reason to change the API unless it was going to be deprecated.
  • Denise noted that no release is complete for access to resolved value sets
    • LexEVS can provide, but caDSR needs to update.
  • There are proof of concept services being evaluated.
    • CDE Recommender Service
    • CDE Validator Service
    • Looking at using SPARQL
  • Tracy asked which parts of the Java API are currently being used by caDSR so that, when building the REST API, the team could focus on those services.
    • Search for Concept Code
      • Return name, definition, def source
      • Return super concepts or sub concepts
    • Search for Top level concept for value set/domain
      • Return resolved codes
    • Search for CDISC SDTM Variable Terminology
    • List all value sets
  • Release Roadmap
    • Early 2018 - plan to make recommendation and decision
  • Java 7 / Java 8 Jar
    • would require caDSR to do a maintenance release in 2018 (Q1)
    • Until this is complete, the EVS team needs to maintain 6.4 and 6.5 sets of data. 

Decision Points:

  • Action Items
    • caDSR team to provide a list of what is used from the Java API to determine what would need to be exposed in a REST API.  Phillipa could meet with the team Wednesday at 3PM.
    • caDSR to update to Java 8 jar in 2018Q1

 

Time: 2:00 PM - 3:00 PM
Location: 5-W-032
Topic: User Group Discussion - FDA and CDISC

User teams to share how they are using EVS and discuss requirements/priorities for the future.

  • APIs: Java, REST (CTS2 or 3-store), SPARQL, FTP
  • Backwards compatibility of server/client/data releases
  • Incl: Java/jar file issues and future
  • Incl: New terminology server API/content/other needs.

Participants: Editors (Liz, Erin, Brenda)

 

Attendees: 

Bron, Luba, Larry Wright, Scott Bauer, Rob Wynn, Jason Lucas, Liz, Gilberto Fragoso, Tracy Safran, Erin Mulbrandt, Lori Whiteman, Margaret Haber, Terry Quinn, Sherri de Coronado, Sana Din

Discussion Points:

  • FDA
    • Report Writer
      • Terry generates 25 files (FDA and others) every month. Would like to be able to run a batch command and provide the dates needed for Report Writer.
      • Files are posted to FTP site.
        • These files contain subsets
      • Changes are identified by doing an exact compare of both sets of data (additions, changes, deletions)
      • Rob and Tracy are working on new report writer that should help make this process tolerable.
  • CDISC
    • The cancer.gov page for CDISC Terminology has a lot to scroll through; a table of contents was requested to make it more usable.
    • OWL/RDF updates to metadata model for CDISC (Rob, via TopBraid)
    • Request to update the CDISC new term suggestion request form.
      • Update Request type dropdown
      • Update Code List dropdown
      • Possible type-ahead
    • CDISC Publication Column Headers
      • header naming switch planned (significant change) - 2018Q2
        • i.e. "CDISC Submission Value"
      • Rob noted there are changes needed for the reports
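
The exact compare of two sets of data described under Report Writer (additions, changes, deletions) can be sketched in Python as follows. This is a minimal illustration, assuming each release can be reduced to a dictionary from a code to its row content; the `Diff` type, the `compare_releases` name, and the sample codes are hypothetical and do not describe the actual report writer.

```python
from typing import Dict, List, NamedTuple


class Diff(NamedTuple):
    added: List[str]
    changed: List[str]
    deleted: List[str]


def compare_releases(old: Dict[str, str], new: Dict[str, str]) -> Diff:
    """Classify codes by an exact comparison of their row content."""
    added = sorted(code for code in new if code not in old)
    deleted = sorted(code for code in old if code not in new)
    changed = sorted(code for code in new
                     if code in old and new[code] != old[code])
    return Diff(added, changed, deleted)


if __name__ == "__main__":
    # Hypothetical rows keyed by code; values stand in for the full row text.
    previous = {"X1": "term a", "X2": "term b", "X3": "term c"}
    current = {"X1": "term a", "X2": "term B", "X4": "term d"}
    print(compare_releases(previous, current))
```

Because the comparison is exact, any difference in a row (including case or whitespace) is reported as a change.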

Decision Points:

  • Action Items
    • Request for a table of contents on the CDISC Terminology page.
    • Request to update the CDISC term suggestion request form.

 

Time: 3:00 PM - 4:00 PM
Location: 5-W-032
Topic: User Group Discussion - CTRP / CTS-API

User teams to share how they are using EVS and discuss requirements/priorities for the future.

  • APIs: Java, REST (CTS2 or 3-store), SPARQL, FTP
  • Backwards compatibility of server/client/data releases
  • Incl: Java/jar file issues and future
  • Incl: New terminology server API/content/other needs.

Participants: CTRP / CTS-API - managers, developers, Tiger team (Gisele, Samantha, David, Brian, Peter, Tracy, Jason, others)

Attendees: 

Bron, Luba, Larry Wright, Scott Bauer, Rob Wynn, Jason Lucas, Liz, Gilberto Fragoso, Tracy Safran, Margaret Haber, Sherri de Coronado, Sana Din, Gisele, Samantha, David, Kim Ong

Discussion Points:

  • Moving to hierarchical structure.
    • Search NCIt natively (no longer using caDSR)
  • Data needs to be available for precise matches
  • https://www.cancer.gov/about-cancer/treatment/clinical-trials/advanced-search
    • Larry noted that when searching for clinical trials, the same stage could be listed several times in the dropdown.
    • Drugs and Drug family is problematic when determining what should come to the top of the list. Need to look at agent/therapy categories.
  • Accrual coding
    • Need to understand how to capture the mapping data. (Meeting on Dec 12)
  • Partial matching on terms
    • i.e. search for partial term and provide weighted results (relevancy ranking). More exact on top and then less weighted results.
  • David to follow up on use of REST services (CTS and LexEVS REST).   
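
The partial matching with weighted results discussed above (more exact matches on top, less exact below) can be sketched in Python as follows. This is a minimal illustration under assumed weights (exact = 3, prefix = 2, substring = 1), not the ranking used by any EVS or CTRP service; the function names and sample terms are hypothetical.

```python
from typing import List, Tuple


def score(query: str, term: str) -> int:
    """Return a relevance weight: higher means a closer match (0 = no match)."""
    q, t = query.lower(), term.lower()
    if t == q:
        return 3  # exact match
    if t.startswith(q):
        return 2  # term begins with the query
    if q in t:
        return 1  # query appears somewhere inside the term
    return 0


def ranked_search(query: str, terms: List[str]) -> List[Tuple[str, int]]:
    """Return matching terms, best matches first; ties sorted alphabetically."""
    hits = [(term, score(query, term)) for term in terms]
    hits = [(term, weight) for term, weight in hits if weight > 0]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0].lower()))


if __name__ == "__main__":
    sample = ["Melanoma", "Melanoma in situ", "Uveal melanoma", "Carcinoma"]
    for term, weight in ranked_search("melanoma", sample):
        print(weight, term)
```

A production search would more likely rely on the relevance scoring of an index such as Lucene; the fixed weights here only show the ordering idea.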

Decision Points:

  • Action Items:
    • Follow up on the use of REST Services (CTS, LexEVS REST)
    • Investigate the issue where, when searching for clinical trials, the same stage could be listed several times in the dropdown.
    • Investigate the issue - Drugs and Drug family is problematic when determining what should come to the top of the list.
    • Determine mapping for accrual coding.

 


Wednesday, December 6th, 2017

Time: 9:00 AM - 10:00 AM
Location: 3-W-030
Topic: EVS Architecture

Discuss potential of using a variety of architectures

Proposed topics for discussion:

  • Micro services
    • Considerations:
      • Determine how to synchronize data on the back-end. LexEVS DB and Triple Store need to be in sync when NCIt information (such as value sets) changes.
      • Determine the potential of a loader that relies on SPARQL queries (after SPARQL query load, kick off LexEVS loader)
  • LexEVS integration with EVS Triple Store
    • Considerations:
      • Determine use of triple store calls in parallel with LexEVS DB
      • Determine performance improvements over LexEVS DB
      • Determine what calls could be made to the triple store instead of LexEVS.
      • Determine use of Stardog built in graph database.
      • Determine performance considerations for hierarchy traversal for graph resolution.
  • Future implementation considerations

Participants: Gilberto Fragoso, Kim Ong, Tracy Safran, Rob Wynne, Larry Wright, Margaret Haber, Sherri De Coronado, Bron Kisler, Systems team, John Campbell / Ruth Monterio (users of SPARQL)
Resources: MicroServiceProp.pptx, TripleStore.pptx

Attendees: 

Jason Lucas, Kim Ong, John Campbell, Larry Wright, Bron Kisler, Rob Wynn, Craig Stancl, Cory Endle, Scott Bauer, Kumar, Luba, Sherri De Coronado, Margaret Haber, Gilberto Fragoso, Liz, Denise, Tracy Safran

Discussion Points:

  • Overview of EVS Architecture
    • Need to add value sets and mappings.
    • Would like to include all the ways that reports are created. (content channels - as separate slide)
    • Need to include additional sources being loaded into SPARQL.
    • Consider adding channels from triplestore to LexEVS.
    • Add detail for Browser and associated dependencies
    • Change from SPARQL to TripleStore.  
  • Overview of LexEVS Stack
    • Scott noted that the Distributed LexEVS should be considered to be deprecated.
      • Serialization is the primary concern.
    • Tracy noted that the REST service is used more than the Java API.
    • Most NCI customers/users need simplified API.  
    • Users of Distributed API
      • Matching Program for Editors
      • caDSR
    • Lucene recommendations
      • Move all Lucene code into the DAO layer (this is not complete today)

Decision Points:

  • Action Items:
    • Update Architecture to add value sets and mappings.
    • Update Architecture to include all the ways that reports are created. (content channels - as separate slide)
    • Update Architecture to include additional sources being loaded into SPARQL.
    • Update Architecture by adding channels from triplestore to LexEVS.
    • Update Architecture to add detail for Browser and associated dependencies
    • Update Architecture to change from SPARQL to TripleStore.  

 

Time: 10:00 AM - 12:00 PM
Location: 10:00-10:30 in 3-W-030; 11:00-12:30 in TE-420 (can't fill gap 10:30-11:00)
Topic: EVS Architecture - Technical Discussion with Systems team

Discuss technical aspects of potentially using a variety of architectures

Proposed topics for discussion:

  • Micro services
    • Considerations:
      • Embedded Tomcat implementations
      • Alternative web service platforms
      • Container/Port clashes
  • LexEVS integration with EVS Triple Store
    • Considerations:
      • SPARQL clients
      • Docker options
  • Future implementation considerations
    • Java
    • Python
    • Node.js/javascript
    • Others?

Participants: Systems team

 

Attendees: 

Jason Lucas, Kim Ong, John Campbell, Rob Wynn, Craig Stancl, Cory Endle, Scott Bauer, Gilberto Fragoso, Tracy Safran

Discussion Points:

  • EVS REST Service Overview
    • John provided overview of the REST service.
    • There exists a UI for loading and report writer.
  • Anthill Pro project migration to Jenkins
    • Teams will need to work with systems team to ensure migration is successful to Jenkins.

Decision Points:

 

Time: 1:00 PM - 2:00 PM
Location: 3-W-030
Topic: EVS Project Group Discussion (During regular call-in time)

Proposed topics for discussion:

  • (High Level Overview) Discuss direct calls to NCIt for value sets
    • Performance
    • Workflow
    • API Implications
  • Discuss Mappings and cross-walking coding schemes
  • SwissProt, ICD-O-3, and MED-RT as the successor of NDF-RT
    • Associations from/to
    • Cross walking coding schemes
    • Loader considerations for MeSH, RxNorm

Participants: Kim Ong, Tracy Safran, Rob Wynne, Editor's Representative/Margaret Haber, Larry Wright, Sherri De Coronado, Gilberto Fragoso
Resources: Proposed Biomarker Terminology Sets_2017-12-05.pptx

Attendees: 

Discussion Points:

Decision Points:

 

Time: 2:00 PM - 3:00 PM
Location: 3-W-030
Topic: NCI Systems Discussions

Proposed topics for discussion:

  • Discuss CI and Docker Status/Roadmap
    • Discuss the current status of the Docker scripts used to build/test LexEVS components.
      • Discuss the current NCI Docker images used in LexEVS tests.
    • Discuss NCI's current status and future plans to use Docker.
  • Discuss Tech Stack Upgrades
    • Discuss DB upgrade: MySQL 5.6 vs. MariaDB
    • Discuss migrating from Anthill Pro to Jenkins

Participants: Jacob and Systems team, Gilberto Fragoso, Rob Wynne, Tracy Safran, Kim Ong, Larry Wright, Margaret Haber, Sherri De Coronado, Q/A (Sana)

 

Attendees: 

Discussion Points:

Decision Points:

 

Time: 3:00 PM - 4:00 PM (added meeting)
Location: 3-W-030
Topic: User Group Discussion - caDSR

Continued discussion of current API

  • APIs: Java, REST (CTS2 or 3-store), SPARQL

Participants: caDSR - Philippa, Vikram, Natalia, Rui
Resources: EVS REST API, CTS2 REST API

Attendees: 

Discussion Points:

Decision Points:

 


Thursday, December 7th, 2017

Time:
Location:
Topics:
Participants:
    

Attendees: 

Discussion Points:

Decision Points:

 

Time:
Location:
Topics:
Participants:
    

Attendees: 

Discussion Points:

Decision Points:

 

 

 

 
