...

  • caDSR Team represented by Sima, Natalia, Vikram
  • caDSR Applications that use EVS
    • Sentinel - Alerts for concepts, job that does concept clean up (compares concepts)
    • Curation tooling - links to concepts, and search results.  
    • CDE Browser - Concepts used from search results.  
    • Semantic integration workbench - concepts used from search results
    • CDEs
      • Utilize the NCIT and NCI Meta.
      • Look into NCIT - will use the concept to describe the CDE.
      • Organizing concepts to build CDE terminology.  
      • CDEs are used for forms (permissible values on forms)
  • Tooling hasn't changed or been replaced.
    • Currently use JARS and put in /lib
    • Using Remote API today.
    • Recently removed EJB 
    • Need to consider architecture in the future.
    • MDR is planning to architect a solution moving forward.
    • Currently searches are restricted to preferred terms.
      • Building data element - definitional information, preferred name
      • Existing CDE - pull back preferred name.
  • Current Tooling Issues
    • Need to have a data load completed to PROD.
    • Confirmed data load and ready once things move to production.
    • New LexEVS Jars will be included in next release. 
  • Remote API Architecture
    • Issues
      • Replacement of JARS
      • Serialization of objects.
  • Proposed Architecture?
    • REST-ful API 

Decision Points:

  • Identify current Java API usage by caDSR. caDSR to provide a list of what is currently used in the Java API.
    • EVS team to provide feedback as to how to do things better.
    • EVS team to ensure that if REST-ful API is created, functionality to be prioritized.  

...

  • Value Sets
    • Would like LexEVS to support production of value sets with richer structure (to assemble this deliverable more efficiently).
    • 100K downloads from FTP site.  Fewer users use the browser to download the value sets.  
  • Mapping
    • On a mapping page, you can download an Excel or CSV file.  
    • e.g., ChEBI has a mapping.  
    • For GDC - ICD9 or 10 coding - to be able to use NCIT coding, there needs to be a way to translate between ICD9 and NCIT codes.  There isn't a good way to do that today.  
      • e.g., ICD9 - Breast cancer - corresponds to ABC in NCIT
      • Determine how such a map could be published (browser and LexEVS)
  • Other Services
    • IUPAC - there are 2 flavors to be considered - but can be managed.
    • HUGO - the slash and hyphens have been problematic, but have been mostly resolved in the NCI Browser.
    • Review of 4 identified searching issues (differences between results in LexEVS and Protege).
    • NGram tokenizers may provide solution if we implement an Expert System.
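
The n-gram idea noted above can be illustrated with a minimal plain-Python sketch. It is not Lucene's tokenizer and not LexEVS code; the function names and the trigram size are assumptions chosen for illustration. It only shows why character n-grams tolerate the slash and hyphen variations mentioned for HUGO names.

```python
def char_ngrams(text, n=3):
    """Return the set of lowercase character n-grams of text, ignoring hyphens, slashes, and spaces."""
    normalized = "".join(ch for ch in text.lower() if ch not in "-/ ")
    if len(normalized) < n:
        # Too short to split: treat the whole string as a single gram.
        return {normalized} if normalized else set()
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard overlap of the two n-gram sets (0.0 to 1.0)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def rank_candidates(query, candidates, n=3):
    """Sort candidate names by descending n-gram similarity to the query."""
    return sorted(candidates, key=lambda c: ngram_similarity(query, c, n), reverse=True)
```

Because punctuation is dropped before the grams are built, a query typed with or without the hyphen scores the same against an indexed name. A real implementation would have to decide whether that normalization is acceptable for chemical names, where punctuation can carry meaning.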

Decision Points:

  • Identify additional mapping requirements from the EVS Project group.
  • Investigate the use of
  • Revisit the mapping discussion during a future project meeting.
  • Consider and prioritize the expert system solution to support specialized search capabilities for complex chemical and genetic names. 
  • Investigate the usage of NGrams in Lucene to support specialized search.

 

Time | Location | Topics | Participants
2:00 PM - 3:00 PM | 5E030

LexEVS Mapping Discussion

Determine requirements and propose solution for mapping.

  • User requirement: One terminology to many terminologies mapping.
  • Other topics:  Current, conditional, external relationships
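
The one-to-many requirement above could be modeled roughly as follows. This is a sketch under stated assumptions, not the LexEVS mapping model: the class name, method names, and data layout are hypothetical, and any codes passed in are placeholders rather than real ICD or NCIT codes.

```python
from collections import defaultdict


class TerminologyMap:
    """Holds mappings from one source terminology to any number of target terminologies."""

    def __init__(self, source):
        self.source = source
        # source code -> target terminology -> list of target codes
        self._targets = defaultdict(lambda: defaultdict(list))

    def add(self, source_code, target_terminology, target_code):
        """Record one mapping; repeated additions of the same pair are ignored."""
        codes = self._targets[source_code][target_terminology]
        if target_code not in codes:
            codes.append(target_code)

    def lookup(self, source_code, target_terminology=None):
        """Return {terminology: [codes]} for a source code, optionally filtered to one target."""
        found = self._targets.get(source_code, {})
        if target_terminology is None:
            return {term: list(codes) for term, codes in found.items()}
        return {target_terminology: list(found.get(target_terminology, []))}
```

Keeping the target terminology as a key makes it possible to publish one source-to-many-targets map, or to slice out a single source-to-target map for download.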
 

...

  • Investigate the MRMAP load and determine why that work wasn't completed.

 

...


 

Time | Location | Topics | Participants
3:00 PM - 4:30 PM | 5E030

Lucene Discussion

Propose additional features of Lucene to be used within LexEVS.

  • Discuss specialized search use cases.
  • Possible Lucene enhancements for coding scheme categorizations, auto complete aids, Lucene services.
 

...

  • Investigate ability to use Lucene Facets and identify where it could be used. 
  • Investigate ability to design a usable auto complete and where it could be used.  
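
As a starting point for the auto complete investigation, here is a minimal prefix-lookup sketch in plain Python. It does not use Lucene or its suggester classes; the class name, the `limit` parameter, and the sorted-list approach are assumptions made for illustration. It covers auto complete only, not facets.

```python
import bisect


class PrefixAutocomplete:
    """Case-insensitive prefix suggestions over a fixed list of terms."""

    def __init__(self, terms):
        # Keep (lowercased, original) pairs sorted so prefix matches are contiguous.
        self._entries = sorted((t.lower(), t) for t in set(terms))
        self._keys = [key for key, _ in self._entries]

    def suggest(self, prefix, limit=5):
        """Return up to `limit` terms starting with `prefix`, in alphabetical order."""
        prefix = prefix.lower()
        if not prefix:
            return []
        # Binary search for the first key that is >= the prefix.
        start = bisect.bisect_left(self._keys, prefix)
        results = []
        for key, original in self._entries[start:]:
            if not key.startswith(prefix) or len(results) >= limit:
                break
            results.append(original)
        return results
```

A production version would likely rank suggestions by usage or term type rather than alphabetically, which is one of the design questions for the investigation.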

 

Time | Location | Topics | Participants | Resources
4:30 PM - 5:00 PM | 5E030

Overflow/Additional Topics

  

...

Time | Location | Topics | Participants
9:00 AM - 11:00 AM | 3W030

Value Set management and workflow

  • Discuss requirements for value set version management and workflow management and supporting technology.
  • Rob to give a demo of their current workflow and the scripts they use.
Rob, Tracy

Attendees:  Jason, Gilberto, Rob, Tracy, Scott, Cory, Craig, Tin, Larry, Kim, Sana, Liz

Discussion Points:

  • Properties in Thesaurus that support the browsers
    • Subsets in Thesaurus (Protege)
      • Publish_Value_Set
      • Term_Browser_Value_Set_Description
      • Value_Set_Location - where browser fetches the report from (ftp location and path within evs, and filename)
      • TVS_Location - Terminology Value Set OWL file - hierarchy and components
    • Properties are scrubbed before loading into LexEVS - these are private/internal properties.
  • Baseline - A diff is done on the value sets from month to month.  Triggers update load procedures monthly.  
    • Load, Remove and Resolve scripts are generated for changes
  • OWL file provides information about where the value set lives in the hierarchy.
  • NCI Thesaurus and TVS  (provides structure to value sets)  - used by the browser
  • Process was created to provide structure/hierarchy to value sets
  • Value Set loads - 700 coding schemes - loaded in 24 hours
  • If value set resolution is performed against a new version of the code system, does LexEVS handle the versions?
  • Process currently isn't limited to NCIT.
  • Script created to create a txt file that views hierarchy of concepts on value set.
  • Value set downloads are ~10k a month and are used across agencies by a diverse set of consumers.

Decision Points:

 

  • Discuss latest issues on PROD.
Rob, Tracy
Time | Location | Topics | Participants
11:00 AM - 12:00 PM | 3W030

Value Set and Mapping Data with Hierarchical structure Discussion

  • Determine requirements and propose options to hierarchical structure and mapping.
  • Discuss how VS could retain their multiple hierarchical structure that it came from.
  • Discuss what changes would be needed to CTS2 for this.
 

 

Attendees:  Jason, Gilberto, Rob, Tracy, Scott, Cory, Craig, Tin, Larry, Kim, Sana, Liz

Discussion Points:

  • Properties in Thesaurus that support the browsers
    • Subsets in Thesaurus (Protege)
      • Publish_Value_Set
      • Term_Browser_Value_Set_Description
      • Value_Set_Location - where browser fetches the report from (ftp location and path within evs, and filename, BNF)
      • TVS_Location - Terminology Value Set OWL file - hierarchy and components
    • Properties are scrubbed before loading into LexEVS - these are private/internal properties.
  • Baseline - A diff is done on the value sets from month to month.  Triggers update load procedures monthly.  
    • Load, Remove and Resolve scripts are generated for changes
  • OWL file provides information about where the value set lives in the hierarchy.
  • NCI Thesaurus and TVS  (provides structure to value sets)  - used by the browser
  • Process was created to provide structure/hierarchy to value sets
  • Value Set loads - 700 coding schemes - loaded in 24 hours
  • If value set resolution is performed against a new version of the code system, does LexEVS handle the versions?
  • Process currently isn't limited to NCIT.
  • Script created to create a txt file that views hierarchy of concepts on value set.
  • Value set downloads are ~10k a month and are used across agencies by a diverse set of consumers.
  • Curation and delivery processes are driven by the users and consumers.
  • EVS editors work directly with CDISC and other groups.
  • CDISC and FDA have different formats and standards.
  • Resolved value sets are loaded as coding schemes; this was done for performance.
  • The process can be error prone for the EVS editors, who need to do specific things to drive the Browser.

  • Hierarchy (value set groupings)
    • TVS_CDISC provides information for the browser to display the value set hierarchy


    • Possibly add hierarchy to the NCIT to replace the complexity today.
      • Hierarchy (value set groupings) could be captured in the Lucene index (using Lucene Facets).

 

  • Hierarchical representation in value set
    • A coding scheme with custom hierarchy


    • Neoplasm core is a starting point for coding neoplasms, from which users can then branch out.
    • Could extend the current implementation of resolved value sets so that the coding scheme would provide hierarchy. This is very much like a vocabulary.
    • Could provide "Hierarchical Value Sets"


      • Browser could provide another tab "Hierarchical Value Sets" that would show the coding schemes that are the resolved hierarchical value sets.
      • This might require a different value set loader or an extension to it.  
      • It could be complicated if the hierarchical value set hierarchy doesn't match the original coding scheme hierarchy.  
    • CTS2 representation would need to be extended to support the idea of a Hierarchical Value Set.
  • Considerations for investigation
    • Investigate the ability to determine if a Resolved VS coding scheme has changed.  
    • Investigate the ability to determine if a Value Set Definition has changed. 
    • Investigate the ability to update as needed (not have to load all 700 at the same time).
    • Investigate the ability to capture value set groupings in the Lucene index (using Lucene Facets and the NCIT).
    • Investigate the ability to capture "Hierarchical Value Sets" as coding schemes with hierarchy.

Decision Points:

  • Investigate the ability to determine if a Resolved VS coding scheme has changed.  
  • Investigate the ability to determine if a Value Set Definition has changed. 
  • Investigate the ability to update as needed (not have to load all 700 at the same time).
  • Investigate the ability to capture value set groupings in the Lucene index (using Lucene Facets and the NCIT).
  • Investigate the ability to capture "Hierarchical Value Sets" as coding schemes with hierarchy.
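
One possible way to approach the change-detection and update-as-needed items is to fingerprint each value set definition and compare snapshots. The sketch below is an assumption-laden illustration, not the existing Load, Remove and Resolve scripts: the function names, the snapshot layout ({value_set_id: definition}), and the use of SHA-256 over canonical JSON are all hypothetical choices.

```python
import hashlib
import json


def fingerprint(definition):
    """Stable SHA-256 digest of a value set definition (any JSON-serializable structure)."""
    # sort_keys makes the digest independent of dictionary key order.
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_updates(previous, current):
    """Compare two {value_set_id: definition} snapshots and return the ids to load, reload, and remove."""
    prev_prints = {vs_id: fingerprint(d) for vs_id, d in previous.items()}
    curr_prints = {vs_id: fingerprint(d) for vs_id, d in current.items()}
    return {
        "load": sorted(curr_prints.keys() - prev_prints.keys()),
        "reload": sorted(
            vs_id for vs_id in curr_prints.keys() & prev_prints.keys()
            if curr_prints[vs_id] != prev_prints[vs_id]
        ),
        "remove": sorted(prev_prints.keys() - curr_prints.keys()),
    }
```

Only the ids in the returned plan would need to be touched, rather than reloading every value set. Note that list order inside a definition still affects the digest, so member lists would need to be sorted before fingerprinting if order is not meaningful.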

 

Time | Location | Topics | Participants
1:00 PM - 3:30 PM | 3W030

NCI Systems Discussions

  • Nexus Deployment Discussion
    • Current status of LexEVS artifacts on NCI Nexus server
    • Discuss current technical challenges.
  • CI and Docker Status/Roadmap
    • Discuss the current status of the Docker scripts used to build/test LexEVS components.
    • Discuss NCI's current status and future plans to use Docker.
    • Discuss security challenges associated with NCI's environment and Docker.
  • Discuss a separate DEV environment for CI server deployment
  • Tech Stack Upgrades
    • Discuss DB upgrade: 
      • MySQL 5.6 vs. MariaDB (10.1 Supported 2017.01)
    • Discuss CentOS 7 upgrade
  • Tier Deployment testing responsibilities
    • Mayo development team responsibilities
    • NCI development team responsibilities

Sara, Shireesha, Phil

 

 

Jacob, Yeon (Systems Team)

Attendees:  Larry, Sherri, Rob, Tracy, Jacob, Sara, Scott, Cory, Craig, Kwan, Yeon, Tin, Sana, Shireesha, Jason

Discussion Points:

Nexus Server Configuration

  • Testing on DEV tier - config of security permissions
  • Manual publishing until configured
  • CTS2 - Maven build was the simplest case, so that's the focus.
  • ANT publishing not currently available.  Will need to look at public and private key possibilities.  Sara's team should be able to support. (LexEVS requires ANT build).

Tech Stack Updates

  • DB
    • Currently at MySQL 5.5. 
    • Tech stack is moving to 5.6 and migration has started. 
    • Preliminary tests show that we can support 5.6.
    • 5.6.33 is the current version.  Yeon can update each tier for the EVS team.
    • No plan for 5.7, but MariaDB (in 6 months to a year).
    • Need to move to 5.6 as soon as we can.
  • CentOS 7
    • LexEVS is ready to upgrade to CentOS 7
    • CentOS 7 is available.
    • Blade servers would need to be ordered or current blades would need to be re-imaged.
    • May be able to shuffle the upgrade and swap servers so it moves up.  
  • Java 1.8
    • LexEVS is ready with 1.8
    • Waiting on other tooling to support 1.8

Dev Environment

  • Set up secondary Dev instance for Jenkins and application servers.  
  • Need to consider what database connection is needed. 
  • Suggested - set up a VM for this Dev 
  • Can submit tickets to Jacob to get this started.  

Docker and CI Discussion

  • Overview of Mayo usage and configuration.
    • NCI uses Jenkins 2.19
    • Docker differences between what NCI has and Mac version.
    • Would require move from Ubuntu.  
    • Image repository not ready, testing Nexus 3.x for Docker Image Storage.  
  • NCI can support Docker configuration. 
  • Need to negotiate timelines.  

Decision Points:

  • Plan to migrate to 5.6.33 as soon as possible.
  • Plan to migrate to CentOS7 (work with Jacob).
  • Plan configuration of DEV instance (work with Jacob).
  • Plan to further investigate Docker configuration. 

 

Time | Location | Topics | Participants
3:30 PM - 4:00 PM | 3W030

FHIR and terminology services (CTS2)

  • Harold to provide update on CTS2 and FHIR.
Harold

Attendees: Tin, Jason, Rob, Tracy, Scott, Craig, Cory, Larry, Sherri, Harold, Gilberto

Discussion Points:

  • Harold noted that the OMG process stalled because there was no further participation by Mayo.
  • Remaining issues:
    • SOAP WSDL
    • Miscellaneous issues
  • Additional Features:
    • Columnar Format
    • Canonical RDF
    • SNOMED CT implementation guide 
  • FHIR and CTS2 are similar in that both are complex but much is not required - only use what you need.
  • Clinical Research and Biomedical informatics groups are taking note of FHIR and beginning participation in FHIR.
  • FHIR Terminology - possible integration of CTS2 services. 
  • Grahame Grieve (HL7 FHIR) is in support of CTS2 services for FHIR.  
  • Current Planning
    • Plan on implementing entity description in native FHIR to demonstrate the differences and begin discussing with the FHIR community.
    • Review FHIR terminology and CTS2 terminology to describe overlap and gaps.  A paper will be written and published.  
  • Other project - CIMI HSP - Determined that CTS2 wasn't a candidate for services.
  • FHIR does offer:
    • Extensibility.
    • No requirement to implement it fully.
  • HL7 and OMG HSSP process was not successful in that the standard wasn't successfully integrated back to HL7.
  • A better model would have been what FHIR is doing within HL7 - collaborative within HL7.  
  • Harold to be at the January HL7 meeting to listen in on the FHIR sessions.  

Decision Points:

 

Time | Location | Topics | Participants
4:00 PM - 5:00 PM | 3W030

OWL Restrictions in LexGrid Model

  • Discuss approach and propose additional features.
  • Determine if there are LexEVS model changes needed.
  • Loader considerations.
  • Additional problems and solutions
 

Attendees:  Larry, Sherri, Jason, Rob, Tracy, Cory, Scott, Craig, Harold, Gilberto, Kim

Discussion Points:

  • Much of OWL2 is similar to OWL1.
  • OWL2 includes property chains, but they aren't being used.
  • The entire semantic meaning in LexEVS isn't required for OWL2.  For example, reasoners would use OWL2 source - not out of a terminology server. 
  • There is no requirement to include additional OWL2 representation in LexEVS.  Instead, use triple store and expand RESTful services.  
  • Current OWL2 issues have been resolved.  
  • Need to revisit the OBO JIRA item - and close it. 

Decision Points:

  • No additional OWL2 representation needed in LexEVS.
  • Review OBO JIRA issue and resolve.

 

Time | Location | Topics | Participants
5:00 PM - 5:30 PM | 3W030

Overflow/Additional Topics

 

Attendees:   Larry, Sherri, Jason, Rob, Tracy, Cory, Scott, Craig, Harold, Gilberto, Kim

Discussion Points:

  • Browser issue - search issue when value sets don't return content.  
  • Noted that QA could be done in Protege before publishing
  • Value Set Loader should not load a value set with no content.  

Decision Points:

  • Implement fix for Value Set Loader to not load value set with no content (Jira: LEXEVS-2510, NCI Tracker).
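
The intent of this fix can be sketched as a guard applied before loading. This is an illustration only, written in Python rather than against the actual Value Set Loader; the function names and the {name: codes} input shape are assumptions.

```python
def should_load(resolved_codes):
    """Return True only when the resolved value set contains at least one non-blank code."""
    return any(code and code.strip() for code in resolved_codes)


def filter_loadable(resolved_value_sets):
    """Split {name: codes} into value sets to load and names skipped for having no content."""
    to_load, skipped = {}, []
    for name, codes in resolved_value_sets.items():
        if should_load(codes):
            to_load[name] = codes
        else:
            # Empty value sets are reported instead of being loaded.
            skipped.append(name)
    return to_load, skipped
```

Returning the skipped names lets the load report flag empty value sets for QA instead of silently dropping them.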

Friday, December 2nd, 2016

 

 

Time | Location | Topics | Participants
9:00 AM - 10:00 AM | 1W030

LexEVS Admin

Discuss current and future requirements.

  • GUI
    • Consider a web based tool.  A simple way to look at the data.
  • Command Line loader requirements
  • Other considerations
 

Attendees:  Larry, Rob, Tracy, Cory, Craig, Scott, Tin

Discussion Points:  

  • Ability to look at data in a graphical way would be important.
  • Command Line usage - List Schemes - can return 700+, so prefer to use the UI.
  • Usage of the UI for troubleshooting to review the data in the database.
  • Minimal ability to look at data would be preferred (fully graphical is not required)
  • Administrative tasks not required in the GUI (only in the cmd line tooling)
  • Ability to load metadata at the same time as the load. 
  • Web based tool. 
  • We need to replace based on functionality used by Browser
    • graphical hierarchy representation (tree extension?)
    • Optimal or best practice as an alternative
  • providing code snippets for end users as options
  • Securing any admin code (loading, changing code systems) in a web based GUI is a concern.
  • Could potentially be used as a browser for technical users on an NCI Production server (discussed).
  • There is a request for admin ability for editing the preferences and manifest.
  • Currently, there is no way to view what metadata is loaded.
  • Investigate ability to combine data from the metadata and manifest files into one file.
    • This would make administration/loading easier.
    • Post load options may be an issue.
  • History loader creates multiple errors when loading - there is an existing JIRA item.
    • May be caused by DB timeout. 
    • Investigate what is causing.
  • LG XML Loader is used to load maps.  However, it doesn't take into account the type of map (it could).  Not sure the rankings can be applied.   
    • No existing issues, but may find some loading additional maps
    • SY relationships and Ranking are provided.
    • Monthly changes are applied.  

  • GUI Performance during x-forwarding noted by Rob.  
  • File system preferences - lock?
  • UI is good for Tagging to Production
  • UI is good for Removing a Coding Scheme
  • Listing Schemes in CMD - ListSchemes.sh - formatting is limited to column width.  
    • By default, do not show the entire width.
    • Minimally add 10 chars to URL
    • Minimally add 5 chars to Versions
    • Add option to see the full lengths of all fields. 
    • Add option to see minimal information.  
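
The listing options requested above could behave roughly like the following sketch. It is written in Python for illustration and is not the ListSchemes.sh implementation; the function name, the `full` and `minimal` flags, the three-column layout, and the default widths are all assumptions.

```python
def format_schemes(rows, widths=None, full=False, minimal=False):
    """Render (name, uri, version) rows as fixed-width text lines.

    widths: three per-column truncation widths, used unless full=True.
    full: size every column to its longest value so nothing is truncated.
    minimal: show only the name and version columns.
    """
    headers = ("Name", "URI", "Version")
    keep = (0, 2) if minimal else (0, 1, 2)
    table = [tuple(headers[i] for i in keep)]
    table += [tuple(str(r[i]) for i in keep) for r in rows]
    if full:
        col_widths = [max(len(row[c]) for row in table) for c in range(len(keep))]
    else:
        defaults = (20, 30, 10)  # assumed defaults, not the real ListSchemes.sh widths
        chosen = widths or defaults
        col_widths = [chosen[i] for i in keep]
    lines = []
    for row in table:
        # Truncate each cell to its column width, then pad to align the columns.
        cells = [cell[:w].ljust(w) for cell, w in zip(row, col_widths)]
        lines.append(" | ".join(cells).rstrip())
    return lines
```

Widening the URL and version columns then becomes a matter of passing larger `widths`, while `full=True` and `minimal=True` correspond to the two requested options.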



 


TimeLocationTopicsParticipants
910:00 AM - 1012:00 AM1W030

LexEVS Admin

Discuss current and future requirements.

  • GUI
    • Consider a web based tool.  A simple way to look at the data.
  • Command Line loader requirements
  • Other considerations
 

Attendees:

Discussion Points:

Decision Points:

 

...

Prioritization and Debrief

  • Discuss OWL2, RRF, LexEVS, CTS2, Browser, and all previous topics
    • Discuss future architecture
  • Determine next steps/road map and priorities

...


Attendees: Kumar, Larry, Jason, Sherri, Rob, Tracy, Cory, Scott, Craig

Discussion Points:

Architecture

  • Future considerations.
    • Smaller services - 
      • For example, Coding List listing service as a separate service.
    • Concerns around ability to deploy up the tiers
      • Current requirements will prohibit how quickly services can be exposed.  
      • Concerns about tech stack upgrades across services.  Micro Services may or may not be impacted by upgrades (some or all).
    • If addressed well, we can get rid of silos and duplication.  
    • Resources are a concern:
      • Containers, Jetty, and how to balance. 
    • Security
      • Scanning will take nearly as long as for the large service.
    • Instead of re-architecting everything, focus on new and additional functionality (alongside existing LexEVS).
    • Clients would no longer need to include JARs and dependencies.


Decision Points:

  • Investigate services architecture to support new and additional functionality.

 

Time | Location | Topics | Participants
1:00 PM - 2:00 PM | 1W030

Prioritization and Debrief (Continued if needed)

 

Attendees:

Discussion Points:

Strategic direction - RESTful services

  • Moving to a microservice architecture for new areas of functionality in LexEVS
  • Integrated REST services across LexEVS, Triple Store, Clinical Trials (Integrated REST Services)
  • Future MDR redesign effort - areas of service support of terminologies. 
  • Future CTRP support

Decision Points: