NIH | National Cancer Institute | NCI Wiki  

...

  • Kim reviewed the work done so far with the Loader tutorial code that Scott provided.
  • Scott discussed how property values are tied to entities in Lucene.
  • It was noted that the 6.4 Lucene development isn't complete, so some functionality does not work yet.
  • Scott reviewed how set theory is currently implemented.  (2 parent block join)
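
As a rough illustration of the block join idea mentioned above, the sketch below models it in plain Python rather than in Lucene. It assumes each property value is stored as a child document placed immediately before its parent entity document, and that a match on any child is mapped to the parent that closes the block. The names `build_block` and `search_parents` and the sample codes and values are hypothetical; none of them are LexEVS or Lucene APIs.

```python
# Toy model of a parent/child block index (NOT the Lucene API; all names are hypothetical).
# Child docs (property values) are stored contiguously, followed by their parent (entity).

def build_block(entity_code, property_values):
    """Return one block: child docs first, parent doc last."""
    children = [{"type": "property", "value": v} for v in property_values]
    parent = {"type": "entity", "code": entity_code}
    return children + [parent]


def search_parents(index, term):
    """Return codes of entities with at least one property value containing term."""
    hits = []
    matched = False
    for doc in index:
        if doc["type"] == "property":
            if term.lower() in doc["value"].lower():
                matched = True
        else:
            # Reaching the parent closes the block; report it if any child matched.
            if matched:
                hits.append(doc["code"])
            matched = False
    return hits


# Hypothetical sample data for illustration only.
index = []
index += build_block("C001", ["Heart", "Cardiac muscle"])
index += build_block("C002", ["Lung", "Pulmonary tissue"])

print(search_parents(index, "cardiac"))
```

The point of the layout is that the search runs against the property values while the result is the owning entity, which is how a property match can be tied back to its entity.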

9:00 AM - 9:30 AM

2E914

Recap and Planning

Attendees: Kim, Cory, Scott, Craig, Jason, Larry, Tin

  • The 2015.12 Technical Face-To-Face Prioritization List was reviewed and updated to capture additional items.
  • It was noted that Partonomy should be considered for CTRP requirements.  
  • Larry discussed structured presentation of Value Set and Mapping data.  Currently these are shown as flat lists (concepts with terms) or as terms with source and target.  To be useful, the hierarchy should be viewable so that the internal structures are represented.  There may be existing JIRA items, but we should look at this again.  Scott suggested using the codedNodeGraph call to create this hierarchy.  Usage needs to be considered: requirements should be established with the users first, and only then should we look at the technology to support them.

9:30 AM - Noon

2E914

Discussion: Coding Scheme Search and Indexing

  • Determine requirements/use cases for horizontal coding scheme searches
  • Overview of indexing of qualifiers
  • Overview of search results in 6.4

Discussion: Coding Scheme Search and Indexing

  • Traversing Coding Schemes
    • OBI and GO would be the starting point for determining what is needed to traverse from one coding scheme to the next.  This has been captured and will be considered.
  • Indexing
    • Index Qualifiers
      • Scott described how we currently index qualifiers.  Qualifiers are stored in a file as a grouped list and parsed into the index.  The list is added to the parent document as part of the block join implementation.
    • LexEVS 6.4 Implementation
      • Scott discussed the status of 6.4 and noted that we've seen some result differences in going from a single index to multiple indexes.  Scoring is based on the frequency of a term, and we can boost the score.  There is a JUnit test that exercises the boosting of terms, but we aren't sure whether this is a credible issue.  Approximations are going to make this difficult.  Larry suggested that ranking be considered only within an individual source and not across all sources.
      • Gilberto described a search result page where results could be split by vocabulary sources.  To do this, the list of sources could be presented to the user and the user could select the source from a pop-up.  The browser would need to be updated. 
      • Even with multiple indexes, exact matches will always be at the top of the results. Similar weighting should also be preserved.  
      • Larry requested that we share with the group how results were returned in 6.3 and how they are returned in 6.4 once it is fully implemented.
      • Scott noted that we could always write our own analyzer, but then we'd need to maintain and support it.
      • The stop word list is still valid in the new implementation.
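
To make the per-source ranking suggestion concrete, here is a minimal sketch in plain Python. It assumes each hit carries a source name, a matched term, and a relevance score. It groups hits by source, places exact matches (taken here to mean case-insensitive equality with the query) ahead of everything else, and orders the rest by score within that source only. The function name `rank_by_source`, the field names, and the sample hits and scores are all hypothetical and do not reflect the LexEVS 6.4 implementation.

```python
from collections import defaultdict


def rank_by_source(results, query):
    """Group hits by source; within each source, exact matches first, then by score."""
    grouped = defaultdict(list)
    for hit in results:
        grouped[hit["source"]].append(hit)

    ranked = {}
    for source, hits in grouped.items():
        ranked[source] = sorted(
            hits,
            # False sorts before True, so exact matches lead; higher scores follow.
            key=lambda h: (h["term"].lower() != query.lower(), -h["score"]),
        )
    return ranked


# Hypothetical hits and scores, for illustration only.
results = [
    {"source": "OBI", "term": "cell line", "score": 2.0},
    {"source": "OBI", "term": "cell", "score": 1.0},
    {"source": "GO", "term": "cell division", "score": 3.0},
    {"source": "GO", "term": "Cell", "score": 0.5},
]

ranked = rank_by_source(results, "cell")
for source, hits in ranked.items():
    print(source, [h["term"] for h in hits])
```

Because scores are only compared inside one source, differences in term frequency between indexes do not affect the relative order shown for any single vocabulary.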

 


...