LexEVS 6.0 Mapping Discussion

Object Model Considerations
• Map Set
o Connects two versioned coding schemes
o Is versioned itself.
o Should be named in a descriptive fashion (e.g. MDR12 to SNOMEDCT_2010_07_31 Mappings).
o Significant amount of metadata exists at this level
o Characterize map sets (complexity, completeness, content domain, “officialness”, etc.)
o Consider metadata to track whether mappings are curated or not.
o Consider metadata to track whether mappings are generated (and not reviewed) or not.
• Group of Mappings (a mapping subset, or search results, or a page of listings)
o This is more sophisticated with complex rule-based map sets where a collection of mapping entries may actually represent only a single mapping.
• Mapping
o Source and target “codes” (or expressions of codes)
o Represented as an association with qualifiers for other semantically important information
o If target is an expression, represented as an “association to data”
o Has a “default preferred name” for use when a source/target code cannot be resolved in a loaded coding scheme
• Mapping attributes
o Any information about a mapping other than what directly fits the “association” object model will be rendered as attributes. Ideally these attributes will have standard names across the different map set loaders. In other words, we can define the semantics of mapping association qualifiers so that a given qualifier always represents the same aspect of mapping semantics.
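
A minimal sketch of the object model outlined above, written in Python purely for illustration. The class and field names (SchemeRef, Mapping, MapSet, qualifiers, and so on) are assumptions made for this sketch and are not LexEVS API names; the codes "S1" and "T1" and the map set version "1.0" are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SchemeRef:
    """A versioned coding scheme (hypothetical name, not a LexEVS class)."""
    name: str
    version: str


@dataclass
class Mapping:
    """One source-to-target association plus its qualifiers."""
    source_code: str
    target_code: str                  # a code, or an expression of codes
    association: str                  # e.g. "synonymous"
    target_is_expression: bool = False
    # Fallback names used when a code cannot be resolved in a loaded scheme.
    source_default_name: Optional[str] = None
    target_default_name: Optional[str] = None
    # Anything that does not fit the association model, under standard names.
    qualifiers: Dict[str, str] = field(default_factory=dict)


@dataclass
class MapSet:
    """Connects two versioned coding schemes and is versioned itself."""
    name: str
    version: str
    source: SchemeRef
    target: SchemeRef
    metadata: Dict[str, str] = field(default_factory=dict)
    mappings: List[Mapping] = field(default_factory=list)


map_set = MapSet(
    name="MDR12 to SNOMEDCT_2010_07_31 Mappings",
    version="1.0",  # placeholder version for the map set itself
    source=SchemeRef("MDR", "12"),
    target=SchemeRef("SNOMEDCT", "2010_07_31"),
    metadata={"curated": "true", "generated": "false"},
)
map_set.mappings.append(
    Mapping("S1", "T1", "synonymous", qualifiers={"MAP_RANK": "1"})
)
```

Keeping qualifiers as a name-to-value dictionary is one way to let different loaders share standard attribute names without changing the core model.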

Searching Scenarios
• By name
o Find mappings where the source concept has a name matching a search string
o Find mappings where the target concept has a name matching a search string
o Find mappings where the source concept has a name matching a search string AND the target concept has a name matching a different string
• By code (like by name but for matching codes)
• Restrict by vocabulary (source vocabulary, target vocabulary, or map set vocabulary)
• Restrict by “current” – only retrieve mappings whose endpoints can be resolved in current versions of terminologies
• Restrict by cardinality – e.g. only retrieve 1-1 mappings (that meet other criteria) - (1-1, n-1, 1-n, n-n)
• Restrict by association type – e.g. only retrieve “synonymous” mappings
• Consider whether search by name should include “default preferred name” in cases where source/target codes are unresolvable
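
The search scenarios above can be sketched as simple in-memory filters. This is an illustration only, not a description of how LexEVS performs searches: the Mapping class, the function names, and the sample codes and names are all assumptions. Cardinality is computed within the collection passed in, and the label is read as "sources sharing this target" followed by "targets sharing this source", so one source with two targets yields 1-n.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mapping:
    source_code: str
    target_code: str
    association: str
    source_default_name: Optional[str] = None
    target_default_name: Optional[str] = None


def display_name(code, default_name, names):
    """Resolved name if the code is loaded, else the default preferred name."""
    return names.get(code) or default_name or ""


def search_by_name(mappings, source_text=None, target_text=None,
                   source_names=None, target_names=None):
    """Case-insensitive substring match on source and/or target names."""
    source_names = source_names or {}
    target_names = target_names or {}
    hits = []
    for m in mappings:
        s = display_name(m.source_code, m.source_default_name, source_names)
        t = display_name(m.target_code, m.target_default_name, target_names)
        if source_text and source_text.lower() not in s.lower():
            continue
        if target_text and target_text.lower() not in t.lower():
            continue
        hits.append(m)
    return hits


def cardinality(mappings):
    """Label each mapping 1-1, 1-n, n-1 or n-n within the given collection."""
    mappings = list(mappings)
    per_source = Counter(m.source_code for m in mappings)
    per_target = Counter(m.target_code for m in mappings)
    labels = {}
    for m in mappings:
        left = "1" if per_target[m.target_code] == 1 else "n"
        right = "1" if per_source[m.source_code] == 1 else "n"
        labels[m] = f"{left}-{right}"
    return labels


# Made-up sample data: no coding scheme is loaded, so default names are used.
mappings = [
    Mapping("A", "X", "synonymous", "Fever", "Pyrexia"),
    Mapping("B", "Y", "synonymous", "Headache", "Cephalalgia"),
    Mapping("B", "Z", "narrower", "Headache", "Migraine"),
]
both_ends = search_by_name(mappings, source_text="head", target_text="migr")
one_to_one = [m for m, c in cardinality(mappings).items() if c == "1-1"]
synonymous = [m for m in mappings if m.association == "synonymous"]
```

Restricting by code would follow the same pattern as search_by_name, comparing the code fields instead of the names.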

Browsing/Discovery Scenarios
• Organize map sets into groups or categories
o How authoritative is it?
o Is it actively maintained?
o Does it connect “current version” terminologies loaded into LexEVS?
o How important or heavily used is it?
o What “kind” of mapping set is it (i.e. what is it for)?
o Group by “from” terminology or “to” terminology
• Pre-select certain map sets for searching based on well-defined criteria
• When searching across multiple map sets, consider options like grouping all mappings with the same “source” terminology together to make the cognitive task easier.
• Support ability to identify map sets that are “to” or “from” a particular terminology
o E.g. get all map sets where the “source” coding scheme is NCI2010_07
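
As a rough illustration of the "to"/"from" discovery and grouping ideas above, the following sketch filters and groups map set descriptions held in memory. MapSetInfo, find_map_sets, and group_by_source are hypothetical names, and the second map set in the sample data is invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MapSetInfo:
    name: str
    source_scheme: str
    target_scheme: str


def find_map_sets(map_sets, source=None, target=None):
    """Return map sets "from" a source and/or "to" a target coding scheme."""
    return [
        ms for ms in map_sets
        if (source is None or ms.source_scheme == source)
        and (target is None or ms.target_scheme == target)
    ]


def group_by_source(map_sets):
    """Group map set names under their "from" terminology."""
    groups = defaultdict(list)
    for ms in map_sets:
        groups[ms.source_scheme].append(ms.name)
    return dict(groups)


catalog = [
    MapSetInfo("MDR12 to SNOMEDCT_2010_07_31 Mappings",
               "MDR12", "SNOMEDCT_2010_07_31"),
    # Invented entry, used only to exercise the filter.
    MapSetInfo("NCI2010_07 to SNOMEDCT_2010_07_31 Mappings (example)",
               "NCI2010_07", "SNOMEDCT_2010_07_31"),
]
from_nci = find_map_sets(catalog, source="NCI2010_07")
by_source = group_by_source(catalog)
```

The same grouping function could key on target_scheme, or on a category field, to support the other organizing criteria listed above.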

Presentation Scenarios
• Map Set view (including all known metadata)
• Map Group view (where needed) – e.g. one “subset” in a complex rule-based mapping.
• Mappings view (sortable table, paging capabilities).
o Standard column headers come from LexEVS view.
o Need to consider “extended” columns that may not apply to all map sets (e.g. MAP_RANK, or other MRMAP fields like MAPPRIORITY, etc.)
• Individual mapping view – link source and target codes to the concept pages for those things – may not need this
• “Mappings” tab on concept view – render all mappings “to” or “from” that concept
o need to determine which map sets to search in
o retrieve this info only when user actually clicks on this tab
• Make good decisions about when clicking should open a new window vs. reload in the same page.
• Support a “view all mappings” function for simple browsing of small sets (without requiring search).
• Handling expression-based mappings (may need to parse – can do based on grammar or “style”).
• Consider when a new page should open and when content should be reloaded in the current tab/browser window.
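
One possible shape for the sortable, paged mappings view with optional extended columns is sketched below. Rows are plain dictionaries of strings; the function names and the column names source_code and target_code are assumptions, while MAP_RANK is taken from the list above.

```python
def page_of_rows(rows, sort_key, page, page_size, descending=False):
    """Sort rows by one column and return the requested 1-based page."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    ordered = sorted(rows, key=lambda r: r.get(sort_key, ""),
                     reverse=descending)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]


def columns_for(rows, standard):
    """Standard columns first, then any extended columns found in the rows."""
    extended = sorted({key for row in rows for key in row} - set(standard))
    return list(standard) + extended


rows = [
    {"source_code": "B", "target_code": "Y"},
    {"source_code": "A", "target_code": "X", "MAP_RANK": "1"},
    {"source_code": "C", "target_code": "Z"},
]
first_page = page_of_rows(rows, "source_code", page=1, page_size=2)
headers = columns_for(rows, ["source_code", "target_code"])
```

Deriving the extended columns from the data means a map set without MAP_RANK simply shows the standard headers.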

Loader Considerations
• Need to specify map set metadata (switches or prefs file)
• Need to indicate endpoint coding schemes for “source” and “target” codes.
• If we know they cannot be resolved, placeholder names for endpoints should be provided and loaded.
• WARNING: be careful when using Excel to export to CSV, because it can treat codes as numbers and silently alter them (e.g. “078.12” becomes “78.12”)
• Consider a characterization of the “type” or “category” of a map set for organization in a list and pre-selection
• Consider how many map sets to load into a single coding scheme, and what criteria would be used to make that decision.
• Handle cases where we know the source/target codes cannot be resolved due to different versions and load default preferred names.
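
The Excel warning above comes down to codes being treated as numbers. A loader that reads the CSV itself can avoid adding to the problem by keeping every field as text. The sketch below uses Python's standard csv module, which returns fields as strings; the column names and sample row are assumptions for illustration.

```python
import csv
import io

# Made-up sample file content; column names are illustrative.
SAMPLE = "source_code,target_code\n078.12,T1\n"


def read_mapping_rows(handle):
    """Read mapping rows, keeping every field as text (no numeric parsing)."""
    return [dict(row) for row in csv.DictReader(handle)]


rows = read_mapping_rows(io.StringIO(SAMPLE))

# What goes wrong when a code is parsed as a number and written back out.
damaged = str(float("078.12"))
```

Here rows[0]["source_code"] keeps its leading zero, while the numeric round trip in the last line drops it. This does not repair a file that Excel has already altered; it only shows why codes should stay as text end to end.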

Obtaining or Generating Data
• MRMAP data from UMLS
• Data sets created as part of NCI-META export
• Map sets generated as a special project (e.g. NCIt Neoplasms to SNOMEDCT)
• CTCAE 3->4 mappings
• On-demand generation of mappings from an integrated terminology resource (like UMLS or NCI-META).
o Criteria-based candidate selection
o Criteria-based ranking and filtering
o See report (posted to gforge)
• API performance considerations.
• Identify major mapping efforts, start gathering data – formats can drive further loader considerations.
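
The on-demand generation idea can be sketched as follows, assuming the integrated resource assigns a shared concept identifier to codes from different sources (as the UMLS Metathesaurus does with its concept unique identifiers). This is not the method from the report mentioned above. The Atom and Candidate types, the scoring rule (exact name match outranks a shared concept alone), and all sample data are assumptions.

```python
from collections import defaultdict
from typing import NamedTuple


class Atom(NamedTuple):
    concept_id: str  # identifier shared across sources in the resource
    source: str      # source vocabulary label
    code: str
    name: str


class Candidate(NamedTuple):
    source_code: str
    target_code: str
    score: int


def generate_candidates(atoms, from_source, to_source):
    """Pair codes that share a concept, then rank by a simple score."""
    by_concept = defaultdict(list)
    for atom in atoms:
        by_concept[atom.concept_id].append(atom)

    best = {}
    for group in by_concept.values():
        for s in group:
            if s.source != from_source:
                continue
            for t in group:
                if t.source != to_source:
                    continue
                # Hypothetical criterion: identical names score higher.
                score = 2 if s.name.lower() == t.name.lower() else 1
                key = (s.code, t.code)
                best[key] = max(best.get(key, 0), score)

    ranked = [Candidate(s, t, score) for (s, t), score in best.items()]
    ranked.sort(key=lambda c: (-c.score, c.source_code, c.target_code))
    return ranked


# Made-up atoms from two made-up source vocabularies.
atoms = [
    Atom("C1", "SRC_A", "a1", "Fever"),
    Atom("C1", "SRC_B", "b1", "Fever"),
    Atom("C2", "SRC_A", "a2", "Headache"),
    Atom("C2", "SRC_B", "b2", "Cephalalgia"),
    Atom("C3", "SRC_A", "a3", "Cough"),
]
candidates = generate_candidates(atoms, "SRC_A", "SRC_B")
```

Filtering would then be a matter of dropping candidates below a chosen score before they are loaded as a generated (unreviewed) map set.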

Maintenance and Legacy Scenarios
• It may be important to know what map sets have mappings whose source or target codes will not resolve.
• Managing versions:
o “from” coding scheme,
o “to” coding scheme
o Map set coding scheme itself
• Learn what we need to from various mapping maintenance environments about authoring, data models, visualization, etc.
o IHTSDO stand-alone tool (and eventual workbench)
o CogZ (Protégé tool)
o OBO-Edit
o Ad-hoc data manipulation
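
To find out which map sets contain mappings whose source or target codes will not resolve, a maintenance check could compare each mapping's endpoints with the codes actually loaded for the corresponding coding scheme version. The sketch below assumes everything is available in memory; LoadedMapSet, unresolved_counts, and the sample codes are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class LoadedMapSet(NamedTuple):
    source_scheme: str                 # versioned "from" coding scheme
    target_scheme: str                 # versioned "to" coding scheme
    mappings: List[Tuple[str, str]]    # (source_code, target_code) pairs


def unresolved_counts(map_sets, loaded_codes):
    """Count, per map set, the mappings with an unresolvable endpoint."""
    report = {}
    for name, ms in map_sets.items():
        source_codes = loaded_codes.get(ms.source_scheme, set())
        target_codes = loaded_codes.get(ms.target_scheme, set())
        missing = sum(
            1 for s, t in ms.mappings
            if s not in source_codes or t not in target_codes
        )
        if missing:
            report[name] = missing
    return report


# Made-up codes; scheme labels follow the naming used earlier in these notes.
map_sets = {
    "MDR12 to SNOMEDCT_2010_07_31 Mappings": LoadedMapSet(
        "MDR12", "SNOMEDCT_2010_07_31",
        [("m1", "s1"), ("m2", "s9"), ("m3", "s2")],
    ),
}
loaded_codes = {
    "MDR12": {"m1", "m2", "m3"},
    "SNOMEDCT_2010_07_31": {"s1", "s2"},
}
report = unresolved_counts(map_sets, loaded_codes)
```

A map set whose source or target scheme is not loaded at all is reported with every mapping counted as unresolved, which matches the "restrict by current" idea in the searching scenarios.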
