NIH | National Cancer Institute | NCI Wiki  


A "Coding Scheme Manifest" (or manifest) allows the user to set metadata values for a coding scheme. This can be done while loading or converting a LexGrid XML, NCI MetaThesaurus, NCI OWL, OWL, OBO, UMLS RRF File, or HL7 RIM Database source to LexGrid format, or after the load.

...

A Coding Scheme carries meta-information about itself: values such as 'formal name', 'local names', 'default language', 'version', 'copyright', and 'sources', to name a few.

Why Do We Need a Coding Scheme Manifest?

When a terminology is being converted to the LexGrid data model from its native format (in this case OWL), Coding Scheme information is read from the source file. Sometimes values may be missing (not provided or invalid) or the author/user of the terminology wants to override or set default values despite (or in addition to) what is provided in the source file. This can be accomplished using "manifest" files along with the source file.

How Do We Create a Coding Scheme Manifest file?

A coding scheme manifest file is a valid XML file, conforming to the schema defined by http://LexGrid.org/schema/LexBIG/2007/01/CodingSchemeManifestList.xsd. This XML file can define values for one or more coding schemes you are dealing with. Some coding scheme meta-information may not easily map to information in the source file. In this case a manifest file is of great help to bridge the gap and control the information flow while mapping to the LexGrid model. A detailed model of the LexGrid Coding Scheme and its fields can be found online 1. Structure of the schema for the manifest file is explained in the following table (manifest components refer to the original LexGrid model schema namespaces and types):

id

  • Coding Scheme Manifest entry field:

...

  • id
    • Type: lgCommon:registeredName
    • Required: Yes
    • Override flag set: Not applicable
    • Description:
      The registered name is the key used to find a coding scheme (for example, a unique URL or namespace by which other people access the same coding scheme). This string value is also used to identify the entry for the coding scheme in the manifest file. For example, the registered name for the coding scheme "Amino-acid" is http://www.co-ode.org/ontologies/amino-acid/2006/05/18/amino-acid.owl#. This string is also set as the coding scheme's registered name field in the LexGrid model.

codingScheme

...

  • Coding Scheme Manifest entry field:

...

  • codingScheme

...

    • Type: lgBuiltin:localId
    • Required: No
    • Override flag set: Yes
    • Description:
      This value will be set for 'coding scheme name' in the LexGrid format counterpart. If the override flag is set to 'true', the value provided in the source file will be replaced with this one. Otherwise, this value is treated as a default value and used only if the value is not provided in the source file.

entityDescription

...

  • Coding Scheme Manifest entry field:

...

  • entityDescription

...

    • Type: lgCommon:entityDescription
    • Required: No
    • Override flag set: Yes
    • Description:
      This value will be set for 'coding scheme description' in the LexGrid format counterpart. If the override flag is set to 'true', the value provided in the source file will be replaced with this one. Otherwise, this value is treated as a default value and used only if the value is not provided in the source file.

formalName

  • Coding Scheme Manifest entry field:

...

  • formalName
    • Type: lgBuiltin:tsCaseIgnoreIA5String
    • Required: No
    • Override flag set: Yes
    • Description:
      This value will be set for 'coding scheme formal name' in the LexGrid format counterpart. If the override flag is set to 'true', the value provided in the source file will be replaced with this one. Otherwise, this value is treated as a default value and used only if the value is not provided in the source file.

codingSchemeURI

  • Coding Scheme Manifest entry field:

...

  • codingSchemeURI
    • Type: lgCommon:URI
    • Required: No
    • Override flag set: Yes
    • Description:
      This value will be set for 'coding scheme URI' in the LexGrid format counterpart. If the override flag is set to 'true', the value provided in the source file will be replaced with this one. Otherwise, this value is treated as a default value and used only if the value is not provided in the source file.

defaultLanguage

  • Coding Scheme Manifest entry field:

...

  • defaultLanguage
    • Type: lgCommon:defaultLanguage
    • Required: No
    • Override flag set: Yes
    • Description:
      This value will be set for 'coding scheme default language' in the LexGrid format counterpart. If the override flag is set to 'true', the value provided in the source file will be replaced with this one. Otherwise, this value is treated as a default value and used only if the value is not provided in the source file.

representsVersion

  • Coding Scheme Manifest entry field:

...

  • representsVersion
    • Type: lgCommon:version
    • Required: No
    • Override flag set: Yes
    • Description:
      This value will be set for 'coding scheme version' in the LexGrid format counterpart. If the override flag is
      set to 'true', the value provided in the source file will be replaced with this one. Otherwise, this value is
      treated as a default value and used only if the value is not provided in the source file.

localName

  • Coding Scheme Manifest entry field:

...

  • localName
    • Type: lgBuiltin:tsCaseIgnoreIA5String
    • Required: No
    • "To Add" flag set: Yes
    • Description:
      This value will be added for 'coding scheme local names'. If the add flag is set to 'true', this value will
      be added to the list of local names (if not there already). Otherwise, this value is treated as the default
      value and used only if the value is not provided in the source file.

source

...

  • Coding Scheme Manifest entry field:

...

  • source

...

    • Type: lgCommon:source
    • Required: No
    • "To Add" flag set: Yes
    • Description:
      This value will be added for 'coding scheme sources'. If the add flag is set to 'true', this value will be
      added to the list of sources (if not there already). Otherwise, this value is treated as the default value
      and used only if the value is not provided in the source file.

...

  • Coding Scheme Manifest entry field:

...

  • copyright

...

    • Type: lgCommon:text
    • Required: No
    • Override flag set: Yes
    • Description:
      This value will be set for 'coding scheme copyright' in the LexGrid format counterpart. If the override flag
      is set to 'true', the value provided in the source file will be replaced with this one. Otherwise, this value
      is treated as a default value and used only if the value is not provided in the source file.

mappings

  • Coding Scheme Manifest entry field:

...

  • mappings
    • Type: lgCS:mappings
    • Required: No
    • "To Add" flag set: Yes
    • Description:
      This value will be added for 'coding scheme mappings'. If the add flag is set to 'true', this value will be
      added to the list of mappings (if not there already). Otherwise, this value is treated as the default value
      and used only if the value is not provided in the source file.

associationDefinitions

...

  • Coding Scheme Manifest entry field:

...

  • associationDefinitions

...

    • Type: lgRel:association
    • Required: No
    • "To Add" flag set: Yes
    • Description:
      This value will be added for 'coding scheme associations'. If the add flag is set to 'true', this value will
      be added to the list of associations (if not there already). Otherwise, this value is treated as the default
      value and used only if the value is not provided in the source file.

Note: This option is used internally by the system to provide default recognition of some common associations. It is typically not necessary to provide this value, however, since association definitions are automatically derived from the source.
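
The override and add flags described for the manifest fields can be read as two small rules: one for single-valued fields (such as formalName) and one for list-valued fields (such as localName). The following is a minimal sketch of those rules only; the function names are hypothetical and this is not the LexEVS loader code. It assumes a missing source value is represented as None or an empty list.

```python
def resolve_single(source_value, manifest_value, override):
    """Resolve a single-valued field (hypothetical helper, not LexEVS code).

    With the override flag set, the manifest value replaces the source value.
    Otherwise the manifest value is only a default for a missing source value.
    """
    if override or source_value is None:
        return manifest_value
    return source_value


def resolve_list(source_values, manifest_value, to_add):
    """Resolve a list-valued field (hypothetical helper, not LexEVS code).

    With the add flag set, the manifest value is appended unless it is
    already present. Otherwise it is used only when the source has no values.
    """
    values = list(source_values)
    if to_add:
        if manifest_value not in values:
            values.append(manifest_value)
        return values
    return values if values else [manifest_value]
```

For example, `resolve_single("Amino Acid", "AA", override=False)` keeps the source value, while the same call with `override=True` returns the manifest value.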

What

...

Code Changes May Be Required to Use a Manifest File?

If you want to use the manifest file, supply the manifest file URI to the following methods when loading NCI OWL or generic OWL sources:

  • org.LexGrid.LexBIG.Extensions.Load.OWL_Loader.load()

...

...

  • org.LexGrid.LexBIG.Extensions.Load.OWL_Loader.validate()

...

An example code snippet is included from the page LexEVS:SupplyOwlFileUri Snippet.

...

Dealing with Large Terminologies

Loading Large RRF Terminologies (examples: NCI Metathesaurus, SNOMEDCT, LOINC, etc.):

  • Primary key strategy for loading (see the DB_PRIMARY_KEY_STRATEGY config setting)
    • Sequential Integer Primary Key (SEQUENTIAL_INTEGER) is the best strategy for large loads. It allows the database to insert records into the index sequentially, which is more efficient. If the GUID strategy is used, records are inserted into the index tree at random locations, which forces frequent index rebalancing.
  • Hardware is very important to large content loads.
    • RRF sources are loaded in a multi-threaded manner. Multi-processor servers will give the best performance.
    • If possible, separate the database server and the loader server.
  • Monitoring a load
    • Monitor all LexEVS logs (both the 'load' and 'full' logs).
    • If using MySQL, use InnoDB tools to monitor inserts per second (SHOW INNODB STATUS; on MySQL 5.5 and later, SHOW ENGINE INNODB STATUS).
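
The difference between sequential and GUID keys can be illustrated with a toy model. The sketch below uses a sorted Python list as a stand-in for a database index (an assumption made for illustration; a real index is a B-tree and behaves differently in detail) and counts how many inserts land somewhere other than the end of the index.

```python
import bisect
import uuid


def count_non_append_inserts(keys):
    """Insert keys into a sorted list and count inserts that are not appends."""
    index = []
    non_append = 0
    for key in keys:
        position = bisect.bisect_left(index, key)
        if position != len(index):
            # The key lands in the middle of the index, not at its end.
            non_append += 1
        index.insert(position, key)
    return non_append


# Sequential integer keys always arrive in index order.
sequential_keys = list(range(1, 1001))
# Random GUID-style keys arrive in no particular order.
guid_keys = [uuid.uuid4().hex for _ in range(1000)]

print("sequential:", count_non_append_inserts(sequential_keys))
print("guid:", count_non_append_inserts(guid_keys))
```

Sequential keys are always appended, while most of the random keys land in the middle of the structure, which is the pattern that makes GUID-keyed loads more expensive.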

Load Time Preferences

Preferences for loading elements of sources such as OWL can be set at load time.

General Meta Data File Association Preferences

This value can be adjusted by creating an XML file that resolves against this schema: http://LexGrid.org/schema/LexBIG/2009/01/Preferences/load/LoadPreferences

...

Any XML document can be assigned as metadata to a newly loaded coding scheme.
The XML document is broken down into individual tags and values, which are then searchable
through the LexBIG Service Metadata interface. This parameter indicates the path of the
XML metadata assigned during the current load operation. For most loaders, the given path
serves strictly as an option to input user-specified data. For the NCI Metathesaurus loader,
metadata is automatically generated and assigned to the coding scheme. In this case, the
generated XML is written to the given file, overwriting any existing content.
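
As a rough illustration of what "broken down into individual tags and values" means, the sketch below flattens a small XML document into (tag, value) pairs. It is not the LexBIG implementation; the sample document, its element names, and the function name are made up for the example, and attributes are ignored.

```python
import xml.etree.ElementTree as ET


def flatten_metadata(xml_text):
    """Return (tag, value) pairs for every element that carries text."""
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for element in root.iter():  # document order, root included
        text = (element.text or "").strip()
        if text:
            pairs.append((element.tag, text))
    return pairs


# Hypothetical metadata document; the element names are illustrative only.
sample = """
<metadata>
  <name>Automobiles</name>
  <contact>
    <email>someone@example.org</email>
  </contact>
</metadata>
"""

for tag, value in flatten_metadata(sample):
    print(tag, "=", value)
```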

Owl Loader Preferences

These values can be adjusted by creating an XML file that resolves against this schema: http://LexGrid.org/schema/LexBIG/2009/01/Preferences/load/OWLLoadPreferences

...

Each entity can be assigned a property that indicates whether or not it is
considered primitive (having no equivalent classes). This preference controls
the name of the property that is created; the property value will indicate
true or false. If not specified, the name 'primitive' is assumed.

...

Anonymous OWL classes of type OWLNAryLogicalClass can be assigned
properties that indicate the nature or type of component logical operations.
This preference controls the name of the property that is created; the
property value will indicate the logical operation (e.g. owl:oneOf). If not
specified, the name 'type' is assumed.

...

This preference allows for entity codes to be derived from a specific RDF
property. The provided string is interpreted as a regular expression to be
compared against properties assigned to each processed class. If a
property name matches the regular expression, the property value
is assigned as the entity code. If not specified, no default match is
assumed, and the entity code is derived from the RDF resource name.

...

This preference allows for entity status to be derived from a specific RDF
property. The provided string is interpreted as a regular expression to be
compared against properties assigned to each processed class. If a
property name matches the regular expression, the property value
is assigned as the entity status. If not specified, the regular expression
of ('concept_status') is assumed, and if not matched no status string
is assigned (the isActive boolean flag will still be set based on
deprecation).
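
The two property-matching preferences above (entity code and entity status) follow the same pattern: a regular expression is compared against property names, and the value of the matching property is used. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the whole property name must match the expression, and the status default is taken to be the pattern `concept_status`. None of this is confirmed loader behavior.

```python
import re


def first_matching_value(properties, pattern):
    """Return the value of the first property whose name matches pattern."""
    if pattern is None:
        return None
    for name, value in properties:
        if re.fullmatch(pattern, name):  # assumption: full-name match
            return value
    return None


def derive_code_and_status(resource_name, properties,
                           code_pattern=None,
                           status_pattern="concept_status"):
    """Derive (entity code, entity status) for one processed class."""
    # No code pattern, or no match: fall back to the RDF resource name.
    code = first_matching_value(properties, code_pattern) or resource_name
    # No status match: no status string is assigned.
    status = first_matching_value(properties, status_pattern)
    return code, status
```

With properties `[("code", "C12345"), ("concept_status", "Retired_Concept")]` and `code_pattern="code"`, the sketch yields the code `C12345`; without a code pattern it falls back to the resource name.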

MatchNoopNamespaces

This preference allows for classes to be selectively ignored on import to LexGrid.
The provided string is interpreted as a regular expression to be compared
against class namespace. If matched, a counterpart entity is not created in the
LexGrid coding scheme. If not provided, the expression '(:|@_:|protege:|xsp:).*'
is assumed.
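
A small sketch of how such a namespace filter might behave. It assumes the default expression is `(:|@_:|protege:|xsp:).*`, that it is compared against a prefix-style namespace string such as `protege:`, and that the whole string must match. The function names and the sample class list are hypothetical.

```python
import re

# Assumed default expression for ignored namespaces.
DEFAULT_NOOP_NAMESPACES = r"(:|@_:|protege:|xsp:).*"


def is_ignored(namespace, pattern=DEFAULT_NOOP_NAMESPACES):
    """True when the class namespace matches the no-op expression."""
    return re.fullmatch(pattern, namespace) is not None


def importable_classes(classes, pattern=DEFAULT_NOOP_NAMESPACES):
    """Keep only the classes whose namespace is not ignored."""
    return [name for namespace, name in classes
            if not is_ignored(namespace, pattern)]


# Hypothetical (namespace, class name) pairs.
classes = [
    ("protege:", "PAL-CONSTRAINT"),
    ("nci:", "Heart"),
    ("xsp:", "Pattern"),
]
print(importable_classes(classes))
```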

MatchRootName

This preference allows for custom declaration of root concepts for hierarchical
relationships. The provided string is interpreted as a regular expression to be
compared against the resource name for each class. If matched, the node is
designated as a root in the supported hierarchy metadata. If not specified,
root nodes are identified by having a superclass of owl:Thing.

...

If processing of complex properties is enabled (see ProcessComplexProps
preference), this preference allows for identification of representational form
names contained by XML fragments embedded within rdf property text.
The provided string is interpreted as a regular expression and compared
against the XML tags in each fragment. If not specified, the default
expression '(term-group)' is assumed.

...

If processing of complex properties is enabled (see ProcessComplexProps
preference), this preference allows for identification of source names contained
by XML fragments embedded within rdf property text. The provided string is
interpreted as a regular expression and compared against the XML tags in
each fragment. If not specified, the default expression '(term-source|def-source)'
is assumed.

MatchXMLTextNames

If processing of complex properties is enabled (see ProcessComplexProps
preference), this preference allows for identification of descriptive text contained
by XML fragments embedded within rdf property text. The provided string is
interpreted as a regular expression and compared against the XML tags in
each fragment. If not specified, the default expression '(term-name|def-definition|go-term)'
is assumed.
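
The three MatchXML* preferences can be pictured as tag filters applied to an XML fragment embedded in property text. The sketch below uses the default expressions quoted above; the fragment, the function name, and the result keys are illustrative assumptions, and the code is not taken from the OWL loader.

```python
import re
import xml.etree.ElementTree as ET

# Default expressions as quoted in the preference descriptions.
REPFORM_TAGS = r"(term-group)"
SOURCE_TAGS = r"(term-source|def-source)"
TEXT_TAGS = r"(term-name|def-definition|go-term)"


def parse_complex_property(property_text):
    """Split an embedded XML fragment into text, source, and rep form."""
    # Wrap the fragment in a synthetic root so sibling tags parse as one document.
    root = ET.fromstring("<fragment>" + property_text + "</fragment>")
    result = {"text": None, "source": None, "rep_form": None}
    for element in root.iter():
        value = (element.text or "").strip()
        if re.fullmatch(TEXT_TAGS, element.tag):
            result["text"] = value
        elif re.fullmatch(SOURCE_TAGS, element.tag):
            result["source"] = value
        elif re.fullmatch(REPFORM_TAGS, element.tag):
            result["rep_form"] = value
    return result


# Hypothetical property text containing an embedded fragment.
fragment = ("<term-name>Heart</term-name>"
            "<term-group>PT</term-group>"
            "<term-source>NCI</term-source>")
print(parse_complex_property(fragment))
```

Property text that is not well-formed XML would raise a parse error in this sketch; a real loader would need to handle that case.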

IsDBXrefSource

If processing of complex properties is enabled (see ProcessComplexProps
preference) and source, this preference allows for identification of ISBN cross
reference information in xml element text. If not specified, the default
of 'false' is assumed.

IsDBXrefRepform

If processing of complex properties is enabled (see ProcessComplexProps
preference) and source, this preference allows for identification of representational
form cross reference information in xml element text. If not specified, the default
of 'false' is assumed.

ProcessComplexProps

Indicates whether rdf property text will be evaluated to detect and process
embedded XML. This is a master switch controlling whether the MatchXML*
and IsDBXref* preferences are acknowledged. The default is false.

...

Controls the relationship between restrictions of anonymous classes and parent concepts.
If true, restrictions for anonymous classes are not connected to the parent concepts.
The default is true.

CreateConceptForObjectProp

Controls whether concept entities are created for object properties defined in
the OWL source. The default is false.

...

Controls how data type properties are converted to components of the LexGrid
model. If 'association' is specified, each data type property is recorded in LexGrid as
an entity-to-entity relationship. If 'conceptProperty' is specified, traditional
LexGrid properties are created and assigned directly to new entities. If 'both'
is specified, both entity relationships and standard LexGrid entity properties
are generated. The default is 'both'.

...

Indicates a list of rdf property names to be attributed special semantic
significance as comments in the LexGrid model. If not specified, the default
of 'DesignNote, Editor_Note, Citation, and VA_Workflow_Comment' is
assumed. If not matched, only generic properties are created.

...

Indicates a list of rdf property names to be attributed special semantic
significance as definitions in the LexGrid model. If not specified, the default
of 'DEFINITION, dDEFINITION, LONG_DEFINITION, ALT_DEFINITION,
ALT_LONG_DEFINITION, and MeSH_Definition' is assumed.

...

Indicates a list of rdf property names to be attributed special semantic
significance as presentations in the LexGrid model. If not specified, the default
of 'NCI_Preferred_Term, Preferred_Name, Display_Name, Search_Name,
FULL_SYN, Synonym, VA_Print_Name, VA_National_Formulary_Name,
VA_Abbreviation, VA_Dose_Form_Print_Name, VA_Trade_Name,
MeSH_Name, NDFRT_Name, RxNorm_Name' is assumed.

UMLS SEMNET Preferences

This value can be adjusted by creating an XML file that resolves against this schema: http://LexGrid.org/schema/LexBIG/2009/01/Preferences/load/SemNetLoadPreferences

...

The load parameter controls which inherited relationships are loaded and navigable within
LexBIG. When selecting the option not to load inherited relationships, all associations
are extracted from the source file SRSTR (stated relations). When loading all inherited
relations, associations are extracted from the source file SRSTRE1 (classified relations).

At NCI's request we provided an additional option to load only stated relations for direct
subclass ('is_a') associations, but inherited relationships for all other associations.
This was intended to provide similar behavior to their provision of OWL sources.

Note: Direct or stated relationships are always imported, regardless of the selected option.

  • value="0" Only load direct or stated relationships.
  • value="1" Load all inherited relationships.
  • value="2" Load all inherited relationships except is_a.
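
The three option values can be summarized as a choice of source file per association. The sketch below is one reading of the description above, not the loader's code: the enum and function names are hypothetical, and the file names SRSTR and SRSTRE1 are used as plain labels.

```python
from enum import IntEnum


class InheritanceOption(IntEnum):
    """Hypothetical names for the three documented option values."""
    STATED_ONLY = 0
    ALL_INHERITED = 1
    INHERITED_EXCEPT_IS_A = 2


def source_file_for(association, option):
    """Pick the source file an association is extracted from."""
    option = InheritanceOption(option)
    if option is InheritanceOption.STATED_ONLY:
        return "SRSTR"      # stated relations
    if option is InheritanceOption.ALL_INHERITED:
        return "SRSTRE1"    # classified (inherited) relations
    # Option 2: stated relations for is_a, inherited for everything else.
    return "SRSTR" if association == "is_a" else "SRSTRE1"
```

For instance, `source_file_for("is_a", 2)` returns `"SRSTR"`, while `source_file_for("part_of", 2)` returns `"SRSTRE1"`.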

Applying Revisions to a Coding Scheme

A Revision Overview

CodingSchemes can be extensively revised by loading a Revision object in LexGrid XML format. A coding scheme Revision can be created to resolve against a "revision" schema URL and loaded to a coding scheme current in the service. This revision is tracked within the service history. Revision functionality centers on LexGrid model elements that inherit from the Versionable element. Versionable classes and attributes include those "types" of Versionable and any attributes inherited from this element. Whenever a Versionable element appears in a revision, it is accompanied by an EntryState element, which helps define its role in the revision process.

...

There are five changeType definitions. Each change type needs to be applied in its own context, as follows:

  • NEW - to create a new versionable element
  • MODIFY - to change the attributes of an existing versionable element
  • VERSIONABLE - Versionable attribute has changed since prior version. Versionable attributes include: isActive, status, owner, effectiveDate or expirationDate.
  • DEPENDENT - The status of a dependent entry has changed since the last version. Dependent entities include properties, codedEntries for codingSchemes, associationInstances, etc.
  • REMOVE - Versionable entry was removed as of this version. This is not the same as deprecated, as the entity ceases to exist in future versions.

Revision Resources

Revisions in LexGrid are discussed in more detail here:

Post Processing Options

Post-load processing algorithms allow users to access information about the source that may only be available after the load and to apply it to coding scheme metadata.

...

Users interested in this functionality might try the following to see it in action:

...

Steps, with the action and illustration for each:

  1. Start from an installed LexEVS local API.
     Illustration: screenshot showing Explorer window of installed directories.
  2. Load from <LexEVS root>/test/resources/testData/ the coding scheme Automobiles.xml. (You should be able to do this using a source in any format supported for loading.)
     Illustration: screenshot showing the selection of the Automobiles.xml file.
  3. Activate this scheme and view its contents by getting a coded node set and resolving it.
     Illustration: screenshot of the Result browser window.
  4. Load from <LexEVS root>/test/resources/testData/ the coding scheme testExtension.xml, selecting the option to extend by choosing the Automobiles terminology from the drop-down list by its URN and version.
     Illustration: screenshot showing the LexGrid Loader dialog box.
  5. View the concept codes for the extension and see both the original code set and the supplemental code set.
     Illustration: screenshot of the Result browser window.

Keep in mind that the testExtension.xml's file format can be used to extend any coding scheme currently loaded to LexEVS.

...

A complete authoring API is featured in the CTS2 interface and is detailed here

Create a Mapping Scheme using LexGrid XML

...

...

...