Use natural language processing to guess a concept mapping

Use Case Number

Init1hm3.pm27.1

Brief Description

It can be challenging to identify the correct semantic annotation for a data element in the metadata repository when the user performing the concept annotation is not intimately familiar with the ontology. Natural language processing could be used to digest a data element description and map it to a concept more accurately than a simple text search.

Actor(s) for this particular use case

Metadata Specialist

Pre-condition
The state of the system before the user interacts with it

A common data element is available for annotating.

Post condition
The state of the system after the user interacts with it

The common data element has been annotated with one or more semantic concepts.

Steps to take
The step-by-step description of how users will interact with the system to achieve a specific business goal or function

  1. The Metadata Specialist identifies a new data element to be semantically annotated.
  2. The Metadata Specialist sends all available information, including the data element description, to a natural language processing algorithm.
  3. The algorithm returns a list of potential matches with their likelihood of correctness.
  4. The Metadata Specialist selects those concepts that are appropriate to be annotated to the data element.
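
Steps 2 and 3 above can be sketched as follows. This is only an illustration under stated assumptions: the use case does not name an algorithm, so TF-IDF-weighted cosine similarity stands in for the natural language processing step, and the function name rank_concepts, the concept codes, and the concept definitions are all hypothetical. The returned score is a similarity between 0 and 1, not a calibrated likelihood of correctness.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_concepts(description, concepts, top_n=3):
    """Rank candidate concepts against a data element description.

    `concepts` maps a concept code to its textual definition. Returns up to
    `top_n` (code, score) pairs, best match first. The scoring method is an
    assumption made for this sketch, not a method named by the use case.
    """
    docs = {code: Counter(tokenize(text)) for code, text in concepts.items()}
    n_docs = len(docs)

    # Document frequency: in how many concept definitions each token occurs.
    df = Counter()
    for counts in docs.values():
        df.update(counts.keys())

    def weight(counts):
        # Smoothed inverse document frequency, so rare tokens count for more.
        return {
            token: count * (math.log((1 + n_docs) / (1 + df[token])) + 1.0)
            for token, count in counts.items()
        }

    def cosine(a, b):
        dot = sum(w * b.get(token, 0.0) for token, w in a.items())
        norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
            sum(w * w for w in b.values())
        )
        return dot / norm if norm else 0.0

    query = weight(Counter(tokenize(description)))
    scored = [(code, cosine(query, weight(counts))) for code, counts in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical concepts; real codes and definitions would come from the ontology.
CONCEPTS = {
    "CONCEPT_AGE": "age of a person in years at the time of diagnosis",
    "CONCEPT_TUMOR_SIZE": "size of a tumor measured in centimeters",
    "CONCEPT_GENDER": "gender or sex of a person",
}

for code, score in rank_concepts("Patient age in years when diagnosed", CONCEPTS):
    print(f"{code}\t{score:.2f}")
```

The Metadata Specialist would then review the ranked list and select the appropriate concepts, as in step 4.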

Alternate Flow
Things which would prevent the normal flow of the use case

None.

Priority
The priority of implementing the use case: High, Medium or Low

High.

Associated Links
The brief user stories, each describing how the user interacts with the system for one function only of the use case. There would potentially be a number of user stories that make up the use case.

Fit criterion/Acceptance Criterion 
How would the actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?

The NLP algorithm must be more accurate than simple text matching.
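
One way this criterion could be checked is sketched below, assuming that "more accurate" means higher top-1 accuracy on a set of data element descriptions whose correct concept is already known. The use case does not define the metric, so that choice is an assumption, as are the function names, the concept codes, and the substring baseline that stands in for simple text matching.

```python
def top1_accuracy(matcher, labeled_examples):
    """Fraction of examples for which the matcher returns the expected concept.

    `matcher` takes a description and returns a concept code (or None).
    `labeled_examples` is a list of (description, expected_code) pairs.
    """
    if not labeled_examples:
        return 0.0
    hits = sum(
        1 for description, expected in labeled_examples
        if matcher(description) == expected
    )
    return hits / len(labeled_examples)


def meets_fit_criterion(nlp_matcher, text_matcher, labeled_examples):
    """True when the NLP matcher is strictly more accurate than the baseline."""
    return (
        top1_accuracy(nlp_matcher, labeled_examples)
        > top1_accuracy(text_matcher, labeled_examples)
    )


# Hypothetical concept names used by the simple text matching baseline.
CONCEPT_NAMES = {"CONCEPT_AGE": "age", "CONCEPT_GENDER": "gender"}


def simple_text_matcher(description):
    """Baseline: return the first concept whose name occurs in the description."""
    text = description.lower()
    for code, name in CONCEPT_NAMES.items():
        if name in text:
            return code
    return None
```

Any candidate NLP matcher with the same call signature can be passed as nlp_matcher, so the comparison does not depend on how that matcher is implemented.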

Train a natural language processor using existing semantic annotations

Use Case Number

Init1hm3.pm27.2

Brief Description

A large number of semantic annotations exist within information models in caBIG. These mappings of semantic concepts to data elements with textual descriptions could be used to train the natural language processing algorithm.

Actor(s) for this particular use case

Metadata Specialist

Pre-condition
The state of the system before the user interacts with it

The semantic annotations of all caBIG information models are available.

Post condition
The state of the system after the user interacts with it

A natural language processing algorithm is trained.

Steps to take
The step-by-step description of how users will interact with the system to achieve a specific business goal or function

  1. The Metadata Specialist selects the training set for the algorithm.
  2. The algorithm is trained.
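
The two steps above can be sketched as follows, assuming the training set is a list of (description, concept code) pairs drawn from existing annotations. The use case does not name a model, so a multinomial naive Bayes classifier with add-one smoothing is used purely as an illustration; the function names, concept codes, and example descriptions are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(annotations):
    """Build a naive Bayes model from (description, concept_code) pairs."""
    concept_counts = Counter()          # number of annotations per concept
    token_counts = defaultdict(Counter)  # token frequencies per concept
    vocabulary = set()
    for description, concept in annotations:
        tokens = tokenize(description)
        concept_counts[concept] += 1
        token_counts[concept].update(tokens)
        vocabulary.update(tokens)
    return {
        "concept_counts": concept_counts,
        "token_counts": token_counts,
        "vocabulary": vocabulary,
    }


def predict(model, description):
    """Return (concept_code, probability) pairs, most probable first."""
    total = sum(model["concept_counts"].values())
    vocab_size = len(model["vocabulary"])
    tokens = [t for t in tokenize(description) if t in model["vocabulary"]]
    log_scores = {}
    for concept, n in model["concept_counts"].items():
        counts = model["token_counts"][concept]
        denominator = sum(counts.values()) + vocab_size
        score = math.log(n / total)  # prior from annotation frequency
        for token in tokens:
            score += math.log((counts[token] + 1) / denominator)  # add-one smoothing
        log_scores[concept] = score
    # Normalize in a numerically stable way so the probabilities sum to 1.
    peak = max(log_scores.values())
    unnormalized = {c: math.exp(s - peak) for c, s in log_scores.items()}
    z = sum(unnormalized.values())
    return sorted(
        ((c, v / z) for c, v in unnormalized.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Step 1: a hypothetical training set selected by the Metadata Specialist.
TRAINING_SET = [
    ("Age of the patient in years", "CONCEPT_AGE"),
    ("Patient age at diagnosis", "CONCEPT_AGE"),
    ("Sex of the patient", "CONCEPT_GENDER"),
    ("Self reported gender of the participant", "CONCEPT_GENDER"),
]

# Step 2: the algorithm is trained.
model = train(TRAINING_SET)
```

Calling predict(model, "age in years at enrollment") returns the candidate concepts ordered by estimated probability. With a training set this small the numbers are not meaningful; a real training set would cover the annotations of the caBIG information models.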

Alternate Flow
Things which would prevent the normal flow of the use case

None.

Priority
The priority of implementing the use case: High, Medium or Low

Low.

Associated Links
The brief user stories, each describing how the user interacts with the system for one function only of the use case. There would potentially be a number of user stories that make up the use case.

Fit criterion/Acceptance Criterion 
How would the actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?

None.

