h1. Supplemental Page for Requirements Gathering

*Requirements planning for next generation semantics infrastructure* 

_This page is linked to the caBIG VKC Semantic Infrastructure pages as a gateway into requirements gathering and related materials for the next generation semantic infrastructure for caBIG®. The semantics computing capabilities to support interoperability in an SOA demand enhancements and new infrastructure to support the vision described below._ _This page contains links to the related documents and projects tied to development of_ *{_}new capabilities{_}*.
\\

h2. Vocabulary Knowledge Center Semantics Requirements Forum (VKC)

The community was asked to describe its requirements for the next generation infrastructure, and the [VKC Semantics Requirement Forum|https://cabig-kc.nci.nih.gov/Vocab/forums/viewforum.php?f=34] was created for this purpose. Please feel free to visit the site and comment or contribute additional ideas.

ICR requirements are posted [here|https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=43&t=146&sid=d26a0805c7809fe5330a36ec30aa2295].

h2. Background

The *Semantic Infrastructure and Operations Group* is responsible for the semantic aspects of the CORE Program Area, the caBIG® Vocabulary and Common Data Elements Workspace, and certain aspects of the caBIG Architecture Workspace, and caGrid®.
\\

The activities of the NCI CBIIT Semantics and Operations Group fall into three areas:
\\
* Content Management - the processes and procedures that ensure the breadth and quality of the metadata and terminology used to record the semantics of data meet the needs of the caBIG® community.
* Semantics Infrastructure - design and development of software resources and operations, including producing reference implementations of platform independent models.
* Semantics Architecture and Management - defining the platform independent (also called "implementation independent") specification for systems and processes required to meet the semantics needs of the CBIIT/caBIG enterprise, and assuring that operational requirements for semantics support are met in a timely and reliable way.

Our *vision* is to provide computational and human interpretable representation of the meaning and context of data and services. Realizing this goal is vital to enabling the caBIG® community to revolutionize biomedical research, personalized medicine, and integrated care. To achieve this vision, the semantic infrastructure must:

* Continue to provide caBIG with +computationally tractable representations of the meaning and representation of data+, and extend semantic support to analytic and other services so that they can be discovered, understood, and securely utilized.
* Utilize a +consistent, comprehensive information management discipline+ and software engineering standards such as [ISO 10746 RM-ODP|https://wiki.nci.nih.gov/x/IyAhAQ] and its companion standard [UML4ODP ISO 19793|https://wiki.nci.nih.gov/x/IyAhAQ] to define both enterprise semantics needs and implementation-neutral solutions to meet those needs.
* Provide +reference implementations+ of enterprise-level platform independent models addressing semantic needs, especially the need for behavioral semantics.
* +Reduce the level of effort+ associated with creating semantic information, in part by leveraging automated approaches, to the greatest extent possible, to harvest semantics information from line-of-business and software engineering activities.
\\

h3. Metadata management for Semantics Support

The semantic model and infrastructure form a key component of the caBIG® collaborative infrastructure. The current semantic infrastructure uses a modified version of ISO 11179 Ed2, which formed the seeds for the development of ISO 11179 Ed3. It is the central component that allows data elements and models to be annotated with concepts, then curated and registered in a repository that supports lookup and retrieval by both end users and applications. It enables the automatic integration and transformation of data for sharing and collaboration by provisioning the infrastructure with clear, computable, and unambiguous data descriptors +for those who would create software+ that can use and interpret the data in the service of cancer research.

The goal of the next generation infrastructure is to make these capabilities available to everyone through coarser-grained services that require little or no knowledge of the complexities of the infrastructure in order to gain its benefits.

Two additional opportunities for improvement in the current infrastructure approach are:

1) simplify the creation of this metadata, which is currently labor intensive and therefore does not scale well

2) expand the approach beyond data discovery and interchange to include services interoperability

h2. Interested Parties and Stakeholders in the New Semantic Infrastructure

The key to success for the next generation infrastructure is defining the critical, unmet semantic interoperability usage scenarios to ensure that the right next generation of services and infrastructure is provisioned. With the help of Mayo Clinic, through the caBIG Vocabulary Knowledge Center, a wiki is being organized to collect and report on requirements.

Interested parties are encouraged to become involved over the next few months as we attempt to characterize in more detail the requirements to achieve the interoperability vision.

h3. Primary Users

See the Semantic Infrastructure [Stakeholder Page on VKC wiki|https://cabig-kc.nci.nih.gov/Vocab/KC/index.php/SI_Conop_Stakeholders] for a more complete description of stakeholders.

The primary direct and indirect end users include:
* Software and Application designers and architects
* Software and Application engineers and developers
* Scientific and medical researchers
* Medical research protocol designers
* Clinical and scientific research data managers
* Clinicians
* Patients
* Medical research study participants 

Requirements materials attached to this wiki page have been received from the following groups and are being used to supplement the requirements described on the VKC wiki:
|| Stakeholder || Contact || Area of Interest || Usage/Primary Interaction Scenarios ||
| Clinical Governance Group | John Speakman [CTMS Wiki - Storyboards and Semantic Profiles for services interoperability|https://cabig-kc.nci.nih.gov/CTMS/KC/index.php/CTMS/CCTS_Interoperability_Scenarios_-_Draft#CTMS_Interoperability_Scenarios_-_DRAFT] | ScenPro Analyst and 5AM; ISO Datatype Documentation of ISO use for COPPA (all projects have to use the datatypes). Implementation of guidelines must be complete in June \\ | 1. Support for 'operationalized' ISO 21090 Datatypes \\
2. Discover and share/reuse models  \\
3. Rules engine and repository: \\
Scenario #8: Management of Routine Non-Laboratory-Based Adverse Events (caAERS and CDMS) - Protocol Metadata and Rules Engine |
| Life Sciences Governance Group \\ | Juli Klemm \\ | Seamless interoperability, discover services and data that can be combined; construct new workflows \\ | 1. Taverna, caB2B \\ |
| MediData | | Data Elements on Forms: Metadata registry information and semantic metadata, specifically data elements (CDEs) to record and share centrally defined forms variables that can be used to customize new protocols for clinical trials | 1. Share Forms including form structure, behavior and variables \\ |
| Genzyme | Sue Dubman \\ | MDR metadata exchange \\ | 1. share variables (CDEs) \\ |
| Researchers \\ | Yolanda Gill | Workflow: Metadata and rule support | 1. Metadata to support workflow rules \\ |
| MD Anderson | Mike Riben | Alignment/Interoperability between NCI and MD Anderson's metadata and vocabulary solutions; Possible UAT | 1. \\ |
| Emory | Stuart Turner, Eliot Seigel and Joel Saltz | Semantic interoperation requirements stemming from the TCGA Radiology and In Silico projects. Use case with 4 semantic requirements identified. | |
| Novartis | Mehta Saurin | MDR tools \\
i.e. \\
1) Alternate names for permissible values, i.e. currently you can register 'M', 'MALE' but we would like to register additional names such as 'm', '1', 'Male' etc. \\
2) Additional attributes for a data element - although there is a reference field available, we would want (for operational purposes) additional attributes to define items such as 'SAS format', display format etc. \\
3) A flag indicating whether the list of permissible values is extensible \\
4) A way to relate data elements to each other \\
5) Distributed repository | |
| CDISC and SHARE | | Semantic Media Wiki for harmonizing and updating data elements; Input on tooling and metadata extensions | |
| Mayo | Robert Friemuth | new metadata repository and CTS2 | \\ |
| ICR | Juli Klemm, Baris | caB2B [IRWG requirements|https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=43&t=146&sid=d26a0805c7809fe5330a36ec30aa2295], caBIG Gene Pattern and Analytical Services interoperability | \\ |
| Terminologists | Sherri De Coronado \\ | formal requirements for the new terminology and metadata services and for assessing equivalence between pre\- and post\- coordinated terminology | \\ |
| caBIG Community | Various | Vocabulary Knowledge Center Semantic Requirements [Wiki Forum|https://cabig-kc.nci.nih.gov/Vocab/forums/viewforum.php?f=34] | |
| Software Architects and Designers \\ | Anand Basu \\
Charlie Mead \\ | Discover and integrate services, on the fly, to perform scientific research \\ | 1. build new services that can interact with other existing services using workflow authoring tools such as Taverna\\
2. Support for "Conformance Profiles": Profiles are a mechanism used to constrain broader service capabilities to meet specific functional needs identified within a domain or locality (See attachment) \\ |
| Metadata Curators \\ | | Creating new content in caDSR \\ | 1. customizable download \\
2. improve search and browsing functions, leveraging existing semantics and metadata \\
3. improve batch upload/editing \\ |
\\
\\