The following are links to some useful external materials

The following are high level use case statements related to these requirements

Semantic Metadata
- Define semantic metadata for analytical service
- Define semantic metadata for scientific data
- Define semantic metadata for translation services
Dynamic Workflows
- Define workflow constraints
  - Desired output
  - Desired input
  - Data query parameters
  - Analytical parameters
  - Desired operations
  - Computational constraints/requirements
  - Time constraints/requirements
  - Storage constraints/requirements
- Generate workflow
- Validate workflow
- Run workflow
- Track workflow
- Share workflow
- Share dynamic workflow (template/constraints)
- Version workflow (design, creation, evolution)
Provenance Tracking
- Create intermediate data
- Fetch intermediate data
- Link data (process)
- Establish data ownership and security (attribution)
- Version data (republishing/updates)

The following are non-functional requirements that do not result in actor-oriented use cases

Define a semantic workflow standard encoding (e.g. OWL-S, WSMO, SWSL, SWSF)
Define a provenance standard encoding

The following are some basic discovery related use cases that pertain to the requirements

Discover data of interest

Use Case Number	Init3dbw2.pm21.1
Brief Description	Discover data of interest: A researcher wants to find data that has already been collected for use with caArray. They are able to find the data and to inspect the system to learn about what type of cells are in the database, what type of pathology is available for the data, etc.
Actor(s) for this particular use case	Cancer Researcher
Pre-condition The state of the system before the user interacts with it	Data services exist and are accessible.
Post condition The state of the system after the user interacts with it	Data of interest is discovered.
Steps to take The step-by-step description of how users will interact with the system to achieve a specific business goal or function	The Cancer Researcher identifies the characteristics of the data that he would like to discover (e.g. type, size, specific data fields, etc.). The Cancer Researcher performs a discovery query and gets back a number of datasets. The Cancer Researcher interrogates those datasets to determine if they are of interest.
Alternate Flow Things which would prevent the normal flow of the use case	None.
Priority The priority of implementing the use case: High, Medium or Low	High.
Associated Links The brief user stories, each describing the user interacts with the system for the one function only of the use case. There would potentially be a number of user stories that make up the use case.	Init3dbw2 - Provenance metadata to support Semantic Workflows
Fit criterion/Acceptance Criterion How would actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?	None.

Discover related data

Use Case Number	Init3dbw2.pm21.2
Brief Description	In some cases, two semantically equivalent data elements can be annotated with different semantic concepts that may or may not themselves be related. In these cases, there needs to be a mechanism to define semantic equivalence between the data elements or between the concepts, or to expand/contract the scope of the semantic query in the case of related concepts. For example, there needs to be a way to discover data elements annotated with either StartDate or Begin+Date, e.g. through a semantic equivalence of the two or through a widening/narrowing query.
Actor(s) for this particular use case	Metadata Specialist, Cancer Researcher
Pre-condition The state of the system before the user interacts with it	Two data elements exist and are individually discoverable.
Post condition The state of the system after the user interacts with it	The two data elements are discovered as semantically equivalent
Steps to take The step-by-step description of how users will interact with the system to achieve a specific business goal or function	A Metadata Specialist individually discovers the two data elements. The Metadata Specialist determines (manually) that these two data elements are semantically equivalent. The Metadata Specialist defines a rule that the data elements are semantically equivalent. A Cancer Researcher performs a discovery query that would normally (if there were no rules defined) return one of the data elements. Both of the data elements are returned to the Cancer Researcher.
Alternate Flow Things which would prevent the normal flow of the use case	If the two data elements are annotated with related concepts, the following alternate flow is possible: A Cancer Researcher discovers one of the data elements through a semantic query. The Cancer Researcher widens the semantic query to include additional related concepts (up the tree for less specific, down the tree for more specific). Both of the data elements are returned to the Cancer Researcher.
Priority The priority of implementing the use case: High, Medium or Low	High.
Associated Links The brief user stories, each describing the user interacts with the system for the one function only of the use case. There would potentially be a number of user stories that make up the use case.	None.
Fit criterion/Acceptance Criterion How would actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?	None.
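
The two flows above (an explicit equivalence rule, and a widened query over related concepts) can be sketched as follows. This is a minimal illustration only: the in-memory catalog, the concept names, the concept tree, and the function names are all hypothetical and do not correspond to any existing caGrid or caDSR interface.

```python
from typing import Dict, List, Set

# Hypothetical catalog: data element name -> annotated concept.
ELEMENT_CONCEPTS: Dict[str, str] = {
    "StartDate": "Start Date",
    "Begin+Date": "Begin Date",
    "EndDate": "End Date",
}

# Hypothetical equivalence rules defined by the Metadata Specialist.
EQUIVALENT_CONCEPTS: List[Set[str]] = [{"Start Date", "Begin Date"}]

# Hypothetical concept tree: child concept -> parent concept.
PARENT: Dict[str, str] = {
    "Start Date": "Date",
    "Begin Date": "Date",
    "End Date": "Date",
}


def expand_by_rules(concept: str) -> Set[str]:
    """Return the concept plus every concept declared equivalent to it."""
    expanded = {concept}
    for group in EQUIVALENT_CONCEPTS:
        if concept in group:
            expanded |= group
    return expanded


def widen(concept: str) -> Set[str]:
    """Move one level up the tree and include all concepts under that parent."""
    parent = PARENT.get(concept)
    if parent is None:
        return {concept}
    return {parent} | {c for c, p in PARENT.items() if p == parent}


def discover(concepts: Set[str]) -> List[str]:
    """Return the data elements annotated with any of the given concepts."""
    return sorted(e for e, c in ELEMENT_CONCEPTS.items() if c in concepts)
```

Under these assumptions, `discover({"Start Date"})` finds only one element, `discover(expand_by_rules("Start Date"))` finds both equivalent elements, and `discover(widen("Start Date"))` also picks up other elements under the same parent concept, which is why widening is the less precise of the two mechanisms.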

Aggregate data

Use Case Number	Init3dbw2.pm21.3
Brief Description	Aggregate data of interest: A researcher is able to query the system to find data that can be combined with their data. They are able to compare the characteristics of the datasets to ensure that the data are combinable, for example .
Actor(s) for this particular use case	Cancer Researcher
Pre-condition The state of the system before the user interacts with it	A number of datasets have been identified for aggregation.
Post condition The state of the system after the user interacts with it	Combinable data has been aggregated.
Steps to take The step-by-step description of how users will interact with the system to achieve a specific business goal or function	The Cancer Researcher performs a discovery query to find available datasets that can be aggregated with his dataset. The Cancer Researcher selects datasets to be aggregated. The Cancer Researcher selects aggregation parameters (e.g. data elements to combine). The Cancer Researcher performs the aggregation.
Alternate Flow Things which would prevent the normal flow of the use case	None.
Priority The priority of implementing the use case: High, Medium or Low	Low.
Associated Links The brief user stories, each describing the user interacts with the system for the one function only of the use case. There would potentially be a number of user stories that make up the use case.	Init3dbw2 - Provenance metadata to support Semantic Workflows
Fit criterion/Acceptance Criterion How would actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?	None.
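
As a rough sketch of the aggregation steps above, the snippet below treats "combinable" as meaning that every record in every selected dataset carries the data elements chosen as aggregation parameters. That criterion, the record layout, and the function names are assumptions made for illustration; the use case itself does not define how combinability is determined.

```python
from typing import Dict, List, Sequence

Record = Dict[str, object]


def is_combinable(datasets: Sequence[List[Record]], elements: Sequence[str]) -> bool:
    """Assumed criterion: every record exposes all data elements to combine."""
    return all(
        element in record
        for dataset in datasets
        for record in dataset
        for element in elements
    )


def aggregate(datasets: Sequence[List[Record]], elements: Sequence[str]) -> List[Record]:
    """Concatenate the datasets, keeping only the selected data elements."""
    if not is_combinable(datasets, elements):
        raise ValueError("datasets do not share the selected data elements")
    return [
        {element: record[element] for element in elements}
        for dataset in datasets
        for record in dataset
    ]
```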

The following use cases have direct overlap with these requirements but have been captured under Init1dbw6.pm8.U0 - Support caB2B to integrate services on caGrid

Expose service workflow metadata

Use Case Number	Init1dbw6.pm8.U1
Brief Description	It is commonplace in bioinformatics to string together a number of data and analytical operations in order to produce the desired output. In order for Cancer Researchers to discover which services can be piped together, it is necessary that the designers of the services expose the appropriate metadata.
Actor(s) for this particular use case	Information Modeler
Pre-condition The state of the system before the user interacts with it	A service exists that needs to be annotated.
Post condition The state of the system after the user interacts with it	The service is annotated sufficiently to be discovered and integrated into an analytical pipeline.
Steps to take The step-by-step description of how users will interact with the system to achieve a specific business goal or function	The Information Modeler defines the metadata (inputs, outputs, service functionality, etc.) for the service. The service is deployed and accessible to its users. The metadata is published in a way that can be queried and consumed by users of the service.
Alternate Flow Things which would prevent the normal flow of the use case	None.
Priority The priority of implementing the use case: High, Medium or Low	High.
Associated Links The brief user stories, each describing the user interacts with the system for the one function only of the use case. There would potentially be a number of user stories that make up the use case.	Init1dbw6 - Support caB2B to integrate services on caGrid Support caB2B Services to integrate data on grid
Fit criterion/Acceptance Criterion How would actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?	Sufficient metadata needs to be defined so that the service can be discovered and linked into a workflow.

Discovery of analytical steps

Use Case Number	Init1dbw6.pm8.U2
Brief Description	Once metadata about a service is defined and exposed, it must be queryable by users of the service. They must, by consuming the metadata alone, be able to determine which services can act as consumers of the data the service produces, as well as producers of the data the service consumes. Furthermore, the user must be able to determine that the service is appropriately placed within the workflow.
Actor(s) for this particular use case	Cancer Researcher
Pre-condition The state of the system before the user interacts with it	Service-level metadata is exposed for a number of services that can be linked via a workflow.
Post condition The state of the system after the user interacts with it	The Cancer Researcher knows which services can act as inputs to which other services.
Steps to take The step-by-step description of how users will interact with the system to achieve a specific business goal or function	The Cancer Researcher identifies an analytical service that he would like to use. The Cancer Researcher performs a query to identify which services can produce data that can be fed into the analytical service. The Cancer Researcher performs a query to identify which services can accept the data from the selected analytical service. Through the service metadata, the Cancer Researcher can identify which services make sense to pipe data through, and identify any services that are needed for data transformations.
Alternate Flow Things which would prevent the normal flow of the use case	The query could begin with a dataset or data service, and the Cancer Researcher would be identifying all downstream data and analytical services.
Priority The priority of implementing the use case: High, Medium or Low	High.
Associated Links The brief user stories, each describing the user interacts with the system for the one function only of the use case. There would potentially be a number of user stories that make up the use case.	Init1dbw6 - Support caB2B to integrate services on caGrid Support caB2B Services to integrate data on grid
Fit criterion/Acceptance Criterion How would actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?	The user must be able to identify services based on input/output types, as well as find the appropriate translation services if needed.

Storage and access of intermediate data

Use Case Number	Init1dbw6.pm8.U3
Brief Description	When services are chained together into bioinformatic pipelines, it is often desirable to be able to store and then later access intermediate results of queries and analytics. These can be used to modify the pipeline as needed, or to share intermediate results with other investigators.
Actor(s) for this particular use case	Cancer Researcher
Pre-condition The state of the system before the user interacts with it	A service that produces data has been identified and is accessible, as well as the mechanism by which intermediate data will be stored.
Post condition The state of the system after the user interacts with it	The results of the service are available via the intermediate data service.
Steps to take The step-by-step description of how users will interact with the system to achieve a specific business goal or function	The Cancer Researcher discovers and selects a service for storing the intermediate data. The Cancer Researcher instructs the tool or service to store the results in the intermediate data service. The Cancer Researcher invokes the service. The results are stored in the intermediate data service and are accessible via service calls.
Alternate Flow Things which would prevent the normal flow of the use case	None.
Priority The priority of implementing the use case: High, Medium or Low	Low.
Associated Links The brief user stories, each describing the user interacts with the system for the one function only of the use case. There would potentially be a number of user stories that make up the use case.	Init1dbw6 - Support caB2B to integrate services on caGrid Support caB2B Services to integrate data on grid
Fit criterion/Acceptance Criterion How would actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?	Access to the intermediate data must be as seamless as access to any other service, and the data should be secured based on rules that the Cancer Researcher identifies.

Workflow sharing

Use Case Number	Init1dbw6.pm8.U4
Brief Description	Once a user identifies a service workflow of interest, he should be able to share that workflow in a way that makes it easy to encode, share with colleagues, reuse/rerun, modify, and extend.
Actor(s) for this particular use case	Cancer Researcher
Pre-condition The state of the system before the user interacts with it	A set of services of interest has been identified.
Post condition The state of the system after the user interacts with it	The service workflow is stored and shared.
Steps to take The step-by-step description of how users will interact with the system to achieve a specific business goal or function	The Cancer Researcher encodes and saves the workflow. The Cancer Researcher identifies the other users that can access the workflow. The Cancer Researcher reruns the workflow at a later date. The Cancer Researcher modifies the workflow at a later date. The Cancer Researcher copies and extends the workflow with additional steps.
Alternate Flow Things which would prevent the normal flow of the use case	The steps listed above can be performed in any order any number of times with the exception that the workflow must be encoded and saved first.
Priority The priority of implementing the use case: High, Medium or Low	Low.
Associated Links The brief user stories, each describing the user interacts with the system for the one function only of the use case. There would potentially be a number of user stories that make up the use case.	Init1dbw6 - Support caB2B to integrate services on caGrid Support caB2B Services to integrate data on grid
Fit criterion/Acceptance Criterion How would actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?	The workflow must be accessible in much the same way as any other service.

Define a metadata category

Use Case Number	Init1dbw6.pm8.U5
Brief Description	A Metadata Category is a saved view of classes and their associations that can be used to find services that match. For example, a Cancer Researcher may be interested in A->B->C and wants to be able to query services that support those classes and associations.
Actor(s) for this particular use case	Cancer Researcher
Pre-condition The state of the system before the user interacts with it	None.
Post condition The state of the system after the user interacts with it	A Metadata Category has been defined.
Steps to take The step-by-step description of how users will interact with the system to achieve a specific business goal or function	The Cancer Researcher discovers a set of classes, attributes, and associations that he is interested in. The Cancer Researcher saves these as a Metadata Category in a repository.
Alternate Flow Things which would prevent the normal flow of the use case	The Cancer Researcher may want to load, update, delete, or share an existing Metadata Category.
Priority The priority of implementing the use case: High, Medium or Low	Low.
Associated Links The brief user stories, each describing the user interacts with the system for the one function only of the use case. There would potentially be a number of user stories that make up the use case.	Init1dbw6 - Support caB2B to integrate services on caGrid Support caB2B Services to integrate data on grid
Fit criterion/Acceptance Criterion How would actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?	None.

Discover services by metadata category

Use Case Number	Init1dbw6.pm8.U6
Brief Description	Once a Metadata Category is created, a Cancer Researcher can use it to discover services that support the underlying classes, attributes, and associations.
Actor(s) for this particular use case	Cancer Researcher
Pre-condition The state of the system before the user interacts with it	A Metadata Category has been identified.
Post condition The state of the system after the user interacts with it	A set of services of interest has been identified.
Steps to take The step-by-step description of how users will interact with the system to achieve a specific business goal or function	The Cancer Researcher loads the Metadata Category. The Cancer Researcher invokes the appropriate function to discover services that support the underlying parts of the Metadata Category. The Cancer Researcher selects a subset of the services of interest.
Alternate Flow Things which would prevent the normal flow of the use case	None.
Priority The priority of implementing the use case: High, Medium or Low	Low.
Associated Links The brief user stories, each describing the user interacts with the system for the one function only of the use case. There would potentially be a number of user stories that make up the use case.	Init1dbw6 - Support caB2B to integrate services on caGrid Support caB2B Services to integrate data on grid
Fit criterion/Acceptance Criterion How would actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?	None.
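
One way to picture a Metadata Category and the discovery step is as a subset test: a service supports the category if its model contains all of the category's classes and associations. The sketch below follows the A->B->C example from the previous use case; the class names, the service names, and the subset-based matching rule are assumptions, and attributes are omitted for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence, Tuple

Association = Tuple[str, str]  # (source class, target class)


@dataclass(frozen=True)
class MetadataCategory:
    """Hypothetical saved view of classes and their associations."""
    name: str
    classes: FrozenSet[str]
    associations: FrozenSet[Association]


@dataclass(frozen=True)
class ServiceModel:
    """Hypothetical description of the model a service exposes."""
    name: str
    classes: FrozenSet[str]
    associations: FrozenSet[Association]


def supports(service: ServiceModel, category: MetadataCategory) -> bool:
    """Assumed rule: the service model must contain the whole category."""
    return (
        category.classes <= service.classes
        and category.associations <= service.associations
    )


def discover_services(
    category: MetadataCategory, services: Sequence[ServiceModel]
) -> List[str]:
    return [s.name for s in services if supports(s, category)]


CATEGORY = MetadataCategory(
    "A-B-C", frozenset({"A", "B", "C"}), frozenset({("A", "B"), ("B", "C")})
)
SERVICES = [
    ServiceModel("FullService", frozenset({"A", "B", "C", "D"}),
                 frozenset({("A", "B"), ("B", "C"), ("C", "D")})),
    ServiceModel("PartialService", frozenset({"A", "B", "C"}),
                 frozenset({("A", "B")})),
]
```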

Perform operations based on metadata category

Use Case Number	Init1dbw6.pm8.U7
Brief Description	Once a user has identified a set of services that support a Metadata Category, he can invoke operations across those services and aggregate the results based upon the classes, attributes, and associations within the Metadata Category.
Actor(s) for this particular use case	Cancer Researcher
Pre-condition The state of the system before the user interacts with it	A set of services of interest has been identified via the Metadata Category.
Post condition The state of the system after the user interacts with it	The results from the cross-service operation are aggregated and presented to the user.
Steps to take The step-by-step description of how users will interact with the system to achieve a specific business goal or function	The Cancer Researcher selects the operation (such as "query") he would like to invoke on the selected services. The services are invoked and results are returned. The results are aggregated based upon the Metadata Category. The aggregated results are returned to the user.
Alternate Flow Things which would prevent the normal flow of the use case	None.
Priority The priority of implementing the use case: High, Medium or Low	Low.
Associated Links The brief user stories, each describing the user interacts with the system for the one function only of the use case. There would potentially be a number of user stories that make up the use case.	Init1dbw6 - Support caB2B to integrate services on caGrid Support caB2B Services to integrate data on grid
Fit criterion/Acceptance Criterion How would actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?	None.

Discover Related Data based on metadata

Use Case Number	Init1dbw6.U8
Brief Description	caB2B relies on establishing/identifying relationships between concepts (entities) and/or properties (attributes) exposed through a service's implementation models (currently registered in caDSR/advertised on caGrid). For instance, we need to be able to identify (or compute) that Gene (entity from Model A) and MyFavoriteGene (entity from Model B) represent identical concepts, even when the entities are not named the same. Furthermore, we need to identify that the Gene.entrezGeneId property (attribute from Model A) and MyGene.geneEntereId (attribute from Model B) represent the same property/attribute for the given concept. Currently, NCIt concept mappings stored in the caDSR for the Object Classes and CDEs are leveraged to identify such relationships. CDE mappings are the simplest to map across models because a shared CDE means that the physical database representation (syntax) is the same in the two models and the data can be directly aggregated. The new SI should continue to provide services and processes for us to be able to identify such relationships. In accordance with ECCF, the services/processes should at a minimum leverage the (computable) traceability among different levels of models (CIM->PSM) and compliance among model representations (PSM<->PSM). caB2B should be able to easily discover this equivalence.
Actor(s) for this particular use case	Cancer Researcher
Pre-condition The state of the system before the user interacts with it	A particular concept of interest has been identified via the terminology browser or metadata browser. Class/entities in models have been associated with common concepts.
Post condition The state of the system after the user interacts with it	The results from the cross-model discovery operation are aggregated and presented to the user.
Steps to take The step-by-step description of how users will interact with the system to achieve a specific business goal or function	The Cancer Researcher selects the concept of interest and the operation (such as "discover related models"). The SI services are invoked and results are returned. The results are aggregated based upon the matching entities and levels of entities in the models. For example, models or portions of models that are conformant at the CIM level, at the PIM level, and at the PSM level are self-evident in the results. The aggregated results are returned to the user.
Alternate Flow Things which would prevent the normal flow of the use case	None.
Priority The priority of implementing the use case: High, Medium or Low	High.
Associated Links The brief user stories, each describing the user interacts with the system for the one function only of the use case. There would potentially be a number of user stories that make up the use case.	Init1dbw6 - Support caB2B to integrate services on caGrid Support caB2B Services to integrate data on grid
Fit criterion/Acceptance Criterion How would actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?	Only models with matching classes, attributes, and associations are returned by the operation.
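
The idea of finding equivalent entities through a shared concept annotation, rather than through matching names, can be sketched as a lookup over annotations. The concept codes below are placeholders, not real NCIt codes, and the annotation table and function name are hypothetical; a real implementation would obtain the mappings from the metadata repository.

```python
from typing import Dict, List, Tuple

ModelEntity = Tuple[str, str]  # (model name, entity name)

# Hypothetical annotations: (model, entity) -> placeholder concept code.
ANNOTATIONS: Dict[ModelEntity, str] = {
    ("Model A", "Gene"): "CONCEPT_GENE",
    ("Model B", "MyFavoriteGene"): "CONCEPT_GENE",
    ("Model B", "Sample"): "CONCEPT_SPECIMEN",
}


def related_entities(
    concept_code: str, annotations: Dict[ModelEntity, str]
) -> List[ModelEntity]:
    """Return every (model, entity) pair annotated with the given concept."""
    return sorted(
        entity for entity, code in annotations.items() if code == concept_code
    )
```

Here `related_entities("CONCEPT_GENE", ANNOTATIONS)` returns both Gene from Model A and MyFavoriteGene from Model B even though the entity names differ, because the match is made on the concept annotation alone.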

Query for Data based on permitted valid values

Use Case Number	Init1dbw6.U9
Brief Description	caB2B uses value sets (that bind to concepts) to filter/constrain certain data elements in the formation of queries to datasets. This requires runtime access (or metadata download prior to runtime) to possible values for specific data elements. Currently, value domains that are associated with data elements stored in the metadata repository (MDR/caDSR) are fetched and provided in query builders. The enumerated values are provided in pull-down menus/lists to ensure the queries are formulated with correct allowable values. Similarly, although not supported by caB2B now, the allowable ranges (e.g. 0 <= Age <= 130) and units of measure can also be provided to support correct query formulation. The new SI should continue to provide services and processes for us to be able to identify and compute the value set, allowable ranges, or units of measure for querying data elements.
Actor(s) for this particular use case	Cancer Researcher
Pre-condition The state of the system before the user interacts with it	The enumerations (value sets) or allowable ranges or units of measures for attributes/data elements/variables in a particular database are available to help form queries of the database.
Post condition The state of the system after the user interacts with it	Data matching the selected enumerations (values) or ranges is retrieved via an SI operation.
Steps to take The step-by-step description of how users will interact with the system to achieve a specific business goal or function	The Cancer Researcher selects the entity/class of interest and an operation to retrieve the possible attributes/properties (such as selecting "Array" and asking getAttributes). The SI service returns the results (such as all the properties/attributes associated with Array in the models/data service). The Cancer Researcher discovers that there is an attribute in the data source that is called Array.type and that the possible values are "Gene Expression", "SNP", and "Exon", and issues a query (such as query Array.type="Gene Expression"). The SI services are invoked and results are returned. The results are returned to the user.
Alternate Flow Things which would prevent the normal flow of the use case	None.
Priority The priority of implementing the use case: High, Medium or Low	High.
Associated Links The brief user stories, each describing the user interacts with the system for the one function only of the use case. There would potentially be a number of user stories that make up the use case.	Init1dbw6 - Support caB2B to integrate services on caGrid Support caB2B Services to integrate data on grid
Fit criterion/Acceptance Criterion How would actor describe the acceptable usage scenarios for the software or service that meets the actor's requirement?	The researcher is able to discover what the possible or allowable values are for a data field and select/enter one or more values of interest. Results are returned that match only the entered values.
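
A minimal sketch of query formulation constrained by permitted values is shown below, using the Array.type enumeration and the 0 <= Age <= 130 range mentioned in the description. The `ValueDomain` class, the `DOMAINS` table, and the query string format are hypothetical; in practice the value domains would be fetched from the metadata repository rather than hard-coded.

```python
import json
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Union

Value = Union[str, int, float]


@dataclass(frozen=True)
class ValueDomain:
    """Hypothetical value domain: an enumeration or a numeric range."""
    permitted_values: Optional[FrozenSet[str]] = None
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def allows(self, value: Value) -> bool:
        if self.permitted_values is not None:
            return value in self.permitted_values
        # Range domains accept only real numbers (bool is excluded on purpose).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


# Hard-coded here for illustration; assumed to come from the metadata repository.
DOMAINS: Dict[str, ValueDomain] = {
    "Array.type": ValueDomain(
        permitted_values=frozenset({"Gene Expression", "SNP", "Exon"})
    ),
    "Age": ValueDomain(minimum=0, maximum=130),
}


def build_query(attribute: str, value: Value) -> str:
    """Build a simple equality query, rejecting values outside the domain."""
    domain = DOMAINS[attribute]
    if not domain.allows(value):
        raise ValueError(f"{value!r} is not an allowable value for {attribute}")
    return f"{attribute}={json.dumps(value)}"
```

Rejecting a value before the query is issued mirrors what the pull-down lists in a query builder achieve: the researcher can only submit values that the data source is known to accept.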

Init3dbw2.pm21 - Provenance metadata to support Semantic Workflows