NIH | National Cancer Institute | NCI Wiki  

Current Working Draft

7 Comments

  1. Unknown User (osters)

    Re: the central concepts of SI 2, in addition to "just enough semantics" I think we need to emphasize that semantic annotation and robustness need to be treated as an evolutionary process. One of the major barriers to entry of the current SI is that it is very front-heavy; you pretty much need a fully fleshed-out semantic model before you can easily leverage the toolset. I think the community drivers are not so much "we want less semantics" as the need for incremental delivery. Many groups with potentially significant value, in the form of existing tools and data sets, would like a path to provide that value to caBIG quickly and, over time, add our expected metadata and perhaps harmonize terminology or data definitions. So while I agree there is a need to support those services and datasets with little to no semantics (that is easy), I think it's more important that there is a process, supporting infrastructure, and tooling to incrementally ease those services up the semantic stairway.

  2. Unknown User (wileyal)

    Posted on behalf of Jyoti Pathak (Mayo)

    Re: "The caGrid 2.0 and Semantic Infrastructure 2.0 roadmap projects are focused on satisfying three overarching requirements that are central to success."

    For each of these 3 bullets, what would help is 4-5 sentences stating what the problems were in the existing/old infrastructure. As of now, the reader is lost as to what the issues were and why a change is warranted.

    1. Unknown User (meadch)

      • Lower the current barrier-to-entry to use of the caBIG® tools and technologies
        The first generation of caBIG® tools is heavily front-loaded in terms of the level of effort -- i.e. technical specificity, detail, and enterprise semantic compliance -- that is required in order for a service/software component to be deployed on caGrid.  In addition, many aspects of Grid service security are left to application developers who use the service.  This results in a considerable barrier to entry for service/component developers whose target deployment requirements do not require as much rigor as the "one size fits all" approach of caGrid 1.x.  The goal of the 2.0 effort is to fragment this front-loading into a series of "steps" or "levels" with appropriate tooling support so that deployment effort can be related to the complexity of the deployment context.  Thus, this first 2.0 requirement gives rise to the second requirement of the 2.0 infrastructure, i.e.
      • Provide a "linear value proposition" to caBIG® stakeholders – "make easy things easy to do"
        The goal of this requirement is to amortize the level of robustness that a given implementation needs to provide in terms of its overall Conceptual, Logical, and Implementable artifacts, its semantics/meta-data, and its security capabilities to enable service developers to deploy services with "just enough specifications, security, and semantics" to satisfy their target deployment context and its associated requirements.  For example, deploying a service/component in a local context with established trust (implicit or explicit) among users and a requirement that data returned by a service be consumable by humans but not computationally robust requires considerably less rigor in terms of specifications, security capabilities, and semantic rigor than required for a software component deployed in a world-wide community with potential use by clients of unknown trust and requiring its data (or behavior) be computationally processable.  The first deployment context is "easy," the second "hard."  And one can imagine several "not quite as hard" deployment contexts between the two.
      • Provide support for users of the first-generation caBIG® infrastructure and their data
        One of the biggest differences between the current semantic infrastructure, as manifested by the caDSR et al., and the planned SI 2.0 is the increased amount and associated detail of informational (static) meta-data that can be specified, e.g. increased meta-data to specify context (see the previous two bullets, and note that this meta-data will not be required except for CBIIT-funded services).  This requirement is focused on the fact that, irrespective of any increased value that the SI 2.0 infrastructure might bring to the developers of <<new>> meta-data, the 2.0 infrastructure must support the considerable amount of first-generation data elements and associated meta-data.  The details of this support can range from "migration to a 2.0-compliant representation" to "infrastructure support for 1.x representation in selected high-value contexts."  As such, several strategies are being evaluated as part of the SI 2.0 Inception Phase/Roadmap development activities currently underway.
  3. Unknown User (wileyal)

    Posted on behalf of Jyoti Pathak (Mayo)

    Re: "This addresses the fundamental importance of semantics in any architecture in the broadest possible context of the life sciences and healthcare."

    This is a very broad sentence. Needs further clarification and details.

    1. Unknown User (meadch)

      The point of pre-pending the "s" to SOA is to underscore the importance of informational semantics -- what the caBIG community has historically referred to as data elements and meta-data -- in a SOA deployed into the life sciences and clinical sciences domains.  In particular, not only are informational -- and ultimately, as caGrid 1.x has learned in the context of analytic services and the CCTS collection of applications, behavioral -- semantics critical to the robust, relevant, and responsive functioning of an SOA in the clinical and life sciences domains, it is also the case that often there are <<standards>> for the expression of these semantics, something that is quite often <<not>> the case in SOAs deployed in other contexts.  Finally, the "s" is pre-pended to underscore the message that not only are semantics important, they are, in fact, critical success factors for large-scale interoperability in the clinical and life sciences where the interoperability requirements are quite often more complex and/or more comprehensive than those present in other SOA deployment contexts.  In summary, then, the "s" in "sSOA" is there simply to underscore some of the important "givens" as CBIIT moves to an enterprise/community architecture based on SOA as its paradigm for Connecting Communities and Content.

  4. Unknown User (wileyal)

    Posted on behalf of Jyoti Pathak (Mayo)

    Re: "just enough security, semantics, and specifications."

    How does one define "just enough"? Seems like a vague phrase.

    1. Unknown User (meadch)

      "Just enough" is intentionally "vague" in the sense that absolute requirements cannot -- and actually should not -- be specified up front.  Rather, the overarching philosophy for the 2.0 infrastructure is based on the belief that a service developer should be able to register and deploy a service in one hour of effort beyond what is required to develop the service for a non-caGrid deployment.  The notion of the caGrid infrastructure requiring "just enough semantics, security, and/or specification" is meant -- in that context -- to say "no additional work in any of those dimensions is required beyond what a developer should reasonably expect to supply for the service/component's originally intended context."  In particular, if the service/component is being deployed for use in a well-understood, trusted context in which semantics are either only required at a human-to-human level or are, alternatively, well-known and understood at a computational level, the caGrid 2.0/SI 2.0 deployment will not require any additional specification, security, or semantic responsibilities.  If, on the other hand, the intended deployment context on caGrid is for unanticipated use within unknown trust relationships and requirements for computational semantics -- or some combination of those three dimensions of complexity -- caGrid 2.0 and its associated infrastructure will both require and assist in guiding the developer in terms of increased specification requirements (e.g. both informational and behavioral meta-data) as well as provide tools to equip the deployed service/component with the appropriate service-/component-level security capabilities.

      Also, see Scott Oster's comment (above) regarding the notion that "just enough" is meant as a pointer to the requirement to develop "stair-step" processes and tools which distribute the current "one size fits all, front-loaded" requirements of caGrid 1.x across several levels of deployment complexity.
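
      To make the three dimensions of complexity named in the reply above a little more concrete, here is a minimal Python sketch of how a deployment context might map to a "step" on the stairway. Everything in it is an assumption made for illustration: the thread does not define concrete levels, so the names Rigor, DeploymentContext, and required_rigor are hypothetical, and counting "hard" dimensions is only one possible way to place a context between the "easy" and "hard" extremes. It is not part of caGrid 2.0 or SI 2.0.

```python
from dataclasses import dataclass
from enum import IntEnum


class Rigor(IntEnum):
    """Hypothetical rigor levels; the thread names no concrete levels."""
    MINIMAL = 0       # no work beyond the originally intended context
    INTERMEDIATE = 1  # one of the "not quite as hard" contexts
    FULL = 2          # full specification, security, and semantics


@dataclass(frozen=True)
class DeploymentContext:
    """The three dimensions of complexity, as illustrative booleans."""
    trusted_users: bool         # trust among users is established
    anticipated_use: bool       # use is well understood in advance
    computable_semantics: bool  # data must be computationally processable


def required_rigor(ctx: DeploymentContext) -> Rigor:
    """Count how many dimensions are 'hard' and map that to a level."""
    hard_dimensions = sum([
        not ctx.trusted_users,
        not ctx.anticipated_use,
        ctx.computable_semantics,
    ])
    if hard_dimensions == 0:
        return Rigor.MINIMAL
    if hard_dimensions == 3:
        return Rigor.FULL
    return Rigor.INTERMEDIATE


# The "easy" context: local, trusted, human-readable output only.
easy = DeploymentContext(trusted_users=True, anticipated_use=True,
                         computable_semantics=False)
# The "hard" context: world-wide, unknown trust, computable data.
hard = DeploymentContext(trusted_users=False, anticipated_use=False,
                         computable_semantics=True)
```

      A real infrastructure would presumably weigh the dimensions separately (for example, security driven by trust and meta-data driven by semantics) rather than collapse them into a single count; the sketch shows only that the required effort grows with the deployment context and is not fixed up front.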