NIH | National Cancer Institute | NCI Wiki  

Archive - BiomedGT Publishing to LexBIG with Protégé


About this page

This page explains the process and the procedural steps for publishing BiomedGT content to LexBIG.

Process Overview

The following sections explain the steps for each procedure in the publishing process.

Run a Prompt baseline comparison (workflow manager)

Using the Prompt plug-in, a workflow manager follows these steps:

  1. Runs a bi-weekly comparison of the current Protégé database against the baseline from the last comparison.
  2. Exports a copy of the master Protégé project file and uses it as a baseline during the next comparison cycle.

Operations tasks

Approximately once a month, Operations publishes the baseline file to LexBIG. This section describes each task.

Eliminate unpublishable properties and roles

BiomedGT Publishing procedure

Prompt is run against the Protégé application on a biweekly basis. At the completion of this process, a baseline is exported for use in the next Prompt procedure. Approximately monthly, this baseline is turned over to Operations for publication to LexBIG. Note: LexBIG expects a ByCode version of the OWL file. It is unclear at this point whether the baselines being created are ByName or ByCode; if they are ByName, we will need a process to convert them to ByCode.
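If a ByName-to-ByCode conversion turns out to be needed, it could in principle be a bulk IRI rewrite over the exported OWL content. A minimal sketch, treating the ontology as plain string triples; the base IRI, the triples, and the name-to-code mapping are illustrative assumptions, not the actual BiomedGT data:

```python
# Hypothetical ByName -> ByCode conversion sketch. The ontology is
# modeled as (subject, predicate, object) string triples; the base IRI
# and mapping below are assumptions for illustration only.

BASE = "http://example.org/BiomedGT.owl#"  # assumed base IRI

def by_name_to_by_code(triples, name_to_code):
    """Rewrite every ByName IRI to its ByCode equivalent."""
    def convert(term):
        if term.startswith(BASE):
            name = term[len(BASE):]
            if name in name_to_code:
                return BASE + name_to_code[name]
        return term  # literals and unmapped IRIs pass through unchanged
    return [tuple(convert(t) for t in triple) for triple in triples]
```

A real conversion would operate on the OWL file itself (for example with an RDF library), but the core step is the same name-to-code IRI substitution.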

Note: Currently the OWL version of the TDE baseline is loaded into Protégé and classified solely for QA purposes. This is to verify that the removal of roles did not create any classification issues. Since we are going to classify the Ontylog output, this may no longer be necessary.

Currently there is a QA process that takes in the baseline, the previous monthly baseline, and the history export for the month. The output is helpful in finding inconsistencies in the data and should be continued in some form for BiomedGT. Some process will be needed to "clean" the history for processing.
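The kind of inconsistency the QA process surfaces can be sketched with set arithmetic over concept codes. The record shapes and action names below are assumptions, not the actual evs_history schema:

```python
# Illustrative consistency checks over two baselines and a cycle's
# history, assuming history records are (concept_code, action) tuples.
# Field names and the action vocabulary are assumptions.

def find_inconsistencies(current_codes, previous_codes, history):
    """Return suspect concept codes grouped by kind of error."""
    created = {c for c, a in history if a == "create"}
    retired = {c for c, a in history if a == "retire"}
    touched = {c for c, _ in history}
    return {
        # concepts created and retired within the same publishing cycle
        "created_and_retired": created & retired,
        # new concepts that have no history records at all
        "no_history": (current_codes - previous_codes) - touched,
        # history records that match no concept in either baseline
        "orphan_history": touched - (current_codes | previous_codes),
    }
```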

To eliminate unpublishable properties and roles, Operations runs a Prompt comparison. This procedure accomplishes the following:

  • Compares the current and the previous baseline against the edit history as recorded in the evs_history table, revealing such editing errors as
      • concepts created and retired in the same publishing cycle;
      • concepts that have no history; and
      • history records that have no matching concepts.
  • Cleans up the history and exports it to the concept_history table for editing. Cleanup tasks include
      • combining multiple modifies on a concept into a single modify record; and
      • eliminating modify records on concepts that have been created, merged, split, or retired.
  • Exports a number of history files from the concept_history table for publication, including
      • a cumulative history for publication in LexBIG;
      • a monthly history showing all publishable history records for the cycle; and
      • a monthly history highlighting only creations and retirements (including splits and merges) for use by the caDSR.
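The history cleanup tasks above can be sketched as a single pass over the cycle's records. As before, the (concept_code, action) record shape is an assumption for illustration:

```python
# Minimal sketch of the history cleanup step: collapse repeated
# modifies into one record per concept and drop modify records for
# concepts that were also created, merged, split, or retired in the
# cycle. Record shape and action names are assumptions.

TERMINAL = {"create", "merge", "split", "retire"}

def clean_history(history):
    """Return a cleaned copy of (concept_code, action) records."""
    has_terminal = {c for c, a in history if a in TERMINAL}
    cleaned, seen_modify = [], set()
    for code, action in history:
        if action == "modify":
            if code in has_terminal or code in seen_modify:
                continue  # drop redundant or superseded modify
            seen_modify.add(code)
        cleaned.append((code, action))
    return cleaned
```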

      Generate Ontylog formatted file and flat text file

Operations generates an Ontylog-formatted file for input into the DTS and for download, in support of caCORE 3.2. The baseline processing also generates a flat text file for download.

Note: This process supports caCORE 3.2. Once caCORE 3.2 is retired, the process will be discontinued.
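The flat text file export could, in spirit, look like the following sketch. The concept fields and the tab-delimited column layout are assumptions, not the published flat-file format:

```python
# Hedged sketch of a flat text export, assuming each concept is a dict
# with "code", "name", and "parents" fields. The actual column layout
# of the published flat file is an assumption.

def to_flat_text(concepts):
    """Render concepts as tab-delimited lines: code, name, parent codes."""
    lines = []
    for c in concepts:
        lines.append("\t".join([c["code"], c["name"], "|".join(c["parents"])]))
    return "\n".join(lines)
```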

      Load OWL file into LexBIG with cumulative history; compare production to QA

  1. Load the OWL file into LexBIG on the Dev server, along with the cumulative history up to the date the baseline file was created.
  2. Tag the baseline as QA. The QA baseline resides side by side with the previous PRODUCTION version.
  3. Run a series of scripts to compare the data in the PRODUCTION version to the data in QA.

These scripts will most likely be run against the LexBIG API, since that is the form in which the user will see the data.
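In spirit, the comparison scripts reduce to diffing two exports of the vocabulary. The sketch below assumes each version can be exported (for example, via scripts over the LexBIG API) as a mapping of concept code to preferred name; that export shape is an assumption:

```python
# Illustrative PRODUCTION vs. QA comparison, assuming each version is
# exported as a {code: preferred_name} dict. The export format is an
# assumption for illustration only.

def compare_versions(production, qa):
    """Report codes added, removed, or renamed between two exports."""
    prod_codes, qa_codes = set(production), set(qa)
    return {
        "added": sorted(qa_codes - prod_codes),
        "removed": sorted(prod_codes - qa_codes),
        "renamed": sorted(c for c in prod_codes & qa_codes
                          if production[c] != qa[c]),
    }
```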

      Load Ontylog version into DTS

Operations loads the Ontylog version of the vocabulary into DTS. The loading method requires that the data be loaded into TDE, classified, and then transferred into DTS using Apelon-created applications.

The classification step can be considered a QA marker, as some bad transformations can cause a classification to fail. Bad formatting can also result in a failure to load into TDE or to transfer to DTS.

      Tag QA version as Production; publish vocabulary

      Once the tests return satisfactory results, Operations follows these steps:

  1. Removes the old PRODUCTION version.
  2. Tags the QA version as PRODUCTION. The vocabulary enters the promotion schedule and is given an expected publication date in the EVS schedule.
  3. Loads the final output files, both data and history, onto the FTP server in preparation for final publication.
  4. Publishes the history files as soon as possible so that EVS-reliant data can be verified for use in the caDSR.

Notes:
The data on Dev is transferred up the tiers along parallel tracks. We push the DTS data to QA and make it available on the nciterms-qa website. We do not have access to the QA layer for LexBIG, so we submit deployment requests for both the data and the indexes. We push the LexBIG data to the software QA server and make it available on the bioportal-dataqa website.

After the data has been on QA for one to two weeks, we send deployment requests to move both DTS and LexBIG up to Stage. After a day there, the data can be moved to production.
