NIH | National Cancer Institute | NCI Wiki  

About this page

This page explains the procedure for publishing BiomedGT content to LexBIG.

Prompt baseline comparison

Using the Prompt plug-in, a workflow manager runs a bi-weekly comparison of the current Protégé database against the baseline from the last comparison. After completing this process, the manager exports a copy of the master project file and uses it as a baseline during the next comparison cycle. Approximately once a month, Operations publishes the baseline file to LexBIG.

Removal of unpublishable properties and roles

The first step performed by Operations is to eliminate unpublishable properties and roles from the Protégé baseline.

Note: Currently the OWL version of the TDE baseline is loaded into Protégé and classified solely for QA purposes. This is to verify that the removal of roles did not create any classification issues. Since we are going to classify the Ontylog output, this may no longer be necessary.

The Prompt procedure compares the current and the previous baseline against the edit history recorded in the evs_history table. This process can reveal editing errors such as:

  • concepts created and retired in the same publishing cycle;
  • concepts that have no history; and
  • history records that have no matching concepts.
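The three checks above can be sketched as simple set operations. This is only an illustration, not the Prompt plug-in's actual logic: the function name, the (concept_code, action) record shape, the action names "create" and "retire", and the idea of passing in the set of concepts that differ between the two baselines are all assumptions made for the example.

```python
from collections import defaultdict


def find_editing_errors(history, changed_codes, known_codes):
    """Flag the three kinds of editing errors for one publishing cycle.

    history       -- iterable of (concept_code, action) pairs for the cycle
                     (hypothetical shape; the real evs_history layout may differ)
    changed_codes -- codes of concepts that differ between the two baselines
    known_codes   -- codes of all concepts present in the current baseline
    """
    actions = defaultdict(set)
    for code, action in history:
        actions[code].add(action)
    history_codes = set(actions)

    return {
        # Created and retired within the same cycle.
        "created_and_retired": sorted(
            c for c in history_codes if {"create", "retire"} <= actions[c]
        ),
        # Changed between baselines, but no history record explains the change.
        "no_history": sorted(set(changed_codes) - history_codes),
        # History record refers to a concept that is not in the baseline.
        "orphan_history": sorted(history_codes - set(known_codes)),
    }
```

Each key of the returned dictionary holds a sorted list of concept codes, so an empty list means that check found nothing to report.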

Prompt will also clean up the history and export it to the concept_history table for editing. Cleanup tasks include:

  • combining multiple modifies on a concept into a single modify record; and
  • eliminating modify records on concepts that have been created, merged, split, or retired.
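A minimal sketch of the two cleanup rules, assuming history records are available as (concept_code, action) pairs in chronological order. The function name, the record shape, and the action names are hypothetical; the actual cleanup is performed by Prompt and may keep other fields, such as dates or editor names.

```python
# Actions that make a separate "modify" record redundant (assumed names).
STRUCTURAL_ACTIONS = {"create", "merge", "split", "retire"}


def clean_history(records):
    """Return a cleaned copy of a cycle's history records.

    records -- list of (concept_code, action) pairs in chronological order
    """
    # Concepts that were created, merged, split, or retired in this cycle.
    structural_codes = {
        code for code, action in records if action in STRUCTURAL_ACTIONS
    }

    cleaned = []
    already_modified = set()
    for code, action in records:
        if action == "modify":
            # Rule 2: drop modifies on concepts with a structural action.
            if code in structural_codes:
                continue
            # Rule 1: keep only one modify record per concept.
            if code in already_modified:
                continue
            already_modified.add(code)
        cleaned.append((code, action))
    return cleaned
```

The sketch keeps the first modify record for each concept; which record survives (first or last) is a design choice the source does not specify.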

Three history files will be exported from the concept_history table for publication:

  • a cumulative history for publication in LexBIG;
  • a monthly history showing all publishable history records for the cycle; and
  • a monthly history highlighting only creations and retirements (including splits and merges) for use by caDSR.
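The relationship between the three exports can be illustrated with a short filter. This is a sketch under stated assumptions: records are represented as dictionaries with "date" and "action" keys, every record passed in is already publishable, and the action names are hypothetical. The real export reads from the concept_history table.

```python
from datetime import date

# Actions of interest to caDSR: creations and retirements,
# including splits and merges (assumed action names).
CADSR_ACTIONS = {"create", "retire", "split", "merge"}


def select_history_exports(records, cycle_start, cycle_end):
    """Split publishable history records into the three export sets.

    records     -- list of dicts with at least "date" (datetime.date)
                   and "action" (str) keys
    cycle_start -- first day of the current publishing cycle
    cycle_end   -- date the baseline file was created
    """
    # Everything up to the baseline date, for LexBIG.
    cumulative = [r for r in records if r["date"] <= cycle_end]
    # Only the records that fall inside the current cycle.
    monthly = [r for r in cumulative if r["date"] >= cycle_start]
    # The subset of the monthly records that caDSR needs.
    cadsr = [r for r in monthly if r["action"] in CADSR_ACTIONS]
    return cumulative, monthly, cadsr
```

Because each export is a filtered view of the previous one, the caDSR file is always a subset of the monthly file, which in turn is a subset of the cumulative file.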

Generate an Ontylog formatted file and a flat text file

The baseline must be processed to generate an Ontylog-formatted file for input into DTS and for download, in support of caCORE 3.2. Once caCORE 3.2 is retired, this process will be discontinued.

The baseline processing will also generate a flat file for download.

Load OWL file into LexBIG with cumulative history; compare production to QA

The OWL file will be loaded into LexBIG on the Dev server, along with the cumulative history up to the date the baseline file was created. This baseline will be tagged as QA and will reside side by side with the previous PRODUCTION version. A series of scripts will be run to compare the data in the PRODUCTION version with the data in QA. These scripts will most likely be run against the LexBIG API, since that is how users will see the data.
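The kind of comparison such a script might perform can be sketched as a diff between two snapshots. The sketch below does not call the LexBIG API. It assumes, purely for illustration, that each version has already been retrieved into a dictionary mapping concept codes to property dictionaries; the function name and data shape are hypothetical.

```python
def compare_versions(production, qa):
    """Summarize differences between PRODUCTION and QA snapshots.

    production, qa -- dicts mapping concept code -> dict of properties
                      (hypothetical in-memory form of each version)
    """
    prod_codes = set(production)
    qa_codes = set(qa)
    return {
        # Concepts that appear only in the new QA version.
        "added": sorted(qa_codes - prod_codes),
        # Concepts that are missing from the new QA version.
        "removed": sorted(prod_codes - qa_codes),
        # Concepts present in both whose properties differ.
        "changed": sorted(
            code for code in prod_codes & qa_codes
            if production[code] != qa[code]
        ),
    }
```

A reviewer could then check the "added" and "removed" lists against the monthly history, since every difference between the two versions should be explained by a history record.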

Load Ontylog version into DTS

The Ontylog version of the vocabulary will be loaded into DTS. Loading requires the data to be loaded into TDE, classified, and then transferred into DTS using Apelon-created applications. The classification step can be considered a QA marker, as some bad transformations can cause classification to fail. Bad formatting can also result in a failure to load into TDE or to transfer to DTS.

Tag QA version as Production; publish vocabulary

Once the tests return satisfactory results, the old PRODUCTION version will be removed and the QA version will be tagged as PRODUCTION. The vocabulary will be entered into the promotion schedule and be given an expected publication date in the EVS schedule.

The final output files, both data and history, will be loaded onto the FTP server in preparation for final publication. caDSR needs the history files published as early as possible in the cycle so that it can check for impact on its EVS-reliant data.

The data on Dev will be transferred up the tiers along parallel tracks. We will push the DTS data to QA and make it available on the nciterms-qa website. We will push the LexBIG data to the software QA server and make it available on the bioportal-dataqa website.

After 1 to 2 weeks on QA, we will send deployment requests to move both DTS and LexBIG up to Stage. After a day there, they can be moved to production.
