NIH | National Cancer Institute | NCI Wiki  

This is the procedure for publishing BiomedGT content to LexBIG.

Prompt

Prompt will be performed on the Protégé application on a (bi)weekly basis. At the completion of this process, a baseline is exported for use in the next Prompt procedure. On an approximately monthly basis, this baseline will be turned over to Operations for publication to LexBIG.

Note

LexBIG expects a ByCode version of the OWL file. It is unclear at this point whether the baselines being created are ByName or ByCode. If they are ByName, then we will need a process to convert them to ByCode.

Remove unpublishable properties and roles

The first step performed by Operations will be to run the baseline through a process that eliminates unpublishable properties and roles (for example, Editor_Note). Currently, the OWL version of the TDE baseline is loaded into Protégé and classified purely for QA purposes, to check that the removal of roles did not create any classification issues. Since we are going to classify the Ontylog output (below), this step may no longer be necessary.
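
As a rough illustration of the removal step, the sketch below strips property elements from an RDF/XML string using Python's standard library. It is only a sketch under assumptions: it assumes each unpublishable property appears as a child element whose local tag name equals the property name, and the function names are hypothetical. Roles expressed as OWL restrictions, and the declarations of the properties themselves, would need additional handling that is not shown here.

```python
import xml.etree.ElementTree as ET

# Editor_Note is the example named in this procedure; the real list of
# unpublishable properties and roles would be supplied by Operations.
UNPUBLISHABLE = {"Editor_Note"}


def local_name(tag):
    """Return the tag name without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def strip_unpublishable(owl_xml, names=UNPUBLISHABLE):
    """Remove child elements whose local name is in `names`.

    Returns the filtered XML text and the number of elements removed.
    """
    root = ET.fromstring(owl_xml)
    removed = 0
    # Snapshot the element list first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if local_name(child.tag) in names:
                parent.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

The removal count can be logged with each run so that an unexpected drop or spike is noticed before the file moves on to classification.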

Currently there is a QA process that takes as input the baseline, the previous monthly baseline, and the history export for the month. Its output is helpful in finding inconsistencies in the data and should be continued in some form for BiomedGT. Some process will also be needed to "clean" the history for processing.

  • detect concepts created and retired in the same publishing cycle
  • detect concepts that have no history
  • detect history records that have no matching concepts
  • combine multiple modifies on a concept into a single modify record
  • eliminate modify records on concepts that have been created, merged, split or retired
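
The checks and clean-up steps listed above could be sketched roughly as follows. This is a minimal sketch, not the existing QA process: the record layout (concept code, action, ISO date), the action names, and the function name are all assumptions made for illustration. `known_codes` stands for the concepts found in either baseline, and `expected_codes` for the concepts that are expected to have history in this cycle.

```python
from collections import defaultdict

# Hypothetical action names; the real history export may use other labels.
STRUCTURAL = {"create", "merge", "split", "retire"}


def clean_history(records, known_codes, expected_codes):
    """Return (cleaned_records, report) for one publishing cycle.

    `records` is an iterable of (concept_code, action, date) tuples,
    with dates as ISO strings so that they sort chronologically.
    """
    by_code = defaultdict(list)
    for code, action, date in records:
        by_code[code].append((action, date))

    report = {
        # Concepts created and retired in the same cycle.
        "created_and_retired": sorted(
            code for code, events in by_code.items()
            if {"create", "retire"} <= {action for action, _ in events}
        ),
        # Concepts that should have history but have none.
        "no_history": sorted(set(expected_codes) - set(by_code)),
        # History records with no matching concept.
        "orphan_records": sorted(set(by_code) - set(known_codes)),
    }

    cleaned = []
    for code in sorted(by_code):
        events = by_code[code]
        structural = [(a, d) for a, d in events if a in STRUCTURAL]
        modify_dates = [d for a, d in events if a == "modify"]
        for action, date in structural:
            cleaned.append((code, action, date))
        # Keep a single modify record, and only when the concept was not
        # also created, merged, split or retired in this cycle.
        if not structural and modify_dates:
            cleaned.append((code, "modify", max(modify_dates)))
    return cleaned, report
```

In this sketch the combined modify record carries the latest modify date. Orphan records are reported but left in the cleaned output, on the assumption that someone will review them before publication.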

A number of history files will be created for publication:

  • A cumulative history for publication in LexBIG
  • A monthly history showing all publishable history records for the cycle
  • A monthly history highlighting only creations and retirements (including splits and merges) for use by caDSR
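
The two monthly files could be derived from the cleaned history along these lines. This is a sketch under assumptions: records are taken to be (concept code, action, ISO date) tuples, the action names are hypothetical, and any rule for deciding which records are publishable is not modeled here.

```python
# Hypothetical action names for the records caDSR cares about.
CADSR_ACTIONS = {"create", "retire", "split", "merge"}


def monthly_histories(records, cycle_start, cycle_end):
    """Split history into the monthly file and the caDSR subset.

    Dates are ISO strings (YYYY-MM-DD), so string comparison orders them
    chronologically. Both bounds are inclusive.
    """
    monthly = [r for r in records if cycle_start <= r[2] <= cycle_end]
    cadsr = [r for r in monthly if r[1] in CADSR_ACTIONS]
    return monthly, cadsr
```

The cumulative history for LexBIG would then be the previous cumulative file with the monthly records appended.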

Generate an Ontylog formatted file and a flat text file if needed

The baseline will need to be run through a process to generate an Ontylog formatted file for input into DTS, in support of caCORE 3.2. It has not yet been determined whether we will make this file available for download directly.

The baseline processing might also be required to generate a flat text file. It has not yet been determined whether this file will continue to be produced.

Load OWL file into LexBIG with cumulative history; compare production to QA

The OWL file will be loaded into LexBIG on the Dev server, along with the cumulative history up to the date the baseline file was created. This baseline will be labeled as QA and will reside side by side with the previous PRODUCTION version. A series of scripts will be run to compare the data in the PRODUCTION version to the data in QA. These scripts will most likely be run against the LexBIG API, since that is the form in which users will see the data.
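
The core of such a comparison script might look like the sketch below. It makes no LexBIG API calls: it assumes each version has already been retrieved into a dictionary that maps a concept code to its properties, and the function name and data layout are hypothetical.

```python
def compare_versions(production, qa):
    """Summarize differences between two {code: properties} mappings."""
    prod_codes, qa_codes = set(production), set(qa)
    return {
        # Concepts present only in the QA version.
        "added": sorted(qa_codes - prod_codes),
        # Concepts present only in the PRODUCTION version.
        "removed": sorted(prod_codes - qa_codes),
        # Concepts present in both versions whose properties differ.
        "changed": sorted(
            code for code in prod_codes & qa_codes
            if production[code] != qa[code]
        ),
    }
```

The "added" and "removed" lists can be checked against the creations and retirements in the monthly history, so that a mismatch points to either a load problem or a history problem.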

Load Ontylog version into DTS

The Ontylog version of the vocabulary will be loaded into DTS. The method of loading requires the data to be loaded into TDE, classified, then transferred into DTS using Apelon-created applications. The classification step can be considered a QA marker, as some bad transformations can cause classification to fail. Bad formatting can also result in a failure to load into TDE or to transfer to DTS.

Tag QA version as Production; publish vocabulary

Once the tests return satisfactory results, the old PRODUCTION version will be removed and the QA version will be tagged as PRODUCTION. The vocabulary will be entered into the promotion schedule and be given an expected publication date in the EVS schedule.

The final output files, both data and history, will be loaded onto the FTP server in preparation for final publication. caDSR needs the history files published as early as possible in the cycle in order to check for impact on their EVS-reliant data.

The data on Dev will be transferred up the tiers along parallel tracks. The DTS data will be pushed to QA by us and made available on the nciterms-qa website. We do not have access to the QA layer for LexBIG, so we need to put in deployment requests for both the data and the indexes.

After 1 to 2 weeks on QA, we will send deployment requests to move both DTS and LexBIG up to Stage. After a day there, they can be moved to Production.
