This section includes the following topics.


Creating a New Study
This section describes the processes for creating and managing studies in caIntegrator.
Topics in this section include:

Creating a Study – Overview
You can create a caIntegrator study by importing subject annotation data, genomic data, and imaging data, using a combination of spreadsheet files and existing caGrid applications as source data. Each instance of caIntegrator can support multiple studies. As the manager creating a study, you should understand the study well and confirm that the data you wish to aggregate has been submitted to the applications whose data can be integrated in caIntegrator.

As you create the study, you define its structure by identifying the data sources and mapping the data between them. After the study has been created and deployed, it can be used to perform analyses.


Configuring and Deploying a Study

When you create a study, you must specify the data types (subject annotation, array, image, etc.) and the data sources (the caGrid applications caArray and NBIA), and then map the data (patient to sample, image series, etc.).
To create a new study, follow these steps:

This opens an Edit Study page where you can identify data files for your study.
Creating/Editing a Study
The Edit Study page displays the Name and Description that you entered for a new study, or for an existing study that you are editing.

To continue creating a study or to modify a study, on the Edit Study page complete these steps:

To continue, you can add subject annotation data sources, genomic data sources or imaging data sources.






Viewing/Editing a Log
On the Edit Study page, as a study manager you can open a detailed log for the study.

Add an appropriate description/annotations to the individual log entries.

Working with Annotations – An Overview
One of the most important factors in creating a study in caIntegrator is properly annotating the data. Because the process can be relatively complex, you might want to review the steps for working with annotations.
Annotation workflow summary:

Adding An Annotation Group
This topic opens from both the Create Annotation Group page and the Edit Annotation Group page. If you plan to create a group, continue with this topic. If you plan to edit an annotation group, see Editing an Annotation Group.
An annotation group is a group of annotation definitions configured in a CSV file. This feature is primarily meant for the Study Manager who knows that they have tightly restricted vocabulary definitions that are relevant to a study. In this optional step, you can review the uploaded Group Definition Source file before assigning the appropriate definition for your study.
To add an annotation group, follow these steps:

The CSV file must include columns with these column headers in the first row: File Column Name, Field Type, Entity Type, CDE ID, CDE Version, Annotation Def Name, Data Type, Permissible, and Visible. Subsequent rows in the file define each subject annotation column in the subject annotation file.

– OR –

When you open the Define Fields for Subject Data page (see Define Fields Page for Editing Annotations), the annotation definitions in the file you uploaded appear on the page, available for assignment in the study. You can also view the definitions by viewing the annotation group listed in the first column of the matrix.
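Before uploading, the required column headers listed above can be checked with a short script. The following Python sketch is illustrative only and is not part of caIntegrator. The function name missing_headers is hypothetical, and the sketch assumes a comma-delimited file with the headers in the first row, as the text describes.

```python
import csv
import io

# Column headers the guide lists as required in the first row of the CSV file.
REQUIRED_HEADERS = [
    "File Column Name", "Field Type", "Entity Type", "CDE ID", "CDE Version",
    "Annotation Def Name", "Data Type", "Permissible", "Visible",
]


def missing_headers(csv_text):
    """Return the required headers that are absent from the first row.

    Hypothetical helper for illustration; csv_text is the file content
    as a string.
    """
    reader = csv.reader(io.StringIO(csv_text))
    first_row = next(reader, [])  # an empty file yields an empty header row
    present = {cell.strip() for cell in first_row}
    return [header for header in REQUIRED_HEADERS if header not in present]


# A header row containing every required column reports nothing missing.
header_only = ",".join(REQUIRED_HEADERS)
print(missing_headers(header_only))  # prints []
```

A non-empty result names the columns to add before the file is uploaded. The check covers header names only; it does not validate the values in subsequent rows.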

Editing an Annotation Group
This topic opens from the Edit Annotation Group page. You may want to refer to Adding An Annotation Group if you are adding a group for the first time.
To edit an annotation group, on the Edit Study page for a study with an existing annotation group, click the Edit Group button.

Adding Subject Annotation Data
The Edit Study page, described in Creating/Editing a Study, opens after you save a new study or click to edit an existing study.
To add subject annotation metadata on this page, follow these steps:

After the data file is uploaded to this study, it will be listed in the Subject Annotation Data Sources section of the Edit Study page.
From this page you can begin editing the annotations. In the Subject Annotation Data Sources section, click Edit Annotations corresponding to the subject annotations that have been uploaded for the study. This opens the Define Fields for Subject Data page.
Define Fields Page for Editing Annotations
The Define Fields for Subject Data page opens when you click Edit Annotations in the Subject Annotation Data Sources or the Imaging Data Sources section of the Edit Study page. The exception is when you have not yet imported annotations for the imaging data for the study. In that case, when you click the Edit Annotations button in the Imaging Data Sources section, a page opens where you can identify and upload image annotation data.
If the Define Fields page opens after you click the Edit Annotations button, working with the page is identical for both subject and image annotations.

The first column of the table on this page displays annotation groups that have been created for this study. For more information, see Adding An Annotation Group.
To add subject or image annotation metadata in this page, follow these steps:

The MOST important steps in creating a new study in caIntegrator:

Note the following regarding the list of annotations on this page:





Assigning An Identifier or Annotation
When you click Change Assignment on the Define Fields... page, the Assign Annotation Definition for Field Descriptor dialog box opens. In this dialog box you can change the column type and the field definition for the specific data field you selected.

As you select the column type, you can work with column headers in one of four ways in this dialog box.

If you choose to change the definition, you can still search for another annotation definition in the Search for an Annotation Definition section. See Searching for Annotation Definitions. Click Save to retain any changes.






Searching for Annotation Definitions
An alternative to creating a new definition is to search for annotation definitions already present in caIntegrator studies or in caDSR.

In summary, clicking the link assigns the definition on the Define Fields for Subject Data page and closes the Annotation Definition page.
You can modify any portion of the definition.

The Data From File columns on the page display the column header values of the first three rows you designated as "annotations".

The Edit Study page now displays a "Not Loaded" status for the file whose annotations (column headers) you have defined.

Status definitions:

The Manage Studies page opens when the study is deployed. The Deployed status is indicated on the Manage Studies page as well as the Edit Study page.
You can continue to perform other tasks in caIntegrator while deployment is in process.










Defining Survival Values
A survival value is the length of time a patient lived. If you plan to analyze your caIntegrator data to create a Kaplan-Meier (K-M) plot, then during the annotation definition process described above, you should do one of two things:

For some applications, such as REMBRANDT and I-SPY, survival values are pre-defined in the databases when you load the data. In caIntegrator, however, you can review and define survival value ranges in a data set you are uploading to a study. To be able to do so, you need to understand the kind of data that can comprise the survival values.
To set up survival values, follow these steps:

Survival values can be defined by Date or by Length of time in study. Select the radio button for the category that defines your survival data.
In the drop-down lists, select the appropriate survival value definitions for each field listed. You might want to refer to the column headers in the data file itself. Dates covered by the definitions are already in the data set. You cannot enter specific dates.
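To make the two categories concrete, the following Python sketch shows how a date-based definition can be reduced to a length of time suitable for a Kaplan-Meier analysis. It is a minimal illustration under stated assumptions, not caIntegrator's implementation: the parameter names start, death, and last_followup are hypothetical stand-ins for whichever date fields your data file contains, and treating a subject with no death date as censored at last follow-up is the conventional K-M approach rather than something this guide specifies.

```python
from datetime import date


def survival_from_dates(start, death=None, last_followup=None):
    """Return (days, censored) for one subject.

    Hypothetical helper: 'censored' is True when no death date is known
    and the last follow-up date is used as the end point instead.
    """
    end = death if death is not None else last_followup
    if end is None:
        raise ValueError("need a death date or a last follow-up date")
    if end < start:
        raise ValueError("end date precedes start date")
    return (end - start).days, death is None


# Subject with a recorded death date: an observed event.
print(survival_from_dates(date(2005, 1, 1), death=date(2006, 1, 1)))
# prints (365, False)

# Subject still alive at last follow-up: a censored observation.
print(survival_from_dates(date(2005, 1, 1), last_followup=date(2005, 1, 31)))
# prints (30, True)
```

When the data file already records length of time in the study, no such conversion is needed, which is why the page offers the two categories separately.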

Adding/Editing Genomic Data

Once you have loaded subject annotation data and identified patient IDs, you can add one or more sets of array genomic sample data from caArray, which caIntegrator maps by sample IDs to the patient IDs in the subject annotation data. That process is covered in this section. Alternatively, you can load imaging files from NBIA, also mapped by IDs to the patient data, as covered in Working with Imaging Data. You can also edit genomic data information that you have already added to the study. Genomic sample data and imaging data are independent of each other, so neither is required before loading the other.
It is essential that you are well acquainted with the data you are working with: the subject annotation data and the corresponding array data in caArray.
caIntegrator supports a limited number of array platforms. For more information, see Managing Platforms.
To add genomic data to your caIntegrator study, follow these steps:

This opens the Edit Genomic Data Source dialog box. Enter the appropriate information in the fields. These fields are described below.

caIntegrator connects to caArray, validates the information you have entered, finds the experiment, and retrieves all the sample IDs in the experiment. Once this finishes, the experiment information displays on the Edit Study page under the Genomic Data Sources section.




Mapping Genomic Data to Subject Annotation Data
Because the goal of caIntegrator is to integrate data from subject annotation, genomic, and imaging data sources, data from uploaded source files must be mapped to each other. Mapping files can map to caArray genomic data of two types: data that is "imported and parsed" and data stored in supplemental files.



Creating a Mapping File
You, as the caIntegrator study manager, must create a Subject to Sample mapping file before following the actual mapping steps. This file provides caIntegrator with the information for mapping patients to caArray samples.

The following figure shows an example multiple sample mapping file in CSV format.

The steps described in Steps for Mapping Genomic Data use data of either type.
Steps for Mapping Genomic Data
To map the samples from the caArray experiment to the patients in the subject annotation data you uploaded, follow these steps:

If the caArray data you have identified is imported and parsed, when you click the Map Samples button, the mapping takes place as the data is uploaded into caIntegrator. If the caArray data is supplemental, the mapping does not occur until the study is deployed.
Mapped samples will be listed in the Samples Mapped to Subjects section. Unmapped samples show at the top of the caIntegrator page. They were loaded from caArray, but they are not in the mapping file. These are not used for integration.
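The distinction between mapped and unmapped samples can be sketched as a simple lookup. The following Python example is a hedged illustration, not caIntegrator code: the function name partition_samples is hypothetical, and it assumes the mapping file has already been read into (subject ID, sample ID) pairs. The actual column layout of the mapping file is not reproduced here.

```python
def partition_samples(caarray_samples, mapping_rows):
    """Split caArray sample IDs into mapped and unmapped groups.

    caarray_samples: sample IDs retrieved from the caArray experiment.
    mapping_rows: (subject_id, sample_id) pairs from the mapping file
                  (assumed layout, for illustration only).
    """
    sample_to_subject = {sample: subject for subject, sample in mapping_rows}
    # Samples named in the mapping file are linked to their subject.
    mapped = {
        sample: sample_to_subject[sample]
        for sample in caarray_samples
        if sample in sample_to_subject
    }
    # Samples loaded from caArray but absent from the mapping file.
    unmapped = [s for s in caarray_samples if s not in sample_to_subject]
    return mapped, unmapped


mapped, unmapped = partition_samples(
    ["S1", "S2", "S3"],
    [("P1", "S1"), ("P2", "S3")],
)
print(mapped)    # prints {'S1': 'P1', 'S3': 'P2'}
print(unmapped)  # prints ['S2']
```

In this example, S2 corresponds to a sample that would appear in the unmapped list and would not be used for integration.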




Uploading Control Samples
A Control Samples file is used to calculate fold change data, which compares "tumor" sample gene expression in the caArray experiment to the control samples to identify genes that are up-regulated or down-regulated. Control samples can be the "normal" samples, but that is not necessarily the case.
To upload the control samples, follow these steps:

The control samples now display toward the bottom of the page.
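As background for why control samples matter, the following Python sketch shows one common way a fold change can be computed for a single gene. It is an assumption-laden illustration only: this guide does not state the exact formula caIntegrator uses, and the ratio-to-mean-of-controls calculation, the function names, and the default threshold of 2.0 are all hypothetical choices made for the example.

```python
from statistics import mean


def fold_change(tumor_value, control_values):
    """Ratio of a tumor expression value to the mean control value.

    Illustrative formula only; not confirmed as caIntegrator's method.
    """
    baseline = mean(control_values)
    if baseline == 0:
        raise ValueError("mean control expression is zero")
    return tumor_value / baseline


def regulation(fc, threshold=2.0):
    """Classify a fold change as 'up', 'down', or 'unchanged'.

    The threshold is a hypothetical example value.
    """
    if fc >= threshold:
        return "up"
    if fc <= 1 / threshold:
        return "down"
    return "unchanged"


fc = fold_change(8.0, [2.0, 2.0])
print(fc, regulation(fc))  # prints 4.0 up
```

Whatever the exact calculation, the result depends on which samples are designated as controls, so the Control Samples file should be chosen with care.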




Configuring Copy Number Data
You can add copy number data for a genomic data source by uploading a mapping file. This allows you to configure the parameters used when segmentation data is configured.
The name specified in the third column of the mapping file is specific for each array manufacturer as follows:

To add copy number data relating to the genomic data you are adding, follow these steps:

The Edit Copy Number page opens.



Remapping Copy Number Data in a Deployed Study
Occasionally you may need to remap copy number data in a deployed study. To do so, follow these steps:

Working with Imaging Data
Once you have loaded subject annotation data and identified patient IDs, you can add either array genomic sample data from caArray, which caIntegrator maps by sample IDs to the patient IDs in the subject annotation data, or image data from NBIA, also mapped by IDs to the subject data. Once you have configured an NBIA image data source for adding images, you can import image annotation data for the images. Genomic sample data and imaging data are independent of each other, so neither is required before loading the other.
It is essential that you are well acquainted with the data you are working with: the subject annotation data and the corresponding imaging data in NBIA.

Adding or Editing Imaging Data Files from NBIA
To add images from NBIA to the study you are creating, follow these steps:

The imaging data displays on the Edit Study page under the Imaging Data Sources section.

Adding or Editing Image Annotations
After you have configured an image data source with an NBIA Grid service and uploaded the image data, as described in Adding or Editing Imaging Data Files from NBIA, you can load image annotations into caIntegrator from a file in CSV format or through an Annotations and Image Markup (AIM) service.

To add image annotations from a file, follow these steps:

To load image annotations through an AIM service, follow these steps:

Using either method, the image annotations are uploaded to caIntegrator. Afterward, when you click the Edit Annotations button, the system opens the Define Fields for Imaging Data page, where you can edit the annotations. For more information, see Define Fields Page for Editing Annotations. You must assign identifiers and annotations to the data in the same way you did with the subject annotation data. For more information, see Assigning An Identifier or Annotation.
Adding External Links
This feature on the Edit Study page, described in Creating/Editing a Study, allows you to configure a CSV file with URLs to be used as external links relevant to a study. This allows you to easily share or configure references.
To add an external link, follow these steps:

Once you have created external links for a study, when the study is open, an External Links section showing the link(s) displays on the left sidebar of the page.

Click the link to open a page that displays appropriately formatted web page links.

Deploying the Study
When you are ready to deploy the study, click the Deploy Study button on the Edit Study page. caIntegrator retrieves the selected data from the data service(s) you defined and makes the study available to a study manager or to anyone else who may want to analyze the study's data. Using the Manage Studies feature, you can then configure and share data queries and data lists with all investigators who access the study.
Note that you can continue to work in caIntegrator while the study is being deployed.
Managing a Study

Once you have started to create a study or have deployed it, you can update an existing study in the following ways:

To update, edit or delete a study, follow these steps:

All of the "in process" or "completed" studies display on this page, with associated metadata. Note that whoever edited or updated the study last is shown in the Last Modified Column, indicated as the Study Manager.

On this page you can edit study details, such as adding or deleting files or changing survival values. For information about working with the Edit Study feature, see Creating/Editing a Study.

Managing Platforms
caIntegrator supports a limited number of array platforms, all of which originate from Agilent or Affymetrix. While they do not represent all of the platforms supported by caArray, caIntegrator must have array definitions loaded for the platforms it supports, and be able to properly load the data from caArray and parse it.
You can create a study without genomic data, but you cannot add genomic data to a caIntegrator study without a corresponding supported array platform. If you add more than one set of genomic data to the study, you can specify more than one platform for the study.
On the Manage Platforms page, you can identify, add or remove supported platforms.
To manage platforms in caIntegrator, follow these steps:

The Manage Platforms page that opens lists the platforms caIntegrator currently supports, that is, those that the system can pull from caArray. You can also add a new platform by entering information in the fields in the Create a New Platform section.

Depending on the Platform Type you select, there may be other parameters to provide here as well, such as Platform Channel Type for an Agilent platform.

Platform deployment can be time-consuming. If the platform takes more than 12 hours to deploy, caIntegrator displays a "timed out" message. At that point, you can delete the platform, even if it has not loaded to the system.