Step by Step Guide to Install caArray 2.x Grid Service (caArray 2.2 and Under)

Topic: caArray Installation and Upgrade

Release: Up to caArray 2.2

Date entered: 01/26/2009


Three components are installed during caArray's installation process: UPT, caArray, and the caArray Grid Service (refer to caArray 001 - UPT required for caArray or other application tools in LSD Bundle). UPT is a mandatory application used for caArray's user access control. The caGrid Service is used to advertise your caArray application on the NCI caGrid Portal.

The caGrid Portal provides a visual display of the services running on the caGrid infrastructure and of the institutions participating in the caBIG program. On the portal you can find caBIG services, participants, and points of contact. You can also query caGrid data services, share your queries, and search for the queries of other community members.

Advertising your caGrid Service on the caGrid Portal is optional. You may choose to participate at a later time, or not at all.

The official caGrid Service configuration steps can be found in the caArray Installation Guide (page 23) (login required). This knowledge base entry offers a supplemental step-by-step installation guide based on our hands-on experience at Columbia University.


The caArray 2.x application must be running before you activate your caGrid service on the caGrid Portal. However, the grid service is not required for the caArray application to run.

Abbreviations Used

The following abbreviations are used throughout the text:




<installation_directory>

The location where you keep the downloaded files.
Example: /app_data1/caArray2.1

<application_base_path> = ${application.base.path}

The location where you are going to install caArray and caGrid service.
Example: /app_data1/caarray_app/caarray

<grid.home> = ${application.base.path}

The location for the caGrid service files, see step 2
It is predefined in file.

Step 1. Make appropriate changes to the property file

File Location: <installation_directory>/

This step configures the property file before deployment of the caArray Grid service begins. The caGrid service and caArray share the same property file.

It is important to specify the following properties:




The location where you are going to install caArray (i.e. where the caGrid service and the caArray application are deployed).
It needs to be created before the installation starts.
Important: <application_base_path> must be different from <installation_directory> or the installation will fail.

grid.index URL =

NCI caGrid service's production webserver name

Our private DNS name where the caArray is installed.


We use localhost here because our caGrid service is on the same machine as caArray (a default configuration).
The caGrid service's hostname must be used if it is installed on a different machine from caArray application.


Our publicly accessible port to DMZ Zone
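
The requirement that <application_base_path> exists and differs from <installation_directory> can be checked before the installation is started. The following is a minimal sketch and not part of the caArray installer: the function name check_paths is hypothetical, and the paths in the usage note are the example values given earlier on this page.

```shell
#!/bin/sh
# check_paths INSTALL_DIR APP_BASE   (hypothetical helper, not shipped with caArray)
# Succeeds only if both directories exist and resolve to different locations.
check_paths() {
    install_dir=$1
    app_base=$2
    if [ ! -d "$install_dir" ]; then
        echo "installation directory does not exist: $install_dir" >&2
        return 1
    fi
    if [ ! -d "$app_base" ]; then
        echo "application base path does not exist: $app_base" >&2
        return 1
    fi
    # Compare physical paths so symlinks or trailing slashes cannot hide a clash.
    if [ "$(cd "$install_dir" && pwd -P)" = "$(cd "$app_base" && pwd -P)" ]; then
        echo "application base path must differ from the installation directory" >&2
        return 1
    fi
    return 0
}
```

For example, with the paths used on this page: check_paths /app_data1/caArray2.1 /app_data1/caarray_app/caarray && echo "paths look fine"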

Step 2. Run ant to compile source code

File location: <installation_directory>/ant

During compilation, Java class files will be deployed into the following two directories, as defined in file:




For caGrid service (created during the deployment)


For JBoss service (created during the deployment)

Step 3. Make appropriate changes to web.xml file

File Location: <grid.home>/server/default/deploy/wsrf.war/WEB-INF/web.xml


The web.xml file contains the port and protocol under which your grid service will be advertised. The port defined here should be the same publicly accessible port that is defined in your property file (grid.static.grid.port in Step 1). In our case it is port 80.

[Figure: caArray Application and Port Forwarding, as explained in the text that follows]

Our caGrid service server physically resides on an internal server behind the firewall. This address, however, is not reachable from outside Columbia University's network. Our publicly accessible address (a logical hostname) is:

Historical link

Since our caGrid service is actually running on the internal server's port 18080, a proxy server will forward any incoming request from the publicly accessible port 80 to port 18080 on the internal host.

The port used in web.xml should be the port from the publicly accessible address, not the port on the internal server.
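
One way to confirm the forwarding described above is to request the same path through both addresses and compare the HTTP status codes. This is only a sketch under assumptions: the two host names are placeholders, the /wsrf/services path is assumed from the wsrf.war deployment name, and the helper functions are hypothetical. Only the ports 80 and 18080 come from the setup described here.

```shell
#!/bin/sh
# Placeholder hosts: replace with your own values (assumptions, not from this page).
PUBLIC_HOST="caarray.example.org"      # publicly accessible logical hostname
INTERNAL_HOST="internal.example.org"   # internal server behind the firewall

# base_url HOST PORT: build an http URL, leaving out the default port 80.
base_url() {
    if [ "$2" = "80" ]; then
        printf 'http://%s\n' "$1"
    else
        printf 'http://%s:%s\n' "$1" "$2"
    fi
}

# http_status URL: print the HTTP status code (curl prints 000 if unreachable).
http_status() {
    curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Run these by hand from a machine that can reach the respective address:
#   http_status "$(base_url "$INTERNAL_HOST" 18080)/wsrf/services"
#   http_status "$(base_url "$PUBLIC_HOST" 80)/wsrf/services"
```

If the proxy forwards correctly, both requests should report the same status code. The URL built for the public address, without 18080, is the form that corresponds to the port configured in web.xml.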

Step 4. Make appropriate changes to server-config.wsdd

File Location: <grid.home>/server/default/deploy/wsrf.war/WEB-INF/etc/globus_wsrf_core/server-config.wsdd

This step adds your caGrid service's publicly accessible DNS name to the configuration file.

 	<parameter name="logicalHost" value=""/>
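
After editing, it can be worth confirming that the logicalHost value is actually set. The sketch below uses a plain text match with sed; the function name logical_host is hypothetical, and because it does not parse XML it will not notice if the parameter sits inside an XML comment.

```shell
#!/bin/sh
# logical_host FILE   (hypothetical helper)
# Print the value of the first logicalHost parameter found in FILE.
# Fails if the parameter is missing or its value is empty.
logical_host() {
    value=$(sed -n 's/.*<parameter name="logicalHost" value="\([^"]*\)".*/\1/p' "$1" | head -n 1)
    if [ -z "$value" ]; then
        echo "logicalHost is missing or empty in $1" >&2
        return 1
    fi
    printf '%s\n' "$value"
}
```

Run it against the server-config.wsdd file from this step; the printed value should be your publicly accessible DNS name.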

Step 5. Make appropriate changes to CaArraySvc_registration.xml

File Location: <grid.home>/server/default/deploy/wsrf.war/WEB-INF/etc/cagrid_CaArraySvc/CaArraySvc_registration.xml

This file has the URL to the Index Server where you can see if your service is advertised. Make sure it contains the following:


Step 6. Make appropriate changes to serviceMetadata.xml

File Location: <grid.home>/server/default/deploy/wsrf.war/WEB-INF/etc/cagrid_CaArraySvc/serviceMetadata.xml

This file contains the contact information for your caGrid service. Your caGrid Service will not be recognized in the caGrid Portal without the proper identification defined here.

The following is at the top of the file.

  <ns3:PointOfContact affiliation="Columbia University Medical Center" email="my_contact@my.organization"
  firstName="My FirstName" lastName="My LastName" phoneNumber="My Phone" role="DBA"/>

The following is at the bottom of the file.

  <ns14:ResearchCenter displayName="Columbia University Medical Center" shortName="CUMC">
  <ns14:Address country="US" locality="New York" postalCode="10032" stateProvince="NY" street1="My Street1." street2=""/>
  <ns14:PointOfContact affiliation="Columbia University Medical Center" email="my_contact@my.organization"
  firstName="My FirstName" lastName="My LastName" phoneNumber="My Phone" role="DBA"/>
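
Because the portal relies on this contact information, a quick scan for values that were never filled in can catch an incomplete edit. The sketch below assumes your copy of the file started from template-style values like the anonymized ones shown above ("My FirstName", "my_contact@my.organization"); the function name find_placeholders is hypothetical and the pattern should be adjusted to whatever your template contains.

```shell
#!/bin/sh
# find_placeholders FILE   (hypothetical helper)
# List lines whose attribute values still look like template text.
# Exit status follows grep: 0 if placeholders were found, 1 if none were found.
find_placeholders() {
    grep -n -E '="(My [^"]*|my_contact@my\.organization)"' "$1"
}
```

For example: if find_placeholders serviceMetadata.xml; then echo "replace the values listed above"; fi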

Step 7. Restart JBoss server for grid service

 	cd ${application.base.path}/jboss-4.0.4.GA/bin
 	nohup ./ &

Step 8. Test the Installed caGrid Service

The four checks below are the methods we used to test whether our newly installed caGrid Service (caarraygrid) works properly:




Is caarraygrid registered in the NCI Index?

  • export GLOBUS_LOCATION=/app_data1/ws-core-4.0.3
  • cd /app_data1/ws-core-4.0.3/bin
  • ./wsrf-query -a -z none -s | grep 'caarraygrid'

The following information should be returned:
<ns8:Address xmlns:ns8="">


Does our registration match the ServiceMetadata?

  • ./wsrf-get-property -a -z none -s {gme://caGrid.caBIG/1.0/gov.nih.nci.cagrid.metadata}ServiceMetadata

It should return our contact information, such as:
<ns1:ServiceMetadata xmlns:ns1="gme://caGrid.caBIG/1.0/gov.nih.nci.cagrid.metadata">
<ns2:Service description="" name="CaArraySvc" version="1.1"
<ns3:PointOfContact affiliation="Columbia University Medical Center" ....

Is any data returned for the DomainModel?

  • ./wsrf-get-property -a -z none -s {gme://caGrid.caBIG/1.0/gov.nih.nci.cagrid.metadata.dataservice}DomainModel

It should return some of our data from caArray, such as:
<ns1:DomainModel projectDescription="Version 2.0 caArray Model" projectLongName="caArray"
projectShortName="caArray" projectVersion="2"
<ns1:UMLAssociation bidirectional="false">
<ns1:UMLAssociationEdge maxCardinality="1" minCardinality="0" roleName="annotation">
<ns1:UMLClassReference refid="5423786E-6C04-59EA-E044-0003BA3F9857"/> ....

Can I see my caGrid Service online?

  • Go to the caGrid Portal. Click the "SERVICES" tab -> "Search" tab -> set 'Keyword' to "caarraygrid" and 'Search Fields' to "URL" -> "Search" button

[Screenshot: Checking that caGrid is online]
