Question: How can I build caArray from the source code in the NCI repository?

Topic: caArray Installation and Upgrade

Release: caArray v2.4.1.1 and higher

Date entered: 01/17/2012

Answer

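Below is a step-by-step illustrated tutorial on how to build the application from source. It covers the prerequisites, checking out the source code, creating the database schema, configuring the properties files, running the build, launching the server, and logging in.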

The tutorial is designed for use in a Windows environment, but can easily be adapted to work in Linux as well.

This tutorial is based on the readme.txt file in the NCI caArray source code repository at https://ncisvn.nci.nih.gov/svn/caarray2/tags/2.4.1.1/.

Prerequisites

1) Before proceeding to check out the code, ensure that the following are already installed on your machine:

The build process described in this tutorial only works with version 1.5 of the JDK; it will not work with any other versions, earlier or later.

2) Once all the above are installed, ensure they are configured as follows:

C:\Program Files\Java\jdk1.5.0_22\bin
C:\ant
C:\Program Files\CollabNet Subversion Client
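The prerequisite check can also be scripted. The following is a minimal sketch, assuming Python 3 is available on the build machine (Python is not part of the caArray toolchain, and the function names are made up for illustration). It looks for the java, ant, and svn commands on the PATH and checks that the JDK reports version 1.5.

```python
import re
import shutil
import subprocess

# Command-line tools this tutorial invokes; adjust if yours are named differently.
REQUIRED_TOOLS = ("java", "ant", "svn")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


def is_jdk_15(version_output):
    """Return True if `java -version` output reports a 1.5.x JDK."""
    match = re.search(r'version "(\d+)\.(\d+)', version_output)
    return bool(match) and match.group(1) == "1" and match.group(2) == "5"


def check_prerequisites():
    """Print a short report and return True if everything looks usable."""
    missing = missing_tools()
    if missing:
        print("Not on PATH:", ", ".join(missing))
        return False
    # `java -version` writes its banner to stderr, so inspect both streams.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    ok = is_jdk_15(result.stderr + result.stdout)
    print("JDK 1.5 detected" if ok else "The JDK on the PATH is not version 1.5")
    return ok
```

Calling check_prerequisites() before the checkout gives an early warning about a wrong JDK, which otherwise only shows up as a failure partway through the build.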

Checking Out The Source Code From The NCI Repository

To begin checking out the caArray source code from the NCI repository, first create a new folder (e.g., C:\source), then open a command-line window, navigate to that folder, and enter the following:

svn checkout https://ncisvn.nci.nih.gov/svn/caarray2/tags/2.4.1.1/

The source files will then begin downloading from the repository server into the folder you created. Depending on the speed of your Internet connection, it may take over half an hour for the checkout to complete, as the source contains thousands of files distributed over hundreds of folders.

You'll know the checkout is complete when the command-line window shows the message 'Checked out revision x', where x is the revision number, as shown in the screenshot below:

Screenshot of CollabNet command-line window with confirmation message that checkout process has completed

Manually Creating The Requisite Database Schema

It is possible to have Ant automatically generate the empty database schema required for the caArray installation. However, creating the schema manually via the MySQL command-line client is preferred.

Before doing so, choose a name for the database caArray will use to store genomic data, as well as a username and password for the user who will be granted access to that database. In this example, the database name is db1, the username is user1, and the password is password1.

Now, log into the client using the root password you set when installing the database server, then enter the following SQL commands line-by-line, substituting your own database name, username, and password as needed:

CREATE DATABASE db1 DEFAULT CHARACTER SET latin1;
GRANT ALL ON db1.* TO 'user1'@'localhost' IDENTIFIED BY 'password1' WITH GRANT OPTION;
GRANT ALL ON db1.* TO 'user1'@'%' IDENTIFIED BY 'password1' WITH GRANT OPTION;

The single quotes around the username and password in the code snippet above are required; they are part of the MySQL syntax.

The client will respond with a confirmation that the issued queries were successful, as shown in the screenshot below:

Screenshot of MySQL command-line client window with confirmation message that queries issued to create database schema were successful
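If you set up several environments, substituting the names by hand can be error-prone. The sketch below, assuming Python 3 and a hypothetical helper named schema_sql, fills the three statements from this section with your own database name, username, and password. It only produces text to paste into the MySQL client and, to stay simple, rejects values containing quotes, backticks, backslashes, or semicolons rather than attempting to escape them.

```python
def schema_sql(database, username, password):
    """Return the three statements from this tutorial with the given names filled in."""
    for value in (database, username, password):
        # Keep the sketch simple: refuse anything that would need escaping.
        if not value or any(ch in value for ch in "'`\\;"):
            raise ValueError("unsupported value: %r" % (value,))
    statements = [f"CREATE DATABASE {database} DEFAULT CHARACTER SET latin1;"]
    # One grant for local connections and one for connections from any host.
    for host in ("localhost", "%"):
        statements.append(
            f"GRANT ALL ON {database}.* TO '{username}'@'{host}' "
            f"IDENTIFIED BY '{password}' WITH GRANT OPTION;"
        )
    return statements
```

For example, print("\n".join(schema_sql("db1", "user1", "password1"))) reproduces the statements shown above.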

Configuring The 'Properties' Settings Files

The caArray installation settings are mainly specified by a source file, install.properties, whose path is:

$CAARRAY_HOME/software/master_build/install.properties

where $CAARRAY_HOME represents the root folder of the checked-out caArray source (C:\source\2.4.1.1 in this example, because the checkout command above creates a 2.4.1.1 subfolder inside C:\source).

A second file, local.properties, specifies additional settings. This file is not included in the repository -- you must create it yourself by opening a plain text editor and saving a blank file with the filename local.properties to the following path:

$CAARRAY_HOME/software/build/local.properties

These files must be customized with the settings specific to your local environment before starting the build process. These settings include:

The install.properties file contains several other properties aside from the ones listed above, but these others do not need to be customized and can be left at their default values.

The blank local.properties file you created previously must be populated with all the properties listed above and their respective values, plus an additional field, jboss.home, which specifies the name of the subfolder within the root installation folder to which the JBoss server will be installed. The value of the jboss.home property is derived from the value of the application.base.path property in install.properties, with the text '\jboss-4.0.5.GA' appended. For example, if the value of the application.base.path property is C:\caArray, then the corresponding value of the jboss.home property would be C:\caArray\jboss-4.0.5.GA.

Refer to the screenshot below for an example of how the local.properties file should be configured:

Screenshot of example local.properties file showing various configuration settings for caArray installation
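The rule for deriving jboss.home can be expressed in a few lines. This is a sketch only, assuming Python 3; the helper names are hypothetical, and the settings dictionary stands for whichever install.properties values you customized. Values are written verbatim, with no escaping applied.

```python
JBOSS_FOLDER = "jboss-4.0.5.GA"


def derive_jboss_home(application_base_path):
    """Append the JBoss folder name to the installation root."""
    return application_base_path.rstrip("\\/") + "\\" + JBOSS_FOLDER


def local_properties_lines(install_settings):
    """Copy the customized settings and add the derived jboss.home entry."""
    lines = [f"{key}={value}" for key, value in install_settings.items()]
    base_path = install_settings["application.base.path"]
    lines.append("jboss.home=" + derive_jboss_home(base_path))
    return lines
```

With application.base.path set to C:\caArray, the last line produced is jboss.home=C:\caArray\jboss-4.0.5.GA, matching the example above.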

The install.properties file contains additional properties which pertain to grid services. These are not covered here, as configuring caArray to utilize a grid service is beyond the scope of this tutorial. For more information, please refer to the caArray FAQ and in-depth articles.

Invoking The Build Process From The Command Line

Now that you've checked out the source code, created the database schema, and configured the properties files, you're ready to start the actual build process. First, open a command-line window and navigate to the following path:

$CAARRAY_HOME/software/master_build

Then, enter the following command:

ant -Dproperties.file=<absolute path to install.properties file> deploy:local:install

For example, if the path to the install.properties file is:

C:\source\2.4.1.1\software\master_build\install.properties

Then the command is:

ant -Dproperties.file=C:\source\2.4.1.1\software\master_build\install.properties deploy:local:install
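Because the build can run for a long time, it may help to capture its output in a file for later review. The following sketch assumes Python 3; run_build, its parameters, and the log file are illustrative choices, not part of the caArray build. It assembles the same Ant command and redirects all output to a log.

```python
import shutil
import subprocess


def ant_command(properties_file, ant="ant"):
    """Build the argument list for the command shown in this tutorial."""
    return [ant, f"-Dproperties.file={properties_file}", "deploy:local:install"]


def run_build(properties_file, master_build_dir, log_path):
    """Run the build from the master_build folder and save its output."""
    # Resolve the launcher first; on Windows this finds ant.bat via PATHEXT.
    ant = shutil.which("ant")
    if ant is None:
        raise FileNotFoundError("ant was not found on the PATH")
    with open(log_path, "w") as log:
        completed = subprocess.run(
            ant_command(properties_file, ant),
            cwd=master_build_dir,
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return completed.returncode
```

A call might look like run_build(r"C:\source\2.4.1.1\software\master_build\install.properties", r"C:\source\2.4.1.1\software\master_build", "build.log").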

The amount of time needed for the build process to complete can vary anywhere from under 30 minutes to several hours depending on your hardware configuration. The screenshot below shows the command line window after the build process has completed successfully, with the 'BUILD SUCCESSFUL' message at the bottom.

Screenshot of Ant command-line window showing confirmation message that build process has completed successfully

The time it takes for the Ant build process to complete also depends on the speed of your Internet connection, as Ant retrieves all the JAR dependencies from the NCI Ivy repository during the build. This retrieval is very bandwidth-intensive, so ideally the build should be run on a connection with at least 100 Mbps throughput.

Even if the command line window shows a 'BUILD SUCCESSFUL' message at the end, the build may not have completed successfully. The build process launches several sub-processes, each of which must succeed for the entire build to succeed. If any one of these sub-processes fails, it displays its own 'BUILD FAILED' error message, but the overall build process may still display the 'BUILD SUCCESSFUL' message at the end. The screenshot below illustrates such a case, in which a sub-process failed because an incorrect version of the JDK was installed; note the indented 'BUILD FAILED' error message (highlighted in red) several lines above the 'BUILD SUCCESSFUL' message (highlighted in yellow) at the bottom.

Screenshot of Ant command-line window showing how build process can fail even when confirmation message indicates that build process has completed successfully
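A buried 'BUILD FAILED' line is easy to miss when scrolling through a long console window. If you saved the build output to a text file, a small check such as the sketch below can flag it. This assumes Python 3; the function name and the three result labels are illustrative.

```python
def build_outcome(log_text):
    """Classify Ant output as 'failed', 'succeeded', or 'unknown'."""
    # A failure anywhere wins, even if the last message reports success,
    # because a sub-process can fail while the outer build carries on.
    if "BUILD FAILED" in log_text:
        return "failed"
    if "BUILD SUCCESSFUL" in log_text:
        return "succeeded"
    return "unknown"
```

The check is deliberately strict: any occurrence of 'BUILD FAILED' marks the whole build as failed, which matches the situation described above.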

The caArray application is now installed in the path you specified in the install.properties file via the application.base.path property. In our example, the installation path is C:\temp, whose contents are displayed in the screenshot below:

Screenshot of Windows Explorer window showing contents of newly installed caArray application in the path specified in install.properties

If the contents of your installation path do not match those shown above, it is likely that your build process failed.

Launching The caArray Server Upon Build Completion

To launch the caArray server, navigate to the following path:

$INSTALLATION_HOME\jboss-4.0.5.GA\bin

where $INSTALLATION_HOME is the installation path you specified in install.properties, and run the file run.bat. A command line window will open and show the server's startup progress. Startup is not instant and may take anywhere from 20 to 120 seconds, after which the command line will display a message indicating that the JBoss MX Microkernel has started, as shown at the bottom of the screenshot below:

Screenshot of command-line window showing confirmation message that the caArray JBoss server has been started after executing the run.bat file

Do not attempt to access the application until you see the message stating that the JBoss kernel has started.

Accessing The caArray Login Page Via Your Web Browser

To access the caArray login page, open up a new tab in your Web browser and enter the following URL in the address bar:

http://localhost:38080/caarray/

The login page will now load as shown in the screenshot below from a Mozilla Firefox tab:

Screenshot of caArray login page in Mozilla Firefox Web browser after caArray application URL is entered in address bar
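If you would rather script the wait than watch the console, one option is to poll the login URL until it answers. This is a sketch under assumptions, not part of caArray: it uses Python 3's standard library, the function names are hypothetical, and the default of 24 attempts five seconds apart is simply chosen to cover the startup time of up to 120 seconds mentioned earlier. The console message remains the authoritative signal that the server has started.

```python
import time
import urllib.error
import urllib.request


def url_responds(url, timeout=5):
    """Return True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error: not ready yet.
        return False


def wait_for_server(url, attempts=24, delay=5, probe=url_responds, sleep=time.sleep):
    """Probe the URL repeatedly; return True as soon as it responds."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For instance, wait_for_server("http://localhost:38080/caarray/") returns True once the login page is being served, or False if it never responds within the allotted attempts.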

The default installation of caArray comes with several pre-configured user accounts, any of which can be used to log into the application. This tutorial uses the 'caarrayadmin' account, whose password is 'caArray2!'. Enter this username and password into their respective fields in the caArray Login panel at the right of the login page, then click the 'Login' button beneath them, as shown in the screenshot below:

Screenshot of caArray login panel from login page showing example username and password entered into their respective fields

You will now be directed to the caArray homepage, which shows the 'My Experiment Workspace', as shown in the screenshot below. Congratulations on successfully building and logging into caArray!

Screenshot of caArray homepage showing My Experiment Workspace panel after user successfully logged into application

caArray comes with a User Provisioning Tool (UPT) that allows you to modify the built-in accounts, create additional accounts, and assign them varying privileges and access levels. For more information on how to install and use UPT, please refer to caCORE_CSM_v421_ProgrammersGuide.pdf.
