h2. Overview

caIntegrator provides the capability to perform level 3 copy number analysis using either caDNAcopy or caCGHcall.  Both of these tools are part of the Bioconductor suite and are implemented as caGrid services wrapping the R Bioconductor code.  One or both of these tools can be installed either on a separate server from caIntegrator or on the same server.  Be aware that the R processing associated with caDNAcopy and/or caCGHcall can be very CPU and memory intensive.  Also, the current installation instructions only support installation on Linux platforms.  Recommended configurations are given at the bottom.

h2. Prerequisites

The following software packages must be installed prior to beginning the instructions:
* Java Development Kit (JDK) 1.6.x
** Ensure that JAVA_HOME is defined, preferably in the user or system .profile
** Ensure that $JAVA_HOME/bin is in PATH

* Ant 1.7.x
** Ensure that ANT_HOME is defined, preferably in the user or system .profile
** Ensure that $ANT_HOME/bin is in PATH

* Tomcat 5.5.x
** These instructions assume Tomcat is configured to run on port 8080.  If you are using another port, substitute it in the later instructions.
** Ensure that CATALINA_HOME is set
** Tomcat may be autostarted or started as part of the Bioconductor launch script (provided later in these instructions)
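
The environment checks above can be scripted before continuing.  The following is a minimal sketch, not part of the official installation: the function names check_home_var and check_prerequisites are hypothetical, and the script only verifies that each variable is set and points at an existing directory (it does not check versions or PATH).

```shell
#!/bin/bash
# Report whether a required variable is set and points at an existing directory.
# Usage: check_home_var NAME   (hypothetical helper, not part of any package)
check_home_var() {
    name="$1"
    # Look up the value of the variable whose name is stored in $name.
    # eval is acceptable here only because the names are fixed by this script.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "MISSING: $name is not set"
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "INVALID: $name=$value is not a directory"
        return 1
    fi
    echo "OK: $name=$value"
    return 0
}

# Check every variable listed in the prerequisites and report all problems,
# not just the first one.  Returns non-zero if any check failed.
check_prerequisites() {
    status=0
    for var in JAVA_HOME ANT_HOME CATALINA_HOME; do
        check_home_var "$var" || status=1
    done
    return $status
}
```

After sourcing the file, run check_prerequisites; a non-zero exit status means at least one variable needs attention.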

h2. Install Dependent Packages

The caIntegrator Bioconductor installation also depends on several other packages that may not be installed on your system.  Summaries of these packages and how to install them follow.

h3. Globus 4.0.3

Installed at /usr/local/ws-core-4.0.3
* Download from [http://www-unix.globus.org/ftppub/gt4/4.0/4.0.3/ws-core/bin/ws-core-4.0.3-bin.tar.gz|http://www-unix.globus.org/ftppub/gt4/4.0/4.0.3/ws-core/bin/ws-core-4.0.3-bin.tar.gz]
* Unpack tar file into /usr/local/ws-core-4.0.3
* Ensure that GLOBUS_LOCATION is set to /usr/local/ws-core-4.0.3

h3. Apache Axis 1.4

Installs files into $CATALINA_HOME
* Download from [http://apache.cs.utah.edu/ws/axis/1_4/axis-bin-1_4.tar.gz]
* Unpack tar file in home directory
* Move axis-1_4/webapps/axis directory to $CATALINA_HOME/webapps
* Stop and restart Tomcat ($CATALINA_HOME/bin/shutdown.sh; $CATALINA_HOME/bin/startup.sh)
* You can test Axis installation at [http://YOUR-HOSTNAME:8080/axis/happyaxis.jsp]

h3. ActiveMQ 4.0.2

Installed at /usr/local/incubator-activemq-4.0.2
* Download from [http://people.apache.org/repository/incubator-activemq/distributions/incubator-activemq-4.0.2.tar.gz]
* Unpack tar file into /usr/local/incubator-activemq-4.0.2
* Ensure that JMS_HOME is set to /usr/local/incubator-activemq-4.0.2
* Edit file $JMS_HOME/conf/activemq.xml
** Change line that reads "<broker useJmx="true">" to "<broker useJmx="true" persistent="false">"
** Replace line that reads "<transportConnector name="default" uri="tcp://localhost:61616" discoveryUri="multicast://default"/>" with " <transportConnector name="default" uri="tcp://localhost:61616"/>"
** Comment out or remove line that reads "<networkConnector name="default" uri="multicast://default"/>"
* Make main binary executable
** chmod 755 $JMS_HOME/bin/activemq
* ActiveMQ may be autostarted or started as part of the Bioconductor launch script (provided later in these instructions)
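
The three activemq.xml edits above can also be applied with sed.  This is a sketch only: the function name patch_activemq_xml is hypothetical, and it assumes the lines in your activemq.xml match the text quoted above exactly (including attribute order and spacing).  Review the result by hand before starting ActiveMQ.

```shell
#!/bin/bash
# Apply the three activemq.xml edits described above.
# Usage: patch_activemq_xml /path/to/activemq.xml   (hypothetical helper)
patch_activemq_xml() {
    conf="$1"
    # Refuse to run twice: a second pass would nest the XML comment.
    if [ -f "$conf.orig" ]; then
        echo "Backup $conf.orig already exists; not patching again"
        return 0
    fi
    cp "$conf" "$conf.orig" || return 1
    # 1. Disable persistence on the broker element.
    # 2. Drop the multicast discoveryUri attribute from the transport connector.
    # 3. Comment out the multicast network connector.
    sed \
        -e 's|<broker useJmx="true">|<broker useJmx="true" persistent="false">|' \
        -e 's| discoveryUri="multicast://default"||' \
        -e 's|\(<networkConnector name="default" uri="multicast://default"/>\)|<!-- \1 -->|' \
        "$conf.orig" > "$conf"
}

# Example call, assuming JMS_HOME is set as described above:
#   patch_activemq_xml "$JMS_HOME/conf/activemq.xml"
```

The original file is kept as activemq.xml.orig so the change can be compared with diff or reverted.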

h3. R 2.9.0

Installed at /usr/local/R-2.9.0
* Download from [http://cran.fhcrc.org/src/base/R-2/R-2.9.0.tar.gz]
* Unpack tar file in home directory
* Configure, build, and install
** cd R-2.9.0
** ./configure \--enable-R-shlib \--with-readline=no \--with-x=no
** Fix src/modules/Makefile as follows (fix is from [http://tolstoy.newcastle.edu.au/R/e6/devel/09/04/1434.html])
*** Change lines that read:
{code}
for d in "$(R_MODULES)"; do \
   (cd $${d} && $(MAKE) $@) || exit 1; \
done
{code}
*** To read:
{code}
@if test "$(R_MODULES)" != ""; then \
  for d in "$(R_MODULES)"; do \
     (cd $${d} && $(MAKE) $@) || exit 1; \
  done; \
fi
{code}
*** Be careful with whitespace: each line must start with one tab, any further indentation must use spaces, and no line may end with trailing spaces.  Use the rest of the Makefile as an example.
** make
** make install prefix=/usr/local/R-2.9.0

h3. Dependent R packages (RCurl, SJava, RWebServices, RUnit, DNAcopy)

Installs files into $R_HOME
* $R_HOME/bin/R
* At the R prompt, enter:
** source("http://www.bioconductor.org/biocLite.R")
** biocLite()
** biocLite("RCurl")
** biocLite("SJava")
** biocLite("RWebServices")
** biocLite("RUnit")
** biocLite("DNAcopy")
** q()
* Build SJava links missed by install:
** cd $R_HOME/lib64/R/library/SJava/libs
** ln \-s SJava.so libRInterpreter.so
** ln \-s SJava.so libSJava.so
* Build and test SJava/RWebServices installation (as per Bioconductor Installation Guide page 10)
** $R_HOME/bin/R
** At the R prompt, enter:
*** library(RWebServices)
*** unpackAntScript("/tmp/rservices")
*** q()
** cd /tmp/rservices
** ant recompile-sjava
** ant basic-prop
*** Look for any errors
** ant rservices-test
*** This step is optional
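
The interactive package installation above can also be run unattended by feeding R a script file.  The sketch below only writes that script; the function name write_bioc_install_script and the output path are hypothetical, and the R commands are the same ones listed above, looped over the package names.

```shell
#!/bin/bash
# Write an R script that installs the Bioconductor packages listed above.
# Usage: write_bioc_install_script OUTPUT_FILE   (hypothetical helper)
write_bioc_install_script() {
    out="$1"
    cat > "$out" <<'EOF'
source("http://www.bioconductor.org/biocLite.R")
biocLite()
for (pkg in c("RCurl", "SJava", "RWebServices", "RUnit", "DNAcopy")) {
    biocLite(pkg)
}
EOF
}

# Example use, assuming R_HOME is set as in the steps above:
#   write_bioc_install_script /tmp/install-bioc.R
#   "$R_HOME/bin/R" --no-save < /tmp/install-bioc.R
```

R requires --no-save (or --save / --vanilla) when reading commands from standard input.  The SJava symbolic links and the ant steps still need to be done afterwards as described above.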

h2. Install caDNAcopy

This section installs the caDNAcopy grid service in three pieces: base code, R web service, and caGrid service. You may skip this section if you are only using caCGHcall.

h3. caDNAcopy Base Code

Installs into $R_HOME
* cd to your home directory
* svn checkout [http://gforge.nci.nih.gov/svnroot/bioconductor]
* cd bioconductor/trunk/services/caDNAcopy/R
* $R_HOME/bin/R CMD INSTALL caDNAcopy

h3. caDNAcopy RWebService

Installs into /usr/local/bioconductor/caDNAcopy
* cd /usr/local/bioconductor
* $R_HOME/bin/R
* At the R prompt, enter:
** library(RWebServices)
** unpackAntScript("caDNAcopy")
** q()
* cd caDNAcopy
* ant map-package \-Dpkg=caDNAcopy
* ant unpack-package \-Dpkg=caDNAcopy
* ant precompile
* Optionally, test RWebService (as per Bioconductor Installation Guide page 11):
** Start up Tomcat (shutdown if already running)
*** $CATALINA_HOME/bin/startup.sh
** Start up ActiveMQ
*** $JMS_HOME/bin/activemq &
** Start up R worker for caDNAcopy
*** nohup ant start-worker &
** Run the test
*** ant local-test
*** Check for any failures in test/output.  Ignore failures that say "expected 79, got 80".

h3. caDNAcopy Grid Service

Installs into $CATALINA_HOME
* Download Bioconductor-caGrid-Services.tar.gz from [https://gforge.nci.nih.gov/docman/view.php/175/19977/Bioconductor-caGrid-Services.tar.gz]
* Unpack tar file in home directory
* Build and deploy caDNAcopy grid service
** cd caGrid/CaDNAcopy
** ant deployTomcat
* Prepare Tomcat for Grid applications
** cd $GLOBUS_LOCATION
** ant \-f share/globus_wsrf_common/tomcat/tomcat.xml deployTomcat \-Dtomcat.dir="$CATALINA_HOME"
** Copy $JMS_HOME/lib/*.jar to $CATALINA_HOME/webapps/wsrf/WEB-INF/lib
* Optionally, fix timeout value in caDNAcopy.jar to be 2 hours instead of 60 seconds
** mkdir \~/unjar; cd \~/unjar
** unzip /usr/local/tomcat-5.5.27-8080/webapps/wsrf/WEB-INF/lib/caDNAcopy.jar
** Edit org/bioconductor/rserviceJms/services/caDNAcopy/RWebServices4java.properties
*** Change to: jms.timeout=7200000
** zip \-r caDNAcopy.jar *
** cd /usr/local/tomcat-5.5.27-8080/webapps/wsrf/WEB-INF/lib
** cp \~/unjar/caDNAcopy.jar .
* Optionally, fix timeout value in R worker to be 2 hours instead of 60 seconds
** Edit $R_HOME/lib64/R/library/RWebServices/scripts/RWebServicesTuning.properties
*** Change to: jms.timeout=7200000
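
The value 7200000 is two hours expressed in milliseconds (2 x 60 x 60 x 1000).  The properties edit itself can be scripted as in the sketch below; the function name set_jms_timeout is hypothetical, and it assumes the property appears at the start of a line in the form jms.timeout=VALUE.  The jar still has to be unpacked and repacked as described above before and after editing the file inside it.

```shell
#!/bin/bash
# Two hours in milliseconds: 2 h * 60 min * 60 s * 1000 ms = 7200000.
TWO_HOURS_MS=$((2 * 60 * 60 * 1000))

# Replace the jms.timeout value in a Java properties file.
# Usage: set_jms_timeout FILE MILLISECONDS   (hypothetical helper)
set_jms_timeout() {
    props="$1"
    millis="$2"
    # Fail rather than guess if the property is not where we expect it.
    if ! grep -q '^jms\.timeout=' "$props"; then
        echo "jms.timeout not found in $props"
        return 1
    fi
    # Write to a temporary file first so a failed sed does not truncate the original.
    sed "s|^jms\.timeout=.*|jms.timeout=$millis|" "$props" > "$props.tmp" \
        && mv "$props.tmp" "$props"
}

# Example call for the R worker setting, assuming R_HOME is set:
#   set_jms_timeout "$R_HOME/lib64/R/library/RWebServices/scripts/RWebServicesTuning.properties" "$TWO_HOURS_MS"
```

Only the jms.timeout line is rewritten; every other line in the file is left as it was.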

h2. Install caCGHcall

This section installs the caCGHcall grid service in three pieces: base code, R web service, and caGrid service. You may skip this section if you are only using caDNAcopy.

h3. caCGHcall Base Code

Installs into $R_HOME
* cd to your home directory
* svn checkout [http://gforge.nci.nih.gov/svnroot/bioconductor]
* cd bioconductor/branches/caIntegrator/services/caCGHcall/R
* $R_HOME/bin/R CMD INSTALL caCGHcall

h3. caCGHcall RWebService

Installs into /usr/local/bioconductor/caCGHcall
* cd /usr/local/bioconductor
* $R_HOME/bin/R
* At the R prompt, enter:
** library(RWebServices)
** unpackAntScript("caCGHcall")
** q()
* cd caCGHcall
* ant map-package \-Dpkg=caCGHcall
* ant unpack-package \-Dpkg=caCGHcall
* ant precompile
* Optionally, test RWebService (as per Bioconductor Installation Guide page 11):
** Start up Tomcat (shutdown if already running)
*** $CATALINA_HOME/bin/startup.sh
** Start up ActiveMQ
*** $JMS_HOME/bin/activemq &
** Start up R worker for caCGHcall
*** nohup ant start-worker &
** Run the test
*** ant local-test
*** Check for any failures in test/output.  Ignore failures that say "expected 79, got 80".

h3. caCGHcall Grid Service

Installs into $CATALINA_HOME
* -Download Bioconductor-caGrid-Services.tar.gz from- -[https://gforge.nci.nih.gov/docman/view.php/175/19977/Bioconductor-caGrid-Services.tar.gz]-
* -Unpack tar file in home directory-
* -Build and deploy caDNAcopy grid service-
** -cd caGrid/CaDNAcopy-
** -ant deployTomcat-
* -Prepare Tomcat for Grid applications-
* -cd $GLOBUS_LOCATION-
** -ant \-f share/globus_wsrf_common/tomcat/tomcat.xml deployTomcat \-Dtomcat.dir="$CATALINA_HOME"-
** -Copy $JMS_HOME/lib/*.jar to $CATALINA_HOME/webapps/wsrf/WEB-INF/lib-
* -Optionally, fix timeout value in caDNAcopy.jar to be 2 hours instead of 60 seconds-
** -mkdir \~/unjar; cd \~/unjar-
** -unzip /usr/local/tomcat-5.5.27-8080/webapps/wsrf/WEB-INF/lib/caDNAcopy.jar-
** -Edit org/bioconductor/rserviceJms/services/caDNAcopy/RWebServices4java.properties-
*** -Change to: jms.timeout=7200000-
** -zip \-r caDNAcopy.jar *-
** -cd /usr/local/tomcat-5.5.27-8080/webapps/wsrf/WEB-INF/lib-
** -cp \~/unjar/caDNAcopy.jar .-
* -Optionally, fix timeout value in R worker to be 2 hours instead of 60 seconds-
** -Edit $R_HOME/lib64/R/library/RWebServices/scripts/RWebServicesTuning.properties-
*** -Change to: jms.timeout=7200000-

h2. Launch Bioconductor Services

This section describes how to start up all the pieces of the Bioconductor installation once they have been successfully installed. In the section following this one, you can find optional scripts that can automate the startup and shutdown of all processes.

h3. Starting Bioconductor Services

As per Bioconductor Installation Guide page 11:
* Start up Tomcat (shutdown if already running)
** $CATALINA_HOME/bin/startup.sh
* Start up ActiveMQ
** $JMS_HOME/bin/activemq &
* Start up R worker for caDNAcopy (skip if only using caCGHcall)
** cd /usr/local/bioconductor/caDNAcopy
** nohup ant start-worker &
* Start up R worker for caCGHcall (skip if only using caDNAcopy)
** cd /usr/local/bioconductor/caCGHcall
** nohup ant start-worker &
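
The "skip if only using" decisions above can be automated by starting a worker only for each service whose directory exists.  This is a sketch under stated assumptions: the function names installed_services and start_installed_workers are hypothetical, and the presence of a service directory under /usr/local/bioconductor is taken to mean that service was installed.

```shell
#!/bin/bash
# Print the name of each service that has a directory under the given base.
# Usage: installed_services BASE_DIR   (hypothetical helper)
installed_services() {
    base="$1"
    for svc in caDNAcopy caCGHcall; do
        if [ -d "$base/$svc" ]; then
            echo "$svc"
        fi
    done
}

# Start one R worker in the background for every installed service.
# Each worker runs from its own directory, so nohup.out is written there.
start_installed_workers() {
    base="$1"
    for svc in $(installed_services "$base"); do
        ( cd "$base/$svc" && nohup ant start-worker & )
    done
}

# Example call, after Tomcat and ActiveMQ have been started:
#   start_installed_workers /usr/local/bioconductor
```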

h2. Optional Improvements

h3. Scripts to startup/shutdown Bioconductor

These shell scripts can be used to automate the launch of Bioconductor.  If these are used to automatically start the Bioconductor services at system startup, please ensure that all environment variables (as listed in sections above) are already defined.  All three scripts should be placed in the same directory and must be set to be executable (chmod a+x).  Also note that these scripts are intended to be used with the common logging setup that is described in the section following this one.

* start-worker.sh
{code}
#!/bin/bash
#

WORKERTYPE=$1
WORKERID=$2

# Run worker
echo "Starting $WORKERTYPE worker ($WORKERID)..."
cd /usr/local/bioconductor/$WORKERTYPE
ant start-worker | sed -u "s/\[java\]/\[$WORKERTYPE $WORKERID\]/"
{code}
* start-bio.sh
{code}
#!/bin/bash
#
LOGFILE=/usr/local/bioconductor/logs/bioconductor-combined.log

THISDIR=`dirname $0`

# Run Tomcat
echo Starting Tomcat...
$CATALINA_HOME/bin/startup.sh

# Run ActiveMQ
echo Starting ActiveMQ...
$JMS_HOME/bin/activemq start &
sleep 10

# Run caDNAcopy workers
nohup $THISDIR/start-worker.sh caDNAcopy 1 >> $LOGFILE &
nohup $THISDIR/start-worker.sh caDNAcopy 2 >> $LOGFILE &

# Run caCGHcall workers
nohup $THISDIR/start-worker.sh caCGHcall 1 >> $LOGFILE &
nohup $THISDIR/start-worker.sh caCGHcall 2 >> $LOGFILE &
{code}
* stop-bio.sh
{code}
#!/bin/bash
#

# Stop ActiveMQ
$JMS_HOME/bin/activemq stop &

# Stop Tomcat
$CATALINA_HOME/bin/shutdown.sh

# Stop caDNAcopy/caCGHcall workers
### They stop themselves when ActiveMQ shuts down
{code}

h3. Setting up common logging

One disadvantage of running Bioconductor as multiple separate processes is that each process writes its own log file by default.  The following steps place the log files for Tomcat, ActiveMQ, and caDNAcopy/caCGHcall in a single directory (/usr/local/bioconductor/logs).  If the scripts in the section above are used, the worker output is additionally collected in a single log file (/usr/local/bioconductor/logs/bioconductor-combined.log).
* mkdir /usr/local/bioconductor/logs
* Edit $JMS_HOME/conf/log4j.properties
** Change "log4j.rootLogger=INFO, stdout" to "log4j.rootLogger=INFO, out"
** Change "log4j.appender.out.file=${activemq.home}/data/activemq.log" to "log4j.appender.out.file=/usr/local/bioconductor/logs/activemq.log"
* cd /usr/local/bioconductor/logs
* ln \-s /usr/local/bioconductor/caDNAcopy/nohup.out ./rworker.log
* ln \-s $CATALINA_HOME/logs/catalina.out ./tomcat.log
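
The directory and symbolic link steps above can be collected into one function.  This is a minimal sketch: the function name setup_common_logs and its argument order are hypothetical, and the log4j.properties edit is not covered here and still has to be made by hand.

```shell
#!/bin/bash
# Create the shared log directory and link the worker and Tomcat logs into it.
# Usage: setup_common_logs LOG_DIR WORKER_DIR TOMCAT_HOME   (hypothetical helper)
setup_common_logs() {
    log_dir="$1"
    worker_dir="$2"
    tomcat_home="$3"
    mkdir -p "$log_dir" || return 1
    # -f replaces an existing link, so the function can be re-run safely.
    # The links may dangle until the target log files are first written.
    ln -sf "$worker_dir/nohup.out" "$log_dir/rworker.log" || return 1
    ln -sf "$tomcat_home/logs/catalina.out" "$log_dir/tomcat.log" || return 1
}

# Example call matching the paths used in this section:
#   setup_common_logs /usr/local/bioconductor/logs \
#       /usr/local/bioconductor/caDNAcopy "$CATALINA_HOME"
```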