This document provides instructions for installing caIntegrator Bioconductor.
Topics in this document include:
Text in this document formatted as monospace bold type indicates Unix or R commands that you should type, as shown, at the command line.
caIntegrator provides the capability to perform segmentation and copy number analysis using either caDNAcopy or caCGHcall. Both of these tools are part of the Bioconductor suite and are implemented as caGrid services wrapping the R Bioconductor code. You can install one or both of these tools either on a separate server from caIntegrator or on the same server.
The R processing associated with caDNAcopy and/or caCGHcall can be very CPU- and memory-intensive.
The current installation instructions only support installation on Linux platforms. Recommended configurations are given at the bottom.
The following software packages must be installed prior to beginning the instructions:
The caIntegrator Bioconductor installation also depends on several other packages that may not be installed on your system. Summaries of these packages and directions for installing them follow.
Installs at /usr/local/ws-core-4.0.3
Installs files into $CATALINA_HOME
Installs at /usr/local/incubator-activemq-4.0.2
Change "<broker useJmx="true">" to "<broker useJmx="true" persistent="false">".
Replace "<transportConnector name="default" uri="tcp://localhost:61616" discoveryUri="multicast://default"/>" with "<transportConnector name="default" uri="tcp://localhost:61616"/>".
<networkConnector name="default" uri="multicast://default"/>
chmod 755 $JMS_HOME/bin/activemq
ActiveMQ can be autostarted or started as part of the Bioconductor launch script (provided later in these instructions)
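The two configuration substitutions above can also be scripted. The following is a minimal sketch using sed, assuming the elements appear in the configuration file exactly as quoted above. The function name patch_activemq_config is hypothetical, and the path of the configuration file is supplied by the caller because this guide does not state it. A backup copy with a .bak suffix is kept next to the edited file.

```shell
#!/bin/bash
# Sketch: apply the broker and transportConnector edits to an ActiveMQ config file.
# patch_activemq_config is a hypothetical helper; pass the path of your own config file.
patch_activemq_config() {
  local config=$1
  # -i.bak edits the file in place and keeps the original as "$config.bak".
  # "|" is used as the sed delimiter because the XML contains "/".
  sed -i.bak \
    -e 's|<broker useJmx="true">|<broker useJmx="true" persistent="false">|' \
    -e 's|<transportConnector name="default" uri="tcp://localhost:61616" discoveryUri="multicast://default"/>|<transportConnector name="default" uri="tcp://localhost:61616"/>|' \
    "$config"
}
```

Running the function a second time leaves the file unchanged, because neither pattern matches the already edited elements.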
Installs at /usr/local/R-2.9.0
cd R-2.9.0
./configure --enable-R-shlib --with-readline=no --with-x=no
	for d in "$(R_MODULES)"; do \
	  (cd $${d} && $(MAKE) $@) || exit 1; \
	done

	@if test "$(R_MODULES)" != ""; then \
	  for d in "$(R_MODULES)"; do \
	    (cd $${d} && $(MAKE) $@) || exit 1; \
	  done; \
	fi

Be careful with tabs in these lines: start each line with exactly one tab and indent beyond that with spaces. Also make sure that no line ends with trailing spaces. Use the rest of the Makefile as an example.
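Trailing spaces are easy to miss by eye, and a space after a line-continuation backslash breaks the continuation. A minimal sketch of a check follows. The function name check_trailing_whitespace is hypothetical, and the check covers the whole file passed to it, so it may also report lines that you did not edit.

```shell
#!/bin/bash
# Sketch: report every line of a file that ends in a space or tab.
# check_trailing_whitespace is a hypothetical helper, not part of R or make.
check_trailing_whitespace() {
  local file=$1
  # grep -n prints each offending line with its line number;
  # grep succeeds (status 0) only when at least one line matches.
  if grep -n '[[:space:]]$' "$file"; then
    return 1   # trailing whitespace found
  fi
  return 0     # file is clean
}
```

Call it with the path of the Makefile you edited. A non-zero exit status means at least one line still ends in whitespace.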
make
make install prefix=/usr/local/R-2.9.0
Installs files into $R_HOME
$R_HOME/bin/R
source("http://www.bioconductor.org/biocLite.R")
biocLite()
biocLite("RCurl")
biocLite("SJava")
biocLite("RWebServices")
biocLite("RUnit")
biocLite("DNAcopy")
q()
cd $R_HOME/lib64/R/library/SJava/libs
ln -s SJava.so libRInterpreter.so
ln -s SJava.so libSJava.so
$R_HOME/bin/R
library(RWebServices)
unpackAntScript("/tmp/rservices")
q()
cd /tmp/rservices
ant recompile-sjava
ant basic-prop
ant rservices-test
This section installs the caDNAcopy grid service in three pieces: base code, R web service, and caGrid service. You can skip this section if you are only using caCGHcall.
Installs into $R_HOME
cd ~
svn checkout http://gforge.nci.nih.gov/svnroot/bioconductor
cd bioconductor/trunk/services/caDNAcopy/R
$R_HOME/bin/R CMD INSTALL caDNAcopy
Installs into /usr/local/bioconductor/caDNAcopy
cd /usr/local/bioconductor
$R_HOME/bin/R
library(RWebServices)
unpackAntScript("caDNAcopy")
q()
jms.timeout=7200000
cd caDNAcopy
ant map-package -Dpkg=caDNAcopy
ant unpack-package -Dpkg=caDNAcopy
ant precompile
$CATALINA_HOME/bin/startup.sh
$JMS_HOME/bin/activemq &
nohup ant start-worker &
ant local-test
Check any failures in test/output. Ignore failures that say "expected 79, got 80"
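Filtering out the known, ignorable message can be scripted. The sketch below assumes that failure lines in the test output contain the word "expected", as the quoted message does. That pattern is an assumption about the output format, and the function name list_unexpected_failures is hypothetical.

```shell
#!/bin/bash
# Sketch: list test-output lines that look like failures, minus the known harmless one.
# list_unexpected_failures [DIR] [PATTERN]
#   DIR     directory to search (defaults to test/output)
#   PATTERN text that marks a failure line (defaults to "expected"; an assumption)
list_unexpected_failures() {
  local dir=${1:-test/output}
  local pattern=${2:-expected}
  # First grep collects candidate lines with file name and line number;
  # second grep drops the failure that these instructions say to ignore.
  grep -rn -- "$pattern" "$dir" | grep -v 'expected 79, got 80'
}
```

If the function prints nothing, the only matching lines were the ignorable "expected 79, got 80" failures, or there were none.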
Installs into $CATALINA_HOME
cd caGrid/CaDNAcopy
ant deployTomcat
cd $GLOBUS_LOCATION
ant -f share/globus_wsrf_common/tomcat/tomcat.xml deployTomcat -Dtomcat.dir="$CATALINA_HOME"
cp $JMS_HOME/lib/*.jar $CATALINA_HOME/webapps/wsrf/WEB-INF/lib
Change the JMS timeout in caDNAcopy.jar to be 2 hours instead of 60 seconds.
unzip /usr/local/tomcat-5.5.27-8080/webapps/wsrf/WEB-INF/lib/caDNAcopy.jar
jms.timeout=7200000
zip -r caDNAcopy.jar *
cd /usr/local/tomcat-5.5.27-8080/webapps/wsrf/WEB-INF/lib
cp ~/unjar/caDNAcopy.jar .
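The value 7200000 is two hours expressed in milliseconds. The sketch below shows the arithmetic and one way to set the property in a properties file that has been extracted from the jar. The helper set_property is hypothetical, and the name of the properties file inside caDNAcopy.jar is not given in this guide, so the file path is left to the caller.

```shell
#!/bin/bash
# Two hours in milliseconds: 2 h * 60 min * 60 s * 1000 ms
TIMEOUT_MS=$((2 * 60 * 60 * 1000))

# Sketch: set KEY=VALUE in a Java-style properties file.
# set_property FILE KEY VALUE  (hypothetical helper)
set_property() {
  local file=$1 key=$2 value=$3
  if grep -q "^${key}=" "$file"; then
    # Key already present: rewrite that line, keeping a .bak backup
    sed -i.bak "s|^${key}=.*|${key}=${value}|" "$file"
  else
    # Key absent: append it
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}
```

For example, `set_property path/to/extracted.properties jms.timeout "$TIMEOUT_MS"` writes jms.timeout=7200000, where the path is a placeholder for the file you extracted.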
This section installs the caCGHcall grid service in three pieces: base code, R web service, and caGrid service. You can skip this section if you are only using caDNAcopy.
Installs into $R_HOME
cd ~
svn checkout
http://gforge.nci.nih.gov/svnroot/bioconductor
cd bioconductor/branches/caIntegrator/services/caCGHcall/R
$R_HOME/bin/R CMD INSTALL caCGHcall
Installs into /usr/local/bioconductor/caCGHcall
cd /usr/local/bioconductor
$R_HOME/bin/R
library(RWebServices)
unpackAntScript("caCGHcall")
q()
jms.timeout=7200000
cd caCGHcall
jms.queue=CGHC
ant map-package -Dpkg=caCGHcall
ant unpack-package -Dpkg=caCGHcall
ant precompile
$CATALINA_HOME/bin/startup.sh
$JMS_HOME/bin/activemq &
nohup ant start-worker &
ant local-test
Check any failures in test/output. Ignore failures that say "expected 79, got 80"
Installs into $CATALINA_HOME
cd /usr/local/bioconductor/caCGHcall
ant deployTomcat
cd $GLOBUS_LOCATION
ant -f share/globus_wsrf_common/tomcat/tomcat.xml deployTomcat -Dtomcat.dir="$CATALINA_HOME"
cp $JMS_HOME/lib/*.jar $CATALINA_HOME/webapps/wsrf/WEB-INF/lib
This section describes how to start up all the pieces of the Bioconductor installation once they have been successfully installed. In #Optional Improvements, you can find optional scripts that automate the startup and shutdown of all processes.
See the Bioconductor Installation Guide page 11.
$CATALINA_HOME/bin/startup.sh
$JMS_HOME/bin/activemq &
cd /usr/local/bioconductor/caDNAcopy
nohup ant start-worker &
cd /usr/local/bioconductor/caCGHcall
nohup ant start-worker &
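The workers depend on ActiveMQ, so it can help to wait until the broker accepts connections before starting them. The following is a minimal sketch, assuming the broker listens on localhost port 61616 as in the transportConnector URI used earlier in this guide. The function name wait_for_port is hypothetical, and the check relies on bash's built-in /dev/tcp redirection.

```shell
#!/bin/bash
# Sketch: wait until a TCP port accepts connections, or give up after a timeout.
# wait_for_port HOST PORT TIMEOUT_SECONDS  (hypothetical helper)
wait_for_port() {
  local host=$1 port=$2 timeout=$3 waited=0
  # The subshell tries to open a TCP connection through bash's /dev/tcp;
  # the connection is closed again when the subshell exits.
  until (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1   # port did not open in time
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

A possible use is to run `wait_for_port localhost 61616 60` after launching ActiveMQ and to start the workers only if it returns successfully.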
These shell scripts can be used to automate the launch of Bioconductor. If they are used to start the Bioconductor services automatically at system startup, ensure that all environment variables (as listed in the sections above) are already defined. All three scripts should be placed in the same directory and must be made executable (chmod a+x). Also note that these scripts are intended to be used with the common logging setup that is described in #Setting Up Common Logging.
#!/bin/bash
#
WORKERTYPE=$1
WORKERID=$2
# Run worker
echo "Starting $WORKERTYPE worker ($WORKERID)..."
cd /usr/local/bioconductor/$WORKERTYPE
ant start-worker | sed -u "s/\[java\]/\[$WORKERTYPE $WORKERID\]/"

#!/bin/bash
#
LOGFILE=/usr/local/bioconductor/logs/bioconductor-combined.log
THISDIR=`dirname $0`
# Run Tomcat
echo Starting Tomcat...
$CATALINA_HOME/bin/startup.sh
# Run ActiveMQ
echo Starting ActiveMQ...
$JMS_HOME/bin/activemq start &
sleep 10
# Run caDNAcopy workers
nohup $THISDIR/start-worker.sh caDNAcopy 1 >> $LOGFILE &
nohup $THISDIR/start-worker.sh caDNAcopy 2 >> $LOGFILE &
# Run caCGHcall workers
nohup $THISDIR/start-worker.sh caCGHcall 1 >> $LOGFILE &
nohup $THISDIR/start-worker.sh caCGHcall 2 >> $LOGFILE &

#!/bin/bash
#
# Stop ActiveMQ
$JMS_HOME/bin/activemq stop &
# Stop Tomcat
$CATALINA_HOME/bin/shutdown.sh
# Stop caDNAcopy/caCGHcall workers
### They stop themselves when ActiveMQ shuts down
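Because these scripts rely on environment variables such as $CATALINA_HOME and $JMS_HOME, a check at the top of a launch script can catch a missing definition before any process is started. The following is a minimal sketch; the function name require_env is hypothetical and is not part of the scripts above.

```shell
#!/bin/bash
# Sketch: fail early when a required environment variable is unset or empty.
# require_env NAME...  (hypothetical helper)
require_env() {
  local name missing=0
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion: the value of the variable
    # whose name is stored in $name, or empty if it is unset.
    if [ -z "${!name:-}" ]; then
      echo "Environment variable $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_env CATALINA_HOME JMS_HOME R_HOME GLOBUS_LOCATION || exit 1` stops a script when any of the variables used in this guide is undefined.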
One disadvantage of running Bioconductor as multiple separate processes is that each process creates a separate log file by default. With the following steps, the log files for Tomcat, ActiveMQ, and caDNAcopy/caCGHcall are placed in a single directory, /usr/local/bioconductor/logs. If the scripts in #Optional Improvements are used, logging is further improved by placing all messages in a single log file, /usr/local/bioconductor/logs/bioconductor-combined.log.
mkdir /usr/local/bioconductor/logs
cd /usr/local/bioconductor/logs
ln -s /usr/local/bioconductor/caDNAcopy/nohup.out ./rworker.log
ln -s $CATALINA_HOME/logs/catalina.out ./tomcat.log
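The commands above fail if they are run a second time, because the directory and the links already exist. A minimal sketch of a repeatable variant follows; the function name link_log is hypothetical.

```shell
#!/bin/bash
# Sketch: create a log directory and a symbolic link inside it, safely repeatable.
# link_log TARGET LOG_DIR LINK_NAME  (hypothetical helper)
link_log() {
  local target=$1 log_dir=$2 name=$3
  mkdir -p "$log_dir"                  # no error if the directory already exists
  ln -sfn "$target" "$log_dir/$name"   # -f replaces an existing link of the same name
}
```

For example, `link_log "$CATALINA_HOME/logs/catalina.out" /usr/local/bioconductor/logs tomcat.log` recreates the Tomcat link shown above.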