This document provides instructions for installing caIntegrator Bioconductor.
Topics in this document include:
caIntegrator provides the capability to perform level 3 copy number analysis using either caDNAcopy or caCGHcall. Both of these tools are part of the Bioconductor suite and are implemented as caGrid services wrapping the R Bioconductor code. One or both of these tools can be installed either on a separate server from caIntegrator or on the same server.
Be aware that the R processing associated with caDNAcopy and/or caCGHcall can be very CPU and memory intensive.
The current installation instructions only support installation on Linux platforms. Recommended configurations are given at the bottom.
The following software packages must be installed prior to beginning the instructions:
The caIntegrator Bioconductor installation also depends on several other packages that may not be installed on your system. Summaries of these packages and how to install them follow.
Installed at /usr/local/ws-core-4.0.3
Installs files into $CATALINA_HOME
Installed at /usr/local/incubator-activemq-4.0.2
Installed at /usr/local/R-2.9.0
	for d in "$(R_MODULES)"; do \
	  (cd $${d} && $(MAKE) $@) || exit 1; \
	done

	@if test "$(R_MODULES)" != ""; then \
	  for d in "$(R_MODULES)"; do \
	    (cd $${d} && $(MAKE) $@) || exit 1; \
	  done; \
	fi
Be very careful with tabs: start each line with exactly one tab and use spaces for any indentation beyond that. Also make sure that no line ends with trailing spaces. Use the rest of the Makefile as an example.
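The whitespace rules above can be checked mechanically after editing. The following is a minimal sketch, assuming bash and a POSIX `grep`; the function name `check_makefile_whitespace` is hypothetical and not part of the R distribution. It only flags suspicious lines (trailing whitespace, or a line starting with a space instead of a tab) and may report lines that are in fact harmless.

```shell
#!/bin/bash
# Hypothetical helper: report whitespace problems in a Makefile.
# Returns 0 if nothing suspicious is found, 1 otherwise.
check_makefile_whitespace() {
    local file="$1"
    local status=0
    # Lines ending in a space or a tab (matching lines are printed with numbers).
    if grep -n -E '[[:space:]]+$' "$file"; then
        echo "Trailing whitespace found in $file" >&2
        status=1
    fi
    # Lines that start with a space; recipe lines must start with a tab.
    if grep -n -E '^ ' "$file"; then
        echo "Lines starting with a space found in $file" >&2
        status=1
    fi
    return "$status"
}
```

Call it with the path of the Makefile that was edited, for example `check_makefile_whitespace Makefile`, and review any lines it prints before running the build.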
Installs files into $R_HOME
This section installs the caDNAcopy grid service in three pieces: base code, R web service, and caGrid service. You may skip this section if you are only using caCGHcall.
Installs into $R_HOME
Installs into /usr/local/bioconductor/caDNAcopy
Installs into $CATALINA_HOME
This section installs the caCGHcall grid service in three pieces: base code, R web service, and caGrid service. You may skip this section if you are only using caDNAcopy.
Installs into $R_HOME
Installs into /usr/local/bioconductor/caCGHcall
* cd /usr/local/bioconductor
* $R_HOME/bin/R
* At the R prompt:
** library(RWebServices)
** unpackAntScript("caCGHcall")
** q()
* Optionally, set the timeout for all R workers to 2 hours instead of 60 seconds
** Edit $R_HOME/lib64/R/library/RWebServices/scripts/RWebServicesTuning.properties
*** Change to: jms.timeout=7200000
* cd caCGHcall
* Change the queue name for caCGHcall to not conflict with caDNAcopy:
** Edit RWebServicesTuning.properties
*** Change to: jms.queue=CGHC
* ant map-package -Dpkg=caCGHcall
* ant unpack-package -Dpkg=caCGHcall
* ant precompile
* Optionally, test RWebService (as per Bioconductor Installation Guide page 11):
** Start up Tomcat (shutdown if already running)
*** $CATALINA_HOME/bin/startup.sh
** Start up ActiveMQ
*** $JMS_HOME/bin/activemq &
** Start up R worker for caCGHcall
*** nohup ant start-worker &
** Run the test
*** ant local-test
*** Check any failures in test/output. Ignore failures that say "expected 79, got 80"
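The two property edits in the steps above (`jms.timeout` and `jms.queue`) can also be scripted. The following is a minimal sketch, assuming bash and a `sed` that supports `-i` with a backup suffix and `-E`; the helper name `set_property` is hypothetical and not part of RWebServices. It assumes values contain no `|` or `&` characters, which holds for the values used here.

```shell
#!/bin/bash
# Hypothetical helper: set key=value in a Java-style properties file.
# Replaces an existing "key=..." line, or appends the line if the key is absent.
set_property() {
    local file="$1" key="$2" value="$3"
    # Escape dots in the key so they match literally in the regular expression.
    local key_re="${key//./\\.}"
    if grep -q -E "^${key_re}=" "$file"; then
        # Edit in place, keeping a .bak copy of the original next to the file.
        sed -i.bak -E "s|^${key_re}=.*|${key}=${value}|" "$file"
    else
        echo "${key}=${value}" >> "$file"
    fi
}

# Example calls matching the steps above (second one run from the caCGHcall directory):
# set_property "$R_HOME/lib64/R/library/RWebServices/scripts/RWebServicesTuning.properties" jms.timeout 7200000
# set_property RWebServicesTuning.properties jms.queue CGHC
```

Editing the files by hand works equally well; the helper is only useful when the installation is repeated on several servers.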
h3. caCGHcall Grid Service
Installs into $CATALINA_HOME
# Build and deploy caCGHcall grid service
** cd /usr/local/bioconductor/caCGHcall
** ant deployTomcat
# Prepare Tomcat for Grid applications
** cd $GLOBUS_LOCATION
** ant -f share/globus_wsrf_common/tomcat/tomcat.xml deployTomcat -Dtomcat.dir="$CATALINA_HOME"
# Copy $JMS_HOME/lib/*.jar to $CATALINA_HOME/webapps/wsrf/WEB-INF/lib
h2. Launch Bioconductor Services
This section describes how to start up all the pieces of the Bioconductor installation once they have been successfully installed. In the section following this one, you can find optional scripts that can automate the startup and shutdown of all processes.
h3. Starting Bioconductor Services
As per Bioconductor Installation Guide page 11:
These shell scripts can be used to automate the launch of Bioconductor. If these are used to automatically start the Bioconductor services at system startup, please ensure that all environment variables (as listed in sections above) are already defined. All three scripts should be placed in the same directory and must be set to be executable (chmod a+x). Also note that these scripts are intended to be used with the common logging setup that is described in the section following this one.
#!/bin/bash
#
WORKERTYPE=$1
WORKERID=$2
# Run worker
echo "Starting $WORKERTYPE worker ($WORKERID)..."
cd /usr/local/bioconductor/$WORKERTYPE
ant start-worker | sed -u "s/\[java\]/\[$WORKERTYPE $WORKERID\]/"
#!/bin/bash
#
LOGFILE=/usr/local/bioconductor/logs/bioconductor-combined.log
THISDIR=`dirname $0`
# Run Tomcat
echo Starting Tomcat...
$CATALINA_HOME/bin/startup.sh
# Run ActiveMQ
echo Starting ActiveMQ...
$JMS_HOME/bin/activemq start &
sleep 10
# Run caDNAcopy workers
nohup $THISDIR/start-worker.sh caDNAcopy 1 >> $LOGFILE &
nohup $THISDIR/start-worker.sh caDNAcopy 2 >> $LOGFILE &
# Run caCGHcall workers
nohup $THISDIR/start-worker.sh caCGHcall 1 >> $LOGFILE &
nohup $THISDIR/start-worker.sh caCGHcall 2 >> $LOGFILE &
#!/bin/bash
#
# Stop ActiveMQ
$JMS_HOME/bin/activemq stop &
# Stop Tomcat
$CATALINA_HOME/bin/shutdown.sh
# Stop caDNAcopy/caCGHcall workers
### They stop themselves when ActiveMQ shuts down
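The scripts above rely on environment variables such as `$CATALINA_HOME` and `$JMS_HOME` being defined, which is easy to miss when they run at system startup. The following is a minimal sketch of a pre-flight check, assuming bash; the function name `check_required_env` is hypothetical, and the list of variables in the example call is taken from the variables referenced in this document.

```shell
#!/bin/bash
# Hypothetical pre-flight check: report which required variables are unset or empty.
# Prints the missing names (one per line) and returns 1 if any are missing.
check_required_env() {
    local missing=0 name
    for name in "$@"; do
        # ${!name} is bash indirect expansion: the value of the variable named by $name.
        if [ -z "${!name:-}" ]; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}

# Example: abort a startup script early if anything is missing.
# check_required_env CATALINA_HOME JMS_HOME R_HOME GLOBUS_LOCATION || exit 1
```

Placing the check at the top of the startup script makes a missing variable fail immediately with a clear name, rather than as a "command not found" error from a half-expanded path.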
One disadvantage of running Bioconductor as multiple separate processes is that each process creates its own log file by default. By using the following steps, the log files for Tomcat, ActiveMQ, and caDNAcopy/caCGHcall are placed into a single directory (/usr/local/bioconductor/logs). If the scripts in the above section are used, logging is consolidated further, because all worker messages are written to a single log file (/usr/local/bioconductor/logs/bioconductor-combined.log).