 

BioRainbow

 

Toucan Review


Toucan is a Java front-end application for analysing the cis-regulatory logic of co-regulated genes. It provides an interface to online services that perform the actual calculations. One of these services, ModuleSearcher, can be used to find the optimal cis-regulatory module (CRM), that is, the best combination of transcription factor binding sites in the regulatory sequences of a set of co-regulated genes.

Toucan software is available from: http://www.esat.kuleuven.ac.be/~dna/BioI/Software.html

Advantages:
* Uses a Genetic Algorithm and an A* algorithm to find the best combination;
* The A* algorithm (based on branch-and-bound search) is claimed to find the best possible solution;
* The Genetic Algorithm is more customizable than our implementation: the user can specify the mutation probability (our implementation always mutates exactly one model of the module) and the percentage of survivors (always 50% in our current implementation);
* Optionally, the resulting module may contain multiple copies of one model;
* Optionally, sites may or may not overlap;
* Uses a model distance (Kullback-Leibler), which helps cluster similar models into classes;
* Uses a background model to calculate the score.
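
The two Genetic Algorithm parameters mentioned above, mutation probability and percentage of survivors, can be illustrated with a minimal Python sketch. This is not Toucan or ModuleSearcher code: the function name evolve, the crossover rule, the population size and the toy fitness function are all assumptions made for illustration. Unlike ModuleSearcher's optional behaviour, the sketch never places the same model in a module twice.

```python
import random

def evolve(models, fitness, module_size, pop_size=20, generations=30,
           mutation_prob=0.2, survivor_fraction=0.5, seed=0):
    """Toy genetic search for the best-scoring module (a list of distinct models)."""
    rng = random.Random(seed)
    # Start from random modules, each holding module_size distinct models.
    population = [rng.sample(models, module_size) for _ in range(pop_size)]
    n_survivors = max(2, int(pop_size * survivor_fraction))
    for _ in range(generations):
        # Keep only the best-scoring fraction of the population.
        population.sort(key=fitness, reverse=True)
        survivors = population[:n_survivors]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            # Crossover (assumed rule): draw the child from both parents' models.
            pool = sorted(set(a) | set(b))
            child = rng.sample(pool, module_size)
            # Mutation: with probability mutation_prob, swap one model
            # for a model that is not yet in the module.
            if rng.random() < mutation_prob:
                unused = [m for m in models if m not in child]
                if unused:
                    child[rng.randrange(module_size)] = rng.choice(unused)
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)

if __name__ == "__main__":
    # Hypothetical per-model scores; a real fitness would score binding sites.
    toy_scores = {"M1": 0.3, "M2": 1.2, "M3": 0.8, "M4": 2.0, "M5": 0.1}
    best = evolve(list(toy_scores), lambda mod: sum(toy_scores[m] for m in mod),
                  module_size=3)
    print(sorted(best))
```

Setting survivor_fraction=0.5 corresponds to the fixed 50% survivor rate described for our implementation, while Toucan exposes both values to the user. Because the best modules always survive, the best score never decreases from one generation to the next.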

Disadvantages:
* The Java implementation possibly slows down computations significantly. A C++ implementation might be considerably faster;
* Solving an NP-complete problem with the A* algorithm takes a huge amount of CPU time and memory;
* CMA (CompositeModuleAnalyst, owned by BIOBASE GmbH) accepts two sequence sets (experimental and control), and its total score approaches the maximum when the experimental set contains the most sequences that fit the found module while the control set contains the fewest. The ModuleSearcher implementation just sums the scores of the individual sequences to produce the score of the sequence set;
* CMA (from BIOBASE) accepts sequence sets with expression values and tries to maximize the correlation between the CRM score on each sequence and its expression value. Thus CMA can work with microarray experiments;
* CMA (from BIOBASE) tries to find not just the best set of matrices, but also a cutoff value for each matrix;
* CMA (from BIOBASE) is able to find the best boolean expression of the input sets. For example, the result can look like this:
not (p(M1)) and (p(M2) or p(M3)) and p(M4), where p(Mi) indicates whether model Mi occurs in the current sequence.
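
To make the example expression concrete, here is a small Python sketch that evaluates it for individual sequences. It only evaluates the expression quoted above; how CMA searches for or scores such expressions is not shown. The function name module_matches and the sample sequences are hypothetical.

```python
def module_matches(models_in_sequence):
    """Evaluate: not (p(M1)) and (p(M2) or p(M3)) and p(M4)."""
    def p(model):
        # p(Mi) is True when model Mi occurs in the current sequence.
        return model in models_in_sequence
    return (not p("M1")) and (p("M2") or p("M3")) and p("M4")

# Hypothetical sequences, each with the set of models found in it.
sequences = {
    "seq_a": {"M2", "M4"},        # no M1, has M2 and M4
    "seq_b": {"M1", "M2", "M4"},  # rejected because M1 occurs
    "seq_c": {"M3"},              # rejected because M4 is missing
}
matching = [name for name, found in sequences.items() if module_matches(found)]
print(matching)  # prints ['seq_a']
```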

Comparison of Toucan and CompositeModuleAnalyst

CMA (from BIOBASE) is faster, because it is written in C++.
CMA (from BIOBASE) has a built-in Metropolis algorithm.
CMA (from BIOBASE) can work with microarray experiments.
CMA (from BIOBASE) finds cutoff values for the best module.
CMA (from BIOBASE) can find boolean expressions.

CMA (from BIOBASE) has no client-server implementation.
CMA (from BIOBASE) has fewer features in its genetic algorithm.
CMA (from BIOBASE) cannot find the provably best module.
CMA (from BIOBASE) cannot handle modules with repeated models.
CMA (from BIOBASE) cannot group models into classes.
