Research interests | Publications | Software | Teaching | Short CV
An Efficient Algorithm for the Identification of Structured Motifs in DNA Promoter Sequences
Alexandra M. Carvalho, Ana T. Freitas, Arlindo L. Oliveira and Marie-France Sagot
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IEEE/ACM TCBB)
Volume 3, Issue 2, pp. 126-140, April 2006
We propose a new algorithm for identifying cis-regulatory modules in genomic sequences. The algorithm, named RISO, uses a new data structure, called box-link, to store information about conserved regions that occur in a well-ordered and regularly spaced manner in the dataset sequences. Conserved regions of this type, called structured motifs, are highly relevant to the study of gene regulatory mechanisms because they can effectively represent promoter models. The complexity analysis shows a gain in time and space over the best known exact algorithms that is exponential in the spacings between binding sites. A full implementation of the algorithm was developed and made available online. Experimental results show that the algorithm is much faster than existing ones, sometimes by more than four orders of magnitude. Applying the method to biological datasets shows its ability to extract relevant consensus sequences.
Keywords: Box-link, Factor tree, Structured motif, Promoter, Binding site consensus.
Availability: http://kdbio.inesc-id.pt/~asmc/software/riso.html
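To make the notion of a structured motif concrete, the following is a minimal brute-force sketch, not the RISO algorithm and not its box-link data structure. It only checks whether two given boxes occur in order, separated by a spacer whose length lies in a given range, in each sequence of a dataset. All function names, parameters, and the toy sequences are assumptions made for illustration.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def box_hits(seq: str, box: str, max_mismatches: int) -> List[int]:
    """Start positions where `box` matches `seq` with at most `max_mismatches` substitutions."""
    k = len(box)
    return [i for i in range(len(seq) - k + 1)
            if hamming(seq[i:i + k], box) <= max_mismatches]


def structured_hits(seq: str, box1: str, box2: str,
                    d_min: int, d_max: int,
                    max_mismatches: int = 0) -> List[Tuple[int, int]]:
    """Pairs (start of box1, start of box2) whose spacer length lies in [d_min, d_max]."""
    starts2 = set(box_hits(seq, box2, max_mismatches))
    hits = []
    for i in box_hits(seq, box1, max_mismatches):
        end1 = i + len(box1)  # first position after box1
        for d in range(d_min, d_max + 1):
            if end1 + d in starts2:
                hits.append((i, end1 + d))
    return hits


def quorum(seqs: List[str], box1: str, box2: str,
           d_min: int, d_max: int, max_mismatches: int = 0) -> int:
    """Number of sequences containing at least one occurrence of the structured motif."""
    return sum(1 for s in seqs
               if structured_hits(s, box1, box2, d_min, d_max, max_mismatches))


# Toy dataset (hypothetical sequences, chosen only to exercise the code).
seqs = [
    "ACGTTGACA" + "A" * 17 + "TATAATGC",  # spacer of 17, exact boxes
    "TTGACA" + "C" * 16 + "TATGAT",       # spacer of 16, one substitution in box 2
    "TTGACA" + "G" * 5 + "TATAAT",        # spacer of 5, outside the allowed range
]
print(quorum(seqs, "TTGACA", "TATAAT", 16, 18, max_mismatches=1))  # prints 2
```

This sketch tests a single candidate motif that is already known. An extraction algorithm such as the one described in the abstract has to discover the boxes themselves, which is where the spacing between binding sites dominates the cost.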
Get a preprint: [ps] [pdf]
Get a bibentry: [bib] [bbl]