Department of Mathematics
Mathematics Colloquium - Spring 2011
Wednesday, April 13th, 2011
2:45pm - 3:45pm, in McCormack 2-116
Kourosh Zarringhalam
Boston College
MicroRNA classification and integration of chemical footprinting data into RNA secondary structure prediction
Abstract:
MicroRNAs (miRNAs) are short (~22 nt) endogenous non-coding
RNAs that play an important role in post-transcriptional gene
regulation. The miRNA precursor (pre-miRNA) has a characteristic
hairpin structure with Boltzmann basepairing probabilities in the
ensemble of low energy secondary structures significantly different
from those of other similar hairpin structures, allowing us to apply
machine learning classification methods to predict miRNAs based on
structural features. We assess the discriminatory power of these features
by training a Support Vector Machine (SVM) classifier on known miRNAs and
protein-coding sequences. Five-fold cross-validation tests yield a high
accuracy of 0.95. We apply our classification method to next-generation
sequencing data of short RNAs from human cell lines and identify more than
100 novel putative human miRNAs. Our work suggests that a large number
of miRNAs remain to be characterized in eukaryotic genomes. Further, we
present a method for improving the accuracy of RNA secondary
structure prediction by integrating chemical footprinting data into
the Boltzmann partition function.