Introduction to aaRNA

aaRNA is an RNA-binding site predictor that combines sequence and structural features and outperforms sequence-only and structure-only predictors. Sequence-based predictors are usually shortsighted, due to their fragmented view of a binding site. By contrast, structure-based predictors can reach higher specificity, but usually at the cost of sensitivity. Given these trade-offs, we aimed to develop a method that makes optimal use of both sequence and structural features for RNA-binding site prediction.

To this end, we use several established and novel features. In addition to the sequence features used previously in the SRCPred method [1], we include HMM-based evolutionary conservation (EC) scores to better evaluate conservation [2]. As for structural features, we use local relative accessible surface area (rASA) in a novel way: the values are mapped onto patches of spatially neighboring residues in order to capture correlations between residues that are close in space. Finally, we use Laplacian norm (LN) [3] coordinates to represent molecular structure. LN is a structural descriptor that measures surface deformations on proteins at fine and coarse-grained levels. By tuning the granularity, LN can be made tolerant to structural deviations among RNA-binding surfaces while remaining sensitive enough to distinguish binding surfaces from non-binding ones. More importantly, aaRNA can use structural features even for sequence-only input by means of in-line homology modeling. The proposed method has been implemented as a web service called aaRNA at http://sysimm.ifrec.osaka-u.ac.jp/aarna/, and is expected to enhance functional annotation of putative RNA-binding proteins at the residue level.
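The idea of mapping rASA onto patches of spatially neighboring residues can be illustrated with a minimal sketch. This is not the aaRNA implementation: the function name `patch_average`, the use of a single representative coordinate per residue, the 10 Å default radius, and the plain mean are all assumptions made for illustration only.

```python
import math

def patch_average(coords, rasa, radius=10.0):
    """Average rASA over each residue's spatial patch.

    coords: one (x, y, z) point per residue (assumed representative atom).
    rasa:   per-residue relative accessible surface area values.
    radius: patch radius in angstroms (hypothetical default, not from aaRNA).
    """
    averaged = []
    for center in coords:
        # A patch contains every residue within `radius` of the center,
        # including the center residue itself (distance 0).
        members = [
            value
            for point, value in zip(coords, rasa)
            if math.dist(center, point) <= radius
        ]
        averaged.append(sum(members) / len(members))
    return averaged

# Toy example: two residues close together, one far away.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
rasa = [0.2, 0.6, 0.9]
patched = patch_average(coords, rasa, radius=5.0)
```

In this toy input the first two residues share a patch and so receive the same smoothed value, while the isolated third residue keeps its own rASA. A patch-level feature of this kind is one way a per-residue descriptor can carry information about its spatial neighborhood.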

To evaluate the performance of our method, we used three homology-reduced standard benchmarks constructed for other classifiers (RB106 [4], RB144 [5], and RB198 [6]) for comparison. Our method shows considerable improvement over sequence-based features alone, and exceeds the previously reported AUC limit of 0.81 by 2-3%, as demonstrated in Figure 1. Moreover, we tested our model on an independent dataset taken from [7]. The AUC, MCC, and F-score calculated from our predictions were 0.8457, 0.488, and 0.598, respectively, whereas the Meta-predictor of the authors of [7] achieved an AUC of 0.835 and an MCC of 0.460, as shown in Figure 2.
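For readers unfamiliar with the three metrics quoted above, the following sketch shows how AUC, MCC, and F-score can be computed from per-residue binding labels and predicted scores. The helper names, the toy data, and the 0.5 decision threshold are assumptions for illustration; the source does not state which threshold or F-score variant was used, so the standard F1 definition is assumed here.

```python
import math

def confusion(labels, scores, threshold):
    """Count (tp, fp, tn, fn) when scores >= threshold are called binding."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 if the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def f_score(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auc(labels, scores):
    """Threshold-free AUC: probability that a binding residue outscores
    a non-binding one, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): 1 = RNA-binding residue, 0 = non-binding.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]

counts = confusion(labels, scores, threshold=0.5)
tp, fp, tn, fn = counts
auc_value = auc(labels, scores)
mcc_value = mcc(tp, fp, tn, fn)
f_value = f_score(tp, fp, fn)
```

AUC is independent of any decision threshold, whereas MCC and F-score depend on the cutoff used to call a residue binding, which is why comparisons between methods usually report AUC alongside one or both of the thresholded metrics.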

en/aarna/introduction.txt · Last modified: 2014/09/08 10:53 by kmamada