aaRNA is an RNA-binding site predictor that combines sequence and structural features and outperforms sequence-only and structure-only predictors. Sequence-based predictors are usually shortsighted, due to their fragmented view of a binding site. By contrast, structure-based predictors can reach higher specificity, but usually at the cost of sensitivity. Given these trade-offs, we aimed to develop a method that makes optimal use of both sequence and structural features for RNA-binding site prediction.
To this end, we have made use of several established and novel features. In addition to the sequence features used previously in the SRCPred method, we include HMM-based evolutionary conservation (EC) scores to better evaluate conservation. As for structural features, we use a novel local relative accessible surface area (rASA) feature, which we map onto patches of spatially neighboring residues in order to capture correlations between spatially proximate residues. Finally, we use Laplacian norm (LN) coordinates to represent molecular structure. LN is a structural descriptor that measures surface deformations on proteins at fine and coarse-grained levels. By tuning the granularity, the LN can be made tolerant to structural deviations among RNA-binding surfaces, while still being sensitive enough to distinguish binding surfaces from non-binding ones. More importantly, aaRNA can use structural features even for sequence-only input by means of in-line homology modeling. The proposed method has been implemented as a web service called aaRNA at http://sysimm.ifrec.osaka-u.ac.jp/aarna/, and is expected to enhance functional annotation of putative RNA-binding proteins at the residue level.
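To illustrate the idea of mapping a per-residue feature onto patches of spatially neighboring residues, the following is a minimal sketch and not the aaRNA implementation. It assumes each residue is represented by a single coordinate (for example, a C-alpha atom) and that a patch is every residue within a fixed distance cutoff; the function name `patch_average`, the 10 Å radius, and the toy rASA values are all hypothetical choices made for illustration.

```python
import math

def patch_average(coords, values, radius=10.0):
    """Average a per-residue value over all residues within `radius` of each residue.

    coords: list of (x, y, z) tuples, one representative point per residue (assumed).
    values: list of per-residue feature values, e.g. rASA in [0, 1].
    radius: patch cutoff in the same units as coords (10.0 is an arbitrary choice).
    """
    smoothed = []
    for ci in coords:
        # The residue itself is always in its own patch, since its distance is 0.
        patch = [v for cj, v in zip(coords, values) if math.dist(ci, cj) <= radius]
        smoothed.append(sum(patch) / len(patch))
    return smoothed

# Toy example: three residues on a line, spaced 6 units apart.
coords = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
rasa = [0.2, 0.8, 0.5]
patch_rasa = patch_average(coords, rasa, radius=10.0)
```

In this toy case the middle residue sees both neighbors, while each end residue sees only the middle one, so the patch value reflects the local spatial environment, not the residue alone. A weighted average (for example, by distance) would be an equally plausible variant.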
To evaluate the performance of our method, we used three homology-reduced standard benchmarks constructed for other classifiers (RB106, RB144, and RB198) for comparison. Our method exhibits considerable improvement over sequence-based features alone, and exceeds the previously reported AUC limit of 0.81 by 2-3%, as demonstrated in Figure 1. Moreover, we tested our model on an independent dataset from a previous study. The AUC, MCC, and F-score calculated from our predictions were 0.8457, 0.488, and 0.598, respectively, in contrast to that study's Meta-predictor, which achieved an AUC of 0.835 and an MCC of 0.460, as shown in Figure 2.
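For readers less familiar with the reported metrics, the sketch below shows how AUC, MCC, and F-score can be computed from per-residue binding scores. It is an illustration under stated assumptions, not the evaluation code used in this work: the F-score is taken to be the F1 measure, the 0.5 decision threshold is arbitrary, and the labels and scores are made-up toy values.

```python
import math

def confusion(labels, scores, threshold):
    """Count TP, FP, TN, FN for binary labels at a given score threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if predicted and y:
            tp += 1
        elif predicted and not y:
            fp += 1
        elif not predicted and y:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def f_score(tp, fp, fn):
    """F1 measure (harmonic mean of precision and recall)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auc(labels, scores):
    """Threshold-free AUC: fraction of (binding, non-binding) pairs ranked correctly,
    counting ties as half. Requires at least one residue of each class."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: 1 = RNA-binding residue, 0 = non-binding residue (hypothetical values).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.6, 0.3, 0.4, 0.2]

tp, fp, tn, fn = confusion(labels, scores, threshold=0.5)
toy_mcc = mcc(tp, fp, tn, fn)
toy_f = f_score(tp, fp, fn)
toy_auc = auc(labels, scores)
```

Note that AUC is independent of the decision threshold, whereas MCC and F-score depend on it, which is why a method can rank residues well (high AUC) and still show moderate MCC at a particular cutoff.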