Bioinformatics tool for evaluating aligner performance on your RNA-Seq dataset.

CADBURE evaluates spliced aligner performance on a user's RNA-Seq data by comparing a pair of alignment results, obtained either from two different aligners run with similar parameter sets or from the same aligner run with two different parameter sets. In the comparison, CADBURE determines the relative reliability of unambiguously (also called uniquely) aligned reads and non-uniquely aligned reads. The reads are further subdivided into eight distinct scenarios of potential alignment outcomes. These scenarios are binned into three categories: true positive, false positive, and true negative. This binning enables CADBURE to determine specificity and accuracy for each alignment result.
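As a rough illustration of how the three categories can yield the two metrics, the sketch below uses the standard definitions restricted to the counts CADBURE reports. It is an assumption that the formulas take exactly this form (in particular, that no false-negative term appears); the function names and the example counts are hypothetical and are not taken from CADBURE itself.

```python
def specificity(fp: int, tn: int) -> float:
    """Standard specificity: TN / (TN + FP)."""
    return tn / (tn + fp)


def accuracy(tp: int, fp: int, tn: int) -> float:
    """Accuracy over the three reported categories (assumed form):
    (TP + TN) / (TP + FP + TN)."""
    return (tp + tn) / (tp + fp + tn)


# Hypothetical counts for one alignment result.
tp, fp, tn = 900, 10, 90
spec = specificity(fp, tn)
acc = accuracy(tp, fp, tn)
```

With these made-up counts, specificity is 90 / 100 and accuracy is 990 / 1000.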

Input and Output

The input is two alignment results in BAM format. For faster processing, we recommend sorting the input by read name (see Samtools). The summary output, which reports the number of read mappings involved in each contrasting scenario, is presented in HTML format. The lists of read names involved in the different scenarios are written as text files in separate folders. These read name lists are useful for viewing contrasting mapping scenarios in any BAM viewer.
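The name-sorting step recommended above can be sketched as a small Python wrapper around `samtools sort -n`. It assumes Samtools 1.x is installed and on the PATH; the helper names and the file names are hypothetical placeholders.

```python
import subprocess


def name_sort_command(in_bam: str, out_bam: str) -> list:
    """Build the samtools command that sorts a BAM file by read name (-n)."""
    return ["samtools", "sort", "-n", "-o", out_bam, in_bam]


def name_sort(in_bam: str, out_bam: str) -> None:
    """Run the sort; raises CalledProcessError if samtools exits with an error."""
    subprocess.run(name_sort_command(in_bam, out_bam), check=True)


# Hypothetical file names; call name_sort(...) once per alignment result.
cmd = name_sort_command("aligner_A.bam", "aligner_A.byname.bam")
```

Both alignment results would be sorted this way before being passed to CADBURE.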

Easier and Simpler to Use

CADBURE is an easy-to-use tool for evaluating spliced aligner performance on your data: it can be executed in one line. It is simple because you do not need to rerun aligners, randomly sub-sample the results, or simulate chromosomes. Because no chromosome simulation is needed, you can generate the alignment results with any functional feature of the aligner, such as annotation-guided mapping or SNP-tolerant mapping.

Statistical Significance

The CADBURE distribution comes with an R script that runs bootstrap statistics on the output to assess its statistical significance. The R script takes the TP, FP, and TN counts that CADBURE outputs for each aligner. Differences in both specificity and accuracy between aligners are assessed by building 95% bootstrap confidence intervals (10,000 bootstrap samples) for the true difference. The script outputs the 95% confidence intervals. A result should be deemed significant at the 5% significance level if the associated 95% CI for the difference does not contain zero.
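To make the idea concrete, here is a minimal Python sketch of a percentile bootstrap for the difference in specificity and accuracy between two aligners. It is not the bundled R script: the resampling scheme (drawing reads with replacement from each aligner's TP/FP/TN counts), the metric formulas, and all function names are assumptions made for illustration, and the sketch assumes each resample keeps at least one FP or TN read.

```python
import random


def metrics(tp, fp, tn):
    """Return (specificity, accuracy) using the assumed three-category formulas."""
    return tn / (tn + fp), (tp + tn) / (tp + fp + tn)


def resample(tp, fp, tn, rng):
    """Draw tp+fp+tn reads with replacement, weighted by the observed counts."""
    draws = rng.choices(("TP", "FP", "TN"), weights=(tp, fp, tn), k=tp + fp + tn)
    return draws.count("TP"), draws.count("FP"), draws.count("TN")


def percentile_ci(values, alpha):
    """Percentile interval taken from the sorted bootstrap values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    lo = ordered[int(round((alpha / 2) * last))]
    hi = ordered[int(round((1 - alpha / 2) * last))]
    return lo, hi


def bootstrap_diff_ci(counts_a, counts_b, n_boot=10000, alpha=0.05, seed=0):
    """CIs for (aligner A - aligner B) in specificity and in accuracy."""
    rng = random.Random(seed)
    spec_diffs, acc_diffs = [], []
    for _ in range(n_boot):
        spec_a, acc_a = metrics(*resample(*counts_a, rng))
        spec_b, acc_b = metrics(*resample(*counts_b, rng))
        spec_diffs.append(spec_a - spec_b)
        acc_diffs.append(acc_a - acc_b)
    return percentile_ci(spec_diffs, alpha), percentile_ci(acc_diffs, alpha)


def is_significant(ci):
    """Significant at level alpha when the interval does not contain zero."""
    return ci[0] > 0 or ci[1] < 0
```

A call such as `bootstrap_diff_ci((tp_a, fp_a, tn_a), (tp_b, fp_b, tn_b))` returns one interval per metric. Pure-Python resampling is slow for millions of reads, so a vectorized multinomial draw would be a natural substitute at real data sizes.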

News and Updates

08/15/2015
CADBURE is published in Scientific Reports, the Nature Publishing Group's fully open access journal!

If you use CADBURE, please cite:

Praveen Kumar Raj Kumar, Thanh V. Hoang, Michael L. Robinson, Panagiotis A. Tsonis, Chun Liang: CADBURE: A generic tool to evaluate the performance of spliced aligners on RNA-Seq data, in press. Scientific Reports, 5:13443, (2015) DOI: 10.1038/srep13443.