RNA-Seq Analysis

We at 1010Genome have built an optimized solution for your transcriptomics (RNA-Seq) projects. Our service generates output that clearly indicates the statistical significance of your experimental data, helping you quickly shortlist differentially expressed genes. We also offer custom analysis workflows tailored to your experimental design to deliver the best results.

RNA-Seq is revolutionising transcriptome studies. It is a highly sensitive, precise and accurate tool for measuring abundance and expression across transcriptomes. Both known and novel features can be detected in a single assay, providing the opportunity to explore transcript isoforms, gene fusions, single nucleotide polymorphisms (SNPs) and other single nucleotide variants, and allele-specific gene expression.

RNA sequencing has rapidly replaced gene expression microarrays in many labs. mRNA (and other RNAs) is converted to cDNA, which is used as the input for next-generation sequencing library preparation. RNA-Seq allows you to quantify, discover and profile RNAs.

There are two basic types of pipeline used for RNA-Seq: reference-based and de novo. In the reference-based approach, reads are compared against an existing reference genome. In the de novo approach there is no reference genome, so the RNA sequences are assembled first. The reads can then be mapped to the reference genome of the closest available species, to the assembly itself used as the reference, or to DNA-Seq data if available.

Every RNA-Seq experimental scenario could potentially have different optimal methods for transcript quantification, normalization, and ultimately differential expression analysis. Moreover, quality control checks should be applied at each relevant stage of the analysis to ensure both the reproducibility and the reliability of the results.

RNA-Seq Data Analysis

General pipeline for the RNA-Seq workflow. The right-hand side shows various attributes of each step, highlighting the complexity of RNA-Seq data analysis.

RNA-Seq is increasingly the method of choice for researchers studying the transcriptome. It offers numerous advantages over gene expression arrays.

  • Broader dynamic range enables more sensitive and accurate measurement of gene expression.
  • Not limited by prior knowledge – captures both known and novel features.
  • Can be applied to any species, even if a reference sequence is not available.
  • A better value, often delivering advantages at a comparable or lower price per sample than many arrays.
  • Gene expression profiling across samples.
  • Study of alternative splicing events (differential inclusion/exclusion of exons in the processed RNA product after splicing of a precursor RNA segment) associated with diseases.
  • Identification of allele-specific expression, disease-associated single nucleotide polymorphisms (SNPs) and gene fusions to understand, e.g. disease causal variants in cancer.
  • Quantification of transcript abundance and expression, together with transcript assembly and annotation, all made practical by RNA-Seq tools.

We have developed robust in-house RNA-Seq analysis pipelines for both reference-based and de novo (transcript assembly and annotation) strategies, delivering optimal results whichever strategy is chosen. We have modified the de novo transcriptome assembly protocol used by Trinity (Grabherr et al., 2011) to generate high-quality assemblies, followed by an annotation step. Sequencing reads are then mapped to these annotated transcriptome assemblies, followed by tag counting and expression analysis. Depending on the experimental design, we use a combination of standard (Trapnell et al., 2013) and in-house tools for differential gene expression analysis.

RNAseq Expression Plots

The analysis generates expression plots and an FPKM table (FPKM: fragments per kilobase of transcript per million fragments mapped) for all genes under the various conditions.
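The FPKM definition can be made concrete with a short calculation. The following is a minimal Python sketch, not the pipeline's actual implementation; the function name `fpkm` and the example numbers are hypothetical and chosen only for illustration.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million fragments mapped.

    fragment_count         -- fragments assigned to this transcript
    transcript_length_bp   -- transcript length in base pairs
    total_mapped_fragments -- all mapped fragments in the sample
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragment count must be positive")
    # Dividing by (length / 1e3) and by (total / 1e6) equals multiplying by 1e9.
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a sample with 10 million mapped fragments.
example_value = fpkm(500, 2000, 10_000_000)
print(example_value)  # 25.0
```

Because the value is scaled by both transcript length and sequencing depth, a long transcript and a short transcript with the same number of fragments receive different FPKM values.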

RNAseq Scatter Plots

Similarly, scatter plots are generated to show similarities and specific outliers between two conditions.

RNAseq Volcano Plots

Volcano plots are generated to show genes, transcripts, TSS or CDS groups that display significant differences between a pair of conditions.
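As a rough illustration of what a volcano plot displays, each feature is positioned by its log2 fold change (x-axis) and its -log10 P-value (y-axis). The sketch below is an example under stated assumptions: the pseudocount, the cutoffs and the function names are hypothetical and are not the settings used by the service.

```python
import math


def volcano_point(mean_a, mean_b, p_value, pseudocount=1.0):
    """Return (log2 fold change of B over A, -log10 P-value) for one feature.

    The pseudocount (an assumption) avoids division by zero when a feature
    is not expressed in one condition. p_value must be greater than zero.
    """
    log2_fc = math.log2((mean_b + pseudocount) / (mean_a + pseudocount))
    neg_log10_p = -math.log10(p_value)
    return log2_fc, neg_log10_p


def classify(log2_fc, neg_log10_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a point using illustrative cutoffs (assumed, not prescribed)."""
    if neg_log10_p < -math.log10(p_cutoff) or abs(log2_fc) < fc_cutoff:
        return "not significant"
    return "up" if log2_fc > 0 else "down"


# Hypothetical feature: mean expression 1.0 in condition A, 7.0 in condition B.
x, y = volcano_point(1.0, 7.0, 0.001)
label = classify(x, y)
print(x, y, label)
```

Features far from the centre on the x-axis and high on the y-axis are the ones that combine a large change with strong statistical support.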

RNAseq Read Coverage

Alignment files (BAM files), generated by mapping reads to a reference genome or to a transcript assembly, are used to visualize read coverage for a given gene (regucalcin here) between two conditions as a measure of differential expression. Genome viewers such as IGV and the UCSC Genome Browser can be used to view BAM files.
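The idea behind a coverage track can be sketched in a few lines. This is a simplified illustration only: in practice the alignments would be read from a BAM file with a dedicated library, whereas here the reads are hypothetical (start, end) intervals, spliced alignments are ignored, and the function name `coverage` is an assumption.

```python
def coverage(reads, region_start, region_end):
    """Per-base read depth over [region_start, region_end).

    reads -- iterable of (start, end) alignments, 0-based and half-open
    """
    depth = [0] * (region_end - region_start)
    for start, end in reads:
        # Clip each read to the region before counting it.
        for pos in range(max(start, region_start), min(end, region_end)):
            depth[pos - region_start] += 1
    return depth


# Hypothetical alignments over a 6 bp region in two conditions.
condition_1 = coverage([(0, 3), (2, 5)], 0, 6)
condition_2 = coverage([(0, 3)], 0, 6)
print(condition_1)  # [1, 1, 2, 1, 1, 0]
print(condition_2)  # [1, 1, 1, 0, 0, 0]
```

Comparing the two depth profiles over the same gene gives the kind of visual difference that a genome viewer shows between conditions.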

Custom Sashimi plots (Katz et al., 2010) can be generated to allow easy comparison of read depth, alternative splicing, and isoform structure between individual RNA-Seq samples.

RNAseq Sashimi Plots

RNAseq Bar Plots

Additional bar plots for specific genes of interest can be ordered as well. Here is an example of differences in regucalcin gene expression between two conditions. Similarly, bar plots drawn for the isoforms of a specific gene can highlight the source of differential gene expression.

  1. RNA-Seq analysis report containing a summary of the results and quality measures.
  2. Expression analysis table containing normalized expression values between the samples and corresponding P-values.
  3. RNA-Seq alignment file (SAM/BAM).
  4. Quality control summary and figures, including a principal component analysis (PCA) plot and hierarchical clustering figures.
  5. Scatter plots and volcano plots for specific genes of interest. (Optional)
  6. De novo transcriptome assembly and annotation files (for projects with no existing reference genome).
  7. FPKM value table for isoforms.
  8. Isoform or gene specific figures or Sashimi plots (Custom on demand).