16S Metagenomics


Metagenomic analysis of microbial populations is often performed using the prokaryotic 16S ribosomal RNA (rRNA) gene. This gene contains conserved and variable regions that are used for phylogenetic classification. In clinical microbiology, molecular identification based on 16S rDNA sequencing is applied mainly to bacteria that are impossible, difficult, or slow to identify by other techniques.

16S rRNA Sequencing Analysis
  •  16S rRNA sequencing is an established method for the identification and taxonomic classification of bacterial species.
  •  It can be used to recognize novel pathogens.
  •  Because the gene is conserved across bacterial species, it provides useful signature sequences and patterns for identification.
  •  16S rRNA sequencing can lead to the discovery of previously undescribed species.
  •  The conserved and variable regions facilitate sequencing and phylogenetic classification.
  •  Researchers can achieve species-level sensitivity in metagenomic surveys of bacterial populations.
  •  In clinical microbiology, molecular identification based on 16S rDNA sequencing is applied mainly to bacteria that are difficult or slow to identify by other techniques.

16S rRNA short-read libraries target the variable V3 and V4 regions of the 16S rRNA gene. Although 16S rRNA sequencing is an amplicon sequencing technique, environmental and clinical samples are usually not clean and need expert handling to process and to amplify the 16S rRNA genes. Our 16S rRNA sequencing service ensures that your precious samples are handled in the best possible manner to generate high-quality sequencing data, and it can help achieve species-level sensitivity for biodiversity analysis of bacterial populations.

Next-Generation Sequencing Pipelines

Choice of Next-Generation Sequencing Pipelines. Methods in molecular biology (Clifton, N.J.). 1231. 31-47. 10.1007/978-1-4939-1720-4_3.

16S rRNA Data Analysis

16S rRNA sequencing data analysis requires a deep understanding of each step in order to generate precise information about biodiversity or phylogeny. An overview of a typical 16S rRNA data analysis pipeline is shown here.

As with any NGS data analysis workflow, it is essential to preprocess reads before the actual analysis. Data preprocessing includes steps to determine the quality and coverage required for analysis. Trimming reads to remove adaptors and low-quality bases is a standard procedure applied to NGS datasets. 16S rRNA sequencing involves a PCR amplification step, which generally introduces chimeric reads. These must be removed before any OTU (operational taxonomic unit) analysis is performed, otherwise spurious OTUs may be reported. Paired-end stitching may be required when the two reads of a pair together span more than the amplified 16S rRNA region and therefore overlap. Our 16S rRNA data analysis pipeline determines the overlap between paired-end reads and selects the higher-quality base at each overlapping position to improve the quality of the stitched read.
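The stitching step described above can be illustrated with a minimal sketch. It is not the pipeline's actual implementation: the function names, the `min_overlap` and `max_mismatch_frac` parameters, and the representation of qualities as lists of integer Phred scores are all assumptions made for illustration. The sketch reverse-complements the reverse read, looks for the longest acceptable overlap with the end of the forward read, and keeps the higher-quality base at each overlapping position.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def stitch_pair(fwd, fwd_qual, rev, rev_qual, min_overlap=10, max_mismatch_frac=0.1):
    """Merge one read pair into a single sequence.

    fwd/rev are the reads as sequenced; fwd_qual/rev_qual are lists of
    integer Phred scores (hypothetical representation). Returns
    (sequence, qualities), or None when no acceptable overlap is found.
    """
    rc = reverse_complement(rev)
    rc_qual = rev_qual[::-1]
    # Try the longest candidate overlap first: end of fwd against start of rc.
    for overlap in range(min(len(fwd), len(rc)), min_overlap - 1, -1):
        a = fwd[-overlap:]
        b = rc[:overlap]
        mismatches = sum(x != y for x, y in zip(a, b))
        if mismatches > max_mismatch_frac * overlap:
            continue
        merged_seq = list(fwd[:-overlap])
        merged_qual = list(fwd_qual[:-overlap])
        for i in range(overlap):
            fq = fwd_qual[len(fwd) - overlap + i]
            rq = rc_qual[i]
            # Keep whichever read reports the higher quality at this position.
            if fq >= rq:
                merged_seq.append(a[i])
                merged_qual.append(fq)
            else:
                merged_seq.append(b[i])
                merged_qual.append(rq)
        merged_seq.extend(rc[overlap:])
        merged_qual.extend(rc_qual[overlap:])
        return "".join(merged_seq), merged_qual
    return None


# Usage: a 26-base fragment sequenced from both ends with 18-base reads.
fragment = "ACGTTGCAAGCTTAGGCATCGATCCA"
forward = fragment[:18]
reverse = reverse_complement(fragment[8:])
result = stitch_pair(forward, [30] * 18, reverse, [20] * 18)
```

Production tools also weigh quality scores when deciding whether an overlap is real and recompute the merged quality. This sketch only shows the selection of the higher-quality base.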

NGS Data Analysis
16S rRNA Chimera

Typical chimera seen during Illumina and PacBio library preparation. Microbial phylogenetic profiling with the Pacific Biosciences sequencing platform. Microbiome, 2013.

The 16S rRNA taxonomic classification step starts by clustering reads into OTUs based on sequence similarity, with a representative sequence chosen for each OTU. In general, reads with more than 97% sequence similarity are assumed to belong to the same bacterial species. Information about the expected bacterial diversity and the required cluster resolution can be used to fine-tune the sequence identity threshold used for clustering and OTU generation. Once the OTUs are established, their representative sequences are compared against known reference databases to assign taxonomic identities. Currently, there are three popular databases: Greengenes, SILVA and the Ribosomal Database Project (RDP), each with its own strengths and weaknesses. A number of tools exist for taxonomic classification. We have developed a robust 16S rRNA analysis and classification pipeline for phylogenetic assignment using the reference databases mentioned above.
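The clustering idea can be sketched as a greedy, abundance-sorted procedure. This is a simplified illustration, not the pipeline's method: it assumes reads have already been trimmed to the same length so that identity can be computed position by position, whereas real clustering tools use pairwise alignment. The function names and the dictionary layout of an OTU are hypothetical.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otu_clustering(reads, threshold=0.97):
    """Cluster reads into OTUs at the given identity threshold.

    Unique sequences are processed from most to least abundant. Each one
    joins the first OTU whose representative it matches at or above the
    threshold; otherwise it becomes the representative of a new OTU.
    """
    otus = []  # each OTU: {"rep": representative sequence, "size": read count}
    for seq, count in Counter(reads).most_common():
        for otu in otus:
            if identity(seq, otu["rep"]) >= threshold:
                otu["size"] += count
                break
        else:
            otus.append({"rep": seq, "size": count})
    return otus


# Usage: a 100-base sequence, a 98%-identical variant and a 95%-identical one.
base = "ACGT" * 25
near = "TT" + base[2:]           # 2 mismatches: joins the OTU of `base`
far = base[:-5] + "NNNNN"        # 5 mismatches: forms its own OTU
otus = greedy_otu_clustering([base] * 5 + [near] * 2 + [far] * 3)
```

Lowering `threshold` merges more sequences into each OTU, and raising it gives finer resolution. This is the tuning knob the paragraph above refers to.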

OTU 16S rRNA Data Analysis
Phylogenetic Tree 16S rRNA Data Analysis

Bryophyte-Cyanobacteria Associations during Primary Succession in Recently Deglaciated Areas of Tierra del Fuego (Chile). PloS one. 2014-9.

With OTU assignment in place, it is key to understand the evolutionary relationships among the species present in the sample(s). All OTU sequences are aligned in a multiple sequence alignment against the reference database to determine the evolutionary distances between them and to generate a phylogenetic tree that represents these distances as branches and nodes. Sequences or OTUs that are closer together in a phylogenetic tree are more closely related to each other than those that are further apart or belong to other nodes in the tree.
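As a rough illustration of how distances become a tree, the sketch below computes Jukes-Cantor distances between already-aligned sequences and groups them with UPGMA. Both choices are assumptions made for the example, since the text does not say which distance model or tree-building method the pipeline uses. The function names and the nested-tuple tree representation are hypothetical, and branch lengths are omitted for brevity.

```python
import math
from itertools import combinations


def jc69_distance(a, b):
    """Jukes-Cantor distance between two aligned sequences (gap columns skipped)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    p = sum(x != y for x, y in pairs) / len(pairs)
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1 - 4 * p / 3)


def upgma(names, dist):
    """Build a UPGMA topology as nested tuples.

    dist maps frozenset({a, b}) to the distance between a and b.
    """
    dist = dict(dist)
    clusters = {name: 1 for name in names}  # cluster -> number of leaves
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: dist[frozenset(pair)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        # Distance to the new cluster is the size-weighted average.
        for other in clusters:
            dist[frozenset((merged, other))] = (
                size_a * dist[frozenset((a, other))]
                + size_b * dist[frozenset((b, other))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


# Usage with three hypothetical aligned OTU representatives.
seqs = {
    "otu1": "ACGTACGTACGTACGTACGT",
    "otu2": "ACGTACGTACGTACGTACGA",
    "otu3": "TCGTTCGTTCGTTCGTTCGT",
}
distances = {
    frozenset((a, b)): jc69_distance(seqs[a], seqs[b])
    for a, b in combinations(seqs, 2)
}
tree = upgma(list(seqs), distances)
```

In the resulting tree, `otu1` and `otu2` are joined first because they differ at a single position, while `otu3` attaches at a deeper node.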

Diversity analysis can be performed at two levels: within a sample and across samples. Diversity within a sample is called alpha diversity, while diversity compared across samples is called beta diversity. Alpha diversity provides a gauge of the intra-sample distribution of species. The majority of published alpha diversity studies indicate the presence of a few dominant species in a sample, so a high sequencing depth is required to capture the minority species present in each sample. Our team of experts runs rarefaction analysis to estimate the actual diversity within a sample. Samples sequenced to greater depth tend to show higher observed species diversity. Many alpha diversity measures, such as the Shannon index, observed species and Chao1, can be generated upon request.
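The alpha diversity measures named above can be computed from a vector of per-OTU read counts for one sample. The following is a minimal sketch with hypothetical function names. It uses the natural logarithm for the Shannon index and the bias-corrected form of Chao1, both of which are choices made for this example. Rarefaction is shown as random subsampling without replacement to a fixed depth.

```python
import math
import random


def observed_species(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs that are present."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed_species(counts) + singletons * (singletons - 1) / (2 * (doubletons + 1))


def rarefied_richness(counts, depth, rng):
    """Observed species after subsampling `depth` reads without replacement."""
    pool = [otu for otu, c in enumerate(counts) for _ in range(c)]
    return len(set(rng.sample(pool, depth)))


# Usage: one sample dominated by a few OTUs, with several rare ones.
sample = [50, 30, 10, 5, 2, 1, 1, 1, 0]
rng = random.Random(0)  # fixed seed so the subsampling is repeatable
curve = [(depth, rarefied_richness(sample, depth, rng)) for depth in (10, 25, 50, 100)]
```

Plotting `curve` gives a rarefaction curve. When it is still rising at the full sequencing depth, the sample has probably not been sequenced deeply enough to capture its rare species.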

Heat map 16S rRNA Data Analysis

Heat map showing the relative abundances and distribution of representative 16S rRNA gene tag sequences classified at the genus level. 2015, PLOS ONE.

16S rRNA Alpha Diversity

Alpha diversity – Taxonomic summary of Ciona samples by Illumina sequencing of 16S rRNA.

The Gut of Geographically Disparate Ciona intestinalis Harbors a Core Microbiota. PloS one. 2014

For beta diversity analysis, our pipeline compares samples using phylogenetic information, such as the UniFrac distance, generated in the steps above. Samples can be compared either pairwise or in an all-vs-all manner to generate a beta diversity matrix. Beta diversity can be represented in several ways, including network diagrams, phylogenetic trees and graphs.
UPGMA clustering of Porites surface microbiomes based on the Morisita-Horn beta diversity of V4 region of 16S rRNA genes revealed two distinct clusters of samples.

16S rRNA Beta Diversity

Community Shifts in the Surface Microbiomes of the Coral Porites astreoides with Unusual Lesions. PloS one, 2014
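An all-vs-all beta diversity matrix can be sketched with the Morisita-Horn index used in the figure above. This example is an illustration only: it works on OTU count vectors rather than phylogenetic information, so it is not a substitute for UniFrac, and the function names, sample names and counts are hypothetical.

```python
from itertools import combinations


def morisita_horn_dissimilarity(x, y):
    """1 - Morisita-Horn similarity between two OTU count vectors.

    Both vectors must list the same OTUs in the same order.
    """
    total_x, total_y = sum(x), sum(y)
    d_x = sum(v * v for v in x) / total_x ** 2
    d_y = sum(v * v for v in y) / total_y ** 2
    shared = sum(a * b for a, b in zip(x, y))
    return 1 - 2 * shared / ((d_x + d_y) * total_x * total_y)


def beta_diversity_matrix(samples):
    """All-vs-all dissimilarity matrix as a nested dict keyed by sample name."""
    names = list(samples)
    matrix = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = morisita_horn_dissimilarity(samples[a], samples[b])
        matrix[a][b] = matrix[b][a] = d
    return matrix


# Usage with made-up counts for three samples over the same four OTUs.
samples = {
    "healthy_1": [40, 30, 20, 10],
    "healthy_2": [80, 60, 40, 20],   # same proportions as healthy_1
    "lesion_1": [0, 5, 15, 80],
}
matrix = beta_diversity_matrix(samples)
```

A matrix of this kind is the usual input to the downstream representations mentioned above, such as UPGMA trees and PCoA plots.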

  • Data QC report that includes statistics on read trimming and filtering, paired-end stitching and chimera filtering
  • Taxonomy classification report and a file of representative sequences
  • Classified OTU table with abundance information
  • Phylogenetic Tree
  • Alpha Diversity report that includes: Diversity tree, observed species and stats.
  • Beta Diversity report that includes: Trees, Graph and Network diagrams; Distance matrices; PCoA plots