Comparative genomics of Westiellopsis prolifica a freshwater cyanobacteria uncovers the prolific and distinctive metabolic potentials

  1. Vineeta Verma1,2
  2. Mathu Malar C.1,2
  3. Sucheta Tripathy1,2*

Authors Affiliation(s)

  • 1Department of Structural Biology and Bioinformatics, CSIR Indian Institute of Chemical Biology, Kolkata 700032, INDIA
  • 2Academy of Scientific and Innovative Research (AcSIR), New Delhi, INDIA

Can J Biotech, Volume 1, Special Issue, Page 123, DOI: https://doi.org/10.24870/cjb.2017-a109

*Corresponding author: tsucheta@iicb.res.in

Abstract

Cyanobacteria are among the most ancient microorganisms, having originated about 2.5 billion years ago. They are a rich source of natural compounds with scalable applications in the pharmaceutical and biotechnology industries. Unicellular cyanobacteria are more ancient than the multicellular forms. In this study, we explore the genome of a multicellular, heterocystous, true-branching cyanobacterium, Westiellopsis prolifica, belonging to the order Nostocales. A complete genome is essential as a reference for other sequencing projects and for confirming the presence of metabolic genes that are important for manufacturing pharmaceutical products. Here we report a draft assembly of the Westiellopsis prolifica genome of 7.2 Mb in 19 scaffolds, with N50 and largest contig sizes of 2,650,655 bp and 3,476,031 bp, respectively. Phylogenomic studies in the literature indicate that the closest relatives of Westiellopsis prolifica are Fischerella sp. PCC 9431, Fischerella sp. PCC 9939 and Hapalosiphon welwitschii. Our preliminary comparative genomic analysis revealed that sequence identity with the neighbouring clades was low, although a large set of genes was syntenic and arranged in conserved clusters. Genome mining of these organisms showed that several clusters of NRPS, polyketide biosynthesis, two-component system, heterocyst differentiation and Nif genes are conserved across these genomes. We identified 21 secondary metabolite clusters, which include NRPS and polyketide genes. For extraction of metabolites, we used several organic solvents. These extracts contain various metabolic products that can be further exploited for large-scale production through genetic engineering approaches. Our future work includes examining the RNA-seq expression of these metabolite-producing genes.
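
For readers unfamiliar with the N50 statistic quoted in the abstract, the following is a minimal sketch of how it is conventionally computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name n50 and the example lengths are illustrative assumptions; they are not taken from the Westiellopsis prolifica assembly or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence that brings us to half the total.
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (in bp), chosen only for illustration.
example_lengths = [50, 40, 30, 20, 10]
print(n50(example_lengths))
```

In practice the lengths would be read from the assembly's FASTA file rather than typed in by hand; that step is omitted here to keep the sketch self-contained.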