NABIC


Introduction info


Overview

The bellflower (Platycodon grandiflorus) belongs to the bellflower family (Campanulaceae). In East Asia, its root has been used for over 2,000 years as a traditional medicine and as a popular food additive with therapeutic effects on bronchitis, asthma, tonsillitis, and pulmonary tuberculosis. The most important bioactive components of P. grandiflorus are platycosides, especially platycodin D. A whole-genome assembly of P. grandiflorus is presented here, accompanied by its transcriptome and methylome data. The genome-wide analysis reveals how P. grandiflorus evolved to specialize in platycoside biosynthesis as a medicinal herb. In particular, the genes related to triterpenoid saponin biosynthesis provide clues to the species-specific selection of key genes for platycoside biosynthesis and to their function.

 

Genome assembly and annotation of P. grandiflorus

Jangbaek-doraji, a cultivar of P. grandiflorus, was used for whole-genome sequencing after four generations of self-fertilization. Karyotype analysis confirmed that the P. grandiflorus genome is diploid, with four metacentric and five sub-metacentric chromosome pairs. K-mer analysis estimated the genome size at approximately 694.4 Mb. We produced 474.5x sequencing coverage of Illumina short reads and 5.7x coverage of TruSeq synthetic long reads (TSLRs). A hybrid assembly resulted in a 680.1 Mb draft genome in 4,816 scaffolds, covering 98.4% of the estimated P. grandiflorus genome size. The assembly captured 92.6% of BUSCO genes as complete, with few fragmented or missing BUSCO genes, indicating a high-quality assembly. Assembly quality was also assessed by mapping the short reads back to the assembly: 98% of them aligned successfully, with a sequencing coverage depth of 25x. The genome annotation of P. grandiflorus enabled us to identify candidate genes underlying secondary metabolite biosynthesis. We predicted 40,018 non-redundant protein-coding genes with an average length of 5,019 bp from the repeat-masked genome, using evidence-driven gene prediction methods coupled with ab initio prediction.
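To give a sense of scale for the coverage and annotation figures above, the following Python sketch converts them into approximate base counts. It rests on assumptions not stated on this page: that coverage is defined relative to the k-mer estimated genome size (694.4 Mb), that 1 Mb equals 1,000,000 bp, and that the average gene length can simply be multiplied by the gene count, ignoring any overlap between genes. The function names are illustrative only and do not come from the original analysis.

```python
MB = 1_000_000  # assumption: 1 Mb = 1,000,000 bp


def bases_for_coverage(coverage: float, genome_size_bp: float) -> float:
    """Total sequenced bases implied by a coverage depth (coverage x genome size)."""
    return coverage * genome_size_bp


def gene_span_bp(gene_count: int, mean_gene_length_bp: int) -> int:
    """Total span of predicted genes, assuming genes do not overlap."""
    return gene_count * mean_gene_length_bp


def fraction_of(part_bp: float, whole_bp: float) -> float:
    """Share of a sequence occupied by a given span."""
    return part_bp / whole_bp


if __name__ == "__main__":
    estimated_genome_bp = 694.4 * MB  # k-mer estimate quoted in the text
    assembly_bp = 680.1 * MB          # draft assembly size quoted in the text

    illumina_bp = bases_for_coverage(474.5, estimated_genome_bp)
    tslr_bp = bases_for_coverage(5.7, estimated_genome_bp)
    genes_bp = gene_span_bp(40_018, 5_019)

    print(f"Illumina short reads: {illumina_bp / 1e9:.1f} Gb")
    print(f"TSLRs:                {tslr_bp / 1e9:.1f} Gb")
    print(f"Predicted gene span:  {genes_bp / MB:.1f} Mb")
    print(f"Share of assembly:    {fraction_of(genes_bp, assembly_bp):.1%}")
```

Under these assumptions, the quoted coverages correspond to about 329.5 Gb of short reads and about 4.0 Gb of TSLRs, and the predicted genes span roughly 200.9 Mb, or about 29.5% of the 680.1 Mb assembly. These are back-of-the-envelope values derived from the rounded numbers on this page, not reported results.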

 

 
