6 open genome list

Pyrus pyrifolia var. culta (Pear)
Pyrus pyrifolia cv. ‘Wonwhang’
Overview
Pears (Pyrus spp.) are one of the important fruit crops in temperate regions. The Rosaceae family includes antioxidant-rich fruits such as apples, plums, berries, and cherries. For Pyrus pyrifolia cultivar ‘Wonwhang’ (WH) (BioSample SAMN05196235), the nuclear, chloroplast (KX450876), and mitochondrial (KY563276) genomes were sequenced. The nuclear genome size is estimated to be 535 Mb by k-mer analysis. The nuclear genome assembly showed a high detection rate of 97%, with 2,221 scaffolds, a scaffold N50 of 649 Kb, and a total length of 517 Mbp. Three genetic maps were constructed in inter- and intraspecific cross populations using SNPs obtained by genotyping-by-sequencing (GBS). Owing to a recent whole-genome duplication (WGD) in Pyreae, markers can map to more than two loci on different homologous chromosomes in the genetic maps. Relying mainly on Hi-C mapping, we anchored 964 scaffolds (440.3 Mb, 85.23% of the assembled WH genome) into pseudomolecules for 17 chromosomes. SSR-containing genome sequences were assigned to putative gene function categories by homology search, and 510 SSR markers were selected in this study. Of these, 70 markers mapped to more than one locus across the 17 chromosomes. We conducted QTL analysis for several breeding traits related to fruit ripening, including sweetness, acidity, and texture. Four QTLs for sweetness were detected in one interspecific population (P. pyrifolia × P. communis), and two SSR markers were selected in this population. Expected benefits of the pear genome sequencing include identifying genes and using them to develop new varieties that increase the market value of the fruit.
Statistics
Genome
Approximately 517 Mbp is assembled, of which 440.3 Mb (964 scaffolds, 85.23% coverage) is placed on 17 chromosomes
Scaffold N50 = 649 Kb, number of scaffolds = 2,221
Contig N50 = 256 Kb, number of contigs = 9,926
Loci
43,553 protein-coding genes have been predicted (1,623 genes with isoforms)
Organelle genomes
The complete chloroplast genome of P. pyrifolia is 159,922 bp in size (Accession no. KX450876). The chloroplast genome harbors 132 annotated genes, including 93 protein-coding genes, 31 tRNA genes, and eight rRNA genes. The complete mitochondrial genome of P. pyrifolia is 458,873 bp in length (Accession no. KY563276). A total of 65 genes are annotated, including 39 protein-coding genes, 23 tRNA genes, and three rRNA genes.
References
H Y Chung, S Y Won, Y-K Kim, J S Kim* (2019) Development of the chloroplast genome-based InDel markers in Niitaka (Pyrus pyrifolia) and its application, Plant Biotechnology Reports 11816-018-00513-0
P Soundararajan, S Y Won, J S Kim* (2019) Insight on Rosaceae Family with Genome Sequencing and Functional Genomics Perspective, BioMed Research International, Article ID 7519687
J Y Lee, M-S Seo, S Y Won, K A Lim, I S Sin, D S Choi, J S Kim* (2018) Construction of a Genetic Map using the SSR Markers Derived from “Wonwhang” of Pyrus pyrifolia, Korean J. Breed. Sci. 50(4):434-441
H Y Chung, T-H Lee, Y-K Kim, J S Kim* (2017) Complete chloroplast genome sequences of Wonwhang (Pyrus pyrifolia) and its phylogenetic analysis, Mitochondrial DNA Part B: Resources, Vol. 2, No. 1, 325-326
H Y Chung, S Y Won, S-H Kang, S-H Sohn, J S Kim* (2017) Complete chloroplast genome sequences of Wonwhang (Pyrus pyrifolia) and its phylogenetic analysis, Mitochondrial DNA Part B: Resources, Vol. 2, No. 1, 325-326
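The scaffold and contig N50 values quoted in the statistics above can be reproduced from a list of sequence lengths. The following is a minimal sketch, assuming the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly). The function name `n50_l50` and the toy lengths are illustrative only and are not taken from the Wonwhang assembly.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of sequence lengths.

    N50 is the length of the sequence at which the running total of
    lengths, taken from longest to shortest, first reaches half of the
    assembly size. L50 is the number of sequences needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable: running total always reaches half")


# Toy example (hypothetical lengths): total = 32, half = 16.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50_l50(toy_lengths))  # (8, 2)
```

In practice the lengths would be read from the assembly FASTA index; the calculation itself does not change.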

Fagopyrum esculentum (Buckwheat)
Overview
The Tartary buckwheat (Fagopyrum tataricum) genome project was initiated through the Post genome Program by a consortium led by Yul Ho Kim, Su Jeong Kim, Hwang Bae Sohn, Sunghoon Lee, Dong-Ha Oh, and Sin-Gi Park. De novo genome sequencing of Tartary buckwheat began in early 2014 and was completed in late 2017. To obtain a high-quality draft genome assembly, we produced a total of 43.83 Gb and 32.17 Gb of sequence from the Illumina paired-end (PE) and Single-Molecule Real-Time (SMRT) sequencing platforms, respectively, corresponding to 70x (Illumina PE) and 52x (SMRT) coverage. A hybrid assembly followed by scaffolding, gap-filling, and removal of redundancy resulted in a final draft assembly of 526.94 Mb in 2,566 scaffolds, with 50% of the total sequence captured in 156 scaffolds larger than 886,968 bp (N50). We predicted a total of 43,771 putative protein-coding gene models occupying 19.33% of the genome, while 52.00% consisted of repetitive sequences and transposable elements, with Gypsy family long terminal repeat (LTR) retrotransposons being the most abundant class. We are currently preparing a paper on the draft genome of Tartary buckwheat.
Statistics
Genome
Approximately 526.94 Mb arranged in 2,566 scaffolds
Approximately 565.10 Mb arranged in 4,433 contigs
Scaffold N50 = 886,968 bp
Contig N50 = 463,432 bp
137 scaffolds larger than 1 Mbp, with above 50% of the genome in 156 scaffolds
Loci
A total of 43,771 putative protein-coding gene models were predicted.
Sequencing, Assembly, and Annotation
Genome sequencing
We prepared both short-read (Illumina) and long-read (PacBio) libraries to cover the entire genome of F. tataricum. Sequencing libraries were prepared from genomic DNA using the Illumina HiSeq2500 (2 × 101 bp) and PacBio RSII (>3 Kb) platforms.
In brief, a short-insert (350 bp) paired-end (PE) library was constructed using the TruSeq DNA Library Prep Kit (Illumina) according to the manufacturer's instructions. SMRTbell libraries for Single Molecule Real Time (SMRT) sequencing were prepared from the large-scale amplified cDNA as recommended by Pacific Biosciences (Palo Alto, U.S.A.). SMRTbell templates were bound to polymerase using the DNA polymerase binding kit P6 v2 primers.
How was the assembly generated?
Whole-genome de novo assembly of F. tataricum was performed via a hybrid approach as follows. Long SMRT sequencing reads were assembled using Fast Alignment and CONsensus (FALCON) (Chin et al., 2016), whereas 350-bp short-insert reads were assembled using SOAPdenovo2 (Luo et al., 2012) with default parameters. Before assembly, all Illumina reads were preprocessed (adapter, quality, and duplicate trimming). The initial contigs from the two assemblies were merged using HaploMerger2 (Huang et al., 2017). Both short and long reads were then used to construct scaffolds with SSPACE (Boetzer et al., 2011), after which gaps were filled with the short-read data using GapFiller (Nadalin et al., 2012). We used CoGE SynMap (Lyons et al., 2008) and LASTZ (Kiełbasa et al., 2011) to detect and filter out redundant genomic regions (>98% sequence identity over >7 Kb) to generate the final draft assembly. The hybrid assembly resulted in a final draft assembly of 526.94 Mb in 2,566 scaffolds, with 50% of the total sequence captured in 156 scaffolds larger than 886,968 bp (N50).
Is it accurate?
To test the accuracy of the genome assembly, we applied classical Sanger sequencing to two BAC clones of 121.85 Kb and 61.50 Kb that contain gene loci for the homologs of two previously known Fagopyrum FLS coding sequences. The 121.85 Kb BAC clone (“29-J17”) contained a gene locus for FtFLS1 (NCBI GenBank ID: JF27561), while the 61.50 Kb BAC clone (“32-I01”) included a locus for a partial sequence of a putative FLS (GenBank ID: HM357805).
Both BAC clone sequences, assembled from contigs generated by Sanger sequencing, showed >99% sequence identity with their corresponding genomic regions in the draft genome assembly.
Gene prediction
We predicted gene models in the draft genome of F. tataricum cv. Daegwan by combining evidence from transcriptome and protein sequence alignments with ab initio prediction on repeat-masked genome sequences. GeneMark-ET (Lomsadze et al., 2014) was used to perform iterative training and to generate initial gene structures using RNA-Seq data. AUGUSTUS (Stanke and Morgenstern, 2005) was then used to perform de novo prediction with gene models trained by GeneMark-ET, together with exon-intron boundary information predicted from transcriptome and protein sequence alignments. We used TopHat (Trapnell et al., 2012) for RNA-Seq alignment and Exonerate (Slater and Birney, 2005) for protein sequence alignment with sequences from similar species. We annotated the deduced protein sequences through BLASTP searches with an e-value cutoff of 1e-10 against the NCBI non-redundant database and UniProt, and with InterProScan. The occurrence and frequency of repeats, including retrotransposons, DNA transposons, microsatellites, and other repeats, were screened using RepeatMasker (Tarailo-Graovac and Chen, 2009). The repeat-masked scaffolds were then used for gene prediction as described above.
Is it complete?
Compared to the draft genome of F. tataricum cv. Pinku, the draft genome of F. tataricum cv. Daegwan has a high number of annotated genes and duplicated BUSCOs. We are therefore currently working on improving the draft genome of F. tataricum cv. Daegwan.
Contacts
Yul Ho Kim (email: kimyuh77@korea.kr) Highland Agriculture Research Institute, National Institute of Crop Science, Rural Development Administration, Pyeongchang 25342, Korea
References:
Boetzer, M., Henkel, C.V., Jansen, H.J., Butler, D. and Pirovano, W. (2011) Scaffolding preassembled contigs using SSPACE. Bioinformatics, 27, 578–579. Available at: http://www.ncbi.nlm.nih.gov/pubmed/21149342
Chin, C.-S., Peluso, P., Sedlazeck, F.J., et al. (2016) Phased diploid genome assembly with single-molecule real-time sequencing. Nat Methods, 13, 1050–1054. Available at: http://www.ncbi.nlm.nih.gov/pubmed/27749838
Huang, J., Deng, J., Shi, T., et al. (2017) Global transcriptome analysis and identification of genes involved in nutrients accumulation during seed development of rice tartary buckwheat (Fagopyrum Tararicum). Sci Rep, 7, 1–14.
Kiełbasa, S.M., Wan, R., Sato, K., Horton, P. and Frith, M.C. (2011) Adaptive seeds tame genomic sequence comparison. Genome Res, 21, 487–493. Available at: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3044862&tool=pmcentrez&rendertype=abstract
Lomsadze, A., Burns, P.D. and Borodovsky, M. (2014) Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm. Nucleic Acids Res, 42, e119. Available at: https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gku557
Luo, R., Liu, B., Xie, Y., et al. (2012) SOAPdenovo2: an empirically improved memory efficient short-read de novo assembler. Gigascience, 1, 18. Available at: http://www.ncbi.nlm.nih.gov/pubmed/23587118
Lyons, E., Pedersen, B., Kane, J., et al. (2008) Finding and Comparing Syntenic Regions among Arabidopsis and the Outgroups Papaya, Poplar, and Grape: CoGe with Rosids. Plant Physiol, 148, 1772–1781.
Nadalin, F., Vezzi, F. and Policriti, A. (2012) GapFiller: a de novo assembly approach to fill the gap within paired reads. BMC Bioinformatics, 13 Suppl 14, S8. Available at: http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-13-S14-S8
Stanke, M. and Morgenstern, B. (2005) AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints. Nucleic Acids Res, 33, W465-7. Available at: https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gki458
Slater, G.S.C. and Birney, E. (2005) Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics, 6, 31. Available at: http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-6-31
Tarailo-Graovac, M. and Chen, N. (2009) Using RepeatMasker to Identify Repetitive Elements in Genomic Sequences. In Curr Protoc Bioinformatics. Hoboken, NJ, USA: John Wiley & Sons, Inc., p. Unit 4.10. Available at: http://www.ncbi.nlm.nih.gov/pubmed/19274634

Antheraea yamamai (Japanese Oak Silkmoth)
Overview
Antheraea yamamai, also known as the Japanese oak silk moth, is a wild species of silk moth. Silk produced by A. yamamai, referred to as tensan silk, differs drastically from the common silk produced by the domesticated silkworm, Bombyx mori. Silk moths can be categorized into two families: Bombycidae and Saturniidae. Saturniidae has been estimated to contain approximately 1,861 species in 162 genera and is known as the largest family in Lepidoptera. Among the many species in family Saturniidae, only a few, including A. yamamai, can be used for silk production. For whole genome sequencing, we selected one male sample (Ay-7-male1) from a breeding line (Ay-7) of A. yamamai raised at the National Academy of Agricultural Science, Rural Development Administration, Korea. A total of 147 Gb of genomic data and 76 Gb of transcriptomic data were generated for this study. We present the genome sequence of A. yamamai, the first published genome in family Saturniidae, with gene expression data collected from ten different body organ tissues.
Statistics
Genome
A total of 147 Gb was generated using the Illumina and PacBio sequencing platforms, corresponding to approximately 210-fold coverage based on the 700 Mb estimated genome size of A. yamamai.
The assembled genome of A. yamamai was 656 Mb (>2 Kb) in 3,675 scaffolds.
The N50 length of the assembly was 739 Kb, with a 34.07% GC ratio.
Identified repeat elements covered 37.33% of the total genome, and the completeness of the genome assembly was estimated to be 96.7% by BUSCO v2 analysis.
Loci
A total of 76 Gb of transcriptomic data was generated for this study.
A total of 21,124 genes were identified using EVidenceModeler based on the gene prediction results obtained from 3 different methods (ab initio, RNA-seq based, known-gene based).
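The fold coverage quoted in the statistics above follows directly from the sequenced bases and the estimated genome size. The sketch below reproduces that arithmetic; the function name `fold_coverage` is illustrative, and the calculation assumes that coverage is simply total bases divided by genome size, with no correction for trimmed or unmapped reads.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average sequencing depth: total sequenced bases per genome base."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Values quoted above: 147 Gb of genomic data, 700 Mb estimated genome size.
sequenced_bases = 147e9
estimated_genome_size = 700e6

print(f"{fold_coverage(sequenced_bases, estimated_genome_size):.0f}x")  # 210x
```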
Assembly
Before conducting genome assembly, we performed k-mer distribution analysis using a 350 bp paired-end library in order to estimate the size and characteristics of the A. yamamai genome.
(Figure: 19-mer distribution of the A. yamamai genome using a 350 bp paired-end library.)
In the 19-mer distribution analysis, the genome size of A. yamamai was estimated to be 709 Mb. Next, we conducted error correction on the Illumina paired-end libraries using the error correction module of ALLPATHS-LG before the initial contig assembly (ALLPATHS-LG, RRID:SCR_010742). After error correction, initial contig assembly with the 350 bp and 700 bp libraries was conducted using SOAPdenovo2 with the parameter option set at K=19; this approach showed the best assembly statistics compared to other assemblers and parameters (SOAPdenovo2, RRID:SCR_014986).
At each scaffolding step, SOAP GapCloser with the -l 155 and -p 31 parameters was used repeatedly to close the gaps within each scaffold.
After scaffolding with SSPACE-LongRead using Illumina synthetic long read data, the total number of assembled scaffolds was reduced from 398,446 to 24,558. The average scaffold length was also extended from 1.7 Kb to 24.8 Kb. However, the N50 length of the assembled scaffolds improved only modestly (from approximately 91 Kb to 112 Kb).
After final scaffolding using PacBio long reads, the number of scaffolds was reduced to 3,675 and the N50 length was extended from 112 Kb to 739 Kb.
Gene Prediction
Three different approaches were used for gene prediction in the A. yamamai genome: ab initio, RNA-seq transcript based, and protein homology based.
For RNA-seq transcript based prediction, transcriptome data generated from ten organ tissues of A. yamamai were aligned to the assembled genome and gene information was predicted using Cufflinks (Cufflinks, RRID:SCR_014597).
The longest CDS sequences were identified from the Cufflinks results using TransDecoder. For the homology-based approach, all known genes of order Lepidoptera in the NCBI database were aligned using PASA. The final gene set of the A. yamamai genome contains 21,124 genes.
The average gene length was 8,331 bp with a 38.76% GC ratio, and the number of exons per gene was 4.44. To identify the function of the predicted genes, Swiss-Prot, Uniref100, the NCBI NR database, and gene information of B. mori and D. melanogaster were used for sequence similarity searches with blastp.
Contacts
Seong-Ryul Kim (email: ksr319@korea.kr)
Seong-Wan Kim (email: tarupa@korea.kr)
Reference Publication
Kim SR, Kwak W, Kim H, Caetano-Anolles K, Kim KY, Kim SB, Choi KH, Kim SW, Hwang JS, Kim M, Kim I, Goo TW, Park SW. Genome sequence of the Japanese oak silk moth, Antheraea yamamai: the first draft genome in the family Saturniidae. Gigascience. 2018 Jan 1;7(1):1-11. doi: 10.1093/gigascience/gix113.

Chrysanthemum morifolium (chrysanthemum)
Overview
The chrysanthemum genome project was conducted through the National Agricultural Genome Program (NAGP) and the National Institute of Agricultural Sciences (NAS) Program, Republic of Korea. Chrysanthemum boreale (Asteraceae; Asteroideae; Anthemideae) is a perennial plant native to Eastern Asia and has been used for ornamental and herbal purposes. C. boreale is a diploid species (2n=2x=18) and is closely related to the commercially cultivated species Chrysanthemum morifolium, which has a large hexaploid genome (2n=6x=54). C. boreale is therefore used as a model plant in the study of cultivated chrysanthemum. The genome was sequenced using PacBio single-molecule real-time long reads and was scaffolded into nine chromosome sequences using the Bionano optical map and the Hi-C contact map. Genome coverage is high according to the mapping rate of Illumina short reads and BUSCO analysis, although the Bionano optical map introduced large gaps during scaffolding, as shown by the total length of the contigs (3.415 Gb) versus the scaffolds (5.524 Gb). The assembled genome corresponds to the original sequences, including the large gaps and the sequence tag (CTTAAG) generated from the Bionano map. For this reason, the statistics below were calculated in two ways, without and with gaps.
Statistics
Genome
Approximately 3.237 Gb is assembled in 9 chromosomes, with 0.181 Gb of unmapped scaffolds (w/o gap).
Approximately 4.831 Gb is assembled in 9 chromosomes, with 0.692 Gb of unmapped scaffolds (with gap).
Scaffold N50 = 379 Mb, Longest scaffold = 401 Mb (w/o gap)
Scaffold N50 = 557 Mb, Longest scaffold = 610 Mb (with gap)
Contig N50 = 218 kb, Longest contig = 3.232 Mb
Organelle genome
The complete chloroplast genome is 151,012 bp in size (NCBI ID MG913594) and harbors 114 unique genes, including 80 protein-coding genes, 4 rRNA genes, and 30 tRNA genes. The complete mitochondrial genome is 211,002 bp in size (NCBI ID MH004292) and harbors 58 genes, including 35 protein-coding genes, 3 rRNA genes, and 20 tRNA genes.
References
So Youn Won, Jae-A Jung, and Jung Sun Kim (2018) The complete chloroplast genome of Chrysanthemum boreale (Asteraceae). Mitochondrial DNA Part B: Resources 3:549-550.
So Youn Won, Jae-A Jung, and Jung Sun Kim (2018) The complete mitochondrial genome sequence of Chrysanthemum boreale (Asteraceae). Mitochondrial DNA Part B: Resources 3:529-530.
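The overview above attributes the difference between the contig total (3.415 Gb) and the scaffold total (5.524 Gb) to gaps introduced during Bionano scaffolding. The sketch below works out how large that difference is. It assumes that everything in the scaffolds beyond the contig sequence is gap or optical-map tag sequence, which is an interpretation of the text and not a figure reported by the project; the function name `gap_summary` is illustrative.

```python
from typing import Tuple


def gap_summary(contig_total_gb: float, scaffold_total_gb: float) -> Tuple[float, float]:
    """Return (gap size in Gb, gap share of scaffold length in percent)."""
    if scaffold_total_gb <= 0 or contig_total_gb > scaffold_total_gb:
        raise ValueError("expected 0 < contig total <= scaffold total")
    gap_gb = scaffold_total_gb - contig_total_gb
    return gap_gb, 100 * gap_gb / scaffold_total_gb


# Totals quoted in the overview above.
gap_gb, gap_pct = gap_summary(contig_total_gb=3.415, scaffold_total_gb=5.524)
print(f"{gap_gb:.3f} Gb of gap, {gap_pct:.1f}% of scaffold length")
# 2.109 Gb of gap, 38.2% of scaffold length
```

This is why the statistics are reported both with and without gaps: more than a third of the scaffold length, under the stated assumption, is not contig sequence.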

Perilla frutescens (Perilla)
Overview
The Perilla citriodora genome project was initiated by the National Agricultural Genome Project (NAGP) of the Rural Development Administration (RDA). The genomic sequence of diploid perilla (Perilla citriodora) was produced and assembled at scale using Illumina and PacBio data, and the quality of the assembly was confirmed with BAC sequences and Illumina short reads. Subsequently, the assembly was reconfirmed and analyzed by Hi-C grouping, and the diploid perilla genome size was estimated by k-mer analysis. For de novo assembly, PacBio contigs were scaffolded with Illumina mate-pair reads. The contigs were then split and corrected by Hi-C analysis to determine the pseudomolecules of diploid perilla. The final genes were predicted at the pseudomolecule level, and the quality of the assembly was assessed by BLAST2GO and BUSCO analysis.
Statistics
Genome
Approximately 677.6 Mb arranged in 1,622 scaffolds
Approximately 676.0 Mb arranged in 2,802 contigs
Scaffold N50 (L50) = 12.3 Mbp
Contig N50 (L50) = 1,672.9 Kbp
78 scaffolds larger than 1 Mbp, with 95.6% of the genome (1,622 scaffolds)
10 assembled pseudomolecules = 638.7 Mbp
Protein-coding gene
43,175 protein-coding genes, 43,664 transcripts, and 36,015 CDS have been predicted
Sequencing, Assembly, and Annotation
NGS whole genome sequencing methodology
To exploit the strengths of each NGS instrument, high-quality data were produced on three types of sequencers (HiSeq 2500, MiSeq, HiSeq 2000), with insert lengths varied across a total of 36 libraries. In total, 1,198.9 Gb was produced, and the genome size of diploid perilla (P. citriodora) was estimated by k-mer analysis to be approximately 650 Mbp.
First, MiSeq reads were merged using FLASH, contig assembly was performed with the MiSeq reads, and scaffolding was performed with HiSeq mate-pair reads. PacBio sequencing was then performed to overcome the limitations of Illumina short reads by using long reads and to improve scaffold quality. After sequence trimming, the average read quality was 0.83, the average read length was 11,855 bp, the total read bases were 49,142,181,846 bp, the number of reads was 4,145,022, and the read N50 was 16,794 bp. In the sequence analysis of P. citriodora, the total subread bases were 48,959,646,854 bp, the number of subreads was 7,318,606, and the average subread length was 6,689 bp.
Hi-C library and raw sequence data
The Hi-C library was made from diploid perilla leaflets, and a Bioanalyzer was used for library QC. The completed library was about 200-400 bp in size, centered at about 300 bp. The completed Hi-C library was sequenced twice with 80 bp paired-end reads on a HiSeq 2500, producing about 112.7 million raw read pairs (225.3 million reads). A total of 18 Gb of raw data was generated and used. Raw data QC was performed with FastQC, covering per-base sequence quality, per-sequence quality, per-sequence GC content, and sequence length distribution.
Clustering
The first draft genome of diploid perilla was produced by scaffolding the PacBio assembly with Illumina mate-pair data, and consists of a total of 1,622 scaffolds and 2,802 contigs. Hi-C raw data were mapped to this draft assembly, and contig clustering was attempted using the LACHESIS program. Of the 2,802 contigs, 2,098 were clustered into 10 Hi-C groups, with a total length of 666.6 Mb. This corresponds to 98.6% of the total length of the 2,802 contigs (676,030,824 bp) and 74.9% of the number of contigs. The remaining 704 contigs, about 25% of the total number, could not be clustered because of their small size; their average length was about 14 kb.
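The clustering percentages above can be checked from the reported counts and lengths. The following is a small sketch of that arithmetic using only the figures quoted in the text; the variable and function names are illustrative, and the clustered length is taken as the rounded value of 666.6 Mb, so the last digit of the length percentage depends on that rounding.

```python
def percent(part: float, whole: float) -> float:
    """Share of `part` in `whole`, expressed in percent."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100 * part / whole


# Figures quoted in the clustering description above.
total_contigs = 2802
clustered_contigs = 2098
total_length_bp = 676_030_824
clustered_length_bp = 666.6e6  # reported as 666.6 Mb (rounded)

unclustered_contigs = total_contigs - clustered_contigs

print(unclustered_contigs)                                      # 704
print(round(percent(clustered_contigs, total_contigs), 1))      # 74.9
print(round(percent(clustered_length_bp, total_length_bp), 1))  # 98.6
print(round(percent(unclustered_contigs, total_contigs), 1))    # 25.1
```

The contrast between 74.9% by number and 98.6% by length shows that the unclustered contigs are overwhelmingly short ones.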
Ordering/orientation
The clustered contigs were ordered and oriented to produce a pseudomolecule for each cluster. Finally, 2,065 contigs were ordered and oriented into scaffolds for 10 chromosome units, covering 98.4% of the clustered contigs by number and ~100% by length (73.4% of all input contigs by number and 98.6% by length). 33 contigs were clustered but could not be ordered or oriented; together with the 704 unclustered contigs, 737 contigs were excluded from the final Hi-C assembly. Their total length is 10 Mb, which is less than 1.5% of the total assembly. To validate the pseudomolecule sequences, 10 full BAC insert sequences were compared and found to be more than 99.8% identical, and about 90% agreement was confirmed in pseudomolecule QC by insert size prediction using paired BAC end sequences.
Gene prediction
For protein-coding gene prediction, Seqping was run with major transcript sequences, perilla de novo repeat sequences generated by RepeatModeler, NCBI plant RefSeq protein sequences, and the Gypsy Database (GyDB). As a result, 43,175 genes, 43,664 transcripts, and 36,015 CDS were predicted.
Pseudomolecule construction
10 pseudomolecules of diploid perilla were constructed through Hi-C analysis and a genetic map.
Contacts
Tae-Ho Kim (Email: thkim@rda.go.kr)
Myoung-Hee Lee (Email: emhee@korea.kr)
Jeong-Hee Lee (Email: jhlee@seeders.co.kr)
Hong-Il Ahn (Email: ahi0101@korea.kr)
Sun-Hwa Bae (Email: bae209@korea.kr)
Reference Publication(s)
Yun-Joo Kang, Bo-Mi Lee, Moon Nam, Ki-Won Oh, Myoung-Hee Lee, Tae-Ho Kim, Sung-Hwan Jo & Jeong-Hee Lee, Identification of quantitative trait loci associated with flowering time in perilla using genotyping by sequencing, Molecular Biology Reports, 2019; 46:4397-4407
Myoung Hee Lee, Ki Won Oh, Myung Sik Kim, Sung Up Kim, Jung In Kim, Eun Young Oh, Suk Bok Pae, Un Sang Yeo, Tae-Ho Kim, Jeong Hee Lee, Chan Sik Jung, Do Yeon Kwak, and Yong Chul Kim, Detection of QTLs in an Interspecific Cross between Perilla citriodora × P. hirtella Mapping Population, Korean J. Breed. Sci., 2018; 50(1):13-20
Kyeong-Seong Cheon, In-Seon Jeong, Kyung-Hee Kim, Myoung-Hee Lee, Tae-Ho Lee, Jeong-Hee Lee, Ung-Han Yoon, Romika Chandra, Ye-Ji Lee, Tae-Ho Kim, Comparative SNP Analysis of Chloroplast Genomes and 45S nrDNAs Reveals Genetic Diversity of Perilla Species, Plant Breed. Biotech., 2018; 6(2):125-139
Ji-Su Mo, Kyunghee Kim, Myoung Hee Lee, Jeong-Hee Lee, Ung-Han Yoon & Tae-Ho Kim, The complete chloroplast genome sequence of Perilla citriodora (Makino) Nakai, Mitochondrial DNA Part A, 2017; 28(1):131–132
Ji-Eun Kim, Junkyoung Choe, Woo Kyung Lee, Sangmi Kim, Myoung Hee Lee, Tae-Ho Kim, Sung-Hwan Jo, Jeong Hee Lee, De novo gene set assembly of the transcriptome of diploid, oilseed-crop species Perilla citriodora, J Plant Biotechnol, 2016; 43:293–301

Senna tora cv. Myeongyun
Overview
Senna tora (L.) Roxb. (Cassia tora), a member of Leguminosae (subfamily Caesalpinioideae), is a semi-wild annual herb widely grown in tropical and subtropical regions around the world (https://ildis.org). S. tora is a rich source of anthraquinones, flavonoids, and polysaccharides. Its seeds are therefore widely used medicinally for gastrointestinal disorders, skin treatment, and ailments ranging from simple cough and hypertension to diabetes. Despite its useful applications, there have been few reports of molecular and genomic studies of S. tora. To elucidate the genes responsible for anthraquinone biosynthesis in S. tora, the genome project was initiated through the National Agricultural Genome Program (NAGP) and the National Institute of Agricultural Sciences (NAS) Program.
Statistics
Genome
Approximately 502 Mb is assembled in 13 chromosomes, with 23.8 Mb of sequence in unmapped scaffolds.
Approximately 526.4 Mb arranged in 732 contigs
Scaffold N50 = 41.7 Mb, Longest scaffold = 52.7 Mb
Contig N50 = 4.03 Mb, Longest contig = 14.9 Mb
Loci
45,268 protein-coding genes have been predicted
Organelle genomes
The complete chloroplast genome of S. tora is 162,426 bp in size (Accession no. NC030193). The chloroplast genome harbors 110 annotated genes, including 77 protein-coding genes, 30 tRNA genes, and 4 rRNA genes. The complete mitochondrial genome of S. tora is 566,589 bp in length (Accession no. MF358693). A total of 63 genes are annotated, including 36 protein-coding genes, 22 tRNA genes, and 5 rRNA genes.
References
Sang-Ho Kang, So Youn Won and Chang-Kug Kim (2019) The complete mitochondrial genome sequences of Senna tora (Fabales: Fabaceae). Mitochondrial DNA Part B: Resources