Transcriptomics of loquat leaves and analysis of triterpenoid acid genes

Online: 2026/5/21 16:00:29

2026 No.5

Authors: HUANG Jianjun, GAO Weicheng, WANG Xiaoping

Keywords: Loquat leaves; High-throughput sequencing; Triterpene acid; Gene analysis

DOI: 10.13925/j.cnki.gsxb.20250459

【Objective】This study aimed to establish a comprehensive transcriptome database for loquat leaves using high-throughput sequencing, combined with genomic, transcriptomic, and metabolomic data. Differential expression of triterpene acid-related genes in the sesquiterpene and triterpene biosynthesis pathways was compared between leaves of Zaozhong and Dazhong loquat to identify the superior variety.【Methods】Fresh tender leaves of Dazhong and Zaozhong were flash-frozen in liquid nitrogen and stored at -80 ℃ until use. NanoDrop was used to sequence and construct cDNA libraries for Zaozhong and Dazhong, and the transcriptome data were statistically analyzed. Trinity was used for assembly, and CD-HIT was employed to cluster sequences by similarity and remove redundant sequences. After redundancy removal, Corset was used to aggregate transcripts into clusters based on shared reads between transcripts. Using the expression levels of transcripts across samples and the H-Cluster algorithm, transcripts with expression differences between samples were separated from their original clusters to form new clusters, with each final cluster defined as a "gene". This approach aggregates redundant transcripts and improves the detection rate of differentially expressed genes. Unigenes were annotated against the Swiss-Prot, NR, Pfam, GO, KEGG, and eggNOG databases. The annotation process was as follows: BLASTX was used to search unigenes for homologous sequences in the Swiss-Prot database. TransDecoder was used to predict the CDS regions of unigenes and translate them into the corresponding amino acid sequences. For the amino acid sequences, BLASTP was used to search for homologous sequences in the Swiss-Prot and NR databases, and HMMER was used to identify protein domains in the Pfam database.
GO annotation results were obtained from the Swiss-Prot database, and Pfam2GO was used to convert Pfam annotation results into the corresponding GO terms. Amino acid sequences were submitted to the online tools GhostKOALA and eggNOG-mapper for KEGG and eggNOG annotation, respectively. Expression levels were reported as raw read counts and TPM. The raw read count is the number of reads assigned to a transcript, but it is affected by sequencing depth and gene length, which makes it unsuitable for comparing gene expression between samples. Therefore, counts were normalized for sequencing depth and gene length, and the resulting TPM values were used for subsequent analysis. Hierarchical clustering was performed on the expression patterns of all genes, and heatmaps were used to present the clustering results. PCA was performed on the gene expression levels of the samples, and Pearson correlation coefficients between samples were calculated to assess reproducibility within each group. Differential expression analysis with replicate samples was conducted using DESeq2. Significantly differentially expressed genes were selected using the thresholds padj < 0.05 and |log2FoldChange| > 1. KEGG pathway enrichment analysis was performed with the KEGG pathway as the unit, applying the hypergeometric test to identify pathways that were significantly enriched relative to the whole-genome background. Volcano plots were used to summarize the significantly differentially expressed genes between Zaozhong and Dazhong.【Results】The assembled loquat leaf transcriptome had an N50 of 1681 bp and an average length of 1 166.52 bp, with a total of 117 921 unigene sequences. Among the annotation databases, GO had the highest number of successfully annotated transcripts, followed by Swiss-Prot, while KEGG had the fewest.
Specifically, in the NR database, 52 015 unigenes were annotated, accounting for 44.11% of the total, based on the similarity of the loquat leaf transcriptome to gene sequences of closely related species. In the GO database, 57 430 unigenes were successfully annotated, accounting for 48.70% of the total, and were categorized into three classes: biological process, cellular component, and molecular function. In the KEGG database, large numbers of annotations fell in the global and overview maps, carbohydrate metabolism, transformation, and transport and catabolism pathways. In the eggNOG database, 17 440 unigenes were annotated, accounting for 14.79% of the total. The TPM expression distribution and clustering map showed that the six samples had similar expression distributions, with differentially expressed genes accounting for only a small proportion of all genes. Genes with similar expression patterns may share functions or participate in common metabolic and signaling pathways. The PPYD1-1, PPYD1-2, and PPYD1-3 samples clustered closely in the PCA plot, as did the PPYZ1-1 and PPYZ1-2 samples, indicating high similarity within each group. According to the KEGG enrichment analysis, 6137 differentially expressed unigenes between Zaozhong and Dazhong were mapped to 128 metabolic pathways. Among these, the most enriched pathway was plant-pathogen interaction, followed by MAPK signaling pathway-plant, and the least enriched pathway was cutin, suberine and wax biosynthesis.
Ten unigenes were involved in sesquiterpene and triterpene biosynthesis, of which 3 were significantly differentially expressed (3 upregulated and 0 downregulated).【Conclusion】Analysis of the expression differences between Zaozhong and Dazhong identified three differentially expressed genes, and the contents of five major triterpene acids were determined by HPLC. The content of oleanolic acid in Zaozhong was slightly higher than that in Dazhong, while the contents of the other four triterpene acids were all lower than those in Dazhong. The triterpene acid synthesis pathway may involve complex regulatory processes, and regulating the expression of each step could have both positive and negative effects on metabolite content.
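
The three quantitative steps named in the Methods (TPM normalization, the padj and log2FoldChange thresholds, and the hypergeometric enrichment test) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used DESeq2 and dedicated enrichment tools; the function names `tpm`, `is_significant`, and `enrichment_pvalue` and all example numbers are hypothetical and serve only to show the arithmetic.

```python
from math import comb


def tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to TPM (transcripts per million)."""
    # Correct for gene length first: reads per kilobase of transcript.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    # Then correct for sequencing depth: scale so the sample sums to one million.
    return [r / total * 1e6 for r in rpk]


def is_significant(padj, log2_fold_change, alpha=0.05, lfc_cutoff=1.0):
    """Thresholds stated in the abstract: padj < 0.05 and |log2FoldChange| > 1."""
    return padj < alpha and abs(log2_fold_change) > lfc_cutoff


def enrichment_pvalue(n_background, n_in_pathway, n_deg, n_deg_in_pathway):
    """Upper-tail hypergeometric probability P(X >= n_deg_in_pathway).

    n_background: annotated genes in the background set
    n_in_pathway: background genes belonging to the pathway
    n_deg: differentially expressed genes drawn from the background
    n_deg_in_pathway: differentially expressed genes in the pathway
    """
    upper = min(n_in_pathway, n_deg)
    favourable = sum(
        comb(n_in_pathway, i) * comb(n_background - n_in_pathway, n_deg - i)
        for i in range(n_deg_in_pathway, upper + 1)
    )
    return favourable / comb(n_background, n_deg)


if __name__ == "__main__":
    # Toy values only: three genes whose counts are proportional to length
    # receive identical TPM values.
    print(tpm([10, 20, 30], [1000, 2000, 3000]))
    print(is_significant(padj=0.01, log2_fold_change=1.5))
    print(enrichment_pvalue(10, 3, 3, 3))
```

Because TPM divides by gene length before scaling by depth, the values within a sample always sum to one million, which is what makes expression distributions comparable across the six samples. `tpm` assumes at least one nonzero count, and `enrichment_pvalue` requires Python 3.8 or later for `math.comb`.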
