ORIGINAL RESEARCH article

Front. Plant Sci., 27 January 2023
Sec. Plant Breeding
This article is part of the Research Topic Recent Advances in Tree Genetics and Genomics: Where We Stand and Where to Go?

Genetic diversity and structure of the 4th cycle breeding population of Chinese fir (Cunninghamia lanceolata (Lamb.) Hook)

Yonglian Jing1, Liming Bian1*, Xuefeng Zhang1, Benwen Zhao1, Renhua Zheng2, Shunde Su2, Daiquan Ye3, Xueyan Zheng3, Yousry A. El-Kassaby4, Jisen Shi1
  • 1State Key Laboratory of Tree Genetics and Breeding, Co-Innovation Center for Sustainable Forestry in Southern China, College of Forestry, Nanjing Forestry University, Nanjing, China
  • 2Key Laboratory of Timber Forest Breeding and Cultivation for Mountainous Areas in Southern China, Fujian Academy of Forestry Science, Fuzhou, China
  • 3Department of Tree Improvement, Yangkou State-owned Forest Farm, Nanping, China
  • 4Department of Forest and Conservation Sciences, Faculty of Forestry, The University of British Columbia, Vancouver, BC, Canada

Studying population genetic structure and diversity is crucial for the marker-assisted selection and breeding of coniferous tree species. In this study, using RAD-seq technology, we developed 343,644 high-quality single nucleotide polymorphism (SNP) markers to resolve the genetic diversity and population genetic structure of 233 Chinese fir selected individuals from the 4th cycle breeding program, representing different breeding generations and provenances. The genetic diversity of the 4th cycle breeding population was high with nucleotide diversity (Pi) of 0.003, and Ho and He of 0.215 and 0.233, respectively, indicating that the breeding population has a broad genetic base. The genetic differentiation level between the different breeding generations and different provenances was low (Fst < 0.05), with population structure analysis results dividing the 233 individuals into four subgroups. Each subgroup has a mixed branch with interpenetration and weak population structure, which might be related to breeding rather than provenance, with aggregation from the same source only being in the local branches. Our results provide a reference for further research on the marker-assisted selective breeding of Chinese fir and other coniferous trees.

1 Introduction

Cunninghamia lanceolata (Lamb.) Hook, of the genus Cunninghamia in the Cupressaceae family (2n = 22), is a Quaternary ice age relict species and is considered one of the most economically important timber species in southern China. The species is widely distributed in 17 provinces and autonomous regions of China and has rich genetic diversity (Bian et al., 2014). The species has been under cultivation for over 3,000 years and currently covers ~10 million hectares, accounting for 17.3% of the dominant tree species in China’s plantation forests. Genetic improvement activities of Chinese fir started in the 1950s, mostly through conventional breeding. At present, the Chinese fir breeding program is in its 4th breeding cycle, which is characterized by the selection and establishment of the 4th cycle breeding population. Phenotypic variation of a multitude of biological traits of Chinese fir is known to be affected by both climate and geography. However, information regarding the neutral variation of molecular markers remains scant (Bian et al., 2014). It is anticipated that the use of molecular markers in the Chinese fir breeding program will help resolve the species’ genetic structure and diversity across populations and ultimately help in the implementation of marker-assisted selective breeding (Zheng et al., 2015; He et al., 2021).

A species’ breeding population represents the core material for genetic improvement. It is often used to generate a structured pedigree for genetic evaluation, mainly by implementing a specific mating design among the population’s members. To prevent genetic variability erosion in the Chinese fir 4th cycle breeding population, rigorous genetic diversity assessment is required. The extent of genetic diversity within a population determines its resilience to unexpected environmental contingencies and its successful reproduction and recruitment. Thus, the assessment of genetic diversity and population genetic structure is important for the effective conservation and utilization of coniferous tree populations as well as for the thorough development of their breeding programs (Cai et al., 2020).

Analysis of genetic diversity and population structure of forest tree populations has been mostly based on molecular genetic markers, such as random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), and simple sequence repeat (SSR) (Chung et al., 2004; Cao et al., 2012; Duan, 2014; Yang et al., 2018). SSR and RAPD markers were used to analyze the genetic diversity of the first three Chinese fir breeding populations (1st, 2nd, and 3rd cycles) (Li, 2001; Li et al., 2007; Ouyang et al., 2014; Li et al., 2017), and high levels of population genetic diversity were reported. Recently, the use of SNP markers has become common due to their stability, high resolution, wide distribution, and strong differentiation between germplasms (Xia et al., 2019; Zheng et al., 2019). Using the specific-locus amplified fragment sequencing (SLAF-seq) technique, Zheng et al. (2019) developed a genome-wide SNP panel for 221 Chinese fir clones. However, Picea abies was used as the reference genome for the SNP selection.

High-throughput sequencing technologies generate substantial high-density SNP information, thereby offering opportunities for the development of new strategies for population genetics research. Among them, simplified genome sequencing technologies are widely used as they are free from the “reference genome” constraints. These include restriction-site associated DNA sequencing (RAD-seq) (Bus et al., 2012), 2b-RAD sequencing based on RAD-seq (Wang et al., 2012), complexity reduction of polymorphic sequences (CRoPS) (Altshuler et al., 2000), specific-locus amplified fragment sequencing (SLAF-seq) (Sun et al., 2013), genotyping-by-sequencing (GBS) (Elshire et al., 2011), and reduced representation libraries sequencing (RRLS) (Van Tassell et al., 2008). RAD-seq technology has proven to be an effective sequencing technology for obtaining genome-wide genomic information at low cost and has been extensively used without dependence on a “reference genome” (Miller et al., 2007; Zhou et al., 2018; Brandrud et al., 2019). RAD-seq simplified sequencing technology is widely used for plant and animal marker development, population structure analysis, and high-density genetic mapping (Emerson et al., 2010; Catchen et al., 2011; Lexer et al., 2014; Lozier, 2014; Zhou et al., 2016). Although RAD-seq technology is promising, it has not yet been widely used in the population genetic diversity and genetic structure analyses of Chinese fir.

Here, we used the Chinese fir 4th cycle breeding population as the study material to develop high-quality SNP markers based on RAD-seq simplified genome technology. We expect that this development will not only help elucidate the genetic structure and diversity of the Chinese fir advanced generation breeding population, but also provide a theoretical basis and reference for the development and establishment of breeding populations and the parental selection of seed orchards.

2 Materials and methods

2.1 Plant material

The Chinese fir 4th cycle genetic improvement population was initially selected for fast growth, high wood quality, and disease resistance. Individuals in this population were selected over three cycles of intensive genetic evaluation and were benchmarked against natural stands’ seedstock. The population comprises 233 individuals selected for the above-mentioned attributes along with the added knowledge of their flowering propensity (data generated from three years of observations post grafting). According to the genealogical records, these individuals can be divided into four generations: 1st (n=43), 2nd (n=141), 3rd (n=38), and 4th (n=11) (Supplementary Table 1), and they cover eight geographical Chinese fir origins (Figure 1).

Figure 1 Origin of the 233 Chinese fir germplasm. ①~⑦ represent the seven Chinese fir provenances (Fujian, Hunan, Sichuan, Jiangxi, Guangdong, the boundary of Hunan, Guizhou and Guangxi, and the boundary of Shaanxi, Henan and Hubei). The mixed sources are not shown in the figure.

2.2 DNA extraction and sequencing

Current year fresh needles were collected from 233 Chinese fir trees of the 4th cycle selection population growing in the Yangkou State-owned Forest Farm (Fujian Province) and then preserved on dry ice. DNA was extracted using the Tiangen Biotech kit (DP-320-02), and its quality was checked using Qubit (Thermo Fisher Scientific, Waltham, MA) and Nanodrop (Thermo Fisher Scientific, Waltham, MA) with TE buffer as the blank. DNA purity and integrity were checked using 1% polyacrylamide gel electrophoresis.

Sequencing libraries of the 233 Chinese fir germplasm were constructed using the RAD-seq simplified genome sequencing technology. The quality-checked genomic DNA was enzymatically digested with EcoRI, and samples were paired-end sequenced on the Illumina HiSeq 2500 platform to an average depth of 10×. A reference genome was assembled de novo from the simplified genome sequencing data of the 233 genotypes using the Stacks (Version 1.46) software (Catchen et al., 2011). Raw sequencing data containing adapter information, low-quality bases, and other information that interferes with downstream analysis were removed to ensure proper data analysis. The FASTP (Version 0.18.0) software (Chen et al., 2018) was used for filtering with the following criteria: 1) removal of sequences lacking the EcoRI restriction site; 2) removal of low-quality reads (reads in which bases with quality Q ≤ 20 accounted for over 50% of the entire read); 3) elimination of reads containing adapter information; and 4) exclusion of reads with an N ratio > 10%.
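Criteria 2 and 4 above can be illustrated with a minimal Python sketch. This is not the FASTP implementation; the function name keep_read and its parameters are hypothetical, and only the thresholds (Q ≤ 20 for over 50% of bases, N ratio > 10%) are taken from the text. Restriction-site and adapter checks (criteria 1 and 3) are omitted.

```python
from typing import Sequence


def keep_read(seq: str, quals: Sequence[int],
              low_q_threshold: int = 20,
              max_low_q_frac: float = 0.5,
              max_n_frac: float = 0.10) -> bool:
    """Return True if a read passes the low-quality and N-ratio checks.

    seq   : base calls of one read
    quals : Phred quality scores, one per base (assumed already decoded)
    """
    if not seq or len(seq) != len(quals):
        return False  # malformed record: treat as failing
    n = len(seq)
    # Criterion 2: discard if bases with Q <= 20 exceed 50% of the read.
    low_q = sum(1 for q in quals if q <= low_q_threshold)
    if low_q / n > max_low_q_frac:
        return False
    # Criterion 4: discard if the proportion of N bases exceeds 10%.
    if seq.upper().count("N") / n > max_n_frac:
        return False
    return True
```

In practice a dedicated tool handles these steps together with adapter trimming; the sketch only makes the two numeric thresholds explicit.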

2.3 High-quality SNP marker development

The BWA-MEM algorithm of the Burrows-Wheeler Aligner (v0.7.16a-r1181) software (Li and Durbin, 2009) was used to align the high-quality reads of each sample to the assembled population tags, and the variant detection software GATK (McKenna et al., 2010) was used for population SNP detection. Using the Plink software (Purcell et al., 2007), the initial SNPs were screened based on the following criteria: 1) indels were removed; 2) only biallelic SNPs were retained; 3) Hardy-Weinberg equilibrium (HWE) was met; 4) linkage disequilibrium (LD) between loci was < 0.2; and 5) to compare the differences in genetic diversity parameters under different filtering criteria, three sets of minor allele frequency (MAF) and call rate criteria were applied: ① MAF > 0.01, call rate > 0.9; ② MAF > 0.05, call rate > 0.8; and ③ MAF > 0.05, call rate > 0.9. Finally, high-quality SNPs were obtained for genetic diversity analysis.
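The MAF and call rate thresholds in criterion 5 can be sketched as follows. This is an illustration only, not the Plink procedure: the function filter_snps and the genotype coding (0/1/2 copies of the alternative allele, -1 for a missing call) are assumptions, and the HWE and LD filters are not covered.

```python
import numpy as np


def filter_snps(genotypes, min_maf=0.05, min_call_rate=0.9):
    """Return a boolean mask of SNPs passing MAF and call-rate thresholds.

    genotypes : array of shape (individuals, SNPs), coded 0/1/2,
                with -1 marking a missing call (assumed coding).
    """
    g = np.asarray(genotypes)
    called = g >= 0
    n_called = called.sum(axis=0)
    call_rate = n_called / g.shape[0]
    # Alternative-allele frequency among called genotypes only.
    alt_count = np.where(called, g, 0).sum(axis=0)
    alt_freq = alt_count / np.maximum(2 * n_called, 1)
    maf = np.minimum(alt_freq, 1.0 - alt_freq)
    return (maf > min_maf) & (call_rate > min_call_rate)
```

The three criteria sets in the text correspond to calling this with (0.01, 0.9), (0.05, 0.8), and (0.05, 0.9) and comparing the diversity parameters computed on each retained marker set.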

2.4 Data analysis

Using the Plink software (Purcell et al., 2007), genetic diversity parameters, including observed (Ho) and expected heterozygosity (He) and the inbreeding coefficient (F), were estimated. To equalize the sample size of each population, clusters with larger sample sizes (e.g., G2, FJ) were randomly subsampled repeatedly, the genetic diversity parameters were calculated for each subsample, and the mean was then estimated. The Vcftools software (https://vcftools.github.io/man_latest.html) was used to calculate 1) nucleotide diversity (Pi); 2) the numbers of transitions (Ts) and transversions (Tv); and 3) Ts/Tv values. The R package StAMPP’s stamppAmova (https://rdrr.io/cran/StAMPP/man/stamppAmova.html) and PopGenome (https://cran.r-project.org/web/packages/PopGenome/index.html) were used for the analysis of molecular variance (AMOVA) and for estimating the genetic differentiation indices both between the four generations and between the different geographical origins. The SNP data were used to construct a phylogenetic tree for the 4th cycle breeding population using the MEGA 6 software (Tamura et al., 2013), applying the neighbor-joining method with the Kimura 2-parameter model and 1,000 bootstrap replicates. The ped format file was first exported by the Plink software, and then the Admixture (Version 1.3.0) software (Alexander et al., 2009) was used to calculate the Q values and determine the final population structure. The number of subgroups (K) was assumed to range between 1 and 9, and the K value with the lowest cross-validation error rate was taken as the optimal number of subgroups. A Q value > 0.6 indicated a single source and pure genetic background, while a Q value < 0.6 indicated a mixed source and complex genetic background. The smartpca module of the EIGENSOFT software (https://www.hsph.harvard.edu/alkes-price/software/) was used for principal component analysis (PCA). 
The above graphs were visualized using the R software.
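To make the heterozygosity parameters concrete, the following Python sketch computes population-level Ho, He, and F = 1 − Ho/He from a genotype matrix. It is a simplified illustration, not the Plink calculation (which may differ, for example by reporting F per individual); the function name and the 0/1/2 coding with -1 for missing calls are assumptions, and every SNP is assumed to have at least one called genotype and the set to be polymorphic.

```python
import numpy as np


def heterozygosity(genotypes):
    """Return (Ho, He, F) averaged over SNPs.

    genotypes : (individuals, SNPs) array coded 0/1/2, -1 = missing.
    """
    g = np.array(genotypes, dtype=float)  # copy so the input is untouched
    g[g < 0] = np.nan
    n_called = (~np.isnan(g)).sum(axis=0)
    # Alternative-allele frequency per SNP from called genotypes.
    p = np.nansum(g, axis=0) / (2 * n_called)
    # Observed heterozygosity: share of heterozygous calls per SNP.
    ho_per_snp = (g == 1).sum(axis=0) / n_called
    # Expected heterozygosity under Hardy-Weinberg: 2pq per SNP.
    he_per_snp = 2 * p * (1 - p)
    ho = float(ho_per_snp.mean())
    he = float(he_per_snp.mean())
    return ho, he, 1.0 - ho / he
```

Under this simple definition, a population with Ho below He yields a positive F, which is the pattern described as a heterozygote deficit.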

3 Results

3.1 High quality SNP marker development

After RAD-seq sequencing, as shown in Supplementary Tables 2 and 3, we obtained 3,145.8 Gb of data from the 233 individuals, with 9.9–19.6 Gb per sample (average of 13.5 Gb) and an average read length of 146 bp. The average sequencing depth at high-quality SNP markers was 5.5× per sample (Supplementary Table 4). After quality control, we retained a total of 3,075.3 Gb (97.8% of the raw data), with 9.4–19.4 Gb per sample (average of 13.2 Gb). The overall sequencing quality was high (Q20 ≥ 97.27%, Q30 ≥ 92.15%), and the GC content was stable (36.61–37.71%, with an average of 37.10%), which met the requirements of subsequent analyses. After removing overlapping sequences, there were 2,188,278 contigs. The total length of the assembled reference genome sequence was 1.11 Gb, with an average contig length of 509 bp, a maximum length of 2,211 bp, an N50 of 539 bp, an N90 of 406 bp, and a GC content of 37.01%. The reference genome was compared with the Picea abies genome (http://congenie.org/) and showed an 80.48% match, suggesting that the RAD-seq data were sufficiently reliable for downstream analysis.
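The N50 and N90 statistics reported above follow a standard definition that can be sketched in a few lines of Python. The function name nx is hypothetical and the contig lengths in the usage note are toy values, not data from this study.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx contig length.

    Contigs are sorted from longest to shortest; Nx is the length of the
    contig at which the running total first reaches `fraction` of the
    summed length (fraction=0.5 gives N50, fraction=0.9 gives N90).
    """
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= target:
            return length
    raise ValueError("lengths must be non-empty")
```

For the toy list [2, 2, 2, 3, 3, 4, 8, 8], nx(lengths) returns 8 and nx(lengths, 0.9) returns 2. An N50 close to the mean contig length, as reported here, indicates that the assembled RAD tags are fairly uniform in size.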

After quality control of the raw data, we detected a total of 27,283,139 SNP markers in the whole population relative to the reference genome, with an average of one SNP locus per 46 bp. The content and distribution density of different types of SNP variants varied across the genome. Transitions accounted for 60.41% of the SNPs (A/G and C/T accounted for 30.41% and 30%, respectively), while transversions (A/C, C/G, A/T, and G/T) accounted for 39.59%, with C/G accounting for 5.51%. Comparing the genetic diversity parameters of Chinese fir under the three sets of filtering criteria showed that the parameter values under the first set were significantly smaller than those under the other two sets, while the values calculated under the third set were higher than those under the other two. Therefore, the SNP markers filtered by the third set of criteria (i.e., MAF > 0.05, call rate > 0.9) were used as high-quality SNP markers, and 343,644 SNP markers (1.26%) were finally retained for subsequent analyses.

3.2 Population genetic diversity

We used the 343,644 high-quality SNP markers to calculate the genetic diversity parameters of the breeding parents of different generations and origins in the Chinese fir 4th cycle breeding population (Table 1). Ho varied between 0.203 and 0.218 (mean of 0.211), while He varied from 0.214 to 0.231 (mean of 0.225). Both Ho and He were the highest in G2, with Ho across all SNP loci being smaller than He, thereby indicating that a heterozygote deficit may exist in this Chinese fir germplasm population. G4 had the highest Pi (0.003), which may be related to its inclusion of more provenances, followed by G2, which was similar to G3 and G1. Among the origins, Ho was smaller than He in FJ and HN, whereas Ho was larger than He in the remaining provenances. Pi values were also higher in the other provenances than in HN and FJ, probably because of the small sample sizes (only 3 to 5) of the other provenances, suggesting that the genetic diversity estimated for each provenance is somewhat related to its sample size. Overall, the genetic diversity level of the 4th cycle breeding population was high, with abundant genetic variation.

Table 1 Genetic diversity parameters of the Chinese fir 4th cycle breeding population.

3.3 Population genetic differentiation

We assessed the genetic differentiation among breeding generations and among germplasm sources (Tables 2, 3). Generally, the level of genetic differentiation was low (Fst < 0.05), indicating that there was no significant genetic differentiation in Chinese fir between the different provenances or between the breeding populations of the four generations. Among provenances, the degree of differentiation between SC, JX and GD was comparatively higher. Among breeding generations, differentiation was highest between G4 and G1, consistent with the nucleotide diversity results, and lowest between G2 and G3.
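As a rough illustration of what a pairwise Fst value measures, the sketch below computes a simple Nei-type estimate, (Ht − Hs)/Ht, from the allele frequencies of two populations. This is an assumption-laden simplification and not the estimator implemented in StAMPP or PopGenome, which may include sample-size corrections omitted here; the function name nei_fst is hypothetical.

```python
import numpy as np


def nei_fst(freqs_a, freqs_b):
    """Nei-type Fst between two populations from per-SNP allele frequencies.

    freqs_a, freqs_b : alternative-allele frequencies at the same SNPs.
    Uses a ratio of sums over SNPs and equal population weights.
    """
    pa = np.asarray(freqs_a, dtype=float)
    pb = np.asarray(freqs_b, dtype=float)
    # Mean within-population expected heterozygosity.
    hs = (2 * pa * (1 - pa) + 2 * pb * (1 - pb)) / 2
    # Expected heterozygosity of the pooled population.
    p_bar = (pa + pb) / 2
    ht = 2 * p_bar * (1 - p_bar)
    return float((ht.sum() - hs.sum()) / ht.sum())
```

With frequencies of 0.6 and 0.4 at a single SNP this gives 0.04, which shows that values below 0.05 correspond to populations whose allele frequencies differ only modestly.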

Table 2 Genetic differentiation among the different provenances of Chinese fir.

Table 3 Genetic differentiation among the different breeding generations of Chinese fir.

The AMOVA results showed that only 1.29% and 3.02% of the variation occurred among breeding generations and among germplasm origins, respectively, and over 96% of the variation was among genotypes (Table 4).

Table 4 Analysis of molecular variance (AMOVA) for the different breeding generations and germplasm origins of Chinese fir.

3.4 Population genetic structure

The 233 Chinese fir individuals of the 4th cycle breeding population can be divided into four distinct classes (I-IV) (Figure 2). There is large genetic variation among the four classes, each of which is a mixed group containing parents from three or four generations. Class I was the smallest, with 26 individuals [G3 (n=12), G2 (n=12), and G1 (n=2)]; Class II contained 41 individuals [G3 (n=2), G2 (n=12), and G1 (n=27)]; Class III contained 37 individuals [G3 (n=10), G2 (n=14), G1 (n=2), and all of G4]; and Class IV was the largest, with 129 individuals accounting for 55.37% of the tested material [dominated by G2 (n=103), with a few G3 (n=14) and G1 (n=12)]. Fifteen individuals of G1 (including F5, E12, and K6) are located in the Class II subclade, confirming the close kinship of these 15 individuals at the molecular level. When the evolutionary tree was labeled according to provenance (Supplementary Figure 1), we found that most provenances clustered together only in local branches; e.g., most FJ individuals were clustered together, probably due to the larger sample size of the FJ provenance. Therefore, the phylogenetic tree showed that most 4th cycle breeding population clones were mixed to varying degrees, with few outlier samples and no obvious relationship between the classes and the provenances. The division was more likely related to the breeding generations: Class IV contained 73.05% of G2 and 36.84% of G3; Class II contained 62.79% of G1; and all of G4 was placed in Class III.

Figure 2 The Chinese fir germplasm phylogenetic tree. The outermost circle in yellow indicates Class I, purple indicates Class II, red indicates Class III, and green indicates Class IV; the inner circle in red indicates G4, yellow indicates G3, green indicates G2 and purple indicates G1.

We used the Admixture software to calculate the Q values of each sample (Supplementary Table 5) and then grouped the 233 Chinese fir individuals (Figure 3, Supplementary Figures 2-4). Based on the minimum of the cross-validation error rate, we determined the optimum number of subgroups to be four, indicating that these Chinese fir trees may have come from four original ancestral sources, with the four subgroups (I-IV) containing 160, 23, 23, and 27 individuals, respectively. Subpopulation I had the most complex genetic background, with only 40 individuals having Q > 0.6 and 75% of the material having a poorly defined genetic composition. This suggests that there may have been genetic exchange between these individuals, indicating that the parents may have been used multiple times for crossing in the ongoing breeding process. Subpopulation I included all of G4 and 75% of G2. Subpopulation III had greater genetic background purity, with 21 individuals having a Q value > 0.6; it was dominated by G2, which probably reflects the fact that most samples belong to G2. Subpopulation II contained G3 (n=8), G2 (n=11), and G1 (n=4), of which 15 individuals had Q values > 0.6. Additionally, 74% of the material in subpopulation IV was from G1, with the remainder from G2, and 14 individuals had Q values > 0.6. All four subgroups retained a proportion of the same genetic material, consistent with gene exchange, so that the genetic background of the breeding parents from different germplasm sources was similar. The subpopulation divisions did not match the provenance of the test material and showed only some local correlation with it, suggesting that the Chinese fir germplasm may have mixed ancestry or gene flow, which matches the phylogenetic tree results.
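The Q > 0.6 rule used above to separate pure from mixed genetic backgrounds can be expressed as a short Python sketch. The function name classify_by_q and the input layout (one row of K ancestry proportions per individual) are assumptions for illustration; this is not part of the Admixture software.

```python
def classify_by_q(q_rows, threshold=0.6):
    """Assign each individual to its dominant subgroup and flag purity.

    q_rows : iterable of rows, each holding K ancestry proportions (Q values)
             for one individual.
    Returns a list of (subgroup, is_pure) tuples, where subgroup is 1-based
    and is_pure is True when the largest Q value exceeds the threshold.
    """
    results = []
    for q in q_rows:
        best = max(range(len(q)), key=lambda k: q[k])
        results.append((best + 1, q[best] > threshold))
    return results
```

Counting the tuples with is_pure equal to True within each subgroup gives tallies of the kind reported in the text, such as the number of individuals with Q > 0.6 in a subpopulation.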

Figure 3 Population structure of the 233 Chinese fir germplasm (K = 4).

PCA showed that the first 10 principal components explained only 11.29% of the variance, with each principal component explaining < 2%, indicating that only a few SNPs could delineate the subgroups and discriminate between individuals. We selected the first three principal components (PC1 = 1.82%, PC2 = 1.48%, and PC3 = 1.45%) and plotted them in pairs (Figure 4, Supplementary Figure 5), which divided the 233 individuals into four groups. These results showed that G4 is relatively concentrated in the middle cluster, reflecting the close genetic distance between samples within G4. Most G2 and G3 individuals were clustered together, while G1 was more dispersed. Furthermore, the population genetic structure of Chinese fir did not correspond to provenance and may instead be related to the breeding generations, in which genetic backgrounds were unintentionally mixed. The studied germplasm indicated that parents from different origins (provenances) were more dispersed, while those from the same provenance were clustered together. In summary, there was overlap and crossover between the four groups and a high degree of admixture between groups, indicating different degrees of interpenetration between groups, which was consistent with both the phylogenetic tree and population genetic structure analysis results.
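The proportion of variance explained by each principal component can be sketched with a singular value decomposition of the centered genotype matrix. This is a simplified illustration rather than the smartpca procedure, which typically also scales each marker before decomposition; the function name and the 0/1/2 genotype coding without missing values are assumptions.

```python
import numpy as np


def pca_explained_variance(genotypes, n_components=10):
    """Return the fraction of total variance explained by the leading PCs.

    genotypes : (individuals, SNPs) array coded 0/1/2 with no missing calls.
    """
    x = np.asarray(genotypes, dtype=float)
    x = x - x.mean(axis=0)                     # center each SNP
    s = np.linalg.svd(x, compute_uv=False)     # singular values, descending
    variances = s ** 2                         # proportional to PC variances
    return variances[:n_components] / variances.sum()
```

Summing the returned values for the first 10 components corresponds to the cumulative figure quoted above; small per-component values are expected when many thousands of weakly correlated markers are analyzed in a weakly structured population.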

Figure 4 Principal component analysis where PC1 and PC2 represent the first and second principal components, respectively. G1~G4 represent the 1st, 2nd, 3rd and 4th generation breeding parents, respectively.

4 Discussion

4.1 Reliability of RAD-seq for simplified sequencing

With the release of the first version of the Populus trichocarpa genome (Tuskan et al., 2006), the era of forest tree genomes officially began, and genome-wide information on several tree species has since been published. However, genomic research progress in coniferous trees is still slow compared with other plants due to the technical difficulties caused by their very large genomes, high sequencing costs, and gene structure annotation. To date, only a few coniferous tree species genomes have been released (e.g., Picea abies (Nystedt et al., 2013), Pinus taeda (Zimin et al., 2017), Pinus lambertiana (Stevens et al., 2016), Pseudotsuga menziesii (Neale et al., 2017), Pinus tabuliformis (Niu et al., 2022)). This undoubtedly led to the rapid development of genomic information of these species at the molecular level. The whole genome of Chinese fir has not yet been published, and only very limited genome-level studies are available. In this study, we attempted to construct a reference genome of Chinese fir using RAD-seq simplified sequencing technology for the species’ 4th cycle breeding population, and obtained a 1.11 Gb-sized genome with a 37.01% GC content, higher than the 36.04% estimated by K-mer analysis (Lin et al., 2020). This estimate is similar to that of Picea abies (37.90%) (Nystedt et al., 2013) and Pinus massoniana (37.95%) (Bai et al., 2019), and lower than that of Cryptomeria japonica (48.00%) (Nagano et al., 2020), probably due to the lower sequencing depth and lower coverage of the simplified genome sequencing in this study.

The RAD-seq simplified sequencing technique was developed to generate a wide range of SNP markers. It is a cost-effective genotyping technique that detects variant information on a genome-wide scale, but the quality of the obtained SNPs is usually variable and the lack of stringent filtering can seriously affect subsequent analyses (Korecký et al., 2021). In this study, 27,283,139 SNP markers were initially obtained after alignment to the reference genome, and after strict filtering only 1.26% of them were retained as high-quality SNP markers. This proportion was much lower than that of other tree species (Mandrou et al., 2014; Tsumura et al., 2020; Yang et al., 2020). The highest number of SNP markers but the lowest genetic diversity values were obtained under the first set of criteria (i.e., MAF > 0.01, call rate > 0.9), indicating that the setting of the MAF filtering criterion had a greater effect on the number of SNP markers obtained. The filtering criteria for Chinese fir SNP selection in this study were more stringent than those implemented for Picea abies (Korecký et al., 2021), Ulmus pumila (Lyu et al., 2020), and other Chinese fir studies (Zheng et al., 2019).

The number of high-quality SNP markers obtained using RAD-seq technology (343,644) was much higher than the number of SNP markers detected by SLAF-seq simplified sequencing technology (108,753/143,871). This may be due to either an increase in sample size (233:221/110) or differences in sequencing technology (Zheng et al., 2019; Huang et al., 2021). RAD-seq sequencing technology yields not only a high number of markers but also a high marker density (Zhang et al., 2018). This was also observed in some flowers or crops (Jia et al., 2016; Peng et al., 2016; Chankaew et al., 2022; Jiang et al., 2022). The RAD-seq technology often detects more SNPs than SLAF-seq technology (Cai et al., 2015; Su et al., 2017). SNP variant types can be classified into two categories, transitions (Ts) and transversions (Tv), with a theoretical Ts/Tv ratio of 0.5. However, a “transition bias” (Collins and Jukes, 1994), i.e., a Ts/Tv ratio above this expectation, generally occurs. In this study, the Ts/Tv ratio was 1.5 before SNP marker screening and > 1.5 after screening, similar to other findings (Su et al., 2016; Zheng et al., 2019).
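The Ts/Tv ratio discussed above can be illustrated with a small Python sketch. The function name ts_tv_ratio and the (reference, alternative) tuple input are assumptions for illustration; in practice the counts come from a variant-calling tool.

```python
from itertools import permutations

# Transitions exchange purines (A<->G) or pyrimidines (C<->T);
# every other single-base substitution is a transversion.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def ts_tv_ratio(snps):
    """Return the transition/transversion ratio for (ref, alt) allele pairs."""
    ts = tv = 0
    for ref, alt in snps:
        pair = frozenset((ref.upper(), alt.upper()))
        if len(pair) != 2 or not pair <= set("ACGT"):
            raise ValueError(f"not a single-base substitution: {ref}>{alt}")
        if pair in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions: ratio undefined")
    return ts / tv


# If all 12 possible substitutions were equally likely, 4 would be
# transitions and 8 transversions, giving the theoretical ratio of 0.5.
uniform_ratio = ts_tv_ratio(permutations("ACGT", 2))
```

The proportions reported in the Results (60.41% transitions and 39.59% transversions) correspond to a ratio of about 1.53, in line with the value of 1.5 stated for the unfiltered SNP set.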

4.2 The richness of breeding population genetic base

Most coniferous trees have a long growth period, high rate of heterosis, and extensive gene flow, resulting in high level of genetic diversity (Bergmann and Mejnartowicz, 2000). The rich genetic variation within the breeding population forms the basis for genetic improvement (Chaisurisri and El-Kassaby, 1994; El-Kassaby and Ritland, 1996a; El-Kassaby and Ritland, 1996b; Stoehr and El-Kassaby, 1997). The level of population genetic diversity decreases with advanced-generation breeding, as the high intensity of artificial selection generally results in significant short-term genetic gains, while possibly also reducing the genetic variation base and genetic diversity of the breeding population. However, our analysis revealed that the Chinese fir 4th cycle breeding population still harbours high genetic diversity (Pi = 0.003) and high within-population genetic variation, similar to that reported for Pinus taeda (Chhatre et al., 2013), Eucalyptus urophylla (Yang et al., 2020), Cryptomeria japonica (Tsumura et al., 2014), and Larix kaempferi (Liu et al., 2017). The introduction of external superior trees (i.e., genetic infusion) leads to increased genetic diversity. Moreover, mating combinations among superior individuals also generate new recombinations, which also results in increased genetic diversity. Additionally, changes in breeding objectives also can increase the genetic variation among populations. The Chinese fir 4th cycle breeding population included not only hybrid offspring between superior trees, but also included external superior trees through genetic infusion. Additionally, the 4th cycle breeding objective added pest resistance attributes to the commonly selected fast-growing, high-quality trees, which may have contributed to the observed high genetic diversity. 
In addition, some researchers have argued that the Chinese fir germplasm growing in central production areas in suitable environments (e.g., superior seed sources) for long periods is subjected to natural selection, artificial selection, and some anthropogenic activities, leading to the occurrence of pollen and seed exchange and thus gene flow, making it possible for diversity to decrease and the genetic base to narrow (Chen et al., 1980; Li, 2015). The northern Fujian region was considered one of the central production areas for Chinese fir as early as the 1980s (Chen et al., 1980; Huang et al., 1986; You and Hong, 1998), and after many years of artificial selection, lower genetic diversity might have been expected, yet high genetic diversity was still detected in seed sources from this region. This may be due to the timely introduction of good external populations to expand the genetic base. It should also be noted that the northern Fujian seed source contributed a large number of parents to the Chinese fir breeding population, which is consistent with a previous report (He, 2019).

The number of parents selected from a particular provenance may correspond to the genetic diversity detected for it (Duan et al., 2017), which suggests that the lower number of parents from some provenances in the breeding population could affect the extent of genetic diversity. Similarly, the AMOVA results showed that over 96% of the genetic variation was present between genotypes, with only a very small amount of variation occurring among populations. This was confirmed by the very low Fst values (< 0.05) between subgroups, which may be related either to the unbalanced sample size representation across germplasm origins, or to the wide use of the parental population due to its excellent phenotype, with the higher level of human activity possibly enhancing gene flow and thus reducing genetic differentiation among populations (Fang et al., 2022). This result is also consistent with the findings of previous studies showing that forest trees are predominantly heterozygous and have low genetic differentiation among populations and high levels of overall genetic diversity (Tsumura et al., 2014; Wang et al., 2014; Bínová et al., 2020).

Heterozygosity is an important indicator of the genetic diversity of a population. The average heterozygosity of the 4th cycle Chinese fir breeding population was high (Ho = 0.215, He = 0.233). These estimates are similar to those reported for the same species (0.163/0.250) (Zheng et al., 2019) and (0.210/0.273) (Huang et al., 2021) and for Cryptomeria japonica (0.269/0.253) (Cai et al., 2020), higher than those reported for Keteleeria davidiana var. formosana (0.128/0.096) (Shih et al., 2018), Pinus pungens (0.113/0.114), and Pinus rigida (0.098/0.104) (Bolte et al., 2022), but lower than those for Eucalyptus globulus (0.511/0.423) (Butler et al., 2022), Pinus strobus (0.477/0.590) (Whitney et al., 2019), and Cedrus (0.460/0.530) (Karam et al., 2019). Two factors may explain the relatively high heterozygosity estimates in Chinese fir: 1) a highly heterozygous genetic background and broad genetic base, probably due to a long growth cycle and wind pollination, and 2) a bottleneck effect. From the Cretaceous to the Tertiary Eocene, the global climate favored the widespread migration of Chinese fir between the North American and Eurasian continents. During the late Eocene to Oligocene, however, abrupt global climatic changes caused Chinese fir to disappear from the high latitudes of the northern hemisphere. Furthermore, during the Quaternary ice age, the number of Chinese fir trees decreased dramatically, and their distribution contracted and gradually shifted southwards, such that Chinese fir was no longer found north of the Qinling Mountains and the Huai River after the Ice Age.

Therefore, Chinese fir may have been affected by a bottleneck after the Quaternary ice age, resulting in a sudden increase in heterozygosity followed by gradual stabilization; the last ice age also affected the genetic diversity of other tree species such as Pinus strobus (Whitney et al., 2019) and Cryptomeria japonica (Tsumura et al., 2020). It is also possible that individuals with higher heterozygosity were better suited to survive during evolution, and recent selective breeding may also have had an effect.
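For readers less familiar with the Ho/He pairs compared above, the sketch below shows how observed and expected heterozygosity can be computed for one bi-allelic SNP. It is a simplified illustration and not the software used in this study: the 0/1/2 genotype coding (count of the alternate allele), the use of `None` for missing calls, and the function name `locus_heterozygosity` are assumptions made for the example.

```python
def locus_heterozygosity(genotypes):
    """Return (Ho, He) for one bi-allelic locus (illustrative sketch).

    genotypes: list of 0, 1, 2 (alternate-allele counts per individual),
               with None marking a missing call (assumed coding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes at this locus")
    if any(g not in (0, 1, 2) for g in called):
        raise ValueError("genotypes must be coded 0, 1, 2 or None")

    n = len(called)

    # Observed heterozygosity: proportion of heterozygous individuals.
    ho = sum(1 for g in called if g == 1) / n

    # Alternate-allele frequency from the 2n sampled gene copies.
    p = sum(called) / (2 * n)

    # Expected heterozygosity under Hardy-Weinberg proportions.
    he = 2.0 * p * (1.0 - p)
    return ho, he


if __name__ == "__main__":
    # Hypothetical genotype calls for five individuals, one missing.
    print(locus_heterozygosity([0, 1, 1, 2, None]))
```

Population-level values such as those quoted above are typically obtained by averaging such per-locus estimates over all markers; the exact averaging and any small-sample correction depend on the software used.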

SNP markers detect significantly more genetic variation than SSRs, probably because SNP markers are obtained from the whole genome, have a low genotyping error rate, and occur at high density in genomes (Lu et al., 2009) (e.g., on average, one SNP marker was detected per 46 bp in this study). SNP markers are usually bi-allelic (Vignal et al., 2002), whereas SSR markers are multi-allelic and have a significantly higher number of alleles per locus than SNP markers (Van Inghelandt et al., 2010; Zurn et al., 2020). Studies have shown that bi-allelic markers like SNPs have a maximum genetic diversity of 0.5, whereas multi-allelic markers like SSRs can show genetic diversity values close to 1 (Van Inghelandt et al., 2010). However, some researchers have pointed out that the comparison should not be based only on the number of alleles, that more emphasis must be placed on the number of loci, and that few alleles (but a high number of loci with a high gene coverage density) make the estimation of population structure more reliable (Zurn et al., 2020). Genetic diversity parameters obtained using SNP markers are generally lower than those calculated using traditional molecular markers, like SSR, ISSR, and SRAP (Chen et al., 2017; Duan et al., 2017; Li et al., 2017; García et al., 2018; Lin et al., 2020), a pattern also observed in other plants (Van Inghelandt et al., 2010; Avican and Bilgen, 2022). The choice of molecular marker can therefore affect the results, as different molecular markers introduce bias in the genetic diversity analysis results for the same or different populations (Bínová et al., 2020; Korecký et al., 2021).
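The upper bounds mentioned above follow from the standard definition of gene diversity, He = 1 - Σ p_i², which is maximized at 1 - 1/k when all k alleles at a locus are equally frequent. The short sketch below illustrates this; the function names and the example allele frequencies are hypothetical and chosen only for demonstration.

```python
def expected_heterozygosity(allele_freqs):
    """Gene diversity He = 1 - sum(p_i^2) for one locus."""
    if any(p < 0.0 for p in allele_freqs):
        raise ValueError("allele frequencies must be non-negative")
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def max_expected_heterozygosity(n_alleles):
    """Upper bound of He for a locus with n_alleles equally frequent alleles."""
    if n_alleles < 1:
        raise ValueError("a locus needs at least one allele")
    return 1.0 - 1.0 / n_alleles


if __name__ == "__main__":
    # Bi-allelic SNP with both alleles at frequency 0.5 (the maximum case).
    print(expected_heterozygosity([0.5, 0.5]))
    # Hypothetical SSR locus with ten equally frequent alleles.
    print(expected_heterozygosity([0.1] * 10))
    # Bounds for loci with 2 and 20 alleles.
    print(max_expected_heterozygosity(2), max_expected_heterozygosity(20))
```

Because the bound depends only on the number of alleles per locus, He values from SNP and SSR data sets are not directly comparable in absolute terms.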

4.3 Genetic structure rationalization

The population genetic structure in this study is relatively weak, and individuals from the same provenance cluster together only in some local branches, which is similar to the findings of Huang et al. (2021) and Xia et al. (2019). The genetic structure of populations is related to a variety of factors. When the materials come mostly from different origins or geographical sources, the species' wide range, climate, and complex geography allow geographical genetic differentiation among origins, seed sources, or populations, so that the genetic structure often corresponds closely to geographic source, as in the king of Chinese fir (Li et al., 2016), Pinus monticola (Kim et al., 2011), and Eucalyptus cladocalyx (Bush and Thumma, 2013; Butler et al., 2022). Chinese fir is mainly distributed in southern provinces and regions such as Fujian and Guangdong, and the climatic similarity among them may be the reason for the observed subgrouping. In addition, large-scale long-distance cultivation has increased genetic exchange among populations, which has gradually increased the complexity of kinship among Chinese fir germplasm of different origins, thereby reinforcing the need for molecular techniques to resolve genetic diversity and population structure (Fang et al., 2022).

Despite the low level of genetic differentiation between breeding generations in Chinese fir, the clustering results suggest that the genetic structure may be related to pedigree and the development of breeding generations, similar to the significant genetic structure reported between the 1st and 2nd generation breeding populations of Pinus taeda (Chhatre et al., 2013). When a breeding population has a complex genetic background and originates from a wide range of sources, its genetic structure will correspond to the kinship among the sources of the breeding parents, as observed in Eucalyptus urophylla (Lu et al., 2018). In addition, coniferous trees usually have low levels of genetic differentiation due to outcrossing and gradual gene introgression (Petit and Hampe, 2006); e.g., no significant population structure was detected within Pinus pungens and P. rigida based on genome-wide data (Bolte et al., 2022).

The observed clustering results of the Chinese fir 4th cycle breeding population may also be related to the three previous recurrent selection cycles. The 1st cycle breeding population dates back to the 1960s. Over the years, the breeding objectives have mainly targeted fast growth and productivity, with the 4th cycle breeding population being selected for fast growth, high quality, and stress resistance. This may have resulted in some Chinese fir parental trees being repeatedly selected as mating parents due to their excellent performance. Repeated artificial selection may gradually intensify the performance of the target traits, thereby increasing the frequency of the associated advantageous loci, which may further produce linkage disequilibrium and cause the genetic structure of artificially improved breeding populations to differ significantly (Du et al., 2021). Population genetic structure should therefore be explored from multiple aspects and dimensions, not just from individual factors such as geography or pedigree.

5 Conclusion

In this paper, we made a preliminary attempt at reduced-representation sequencing of the Chinese fir genome using RAD-seq. We genotyped 233 parents and developed a large number (343,644) of high-quality SNP markers. Furthermore, we found that the genetic diversity of the 4th cycle breeding population was high. The genetic differentiation among populations was low, and the population structure was weak. Most of the observed variation occurred among individuals, which may be related to the frequent exchange between Chinese fir origins and its long history of cultivation and domestication. Therefore, population structure is not clearly correlated with germplasm origin, but may be related to pedigree and breeding generation.

Data availability statement

The datasets presented in this study can be found in online repositories. The name of the repository and accession numbers can be found below: NCBI; PRJNA910811 and PRJNA909424.

Author contributions

YJ, LB and XFZ contributed to conception and design of the study. YJ, LB, XFZ, BZ, RZ, SS, DY and XYZ organized the database. YJ, LB, XFZ and BZ performed the statistical analysis. YJ, LB and XFZ wrote the first draft of the manuscript. YE-K and JS wrote sections of the manuscript. All authors contributed to the article and approved the submitted version.

Funding

This work was supported by the National Key Research and Development Program of China (2022YFD2200201), National Natural Science Foundation of China (32171818), Fujian Province Science and Technology Research Funding on the fourth Tree Breeding Programme of Chinese fir (Min Lin Ke 2016-35, ZMGG-0701, 2022FKJ05), and Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).

Acknowledgments

Thanks to Guangzhou Genedenovo Biotechnology Co., Ltd for assisting in sequencing data acquisition.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2023.1106615/full#supplementary-material

References

Alexander, D. H., Novembre, J., Lange, K. (2009). Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19 (9), 1655–1664. doi: 10.1101/gr.094052.109

Altshuler, D., Pollara, V., Cowles, C., Van Etten, W., Baldwin, J., Linton, L., et al. (2000). A human SNP map generated by reduced representation shotgun sequencing. Nature 407, 513–516. doi: 10.1038/35035083

Avican, O., Bilgen, B. B. (2022). Investigation of the genetic structure of some common bean (Phaseolus vulgaris l.) commercial varieties and genotypes used as a genitor with SSR and SNP markers. Genet. Resour. Crop Evol., 69, 2755–2768. doi: 10.1007/s10722-022-01396-5

Bai, Q., Cai, Y., He, B., Liu, W., Pan, Q., Zhang, Q. (2019). Core set construction and association analysis of pinus massoniana from guangdong province in southern China using SLAF-seq. Sci. Rep. 9 (1), 1–13. doi: 10.1038/s41598-019-49737-2

Bergmann, F., Mejnartowicz, L. (2000). A reciprocal relationship between the genetic diversity at two metabolically-linked isozyme loci in several conifer species. Genetica 110 (1), 63–71. doi: 10.1023/A:1017572725635

Bian, L., Shi, J., Zheng, R., Chen, J., Wu, H. X. (2014). Genetic parameters and genotype–environment interactions of Chinese fir (Cunninghamia lanceolata) in fujian province. Can. J. For. Res. 44 (6), 582–592. doi: 10.1139/cjfr-2013-0427

Bínová, Z., Korecký, J., Dvořák, J., Bílý, J., Zádrapová, D., Jansa, V., et al. (2020). Genetic structure of Norway spruce ecotypes studied by SSR markers. Forests 11 (1), 110. doi: 10.3390/f11010110

Bolte, C. E., Faske, T. M., Friedline, C. J., Eckert, A. J. (2022). Divergence amid recurring gene flow: complex demographic processes during speciation are the growing expectation for forest trees. bioRxiv. 18, 1–18. doi: 10.1007/s11295-022-01565-8

Brandrud, M. K., Paun, O., Lorenz, R., Baar, J., Hedrén, M. (2019). Restriction-site associated DNA sequencing supports a sister group relationship of nigritella and gymnadenia (Orchidaceae). Mol. Phylogenet. Evol. 136, 21–28. doi: 10.1016/j.ympev.2019.03.018

Bus, A., Hecht, J., Huettel, B., Reinhardt, R., Stich, B. (2012). High-throughput polymorphism detection and genotyping in brassica napus using next-generation RAD sequencing. BMC Genomics 13 (1), 1–11. doi: 10.1186/1471-2164-13-281

Bush, D., Thumma, B. (2013). Characterising a eucalyptus cladocalyx breeding population using SNP markers. Tree Genet. Genomes 9 (3), 741–752. doi: 10.1007/s11295-012-0589-1

Butler, J. B., Freeman, J. S., Potts, B. M., Vaillancourt, R. E., Kahrood, H. V., Ades, P. K., et al. (2022). Patterns of genomic diversity and linkage disequilibrium across the disjunct range of the Australian forest tree eucalyptus globulus. Tree Genet. Genomes 18 (3), 1–18. doi: 10.1007/s11295-022-01558-7

Cai, C., Cheng, F.-Y., Wu, J., Zhong, Y., Liu, G. (2015). The first high-density genetic map construction in tree peony (Paeonia sect. moutan) using genotyping by specific-locus amplified fragment sequencing. PloS One 10 (5), e0128584. doi: 10.1371/journal.pone.0128584

Cai, M., Wen, Y., Uchiyama, K., Onuma, Y., Tsumura, Y. (2020). Population genetic diversity and structure of ancient tree populations of cryptomeria japonica var. sinensis based on RAD-seq data. Forests 11 (11), 1192. doi: 10.3390/f11111192

Cao, K., Wang, L., Zhu, G., Fang, W., Chen, C., Luo, J. (2012). Genetic diversity, linkage disequilibrium, and association mapping analyses of peach (Prunus persica) landraces in China. Tree Genet. Genomes 8 (5), 975–990. doi: 10.1007/s11295-012-0477-8

Catchen, J. M., Amores, A., Hohenlohe, P., Cresko, W., Postlethwait, J. H. (2011). Stacks: building and genotyping loci de novo from short-read sequences. G3: Genes Genom. Genet. 1 (3), 171–182. doi: 10.1534/g3.111.000240/-/DC1

Chaisurisri, K., El-Kassaby, Y. (1994). Genetic diversity in a seed production population vs. natural populations of sitka spruce. Biodivers. Conserv. 3 (6), 512–523. doi: 10.1007/BF00115157

Chankaew, S., Sriwichai, S., Rakvong, T., Monkham, T., Sanitchon, J., Tangphatsornruang, S., et al. (2022). The first genetic linkage map of winged bean [Psophocarpus tetragonolobus (L.) DC.] and QTL mapping for flower-, pod-, and seed-related traits. Plants 11 (4), 500. doi: 10.3390/plants11040500

Chen, Y., Peng, Z., Wu, C., Ma, Z., Ding, G., Cao, G., et al. (2017). Genetic diversity and variation of Chinese fir from fujian province and Taiwan, China, based on ISSR markers. PloS One 12 (4), e0175571. doi: 10.1371/journal.pone.0175571

Chen, Y., Ruan, Y., Chen, S., Liu, D., Lin, Q. (1980). Genetic variations of chinese fir in eleven provenances. J. Nanjing For Univ. (Natural Sci. Edition) 4, 35–45. doi: 10.3969/j.jssn.1000-2006.1980.04.005

Chen, S., Zhou, Y., Chen, Y., Gu, J. (2018). Fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34 (17), i884–i890. doi: 10.1093/bioinformatics/bty560

Chhatre, V. E., Byram, T. D., Neale, D. B., Wegrzyn, J. L., Krutovsky, K. V. (2013). Genetic structure and association mapping of adaptive and selective traits in the east Texas loblolly pine (Pinus taeda l.) breeding populations. Tree Genet. Genomes 9 (5), 1161–1178. doi: 10.1007/s11295-013-0624-x

Chung, J., Lin, T., Tan, Y., Lin, M., Hwang, S.-Y. (2004). Genetic diversity and biogeography of cunninghamia konishii (Cupressaceae), an island species in Taiwan: a comparison with cunninghamia lanceolata, a mainland species in China. Mol. Phylogenet. Evol. 33 (3), 791–801. doi: 10.1016/j.ympev.2004.08.011

Collins, D. W., Jukes, T. H. (1994). Rates of transition and transversion in coding sequences since the human-rodent divergence. Genomics 20 (3), 386–396. doi: 10.1006/geno.1994.1192

Duan, H. (2014). Evaluation of genetic diversity and genome-wide association studies of important traits in chinese fir (Beijing: Beijing Forestry University).

Duan, H., Cao, S., Zheng, H., Hu, D., Lin, J., Cui, B., et al. (2017). Genetic characterization of Chinese fir from six provinces in southern China and construction of a core collection. Sci. Rep. 7 (1), 1–10. doi: 10.1038/s41598-017-13219-0

Du, C., Sun, X., Xie, Y., Hou, Y. (2021). Genetic diversity of larix kaempferi populations with different levels of improvement in northern subtropical region. Sci. Silvae Sinicae 57, 68–76. doi: 10.11707/j.1001-7488.20210507

El-Kassaby, Y. A., Ritland, K. (1996a). Genetic variation in low elevation Douglas-fir of British Columbia and its relevance to gene conservation. Biodivers. Conserv. 5 (6), 779–794. doi: 10.1007/BF00051786

El-Kassaby, Y. A., Ritland, K. (1996b). Impact of selection and breeding on the genetic diversity in Douglas-fir. Biodivers. Conserv. 5 (6), 795–813. doi: 10.1007/BF00051787

Elshire, R., Glaubitz, J., Sun, Q., Poland, J., Kawamoto, K., Buckler, E. S., et al. (2011). A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PloS One 6, e19379. doi: 10.1371/journal.pone.0019379

Emerson, K. J., Merz, C. R., Catchen, J. M., Hohenlohe, P. A., Cresko, W. A., Bradshaw, W. E., et al. (2010). Resolving postglacial phylogeography using high-throughput sequencing. Proc. Natl. Acad. Sci. 107 (37), 16196–16200. doi: 10.1073/pnas.1006538107

Fang, Y., Yang, H., Wu, H., Lei, X., Zhang, X., Yang, C., et al. (2022). Genetic diversity analysis of cunninghamia lanceolata in the 2nd and 2.5th generation seed orchard. J. Sichuan Agric. Univ. 40 (3), 371–378. doi: 10.16036/j.issn.1000-2650.202202040

García, C., Guichoux, E., Hampe, A. (2018). A comparative analysis between SNPs and SSRs to investigate genetic variation in a juniper species (Juniperus phoenicea ssp. turbinata). Tree Genet. Genomes 14 (6), 1–9. doi: 10.1007/s11295-018-1301-x

He, L. (2019). Genetic diversity analysis and core germplasm construction of chinese fir populations in south of jiangxi (Jiangxi Nanchang: Jiangxi Agricultural University).

He, X., Zheng, J., Jiao, Z., Dou, Q., Huang, L. (2021). Genetic diversity and structure analysis of quercus shumardii populations based on slaf-seq technology. J. Nanjing For Univ. (Natural Sci. Edition) 46, 81–87. doi: 10.3969/j.issn.1000-2006.202010036

Huang, M., Chen, D., Shi, J., Xu, N. (1986). Geographic distribution of esterase isozyme patterns in seed sources of chinese fir (cunninghamia lanceolata (lamb.) hook). J. Nanjing For Univ. (Natural Sci. Edition) 3, 31–35. doi: 10.3969/j.jssn.1000-2006.1986.03.005

Huang, R., Hu, D., Deng, H., Wang, R., Wei, R., Yan, S., et al. (2021). Snps-based assessment of genetic diversity and genetic structure in elite chinese fir. Mol. Plant Breed. 1, 1–10.

Jiang, X., Yang, T., Zhang, F., Yang, X., Yang, C., He, F., et al. (2022). RAD-Seq-Based high-density linkage maps construction and quantitative trait loci mapping of flowering time trait in alfalfa (Medicago sativa l.). Front. Plant Sci. 13. doi: 10.3389/fpls.2022.899681

Jia, Q., Tan, C., Wang, J., Zhang, X.-Q., Zhu, J., Luo, H., et al. (2016). Marker development using SLAF-seq and whole-genome shotgun strategy to fine-map the semi-dwarf gene ari-e in barley. BMC Genomics 17 (1), 1–12. doi: 10.1186/s12864-016-3247-4

Karam, M.-J., Aouad, M., Roig, A., Bile, A., Dagher-Kharrat, M. B., Klein, E. K., et al. (2019). Characterizing the genetic diversity of atlas cedar and phylogeny of Mediterranean cedrus species with a new multiplex of 16 SSR markers. Tree Genet. Genomes 15 (4), 1–12. doi: 10.1007/s11295-019-1366-1

Kim, M.-S., Richardson, B. A., McDonald, G. I., Klopfenstein, N. B. (2011). Genetic diversity and structure of western white pine (Pinus monticola) in north America: a baseline study for conservation, restoration, and addressing impacts of climate change. Tree Genet. Genomes 7 (1), 11–21. doi: 10.1007/s11295-010-0311-0

Korecký, J., Čepl, J., Stejskal, J., Faltinová, Z., Dvořák, J., Lstibůrek, M., et al. (2021). Genetic diversity of Norway spruce ecotypes assessed by GBS-derived SNPs. Sci. Rep. 11 (1), 1–12. doi: 10.1038/s41598-021-02545-z

Lexer, C., Wüest, R., Mangili, S., Heuertz, M., Stölting, K. N., Pearman, P. B., et al. (2014). Genomics of the divergence continuum in an African plant biodiversity hotspot, I: drivers of population divergence in restio capensis (Restionaceae). Mol. Ecol. 23 (17), 4373–4386. doi: 10.1111/mec.12870

Li, M. (2001). Molecular genetic variation of breeding populations and molecular breeding in chinese fir. J. Nanjing For Univ. (Natural Sci. Edition) 95 (39-48), 397. doi: 10.3969/j.issn.1000-2006.2001.05.022

Li, Y. (2015). Genetic diversity and genetic divergence of cunninghamia lanceolata hook geographical provenances (Beijing: Chinese Academy of Forestry).

Li, M., Chen, X., Huang, M., Wu, P., Ma, X. (2017). Genetic diversity and relationships of ancient Chinese fir (Cunninghamia lanceolata) genotypes revealed by sequence-related amplified polymorphism markers. Genet. Resour. Crop Evol. 64 (5), 1087–1099. doi: 10.1007/s10722-016-0428-6

Li, H., Durbin, R. (2009). Fast and accurate short read alignment with burrows–wheeler transform. Bioinformatics 25 (14), 1754–1760. doi: 10.1093/bioinformatics/btp324

Li, M., Huang, M., Su, S., Chen, X., Ma, X. (2016). Genetic diversity of germplasm resources of the king of Chinese fir in fujian provenances. J. For. Environ. 36 (3), 312–318. doi: 10.13324/j.cnki.jfcf.2016.03.010

Lin, E., Zhuang, H., Yu, J., Liu, X., Huang, H., Zhu, M., et al. (2020). Genome survey of Chinese fir (Cunninghamia lanceolata): Identification of genomic SSRs and demonstration of their utility in genetic diversity analysis. Sci. Rep. 10 (1), 1–12. doi: 10.1038/s41598-020-61611-0

Li, M., Shi, J., Li, F., Gan, S. (2007). Molecular characterization of elite genotypes within a second-generation Chinese fir (Cunninghamia lanceolata) breeding population using RAPD markers. Sci. Silvae Sin. 43 (12), 50–55. doi: 10.3321/j.issn:1001-7488.2007.12.009

Liu, C., Xie, Y., Yi, M., Zhang, S., Sun, X. (2017). Isolation, expression and single nucleotide polymorphisms (SNPs) analysis of LACCASE gene (LkLAC8) from Japanese larch (Larix kaempferi). J. For Res. 28 (5), 891–901. doi: 10.1007/s11676-016-0360-9

Lozier, J. (2014). Revisiting comparisons of genetic diversity in stable and declining species: assessing genome-wide polymorphism in North American bumble bees using RAD sequencing. Mol. Ecol. 23 (4), 788–801. doi: 10.1111/mec.12636

Lu, W., Xiong, T., Wang, J., Zhang, L., Qi, J., Luo, J., et al. (2018). Genetic diversity of 1st generation breeding population in eucalyptus urophylla. Genomics Appl. Biol. 37 (6), 2505–2517. doi: 10.13417/j.gab.037.002505

Lu, Y., Yan, J., Guimaraes, C. T., Taba, S., Hao, Z., Gao, S., et al. (2009). Molecular characterization of global maize breeding germplasm based on genome-wide single nucleotide polymorphisms. Theor. Appl. Genet. 120 (1), 93–115. doi: 10.1007/s00122-009-1162-7

Lyu, Y.-z., Dong, X.-y., Huang, L.-b., Zheng, J.-w., He, X.-d., Sun, H.-n., et al. (2020). SLAF-seq uncovers the genetic diversity and adaptation of Chinese elm (Ulmus parvifolia) in Eastern China. Forests 11 (1), 80. doi: 10.3390/f11010080

Mandrou, E., Denis, M., Plomion, C., Salin, F., Mortier, F., Gion, J.-M. (2014). Nucleotide diversity in lignification genes and QTNs for lignin quality in a multi-parental population of eucalyptus urophylla. Tree Genet. Genomes 10 (5), 1281–1290. doi: 10.1007/s11295-014-0760-y

McKenna, A., Hanna, M., Banks, E., Sivachenko, A., Cibulskis, K., Kernytsky, A., et al. (2010). The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20 (9), 1297–1303. doi: 10.1101/gr.107524.110

Miller, M. R., Dunham, J. P., Amores, A., Cresko, W. A., Johnson, E. A. (2007). Rapid and cost-effective polymorphism identification and genotyping using restriction site associated DNA (RAD) markers. Genome Res. 17 (2), 240–248. doi: 10.1101/gr.5681207

Nagano, S., Hirao, T., Takashima, Y., Matsushita, M., Mishima, K., Takahashi, M., et al. (2020). SNP genotyping with target amplicon sequencing using a multiplexed primer panel and its application to genomic prediction in Japanese cedar, cryptomeria japonica (Lf) d. don. Forests 11 (9), 898. doi: 10.3390/f11090898

Neale, D. B., McGuire, P. E., Wheeler, N. C., Stevens, K. A., Crepeau, M. W., Cardeno, C., et al. (2017). The Douglas-fir genome sequence reveals specialization of the photosynthetic apparatus in pinaceae. G3: Genes Genom. Genet. 7 (9), 3157–3167. doi: 10.1534/g3.117.300078

Niu, S., Li, J., Bo, W., Yang, W., Zuccolo, A., Giacomello, S., et al. (2022). The Chinese pine genome and methylome unveil key features of conifer evolution. Cell 185 (1), 204–217.e14. doi: 10.1016/j.cell.2021.12.006

Nystedt, B., Street, N. R., Wetterbom, A., Zuccolo, A., Lin, Y.-C., Scofield, D. G., et al. (2013). The Norway spruce genome sequence and conifer genome evolution. Nature 497 (7451), 579–584. doi: 10.1038/nature12211

Ouyang, L., Chen, J., Zheng, R., Xu, Y., Lin, Y., Huang, J., et al. (2014). Genetic diversity among the germplasm collections of the chinese fir in 1st breeding population upon ssr markers. J. Nanjing For Univ. (Natural Sci. Edition) 38, 21–26. doi: 10.3969/j.issn.1000-2006.2014.01.004

Peng, Y., Hu, Y., Mao, B., Xiang, H., Shao, Y., Pan, Y., et al. (2016). Genetic analysis for rice grain quality traits in the YVB stable variant line using RAD-seq. Mol. Genet. Genomics 291 (1), 297–307. doi: 10.1007/s00438-015-1104-9

Petit, R. J., Hampe, A. (2006). Some evolutionary consequences of being a tree. Annu. Rev. ecol. evol. syst., 37, 187–214. doi: 10.2307/annurev.ecolsys.37.091305.300

Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D., et al. (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81 (3), 559–575. doi: 10.1086/519795

Shih, K.-M., Chang, C.-T., Chung, J.-D., Chiang, Y.-C., Hwang, S.-Y. (2018). Adaptive genetic divergence despite significant isolation-by-distance in populations of Taiwan cow-tail fir (Keteleeria davidiana var. formosana). Front. Plant Sci. 9. doi: 10.3389/fpls.2018.00092

Stevens, K. A., Wegrzyn, J. L., Zimin, A., Puiu, D., Crepeau, M., Cardeno, C., et al. (2016). Sequence of the sugar pine megagenome. Genetics 204 (4), 1613–1626. doi: 10.1534/genetics.116.193227

Stoehr, M., El-Kassaby, Y. (1997). Levels of genetic diversity at different stages of the domestication cycle of interior spruce in British Columbia. Theor. Appl. Genet. 94 (1), 83–90. doi: 10.1007/s001220050385

Su, Y., Hu, D., Zheng, H. (2016). Detection of SNPs based on DNA specific-locus amplified fragment sequencing in Chinese fir (Cunninghamia lanceolata (Lamb.) hook). Dendrobiology 76, 73–79. doi: 10.12657/denbio.076.007

Sun, X., Liu, D., Zhang, X., Li, W., Liu, H., Hong, W., et al. (2013). SLAF-seq: an efficient method of large-scale de novo SNP discovery and genotyping using high-throughput sequencing. PloS One 8 (3), e58700. doi: 10.1371/journal.pone.0058700

Su, W., Wang, L., Lei, J., Chai, S., Liu, Y., Yang, Y., et al. (2017). Genome-wide assessment of population structure and genetic diversity and development of a core germplasm set for sweet potato based on specific length amplified fragment (SLAF) sequencing. PloS One 12 (2), e0172066. doi: 10.1371/journal.pone.0172066

Tamura, K., Stecher, G., Peterson, D., Filipski, A., Kumar, S. (2013). MEGA6: molecular evolutionary genetics analysis version 6.0. Mol. Biol. Evol. 30 (12), 2725–2729. doi: 10.1093/molbev/mst197

Tsumura, Y., Kimura, M., Nakao, K., Uchiyama, K., Ujino-Ihara, T., Wen, Y., et al. (2020). Effects of the last glacial period on genetic diversity and genetic differentiation in cryptomeria japonica in East Asia. Tree Genet. Genomes 16 (1), 1–14. doi: 10.1007/s11295-019-1411-0

Tsumura, Y., Uchiyama, K., Moriguchi, Y., Kimura, M. K., Ueno, S., Ujino-Ihara, T. (2014). Genetic differentiation and evolutionary adaptation in cryptomeria japonica. G3: Genes Genom. Genet. 4 (12), 2389–2402. doi: 10.1534/g3.114.013896/-/DC1

Tuskan, G. A., Difazio, S., Jansson, S., Bohlmann, J., Grigoriev, I., Hellsten, U., et al. (2006). The genome of black cottonwood, populus trichocarpa (Torr. & Gray). Science 313 (5793), 1596–1604. doi: 10.1126/science.1128691

Van Inghelandt, D., Melchinger, A. E., Lebreton, C., Stich, B. (2010). Population structure and genetic diversity in a commercial maize breeding program assessed with SSR and SNP markers. Theor. Appl. Genet. 120 (7), 1289–1299. doi: 10.1007/s00122-009-1256-2

Van Tassell, C. P., Smith, T. P., Matukumalli, L. K., Taylor, J. F., Schnabel, R. D., Lawley, C. T., et al. (2008). SNP discovery and allele frequency estimation by deep sequencing of reduced representation libraries. Nat. Methods 5 (3), 247–252. doi: 10.1038/nmeth.1185

Vignal, A., Milan, D., SanCristobal, M., Eggen, A. (2002). A review on SNP and other types of molecular markers and their use in animal genetics. Genet. Select. Evol. 34 (3), 275–305. doi: 10.1051/gse:2002009

Wang, Z., Kang, M., Liu, H., Gao, J., Zhang, Z., Li, Y., et al. (2014). High-level genetic diversity and complex population structure of Siberian apricot (Prunus sibirica L.) in China as revealed by nuclear SSR markers. PloS One 9 (2), e87381. doi: 10.1371/journal.pone.0087381

Wang, S., Meyer, E., McKay, J. K., Matz, M. V. (2012). 2b-RAD: a simple and flexible method for genome-wide genotyping. Nat. Methods 9 (8), 808–810. doi: 10.1038/nmeth.202

Whitney, T. D., Gandhi, K. J., Hamrick, J., Lucardi, R. D. (2019). Extant population genetic variation and structure of eastern white pine (Pinus strobus L.) in the southern Appalachians. Tree Genet. Genomes 15 (5), 1–19. doi: 10.1007/s11295-019-1380-3

Xia, W., Luo, T., Zhang, W., Mason, A. S., Huang, D., Huang, X., et al. (2019). Development of high-density SNP markers and their application in evaluating genetic diversity and population structure in Elaeis guineensis. Front. Plant Sci. 10. doi: 10.3389/fpls.2019.00130

Yang, H., Liao, H., Zhang, W., Pan, W. (2020). Genome-wide assessment of population structure and genetic diversity of Eucalyptus urophylla based on a multi-species single-nucleotide polymorphism chip analysis. Tree Genet. Genomes 16 (3), 1–11. doi: 10.1007/s11295-020-1422-x

Yang, X., Yang, Z., Li, H. (2018). Genetic diversity, population genetic structure and protection strategies for Houpoëa officinalis (Magnoliaceae), an endangered Chinese medical plant. J. Plant Biol. 61 (3), 159–168. doi: 10.1007/s12374-017-0373-8

You, Y., Hong, J. (1998). Application of RAPD marker to genetic variation of Chinese fir provenances. Sci. Silvae Sinicae 34, 34–40. doi: 10.3321/j.issn:1001-7488.1998.04.005

Zhang, D., Xia, T., Dang, S., Fan, G., Wang, Z. (2018). Investigation of Chinese wolfberry (Lycium spp.) germplasm by restriction site-associated DNA sequencing (RAD-seq). Biochem. Genet. 56 (6), 575–585. doi: 10.1007/s10528-018-9861-x

Zheng, H., Duan, H., Hu, D., Wei, R., Li, Y. (2015). Sequence-related amplified polymorphism primer screening on Chinese fir (Cunninghamia lanceolata (Lamb.) Hook). J. For. Res. 26 (1), 101–106. doi: 10.1007/s11676-015-0025-0

Zheng, H., Hu, D., Wei, R., Yan, S., Wang, R. (2019). Chinese fir breeding in the high-throughput sequencing era: Insights from SNPs. Forests 10 (8), 681. doi: 10.3390/f10080681

Zhou, W., Ji, X., Obata, S., Pais, A., Dong, Y., Peet, R., et al. (2018). Resolving relationships and phylogeographic history of the Nyssa sylvatica complex using data from RAD-seq and species distribution modeling. Mol. Phylogenet. Evol. 126, 1–16. doi: 10.1016/j.ympev.2018.04.001

Zhou, L., Luo, L., Zuo, J. F., Yang, L., Zhang, L., Guang, X., et al. (2016). Identification and validation of candidate genes associated with domesticated and improved traits in soybean. Plant Genome 9 (2), 1–17. doi: 10.3835/plantgenome2015.09.0090

Zimin, A. V., Stevens, K. A., Crepeau, M. W., Puiu, D., Wegrzyn, J. L., Yorke, J. A., et al. (2017). An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing. Gigascience 6 (1), 1–4. doi: 10.1093/gigascience/giw016

Zurn, J. D., Nyberg, A., Montanari, S., Postman, J., Neale, D., Bassil, N. (2020). A new SSR fingerprinting set and its comparison to existing SSR and SNP based genotyping platforms to manage Pyrus germplasm resources. Tree Genet. Genomes 16 (5), 1–10. doi: 10.1007/s11295-020-01467-7

Keywords: Chinese fir, breeding population, RAD-seq, genetic structure, genetic diversity

Citation: Jing Y, Bian L, Zhang X, Zhao B, Zheng R, Su S, Ye D, Zheng X, El-Kassaby YA and Shi J (2023) Genetic diversity and structure of the 4th cycle breeding population of Chinese fir (Cunninghamia lanceolata (Lamb.) Hook). Front. Plant Sci. 14:1106615. doi: 10.3389/fpls.2023.1106615

Received: 23 November 2022; Accepted: 16 January 2023;
Published: 27 January 2023.

Edited by:

Jianjun Chen, University of Florida, United States

Reviewed by:

Baosheng Wang, South China Botanical Garden (CAS), China
Haidong Yan, University of Georgia, United States

Copyright © 2023 Jing, Bian, Zhang, Zhao, Zheng, Su, Ye, Zheng, El-Kassaby and Shi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Liming Bian, Lmbian@njfu.edu.cn
