Digestion efficiency differences of restriction enzymes frequently used for genotype-by-sequencing technology

Yong Suk Chung1, Taehwan Jun2, Changsoo Kim1*

Abstract

With the development of next-generation sequencing (NGS), a cutting-edge technology, genotype-by-sequencing (GBS) has become available at a low cost per sample. GBS makes it possible to customize library preparation to obtain high-quality single nucleotide polymorphisms (SNPs) in the most efficient way. However, a GBS library is hard to construct because the concentration of each reagent and the set-up must be fine-tuned. A major difficulty is the presence of undigested genomic DNA (gDNA): the digestion efficiency of restriction enzymes differs among species for unknown reasons. This proof-of-concept study therefore demonstrates the unpredictable patterns of enzyme digestion across various plants so that readers are aware of the caution needed when choosing restriction enzymes for GBS library preparation. Indeed, no pattern was found in the digestibility of the gDNA samples by the restriction enzymes tested in the current study. We suggest that more data should be accumulated on this matter to help researchers who want to apply GBS technologies in a variety of genetic approaches.

Keyword



Introduction

With the advent of next-generation sequencing (NGS), a cutting-edge technology, genotype-by-sequencing (GBS), has emerged for the sequencing of multiplexed samples (Elshire et al., 2011). It performs molecular marker discovery and genotyping at the same time (Poland and Rife, 2012; He et al., 2014; Kim et al., 2016). Because of its low cost and effectiveness, it has been applied to the large numbers of samples generated by various genetic or breeding populations such as conventional biparental populations, advanced backcross populations, nested association mapping populations, and diversity panels from many plant species. Although many commercial kits are currently available, the cost per sample is still high, particularly when handling many samples from such large-scale populations. GBS makes it possible to customize library preparation, dramatically reducing the cost per sample (data point). It is by far the most efficient way to get high-quality single nucleotide polymorphisms (SNPs) (Gore et al., 2007; Gore et al., 2009). Accordingly, many genetics laboratories have used publicly available protocols or set up their own protocols to reduce genotyping costs.

However, constructing GBS libraries involves multiple steps that combine different basic techniques of molecular biology, including restriction digestion, ligation, purification, and polymerase chain reaction (PCR). Consequently, library preparation tends to be error-prone and troubleshooting becomes an issue in its own right. The quality control of GBS libraries therefore has to be taken seriously in order not to waste resources and labor. In our multi-year experience with the GBS procedure, we have encountered different issues that led to the failure of the entire experiment, and they can occur at any step. Most recently, our library preparation failed repeatedly for unknown reasons, and we excluded potential causes step by step. With the help of fragment analysis, the main source of failure was traced to the restriction digestion, which was unexpected.

One reason why restriction enzymes (REs) are used in library preparation is to control genomic representation indirectly. GBS relies on low-coverage genomic data; however, if the coverage for each locus is less than 2, it generates a plethora of missing or false genotype data. Since the genome sizes of plants vary, researchers use different combinations of REs (e.g. four-, five-, or six-base cutters) according to their cutting probabilities so as to increase the coverage at each locus. Some may want to use methylation-sensitive REs to focus on the euchromatic regions of genomes, enriching coding regions in a GBS library. Sometimes REs do not work well because of star activity, in which REs cleave sequences that are similar but not identical to their recognition sites. This can be overcome by using high-fidelity REs provided by major suppliers. However, the most important point is that the efficiency of restriction digestion varies among plants, which means that prior knowledge of RE cutting profiles is needed for as many plant species as possible. A basic way to visualize those profiles is gel electrophoresis, but it does not offer enough resolution to see whether the fragments are well formed and it requires quite large amounts of digested gDNA. The efficiency of restriction digestion determines that of the downstream steps in GBS library preparation.
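The link between recognition-site length and per-locus coverage can be made concrete with a back-of-the-envelope calculation. The sketch below assumes an idealized random genome with equal base frequencies, in which an n-base recognition site occurs on average once every 4^n bp; real genomes deviate from this because of base composition, repeats, and methylation. The 400 Mb genome size and the 2 million reads per sample are hypothetical values chosen only for illustration, and the function names are ours.

```python
def expected_site_spacing(site_length: int) -> int:
    """Mean distance (bp) between sites of an n-base cutter in random DNA."""
    return 4 ** site_length


def expected_fragments(genome_size_bp: int, site_length: int) -> float:
    """Approximate fragment count after a complete single-enzyme digest."""
    return genome_size_bp / expected_site_spacing(site_length)


def mean_reads_per_fragment(reads: int, genome_size_bp: int, site_length: int) -> float:
    """Crude coverage estimate: reads spread evenly over all fragments.

    This ignores that only a size-limited subset of fragments is sequenced.
    """
    return reads / expected_fragments(genome_size_bp, site_length)


GENOME_SIZE_BP = 400_000_000  # hypothetical genome size
READS_PER_SAMPLE = 2_000_000  # hypothetical sequencing depth

for n in (4, 5, 6):
    fragments = expected_fragments(GENOME_SIZE_BP, n)
    coverage = mean_reads_per_fragment(READS_PER_SAMPLE, GENOME_SIZE_BP, n)
    print(f"{n}-base cutter: ~{fragments:,.0f} fragments, ~{coverage:.2f} reads per fragment")
```

With these illustrative numbers, a four-base cutter gives roughly 1.3 reads per fragment, below the coverage of 2 mentioned above, whereas a six-base cutter gives roughly 20.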

To sum up, GBS is low in cost, reduces sample handling, requires fewer PCR and purification steps, needs no size fractionation, has no reference sequence limits, uses efficient barcoding, and is easy to scale up (Davey et al., 2011). These advantages make GBS a very powerful tool for many kinds of plant genetic studies, including genetic mapping, association mapping, genome-wide association studies (GWAS), genomic selection, polyploidy, and genetic-diversity studies. This is possible not only because these features make GBS highly reproducible but also because of the extremely high specificity of enzyme digestion sites.

As briefly stated above, there are sites in genomic DNA (gDNA) that cannot be cleaved with methylation-sensitive REs (Susan et al., 1994). Thus, gDNA digestion protocols should take into consideration the probability of finding target sites across a genome of hundreds of megabases. Nevertheless, for unknown reasons, some gDNA samples were apparently not digested in this study. This causes a serious problem in constructing GBS libraries. Accordingly, the objective of this proof-of-concept study is to profile the unpredictable patterns of enzyme digestion of various gDNA samples so that readers will be cautious when choosing the REs for their GBS library construction.

Materials and Methods

Plant materials

Six diverse plant species were randomly selected: from monocots, zoysiagrass (Zoysia japonica Steud., Chloridoideae subfamily), rice (Oryza sativa L., Oryzoideae subfamily), and sorghum (Sorghum bicolor L., Panicoideae subfamily); and from dicots, lettuce (Lactuca sativa, Cichorioideae subfamily), perilla (Perilla frutescens, Lamiaceae family), and tomato (Solanum lycopersicum, Solanoideae subfamily). These plants cover quite a number of subfamilies from monocots to dicots. The plants were grown in the greenhouse at Chungnam National University in Daejeon, Korea, and the gDNA of each sample was extracted using a CTAB method (Doyle and Doyle, 1987) and diluted to 40 ng/μL. To obtain intact DNA, each sample was frozen in liquid nitrogen and homogenized with a mortar and pestle or a mechanical homogenizer (Honeycutt et al., 1992; Guillemaut and Maréchal-Drouard, 1992), followed by phenol-chloroform-isoamyl alcohol extraction (Zhu et al., 1993) to obtain clean DNA samples so that the REs would not be blocked by contaminating proteins during digestion.

REs and digestion conditions

Ten REs that are frequently used for GBS preparations were selected (Table 1). Two of them were methylation-sensitive and the others were methylation-insensitive. Recognition sites ranged from 4 to 6 bases. The REs came from two different companies (New England Biolabs, MA, U.S.A. and Enzynomics, Daejeon, Korea); their quality is widely accepted and did not vary between suppliers (unpublished data). Eight units of each RE were incubated with 2.0 μL of the corresponding enzyme buffer and 3.0 μL of gDNA (40 ng/μL, 120 ng in total) in a total volume of 20 μL for 2 hours at 37℃, followed by further incubation for 20 minutes at 65℃. Because the stock concentrations of the REs differed, the volume of RE added varied, and the volume of H2O was adjusted accordingly.
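The water volume for each reaction follows from simple arithmetic once the stock concentration of the enzyme is known. The sketch below is a minimal illustration rather than the authors' worksheet: the stock concentrations (10 and 20 units/μL) are hypothetical examples and the function name is ours, while the 8 units of RE, 2.0 μL of buffer, 3.0 μL of gDNA, and 20 μL total volume come from the conditions described above.

```python
def reaction_volumes(stock_units_per_ul: float,
                     units_needed: float = 8.0,
                     buffer_ul: float = 2.0,
                     gdna_ul: float = 3.0,
                     total_ul: float = 20.0) -> dict:
    """Return the volume (uL) of each component in one digestion reaction."""
    enzyme_ul = units_needed / stock_units_per_ul
    # Water makes up whatever volume the other components leave free.
    water_ul = total_ul - buffer_ul - gdna_ul - enzyme_ul
    if water_ul < 0:
        raise ValueError("components exceed the total reaction volume")
    return {"enzyme": enzyme_ul, "buffer": buffer_ul,
            "gDNA": gdna_ul, "water": water_ul}


# Hypothetical stock concentrations, for illustration only.
for stock in (10.0, 20.0):
    volumes = reaction_volumes(stock)
    print(f"{stock:g} U/uL stock:",
          ", ".join(f"{name} {vol:.1f} uL" for name, vol in volumes.items()))
```

A more concentrated stock needs a smaller enzyme volume, so the water volume rises to keep the total at 20 μL.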

Table 1. List of restriction enzymes used in the current study.

http://dam.zipot.com:8080/sites/kjoas/images/N0030440302_image/Table_KJAOS_44_03_02_T1.jpg

y Enzynomics.

z New England BioLabs.

Fragment analysis

DNA fragments generated by the REs were detected and visualized using a Q-Analyzer (Wind Hill Technologies Co., Ltd, Shanghai, China). The program method was M-4-10-06-300, with sample injection at 3 kV for 10 seconds and separation at 6 kV for 300 seconds, covering a range of 15 - 1,000 bp at an accuracy of 2 - 4 bp. The cartridge type was S1 (high-resolution cartridge) and the alignment marker was MA-1.

Results and Discussion

Notably, the gDNA cutting rates of the ten REs on the six plant species did not exceed 50 percent (data not shown), which is very low. The gDNAs of zoysiagrass, rice, and perilla were cleaved by five of the 10 REs (Table 2, Fig. 1). This pattern did not depend on the kind of RE, the number of bases in the recognition site, or methylation sensitivity. Likewise, the gDNAs of sorghum and tomato were cleaved by four REs and that of lettuce by one RE, in an irregular pattern. No particular pattern was observed when the samples were grouped into monocots and dicots. In addition, the length of the recognition site did not affect the digestion profiles. Methylation sensitivity also does not seem to be an important factor in the cleavage of gDNA, based on the results for the methylation-sensitive enzymes HpaII and NsiI (Comb and Goodman, 1990). The purpose of using methylation-sensitive enzymes in the GBS process is to increase coverage at each genetic locus. Indeed, methylated DNA is predominantly found in heterochromatic regions, where many repeated sequences such as transposable elements reside. However, many whole-genome studies have shown that a large portion of genomic DNA is also methylated in euchromatic regions, which are gene-rich (Arabidopsis Genome Initiative, 2000; Paterson et al., 2009; Schnable et al., 2009). Therefore, one needs to be very cautious in using methylation-sensitive REs for GBS because useful genomic information can be missed. Furthermore, the recognition sites do not seem to influence the distribution of DNA fragment sizes, which is related not only to the number of bases recognized but also to the RE dosage and incubation time. This should be investigated further because many different factors can contribute to the efficiency of REs. However, the results of this study may not represent the patterns in other species. Thus, it will be necessary to increase the diversity of plants as well as the kinds of REs tested.
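The per-species counts reported above can be turned into success rates with a few lines of arithmetic. The sketch below uses only the counts stated in the text (the enzyme-by-species detail is in Table 2); the variable names are ours.

```python
TOTAL_RES = 10  # number of restriction enzymes tested per species

# Number of REs that digested each species' gDNA, as stated in the text.
digested = {
    "zoysiagrass": 5,
    "rice": 5,
    "perilla": 5,
    "sorghum": 4,
    "tomato": 4,
    "lettuce": 1,
}

# Fraction of the ten REs that cut each species' gDNA.
rates = {species: count / TOTAL_RES for species, count in digested.items()}

# Fraction of all enzyme-by-species combinations that gave digestion.
total_digested = sum(digested.values())
total_combinations = TOTAL_RES * len(digested)
overall = total_digested / total_combinations

for species, rate in rates.items():
    print(f"{species}: {rate:.0%}")
print(f"overall: {total_digested}/{total_combinations} = {overall:.0%}")
```

Three species reach 50 percent and none exceeds it; across all combinations, 24 of 60 digestions (40 percent) succeeded.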

Table 2. The digestibility of each restriction enzyme for six species.

http://dam.zipot.com:8080/sites/kjoas/images/N0030440302_image/Table_KJAOS_44_03_02_T2.jpg

y Indicates gDNA was digested.

z Indicates gDNA was not digested.

Because this was a preliminary study to test whether every RE could cut every gDNA, we used the minimum amount of gDNA detectable by the fragment analyzer. Therefore, the peaks are not visually obvious, and more gDNA should be used in order to quantify the gDNA fragments of different sizes. Nevertheless, the fragmentation by REs could be profiled according to fragment size using the fragment analyzer. Note that undigested gDNA is not shown because intact gDNA (around 20 kb) is too large to be seen within our analyzable range.
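Why intact gDNA leaves no peak can be shown with a simple range check. The sketch assumes the 15 - 1,000 bp separation range given in the Fragment analysis section and the approximate 20 kb size of intact gDNA stated above; the two smaller fragment sizes are hypothetical examples, and the function name is ours.

```python
DETECTION_RANGE_BP = (15, 1000)  # separation range of the method used


def is_detectable(size_bp: int, window: tuple = DETECTION_RANGE_BP) -> bool:
    """True if a fragment of this size falls inside the analyzable range."""
    low, high = window
    return low <= size_bp <= high


sizes_bp = {
    "hypothetical fragment A": 120,
    "hypothetical fragment B": 850,
    "intact gDNA (approximate)": 20_000,
}

for label, size in sizes_bp.items():
    status = "within range" if is_detectable(size) else "outside range"
    print(f"{label}: {size} bp, {status}")
```

An absent peak therefore cannot by itself distinguish intact gDNA from fragments longer than 1,000 bp; both fall outside the window.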

In summary, no pattern was found in digestibility across the different REs or plant species, which is unexpected and very interesting. This phenomenon is likely to perplex GBS users because their gDNA samples may not be cut by the REs of their choice. Thus, it is very important to let them know that not every RE can cut every gDNA.

Fig. 1.

Digestibility of plant genomic DNA samples by 10 different restriction enzymes. The X-axis and Y-axis of each graph represent relative migration time (minutes) and relative fluorescence units (RFU), respectively. Peaks marked with red circles indicate the profiles of DNA fragments generated by restriction digestion.

http://dam.zipot.com:8080/sites/kjoas/images/N0030440302_image/Figure_KJAOS_44_03_02_F1.jpg

GBS uses a combination of low-frequency and high-frequency cutters to digest gDNA; a barcoded adapter is ligated to one restriction site and a common adapter to the other (Poland and Rife, 2012; He et al., 2014). Therefore, the selection of REs for GBS approaches is crucial, especially because, as demonstrated in the current study, enzyme cleavage does not always work properly for unknown reasons. Why this phenomenon occurs would be an interesting topic for further study. Meanwhile, it would be very valuable to accumulate data on the digestibility of many other plant species with more REs. Such data would help researchers avoid wasting time and money on failed gDNA digestion by guiding a cautious choice of REs during experimental setup for GBS, before they test the chosen REs on their own gDNA samples.
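The two-enzyme scheme can be illustrated in silico. The sketch below is a simplified illustration, not part of the authors' protocol: it scans a made-up toy sequence for the recognition sites of NsiI (ATGCAT) and HpaII (CCGG), the two enzymes named in the Results, and keeps only fragments bounded by one site of each enzyme, the kind that would carry a barcoded adapter on one end and a common adapter on the other. The pairing of these two enzymes, the toy sequence, and the function names are assumptions made for illustration; fragment lengths are measured between site start positions rather than exact cut positions, and methylation is ignored.

```python
import re

# Recognition sequences (both palindromic, so one strand is enough).
SITES = {"NsiI": "ATGCAT", "HpaII": "CCGG"}


def find_sites(seq: str, sites: dict = SITES) -> list:
    """Return sorted (position, enzyme) pairs for every recognition site."""
    hits = []
    for enzyme, motif in sites.items():
        # A zero-width lookahead also reports overlapping occurrences.
        for match in re.finditer(f"(?={motif})", seq):
            hits.append((match.start(), enzyme))
    return sorted(hits)


def mixed_end_fragments(seq: str) -> list:
    """Return (start, end, length) for fragments flanked by different enzymes."""
    hits = find_sites(seq)
    fragments = []
    for (start, left), (end, right) in zip(hits, hits[1:]):
        if left != right:
            fragments.append((start, end, end - start))
    return fragments


# Toy sequence, assembled piece by piece so the sites are easy to spot.
toy = ("AAAT" + "ATGCAT" + "GGGTTTAAA" + "CCGG" + "TTTT"
       + "CCGG" + "AAAC" + "ATGCAT" + "TT")

for start, end, length in mixed_end_fragments(toy):
    print(f"fragment from {start} to {end}: {length} bp")
```

In the toy sequence, the stretch between the two HpaII sites is skipped because both of its ends come from the same enzyme. If either enzyme fails to cut, no mixed-end fragments arise at all, which is why digestion efficiency matters for this design.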

Acknowledgements

This work was supported by the National Agricultural Genome Program (#PJ0122762017, Rural Development Administration). We also thank Ji Won Kang for assisting with this experiment.

References

1 Arabidopsis Genome Initiative. 2000. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796-815.  

2 Comb M, Goodman HM. 1990. CpG methylation inhibits proenkephalin gene expression and binding of the transcription factor AP-2. Nucleic Acids Research 18:3975-3982. 

3 Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML. 2011. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics 12:499-510. 

4 Doyle J, Doyle JL. 1987. Genomic plant DNA preparation from fresh tissue-CTAB method. Phytochemical Bulletin 19:11-5. 

5 Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, Mitchell SE. 2011. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE 6:e19379.

6 Gore M, Bradbury P, Hogers R, Kirst M, Verstege E, van Oeveren J, Peleman J, Buckler E, van Eijk M. 2007. Evaluation of target preparation methods for single-feature polymorphism detection in large complex plant genomes. Crop Science 47:S-135-S-148. 

7 Gore MA, Chia J-M, Elshire RJ, Sun Q, Ersoz ES, Hurwitz BL, Peiffer JA, McMullen MD, Grills GS, Ross-Ibarra J. 2009. A first-generation haplotype map of maize. Science 326:1115-1117. 

8 Guillemaut P, Maréchal-Drouard L. 1992. Isolation of plant DNA: A fast, inexpensive, and reliable method. Plant Molecular Biology Reporter 10:60-5.  

9 He J, Zhao X, Laroche A, Lu ZX, Liu H, Li Z. 2014. Genotyping-by-sequencing (GBS), an ultimate marker-assisted selection (MAS) tool to accelerate plant breeding. Frontiers in Plant Science 5:484. 

10 Honeycutt RJ, Sobral BW, Keim P, Irvine JE. 1992. A rapid DNA extraction method for sugarcane and its relatives. Plant Molecular Biology Reporter 10:66-72. 

11 Kim C, Guo H, Kong W, Chandnani R, Shuang L-S, Paterson AH. 2016. Application of genotyping by sequencing technology to a variety of crop breeding programs. Plant Science 242:14-22. 

12 Paterson AH, Bowers JE, Bruggmann R, Dubchak I, Grimwood J, Gundlach H, Haberer G, Hellsten U, Mitros T, Poliakov A, Schmutz J, Spannagl M, Tang H, Wang X, Wicker T, Bharti AK, Chapman J, Feltus FA, Gowik U, Grigoriev IV, Lyons E, Maher CA, Martis M, Narechania A, Otillar RP, Penning BW, Salamov AA, Wang Y, Zhang L, Carpita NC, Freeling M, Gingle AR, Hash CT, Keller B, Klein P, Kresovich S, McCann MC, Ming R, Peterson DG, Rahman M, Ware D, Westhoff P, Mayers KF, Messing J, Rokhsar DS. 2009. The Sorghum bicolor genome and the diversification of grasses. Nature 457:551-556. 

13 Poland JA, Rife TW. 2012. Genotyping-by-sequencing for plant breeding and genetics. Plant Genome 5:92-102. 

14 Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA, Minx P, Reily AD, Courtney L, Kruchowski SS, Tomlinson C, Strong C, Delehaunty K, Fronick C, Courtney B, Rock SM, Belter E, Du F, Kim K, Abbott RM, Cotton M, Levy A, Marchetto P, Ochoa K, Jackson SM, Gillam B, Chen W, Yan L, Higginbotham J, Cardenas M, Waligorski J, Applebaum E, Phelps L, Falcone J, Kanchi K, Thane T, Scimone A, Thane N, Henke J, Wang T, Ruppert J, Shah N, Rotter K, Hodges J, Ingenthron E, Cordes M, Kohlberg S, Sgro J, Delgado B, Mead K, Chinwalla A, Leonard S, Crouse K, Collura K, Kudrna D, Currie J, He R, Angelova A, Rajasekar S, Mueller T, Lomeli R, Scara G, Ko A, Delaney K, Wissotski M, Lopez G, Campos D, Braidotti M, Ashley E, Golser W, Kim H, Lee S, Lin J, Dujmic Z, Kim W, Talag J, Zuccolo A, Fan C, Sebastian A, Kramer M, Spiegel L, Nascimento L, Zutavern T, Miller B, Ambroise C, Muller S, Spooner W, Narechania A, Ren L, Wei S, Kumari S, Faga B, Levy MJ, McMahan L, Van Buren P, Vaughn MW, Ying K, Yeh CT, Emrich SJ, Jia Y, Kalyanaraman A, Hsia AP, Barbazuk WB, Baucom RS, Brutnell TP, Carpita NC, Chaparro C, Chia JM, Deragon JM, Estill JC, Fu Y, Jeddeloh JA, Han Y, Lee H, Li P, Lisch DR, Liu S, Liu Z, Nagel DH, McCann MC, SanMiguel P, Myers AM, Nettleton D, Nguyen J, Penning BW, Ponnala L, Schneider KL, Schwartz DC, Sharma A, Soderlund C, Springer NM, Sun Q, Wang H, Waterman M, Westerman R, Wolfgruber TK, Yang L, Yu Y, Zhang L, Zhou S, Zhu Q, Bennetzen JL, Dawe RK, Jiang J, Jiang N, Presting GG, Wessler SR, Aluru S, Martienssen RA, Clifton SW, McCombie WR, Wing RA, Wilson RK. 2009. The B73 maize genome: Complexity, diversity, and dynamics. Science 326:1112-1115.  

15 Susan JC, Harrison J, Paul CL, Frommer M. 1994. High sensitivity mapping of methylated cytosines. Nucleic Acids Research 22:2990-2997.

16 Zhu H, Qu F, Zhu LH. 1993. Isolation of genomic DNAs from plants, fungi and bacteria using benzyl chloride. Nucleic Acids Research 21:5279.