Introduction
In Korea, breeding has used the Duroc breed and terminal sires in a three-way crossbred system. Korean pork consumption has shown a unique pattern with a strong preference for high-fat cuts such as belly, Boston butt, and rib (Choe et al., 2015).
Since the last century, traditional selective pig breeding methods have aimed to significantly improve economically important genetic traits such as growth rate, backfat thickness, and feed conversion ratio (Choi et al., 2015). In addition, animal improvement programs have dramatically accelerated the search for complex traits using genome-wide association studies (GWAS), such as quantitative trait loci (QTL) (Liu et al., 2020). The August 23, 2021 release 45 of the Animal QTL Database listed 33,540 QTLs in pigs (Pig QTLdb, 2021).
In pigs, GWAS has been conducted based on cDNA microarrays, whole genome sequencing (WGS), RNA sequencing (RNAseq) and the single-nucleotide polymorphism (SNP) chips for meat quality-related traits (Lee et al., 2012; Jeong et al., 2015; Wu et al., 2020; Zappaterra et al., 2021). In the Yorkshire breeds, genome-wide research has identified growth traits related to the candidate genes mitogen-activated protein kinase 6 (MAP2K6), phospholipase C beta 1 (PLCB1), rho GTPASE activating protein 24 (ARHGAP24), GLI family zinc finger 2 (GLI2), neuronal tyrosine-phosphorylated phosphoinositide-3-kinase adaptor 2 (NYAP2), and zinc finger protein multitype 2 (ZFPM2) (Meng et al., 2017). Researchers also identified a regulatory gene network using GWAS with the AT-rich interaction domain 3B (ARID3B), glial cell missing homolog 1 (GCM1), and GLI2 genes for economic pig characteristics such as body fat thickness in purebred pigs (Lee et al., 2019).
Generally, WGS can detect genetic variants, but did not accuracy is lower that larger indels are, indeed, only detected poorly in DNA (Daetwyler et al., 2014; Falardeau et al., 2016). Despite recent decreases in cost, WGS of association panels comprising large numbers of samples remains cost-prohibitive. However, GWAS results showed that the leading SNPs for complex traits were located in non-coding regions and the exonic parts of the whole genome, representing 1.5 - 2% of the whole genome in pigs and humans (Choi et al., 2015; Liu et al., 2020). Whole-exome sequencing (WES) usually identifies around 20,000 different variants in the DNA (Salmaninejad et al., 2019).
However, no WES analysis has been conducted to date to detect variants of genes associated with meat qualities in purebred and crossbred pigs (Landrace × Yorkshire × Duroc [LYD]). This study used WES methods to associate genetic variants with loin muscle carcass quality in crossbred pigs.
Materials and Methods
Animals and sample collection
Pork carcasses are graded in terms of loin muscle quality and conformation based on the Korea Institute for Animal Products Quality Evaluation (KAPE) and scored as first plus (1+), first (1), or second (2), based on marbling, lean color, backfat thickness, and body weight (KAPE, 2018). Ten crossbred pigs (LYD) were determined to be grades 1+ (Class I, n = 5) and 2 (Class II, n = 5) after being slaughtered according to the KAPE standards. Quality details of the carcass tenderloin muscle characteristics were reported previously (Kim et al., 2020). For WES analysis, ten samples of loin muscles were immediately frozen and stored at -80℃ until DNA extraction.
DNA sample preparation and sequencing
Genomic DNA was extracted from the tissue samples (50 µg·sample-1) using the Roche DNA Extraction Kit (Roche Diagnostics, Mannheim, Germany) following the manufacturer’s standard protocol. The procedure included a proteinase K digestion followed by 56℃ column purification to extract high-quality DNA. DNA degradation and contamination were monitored on 1% agarose gels, and the DNA concentration was measured using the Qubit® DNA Assay Kit in a Qubit® 2.0 Fluorometer (Life Technologies, Carlsbad, CA, USA).
Library preparation for sequencing
A total amount of 1.0 μg of genomic DNA per sample was used as input material for the DNA library preparation. Sequencing libraries were generated using 130104_Sscr_10_2_MW_EZ_HX1 (Roche NimbleGen Inc., Pleasanton, CA, USA) following the manufacturer’s recommendations, and index codes were added to each sample. Briefly, fragmentation was carried out using a hydrodynamic shearing system (Covaris Inc., Woburn, MA, USA) to generate 180 - 280 base pair (bp) fragments. The remaining overhangs were converted into blunt ends via exonuclease/polymerase activities, and the enzymes were removed. Adapter oligonucleotides were ligated after adenylation of the 3' ends of the DNA fragments. DNA fragments with ligated adapter molecules on both ends were selectively enriched by polymerase chain reaction (PCR). Following PCR, liquid-phase hybridization was performed on the library using a biotin-labeled probe, after which streptomycin-coated magnetic beads were used to capture the gene exons. Captured libraries were enriched via PCR, and index tags were added to prepare for hybridization. The products were purified using the AMPure XP system (Beckman Coulter, Brea, CA, USA) and quantified using the Agilent High Sensitivity DNA Kit on the Agilent Bioanalyzer 2100 system (Agilent, Santa Clara, CA, USA).
Whole-exome sequencing (WES) bioinformatics analysis
Sequence reads were filtered for base quality and adapter trimming using Trimmomatic ver. 0.33 (Bolger et al., 2014). Reads were trimmed if four consecutive bases had an average Phred quality score of < 20. After trimming, only pairs of DNA sequences for which each read exceeded 35 bp were retained for analysis. Sequence reads were aligned to the pig reference genome (Sus scrofa 10.2) using the Burrows-Wheeler Aligner (BWA-MEM) ver. 0.7, which is used to map low-divergent sequences against a large reference genome (Li and Durbin, 2009). The alignments were sorted according to their coordinates and converted into the BAM format using SAMtools ver. 1.2 (Ramirez-Gonzalez et al., 2012). Next, the data were formatted for variant calling using Picard ver. 1.135, and duplicate reads were marked (Li et al., 2009).
We used the Genome Analysis Toolkit (GATK) ver. 2.6-4 (McKenna et al., 2010; DePristo et al., 2011; Van der Auwera et al., 2013) for downstream SNP and indel calling. We used Realigner Target Creator to identify poorly mapped regions (nearby indels) from the alignments and realigned these regions using Indel Realigner. Next, Unified Genotyper was used to call the SNPs and indels with a Phred scale quality > 30. After SNP calling, we used VariantFiltration to discard sequencing and alignment artifacts from the SNPs with the parameters “MQ0 ≥ 4 && ((MQ0/(1.0 * DP)) > 0.1)”, “SB ≥ -1.0, QUAL<10,” and “QUAL <30.0 || QD < 5.0 || HRun> 5 || SB >-0.10” and from the indels with the parameters “QD < 2.0,” “FS > 200.0,” and “Read Pos Rank Sum< -20.0.” All the variants that passed the above filtering criteria were used in the downstream analysis. ANNOVAR (Wang et al., 2010) was used to annotate the functions of the variants (exonic, intronic, 5' and 3' untranslated regions, splicing, and intergenic). SNPs identified in the exonic regions were classified as synonymous or non-synonymous SNPs.
Validation
A set of validated SNPs was used that the Kompetitive allele specific PCR (KASP) were used to confirm the WES results in the loin muscles (102 SNPs). We have used the flanking sequences with evaluated SNPs indicated by brackets to design KASP assays (Supplementary Table 1). All custom SNP assays were designed by the corresponding company (LGC Genomics, Hoddesdon, Herts, UK). The KASP assays were performed in a volume of 5 μL using 384-well plates, and low ROX was used as a passive reference dye. Two probes ensured the allelic specificity of the TaqMan assay; one was labeled with FAM and the other with VIC. In the KASP assays, bi-allelic discrimination was achieved through the competitive binding of two allele-specific forward primers, one labeled with FAM and the other with HEX. These different reporter dyes were detected independently on real-time quantitative PCR instruments with excitation sources and emission filters in the respective wavelengths. Genotyping was carried out on the QuantStudio 5 Real-Time PCR System (Applied Biosystems, Waltham, MA, USA).
Statistical analysis
We performed analysis of variance and χ2 tests for continuous and categorical variables on our data. All statistical analyses were conducted using the SAS software (version 9.3, SAS Institute, Inc., Cary, NC, USA).
Results
On average, approximately 1.0 × 108 unique good-quality reads were generated per exome for the first plus-grade and second-grade groups (Supplementary Table 2). In the first-grade group, the total number of reads per sample ranged from 20,061,291 to 25,268,416, (median: 22,865,861). In the second-grade group, the number of reads ranged from 18,447,182 to 24,652,411 (median: 21,475,165). The sequence reads of each breed were aligned to the pig reference genome (Sus scrofa 10.2) from the Ensembl database. The mean coverage depth was 57.07 - 84.14 with 61 - 63% uniformity for amplicon coverage in the targeted regions with at least 4 times the target bases, and coverage of the median proportion with > 20 times coverage was 77.76% (Supplementary Table 3). Whole-exome variant analysis identified 489,279 variants in each of the first plus-grade and second-grade groups. The hetero rate (%) and Ts/Tv ratios for each whole exome ranged from 0.089 to 0.108 and from 2.366 to 2.400, respectively (Supplementary Table 4).
For additional quality control, we selected SNPs with a minor allele frequency of none or lower samples using the χ2 test. We identified 439 variants in each of calling rate with ≥ 100% and depth > 340. Using log-linear models, we selected an association with meat quality at an unadjusted significance level of p< 0.05. We assessed 13 candidate genes for changes in carcass characteristics in loin muscle to determine whether the candidate genes played a role in meat quality (Fig. 1). The SNPs associated with meat quality were located on chromosomes 1, 2, 5, 9, 10, 12, 13 and 15 (Table 1). Three suggestive SNPs were detected in the first plus grade. These SNPs located on the SSC1: 305,240,304, SSC15: 355,155,021, SSC2: 11,839,450, and SSC10: 24,762,182. Based on the pig genome assembly version 10.2, these were mapped to nucleoporin 214 (NUP214), GLI2, and deltex E3 ubiquitin ligase 4 (DTX4). Additionally, the six first plus grade genes had dominant SNPs in the regions of SSC10: 24,762,182, SSC10: 47,337,624, SSC12: 1,689,010, and SSC13: 86,695,398, 120,292,202, where we detected the crumbs cell polarity complex component 1 (CRB1), kinesin family member 5B (KIF5B), regulatory associated protein of MTOR complex 1 (RPTOR), NME/NM23 family member 9 (NME9), neutral cholesterol ester hydrolase 1 (NECH1), and amphiphysin (AMPH). In addition, in the second-grade group, we detected dominant genes for KIF26B, calmodulin-like 5 (CALML5), and septin-type G domain-containing protein (SEPT3).
The variants identified in the LYD breed were genotyped using KASP (n = 102). The LYD pig characteristics are listed in Table 2. We identified variations in three candidate genes, including KIF5B, GLI2, and KIF26B (Fig. 2). For KIF5B, the CIE L* (lightness) value of the C/C genotype (46.85, n = 29) and the C/T genotype (48.54, n = 59) was significantly higher than that in the T/T genotype (45.86) (p < 0.05) (Fig. 2a). For GLI2, the heterozygous G/A genotype was significantly higher than that of the homologous genotypes (G/G and A/A) for marbling score and meat color (Fig. 2b and c). In addition, the G/G genotype (21.09 mm, n = 64) and G/A genotype (21.37 mm, n = 30) were significantly higher than the A/A genotype (17.20 mm, n = 5) in KIF26B (p = 0.083) for backfat thickness (Fig. 2d).
Discussion
This study determined SNPs associated with meat quality through WES and genotyping in crossbred pigs. Using genotyping, we identified the associations of KIF5B, GLI2, and KIF26B with meat qualities such as meat color, marbling score, and backfat thickness.
Kinesins are ATP-dependent microtubule plus-end-directed motor proteins and are divided into 14 subfamilies that share similarities in the motor and structural domains (Park et al., 2017). The Kinesin-1 subfamily is a tetramer consisting of two kinesin heavy chains such as KIF5A, KIF5B, or KIF5C (Kanai et al., 2000). Generally, protein kinases can regulate the intracellular transport of organelles, proteins, and lipids through direct phosphorylation with KIF5s (Gibbs et al., 2015). The KIF5B and KIF26B genes interact with extracellular signal-regulated kinase (ERK) and mitogen-activated protein kinase (MAPK) signals to regulate cell proliferation, differentiation, migration, senescence, and apoptosis (Sun et al., 2015; Liang and Yang, 2019). In pigs, MAPK signaling is associated with intermuscular fat contents to regulate lipid deposition, such as polyunsaturated fatty acids (Zappaterra et al., 2021). Our previous study showed that significantly enriched KEGG pathways were related to the signaling pathways, including the cyclic adenosine monophosphate (cAMP) signaling pathway, the MAPK signaling pathway, and the MAPK cascade pathway (Kim et al., 2020). Thus, KIF family (e.g., KIF5B and KIF26B) variations may be associated with meat qualities such as color and backfat thickness.
The GLI2 was associated with marbling score and meat color in the heterozygous type. The GWAS showed that the GLI2 functions as a transcription regulator alongside the Hedgehog (Hh) pathway in growth in purebred pigs (Meng et al., 2017; Lee et al., 2019). The Hh signaling pathway is directly linked to the control of body fat mass and plays a crucial role in lipid metabolism, adipogenesis, and obesity (Wei et al., 2019). These data suggest that GLI2 and the Hh pathway regulate meat quality in pigs.
Conclusion
This study had some limitations. First, the statistical power is low because of the small number of samples; some samples could not be genotyped using the KASP assay with genomic DNA. Second, we did not assess any joint effects of SNPs and expression status. Future studies are needed with more samples. Despite these limitations, this study has demonstrated the benefits of combining WES and KASP data, which are low cost but still effective. In conclusion, genetic variations in the KIF family (KIF5B and KIF26B) and GLI2 genes, related to MAPK and Hh signaling pathways, were associated with meat quality in crossbred pigs. Our findings may help clarify the mechanisms underlying high-quality meat production in pigs.