Introduction

Several approaches are now available to elucidate the genetic architecture of complex traits such as growth, fat deposition, carcass composition, or meat quality in livestock species. Traditionally, quantitative trait loci (QTL) mapping has been carried out using linkage analysis with a limited number of microsatellite markers (Wang et al. 2002; Deng et al. 2000). Although this approach provided reliable results, further analyses to identify underlying genes or causative mutations have not been very successful, in part because the precision of QTL positions is limited by the available markers (Würschum and Kraft 2014; Schön et al. 2004). More recently, genome-wide association studies (GWAS) using high-density SNP panels have emerged as a powerful approach that minimizes the marker number limitation (Hill 2012; Sun et al. 2015). GWAS analyses usually focus on phenotypic traits using genomic data. Nevertheless, this is not the only possible application; genetical genomics studies (Jansen and Nap 2001; Breitling et al. 2008) can be conducted by carrying out expression GWAS (eGWAS). Genetical genomics aims at identifying QTL for molecular traits, also known as intermediate phenotypes, such as gene expression (eQTL), that could link variation in genetic information to physiological traits (Williams et al. 2007). These analyses provide information on gene expression regulation, regulatory pathways, and interactions that could help to understand the genetic architecture of complex traits (Zou et al. 2012). Studies in humans have validated the success of eGWAS in identifying variants associated with complex human diseases, and the potential role of gene expression changes in those diseases (Kodama et al. 2012; Zou et al. 2012).

Previous studies in an Iberian × Landrace porcine experimental cross using both linkage and GWAS analyses allowed us to identify QTL with relevant effects on growth, fat deposition, and premium cut yield-related traits (Óvilo et al. 2000; Varona et al. 2002; Mercadé et al. 2005; Fernández et al. 2012, 2014). However, the identification of potential causal variants has been limited to the ELOVL6, LEPR, and FABP genes (Corominas et al. 2013, 2015; Óvilo et al. 2005; Estellé et al. 2006; Pérez-Montarelo et al. 2012). Therefore, the aim of the current study is to help identify potential causal variants for these traits through a genetical genomics study. An eGWAS was carried out on an experimental Iberian × Landrace backcross in order to identify the genomic regions regulating the expression of genes whose expression is correlated with growth, fat deposition, and premium cut yield measures. The analyses used Porcine 60K SNP BeadChip genotypes and Porcine Expression Microarray (Affymetrix) data obtained by hybridizing mRNA from Longissimus dorsi muscle.

Materials and methods

Animals

The phenotypic information and gene expression data used in the current study come from an experimental backcross, F1 (Iberian × Landrace) × Landrace, of the IBMAP population (Óvilo et al. 2000; Mercadé et al. 2005; Óvilo et al. 2005). The IBMAP F1 generation was obtained from three Iberian Guadyerbas boars and 30 Landrace sows. Five of these F1 boars were mated with 25 Landrace sows, producing 160 backcross (BC) animals.

All animal procedures were performed according to the Spanish Policy for Animal Protection RD1201/05, which meets the European Union Directive 86/609 about the protection of animals used in experimentation.

Phenotypic data

Seven traits related to growth, fatness, and meat quality were recorded for the study (Table 1). These were body weight at 150 days of mean age (BW150), backfat thickness measured at 75 kg (BFT75) and at slaughter (BFS), weights of premium cuts, namely hams (HW), shoulders (SW), and bone-in loin (LBW), and intramuscular fat content (IMF) measured in Longissimus dorsi samples at slaughter (Fernández et al. 2012).

Table 1 eQTL identified in the eGWAS analyses conducted on Longissimus dorsi gene expression data

Gene expression

Gene expression data were obtained by hybridizing mRNA samples from Longissimus dorsi of 102 backcrossed individuals to the Porcine Expression Microarray (Affymetrix) as described in Pena et al. (2013). Quality control of the microarray data was carried out using the affyPLM package of the Bioconductor software (http://www.bioconductor.org/). RNA normalization was carried out using BRB-Array Tools (v. 3.6.0) (http://linus.nci.nih.gov/BRB-ArrayTools.html). Expression data are expressed as the log2 of probeset signal intensity.

Correlation (phenotype and expression)

A correlation analysis was carried out between phenotypic (BW150, BFT75, BFS, HW, SW, LBW, and IMF) and expression data. Expression and phenotypic data were corrected by fitting a linear model with sex and batch as fixed effects and slaughter age as random effect. The Pearson correlation coefficient was calculated between the predicted values for 24,000 probesets and the predicted values of the seven phenotypic records. Genes with significant correlations (|r| between 0.32 and 0.66, p value <0.001, q value <0.002) were selected for further analysis. Microarray probesets were annotated using NetAffx from Affymetrix (https://www.affymetrix.com/analysis/index.affx).
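The adjust-then-correlate step can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the analysis actually run: the adjustment subtracts sex-by-batch cell means instead of fitting the full linear model (slaughter age is ignored), and all function names and toy values are hypothetical.

```python
import math
from collections import defaultdict

def center_within_cells(values, sex, batch):
    """Subtract the sex-by-batch cell mean from each value (simplified adjustment)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, s, b in zip(values, sex, batch):
        sums[(s, b)] += v
        counts[(s, b)] += 1
    return [v - sums[(s, b)] / counts[(s, b)] for v, s, b in zip(values, sex, batch)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy data (hypothetical): eight animals in four sex-by-batch cells.
sex = ["F", "M", "F", "M", "F", "M", "F", "M"]
batch = [1, 1, 2, 2, 1, 1, 2, 2]
signal = [0.4, -0.2, 0.1, 0.3, -0.4, 0.2, -0.1, -0.3]
# Expression carries a sex effect, the phenotype a batch effect; both share `signal`.
expr = [7.0 + (0.5 if s == "M" else 0.0) + z for s, z in zip(sex, signal)]
pheno = [90.0 + (4.0 if b == 2 else 0.0) + 2.0 * z for b, z in zip(batch, signal)]

raw_r = pearson(expr, pheno)
adjusted_r = pearson(center_within_cells(expr, sex, batch),
                     center_within_cells(pheno, sex, batch))
```

In this toy case the shared signal is masked by the sex and batch effects in the raw data and becomes visible only after adjustment, which is the reason for correcting both data types before computing the correlation.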

Genotyping data

DNA samples from the 160 backcrossed animals and their F1 and F0 relatives were genotyped with the PorcineSNP60 BeadChip (Illumina, Inc.), designed by Ramos et al. (2009). GenomeStudio software (Illumina, Inc.) was employed to visualize, edit, standardize, quality-filter, and extract genotyping data. A second round of data filtering was carried out with GenABEL software: markers with a minor allele frequency (MAF) <2.5% and markers deviating from Hardy–Weinberg equilibrium (FDR < 1%) were discarded. A total of 31,606 SNPs were retained for further analyses.
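The two marker filters can be sketched as follows. This is an illustrative stand-in and not the GenABEL implementation: the Hardy–Weinberg check uses a one-degree-of-freedom chi-square test with a plain p value cut-off, whereas the study applied an FDR criterion, and the function names, the `hwe_alpha` parameter, and the genotype counts are hypothetical.

```python
import math

def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from the three genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1 - p)

def hwe_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p value for deviation from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def keep_snp(counts, maf_min=0.025, hwe_alpha=0.01):
    """Keep a SNP if it passes both the MAF and the (simplified) HWE filter."""
    return maf(*counts) >= maf_min and hwe_pvalue(*counts) >= hwe_alpha

# Hypothetical genotype counts (AA, AB, BB) for 160 animals.
common_snp = (90, 60, 10)    # MAF 0.25, exactly in HWE proportions
rare_snp = (158, 2, 0)       # MAF below the 2.5% cut-off
skewed_snp = (80, 0, 80)     # no heterozygotes at all
```

With these counts, `keep_snp(common_snp)` is true, while the rare SNP fails the MAF filter and the SNP without heterozygotes fails the Hardy–Weinberg check.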

eGWAS analysis

A genome-wide association study was performed using the GenABEL package (Karssen et al. 2016) in the R environment. The analysis was carried out on the 102 individuals with both expression and genotyping data. The genome-wide analysis was performed following the model:

$$y_{ijk} = S_i + B_j + b\,x_k + \sum_{l} \lambda_{lk}\, a_l + u_k + e_{ijk},$$

where $y_{ijk}$ is the trait value of the kth individual, $S_i$ and $B_j$ are the fixed effects of sex and batch, respectively, and $b$ is the regression coefficient on slaughter age ($x_k$). $a_l$ is the additive effect of the SNP, $\lambda_{lk}$ is an indicator of the number of copies of the lth allele (0, 1, or 2), $u_k$ is the infinitesimal effect of the kth individual, and $e_{ijk}$ is the random residual term. The same model with carcass weight as the covariate showed similar results. The QValue package in R was used to correct for multiple testing (Bass et al. 2015). Associations with q value <0.05 were considered significant.
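To make the structure of the model concrete, the following sketch estimates the additive SNP effect by ordinary least squares and adjusts a set of p values for multiple testing. It departs from the analysis described above in two labelled ways: the infinitesimal effect of the individual is omitted (no relationship matrix is used), and the Benjamini–Hochberg procedure stands in for the q value method of the QValue package. All function names and toy values are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p values (stand-in for Storey q values)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)
        adjusted[i] = running
    return adjusted

# Toy data (hypothetical): eight animals.
sex = [0, 1, 0, 1, 0, 1, 0, 1]
batch = [0, 0, 1, 1, 0, 0, 1, 1]
age = [170, 172, 175, 169, 180, 178, 171, 176]   # slaughter age, days
geno = [0, 1, 2, 1, 0, 2, 1, 0]                  # allele copies (0, 1, or 2)
mean_age = sum(age) / len(age)
# Design columns: intercept, sex, batch, centred age, genotype.
rows = [[1.0, s, bt, ag - mean_age, g] for s, bt, ag, g in zip(sex, batch, age, geno)]
# Noise-free expression values built with a known additive effect of -0.5.
y = [8.0 + 0.3 * r[1] + 0.2 * r[2] + 0.01 * r[3] - 0.5 * r[4] for r in rows]

snp_effect = ols(rows, y)[4]
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.20])
```

Because the toy expression values contain no noise, the estimated SNP effect recovers the simulated value of −0.5; with real data the estimate would come with a standard error, and the polygenic term would additionally account for relatedness among the backcross animals.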

Region analysis

eQTLs were defined by two or more significantly associated SNPs within a maximum distance of 2 Mb. The gene content of each eQTL was extracted using the BioMart tool from the porcine Ensembl database. The FatiGO and ReviGO online tools were used to investigate Gene Ontology enrichment and function. In order to prioritize the investigation of eQTL regions, those regions were compared with QTL regions obtained in previous studies carried out on the same animal material.
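The region definition can be sketched as a simple chaining rule. The sketch assumes one possible reading of the criterion, namely that neighbouring significant SNPs on the same chromosome are merged while the gap between them does not exceed 2 Mb; the function name, parameters, and positions are hypothetical.

```python
def build_eqtl_regions(snps, max_gap=2_000_000, min_snps=2):
    """Group significant SNPs into regions.

    snps: iterable of (chromosome, position_bp) tuples.
    Returns a list of (chromosome, start_bp, end_bp, n_snps) tuples.
    """
    regions = []
    current = []

    def close_region():
        # Keep the region only if it contains enough SNPs.
        if len(current) >= min_snps:
            regions.append((current[0][0], current[0][1], current[-1][1], len(current)))

    for chrom, pos in sorted(snps):
        if current and (chrom != current[-1][0] or pos - current[-1][1] > max_gap):
            close_region()
            current = []
        current.append((chrom, pos))
    close_region()
    return regions

# Hypothetical significant SNPs for one probeset.
significant = [
    ("SSC13", 80_900_000),
    ("SSC13", 80_000_000),
    ("SSC13", 82_500_000),
    ("SSC13", 90_000_000),   # isolated: more than 2 Mb from its neighbours
    ("SSC17", 38_700_000),   # single SNP on its chromosome
]
regions = build_eqtl_regions(significant)
```

Here the three SNPs between 80.0 and 82.5 Mb on SSC13 form one region, while the two isolated SNPs are dropped because a region requires at least two SNPs.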

Candidate SNPs analyses

A candidate SNP search was done by exploiting an RNA-Seq assay previously conducted on the same animal material (Martínez-Montes et al. 2016). The candidate SNPs were validated by Sanger sequencing of cDNA synthesized from mRNA. Primer pairs were designed from exon to exon, in order to avoid genomic DNA amplification (Supplemental Table 1). The PCR reactions were performed in a final volume of 25 μl, containing 4 μl of cDNA, 0.5 μl of polymerase, 2.5 μl of 10× buffer, 2.5 μl of dNTPs, and 0.5 μl of each primer. Thermocycling was carried out under the following conditions: 94 °C for 5 min, 35 cycles of 94 °C for 30 s, 60 °C for 30 s, and 72 °C for 30 s, with a final extension at 72 °C for 10 min. The PCR reactions were carried out in a GeneAmp PCR System 9700 (Applied Biosystems, Warrington, UK). The PCR products were purified with the illustra™ GFX™ PCR DNA purification kit (GE Healthcare, UK) according to the manufacturer's protocol. PCR products were sequenced with both forward and reverse primers using the 3100 BigDye® Terminator v3.1 Matrix Standard in a 3730 DNA Analyzer (Applied Biosystems, Warrington, UK). After validation, the SNPs were genotyped in the 160 backcrossed animals using different techniques: pyrosequencing using specific primers (Supplemental Table 1); PCR-RFLP with restriction enzymes Tsp45I (ss2031475817, GenBank ID: 100518810) and BstUI (ss2031475807, GenBank ID: 100233193); and the OpenArray platform at Servei Veterinari de Genètica Molecular (Universidad Autónoma de Barcelona, Spain).

The specific association analyses of the candidate SNPs and haplotypes, built with Phase 2.1.1 (Stephens et al. 2001), with gene expression measures were carried out with the model described above using Qxpak software (Pérez-Enciso and Misztal 2011). Associations with phenotypic traits were also tested. Bonferroni correction was applied to account for multiple testing, setting the significance threshold at a p value of 0.003.
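As a brief illustration of how a Bonferroni threshold of this size arises, the sketch below divides a nominal significance level by the number of tests. Both inputs are assumptions made for illustration: the text states neither the nominal level nor the number of tests, and a level of 0.05 with 17 tests is simply one combination that rounds to the reported value.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

# Assumed values (not stated in the text): nominal alpha 0.05, 17 tests.
threshold = bonferroni_threshold(0.05, 17)
rounded = round(threshold, 3)
```

Read the other way, a threshold of 0.003 at a nominal level of 0.05 corresponds to roughly 0.05 / 0.003 ≈ 17 tests.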

To assess the relevance of the identified candidate SNPs, we analyzed in silico the potential effect of those SNPs that produce an amino acid change using the Predict Protein tool (Yachdav et al. 2014). Additionally, for synonymous SNPs we used RegRNA (http://regrna.mbc.nctu.edu.tw/) to determine potential effects on mRNA level, stability, or gene expression regulation (Chang et al. 2013). Because one of the SNPs analyzed in this study is located in a non-coding DNA region, ALDB, a domestic-animal long non-coding RNA database, was used to identify possible non-coding RNAs (Li et al. 2015). The MEME suite (Bailey et al. 2009) was also used to identify possible motifs represented in our sequence and to predict potential effects of SNPs that change the structure of a DNA motif. The Multiple Em for Motif Elicitation (MEME) (Bailey and Elkan 1994), Gene Ontology for Motifs (GOMo) (Buske et al. 2010), and Motif Comparison Tool (Tomtom) (Gupta et al. 2007) tools were used to identify potential motifs in our sequence and to analyze the Gene Ontology (GO) enrichment of these motifs by comparison with previously described motifs.

Results

In order to focus the analysis on the productive traits and reduce the number of genes analyzed from thousands to a few hundred, only those probesets of the Affymetrix porcine expression microarray whose expression showed significant correlation with at least one of the seven phenotypes of interest were selected. A total of 820 probesets were selected, showing absolute correlations between 0.32 and 0.66 (p value <0.001, q value <0.002) (Supplemental Table 2). Probesets were annotated using the NetAffx tool. One probeset per gene was chosen, namely the one showing the highest expression level. In total, 776 gene-unique probesets were used for the eGWAS.

The eGWAS was carried out with the GenABEL package, testing each of the 776 probesets against the filtered SNPs. The results revealed 954 associations between 880 SNPs (eTAS) and the expression levels of 42 genes (Supplemental Table 3). These eTAS corresponded to 63 regions or eQTLs for 36 different transcripts, each containing between 2 and 209 eTAS. Of the 63 eQTLs, 15 were cis-associations and 48 were trans-associations. eQTL regions were identified on every autosome except SSC10, SSC12, SSC16, and SSC18 (Table 1), with the strongest associations found for R33-trans (SSC8) and R1-cis (SSC6). This analysis validated relevant associations such as that of Insulin-Like Growth Factor 2 (IGF2) expression with the R5-trans (28 SNPs) and R9-trans (2 SNPs) regions on SSC2. A total of 2630 genes were identified within these 63 eQTLs, including candidate genes such as PTEN (Phosphatase and Tensin Homolog, located on SSC14:108,911,519–109,003,081), associated with muscle development, FADS1 (Fatty Acid Desaturase 1, located on SSC2:9,247,472–9,263,631), and CTNNB1 (Cadherin-Associated Protein, Beta 1, located on SSC13:27,623,128–27,667,302).

We focused our further studies on those eQTLs (cis- and trans-associated) overlapping with QTLs previously detected in the same animal material (Óvilo et al. 2000; Varona et al. 2002; Mercadé et al. 2005; Fernández et al. 2012, 2014). Those were located on SSC4, SSC9, SSC13, and SSC17 (Table 2). These associations implicate four different expression probesets:

Table 2 Identified eQTL overlapping with previous QTL

The Ssc.7190.1.S1_at probeset for QTL regions R19-trans and R35-trans, which corresponds to the BUB1B gene (ENSSSCG00000030580), SSC1:146,304,943–146,312,662. BUB1B is associated with the proliferative capacity of muscle cells (Guntani et al. 2011).

The Ssc.7666.1.A1_at probeset for region R60-cis, which corresponds to the PSMF1 gene (ENSSSCG00000020887), SSC17:38,667,378–38,711,350. PSMF1 is of interest because of its interaction with INS (Insulin), TGFB1 (Transforming Growth Factor Beta 1), and CDKN1A (Cyclin-Dependent Kinase Inhibitor 1A), which is related to myocyte terminal differentiation in muscle development (Guo et al. 1995; Qin et al. 2012).

The Ssc.21242.1.S1_at probeset for region R61-trans, which corresponds to the CTNNBL1 gene (ENSSSCG00000021553). CTNNBL1 is involved in basal metabolism and has previously been related to carcass traits in different species (Espigolan et al. 2015).

The Ssc.10589.1.A1_at probeset, which is located in a non-coding region. Current annotation indicates that this probeset corresponds to a long intergenic non-coding RNA (lncRNA), identified in the ALDB database (http://www.ibiomedical.net/aldb/) as ALDBSSCG0000001928 and located at SSC13:80,720,757–80,739,741. The ALDBSSCT0000003202 transcript spans 8,299 bp and two exons. This lncRNA is located in a region that overlaps with a high number of QTLs previously described in the PigQTL database, mainly associated with average daily gain (Hu et al. 2005, http://nhjy.hzau.edu.cn/kech/swxxx/jakj/dianzi/Bioinf8/Animal/Animal8.htm).

SNPs analysis

Candidate genes

A total of 44 positional and functional candidate genes for those QTLs overlapping with eQTLs were selected for candidate polymorphism search (Table 3). The polymorphism search took advantage of our previous SNP identification study based on an RNA-Seq assay performed on the same animal material (Martínez-Montes et al. 2016). We identified 49 SNPs in 13 of the 44 candidate genes. After validation and evaluation of potential impact, a total of 20 SNPs located in 10 unique genes (Table 4) were selected for genotyping and association analyses in the backcrossed individuals (ZNF786, ACAD11, RYK, MGLL, TRIB3, PDIA4, LAMB1, RBP1, TXNRD3, and ICA):

Table 3 Positional candidate genes identified within the selected eQTLs
Table 4 Description of candidate SNPs selected and analyzed within candidate genes

The MGLL gene encodes a monoglyceride lipase that has been associated with fatty acid uptake and oxidation and with intramuscular fatty acid composition in the pig longissimus thoracis muscle (Pena et al. 2013).

The TXNRD3 gene encodes thioredoxin reductase 3, which was shown to affect adipocyte differentiation through the Wnt signaling pathway (Kipp et al. 2012).

The ACAD11 gene encodes an acyl-CoA dehydrogenase that was shown to be associated with variation in residual feed intake in beef cattle (Karisa et al. 2013).

The RYK gene encodes a receptor-like tyrosine kinase that mediates muscle attachment in Drosophila melanogaster via Wnt interaction (Lahaye et al. 2012).

The RBP1 gene encodes a retinol-binding protein that regulates adipogenesis in mice (Zizola et al. 2010).

The TRIB3 gene encodes tribbles pseudokinase 3, which was shown to be associated with meat quality and production traits in Italian heavy pigs (Fontanesi et al. 2010).

The LAMB1 gene encodes laminin beta 1, which is associated with skeletal muscle development in humans (Wewer et al. 1997).

The PDIA4 gene encodes a protein disulfide isomerase that is associated with HSP90 activity in muscle differentiation (Garcia de la Serrana and Johnston 2013).

The 20 SNPs were successfully genotyped in the backcrossed animals, showing MAFs ranging from 0.03 for ss2031475815 to 0.49 for ss2031475813; most of the SNPs showed intermediate frequencies (Table 5).

Table 5 Significant association results of the analyzed candidate SNPs on gene expression and phenotypic traits

Association analysis

Most of the selected SNPs (65%) showed intermediate frequencies in the backcross population (MAF >0.25), which are optimal values for association analysis (Tabangin et al. 2009). Only two SNPs showed very low frequencies [ss2031475814 (MAF = 0.03) and ss2031475818 (MAF = 0.06)] and were discarded from the association analyses. Linkage disequilibrium estimates were also calculated for closely linked SNPs. Complete linkage was found among ss2031475811, ss2031475810, and ss2031475812, and between ss2031475802 and ss2031475803.
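A linkage disequilibrium estimate between two SNPs can be sketched as the squared Pearson correlation of their genotype dosages. This is a common approximation for unphased genotypes and is an assumption here, since the text does not state which estimator was used; the function names and genotype vectors are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; raises if either SNP is monomorphic."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("monomorphic SNP: correlation is undefined")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def ld_r2(geno_a, geno_b):
    """Approximate LD (r squared) from genotype dosages coded 0, 1, or 2."""
    return pearson(geno_a, geno_b) ** 2

# Hypothetical genotypes of six animals at three SNPs.
snp1 = [0, 1, 2, 1, 0, 2]
snp2 = [2, 1, 0, 1, 2, 0]   # same information as snp1, alleles coded in reverse
snp3 = [0, 1, 2, 1, 2, 0]   # partly discordant with snp1

complete_ld = ld_r2(snp1, snp2)
partial_ld = ld_r2(snp1, snp3)
```

Two SNPs in complete linkage carry the same information whichever allele is counted, so `complete_ld` equals 1 here, while the partly discordant pair gives a lower value.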

Specific association analyses of each candidate SNP with the corresponding probeset expression level were conducted, in agreement with eGWAS results (Table 5). In addition, association analyses were carried out between the candidate SNPs and the phenotypic traits (SW, HW, LBW, BW150, IMF, BFT75, and BFS).

Significant associations with gene expression measures were found for ss2031475813, ss2031475806, ss2031475807, ss2031475816, ss2031475814, ss2031475817, ss2031475811, ss2031475809, and ss2031475808. The results can be grouped into two clusters according to the affected transcript: the Ssc.10589.1.A1_at probeset, representing ALDBSSCG0000001928 lncRNA expression, and the Ssc.7666.1.A1_at probeset, representing PSMF1 gene expression.

Within the first cluster, the ss2031475813, ss2031475806, ss2031475807, ss2031475816, ss2031475814, ss2031475811, ss2031475809, and ss2031475808 SNPs showed association with Ssc.10589.1.A1_at expression levels. All eight SNPs are located in R44-cis, showing a decrease in expression levels of between 0.437 and 0.918 (Table 5). Only the ss2031475816 SNP, which is located in R45-trans (trans-association), showed an increase in Ssc.10589.1.A1_at expression levels, of 0.779 with a standard error (SE) of ±0.083. For the second cluster, only one SNP, ss2031475817, was associated with the expression levels of the PSMF1 gene, increasing expression by 0.224 (±0.062).

Regarding the association analysis results for the production traits, ss2031475809, which showed the strongest association with expression levels of the Ssc.10589.1.A1_at probeset, also showed the most significant effect on BW150, increasing animal weight by 2.66 kg (±1.07) (Table 5). It also showed effects on HW and LBW, increasing weight by 156 g (±88) and 148 g (±79), respectively. In addition, the ss2031475808, ss2031475814, and ss2031475806 SNPs showed associations (p value <0.05) with the BW150 trait. Suggestive effects (p < 0.10) of ss2031475816 on BW150 and of ss2031475817 on HW were also observed. It should be noted that the number of animals limits the power to identify significant effects in the association analysis (Hong and Park 2012).

The linkage disequilibrium estimates for the TXNRD3, MGLL, ICA, and RBP1 SNPs (Fig. 1) revealed a significant linkage block composed of two SNPs of MGLL and the three SNPs of TXNRD3 (ss2031475806, ss2031475807, ss2031475808, ss2031475809, and ss2031475810), two genes located very close to each other on SSC13, 594 kb apart. Four haplotypes were identified for these five SNPs: AAACC, AAATC, GGTTT, and GATTT. The same association analyses as those used for single SNPs were carried out for the haplotypes, to determine whether the additional genomic information could explain the effects better than the individual SNPs. Nevertheless, the results were less significant. The association analysis of ss2031475817, the only SNP associated with PSMF1 gene expression, with the productive traits revealed effects on the HW trait (Table 5).

Fig. 1 Linkage disequilibrium (r²) representation among the SNPs associated with Ssc.10589.1.A1_at expression levels

In addition, a few other SNPs located within these eQTL regions showed effects on production traits: ss2031475801 showed effects on HW and BW150, ss2031475800 on HW and SW, ss2031475799 on HW, SW, and IMF, and ss2031475814 on BW150 (results not shown). Some of these results may be relevant; however, because these SNPs were not associated with probeset expression levels (the initial hypothesis of the current study), they were not studied further.

Discussion

In the current study, we focused on the identification of mutations that could affect the expression levels of genes involved in porcine fat deposition and growth. To this end, a genetical genomics study (eGWAS) was conducted using the expression levels of 776 genes, treated as intermediate phenotypes, which were selected because their expression was correlated with fat deposition and growth-related traits. This approach allowed us to identify a total of 954 significant associations between 42 genes and 880 eTAS. Moreover, we were able to validate interesting associations between SNPs and gene expression levels, such as those identified for Insulin-Like Growth Factor 2 (IGF2): R5-trans, which contains 28 SNPs associated with expression (SSC2:16,416–10,979,357), and R9-trans, which contains 2 SNPs (SSC2:162,084,552–162,298,086), where the causal mutation affecting the expression of IGF2, a gene involved in muscle development (Van Laere et al. 2003), fatty acid composition (Hong et al. 2015), and litter size (Muñoz et al. 2010), has been reported to map.

As expected, identifying SNPs affecting phenotypic traits is less precise than identifying associations with expression levels directly, likely because of the more direct relationship between gene expression and genomic information. Gene expression seems to be regulated in a simpler way than complex phenotypic traits. Nevertheless, the interpretation and biological relevance of the associations identified here need further analyses to unravel these complex regulation mechanisms.

Beyond the identification of associations between SNPs and gene expression, the eGWAS allowed us to identify 63 eQTLs. Although region size and gene content are variable, a lot of information can be obtained from these genomic regions. eQTLs were identified on all autosomes except SSC10, SSC12, SSC16, and SSC18. SSC13 showed 13 different eQTL regions associated with the expression levels of four different genes; ten of those were associated with TXNRD3 expression. The regions identified on SSC13 covered almost 60% of the total chromosome length, which could be explained by high linkage disequilibrium levels (Fig. 2), as previously reported (Saura et al. 2015). Positional and functional candidate genes were identified in some of these regions, allowing us to select potential genes underlying the identified eQTLs. Some of these genes are transcription factors previously associated with traits of interest, such as FOXO1, associated with adipogenesis in porcine preadipocytes (Yan et al. 2013), GATA2, involved in adipogenesis (Szczerbal and Chmurzynska 2008), and RBL1 (p107), which has been proposed to regulate adipocyte differentiation (Scimè et al. 2005).

Fig. 2 Linkage disequilibrium heat map of eQTL regions on SSC13

One of the challenges of this kind of study is how to manage the large number of results obtained from the eGWAS. Although many interesting regions and genes were detected, the study focused on those regions that overlap with phenotypic QTLs previously described in the same animal material. The eQTL regions on SSC4, SSC9, SSC13, and SSC17 overlapped with QTLs for fatness and premium cut yield (Varona et al. 2002; Fernández et al. 2012). With this approach, some of the results remained unanalyzed, but they provide interesting data for further studies.

In order to identify candidate mutations that could underlie the selected eQTLs, SNPs identified in a previous RNA-Seq study were used (Martínez-Montes et al. 2016). This approach allowed us to select not only candidate SNPs located in those regions, but also SNPs that showed different genotypes between groups divergent for growth and fat deposition (Martínez-Montes et al. 2016). By merging both studies, SNP identification and GWAS results, we were able to focus our analysis on the identification of candidate causal mutations. The strength of this approach lies in the possibility of combining different types of results to address a common objective. Following this strategy, we were able to identify potential candidate genes that could be regulating the expression levels of three genes: BUB1B, ALDBSSCG0000001928, and PSMF1. Furthermore, candidate mutations associated with gene expression and production traits were also identified. One of the most interesting results is the association detected between PSMF1 and TRIB3 SNPs. Previous studies have reported the association of TRIB3 polymorphisms with meat quality and production traits in Italian heavy pigs (Fontanesi et al. 2010), reducing fat levels and increasing weight. Moreover, PSMF1 interacts indirectly with the TRIB3 gene via the AKT2 and UBC genes, which have been previously associated with adipogenesis and muscle development (Pang et al. 2013; Ayuso et al. 2015). Among the most promising and novel results is the identification of the ALDBSSCG0000001928 lncRNA, whose expression seems to be associated with TXNRD3 polymorphisms.

The analyzed ss2031475809 could be the causal mutation affecting the expression levels of this lncRNA, and it appears to also be associated with body weight and premium cut yields. It should be also noted that several regions of SSC13 chromosome, ten different regions, are trans-associated with the same lncRNA expression levels (R37, R39, R41, R42, R43, R44, R45, R46, R47, and R49), but the most significant association corresponded to the cis-association of R44, which includes ss2031475809.

The Ssc.10589.1.A1_at probeset was first annotated within the TXNRD3 gene. Nevertheless, after annotation updates and deeper sequence analysis by basic local alignment searches with BLAST, the annotation confirmed that it represents a long intergenic non-coding RNA (lncRNA) gene annotated in the domestic-animal long non-coding RNA database (ALDB) as the ALDBSSCG0000001928 gene (ALDBSSCT0000003202 transcript). This lncRNA is located close to the TXNRD3 gene, 3 kb from ss2031475809. Annotation data showed that the ALDBSSCG0000001928 gene is located within a QTL region for several productive traits such as average daily gain, body weight, and back fat weight (PigQTL database).

Long non-coding RNAs (lncRNAs) have been identified as chromatin regulators in different species and act through different strategies. For instance, the XIST gene, an lncRNA upregulated in one of the female X chromosomes of mice in early development, leads to transcriptional repression and important changes in chromatin composition. In contrast, the roX gene in Drosophila melanogaster also acts in dosage compensation, but by increasing transcription on the single male X chromosome (Rutenberg-Schoenberg et al. 2016). Several other roles have been attributed to lncRNAs, such as transcriptional regulation and post-transcriptional control (Angrand et al. 2015). In the current study, we hypothesize that the ALDBSSCG0000001928 lncRNA could be regulating, through transcriptional repression, the expression levels of surrounding genes such as MGLL and TXNRD3 (significant negative correlations of −0.43 and −0.42, respectively, were detected with lncRNA expression).

The potential mechanism explaining the relationship between the ss2031475809 SNP and the ALDBSSCG0000001928 lncRNA was explored using the Motif Comparison Tool from the MEME suite. Potential motifs including or close to ss2031475809 were identified (Fig. 3). The most relevant corresponded to the motif CAC[A/C]T[A/G]AG, in which the ss2031475809 position is conserved, indicating its high functional relevance. Additionally, two more motifs (similar to NKX2-8 DBD) were predicted (Fig. 3). Using the GOMo tool to scan promoters and determine whether the identified motif is significantly associated with genes linked to one or more Gene Ontology (GO) terms, we observed enrichment, within the human gene catalogue, for olfactory receptor activity (GO:0004984), sensory perception of smell (GO:0007608), and sensory perception of chemical stimulus (GO:0007606). These terms involve genes such as taste receptors, likely mediating growth and fatness processes (Ren et al. 2009), and olfactory receptors, which have been studied in pigs because of their possible greater relevance in this species than in others (Nguyen et al. 2012) and their relation with the gastrointestinal tract in pigs (Priori et al. 2015). These results support ss2031475809 as a candidate mutation regulating ALDBSSCG0000001928 lncRNA expression, which may be involved in the transcriptional regulation of MGLL and TXNRD3, affecting productive and meat quality traits (Pena et al. 2013; Puig-Oliveras et al. 2016).

Fig. 3 Genomic organization of MGLL, TXNRD3 (ss2031475809) and ALDBSSCG0000001928 lncRNA genes. Representation of NKX2-8 DBD motifs within TXNRD3 gene sequence

In conclusion, we were able to identify 63 eQTL regions affecting the expression of 36 transcripts, some of which overlapped with phenotypic QTLs on chromosomes SSC4, SSC9, SSC13, and SSC17. Candidate genes in these regions and candidate SNPs obtained from RNA-Seq data were also analyzed. One of the most relevant results is the identification of ALDBSSCG0000001928, a long non-coding RNA whose expression seems to be correlated with premium cut yield. In silico domain annotation and association analysis support the role of TXNRD3 polymorphisms as potential candidates to regulate ALDBSSCG0000001928 expression. This lncRNA could be involved in the transcriptional regulation of the genes surrounding it, as has been reported for other lncRNAs, affecting productive and meat quality traits.