Introduction

According to the organizational hypothesis, the prenatal period is a critical window when androgens impact the development of both reproductive and non-reproductive tissues (Phoenix et al. 1959). In human studies, putative effects of prenatal androgens on brain and behavior have been demonstrated using a variety of approaches, including direct (reviewed in van de Beek et al. 2004) and indirect (reviewed in Hönekopp et al. 2007) measures. In previous studies carried out with magnetic resonance imaging (MRI), exposures to prenatal androgens have been associated with inter-individual variations in brain size (Peper et al. 2009), callosal morphology (Chura et al. 2010), and local gray-matter volumes in sexually dimorphic regions (Lombardo et al. 2012).

Given the scarce opportunities for measuring androgen levels directly from amniotic fluid or umbilical-cord blood, one could test for putative effects of prenatal androgens using the discordant-sex twin design (e.g., Peper et al. 2009). This approach is based on the so-called twin-testosterone transfer (TTT) hypothesis: the intrauterine presence of a male (vs. female) co-twin exposes the other twin to higher levels of prenatal androgens (reviewed in Cohen-Bendahan et al. 2005; Miller 1994; Tapp et al. 2011). While female fetuses produce androgens only in the fetal adrenal glands (Rainey et al. 2004) and as a by-product of corticosteroid production (Tapp et al. 2011), male fetuses develop testes in the 7th–8th week of gestation (Tapp et al. 2011) and start producing increasing levels of testosterone (McIntyre 2006; Sarkar et al. 2007). Androgens produced by the male co-twin can reach the other co-twin by diffusion across amniotic fluid and fetal membranes (Ryan and Vandenberg 2002). In rats, Even et al. (1992) showed that injecting a silastic capsule containing testosterone into the amniotic sac increased testosterone levels, especially in the adjacent fetuses.

Direct evidence for the TTT comes from research on mice and rats; the male sex of (nearby) fetuses is associated with the masculinization of a variety of sexually dimorphic traits in the female offspring (Ryan and Vandenberg 2002; Kinsley et al. 1986; Quadagno et al. 1987; vom Saal and Bronson 1980). Prenatal administration of the anti-androgen flutamide abolished the TTT, supporting the view that masculinization of females by male fetuses is indeed related to exposure to prenatal testosterone and not only to postnatal socialization with male littermates (Clemens et al. 1978). In humans, no direct measurements of prenatal androgens in dizygotic twins have been reported, but discordant-sex twin studies of digit ratio (Anders and Verhon 2006; Voracek and Dressler 2007) and other sex-specific traits (e.g., Peper et al. 2009; Cohen-Bendahan et al. 2004; Galsworthy et al. 2000) support the existence of intrauterine hormonal transfer in humans. Miller’s (1994) review of the testosterone transfer hypothesis concluded that comparison of female twins with a same-sex versus an opposite-sex fraternal co-twin provides a reasonable model for studies of prenatal testosterone exposure. Tapp et al. (2011) provided an overview of human studies of phenotypic differences, in a number of domains, between twins with a male or female co-twin and concluded that while the accumulated evidence lacks consistency [e.g., negative findings by Medland et al. (2008), Baker et al. (2009), and Henderson and Berenbaum (1997)], there is sufficient support for the TTT hypothesis to motivate further research.

The current study builds on the interconnected development of the brain and the face; it uses the discordant-sex twin design to identify a “signature” of the prenatal exposure to androgens in the craniofacial morphology. This choice is motivated by the growing availability of magnetic resonance (MR) images in population-based studies of brain development (Paus 2013) and enabled by the current work on MR-based craniofacial morphometry (e.g., Chakravarty et al. 2011; Liu et al. 2012; Marečková et al. 2011, 2013). All skeletal and connective tissues that form the face derive from neural crest cells, which originate along the dorsal margins of the midbrain and rostral hindbrain (Noden 1978; Couly et al. 1993; Kontges and Lumsden 1996). The concurrent phase of the brain and craniofacial development takes place between the 5th and 13th week of gestation (Diewert et al. 1993; Diewert and Lozanoff 1993). Specific genetic factors or exposures (e.g., teratogens) can result in disruptions of molecular signaling between the brain and the face and create malformations in both tissues. In holoprosencephaly, for example, Shh signaling is reduced, the brain can be severely malformed, and facial malformations range from midfacial hypoplasia to complete cyclopia (Muenke and Beachy 2000). Other examples of genetic and environmental perturbations affecting both phenotypes include, respectively, Down’s syndrome (Guihard-Costa et al. 2006) and fetal alcohol syndrome (Larkby and Day 1997).

Sex differences in the craniofacial morphology have been observed as early as in 6-month-old human infants (Bulygina et al. 2006). Studies in adults suggested a relationship between an indirect measure of prenatal androgens, namely, the ratio of the 2nd and 4th finger lengths (2D:4D ratio) and both the face shape (Fink et al. 2005; Meindl et al. 2012) and the perceived masculinity of the face (Neave et al. 2003). Prenatal androgens also appear to influence teeth size: androgenized female monkeys (vs. female controls) had longer and sharper teeth (Zingeser and Phoenix 1978) and human females with a co-twin brother (vs. with a co-twin sister) had larger mesiodistal and, to a lesser extent, buccolingual diameters of the top of the crown (Dempsey et al. 1999).

Here, we used the discordant-sex twin design to identify possible effects of prenatal androgens on craniofacial morphology in prepubertal children (Study 1). This approach allowed us to avoid the possible confounding effect of pubertal androgens. Subsequently, we tested the validity of such a craniofacial “signature” of prenatal androgens by examining its relationship with brain size in adolescent females born as singletons (Study 2). We restricted Study 2 to females because our previous work with this sample showed that female (but not male) faces are fully developed at the age of 12 years, with no age-related changes in face morphology occurring during adolescence (Marečková et al. 2011).

Study 1: methods

Participants

A sample of 119 dizygotic 8-year-old twins from the Quebec Newborn Twin Study included 63 females [28 with a twin sister (Same Sex Female; SSF), 35 with a twin brother (Opposite Sex Female; OSF)] and 56 males [20 with a twin brother (Same Sex Male; SSM), 36 with a twin sister (Opposite Sex Male; OSM)]. We were interested in differences between dizygotic twins with same-sex vs. opposite-sex co-twin (SSF vs. OSF, and SSM vs. OSM contrasts). Since individuals within the OSF group (and within the OSM group) were unrelated, our analyses included only one member of each twin pair from the SSF (and SSM) group (chosen at random when both MR images were usable). The study was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. Informed assent and consent forms were obtained from the children and their parents, respectively.

MRI data acquisition

We obtained three-dimensional T1-weighted, sagittal, fast low-angle shot (FLASH) images of the whole head using a 1.5 Tesla system (Magnetom Vision, Siemens Electric, Erlangen, Germany) using TE = 10 ms, TR = 22 ms, flip angle = 30°, 160 contiguous slices, matrix size = 224 × 256, 1 mm × 1 mm × 1 mm voxels.

Landmarks and craniofacial features

All 119 T1-weighted images were registered, using rigid transformations (3 translations, 3 rotations), to a T1-weighted image of one individual chosen at random (Collins et al. 1994). This ensured similar orientation of all images for the subsequent placement of craniofacial landmarks. Next, we placed 19 skull landmarks (Fig. 1) on each of the 119 images using the register visualization software (http://www.bic.mni.mcgill.ca/ServicesSoftware/HomePage). While it would have been more convenient to place landmarks only on the population-based average and then use the fully automated approaches, as previously developed by our group (Chakravarty et al. 2011) and employed in Marečková et al. (2013), this was not possible in this particular case due to the frequent presence of movement artifacts in T1-weighted images acquired in this sample.

Fig. 1

Skull landmarks (19) and their position: 1 inside corner of the left eye socket; 2 outside corner of the left eye socket; 3 inside corner of the right eye socket; 4 outside corner of the right eye socket; 5 bridge of the nose; 6 middle of the left mandibular sinus; 7 middle of the right mandibular sinus; 8 outside of the right mandibular sinus; 9 outside of the left mandibular sinus; 10 point around the 2nd vertebra where the lower teeth are most visible; 11 front of the lower teeth; 12 left jaw (unerupted lower 3rd molar); 13 right jaw (unerupted lower 3rd molar); 14 front of the upper teeth; 15 left canine; 16 right canine; 17 left first molar; 18 right first molar; 19 tip of the chin bone

We placed skull landmarks in those non-tissue spaces, represented by a low-intensity signal void that enabled high precision in positioning (e.g., particular teeth, corners of the eye sockets, tip of the chin). Skull rather than facial landmarks were used to capture craniofacial features that are independent of the amount of fat (or muscle) in the face (see Chakravarty et al. 2011; Marečková et al. 2011, 2013 for details).

To remove a possible confounding effect of the overall size of the skull, we used the methods described in Sibson (1978), Chakravarty et al. (2011), and Pujol et al. (2011) to normalize all landmarks with respect to the linear dimensions of the skull (as represented by all of the landmarks of the group). We achieved this by matching the landmarks from each participant to those of each of the other participants based on the estimation of a 12-parameter affine transformation (3 parameters each for translation, rotation, scale and shear). This yields 118 transformations for each participant, which are averaged and then applied to that participant’s set of landmarks. The result is a distribution of points that defines the overall shape variation in the skull across all participants. This is similar to the approach taken in active shape modeling in computer vision (Cootes and Taylor 1992) and in other craniofacial studies (Liu et al. 2012; Fink et al. 2005). Finally, we extracted X, Y and Z coordinates of the 19 landmarks for each of the 119 individuals.
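The pairwise matching described above can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline used in the study: the function names (fit_affine, normalize_to_group) are hypothetical, the affine fit is an ordinary least-squares estimate, and the transformations are averaged element-wise because the text does not specify how the averaging was done.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
Matrix = List[List[float]]


def solve(a: Matrix, b: Matrix) -> Matrix:
    """Solve a @ x = b (a is n x n, b is n x m) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate landmark set (e.g. all points coplanar)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_affine(src: Sequence[Point], dst: Sequence[Point]) -> Matrix:
    """Least-squares 4 x 3 matrix M with [x, y, z, 1] @ M close to the dst point.

    The 12 entries of M are the 12 affine parameters.
    """
    h = [list(p) + [1.0] for p in src]  # homogeneous source coordinates
    xtx = [[sum(r[i] * r[j] for r in h) for j in range(4)] for i in range(4)]
    xty = [[sum(r[i] * d[k] for r, d in zip(h, dst)) for k in range(3)]
           for i in range(4)]
    return solve(xtx, xty)


def apply_affine(m: Matrix, pts: Sequence[Point]) -> List[Tuple[float, ...]]:
    """Apply a 4 x 3 affine matrix to every landmark."""
    out = []
    for p in pts:
        hp = list(p) + [1.0]
        out.append(tuple(sum(hp[i] * m[i][k] for i in range(4)) for k in range(3)))
    return out


def normalize_to_group(shapes: Sequence[Sequence[Point]]) -> List[List[Tuple[float, ...]]]:
    """Map each participant's landmarks through the mean of its fits to all others.

    ASSUMPTION: the fitted matrices are averaged element-wise.
    """
    result = []
    for i, src in enumerate(shapes):
        fits = [fit_affine(src, dst) for j, dst in enumerate(shapes) if j != i]
        mean = [[sum(f[r][c] for f in fits) / len(fits) for c in range(3)]
                for r in range(4)]
        result.append(apply_affine(mean, src))
    return result
```

With 119 participants, the inner list comprehension produces the 118 transformations per participant mentioned in the text.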

Variation in craniofacial morphology can be described by a particular constellation of craniofacial landmarks. Principal component analysis (PCA) of the landmark-based data allows one to compute a point-distribution model (Chakravarty et al. 2011), modeling the spatial distribution of the landmarks in all the participants. Principal components describe variability in the landmark positions and define particular craniofacial features. Loadings of the principal components in each individual represent the amount of these particular craniofacial features in one’s craniofacial structure. Here, we normalized the landmark coordinates to a range between 0 and 1, mean centered them, and submitted them to PCA. In addition, 17 craniofacial distances (Table 1) were calculated as Euclidean distances between the normalized mean-centered coordinates. The statistical software package JMP was used to test the effect of twin group on the principal components (PCs) and to determine which of the 17 craniofacial distances correlated with the PC that showed a twin group difference. Bonferroni correction was applied to p values to account for multiple comparisons.
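The preprocessing and decomposition steps above can be illustrated with a short sketch. It assumes that each participant is one row of flattened landmark coordinates and that the 0 to 1 scaling is done per coordinate column; neither detail is stated in the text. All function names are hypothetical, and only the leading component is extracted (by power iteration) to keep the example dependency-free; a real analysis would use a full eigendecomposition from a statistics package.

```python
import math
from typing import List, Sequence


def minmax_columns(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Rescale every column to the range 0..1 (constant columns become 0)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(r, lo, hi)]
            for r in rows]


def center_columns(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Subtract the column mean from every value."""
    n = len(rows)
    means = [sum(c) / n for c in zip(*rows)]
    return [[v - m for v, m in zip(r, means)] for r in rows]


def first_component(rows: Sequence[Sequence[float]], iters: int = 500) -> List[float]:
    """Leading principal axis of centered data via power iteration.

    The start vector is arbitrary; power iteration fails only if it happens
    to be exactly orthogonal to the leading axis.
    """
    n, p = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [float(i + 1) for i in range(p)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def component_scores(rows: Sequence[Sequence[float]], axis: Sequence[float]) -> List[float]:
    """Project each participant onto a principal axis (one score per row)."""
    return [sum(a * b for a, b in zip(r, axis)) for r in rows]


def landmark_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Euclidean distance between two landmarks."""
    return math.dist(p, q)
```

A typical call chain would be center_columns(minmax_columns(data)), followed by first_component and component_scores; the per-participant scores correspond to what the text calls loadings.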

Table 1 Skull landmark distances (17)

Brain size calculation

Brain volume was estimated with SIENAX (Smith et al. 2001, 2002), part of FSL (Smith et al. 2004). SIENAX starts by extracting brain and skull images from the single whole-head input data (Smith 2002). Next, tissue-type segmentation with partial volume estimation is carried out (Zhang et al. 2001) and total brain volume is calculated.

Study 1: results

Age

All participants were 8 years old (M = 101.1 months, SD = 1.03; range 99.5–106.6 months) and their age did not vary as a function of twin group [F(3,115) = 1.13, p = 0.34].

Craniofacial features

The first 10 PCs from PCA explained a total of 78 % of the variance in the craniofacial features. As reported in Table 2, only PC2 and PC3 showed sex differences. We followed these two PCs with one-way ANOVA to test for differences in PC loadings across the four groups of twins (SSF, OSF, OSM, SSM). As reported in Table 3, only PC3 skull features showed a main effect of twin group [F(3,115) = 7.3, p = 0.0002 uncorrected; p = 0.002 corrected for 10 comparisons]. Post hoc analyses showed that the SSF group had higher loadings of PC3 skull features than any other group [SSF vs. OSF: t(61) = 3.01, p < 0.0001, Cohen’s d = 0.76; SSF vs. OSM: t(62) = 4.35, p < 0.0001, Cohen’s d = 1.08; SSF vs. SSM: t(46) = 2.56, p = 0.01, Cohen’s d = 0.77]. There were no differences in PC3 loadings across the OSF, OSM, and SSM groups; in particular, OSF did not differ from either OSM [t(69) = −1.51, p = 0.14] or SSM [t(53) = −0.003, p = 0.99]. Craniofacial features characteristic of the positive range of PC3 values are illustrated in Fig. 2.
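As a rough consistency check on the statistics above, the Bonferroni correction and the effect sizes can be recomputed from the reported values. The sketch assumes the group sizes given in the Participants section (SSF = 28, OSF = 35, OSM = 36, SSM = 20) and the standard approximation d = t·sqrt(1/n1 + 1/n2) for two independent groups; the function names are hypothetical, and the study itself may have computed d from group means and standard deviations.

```python
import math


def bonferroni(p: float, n_tests: int) -> float:
    """Bonferroni-corrected p value, capped at 1."""
    return min(1.0, p * n_tests)


def cohens_d_from_t(t: float, n1: int, n2: int) -> float:
    """Approximate Cohen's d from an independent-samples t statistic."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# Reported uncorrected p for the PC3 group effect, corrected for 10 PCs.
p_corrected = bonferroni(0.0002, 10)  # 0.002, as reported

# Group sizes are taken from the Participants section (assumption: these are
# the sizes that entered the post hoc tests, which matches the reported df).
d_ssf_osf = cohens_d_from_t(3.01, 28, 35)  # about 0.76
d_ssf_osm = cohens_d_from_t(4.35, 28, 36)  # about 1.10
d_ssf_ssm = cohens_d_from_t(2.56, 28, 20)  # about 0.75
```

The recomputed values fall within a few hundredths of the reported effect sizes (0.76, 1.08 and 0.77).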

Table 2 Sex differences in the 10 principal components (PCs) explaining variability in skull features
Table 3 Results of one-way ANOVA exploring the effects of twin group (SSF, OSF, OSM, SSM) on the two principal components (PCs) that showed a sex difference
Fig. 2

Simulation of the third principal component (PC3) craniofacial features. The first column of images represents craniofacial features characteristic for the mean values of PC3, further columns of images represent the positive range of PC3 values (0.5 up to 2 SD from the mean) that reflect craniofacial features characteristic for low exposure to prenatal testosterone. Please notice the increasing width and length of the jaw in the upper and lower row of the images, respectively

Correlations between the 17 craniofacial distances and PC3 features identified the same eight PC3-related distances in both the whole sample of twins (Table 4) and the female twins only. We then evaluated differences between OSF and SSF and found that five of these eight PC3-related features showed an effect of co-twin’s sex (Table 5; we show results for all 17 distances for completeness). Overall, SSF (vs. OSF) had smaller distance between the inside corners of eye sockets, larger distance between left- and right-third molars of the lower jaw, larger “left-third-molar to chin”, “left-third-molar to lower front-teeth”, and “right-third-molar to lower front-teeth” distance. This is consistent with the direction of a simple sex difference: females (vs. males) had also smaller distance between the inside corners of eye sockets [t(116) = 3.37, p = 0.01, Cohen’s d = −0.63], larger distance between left- and right-third molars of the lower jaw [t(116) = −2.00, p = 0.05, Cohen’s d = 0.37], and larger “left-third-molar to chin” [t(116) = −2.75, p = 0.007, Cohen’s d = 0.5], “left-third-molar to lower front-teeth” [t(116) = −4.65, p < 0.0001, Cohen’s d = 0.86], and “right-third-molar to lower front-teeth” [t(116) = −3.88, p = 0.0002, Cohen’s d = 0.72] distance.

Table 4 Correlations between the 17 skull landmark distances and PC3 features among all twins
Table 5 PC3-related distances in females with male (OSF) and female (SSF) co-twin

Brain size

Brain size was successfully calculated for 106 out of the 119 twins (13 individuals did not pass the quality control). As expected (reviewed in Paus 2010), male brains were larger (by 7.9 %) than female brains [t(104) = 6.42, p < 0.0001]. There was no effect of co-twin’s sex on brain size in either males [t(48) = 1.82, p = 0.07] or females [t(54) = −0.01, p = 0.99]. These results remained the same after correcting brain size for birth weight: Male brains were larger than female brains [t(96) = −6.48, p < 0.0001] and there was no effect of co-twin’s sex on brain size in either males [t(44) = −0.53, p = 0.60] or females [t(51) = −0.06, p = 0.95]. In the full sample (i.e., SSF, OSF, SSM and OSM), we observed a negative relationship between PC3 and brain size [t(105) = −2.31, p = 0.02, r = −0.22]; as predicted, the more “SSF-like” features were associated with lower brain size. When evaluated separately in each sex, the relationship was significant in males [t(50) = −2.96, p = 0.005, r = −0.39] but not in females [t(55) = 1.37, p = 0.18, r = 0.18].

Study 2: methods

In this study, we applied the model of PC3 features in an independent dataset of female adolescents and tested for the presence of a relationship between the PC3-related craniofacial features and brain size. In a previous study, females with a female co-twin had a slightly smaller brain than those with a male co-twin (Peper et al. 2009). We hypothesized that the presence of PC3-related features, an indicator of an absence of prenatal androgens (for PC3: SSF > OSF), would predict smaller brain size. Note that the results of Study 1 vis-à-vis the relationship between PC3 and brain size were inconclusive (present in males but not in females).

Participants

A total of 462 female adolescents born as singletons (age range = 12–18 years; M = 180.02 months, SD = 22.61, see Table S1 for the number of participants per age-bin) were recruited in the context of Saguenay Youth Study (SYS), a community-based sample of typically developing adolescents (Pausova et al. 2007). The study was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. Informed assent and consent forms were obtained from the children and their parents, respectively.

MRI data acquisition

We acquired T1-weighted (T1W) images of the head using a Philips 1T scanner and three-dimensional radio frequency (RF)-spoiled gradient-echo scan (TR = 25 ms, TE = 5 ms, flip angle = 30°, 140–160 slices, resolution 1 × 1 × 1 mm).

Landmarks and craniofacial features

The image-analysis pipeline for deriving craniofacial features uses the methodology described previously (Chakravarty et al. 2011) and employed in our studies of face and skull development during adolescence (Marečková et al. 2011, 2013). In brief, MR images are first corrected for intensity non-uniformity (Sled et al. 1998). Next, we remove the brain using a nonlinear transformation (Avants et al. 2008) that matches each T1-W image to an average model derived from 876 adolescent T1-W images (Grabner et al. 2006). The resulting images are then normalized to an intensity range of 0–10,000.

The goal of the remainder of the pipeline is to build an average model of the craniofacial structure for all of the study participants. Our approach has been inspired by similar pipelines created to estimate average models of brain anatomy (Frey et al. 2011; Borghammer et al. 2010; Grabner et al. 2006). Each image (original T1-W image minus the brain) is matched to a single image (chosen at random from the group) to normalize the spatial position and orientation using a 6-parameter transformation (3 translations and 3 rotations). Next, each image is normalized to the average head size of the entire group through the estimation of a 12-parameter transformation (3 translations, 3 rotations, 3 scales, and 3 shears) (Collins et al. 1994). Note that, unlike in Study 1, all transformations are estimated using intensity information from the images. An average model is then created and refined iteratively in subsequent nonlinear registration stages, where large differences in craniofacial features are accounted for first, followed by the estimation of transformations that progressively capture more subtle differences in craniofacial morphology (<2 mm; Chakravarty et al. 2011). The result is the estimation of a single transformation (a combination of linear and nonlinear transformations) that maps the craniofacial structure of each individual to the average craniofacial representation of the entire group. Using this average image, we then placed 19 landmarks (Fig. 1; the same landmarks as used in Study 1) at anatomically defined locations of the skull. Then, using the inverse nonlinear transformation generated through the hierarchical matching procedure described above, we fit all of the skull landmarks to each participant. This step provided a set of landmarks (and relevant Euclidean distances; Table 1) for each participant’s skull; the 3D position of each landmark is defined by X, Y and Z coordinates, termed below the “original” coordinates.

Projection of the facial signature

We transformed the model of twin PC3 generated in Study 1 to the (SYS) average image as follows. First, we calculated the centroids for each of the 19 landmarks in the previously registered twin dataset and aligned them using affine (12 parameters) registration to the 19 landmarks on the SYS average image. Next, we multiplied these 19 skull coordinates by the PC3 weights and thus created a set of 19 PC3-like coordinates. Finally, for each of the 19 landmarks and each participant, we calculated the difference between the original coordinates and PC3-like coordinates using the root mean square error (RMSE) formula. The RMSE thus determines how similar the craniofacial features of a given person were to those captured by PC3. Individuals from Study 2 whose faces were most similar to the PC3 model had the lowest RMSE values, a possible indicator of low exposure to androgens prenatally. In the manuscript, we refer to these PC3-like features as the facial signature.
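The similarity measure described above can be sketched in a few lines. The sketch assumes that the squared differences are pooled over all 19 landmarks and all three coordinates before taking the root; the text does not say whether the pooling was done this way, and the function name is hypothetical.

```python
import math
from typing import Sequence


def landmark_rmse(original: Sequence[Sequence[float]],
                  model: Sequence[Sequence[float]]) -> float:
    """Root mean square error between two matched sets of 3D landmarks.

    ASSUMPTION: errors are pooled across landmarks and X, Y, Z coordinates.
    Lower values mean the participant's skull is closer to the PC3 model.
    """
    if len(original) != len(model):
        raise ValueError("landmark sets must have the same length")
    squared = [(a - b) ** 2
               for p, q in zip(original, model)
               for a, b in zip(p, q)]
    return math.sqrt(sum(squared) / len(squared))
```

Each participant in Study 2 would receive one such value, computed between the participant's original coordinates and the PC3-like coordinates.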

Brain size calculation

As in Study 1, brain volume was estimated with SIENAX (Smith et al. 2002), part of FSL (Smith et al. 2004).

Study 2: results

Since we expected to find the relationship between the facial signature of prenatal androgens and brain size only in females, we explored the possible presence of this relationship in a sample of 458 females (4/462 participants were excluded due to failing the image-processing pipeline). As predicted, linear regression showed a positive relationship between RMSE and brain size [t(457) = 2.69, p = 0.007; R² = 0.02]; females with faces less similar to the PC3 model (i.e., higher RMSE) had bigger brains.

To identify craniofacial distances with the strongest relationship with brain size, we examined PC3-related distances that showed a difference between the SSF and OSF group (Study 1). Four of the eight PC3-related distances showed significant correlations with brain size: “right-third-molar to chin” (r = −0.29, p = 0.0008), “left-third-molar to chin” (r = −0.24, p = 0.0008), “left-third-molar to lower front-teeth” (r = −0.13, p = 0.03), and “spine to lower front-teeth” (r = −0.19, p = 0.0008).

Finally, we calculated an average of the two symmetrical distances that showed a relationship with brain size (“right-third-molar to chin” and “left-third-molar to chin”). Linear regression showed a negative relationship between the average of these distances and brain size [t(457) = −6.34, p < 0.0001]. This fine-tuned facial signature of prenatal androgens explained 8 % of the variance in brain size.
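The proportions of explained variance quoted in this section can be recovered from the reported t statistics. The sketch below assumes a regression with a single predictor, for which R² = t² / (t² + df), and uses the degrees of freedom exactly as reported; the function name is hypothetical.

```python
def r_squared_from_t(t: float, df: int) -> float:
    """R-squared of a single-predictor regression from its t statistic."""
    return t * t / (t * t + df)


# RMSE-based facial signature vs. brain size: t(457) = 2.69
r2_rmse = r_squared_from_t(2.69, 457)   # about 0.016, reported as 0.02

# Averaged molar-to-chin distance vs. brain size: t(457) = -6.34
r2_jaw = r_squared_from_t(-6.34, 457)   # about 0.081, reported as 8 %
```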

In a subset of the female participants (n = 192), we were also able to investigate the relationship between digit ratio and brain size. Neither the left nor the right digit ratio showed a relationship with brain size [left 2D4D: t(186) = 1.75, p = 0.08; right 2D4D: t(186) = 1.46, p = 0.15]. There was also no relationship between the facial signature (RMSE indicating the amount of PC3-like features) and digit ratio [left 2D4D: t(191) = 0.82, p = 0.41; right 2D4D: t(191) = −0.36, p = 0.72].

Discussion

We used a twin design and head MR images to study the possible effect of prenatal androgens on craniofacial morphology. Females with a female co-twin differed from all the other twin groups that were prenatally exposed to gonadal androgens (OSF, OSM, SSM) and the effect size was large (Cohen’s d ~ 0.8) in all three contrasts (SSF vs. OSF, SSF vs. OSM, SSF vs. SSM). The craniofacial morphology of the females with a female co-twin (vs. all other groups) was characterized by high loadings of PC3 features, mainly the longer and wider jaw that together formed a longer chin (see Fig. 2). We refer to these PC3-like craniofacial features, associated with low exposure to prenatal androgens, as the facial signature. This facial signature is not related to the width of the face, forehead and nose that characterize typically male-like features emerging during male adolescence (see PC1 description in Marečková et al. 2011, Table 1).

To verify the relationship between these craniofacial features and prenatal androgens, we used a large independent dataset of female adolescents (singletons) to explore the relationship between craniofacial features and brain size. Consistent with Peper et al. (2009), who showed that females with a female co-twin had slightly smaller brains than those with a male co-twin, we showed that the facial signature (PC3-like features, associated with low exposure to prenatal androgens) was negatively related to brain size: the set of PC3-related features could explain 2 % of the variance in brain size, and the length of the jaw could explain 8 % of the variance in brain size.

Sex differences in skull shape appear already in the first year of life (Bulygina et al. 2006). Male infants had a relatively larger and more globular frontal bone, smaller face, and a more flexed cranial base than female infants (Bulygina et al. 2006). While a comparison of these findings with the results from Study 1 is difficult due to the different sets of skull landmarks, we speculate that the larger frontal bone in male vs. female infants (Bulygina et al. 2006) might be consistent with the greater distance between the inner corners of eye sockets, characteristic for females with a male vs. female co-twin (Study 1). Facial features that showed an influence by prenatal androgens in our study also correspond to the embryonic development. Testes develop at 7–8 weeks of gestation (Tapp et al. 2011), which is a period characterized by the start of mandibular and maxillary ossification, formation of deciduous-tooth buds, and migration of eyes medially (Sperber et al. 2000).

In the twin study (Study 1), we found differences between the same-sex females and each of the other three twin groups, but no difference between the opposite-sex and same-sex males. This is consistent with the effect of co-twin’s sex on teeth size reported by Dempsey et al. (1999). It seems that the effect of prenatal androgens on the skull appears at certain levels of prenatal androgens but does not follow a simple (linear) dose response. Females produce very little endogenous testosterone and therefore gestation with a male co-twin has a relatively greater effect on females compared with males (Tapp et al. 2011).

The large effect of twin group on facial signature (Cohen’s d = 0.76 in SSF vs. OSF, 0.77 in SSF vs. SSM, and 1.08 in SSF vs. OSM) is consistent with the Anders and Verhon (2006) study of the effect of co-twin’s sex on digit ratio. The small (2 %) effect of facial signature (RMSE) on brain size in females from Study 2 is consistent with Peper et al. (2009), who reported a small difference in brain size between same- vs. opposite-sex females (Cohen’s d = 0.36, which corresponds to R² = 0.03). The fine-tuned facial signature (i.e., length of the jaw) explained even more variance in brain size (8 %).
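The correspondence between Cohen’s d = 0.36 and an explained variance of about 3 % can be checked with the standard conversion for two groups of equal size. The equal-size condition is an assumption made here for illustration; with unequal groups the constant 4 in the denominator changes.

```latex
r = \frac{d}{\sqrt{d^{2} + 4}}, \qquad
R^{2} = r^{2} = \frac{d^{2}}{d^{2} + 4}
      = \frac{0.36^{2}}{0.36^{2} + 4}
      = \frac{0.1296}{4.1296} \approx 0.03
```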

Experimental manipulations in mice showed that digit ratio is related to prenatal exposure to testosterone (Zheng and Cohn 2011). Still, the ability of digit ratio to explain variance in cognition or behavior might be limited. In Study 2, we compared the ability of digit ratio and facial signature to predict brain size and showed that while facial signature predicted a small portion of the variability in brain size, digit ratio did not predict any. It is possible that facial signature might be a more sensitive indicator of prenatal androgens than 2D:4D ratio. While sex differences in the levels of prenatal testosterone (measured directly) are large (Cohen’s d = 1.9; Knickmeyer et al. 2005; van de Beek et al. 2004), sex differences in 2D:4D ratio are rather small (Cohen’s d = 0.2 in Manning et al. 2007 and 0.3 in Manning et al. 2004). Given the medium effect size of the sex differences in facial signature (Study 1: Cohen’s d = 0.56), the facial signature may provide a useful additional, albeit still indirect, index of prenatal exposure to androgens. In our subset of participants, we compared the digit ratio and facial signature but did not find any relationship. This is consistent with Valla and Halazonetis (2014), who found no relationship between digit ratio and craniofacial shape in prepubertal children.

We are aware of a possible limitation of our study, namely the fact that we were not able to replicate the Peper et al. (2009) findings regarding the effect of co-twin’s sex on brain size. This may be related to the small effect of prenatal androgens on brain size (approximately 3 % according to Peper et al. 2009) and possible inter-individual differences in variables known to affect brain size, such as genes (Bartley et al. 1997; Paus et al. 2012), exposures to various toxins during pregnancy, such as cigarette smoke (Anblagan et al. 2013; Paus et al. 2012) or alcohol (Chen et al. 2003), and the duration of breast feeding (Isaacs et al. 2010). The absence of the relationship between PC3 and brain size in Study 1 females is consistent with the lack of the effect of co-twin sex on brain size in this sample. It is possible that the subtle effects of androgens produced by a male co-twin on his female co-twin require a relatively large sample size and/or ethnically homogeneous populations (the latter important especially for face morphometrics), both of which have been afforded by the SYS sample. It is of interest to note that, in Study 1, PC3 correlated (in the predicted direction) with brain size in male twins (regardless of their co-twin status). It is possible that inter-individual variations in the androgen levels produced by the male fetus influence both PC3 (reducing it) and brain size (increasing it).

Overall, these findings suggest that prenatal androgens did leave their signature in the skull and that this facial signature might be used, in the same way as 2D:4D ratio, as an indirect index of exposure to prenatal androgens. Given the widespread availability of T1-weighted head MRIs, an MR-based facial signature might be easily accessible to many researchers interested in the effects of prenatal androgens. The facial signature could also be estimated based on the PC3-related distances derived from photographs (profile and frontal view) and thus in the absence of MRI data. Study 2 showed that these distances can explain even more variance in brain size (up to 8 %) than the complete set of landmarks characteristic of PC3 (2 %). While the PCA methodology was invaluable for identifying these features, we conclude that the fine-tuned signature is a simpler predictor of prenatal androgens than the PC3-based one and might thus be more applicable for future research.

Conclusion

We used a cohort of 8-year-old dizygotic twins to study the relationship between prenatal androgens and craniofacial shape (Study 1). Head MR images enabled us to describe variability in skull features among same-sex and opposite-sex twins. Females with a female co-twin showed facial features that distinguished them from all other twin groups prenatally exposed to gonadal androgens. To verify the existence of the relationship between prenatal androgens and skull features, we studied the relationship of this facial signature with brain size in a large independent sample of female adolescents born as singletons (Study 2). We were able to predict 2 % and 8 % of the variance in brain size by this signature and the jaw length, respectively. We conclude that this signature of prenatal androgens in the skull might be used in future studies as an alternative to digit ratio to investigate further the role of prenatal androgens in brain development and disease risk.