Introduction

Factors influencing perceptions of physical attractiveness have long been of interest. Early studies [1, 2] suggested that female waist to hip ratio (WHR) indicates attractiveness, and several studies set this preference into an evolutionary context, by suggesting that the most favored WHR also maximizes female fertility [2, 3]. However, WHR is not independent of adiposity, and in a series of studies, it was shown that once body mass index (BMI) is taken into account, the impact of WHR is much reduced [4,5,6,7]. Studies across multiple cultures indicate that males prefer thinner females down to a BMI of 18–19 kg/m² [8,9,10,11].

A factor suggested to affect attractiveness ratings is socioeconomic status (SES) of the observer. Individuals from lower SES prefer higher BMI females [9, 10, 12,13,14]. The reasons for this effect are unclear. One idea is that high levels of resources translate to greater body fatness. Hence, when most people do not have many resources, body fatness might become a marker for resource possession, and therefore becomes physically attractive. However, this does not explain why, when most individuals have lots of resources, males prefer females that are thinner.

Nelson and Morrison [15] suggested that in a situation where there is collective resource scarcity, an individual would be likely to lack resources themselves. Hence, individual perception of their own resource status should provide information that would inform their judgements about attractiveness of potential mates. They tested this by evaluating how preferences for potential partners depended on financial satisfaction or hunger. They assumed that people experiencing low financial satisfaction, or high levels of hunger, would have implicit cues that resources were scarce, and this would lead them to prefer heavier mates. One study involved asking subjects going into a restaurant (hungry state) or leaving a restaurant (satiated state) to rate attributes of the most attractive potential partner. Hungry subjects said that their ideal partner would be heavier than did subjects who were satiated, thereby supporting the original hypothesis [15].

That ratings of attractiveness should be responsive to a variable like hunger is remarkable, because hunger fluctuates enormously over time, largely independently of whether the overall resource availability is high or low. People with high access to resources do not live in a state of permanent satiety, and people with low access to resources are not permanently hungry. Nevertheless, this restaurant experiment [15] has been repeated on at least two occasions, and the primary results have been confirmed [16, 17]. Males going into restaurants consistently rate heavier female images as more attractive than do males leaving restaurants.

Although these previous studies are often described as “experimental” studies, they are not randomized controlled trials because the experimenters had no control over the subject allocation, nor any control over other things that may have happened during the restaurant visit. Hence, we wished to test the idea that transient hunger status affects male ratings of physical attractiveness, but using a more rigorous randomized controlled trial design. In a second study, we investigated whether alcohol consumption, which was a potential confounding factor, may have played a role in the previous positive associations.

Methods

The experiments were approved by the Ethical Review Board of the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences. The approval number for the hunger experiment was IGDB-2018-IRB-002 and for the alcohol study it was IGDB-2018-IRB-005. Both experiments were preregistered at the Chinese clinical trials registry: hunger experiment ChiCTR-ROC-17013771 and alcohol experiment ChiCTR1800017022.

Experiment 1

The participants were 52 Chinese males aged between 20 and 35. Individuals with different educational backgrounds and jobs were recruited by posters, word of mouth, and Internet chat sites. We did not study females because there was no previous work suggesting that female perceptions of male attractiveness depend on hunger. We did not ask individuals their sexual orientation prior to the testing, as this was not done in the previous studies at restaurants. The experiments were performed on groups of 4, 6, or 8 individuals. Subjects were asked to attend our institute at 0830 hours having fasted from 1800 hours the previous evening. When the participants arrived, we introduced the aims and procedures of the experiment to them. To disguise the primary focus on physical attractiveness, they were told the aim was to assess the impacts of hunger on a range of cognitive tasks. Individuals then gave informed consent to participate. Once they had consented, their blood glucose level was measured using a pin-prick blood-drop test and glucometer (Johnson OneTouchUltraEasy). Individuals that had starved overnight were expected to have blood glucose lower than 6.0 mmol/L. Individuals with higher blood glucose were not admitted into the study (this only happened once). In that case, to make the participant numbers even, we asked an additional individual to voluntarily quit the experiment. Both of these participants still received a free lunch as payment for attending. All individuals were weighed and had their body composition measured by an FM bioimpedance analyzer (Tanita: TBF-418B, Japan).

The individuals were randomized into two groups by choosing marked balls from a box. Equal numbers of starved and fed individuals were randomized in each subgroup, i.e., if eight individuals attended, we randomized four to starvation and four to eating. All individuals remained in the laboratory under constant observation for 4 h. Individuals sat with their own group, but the two groups were about 5 m apart. The two groups were able to see each other. That is, individuals in the starvation group were able to see the individuals randomized to “Eating” consuming their food. We did this deliberately as we felt it would possibly make them hungrier. Individuals were allowed to read or use their laptops/mobile phones during this 4-h period. The “Starving group” was permitted to drink water but not given any food. Individuals randomized to the “Eating group” were each given a double hamburger and soda (full sugar 500 ml) immediately after randomization (around 0900 hours). At 1030 hours, they were given a noodle snack or chicken legs. Throughout the experiment, they had ad libitum access to additional snacks (potato chips) and sweets (chocolates (Dove) and chocolate with cake (Orion)). They were encouraged to eat as much as they could. What each individual actually ate was recorded. We weighed all the provided food and how much was left at the end of the experiment. The amount of food they had consumed was calculated and converted into energy consumption. Around 1100 hours all individuals in both groups were given a menu from which to choose a meal that would be provided once the experiment was over. At half-hourly intervals, all individuals completed a Visual Analog Scale (VAS) for their feelings of hunger. After 3.5 h, at around 12 noon, blood glucose was measured again.
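The balanced allocation within each attending session can be illustrated in code. The following is only a minimal sketch of an equivalent procedure, not the method used in the study (which drew physical marked balls from a box); the function name `randomize_session`, the participant labels, and the seed are hypothetical.

```python
import random


def randomize_session(participant_ids, seed=None):
    """Split one attending session into equal 'fed' and 'starved' halves.

    Hypothetical helper: mimics drawing marked balls from a box, with
    exactly half of the balls marked for each treatment.
    """
    if len(participant_ids) % 2 != 0:
        raise ValueError("an even number of participants is required")
    rng = random.Random(seed)
    # One "ball" per participant, half marked for each group.
    balls = ["fed", "starved"] * (len(participant_ids) // 2)
    rng.shuffle(balls)
    return dict(zip(participant_ids, balls))


# Example: a session of eight attendees (labels are made up).
allocation = randomize_session(
    ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"], seed=1
)
```

Shuffling a fixed set of labels, rather than tossing a coin for each person, guarantees equal group sizes in every session, which is the property described above.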

The cognitive tests started after the VAS and blood glucose were completed at 12 noon. Individuals were split up and performed the tests in isolation from each other (about 2 m apart). Individuals in the fed group had snacks available throughout the testing period. All individuals continued to be observed from a distance throughout the testing phase.

The tests comprised a general IQ test, a female attractiveness rating test that we have used previously to evaluate male ratings of female attractiveness [11], and a cellphone-based memory test. The memory test is called the “instant memory test” (from the Android Market). The test consists of a screen with a chessboard-like grid on it. At the start, there are two balls on the board numbered 1 and 2. After 2 s, the balls disappear and the player has to indicate where they were located on the grid in the correct numerical order. If participants choose the correct order and locations, then the game resumes with three numbered balls in new randomized positions. The number of balls increased by 1 in each round. The game continues until an error is made. The score is the accumulated number of correctly located balls across all rounds of the game. Participants were asked to finish the game three times and we chose the highest score across all three attempts as their memory score. This task is formally known as an object-location binding task, because the person is required not only to identify the numbered objects in order, but also their spatial locations.
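The scoring rule for the memory test can be sketched as follows. This is an illustration under stated assumptions rather than the app's actual implementation: it assumes a game is summarized by the number of rounds completed without error, that round sizes run 2, 3, 4, ... balls, and that the failed round contributes nothing to the score. The function names are hypothetical.

```python
def game_score(rounds_completed, first_round_balls=2):
    """Accumulated number of correctly located balls in one game.

    Assumption: only fully correct rounds count, and each round has one
    more ball than the previous one, starting from `first_round_balls`.
    """
    return sum(first_round_balls + i for i in range(rounds_completed))


def memory_score(attempts):
    """Best score across a participant's attempts (three in the study)."""
    return max(game_score(rounds) for rounds in attempts)


# Example: a participant clears 3, 5, and 4 rounds in three games.
best = memory_score([3, 5, 4])  # 2 + 3 + 4 + 5 + 6 = 20
```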

The tests lasted for about 50 min. It was not possible to blind either the experimenter who was observing, or the subjects, to the treatment they were exposed to. However, the subjects were blind to the expectations with respect to the attractiveness task. After the testing period, all individuals in both groups were given lunch.

Experiment 2

A potential confounding factor in the previous observational studies was that individuals entering and leaving a restaurant may not only have differed in their levels of hunger, but also have been affected by other events that happened in the restaurant. Perhaps chief among these was consumption of alcohol. We therefore conducted a second experiment to assess the impact of low levels of alcohol consumption on two of the tests used in experiment 1.

We recruited the individuals by poster and word of mouth. All the participants were male. The experiment was designed to have a double-blind procedure. There were two experimenters: the first only watched the individuals and did the testing, and was blind to who had consumed alcohol; the second made up the doses of alcohol, provided them to the individuals, and measured their breath alcohol levels.

We asked the participants (N = 32) to come to the institute at 1800 hours. They came in groups of 4−8. After they arrived, they were told the aim of the study was to assess the impact of drinking alcohol on performance on some tests, and they gave informed consent. They were not informed that some of them would not be drinking any alcohol. We then performed a breath alcohol test using a machine provided by the Chinese Police (Model: Jiuan-1000) to make sure there was no baseline alcohol in their circulation. We set a baseline exclusion criterion of >5 mg/100 ml. None of the subjects exceeded this level in the first measurement. After that we measured their body mass. Then researcher #1 did a body fat analysis (as in experiment 1) and gave a number to each individual. Researcher #2 used the numbers of the individuals to randomize their allocation to a group given beer containing 4% alcohol at a rate of 10 ml/kg body mass, and a control group given alcohol-free beer. Hence, an 80 kg individual would have consumed 800 ml of beer containing 32 ml of alcohol. That would be equivalent to drinking two regular 125 ml glasses of wine with 13% alcohol. We considered these levels of consumption to be representative of levels often consumed with an evening meal, as would likely have happened during the restaurant tests performed previously. The two kinds of beer were commercially available beers manufactured by the YANJING Brewing company, China, and were purchased from a local supermarket. The subjects and researcher #1 were both blind to who was drinking alcohol. Individuals were given cups with their individual ID number and were instructed to drink the contents in 30 min. The amount of beer provided was related to the body mass of the subjects, since the alcohol content of the beer (4%) was constant, yet we wanted to dose at a rate related to body mass. The subjects all sat together while consuming the beverages.
After they finished, researcher #2 repeated the breath alcohol test. The subjects and researcher #1 remained blind to the grouping. In line with the feeding study, we also inserted the instant memory test to cover up the focus of the study on attractiveness. The subjects were observed during the testing phase by researcher #1. During informal questioning, once the testing was complete, the subjects were unaware that some individuals had not consumed any alcohol, and the researcher who was blinded could also not distinguish the two groups.
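The dosing arithmetic for the beer can be checked with a short calculation. This is a minimal sketch using only the figures stated in the text (10 ml of beer per kg body mass, 4% alcohol by volume, and the comparison with two 125 ml glasses of 13% wine); the function name `beer_dose` is hypothetical.

```python
def beer_dose(body_mass_kg, rate_ml_per_kg=10.0, abv=0.04):
    """Return (beer volume in ml, pure alcohol volume in ml) for one person."""
    volume_ml = body_mass_kg * rate_ml_per_kg
    alcohol_ml = volume_ml * abv
    return volume_ml, alcohol_ml


# An 80 kg participant: about 800 ml of beer and about 32 ml of alcohol.
volume_ml, alcohol_ml = beer_dose(80)

# Two 125 ml glasses of 13% wine contain about 32.5 ml of alcohol.
wine_alcohol_ml = 2 * 125 * 0.13
```

For an 80 kg participant, the stated rate gives roughly 800 ml of beer, which is in line with the range of volumes reported in the Results for experiment 2.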

Statistics

We used RStudio to analyze the data and GraphPad to draw the graphs.

The primary outcome was the ratings of attractiveness. We analyzed this in two different ways. First, we compared the responses of the two groups. For the ratings test, we converted the rank orders into scores using an equation from a previous study [11]. The scores followed the formula a_n = 1 + (n − 1) × 0.4, where n is the rank order of the image from the least to the most attractive. That is, the most attractive image had n = 21, so a_21 = 1 + (21 − 1) × 0.4 = 9. For the secondary outcomes we used t tests. Second, we looked for relationships between the ratings of attractiveness and the actual ratings of hunger by VAS and the blood glucose, as well as the absolute circulating alcohol levels and changes in these levels, using least squares linear regression. The overall rating task for attractiveness may miss subtle effects at the extremes. To explore the data for these effects, we calculated the average body fatness of the top 3 and top 5 ranked individuals for each rater, and similarly the average fatness of the bottom 3 and bottom 5 ranked individuals. We then compared the averages of these ratings between the fed and starved groups, and between the groups that had and had not drunk alcohol, using two-sample t tests. We set the significance criterion at p = 0.05 and corrected for multiple testing where necessary using the Bonferroni correction.
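The conversion from rank order to attractiveness score can be written out directly. This is a minimal sketch of the formula given above; the function name `rank_to_score` is hypothetical, and the default of 21 images is taken from the worked example in the text.

```python
def rank_to_score(n, n_images=21):
    """Convert a rank (1 = least attractive, n_images = most) to a score.

    Implements a_n = 1 + (n - 1) * 0.4, so scores run from 1 to 9
    when there are 21 images.
    """
    if not 1 <= n <= n_images:
        raise ValueError("rank must lie between 1 and n_images")
    return 1 + (n - 1) * 0.4


lowest = rank_to_score(1)    # least attractive image
highest = rank_to_score(21)  # most attractive image
```

The linear mapping spreads the 21 ranks evenly over a 1 to 9 scale in steps of 0.4.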

Results

Body fat

We compared the body fat percentage of participants in control and treatment groups in the two experiments at baseline and there was no significant difference (experiment 1: mean for fed group = 20.66, SD = 6.5, n = 26; mean for starved group = 20.57, SD = 7.2, n = 26, T test: t = −0.044, p = 0.965. Experiment 2: mean for alcohol group = 17.59, SD = 5.78, n = 16; mean for alcohol-free group = 14.33, SD = 5.32, n = 16, T test: t = 1.66, p = 0.11; Table 1).
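The t statistic for the baseline comparison can be recomputed from the summary statistics alone. The sketch below assumes an unequal-variance (Welch) two-sample test; the text does not state which variant was used, although with equal group sizes the Welch and pooled-variance t statistics coincide. The function name `welch_t` is hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch two-sample t statistic and degrees of freedom from summaries."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Experiment 2 baseline body fat %: alcohol vs alcohol-free group.
t_stat, df = welch_t(17.59, 5.78, 16, 14.33, 5.32, 16)
```

With the experiment 2 values this gives a t statistic of about 1.66, which agrees with the reported value.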

Table 1 The body fatness of the randomized two groups engaged in the two experiments

Experiment 1: impact of hunger

At randomization, there was no significant difference in the blood glucose level between the two groups (mean for fed group = 5.17, SD = 0.54, n = 26, mean for starved group = 4.91, SD = 0.54, n = 26: Supplementary Fig. 1a) nor in their average self-reported levels of hunger from the VAS (mean for fed group = 3.58, SD = 2.86; n = 26, mean for starved group = 3.50, SD = 2.60, n = 26: Supplementary Fig. 1b). There was no significant relationship between the circulating glucose level and the self-ratings of hunger at baseline (Supplementary Fig. 1c). At the end of the 3 h after randomization, the blood glucose of the starved group was unchanged (mean for starved group after 3 h fasting = 5.15: t = −1.57, df = 49.71, p value = 0.12) but that of the fed group was significantly elevated (mean for fed group after 3 h eating = 6.96: t = −7.19, df = 49.71, p value < 0.01). During the course of the experiment, the individual self-ratings of hunger by the VAS tool progressively diverged with the individuals in the starved group becoming progressively more hungry, with an average final mean value of 6.27 (SD = 3.13, n = 26, which differed significantly from their initial ratings 3 h earlier: paired t test, t = 3.89, p < 0.001: Supplementary Fig. 1d). In contrast, the fed group became progressively more satiated and finished with an average final mean value of 1.42 (SD = 0.81, n = 26, which differed significantly from their initial ratings 3 h earlier: paired t test, t = 3.74, p < 0.001). The final hunger ratings of the starved group were significantly higher than those of the fed group (two-sample t test: t = 7.16, p < 0.001). Final hunger ratings were significantly negatively related to circulating glucose levels (Supplementary Fig. 1e).

Because individuals were continuously observed, we know that the starved group consumed no food during the interval between randomization and testing. The food consumption of the fed group varied between individuals. The highest was 2.55 kg of food/drink and the lowest 700 g. We converted the food into energy units and the average intake was 6.96 MJ. There was a significant positive relationship between the energy they consumed and the circulating glucose at the end of the feeding period (Supplementary Fig. 1f), but there was no relationship between the energy intake and final VAS hunger rating in the fed group (Supplementary Fig. 1g).

Primary outcome

As we have shown previously using this measurement tool, there was a strong negative relationship between ratings of attractiveness and subject body fat % (Fig. 1a) and BMI (Fig. 1b). There was a less significant effect of WHR (Fig. 1c). In all three cases, there was no significant effect of the randomized group on these relationships (for body fat percentage: ANCOVA group effect F = 0, p = 1; for BMI: ANCOVA group effect F = 0.02, p = 0.97, for WHR: ANCOVA group effect F = 0, p = 1).

Fig. 1

Effect of hunger and alcohol on ratings of physical attractiveness. The plots show a ratings of attractiveness against body fat percentage, b body mass index, and c waist to hip ratio. In all cases there was a strong negative relationship. The relationships did not differ between fasted and fed groups. d Ratings of attractiveness against waist to hip ratio, e body fat percentage, and f body mass index. In all cases, there was a strong negative relationship. The relationships did not differ between groups that had and had not drunk alcohol

Because this tool may not detect subtle differences in preferences at the extremes, we calculated the body fat% of the top 3 and top 5 most attractive rated images, and the body fat% of the bottom 3 and bottom 5 least attractive rated images for each of the raters. There was no significant difference in the means of these values between the randomized groups (Table 2). There were also no significant relationships between the individual values of these four ratings and the hunger as evaluated by the final VAS measure (Fig. 2a–d) or the change in blood glucose between initial and final measurement (Fig. 2e–h).
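The per-rater summary at the extremes can be sketched as follows. This is an illustration only: it assumes each rater's result is stored as a list of image body fat percentages ordered from most to least attractive, and both the function name `extreme_fatness` and the example values are hypothetical.

```python
def extreme_fatness(ranked_body_fat, k):
    """Mean body fat % of the k most and k least attractive rated images.

    `ranked_body_fat` is assumed to be ordered from the image rated most
    attractive to the image rated least attractive by a single rater.
    """
    if not 0 < k <= len(ranked_body_fat):
        raise ValueError("k must be between 1 and the number of images")
    top_mean = sum(ranked_body_fat[:k]) / k
    bottom_mean = sum(ranked_body_fat[-k:]) / k
    return top_mean, bottom_mean


# Made-up ranking of six images by one rater (body fat % of each image).
top3, bottom3 = extreme_fatness([10, 12, 14, 30, 32, 34], k=3)
```

Computing these values for every rater yields one top-k and one bottom-k number per person, which can then be compared between groups or regressed against hunger or glucose.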

Table 2 The average body fat percentage of the top 3 or 5 most attractive rated images and bottom 3 or 5 least attractive rated images averaged across the individuals in the fed and starved groups (experiment 1) and the individuals given or not given alcohol to drink (experiment 2). The averages were compared using two-sample t tests, the results of which are also shown
Fig. 2

Impact of hunger of raters on the body fatness of the three and five most rated attractive individuals and the three and five least rated attractive individuals. In (a) to (d) the ratings are plotted against the VAS rating of hunger and in (e) to (h) against the circulating glucose levels. None of the relationships was significant

Secondary outcomes

There was no significant difference between the randomized groups in their IQ test scores (starved group: mean = 114.8, SD = 19.14; fed group: mean = 114.1, SD = 18.47, t test: p = 0.89, Fig. 3a) and no significant relationship between individuals’ IQ and the individuals’ final ratings of hunger by VAS (Fig. 3b) or the change in circulating glucose (Fig. 3c). In terms of the test outcomes, there was a significant difference between the randomized groups in their performance on the instant memory test (mean of starved group = 21.54, SD = 5.78; mean of fed group = 25.85, SD = 6.18, t test: p value = 0.01, Fig. 3d). There was a significant positive relationship between the memory performance and circulating glucose levels (R2 = 0.096, p < 0.001, Fig. 3e).

Fig. 3

Secondary outcome effects on IQ and memory recall. a No difference between starved and fed groups in performance on the IQ test. Relationship between IQ and b final hunger levels and c change in circulating glucose between start and final measurement. d Significant difference between starved and fed groups on the memory task and e relationship between the performance on the instant memory test and circulating glucose levels. f Memory ability between the two groups, one given and the other not given alcohol, and g the relationship between circulating alcohol levels and memory ability in the group that consumed alcohol

Experiment 2

When first tested, all the participants had 0 mg/100 ml of predicted circulating blood alcohol from the breath test. The highest volume of beer consumed was 1162 ml, the lowest was 644 ml, and the mean was 844 ml. For all individuals in the alcohol-free group, their circulating alcohol was still 0 mg/100 ml after consuming the drinks, but in the alcohol group their alcohol breath levels increased. Their predicted blood alcohol from the breath test varied from 13 mg/100 ml to 38 mg/100 ml (mean = 27.38, SD = 6.82).

There was no significant effect of the randomized group on the ratings of attractiveness (for body fat percentage: ANCOVA group effect F = 0.05, p = 0.82; for BMI: ANCOVA group effect F = 0.71, p = 0.40; for WHR: ANCOVA group effect F = 0, p = 0.99) (Fig. 1d–f). There were also no differences in the body fatness of the most attractive rated individuals (Table 2), but for the least attractive individuals the average adiposity of the three least attractive individuals was significantly lower in the individuals that had drunk alcohol (Table 2). Moreover, when we explored the relationships between the ratings of the most and least attractive individuals and the level of circulating alcohol, we found that while there was no effect of alcohol on the body fatness of the most attractive rated individuals (Fig. 4a, c) there was a significant negative relationship for the least attractive individuals (Fig. 4d: average adiposity of the bottom 5 rated individuals r2 = 0.28, p = 0.033). There was a single outlier in this relationship that we were worried was causing the significance; however, when we removed it the r2 increased to 0.53 and the p value declined to 0.0019. We did not find any significant effect of alcohol consumption on the instant memory test (mean of alcohol group = 24.44, SD = 6.38; mean of alcohol-free group = 25.19, SD = 5.97. t test: p value = 0.73, Fig. 3f). Also there was no significant relationship between the memory ability and circulating alcohol levels (Fig. 3g).

Fig. 4

Impact of alcohol intake of raters on the body fatness of the three and five most rated attractive individuals and the three and five least rated attractive individuals. The only significant correlation was a negative relationship for the five least attractive rated individuals (d)

Discussion

Based on both objective (blood glucose changes) and subjective (VAS hunger ratings) measurements, the first experimental manipulation was successful at altering the levels of hunger. However, in contrast to the previous three observational studies [15,16,17], we found no evidence that hunger altered the perception of physical attractiveness towards subjects with greater adiposity.

There are several differences between our study and the previous work that may potentially explain these different outcomes. First, our study concerned Asian men, while previous work involved Caucasians. The strong consistency of the perception of physical attractiveness of females of different adiposity between Asians and Caucasians [8,9,10,11], however, suggests this difference was unlikely to be important. In addition, the original hypothesis underlying the presumed impact of hunger never indicated that this might be something restricted to one particular culture [15]. Second, we used a different tool to measure physical attractiveness, but this tool [11] provides results highly consistent with other measures that indicate males typically prefer leaner females. Third, our study took place in the morning between 0830 and 1300 hours, while previous studies took place in the evening. This time difference again seems an unlikely source of the large difference between the outcomes.

The most significant difference between our study and the previous studies was that ours was a randomized controlled trial, while previous studies [15,16,17] intercepted diners entering and exiting restaurants. The experimenters in these previous studies therefore did not randomize the participants to the different treatments. Possibly more important than randomization, however, they were unable to control for confounding factors that might co-vary with the dining experience. Hence between entering and exiting a restaurant, prospective diners would probably have consumed food, and would hence be less hungry; however, in addition, they would likely also have engaged in other consumptive behaviors such as drinking alcohol, drinking coffee or tea and possibly smoking.

Of these, the most important potential confound was alcohol intake. Alcohol consumption impacts ratings of attractiveness by making all subjects more attractive [18,19,20,21]. However, more critically, it also reduces the ability to perform simple discrimination tasks. Hence, individuals who have consumed alcohol show altered preferences for facial symmetry, because they find it harder to distinguish symmetrical from nonsymmetrical objects of any sort [22]. This might be an explanation for the difference between our study and previous work. That is, in the previous studies individuals exiting the restaurants were not only less hungry, but had likely also consumed alcohol, which diminished their ability to distinguish subtle differences in adiposity in the rating test.

To evaluate this, we performed a second experiment to see if consumption of alcohol at a level similar to that presumed to be consumed with a meal (about two glasses of wine) might lead to altered perceptions of attractiveness using the same tool. As with experiment 1, our manipulation successfully altered the level of circulating alcohol in the exposed compared to the nonexposed group. The predicted blood alcohol from the breath test varied from 13 to 38 mg/100 ml. For comparison, the drink-driving limit in England and Wales is 80 mg/100 ml of blood (35 µg/100 ml of breath). This experiment provides support for the suggestion that alcohol consumption may have underpinned the differences reported previously. Although alcohol did not appear to impact the adiposity of the images that were rated most attractive, there was a significant negative relationship between circulating alcohol and the mean adiposity of the five individuals rated as least attractive. That is, as circulating alcohol increased, the adiposity of the least attractive individuals was reduced; hence alcohol made the individuals less discriminatory against those with high adiposity. This may be a sufficiently large effect to cause the trends reported previously in individuals exiting restaurants compared to those entering. Unfortunately, we do not know what the alcohol levels were in the previous experiments because they were not measured.

Because we embedded the attractiveness rating between two other tasks, we also had some secondary outcomes. The IQ test we used was not sensitive to the hunger levels of the subjects. Hence, for individuals planning to take an IQ test, there would appear to be no benefit in either eating or fasting in advance of the test. These data contrast with previous work that suggested individuals who scored more highly on an IQ test had higher circulating glucose levels on a subsequent oral glucose tolerance test, suggesting greater assimilation capacity for glucose might be linked to greater IQ score [23], although in that study the glucose monitoring was not performed simultaneously with the IQ test. In contrast, high blood glucose in type 2 diabetic patients appeared to be negatively linked to cognitive performance including IQ tests [24]. Unlike the effect on IQ, performance on the memory recall test was reduced by about 17% by starving (mean score 21.54 vs 25.85), and the performance was positively related to circulating glucose levels. This result is consistent with previous work that has shown drinking glucose after an overnight fast can result in a temporary enhancement of cognition, particularly episodic memory, an effect called the “glucose enhancement effect” [25,26,27,28]. This is consistent with glucose being the major energy substrate that supports neuronal function, notably in object-location binding tasks [28] such as the one performed in our study, and shows that the hunger levels we generated were sufficient to generate data consistent with a different well-established impact of feeding on cognitive performance.

Conclusions

This randomized controlled trial failed to replicate previous nonrandomized observational studies, which suggested ratings of female physical attractiveness by males are sensitive to hunger. The reason for the difference was possibly because in previous studies hunger was confounded by alcohol consumption.