1 Introduction

The concept of homophily refers to the fact that social characteristics or attributes that are common to groups of individuals involve more intense relational behaviours among their members; on the other hand, heterophily involves the existence of common relational behaviours between different groups. Homophilic processes and their complementary heterophilic processes thus highlight the attributes of a group and relate them to relational behaviours and their consequences. From a sociological point of view, this process could be reversed: it could also be said that similar relational behaviours (interactions) generate the common attributes or categories of social agents.

This article focuses on homophily and heterophily applied to the analysis of a set of personal or ego-centred networks. The article aims to identify and analyze separately two types of homophily and heterophily and then compare them. These two types of homophily and heterophily are based on the nature and content of the relations between individuals, which in some ways are complementary: elementary homophily (and heterophily) and specific homophily (and heterophily). The name generator that was used to build up the links among actors asked for relationships of reciprocal knowledge. Thus the content of elementary homophily (and heterophily) represents the basis and prerequisite for any other type of content. The content of the relationship attributed to specific homophily (and heterophily) consists of the intensity of the relationship of the egos with their alters on a scale ranging from Not at all close through Fairly close, Close and Very close to Intimate. Therefore, the contents attributed to specific homophily and elementary homophily overlap in the reciprocal knowledge.

The article is organised in four sections plus the introduction. In section two a short state of the art is presented. Section three refers to the research design and database. Section four presents the analysis, where the methods and indicators used are those of social network theory and analysis applied to ego-centred networks. In this section, the groups are identified from demographic and work-related attributes that shape the social structure. Finally, in section five the conclusions are offered.

2 Homophily and heterophily: from attributes to relations

2.1 The logic of homophilic and heterophilic processes: from the attributes to the social relationship (or interaction)

The social structure is expressed by variables in which each value or category constitutes a group with equal status or social resources. The homophilic effect makes agents with identical attributes or statuses correspond to increasingly frequent relationships. The heterophilic effect makes agents with different attributes or statuses correspond to more frequent relationships. These behaviours can lead to similar practices, habits, behaviours, beliefs, feelings or tastes. A group’s resources, which are assumed to be similar within each position or attribute, condition the interactions between its agents, resulting in greater relational identity in each category.

The methodological process and design involved in the analysis of homophily and heterophily are simple: they consist of contrasting the attributive variables with the relational variables. Homophily increases with the degree of isomorphism between the groups corresponding to different attributes and the relational behaviour of their individuals. Heterophily corresponds to the absence of isomorphism, i.e. the classification based on the groups’ attributes does not correspond to similar behaviours in their relationships.

2.2 Varieties and causes of homophily and heterophily

The idea of homophily is implicit in Simmel (1971[1908]) and in Park and Burgess (1921). Lazarsfeld and Merton (1954) are among the first explicit exponents. Homophily studies have focused mainly on socio-demographic variables such as age, gender, educational level, occupation, social class and status (McPherson et al. 2001).

The literature offers different types of classification of homophily but with fairly similar criteria. An important distinction is that between individual homophily and structural homophily: the former originates from preferences and choices based on criteria of affinity and intention, and the latter from shared contextual, social or physical characteristics or situations, such as geographic location, occupation and workplace. Equally important is the distinction between baseline homophily and inbreeding homophily. In baseline homophily the proportion of agents of the same category in a population is proportional to that of the agents who have similar and/or more intense relational behaviours; it is like the water line or zero level of homophily. In inbreeding homophily the proportion of homophilic relationships exceeds that expected in random conditions in view of the composition of the population (McPherson et al. 2001; Yuan and Gay 2006; Kossinets and Watts 2009; Pearson et al. 2006). Other expressions used to discuss the phenomenon of homophily and heterophily include proximity or social distance, social opening or closure and homogeneity or complementarity. The terms ‘homophilic value and status’ and ‘cognitive similarity’ are also used (McPherson et al. 2001; Monge and Contractor 2003).
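The distinction between baseline and inbreeding homophily can be illustrated with a short sketch. It uses an index commonly attributed to Coleman, which is not the indicator used in this article: the share of a group's ties that stay within the group is compared with the group's share of the population. The function name and the numbers are hypothetical and serve only as an illustration.

```python
def inbreeding_index(same_group_share: float, population_share: float) -> float:
    """Coleman-style inbreeding index (illustrative, not the article's measure).

    Returns 0 when ties simply mirror the population (baseline homophily),
    a positive value when same-group ties exceed that baseline (inbreeding
    homophily) and a negative value when they fall below it.
    """
    if not 0.0 <= population_share < 1.0:
        raise ValueError("population_share must lie in [0, 1)")
    return (same_group_share - population_share) / (1.0 - population_share)


# Hypothetical group that makes up 30% of the population.
baseline = inbreeding_index(0.30, 0.30)  # same-group ties mirror the population
inbred = inbreeding_index(0.65, 0.30)    # same-group ties exceed the baseline
print(baseline, inbred)
```

An index of 1 would correspond to a group whose ties are all internal, whatever its size in the population.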

Homophily studies have dealt widely with schools and neighborhood institutions (Wellman 1929) and the behaviour of adolescents and the influence of age, gender, education, ethnicity and racial discrimination (Loomis 1946; Leenders 1996; Marsden 1987; Mollica et al. 2003). Homophily studies on attitudes and aspirations have also been important (Cohen 1977).

The explanations of how and why homophily occurs point to a wide variety of factors, ranging from geographical segregation and spatial proximity, to the composition of organizational foci (Feld 1981), cognitive processes of many different kinds, including social advantages in communication processes between similar people (Festinger 1957; Werner and Parmelee 1979; Hamm 2000), the increased confidence and solidarity gained from people with greater equality in characteristics and thus a relational similarity (Portes and Sensenbrenner 1993; Mollica et al. 2003), the alleged reduction of risks associated with the treatment of people with similar characteristics (Mouw 2003), or in general the cost reduction in social communication processes (Leenders 1996), and similar positions in social structures (see McPherson et al. 2001 for further details).

2.3 The introduction of relational mediation in homophily and heterophily

In the 1970s and 1980s there was a leap forward in the study of homophily with the introduction of social networks in comparative case studies, sampling studies (Marsden 1987), and studies in schools (Duncan et al. 1972; Shrum et al. 1988), specific groups (Laumann 1973; Fischer 1982) and among the general population (Marsden 1987). Networks linked to homophily have been present in studies concerning a variety of topics: work organization, integration into employment, transmission of information (Burt 1992, 2000, 2001; Campbell 1988; Lin et al. 1981a, b; Granovetter 1973; Requena 1990; Ibarra 1995); interconnected resources in the business world (Aldrich et al. 1996; Burt 1998); homogeneity and heterogeneity of links and networks (McPherson et al. 2001; Ibarra 1995; Ibarra and Andrews 1993); marital and partner relationships (Kalmijn 1998); friendship (Verbrugge 1983; Requena 1994a, b); contacts and mutual ties (Wellman 1996); co-presence in public places (Mayhew et al. 1995); multiple relationships (Fischer 1982); and the structure, typology and temporal evolution of networks as a result of shared activities, similarity of attributes of individuals and the time cycles of networks (Kossinets and Watts 2009).

In the context of this research on homophily based on social networks methodology, it is worth mentioning the studies that have sought to compare relational contents. According to Bidart (1991), studies of this type are not common. Allan (1979) focuses on friendship in itself by studying its rules of formation within a biographical time framework and at a micro-level, in the framework of their history and dynamics. Eisenstadt and Roniger (1984) compared friendship with other forms of relationships, and Davis and Todd (1982) with love relationships. The importance of the homophilic factors in relations of friendship (Verbrugge 1977; Coenen-Hunther 1989) and in the duration of relationships (Ferrand 1989, 1990) has also been studied.

2.4 The content and meaning of the proposal

In contrast to studies that compare different relational contents between homophilic individuals, our proposal seeks to compare elementary homophily with its intensity (specific homophily). Let us recall that elementary homophily pre-exists any level of intensity because it refers to mutual ties and relations between egos and alters. Specific homophily refers to the degree of intensity of the links that can be interpreted as the degree of proximity or trust. These two contents are therefore not comparable as, for example, economic and affective relationships are; rather the latter needs the former because it is based on it. The objective of the analysis is not so much to compare them but to examine how and to what extent specific homophily is distributed over elementary homophily. What matters in this goal is that once elementary homophily of ties and relations has been examined, the study analyses the extent to which this relationship can be developed in the sense of intensity or proximity between egos and alters. This perspective is not common in network analysis.

Following the above reasoning, the article focuses in particular on two goals, one methodological and one conceptual. The methodological objective addresses several aspects. First, it investigates the potential of personal or ego-centred networks for studying homophily and heterophily from a sample of egos, characterized by a set of social structure variables. Second, it examines the empirical possibilities of applying the two types of homophily, which though different are conditioned by their overlapping; and third, it shows that the homophilic effect provides criteria for clustering or recomposition of structural variables.

The conceptual objective, linked to the content of homophily, addresses two correlative aspects. First, it examines elementary homophily (and heterophily) and specific homophily (and heterophily) using three structural variables: employment status, age and city of residence. Second, it compares the results obtained in the two types of homophily for each of the three variables selected.

In relation to the above objectives it is important to note that elementary homophily is a necessary condition or ‘ground zero’ for specific homophily. Mutual knowledge relations are not necessarily consolidated in time, but may be instantaneous or almost instantaneous and unstable. Neither are they necessarily strong or completely instrumental. Therefore, the superposition or combination of specific relationships with elementary ones gives the relationship an added content that is more permanent, emotional, trustworthy and consistent.

3 The database and the analysis

The data come from a survey of personal networks with 441 cases (see Footnote 1). Non-proportional sampling of the autochthonous and immigrant population, balanced by age and gender quotas, was performed. The fieldwork was conducted in three cities (see Fig. 1) of Catalonia (northeast of Spain) that represent three different types of urban development: Barcelona, a large city; Balaguer, an agricultural town with some presence of industry and services; and Sant Feliu de Llobregat, a typical city of the metropolitan area of Barcelona, which has a high level of industrial and urban development and received a large number of immigrants from rural areas of Spain from the late 1950s to the early 1970s. The survey was conducted between late 2010 and early 2011. For more details of the survey and sample, see also Lozares et al. (2011).

Fig. 1 Situation in Spain and Catalonia of the cities in which the fieldwork of the survey was carried out

The survey includes 441 egos and 13,230 relationships. The information is given by the egos about themselves and about their contacts. The survey collected three types of data: on the classic socio-demographic characteristics for both the egos and their alters; on the relationships between each ego and 30 of its alters; and on the relationships between the alters of each ego, i.e. the network of relationships. The question used as a name generator was drafted as follows:

Please write a list of 30 people you know by name, and vice versa, who you have had contact with at least in the last two years by any means of communication and who you can contact again if necessary. Do not include people under 18 years of age. Anyone can be included. Try to include people who are close and important to you. Then you can include anyone who is not so close but you see quite a lot. Then you can extend your memory to others. It may help you to think about different groups of people in different places, family, friends, colleagues, neighbours, etc.

The analyses are limited to the association between variables representing groups of egos and the corresponding and identical variables for the alters. Elementary homophily (and heterophily), as noted above, is mediated by a generic relationship of mutual acquaintanceship. In specific homophily (and heterophily) the relationship between the two variables of egos and their alters is mediated by the intensity of the relationship, expressed on the scale Not at all close, Fairly close, Close, Very close and Intimate.

4 The analysis

4.1 The social structure and elementary homophily and heterophily of mutual ties and relations

Elementary homophily and heterophily are based on the ego-centred networks with a basic content of ‘reciprocal knowledge’ or acquaintanceship. Egos also provided information on the characteristics of their alters. This information is used to match the groups of egos with the identical groups of alters with regard to the attributes for which homophily and heterophily are calculated. The procedure consists of association through contingency tables.

The groups analyzed correspond to three structural variables of the sample population: employment status, age and town of residence. The criteria for choosing these three variables as an application or exemplification are the following: (i) current employment status is related to what is known as homophily linked to the environments or foci of relationship (Feld 1981), in this case occupational ones; (ii) age is a strong form of contextual homophily of a temporal nature; (iii) town of residence is also a strong spatial context. Table 1 summarizes the variables and their categories.

Table 1 Variables used and categories

Elementary homophily and heterophily are measured by the adjusted residuals obtained from the chi-square calculation on a contingency table, as shown in Table 2. The cells on the diagonal of the table, where ego and alter groups with the same attributes coincide, are interpreted as homophilic relations. Positive residual values indicate a higher concentration of relations than expected according to the row and column marginals; these marginals set the baseline for the homophily effects in the combination of egos and alters, i.e. homophily occurs if the groups coincide and heterophily if the groups are different. In contrast, when the residuals are negative the opposite occurs: fewer relations than expected are observed. When working with adjusted residuals, values higher than 1.96 or lower than -1.96 are significantly different from 0 at the 5% level, and those between these two values are not significant. Let us see the results arising from the interpretation of Tables 2, 3, 4, 5, 6 and 7 that follow.

Table 2 Adjusted residuals in the association between groups according to the current employment status of ego and alter
Table 3 Adjusted residuals in the association between groups by age of ego and alter
Table 4 Adjusted residuals in the association between groups by town of residence of ego and alter
Table 5 Adjusted residuals in the association between the intensity of the relationship between groups according to the current employment status of ego and alter
Table 6 Adjusted residuals in the association between the intensity of the relationship between the groups by age of ego and alter
Table 7 Adjusted residuals in the intensity of the relationship between the groups by city of residence
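The adjusted residuals described above can be sketched as follows. This is a minimal illustration in plain Python, assuming the usual definition of the adjusted (standardized) residual, (O - E) / sqrt(E (1 - r/n)(1 - c/n)), where O and E are the observed and expected counts and r and c are the row and column totals. The function name and the counts are hypothetical and are not taken from the survey; the sketch also assumes that no row or column holds all or none of the ties.

```python
from math import sqrt


def adjusted_residuals(table):
    """Adjusted standardized residuals for a two-way table of tie counts."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of ego and alter groups.
            expected = row_tot[i] * col_tot[j] / n
            variance = expected * (1 - row_tot[i] / n) * (1 - col_tot[j] / n)
            out_row.append((observed - expected) / sqrt(variance))
        result.append(out_row)
    return result


# Hypothetical tie counts: ego groups in rows, alter groups in columns.
ties = [
    [60, 20, 20],
    [25, 50, 25],
    [15, 30, 55],
]
residuals = adjusted_residuals(ties)

# Diagonal cells above 1.96 are read as significant homophily.
homophilic = [residuals[k][k] > 1.96 for k in range(len(ties))]
print(homophilic)
```

Off-diagonal cells are read in the same way: a value above 1.96 between two different groups would indicate heterophily, and a value below -1.96 fewer ties than expected.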

4.1.1 Groups by employment status

The level or degree of homophily between groups with different work situations is, in descending order, the following: retirees, pensioners and people with disabilities (3); people with full-time jobs and open-ended contracts (6); students with jobs (2); students without jobs (1); people who do home management and care work (4); unemployed people (5); people with part-time or temporary jobs (7); and finally, people with other (mainly irregular) situations (8).

Heterophilic relationships between various groups are also observed. The most intense ones are between the groups of students without jobs (1) and with jobs (2); between people with part-time or temporary jobs (7) and other situations (8); and between people with part-time or temporary jobs (7) and unemployed people (5). These are coherent forms of heterophily. Therefore, if one wished to restructure this variable by grouping its values (groups), this criterion could be used to do so. Grouping heterophilic groups would result in a homophilic group. The result of this reclassification would be: first, students, with and without jobs; second, retirees, pensioners and people with disabilities; third, groups who do home management and care work; fourth, unemployed people and people with part-time or temporary jobs; fifth, people with full-time jobs and open-ended contracts; and finally, other situations.
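The regrouping criterion just described can be sketched in code. The following is only one possible reading, assuming that two categories are merged when the adjusted residuals between them are significantly positive in both directions; the function name, the 1.96 threshold applied to both cells and the residual values are hypothetical, and the article does not prescribe this exact rule.

```python
def merge_heterophilic_groups(labels, residuals, threshold=1.96):
    """Merge categories linked by significant positive residuals both ways."""
    parent = list(range(len(labels)))

    def find(i):
        # Follow parent links to the representative of the cluster.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if residuals[i][j] > threshold and residuals[j][i] > threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i, label in enumerate(labels):
        clusters.setdefault(find(i), []).append(label)
    return list(clusters.values())


# Hypothetical residuals for four employment-status categories.
labels = ["student without job", "student with job",
          "unemployed", "part-time/temporary job"]
residuals = [
    [9.0, 4.0, -1.0, -2.0],
    [3.5, 7.0, -0.5, -1.0],
    [-1.0, -0.5, 5.0, 2.5],
    [-2.0, -1.0, 3.0, 4.0],
]
regrouped = merge_heterophilic_groups(labels, residuals)
print(regrouped)
```

With these illustrative values the two student categories form one cluster and the unemployed and part-time or temporary categories form another, which mirrors the kind of reclassification proposed in the text.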

4.1.2 Age groups

The global association with respect to homophily for all groups is high. In descending order, the groups with the highest homophily are the 18–24 group, followed by the 65 and over group, the 55–64 group and the 35–54 group.

Another remarkable effect with respect to the reduction or grouping of groups by relational links is that the 55–64 age group and the 65 and over age group can be grouped because of their homophily and heterophily. This cut-off effect at 55 years is important because it appears as a turning point.

4.1.3 Town of residence groups

A strikingly high level of contextual homophily is caused by town of residence. The three towns have a high homophily. Balaguer, a medium-sized town, has a much higher level of homophily than the two larger cities. On the other hand, the comparison between Barcelona and Sant Feliu shows that, despite their differences in population, urbanization and habitat type, their homophily levels are similar.

4.2 The social structure and specific homophily according to the relational intensity between egos and alters

In the previous sections we have analyzed elementary homophily, whose relational content, as stated above, is that of ‘reciprocal knowledge’ between the egos and their alters. The specific homophily differs from the elementary homophily in that the intensity of the relationship between egos and alters is considered. The information is obtained, as stated above, by asking the egos about the intensity of their relationship with their alters according to the scale Not at all close, Fairly close, Close, Very close and Intimate.

The analysis is carried out by means of a three-way contingency table that contains attributes of the egos, identical attributes of the alters, and the intensity of the relationship according to the scale. These tables (Tables 5, 6 and 7) are drawn up for each of the variables studied previously: employment status, age and town of residence. In this case, we try to analyze the intensity of the relationship only on the basis of homophily without taking into account heterophilic relationships. Therefore, only the intensity of the relationship among the alters that shared their attributes with those of the egos was taken into account, i.e. the intensity of homophilic relationships (see Footnote 2). Moreover, the relationships considered are only those presenting significant and high levels of homophily. In order to simplify the presentation of information, the analyses have been focused only on three values of the intensity scale of the relationship: the middle value Close and the two extreme values Not at all close and Intimate. This choice is justified by the results of the whole scale: the intensity of the middle relationship Close is that which establishes the greatest differences between the groups and the values Not at all close and Intimate are indicative of the extreme values of the scale.

The measurement indicator, as in the previous analyses, is the adjusted residual, which informs about the level of significance of the association.
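The procedure for specific homophily can be sketched as follows. This is one possible reading of the analysis and not necessarily the exact computation behind Tables 5, 6 and 7: ties are kept only when ego and alter belong to the same group, the retained ties are cross-tabulated by group and intensity, and adjusted residuals are computed on that table. The function names, the record layout (ego group, alter group, intensity) and the counts are hypothetical.

```python
from collections import Counter
from math import sqrt


def homophilic_intensity_table(dyads, groups, levels):
    """Count same-group ties by group (rows) and intensity level (columns)."""
    counts = Counter((ego, level) for ego, alter, level in dyads if ego == alter)
    return [[counts[(g, level)] for level in levels] for g in groups]


def adjusted_residuals(table):
    """Adjusted standardized residuals; NaN where the variance is zero."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            variance = expected * (1 - row_tot[i] / n) * (1 - col_tot[j] / n)
            value = (observed - expected) / sqrt(variance) if variance > 0 else float("nan")
            out_row.append(value)
        result.append(out_row)
    return result


# Hypothetical ego-alter ties: (ego group, alter group, intensity).
levels = ["Not at all close", "Close", "Intimate"]
dyads = (
    [("A", "A", "Not at all close")] * 10
    + [("A", "A", "Close")] * 30
    + [("A", "A", "Intimate")] * 10
    + [("A", "B", "Close")] * 5  # heterophilic ties are dropped
    + [("B", "B", "Not at all close")] * 10
    + [("B", "B", "Close")] * 10
    + [("B", "B", "Intimate")] * 30
)
table = homophilic_intensity_table(dyads, ["A", "B"], levels)
residuals = adjusted_residuals(table)
print(table)
```

In this illustrative table group A is over-represented among Close homophilic ties and group B among Intimate ones, while neither group differs from the other for Not at all close ties.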

4.2.1 Employment status groups

In terms of similarity, the contacts that are identified as Not at all close and just Close tend to be much more homophilous than the Intimate ones, thus confirming that closer relationships can accommodate more heterogeneity than superficial ones. In all three types of relationship, the highest values of adjusted residuals are for retirees, pensioners and people with disabilities (3) and people with full-time jobs and open-ended contracts (6), in other words, two groups with opposite situations. The lowest values of adjusted residuals in the Close relationships are for people with part-time or temporary jobs (7) and other situations (8). The intermediate values are for students with jobs (2) or without jobs (1), people who do home management and care work (4) and unemployed people (5). The lowest values in the Intimate relationships are for groups of students without jobs (1), people who do home management and care work (4), unemployed people (5) and employees with part-time or temporary jobs (7). The lowest values for Not at all close relationships are for students with jobs (2), people who do home management and care work (4), people with part-time or temporary jobs (7) and people in other situations (8) (Fig. 2).

Fig. 2 Adjusted residuals in the association between groups according to the current employment status of ego and alter: (1) student without job; (2) student with job; (3) retiree, pensioner, with disability; (4) home-management/care tasks; (5) unemployed; (6) full-time job and open-ended contract; (7) part-time/temporary job; (8) other situations

4.2.2 Age groups

The three degrees of intensity have very similar, almost parabolic forms for the five age groups. They increase in the youngest and oldest age groups and decrease in the middle ones. Furthermore, Intimate relationships have intermediate values for the significance indicators in comparison with Close relationships, which have higher values, and Not at all close relationships, which have lower values.

The trends for Close and Intimate relationships are equivalent for all age groups. Maximum values in both cases are in the relationships of the youngest group (18–24 years) and the oldest group (65 and over). They are followed by the 35–54 age group. Not at all close relationships are nearly uniform for all ages, except for the 18–24 age group, which shows higher values (Fig. 3).

Fig. 3 Adjusted residuals in the association between groups according to the age intervals of ego and alter

4.2.3 Groups according to town of residence

For city of residence, intermediate values for the significance indicators were also found in the Intimate relationships, higher values in Not at all close relationships and lower values in Close relationships. The behaviour of Close and Intimate relationships is in this case parallel (Fig. 4).

Fig. 4 Adjusted residuals in the association between groups according to the city of residence of ego and alter

The Balaguer group has the highest value for the three types of relationship. Sant Feliu has the lowest values for Close and Not at all close relationships and Barcelona has intermediate values for Not at all close and Close and lower values in Intimate relationships.

5 Conclusions

From the methodological point of view, three results have been obtained. First, ego-centred networks have been shown to be flexible and effective for the simultaneous analysis of homophily and heterophily using contingency tables between identical variables of egos and alters, using both global indicators for all groups of variables and specific indicators of their attributes. The variables of egos and alters were dealt with one by one, but their potential would also have allowed them to be processed in compositions of several variables in more condensed structures. Second, the most accurate indicators, such as adjusted residuals, allow different possibilities: a comparative processing of all groups; the removal of the effect of the marginals; the significance test given the distribution law; and the assignment of a value to the intensity of homophily or heterophily. Third, the calculation of the indicators of homophily and heterophily in all groups of one variable allows their values to be grouped from a relational perspective.

From the perspective of content, several conclusions are reached. First, from a generic point of view, what we have called elementary and specific homophily and heterophily have great conceptual potential. Elementary homophily and heterophily have been studied in the case of three basic variables that were also analyzed from the viewpoint of specific homophily superimposed on elementary homophily and, finally, comparisons were made between the two types.

Second, the various analyses of homophily and heterophily offer several substantive conclusions. For elementary homophily and heterophily, the employment status groups with the highest level of homophily are retirees, pensioners and people with disabilities, followed by people with full-time jobs and open-ended contracts. They are followed in descending order by students with and without jobs, people who do home management and care work, unemployed people and, finally, people with part-time and temporary jobs and other situations. Heterophilic relationships also arise between these groups. Examples are the relationships between students with and without jobs, and to a lesser extent the relationship between retirees, pensioners and people with disabilities and people who do home management and care work. Some associations are also observed between age groups. Although relationships within the category are predominant for all groups, heterophilic relationships occur between adjacent age categories. In age groups the global homophilic association is high. In descending order of homophily, the groups are the 18–24 age group, the 65 or over age group, the 55–64 age group and finally the 35–54 age group. In town of residence groups the contextual homophily is high.

Regarding specific homophily, for all groups, the association indicators in intimate relations of egos with their alters show intermediate values between the indicators of Close and Not at all close relationships. Moreover, in all cases, the association indicators in Close relationships have higher values and therefore the Not at all close relationships have lower values. Also, in virtually all cases of the variables and groups (with one exception: the behaviour of the associations of egos with their own alters of the same group), Close and Intimate relationships show a completely similar and comparable behaviour. Not at all close relationships are generically different.

The groups with high values in the Intimate and Close relationships are retirees, pensioners, people with disabilities, people with full-time jobs and open-ended contracts, the 18–24 age group, the 65 years and over age group and Balaguer. By contrast, in the Not at all close relationship, students without jobs, the 18–24 age group and Balaguer stand out.

With respect to the correspondences between elementary and specific homophily and heterophily, the groups with the highest incidence of homophily are retirees, pensioners and people with disabilities; people with full-time jobs and open-ended contracts, the 18–24 age group, the 65 years and over age group, and Balaguer. These are precisely the groups with the highest values in Intimate and Close relationships.