INTRODUCTION

One key element of measurement theory is the relationship between the selection of appropriate samples and the verity of empirical results (Gray & Cooper, 2010). International business scholars have been specifically urged to “avoid incautious comparisons based on small samples of countries, which may lead to premature and incorrect conclusions about relationships between aggregate variables” (Franke & Richey, 2010: 1276). In this paper we examine the relationship between a particular type of small sample, single-country samples (a single home country with varying host countries or a single host country with varying home countries), and the verity of results when using the Kogut and Singh (1988) measure of cultural distance. We ask and answer the question: are single-country samples appropriate for cultural distance research?

Cultural distance measurement theory deals with three different elements: (1) the national culture dimension measures used, for example, Hofstede (1980) or GLOBE (House, Hanges, Javidan, Dorfman, & Gupta, 2004); (2) how the cultural distance construct is measured, for example, Kogut and Singh (1988); and (3) the characteristics of the sample that is used. Prior research has examined the first two elements: the pros and cons of various national culture measures (Berry, Guillén, & Zhou, 2010; Shenkar, 2001) and the advantages and disadvantages of cultural distance measures (Lu, 2006; Shenkar, 2012; Zaheer, Schomaker, & Nachum, 2012). Unfortunately, cultural distance measurement theory has paid little attention to the third element, matching the appropriate sample to the statistic used to measure cultural distance.

This issue of sampling practices is particularly important when difference scores are used to operationalize cultural distance, as is true in the case of the most commonly used measure of cultural distance (Kogut & Singh, 1988), a statistic based on an average of the difference scores for a set of national cultural dimensions. A difference score measures the unsigned distance between two constructs. Prior literature suggests potential problems with the use of difference scores (Edwards, 2001; Johns, 1981; Venkatraman, 1989). Here we focus on one specific problem commonly associated with difference score-based measures, the confounded variables problem, variables “whose effect cannot be separated from the supposed independent variable” (White & McBurney, 2013: 122). We propose and show that for single-country samples in cultural distance studies this problem is very common.
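For reference, the Kogut and Singh (1988) index is commonly written in the form below. The symbols are notation assumed here for illustration: $I_{ij}$ is the score of varying country $j$ on dimension $i$, $I_{ir}$ is the score of the reference (single home or host) country $r$, $V_i$ is the variance of dimension $i$ across countries, and $n$ is the number of dimensions.

```latex
CD_{j} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left(I_{ij} - I_{ir}\right)^{2}}{V_{i}}
\qquad\text{with}\qquad
\left(I_{ij} - I_{ir}\right)^{2} = I_{ij}^{2} - 2\,I_{ir}\,I_{ij} + I_{ir}^{2}
```

When the reference country $r$ is held fixed, every $I_{ir}$ is a constant, so $CD_j$ reduces to a quadratic function of country $j$'s own dimension scores. This is one way to see why, in a single-country sample, distance can be hard to separate from the varying countries' national culture.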

Ours is not the first study to warn against the use of single-country samples when conducting cultural distance research. Estrin, Baghdasaryan, and Meyer (2009) specifically caution against using single-country samples in cultural distance studies, asserting that such samples cannot distinguish between country-level and distance effects. Building on their assertion, we theorize and empirically show that in single-country samples difference score-based cultural distance measures (the independent variable) can be highly correlated with one or more of the varying countries’ underlying national culture dimensions (the confounded variables), creating the confounded variables problem.

This issue is important because the vast majority of empirical cultural distance studies rely on samples that contain only a single home or host country. For instance, of the 388 articles that operationalize cultural distance using difference scores and were published in the 13 top management and international business journals in 1988–2012, over 80% used samples with only a single home (199) or single host (113) country, which indicates how extensive the problem is (see Footnote 1). As we demonstrate, in such single-country samples it is impossible to determine whether the results are driven by cultural distance from the single country or by the national culture of the varying countries (the home countries when the single country is a host country, or vice versa).

Not knowing whether distance or national culture is the driver creates difficulties in assessing the verity of results in single-country samples. Put simply, we have no way of knowing whether a statistically significant finding for a cultural distance construct based on a single home/host country sample is truly measuring the underlying structure of the data. Here we attempt to remedy this problem. Using the six Hofstede dimensional measures (Hofstede, Hofstede, & Minkov, 2010) and Kogut and Singh’s (1988) method for calculating distance (a difference score method) we develop and test a method that allows scholars to determine the appropriate minimum sample required to obtain valid and reliable empirical results in cultural distance studies.

THE CONFOUNDED VARIABLES PROBLEM

The confounded variables problem in single-country sample studies clouds our ability to accurately interpret cultural distance results. To illustrate this point we created two contrasting controlled statistical experiments using standard hierarchical linear regression (HLR) for each of the seven countries most frequently used in cultural distance research.

The statistical experiments use simulated data sets; simulation allows us to predetermine the nature of specific underlying relationships in the data. For example, we predetermine the headquarters and subsidiary locations and performance of the “subsidiaries” in our simulated samples. These simulated subsidiaries serve to create a complete cross-section of the seven most frequently used countries in cultural distance research. By doing this we can create samples where we a priori know the actual underlying relationships between firm size, firm experience, national culture, cultural distance, and performance. The created data set was not designed to present a fully specified explanation of the drivers of subsidiary performance. Rather, creating such samples allows us to see whether the confounded variables problem associated with the use of the Kogut and Singh (1988) difference equation measures in single-country samples can interfere with the ability to detect actual underlying relationships. When we run an HLR analysis and include cultural distance in the simulated data set we should find the predetermined underlying relationships. Finding something other than the predetermined relationships shows there is a problem using the difference equation measure on the sample.

The sample for our statistical experiments was constructed using two different sets of countries. The seven countries most frequently used in single-country samples were chosen as the reference country set, while all 62 remaining countries with available Hofstede national culture measures were chosen as the varying country set. The seven countries were used singly and in pairs for our single-country and two-country samples. Using 62 varying countries allowed us to include all countries previously used in cultural distance studies. For our single-country studies, two simulated subsidiaries in each of the 62 varying countries create a sample (n = 124) that exceeds the minimum sample size requirement (114 firms) for medium effect sizes (α = 0.05) with power = 0.80.

In the first experiment the sample is constructed such that the dependent variable (simulated subsidiary performance) is driven by cultural distance, and none of the individual national culture dimensions is a significant predictor. In the second experiment the data set was configured such that subsidiary performance is driven by only two of the Hofstede national culture attributes: masculinity and uncertainty avoidance. In the predetermined underlying data structure for the second scenario, cultural distance and the other national culture attributes are not significant predictors of the dependent variable.

In both experiments we calculate cultural distance by using the Kogut and Singh (1988) method in conjunction with the six Hofstede national culture dimensions (Hofstede et al., 2010). If there is no confounded variables problem, given the predetermined simulated relationships in the first experiment, only cultural distance will be significant. Similarly, if there is no confounded variables problem, in the second experiment only two national culture attributes, masculinity and uncertainty avoidance, will be significant.
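The mechanism behind the confounded variables problem can be sketched in a few lines of Python. This is not the simulation or the HLR analysis reported here: the dimension scores are synthetic random numbers rather than Hofstede values, and the `reference` country, the function names, and the seed are hypothetical choices made for illustration. The sketch only checks how strongly a Kogut and Singh style distance from one fixed reference country correlates with a single underlying dimension of the varying countries.

```python
import random

def kogut_singh(ref, other, variances):
    """Mean of variance-scaled squared differences across dimensions."""
    return sum((o - r) ** 2 / v for r, o, v in zip(ref, other, variances)) / len(ref)

def sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable
N_DIMS, N_COUNTRIES = 6, 62

# Synthetic scores on a 0-100 scale (an assumption, NOT real Hofstede data).
countries = [[rng.uniform(0, 100) for _ in range(N_DIMS)] for _ in range(N_COUNTRIES)]
variances = [sample_variance([c[d] for c in countries]) for d in range(N_DIMS)]

# Hypothetical single reference country: extreme on dimension 0, mid-range elsewhere.
reference = [95.0] + [50.0] * (N_DIMS - 1)

distances = [kogut_singh(reference, c, variances) for c in countries]
dim0_scores = [c[0] for c in countries]

# Correlation between the distance measure and the varying countries' dimension 0.
r_single = pearson(distances, dim0_scores)
print(f"corr(distance, dimension 0) with one reference country: {r_single:.2f}")
```

Because the reference country sits near the top of dimension 0, countries that score low on that dimension are also the most distant, so the distance measure largely tracks the dimension itself. A regression on such a sample has little basis for telling the two apart.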

Table 1 provides the results for the case of a single-country cultural distance study using Japan as an illustrative example, although similar results occur for all seven countries. In both simulations cultural distance and several national culture measures are found to be statistically significant even when they are not the predetermined underlying cause. This example illustrates the confounded variables problem and shows how in a single-country sample the results may not reflect what is actually driving the relationship.

Table 1 Illustrating the single-country confounded variable problem for Japan

Table 2 provides a summary of the same cultural distance analysis for each of the seven most frequently used single-country samples. For Experiment 1 cultural distance is the predetermined driver of subsidiary performance; for Experiment 2 masculinity and uncertainty avoidance are the predetermined drivers. In Experiment 1 cultural distance is always significant, but for two countries (China and Japan) some of the individual national culture dimensions were also significant, which is an erroneous finding. In Experiment 2 three countries (China, the Netherlands and the United Kingdom) lacked significance for masculinity or uncertainty avoidance (the true drivers). For China, other national culture dimensions were erroneously found significant. Cultural distance was always significant although it is not a predetermined driver, creating false positives for all seven countries. In total, nine of the 14 possible outcomes (two in Experiment 1 and all seven in Experiment 2) for the seven most commonly used countries are erroneous, casting doubt on the reliability and validity of single-country samples.

Table 2 Summary testing the single-country cultural distance problem for each frequently used country

DEALING WITH CONFOUNDED VARIABLES

How can researchers prevent the confounded variables problem from compromising their cultural distance findings? Although a few methods have been proposed to deal with problems arising from the use of difference scores (Franke, Hill, Ramsey, & Richey, 2011; Tisak & Smith, 1994), the dominant approach is offered by Edwards (2001). He suggests using a polynomial regression and response surface method as an alternative to difference scores. Although the method has been used in simple person–environment fit contexts (for example, Edwards & Parry, 1993), it is very difficult to apply to multidimensional cultural distance studies for two reasons. First, the recommended response surface analysis for the multidimensional cultural distance construct requires that the response surface be plotted in nine-dimensional space (for the four Hofstede dimensions) or in 13-dimensional space (for the six Hofstede dimensions), that is, one axis for each home and each host dimension value plus one for the dependent variable. This cannot be done.

Second, while the polynomial regression can be done, for cultural distance it is extremely difficult to do and hard to interpret. The five required independent variables for each national culture dimension for such an analysis include: (1) the home and host country dimension values; (2) the interaction between the home and host country dimension values; and (3) the squared home and host country dimension values.

Conducting this polynomial regression analysis using four Hofstede dimensions requires 20 independent variables; conducting this analysis when using six Hofstede dimensions requires 30 independent variables. This method for confirming a hypothesis that a dependent variable is negatively related to cultural distance requires that for the coefficients of each cultural dimension: (1) the coefficients for the direct effects of the home and host country values be approximately equal to zero; (2) the coefficients for the squared terms of the home and host country values be negative and approximately equal; and (3) the coefficient for the interactive term between the home and host country values be positive and with a value twice that of the absolute value of the coefficients for the squared terms (see Footnote 2). These relationships between the values of the coefficients must be tested based on the estimated value and standard error for each coefficient using a t-test. Thus, applying the recommended polynomial regression technique to cultural distance studies is very cumbersome at best.
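The three coefficient conditions follow from expanding a squared difference. The short derivation below covers one cultural dimension and uses notation assumed here for illustration: $H$ and $S$ are the home and host country values, $Y$ is the dependent variable, and $c > 0$ is the strength of the negative distance effect.

```latex
% Unconstrained polynomial regression for one dimension (five predictors):
Y = b_0 + b_1 H + b_2 S + b_3 H^{2} + b_4 H S + b_5 S^{2} + \varepsilon

% A pure (negative) squared-difference effect:
Y = b_0 - c\,(H - S)^{2} + \varepsilon
  = b_0 - c\,H^{2} + 2c\,H S - c\,S^{2} + \varepsilon

% Matching terms gives the constraints to be tested:
b_1 = b_2 = 0, \qquad b_3 = b_5 = -c, \qquad b_4 = 2c = 2\,\lvert b_3 \rvert
```

With five predictors per dimension, four dimensions give $4 \times 5 = 20$ independent variables and six dimensions give $6 \times 5 = 30$, and the constraints have to be checked separately for every dimension.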

THE TWO-COUNTRY SOLUTION

As an alternative, we offer a method that resolves the confounded variables problem in cultural distance studies and is much easier to understand and implement. We propose using a sample with two home/host countries and a variety of host/home countries. The underlying logic is that as long as the two countries’ dimensional values somewhat offset each other (which is the case with almost every pair of the seven countries), their combined distance score will be uncorrelated with the varying countries’ underlying national culture dimensions. This eliminates the confounded variables problem and gives scholars a broad range of pairs to choose among.

We illustrate this two-country solution through an identical set of controlled statistical experiments. We tested each pair of the seven most frequently used countries, resulting in 21 pairs. Each headquarters company in the pair of reference countries has one subsidiary in each of the 62 varying country locations. We again applied standard HLR with a sample size of n = 124 to each of the 42 statistical experiments (two experiments for each of the 21 pairs). Again, in the first case simulated subsidiary performance is driven by cultural distance. In the second case it is driven by the masculinity and uncertainty avoidance national culture dimensions of the varying countries.
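The offsetting logic can also be sketched in Python. As before, this is an illustration under stated assumptions rather than the experiments summarized in the tables: the scores are synthetic, and the two reference countries `ref_a` and `ref_b` are hypothetical profiles chosen to sit at opposite ends of one dimension. The sketch pools one observation per reference country for each of 62 varying countries (124 in total) and compares the pooled correlation with the two single-country correlations.

```python
import random

def kogut_singh(ref, other, variances):
    """Mean of variance-scaled squared differences across dimensions."""
    return sum((o - r) ** 2 / v for r, o, v in zip(ref, other, variances)) / len(ref)

def sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable
N_DIMS, N_COUNTRIES = 6, 62

# Synthetic scores on a 0-100 scale (an assumption, NOT real Hofstede data).
countries = [[rng.uniform(0, 100) for _ in range(N_DIMS)] for _ in range(N_COUNTRIES)]
variances = [sample_variance([c[d] for c in countries]) for d in range(N_DIMS)]
dim0_scores = [c[0] for c in countries]

# Two hypothetical reference countries that offset each other on dimension 0.
ref_a = [95.0] + [50.0] * (N_DIMS - 1)
ref_b = [5.0] + [50.0] * (N_DIMS - 1)

dist_a = [kogut_singh(ref_a, c, variances) for c in countries]
dist_b = [kogut_singh(ref_b, c, variances) for c in countries]

# Single-country correlations, one per reference country.
r_a = pearson(dist_a, dim0_scores)
r_b = pearson(dist_b, dim0_scores)

# Two-country sample: stack both sets of observations (2 x 62 = 124 rows).
pooled_dist = dist_a + dist_b
pooled_dim0 = dim0_scores + dim0_scores
r_pooled = pearson(pooled_dist, pooled_dim0)

print(f"reference A only: {r_a:.2f}")
print(f"reference B only: {r_b:.2f}")
print(f"pooled two-country sample: {r_pooled:.2f}")
```

In this sketch the two single-country correlations have opposite signs, and stacking the samples largely cancels them. When the two reference profiles are nearly identical instead, the cancellation does not occur, which is the situation described for the United States and the United Kingdom.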

Table 3 provides the results for the case of a two-country cultural distance study using the United States and South Korea. As indicated in Table 3, the HLR provides the correct empirical results for each of the two cases. In the case where the predetermined driver of the dependent variable is cultural distance (Models 1 through 3), only the control variables and the cultural distance coefficients are significant; none of the national culture dimension coefficients are significant. In the case where the predetermined drivers of the dependent variable are masculinity and uncertainty avoidance (Models 4 through 6), only the coefficients for the control variables, masculinity, and uncertainty avoidance are significant; none of the other variables (cultural distance or the other national culture dimensions) are significant. Thus, unlike the incorrect HLR results from the single-country cultural distance studies, the HLR analysis provides the correct empirical results when using the United States and South Korea as the two reference countries.

Table 3 Confirming the two-country cultural distance solution through an illustrative example

Table 4 summarizes the results of Experiment 1 (cultural distance is the predetermined driver) and Experiment 2 (masculinity and uncertainty avoidance are the predetermined drivers) for each two-country combination of the seven most frequently used countries. For Experiment 1 each combination confirms cultural distance as the driver while not finding significance for national culture. In Experiment 2 each country pairing except one (the United States and the United Kingdom) confirms masculinity and uncertainty avoidance as the drivers while not finding significance for cultural distance or the other national culture dimensions.

Table 4 Summary testing the two-country cultural distance solution for each combination of the seven frequently used countries

Figure 1 shows national culture dimensional values for the United States and the United Kingdom. The values for five of the six dimensions are almost identical, lacking the offsets needed to produce a new distance measure that is uncorrelated with one or more of the varying countries’ national culture dimensions. For that reason this two-country sample fails to eliminate the confounded variables problem. This is the only pairing of the seven most frequently used countries that does not eliminate the confounded variables problem, leaving researchers a wide range of two-country choices.

Figure 1 National culture dimensions for the United States and the United Kingdom.

CONCLUSIONS, IMPLICATIONS, AND FUTURE EFFORTS

This study explains and empirically shows the value of creating appropriate samples when doing cultural distance research. We demonstrate that single home/host country cultural distance samples are highly susceptible to the confounded variables problem; we cannot know with certainty whether it is cultural distance or national culture that is driving the results. This problem renders the verity of conclusions for such studies suspect and means that single home/host country samples should not be used when conducting cultural distance research.

This is a major problem for cultural distance research because over 80% of the empirical research in this area relies on single-country samples. Our intent in highlighting this problematic practice is not to criticize or embarrass particular studies. Rather, we hope that by raising awareness of the confounded variables problem and offering a viable solution, we will encourage future cultural distance scholarship to stop using single-country samples. In doing this we create a good news, bad news scenario.

On the positive side, each of the seven most frequently used single countries has many available country pairings that create an acceptable sample for cultural distance research. Specifically, as we showed, almost any second country added to the cultural distance sample solves the confounded variables problem. Conducting these analyses prior to collecting firm-level data gives us confidence that results can be accurately and confidently attributed to cultural distance rather than to home/host national culture effects.

Alternatively, for studies that are primarily interested in investigating differences between country A and country B, scholars may prefer to use country-level theories and measures rather than cultural distance theories and measures to strengthen theoretical–empirical alignment. For those that are interested in both country and difference effects the careful selection of appropriate samples is vital in order to remove the confounded variables problem haunting so many prior studies.

On the negative side, our study raises questions regarding cultural distance conclusions that are based on single-country samples. Put simply, we cannot know whether the conclusions drawn from prior cultural distance studies conducted with single-country samples were correctly or incorrectly attributed to cultural distance. Such prior studies may need to be replicated using appropriate samples in order to eliminate the confounded variables issues.

We used Hofstede’s cultural measures in our illustrations. Future scholarship may wish to see if the same confounded variables problem occurs in single-country samples using other measures of cultural distance such as the GLOBE dimensions (House et al., 2004). Scholars may also examine the degree to which institutional distance (Kostova, 1999), psychic distance (Johanson & Vahlne, 1977), and the recently developed Mahalanobis cross-national distance measure (Berry et al., 2010) use difference scores in their calculations; they also may be susceptible to the confounded variables problem associated with single-country samples.

Cultural distance is clearly an integral construct in international business research. Because of this, our study urges caution and forethought in the creation of appropriate samples that are free from confounded variables problems when conducting cultural distance research. By doing so, the international business research community advances best practices and ensures that when cultural distance is invoked it is really distance that is being examined.