1 Introduction

Income is regarded as one of the clearest indicators of socioeconomic status in the developedFootnote 1 world and is highly correlated with outcomes across most life course domains including longevity, education, health, crime, family life, and happiness (for example Easterlin 2001; Hansen and Machin 2002; Lynch et al. 2000; Mayer 1997; Wilkinson 1992). Classical definitions linked income with consumption, so that income was defined as ‘the maximum value a man can consume during the week and still be as well off at the end of the week as he was at the beginning’ (Hicks 1939, p 172). The proliferation of consumer credit since this definition was coined may alter this meaning somewhat. Nevertheless, collecting information on the total monies entering a household or an individual’s budget is viewed as an important part of establishing socioeconomic status. For children, establishing household income can give a strong indication of the family’s potential purchasing power in terms of providing food, shelter, clothing, educational resources and other essential and non-essential items that will influence a child’s development.

Despite its importance, there remain issues as to the accuracy of income collected as part of surveys. These include the sensitivity of asking about income; whether weekly, monthly or annual income is reported; respondent knowledge of different types of income (gross compared to net for instance); and the more fundamental issue of whether respondents actually know their income. In addition, there is no consensus as to the best way to minimize these problems and therefore no agreement as to the best way to collect income data in surveys. Some surveys ask a single question, others use a number of more detailed questions; some ask for exact amounts, others for a banded range. These differences all lead to potential error in the data (Micklewright and Schnepf 2010; Moore et al. 2000).

This paper makes a unique contribution to the debate by examining whether different methods of collecting income within a single survey produce consistent estimates of income. This is possible within the Millennium Cohort Study (MCS), a UK nationally representative study described later in the paper, due to the different ways in which income is collected from the same respondents. We compare differences in how the same individual reports income in the most recent sweep of data collection (at age 7) when asked as a single question compared to multiple questions. We then look at whether the pattern varies across different sub-groups of the population, or by the period over which incomes are reported, for example weekly, monthly etc. The paper finally examines the implications any differences may have on the subsequent analysis of that data—in particular in an area of key substantive interest—the measurement of child poverty. We begin by examining known issues in the measurement of income.

1.1 Respondent Issues

The fact that people are said to be more reluctant to discuss their income than they are their sexual behavior (Gordon 1998) is testament to the sensitivity of the subject matter, although acceptability is just one aspect of respondent issues. In piloting questions on income for the 2001 UK census, Collins and White (1996) encountered respondents who were unsure as to the definition of household or family income (whether it represented earnings alone or total income); unsure of the meaning of net income (whether it referred to earnings after taxes or earnings after taxes and regular bills (disposable income)); and could not, in several cases, report on the partner’s income and thereby the family or household income (see also Atkinson and Micklewright 1983; Gordon 1998; Micklewright and Schnepf 2010).

Recall and reconciliation are other areas in which the respondent may introduce bias into the data (Atkinson and Micklewright 1983). The importance of recall in surveys is affected by the relevant measurement unit (Clarke et al. 2008); in the case of income it is usually annual income that is taken as a gold-standard (Becker et al. 2003).Footnote 2 This is generally not problematic for many salaried employees. However, for a large section of the population whose incomes come from a range of sources, recalling each source may be difficult. For those who are in short-term or seasonal employment, those reliant on several sources of income, or who otherwise have inconsistent forms of income, reporting an annual total figure can be problematic (Schrapler 2006). Even for salaried employees, providing both net and gross income estimates may be challenging. This moves from an issue of recall to an issue of reconciliation in reporting income. In the case of respondents who recently changed employment and income, providing one sum to reflect income during the period of change further compounds any possible existing difficulties in providing accurate information. For all respondents, regardless of earnings instability, providing a figure for annual income is a complex process involving issues of interpretation, understanding, retrieval, judgment, calculation, conversion and finally, communication (Collins and White 1996).

1.2 Measurement Issues: To Ask a Single Question?

There is no consensus as to the best way to collect income data in surveys. Large scale surveys in the UK collect information on income in a variety of ways—the majority either by single questions or by a set of detailed questions. Table 1 shows the different ways income is collected in a number of key UK studies. Some of this lack of consensus in measurement between surveys is likely due to the fact that surveys collect income data for different purposes (Davern et al. 2005; Micklewright and Schnepf 2010). However, even in multipurpose and multidisciplinary studies, such as the MCS, where income is not collected for a specific purpose, there is no agreement as to the best method.

Table 1 Collection of income data in major UK studies

In reality single question and multiple question strategies both have their merits and shortcomings. Limiting a respondent to answering a single question on income can introduce difficulties in recall and reconciliation as discussed above. These difficulties are likely to be encountered differentially amongst various groups in the population resulting in differential item non-response across sub-populations. When coupled with more basic problems of study coverage and representativeness (Becker et al. 2003; Francesconi et al. 2009), this may lead to serious flaws in estimates.

In asking a single question, there are also considerations to be made as to whether respondents should give a total figure or select a band. Banded data may improve response but may limit detail, although the loss of detail may not be so severe as to affect the data quality for the majority of the population (Micklewright and Schnepf 2010).Footnote 3 But banded data can also introduce respondent bias. In pilot studies reported in Collins and White (1996), respondents were found to select the band below the one in which their actual income lay, because they were concerned that selecting the correct band would imply that they had an income close to the top limit of that band.

There are also issues related to the loss of detail encountered when using a single question even if not collected in bands, particularly if researchers are interested in the contribution different components of income make to the overall total. However, while providing additional detail, asking multiple questions on income can introduce new problems of unmanageable questionnaire length and heavy respondent burden (Micklewright and Schnepf 2010), in addition to vastly increased survey costs. In large scale longitudinal surveys, the implications of heavy respondent burdens can have a twofold effect through unit missingness as well as attrition in subsequent sweeps (Burchell and Marsh 1992). Moreover, many sections of detailed questions on income may be irrelevant for some respondents; questions on state benefits are likely to be irrelevant for the most advantaged while questions on dividends from stocks, shares and investments are likely to be irrelevant for the least advantaged respondents, for example. However, restricting these questions to certain respondents only could introduce further inaccuracies.

In addition, detailed questions on income may compound issues of acceptability in responding to these questions; while some respondents may be comfortable in providing a total figure for income, they may not be as comfortable providing details of the origin of each component of income. Not all researchers agree; some state unequivocally that asking a number of detailed income questions provides a more accurate response than a single income question. Of this persuasion, Davern et al. (2005) argue that the individual components allow policy makers to better understand the dynamics of poverty and income in determining eligibility for intervention programs. This debate remains unresolved in the literature at present.

1.3 Other Measurement Issues: What to Count as Household Income and when to Count it?

The Canberra Group, an international expert group concerned by inconsistencies in the measurement of income, defined the essential elements included in a measure of household income as: (1) income from employment, (2) income from self-employment, (3) property income (including from stocks and shares), (4) income from social insurance benefits, (5) deductions from income (social insurance), and (6) social transfers in kind (for example government provided education and healthcare services) (The Canberra Group 2001).Footnote 4

While suitable for standardizing the measurement of income for international comparison, in reality, such a framework is difficult to implement in the UK setting for a nationally representative population survey. Even if we limit the focus to the first four components alone, difficulties arise. In the UK context, measuring income from state benefits (which would include means tested benefits) directly from respondents is difficult as respondents may be unaware of the total benefits they receive. For example, housing benefit and council tax benefits (means-tested benefits that cover housing rents for those on no/low incomes) may constitute a large proportion of household income, but may be unknown to the respondent as they are often paid directly to landlords. This measure is also removed from the element of consumption, to which classical definitions of income, referred to earlier, are implicitly linked.

Seasonality in employment or in income (for example through performance related bonuses) can mean that results are dependent on when questions are asked. In analyses comparing results from questions on annual income and current income in the British Household Panel Survey, Boheim and Jenkins (2006) found small differences in the distributions, which did not alter the relative position of groups. However, within some groups, the differences were more pronounced with annual income reports producing lower values when harmonized with current values, notably in cases where the head of household was unemployed (Boheim and Jenkins 2006). Furthermore, it is easy to speculate that those in seasonal or unstable employment may also be those who have most difficulty in providing a single figure for annual income (as discussed earlier), and known correlates of irregular earnings include having low qualifications and being self-employed (Drewinka 2010).

The period of payment used to collect the information on income also has the potential to cause problems for a wide range of respondentsFootnote 5 who may prefer to report income in the period in which they receive payment, as opposed to the period dictated by the researcher. This will then vary by sub-population, as those paid weekly will answer in weekly amounts, those monthly in monthly amounts. As the low paid are more likely to be paid weekly we will see systematic differences across different groups of the population. Moreover, respondents may prefer to report the figures they are most familiar with; often this means reporting gross annual income as an annual total but their net income on a monthly basis.

Each component of income will have substantial variation in terms of the response rate and the reliability. Hawkes and Plewis (2008) found that questions on income from self-employment had particularly low response rates in the MCSFootnote 6 and Moore et al. (2000) found that estimates of survey-collected self-employment income were substantially lower than independently verified self-employment income. Wages and salaries data collected in surveys generally have the highest levels of consistency with externally verified sources (Biancotti et al. 2008; Moore et al. 2000), while income from dividends and interest from investments have the lowest, with estimates being approximately half of the actual total (Moore et al. 2000). Income from pensions is found to have high levels of reliability in some studies (Biancotti et al. 2008), but others argue that in a number of cases income from pensions tends to be overstated (Moore et al. 2000).

This paper builds upon the literature and offers a unique contribution to the income debate by comparing differences in how the same MCS respondent reports income in the most recent sweep of data collection (at age 7) when it is asked as a single question as opposed to multiple questions. We then examine the implications that different methods of collecting and reporting income have for data collected from different groups within the population. We also briefly discuss the potential issue of measurement error in the collection of income data.

More specifically, based on the evidence presented above, we ask:

  • What differences can be observed in average annual income from using single question banded income data versus multiple questions?

    • Does the pattern change across different sub-groups of the population?

    • How does the pattern change by unit of measurement, for example weekly, monthly etc.?

  • What effect does using a banded single question versus detailed questions have on the analysis of a key area of substantive interest—the estimation of child poverty in the survey?

2 Data

The data used in this paper are from the most recent sweep of the UK Millennium Cohort Study (MCS), collected from families when children were aged 7.Footnote 7 This study recruited families of children born in 2000/1 in randomly selected electoral wards, disproportionately stratified to ensure adequate representation of children from disadvantaged and ethnic minority families. Information has been collected at 9 months, 3, 5 and 7 years, with the next sweep of data collection due when the children are aged 11 years. Initially, over 19,000 households were recruited into the study; by age 7 the number of participating families had dropped to 13,800. In the most recent sweeps, information on family income was collected through both a battery of detailed questions and a single question.Footnote 8, Footnote 9

2.1 Detailed Income Questions

At the age 7 survey detailed income questions collect information on earnings and income from benefits as well as other sources. Each question typically asked first whether the respondent received a particular component of income before asking about the amount, whereupon respondents would report an exact amount as opposed to selecting a value from a set of bands. Respondents were then asked to confirm the unit in which they reported the component of income from a set range of options (weekly, monthly etc.), although this may not cover all arrangements. In an effort to improve the response rate for individual components of income, respondents who didn’t give an initial answer to the amount of income were given a number of follow-up options known as ‘unfolding brackets’. Respondents who were unable, or refused, to give an exact answer were asked a series of follow up questions designed to elicit a minimum and maximum number, defining a range within which the value lies (Centre for Longitudinal Studies 2009).

While income from earnings and benefits was collected in detail, income from other sources was collected less comprehensively, and only from main respondents, who could give only a monthly amount. These sources included income from investments, studentships and cash from parents.Footnote 10

2.2 Single Question

In addition to the detailed questions at age 7 each main respondent (who is usually the mother) was presented with a card that showed incomes in weekly, monthly and annual amounts. They were asked ‘Which of the groups on this card represents you [^and your husband/wife]’s total take-home income from all these sources and earnings, after tax and other deductions’. Respondents were given a choice of nineteen brackets (see Table 2), although the question was not asked of partner respondents (usually fathers) and therefore assumed that a mother was aware of both her own and, where applicable, her partner’s income. The question asked about income after deductions, and while respondents were given a choice to report a total income in weekly, monthly or annual amounts, they were nevertheless expected to know and report different components of their income in the same periods of receipt.Footnote 11 At the age 7 sweep in 2008, the single question achieved a response rate of 89%, which was substantially higher than the 58% achieved for income from the multiple questions.Footnote 12

Table 2 Income bands used in single question and components of income included from multiple questions

The single question followed the detailed questions on income. It can therefore be expected that the multiple questions served as a prompt for what should go into the report of net family income. However, this was not made explicit. For example receipt of housing benefit was included in the detailed questions but not the amount. It is expected that housing benefits will also be excluded from the answers to the single question, although respondents were not explicitly reminded to exclude housing benefits from their total calculation in responding to the single question. Show cards were shown to respondents with the intervals displayed in weekly, monthly or annual amounts, although no record exists of which period respondents chose to report their answers.

3 Results

3.1 What is the Impact of Using Single Question Banded Income Data Versus Multiple Questions for Measuring Income?

To examine our first research question, we compare the descriptive statistics from five different calculations of income based on data handling conventions (detailed below) using information from both the detailed set of questions and the single income question.

  1.

    Income as collected with the single question in banded groups (calculation 1). Main respondents were asked to report a single band for total net household income (after taxes and other duties) from a choice of twenty bands (see Table 2). Different bands were used for couple and lone parent households. Respondents could choose bracket values that corresponded to the household income in weekly, monthly or annual multiples, although for consistency with our other definitions, we examine annual amounts of income. Income brackets were narrower towards the bottom of the income distribution compared to the top; the top bracket was open at £100,000 or more per annum. Respondents were also given the choice of don’t know or could refuse to answer—around a quarter of those without a valid response represented refusals.

  2.

    Income as collected with the single question in banded groups, treated as continuous by using the mid-point of the interval as the value (calculation 2).Footnote 13 This calculation of income uses the same information as above, although we apply mean substitution for the band intervals, taking the mid-point of each interval as the value for every respondent in that band.

  3.

    Income collected from multiple questions, and including the income of any respondent with a valid answer to any of the component questions as having a valid income response (calculation 3). This calculation of income represents an aggregate sum from all components of income as reported, with sums multiplied to an annual amount. Respondents who gave a valid response to any component of income were included in this calculation of income—this includes respondents giving valid amounts for income from employment, self-employment, state benefits, dividends, among other sources (see Table 2). This method of calculating income includes reports from partners on income where appropriate. No checks are imposed for the validity of the data including checks for logical inconsistencies; this calculation of income instead is indicative of the potential sample size should imputation techniques be employed.

  4.

    Income collected from multiple questions, excluding incomes from respondents with any missing component parts and logical inconsistencies in benefits and other information (calculation 4). This calculation of income represents an aggregate sum from all components of income as reported above, with sums multiplied to an annual amount. This calculation includes only those cases who indicated whether they received a given source of income or not, the amount from that source of income, as well as the frequency of payment for each potential component included in the calculation (see Table 2). Logical consistency checks were also imposed to filter out inconsistencies in the data. For example, those whose gross income is less than their net income; couples with a valid main but no valid partner report; those who claim certain benefits but give no amount; and those who give inflated estimates of benefits (which in some cases, for example child benefit, are not means tested and paid in universal amounts); were excluded from this calculation whereas they were included in the above calculation. The effect of filters in depleting sample size of income derived from using multiple questions poses one of the main disadvantages of collecting income information in this way.

  5.

    Income collected from multiple questions, excluding incomes from respondents with any missing component parts and logical inconsistencies in benefits and other information, but including information from brackets data (calculation 5). This calculation is derived in the same way as calculation 4 above. However, we boost the number of valid responses by using information derived from a series of follow up questions referred to as ‘unfolding brackets’. Respondents who are unable, or refuse, to give an exact answer are asked a series of follow up questions designed to elicit a minimum and maximum number defining a range within which the value lies. We take the midpoint of this bracket.
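
The data handling conventions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used to derive the MCS variables: the period multipliers, the dictionary keys and the example household are all hypothetical, and the logical consistency checks of calculation 4 are reduced here to the rule that any missing part excludes the case.

```python
# Sketch only: period multipliers, field names and example values are
# illustrative assumptions, not the MCS variable definitions.

# Multipliers used to convert a reported amount to an annual amount.
PERIODS_PER_YEAR = {"week": 52, "four_weeks": 13, "month": 12, "year": 1}


def band_midpoint(lower, upper):
    """Represent a closed band or bracket by the mid-point of its limits."""
    return (lower + upper) / 2


def annualise(amount, period):
    """Convert an amount reported per period into an annual amount."""
    return amount * PERIODS_PER_YEAR[period]


def component_value(component):
    """Annualised value of one income component, or None if unusable.

    A component is a dict with keys 'received' (True, False or None),
    'amount', 'period' and optionally 'bracket' as a (min, max) pair
    obtained from unfolding brackets.
    """
    if component["received"] is None:
        return None                      # receipt itself is unknown
    if not component["received"]:
        return 0.0                       # source not received
    period = component.get("period")
    if period is None:
        return None                      # frequency of payment missing
    if component.get("amount") is not None:
        return annualise(component["amount"], period)
    bracket = component.get("bracket")
    if bracket is not None:              # bracket mid-point stands in for the amount
        return annualise(band_midpoint(*bracket), period)
    return None


def total_income(components, use_brackets=False):
    """Sum annualised components; any missing part excludes the case."""
    total = 0.0
    for comp in components:
        if not use_brackets:
            comp = {**comp, "bracket": None}   # ignore bracket information
        value = component_value(comp)
        if value is None:
            return None
        total += value
    return total


# Hypothetical household: monthly earnings, a weekly benefit known only
# within a bracket, and one source that is not received.
household = [
    {"received": True, "amount": 1500.0, "period": "month"},
    {"received": True, "amount": None, "period": "week", "bracket": (20.0, 40.0)},
    {"received": False, "amount": None, "period": None},
]
print(total_income(household))                     # excluded without brackets
print(total_income(household, use_brackets=True))  # retained with brackets
```

In this sketch the household drops out when only exact amounts are accepted and is retained once the bracket mid-point is allowed to stand in for the missing amount, which is the mechanism by which calculation 5 boosts the number of valid responses.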

Reassuringly, examining Table 3 shows that the different calculations of income produce similar results in terms of the mean value of income. They vary from the lowest, £29,329 (using calculation 3) to the highest, £32,936 (using calculation 5). However, the sample sizes do vary across calculation methods—from over 12,250 using the single question to under 6,900 using multiple questions after the elimination of missing elements and logical inconsistencies. When we apply listwise deletion in the bottom part of Table 3, we find that many of the differences are caused by variations in the sample composition—the median and mean values now lie within £1,000 of each other regardless of the definition of income used. However, there remain considerable differences in the value of the lowest quartile which is approximately £3,000 higher using the multiple questions as opposed to the single question. Consequently, the 90:10 index of inequality is substantially lower using the multiple questions as opposed to the single question. Using the brackets increases the sample size by almost 150, and makes a moderate difference to the upper quartile and mean. The correlation coefficient of 0.63 between the income derived from multiple questions and from the single question indicates a strong, but imperfect, correlation between measurement tools.
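
To make the link between the bottom of the distribution and the 90:10 index concrete, the following sketch computes the ratio of the 90th to the 10th percentile for two small hypothetical sets of incomes that differ only at the lower end. The figures are invented for illustration and are not taken from Table 3; the percentile method is also an assumption, since the paper does not state which was used.

```python
from statistics import quantiles


def ratio_90_10(incomes):
    """Ratio of the 90th percentile to the 10th percentile of incomes."""
    # With n=10, quantiles() returns the nine decile cut points.
    deciles = quantiles(incomes, n=10, method="inclusive")
    return deciles[8] / deciles[0]


# Hypothetical annual incomes (pounds): the second list is identical to the
# first except that the three lowest incomes are 3,000 higher.
single_question = list(range(10000, 65000, 5000))
multiple_questions = [13000, 18000, 23000] + single_question[3:]

print(ratio_90_10(single_question))
print(ratio_90_10(multiple_questions))
```

Raising incomes only at the bottom leaves the 90th percentile unchanged and lifts the 10th, so the ratio falls.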

Table 3 Descriptive information for differing calculation methods of income

3.2 Does the Difference Between Using Single Question Banded Income Data and Multiple Questions for Measuring Income Vary by Sub-Group or Period of Measurement?

Having charted the differences seen in the income data by using the separate calculations of income we now want to examine variations in this pattern. We are particularly interested in whether income, measured in these different ways, varies across sub-groups of the population. We do this because different groups in the population are more or less vulnerable than others, have more or less income than others and vary in their degree of need and their likelihood of experiencing poverty. It may be that some methods of recording income work better for some groups than for others. For this reason we compare the descriptive statistics using calculation methods 2 (taking the single question using the mid-point of the intervals) and 5 (using multiple questions with restrictions to filter out inconsistencies but including information from brackets). We examine these calculation methods for different groups including: single parents versus those in couples; workless families compared to families where at least one person works; those reliant on state benefits versus households who do not claim any means tested benefits; those self-employed and those not self-employed; part-time workers (defined as those working fewer than 30 hours per week) compared to full-time workers; and variations across different ethnic groups.Footnote 14 We also compare the similarity in estimates by unit of pay and period of pay. All these variations are shown in Fig. 1.

Fig. 1 Difference between single income question (using the estimated interval mid-point (calculation 2)) and multiple income question average estimates (components question with imputed information from brackets (calculation 5)). hh household. Negative value indicates a higher value on the single question, N = 6,353

We can see the sub-groups of the population that might be considered more vulnerable such as those on means tested benefits, part-time workers, non-whites and the self-employed are more likely to record higher incomes through the multiple questions than the single question measurement. The difference is largest for couple households where both are in self-employment or lone parent families headed by a parent in self-employment, where the multiple questions appear to capture a substantially higher mean level of income (approximately £46,000) than data from the single banded question (approximately £32,500), a discrepancy of over £13,000.

When the period of reporting (defined as 1 week, 4 weeks, calendar month or annual) is examined the results show that higher values of income are recorded through the multiple questions than the single question when respondents report their income annually or in a 4 week period. But respondents who report their income by calendar month report higher incomes using the single question.

So far we have examined differences in income recorded by respondents using a range of definitions based either on the single banded income question or the multiple questions by comparing the income at various points of the distribution. An alternative way to illuminate differences between income measurements is to apply the upper and lower limits of the twenty income bands that are used in the single question to the continuous data from the multiple questions, so that the data from both calculation methods are now grouped according to the same limits, and examine the differences. Figure 2 shows the difference between the income distributions from the single question (using calculation 2) and multiple questions (calculation 5), expressed as the difference in number of income bands. A band in this case represents an absolute range of income (see Table 2).Footnote 15 A negative value indicates that the income band derived from the multiple questions is lower than that derived from the single question. In total 29% of the cohort fell into the same band under both methods, 26% had a lower income band using the multiple questions and 44% had a higher income band using the multiple questions. In a substantial number of cases (19%), the difference was greater than two bands, equating to a potential difference of £48,000 or more in some cases.
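
The band comparison can be illustrated with a short sketch. The band limits below are hypothetical placeholders (the actual limits are those of Table 2 and differ for couple and lone parent households), and the function names are our own.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical lower limits (annual, pounds) of each band; the last band is open.
BAND_LOWER_LIMITS = [0, 5000, 10000, 15000, 20000, 30000, 40000, 60000, 100000]


def band_index(annual_income, lower_limits=BAND_LOWER_LIMITS):
    """0-based index of the band whose range contains annual_income."""
    return bisect_right(lower_limits, annual_income) - 1


def band_difference(multiple_q_income, single_q_band):
    """Band implied by the multiple questions minus the single question band.

    Negative values mean the multiple questions give a lower band.
    """
    return band_index(multiple_q_income) - single_q_band


def direction(diff):
    """Label a band difference as 'same', 'higher' or 'lower'."""
    if diff == 0:
        return "same"
    return "higher" if diff > 0 else "lower"


# Hypothetical (continuous income, single question band) pairs.
pairs = [(22000, 4), (45000, 4), (12000, 3)]
counts = Counter(direction(band_difference(inc, band)) for inc, band in pairs)
print(counts)
```

Tabulating the three labels over the whole sample gives the shares reported in the text, and the raw differences give the distribution plotted in Fig. 2.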

Fig. 2 Difference between single income question and multiple income question bands. Difference in income bands when applying income bands from single question onto continuous data from multiple questions. Difference measured in the number of income bands; positive values indicate a higher band from transformed continuous data than from single question data. Other notes as Fig. 1

We have descriptively examined differences in income recorded by respondents using a range of definitions. Results have shown that there are non-negligible variations between the income measures for certain sub-groups of the population and by the unit of time respondents record their income over. However, as many of the factors that are examined may overlap, we explore these relationships more robustly using a multinomial probit regression model (presented in Table 4).Footnote 16 Our model examines the relative probability of reporting a higher income band or lower income band using the multiple questions compared to reporting the same band using both the multiple and single questions (the latter being the baseline category). The process is similar to constructing a (binary) probit regression model, although as we have a nominal rather than a binary outcome, the models for predicting the probability of ‘higher band versus the same’ and ‘lower band versus the same’ are run simultaneously, with the added constraint that resulting predicted probabilities sum to one across all categories. In our model, we enter all of the factors contained in Fig. 1 as predictors and present, in Table 4, the relative magnitude of each predictor in determining the probability of reporting a higher, or a lower, income band with the multiple questions than with the single question, in each case compared to the probability of reporting the same band.
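
One common way of writing a model of this kind is given below as a sketch. The notation is ours and is an assumption rather than the specification reported in Table 4: $U_{ij}$ is a latent propensity for family $i$ to fall in category $j$, $x_i$ is the vector of predictors, and the coefficients of the baseline category are normalized to zero.

```latex
U_{ij} = x_i^{\top}\beta_j + \varepsilon_{ij},
\qquad j \in \{\text{same}, \text{higher}, \text{lower}\},
\qquad \beta_{\text{same}} = 0,
\qquad (\varepsilon_{i,\text{same}}, \varepsilon_{i,\text{higher}}, \varepsilon_{i,\text{lower}}) \sim N(0, \Sigma)

P(y_i = j \mid x_i) = P\bigl(U_{ij} > U_{ik} \ \text{for all } k \neq j\bigr),
\qquad \sum_{j} P(y_i = j \mid x_i) = 1
```

Because each family is assigned to the category with the largest latent propensity, the three probabilities sum to one by construction, which is the constraint referred to above.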

Table 4 Multinomial probit regression showing estimation of income using single versus multiple questions

The results confirm earlier descriptive analyses that those with more diverse sources of income are more likely to report higher incomes using the multiple income questions. Those in self-employment, part-time employment and those in receipt of means tested benefits are significantly more likely to report higher incomes using the multiple questions. Net of other factors, lone parents are more likely to report the same income across measurements. Ethnicity is generally not statistically significantly predictive of response patterns, except that in comparison to white families, black families are less likely to report income within the same bracket across measurements.Footnote 17

3.3 What Implications does Using a Banded Single Question Versus Detailed Questions have for Measuring Child Poverty?

Looking at the differences in reported income according to measurement method allows us to gain insight into which groups are most likely to give inconsistent income estimates. Nevertheless, we remain unable to comment on which of these methods is likely to provide the most accurate estimate.Footnote 18 However, due to the extensive range of information collected in the MCS we are able to look at how the different measures of income are associated with different financial wellbeing measures. In the first instance we examine how the classification of families in poverty in the MCS varies according to measurement instrument. Furthermore, we speculate on the accuracy of those classified as living in poverty (or not) by comparing our results with other, more subjective measures of financial wellbeing collected in the MCS.

We classify households as living in poverty using a definition similar to that of the European Commission (2010): a relative classification in which households whose equivalised income falls below 60% of median income are classified as being in poverty. Our definition approximates the DWP practice of comparing net income to the median using a modified version of the OECD equivalence scales, focusing on families rather than households. This resembles the procedure adopted by Ketende and Joshi (2008), but here we use a separate median threshold for households in receipt of housing benefit, for whom housing costs are not included in our family income estimates (£206pw), compared to those not in receipt of housing benefit (£244pw) (Adams et al. 2010).Footnote 19
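
The classification rule can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the equivalence weights are assumed values of a modified OECD scale rescaled so that a couple without children equals 1.0, and the two quoted figures (£206pw and £244pw) are read here as the weekly cut-offs below which a family is classed as poor.

```python
# Assumed equivalence weights (couple without children = 1.0).
# Illustrative values only; they are not quoted in the paper.
WEIGHT_FIRST_ADULT = 0.67
WEIGHT_OTHER_AGED_14_PLUS = 0.33
WEIGHT_CHILD_UNDER_14 = 0.20

# Weekly figures quoted in the text (GBP), read here as poverty cut-offs.
LINE_WITH_HOUSING_BENEFIT = 206.0
LINE_WITHOUT_HOUSING_BENEFIT = 244.0


def equivalence_factor(n_aged_14_plus, n_children_under_14):
    """Sum the assumed weights over all family members."""
    if n_aged_14_plus < 1:
        raise ValueError("a family needs at least one member aged 14 or over")
    return (WEIGHT_FIRST_ADULT
            + WEIGHT_OTHER_AGED_14_PLUS * (n_aged_14_plus - 1)
            + WEIGHT_CHILD_UNDER_14 * n_children_under_14)


def in_poverty(net_weekly_income, n_aged_14_plus, n_children_under_14,
               receives_housing_benefit):
    """Return True if equivalised income falls below the relevant cut-off."""
    factor = equivalence_factor(n_aged_14_plus, n_children_under_14)
    equivalised = net_weekly_income / factor
    if receives_housing_benefit:
        line = LINE_WITH_HOUSING_BENEFIT
    else:
        line = LINE_WITHOUT_HOUSING_BENEFIT
    return equivalised < line
```

Under these assumed weights, a couple with two children under 14 and a net income of £300pw has an equivalised income of about £214pw, which falls below the £244pw line but above the £206pw line, so the classification depends on housing benefit receipt.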

The analysis continues to compare banded data and continuous data. However, if we were to use the banded data directly as the basis for our poverty cut-off point, we would risk significantly underestimating or overestimating the proportion in poverty by ignoring the within-band distributionFootnote 20 (for example, up to 10% of the cohort fall within the same income band). In order to obtain a more accurate estimate of the numbers classed as ‘in poverty’ from the banded data, we use interval regression to obtain predicted values. Interval regression is used to model banded data, and holds an advantage over OLS when estimating the distributions at the bottom and top of the interval (Ketende and Joshi 2008; Micklewright and Schnepf 2010). Interval regression also has advantages where the bottom or top thresholds of the interval are unknown (or censored), as in the case of our single income variable.
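
The likelihood that underlies interval regression can be illustrated with a deliberately reduced, intercept-only sketch. It is not the model estimated in the paper, which includes covariates. The bands, the assumption of normally distributed income on the raw scale, the starting values and the simple compass-search optimiser are all illustrative choices.

```python
import math
from statistics import NormalDist


def interval_loglik(mu, sigma, bands, counts):
    """Log-likelihood of banded data under Normal(mu, sigma).

    Each band is (lower, upper); None marks an open (censored) end.
    """
    dist = NormalDist(mu, sigma)
    total = 0.0
    for (lower, upper), n in zip(bands, counts):
        p_upper = 1.0 if upper is None else dist.cdf(upper)
        p_lower = 0.0 if lower is None else dist.cdf(lower)
        # Probability mass inside the band, floored to avoid log(0).
        total += n * math.log(max(p_upper - p_lower, 1e-300))
    return total


def fit_banded_normal(bands, counts, mu, sigma, step=64.0, tol=1e-4):
    """Maximise the likelihood with a basic compass search (stdlib only)."""
    best = interval_loglik(mu, sigma, bands, counts)
    while step > tol:
        moved = False
        for d_mu, d_sigma in ((step, 0), (-step, 0), (0, step), (0, -step)):
            new_mu, new_sigma = mu + d_mu, sigma + d_sigma
            if new_sigma <= 0:
                continue
            ll = interval_loglik(new_mu, new_sigma, bands, counts)
            if ll > best:
                mu, sigma, best, moved = new_mu, new_sigma, ll, True
        if not moved:
            step /= 2  # refine the search once no direction improves
    return mu, sigma


# Hypothetical weekly bands (GBP), open at the bottom and at the top.
BANDS = [(None, 100), (100, 200), (200, 300), (300, 400),
         (400, 500), (500, 600), (600, 700), (700, None)]

# Synthetic counts: expected cell sizes for 1,000 families under Normal(400, 150).
_truth = NormalDist(400, 150)
COUNTS = [1000 * ((1.0 if hi is None else _truth.cdf(hi))
                  - (0.0 if lo is None else _truth.cdf(lo)))
          for lo, hi in BANDS]

mu_hat, sigma_hat = fit_banded_normal(BANDS, COUNTS, mu=300.0, sigma=100.0)

# Share of the fitted distribution below a cut-off that lies inside a band.
share_below_line = NormalDist(mu_hat, sigma_hat).cdf(244)
```

Because the cut-off of £244 falls inside the £200 to £300 band, assigning every family in that band its mid-point would place all of them on one side of the line, whereas the fitted distribution allocates only part of the band below it.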

We construct an interval regression model using a number of predictors (region and sample stratum, worklessness, self-employment, part-time working, educational level (of main respondent), age (of main respondent), family structure, ethnicity (of child), unit of payment of main component of income, housing tenure and receipt of means tested benefits) to obtain predicted values of income. As the predicted values reflect the choice of predictors used, to facilitate comparison, we compare the predicted results from the interval regression with the predicted results from an OLS model that uses the continuous income from the battery of questions with brackets (calculation 5 earlier).Footnote 21 In addition, we also compare the impact of using the observed continuous income (calculation 5 earlier), and for reference the banded single question data using mid-points (calculation 2 earlier), on estimates of poverty. We equivalise our four income values using OECD scales (Adams et al. 2010) and present the results in Table 5.

Table 5 Per cent in poverty by different measurement tools and derivations of income

By comparing the predicted values from the two different data sources in the models constructed, we see that a greater number of families would be classed as being in poverty using the data from the single question compared to the multiple questions (looking at either predicted or observed values). Essentially, the higher level of income estimated using data from the multiple questions leads to a more conservative estimate of poverty.Footnote 22

Due to the range of alternative measures of financial wellbeing available in the MCS, we can additionally compare how well our measures of poverty (using the different calculation methods of income) correspond to measures of poverty defined in more subjective ways. We do this in Table 6, which shows the positive predictive power of our poverty definitions in detecting other indicators of financial deprivation: free school meal eligibility and uptake (FSM), reports of financial wellbeing (financially ‘just about getting by’ or ‘finding it difficult’), receipt of housing benefit, and all three indicators combined (termed indicator of deprivation).Footnote 23 This is a simple measure: the percentage of those identified as being in poverty using our income measures who are also identified as deprived on the alternative measure of financial wellbeing.
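
The calculation can be sketched in a few lines. The function name and the toy data are hypothetical; only the definition (the share of those flagged as poor who are also flagged as deprived) comes from the text.

```python
def positive_predictive_value(in_poverty, deprived):
    """Share of families classed as poor who are also flagged as deprived.

    in_poverty -- booleans from an income-based poverty definition
    deprived   -- booleans from an alternative indicator (e.g. FSM)
    """
    flagged = [d for p, d in zip(in_poverty, deprived) if p]
    if not flagged:
        raise ValueError("no family is classed as being in poverty")
    return sum(flagged) / len(flagged)


# Toy data for six hypothetical families.
poor_by_income = [True, True, True, True, False, False]
deprived_by_fsm = [True, True, True, False, True, False]
ppv = positive_predictive_value(poor_by_income, deprived_by_fsm)
```

In the toy data, three of the four families classed as poor are also flagged on the alternative indicator, so the positive predictive value is 0.75; families not classed as poor do not enter the calculation.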

Table 6 The positive predictive values of poverty defined using the different income measures against poverty defined using other measures of financial wellbeing

The results show that the definitions of poverty derived from multiple questions have higher positive predictive values across measures of financial wellbeing than definitions of poverty derived from a single question. In other words, those we define as being in poverty from the multiple questions have a higher likelihood of being in a deprived category on another indicator of financial wellbeing than those defined as being in poverty from the single income question.

While our analysis is highly dependent on our choice of indicators of financial wellbeing, the evidence suggests that income measured through a single question may be underestimated and, as such, the higher levels of poverty estimated through a single question may not necessarily be corroborated by other measures of financial wellbeing. However, we exercise caution in drawing our conclusions, as the definition of poverty does not correlate perfectly with other indicators of financial wellbeing: for example, in the latest official records, 50% of households with children defined as being ‘in income poverty’ (before housing costs) were not in receipt of housing benefit and 13% were not in receipt of any means tested benefits (Adams et al. 2010). Furthermore, a large body of literature exists that cautions against treating measures of income poverty and other measures of deprivation synonymously (for example Whelan et al. 2004; Saunders and Adelman 2006).

4 Summary

Family income is measured in a variety of ways across national surveys, impeding comparability between sources. Here, we examined one of the main distinctions in measurement method: whether to use a single variable with a choice of intervals or multiple questions examining each portion of income in detail. Each has its own merits. Researchers wishing to examine certain portions of income will need to use information from multiple questions. However, for many studies, researchers may not need the additional detail, and respondent time may be better spent addressing other questions. Additionally, our results show that using multiple questions is associated with a substantially lower response rate, especially when safeguards to ensure logical consistency are imposed.

Given that income represents a key variable of interest in many studies, obtaining accurate measurements is of great importance, and we demonstrate that in the case of the MCS, the method of collecting income is associated with differential estimates of income and measurement error. If the variable in question is used as a dependent variable in a regression and the measurement error is ‘classical’, i.e. a random error, then the regression coefficients are unbiased, albeit with larger standard errors. If it is used as an explanatory variable, the error, even if classical, causes ‘attenuation bias’, i.e. the coefficient is biased towards zero, and the model may not fully account for the confounding effect of income on other variables (for example Bound et al. 2000). Moreover, the error may not be ‘classical’; that is, it may be systematically related to the true unobserved value.
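
The contrast between the two cases can be illustrated with a small simulation. All quantities are synthetic assumptions (a true slope of 2, unit variances, normally distributed errors) and bear no relation to MCS data. With classical error in the explanatory variable, theory implies that the estimated slope shrinks by the factor var(x) / (var(x) + var(error)), here 1/2.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20000
TRUE_SLOPE = 2.0

# Synthetic 'true' income (standardised) and an outcome that depends on it.
true_income = [rng.gauss(0, 1) for _ in range(n)]
outcome = [TRUE_SLOPE * v + rng.gauss(0, 1) for v in true_income]

# Classical (purely random) measurement error added to each variable in turn.
noisy_income = [v + rng.gauss(0, 1) for v in true_income]
noisy_outcome = [o + rng.gauss(0, 1) for o in outcome]

# Error in the dependent variable: slope still centred on the true value.
slope_error_in_y = ols_slope(true_income, noisy_outcome)

# Error in the explanatory variable: slope attenuated towards zero.
slope_error_in_x = ols_slope(noisy_income, outcome)
```

In this set-up the slope estimated with a noisy outcome stays close to 2, while the slope estimated with noisy income is pulled towards 1, which is the attenuation described above.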

Our analyses of the differences in the estimate of income by measurement method suggest that the measurement error may be unequally distributed among socioeconomic and socio-demographic groups. Use of the multiple questions generally results in a higher estimate of family income than use of a single question. This is particularly true for certain groups of the population: those on means tested benefits, those in self-employment, and those in part-time work and on low incomes. This could indicate systematic bias introduced by one or other of the measurement instruments. As the battery of income questions results in higher income estimates, particularly for those in self-employment and on low incomes, it could be assumed that the results from the battery of questions provide the most accurate answer. While it is only through external verification that such an assumption could be qualified, we explore this further by looking at the definition of poverty. The results show that using a single question generally produces a higher proportion of families who could be classified as living in poverty. These results concur with a number of other studies, where the use of a single question was found to overestimate the rate of poverty (Davern et al. 2005). In the MCS, when examining other measures of financial deprivation, the more conservative estimate of poverty using the multiple questions shows greater concurrence with the other measures of financial wellbeing, as indicated by higher positive predictive power. The evidence suggests that use of multiple questions to collect income information, as opposed to a single question, may reduce the possibility of measurement error among lower income groups and those with diverse income sources.