Introduction

The past 20 years or so have seen the establishment of an increasing number of cross-national surveys to meet demand for comparable data on adolescent health for both scientific and policy purposes. While the potential difficulties of producing comparable international data (e.g., sample frame coverage, culturally appropriate translation of survey items and standardised data collection methods) have been recognised for a number of years, greater attention is now focused on these challenges, alongside other methodological issues such as quality standards in sampling, questionnaire construction and analytic methods (see for example Lynn 2003; Levy and Lemeshow 1999; Harkness 1999).

The Health Behaviour in School-aged Children (HBSC) study was among the first international surveys on adolescent health in Europe, with fieldwork first being undertaken in 1983 in three countries. The challenges to producing valid and reliable data were clear when the initial research protocol was being developed, including a range of structural and practical factors such as variation in school systems where data were collected, language, compliance with the research protocol and varying degrees of capacity within countries (Aarø et al. 1986; Smith et al. 1992). As the study has increased in size and complexity, a more systematic assessment of the survey methods employed has been required.

This paper provides a brief introduction to the background and aims of the HBSC study, the survey content and methods used to collect data across countries. This is followed by a discussion of recent developments in maintaining and improving quality standards in the context of an evolving international research study.

Background and aims of the HBSC study

The HBSC study was established in 1982 by a team of academic researchers from Finland, Norway and England (Aarø et al. 1986). Data were first collected in 1983/84, 2 years later in 1985/86 and every 4 years since then. The study has increased in size over the years, such that for the 7th survey in 2005/06, 41 countries/regions participated across Europe and North America, in collaboration with the World Health Organization (WHO) Regional Office for Europe. A list of participating countries/regions can be found at http://www.hbsc.org.

The HBSC Research Network comprises principal investigators from each participating country and their research teams. All members participate and contribute to the development of the study within their area of interest and expertise. The study is co-ordinated by an international co-ordinator (elected by the principal investigators) who heads a centre responsible for co-ordinating all international activities within the network. Since 1995 the centre has been based at the Child and Adolescent Health Research Unit at the University of Edinburgh. Funding of the centre has been provided by NHS Health Scotland, the Scottish Executive’s Chief Medical Office, the University of Edinburgh and, more recently, by contributions from member countries via their national funders (a subscription system that was introduced in 2000). The data collected in each country are managed by an International Data Bank manager who is responsible for the production of the international data file from each survey. The data bank is based at the University of Bergen and has been supported by funds from the Norwegian Research Council and the international network subscription system. Collaboration with the WHO creates opportunities for, and facilitates, the wide dissemination and utilisation of HBSC research findings through international reports (see for example Currie et al. 2000, 2004; King et al. 1996; WHO 2006) and provides some support for those countries with fewest resources, such as help with attending meetings.

The original aim of the study has remained largely the same since its inception: to gain new insight into and increase understanding of adolescent health behaviours, health and well-being in their social context, and to collect high quality comparable cross-national data in order to achieve this. From the start, one of the study’s key objectives has been to provide research evidence to support health improvement policy and practice through gaining a better understanding of the patterning of, and associations between, different types of health-related behaviours. With the growth of the study and the extension of the international research network, the study’s approach has become increasingly multi-disciplinary, resulting in a cross-fertilisation or ‘fusion’ of conceptual approaches, including social psychology, public health/epidemiology and a macrosocial/multilevel perspective (Currie et al. 2001). This growth and development have allowed multiple strands of research to proceed simultaneously, and provided the opportunity to use the data for a range of purposes, but discussion of these issues is beyond the scope of this paper, where the focus is on methodological developments.

With the growth of the study and its public profile, there have been increasing demands from a range of national and international agencies to use published data and from academics to access raw data for secondary analysis (see for example UNICEF 2007; Bradshaw et al. 2007; Association of Public Health Observatories 2006; UNICEF 2004; European Commission 2000). In addition, the number of published scientific papers from members of the HBSC Research Network has grown markedly in recent years (see http://www.hbsc.org/publications.html for an up-to-date list of publications). This increased focus on the HBSC study has resulted in greater scrutiny and the need for a sharper focus on continuous improvement in data quality.

The HBSC Research Network members collaborate on the production of an international research protocol for each 4-yearly survey (see for example Currie et al. 2001). The Research Protocol includes detailed information and instructions covering the following: conceptual framework for the study; scientific rationales for each of the survey topic areas; international standard version of questionnaires and instructions for use (e.g. recommended layout, question ordering and translation guidelines); comprehensive guidance on survey methodology, including sampling, data collection procedures and instructions for preparing national datasets for export to the International Data Bank; and rules related to use of HBSC data and international publishing. The current Research Protocol is available on request from the International Co-ordinating Centre. Key components of the survey methods are examined in more detail below.

Survey methods

Questionnaire content

HBSC is a school-based survey and data are collected through self-completion questionnaires administered in the classroom. The international standard questionnaire for each survey consists of three levels of questions that are used to create national survey instruments. These are:

  1. Core questions that each country is required to include for the production of an international common dataset.

  2. Optional packages of questions on specific topic areas from which countries can choose.

  3. Country-specific questions related to issues of national importance.

Survey questions cover a range of health indicators and health-related behaviours as well as the life circumstances of young people. The core questions provide information on: demographic factors, including age and state of maturation; social background, including family structure and socio-economic status; social relations provided by family, peers and school environment; health behaviours, including physical activity, eating and dieting, smoking, alcohol use, cannabis use, sexual behaviour, injuries, violence and bullying; and well-being indicators, including symptoms, life satisfaction, self-reported health, body mass index and body image (Currie et al. 2001). Many of these core items have remained the same since the study’s inception, allowing the analysis of change through time.

Study design

The specific population selected for sampling is young people attending school aged 11, 13 and 15. These age groups represent the onset of adolescence, the challenge of physical and emotional changes, and the middle years when important life and career decisions are beginning to be made. The desired mean age for the three age groups is 11.5, 13.5 and 15.5. In some countries, each age group corresponds to a single school grade, while in others a proportion may be found across grades due to students being advanced or held back. This has implications for the sampling strategy chosen by each participating country. A minimum of 95 per cent of the eligible target population should be within the sample frame. Countries may choose to stratify their samples to ensure representation by, for example, geography, ethnic group and school type.

Cluster sampling is used, the primary sampling unit being school class (or school, where a sampling frame of classes is not available). The recommended sample size for each of the three age groups is set at approximately 1,500 students, the calculation assuming a 95 per cent confidence interval of ± 3 per cent around a proportion of 50 per cent and a design factor of 1.2, based on analyses of existing HBSC data.
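The arithmetic behind the recommended sample size can be sketched as follows. The paper does not give the formula, so this is an illustration under two stated assumptions: the usual normal approximation for a proportion, and a reading of the "design factor" as the ratio of standard errors (often written deft), so that the simple random sample size is inflated by its square. The function name and parameters are hypothetical.

```python
import math


def required_sample_size(margin, proportion=0.5, z=1.96, design_factor=1.0):
    """Sample size needed to estimate a proportion to within +/- margin.

    Assumption: design_factor is the ratio of standard errors (deft), so the
    simple-random-sample size is multiplied by design_factor ** 2.
    """
    srs_n = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    return math.ceil(srs_n * design_factor ** 2)


# Simple random sampling versus a clustered design with a design factor of 1.2
n_srs = required_sample_size(margin=0.03)
n_clustered = required_sample_size(margin=0.03, design_factor=1.2)
print(n_srs, n_clustered)  # prints "1068 1537"
```

Under these assumptions the result, roughly 1,540 students per age group, is in line with the recommended figure of approximately 1,500. If 1.2 were instead read as a design effect (a ratio of variances), the same calculation would give about 1,280.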

Given the differences in school systems, such as vocational or academic institutions, age of admission to school and the degree of advancement and holding back among students, imposing a uniform approach is impractical. To deal with this complexity, age has been a priority for sampling, with students of the relevant age being selected across school years. Further complications arise when the target population is split across different levels of schooling, such as primary and secondary. Where the number of classes eligible for sampling is unknown, probability proportional to size sampling is used, making use of actual or estimated school size. In some countries, to minimise fieldwork costs, classes from one age group are randomly sampled and classes for the other grades are then drawn from the same schools, minimising the number of schools required. The survey is administered at different times as appropriate to the national school system in order to produce samples with mean ages of 11.5, 13.5 and 15.5. In countries where there is significant holding back and/or advancement of students, this may involve sampling more than three grades.
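Selection of schools with probability proportional to size can be illustrated with a short sketch. The paper does not specify which selection algorithm countries use; the version below assumes systematic selection along the cumulative size total, which is one common way of doing it. The function name, the school labels and the enrolment figures are all hypothetical.

```python
import random


def pps_systematic_sample(sizes, n_draws, rng=None):
    """Select units with probability proportional to size (systematic PPS).

    sizes: dict mapping a unit id (e.g. a school) to its actual or
    estimated size. Returns a list of n_draws selected ids. A unit larger
    than the sampling interval can appear more than once.
    """
    rng = rng or random.Random()
    total = sum(sizes.values())
    interval = total / n_draws
    start = rng.random() * interval  # random start in [0, interval)
    targets = iter(start + k * interval for k in range(n_draws))

    selected = []
    cumulative = 0.0
    target = next(targets, None)
    for unit, size in sizes.items():
        cumulative += size
        # Every target falling inside this unit's size range selects it
        while target is not None and target < cumulative:
            selected.append(unit)
            target = next(targets, None)
    return selected


# Hypothetical enrolment figures for four schools
schools = {"School A": 120, "School B": 480, "School C": 240, "School D": 360}
print(pps_systematic_sample(schools, n_draws=2, rng=random.Random(2006)))
```

Larger schools occupy a longer stretch of the cumulative total and are therefore more likely to contain a selection point, which is the property that a fixed number of classes or students per selected school is meant to balance.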

In the vast majority of countries, a nationally representative sample is drawn. Where this is not possible, a regional sample is drawn (one country in 2005/06). It should also be noted that in those countries with a small population (Malta and Greenland in 2005/06), a census of school children of the relevant age groups is undertaken.

Data collection and file preparation

Fieldwork takes place every 4 years between October and the following May in the vast majority of countries, usually lasting between 1 and 2 months. In most countries, questionnaires are delivered to schools for teacher administration. Where financial resources are available, researchers are used in an attempt to minimise teacher burden. Files from each country are prepared and exported to the HBSC International Data Bank at the University of Bergen, where they are cleaned and compiled into an international data set with support from the Norwegian Social Science Data Services (NSD), under the guidance of the study’s data bank manager. Data on young people outside the target age groups are removed and deviations from the international research protocol are documented, typically to make users of the data aware where there are changes to the wording of questions and/or response categories in a country. Depending on the magnitude of the deviation, the user can then choose to include or exclude items from subsequent analyses. Despite some variation, the desired mean ages and sample sizes are achieved in the vast majority of countries.

When all national data have been received and accepted according to the Research Protocol by the data bank manager, the files are merged and the combined dataset is made available to the principal investigators in each participating country. The final international data file is usually produced 4 or 5 months after the deadline for submission of national data sets. From the time it is finalised, use of the international data file is restricted to member country teams for a period of 3 years, after which the data are available for external use by agreement with principal investigators across the study.

Continuous improvement

The foundations for the collection of robust and comparable data were established back in the 1980s when the HBSC study was limited to a small number of participating countries. Since then Europe has seen massive social and political change and the challenge of producing internationally comparable data is magnified, as the study embraces an increasing variety of cultures across Europe and North America. In order to improve the quality of data collected, a number of actions have been taken, which can be organised under six key headings. The first two actions, management structure and review of the international Research Protocol, underpin the study’s efforts to further improve data quality. The remaining actions build on these strategic developments, namely sampling procedures, translation, data documentation and new forms of analysis and software.

Study organisation

Long-term improvement in the quality of the research undertaken within an international research network requires commitment to the study and its development, i.e., not simply the production of detailed research protocols. The establishment and development of a strong multi-disciplinary international research network that promotes the sharing of skills and expertise has ensured not just the quality of the survey data itself, but the quality of the scientific development of the study, internationally as well as within countries. A particular strength of the HBSC Research Network is the long-term commitment that many members have given. The importance of establishing and strengthening such a network should not be underestimated if the study is to achieve its potential.

There has been action in three key areas of study organisation in recent years, each contributing to building a structure that can best support countries in the delivery of high quality data and the production of high quality metadata. First, the position of the International Co-ordinating Centre in Edinburgh and International Data Bank in Bergen has been strengthened, with more day-to-day support available for member countries and a greater emphasis on adherence to the Research Protocol and detailed data documentation. Second, geographical zones have been introduced, each containing a small number of neighbouring countries, for joint working and sharing methodological expertise and experiences. Third, a Methods Development Group has been formally established within the management structure, working via a series of task and finish groups. Recent work has focused on reviewing countries’ sampling plans and international cleaning procedures for the 2005/06 survey, with work planned on time trend and multilevel analyses. It is important to note that methodological development work to date has relied on the expertise and contributions of network members, with support from the International Co-ordinating and Databank Centres.

Review of the international Research Protocol

As the study has grown in visibility since the early 1980s, largely through an increase in the number of scientific publications and dissemination of reports to national and international policy makers, the survey content and methods have been subjected to greater scrutiny. As part of the continuous improvement process, a review of the international Research Protocol was introduced prior to the 2001/02 survey, with internal and external peer review built in. A key element of the review was to document information on the reliability and validity of existing HBSC items and to encourage additional validation studies, where possible including a number of countries. Evidence of this renewed emphasis on assessing the validity and reliability of questions in recent years can be seen from studies published on, for example, family affluence (Boyce et al. 2006), diet (Vereecken and Maes 2003), overweight and obesity (Elgar et al. 2005), symptoms (Haugland and Wold 2001) and school environment (Torsheim et al. 2000). Any changes to survey procedures and/or specific items that could impact on comparability between survey years are noted within the Research Protocol.

Given the potential for question ordering to influence survey responses, the need for guidelines was recognised and these were provided to countries prior to the 2005/06 survey. Other outcomes of the review, including the need to strengthen guidance and support on sampling, translation and data documentation, are outlined below. The Methods Development Group works closely with the International Data Bank Manager in leading on this work.

Strengthening sampling procedures and support

The need to revisit sampling procedures and the support given to member countries during survey preparation was thought to be important for several reasons. First, the variety of school systems represented has grown with study expansion. This has implications for sampling across the three age groups and achieving the desired mean age in each. Second, with greater emphasis on multilevel analytical techniques, whether to adjust for the sample design or to use data from different levels in examining a research question (e.g., looking at individual and school level effects on an outcome measure), the identification of the data hierarchy is critical. Third, levels of technical expertise and experience vary across countries. With more countries involved, a more systematic approach was needed.

All countries are now provided with sampling guidance notes and required to complete a sampling questionnaire, covering issues such as: the proportions of students held back or advanced; how students will be sampled; whether a sampling frame of classes or schools is available; whether probability proportional to size sampling will be possible where school is the primary sampling unit; stratification to be used; dealing with likely non-response; and whether any boosts will be built in (e.g., to accommodate language groups or geographic regions). These plans are reviewed by a team from the study’s Methods Development Group, further guidance is offered where necessary, and some plans are amended as a result of this process. Full details of the sampling plan and accompanying guidance notes can be found in the Research Protocol (Currie et al. 2001).

Strengthening the translation process

As the study has grown, translation has become more important than ever. The standard approach in HBSC has been to ask the same question in each country, that is to say a direct translation with adaptations permitted only when absolutely necessary for linguistic clarity. The source language for all questions is English and these are then translated into the national language(s).

In the early years of the study the translation of survey items was not a major consideration: there were few countries involved, the researchers developed questions together and a common understanding of translation needs was assumed. Therefore, in common with many cross-national studies (Harkness 1999) the documentation of translation procedures from early HBSC surveys is minimal. As more countries joined the study and different questions were added to the core questionnaire, the need for a more formal documentation process became clear.

The standard method employed in the study for checking translations has been the back-translation process, where the translated questions are translated back into the source language (English in this case) and compared against the original. If carried out ‘blind’ with the assumption that if the back-translation compares favourably to the original then there are no problems, this process has severe limitations as basic errors can simply be duplicated (Harkness 2003). An example to illustrate this would be if the word ‘fair’ (intended meaning ‘impartial’) was translated into ‘blonde’ and then back-translated to ‘fair’, thus masking the error. However, when additional reviewing techniques are used it is possible to identify any such major errors and to highlight potential discrepancies (Harkness 2003). This is particularly so when researchers themselves are involved in the development of the instrument, as they are in a collaborative study such as HBSC, and not simply translating an existing instrument in isolation.

In recent surveys the back-translation reviewing process has been strengthened by incorporating a more thorough system where back-translations are checked at the International Co-ordinating Centre, followed by discussion and further review between the researcher, translator and the Co-ordinating Centre. A further improvement is the development of detailed translation guidelines that are aimed specifically at translators rather than the researchers. It is sometimes assumed that, because the questions are designed for adolescents and are simply phrased, translation is a relatively straightforward process. While this may be true for some questions (for example, ‘How often do you smoke cigarettes?’) it is not always so for more conceptually complex questions on health perception and social contexts. Nor is the Research Protocol’s scientific rationale for including a question in the survey sufficient guidance for translators. This development will continue to be expanded and documented as countries feed back their experiences of testing new items and new countries/languages are included in the study.

New countries carrying out the survey for the first time test their translations through pilot surveys and qualitative work (e.g., focus group interviews with children). Translations are adjusted at this stage and may also be further refined during the required pilot phase prior to each survey. Similarly, new items are tested across countries in this way. More recently, language groups have been established, building on the introduction of geographical zones in the management structure for the study. For example, those countries that include Russian-speaking populations have collaborated to work more efficiently and ensure consistency with translations.

The importance of improvement in this methodological area has meant that these practical (and time-consuming) actions, and their thorough documentation, have been undertaken despite limited resources. Clearly more work is required, but these pragmatic developments are likely to improve the consistency of items across countries and subsequent data comparability.

Data documentation and processing

With increased demand for published HBSC data and requests for data for secondary analysis, greater emphasis has been placed on the availability of good quality metadata across countries. It is particularly important that a country’s deviations from the international research protocol are clearly documented. For example, some countries may be required by funders to omit sensitive items, such as those concerning sexual health or cannabis use. An online data submission facility has been introduced by the International Data Bank, following consultation with the Methods Development Group and wider HBSC membership. All data processing, including consistency checks, age cleaning, derivation of variables and imputation, is handled centrally by the International Data Bank, with support from NSD, working to a framework agreed by the study. Data sets are not accepted by the data bank manager until the online data documentation form is complete; the form contains questions on fieldwork dates, sampling procedures (to check compliance with the reviewed sampling plan), data collection procedures, response rates, languages used, funding and any deviations from the Research Protocol. Primary sampling units and stratification variables are clearly identified for subsequent analysis. It is anticipated that this more comprehensive procedure will minimise errors, improve transparency and help those undertaking national and international analyses.

New forms of analysis and software

When the HBSC study began there were relatively few statistical packages available that permitted appropriate analysis of survey data. The majority of the research community was unaware that aspects of survey design, particularly weighting and clustering, could affect the results of analyses from survey data. Inclusion of survey-specific modules within many widely used statistical packages, such as Stata (StataCorp. 2005), SAS/STAT (SAS Institute Inc. 2004) and SPSS (SPSS Inc. 2006), has increased the capacity for appropriate survey data analysis. Recent initiatives from the UK Economic and Social Research Council’s (ESRC) Research Methods Programme, such as the PEAS (Practical Exemplars and Survey Analysis) web-based resource and the Survey Design and Measurement Initiative (ESRC 2006), have been aimed specifically at raising the quality of survey research and methodology. The increasing interest in examining effects operating at a variety of levels other than the individual (e.g., the impact of school policies on adolescent health behaviour) has also led to greater use and development of hierarchical modelling methods (see for example Leyland and Goldstein 2001). Critical to these more appropriate methodologies is the availability within the data set of individual, cluster and stratum identifiers.
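Why cluster identifiers matter can be shown with a small worked sketch comparing a naive standard error for a proportion with one that respects clustering by school class. This is an illustration only, not the procedure used by the study or by any particular package: it assumes an unweighted sample, a single stratum and the standard linearised between-cluster variance estimator, and the function name and the data are made up.

```python
import math
from collections import defaultdict


def proportion_with_cluster_se(records):
    """Estimate a proportion with naive and cluster-adjusted standard errors.

    records: iterable of (cluster_id, outcome) pairs, outcome being 0 or 1.
    Assumptions: equal weights, one stratum, clusters sampled independently.
    Returns (p, naive_se, cluster_se).
    """
    totals = defaultdict(lambda: [0, 0])  # cluster -> [sum of outcomes, pupils]
    for cluster_id, outcome in records:
        totals[cluster_id][0] += outcome
        totals[cluster_id][1] += 1

    n_clusters = len(totals)
    n_pupils = sum(m for _, m in totals.values())
    p = sum(y for y, _ in totals.values()) / n_pupils

    # Naive standard error, treating pupils as a simple random sample
    naive_se = math.sqrt(p * (1 - p) / n_pupils)

    # Linearised residual for each cluster, then between-cluster variance
    resid_sq = sum(((y - p * m) / n_pupils) ** 2 for y, m in totals.values())
    cluster_se = math.sqrt(n_clusters / (n_clusters - 1) * resid_sq)
    return p, naive_se, cluster_se


# Made-up data: four classes of five pupils whose answers agree within a class
records = [(c, v) for c, v in [("c1", 1), ("c2", 0), ("c3", 1), ("c4", 0)]
           for _ in range(5)]
p, naive_se, cluster_se = proportion_with_cluster_se(records)
print(p, naive_se, cluster_se)
```

In this deliberately extreme example pupils within a class answer identically, so the cluster-adjusted standard error is well over twice the naive one. Real analyses would normally rely on the survey modules of the packages cited above, which also handle weights and stratification.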

Discussion

This paper has attempted to demonstrate how an international research study, with limited central funding for international activities considering its size and complexity, has introduced a range of practical measures to address data quality issues in a sustainable and continuous manner, making use of the expertise available within the study. These measures have provided demonstrable benefits for data quality, including: the embedding of methodological issues and data quality within the study; countries being more confident of having high quality data and metadata to inform the development and monitoring of policies and programmes; and greater confidence in comparative analysis. That said, caution should still be exercised when interpreting the results from international studies.

The HBSC study has evolved over 20 years and is continuing to do so. The challenge is to adapt and develop, but to be pragmatic and realistic about possibilities. In developing quality standards for cross-national survey research, Lynn (2003) suggests that there are five approaches, ranging from maximum quality to constrained target quality. The former would require separate standards to be set for all participating countries, which for a study with the resources currently available to HBSC would not be feasible. The constrained target quality approach, on the other hand, attempts to raise quality standards by motivating countries to aim for the standards in those countries where they are highest, recognising that not all targets will be met, but ensuring minimum standards and consistency on key dimensions. This approach would seem to be consistent with the direction that HBSC has taken, recognising differences, but offering support and guidance in key areas to ensure consistency where at all possible. For example, countries are required to undertake a probability sample with similar levels of precision to produce optimal data for comparisons between countries, while having flexibility as to how this can best be achieved (Kish 1994).

Looking to the future, the focus on further validation work across countries will need to continue, particularly if the study expands further eastwards into Asia and beyond, as HBSC is primarily a European study. There is considerable interest from countries outside Europe and North America in using HBSC, and this presents some very interesting opportunities and challenges for expanding our knowledge on the possibilities for gathering cross-national and cross-cultural data. For example, HBSC is being used to help build capacity in understanding the health needs of young people in countries as diverse as those in the Pacific region (Phongsavan et al. 2005), Indonesia (Smet et al. 1999) and in some African countries (Gonçalves et al. 2005).

There will be a demand for analysis of time trends as the 2005/06 survey will represent the seventh wave of fieldwork for a small number of countries and the third or fourth for many more. Difficult decisions have been required in relation to leaving items unchanged in order to monitor trends versus continuous improvement of items, particularly as new evidence of validity/reliability is produced. This is a fundamental challenge for any long-standing study. In addition, there will be implications for improving access to documentation from previous surveys and exploring appropriate analytical techniques for assessing change through time. Benchmarking HBSC data quality standards against other cross-national surveys should also be explored, such as the European Social Survey (European Science Foundation 1999). The work of the Methods Development Group and the wider HBSC network will be key to delivering this agenda of continuous improvement in data quality.

Sizeable resources for comprehensive methodological work will be needed if state-of-the-art techniques for data improvement are to be developed and implemented at a pace that allows the HBSC study to be considered a model cross-national study and meet its full potential.