Introduction

The Generation R Study is a population-based prospective cohort study from fetal life until young adulthood. The background and specific research projects of the study have been described in detail [16]. Briefly, the Generation R Study is designed to identify early environmental and genetic causes of normal and abnormal growth, development and health from fetal life until young adulthood. The study focuses on six areas of research: (1) maternal health; (2) growth and physical development; (3) behavioural and cognitive development; (4) respiratory health and allergies; (5) diseases in childhood; and (6) health and healthcare for children and their parents.

Main exposures and outcomes studied in the Generation R Study are new or well-known risk factors for common diseases in childhood or adulthood. As published in the European Journal of Epidemiology, many of these risk factors have been related to common outcomes throughout the life course such as pregnancy and early childhood outcomes [7–17], cardiovascular disease [18–50], diabetes [51–60], obesity [61–69], metabolic diseases [70–75], respiratory diseases [76–85], neurological diseases [86–90] and other common outcomes [91–94]. The main outcomes and exposures in the Generation R Study are presented in Tables 1 and 2. A detailed and extensive data collection has been conducted over the years, starting in the prenatal phase and currently continuing in childhood [4]. This data collection also comprises biological samples including blood, hair, faeces, nasal swabs, saliva and urine samples. These biological samples form the Generation R Study Biobank, which enables epidemiological studies focused on environmental exposures, genetic determinants and their interactions in relation to growth, health and development during fetal life, childhood and adulthood. We have previously described the design of this Biobank [5]. In this paper, we give an update of the collection, processing and storage of the biological samples collected during fetal life and childhood. In addition, we give an overview of currently available measures in these samples.

Table 1 Main outcomes per research area
Table 2 Main determinants

Study cohort

In total, 9,778 mothers were enrolled in the study. Of these mothers, 91 % (n = 8,879) were enrolled during pregnancy. Partners of mothers enrolled during pregnancy were invited to participate. In total, 71 % (n = 6,347) of all fathers were enrolled. Of all participating mothers, data are available from early pregnancy in 72 % (n = 7,069), from mid-pregnancy in 16 % (n = 1,594), from late pregnancy in 2 % (n = 216) and from the birth of their child in 9 % (n = 899). A total of 1,232 pregnant women and their children were enrolled in a subgroup of Dutch children for additional detailed studies until the age of 4 years. Of all eligible children in the study area, 61 % participated in the study at birth [4]. The largest ethnic groups are the Dutch, Surinamese, Turkish and Moroccan mothers.
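The percentages above follow directly from the reported counts. As a quick consistency check, the short sketch below recomputes them; the variable and function names are illustrative and are not part of the study's data dictionary.

```python
# Enrolment counts as reported in the text (mothers, by moment of enrolment).
enrolled = {
    "early pregnancy": 7069,
    "mid-pregnancy": 1594,
    "late pregnancy": 216,
    "birth": 899,
}

total_mothers = sum(enrolled.values())
prenatal = total_mothers - enrolled["birth"]  # enrolled during pregnancy


def pct(n, total):
    """Return n as a whole-number percentage of total."""
    return round(100 * n / total)


shares = {moment: pct(n, total_mothers) for moment, n in enrolled.items()}
prenatal_share = pct(prenatal, total_mothers)
```

The four rounded shares add up to 99 % rather than 100 % because each is rounded separately.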

Biological samples collection

A detailed overview of the complete data collection in mothers, fathers and children until the age of 9 years is given elsewhere [4]. Figure 1 gives an overview of the collection of biological samples. During pregnancy, biological materials have been collected in early, mid- and late pregnancy and at birth. All samples were collected during a visit to one of our dedicated research centers. The planned amounts of blood taken by antecubital venipuncture were 35 ml in early pregnancy and 20 ml in mid-pregnancy from the mother and 10 ml from her partner. When mothers were enrolled in mid- or late pregnancy, 35 ml of blood was taken at the first visit. Urine samples (65 ml) were added to the data collection between February 2003 and November 2005. Directly after delivery, midwives or obstetricians collected a maximum of 30 ml cord blood from the umbilical vein [5].

Fig. 1 Design and response of the biological sample collection until the age of 9 years. *Number of eligible subjects at enrolment reflects those participating in the study during pregnancy and who visited our research center in early, mid- or late pregnancy. **The number of urine samples at enrolment is lower since this data collection was added after starting the study. ***The number of eligible children at enrolment reflects the number of live born children of mothers who were enrolled in the prenatal phase of the study. ****The number of hair samples at the age of 5 years is lower since collection was added after starting the study (2 years later)

During the preschool period, children participating in the Dutch subgroup have been invited six times to a dedicated research center where we collected biological samples. At the age of 6, 14 and 24 months, a maximum of 20 ml blood was taken by antecubital venipuncture. At the age of 1.5, 6, 14, 24 and 36 months, nasal samples were collected. At the age of 6, 14, 24 and 36 months, saliva was collected in 1.5 ml Eppendorf tubes using a saliva collection device (ORACOL, Malvern Medical UK), frozen at −20 °C and stored at −80 °C. In addition, at the age of 14 months, parents were asked to collect five saliva samples at home using Salivette sampling devices (Sarstedt, Rommelsdorf, Germany).

From the age of 6 years, all participating mothers and children are invited to a well-equipped and dedicated research center in the Erasmus MC-Sophia Children’s Hospital every 3 years (age 6 years visit completed, age 9 years visit ongoing, and age 12 years visit planned) [4]. At the child’s age of 6 years, a maximum of 22.5 ml blood and a sample of hair (1–2 cm) were collected from both mother and child, and nasal swabs, saliva (1.5 ml) and urine (80 ml) samples were collected from children only. At the age of 9 years, we are currently collecting blood (maximum 22.5 ml), hair (1–2 cm) and urine (54 ml) from both mother and child, and nasal swabs from the child only. In a random group of 1,000 children we collect 2.5 ml of blood in a 2.5 ml PAXgene™ blood RNA tube (PAXgene Tubes, Becton–Dickinson). RNA will be isolated using a PAXgene Blood RNA Kit (Qiagen, Hilden, Germany). In addition, parents are asked to collect faeces from the children at home for intestinal microbiome analysis.

Logistics

Blood and urine

All biological samples are bar coded with a unique and anonymous laboratory number. Blood or urine samples collected at one visit have the same bar code with an additional unique tube number. All following steps in processing, storing and data management of the samples are linked to this unique tube number.

During the prenatal phase, all samples from the mother and father were taken by research nurses and temporarily stored at our research center or one of the obstetric departments at room temperature for a maximum of 3 h. The urine samples were stored at 4 °C and transported to the STAR-MDC laboratory for further processing within 24 h of receipt. All samples were transported to a dedicated laboratory facility of the regional laboratory in Rotterdam, the Netherlands (STAR-MDC) for further processing and storage. Cord blood samples were collected by a midwife or obstetrician, at home or at one of the hospitals in Rotterdam. Subsequently, courier services with 24/7 availability were responsible for transportation of cord blood samples to our laboratory within 2 h.

During childhood, all blood and urine samples collected at our research center in the Erasmus MC-Sophia Children’s Hospital are stored for a maximum of 4 h at 4 °C, with the exception of blood collected for further immunologic analysis (room temperature) or RNA isolation (2–4 h at room temperature before storage at −20 °C). Twice a day, blood and urine samples are transported to the STAR-MDC laboratory for further processing and storage.

After collection and transportation, blood and urine samples are centrally processed and stored at the STAR-MDC laboratory. Samples for DNA extraction are stored as EDTA whole blood samples at −20 °C. All collected EDTA plasma and serum samples are processed within 4 h after venous puncture. Processing takes 15 min in total. The samples are spun and the plasma and serum volume is distributed into 250 µl aliquots and transferred to 0.65 ml polypropylene tubes (Micronic) by a Tecan automatic liquid handler. Aliquot tubes from one plasma and serum sample are divided over four different microtiter trays and immediately stored at −80 °C. The four trays are divided over four different freezers with different power supplies at one location (STAR-MDC laboratory). Each Micronic tube is uniquely coded on the bottom with the Traxis® 2D code.
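The aliquoting scheme described above can be sketched as follows. This is a minimal illustration, assuming a round-robin assignment of aliquots to the four trays; the function name, the round-robin rule and the example volume are assumptions and do not describe the laboratory's documented procedure.

```python
ALIQUOT_UL = 250  # aliquot volume from the text (µl)
N_TRAYS = 4       # aliquots are divided over four trays, one per freezer


def plan_aliquots(sample_volume_ul):
    """Split a plasma or serum volume into full 250 µl aliquots and
    assign them to trays 1..4 in turn (assumed round-robin rule).

    Returns a list of (aliquot_number, tray_number) pairs.
    """
    n_aliquots = int(sample_volume_ul // ALIQUOT_UL)
    return [(i + 1, i % N_TRAYS + 1) for i in range(n_aliquots)]


# Hypothetical example: 2.1 ml of plasma gives 8 full aliquots, two per
# tray; the remaining 100 µl is not aliquoted in this sketch.
plan = plan_aliquots(2100)
```

Spreading the aliquots of one sample over trays in different freezers means that the failure of a single freezer or power supply never affects all aliquots of that sample.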

The urine samples were distributed manually into one 5 ml tube (only during pregnancy) and three 20 ml tubes. The 5 ml urine tube was sent to the Department of Medical Microbiology, Erasmus MC. The remaining urine tubes (20 ml) were stored at −20 °C at the STAR-MDC laboratory. From the age of 9 years onwards, the urine is aliquoted into 4.5 ml tubes and stored at −20 °C.

DNA

DNA from mothers, fathers and children has been isolated from whole blood EDTA tubes. DNA extraction from all children has been conducted manually using the Qiagen FlexiGene Kit (Qiagen, Hilden, Germany) [95]. DNA extraction, plating and normalization from 5 ml whole blood samples from the mothers and fathers were performed at the Human Genotyping and Sequencing Facility of the Genetic Laboratory at the Department of Internal Medicine, Erasmus MC, by a Hamilton STAR multi-channel robot using AGOWA magnetic bead technology.

Extracted DNA has been automatically collected in stock tubes (2D Matrix, Micronic) organized in 96-well format, which can be individually addressed. These stock samples are split into two tubes that are stored at different locations. One of the stocks is used for normalizing the DNA concentrations in 96 deep well (DW) plates. This protocol, as well as other manipulations of the DNA samples, is performed on a Caliper ALH3000 pipetting robot (8/96/384 channels) with a Twister module and a 96/384-well Tecan GENios plus UV reader. A random selection of 5 % of the total number of samples is put into separate 96 DW plates for control purposes. For normalization, sample and diluent volumes are automatically calculated and dispensed to obtain equal DNA concentrations for all samples (50 ng/µl). From these stock 96 DW plates, 384 DW plates are created, with 1 ng/µl per sample. Subsequently, replica PCR plates (384 wells) with a concentration of 2 ng/µl are created for genotype studies.
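The calculation of sample and diluent volumes follows the standard dilution relation C1 × V1 = C2 × V2. The sketch below illustrates it for the 50 ng/µl target; the function name, the final volume and the stock concentration in the example are hypothetical and are not taken from the robot's protocol.

```python
TARGET_NG_PER_UL = 50.0  # target concentration from the text


def normalization_volumes(stock_ng_per_ul, final_volume_ul,
                          target_ng_per_ul=TARGET_NG_PER_UL):
    """Return (sample_ul, diluent_ul) that dilute a stock to the target
    concentration in the requested final volume (C1 * V1 = C2 * V2)."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is below the target concentration; "
                         "it cannot be normalized by dilution")
    sample_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return sample_ul, final_volume_ul - sample_ul


# Hypothetical stock of 200 ng/µl diluted to 50 ng/µl in 100 µl:
# 25 µl of sample plus 75 µl of diluent.
sample_ul, diluent_ul = normalization_volumes(200.0, 100.0)
```

A stock below the target concentration cannot be brought to 50 ng/µl by adding diluent, which is why the sketch rejects it instead of returning a negative volume.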

Stock 96-well plates (DW and 2D Matrix, Micronic) are stored at −20 °C at the Genetic Epidemiology Laboratory, Erasmus MC. All genotyping and sequencing studies with Generation R DNA material are performed in-house at the Human Genotyping and Sequencing Facility of the Genetic Laboratory of the Department of Internal Medicine, Erasmus MC.

Other biological samples

Nasal swabs for bacterial cultures are transported in Amies transport medium to the Medical Microbiology laboratory, Erasmus MC within 6 h of sampling for further processing and storage.

Parents were asked to collect five saliva samples at home using Salivette sampling devices (Sarstedt, Rommelsdorf, Germany) at the age of 14 months in the Dutch subgroup and at the age of 6 years in all participating children. Parents received detailed written instructions with pictures concerning the saliva sampling. They collected five saliva samples during one single weekday: immediately after awakening, 30 min later, around noon, between 15:00 and 16:00 h, and at bedtime. The Salivettes were mailed to the Genetic Epidemiology Laboratory, Erasmus MC. Here, the samples were centrifuged and frozen at −80 °C. After completion of the data collection at the age of 14 months, all frozen samples were sent on dry ice in one batch by courier to the laboratory of the Department of Biological Psychology at the Technical University of Dresden for analysis. Saliva samples were transported to the Clinical Chemistry laboratory, Erasmus MC for storage.

Hair samples are stored at room temperature and transported to the laboratory of Internal Medicine, Erasmus MC for further processing and storage.

Microbiome analyses focused on the gut microbiome are planned, but other eligible samples, such as nasal swabs, saliva and urine, are collected as well. Faecal samples collected at home are mailed to the laboratory of the Department of Gastroenterology and Hepatology, Erasmus MC. DNA is isolated from the faecal samples, stored at −80 °C and used for subsequent 16S microbiome genetic analysis.

Response rates

Overall response rates for the collection of biological samples are presented in Fig. 1. To achieve high response rates, we collected blood samples from venipunctures that were already planned in routine care whenever possible. Blood for DNA extraction was available from 89, 82 and 61 % of the eligible mothers, fathers and children, respectively. The larger numbers of missing samples in fathers and children were mainly due to non-participation of fathers and logistical constraints during delivery, respectively. Absolute numbers of urine samples are lower than the absolute numbers of blood samples because the urine sample collection was performed during a limited period in the prenatal phase of the study. Efforts have been made to complete DNA collection in the children by collecting whole blood samples at the age of 6 years and now at 9 years. DNA is available for analysis in 78 % of all mothers, 76 % of all participating fathers and 58 % of all live born children.

At the age of 6 years, the overall response rate for blood collection in mothers was 89 %. In children the response was 69 %; non-response was due to lack of parental consent or unsuccessful venous punctures. The overall response was 97 % for urine samples, 96 % for nasal swabs and 97 % for saliva samples. The collection of hair started at a later stage. Collection of biological samples by the parents at home led to lower response rates (43 % for saliva swabs).

Available measures

Blood for phenotypes

Results of analyses performed in blood samples (EDTA plasma) for routine care for pregnant women (Hb, Ht, HIV, HBsAg, Lues, Rhesus factor and irregular antibodies) were obtained from midwife and obstetric registries [96–99]. Additional measurements performed in serum and plasma samples collected in mothers and children are shown in Table 3 [100–145].

Table 3 Available biomarkers in full cohort

Genomics data

All genomics data are generated at the Human Genotyping and Sequencing Facility of the Genetic Laboratory of the Department of Internal Medicine (www.glimdna.org), where the GWAS datasets of the Rotterdam Study, a prospective cohort study among over 10,000 adults, were also created [146–148]. The Generation R genomics data currently include GWAS data, DNA exome array data, DNA methylation data, and single polymorphism data (e.g., individual SNPs, VNTRs).

Genome wide association study (GWAS) database

Genetic data have been generated by a genome wide association scan (GWAS) using Illumina HumanHap 610 or 660 Quad chips (Illumina Inc., San Diego, USA), depending on time of collection. The GWAS dataset underwent a stringent QC process, which has been described in detail previously [3, 149]. Most GWAS analyses are strongly embedded in the Early Growth Genetics (EGG) Consortium and Early Genetics and Longitudinal Epidemiology (EAGLE) Consortium, in which several birth cohort studies combine their GWAS efforts focused on multiple outcomes in fetal life, childhood and adolescence. These efforts have already led to successful identification of various common genetic variants related to birth weight, birth length, infant head circumference, childhood adiposity, body mass index, bone mineral density, atopic dermatitis and other outcomes [149–167].

Single-nucleotide polymorphisms (SNPs)

As GWAS data are not yet available in parents, genotyping in mothers and fathers is mainly performed using the Taqman allelic discrimination assay (Applied Biosystems, Foster City, CA) and Abgene QPCR ROX mix (Abgene, Hamburg, Germany). To confirm the accuracy of the genotyping results, 276 randomly selected samples from the Generation R Study were genotyped a second time with the same method. The error rate was less than 1 %. Table 4 shows all available SNPs and VNTRs in mothers and fathers. In children, individual genotype data are extracted from the genome wide Illumina 610 or 660 Quad chips. If SNPs were not directly genotyped, we either imputed genotypes with MACH (version 1.0.15) software, using HapMap II CEU (release 22) as the reference set, or genotyped the SNPs using the same method as in the parents. SNPs were used for various candidate gene, replication and Mendelian randomization studies [120, 143, 168–201].

Table 4 Available single-nucleotide polymorphism (SNPs) and variable number of tandem repeats (VNTRs) in mothers and fathers

Exome array variant database

Variants in exomes were measured using an Illumina exome chip v1.1 array in DNA samples of a subgroup of ~1,000 Dutch children. The array contains 270,000 genetic variants in exonic sequences and can be used to study common and rare coding variants alongside the GWAS data.

Epigenome wide association study (EWAS) database

DNA methylation was measured on a genome wide level in cord blood samples in a subgroup of ~1,000 Dutch children using the Illumina 450 K Infinium BeadChip (Illumina Inc., San Diego, USA), which contains 485,553 methylation sites at a single nucleotide resolution. Quality control of analyzed samples was performed using standardized criteria. Samples were excluded in case of low sample call rate (<99 %), colour balance >3, low staining efficiency, poor extension efficiency, poor hybridization performance, low stripping efficiency after extension and poor bisulfite conversion. Current analyses are based on cord blood samples. We also plan to measure DNA methylation at different ages.
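The two numeric exclusion criteria above can be expressed as a simple filter. The sketch below is illustrative only: the record layout and field names are hypothetical, the qualitative criteria (staining, extension, hybridization, stripping and bisulfite conversion performance) are collapsed into a single assumed boolean flag, and it does not reproduce the study's actual QC pipeline.

```python
MIN_CALL_RATE = 0.99      # samples with a call rate < 99 % are excluded
MAX_COLOUR_BALANCE = 3.0  # samples with a colour balance > 3 are excluded


def passes_qc(sample):
    """Return True if a sample record meets the stated criteria."""
    return (sample["call_rate"] >= MIN_CALL_RATE
            and sample["colour_balance"] <= MAX_COLOUR_BALANCE
            and sample["controls_ok"])  # assumed flag for control-probe checks


# Hypothetical records; identifiers and values are made up for illustration.
samples = [
    {"id": "S1", "call_rate": 0.998, "colour_balance": 1.2, "controls_ok": True},
    {"id": "S2", "call_rate": 0.985, "colour_balance": 1.1, "controls_ok": True},
    {"id": "S3", "call_rate": 0.995, "colour_balance": 3.4, "controls_ok": True},
    {"id": "S4", "call_rate": 0.999, "colour_balance": 0.9, "controls_ok": False},
]
kept = [s["id"] for s in samples if passes_qc(s)]
```

In this made-up example only S1 is retained: S2 fails on call rate, S3 on colour balance and S4 on the control-probe flag.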

Candidate gene methylation

DNA methylation was assessed at different loci in genomic DNA isolated from cord blood samples and has been used to study the association with ADHD and fetal and infant growth [202, 203]. Candidate genes for methylation were selected on the basis of their potential involvement in neurotransmitter systems and neurodevelopment. Additionally, regions of the IGF2DMR and the H19 gene that are implicated in fetal growth were selected. Isolated genomic DNA (500 ng) from cord blood samples was treated with sodium bisulphite for 16 h using the EZ-96 DNA methylation kit (Shallow) (Zymo Research, Irvine, CA, USA). This was followed by PCR amplification, fragmentation after reverse transcription and analysis on a mass spectrometer (Sequenom, Inc, San Diego, USA). This generated mass spectra that were translated into quantitative DNA methylation levels of different CpG sites by MassARRAY EpiTYPER Analyzer software (v1.0, build 1.0.6.88, Sequenom, Inc, San Diego, USA).

Samples were randomly divided over bisulphite conversion and PCR amplification batches. For each individual, the assays were amplified from the same bisulphite-treated DNA. All methylation measurements were done in triplicate from the same bisulphite-treated DNA.

Saliva

Salivary cortisol concentrations were measured using a commercial immunoassay with chemiluminescence detection (CLIA; IBL Hamburg, Germany). Various studies have used these measurements to examine risk factors in relation to infant cortisol rhythms and stress reactivity [204–208]. Samples collected at the age of 6 years will be processed at a later stage.

Nasal swabs

Nasal swabs were taken with rayon tipped dacron pernasal swabs (Copan Italia, Brescia, Italy), transported in Amies transport medium and plated within 6 h of sampling on a blood agar plate with 5 % sheep blood, a chocolate agar plate and a Haemophilus selective agar plate. The plates were kept at 35 °C in a CO2-rich environment for 2 days. Bacterial growth was determined daily. All bacteria were identified by standard methods. Various studies have examined colonization of the nasopharynx by potential pathogens [122, 123, 125, 126, 209–216]. At the age of 9 years, all swabs are stored at −80 °C after collection and will be processed at a later stage.

Urine

In urine samples collected during pregnancy, levels of various environmental exposures, such as organophosphorous pesticides, bisphenol A and phthalates, are measured to study their effect on fetal development; the same measurements are planned in samples obtained at the age of 6 years [217–220]. To validate the use of cannabis, urine was tested for the presence of 11-nor-Δ9-THC-9-COOH using the DRI® Cannabinoid Assay (Microgenics) [221]. To determine C. trachomatis infection during pregnancy, urine samples were tested by PCR [222]. In children at the age of 6 years, kidney function was determined by creatinine and albumin levels using the Beckman Coulter AU analyzer; creatinine levels were measured according to the Jaffe method [135–137]. Dietary intake of iodine during pregnancy was assessed by measuring iodine in urine [111, 223, 224].

Data management and privacy protection

All blood and urine samples stored at the STAR-MDC laboratory are registered in Labosys (Philips) [225]. Both the anonymous, person-unique study number and all sample numbers are registered, which enables matching of each sample to a study subject. A backup of this registration is available at the Erasmus MC. All other laboratories use different systems for registration, in which laboratory number, tube number and date of processing are registered. Our data manager receives a backup of these registrations every 3 months, is responsible for creating all selection lists, and registers the retrieval of samples, the number of samples and remaining total volume in storage, and the number of freeze counts. The sample numbers are removed from the dataset before the results become available to the researchers. The dataset for researchers includes a subject-unique identification number that enables feedback about a subject's sample to the data manager but does not enable identification of that particular subject.
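The separation between laboratory sample numbers and the research dataset can be illustrated as below. This is a sketch under stated assumptions: the field names, identifiers and values are hypothetical, and the actual registration systems are not modelled.

```python
def release_for_researchers(records):
    """Drop laboratory sample and tube numbers, keeping only the
    subject's unique study number and the measured results."""
    hidden = {"sample_number", "tube_number"}
    return [{key: value for key, value in record.items() if key not in hidden}
            for record in records]


# Hypothetical laboratory records; all identifiers and values are made up.
lab_records = [
    {"study_id": "R0001", "sample_number": "L123", "tube_number": 4,
     "crp_mg_l": 1.8},
]
released = release_for_researchers(lab_records)
```

Because only the study number remains, a researcher can report a question about a subject's sample to the data manager without being able to trace the physical tube or the person.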

Collaboration

The Generation R Study is not the only birth-cohort study with an extensive collection of biological samples. Generation R collaborates with several other cohort studies for replication or meta-analyses.

The Generation R Study has an open policy with regard to collaboration with other research groups. Requests for collaboration and use of biological samples should primarily be addressed to Vincent Jaddoe (v.jaddoe@erasmusmc.nl), Principal Investigator of the Generation R Study. These requests are discussed in the Generation R Study Management Team regarding their scientific merits, study aims, overlap with ongoing studies, logistic consequences and financial contributions. The general policy is that collaborating researchers and groups are themselves responsible for the financial requirements. Laboratory tests are preferably conducted in the Erasmus Medical Center, but this is not required. After approval of the project by the Generation R Study Management Team and the Medical Ethical Committee of the Erasmus Medical Center, the collaborative research project is embedded in one of the research areas supervised by the specific principal investigator.