Introduction

Owing to unprecedented technological advances in our ability to “read” our DNA (our personal code of life), the time is rapidly approaching when we will each have our personal genome sequenced and available for a variety of health-related interrogations. The first human genome was sequenced at a cost of $3 billion and took thousands of scientists over 10 years to complete (International Human Genome Sequencing Consortium 2001, 2004; Venter et al. 2001). Less than 10 years after the first genome, any one of the many established genome sequencing centers in the world (of which there are 3 in Canada) can do the job in a few days for around $5,000. What other area of science and technology has undergone such a rapid evolution, where the cost of a significant operation has dropped by more than half a million fold within a ten-year period?

No wonder then that the biomedical world is abuzz with a profusion of potential applications for this now accessible technology. How will the enormous amounts of information generated through high throughput DNA sequencing be analyzed, and by whom? Who will own the data? And how on earth will we integrate this “new world” of medicine into an already stressed healthcare system? To answer these questions we need to understand what our personal genome can tell us about our individual health status at any one time and to what extent this information can inform us about our susceptibility to certain diseases later in life. Just as importantly, we have to understand what our genome sequence cannot now tell us about our destiny and what it will never be able to tell us. The degree to which our genes impact our health differs greatly depending on the condition or disease in question. As with any biological system, there is a wide spectrum of situations; the two ends of this spectrum are illustrated in Fig. 1.

Fig. 1 The spectrum of genetic contribution to disease spans a range from very rare diseases with strong genetically determined outcomes to more common chronic diseases with more complex genetic and environmental etiology.

At one end are the single gene disorders that are relatively rare in any given population (and the subject of this chapter). These include the majority of extremely rare diseases that affect perhaps one in several hundred thousand individuals, and a smaller number of more prevalent ones, like cystic fibrosis, certain forms of bleeding disorders (hemophilia) and Huntington disease, that affect one in a few thousand. For these diseases, the genetic component is the main (if not unique) driver of the disease. No matter what environmental factors are at play, if someone is unlucky enough (and, yes, it is a matter of which cards you have been dealt for life’s journey) to have a certain mutated gene for one of these disorders, then he or she will most likely have the disease.

At the other end of the spectrum are the more common chronic diseases, where many genes may cooperate to confer susceptibility to a disease. The disease will, however, manifest itself only if environmental components (including lifestyle choices) are added to the mix. An example is type 2 diabetes, which is driving healthcare costs to unsustainable levels in most developed countries. There is a genetic component to type 2 diabetes, but the disease will express itself preferentially in those who, for example, do not take regular exercise, whose nutrition is suboptimal, and who consume alcohol at above average levels (Moore and Florez 2008).

We will see later on how discoveries at one end of the spectrum can inform how we cope with complex diseases at the other end but, for now, let us concentrate on the very rare genetic diseases that affect from one in 50,000 to one in one million people. The causative mutations are referred to as highly penetrant, meaning that they almost always cause disease. Because a single gene is involved, these conditions are referred to as monogenic or Mendelian diseases; the latter term reflects the fact that the vast majority of these diseases respect the first laws of genetics laid down by the Austrian Augustinian monk Mendel (1865). Mendel worked for 8 years, using the garden pea as a model system, performing the fundamental experiments on which much of modern genetics is based.

In this chapter we will describe the outcome of a two-year program of research from a pan-Canadian network that brings pediatric clinicians, clinical geneticists, scientists and high throughput genome centers together. This project (called FORGE, see below) has already had a major impact on rare disease research. As we will show, FORGE has given Canada a front-line role internationally in the quest to solve the mysteries of rare diseases, many of which have been under investigation for decades without much progress toward their understanding. The new knowledge emanating from this initiative will not only help the affected families directly but will, we believe, have profound effects on how we establish an evidence-based approach to personalized health in Canada.

Technology

The rapid development in DNA sequencing and bioinformatic analyses of genomes has facilitated truly revolutionary progress in the rare disease field; the work summarized below was not conceivable even 5 years ago from either a financial or a technological viewpoint. So what have been the main technological breakthroughs that have enabled this paradigm shift? The so-called next-generation DNA sequencing technologies were developed shortly after, and as a direct result of, the Human Genome Project (HGP) (International Human Genome Sequencing Consortium 2001, 2004; Venter et al. 2001), a ten-year $3 billion effort that profoundly altered our fundamental understanding of the genome at a number of levels. First, given the complexity of the human organism, many had predicted that there would be over 100,000 (and perhaps several hundred thousand) genes in our genome, far more than the current estimate of 22,000–25,000 functional genes (just a few more than a nematode worm). It would thus appear that the far greater complexity of humans as compared to a nematode worm derives from the 95% of the genome outside the protein coding region. Second, when human genomes were compared with one another, it became clear that the number of repeated stretches of DNA sequence (i.e. copy number variants or CNVs) was much greater than anticipated (Feuk et al. 2006). From a practical and technical point of view it also became clear that, although very robust, the traditional Sanger DNA sequencing methodology, which had been the gold standard for several decades (and remains useful today for many specific uses), had limitations in terms of scalability and cost. If hundreds or thousands of human genomes were to be analyzed, and reference sequences developed for several key species with complex genomes such as wheat, conifer trees or salmon, new approaches were clearly needed.

Several groups thus started to work on developing scalable technologies capable of determining short stretches (or reads) of DNA sequence on a massively parallel scale, as well as the powerful bioinformatic tools needed to assemble these reads and align them with the reference human genome sequence (Bentley et al. 2008; McKernan et al. 2009). This field exploded from 2007 onwards; in the following 4 years the capacity to sequence DNA increased by 1,000 fold per sequencing run! Thus in 2007 it cost about $500,000 to sequence one human genome, and in 2011 it was possible for $10,000–20,000, a 25–50 fold reduction. Exomes (the portion of the genome which encodes protein and is reflected in the number of genes) could be sequenced for $3,000, meaning this technology was now available for the analysis of small cohorts such as those used to elucidate the causal mutations underlying rare monogenic diseases. Both Genome Canada and the Canadian Institutes of Health Research (CIHR) immediately saw value in initiating a partnership. The vision was to bring existing high throughput sequencing platforms and large scale genomics research supported by Genome Canada together with the extensive expertise and patient resources found in the large network of CIHR funded scientists and Canadian clinicians. The result has been a very productive partnership.
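The cost comparisons quoted in this chapter are simple ratios, and they are easy to check. The short sketch below is purely illustrative: the function name `fold_reduction` is our own, and the dollar figures are the approximate values given in the text, not precise accounting data.

```python
def fold_reduction(cost_before: float, cost_after: float) -> float:
    """Return how many times cheaper cost_after is than cost_before."""
    if cost_before <= 0 or cost_after <= 0:
        raise ValueError("costs must be positive")
    return cost_before / cost_after


# Approximate figures quoted in the chapter (dollars).
hgp_cost = 3_000_000_000          # first human genome (HGP)
genome_2007 = 500_000             # one genome in 2007
genome_2011_low = 10_000          # one genome in 2011, lower estimate
genome_2011_high = 20_000         # one genome in 2011, upper estimate
centre_cost = 5_000               # one genome at an established centre

print(fold_reduction(genome_2007, genome_2011_high))  # 25.0
print(fold_reduction(genome_2007, genome_2011_low))   # 50.0
print(fold_reduction(hgp_cost, centre_cost))          # 600000.0
```

On these round numbers, the 2007 to 2011 drop works out to the 25–50 fold range stated above, while the drop from the HGP to a $5,000 genome is about 600,000 fold.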

Since 2000, Genome Canada has invested heavily in both the technologies and large scale genomics projects, changing the landscape of Canadian science in this realm. Because of this decade-long sustained investment, Canada could hit the ground running when the opportunity came to tackle rare diseases. The Science and Technology Innovation Centres (STICs) focused on genome sequencing in Canada (one at McGill University in Montreal, one at the Hospital for Sick Children in Toronto and one at the BC Cancer Agency in Vancouver) had been at the cutting edge in the ten years since the HGP and were thus primed for this opportunity. In addition, because of their inherent skill and the fact that the Canadian health care system is organized around public single payer systems funded by provincial governments, the existing pediatric networks, medical geneticists and CIHR funded clinician scientists were rapidly able to provide a large number of extremely well phenotyped patients and families affected with rare, presumably monogenic diseases.

Recent Progress in Rare Disease Research in Canada

Collectively, rare diseases affect approximately 500,000 children in Canada, with an estimated annual cost to the health care system measured in the billions of dollars. Although the cause of a monogenic disease is simple (i.e. a single gene), the clinical manifestations are sometimes so complex that a clear diagnosis is very challenging. Affected families may thus spend years visiting many disciplines of medical practice and undergoing a myriad of clinical tests involving blood draws, tissue biopsies and sophisticated (and expensive) imaging technologies, often with an inconclusive result. This long and frequently non-productive journey is referred to as the diagnostic odyssey. However, the new genomics-based approach promises that, very soon, a patient presenting to the genetics clinic with features of a rare genetic disease will receive a rapid, comparatively inexpensive and accurate molecular diagnosis, a true revolution in the care of patients and families affected by these disorders. In Canada we have been gaining insight into this future reality through a rare disease initiative called FORGE Canada (Finding of Rare Disease Genes in Canada), initiated in April 2011.

FORGE is led by Drs Kym Boycott (Children’s Hospital of Eastern Ontario Research Institute, University of Ottawa), Jan Friedman (Children’s and Women’s Hospital, University of British Columbia) and Jacques Michaud (CHUM Sainte Justine, University of Montreal). It is supported by Genome Canada, the Canadian Institutes of Health Research, Genome British Columbia, Genome Quebec, the Ontario Genomics Institute and the McLaughlin Centre, Toronto. The early success of this initiative is due to four main strengths: (1) the scientific strength and inclusive nature of the team leadership, (2) the network of clinicians who have access to superbly defined clinical phenotypes and family history data reflective of a publicly funded health system, (3) intimate links with the Genome Canada funded Science and Technology Innovation Centres, which provide the latest cutting edge high throughput DNA sequencing and bioinformatic analysis of human genomes, and (4) a flexible funding model that allowed, in the first instance, CIHR and Genome Canada to launch an innovative call for proposals.

From the outset, it was decided (on a Canada-wide basis) which of the 350 diseases proposed by the more than 150 FORGE members would have the best chance of benefitting from the genome sequencing technology, with benefit defined as arriving at a molecular etiology for a particular disease. The inclusion and exclusion criteria were configured by a steering group of Canadian investigators, clinical geneticists and genomics experts. The result was a pipeline of approximately 200 disorders that were primarily subjected to whole-exome sequencing analysis, with a small subset of disorders undergoing a whole-genome sequencing approach. Currently over 100 genes have been identified as causal for the different diseases studied. About half of these genes are novel and have never before been associated with disease; the others frequently broaden our understanding of the clinical presentation of a given disorder. The proportion of hits (number of disorders solved relative to the total analyzed) is a remarkable 67%, the highest we are aware of. The utility of genomics-based diagnostic approaches can be demonstrated in patients who fall into three distinct categories:

  1. Patients with diseases for which a phenotype has been described and genes are known, but who are nonetheless undiagnosed, usually because their clinical features are atypical. In such cases the molecular diagnosis frequently broadens the established phenotype for the disease. These patients often undergo the diagnostic odyssey described above.

  2. Patients with diseases for which a phenotype is known but the causal gene is not.

  3. Patients with previously undescribed disorders for which there is neither a name nor a gene.

    • Category 1—Expansion of a phenotype associated with a disease gene:

The diagnostic odyssey is best captured in the story of two brothers from rural Canada who, over a 5-year period, went from specialist to specialist and from hospital to hospital, undergoing brain scans, muscle biopsies and metabolic tests, only to be informed every time that a diagnosis was not forthcoming. With the FORGE-based genome sequencing solution of the genetic riddle in 2011, the family’s life has changed significantly. The two affected siblings exhibited hearing difficulties early on and then motor neurological symptoms, leading to one brother being more or less confined to a wheelchair by the age of 15 years. Both brothers have now been definitively diagnosed with D-bifunctional protein deficiency (McMillan et al. 2012). They have a previously undescribed mild form of the disease (in most cases, survival beyond 2 years is unusual), a result of the type and distribution of the two recessive mutations, which made a diagnosis based only on clinical and biochemical assessment virtually impossible. Now the family can concentrate on managing their lives knowing exactly the cause and prognosis of their disease.

  • Category 2—Gene discovery for a known phenotype:

A FORGE team led by Dr. Jacques Michaud analyzed genomes from a French Canadian family affected with a subtype of Joubert syndrome (JBTS), a rare autosomal recessive neurological disorder with a distinctive diagnostic brain malformation. Over 20 clinical variants have been described and the genetic etiology is known for just over half. For the JBTS present in the French Canadian population, no clear causal association had been made with specific known genes, even though the syndrome was first described in Quebec families over 40 years ago by Dr. Marie Joubert (a Quebec pediatric neurologist). Drawing on eleven unrelated but clinically well documented families, including members of the family originally described by Dr. Joubert, Michaud and colleagues sequenced the exomes of fifteen individuals and discovered that mutations in the gene C5ORF42 were the cause of this syndrome in this geographic region. Since the publication of this work (Srour et al. 2012a, b) the same gene has been associated with cases of JBTS in Saudi Arabia and is now proposed as being a relatively common cause of JBTS world-wide.

A second example of this category is Floating Harbour Syndrome (FHS) (Pelletier and Feingold 1973), a rare condition characterized by short stature, delayed bone maturation and distinctive facial appearance. The unusual name reflects its first description by investigators from both Boston Floating Hospital and Harbor General Hospital (Torrance, CA). Many cases are sporadic, although a few parent to child transmissions have been documented, suggesting that FHS is an autosomal dominant disorder. Despite the general recognition that FHS is a distinct syndrome, little progress had been made in over 25 years toward identifying its underlying genetic cause. In a FORGE study (Hood et al. 2012) led by Dr. Kym Boycott, 13 unrelated patients were identified, and exome sequencing in 5 of them immediately revealed that the gene SRCAP was responsible. Targeted Sanger sequencing revealed that the same gene was mutated in the 8 other patients. Interestingly, the gene product of SRCAP is involved in chromatin remodeling, and another gene involved in this biological process (encoding CREB-binding protein) has been shown to be mutated in a similar rare disease, Rubinstein-Taybi syndrome. As an anecdote, Dr. M Feingold (the clinician who originally described FHS in 1973), on hearing about the discovery prior to publication, called Dr. Boycott in amazement at the finding and could not believe that this had been elucidated using DNA sequencing technology. A comprehensive review of the genotype-phenotype correlation in FHS, in collaboration with Dr. Feingold and involving over 50 patients, will be reported shortly.

  • Category 3—Gene discovery for a novel phenotype:

Microcephaly-capillary malformation (MIC-CAP) syndrome was described for the very first time in two patients by FORGE clinicians in 2011 (Carter et al. 2011). Once it was recognized as a distinct clinical entity, several additional patients were quickly reported internationally. MIC-CAP syndrome is a severe disorder characterized by microcephaly, intractable epilepsy, profound developmental delay and multiple small capillary malformations of the skin. The FORGE team led by Dr. Kym Boycott analyzed exome data from five patients with MIC-CAP syndrome and identified novel recessive mutations in STAMBP, a gene encoding the deubiquitinating (DUB) isopeptidase STAMBP (STAM-binding protein) that plays a key role in the recycling of cell surface receptors (McDonell et al. 2013). Within a very short period of time, the team had moved from a single Canadian patient, to an internationally recognized syndrome, to a gene that implicated a new area of biology in progressive neuronal loss.

Many more of the FORGE successes published to date can be found in the reference section (Samuels et al. 2013a, b; Moffatt et al. 2013; Fernandez et al. 2012; Schuurs-Hoeijmakers et al. 2012; Koenekoop et al. 2012; Lynch et al. 2012; Rivière et al. 2012; Bernier et al. 2012; Doherty et al. 2012; Lines et al. 2012; Gibson et al. 2012; Majewski et al. 2011). These examples illustrate the power of next generation DNA sequencing in elucidating the causes of rare genetic diseases. It has been suggested that over the next eight years the underlying cause of almost all Mendelian disorders will be solved. The only way to achieve this ambitious goal is to encourage international collaboration on a massive scale. As a first step towards this, the International Rare Disease Research Consortium, IRDiRC (www.irdirc.org), was created. Initially driven by the European Commission and the NIH, this initiative now involves over 30 public and private funders from France, Italy, Germany, the Netherlands, Spain, the EU, UK, USA, Australia, China and Canada. Indeed one of us (PL) will be assuming the chairmanship of the executive committee of IRDiRC in April 2013, and another of us (KB) co-chairs one of three scientific committees that advise the executive. This reflects Canada’s leading role in this field.

Families with rare diseases are immediately impacted by the results of this new approach. Molecular insight influences how they live their lives going forward: they know exactly what has caused their disease, what the prognosis might be and how best to manage complications, and they can make informed reproductive decisions. While knowing which biological pathway is perturbed may not lead to a cure for the families in the short term, it will hopefully pave the way for best practice guidelines and novel interventions for future patients. Indeed Beaulieu et al. (2012) have proposed the development of a strategic tool box and preclinical research pathway for inherited rare diseases. This may well lead to targeted therapeutic interventions using, for example, repurposing of already approved drugs so that patients can benefit as soon as possible after diagnosis. This is indeed the perfect model for personalized medicine.

Lastly, insight into the molecular etiology of rare genetic diseases will not only help the affected families but will also inform much more common diseases, contributing a wealth of knowledge to human biology and shedding light on how we are structured and function in both health and disease. For example, the discovery that mutations in the NOTCH2 gene are responsible for the rare Hajdu-Cheney syndrome (a disease exhibiting dramatic deficiencies in bone formation and degeneration) could provide great mechanistic insights into much more common forms of osteoporosis and may eventually give rise to better treatments for this complex disease (Majewski et al. 2011).

Implications for Personalized Medicine

The integration of new technologies such as genomics into complex health systems is a challenge that healthcare providers have yet to fully embrace. The main reasons for this are:

  1. The relative lack of good economic models for health technology assessment necessary for payers to see the value of proactive integration of genomics into the healthcare system.

  2. A limited receptor capacity within a healthcare system that is not optimally adapted to the efficient translation of new technologies as they mature.

  3. The lack of education and training of health care professionals in the field of genomics.

  4. The lack of robust harmonized health information systems needed to integrate high density data sets (omics) with detailed clinical phenotypic data while making them readily accessible to the end user.

Notwithstanding these barriers, we believe the work on rare diseases in Canada paves the way for more general integration of genomics technologies into the health system over time. Rare diseases represent the first true test case for personalized medicine and will, in our opinion, create the model of intervention for more common diseases in the future. We are learning through the rare disease program what genome sequencing can enable and, just as importantly, what some of its limitations will be. Being able to stratify patient groups according to genomic profile will allow more targeted clinical assessments to be carried out and should give rise to more efficient drug development processes and ultimately more effective therapies. We are also learning how to deal with discoveries that were unanticipated yet have clinical significance, the so-called incidental findings, and when and under what circumstances these should be communicated to patients and their families. Indeed one of the key integrated parts of the FORGE program is a study on what legal, ethical and social implications should be considered when whole genome sequencing is used to determine the root cause of a specific condition. How individuals are consented for these studies is critical, as the ramifications of discovering incidental findings of a clinically actionable nature can be far reaching for both individuals and their families. But it is very early days for personalized medicine: we are at the very beginning of the application phase and so, for some, the potential benefits to patients and to the system are hard to imagine. As Arthur Kornberg (Nobel laureate who discovered the enzyme that replicates DNA) used to say: “the future is invented, not predicted”.