Leading Edge
Essay
Distilling Pathophysiology
from Complex Disease Genetics
Aravinda Chakravarti,1,* Andrew G. Clark,2 and Vamsi K. Mootha3
1 Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
2 Cornell University, Ithaca, NY 14850, USA
3 Massachusetts General Hospital, Boston, MA 02114, USA
*Correspondence: aravinda@jhmi.edu
http://dx.doi.org/10.1016/j.cell.2013.09.001
Technologies for genome-wide sequence interrogation have dramatically improved our ability to identify loci associated with complex human disease. However, a chasm remains between correlations and causality that stems, in part, from a limiting theoretical framework derived from Mendelian genetics and an incomplete understanding of disease physiology. Here we propose a set of criteria, akin to Koch’s postulates for infectious disease, for assigning causality between genetic variants and human disease phenotypes.
...Thus it is easy to prove that the wearing of tall hats and the carrying of umbrellas enlarges the chest, prolongs life, and confers comparative immunity from disease; for the statistics show that the classes which use these articles are bigger, healthier, and live longer than the class which never dreams of possessing such things. It does not take much perspicacity to see that what really makes this difference is not the tall hat and the umbrella, but the wealth and nourishment of which they are evidence, and that a gold watch or membership of a club in Pall Mall might be proved in the same way to have the like sovereign virtues...
George Bernard Shaw, The Doctor’s Dilemma (Preface), 1909

Distinguishing correlation from causality is the essence of experimental science. Nowhere is the need for this distinction greater today than in complex disease genetics, where proof that specific genes have causal effects on human disease phenotypes remains an enormous burden and challenge. Given the potential scientific and medical payoffs of disease gene discovery (Chakravarti, 2001), we argue in this Essay of the need for a rigorous examination of the assumptions under which we connect genes to phenotypes. This is particularly so in this age of routine -omic surveys, which can produce more false-positive than true-positive findings (Kohane et al., 2006). Moreover, genomic mapping and sequencing approaches that are invaluable for producing a list of unbiased candidates are, by themselves, insufficient for implicating specific gene(s) in a disease or biological process. Consequently, we suggest that specific genetic criteria, analogous to Koch’s postulates in microbiology, need to be satisfied in order to promote the role of one or more genes as being "causal," rather than just "associated," in a disease process (Brown and Goldstein, 1992; Falkow, 1988, 2004) (Box 1).

Below we discuss the nature of the "proof" that we desire in order to make fundamental discoveries in human pathophysiology. We admit at the outset that the answers are not straightforward, and that there are serious technical and intellectual impediments to demonstrating causality for the common complex disorders of man where multiple interacting genes are involved. We acknowledge that even unproven candidate genes may lead to significant insight into disease pathophysiology. Nevertheless, the casual conflation of "mapped locus" to "proven gene" is a constant source of confusion and obfuscation in biology and medicine that requires remedy. We hope to offer some concrete suggestions, however difficult they may be to satisfy, because incorrect knowledge is worse than no knowledge at all (Brown and Goldstein, 1992).

Consider that two types of genomic surveys, one horizontal and the other vertical, are now routine for attempting to understand human biology and disease. In horizontal or broad surveys, we can obtain the full genome sequence in tens to hundreds of thousands of individuals to sort out which genomic segments are important and which are innocent bystanders, to a particular comparison between individuals, such as those with versus without coronary artery disease or cases with early versus late onset of dementia. In contrast, in vertical or deep surveys, we examine the effects of the genome as the DNA information gets processed, and its encoded functions get executed through its transcriptome, proteome, and effectors such as the metabolome. Both of these classes of studies are relevant to analysis of a disease of unknown etiology and have re-emphasized the long-held suspicion that studying genes one-at-a-time may not be meaningful because a gene’s effect is usually pleiotropic, context dependent, and contingent upon the state of many other genetic and nongenetic factors (Chin et al., 2012). In turn, this implies that proving a gene’s specific role in a biological process, either in wild-type or mutant form, may not be straightforward because its role may only be evident when examined in relation to its
Cell 155, September 26, 2013 ©2013 Elsevier Inc. 21
biochemical partners, and in particular contexts of diet, pathogen exposure, etc. (Zerba et al., 1996). This is a particular problem in genetic studies of any outbred nonexperimental organism, such as the human, and studies of human disease, where investigations are observational not experimental. It is the strong belief of contemporary human geneticists that uncovering the genetic underpinnings of any disease, however complex, is the surest unbiased route to understanding its pathophysiology and, thus, enabling its future rational therapies (Brooke et al., 2008). Consequently, for this view to prevail, we should require experimental evidence, be it in cells, tissues, experimental models, or the rare patient, for the role of a specific gene in a disease process. We discuss here the types of evidence that we consider incontrovertible.

Box 1. Koch’s Postulates for Complex Human Diseases and Traits
(1) Candidate gene variants are enriched in patients.
(2) Disruption of the gene in a model system gives rise to a model phenotype that is accepted as relevant and "equivalent" to the human phenotype.
(3) The model phenotype can be rescued with the wild-type human alleles.
(4) The model phenotype cannot be rescued with the mutant human alleles.

Success in this difficult task requires us to solve a logical conundrum: how can we understand the genes underlying a phenotype if some of these component factors, in isolation, do not have recognizable phenotypes on their own? We know that even in a simple model organism, budding yeast, synthetic lethality, where death or some other phenotype occurs only through the conspiracy of mutations at two different genes, is widely prevalent (Costanzo et al., 2010). Interactions of greater complexity and involving more than two genes are also known in yeast (Hartman et al., 2001) and must be true for humans as well. A human genome will typically harbor 20 genes that are fully inactivated, without any overt disease phenotype, presumably due to the buffering by other genes (MacArthur et al., 2012). Acknowledging this complexity, there are two general ways forward. First, at this stage of our knowledge, perhaps we should not worry about "all" of the genes in a disease, in many ways an undefinable goal, but rather those whose effects are demonstrable, i.e., through a mutation that, irrespective of its interactions, can by itself affect a critical pathway. Second, as we unravel the effects of multiple genes on a phenotype, we should advance the same criterion, namely, that a set of mutations affects that same critical process. Both of these goals are approachable, particularly with recent advances in genome-editing technologies that allow the creation of multiple mutations within a single experimental organism (Wang et al., 2013). The question then is how "complex" are complex traits and diseases?

The New Genetics: Understanding the Function of Variation
With the rediscovery of Mendel’s rules of transmission more than 100 years ago, there was a vicious debate on the relative importance of single-gene versus multifactorial inheritance (Provine, 1971). Geneticists quickly, and successfully, focused on deciphering the specific mechanisms of gene inheritance and understanding the physiology of the gene in lieu of answering why some phenotypes had complex etiology and transmission. Nevertheless, the rare examples of deciphering the genetic basis of complex phenotypes, such as for truncate (wing) in Drosophila (Altenburg and Muller, 1920), clearly emphasized that traits were more than the additive properties of multiple genes. Today, it is quite clear that Mendelian inheritance of traits, including diseases, is the exception not the rule. Nevertheless, the entire language of genetics is in terms of individual genes for individual phenotypes, with one function, rather than the ensemble and emergent properties of genomes. This absence of a specific genetics language for the proper description of the multigenic architecture of traits (the ensemble) remains as an impediment to our understanding of the nature and degree of genetic complexity of the phenotype.

The case of amyotrophic lateral sclerosis (ALS), a devastating, progressive motor neuron disease, illustrates this point (Ludolph et al., 2012). Despite the lack of evidence, we largely describe ALS as being "heterogeneous" and comprised of single-gene mutations that can individually lead to disease. In 1993, mutations in superoxide dismutase 1 (SOD1) were identified in an autosomal-dominant form of the disease; subsequently, the disorder has become synonymous with aberrant clearance of free radicals as its central pathology. What is often not appreciated, however, is that fewer than 10% of all cases of ALS are familial and even fewer follow an apparent Mendelian pattern. Even within this subset of cases, more than 20 distinct genes, spanning other pathways including RNA homeostasis, have been identified, and SOD1 represents a minority of cases. The molecular etiology for the majority of the sporadic forms of the disease remains unclear, and the scientific problem in understanding ALS is more than simply identification of additional genes. We may ask, can SOD1 and the other described gene mutations lead to ALS by themselves? Are these the key rate-limiting steps to ALS or simply one of several required in concert? Is the aberrant clearance of free radicals the fundamental defect or one of many such pathologies or a common downstream consequence? Given the diversity and number of deleterious, even loss-of-function, genetic variants in all of our genomes (Abecasis et al., 2012; MacArthur et al., 2012) and, in the absence of stronger evidence bearing on these questions, it is fair to assume that ALS patients harbor multiple mutations with a plurality of molecular defects and that free radical metabolism is only one of a set of canonical pathophysiologies that define the disease. No doubt, this plurality is the case for cancer (Vogelstein et al., 2013), Crohn’s disease (Jostins et al., 2012), and even rare developmental disorders such as Hirschsprung disease (McCallion et al., 2003). In all of these cases, a richer genetics vocabulary may improve our understanding of the phenotypes through recognizing what we know and what we don’t; our current language limits us to describing genes not phenotypes.
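The questions raised for SOD1 are the kind that the criteria in Box 1 are meant to settle. As a purely illustrative sketch, and not a method given in the Essay, the four criteria can be read as a conjunctive checklist: a gene is promoted from "associated" to "causal" only when every one is met. The class name, field names, and function names below are hypothetical labels chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CausalityEvidence:
    """Evidence for one candidate gene, one flag per criterion in Box 1.

    All names here are hypothetical; they are not taken from the Essay.
    """
    enriched_in_patients: bool              # (1) variants enriched in patients
    disruption_gives_model_phenotype: bool  # (2) model phenotype equivalent to the human one
    wild_type_allele_rescues: bool          # (3) wild-type human allele rescues the model
    mutant_allele_fails_to_rescue: bool     # (4) mutant human allele does not rescue


def unmet_criteria(evidence: CausalityEvidence) -> List[int]:
    """Return the Box 1 criterion numbers that are not yet satisfied."""
    flags = [
        evidence.enriched_in_patients,
        evidence.disruption_gives_model_phenotype,
        evidence.wild_type_allele_rescues,
        evidence.mutant_allele_fails_to_rescue,
    ]
    return [number for number, met in enumerate(flags, start=1) if not met]


def satisfies_postulates(evidence: CausalityEvidence) -> bool:
    """True only when all four criteria are met."""
    return not unmet_criteria(evidence)


# A locus supported by statistical enrichment alone: "mapped", not "proven".
mapped_only = CausalityEvidence(
    enriched_in_patients=True,
    disruption_gives_model_phenotype=False,
    wild_type_allele_rescues=False,
    mutant_allele_fails_to_rescue=False,
)
print(unmet_criteria(mapped_only))
```

Reporting the unmet criteria, rather than a single yes or no, keeps visible which experiments are still missing for a given candidate gene.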
Molecular biology, genetics’ twin, on the other hand, appears to have been far more successful in deciphering and describing not only its individual components (e.g., DNA, RNA, protein) but also their mutual relationships (e.g., DNA-protein interaction) and ensembles (e.g., transcriptional complex), although this is also far from complete (Watson et al., 2007). Not only do we understand the structure of individual genes and how their molecular functions get executed, but we are also starting to learn how functions get regulated through a diversity of cis- and trans-acting functions. The consequences of the primary and interaction effects are often well understood, even though not completely described, at both the molecular and cellular levels (Alberts et al., 2007). There are also improving technologies and understanding of the structures and functions of ensembles of proteins and cells, and how these interact and communicate with one another to create complexity (Ilsley et al., 2013). Although the use of genetic tools and genetic perspectives are fundamental to this progress, these advances have not as yet led to a major revision of our understanding of trait or disease variation. The major reason for this discrepancy is that, with few exceptions (Raj et al., 2010), molecular and cell biology has focused on the impact of deleting or overexpressing genes and not grappled with the consequences of allelic variation.

Classical Mendelian genetics has been a boon to uncovering biology from yeast to humans whenever a mutation with a simple inheritance pattern can be isolated. This approach has been revolutionary in the unicellular yeast, particularly because genetics (and gene manipulation), biochemistry, and cell biology were melded to understand function at a variety of levels. This kind of multilevel approach has been less straightforward, but still largely successful, for a metazoan such as Drosophila where more genes and multiple specialized cells often rescue the effects of a mutation or enhance its minor effect. These lessons suggest to us that the current approach, based strictly on genetic variation, to understanding complex human disease is also grossly insufficient and, as in yeast and flies, will require the contemporaneous analysis of the molecular biology, biochemistry, and physiology of the genes within a mapped locus to even identify the disease gene, let alone understand its functions. Success in this endeavor will require a synthesis of many biological disciplines that includes the role of genetic variation as intrinsic to the biological process, not an aspect to be ignored.

Consequently, melding variation-based genetic and molecular biological thinking is of critical importance for both fields and is central to our understanding of mechanisms of trait variation, including interindividual variation in disease risk. If most disease, in most humans, is the consequence of the effects of variation at many genes, then knowledge of their functional relationships, rather than merely their identities, is central to understanding the phenotype. This is clearly a problem of "Systems Biology" but one that incorporates genetic variation directly. The ability to integrate the realities of such widespread genetic variation, which are ultimately at the causal root of disease mechanisms, with systems biology approaches to understand functional contingencies is central to the challenge of deciphering complex human disease. Importantly, it is likely to spur new thinking in both fields.

Genetic Dissection of Complex Phenotypes
Genetic transmission rules imply that, even in an intractable species such as us, one can map genomic segments that must contain a disease or trait gene. The lure and success of this method is that we can map a disease locus in the absence of any knowledge of the underlying biology of the phenotype. Such mapping requires identification of the segregation of common sites of variation across the genome, now easy to identify through sequencing, and recognition of a genomic segment identical-by-descent in affected individuals, both within and between families. This task has become easier and more powerful as sequencing technology has improved to provide a nearly complete catalog of variants above 1% frequency in the population; further improvements to sample rarer variants are ongoing (Abecasis et al., 2012). Consequently, genetic mapping, once the province of rare Mendelian disorders, is now applicable to any human trait or disease. In fact, more than 2,000 confirmed loci, each containing multiple genes, affecting susceptibility to more than 100 medically relevant traits (e.g., blood pressure) and disease (e.g., hypertension) are now known (Hindorff et al., 2009). For most complex traits examined, many such loci have been mapped, but the vast majority of the specific genes remain unidentified. We can sometimes guess at a candidate gene within the locus (Jostins et al., 2012), sometimes implicate a gene by virtue of an abundance of rare variants among affected individuals (Jostins et al., 2012), in rare circumstances, use therapeutic modulation of a pathway to pinpoint the gene (Moon et al., 2004), and sometimes identify one by painstaking experimental dissection (Musunuru et al., 2010), but, generally, identification of the underlying gene has not become easier. In fact, most of the mapped loci underlying complex traits remain unresolved at the gene or mechanistic level.

Despite the beginning clues to human disease pathophysiology that complex disease mapping is providing, and the slow identification of individual genes, it appears highly unlikely that we can understand traits and diseases this way. There is indeed evidence for scenarios in which variation in complex traits, including risk of complex disease, is mediated by a myriad of variants of minute effect, spread evenly across the genome (Yang et al., 2011). Therefore, we need other approaches to override this bottleneck.

For Mendelian disorders, gene identification within a locus is made possible by each mutation being necessary and sufficient for the phenotype, being functionally deleterious and rare, and having an inheritance pattern consistent with the phenotype. It’s the mutation that eventually reveals the biology and explains the phenotype. Any component locus for a complex disease has no such restriction, as the causal variants are neither necessary nor sufficient, nor coding (in fact, they are frequently noncoding and regulatory) nor rare (Emison et al., 2010; Jostins et al., 2012). Currently, the major attempts to overcome this impediment involve reliance on single severe mutations at the very same component genes and
demonstrating Mendelian inheritance of the same or similar phenotype, and/or identifying single genes with a demonstrable excess of rare coding variants. The first of these two strategies is a strong unproven hypothesis and probably not universally true, whereas the second relies on very large sample sizes of patients and suffers from the unknown functional effect of the majority of rare coding variants. Consequently, these strategies themselves depend on the hidden biology we seek and are applicable only to the most common human diseases. It appears to us that ignorance of biology has become rate limiting for understanding disease pathophysiology, except perhaps for the Mendelian disorders. There are two ways to get out of this vicious cycle (Figure 1).

Figure 1. Complementary Approaches Necessary for Proving Genetic Causality and Understanding the Pathophysiology of Complex Disease
Genetic association studies in humans can synergize with prior knowledge and systems-level quantitative analysis to generate predictions of what pathways and modules are disrupted, where (anatomically), and when (developmentally) to yield a specific morphological or biochemical phenotype. These predictions can then be tested in an appropriate model system while adhering to the postulates outlined in Box 1.

One approach may be to use a set of model traits and diseases and employ their existing mapped loci to identify a small set of the component genes by brute-force (or, luck) and use the uncovered biology to infer which other genes in their "pathways" can explain the disease. This approach has been highly profitable in Crohn’s disease, a common inflammatory disorder whose root causes remained cryptic until genome-wide association studies identified a large number of loci with fundamental defects in mucosal immunity (Graham and Xavier, 2013), but not in type 2 diabetes, where the pathophysiology awaits clarification (Groop and Pociot, 2013). Although we suspect that the numbers of pathways involved are fewer than the numbers of genes involved, this is merely suspicion. Nevertheless, can we reduce the complexity of the problem by identifying all of the relevant pathways? Despite uncertainty, this approach has the advantage of leading to specific testable hypotheses.

The second approach is to focus research on why the disease is complex in the first place. Although the genome is linear, its expression and biology are highly nonlinear and hierarchical, being sequestered in specific cells and organelles (Ilsley et al., 2013). Understanding this hierarchy, the province of systems biology, is critical to the solution of the complex inheritance problem (Yosef et al., 2013). Even more importantly, this approach might, through the effect of mutations, allow us to decipher cell circuitry and understand which pathways are limiting and which are redundant. This last aspect is critical: as we argue below, with our current state of knowledge, we are likely to have our greatest success with understanding how genes map onto pathways, and how pathways map onto disease, before a true quantitative understanding of disease biology emerges. One might counter that existing gene ontologies do precisely that, but, even in yeast, this appears to be highly incomplete (Dutkowski et al., 2013).

Proving Causality: Molecular Koch’s Postulates
The evidence that a specific gene is involved in a particular human disease has historically been nonstatistical and based on our experience with identifying mutations in Mendelian diseases. The chief criteria have been to demonstrate cosegregation with the phenotype in families, exclusivity of the mutation to affected individuals (rare alleles absent in controls), and the nature of the mutation (a plausibly deleterious allele at a conserved site within a protein). Unfortunately, as already mentioned, all of these rules break down in complex phenotypes where neither cosegregation nor exclusivity to affecteds nor obviously deleterious alleles are likely; moreover, many mutations are suspected to be noncoding and in a diversity of regulatory RNA molecules. Consequently, statistical evidence of enrichment has been the mainstay, but this has two negative consequences: first, scanning across the genome or multiple loci covering tens to hundreds of megabases requires very large sample sizes and very strict levels of significance to guard against the many expected false-positive findings; second, genetic effects that are small or genes with only a few causal alleles are notoriously difficult to detect, although they may be very important to understanding pathogenesis. This difficulty translates into a low power of detection, as common disease alleles cannot be distinguished from bystander associated alleles, whereas rare alleles are observed too infrequently to provide statistical significance. Consequently,