2 DB responses, 50 words each. Write them in your own words. Read the article again if needed. This is very simple, and you don’t have to go into too much detail. Here is an example of how you should write the responses (DO NOT COPY IT): Example: Hi, This was an amazing article! I was also intrigued by the depth and quality of information that was provided. I agree that the RiskOGram algorithm is another fascinating tool, and I am amazed how they can relate everything to the iPOPs. I wouldn’t want to get blood drawn on a regular basis either! I am also all for supporting research, but this study takes a large toll.


Respond to the two discussion board posts.
Minimum 50 words EACH; max 100 words.
This should be a simple response in your own words, and it should correlate with your
initial discussion board post.
You can refer back to the article in order to write the responses.
Initial post that you wrote:
Hi Class!
What surprised me in the article is how powerful the advent of genome sequencing, combined
with the analysis of physiological states, has proven to be. This is shown by the generation
of a high-quality genome sequence using a variety of different technologies. The genomic DNA
was subjected to deep WGS using technologies from Complete Genomics and Illumina. The other
thing that surprised me is how exome sequencing was done with different technologies,
alongside analysis using genotyping arrays and RNA sequencing.
Personally, I have the curiosity of knowing more in relation to what was found from the study
since some facts and figures were written down. I would also give some of my samples to be
tested but under the condition that the information would be kept private and confidential.
This would help me to understand the study even more. I would also love to know the signs
and symptoms of some diseases and if the diseases had some cure or they could not be
cured. I would also want to know the disease states and the personal variants for the RNA
Respond to these two posts: Minimum 50 words EACH; max 100 words
Post 1: After reading the Snydrome article, I found two aspects personally interesting. The
first interesting find from this article was that the researcher was able to use a single
algorithm, called the RiskOGram, to assess genetic disease risks. This algorithm combined
many alleles that were linked to the disease risk to come up with the specific diseases and
disorders the patient was at high risk for. The patient in this case had a higher risk for
coronary artery disease and a much higher risk of basal cell carcinoma, hypertriglyceridemia,
and type II diabetes. As scary as these risks are, I would rather know so I could prepare myself
and understand the preventative measures that could be taken.
Another interesting point I found was when the researcher observed a certain mutation in the
patient, called TERT, and had knowledge that it was associated with the disease aplastic
anemia. The researcher measured the patient’s telomeres after finding this mutation, but saw
little difference in length from what he expected. He also noted that the patient’s 83-year-old
mother also had the TERT mutation but did not express the symptoms of aplastic anemia. I
thought this was interesting because previous research believed that if you had the TERT
mutation, it meant you were going to suffer from the disease at some point, but this finding
does suggest that context and environment play roles in this as well. (Word Count: 233).
Post 2: Hello everyone,
One thing that surprised me about Dr. Michael Snyder’s research was that they were able to
identify sequences not present in the reference sequence of his genome. This was interesting
because they were able to confirm that there are a number of undocumented genetic regions
that exist in the human genome but that they can also be identified using deep sequencing
techniques. Another thing that interested me was that they were able to identify and examine
a number of genes for medical relevance. These included a mutation (E366K) in
the SERPINA1 gene previously known in the subject, a damaging mutation in TERT
associated with acquired aplastic anemia, and variants associated with
hypertriglyceridemia and diabetes. Lastly, I thought it was
interesting how they were able to associate SNVs with TF binding. They were able to identify
14,922 SNVs that lie in the binding motifs of 36 TFs. This had not been previously
attempted, which was intriguing.
Personal Omics Profiling Reveals Dynamic Molecular and Medical Phenotypes
Rui Chen,1,11 George I. Mias,1,11 Jennifer Li-Pook-Than,1,11 Lihua Jiang,1,11 Hugo Y.K. Lam,1,12 Rong Chen,2,12
Elana Miriami,1 Konrad J. Karczewski,1 Manoj Hariharan,1 Frederick E. Dewey,3 Yong Cheng,1 Michael J. Clark,1
Hogune Im,1 Lukas Habegger,6,7 Suganthi Balasubramanian,6,7 Maeve O’Huallachain,1 Joel T. Dudley,2
Sara Hillenmeyer,1 Rajini Haraksingh,1 Donald Sharon,1 Ghia Euskirchen,1 Phil Lacroute,1 Keith Bettinger,1 Alan P. Boyle,1
Maya Kasowski,1 Fabian Grubert,1 Scott Seki,2 Marco Garcia,2 Michelle Whirl-Carrillo,1 Mercedes Gallardo,9,10
Maria A. Blasco,9 Peter L. Greenberg,4 Phyllis Snyder,1 Teri E. Klein,1 Russ B. Altman,1,5 Atul J. Butte,2 Euan A. Ashley,3
Mark Gerstein,6,7,8 Kari C. Nadeau,2 Hua Tang,1 and Michael Snyder1,*
1Department of Genetics, Stanford University School of Medicine
2Division of Systems Medicine and Division of Immunology and Allergy, Department of Pediatrics
3Center for Inherited Cardiovascular Disease, Division of Cardiovascular Medicine
4Division of Hematology, Department of Medicine
5Department of Bioengineering
Stanford University, Stanford, CA 94305, USA
6Program in Computational Biology and Bioinformatics
7Department of Molecular Biophysics and Biochemistry
8Department of Computer Science
Yale University, New Haven, CT 06520, USA
9Telomeres and Telomerase Group, Molecular Oncology Program, Spanish National Cancer Centre (CNIO), Madrid E-28029, Spain
10Life Length, Madrid E-28003, Spain
11These authors contributed equally to this work
12Present address: Personalis, Palo Alto, CA 94301, USA
*Correspondence: mpsnyder@stanford.edu
DOI 10.1016/j.cell.2012.02.009
Personalized medicine is expected to benefit from
combining genomic information with regular monitoring of physiological states by multiple high-throughput methods. Here, we present an integrative
personal omics profile (iPOP), an analysis that
combines genomic, transcriptomic, proteomic, metabolomic, and autoantibody profiles from a single
individual over a 14 month period. Our iPOP analysis
revealed various medical risks, including type 2
diabetes. It also uncovered extensive, dynamic
changes in diverse molecular components and
biological pathways across healthy and diseased
conditions. Extremely high-coverage genomic
and transcriptomic data, which provide the basis
of our iPOP, revealed extensive heteroallelic
changes during healthy and diseased states and an
unexpected RNA editing mechanism. This study
demonstrates that longitudinal iPOP can be used
to interpret healthy and diseased states by connecting genomic information with additional dynamic
omics activity.
Personalized medicine aims to assess medical risks, monitor,
diagnose and treat patients according to their specific genetic
composition and molecular phenotype. The advent of genome
sequencing and the analysis of physiological states has proven
to be powerful (Cancer Genome Atlas Research Network,
2011). However, its implementation for the analysis of otherwise
healthy individuals for estimation of disease risk and medical
interpretation is less clear. Much of the genome is difficult to
interpret and many complex diseases, such as diabetes, neurological disorders and cancer, likely involve a large number of
different genes and biological pathways (Ashley et al., 2010;
Grayson et al., 2011; Li et al., 2011), as well as environmental
contributors that can be difficult to assess. As such, the combination of genomic information along with a detailed molecular
analysis of samples will be important for predicting, diagnosing
and treating diseases as well as for understanding the onset, progression, and prevalence of disease states (Snyder et al., 2009).
Presently, healthy and diseased states are typically followed
using a limited number of assays that analyze a small number
of markers of distinct types. With the advancement of many
new technologies, it is now possible to analyze upward of 10^5
molecular constituents. For example, DNA microarrays have
allowed the subcategorization of lymphomas and gliomas
Cell 148, 1293–1307, March 16, 2012 ©2012 Elsevier Inc.
(Mischel et al., 2003), and RNA sequencing (RNA-Seq) has
identified breast cancer transcript isoforms (Li et al., 2011; van
der Werf et al., 2007; Wu et al., 2010; Lapuk et al., 2010).
Although transcriptome and RNA splicing profiling are powerful
and convenient, they provide a partial portrait of an organism’s
physiological state. Transcriptomic data, when combined with
genomic, proteomic, and metabolomic data are expected to
provide a much deeper understanding of normal and diseased
states (Snyder et al., 2010). To date, comprehensive integrative
omics profiles have been limited and have not been applied to
the analysis of generally healthy individuals.
To obtain a better understanding of: (1) how to generate an
integrative personal omics profile (iPOP) and examine as many
biological components as possible, (2) how these components
change during healthy and diseased states, and (3) how this
information can be combined with genomic information to
estimate disease risk and gain new insights into diseased states,
we performed extensive omics profiling of blood components
from a generally healthy individual over a 14 month period
(24 months total when including time points with other molecular
analyses). We determined the whole-genome sequence (WGS)
of the subject, and together with transcriptomic, proteomic, metabolomic, and autoantibody profiles, used this information to
generate an iPOP. We analyzed the iPOP of the individual over
the course of healthy states and two viral infections (Figure 1A).
Our results indicate that disease risk can be estimated by
a whole-genome sequence and that, by regularly monitoring health
states with iPOP, disease onset may also be observed. The
wealth of information provided by detailed longitudinal iPOP revealed unexpected molecular complexity, which exhibited
dynamic changes during healthy and diseased states, and
provided insight into multiple biological processes. Detailed
omics profiling coupled with genome sequencing can provide
molecular and physiological information of medical significance.
This approach can be generalized for personalized health monitoring and medicine.
Overview of Personal Omics Profiling
Our overall iPOP strategy was to: (1) determine the genome
sequence at high accuracy and evaluate disease risks, (2)
monitor omics components over time and integrate the relevant
omics information to assess the variation of physiological states,
and (3) examine in detail the expression of personal variants
at the level of RNA and protein to study molecular complexity
and dynamic changes in diseased states.
We performed iPOP on blood components (peripheral blood
mononuclear cells [PBMCs], plasma and sera that are highly
accessible) from a 54-year-old male volunteer over the course
of 14 months (IRB-8629). The samples used for iPOP were taken
over an interval of 401 days (days 0–400). In addition, a complete
medical exam plus laboratory and additional tests were performed before the study officially launched (day −123) and blood
glucose was sampled multiple times after the comprehensive
omics profiling (days 401–602) (Figure 1A). Extensive sampling
was performed during two viral infections that occurred during
this period: a human rhinovirus (HRV) infection beginning on
day 0 and a respiratory syncytial virus (RSV) infection starting
on day 289. A total of 20 time points were extensively analyzed
and a summary of the time course is indicated in Figure 1A.
The different types of analyses performed are summarized in
Figures 1B and 1C. These analyses, performed on PBMCs
and/or serum components, included WGS, complete transcriptome analysis (providing information about the abundance of
alternative spliced isoforms, heteroallelic expression, and RNA
edits, as well as expression of miRNAs at selected time points),
proteomic and metabolomic analyses, and autoantibody
profiles. An integrative analysis of these data highlights dynamic
omics changes and provides rich information about healthy and
diseased phenotypes.
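The day ranges quoted in this overview can be cross-checked with a few lines of arithmetic. The sketch below assumes that day 0 is the first iPOP sample, that the pre-study exam falls at day −123 (it precedes the official launch), and that both endpoints of each range are counted. The constant and function names are illustrative and do not come from the article.

```python
# Day offsets taken from the study description; day 0 is the first iPOP sample.
# Assumption: the pre-study medical exam is at day -123, since it precedes day 0.
PRE_STUDY_EXAM_DAY = -123
IPOP_FIRST_DAY, IPOP_LAST_DAY = 0, 400
LAST_GLUCOSE_DAY = 602


def inclusive_span(first_day: int, last_day: int) -> int:
    """Number of days from first_day through last_day, counting both ends."""
    return last_day - first_day + 1


ipop_days = inclusive_span(IPOP_FIRST_DAY, IPOP_LAST_DAY)
total_days = inclusive_span(PRE_STUDY_EXAM_DAY, LAST_GLUCOSE_DAY)
print(ipop_days, total_days)
```

Under these assumptions the iPOP sampling window comes to 401 days and the whole monitoring period to 726 days, which matches the totals reported for the study.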
Whole-Genome Sequencing
We first generated a high quality genome sequence of this
individual using a variety of different technologies. Genomic
DNA was subjected to deep WGS using technologies from
Complete Genomics (CG, 35 nt paired end) and Illumina
(100 nt paired end) at 150- and 120-fold total coverage, respectively, exome sequencing using three different technologies to
80- to 100-fold average coverage (see Extended Experimental
Procedures available online) and analysis using genotyping
arrays and RNA sequencing.
The vast majority of genomic sequences (91%) mapped to the
hg19 (GRCh37) reference genome. However, because of the
depth of our sequencing, we were able to identify sequences
not present in the reference sequence. Assembly of the
unmapped Illumina sequencing reads (60,434,531, 9% of the
total) resulted in 1,425 (of 29,751) contigs (spanning 26 Mb) overlapping with RefSeq gene sequences that were not annotated in
the hg19 reference genome. The remaining sequences appeared
unique, including 2,919 exons expressed in the RNA-Seq data
(e.g., Figure S1A). These results confirm that a large number of
undocumented genetic regions exist in individual human
genome sequences and can be identified by very deep
sequencing and de novo assembly (Li et al., 2010).
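As a rough sanity check on the mapping figures above, the total read count and the share of contigs overlapping RefSeq genes can be back-calculated from the reported numbers. This is only a sketch: the 9% figure is rounded in the text, so the estimated total is approximate, and the variable names are illustrative.

```python
UNMAPPED_READS = 60_434_531   # unmapped Illumina reads reported in the text
UNMAPPED_FRACTION = 0.09      # reported as 9% of all reads (a rounded figure)
REFSEQ_CONTIGS = 1_425        # contigs overlapping RefSeq gene sequences
ALL_CONTIGS = 29_751          # all contigs assembled from the unmapped reads

# Back-calculate the approximate total number of reads from the rounded share.
estimated_total_reads = UNMAPPED_READS / UNMAPPED_FRACTION

# Share of assembled contigs that overlap RefSeq gene sequences.
refseq_contig_pct = 100 * REFSEQ_CONTIGS / ALL_CONTIGS

print(f"estimated total reads: about {estimated_total_reads / 1e6:.0f} million")
print(f"contigs overlapping RefSeq genes: {refseq_contig_pct:.1f}%")
```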
Our analysis detected many single nucleotide variants (SNVs),
small insertions and deletions (indels) and structural variants
(SVs; large insertions, deletions, and inversions relative to
hg19), (summarized in Table 1 and Experimental Procedures).
134,341 (4.1%) high-confidence SNVs are not present in
dbSNP, indicating that they are very rare or private to the
subject. Only 302 high-confidence indels reside within RefSeq
protein coding exons and exhibit enrichments in multiples of
three nucleotides (p < 0.0001). In addition to indels, 2,566 high-confidence SVs were
identified (Experimental Procedures and Table S1) and 8,646 mobile element insertions were
identified (Stewart et al., 2011). Analysis of the subject’s mother’s genome by comprehensive
genome sequencing (as above) and imputation allowed a maternal/paternal chromosomal phasing
of 92.5% of the subject’s SNVs and indels (see Extended Experimental Procedures for details).
Of 1,162 compound heterozygous mutations in genes, 139 contain predicted compound
heterozygous deleterious and/or nonsense mutations. Phasing enabled the assembly of a personal
genome sequence of very high confidence (c.f., Rozowsky et al., 2011).
Figure 1. Summary of Study
(A) Time course summary. The subject was monitored for a total of 726 days, during which there
were two infections (red bar, HRV; green bar, RSV). The black bar indicates the period when the
subject: (1) increased exercise, (2) ingested 81 mg of acetylsalicylic acid and ibuprofen
tablets each day (the latter only during the first 6 weeks of this period), and
(3) substantially reduced sugar intake. Blue numbers indicate fasted time points.
(B) iPOP experimental design indicating the tissues and analyses involved in this study.
(C) Circos (Krzywinski et al., 2009) plot summarizing iPOP. From outer to inner rings:
chromosome ideogram; genomic data (pale blue ring), structural variants > 50 bp (deletions
[blue tiles], duplications [red tiles]), indels (green triangles); transcriptomic data (yellow
ring), expression ratio of HRV infection to healthy states; proteomic data (light purple ring),
ratio of protein levels during HRV infection to healthy states; transcriptomic data (yellow
ring), differential heteroallelic expression ratio of alternative allele to reference allele for
missense and synonymous variants (purple dots) and candidate RNA missense and synonymous edits
(red triangles, purple dots, orange triangles and green dots, respectively).
See also Figure S1.
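The enrichment of coding indels in multiples of three nucleotides is what one would expect if frame-preserving changes are better tolerated, because codons are three nucleotides long. A minimal sketch of that classification follows; the indel lengths are hypothetical and are not data from the study.

```python
def preserves_reading_frame(indel_length_bp: int) -> bool:
    """True if an insertion (+) or deletion (-) of this length keeps the codon frame."""
    return abs(indel_length_bp) % 3 == 0


# Hypothetical signed indel lengths in bp: negative = deletion, positive = insertion.
example_indels = [-3, 1, -2, 6, 36, -107]

in_frame = [n for n in example_indels if preserves_reading_frame(n)]
frameshift = [n for n in example_indels if not preserves_reading_frame(n)]
print("in-frame:", in_frame)
print("frameshift:", frameshift)
```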
WGS-Based Disease Risk Evaluation
We identified variants likely to be associated with increased
susceptibility to disease (Dewey et al., 2011). The list of high
confidence SNVs and indels was analyzed for rare alleles (<5% of the major allele frequency in
Europeans) and for changes in genes with known Mendelian disease phenotypes (data summarized
in Table 2), revealing that 51 and 4 of the rare coding SNV and indels, respectively, in genes
present in OMIM are predicted to lead to loss-of-function (Table S2A). This list of genes was
further examined for medical relevance (Table S2A; example alleles are summarized in
Figure 2A), and 11 were validated by Sanger sequencing. High interest genes include: (1) a
mutation (E366K) in the SERPINA1 gene previously known in the subject, (2) a damaging mutation
in TERT, associated with acquired aplastic anemia (Yamaguchi et al., 2005), and (3) variants
associated with hypertriglyceridemia and diabetes, such as GCKR (homozygous) (Vaxillaire et
al., 2008), and KCNJ11 (homozygous) (Hani et al., 1998) and TCF7 (heterozygous) (Erlich et al.,
Table 1. Summary and Breakdown of DNA Variants
Type                          Total Variants   Total High Confidence   Heterozygous High Confidence   Homozygous High Confidence
Total SNVs                    3,739,701        3,301,521               1,971,629                      1,329,892
Total gene-associated SNVs    1,312,780        1,183,847               717,485                        466,362
Total coding/UTR              49,017           44,542                  27,383                         17,159
Missense                      10,592           9,683                   5,944                          3,739
Nonsense                      83               73                      49                             24
Synonymous                    11,459           10,864                  6,747                          4,117
5′ UTR                        4,085            2,978                   1,802                          1,176
3′ UTR                        22,798           20,944                  12,841                         8,103
Intron                        1,263,763        1,139,305               690,102                        449,203
Ts/Tv                         -                2.14                    -                              -
dbSNP                         3,493,748        3,167,180               -                              -
Candidate private SNV         245,953          134,341                 -                              -
Indels (−107 to +36 bp)       1,022,901        216,776                 -                              -
Coding
Structural variants (>50 bp)
In 1000G project (a)
High confidence values are from variants identified across multiple platforms (Illumina and CG) and/or Exome and RNA-Seq data. Annotations were
based from variant call formatted (vcf) files for heterozygous calls: 0/1, reference (ref)/alternative (alt); 1/2, alt/alt and homozygous calls; 1/1, alt/alt; 1/.,
(alt/alt-incomplete call). Polyphen-2 was used to identify the location of the SNVs.
(a) 1000G (1000 Genomes Project Consortium, 2010).
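The high-confidence SNV counts in Table 1 lend themselves to a quick internal consistency check: heterozygous plus homozygous calls should equal each row total, the coding/UTR subcategories should sum to the coding/UTR total, and the candidate private SNVs should account for the 4.1% quoted in the text. The sketch below hard-codes the table values; the dictionary layout and helper name are illustrative choices, not part of the article.

```python
# High-confidence SNV counts from Table 1: (total, heterozygous, homozygous).
HIGH_CONFIDENCE = {
    "Total SNVs": (3_301_521, 1_971_629, 1_329_892),
    "Total gene-associated SNVs": (1_183_847, 717_485, 466_362),
    "Total coding/UTR": (44_542, 27_383, 17_159),
    "Missense": (9_683, 5_944, 3_739),
    "Nonsense": (73, 49, 24),
    "Synonymous": (10_864, 6_747, 4_117),
    "5' UTR": (2_978, 1_802, 1_176),
    "3' UTR": (20_944, 12_841, 8_103),
    "Intron": (1_139_305, 690_102, 449_203),
}
DBSNP_HIGH_CONFIDENCE = 3_167_180
PRIVATE_HIGH_CONFIDENCE = 134_341


def inconsistent_rows(table):
    """Rows where heterozygous + homozygous does not equal the row total."""
    return [name for name, (total, het, hom) in table.items() if het + hom != total]


coding_utr_parts = ["Missense", "Nonsense", "Synonymous", "5' UTR", "3' UTR"]
parts_sum = sum(HIGH_CONFIDENCE[name][0] for name in coding_utr_parts)
private_pct = 100 * PRIVATE_HIGH_CONFIDENCE / HIGH_CONFIDENCE["Total SNVs"][0]

print("rows that do not add up:", inconsistent_rows(HIGH_CONFIDENCE))
print("coding/UTR parts sum:", parts_sum)
print(f"private SNV share: {private_pct:.1f}%")
```

All rows add up, and the dbSNP and candidate private counts together equal the total number of high-confidence SNVs.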
Genetic disease risks were also assessed by the RiskOGram
algorithm, which integrates information from multiple alleles
associated with disease risk (Ashley et al., 2010) (Figure 2B).
This analysis revealed a modest elevated risk for coronary artery
disease and significantly elevated risk levels of basal cell carcinoma (Figure 2B), hypertriglyceridemia, and type 2 diabetes
(T2D) (Figures 2B and 2C).
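The text says only that RiskOGram integrates information from multiple disease-associated alleles. One common way to combine such evidence, shown here purely as an illustrative assumption and not as the published implementation, is to convert a pre-test probability to odds, multiply by one likelihood ratio per risk allele, and convert back to a probability. The pre-test probability and likelihood ratios below are made up for the example.

```python
from math import prod
from typing import Sequence


def post_test_probability(pre_test_probability: float,
                          likelihood_ratios: Sequence[float]) -> float:
    """Combine a pre-test probability with per-allele likelihood ratios via odds."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * prod(likelihood_ratios)
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical inputs: a 27% population-level risk and four per-allele ratios.
hypothetical_pre_test = 0.27
hypothetical_ratios = [1.2, 0.9, 1.5, 1.1]

risk = post_test_probability(hypothetical_pre_test, hypothetical_ratios)
print(f"pre-test {hypothetical_pre_test:.0%} -> post-test {risk:.0%}")
```

A ratio above 1 raises the estimate and a ratio below 1 lowers it, so protective and risk alleles can offset each other in the combined figure. Multiplying the ratios treats the alleles as independent, which is a simplification.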
In addition to coding region variants we also analyzed genomic
variants that may affect regulatory elements (transcription
factors [TF]), which had not been attempted previously (Data
S1). A total of 14,922 (of 234,980) SNVs lie in the motifs of 36
TFs known to be associated with the binding data (see Experimental Procedures), indicating that these are likely having a
direct effect on TF binding. Comparison of SNPs that alter
binding patterns of NFkB and Pol II sites (Kasowski et al.,
2010), also revealed a number of other interesting regulatory
variants, some of which are associated with human disease
(e.g., EDIL) (Sun et al., 2010) (Figure S1B).
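Deciding whether an SNV lies inside a TF motif is, at its core, an interval lookup. The sketch below shows one simple way to do it with a sorted list of non-overlapping intervals and a binary search, then computes the share of SNVs in motifs from the counts reported above. The motif coordinates and SNV positions are hypothetical, and the article does not describe its own implementation.

```python
from bisect import bisect_right

# Hypothetical, non-overlapping motif intervals on one chromosome,
# given as (start, end) with both ends inclusive and sorted by start.
MOTIFS = [(100, 110), (250, 262), (900, 915)]
MOTIF_STARTS = [start for start, _ in MOTIFS]


def in_motif(position: int) -> bool:
    """True if the position falls inside any motif interval."""
    index = bisect_right(MOTIF_STARTS, position) - 1
    return index >= 0 and position <= MOTIFS[index][1]


# Hypothetical SNV positions on the same chromosome.
snv_positions = [105, 111, 250, 262, 899, 915]
snvs_in_motifs = [p for p in snv_positions if in_motif(p)]
print("SNVs inside motifs:", snvs_in_motifs)

# Share of SNVs in TF motifs, using the counts reported in the text.
motif_snv_pct = 100 * 14_922 / 234_980
print(f"reported share of SNVs in TF motifs: {motif_snv_pct:.2f}%")
```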
Medical Phenotypes Monitoring
Based on the above analysis of medically relevant variants and
the RiskOGram, we monitored markers associated with high-risk disease phenotypes and performed additional medically
relevant assays.
Monitoring of glucose levels and HbA1c revealed the onset of
T2D as diagnosed by the subject’s physician (day 369, Figures
2A and 2C). The subject lacked many known factors associated
with diabetes (nonsmoker; …