Publications by authors named "Dana C Crawford"

185 Publications

Global variation in sequencing impedes SARS-CoV-2 surveillance.

PLoS Genet 2021 07 15;17(7):e1009620. Epub 2021 Jul 15.

Departments of Population and Quantitative Health Sciences and Genetics and Genome Sciences, Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio, United States of America.

Source
DOI: http://dx.doi.org/10.1371/journal.pgen.1009620
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8282079
July 2021

Biobanks Linked to Electronic Health Records Accelerate Genomic Discovery.

J Am Soc Nephrol 2021 Aug 9;32(8):1828-1829. Epub 2021 Jul 9.

Department of Physiology and Biophysics, Case Western Reserve University, Cleveland, Ohio.

Source
DOI: http://dx.doi.org/10.1681/ASN.2021060836
August 2021

Performance of African-ancestry-specific polygenic hazard score varies according to local ancestry in 8q24.

Prostate Cancer Prostatic Dis 2021 Jun 14. Epub 2021 Jun 14.

School of Public Health, Louisiana State University Health Sciences Center, New Orleans, LA, USA.

Background: We previously developed an African-ancestry-specific polygenic hazard score (PHS46+African) that substantially improved prostate cancer risk stratification in men with African ancestry. The model consists of 46 SNPs identified in Europeans and 3 SNPs from 8q24 shown to improve model performance in Africans. Herein, we used principal component (PC) analysis to uncover subpopulations of men with African ancestry for whom the utility of PHS46+African may differ.

Materials And Methods: Genotypic data were obtained from the PRACTICAL consortium for 6253 men with African genetic ancestry. Genetic variation in a window spanning 3 African-specific 8q24 SNPs was estimated using 93 PCs. A Cox proportional hazards framework was used to identify the pair of PCs most strongly associated with the performance of PHS46+African. A calibration factor (CF) was formulated using Cox coefficients to quantify the extent to which the performance of PHS46+African varies with PC.

Results: CF of PHS46+African was strongly associated with the first and twentieth PCs. Predicted CF ranged from 0.41 to 2.94, suggesting that PHS46+African may be up to 7 times more beneficial to some African men than others. The explained relative risk for PHS46+African varied from 3.6% to 9.9% for individuals with low and high CF values, respectively. By cross-referencing our data set with 1000 Genomes, we identified significant associations between continental and calibration groupings.

Conclusion: We identified PCs within 8q24 that were strongly associated with the performance of PHS46+African. Further research to improve the clinical utility of polygenic risk scores (or models) is needed to improve health outcomes for men of African ancestry.
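
The "up to 7 times" figure in the Results appears to follow from the ratio of the largest to the smallest predicted CF. A minimal sketch of that arithmetic, assuming the comparison is a simple ratio of the two reported extremes (the variable names are illustrative, not the authors'):

```python
# Predicted calibration factor (CF) extremes as reported in the abstract.
cf_min = 0.41
cf_max = 2.94

# Assumption: "up to 7 times more beneficial" is the ratio of the extremes.
fold_difference = cf_max / cf_min

print(f"Fold difference between CF extremes: {fold_difference:.2f}")
```

The ratio comes out slightly above 7, which matches the rounded figure in the abstract.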

Source
DOI: http://dx.doi.org/10.1038/s41391-021-00403-7
June 2021

Trans-ancestry genome-wide association meta-analysis of prostate cancer identifies new susceptibility loci and informs genetic risk prediction.

Nat Genet 2021 01 4;53(1):65-75. Epub 2021 Jan 4.

Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, Victoria, Australia.

Prostate cancer is a highly heritable disease with large disparities in incidence rates across ancestry populations. We conducted a multiancestry meta-analysis of prostate cancer genome-wide association studies (107,247 cases and 127,006 controls) and identified 86 new genetic risk variants independently associated with prostate cancer risk, bringing the total to 269 known risk variants. The top genetic risk score (GRS) decile was associated with odds ratios that ranged from 5.06 (95% confidence interval (CI), 4.84-5.29) for men of European ancestry to 3.74 (95% CI, 3.36-4.17) for men of African ancestry. Men of African ancestry were estimated to have a mean GRS that was 2.18-times higher (95% CI, 2.14-2.22), and men of East Asian ancestry 0.73-times lower (95% CI, 0.71-0.76), than men of European ancestry. These findings support the role of germline variation contributing to population differences in prostate cancer risk, with the GRS offering an approach for personalized risk prediction.

Source
DOI: http://dx.doi.org/10.1038/s41588-020-00748-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8148035
January 2021

Editorial: The Importance of Diversity in Precision Medicine Research.

Front Genet 2020 26;11:875. Epub 2020 Aug 26.

Department of Population and Quantitative Health Sciences, Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, United States.

Source
DOI: http://dx.doi.org/10.3389/fgene.2020.00875
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7479241
August 2020

Modelling kidney disease using ontology: insights from the Kidney Precision Medicine Project.

Nat Rev Nephrol 2020 11 16;16(11):686-696. Epub 2020 Sep 16.

Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA.

An important need exists to better understand and stratify kidney disease according to its underlying pathophysiology in order to develop more precise and effective therapeutic agents. National collaborative efforts such as the Kidney Precision Medicine Project are working towards this goal through the collection and integration of large, disparate clinical, biological and imaging data from patients with kidney disease. Ontologies are powerful tools that facilitate these efforts by enabling researchers to organize and make sense of different data elements and the relationships between them. Ontologies are critical to support the types of big data analysis necessary for kidney precision medicine, where heterogeneous clinical, imaging and biopsy data from diverse sources must be combined to define a patient's phenotype. The development of two new ontologies - the Kidney Tissue Atlas Ontology and the Ontology of Precision Medicine and Investigation - will support the creation of the Kidney Tissue Atlas, which aims to provide a comprehensive molecular, cellular and anatomical map of the kidney. These ontologies will improve the annotation of kidney-relevant data, and eventually lead to new definitions of kidney disease in support of precision medicine.

Source
DOI: http://dx.doi.org/10.1038/s41581-020-00335-w
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8012202
November 2020

African-specific improvement of a polygenic hazard score for age at diagnosis of prostate cancer.

Int J Cancer 2021 01 24;148(1):99-105. Epub 2020 Sep 24.

UMR Inserm 1134 Biologie Intégrée du Globule Rouge, INSERM/Université Paris Diderot-Université Sorbonne Paris Cité/INTS/Université des Antilles, Paris, France.

Polygenic hazard score (PHS) models are associated with age at diagnosis of prostate cancer. Our model developed in Europeans (PHS46) showed reduced performance in men with African genetic ancestry. We used a cross-validated search to identify single nucleotide polymorphisms (SNPs) that might improve performance in this population. Anonymized genotypic data were obtained from the PRACTICAL consortium for 6253 men with African genetic ancestry. Ten iterations of a 10-fold cross-validation search were conducted to select SNPs that would be included in the final PHS46+African model. The coefficients of PHS46+African were estimated in a Cox proportional hazards framework using age at diagnosis as the dependent variable and PHS46, and selected SNPs as predictors. The performance of PHS46 and PHS46+African was compared using the same cross-validated approach. Three SNPs (rs76229939, rs74421890 and rs5013678) were selected for inclusion in PHS46+African. All three SNPs are located on chromosome 8q24. PHS46+African showed substantial improvements in all performance metrics measured, including a 75% increase in the relative hazard of those in the upper 20% compared to the bottom 20% (2.47-4.34) and a 20% reduction in the relative hazard of those in the bottom 20% compared to the middle 40% (0.65-0.53). In conclusion, we identified three SNPs that substantially improved the association of PHS46 with age at diagnosis of prostate cancer in men with African genetic ancestry to levels comparable to Europeans.
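
The percentage changes quoted above can be checked against the reported relative hazards. A minimal sketch, assuming each percentage is the relative change from the PHS46 value to the PHS46+African value (the helper `percent_change` is illustrative, not from the paper):

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Relative hazard, upper 20% vs bottom 20%: 2.47 (PHS46) to 4.34 (PHS46+African).
upper_vs_bottom = percent_change(2.47, 4.34)

# Relative hazard, bottom 20% vs middle 40%: 0.65 (PHS46) to 0.53 (PHS46+African).
bottom_vs_middle = percent_change(0.65, 0.53)

print(f"Upper 20% vs bottom 20%: {upper_vs_bottom:+.1f}%")
print(f"Bottom 20% vs middle 40%: {bottom_vs_middle:+.1f}%")
```

Under this assumption the first change is close to the stated 75% increase, and the second is about an 18.5% reduction, which the abstract reports as roughly 20%.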

Source
DOI: http://dx.doi.org/10.1002/ijc.33282
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8135907
January 2021

Optimizing identification of resistant hypertension: Computable phenotype development and validation.

Pharmacoepidemiol Drug Saf 2020 11 26;29(11):1393-1401. Epub 2020 Aug 26.

Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA.

Purpose: Computable phenotypes are constructed to utilize data within the electronic health record (EHR) to identify patients with specific characteristics; a necessary step for researching a complex disease state. We developed computable phenotypes for resistant hypertension (RHTN) and stable controlled hypertension (HTN) based on the National Patient-Centered Clinical Research Network (PCORnet) common data model (CDM). The computable phenotypes were validated through manual chart review.

Methods: We adapted and refined existing computable phenotype algorithms for RHTN and stable controlled HTN to the PCORnet CDM in an adult HTN population from the OneFlorida Clinical Research Consortium (2015-2017). Two independent reviewers validated the computable phenotypes through manual chart review of 425 patient records. We assessed precision of our computable phenotypes through positive predictive value (PPV) and test validity through interrater reliability (IRR).

Results: Among the 156 730 HTN patients in our final dataset, the final computable phenotype algorithms identified 24 926 patients with RHTN and 19 100 with stable controlled HTN. The PPV for RHTN in patients randomly selected for validation of the final algorithm was 99.1% (n = 113, CI: 95.2%-99.9%). The PPV for stable controlled HTN in patients randomly selected for validation of the final algorithm was 96.5% (n = 113, CI: 91.2%-99.0%). IRR analysis revealed a raw percent agreement of 91% (152/167) with Cohen's kappa statistic = 0.87.

Conclusions: We constructed and validated a RHTN computable phenotype algorithm and a stable controlled HTN computable phenotype algorithm. Both algorithms are based on the PCORnet CDM, allowing for future application to epidemiological and drug utilization based research.
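
The validation metrics in the Results can be illustrated with a short calculation. This is a sketch under stated assumptions: the count of 112 true positives is inferred from the reported 99.1% of n = 113 and is not given in the abstract, and the exact (Clopper-Pearson) interval is one common choice of binomial confidence interval, not necessarily the method the authors used.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided binomial confidence interval, solved by bisection."""
    def solve(f, target):
        # f(p) is monotonically decreasing in p on [0, 1].
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper

# Assumed counts: 112 confirmed RHTN cases out of 113 reviewed charts.
true_positives, reviewed = 112, 113
ppv = true_positives / reviewed
ci_lower, ci_upper = clopper_pearson(true_positives, reviewed)

# Raw interrater agreement as reported: 152 of 167 records.
percent_agreement = 152 / 167

print(f"PPV = {ppv:.1%}, 95% CI {ci_lower:.1%} to {ci_upper:.1%}")
print(f"Raw agreement = {percent_agreement:.0%}")
```

Cohen's kappa is not recomputed here because it needs the full two-reviewer contingency table, which the abstract does not report.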

Source
DOI: http://dx.doi.org/10.1002/pds.5095
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7754782
November 2020

A Germline Variant at 8q24 Contributes to Familial Clustering of Prostate Cancer in Men of African Ancestry.

Eur Urol 2020 09 12;78(3):316-320. Epub 2020 May 12.

Department of Surgery, Center for Prostate Disease Research, Uniformed Services University of the Health Sciences, Bethesda, MD, USA.

Although men of African ancestry have a high risk of prostate cancer (PCa), no genes or mutations have been identified that contribute to familial clustering of PCa in this population. We investigated whether the African ancestry-specific PCa risk variant at 8q24, rs72725854, is enriched in men with a PCa family history in 9052 cases, 143 cases from high-risk families, and 8595 controls of African ancestry. We found the risk allele to be significantly associated with earlier age at diagnosis, more aggressive disease, and enriched in men with a PCa family history (32% of high-risk familial cases carried the variant vs 23% of cases without a family history and 12% of controls). For cases with two or more first-degree relatives with PCa who had at least one family member diagnosed at age <60 yr, the odds ratios for TA heterozygotes and TT homozygotes were 3.92 (95% confidence interval [CI] = 2.13-7.22) and 33.41 (95% CI = 10.86-102.84), respectively. Among men with a PCa family history, the absolute risk by age 60 yr reached 21% (95% CI = 17-25%) for TA heterozygotes and 38% (95% CI = 13-65%) for TT homozygotes. We estimate that in men of African ancestry, rs72725854 accounts for 32% of the total familial risk explained by all known PCa risk variants. PATIENT SUMMARY: We found that rs72725854, an African ancestry-specific risk variant, is more common in men with a family history of prostate cancer and in those diagnosed with prostate cancer at younger ages. Men of African ancestry may benefit from the knowledge of their carrier status for this genetic risk variant to guide decisions about prostate cancer screening.

Source
DOI: http://dx.doi.org/10.1016/j.eururo.2020.04.060
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7805560
September 2020

A phenome-wide association study (PheWAS) in the Population Architecture using Genomics and Epidemiology (PAGE) study reveals potential pleiotropy in African Americans.

PLoS One 2019 31;14(12):e0226771. Epub 2019 Dec 31.

Cleveland Institute for Computational Biology, Cleveland, Ohio, United States of America.

We performed a hypothesis-generating phenome-wide association study (PheWAS) to identify and characterize cross-phenotype associations, where one SNP is associated with two or more phenotypes, between thousands of genetic variants assayed on the Metabochip and hundreds of phenotypes in 5,897 African Americans as part of the Population Architecture using Genomics and Epidemiology (PAGE) I study. The PAGE I study was a National Human Genome Research Institute-funded collaboration of four study sites accessing diverse epidemiologic studies genotyped on the Metabochip, a custom genotyping chip that has dense coverage of regions in the genome previously associated with cardio-metabolic traits and outcomes in mostly European-descent populations. Here we focus on identifying novel phenome-genome relationships, where SNPs are associated with more than one phenotype. To do this, we performed a PheWAS, testing each SNP on the Metabochip for an association with up to 273 phenotypes in the participating PAGE I study sites. We identified 133 putative pleiotropic variants, defined as SNPs associated at an empirically derived p-value threshold of p<0.01 in two or more PAGE study sites for two or more phenotype classes. We further annotated these PheWAS-identified variants using publicly available functional data and local genetic ancestry. Amongst our novel findings is SPARC rs4958487, associated with increased glucose levels and hypertension. SPARC has been implicated in the pathogenesis of diabetes and is also known to have a potential role in fibrosis, a common consequence of multiple conditions including hypertension. The SPARC example and others highlight the potential that PheWAS approaches have in improving our understanding of complex disease architecture by identifying novel relationships between genetic variants and an array of common human phenotypes.
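
The pleiotropy definition above (p<0.01 in two or more study sites for two or more phenotype classes) can be sketched as a simple filter. This is one possible reading, assuming a SNP must replicate in at least two sites within a phenotype class and do so for at least two classes. The function name, record layout, SNP identifiers, and toy p-values are all hypothetical and are not taken from the study.

```python
from collections import defaultdict

def putative_pleiotropic(results, p_threshold=0.01, min_sites=2, min_classes=2):
    """Return SNPs significant in >= min_sites sites for >= min_classes phenotype classes.

    `results` is an iterable of (snp, site, phenotype_class, p_value) tuples.
    """
    # Collect the sites in which each (SNP, phenotype class) pair passes the threshold.
    sites_by_snp_class = defaultdict(set)
    for snp, site, phenotype_class, p_value in results:
        if p_value < p_threshold:
            sites_by_snp_class[(snp, phenotype_class)].add(site)

    # Keep phenotype classes that replicate across enough sites.
    classes_by_snp = defaultdict(set)
    for (snp, phenotype_class), sites in sites_by_snp_class.items():
        if len(sites) >= min_sites:
            classes_by_snp[snp].add(phenotype_class)

    return sorted(snp for snp, classes in classes_by_snp.items()
                  if len(classes) >= min_classes)

# Toy, made-up association results for illustration only.
toy_results = [
    ("rsA", "site1", "lipids", 0.001),
    ("rsA", "site2", "lipids", 0.005),
    ("rsA", "site1", "glycemic", 0.002),
    ("rsA", "site3", "glycemic", 0.009),
    ("rsB", "site1", "lipids", 0.001),
    ("rsB", "site2", "lipids", 0.002),
    ("rsB", "site1", "glycemic", 0.003),
    ("rsB", "site2", "glycemic", 0.2),   # fails the threshold in the second site
]

print(putative_pleiotropic(toy_results))
```

In the toy data, only the first SNP replicates in two sites for both phenotype classes, so it is the only one returned.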

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0226771
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6938343
April 2020

Frequency of ClinVar Pathogenic Variants in Chronic Kidney Disease Patients Surveyed for Return of Research Results at a Cleveland Public Hospital.

Pac Symp Biocomput 2020;25:575-586

Cleveland Institute for Computational Biology, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA.

Return of results is not common in research settings as standards are not yet in place for what to return, how to return, and to whom. As a pioneer of large-scale return of research results, the Precision Medicine Initiative Cohort, now known as All of Us, plans to return pharmacogenomic results and variants of clinical significance to its participants starting late 2019. To better understand the local landscape of possibilities regarding return of research results, we assessed the frequency of pathogenic variants and APOL1 renal risk variants in a small diverse cohort of chronic kidney disease (CKD) patients ascertained from a public hospital in Cleveland, Ohio, and genotyped on the Illumina Infinium MegaEX. Of the 23,720 ClinVar-designated variants directly assayed by the MegaEX, 8,355 (35%) had at least one alternate allele in the 130 participants genotyped. Of these, 18 ClinVar variants deemed pathogenic by multiple submitters with no conflicts in interpretation were distributed across 27 participants. The majority of these pathogenic ClinVar variants (14/18) were associated with autosomal recessive disorders. Of note were four African American carriers of TTR rs76992529 associated with amyloidogenic transthyretin amyloidosis, otherwise known as familial transthyretin amyloidosis (FTA). FTA, an autosomal dominant disorder with variable penetrance, is more common among African-descent populations compared with European-descent populations. Also common in this CKD population were APOL1 renal risk alleles G1 (rs73885319) and G2 (rs71785313) with 60% of the study population carrying at least one renal risk allele. Both pathogenic ClinVar variants and APOL1 renal risk alleles were distributed among participants who wanted actionable genetic results returned, wanted genetic results returned regardless of actionability, and wanted no results returned. Results from this local genetic study highlight challenges in which variants to report, how to interpret them, and the participant's potential for follow-up, only some of the challenges in return of research results likely facing larger studies such as All of Us.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6931908
February 2021

Bridging the Gaps in Personalized Medicine Value Assessment: A Review of the Need for Outcome Metrics across Stakeholders and Scientific Disciplines.

Public Health Genomics 2019 27;22(1-2):16-24. Epub 2019 Aug 27.

Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio, USA.

Despite monumental advances in genomics, relatively few health care provider organizations in the United States offer personalized or precision medicine as part of the routine clinical workflow. The gaps between research and applied genomic medicine may be a result of a cultural gap across various stakeholders representing scientists, clinicians, patients, policy makers, and third party payers. Scientists are trained to assess the health care value of genomics by either quantifying population-scale effects, or through the narrow lens of clinical trials where the standard of care is compared with the predictive power of a single or handful of genetic variants. While these metrics are an essential first step in assessing and documenting the clinical utility of genomics, they are rarely followed up with other assessments of health care value that are critical to stakeholders who use different measures to define value. The limited value assessment in both the research and implementation science of precision medicine is likely due to necessary logistical constraints of these teams; engaging bioethicists, health care economists, and individual patient belief systems is incredibly daunting for geneticists and informaticians conducting research. In this narrative review, we concisely describe several definitions of value through various stakeholder viewpoints. We highlight the existing gaps that prevent clinical translation of scientific findings generally as well as more specifically using two present-day, extreme scenarios: (1) genetically guided warfarin dosing representing a handful of genetic markers and more than 10 years of basic and translational research, and (2) next-generation sequencing representing genome-dense data lacking substantial evidence for implementation. These contemporary scenarios highlight the need for various stakeholders to broadly adopt frameworks designed to define and collect multiple value measures across different disciplines to ultimately impact more universal acceptance of and reimbursement for genomic medicine.

Source
DOI: http://dx.doi.org/10.1159/000501974
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6752968
January 2020

Genetically-guided algorithm development and sample size optimization for age-related macular degeneration cases and controls in electronic health records from the VA Million Veteran Program.

AMIA Jt Summits Transl Sci Proc 2019 6;2019:153-162. Epub 2019 May 6.

Louis Stokes Cleveland VA Medical Center, Cleveland, OH.

Electronic health records (EHRs) linked to extensive biorepositories and supplemented with lifestyle, behavioral, and environmental exposure data have enormous potential to contribute to genomic discovery, a necessary step in the pathway towards translational or precision medicine. A major bottleneck in incorporating EHRs into genomic studies is the extraction of research-grade variables for analysis, particularly when gold-standard measurements are not available or accessible. Here we develop algorithms for age-related macular degeneration (AMD), a common cause of blindness among the elderly, and controls free of AMD. These computable phenotypes were developed using billing codes (ICD-9-CM and ICD-10-CM) and Current Procedural Terminology (CPT) codes and evaluated in two study sites of the Veterans Affairs Million Veteran Program: Louis Stokes Cleveland VA Medical Center and the Providence VA Medical Center. After establishing high overall positive and negative predictive values (93% and 95%, respectively) through manual chart review, the candidate algorithm was deployed in the full VA MVP dataset of >500,000 participants. The algorithm was then optimized in a data cube using a variety of approaches including adjusting inclusion age thresholds by examining previously-reported genetic associations for CFH (rs10801555, a proxy for rs1061170) and ARMS2 (rs10490924). The algorithm with the smallest p-values for the known genetic associations was selected for downstream and on-going AMD genomic discovery efforts. This two-phase approach to developing research-grade case/control variables for AMD genomic studies capitalizes on established genetic associations resulting in high precision and optimized sample sizes, an approach that can be applied to other large-scale biobanks linked to EHRs for precision medicine research.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6568141
May 2019

Mind the gap: resources required to receive, process and interpret research-returned whole genome data.

Hum Genet 2019 Jul 3;138(7):691-701. Epub 2019 Jun 3.

Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, 44106, USA.

Most genotype-phenotype studies have historically lacked population diversity, impacting the generalizability of findings and thereby limiting the ability to equitably implement precision medicine. This well-documented problem has generated much interest in the ascertainment of new cohorts with an emphasis on multiple dimensions of diversity, including race/ethnicity, gender, age, socioeconomic status, disability, and geography. The most well known of these new cohort efforts is arguably All of Us, formerly known as the Precision Medicine Cohort Initiative Program. All of Us intends to ascertain at least one million participants in the United States representative of the multiple dimensions of diversity. As an incentive to participate, All of Us is offering the return of research results, including whole genome sequencing data, as well as the opportunity to contribute to the scientific process as non-scientists. The scale and scope of the proposed return of research results are unprecedented. Here, we briefly review possible return of genetic data models, including the likely data file formats and modes of data transfer or access. We also review the resources required to access and interpret the genetic or genomic data once received by the average participant, highlighting the nuanced anticipated barriers that will challenge both the digitally, computationally literate and illiterate participant alike. This inventory of resources required to receive, process, and interpret return of research results exposes the potential for access disparities and warns the scientific community to mind the gap so that all participants have equal access and understanding of the benefits of human genetic research.

Source
DOI: http://dx.doi.org/10.1007/s00439-019-02033-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6767905
July 2019

A Social Determinant of Health May Modify Genetic Associations for Blood Pressure: Evidence From a SNP by Education Interaction in an African American Population.

Front Genet 2019 10;10:428. Epub 2019 May 10.

Department of Population and Quantitative Health Sciences, Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, United States.

African Americans experience the highest burden of hypertension in the United States compared with other groups. Genetic contributions to this complex condition are now emerging in this as well as other populations through large-scale genome-wide association studies (GWAS) and meta-analyses. Despite these recent discovery efforts, relatively few large-scale studies of blood pressure have considered the joint influence of genetics and social determinants of health despite extensive evidence supporting their impact on hypertension. To identify these expected interactions, we accessed a subset of the Vanderbilt University Medical Center (VUMC) biorepository linked to de-identified electronic health records (EHRs) of adult African Americans genotyped using the Illumina Metabochip (n = 2,577). To examine potential interactions between education, a recognized social determinant of health, and genetic variants contributing to blood pressure, we used linear regression models to investigate two-way interactions for systolic and diastolic blood pressure (DBP). We identified a two-way interaction between rs6687976 and education affecting DBP (p = 0.052). Individuals homozygous for the minor allele and having less than a high school education had higher DBP compared with (1) individuals homozygous for the minor allele and high school education or greater and (2) individuals not homozygous for the minor allele and less than a high school education. To our knowledge, this is the first EHR-based study to suggest a gene-environment interaction for blood pressure in African Americans, supporting the hypothesis that genetic contributions to hypertension may be modulated by social factors.

Source
DOI: http://dx.doi.org/10.3389/fgene.2019.00428
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6523518
May 2019

Precision Medicine: Improving health through high-resolution analysis of personal data.

Pac Symp Biocomput 2019;24:220-223

University of California, Berkeley, USA. †Supported by U41 HG007346 and U19 HD077627.

For the 2019 Pacific Symposium on Biocomputing's session on precision medicine, we present new research on computational techniques in a range of areas including data curation, whole genome analysis, transcriptomics, microbiome profiling, EHR data-mining, and histological image processing.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6526370
January 2019

Using Electronic Health Records To Generate Phenotypes For Research.

Curr Protoc Hum Genet 2019 01 5;100(1):e80. Epub 2018 Dec 5.

Cleveland Institute for Computational Biology, Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, Ohio.

Electronic health records contain patient-level data collected during and for clinical care. Data within the electronic health record include diagnostic billing codes, procedure codes, vital signs, laboratory test results, clinical imaging, and physician notes. With repeated clinic visits, these data are longitudinal, providing important information on disease development, progression, and response to treatment or intervention strategies. The near universal adoption of electronic health records nationally has the potential to provide population-scale real-world clinical data accessible for biomedical research, including genetic association studies. For this research potential to be realized, high-quality research-grade variables must be extracted from these clinical data warehouses. We describe here common and emerging electronic phenotyping approaches applied to electronic health records, as well as current limitations of both the approaches and the biases associated with these clinically collected data that impact their use in research. © 2018 by John Wiley & Sons, Inc.

Source
DOI: http://dx.doi.org/10.1002/cphg.80
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6318047
January 2019

Local genetic ancestry in CDKN2B-AS1 is associated with primary open-angle glaucoma in an African American cohort extracted from de-identified electronic health records.

BMC Med Genomics 2018 Sep 14;11(Suppl 3):70. Epub 2018 Sep 14.

Department of Population and Quantitative Health Sciences, Institute for Computational Biology, Case Western Reserve University, 2103 Cornell Road, Wolstein Research Building, Suite 2-527, Cleveland, OH, 44106, USA.

Background: Glaucoma is a leading cause of blindness in developed countries. Primary open-angle glaucoma (POAG), the most prevalent clinical subtype of glaucoma in the United States, affects African Americans at a higher rate compared with European Americans. Risk factors identified for POAG include increased age and family history, which coupled with heritability estimates, suggest this complex condition is associated with genetic and environmental factors. To date, several genome-wide studies have identified loci significantly associated with POAG risk, but most of these studies were performed in populations of European-descent.

Methods: To identify population-specific and trans-population genetic associations for POAG, we genotyped 11,521 African Americans using the Illumina Metabochip as part of the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study accessing BioVU, the Vanderbilt University Medical Center's biorepository linked to de-identified electronic health records. Among this study population, we identified 138 cases of POAG and 1376 controls and performed Metabochip-wide tests of association. We also estimated local genetic ancestry at CDKN2B-AS1, a POAG-associated locus established in European-descent populations.

Results: Overall, we did not identify significant single SNP-POAG associations after adjusting for multiple testing. We did, however, detect a significant association between POAG risk and local African genetic ancestry at CDKN2B-AS1, where on average cases were of 90% African descent compared with controls at 58% (p = 2 × 10).

Conclusions: These data suggest that CDKN2B-AS1 is an important locus for POAG risk among African Americans, warranting further investigation to identify the variants underlying this association.

Source
DOI Listing: http://dx.doi.org/10.1186/s12920-018-0392-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6157155
September 2018

Frequency and phenotype consequence of APOC3 rare variants in patients with very low triglyceride levels.

BMC Med Genomics 2018 Sep 14;11(Suppl 3):66. Epub 2018 Sep 14.

Departments of Medicine and Pharmacology, Vanderbilt University Medical Center, Nashville, TN, USA.

Background: High levels of triglycerides (TG ≥200 mg/dL) are an emerging risk factor for cardiovascular disease. Conversely, very low levels of TG are associated with decreased risk for cardiovascular disease. Precision medicine aims to capitalize on recent findings that rare variants such as APOC3 R19X (rs76353203) are associated with risk of disease, but it is unclear how population-based associations can be best translated in clinical settings at the individual-patient level.

Methods: To explore the potential usefulness of screening for genetic predictors of cardiovascular disease, we surveyed BioVU, the Vanderbilt University Medical Center's biorepository linked to de-identified electronic health records (EHRs), for APOC3 19X mutations among adult European American patients (> 45 and > 55 years of age for men and women, respectively) with the lowest percentile of TG levels. The initial search identified 262 patients with the lowest TG levels in the biorepository; among these, 184 patients with sufficient DNA and the lowest TG levels were chosen for Illumina ExomeChip genotyping.

Results: A total of two patients were identified as heterozygotes of APOC3 R19X for a minor allele frequency (MAF) of 0.55% in this patient population. Both heterozygous patients had only a single mention of TG in the EHR (31 and 35 mg/dL, respectively), and one patient had evidence of previous cardiovascular disease.

Conclusions: In this patient population, we identified two patients who were carriers of the APOC3 19X null variant, but only one lacked evidence of disease in the EHR, highlighting the challenges of including functional or previously associated genetic variation in clinical risk assessment.

Source
DOI Listing: http://dx.doi.org/10.1186/s12920-018-0387-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6156840
September 2018

Genome-wide association analysis of common genetic variants of resistant hypertension.

Pharmacogenomics J 2019 06 20;19(3):295-304. Epub 2018 Sep 20.

Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics, College of Pharmacy, University of Florida, Gainesville, FL, USA.

Resistant hypertension (RHTN), defined as uncontrolled blood pressure (BP) ≥ 140/90 using three or more drugs or controlled BP (<140/90) using four or more drugs, is associated with adverse outcomes, including decline in kidney function. We conducted a genome-wide association analysis in 1194 White and Hispanic participants with hypertension and coronary artery disease from the INternational VErapamil-SR Trandolapril STudy-GENEtic Substudy (INVEST-GENES). Top variants associated with RHTN at p < 10 were tested for replication in 585 White and Hispanic participants with hypertension and subcortical strokes from the Secondary Prevention of Subcortical Strokes GENEtic Substudy (SPS3-GENES). A genetic risk score for RHTN was created by summing the risk alleles of replicated RHTN signals. rs11749255 in MSX2 was associated with RHTN in INVEST (odds ratio (OR) (95% CI) = 1.50 (1.2-1.8), p = 7.3 × 10) and replicated in SPS3 (OR = 2.0 (1.4-2.8), p = 4.3 × 10), with genome-wide significance in meta-analysis (OR = 1.60 (1.3-1.9), p = 3.8 × 10). Other replicated signals were in IFLTD1 and PTPRD. IFLTD1 rs6487504 was associated with RHTN in INVEST (OR = 1.90 (1.4-2.5), p = 1.1 × 10) and SPS3 (OR = 1.70 (1.2-2.5), p = 4 × 10). PTPRD rs324498, a previously reported RHTN signal, was among the top signals in INVEST (OR = 1.60 (1.3-2.0), p = 3.4 × 10) and replicated in SPS3 (OR = 1.60 (1.1-2.4), one-sided p = 0.005). Participants with the highest number of risk alleles were at increased risk of RHTN compared to participants with a lower number (p-trend = 1.8 × 10). Overall, we identified and replicated associations with RHTN in the MSX2, IFLTD1, and PTPRD regions, and combined these associations to create a genetic risk score.
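
The risk score described above, a sum of risk alleles across replicated signals, can be illustrated with a minimal sketch. It assumes an unweighted score over the three SNPs named in the abstract and a hypothetical participant; the function name, input format, and genotype values are illustrative and are not taken from the study.

```python
# Replicated RHTN signals named in the abstract. Treating exactly these three
# SNPs as the score's inputs is an assumption made for illustration.
RISK_SNPS = ("rs11749255", "rs6487504", "rs324498")


def genetic_risk_score(risk_allele_counts: dict[str, int]) -> int:
    """Unweighted score: total number of risk alleles carried (0, 1, or 2 per SNP)."""
    score = 0
    for snp in RISK_SNPS:
        count = risk_allele_counts[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk allele count must be 0, 1, or 2")
        score += count
    return score


# Hypothetical participant, not study data.
participant = {"rs11749255": 1, "rs6487504": 0, "rs324498": 2}
print(genetic_risk_score(participant))  # 3
```

With three biallelic SNPs the score ranges from 0 to 6, which is what allows a trend test across risk-allele counts.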

Source
DOI Listing: http://dx.doi.org/10.1038/s41397-018-0049-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6426691
June 2019

The genetic underpinnings of variation in ages at menarche and natural menopause among women from the multi-ethnic Population Architecture using Genomics and Epidemiology (PAGE) Study: A trans-ethnic meta-analysis.

PLoS One 2018 25;13(7):e0200486. Epub 2018 Jul 25.

Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute and Department of Pediatrics at Harbor-UCLA Medical Center, Torrance, California, United States of America.

Current knowledge of the genetic architecture of key reproductive events across the female life course is largely based on association studies of European descent women. The relevance of known loci for age at menarche (AAM) and age at natural menopause (ANM) in diverse populations remains unclear. We investigated 32 AAM and 14 ANM previously-identified loci and sought to identify novel loci in a trans-ethnic array-wide study of 196,483 SNPs on the MetaboChip (Illumina, Inc.). A total of 45,364 women of diverse ancestries (African, Hispanic/Latina, Asian American and American Indian/Alaskan Native) in the Population Architecture using Genomics and Epidemiology (PAGE) Study were included in cross-sectional analyses of AAM and ANM. Within each study we conducted a linear regression of SNP associations with self-reported or medical record-derived AAM or ANM (in years), adjusting for birth year, population stratification, and center/region, as appropriate, and meta-analyzed results across studies using multiple meta-analytic techniques. For both AAM and ANM, we observed more directionally consistent associations with the previously reported risk alleles than expected by chance (binomial p-values ≤ 0.01). Eight densely genotyped reproductive loci generalized significantly to at least one non-European population. We identified one trans-ethnic array-wide SNP association with AAM and two significant associations with ANM, which have not been described previously. Additionally, we observed evidence of independent secondary signals at three of six AAM trans-ethnic loci. Our findings support the transferability of reproductive trait loci discovered in European women to women of other race/ethnicities and indicate the presence of additional trans-ethnic associations at both novel and established loci. These findings suggest the benefit of including diverse populations in future studies of the genetic architecture of female growth and development.
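
The directional-consistency comparison mentioned above is commonly done with a one-sided binomial (sign) test. The sketch below is one way to compute it, assuming each locus has a 50% chance of matching the reported direction under the null. The function name is hypothetical, and the abstract does not state the exact test configuration or the observed counts.

```python
from math import comb


def one_sided_sign_test(n_consistent: int, n_loci: int) -> float:
    """P(X >= n_consistent) for X ~ Binomial(n_loci, 0.5).

    Under the null hypothesis, each locus is equally likely to show an effect
    in the same or the opposite direction as the previously reported allele.
    """
    if not 0 <= n_consistent <= n_loci:
        raise ValueError("n_consistent must be between 0 and n_loci")
    # Count the outcomes in the upper tail, then divide by all 2**n_loci outcomes.
    tail = sum(comb(n_loci, k) for k in range(n_consistent, n_loci + 1))
    return tail / 2 ** n_loci
```

For example, `one_sided_sign_test(24, 32)` would give the probability of seeing at least 24 of 32 loci agree in direction by chance alone; the count of 24 is hypothetical, and only the 32 comes from the number of AAM loci investigated.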

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0200486
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6059436
January 2019

Hi-MC: a novel method for high-throughput mitochondrial haplogroup classification.

PeerJ 2018 25;6:e5149. Epub 2018 Jun 25.

Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA.

Effective approaches for assessing mitochondrial DNA (mtDNA) variation are important to multiple scientific disciplines. Mitochondrial haplogroups characterize branch points in the phylogeny of mtDNA. Several tools exist for mitochondrial haplogroup classification. However, most require full or partial mtDNA sequence, which is often cost prohibitive for studies with large sample sizes. The purpose of this study was to develop Hi-MC, a high-throughput method for mitochondrial haplogroup classification that is cost effective and applicable to large sample sizes, making mitochondrial analysis more accessible in genetic studies. Using rigorous selection criteria, we defined and validated a custom panel of mtDNA single nucleotide polymorphisms that allows for accurate classification of European, African, and Native American mitochondrial haplogroups at broad resolution with minimal genotyping and cost. We demonstrate that Hi-MC performs well in samples of European, African, and Native American ancestries, and that Hi-MC performs comparably to a commonly used classifier. Implementation as a software package in R enables users to download and run the program locally, grants greater flexibility in the number of samples that can be run, and allows for easy expansion in future revisions. Hi-MC is available in the CRAN repository and the source code is freely available at https://github.com/vserch/himc.

Source
DOI Listing: http://dx.doi.org/10.7717/peerj.5149
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022720
June 2018

Willingness to Participate in a National Precision Medicine Cohort: Attitudes of Chronic Kidney Disease Patients at a Cleveland Public Hospital.

J Pers Med 2018 Jun 26;8(3). Epub 2018 Jun 26.

Department of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA.

Multiple ongoing, government-funded national efforts longitudinally collect health data and biospecimens for precision medicine research with ascertainment strategies increasingly emphasizing underrepresented groups in biomedical research. We surveyed chronic kidney disease patients from an academic, public integrated tertiary care system in Cleveland, Ohio, to examine local attitudes toward participation in large-scale government-funded studies. Responses (n = 103) indicate the majority (71%) would participate in a hypothetical national precision medicine cohort and were willing to send biospecimens to a national repository and share de-identified data, but <50% of respondents were willing to install a phone app to track personal data. The majority of participants (62%) indicated that return of research results was very important, and the majority (54%) also wanted all of their research-collected health and genetic data returned. Response patterns did not differ by race/ethnicity. Overall, we found high willingness to participate among this Cleveland patient population already participating in a local genetic study. These data suggest that despite common perceptions, subjects from communities traditionally underrepresented in genetic research will participate and agree to store samples and health data in repositories. Furthermore, most participants want return of research results, which will require a plan to provide these data in a secure, accessible, and understandable manner.

Source
DOI Listing: http://dx.doi.org/10.3390/jpm8030021
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6164471
June 2018

Somatic T-cell Receptor Diversity in a Chronic Kidney Disease Patient Population Linked to Electronic Health Records.

AMIA Jt Summits Transl Sci Proc 2018 18;2017:63-71. Epub 2018 May 18.

Institute for Computational Biology, Departments of.

Germline and somatic genomic variation represent the bulk of 'omics data available for precision medicine research. These data, however, may fail to capture the dynamic biological processes that underlie disease development, particularly for chronic diseases of aging such as chronic kidney disease (CKD). To demonstrate the value of additional dynamic precision medicine data, we sequenced somatic T-cell receptor rearrangements, markers of the adaptive immune response, from genomic DNA collected during a clinical encounter from 15 participants with CKD and associated co-morbidities. Participants were consented as part of a larger precision medicine research project at the MetroHealth System, a large urban public hospital in Cleveland, Ohio. Despite the limited sample size, we observed reduced T-cell receptor diversity in relation to biomarkers (creatinine and BUN) of CKD status in this older and mostly African American sample. Overall, these data suggest a relationship between advanced CKD and premature aging of the adaptive immune system and highlight the potential of dynamic 'omic data to generate novel hypotheses about disease mechanisms and unique opportunities for precision medicine applications.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5961818
May 2018

Racial Disparities in Lung Cancer Survival: The Contribution of Stage, Treatment, and Ancestry.

J Thorac Oncol 2018 10 6;13(10):1464-1473. Epub 2018 Jun 6.

Department of Thoracic Surgery, Vanderbilt University Medical Center, Nashville, Tennessee; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee; Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee.

Introduction: Lung cancer is a leading cause of cancer-related death worldwide. Racial disparities in lung cancer survival exist between blacks and whites, yet they are limited by categorical definitions of race. We sought to examine the impact of African ancestry on overall survival among blacks and whites with NSCLC cases.

Methods: Incident cases of NSCLC in blacks and whites from the prospective Southern Community Cohort Study (N = 425) were identified through linkage with state cancer registries in 12 southern states. Vital status was determined by linkage with the National Death Index and Social Security Administration. We evaluated the impact of African ancestry (as estimated by using genome-wide ancestry-informative markers) on overall survival by calculating the time-dependent area under the curve (AUC) for Cox proportional hazards models, adjusting for relevant covariates such as stage and treatment. We replicated our findings in an independent population of NSCLC cases in blacks.

Results: Global African ancestry was not significantly associated with overall survival among NSCLC cases. There was no change in model performance when Cox proportional hazards models with and without African ancestry were compared (AUC = 0.79 for each model). Removal of stage and treatment reduced the average time-dependent AUC from 0.79 to 0.65. Similar findings were observed in our replication study.

Conclusions: Stage and treatment are more important predictors of survival than African ancestry is. These findings suggest that racial disparities in lung cancer survival may disappear with similar early detection efforts for blacks and whites.

Source
DOI Listing: http://dx.doi.org/10.1016/j.jtho.2018.05.032
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6153049
October 2018

Discovery, fine-mapping, and conditional analyses of genetic variants associated with C-reactive protein in multiethnic populations using the Metabochip in the Population Architecture using Genomics and Epidemiology (PAGE) study.

Hum Mol Genet 2018 08;27(16):2940-2953

Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA.

C-reactive protein (CRP) is a circulating biomarker indicative of systemic inflammation. We aimed to evaluate genetic associations with CRP levels among non-European-ancestry populations through discovery, fine-mapping and conditional analyses. A total of 30 503 non-European-ancestry participants from 6 studies participating in the Population Architecture using Genomics and Epidemiology study had serum high-sensitivity CRP measurements and ∼200 000 single nucleotide polymorphisms (SNPs) genotyped on the Metabochip. We evaluated the association between each SNP and log-transformed CRP levels using multivariate linear regression, with additive genetic models adjusted for age, sex, the first four principal components of genetic ancestry, and study-specific factors. Differential linkage disequilibrium patterns between race/ethnicity groups were used to fine-map regions associated with CRP levels. Conditional analyses evaluated for multiple independent signals within genetic regions. One hundred and sixty-three unique variants in 12 loci in overall or race/ethnicity-stratified Metabochip-wide scans reached a Bonferroni-corrected P-value <2.5E-7. Three loci have no (HACL1, OLFML2B) or only limited (PLA2G6) previous associations with CRP levels. Six loci had different top hits in race/ethnicity-specific versus overall analyses. Fine-mapping refined the signal in six loci, particularly in HNF1A. Conditional analyses provided evidence for secondary signals in LEPR, IL1RN and HNF1A, and for multiple independent signals in CRP and APOE. We identified novel variants and loci associated with CRP levels, generalized known CRP associations to a multiethnic study population, refined association signals at several loci and found evidence for multiple independent signals at several well-known loci. This study demonstrates the benefit of conducting inclusive genetic association studies in large multiethnic populations.
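
The significance threshold quoted above is consistent with a standard Bonferroni correction over roughly 200 000 tests. The short sketch below shows the arithmetic, assuming a family-wise alpha of 0.05, which the abstract does not state explicitly; the function name is illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumed family-wise alpha of 0.05 across ~200 000 Metabochip SNPs.
threshold = bonferroni_threshold(0.05, 200_000)
print(f"{threshold:.1e}")  # 2.5e-07
```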

Source
DOI Listing: http://dx.doi.org/10.1093/hmg/ddy211
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6077792
August 2018

INTEGRATING COMMUNITY-LEVEL DATA RESOURCES FOR PRECISION MEDICINE RESEARCH.

Pac Symp Biocomput 2018 ;23:618-622

Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, 44106, USA.

Precision Medicine focuses on collecting and using individual-level data to improve healthcare outcomes. To date, research efforts have been motivated by molecular-scale measurements, such as incorporating genomic data into clinical use. In many cases, however, environmental, social, and economic factors are much more predictive of health outcomes, yet are not systematically used in clinical practice due to the difficulties in measurement and quantification. Advances in the availability of electronic health information and environmental exposure data, together with the more systematic use of geo-coding, now provide ways to systematically assess community-level indicators of health, and link these factors to electronic health records for evaluating their influence on disease outcomes. In this workshop, we discuss new electronic sources of community-level data, and provide insight into their utility and validity when compared with gold-standard data collection approaches.

Source
August 2018

Local ancestry transitions modify snp-trait associations.

Pac Symp Biocomput 2018 ;23:424-435

Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN 37235, USA; Departments of Biological Sciences, Biomedical Informatics, and Computer Science, Vanderbilt University, Nashville, TN 37235, USA.

Genomic maps of local ancestry identify ancestry transitions - points on a chromosome where recent recombination events in admixed individuals have joined two different ancestral haplotypes. These events bring together alleles that evolved within separate continental populations, providing a unique opportunity to evaluate the joint effect of these alleles on health outcomes. In this work, we evaluate the impact of genetic variants in the context of nearby local ancestry transitions within a sample of nearly 10,000 adults of African ancestry with traits derived from electronic health records. Genetic data was located using the Metabochip, and used to derive local ancestry. We develop a model that captures the effect of both single variants and local ancestry, and use it to identify examples where local ancestry transitions significantly interact with nearby variants to influence metabolic traits. In our most compelling example, we find that the minor allele of rs16890640 occurring on a European background with a downstream local ancestry transition to African ancestry results in significantly lower mean corpuscular hemoglobin and volume. This finding represents a new way of discovering genetic interactions, and is supported by molecular data that suggest changes to local ancestry may impact local chromatin looping.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5728664
August 2018

PRECISION MEDICINE: FROM DIPLOTYPES TO DISPARITIES TOWARDS IMPROVED HEALTH AND THERAPIES.

Pac Symp Biocomput 2018 ;23:389-399

Population and Quantitative Health Sciences, Institute for Computational Biology, Case Western Reserve University, Cleveland, OH, 44106, USA.

Precision medicine research efforts both in basic science discovery and clinical implementation are well underway and promise to provide individualized preventions and treatments, improving overall health care delivery. To achieve these goals, advances in data capture and analysis are needed spanning different types of 'omic and clinical data. The efforts to enhance precise treatments for all may accentuate healthcare disparities unless specific challenges are identified and addressed. This session of the 2018 Pacific Symposium on Biocomputing presents the latest developments in this transdisciplinary research space of genomics, medicine, and population health.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6182117
August 2018
-->