Publications by authors named "Pui-Yan Kwok"

202 Publications

Application of full-genome analysis to diagnose rare monogenic disorders.

NPJ Genom Med 2021 Sep 23;6(1):77. Epub 2021 Sep 23.

Children's Hospital Oakland Research Institute, Benioff Children's Hospital Oakland, University of California San Francisco, Oakland, CA, USA.

Current genetic tests for rare diseases provide a diagnosis in only a modest proportion of cases. The Full-Genome Analysis method, FGA, combines long-range assembly and whole-genome sequencing to detect small variants, structural variants with breakpoint resolution, and phasing. We built a variant prioritization pipeline and tested FGA's utility for diagnosis of rare diseases in a clinical setting. FGA identified structural variants and small variants with an overall diagnostic yield of 40% (20 of 50 cases) and 35% in exome-negative cases (8 of 23 cases); 4 of these were structural variants. FGA detected and mapped structural variants that are missed by short reads, including non-coding duplication, and phased variants across long distances of more than 180 kb. With the prioritization algorithm, longer DNA technologies could replace multiple tests for monogenic disorders and expand the range of variants detected. Our study suggests that genomes produced from technologies like FGA can improve variant detection and provide higher resolution genome maps for future application.

Source
DOI: http://dx.doi.org/10.1038/s41525-021-00241-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8460793
September 2021

Integrated genomic analyses of cutaneous T-cell lymphomas reveal the molecular bases for disease heterogeneity.

Blood 2021 10;138(14):1225-1236

Department of Dermatology, and.

Cutaneous T-cell lymphomas (CTCLs) are a clinically heterogeneous collection of lymphomas of the skin-homing T cell. To identify molecular drivers of disease phenotypes, we assembled representative samples of CTCLs from patients with diverse disease subtypes and stages. Via DNA/RNA-sequencing, immunophenotyping, and ex vivo functional assays, we identified the landscape of putative driver genes, elucidated genetic relationships between CTCLs across disease stages, and inferred molecular subtypes in patients with stage-matched leukemic disease. Collectively, our analysis identified 86 putative driver genes, including 19 genes not previously implicated in this disease. Two mutations have never been described in any cancer. Functionally, multiple mutations augment T-cell receptor-dependent proliferation, highlighting the importance of this pathway in lymphomagenesis. To identify putative genetic causes of disease heterogeneity, we examined the distribution of driver genes across clinical cohorts. There are broad similarities across disease stages. Many driver genes are shared by mycosis fungoides (MF) and Sezary syndrome (SS). However, there are significantly more structural variants in leukemic disease, leading to highly recurrent deletions of putative tumor suppressors that are uncommon in early-stage skin-centered MF. For example, TP53 is deleted in 7% and 87% of MF and SS, respectively. In both human and mouse samples, PD1 mutations drive aggressive behavior. PD1 wild-type lymphomas show features of T-cell exhaustion. PD1 deletions are sufficient to reverse the exhaustion phenotype, promote a FOXM1-driven transcriptional signature, and predict significantly worse survival. Collectively, our findings clarify CTCL genetics and provide novel insights into pathways that drive diverse disease phenotypes.

Source
DOI: http://dx.doi.org/10.1182/blood.2020009655
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8499046
October 2021

Genome-wide association study of early-onset bipolar I disorder in the Han Taiwanese population.

Transl Psychiatry 2021 05 20;11(1):301. Epub 2021 May 20.

Graduate Institute of Biomedical Sciences, China Medical University, Taichung, Taiwan.

The search for susceptibility genes underlying the heterogeneous bipolar disorder has been inconclusive, often with irreproducible results. There is a hope that narrowing the phenotypes will increase the power of genetic analysis. Early-onset bipolar disorder is thought to be a genetically homogeneous subtype with greater symptom severity. We conducted a genome-wide association study (GWAS) for this subtype in bipolar I (BPI) disorder. Study participants included 1779 patients of Han Chinese descent with BPI disorder recruited by the Taiwan Bipolar Consortium. We conducted phenotype assessment using the Chinese version of the Schedules for Clinical Assessment in Neuropsychiatry and prepared a life chart with graphic depiction of lifetime clinical course for each BPI patient recruited. The assessment of onset age was based on this life chart with early onset defined as ≤20 years of age. We performed GWAS in a discovery group of 516 early-onset and 790 non-early-onset BPI patients, followed by a replication study in an independent group of 153 early-onset and 320 non-early-onset BPI patients and a meta-analysis with these two groups. The SNP rs11127876, located in the intron of CADM2, showed association with early-onset BPI in the discovery cohort (P = 7.04 × 10) and in the test of replication (P = 0.0354). After meta-analysis, this SNP was demonstrated to be a new genetic locus in CADM2 gene associated with early-onset BPI disorder (P = 5.19 × 10).

Source
DOI: http://dx.doi.org/10.1038/s41398-021-01407-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8137921
May 2021

Expanding the genotypic and phenotypic spectrum in a diverse cohort of 104 individuals with Wiedemann-Steiner syndrome.

Am J Med Genet A 2021 06 30;185(6):1649-1665. Epub 2021 Mar 30.

Division of Human Genetics, Department of Pediatrics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.

Wiedemann-Steiner syndrome (WSS) is an autosomal dominant disorder caused by monoallelic variants in KMT2A and characterized by intellectual disability and hypertrichosis. We performed a retrospective, multicenter, observational study of 104 individuals with WSS from five continents to characterize the clinical and molecular spectrum of WSS in diverse populations, to identify physical features that may be more prevalent in White versus Black Indigenous People of Color individuals, to delineate genotype-phenotype correlations, to define developmental milestones, to describe the syndrome through adulthood, and to examine clinicians' differential diagnoses. Sixty-nine of the 82 variants (84%) observed in the study were not previously reported in the literature. Common clinical features identified in the cohort included: developmental delay or intellectual disability (97%), constipation (63.8%), failure to thrive (67.7%), feeding difficulties (66.3%), hypertrichosis cubiti (57%), short stature (57.8%), and vertebral anomalies (46.9%). The median ages at walking and first words were 20 months and 18 months, respectively. Hypotonia was associated with loss of function (LoF) variants, and seizures were associated with non-LoF variants. This study identifies genotype-phenotype correlations as well as race-facial feature associations in an ethnically diverse cohort, and accurately defines developmental trajectories, medical comorbidities, and long-term outcomes in individuals with WSS.

Source
DOI: http://dx.doi.org/10.1002/ajmg.a.62124
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8631250
June 2021

Genomic Variation and Recent Population Histories of Spotted (Strix occidentalis) and Barred (Strix varia) Owls.

Genome Biol Evol 2021 05;13(5)

Institute for Human Genetics, University of California San Francisco, CA, USA.

Spotted owls (SOs, Strix occidentalis) are a flagship species inhabiting old-growth forests in western North America. In recent decades, their populations have declined due to ongoing reductions in suitable habitat caused by logging, wildfires, and competition with the congeneric barred owl (BO, Strix varia). The northern spotted owl (S. o. caurina) has been listed as "threatened" under the Endangered Species Act since 1990. Here, we use an updated SO genome assembly along with 51 high-coverage whole-genome sequences to examine population structure, hybridization, and recent changes in population size in SO and BO. We found that potential hybrids identified from intermediate plumage morphology were a mixture of pure BO, F1 hybrids, and F1 × BO backcrosses. Also, although SO underwent a population bottleneck around the time of the Pleistocene-Holocene transition, their population sizes rebounded and show no evidence of any historical (i.e., 100-10,000 years ago) population decline. This suggests that the current decrease in SO abundance is due to events in the past century. Finally, we estimate that western and eastern BOs have been genetically separated for thousands of years, instead of the previously assumed recent (i.e., <150 years) divergence. Although this result is surprising, it is unclear where the ancestors of western BO lived after the separation. In particular, although BO may have colonized western North America much earlier than the first recorded observations, it is also possible that the estimated divergence time reflects unsampled BO population structure within central or eastern North America.

Source
DOI: http://dx.doi.org/10.1093/gbe/evab066
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8120011
May 2021

Genomic regions associated with microdeletion/microduplication syndromes exhibit extreme diversity of structural variation.

Genetics 2021 02;217(2)

Department of Pediatrics, Section of Clinical Genetics and Metabolism, University of Colorado School of Medicine, Aurora, CO 80045, USA.

Segmental duplications (SDs) are a class of long, repetitive DNA elements whose paralogs share a high level of sequence similarity with each other. SDs mediate chromosomal rearrangements that lead to structural variation in the general population as well as genomic disorders associated with multiple congenital anomalies, including the 7q11.23 (Williams-Beuren Syndrome, WBS), 15q13.3, and 16p12.2 microdeletion syndromes. Population-level characterization of SDs has generally been lacking because most techniques used for analyzing these complex regions are both labor and cost intensive. In this study, we have used a high-throughput technique to genotype complex structural variation with a single molecule, long-range optical mapping approach. We characterized SDs and identified novel structural variants (SVs) at 7q11.23, 15q13.3, and 16p12.2 using optical mapping data from 154 phenotypically normal individuals from 26 populations comprising five super-populations. We detected several novel SVs for each locus, some of which had significantly different prevalence between populations. Additionally, we localized the microdeletion breakpoints to specific paralogous duplicons located within complex SDs in two patients with WBS, one patient with 15q13.3, and one patient with 16p12.2 microdeletion syndromes. The population-level data presented here highlights the extreme diversity of large and complex SVs within SD-containing regions. The approach we outline will greatly facilitate the investigation of the role of inter-SD structural variation as a driver of chromosomal rearrangements and genomic disorders.

Source
DOI: http://dx.doi.org/10.1093/genetics/iyaa038
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8045732
February 2021

Genetic profiles of 103,106 individuals in the Taiwan Biobank provide insights into the health and history of Han Chinese.

NPJ Genom Med 2021 Feb 11;6(1):10. Epub 2021 Feb 11.

Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan.

Personalized medical care focuses on prediction of disease risk and response to medications. To build the risk models, access to both large-scale genomic resources and human genetic studies is required. The Taiwan Biobank (TWB) has generated high-coverage, whole-genome sequencing data from 1492 individuals and genome-wide SNP data from 103,106 individuals of Han Chinese ancestry using custom SNP arrays. Principal components analysis of the genotyping data showed that the full range of Han Chinese genetic variation was found in the cohort. The arrays also include thousands of known functional variants, allowing for simultaneous ascertainment of Mendelian disease-causing mutations and variants that affect drug metabolism. We found that 21.2% of the population are mutation carriers of autosomal recessive diseases, 3.1% have mutations in cancer-predisposing genes, and 87.3% carry variants that affect drug response. We highlight how TWB data provide insight into both population history and disease burden, while showing how widespread genetic testing can be used to improve clinical care.

Source
DOI: http://dx.doi.org/10.1038/s41525-021-00178-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7878858
February 2021

Investigating rare pathogenic/likely pathogenic exonic variation in bipolar disorder.

Mol Psychiatry 2021 Sep 22;26(9):5239-5250. Epub 2021 Jan 22.

HudsonAlpha Institute for Biotechnology, Huntsville, AL, 35806, USA.

Bipolar disorder (BD) is a serious mental illness with substantial common variant heritability. However, the role of rare coding variation in BD is not well established. We examined the protein-coding (exonic) sequences of 3,987 unrelated individuals with BD and 5,322 controls of predominantly European ancestry across four cohorts from the Bipolar Sequencing Consortium (BSC). We assessed the burden of rare, protein-altering, single nucleotide variants classified as pathogenic or likely pathogenic (P-LP) both exome-wide and within several groups of genes with phenotypic or biologic plausibility in BD. While we observed an increased burden of rare coding P-LP variants within 165 genes identified as BD GWAS regions in 3,987 BD cases (meta-analysis OR = 1.9, 95% CI = 1.3-2.8, one-sided p = 6.0 × 10), this enrichment did not replicate in an additional 9,929 BD cases and 14,018 controls (OR = 0.9, one-sided p = 0.70). Although BD shares common variant heritability with schizophrenia, in the BSC sample we did not observe a significant enrichment of P-LP variants in SCZ GWAS genes, in two classes of neuronal synaptic genes (RBFOX2 and FMRP) associated with SCZ or in loss-of-function intolerant genes. In this study, the largest analysis of exonic variation in BD, individuals with BD do not carry a replicable enrichment of rare P-LP variants across the exome or in any of several groups of genes with biologic plausibility. Moreover, despite a strong shared susceptibility between BD and SCZ through common genetic variation, we do not observe an association between BD risk and rare P-LP coding variants in genes known to modulate risk for SCZ.

Source
DOI: http://dx.doi.org/10.1038/s41380-020-01006-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8295400
September 2021

A Large-Scale Association Study Detects Novel Rare Variants, Risk Genes, Functional Elements, and Polygenic Architecture of Prostate Cancer Susceptibility.

Cancer Res 2021 04 8;81(7):1695-1703. Epub 2020 Dec 8.

Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California.

To identify rare variants associated with prostate cancer susceptibility and better characterize the mechanisms and cumulative disease risk associated with common risk variants, we conducted an integrated study of prostate cancer genetic etiology in two cohorts using custom genotyping microarrays, large imputation reference panels, and functional annotation approaches. Specifically, 11,984 men (6,196 prostate cancer cases and 5,788 controls) of European ancestry from Northern California Kaiser Permanente were genotyped and meta-analyzed with 196,269 men of European ancestry (7,917 prostate cancer cases and 188,352 controls) from the UK Biobank. Three novel loci, including two rare variants (European ancestry minor allele frequency < 0.01, at 3p21.31 and 8p12), were significant genome wide in a meta-analysis. Gene-based rare variant tests implicated a known prostate cancer gene (), as well as a novel candidate gene (), which encodes a receptor highly expressed in prostate tissue and is related to the B7/CD28 family of T-cell immune checkpoint markers. Haplotypic patterns of long-range linkage disequilibrium were observed for rare genetic variants at and other loci, reflecting their evolutionary history. In addition, a polygenic risk score (PRS) of 188 prostate cancer variants was strongly associated with risk (90th vs. 40th-60th percentile OR = 2.62, = 2.55 × 10). Many of the 188 variants exhibited functional signatures of gene expression regulation or transcription factor binding, including a 6-fold difference in log-probability of androgen receptor binding at the variant rs2680708 (17q22). Rare variant and PRS associations, with concomitant functional interpretation of risk mechanisms, can help clarify the full genetic architecture of prostate cancer and other complex traits. 
SIGNIFICANCE: This study maps the biological relationships between diverse risk factors for prostate cancer, integrating different functional datasets to interpret and model genome-wide data from over 200,000 men with and without prostate cancer.

Source
DOI: http://dx.doi.org/10.1158/0008-5472.CAN-20-2635
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8137514
April 2021

Accurate assembly of the olive baboon (Papio anubis) genome using long-read and Hi-C data.

Gigascience 2020 12;9(12)

Institute for Human Genetics, University of California San Francisco, 513 Parnassus Avenue, San Francisco, CA 94143, USA.

Background: Baboons are a widely used nonhuman primate model for biomedical, evolutionary, and basic genetics research. Despite this importance, the genomic resources for baboons are limited. In particular, the current baboon reference genome Panu_3.0 is a highly fragmented, reference-guided (i.e., not fully de novo) assembly, and its poor quality inhibits our ability to conduct downstream genomic analyses.

Findings: Here we present a de novo genome assembly of the olive baboon (Papio anubis) that uses data from several recently developed single-molecule technologies. Our assembly, Panubis1.0, has an N50 contig size of ∼1.46 Mb (as opposed to 139 kb for Panu_3.0) and has single scaffolds that span each of the 20 autosomes and the X chromosome.

Conclusions: We highlight multiple lines of evidence (including Bionano Genomics data, pedigree linkage information, and linkage disequilibrium data) suggesting that there are several large assembly errors in Panu_3.0, which have been corrected in Panubis1.0.

Source
DOI: http://dx.doi.org/10.1093/gigascience/giaa134
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7719865
December 2020

Towards a reference genome that captures global genetic diversity.

Nat Commun 2020 10 30;11(1):5482. Epub 2020 Oct 30.

Cardiovascular Research Institute, University of California, San Francisco, San Francisco, CA, 94158, USA.

The current human reference genome is predominantly derived from a single individual and it does not adequately reflect human genetic diversity. Here, we analyze 338 high-quality human assemblies of genetically divergent human populations to identify missing sequences in the human reference genome with breakpoint resolution. We identify 127,727 recurrent non-reference unique insertions spanning 18,048,877 bp, some of which disrupt exons and known regulatory elements. To improve genome annotations, we linearly integrate these sequences into the chromosomal assemblies and construct a Human Diversity Reference. Leveraging this reference, an average of 402,573 previously unmapped reads can be recovered for a given genome sequenced to ~40X coverage. Transcriptomic diversity among these non-reference sequences can also be directly assessed. We successfully map tens of thousands of previously discarded RNA-Seq reads to this reference and identify transcription evidence in 4781 gene loci, underlining the importance of these non-reference sequences in functional genomics. Our extensive datasets are important advances toward a comprehensive reference representation of global human genetic diversity.

Source
DOI: http://dx.doi.org/10.1038/s41467-020-19311-w
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7599213
October 2020

Mutations in Metabotropic Glutamate Receptor 1 Contribute to Natural Short Sleep Trait.

Curr Biol 2021 01 15;31(1):13-24.e4. Epub 2020 Oct 15.

Department of Neurology, University of California, San Francisco, San Francisco, CA 94143, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94143, USA; Weill Institute for Neuroscience, University of California, San Francisco, San Francisco, CA 94143, USA; Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, San Francisco, CA 94143, USA.

Sufficient and efficient sleep is crucial for our health. Natural short sleepers can sleep significantly shorter than the average population without a desire for more sleep and without any obvious negative health consequences. In searching for genetic variants underlying the short sleep trait, we found two different mutations in the same gene (metabotropic glutamate receptor 1) from two independent natural short sleep families. In vitro, both of the mutations exhibited loss of function in receptor-mediated signaling. In vivo, the mice carrying the individual mutations both demonstrated short sleep behavior. In brain slices, both of the mutations changed the electrical properties and increased excitatory synaptic transmission. These results highlight the important role of metabotropic glutamate receptor 1 in modulating sleep duration.

Source
DOI: http://dx.doi.org/10.1016/j.cub.2020.09.071
January 2021

Genomic Analysis of Historical Cases with Positive Newborn Screens for Short-Chain Acyl-CoA Dehydrogenase Deficiency Shows That a Validated Second-Tier Biochemical Test Can Replace Future Sequencing.

Int J Neonatal Screen 2020 Jun 26;6(2). Epub 2020 May 26.

Department of Pediatrics, University of California, San Francisco, CA 94158 USA.

Short-chain acyl-CoA dehydrogenase deficiency (SCADD) is a rare autosomal recessive disorder of β-oxidation caused by pathogenic variants in the gene. Analyte testing for SCADD in blood and urine, including newborn screening (NBS) using tandem mass spectrometry (MS/MS) on dried blood spots (DBSs), is complicated by the presence of two relatively common variants (c.625G>A and c.511C>T). Individuals homozygous for these variants or compound heterozygous do not have clinical disease but do have reduced short-chain acyl-CoA dehydrogenase (SCAD) activity, resulting in elevated blood and urine metabolites. As part of a larger study of the potential role of exome sequencing in NBS in California, we reviewed sequence and MS/MS data from DBSs from a cohort of 74 patients identified to have SCADD. Of this cohort, approximately 60% had one or more of the common variants and did not have the two rare variants, and thus would need no further testing. Retrospective analysis of ethylmalonic acid, glutaric acid, 2-hydroxyglutaric acid, 3-hydroxyglutaric acid, and methylsuccinic acid demonstrated that second-tier testing applied before the release of the newborn screening result could reduce referrals by over 50% and improve the positive predictive value for SCADD to above 75%.

Source
DOI: http://dx.doi.org/10.3390/ijns6020041
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7423011
June 2020

The role of exome sequencing in newborn screening for inborn errors of metabolism.

Nat Med 2020 09 10;26(9):1392-1397. Epub 2020 Aug 10.

Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA.

Public health newborn screening (NBS) programs provide population-scale ascertainment of rare, treatable conditions that require urgent intervention. Tandem mass spectrometry (MS/MS) is currently used to screen newborns for a panel of rare inborn errors of metabolism (IEMs). The NBSeq project evaluated whole-exome sequencing (WES) as an innovative methodology for NBS. We obtained archived residual dried blood spots and data for nearly all IEM cases from the 4.5 million infants born in California between mid-2005 and 2013 and from some infants who screened positive by MS/MS, but were unaffected upon follow-up testing. WES had an overall sensitivity of 88% and specificity of 98.4%, compared to 99.0% and 99.8%, respectively for MS/MS, although effectiveness varied among individual IEMs. Thus, WES alone was insufficiently sensitive or specific to be a primary screen for most NBS IEMs. However, as a secondary test for infants with abnormal MS/MS screens, WES could reduce false-positive results, facilitate timely case resolution and in some instances even suggest more appropriate or specific diagnosis than that initially obtained. This study represents the largest, to date, sequencing effort of an entire population of IEM-affected cases, allowing unbiased assessment of current capabilities of WES as a tool for population screening.

Source
DOI: http://dx.doi.org/10.1038/s41591-020-0966-5
September 2020

De novo mutation and skewed X-inactivation in girl with BCAP31-related syndrome.

Hum Mutat 2020 Oct 22;41(10):1775-1782. Epub 2020 Jul 22.

Department of Pediatrics, National Taiwan University Hospital, Taipei, Taiwan.

Full genome analysis of a young girl with deafness, dystonia, central hypomyelination, refractory seizure, and fluctuating liver function impairment revealed a heterozygous, de novo variant in the BCAP31 gene on chromosome Xq28 (NM_001256447.2:c.92G>A), mutations of which cause the X-linked recessive severe neurologic disorder deafness, dystonia, and cerebral hypomyelination. Reverse transcription-polymerase chain reaction of the patient's white blood cells showed the absence of wild-type BCAP31 messenger RNA (mRNA) but the presence of two novel BCAP31 mRNAs. The major alternatively spliced mRNA is due to Exon 2 skipping and the utilization of a new initiation site in Exon 3 that leads to a frameshift and truncated transcript, while the minor novel mRNA has a 110 nucleotide insertion to Exon 2. Phasing studies showed that the de novo variant arose in the paternal X chromosome. An X chromosome inactivation assay confirmed that the patient's maternal X chromosome was preferentially inactivated, providing evidence that the mutated BCAP31 gene was the one predominantly expressed. According to the American College of Medical Genetics and Genomics guideline, this variant is deemed "pathogenic" (PS2, PS3, PM2, PP3, and PP4) and deleterious. This is the first reported female patient with BCAP31-related syndrome resulting from skewed X-inactivation and a de novo mutation in the active X chromosome.

Source
DOI: http://dx.doi.org/10.1002/humu.24080
October 2020

Analysis of putative cis-regulatory elements regulating blood pressure variation.

Hum Mol Genet 2020 07;29(11):1922-1932

Department of Genetic Medicine, McKusick-Nathans Institute, Baltimore, MD 21205, USA.

Hundreds of loci have been associated with blood pressure (BP) traits from many genome-wide association studies. We identified an enrichment of these loci in aorta and tibial artery expression quantitative trait loci in our previous work in ~100 000 Genetic Epidemiology Research on Aging study participants. In the present study, we sought to fine-map known loci and identify novel genes by determining putative regulatory regions for these and other tissues relevant to BP. We constructed maps of putative cis-regulatory elements (CREs) using publicly available open chromatin data for the heart, aorta and tibial arteries, and multiple kidney cell types. Variants within these regions may be evaluated quantitatively for their tissue- or cell-type-specific regulatory impact using deltaSVM functional scores, as described in our previous work. We aggregate variants within these putative CREs within 50 Kb of the start or end of 'expressed' genes in these tissues or cell types using public expression data and use deltaSVM scores as weights in the group-wise sequence kernel association test to identify candidates. We test for association with both BP traits and expression within these tissues or cell types of interest and identify the candidates MTHFR, C10orf32, CSK, NOV, ULK4, SDCCAG8, SCAMP5, RPP25, HDGFRP3, VPS37B and PPCDC. Additionally, we examined two known QT interval genes, SCN5A and NOS1AP, in the Atherosclerosis Risk in Communities Study, as a positive control, and observed the expected heart-specific effect. Thus, our method identifies variants and genes for further functional testing using tissue- or cell-type-specific putative regulatory information.

Source
DOI: http://dx.doi.org/10.1093/hmg/ddaa098
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7372556
July 2020

Comprehensive Analysis of Human Subtelomeres by Whole Genome Mapping.

PLoS Genet 2020 01 27;16(1):e1008347. Epub 2020 Jan 27.

School of Biomedical Engineering, Drexel University, Philadelphia, PA, United States of America.

Detailed comprehensive knowledge of the structures of individual long-range telomere-terminal haplotypes is needed to understand their impact on telomere function, and to delineate the population structure and evolution of subtelomere regions. However, the abundance of large evolutionarily recent segmental duplications and high levels of large structural variations have complicated both the mapping and sequence characterization of human subtelomere regions. Here, we use high throughput optical mapping of large single DNA molecules in nanochannel arrays for 154 human genomes from 26 populations to present a comprehensive look at human subtelomere structure and variation. The results catalog many novel long-range subtelomere haplotypes and determine the frequencies and contexts of specific subtelomeric duplicons on each chromosome arm, helping to clarify the currently ambiguous nature of many specific subtelomere structures as represented in the current reference sequence (HG38). The organization and content of some duplicons in subtelomeres appear to show both chromosome arm and population-specific trends. Based upon these trends we estimate a timeline for the spread of these duplication blocks.

Source
DOI: http://dx.doi.org/10.1371/journal.pgen.1008347
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7004388
January 2020

The Driver of Extreme Human-Specific Olduvai Repeat Expansion Remains Highly Active in the Human Genome.

Genetics 2020 01 21;214(1):179-191. Epub 2019 Nov 21.

Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado 80045

Sequences encoding Olduvai protein domains (formerly DUF1220) show the greatest human lineage-specific increase in copy number of any coding region in the genome and have been associated, in a dosage-dependent manner, with brain size, cognitive aptitude, autism, and schizophrenia. Tandem intragenic duplications of a three-domain block, termed the Olduvai triplet, in four genes in the chromosomal 1q21.1-21.2 region, are primarily responsible for the striking human-specific copy number increase. Interestingly, most of the Olduvai triplets are adjacent to, and transcriptionally coregulated with, three human-specific genes that have been shown to promote cortical neurogenesis. Until now, the underlying genomic events that drove the Olduvai hyperamplification in humans have remained unexplained. Here, we show that the presence or absence of an alternative first exon of the Olduvai triplet perfectly discriminates between amplified (58/58) and unamplified (0/12) triplets. We provide sequence and breakpoint analyses that suggest the alternative exon was produced by a nonallelic homologous recombination-based mechanism involving the duplicative transposition of an existing Olduvai exon found in the CON3 domain, which typically occurs at the C-terminal end of genes. We also provide suggestive evidence that the alternative exon may promote instability through a putative G-quadruplex (pG4)-based mechanism. Lastly, we use single-molecule optical mapping to characterize the intragenic structural variation observed in genes in 154 unrelated individuals and 52 related individuals from 16 families and show that the presence of pG4-containing Olduvai triplets is strongly correlated with high levels of Olduvai copy number variation. These results suggest that the same driver of genomic instability that allowed the evolutionarily recent, rapid, and extreme human-specific Olduvai expansion remains highly active in the human genome.

Source
DOI: http://dx.doi.org/10.1534/genetics.119.302782
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6944415
January 2020

Mutant neuropeptide S receptor reduces sleep duration with preserved memory consolidation.

Sci Transl Med 2019 10;11(514)

Department of Neurology, University of California San Francisco, San Francisco, CA 94143, USA.

Sleep is a crucial physiological process for our survival and cognitive performance, yet the factors controlling human sleep regulation remain poorly understood. Here, we identified a missense mutation in a G protein-coupled neuropeptide S receptor 1 (NPSR1) that is associated with a natural short sleep phenotype in humans. Mice carrying the homologous mutation exhibited less sleep time despite increased sleep pressure. These animals were also resistant to contextual memory deficits associated with sleep deprivation. In vivo, the mutant receptors showed increased sensitivity to neuropeptide S exogenous activation. These results suggest that the NPS/NPSR1 pathway might play a critical role in regulating human sleep duration and in the link between sleep homeostasis and memory consolidation.

Source
DOI: http://dx.doi.org/10.1126/scitranslmed.aax2014
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7587149
October 2019

Three patients with homozygous familial hypercholesterolemia: Genomic sequencing and kindred analysis.

Mol Genet Genomic Med 2019 12 16;7(12):e1007. Epub 2019 Oct 16.

Cardiovascular Research Institute, University of California, San Francisco, CA, USA.

Background: Homozygous Familial Hypercholesterolemia (HoFH) is an inherited recessive condition associated with extremely high levels of low-density lipoprotein (LDL) cholesterol in affected individuals. It is usually caused by homozygous or compound heterozygous functional mutations in the LDL receptor (LDLR). A number of mutations causing FH have been reported in the literature, and such genetic heterogeneity presents great challenges for disease diagnosis.

Objective: We aim to determine the likely genetic defects responsible for three cases of pediatric HoFH in two kindreds.

Methods: We applied whole exome sequencing (WES) on the two probands to determine the likely functional variants among candidate FH genes. We additionally applied 10x Genomics (10xG) Linked-Reads whole genome sequencing (WGS) on one of the kindreds to identify potentially deleterious structural variants (SVs) underlying HoFH. A PCR-based screening assay was also established to detect the LDLR structural variant in a cohort of 641 patients with elevated LDL.

Results: In the Caucasian kindred, the FH homozygosity can be attributed to two compound heterozygous LDLR damaging variants, an exon 12 p.G592E missense mutation and a novel 3kb exon 1 deletion. By analyzing the 10xG phased data, we ascertained that this deletion allele was most likely to have originated from a Russian ancestor. In the Mexican kindred, the strikingly elevated LDL cholesterol level can be attributed to a homozygous frameshift LDLR variant p.E113fs.

Conclusions: While the application of WES can provide a cost-effective way of identifying the genetic causes of FH, it often lacks sensitivity for detecting structural variants. Our finding of the LDLR exon 1 deletion highlights the broader utility of Linked-Read WGS in detecting SVs in the clinical setting, especially when HoFH patients remain undiagnosed after WES.

Source
DOI: http://dx.doi.org/10.1002/mgg3.1007
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6900368
December 2019

The 22q11 low copy repeats are characterized by unprecedented size and structural variability.

Genome Res 2019 09;29(9):1389-1401

Department of Human Genetics, KU Leuven, Leuven, 3000 Belgium.

Low copy repeats (LCRs) are recognized as a significant source of genomic instability, driving genome variability and evolution. The Chromosome 22 LCRs (LCR22s) mediate nonallelic homologous recombination (NAHR) leading to the 22q11 deletion syndrome (22q11DS). However, LCR22s are among the most complex regions in the genome, and their structure remains unresolved. The difficulty in generating accurate maps of LCR22s has also hindered localization of the deletion end points in 22q11DS patients. Using fiber FISH and Bionano optical mapping, we assembled LCR22 alleles in 187 cell lines. Our analysis uncovered an unprecedented level of variation in LCR22s, including LCR22A alleles ranging in size from 250 to 2000 kb. Further, the incidence of various LCR22 alleles varied within different populations. Additionally, the analysis of LCR22s in 22q11DS patients and their parents enabled further refinement of the rearrangement site within LCR22A and -D, which flank the 22q11 deletion. The NAHR site was localized to a 160-kb paralog shared between the LCR22A and -D in seven 22q11DS patients. Thus, we present the most comprehensive map of LCR22 variation to date. This will greatly facilitate the investigation of the role of LCR variation as a driver of 22q11 rearrangements and the phenotypic variability among 22q11DS patients.

Source
DOI: http://dx.doi.org/10.1101/gr.248682.119
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6724673
September 2019

Evaluating the quality of the 1000 genomes project data.

BMC Genomics 2019 Aug 16;20(1):620. Epub 2019 Aug 16.

Institute for Human Genetics, University of California, San Francisco, CA, 94143, USA.

Background: Data from the 1000 Genomes project is quite often used as a reference for human genomic analysis. However, its accuracy needs to be assessed to understand the quality of predictions made using this reference. We present here an assessment of the genotyping, phasing, and imputation accuracy data in the 1000 Genomes project. We compare the phased haplotype calls from the 1000 Genomes project to experimentally phased haplotypes for 28 of the same individuals sequenced using the 10X Genomics platform.

Results: We observe that phasing and imputation for rare variants are unreliable, which likely reflects the limited sample size of the 1000 Genomes project data. Further, it appears that using a population specific reference panel does not improve the accuracy of imputation over using the entire 1000 Genomes data set as a reference panel. We also note that the error rates and trends depend on the choice of definition of error, and hence any error reporting needs to take these definitions into account.

Conclusions: The quality of the 1000 Genomes data needs to be considered while using this database for further studies. This work presents an analysis that can be used for these assessments.

Source
DOI: http://dx.doi.org/10.1186/s12864-019-5957-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6696682
August 2019

Genome of the Komodo dragon reveals adaptations in the cardiovascular and chemosensory systems of monitor lizards.

Nat Ecol Evol 2019 08 29;3(8):1241-1252. Epub 2019 Jul 29.

Gladstone Institutes, San Francisco, CA, USA.

Monitor lizards are unique among ectothermic reptiles in that they have high aerobic capacity and distinctive cardiovascular physiology resembling that of endothermic mammals. Here, we sequence the genome of the Komodo dragon Varanus komodoensis, the largest extant monitor lizard, and generate a high-resolution de novo chromosome-assigned genome assembly for V. komodoensis using a hybrid approach of long-range sequencing and single-molecule optical mapping. Comparing the genome of V. komodoensis with those of related species, we find evidence of positive selection in pathways related to energy metabolism, cardiovascular homoeostasis, and haemostasis. We also show species-specific expansions of a chemoreceptor gene family related to pheromone and kairomone sensing in V. komodoensis and other lizard lineages. Together, these evolutionary signatures of adaptation reveal the genetic underpinnings of the unique Komodo dragon sensory and cardiovascular systems, and suggest that selective pressure altered haemostasis genes to help Komodo dragons evade the anticoagulant effects of their own saliva. The Komodo dragon genome is an important resource for understanding the biology of monitor lizards and reptiles worldwide.

Source
DOI: http://dx.doi.org/10.1038/s41559-019-0945-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6668926
August 2019

OMMA enables population-scale analysis of complex genomic features and phylogenomic relationships from nanochannel-based optical maps.

Gigascience 2019 07;8(7)

School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong.

Background: Optical mapping is an emerging technology that complements sequencing-based methods in genome analysis. It is widely used in improving genome assemblies and detecting structural variations by providing information over much longer (up to 1 Mb) reads. Current standards in optical mapping analysis involve assembling optical maps into contigs and aligning them to a reference, which is limited to pairwise comparison and becomes bias-prone when analyzing multiple samples.

Findings: We present a new method, OMMA, that extends optical mapping to the study of complex genomic features by simultaneously interrogating optical maps across many samples in a reference-independent manner. OMMA captures and characterizes complex genomic features, e.g., multiple haplotypes, copy number variations, and subtelomeric structures when applied to 154 human samples across the 26 populations sequenced in the 1000 Genomes Project. For small genomes such as pathogenic bacteria, OMMA accurately reconstructs the phylogenomic relationships and identifies functional elements across 21 Acinetobacter baumannii strains.

Conclusions: With the increasing data throughput of optical mapping systems, the use of this technology in comparative genome analysis across many samples will become feasible. OMMA is a timely solution that can address this computational need. The OMMA software is available at https://github.com/TF-Chan-Lab/OMTools.

Source
DOI: http://dx.doi.org/10.1093/gigascience/giz079
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6615982
July 2019

Multi-platform discovery of haplotype-resolved structural variation in human genomes.

Nat Commun 2019 04 16;10(1):1784. Epub 2019 Apr 16.

Pacific Biosciences, Menlo Park, CA, 94025, USA.

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.

Source
DOI: http://dx.doi.org/10.1038/s41467-018-08148-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6467913
April 2019

Genome maps across 26 human populations reveal population-specific patterns of structural variation.

Nat Commun 2019 03 4;10(1):1025. Epub 2019 Mar 4.

Cardiovascular Research Institute, University of California-San Francisco, San Francisco, CA, 94143, USA.

Large structural variants (SVs) in the human genome are difficult to detect and study by conventional sequencing technologies. With long-range genome analysis platforms, such as optical mapping, one can identify large SVs (>2 kb) across the genome in one experiment. Analyzing optical genome maps of 154 individuals from the 26 populations sequenced in the 1000 Genomes Project, we find that phylogenetic population patterns of large SVs are similar to those of single nucleotide variations in 86% of the human genome, while ~2% of the genome has high structural complexity. We are able to characterize SVs in many intractable regions of the genome, including segmental duplications and subtelomeric, pericentromeric, and acrocentric areas. In addition, we discover ~60 Mb of non-redundant genome content missing in the reference genome sequence assembly. Our results highlight the need for a comprehensive set of alternate haplotypes from different populations to represent SV patterns in the genome.

Source
DOI: http://dx.doi.org/10.1038/s41467-019-08992-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6399254
March 2019

Targeted Genomic Profiling of Acral Melanoma.

J Natl Cancer Inst 2019 10;111(10):1068-1077

Background: Acral melanoma is a rare type of melanoma that affects world populations irrespective of skin color and has worse survival than other cutaneous melanomas. It has relatively few single nucleotide mutations without the UV signature of cutaneous melanomas, but instead has a genetic landscape characterized by structural rearrangements and amplifications. BRAF mutations are less common than in other cutaneous melanomas, and knowledge about alternative therapeutic targets is incomplete.

Methods: To identify alternative therapeutic targets, we performed targeted deep-sequencing on 122 acral melanomas. We confirmed the loss of the tumor suppressors p16 and NF1 by immunohistochemistry in select cases.

Results: In addition to BRAF (21.3%), NRAS (27.9%), and KIT (11.5%) mutations, we identified a broad array of MAPK pathway activating alterations, including fusions of BRAF (2.5%), NTRK3 (2.5%), ALK (0.8%), and PRKCA (0.8%), which can be targeted by available inhibitors. Inactivation of NF1 occurred in 18 cases (14.8%). Inactivation of the NF1 cooperating factor SPRED1 occurred in eight cases (6.6%) as an alternative mechanism of disrupting the negative regulation of RAS. Amplifications recurrently affected narrow loci containing PAK1 and GAB2 (n = 27, 22.1%), CDK4 (n = 27, 22.1%), CCND1 (n = 24, 19.7%), EP300 (n = 20, 16.4%), YAP1 (n = 15, 12.3%), MDM2 (n = 13, 10.7%), and TERT (n = 13, 10.7%) providing additional and possibly complementary therapeutic targets. Acral melanomas with BRAFV600E mutations harbored fewer genomic amplifications and were more common in patients with European ancestry.

Conclusion: Our findings support a new, molecularly based subclassification of acral melanoma with potential therapeutic implications: BRAFV600E mutant acral melanomas with characteristics similar to nonacral melanomas that could benefit from BRAF inhibitor therapy, and non-BRAFV600E mutant acral melanomas. Acral melanomas without BRAFV600E mutations harbor a broad array of therapeutically relevant alterations. Expanded molecular profiling would increase the detection of potentially targetable alterations for this subtype of acral melanoma.

Source
DOI: http://dx.doi.org/10.1093/jnci/djz005
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6792090
October 2019

Integrative approach identifies corticosteroid response variant in diverse populations with asthma.

J Allergy Clin Immunol 2019 05 24;143(5):1791-1802. Epub 2018 Oct 24.

Center for Individualized and Genomic Medicine Research (CIGMA), Henry Ford Health System, Detroit, Mich; Department of Internal Medicine, Henry Ford Health System, Detroit, Mich.

Background: Although inhaled corticosteroid (ICS) medication is considered the cornerstone treatment for patients with persistent asthma, few ICS pharmacogenomic studies have involved nonwhite populations.

Objective: We sought to identify genetic predictors of ICS response in multiple population groups with asthma.

Methods: The discovery group comprised African American participants from the Study of Asthma Phenotypes and Pharmacogenomic Interactions by Race-Ethnicity (SAPPHIRE) who underwent 6 weeks of monitored ICS therapy (n = 244). A genome-wide scan was performed to identify single nucleotide polymorphism (SNP) variants jointly associated (ie, the combined effect of the SNP and SNP × ICS treatment interaction) with changes in asthma control. Top associations were validated by assessing the joint association with asthma exacerbations in 3 additional groups: African Americans (n = 803 and n = 563) and Latinos (n = 1461). RNA sequencing data from 408 asthmatic patients and 405 control subjects were used to examine whether genotype was associated with gene expression.

Results: One variant, rs3827907, was significantly associated with ICS-mediated changes in asthma control in the discovery set (P = 7.79 × 10) and was jointly associated with asthma exacerbations in 3 validation cohorts (P = .023, P = .029, and P = .041). RNA sequencing analysis found the rs3827907 C-allele to be associated with lower RNASE2 expression (P = 6.10 × 10). RNASE2 encodes eosinophil-derived neurotoxin, and the rs3827907 C-allele appeared to particularly influence ICS treatment response in the presence of eosinophilic inflammation (ie, high pretreatment eosinophil-derived neurotoxin levels or blood eosinophil counts).

Conclusion: We identified a variant, rs3827907, that appears to influence response to ICS treatment in multiple population groups and likely mediates its effect through eosinophils.

Source
DOI: http://dx.doi.org/10.1016/j.jaci.2018.09.034
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6482107
May 2019