Publications by authors named "Ananyo Choudhury"

24 Publications

Admixture/fine-mapping in Brazilians reveals a West African associated potential regulatory variant (rs114066381) with a strong female-specific effect on body mass and fat mass indexes.

Int J Obes (Lond) 2021 Feb 26. Epub 2021 Feb 26.

Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil.

Background/objectives: Admixed populations are a resource to study the global genetic architecture of complex phenotypes, which is critical, considering that non-European populations are severely underrepresented in genomic studies. Here, we study the genetic architecture of BMI in children, young adults, and elderly individuals from the admixed population of Brazil.

Subjects/methods: Leveraging admixture in Brazilians, whose chromosomes are mosaics of fragments of Native American, European, and African origins, we used genome-wide data to perform admixture mapping/fine-mapping of body mass index (BMI) in three Brazilian population-based cohorts from Northeast (Salvador), Southeast (Bambuí), and South (Pelotas).

Results: We found significant associations with African-associated alleles in children from Salvador (PALD1 and ZMIZ1 genes), and in young adults from Pelotas (NOD2 and MTUS2 genes). More importantly, in Pelotas, rs114066381, mapped in a potential regulatory region, is significantly associated only in females (p = 2.76e-06). This variant is rare in Europeans but reaches frequencies of ~3% in West Africa, and it has a strong female-specific effect (95% CI: 2.32-5.65 kg/m² per A allele). We confirmed this sex-specific association and replicated its strong effect for an adjusted fat mass index in the same Pelotas cohort, and for BMI in another Brazilian cohort from São Paulo (Southeast Brazil). A meta-analysis confirmed the significant association. Remarkably, we observed that while the frequency of the rs114066381-A allele ranges from 0.8 to 2.1% in the studied populations, it attains ~9% among women with morbid obesity from Pelotas, São Paulo, and Bambuí. The effect size of rs114066381 is at least five times higher than that of the FTO SNPs rs9939609 and rs1558902, already emblematic for their high effects.

Conclusions: We identified six candidate SNPs associated with BMI. rs114066381 stands out for its high effect that was replicated and its high frequency in women with morbid obesity. We demonstrate how admixed populations are a source of new relevant phenotype-associated genetic variants.
DOI: http://dx.doi.org/10.1038/s41366-021-00761-1
February 2021

Bantu-speaker migration and admixture in southern Africa.

Hum Mol Genet 2020 Dec 24. Epub 2020 Dec 24.

Palaeo-Research Institute, University of Johannesburg, Auckland Park, South Africa.

The presence of Early and Middle Stone Age human remains and associated archaeological artefacts from various sites scattered across southern Africa suggests this geographic region to be one of the first abodes of anatomically modern humans. Although the presence of hunter-gatherer cultures in this region dates back to deep times, the peopling of southern Africa has largely been reshaped by three major sets of migrations over the last 2000 years. These migrations have led to a confluence of four distinct ancestries (San hunter-gatherer, East African pastoralist, Bantu-speaker farmer and Eurasian) in populations from this region. In this review, we have summarized the recent insights into the refinement of timelines and routes of the migration of Bantu-speaking populations to southern Africa and their admixture with resident southern African Khoe-San populations. We highlight two recent studies hinting at the emergence of fine-scale population structure within some South-Eastern Bantu-speaker groups. We also accentuate whole genome sequencing studies (current and ancient) that have both enhanced our understanding of the peopling of southern Africa and demonstrated a huge potential for novel variant discovery in populations from this region. Finally, we identify some of the major gaps and inconsistencies in our understanding and emphasize the importance of more systematic studies of southern African populations from diverse ethnolinguistic groups and geographic locations.
DOI: http://dx.doi.org/10.1093/hmg/ddaa274
December 2020

Candidate Gene Analysis Reveals Strong Association of Variants With High Density Lipoprotein Cholesterol and Variants With Low Density Lipoprotein Cholesterol in Ghanaian Adults: An AWI-Gen Sub-Study.

Front Genet 2020 30;11:456661. Epub 2020 Oct 30.

Navrongo Health Research Centre, Navrongo, Ghana.

Variations in lipid levels are attributed partly to genetic factors. Genome-wide association studies (GWASs) mainly performed in European, African American and Asian cohorts have identified variants associated with LDL-C, HDL-C, total cholesterol (TC) and triglycerides (TG), but few studies have been performed in sub-Saharan Africans. This study evaluated the effect of single nucleotide variants (SNVs) in eight candidate loci (, , , , , , , and ) on lipid levels among 1855 Ghanaian adults. All lipid levels were measured directly using an automated analyser. DNA was extracted and genotyped using the H3Africa SNV array. Linear regression models were used to test the association between SNVs and log-transformed lipid levels, adjusting for sex, age and waist circumference. In addition, Bonferroni correction was performed to account for multiple testing. Several variants of , , , and (MAF > 0.05) were associated with HDL-C, LDL-C and TC levels at p < 0.05. The lead variants for association with HDL-C were rs17231520 in (β = 0.139, p < 0.0001) and rs1109166 in (β = -0.044, p = 0.028). Lower LDL-C levels were associated with an intronic variant in (rs11806638 [β = -0.055, p = 0.027]) and increased TC was associated with a variant in (rs854558 [β = 0.040, p = 0.020]). functional analyses indicated that these variants likely influence gene function through their effect on gene transcription. We replicated a strong association between variants and HDL-C and between variant and LDL-C in West Africans, with two potentially functional variants and identified three novel variants in linkage disequilibrium in which were associated with increasing TC levels in Ghanaians.
DOI: http://dx.doi.org/10.3389/fgene.2020.456661
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7661969
October 2020

High-depth African genomes inform human migration and health.

Nature 2020 Oct 28;586(7831):741-748. Epub 2020 Oct 28.

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA.

The African continent is regarded as the cradle of modern humans and African genomes contain more genetic variation than those from any other continent, yet only a fraction of the genetic diversity among African individuals has been surveyed. Here we performed whole-genome sequencing analyses of 426 individuals (comprising 50 ethnolinguistic groups, including previously unsampled populations) to explore the breadth of genomic diversity across Africa. We uncovered more than 3 million previously undescribed variants, most of which were found among individuals from newly sampled ethnolinguistic groups, as well as 62 previously unreported loci that are under strong selection, which were predominantly found in genes that are involved in viral immunity, DNA repair and metabolism. We observed complex patterns of ancestral admixture and putative-damaging and novel variation, both within and between populations, alongside evidence that Zambia was a likely intermediate site along the routes of expansion of Bantu-speaking populations. Pathogenic variants in genes that are currently characterized as medically relevant were uncommon, but in other genes, variants denoted as 'likely pathogenic' in the ClinVar database were commonly observed. Collectively, these findings refine our current understanding of continental migration, identify gene flow and the response to human disease as strong drivers of genome-level population variation, and underscore the scientific imperative for a broader characterization of the genomic diversity of African individuals to understand human ancestry and improve health.
DOI: http://dx.doi.org/10.1038/s41586-020-2859-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7759466
October 2020

G6PD variant distribution in sub-Saharan Africa and potential risks of using chloroquine/hydroxychloroquine based treatments for COVID-19.

medRxiv 2020 Jun 2. Epub 2020 Jun 2.

Chloroquine/hydroxychloroquine have been proposed as potential treatments for COVID-19. These drugs have warning labels for use in individuals with glucose-6-phosphate dehydrogenase (G6PD) deficiency. Analysis of whole-genome sequence data of 458 individuals from sub-Saharan Africa showed significant variation across the continent. We identified nine variants, of which four are potentially deleterious to G6PD function, and one (rs1050828) that is known to cause G6PD deficiency. We supplemented data for the rs1050828 variant with genotype array data from over 11,000 Africans. Although this variant is common in Africans overall, large allele frequency differences exist between sub-populations. African sub-populations in the same country can show significant differences in allele frequency (e.g. 16.0% in Tsonga vs 0.8% in Xhosa, both in South Africa, ρ=2.4×10 ). The high prevalence of variants in the G6PD gene found in this analysis suggests that it may be a significant interaction factor in clinical trials of chloroquine and hydroxychloroquine for treatment of COVID-19 in Africans.
DOI: http://dx.doi.org/10.1101/2020.05.27.20114066
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302299
June 2020

Novel and Known Gene-Smoking Interactions With cIMT Identified as Potential Drivers for Atherosclerosis Risk in West-African Populations of the AWI-Gen Study.

Front Genet 2019 7;10:1354. Epub 2020 Feb 7.

Faculty of Health Sciences, Sydney Brenner Institute for Molecular Bioscience (SBIMB), University of the Witwatersrand, Johannesburg, South Africa.

Introduction: Atherosclerosis is a key contributor to the burden of cardiovascular diseases (CVDs) and many epidemiological studies have reported on the effect of smoking on carotid intima-media thickness (cIMT) and its subsequent effect on CVD risk. Gene-environment interaction studies have contributed towards understanding some of the missing heritability of genome-wide association studies. Gene-smoking interactions on cIMT have been studied in non-African populations (European, Latino-American, and African American) but no comparable African research has been reported. Our aim was to investigate smoking-SNP interactions on cIMT in two West African populations by genome-wide analysis.

Materials And Methods: Only male participants from Burkina Faso (Nanoro = 993) and Ghana (Navrongo = 783) were included, as smoking was extremely rare among women. Phenotype and genotype data underwent stringent QC and genotype imputation was performed using the Sanger African Imputation Panel. Smoking prevalence among men was 13.3% in Nanoro and 42.5% in Navrongo. We analyzed gene-smoking interactions with PLINK after adjusting for covariates: age and 6 PCs (Model 1); age, BMI, blood pressure, fasting glucose, cholesterol levels, MVPA, and 6 PCs (Model 2). All analyses were performed at site level and for the combined data set.

Results: In Nanoro, we identified new gene-smoking interaction variants for cIMT within the previously described region (rs112017404, rs144170770, and rs4941649) (Model 1: p = 1.35E-07; Model 2: p = 3.08E-08). In the combined sample, two novel intergenic interacting variants were identified, rs1192824 in the regulatory region of (p = 5.90E-09) and rs77461169 (p = 4.48E-06) located in an upstream region of open chromatin. In silico functional analysis suggests the involvement of genes implicated in biological processes related to cell or biological adhesion and regulatory processes in gene-smoking interactions with cIMT (as evidenced by chromatin interactions and eQTLs).

Discussion: This is the first gene-smoking interaction study for cIMT, as a risk factor for atherosclerosis, in sub-Saharan African populations. In addition to replicating previously known signals for , we identified two novel genomic regions (, near ) involved in this gene-environment interaction.
DOI: http://dx.doi.org/10.3389/fgene.2019.01354
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7025492
February 2020

Targeted ultra-deep sequencing of a South African Bantu-speaking cohort to comprehensively map and characterize common and novel variants in 65 pharmacologically-related genes.

Pharmacogenet Genomics 2019 Sep;29(7):167-178

CSIR Biosciences Unit, Pretoria, South Africa.

Background: African populations are characterised by high genetic diversity, which provides opportunities for discovering and elucidating novel variants of clinical importance, especially those affecting therapeutic outcome. Significantly more knowledge is, however, needed before such populations can take full advantage of the advances in precision medicine. Coupled with the need to concisely map and better understand the pharmacological implications of genetic diversity in populations of sub-Saharan African ancestry, the aim of this study was to identify and characterize known and novel variants present within 65 important absorption, distribution, metabolism and excretion genes.

Patients And Methods: Targeted ultra-deep next-generation sequencing was used to screen a cohort of 40 South African individuals of Bantu ancestry.

Results: We identified a total of 1662 variants of which 129 are novel. Moreover, out of the 1662 variants 22 represent potential loss-of-function variants. A high level of allele frequency differentiation was observed for variants identified in this study when compared with other populations. Notably, on the basis of prior studies, many appear to be pharmacologically important in the pharmacokinetics of a broad range of drugs, including antiretrovirals, chemotherapeutic drugs, antiepileptics, antidepressants, and anticoagulants. An in-depth analysis was undertaken to interrogate the pharmacogenetic implications of this genetic diversity.

Conclusion: Despite the new insights gained from this study, the work illustrates that a more comprehensive understanding of population-specific differences is needed to facilitate the development of pharmacogenetic-based interventions for optimal drug therapy in patients of African ancestry.
DOI: http://dx.doi.org/10.1097/FPC.0000000000000380
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6675649
September 2019

Genome-Wide SNP Discovery in Indigenous Cattle Breeds of South Africa.

Front Genet 2019 29;10:273. Epub 2019 Mar 29.

Division of Animal Sciences, University of Missouri, Columbia, MO, United States.

Single nucleotide polymorphism arrays have created new possibilities for performing genome-wide studies to detect genomic regions harboring sequence variants that affect complex traits. However, the majority of validated SNPs for which allele frequencies have been estimated are limited primarily to European breeds. The objective of this study was to perform SNP discovery in three South African indigenous breeds (Afrikaner, Drakensberger, and Nguni) using whole genome sequencing. DNA was extracted from blood and hair samples, quantified and prepared at 50 ng/μl concentration for sequencing at the Agricultural Research Council Biotechnology Platform using an Illumina HiSeq 2500. The fastq files were used to call the variants using the Genome Analysis Tool Kit. A total of 1,678,360 variants were identified as novel using Run 6 of the 1000 Bull Genomes Project. Annotation of the identified variants classified them into functional categories. Within the coding regions, about 30% of the SNPs were non-synonymous substitutions that encode for alternate amino acids. The study of the distribution of SNPs across the genome identified regions showing notable differences in the densities of SNPs among the breeds and highlighted many regions of functional significance. Gene ontology terms identified genes such as , , and that have been associated with coat color in mouse, and , and genes have been associated with fertility in cattle. Further analysis of the variants detected 688 candidate selective sweeps (ZH Z-scores ≤ -4) across all three breeds, of which 223 regions were assigned as being putative selective sweeps (ZH scores ≤ -5). We also identified 96 regions with extremely low ZH Z-scores (≤ -6) in Afrikaner and Nguni. Genes such as and that have been associated with skin pigmentation in cattle and , which has been associated with bipolar disorder in humans, were identified in these regions.
This study provides the first analysis of sequence data to discover SNPs in indigenous South African cattle breeds. The information will play an important role in our efforts to understand the genetic history of our cattle and in designing appropriate breed improvement programmes.
DOI: http://dx.doi.org/10.3389/fgene.2019.00273
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6452414
March 2019

Genomic and environmental risk factors for cardiometabolic diseases in Africa: methods used for Phase 1 of the AWI-Gen population cross-sectional study.

Glob Health Action 2018;11(sup2):1507133

Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa.

There is an alarming tide of cardiovascular and metabolic disease (CMD) sweeping across Africa. This may be a result of an increasingly urbanized lifestyle characterized by the growing consumption of processed and calorie-dense food, combined with physical inactivity and more sedentary behaviour. While the link between lifestyle and public health has been extensively studied in Caucasian and African American populations, few studies have been conducted in Africa. This paper describes the detailed methods for Phase 1 of the AWI-Gen study that were used to capture phenotype data and assess the associated risk factors and end points for CMD in persons over the age of 40 years in sub-Saharan Africa (SSA). We developed a population-based cross-sectional study of disease burden and phenotype in Africans, across six centres in SSA. These centres are in West Africa (Nanoro, Burkina Faso, and Navrongo, Ghana), in East Africa (Nairobi, Kenya) and in South Africa (Agincourt, Dikgale and Soweto). A total of 10,702 individuals between the ages of 40 and 60 years were recruited into the study across the six centres, plus an additional 1021 participants over the age of 60 years from the Agincourt centre. We collected socio-demographic, anthropometric, medical history, diet, physical activity, fat distribution and alcohol/tobacco consumption data from participants. Blood samples were collected for disease-related biomarker assays, and genomic DNA extraction for genome-wide association studies. Urine samples were collected to assess kidney function. The study provides base-line data for the development of a series of cohorts with a second wave of data collection in Phase 2 of the study. These data will provide valuable insights into the genetic and environmental influences on CMD on the African continent.
DOI: http://dx.doi.org/10.1080/16549716.2018.1507133
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6161608
June 2019

Genetic variants in SEC16B are associated with body composition in black South Africans.

Nutr Diabetes 2018 Jul 19;8(1):43. Epub 2018 Jul 19.

Division of Human Genetics, School of Pathology, Faculty of Health Sciences, National Health Laboratory Service & University of the Witwatersrand, Johannesburg, South Africa.

Objective: The latest genome-wide association studies of obesity-related traits have identified several genetic loci contributing to body composition (BC). These findings have not been robustly replicated in African populations; therefore, this study aimed to assess whether European BC-associated gene loci played a similar role in a South African black population.

Methods: A replication and fine-mapping study was performed in participants from the Birth to Twenty cohort (N = 1,926) using the Metabochip. Measurements included body mass index (BMI), waist and hip circumference, waist-to-hip ratio (WHR), total fat mass, total lean mass and percentage fat mass (PFM).

Results: SNPs in several gene loci, including SEC16B (P < 9.48 × 10), NEGR1 (P < 1.64 × 10), FTO (P < 2.91 × 10), TMEM18 (P < 2.27 × 10), and WARS2 (P < 3.25 × 10), were similarly associated (albeit not at array-wide significance (P ≤ 6.7 × 10)) with various phenotypes linked to BC in this African cohort, including fat mass, PFM and WHR; however, the associations were driven by different sentinel SNPs. More importantly, DXA-derived BC measures revealed stronger genetic associations than simple anthropometric measures. Association signals generated in this study were shared by European and African populations, as well as unique to this African cohort. Moreover, sophisticated estimates like DXA measures enabled an enhanced characterisation of genetic associations for BC traits.

Conclusion: Results from this study suggest that in-depth genomic studies in larger African cohorts may reveal novel SNPs for body composition and adiposity, which will provide greater insight into the aetiology of obesity.
DOI: http://dx.doi.org/10.1038/s41387-018-0050-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6053407
July 2018

African genetic diversity provides novel insights into evolutionary history and local adaptations.

Hum Mol Genet 2018 Aug;27(R2):R209-R218

Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa.

Genetic variation and susceptibility to disease are shaped by human demographic history and adaptation. We can now study the genomes of extant Africans and uncover traces of population migration, admixture, assimilation and selection by applying sophisticated computational algorithms. There are four major ethnolinguistic divisions among present day Africans: Hunter-gatherer populations in southern and central Africa; Nilo-Saharan speakers from north and northeast Africa; Afro-Asiatic speakers from north and east Africa; and Niger-Congo speakers who are the predominant ethnolinguistic group spread across most of sub-Saharan Africa. The enormous ethnolinguistic diversity in sub-Saharan African populations is largely paralleled by extensive genetic diversity and until a decade ago, little was known about detailed origins and divergence of these groups. Results from large-scale population genetic studies, and more recently whole genome sequence data, are unravelling the critical role of events like migration and admixture and environmental factors including diet, infectious diseases and climatic conditions in shaping current population diversity. It is now possible to start providing quantitative estimates of divergence times, population size and dynamic processes that have affected populations and their genetic risk for disease. Finally, the availability of ancient genomes from Africa provides historical insights of unprecedented depth. In this review, we highlight some key interpretations that have emerged from recent African genome studies.
DOI: http://dx.doi.org/10.1093/hmg/ddy161
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6061870
August 2018

Insights into the genetics of blood pressure in black South African individuals: the Birth to Twenty cohort.

BMC Med Genomics 2018 Jan 17;11(1). Epub 2018 Jan 17.

School of Molecular & Cell Biology, Faculty of Science, University of the Witwatersrand, Johannesburg, South Africa.

Background: Cardiovascular diseases (CVDs) are the leading cause of non-communicable disease deaths globally, with hypertension being a major risk factor contributing to CVDs. Blood pressure is a heritable trait, with relatively few genetic studies having been performed in Africans. This study aimed to identify genetic variants associated with variance in systolic (SBP) and diastolic (DBP) blood pressure in black South Africans.

Methods: Genotyping was performed using the Metabochip in a subset of participants (mixed sex; median age 17.9) and their adult female caregivers (median age 41.0) from the Birth to Twenty cohort (n = 1947). Data were analysed as a merged dataset (all participants and caregivers together) in GEMMA (v0.94.1) using univariate linear mixed models, incorporating a centered relatedness matrix to account for the relatedness between individuals and with adjustments for age, sex, BMI and principal components of the genotype information.

Results: Association analysis identified regions of interest in the NOS1AP (DBP: rs112468105 - p = 7.18 × 10 and SBP: rs4657181 - p = 4.04 × 10), MYRF (SBP: rs11230796 - p = 2.16 × 10, rs400075 - p = 2.88 × 10) and POC1B (SBP: rs770373 - p = 7.05 × 10, rs770374 - p = 9.05 × 10) genes and some intergenic regions (DACH1|LOC440145 (DBP: rs17240498 - p = 4.91 × 10 and SBP: rs17240498 - p = 2.10 × 10) and INTS10|LPL (SBP: rs55830938 - p = 1.30 × 10, rs73599609 - p = 5.78 × 10, rs73667448 - p = 6.86 × 10)).

Conclusions: The study provided further insight into the contribution of genetic variants to blood pressure in black South Africans. Future functional and replication studies in larger samples are required to confirm the role of the identified loci in blood pressure regulation and whether or not these variants are African-specific.
DOI: http://dx.doi.org/10.1186/s12920-018-0321-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5773038
January 2018

Whole-genome sequencing for an enhanced understanding of genetic variation among South Africans.

Nat Commun 2017 Dec 12;8(1):2062. Epub 2017 Dec 12.

Institute for Cellular and Molecular Medicine, Department of Immunology, Faculty of Health Sciences, University of Pretoria, Pretoria, 0084, South Africa.

The Southern African Human Genome Programme is a national initiative that aspires to unlock the unique genetic character of southern African populations for a better understanding of human genetic diversity. In this pilot study the Southern African Human Genome Programme characterizes the genomes of 24 individuals (8 Coloured and 16 black southeastern Bantu-speakers) using deep whole-genome sequencing. A total of ~16 million unique variants are identified. Despite the shallow time depth since divergence between the two main southeastern Bantu-speaking groups (Nguni and Sotho-Tswana), principal component analysis and structure analysis reveal significant (p < 10) differentiation, and F analysis identifies regions with high divergence. The Coloured individuals show evidence of varying proportions of admixture with Khoesan, Bantu-speakers, Europeans, and populations from the Indian sub-continent. Whole-genome sequencing data reveal extensive genomic diversity, increasing our understanding of the complex and region-specific history of African populations and highlighting its potential impact on biomedical research and genetic susceptibility to disease.
DOI: http://dx.doi.org/10.1038/s41467-017-00663-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5727231
December 2017

Assessing computational genomics skills: Our experience in the H3ABioNet African bioinformatics network.

PLoS Comput Biol 2017 Jun 1;13(6):e1005419. Epub 2017 Jun 1.

Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, South Africa.

The H3ABioNet pan-African bioinformatics network, which is funded to support the Human Heredity and Health in Africa (H3Africa) program, has developed node-assessment exercises to gauge the ability of its participating research and service groups to analyze typical genome-wide datasets being generated by H3Africa research groups. We describe a framework for the assessment of computational genomics analysis skills, which includes standard operating procedures, training and test datasets, and a process for administering the exercise. We present the experiences of 3 research groups that have taken the exercise and the impact on their ability to manage complex projects. Finally, we discuss the reasons why many H3ABioNet nodes have declined so far to participate and potential strategies to encourage them to do so.
DOI: http://dx.doi.org/10.1371/journal.pcbi.1005419
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5453403
June 2017

Organic Cation Transporter 2 (OCT2/SLC22A2) Gene Variation in the South African Bantu-Speaking Population and Functional Promoter Variants.

OMICS 2017 Mar 16;21(3):169-176. Epub 2017 Feb 16.

The School of Molecular and Cell Biology, University of the Witwatersrand, Johannesburg, South Africa.

SLC22A2 facilitates the transport of endogenous and exogenous cationic compounds. Many pharmacologically significant compounds are transported by SLC22A2, including the antidiabetic drug metformin, anticancer agent cisplatin, and antiretroviral lamivudine. Genetic polymorphisms in SLC22A2 can modify the pharmacokinetic profiles of such important medicines and could therefore prove useful as precision medicine biomarkers. Since the frequency of SLC22A2 polymorphisms varies among different ethnic populations, we evaluated these in South African Bantu speakers, a majority group in the South African population, who exhibit unique genetic diversity, and we subsequently functionally characterized promoter polymorphisms. We identified 11 polymorphisms within the promoter and 9 single-nucleotide polymorphisms (SNPs) within the coding region of SLC22A2. While some polymorphisms appeared with minor allele frequencies similar to other African and non-African populations, some differed considerably; this was especially notable for three missense polymorphisms. In addition, we functionally characterized two promoter polymorphisms; rs138765638, a three base-pair deletion that bioinformatics analysis suggested could alter c-Ets-1/2, Elk1, and/or STAT4 binding, and rs59695691, an SNP that could abolish TFII-I binding. Significantly higher luciferase reporter gene expression was found for rs138765638 (increase of 37%; p = 0.001) and significantly lower expression for rs59695691 (decrease of 25%; p = 0.038), in comparison to the wild-type control. These observations highlight the importance of identifying and functionally characterizing genetic variation in genes of pharmacological significance. Finally, our data for SLC22A2 attest to the importance of considering genetic variation in different populations for drug safety, response, and global pharmacogenomics, through, for example, projects such as the Human Heredity and Health in Africa initiative.

Source
DOI: http://dx.doi.org/10.1089/omi.2016.0165
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5972774
March 2017

Regulation of Cell Cycle Associated Genes by microRNA and Transcription Factor.

Microrna 2016;5(3):180-200

BioMedical Genomics Centre, PG Polyclinic, 5 Suburban Hospital Road, Kolkata 700 020, India.

The cell cycle is a complex process regulated at the transcriptional, post-transcriptional and post-translational levels. A large number of genes are implicated in the process. Abnormality at any stage of the cell cycle may lead to diseases, including cancer. To gain a global view of the genes associated with the cell cycle and their regulation by transcription factors and microRNAs, we collected genes related to the cell cycle from different databases. Experimentally validated targets of microRNAs were collected from miRTarBase. Transcription factors that bind to upstream sequences of cell cycle associated genes and microRNA genes were collected from published papers. We collected 3028 genes associated with the cell cycle. Their protein products belong to different protein classes, such as nucleic acid binding (594 proteins), transcription factors (305 proteins), cytoskeletal (232 proteins), kinases (174 proteins), phosphatases (111 proteins) and chaperones (84 proteins). Among the 3028 cell cycle associated genes, 2125 are validated targets of 424 microRNAs; CDKN1A is a target of 46 miRNAs, and miR-335 targets 301 genes. About 100 transcription factors had binding sites at potential promoter regions of 2722 genes and 329 microRNAs that target cell cycle associated genes. We present the largest set of cell cycle associated genes. Many transcription factors regulate both cell cycle associated genes and the miRNAs that target them. These resources will be used to identify the co-regulation of cell cycle associated genes by transcription factors and miRNAs and to test specific hypotheses about cell cycle regulation and its alteration in different diseases.

Source
DOI: http://dx.doi.org/10.2174/2211536605666161117112251
July 2017

Population Stratification and Underrepresentation of Indian Subcontinent Genetic Diversity in the 1000 Genomes Project Dataset.

Genome Biol Evol 2016 Dec 31;8(11):3460-3470. Epub 2016 Dec 31.

Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa

Genomic variation in Indian populations is of great interest due to the diversity of ancestral components, social stratification, endogamy and complex admixture patterns. With an expanding population of 1.2 billion, India is also a treasure trove to catalogue innocuous as well as clinically relevant rare mutations. Recent studies have revealed four dominant ancestries in populations from mainland India: Ancestral North-Indian (ANI), Ancestral South-Indian (ASI), Ancestral Tibeto-Burman (ATB) and Ancestral Austro-Asiatic (AAA). The 1000 Genomes Project (KGP) Phase-3 data include about 500 genomes from five linguistically defined Indian-Subcontinent (IS) populations (Punjabi, Gujrati, Bengali, Telugu and Tamil) some of whom are recent migrants to USA or UK. Comparative analyses show that despite the distinct geographic origins of the KGP-IS populations, the ANI component is predominantly represented in this dataset. Previous studies demonstrated population substructure in the HapMap Gujrati population, and we found evidence for additional substructure in the Punjabi and Telugu populations. These substructured populations have characteristic/significant differences in heterozygosity and inbreeding coefficients. Moreover, we demonstrate that the substructure is better explained by factors like differences in proportion of ancestral components, and endogamy driven social structure rather than invoking a novel ancestral component to explain it. Therefore, using language and/or geography as a proxy for an ethnic unit is inadequate for many of the IS populations. This highlights the necessity for more nuanced sampling strategies or corrective statistical approaches, particularly for biomedical and population genetics research in India.

Source
DOI: http://dx.doi.org/10.1093/gbe/evw244
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5203783
December 2016

The African Genome Variation Project shapes medical genetics in Africa.

Nature 2015 Jan 3;517(7534):327-32. Epub 2014 Dec 3.

[1] Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge CB10 1SA, UK [2] Department of Public Health and Primary Care, University of Cambridge, 2 Wort's Causeway, Cambridge, CB1 8RN, UK.

Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.

Source
DOI: http://dx.doi.org/10.1038/nature13997
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4297536
January 2015

Immunochip identifies novel, and replicates known, genetic risk loci for rheumatoid arthritis in black South Africans.

Mol Med 2014 Aug 14;20:341-9. Epub 2014 Aug 14.

Division of Rheumatology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa; Sydney Brenner Institute for Molecular Bioscience, University of the Witwatersrand, Johannesburg, South Africa.

The aim of this study was to identify genetic variants associated with rheumatoid arthritis (RA) risk in black South Africans. Black South African RA patients (n = 263) were compared with healthy controls (n = 374). Genotyping was performed using the Immunochip, and four-digit high-resolution human leukocyte antigen (HLA) typing was performed by DNA sequencing of exon 2. Standard quality control measures were implemented on the data. The strongest associations were in the intergenic region between the HLA-DRB1 and HLA-DQA1 loci. After conditioning on HLA-DRB1 alleles, the effect in the rest of the extended major histocompatibility complex (MHC) diminished. Non-HLA single nucleotide polymorphisms (SNPs) in the intergenic regions LOC389203|RBPJ, LOC100131131|IL1R1, KIAA1919|REV3L, LOC643749|TRAF3IP2, and SNPs in the intron and untranslated regions (UTR) of IRF1 and the intronic region of ICOS and KIAA1542 showed association with RA (p < 5 × 10⁻⁵). Of the SNPs previously associated with RA in Caucasians, one SNP, rs874040, located in the intergenic region LOC389203|RBPJ, was replicated in this study. None of the variants in the PTPN22 gene was significantly associated. The seropositive subgroups showed similar results to the overall cohort. The effects observed across the HLA region are most likely due to HLA-DRB1, and secondary effects in the extended MHC cannot be detected. Seven non-HLA loci are associated with RA in black South Africans. As in Caucasians, the intergenic region between LOC389203 and RBPJ is associated with RA in this population. The strong association of the R620W variant of the PTPN22 gene with RA in Caucasians was not replicated, since this variant was monomorphic in our study, and other SNP variants of the PTPN22 gene were also not associated with RA in black South Africans, suggesting that this locus does not play a major role in RA in this population.

Source
DOI: http://dx.doi.org/10.2119/molmed.2014.00097
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4153842
August 2014

Population-specific common SNPs reflect demographic histories and highlight regions of genomic plasticity with functional relevance.

BMC Genomics 2014 Jun 6;15:437. Epub 2014 Jun 6.

Sydney Brenner Institute of Molecular Bioscience, University of the Witwatersrand, Johannesburg, South Africa.

Background: Population differentiation is the result of demographic and evolutionary forces. Whole genome datasets from the 1000 Genomes Project (October 2012) provide an unbiased view of genetic variation across populations from Europe, Asia, Africa and the Americas. Common population-specific SNPs (MAF > 0.05) reflect a deep history and may have important consequences for health and wellbeing. Their interpretation is contextualised by currently available genome data.

Results: The identification of common population-specific (CPS) variants (SNPs and SSV) is influenced by admixture and the sample size under investigation. Nine of the populations in the 1000 Genomes Project (2 African, 2 Asian (including a merged Chinese group) and 5 European) revealed that the African populations (LWK and YRI), followed by the Japanese (JPT) have the highest number of CPS SNPs, in concordance with their histories and given the populations studied. Using two methods, sliding 50-SNP and 5-kb windows, the CPS SNPs showed distinct clustering across large genome segments and little overlap of clusters between populations. iHS enrichment score and the population branch statistic (PBS) analyses suggest that selective sweeps are unlikely to account for the clustering and population specificity. Of interest is the association of clusters close to recombination hotspots. Functional analysis of genes associated with the CPS SNPs revealed over-representation of genes in pathways associated with neuronal development, including axonal guidance signalling and CREB signalling in neurones.

Conclusions: Common population-specific SNPs are non-randomly distributed throughout the genome and are significantly associated with recombination hotspots. Since the variant alleles of most CPS SNPs are the derived allele, they likely arose in the specific population after a split from a common ancestor. Their proximity to genes involved in specific pathways, including neuronal development, suggests evolutionary plasticity of selected genomic regions. Contrary to expectation, selective sweeps did not play a large role in the persistence of population-specific variation. This suggests a stochastic process towards population-specific variation which reflects demographic histories and may have some interesting implications for health and susceptibility to disease.

Source
DOI: http://dx.doi.org/10.1186/1471-2164-15-437
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4092225
June 2014

Genome wide gene expression regulation by HIP1 Protein Interactor, HIPPI: prediction and validation.

BMC Genomics 2011 Sep 26;12:463. Epub 2011 Sep 26.

Crystallography and Molecular Biology Division, Saha Institute of Nuclear Physics, 1/AF Bidhan Nagar, Kolkata 700 064, India.

Background: HIP1 Protein Interactor (HIPPI) is a pro-apoptotic protein that induces Caspase8-mediated apoptosis in cells. We have shown earlier that HIPPI could interact with a specific 9 bp sequence motif, defined as the HIPPI binding site (HBS), present in the upstream promoter of the Caspase1 gene and regulate its expression. We have also shown that HIPPI, without any known nuclear localization signal (NLS), could be transported to the nucleus by HIP1, an NLS-containing nucleo-cytoplasmic shuttling protein. Our present work therefore aims to investigate the role of HIPPI as a global transcription regulator.

Results: We carried out a genome-wide search for the presence of HBS in the upstream sequences of genes. Our results suggest that HBS was predominantly located within 2 kb upstream of the transcription start site. Binding sites for transcription factors such as CREBP1, TBP, OCT1 and EVI1, and the P53 half site, were significantly enriched within 100 bp of HBS, indicating that these factors might co-operate with HIPPI in transcription regulation. To illustrate the effect of HIPPI on the transcriptome, we performed gene expression profiling by microarray. Exogenous expression of HIPPI in HeLa cells resulted in up-regulation of 580 genes (p < 0.05), while 457 genes were down-regulated. Several transcription factors, including CBP, REST and C/EBP beta, were altered by HIPPI in this study. HIPPI also interacted with P53 at the protein level. This interaction occurred exclusively in the nuclear compartment and was absent in cells where HIP1 was knocked down. The HIPPI-P53 interaction was necessary for HIPPI-mediated up-regulation of the Caspase1 gene. Finally, we analyzed published microarray data obtained from post mortem brains of Huntington's disease (HD) patients to investigate the possible involvement of HIPPI in HD pathogenesis. We observed that, along with binding sites for transcription factors such as CREB, P300, SREBP1 and Sp1, which are already known to be involved in HD, the HIPPI binding site was also significantly over-represented in the upstream sequences of genes altered in HD.

Conclusions: Taken together, the results suggest that HIPPI could act as an important transcription regulator in cells, regulating a vast array of genes, particularly transcription factors, and could at least in part play a role in the transcription deregulation observed in HD.

Source
DOI: http://dx.doi.org/10.1186/1471-2164-12-463
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3228557
September 2011

Africa: the next frontier for human disease gene discovery?

Hum Mol Genet 2011 Oct 9;20(R2):R214-20. Epub 2011 Sep 9.

Division of Human Genetics, School of Pathology, Faculty of Health Sciences, University of the Witwatersrand and National Health Laboratory Service, Johannesburg, South Africa.

The populations of Africa harbour the greatest human genetic diversity, following an evolutionary history that traces its beginnings on the continent to a time before the emergence of Homo sapiens. Signatures of selection are detectable as responses to ancient environments and cultural practices, modulated by more recent events including infectious epidemics, migrations, admixture and, of course, chance. The age of high-throughput biology is not passing Africa by. African-based cohort studies and networks with an African footprint are ideal springboards for disease-related genetic and genomic studies. Initiatives like HapMap, the 1000 Genomes Project, MalariaGEN, the INDEPTH network and Human Heredity and Health in Africa are catalysts for exploring African genetic diversity and its role in the spectrum from health to disease. The challenges are abundant in dissecting biological questions in the light of linguistic, cultural, geographic and political boundaries and their respective roles in shaping health-related profiles. Will studies based on African populations lead to a new wave of discovery of genetic contributors to disease?

Source
DOI: http://dx.doi.org/10.1093/hmg/ddr401
October 2011

Arabidopsis thaliana regulatory element analyzer.

Bioinformatics 2008 Oct 11;24(19):2263-4. Epub 2008 Aug 11.

Department of Biophysics, Molecular Biology and Genetics, University of Calcutta, 92 APC Road, Kolkata 700009, India.

In the Arabidopsis thaliana regulatory element analyzer (AtREA) server, we have integrated sequence data, genome-wide expression data and functional annotation data in three application modules. These modules can be used to identify the major regulatory targets of a user-provided cis-regulatory element (CRE), to study different features of CRE distribution, and to evaluate the role of a set of CREs in the regulation of gene expression, independently as well as in combination with other user-provided CREs.

Availability: AtREA is freely available at http://www.bioinformatics.org/grn/atrea.html.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btn417
October 2008

TRABAS: a database for transcription regulation by ABA signaling.

In Silico Biol 2008;8(5-6):511-6

Department of Biophysics, Molecular Biology & Bioinformatics, University of Calcutta, 92, APC Road, Kolkata 700009, India.

The effects of abscisic acid (ABA) induction on Arabidopsis thaliana transcriptome have been investigated by various expression studies. We have assembled and analyzed data from available expression studies related to ABA signaling in Arabidopsis along with other available microarray data, functional annotations and information related to occurrence of cis-regulatory elements in promoters of Arabidopsis genes in a database called TRABAS. TRABAS is expected to provide a simple, user-friendly platform to facilitate the study of different aspects of ABA mediated transcription regulation and is freely available at http://www.bioinformatics.org/trabas/.

Source
May 2009