Publications by authors named "Xiaoshuai Zhang"

27 Publications

The trans-ancestral genomic architecture of glycemic traits.

Nat Genet 2021 Jun;53(6):840-860. Epub 2021 May 31.

Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands.

Glycemic traits are used to diagnose and monitor type 2 diabetes and cardiometabolic health. To date, most genetic studies of glycemic traits have focused on individuals of European ancestry. Here we aggregated genome-wide association studies comprising up to 281,416 individuals without diabetes (30% non-European ancestry) for whom fasting glucose, 2-h glucose after an oral glucose challenge, glycated hemoglobin and fasting insulin data were available. Trans-ancestry and single-ancestry meta-analyses identified 242 loci (99 novel; P < 5 × 10⁻⁸), 80% of which had no significant evidence of between-ancestry heterogeneity. Analyses restricted to individuals of European ancestry with equivalent sample size would have led to 24 fewer new loci. Compared with single-ancestry analyses, equivalent-sized trans-ancestry fine-mapping reduced the number of estimated variants in 99% credible sets by a median of 37.5%. Genomic-feature, gene-expression and gene-set analyses revealed distinct biological signatures for each trait, highlighting different underlying biological pathways. Our results increase our understanding of diabetes pathophysiology by using trans-ancestry studies for improved power and resolution.

Source
DOI: http://dx.doi.org/10.1038/s41588-021-00852-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7610958
June 2021

Pyridazine-bridged expanded rosarin and semi-rosarinogen.

Chem Commun (Camb) 2021 Feb;57(12):1486-1489

Key Laboratory of Catalysis and Energy Materials Chemistry of Ministry of Education & Hubei Key Laboratory of Catalysis and Materials Science, South-Central University for Nationality, Wuhan, Hubei 430074, China.

The synthesis of the pyridazine-bridged expanded rosarin 1 and a reduced precursor, semi-rosarinogen 2, is reported. A single crystal X-ray diffraction analysis of 1 and theoretical calculations show that both 1 and 2 have distorted structures. Expanded rosarin 1 and its precursor 2 can differentiate various thiols in organic solvents by means of species-specific colour changes and reaction times.

Source
DOI: http://dx.doi.org/10.1039/d0cc07433k
February 2021

Plastic credit: A consortium blockchain-based plastic recyclability system.

Waste Manag 2021 Feb 18;121:42-51. Epub 2020 Dec 18.

Institute of Finance and Technology, University College of London, London WC1E 6BT, United Kingdom.

By the end of 2015, approximately 6300 million tons (Mt) of plastic waste had been generated globally, but less than 10% of plastics was recycled. Since different types of plastics have various degrees of recyclability, consumer information about plastic product recyclability is paramount in order to increase the levels of plastic recycled. Against this context, the objective of this work is to define a plastic credit system to increase the amount of recyclable plastics. The plastic credit system assigns credit information to each plastic product and its corresponding company based on the percentage recyclability value of the plastic type and its composition. The methodology proposed is based on a unified and transparent credit system established by a double-chain system, which comprises a public blockchain CreditChain and a consortium blockchain M-InfoChain. The results show through the overall system performance analysis that the designed plastic credit system is capable of promoting a demand shift towards plastic products with higher plastic recyclability and achieving a lightweight operation for resource requirements and system maintenance.

Source
DOI: http://dx.doi.org/10.1016/j.wasman.2020.11.045
February 2021

Studying the Role of a Single Mutation of a Family 11 Glycoside Hydrolase Using High-Resolution X-ray Crystallography.

Protein J 2020 Dec 31;39(6):671-680. Epub 2020 Oct 31.

College of Science, Nanjing Agricultural University, Nanjing, 210095, People's Republic of China.

XynII is a family 11 glycoside hydrolase that uses the retaining mechanism for catalysis. In the active site, E177 works as the acid/base and E86 works as the nucleophile. Mutating an uncharged residue (N44) to an acidic residue (D) near E177 decreases the enzyme's optimal pH by ~1.0 unit. D44 was previously suggested to be a second proton carrier for catalysis. To test this hypothesis, we abolished the activity of E177 by mutating it to Q, and mutated N44 to D or E. These double mutants have dramatically decreased activities. Our high-resolution crystallographic structures and the microscopic pKa calculations show that D44 has a similar position and pKa value during catalysis, indicating that D44 changes the electrostatics around E177, which makes it prone to rotate as the acid/base in acidic conditions, thus decreasing the pH optimum. Our results could be helpful for designing enzymes with different pH optima.

Source
DOI: http://dx.doi.org/10.1007/s10930-020-09938-5
December 2020

A Challenge-Response Assisted Authorisation Scheme for Data Access in Permissioned Blockchains.

Sensors (Basel) 2020 Aug 19;20(17). Epub 2020 Aug 19.

School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, UK.

Permissioned blockchains can be applied for sharing data among permitted users to authorise the data access requests in a permissioned blockchain. A consensus network constructed using pre-selected nodes should verify a data requester's credentials to determine if he or she has the correct permissions to access the queried data. However, current studies do not consider how to protect users' privacy for data authorisation if the pre-selected nodes become untrusted, e.g., the pre-selected nodes are manipulated by attackers. When a user's credentials are exposed to pre-selected nodes in the consensus network during authorisation, the untrusted (or even malicious) pre-selected nodes may collect a user's credentials and other private information without the user's knowledge. Therefore, the private data exposed to the consensus network should be tightly restricted. In this paper, we propose a challenge-response based authorisation scheme for permissioned blockchain networks named Challenge-Response Assisted Access Authorisation (CRA) to protect users' credentials during authorisation. In CRA, the pre-selected nodes in the consensus network do not require users' credentials to authorise data access requests, which prevents privacy leakage when these nodes are compromised or manipulated by attackers. Furthermore, the computational burden on the consensus network for authorisation is reduced because the major computing work of the authorisation is executed by the data requester and provider in CRA.
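
As a rough illustration of the general challenge-response idea the abstract refers to, the sketch below lets a requester prove knowledge of a secret by answering a random nonce with an HMAC, so the secret itself never travels over the network. This is a generic textbook pattern and not the CRA protocol, whose details the abstract does not give; the function names and the shared-key setup are assumptions made only for this example.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Verifier side: generate a fresh random nonce for each request."""
    return secrets.token_bytes(32)


def respond(shared_key: bytes, challenge: bytes) -> bytes:
    """Requester side: answer the challenge without revealing the key."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def verify(shared_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Verifier side: recompute the expected answer, compare in constant time."""
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    # Hypothetical secret known to the data requester and the data provider.
    key = secrets.token_bytes(32)
    nonce = make_challenge()
    print(verify(key, nonce, respond(key, nonce)))
```

Because each nonce is random and used once, a recorded response cannot simply be replayed against a later challenge.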

Source
DOI: http://dx.doi.org/10.3390/s20174681
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506573
August 2020

Bridging the Gap Between Computational Photography and Visual Recognition.

IEEE Trans Pattern Anal Mach Intell 2020 May 21;PP. Epub 2020 May 21.

What is the current state-of-the-art for image restoration and enhancement applied to degraded images acquired under less than ideal circumstances? Can the application of such algorithms as a pre-processing step improve image interpretability for manual analysis or automatic visual recognition to classify scene content? While there have been important advances in the area of computational photography to restore or enhance the visual quality of an image, the capabilities of such techniques have not always translated in a useful way to visual recognition tasks. To address this, we introduce the UG² dataset as a large-scale benchmark composed of video imagery captured under challenging conditions, and two enhancement tasks designed to test algorithmic impact on visual quality and automatic object recognition. Furthermore, we propose a set of metrics to evaluate the joint improvement of such tasks as well as individual algorithmic advances, including a novel psychophysics-based evaluation regime for human assessment and a realistic set of quantitative measures for object recognition performance. We introduce six new algorithms for image restoration or enhancement, which were created as part of the IARPA-sponsored UG² Challenge workshop held at CVPR 2018.

Source
DOI: http://dx.doi.org/10.1109/TPAMI.2020.2996538
May 2020

The Effect of Low Carbohydrate Diet on Polycystic Ovary Syndrome: A Meta-Analysis of Randomized Controlled Trials.

Int J Endocrinol 2019;2019:4386401. Epub 2019 Nov 26.

Department of Gynecology, Chengdu Xinan Gynecological Hospital, Chengdu, China.

Objective: To assess the effect of a low carbohydrate diet (LCD) on women with polycystic ovary syndrome (PCOS).

Methods: Data from randomized controlled trials (RCTs) were obtained to perform a meta-analysis of the effects of LCD in PCOS patients. The primary outcomes included the changes in BMI, homeostatic model assessment for insulin resistance (HOMA-IR), and blood lipids, including total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C), and high-density lipoprotein cholesterol (HDL-C), follicle-stimulating hormone (FSH), luteinizing hormone (LH), total testosterone (T), and sex hormone-binding globulin (SHBG).

Results: Eight RCTs involving 327 patients were included. In comparison with the control group, the LCD decreased BMI (SMD = -1.04, 95% CI (-1.38, -0.70), P < 0.00001), HOMA-IR (SMD = -0.66, 95% CI (-1.01, -0.30), P < 0.05), TC (SMD = -0.68, 95% CI (-1.35, -0.02), P < 0.05), and LDL-C (SMD = -0.66, 95% CI (-1.30, -0.02), P < 0.05). Stratified analyses indicated that LCD lasting longer than 4 weeks had a stronger effect on increasing FSH levels (MD = 0.39, 95% CI (0.08, 0.71), P < 0.05), increasing SHBG levels (MD = 5.98, 95% CI (3.51, 8.46), P < 0.05), and decreasing T levels (SMD = -1.79, 95% CI (-3.22, -0.36), P < 0.05), and the low-fat and low-CHO LCD (fat <35% and CHO <45%) had a more significant effect on the levels of FSH (MD = 0.40, 95% CI (0.09, 0.71), P < 0.05) and SHBG (MD = 6.20, 95% CI (3.68, 8.72), P < 0.05) than the high-fat and low-CHO LCD (fat >35% and CHO <45%).

Conclusion: Based on the current evidence, LCD, particularly long-term LCD and low-fat/low-CHO LCD, may be recommended for the reduction of BMI, treatment of PCOS with insulin resistance, prevention of high LDL-C, increasing the levels of FSH and SHBG, and decreasing the level of T. Together, the analyzed data indicate that proper control of carbohydrate intake provides beneficial effects on some aspects of PCOS and may represent one of the important interventions improving the clinical symptoms of affected patients.
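
For readers who want to see how a pooled effect, its 95% confidence interval and its significance level relate, here is a minimal sketch that back-calculates a standard error, z statistic and two-sided p-value from a reported estimate and interval. It assumes a symmetric interval built from a normal approximation, which the abstract does not state; the helper name `p_from_ci` is made up for this example.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover (standard error, z, two-sided p) from a symmetric 95% CI.

    Assumes the interval was computed as estimate +/- z_crit * SE.
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


if __name__ == "__main__":
    # Estimates and intervals as reported in the abstract above.
    reported = {
        "BMI": (-1.04, -1.38, -0.70),
        "HOMA-IR": (-0.66, -1.01, -0.30),
        "TC": (-0.68, -1.35, -0.02),
        "LDL-C": (-0.66, -1.30, -0.02),
    }
    for name, (est, lo, hi) in reported.items():
        se, z, p = p_from_ci(est, lo, hi)
        print(f"{name}: SE={se:.3f}, z={z:.2f}, p={p:.2g}")
```

Under that assumption, an interval that only just excludes zero, as for TC and LDL-C, corresponds to a p-value only slightly below 0.05.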

Source
DOI: http://dx.doi.org/10.1155/2019/4386401
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6899277
November 2019

Risk Prediction of Dyslipidemia for Chinese Han Adults Using Random Forest Survival Model.

Clin Epidemiol 2019;11:1047-1055. Epub 2019 Dec 10.

Department of Preventive Medicine, School of Public Health and Management, Binzhou Medical University, Yantai, People's Republic of China.

Objective: Dyslipidemia has been recognized as a major risk factor for several diseases, and early prevention and management of dyslipidemia is effective in the primary prevention of cardiovascular events. The present study aims to develop risk models for predicting dyslipidemia using Random Survival Forest (RSF), which takes the complex relationships between the variables into account.

Methods: We used data from 6328 participants aged between 19 and 90 years free of dyslipidemia at baseline with a maximum follow-up of 5 years. RSF was applied to develop gender-specific risk models for predicting dyslipidemia using variables from anthropometric and laboratory tests in the cohort. Cox regression was also adopted for comparison with the RSF model, and Harrell's concordance statistic with 10-fold cross-validation was used to validate the models.

Results: The incidence density of dyslipidemia was 101/1000 in total and subgroup incidence densities were 121/1000 for men and 69/1000 for women. Twenty-four predictors were identified in the prediction model for males and 23 in that for females. The C-statistics of the prediction models for males and females were 0.731 and 0.801, respectively. The RSF model showed better discriminative performance than the Cox proportional hazards (CPH) model (0.719 for males and 0.787 for females). Moreover, some predictors were observed to have a nonlinear effect on dyslipidemia.

Conclusion: The RSF model is a promising method in identifying high-risk individuals for the prevention of dyslipidemia and related diseases.
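
Harrell's concordance statistic, used above to compare the RSF and Cox models, can be sketched in a few lines. The version below is a simplified illustration under stated assumptions: a pair of subjects is comparable only when the one with the shorter follow-up time had the event, and tied risk scores count as half-concordant. It is not the authors' implementation, and the toy data are invented.

```python
def harrell_c(times, events, risk):
    """Simplified Harrell's C for right-censored data.

    times  : follow-up time per subject
    events : 1 if the event was observed, 0 if censored
    risk   : predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Subject i must have an observed event strictly before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Invented toy data: four subjects, the third one censored.
    times = [1, 2, 3, 4]
    events = [1, 1, 0, 1]
    risk = [0.9, 0.7, 0.4, 0.1]
    print(harrell_c(times, events, risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.731 and 0.801 should be read.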

Source
DOI: http://dx.doi.org/10.2147/CLEP.S223694
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6911320
December 2019

The potential agents from food for preventing leukopenia induced by benzene: garlic preparations.

Toxicol Mech Methods 2019 Nov 14;29(9):702-709. Epub 2019 Aug 14.

Institute of Toxicology, School of Public Health, Shandong University, Jinan, China.

Leukopenia is the early clinical manifestation of benzene poisoning. The aim of our research was to evaluate the preventive effects of three kinds of garlic preparations on benzene-induced leukopenia. The mouse model of leukopenia was established by oral administration of benzene. At the same time, mice were administered garlic homogenate (GH), garlic oil (GO) or diallyl trisulfide (DATS) as preventive measures. The counts of white blood cells (WBC), the organ indexes, pathological examinations, blood biochemical parameters, weight gains, and food intakes were evaluated to observe the protective effect and potential adverse events. The results demonstrated that the counts of WBC increased by 144.04%, 140.07%, and 148.34%, respectively, after intervention by GH (400 mg/kg), GO (60 mg/kg) and DATS (30 mg/kg), compared with that in the model group. The spleen and thymus indexes in the benzene model group were 44.99% and 54.04% lower than those in the blank control group, the number of spleen nodules was reduced and the thymus was atrophied; these changes were restored to different degrees by the three garlic preparations. The results suggested that all three preparations could prevent leukopenia and protect against the organ injuries induced by benzene. However, the spleen index and weight gains revealed that GH and GO brought more adverse events than DATS.

Source
DOI: http://dx.doi.org/10.1080/15376516.2019.1650148
November 2019

Liver Function and Risk of Type 2 Diabetes: Bidirectional Mendelian Randomization Study.

Diabetes 2019 Aug 14;68(8):1681-1691. Epub 2019 May 14.

MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, U.K.

Liver dysfunction and type 2 diabetes (T2D) are consistently associated. However, it is currently unknown whether liver dysfunction contributes to, results from, or is merely correlated with T2D due to confounding. We used Mendelian randomization to investigate the presence and direction of any causal relation between liver function and T2D risk including up to 64,094 T2D case and 607,012 control subjects. Several biomarkers were used as proxies of liver function (i.e., alanine aminotransferase [ALT], aspartate aminotransferase [AST], alkaline phosphatase [ALP], and γ-glutamyl transferase [GGT]). Genetic variants strongly associated with each liver function marker were used to investigate the effect of liver function on T2D risk. In addition, genetic variants strongly associated with T2D risk and with fasting insulin were used to investigate the effect of predisposition to T2D and insulin resistance, respectively, on liver function. Genetically predicted higher circulating ALT and AST were related to increased risk of T2D. There was a modest negative association of genetically predicted ALP with T2D risk and no evidence of association between GGT and T2D risk. Genetic predisposition to higher fasting insulin, but not to T2D, was related to increased circulating ALT. Since circulating ALT and AST are markers of nonalcoholic fatty liver disease (NAFLD), these findings provide some support for insulin resistance resulting in NAFLD, which in turn increases T2D risk.
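
To make the Mendelian randomization logic concrete, the sketch below computes per-variant Wald ratios and an inverse-variance weighted (IVW) causal estimate from summary statistics. IVW is one common estimator; the abstract does not say which estimators the authors used, so treat this as an illustration only. The function names and the numbers in the usage example are invented.

```python
def wald_ratio(beta_exposure, beta_outcome):
    """Causal estimate from one variant: outcome effect / exposure effect."""
    return beta_outcome / beta_exposure


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and its standard error.

    Equivalent to a weighted regression of outcome effects on exposure
    effects through the origin, with weights 1 / se_outcome**2.
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, (1.0 / den) ** 0.5


if __name__ == "__main__":
    # Invented summary statistics for three hypothetical variants.
    bx = [0.10, 0.20, 0.30]   # variant effect on the liver marker
    by = [0.05, 0.10, 0.15]   # variant effect on T2D (log odds)
    se = [0.01, 0.02, 0.01]   # standard error of the outcome effect
    estimate, std_err = ivw_estimate(bx, by, se)
    print(estimate, std_err)
```

Running the same calculation with the roles of exposure and outcome swapped, using variants for T2D or fasting insulin, gives the reverse direction of a bidirectional analysis.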

Source
DOI: http://dx.doi.org/10.2337/db18-1048
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7011195
August 2019

Bioprocess development of a stable FUT8-CHO cell line to produce defucosylated anti-HER2 antibody.

Bioprocess Biosyst Eng 2019 Aug 13;42(8):1263-1271. Epub 2019 Apr 13.

Engineering Research Center of Cell and Therapeutic Antibody, Ministry of Education, School of Pharmacy, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, People's Republic of China.

In recent years, an increasing number of defucosylated therapeutic antibodies have been applied in clinical practice due to their better efficacy compared to fucosylated counterparts. The establishment of stable and clonal manufacturing cell lines is the basis of therapeutic antibody production. Bioprocess development of a new cell line is necessary for its future applications in the biopharmaceutical industry. We engineered a stable cell line expressing defucosylated anti-HER2 antibody based on an established α-1,6-fucosyltransferase (FUT8) gene knockout CHO-S cell line. The optimization of medium and feed was evaluated in a small-scale culture system. Then the optimal medium and feed were scaled up in a bioreactor system. After fed-batch culture over 13 days, we evaluated the cell growth, antibody yield, glycan compositions and bioactivities. The production of anti-HER2 antibody from the FUT8 gene knockout CHO-S cells in the bioreactor increased by 37% compared to the shake flask system. The N-glycan profile of the produced antibody was consistent between the bioreactor and shake flask system. The antibody-dependent cellular cytotoxicity activity of the defucosylated antibody increased 14-fold compared to the wild-type antibody, which was the same as our previous results. The results of our bioprocess development demonstrated that the engineered cell line could be developed into a biopharmaceutical industrial cell line.

Source
DOI: http://dx.doi.org/10.1007/s00449-019-02124-7
August 2019

Phage-Mediated Competitive Chemiluminescent Immunoassay for Detecting Cry1Ab Toxin by Using an Anti-Idiotypic Camel Nanobody.

J Agric Food Chem 2018 Jan 22;66(4):950-956. Epub 2018 Jan 22.

Laboratory for Food Quality and Safety-State Key Laboratory Cultivation Base of Ministry of Science and Technology, Key Laboratory of Control Technology and Standard for Agro-product Safety and Quality (Ministry of Agriculture), Institute of Food Safety and Nutrition, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, China.

Cry toxins have been widely used in genetically modified organisms for pest control, raising public concern regarding their effects on the natural environment and food safety. In this work, a phage-mediated competitive chemiluminescent immunoassay (c-CLIA) was developed for determination of Cry1Ab toxin using anti-idiotypic camel nanobodies. By extracting RNA from camels' peripheral blood lymphocytes, a naive phage-displayed nanobody library was established. Using anti-Cry1Ab toxin monoclonal antibodies (mAbs) against the library for anti-idiotypic antibody screening, four anti-idiotypic nanobodies were selected and confirmed to be specific for anti-Cry1Ab mAb binding. Thereafter, a c-CLIA was developed for detection of Cry1Ab toxin based on anti-idiotypic camel nanobodies and employed for sample testing. The results revealed a half-inhibition concentration of developed assay to be 42.68 ± 2.54 ng/mL, in the linear range of 10.49-307.1 ng/mL. The established method is highly specific for Cry1Ab recognition, with negligible cross-reactivity for other Cry toxins. For spiked cereal samples, the recoveries of Cry1Ab toxin ranged from 77.4% to 127%, with coefficient of variation of less than 9%. This study demonstrated that the competitive format based on phage-displayed anti-idiotypic nanobodies can provide an alternative strategy for Cry toxin detection.

Source
DOI: http://dx.doi.org/10.1021/acs.jafc.7b04923
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7314401
January 2018

A powerful weighted statistic for detecting group differences of directed biological networks.

Sci Rep 2016 Sep 30;6:34159. Epub 2016 Sep 30.

Department of Biostatistics, School of Public Health, Shandong University, Jinan 250012, China.

Complex disease is largely determined by a number of biomolecules interwoven into networks, rather than a single biomolecule. Different physiological conditions such as cases and controls may manifest as different networks. Statistical comparison between biological networks can provide not only new insight into the disease mechanism but also statistical guidance for drug development. However, the methods developed in previous studies are inadequate to capture the changes in both the nodes and edges, and often ignore the network structure. In this study, we present a powerful weighted statistical test for group differences of directed biological networks, which is independent of the network attributes and can capture the changes in both the nodes and edges, while simultaneously accounting for the network structure by putting more weight on the differences of nodes located at relatively more important positions. Simulation studies illustrate that this method had better performance than previous ones under various sample sizes and network structures. One application to GWAS of leprosy successfully identifies the specific gene interaction network contributing to leprosy. Another real data analysis significantly identifies a new biological network, which is related to acute myeloid leukemia. One potential network responsible for lung cancer has also been significantly detected. The source R code is available on our website.

Source
DOI: http://dx.doi.org/10.1038/srep34159
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5054825
September 2016

A novel chi-square statistic for detecting group differences between pathways in systems epidemiology.

Stat Med 2016 Dec 7;35(29):5512-5524. Epub 2016 Sep 7.

Department of Biostatistics, School of Public Health, Shandong University, Jinan, 250012, Shandong, China.

Traditional epidemiology often pays more attention to the identification of a single factor rather than to the pathway that is related to a disease, and therefore, it is difficult to explore the disease mechanism. Systems epidemiology aims to integrate putative lifestyle exposures and biomarkers extracted from multiple omics platforms to offer new insights into the pathway mechanisms that underlie disease at the human population level. One key but inadequately addressed question is how to develop powerful statistics to identify whether one candidate pathway is associated with a disease. Bearing in mind that a pathway difference can result from not only changes in the nodes but also changes in the edges, we propose a novel statistic for detecting group differences between pathways, which, in principle, captures the node changes and edge changes while simultaneously accounting for the pathway structure. The proposed test has been proven to follow the chi-square distribution, and various simulations have shown it has better performance than other existing methods. Integrating genome-wide DNA methylation data, we analyzed one real data set from the Bogalusa cohort study and significantly identified a potential pathway, Smoking → SOCS3 → PIK3R1, which was strongly associated with abdominal obesity. The proposed test was powerful and efficient at identifying pathway differences between two groups, and it can be extended to other disciplines that involve statistical comparisons between pathways. The source code in R is available on our website. Copyright © 2016 John Wiley & Sons, Ltd.

Source
DOI: http://dx.doi.org/10.1002/sim.7094
December 2016

Network or regression-based methods for disease discrimination: a comparison study.

BMC Med Res Methodol 2016 Aug 18;16:100. Epub 2016 Aug 18.

Department of Epidemiology and Biostatistics, School of Public Health, Shandong University, PO Box 100, Jinan, 250012, China.

Background: In stark contrast to the network-centric view of complex disease, regression-based methods are preferred in disease prediction, especially by epidemiologists and clinical professionals. It remains controversial whether network-based methods perform better than regression-based methods, and to what extent they outperform them.

Methods: Simulations under different scenarios (the input variables are independent or in network relationship) as well as an application were conducted to assess the prediction performance of four typical methods including Bayesian network, neural network, logistic regression and regression splines.

Results: The simulation results reveal that Bayesian network showed a better performance when the variables were in a network relationship or in a chain structure. For the special wheel network structure, logistic regression had a considerable performance compared to others. Further application to a GWAS of leprosy showed that Bayesian network still outperformed the other methods.

Conclusion: Although regression-based methods are still popular and widely used, network-based approaches should be paid more attention, since they capture the complex relationship between variables.

Source
DOI: http://dx.doi.org/10.1186/s12874-016-0207-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4991108
August 2016

A novel Markov Blanket-based repeated-fishing strategy for capturing phenotype-related biomarkers in big omics data.

BMC Genet 2016 Mar 9;17:51. Epub 2016 Mar 9.

Department of biostatistics, School of Public Health, Shandong University, Jinan City, Shandong Province, P. R. China.

Background: We propose a novel Markov Blanket-based repeated-fishing strategy (MBRFS) in an attempt to increase the power of the existing Markov Blanket method (DASSO-MB) and maintain its advantages in omics data analysis.

Results: Both simulation and real data analysis were conducted to assess its performance by comparison with other methods including the χ² test with Bonferroni and B-H adjustment, least absolute shrinkage and selection operator (LASSO) and DASSO-MB. A series of simulation studies showed that the true discovery rate (TDR) of the proposed MBRFS was always close to zero under the null hypothesis (odds ratio = 1 for each SNP) with excellent stability in all three scenarios of independent phenotype-related SNPs without linkage disequilibrium (LD) around them, correlated phenotype-related SNPs without LD around them, and phenotype-related SNPs with strong LD around them. As expected, under different odds ratios and minor allele frequencies (MAFs), MBRFS always had the best performance in capturing the true phenotype-related biomarkers, with a higher Matthews correlation coefficient (MCC) for all three scenarios above. More importantly, because the proposed MBRFS uses the repeated fishing strategy, it still captures more phenotype-related SNPs with minor effects when non-significant phenotype-related SNPs emerge under the χ² test after Bonferroni multiple correction. The various real omics data analyses, including GWAS data, DNA methylation data, gene expression data and metabolites data, indicated that the proposed MBRFS always detected relatively reasonable biomarkers.

Conclusions: Our proposed MBRFS can exactly capture the true phenotype-related biomarkers with the reduction of false negative rate when the phenotype-related biomarkers are independent or correlated, as well as the circumstance that phenotype-related biomarkers are associated with non-phenotype-related ones.
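
The Bonferroni and B-H (Benjamini-Hochberg) adjustments used as comparators above differ in how strictly they threshold p-values, which is why weak signals can survive one and not the other. The sketch below shows both decision rules in plain Python; it illustrates the comparators only, not MBRFS itself, and the p-values in the usage example are invented.

```python
def bonferroni_reject(pvals, alpha=0.05):
    """Reject H0 for p-values at or below alpha divided by the number of tests."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * alpha / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject


if __name__ == "__main__":
    # Invented p-values for eight hypothetical SNP tests.
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(sum(bonferroni_reject(pvals)), sum(benjamini_hochberg_reject(pvals)))
```

With these invented values Bonferroni keeps one test and B-H keeps two, a small-scale picture of the power gap that motivates less conservative selection strategies.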

Source
DOI: http://dx.doi.org/10.1186/s12863-016-0358-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4784463
March 2016

A powerful score-based statistical test for group difference in weighted biological networks.

BMC Bioinformatics 2016 Feb 12;17:86. Epub 2016 Feb 12.

Department of Biostatistics, School of Public Health, Shandong University, PO Box 100, Jinan, 250012, Shandong, China.

Background: Complex disease is largely determined by a number of biomolecules interwoven into networks, rather than a single biomolecule. A key but inadequately addressed issue is how to test possible differences of the networks between two groups. Group-level comparison of network properties may shed light on underlying disease mechanisms and benefit the design of drug targets for complex diseases. We therefore proposed a powerful score-based statistic to detect group difference in weighted networks, which simultaneously captures the vertex changes and edge changes.

Results: Simulation studies indicated that the proposed network difference measure (NetDifM) was stable and outperformed other existing methods under various sample sizes and network topologies. One application to real data from a GWAS of leprosy successfully identified the specific gene interaction network contributing to leprosy. For additional gene expression data of ovarian cancer, two candidate subnetworks, PI3K-AKT and Notch signaling pathways, were considered and identified respectively.

Conclusions: The proposed method, accounting for the vertex changes and edge changes simultaneously, is valid and powerful to capture the group difference of biological networks.

Source
DOI: http://dx.doi.org/10.1186/s12859-016-0916-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4751708
February 2016

A powerful score-based test statistic for detecting gene-gene co-association.

BMC Genet 2016 Jan 29;17:31. Epub 2016 Jan 29.

Department of Biostatistics, School of Public Health, Shandong University, 44 Wen Hua Xi Road, PO Box 100, Jinan, 250012, China.

Background: The genetic variants identified by Genome-wide association study (GWAS) can only account for a small proportion of the total heritability for complex disease. The existence of gene-gene joint effects which contains the main effects and their co-association is one of the possible explanations for the "missing heritability" problems. Gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, not only due to the traditional interaction under nearly independent condition but the correlation between genes. Generally, genes tend to work collaboratively within specific pathway or network contributing to the disease and the specific disease-associated locus will often be highly correlated (e.g. single nucleotide polymorphisms (SNPs) in linkage disequilibrium). Therefore, we proposed a novel score-based statistic (SBS) as a gene-based method for detecting gene-gene co-association.

Results: Various simulations illustrate that, under different sample sizes, marginal effects of causal SNPs and co-association levels, the proposed SBS performs better than existing methods, including single SNP-based and principal component analysis (PCA)-based logistic regression models, the statistics based on canonical correlations (CCU), kernel canonical correlation analysis (KCCU), partial least squares path modeling (PLSPM) and the delta-square (δ²) statistic. The real data analysis of rheumatoid arthritis (RA) further confirmed its advantages in practice.

Conclusions: SBS is a powerful and efficient gene-based method for detecting gene-gene co-association.

Source
DOI: http://dx.doi.org/10.1186/s12863-016-0331-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4731962
January 2016

Detection for pathway effect contributing to disease in systems epidemiology with a case-control design.

BMJ Open 2015 Jan 16;5(1):e006721. Epub 2015 Jan 16.

Department of Epidemiology and Biostatistics, School of Public Health, Shandong University, Jinan, China.

Objectives: Identification of pathway effects responsible for specific diseases has been one of the essential tasks in systems epidemiology. Despite some advances in procedures for distinguishing specific pathway (or network) topology between different disease statuses, statistical inference at a population level remains unsolved and further development is still needed. To identify the specific pathways contributing to diseases, we attempt to develop powerful statistics that can capture the complex relationships among risk factors.

Setting and participants: Acute myeloid leukaemia (AML) data obtained from 133 adults (98 patients and 35 controls; 47% female).

Results: Simulation studies indicated that the proposed Pathway Effect Measures (PEM) were stable; bootstrap-based methods outperformed the others, with the bias-corrected bootstrap CI method having the highest power. Application to real AML data successfully identified the specific pathway (Treg→TGFβ→Th17) effect contributing to AML, with p values less than 0.05 under various methods and a bias-corrected bootstrap CI of -0.214 to -0.020. This demonstrated that the Th17-Treg correlation balance was impaired in patients with AML, suggesting that Th17-Treg imbalance potentially plays a role in the pathogenesis of AML.

Conclusions: The proposed bootstrap-based PEM are valid and powerful for detecting the specific pathway effect contributing to disease, thus potentially providing new insight into the underlying mechanisms and ways to study the disease effects of specific pathways more comprehensively.
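
As a rough illustration of the bias-corrected bootstrap confidence interval mentioned in this abstract, the following Python sketch applies the standard bias-corrected (BC) percentile method to a generic statistic. It is not the authors' PEM implementation: the function name `bc_bootstrap_ci`, the toy data, and the use of the sample mean as the statistic are assumptions made purely for illustration, and a case-control analysis would normally resample within each group.

```python
import random
from statistics import NormalDist


def bc_bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, seed=0):
    """Bias-corrected (BC) percentile bootstrap CI for stat(data).

    Hypothetical helper for illustration; returns (estimate, lower, upper).
    """
    rng = random.Random(seed)
    n = len(data)
    theta_hat = stat(data)

    # Resample with replacement and recompute the statistic each time.
    reps = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )

    # Bias-correction constant z0: normal quantile of the share of
    # replicates that fall below the original estimate.
    p0 = sum(r < theta_hat for r in reps) / n_boot
    p0 = min(max(p0, 1 / (n_boot + 1)), n_boot / (n_boot + 1))  # keep in (0, 1)
    norm = NormalDist()
    z0 = norm.inv_cdf(p0)

    # Shift the nominal percentile levels by 2 * z0.
    lo_p = norm.cdf(2 * z0 + norm.inv_cdf(alpha / 2))
    hi_p = norm.cdf(2 * z0 + norm.inv_cdf(1 - alpha / 2))

    def quantile(p):
        # Simple order-statistic quantile of the sorted replicates.
        idx = min(max(int(p * n_boot), 0), n_boot - 1)
        return reps[idx]

    return theta_hat, quantile(lo_p), quantile(hi_p)


def sample_mean(xs):
    return sum(xs) / len(xs)


# Toy data (assumed): the values 1.0 .. 40.0.
toy = [float(i) for i in range(1, 41)]
est, lower, upper = bc_bootstrap_ci(toy, sample_mean)
print(est, lower, upper)
```

An interval that excludes zero, such as the one reported above for the Treg→TGFβ→Th17 pathway, is read as evidence of a non-zero effect at the chosen level.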

Source
DOI: http://dx.doi.org/10.1136/bmjopen-2014-006721
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4298111
January 2015

Integrative Bayesian variable selection with gene-based informative priors for genome-wide association studies.

BMC Genet 2014 Dec 10;15:130. Epub 2014 Dec 10.

Bayessoft, Inc., 2221 Caravaggio Drive, Davis, CA, 95618, USA.

Background: Genome-wide Association Studies (GWAS) are typically designed to identify phenotype-associated single nucleotide polymorphisms (SNPs) individually using univariate analysis methods. Though providing valuable insights into genetic risks of common diseases, the genetic variants identified by GWAS generally account for only a small proportion of the total heritability for complex diseases. To solve this "missing heritability" problem, we implemented a strategy called integrative Bayesian Variable Selection (iBVS), which is based on a hierarchical model that incorporates an informative prior by considering the gene interrelationship as a network. It was applied here to both simulated and real data sets.

Results: Simulation studies indicated that the iBVS method performed best, with the highest AUC in both variable selection and outcome prediction, when compared to Stepwise and LASSO-based strategies. In an analysis of a leprosy case-control study, iBVS selected 94 SNPs as predictors, while LASSO selected 100 SNPs. The Stepwise regression yielded a more parsimonious model with only 3 SNPs. The prediction results demonstrated that the iBVS method performed comparably to LASSO and better than Stepwise strategies.

Conclusions: The proposed iBVS strategy is a novel and valid method for Genome-wide Association Studies, with the additional advantage that it produces more interpretable posterior probabilities for each variable, unlike LASSO and other penalized regression methods.

Source
DOI: http://dx.doi.org/10.1186/s12863-014-0130-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4275962
December 2014

Comparing partial least square approaches in a gene- or region-based association study for multiple quantitative phenotypes.

Hum Biol 2014;86(1):51-8

Department of Epidemiology and Biostatistics, School of Public Health, Shandong University, Shandong, China.

When complex diseases are considered quantitatively, there are at least three statistical strategies for association studies: one single-nucleotide polymorphism (SNP) on a single trait, a gene or region (with multiple SNPs) on a single trait, and a gene or region on multiple traits. The third approach is the most general in dissecting genetic mechanisms underlying complex diseases underpinning multiple quantitative traits. Gene or region association methods based on partial least squares (PLS) approaches have been shown to have an apparent power advantage. However, few approaches have been developed for multiple quantitative phenotypes or traits underlying a condition or disease, and the performance of various PLS approaches used in association studies for multiple quantitative traits has not been assessed. Here we exploit association between multiple SNPs and multiple phenotypes or traits, from a regression perspective, through exhaustive scan statistics (sliding window) using PLS and sparse PLS regressions. Simulations were conducted to assess the performance of the proposed scan statistics and compare them with existing methods. The proposed methods were applied to 12 regions of genome-wide association study data from the European Prospective Investigation of Cancer-Norfolk study.

Source
DOI: http://dx.doi.org/10.3378/027.086.0106
November 2015

A powerful latent variable method for detecting and characterizing gene-based gene-gene interaction on multiple quantitative traits.

BMC Genet 2013 Sep 23;14:89. Epub 2013 Sep 23.

Department of Epidemiology and Biostatistics, School of Public Health, Shandong University, Jinan 250012, China.

Background: When complex diseases are considered quantitatively, there are at least three statistical strategies for analyzing gene-gene interaction: SNP-by-SNP interaction on a single trait, gene-gene (each can involve multiple SNPs) interaction on a single trait, and gene-gene interaction on multiple traits. The third one is the most general in dissecting the genetic mechanism underlying complex diseases underpinning multiple quantitative traits. In this paper, we developed a novel statistic for this strategy by modifying Partial Least Squares Path Modeling (PLSPM), called the mPLSPM statistic.

Results: Simulation studies indicated that the mPLSPM statistic was powerful and outperformed the principal component analysis (PCA)-based linear regression method. Application to real data in the EPIC-Norfolk GWAS sub-cohort showed suggestive interaction (γ) between the TMEM18 gene and the BDNF gene on two composite body shape scores (γ = 0.047 and γ = 0.058, with P = 0.021, P = 0.005), and BMI (γ = 0.043, P = 0.034). This suggested that these scores (synthetically latent traits) were more suitable than a single trait for capturing the obesity-related genetic interaction effect between genes.

Conclusions: The proposed novel mPLSPM statistic is a valid and powerful gene-based method for detecting gene-gene interaction on multiple quantitative phenotypes.

Source
DOI: http://dx.doi.org/10.1186/1471-2156-14-89
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3848962
September 2013

From interaction to co-association --a Fisher r-to-z transformation-based simple statistic for real world genome-wide association study.

PLoS One 2013 29;8(7):e70774. Epub 2013 Jul 29.

Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China.

Currently, the genetic variants identified by genome-wide association studies (GWAS) generally account for only a small proportion of the total heritability of complex diseases. One crucial reason is the underutilization of gene-gene joint effects commonly encountered in GWAS, which include their main effects and co-association. However, gene-gene co-association is often loosely placed within the framework of gene-gene interaction. From the causal graph perspective, we elucidate in detail the concept and rationale of gene-gene co-association as well as its relationship with traditional gene-gene interaction, and propose two simple statistics based on the Fisher r-to-z transformation to detect it. Three series of simulations further highlight that gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, owing not only to the traditional interaction under the nearly independent condition but also to the correlation between the two genes. The proposed statistics are more powerful than logistic regression under various situations, are not affected by linkage disequilibrium, and maintain an acceptable false-positive rate as long as a reasonable GWAS data analysis roadmap is strictly followed. Furthermore, an application to gene pathway analysis associated with leprosy confirms in practice that our proposed gene-gene co-association concepts and the correspondingly proposed statistics are strongly in line with reality.
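
To make the Fisher r-to-z idea concrete, the following Python sketch shows the textbook two-sample test for a difference between two correlation coefficients, here imagined as the correlation between two loci measured separately in cases and in controls. This is a generic illustration under stated assumptions and not necessarily either of the two statistics proposed in the paper; the function names `pearson_r` and `fisher_z_difference` and all numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_difference(r_case, n_case, r_ctrl, n_ctrl):
    """Two-sample z test for a difference between two correlations.

    Each r is mapped to z = atanh(r), whose sampling variance is
    approximately 1 / (n - 3). Returns (z statistic, two-sided p value).
    """
    if n_case <= 3 or n_ctrl <= 3:
        raise ValueError("each group needs more than 3 observations")
    z_case = math.atanh(r_case)
    z_ctrl = math.atanh(r_ctrl)
    se = math.sqrt(1.0 / (n_case - 3) + 1.0 / (n_ctrl - 3))
    z = (z_case - z_ctrl) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Assumed example: correlation 0.6 in 103 cases vs 0.1 in 103 controls.
z_stat, p_value = fisher_z_difference(0.6, 103, 0.1, 103)
print(z_stat, p_value)
```

A large absolute z value indicates that the correlation between the two loci differs between cases and controls, which is the kind of signal a co-association statistic is meant to pick up.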

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0070774
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3726765
August 2014

An integrative framework for Bayesian variable selection with informative priors for identifying genes and pathways.

PLoS One 2013 3;8(7):e67672. Epub 2013 Jul 3.

Department of Health Statistics, Chongqing Medical University, Chongqing, China.

The discovery of genetic or genomic markers plays a central role in the development of personalized medicine. A notable challenge exists when dealing with the high dimensionality of the data sets, as thousands of genes or millions of genetic variants are collected on a relatively small number of subjects. Traditional gene-wise selection methods using univariate analyses have difficulty incorporating correlational, structural, or functional structures amongst the molecular measures. For microarray gene expression data, we first summarize solutions for dealing with 'large p, small n' problems, and then propose an integrative Bayesian variable selection (iBVS) framework for simultaneously identifying causal or marker genes and regulatory pathways. A novel partial least squares (PLS) g-prior for iBVS is developed to allow the incorporation of prior knowledge on gene-gene interactions or functional relationships. From the point of view of systems biology, iBVS enables users to directly target the joint effects of multiple genes and pathways in a hierarchical modeling diagram to predict disease status or phenotype. The estimated posterior selection probabilities offer probabilistic and biological interpretations. Both simulated data and a set of microarray data for predicting stroke status are used to validate the performance of iBVS in a Probit model with binary outcomes. iBVS offers a general framework for effective discovery of various molecular biomarkers by combining data-based statistics and knowledge-based priors. Guidelines on making posterior inferences, determining Bayesian significance levels, and improving computational efficiency are also discussed.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0067672
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3700986
February 2014

A PLSPM-based test statistic for detecting gene-gene co-association in genome-wide association study with case-control design.

PLoS One 2013 19;8(4):e62129. Epub 2013 Apr 19.

Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China.

For genome-wide association data analysis, two genes in any pathway, or two SNPs in two linked gene regions or in two linked exons within one gene, are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which refers to effects due not only to the traditional interaction under a nearly independent condition but also to the correlation between two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic performs better than the single SNP-based logistic model, the PCA-based logistic model, and other gene-based methods.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0062129
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3631168
November 2013

Detection for gene-gene co-association via kernel canonical correlation analysis.

BMC Genet 2012 Oct 8;13:83. Epub 2012 Oct 8.

Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, 250012, China.

Background: Currently, most methods for detecting gene-gene interaction (GGI) in genome-wide association studies (GWASs) are limited by their use of the single nucleotide polymorphism (SNP) as the unit of association. One way to address this drawback is to consider higher-level units such as genes or regions in the analysis. Earlier we proposed a statistic based on canonical correlations (CCU) as a gene-based method for detecting gene-gene co-association. However, it can capture only linear relationships and not nonlinear correlation between genes. We therefore proposed a counterpart (KCCU) based on kernel canonical correlation analysis (KCCA).

Results: Through simulation, the KCCU statistic was shown to be a valid test and more powerful than the CCU statistic with respect to sample size and interaction odds ratio. Analysis of data from regions involving three genes on rheumatoid arthritis (RA) from Genetic Analysis Workshop 16 (GAW16) indicated that only the KCCU statistic was able to identify interactions reported earlier.

Conclusions: The KCCU statistic is a valid and powerful gene-based method for detecting gene-gene co-association.

Source
DOI: http://dx.doi.org/10.1186/1471-2156-13-83
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3506484
October 2012