Publications by authors named "Othman Soufan"

13 Publications

Development of a Comprehensive Toxicity Pathway Model for 17α-Ethinylestradiol in Early Life Stage Fathead Minnows (Pimephales promelas).

Environ Sci Technol 2021 04 23;55(8):5024-5036. Epub 2021 Mar 23.

Toxicology Centre, University of Saskatchewan, Saskatoon, Saskatchewan S7N 5B3, Canada.

There is increasing pressure to develop alternative ecotoxicological risk assessment approaches that do not rely on expensive, time-consuming, and ethically questionable live animal testing. This study aimed to develop a comprehensive early life stage toxicity pathway model for the exposure of fish to estrogenic chemicals that is rooted in mechanistic toxicology. Embryo-larval fathead minnows (FHM; Pimephales promelas) were exposed to graded concentrations of 17α-ethinylestradiol (water control, 0.01% DMSO, 4, 20, and 100 ng/L) for 32 days. Fish were assessed for transcriptomic and proteomic responses at 4 days post-hatch (dph), and for histological and apical end points at 28 dph. Molecular analyses revealed core responses that were indicative of observed apical outcomes, including biological processes resulting in overproduction of vitellogenin and impairment of visual development. Histological observations indicated accumulation of proteinaceous fluid in liver and kidney tissues, energy depletion, and delayed or suppressed gonad development. Additionally, fish in the 100 ng/L treatment group were smaller than controls. Integration of omics data improved the interpretation of perturbations in early life stage FHM, providing evidence of conservation of toxicity pathways across levels of biological organization. Overall, the mechanism-based embryo-larval FHM model showed promise as a replacement for standard adult live animal tests.

Source
http://dx.doi.org/10.1021/acs.est.0c05942 (DOI)
April 2021

FastBMD: an online tool for rapid benchmark dose-response analysis of transcriptomics data.

Bioinformatics 2021 05;37(7):1035-1036

Department of Natural Resource Sciences, McGill University, Montreal, QC H9X 3V9, Canada.

Motivation: Transcriptomics dose-response analysis is a promising new approach method for toxicity testing. While international regulatory agencies have spent substantial effort establishing a standardized statistical approach, existing software that follows this approach is computationally inefficient and must be locally installed.

Results: FastBMD is a web-based tool that implements standardized methods for transcriptomics benchmark dose-response analysis in R. It is >60 times faster than the current leading software, supports transcriptomics data from 13 species, and offers a comprehensive analytical pipeline that goes from processing and normalization of raw gene expression values to interactive exploration of pathway-level benchmark dose results.

Availability And Implementation: FastBMD is freely available at www.fastbmd.ca.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/btaa700 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8128449 (PMC)
May 2021

miRNet 2.0: network-based visual analytics for miRNA functional analysis and systems biology.

Nucleic Acids Res 2020 07;48(W1):W244-W251

Department of Human Genetics, McGill University, Montreal, Quebec, Canada.

miRNet is an easy-to-use, web-based platform designed to help elucidate microRNA (miRNA) functions by integrating users' data with existing knowledge via network-based visual analytics. Since its first release in 2016, miRNet has been accessed by >20 000 researchers worldwide, with ∼100 users on a daily basis. While version 1.0 was focused primarily on miRNA-target gene interactions, it has become clear that in order to obtain a global view of miRNA functions, it is necessary to bring other important players into the context during analysis. Driven by this concept, in miRNet version 2.0, we have (i) added support for transcription factors (TFs) and single nucleotide polymorphisms (SNPs) that affect miRNAs, miRNA-binding sites or target genes, while also greatly increasing (>5-fold) the underlying knowledgebases of miRNAs, ncRNAs and disease associations; (ii) implemented new functions to allow creation and visual exploration of multipartite networks, with enhanced support for in situ functional analysis; and (iii) revamped the web interface, optimized the workflow, and introduced microservices and a web application programming interface (API) to sustain high-performance, real-time data analysis. The underlying R package is also released in tandem with version 2.0 to allow more flexible data analysis for R programmers. The miRNet 2.0 website is freely available at https://www.mirnet.ca.

Source
http://dx.doi.org/10.1093/nar/gkaa467 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7319552 (PMC)
July 2020

EcoToxModules: Custom Gene Sets to Organize and Analyze Toxicogenomics Data from Ecological Species.

Environ Sci Technol 2020 04 10;54(7):4376-4387. Epub 2020 Mar 10.

Faculty of Agricultural and Environmental Sciences, McGill University, Sainte-Anne-de-Bellevue H9X 3V9, Canada.

Traditional results from toxicogenomics studies are complex lists of significantly impacted genes or gene sets, which are challenging to synthesize down to actionable results with a clear interpretation. Here, we defined two sets of 21 custom gene sets, called the functional and statistical EcoToxModules, in fathead minnow (Pimephales promelas) to (1) re-cast predefined molecular pathways into a toxicological framework and (2) provide a data-driven, unsupervised grouping of genes impacted by exposure to environmental contaminants. The functional EcoToxModules were identified by re-organizing KEGG pathways into biological processes that are more relevant to ecotoxicology based on the input from expert scientists and regulators. The statistical EcoToxModules were identified using co-expression analysis of publicly available microarray data (n = 303 profiles) measured in livers of fathead minnows after exposure to 38 different conditions. Potential applications of the EcoToxModules were demonstrated with two case studies that represent exposure to a pure chemical and to environmental wastewater samples. In comparisons to differential expression and gene set analysis, we found that EcoToxModule responses were consistent with these traditional results. Additionally, they were easier to visualize and quantitatively compare across different conditions, which facilitated drawing conclusions about the relative toxicity of the exposures within each case study.

Source
http://dx.doi.org/10.1021/acs.est.9b06607 (DOI)
April 2020

T1000: a reduced gene set prioritized for toxicogenomic studies.

PeerJ 2019 29;7:e7975. Epub 2019 Oct 29.

Institute of Parasitology, McGill University, Montreal, Canada.

There is growing interest within regulatory agencies and toxicological research communities to develop, test, and apply new approaches, such as toxicogenomics, to more efficiently evaluate chemical hazards. Given the complexity of analyzing thousands of genes simultaneously, there is a need to identify reduced gene sets. Though several gene sets have been defined for toxicological applications, few of these were purposefully derived using toxicogenomics data. Here, we developed and applied a systematic approach to identify 1,000 genes (called Toxicogenomics-1000 or T1000) highly responsive to chemical exposures. First, a co-expression network of 11,210 genes was built by leveraging microarray data from the Open TG-GATEs program. This network was then re-weighted based on prior knowledge of their biological (KEGG, MSigDB) and toxicological (CTD) relevance. Finally, weighted correlation network analysis was applied to identify 258 gene clusters. T1000 was defined by selecting genes from each cluster that were most associated with outcome measures. For model evaluation, we compared the performance of T1000 to that of other gene sets (L1000, S1500, genes selected by Limma, and a random set) using two external datasets based on the rat model. Additionally, a smaller (T384) and a larger version (T1500) of T1000 were used for dose-response modeling to test the effect of gene set size. Our findings demonstrated that the T1000 gene set is predictive of apical outcomes across a range of conditions (e.g., in vitro and in vivo, dose-response, multiple species, tissues, and chemicals), and generally performs as well as, or better than, other available gene sets.

Source
http://dx.doi.org/10.7717/peerj.7975 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6824333 (PMC)
October 2019

NetworkAnalyst 3.0: a visual analytics platform for comprehensive gene expression profiling and meta-analysis.

Nucleic Acids Res 2019 07;47(W1):W234-W241

Institute of Parasitology, McGill University, Montreal, Quebec, Canada.

The growing application of gene expression profiling demands powerful yet user-friendly bioinformatics tools to support systems-level data understanding. NetworkAnalyst was first released in 2014 to address the key need for interpreting gene expression data within the context of protein-protein interaction (PPI) networks. It was soon updated for gene expression meta-analysis with improved workflow and performance. Over the years, NetworkAnalyst has been continuously updated based on community feedback and technological progress. Users can now perform gene expression profiling for 17 different species. In addition to generic PPI networks, users can now create cell-type or tissue-specific PPI networks, gene regulatory networks, gene co-expression networks, as well as networks for toxicogenomics and pharmacogenomics studies. The resulting networks can be customized and explored in 2D, 3D, as well as Virtual Reality (VR) space. For meta-analysis, users can now visually compare multiple gene lists through interactive heatmaps, enrichment networks, Venn diagrams or chord diagrams. In addition, users have the option to create their own data analysis projects, which can be saved and resumed at a later time. These new features are released together as NetworkAnalyst 3.0, freely available at https://www.networkanalyst.ca.

Source
http://dx.doi.org/10.1093/nar/gkz240 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6602507 (PMC)
July 2019

Systematic selection of chemical fingerprint features improves the Gibbs energy prediction of biochemical reactions.

Bioinformatics 2019 08;35(15):2634-2643

King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal, Saudi Arabia.

Motivation: Accurate and wide-ranging prediction of thermodynamic parameters for biochemical reactions can facilitate deeper insights into the workings and the design of metabolic systems.

Results: Here, we introduce a machine learning method with chemical fingerprint-based features for the prediction of the Gibbs free energy of biochemical reactions. From a large pool of 2D fingerprint-based features, this method systematically selects a small number of relevant ones and uses them to construct a regularized linear model. Since a manual selection of 2D structure-based features can be a tedious and time-consuming task, requiring expert knowledge about the structure-activity relationship of chemical compounds, the systematic feature selection step in our method offers a convenient means to identify relevant 2D fingerprint-based features. By comparing our method with state-of-the-art linear regression-based methods for the standard Gibbs free energy prediction, we demonstrated that its prediction accuracy and prediction coverage are most favorable. Our results show direct evidence that a number of 2D fingerprints collectively provide useful information about the Gibbs free energy of biochemical reactions and that our systematic feature selection procedure provides a convenient way to identify them.

Availability And Implementation: Our software is freely available for download at http://sfb.kaust.edu.sa/Pages/Software.aspx.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/bty1035 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6662295 (PMC)
August 2019

DPubChem: a web tool for QSAR modeling and high-throughput virtual screening.

Sci Rep 2018 06 14;8(1):9110. Epub 2018 Jun 14.

Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.

High-throughput screening (HTS) experimentally tests a large number of chemical compounds with the aim of identifying those active in the considered assay. Alternatively, faster and cheaper methods of large-scale virtual screening are performed computationally through quantitative structure-activity relationship (QSAR) models. However, the vast amount of available heterogeneous HTS data and the imbalanced ratio of active to inactive compounds in an assay make this a challenging problem. Although different QSAR models have been proposed, they have certain limitations, e.g., high false positive rates, complicated user interfaces, and limited utilization options. Therefore, we developed DPubChem, a novel web tool for deriving QSAR models that implement state-of-the-art machine-learning techniques to enhance the precision of the models and enable efficient analyses of experiments from the PubChem BioAssay database. DPubChem also has a simple interface that provides various options to users. DPubChem predicted active compounds for 300 datasets with an average geometric mean and F score of 76.68% and 76.53%, respectively. Furthermore, DPubChem builds interaction networks that highlight novel predicted links between chemical compounds and biological assays. Using such a network, DPubChem successfully suggested a novel drug for the Niemann-Pick type C disease. DPubChem is freely available at www.cbrc.kaust.edu.sa/dpubchem.
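
The geometric mean and F score quoted in this abstract are common metrics for imbalanced classification. The following is a minimal sketch of how such metrics are typically computed, assuming the usual definitions (geometric mean of sensitivity and specificity; F score as the harmonic mean of precision and recall). The abstract does not state which variants DPubChem reports, and the function name and confusion-matrix counts below are hypothetical.

```python
import math

def gmean_and_f1(tp, fp, tn, fn):
    """Return (geometric mean, F1) from confusion-matrix counts.

    Uses the common textbook definitions; this is an assumption,
    not a formula taken from the DPubChem paper.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on actives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on inactives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    gmean = math.sqrt(sensitivity * specificity)
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return gmean, f1

# Hypothetical imbalanced assay: 100 actives and 900 inactives.
g, f = gmean_and_f1(tp=80, fp=90, tn=810, fn=20)
print(f"G-mean = {g:.4f}, F1 = {f:.4f}")
```

Unlike plain accuracy, both metrics penalize a model that simply labels every compound inactive, which is why they suit assays with few actives.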

Source
http://dx.doi.org/10.1038/s41598-018-27495-x (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6002400 (PMC)
June 2018

MetaboAnalyst 4.0: towards more transparent and integrative metabolomics analysis.

Nucleic Acids Res 2018 07;46(W1):W486-W494

Institute of Parasitology, McGill University, Montreal, Québec, Canada.

We present a new update to MetaboAnalyst (version 4.0) for comprehensive metabolomic data analysis, interpretation, and integration with other omics data. Since the last major update in 2015, MetaboAnalyst has continued to evolve based on user feedback and technological advancements in the field. For this year's update, four new key features have been added to MetaboAnalyst 4.0, including: (1) real-time R command tracking and display coupled with the release of a companion MetaboAnalystR package; (2) a MS Peaks to Pathways module for prediction of pathway activity from untargeted mass spectral data using the mummichog algorithm; (3) a Biomarker Meta-analysis module for robust biomarker identification through the combination of multiple metabolomic datasets and (4) a Network Explorer module for integrative analysis of metabolomics, metagenomics, and/or transcriptomics data. The user interface of MetaboAnalyst 4.0 has been reengineered to provide a more modern look and feel, as well as to give more space and flexibility to introduce new functions. The underlying knowledgebases (compound libraries, metabolite sets, and metabolic pathways) have also been updated based on the latest data from the Human Metabolome Database (HMDB). A Docker image of MetaboAnalyst is also available to facilitate download and local installation of MetaboAnalyst. MetaboAnalyst 4.0 is freely available at http://metaboanalyst.ca.

Source
http://dx.doi.org/10.1093/nar/gky310 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6030889 (PMC)
July 2018

DRABAL: novel method to mine large high-throughput screening assays using Bayesian active learning.

J Cheminform 2016 10;8:64. Epub 2016 Nov 10.

Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900 Saudi Arabia.

Background: Mining high-throughput screening (HTS) assays is key for enhancing decisions in the area of drug repositioning and drug discovery. However, many challenges are encountered in the process of developing suitable and accurate methods for extracting useful information from these assays. Virtual screening and the wide variety of databases, methods and solutions proposed to date have not completely overcome these challenges. This study is based on a multi-label classification (MLC) technique for modeling correlations between several HTS assays, meaning that a single prediction represents a subset of assigned correlated labels instead of one label. Thus, the devised method provides an increased probability of more accurate predictions for compounds that were not tested in particular assays.

Results: Here we present DRABAL, a novel MLC solution that incorporates structure learning of a Bayesian network as a step to model dependency between the HTS assays. In this study, DRABAL was used to process more than 1.4 million interactions of over 400,000 compounds and analyze the existing relationships between five large HTS assays from the PubChem BioAssay Database. Compared to different MLC methods, DRABAL significantly improves the F score by about 22%, on average. We further illustrated the usefulness of DRABAL by screening FDA-approved drugs and reporting ones that have a high probability of interacting with several targets, thus enabling drug-multi-target repositioning. Specifically, DRABAL suggests the drug thiabendazole as a common activator of the NPC1 and Rab-9A proteins, both of which are targeted by assays designed to identify treatment modalities for the Niemann-Pick type C disease.

Conclusion: We developed a novel MLC solution based on a Bayesian active learning framework to overcome the challenge of lacking fully labeled training data and exploit actual dependencies between the HTS assays. The solution is motivated by the need to model dependencies between existing experimental confirmatory HTS assays and improve prediction performance. We have pursued extensive experiments over several HTS assays and have shown the advantages of DRABAL. The datasets and programs can be downloaded from https://figshare.com/articles/DRABAL/3309562.

Source
http://dx.doi.org/10.1186/s13321-016-0177-8 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5105261 (PMC)
November 2016

DASPfind: new efficient method to predict drug-target interactions.

J Cheminform 2016 16;8:15. Epub 2016 Mar 16.

Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900 Saudi Arabia.

Background: Identification of novel drug-target interactions (DTIs) is important for drug discovery. Experimental determination of such DTIs is costly and time-consuming, which necessitates the development of efficient computational methods for the accurate prediction of potential DTIs. To date, many computational methods have been proposed for this purpose, but they suffer from a high rate of false positive predictions.

Results: Here, we developed a novel computational DTI prediction method, DASPfind. DASPfind uses simple paths of particular lengths inferred from a graph that describes DTIs, similarities between drugs, and similarities between the protein targets of drugs. We show that on average, over the four gold standard DTI datasets, DASPfind significantly outperforms other existing methods when the single top-ranked predictions are considered, resulting in 46.17% of these predictions being correct, and it achieves 49.22% correct single top-ranked predictions when the set of all DTIs for a single drug is tested. Furthermore, we demonstrate that our method is best suited for predicting DTIs in cases of drugs with no known targets or with few known targets. We also show the practical use of DASPfind by generating novel predictions for the Ion Channel dataset and validating them manually.

Conclusions: DASPfind is a computational method for finding reliable new interactions between drugs and proteins. We show over six different DTI datasets that DASPfind outperforms other state-of-the-art methods when the single top-ranked predictions are considered, or when a drug with no known targets or with few known targets is considered. We illustrate the usefulness and practicality of DASPfind by predicting novel DTIs for the Ion Channel dataset. The validated predictions suggest that DASPfind can be used as an efficient method to identify correct DTIs, thus reducing the cost of necessary experimental verifications in the process of drug discovery. DASPfind can be accessed online at: http://www.cbrc.kaust.edu.sa/daspfind. Graphical abstract: The conceptual workflow for predicting drug-target interactions using DASPfind.
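
The abstract describes scoring drug-target pairs through simple paths in a graph that combines known DTIs with drug-drug and target-target similarities. The sketch below illustrates that general idea only. The scoring rule (summing the product of edge weights over every simple path of bounded length), the helper names, and the toy graph are all assumptions made for illustration; the abstract does not give DASPfind's actual formula.

```python
def add_edge(graph, a, b, weight):
    """Store an undirected weighted edge in an adjacency dict."""
    graph.setdefault(a, {})[b] = weight
    graph.setdefault(b, {})[a] = weight

def simple_paths(graph, start, goal, max_len):
    """Yield simple paths (no repeated node) from start to goal
    that use at most max_len edges."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal and len(path) > 1:
            yield path
            continue  # do not extend a path through the goal
        if len(path) - 1 == max_len:
            continue
        for nxt in graph.get(node, {}):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))

def path_score(graph, drug, target, max_len=3):
    """Hypothetical score: sum over simple paths of the product of edge weights."""
    total = 0.0
    for path in simple_paths(graph, drug, target, max_len):
        weight = 1.0
        for a, b in zip(path, path[1:]):
            weight *= graph[a][b]
        total += weight
    return total

# Toy graph with made-up weights.
g = {}
add_edge(g, "drug1", "drug2", 0.8)      # drug-drug similarity
add_edge(g, "drug2", "target1", 1.0)    # known interaction
add_edge(g, "target1", "target2", 0.5)  # target-target similarity
add_edge(g, "drug1", "target2", 1.0)    # known interaction

score = path_score(g, "drug1", "target1")
print(score)
```

In this toy graph, drug1 has no direct edge to target1, yet it receives a nonzero score through a similar drug and through a similar target, which is how path-based methods can rank candidates for drugs with few known targets.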

Source
http://dx.doi.org/10.1186/s13321-016-0128-4 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4793623 (PMC)
March 2016

Mining Chemical Activity Status from High-Throughput Screening Assays.

PLoS One 2015 14;10(12):e0144426. Epub 2015 Dec 14.

King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia.

High-throughput screening (HTS) experiments provide a valuable resource that reports biological activity of numerous chemical compounds relative to their molecular targets. Building computational models that accurately predict such activity status (active vs. inactive) in specific assays is a challenging task given the large volume of data and the frequently small proportion of active compounds relative to the inactive ones. We developed a method, DRAMOTE, to predict activity status of chemical compounds in HTP activity assays. For a class of HTP assays, our method achieves considerably better results than the current state-of-the-art solutions. We achieved this by modification of a minority oversampling technique. To demonstrate that DRAMOTE performs better than the other methods, we performed a comprehensive comparison analysis with several other methods and evaluated them on data from 11 PubChem assays through 1,350 experiments that involved approximately 500,000 interactions between chemicals and their target proteins. As an example of potential use, we applied DRAMOTE to develop robust models for predicting FDA-approved drugs that have a high probability of interacting with the thyroid stimulating hormone receptor (TSHR) in humans. Our findings are further partially and indirectly supported by 3D docking results and literature information. The results based on approximately 500,000 interactions suggest that DRAMOTE has performed the best and that it can be used for developing robust virtual screening models. The datasets and implementation of all solutions are available as a MATLAB toolbox online at www.cbrc.kaust.edu.sa/dramote and can be found on Figshare.
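
The abstract states that DRAMOTE modifies a minority oversampling technique but does not describe the modification. As background only, here is a minimal sketch of the basic SMOTE-style idea that such techniques build on: synthetic minority samples are created by interpolating between a minority sample and one of its nearest minority neighbours. This is not DRAMOTE itself; the function name, parameters, and toy data are hypothetical.

```python
import random

def oversample_minority(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).

    Requires at least two minority samples.
    """
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of base among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        mate = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + t * (m - b) for b, m in zip(base, mate)))
    return synthetic

# Hypothetical 2-D descriptors for four active compounds.
actives = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = oversample_minority(actives, n_new=6)
print(new_points)
```

Because every synthetic point lies on a segment between two real minority samples, the added points stay inside the region the active compounds already occupy rather than being exact duplicates.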

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0144426 (PLOS)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4682830 (PMC)
June 2016

DWFS: a wrapper feature selection tool based on a parallel genetic algorithm.

PLoS One 2015 26;10(2):e0117988. Epub 2015 Feb 26.

King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal 23955-6900, Saudi Arabia.

Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. A smaller number of features reduces the problem's dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation simultaneously examines and evaluates a large number of candidate collections of features. DWFS also integrates various filtering methods that may be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of the GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.
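
The wrapper paradigm described in this abstract scores each candidate feature subset by training and evaluating a classifier on it, while a genetic algorithm searches over subsets. The sketch below illustrates that combination under stated assumptions: the nearest-centroid classifier, the fitness function (training accuracy minus a per-feature penalty), the GA operators, and the toy data are all hypothetical choices, and the loop is serial rather than parallel as in DWFS.

```python
import random

def nearest_centroid_accuracy(X, y, mask):
    """Training-set accuracy of a nearest-centroid classifier on masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, lab in zip(X, y):
        pred = min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )
        correct += pred == lab
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    """Hypothetical fitness: accuracy minus a small cost per selected feature."""
    return nearest_centroid_accuracy(X, y, mask) - penalty * sum(mask)

def ga_select(X, y, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Search feature masks with truncation selection, one-point crossover,
    bit-flip mutation, and elitism."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(X, y, m), reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0][:]]  # elitism: carry over the best mask unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
    return max(pop, key=lambda m: fitness(X, y, m))

# Toy data: feature 0 separates the classes; features 1 and 2 carry no signal.
X = [[0.0, 0.3, 0.7], [0.1, 0.9, 0.2], [0.2, 0.5, 0.5],
     [0.9, 0.3, 0.7], [1.0, 0.9, 0.2], [0.8, 0.5, 0.5]]
y = [0, 0, 0, 1, 1, 1]
best = ga_select(X, y)
print(best, fitness(X, y, best))
```

The per-feature penalty plays the role of the adjustable fitness weights mentioned in the abstract: raising it pushes the search toward smaller subsets, lowering it favours raw accuracy.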

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0117988 (PLOS)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4342225 (PMC)
January 2016