Publications by authors named "Joshua W K Ho"

85 Publications

Generalized and scalable trajectory inference in single-cell omics data with VIA.

Nat Commun 2021 09 20;12(1):5528. Epub 2021 Sep 20.

Department of Electrical & Electronic Engineering, The University of Hong Kong, Pokfulam, Hong Kong.

Inferring cellular trajectories using a variety of omic data is a critical task in single-cell data science. However, accurate prediction of cell fates, and thereby biologically meaningful discovery, is challenged by the sheer size of single-cell data, the diversity of omic data types, and the complexity of their topologies. We present VIA, a scalable trajectory inference algorithm that overcomes these limitations by using lazy-teleporting random walks to accurately reconstruct complex cellular trajectories beyond tree-like pathways (e.g., cyclic or disconnected structures). We show that VIA robustly and efficiently unravels the fine-grained sub-trajectories in a 1.3-million-cell transcriptomic mouse atlas without losing the global connectivity at such a high cell count. We further apply VIA to discovering elusive lineages and less populous cell fates missed by other methods across a variety of data types, including single-cell proteomic, epigenomic, multi-omics datasets, and a new in-house single-cell morphological dataset.
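
The lazy-teleporting random walk mentioned in the abstract can be illustrated with a minimal sketch. This is not VIA's implementation: the mixing rule, the parameter names `lazy` and `teleport`, their default values, and the toy graph are all assumptions chosen for illustration. The walker jumps to a uniformly random node with probability `teleport`; otherwise it stays in place with probability `lazy` or follows an edge.

```python
def lazy_teleporting_matrix(adjacency, lazy=0.3, teleport=0.1):
    """Row-stochastic transition matrix of a lazy-teleporting random walk.

    `lazy` and `teleport` are hypothetical parameter names and values,
    not taken from the VIA paper.
    """
    n = len(adjacency)
    matrix = []
    for i, row in enumerate(adjacency):
        degree = sum(row)
        new_row = []
        for j, weight in enumerate(row):
            # Plain random-walk step; an isolated node simply stays in place.
            step = weight / degree if degree > 0 else float(i == j)
            stay = 1.0 if i == j else 0.0
            walk = lazy * stay + (1.0 - lazy) * step
            new_row.append((1.0 - teleport) * walk + teleport / n)
        matrix.append(new_row)
    return matrix


def stationary_distribution(matrix, iterations=500):
    """Approximate the stationary distribution by power iteration."""
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist


# Toy graph: a 3-cycle (nodes 0-2) plus a disconnected pair (nodes 3-4).
ADJACENCY = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
P = lazy_teleporting_matrix(ADJACENCY)
PI = stationary_distribution(P)
```

Because teleportation gives every node a small probability of reaching every other node, the walk remains well defined on cyclic and disconnected graphs such as this toy example, which is the kind of topology the abstract highlights.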

Source
DOI: http://dx.doi.org/10.1038/s41467-021-25773-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8452770
September 2021

Automatic flow delay through passive wax valves for paper-based analytical devices.

Lab Chip 2021 Sep 20. Epub 2021 Sep 20.

School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen, Shenzhen 518055, China.

Microfluidic paper-based analytical devices (μPADs) have been widely explored for point-of-care testing due to their simplicity, low cost, and portability. μPADs with multiple-step reactions usually require precise flow control, especially flow delay. This paper reports numerical, mathematical, and experimental studies of flow delay through wax valves surrounded by PDMS walls on paper microfluidics. The predried surfactant in the sample zone diffuses into the liquid sample, which can therefore flow through the wax valves. The delay time is automatically regulated by the diffusion of the surfactant after sample loading. The numerical study suggested that both the elevated contact angle and the reduced porosity and pore size in the wax-printed region could effectively block water but allow liquids with lower contact angles (e.g., surfactant solutions) to flow through. The PDMS walls, fabricated using a low-cost liquid dispenser, effectively prevented the leakage of surfactant solutions. By controlling the quantity, diffusion distance, and type of the surfactant predried on the chip, the system successfully achieved delay times ranging from 1.6 to 20 minutes. A mathematical model involving the above parameters was developed based on Fick's second law to predict the delay time. Finally, the flow-delay systems were applied in sequential mixing and distance-based detection of either glucose or alcohol. Linear ranges of 1-100 mg dL⁻¹ and 1-40 mg dL⁻¹ were achieved for glucose and alcohol, respectively. The lower limit of detection (LOD) of glucose and alcohol was 1 mg dL⁻¹. The LOD of glucose was only 1/11 of that detected using μPADs without flow control, indicating the advantage of controlling fluid flow. The systematic findings in this study provide critical guidelines for the development and applications of wax valves in automatic flow delay for point-of-care testing.
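
The abstract does not give the delay-time model itself, so the following is only a textbook idealization of how Fick's second law links diffusion distance to delay time. The symbols are assumptions introduced here: $C$ is the surfactant concentration, $D$ its diffusion coefficient, $C_0$ a constant source concentration at $x = 0$, $L$ the diffusion distance to the valve, and $C^{*}$ the concentration at which the valve opens. The finite amount of predried surfactant on a real chip means the constant-source boundary condition is a simplification.

```latex
% Fick's second law in one dimension
\frac{\partial C}{\partial t} = D \, \frac{\partial^{2} C}{\partial x^{2}}

% Semi-infinite medium, C(x, 0) = 0 and C(0, t) = C_0:
C(x, t) = C_0 \, \operatorname{erfc}\!\left( \frac{x}{2\sqrt{D t}} \right)

% Delay time t_d at which the concentration at x = L reaches C^*:
t_d = \frac{L^{2}}{4 D \left[ \operatorname{erfc}^{-1}\!\left( C^{*} / C_0 \right) \right]^{2}}
```

Under these assumptions the delay grows with the square of the diffusion distance and shrinks as the source concentration rises, which is consistent with the abstract's statement that quantity, diffusion distance, and surfactant type control the delay.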

Source
DOI: http://dx.doi.org/10.1039/d1lc00638j
September 2021

Genetic screening reveals phospholipid metabolism as a key regulator of the biosynthesis of the redox-active lipid coenzyme Q.

Redox Biol 2021 10 8;46:102127. Epub 2021 Sep 8.

Heart Research Institute, The University of Sydney, Sydney, New South Wales, Australia; Victor Chang Cardiac Research Institute, Sydney, Australia; St Vincent's Clinical School, University of New South Wales, Sydney, Australia; School of Life and Environmental Sciences, The University of Sydney, Sydney, Australia.

Mitochondrial energy production and function rely on optimal concentrations of the essential redox-active lipid, coenzyme Q (CoQ). CoQ deficiency results in mitochondrial dysfunction associated with increased mitochondrial oxidative stress and a range of pathologies. What drives CoQ deficiency in many of these pathologies is unknown, just as there currently is no effective therapeutic strategy to overcome CoQ deficiency in humans. To date, large-scale studies aimed at systematically interrogating endogenous systems that control CoQ biosynthesis and their potential utility to treat disease have not been carried out. Therefore, we developed a quantitative high-throughput method to determine CoQ concentrations in yeast cells. Applying this method to the Yeast Deletion Collection as a genome-wide screen, 30 genes not known previously to regulate cellular concentrations of CoQ were discovered. In combination with untargeted lipidomics and metabolomics, phosphatidylethanolamine N-methyltransferase (PEMT) deficiency was confirmed as a positive regulator of CoQ synthesis, the first identified to date. Mechanistically, PEMT deficiency alters mitochondrial concentrations of one-carbon metabolites, characterized by an increase in the S-adenosylmethionine to S-adenosylhomocysteine (SAM-to-SAH) ratio that reflects mitochondrial methylation capacity, drives CoQ synthesis, and is associated with a decrease in mitochondrial oxidative stress. The newly described regulatory pathway appears evolutionarily conserved, as ablation of PEMT using antisense oligonucleotides increases mitochondrial CoQ in mouse-derived adipocytes, which translates to improved glucose utilization by these cells and protection of mice from high-fat diet-induced insulin resistance. Our studies reveal a previously unrecognized relationship between two spatially distinct lipid pathways with potential implications for the treatment of CoQ deficiencies, mitochondrial oxidative stress/dysfunction, and associated diseases.

Source
DOI: http://dx.doi.org/10.1016/j.redox.2021.102127
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8435697
October 2021

The method to quantify cell elasticity based on the precise measurement of pressure inducing cell deformation in microfluidic channels.

MethodsX 2021 23;8:101247. Epub 2021 Jan 23.

School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen, Shenzhen 518055, China.

Cell elasticity has attracted extensive research interest since it not only provides new insights into cell biology but is also an emerging mechanical marker for the diagnosis of some diseases. This paper reports a method for the precise measurement of the mechanical properties of single cells deformed to a large extent, using a novel microfluidic system integrated with a pressure feedback system and a small-particle separation unit. The particle separation system was employed to avoid blockage of the cell deformation channel and thereby enhance the measurement throughput. This system has remarkable application potential in the precise evaluation of cell mechanical properties. In brief, this paper reports:
• The manufacturing of the chip using standard soft lithography;
• The methods to deform single cells in a microchannel and measure the relevant pressure drop using a pressure sensor connected to the microfluidic chip;
• Calculation of the mechanical properties, including stiffness and fluidity, of each cell based on a power-law rheology model describing the viscoelastic behaviors of cells;
• Automatic and real-time measurement of the mechanical properties using video processing software.
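
As a hedged illustration of how stiffness and fluidity can be read off a power-law rheology model, the sketch below assumes the commonly used form strain(t) = (pressure / stiffness) * (t / t0) ** fluidity and fits it by least squares in log-log space. The exact model, units, and fitting procedure used in the paper are not given in the abstract; the function name, `t0`, and the synthetic data are assumptions.

```python
import math


def fit_power_law(times, strains, pressure, t0=1.0):
    """Fit strain(t) = (pressure / stiffness) * (t / t0) ** fluidity.

    Uses ordinary least squares on log-transformed data and returns
    (stiffness, fluidity). The model form is an assumption for illustration.
    """
    xs = [math.log(t / t0) for t in times]
    ys = [math.log(s) for s in strains]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    fluidity = sxy / sxx                      # slope of the log-log line
    intercept = mean_y - fluidity * mean_x    # equals log(pressure / stiffness)
    stiffness = pressure / math.exp(intercept)
    return stiffness, fluidity


# Synthetic, noise-free data generated from known (made-up) parameters.
TRUE_STIFFNESS, TRUE_FLUIDITY, PRESSURE = 500.0, 0.25, 100.0
TIMES = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
STRAINS = [(PRESSURE / TRUE_STIFFNESS) * t ** TRUE_FLUIDITY for t in TIMES]
E_FIT, BETA_FIT = fit_power_law(TIMES, STRAINS, PRESSURE)
```

In this form, a fluidity of 0 corresponds to a purely elastic response and a fluidity of 1 to a purely viscous one, so the two fitted numbers summarize the viscoelastic behavior of each cell.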

Source
DOI: http://dx.doi.org/10.1016/j.mex.2021.101247
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8374187
January 2021

Deep Learning for Clinical Image Analyses in Oral Squamous Cell Carcinoma: A Review.

JAMA Otolaryngol Head Neck Surg 2021 Oct;147(10):893-900

Division of Oral and Maxillofacial Surgery, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, China.

Importance: Oral squamous cell carcinoma (SCC) is a lethal malignant neoplasm with a high rate of tumor metastasis and recurrence. Accurate diagnosis, prognosis prediction, and metastasis detection can improve patient outcomes. Deep learning for clinical image analysis can be used for diagnosis and prognosis in cancers, including oral SCC; its use in these areas can improve patient care and outcome.

Observations: This review is a summary of the use of deep learning models for diagnosis, prognosis, and metastasis detection for oral SCC by analyzing information from pathological and radiographic images. Specifically, deep learning has been used to classify different cell types, to differentiate cancer cells from nonmalignant cells, and to identify oral SCC from other cancer types. It can also be used to predict survival, to differentiate between tumor grades, and to detect lymph node metastasis. In general, the performance of these deep learning models has an accuracy ranging from 77.89% to 97.51% and 76% to 94.2% with the use of pathological and radiographic images, respectively. The review also discusses the importance of using good-quality clinical images in sufficient quantity on model performance.

Conclusions And Relevance: Applying pathological and radiographic images in deep learning models for diagnosis and prognosis of oral SCC has been explored, and most studies report results showing good classification accuracy. The successful use of deep learning in these areas has a high clinical translatability in the improvement of patient care.

Source
DOI: http://dx.doi.org/10.1001/jamaoto.2021.2028
October 2021

FlowGrid enables fast clustering of very large single-cell RNA-seq data.

Bioinformatics 2021 Jul 20. Epub 2021 Jul 20.

School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China.

Motivation: Scalable clustering algorithms are needed to analyse millions of cells in single cell RNA-seq (scRNA-seq) data.

Results: Here we present an open source Python package called FlowGrid that can integrate into the Scanpy workflow to perform clustering on very large scRNA-seq data sets. FlowGrid implements a fast density-based clustering algorithm originally designed for flow cytometry data analysis. We introduce a new automated parameter tuning procedure, and show that FlowGrid can achieve clustering accuracy comparable to that of state-of-the-art clustering algorithms but at a substantially reduced run time for very large single cell RNA-seq data sets. For example, FlowGrid can complete a 1-hour clustering task for one million cells in about 5 minutes.
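
To give a feel for why grid-based density clustering scales well, here is a minimal sketch of the general idea: points are binned into equal-width grid cells, sparse cells are discarded, and adjacent dense cells are merged into clusters. It is not the FlowGrid implementation or its API; the function name, the parameters `bin_width` and `min_count`, and the toy data are assumptions. The cost is driven by the number of occupied bins rather than by pairwise distances between cells.

```python
from collections import deque
from itertools import product


def grid_cluster(points, bin_width, min_count):
    """Label points by connected components of dense grid bins (-1 = noise)."""
    # 1. Assign every point to an integer grid bin.
    bins = {}
    for index, point in enumerate(points):
        key = tuple(int(coord // bin_width) for coord in point)
        bins.setdefault(key, []).append(index)

    # 2. Keep only bins holding at least `min_count` points.
    dense = {key for key, members in bins.items() if len(members) >= min_count}

    # 3. Merge dense bins that touch (including diagonally) via breadth-first search.
    dims = len(points[0])
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]
    bin_label = {}
    next_label = 0
    for start in sorted(dense):
        if start in bin_label:
            continue
        bin_label[start] = next_label
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for offset in offsets:
                neighbour = tuple(c + o for c, o in zip(current, offset))
                if neighbour in dense and neighbour not in bin_label:
                    bin_label[neighbour] = next_label
                    queue.append(neighbour)
        next_label += 1

    # 4. Points inherit the label of their bin; points in sparse bins stay noise.
    labels = [-1] * len(points)
    for key, label in bin_label.items():
        for index in bins[key]:
            labels[index] = label
    return labels


# Made-up 2-D points: one cluster spanning two adjacent bins, one distant
# cluster, and a single outlier.
POINTS = [
    (0.1, 0.1), (0.2, 0.3), (0.4, 0.2),
    (1.1, 0.2), (1.3, 0.4), (1.2, 0.1),
    (5.1, 5.2), (5.3, 5.4), (5.2, 5.1),
    (9.5, 0.5),
]
LABELS = grid_cluster(POINTS, bin_width=1.0, min_count=3)
```

In practice such a method would run on a low-dimensional embedding (for example, a few principal components), because the number of neighbouring bins grows exponentially with dimension.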

Availability: https://github.com/holab-hku/FlowGrid.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btab521
July 2021

Machine learning application for the prediction of SARS-CoV-2 infection using blood tests and chest radiograph.

Sci Rep 2021 07 9;11(1):14250. Epub 2021 Jul 9.

Department of Diagnostic Radiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, SAR, China.

Triaging and prioritising patients for RT-PCR testing has been essential in the management of COVID-19 in resource-scarce countries. In this study, we applied machine learning (ML) to the task of detecting SARS-CoV-2 infection using basic laboratory markers. We performed statistical analysis and trained an ML model on a retrospective cohort of 5148 patients from 24 hospitals in Hong Kong to classify COVID-19 and other aetiologies of pneumonia. We validated the model on three temporal validation sets from different waves of infection in Hong Kong. For predicting SARS-CoV-2 infection, the ML model achieved high AUCs and specificity but low sensitivity in all three validation sets (AUC: 89.9-95.8%; Sensitivity: 55.5-77.8%; Specificity: 91.5-98.3%). When used in conjunction with radiologist interpretations of chest radiographs, the sensitivity was over 90% while keeping moderate specificity. Our study showed that a machine learning model based on readily available laboratory markers could achieve high accuracy in predicting SARS-CoV-2 infection.
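
The trade-off described above, where adding a second reader raises sensitivity at some cost in specificity, can be sketched with a simple "either test positive" rule. The abstract does not say how the model output and the radiologist interpretation were combined, so the OR rule is an assumption, and the labels below are invented toy values rather than study data.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    return tp / (tp + fn), tn / (tn + fp)


def either_positive(first, second):
    """Call a case positive if either reader flags it (assumed OR rule)."""
    return [int(a or b) for a, b in zip(first, second)]


# Invented toy labels: 4 infected and 6 non-infected patients.
TRUTH       = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
MODEL       = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
RADIOLOGIST = [0, 1, 1, 1, 0, 0, 0, 0, 1, 0]

MODEL_ONLY = sensitivity_specificity(TRUTH, MODEL)
COMBINED = sensitivity_specificity(TRUTH, either_positive(MODEL, RADIOLOGIST))
```

Under an OR rule a true case is missed only if both readers miss it, so sensitivity can only rise, while every false alarm from either reader counts, so specificity can only fall.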

Source
DOI: http://dx.doi.org/10.1038/s41598-021-93719-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8270945
July 2021

The method to dynamically screen and print single cells using microfluidics with pneumatic microvalves.

MethodsX 2021 28;8:101190. Epub 2020 Dec 28.

School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen, Shenzhen 518055, China.

Printing single cells into individual chambers is of critical importance for single-cell analysis using traditional equipment, for instance, single-cell clonal expansion or sequencing. The size of cells often reflects their types, functions, and even cell cycle phases. Therefore, printing individual cells within a desired size range has important application potential in single-cell analysis. This paper presents a method for the development of a microfluidic chip integrating pneumatic microvalves to print single cells of appropriate size into standard well plates. The reported method provides essential guidelines for the fabrication of multi-layer microfluidic chips, control of the membrane deflection to screen cell size, and printing of single cells. In brief, this paper reports:
• the manufacturing of the chip using standard soft lithography;
• the protocol to dynamically screen both the lower and the upper size limit of cells passing through the valves by deflection of the valve membrane;
• the screening and dispensing of suspended human umbilical vein endothelial cells (HUVECs) into 384-well plates with high viability.

Source
DOI: http://dx.doi.org/10.1016/j.mex.2020.101190
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7779779
December 2020

Computed tomography-based deep-learning prediction of neoadjuvant chemoradiotherapy treatment response in esophageal squamous cell carcinoma.

Radiother Oncol 2021 01 15;154:6-13. Epub 2020 Sep 15.

Department of Thoracic Surgery, Sun Yat-sen University Cancer Center, Guangzhou, China; State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China; Guangdong Esophageal Cancer Institute, Guangzhou, China.

Background: Deep learning shows promise for predicting treatment response. We aimed to evaluate and validate the predictive performance of a CT-based model using deep learning features for predicting pathologic complete response to neoadjuvant chemoradiotherapy (nCRT) in esophageal squamous cell carcinoma (ESCC).

Materials And Methods: Patients were retrospectively enrolled between April 2007 and December 2018 from two institutions. We extracted deep learning features of six pre-trained convolutional neural networks, respectively, from pretreatment CT images in the training cohort (n = 161). Support vector machine was adopted as the classifier. Validation was performed in an external testing cohort (n = 70). We assessed the performance using the area under the receiver operating characteristics curve (AUC) and selected an optimal model, which was compared with a radiomics model developed from the training cohort. A clinical model consisting of clinical factors only was also built for baseline comparison. We further conducted a radiogenomics analysis using gene expression profiles to reveal underlying biology associated with radiological prediction.

Results: The optimal model with features extracted from ResNet50 achieved an AUC and accuracy of 0.805 (95% CI, 0.696-0.913) and 77.1% (65.6%-86.3%) in the testing cohort, compared with 0.725 (0.605-0.846) and 67.1% (54.9%-77.9%) for the radiomics model. All the radiological models showed better predictive performance than the clinical model. Radiogenomics analysis suggested a potential association mainly with the WNT signaling pathway and the tumor microenvironment.

Conclusions: The novel and noninvasive deep learning approach could provide efficient and accurate prediction of treatment response to nCRT in ESCC, and benefit clinical decision making of therapeutic strategy.

Source
DOI: http://dx.doi.org/10.1016/j.radonc.2020.09.014
January 2021

Scavenger: A pipeline for recovery of unaligned reads utilising similarity with aligned reads.

F1000Res 2019 4;8:1587. Epub 2019 Sep 4.

Victor Chang Cardiac Research Institute, Sydney, NSW, 2010, Australia.

Read alignment is an important step in RNA-seq analysis, as the result of alignment forms the basis for downstream analyses. However, recent studies have shown that published alignment tools have variable mapping sensitivity and do not necessarily align all the reads that should have been aligned, a problem we term the false-negative non-alignment problem. Here we present Scavenger, a Python-based bioinformatics pipeline for recovering unaligned reads using a novel mechanism in which a putative alignment location is discovered based on sequence similarity between aligned and unaligned reads. We showed that Scavenger could recover unaligned reads in a range of simulated and real RNA-seq datasets, including single-cell RNA-seq data. We found that recovered reads tend to contain more genetic variants with respect to the reference genome compared to previously aligned reads, indicating that divergence between personal and reference genomes plays a role in the false-negative non-alignment problem. Even when the number of recovered reads is relatively small compared to the total number of reads, the addition of these recovered reads can impact downstream analyses, especially in terms of estimating the expression and differential expression of lowly expressed genes, such as pseudogenes.
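
The core idea of borrowing an alignment location from a similar, already-aligned read can be sketched with a toy k-mer comparison. This is not Scavenger's pipeline, which the abstract does not describe in detail; the function names, the k-mer overlap score, the thresholds, and the example reads and positions are all assumptions.

```python
def kmers(sequence, k):
    """Return the set of all length-k substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def putative_location(unaligned_read, aligned_reads, k=4, min_shared=0.5):
    """Suggest a location for an unaligned read from its most similar aligned read.

    `aligned_reads` is a list of (sequence, position) pairs. Similarity is the
    fraction of the unaligned read's k-mers found in the aligned read. Returns
    the position of the best match, or None if no match reaches `min_shared`.
    """
    query = kmers(unaligned_read, k)
    if not query:  # read shorter than k
        return None
    best_position, best_score = None, 0.0
    for sequence, position in aligned_reads:
        score = len(query & kmers(sequence, k)) / len(query)
        if score > best_score:
            best_position, best_score = position, score
    return best_position if best_score >= min_shared else None


# Made-up reads: the unaligned read differs from the first aligned read by
# a single base, mimicking a variant relative to the reference.
ALIGNED = [("ACGTACGTTAGC", 100), ("TTTTGGGGCCCC", 500)]
RECOVERED = putative_location("ACGTACCTTAGC", ALIGNED)
UNRECOVERED = putative_location("AAAAAAAAAAAA", ALIGNED)
```

A real pipeline would then need to verify the suggested location, for example by re-aligning the read there, since k-mer overlap alone does not establish a valid alignment.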

Source
DOI: http://dx.doi.org/10.12688/f1000research.19426.1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7459848
October 2020

A High-Throughput Genome-Integrated Assay Reveals Spatial Dependencies Governing Tcf7l2 Binding.

Cell Syst 2020 09 9;11(3):315-327.e5. Epub 2020 Sep 9.

Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Hubrecht Institute, 3584 CT Utrecht, the Netherlands.

Predicting where transcription factors bind in the genome from their in vitro DNA-binding affinity is confounded by the large number of possible interactions with nearby transcription factors. To characterize the in vivo binding logic for the Wnt effector Tcf7l2, we developed a high-throughput screening platform in which thousands of synthesized DNA phrases are inserted into a specific genomic locus, followed by measurement of Tcf7l2 binding by DamID. Using this platform at two genomic loci in mouse embryonic stem cells, we show that while the binding of Tcf7l2 closely follows the in vitro motif-binding strength and is influenced by local chromatin accessibility, it is also strongly affected by the surrounding 99 bp of sequence. Through controlled sequence perturbation, we show that Oct4 and Klf4 motifs promote Tcf7l2 binding, particularly in the adjacent ∼50 bp and oscillating with a 10.8-bp phasing relative to these cofactor motifs, which matches the turn of a DNA helix.

Source
DOI: http://dx.doi.org/10.1016/j.cels.2020.08.004
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7530048
September 2020

Assessment of Intratumoral and Peritumoral Computed Tomography Radiomics for Predicting Pathological Complete Response to Neoadjuvant Chemoradiation in Patients With Esophageal Squamous Cell Carcinoma.

JAMA Netw Open 2020 09 1;3(9):e2015927. Epub 2020 Sep 1.

Department of Diagnostic Radiology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong SAR, China.

Importance: For patients with locally advanced esophageal squamous cell carcinoma, neoadjuvant chemoradiation has been shown to improve long-term outcomes, but the treatment response varies among patients. Accurate pretreatment prediction of response remains an urgent need.

Objective: To determine whether peritumoral radiomics features derived from baseline computed tomography images could provide valuable information about neoadjuvant chemoradiation response and enhance the ability of intratumoral radiomics to estimate pathological complete response.

Design, Setting, And Participants: A total of 231 patients with esophageal squamous cell carcinoma, who underwent baseline contrast-enhanced computed tomography and received neoadjuvant chemoradiation followed by surgery at 2 institutions in China, were consecutively included. This diagnostic study used single-institution data between April 2007 and December 2018 to extract radiomics features from intratumoral and peritumoral regions and established intratumoral, peritumoral, and combined radiomics models using different classifiers. External validation was conducted using independent data collected from another hospital during the same period. Radiogenomics analysis using gene expression profile was done in a subgroup of the training set for pathophysiological explanation. Data were analyzed from June to December 2019.

Exposures: Computed tomography-based radiomics.

Main Outcomes And Measures: The discriminative performances of radiomics models were measured by area under the receiver operating characteristic curve.
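
For readers unfamiliar with the metric, the area under the receiver operating characteristic curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition; the scores and labels are invented toy values, not study data, and the function name is an assumption.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive correctly ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Invented model scores for three responders (1) and three non-responders (0).
SCORES = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
LABELS = [1, 1, 1, 0, 0, 0]
AUC = auc(SCORES, LABELS)  # 8 of the 9 positive-negative pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for comparing the values reported for the different radiomics models.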

Results: Among the 231 patients included (192 men [83.1%]; mean [SD] age, 59.8 [8.7] years), the optimal intratumoral and peritumoral radiomics models yielded similar areas under the receiver operating characteristic curve of 0.730 (95% CI, 0.609-0.850) and 0.734 (0.613-0.854), respectively. The combined model was composed of 7 intratumoral and 6 peritumoral features and achieved better discriminative performance, with an area under the receiver operating characteristic curve of 0.852 (95% CI, 0.753-0.951), accuracy of 84.3%, sensitivity of 90.3%, and specificity of 79.5% in the test set. Gene sets associated with the combined model mainly involved lymphocyte-mediated immunity. The association of peritumoral area with response identification might be partially attributed to type I interferon-related biological process.

Conclusions And Relevance: A combination of peritumoral radiomics features appears to improve the predictive performance of intratumoral radiomics to estimate pathological complete response after neoadjuvant chemoradiation in patients with esophageal squamous cell carcinoma. This study underlines the significant application of peritumoral radiomics to assess treatment response in clinical practice.

Source
DOI: http://dx.doi.org/10.1001/jamanetworkopen.2020.15927
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7489831
September 2020

Biophysical Review's 'meet the editors series'-a profile of Joshua W. K. Ho.

Authors:
Joshua W K Ho

Biophys Rev 2020 Aug 28;12(4):745-748. Epub 2020 Jul 28.

School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong.

It is my pleasure to write a few words to introduce myself to the readers of Biophysical Reviews as part of the 'meet the editors' series.

Source
DOI: http://dx.doi.org/10.1007/s12551-020-00744-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7429651
August 2020

Hactive: a smartphone application for heart rate profiling.

Biophys Rev 2020 Aug 14;12(4):777-779. Epub 2020 Jul 14.

Victor Chang Cardiac Research Institute, Darlinghurst, NSW, 2010, Australia.

With advancements in popular modern wearable devices, such as Apple Watch and Fitbit, it is now possible to harness these technologies for continuous monitoring and recording of heart rate data, which can then be used for medical research and ultimately e-health applications. In this paper, we report the development of a new mobile smartphone application (app) that enables heart rate profiles to be extracted and analysed from continuous heart rate monitoring time series. The new iOS app, called Hactive, extracts heart rate data from Apple's smartwatches to construct heart rate profiles. A key innovation is Hactive's ability to detect and analyse exercise-associated heart rate changes from continuous heart rate data, which enables heart rate profiles to be constructed based on free-living conditions. We believe this tool advances the use of wearable technology to collect physiologically relevant big data for healthcare and medical research. The source code of Hactive is available via an MIT open source licence at https://github.com/VCCRI/hactive .

Source
DOI: http://dx.doi.org/10.1007/s12551-020-00731-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7429606
August 2020

Sierra: discovery of differential transcript usage from polyA-captured single-cell RNA-seq data.

Genome Biol 2020 07 8;21(1):167. Epub 2020 Jul 8.

School of Mathematics and Statistics, Faculty of Science, The University of Sydney, Camperdown, 2006, Australia.

High-throughput single-cell RNA-seq (scRNA-seq) is a powerful tool for studying gene expression in single cells. Most current scRNA-seq bioinformatics tools focus on analysing overall expression levels, largely ignoring alternative mRNA isoform expression. We present a computational pipeline, Sierra, that readily detects differential transcript usage from data generated by commonly used polyA-captured scRNA-seq technology. We validate Sierra by comparing cardiac scRNA-seq cell types to bulk RNA-seq of matched populations, finding significant overlap in differential transcripts. Sierra detects differential transcript usage across human peripheral blood mononuclear cells and the Tabula Muris, and 3′ UTR shortening in cardiac fibroblasts. Sierra is available at https://github.com/VCCRI/Sierra .

Source
DOI: http://dx.doi.org/10.1186/s13059-020-02071-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7341584
July 2020

Challenges and emerging systems biology approaches to discover how the human gut microbiome impact host physiology.

Biophys Rev 2020 Aug 7;12(4):851-863. Epub 2020 Jul 7.

School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong.

Research in the human gut microbiome has bloomed with advances in next generation sequencing (NGS) and other high-throughput molecular profiling technologies. This has enabled the generation of multi-omics datasets, which hold promise for big data-enabled knowledge acquisition in the form of understanding the normal physiological and pathological involvement of gut microbiomes. Ample evidence suggests that distinct microbial compositions in the human gut are associated with different diseases. However, the biological mechanisms underlying these associations are often unclear. There is a need to move beyond statistical associations to discover how changes in the gut microbiota mechanistically affect host physiology and disease development. This review summarises state-of-the-art big data and systems biology approaches for mechanism discovery.

Source
DOI: http://dx.doi.org/10.1007/s12551-020-00724-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7429608
August 2020

Introduction to the Special Issue on GIW/ABACBS 2019.

Authors:
Joshua W K Ho

J Bioinform Comput Biol 2020 02;18(1):2002001

School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, 21 Sassoon Road, Pokfulam, Hong Kong SAR, China.

Source
DOI: http://dx.doi.org/10.1142/S0219720020020011
February 2020

Multi-omic profiling reveals associations between the gut mucosal microbiome, the metabolome, and host DNA methylation associated gene expression in patients with colorectal cancer.

BMC Microbiol 2020 04 23;20(Suppl 1):83. Epub 2020 Apr 23.

State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.

Background: The human gut microbiome plays a critical role in the carcinogenesis of colorectal cancer (CRC). However, a comprehensive analysis of the interaction between the host and microbiome is still lacking.

Results: We found correlations between the change in abundance of microbial taxa, butyrate-related colonic metabolites, and methylation-associated host gene expression in colonic tumour mucosa tissues compared with the adjacent normal mucosa tissues. The increase in genus Fusobacterium abundance was correlated with a decrease in the level of 4-hydroxybutyric acid (4-HB) and in the expression of immune-related peptidase inhibitor 16 (PI16), Fc Receptor Like A (FCRLA) and Lymphocyte Specific Protein 1 (LSP1). The decrease in the abundance of another potentially 4-HB-associated genus, Prevotella 2, was also found to be correlated with the down-regulated expression of metallothionein 1M (MT1M). Additionally, the increase in the glutamic acid-related family Halomonadaceae was correlated with the decreased expression of reelin (RELN). The decreased abundances of genus Paeniclostridium and genus Enterococcus were correlated with an increased lactic acid level, and were also linked to the expression change of Phospholipase C Beta 1 (PLCB1) and Immunoglobulin Superfamily Member 9 (IGSF9), respectively. Interestingly, 4-HB, glutamic acid and lactic acid are all butyrate precursors, which may modify gene expression by epigenetic regulation such as DNA methylation.

Conclusions: Our study identified associations between previously reported CRC-related microbial taxa, butyrate-related metabolites and DNA methylation-associated gene expression in tumour and normal colonic mucosa tissues from CRC patients, which uncovered a possible mechanism of the role of microbiome in the carcinogenesis of CRC. In addition, these findings offer insight into potential new biomarkers, therapeutic and/or prevention strategies for CRC.

Source
DOI: http://dx.doi.org/10.1186/s12866-020-01762-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7178946
April 2020

dv-trio: a family-based variant calling pipeline using DeepVariant.

Bioinformatics 2020 06;36(11):3549-3551

Victor Chang Cardiac Research Institute, Sydney, Australia.

Motivation: In 2018, Google published an innovative variant caller, DeepVariant, which converts pileups of sequence reads into images and uses a deep neural network to identify single-nucleotide variants and small insertion/deletions from next-generation sequencing data. This approach outperforms existing state-of-the-art tools. However, DeepVariant was designed to call variants within a single sample. In disease sequencing studies, the ability to examine a family trio (father-mother-affected child) provides greater power for disease mutation discovery.

Results: To further improve DeepVariant's variant calling accuracy in family-based sequencing studies, we have developed a family-based variant calling pipeline, dv-trio, which incorporates the trio information from the Mendelian genetic model into variant calling based on DeepVariant.

Availability and Implementation: dv-trio is available under an open source BSD3 license on GitHub (https://github.com/VCCRI/dv-trio/).

Supplementary Information: Supplementary data are available at Bioinformatics online.
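
The Mendelian constraint mentioned in the Results can be illustrated with a minimal sketch. This is not dv-trio's code: the function name `is_mendelian_consistent` and the tuple genotype encoding are assumptions made for illustration, and the check covers only autosomal diploid sites while ignoring de novo mutations and genotype likelihoods.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """Return True if the child's genotype can be formed by taking
    one allele from the father and one from the mother.

    Genotypes are 2-tuples of allele indices, e.g. (0, 1) for a
    heterozygous site (0 = reference allele, 1 = alternate allele).
    """
    # Enumerate every father-allele / mother-allele pairing; sorting
    # makes the comparison independent of allele order.
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(child)) in possible


# Hypothetical trio genotypes, for illustration only.
print(is_mendelian_consistent((0, 1), (0, 0), (0, 1)))  # child allele 1 can come from the father
print(is_mendelian_consistent((0, 0), (0, 0), (0, 1)))  # neither parent carries allele 1
```

A child call that fails this check is either a genotyping error in one of the three samples or a candidate de novo variant, which is why trio information can help refine calls made per sample.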

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa116
June 2020

Cellular diversity and lineage trajectory: insights from mouse single cell transcriptomes.

Development 2020 01 24;147(2). Epub 2020 Jan 24.

School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.

Single cell RNA-sequencing (scRNA-seq) technology has matured to the point that it is possible to generate large single cell atlases of developing mouse embryos. These atlases allow the dissection of developmental cell lineages and molecular changes during embryogenesis. When coupled with single cell technologies for profiling the chromatin landscape, epigenome, proteome and metabolome, and spatial tissue organisation, these scRNA-seq approaches can now collect a large volume of multi-omic data about mouse embryogenesis. In addition, advances in computational techniques have enabled the inference of developmental lineages of differentiating cells, even without explicitly introduced genetic markers. This Spotlight discusses the recent advent of single cell experimental and computational methods, and key insights from applying these methods to the study of mouse embryonic development. We highlight challenges in analysing and interpreting these data to complement and expand our knowledge from traditional developmental biology studies in relation to cell identity, diversity and lineage differentiation.

Source
DOI: http://dx.doi.org/10.1242/dev.179788
January 2020

PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells.

Bioinformatics 2020 05;36(9):2778-2786

Department of Electrical and Electronic Engineering.

Motivation: New single-cell technologies continue to fuel the explosive growth in the scale of heterogeneous single-cell data. However, existing computational methods are inadequately scalable to large datasets and therefore cannot uncover the complex cellular heterogeneity.

Results: We introduce PARC (Phenotyping by Accelerated Refined Community-partitioning), a highly scalable graph-based clustering algorithm for large-scale, high-dimensional single-cell data (>1 million cells). Using large single-cell flow and mass cytometry, RNA-seq and imaging-based biophysical data, we demonstrate that PARC consistently outperforms state-of-the-art clustering algorithms without subsampling of cells, including Phenograph, FlowSOM and Flock, in terms of both speed and ability to robustly detect rare cell populations. For example, PARC can cluster a single-cell dataset of 1.1 million cells within 13 min, compared with >2 h for the next fastest graph-clustering algorithm. Our work presents a scalable algorithm to cope with increasingly large-scale single-cell analysis.

Availability and Implementation: https://github.com/ShobiStassen/PARC.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa042
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7203756
May 2020

Cloud accelerated alignment and assembly of full-length single-cell RNA-seq data using Falco.

BMC Genomics 2019 Dec 30;20(Suppl 10):927. Epub 2019 Dec 30.

Victor Chang Cardiac Research Institute, 405 Liverpool St, Darlinghurst, 2010, New South Wales, Australia.

Background: Read alignment and transcript assembly are the core of RNA-seq analysis for transcript isoform discovery. Nonetheless, current tools are not designed to be scalable for analysis of full-length bulk or single cell RNA-seq (scRNA-seq) data. The previous version of our cloud-based tool Falco only focuses on RNA-seq read counting, but does not allow for more flexible steps such as alignment and read assembly.

Results: The Falco framework can harness the parallel and distributed computing environment in modern cloud platforms to accelerate read alignment and transcript assembly of full-length bulk RNA-seq and scRNA-seq data. There are two new modes in Falco: alignment-only and transcript assembly. In the alignment-only mode, Falco can speed up the alignment process by 2.5-16.4x on two public scRNA-seq datasets compared with alignment on a highly optimised standalone computer. It also provides a 10x average speed-up compared with alignment using Rail-RNA, a published cloud-enabled tool for read alignment. In the transcript assembly mode, Falco can speed up the transcript assembly process by 1.7-16.5x compared with transcript assembly on a highly optimised computer.

Conclusion: Falco is a significantly updated open source big data processing framework that enables scalable and accelerated alignment and assembly of full-length scRNA-seq data on the cloud. The source code can be found at https://github.com/VCCRI/Falco.

Source
DOI: http://dx.doi.org/10.1186/s12864-019-6341-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6936136
December 2019

Comparison of somatic variant detection algorithms using Ion Torrent targeted deep sequencing data.

BMC Med Genomics 2019 12 24;12(Suppl 9):181. Epub 2019 Dec 24.

Victor Chang Cardiac Research Institute, Darlinghurst, Australia.

Background: The application of next-generation sequencing in cancer has revealed the genomic landscape of many tumour types and is nowadays routinely used in research and clinical settings. Multiple algorithms have been developed to detect somatic variation from sequencing data using either paired tumour-blood or tumour-only samples. Most of these methods have been developed and evaluated for the identification of somatic variation using Illumina sequencing datasets of moderate coverage. However, a comprehensive evaluation of somatic variant detection algorithms on Ion Torrent targeted deep sequencing data has not been performed.

Methods: We applied three somatic detection algorithms, Torrent Variant Caller, MuTect2 and VarScan2, to a large cohort of ovarian cancer patients comprising 208 paired tumour-blood samples and 253 tumour-only samples sequenced deeply on the Ion Torrent Proton platform across 330 amplicons. Subsequently, the concordance and performance of the three somatic variant callers were assessed.

Results: We have observed low concordance across the algorithms with only 0.5% of SNV and 0.02% of INDEL calls in common across all three methods. The intersection of all methods showed better performance when assessed using correlation with known mutational signatures, overlap with COSMIC variation and by examining the variant characteristics. The Torrent Variant Caller also performed well with the advantage of not eliminating a high number of variants that could lead to high type II error.

Conclusions: Our results suggest that caution should be taken when applying state-of-the-art somatic variant algorithms to Ion Torrent targeted deep sequencing data. Better quality control procedures and strategies that combine results from multiple methods should ensure that higher accuracy is achieved. This is essential to ensure that results from bioinformatics pipelines using Ion Torrent deep sequencing can be robustly applied in cancer research and in the clinic.
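
The abstract does not state how concordance was computed. A minimal sketch, assuming it is the share of all distinct calls that every caller reports (intersection over union), with variants keyed by hypothetical `(chrom, pos, ref, alt)` tuples; the function name and the toy calls below are made up for illustration:

```python
def concordance(call_sets):
    """Fraction of all distinct variant calls reported by every caller."""
    sets = [set(calls) for calls in call_sets]
    if not sets:
        return 0.0
    union = set().union(*sets)        # calls made by at least one caller
    shared = set.intersection(*sets)  # calls made by all callers
    return len(shared) / len(union) if union else 0.0


# Toy call sets only; these are not results from the study.
tvc = {("chr17", 100, "C", "T"), ("chr17", 200, "G", "A"), ("chr13", 50, "A", "G")}
mutect2 = {("chr17", 100, "C", "T"), ("chr13", 50, "A", "G")}
varscan2 = {("chr17", 100, "C", "T"), ("chr2", 10, "T", "C")}

print(concordance([tvc, mutect2, varscan2]))
```

Keying on the full tuple means that two callers reporting the same position with different alternate alleles count as discordant; a real comparison would also need to normalise INDEL representation before matching.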

Source
DOI: http://dx.doi.org/10.1186/s12920-019-0636-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929331
December 2019

Ularcirc: visualization and enhanced analysis of circular RNAs via back and canonical forward splicing.

Nucleic Acids Res 2019 11;47(20):e123

Victor Chang Cardiac Research Institute.

Circular RNAs (circRNA) are a unique class of transcripts that can only be identified from sequence alignments spanning discordant junctions, commonly referred to as backsplice junctions (BSJ). Canonical splicing is also linked with circRNA biogenesis, either from the parental transcript or internal to the circRNA, and is not fully utilized in circRNA software. Here we present Ularcirc, a software tool that integrates the visualization of both BSJ and forward splicing junctions and provides downstream analysis of selected circRNA candidates. Ularcirc utilizes the output of CIRI, circExplorer, or the raw chimeric output of the STAR aligner and assembles a BSJ count table to allow multi-sample analysis. We used Ularcirc to identify and characterize circRNA from public and in-house generated data sets and demonstrate how it can be used to (i) discover novel splicing patterns of parental transcripts, (ii) detect internal splicing patterns of circRNA, and (iii) reveal the complexity of BSJ formation. Furthermore, we identify circRNA that have potential open reading frames longer than their linear sequence. Finally, we detected and validated the presence of a novel class of circRNA generated from ApoA4 transcripts whose BSJ derive from multiple non-canonical splicing sites within coding exons. Ularcirc is available at https://github.com/VCCRI/Ularcirc.

Source
DOI: http://dx.doi.org/10.1093/nar/gkz718
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6846653
November 2019

Enhanced cardiac repair by telomerase reverse transcriptase over-expression in human cardiac mesenchymal stromal cells.

Sci Rep 2019 07 22;9(1):10579. Epub 2019 Jul 22.

Centre for Heart Research, Westmead Institute for Medical Research, The University of Sydney, Westmead, NSW, 2145, Australia.

We have previously reported a subpopulation of mesenchymal stromal cells (MSCs) within the platelet-derived growth factor receptor-alpha (PDGFRα)/CD90 co-expressing cardiac interstitial and adventitial cell fraction. Here we further characterise PDGFRα/CD90-expressing cardiac MSCs (PDGFRα+ cMSCs) and use human telomerase reverse transcriptase (hTERT) over-expression to increase the ability of cMSCs to repair the heart after induced myocardial infarction. hTERT over-expression in PDGFRα+ cardiac MSCs (hTERT+ PDGFRα+ cMSCs) modulates cell differentiation, proliferation, survival and angiogenesis-related genes. In vivo, transplantation of hTERT+ PDGFRα+ cMSCs in athymic rats significantly increased left ventricular function, reduced scar size, and increased angiogenesis and proliferation of both cardiomyocyte and non-myocyte cell fractions four weeks after myocardial infarction. In contrast, transplantation of mutant hTERT+ PDGFRα+ cMSCs (which generate catalytically-inactive telomerase) failed to replicate this cardiac functional improvement, indicating a telomerase-dependent mechanism. There was no hTERT+ PDGFRα+ cMSC engraftment 14 days after transplantation, indicating that functional improvement occurred by paracrine mechanisms. Mass spectrometry on hTERT+ PDGFRα+ cMSC conditioned media showed increased proteins associated with matrix modulation, angiogenesis, cell proliferation/survival/adhesion and innate immunity function. Our study shows that hTERT can activate pro-regenerative signalling within PDGFRα+ cMSCs and enhance cardiac repair after myocardial infarction. An increased understanding of hTERT's role in mesenchymal stromal cells from various organs will favourably impact clinical regenerative and anti-cancer therapies.

Source
DOI: http://dx.doi.org/10.1038/s41598-019-47022-w
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6646304
July 2019

Dam mutants provide improved sensitivity and spatial resolution for profiling transcription factor binding.

Epigenetics Chromatin 2019 06 13;12(1):36. Epub 2019 Jun 13.

Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA.

DamID, in which a protein of interest is fused to Dam methylase, enables mapping of protein-DNA binding through readout of adenine methylation in genomic DNA. DamID offers a compelling alternative to chromatin immunoprecipitation sequencing (ChIP-Seq), particularly in cases where cell number or antibody availability is limiting. This comes at a cost, however, of high non-specific signal and a lowered spatial resolution of several kb, limiting its application to transcription factor-DNA binding. Here we show that mutations in Dam, when fused to the transcription factor Tcf7l2, greatly reduce non-specific methylation. Combined with a simplified DamID sequencing protocol, we find that these Dam mutants allow for accurate detection of transcription factor binding at a sensitivity and spatial resolution closely matching that seen in ChIP-seq.

Source
DOI: http://dx.doi.org/10.1186/s13072-019-0273-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6567924
June 2019

Discovery of perturbation gene targets via free text metadata mining in Gene Expression Omnibus.

Comput Biol Chem 2019 Jun 24;80:152-158. Epub 2019 Mar 24.

Victor Chang Cardiac Research Institute, Sydney, Australia; University of New South Wales, Sydney, Australia; School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.

There are over 2.5 million publicly available gene expression samples across 101,000 data series in NCBI's Gene Expression Omnibus (GEO) database. Because GEO's free text metadata do not use standardised ontology terms to annotate the experimental type and sample type, this database remains difficult to harness computationally without significant manual intervention. In this work, we present an interactive R/Shiny tool called GEOracle that utilises text mining and machine learning techniques to automatically identify perturbation experiments, group treatment and control samples, and perform differential expression analysis. We present applications of GEOracle to discover conserved signalling pathway target genes and identify an organ-specific gene regulatory network. GEOracle is effective in discovering perturbation gene targets in GEO by harnessing its free text metadata. Its effectiveness and applicability have been demonstrated by cross validation and two real-life case studies. It opens up new avenues to unlock the gene regulatory information embedded inside large biological databases such as GEO. GEOracle is available at https://github.com/VCCRI/GEOracle.

Source
DOI: http://dx.doi.org/10.1016/j.compbiolchem.2019.03.014
June 2019

Ultrafast clustering of single-cell flow cytometry data using FlowGrid.

BMC Syst Biol 2019 04 5;13(Suppl 2):35. Epub 2019 Apr 5.

Victor Chang Cardiac Research Institute, Sydney, Australia.

Background: Flow cytometry is a popular technology for quantitative single-cell profiling of cell surface markers. It enables expression measurement of tens of cell surface protein markers in millions of single cells. It is a powerful tool for discovering cell sub-populations and quantifying cell population heterogeneity. Traditionally, scientists use manual gating to identify cell types, but the process is subjective and is not effective for large multidimensional data. Many clustering algorithms have been developed to analyse these data but most of them are not scalable to very large data sets with more than ten million cells.

Results: Here, we present a new clustering algorithm that combines the advantages of the density-based clustering algorithm DBSCAN with the scalability of grid-based clustering. This new clustering algorithm is implemented in Python as an open source package, FlowGrid. FlowGrid is memory efficient and scales linearly with respect to the number of cells. We have evaluated the performance of FlowGrid against other state-of-the-art clustering programs and found that FlowGrid produces similar clustering results in substantially less time. For example, FlowGrid is able to complete a clustering task on a data set of 23.6 million cells in less than 12 seconds, while other algorithms take more than 500 seconds or fail with an error.

Conclusions: FlowGrid is an ultrafast clustering algorithm for large single-cell flow cytometry data. The source code is available at https://github.com/VCCRI/FlowGrid.
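
The general idea of pairing a density criterion with a grid can be sketched in a few lines. This is not FlowGrid's implementation: the function `grid_cluster`, its parameters `bin_width` and `min_count`, and the rule that dense cells touching each other (including diagonally) form one cluster are all assumptions chosen to keep the illustration short.

```python
from collections import Counter, deque
from itertools import product


def grid_cluster(points, bin_width, min_count):
    """Label each point with a cluster id, or -1 if its cell is sparse."""
    # Assign every point to the integer coordinates of its grid cell.
    cells = [tuple(int(x // bin_width) for x in p) for p in points]
    counts = Counter(cells)
    # A cell is "dense" when it holds at least min_count points.
    dense = {c for c, n in counts.items() if n >= min_count}

    # Offsets to all neighbouring cells, excluding the cell itself.
    dim = len(points[0]) if points else 0
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]

    # Breadth-first search joins neighbouring dense cells into clusters.
    cell_label = {}
    next_label = 0
    for start in dense:
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for o in offsets:
                nb = tuple(c + d for c, d in zip(cell, o))
                if nb in dense and nb not in cell_label:
                    cell_label[nb] = next_label
                    queue.append(nb)
        next_label += 1

    # Points in sparse cells are treated as noise.
    return [cell_label.get(c, -1) for c in cells]


# Toy 2-D data, for illustration only.
points = [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.3, 0.4),
          (10.0, 10.0), (10.2, 10.1), (5.0, 5.0)]
labels = grid_cluster(points, bin_width=1.0, min_count=2)
print(labels)
```

Binning is a single pass over the points, and the search afterwards touches only occupied cells, which is where the scalability of grid-based approaches comes from. The neighbour enumeration in this sketch grows as 3^d with the number of dimensions d, so it is only practical for low-dimensional data.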

Source
DOI: http://dx.doi.org/10.1186/s12918-019-0690-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6449887
April 2019

Big data: the elements of good questions, open data, and powerful software.

Biophys Rev 2019 Feb 25;11(1):1-3. Epub 2019 Jan 25.

Victor Chang Cardiac Research Institute, Darlinghurst, NSW, 2010, Australia.


Source
DOI: http://dx.doi.org/10.1007/s12551-019-00500-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6381355
February 2019