Front Genet 2019 3;10:276. Epub 2019 Apr 3.
Aginome-XMU Joint Lab, Xiamen University, Xiamen, China.
In the last few years, advances in next-generation sequencing (NGS) technology for whole genome sequencing (WGS) of foodborne pathogens have drastically improved food pathogen outbreak surveillance. WGS of foodborne pathogens enables identification of pathogens from food or environmental samples, including difficult-to-detect pathogens in culture-negative infections. Compared to traditional low-resolution methods such as pulsed-field gel electrophoresis (PFGE), WGS can differentiate even closely related strains of the same species, thus enabling rapid identification of the food source associated with a pathogen outbreak event and a fast mitigation plan. In this paper, we present UltraStrain, a fast and ultra-sensitive pathogen detection and strain typing method for () based on WGS data analysis. In the proposed method, a noise filtering step is first performed, in which the raw sequencing data are mapped to a synthetic species-specific reference genome generated from specific marker sequences, to avoid potential interference from closely related species in low-spike samples. Next, a statistical learning based method is used to identify the candidate strains, from a database of known strains, that best explain the retained specific reads. Finally, a refinement step is performed by mapping all the reads before filtering onto the identified top candidate strains and recalculating the probability of presence for each candidate strain. Experimental results using both synthetic and real sequencing data show that the proposed method is able to identify the correct strains from low-spike samples and outperforms several existing strain-typing methods in terms of sensitivity and accuracy.
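The candidate-strain identification step can be illustrated with a small sketch. The abstract does not specify the statistical model, so the code below substitutes a generic expectation-maximization (EM) mixture estimate as an assumption, not the authors' method: each retained read is marked as compatible or not with each known strain, and EM estimates the strain proportions that best explain those reads. The function name `estimate_strain_proportions` and the compatibility-matrix input are hypothetical.

```python
def estimate_strain_proportions(compat, n_iter=200, tol=1e-9):
    """Estimate strain proportions with EM (illustrative assumption only).

    compat[r][s] is 1 if read r maps to strain s, else 0.
    Reads that map to no strain are treated as noise and dropped.
    """
    n_strains = len(compat[0])
    reads = [row for row in compat if any(row)]
    if not reads:
        raise ValueError("no read is compatible with any strain")

    # Start from a uniform guess over all strains in the database.
    pi = [1.0 / n_strains] * n_strains
    for _ in range(n_iter):
        counts = [0.0] * n_strains
        # E-step: split each read across its compatible strains,
        # weighted by the current proportion estimates.
        for row in reads:
            weights = [pi[s] * row[s] for s in range(n_strains)]
            total = sum(weights)
            for s in range(n_strains):
                counts[s] += weights[s] / total
        # M-step: new proportions are the normalized expected counts.
        new_pi = [c / len(reads) for c in counts]
        delta = max(abs(a - b) for a, b in zip(pi, new_pi))
        pi = new_pi
        if delta < tol:
            break
    return pi


# Toy input: strain 0 has four unique reads, four reads are shared
# between strains 0 and 1, strain 2 has no support, one read is noise.
toy = [[1, 0, 0]] * 4 + [[1, 1, 0]] * 4 + [[0, 0, 0]]
proportions = estimate_strain_proportions(toy)
top_strain = max(range(len(proportions)), key=proportions.__getitem__)
```

In this toy input the shared reads are progressively assigned to strain 0, because only strain 0 also explains the unique reads. A refinement pass in the spirit of the final step would rerun the same estimate on all reads, restricted to the top candidates.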