Publications by authors named "Marcel Salathé"

56 Publications

Supervised Learning Computer Vision Benchmark for Snake Species Identification From Photographs: Implications for Herpetology and Global Health.

Front Artif Intell 2021 Apr 20;4:582110. Epub 2021 Apr 20.

Institute of Global Health, Faculty of Medicine, University of Geneva, Geneva, Switzerland.

We trained a computer vision algorithm to identify 45 species of snakes from photos and compared its performance to that of humans. Human and algorithm performance are both substantially better than random guessing (null probability of guessing correctly given 45 classes = 2.2%). Some species are routinely identified with ease by both algorithm and humans, whereas other groups of species (e.g., uniform green snakes, blotched brown snakes) are routinely confused. A species complex with largely molecular species delimitation (North American ratsnakes) was the most challenging for computer vision. Humans had an edge at identifying images of poor quality or with visual artifacts. With future improvement, computer vision could play a larger role in snakebite epidemiology, particularly when combined with information about geographic location and input from human experts.
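
The 2.2% chance level quoted in this abstract follows from uniform random guessing over 45 classes. A minimal sketch of that arithmetic is shown here; the function name is illustrative and does not come from the paper.

```python
def chance_accuracy(n_classes: int) -> float:
    """Expected accuracy of a uniform random guess over n_classes labels."""
    if n_classes <= 0:
        raise ValueError("n_classes must be positive")
    return 1.0 / n_classes


# 45 snake species, as stated in the abstract
baseline = chance_accuracy(45)
print(f"{baseline:.1%}")  # prints 2.2%
```

Any classifier or human rater scoring well above this baseline is doing better than chance, which is the comparison the abstract draws.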

Source
DOI: http://dx.doi.org/10.3389/frai.2021.582110
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8093445
April 2021

Deep Learning for Understanding Satellite Imagery: An Experimental Survey.

Front Artif Intell 2020 Nov 16;3:534696. Epub 2020 Nov 16.

Center for Cognitive Interaction Technology (CITEC), Bielefeld University, Bielefeld, Germany.

Translating satellite imagery into maps requires intensive effort and time, which often leads to inaccurate maps of affected regions during disasters and conflicts. The availability of recent datasets, combined with advances in computer vision made through deep learning, has paved the way toward automated satellite image translation. To facilitate research in this direction, we introduce the Satellite Imagery Competition using a modified SpaceNet dataset. Participants had to come up with different segmentation models to detect positions of buildings on satellite images. In this work, we present five approaches based on improvements of U-Net and Mask R-CNN (Mask Region-based Convolutional Neural Network) models, coupled with unique training adaptations using boosting algorithms, morphological filters, Conditional Random Fields and custom losses. The good results from these models demonstrate the feasibility of deep learning in automated satellite image annotation.

Source
DOI: http://dx.doi.org/10.3389/frai.2020.534696
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7944145
November 2020

Early evidence of effectiveness of digital contact tracing for SARS-CoV-2 in Switzerland.

Swiss Med Wkly 2020 12 16;150:w20457. Epub 2020 Dec 16.

Digital and Mobile Health Group, Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Switzerland.

In the wake of the pandemic of coronavirus disease 2019 (COVID-19), contact tracing has become a key element of strategies to control the spread of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). Given the rapid and intense spread of SARS-CoV-2, digital contact tracing has emerged as a potential complementary tool to support containment and mitigation efforts. Early modelling studies highlighted the potential of digital contact tracing to break transmission chains, and Google and Apple subsequently developed the Exposure Notification (EN) framework, making it available to the vast majority of smartphones. A growing number of governments have launched or announced EN-based contact tracing apps, but their effectiveness remains unknown. Here, we report early findings of the digital contact tracing app deployment in Switzerland. We demonstrate proof-of-principle that digital contact tracing reaches exposed contacts, who then test positive for SARS-CoV-2. This indicates that digital contact tracing is an effective complementary tool for controlling the spread of SARS-CoV-2. Continued technical improvement and international compatibility can further increase its efficacy, particularly across country borders.

Source
DOI: http://dx.doi.org/10.4414/smw.2020.20457
December 2020

A digital reconstruction of the 1630-1631 large plague outbreak in Venice.

Sci Rep 2020 10 20;10(1):17849. Epub 2020 Oct 20.

Digital Epidemiology Laboratory, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.

The plague, an infectious disease caused by the bacterium Yersinia pestis, is widely considered to be responsible for the most devastating and deadly pandemics in human history. Starting with the infamous Black Death, plague outbreaks are estimated to have killed around 100 million people over multiple centuries, with local mortality rates as high as 60%. However, detailed pictures of the disease dynamics of these outbreaks centuries ago remain scarce, mainly due to the lack of high-quality historical data in digital form. Here, we present an analysis of the 1630-1631 plague outbreak in the city of Venice, using newly collected daily death records. We identify the presence of a two-peak pattern, for which we present two possible explanations based on computational models of disease dynamics. Systematically digitized historical records like the ones presented here promise to enrich our understanding of historical phenomena of enduring importance. This work contributes to the recently renewed interdisciplinary foray into the epidemiological and societal impact of pre-modern epidemics.

Source
DOI: http://dx.doi.org/10.1038/s41598-020-74775-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7576796
October 2020

Assessing Public Opinion on CRISPR-Cas9: Combining Crowdsourcing and Deep Learning.

J Med Internet Res 2020 08 31;22(8):e17830. Epub 2020 Aug 31.

Health Ethics and Policy Lab, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland.

Background: The discovery of the CRISPR-Cas9-based gene editing method has opened unprecedented new potential for biological and medical engineering, sparking a growing public debate on both the potential and dangers of CRISPR applications. Given the speed of technology development and the almost instantaneous global spread of news, it is important to follow evolving debates without much delay and in sufficient detail, as certain events may have a major long-term impact on public opinion and later influence policy decisions.

Objective: Social media networks such as Twitter have been shown to be major drivers of news dissemination and public discourse. They provide a vast amount of semistructured data in almost real time and give direct access to the content of the conversations. We can now mine and analyze such data quickly because of recent developments in machine learning and natural language processing.

Methods: Here, we used Bidirectional Encoder Representations from Transformers (BERT), an attention-based transformer model, in combination with statistical methods to analyze all tweets published on CRISPR since the publication of the first gene editing application in 2013.

Results: We show that the mean sentiment of tweets was initially very positive, but began to decrease over time, and that this decline was driven by rare peaks of strong negative sentiments. Due to the high temporal resolution of the data, we were able to associate these peaks with specific events and to observe how trending topics changed over time.

Conclusions: Overall, this type of analysis can provide valuable and complementary insights into ongoing public debates, extending the traditional empirical bioethics toolset.

Source
DOI: http://dx.doi.org/10.2196/17830
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7490675
August 2020

A research agenda for digital proximity tracing apps.

Swiss Med Wkly 2020 Jul 16;150:w20324. Epub 2020 Jul 16.

Institute of Social and Preventive Medicine, University of Bern, Switzerland.

Source
DOI: http://dx.doi.org/10.4414/smw.2020.20324
July 2020

Keep calm and carry on vaccinating: Is anti-vaccination sentiment contributing to declining vaccine coverage in England?

Vaccine 2020 07 16;38(33):5297-5304. Epub 2020 Jun 16.

Immunisation Division, National Infection Service, Public Health England, 61 Colindale Avenue, London NW9 5EQ, England, United Kingdom.

Background: In England, coverage for childhood vaccines has decreased since 2012/13 in the context of an increasingly visible anti-vaccination discourse. We determined whether anti-vaccination sentiment is the likely cause of this decline in coverage.

Methods: Descriptive study triangulating a range of data sources (vaccine coverage, cross-sectional survey of attitudes towards vaccination, UK-specific Twitter social media) and assessing them against the following Bradford Hill criteria: strength of association, consistency, specificity, temporality, biological gradient and coherence.

Results: Strength of association: compared with well-documented vaccine scares, the decline in childhood vaccination seen since 2012/13 is 4-20 times smaller. Consistency: while coverage for completed courses of the hexavalent and meningococcal vaccines decreased by 0.5-1.2 percentage points (pp) between 2017 and 2019, coverage for the first dose of these vaccines increased by 0.5-0.7 pp. Specificity: since 2012/13, coverage decreased for some vaccines (hexavalent, MMR, HPV, shingles) and increased for others (MenACWY, Td/IPV, antenatal pertussis, influenza in 2 years of children), with no age-specific patterns. Temporality and biological gradient: the decline in vaccine coverage was preceded by an increase in vaccine confidence and a decrease in the proportion of parents encountering anti-vaccination materials. Coherence: attitudes towards vaccination expressed on Twitter in the UK became increasingly positive between 2017 and 2019 as vaccine coverage for childhood vaccines decreased.

Conclusions: In England, trends in vaccine coverage between 2012/13 and 2018/19 were not homogeneous and varied in magnitude and direction according to vaccine, dose and region. In addition, confidence in vaccines increased during the same period. These findings are not compatible with anti-vaccination sentiment causing a decline in vaccine coverage in England.

Source
DOI: http://dx.doi.org/10.1016/j.vaccine.2020.05.082
July 2020

Author Correction: Assessing the Dynamics and Control of Droplet- and Aerosol-Transmitted Influenza Using an Indoor Positioning System.

Sci Rep 2020 Mar 27;10(1):5792. Epub 2020 Mar 27.

Global Health Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

Source
DOI: http://dx.doi.org/10.1038/s41598-020-62682-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7099035
March 2020

COVID-19 epidemic in Switzerland: on the importance of testing, contact tracing and isolation.

Swiss Med Wkly 2020 03 19;150:w20225. Epub 2020 Mar 19.

University of Bern, Switzerland.

Switzerland is among the countries with the highest number of coronavirus disease-2019 (COVID-19) cases per capita in the world. There are likely many people with undetected SARS-CoV-2 infection because testing efforts are currently not detecting all infected people, including some with clinical disease compatible with COVID-19. Testing on its own will not stop the spread of SARS-CoV-2. Testing is part of a strategy. The World Health Organization recommends a combination of measures: rapid diagnosis and immediate isolation of cases, rigorous tracking and precautionary self-isolation of close contacts. In this article, we explain why the testing strategy in Switzerland should be strengthened urgently, as a core component of a combination approach to control COVID-19.

Source
DOI: http://dx.doi.org/10.4414/smw.2020.20225
March 2020

Snakebite and snake identification: empowering neglected communities and health-care providers with AI.

Lancet Digit Health 2019 09 5;1(5):e202-e203. Epub 2019 Sep 5.

Institute of Global Health, Faculty of Medicine, University of Geneva, Geneva, Switzerland.

Source
DOI: http://dx.doi.org/10.1016/S2589-7500(19)30086-X
September 2019

Assessment of menstrual health status and evolution through mobile apps for fertility awareness.

NPJ Digit Med 2019 Jul 16;2:64. Epub 2019 Jul 16.

Digital Epidemiology Lab, Global Health Institute, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Campus Biotech, Chemin des mines 9, 1202 Geneva, Switzerland.

For most women of reproductive age, assessing menstrual health and fertility typically involves regular visits to a gynecologist or another clinician. While these evaluations provide critical information on an individual's reproductive health status, they typically rely on memory-based self-reports, and the results are rarely, if ever, assessed at the population level. In recent years, mobile apps for menstrual tracking have become very popular, allowing us to evaluate the reliability and tracking frequency of millions of self-observations, thereby providing an unparalleled view, both in detail and scale, on menstrual health and its evolution for large populations. The primary aim of this study was to describe the tracking behavior of the app users and their overall observation patterns in an effort to understand if they were consistent with previous small-scale medical studies. The secondary aim was to investigate whether their precision allowed the detection and estimation of ovulation timing, which is critical for reproductive and menstrual health. Retrospective self-observation data were acquired from two mobile apps dedicated to the application of the sympto-thermal fertility awareness method, resulting in a dataset of more than 30 million days of observations from over 2.7 million cycles for 200,000 users. The analysis of the data showed that up to 40% of the cycles in which users were seeking pregnancy had recordings every single day. With a modeling approach using Hidden Markov Models to describe the collected data and estimate ovulation timing, it was found that the average duration and range of the follicular phase were larger than previously reported, with only 24% of ovulations occurring at cycle days 14 to 15, while the luteal phase duration and range were in line with previous reports, although short luteal phases (10 days or less) were more frequently observed (in up to 20% of cycles). The digital epidemiology approach presented here can lead to a better understanding of menstrual health and its connection to women's health overall, which has historically been severely understudied.

Source
DOI: http://dx.doi.org/10.1038/s41746-019-0139-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6635432
July 2019

Crowdbreaks: Tracking Health Trends Using Public Social Media Data and Crowdsourcing.

Front Public Health 2019 Apr 12;7:81. Epub 2019 Apr 12.

Digital Epidemiology Lab, EPFL, Geneva, Switzerland.

In the past decade, tracking health trends using social media data has shown great promise, due to a powerful combination of massive adoption of social media around the world and increasingly potent hardware and software that enable us to work with these new big data streams. At the same time, many challenging problems have been identified. First, there is often a mismatch between how rapidly online data can change and how rapidly algorithms are updated, which means that algorithms trained on past data have limited reusability because their performance decreases over time. Second, much of the work focuses on specific issues during a specific past period, even though public health institutions need flexible tools to assess multiple evolving situations in real time. Third, most tools providing such capabilities are proprietary systems with little algorithmic or data transparency, and thus little buy-in from the global public health and research community. Here, we introduce Crowdbreaks, an open platform that allows tracking of health trends by making use of continuous crowdsourced labeling of public social media content. The system automates the typical workflow of data collection, filtering, labeling and training of machine learning classifiers, and can therefore greatly accelerate the research process in the public health domain. This work describes the technical aspects of the platform, covering its current functionalities and exploring its future use cases and extensions.

Source
DOI: http://dx.doi.org/10.3389/fpubh.2019.00081
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6476276
April 2019

WHO and ITU establish benchmarking process for artificial intelligence in health.

Lancet 2019 Jul 29;394(10192):9-11. Epub 2019 Mar 29.

China Academy of Information and Communications Technology, Beijing, China.

Source
DOI: http://dx.doi.org/10.1016/S0140-6736(19)30762-7
July 2019

Assessing the Dynamics and Control of Droplet- and Aerosol-Transmitted Influenza Using an Indoor Positioning System.

Sci Rep 2019 02 18;9(1):2185. Epub 2019 Feb 18.

Global Health Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.

There is increasing evidence that aerosol transmission is a major contributor to the spread of influenza. Despite this, virtually all studies assessing the dynamics and control of influenza assume that it is transmitted solely through direct contact and large droplets, requiring close physical proximity. Here, we use wireless sensors to measure simultaneously both the location and close proximity contacts in the population of a US high school. This dataset, highly resolved in space and time, allows us to model both droplet and aerosol transmission either in isolation or in combination. In particular, it allows us to computationally quantify the potential effectiveness of overlooked mitigation strategies such as improved ventilation that are available in the case of aerosol transmission. Our model suggests that recommendation-abiding ventilation could be as effective in mitigating outbreaks as vaccinating approximately half of the population. In simulations using empirical transmission levels observed in households, we find that bringing ventilation to recommended levels had the same mitigating effect as a vaccination coverage of 50% to 60%. Ventilation is an easy-to-implement strategy that has the potential to support vaccination efforts for effective control of influenza spread.

Source
DOI: http://dx.doi.org/10.1038/s41598-019-38825-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6379436
February 2019

FoodRepo: An Open Food Repository of Barcoded Food Products.

Front Nutr 2018 Jul 4;5:57. Epub 2018 Jul 4.

Global Health Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland.

Source
DOI: http://dx.doi.org/10.3389/fnut.2018.00057
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6040205
July 2018

Digital epidemiology: what is it, and where is it going?

Authors:
Marcel Salathé

Life Sci Soc Policy 2018 Jan 4;14(1). Epub 2018 Jan 4.

Digital Epidemiology Lab, School of Life Sciences and School of Computer and Communication Sciences, EPFL, Chemin des Mines 9, 1202, Geneva, Switzerland.

Digital Epidemiology is a new field that has been growing rapidly in the past few years, fueled by the increasing availability of data and computing power, as well as by breakthroughs in data analytics methods. In this short piece, I provide an outlook of where I see the field heading, and offer a broad and a narrow definition of the term.

Source
DOI: http://dx.doi.org/10.1186/s40504-017-0065-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5754279
January 2018

Augmenting Research, Education, and Outreach with Client-Side Web Programming.

Trends Biotechnol 2018 05 15;36(5):473-476. Epub 2017 Dec 15.

Institute of Chemical Sciences and Engineering, EPFL, Lausanne CH-1015, Switzerland.

The evolution of computing and web technologies over the past decade has enabled the development of fully fledged scientific applications that run directly on web browsers. Powered by JavaScript, the lingua franca of web programming, these 'web apps' are starting to revolutionize and democratize scientific research, education, and outreach.

Source
DOI: http://dx.doi.org/10.1016/j.tibtech.2017.11.009
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6445968
May 2018

Critical dynamics in population vaccinating behavior.

Proc Natl Acad Sci U S A 2017 12 11;114(52):13762-13767. Epub 2017 Dec 11.

Department of Applied Mathematics, University of Waterloo, Waterloo, ON, Canada N2L 3G1.

Vaccine refusal can lead to renewed outbreaks of previously eliminated diseases and even delay global eradication. Vaccinating decisions exemplify a complex, coupled system where vaccinating behavior and disease dynamics influence one another. Such systems often exhibit critical phenomena: special dynamics close to a tipping point leading to a new dynamical regime. For instance, critical slowing down (declining rate of recovery from small perturbations) may emerge as a tipping point is approached. Here, we collected and geocoded tweets about measles-mumps-rubella vaccine and classified their sentiment using machine-learning algorithms. We also extracted data on measles-related Google searches. We find critical slowing down in the data at the level of California and the United States in the years before and after the 2014-2015 Disneyland, California measles outbreak. Critical slowing down starts growing appreciably several years before the Disneyland outbreak as vaccine uptake declines and the population approaches the tipping point. However, due to the adaptive nature of coupled behavior-disease systems, the population responds to the outbreak by moving away from the tipping point, causing "critical speeding up" whereby resilience to perturbations increases. A mathematical model of measles transmission and vaccine sentiment predicts the same qualitative patterns in the neighborhood of a tipping point to greatly reduced vaccine uptake and large epidemics. These results support the hypothesis that population vaccinating behavior near the disease elimination threshold is a critical phenomenon. Developing new analytical tools to detect these patterns in digital social data might help us identify populations at heightened risk of widespread vaccine refusal.
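
The abstract describes critical slowing down as a declining rate of recovery from small perturbations. One widely used proxy for this in time-series analysis is a rise in lag-1 autocorrelation within a sliding window. Whether the authors used this exact indicator is not stated in the abstract, so the following is only an illustrative sketch under that assumption, with hypothetical function names.

```python
from statistics import mean


def lag1_autocorrelation(series):
    """Lag-1 autocorrelation of a numeric sequence (biased estimator)."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mu = mean(series)
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        return 0.0  # constant window: no variation to correlate
    num = sum((series[i] - mu) * (series[i + 1] - mu) for i in range(n - 1))
    return num / denom


def rolling_lag1(series, window):
    """Lag-1 autocorrelation in each sliding window of the given length."""
    return [
        lag1_autocorrelation(series[i:i + window])
        for i in range(len(series) - window + 1)
    ]
```

Applied to a sentiment or search-volume time series, an upward trend in the values returned by `rolling_lag1` would be read as slower recovery from perturbations, and a downward trend as the "critical speeding up" mentioned in the abstract.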

Source
DOI: http://dx.doi.org/10.1073/pnas.1704093114
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5748162
December 2017

Digital Pharmacovigilance and Disease Surveillance: Combining Traditional and Big-Data Systems for Better Public Health.

Authors:
Marcel Salathé

J Infect Dis 2016 12;214(suppl_4):S399-S403

Digital Epidemiology Laboratory, School of Life Sciences and School of Computer and Communication Sciences, EPFL, Geneva, Switzerland.

The digital revolution has contributed to very large data sets (i.e., big data) relevant for public health. The two major data sources are electronic health records from traditional health systems and patient-generated data. As the two data sources have complementary strengths (high veracity in the data from traditional sources, and high velocity and variety in patient-generated data), they can be combined to build more robust public health systems. However, they also have unique challenges. Patient-generated data in particular are often completely unstructured and highly context dependent, posing essentially a machine-learning challenge. Some recent examples from infectious disease surveillance and adverse drug event monitoring demonstrate that the technical challenges can be solved. Despite these advances, the problem of verification remains, and unless traditional and digital epidemiologic approaches are combined, these data sources will be constrained by their intrinsic limits.

Source
DOI: http://dx.doi.org/10.1093/infdis/jiw281
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5144898
December 2016

An ecological and digital epidemiology analysis on the role of human behavior on the 2014 Chikungunya outbreak in Martinique.

Sci Rep 2017 07 20;7(1):5967. Epub 2017 Jul 20.

Centre de Démoustication/Lutte antivectorielle CTM/ARS, Martinique, France.

Understanding the spatio-temporal dynamics of endemic infections is of critical importance for a deeper understanding of pathogen transmission, and for the design of more efficient public health strategies. However, very few studies in this domain have focused on emerging infections, generating a gap of knowledge that hampers epidemiological response planning. Here, we analyze the case of a Chikungunya outbreak that occurred in Martinique in 2014. Using time series estimates from a network of sentinel practitioners covering the entire island, we first analyze the spatio-temporal dynamics and show that the largest city has served as the epicenter of this epidemic. We further show that the epidemic spread from there through two different propagation waves moving northwards and southwards, probably by individuals moving along the road network. We then develop a mathematical model to explore the drivers of the temporal dynamics of this mosquito-borne virus. Finally, we show that human behavior, inferred by a textual analysis of messages published on the social network Twitter, is required to explain the epidemiological dynamics over time. Overall, our results suggest that human behavior has been a key component of the outbreak propagation, and we argue that such results can lead to more efficient public health strategies specifically targeting the propagation process.

Source
DOI: http://dx.doi.org/10.1038/s41598-017-05957-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5519737
July 2017

Precision global health in the digital age.

Swiss Med Wkly 2017 8;147:w14423. Epub 2017 Apr 8.

Présidence, Station 1, EPFL, Switzerland.

Precision global health is an approach similar to precision medicine, which facilitates, through innovation and technology, better targeting of public health interventions on a global scale, for the purpose of maximising their effectiveness and relevance. Illustrative examples include: the use of remote sensing data to fight vector-borne diseases; large databases of genomic sequences of foodborne pathogens helping to identify origins of outbreaks; social networks and internet search engines for tracking communicable diseases; cell phone data in humanitarian actions; drones to deliver healthcare services in remote and secluded areas. Open science and data sharing platforms are proposed for fostering international research programmes under fair, ethical and respectful conditions. Innovative education, such as massive open online courses or serious games, can promote wider access to training in public health and improve health literacy. The world is moving towards learning healthcare systems. Professionals are equipped with data collection and decision support devices. They share information, which is complemented by external sources and analysed in real time using machine learning techniques. These systems allow for the early detection of anomalies, and eventually guide appropriate public health interventions. This article shows how information-driven approaches, enabled by digital technologies, can help improve global health with greater equity.

Source
DOI: http://dx.doi.org/10.4414/smw.2017.14423
October 2017

[Using big data for disease surveillance and drug safety monitoring].

Rev Prat 2017 01;67(1):25-30

Digital Epidemiology Lab, École polytechnique fédérale de Lausanne, Lausanne, Suisse.

The ongoing global growth in internet usage and data sharing has generated tremendous amounts of "big data" that can be analyzed for public health purposes. In addition to electronic medical records, new paradigms are emerging, leveraging the combination of patient-generated data with data from traditional health systems. As the two data sources have complementary strengths (high veracity in the data from traditional sources, and high velocity and variety in patient-generated data), they can be combined to build more robust public health systems. Here, we will focus on two areas of public health: infectious disease surveillance and adverse drug event monitoring. The problem with these data sources is that they are completely unstructured and highly context-dependent, posing essentially a machine learning challenge. Some of the recent examples in these two domains indicate that the technical challenges can be solved, but as long as the two systems are separate, they will be constrained by their intrinsic limits.

Source
January 2017

Using Deep Learning for Image-Based Plant Disease Detection.

Front Plant Sci 2016 Sep 22;7:1419. Epub 2016 Sep 22.

Digital Epidemiology Lab, EPFL, Geneva, Switzerland; School of Life Sciences, EPFL, Lausanne, Switzerland; School of Computer and Communication Sciences, EPFL, Lausanne, Switzerland.

Crop diseases are a major threat to food security, but their rapid identification remains difficult in many parts of the world due to the lack of the necessary infrastructure. The combination of increasing global smartphone penetration and recent advances in computer vision made possible by deep learning has paved the way for smartphone-assisted disease diagnosis. Using a public dataset of 54,306 images of diseased and healthy plant leaves collected under controlled conditions, we train a deep convolutional neural network to identify 14 crop species and 26 diseases (or absence thereof). The trained model achieves an accuracy of 99.35% on a held-out test set, demonstrating the feasibility of this approach. Overall, the approach of training deep learning models on increasingly large and publicly available image datasets presents a clear path toward smartphone-assisted crop disease diagnosis on a massive global scale.

Source
DOI: http://dx.doi.org/10.3389/fpls.2016.01419
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5032846
September 2016

Identifying Adverse Effects of HIV Drug Treatment and Associated Sentiments Using Twitter.

JMIR Public Health Surveill 2015 Jul-Dec;1(2):e7. Epub 2015 Jul 27.

Center for Infectious Disease Dynamics, Department of Biology, Penn State University, University Park, PA, United States.

Background: Social media platforms are increasingly seen as a source of data on a wide range of health issues. Twitter is of particular interest for public health surveillance because of its public nature. However, the very public nature of social media platforms such as Twitter may act as a barrier to public health surveillance, as people may be reluctant to publicly disclose information about their health. This is of particular concern in the context of diseases that are associated with a certain degree of stigma, such as HIV/AIDS.

Objective: The objective of the study is to assess whether adverse effects of HIV drug treatment and associated sentiments can be determined using publicly available data from social media.

Methods: We describe a combined approach of machine learning and crowdsourced human assessment to identify adverse effects of HIV drug treatment based solely on individual reports posted publicly on Twitter. Starting from a large dataset of 40 million tweets collected over three years, we identify a very small subset (1642; 0.004%) of individual reports describing personal experiences with HIV drug treatment.

Results: Despite the small size of the extracted final dataset, the summary representation of adverse effects attributed to specific drugs, or drug combinations, accurately captures well-recognized toxicities. In addition, the data allowed us to discriminate across specific drug compounds, to identify preferred drugs over time, and to capture novel events such as the availability of preexposure prophylaxis.

Conclusions: The effect of limited data sharing due to the public nature of the data can be partially offset by the large number of people sharing data in the first place, an observation that may play a key role in digital epidemiology in general.

Source
http://dx.doi.org/10.2196/publichealth.4488 DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4869211 PMC
May 2016

Ethical challenges of big data in public health.

PLoS Comput Biol 2015 Feb 9;11(2):e1003904. Epub 2015 Feb 9.

Children's Hospital Informatics Program, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America.

Source
http://dx.doi.org/10.1371/journal.pcbi.1003904 DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4321985 PMC
February 2015

An ensemble heterogeneous classification methodology for discovering health-related knowledge in social media messages.

J Biomed Inform 2014 Jun 16;49:255-68. Epub 2014 Mar 16.

Human Development and Family Studies, The Pennsylvania State University, University Park, PA 16802, USA.

Objectives: The role of social media as a source of timely and massive information has become more apparent since the era of Web 2.0. Multiple studies have illustrated the use of information in social media to discover biomedical and health-related knowledge. Most methods proposed in the literature employ traditional document classification techniques that represent a document as a bag of words. These techniques work well when documents are rich in text and conform to standard English; however, they are not optimal for social media data, where sparsity and noise are the norm. This paper aims to address the limitations posed by traditional bag-of-words-based methods and proposes to use heterogeneous features in combination with ensemble machine learning techniques to discover health-related information, which could prove useful to multiple biomedical applications, especially those needing to discover health-related knowledge in large-scale social media data. Furthermore, the proposed methodology could be generalized to discover different types of information in various kinds of textual data.

Methodology: Social media data is characterized by an abundance of short, social-oriented messages that do not conform to standard languages, either grammatically or syntactically. The problem of discovering health-related knowledge in social media data streams is then transformed into a text classification problem, where a text is identified as positive if it is health-related and negative otherwise. We first identify the limitations of the traditional methods, which train machines with N-gram word features, then propose to overcome such limitations by utilizing the collaboration of machine-learning-based classifiers, each of which is trained to learn a semantically different aspect of the data. The parameter analysis for tuning each classifier is also reported.

Data Sets: Three data sets are used in this research. The first data set comprises approximately 5000 hand-labeled tweets, and is used for cross validation of the classification models in the small scale experiment, and for training the classifiers in the real-world large scale experiment. The second data set is a random sample of real-world Twitter data in the US. The third data set is a random sample of real-world Facebook Timeline posts.

Evaluations: Two sets of evaluations are conducted to investigate the proposed model's ability to discover health-related information in the social media domain: small scale and large scale evaluations. The small scale evaluation employs 10-fold cross validation on the labeled data, and aims to tune parameters of the proposed models and to compare them with the state-of-the-art method. The large scale evaluation tests the trained classification models on the native, real-world data sets, and is needed to verify the ability of the proposed model to handle the massive heterogeneity in real-world social media.

Findings: The small scale experiment reveals that the proposed method is able to mitigate the limitations of the well-established techniques in the literature, resulting in a performance improvement of 18.61% (F-measure). The large scale experiment further reveals that the baseline fails to perform well on larger data with higher degrees of heterogeneity, while the proposed method is able to yield reasonably good performance and outperform the baseline by 46.62% (F-measure) on average.
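
To make the two ideas in this abstract concrete (combining several classifiers and scoring with the F-measure), here is a minimal sketch. The abstract does not say how the classifiers' outputs are combined, so simple majority voting is an assumption chosen for illustration, and the function names and toy predictions are hypothetical.

```python
def majority_vote(predictions):
    """Combine per-classifier boolean predictions (True = health-related).

    `predictions` is a list with one list of booleans per classifier.
    A message is labeled positive only if more than half of the classifiers
    vote positive, so ties with an even number of classifiers count as negative.
    """
    n_classifiers = len(predictions)
    return [sum(votes) * 2 > n_classifiers for votes in zip(*predictions)]


def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy outputs of three hypothetical classifiers on four messages.
classifier_outputs = [
    [True, False, True, False],
    [True, True, False, False],
    [False, True, True, False],
]
ensemble = majority_vote(classifier_outputs)
score = f_measure([True, False, True, False], ensemble)
```

In this toy case the ensemble recovers both health-related messages at the cost of one false positive, which the F-measure penalizes through precision.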

Source
http://dx.doi.org/10.1016/j.jbi.2014.03.005 DOI Listing
June 2014

How should social mixing be measured: comparing web-based survey and sensor-based methods.

BMC Infect Dis 2014 Mar 10;14:136. Epub 2014 Mar 10.

Center for Infectious Disease Dynamics, Department of Biology, The Pennsylvania State University, University Park, PA 16802, USA.

Background: Contact surveys and diaries have conventionally been used to measure contact networks in different settings for elucidating infectious disease transmission dynamics of respiratory infections. More recently, technological advances have permitted the use of wireless sensor devices, which can be worn by individuals interacting in a particular social context to record high resolution mixing patterns. To date, a direct comparison of these two different methods for collecting contact data has not been performed.

Methods: We studied the contact network at a United States high school in the spring of 2012. All school members (i.e., students, teachers, and other staff) were invited to wear wireless sensor devices for a single school day, and asked to remember and report the name and duration of all of their close proximity conversational contacts for that day in an online contact survey. We compared the two methods in terms of the resulting network densities, nodal degrees, and degree distributions. We also assessed the correspondence between the methods at the dyadic and individual levels.

Results: We found limited congruence in recorded contact data between the online contact survey and wireless sensors. In particular, there was only negligible correlation between the two methods for nodal degree, and the degree distribution differed substantially between both methods. We found that survey underreporting was a significant source of the difference between the two methods, and that this difference could be improved by excluding individuals who reported only a few contact partners. Additionally, survey reporting was more accurate for contacts of longer duration, and very inaccurate for contacts of shorter duration. Finally, female participants tended to report more accurately than male participants.

Conclusions: Online contact surveys and wireless sensor devices collected incongruent network data from an identical setting. This finding suggests that these two methods cannot be used interchangeably for informing models of infectious disease dynamics.
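
The comparison metrics named in this abstract (network density, nodal degree, and agreement at the dyadic level) can be sketched as follows. This is an illustrative sketch, not the study's analysis code: the function names and the toy networks are assumptions, and Pearson correlation and Jaccard overlap are stand-ins for whichever correspondence measures the authors actually used.

```python
from math import sqrt


def undirected(edges):
    """Collapse (u, v) pairs into a set of undirected edges, dropping self-loops."""
    return {frozenset((u, v)) for u, v in edges if u != v}


def density(nodes, edges):
    """Observed edges divided by the number of possible undirected pairs."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(undirected(edges)) / (n * (n - 1) / 2)


def degrees(nodes, edges):
    """Number of distinct contacts per node; every edge endpoint must be in `nodes`."""
    deg = {node: 0 for node in nodes}
    for edge in undirected(edges):
        for node in edge:
            deg[node] += 1
    return deg


def pearson(xs, ys):
    """Pearson correlation; returns NaN if either sequence has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def dyad_overlap(edges_a, edges_b):
    """Jaccard overlap of the two edge sets (shared dyads / dyads seen by either)."""
    a, b = undirected(edges_a), undirected(edges_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy data: the same four people measured by two hypothetical methods.
people = ["a", "b", "c", "d"]
sensor_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
survey_edges = [("a", "b"), ("a", "c")]

sensor_deg = degrees(people, sensor_edges)
survey_deg = degrees(people, survey_edges)
degree_corr = pearson([sensor_deg[p] for p in people],
                      [survey_deg[p] for p in people])
```

In the toy data the survey network is sparser than the sensor network, mirroring the underreporting pattern described in the results, even though the degree ranking happens to agree.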

Source
http://dx.doi.org/10.1186/1471-2334-14-136 DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3984737 PMC
March 2014

Positive network assortativity of influenza vaccination at a high school: implications for outbreak risk and herd immunity.

PLoS One 2014 5;9(2):e87042. Epub 2014 Feb 5.

Center for Infectious Disease Dynamics, Department of Biology, The Pennsylvania State University, University Park, Pennsylvania, United States of America.

Schools are known to play a significant role in the spread of influenza. High vaccination coverage can reduce infectious disease spread within schools and the wider community through vaccine-induced immunity in vaccinated individuals and through the indirect effects afforded by herd immunity. In general, herd immunity is greatest when vaccination coverage is highest, but clusters of unvaccinated individuals can reduce herd immunity. Here, we empirically assess the extent of such clustering by measuring whether vaccinated individuals are randomly distributed or demonstrate positive assortativity across a United States high school contact network. Using computational models based on these empirical measurements, we further assess the impact of assortativity on influenza disease dynamics. We found that the contact network was positively assortative with respect to influenza vaccination: unvaccinated individuals tended to be in contact more often with other unvaccinated individuals than with vaccinated individuals, and these effects were most pronounced when we analyzed contact data collected over multiple days. Of note, unvaccinated males contributed substantially more than unvaccinated females towards the measured positive vaccination assortativity. Influenza simulation models using a positively assortative network resulted in larger average outbreak size, and outbreaks were more likely, compared to an otherwise identical network where vaccinated individuals were not clustered. These findings highlight the importance of understanding and addressing heterogeneities in seasonal influenza vaccine uptake for prevention of large, protracted school-based outbreaks of influenza, in addition to continued efforts to increase overall vaccine coverage.
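
Assortativity with respect to a categorical attribute such as vaccination status is commonly quantified with Newman's coefficient, r = (sum_i e_ii - sum_i a_i^2) / (1 - sum_i a_i^2), where e_ij is the fraction of edge ends joining category i to category j and a_i is the fraction of edge ends attached to category i. The sketch below implements that formula for an undirected network; whether the study used exactly this estimator is an assumption, and the function name and toy data are hypothetical.

```python
from collections import defaultdict


def attribute_assortativity(edges, attribute):
    """Newman's assortativity coefficient for a categorical node attribute.

    `edges` is a list of undirected (u, v) pairs; `attribute` maps each node
    to its category (for example "vaccinated" / "unvaccinated").
    Returns 1 for perfect assortative mixing, about 0 for random mixing,
    and negative values for disassortative mixing.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("at least one edge is required")

    # Mixing matrix: count each undirected edge in both directions so it is symmetric.
    e = defaultdict(float)
    for u, v in edges:
        cu, cv = attribute[u], attribute[v]
        e[(cu, cv)] += 0.5 / m
        e[(cv, cu)] += 0.5 / m

    categories = {c for pair in e for c in pair}
    trace = sum(e.get((c, c), 0.0) for c in categories)
    a = {c: sum(e.get((c, d), 0.0) for d in categories) for c in categories}
    expected = sum(value ** 2 for value in a.values())
    if 1.0 - expected < 1e-12:
        raise ValueError("undefined when all edge ends share one category")
    return (trace - expected) / (1.0 - expected)


# Toy network: two vaccinated ("v") and two unvaccinated ("u") students.
status = {1: "v", 2: "v", 3: "u", 4: "u"}
clustered = attribute_assortativity([(1, 2), (3, 4)], status)
mixed = attribute_assortativity([(1, 3), (2, 4)], status)
```

A positive value on a real contact network would indicate the kind of clustering of unvaccinated individuals that the abstract reports.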

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0087042 PLOS
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3914803 PMC
November 2014

Influenza A (H7N9) and the importance of digital epidemiology.

N Engl J Med 2013 Aug 3;369(5):401-4. Epub 2013 Jul 3.

Center for Infectious Disease Dynamics, Department of Biology, Pennsylvania State University, University Park, USA.

Source
http://dx.doi.org/10.1056/NEJMp1307752 DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4873163 PMC
August 2013