Publications by authors named "Devis Tuia"

8 Publications


Counting using deep learning regression gives value to ecological surveys.

Sci Rep 2021 Dec 1;11(1):23209. Epub 2021 Dec 1.

Ecole Polytechnique Fédérale de Lausanne (EPFL), 1950, Sion, Switzerland.

Many ecological studies rely on count data and involve manual counting of objects of interest, which is time-consuming and especially disadvantageous when time in the field or lab is limited. However, a growing number of works use digital imagery, which opens opportunities to automatise counting tasks. In this study, we use machine learning to automate counting objects of interest without the need to label individual objects. By leveraging already existing image-level annotations, this approach can also give value to historical data that were collected and annotated over longer time series (typical for many ecological studies) without the aim of deep learning applications. We demonstrate deep learning regression on two fundamentally different counting tasks: (i) daily growth rings from microscopic images of fish otoliths (i.e., hearing stones) and (ii) hauled out seals from highly variable aerial imagery. In the otolith images, our deep learning-based regressor yields an RMSE of 3.40 day-rings and an R² of 0.92. Initial performance on the seal images is lower (RMSE of 23.46 seals and R² of 0.72), which can be attributed to a lack of images with a high number of seals in the initial training set compared to the test set. We then show how to improve performance substantially (RMSE of 19.03 seals and R² of 0.77) by carefully selecting and relabelling just 100 additional training images based on initial model prediction discrepancy. The regression-based approach used here returns accurate counts (R² of 0.92 and 0.77 for the rings and seals, respectively), directly usable in ecological research.
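The approach described above fits in a few lines: a standard CNN backbone with a single regression output, trained on image-level counts with a squared-error loss. The sketch below assumes PyTorch/torchvision; the architecture, hyperparameters, and tensors are illustrative stand-ins, not the paper's exact setup.

```python
# Minimal count-regression sketch: a CNN trained on image-level totals only.
import torch
import torch.nn as nn
from torchvision import models

class CountRegressor(nn.Module):
    """ResNet backbone with a single scalar output for per-image counts."""
    def __init__(self):
        super().__init__()
        self.backbone = models.resnet18(weights=None)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, 1)

    def forward(self, x):
        # ReLU keeps predicted counts non-negative.
        return torch.relu(self.backbone(x)).squeeze(1)

def train_step(model, optimizer, images, counts):
    """One optimisation step: regress predictions onto annotated totals."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(images), counts)
    loss.backward()
    optimizer.step()
    return loss.item()

model = CountRegressor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# images: (B, 3, H, W) tensor; counts: (B,) per-image totals (toy values).
images, counts = torch.randn(4, 3, 224, 224), torch.tensor([12., 40., 7., 3.])
print(train_step(model, optimizer, images, counts))
```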
DOI: http://dx.doi.org/10.1038/s41598-021-02387-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8636638

Social media and deep learning capture the aesthetic quality of the landscape.

Sci Rep 2021 Oct 8;11(1):20000. Epub 2021 Oct 8.

Laboratory of Geo-Information Science and Remote Sensing, Wageningen University, Wageningen, 6708 PB, The Netherlands.

People's recreation and well-being are closely related to their aesthetic enjoyment of the landscape. Ecosystem service (ES) assessments record the aesthetic contributions of landscapes to people's well-being in support of sustainable policy goals. However, the survey methods available to measure these contributions restrict modelling at large scales. As a result, most studies rely on environmental indicator models, but these do not incorporate people's actual use of the landscape. Social media has now emerged as a rich source of information on human-nature interactions, while advances in deep learning have enabled large-scale analysis of the imagery uploaded to these platforms. In this study, we test the accuracy of Flickr and deep learning-based models of landscape quality using a crowdsourced survey in Great Britain. We find that this novel modelling approach achieves a strong level of accuracy, comparable to an indicator model, and that the two capture additional aesthetic information in combination. At the same time, social media provides a direct measure of individuals' aesthetic enjoyment, a point of view inaccessible to indicator models, as well as greater independence from the scale of measurement and insights into how people's appreciation of the landscape changes over time. Our results show how social media and deep learning can support significant advances in modelling the aesthetic contributions of ecosystems for ES assessments.
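As a rough illustration of the kind of evaluation reported here, the snippet below compares two sets of model predictions against crowdsourced survey scores with a rank correlation. All numbers are made-up examples, not the study's data; scipy is assumed.

```python
# Compare deep learning and indicator model predictions against survey scores.
import numpy as np
from scipy.stats import spearmanr

survey = np.array([4.2, 3.1, 4.8, 2.0, 3.9, 1.5])      # crowdsourced ratings
deep_model = np.array([4.0, 3.3, 4.6, 2.4, 3.5, 1.9])  # CNN on Flickr photos
indicator = np.array([3.8, 2.9, 4.1, 2.8, 3.2, 2.2])   # environmental indicators

for name, preds in [("deep learning", deep_model), ("indicator", indicator)]:
    rho, p = spearmanr(survey, preds)
    print(f"{name} model: Spearman rho = {rho:.2f} (p = {p:.3f})")
```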
DOI: http://dx.doi.org/10.1038/s41598-021-99282-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8501120

Wasserstein Adversarial Regularization for learning with label noise.

IEEE Trans Pattern Anal Mach Intell 2021 Jul 7;PP. Epub 2021 Jul 7.

Noisy labels often occur in vision datasets, especially when they are obtained from crowdsourcing or Web scraping. We propose a new regularization method which enables learning robust classifiers in the presence of noisy data. To achieve this goal, we propose a new adversarial regularization scheme based on the Wasserstein distance. Using this distance allows us to take into account specific relations between classes by leveraging the geometric properties of the label space. Our Wasserstein Adversarial Regularization (WAR) encodes a selective regularization which promotes smoothness of the classifier between some classes, while preserving sufficient complexity of the decision boundary between others. We first discuss how and why adversarial regularization can be used in the context of label noise, and then show the effectiveness of our method on five datasets corrupted with noisy labels: in both benchmarks and real datasets, WAR outperforms the state-of-the-art competitors.
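The core idea lends itself to a compact sketch: approximate the Wasserstein distance between class-probability vectors with Sinkhorn iterations under a ground cost defined on the label space, find an input perturbation that maximizes it, and penalize the resulting divergence alongside the usual cross-entropy. The PyTorch code below is a simplified illustration under these assumptions (single-step perturbation, fixed cost matrix), not the authors' exact algorithm.

```python
import torch
import torch.nn.functional as F

def sinkhorn_cost(p, q, C, eps=0.1, iters=50):
    """Entropic OT cost between batches of distributions p, q (B x K),
    with ground cost C (K x K, zero diagonal) relating the classes."""
    G = torch.exp(-C / eps)                    # Gibbs kernel
    u = torch.ones_like(p)
    for _ in range(iters):                     # Sinkhorn fixed-point updates
        v = q / (u @ G + 1e-9)
        u = p / (v @ G.T + 1e-9)
    plan = u.unsqueeze(2) * G.unsqueeze(0) * v.unsqueeze(1)  # (B, K, K)
    return (plan * C.unsqueeze(0)).sum(dim=(1, 2))

def war_loss(model, x, y, C, lam=1.0, step=0.03):
    """Cross-entropy plus a Wasserstein smoothness penalty at an
    adversarially perturbed input (FGSM-style, one step)."""
    logits = model(x)
    p = F.softmax(logits, dim=1)
    x_adv = x.detach().clone().requires_grad_(True)
    w = sinkhorn_cost(F.softmax(model(x_adv), dim=1), p.detach(), C).mean()
    grad, = torch.autograd.grad(w, x_adv)
    x_adv = (x_adv + step * grad.sign()).detach()  # maximises the penalty
    reg = sinkhorn_cost(F.softmax(model(x_adv), dim=1), p.detach(), C).mean()
    return F.cross_entropy(logits, y) + lam * reg

# Toy usage: 3 classes, a linear model, and a simple label-geometry cost.
model = torch.nn.Linear(8, 3)
C = torch.tensor([[0., 1., 2.], [1., 0., 1.], [2., 1., 0.]])
x, y = torch.randn(16, 8), torch.randint(0, 3, (16,))
print(war_loss(model, x, y, C).item())
```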
DOI: http://dx.doi.org/10.1109/TPAMI.2021.3094662

Optimal Transport for Domain Adaptation.

IEEE Trans Pattern Anal Mach Intell 2017 Sep;39(9):1853-1865. Epub 2016 Oct 7.

Domain adaptation is one of the most challenging tasks of modern data analytics. If the adaptation is done correctly, models built on a specific data representation become more robust when confronted with data depicting the same classes but described by another observation system. Among the many strategies proposed, finding domain-invariant representations has shown excellent properties, in particular since it allows training a unique classifier that is effective in all domains. In this paper, we propose a regularized unsupervised optimal transportation model to align the representations in the source and target domains. We learn a transportation plan matching both PDFs, which constrains labeled samples of the same class in the source domain to remain close during transport. This way, we exploit at the same time the labeled samples in the source domain and the distributions observed in both domains. Experiments on toy and challenging real visual adaptation examples show the interest of the method, which consistently outperforms state-of-the-art approaches. In addition, numerical experiments show that our approach leads to better performances on domain-invariant deep learning features and can be easily adapted to the semi-supervised case, where few labeled samples are available in the target domain.
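The alignment step has a compact form when written with the POT (Python Optimal Transport) library, whose ot.da module also provides class-regularized transports in the spirit of this paper. The sketch below shows only the unregularized entropic core on synthetic data: estimate a transport plan between the empirical source and target distributions, then move source samples by the barycentric projection before training on them.

```python
# OT-based domain adaptation sketch using POT (pip install pot).
import numpy as np
import ot

rng = np.random.default_rng(0)
Xs = rng.normal(0, 1, (100, 2))          # source samples
Xt = rng.normal(3, 1, (120, 2))          # target samples (shifted domain)

a = np.full(100, 1 / 100)                # uniform source weights
b = np.full(120, 1 / 120)                # uniform target weights
M = ot.dist(Xs, Xt)                      # squared Euclidean cost matrix

plan = ot.sinkhorn(a, b, M, reg=1.0)     # entropic transport plan (100 x 120)
# Barycentric mapping: each source point moves to the plan-weighted
# average of the target points it is coupled with.
Xs_mapped = plan @ Xt / plan.sum(axis=1, keepdims=True)
print(Xs.mean(axis=0), Xs_mapped.mean(axis=0))  # mapped mean ~ target mean
```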
DOI: http://dx.doi.org/10.1109/TPAMI.2016.2615921

Combining Human Computing and Machine Learning to Make Sense of Big (Aerial) Data for Disaster Response.

Big Data 2016 Mar;4(1):47-59. Epub 2016 Feb 26.

Laboratory of Geographical Information Systems (LASIG), School of Architecture, Civil and Environmental Engineering (ENAC), Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.

Aerial imagery captured via unmanned aerial vehicles (UAVs) is playing an increasingly important role in disaster response. Unlike satellite imagery, aerial imagery can be captured and processed within hours rather than days. In addition, the spatial resolution of aerial imagery is an order of magnitude higher than the imagery produced by the most sophisticated commercial satellites today. Both the United States Federal Emergency Management Agency (FEMA) and the European Commission's Joint Research Centre (JRC) have noted that aerial imagery will inevitably present a big data challenge. The purpose of this article is to get ahead of this future challenge by proposing a hybrid crowdsourcing and real-time machine learning solution to rapidly process large volumes of aerial data for disaster response in a time-sensitive manner. Crowdsourcing can be used to annotate features of interest in aerial images (such as damaged shelters and roads blocked by debris). These human-annotated features can then be used to train a supervised machine learning system to recognize such features in new, unseen images. In this article, we describe how this hybrid solution for image analysis can be implemented as a module (i.e., Aerial Clicker) to extend an existing platform called Artificial Intelligence for Disaster Response (AIDR), which has already been deployed to classify microblog messages during disasters using its Text Clicker module, for example in response to Cyclone Pam, a category 5 cyclone that devastated Vanuatu in March 2015. The hybrid solution we present can be applied to both aerial and satellite imagery and has applications beyond disaster response, such as wildlife protection, human rights, and archeological exploration. As a proof of concept, we recently piloted this solution using very high-resolution aerial photographs of a wildlife reserve in Namibia to support rangers with their wildlife conservation efforts (SAVMAP project, http://lasig.epfl.ch/savmap). The results suggest that the platform we have developed to combine crowdsourcing and machine learning to make sense of large volumes of aerial images can be used for disaster response.
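A toy version of the crowdsourcing-to-classifier loop might look as follows: aggregate several volunteers' labels per image tile by majority vote, then fit a supervised model on the consensus. Everything here (features, vote counts, labels) is synthetic and stands in for real image descriptors and Aerial Clicker annotations; scikit-learn is assumed.

```python
# Majority-vote aggregation of crowd labels, then supervised training.
import numpy as np
from collections import Counter
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
features = rng.normal(size=(200, 16))            # per-tile image descriptors
# Three volunteer labels per tile ("damaged" = 1, "intact" = 0), with noise.
truth = (features[:, 0] > 0).astype(int)
votes = np.stack([np.where(rng.random(200) < 0.8, truth, 1 - truth)
                  for _ in range(3)], axis=1)

consensus = np.array([Counter(v).most_common(1)[0][0] for v in votes])
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(features[:150], consensus[:150])         # train on crowd consensus
print("held-out accuracy:", clf.score(features[150:], truth[150:]))
```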
DOI: http://dx.doi.org/10.1089/big.2014.0064

Kernel Manifold Alignment for Domain Adaptation.

PLoS One 2016;11(2):e0148655. Epub 2016 Feb 12.

Image Processing Laboratory, Universitat de València, València, Spain.

The wealth of sensory data coming from different modalities has opened numerous opportunities for data analysis. The data are of increasing volume, complexity and dimensionality, thus calling for new methodological innovations towards multimodal data processing. However, multimodal architectures must rely on models able to adapt to changes in the data distribution. Differences in the density functions can be due to changes in acquisition conditions (pose, illumination), sensor characteristics (number of channels, resolution) or different views (e.g. street-level vs. aerial views of the same building). We call these different acquisition modes domains, and refer to the adaptation problem as domain adaptation. In this paper, instead of adapting the trained models themselves, we focus on finding mappings of the data sources into a common, semantically meaningful, representation domain. This field of manifold alignment extends traditional techniques in statistics, such as canonical correlation analysis (CCA), to deal with nonlinear adaptation and possibly non-corresponding data pairs between the domains. We introduce a kernel method for manifold alignment (KEMA) that can match an arbitrary number of data sources without needing corresponding pairs, just a few labeled examples in all domains. KEMA has interesting properties: 1) it generalizes other manifold alignment methods, 2) it can align manifolds of very different complexities, performing a discriminative alignment that preserves each manifold's inner structure, 3) it can define a domain-specific metric to cope with multimodal specificities, 4) it can align data spaces of different dimensionality, 5) it is robust to strong nonlinear feature deformations, and 6) it is closed-form invertible, which allows transfer across domains and data synthesis. To the authors' knowledge, this is the first method addressing all these important issues at once. We also present a reduced-rank version of KEMA for computational efficiency and discuss the generalization performance of KEMA under Rademacher principles of stability. Aligning multimodal data with KEMA brings outstanding benefits when used as a pre-conditioning step in the standard data analysis processing chain. KEMA exhibits very good performance over competing methods in synthetic controlled examples, visual object recognition and recognition of facial expression tasks. KEMA is especially well-suited to deal with high-dimensional problems, such as images and videos, and under complicated distortions, twists and warpings of the data manifolds. A fully functional toolbox is available at https://github.com/dtuia/KEMA.git.
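A much-reduced, two-domain sketch of the idea can be written with standard scientific Python: build a block-diagonal kernel over both domains (so domains of different dimensionality never compare features directly) and solve a generalized eigenproblem whose Laplacians pull same-class samples together and push different-class samples apart. This simplification omits the geometry-preserving term and other details of the actual KEMA formulation; see the toolbox linked above for the full method.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.metrics.pairwise import rbf_kernel

def align_two_domains(X1, y1, X2, y2, n_components=2, mu=1e-3):
    """Project two labeled domains into a shared latent space."""
    n1 = len(X1)
    n = n1 + len(X2)
    K = np.zeros((n, n))                           # block-diagonal joint kernel
    K[:n1, :n1] = rbf_kernel(X1, X1)
    K[n1:, n1:] = rbf_kernel(X2, X2)
    y = np.concatenate([y1, y2])
    Ws = (y[:, None] == y[None, :]).astype(float)  # same-class affinity
    Wd = 1.0 - Ws                                  # different-class affinity
    Ls = np.diag(Ws.sum(1)) - Ws                   # "attract" Laplacian
    Ld = np.diag(Wd.sum(1)) - Wd                   # "repel" Laplacian
    A = K @ Ls @ K + mu * np.eye(n)                # regularised for stability
    B = K @ Ld @ K + mu * np.eye(n)
    # Smallest generalized eigenvectors minimise the attract/repel ratio.
    _, vecs = eigh(A, B)
    return K @ vecs[:, :n_components]              # latent coordinates

rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(40, 3)), rng.normal(size=(40, 5))  # different dims
Z = align_two_domains(X1, (X1[:, 0] > 0).astype(int),
                      X2, (X2[:, 0] > 0).astype(int))
print(Z.shape)  # (80, 2): all samples now live in one 2-D space
```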
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0148655
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4752280

Principal polynomial analysis.

Int J Neural Syst 2014 Nov;24(7):1440007. Epub 2014 Aug 17.

Image Processing Laboratory (IPL), Universitat de València, 46980 Paterna, València, Spain.

This paper presents a new framework for manifold learning based on a sequence of principal polynomials that capture the possibly nonlinear nature of the data. The proposed Principal Polynomial Analysis (PPA) generalizes PCA by modeling the directions of maximal variance by means of curves instead of straight lines. Contrary to previous approaches, PPA reduces to performing simple univariate regressions, which makes it computationally feasible and robust. Moreover, PPA has a number of interesting analytical properties. First, PPA is a volume-preserving map, which in turn guarantees the existence of the inverse. Second, such an inverse can be obtained in closed form. Invertibility is an important advantage over other learning methods, because it permits understanding the identified features in the input domain, where the data has physical meaning. It also allows evaluating the performance of dimensionality reduction in sensible (input-domain) units. Volume preservation also allows an easy computation of information-theoretic quantities, such as the reduction in multi-information after the transform. Third, the analytical nature of PPA leads to a clear geometrical interpretation of the manifold: it allows the computation of Frenet-Serret frames (local features) and of generalized curvatures at any point of the space. And fourth, the analytical Jacobian allows the computation of the metric induced by the data, thus generalizing the Mahalanobis distance. These properties are demonstrated theoretically and illustrated experimentally. The performance of PPA is evaluated in dimensionality and redundancy reduction, on both synthetic and real datasets from the UCI repository.
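A toy sketch of the first PPA step, under simplifying assumptions, illustrates the "curves instead of straight lines" idea: project onto the leading principal direction, then fit a univariate polynomial that predicts the orthogonal complement from that projection, so the principal curve bends with the data where PCA would stay straight.

```python
# First PPA deflation step on a noisy parabola (numpy only).
import numpy as np

rng = np.random.default_rng(0)
t = rng.uniform(-2, 2, 300)
X = np.column_stack([t, 0.5 * t**2]) + rng.normal(0, 0.1, (300, 2))
X = X - X.mean(0)

# Leading principal direction (ordinary PCA step).
_, _, Vt = np.linalg.svd(X, full_matrices=False)
alpha = Vt[0]                      # direction of maximal variance
proj = X @ alpha                   # 1-D coordinates along that direction
resid = X - np.outer(proj, alpha)  # orthogonal complement of each sample

# Univariate polynomial regression of the residual on the projection:
# this curve is the "principal polynomial" replacing PCA's straight line.
coeffs = [np.polyfit(proj, resid[:, j], deg=2) for j in range(2)]
pred = np.column_stack([np.polyval(c, proj) for c in coeffs])
print("residual variance before/after polynomial:",
      resid.var(), (resid - pred).var())
```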
DOI: http://dx.doi.org/10.1142/S0129065714400073

Indoor radon distribution in Switzerland: lognormality and Extreme Value Theory.

J Environ Radioact 2008 Apr;99(4):649-57. Epub 2007 Oct 26.

Institute of Geomatics and Analysis of Risk (IGAR), University of Lausanne, CH-1015 Lausanne, Switzerland.

Analysis and modeling of the statistical distributions of indoor radon concentration, from data valorization to mapping and simulations, are critical issues for real decision-making processes. The usual way to model indoor radon concentrations is to assume lognormal distributions of concentrations on a given territory. While these distributions usually model the main body of the data density correctly, they cannot model the extreme values, which are more important for risk assessment. In this paper, global and local indoor radon distributions are modeled using Extreme Value Theory (EVT). Emphasis is put on the tails of the distributions and their deviations from lognormality. The best fits of the distributions to real data densities have been computed, and goodness of fit is evaluated with the Root Mean Squared Error (RMSE). The results show that EVT performs better than the lognormal pdf for real datasets characterized by high indoor radon concentrations.
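A hedged sketch of this comparison, using synthetic data in place of the Swiss measurements: fit a lognormal model to the full sample and a Generalized Pareto model (the EVT distribution for threshold exceedances) to the tail, then compare both fits against the empirical tail density by RMSE.

```python
# Lognormal vs. Generalized Pareto tail fits, compared by RMSE (scipy).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
radon = rng.lognormal(mean=4.0, sigma=0.8, size=5000)      # Bq/m^3 stand-in

# Lognormal fit on all data vs. GPD fit on exceedances over a threshold.
shape_ln, loc_ln, scale_ln = stats.lognorm.fit(radon, floc=0)
u = np.quantile(radon, 0.95)                               # tail threshold
c, loc_gp, scale_gp = stats.genpareto.fit(radon[radon > u] - u, floc=0)

# Empirical tail density vs. both fitted conditional densities.
hist, edges = np.histogram(radon[radon > u], bins=30, density=True)
mids = (edges[:-1] + edges[1:]) / 2
pdf_ln = stats.lognorm.pdf(mids, shape_ln, loc_ln, scale_ln) / \
         stats.lognorm.sf(u, shape_ln, loc_ln, scale_ln)   # renormalised tail
pdf_gp = stats.genpareto.pdf(mids - u, c, loc_gp, scale_gp)

for name, pdf in [("lognormal", pdf_ln), ("GPD (EVT)", pdf_gp)]:
    print(name, "tail RMSE:", np.sqrt(np.mean((hist - pdf) ** 2)))
```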
DOI: http://dx.doi.org/10.1016/j.jenvrad.2007.09.004