3,832 results match your criteria: IEEE Transactions on Pattern Analysis and Machine Intelligence [Journal]


Hyperbolic Wasserstein Distance for Shape Indexing.

Authors: Jie Shi, Yalin Wang

IEEE Trans Pattern Anal Mach Intell 2019 Feb 8. Epub 2019 Feb 8.

Shape space is an active research topic in computer vision and medical imaging fields. The distance defined in a shape space may provide a simple and refined index to represent a unique shape. This work studies the Wasserstein space and proposes a novel framework to compute the Wasserstein distance between general topological surfaces by integrating hyperbolic Ricci flow, hyperbolic harmonic map, and hyperbolic power Voronoi diagram algorithms.

http://dx.doi.org/10.1109/TPAMI.2019.2898400
February 2019

End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning.

IEEE Trans Pattern Anal Mach Intell 2019 Feb 14. Epub 2019 Feb 14.

We study active object tracking, where a tracker takes visual observations (i.e., frame sequences) as inputs and produces the corresponding camera control signals as outputs (e. …

http://dx.doi.org/10.1109/TPAMI.2019.2899570
February 2019

Skeleton-Based Online Action Prediction Using Scale Selection Network.

IEEE Trans Pattern Anal Mach Intell 2019 Feb 12. Epub 2019 Feb 12.

Action prediction aims to recognize the class label of an ongoing activity when only a part of it is observed. In this paper, we focus on online action prediction in streaming 3D skeleton sequences. A dilated convolutional network is introduced to model the motion dynamics in the temporal dimension via a sliding window over the temporal axis.

https://ieeexplore.ieee.org/document/8640046/
http://dx.doi.org/10.1109/TPAMI.2019.2898954
February 2019

Multi-view Supervision for Single-view Reconstruction via Differentiable Ray Consistency.

IEEE Trans Pattern Anal Mach Intell 2019 Feb 12. Epub 2019 Feb 12.

We study the notion of consistency between a 3D shape and a 2D observation and propose a differentiable formulation which allows computing gradients of the 3D shape given an observation from an arbitrary view. We do so by reformulating view consistency using a differentiable ray consistency (DRC) term. We show that this formulation can be incorporated in a learning framework to leverage different types of multi-view observations e. …

http://dx.doi.org/10.1109/TPAMI.2019.2898859
February 2019

Min-Entropy Latent Model for Weakly Supervised Object Detection.

IEEE Trans Pattern Anal Mach Intell 2019 Feb 12. Epub 2019 Feb 12.

Weakly supervised object detection is a challenging task when provided with image category supervision but required to learn, at the same time, object locations and object detectors. The inconsistency between the weak supervision and learning objectives introduces significant randomness to object locations and ambiguity to detectors. In this paper, a min-entropy latent model (MELM) is proposed for weakly supervised object detection.

http://dx.doi.org/10.1109/TPAMI.2019.2898858
February 2019

Measuring Shapes with Desired Convex Polygons.

IEEE Trans Pattern Anal Mach Intell 2019 Feb 12. Epub 2019 Feb 12.

In this paper we have developed a family of shape measures. All the measures from the family evaluate the degree to which a shape looks like a predefined convex polygon. A fairly new approach to designing object shape-based measures has been applied.

http://dx.doi.org/10.1109/TPAMI.2019.2898830
February 2019

EuroCity Persons: A Novel Benchmark for Person Detection in Traffic Scenes.

IEEE Trans Pattern Anal Mach Intell 2019 Feb 5. Epub 2019 Feb 5.

Big data has had a great share in the success of deep learning in computer vision. Recent works suggest that there is significant further potential to increase object detection performance by utilizing even bigger datasets. In this paper, we introduce the EuroCity Persons dataset, which provides a large number of highly diverse, accurate and detailed annotations of pedestrians, cyclists and other riders in urban traffic scenes.

http://dx.doi.org/10.1109/TPAMI.2019.2897684
February 2019

View Adaptive Neural Networks for High Performance Skeleton-based Human Action Recognition.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 31. Epub 2019 Jan 31.

Skeleton-based human action recognition has recently attracted increasing attention thanks to the accessibility and the popularity of 3D skeleton data. One of the key challenges in skeleton-based action recognition lies in the large view variations when capturing data. In order to alleviate the effects of view variations, this paper introduces a novel view adaptation scheme, which automatically determines the virtual observation viewpoints in a learning-based, data-driven manner.

http://dx.doi.org/10.1109/TPAMI.2019.2896631
January 2019

Hierarchical Surface Prediction.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 30. Epub 2019 Jan 30.

Recently, Convolutional Neural Networks have shown promising results for 3D geometry prediction. They can make predictions from very little input data such as a single color image. A major limitation of such approaches is that they only predict a coarse resolution voxel grid, which does not capture the surface of the objects well.

http://dx.doi.org/10.1109/TPAMI.2019.2896296
January 2019

Calibrating Classification Probabilities with Shape-restricted Polynomial Regression.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 28. Epub 2019 Jan 28.

In many real-world classification problems, accurate prediction of membership probabilities is critical for further decision making. The probability calibration problem studies how to map scores obtained from one classification algorithm to membership probabilities. The requirement of non-decreasingness for this mapping involves an infinite number of inequality constraints, which makes its estimation computationally intractable.

http://dx.doi.org/10.1109/TPAMI.2019.2895794
January 2019

Joint Rain Detection and Removal from a Single Image with Contextualized Deep Networks.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 28. Epub 2019 Jan 28.

Rain streaks, particularly in heavy rain, not only degrade visibility but also make many computer vision algorithms fail to function properly. In this paper, we address this visibility problem by focusing on single-image rain removal, even in the presence of dense rain streaks and rain-streak accumulation, which is visually similar to mist or fog. To achieve this, we introduce a new rain model and a deep learning architecture.

http://dx.doi.org/10.1109/TPAMI.2019.2895793
January 2019

Training Faster by Separating Modes of Variation in Batch-normalized Models.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 28. Epub 2019 Jan 28.

Batch Normalization (BN) is essential to effectively train state-of-the-art deep Convolutional Neural Networks (CNN). It normalizes the layer outputs during training using the statistics of each mini-batch. BN accelerates the training procedure by allowing the safe use of large learning rates and alleviates the need for careful initialization of the parameters.

http://dx.doi.org/10.1109/TPAMI.2019.2895781
January 2019
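
As a rough illustration of the normalization step that the abstract above describes (layer outputs normalized with the statistics of each mini-batch), here is a minimal pure-Python sketch. It shows only standard batch normalization, not the method proposed in the paper, and it omits the learnable scale and shift parameters. The function name `batch_norm` and the `eps` value are assumptions chosen for this example.

```python
import math


def batch_norm(batch, eps=1e-5):
    """Normalize each feature (column) of a mini-batch to roughly zero mean, unit variance.

    `batch` is a list of rows, each row a list of feature values.
    `eps` is a small constant (assumed value) that avoids division by zero.
    """
    n = len(batch)
    dims = len(batch[0])
    # Per-feature mean and (biased) variance over the mini-batch.
    means = [sum(row[j] for row in batch) / n for j in range(dims)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dims)
    ]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dims)]
        for row in batch
    ]


# Two features on very different scales end up on a comparable scale.
mini_batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm(mini_batch)
```

After normalization both columns hold approximately the same values, which is one intuition for why the choice of learning rate becomes less sensitive to the scale of the layer outputs.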

Absent Multiple Kernel Learning Algorithms.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 28. Epub 2019 Jan 28.

Multiple kernel learning (MKL) has been intensively studied during the past decade. It optimally combines the multiple channels of each sample to improve classification performance. However, existing MKL algorithms cannot effectively handle the situation where some channels of the samples are missing, which is not uncommon in practical applications.

http://dx.doi.org/10.1109/TPAMI.2019.2895608
January 2019

Unsupervised Video Matting via Sparse and Low-Rank Representation.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 25. Epub 2019 Jan 25.

A novel method, unsupervised video matting via sparse and low-rank representation, is proposed which can achieve high quality in a variety of challenging examples featuring illumination changes, feature ambiguity, topology changes, transparency variation, dis-occlusion, fast motion and motion blur. Previous matting methods introduced a nonlocal prior to search samples for estimating the alpha matte, and have achieved impressive results on some data. However, on one hand, searching inadequate or excessive samples may miss good samples or introduce noise; on the other hand, it is difficult to construct consistent nonlocal structures for pixels with similar features, yielding inconsistent video matte. In this paper, we propose a novel video matting method to achieve spatially and temporally consistent matting results.

http://dx.doi.org/10.1109/TPAMI.2019.2895331
January 2019

Feature Boosting Network For 3D Pose Estimation.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 22. Epub 2019 Jan 22.

In this paper, a feature boosting network is proposed for estimating 3D hand pose and 3D body pose from a single RGB image. In this method, the features learned by the convolutional layers are boosted with a new long short-term dependence-aware (LSTD) module, which enables the intermediate convolutional feature maps to perceive the graphical long short-term dependency among different hand (or body) parts using the designed Graphical ConvLSTM. Learning a set of features that are reliable and discriminatively representative of the pose of a hand (or body) part is difficult due to the ambiguities, texture and illumination variation, and self-occlusion in the real application of 3D pose estimation.

http://dx.doi.org/10.1109/TPAMI.2019.2894422
January 2019
1.614 Impact Factor

Rolling Shutter Camera Absolute Pose.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 22. Epub 2019 Jan 22.

We present minimal, non-iterative solutions to the absolute pose problem for images from rolling shutter cameras. The absolute pose problem is a key problem in computer vision, and rolling shutter is present in a vast majority of today's digital cameras. We discuss several camera motion models and propose two feasible rolling shutter camera models for a polynomial solver.

http://dx.doi.org/10.1109/TPAMI.2019.2894395
January 2019

Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 22. Epub 2019 Jan 22.

We investigate two crucial and closely related aspects of CNNs for optical flow estimation: models and training. First, we design a compact but effective CNN model, called PWC-Net, according to simple and well-established principles: pyramidal processing, warping, and cost volume processing. PWC-Net is 17 times smaller in size, 2 times faster in inference, and 11% more accurate on Sintel final than the recent FlowNet2 model.

http://dx.doi.org/10.1109/TPAMI.2019.2894353
January 2019

Structured Label Inference for Visual Understanding.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 16. Epub 2019 Jan 16.

Visual data such as images and videos contain a rich source of structured semantic labels as well as a wide range of interacting components. Visual content could be assigned fine-grained labels describing major components, coarse-grained labels depicting high-level abstractions, or a set of labels revealing attributes. Such categorization over different, interacting layers of labels evinces the potential for a graph-based encoding of label information.

http://dx.doi.org/10.1109/TPAMI.2019.2893215
January 2019

Hierarchical LSTMs with Adaptive Attention for Visual Captioning.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 21. Epub 2019 Jan 21.

Recent progress has been made in using attention-based encoder-decoder frameworks for image and video captioning. Most existing decoders apply the attention mechanism to every generated word including both visual words (e.g. …

https://ieeexplore.ieee.org/document/8620348/
http://dx.doi.org/10.1109/TPAMI.2019.2894139
January 2019

Heterogeneous Recommendation via Deep Low-rank Sparse Collective Factorization.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 21. Epub 2019 Jan 21.

Real-world recommenders usually make use of heterogeneous types of user feedback, for example binary ratings such as likes and dislikes and numerical ratings such as 5-star grades. In this work, we focus on transferring knowledge from binary ratings to numerical ratings, which face a more serious data sparsity problem. Conventional Collective Factorization methods usually assume that multiple domains share some common latent information across users and items.

http://dx.doi.org/10.1109/TPAMI.2019.2894137
January 2019

Pixel Transposed Convolutional Networks.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 18. Epub 2019 Jan 18.

Transposed convolutional layers have been widely used in a variety of deep models for up-sampling, including encoder-decoder networks for semantic segmentation and deep generative models for unsupervised learning. One of the key limitations of transposed convolutional operations is that they result in the so-called checkerboard problem. This is caused by the fact that no direct relationship exists among adjacent pixels on the output feature map.

http://dx.doi.org/10.1109/TPAMI.2019.2893965
January 2019
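
The checkerboard effect mentioned in the abstract above can be seen in a toy 1D setting. The sketch below is an illustrative plain transposed convolution, not the pixel transposed convolutional layer the paper proposes. The function name `transposed_conv1d` and the kernel values are assumptions made for this example. With stride 2 and a kernel of length 2, even output positions depend only on the first kernel weight and odd positions only on the second, so neighbouring outputs are produced independently.

```python
def transposed_conv1d(x, kernel, stride):
    """Plain 1D transposed convolution (no padding), written out explicitly.

    Each input value is multiplied by every kernel weight and the products
    are accumulated at output position i * stride + k.
    """
    out_len = (len(x) - 1) * stride + len(kernel)
    out = [0.0] * out_len
    for i, value in enumerate(x):
        for k, weight in enumerate(kernel):
            out[i * stride + k] += value * weight
    return out


# A perfectly flat input still yields an alternating output, because
# adjacent output pixels come from different, unrelated kernel weights.
signal = [1.0, 1.0, 1.0]
upsampled = transposed_conv1d(signal, kernel=[1.0, 3.0], stride=2)
print(upsampled)  # [1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
```

In two dimensions the same alternation appears along both axes, which gives the pattern its checkerboard name.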

Shared Multi-view Data Representation for Multi-domain Event Detection.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 18. Epub 2019 Jan 18.

Internet platforms provide new ways for people to share experiences, generating massive amounts of data related to various real-world concepts. In this paper, we present an event detection framework to discover real-world events from multiple data domains, including online news media and social media. As multi-domain data possess multiple data views that are heterogeneous, initial dictionaries consisting of labeled data samples are exploited to align the multi-view data.

http://dx.doi.org/10.1109/TPAMI.2019.2893953
January 2019

Large-scale Urban Reconstruction with Tensor Clustering and Global Boundary Refinement.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 18. Epub 2019 Jan 18.

Accurate and efficient methods for large-scale urban reconstruction are of significant importance to the computer vision and computer graphics communities. Although rapid acquisition techniques such as airborne LiDAR have been around for many years, creating a useful and functional virtual environment from such data remains difficult and labor-intensive. This is due largely to the need in present solutions for data-dependent, user-defined parameters.

http://dx.doi.org/10.1109/TPAMI.2019.2893671
January 2019

Light Field Super-Resolution using a Low-Rank Prior and Deep Convolutional Neural Networks.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 21. Epub 2019 Jan 21.

Light field imaging has recently seen renewed interest due to the availability of practical light field capturing systems that offer a wide range of applications in the field of computer vision. However, capturing high-resolution light fields remains technologically challenging since the increase in angular resolution is often accompanied by a significant reduction in spatial resolution. This paper describes a learning-based spatial light field super-resolution method that allows the restoration of the entire light field with consistency across all angular views.

http://dx.doi.org/10.1109/TPAMI.2019.2893666
January 2019

RefineNet: Multi-Path Refinement Networks for Dense Prediction.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 18. Epub 2019 Jan 18.

Recently, very deep convolutional neural networks (CNNs) have shown outstanding performance in object recognition and have also been the first choice for dense prediction problems such as semantic segmentation and depth estimation. However, repeated sub-sampling operations like pooling or convolution striding in deep CNNs lead to a significant decrease in the initial image resolution. Here, we present RefineNet, a generic multi-path refinement network that explicitly exploits all the information available along the down-sampling process to enable high-resolution prediction using long-range residual connections.

http://dx.doi.org/10.1109/TPAMI.2019.2893630
January 2019

LCR-Net++: Multi-person 2D and 3D Pose Detection in Natural Images.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 14. Epub 2019 Jan 14.

We propose an end-to-end architecture for joint 2D and 3D human pose estimation in natural images. Key to our approach is the generation and scoring of a number of pose proposals per image, which allows us to predict 2D and 3D poses of multiple people simultaneously. Hence, our approach does not require an approximate localization of the humans for initialization.

http://dx.doi.org/10.1109/TPAMI.2019.2892985
January 2019

3D Human Pose Machines with Self-supervised Learning.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 14. Epub 2019 Jan 14.

Driven by recent computer vision and robotic applications, recovering 3D human poses has become increasingly important and attracted growing interest. In fact, completing this task is quite challenging due to the diverse appearances, viewpoints, occlusions and inherent geometric ambiguities inside monocular images. Most of the existing methods focus on designing some elaborate priors/constraints to directly regress 3D human poses based on the corresponding 2D human pose-aware features or 2D pose predictions.

http://dx.doi.org/10.1109/TPAMI.2019.2892452
January 2019

Multiple Kernel k-means with Incomplete Kernels.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 14. Epub 2019 Jan 14.

Existing multiple kernel clustering (MKC) algorithms cannot efficiently address the situation where some rows and columns of base kernel matrices are absent. This paper proposes two simple yet effective algorithms to address this issue. Different from existing approaches where incomplete kernel matrices are first imputed and a standard MKC algorithm is applied to the imputed kernel matrices, our first algorithm integrates imputation and clustering into a unified learning procedure.

https://ieeexplore.ieee.org/document/8611131/
http://dx.doi.org/10.1109/TPAMI.2019.2892416
January 2019

Minimal Case Relative Pose Computation using Ray-Point-Ray Features.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 14. Epub 2019 Jan 14.

Corners are popular features for relative pose computation with 2D-2D point correspondences. Stable corners may be formed by two 3D rays sharing a common starting point. We call such elements ray-point-ray (RPR) structures.

https://ieeexplore.ieee.org/document/8611140/
http://dx.doi.org/10.1109/TPAMI.2019.2892372
January 2019

Fast Cross-Validation for Kernel-based Algorithms.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 14. Epub 2019 Jan 14.

Cross-validation (CV) is a widely adopted approach for selecting the optimal model. However, computing the empirical cross-validation error (CVE) is expensive because the learner must be trained multiple times. In this paper, we develop a novel approximation theory of CVE and present an approximate approach to CV based on the Bouligand influence function (BIF) for kernel-based algorithms.

http://dx.doi.org/10.1109/TPAMI.2019.2892371
January 2019
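
To make the cost mentioned in the abstract above concrete, the following minimal sketch computes a plain k-fold cross-validation error and counts how many times the learner is fitted. It illustrates ordinary CV only, not the BIF-based approximation the paper develops. The toy mean-predictor learner and all function names here are assumptions for this example.

```python
def fit_mean(train):
    """Toy learner (assumed for illustration): predict the mean training target."""
    return sum(y for _, y in train) / len(train)


def predict_mean(model, x):
    """The toy model ignores the input and returns the stored mean."""
    return model


def k_fold_cv_error(data, k, fit, predict):
    """Return (mean squared CV error, number of training runs) for k folds."""
    total_error = 0.0
    fits = 0
    for fold in range(k):
        # Sample i belongs to fold i % k; all other samples form the training set.
        train = [p for i, p in enumerate(data) if i % k != fold]
        test = [p for i, p in enumerate(data) if i % k == fold]
        model = fit(train)  # one full training run per fold
        fits += 1
        total_error += sum((predict(model, x) - y) ** 2 for x, y in test)
    return total_error / len(data), fits


data = [(0, 1.0), (1, 2.0), (2, 3.0), (3, 4.0)]
cv_error, num_fits = k_fold_cv_error(data, k=4, fit=fit_mean, predict=predict_mean)
```

With `k` folds the learner is trained `k` times, so an expensive learner such as a kernel method makes exact CV costly, which is the motivation the abstract gives for an approximation.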

Online Meta Adaptation for Fast Video Object Segmentation.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 14. Epub 2019 Jan 14.

Conventional deep neural network-based video object segmentation (VOS) methods are dominated by heavily fine-tuning a segmentation model on the first frame of a given video, which is time-consuming and inefficient. In this paper, we propose a novel method which rapidly adapts a base segmentation model to new video sequences with only a couple of model-update iterations, without sacrificing performance. Such attractive efficiency benefits from the meta-learning paradigm which leads to a meta-segmentation model and a novel continuous learning approach which enables online adaptation of the segmentation model.

https://ieeexplore.ieee.org/document/8611188/
http://dx.doi.org/10.1109/TPAMI.2018.2890659
January 2019

On Detection of Faint Edges in Noisy Images.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 10. Epub 2019 Jan 10.

A fundamental question for edge detection in noisy images is how faint an edge can be and still be detected. In this paper we offer a formalism to study this question and subsequently introduce computationally efficient multiscale edge detection algorithms designed to detect faint edges in noisy images. In our formalism we view edge detection as a search in a discrete, though potentially large, set of feasible curves.

http://dx.doi.org/10.1109/TPAMI.2019.2892134
January 2019

Tensor Robust Principal Component Analysis with A New Tensor Nuclear Norm.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 9. Epub 2019 Jan 9.

In this paper, we consider the Tensor Robust Principal Component Analysis (TRPCA) problem, which aims to exactly recover the low-rank and sparse components from their sum. Our model is based on the recently proposed tensor-tensor product (or t-product) [15]. Induced by the t-product, we first rigorously deduce the tensor spectral norm, tensor nuclear norm, and tensor average rank, and show that the tensor nuclear norm is the convex envelope of the tensor average rank within the unit ball of the tensor spectral norm.

http://dx.doi.org/10.1109/TPAMI.2019.2891760
January 2019

Visibility graphs for image processing.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 9. Epub 2019 Jan 9.

The family of image visibility graphs (IVG/IHVG) has been recently introduced as simple algorithms by which scalar fields can be mapped into graphs. Here we explore the usefulness of such operators in the scenario of image processing and image classification. We demonstrate that the link architecture of the image visibility graphs encapsulates relevant information on the structure of the images and we explore their potential as image filters.

http://dx.doi.org/10.1109/TPAMI.2019.2891742
January 2019

Hierarchical Bayesian Inverse Lighting of Portraits with a Virtual Light Stage.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 9. Epub 2019 Jan 9.

From a single RGB image of an unknown face, taken under unknown conditions, we estimate a physically plausible lighting model. First, the 3D geometry and texture of the face are estimated by fitting a 3D Morphable Model to the 2D input. With this estimated 3D model and a Virtual Light Stage (VLS), we generate a gallery of images of the face under the same conditions but with different lighting.

http://dx.doi.org/10.1109/TPAMI.2019.2891638
January 2019

Weighted Manifold Alignment using Wave Kernel Signatures for Aligning Medical Image Datasets.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 9. Epub 2019 Jan 9.

Manifold alignment (MA) is a technique to map many high-dimensional datasets to one shared low-dimensional space. Here we develop a pipeline for using MA to reconstruct high-resolution medical images. We present two key contributions.

http://dx.doi.org/10.1109/TPAMI.2019.2891600
January 2019

Tattoo Image Search at Scale: Joint Detection and Compact Representation Learning.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 9. Epub 2019 Jan 9.

The explosive growth of digital images in video surveillance and social media has led to the significant need for efficient search of persons of interest in law enforcement and forensic applications. Despite tremendous progress in primary biometric traits (e.g. …

https://ieeexplore.ieee.org/document/8606226/
http://dx.doi.org/10.1109/TPAMI.2019.2891584
January 2019

Focal Visual-Text Attention for Memex Question Answering.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 7. Epub 2019 Jan 7.

Recent insights on language and vision with neural networks have been successfully applied to simple single-image visual question answering. However, to tackle real-life question answering problems on multimedia collections such as personal photo albums, we have to look at whole collections with sequences of photos. This paper proposes a new multimodal MemexQA task: given a sequence of photos from a user, the goal is to automatically answer questions that help users recover their memory about an event captured in these photos.

https://ieeexplore.ieee.org/document/8603827/
http://dx.doi.org/10.1109/TPAMI.2018.2890628
January 2019

Perspective-adaptive Convolutions for Scene Parsing.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 1. Epub 2019 Jan 1.

Many existing scene parsing methods adopt Convolutional Neural Networks with receptive fields of fixed sizes and shapes, which frequently results in inconsistent predictions of large objects and invisibility of small objects. To tackle this issue, we propose perspective-adaptive convolutions to acquire receptive fields of flexible sizes and shapes during scene parsing. Through adding a new perspective regression layer, we can dynamically infer the position-adaptive perspective coefficient vectors utilized to reshape the convolutional patches.

https://ieeexplore.ieee.org/document/8598804/
http://dx.doi.org/10.1109/TPAMI.2018.2890637
January 2019

Joint Image Filtering with Deep Convolutional Networks.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 1. Epub 2019 Jan 1.

Joint image filters leverage the guidance image as a prior and transfer the structural details from the guidance image to the target image for suppressing noise or enhancing spatial resolution. Existing methods either rely on various explicit filter constructions or hand-designed objective functions, thereby making it difficult to understand, improve, and accelerate these filters in a coherent framework. In this paper, we propose a learning-based approach for constructing joint filters based on Convolutional Neural Networks.

http://dx.doi.org/10.1109/TPAMI.2018.2890623
January 2019

Extracting Geometric Structures in Images with Delaunay Point Processes.

IEEE Trans Pattern Anal Mach Intell 2019 Jan 1. Epub 2019 Jan 1.

We introduce Delaunay Point Processes, a framework for the extraction of geometric structures from images. Our approach simultaneously locates and groups geometric primitives (line segments, triangles) to form extended structures (line networks, polygons) for a variety of image analysis tasks. Similarly to traditional point processes, our approach uses Markov Chain Monte Carlo to minimize an energy that balances fidelity to the input image data with geometric priors on the output structures.

http://dx.doi.org/10.1109/TPAMI.2018.2890586
January 2019

Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs.

IEEE Trans Pattern Anal Mach Intell 2018 Dec 28. Epub 2018 Dec 28.

We present a new approach for the approximate K-nearest neighbor search based on navigable small world graphs with controllable hierarchy (Hierarchical NSW, HNSW). The proposed solution is fully graph-based, without any need for additional search structures, which are typically used at the coarse search stage of most proximity graph techniques. Hierarchical NSW incrementally builds a multi-layer structure consisting of a hierarchical set of proximity graphs (layers) for nested subsets of the stored elements.

http://dx.doi.org/10.1109/TPAMI.2018.2889473
December 2018

Deep Self-Evolution Clustering.

IEEE Trans Pattern Anal Mach Intell 2018 Dec 27. Epub 2018 Dec 27.

Clustering is a crucial but challenging task in pattern analysis and machine learning. Existing methods often ignore the combination between representation learning and clustering. To tackle this problem, we reconsider the clustering task from its definition to develop Deep Self-Evolution Clustering (DSEC) to jointly learn representations and cluster data.

http://dx.doi.org/10.1109/TPAMI.2018.2889949
December 2018

Group Maximum Differentiation Competition: Model Comparison with Few Samples.

IEEE Trans Pattern Anal Mach Intell 2018 Dec 27. Epub 2018 Dec 27.

In many science and engineering fields that require computational models to predict certain physical quantities, we are often faced with the selection of the best model under the constraint that only a small sample set can be physically measured. One such example is the prediction of human perception of visual quality, where sample images live in a high dimensional space with enormous content variations. We propose a new methodology for model comparison named group maximum differentiation (gMAD) competition.

http://dx.doi.org/10.1109/TPAMI.2018.2889948
December 2018

Automated video face labelling for films and TV material.

IEEE Trans Pattern Anal Mach Intell 2018 Dec 27. Epub 2018 Dec 27.

The objective of this work is automatic labelling of characters in TV video and movies, given weak supervisory information provided by an aligned transcript. We make five contributions: (i) a new strategy for obtaining stronger supervisory information from aligned transcripts; (ii) an explicit model for classifying background characters, based on their face-tracks; (iii) employing new ConvNet based face features, and (iv) a novel approach for labelling all face tracks jointly using linear programming. Each of these contributions delivers a boost in performance, and we demonstrate this on standard benchmarks using tracks provided by authors of prior work.

https://ieeexplore.ieee.org/document/8590759/
http://dx.doi.org/10.1109/TPAMI.2018.2889831
December 2018

Advances in Variational Inference.

IEEE Trans Pattern Anal Mach Intell 2018 Dec 25. Epub 2018 Dec 25.

Many modern unsupervised or semi-supervised machine learning algorithms rely on Bayesian probabilistic models. These models are usually intractable and thus require approximate inference. Variational inference (VI) lets us approximate a high-dimensional Bayesian posterior with a simpler variational distribution by solving an optimization problem.

http://dx.doi.org/10.1109/TPAMI.2018.2889774
December 2018

Hierarchical Fully Convolutional Network for Joint Atrophy Localization and Alzheimer's Disease Diagnosis using Structural MRI.

IEEE Trans Pattern Anal Mach Intell 2018 Dec 21. Epub 2018 Dec 21.

Structural magnetic resonance imaging (sMRI) has been widely used for computer-aided diagnosis of neurodegenerative disorders, e.g., Alzheimer's disease (AD), due to its sensitivity to morphological changes caused by brain atrophy.

http://dx.doi.org/10.1109/TPAMI.2018.2889096
December 2018

Photometric Stereo in Participating Media Using an Analytical Solution for Shape-Dependent Forward Scatter.

IEEE Trans Pattern Anal Mach Intell 2018 Dec 21. Epub 2018 Dec 21.

Images captured in participating media such as murky water, fog, or smoke are degraded by scattered light. Thus, the use of traditional three-dimensional (3D) reconstruction techniques in such environments is difficult. In this paper, we propose a photometric stereo method for participating media.

http://dx.doi.org/10.1109/TPAMI.2018.2889088
December 2018

Tracking-by-Fusion via Gaussian Process Regression Extended to Transfer Learning.

IEEE Trans Pattern Anal Mach Intell 2018 Dec 21. Epub 2018 Dec 21.

This paper presents a new Gaussian Processes (GPs)-based particle filter tracking framework. The framework non-trivially extends Gaussian process regression (GPR) to transfer learning, and, following the tracking-by-fusion strategy, integrates closely two tracking components, namely a GPs component and a CFs one. First, the GPs component analyzes and models the probability distribution of the object appearance by exploiting GPs.

http://dx.doi.org/10.1109/TPAMI.2018.2889070
December 2018

Deep Audio-visual Speech Recognition.

IEEE Trans Pattern Anal Mach Intell 2018 Dec 21. Epub 2018 Dec 21.

The goal of this work is to recognise phrases and sentences being spoken by a talking face, with or without the audio. Unlike previous works that have focussed on recognising a limited number of words or phrases, we tackle lip reading as an open-world problem -- unconstrained natural language sentences, and in-the-wild videos. Our key contributions are: (1) we compare two models for lip reading, one using a CTC loss, and the other using a sequence-to-sequence loss.

https://ieeexplore.ieee.org/document/8585066/
http://dx.doi.org/10.1109/TPAMI.2018.2889052
December 2018