Publications by authors named "Zhenan Sun"

23 Publications

Learning 3D Human Shape and Pose from Dense Body Parts.

IEEE Trans Pattern Anal Mach Intell 2020 Dec 3;PP. Epub 2020 Dec 3.

Reconstructing 3D human shape and pose from monocular images is challenging despite the promising results achieved by the most recent learning-based methods. The commonly observed misalignment stems from two facts: the mapping from images to the model space is highly non-linear, and the rotation-based pose representation of the body model is prone to drift in joint positions. In this work, we investigate learning 3D human shape and pose from dense correspondences of body parts and propose a Decompose-and-aggregate Network (DaNet) to address these issues. DaNet adopts dense correspondence maps, which densely bridge 2D pixels and 3D vertices, as intermediate representations to facilitate the learning of the 2D-to-3D mapping. The prediction modules of DaNet are decomposed into one global stream and multiple local streams to enable global and fine-grained perception for the shape and pose predictions, respectively. Messages from the local streams are further aggregated to enhance the robust prediction of rotation-based poses, where a position-aided rotation feature refinement strategy is proposed to exploit the spatial relationships between body joints. Moreover, a Part-based Dropout (PartDrop) strategy is introduced to drop out dense information from the intermediate representations during training, encouraging the network to focus on more complementary body parts as well as neighboring position features. The efficacy of the proposed method is validated on both indoor and real-world datasets including Human3.6M, UP3D, COCO, and 3DPW, showing that our method significantly improves reconstruction performance over previous state-of-the-art methods. Our code is publicly available at https://hongwenzhang.github.io/dense2mesh.
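The PartDrop idea, dropping dense information part-by-part rather than pixel-by-pixel, is easy to make concrete. A minimal numpy sketch under the assumption that the intermediate representation comes with a per-pixel part-index map; all names are illustrative, not from the paper's code:

    import numpy as np

    def part_drop(features, part_map, drop_rate=0.3, rng=None):
        """Zero out whole body parts from an intermediate representation.

        features : (C, H, W) array, e.g. dense correspondence features.
        part_map : (H, W) int array, 0 = background, 1..K = body part index.
        drop_rate: fraction of the K parts to drop on each call.
        """
        rng = np.random.default_rng() if rng is None else rng
        parts = np.unique(part_map)
        parts = parts[parts > 0]                  # ignore background
        n_drop = int(round(drop_rate * len(parts)))
        dropped = rng.choice(parts, size=n_drop, replace=False)
        mask = ~np.isin(part_map, dropped)        # keep pixels of surviving parts
        return features * mask[None, :, :]

    # toy usage: 24 body parts on a 64x64 map, 8-channel features
    rng = np.random.default_rng(0)
    part_map = rng.integers(0, 25, size=(64, 64))
    feats = rng.normal(size=(8, 64, 64))
    out = part_drop(feats, part_map, drop_rate=0.3, rng=rng)

Dropping contiguous part regions, unlike i.i.d. pixel dropout, forces the network to rely on the remaining parts, which is the stated motivation above.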
Source: http://dx.doi.org/10.1109/TPAMI.2020.3042341

Adversarial Cross-Spectral Face Completion for NIR-VIS Face Recognition.

IEEE Trans Pattern Anal Mach Intell 2020 05 24;42(5):1025-1037. Epub 2019 Dec 24.

Near infrared-visible (NIR-VIS) heterogeneous face recognition refers to the process of matching NIR to VIS face images. Current heterogeneous methods try to extend VIS face recognition methods to the NIR spectrum by synthesizing VIS images from NIR images. However, due to self-occlusion and the sensing gap, NIR face images lose some visible lighting content and are therefore always incomplete compared to VIS face images. This paper models high-resolution heterogeneous face synthesis as a complementary combination of two components: a texture inpainting component and a pose correction component. The inpainting component synthesizes and inpaints VIS image textures from NIR image textures. The correction component maps any pose in NIR images to a frontal pose in VIS images, resulting in paired NIR and VIS textures. A warping procedure is developed to integrate the two components into an end-to-end deep network. A fine-grained discriminator and a wavelet-based discriminator are designed to improve visual quality. A novel 3D-based pose correction loss, two adversarial losses, and a pixel loss are imposed to constrain the synthesis results. We demonstrate that by attaching the correction component, we can simplify heterogeneous face synthesis from one-to-many unpaired image translation to one-to-one paired image translation, and minimize the spectral and pose discrepancy during heterogeneous recognition. Extensive experimental results show that our network not only generates high-resolution VIS face images but also facilitates the accuracy improvement of heterogeneous face recognition.
Source: http://dx.doi.org/10.1109/TPAMI.2019.2961900

Binocular Light-Field: Imaging Theory and Occlusion-Robust Depth Perception Application.

IEEE Trans Image Process 2019 Sep 27. Epub 2019 Sep 27.

Binocular stereo vision (SV) has been widely used to reconstruct depth information, but it is quite vulnerable to scenes with strong occlusions. As an emerging computational photography technology, light-field (LF) imaging brings a novel solution to passive depth perception by recording multiple angular views in a single exposure. In this paper, we combine binocular SV and LF imaging to form a binocular-LF imaging system. An imaging theory is derived by modeling the imaging process and analyzing disparity properties based on geometrical optics. An accurate occlusion-robust depth estimation algorithm is then proposed by exploiting multi-baseline stereo matching cues and defocus cues. The occlusions caused by binocular SV and LF imaging are detected and handled to eliminate matching ambiguities and outliers. Finally, we develop a binocular-LF database and capture real-world scenes with our binocular-LF system to test accuracy and robustness. The experimental results demonstrate that the proposed algorithm recovers high-quality depth maps with smooth surfaces and precise geometric shapes, tackling the drawbacks of binocular SV and LF imaging simultaneously.
Source: http://dx.doi.org/10.1109/TIP.2019.2943019

Adversarial Learning Semantic Volume for 2D/3D Face Shape Regression in the Wild.

IEEE Trans Image Process 2019 Apr 19. Epub 2019 Apr 19.

Regression-based methods have revolutionized 2D landmark localization through deep neural networks and massive annotated datasets in the wild. However, 3D landmark localization remains challenging due to the lack of annotated datasets and the ambiguous nature of landmarks under 3D perspective. This paper revisits regression-based methods and proposes an adversarial voxel and coordinate regression framework for 2D and 3D facial landmark localization in real-world scenarios. First, a semantic volumetric representation is introduced to encode the per-voxel likelihood of positions being the 3D landmarks. Then, an end-to-end pipeline is designed to jointly regress the proposed volumetric representation and the coordinate vector. Such a pipeline not only enhances the robustness and accuracy of the predictions but also unifies 2D and 3D landmark localization so that 2D and 3D datasets can be utilized simultaneously. Further, an adversarial learning strategy is exploited to distill 3D structure learned from synthetic datasets to real-world datasets under weakly supervised settings, where an auxiliary regression discriminator is proposed to encourage the network to produce plausible predictions for both synthetic and real-world images. The effectiveness of our method is validated on the benchmark datasets 3DFAW and AFLW2000-3D for both 2D and 3D facial landmark localization tasks. Experimental results show that the proposed method achieves significant improvements over previous state-of-the-art methods.
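A per-voxel likelihood volume can be converted into landmark coordinates differentiably. A minimal sketch of one common way to do this, a softmax-weighted expectation (soft-argmax); the paper's exact decoding step may differ:

    import numpy as np

    def soft_argmax_3d(volume):
        """volume: (D, H, W) unnormalized per-voxel landmark likelihoods.
        Returns the expected (z, y, x) coordinate under the softmax distribution."""
        p = np.exp(volume - volume.max())     # stabilized softmax
        p /= p.sum()
        grids = np.meshgrid(*[np.arange(s) for s in volume.shape], indexing="ij")
        return np.array([(g * p).sum() for g in grids])   # expectation per axis

    vol = np.random.default_rng(0).normal(size=(16, 32, 32))
    vol[8, 10, 20] += 10.0        # plant a strong peak
    print(soft_argmax_3d(vol))    # approximately [8, 10, 20]

Because the output is an expectation rather than an argmax, it is differentiable, which is what allows a volumetric head and a coordinate head to be trained jointly.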
Source: http://dx.doi.org/10.1109/TIP.2019.2911114

Local Semantic-aware Deep Hashing with Hamming-isometric Quantization.

IEEE Trans Image Process 2018 Dec 21. Epub 2018 Dec 21.

Hashing has attracted increasing attention due to its tremendous potential for efficient image retrieval and data storage. Compared with conventional hashing methods based on handcrafted features, emerging deep hashing approaches employ deep neural networks to learn both feature representations and hash functions, which have proved more powerful and robust in real-world applications. Currently, most existing deep hashing methods construct pairwise or triplet-wise constraints to obtain similar binary codes for similar data pairs, or relatively similar binary codes within a triplet. However, some critical local structures of the data are left unexploited, so the effectiveness of hash learning is not fully realized. To address this limitation, we propose a novel deep hashing method named local semantic-aware deep hashing with Hamming-isometric quantization (LSDH), in which local similarity of the data is intentionally integrated into hash learning. Specifically, in the Hamming space, we exploit the potential semantic relations of the data to robustly preserve their local similarity. In addition to reducing the error introduced by binary quantization, we further develop a Hamming-isometric objective to maximize the consistency of similarity between pairwise binary-like features and the corresponding binary code pairs, which is shown to enhance the quality of the binary codes. Extensive experimental results on several benchmark datasets, including three single-label datasets (i.e., CIFAR-10, CIFAR-20, and SUN397) and one multi-label dataset (NUS-WIDE), demonstrate that the proposed LSDH achieves superior performance over the latest state-of-the-art hashing methods.
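The Hamming-isometric objective asks that a pair's similarity before quantization agree with its similarity after quantization. A hedged numpy sketch of such a consistency term, under the simplifying assumption that similarity is the normalized inner product of vectors in [-1, 1]:

    import numpy as np

    def isometric_penalty(u, v):
        """u, v: binary-like features in [-1, 1]^k (e.g. tanh outputs).
        Penalize disagreement between pre- and post-quantization similarity."""
        b_u, b_v = np.sign(u), np.sign(v)      # quantized binary codes
        sim_cont = np.dot(u, v) / len(u)       # similarity before quantization
        sim_bin = np.dot(b_u, b_v) / len(u)    # similarity after quantization
        return (sim_cont - sim_bin) ** 2       # to be minimized during training

    u = np.tanh(np.random.default_rng(0).normal(size=48))
    v = np.tanh(np.random.default_rng(1).normal(size=48))
    print(isometric_penalty(u, v))

Minimizing this term pushes the continuous features toward values whose pairwise geometry survives the sign function, which is the sense in which the quantization is kept "isometric".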
Source: http://dx.doi.org/10.1109/TIP.2018.2889269

Dynamic Feature Matching for Partial Face Recognition.

IEEE Trans Image Process 2018 Sep 18. Epub 2018 Sep 18.

Partial face recognition (PFR) in an unconstrained environment is a very important task, especially in situations where partial face images are likely to be captured due to occlusion, out-of-view regions, and large viewing angles, e.g., video surveillance and mobile devices. However, little attention has been paid to PFR so far, and the problem of recognizing an arbitrary patch of a face image remains largely unsolved. This study proposes a novel partial face recognition approach, called Dynamic Feature Matching (DFM), which combines Fully Convolutional Networks (FCNs) and Sparse Representation Classification (SRC) to address the partial face recognition problem regardless of face size. DFM does not require prior position information of partial faces relative to a holistic face. By sharing computation, the feature maps are calculated from the entire input image only once, which yields a significant speedup. Experimental results demonstrate the effectiveness and advantages of DFM in comparison with state-of-the-art PFR methods on several partial face databases, including the CASIA-NIR-Distance, CASIA-NIR-Mobile, and LFW databases. The performance of DFM is also impressive in partial person re-identification on the Partial RE-ID and iLIDS databases. The source code of DFM can be found at https://github.com/lingxiao-he/dfm_new.
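The classification side of DFM builds on sparse representation classification: represent the probe as a sparse combination of gallery features and pick the class with the smallest class-wise reconstruction residual. A minimal SRC sketch (using scikit-learn's Lasso for the ℓ1 step; the FCN feature extraction and the dynamic dictionary construction of the paper are omitted, and all sizes are toy values):

    import numpy as np
    from sklearn.linear_model import Lasso

    def src_classify(probe, gallery, labels, alpha=0.01):
        """probe: (d,) feature; gallery: (d, n) dictionary whose columns are
        gallery features; labels: (n,) identity of each column."""
        w = Lasso(alpha=alpha, fit_intercept=False,
                  max_iter=5000).fit(gallery, probe).coef_
        best, best_res = None, np.inf
        for c in np.unique(labels):
            w_c = np.where(labels == c, w, 0.0)   # keep only class-c coefficients
            res = np.linalg.norm(probe - gallery @ w_c)
            if res < best_res:
                best, best_res = c, res
        return best

    rng = np.random.default_rng(0)
    gallery = rng.normal(size=(128, 40))
    labels = np.repeat(np.arange(10), 4)          # 10 identities, 4 samples each
    probe = gallery[:, 7] + 0.05 * rng.normal(size=128)
    print(src_classify(probe, gallery, labels))   # expected: identity 1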
Source: http://dx.doi.org/10.1109/TIP.2018.2870946

Wasserstein CNN: Learning Invariant Features for NIR-VIS Face Recognition.

IEEE Trans Pattern Anal Mach Intell 2019 07 1;41(7):1761-1773. Epub 2018 Jun 1.

Heterogeneous face recognition (HFR) aims at matching facial images acquired from different sensing modalities, with mission-critical applications in forensics, security, and commercial sectors. However, HFR presents more challenging issues than traditional face recognition because of the large intra-class variation among heterogeneous face images and the limited availability of training samples of cross-modality face image pairs. This paper proposes the novel Wasserstein convolutional neural network (WCNN) approach for learning invariant features between near-infrared (NIR) and visual (VIS) face images (i.e., NIR-VIS face recognition). The low-level layers of the WCNN are trained with widely available face images in the VIS spectrum, and the high-level layer is divided into three parts: the NIR layer, the VIS layer, and the NIR-VIS shared layer. The first two layers aim at learning modality-specific features, and the NIR-VIS shared layer is designed to learn a modality-invariant feature subspace. The Wasserstein distance is introduced into the NIR-VIS shared layer to measure the dissimilarity between heterogeneous feature distributions. WCNN learning is performed to minimize the Wasserstein distance between the NIR distribution and the VIS distribution for invariant deep feature representations of heterogeneous face images. To avoid over-fitting on small-scale heterogeneous face data, a correlation prior is introduced on the fully-connected WCNN layers to reduce the size of the parameter space. This prior is implemented by a low-rank constraint in an end-to-end network. The joint formulation leads to an alternating minimization for deep feature representation at the training stage and an efficient computation for heterogeneous data at the testing stage. Extensive experiments using three challenging NIR-VIS face recognition databases demonstrate the superiority of the WCNN method over state-of-the-art methods.
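When the two feature distributions are approximated as Gaussians, the 2-Wasserstein distance has a closed form, which makes it convenient as a training loss. A hedged numpy sketch for the diagonal-covariance case (the abstract does not specify WCNN's exact parameterization, so this is illustrative):

    import numpy as np

    def w2_gaussian_diag(mu1, var1, mu2, var2):
        """Squared 2-Wasserstein distance between N(mu1, diag(var1)) and
        N(mu2, diag(var2)): ||mu1 - mu2||^2 + ||sqrt(var1) - sqrt(var2)||^2."""
        return (np.sum((mu1 - mu2) ** 2)
                + np.sum((np.sqrt(var1) - np.sqrt(var2)) ** 2))

    rng = np.random.default_rng(0)
    nir = rng.normal(0.0, 1.0, size=(500, 64))   # stand-in NIR shared-layer features
    vis = rng.normal(0.3, 1.2, size=(500, 64))   # stand-in VIS shared-layer features
    d = w2_gaussian_diag(nir.mean(0), nir.var(0), vis.mean(0), vis.var(0))
    print(d)   # driven toward zero in training to align the two modalities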
Source: http://dx.doi.org/10.1109/TPAMI.2018.2842770

Aggregating Randomized Clustering-Promoting Invariant Projections for Domain Adaptation.

IEEE Trans Pattern Anal Mach Intell 2019 05 1;41(5):1027-1042. Epub 2018 May 1.

Unsupervised domain adaptation aims to leverage labeled source data to learn with unlabeled target data. Previous transductive methods tackle it by iteratively seeking a low-dimensional projection to extract invariant features and obtaining pseudo target labels via a classifier built on the source data. However, they merely concentrate on minimizing the cross-domain distribution divergence, while ignoring the intra-domain structure, especially for the target domain. Even after projection, possible risk factors such as imbalanced data distribution may still hinder the performance of target label inference. In this paper, we propose a simple yet effective domain-invariant projection ensemble approach to tackle these two issues together. Specifically, we seek the optimal projection via a novel relaxed domain-irrelevant clustering-promoting term that jointly bridges the cross-domain semantic gap and increases the intra-class compactness in both domains. To further enhance target label inference, we first develop a 'sampling-and-fusion' framework, under which multiple projections are independently learned on various randomized coupled domain subsets. Subsequently, aggregation models such as majority voting are utilized to leverage the multiple projections and classify the unlabeled target data. Extensive experimental results on six visual benchmarks including object, face, and digit images demonstrate that the proposed methods gain remarkable margins over state-of-the-art unsupervised domain adaptation methods.
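The fusion step named above, majority voting across independently learned projections, is simple to write down. A minimal sketch (the projection learning itself is omitted):

    import numpy as np

    def majority_vote(pred_matrix):
        """pred_matrix: (n_models, n_samples) int labels, one row per learned
        projection. Returns the per-sample majority label."""
        n_models, n_samples = pred_matrix.shape
        fused = np.empty(n_samples, dtype=pred_matrix.dtype)
        for j in range(n_samples):
            vals, counts = np.unique(pred_matrix[:, j], return_counts=True)
            fused[j] = vals[np.argmax(counts)]
        return fused

    preds = np.array([[0, 1, 2, 2],
                      [0, 1, 1, 2],
                      [1, 1, 2, 0]])
    print(majority_vote(preds))   # [0 1 2 2]

Averaging over randomized subsets in this way is what damps the effect of any single unlucky projection or imbalanced subset on the final target labels.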
Source: http://dx.doi.org/10.1109/TPAMI.2018.2832198

LFNet: A Novel Bidirectional Recurrent Convolutional Neural Network for Light-Field Image Super-Resolution.

IEEE Trans Image Process 2018 Sep;27(9):4274-4286

The low spatial resolution of light-field images poses significant difficulties in exploiting their advantages. To mitigate the dependency on accurate depth or disparity information as priors for light-field image super-resolution, we propose an implicit multi-scale fusion scheme to accumulate contextual information from multiple scales for super-resolution reconstruction. The multi-scale fusion scheme is then incorporated into a bidirectional recurrent convolutional neural network, which aims to iteratively model spatial relations between horizontally or vertically adjacent sub-aperture images of the light-field data. Within the network, the recurrent convolutions are modified to be more effective and flexible in modeling the spatial correlations between neighboring views. A horizontal sub-network and a vertical sub-network of the same structure are ensembled for the final output via stacked generalization. Experimental results on synthetic and real-world datasets demonstrate that the proposed method outperforms other state-of-the-art methods by a large margin in peak signal-to-noise ratio and gray-scale structural similarity indexes, and also achieves superior quality for human visual perception. Furthermore, the proposed method can enhance the performance of light-field applications such as depth estimation.
Source: http://dx.doi.org/10.1109/TIP.2018.2834819

Feature Selection Based on Structured Sparsity: A Comprehensive Study.

IEEE Trans Neural Netw Learn Syst 2017 07 22;28(7):1490-1507. Epub 2016 Apr 22.

Feature selection (FS) is an important component of many pattern recognition tasks, in which one is often confronted with very high-dimensional data. FS algorithms are designed to identify the relevant feature subset from the original features, which facilitates subsequent analysis such as clustering and classification. Structured sparsity-inducing feature selection (SSFS) methods have been widely studied in the last few years, and a number of algorithms have been proposed. However, there is no comprehensive study of the connections between different SSFS methods or of how they have evolved. In this paper, we attempt to provide a survey of various SSFS methods, including their motivations and mathematical representations. We then explore the relationships among different formulations and propose a taxonomy to elucidate their evolution. We group the existing SSFS methods into two categories, i.e., vector-based feature selection (feature selection based on the lasso) and matrix-based feature selection (feature selection based on the l2,1-norm). Furthermore, FS has been combined with other machine learning algorithms for specific applications, such as multitask learning, multilabel learning, multiview learning, classification, and clustering. This paper not only compares the differences and commonalities of these methods in terms of regression and regularization strategies, but also provides useful guidelines for practitioners in related fields on how to perform feature selection.
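Matrix-based SSFS methods typically impose an l2,1-norm on a projection matrix so that whole rows (features) are driven to zero, after which features are ranked by row energy. A small sketch of that scoring step, assuming a learned projection matrix W of shape (n_features, n_targets):

    import numpy as np

    def l21_norm(W):
        """Sum of the l2 norms of the rows of W: sum_i ||W[i, :]||_2.
        Penalizing this drives entire rows (i.e. features) to zero."""
        return np.sum(np.linalg.norm(W, axis=1))

    def select_features(W, k):
        """Rank features by the row norm of the learned projection; keep top k."""
        scores = np.linalg.norm(W, axis=1)
        return np.argsort(scores)[::-1][:k]

    rng = np.random.default_rng(0)
    W = rng.normal(size=(100, 5))
    W[20:] *= 0.01            # rows 20..99 stand in for regularized-away features
    print(l21_norm(W))
    print(sorted(select_features(W, 10)))   # indices drawn from the first 20 rows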
Source: http://dx.doi.org/10.1109/TNNLS.2016.2551724

Fast Supervised Discrete Hashing.

IEEE Trans Pattern Anal Mach Intell 2018 02 7;40(2):490-496. Epub 2017 Mar 7.

Learning-based hashing algorithms are "hot topics" because they can greatly increase the scale at which existing methods operate. In this paper, we propose a new learning-based hashing method called "fast supervised discrete hashing" (FSDH), based on "supervised discrete hashing" (SDH). Regressing the training examples (or hash codes) to the corresponding class labels is widely used in ordinary least squares regression. Rather than adopting this approach, FSDH uses a very simple yet effective regression of the class labels of the training examples to the corresponding hash codes to accelerate the algorithm. To the best of our knowledge, this strategy has not previously been used for hashing. Traditional SDH decomposes the optimization into three sub-problems, with the most critical sub-problem, discrete optimization for binary hash codes, solved using iterative discrete cyclic coordinate descent (DCC), which is time-consuming. In contrast, FSDH has a closed-form solution and only requires a single rather than iterative hash-code-solving step, which is highly efficient. Furthermore, FSDH is usually faster than SDH at solving the projection matrix for least squares regression, making FSDH generally faster than SDH. For example, our results show that FSDH is about 12 times faster than SDH when the number of hashing bits is 128 on the CIFAR-10 database, and about 151 times faster than FastHash when the number of hashing bits is 64 on the MNIST database. Our experimental results show that FSDH is not only fast, but also outperforms other comparative methods.
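To see why regressing labels to hash codes admits a single closed-form bit update, consider a deliberately simplified objective; this is a sketch of the flavor of the idea, not the exact FSDH formulation:

    import numpy as np

    # Assume min_B ||B - Y @ G||_F^2 + nu * ||B - X @ P||_F^2 over B in {-1, +1}.
    # The objective decouples per entry, so the minimizer is an elementwise sign,
    # i.e. one non-iterative hash-code-solving step.
    def solve_codes(Y, G, X, P, nu=1.0):
        return np.sign(Y @ G + nu * (X @ P))

    rng = np.random.default_rng(0)
    n, c, d, L = 200, 10, 32, 16
    Y = np.eye(c)[rng.integers(0, c, n)]   # one-hot class labels
    X = rng.normal(size=(n, d))            # (embedded) features
    G = rng.normal(size=(c, L))            # label-to-code regression matrix
    P = rng.normal(size=(d, L))            # feature-to-code projection
    B = solve_codes(Y, G, X, P)
    print(B.shape, np.unique(B))           # (200, 16) [-1.  1.]

Contrast this with DCC in SDH, which must sweep the bits one at a time and iterate to convergence.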
Source: http://dx.doi.org/10.1109/TPAMI.2017.2678475

Demographic Analysis from Biometric Data: Achievements, Challenges, and New Frontiers.

IEEE Trans Pattern Anal Mach Intell 2018 02 14;40(2):332-351. Epub 2017 Feb 14.

Biometrics is the technique of automatically recognizing individuals based on their biological or behavioral characteristics. Various biometric traits have been introduced and widely investigated, including fingerprint, iris, face, voice, palmprint, gait, and so forth. Apart from identity, biometric data may convey various other personal information, covering affect, age, gender, race, accent, handedness, height, weight, etc. Among these, the analysis of demographics (age, gender, and race) has received tremendous attention owing to its wide real-world applications, with significant efforts devoted and great progress achieved. This survey first presents biometric demographic analysis from the standpoint of human perception, and then provides a comprehensive overview of state-of-the-art advances in automated estimation from both academia and industry. Despite these advances, a number of challenging issues continue to inhibit its full potential. We then discuss these open problems, and finally provide an outlook into the future of this very active field of research by sharing some promising opportunities.
Source: http://dx.doi.org/10.1109/TPAMI.2017.2669035

Supervised Discrete Hashing With Relaxation.

IEEE Trans Neural Netw Learn Syst 2018 03 29;29(3):608-617. Epub 2016 Dec 29.

Data-dependent hashing has recently attracted attention due to its ability to support efficient retrieval and storage of high-dimensional data, such as documents, images, and videos. In this paper, we propose a novel learning-based hashing method called "supervised discrete hashing with relaxation" (SDHR), based on "supervised discrete hashing" (SDH). SDH uses ordinary least squares regression and the traditional zero-one matrix encoding of class label information as the regression target (code words), thus fixing the regression target. In SDHR, the regression target is instead optimized. The optimized regression target matrix satisfies a large-margin constraint for correct classification of each example. Compared with SDH, which uses the traditional zero-one matrix, SDHR utilizes the learned regression target matrix and therefore more accurately measures the classification error of the regression model and is more flexible. As expected, SDHR generally outperforms SDH. Experimental results on two large-scale image datasets (CIFAR-10 and MNIST) and a large-scale and challenging face dataset (FRGC) demonstrate the effectiveness and efficiency of SDHR.
Source: http://dx.doi.org/10.1109/TNNLS.2016.2636870

Information Theoretic Subspace Clustering.

IEEE Trans Neural Netw Learn Syst 2016 12 1;27(12):2643-2655. Epub 2015 Dec 1.

This paper addresses the problem of grouping data points sampled from a union of multiple subspaces in the presence of outliers. Information theoretic objective functions are proposed that combine structured low-rank representations (LRRs), to capture the global structure of the data, with information theoretic measures to handle outliers. On the theoretical side, we point out that group sparsity-induced measures (mixed norms such as the l2,1-norm, as well as correntropy) can be justified from the viewpoint of half-quadratic (HQ) optimization, which facilitates both convergence analysis and algorithmic development. In particular, a general formulation is proposed to unify HQ-based group sparsity methods into a common framework. On the algorithmic side, we develop information theoretic subspace clustering methods via correntropy. With the help of Parzen window estimation, correntropy is used to handle either outliers under any distribution or sample-specific errors in the data. Pairwise link constraints are further treated as a prior structure on the LRRs. Based on the HQ framework, iterative algorithms are developed to solve the nonconvex information theoretic loss functions. Experimental results on three benchmark databases show that our methods can further improve the robustness of LRR subspace clustering and outperform other state-of-the-art subspace clustering methods.
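Correntropy under a Gaussian kernel, estimated with a Parzen window as mentioned above, is simple to write down. A minimal numpy sketch showing why it is robust: gross errors saturate the kernel instead of dominating the loss:

    import numpy as np

    def correntropy(x, y, sigma=1.0):
        """Sample correntropy of two vectors under a Gaussian kernel:
        mean_i exp(-(x_i - y_i)^2 / (2 sigma^2)). Close to 1 when x ~ y."""
        return np.mean(np.exp(-((x - y) ** 2) / (2.0 * sigma ** 2)))

    rng = np.random.default_rng(0)
    x = rng.normal(size=1000)
    y_clean = x + 0.05 * rng.normal(size=1000)
    y_outlier = y_clean.copy()
    y_outlier[:10] += 100.0                   # 1% gross corruption
    print(correntropy(x, y_clean), correntropy(x, y_outlier))  # barely changes

A squared-error loss on the same data would be dominated by those ten corrupted entries; the correntropy score moves by only about 1%.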
Source: http://dx.doi.org/10.1109/TNNLS.2015.2500600

Representative Vector Machines: A Unified Framework for Classical Classifiers.

IEEE Trans Cybern 2016 08 13;46(8):1877-88. Epub 2015 Aug 13.

Classifier design is a fundamental problem in pattern recognition. A variety of pattern classification methods, such as the nearest neighbor (NN) classifier, the support vector machine (SVM), and sparse representation-based classification (SRC), have been proposed in the literature. These typical and widely used classifiers were originally developed from different theoretical or application motivations, and they are conventionally treated as independent and specific solutions for pattern classification. This paper proposes a novel pattern classification framework, namely representative vector machines (RVMs for short). The basic idea of RVMs is to assign the class label of a test example according to its nearest representative vector. The contributions of RVMs are twofold. On one hand, the proposed RVMs establish a unified framework of classical classifiers, because NN, SVM, and SRC can be interpreted as special cases of RVMs with different definitions of representative vectors. The underlying relationship among a number of classical classifiers is thus revealed for a better understanding of pattern classification. On the other hand, novel and advanced classifiers can be derived within the RVM framework. For example, a robust pattern classification method called the discriminant vector machine (DVM) is motivated by RVMs. Given a test example, DVM first finds its k-NNs and then performs classification based on a robust M-estimator and manifold regularization. Extensive experimental evaluations on a variety of visual recognition tasks, such as face recognition (the Yale and Face Recognition Grand Challenge databases), object categorization (the Caltech-101 dataset), and action recognition (the Action Similarity LAbeliNg benchmark), demonstrate the advantages of DVM over other classifiers.
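The unifying idea, classify by the nearest representative vector, is easy to make concrete. A sketch in which the representative is either every training sample (recovering the NN classifier) or the class mean (a nearest-class-mean variant); the SVM and SRC cases require their own representative constructions and are not shown:

    import numpy as np

    def nearest_representative(x, reps, rep_labels):
        """Assign x the label of its nearest representative vector."""
        d = np.linalg.norm(reps - x, axis=1)
        return rep_labels[np.argmin(d)]

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)
    x_test = np.array([4.5, 5.2])

    # NN classifier: every training sample is a representative vector.
    print(nearest_representative(x_test, X, y))
    # Nearest class mean: one representative vector per class.
    means = np.vstack([X[y == c].mean(0) for c in (0, 1)])
    print(nearest_representative(x_test, means, np.array([0, 1])))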
Source: http://dx.doi.org/10.1109/TCYB.2015.2457234

Robust Subspace Clustering With Complex Noise.

IEEE Trans Image Process 2015 Nov 15;24(11):4001-13. Epub 2015 Jul 15.

Subspace clustering has important and wide applications in computer vision and pattern recognition. It is a challenging task to learn low-dimensional subspace structures due to the complex noise existing in high-dimensional data. Complex noise has much richer statistical structure than Gaussian or Laplacian noise. Recent subspace clustering methods usually assume a sparse representation of the errors incurred by noise and correct these errors iteratively. However, large corruptions incurred by complex noise cannot be well addressed by these methods. A novel optimization model for robust subspace clustering is proposed in this paper. Its objective function mainly includes two parts. The first part aims to achieve a sparse representation of each high-dimensional data point in terms of other data points. The second part aims to maximize the correntropy between a given data point and its low-dimensional representation by other points. Correntropy is a robust measure, so the influence of large corruptions on subspace clustering can be greatly suppressed. An extension of pairwise link constraints is also proposed as prior information to deal with complex noise. Half-quadratic minimization is provided as an efficient solution to the proposed robust subspace clustering formulations. Experimental results on three commonly used datasets show that our method outperforms state-of-the-art subspace clustering methods.
Source: http://dx.doi.org/10.1109/TIP.2015.2456504

Ordinal feature selection for iris and palmprint recognition.

IEEE Trans Image Process 2014 Sep 11;23(9):3922-34. Epub 2014 Jul 11.

Ordinal measures have been demonstrated to be an effective feature representation model for iris and palmprint recognition. However, ordinal measures are a general concept of image analysis, and numerous variants with different parameter settings, such as location, scale, and orientation, can be derived to construct a huge feature space. This paper proposes a novel optimization formulation for ordinal feature selection with successful applications to both iris and palmprint recognition. The objective function of the proposed feature selection method has two parts, i.e., the misclassification error on intra- and inter-class matching samples and the weighted sparsity of the ordinal feature descriptors. The feature selection therefore aims to achieve an accurate and sparse representation of ordinal measures. The optimization is subject to a number of linear inequality constraints, which require that all intra- and inter-class matching pairs be well separated with a large margin; a toy instance of this formulation is sketched below. Ordinal feature selection is formulated as a linear programming (LP) problem so that a solution can be obtained efficiently even on a large-scale feature pool and training database. Extensive experimental results demonstrate that the proposed LP formulation is advantageous over existing feature selection methods such as mRMR, ReliefF, Boosting, and Lasso for biometric recognition, reporting state-of-the-art accuracy on the CASIA and PolyU databases.
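The constraint structure described above (inter-class dissimilarities exceed intra-class ones by a margin, with a sparse nonnegative weight vector) maps naturally onto a linear program. A toy scipy sketch of that shape of problem; the paper's exact objective terms, slack variables, and pair sampling are not reproduced:

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    n_feat, n_pairs = 20, 60
    # Per-feature dissimilarity scores for matching pairs; only the first
    # 5 features are constructed to separate intra- from inter-class pairs.
    d_intra = rng.uniform(0.0, 1.0, (n_pairs, n_feat))
    d_inter = rng.uniform(0.0, 1.0, (n_pairs, n_feat))
    d_intra[:, :5] -= 0.8
    d_inter[:, :5] += 0.8

    # min sum(w)  s.t.  (d_inter_j - d_intra_j) . w >= 1 for each pair, w >= 0
    A_ub = -(d_inter - d_intra)        # flip sign for linprog's A_ub @ w <= b_ub
    b_ub = -np.ones(n_pairs)
    res = linprog(c=np.ones(n_feat), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * n_feat)
    print(res.status)                  # 0 = solved
    print(np.nonzero(res.x > 1e-6)[0]) # weight concentrates on informative features

Minimizing the sum of nonnegative weights is what yields a sparse selection: the LP keeps only the features needed to satisfy the margin constraints.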
Source: http://dx.doi.org/10.1109/TIP.2014.2332396

Iris Image Classification Based on Hierarchical Visual Codebook.

IEEE Trans Pattern Anal Mach Intell 2014 Jun;36(6):1120-33

Iris recognition as a reliable method for personal identification has been well studied, with the objective of assigning the class label of each iris image to a unique subject. In contrast, iris image classification aims to classify an iris image into an application-specific category, e.g., iris liveness detection (classification of genuine and fake iris images), race classification (e.g., classification of iris images of Asian and non-Asian subjects), or coarse-to-fine iris identification (classification of all iris images in a central database into multiple categories). This paper proposes a general framework for iris image classification based on texture analysis. A novel texture pattern representation method called the Hierarchical Visual Codebook (HVC) is proposed to encode the texture primitives of iris images. The proposed HVC method is an integration of two existing bag-of-words models, namely the Vocabulary Tree (VT) and Locality-constrained Linear Coding (LLC). The HVC adopts a coarse-to-fine visual coding strategy and takes advantage of both VT and LLC for accurate and sparse representation of iris texture. Extensive experimental results demonstrate that the proposed iris image classification method achieves state-of-the-art performance for iris liveness detection, race classification, and coarse-to-fine iris identification. A comprehensive fake iris image database simulating four types of iris spoof attacks is developed as a benchmark for research on iris liveness detection.
Source: http://dx.doi.org/10.1109/TPAMI.2013.234

Group sparse multiview patch alignment framework with view consistency for image classification.

IEEE Trans Image Process 2014 Jul;23(7):3126-37

No single feature can satisfactorily characterize the semantic concepts of an image. Multiview learning aims to unify different kinds of features to produce a consensual and efficient representation. This paper redefines part optimization in the patch alignment framework (PAF) and develops a group sparse multiview patch alignment framework (GSM-PAF). The new part optimization considers not only the complementary properties of different views, but also view consistency. In particular, view consistency models the correlations between all possible combinations of any two kinds of views. In contrast to conventional dimensionality reduction algorithms that perform feature extraction and feature selection independently, GSM-PAF enjoys joint feature extraction and feature selection by exploiting the l2,1-norm on the projection matrix to achieve row sparsity, which leads to the simultaneous selection of relevant features and learning of the transformation, and thus makes the algorithm more discriminative. Experiments on two real-world image datasets demonstrate the effectiveness of GSM-PAF for image classification.
Source: http://dx.doi.org/10.1109/TIP.2014.2326001

Half-quadratic-based iterative minimization for robust sparse representation.

IEEE Trans Pattern Anal Mach Intell 2014 Feb;36(2):261-75

Institute of Automation, Chinese Academy of Sciences, Beijing.

Robust sparse representation has shown significant potential in solving challenging problems in computer vision such as biometrics and visual surveillance. Although several robust sparse models have been proposed and promising results have been obtained, they are designed either for error correction or for error detection, and a general framework that systematically unifies these two aspects and explores their relation is still an open problem. In this paper, we develop a half-quadratic (HQ) framework to solve the robust sparse representation problem. By defining different kinds of half-quadratic functions, the proposed HQ framework is applicable to both error correction and error detection. More specifically, by using the additive form of HQ, we propose an ℓ1-regularized error correction method that iteratively recovers corrupted data from errors incurred by noise and outliers; by using the multiplicative form of HQ, we propose an ℓ1-regularized error detection method that learns from uncorrupted data iteratively. We also show that ℓ1-regularization solved by the soft-thresholding function has a dual relationship to the Huber M-estimator, which theoretically guarantees the performance of robust sparse representation in terms of M-estimation. Experiments on robust face recognition under severe occlusion and corruption validate our framework and findings.
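The additive-form error correction described above alternates between a least-squares fit and soft-thresholding of the residual. A minimal numpy sketch of that loop (the full HQ machinery and the sparse code regularization are omitted; this only illustrates the error-absorption step):

    import numpy as np

    def soft_threshold(v, lam):
        """Proximal operator of lam * ||.||_1: elementwise soft-thresholding."""
        return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

    def hq_error_correction(X, y, lam=0.1, n_iter=50):
        """Alternate between the code a and the sparse error e in y ~ X a + e:
        the a-step is least squares on the error-corrected observation, the
        e-step soft-thresholds the residual so gross errors are absorbed."""
        e = np.zeros_like(y)
        for _ in range(n_iter):
            a, *_ = np.linalg.lstsq(X, y - e, rcond=None)
            e = soft_threshold(y - X @ a, lam)
        return a, e

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 10))
    a_true = rng.normal(size=10)
    y = X @ a_true
    y[:5] += 20.0                             # gross occlusion-like corruption
    a, e = hq_error_correction(X, y)
    print(np.linalg.norm(a - a_true))         # small despite the corruption
    print(np.nonzero(np.abs(e) > 1.0)[0])     # corruption localized at 0..4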
Source: http://dx.doi.org/10.1109/TPAMI.2013.102

Iris Matching Based on Personalized Weight Map.

IEEE Trans Pattern Anal Mach Intell 2011 Sep 23;33(9):1744-57. Epub 2010 Dec 23.

Iris recognition typically involves three steps, namely iris image preprocessing, feature extraction, and feature matching. The first two steps have been well studied, but the last step is less addressed. Each human iris has its unique visual pattern, and local image features vary from region to region, leading to significant differences in robustness and distinctiveness among the feature codes derived from different iris regions. However, most state-of-the-art iris recognition methods use a uniform matching strategy, where features extracted from different regions of the same person, or from the same region for different individuals, are considered equally important. This paper proposes a personalized iris matching strategy using a class-specific weight map learned from the training images of the same iris class. The weight map can be updated online during the iris recognition procedure when successfully recognized iris images are regarded as new training data. The weight map reflects the robustness of an encoding algorithm on different iris regions by assigning an appropriate weight to each feature code for iris matching. Such a weight map, trained with sufficient iris templates, is convergent and robust against various kinds of noise. Extensive and comprehensive experiments demonstrate that the proposed personalized iris matching strategy achieves much better iris recognition performance than uniform strategies, especially for poor-quality iris images.
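The matching step amounts to a weighted Hamming distance, where each bit's weight reflects how reliable that region's code has proven for the class. A minimal sketch with an illustrative stability-based weight rule (the paper's exact learning and online update are not reproduced):

    import numpy as np

    def weighted_hamming(probe, template, w):
        """probe, template: 0/1 iris codes; w: per-bit weights summing to 1.
        Bits from historically unreliable regions contribute less."""
        return np.sum(w * (probe != template))

    def learn_weight_map(codes):
        """codes: (n_samples, n_bits) 0/1 codes from training images of one
        iris class. Weight each bit by how consistently it takes its
        majority value across the training codes."""
        p = codes.mean(axis=0)                 # fraction of ones per bit
        stability = np.maximum(p, 1.0 - p)     # agreement with the majority bit
        w = stability / stability.sum()
        template = (p > 0.5).astype(int)
        return template, w

    rng = np.random.default_rng(0)
    base = rng.integers(0, 2, 256)                 # the "true" iris code
    flip = np.linspace(0.0, 0.45, 256)             # noise grows along the code
    codes = np.where(rng.random((6, 256)) < flip, 1 - base, base)
    template, w = learn_weight_map(codes)
    probe = np.where(rng.random(256) < flip, 1 - base, base)   # genuine probe
    impostor = rng.integers(0, 2, 256)                         # unrelated iris
    print(weighted_hamming(probe, template, w),
          weighted_hamming(impostor, template, w))   # genuine << impostor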
Source: http://dx.doi.org/10.1109/TPAMI.2010.227

Ordinal measures for iris recognition.

IEEE Trans Pattern Anal Mach Intell 2009 Dec;31(12):2211-26

Center for Biometrics and Security Research, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Haidian District, Beijing, PR China.

Images of a human iris contain rich texture information useful for identity authentication. A key and still open issue in iris recognition is how best to represent such textural information using a compact set of features (iris features). In this paper, we propose using ordinal measures for iris feature representation with the objective of characterizing qualitative relationships between iris regions rather than precise measurements of iris image structures. Such a representation may lose some image-specific information, but it achieves a good trade-off between distinctiveness and robustness. We show that ordinal measures are intrinsic features of iris patterns and largely invariant to illumination changes. Moreover, compactness and low computational complexity of ordinal measures enable highly efficient iris recognition. Ordinal measures are a general concept useful for image analysis and many variants can be derived for ordinal feature extraction. In this paper, we develop multilobe differential filters to compute ordinal measures with flexible intralobe and interlobe parameters such as location, scale, orientation, and distance. Experimental results on three public iris image databases demonstrate the effectiveness of the proposed ordinal feature models.
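An ordinal measure encodes only the sign of a comparison between region averages ("region A is darker than region B"), which is what makes it largely invariant to illumination changes. A minimal two-lobe sketch using Gaussian-smoothed patches; the lobe geometry parameters are illustrative stand-ins for the paper's location, scale, orientation, and distance settings:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def ordinal_code(image, offset=(0, 6), sigma=2.0):
        """Two-lobe differential filtering: compare each pixel's smoothed
        neighborhood with a shifted one; keep only the sign (1 bit/pixel).
        `offset` sets the inter-lobe distance/orientation, `sigma` the scale."""
        smooth = gaussian_filter(image.astype(float), sigma)
        shifted = np.roll(smooth, shift=offset, axis=(0, 1))
        return (smooth - shifted > 0).astype(np.uint8)

    rng = np.random.default_rng(0)
    iris_patch = rng.normal(size=(64, 256))        # stand-in normalized iris
    code = ordinal_code(iris_patch)
    brighter = iris_patch + 0.8                    # global illumination change
    assert np.array_equal(code, ordinal_code(brighter))  # code is unchanged

Because only the sign of a local difference is kept, any additive brightness change cancels out exactly, illustrating the claimed illumination invariance.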
Source: http://dx.doi.org/10.1109/TPAMI.2008.240

Toward accurate and fast iris segmentation for iris biometrics.

IEEE Trans Pattern Anal Mach Intell 2009 Sep;31(9):1670-84

Center for Biometrics and Security Research and the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China.

Iris segmentation is an essential module in iris recognition because it defines the effective image region used for subsequent processing such as feature extraction. Traditional iris segmentation methods often involve an exhaustive search of a large parameter space, which is time consuming and sensitive to noise. To address these problems, this paper presents a novel algorithm for accurate and fast iris segmentation. After efficient reflection removal, an Adaboost-cascade iris detector is first built to extract a rough position of the iris center. Edge points of iris boundaries are then detected, and an elastic model named pulling and pushing is established. Under this model, the center and radius of the circular iris boundaries are iteratively refined in a way driven by the restoring forces of Hooke's law. Furthermore, a smoothing spline-based edge fitting scheme is presented to deal with noncircular iris boundaries. After that, eyelids are localized via edge detection followed by curve fitting. The novelty here is the adoption of a rank filter for noise elimination and a histogram filter for tackling the shape irregularity of eyelids. Finally, eyelashes and shadows are detected via a learned prediction model. This model provides an adaptive threshold for eyelash and shadow detection by analyzing the intensity distributions of different iris regions. Experimental results on three challenging iris image databases demonstrate that the proposed algorithm outperforms state-of-the-art methods in both accuracy and speed.
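The pulling-and-pushing step can be read as a spring system: each detected edge point exerts a Hooke's-law restoring force on the circle, and the center and radius are refined iteratively. A schematic numpy sketch of that reading, not the paper's exact update rule:

    import numpy as np

    def pull_and_push(edge_pts, center, radius, n_iter=50, step=0.5):
        """Refine a circle against edge points. Each point exerts a force
        proportional to (its distance from the center) - radius, pulling or
        pushing the center; the radius is re-estimated as the mean distance."""
        c = np.asarray(center, dtype=float)
        for _ in range(n_iter):
            d = edge_pts - c                        # vectors center -> points
            dist = np.linalg.norm(d, axis=1)
            dirs = d / dist[:, None]
            force = ((dist - radius)[:, None] * dirs).mean(axis=0)
            c += step * force                       # Hooke's-law restoring step
            radius = dist.mean()
        return c, radius

    rng = np.random.default_rng(0)
    theta = rng.uniform(0, 2 * np.pi, 200)
    pts = np.c_[50 + 30 * np.cos(theta), 40 + 30 * np.sin(theta)]
    pts += 0.5 * rng.normal(size=pts.shape)        # noisy iris-boundary edges
    print(pull_and_push(pts, center=(45.0, 45.0), radius=20.0))  # ~ (50, 40), 30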
Source: http://dx.doi.org/10.1109/TPAMI.2008.183