**129 results** match your criteria: *Annals of Statistics [Journal]*

Ann Stat 2020 Jun 17;48(3):1452-1474. Epub 2020 Jul 17.

Department of ORFE, Princeton University, Princeton, NJ 08544, USA.

Recovering low-rank structures via eigenvector perturbation analysis is a common problem in statistical machine learning, such as in factor analysis, community detection, ranking, matrix completion, among others. While a large variety of bounds are available for average errors between empirical and population statistics of eigenvectors, few results are tight for entrywise analyses, which are critical for a number of problems such as community detection. This paper investigates entrywise behaviors of eigenvectors for a large class of random matrices whose expectations are low-rank, which helps settle the conjecture in Abbe et al.

June 2020

Ann Stat 2020 Feb 17;48(1):413-439. Epub 2020 Feb 17.

Department of Statistics, and The Methodology Center, The Pennsylvania State University, University Park, PA 16802-2111, USA.

In multiple change-point analysis, one of the major challenges is to estimate the number of change-points. Most existing approaches attempt to minimize a Schwarz information criterion which balances a term quantifying model fit with a penalization term accounting for model complexity that increases with the number of change-points and limits overfitting. However, different penalization terms are required to adapt to different contexts of multiple change-point problems, and the optimal penalization magnitude usually varies with the model and error distribution.

February 2020

Ann Stat 2020 Apr 26;48(2):1001-1024. Epub 2020 May 26.

Department of Biostatistics, University of Washington.

The problem of nonparametric inference on a monotone function has been extensively studied in many particular cases. Estimators considered have often been of so-called Grenander type, being representable as the left derivative of the greatest convex minorant or least concave majorant of an estimator of a primitive function. In this paper, we provide general conditions for consistency and pointwise convergence in distribution of a class of generalized Grenander-type estimators of a monotone function.

April 2020

Ann Stat 2019 31;47(6):3300-3334. Epub 2019 Oct 31.

Department of Statistics, and The Methodology Center, The Pennsylvania State University, University Park, PA 16802-2111, USA.

This paper is concerned with tests of significance on high dimensional covariance structures, and aims to develop a unified framework for testing commonly-used linear covariance structures. We first construct a consistent estimator for parameters involved in the linear covariance structure, and then develop two tests for the linear covariance structures based on entropy loss and quadratic loss used for covariance matrix estimation. To study the asymptotic properties of the proposed tests, we study related high dimensional random matrix theory, and establish several highly useful asymptotic results.

October 2019

Ann Stat 2019 Dec 31;47(6):3009-3031. Epub 2019 Oct 31.

Department of Operations Research and Financial Engineering, Princeton University.

Principal component analysis (PCA) is fundamental to statistical machine learning. It extracts latent principal factors that contribute to the most variation of the data. When data are stored across multiple machines, however, communication cost can prohibit the computation of PCA in a central location and distributed algorithms for PCA are thus needed.

December 2019

Ann Stat 2019 21;47(4):2204-2235. Epub 2019 May 21.

Department of Operations Research & Financial Engineering, Princeton University, Princeton, New Jersey 08544.

This paper is concerned with the problem of top-K ranking from pairwise comparisons. Given a collection of items and a few pairwise comparisons across them, one wishes to identify the set of K items that receive the highest ranks. To tackle this problem, we adopt the logistic parametric model, the Bradley-Terry-Luce model, where each item is assigned a latent preference score, and where the outcome of each pairwise comparison depends solely on the relative scores of the two items involved.

May 2019

Ann Stat 2019 Oct 3;47(5):2671-2703. Epub 2019 Aug 3.

Department of Statistics, and The Methodology Center, The Pennsylvania State University, University Park, PA 16802-2111, USA.

This paper is concerned with testing linear hypotheses in high-dimensional generalized linear models. To deal with linear hypotheses, we first propose a constrained partial regularization method and study its statistical properties. We further introduce an algorithm for solving regularization problems with folded-concave penalty functions and linear constraints.

October 2019

Ann Stat 2019 Oct 3;47(5):2855-2886. Epub 2019 Aug 3.

Department of Statistics, Stanford University, 390 Serra Mall, Stanford, CA 94305.

We study the spectra of MANOVA estimators for variance component covariance matrices in multivariate random effects models. When the dimensionality of the observations is large and comparable to the number of realizations of each random effect, we show that the empirical spectra of such estimators are well-approximated by deterministic laws. The Stieltjes transforms of these laws are characterized by systems of fixed-point equations, which are numerically solvable by a simple iterative procedure.

October 2019

Ann Stat 2019 Oct 3;47(5):2887-2921. Epub 2019 Aug 3.

Department of Biostatistics, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.

Testing correlation structures has attracted extensive attention in the literature due to both its importance in real applications and several major theoretical challenges. The aim of this paper is to develop a general framework for testing correlation structures in the one-, two-, and multiple-sample testing problems under a high-dimensional setting when both the sample size and data dimension go to infinity. Our test statistics are designed to deal with both dense and sparse alternatives.

October 2019

Ann Stat 2019 Jun 13;47(3):1505-1535. Epub 2019 Feb 13.

Department of Statistics, Texas A&M University, College Station, Texas 77843, USA.

A fundamental assumption used in causal inference with observational data is that treatment assignment is ignorable given measured confounding variables. This assumption of no missing confounders is plausible if a large number of baseline covariates are included in the analysis, as we often have no prior knowledge of which variables can be important confounders. Thus, estimation of treatment effects with a large number of covariates has received considerable attention in recent years.

June 2019

Ann Stat 2019 Feb 30;47(1):205-232. Epub 2018 Nov 30.

Department of Biostatistics, Columbia University, 722 West 168th Street, New York, NY 10032, U.S.A.

New nonparametric tests for the ordering of multiple survival functions are developed with the possibility of right censorship taken into account. The motivation comes from non-inferiority trials with multiple treatments. The proposed tests are based on nonparametric likelihood ratio statistics, which are known to provide more powerful tests than Wald-type procedures, but in this setting have only been studied for pairs of survival functions or in the absence of censoring.

February 2019

Ann Stat 2019 Aug 21;47(4):2348-2377. Epub 2019 May 21.

Department of Statistics, North Carolina State University, Raleigh, NC 27695.

Precision medicine is an emerging medical paradigm that focuses on finding the most effective treatment strategy tailored for individual patients. In the literature, most existing work has focused on estimating the optimal treatment regime. However, less attention has been devoted to hypothesis testing regarding the optimal treatment regime.

August 2019

Ann Stat 2018 Dec 11;46(6B):3643-3675. Epub 2018 Sep 11.

Department of Biostatistics, Columbia University, 722 West 168th St, Rm 633, New York, New York 10032, USA.

In this paper, we develop procedures to construct simultaneous confidence bands for potentially infinite-dimensional parameters after model selection for general moment condition models where the number of parameters is potentially much larger than the sample size of available data. This allows us to cover settings with functional response data where each of the parameters is a function. The procedure is based on the construction of score functions that approximately satisfy the Neyman orthogonality condition.

December 2018

Ann Stat 2019 Feb;47(1):497-526

The University of North Carolina at Chapel Hill.

We develop an approach for feature elimination in statistical learning with kernel machines, based on recursive elimination of features. We present theoretical properties of this method and show that it is uniformly consistent in finding the correct feature space under certain generalized assumptions. We present a few case studies to show that the assumptions are met in most practical situations and present simulation results to demonstrate performance of the proposed approach.

February 2019

Ann Stat 2018 17;46(5):2125-2152. Epub 2018 Aug 17.

Department of Statistics, Chinese University of Hong Kong, Shatin, NT, Hong Kong SAR.

We study the nonparametric estimation of a decreasing density function in a general multi-sample biased sampling model in which each sample has its own weight (or bias) function. The determination of the monotone maximum likelihood estimator and its asymptotic distribution, except for the one-sample case, has been long missing in the literature due to certain non-standard structures of the likelihood function, such as non-separability and a lack of strictly positive second order derivatives of the negative of the log-likelihood function. The existence, uniqueness, self-characterization, and consistency of the estimator, and its asymptotic distribution at a fixed point, are established in this article.

August 2018

Ann Stat 2018 Aug 27;46(4):1481-1512. Epub 2018 Jun 27.

Program in Computational Biology, Fred Hutchinson Cancer Research Center.

It is common in phylogenetics to have some, perhaps partial, information about the overall evolutionary tree of a group of organisms and wish to find an evolutionary tree of a specific gene for those organisms. There may not be enough information in the gene sequences alone to accurately reconstruct the correct "gene tree." Although the gene tree may deviate from the "species tree" due to a variety of genetic processes, in the absence of evidence to the contrary it is parsimonious to assume that they agree.

August 2018

Ann Stat 2018 Jun;46(3):1109-1137

Sun Yat-sen University.

In this paper, we first introduce Ball Divergence, a novel measure of the difference between two probability measures in separable Banach spaces, and show that the Ball Divergence of two probability measures is zero if and only if these two probability measures are identical without any moment assumption. Using Ball Divergence, we present a metric rank test procedure to detect the equality of distribution measures underlying independent samples. It is therefore robust to outliers or heavy-tail data.

June 2018

Ann Stat 2018 Feb 22;46(1):308-343. Epub 2018 Feb 22.

Department of Statistics, University of Michigan, Ann Arbor, MI 48109, USA.

Censored quantile regression (CQR) has emerged as a useful regression tool for survival analysis. Some commonly used CQR methods can be characterized by stochastic integral-based estimating equations in a sequential manner across quantile levels. In this paper, we analyze CQR in a high dimensional setting where the regression functions over a continuum of quantile levels are of interest.

February 2018

Ann Stat 2018 Dec 11;46(6B):3362-3389. Epub 2018 Sep 11.

University of North Carolina at Chapel Hill, USA.

Robustness is a desirable property for many statistical techniques. As an important measure of robustness, breakdown point has been widely used for regression problems and many other settings. Despite the existing development, we observe that the standard breakdown point criterion is not directly applicable for many classification problems.

December 2018

Ann Stat 2018 Aug 27;46(4):1742-1778. Epub 2018 Jun 27.

Department of Statistics, Stanford University.

We show that in a common high-dimensional covariance model, the choice of loss function has a profound effect on optimal estimation. In an asymptotic framework based on the Spiked Covariance model and use of orthogonally invariant estimators, we show that optimal estimation of the population covariance matrix boils down to design of an optimal shrinker that acts elementwise on the sample eigenvalues. Indeed, to each loss function there corresponds a unique admissible eigenvalue shrinker dominating all other shrinkers.

August 2018

Ann Stat 2018 Oct 17;46(5):1904-1931. Epub 2018 Aug 17.

Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08544, USA.

Heavy-tailed errors impair the accuracy of the least squares estimate, which can be spoiled by a single grossly outlying observation. As argued in the seminal work of Peter Huber in 1973 [ (1973) 799-821], robust alternatives to the method of least squares are sorely needed. To achieve robustness against heavy-tailed sampling distributions, we revisit the Huber estimator from a new perspective by letting the tuning parameter involved diverge with the sample size.

October 2018

Ann Stat 2018 Aug 27;46(4):1383-1414. Epub 2018 Jun 27.

Dept of Operations Research & Financial Engineering, Sherrerd Hall, Princeton University, Princeton, NJ 08544, USA.

We propose a general Principal Orthogonal complEment Thresholding (POET) framework for large-scale covariance matrix estimation based on the approximate factor model. A set of high level sufficient conditions for the procedure to achieve optimal rates of convergence under different matrix norms is established to better understand how POET works. Such a framework allows us to recover existing results for sub-Gaussian data in a more transparent way that only depends on the concentration properties of the sample covariance matrix.

August 2018

Ann Stat 2018 Jun 3;46(3):1352-1382. Epub 2018 May 3.

Princeton University.

This paper studies hypothesis testing and parameter estimation in the context of the divide-and-conquer algorithm. In a unified likelihood based framework, we propose new test statistics and point estimators obtained by aggregating various statistics from subsamples of the full sample. In both low dimensional and sparse high dimensional settings, we address the important question of how large the number of subsamples can be, as the sample size grows large, such that the loss of efficiency due to the divide-and-conquer algorithm is negligible.

June 2018

Ann Stat 2018 Jun 3;46(3):989-1017. Epub 2018 May 3.

Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08544, USA.

Over the last two decades, many exciting variable selection methods have been developed for finding a small group of covariates that are associated with the response from a large pool. Can the discoveries by such data mining approaches be spurious due to high dimensionality and limited sample size? Can our fundamental assumptions on exogeneity of covariates needed for such variable selection be validated with the data? To answer these questions, we need to derive the distributions of the maximum spurious correlations given a certain number of predictors, namely, the distribution of the correlation of a response variable with the best linear combinations of covariates, even when the response and the covariates are independent. When the covariance matrix of the covariates possesses the restricted eigenvalue property, we derive such distributions for both a finite and a diverging number of predictors, using Gaussian approximation and empirical process techniques.

June 2018

Ann Stat 2018 Apr 3;46(2):814-841. Epub 2018 Apr 3.

Tencent AI Lab, Shennan Ave, Nanshan District, Shenzhen, Guangdong, China.

We propose a computational framework named iterative local adaptive majorize-minimization (I-LAMM) to simultaneously control algorithmic complexity and statistical error when fitting high dimensional models. I-LAMM is a two-stage algorithmic implementation of the local linear approximation to a family of folded concave penalized quasi-likelihood. The first stage solves a convex program with a crude precision tolerance to obtain a coarse initial estimator, which is further refined in the second stage by iteratively solving a sequence of convex programs with smaller precision tolerances.

April 2018

Ann Stat 2018 Jun 3;46(3):925-957. Epub 2018 May 3.

Department of Statistics, North Carolina State University, Raleigh, NC, U.S.A.

Precision medicine is a medical paradigm that focuses on finding the most effective treatment decision based on individual patient information. For many complex diseases, such as cancer, treatment decisions need to be tailored over time according to patients' responses to previous treatments. Such an adaptive strategy is referred to as a dynamic treatment regime.

June 2018

Ann Stat 2018 Feb 22;46(1):1-29. Epub 2018 Feb 22.

Department of Statistics, Columbia University, 1255 Amsterdam Avenue, New York, NY 10027.

The asymptotic efficiency of a generalized likelihood ratio test proposed by Cox is studied under the large deviations framework for error probabilities developed by Chernoff. In particular, two separate parametric families of hypotheses are considered (Cox, 1961, 1962). The significance level is set such that the maximal type I and type II error probabilities for the generalized likelihood ratio test decay exponentially fast with the same rate.

February 2018

Ann Stat 2017 15;45(6):2537-2564. Epub 2017 Dec 15.

University of California, Berkeley.

This article studies the targeted sequential inference of an optimal treatment rule (TR) and its mean reward in the non-exceptional case, that is, assuming that there is no stratum of the baseline covariates where treatment is neither beneficial nor harmful, and under a companion margin assumption. Our pivotal estimator, whose definition hinges on the targeted minimum loss estimation (TMLE) principle, actually infers the mean reward under the current estimate of the optimal TR. This data-adaptive statistical parameter is worthy of interest on its own.

December 2017

Ann Stat 2017 15;45(6):2565-2589. Epub 2017 Dec 15.

Department of Statistics, University of South Carolina.

We propose distance-based goodness-of-fit (GOF) tests for uniform stochastic ordering with two continuous distributions, both of which are unknown. Our tests are motivated by the fact that when the two distributions are uniformly stochastically ordered, their ordinal dominance curve is star-shaped. We derive asymptotic distributions and prove that our testing procedure has a unique least favorable configuration of the two distributions for every value of the distance's order parameter in [1,∞].

December 2017

Ann Stat 2017 21;45(1):1-38. Epub 2017 Feb 21.

Duke University.

Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms.

February 2017
