137 results match your criteria Annals Of Statistics[Journal]


ANALYSIS OF "LEARN-AS-YOU-GO" (LAGO) STUDIES.

Ann Stat 2021 Apr 2;49(2):793-819. Epub 2021 Apr 2.

Department of Biostatistics and Center for Methods on Implementation and Prevention Science (CMIPS), Yale School of Public Health.

In Learn-As-you-GO (LAGO) adaptive studies, the intervention is a complex multicomponent package, and is adapted in stages during the study based on past outcome data. This design formalizes standard practice in public health intervention studies. An effective intervention package is sought, while minimizing intervention package cost.

BOOSTED NONPARAMETRIC HAZARDS WITH TIME-DEPENDENT COVARIATES.

Ann Stat 2021 Aug 29;49(4):2101-2128. Epub 2021 Sep 29.

Division of Biostatistics, University of Miami.

Given functional data from a survival process with time-dependent covariates, we derive a smooth convex representation for its nonparametric log-likelihood functional and obtain its functional gradient. From this we devise a generic gradient boosting procedure for estimating the hazard function nonparametrically. An illustrative implementation of the procedure using regression trees is described to show how to recover the unknown hazard.

ASYMPTOTICALLY INDEPENDENT U-STATISTICS IN HIGH-DIMENSIONAL TESTING.

Ann Stat 2021 Feb 29;49(1):154-181. Epub 2021 Jan 29.

Division of Biostatistics, School of Public Health, University of Minnesota.

Many high-dimensional hypothesis tests aim to globally examine marginal or low-dimensional features of a high-dimensional joint distribution, such as testing of mean vectors, covariance matrices and regression coefficients. This paper constructs a family of U-statistics as unbiased estimators of the $\ell_p$-norms of those features. We show that under the null hypothesis, the U-statistics of different finite orders are asymptotically independent and normally distributed.

February 2021

ASYMPTOTIC DISTRIBUTIONS OF HIGH-DIMENSIONAL DISTANCE CORRELATION INFERENCE.

Ann Stat 2021 Aug 29;49(4):1999-2020. Epub 2021 Sep 29.

Department of Statistics and Data Science, Southern University of Science and Technology.

Distance correlation has become an increasingly popular tool for detecting the nonlinear dependence between a pair of potentially high-dimensional random vectors. Most existing works have explored its asymptotic distributions under the null hypothesis of independence between the two random vectors when only the sample size or the dimensionality diverges. Yet its asymptotic null distribution for the more realistic setting when both sample size and dimensionality diverge in the full range remains largely underdeveloped.
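
As a rough illustration of the statistic this abstract refers to, the sketch below computes the classical (biased, V-statistic) sample distance correlation from double-centered pairwise Euclidean distance matrices. It is not the paper's procedure: the function names are hypothetical, and it assumes each observation is given as a tuple of floats.

```python
import math


def _centered_distances(points):
    """Pairwise Euclidean distances, double-centered by row, column and grand means."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(xs, ys):
    """Biased sample distance correlation between two paired samples of vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two paired samples with at least two observations")
    n = len(xs)
    a = _centered_distances(xs)
    b = _centered_distances(ys)

    def mean_product(u, v):
        return sum(u[i][j] * v[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)  # squared sample distance covariance
    denom = math.sqrt(mean_product(a, a) * mean_product(b, b))
    if denom == 0.0:
        return 0.0  # one sample is constant; treated as zero dependence here
    return math.sqrt(max(dcov2, 0.0) / denom)


# A symmetric parabola: Pearson correlation is zero, distance correlation is not.
xs = [(i / 10.0,) for i in range(-10, 11)]
ys = [(x[0] ** 2,) for x in xs]
dcor_parabola = distance_correlation(xs, ys)
dcor_self = distance_correlation(xs, xs)
```

The parabola example shows why the statistic is used for nonlinear dependence: the linear correlation between the two coordinates vanishes by symmetry, while the distance correlation stays clearly positive.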

A SHRINKAGE PRINCIPLE FOR HEAVY-TAILED DATA: HIGH-DIMENSIONAL ROBUST LOW-RANK MATRIX RECOVERY.

Ann Stat 2021 Jun 9;49(3):1239-1266. Epub 2021 Aug 9.

Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544.

This paper introduces a simple principle for robust statistical inference via appropriate shrinkage on the data. This widens the scope of high-dimensional techniques, reducing the distributional conditions from sub-exponential or sub-Gaussian to more relaxed bounded second or fourth moment. As an illustration of this principle, we focus on robust estimation of the low-rank matrix $\Theta^*$ from the trace regression model $Y = \mathrm{Tr}(\Theta^{*\top} X) + \epsilon$.

AVERAGE TREATMENT EFFECTS IN THE PRESENCE OF UNKNOWN INTERFERENCE.

Ann Stat 2021 Apr 2;49(2):673-701. Epub 2021 Apr 2.

University of North Carolina, Chapel Hill.

We investigate large-sample properties of treatment effect estimators under unknown interference in randomized experiments. The inferential target is a generalization of the average treatment effect estimand that marginalizes over potential spillover effects. We show that estimators commonly used to estimate treatment effects under no interference are consistent for the generalized estimand for several common experimental designs under limited but otherwise arbitrary and unknown interference.

ASYMMETRY HELPS: EIGENVALUE AND EIGENVECTOR ANALYSES OF ASYMMETRICALLY PERTURBED LOW-RANK MATRICES.

Ann Stat 2021 Feb 29;49(1):435-458. Epub 2021 Jan 29.

Princeton University.

This paper is concerned with the interplay between statistical asymmetry and spectral methods. Suppose we are interested in estimating a rank-1 and symmetric matrix $M^\star \in \mathbb{R}^{n \times n}$, yet only a randomly perturbed version $M$ is observed. The noise matrix $M - M^\star$ is composed of independent (but not necessarily homoscedastic) entries and is, therefore, not symmetric in general.

February 2021

TEST OF SIGNIFICANCE FOR HIGH-DIMENSIONAL LONGITUDINAL DATA.

Ann Stat 2020 Oct 19;48(5):2622-2645. Epub 2020 Sep 19.

Department of Statistics, the Pennsylvania State University, University Park, PA 16802-2111, USA.

This paper concerns statistical inference for longitudinal data with ultrahigh dimensional covariates. We first study the problem of constructing confidence intervals and hypothesis tests for a low dimensional parameter of interest. The major challenge is how to construct a powerful test statistic in the presence of high-dimensional nuisance parameters and sophisticated within-subject correlation of longitudinal data.

October 2020

ENTRYWISE EIGENVECTOR ANALYSIS OF RANDOM MATRICES WITH LOW EXPECTED RANK.

Ann Stat 2020 Jun 17;48(3):1452-1474. Epub 2020 Jul 17.

Department of ORFE, Princeton University, Princeton, NJ 08544, USA.

Recovering low-rank structures via eigenvector perturbation analysis is a common problem in statistical machine learning, such as in factor analysis, community detection, ranking, matrix completion, among others. While a large variety of bounds are available for average errors between empirical and population statistics of eigenvectors, few results are tight for entrywise analyses, which are critical for a number of problems such as community detection. This paper investigates entrywise behaviors of eigenvectors for a large class of random matrices whose expectations are low-rank, which helps settle the conjecture in Abbe et al.

CONSISTENT SELECTION OF THE NUMBER OF CHANGE-POINTS VIA SAMPLE-SPLITTING.

Ann Stat 2020 Feb 17;48(1):413-439. Epub 2020 Feb 17.

Department of Statistics, and The Methodology Center, The Pennsylvania State University, University Park, PA 16802-2111, USA.

In multiple change-point analysis, one of the major challenges is to estimate the number of change-points. Most existing approaches attempt to minimize a Schwarz information criterion which balances a term quantifying model fit with a penalization term accounting for model complexity that increases with the number of change-points and limits overfitting. However, different penalization terms are required to adapt to different contexts of multiple change-point problems and the optimal penalization magnitude usually varies from the model and error distribution.

February 2020

A UNIFIED STUDY OF NONPARAMETRIC INFERENCE FOR MONOTONE FUNCTIONS.

Ann Stat 2020 Apr 26;48(2):1001-1024. Epub 2020 May 26.

Department of Biostatistics, University of Washington.

The problem of nonparametric inference on a monotone function has been extensively studied in many particular cases. Estimators considered have often been of so-called Grenander type, being representable as the left derivative of the greatest convex minorant or least concave majorant of an estimator of a primitive function. In this paper, we provide general conditions for consistency and pointwise convergence in distribution of a class of generalized Grenander-type estimators of a monotone function.

HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX.

Ann Stat 2019 31;47(6):3300-3334. Epub 2019 Oct 31.

Department of Statistics, and The Methodology Center, the Pennsylvania State University, University Park, PA 16802-2111, USA.

This paper is concerned with test of significance on high dimensional covariance structures, and aims to develop a unified framework for testing commonly-used linear covariance structures. We first construct a consistent estimator for parameters involved in the linear covariance structure, and then develop two tests for the linear covariance structures based on entropy loss and quadratic loss used for covariance matrix estimation. To study the asymptotic properties of the proposed tests, we study related high dimensional random matrix theory, and establish several highly useful asymptotic results.

October 2019

Distributed estimation of principal eigenspaces.

Ann Stat 2019 Dec 31;47(6):3009-3031. Epub 2019 Oct 31.

Department of Operations Research and Financial Engineering, Princeton University.

Principal component analysis (PCA) is fundamental to statistical machine learning. It extracts latent principal factors that contribute to the most variation of the data. When data are stored across multiple machines, however, communication cost can prohibit the computation of PCA in a central location and distributed algorithms for PCA are thus needed.

December 2019

SPECTRAL METHOD AND REGULARIZED MLE ARE BOTH OPTIMAL FOR TOP-$K$ RANKING.

Ann Stat 2019 21;47(4):2204-2235. Epub 2019 May 21.

Department of Operations Research & Financial Engineering, Princeton University, Princeton, New Jersey 08544.

This paper is concerned with the problem of top-$K$ ranking from pairwise comparisons. Given a collection of $n$ items and a few pairwise comparisons across them, one wishes to identify the set of $K$ items that receive the highest ranks. To tackle this problem, we adopt the logistic parametric model - the Bradley-Terry-Luce model, where each item is assigned a latent preference score, and where the outcome of each pairwise comparison depends solely on the relative scores of the two items involved.
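
To make the model in this abstract concrete, here is a minimal sketch of the Bradley-Terry-Luce comparison probability on the log-score (logistic) scale, together with the selection of the highest-scoring items once scores are available. It does not implement the spectral method or the regularized MLE from the paper; the function names and example scores are hypothetical.

```python
import math


def btl_win_probability(theta_i, theta_j):
    """P(item i beats item j) under BTL with latent log-scores theta_i, theta_j.

    The probability depends only on the score difference theta_i - theta_j.
    """
    return 1.0 / (1.0 + math.exp(theta_j - theta_i))


def top_k(scores, k):
    """Indices of the k items with the highest scores, best first."""
    if not 0 <= k <= len(scores):
        raise ValueError("k must be between 0 and the number of items")
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Hypothetical latent scores for four items.
example_scores = [0.1, 2.0, -1.0, 1.5]
best_two = top_k(example_scores, 2)
p_1_beats_3 = btl_win_probability(example_scores[1], example_scores[3])
```

In practice the latent scores are unknown and must be estimated from observed comparison outcomes; the sketch only shows how scores map to comparison probabilities and to a top-$K$ set.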

LINEAR HYPOTHESIS TESTING FOR HIGH DIMENSIONAL GENERALIZED LINEAR MODELS.

Ann Stat 2019 Oct 3;47(5):2671-2703. Epub 2019 Aug 3.

Department of Statistics, and The Methodology Center, the Pennsylvania State University, University Park, PA 16802-2111, USA.

This paper is concerned with testing linear hypotheses in high-dimensional generalized linear models. To deal with linear hypotheses, we first propose constrained partial regularization method and study its statistical properties. We further introduce an algorithm for solving regularization problems with folded-concave penalty functions and linear constraints.

October 2019

EIGENVALUE DISTRIBUTIONS OF VARIANCE COMPONENTS ESTIMATORS IN HIGH-DIMENSIONAL RANDOM EFFECTS MODELS.

Ann Stat 2019 Oct 3;47(5):2855-2886. Epub 2019 Aug 3.

Department of Statistics, Stanford University, 390 Serra Mall, Stanford, CA 94305.

We study the spectra of MANOVA estimators for variance component covariance matrices in multivariate random effects models. When the dimensionality of the observations is large and comparable to the number of realizations of each random effect, we show that the empirical spectra of such estimators are well-approximated by deterministic laws. The Stieltjes transforms of these laws are characterized by systems of fixed-point equations, which are numerically solvable by a simple iterative procedure.

October 2019

TEST FOR HIGH DIMENSIONAL CORRELATION MATRICES.

Ann Stat 2019 Oct 3;47(5):2887-2921. Epub 2019 Aug 3.

Department of Biostatistics, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.

Testing correlation structures has attracted extensive attention in the literature due to both its importance in real applications and several major theoretical challenges. The aim of this paper is to develop a general framework of testing correlation structures for the one-, two-, and multiple sample testing problems under a high-dimensional setting when both the sample size and data dimension go to infinity. Our test statistics are designed to deal with both the dense and sparse alternatives.

October 2019

A ROBUST AND EFFICIENT APPROACH TO CAUSAL INFERENCE BASED ON SPARSE SUFFICIENT DIMENSION REDUCTION.

Ann Stat 2019 Jun 13;47(3):1505-1535. Epub 2019 Feb 13.

Department of Statistics, Texas A&M University, College Station, Texas 77843, USA.

A fundamental assumption used in causal inference with observational data is that treatment assignment is ignorable given measured confounding variables. This assumption of no missing confounders is plausible if a large number of baseline covariates are included in the analysis, as we often have no prior knowledge of which variables can be important confounders. Thus, estimation of treatment effects with a large number of covariates has received considerable attention in recent years.

NONPARAMETRIC TESTING FOR MULTIPLE SURVIVAL FUNCTIONS WITH NON-INFERIORITY MARGINS.

Ann Stat 2019 Feb 30;47(1):205-232. Epub 2018 Nov 30.

Department of Biostatistics, Columbia University, 722 West 168th Street, New York, NY 10032, U.S.A.

New nonparametric tests for the ordering of multiple survival functions are developed with the possibility of right censorship taken into account. The motivation comes from non-inferiority trials with multiple treatments. The proposed tests are based on nonparametric likelihood ratio statistics, which are known to provide more powerful tests than Wald-type procedures, but in this setting have only been studied for pairs of survival functions or in the absence of censoring.

February 2019

ON TESTING CONDITIONAL QUALITATIVE TREATMENT EFFECTS.

Ann Stat 2019 Aug 21;47(4):2348-2377. Epub 2019 May 21.

Department of Statistics, North Carolina State University, Raleigh, NC 27695.

Precision medicine is an emerging medical paradigm that focuses on finding the most effective treatment strategy tailored for individual patients. In the literature, most of the existing works focused on estimating the optimal treatment regime. However, there has been less attention devoted to hypothesis testing regarding the optimal treatment regime.

UNIFORMLY VALID POST-REGULARIZATION CONFIDENCE REGIONS FOR MANY FUNCTIONAL PARAMETERS IN Z-ESTIMATION FRAMEWORK.

Ann Stat 2018 Dec 11;46(6B):3643-3675. Epub 2018 Sep 11.

Department of Biostatistics, Columbia University, 722 West 168th St, Rm 633, New York, New York 10032, USA.

In this paper, we develop procedures to construct simultaneous confidence bands for $\tilde{p}$ potentially infinite-dimensional parameters after model selection for general moment condition models where $\tilde{p}$ is potentially much larger than the sample size of available data, $n$. This allows us to cover settings with functional response data where each of the $\tilde{p}$ parameters is a function. The procedure is based on the construction of score functions that satisfy Neyman orthogonality condition approximately.

December 2018

FEATURE ELIMINATION IN KERNEL MACHINES IN MODERATELY HIGH DIMENSIONS.

Ann Stat 2019 Feb;47(1):497-526

The University of North Carolina at Chapel Hill.

We develop an approach for feature elimination in statistical learning with kernel machines, based on recursive elimination of features. We present theoretical properties of this method and show that it is uniformly consistent in finding the correct feature space under certain generalized assumptions. We present a few case studies to show that the assumptions are met in most practical situations and present simulation results to demonstrate performance of the proposed approach.

February 2019

ESTIMATION OF A MONOTONE DENSITY IN $s$-SAMPLE BIASED SAMPLING MODELS.

Ann Stat 2018 17;46(5):2125-2152. Epub 2018 Aug 17.

Department of Statistics, Chinese University of Hong Kong, Shatin, NT, Hong Kong SAR.

We study the nonparametric estimation of a decreasing density function $g_0$ in a general $s$-sample biased sampling model with weight (or bias) functions $w_i$ for $i = 1, \ldots, s$. The determination of the monotone maximum likelihood estimator $\hat{g}_n$ and its asymptotic distribution, except for the case when $s = 1$, has been long missing in the literature due to certain non-standard structures of the likelihood function, such as non-separability and a lack of strictly positive second order derivatives of the negative of the log-likelihood function. The existence, uniqueness, self-characterization, consistency of $\hat{g}_n$ and its asymptotic distribution at a fixed point are established in this article.

Consistency and convergence rate of phylogenetic inference via regularization.

Ann Stat 2018 Aug 27;46(4):1481-1512. Epub 2018 Jun 27.

Program in Computational Biology, Fred Hutchinson Cancer Research Center.

It is common in phylogenetics to have some, perhaps partial, information about the overall evolutionary tree of a group of organisms and wish to find an evolutionary tree of a specific gene for those organisms. There may not be enough information in the gene sequences alone to accurately reconstruct the correct "gene tree." Although the gene tree may deviate from the "species tree" due to a variety of genetic processes, in the absence of evidence to the contrary it is parsimonious to assume that they agree.

BALL DIVERGENCE: NONPARAMETRIC TWO SAMPLE TEST.

Ann Stat 2018 Jun;46(3):1109-1137

Sun Yat-sen University.

In this paper, we first introduce Ball Divergence, a novel measure of the difference between two probability measures in separable Banach spaces, and show that the Ball Divergence of two probability measures is zero if and only if these two probability measures are identical without any moment assumption. Using Ball Divergence, we present a metric rank test procedure to detect the equality of distribution measures underlying independent samples. It is therefore robust to outliers or heavy-tail data.

HIGH DIMENSIONAL CENSORED QUANTILE REGRESSION.

Ann Stat 2018 Feb 22;46(1):308-343. Epub 2018 Feb 22.

Department of Statistics, University of Michigan, Ann Arbor, MI 48109, USA.

Censored quantile regression (CQR) has emerged as a useful regression tool for survival analysis. Some commonly used CQR methods can be characterized by stochastic integral-based estimating equations in a sequential manner across quantile levels. In this paper, we analyze CQR in a high dimensional setting where the regression functions over a continuum of quantile levels are of interest.

February 2018

ASSESSING ROBUSTNESS OF CLASSIFICATION USING ANGULAR BREAKDOWN POINT.

Ann Stat 2018 Dec 11;46(6B):3362-3389. Epub 2018 Sep 11.

University of North Carolina at Chapel Hill, USA.

Robustness is a desirable property for many statistical techniques. As an important measure of robustness, breakdown point has been widely used for regression problems and many other settings. Despite the existing development, we observe that the standard breakdown point criterion is not directly applicable for many classification problems.

December 2018

Optimal Shrinkage of Eigenvalues in the Spiked Covariance Model.

Ann Stat 2018 Aug 27;46(4):1742-1778. Epub 2018 Jun 27.

Department of Statistics, Stanford University.

We show that in a common high-dimensional covariance model, the choice of loss function has a profound effect on optimal estimation. In an asymptotic framework based on the Spiked Covariance model and use of orthogonally invariant estimators, we show that optimal estimation of the population covariance matrix boils down to design of an optimal shrinker $\eta$ that acts elementwise on the sample eigenvalues. Indeed, to each loss function there corresponds a unique admissible eigenvalue shrinker $\eta^*$ dominating all other shrinkers.

A NEW PERSPECTIVE ON ROBUST $M$-ESTIMATION: FINITE SAMPLE THEORY AND APPLICATIONS TO DEPENDENCE-ADJUSTED MULTIPLE TESTING.

Ann Stat 2018 Oct 17;46(5):1904-1931. Epub 2018 Aug 17.

Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08544, USA.

Heavy-tailed errors impair the accuracy of the least squares estimate, which can be spoiled by a single grossly outlying observation. As argued in the seminal work of Peter Huber in 1973 [Ann. Statist. 1 (1973) 799-821], robust alternatives to the method of least squares are sorely needed. To achieve robustness against heavy-tailed sampling distributions, we revisit the Huber estimator from a new perspective by letting the tuning parameter involved diverge with the sample size.
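
For readers unfamiliar with the Huber estimator this abstract revisits, the sketch below shows the Huber loss with tuning parameter `tau` and a location estimate computed by iteratively reweighted averaging. It is an illustration under stated assumptions, not the paper's method: the function names are hypothetical, `tau` is fixed by the caller, and the rate at which the paper lets the tuning parameter grow with the sample size is not given in this excerpt and is not reproduced here.

```python
import statistics


def huber_loss(residual, tau):
    """Quadratic for |residual| <= tau, linear beyond that threshold."""
    a = abs(residual)
    if a <= tau:
        return 0.5 * residual * residual
    return tau * a - 0.5 * tau * tau


def huber_location(xs, tau, iterations=100):
    """Huber location estimate via iteratively reweighted averaging.

    Observations within tau of the current estimate get weight 1;
    observations further away are down-weighted by tau / |x - mu|.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    mu = statistics.median(xs)  # robust starting value
    for _ in range(iterations):
        weights = [1.0 if abs(x - mu) <= tau else tau / abs(x - mu) for x in xs]
        mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
    return mu


# One gross outlier pulls the sample mean to 20, but barely moves the Huber estimate.
sample = [0.0, 0.0, 0.0, 0.0, 100.0]
robust_center = huber_location(sample, tau=1.0)
plain_mean = sum(sample) / len(sample)
```

With a small `tau` the estimate behaves like a median, and as `tau` grows it approaches the sample mean, which is the trade-off that the choice of tuning parameter controls.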

October 2018

LARGE COVARIANCE ESTIMATION THROUGH ELLIPTICAL FACTOR MODELS.

Ann Stat 2018 Aug 27;46(4):1383-1414. Epub 2018 Jun 27.

Dept of Operations Research & Financial Engineering, Sherrerd Hall, Princeton University, Princeton, NJ 08544, USA.

We propose a general Principal Orthogonal complEment Thresholding (POET) framework for large-scale covariance matrix estimation based on the approximate factor model. A set of high level sufficient conditions for the procedure to achieve optimal rates of convergence under different matrix norms is established to better understand how POET works. Such a framework allows us to recover existing results for sub-Gaussian data in a more transparent way that only depends on the concentration properties of the sample covariance matrix.
