271 results match your criteria Annals Of Applied Statistics[Journal]


VARIABLE PRIORITIZATION IN NONLINEAR BLACK BOX METHODS: A GENETIC ASSOCIATION CASE STUDY.

Ann Appl Stat 2019 Jun 17;13(2):958-989. Epub 2019 Jun 17.

Duke University.

The central aim in this paper is to address variable selection questions in nonlinear and nonparametric regression. Motivated by statistical genetics, where nonlinear interactions are of particular interest, we introduce a novel and interpretable way to summarize the relative importance of predictor variables. Methodologically, we develop the "RelATive cEntrality" (RATE) measure to prioritize candidate genetic variants that are not just marginally important, but whose associations also stem from significant covarying relationships with other variants in the data.

DOI: http://dx.doi.org/10.1214/18-aoas1222
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7295151

ESTIMATING AND FORECASTING THE SMOKING-ATTRIBUTABLE MORTALITY FRACTION FOR BOTH GENDERS JOINTLY IN OVER 60 COUNTRIES.

Ann Appl Stat 2020 Mar 16;14(1):381-408. Epub 2020 Apr 16.

Department of Statistics, Box 354322, University of Washington, Seattle, Washington 98195-4322, USA.

Smoking is one of the leading preventable threats to human health and a major risk factor for lung cancer, upper aero-digestive cancer, and chronic obstructive pulmonary disease. Estimating and forecasting the smoking attributable fraction (SAF) of mortality can yield insights into smoking epidemics and also provide a basis for more accurate mortality and life expectancy projection. Peto et al.

DOI: http://dx.doi.org/10.1214/19-aoas1306
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7220047

A HIDDEN MARKOV MODEL APPROACH TO CHARACTERIZING THE PHOTO-SWITCHING BEHAVIOR OF FLUOROPHORES.

Ann Appl Stat 2019 Sep 17;13(3):1397-1429. Epub 2019 Oct 17.

Imperial College London.

Fluorescing molecules (fluorophores) that stochastically switch between photon-emitting and dark states underpin some of the most celebrated advancements in super-resolution microscopy. While this stochastic behavior has been heavily exploited, full characterization of the underlying models can potentially drive forward further imaging methodologies. Under the assumption that fluorophores move between fluorescing and dark states as continuous time Markov processes, the goal is to use a sequence of images to select a model and estimate the transition rates.

DOI: http://dx.doi.org/10.1214/19-AOAS1240
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6957128
September 2019

MULTILAYER KNOCKOFF FILTER: CONTROLLED VARIABLE SELECTION AT MULTIPLE RESOLUTIONS.

Ann Appl Stat 2019 Mar 10;13(1):1-33. Epub 2019 Apr 10.

Department of Statistics, Stanford University, 390 Serra Mall, Stanford, California 94305.

We tackle the problem of selecting from among a large number of variables those that are "important" for an outcome. We consider situations where groups of variables are also of interest. For example, each variable might be a genetic polymorphism, and we might want to study how a trait depends on variability in genes, segments of DNA that typically contain multiple such polymorphisms.

DOI: http://dx.doi.org/10.1214/18-AOAS1185
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6827557

Bayesian Semiparametric Joint Regression Analysis of Recurrent Adverse Events and Survival in Esophageal Cancer Patients.

Ann Appl Stat 2019 Mar 10;13(1):221-247. Epub 2019 Apr 10.

Department of Radiation Oncology, M.D. Anderson, Houston, TX.

We propose a Bayesian semiparametric joint regression model for a recurrent event process and survival time. Assuming independent latent subject frailties, we define marginal models for the recurrent event process intensity and survival distribution as functions of the subject's frailty and baseline covariates. A robust Bayesian model, called Joint-DP, is obtained by assuming a Dirichlet process for the frailty distribution.

DOI: http://dx.doi.org/10.1214/18-AOAS1182
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6824476

EARLY DIAGNOSIS OF NEUROLOGICAL DISEASE USING PEAK DEGENERATION AGES OF MULTIPLE BIOMARKERS.

Ann Appl Stat 2019 Jun 17;13(2):1295-1318. Epub 2019 Jun 17.

Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599.

Neurological diseases are due to the loss of structure or function of neurons that eventually leads to cognitive deficit, neuropsychiatric symptoms, and impaired activities of daily living. Identifying sensitive and specific biological and clinical markers for early diagnosis allows recruiting patients into a clinical trial to test therapeutic intervention. However, many biomarker studies considered a single biomarker at one time that fails to provide precise prediction for disease age at onset.

DOI: http://dx.doi.org/10.1214/18-AOAS1236
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6822567
June 2019

BAYESIAN METHODS FOR MULTIPLE MEDIATORS: RELATING PRINCIPAL STRATIFICATION AND CAUSAL MEDIATION IN THE ANALYSIS OF POWER PLANT EMISSION CONTROLS.

Ann Appl Stat 2019 Sep 17;13(3):1927-1956. Epub 2019 Oct 17.

Harvard T.H. Chan School of Public Health.

Emission control technologies installed on power plants are a key feature of many air pollution regulations in the US. While such regulations are predicated on the presumed relationships between emissions, ambient air pollution, and human health, many of these relationships have never been empirically verified. The goal of this paper is to develop new statistical methods to quantify these relationships.

DOI: http://dx.doi.org/10.1214/19-AOAS1260
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6814408
September 2019

CAUSAL INFERENCE IN THE CONTEXT OF AN ERROR PRONE EXPOSURE: AIR POLLUTION AND MORTALITY.

Ann Appl Stat 2019 Mar 10;13(1):520-547. Epub 2019 Apr 10.

Harvard T.H. Chan School of Public Health.

We propose a new approach for estimating causal effects when the exposure is measured with error and confounding adjustment is performed via a generalized propensity score (GPS). Using validation data, we propose a regression calibration (RC)-based adjustment for a continuous error-prone exposure combined with GPS to adjust for confounding (RC-GPS). The outcome analysis is conducted after transforming the corrected continuous exposure into a categorical exposure.

DOI: http://dx.doi.org/10.1214/18-AOAS1206
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6812524

A SIMULATION-BASED FRAMEWORK FOR ASSESSING THE FEASIBILITY OF RESPONDENT-DRIVEN SAMPLING FOR ESTIMATING CHARACTERISTICS IN POPULATIONS OF LESBIAN, GAY AND BISEXUAL OLDER ADULTS.

Ann Appl Stat 2018 Dec 13;12(4):2252-2278. Epub 2018 Nov 13.

University of Washington.

Respondent-driven sampling (RDS) is a method for sampling from a target population by leveraging social connections. RDS is invaluable to the study of hard-to-reach populations. However, RDS is costly and can be infeasible.

DOI: http://dx.doi.org/10.1214/18-AOAS1151
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6800244
December 2018

JOINT MEAN AND COVARIANCE MODELING OF MULTIPLE HEALTH OUTCOME MEASURES.

Ann Appl Stat 2019 Mar 10;13(1):321-339. Epub 2019 Apr 10.

Department of Statistical Science, Duke University, 219 Old Chemistry Building, Box 90251, Durham, North Carolina 27708-0251, USA.

Health exams determine a patient's health status by comparing the patient's measurement with a population reference range, a 95% interval derived from a homogeneous reference population. Similarly, most of the established relation among health problems are assumed to hold for the entire population. We use data from the 2009-2010 National Health and Nutrition Examination Survey (NHANES) on four major health problems in the U.

DOI: http://dx.doi.org/10.1214/18-AOAS1187
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6699761

GRAPHICAL MODELS FOR ZERO-INFLATED SINGLE CELL GENE EXPRESSION.

Ann Appl Stat 2019 Jun 17;13(2):848-873. Epub 2019 Jun 17.

Department of Statistics, University of Washington, Seattle, Washington.

Bulk gene expression experiments relied on aggregations of thousands of cells to measure the average expression in an organism. Advances in microfluidic and droplet sequencing now permit expression profiling in single cells. This study of cell-to-cell variation reveals that individual cells lack detectable expression of transcripts that appear abundant on a population level, giving rise to zero-inflated expression patterns.

DOI: http://dx.doi.org/10.1214/18-AOAS1213
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6684253

ESTIMATING POPULATION AVERAGE CAUSAL EFFECTS IN THE PRESENCE OF NON-OVERLAP: THE EFFECT OF NATURAL GAS COMPRESSOR STATION EXPOSURE ON CANCER MORTALITY.

Ann Appl Stat 2019 Jun 17;13(2):1242-1267. Epub 2019 Jun 17.

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.

Most causal inference studies rely on the assumption of overlap to estimate population or sample average causal effects. When data suffer from non-overlap, estimation of these estimands requires reliance on model specifications, due to poor data support. All existing methods to address non-overlap, such as trimming or down-weighting data in regions of poor data support, change the estimand so that inference cannot be made on the sample or the underlying population.

DOI: http://dx.doi.org/10.1214/18-AOAS1231
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6658123

NONPARAMETRIC INFERENCE FOR IMMUNE RESPONSE THRESHOLDS OF RISK IN VACCINE STUDIES.

Ann Appl Stat 2019 Jun 17;13(2):1147-1165. Epub 2019 Jun 17.

Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A.

An important objective in vaccine studies entails identifying an immune response which is predictive of disease risk. Nonparametric methods are developed for inference on immune response thresholds that are associated with specified levels of disease risk, including where the risk level is zero. This threshold is defined as the minimum immune response value above which disease risk is less than or equal to the desired level.

DOI: http://dx.doi.org/10.1214/18-AOAS1237
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6613658
June 2019

BAYESIAN LATENT HIERARCHICAL MODEL FOR TRANSCRIPTOMIC META-ANALYSIS TO DETECT BIOMARKERS WITH CLUSTERED META-PATTERNS OF DIFFERENTIAL EXPRESSION SIGNALS.

Ann Appl Stat 2019 Mar 10;13(1):340-366. Epub 2019 Apr 10.

Department of Biostatistics, Human Genetics and Computational Biology, University of Pittsburgh, Pittsburgh, PA 15261.

Due to the rapid development of high-throughput experimental techniques and fast-dropping prices, many transcriptomic datasets have been generated and accumulated in the public domain. Meta-analysis combining multiple transcriptomic studies can increase the statistical power to detect disease-related biomarkers. In this paper, we introduce a Bayesian latent hierarchical model to perform transcriptomic meta-analysis.

DOI: http://dx.doi.org/10.1214/18-AOAS1188
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6472949
March 2019

TREE-BASED REINFORCEMENT LEARNING FOR ESTIMATING OPTIMAL DYNAMIC TREATMENT REGIMES.

Ann Appl Stat 2018 Sep 11;12(3):1914-1938. Epub 2018 Sep 11.

Institute for Social Research, University of Michigan, Ann Arbor, Michigan 48104, USA.

Dynamic treatment regimes (DTRs) are sequences of treatment decision rules, in which treatment may be adapted over time in response to the changing course of an individual. Motivated by the substance use disorder (SUD) study, we propose a tree-based reinforcement learning (T-RL) method to directly estimate optimal DTRs in a multi-stage multi-treatment setting. At each stage, T-RL builds an unsupervised decision tree that directly handles the problem of optimization with multiple treatment comparisons, through a purity measure constructed with augmented inverse probability weighted estimators.

Publisher site: https://projecteuclid.org/euclid.aoas/1536652980
DOI: http://dx.doi.org/10.1214/18-AOAS1137
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6457899
September 2018

Modeling Hybrid Traits for Comorbidity and Genetic Studies of Alcohol and Nicotine Co-Dependence.

Ann Appl Stat 2018 Dec 13;12(4):2359-2378. Epub 2018 Nov 13.

Heping Zhang is Susan Dwight Bliss Professor, Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06520; Dungang Liu is Assistant Professor, Department of Operations, Business Analytics and Information Systems, University of Cincinnati Lindner College of Business, Cincinnati, OH 45221; Jiwei Zhao is Assistant Professor, Department of Biostatistics, State University of New York at Buffalo, Buffalo, NY 14214; and Xuan Bi is Postdoctoral Associate, Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06520.

We propose a novel multivariate model for analyzing hybrid traits and identifying genetic factors for comorbid conditions. Comorbidity is a common phenomenon in mental health in which an individual suffers from multiple disorders simultaneously. For example, in the Study of Addiction: Genetics and Environment (SAGE), alcohol and nicotine addiction were recorded through multiple assessments that we refer to as hybrid traits.

DOI: http://dx.doi.org/10.1214/18-AOAS1156
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6338437
December 2018

EXACT SPIKE TRAIN INFERENCE VIA ℓ₀ OPTIMIZATION.

Ann Appl Stat 2018 Dec 13;12(4):2457-2482. Epub 2018 Nov 13.

Departments of Statistics and Biostatistics, University of Washington, Seattle, Washington 98195, USA.

In recent years new technologies in neuroscience have made it possible to measure the activities of large numbers of neurons simultaneously in behaving animals. For each neuron a fluorescence trace is measured; this can be seen as a first-order approximation of the neuron's activity over time. Determining the exact time at which a neuron spikes on the basis of its fluorescence trace is an important open problem in the field of computational neuroscience.

DOI: http://dx.doi.org/10.1214/18-AOAS1162
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6322847
December 2018

ESTIMATING AND COMPARING CANCER PROGRESSION RISKS UNDER VARYING SURVEILLANCE PROTOCOLS.

Ann Appl Stat 2018 Sep 11;12(3):1773-1795. Epub 2018 Sep 11.

Fred Hutchinson Cancer Research Center.

Outcomes after cancer diagnosis and treatment are often observed at discrete times via doctor-patient encounters or specialized diagnostic examinations. Despite their ubiquity as endpoints in cancer studies, such outcomes pose challenges for analysis. In particular, comparisons between studies or patient populations with different surveillance schema may be confounded by differences in visit frequencies.

DOI: http://dx.doi.org/10.1214/17-AOAS1130
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6322848
September 2018

SCALPEL: EXTRACTING NEURONS FROM CALCIUM IMAGING DATA.

Ann Appl Stat 2018 Dec 13;12(4):2430-2456. Epub 2018 Nov 13.

Department of Biostatistics, University of Washington, Seattle, Washington 98195, USA; Departments of Biostatistics and Statistics, University of Washington, Seattle, Washington 98195, USA.

In the past few years, new technologies in the field of neuroscience have made it possible to simultaneously image activity in large populations of neurons at cellular resolution in behaving animals. In mid-2016, a huge repository of this so-called "calcium imaging" data was made publicly available. The availability of this large-scale data resource opens the door to a host of scientific questions for which new statistical methods must be developed.

Publisher site: https://projecteuclid.org/euclid.aoas/1542078051
DOI: http://dx.doi.org/10.1214/18-AOAS1159
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6269150
December 2018

The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments.

Ann Appl Stat 2018 Dec 13;12(4):2075-2095. Epub 2018 Nov 13.

Department of Cell Biology, Harvard Medical School, 240 Longwood Ave, Boston, MA, 02115, USA; Department of Biostatistics, University of North Carolina at Chapel Hill, 135 Dauer Drive, 3101 McGavran-Greenberg Hall, CB 7420, Chapel Hill, NC 27599, USA; Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, 120 Mason Farm Rd, Campus Box 7260, Chapel Hill, NC 27599, USA.

An idealized version of a label-free discovery mass spectrometry proteomics experiment would provide absolute abundance measurements for a whole proteome, across varying conditions. Unfortunately, this ideal is not realized. Measurements are made on peptides requiring an inferential step to obtain protein level estimates.

Publisher site: https://projecteuclid.org/euclid.aoas/1542078037
DOI: http://dx.doi.org/10.1214/18-AOAS1144
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6249692
December 2018

TPRM: TENSOR PARTITION REGRESSION MODELS WITH APPLICATIONS IN IMAGING BIOMARKER DETECTION.

Ann Appl Stat 2018 Sep 11;12(3):1422-1450. Epub 2018 Sep 11.

University of North Carolina at Chapel Hill.

Medical imaging studies have collected high dimensional imaging data to identify imaging biomarkers for diagnosis, screening, and prognosis, among many others. These imaging data are often represented in the form of a multi-dimensional array, called a tensor. The aim of this paper is to develop a tensor partition regression modeling (TPRM) framework to establish a relationship between low-dimensional clinical outcomes (e.

Publisher site: https://projecteuclid.org/euclid.aoas/1536652960
DOI: http://dx.doi.org/10.1214/17-AOAS1116
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6221472
September 2018

COMPLEX-VALUED TIME SERIES MODELING FOR IMPROVED ACTIVATION DETECTION IN FMRI STUDIES.

Ann Appl Stat 2018 Sep 11;12(3):1451-1478. Epub 2018 Sep 11.

Marquette University.

A complex-valued data-based model with pth order autoregressive errors and general real/imaginary error covariance structure is proposed as an alternative to the commonly-used magnitude-only data-based autoregressive model for fMRI time series. Likelihood-ratio-test-based activation statistics are derived for both models and compared for experimental and simulated data. For a dataset from a right-hand finger-tapping experiment, the activation map obtained using complex-valued modeling more clearly identifies the primary activation region (left functional central sulcus) than the magnitude-only model.

Publisher site: https://projecteuclid.org/euclid.aoas/1536652961
DOI: http://dx.doi.org/10.1214/17-AOAS1117
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6168091
September 2018

BIOMARKER CHANGE-POINT ESTIMATION WITH RIGHT CENSORING IN LONGITUDINAL STUDIES.

Ann Appl Stat 2017 Sep;11(3):1738-1762

Center for Imaging Science, Johns Hopkins University, 3400 N. Charles St. Baltimore, Maryland 21218 USA.

We consider in this paper a statistical two-phase regression model in which the change point of a disease biomarker is measured relative to another point in time, such as the manifestation of the disease, which is subject to right-censoring (i.e., possibly unobserved over the entire course of the study).

Publisher site: https://projecteuclid.org/euclid.aoas/1507168846
DOI: http://dx.doi.org/10.1214/17-AOAS1056
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6157754
September 2017

KERNEL-PENALIZED REGRESSION FOR ANALYSIS OF MICROBIOME DATA.

Ann Appl Stat 2018 Mar 9;12(1):540-566. Epub 2018 Mar 9.

University of Washington.

The analysis of human microbiome data is often based on dimension-reduced graphical displays and clusterings derived from vectors of microbial abundances in each sample. Common to these ordination methods is the use of biologically motivated definitions of similarity. Principal coordinate analysis, in particular, is often performed using ecologically defined distances, allowing analyses to incorporate context-dependent, non-Euclidean structure.

Publisher site: https://projecteuclid.org/euclid.aoas/1520564483
DOI: http://dx.doi.org/10.1214/17-AOAS1102
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6138053
March 2018

Topological Data Analysis of Single-Trial Electroencephalographic Signals.

Ann Appl Stat 2018 Sep 11;12(3):1506-1534. Epub 2018 Sep 11.

Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53705, U.S.A.

Epilepsy is a neurological disorder that can negatively affect the visual, audial and motor functions of the human brain. Statistical analysis of neurophysiological recordings, such as electroencephalogram (EEG), facilitates the understanding and diagnosis of epileptic seizures. Standard statistical methods, however, do not account for topological features embedded in EEG signals.

DOI: http://dx.doi.org/10.1214/17-AOAS1119
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6135261
September 2018

ADAPTIVE-WEIGHT BURDEN TEST FOR ASSOCIATIONS BETWEEN QUANTITATIVE TRAITS AND GENOTYPE DATA WITH COMPLEX CORRELATIONS.

Ann Appl Stat 2018 Sep 11;12(3):1558-1582. Epub 2018 Sep 11.

Department of Biostatistics, Virginia Commonwealth University, Richmond, VA 23298, USA.

High-throughput sequencing has often been used to screen samples from pedigrees or with population structure, producing genotype data with complex correlations rendered from both familial relation and linkage disequilibrium. With such data, it is critical to account for these genotypic correlations when assessing the contribution of variants by gene or pathway. Recognizing the limitations of existing association testing methods, we propose the Adaptive-weight Burden Test (ABT), a retrospective, mixed-model test for genetic association of quantitative traits on genotype data with complex correlations.

DOI: http://dx.doi.org/10.1214/17-AOAS1121
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6133321
September 2018

A UNIFIED STATISTICAL FRAMEWORK FOR SINGLE CELL AND BULK RNA SEQUENCING DATA.

Ann Appl Stat 2018 Mar 9;12(1):609-632. Epub 2018 Mar 9.

Carnegie Mellon University.

Recent advances in technology have enabled the measurement of RNA levels for individual cells. Compared to traditional tissue-level bulk RNA-seq data, single cell sequencing yields valuable insights about gene expression profiles for different cell types, which is potentially critical for understanding many complex human diseases. However, developing quantitative tools for such data remains challenging because of high levels of technical noise, especially the "dropout" events.

DOI: http://dx.doi.org/10.1214/17-AOAS1110
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6114100
March 2018

FUNCTIONAL PRINCIPAL VARIANCE COMPONENT TESTING FOR A GENETIC ASSOCIATION STUDY OF HIV PROGRESSION.

Ann Appl Stat 2018 Sep 11;12(3):1871-1893. Epub 2018 Sep 11.

Department of Biostatistics, Harvard T. H. Chan School of Public Health, 655 Huntington Ave, Boston, Massachusetts 02115, USA.

HIV-1C is the most prevalent subtype of HIV-1 and accounts for over half of HIV-1 infections worldwide. Host genetic influence of HIV infection has been previously studied in HIV-1B, but little attention has been paid to the more prevalent subtype C. To understand the role of host genetics in HIV-1C disease progression, we perform a study to assess the association between longitudinally collected measures of disease and more than 100,000 genetic markers located on chromosome 6.

DOI: http://dx.doi.org/10.1214/18-AOAS1135
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7111467
September 2018

TOWARD BAYESIAN INFERENCE OF THE SPATIAL DISTRIBUTION OF PROTEINS FROM THREE-CUBE FÖRSTER RESONANCE ENERGY TRANSFER DATA.

Ann Appl Stat 2017 Sep 5;11(3):1711-1737. Epub 2017 Oct 5.

Aalborg University.

Förster resonance energy transfer (FRET) is a quantum-physical phenomenon where energy may be transferred from one molecule to a neighbor molecule if the molecules are close enough. Using fluorophore molecule marking of proteins in a cell, it is possible to measure in microscopic images to what extent FRET takes place between the fluorophores. This provides indirect information of the spatial distribution of the proteins.

DOI: http://dx.doi.org/10.1214/17-AOAS1054
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5982602
September 2017

Estimating Large Correlation Matrices for International Migration.

Ann Appl Stat 2018 Jun 28;12(2):940-970. Epub 2018 Jul 28.

Department of Statistics, University of Washington, Seattle.

The United Nations is the major organization producing and regularly updating probabilistic population projections for all countries. International migration is a critical component of such projections, and between-country correlations are important for forecasts of regional aggregates. However, in the data we consider there are 200 countries and only 12 data points, each one corresponding to a five-year time period.

DOI: http://dx.doi.org/10.1214/18-aoas1175
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7164801

A TESTING BASED APPROACH TO THE DISCOVERY OF DIFFERENTIALLY CORRELATED VARIABLE SETS.

Ann Appl Stat 2018 Jun 28;12(2):1180-1203. Epub 2018 Jul 28.

The University of North Carolina at Chapel Hill.

Given data obtained under two sampling conditions, it is often of interest to identify variables that behave differently in one condition than in the other. We introduce a method for differential analysis of second-order behavior called Differential Correlation Mining (DCM). The DCM method identifies differentially correlated sets of variables, with the property that the average pairwise correlation between variables in a set is higher under one sample condition than the other.

DOI: http://dx.doi.org/10.1214/17-AOAS1083
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6927674

ADJUSTED REGULARIZATION IN LATENT GRAPHICAL MODELS: APPLICATION TO MULTIPLE-NEURON SPIKE COUNT DATA.

Ann Appl Stat 2018 Jun 28;12(2):1068-1095. Epub 2018 Jul 28.

Carnegie Mellon University, Department of Statistics, Baker Hall 132, 5000 Forbes Avenue, Pittsburgh, 15203, PA, USA.

A major challenge in contemporary neuroscience is to analyze data from large numbers of neurons recorded simultaneously across many experimental replications (trials), where the data are counts of neural firing events, and one of the basic problems is to characterize the dependence structure among such multivariate counts. Methods of estimating high-dimensional covariation based on ℓ₁-regularization are most appropriate when there are a small number of relatively large partial correlations, but in neural data there are often large numbers of relatively small partial correlations. Furthermore, the variation across trials is often confounded by Poisson-like variation within trials.

DOI: http://dx.doi.org/10.1214/18-AOAS1190
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6879176

POWERFUL TEST BASED ON CONDITIONAL EFFECTS FOR GENOME-WIDE SCREENING.

Authors: Yaowu Liu, Jun Xie

Ann Appl Stat 2018 Mar 9;12(1):567-585. Epub 2018 Mar 9.

Department of Statistics, Purdue University, 250 N. University Street, West Lafayette, Indiana 47907, USA.

This paper considers testing procedures for screening large genome-wide data, where we examine hundreds of thousands of genetic variants, e.g., single nucleotide polymorphisms (SNP), on a quantitative phenotype.

Publisher site: https://projecteuclid.org/euclid.aoas/1520564484
DOI: http://dx.doi.org/10.1214/17-AOAS1103
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5931742
March 2018

MSIQ: JOINT MODELING OF MULTIPLE RNA-SEQ SAMPLES FOR ACCURATE ISOFORM QUANTIFICATION.

Ann Appl Stat 2018 Mar 9;12(1):510-539. Epub 2018 Mar 9.

University of California, Los Angeles.

Next-generation RNA sequencing (RNA-seq) technology has been widely used to assess full-length RNA isoform abundance in a high-throughput manner. RNA-seq data offer insight into gene expression levels and transcriptome structures, enabling us to better understand the regulation of gene expression and fundamental biological processes. Accurate isoform quantification from RNA-seq data is challenging due to the information loss in sequencing experiments.

DOI: http://dx.doi.org/10.1214/17-AOAS1100
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5935499
March 2018

LATENT SPACE MODELS FOR MULTIVIEW NETWORK DATA.

Ann Appl Stat 2017 Sep 5;11(3):1217-1244. Epub 2017 Oct 5.

Department of Statistics, Department of Sociology, University of Washington, Box 354322 Seattle, Washington 98195-4322 USA, URL: http://www.stat.washington.edu/~tylermc/.

Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g.

DOI: http://dx.doi.org/10.1214/16-AOAS955
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5927604
September 2017

DESIGN OF VACCINE TRIALS DURING OUTBREAKS WITH AND WITHOUT A DELAYED VACCINATION COMPARATOR.

Ann Appl Stat 2018 Mar 9;12(1):330-347. Epub 2018 Mar 9.

University of Florida.

Conducting vaccine efficacy trials during outbreaks of emerging pathogens poses particular challenges. The "Ebola ça suffit" trial in Guinea used a novel ring vaccination cluster randomized design to target populations at highest risk of infection. Another key feature of the trial was the use of a delayed vaccination arm as a comparator, in which clusters were randomized to immediate vaccination or vaccination 21 days later.

DOI: http://dx.doi.org/10.1214/17-AOAS1095
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5878056
March 2018

A UNIFIED FRAMEWORK FOR VARIANCE COMPONENT ESTIMATION WITH SUMMARY STATISTICS IN GENOME-WIDE ASSOCIATION STUDIES.

Authors: Xiang Zhou

Ann Appl Stat 2017 Dec 28;11(4):2027-2051. Epub 2017 Dec 28.

University of Michigan.

Linear mixed models (LMMs) are among the most commonly used tools for genetic association studies. However, the standard method for estimating variance components in LMMs, the restricted maximum likelihood estimation method (REML), suffers from several important drawbacks: REML requires individual-level genotypes and phenotypes from all samples in the study, is computationally slow, and produces downward-biased estimates in case-control studies. To remedy these drawbacks, we present an alternative framework for variance component estimation, which we refer to as MQS. Read More

Source
DOI Listing: http://dx.doi.org/10.1214/17-AOAS1052
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5836736
December 2017

A MULTI-RESOLUTION MODEL FOR NON-GAUSSIAN RANDOM FIELDS ON A SPHERE WITH APPLICATION TO IONOSPHERIC ELECTROSTATIC POTENTIALS.

Ann Appl Stat 2018 Mar 9;12(1):459-489. Epub 2018 Mar 9.

Department of Aerospace Engineering Sciences, University of Colorado, Boulder, 429 UCB, Boulder, Colorado 80309, USA.

Gaussian random fields have been one of the most popular tools for analyzing spatial data. However, many geophysical and environmental processes often display non-Gaussian characteristics. In this paper, we propose a new class of spatial models for non-Gaussian random fields on a sphere based on a multi-resolution analysis. Read More

Source
DOI Listing: http://dx.doi.org/10.1214/17-AOAS1104
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6827713

A NOVEL AND EFFICIENT ALGORITHM FOR DE NOVO DISCOVERY OF MUTATED DRIVER PATHWAYS IN CANCER.

Ann Appl Stat 2017 Sep 5;11(3):1481-1512. Epub 2017 Oct 5.

University of Minnesota.

Next-generation sequencing studies on cancer somatic mutations have discovered that driver mutations tend to appear in most tumor samples, but they barely overlap in any single tumor sample, presumably because a single driver mutation can perturb the whole pathway. Based on the corresponding new concepts of coverage and mutual exclusivity, new methods can be designed for de novo discovery of mutated driver pathways in cancer. Since the computational problem is a combinatorial optimization with an objective function involving a discontinuous indicator function in high dimension, many existing optimization algorithms, such as brute-force enumeration, gradient descent and Newton's method, are practically infeasible or directly inapplicable. Read More

Source
Publisher Site: https://projecteuclid.org/euclid.aoas/1507168837
DOI Listing: http://dx.doi.org/10.1214/17-AOAS1042
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5823541
September 2017

BAYESIAN LARGE-SCALE MULTIPLE REGRESSION WITH SUMMARY STATISTICS FROM GENOME-WIDE ASSOCIATION STUDIES.

Ann Appl Stat 2017 5;11(3):1561-1592. Epub 2017 Oct 5.

University of Chicago.

Bayesian methods for large-scale multiple regression provide attractive approaches to the analysis of genome-wide association studies (GWAS). For example, they can estimate heritability of complex traits, allowing for both polygenic and sparse models; and by incorporating external genomic data into the priors, they can increase power and yield new biological insights. However, these methods require access to individual genotypes and phenotypes, which are often not easily available. Read More

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5796536
DOI Listing: http://dx.doi.org/10.1214/17-aoas1046
October 2017

ROBUST MIXED EFFECTS MODEL FOR CLUSTERED FAILURE TIME DATA: APPLICATION TO HUNTINGTON'S DISEASE EVENT MEASURES.

Ann Appl Stat 2017 20;11(2):1085-1116. Epub 2017 Jul 20.

Columbia University.

An important goal in clinical and statistical research is properly modeling the distribution for clustered failure times which have a natural intraclass dependency and are subject to censoring. We handle these challenges with a novel approach that does not impose restrictive modeling or distributional assumptions. Using a logit transformation, we relate the distribution for clustered failure times to covariates and a random, subject-specific effect. Read More

Source
DOI Listing: http://dx.doi.org/10.1214/17-AOAS1038
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5793916
July 2017

DOUBLY ROBUST ESTIMATION OF OPTIMAL TREATMENT REGIMES FOR SURVIVAL DATA - WITH APPLICATION TO AN HIV/AIDS STUDY.

Ann Appl Stat 2017 Sep 5;11(3):1763-1786. Epub 2017 Oct 5.

School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.

In many biomedical settings, assigning every patient the same treatment may not be optimal due to patient heterogeneity. Individualized treatment regimes have the potential to dramatically improve clinical outcomes. When the primary outcome is censored survival time, a main interest is to find optimal treatment regimes that maximize the survival probability of patients. Read More

Source
DOI Listing: http://dx.doi.org/10.1214/17-AOAS1057
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5749433
September 2017

INFERENCE FOR SOCIAL NETWORK MODELS FROM EGOCENTRICALLY SAMPLED DATA, WITH APPLICATION TO UNDERSTANDING PERSISTENT RACIAL DISPARITIES IN HIV PREVALENCE IN THE US.

Ann Appl Stat 2017 Mar 8;11(1):427-455. Epub 2017 Apr 8.

Egocentric network sampling observes the network of interest from the point of view of a set of sampled actors, who provide information about themselves and anonymized information on their network neighbors. In survey research, this is often the most practical, and sometimes the only, way to observe certain classes of networks, with the sexual networks that underlie HIV transmission being the archetypal case. Although methods exist for recovering some descriptive network features, there is no rigorous and practical statistical foundation for estimation and inference for network models from such data. Read More

Source
DOI Listing: http://dx.doi.org/10.1214/16-AOAS1010
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5737754
March 2017

Quantification of Multiple Tumor Clones Using Gene Array and Sequencing Data.

Ann Appl Stat 2017 Jun 20;11(2):967-991. Epub 2017 Jul 20.

Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.

Cancer development is driven by genomic alterations, including copy number aberrations. The detection of copy number aberrations in tumor cells is often complicated by possible contamination of normal stromal cells in tumor samples and intratumor heterogeneity, namely the presence of multiple clones of tumor cells. In order to correctly quantify copy number aberrations, it is critical to successfully de-convolute the complex structure of the genetic information from tumor samples. Read More

Source
DOI Listing: http://dx.doi.org/10.1214/17-AOAS1026
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5728449
June 2017

Spatial Multiresolution Analysis of the Effect of PM2.5 on Birth Weights.

Ann Appl Stat 2017 20;11(2):792-807. Epub 2017 Jul 20.

Harvard Chan School of Public Health.

Fine particulate matter (PM2.5) measured at a given location is a mix of pollution generated locally and pollution traveling long distances in the atmosphere. Therefore, the identification of spatial scales associated with health effects can inform on pollution sources responsible for these effects, resulting in more targeted regulatory policy. Recently, prediction methods that yield high-resolution spatial estimates of PM2.5 exposures allow one to evaluate such scale-specific associations. Read More

Source
DOI Listing: http://dx.doi.org/10.1214/16-AOAS1018
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5716638
July 2017

LATENT CLASS MODELING USING MATRIX COVARIATES WITH APPLICATION TO IDENTIFYING EARLY PLACEBO RESPONDERS BASED ON EEG SIGNALS.

Ann Appl Stat 2017 Sep 5;11(3):1513-1536. Epub 2017 Oct 5.

Columbia University.

Latent class models are widely used to identify unobserved subgroups (i.e., latent classes) based upon one or more manifest variables. Read More

Source
DOI Listing: http://dx.doi.org/10.1214/17-AOAS1044
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5687521
September 2017

TESTING HIGH-DIMENSIONAL COVARIANCE MATRICES, WITH APPLICATION TO DETECTING SCHIZOPHRENIA RISK GENES.

Ann Appl Stat 2017 Sep 5;11(3):1810-1831. Epub 2017 Oct 5.

Department of Statistics, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA.

Scientists routinely compare gene expression levels in cases versus controls, in part to determine genes associated with a disease. Similarly, detecting case-control differences in co-expression among genes can be critical to understanding complex human diseases; however, statistical methods have been limited by the high-dimensional nature of this problem. In this paper, we construct a sparse-Leading-Eigenvalue-Driven (sLED) test for comparing two high-dimensional covariance matrices. Read More

Source
DOI Listing: http://dx.doi.org/10.1214/17-AOAS1062
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5655846
September 2017

DYNAMIC PREDICTION FOR MULTIPLE REPEATED MEASURES AND EVENT TIME DATA: AN APPLICATION TO PARKINSON'S DISEASE.

Ann Appl Stat 2017 Sep 5;11(3):1787-1809. Epub 2017 Oct 5.

University of Texas MD Anderson Cancer Center.

In many clinical trials studying neurodegenerative diseases such as Parkinson's disease (PD), multiple longitudinal outcomes are collected to fully explore the multidimensional impairment caused by this disease. If the outcomes deteriorate rapidly, patients may reach a level of functional disability sufficient to initiate levodopa therapy for ameliorating disease symptoms. An accurate prediction of the time to functional disability is helpful for clinicians to monitor patients' disease progression and make informative medical decisions. Read More

Source
DOI Listing: http://dx.doi.org/10.1214/17-AOAS1059
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5656296
September 2017

DISCUSSION OF "FIBER DIRECTION ESTIMATION IN DIFFUSION MRI".

Authors:
Jian Kang Lexin Li

Ann Appl Stat 2016 Sep 28;10(3):1162-1165. Epub 2016 Sep 28.

University of Michigan and University of California, Berkeley.

Source
DOI Listing: http://dx.doi.org/10.1214/16-AOAS937
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5646375
September 2016

ALLELE-SPECIFIC COPY NUMBER ESTIMATION BY WHOLE EXOME SEQUENCING.

Ann Appl Stat 2017 Jun 20;11(2):1169-1192. Epub 2017 Jul 20.

University of Pennsylvania.

Whole exome sequencing is currently a technology of choice in large-scale cancer genomics studies, where the priority is to identify cancer-associated variants in coding regions. We describe a method for estimating allele-specific copy number using whole exome sequencing data from tumor and matched normal. Read More

Source
DOI Listing: http://dx.doi.org/10.1214/17-AOAS1043
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5627665
June 2017