249 results match your criteria Annals Of Applied Statistics[Journal]


TREE-BASED REINFORCEMENT LEARNING FOR ESTIMATING OPTIMAL DYNAMIC TREATMENT REGIMES.

Ann Appl Stat 2018 Sep 11;12(3):1914-1938. Epub 2018 Sep 11.

Institute for Social Research University of Michigan Ann Arbor, Michigan 48104 USA.

Dynamic treatment regimes (DTRs) are sequences of treatment decision rules, in which treatment may be adapted over time in response to the changing course of an individual. Motivated by the substance use disorder (SUD) study, we propose a tree-based reinforcement learning (T-RL) method to directly estimate optimal DTRs in a multi-stage multi-treatment setting. At each stage, T-RL builds an unsupervised decision tree that directly handles the problem of optimization with multiple treatment comparisons, through a purity measure constructed with augmented inverse probability weighted estimators.

Publisher site: https://projecteuclid.org/euclid.aoas/1536652980
DOI: http://dx.doi.org/10.1214/18-AOAS1137
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6457899
September 2018

Modeling Hybrid Traits for Comorbidity and Genetic Studies of Alcohol and Nicotine Co-Dependence.

Ann Appl Stat 2018 Dec 13;12(4):2359-2378. Epub 2018 Nov 13.

Heping Zhang is Susan Dwight Bliss Professor Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06520; Dungang Liu is Assistant Professor Department of Operations, Business Analytics and Information Systems, University of Cincinnati Lindner College of Business, Cincinnati, OH 45221; Jiwei Zhao is Assistant Professor Department of Biostatistics, State University of New York at Buffalo, Buffalo, NY 14214; and Xuan Bi is Postdoctoral Associate, Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06520.

We propose a novel multivariate model for analyzing hybrid traits and identifying genetic factors for comorbid conditions. Comorbidity is a common phenomenon in mental health in which an individual suffers from multiple disorders simultaneously. For example, in the Study of Addiction: Genetics and Environment (SAGE), alcohol and nicotine addiction were recorded through multiple assessments that we refer to as hybrid traits.

DOI: http://dx.doi.org/10.1214/18-AOAS1156
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6338437
December 2018

EXACT SPIKE TRAIN INFERENCE VIA ℓ0 OPTIMIZATION.

Ann Appl Stat 2018 Dec 13;12(4):2457-2482. Epub 2018 Nov 13.

Departments of Statistics and Biostatistics, University of Washington, Seattle, Washington 98195, USA.

In recent years new technologies in neuroscience have made it possible to measure the activities of large numbers of neurons simultaneously in behaving animals. For each neuron a fluorescence trace is measured; this can be seen as a first-order approximation of the neuron's activity over time. Determining the exact time at which a neuron spikes on the basis of its fluorescence trace is an important open problem in the field of computational neuroscience.

DOI: http://dx.doi.org/10.1214/18-AOAS1162
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6322847
December 2018

ESTIMATING AND COMPARING CANCER PROGRESSION RISKS UNDER VARYING SURVEILLANCE PROTOCOLS.

Ann Appl Stat 2018 Sep 11;12(3):1773-1795. Epub 2018 Sep 11.

Fred Hutchinson Cancer Research Center.

Outcomes after cancer diagnosis and treatment are often observed at discrete times via doctor-patient encounters or specialized diagnostic examinations. Despite their ubiquity as endpoints in cancer studies, such outcomes pose challenges for analysis. In particular, comparisons between studies or patient populations with different surveillance schema may be confounded by differences in visit frequencies.

DOI: http://dx.doi.org/10.1214/17-AOAS1130
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6322848
September 2018

SCALPEL: EXTRACTING NEURONS FROM CALCIUM IMAGING DATA.

Ann Appl Stat 2018 Dec 13;12(4):2430-2456. Epub 2018 Nov 13.

Department of Biostatistics, University of Washington, Seattle, Washington 98195, USA; Departments of Biostatistics and Statistics, University of Washington, Seattle, Washington 98195, USA.

In the past few years, new technologies in the field of neuroscience have made it possible to simultaneously image activity in large populations of neurons at cellular resolution in behaving animals. In mid-2016, a huge repository of this so-called "calcium imaging" data was made publicly available. The availability of this large-scale data resource opens the door to a host of scientific questions for which new statistical methods must be developed.

Publisher site: https://projecteuclid.org/euclid.aoas/1542078051
DOI: http://dx.doi.org/10.1214/18-AOAS1159
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6269150
December 2018

The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments.

Ann Appl Stat 2018 Dec 13;12(4):2075-2095. Epub 2018 Nov 13.

Department of Cell Biology, Harvard Medical School, 240 Longwood Ave, Boston, MA, 02115, USA; Department of Biostatistics, University of North Carolina at Chapel Hill, 135 Dauer Drive, 3101 McGavran-Greenberg Hall, CB 7420, Chapel Hill, NC 27599, USA; Department of Biochemistry and Biophysics University of North Carolina at Chapel Hill 120 Mason Farm Rd, Campus Box 7260 Chapel Hill, NC 27599 USA.

An idealized version of a label-free discovery mass spectrometry proteomics experiment would provide absolute abundance measurements for a whole proteome, across varying conditions. Unfortunately, this ideal is not realized. Measurements are made on peptides requiring an inferential step to obtain protein level estimates.

Publisher site: https://projecteuclid.org/euclid.aoas/1542078037
DOI: http://dx.doi.org/10.1214/18-AOAS1144
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6249692
December 2018

TPRM: TENSOR PARTITION REGRESSION MODELS WITH APPLICATIONS IN IMAGING BIOMARKER DETECTION.

Ann Appl Stat 2018 Sep 11;12(3):1422-1450. Epub 2018 Sep 11.

University of North Carolina at Chapel Hill.

Medical imaging studies have collected high dimensional imaging data to identify imaging biomarkers for diagnosis, screening, and prognosis, among many others. These imaging data are often represented in the form of a multi-dimensional array, called a tensor. The aim of this paper is to develop a tensor partition regression modeling (TPRM) framework to establish a relationship between low-dimensional clinical outcomes (e.g. …

Publisher site: https://projecteuclid.org/euclid.aoas/1536652960
DOI: http://dx.doi.org/10.1214/17-AOAS1116
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6221472
September 2018

COMPLEX-VALUED TIME SERIES MODELING FOR IMPROVED ACTIVATION DETECTION IN FMRI STUDIES.

Ann Appl Stat 2018 Sep 11;12(3):1451-1478. Epub 2018 Sep 11.

Marquette University.

A complex-valued data-based model with pth order autoregressive errors and general real/imaginary error covariance structure is proposed as an alternative to the commonly-used magnitude-only data-based autoregressive model for fMRI time series. Likelihood-ratio-test-based activation statistics are derived for both models and compared for experimental and simulated data. For a dataset from a right-hand finger-tapping experiment, the activation map obtained using complex-valued modeling more clearly identifies the primary activation region (left functional central sulcus) than the magnitude-only model.

Publisher site: https://projecteuclid.org/euclid.aoas/1536652961
DOI: http://dx.doi.org/10.1214/17-AOAS1117
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6168091
September 2018

BIOMARKER CHANGE-POINT ESTIMATION WITH RIGHT CENSORING IN LONGITUDINAL STUDIES.

Ann Appl Stat 2017 Sep;11(3):1738-1762

Center for Imaging Science, Johns Hopkins University, 3400 N. Charles St. Baltimore, Maryland 21218 USA.

We consider in this paper a statistical two-phase regression model in which the change point of a disease biomarker is measured relative to another point in time, such as the manifestation of the disease, which is subject to right-censoring (i.e., possibly unobserved over the entire course of the study).

Publisher site: https://projecteuclid.org/euclid.aoas/1507168846
DOI: http://dx.doi.org/10.1214/17-AOAS1056
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6157754
September 2017

KERNEL-PENALIZED REGRESSION FOR ANALYSIS OF MICROBIOME DATA.

Ann Appl Stat 2018 Mar 9;12(1):540-566. Epub 2018 Mar 9.

University of Washington.

The analysis of human microbiome data is often based on dimension-reduced graphical displays and clusterings derived from vectors of microbial abundances in each sample. Common to these ordination methods is the use of biologically motivated definitions of similarity. Principal coordinate analysis, in particular, is often performed using ecologically defined distances, allowing analyses to incorporate context-dependent, non-Euclidean structure.

Publisher site: https://projecteuclid.org/euclid.aoas/1520564483
DOI: http://dx.doi.org/10.1214/17-AOAS1102
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6138053
March 2018

Topological Data Analysis of Single-Trial Electroencephalographic Signals.

Ann Appl Stat 2018 Sep 11;12(3):1506-1534. Epub 2018 Sep 11.

Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53705, U.S.A.

Epilepsy is a neurological disorder that can negatively affect the visual, audial and motor functions of the human brain. Statistical analysis of neurophysiological recordings, such as electroencephalogram (EEG), facilitates the understanding and diagnosis of epileptic seizures. Standard statistical methods, however, do not account for topological features embedded in EEG signals.

DOI: http://dx.doi.org/10.1214/17-AOAS1119
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6135261
September 2018

ADAPTIVE-WEIGHT BURDEN TEST FOR ASSOCIATIONS BETWEEN QUANTITATIVE TRAITS AND GENOTYPE DATA WITH COMPLEX CORRELATIONS.

Ann Appl Stat 2018 Sep 11;12(3):1558-1582. Epub 2018 Sep 11.

Department of Biostatistics, Virginia Commonwealth University, Richmond, VA 23298, USA.

High-throughput sequencing has often been used to screen samples from pedigrees or with population structure, producing genotype data with complex correlations rendered from both familial relation and linkage disequilibrium. With such data, it is critical to account for these genotypic correlations when assessing the contribution of variants by gene or pathway. Recognizing the limitations of existing association testing methods, we propose the adaptive-weight burden test (ABT), a retrospective, mixed-model test for genetic association of quantitative traits on genotype data with complex correlations.

DOI: http://dx.doi.org/10.1214/17-AOAS1121
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6133321
September 2018

A UNIFIED STATISTICAL FRAMEWORK FOR SINGLE CELL AND BULK RNA SEQUENCING DATA.

Ann Appl Stat 2018 Mar 9;12(1):609-632. Epub 2018 Mar 9.

Carnegie Mellon University.

Recent advances in technology have enabled the measurement of RNA levels for individual cells. Compared to traditional tissue-level bulk RNA-seq data, single cell sequencing yields valuable insights about gene expression profiles for different cell types, which is potentially critical for understanding many complex human diseases. However, developing quantitative tools for such data remains challenging because of high levels of technical noise, especially the "dropout" events.

DOI: http://dx.doi.org/10.1214/17-AOAS1110
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6114100
March 2018

TOWARD BAYESIAN INFERENCE OF THE SPATIAL DISTRIBUTION OF PROTEINS FROM THREE-CUBE FÖRSTER RESONANCE ENERGY TRANSFER DATA.

Ann Appl Stat 2017 Sep 5;11(3):1711-1737. Epub 2017 Oct 5.

Aalborg University.

Förster resonance energy transfer (FRET) is a quantum-physical phenomenon where energy may be transferred from one molecule to a neighbor molecule if the molecules are close enough. Using fluorophore molecule marking of proteins in a cell, it is possible to measure in microscopic images to what extent FRET takes place between the fluorophores. This provides indirect information of the spatial distribution of the proteins.

DOI: http://dx.doi.org/10.1214/17-AOAS1054
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5982602
September 2017

POWERFUL TEST BASED ON CONDITIONAL EFFECTS FOR GENOME-WIDE SCREENING.

Authors:
Yaowu Liu Jun Xie

Ann Appl Stat 2018 Mar 9;12(1):567-585. Epub 2018 Mar 9.

Department of Statistics, Purdue University, 250 N. University Street, West Lafayette, Indiana 47907, USA.

This paper considers testing procedures for screening large genome-wide data, where we examine hundreds of thousands of genetic variants, e.g., single nucleotide polymorphisms (SNP), on a quantitative phenotype.

Publisher site: https://projecteuclid.org/euclid.aoas/1520564484
DOI: http://dx.doi.org/10.1214/17-AOAS1103
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5931742
March 2018

MSIQ: JOINT MODELING OF MULTIPLE RNA-SEQ SAMPLES FOR ACCURATE ISOFORM QUANTIFICATION.

Ann Appl Stat 2018 Mar 9;12(1):510-539. Epub 2018 Mar 9.

University of California, Los Angeles.

Next-generation RNA sequencing (RNA-seq) technology has been widely used to assess full-length RNA isoform abundance in a high-throughput manner. RNA-seq data offer insight into gene expression levels and transcriptome structures, enabling us to better understand the regulation of gene expression and fundamental biological processes. Accurate isoform quantification from RNA-seq data is challenging due to the information loss in sequencing experiments.

DOI: http://dx.doi.org/10.1214/17-AOAS1100
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5935499
March 2018

LATENT SPACE MODELS FOR MULTIVIEW NETWORK DATA.

Ann Appl Stat 2017 Sep 5;11(3):1217-1244. Epub 2017 Oct 5.

Department of Statistics, Department of Sociology, University of Washington, Box 354322 Seattle, Washington 98195-4322 USA, URL: http://www.stat.washington.edu/~tylermc/.

Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g. …

DOI: http://dx.doi.org/10.1214/16-AOAS955
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5927604
September 2017

DESIGN OF VACCINE TRIALS DURING OUTBREAKS WITH AND WITHOUT A DELAYED VACCINATION COMPARATOR.

Ann Appl Stat 2018 Mar 9;12(1):330-347. Epub 2018 Mar 9.

University of Florida.

Conducting vaccine efficacy trials during outbreaks of emerging pathogens poses particular challenges. The "Ebola ça suffit" trial in Guinea used a novel ring vaccination cluster randomized design to target populations at highest risk of infection. Another key feature of the trial was the use of a delayed vaccination arm as a comparator, in which clusters were randomized to immediate vaccination or vaccination 21 days later.

DOI: http://dx.doi.org/10.1214/17-AOAS1095
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5878056

A UNIFIED FRAMEWORK FOR VARIANCE COMPONENT ESTIMATION WITH SUMMARY STATISTICS IN GENOME-WIDE ASSOCIATION STUDIES.

Authors:
Xiang Zhou

Ann Appl Stat 2017 Dec 28;11(4):2027-2051. Epub 2017 Dec 28.

University of Michigan.

Linear mixed models (LMMs) are among the most commonly used tools for genetic association studies. However, the standard method for estimating variance components in LMMs, the restricted maximum likelihood estimation method (REML), suffers from several important drawbacks: REML requires individual-level genotypes and phenotypes from all samples in the study, is computationally slow, and produces downward-biased estimates in case control studies. To remedy these drawbacks, we present an alternative framework for variance component estimation, which we refer to as MQS.

DOI: http://dx.doi.org/10.1214/17-AOAS1052
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5836736
December 2017

A NOVEL AND EFFICIENT ALGORITHM FOR DE NOVO DISCOVERY OF MUTATED DRIVER PATHWAYS IN CANCER.

Ann Appl Stat 2017 Sep 5;11(3):1481-1512. Epub 2017 Oct 5.

University of Minnesota.

Next-generation sequencing studies on cancer somatic mutations have discovered that driver mutations tend to appear in most tumor samples, but they barely overlap in any single tumor sample, presumably because a single driver mutation can perturb the whole pathway. Based on the corresponding new concepts of coverage and mutual exclusivity, new methods can be designed for de novo discovery of mutated driver pathways in cancer. Since the computational problem is a combinatorial optimization with an objective function involving a discontinuous indicator function in high dimension, many existing optimization algorithms, such as a brute force enumeration, gradient descent and Newton's methods, are practically infeasible or directly inapplicable.

Publisher site: https://projecteuclid.org/euclid.aoas/1507168837
DOI: http://dx.doi.org/10.1214/17-AOAS1042
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5823541
September 2017

BAYESIAN LARGE-SCALE MULTIPLE REGRESSION WITH SUMMARY STATISTICS FROM GENOME-WIDE ASSOCIATION STUDIES.

Ann Appl Stat 2017 Sep 5;11(3):1561-1592. Epub 2017 Oct 5.

University of Chicago.

Bayesian methods for large-scale multiple regression provide attractive approaches to the analysis of genome-wide association studies (GWAS). For example, they can estimate heritability of complex traits, allowing for both polygenic and sparse models; and by incorporating external genomic data into the priors, they can increase power and yield new biological insights. However, these methods require access to individual genotypes and phenotypes, which are often not easily available.

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5796536
October 2017

ROBUST MIXED EFFECTS MODEL FOR CLUSTERED FAILURE TIME DATA: APPLICATION TO HUNTINGTON'S DISEASE EVENT MEASURES.

Ann Appl Stat 2017 Jun 20;11(2):1085-1116. Epub 2017 Jul 20.

Columbia University.

An important goal in clinical and statistical research is properly modeling the distribution for clustered failure times which have a natural intraclass dependency and are subject to censoring. We handle these challenges with a novel approach that does not impose restrictive modeling or distributional assumptions. Using a logit transformation, we relate the distribution for clustered failure times to covariates and a random, subject-specific effect.

DOI: http://dx.doi.org/10.1214/17-AOAS1038
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5793916

DOUBLY ROBUST ESTIMATION OF OPTIMAL TREATMENT REGIMES FOR SURVIVAL DATA-WITH APPLICATION TO AN HIV/AIDS STUDY.

Ann Appl Stat 2017 Sep 5;11(3):1763-1786. Epub 2017 Oct 5.

School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.

In many biomedical settings, assigning every patient the same treatment may not be optimal due to patient heterogeneity. Individualized treatment regimes have the potential to dramatically improve clinical outcomes. When the primary outcome is censored survival time, a main interest is to find optimal treatment regimes that maximize the survival probability of patients.

DOI: http://dx.doi.org/10.1214/17-AOAS1057
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5749433
September 2017

INFERENCE FOR SOCIAL NETWORK MODELS FROM EGOCENTRICALLY SAMPLED DATA, WITH APPLICATION TO UNDERSTANDING PERSISTENT RACIAL DISPARITIES IN HIV PREVALENCE IN THE US.

Ann Appl Stat 2017 Mar 8;11(1):427-455. Epub 2017 Apr 8.

Egocentric network sampling observes the network of interest from the point of view of a set of sampled actors, who provide information about themselves and anonymized information on their network neighbors. In survey research, this is often the most practical, and sometimes the only, way to observe certain classes of networks, with the sexual networks that underlie HIV transmission being the archetypal case. Although methods exist for recovering some descriptive network features, there is no rigorous and practical statistical foundation for estimation and inference for network models from such data.

DOI: http://dx.doi.org/10.1214/16-AOAS1010
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5737754
March 2017

Quantification of Multiple Tumor Clones Using Gene Array and Sequencing Data.

Ann Appl Stat 2017 Jun 20;11(2):967-991. Epub 2017 Jul 20.

Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.

Cancer development is driven by genomic alterations, including copy number aberrations. The detection of copy number aberrations in tumor cells is often complicated by possible contamination of normal stromal cells in tumor samples and intratumor heterogeneity, namely the presence of multiple clones of tumor cells. In order to correctly quantify copy number aberrations, it is critical to successfully de-convolute the complex structure of the genetic information from tumor samples.

DOI: http://dx.doi.org/10.1214/17-AOAS1026
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5728449
June 2017

Spatial Multiresolution Analysis of the Effect of PM2.5 on Birth Weights.

Ann Appl Stat 2017 Jun 20;11(2):792-807. Epub 2017 Jul 20.

Harvard Chan School of Public Health.

Fine particulate matter (PM2.5) measured at a given location is a mix of pollution generated locally and pollution traveling long distances in the atmosphere. Therefore, the identification of spatial scales associated with health effects can inform on pollution sources responsible for these effects, resulting in more targeted regulatory policy. Recently, prediction methods that yield high-resolution spatial estimates of PM2.5 exposures allow one to evaluate such scale-specific associations.

DOI: http://dx.doi.org/10.1214/16-AOAS1018
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5716638
July 2017

LATENT CLASS MODELING USING MATRIX COVARIATES WITH APPLICATION TO IDENTIFYING EARLY PLACEBO RESPONDERS BASED ON EEG SIGNALS.

Ann Appl Stat 2017 Sep 5;11(3):1513-1536. Epub 2017 Oct 5.

Columbia University.

Latent class models are widely used to identify unobserved subgroups (i.e., latent classes) based upon one or more manifest variables.

DOI: http://dx.doi.org/10.1214/17-AOAS1044
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5687521
September 2017

TESTING HIGH-DIMENSIONAL COVARIANCE MATRICES, WITH APPLICATION TO DETECTING SCHIZOPHRENIA RISK GENES.

Ann Appl Stat 2017 Sep 5;11(3):1810-1831. Epub 2017 Oct 5.

Department of Statistics, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA.

Scientists routinely compare gene expression levels in cases versus controls in part to determine genes associated with a disease. Similarly, detecting case-control differences in co-expression among genes can be critical to understanding complex human diseases; however statistical methods have been limited by the high dimensional nature of this problem. In this paper, we construct a sparse-Leading-Eigenvalue-Driven (sLED) test for comparing two high-dimensional covariance matrices.

DOI: http://dx.doi.org/10.1214/17-AOAS1062
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5655846
September 2017

DYNAMIC PREDICTION FOR MULTIPLE REPEATED MEASURES AND EVENT TIME DATA: AN APPLICATION TO PARKINSON'S DISEASE.

Ann Appl Stat 2017 Sep 5;11(3):1787-1809. Epub 2017 Oct 5.

University of Texas MD Anderson Cancer Center.

In many clinical trials studying neurodegenerative diseases such as Parkinson's disease (PD), multiple longitudinal outcomes are collected to fully explore the multidimensional impairment caused by this disease. If the outcomes deteriorate rapidly, patients may reach a level of functional disability sufficient to initiate levodopa therapy for ameliorating disease symptoms. An accurate prediction of the time to functional disability is helpful for clinicians to monitor patients' disease progression and make informative medical decisions.

DOI: http://dx.doi.org/10.1214/17-AOAS1059
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5656296
September 2017

DISCUSSION OF "FIBER DIRECTION ESTIMATION IN DIFFUSION MRI".

Authors:
Jian Kang Lexin Li

Ann Appl Stat 2016 Sep 28;10(3):1162-1165. Epub 2016 Sep 28.

University of Michigan and University of California, Berkeley.

DOI: http://dx.doi.org/10.1214/16-AOAS937
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5646375
September 2016

ALLELE-SPECIFIC COPY NUMBER ESTIMATION BY WHOLE EXOME SEQUENCING.

Ann Appl Stat 2017 Jun 20;11(2):1169-1192. Epub 2017 Jul 20.

University of Pennsylvania.

Whole exome sequencing is currently a technology of choice in large-scale cancer genomics studies, where the priority is to identify cancer-associated variants in coding regions. We describe a method for estimating allele-specific copy number using whole exome sequencing data from tumor and matched normal.

DOI: http://dx.doi.org/10.1214/17-AOAS1043
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5627665
June 2017

Forecasting seasonal influenza with a state-space SIR model.

Ann Appl Stat 2017 Mar 8;11(1):202-224. Epub 2017 Apr 8.

Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA.

Seasonal influenza is a serious public health and societal problem due to its consequences resulting from absenteeism, hospitalizations, and deaths. The overall burden of influenza is captured by the Centers for Disease Control and Prevention's influenza-like illness network, which provides invaluable information about the current incidence. This information is used to provide decision support regarding prevention and response efforts.

DOI: http://dx.doi.org/10.1214/16-AOAS1000
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5623938
March 2017

Integrative Sparse K-Means With Overlapping Group Lasso in Genomic Applications for Disease Subtype Discovery.

Ann Appl Stat 2017 Jun 20;11(2):1011-1039. Epub 2017 Jul 20.

Department of Biostatistics, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA.

Cancer subtypes discovery is the first step to deliver personalized medicine to cancer patients. With the accumulation of massive multi-level omics datasets and established biological knowledge databases, omics data integration with incorporation of rich existing biological knowledge is essential for deciphering a biological mechanism behind the complex diseases. In this manuscript, we propose an integrative sparse K-means (is-K means) approach to discover disease subtypes with the guidance of prior biological knowledge via sparse overlapping group lasso.

DOI: http://dx.doi.org/10.1214/17-AOAS1033
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5613668
June 2017

IMPROVING EFFICIENCY IN BIOMARKER INCREMENTAL VALUE EVALUATION UNDER TWO-PHASE DESIGNS.

Ann Appl Stat 2017 Jun 20;11(2):638-654. Epub 2017 Jul 20.

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115.

Cost-effective yet efficient designs are critical to the success of biomarker evaluation research. Two-phase sampling designs, under which expensive markers are only measured on a subsample of cases and non-cases within a prospective cohort, are useful in novel biomarker studies for preserving study samples and minimizing cost of biomarker assaying. Statistical methods for quantifying the predictiveness of biomarkers under two-phase studies have been proposed (Cai and Zheng, 2012; Liu, Cai and Zheng, 2012).

DOI: http://dx.doi.org/10.1214/16-AOAS997
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5604898
June 2017

REJOINDER: "FIBER DIRECTION ESTIMATION, SMOOTHING AND TRACKING IN DIFFUSION MRI".

Ann Appl Stat 2016 Sep 28;10(3):1166-1169. Epub 2016 Sep 28.

University of California, Davis.

DOI: http://dx.doi.org/10.1214/16-AOAS880R
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5553553
September 2016

FIBER DIRECTION ESTIMATION, SMOOTHING AND TRACKING IN DIFFUSION MRI.

Ann Appl Stat 2016 Sep 28;10(3):1137-1156. Epub 2016 Sep 28.

Department of Statistics, University of California, Davis, 4118 Mathematical Sciences Building, One Shields Avenue, Davis, California 95616, USA.

Diffusion magnetic resonance imaging is an imaging technology designed to probe anatomical architectures of biological samples in an in vivo and noninvasive manner through measuring water diffusion. The contribution of this paper is threefold. First, it proposes a new method to identify and estimate multiple diffusion directions within a voxel through a new and identifiable parametrization of the widely used multi-tensor model.

DOI: http://dx.doi.org/10.1214/15-AOAS880
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5476320
September 2016

Quantifying the Spatial Inequality and Temporal Trends in Maternal Smoking Rates in Glasgow.

Ann Appl Stat 2016 Sep;10(3):1427-1446

Division of Biostatistics and Bioinformatics, Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC, USA, 29401-8350.

Maternal smoking is well known to adversely affect birth outcomes, and there is considerable spatial variation in the rates of maternal smoking in the city of Glasgow, Scotland. This spatial variation is a partial driver of health inequalities between rich and poor communities, and it is of interest to determine the extent to which these inequalities have changed over time. Therefore in this paper we develop a Bayesian hierarchical model for estimating the spatio-temporal pattern in smoking incidence across Glasgow between 2000 and 2013, which can identify the changing geographical extent of clusters of areas exhibiting elevated maternal smoking incidences that partially drive health inequalities. Read More

Source
http://dx.doi.org/10.1214/16-AOAS941
DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5449583
PMC
September 2016

COVARIATE-ADAPTIVE CLUSTERING OF EXPOSURES FOR AIR POLLUTION EPIDEMIOLOGY COHORTS.

Ann Appl Stat 2017 Mar 8;11(1):93-113. Epub 2017 Apr 8.

Department of Biostatistics, University of Washington, Box 357232, Health Sciences Building, F-600 1705 NE Pacific Street Seattle, WA 98195.

Cohort studies in air pollution epidemiology aim to establish associations between health outcomes and air pollution exposures. Statistical analysis of such associations is complicated by the multivariate nature of the pollutant exposure data as well as the spatial misalignment that arises from the fact that exposure data are collected at regulatory monitoring network locations distinct from cohort locations. We present a novel clustering approach for addressing this challenge.

Source
http://dx.doi.org/10.1214/16-AOAS992
DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5448716
PMC
March 2017
13 Reads

Gene Network Reconstruction using Global-Local Shrinkage Priors.

Ann Appl Stat 2017 Mar;11(1):41-68

Department of Mathematics, Vrije Universiteit Amsterdam, De Boelelaan 1081, 1081 HV Amsterdam, The Netherlands.

Reconstructing a gene network from high-throughput molecular data is an important but challenging task, as the number of parameters to estimate easily is much larger than the sample size. A conventional remedy is to regularize or penalize the model likelihood. In network models, this is often done in the neighbourhood of each node or gene.

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5388190
PMC
March 2017
2 Reads

INVESTIGATING DIFFERENCES IN BRAIN FUNCTIONAL NETWORKS USING HIERARCHICAL COVARIATE-ADJUSTED INDEPENDENT COMPONENT ANALYSIS.

Authors:
Ran Shi, Ying Guo

Ann Appl Stat 2016 Dec 5;10(4):1930-1957. Epub 2017 Jan 5.

Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Rd., Atlanta, Georgia 30322 USA.

Human brains perform tasks via complex functional networks consisting of separated brain regions. A popular approach to characterize brain functional networks in fMRI studies is independent component analysis (ICA), which is a powerful method to reconstruct latent source signals from their linear mixtures. In many fMRI studies, an important goal is to investigate how brain functional networks change according to specific clinical and demographic variabilities.

Source
http://dx.doi.org/10.1214/16-AOAS946
DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5375118
PMC
December 2016
5 Reads

ROBUST HYPERPARAMETER ESTIMATION PROTECTS AGAINST HYPERVARIABLE GENES AND IMPROVES POWER TO DETECT DIFFERENTIAL EXPRESSION.

Ann Appl Stat 2016 Jun 22;10(2):946-963. Epub 2016 Jul 22.

The Walter and Eliza Hall Institute of Medical Research; The University of Melbourne.

One of the most common analysis tasks in genomic research is to identify genes that are differentially expressed (DE) between experimental conditions. Empirical Bayes (EB) statistical tests using moderated genewise variances have been very effective for this purpose, especially when the number of biological replicate samples is small. The EB procedures can however be heavily influenced by a small number of genes with very large or very small variances.

Source
http://projecteuclid.org/euclid.aoas/1469199900
Publisher Site
http://dx.doi.org/10.1214/16-AOAS920
DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5373812
PMC
June 2016
2 Reads

LINKING LUNG AIRWAY STRUCTURE TO PULMONARY FUNCTION VIA COMPOSITE BRIDGE REGRESSION.

Ann Appl Stat 2016 Dec 5;10(4):1880-1906. Epub 2017 Jan 5.

University of Iowa.

The human lung airway is a complex inverted tree-like structure. Detailed airway measurements can be extracted from MDCT-scanned lung images, such as segmental wall thickness, airway diameter, parent-child branch angles, etc. The wealth of lung airway data provides a unique opportunity for advancing our understanding of the fundamental structure-function relationships within the lung.

Source
http://dx.doi.org/10.1214/16-AOAS947
DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5340208
PMC
December 2016
8 Reads

A STATISTICAL FRAMEWORK FOR DATA INTEGRATION THROUGH GRAPHICAL MODELS WITH APPLICATION TO CANCER GENOMICS.

Ann Appl Stat 2017 Mar 8;11(1):161-184. Epub 2017 Apr 8.

Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06510, USA.

Recent advances in high-throughput biotechnologies have generated various types of genetic, genomic, epigenetic, transcriptomic and proteomic data across different biological conditions. It is likely that integrating data from diverse experiments may lead to a more unified and global view of biological systems and complex diseases. We present a coherent statistical framework for integrating various types of data from distinct but related biological conditions through graphical models.

Source
http://dx.doi.org/10.1214/16-AOAS998
DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6447291
PMC

STATIC AND ROVING SENSOR DATA FUSION FOR SPATIO-TEMPORAL HAZARD MAPPING WITH APPLICATION TO OCCUPATIONAL EXPOSURE ASSESSMENT.

Ann Appl Stat 2017 Mar 8;11(1):139-160. Epub 2017 Apr 8.

Rapid technological advances have drastically improved the data collection capacity in occupational exposure assessment. However, advanced statistical methods for analyzing such data and drawing proper inference remain limited. The objectives of this paper are (1) to provide new spatio-temporal methodology that combines data from both roving and static sensors for data processing and hazard mapping across space and over time in an indoor environment, and (2) to compare the new method with the current industry practice, demonstrating the distinct advantages of the new method and the impact on occupational hazard assessment and future policy making in environmental health as well as occupational health.

Source
http://dx.doi.org/10.1214/16-AOAS995
DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6086369
PMC

A MIXED-EFFECTS MODEL FOR INCOMPLETE DATA FROM LABELING-BASED QUANTITATIVE PROTEOMICS EXPERIMENTS.

Ann Appl Stat 2017 Mar 8;11(1):114-138. Epub 2017 Apr 8.

Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1470 Madison Ave, S8-102, New York, New York, USA.

In mass spectrometry (MS) based quantitative proteomics research, the emerging iTRAQ (isobaric tag for relative and absolute quantitation) and TMT (tandem mass tags) techniques have been widely adopted for high throughput protein profiling. In a typical iTRAQ/TMT proteomics study, samples are grouped into batches, and each batch is processed by one multiplex experiment, in which the abundances of thousands of proteins/peptides in a batch of samples can be measured simultaneously. The multiplex labeling technique greatly enhances the throughput of protein quantification.

Source
http://dx.doi.org/10.1214/16-AOAS994
DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5937554
PMC
March 2017
1 Read
1 Citation
1.464 Impact Factor

THE SCREENING AND RANKING ALGORITHM FOR CHANGE-POINTS DETECTION IN MULTIPLE SAMPLES.

Ann Appl Stat 2016 Dec 5;10(4):2102-2129. Epub 2017 Jan 5.

Yale University.

The chromosome copy number variation (CNV) is the deviation of genomic regions from their normal copy number states, which may associate with many human diseases. Current genetic studies usually collect hundreds to thousands of samples to study the association between CNV and diseases. CNVs can be called by detecting the change-points in mean for sequences of array-based intensity measurements.

Source
http://dx.doi.org/10.1214/16-AOAS966
DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5233178
PMC
December 2016
67 Reads

A BAYESIAN HIERARCHICAL SPATIAL MODEL FOR DENTAL CARIES ASSESSMENT USING NON-GAUSSIAN MARKOV RANDOM FIELDS.

Ann Appl Stat 2016 22;10(2):884-905. Epub 2016 Jul 22.

Virginia Commonwealth University.

Research in dental caries generates data with two levels of hierarchy: that of a tooth overall and that of the different surfaces of the tooth. The outcomes often exhibit spatial referencing among neighboring teeth and surfaces, i.e.

Source
http://dx.doi.org/10.1214/16-AOAS917
DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5087817
PMC
July 2016
7 Reads

PREDICTIVE MODELING OF CHOLERA OUTBREAKS IN BANGLADESH.

Ann Appl Stat 2016 Jun 22;10(2):575-595. Epub 2016 Jul 22.

University of Washington.

Despite seasonal cholera outbreaks in Bangladesh, little is known about the relationship between environmental conditions and cholera cases. We seek to develop a predictive model for cholera outbreaks in Bangladesh based on environmental predictors. To do this, we estimate the contribution of environmental variables, such as water depth and water temperature, to cholera outbreaks in the context of a disease transmission model.

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5061460
PMC
http://dx.doi.org/10.1214/16-AOAS908
DOI Listing
June 2016
7 Reads

Persistent Homology Analysis of Brain Artery Trees.

Ann Appl Stat 2016;10(1):198-218. Epub 2016 Mar 25.

Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06510, USA.

New representations of tree-structured data objects, using ideas from topological data analysis, enable improved statistical analyses of a population of brain artery trees. A number of representations of each data tree arise from persistence diagrams that quantify branching and looping of vessels at multiple scales. Novel approaches to the statistical analysis, through various summaries of the persistence diagrams, lead to heightened correlations with covariates such as age and sex, relative to earlier analyses of this data set.

Source
http://dx.doi.org/10.1214/15-AOAS886
DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5026243
PMC
March 2016
1 Read

FEATURE SCREENING FOR TIME-VARYING COEFFICIENT MODELS WITH ULTRAHIGH DIMENSIONAL LONGITUDINAL DATA.

Ann Appl Stat 2016 Jun 22;10(2):596-617. Epub 2016 Jul 22.

Department of Statistics, Pennsylvania State University, State College, PA, 16801, USA.

Motivated by an empirical analysis of the Childhood Asthma Management Project, CAMP, we introduce a new screening procedure for varying coefficient models with ultrahigh dimensional longitudinal predictor variables. The performance of the proposed procedure is investigated via Monte Carlo simulation. Numerical comparisons indicate that it outperforms existing ones substantially, resulting in significant improvements in explained variability and prediction error.

Source
http://dx.doi.org/10.1214/16-AOAS912
DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5019497
PMC
June 2016
1 Read