Latest in stat.me

Audits as Evidence: Experiments, Ensembles, and EnforcementJul 15 2019We develop tools for utilizing correspondence experiments to detect illegal discrimination by individual employers. Employers violate US employment law if their propensity to contact applicants depends on protected characteristics such as race or sex. ... More
Bayesian Wavelet Shrinkage with Beta PriorsJul 15 2019We present a Bayesian approach for wavelet shrinkage in the context of non-parametric curve estimation with the use of the beta distribution with symmetric support around zero as the prior distribution for the location parameter in the wavelet domain ... More
Posterior Predictive Treatment Assignment Methods for Causal Inference in the Context of Time-Varying TreatmentsJul 15 2019Marginal structural models (MSM) with inverse probability weighting (IPW) are used to estimate causal effects of time-varying treatments, but can result in erratic finite-sample performance when there is low overlap in covariate distributions across different ... More
The Elicitation of Prior Distributions for Bayesian Responsive Survey Design: Historical Data Analysis vs. Literature ReviewJul 15 2019Responsive Survey Design (RSD) aims to increase the efficiency of survey data collection via live monitoring of paradata and the introduction of protocol changes when survey errors and increased costs seem imminent. Unfortunately, RSD lacks a unifying ... More
Fast Algorithms and Theory for High-Dimensional Bayesian Varying Coefficient ModelsJul 15 2019Nonparametric varying coefficient (NVC) models are widely used for modeling time-varying effects on responses that are measured repeatedly. In this paper, we introduce the nonparametric varying coefficient spike-and-slab lasso (NVC-SSL) for Bayesian estimation ... More
Shadow Simulated Annealing algorithm: a new tool for global optimisation and statistical inferenceJul 15 2019This paper develops a new global optimisation method that applies to a family of criteria that are not entirely known. This family includes the criteria obtained from the class of posteriors that have normalising constants that are analytically not tractable. ... More
An efficient estimator of the parameters of the Generalized Lambda DistributionJul 15 2019Estimation of the four generalized lambda distribution parameters is not straightforward, and available estimators that perform best have large computation times. In this paper, we introduce a simple two-step estimator of the parameters that is comparatively ... More
Markov-switching State Space Models for Uncovering Musical InterpretationJul 14 2019For concertgoers, musical interpretation is the most important factor in determining whether or not we enjoy a classical performance. Every performance includes mistakes (intonation issues, a lost note, an unpleasant sound), but these are all easily ... More
Leveraging Auxiliary Information on Marginal Distributions in Nonignorable Models for Item and Unit NonresponseJul 13 2019When handling nonresponse, government agencies and survey organizations typically are forced to make strong, and potentially unrealistic, assumptions about the reasons why values are missing. We present a framework that enables users to reduce reliance ... More
An Assumption-Free Exact Test For Fixed-Design Linear Models With Exchangeable ErrorsJul 13 2019We propose the cyclic permutation test (CPT) to test general linear hypotheses for linear models. This test is non-randomized and valid in finite samples with exact type-I error $\alpha$ for arbitrary fixed design matrix and arbitrary exchangeable errors, ... More
Inference for high-dimensional linear mixed-effects models: A quasi-likelihood approachJul 13 2019Linear mixed-effects models are widely used in analyzing clustered or repeated measures data. We propose a quasi-likelihood approach for estimation and inference of the unknown parameters in linear mixed-effects models with high-dimensional fixed effects. ... More
Multilevel models for continuous outcomesJul 12 2019Multilevel models (mixed-effect models or hierarchical linear models) are now a standard approach to analysing clustered and longitudinal data in the social, behavioural and medical sciences. This review article focuses on multilevel linear regression ... More
Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regressionJul 12 2019Motivation: The discovery of relationships between gene expression measurements and phenotypic responses is hampered by both computational and statistical impediments. Conventional statistical methods are less than ideal because they either fail to select ... More
Estimating densities with nonlinear support using Fisher-Gaussian kernelsJul 12 2019Current tools for multivariate density estimation struggle when the density is concentrated near a nonlinear subspace or manifold. Most approaches require choice of a kernel, with the multivariate Gaussian by far the most commonly used. Although heavy-tailed ... More
Kinetics of thermal Mott transitions in the Hubbard modelJul 12 2019We present the first-ever microscopic dynamical simulation of the temperature-controlled Mott metal-insulator transition in the Hubbard model. By combining the efficient Gutzwiller method with molecular dynamics simulations, we demonstrate that the transformation ... More
Personalized Decision Making for Biopsies in Prostate Cancer Active Surveillance ProgramsJul 12 2019Background: Low-risk prostate cancer patients enrolled in active surveillance programs commonly undergo biopsies for examination of cancer progression. Biopsies are conducted as per a fixed and frequent schedule (e.g., annual biopsies). Since biopsies ... More
Can Bayes Factors "Prove" the Null Hypothesis?Jul 12 2019It is possible to obtain a large Bayes Factor (BF) favoring the null hypothesis when both the null and alternative hypotheses have low likelihoods, and there are other hypotheses being ignored that are much more strongly supported by the data. As sample ... More
Model based Level Shift Detection in Autocorrelated Data Streams using a moving windowJul 11 2019Standard Control Chart techniques to detect level shift in data streams assume independence between observations. As data today is collected with high frequency, this assumption is seldom valid. To overcome this, we propose to adapt the off-line test ... More
Change point detection for graphical models in presence of missing valuesJul 11 2019We propose estimation methods for change points in high-dimensional covariance structures with an emphasis on challenging scenarios with missing values. We advocate three imputation like methods and investigate their implications on common losses used ... More
Directing Power Towards Sub-AlternativesJul 11 2019This paper proposes a novel test statistic for testing a potentially high-dimensional parameter vector. To derive the statistic, I generalize the Mahalanobis distance to measure length in a direction of interest. The test statistic is the sample analogue ... More
Bayesian inferences on uncertain ranks and orderingsJul 10 2019It is common to be interested in rankings or order relationships among entities. In complex settings where one does not directly measure a univariate statistic upon which to base ranks, such inferences typically rely on statistical models having entity-specific ... More
Identifying mediating variables with graphical models: an application to the study of causal pathways in people living with HIVJul 10 2019We empirically demonstrate that graphical models can be a valuable tool in the identification of mediating variables in causal pathways. We make use of graphical models to elucidate the causal pathway through which the treatment influences the levels ... More
Quantifying Error in the Presence of Confounders for Causal InferenceJul 10 2019Estimating average causal effect (ACE) is useful whenever we want to know the effect of an intervention on a given outcome. In the absence of a randomized experiment, many methods such as stratification and inverse propensity weighting have been proposed ... More
Approximate Bayesian inference for spatial flood frequency analysisJul 10 2019Extreme floods cause casualties, widespread property damage, and damage to vital civil infrastructure. Prediction of extreme floods within gauged and ungauged catchments is crucial to mitigate these disasters. A Bayesian framework is proposed for predicting ... More
Quantum bath statistics taggingJul 10 2019The possibility of discriminating the statistics of a thermal bath using indirect measurements performed on quantum probes is presented. The scheme relies on the fact that, when weakly coupled with the environment of interest, the transient evolution ... More
Sparse Unit-Sum RegressionJul 10 2019This paper considers sparsity in linear regression under the restriction that the regression weights sum to one. We propose an approach that combines $\ell_0$- and $\ell_1$-regularization. We compute its solution by adapting a recent methodological innovation ... More
Mechanism for a Chemical Potential of Nonequilibrium Magnons in Parametric Parallel PumpingJul 10 2019We demonstrate how a magnon chemical potential is generated in parametric parallel pumping. We study how a time-periodic magnetic field of this pumping affects magnon properties of a ferrimagnet in a nonequilibrium steady state. We show that the magnon ... More
Bayesian Variable Selection for Non-Gaussian Responses: A Marginally Calibrated Copula ApproachJul 10 2019We propose a new highly flexible and tractable Bayesian approach to undertake variable selection in non-Gaussian regression models. It uses a copula decomposition for the vector of observations on the dependent variable. This allows the marginal distribution ... More
Bayesian Inference for Regression CopulasJul 10 2019We propose a new semi-parametric distributional regression smoother for continuous data, which is based on a copula decomposition of the joint distribution of the vector of response values. The copula is high-dimensional and constructed by inversion of ... More
Identifying Linear Models in Multi-Resolution Population Data using Minimum Description Length Principle to Predict Household IncomeJul 10 2019One shirt size cannot fit everybody, yet we cannot make a unique shirt that fits each person perfectly because of resource limitations. The same holds for policy making. Policy makers cannot establish a single policy to solve all problems ... More
Identifying the Influential Inputs for Network Output Variance Using Sparse Polynomial Chaos ExpansionJul 09 2019Sensitivity analysis (SA) is an important aspect of process automation. It often aims to identify the process inputs that influence the process output's variance significantly. Existing SA approaches typically consider the input-output relationship as ... More
Adaptive inference for a semiparametric GARCH modelJul 09 2019This paper considers a semiparametric generalized autoregressive conditional heteroscedastic (S-GARCH) model, which has a smooth long run component with unknown form to depict time-varying parameters, and a GARCH-type short run component to capture the ... More
A Robust Two-Sample Test for Time Series dataJul 09 2019We develop a general framework for hypothesis testing with time series data. The problem is to distinguish between the mean functions of the underlying temporal processes of populations of time series, which are often irregularly sampled and measured ... More
Geometry-controlled Failure Mechanisms of Amorphous Solids on the NanoscaleJul 09 2019Amorphous solids, confined on the nano-scale, exhibit a wealth of novel phenomena yet to be explored. In particular, the response of such solids to a mechanical load is not well understood and, as has been demonstrated experimentally, it differs strongly ... More
A framework for the pre-specification of statistical analysis strategies in clinical trials (Pre-SPEC)Jul 09 2019Bias can be introduced into clinical trials if statistical methods are chosen based on subjective assessment of the trial data. Pre-specification of the planned analysis approach is essential to help reduce such bias. However, many trials fail to adequately ... More
The Integrated nested Laplace approximation for fitting models with multivariate responseJul 09 2019This paper introduces a Laplace approximation to Bayesian inference in regression models for multivariate response variables. We focus on Dirichlet regression models, which can be used to analyze a set of variables on a simplex exhibiting skewness and ... More
Incremental Intervention Effects in Studies with Many Timepoints, Repeated Outcomes, and DropoutJul 09 2019Modern longitudinal studies feature data collected at many timepoints, often of the same order of sample size. Such studies are typically affected by dropout and positivity violations. We tackle these problems by generalizing effects of recent incremental ... More
Residual EntropyJul 08 2019We describe an approach to improving model fitting and model generalization that considers the entropy of distributions of modelling residuals. We use simple simulations to demonstrate the observational signatures of overfitting on ordered sequences of ... More
Empirical Bayesian Learning in AR Graphical ModelsJul 08 2019We address the problem of learning graphical models which correspond to high dimensional autoregressive stationary stochastic processes. A graphical model describes the conditional dependence relations among the components of a stochastic process and ... More
False Discovery Rates in Biological NetworksJul 08 2019The increasing availability of data has generated unprecedented prospects for network analyses in many biological fields, such as neuroscience (e.g., brain networks), genomics (e.g., gene-gene interaction networks), and ecology (e.g., species interaction ... More
Aggregated False Discovery Rate ControlJul 08 2019We propose an aggregation scheme for methods that control the false discovery rate (FDR). Our scheme retains the underlying methods' FDR guarantees in theory and can decrease FDR and increase power in practice.
A Versatile Estimation Procedure without Estimating the Nonignorable Missingness MechanismJul 08 2019We consider the estimation problem in a regression setting where the outcome variable is subject to nonignorable missingness and identifiability is ensured by the shadow variable approach. We propose a versatile estimation procedure where modeling of ... More
How many groups? A statistical methodology for data-driven partitioning of infectious disease incidence into age-groupsJul 08 2019Understanding age-group dynamics of infectious diseases is a fundamental issue for both scientific study and policymaking. Age-structure epidemic models were developed in order to study and improve our understanding of these dynamics. By fitting the models ... More
Modeling Symmetric Positive Definite Matrices with An Application to Functional Brain ConnectivityJul 08 2019In neuroscience, functional brain connectivity describes the connectivity between brain regions that share functional properties. Neuroscientists often characterize it by a time series of covariance matrices between functional measurements of distributed ... More
Brand vs. Generic: Addressing Non-Adherence, Secular Trends, and Non-OverlapJul 07 2019While generic drugs offer a cost-effective alternative to brand name drugs, regulators need a method to assess therapeutic equivalence in a post market setting. We develop such a method in the context of assessing the therapeutic equivalence of immediate ... More
Bayesian Nonparametric Nonhomogeneous Poisson Process with Applications to USGS Earthquake DataJul 06 2019Intensity estimation is a common problem in statistical analysis of spatial point pattern data. This paper proposes a nonparametric Bayesian method for estimating the spatial point process intensity based on mixture of finite mixture (MFM) model. MFM ... More
XGBoostLSS -- An extension of XGBoost to probabilistic forecastingJul 06 2019Jul 11 2019We propose a new framework of XGBoost that predicts the entire conditional distribution of a univariate response variable. In particular, XGBoostLSS models all moments of a parametric distribution, i.e., mean, location, scale and shape (LSS), instead ... More
Learning a latent pattern of heterogeneity in the innovation rates of a time series of countsJul 06 2019We develop a Bayesian hierarchical semiparametric model for phenomena related to time series of counts. The main feature of the model is its capability to learn a latent pattern of heterogeneity in the distribution of the process innovation rates, which ... More
Improving Lasso for model selection and predictionJul 05 2019It is known that the Thresholded Lasso (TL), SCAD or MCP correct intrinsic estimation bias of the Lasso. In this paper we propose an alternative method of improving the Lasso for predictive models with general convex loss functions which encompass normal ... More
Geodesic Learning via Unsupervised Decision ForestsJul 05 2019Geodesic distance is the shortest path between two points in a Riemannian manifold. Manifold learning algorithms, such as Isomap, seek to learn a manifold that preserves geodesic distances. However, such methods operate on the ambient dimensionality, ... More
Risk models for breast cancer and their validationJul 05 2019Jul 09 2019Strategies to prevent cancer and diagnose it early when it is most treatable are needed to reduce the public health burden from rising disease incidence. Risk assessment is playing an increasingly important role in targeting individuals in need of such ... More
Analyses of 'change scores' do not estimate causal effects in observational dataJul 05 2019Background: In longitudinal data, it is common to create 'change scores' by subtracting measurements taken at baseline from those taken at follow-up, and then to analyse the resulting 'change' as the outcome variable. In observational data, this approach ... More
Spatio-Temporal Reconstructions of Global CO2-Fluxes using Gaussian Markov Random FieldsJul 05 2019Atmospheric inverse modelling is a method for reconstructing historical fluxes of greenhouse gas between land and atmosphere, using observed atmospheric concentrations and an atmospheric tracer transport model. The small number of observed atmospheric ... More
Particularities and commonalities of singular spectrum analysis as a method of time series analysis and signal processingJul 04 2019Singular spectrum analysis (SSA) has been a rapidly developing method of time series analysis since the second half of the 20th century. Since it can be called principal component analysis for time series, SSA will definitely be a standard method ... More
Multiple membership multilevel modelsJul 04 2019Multiple membership multilevel models are an extension of standard multilevel models for non-hierarchical data that have multiple membership structures. Traditional multilevel models involve hierarchical data structures whereby lower-level units such ... More
Cross-classified multilevel modelsJul 04 2019Cross-classified multilevel modelling is an extension of standard multilevel modelling for non-hierarchical data that have cross-classified structures. Traditional multilevel models involve hierarchical data structures whereby lower level units such as ... More
An enriched mixture model for functional clusteringJul 04 2019There is an increasingly rich literature about Bayesian nonparametric models for clustering functional observations. However, most of the recent proposals rely on infinite-dimensional characterizations that might lead to overly complex cluster solutions. ... More
Bayes factors with (overly) informative priorsJul 04 2019Jul 06 2019Priors in which a large number of parameters are specified to be independent are dangerous; they make it hard to learn from data. I present a couple of examples from the literature and work through a bit of large sample theory to show what happens.
Efficient Parameter Estimation of Sampled Random FieldsJul 04 2019Jul 15 2019We provide a computationally and statistically efficient method for estimating the parameters of a stochastic Gaussian model observed on a spatial grid, which need not be rectangular. Standard methods are plagued by computational intractability, where ... More
High-dimensional Gaussian graphical model for network-linked dataJul 04 2019Graphical models are commonly used to represent conditional dependence relationships between variables. There are multiple methods available for exploring them from high-dimensional data, but almost all of them rely on the assumption that the observations ... More
Subsampling Bias and The Best-Discrepancy Systematic Cross ValidationJul 04 2019Statistical machine learning models should be evaluated and validated before being put to work. The conventional k-fold Monte Carlo Cross-Validation (MCCV) procedure uses a pseudo-random sequence to partition instances into k subsets, which usually causes subsampling ... More
Bayesian Regularization of Gaussian Graphical Models with Measurement ErrorJul 04 2019We consider a framework for determining and estimating the conditional pairwise relationships of variables when the observed samples are contaminated with measurement error in high dimensional settings. Assuming the true underlying variables follow a ... More
mgcpy: A Comprehensive High Dimensional Independence Testing Python PackageJul 03 2019With the increase in the amount of data in many fields, a method to consistently and efficiently decipher relationships within high dimensional data sets is important. Because many modern datasets are high-dimensional, univariate independence tests are ... More
Optical Conductivity in an effective model for Graphene: Finite temperature correctionsJul 03 2019In this article, we investigate the temperature and chemical potential dependence of the optical conductivity of graphene, within a field theoretical representation in the continuum approximation, arising from an underlying tight-binding atomistic model, ... More
Quantum stochastic transport along chainsJul 03 2019We study the spreading along an infinite tight-binding chain, and the relaxation within a finite ring (chain with periodic boundary conditions). Specifically we address the interplay of coherent and stochastic transitions within the framework of an Ohmic ... More
bayes4psy -- an Open Source R Package for Bayesian Statistics in PsychologyJul 03 2019Research in psychology generates interesting data sets and unique statistical modelling tasks. However, these tasks, while important, are often very specific, so appropriate statistical models and methods cannot be found in accessible Bayesian tools. ... More
Mid-quantile regression for discrete responsesJul 03 2019We develop quantile regression methods for discrete responses by extending Parzen's definition of marginal mid-quantiles. As opposed to existing approaches, which are based on either jittering or latent constructs, we use interpolation and define the ... More
Evaluating A Key Instrumental Variable Assumption Using Randomization TestsJul 03 2019Instrumental variable (IV) analyses are becoming common in health services research and epidemiology. Most IV analyses use naturally occurring instruments, such as distance to a hospital. In these analyses, investigators must assume the instrument is ... More
Model-based clustering and classification using mixtures of multivariate skewed power exponential distributionsJul 03 2019Families of mixtures of multivariate power exponential (MPE) distributions have been previously introduced and shown to be competitive for cluster analysis in comparison to other elliptical mixtures including mixtures of Gaussian distributions. Herein, ... More
Testing independence between two random sets for the analysis of colocalization in bio-imagingJul 03 2019Colocalization aims at characterizing spatial associations between two fluorescently-tagged biomolecules by quantifying the co-occurrence and correlation between the two channels acquired in fluorescence microscopy. Colocalization is presented either ... More
A Bayesian Semiparametric Gaussian Copula Approach to a Multivariate Normality TestJul 03 2019Jul 04 2019In this paper, a Bayesian semiparametric copula approach is used to model the underlying multivariate distribution $F_{true}$. First, the Dirichlet process is constructed on the unknown marginal distributions of $F_{true}$. Then a Gaussian copula model ... More
Double Cross Validation for the Number of Factors in Approximate Factor ModelsJul 02 2019Determining the number of factors is essential to factor analysis. In this paper, we propose an efficient cross validation (CV) method to determine the number of factors in approximate factor models. The method applies CV twice, first along the directions ... More
Quantum Thermodynamics: An introduction to the thermodynamics of quantum informationJul 02 2019This book provides an introduction to the emerging field of quantum thermodynamics, with particular focus on its relation to quantum information and its implications for quantum computers and next generation quantum technologies. The text, aimed at graduate ... More
Penalized Variable Selection in Multi-Parameter Regression Survival ModellingJul 02 2019Multi-parameter regression (MPR) modelling refers to the approach whereby covariates are allowed to enter the model through multiple distributional parameters simultaneously. This is in contrast to the standard approaches where covariates enter through ... More
Multiple competition based FDR controlJul 02 2019Competition based FDR control has been commonly used for over a decade in the computational mass spectrometry community [7]. The approach has gained significant popularity in other fields after Barber and Candès recently laid its theoretical foundation ... More
On Global-local Shrinkage Priors for Count DataJul 02 2019Global-local shrinkage priors have been recognized as a useful class of priors that can strongly shrink small signals towards prior means while keeping large signals unshrunk. Although such priors have been extensively discussed under Gaussian responses, ... More
Adaptive Partitioning Design and Analysis for Emulation of a Complex Computer CodeJul 02 2019Computer models are used as replacements for physical experiments in a large variety of applications. Nevertheless, direct use of the computer model for the ultimate scientific objective is often limited by the complexity and cost of the model. Historically, ... More
Volatility Analysis with Realized GARCH-Ito ModelsJul 02 2019This paper introduces a unified approach for modeling high-frequency financial data that can accommodate both the continuous-time jump-diffusion and discrete-time realized GARCH model by embedding the discrete realized GARCH structure in the continuous ... More
Bayesian Analysis of High-dimensional Discrete Graphical ModelsJul 02 2019This work introduces a Bayesian methodology for fitting large discrete graphical models with spike-and-slab priors to encode sparsity. We consider a quasi-likelihood approach that enables node-wise parallel computation resulting in reduced computational ... More
Using Subset Log-Likelihoods to Trim Outliers in Gaussian Mixture ModelsJul 02 2019Mixtures of Gaussian distributions are a popular choice in model-based clustering. Outliers can affect parameters estimation and, as such, must be accounted for. Algorithms such as TCLUST discern the most likely outliers, but only when the proportion ... More
Permutation inference with a finite number of heterogeneous clustersJul 01 2019I introduce a simple permutation procedure to test conventional (non-sharp) hypotheses about the effect of a binary treatment in the presence of a finite number of large, heterogeneous clusters when the treatment effect is identified by comparisons across ... More
State-of-the-art in selection of variables and functional forms in multivariable analysis -- outstanding issuesJul 01 2019How to select variables and identify functional forms for continuous variables is a key concern when creating a multivariable model. Ad hoc 'traditional' approaches to variable selection have been in use for at least 50 years. Similarly, methods for determining ... More
Transformed Naive Ratio and Product Based Estimators for Estimating Population Mode in Simple Random SamplingJul 01 2019In this paper, we propose transformed naïve ratio and product based estimators using the characterizing scalar in the presence of auxiliary information of the study variable for estimating the population mode following simple random sampling without replacement. ... More
Coupling techniques for nonlinear ensemble filteringJun 30 2019We consider filtering in high-dimensional non-Gaussian state-space models with intractable transition kernels, nonlinear and possibly chaotic dynamics, and sparse observations in space and time. We propose a novel filtering methodology that harnesses ... More
Frequentist performances of Bayesian prediction intervals for random-effects meta-analysisJun 30 2019The prediction interval has been increasingly used in meta-analyses as a useful measure for assessing the magnitude of treatment effect and between-studies heterogeneity. In calculations of the prediction interval, although the Higgins-Thompson-Spiegelhalter ... More
An outlier-robust Kalman filter with mixture correntropyJun 30 2019We consider the robust filtering problem for a nonlinear state-space model with outliers in measurements. A novel robust cubature Kalman filtering algorithm is proposed based on mixture correntropy with two Gaussian kernels. We have formulated the robust ... More
Estimating Treatment Effect under Additive Hazards Models with High-dimensional CovariatesJun 29 2019Estimating causal effects for survival outcomes in the high-dimensional setting is an extremely important topic for many biomedical applications as well as areas of social sciences. We propose a new orthogonal score method for treatment effect estimation ... More
trialr: Bayesian Clinical Trial Designs in R and StanJun 29 2019This manuscript introduces an R package called trialr that implements a collection of clinical trial methods in Stan and R. In this article, we explore three methods in detail. The first is the continual reassessment ... More
Learning Markov models via low-rank optimizationJun 28 2019Modeling unknown systems from data is a precursor of system optimization and sequential decision making. In this paper, we focus on learning a Markov model from a single trajectory of states. Suppose that the transition model has a small rank despite ... More
Large-scale inference with block structureJun 28 2019The detection of weak and rare effects in large amounts of data arises in a number of modern data analysis problems. Known results show that in this situation the potential of statistical inference is severely limited by the large-scale multiple testing ... More
Robust test for dispersion parameter change in discretely observed diffusion processesJun 28 2019This paper deals with the problem of testing for dispersion parameter change in discretely observed diffusion processes when the observations are contaminated by outliers. To lessen the impact of outliers, we first calculate residuals using a robust estimate ... More
Skyrmion relaxation dynamics in the presence of quenched disorderJun 28 2019Using Langevin molecular dynamics simulations we study relaxation processes of interacting skyrmion systems with and without quenched disorder. Using the typical diffusion length as the time-dependent length characterizing the relaxation process, we find ... More
High-dimensional principal component analysis with heterogeneous missingnessJun 28 2019We study the problem of high-dimensional Principal Component Analysis (PCA) with missing observations. In simple, homogeneous missingness settings with a noise level of constant order, we show that an existing inverse-probability weighted (IPW) estimator ... More
On the conditional distribution of the mean of the two closest among a set of three observationsJun 28 2019Chemical analyses of raw materials are often repeated in duplicate or triplicate. The assay values obtained are then combined using a predetermined formula to obtain an estimate of the true value of the material of interest. When duplicate observations ... More
Formulating causal questions and principled statistical answersJun 28 2019Although review papers on causal inference methods are now available, there is a lack of introductory overviews on what they can render and on the guiding criteria for choosing one particular method. This tutorial gives an overview in situations where ... More