State Compression of Markov Processes via Empirical Low-Rank Estimation (Feb 08 2018). Model reduction is a central problem in analyzing complex systems and high-dimensional data. We study the state compression of a finite-state Markov process from its empirical trajectories. We adopt a low-rank model which is motivated by the state aggregation ...

Bayesian analysis of predictive Non-Homogeneous hidden Markov models using Polya-Gamma data augmentation (Feb 08 2018). We consider Non-Homogeneous Hidden Markov Models (NHHMMs) for forecasting univariate time series. We introduce two-state NHHMMs where the time series are modeled via different predictive regression models for each state. Also, the time-varying transition ...

Data-adaptive doubly robust instrumental variable methods for treatment effect heterogeneity (Feb 08 2018). We consider the estimation of the average treatment effect in the treated as a function of baseline covariates, where there is a valid (conditional) instrument. We describe two doubly robust (DR) estimators: a locally efficient g-estimator, and a targeted ...

More Efficient Estimation for Logistic Regression with Optimal Subsample (Feb 08 2018). Facing large amounts of data, subsampling is a practical technique to extract useful information. For this purpose, Wang et al. (2017) developed an Optimal Subsampling Method under the A-optimality Criterion (OSMAC) for logistic regression that samples ...

A Bayesian Approach to Multi-State Hidden Markov Models: Application to Dementia Progression (Feb 08 2018). People are living longer than ever before, and with this arise new complications and challenges for humanity. Among the most pressing of these challenges is to understand the role of aging in the development of dementia. This paper is motivated by the ...

Correlation Estimation System Minimization Compared to Least Squares Minimization in Simple Linear Regression (Feb 07 2018). A general method of minimization using correlation coefficients and order statistics is evaluated relative to least squares procedures in the estimation of parameters for normal data in simple linear regression.

Interpolating Distributions for Populations in Nested Geographies using Public-use Data with Application to the American Community Survey (Feb 07 2018). Statistical agencies often publish multiple data products from the same survey. First, they produce aggregate estimates of various features of the distributions of several socio-demographic quantities of interest. Often these area-level estimates are ...

Intentional control of type I error over unconscious data distortion: a Neyman-Pearson classification approach (Feb 07 2018). The rise of social media enables millions of citizens to generate information on sensitive political issues and social events, which is scarce in authoritarian countries and is tremendously valuable for surveillance and social studies. In the enormous ...

Sparse Linear Discriminant Analysis under the Neyman-Pearson Paradigm (Feb 07 2018). In contrast to the classical binary classification paradigm that minimizes the overall classification error, the Neyman-Pearson (NP) paradigm seeks classifiers with a minimal type II error while having a constrained type I error under a user-specified ...

Statistical tests for daily and total precipitation volumes to be abnormally extremal (Feb 07 2018). In this paper, two approaches are proposed to the definition of abnormally extremal precipitation. These approaches are based on the negative binomial model for the distribution of duration of wet periods measured in days. This model demonstrates excellent ...

Mixtures of Factor Analyzers with Fundamental Skew Symmetric Distributions (Feb 07 2018). Mixtures of factor analyzers (MFA) provide a powerful tool for modelling high-dimensional datasets. In recent years, several generalizations of MFA have been developed where the normality assumption of the factors and/or of the errors was relaxed to allow ...

Group kernels for Gaussian process metamodels with categorical inputs (Feb 07 2018). Gaussian processes (GP) are widely used as a metamodel for emulating time-consuming computer codes. We focus on problems involving categorical inputs, with a potentially large number L of levels (typically several tens), partitioned in G << L groups of ...

An Imputation-Consistency Algorithm for High-Dimensional Missing Data Problems and Beyond (Feb 06 2018). Missing data are frequently encountered in high-dimensional problems, but they are usually difficult to deal with using standard algorithms, such as the expectation-maximization (EM) algorithm and its variants. To tackle this difficulty, some problem-specific ...

How to Make Causal Inferences Using Texts (Feb 06 2018). New text as data techniques offer a great promise: the ability to inductively discover measures that are useful for testing social science theories of interest from large collections of text. We introduce a conceptual framework for making causal inferences ...

The additive hazard estimator is consistent for continuous time marginal structural models (Feb 06 2018). Marginal structural models (MSMs) allow for causal interpretations of longitudinal data. The standard MSM is based on discrete time models, but the continuous time MSM is a conceptually appealing alternative for survival analysis. In particular, the additive ...

Hidden regular variation, copula models, and the limit behavior of conditional excess risk measures (Feb 06 2018). Risk measures like Marginal Expected Shortfall and Marginal Mean Excess quantify conditional risk and in particular, aid in the understanding of systemic risk. In many such scenarios, models exhibiting heavy tails in the margins and asymptotic tail independence ...

Re-thinking non-inferiority: a practical trial design for optimising treatment duration (Feb 05 2018). Background: trials to identify the minimal effective treatment duration are needed in different therapeutic areas, including bacterial infections, TB and Hepatitis C. However, standard non-inferiority designs have several limitations, including arbitrariness ...

Parameter and Uncertainty Estimation for Dynamical Systems Using Surrogate Stochastic Processes (Feb 02 2018). Inference on unknown quantities in dynamical systems via observational data is essential for providing meaningful insight, furnishing accurate predictions, enabling robust control, and establishing appropriate designs for future experiments. Merging mathematical ...

Zero-adjusted Birnbaum-Saunders regression model (Feb 01 2018). In this paper we introduce the zero-adjusted Birnbaum-Saunders regression model. This new model generalizes at least seven Birnbaum-Saunders regression models. The idea of this modeling is mixing a degenerate distribution at zero with a Birnbaum-Saunders ...

Lindbladians with multiple steady states: theory and applications (Jan 31 2018). Markovian master equations, often called Liouvillians or Lindbladians, are used to describe decay and decoherence of a quantum system induced by that system's environment. While a natural environment is detrimental to fragile quantum properties, an engineered ...

De-biased sparse PCA: Inference and testing for eigenstructure of large covariance matrices (Jan 31 2018). Sparse principal component analysis (sPCA) has become one of the most widely used techniques for dimensionality reduction in high-dimensional datasets. The main challenge underlying sPCA is to estimate the first vector of loadings of the population covariance ...

Massively parallel symplectic algorithm for coupled magnetic spin dynamics and molecular dynamics (Jan 30 2018). A parallel implementation of coupled spin-lattice dynamics in the LAMMPS molecular dynamics package is presented. The equations of motion for both spin only and coupled spin-lattice dynamics are first reviewed, including a detailed account of how magneto-mechanical ...

Mixture Proportion Estimation for Positive-Unlabeled Learning via Classifier Dimension Reduction (Jan 30 2018; Jan 31 2018). Positive-unlabeled (PU) learning considers two samples, a positive set $P$ with observations from only one class and an unlabeled set $U$ with observations from two classes. The goal is to classify observations in $U$. Class mixture proportion estimation ...

Reparametrization of COM-Poisson Regression Models with Applications in the Analysis of Experimental Data (Jan 29 2018). In the analysis of count data often the equidispersion assumption is not suitable, hence the Poisson regression model is inappropriate. As a generalization of the Poisson distribution, the COM-Poisson distribution can deal with under-, equi- and overdispersed ...

Methodological variations in lagged regression for detecting physiologic drug effects in EHR data (Jan 26 2018). We studied how lagged linear regression can be used to detect the physiologic effects of drugs from data in the electronic health record (EHR). We systematically examined the effect of methodological variations ((i) time series construction, (ii) temporal ...

Uncertainty quantification for spatio-temporal computer models with calibration-optimal bases (Jan 24 2018). The calibration of complex computer codes using uncertainty quantification (UQ) methods is a rich area of statistical methodological development. When applying these techniques to simulators with spatio-temporal output, it is now standard to use principal ...

Sensitivity of codispersion to noise and error in ecological and environmental data (Jan 24 2018). Codispersion analysis is a new statistical method developed to assess spatial covariation between two spatial processes that may not be isotropic or stationary. Its application to anisotropic ecological datasets has provided new insights into mechanisms ...

Quantized Self-Assembly of Discotic Rings in a Liquid Crystal Confined in Nanopores (Jan 23 2018). Disklike molecules with aromatic cores spontaneously stack up in linear columns with high, one-dimensional charge carrier mobilities along the columnar axes making them prominent model systems for functional, self-organized matter. We show by high-resolution ...

A dissipative environment may improve the quantum annealing performances of the ferromagnetic p-spin model (Jan 23 2018). We investigate the quantum annealing of the ferromagnetic $p$-spin model in a dissipative environment ($p = 5$ and $p = 7$). This model, in the large $p$ limit, codifies Grover's algorithm for searching in an unsorted database. The dissipative ...

Propensity score methodology in the presence of network entanglement between treatments (Jan 22 2018). In experimental design and causal inference, it may happen that the treatment is not defined on individual experimental units, but rather on pairs or, more generally, on groups of units. For example, teachers may choose pairs of students who do not know ...

A Kotel'nikov Representation for Wavelets (Jan 17 2018). This paper presents a wavelet representation using baseband signals, by exploiting Kotel'nikov results. Details of how to obtain the processes of envelope and phase at low frequency are shown. The archetypal interpretation of wavelets as an analysis with ...

Efficient Computation of the 8-point DCT via Summation by Parts (Jan 17 2018). This paper introduces a new fast algorithm for the 8-point discrete cosine transform (DCT) based on the summation-by-parts formula. The proposed method converts the DCT matrix into an alternative transformation matrix that can be decomposed into sparse ...

Bayesian Estimation of Gaussian Graphical Models with Projection Predictive Selection (Jan 17 2018; Jan 20 2018). Gaussian graphical models are used for determining conditional relationships between variables. This is accomplished by identifying off-diagonal elements in the inverse-covariance matrix that are non-zero. When the ratio of variables (p) to observations ...

Panel data analysis via mechanistic models (Jan 17 2018). Panel data, also known as longitudinal data, consist of a collection of time series. Each time series, which could itself be multivariate, comprises a sequence of measurements taken on a distinct unit. Mechanistic modeling involves writing down scientifically ...

Factor graph fragmentization of expectation propagation (Jan 16 2018). Expectation propagation is a general approach to fast approximate inference for graphical models. The existing literature treats models separately when it comes to deriving and coding expectation propagation inference algorithms. This comes at the cost ...

TFisher Tests: Optimal and Adaptive Thresholding for Combining $p$-Values (Jan 12 2018). For testing a group of hypotheses, a tremendous number of $p$-value combination methods have been developed and widely applied since the 1930s. Some methods (e.g., the minimal $p$-value) are optimal for sparse signals, and some others (e.g., Fisher's combination) are ...

Robust inference with knockoffs (Jan 11 2018; Jan 23 2018). We consider the variable selection problem, which seeks to identify important variables influencing a response $Y$ out of many candidate features $X_1, \ldots, X_p$. We wish to do so while offering finite-sample guarantees about the fraction of false ...

On variance estimation for Bayesian variable selection (Jan 09 2018). Consider the problem of high dimensional variable selection for the Gaussian linear model when the unknown error variance is also of interest. In this paper, we argue that the use of conjugate continuous shrinkage priors for Bayesian variable selection can ...

Variable selection in Functional Additive Regression Models (Jan 02 2018). This paper considers the problem of variable selection when some of the variables have a functional nature and can be mixed with other types of variables (scalar, multivariate, directional, etc.). Our proposal begins with a simple null model and sequentially ...

Adaptive Sign Error Control (Dec 30 2017). In multiple testing scenarios, typically the sign of a parameter is inferred when its estimate exceeds some significance threshold in absolute value. Typically, the significance threshold is chosen to control the experimentwise type I error rate, family-wise ...

A Divide-and-Conquer Bayesian Approach to Large-Scale Kriging (Dec 28 2017). Flexible hierarchical Bayesian modeling of massive data is challenging due to poorly scaling computations in large sample size settings. This article is motivated by spatial process models for analyzing geostatistical data, which typically entail computations ...

Space-Filling Designs for Robustness Experiments (Dec 25 2017). To identify the robust settings of the control factors, it is very important to understand how they interact with the noise factors. In this article, we propose space-filling designs for computer experiments that are more capable of accurately estimating ...

Merging $K$-means with hierarchical clustering for identifying general-shaped groups (Dec 23 2017). Clustering partitions a dataset such that observations placed together in a group are similar but different from those in other groups. Hierarchical and $K$-means clustering are two approaches but have different strengths and weaknesses. For instance, ...

Mixtures of Matrix Variate Bilinear Factor Analyzers (Dec 22 2017). Over the years data is becoming increasingly higher dimensional, which has prompted an increased need for dimension reduction techniques, in particular for clustering and classification. Although dimension reduction in the area of clustering for multivariate ...

The Geometry of Continuous Latent Space Models for Network Data (Dec 22 2017). We review the class of continuous latent space (statistical) models for network data, paying particular attention to the role of the geometry of the latent space. In these models, the presence/absence of network dyadic ties are assumed to be conditionally ...

Dynamic Networks with Multi-scale Temporal Structure (Dec 22 2017). We describe a novel method for modeling non-stationary multivariate time series, with time-varying conditional dependencies represented through dynamic networks. Our proposed approach combines traditional multi-scale modeling and network based neighborhood ...

Unsteady heat conduction processes in a harmonic crystal with a substrate potential (Dec 22 2017). An analytical model of high frequency oscillations of the kinetic and potential energies in a one-dimensional harmonic crystal with a substrate potential is obtained by introducing the nonlocal energies [1]. A generalization of the kinetic temperature ...

Model comparison for Gibbs random fields using noisy reversible jump Markov chain Monte Carlo (Dec 14 2017; Dec 21 2017). The reversible jump Markov chain Monte Carlo (RJMCMC) method offers an across-model simulation approach for Bayesian estimation and model comparison, by exploring the sampling space that consists of several models of varying dimensions. The implementation ...

Analysis-of-marginal-Tail-Means - a new method for robust parameter optimization (Dec 10 2017). This paper presents a novel method, called Analysis-of-marginal-Tail-Means (ATM), for parameter optimization over a large, discrete design space. The key advantage of ATM is that it offers effective and robust optimization performance for both smooth ...

Maximum entropy low-rank matrix recovery (Dec 08 2017). We propose a novel, information-theoretic method, called MaxEnt, for efficient data requisition for low-rank matrix recovery. This proposed method has important applications to a wide range of problems in image processing, text document indexing and system ...

Bayesian analysis of finite population sampling in multivariate co-exchangeable structures with separable covariance matric (Nov 29 2017). We explore the effect of finite population sampling in design problems with many variables cross-classified in many ways. In particular, we investigate designs where we wish to sample individuals belonging to different groups for which the underlying ...

Binary classification models with "Uncertain" predictions (Nov 27 2017; Dec 04 2017). Binary classification models which can assign probabilities to categories such as "the tissue is 75% likely to be tumorous" or "the chemical is 25% likely to be toxic" are well understood statistically, but their utility as an input to decision making ...

Sparse-Input Neural Networks for High-dimensional Nonparametric Regression and Classification (Nov 21 2017). Neural networks are usually not the tool of choice for nonparametric high-dimensional problems where the number of input features is much larger than the number of observations. Though neural networks can approximate complex multivariate functions, they ...

Inverse stable prior for exponential models (Nov 08 2017; Jan 29 2018). We consider a class of non-conjugate priors as a mixing family of distributions for a parameter (e.g., Poisson or gamma rate, inverse scale or precision of an inverse-gamma, inverse variance of a normal distribution) of an exponential subclass of discrete ...

The extended power distribution: A new distribution on $(0, 1)$ (Nov 08 2017). We propose a two-parameter bounded probability distribution called the extended power distribution. This distribution on $(0, 1)$ is similar to the beta distribution, however there are some advantages which we explore. We define the moments and quantiles ...

Signatures of the Many-body Localized Regime in Two Dimensions (Nov 07 2017). Lessons from Anderson localization highlight the importance of dimensionality of real space for localization due to disorder. More recently, studies of many-body localization have focussed on the phenomena in one dimension using techniques of exact diagonalization ...

Sophisticated and small versus simple and sizeable: When does it pay off to introduce drifting coefficients in Bayesian VARs? (Nov 01 2017; Nov 29 2017). We assess the relationship between model size and complexity in the time-varying parameter VAR framework via thorough predictive exercises for the Euro Area, the United Kingdom and the United States. It turns out that sophisticated dynamics through drifting ...

Spatially Adaptive Colocalization Analysis in Dual-Color Fluorescence Microscopy (Oct 31 2017). Colocalization analysis aims to study complex spatial associations between bio-molecules via optical imaging techniques. However, existing colocalization analysis workflows only assess an average degree of colocalization within a certain region of interest ...

Heat Kernel Smoothing in Irregular Image Domains (Oct 21 2017). We present the discrete version of heat kernel smoothing on a graph data structure. The method is used to smooth data in irregularly shaped domains in 3D images. New statistical properties are derived. As an application, we show how to filter out data ...

Nonparametric estimation of multivariate distribution function for truncated and censored lifetime data (Oct 20 2017). In this article we consider a number of models for the statistical data generation in different areas of insurance, including life, pension and non-life insurance. Insurance statistics are usually truncated and censored, and often are multidimensional. ...

Box-Cox elliptical distributions with application (Oct 17 2017). We propose and study the class of Box-Cox elliptical distributions. It provides alternative distributions for modeling multivariate positive, marginally skewed and possibly heavy-tailed data. This new class of distributions has as a special case the class ...

Methods for Analyzing Large Spatial Data: A Review and Comparison (Oct 13 2017). The Gaussian process is an indispensable tool for spatial data analysts. The onset of the "big data" era, however, has led to the traditional Gaussian process being computationally infeasible for modern spatial data. As such, various alternatives to ...

Efficient MCMC for Gibbs Random Fields using pre-computation (Oct 11 2017; Oct 13 2017). Bayesian inference of Gibbs random fields (GRFs) is often referred to as a doubly intractable problem, since the likelihood function is intractable. The exploration of the posterior distribution of such models is typically carried out with a sophisticated ...

Automated and Robust Quantification of Colocalization in Dual-Color Fluorescence Microscopy: A Nonparametric Statistical Approach (Oct 03 2017). Colocalization is a powerful tool to study the interactions between fluorescently labeled molecules in biological fluorescence microscopy. However, existing techniques for colocalization analysis have not undergone continued development especially in ...

Generalized Bayesian Updating and the Loss-Likelihood Bootstrap (Sep 22 2017). In this paper, we revisit the weighted likelihood bootstrap and show that it is well-motivated for Bayesian inference under misspecified models. We extend the underlying idea to a wider family of inferential problems. This allows us to calibrate an analogue ...

An estimator of the stable tail dependence function based on the empirical beta copula (Sep 12 2017). The replacement of indicator functions by integrated beta kernels in the definition of the empirical stable tail dependence function is shown to produce a smoothed version of the latter estimator with the same asymptotic distribution but superior finite-sample ...

An Efficient Calculation Method for the Expected Value of Sample Information: Can we do it? Yes, we can (Sep 07 2017). The Expected Value of Sample Information (EVSI) allows us to quantify the economic benefit of a potential future trial or study. While this is clearly of benefit, especially when considering which trials should be funded, it has rarely been used in practice ...

Unbiased Estimation and Sensitivity Analysis for Network-Specific Spillover Effects: Application to An Online Network Experiment (Aug 28 2017; Sep 11 2017). In experiments, the outcome of one individual may be affected by the treatment status of others -- there may be spillover effects. Since such spillover effects may simultaneously occur through multiple networks, researchers are often interested in estimating ...

Hypothesis testing for tail dependence parameters on the boundary of the parameter space with application to generalized max-linear models (Aug 23 2017). Modelling multivariate tail dependence is one of the key challenges in extreme-value theory. The max-linear model is a parametric tail dependence model which is dense in the class of multivariate extreme-value models. Being non-differentiable, it cannot ...

Projected support points, with application to optimal MCMC reduction (Aug 23 2017). This paper introduces a new method for optimally compacting a continuous distribution $F$ into a representative point set called projected support points. As its name suggests, the primary appeal of projected support points is that it provides an optimal ...

Regularized Estimation and Testing for High-Dimensional Multi-Block Vector-Autoregressive Models (Aug 19 2017). Dynamical systems comprising multiple components that can be partitioned into distinct blocks originate in many scientific areas. A pertinent example is the interactions between financial assets and selected macroeconomic indicators, which has been ...

Faster Family-wise Error Control for Neuroimaging with a Parametric Bootstrap (Aug 16 2017; Aug 18 2017). In neuroimaging, hundreds to hundreds of thousands of tests are performed across a set of brain regions or all locations in an image. Recent studies have shown that the most common family-wise error (FWE) controlling procedures in imaging, which rely ...

A k-means procedure based on a Mahalanobis type distance for clustering multivariate functional data (Aug 01 2017). This paper proposes a clustering procedure for samples of multivariate functions in $(L^2(I))^{J}$, with $J\geq1$. This method is based on a k-means algorithm in which the distance between the curves is measured with a metric that generalizes the Mahalanobis ...

A unified theory for exact stochastic modelling of univariate and multivariate processes with continuous, mixed type, or discrete marginal distributions and any correlation structure (Jul 21 2017). Hydroclimatic processes are characterized by heterogeneous spatiotemporal correlation structures and marginal distributions that can be continuous, mixed-type, discrete or even binary. Simulating exactly such processes can greatly improve hydrological ...

Unmixing dynamic PET images with variable specific binding kinetics (Jul 19 2017; Dec 09 2017). To analyze dynamic positron emission tomography (PET) images, various generic multivariate data analysis techniques have been considered in the literature, such as principal component analysis (PCA), independent component analysis (ICA), factor analysis ...

Behaviour of l-bits near the many-body localization transition (Jul 17 2017). Eigenstates of fully many-body localized (FMBL) systems are described by quasilocal operators $\tau_i^z$ (l-bits), which are conserved exactly under Hamiltonian time evolution. The algebra of the operators $\tau_i^z$ and $\tau_i^x$ associated with l-bits ...

Optimal Monte Carlo integration on closed manifolds (Jul 15 2017; Jan 24 2018). The worst case integration error in reproducing kernel Hilbert spaces of standard Monte Carlo methods with n random points decays as $n^{-1/2}$. However, re-weighting of random points can sometimes be used to improve the convergence order. This paper ...

Measuring heavy-tailedness of distributions (Jul 05 2017). Different questions related to the analysis of extreme values and outliers arise frequently in practice. Excluding extremal observations and outliers is not a good decision because they contain important information about the observed distribution. The ...

Quantifying and estimating additive measures of interaction from case-control data (Jul 04 2017). In this paper we develop a general framework for quantifying how binary risk factors jointly influence a binary outcome. Our key result is an additive expansion of odds ratios as a sum of marginal effects and interaction terms of varying order. These ...

Informed Sub-Sampling MCMC: Approximate Bayesian Inference for Large Datasets (Jun 26 2017; Dec 11 2017). This paper introduces a framework for speeding up Bayesian inference conducted in the presence of large datasets. We design a Markov chain whose transition kernel uses an unknown fraction of fixed size of the available data that is randomly refreshed ...

Active matrix completion with uncertainty quantification (Jun 25 2017; Feb 12 2018). The noisy matrix completion problem, which aims to recover a low-rank matrix $\mathbf{X}$ from a partial, noisy observation of its entries, arises in many statistical, machine learning, and engineering applications. In this paper, we present a new, information-theoretic ...

On the role of the overall effect in exponential families (Jun 09 2017; Sep 01 2017). Exponential families of discrete probability distributions when the normalizing constant (or overall effect) is added or removed are compared in this paper. The latter setup, in which the exponential family is curved, is particularly relevant when the ...

Fast and General Model Selection using Data Depth and Resampling (Jun 08 2017; Nov 28 2017). We present a technique using data depth functions and resampling to perform best subset variable selection for a wide range of statistical models. We do this by assigning a score, called an $e$-value, to a candidate model, and use a fast bootstrap method ...

A note on intrinsic Conditional Autoregressive models for disconnected graphs (May 13 2017). In this note we discuss (Gaussian) intrinsic conditional autoregressive (CAR) models for disconnected graphs, with the aim of providing practical guidelines for how these models should be defined, scaled and implemented. We show how these suggestions ...

Inference for three-parameter M-Wright distributions with applications (May 03 2017; Nov 09 2017). We propose point estimators for the three-parameter (location, scale, and the fractional parameter) variant distributions generated by a Wright function. We also provide uncertainty quantification procedures for the proposed point estimators under certain ...

Sparse Bayesian vector autoregressions in huge dimensions (Apr 11 2017). We develop a Bayesian vector autoregressive (VAR) model that is capable of handling vast dimensional information sets. Three features are introduced to permit reliable estimation of the model. First, we assume that the reduced-form errors in the VAR feature ...

Geometry of Log-Concave Density Estimation (Apr 06 2017). Shape-constrained density estimation is an important topic in mathematical statistics. We focus on densities on $\mathbb{R}^d$ that are log-concave, and we study geometric properties of the maximum likelihood estimator (MLE) for weighted samples. Cule, ...

Model selection and model averaging in MACML-estimated MNP models (Apr 01 2017). This paper provides a review of model selection and model averaging methods for multinomial probit models estimated using the MACML approach. The proposed approaches are partitioned into test based methods (mostly derived from the likelihood ratio paradigm), ...

How to avoid the curse of dimensionality: scalability of particle filters with and without importance weights (Mar 22 2017; Sep 19 2017). Particle filters are a popular and flexible class of numerical algorithms to solve a large class of nonlinear filtering problems. However, standard particle filters with importance weights have been shown to require a sample size that increases exponentially ...

Marcinkiewicz's strong law of large numbers for non-additive expectation (Mar 02 2017). The sub-linear expectation space is a nonlinear expectation space having advantages of modelling the uncertainty of probability and distribution. In the sub-linear expectation space, we use capacity and sub-linear expectation to replace probability and ...

Rank conditional coverage and confidence intervals in high dimensional problems (Feb 22 2017). Confidence interval procedures used in low dimensional settings are often inappropriate for high dimensional applications. When a large number of parameters are estimated, marginal confidence intervals associated with the most significant estimates have ...

Multilevel Monte Carlo in Approximate Bayesian Computation (Feb 13 2017). In the following article we consider approximate Bayesian computation (ABC) inference. We introduce a method for numerically approximating ABC posteriors using the multilevel Monte Carlo (MLMC). A sequential Monte Carlo version of the approach is developed ...

Phenomenological forecasting of disease incidence using heteroskedastic Gaussian processes: a dengue case study (Feb 01 2017; Aug 01 2017). In 2015 the US federal government sponsored a dengue forecasting competition using historical case data from Iquitos, Peru and San Juan, Puerto Rico. Competitors were evaluated on several aspects of out-of-sample forecasts including the targets of peak ...

Goodness-of-fit tests for the functional linear model based on randomly projected empirical processes (Jan 29 2017; Mar 24 2017). We consider marked empirical processes, indexed by a randomly projected functional covariate, to construct goodness-of-fit tests for the functional linear model with scalar response. The test statistics are built from continuous functionals over the projected ...

cmenet: a new method for bi-level variable selection of conditional main effects (Jan 19 2017; Nov 18 2017). This paper introduces a novel method for selecting main effects and a set of reparametrized effects called conditional main effects (CMEs), which capture the conditional effect of a factor at a fixed level of another factor. CMEs represent interpretable, ...

Adversarial and Amiable Inference in Medical Diagnosis, Reliability, and Survival Analysis (Jan 12 2017). In this paper, we develop a family of bivariate beta distributions that encapsulate both positive and negative correlations, and which can be of general interest for Bayesian inference. We then invoke a use of these bivariate distributions in two contexts. ...

Monte Carlo profile confidence intervals (Dec 08 2016). Monte Carlo methods to evaluate and maximize the likelihood function enable the construction of confidence intervals and hypothesis tests, facilitating scientific investigation using models for which the likelihood function is intractable. When Monte ...

Estimation of social-influence-dependent peer pressures in a large network game (Dec 08 2016). Research on peer effects in sociology has long focused on social influence power to investigate the social foundations for social interactions. This paper extends Xu (2011)'s large-network-based game model by allowing for social-influence-dependent ...

Magnetoresistance of compensated semimetals in confined geometries (Dec 07 2016). Two-component conductors -- e.g., semi-metals and narrow band semiconductors -- often exhibit unusually strong magnetoresistance in a wide temperature range. Suppression of the Hall voltage near charge neutrality in such systems gives rise to a strong ...

Testing the fit of relational models (Dec 07 2016). Relational models generalize log-linear models to arbitrary discrete sample spaces by specifying effects associated with any subsets of their cells. A relational model may include an overall effect, pertaining to every cell after a reparameterization, ...