Targeted Random Projection for Prediction from High-Dimensional Features
Dec 06 2017
We consider the problem of computationally efficient prediction from high dimensional and highly correlated predictors in challenging settings where accurate variable selection is effectively impossible. Direct application of penalization or Bayesian ...

Nonparametric Bayes modeling of count processes
Jul 10 2013
Data on count processes arise in a variety of applications, including longitudinal, spatial and imaging studies measuring count responses. The literature on statistical models for dependent count data is dominated by models built from hierarchical Poisson ...

Multiscale Bernstein polynomials for densities
Oct 03 2014
Our focus is on constructing a multiscale nonparametric prior for densities. The Bayes density estimation literature is dominated by single scale methods, with the exception of Polya trees, which favor overly-spiky densities even when the truth is smooth. ...

Bayesian Nonparametric Modeling of Higher Order Markov Chains
Jun 20 2015, Oct 20 2015
We consider the problem of flexible modeling of higher order Markov chains when an upper bound on the order of the chain is known but the true order and nature of the serial dependence are unknown. We propose Bayesian nonparametric methodology based on ...

Modular Bayes screening for high-dimensional predictors
Mar 29 2017
With the routine collection of massive-dimensional predictors in many application areas, screening methods that rapidly identify a small subset of promising predictors have become commonplace. We propose a new MOdular Bayes Screening (MOBS) approach, ...

Nonparametric Bayes dynamic modeling of relational data
Nov 19 2013
Symmetric binary matrices representing relations among entities are commonly collected in many areas. Our focus is on dynamically evolving binary relational matrices, with interest being in inference on the relationship structure and prediction. We propose ...

Locally Adaptive Dynamic Networks
May 21 2015, Aug 18 2016
Our focus is on realistically modeling and forecasting dynamic networks of face-to-face contacts among individuals. Important aspects of such data that lead to problems with current methods include the tendency of the contacts to move between periods ...

Parallelizing MCMC via Weierstrass Sampler
Dec 17 2013, May 25 2014
With the rapidly growing scales of statistical problems, subset based communication-free parallel MCMC methods are a promising future for large scale Bayesian analysis. In this article, we propose a new Weierstrass sampler for parallel MCMC based on independent ...

Stochastic Volatility Regression for Functional Data Dynamics
Dec 02 2012
Although there are many methods for functional data analysis (FDA), little emphasis is put on characterizing variability among volatilities of individual functions. In particular, certain individuals exhibit erratic swings in their trajectory while other ...

Constrained Bayesian Inference through Posterior Projections
Dec 14 2018
In a broad variety of settings, prior information takes the form of parameter restrictions. Bayesian approaches are appealing in parameter constrained problems in allowing a probabilistic characterization of uncertainty in finite samples, while providing ...

Bayesian Inference and Testing of Group Differences in Brain Networks
Nov 24 2014, Aug 17 2016
Network data are increasingly collected along with other variables of interest. Our motivation is drawn from neurophysiology studies measuring brain connectivity networks for a sample of individuals along with their membership to a low or high creative ...

Compressed Gaussian Process
Jun 07 2014
Nonparametric regression for massive numbers of samples (n) and features (p) is an increasingly important problem. In big n settings, a common strategy is to partition the feature space, and then separately apply simple models to each partition set. We ...

Multiresolution Tensor Decomposition for Multiple Spatial Passing Networks
Mar 03 2018
This article is motivated by soccer positional passing networks collected across multiple games. We refer to these data as replicated spatial passing networks. To accurately model such data, it is necessary to take into account the spatial positions of ...

Probabilistic Curve Learning: Coulomb Repulsion and the Electrostatic Gaussian Process
Jun 11 2015
Learning of low dimensional structure in multidimensional data is a canonical problem in machine learning. One common approach is to suppose that the observed data are close to a lower-dimensional smooth manifold. There are a rich variety of manifold ...

Latent Factor Models for Density Estimation
Aug 12 2011, Sep 19 2011
Although discrete mixture modeling has formed the backbone of the literature on Bayesian density estimation, there are some well known disadvantages. We propose an alternative class of priors based on random nonlinear functions of a uniform latent variable ...

Functional clustering in nested designs: Modeling variability in reproductive epidemiology studies
Nov 20 2014
We discuss functional clustering procedures for nested designs, where multiple curves are collected for each subject in the study. We start by considering the application of standard functional clustering tools to this problem, which leads to groupings ...

Bayesian Monotone Regression using Gaussian Process Projection
Jun 17 2013
Shape constrained regression analysis has applications in dose-response modeling, environmental risk assessment, disease screening and many other areas. Incorporating the shape constraints can improve estimation efficiency and avoid implausible results. ...

Nonparametric graphical model for counts
Jan 03 2019
Although multivariate count data are routinely collected in many application areas, there is surprisingly little work developing flexible models for characterizing their dependence structure. This is particularly true when interest focuses on inferring ...

Bayesian Factor Analysis for Inference on Interactions
Apr 25 2019
This article is motivated by the problem of inference on interactions among chemical exposures impacting human health outcomes. Chemicals often co-occur in the environment or in synthetic mixtures and as a result exposure levels can be highly correlated. ...

A framework for probabilistic inferences from imperfect models
Nov 04 2016
The Bayesian paradigm provides a natural way to deal with uncertainty in model selection through assigning each model in a list of models under consideration a posterior probability, with these probabilities providing a basis for inferences or used as ...

Nonparametric Bayes inference on conditional independence
Apr 05 2014, Mar 24 2015
In broad applications, it is routinely of interest to assess whether there is evidence in the data to refute the assumption of conditional independence of $Y$ and $X$ conditionally on $Z$. Such tests are well developed in parametric models but are not ...

Minimax Optimal Bayesian Aggregation
Mar 06 2014
It is generally believed that ensemble approaches, which combine multiple algorithms or models, can outperform any single algorithm at machine learning tasks, such as prediction. In this paper, we propose Bayesian convex and linear aggregation approaches ...

Bayesian dynamic financial networks with time-varying predictors
Mar 10 2014
We propose a Bayesian nonparametric model including time-varying predictors in dynamic network inference. The model is applied to infer the dependence structure among financial markets during the global financial crisis, estimating effects of verbal and ...

Bayesian Compressed Regression
Mar 04 2013, Mar 22 2013
As an alternative to variable selection or shrinkage in high dimensional regression, we propose to randomly compress the predictors prior to analysis. This dramatically reduces storage and computational bottlenecks, performing well when the predictors ...

Bayesian Conditional Tensor Factorizations for High-Dimensional Classification
Jan 21 2013
In many application areas, data are collected on a categorical response and high-dimensional categorical predictors, with the goals being to build a parsimonious model for classification while doing inferences on the important predictors. In settings ...

Bayesian Higher Order Hidden Markov Models
May 30 2018, Feb 05 2019
We consider the problem of flexible modeling of higher order hidden Markov models when the number of latent states and the nature of the serial dependence, including the true order, are unknown. We propose Bayesian nonparametric methodology based on tensor ...

Lipschitz Bandit Optimization with Improved Efficiency
Apr 25 2019, May 15 2019
We consider the Lipschitz bandit optimization problem with an emphasis on practical efficiency. Although there is rich literature on regret analysis of this type of problem, e.g., [Kleinberg et al. 2008, Bubeck et al. 2011, Slivkins 2014], their proposed ...

Inference on High-Dimensional Sparse Count Data
Oct 14 2015, Apr 14 2016
In a variety of application areas, there is a growing interest in analyzing high dimensional sparse count data, with sparsity exhibited by an over-abundance of zeros and small non-zero counts. Existing approaches for analyzing multivariate count data ...

Bayesian Manifold Regression
May 03 2013, Jun 16 2014
There is increasing interest in the problem of nonparametric regression with high-dimensional predictors. When the number of predictors $D$ is large, one encounters a daunting problem in attempting to estimate a $D$-dimensional surface based on limited ...

Bayesian multivariate mixed-scale density estimation
Oct 06 2011, May 23 2014
Although continuous density estimation has received abundant attention in the Bayesian nonparametrics literature, there is limited theory on multivariate mixed scale density estimation. In this note, we consider a general framework to jointly model continuous, ...

Sequential Markov Chain Monte Carlo
Aug 18 2013
We propose a sequential Markov chain Monte Carlo (SMCMC) algorithm to sample from a sequence of probability distributions, corresponding to posterior distributions at different times in on-line applications. SMCMC proceeds as in usual MCMC but with the ...

Bayesian modeling of temporal dependence in large sparse contingency tables
May 12 2012
In many applications, it is of interest to study trends over time in relationships among categorical variables, such as age group, ethnicity, religious affiliation, political party and preference for particular policies. At each time point, a sample of ...

Supervised Multiscale Dimension Reduction for Spatial Interaction Networks
Jan 01 2019, Jan 28 2019
We introduce a multiscale supervised dimension reduction method for SPatial Interaction Network (SPIN) data, which consist of a collection of interactions between units indexed by spatial coordinates. To facilitate regression analysis with SPIN predictors, ...

Classification via local manifold approximation
Mar 03 2019
Classifiers label data as belonging to one of a set of groups based on input features. It is challenging to obtain accurate classification performance when the feature distributions in the different classes are complex, with nonlinear, overlapping and ...

Locally Adaptive Bayes Nonparametric Regression via Nested Gaussian Processes
Jan 20 2012
We propose a nested Gaussian process (nGP) as a locally adaptive prior for Bayesian nonparametric regression. Specified through a set of stochastic differential equations (SDEs), the nGP imposes a Gaussian process prior for the function's $m$th-order ...

Bayes Variable Selection in Semiparametric Linear Models
Aug 12 2011
There is a rich literature proposing methods and establishing asymptotic properties of Bayesian variable selection methods for parametric models, with a particular focus on the normal linear regression model and an increasing emphasis on settings in which ...

Bayesian Modular and Multiscale Regression
Sep 16 2018
We tackle the problem of multiscale regression for predictors that are spatially or temporally indexed, or with a pre-specified multiscale structure, with a Bayesian modular approach. The regression function at the finest scale is expressed as an additive ...

Comparing and weighting imperfect models using D-probabilities
Nov 04 2016, Jun 01 2018
We propose a new approach for assigning weights to models using a divergence-based method (D-probabilities), relying on evaluating parametric models relative to a nonparametric Bayesian reference using Kullback-Leibler divergence. D-probabilities ...

Geodesic Distance Estimation with Spherelets
Jun 29 2019
Many statistical and machine learning approaches rely on pairwise distances between data points. The choice of distance metric has a fundamental impact on performance of these procedures, raising questions about how to appropriately calculate distances. ...

Multiresolution Gaussian Processes
Sep 05 2012
We propose a multiresolution Gaussian process to capture long-range, non-Markovian dependencies while allowing for abrupt changes. The multiresolution GP hierarchically couples a collection of smooth GPs, each defined over an element of a random nested ...

Locally adaptive factor processes for multivariate time series
Oct 07 2012, Jun 21 2013
In modeling multivariate time series, it is important to allow time-varying smoothness in the mean and covariance process. In particular, there may be certain time intervals exhibiting rapid changes and others in which changes are slow. If such time-varying ...

Multiscale Dictionary Learning for Estimating Conditional Distributions
Dec 04 2013
Nonparametric estimation of the conditional distribution of a response given high-dimensional features is a challenging problem. It is important to allow not only the mean but also the variance and shape of the response density to change flexibly with ...

Scalable Bayes via Barycenter in Wasserstein Space
Aug 24 2015, Jun 20 2018
Divide-and-conquer based methods for Bayesian inference provide a general approach for tractable posterior inference when the sample size is large. These methods divide the data into smaller subsets, sample from the posterior distribution of parameters ...

Efficient Entropy Estimation for Stationary Time Series
Apr 11 2019
Entropy estimation, due in part to its connection with mutual information, has seen considerable use in the study of time series data including causality detection and information flow. In many cases, the entropy is estimated using $k$-nearest neighbor ...

Bayesian Genome- and Epigenome-wide Association Studies with Gene Level Dependence
Apr 29 2016
High-throughput genetic and epigenetic data are often screened for associations with an observed phenotype. For example, one may wish to test hundreds of thousands of genetic variants, or DNA methylation sites, for an association with disease status. ...

Bayesian cumulative shrinkage for infinite factorizations
Feb 12 2019
There are a variety of Bayesian models relying on representations in which the dimension of the parameter space is, itself, unknown. For example, in factor analysis the number of latent variables is, in general, not known and has to be inferred from the ...

Semiparametric Bernstein-von Mises Theorem: Second Order Studies
Mar 16 2015
The major goal of this paper is to study the second order frequentist properties of the marginal posterior distribution of the parametric component in semiparametric Bayesian models, in particular, a second order semiparametric Bernstein-von Mises (BvM) ...

Bayesian nonparametric inference on the Stiefel manifold
Nov 04 2013, Jul 03 2014
The Stiefel manifold $V_{p,d}$ is the space of all $d \times p$ orthonormal matrices, with the $d-1$ hypersphere and the space of all orthogonal matrices constituting special cases. In modeling data lying on the Stiefel manifold, parametric distributions ...

Estimating densities with nonlinear support using Fisher-Gaussian kernels
Jul 12 2019
Current tools for multivariate density estimation struggle when the density is concentrated near a nonlinear subspace or manifold. Most approaches require choice of a kernel, with the multivariate Gaussian by far the most commonly used. Although heavy-tailed ...

Shared kernel Bayesian screening
Nov 01 2013, Feb 17 2016
This article concerns testing for equality of distribution between groups. We focus on screening variables with shared distributional features such as common support, modes and patterns of skewness. We propose a Bayesian testing method using kernel mixtures, ...

Bayesian nonparametric multivariate convex regression
Sep 01 2011
In many applications, such as economics, operations research and reinforcement learning, one often needs to estimate a multivariate regression function f subject to a convexity constraint. For example, in sequential decision processes the value of a state ...

Multivariate convex regression with adaptive partitioning
May 10 2011, Nov 13 2011
We propose a new, nonparametric method for multivariate regression subject to convexity or concavity constraints on the response function. Convexity constraints are common in economics, statistics, operations research, financial engineering and optimization, ...

Efficient Manifold and Subspace Approximations with Spherelets
Jun 26 2017, Feb 25 2019
Data lying in a high dimensional ambient space are commonly thought to have a much lower intrinsic dimension. In particular, the data may be concentrated near a lower-dimensional subspace or manifold. There is an immense literature focused on approximating ...

Bayesian Distance Clustering
Oct 19 2018, Jun 25 2019
Model-based clustering is widely used in a variety of application areas. However, fundamental concerns remain about robustness. In particular, results can be sensitive to the choice of kernel representing the within-cluster data density. Leveraging on ...

Bayesian Graphical Models for Multivariate Functional Data
Nov 15 2014, Jan 05 2016
Graphical models express conditional independence relationships among variables. Although methods for vector-valued data are well established, functional data graphical models remain underdeveloped. We introduce a notion of conditional independence between ...

Scalable Bayes via Barycenter in Wasserstein Space
Aug 24 2015, Sep 22 2015
We propose a novel approach WASP for Bayesian inference when massive size of the data prohibits posterior computations. WASP is estimated in three steps. First, data are divided into smaller computationally tractable subsets. Second, posterior draws of ...

Bayesian Conditional Density Filtering
Jan 15 2014, Sep 22 2015
We propose a Conditional Density Filtering (C-DF) algorithm for efficient online Bayesian inference. C-DF adapts MCMC sampling to the online setting, sampling from approximations to conditional posterior distributions obtained by propagating surrogate ...

Bayesian Consensus Clustering
Feb 28 2013
The task of clustering a set of objects based on multiple sources of data arises in several modern applications. We propose an integrative statistical model that permits a separate clustering of the objects for each data source. These separate clusterings ...

On the consistency theory of high dimensional variable screening
Feb 24 2015, Jun 06 2015
Variable screening is a fast dimension reduction technique for assisting high dimensional feature selection. As a preselection method, it selects a moderate size subset of candidate variables for further refining via feature selection to produce the final ...

Bayesian Sparse Linear Regression with Unknown Symmetric Error
Aug 06 2016, Mar 22 2019
We study full Bayesian procedures for sparse linear regression when errors have a symmetric but otherwise unknown distribution. The unknown error distribution is endowed with a symmetrized Dirichlet process mixture of Gaussians. For the prior on regression ...

Bayesian Network-Response Regression
Jun 02 2016
There is an increasing interest in learning how human brain networks vary with continuous traits (e.g., personality, cognitive abilities, neurological disorders), but flexible procedures to accomplish this goal are limited. We develop a Bayesian semiparametric ...

Bayesian Tensor Regression
Sep 22 2015
This article proposes a Bayesian approach to regression with a scalar response against vector and tensor covariates. Tensor covariates are commonly vectorized prior to analysis, failing to exploit the structure of the tensor, and resulting in poor estimation ...

Robust Bayesian inference via coarsening
Jun 19 2015
The standard approach to Bayesian inference is based on the assumption that the distribution of the data belongs to the chosen model class. However, even a small violation of this assumption can have a large impact on the outcome of a Bayesian procedure. ...

Generalized Beta Mixtures of Gaussians
Jul 25 2011, Mar 13 2012
In recent years, a rich variety of shrinkage priors have been proposed that have great promise in addressing massive regression problems. In general, these new priors can be expressed as scale mixtures of normals, but have more complex forms and better ...

Bayesian learning of joint distributions of objects
Mar 03 2013
There is increasing interest in broad application areas in defining flexible joint models for data having a variety of measurement scales, while also allowing data of complex types, such as functions, images and documents. We consider a general framework ...

Nonparametric Bayes Models of Fiber Curves Connecting Brain Regions
Dec 03 2016
In studying structural inter-connections in the human brain, it is common to first estimate fiber bundles connecting different regions of the brain relying on diffusion MRI. These fiber bundles act as highways for neural activity and communication, snaking ...

Simple, Scalable and Accurate Posterior Interval Estimation
May 13 2016, Dec 24 2016
There is a lack of simple and scalable algorithms for uncertainty quantification. Bayesian methods quantify uncertainty through posterior and predictive distributions, but it is difficult to rapidly estimate summaries of these distributions, such as quantiles ...

Path Following and Empirical Bayes Model Selection for Sparse Regression
Jan 17 2012
In recent years, a rich variety of regularization procedures have been proposed for high dimensional regression problems. However, tuning parameter choice and computational efficiency in ultra-high dimensional problems remain vexing issues. The routine ...

Repulsive Mixtures
Apr 24 2012, Sep 20 2012
Discrete mixture models are routinely used for density estimation and clustering. While conducting inferences on the cluster-specific parameters, current frequentist and Bayesian methods often encounter problems when clusters are placed too close together ...

On Posterior Consistency of Tail Index for Bayesian Kernel Mixture Models
Nov 09 2015, Apr 18 2018
Asymptotic theory of tail index estimation has been studied extensively in the frequentist literature on extreme values, but rarely in the Bayesian context. We investigate whether popular Bayesian kernel mixture models are able to support heavy tailed ...

Bayesian hierarchical modeling of simply connected 2D shapes
Jan 08 2012
Models for distributions of shapes contained within images can be widely used in biomedical applications ranging from tumor tracking for targeted radiation therapy to classifying cells in a blood sample. Our focus is on hierarchical probability models ...

Posterior convergence rates in non-linear latent variable models
Sep 23 2011
Non-linear latent variable models have become increasingly popular in a variety of applications. However, there has been little study on theoretical properties of these models. In this article, we study rates of posterior contraction in univariate density ...

Marginally Specified Priors for Nonparametric Bayesian Estimation
Apr 29 2012
Prior specification for nonparametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. Realistically, a statistician is unlikely to have informed opinions about all aspects ...

Learning Densities Conditional on Many Interacting Features
Apr 26 2013, Apr 29 2013
Learning a distribution conditional on a set of discrete-valued features is a commonly encountered task. This becomes more challenging with a high-dimensional feature set when there is the possibility of interaction between the features. In addition, ...

Tensor decompositions and sparse log-linear models
Apr 01 2014
Contingency table analysis routinely relies on log linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a low rank tensor factorization of the probability mass function for multivariate categorical ...

Fast moment estimation for generalized latent Dirichlet models
Mar 17 2016, Mar 23 2016
We develop a generalized method of moments (GMM) approach for fast parameter estimation in a new class of Dirichlet latent variable models with mixed data types. Parameter estimation via GMM has been demonstrated to have computational and statistical ...

Clustering-Enhanced Stochastic Gradient MCMC for Hidden Markov Models with Rare States
Oct 31 2018
MCMC algorithms for hidden Markov models, which often rely on the forward-backward sampler, suffer with large sample size due to the temporal dependence inherent in the data. Recently, a number of approaches have been developed for posterior inference ...

Extrinsic local regression on manifold-valued data
Aug 10 2015
We propose an extrinsic regression framework for modeling data with manifold valued responses and Euclidean predictors. Regression with manifold responses has wide applications in shape analysis, neuroscience, medical imaging and many other areas. Our ...

Expandable Factor Analysis
Jul 04 2014, Dec 09 2014
Bayesian sparse factor models have proven useful for characterizing dependence, but scaling computation to high dimensions is problematic. We propose expandable factor analysis for scalable estimation. The method relies on a novel multiscale generalized ...

Theoretical Limits of Record Linkage and Microclustering
Mar 15 2017
There has been substantial recent interest in record linkage, attempting to group the records pertaining to the same entities from a large database lacking unique identifiers. This can be viewed as a type of "microclustering," with few observations per ...

Bayesian Local Extrema Splines
Apr 04 2016
We consider the problem of shape restricted nonparametric regression on a closed set $X \subset R$, where it is reasonable to assume the function has no more than $H$ local extrema interior to $X$. Following a Bayesian approach we develop a nonparametric prior ...

Scalable Approximations of Marginal Posteriors in Variable Selection
Jun 22 2015
In many contexts, there is interest in selecting the most important variables from a very large collection, commonly referred to as support recovery or variable, feature or subset selection. There is an enormous literature proposing a rich variety of ...

Nonparametric Bayes Modeling of Populations of Networks
Jun 30 2014, Jun 05 2016
Replicated network data are increasingly available in many research fields. In connectomic applications, inter-connections among brain regions are collected for each patient under study, motivating statistical models which can flexibly characterize the ...

Expandable Factor Analysis
Jul 04 2014, Jun 20 2018
Bayesian sparse factor models have proven useful for characterizing dependence in multivariate data, but scaling computation to large numbers of samples and dimensions is problematic. We propose expandable factor analysis for scalable inference in factor ...

Bayesian inference for Matérn repulsive processes
Aug 05 2013, Apr 03 2015
In many applications involving point pattern data, the Poisson process assumption is unrealistic, with the data exhibiting a more regular spread. Such a repulsion between events is exhibited by trees for example, because of competition for light and nutrients. ...

Bayesian shrinkage
Dec 25 2012
Penalized regression methods, such as $L_1$ regularization, are routinely used in high-dimensional applications, and there is a rich literature on optimality properties under sparsity assumptions. In the Bayesian paradigm, sparsity is routinely induced ...

Multivariate mixed membership modeling: Inferring domain-specific risk profiles
Jan 16 2019, Apr 28 2019
Characterizing shared membership of individuals in two or more categories of a classification scheme poses severe interpretability problems when the number of categories is large (e.g. greater than six). Mixed membership models quantify this phenomenon, ...

Robust and Scalable Bayes via a Median of Subset Posterior Measures
Mar 11 2014, Jun 02 2016
We propose a novel approach to Bayesian analysis that is provably robust to outliers in the data and often has computational advantages over standard methods. Our technique is based on splitting the data into non-overlapping subgroups, evaluating the ...

Inefficiency of Data Augmentation for Large Sample Imbalanced Data
May 19 2016
Many modern applications collect large sample size and highly imbalanced categorical data, with some categories being relatively rare. Bayesian hierarchical models are well motivated in such settings in providing an approach to borrow information to combat ...

Bayesian modeling of networks in complex business intelligence problems
Oct 02 2015, Mar 28 2016
Complex network data problems are increasingly common in many fields of application. Our motivation is drawn from strategic marketing studies monitoring customer choices of specific products, along with co-subscription networks encoding multiple purchasing ...

Bayesian time-aligned factor analysis of paired multivariate time series
Apr 27 2019
Many modern data sets require inference methods that can estimate the shared and individual-specific components of variability in collections of matrices that change over time. Promising methods have been developed to analyze these types of data in static ...

MCMC for Imbalanced Categorical Data
May 19 2016, Jun 26 2017
Many modern applications collect highly imbalanced categorical data, with some categories relatively rare. Bayesian hierarchical models combat data sparsity by borrowing information, while also quantifying uncertainty. However, posterior computation presents ...

Exploiting Big Data in Logistics Risk Assessment via Bayesian Nonparametrics
Jan 21 2015, Jul 21 2016
In cargo logistics, a key performance measure is transport risk, defined as the deviation of the actual arrival time from the planned arrival time. Neither earliness nor tardiness is desirable for customer and freight forwarders. In this paper, we investigate ...

Generalized Admixture Mapping for Complex Traits
Nov 23 2011
Admixture mapping is a popular tool to identify regions of the genome associated with traits in a recently admixed population. Existing methods have been developed primarily for identification of a single locus influencing a dichotomous trait within a ...