Compound Dirichlet Processes
May 15 2019
The compound Poisson process and the Dirichlet process are the pillar structures of renewal theory and Bayesian nonparametric theory, respectively. Both processes have many useful extensions to fulfill the practitioners' needs to model the particularities ...

Recursive density estimators based on Robbins-Monro's scheme and using Bernstein polynomials
Apr 14 2019
In this paper, we consider the alleviation of the boundary problem when the probability density function has bounded support. We apply Robbins-Monro's algorithm and Bernstein polynomials to construct a recursive density estimator. We study the asymptotic ...

Infill asymptotics and bandwidth selection for kernel estimators of spatial intensity functions
Apr 10 2019
We investigate the asymptotic mean squared error of kernel estimators of the intensity function of a spatial point process. We show that when $n$ independent copies of a point process in $\mathbb R^d$ are superposed, the optimal bandwidth $h_n$ is of ...

Jaccard/Tanimoto similarity test and estimation methods
Mar 27 2019
Binary data are used in a broad area of biological sciences. Using binary presence-absence data, we can evaluate species co-occurrences that help elucidate relationships among organisms and environments. To summarize similarity between occurrences of ...

Deterministic bootstrapping for a class of bootstrap methods
Mar 26 2019, Apr 09 2019
An algorithm is described that enables efficient deterministic approximate computation of the bootstrap distribution for any linear bootstrap method $T_n^*$, alleviating the need for repeated resampling from observations (resp. input-derived data). In ...

$β$-Divergence loss for the kernel density estimation with bias reduced
Mar 25 2019
Although nonparametric kernel density estimation with bias reduction is nowadays a standard technique in explorative data analysis, there is still a big dispute on how to assess the quality of the estimate and which choice of bandwidth is optimal. This ...

Adaptive importance sampling by kernel smoothing
Mar 20 2019, Mar 21 2019
A key determinant of the success of Monte Carlo simulation is the sampling policy, the sequence of distributions used to generate the particles, and allowing the sampling policy to evolve adaptively during the algorithm provides considerable improvement ...

Combining Model and Parameter Uncertainty in Bayesian Neural Networks
Mar 18 2019, Mar 20 2019
Bayesian neural networks (BNNs) have recently regained a significant amount of attention in the deep learning community due to the development of scalable approximate Bayesian inference techniques. There are several advantages of using a Bayesian approach: ...

High-dimensional nonparametric density estimation via symmetry and shape constraints
Mar 14 2019
We tackle the problem of high-dimensional nonparametric density estimation by taking the class of log-concave densities on $\mathbb{R}^p$ and incorporating within it symmetry assumptions, which facilitate scalable estimation algorithms and can mitigate ...

Shapley regressions: A framework for statistical inference on machine learning models
Mar 11 2019
Machine learning models often excel in the accuracy of their predictions but are opaque due to their non-linear and non-parametric structure. This makes statistical inference challenging and disqualifies them from many applications where model interpretability ...

Local differential privacy: Elbow effect in optimal density estimation and adaptation over Besov ellipsoids
Mar 05 2019
We address the problem of non-parametric density estimation under the additional constraint that only privatised data are allowed to be published and available for inference. For this purpose, we adopt a recent generalisation of classical minimax theory ...

A multinomial Asymptotic Representation of Zenga's Discrete Index, its Influence Function and Data-driven Applications
Mar 05 2019
In this paper, we consider the Zenga index, one of the most recent inequality indices. We keep the finite-valued original form and address the asymptotic theory. The asymptotic normality is established through a multinomial representation. The Influence ...

Linear Time Visualization and Search in Big Data using Pixellated Factor Space Mapping
Feb 27 2019
It is demonstrated how linear computational time and storage efficient approaches can be adopted when analyzing very large data sets. More importantly, interpretation is aided and furthermore, basic processing is easily supported. Such basic processing ...

Dualizing Le Cam's method, with applications to estimating the unseens
Feb 14 2019
One of the most commonly used techniques for proving statistical lower bounds, Le Cam's method, has been the method of choice for functional estimation. This paper aims at explaining the effectiveness of Le Cam's method from an optimization perspective. ...

Rate-optimal nonparametric estimation for random coefficient regression models
Feb 14 2019
Random coefficient regression models are a popular tool for analyzing unobserved heterogeneity, and have seen renewed interest in the recent econometric literature. In this paper we obtain the optimal pointwise convergence rate for estimating the density ...

Estimation of smooth densities in Wasserstein distance
Feb 05 2019, Feb 06 2019
The Wasserstein distances are a set of metrics on probability distributions supported on $\mathbb{R}^d$ with applications throughout statistics and machine learning. Often, such distances are used in the context of variational problems, in which the statistician ...

Local minimax rates for closeness testing of discrete distributions
Feb 01 2019
We consider the closeness testing (or two-sample testing) problem in the Poisson vector model, which is known to be asymptotically equivalent to the model of multinomial distributions. The goal is to distinguish whether two data samples are drawn from ...

Neural eliminators and classifiers
Jan 28 2019
Classification may not be reliable for several reasons: noise in the data, insufficient input information, overlapping distributions and sharp definition of classes. Faced with several possibilities, a neural network may in such cases still be useful if ...

Bayesian Inference for Persistent Homology
Jan 07 2019
Persistence diagrams offer a way to summarize topological and geometric properties latent in datasets. While several methods have been developed that utilize persistence diagrams in statistical inference, a full Bayesian treatment remains absent. This ...

Adaptation in multivariate log-concave density estimation
Dec 30 2018
We study the adaptation properties of the multivariate log-concave maximum likelihood estimator over two subclasses of log-concave densities. The first consists of densities with polyhedral support whose logarithms are piecewise affine. The complexity ...

Local Estimation of a Multivariate Density and its Derivatives
Dec 21 2018
We present methods for estimating the multivariate probability density (or the $\log$-density) and its first and second order derivatives simultaneously. Two methods, local log-likelihood and Hyvärinen score estimation, are in terms of weighted scoring ...

Limitations Of Richardson Extrapolation For Kernel Density Estimation
Dec 20 2018
This paper develops the process of using Richardson Extrapolation to improve the Kernel Density Estimation method, resulting in a more accurate (lower Mean Squared Error) estimate of a probability density function for a distribution of data in $R_d$ given ...

Spectral Gaps for Reversible Markov Processes with Chaotic Invariant Measures: The Kac Process with Hard Sphere Collisions in Three Dimensions
Dec 10 2018
We develop a method for producing estimates on the spectral gaps of reversible Markov jump processes with chaotic invariant measures, and we apply it to prove the Kac conjecture for hard sphere collision in three dimensions.

Multiscale geometric feature extraction for high-dimensional and non-Euclidean data with application
Nov 26 2018, Dec 30 2018
A method for extracting multiscale geometric features from a data cloud is proposed and analyzed. The basic idea is to map each pair of data points into a real-valued feature function defined on $[0,1]$. The construction of these feature functions is ...

The value of forecasts: Quantifying the economic gains of accurate quarter-hourly electricity price forecasts
Nov 21 2018
We propose a multivariate elastic net regression forecast model for German quarter-hourly electricity spot markets. While the literature is diverse on day-ahead prediction approaches, both the intraday continuous and intraday call-auction prices have ...

Probability density function of SDEs with unbounded and path-dependent drift coefficient
Nov 17 2018
In this paper, we first prove the existence of a solution of SDEs under the assumptions that the drift coefficient is of linear growth and path-dependent, and the diffusion coefficient is bounded, uniformly elliptic and Hölder continuous. We apply ...

Regularized Maximum Likelihood Estimation and Feature Selection in Mixtures-of-Experts Models
Oct 29 2018
Mixture of Experts (MoE) are successful models for modeling heterogeneous data in many statistical learning problems including regression, clustering and classification. Generally fitted by maximum likelihood estimation via the well-known EM algorithm, ...

Signature moments to characterize laws of stochastic processes
Oct 25 2018
The normalized sequence of moments characterizes the law of any finite-dimensional random variable. We prove an analogous result for path-valued random variables, that is stochastic processes, by using the normalized sequence of signature moments. We ...

A minimax near-optimal algorithm for adaptive rejection sampling
Oct 22 2018
Rejection Sampling is a fundamental Monte-Carlo method. It is used to sample from distributions admitting a probability density function which can be evaluated exactly at any given point, albeit at a high computational cost. However, without proper tuning, ...

Density Deconvolution with Small Berkson Errors
Oct 16 2018
The present paper studies density deconvolution in the presence of small Berkson errors, in particular, when the variances of the errors tend to zero as the sample size grows. It is known that when the Berkson errors are present, in some cases, the unknown ...

Fire seasonality identification with multimodality tests
Oct 09 2018
Understanding the role of vegetation fires in the Earth system has become an important environmental problem. Although the timing of fire occurrence is mainly influenced by climate, human activity related to land use and management has altered fire patterns ...

Estimation of the weighted integrated square error of the Grenander estimator by the Kolmogorov-Smirnov statistic
Oct 08 2018
We consider in this paper the Grenander estimator of unbounded, in general, nonincreasing densities on the interval $[0, 1]$ without any smoothness assumptions. For a fixed number $n$ of i.i.d. random variables $X_1, X_2, \dots, X_n$ with values in $[0, 1]$ and the ...

Weak Convergence (IIA) - Functional and Random Aspects of the Univariate Extreme Value Theory
Oct 03 2018
The univariate extreme value theory deals with the convergence in type of powers of elements of sequences of cumulative distribution functions on the real line when the power index tends to infinity. In terms of convergence of random variables, this amounts ...

Kernel Density Estimation with Linked Boundary Conditions
Sep 20 2018, Apr 17 2019
Kernel density estimation on a finite interval poses an outstanding challenge because of the well-recognized boundary bias issues at the end-points. Motivated by an application of density estimation in biology, we consider a new type of boundary constraint, ...

Kernel Density Estimation with Linked Boundary Conditions
Sep 20 2018, Apr 18 2019
Kernel density estimation on a finite interval poses an outstanding challenge because of the well-recognized boundary bias issues at the end-points. Motivated by an application of density estimation in biology, we consider a new type of boundary constraint, ...

Quantile Regression for Qualifying Match of GEFCom2017 Probabilistic Load Forecasting
Sep 10 2018
We present a simple quantile regression-based forecasting method that was applied in a probabilistic load forecasting framework of the Global Energy Forecasting Competition 2017 (GEFCom2017). The hourly load data is log transformed and split into a long-term ...

Application of the metric data analysis method to social development indicators analysis
Sep 07 2018
The article contains a methodology for assessing social statistics. The significance of minorities (groups that differ in their attributes from the majority) has grown substantially in the modern postindustrial economy and society. In the multidimensional ...

Active set algorithms for estimating shape-constrained density ratios
Aug 28 2018, Sep 06 2018
We review and modify the active set algorithm by Duembgen et al. (2011) for nonparametric maximum-likelihood estimation of a log-concave density. This particular estimation problem is embedded into a more general framework including also the estimation ...

Adaptive optimal kernel density estimation for directional data
Aug 07 2018
We focus on the nonparametric density estimation problem with directional data. We propose a new rule for bandwidth selection for kernel density estimation. Our procedure is automatic, fully data-driven and adaptive to the smoothness degree of the density. ...

Concentration of scalar ergodic diffusions and some statistical implications
Jul 30 2018
We derive uniform concentration inequalities for continuous-time analogues of empirical processes and related stochastic integrals of scalar ergodic diffusion processes. Thereby, we lay the foundation typically required for the study of sup-norm properties ...

Kernel Density Estimation-Based Markov Models with Hidden State
Jul 30 2018
We consider Markov models of stochastic processes where the next-step conditional distribution is defined by a kernel density estimator (KDE), similar to Markov forecast densities and certain time-series bootstrap schemes. The KDE Markov models (KDE-MMs) ...

A Theil-like Class of Inequality Measures, its Asymptotic Normality Theory and Applications
Jul 20 2018
In this paper, we consider a coherent theory about the asymptotic representations for a family of inequality indices called Theil-Like Inequality Measures (TLIM), within a Gaussian field. The theory uses the functional empirical process approach. We provide ...

Density estimation by Randomized Quasi-Monte Carlo
Jul 16 2018, Aug 10 2018
We consider the problem of estimating the density of a random variable $X$ that can be sampled exactly by Monte Carlo (MC) simulation. We investigate the effectiveness of replacing MC by randomized quasi Monte Carlo (RQMC) to reduce the integrated variance ...

Weak dependence and GMM estimation of supOU and mixed moving average processes
Jul 16 2018, Dec 28 2018
We consider a mixed moving average (MMA) process X driven by a Lévy basis and prove that it is weakly dependent with rates computable in terms of the moving average kernel and the characteristic quadruple of the Lévy basis. Using this property, we ...

Resample-smoothing of Voronoi intensity estimators
Jul 06 2018
Voronoi intensity estimators, which are non-parametric estimators for intensity functions of point processes, are both parameter-free and adaptive; the intensity estimate at a given location is given by the reciprocal size of the Voronoi/Dirichlet cell ...

vsgoftest: An R Package for Goodness-of-Fit Testing Based on Kullback-Leibler Divergence
Jun 19 2018
The R-package vsgoftest performs goodness-of-fit (GOF) tests, based on Shannon entropy and Kullback-Leibler divergence, developed by Vasicek (1976) and Song (2002), of various classical families of distributions. The theoretical framework of the so-called ...

The Minimax Learning Rate of Normal and Ising Undirected Graphical Models
Jun 18 2018
Let $G$ be an undirected graph with $m$ edges and $d$ vertices. We show that $d$-dimensional Ising models on $G$ can be learned from $n$ i.i.d. samples within expected total variation distance some constant factor of $\min\{1, \sqrt{(m + d)/n}\}$, and ...

q-Space Novelty Detection with Variational Autoencoders
Jun 08 2018, Oct 25 2018
In machine learning, novelty detection is the task of identifying novel unseen data. During training, only samples from the normal class are available. Test samples are classified as normal or abnormal by assignment of a novelty score. Here we propose ...

Deep Bayesian regression models
Jun 06 2018, Jun 07 2018
Regression models are used for inference and prediction in a wide range of applications providing a powerful scientific tool for researchers and analysts from different fields. In many research fields the amount of available data as well as the number ...

Holographic Neural Architectures
Jun 04 2018
Representation learning is at the heart of what makes deep learning effective. In this work, we introduce a new framework for representation learning that we call "Holographic Neural Architectures" (HNAs). In the same way that an observer can experience ...

Efficient Bayesian Inference for a Gaussian Process Density Model
May 29 2018
We reconsider a nonparametric density model based on Gaussian processes. By augmenting the model with latent Pólya-Gamma random variables and a latent marked Poisson process we obtain a new likelihood which is conjugate to the model's Gaussian process ...

Strongly Consistent of Kullback-Leibler Divergence Estimator and Tests for Model Selection Based on a Bias Reduced Kernel Density Estimator
May 18 2018
In this paper, we study the strong consistency of a bias reduced kernel density estimator and derive a strongly consistent Kullback-Leibler divergence (KLD) estimator. As an application, we formulate a goodness-of-fit test and an asymptotically standard ...

Day-ahead electricity price forecasting with high-dimensional structures: Univariate vs. multivariate modeling frameworks
May 17 2018
We conduct an extensive empirical study on short-term electricity price forecasting (EPF) to address the long-standing question if the optimal model structure for EPF is univariate or multivariate. We provide evidence that despite a minor edge in predictive ...

Kernel and wavelet density estimators on manifolds and more general metric spaces
May 12 2018, Feb 09 2019
We consider the problem of estimating the density of observations taking values in classical or nonclassical spaces such as manifolds and more general metric spaces. Our setting is quite general but also sufficiently rich in allowing the development of ...

Fused Density Estimation: Theory and Methods
May 08 2018, Dec 04 2018
In this paper we introduce a method for nonparametric density estimation on geometric networks. We define fused density estimators as solutions to a total variation regularized maximum-likelihood density estimation problem. We provide theoretical support ...

Axiomatic Approach to Variable Kernel Density Estimation
May 04 2018
Variable kernel density estimation allows the approximation of a probability density by the mean of differently stretched and rotated kernels centered at given sampling points $y_n\in\mathbb{R}^d,\ n=1,\dots,N$. Up to now, the choice of the corresponding ...

Moderate deviations for the $L_1$-norm of kernel density estimators
May 02 2018
The rate of normal approximation for the integral norm of kernel density estimators is investigated in the case of densities with power-type singularities. The quantities from the formulations of published results by the author are estimated. By assumption, ...

Spatio-temporal Patterns of Indian Monsoon Rainfall
May 01 2018
The primary objective of this paper is to analyze a set of canonical spatial patterns that approximate the daily rainfall across the Indian region, as identified in the companion paper where we developed a discrete representation of the Indian summer ...

A Discrete View of the Indian Monsoon to Identify Spatial Patterns of Rainfall
May 01 2018
We propose a representation of the Indian summer monsoon rainfall in terms of a probabilistic model based on a Markov Random Field, consisting of discrete state variables representing low and high rainfall at grid-scale and daily rainfall patterns across ...

Data-driven regularization of Wasserstein barycenters with an application to multivariate density registration
Apr 24 2018, May 04 2019
We present a framework to simultaneously align and smooth data in the form of multiple point clouds sampled from unknown densities with support in a $d$-dimensional Euclidean space. This work is motivated by applications in bioinformatics where researchers ...

Data-driven regularization of Wasserstein barycenters with an application to multivariate density registration
Apr 24 2018, Jun 29 2018
We present a framework to simultaneously align and smooth data in the form of multiple point clouds sampled from unknown densities with support in a $d$-dimensional Euclidean space. This work is motivated by applications in bioinformatics where researchers ...

Clustering Analysis on Locally Asymptotically Self-similar Processes with Known Number of Clusters
Apr 13 2018, Nov 04 2018
We study the problems of clustering locally asymptotically self-similar stochastic processes, when the true number of clusters is known a priori. A new covariance-based dissimilarity measure is introduced, from which the so-called approximately asymptotically ...

Moving Beyond Sub-Gaussianity in High-Dimensional Statistics: Applications in Covariance Estimation and Linear Regression
Apr 08 2018, Jun 29 2018
Concentration inequalities form an essential toolkit in the study of high-dimensional statistical methods. Most of the relevant statistics literature is based on the assumptions of sub-Gaussian/sub-exponential random vectors. In this paper, we bring together ...

Complete monotonicity of multinomial probabilities and its application to Bernstein estimators on the simplex
Apr 06 2018, Jun 22 2018
Let $d\in \mathbb{N}$ and let $\gamma_i\in [0,\infty)$, $x_i\in (0,1)$ be such that $\sum_{i=1}^{d+1} \gamma_i = M\in (0,\infty)$ and $\sum_{i=1}^{d+1} x_i = 1$. We prove that \begin{equation*} a \mapsto \frac{\Gamma(aM + 1)}{\prod_{i=1}^{d+1} \Gamma(a ...

Smoothing-based tests with directional random variables
Apr 01 2018
Testing procedures for assessing specific parametric model forms, or for checking the plausibility of simplifying assumptions, play a central role in the mathematical treatment of the uncertain. No certain answers are obtained by testing methods, but ...

Adaptive nonparametric estimation for compound Poisson processes robust to the discrete-observation scheme
Mar 27 2018
A compound Poisson process whose jump measure and intensity are unknown is observed at finitely many equispaced times and a purely data-driven wavelet-type estimator of the Lévy density $\nu$ is constructed through the spectral approach. Assuming minimal ...

Adaptive nonparametric estimation for compound Poisson processes robust to the discrete-observation scheme
Mar 27 2018, Feb 08 2019
A compound Poisson process whose jump measure and intensity are unknown is observed at finitely many equispaced times. We construct a purely data-driven estimator of the Lévy density $\nu$ through the spectral approach using general Calderón-Zygmund ...

Cluster analysis of stocks using price movements of high frequency data from National Stock Exchange
Mar 26 2018
This paper aims to develop new techniques to describe the joint behavior of stocks, beyond regression and correlation. For example, we want to identify the clusters of stocks that move together. Our work is based on applying Kernel Principal Component ...

Potential Quality Improvement of Stochastic Optical Localization Nanoscopy Images Obtained by Frame-by-Frame Localization Algorithms
Mar 14 2018, Jan 17 2019
A data movie of stochastic optical localization nanoscopy contains spatial and temporal correlations, both providing information about emitter locations. The majority of localization algorithms in the literature estimate emitter locations by frame-by-frame ...

City-wide Analysis of Electronic Health Records Reveals Gender and Age Biases in the Administration of Known Drug-Drug Interactions
Mar 09 2018, Feb 20 2019
The occurrence of drug-drug-interactions (DDI) from multiple drug dispensations is a serious problem, both for individuals and health-care systems, since patients with complications due to DDI are likely to re-enter the system at a costlier level. We ...

City-wide Analysis of Electronic Health Records Reveals Gender and Age Biases in the Administration of Known Drug-Drug Interactions
Mar 09 2018, Dec 07 2018
The occurrence of drug-drug-interactions (DDI) from multiple drug prescriptions is a serious problem, both for individuals and health-care systems, since patients with complications due to DDI are likely to re-enter the system at a costlier level. We ...

Nonparametric Estimation of Probability Density Functions of Random Persistence Diagrams
Mar 07 2018, Mar 13 2018
We introduce a nonparametric way to estimate the global probability density function for a random persistence diagram. Precisely, a kernel density function centered at a given persistence diagram and a given bandwidth is constructed. Our approach encapsulates ...

Finite sample improvement of Akaike's Information Criterion
Mar 06 2018, Jul 20 2018
We emphasize that it is possible to improve the principle of unbiased risk estimation for model selection by addressing excess risk deviations in the design of penalization procedures. Indeed, we propose a modification of Akaike's Information Criterion ...

Asymptotic nonequivalence of density estimation and Gaussian white noise for small densities
Feb 09 2018, Nov 06 2018
It is well-known that density estimation on the unit interval is asymptotically equivalent to a Gaussian white noise experiment, provided the densities are sufficiently smooth and uniformly bounded away from zero. We show that a uniform lower bound, whose ...

Bayesian Modeling via Goodness-of-fit
Feb 01 2018, Apr 16 2018
The two key issues of modern Bayesian statistics are: (i) establishing a principled approach for distilling a statistical prior that is consistent with the given data from an initial believable scientific prior; and (ii) development of a Bayes-frequentist ...

Covariance-based Dissimilarity Measures Applied to Clustering Wide-sense Stationary Ergodic Processes
Jan 27 2018, Jan 16 2019
We introduce a new unsupervised learning problem: clustering wide-sense stationary ergodic stochastic processes. A covariance-based dissimilarity measure together with asymptotically consistent algorithms is designed for clustering offline and online ...

Strong-consistent autoregressive predictors in abstract Banach spaces
Jan 26 2018
This work derives new results on the strong consistency of a componentwise estimator of the autocorrelation operator, and its associated plug-in predictor, in the context of autoregressive processes of order one, in a real separable Banach space $B$ (ARB(1) ...

Strongly consistent autoregressive predictors in abstract Banach spaces
Jan 26 2018, Sep 05 2018
This work derives new results on strongly consistent estimation and prediction for autoregressive processes of order 1 in a separable Banach space B. The consistency results are obtained for the componentwise estimator of the autocorrelation operator in ...

Bivariate density estimation using normal-gamma kernel with application to astronomy
Jan 25 2018, Jul 06 2018
We consider the problem of estimation of a bivariate density function with support $\Re\times[0,\infty)$, where a classical bivariate kernel estimator causes boundary bias due to the non-negative variable. To overcome this problem, we propose four kernel ...

Embedded Model Error Representation for Bayesian Model Calibration
Jan 21 2018, Feb 12 2019
Model error estimation remains one of the key challenges in uncertainty quantification and predictive science. For computational models of complex physical systems, model error, also known as structural error or model inadequacy, is often the largest ...

Wave function representation of probability distributions
Dec 21 2017, Jan 05 2018
Orthogonal decomposition of the square root of a probability density function in the Hermite basis is a useful low-dimensional parameterization of continuous probability distributions over the reals. This representation is formally similar to the representation ...

Fast and stable multivariate kernel density estimation by fast sum updating
Dec 04 2017, Oct 22 2018
Kernel density estimation and kernel regression are powerful but computationally expensive techniques: a direct evaluation of kernel density estimates at $M$ evaluation points given $N$ input sample points requires a quadratic $\mathcal{O}(MN)$ operations, ...

Central limit theorem for the variable bandwidth kernel density estimators
Dec 02 2017
In this paper we study the ideal variable bandwidth kernel density estimator introduced by McKay (1993) and Jones, McKay and Hu (1994) and the plug-in practical version of the variable bandwidth kernel estimator with two sequences of bandwidths as in ...

Order-Sensitivity and Equivariance of Scoring Functions
Nov 27 2017
The relative performance of competing point forecasts is usually measured in terms of loss or scoring functions. It is widely accepted that these scoring functions should be strictly consistent in the sense that the expected score is minimized by the correctly ...

Robust Bayes-Like Estimation: Rho-Bayes estimation
Nov 22 2017
We consider the problem of estimating the joint distribution $P$ of $n$ independent random variables within the Bayes paradigm from a non-asymptotic point of view. Assuming that $P$ admits some density $s$ with respect to a given reference measure, we ...

The Dispersion Bias
Nov 15 2017, Feb 15 2018
Estimation error has plagued quantitative finance since Harry Markowitz launched modern portfolio theory in 1952. Using random matrix theory, we characterize a source of bias in the sample eigenvectors of financial covariance matrices. Unchecked, the ...

Improved Density and Distribution Function Estimation
Nov 13 2017, Jun 20 2018
Given additional distributional information in the form of moment restrictions, kernel density and distribution function estimators with implied generalised empirical likelihood probabilities as weights achieve a reduction in variance due to the systematic ...

Loglinear model selection and human mobility
Nov 07 2017
Methods for selecting loglinear models were among Steve Fienberg's research interests since the start of his long and fruitful career. After we dwell upon the string of papers focusing on loglinear models that can be partly attributed to Steve's contributions ...

Nonparametric estimation of the fragmentation kernel based on a PDE stationary distribution approximation
Oct 25 2017, Apr 11 2019
We consider a stochastic individual-based model in continuous time to describe a size-structured population for cell divisions. This model is motivated by the detection of cellular aging in biology. We address here the problem of nonparametric estimation ...

Asymptotic properties and approximation of Bayesian logspline density estimators for communication-free parallel computing methods
Oct 25 2017, May 10 2018
In this article we perform an asymptotic analysis of Bayesian parallel density estimators which are based on logspline density estimation. The parallel estimator we introduce is in the spirit of a kernel density estimator introduced in recent studies. ...

Conformal predictive distributions with kernels
Oct 24 2017
This paper reviews the checkered history of predictive distributions in statistics and discusses two developments, one from recent literature and the other new. The first development is bringing predictive distributions into machine learning, whose early ...

Structural Variability from Noisy Tomographic Projections
Oct 23 2017, Feb 07 2018
In cryo-electron microscopy, the 3D electric potentials of an ensemble of molecules are projected along arbitrary viewing directions to yield noisy 2D images. The volume maps representing these potentials typically exhibit a great deal of structural variability, ...

Early stopping for statistical inverse problems via truncated SVD estimation
Oct 19 2017, Sep 07 2018
We consider truncated SVD (or spectral cut-off, projection) estimators for a prototypical statistical inverse problem in dimension $D$. Since calculating the singular value decomposition (SVD) only for the largest singular values is much less costly than ...

Modal Regression using Kernel Density Estimation: a Review
Oct 19 2017, Dec 07 2017
We review recent advances in modal regression studies using kernel density estimation. Modal regression is an alternative approach for investigating the relationship between a response variable and its covariates. Specifically, modal regression summarizes ...

A Bayesian hierarchical model for related densities using Polya trees
Oct 04 2017, Mar 06 2018
Bayesian hierarchical models are used to share information between related samples and obtain more accurate estimates of sample-level parameters, common structure, and variation between samples. When the parameter of interest is the distribution or density ...

On minimax nonparametric estimation of signal in Gaussian noise
Oct 02 2017, Nov 05 2017
For the problem of nonparametric estimation of signal in Gaussian noise we point out the strong asymptotically minimax estimators on maxisets for linear estimators (see \cite{ker93,rio}). It turns out that the order of rates of convergence of Pinsker ...