Spatial and Spatiotemporal GARCH Models -- A Unified Approach. Aug 22 2019. In time-series analyses and particularly in finance, generalised autoregressive conditional heteroscedasticity (GARCH) models are widely applied statistical tools for modelling volatility clusters (i.e. periods of increased or decreased risks). In contrast, ...

There is no Reliable Way to Detect Hacked Ballot-Marking Devices. Aug 21 2019. Election system vendors are marketing ballot-marking devices (BMDs) as a universal system, and some states are deploying them for all voters, not just those who need a BMD to vote independently. Like all devices with CPUs, BMDs can be hacked, misprogrammed, ...

Forecasting e-scooter competition with direct and access trips by mode and distance in New York City. Aug 21 2019. Given the lack of demand forecasting models for e-scooter sharing systems, we address this research gap using data from Portland, OR, and New York City. A log-log regression model is estimated for e-scooter trips based on user age, income, labor force ...

An Email Experiment to Identify the Effect of Racial Discrimination on Access to Lawyers: A Statistical Approach. Aug 21 2019. We consider the problem of conducting an experiment to study the prevalence of racial bias against individuals seeking legal assistance, in particular whether lawyers use clues about a potential client's race in deciding whether to reply to e-mail requests ...

Statistical approaches using longitudinal biomarkers for disease early detection: A comparison of methodologies. Aug 21 2019. Early detection of clinical outcomes such as cancer may be predicted based on longitudinal biomarker measurements. Tracking longitudinal biomarkers as a way to identify early disease onset may help to reduce mortality from diseases like ovarian cancer ...

Efficient and powerful equivalency test on combined mean and variance with application to diagnostic device comparison studies. Aug 21 2019. In medical device comparison studies, an equivalency test is commonly used to demonstrate that two measurement methods agree up to a pre-specified performance goal based on paired repeated measures. Such an equivalency test often involves controlling the absolute ...

Clustering Longitudinal Life-Course Sequences using Mixtures of Exponential-Distance Models. Aug 21 2019. Sequence analysis is an increasingly popular approach for the analysis of life-courses represented by categorical sequences, i.e. as the ordered collection of activities experienced by subjects over a given time period. Several criteria have been introduced ...

Importance of spatial predictor variable selection in machine learning applications -- Moving from data reproduction to spatial prediction. Aug 21 2019. Machine learning algorithms find frequent application in spatial prediction of biotic and abiotic environmental variables. However, the characteristics of spatial data, especially spatial autocorrelation, are widely ignored. We hypothesize that this is ...

Beyond p-values: a phase II dual-criterion design with statistical significance and clinical relevance. Aug 21 2019. Background: Well-designed phase II trials must have acceptable error rates relative to a pre-specified success criterion, usually a statistically significant p-value. Such standard designs may not always suffice from a clinical perspective because clinical ...

Risk-Efficient Bayesian Data Synthesis for Privacy Protection. Aug 20 2019. High-utility, low-risk synthetic data facilitate microdata dissemination by statistical agencies. In previous work, we induced privacy protection into any Bayesian data synthesis model by employing a pseudo posterior likelihood that exponentiates ...

Bayesian Hierarchical Factor Regression Models to Infer Cause of Death From Verbal Autopsy Data. Aug 20 2019. In low-resource settings where vital registration of death is not routine it is often of critical interest to determine and study the cause of death (COD) for individuals and the cause-specific mortality fraction (CSMF) for populations. Post-mortem autopsies, ...

Social media usage reveals how regions recover after natural disaster. Aug 20 2019. The challenge of nowcasting and forecasting the effect of natural disasters (e.g. earthquakes, floods, hurricanes) on assets, people and society is of primary importance for assessing the ability of such systems to recover from extreme events. Traditional ...

Indoor Navigation Using Information From A Map And A Rangefinder. Aug 20 2019. The problem of indoor navigation of mobile objects, using a map and measurements of distances to the walls, is considered. A nonlinear filtering problem aimed at calculating the optimal estimate, in the root-mean-square sense, of the sought parameters is formulated ...

Bayesian leveraging of historical control data for a clinical trial with time-to-event endpoint. Aug 20 2019. The recent 21st Century Cures Act propagates innovations to accelerate the discovery, development, and delivery of 21st century cures. It includes the broader application of Bayesian statistics and the use of evidence from clinical expertise. An example ...

Counterfactual Distribution Regression for Structured Inference. Aug 20 2019. We consider problems in which a system receives external perturbations from time to time. For instance, the system can be a train network in which particular lines are repeatedly disrupted without warning, having an effect on passenger behavior. ...

$L_1$ Trend Filtering: A Modern Statistical Tool for Time-Domain Astronomy and Astronomical Spectroscopy. Aug 20 2019. The problem of estimating a one-dimensional signal possessing mixed degrees of smoothness is ubiquitous in time-domain astronomy and astronomical spectroscopy. For example, in the time domain, an astronomical object may exhibit a smoothly varying intensity ...

Robust Design and Analysis of Clinical Trials With Non-proportional Hazards: A Straw Man Guidance from a Cross-pharma Working Group. Aug 20 2019. Loss of power and clear description of treatment differences are key issues in designing and analyzing a clinical trial where non-proportional hazard is a possibility. A log-rank test may be very inefficient and interpretation of the hazard ratio estimated ...

Alliances and Conflict, or Conflict and Alliances? Appraising the Causal Effect of Alliances on Conflict. Aug 19 2019. The deterrent effect of military alliances is well documented and widely accepted. However, such work has typically assumed that alliances are exogenous. This is problematic as alliances may simultaneously influence the probability of conflict and be ...

Issues arising from benchmarking single-cell RNA sequencing imputation methods. Aug 19 2019. On June 25th, 2018, Huang et al. published a computational method, SAVER, in Nature Methods for imputing dropout gene expression levels in single-cell RNA sequencing (scRNA-seq) data. Huang et al. performed a set of comprehensive benchmarking analyses, ...

Evaluating Hierarchies through A Partially Observable Markov Decision Processes Methodology. Aug 19 2019. Hierarchical clustering has been shown to be valuable in many scenarios, e.g. catalogues, biology research, image processing, and so on. Despite its usefulness in many situations, there is no agreed methodology on how to properly evaluate the hierarchies ...

A Parametric Bootstrap for the Mean Measure of Divergence. Aug 19 2019. For more than 50 years the Mean Measure of Divergence (MMD) has been one of the most prominent tools used in anthropology for the study of non-metric traits. However, one of the problems in anthropology, including palaeoanthropology (more often ...

Bayesian models for survival data of clinical trials: Comparison of implementations using R software. Aug 19 2019; Aug 20 2019. Objective: To provide guidance for the use of the main functions available in R for performing post hoc Bayesian analysis of a randomized clinical trial with a survival endpoint using proportional hazard models. Study Design and Setting: Data derived ...

An Overview of Statistical Data Analysis. Aug 19 2019. The use of statistical software in academia and enterprises has been evolving over the last years. More often than not, students, professors, workers, and users, in general, have all had, at some point, exposure to statistical software. Sometimes, difficulties ...

Hierarchical Bayesian Operational Modal Analysis: Theory and Computations. Aug 18 2019. This paper presents a hierarchical Bayesian modeling framework for the uncertainty quantification in modal identification of linear dynamical systems using multiple vibration data sets. This novel framework integrates the state-of-the-art Bayesian formulations ...

Decline of COPD exacerbations in clinical trials over two decades -- a systematic review and meta-regression. Aug 17 2019. BACKGROUND: An important goal of chronic obstructive pulmonary disease (COPD) treatment is to reduce the frequency of exacerbations. Some observations suggest a decline in exacerbation rates in clinical trials over time. A more systematic understanding ...

Measuring international uncertainty using global vector autoregressions with drifting parameters. Aug 17 2019. This paper investigates the time-varying impacts of international macroeconomic uncertainty shocks. We use a global vector autoregressive (GVAR) specification with drifting coefficients and factor stochastic volatility in the errors to model six economies ...

Onset detection: A new approach to QBH system. Aug 17 2019. Query by Humming (QBH) is a system that provides a user with the song(s) that the user hums to it. Current QBH methods require the extraction of onset and pitch information in order to track similarity with various versions of different songs. ...

An Exploratory Analysis of the Latent Structure of Process Data via Action Sequence Autoencoder. Aug 16 2019. Computer simulations have become a popular tool for assessing complex skills such as problem-solving skills. Log files of computer-based items record the entire human-computer interactive processes for each respondent. The response processes are very diverse, ...

Stochastic Comparisons of Series and Parallel Systems with Topp-Leone Generated Family of Distributions. Aug 16 2019. In this article, we stochastically compare series and parallel systems having the Topp-Leone generated family of distributions. We consider that the lifetimes of the components of the systems have either the different shape parameters when the scale parameters ...

Selection of Exponential-Family Random Graph Models via Held-Out Predictive Evaluation (HOPE). Aug 16 2019. Statistical models for networks with complex dependencies pose particular challenges for model selection and evaluation. In particular, many well-established statistical tools for selecting between models assume conditional independence of observations ...

Isotonic regression discontinuity designs. Aug 15 2019. In isotonic regression discontinuity designs, the average outcome and the treatment assignment probability are monotone in the running variable. We introduce novel nonparametric estimators for sharp and fuzzy designs based on the bandwidth-free isotonic ...

A Bayesian Joint Model for Spatial Point Processes with Application to Basketball Shot Chart. Aug 15 2019. The success rate of a basketball shot may be higher at locations where a player makes more shots. In a marked spatial point process model, this means that the marks are dependent on the intensity of the process. We develop a Bayesian joint model of the ...

Analyzing the Fine Structure of Distributions. Aug 15 2019. One aim of data mining is the identification of interesting structures in data. Basic properties of the empirical distribution, such as skewness and possible clipping, i.e., hard limits in value ranges, need to be assessed. Of particular interest is ...

Learning Signal Subgraphs from Longitudinal Brain Networks with Symmetric Bilinear Logistic Regression. Aug 15 2019. Modern neuroimaging technologies, combined with state-of-the-art data processing pipelines, have made it possible to collect longitudinal observations of an individual's brain connectome at different ages. It is of substantial scientific interest to study ...

With Malice Towards None: Assessing Uncertainty via Equalized Coverage. Aug 15 2019. An important factor to guarantee a fair use of data-driven recommendation systems is that we should be able to communicate their uncertainty to decision makers. This can be accomplished by constructing prediction intervals, which provide an intuitive ...

Uplift Modeling for Multiple Treatments with Cost Optimization. Aug 14 2019. Uplift modeling is an emerging machine learning approach for estimating the treatment effect at an individual or subgroup level. It can be used for optimizing the performance of interventions such as marketing campaigns and product designs. Uplift modeling ...

A hierarchical model for estimating exposure-response curves from multiple studies. Aug 14 2019. Cookstove replacement trials have found mixed results on their impact on respiratory health. The limited range of concentrations and small sample sizes of individual studies are important factors that may be limiting their statistical power. We present ...

Mixed pooling of seasonality in time series pallet forecasting. Aug 14 2019. Multiple seasonal patterns play a key role in time series forecasting, especially for business time series where seasonal effects are often dramatic. Previous approaches including Fourier decomposition, exponential smoothing, and seasonal autoregressive ...

Robust parametric modeling of Alzheimer's disease progression. Aug 14 2019. Quantitative characterization of disease progression using longitudinal data can provide long-term predictions for the pathological stages of individuals. This work studies robust modeling of Alzheimer's disease progression using parametric methods. The ...

Maize Yield and Nitrate Loss Prediction with Machine Learning Algorithms. Aug 14 2019; Aug 20 2019. Pre-season prediction of crop production outcomes such as grain yields and N losses can provide insights to stakeholders when making decisions. Simulation models can assist in scenario planning, but their use is limited because of data requirements and ...

Borrowing of information across patient subgroups in a basket trial based on distributional discrepancy. Aug 14 2019. Basket trials emerge as a new class of efficient approaches to evaluate a treatment in several patient subgroups simultaneously. In this paper, we develop a novel analysis methodology for early phase basket trials, which enables borrowing of information ...

Risk-Limiting Tallies. Aug 14 2019. Many voter-verifiable, coercion-resistant schemes have been proposed, but even the most carefully designed systems necessarily leak information via the announced result. In corner cases, this may be problematic. For example, if all the votes go to one ...

Multilevel and multifidelity uncertainty quantification for cardiovascular hemodynamics. Aug 13 2019. Standard approaches for uncertainty quantification (UQ) in cardiovascular modeling pose challenges due to the large number of uncertain inputs and the significant computational cost of realistic 3D simulations. We propose an efficient UQ framework utilizing ...

Optimal Estimation of Generalized Average Treatment Effects using Kernel Optimal Matching. Aug 13 2019. In causal inference, a variety of causal effect estimands have been studied, including the sample, uncensored, target, conditional, optimal subpopulation, and optimal weighted average treatment effects. Ad-hoc methods have been developed for each estimand ...

Blinded sample size re-estimation in equivalence testing. Aug 13 2019. This paper investigates type I error violations that occur when blinded sample size reviews are applied in equivalence testing. We give a derivation which explains why such violations are more pronounced in equivalence testing than in the case of superiority ...

Inverse Parametric Uncertain Identification using Polynomial Chaos and high-order Moment Matching benchmarked on a Wet Friction Clutch. Aug 13 2019. A numerically efficient inverse method for parametric model uncertainty identification using maximum likelihood estimation is presented. The goal is to identify a probability model for a fixed number of model parameters based on a set of experiments. ...

Growth of Common Friends in a Preferential Attachment Model. Aug 13 2019. The number of common friends (or connections) in a graph is a commonly used measure of proximity between two nodes. Such measures are used in link prediction algorithms and recommendation systems in large online social networks. We obtain the rate of ...

Anomaly Detection in High Dimensional Data. Aug 12 2019. The HDoutliers algorithm is a powerful unsupervised algorithm for detecting anomalies in high-dimensional data, with a strong theoretical foundation. However, it suffers from some limitations that significantly hinder its performance level under certain ...

Quantifying Time-Varying Sources in Magnetoencephalography -- A Discrete Approach. Aug 11 2019. We study the distribution of brain source from the most advanced brain imaging technique, Magnetoencephalography (MEG), which measures the magnetic fields outside the human head produced by the electrical activity inside the brain. Common time-varying ...

Detecting Heterogeneous Treatment Effect with Instrumental Variables. Aug 09 2019. There is an increasing interest in estimating heterogeneity in causal effects in randomized and observational studies. However, little research has been conducted to understand heterogeneity in an instrumental variables study. In this work, we present ...

Minimax Crossover Designs. Aug 09 2019. In crossover experiments, two broad types of treatment effects are typically considered: direct effects that capture the immediate impact of the treatment, and carryover effects that capture the lagged impact of past treatments. Existing approaches to ...

Discovering a Regularity: the Case of An 800-year Law of Advances in Small-Arms Technologies. Aug 09 2019. Considering a broad family of technologies where a measure of performance (MoP) is difficult or impossible to formulate, we seek an alternative measure that exhibits a regular pattern of evolution over time, similar to how a MoP may follow a Moore's law. ...

Combined Tail Estimation Using Censored Data and Expert Information. Aug 09 2019. We study tail estimation in Pareto-like settings for datasets with a high percentage of randomly right-censored data, and where some expert information on the tail index is available for the censored observations. This setting arises for instance naturally ...

Climate extreme event attribution using multivariate peaks-over-thresholds modeling and counterfactual theory. Aug 08 2019. Numerical climate models are complex and combine a large number of physical processes. They are key tools to quantify the relative contribution of potential anthropogenic causes (e.g., the current increase in greenhouse gases) on high impact atmospheric ...

Enhancing the Demand for Labour survey by including skills from online job advertisements using model-assisted calibration. Aug 08 2019. In the article we describe an enhancement to the Demand for Labour (DL) survey conducted by Statistics Poland, which involves the inclusion of skills obtained from online job advertisements. The main goal is to provide estimates of the demand for skills ...

A nonparametric Bayesian approach to the rare type match problem. Aug 08 2019. The "rare type match problem" is the situation in which the suspect's DNA profile, matching the DNA profile of the crime stain, is not in the database of reference. The evaluation of this match in the light of the two competing hypotheses (the crime stain ...

Spatial Flow-Field Approximation Using Few Thermodynamic Measurements Part II: Uncertainty Assessments. Aug 08 2019. In this second part of our two-part paper, we provide a detailed, frequentist framework for propagating uncertainties within our multivariate linear least squares model. This permits us to quantify the impact of uncertainties in thermodynamic measurements, arising ...

Spatial Flow-Field Approximation Using Few Thermodynamic Measurements Part I: Formulation and Area Averaging. Aug 08 2019. Our investigation raises an important question that is of relevance to the wider turbomachinery community: how do we estimate the spatial average of a flow quantity given finite (and sparse) measurements? This paper seeks to advance efforts to answer ...

Que será será? The uncertainty estimation of feature-based time series forecasts. Aug 08 2019. Interval forecasts have significant advantages in providing uncertainty estimation to point forecasts, leading to the importance of providing prediction intervals (PIs) as well as point forecasts. In this paper, we propose a general feature-based time ...

The utility of a Bayesian Markov model with Pólya-Gamma sampling for estimating individual behavior transition probabilities from accelerometer classifications. Aug 07 2019. The use of accelerometers in wildlife tracking provides a fine-scale data source for understanding animal behavior and decision-making. Current methods in movement ecology focus on behavior as a driver of movement mechanisms. The Bayesian Markov model ...

A modelling methodology for social interaction experiments. Aug 07 2019. Analysis of temporal network data arising from online interactive social experiments is not possible with standard statistical methods because the assumptions of these models, such as independence of observations, are not satisfied. In this paper, we ...

Route Identification in the National Football League. Aug 07 2019. Tracking data in the NFL is a sequence of spatial-temporal measurements that vary in length depending on the duration of the play. In this paper, we demonstrate how model-based curve clustering of observed player trajectories can be used to identify the ...

Statistical modeling of groundwater quality assessment in Iran using a flexible Poisson likelihood. Aug 06 2019. Assessing water quality and recognizing its associated risks to human health and the broader environment is undoubtedly essential. Groundwater is widely used to supply water for drinking, industrial, and agricultural purposes. The groundwater quality measurements ...

Predicted disease compositions of human gliomas estimated from multiparametric MRI can predict endothelial proliferation, tumor grade, and overall survival. Aug 06 2019. Background and Purpose: Biopsy is the main determinant of glioma clinical management, but it requires invasive sampling that fails to detect relevant features because of tumor heterogeneity. The purpose of this study was to evaluate the accuracy of a voxel-wise, ...

Global Fixed Income Portfolios: A Macroeconomic Invariant Solution. Aug 06 2019. Global fixed income returns span across multiple maturities and economies, that is, they naturally reside on multi-dimensional data structures referred to as tensors. In contrast to standard "flat-view" multivariate models that are agnostic to data structure ...

Second-order Control of Complex Systems with Correlated Synthetic Data. Aug 06 2019. Generation of hybrid synthetic data resembling real data according to some criteria is an important methodological and thematic issue in most disciplines which study complex systems. Interdependencies between constituting elements, materialized within respective ...

Fossil fuel resources, decarbonization, and economic growth drive the feasibility of Paris climate targets. Aug 06 2019. Understanding how reducing carbon dioxide (CO2) emissions impacts climate risks requires probabilistic projections of the baseline ("business-as-usual") emissions. Previous studies deriving these baseline projections have broken important new ground, ...

Characterising complex healthcare systems using network science: The small world of emergency surgery. Aug 05 2019. Hospitals are complex systems and optimising their function is critical to the provision of high quality, cost effective healthcare. Nevertheless, metrics of performance have to date focused on the performance of individual elements rather than the system ...

Performance of variable and function selection methods for estimating the non-linear health effects of correlated chemical mixtures: a simulation study. Aug 05 2019. Statistical methods for identifying harmful chemicals in a correlated mixture often assume linearity in exposure-response relationships. Non-monotonic relationships are increasingly recognised (e.g., for endocrine-disrupting chemicals); however, the impact ...

Interpretable brain age prediction using linear latent variable models of functional connectivity. Aug 05 2019. Neuroimaging-driven prediction of brain age, defined as the predicted biological age of a subject using only brain imaging data, is an exciting avenue of research. In this work we seek to build models of brain age based on functional connectivity while ...

Antioxidant capacity is repeatable across years but does not consistently correlate with a marker of peroxidation in a free-living passerine bird. Aug 05 2019. Oxidative stress occurs when reactive oxygen species (ROS) exceed antioxidant defences, which can have deleterious effects on cell function, health and survival. Therefore, organisms are expected to finely regulate pro-oxidant and antioxidant processes. ...

Forecasting age distribution of death counts: An application to annuity pricing. Aug 05 2019. We consider a compositional data analysis approach to forecasting the age distribution of death counts. Using the age-specific period life-table death counts in Australia obtained from the Human Mortality Database, the compositional data analysis approach ...

Sensitivity Analysis of Treatment Effect to Unmeasured Confounding in Observational Studies with Survival and Competing Risks Outcomes. Aug 05 2019. No unmeasured confounding is often assumed in estimating treatment effects in observational data when using approaches such as propensity scores and inverse probability weighting. However, in many such studies due to the limitation of the databases, collected ...

Defence Against the Modern Arts: the Curse of Statistics -- FRStat. Aug 04 2019. For several decades, legal and scientific scholars have argued that conclusions from forensic examinations should be supported by statistical data and reported within a probabilistic framework. Multiple models have been proposed to quantify the probative ...

Detecting the Hot Hand: Tests of Randomness Against Streaky Alternatives in Bernoulli Sequences. Aug 04 2019. We consider the problem of testing for randomness against streaky alternatives in Bernoulli sequences. In particular, we study tests of randomness (i.e., that trials are i.i.d.) which choose as test statistics (i) the difference between the proportions ...

Improved GM(1,1) model based on Simpson formula and its applications. Aug 04 2019. The classical GM(1,1) model is an efficient tool to make accurate forecasts with limited samples. However, the accuracy of the GM(1,1) model still needs to be improved. This paper proposes a novel discrete GM(1,1) model, named the ${\rm GM_{SD}}$(1,1) model, ...

Estimating Unobserved Individual Heterogeneity Using Pairwise Comparisons. Aug 04 2019. We propose a new method for studying environments with unobserved individual heterogeneity. Based on model-implied pairwise inequalities, the method classifies individuals in the sample into groups defined by discrete unobserved heterogeneity with unknown ...

Asymptotically consistent prediction of extremes in chaotic systems: 1 stationary case. Aug 03 2019; Aug 16 2019. In many real world chaotic systems, the interest is typically in determining when the system will behave in an extreme manner. Flooding and drought, extreme heatwaves, large earthquakes, and large drops in the stock market are examples of the extreme ...

Comparing sleep studies in terms of the Apnea-Hypopnea Index. Aug 02 2019. The Apnea-Hypopnea Index (AHI) is one of the most-used parameters from the sleep study that allows assessing both the severity of obstructive sleep apnea and the reliability of new devices and methods. However, in many cases, it is compared with a reference ...

Functional Ratings in Sports. Aug 02 2019. In this paper, we present a new model for ranking sports teams. Our model uses all scoring data from all games to produce a functional rating by the method of least squares. The functional rating can be interpreted as a team's average point differential ...

Generalised Joint Regression for Count Data with a Focus on Modelling Football Matches. Aug 02 2019; Aug 21 2019. We propose a versatile joint regression framework for count responses. The method is implemented in the R add-on package GJRM and allows for modelling linear and non-linear dependence through the use of several copulae. Moreover, the parameters of the ...

Exact joint likelihood of pseudo-$C_\ell$ estimates from correlated Gaussian cosmological fields. Aug 02 2019. We present the exact joint likelihood of pseudo-$C_\ell$ power spectrum estimates measured from an arbitrary number of Gaussian cosmological fields. Our method is applicable to both spin-0 fields and spin-2 fields, including a mixture of the two, and ...

Heterogeneous Endogenous Effects in Networks. Aug 02 2019. This paper proposes a new method to identify leaders and followers in a network. Prior works use spatial autoregression models (SARs) which implicitly assume that each individual in the network has the same peer effects on others. Mechanically, they conclude ...

Structure retrieval from 4D-STEM: statistical analysis of potential pitfalls in high-dimensional data. Aug 01 2019. Four-dimensional scanning transmission electron microscopy (4D-STEM) is one of the most rapidly growing modes of electron microscopy imaging. The advent of fast pixelated cameras and the associated data infrastructure have greatly accelerated this process. ...

Teasing out the overall survival benefit with adjustment for treatment switching to other therapies. Aug 01 2019. In oncology clinical trials, characterizing the long-term overall survival (OS) benefit for an experimental drug or treatment regimen (experimental group) is often unobservable if some patients in the control group switch to drugs in the experimental ...

Bayesian Gamma-Negative Binomial Modeling of Single-Cell RNA Sequencing Data. Aug 01 2019. Background: Single-cell RNA sequencing (scRNA-seq) is a powerful profiling technique at the single-cell resolution. Appropriate analysis of scRNA-seq data can characterize molecular heterogeneity and shed light on the underlying cellular process to ...

Non-Archimedean Coulomb Gases. Aug 01 2019. This article aims to study the Coulomb gas model over the $d$-dimensional $p$-adic space. We establish the existence of equilibrium measures and the $\Gamma$-limit for the Coulomb energy functional when the number of configurations tends to infinity. For ...

Measuring the Clustering Strength of a Network via the Normalized Clustering Coefficient. Aug 01 2019. In this paper, we propose a novel statistic of networks, the normalized clustering coefficient, which is a modified version of the clustering coefficient that is robust to network size, network density and degree heterogeneity under different network ...

Mapping the uncertainty of 19th century West African slave origins using a Markov decision process model. Aug 01 2019. The advent of modern computers has added an increased emphasis on channeling computational power and statistical methods into digital humanities. Including increased statistical rigor in history poses unique challenges due to the inherent uncertainties ...

Groundwater pumping to increase food production causes persistent groundwater drought in India. Aug 01 2019. Rapid groundwater depletion in India is a sustainability challenge. However, the crucial role of climate and groundwater pumping in persisting groundwater drought remains unrecognized. Using the data from Gravity Recovery and Climate Experiment (GRACE) satellites ...

Max-Min Fairness Design for MIMO Interference Channels: a Minorization-Maximization Approach. Aug 01 2019; Aug 02 2019. We address the problem of linear precoder (beamformer) design in a multiple-input multiple-output interference channel (MIMO-IC). The aim is to design the transmit covariance matrices in order to achieve max-min utility fairness for all users. The corresponding ...

Projection pursuit based generalized betas accounting for higher order co-moment effects in financial market analysis. Jul 31 2019. Betas are possibly the most frequently applied tool to analyze how securities relate to the market. While in very widespread use, betas only express dynamics derived from second moment statistics. Financial returns data often deviate from normal assumptions ...

Bivariate temporal orders for causal inference. Jul 31 2019. Causality analysis may be carried out at different levels of detail, e.g. parameter- or temporal-based (both in a global sense). There is hence a need for a local, more distinctive approach, particularly when analyzing data segments. Therefore, the bivariate ...

Isodiametry, variance, and regular simplices from particle interactions. Jul 31 2019. Consider a pressureless gas interacting through an attractive-repulsive potential given as a difference of power laws and normalized so that its unique minimum occurs at unit separation. For a range of exponents corresponding to mild repulsion and strong ...

Additive Bayesian variable selection under censoring and misspecification. Jul 31 2019. We study the effect and interplay of two important issues on Bayesian model selection (BMS): the presence of censoring and model misspecification. Misspecification refers to assuming the wrong model or functional effect on the response, or not recording ...