Active Matrix Factorization for SurveysFeb 20 2019Amid historically low response rates, survey researchers seek ways to reduce respondent burden while measuring desired concepts with precision. We propose to ask fewer questions of respondents and impute missing responses via probabilistic matrix factorization. ... More

Matching Refugees to Host Country Locations Based on Preferences and OutcomesFeb 20 2019Facilitating the integration of refugees has become a major policy challenge in many host countries in the context of the global displacement crisis. One of the first policy decisions host countries make in the resettlement process is the assignment of ... More

Optimized data exploration applied to the simulation of a chemical processFeb 18 2019In complex simulation environments, certain parameter space regions may result in non-convergent or unphysical outcomes. All parameters can therefore be labeled with a binary class describing whether or not they lead to valid results. In general, it can ... More

A feature-based framework for detecting technical outliers in water-quality data from in situ sensorsFeb 17 2019Outliers due to technical errors in water-quality data from in situ sensors can reduce data quality and have a direct impact on inference drawn from subsequent data analysis. However, outlier detection through manual monitoring is unfeasible given the ... More

A Bayesian binary algorithm for RMS-based acoustic signal segmentationFeb 17 2019Changepoint analysis (also known as segmentation analysis) aims at analyzing an ordered, one-dimensional vector, in order to find locations where some characteristic of the data changes. Many models and algorithms have been studied under this theme, including ... More

A Statistical Analysis of Noisy Crowdsourced Weather DataFeb 17 2019Spatial prediction of weather-elements like temperature, precipitation, and barometric pressure is generally based on satellite imagery or data collected at ground-stations. None of these data provide information at a more granular or "hyper-local" resolution. ... More

Model fitting in Multiple Systems Analysis for the quantification of Modern Slavery: Classical and Bayesian approachesFeb 16 2019Multiple Systems Estimation is a key estimation approach for hidden populations such as the number of victims of Modern Slavery. The UK Government estimate of 10,000 to 13,000 victims was obtained by a multiple systems estimate based on six lists. A stepwise ... More

Monte Carlo Sampling Bias in the Microwave Uncertainty FrameworkFeb 15 2019Uncertainty propagation software can have unknown, inadvertent biases introduced by various means. This work is a case study in bias identification and reduction in one such software package, the Microwave Uncertainty Framework (MUF). The general purpose ... More

Detected changes in precipitation extremes at their native scales derived from in situ measurementsFeb 15 2019The gridding of daily accumulated precipitation--especially extremes--from ground-based station observations is problematic due to the fractal nature of precipitation, and therefore estimates of long period return values and their changes based on such ... More

Critical Transitions in Intensive Care Units: A Sepsis Case StudyFeb 15 2019Progression of complex human diseases is associated with transitions across dynamical regimes. These transitions are often phase transitions that generate early-warning signs and provide insights into the underlying disease-driving mechanism(s). In this ... More

BAREB: A Bayesian repulsive biclustering model for periodontal dataFeb 15 2019Preventing periodontal diseases (PD) and maintaining the structure and function of teeth are important goals for personal oral care. To understand the heterogeneity in patients with diverse PD patterns, we develop BAREB, a Bayesian repulsive biclustering ... More

Sequential importance sampling for multi-resolution Kingman-Tajima coalescent countingFeb 14 2019Statistical inference of evolutionary parameters from molecular sequence data relies on coalescent models to account for the shared genealogical ancestry of the samples. However, inferential algorithms do not scale to available data sets. A strategy to ... More

Fundamental Diagram of Traffic Flow from Prigogine-Herman-Enskog EquationFeb 14 2019Recent applications of a new methodology to measure fundamental traffic relations on freeways show that many of the critical parameters of the flow-density and speed-spacing diagrams depend on vehicle length. In response to this fact, we present in this ... More

OPENMENDEL: A Cooperative Programming Project for Statistical GeneticsFeb 14 2019Statistical methods for genomewide association studies (GWAS) continue to improve. However, the increasing volume and variety of genetic and genomic data make computational speed and ease of data manipulation mandatory in future software. In our view, ... More

Estimation of Gaussian directed acyclic graphs using partial ordering information with an application to dairy cattle dataFeb 14 2019Estimating a directed acyclic graph (DAG) from observational data represents a canonical learning problem and has generated a lot of interest in recent years. Research has focused mostly on the following two cases: when no information regarding the ordering ... More

Anytime Tail AveragingFeb 13 2019Tail averaging consists in averaging the last examples in a stream. Common techniques either have a memory requirement which grows with the number of samples to average, are not available at every timestep or do not accommodate growing windows. We propose ... More

A Data-Driven Approach for Assessing Biking Safety in CitiesFeb 13 2019With the focus that cities around the world have put on sustainable transportation during the past few years, biking has become one of the foci for local governments around the world. Cities all over the world invest in bike infrastructure, including ... More

Selective Inference for Testing Trees and Edges in PhylogeneticsFeb 13 2019Selective inference is considered for testing trees and edges in phylogenetic tree selection from molecular sequences. This improves the previously proposed approximately unbiased test by adjusting the selection bias when testing many trees and edges ... More

Impact of Inter-Country Distances on International TourismFeb 13 2019Tourism is a worldwide practice with international tourism revenues increasing from US$495 billion in 2000 to US$1340 billion in 2017. Its relevance to the economy of many countries is obvious. Even though the World Airline Network (WAN) is global and ... More

Bayesian inference and non-linear extensions of the CIRCE method for quantifying the uncertainty of closure relationships integrated into thermal-hydraulic system codesFeb 13 2019Uncertainty Quantification of closure relationships integrated into thermal-hydraulic system codes is a critical prerequisite so that the Best-Estimate Plus Uncertainty (BEPU) methodology for nuclear safety and licensing processes can be implemented. ... More

Statistical Failure Mechanism Analysis of Earthquakes Revealing Time RelationshipsFeb 13 2019If we assume that earthquakes are chaotic, and influenced locally then chaos theory suggests that there should be a temporal association between earthquakes in a local region that should be revealed with statistical examination. To date no strong relationship ... More

A simple statistical approach to prediction in open high dimensional chaotic systemsFeb 13 2019Two recent papers on prediction of chaotic systems, one on multi-view embedding [1], and the second on prediction in projection [2], provide empirical evidence to support particular prediction methods for chaotic systems. Multi-view embedding [1] is a method of ... More

A Novel Maneuvering Target Tracking Approach by Stochastic Volatility GARCH ModelFeb 12 2019In this paper, we introduce a new single model maneuvering target tracking approach using stochastic differential equation (SDE) based on GARCH volatility. The traditional input estimation (IE) techniques assume constant acceleration level which do not ... More

Non-Linear Non-Stationary Heteroscedasticity Volatility for Tracking of Jump ProcessesFeb 12 2019In this paper, we introduce a new jump process modeling which involves a particular kind of non-Gaussian stochastic processes with random jumps at random time points. The main goal of this study is to provide an accurate tracking technique based on non-linear ... More

Inter-Node Distance Estimation from Multipath Delay Differences of Channels to Observer NodesFeb 12 2019We study the estimation of distance d between two wireless nodes by means of their wideband channels to a third node, called observer. The motivating principle is that the channel impulse responses are similar for small d and drift apart when d increases. ... More

Winning the Big Data Technologies Horizon Prize: Fast and reliable forecasting of electricity grid traffic by identification of recurrent fluctuationsFeb 12 2019This paper provides a description of the approach and methodology I used in winning the European Union Big Data Technologies Horizon Prize on data-driven prediction of electricity grid traffic. The methodology relies on identifying typical short-term ... More

Achieving GWAS with Homomorphic EncryptionFeb 12 2019One way of investigating how genes affect human traits would be with a genome-wide association study (GWAS). Genetic markers, known as single-nucleotide polymorphisms (SNPs), are used in GWAS. This raises privacy and security concerns as these genetic markers ... More

Bayesian Inference of a Finite Population Mean Under Length-Biased SamplingFeb 12 2019We present a robust Bayesian method to analyze forestry data when samples are selected with probability proportional to length from a finite population of unknown size. Specifically, we use Bayesian predictive inference to estimate the finite population ... More

Riemannian joint dimensionality reduction and dictionary learning on symmetric positive definite manifoldFeb 11 2019Dictionary learning (DL) and dimensionality reduction (DR) are powerful tools to analyze high-dimensional noisy signals. This paper presents a proposal of a novel Riemannian joint dimensionality reduction and dictionary learning (R-JDRDL) on symmetric ... More

Hawkes processes for credit indices time series analysis: How random are trades arrival times?Feb 11 2019Targeting a better understanding of credit market dynamics, the authors have studied a stochastic model named the Hawkes process. Describing trades arrival times, this kind of model allows for the capture of self-excitement and mutual interactions phenomena. ... More

Validating Gravity-Based Market Share Models Using Large-Scale Transactional DataFeb 09 2019Customer patronage behavior has been widely studied in market share modeling contexts, which is an essential step towards modeling and solving competitive facility location problems. Existing studies have conducted surveys to estimate merchants' market ... More

Bayesian Nonparametric Adaptive Spectral Density Estimation for Financial Time SeriesFeb 09 2019Discrimination between non-stationarity and long-range dependency is a difficult and long-standing issue in modelling financial time series. This paper uses an adaptive spectral technique which jointly models the non-stationarity and dependency of financial ... More

Automatic dimensionality selection for principal component analysis models with the ignorance scoreFeb 08 2019Principal component analysis (PCA) is by far the most widespread tool for unsupervised learning with high-dimensional data sets. Its application is popularly studied for the purpose of exploratory data analysis and online process monitoring. Unfortunately, ... More

Does the "Artificial Intelligence Clinician" learn optimal treatment strategies for sepsis in intensive care?Feb 08 2019From 2017 to 2018 the number of scientific publications found via PubMed search using the keyword "Machine Learning" increased by 46% (4,317 to 6,307). The results of studies involving machine learning, artificial intelligence (AI), and big data have ... More

Scalable optimal Bayesian classification of single-cell trajectories under regulatory model uncertaintyFeb 08 2019Single-cell gene expression measurements offer opportunities in deriving mechanistic understanding of complex diseases, including cancer. However, due to the complex regulatory machinery of the cell, gene regulatory network (GRN) model inference based ... More

Learning spatially-correlated temporal dictionaries for calcium imagingFeb 08 2019Calcium imaging has become a fundamental neural imaging technique, aiming to recover the individual activity of hundreds of neurons in a cortical region. Current methods (mostly matrix factorization) are aimed at detecting neurons in the field-of-view ... More

Distribution of residual autocorrelations for multiplicative seasonal ARMA models with uncorrelated but non-independent error termsFeb 08 2019In this paper we consider portmanteau tests for testing the adequacy of multiplicative seasonal autoregressive moving-average (SARMA) models under the assumption that the errors are uncorrelated but not necessarily independent. We relax the standard independence ... More

Crop Yield Prediction Using Deep Neural NetworksFeb 07 2019Crop yield is a highly complex trait determined by multiple factors such as genotype, environment, and their interactions. Accurate yield prediction requires fundamental understanding of the functional relationship between yield and these interactive ... More

Ensemble Prediction of Time to Event Outcomes with Competing Risks: A Case Study of Surgical Complications in Crohn's DiseaseFeb 07 2019We develop a novel algorithm to predict the occurrence of major abdominal surgery within 5 years following Crohn's disease diagnosis using a panel of 29 baseline covariates from the Swedish population registers. We model pseudo-observations based on the ... More

Concomitant Lasso with Repetitions (CLaR): beyond averaging multiple realizations of heteroscedastic noiseFeb 07 2019Sparsity promoting norms are frequently used in high dimensional regression. A limitation of Lasso-type estimators is that the regularization parameter depends on the noise level which varies between datasets and experiments. Estimators such as the ... More

A Bayesian Approach for Accurate Classification-Based AggregatesFeb 06 2019In this paper, we study the accuracy of values aggregated over classes predicted by a classification algorithm. The problem is that the resulting aggregates (e.g., sums of a variable) are known to be biased. The bias can be large even for highly accurate ... More

Winning Is Not Everything: A contextual analysis of hockey face-offsFeb 06 2019This paper takes a different approach to evaluating face-offs in ice hockey. Instead of looking at win percentages, the de facto measure of successful face-off takers for decades, it focuses on the game events following the face-off and how directionality, ... More

Using statistical control charts to monitor duration-based performance of projectFeb 06 2019Monitoring of project performance is a crucial task of project managers that significantly affects the project success or failure. Earned Value Management (EVM) is a well-known tool to evaluate project performance and an effective technique for identifying ... More

Modelling the effect of training on performance in road cycling: estimation of the Banister model parameters using field dataFeb 06 2019We suppose that performance is a random variable whose expectation is related to training inputs, and we study four performance measures in a statistical model that relates performance to training. Our aim is to carry out a robust statistical analysis ... More

The relative efficiency of time-to-progression and continuous measures of cognition in pre-symptomatic Alzheimer'sFeb 06 2019Pre-symptomatic (or Preclinical) Alzheimer's Disease is defined by biomarker evidence of fibrillar amyloid beta pathology in the absence of clinical symptoms. Clinical trials in this early phase of disease are challenging due to the slow rate of disease ... More

Heavy User Effect in A/B Testing: Identification and EstimationFeb 06 2019On-line experimentation (also known as A/B testing) has become an integral part of software development. To timely incorporate user feedback and continuously improve products, many software companies have adopted the culture of agile deployment, requiring ... More

Playing Fast Not Loose: Evaluating team-level pace of play in ice hockey using spatio-temporal possession dataFeb 06 2019Pace of play is an important characteristic in hockey as well as other team sports. We provide the first comprehensive study of pace within the sport of hockey, focusing on how teams and players impact pace in different regions of the ice, and the resultant ... More

Active Learning for High-Dimensional Binary FeaturesFeb 05 2019Erbium-doped fiber amplifier (EDFA) is an optical amplifier/repeater device used to boost the intensity of optical signals being carried through a fiber optic communication system. A highly accurate EDFA model is important because of its crucial role ... More

Temporal Convolutional Networks and Dynamic Time Warping can Drastically Improve the Early Prediction of SepsisFeb 05 2019Feb 07 2019Motivation: Sepsis is a life-threatening host response to infection associated with high mortality, morbidity and health costs. Its management is highly time-sensitive since each hour of delayed treatment increases mortality due to irreversible organ ... More

Estimating Individualized Treatment Regimes from Crossover DesignsFeb 05 2019The field of precision medicine aims to tailor treatment based on patient-specific factors in a reproducible way. To this end, estimating an optimal individualized treatment regime (ITR) that recommends treatment decisions based on patient characteristics ... More

Interpretation of the individual effect under treatment spilloverFeb 04 2019Some interventions may include important spillover or dissemination effects between study participants. For example, vaccines, cash transfers, and education programs may exert a causal effect on participants beyond those to whom individual treatment is ... More

Reducing variability in along-tract analysis with diffusion profile realignmentFeb 04 2019Diffusion weighted MRI (dMRI) provides a non-invasive virtual reconstruction of the brain's white matter structures through tractography. Analyzing dMRI measures along the trajectory of white matter bundles can provide a more specific investigation than ... More

Global Fitting of the Response Surface via Estimating Multiple Contours of a SimulatorFeb 04 2019Computer simulators are nowadays widely used to understand complex physical systems in many areas such as aerospace, renewable energy, climate modeling, and manufacturing. One fundamental issue in the study of computer simulators is known as experimental ... More

The reliability of an environmental epidemiology meta-analysis, a case studyFeb 02 2019Background: Claims made in science papers are coming under increased scrutiny with many claims failing to replicate. Meta-analysis studies that use unreliable observational studies should be in question. We examine the reliability of the base studies ... More

Forecasting Intra-Hour Imbalances in Electric Power SystemsFeb 01 2019Keeping the electricity production in balance with the actual demand is becoming a difficult and expensive task in spite of an involvement of experienced human operators. This is due to the increasing complexity of the electric power grid system with ... More

Limit theorems for cloning algorithmsFeb 01 2019Large deviations for additive path functionals of stochastic processes have attracted significant research interest, in particular in the context of stochastic particle systems and statistical physics. Efficient numerical `cloning' algorithms have been ... More

StaTIX - Statistical Type Inference on Linked DataFeb 01 2019Large knowledge bases typically contain data adhering to various schemas with incomplete and/or noisy type information. This seriously complicates further integration and post-processing efforts, as type information is crucial in correctly handling the ... More

Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden ConfoundersFeb 01 2019The estimation of treatment effects is a pervasive problem in medicine. Existing methods for estimating treatment effects from longitudinal observational data assume that there are no hidden confounders. This assumption is not testable in practice and, ... More

A copula-based measure for quantifying asymmetry in dependence and associationsFeb 01 2019Asymmetry is an inherent property of bivariate associations and therefore must not be ignored. The currently applicable dependence measures mask the potential asymmetry of the underlying dependence structure by implicitly assuming that quantity Y is equally ... More

Forecasting the Impact of Connected and Automated Vehicles on Energy Use: A Microeconomic Study of Induced Travel and Energy ReboundJan 31 2019Connected and automated vehicles (CAVs) are expected to yield significant improvements in safety, energy efficiency, and time utilization. However, their net effect on energy and environmental outcomes is unclear. Higher fuel economy reduces the energy ... More

A large-scale crowdsourced analysis of abuse against women journalists and politicians on TwitterJan 31 2019We report the first, to the best of our knowledge, hand-in-hand collaboration between human rights activists and machine learners, leveraging crowd-sourcing to study online abuse against women on Twitter. On a technical front, we carefully curate an unbiased ... More

Bayesian nonparametric multiway regression for clustered binomial dataJan 31 2019We introduce a Bayesian nonparametric regression model for data with multiway (tensor) structure, motivated by an application to periodontal disease (PD) data. Our outcome is the number of diseased sites measured over four different tooth types for each ... More

Trends in the extremes of environments associated with severe US thunderstormsJan 30 2019Feb 13 2019Severe thunderstorms can have devastating impacts. Concurrently high values of convective available potential energy (CAPE) and storm relative helicity (SRH) are known to be favourable to severe weather and thus high values of PROD=$\sqrt{\mbox{CAPE}} ... More

Uplift Regression: The R Package tools4upliftJan 30 2019Uplift modeling aims at predicting the causal effect of an action such as a medical treatment or a marketing campaign on a particular individual, by taking into consideration the response to a treatment. The treatment group contains individuals who are ... More

A statistical modelling framework for mapping malaria seasonalityJan 30 2019Many malaria-endemic areas experience seasonal fluctuations in cases because the mosquito vector's life cycle is dependent on the environment. While most existing maps of malaria seasonality use fixed thresholds of rainfall, temperature and vegetation ... More

Geometric structure of graph Laplacian embeddingsJan 30 2019We analyze the spectral clustering procedure for identifying coarse structure in a data set $x_1, \dots, x_n$, and in particular study the geometry of graph Laplacian embeddings which form the basis for spectral clustering algorithms. More precisely, ... More

A Robust Time Series Model with Outliers and Missing EntriesJan 29 2019This paper studies the problem of robustly learning the correlation function for a univariate time series with the presence of noise, outliers and missing entries. The outliers or anomalies considered here are sparse and rare events that deviate from ... More

A new tidy data structure to support exploration and modeling of temporal dataJan 29 2019Mining temporal data for information is often inhibited by a multitude of formats: irregular or multiple time intervals, point events that need aggregating, multiple observational units or repeated measurements on multiple individuals, and heterogeneous ... More

A semi-supervised approach to message stance classificationJan 29 2019Social media communications are becoming increasingly prevalent; some useful, some false, whether unwittingly or maliciously. An increasing number of rumours daily flood the social networks. Determining their veracity in an autonomous way is a very active ... More

Centered Partition Process: Informative Priors for ClusteringJan 29 2019There is a very rich literature proposing Bayesian approaches for clustering starting with a prior probability distribution on partitions. Most approaches assume exchangeability, leading to simple representations in terms of Exchangeable Partition Probability ... More

A maximum principle argument for the uniform convergence of graph Laplacian regressorsJan 29 2019We study asymptotic consistency guarantees for a non-parametric regression problem with Laplacian regularization. In particular, we consider $(x_1, y_1), \dots, (x_n, y_n)$ samples from some distribution on the cross product $\mathcal{M} \times \mathbb{R}$, ... More

Diseño de un espacio semántico sobre la base de la Wikipedia. Una propuesta de análisis de la semántica latente para el idioma españolJan 28 2019Latent Semantic Analysis (LSA) was initially conceived within cognitive psychology in the 1990s. Since its emergence, LSA has been used to model cognitive processes, pointing out academic texts, compare literary works and analyse political ... More

Decomposition of Higher-Order Spectra for Blind Multiple-Input Deconvolution, Pattern Identification and SeparationJan 28 2019Like the ordinary power spectrum, higher-order spectra (HOS) describe signal properties that are invariant under translations in time. Unlike the power spectrum, HOS retain phase information from which details of the signal waveform can be recovered. ... More

RefCurv: A Software for the Construction of Pediatric Reference CurvesJan 28 2019In medicine, reference curves serve as an important tool for everyday clinical practice. Pediatricians assess the growth process of children with the help of percentile curves serving as norm references. The mathematical methods for the construction of ... More

Intensity estimation of transaction arrivals on the intraday electricity marketJan 28 2019In the following paper we present a simple intensity estimation method of transaction arrivals on the intraday electricity market. Assuming the interarrival times distribution, we utilize a maximum likelihood estimation. The method's performance is briefly ... More

Galton-Watson process and bayesian inference: A turnkey method for the viability study of small populationsJan 28 2019(1) Sharp prediction of extinction times is needed in biodiversity monitoring and conservation management. (2) The Galton-Watson process is a classical stochastic model for describing population dynamics. Its evolution is like the matrix population model ... More

Fair Regression for Health Care SpendingJan 28 2019The distribution of health care payments to insurance plans has substantial consequences for social policy. Risk adjustment formulas predict spending in health insurance markets in order to provide fair benefits and health care coverage for all enrollees, ... More

Improved Causal Discovery from Longitudinal Data Using a Mixture of DAGsJan 28 2019Many causal processes in biomedicine contain cycles and evolve. However, most causal discovery algorithms assume that the underlying causal process follows a single directed acyclic graph (DAG) that does not change over time. The algorithms can therefore ... More

Clustering Discrete Valued Time SeriesJan 26 2019There is a need for the development of models that are able to account for discreteness in data, along with its time series properties and correlation. Our focus falls on INteger-valued AutoRegressive (INAR) type models. The INAR type models can be used ... More

On the unified zero-inflated cure-rate survival modelsJan 26 2019In this paper, we propose a unified version for survival models that includes zero-inflation and cure rate proportions, and allows different distributions for the unknown competitive causes. Our model has as particular cases several usual cure rate survival ... More

Volatility Models Applied to Geophysics and High Frequency Financial Market DataJan 26 2019This work is devoted to the study of modeling geophysical and financial time series. A class of volatility models with time-varying parameters is presented to forecast the volatility of time series in a stationary environment. The modeling of stationary ... More

Digging the topology of rock art in Northwestern PatagoniaJan 25 2019We present a study on the rock art of Northern Patagonia based on network analysis and community detection. We unveil a significant aggregation of archaeological sites, linked by common rock art motifs that turn out to be consistent with their geographical ... More

Computational landscape of user behavior on social mediaJan 25 2019With the increasing abundance of 'digital footprints' left by human interactions in online environments, e.g., social media and app use, the ability to model complex human behavior has become increasingly possible. Many approaches have been proposed, ... More

Spatial trend analysis of gridded temperature data at varying spatial scalesJan 25 2019Classical assessments of trends in gridded temperature data perform independent evaluations across the grid, thus, ignoring spatial correlations in the trend estimates. In particular, this affects assessments of trend significance as evaluation of the ... More

Detecting Changes in Hidden Markov ModelsJan 24 2019Jan 26 2019We consider the problem of sequential detection of a change in the statistical behavior of a hidden Markov model. By adopting a worst-case analysis with respect to the time of change and by taking into account the data that can be accessed by the change-imposing ... More

Asynchronous Multi-Sensor Change-Point Detection for Seismic TremorsJan 24 2019We consider the sequential change-point detection for asynchronous multi-sensors, where each sensor observes a signal (due to change-point) at different times. We propose an asynchronous Subspace-CUSUM procedure based on jointly estimating the unknown ... More

Uncertainty Principle in Distributed MIMO RadarsJan 23 2019Radar uncertainty principle indicates that there is an inherent invariance in the product of the time-delay and Doppler-shift measurement accuracy and resolution which can be tuned by the waveform at transmitter. In this paper, based on the radar uncertainty ... More

Mobility-on-demand versus fixed-route transit systems: an evaluation of traveler preferences in low-income communitiesJan 22 2019Emerging transportation technologies, such as ride-hailing and autonomous vehicles, are disrupting the transportation sector and transforming public transit. Some transit observers envision future public transit to be integrated transit systems with fixed-route ... More

Optimal Uncertainty Quantification of a risk measurement from a thermal-hydraulic code using Canonical MomentsJan 22 2019We study an industrial computer code related to nuclear safety. A major topic of interest is to assess the uncertainties tainting the results of a computer simulation. In this work we gain robustness on the quantification of a risk measurement by accounting ... More

The posterior probability of a null hypothesis given a statistically significant resultJan 21 2019Some researchers informally assume that, when they carry out a null hypothesis significance test, a statistically significant result lowers the probability of the null hypothesis being true. Although technically wrong (the null hypothesis does not have ... More

Bayesian Pseudo Posterior Synthesis for Data Privacy ProtectionJan 19 2019Statistical agencies utilize models to synthesize respondent-level data for release to the general public as an alternative to the actual data records. A Bayesian model synthesizer encodes privacy protection by employing a hierarchical prior construction ... More

A New Weighting Scheme in Weighted Markov Model for Predicting the Probability of Drought EpisodesJan 18 2019Drought is a complex stochastic natural hazard caused by prolonged shortage of rainfall. Several environmental factors are involved in determining drought classes at the specific monitoring station. Therefore, efficient sequence processing techniques ... More

Bayesian Prediction of Nitrate Concentration Using a Gaussian Log-Gaussian Spatial Model with Measurement Error in Explanatory VariablesJan 18 2019The occurrence of high nitrate levels in groundwater has to be recognized as a threat to humans and animals. An accurate prediction of pollutant concentrations is a basal component for a correct detection of areas with excess of contamination. The groundwater ... More

Application of Stochastic and Deterministic Techniques for Uncertainty Quantification and Sensitivity Analysis of Energy SystemsJan 17 2019Sensitivity analysis (SA) and uncertainty quantification (UQ) are used to assess and improve engineering models. In this study, various methods of SA and UQ are described and applied in theoretical and practical examples for use in energy system analysis. ... More

How to Host a Data Competition: Statistical Advice for Design and Analysis of a Data CompetitionJan 16 2019Data competitions rely on real-time leaderboards to rank competitor entries and stimulate algorithm improvement. While such competitions have become quite popular and prevalent, particularly in supervised learning formats, their implementations by the ... More

Multivariate mixed membership modeling: Inferring domain-specific risk profilesJan 16 2019Characterizing shared membership of individuals in two or more categories of a classification scheme poses severe interpretability problems when the number of categories is large (e.g. greater than six). Mixed membership models quantify this phenomenon, ... More

Novel metrics for quantifying the capacity of subgroup-defining variables to yield efficient treatment rulesJan 16 2019Feb 06 2019A major objective of subgroup analysis in clinical trials is to explore to what extent patient characteristics can determine treatment outcomes. Conventional one-variable-at-a-time subgroup analysis based on statistical hypothesis testing of covariate-by-treatment ... More