Estimating Sparse Networks with HubsApr 20 2019Graphical modelling techniques based on sparse selection have been applied to infer complex networks in many fields, including biology and medicine, engineering, finance, and social sciences. One structural feature of some of the networks in such applications ... More

Online Non-stationary Time Series Analysis and ProcessingApr 19 2019Time series analysis is critical in academic communities ranging from economics, transportation science and meteorology, to engineering, genetics and environmental sciences. In this paper, we will firstly model a time series as a non-stationary stochastic ... More

Safety-margin-based design and redesign considering mixed epistemic model uncertainty and aleatory parameter uncertaintyApr 18 2019At the initial design stage engineers often rely on low-fidelity models that have high epistemic uncertainty. Traditional safety-margin-based deterministic design resorts to testing (e.g. prototype experiment, evaluation of high-fidelity simulation, etc.) ... More

Modelling antimicrobial prescriptions in Scotland: A spatio-temporal clustering approachApr 18 2019In 2016 the British government acknowledged the importance of reducing antimicrobial prescriptions in order to avoid the long-term harmful effects of over-prescription. Prescription needs are highly dependent on factors that have a spatio-temporal component, ... More

Damping of Propagating Kink Waves in the Solar CoronaApr 18 2019Alfvénic waves have gained renewed interest since the existence of ubiquitous propagating kink waves was discovered in the corona. It has long been suggested that Alfvénic waves play an important role in coronal heating and the acceleration of ... More

Phase transition in the exclusion process on the Sierpinski gasket with slowed boundary reservoirsApr 18 2019We derive the macroscopic laws that govern the evolution of the density of particles in the exclusion process evolving on the Sierpinski gasket in the presence of a slow boundary. Depending on the slowness of the boundary we obtain, at the hydrodynamics ... More

The population-attributable fraction for time-dependent exposures and competing risks - A discussion on estimandsApr 18 2019The population-attributable fraction (PAF) quantifies the public health impact of a harmful exposure. Despite being a measure of significant importance an estimand accommodating complicated time-to-event data is not clearly defined. We discuss current ... More

On models for the estimation of the excess mortality hazard in case of insufficiently stratified life tablesApr 18 2019In cancer epidemiology using population-based data, regression models for the excess mortality hazard are a useful method to estimate cancer survival and to describe the association between prognosis factors and excess mortality. This method requires expected ... More

Sharp Bounds for the Marginal Treatment Effect with Sample SelectionApr 17 2019I analyze treatment effects in situations when agents endogenously select into the treatment group and into the observed sample. As a theoretical contribution, I propose pointwise sharp bounds for the marginal treatment effect (MTE) of interest within ... More

Variable Selection in Functional Linear Concurrent RegressionApr 17 2019We propose a novel method for variable selection in functional linear concurrent regression. Our research is motivated by a fisheries footprint study where one of the goals is to identify important time-varying socio-structural drivers influencing patterns ... More

A comparison of statistical and machine learning methods for creating national daily maps of ambient PM$_{2.5}$ concentrationApr 17 2019A typical problem in air pollution epidemiology is exposure assessment for individuals for which health data are available. Due to the sparsity of monitoring sites and the limited temporal frequency with which measurements of air pollutants concentrations ... More

Adjusted Empirical Likelihood Method for the Tail Index of A Heavy-Tailed DistributionApr 17 2019Empirical likelihood is a well-known nonparametric method in statistics and has been widely applied in statistical inference. The method has been employed by Lu and Peng (2002) to construct confidence intervals for the tail index of a heavy-tailed ... More

Nonparametric drift estimation for diffusions with jumps driven by a Hawkes processApr 17 2019We consider a 1-dimensional diffusion process X with jumps. The particularity of this model lies in the jumps, which are driven by a multidimensional Hawkes process denoted N. This article is dedicated to the study of a nonparametric estimator of the ... More

Exponential random graph model parameter estimation for very large directed networksApr 17 2019Exponential random graph models (ERGMs) are widely used for modeling social networks observed at one point in time. However the computational difficulty of ERGM parameter estimation has limited the practical application of this class of models to relatively ... More

Machine learning for early prediction of circulatory failure in the intensive care unitApr 16 2019Apr 19 2019Intensive care clinicians are presented with large quantities of patient information and measurements from a multitude of monitoring systems. The limited ability of humans to process such complex information hinders physicians to readily recognize and ... More

Robust Response-Adaptive Randomization DesignApr 16 2019In clinical trials, patients are randomized with equal probability among treatments to obtain an unbiased estimate of the treatment effect. However, response-adaptive randomization has been proposed due to ethical reasons, especially in rare diseases ... More

Constraints in Random Effects Age-Period-Cohort ModelsApr 16 2019Random effects (RE) models have been widely used to study the contextual effects of structures such as neighborhood or school. The RE approach has recently been applied to age-period-cohort (APC) models that are unidentified because the predictors are ... More

Why Are the ARIMA and SARIMA not SufficientApr 16 2019The autoregressive moving average (ARMA) model and its variants, like autoregressive integrated moving average (ARIMA) and seasonal ARIMA (SARIMA), hold a significant position in the time series analysis community. The ARMA model could describe a rational-spectra ... More

Metrics for Graph Comparison: A Practitioner's GuideApr 16 2019Comparison of graph structure is a ubiquitous task in data analysis and machine learning, with diverse applications in fields such as neuroscience, cyber security, social network analysis, and bioinformatics, among others. Discovery and comparison of ... More

How to apply multiple imputation in propensity score matching with partially observed confounders: a simulation study and practical recommendationsApr 16 2019Propensity score matching (PSM) has been widely used to mitigate confounding in observational studies, although complications arise when the covariates used to estimate the PS are only partially observed. Multiple imputation (MI) is a potential solution ... More

A deep learning model for early prediction of Alzheimer's disease dementia based on hippocampal MRIApr 15 2019Introduction: It is challenging at baseline to predict when and which individuals who meet criteria for mild cognitive impairment (MCI) will ultimately progress to Alzheimer's disease (AD) dementia. Methods: A deep learning method is developed and validated ... More

Comparison of statistical post-processing methods for probabilistic NWP forecasts of solar radiationApr 15 2019The increased usage of solar energy places additional importance on forecasts of solar radiation. Solar panel power production is primarily driven by the amount of solar radiation and it is therefore important to have accurate forecasts of solar radiation. ... More

Multiple kernel learning for integrative consensus clustering of genomic datasetsApr 15 2019Diverse applications - particularly in tumour subtyping - have demonstrated the importance of integrative clustering as a means to combine information from multiple high-dimensional omics datasets. Cluster-Of-Clusters Analysis (COCA) is a popular integrative ... More

A framework for streamlined statistical prediction using topic modelsApr 15 2019In the Humanities and Social Sciences, there is increasing interest in approaches to information extraction, prediction, intelligent linkage, and dimension reduction applicable to large text corpora. With approaches in these fields being grounded in traditional ... More

Critical elements for connectivity analysis of brain networksApr 15 2019In recent years, new and important perspectives were introduced in the field of neuroimaging with the emergence of the connectionist approach. In this new context, it is important to know not only which brain areas are activated by a particular stimulus ... More

Applications of Quantum Annealing in StatisticsApr 15 2019Quantum computation offers exciting new possibilities for statistics. This paper explores the use of the D-Wave machine, a specialized type of quantum computer, which performs quantum annealing. A general description of quantum annealing through the use ... More

Pólygamma Data Augmentation to address Non-conjugacy in the Bayesian Estimation of Mixed Multinomial Logit ModelsApr 13 2019The standard Gibbs sampler of Mixed Multinomial Logit (MMNL) models involves sampling from conditional densities of utility parameters using Metropolis-Hastings (MH) algorithm due to unavailability of conjugate prior for logit kernel. To address this ... More

Estimation of group means in generalized linear mixed modelsApr 12 2019In this manuscript, we investigate the group mean estimation and prediction for generalized linear models with a subject-wise random effect. Generalized linear models are commonly used to analyze categorical data. The model-based mean for a treatment ... More

A robust approach to model-based classification based on trimming and constraintsApr 12 2019In a standard classification framework a set of trustworthy learning data are employed to build a decision rule, with the final aim of classifying unlabelled units belonging to the test set. Therefore, unreliable labelled observations, namely outliers ... More

A streaming feature-based compression method for data from instrumented infrastructureApr 12 2019An increasing number of civil engineering applications are utilising data acquired from infrastructure instrumented with sensing devices. This data has an important role in monitoring the response of these structures to excitation, and evaluating structural ... More

New statistic for detecting laboratory effects in ORDANOVAApr 12 2019The present study defines a new statistic for detecting laboratory effects in the analysis of ordinal variation (ORDANOVA). The ORDANOVA is an analysis method similar to one-way analysis of variance for analysing ordinal data obtained from interlaboratory ... More

A Weight-based Information Filtration Algorithm for Stock-Correlation NetworksApr 12 2019Several algorithms have been proposed to filter information on a complete graph of correlations across stocks to build a stock-correlation network. Among them the planar maximally filtered graph (PMFG) algorithm uses $3n-6$ edges to build a graph whose ... More

Robust Principal Component Analysis for Compositional TablesApr 11 2019A data table which is arranged according to two factors can often be considered as a compositional table. An example is the number of unemployed people, split according to gender and age classes. Analyzed as compositions, the relevant information would ... More

Scanner Invariant Representations for Diffusion MRI HarmonizationApr 10 2019Pooled imaging data from multiple sources is subject to variation between the sources. Correcting for these biases has become incredibly important as the size of imaging studies increases and the multi-site case becomes more common. We propose learning ... More

A Classification Algorithm to Recognize Fake News WebsitesApr 10 2019'Fake news' is information that generally spreads on the web, which only mimics the form of reliable news media content. The phenomenon has assumed uncontrolled proportions in recent years, raising the concern of authorities and citizens. In this paper ... More

Google Street View image of a house predicts car accident risk of its residentApr 10 2019Road traffic injuries are a leading cause of death worldwide. Proper estimation of car accident risk is critical for appropriate allocation of resources in healthcare, insurance, civil engineering, and other industries. We show how images of houses are ... More

Cord-blood vitamin D level and night sleep duration in preschoolers in the EDEN mother-child birth cohortApr 10 2019Objective: 25-hydroxyvitamin D (25OHD) deficiency has been associated with sleep disorders in adults. Only three cross-sectional studies were performed in children and showed an association between 25OHD deficiency and both obstructive sleep apnea syndrome ... More

Multiple imputation and selection of ordinal level 2 predictors in multilevel models. An analysis of the relationship between student ratings and teacher beliefs and practicesApr 10 2019The paper is motivated by the analysis of the relationship between ratings and teacher practices and beliefs, which are measured via a set of binary and ordinal items collected by a specific survey with nearly half missing respondents. The analysis, which ... More

Association of night-waking and inattention/hyperactivity symptoms trajectories in preschool-aged childrenApr 10 2019Objective: To study the longitudinal associations between inattention/hyperactivity symptoms and night-waking in preschool-years, in light of their joint evolution. Study design: Within the French birth-cohort study EDEN, repeated measures of 1342 children's ... More

Night sleep duration trajectories and associated factors among preschool children from the EDEN cohortApr 10 2019Objective. Sleep duration may vary inter-individually and intra-individually over time. We aimed at both identifying night-sleep duration (NSD) trajectories among preschoolers and studying associated factors. Methods. NSD were collected within the French ... More

Early features associated with the neurocognitive development at 36 months of age: the AuBE studyApr 10 2019Background. Few studies on the relations between sleep quantity and/or quality and cognition were conducted among pre-schoolers from the healthy general population. We aimed at identifying, among 3-year-old children, early factors associated with intelligence ... More

Bayesian averaging of computer models with domain discrepancies: a nuclear physics perspectiveApr 09 2019This article studies Bayesian model averaging (BMA) in the context of several competing computer models in nuclear physics. We quantify model uncertainty in terms of posterior prediction errors, including an explicit formula for their posterior variance. ... More

Bivariate Gamma Mixture of Experts Models for Joint Insurance Claims ModelingApr 09 2019In general insurance, risks from different categories are often modeled independently and their sum is regarded as the total risk the insurer takes on in exchange for a premium. The dependence from multiple risks is generally neglected even when correlation ... More

A new perspective from a Dirichlet model for forecasting outstanding liabilities of nonlife insurersApr 09 2019Forecasting the outstanding claim liabilities to set adequate reserves is critical for a nonlife insurer's solvency. Chain-Ladder and Bornhuetter-Ferguson are two prominent actuarial approaches used for this task. The selection between the two approaches ... More

Robust Approximate Bayesian Inference with Synthetic LikelihoodApr 09 2019Bayesian synthetic likelihood (BSL) is now a well-established method for conducting approximate Bayesian inference in complex models where exact Bayesian approaches are either infeasible, or computationally demanding, due to the intractability of likelihood ... More

A sensitivity analysis of the PAWN sensitivity indexApr 09 2019The PAWN index is gaining traction among the modelling community as a moment-independent method to conduct global sensitivity analysis. However, it has been used so far without knowing how robust it is to its main design parameters, which need to be defined ... More

System modeling of a health issue: the case of preterm birth in OhioApr 09 2019Preterm birth rate (PBR) stands out as a major public health concern in the U.S. However, effective policies for mitigating the problem are largely unknown. The complexities of the problem raise critical questions: Why is PBR increasing despite the massive ... More

On Kaczmarz Signal Processing Technique in Massive MIMOApr 08 2019To exploit the benefits of massive multiple-input multiple-output (M-MIMO) technology in scenarios where base stations need to be inexpensive, the computational complexity of classical signal processing schemes for spatial multiplexing must be reduced. ... More

Tipping point analysis of electrical resistance data with early warning signals of failure for predictive maintenanceApr 08 2019Apr 15 2019We apply tipping point analysis to measurements of electronic components commonly used in applications in the automotive or aviation industries and demonstrate early warning signals based on scaling properties of resistance time series. The analysis utilises ... More

Common Statistical Patterns in Urban TerrorismApr 08 2019The underlying reasons behind modern terrorism are seemingly complex and intangible. Despite diverse causal mechanisms, research has shown that there exist general statistical patterns at the global scale that can shed light on human confrontation behaviour. ... More

Diabetes Mellitus Forecasting Using Population Health Data in Ontario, CanadaApr 08 2019Leveraging health administrative data (HAD) datasets for predicting the risk of chronic diseases including diabetes has gained a lot of attention in the machine learning community recently. In this paper, we use the largest health records datasets of ... More

Geostatistical Modeling of Positive Definite Matrices and Its Applications to Diffusion Tensor ImagingApr 08 2019Geostatistical modeling for continuous point-referenced data has been extensively applied to neuroimaging because it produces efficient and valid statistical inference. However, diffusion tensor imaging (DTI), a neuroimaging characterizing the brain structure ... More

Convolutive Blind Source Separation on Surface EMG Signals for Respiratory Diagnostics and Medical Ventilation ControlApr 08 2019The electromyogram (EMG) is an important tool for assessing the activity of a muscle and thus also a valuable measure for the diagnosis and control of respiratory support. In this article we propose convolutive blind source separation (BSS) as an effective ... More

Early warning in egg production curves from commercial hens: A SVM approachApr 08 2019Artificial Intelligence allows the improvement of our daily life; for instance, speech and handwritten text recognition, real-time translation and weather forecasting are commonly used applications. In the livestock sector, machine learning algorithms have ... More

A Fast Scheme for the Uniform Sampling of Binary Matrices with Fixed MarginsApr 08 2019Uniform sampling of binary matrices with fixed margins is an important and difficult problem in statistics, computer science, ecology and so on. The well-known swap algorithm would be inefficient when the size of the matrix becomes large or when the matrix ... More

Application of data compression techniques to time series forecastingApr 08 2019In this study we show that standard well-known file compression programs (zlib, bzip2, etc.) are able to forecast real-world time series data well. The strength of our approach is its ability to use a set of data compression algorithms and "automatically" ... More

An unsupervised transfer learning algorithm for sleep monitoringApr 07 2019Objective: To develop multisensor-wearable-device sleep monitoring algorithms that are robust to health disruptions affecting sleep patterns. Methods: We develop an unsupervised transfer learning algorithm based on a multivariate hidden Markov model and ... More

Bayesian influence diagnostics using normalizing functional Bregman divergenceApr 07 2019Ideally, any statistical inference should be robust to local influences. Although there are simple ways to check for leverage points in independent and linear problems, more complex models require more sophisticated methods. Kullback-Leibler and Bregman ... More

Monitoring the carbon budget: Real-time verification of CO2 emissionsApr 07 2019International agreements to reduce CO$_2$ emissions necessitate that nations will have to commit to drastic reductions in their emissions in the near future. However, CO$_2$ emissions are reported by individual nations and cannot be easily verified by ... More

Robustness of urban road networks based on spatial topological patternsApr 06 2019During the last decade, road network vulnerability assessment has received an increasing attention. On one hand, it is due to the significant advances in Network Science and the potentialities that its tools offer. On the other hand, it is due to its ... More

A Novel Big Data Analytics Framework to Predict the Risk of Opioid Use DisorderApr 06 2019Addiction and overdose related to prescription opioids have reached an epidemic level in the U.S., creating an unprecedented national crisis. This has been exacerbated partly due to the lack of tools for physicians to help predict whether or not a patient ... More

Bus Travel Time Prediction: A Lognormal Auto-Regressive (AR) Modeling ApproachApr 06 2019Providing real time information about the arrival time of the transit buses has become inevitable in urban areas to make the system more user-friendly and advantageous over various other transportation modes. However, accurate prediction of arrival time ... More

Idealize - A Notion of Idea StrengthApr 06 2019Business Entrepreneurs frequently thrive on looking for ways to test business ideas, without giving too much information. Recent techniques in startup development promote the use of surveys to measure the potential client's interest. In this preliminary ... More

Multivariate Hierarchical Frameworks for Modelling Delayed Reporting in Count DataApr 06 2019In many fields and applications count data can be subject to delayed reporting. This is where the total count, such as the number of disease cases contracted in a given week, may not be immediately available, instead arriving in parts over time. For short ... More

Bayesian estimation of the latent dimension and communities in stochastic blockmodelsApr 06 2019Apr 13 2019Spectral embedding of adjacency or Laplacian matrices of undirected graphs is a common technique for representing a network in a lower dimensional latent space, with optimal theoretical guarantees. The embedding can be used to estimate the community structure ... More

The population-attributable fraction for time-dependent exposures using dynamic prediction and landmarkingApr 05 2019The public health impact of a harmful exposure can be quantified by the population-attributable fraction (PAF). The PAF describes the attributable risk due to an exposure and is often interpreted as the proportion of preventable cases if the exposure ... More

Inferring the temporal structure of directed functional connectivity in neural systems: some extensions to Granger causalityApr 05 2019Neural processes in the brain operate at a range of temporal scales. Granger causality, the most widely-used neuroscientific tool for inference of directed functional connectivity from neurophysiological data, is traditionally deployed in the form of ... More

Simulation of virtual cohorts increases predictive accuracy of cognitive decline in MCI subjectsApr 05 2019The ability to predict the progression of biomarkers, notably in NDD, is limited by the size of the longitudinal data sets, in terms of number of patients, number of visits per patients and total follow-up time. To this end, we introduce a data augmentation ... More

A Bayesian-Based Approach for Public Sentiment ModelingApr 05 2019Public sentiment is a direct public-centric indicator for the success of effective action planning. Despite its importance, systematic modeling of public sentiment remains untapped in previous studies. This research aims to develop a Bayesian-based approach ... More

A knowledge based spatial model for utilizing point and nested areal observations: A case study of annual runoff predictions in the Voss areaApr 04 2019In this study, annual runoff is estimated by using a Bayesian geostatistical model for interpolating hydrological data of different spatial support. That is, streamflow observations from catchments (areal data), and precipitation and evaporation data ... More

Robust Multi-agent Counterfactual PredictionApr 03 2019We consider the problem of using logged data to make predictions about what would happen if we changed the 'rules of the game' in a multi-agent system. This task is difficult because in many cases we observe actions individuals take but not their private ... More

Robust semiparametric inference for polytomous logistic regression with complex survey designApr 03 2019Analyzing polytomous responses from a complex survey scheme, like stratified or cluster sampling, is crucial in several socio-economic applications. We present a class of minimum quasi weighted density power divergence estimators for the polytomous ... More

Fitting stochastic predator-prey models using both population density and kill rate dataApr 03 2019Most mechanistic predator-prey modelling has involved either parameterization from process rate data or inverse modelling. Here, we take a middle road: we aim at identifying the potential benefits of combining datasets, when both population growth and ... More

Do Hospital Data Breaches Reduce Patient Care Quality?Apr 03 2019Objective: To estimate the relationship between a hospital data breach and hospital quality outcome. Materials and Methods: Hospital data breaches reported to the U.S. Department of Health and Human Services breach portal and the Privacy Rights Clearinghouse ... More

Estimating Chlorophyll a Concentrations of Several Inland Waters with Hyperspectral Data and Machine Learning ModelsApr 03 2019Water is a key component of life, the natural environment and human health. For monitoring the conditions of a water body, the chlorophyll a concentration can serve as a proxy for nutrients and oxygen supply. In situ measurements of water quality parameters ... More

Bayesian Pharmacokinetic Modeling of Dynamic Contrast-Enhanced Magnetic Resonance Imaging: Validation and ApplicationApr 03 2019Tracer-kinetic analysis of dynamic contrast-enhanced magnetic resonance imaging data is commonly performed with the well-known Tofts model and nonlinear least squares (NLLS) regression. This approach yields point estimates of model parameters, uncertainty ... More

Evaluation of a meta-analysis of air quality and heart attacks, a case studyApr 02 2019It is generally acknowledged that claims from observational studies often fail to replicate. An exploratory study was undertaken to assess the reliability of base studies used in meta-analysis of short-term air quality-myocardial infarction risk and to ... More

Causal comparative effectiveness analysis of dynamic continuous-time treatment initiation rules with sparsely measured outcomes and deathApr 02 2019Evidence supporting the current World Health Organization recommendations of early antiretroviral therapy (ART) initiation for adolescents is inconclusive. We leverage large observational data and compare, in terms of mortality and CD4 cell count, the ... More

Bivariate Gaussian models for wind vectors in a distributional regression frameworkApr 02 2019A new probabilistic post-processing method for wind vectors is presented in a distributional regression framework employing the bivariate Gaussian distribution. In contrast to previous studies all parameters of the distribution are simultaneously modeled, ... More

Identification, Interpretability, and Bayesian Word EmbeddingsApr 02 2019Social scientists have recently turned to analyzing text using tools from natural language processing like word embeddings to measure concepts like ideology, bias, and affinity. However, word embeddings are difficult to use in the regression framework ... More

Modeling the Causal Effect of Treatment Initiation Time on Survival: Application to HIV/TB Co-infectionApr 02 2019The timing of antiretroviral therapy (ART) initiation for HIV and tuberculosis (TB) co-infected patients needs to be considered carefully. CD4 cell count can be used to guide decision making about when to initiate ART. Evidence from recent randomized ... More

Direction Selection in Stochastic Directional Distance FunctionsApr 02 2019Researchers rely on the distance function to model multiple product production using multiple inputs. A stochastic directional distance function (SDDF) allows for noise in potentially all input and output variables. Yet, when estimated, the direction ... More

New ITEM response models: application to school bullying dataApr 02 2019School bullying victimization is a variable that cannot be measured directly. Taking into account that this variable has a lower bound, given by the absence of bullying victimization, this paper proposes IRT logistic models, where the latent parameter ... More

An Adapted Geographically Weighted Lasso (Ada-GWL) model for estimating metro ridershipApr 02 2019Ridership estimation at station level plays a critical role in metro transportation planning. Among various existing ridership estimation methods, the direct demand model has been recognized as an effective approach. However, existing direct demand models ... More

Modeling and Analyzing Spatiotemporal Factors Influencing Metro Station Ridership in Taipei: An Approach based on General Estimating EquationApr 02 2019Modeling and analyzing metro station ridership is of great importance to passenger flow management and transportation planning operations, and complex as it is affected by multiple factors, including spatial dependencies (distance, network topology), ... More

Continuous chain-ladder with paid dataApr 02 2019We introduce a model where i.i.d. payments generate the traditional paid run-off triangle. Recent literature explains how claim counts data can be embedded into a continuous chain-ladder model. However, when outstanding claim amounts are to be calculated ... More

Analysis of Large Heterogeneous Repairable System Reliability Data with Static System Attributes and Dynamic Sensor Measurement in Big Data EnvironmentApr 01 2019In a Big Data environment, one pressing challenge facing engineers is to perform reliability analysis for a large fleet of heterogeneous repairable systems with covariates. In addition to static covariates, which include time-invariant system attributes ... More

Gene-based Association Analysis for Bivariate Time-to-event Data through Functional Regression with Copula ModelsApr 01 2019Several gene-based association tests for time-to-event traits have been proposed recently, to detect whether a gene region (containing multiple variants), as a set, is associated with the survival outcome. However, for bivariate survival outcomes, to ... More

Data of low quality is better than no dataApr 01 2019Missing data is not uncommon in empirical software engineering research but a common way to handle it is to remove data completely. We believe that this is wasteful and should not be done out of habit. This paper aims to present a typical case in empirical ... More

Nonparametric Matrix Response Regression with Application to Brain Imaging Data AnalysisMar 31 2019With the rapid growth of neuroimaging technologies, a great effort has been dedicated recently to investigate the dynamic changes in brain activity. Examples include time course calcium imaging and dynamic brain functional connectivity. In this paper, ... More

Bayesian Mixed Effect Sparse Tensor Response Regression Model with Joint Estimation of Activation and ConnectivityMar 30 2019Brain activation and connectivity analyses in task-based functional magnetic resonance imaging (fMRI) experiments with multiple subjects are currently at the forefront of data-driven neuroscience. In such experiments, interest often lies in understanding ... More

Estimation of cell lineage trees by maximum-likelihood phylogeneticsMar 29 2019CRISPR technology has enabled large-scale cell lineage tracing for complex multicellular organisms by mutating synthetic genomic barcodes during organismal development. However, these sophisticated biological tools currently use ad-hoc and outmoded computational ... More

Bayesian prediction of jumps in large panels of time series dataMar 28 2019We take a new look at the problem of disentangling the volatility and jumps processes in a panel of stock daily returns. We first provide an efficient computational framework that deals with the stochastic volatility model with Poisson-driven jumps in ... More

Using Latent Class Analysis to Identify ARDS Sub-phenotypes for Enhanced Machine Learning Predictive PerformanceMar 28 2019In this work, we utilize Machine Learning for early recognition of patients at high risk of acute respiratory distress syndrome (ARDS), which is critical for successful prevention strategies for this devastating syndrome. The difficulty in early ARDS ... More

Forecasting model based on information-granulated GA-SVR and ARIMA for producer price indexMar 28 2019The accuracy of predicting the Producer Price Index (PPI) plays an indispensable role in government economic work. However, it is difficult to forecast the PPI. In our research, we first propose an unprecedented hybrid model based on fuzzy information ... More