Design of Experiments for Model Discrimination Hybridising Analytical and Data-Driven Approaches
Feb 12 2018
Healthcare companies must submit pharmaceutical drugs or medical devices to regulatory bodies before marketing new technology. Regulatory bodies frequently require transparent and interpretable computational modelling to justify a new healthcare technology, ...

Representation and Characterization of Non-Stationary Processes by Dilation Operators and Induced Shape Space Manifolds
Feb 08 2018
We introduce a new vision of stochastic processes through the geometry induced by the dilation. The dilation matrices of a given process, obtained by a composition of rotation matrices, contain the measure information in a condensed way. Particularly ...

Intentional control of type I error over unconscious data distortion: a Neyman-Pearson classification approach
Feb 07 2018
The rise of social media enables millions of citizens to generate information on sensitive political issues and social events, which is scarce in authoritarian countries and is tremendously valuable for surveillance and social studies. In the enormous ...

Random taste heterogeneity in discrete choice models: Flexible nonparametric finite mixture distributions
Feb 07 2018
This study proposes a mixed logit model with multivariate nonparametric finite mixture distributions. The support of the distribution is specified as a high-dimensional grid over the coefficient space, with equal or unequal intervals between successive ...

An MCMC Algorithm for Estimating the Q-matrix in a Bayesian Framework
Feb 07 2018
The purpose of this research is to develop an MCMC algorithm for estimating the Q-matrix. Based on the DINA model, the algorithm starts with estimating correlated attributes. Using a saturated model and a binary decimal conversion, the algorithm transforms ...

Learning Role-based Graph Embeddings
Feb 07 2018
Random walks are at the heart of many existing network embedding methods. However, such algorithms have many limitations that arise from the use of random walks, e.g., the features resulting from these methods are unable to transfer to new nodes and graphs ...

A hierarchical model of non-homogeneous Poisson processes for Twitter retweets
Feb 06 2018
We present a hierarchical model of non-homogeneous Poisson processes (NHPP) for information diffusion on online social media, in particular Twitter retweets. The retweets of each original tweet are modelled by a NHPP, for which the intensity function ...

Testing for equivalence: an intersection-union permutation solution
Feb 06 2018
The notion of testing for equivalence of two treatments is widely used in clinical trials, pharmaceutical experiments, bioequivalence and quality control. It is essentially approached within the intersection-union (IU) principle. According to this principle ...

Simultaneous Selection of Multiple Important Single Nucleotide Polymorphisms in Familial Genome Wide Association Studies Data
Feb 04 2018
We propose a resampling-based fast variable selection technique for selecting important Single Nucleotide Polymorphisms (SNP) in multi-marker mixed effect models used in twin studies. Due to computational complexity, current practice includes testing ...

Zero-adjusted Birnbaum-Saunders regression model
Feb 01 2018
In this paper we introduce the zero-adjusted Birnbaum-Saunders regression model. This new model generalizes at least seven Birnbaum-Saunders regression models. The idea of this modeling is mixing a degenerate distribution at zero with a Birnbaum-Saunders ...

Assessing student's achievement gap between ethnic groups in Brazil
Jan 31 2018
Achievement gaps refer to the difference in the performance on examinations of students belonging to different social groups. Achievement gaps between ethnic groups have been observed in several countries with heterogeneous populations. In this paper, ...

Error estimates for spectral convergence of the graph Laplacian on random geometric graphs towards the Laplace--Beltrami operator
Jan 30 2018
We study the convergence of the graph Laplacian of a random geometric graph generated by an i.i.d. sample from an $m$-dimensional submanifold $M$ in $R^d$ as the sample size $n$ increases and the neighborhood size $h$ tends to zero. We show that eigenvalues ...

Inferência Baseada em Magnitudes na investigação em Ciências do Esporte. A necessidade de romper com os testes de hipótese nula e os valores de p
Jan 30 2018
Research in Sports Sciences is often supported by inferences that declare the value of a statistic statistically significant or nonsignificant on the basis of a P value derived from a null-hypothesis test. Taking into account that studies ...

Reparametrization of COM-Poisson Regression Models with Applications in the Analysis of Experimental Data
Jan 29 2018
In the analysis of count data often the equidispersion assumption is not suitable, hence the Poisson regression model is inappropriate. As a generalization of the Poisson distribution, the COM-Poisson distribution can deal with under-, equi- and overdispersed ...

Methodological variations in lagged regression for detecting physiologic drug effects in EHR data
Jan 26 2018
We studied how lagged linear regression can be used to detect the physiologic effects of drugs from data in the electronic health record (EHR). We systematically examined the effect of methodological variations ((i) time series construction, (ii) temporal ...

21 Million Opportunities: A 19 Facility Investigation of Factors Affecting Hand Hygiene Compliance via Linear Predictive Models
Jan 26 2018
This large-scale study, consisting of 21.3 million hand hygiene opportunities from 19 distinct facilities in 10 different states, uses linear predictive models to expose factors that may affect hand hygiene compliance. We examine the use of features such ...

Pharmacokinetics Simulations for Studying Correlates of Prevention Efficacy of Passive HIV-1 Antibody Prophylaxis in the Antibody Mediated Prevention (AMP) Study
Jan 25 2018
A key objective in two phase 2b AMP clinical trials of VRC01 is to evaluate whether drug concentration over time, as estimated by non-linear mixed effects pharmacokinetics (PK) models, is associated with HIV infection rate. We conducted a simulation study ...

Statistical methods for characterizing transfusion-related changes in regional oxygenation using Near-infrared spectroscopy (NIRS) in preterm infants
Jan 24 2018
Near infrared spectroscopy (NIRS) is an imaging-based diagnostic tool that provides non-invasive and continuous evaluation of regional tissue oxygenation in real-time. In recent years, NIRS has shown promise as a useful monitoring technology to help detect ...

Integrative analysis of time course metabolic data and biomarker discovery
Jan 23 2018
Metabonomics time-course experiments provide the opportunity to understand the changes to an organism by observing the evolution of metabolic profiles in response to internal or external stimuli. Along with other omic longitudinal profiling technologies, ...

MCMC methods for inference in a mathematical model of pulmonary circulation
Jan 23 2018
This study performs parameter inference in a partial differential equations system of pulmonary circulation. We use a fluid dynamics network model that takes selected parameter values and mimics the behaviour of the pulmonary haemodynamics under normal ...

Characterization of Time Series Via Rényi Complexity-Entropy Curves
Jan 17 2018
One of the most useful tools for distinguishing between chaotic and stochastic time series is the so-called complexity-entropy causality plane. This diagram involves two complexity measures: the Shannon entropy and the statistical complexity. Recently, ...

Bayesian Estimation of Gaussian Graphical Models with Projection Predictive Selection
Jan 17 2018; Jan 20 2018
Gaussian graphical models are used for determining conditional relationships between variables. This is accomplished by identifying off-diagonal elements in the inverse-covariance matrix that are non-zero. When the ratio of variables (p) to observations ...

A Semi-Parametric Binning Approach to Quickest Change Detection
Jan 15 2018; Jan 16 2018
The problem of quickest detection of a change in distribution is considered under the assumption that the pre-change distribution is known, and the post-change distribution is only known to belong to a family of distributions distinguishable from a discretized ...

Clinical and Non-clinical Effects on Surgery Duration: Statistical Modeling and Analysis
Jan 12 2018
Surgery duration is usually used as an input to the operation room (OR) allocation and surgery scheduling problems. A good estimation of surgery duration benefits the operation planning in ORs. In contrast, we would like to investigate whether the allocation ...

"Robust-squared" Imputation Models Using BARTJan 09 2018Examples of "doubly robust" estimator for missing data include augmented inverse probability weighting (AIPWT) models (Robins et al., 1994) and penalized splines of propensity prediction (PSPP) models (Zhang and Little, 2009). Doubly-robust estimators ... More

Sales forecasting and risk management under uncertainty in the media industry
Jan 09 2018
In this work we propose a data-driven modelling approach for the management of advertising investments of a firm. First, we propose an application of dynamic linear models to the prediction of an economic variable, such as global sales, which can use ...

Scale-free networks are rare
Jan 09 2018
A central claim in modern network science is that real-world networks are typically "scale free," meaning that the fraction of nodes with degree $k$ follows a power law, decaying like $k^{-\alpha}$, often with $2 < \alpha < 3$. However, empirical evidence ...

Spatial Factor Models for High-Dimensional and Large Spatial Data: An Application in Forest Variable Mapping
Jan 06 2018
Gathering information about forest variables is an expensive and arduous activity. As such, directly collecting the data required to produce high-resolution maps over large spatial domains is infeasible. Next generation collection initiatives of remotely ...

The dynamical structure of political corruption networks
Jan 05 2018
Corruptive behaviour in politics limits economic growth, embezzles public funds, and promotes socio-economic inequality in modern democracies. We analyse well-documented political corruption scandals in Brazil over the past 27 years, focusing on the dynamical ...

Multiple changepoint detection for periodic autoregressive models with an application to river flow analysis
Jan 05 2018
In river flow analysis and forecasting there are some key elements to consider in order to obtain reliable results. For example, seasonality is often accounted for in statistical models because climatic oscillations occurring every year have an obvious ...

Fit to speak - Physical fitness is associated with reduced language decline in healthy ageing
Jan 04 2018
Healthy ageing is associated with decline in cognitive abilities such as language. Aerobic fitness has been shown to ameliorate decline in some cognitive domains, but the potential benefits for language have not been examined. We investigated the relationship ...

Weighted Delta-Tracking with Scattering
Jan 02 2018
In this work, we expand the weighted delta-tracking routine to include a treatment for scattering. The weighted delta-tracking routine adds survival biasing to normal delta-tracking, improving overall problem figure of merit. In the original formulation ...

Variable selection in Functional Additive Regression Models
Jan 02 2018
This paper considers the problem of variable selection when some of the variables have a functional nature and can be mixed with other types of variables (scalar, multivariate, directional, etc.). Our proposal begins with a simple null model and sequentially ...

A Partially Supervised Bayesian Image Classification Model with Applications in Diagnosis of Sentinel Lymph Node Metastases in Breast Cancer
Dec 28 2017
A method has been developed for the analysis of images of sentinel lymph nodes generated by a spectral scanning device. The aim is to classify the nodes, excised during surgery for breast cancer, as normal or metastatic. The data from one node constitute ...

EXONEST: The Bayesian Exoplanetary Explorer
Dec 24 2017
The fields of astronomy and astrophysics are currently engaged in an unprecedented era of discovery as recent missions have revealed thousands of exoplanets orbiting other stars. While the Kepler Space Telescope mission has enabled most of these exoplanets ...

A comprehensive statistical study of metabolic and protein-protein interaction network properties
Dec 20 2017
Understanding the mathematical properties of graphs underlying biological systems could give hints on the evolutionary mechanisms behind these structures. In this article we perform a complete statistical analysis over thousands of graphs representing ...

Towards Personalized Modeling of the Female Hormonal Cycle: Experiments with Mechanistic Models and Gaussian Processes
Nov 30 2017
In this paper, we introduce a novel task for machine learning in healthcare, namely personalized modeling of the female hormonal cycle. The motivation for this work is to model the hormonal cycle and predict its phases in time, both for healthy individuals ...

Binary classification models with "Uncertain" predictions
Nov 27 2017; Dec 04 2017
Binary classification models which can assign probabilities to categories such as "the tissue is 75% likely to be tumorous" or "the chemical is 25% likely to be toxic" are well understood statistically, but their utility as an input to decision making ...

Ensemble-marginalized Kalman filter for linear time-dependent PDEs with noisy boundary conditions: Application to heat transfer in building walls
Nov 26 2017
In this work, we present the ensemble-marginalized Kalman filter (EnMKF), a sequential algorithm analogous to our previously proposed approach [1,2], for estimating the state and parameters of linear parabolic partial differential equations in initial-boundary ...

Sparse and Low-Rank Decomposition for Automatic Target Detection in Hyperspectral Imagery
Nov 24 2017
Given prior information about a target, our goal is to propose a method for automatically separating known targets of interest from the background in hyperspectral imagery. More precisely, we regard the given hyperspectral image (HSI) as being made up of the ...

Deceptiveness of internet data for disease surveillance
Nov 16 2017
Quantifying how many people are or will be sick, and where, is a critical ingredient in reducing the burden of disease because it helps the public health system plan and implement effective outbreak response. This process of disease surveillance is currently ...

Survival analysis of DNA mutation motifs with penalized proportional hazards
Nov 11 2017
Antibodies, an essential part of our immune system, develop in an intricate process to guarantee a broad diversity of antibodies that are able to bind a continually diversifying array of pathogens. This process involves randomly mutating the DNA sequences ...

Differentially Private ANOVA Testing
Nov 03 2017
Modern society generates an incredible amount of data about individuals, and releasing summary statistics about this data in a manner that provably protects individual privacy would offer a valuable resource for researchers in many fields. We present ...

Sophisticated and small versus simple and sizeable: When does it pay off to introduce drifting coefficients in Bayesian VARs?
Nov 01 2017; Nov 29 2017
We assess the relationship between model size and complexity in the time-varying parameter VAR framework via thorough predictive exercises for the Euro Area, the United Kingdom and the United States. It turns out that sophisticated dynamics through drifting ...

Quantifying uncertainty in thermal properties of walls by means of Bayesian inversion
Oct 09 2017
Quantifying the uncertainty from simulations of the energy performance in buildings is crucial for the development of effective policy making aimed at reducing carbon emissions from the built environment. One of the main sources of uncertainty in energy ...

Equilibrium distributions and discrete Schur-constant models
Sep 28 2017
This paper introduces Schur-constant equilibrium distribution models of dimension n for arithmetic non-negative random variables. Such a model is defined through the (several orders) equilibrium distributions of a univariate survival function. First, ...

If and When a Driver or Passenger is Returning to Vehicle: Framework to Infer Intent and Arrival Time
Sep 21 2017
This paper proposes a probabilistic framework for the sequential estimation of the likelihood of a driver or passenger(s) returning to the vehicle and time of arrival, from the available partial track of the user location. The latter can be provided by ...

Weather impacts expressed sentiment
Aug 31 2017
We conduct the largest ever investigation into the relationship between meteorological conditions and the sentiment of human expressions. To do this, we employ over three and a half billion social media posts from tens of millions of individuals from ...

A Connectedness Constraint for Learning Sparse Graphs
Aug 29 2017
Graphs are naturally sparse objects that are used to study many problems involving networks, for example, distributed learning and graph signal processing. In some cases, the graph is not given, but must be learned from the problem and available data. ...

Assigning peaks and modeling ETD in top-down mass spectrometry
Aug 01 2017; Aug 25 2017
Among many techniques of modern mass spectrometry, the top-down methods are becoming steadily more popular in the overall effort to describe the proteome. These techniques are based on fragmentation of ions inside mass spectrometers instead of being ...

Analysis of Deformation Fields in Spatio-temporal CBCT images of lungs for radiotherapy patients
Jul 27 2017
Deformable registration of spatiotemporal Cone-Beam Computed Tomography (CBCT) images taken sequentially during the radiation treatment course yields a deformation field for a pair of images. The Jacobian of this field at any voxel provides a measure ...

Predicting Exploitation of Disclosed Software Vulnerabilities Using Open-source Data
Jul 25 2017
Each year, thousands of software vulnerabilities are discovered and reported to the public. Unpatched known vulnerabilities are a significant security risk. It is imperative that software vendors quickly provide patches once vulnerabilities are known ...

A unified theory for exact stochastic modelling of univariate and multivariate processes with continuous, mixed type, or discrete marginal distributions and any correlation structure
Jul 21 2017
Hydroclimatic processes are characterized by heterogeneous spatiotemporal correlation structures and marginal distributions that can be continuous, mixed-type, discrete or even binary. Simulating exactly such processes can greatly improve hydrological ...

Physics-guided probabilistic modeling of extreme precipitation under climate change
Jul 18 2017
Earth System Models (ESMs) are the state of the art for projecting the effects of climate change. However, longstanding uncertainties in their ability to simulate regional and local precipitation extremes and related processes inhibit decision making. ...

Order Restricted Bayesian Analysis of a Simple Step Stress Model
Jul 15 2017
In this article we consider a simple step stress set up under the cumulative exposure model assumption. At each stress level the lifetime distribution of the experimental units is assumed to be the generalized exponential distribution. We provide ...

Election forensic analysis of the Turkish Constitutional Referendum 2017
Jun 29 2017; Jul 03 2017
With a majority of 'Yes' votes in the Constitutional Referendum of 2017, Turkey continues its transition from democracy to autocracy. By the will of the Turkish people, this referendum transferred practically all executive power to president Erdogan. ...

A Bayesian approach to modeling mortgage default and prepayment
Jun 23 2017
In this paper we present a Bayesian competing risk proportional hazards model to describe mortgage defaults and prepayments. We develop Bayesian inference for the model using Markov chain Monte Carlo methods. Implementation of the model is illustrated ...

A Stochastic Model for Short-Term Probabilistic Forecast of Solar Photo-Voltaic Power
Jun 16 2017; Sep 16 2017
In this paper, a stochastic model with regime switching is developed for solar photo-voltaic (PV) power in order to provide short-term probabilistic forecasts. The proposed model for solar PV power is physics inspired and explicitly incorporates the stochasticity ...

Fully-Automatic Multiresolution Idealization for Filtered Ion Channel Recordings: Flickering Event Detection
Jun 12 2017
We propose a new non-parametric segmentation method, JULES, which combines recent statistical multiresolution techniques with local deconvolution for idealization of ion channel recordings. The multiresolution criterion takes into account scales up to ...

Comparison of Decision Tree Based Classification Strategies to Detect External Chemical Stimuli from Raw and Filtered Plant Electrical Response
May 13 2017
Plants monitor their surrounding environment and control their physiological functions by producing an electrical response. We recorded electrical signals from different plants by exposing them to Sodium Chloride (NaCl), Ozone (O3) and Sulfuric Acid (H2SO4) ...

Sparse Bayesian vector autoregressions in huge dimensions
Apr 11 2017
We develop a Bayesian vector autoregressive (VAR) model that is capable of handling vast dimensional information sets. Three features are introduced to permit reliable estimation of the model. First, we assume that the reduced-form errors in the VAR feature ...

Probabilistic Mid- and Long-Term Electricity Price Forecasting
Mar 31 2017
The liberalization of electricity markets and the development of renewable energy sources have led to new challenges for decision makers. These challenges are accompanied by an increasing uncertainty about future electricity price movements. The increasing ...

Quantifying and suppressing ranking bias in a large citation network
Mar 23 2017
It is widely recognized that citation counts for papers from different fields cannot be directly compared because different scientific fields adopt different citation practices. Citation counts are also strongly biased by paper age since older papers ...

SCALPEL: Extracting Neurons from Calcium Imaging Data
Mar 20 2017
In the past few years, new technologies in the field of neuroscience have made it possible to simultaneously image activity in large populations of neurons at cellular resolution in behaving animals. In mid-2016, a huge repository of this so-called "calcium ...

Predicting with limited data - Increasing the accuracy in VIS-NIR diffuse reflectance spectroscopy by SMOTE
Mar 15 2017
Diffuse reflectance spectroscopy is a powerful technique to predict soil properties. It can be used in situ to provide data inexpensively and rapidly compared to the standard laboratory measurements. Because most spectral data bases contain air-dried ...

Importance sampling with transformed weights
Feb 07 2017; Apr 20 2017
The importance sampling (IS) method lies at the core of many Monte Carlo-based techniques. IS allows the approximation of a target probability distribution by drawing samples from a proposal (or importance) distribution, different from the target, and ...

Phenomenological forecasting of disease incidence using heteroskedastic Gaussian processes: a dengue case study
Feb 01 2017; Aug 01 2017
In 2015 the US federal government sponsored a dengue forecasting competition using historical case data from Iquitos, Peru and San Juan, Puerto Rico. Competitors were evaluated on several aspects of out-of-sample forecasts including the targets of peak ...

Bayesian log-Gaussian Cox process regression with applications to fMRI meta-analysis
Jan 10 2017
A typical neuroimaging study will produce a 3D brain statistic image that summarises the evidence for activation during the experiment. However, for practical reasons those images are rarely published; instead, authors only report the (x,y,z) locations ...

Similarity solutions of Fokker-Planck equation with time-dependent coefficients and fixed/moving boundaries
Dec 27 2016
We consider the solvability of the Fokker-Planck equation with both time-dependent drift and diffusion coefficients by means of the similarity method. By the introduction of the similarity variable, the Fokker-Planck equation is reduced to an ordinary ...

Advances in using Internet searches to track dengue
Dec 08 2016
Dengue is a mosquito-borne disease that threatens more than half of the world's population. Despite being endemic to over 100 countries, government-led efforts and mechanisms to timely identify and track the emergence of new infections are still lacking ...

A Nonparametric Bayesian Basket Trial Design
Dec 08 2016
Targeted therapies on the basis of genomic aberrations analysis of the tumor have become a mainstream direction of cancer prognosis and treatment. Regardless of cancer type, trials that match patients to targeted therapies for their particular genomic ...

Endogenous and Exogenous Effects in Contagion and Diffusion Models of Terrorist Activity
Dec 08 2016
Variation in rates of terrorist activity over time is explained via contagion or diffusion. Models for social contagion and diffusion are shown to be cases of the cluster process representation of the Hawkes self-exciting process model. Contagion and ...

Bridging Medical Data Inference to Achilles Tendon Rupture Rehabilitation
Dec 07 2016
Imputing incomplete medical tests and predicting patient outcomes are crucial for guiding the decision making for therapy, such as after an Achilles Tendon Rupture (ATR). We formulate the problem of data imputation and prediction for ATR relevant medical ...

Demographical Priors for Health Conditions Diagnosis Using Medicare Data
Dec 07 2016
This paper presents an example of how demographical characteristics of patients influence their susceptibility to certain medical conditions. In this paper, we investigate the association of health conditions to age of patients in a heterogeneous population. ...

A Model-Based Approach to Wildland Fire Reconstruction Using Sediment Charcoal Records
Dec 07 2016
Lake sediment charcoal records are used in paleoecological analyses to reconstruct fire history including the identification of past wildland fires. One challenge of applying sediment charcoal records to infer fire history is the separation of charcoal ...

Efficient Construction of Test-Inversion Confidence Intervals Using Quantile Regression, With Application To Population Genetics
Dec 07 2016
Modern problems in statistics tend to include estimators of high computational complexity and with complicated distributions. Statistical inference on such estimators usually relies on asymptotic normality assumptions; however, such assumptions are often ...

Generalized Exponential smoothing in prediction of hierarchical time series
Dec 07 2016
Shang and Hyndman (2016) proposed a grouped functional time series forecasting approach as a combination of individual forecasts using generalized least squares regression. We modify their methodology using a generalized exponential smoothing technique for ...

Tensor-Based Fusion of EEG and FMRI to Understand Neurological Changes in Schizophrenia
Dec 07 2016
Neuroimaging modalities such as functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) provide information about neurological functions in complementary spatiotemporal resolutions; therefore, fusion of these modalities is expected ...

Predictive Business Process Monitoring with LSTM Neural Networks
Dec 07 2016
Predictive business process monitoring methods exploit logs of completed cases of a process in order to make predictions about running cases thereof. Existing methods in this space are tailor-made for specific prediction tasks. Moreover, their relative ...

Impossible Inference in Econometrics: Theory and Applications to Regression Discontinuity, Bunching, and Exogeneity Tests
Dec 06 2016
This paper presents necessary and sufficient conditions for tests to have trivial power. By inverting these impractical tests, we demonstrate that the bounded confidence regions have error probability equal to one. This theoretical framework establishes ...

Robust Calibration of Radio Interferometers in Non-Gaussian Environment
Dec 06 2016
The development of new phased array systems in radio astronomy, such as the low frequency array (LOFAR) and the square kilometre array (SKA), formed of a large number of small and flexible elementary antennas, has led to significant challenges. Among them, ...

Method for estimating cycle lengths from multidimensional time series: Test cases and application to a massive "in silico" dataset
Dec 06 2016
Many real world systems exhibit cyclic behavior that is, for example, due to the nearly harmonic oscillations being perturbed by the strong fluctuations present in the regime of significant non-linearities. For the investigation of such systems special ...

Peaks over thresholds modelling with multivariate generalized Pareto distributions
Dec 06 2016
The multivariate generalized Pareto distribution arises as the limit of a suitably normalized vector conditioned upon at least one component of that vector being extreme. Statistical modelling using multivariate generalized Pareto distributions constitutes ...

Peaks over thresholds modelling with multivariate generalized Pareto distributions
Dec 06 2016; Feb 06 2018
When assessing the impact of extreme events, it is often not just a single component, but the combined behaviour of several components which is important. Statistical modelling using multivariate generalized Pareto (GP) distributions constitutes the multivariate ...

Empirical Bayes Methods for Prior Estimation in Systems Medicine
Dec 05 2016
One of the main goals of mathematical modeling in systems medicine related to medical applications is to obtain patient-specific parameterizations and model predictions. In clinical practice, however, the number of available measurements for single patients ...

Ranking Biomarkers Through Mutual Information
Dec 05 2016
We study information theoretic methods for ranking biomarkers. In clinical trials there are two, closely related, types of biomarkers: predictive and prognostic, and disentangling them is a key challenge. Our first step is to phrase biomarker ranking ...

A Novel Approach for Big Data Analytics in Future Grids Based on Free Probability
Dec 04 2016
Based on the random matrix model, we can build statistical models using massive datasets across the power grid, and employ hypothesis testing for anomaly detection. First, the aim of this paper is to make the first attempt to apply the recent free probability ...

Modeling trajectories of mental health: challenges and opportunities
Dec 04 2016
More than two thirds of mental health problems have their onset during childhood or adolescence. Identifying children at risk for mental illness later in life and predicting the type of illness is not easy. We set out to develop a platform to define subtypes ...

Nonparametric Bayes Models of Fiber Curves Connecting Brain Regions
Dec 03 2016
In studying structural inter-connections in the human brain, it is common to first estimate fiber bundles connecting different regions of the brain relying on diffusion MRI. These fiber bundles act as highways for neural activity and communication, snaking ...

Not Normal: the uncertainties of scientific measurements
Dec 02 2016
Judging the significance and reproducibility of quantitative research requires a good understanding of relevant uncertainties, but it is often unclear how well these have been evaluated and what they imply. Reported scientific uncertainties were studied ...

Inferring Ice Thickness from a Glacier Dynamics Model and Multiple Surface Datasets
Dec 02 2016
The future behavior of the West Antarctic Ice Sheet (WAIS) may have a major impact on future climate. For instance, ice sheet melt may contribute significantly to global sea level rise. Understanding the current state of WAIS is therefore of great interest. ...

A Bayesian Heteroscedastic GLM with Application to fMRI Data with Motion Spikes
Dec 02 2016
We propose a voxel-wise general linear model with autoregressive noise and heteroscedastic noise innovations (GLMH) for analyzing functional magnetic resonance imaging (fMRI) data. The model is analyzed from a Bayesian perspective and has the benefit ...

Voxelwise nonlinear regression toolbox for neuroimage analysis: Application to aging and neurodegenerative disease modeling
Dec 02 2016
This paper describes a new neuroimaging analysis toolbox that allows for the modeling of nonlinear effects at the voxel level, overcoming limitations of methods based on linear models like the GLM. We illustrate its features using a relevant example in ...

Survival Prediction with Limited Features: a Top Performing Approach from the DREAM ALS Stratification Prize4Life Challenge
Dec 02 2016
Survival prediction with small sets of features is a highly relevant topic for decision-making in clinical practice. I describe a method for predicting survival of amyotrophic lateral sclerosis (ALS) patients that was developed as a submission to the ...

Development of a hybrid learning system based on SVM, ANFIS and domain knowledge: DKFIS
Dec 02 2016
This paper presents the development of a hybrid learning system based on Support Vector Machines (SVM), Adaptive Neuro-Fuzzy Inference System (ANFIS) and domain knowledge to solve prediction problems. The proposed two-stage Domain Knowledge based Fuzzy ...

A novel multiclassSVM based framework to classify lithology from well logs: a real-world application
Dec 02 2016
Support vector machines (SVMs) have been recognized as a potential tool for supervised classification analyses in different domains of research. In essence, SVM is a binary classifier. Therefore, in the case of a multiclass problem, the problem is divided ...

A One class Classifier based Framework using SVDD : Application to an Imbalanced Geological Dataset
Dec 02 2016
Evaluation of a hydrocarbon reservoir requires classification of petrophysical properties from the available dataset. However, characterization of reservoir attributes is difficult due to the nonlinear and heterogeneous nature of the subsurface physical properties. ...

A Bayesian Approach to Predicting Disengaged Youth
Dec 02 2016; Dec 07 2016
This article presents a Bayesian approach for predicting and identifying the factors which most influence an individual's propensity to fall into the category of Not in Employment Education or Training (NEET). The approach partitions the covariates into ...

Multibrand geographic experiments
Dec 01 2016
In a geographic experiment to measure advertising effectiveness, some regions (hereafter GEOs) get increased advertising while others do not. This paper looks at running $B>1$ such experiments simultaneously on $B$ different brands in $G$ GEOs, and then ...