Latest in cs.lg

Online Sampling from Log-Concave Distributions (Feb 21 2019): Given a sequence of convex functions $f_0, f_1, \ldots, f_T$, we study the problem of sampling from the Gibbs distribution $\pi_t \propto e^{-\sum_{k=0}^t f_k}$ for each epoch $t$ in an online manner. This problem occurs in applications to machine learning, ...
Learned Step Size Quantization (Feb 21 2019): We present here Learned Step Size Quantization, a method for training deep networks such that they can run at inference time using low precision integer matrix multipliers, which offer power and space advantages over high precision alternatives. The essence ...
Deep CNN-based Speech Balloon Detection and Segmentation for Comic Books (Feb 21 2019): We develop a method for the automated detection and segmentation of speech balloons in comic books, including their carrier and tails. Our method is based on a deep convolutional neural network that was trained on annotated pages of the Graphic Narrative ...
Domain Partitioning Network (Feb 21 2019): Standard adversarial training involves two agents, namely a generator and a discriminator, playing a mini-max game. However, even if the players converge to an equilibrium, the generator may only recover a part of the target data distribution, in a situation ...
A Mean Field Theory of Batch Normalization (Feb 21 2019): We develop a mean field theory for batch normalization in fully-connected feedforward neural networks. In so doing, we provide a precise characterization of signal propagation and gradient backpropagation in wide batch-normalized networks at initialization. ...
Statistics and Samples in Distributional Reinforcement Learning (Feb 21 2019): We present a unifying framework for designing and analysing distributional reinforcement learning (DRL) algorithms in terms of recursively estimating statistics of the return distribution. Our key insight is that DRL algorithms can be decomposed as the ...
Deep Learning Multidimensional Projections (Feb 21 2019): Dimensionality reduction methods, also known as projections, are frequently used for exploring multidimensional data in machine learning, data science, and information visualization. Among these, t-SNE and its variants have become very popular for their ...
Adversarial Augmentation for Enhancing Classification of Mammography Images (Feb 20 2019): Supervised deep learning relies on the assumption that enough training data is available, which presents a problem for its application to several fields, like medical imaging. On the example of a binary image classification task (breast cancer recognition), ...
LOSSGRAD: automatic learning rate in gradient descent (Feb 20 2019): In this paper, we propose a simple, fast, and easy-to-implement algorithm, LOSSGRAD (locally optimal step-size in gradient descent), which automatically modifies the step-size in gradient descent during neural network training. Given a function $f$, a ...
Noisy multi-label semi-supervised dimensionality reduction (Feb 20 2019): Noisy labeled data represent a rich source of information that often are easily accessible and cheap to obtain, but label noise might also have many negative consequences if not accounted for. How to fully utilize noisy labels has been studied extensively ...
A Note on Bounding Regret of the C$^2$UCB Contextual Combinatorial Bandit (Feb 20 2019): We revisit the proof by Qin et al. (2014) of bounded regret of the C$^2$UCB contextual combinatorial bandit. We demonstrate an error in the proof of volumetric expansion of the moment matrix, used in upper bounding a function of context vector norms. ...
Shallow Learning for Fluid Flow Reconstruction with Limited Sensors and Limited Data (Feb 20 2019): In many applications, it is important to reconstruct a fluid flow field, or some other high-dimensional state, from limited measurements and limited data. In this work, we propose a shallow neural network-based learning methodology for such fluid flow ...
A Random Subspace Technique That Is Resistant to a Limited Number of Features Corrupted by an Adversary (Feb 19 2019): In this paper, we consider batch supervised learning where an adversary is allowed to corrupt instances with arbitrarily large noise. The adversary is allowed to corrupt any $l$ features in each instance and the adversary can change their values in any ...
Fast Neural Network Verification via Shadow Prices (Feb 19 2019): To use neural networks in safety-critical settings it is paramount to provide assurances on their runtime operation. Recent work on ReLU networks has sought to verify whether inputs belonging to a bounded box can ever yield some undesirable output. Input-splitting ...
Graph Neural Networks for Social Recommendation (Feb 19 2019): In recent years, Graph Neural Networks (GNNs), which can naturally integrate node information and topological structure, have been demonstrated to be powerful in learning on graph data. These advantages of GNNs provide great potential to advance social ...
Feature Selection for Better Spectral Characterization or: How I Learned to Start Worrying and Love Ensembles (Feb 19 2019): An ever-looming threat to astronomical applications of machine learning is the danger of over-fitting data, also known as the 'curse of dimensionality.' This occurs when there are fewer samples than the number of independent variables. In this work, we ...
A spelling correction model for end-to-end speech recognition (Feb 19 2019): Attention-based sequence-to-sequence models for speech recognition jointly train an acoustic model, language model (LM), and alignment mechanism using a single neural network and require only parallel audio-text pairs. Thus, the language model component ...
Simplifying Graph Convolutional Networks (Feb 19 2019): Graph Convolutional Networks (GCNs) and their variants have experienced significant attention and have become the de facto methods for learning graph representations. GCNs derive inspiration primarily from recent deep learning approaches, and as a result, ...
Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network (Feb 19 2019): Adaptive gradient methods like AdaGrad are widely used in optimizing neural networks. Yet, existing convergence guarantees for adaptive gradient methods require either convexity or smoothness, and, in the smooth setting, only guarantee convergence to ...
Adaptive Cross-Modal Few-Shot Learning (Feb 19 2019): Metric-based meta-learning techniques have successfully been applied to few-shot classification problems. However, leveraging cross-modal information in a few-shot setting has yet to be explored. When the support from visual information is limited in ...
Investigating Generalisation in Continuous Deep Reinforcement Learning (Feb 19 2019; Feb 20 2019): Deep Reinforcement Learning has shown great success in a variety of control tasks. However, it is unclear how close we are to the vision of putting Deep RL into practice to solve real world problems. In particular, common practice in the field is to train ...
Evaluating model calibration in classification (Feb 19 2019): Probabilistic classifiers output a probability distribution on target classes rather than just a class prediction. Besides providing a clear separation of prediction and decision making, the main advantage of probabilistic models is their ability to represent ...
DEDPUL: Method for Mixture Proportion Estimation and Positive-Unlabeled Classification based on Density Estimation (Feb 19 2019): This paper studies Positive-Unlabeled Classification, the problem of semi-supervised binary classification in the case when the Negative (N) class in the training set is contaminated with instances of the Positive (P) class. We develop a novel method (DEDPUL) ...
On the Convergence of EM for truncated mixtures of two Gaussians (Feb 19 2019): Motivated by a recent result of Daskalakis et al. [DGTZ18], we analyze the population version of the Expectation-Maximization (EM) algorithm for the case of truncated mixtures of two Gaussians. Truncated samples from a $d$-dimensional mixture ...
Proper-Composite Loss Functions in Arbitrary Dimensions (Feb 19 2019): The study of a machine learning problem is in many ways difficult to separate from the study of the loss function being used. One avenue of inquiry has been to look at these loss functions in terms of their properties as scoring rules via the proper-composite ...
Hyperbolic Discounting and Learning over Multiple Horizons (Feb 19 2019; Feb 20 2019): Reinforcement learning (RL) typically defines a discount factor as part of the Markov Decision Process. The discount factor values future rewards by an exponential scheme that leads to theoretical convergence guarantees of the Bellman equation. However, ...
Graph Dynamical Networks: Unsupervised Learning of Atomic Scale Dynamics in Materials (Feb 18 2019): Understanding the dynamical processes that govern the performance of functional materials is essential for the design of next generation materials to tackle global energy and environmental challenges. Many of these processes involve the dynamics of individual ...
Democratisation of Usable Machine Learning in Computer Vision (Feb 18 2019): Many industries are now investing heavily in data science and automation to replace manual tasks and/or to help with decision making, especially in the realm of leveraging computer vision to automate many monitoring, inspection, and surveillance tasks. ...
Using Machine Learning to Guide Cognitive Modeling: A Case Study in Moral Reasoning (Feb 18 2019): Large-scale behavioral datasets enable researchers to use complex machine learning algorithms to better predict human behavior, yet this increased predictive power does not always lead to a better understanding of the behavior in question. In this paper, ...
DIViS: Domain Invariant Visual Servoing for Collision-Free Goal Reaching (Feb 18 2019): Robots should understand both semantics and physics to be functional in the real world. While robot platforms provide means for interacting with the physical world, they cannot autonomously acquire object-level semantics without human assistance. In this ...
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent (Feb 18 2019): A longstanding goal in deep learning research has been to precisely characterize training and generalization. However, the often complex loss landscapes of neural networks have made a theory of learning dynamics elusive. In this work, we show that for ...
A parallel Fortran framework for neural networks and deep learning (Feb 18 2019): This paper describes neural-fortran, a parallel Fortran framework for neural networks and deep learning. It features a simple interface to construct feed-forward neural networks of arbitrary structure and size, several activation functions, and stochastic ...
Incremental Cluster Validity Indices for Hard Partitions: Extensions and Comparative Study (Feb 18 2019): Validation is one of the most important aspects of clustering, but most approaches have been batch methods. Recently, interest has grown in providing incremental alternatives. This paper extends the incremental cluster validity index (iCVI) family to ...
On Evaluating Adversarial Robustness (Feb 18 2019): Correctly evaluating defenses against adversarial examples has proven to be extremely difficult. Despite the significant amount of recent work attempting to design defenses that withstand adaptive attacks, few have succeeded; most papers that propose ...
RACE: Sub-Linear Memory Sketches for Approximate Near-Neighbor Search on Streaming Data (Feb 18 2019): We demonstrate the first possibility of a sub-linear memory sketch for solving the approximate near-neighbor search problem. In particular, we develop an online sketching algorithm that can compress $N$ vectors into a tiny sketch consisting of small arrays ...
Generative Adversarial Networks Synthesize Realistic OCT Images of the Retina (Feb 18 2019): We report, to our knowledge, the first end-to-end application of Generative Adversarial Networks (GANs) towards the synthesis of Optical Coherence Tomography (OCT) images of the retina. Generative models have gained recent attention for the increasingly ...
Going deep in clustering high-dimensional data: deep mixtures of unigrams for uncovering topics in textual data (Feb 18 2019): Mixtures of Unigrams (Nigam et al., 2000) are one of the simplest and most efficient tools for clustering textual data, as they assume that documents related to the same topic have similar distributions of terms, naturally described by Multinomials. When ...
Fast Efficient Hyperparameter Tuning for Policy Gradients (Feb 18 2019): The performance of policy gradient methods is sensitive to hyperparameter settings that must be tuned for any new application. Widely used grid search methods for tuning hyperparameters are sample inefficient and computationally expensive. More advanced ...
STCN: Stochastic Temporal Convolutional Networks (Feb 18 2019): Convolutional architectures have recently been shown to be competitive on many sequence modelling tasks when compared to the de-facto standard of recurrent neural networks (RNNs), while providing computational and modeling advantages due to inherent parallelism. ...
Intra- and Inter-epoch Temporal Context Network (IITNet) for Automatic Sleep Stage Scoring (Feb 18 2019): This study proposes a novel deep learning model, called IITNet, to learn intra- and inter-epoch temporal contexts from a raw single channel electroencephalogram (EEG) for automatic sleep stage scoring. When sleep experts identify the sleep stage of a ...
LocalNorm: Robust Image Classification through Dynamically Regularized Normalization (Feb 18 2019; Feb 19 2019): While modern convolutional neural networks achieve outstanding accuracy on many image classification tasks, they are, compared to humans, much more sensitive to image degradation. Here, we describe a variant of Batch Normalization, LocalNorm, that regularizes ...
Optimizing Stochastic Gradient Descent in Text Classification Based on Fine-Tuning Hyper-Parameters Approach. A Case Study on Automatic Classification of Global Terrorist Attacks (Feb 18 2019): The objective of this research is to enhance the performance of the Stochastic Gradient Descent (SGD) algorithm in text classification. In our research, we propose using SGD learning with a Grid-Search approach to fine-tune hyper-parameters in order to enhance ...
Message-Dropout: An Efficient Training Method for Multi-Agent Deep Reinforcement Learning (Feb 18 2019): In this paper, we propose a new learning technique named message-dropout to improve the performance for multi-agent deep reinforcement learning under two application scenarios: 1) classical multi-agent reinforcement learning with direct message communication ...
Prediction of Porosity and Permeability Alteration based on Machine Learning Algorithms (Feb 18 2019): The objective of this work is to study the applicability of various Machine Learning algorithms for prediction of some rock properties which geoscientists usually determine through special lab analysis. We demonstrate that these special properties can be ...
Designing recurrent neural networks by unfolding an l1-l1 minimization algorithm (Feb 18 2019): We propose a new deep recurrent neural network (RNN) architecture for sequential signal reconstruction. Our network is designed by unfolding the iterations of the proximal gradient method that solves the l1-l1 minimization problem. As such, our network ...
Grids versus Graphs: Partitioning Space for Improved Taxi Demand-Supply Forecasts (Feb 18 2019): Accurate taxi demand-supply forecasting is a challenging application of ITS (Intelligent Transportation Systems), due to the complex spatial and temporal patterns. We investigate the impact of different spatial partitioning techniques on the prediction ...
Structural Recurrent Neural Network for Traffic Speed Prediction (Feb 18 2019): Deep neural networks have recently demonstrated traffic prediction capability with the time series data obtained by sensors mounted on road segments. However, capturing spatio-temporal features of the traffic data often requires a significant number ...
Differentially Private Continual Learning (Feb 18 2019): Catastrophic forgetting can be a significant problem for institutions that must delete historic data for privacy reasons. For example, hospitals might not be able to retain patient data permanently. But neural networks trained on recent data alone will ...
Learning Compositional Representations of Interacting Systems with Restricted Boltzmann Machines: Comparative Study of Lattice Proteins (Feb 18 2019): A Restricted Boltzmann Machine (RBM) is an unsupervised machine-learning bipartite graphical model that jointly learns a probability distribution over data and extracts their relevant statistical features. As such, RBMs were recently proposed for characterizing ...
A Unifying Bayesian View of Continual Learning (Feb 18 2019): Some machine learning applications require continual learning - where data comes in a sequence of datasets, each is used for training and then permanently discarded. From a Bayesian perspective, continual learning seems straightforward: Given the model ...
Beyond the Memory Wall: A Case for Memory-centric HPC System for Deep Learning (Feb 18 2019): As the models and the datasets to train deep learning (DL) models scale, system architects are faced with new challenges, one of which is the memory capacity bottleneck, where the limited physical memory inside the accelerator device constrains the algorithm ...
Optimized data exploration applied to the simulation of a chemical process (Feb 18 2019): In complex simulation environments, certain parameter space regions may result in non-convergent or unphysical outcomes. All parameters can therefore be labeled with a binary class describing whether or not they lead to valid results. In general, it can ...
CBOW Is Not All You Need: Combining CBOW with the Compositional Matrix Space Model (Feb 18 2019): Continuous Bag of Words (CBOW) is a powerful text embedding method. Due to its strong capabilities to encode word content, CBOW embeddings perform well on a wide range of downstream tasks while being efficient to compute. However, CBOW is not capable ...
AuxBlocks: Defense Adversarial Example via Auxiliary Blocks (Feb 18 2019): Deep learning models are vulnerable to adversarial examples, which poses an indisputable threat to their applications. However, recent studies observe gradient-masking defenses are self-deceiving methods if an attacker can realize this defense. In this ...
Periocular Recognition in the Wild with Orthogonal Combination of Local Binary Coded Pattern in Dual-stream Convolutional Neural Network (Feb 18 2019): In spite of the advancements made in periocular recognition, the dataset and periocular recognition in the wild remain a challenge. In this paper, we propose a multilayer fusion approach by means of a pair of shared parameters (dual-stream) convolutional ...
Detecting and Diagnosing Incipient Building Faults Using Uncertainty Information from Deep Neural Networks (Feb 18 2019): Early detection of incipient faults is of vital importance to reducing maintenance costs, saving energy, and enhancing occupant comfort in buildings. Popular supervised learning models such as deep neural networks are considered promising due to their ...
A One-Class Support Vector Machine Calibration Method for Time Series Change Point Detection (Feb 18 2019): It is important to identify the change point of a system's health status, which usually signifies an incipient fault under development. The One-Class Support Vector Machine (OC-SVM) is a popular machine learning model for anomaly detection and hence could ...
Distributed Learning for Channel Allocation Over a Shared Spectrum (Feb 17 2019; Feb 19 2019): Channel allocation is the task of assigning channels to users such that some objective (e.g., sum-rate) is maximized. In centralized networks such as cellular networks, this task is carried by the base station which gathers the channel state information ...
Learning to Infer Program Sketches (Feb 17 2019): Our goal is to build systems which write code automatically from the kinds of specifications humans can most easily provide, such as examples and natural language instruction. The key idea of this work is that a flexible combination of pattern recognition ...
Quantized Frank-Wolfe: Communication-Efficient Distributed Optimization (Feb 17 2019): How can we efficiently mitigate the overhead of gradient communications in distributed optimization? This problem is at the heart of training scalable machine learning models and has been mainly studied in the unconstrained setting. In this paper, we ...
Towards Improved Testing For Deep Learning (Feb 17 2019): The growing use of deep neural networks in safety-critical applications makes it necessary to carry out adequate testing to detect and correct any incorrect behavior for corner case inputs before they can be actually used. Deep neural networks lack an ...
Attention-Based Prototypical Learning Towards Interpretable, Confident and Robust Deep Neural Networks (Feb 17 2019): We propose a new framework for prototypical learning that bases decision-making on few relevant examples that we call prototypes. Our framework utilizes an attention mechanism that relates the encoded representations to determine the prototypes. This ...
Neural Network-Based Dynamic Threshold Detection for Non-Volatile Memories (Feb 17 2019): The unknown channel offset induced by memory physics is a critical and difficult issue to be tackled for many non-volatile memories (NVMs). In this paper, we first propose novel neural network (NN) detectors by using the multilayer perceptron (MLP) ...
Semiparametric correction for endogenous truncation bias with Vox Populi based participation decision (Feb 17 2019): We synthesize the knowledge present in various scientific disciplines for the development of a semiparametric endogenous truncation-proof algorithm, correcting for truncation bias due to endogenous self-selection. This synthesis enriches the algorithm's ...
A semi-supervised deep residual network for mode detection in Wi-Fi signals (Feb 17 2019): Due to their ubiquitous and pervasive nature, Wi-Fi networks have the potential to collect large-scale, low-cost, and disaggregate data on multimodal transportation. In this study, we develop a semi-supervised deep residual network (ResNet) framework ...
ODIN: ODE-Informed Regression for Parameter and State Inference in Time-Continuous Dynamical Systems (Feb 17 2019): Parameter inference in ordinary differential equations is an important problem in many applied sciences and in engineering, especially in a data-scarce setting. In this work, we introduce a novel generative modeling approach based on constrained Gaussian ...
Deep-learning inversion: a next generation seismic velocity-model building method (Feb 17 2019): Seismic velocity is one of the most important parameters used in seismic exploration. Accurate velocity models are key prerequisites for reverse-time migration and other high-resolution seismic imaging techniques. Such velocity information has traditionally ...
A new Potential-Based Reward Shaping for Reinforcement Learning Agent (Feb 17 2019): Potential-based reward shaping (PBRS) is a particular category of machine learning methods which aims to improve the learning speed of a reinforcement learning agent by extracting and utilizing extra knowledge while performing a task. There are two steps ...
Multiple Document Representations from News Alerts for Automated Bio-surveillance Event Detection (Feb 17 2019): Due to globalization, geographic boundaries no longer serve as effective shields for the spread of infectious diseases. In order to aid bio-surveillance analysts in disease tracking, recent research has been devoted to developing information retrieval ...
Detecting Colorized Images via Convolutional Neural Networks: Toward High Accuracy and Good Generalization (Feb 17 2019): Image colorization achieves more and more realistic results with the increasing computation power of recent deep learning techniques. It is becoming more difficult for human eyes to identify fake colorized images. In this work, we propose a novel forensic ...
Context-Based Dynamic Pricing with Online Clustering (Feb 17 2019): We consider a context-based dynamic pricing problem of online products which have low sales. Sales data from Alibaba, a major global online retailer, illustrate the prevalence of low-sale products. For these products, existing single-product dynamic pricing ...
WiSE-VAE: Wide Sample Estimator VAE (Feb 16 2019; Feb 19 2019): Variational Auto-encoders (VAEs) have been very successful as methods for forming compressed latent representations of complex, often high-dimensional, data. In this paper, we derive an alternative variational lower bound from the one common in VAEs, ...
Faster Gradient-Free Proximal Stochastic Methods for Nonconvex Nonsmooth Optimization (Feb 16 2019): The proximal gradient method has played an important role in solving many machine learning tasks, especially nonsmooth problems. However, in some machine learning problems such as the bandit model and the black-box learning problem, proximal gradient ...
A Little Is Enough: Circumventing Defenses For Distributed Learning (Feb 16 2019): Distributed learning is central for large-scale training of deep-learning models. However, it is exposed to a security threat in which Byzantine participants can interrupt or control the learning process. Previous attack models and their corresponding ...
Deep Convolutional Sum-Product Networks for Probabilistic Image Representations (Feb 16 2019): Sum-Product Networks (SPNs) are hierarchical probabilistic graphical models capable of fast and exact inference. Applications of SPNs to real-world data such as large image datasets have been fairly limited in previous literature. We introduce Convolutional ...
Making Convex Loss Functions Robust to Outliers using $e$-Exponentiated Transformation (Feb 16 2019): In this paper, we propose a novel $e$-exponentiated transformation, $0.5 < e < 1$, for loss functions. When the transformation is applied to a convex loss function, the transformed loss function enjoys the following desirable property: for a one-layer network, ...
Screening Rules for Lasso with Non-Convex Sparse Regularizers (Feb 16 2019): Leveraging the convexity of the Lasso problem, screening rules help in accelerating solvers by discarding irrelevant variables during the optimization process. However, because they provide better theoretical guarantees in identifying relevant variables, ...
On Privacy-preserving Decentralized Optimization through Alternating Direction Method of Multipliers (Feb 16 2019): Privacy concerns with sensitive data in machine learning are receiving increasing attention. In this paper, we study privacy-preserving distributed learning under the framework of Alternating Direction Method of Multipliers (ADMM). While secure distributed ...
Differentiable reservoir computing (Feb 16 2019): Much effort has been devoted in the last two decades to characterizing the situations in which a reservoir computing system exhibits the so-called echo state and fading memory properties. These important features amount, in mathematical terms, to the existence ...
Re-determinizing Information Set Monte Carlo Tree Search in Hanabi (Feb 16 2019): This technical report documents the winner of the Computational Intelligence in Games (CIG) 2018 Hanabi competition. We introduce Re-determinizing IS-MCTS, a novel extension of Information Set Monte Carlo Tree Search (IS-MCTS) [IS-MCTS] that prevents ...
RES-SE-NET: Boosting Performance of Resnets by Enhancing Bridge-connections (Feb 16 2019): One of the ways to train deep neural networks effectively is to use residual connections. Residual connections can be classified as being either identity connections or bridge-connections with a reshaping convolution. Empirical observations on CIFAR-10 ...
Combination of Domain Knowledge and Deep Learning for Sentiment Analysis of Short and Informal Messages on Social Media (Feb 16 2019): Sentiment analysis has been emerging recently as one of the major natural language processing (NLP) tasks in many applications. Especially, as social media channels (e.g. social networks or forums) have become significant sources for brands to observe ...
Forecasting the 2017-2018 Yemen Cholera Outbreak with Machine Learning (Feb 16 2019): The ongoing Yemen cholera outbreak has been deemed one of the worst cholera outbreaks in history, with over a million people impacted and thousands dead. Triggered by a civil war, the outbreak has been shaped by various political, environmental, and epidemiological ...
TopicEq: A Joint Topic and Mathematical Equation Model for Scientific Texts (Feb 16 2019): Scientific documents rely on both mathematics and text to communicate ideas. Inspired by the topical correspondence between mathematical equations and word contexts observed in scientific texts, we propose a novel topic model that jointly generates mathematical ...
Towards Explainable AI: Significance Tests for Neural Networks (Feb 16 2019): Neural networks underpin many of the best-performing AI systems. Their success is largely due to their strong approximation properties, superior predictive performance, and scalability. However, a major caveat is explainability: neural networks are often ...
Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit (Feb 16 2019): We consider learning two-layer neural networks using stochastic gradient descent. The mean-field description of this learning dynamics approximates the evolution of the network weights by an evolution in the space of probability distributions in $R^D$ ...
ProLoNets: Neural-encoding Human Experts' Domain Knowledge to Warm Start Reinforcement Learning (Feb 15 2019): Deep reinforcement learning has seen great success across a breadth of tasks such as in game playing and robotic manipulation. However, the modern practice of attempting to learn tabula rasa disregards the logical structure of many domains and the wealth ...
Asymptotic Finite Sample Information Losses in Neural Classifiers (Feb 15 2019): This paper considers the subject of information losses arising from finite datasets used in the training of neural classifiers. It proves a relationship between such losses and the product of the expected total variation of the estimated neural model ...
On resampling vs. adjusting probabilistic graphical models in estimation of distribution algorithms (Feb 15 2019): The Bayesian Optimisation Algorithm (BOA) is an Estimation of Distribution Algorithm (EDA) that uses a Bayesian network as its probabilistic graphical model (PGM). Determining the optimal Bayesian network structure given a solution sample is an NP-hard problem. ...
Robustness of Neural Networks: A Probabilistic and Practical Approach (Feb 15 2019): Neural networks are becoming increasingly prevalent in software, and it is therefore important to be able to verify their behavior. Because verifying the correctness of neural networks is extremely challenging, it is common to focus on the verification ...
Adaptive Sequence Submodularity (Feb 15 2019): In many machine learning applications, one needs to interactively select a sequence of items (e.g., recommending movies based on a user's feedback) or make sequential decisions in certain orders (e.g., guiding an agent through a series of states). Not ...
DeepFault: Fault Localization for Deep Neural Networks (Feb 15 2019): Deep Neural Networks (DNNs) are increasingly deployed in safety-critical applications including autonomous vehicles and medical diagnostics. To reduce the residual risk for unexpected DNN behaviour and provide evidence for their trustworthy operation, ...
Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization (Feb 15 2019): Deep neural networks are typically highly over-parameterized with pruning techniques able to remove a significant fraction of network parameters with little loss in accuracy. Recently, techniques based on dynamic re-allocation of non-zero parameters have ...
From Dark Matter to Galaxies with Convolutional Networks (Feb 15 2019): Cosmological surveys aim at answering fundamental questions about our Universe, including the nature of dark matter or the reason for the unexpected accelerated expansion of the Universe. In order to answer these questions, two important ingredients are needed: ...
Translation Insensitivity for Deep Convolutional Gaussian Processes (Feb 15 2019): Deep learning has been at the foundation of large improvements in image classification. To improve the robustness of predictions, Bayesian approximations have been used to learn parameters in deep neural networks. We follow an alternative approach, by ...
The Fairness of Risk Scores Beyond Classification: Bipartite Ranking and the xAUC Metric (Feb 15 2019): Where machine-learned predictive risk scores inform high-stakes decisions, such as bail and sentencing in criminal justice, fairness has been a serious concern. Recent work has characterized the disparate impact that such risk scores can have when used ...