Latest in cs.LG
Online Sampling from Log-Concave Distributions (Feb 21 2019)
Given a sequence of convex functions $f_0, f_1, \ldots, f_T$, we study the problem of sampling from the Gibbs distribution $\pi_t \propto e^{-\sum_{k=0}^t f_k}$ for each epoch $t$ in an online manner. This problem occurs in applications to machine learning, ... (see the sampler sketch after this group of entries)

Learned Step Size Quantization (Feb 21 2019)
We present Learned Step Size Quantization, a method for training deep networks such that they can run at inference time using low-precision integer matrix multipliers, which offer power and space advantages over high-precision alternatives. The essence ... (see the quantizer sketch after this group of entries)

Domain Partitioning Network (Feb 21 2019)
Standard adversarial training involves two agents, namely a generator and a discriminator, playing a mini-max game. However, even if the players converge to an equilibrium, the generator may recover only part of the target data distribution, in a situation ...

A Mean Field Theory of Batch Normalization (Feb 21 2019)
We develop a mean field theory for batch normalization in fully-connected feedforward neural networks. In so doing, we provide a precise characterization of signal propagation and gradient backpropagation in wide batch-normalized networks at initialization. ...

Statistics and Samples in Distributional Reinforcement Learning (Feb 21 2019)
We present a unifying framework for designing and analysing distributional reinforcement learning (DRL) algorithms in terms of recursively estimating statistics of the return distribution. Our key insight is that DRL algorithms can be decomposed as the ...

Deep Learning Multidimensional Projections (Feb 21 2019)
Dimensionality reduction methods, also known as projections, are frequently used for exploring multidimensional data in machine learning, data science, and information visualization. Among these, t-SNE and its variants have become very popular for their ...

LOSSGRAD: automatic learning rate in gradient descent (Feb 20 2019)
In this paper, we propose a simple, fast, and easy-to-implement algorithm, LOSSGRAD (locally optimal step-size in gradient descent), which automatically adjusts the step size during neural network training. Given a function $f$, a ...

Noisy multi-label semi-supervised dimensionality reduction (Feb 20 2019)
Noisily labeled data are a rich source of information that is often easily accessible and cheap to obtain, but label noise can also have many negative consequences if not accounted for. How to fully utilize noisy labels has been studied extensively ...

Fast Neural Network Verification via Shadow Prices (Feb 19 2019)
To use neural networks in safety-critical settings, it is paramount to provide assurances on their runtime operation. Recent work on ReLU networks has sought to verify whether inputs belonging to a bounded box can ever yield some undesirable output. Input-splitting ...

Graph Neural Networks for Social Recommendation (Feb 19 2019)
In recent years, Graph Neural Networks (GNNs), which can naturally integrate node information and topological structure, have been demonstrated to be powerful in learning on graph data. These advantages of GNNs provide great potential to advance social ...

A spelling correction model for end-to-end speech recognition (Feb 19 2019)
Attention-based sequence-to-sequence models for speech recognition jointly train an acoustic model, language model (LM), and alignment mechanism using a single neural network and require only parallel audio-text pairs. Thus, the language model component ...
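As background for the log-concave sampling entry above: a classical way to draw approximate samples from a Gibbs distribution $\pi \propto e^{-\sum_k f_k}$ is unadjusted Langevin dynamics. The sketch below is generic background, not the paper's online algorithm; the function names, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

def langevin_sample(grad_fs, x0, eta=1e-3, n_steps=1000, seed=0):
    # Unadjusted Langevin dynamics targeting pi ~ exp(-sum_k f_k(x)):
    # take a gradient step on the summed potential, then add
    # Gaussian noise scaled by sqrt(2 * eta).
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        g = sum(grad(x) for grad in grad_fs)  # gradient of sum_k f_k at x
        x = x - eta * g + np.sqrt(2.0 * eta) * rng.standard_normal(x.shape)
    return x
```

Smaller steps and more iterations trade compute for accuracy; in the online setting each epoch contributes one new $f_k$, which here would amount to appending its gradient to grad_fs.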
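The essence of the learned-step-size approach in the quantization entry can be illustrated as a uniform quantizer whose step is a trainable parameter, with a straight-through estimator so gradients flow through the rounding. This is a minimal sketch of that general idea, not the paper's exact formulation (in particular, its gradient scaling and clip ranges are omitted):

```python
import torch

def quantize(x, step, q_min=-128, q_max=127):
    # Uniform quantization with a learnable step size: scale by the step,
    # clamp to the integer range, round with a straight-through estimator
    # (round in the forward pass, identity in the backward pass), rescale.
    v = torch.clamp(x / step, q_min, q_max)
    v_rounded = v + (v.round() - v).detach()
    return v_rounded * step

# The step would be trained jointly with the network weights, e.g.:
# step = torch.nn.Parameter(torch.tensor(0.05))
```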
Simplifying Graph Convolutional Networks (Feb 19 2019)
Graph Convolutional Networks (GCNs) and their variants have received significant attention and have become the de facto methods for learning graph representations. GCNs derive inspiration primarily from recent deep learning approaches, and as a result, ... (see the propagation sketch after this group of entries)

Adaptive Cross-Modal Few-Shot Learning (Feb 19 2019)
Metric-based meta-learning techniques have successfully been applied to few-shot classification problems. However, leveraging cross-modal information in a few-shot setting has yet to be explored. When the support from visual information is limited in ...

Evaluating model calibration in classification (Feb 19 2019)
Probabilistic classifiers output a probability distribution on target classes rather than just a class prediction. Besides providing a clear separation of prediction and decision making, the main advantage of probabilistic models is their ability to represent ...

Proper-Composite Loss Functions in Arbitrary Dimensions (Feb 19 2019)
The study of a machine learning problem is in many ways difficult to separate from the study of the loss function being used. One avenue of inquiry has been to look at these loss functions in terms of their properties as scoring rules via the proper-composite ...

Hyperbolic Discounting and Learning over Multiple Horizons (Feb 19 2019; updated Feb 20 2019)
Reinforcement learning (RL) typically defines a discount factor as part of the Markov Decision Process. The discount factor values future rewards by an exponential scheme that leads to theoretical convergence guarantees of the Bellman equation. However, ... (see the identity after this group of entries)

Democratisation of Usable Machine Learning in Computer Vision (Feb 18 2019)
Many industries are now investing heavily in data science and automation to replace manual tasks and/or to help with decision making, especially in the realm of leveraging computer vision to automate many monitoring, inspection, and surveillance tasks. ...

On Evaluating Adversarial Robustness (Feb 18 2019)
Correctly evaluating defenses against adversarial examples has proven to be extremely difficult. Despite the significant amount of recent work attempting to design defenses that withstand adaptive attacks, few have succeeded; most papers that propose ...

Fast Efficient Hyperparameter Tuning for Policy Gradients (Feb 18 2019)
The performance of policy gradient methods is sensitive to hyperparameter settings that must be tuned for any new application. Widely used grid search methods for tuning hyperparameters are sample-inefficient and computationally expensive. More advanced ...

STCN: Stochastic Temporal Convolutional Networks (Feb 18 2019)
Convolutional architectures have recently been shown to be competitive on many sequence modelling tasks when compared to the de facto standard of recurrent neural networks (RNNs), while providing computational and modeling advantages due to inherent parallelism. ...

Differentially Private Continual Learning (Feb 18 2019)
Catastrophic forgetting can be a significant problem for institutions that must delete historic data for privacy reasons. For example, hospitals might not be able to retain patient data permanently. But neural networks trained on recent data alone will ...

A Unifying Bayesian View of Continual Learning (Feb 18 2019)
Some machine learning applications require continual learning, where data comes in a sequence of datasets, each of which is used for training and then permanently discarded. From a Bayesian perspective, continual learning seems straightforward: given the model ...
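For the GCN-simplification entry, the core idea in that line of work (stated here as my reading, so treat the details as an assumption) is that stacking graph convolutions without nonlinearities collapses into repeated propagation with a normalized adjacency matrix, followed by a single linear model:

```python
import numpy as np

def propagate(adj, features, k=2):
    # K steps of symmetrically normalized propagation with self-loops:
    # S = D^{-1/2} (A + I) D^{-1/2}, applied k times to the features.
    # A plain linear classifier can then be trained on the result.
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    s = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
    x = features
    for _ in range(k):
        x = s @ x
    return x
```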
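For the hyperbolic-discounting entry, one standard identity makes the "multiple horizons" connection concrete (this is textbook calculus, not a quote from the paper): $\int_0^1 \gamma^t \, d\gamma = \frac{1}{1+t}$, i.e., a hyperbolic discount is an average of exponential discounts, so hyperbolically discounted returns can in principle be assembled from value estimates learned under many different discount factors $\gamma$.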
AuxBlocks: Defense Adversarial Example via Auxiliary Blocks (Feb 18 2019)
Deep learning models are vulnerable to adversarial examples, which poses an indisputable threat to their applications. However, recent studies observe that gradient-masking defenses are self-deceiving if an attacker can recognize the defense. In this ...

Distributed Learning for Channel Allocation Over a Shared Spectrum (Feb 17 2019; updated Feb 19 2019)
Channel allocation is the task of assigning channels to users such that some objective (e.g., sum-rate) is maximized. In centralized networks such as cellular networks, this task is carried out by the base station, which gathers the channel state information ...

Learning to Infer Program Sketches (Feb 17 2019)
Our goal is to build systems which write code automatically from the kinds of specifications humans can most easily provide, such as examples and natural language instruction. The key idea of this work is that a flexible combination of pattern recognition ...

Towards Improved Testing For Deep Learning (Feb 17 2019)
The growing use of deep neural networks in safety-critical applications makes it necessary to carry out adequate testing to detect and correct any incorrect behavior on corner-case inputs before the networks are actually deployed. Deep neural networks lack an ...

Context-Based Dynamic Pricing with Online Clustering (Feb 17 2019)
We consider a context-based dynamic pricing problem for online products that have low sales. Sales data from Alibaba, a major global online retailer, illustrate the prevalence of low-sale products. For these products, existing single-product dynamic pricing ...

WiSE-VAE: Wide Sample Estimator VAE (Feb 16 2019; updated Feb 19 2019)
Variational Auto-encoders (VAEs) have been very successful as methods for forming compressed latent representations of complex, often high-dimensional, data. In this paper, we derive an alternative variational lower bound from the one common in VAEs, ... (the standard bound is recalled after this group of entries)

Screening Rules for Lasso with Non-Convex Sparse Regularizers (Feb 16 2019)
Leveraging the convexity of the Lasso problem, screening rules help accelerate solvers by discarding irrelevant variables during the optimization process. However, because they provide better theoretical guarantees in identifying relevant variables, ...

Differentiable reservoir computing (Feb 16 2019)
Much effort has been devoted in the last two decades to characterizing the situations in which a reservoir computing system exhibits the so-called echo state and fading memory properties. These important features amount, in mathematical terms, to the existence ... (see the generic update after this group of entries)

Towards Explainable AI: Significance Tests for Neural Networks (Feb 16 2019)
Neural networks underpin many of the best-performing AI systems. Their success is largely due to their strong approximation properties, superior predictive performance, and scalability. However, a major caveat is explainability: neural networks are often ...
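As context for the WiSE-VAE entry, the lower bound "common in VAEs" is the evidence lower bound $\log p_\theta(x) \ge \mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] - \mathrm{KL}(q_\phi(z|x) \,\|\, p(z))$; the paper's alternative bound is not reproduced here.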
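The echo state and fading memory properties discussed in the reservoir-computing entry concern state updates of the following standard form (a generic echo state network step with illustrative names, not code from the paper):

```python
import numpy as np

def esn_step(h, u, w_res, w_in, leak=0.3):
    # Leaky echo-state update: reservoir weights w_res and input weights
    # w_in are fixed and random; only a linear readout on h is trained.
    # The echo state property is commonly encouraged by rescaling w_res
    # so that its spectral radius is below 1.
    return (1.0 - leak) * h + leak * np.tanh(w_res @ h + w_in @ u)
```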
Adaptive Sequence Submodularity (Feb 15 2019)
In many machine learning applications, one needs to interactively select a sequence of items (e.g., recommending movies based on a user's feedback) or make sequential decisions in certain orders (e.g., guiding an agent through a series of states). Not ...

DeepFault: Fault Localization for Deep Neural Networks (Feb 15 2019)
Deep Neural Networks (DNNs) are increasingly deployed in safety-critical applications including autonomous vehicles and medical diagnostics. To reduce the residual risk for unexpected DNN behaviour and provide evidence for their trustworthy operation, ...

From Dark Matter to Galaxies with Convolutional Networks (Feb 15 2019)
Cosmological surveys aim at answering fundamental questions about our Universe, including the nature of dark matter or the reason for the unexpected accelerated expansion of the Universe. In order to answer these questions, two important ingredients are needed: ...