Latest in cs.LG

Reweighted Expectation MaximizationJun 13 2019Training deep generative models with maximum likelihood remains a challenge. The typical workaround is to use variational inference (VI) and maximize a lower bound to the log marginal likelihood of the data. Variational auto-encoders (VAEs) adopt this ... More
The Communication Complexity of OptimizationJun 13 2019We consider the communication complexity of a number of distributed optimization problems. We start with the problem of solving a linear system. Suppose there is a coordinator together with $s$ servers $P_1, \ldots, P_s$, the $i$-th of which holds a subset ... More
Overcoming Mean-Field Approximations in Recurrent Gaussian Process ModelsJun 13 2019We identify a new variational inference scheme for dynamical systems whose transition function is modelled by a Gaussian process. Inference in this setting has either employed computationally intensive MCMC methods, or relied on factorisations of the ... More
Kernel and Deep Regimes in Overparametrized ModelsJun 13 2019A recent line of work studies overparametrized neural networks in the "kernel regime," i.e. when the network behaves during training as a kernelized linear predictor, and thus training with gradient descent has the effect of finding the minimum RKHS ... More
Telephonetic: Making Neural Language Models Robust to ASR and Semantic NoiseJun 13 2019Speech processing systems rely on robust feature extraction to handle phonetic and semantic variations found in natural language. While techniques exist for desensitizing features to common noise patterns produced by Speech-to-Text (STT) and Text-to-Speech ... More
Robust Regression for Safe Exploration in ControlJun 13 2019We study the problem of safe learning and exploration in sequential control problems. The goal is to safely collect data samples from an operating environment to learn an optimal controller. A central challenge in this setting is how to quantify uncertainty ... More
Lower Bounds for Adversarially Robust PAC LearningJun 13 2019In this work, we initiate a formal study of probably approximately correct (PAC) learning under evasion attacks, where the adversary's goal is to misclassify the adversarially perturbed sample point $\widetilde{x}$, i.e., $h(\widetilde{x})\neq ... More
Deep Reinforcement Learning for Cyber SecurityJun 13 2019The scale of Internet-connected systems has increased considerably, and these systems are being exposed to cyber attacks more than ever. The complexity and dynamics of cyber attacks require protecting mechanisms to be responsive, adaptive, and large-scale. ... More
Modeling the Dynamics of PDE Systems with Physics-Constrained Deep Auto-Regressive NetworksJun 13 2019In recent years, deep learning has proven to be a viable methodology for surrogate modeling and uncertainty quantification for a vast number of physical systems. However, in their traditional form, such models require a large amount of training data. ... More
Nonlinear System Identification via Tensor CompletionJun 13 2019Function approximation from input and output data pairs constitutes a fundamental problem in supervised learning. Deep neural networks are currently the most popular method for learning to mimic the input-output relationship of a generic nonlinear system, ... More
Distributed High-dimensional Regression Under a Quantile Loss FunctionJun 13 2019This paper studies distributed estimation and support recovery for high-dimensional linear regression models with heavy-tailed noise. To deal with heavy-tailed noise whose variance can be infinite, we adopt the quantile regression loss function instead ... More
Iterative subtraction method for Feature RankingJun 13 2019Training features used to analyse physical processes are often highly correlated, and determining which ones are most important for the classification is a non-trivial task. For the use case of a search for a top-quark pair produced in association with ... More
Training Neural Networks for and by InterpolationJun 13 2019The majority of modern deep learning models are able to interpolate the data: the empirical loss can be driven near zero on all samples simultaneously. In this work, we explicitly exploit this interpolation property for the design of a new optimization ... More
$c^+$GAN: Complementary Fashion Item RecommendationJun 13 2019We present a conditional generative adversarial model to draw realistic samples from a paired fashion clothing distribution and provide real samples to pair with arbitrary fashion units. More concretely, given an image of a shirt, obtained from a fashion ... More
Variance Estimation For Online Regression via Spectrum ThresholdingJun 13 2019We consider the online linear regression problem, where the predictor vector may vary with time. This problem can be modelled as a linear dynamical system, where the parameters that need to be learned are the variance of both the process noise and the ... More
Identifying Illicit Accounts in Large Scale E-payment Networks -- A Graph Representation Learning ApproachJun 13 2019Rapid and massive adoption of mobile/online payment services has brought new challenges to service providers as well as regulators in safeguarding the proper use of such services/systems. In this paper, we leverage recent advances in deep-neural-network-based ... More
An image-driven machine learning approach to kinetic modeling of a discontinuous precipitation reactionJun 13 2019Micrograph quantification is an essential component of several materials science studies. Machine learning methods, in particular convolutional neural networks, have previously demonstrated performance in image recognition tasks across several disciplines ... More
Cognitive Knowledge Graph Reasoning for One-shot Relational LearningJun 13 2019Inferring new facts from existing knowledge graphs (KG) with explainable reasoning processes is a significant problem and has received much attention recently. However, few studies have focused on relation types unseen in the original KG, given only one ... More
Robust and interpretable blind image denoising via bias-free convolutional neural networksJun 13 2019Deep convolutional networks often append additive constant ("bias") terms to their convolution operations, enabling a richer repertoire of functional mappings. Biases are also used to facilitate training, by subtracting mean response over batches of training ... More
Selective prediction-set models with coverage guaranteesJun 13 2019Though black-box predictors are state-of-the-art for many complex tasks, they often fail to properly quantify predictive uncertainty and may provide inappropriate predictions for unfamiliar data. Instead, we can learn more reliable models by letting them ... More
Reinforcement Learning of Spatio-Temporal Point ProcessesJun 13 2019Spatio-temporal event data is ubiquitous in various applications, such as social media, crime events, and electronic health records. Spatio-temporal point processes offer a versatile framework for modeling such event data, as they can jointly capture spatial ... More
CoopSubNet: Cooperating Subnetwork for Data-Driven Regularization of Deep Networks under Limited Training BudgetsJun 13 2019Deep networks are an integral part of the current machine learning paradigm. Their inherent ability to learn complex functional mappings between data and various target variables, while discovering hidden, task-driven features, makes them a powerful technology ... More
Random Tessellation ForestsJun 13 2019Space partitioning methods such as random forests and the Mondrian process are powerful machine learning methods for multi-dimensional and relational data, and are based on recursively cutting a domain. The flexibility of these methods is often limited ... More
Copulas as High-Dimensional Generative Models: Vine Copula AutoencodersJun 12 2019We propose a vine copula autoencoder to construct flexible generative models for high-dimensional distributions in a straightforward three-step procedure. First, an autoencoder compresses the data using a lower dimensional representation. Second, the ... More
Efficient Evaluation-Time Uncertainty Estimation by Improved DistillationJun 12 2019In this work we aim to obtain computationally-efficient uncertainty estimates with deep networks. For this, we propose a modified knowledge distillation procedure that achieves state-of-the-art uncertainty estimates both for in and out-of-distribution ... More
Flexible Modeling of Diversity with Strongly Log-Concave DistributionsJun 12 2019Strongly log-concave (SLC) distributions are a rich class of discrete probability distributions over subsets of some ground set. They are strictly more general than strongly Rayleigh (SR) distributions such as the well-known determinantal point process. ... More
Neural Arabic Question AnsweringJun 12 2019This paper tackles the problem of open domain factual Arabic question answering (QA) using Wikipedia as our knowledge source. This constrains the answer of any question to be a span of text in Wikipedia. Open domain QA for Arabic entails three challenges: ... More
Compositional generalization through meta sequence-to-sequence learningJun 12 2019People can learn a new concept and use it compositionally, understanding how to "blicket twice" after learning how to "blicket." In contrast, powerful sequence-to-sequence (seq2seq) neural networks fail such tests of compositionality, especially when ... More
E3: Entailment-driven Extracting and Editing for Conversational Machine ReadingJun 12 2019Conversational machine reading systems help users answer high-level questions (e.g. determine if they qualify for particular government benefits) when they do not know the exact rules by which the determination is made (e.g. whether they need certain income ... More
Neural Graph Evolution: Towards Efficient Automatic Robot DesignJun 12 2019Despite the recent successes in robotic locomotion control, the design of robots relies heavily on human engineering. Automatic robot design has been a long-studied subject, but recent progress has been slowed by the large combinatorial search ... More
Competing Bandits in Matching MarketsJun 12 2019Stable matching, a classical model for two-sided markets, has long been studied with little consideration for how each side's preferences are learned. With the advent of massive online markets powered by data-driven matching platforms, it has become necessary ... More
GANPOP: Generative Adversarial Network Prediction of Optical Properties from Single Snapshot Wide-field ImagesJun 12 2019We present a deep learning framework for wide-field, content-aware estimation of absorption and scattering coefficients of tissues, called Generative Adversarial Network Prediction of Optical Properties (GANPOP). Spatial frequency domain imaging is used ... More
Optimal low rank tensor recoveryJun 12 2019We investigate the sample size requirement for exact recovery of a high order tensor of low rank from a subset of its entries. In the Tucker decomposition framework, we show that the Riemannian optimization algorithm with initial value obtained from a ... More
HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-scale Point CloudsJun 12 2019We present a novel deep neural network architecture for end-to-end scene flow estimation that directly operates on large-scale 3D point clouds. Inspired by Bilateral Convolutional Layers (BCL), we propose novel DownBCL, UpBCL, and CorrBCL operations that ... More
Neural Network Models for Stock Selection Based on Fundamental AnalysisJun 12 2019Application of neural network architectures for financial prediction has been actively studied in recent years. This paper presents a comparative study that investigates and compares feed-forward neural network (FNN) and adaptive neural fuzzy inference ... More
MOPED: Efficient priors for scalable variational inference in Bayesian deep neural networksJun 12 2019Variational inference for Bayesian deep neural networks (DNNs) requires specifying priors and approximate posterior distributions for neural network weights. Specifying meaningful weight priors is a challenging problem, particularly for scaling variational ... More
Learning Curves for Deep Neural Networks: A Gaussian Field Theory PerspectiveJun 12 2019A series of recent works suggest that deep neural networks (DNNs), of fixed depth, are equivalent to certain Gaussian Processes (NNGP/NTK) in the highly over-parameterized regime (width or number-of-channels going to infinity). Other works suggest that ... More
Keeping Notes: Conditional Natural Language Generation with a Scratchpad MechanismJun 12 2019We introduce the Scratchpad Mechanism, a novel addition to the sequence-to-sequence (seq2seq) neural network architecture and demonstrate its effectiveness in improving the overall fluency of seq2seq models for natural language generation tasks. By enabling ... More
Efficient Exploration via State Marginal MatchingJun 12 2019To solve tasks with sparse rewards, reinforcement learning algorithms must be equipped with suitable exploration techniques. However, it is unclear what underlying objective is being optimized by existing exploration algorithms, or how they can be altered ... More
Critical Point Finding with Newton-MR by Analogy to Computing Square RootsJun 12 2019Understanding of the behavior of algorithms for resolving the optimization problem (hereafter shortened to OP) of optimizing a differentiable loss function (OP1), is enhanced by knowledge of the critical points of that loss function, i.e. the points where ... More
Presence-Only Geographical Priors for Fine-Grained Image ClassificationJun 12 2019Appearance information alone is often not sufficient to accurately differentiate between fine-grained visual categories. Human experts make use of additional cues such as where, and when, a given image was taken in order to inform their final decision. ... More
Does Learning Require Memorization? A Short Tale about a Long TailJun 12 2019State-of-the-art results on image recognition tasks are achieved using over-parameterized learning algorithms that (nearly) perfectly fit the training set. This phenomenon is referred to as data interpolation or, informally, as memorization of the training ... More
Artificial Intelligence Enabled Material Behavior PredictionJun 12 2019Artificial Intelligence and Machine Learning algorithms have considerable potential to influence the prediction of material properties. Additive materials have a unique property prediction challenge in the form of surface roughness effects on fatigue ... More
GluonTS: Probabilistic Time Series Models in PythonJun 12 2019We introduce Gluon Time Series (GluonTS), a library for deep-learning-based time series modeling. GluonTS simplifies the development of and experimentation with time series models for common tasks such as forecasting ... More
Image-Adaptive GAN based ReconstructionJun 12 2019In recent years, there has been a significant improvement in the quality of samples produced by (deep) generative models such as variational auto-encoders and generative adversarial networks. However, the representation capabilities of these methods ... More
Representation Learning for Words and EntitiesJun 12 2019This thesis presents new methods for unsupervised learning of distributed representations of words and entities from text and knowledge bases. The first algorithm presented in the thesis is a multi-view algorithm for learning representations of words ... More
Search on the Replay Buffer: Bridging Planning and Reinforcement LearningJun 12 2019The history of learning for control has been an exciting back and forth between two broad classes of algorithms: planning and reinforcement learning. Planning algorithms effectively reason over long horizons, but assume access to a local policy and distance ... More
Multitask Learning for Network Traffic ClassificationJun 12 2019Traffic classification has various applications in today's Internet, from resource allocation, billing and QoS purposes in ISPs to firewall and malware detection in clients. Classical machine learning algorithms and deep learning models have been widely ... More
Bootstrapping Upper Confidence BoundJun 12 2019The Upper Confidence Bound (UCB) method is arguably the most celebrated one used in online decision making with partial information feedback. Existing techniques for constructing confidence bounds are typically built upon various concentration inequalities, ... More
Reinforcement Knowledge Graph Reasoning for Explainable RecommendationJun 12 2019Recent advances in personalized recommendation have sparked great interest in the exploitation of rich structured information provided by knowledge graphs. Unlike most existing approaches that only focus on leveraging knowledge graphs for more accurate ... More
Continual and Multi-Task Architecture SearchJun 12 2019Architecture search is the process of automatically learning the neural model or cell structure that best suits the given task. Recently, this approach has shown promising performance improvements (on language modeling and image classification) with reasonable ... More
A Model to Search for Synthesizable MoleculesJun 12 2019Deep generative models are able to suggest new organic molecules by generating strings, trees, and graphs representing their structure. While such models allow one to generate molecules with desirable properties, they give no guarantees that the molecules ... More
A Multiscale Visualization of Attention in the Transformer ModelJun 12 2019The Transformer is a sequence model that forgoes traditional recurrent architectures in favor of a fully attention-based approach. Besides improving performance, an advantage of using attention is that it can also help to interpret a model by showing ... More
Is Deep Learning an RG Flow?Jun 12 2019Although there has been a rapid development of practical applications, theoretical explanations of deep learning are in their infancy. A possible starting point suggests that deep learning performs a sophisticated coarse graining. Coarse graining is the ... More
Sorted Top-k in RoundsJun 12 2019We consider the sorted top-$k$ problem whose goal is to recover the top-$k$ items with the correct order out of $n$ items using pairwise comparisons. In many applications, multiple rounds of interaction can be costly. We restrict our attention to algorithms ... More
Warping Resilient Time Series EmbeddingsJun 12 2019Time series are ubiquitous in real world problems and computing distance between two time series is often required in several learning tasks. Computing similarity between time series by ignoring variations in speed or warping is often encountered and ... More
Manifold Graph with Learned Prototypes for Semi-Supervised Image ClassificationJun 12 2019Recent advances in semi-supervised learning methods rely on estimating categories for unlabeled data using a model trained on the labeled data (pseudo-labeling) and using the unlabeled data for various consistency-based regularization. In this work, we ... More
Task Agnostic Continual Learning via Meta LearningJun 12 2019While neural networks are powerful function approximators, they suffer from catastrophic forgetting when the data distribution is not stationary. One particular formalism that studies learning under non-stationary distribution is provided by continual ... More
Boosting Few-Shot Visual Learning with Self-SupervisionJun 12 2019Few-shot learning and self-supervised learning address different facets of the same problem: how to train a model with little or no labeled data. Few-shot learning aims for optimization methods and models that can learn efficiently to recognize patterns ... More
Parameterized Structured Pruning for Deep Neural NetworksJun 12 2019As a result of the growing size of Deep Neural Networks (DNNs), the gap to hardware capabilities in terms of memory and compute increases. To effectively compress DNNs, quantization and connection pruning are usually considered. However, unconstrained ... More
DCEF: Deep Collaborative Encoder Framework for Unsupervised ClusteringJun 12 2019Collaborative representation is a popular feature learning approach whose encoding process is assisted by various types of information. In this paper, we propose a collaborative representation restricted Boltzmann Machine (CRRBM) for modeling binary ... More
Attention-based Multi-Input Deep Learning Architecture for Biological Activity Prediction: An Application in EGFR InhibitorsJun 12 2019Machine learning and deep learning have gained popularity and achieved immense success in drug discovery in recent decades. Historically, machine learning and deep learning models were trained on either structural data or chemical properties by separated ... More
Evaluation of Dataflow through layers of Deep Neural Networks in Classification and Regression ProblemsJun 12 2019This paper introduces two straightforward, effective indices to evaluate the input data and the data flowing through layers of a feedforward deep neural network. For classification problems, the separation rate of target labels in the space of dataflow ... More
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias FunctionJun 12 2019We present an algorithm based on the Optimism in the Face of Uncertainty (OFU) principle which is able to efficiently learn Reinforcement Learning (RL) problems modeled by a Markov decision process (MDP) with finite state-action space. By evaluating the state-pair ... More
Secure Federated Matrix FactorizationJun 12 2019To protect user privacy and meet legal regulations, federated (machine) learning has attracted vast interest in recent years. The key principle of federated learning is training a machine learning model without needing to know each user's personal raw private ... More
Reinforcement-Learning-based Adaptive Optimal Control for Arbitrary Reference TrackingJun 12 2019Model-free control based on the idea of Reinforcement Learning is a promising control approach that has recently gained extensive attention. However, most Reinforcement-Learning-based control methods solely focus on the regulation problem or learn to ... More
Deep Smoothing of the Implied Volatility SurfaceJun 12 2019We present an artificial neural network (ANN) approach to value financial derivatives. Atypically to standard ANN applications, practitioners equally use option pricing models to validate market prices and to infer unobserved prices. Importantly, models ... More
Unified Semantic Parsing with Weak SupervisionJun 12 2019Semantic parsing over multiple knowledge bases enables a parser to exploit structural similarities of programs across the multiple domains. However, the fundamental challenge lies in obtaining high-quality annotations of (utterance, program) pairs across ... More
Higher-Order Ranking and Link Prediction: From Closing Triangles to Closing Higher-Order MotifsJun 12 2019In this paper, we introduce the notion of motif closure and describe higher-order ranking and link prediction methods based on the notion of closing higher-order network motifs. The methods are fast and efficient for real-time ranking and link prediction-based ... More
Decoupling Gating from LinearityJun 12 2019ReLU neural-networks have been in the focus of many recent theoretical works, trying to explain their empirical success. Nonetheless, there is still a gap between current theoretical results and empirical observations, even in the case of shallow (one ... More
Fast Task Inference with Variational Intrinsic Successor FeaturesJun 12 2019It has been established that diverse behaviors spanning the controllable subspace of a Markov decision process can be trained by rewarding a policy for being distinguishable from other policies (Gregor et al., 2016; Eysenbach et al., 2018; Warde-Farley et al., 2018). ... More
Who Will Win It? An In-game Win Probability Model for FootballJun 12 2019In-game win probability is a statistical metric that provides a sports team's likelihood of winning at any given point in a game, based on the performance of historical teams in the same situation. In-game win-probability models have been extensively ... More
Real-time Attention Based Look-alike Model for Recommender SystemJun 12 2019Recently, deep learning models have played an increasingly important role in content recommender systems. However, although the performance of recommendations is greatly improved, the "Matthew effect" becomes increasingly evident. While the head contents get ... More
Graph Embedding on Biomedical Networks: Methods, Applications, and EvaluationsJun 12 2019Motivation: Graph embedding learning, which aims to automatically learn low-dimensional node representations, has drawn increasing attention in recent years. To date, most recent graph embedding methods are mainly evaluated on social and information networks ... More
Deep Reinforcement Learning for Unmanned Aerial Vehicle-Assisted Vehicular NetworksJun 12 2019Unmanned aerial vehicles (UAVs) are envisioned to complement the 5G communication infrastructure in future smart cities. Hot spots easily appear in road intersections, where effective communication among vehicles is challenging. UAVs may serve as relays ... More
Neural Variational Inference For Estimating Uncertainty in Knowledge Graph EmbeddingsJun 12 2019Recent advances in Neural Variational Inference allowed for a renaissance in latent variable models in a variety of domains involving high-dimensional data. While traditional variational methods derive an analytical approximation for the intractable distribution ... More
Unsupervised Question Answering by Cloze TranslationJun 12 2019Obtaining training data for Question Answering (QA) is time-consuming and resource-intensive, and existing QA datasets are only available for limited domains and languages. In this work, we explore to what extent high quality training data is actually ... More
DeepSquare: Boosting the Learning Power of Deep Convolutional Neural Networks with Elementwise Square OperatorsJun 12 2019Modern neural network modules which can significantly enhance the learning power usually add too much computational complexity to the original neural networks. In this paper, we pursue very efficient neural network modules which can significantly boost ... More
A Stratified Approach to Robustness for Randomly Smoothed ClassifiersJun 12 2019Strong theoretical guarantees of robustness can be given for ensembles of classifiers generated by input randomization. Specifically, an $\ell_2$ bounded adversary cannot alter the ensemble prediction generated by an isotropic Gaussian perturbation, where ... More
A Structured Learning Approach to Temporal Relation ExtractionJun 12 2019Identifying temporal relations between events is an essential step towards natural language understanding. However, the temporal relation between two events in a story depends on, and is often dictated by, relations among other events. Consequently, effectively ... More
Partial Or Complete, That's The QuestionJun 12 2019For many structured learning tasks, the data annotation process is complex and costly. Existing annotation schemes usually aim at acquiring completely annotated structures, under the common perception that partial structures are of low quality and could ... More
Relative Hausdorff Distance for Network AnalysisJun 12 2019Similarity measures are used extensively in machine learning and data science algorithms. The newly proposed graph Relative Hausdorff (RH) distance is a lightweight yet nuanced similarity measure for quantifying the closeness of two graphs. In this work ... More
Transferrable Operative Difficulty Assessment in Robot-assisted Teleoperation: A Domain Adaptation ApproachJun 12 2019Providing an accurate and efficient assessment of operative difficulty is important for designing robot-assisted teleoperation interfaces that are easy and natural for human operators to use. In this paper, we aim to develop a data-driven approach to ... More
Non-Parametric Calibration for ClassificationJun 12 2019Many applications for classification methods not only require high accuracy but also reliable estimation of predictive uncertainty. However, while many current classification frameworks, in particular deep neural network architectures, provide very good ... More
SPoC: Search-based Pseudocode to CodeJun 12 2019We consider the task of mapping pseudocode to long programs that are functionally correct. Given test cases as a mechanism to validate programs, we search over the space of possible translations of the pseudocode to find a program that passes the validation. ... More
Improving Importance Weighted Auto-Encoders with Annealed Importance SamplingJun 12 2019Stochastic variational inference with an amortized inference model and the reparameterization trick has become a widely-used algorithm for learning latent variable models. Increasing the flexibility of approximate posterior distributions while maintaining ... More
Coresets for Gaussian Mixture Models of Any ShapeJun 12 2019An $\varepsilon$-coreset for a given set $D$ of $n$ points is usually a small weighted set, such that querying the coreset provably yields a $(1+\varepsilon)$-factor approximation to the original (full) dataset, for a given family of queries. ... More
Efficient and Accurate Estimation of Lipschitz Constants for Deep Neural NetworksJun 12 2019Tight estimation of the Lipschitz constant for deep neural networks (DNNs) is useful in many applications ranging from robustness certification of classifiers to stability analysis of closed-loop systems with reinforcement learning controllers. Existing ... More
Compressive Hyperspherical Energy MinimizationJun 12 2019Recent work on minimum hyperspherical energy (MHE) has demonstrated its potential in regularizing neural networks and improving their generalization. MHE was inspired by the Thomson problem in physics, where the distribution of multiple propelling electrons ... More
Using Small Proxy Datasets to Accelerate Hyperparameter SearchJun 12 2019One of the biggest bottlenecks in a machine learning workflow is waiting for models to train. Depending on the available computing resources, it can take days to weeks to train a neural network on a large dataset with many classes such as ImageNet. For ... More
Run-Time Efficient RNN Compression for Inference on Edge DevicesJun 12 2019Recurrent neural networks can be large and compute-intensive, yet many applications that benefit from RNNs run on small devices with very limited compute and storage capabilities while still having run-time constraints. As a result, there is a need for ... More
Multiple instance learning with graph neural networksJun 12 2019Multiple instance learning (MIL) aims to learn the mapping between a bag of instances and the bag-level label. In this paper, we propose a new end-to-end graph neural network (GNN) based algorithm for MIL: we treat each bag as a graph and use GNN to learn ... More
Communication-Efficient Accurate Statistical EstimationJun 12 2019When the data are stored in a distributed manner, direct application of traditional statistical inference procedures is often prohibitive due to communication cost and privacy concerns. This paper develops and investigates two Communication-Efficient ... More
Semi-flat minima and saddle points by embedding neural networks to overparameterizationJun 12 2019We theoretically study the landscape of the training error for neural networks in overparameterized cases. We consider three basic methods for embedding a network into a wider one with more hidden units, and discuss whether a minimum point of the narrower ... More
On regularization for a convolutional kernel in neural networksJun 12 2019The convolutional neural network is a very important model in deep learning. It can help avoid the exploding/vanishing gradient problem and improve the generalizability of a neural network if the singular values of the Jacobian of a layer are bounded around ... More
Statistical guarantees for local graph clusteringJun 11 2019Local graph clustering methods aim to find small clusters in very large graphs. These methods take as input a graph and a seed node, and they return as output a good cluster in a running time that depends on the size of the output cluster but that is ... More
Reinforcement Learning for Integer Programming: Learning to CutJun 11 2019Integer programming (IP) is a general optimization framework widely applicable to a variety of unstructured and structured problems arising in, e.g., scheduling, production planning, and graph optimization. As IP models many provably hard to solve problems, ... More
Task-Aware Deep Sampling for Feature GenerationJun 11 2019The human ability to imagine the variety of appearances of novel objects based on past experience is crucial for quickly learning novel visual concepts based on few examples. Endowing machines with a similar ability to generate feature distributions for ... More
A Closer Look at the Optimization Landscapes of Generative Adversarial NetworksJun 11 2019Generative adversarial networks have been very successful in generative modeling, however they remain relatively hard to optimize compared to standard deep neural networks. In this paper, we try to gain insight into the optimization of GANs by looking ... More
Discrepancy, Coresets, and Sketches in Machine LearningJun 11 2019This paper defines the notion of class discrepancy for families of functions. It shows that low discrepancy classes admit small offline and streaming coresets. We provide general techniques for bounding the class discrepancy of machine learning problems. ... More