Results for "Alan Yuille"

total 4163took 0.23s
Parsing Semantic Parts of Cars Using Graphical Models and Segment Appearance ConsistencyJun 09 2014Jun 11 2014This paper addresses the problem of semantic part parsing (segmentation) of cars, i.e.assigning every pixel within the car to one of the parts (e.g.body, window, lights, license plates and wheels). We formulate this as a landmark identification problem, ... More
Weakly- and Semi-Supervised Learning of a DCNN for Semantic Image SegmentationFeb 09 2015Oct 05 2015Deep convolutional neural networks (DCNNs) trained on a large number of images with strong pixel-level annotations have recently significantly pushed the state-of-art in semantic image segmentation. We study the more challenging problem of learning DCNNs ... More
Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image RetrievalApr 05 2019Sketch-based image retrieval (SBIR) is widely recognized as an important vision problem which implies a wide range of real-world applications. Recently, research interests arise in solving this problem under the more realistic and challenging setting ... More
An Alarm System For Segmentation Algorithm Based On Shape ModelMar 26 2019Mar 27 2019It is usually hard for a learning system to predict correctly on rare events that never occur in the training data, and there is no exception for segmentation algorithms. Meanwhile, manual inspection of each case to locate the failures becomes infeasible ... More
Parsing Occluded People by Flexible CompositionsDec 04 2014Nov 24 2015This paper presents an approach to parsing humans when there is significant occlusion. We model humans using a graphical model which has a tree structure building on recent work [32, 6] and exploit the connectivity prior that, even in presence of occlusion, ... More
Transfer of View-manifold Learning to Similarity Perception of Novel ObjectsMar 31 2017We develop a model of perceptual similarity judgment based on re-training a deep convolution neural network (DCNN) that learns to associate different views of each 3D object to capture the notion of object persistence and continuity in our visual experience. ... More
Efficient variational inference in large-scale Bayesian compressed sensingJul 22 2011Sep 05 2011We study linear models under heavy-tailed priors from a probabilistic viewpoint. Instead of computing a single sparse most probable (MAP) solution as in standard deterministic approaches, the focus in the Bayesian compressed sensing framework shifts towards ... More
DOC: Deep OCclusion Estimation From a Single ImageNov 20 2015Jul 24 2016Recovering the occlusion relationships between objects is a fundamental human visual ability which yields important information about the 3D world. In this paper we propose a deep network architecture, called DOC, which acts on a single image, detects ... More
Symmetric Non-Rigid Structure from Motion for Category-Specific Object Structure EstimationSep 22 2016Many objects, especially these made by humans, are symmetric, e.g. cars and aeroplanes. This paper addresses the estimation of 3D structures of symmetric objects from multiple images of the same object category, e.g. different cars, seen from various ... More
Genetic CNNMar 04 2017The deep Convolutional Neural Network (CNN) is the state-of-the-art solution for large-scale visual recognition. Following basic principles such as increasing the depth and constructing highway connections, researchers have manually designed a lot of ... More
Thickened 2D Networks for 3D Medical Image SegmentationApr 02 2019There has been a debate in medical image segmentation on whether to use 2D or 3D networks, where both pipelines have advantages and disadvantages. This paper presents a novel approach which thickens the input of a 2D network, so that the model is expected ... More
UnrealCV: Connecting Computer Vision to Unreal EngineSep 05 2016Computer graphics can not only generate synthetic images and ground truth but it also offers the possibility of constructing virtual worlds in which: (i) an agent can perceive, navigate, and take actions guided by AI algorithms, (ii) properties of the ... More
Learning Deep Structured ModelsJul 09 2014Apr 27 2015Many problems in real-world applications involve predicting several random variables which are statistically related. Markov random fields (MRFs) are a great mathematical tool to encode such relationships. The goal of this paper is to combine MRFs with ... More
Deep Regression Forests for Age EstimationDec 19 2017Age estimation from facial images is typically cast as a nonlinear regression problem. The main challenge of this problem is the facial feature space w.r.t. ages is heterogeneous, due to the large variation in facial appearance across different persons ... More
Prior-aware Neural Network for Partially-Supervised Multi-Organ SegmentationApr 12 2019Accurate multi-organ abdominal CT segmentation is essential to many clinical applications such as computer-aided intervention. As data annotation requires massive human labor from experienced radiologists, it is common that training data are partially ... More
Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against DefensesApr 01 2019This paper focuses on learning transferable adversarial examples specifically against defense models (models to defense adversarial attacks). In particular, we show that a simple universal perturbation can fool a series of state-of-the-art defenses. Adversarial ... More
Semantic Part Segmentation using Compositional Model combining Shape and AppearanceDec 18 2014In this paper, we study the problem of semantic part segmentation for animals. This is more challenging than standard object detection, object segmentation and pose estimation tasks because semantic parts of animals often have similar appearance and highly ... More
Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise RelationsJul 12 2014Nov 04 2014We present a method for estimating articulated human pose from a single static image based on a graphical model with novel pairwise relations that make adaptive use of local image measurements. More precisely, we specify a graphical model for human pose ... More
Improving Transferability of Adversarial Examples with Input DiversityMar 19 2018Mar 27 2019Though CNNs have achieved the state-of-the-art performance on various vision tasks, they are vulnerable to adversarial examples --- crafted by adding human-imperceptible perturbations to clean images. However, most of the existing adversarial attacks ... More
Exploiting Symmetry and/or Manhattan Properties for 3D Object Structure Estimation from Single and Multiple ImagesJul 25 2016Many man-made objects have intrinsic symmetries and Manhattan structure. By assuming an orthographic projection model, this paper addresses the estimation of 3D structures and camera projection using symmetry and/or Manhattan structure cues, for the two ... More
Generating Multiple Diverse Hypotheses for Human 3D Pose Consistent with 2D Joint DetectionsFeb 08 2017Aug 20 2017We propose a method to generate multiple diverse and valid human pose hypotheses in 3D all consistent with the 2D detection of joints in a monocular RGB image. We use a novel generative model uniform (unbiased) in the space of anatomically plausible 3D ... More
Complexity of Representation and Inference in Compositional Models with Part SharingJan 16 2013This paper describes serial and parallel compositional models of multiple objects with part sharing. Objects are built by part-subpart compositions and expressed in terms of a hierarchical dictionary of object parts. These parts are represented on lattices ... More
Exploiting Symmetry and/or Manhattan Properties for 3D Object Structure Estimation from Single and Multiple ImagesJul 25 2016Nov 19 2016Many man-made objects have intrinsic symmetries and Manhattan structure. By assuming an orthographic projection model, this paper addresses the estimation of 3D structures and camera projection using symmetry and/or Manhattan structure cues, for the two ... More
Deep Nets: What have they ever done for Vision?May 10 2018Jan 11 2019This is an opinion paper about the strengths and weaknesses of Deep Nets for vision. They are at the center of recent progress on artificial intelligence and are of growing importance in cognitive science and neuroscience. They have enormous successes ... More
Object Recognition with and without ObjectsNov 20 2016While recent deep neural network models have given promising performance on object recognition, they rely implicitly on the visual contents of the whole image. In this paper, we train deep neural networks on the foreground (object) and background (context) ... More
Object Recognition with and without ObjectsNov 20 2016Nov 22 2016While recent deep neural network models have given promising performance on object recognition, they rely implicitly on the visual contents of the whole image. In this paper, we train deep neural networks on the foreground (object) and background (context) ... More
PASCAL Boundaries: A Class-Agnostic Semantic Boundary DatasetNov 25 2015In this paper, we address the boundary detection task motivated by the ambiguities in current definition of edge detection. To this end, we generate a large database consisting of more than 10k images (which is 20x bigger than existing edge detection ... More
DeePM: A Deep Part-Based Model for Object Detection and Semantic Part LocalizationNov 23 2015Jan 26 2016In this paper, we propose a deep part-based model (DeePM) for symbiotic object detection and semantic part localization. For this purpose, we annotate semantic parts for all 20 object categories on the PASCAL VOC 2012 dataset, which provides information ... More
OriNet: A Fully Convolutional Network for 3D Human Pose EstimationNov 12 2018In this paper, we propose a fully convolutional network for 3D human pose estimation from monocular images. We use limb orientations as a new way to represent 3D poses and bind the orientation together with the bounding box of each limb region to better ... More
Semi-Supervised Sparse Representation Based Classification for Face Recognition with Insufficient Labeled SamplesSep 12 2016Mar 24 2017This paper addresses the problem of face recognition when there is only few, or even only a single, labeled examples of the face that we wish to recognize. Moreover, these examples are typically corrupted by nuisance variables, both linear (i.e., additive ... More
Object Recognition with and without ObjectsNov 20 2016May 25 2017While recent deep neural networks have achieved a promising performance on object recognition, they rely implicitly on the visual contents of the whole image. In this paper, we train deep neural net- works on the foreground (object) and background (context) ... More
Semi-Supervised Sparse Representation Based Classification for Face Recognition with Insufficient Labeled SamplesSep 12 2016This paper addresses the problem of face recognition when there is only few, or even only a single, labeled examples of the face that we wish to recognize. Moreover, these examples are typically corrupted by nuisance variables, both linear (i.e. additive ... More
A Meta-Theory of Boundary Detection BenchmarksFeb 25 2013Human labeled datasets, along with their corresponding evaluation algorithms, play an important role in boundary detection. We here present a psychophysical experiment that addresses the reliability of such benchmarks. To find better remedies to evaluate ... More
Attention Correctness in Neural Image CaptioningMay 31 2016Attention mechanisms have recently been introduced in deep learning for various tasks in natural language processing and computer vision. But despite their popularity, the "correctness" of the implicitly-learned attention maps has only been assessed qualitatively ... More
CLEVR-Ref+: Diagnosing Visual Reasoning with Referring ExpressionsJan 03 2019Referring object detection and referring image segmentation are important tasks that require joint understanding of visual information and natural language. Yet there has been evidence that current benchmark datasets suffer from bias, and current state-of-the-art ... More
Joint Multi-Person Pose Estimation and Semantic Part SegmentationAug 10 2017Human pose estimation and semantic part segmentation are two complementary tasks in computer vision. In this paper, we propose to solve the two tasks jointly for natural multi-person images, in which the estimated pose provides object-level shape prior ... More
Attention Correctness in Neural Image CaptioningMay 31 2016Nov 23 2016Attention mechanisms have recently been introduced in deep learning for various tasks in natural language processing and computer vision. But despite their popularity, the "correctness" of the implicitly-learned attention maps has only been assessed qualitatively ... More
Label Distribution Learning ForestsFeb 20 2017Oct 16 2017Label distribution learning (LDL) is a general learning framework, which assigns to an instance a distribution over a set of labels rather than a single label or multiple labels. Current LDL methods have either restricted assumptions on the expression ... More
Zoom Better to See Clearer: Human and Object Parsing with Hierarchical Auto-Zoom NetNov 21 2015Mar 28 2016Parsing articulated objects, e.g. humans and animals, into semantic parts (e.g. body, head and arms, etc.) from natural images is a challenging and fundamental problem for computer vision. A big difficulty is the large variability of scale and location ... More
Few-shot Learning by Exploiting Visual Concepts within CNNsNov 22 2017Feb 13 2018Convolutional neural networks (CNNs) are one of the driving forces for the advancement of computer vision. Despite their promising performances on many tasks, CNNs still face major obstacles on the road to achieving ideal machine intelligence. One is ... More
Weight StandardizationMar 25 2019In this paper, we propose Weight Standardization (WS) to accelerate deep network training. WS is targeted at the micro-batch training setting where each GPU typically has only 1-2 images for training. The micro-batch training setting is hard because small ... More
ELASTIC: Improving CNNs with Instance Specific Scaling PoliciesDec 13 2018Scale variation has been a challenge from traditional to modern approaches in computer vision. Most solutions to scale issues have similar theme: a set of intuitive and manually designed policies that are generic and fixed (e.g. SIFT or feature pyramid). ... More
Unsupervised learning of object semantic parts from internal states of CNNs by population encodingNov 21 2015Nov 12 2016We address the key question of how object part representations can be found from the internal states of CNNs that are trained for high-level tasks, such as object classification. This work provides a new unsupervised method to learn semantic parts and ... More
Spatial Transformer Introspective Neural NetworkMay 16 2018Natural images contain many variations such as illumination differences, affine transformations, and shape distortions. Correctly classifying these variations poses a long standing problem. The most commonly adopted solution is to build large-scale datasets ... More
Gradually Updated Neural Networks for Large-Scale Image RecognitionNov 25 2017Feb 10 2018Depth is one of the keys that make neural networks succeed in the task of large-scale image recognition. The state-of-the-art network architectures usually increase the depths by cascading convolutional layers or building blocks. In this paper, we present ... More
ScaleNet: Guiding Object Proposal Generation in Supermarkets and BeyondApr 22 2017Motivated by product detection in supermarkets, this paper studies the problem of object proposal generation in supermarket images and other natural images. We argue that estimation of object scales in images is helpful for generating object proposals, ... More
Deep Supervision for Pancreatic Cyst Segmentation in Abdominal CT ScansJun 22 2017Automatic segmentation of an organ and its cystic region is a prerequisite of computer-aided diagnosis. In this paper, we focus on pancreatic cyst segmentation in abdominal CT scan. This task is important and very useful in clinical practice yet challenging ... More
Towards Resisting Large Data Variations via Introspective LearningMay 16 2018Apr 03 2019Learning deep networks which can resist large variations between training and testing data are essential to build accurate and robust image classifiers. Towards this end, a typical strategy is to apply data augmentation to enlarge the training set. However, ... More
Scene Graph Parsing as Dependency ParsingMar 25 2018In this paper, we study the problem of parsing structured knowledge graphs from textual descriptions. In particular, we consider the scene graph representation that considers objects together with their attributes and relations: this representation has ... More
Probabilistic Motion Estimation Based on Temporal CoherenceJan 05 2012We develop a theory for the temporal integration of visual motion motivated by psychophysical experiments. The theory proposes that input data are temporally grouped and used to predict and estimate the motion flows in the image sequence. This temporal ... More
Pose-Guided Human Parsing with Deep Learned FeaturesAug 17 2015Nov 25 2015Parsing human body into semantic regions is crucial to human-centric analysis. In this paper, we propose a segment-based parsing pipeline that explores human pose information, i.e. the joint location of a human model, which improves the part proposal, ... More
Few-Shot Image Recognition by Predicting Parameters from ActivationsJun 12 2017Nov 25 2017In this paper, we are interested in the few-shot learning problem. In particular, we focus on a challenging scenario where the number of categories is large and the number of examples per novel category is very limited, e.g. 1, 2, or 3. Motivated by the ... More
Snapshot Distillation: Teacher-Student Optimization in One GenerationDec 01 2018Optimizing a deep neural network is a fundamental task in computer vision, yet direct training methods often suffer from over-fitting. Teacher-student optimization aims at providing complementary cues from a model trained previously, but these approaches ... More
Knowledge Distillation in Generations: More Tolerant Teachers Educate Better StudentsMay 15 2018Sep 07 2018We focus on the problem of training a deep neural network in generations. The flowchart is that, in order to optimize the target network (student), another network (teacher) with the same architecture is first trained, and used to provide part of supervision ... More
Mitigating Adversarial Effects Through RandomizationNov 06 2017Feb 28 2018Convolutional neural networks have demonstrated high accuracy on various tasks in recent years. However, they are extremely vulnerable to adversarial examples. For example, imperceptible perturbations added to clean images can cause convolutional neural ... More
NormFace: L2 Hypersphere Embedding for Face VerificationApr 21 2017Jul 26 2017Thanks to the recent developments of Convolutional Neural Networks, the performance of face verification methods has increased rapidly. In a typical face verification method, feature normalization is a critical step for boosting performance. This motivates ... More
ELASTIC: Improving CNNs with Dynamic Scaling PoliciesDec 13 2018Apr 08 2019Scale variation has been a challenge from traditional to modern approaches in computer vision. Most solutions to scale issues have a similar theme: a set of intuitive and manually designed policies that are generic and fixed (e.g. SIFT or feature pyramid). ... More
Representing Data by a Mixture of Activated SimplicesDec 12 2014We present a new model which represents data as a mixture of simplices. Simplices are geometric structures that generalize triangles. We give a simple geometric understanding that allows us to learn a simplicial structure efficiently. Our method requires ... More
Discovering Internal Representations from Object-CNNs Using Population EncodingNov 21 2015Jan 07 2016In this paper, we provide a method for understanding the internal representations of Convolutional Neural Networks (CNNs) trained on objects. We hypothesize that the information is distributed across multiple neuronal responses and propose a simple clustering ... More
The DLR Hierarchy of Approximate InferenceJul 04 2012We propose a hierarchy for approximate inference based on the Dobrushin, Lanford, Ruelle (DLR) equations. This hierarchy includes existing algorithms, such as belief propagation, and also motivates novel algorithms such as factorized neighbors (FN) algorithms ... More
Deep Collaborative Learning for Visual RecognitionMar 03 2017Deep neural networks are playing an important role in state-of-the-art visual recognition. To represent high-level visual concepts, modern networks are equipped with large convolutional layers, which use a large number of filters and contribute significantly ... More
SampleAhead: Online Classifier-Sampler Communication for Learning from Synthesized DataApr 01 2018Jul 28 2018State-of-the-art techniques of artificial intelligence, in particular deep learning, are mostly data-driven. However, collecting and manually labeling a large scale dataset is both difficult and expensive. A promising alternative is to introduce synthesized ... More
Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated ImagesNov 24 2016In this paper, we focus on training and evaluating effective word embeddings with both text and visual information. More specifically, we introduce a large-scale dataset with 300 million sentences describing over 40 million images crawled and downloaded ... More
InterActive: Inter-Layer Activeness PropagationApr 30 2016An increasing number of computer vision tasks can be tackled with deep features, which are the intermediate outputs of a pre-trained Convolutional Neural Network. Despite the astonishing performance, deep features extracted from low-level neurons are ... More
Geometric Neural Phrase Pooling: Modeling the Spatial Co-occurrence of NeuronsJul 21 2016Deep Convolutional Neural Networks (CNNs) are playing important roles in state-of-the-art visual recognition. This paper focuses on modeling the spatial co-occurrence of neuron responses, which is less studied in the previous work. For this, we consider ... More
Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource UtilizationDec 02 2018In this paper, we study the problem of improving computational resource utilization of neural networks. Deep neural networks are usually over-parameterized for their tasks in order to achieve good performances, thus are likely to have underutilized computational ... More
MAT: A Multimodal Attentive Translator for Image CaptioningFeb 18 2017Aug 10 2017In this work we formulate the problem of image captioning as a multimodal translation task. Analogous to machine translation, we present a sequence-to-sequence recurrent neural networks (RNN) model for image caption generation. Different from most existing ... More
Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene UnderstandingJun 16 2014Recent trends in image understanding have pushed for holistic scene understanding models that jointly reason about various tasks such as object detection, scene recognition, shape analysis, contextual reasoning, and local appearance based classifiers. ... More
Multi-stage Multi-recursive-input Fully Convolutional Networks for Neuronal Boundary DetectionMar 24 2017Jul 31 2017In the field of connectomics, neuroscientists seek to identify cortical connectivity comprehensively. Neuronal boundary detection from the Electron Microscopy (EM) images is often done to assist the automatic reconstruction of neuronal circuit. But the ... More
Rethinking Monocular Depth Estimation with Adversarial TrainingAug 22 2018Sep 24 2018Monocular depth estimation is an extensively studied computer vision problem with a vast variety of applications. Deep learning-based methods have demonstrated promise for both supervised and unsupervised depth estimation from monocular images. Most existing ... More
UnrealStereo: Controlling Hazardous Factors to Analyze Stereo VisionDec 14 2016Sep 06 2018A reliable stereo algorithm is critical for many robotics applications. But textureless and specular regions can easily cause failure by making feature matching difficult. Understanding whether an algorithm is robust to these hazardous regions is important. ... More
Deep Co-Training for Semi-Supervised Image RecognitionMar 15 2018In this paper, we study the problem of semi-supervised image recognition, which is to learn classifiers using both labeled and unlabeled images. We present Deep Co-Training, a deep learning based method inspired by the Co-Training framework. The original ... More
Multi-Instance Visual-Semantic EmbeddingDec 22 2015Visual-semantic embedding models have been recently proposed and shown to be effective for image classification and zero-shot learning, by mapping images into a continuous semantic label space. Although several approaches have been proposed for single-label ... More
CLEVR-Ref+: Diagnosing Visual Reasoning with Referring ExpressionsJan 03 2019Apr 06 2019Referring object detection and referring image segmentation are important tasks that require joint understanding of visual information and natural language. Yet there has been evidence that current benchmark datasets suffer from bias, and current state-of-the-art ... More
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFsJun 02 2016In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or 'atrous ... More
Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic UnderstandingOct 14 2018Learning to estimate 3D geometry in a single frame and optical flow from consecutive frames by watching unlabeled videos via deep convolutional network has made significant process recently. Current state-of-the-art (SOTA) methods treat the tasks independently. ... More
DeepVoting: A Robust and Explainable Deep Network for Semantic Part Detection under Partial OcclusionSep 14 2017Mar 29 2018In this paper, we study the task of detecting semantic parts of an object, e.g., a wheel of a car, under partial occlusion. We propose that all models should be trained without seeing occlusions while being able to transfer the learned knowledge to deal ... More
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)Dec 20 2014Jun 11 2015In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are generated by ... More
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFsDec 22 2014Jun 07 2016Deep Convolutional Neural Networks (DCNNs) have recently shown state of the art performance in high level vision tasks, such as image classification and object detection. This work brings together methods from DCNNs and probabilistic graphical models ... More
Training Multi-organ Segmentation Networks with Sample Selection by Relaxed Upper Confident BoundApr 07 2018Deep convolutional neural networks (CNNs), especially fully convolutional networks, have been widely applied to automatic medical image segmentation problems, e.g., multi-organ segmentation. Existing CNN-based segmentation methods mainly focus on looking ... More
Recurrent Saliency Transformation Network: Incorporating Multi-Stage Visual Cues for Small Organ SegmentationSep 13 2017Apr 08 2018We aim at segmenting small organs (e.g., the pancreas) from abdominal CT scans. As the target often occupies a relatively small region in the input image, deep neural networks can be easily confused by the complex and variable background. To alleviate ... More
Detect What You Can: Detecting and Representing Objects using Holistic Models and Body PartsJun 08 2014Detecting objects becomes difficult when we need to deal with large shape deformation, occlusion and low resolution. We propose a novel approach to i) handle large deformations and partial occlusions in animals (as examples of highly deformable objects), ... More
Recurrent Multimodal Interaction for Referring Image SegmentationMar 23 2017Aug 04 2017In this paper we are interested in the problem of image segmentation given natural language descriptions, i.e. referring expressions. Existing works tackle this problem by first modeling images and sentences independently and then segment images by combining ... More
SORT: Second-Order Response Transform for Visual RecognitionMar 20 2017Sep 14 2017In this paper, we reveal the importance and benefits of introducing second-order operations into deep neural networks. We propose a novel approach named Second-Order Response Transform (SORT), which appends element-wise product transform to the linear ... More
Single-Shot Object Detection with Enriched SemanticsDec 01 2017Apr 08 2018We propose a novel single shot object detection network named Detection with Enriched Semantics (DES). Our motivation is to enrich the semantics of object detection features within a typical deep detector, by a semantic segmentation branch and a global ... More
Joint Shape Representation and Classification for Detecting PDACApr 27 2018We aim to detect pancreatic ductal adenocarcinoma (PDAC) in abdominal CT scans, which sheds light on early diagnosis of pancreatic cancer. This is a 3D volume classification task with little training data. We propose a two-stage framework, which first ... More
Bridging the Gap Between 2D and 3D Organ Segmentation with Volumetric Fusion NetApr 02 2018Jun 09 2018There has been a debate on whether to use 2D or 3D deep neural networks for volumetric organ segmentation. Both 2D and 3D models have their advantages and disadvantages. In this paper, we present an alternative framework, which trains 2D networks on different ... More
Multi-Scale Coarse-to-Fine Segmentation for Screening Pancreatic Ductal AdenocarcinomaJul 09 2018This paper proposes an intuitive approach to finding pancreatic ductal adenocarcinoma (PDAC), the most common type of pancreatic cancer, by checking abdominal CT scans. Our idea is named segmentation-for-classification (S4C), which classifies a volume ... More
DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural ImagesSep 13 2016Jul 13 2017Object skeletons are useful for object representation and object detection. They are complementary to the object contour, and provide extra information, such as how object scale (thickness) varies among object parts. But object skeleton extraction from ... More
Detecting Semantic Parts on Partially Occluded ObjectsJul 25 2017In this paper, we address the task of detecting semantic parts on partially occluded objects. We consider a scenario where the model is trained using non-occluded images but tested on occluded images. The motivation is that there are infinite number of ... More
Abdominal multi-organ segmentation with organ-attention networks and statistical fusionApr 23 2018Accurate and robust segmentation of abdominal organs on CT is essential for many clinical applications such as computer-aided diagnosis and computer-aided surgery. But this task is challenging due to the weak boundaries of organs, the complexity of the ... More
Multi-Context Attention for Human Pose EstimationFeb 24 2017In this paper, we propose to incorporate convolutional neural networks with a multi-context attention mechanism into an end-to-end framework for human pose estimation. We adopt stacked hourglass networks to generate attention maps from features at multiple ... More
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFsJun 02 2016May 12 2017In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or 'atrous ... More
Semantic Image Segmentation with Task-Specific Edge Detection Using CNNs and a Discriminatively Trained Domain TransformNov 10 2015Jun 02 2016Deep convolutional neural networks (CNNs) are the backbone of state-of-art semantic image segmentation systems. Recent work has shown that complementing CNNs with fully-connected conditional random fields (CRFs) can significantly enhance their object ... More
Feature Denoising for Improving Adversarial RobustnessDec 09 2018Adversarial attacks to image classification systems present challenges to convolutional networks and opportunities for understanding them. This study suggests that adversarial perturbations on images lead to noise in the features constructed by these ... More
Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image SegmentationJan 10 2019Recently, Neural Architecture Search (NAS) has successfully identified neural network architectures that exceed human designed ones on large-scale image classification problems. In this paper, we study NAS for semantic image segmentation, an important ... More
Progressive Recurrent Learning for Visual RecognitionNov 29 2018Computer vision is difficult, partly because the mathematical function connecting input and output data is often complex, fuzzy and thus hard to learn. A currently popular solution is to design a deep neural network and optimize it on a large-scale dataset. ... More
Towards Accurate Task Accomplishment with Low-Cost Robotic ArmsDec 03 2018Training a robotic arm to accomplish real-world tasks has been attracting increasing attention in both academia and industry. This work discusses the role of computer vision algorithms in this field. We focus on low-cost arms on which no sensors are equipped ... More
Visual Concepts and Compositional VotingNov 13 2017It is very attractive to formulate vision in terms of pattern theory \cite{Mumford2010pattern}, where patterns are defined hierarchically by compositions of elementary building blocks. But applying pattern theory to real world images is currently less ... More
A Fixed-Point Model for Pancreas Segmentation in Abdominal CT ScansDec 25 2016Jun 21 2017Deep neural networks have been widely adopted for automatic organ segmentation from abdominal CT scans. However, the segmentation accuracy of some small organs (e.g., the pancreas) is sometimes below satisfaction, arguably because deep networks are easily ... More