VL-BERT: Pre-training of Generic Visual-Linguistic Representations (Aug 22 2019). We introduce a new pre-trainable generic representation for visual-linguistic tasks, called Visual-Linguistic BERT (VL-BERT for short). VL-BERT adopts the simple yet powerful Transformer model as the backbone, and extends it to take both visual and linguistic ...
Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning (Aug 22 2019). Diverse and accurate vision+language modeling is an important goal to retain creative freedom and maintain user engagement. However, adequately capturing the intricacies of diversity in language models is challenging. Recent works commonly resort to latent ...
ViCo: Word Embeddings from Visual Co-occurrences (Aug 22 2019). We propose to learn word embeddings from visual co-occurrences. Two words co-occur visually if both words apply to the same image or image region. Specifically, we extract four types of visual co-occurrences between object and attribute words from large-scale, ...
Compositional Video Prediction (Aug 22 2019). We present an approach for pixel-level future prediction given an input image of a scene. We observe that a scene is comprised of distinct entities that undergo motion and present an approach that operationalizes this insight. We implicitly predict future ...
Adversarial-Based Knowledge Distillation for Multi-Model Ensemble and Noisy Data Refinement (Aug 22 2019). Generic image recognition is a fundamental and fairly important visual problem in computer vision. One of the major challenges of this task lies in the fact that a single image usually contains multiple objects while the labels are still one-hot; another ...
Predicting Animation Skeletons for 3D Articulated Models via Volumetric Nets (Aug 22 2019). We present a learning method for predicting animation skeletons for input 3D models of articulated characters. In contrast to previous approaches that fit pre-defined skeleton templates or predict fixed sets of joints, our method produces an animation ...
EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition (Aug 22 2019). We focus on multi-modal fusion for egocentric action recognition, and propose a novel architecture for multi-modal temporal-binding, i.e. the combination of modalities within a range of temporal offsets. We train the architecture with three modalities ...
Noise Flow: Noise Modeling with Conditional Normalizing Flows (Aug 22 2019). Modeling and synthesizing image noise is an important aspect in many computer vision applications. The long-standing additive white Gaussian and heteroscedastic (signal-dependent) noise models widely used in the literature provide only a coarse approximation ...
Saliency Methods for Explaining Adversarial Attacks (Aug 22 2019). In this work, we aim to explain the classifications of adversarial images using saliency methods. Saliency methods explain individual classification decisions of neural networks by creating saliency maps. All saliency methods were proposed for explaining ...
Indoor Depth Completion with Boundary Consistency and Self-Attention (Aug 22 2019). Depth estimation features are helpful for 3D recognition. Commodity-grade depth cameras are able to capture depth and color images in real time. However, glossy, transparent, or distant surfaces cannot be scanned properly by the sensor. As a result, enhancement ...
Deep Green Function Convolution for Improving Saliency in Convolutional Neural Networks (Aug 22 2019). Current saliency methods require learning large-scale regional features using small convolutional kernels, which is not possible with a simple feed-forward network. Some methods solve this problem by using segmentation into superpixels while others downscale ...
Image Colorization By Capsule Networks (Aug 22 2019). In this paper, a simple topology of Capsule Network (CapsNet) is investigated for the problem of image colorization. The generative and segmentation capabilities of the original CapsNet topology, which was proposed for the image classification problem, are ...
EGNet: Edge Guidance Network for Salient Object Detection (Aug 22 2019). Fully convolutional neural networks (FCNs) have shown their advantages in the salient object detection task. However, most existing FCN-based methods still suffer from coarse object boundaries. In this paper, to solve this problem, we focus on the complementarity ...
Trajectory Space Factorization for Deep Video-Based 3D Human Pose Estimation (Aug 22 2019). Existing deep learning approaches to 3D human pose estimation for videos are based on either Recurrent or Convolutional Neural Networks (RNNs or CNNs). However, RNN-based frameworks can only tackle sequences with limited frames because sequential models ...
Contour Detection in Cassini ISS images based on Hierarchical Extreme Learning Machine and Dense Conditional Random Field (Aug 22 2019). In Cassini ISS (Imaging Science Subsystem) images, contour detection is often performed on disk-resolved objects to accurately locate their centers. Thus, contour detection is a key problem. Traditional edge detection methods, such as Canny and Roberts, ...
Motion correction of dynamic contrast enhanced MRI of the liver (Aug 22 2019). Motion correction of dynamic contrast enhanced magnetic resonance images (DCE-MRI) is a challenging task, due to changes in image appearance. In this study a groupwise registration, using a principal component analysis (PCA) based metric, is evaluated ...
Optimal input configuration of dynamic contrast enhanced MRI in convolutional neural networks for liver segmentation (Aug 22 2019). Most MRI liver segmentation methods use a structural 3D scan as input, such as a T1 or T2 weighted scan. Segmentation performance may be improved by utilizing both structural and functional information, as contained in dynamic contrast enhanced (DCE) ...
Object detection on aerial imagery using CenterNet (Aug 22 2019). Detection and classification of objects in aerial imagery have several applications like urban planning, crop surveillance, and traffic surveillance. However, due to the lower resolution of the objects and the effect of noise in aerial images, extracting ...
Uncertainty-Guided Domain Alignment for Layer Segmentation in OCT Images (Aug 22 2019). Automatic and accurate segmentation for retinal and choroidal layers of Optical Coherence Tomography (OCT) is crucial for detection of various ocular diseases. However, because of the variations across different equipment, OCT data obtained from different ...
Progressive Face Super-Resolution via Attention to Facial Landmark (Aug 22 2019). Face Super-Resolution (SR) is a subfield of the SR domain that specifically targets the reconstruction of face images. The main challenge of face SR is to restore essential facial features without distortion. We propose a novel face SR method that generates ...
NL-LinkNet: Toward Lighter but More Accurate Road Extraction with Non-Local Operations (Aug 22 2019). Road extraction from very high resolution satellite images is one of the most important topics in the field of remote sensing. For the road segmentation problem, spatial properties of the data can usually be captured using Convolutional Neural Networks. ...
3C-Net: Category Count and Center Loss for Weakly-Supervised Action Localization (Aug 22 2019). Temporal action localization is a challenging computer vision problem with numerous real-world applications. Most existing methods require laborious frame-level supervision to train action localization models. In this work, we propose a framework, called ...
BIM-assisted object recognition for the on-site autonomous robotic assembly of discrete structures (Aug 22 2019). Robots operating autonomous assembly applications in an unstructured environment require precise methods to locate the building components on site. However, the currently available object detection systems are not well-optimised for construction applications, ...
Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes (Aug 22 2019). Unifying text detection and text recognition in an end-to-end training fashion has become a new trend for reading text in the wild, as these two tasks are highly relevant and complementary. In this paper, we investigate the problem of scene text spotting, ...
An Image Fusion Scheme for Single-Shot High Dynamic Range Imaging with Spatially Varying Exposures (Aug 22 2019). This paper proposes a novel multi-exposure image fusion (MEF) scheme for single-shot high dynamic range imaging with spatially varying exposures (SVE). Single-shot imaging with SVE enables us not only to produce images without color saturation regions ...
Pro-Cam SSfM: Projector-Camera System for Structure and Spectral Reflectance from Motion (Aug 22 2019). In this paper, we propose a novel projector-camera system for practical and low-cost acquisition of a dense object 3D model with the spectral reflectance property. In our system, we use a standard RGB camera and leverage an off-the-shelf projector as ...
Multi-Stream Single Shot Spatial-Temporal Action Detection (Aug 22 2019). We present a 3D Convolutional Neural Networks (CNNs) based single shot detector for spatial-temporal action detection tasks. Our model includes: (1) two short-term appearance and motion streams, with single RGB and optical flow image input separately, ...
Building change detection based on multi-scale filtering and grid partition (Aug 22 2019). Building change detection is of great significance in high resolution remote sensing applications. Multi-index learning, one of the state-of-the-art building change detection methods, still has drawbacks like the inability to find change types directly ...
Globally optimal registration of noisy point clouds (Aug 22 2019). Registration of 3D point clouds is a fundamental task in several applications of robotics and computer vision. While registration methods such as iterative closest point and variants are very popular, they are only locally optimal. There has been some ...
Multiple instance dense connected convolution neural network for aerial image scene classification (Aug 22 2019). With the development of deep learning, many state-of-the-art natural image scene classification methods have demonstrated impressive performance. While the current convolution neural network tends to extract global features and global semantic information ...
Transferability and Hardness of Supervised Classification Tasks (Aug 21 2019). We propose a novel approach for estimating the difficulty and transferability of supervised classification tasks. Unlike previous work, our approach is solution agnostic and does not require or assume trained models. Instead, we estimate these values ...
DUAL-GLOW: Conditional Flow-Based Generative Model for Modality Transfer (Aug 21 2019). Positron emission tomography (PET) imaging is an imaging modality for diagnosing a number of neurological diseases. In contrast to Magnetic Resonance Imaging (MRI), PET is costly and involves injecting a radioactive substance into the patient. Motivated ...
Boundary Aware Networks for Medical Image Segmentation (Aug 21 2019). Fully convolutional neural networks (CNNs) have proven to be effective at representing and classifying textural information, thus transforming image intensity into output class masks that achieve semantic image segmentation. In medical image analysis, ...
Testing Robustness Against Unforeseen Adversaries (Aug 21 2019). Considerable work on adversarial defense has studied robustness to a fixed, known family of adversarial distortions, most frequently L_p-bounded distortions. In reality, the specific form of attack will rarely be known and adversaries are free to employ ...
Pixel-wise Segmentation of Right Ventricle of Heart (Aug 21 2019). One of the first steps in the diagnosis of most cardiac diseases, such as pulmonary hypertension and coronary heart disease, is the segmentation of ventricles from cardiac magnetic resonance (MRI) images. Manual segmentation of the right ventricle requires ...
MobiSR: Efficient On-Device Super-Resolution through Heterogeneous Mobile Processors (Aug 21 2019). In recent years, convolutional networks have demonstrated unprecedented performance in the image restoration task of super-resolution (SR). SR entails the upscaling of a single low-resolution image in order to meet application-specific image quality demands ...
Estimation of perceptual scales using ordinal embedding (Aug 21 2019). In this paper, we address the problem of measuring and analysing sensation, the subjective magnitude of one's experience. We do this in the context of the method of triads: the sensation of the stimulus is evaluated via relative judgments of the form: ...
TUNA-Net: Task-oriented UNsupervised Adversarial Network for Disease Recognition in Cross-Domain Chest X-rays (Aug 21 2019). In this work, we exploit the unsupervised domain adaptation problem for radiology image interpretation across domains. Specifically, we study how to adapt the disease recognition model from a labeled source domain to an unlabeled target domain, so as ...
PCRNet: Point Cloud Registration Network using PointNet Encoding (Aug 21 2019). PointNet has recently emerged as a popular representation for unstructured point cloud data, allowing application of deep learning to tasks such as object detection, segmentation and shape completion. However, recent works in the literature have shown the ...
DomainSiam: Domain-Aware Siamese Network for Visual Object Tracking (Aug 21 2019). Visual object tracking is a fundamental task in the field of computer vision. Recently, Siamese trackers have achieved state-of-the-art performance on recent benchmarks. However, Siamese trackers do not fully utilize semantic and objectness information ...
Effects of Blur and Deblurring to Visual Object Tracking (Aug 21 2019). Intuitively, motion blur may hurt the performance of visual object tracking. However, we lack quantitative evaluation of tracker robustness to different levels of motion blur. Meanwhile, although image deblurring methods can produce visually clearer videos ...
Learning Structured Twin-Incoherent Twin-Projective Latent Dictionary Pairs for Classification (Aug 21 2019). In this paper, we extend the popular dictionary pair learning (DPL) into the scenario of twin-projective latent flexible DPL under a structured twin-incoherence. Technically, a novel framework called Twin-Projective Latent Flexible DPL (TP-DPL) is proposed, ...
A CNN toolbox for skin cancer classification (Aug 21 2019). We describe a software toolbox for the configuration of deep neural networks in the domain of skin cancer classification. The implemented software architecture allows developers to quickly set up new convolutional neural network (CNN) architectures and ...
Adaptive Structure-constrained Robust Latent Low-Rank Coding for Image Recovery (Aug 21 2019). In this paper, we propose a robust representation learning model called Adaptive Structure-constrained Low-Rank Coding (AS-LRC) for the latent representation of data. To recover the underlying subspaces more accurately, AS-LRC seamlessly integrates an ...
Improved MR to CT synthesis for PET/MR attenuation correction using Imitation Learning (Aug 21 2019). The ability to synthesise Computed Tomography images - commonly known as pseudo CT, or pCT - from MRI input data is commonly assessed using an intensity-wise similarity, such as an L2-norm between the ground truth CT and the pCT. However, given that the ...
Scoot: A Perceptual Metric for Facial Sketches (Aug 21 2019). While it is trivial for humans to quickly assess the perceptual similarity between two images, the underlying mechanism is thought to be quite complex. Despite this, the most widely adopted perceptual metrics today, such as SSIM and FSIM, are simple, ...
U-Net Training with Instance-Layer Normalization (Aug 21 2019). Normalization layers are essential in a Deep Convolutional Neural Network (DCNN). Various normalization methods have been proposed. The statistics used to normalize the feature maps can be computed at batch, channel, or instance level. However, in most ...
InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting (Aug 21 2019). Instance segmentation requires a large number of training samples to achieve satisfactory performance and benefits from proper data augmentation. To enlarge the training set and increase the diversity, previous methods have investigated using data annotation ...
Video-based Bottleneck Detection utilizing Lagrangian Dynamics in Crowded Scenes (Aug 21 2019). Avoiding bottleneck situations in crowds is critical for the safety and comfort of people at large events or in public transportation. Based on the work of Lagrangian motion analysis we propose a novel video-based bottleneck detector by identifying characteristic ...
Dataset Growth in Medical Image Analysis Research (Aug 21 2019). Medical image analysis studies usually require medical image datasets for training, testing and validation of algorithms. The need is underscored by the deep learning revolution and the dominance of machine learning in recent medical image analysis research. ...
A Realistic Face-to-Face Conversation System based on Deep Neural Networks (Aug 21 2019). To improve the experience of face-to-face conversation with an avatar, this paper presents a novel conversation system. It is composed of two sequence-to-sequence models, for listening and speaking respectively, and a Generative Adversarial Network (GAN) ...
RBCN: Rectified Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs (Aug 21 2019). Binarized convolutional neural networks (BCNNs) are widely used to improve memory and computation efficiency of deep convolutional neural networks (DCNNs) for mobile and AI-chip-based applications. However, current BCNNs are not able to fully explore ...
Adaptive Segmentation of Knee Radiographs for Selecting the Optimal ROI in Texture Analysis (Aug 21 2019). The purposes of this study were to investigate: 1) the effect of placement of region-of-interest (ROI) for texture analysis of subchondral bone in knee radiographs, and 2) the ability of several texture descriptors to distinguish between the knees with ...
KeystoneDepth: Visualizing History in 3D (Aug 21 2019). This paper introduces the largest and most diverse collection of rectified stereo image pairs to the research community, KeystoneDepth, consisting of tens of thousands of stereographs of historical people, events, objects, and scenes between 1860 and ...
Automated Multi-sequence Cardiac MRI Segmentation Using Supervised Domain Adaptation (Aug 21 2019). Left ventricle segmentation and morphological assessment are essential for improving diagnosis and our understanding of cardiomyopathy, which in turn is imperative for reducing risk of myocardial infarctions in patients. Convolutional neural network (CNN) ...
Pilot Study on Verifying the Monotonic Relationship between Error and Uncertainty in Deformable Registration for Neurosurgery (Aug 21 2019). In image-guided neurosurgery, deformable registration is currently not part of clinical routine. Although using it in practice is a goal for image-guided therapy, this goal is hampered because surgeons are wary of the less predictable deformable registration ...
Lung segmentation on chest x-ray images in patients with severe abnormal findings using deep learning (Aug 21 2019). Rationale and objectives: Several studies have evaluated the usefulness of deep learning for lung segmentation using chest x-ray (CXR) images with small- or medium-sized abnormal findings. Here, we built a database including both CXR images with severe ...
Preserving Semantic and Temporal Consistency for Unpaired Video-to-Video Translation (Aug 21 2019). In this paper, we investigate the problem of unpaired video-to-video translation. Given a video in the source domain, we aim to learn the conditional distribution of the corresponding video in the target domain, without seeing any pairs of corresponding ...
Asymmetric Non-local Neural Networks for Semantic Segmentation (Aug 21 2019). The non-local module is a particularly useful technique for semantic segmentation, but it is criticized for its prohibitive computation and GPU memory consumption. In this paper, we present the Asymmetric Non-local Neural Network for semantic segmentation, ...
Semantic-Transferable Weakly-Supervised Endoscopic Lesions Segmentation (Aug 21 2019). Weakly-supervised learning under image-level label supervision has been widely applied to semantic segmentation of medical lesion regions. However, 1) most existing models rely on effective constraints to explore the internal representation of lesions, ...
FusionNet: Incorporating Shape and Texture for Abnormality Detection in 3D Abdominal CT Scans (Aug 21 2019). Automatic abnormality detection in abdominal CT scans can help doctors improve the accuracy and efficiency in diagnosis. In this paper we aim at detecting pancreatic ductal adenocarcinoma (PDAC), the most common pancreatic cancer. Taking the fact that ...
Communal Domain Learning for Registration in Drifted Image Spaces (Aug 20 2019). Designing a registration framework for images that do not share the same probability distribution is a major challenge in modern image analytics yet a trivial task for the human visual system (HVS). Discrepancies in probability distributions, also known ...
Saccader: Improving Accuracy of Hard Attention Models for Vision (Aug 20 2019). Although deep convolutional neural networks achieve state-of-the-art performance across nearly all image classification tasks, they are often regarded as black boxes. Because they compute a nonlinear function of the entire input image, their decisions ...
On Object Symmetries and 6D Pose Estimation from Images (Aug 20 2019). Objects with symmetries are common in our daily life and in industrial contexts, but are often ignored in the recent literature on 6D pose estimation from images. In this paper, we study in an analytical way the link between the symmetries of a 3D object ...
P2L: Predicting Transfer Learning for Images and Semantic Relations (Aug 20 2019). Transfer learning enhances learning across tasks, by leveraging previously learned representations -- if they are properly chosen. We describe an efficient method to accurately estimate the appropriateness of a previously trained model for use in a new ...
Action recognition with spatial-temporal discriminative filter banks (Aug 20 2019). Action recognition has seen a dramatic performance improvement in the last few years. Most of the current state-of-the-art literature either aims at improving performance through changes to the backbone CNN network, or explores different trade-offs ...
Joint Motion Estimation and Segmentation from Undersampled Cardiac MR Image (Aug 20 2019). Accelerating the acquisition of magnetic resonance imaging (MRI) is a challenging problem, and many works have been proposed to reconstruct images from undersampled k-space data. However, if the main purpose is to extract certain quantitative measures ...
More unlabelled data or label more data? A study on semi-supervised laparoscopic image segmentation (Aug 20 2019). Improving a semi-supervised image segmentation task has the option of adding more unlabelled images, labelling the unlabelled images or combining both, as neither image acquisition nor expert labelling can be considered trivial in most clinical applications. ...
Phrase Localization Without Paired Training Examples (Aug 20 2019). Localizing phrases in images is an important part of image understanding and can be useful in many applications that require mappings between textual and visual information. Existing work attempts to learn these mappings from examples of phrase-image ...
Resolving 3D Human Pose Ambiguities with 3D Scene Constraints (Aug 20 2019). To understand and analyze human behavior, we need to capture humans moving in, and interacting with, the world. Most existing methods perform 3D human pose estimation without explicitly considering the scene. We observe, however, that the world constrains ...
Image Synthesis From Reconfigurable Layout and Style (Aug 20 2019). Despite remarkable recent progress on both unconditional and conditional image synthesis, it remains a long-standing problem to learn generative models that are capable of synthesizing realistic and sharp images from reconfigurable spatial layout (i.e., ...
LXMERT: Learning Cross-Modality Encoder Representations from Transformers (Aug 20 2019). Vision-and-language reasoning requires an understanding of visual concepts, language semantics, and, most importantly, the alignment and relationships between these two modalities. We thus propose the LXMERT (Learning Cross-Modality Encoder Representations ...
Probabilistic Reconstruction Networks for 3D Shape Inference from a Single Image (Aug 20 2019). We study end-to-end learning strategies for 3D shape inference from images, in particular from a single image. Several approaches in this direction have been investigated that explore different shape representations and suitable learning architectures. ...
Playing magic tricks to deep neural networks untangles human deception (Aug 20 2019). Magic is the art of producing in the spectator an illusion of impossibility. Although the scientific study of magic is in its infancy, the advent of recent tracking algorithms based on deep learning now makes it possible to quantify the skills of the magician in ...
Multi-Modal Recognition of Worker Activity for Human-Centered Intelligent Manufacturing (Aug 20 2019). In a human-centered intelligent manufacturing system, sensing and understanding of the worker's activity are the primary tasks. In this paper, we propose a novel multi-modal approach for worker activity recognition by leveraging information from different ...
Pix2Pose: Pixel-Wise Coordinate Regression of Objects for 6D Pose Estimation (Aug 20 2019). Estimating the 6D pose of objects using only RGB images remains challenging because of problems such as occlusion and symmetries. It is also difficult to construct 3D models with precise texture without expert knowledge or specialized scanning devices. ...
Learning to Sit: Synthesizing Human-Chair Interactions via Hierarchical Control (Aug 20 2019). Recent progress on physics-based character animation has shown impressive breakthroughs on human motion synthesis, through the imitation of motion capture data via deep reinforcement learning. However, results have mostly been demonstrated on imitating ...
ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning (Aug 20 2019). In this paper we introduce ViSiL, a Video Similarity Learning architecture that considers fine-grained Spatio-Temporal relations between pairs of videos -- such relations are typically lost in previous video retrieval approaches that embed the whole frame ...
Blind Image Deconvolution using Pretrained Generative Priors (Aug 20 2019). This paper proposes a novel approach to regularize the ill-posed blind image deconvolution (blind image deblurring) problem using deep generative networks. We employ two separate deep generative models - one trained to produce sharp images while the other ...
A Novel method for IDC Prediction in Breast Cancer Histopathology images using Deep Residual Neural Networks (Aug 20 2019). Invasive ductal carcinoma (IDC), also sometimes known as infiltrating ductal carcinoma, is the most common form of breast cancer. It accounts for about 80% of all breast cancers. According to the American Cancer Society, more than 180,000 ...
Unsupervised Multi-modal Style Transfer for Cardiac MR Segmentation (Aug 20 2019). In this work, we present a fully automatic method to segment cardiac structures from late-gadolinium enhanced (LGE) images without using labelled LGE data for training, but instead by transferring the anatomical knowledge and features learned on annotated ...
Learning Semantic-Specific Graph Representation for Multi-Label Image Recognition (Aug 20 2019). Recognizing multiple labels of images is a practical and challenging task, and significant progress has been made by searching semantic-aware regions and modeling label dependency. However, current methods cannot locate the semantic regions accurately ...
Consistent Scale Normalization for Object Recognition (Aug 20 2019). Scale variation remains a challenging problem for object detection. Common paradigms usually adopt multi-scale training & testing (image pyramid) or FPN (feature pyramid network) to process objects across a wide scale range. However, multi-scale methods aggravate ...
Non-negative Sparse and Collaborative Representation for Pattern Classification (Aug 20 2019). Sparse representation (SR) and collaborative representation (CR) have been successfully applied in many pattern classification tasks such as face recognition. In this paper, we propose a novel Non-negative Sparse and Collaborative Representation (NSCR) ...
Towards High-Resolution Salient Object Detection (Aug 20 2019). Deep neural network based methods have made a significant breakthrough in salient object detection. However, they are typically limited to input images with low resolutions ($400\times400$ pixels or less). Little effort has been made to train deep neural ...
RelGAN: Multi-Domain Image-to-Image Translation via Relative Attributes (Aug 20 2019). Multi-domain image-to-image translation has gained increasing attention recently. Previous methods take an image and some target attributes as inputs and generate an output image with the desired attributes. However, such methods have two limitations. ...
Deep High-Resolution Representation Learning for Visual Recognition (Aug 20 2019). High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation ...
A Neural Virtual Anchor Synthesizer based on Seq2Seq and GAN Models (Aug 20 2019). This paper presents a novel framework to generate realistic face video of an anchor, who is reading certain news. This task is also known as Virtual Anchor. Given some paragraphs of words, we first utilize a pretrained Word2Vec model to embed each word ...
n-MeRCI: A new Metric to Evaluate the Correlation Between Predictive Uncertainty and True Error (Aug 20 2019). As deep learning applications are becoming more and more pervasive in robotics, the question of evaluating the reliability of inferences becomes a central question in the robotics community. This domain, known as predictive uncertainty, has come under ...
Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention (Aug 20 2019). This paper studies the problem of temporal moment localization in a long untrimmed video using natural language as the query. Given an untrimmed video and a sentence as the query, the goal is to determine the starting, and the ending, of the relevant ...
SROBB: Targeted Perceptual Loss for Single Image Super-Resolution (Aug 20 2019). By benefiting from perceptual losses, recent studies have significantly improved the performance of the super-resolution task, where a high-resolution image is resolved from its low-resolution counterpart. Although such objective functions generate near-photorealistic ...