MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation (Jun 14 2019). We tackle the fundamentally ill-posed problem of 3D human localization from monocular RGB images. Driven by the limitation of neural networks outputting point estimates, we address the ambiguity in the task with a new neural network predicting confidence ...
Collaborative GAN Sampling (Feb 02 2019). Generative adversarial networks (GANs) have shown great promise in generating complex data such as images. A standard practice in GANs is to discard the discriminator after training and use only the generator for sampling. However, this loses valuable ...
Collaborative Sampling in Generative Adversarial Networks (Feb 02 2019, Apr 19 2019). A standard practice in Generative Adversarial Networks (GANs) is to completely discard the discriminator when generating samples. However, this sampling method loses valuable information learned by the discriminator regarding the data distribution.
Let Me Not Lie: Learning MultiNomial Logit (Dec 23 2018). Discrete choice models generally assume that model specification is known a priori. In practice, determining the utility specification for a particular application remains a difficult task and model misspecification may lead to biased parameter estimates. ...
Perceptual Losses for Real-Time Style Transfer and Super-Resolution (Mar 27 2016). We consider image transformation problems, where an input image is transformed into an output image. Recent methods for such problems typically train feed-forward convolutional neural networks using a per-pixel loss between the output and ground-truth ...
Tracking The Untrackable: Learning To Track Multiple Cues with Long-Term Dependencies (Jan 08 2017, Apr 03 2017). The majority of existing solutions to the Multi-Target Tracking (MTT) problem do not combine cues in a coherent end-to-end fashion over a long period of time. However, we present an online method that encodes long-term temporal dependencies across multiple ...
Rethinking Person Re-Identification with Confidence (Jun 11 2019). A common challenge in person re-identification systems is to differentiate people with very similar appearances. The current learning frameworks based on cross-entropy minimization are not suited for this challenge. To tackle this issue, we propose to ...
Recurrent Attention Models for Depth-Based Person Identification (Nov 22 2016). We present an attention-based model that reasons on human body shape and motion dynamics to identify individuals in the absence of RGB information, hence in the dark. Our approach leverages unique 4D spatio-temporal signatures to address the identification ...
PifPaf: Composite Fields for Human Pose Estimation (Mar 15 2019, Apr 05 2019). We propose a new bottom-up method for multi-person 2D human pose estimation that is particularly well suited for urban mobility such as self-driving cars and delivery robots. The new method, PifPaf, uses a Part Intensity Field (PIF) to localize body parts ...
Knowledge Transfer for Scene-specific Motion Prediction (Mar 22 2016, Jul 26 2016). When given a single frame of a video, humans can not only interpret the content of the scene but also forecast the near future. This ability is mostly driven by their rich prior knowledge about the visual world, both in terms of (i) ...
Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition (Nov 28 2016). We present a unified framework for understanding human social behaviors in raw image sequences. Our model jointly detects multiple individuals, infers their social actions, and estimates the collective actions with a single feed-forward pass through a ...
Crowd-Robot Interaction: Crowd-aware Robot Navigation with Attention-based Deep Reinforcement Learning (Sep 24 2018, Feb 19 2019). Mobility in an effective and socially-compliant manner is an essential yet challenging task for robots operating in crowded spaces. Recent works have shown the power of deep reinforcement learning techniques to learn socially cooperative policies. However, ...
Convolutional Relational Machine for Group Activity Recognition (Apr 05 2019). We present an end-to-end deep Convolutional Neural Network called Convolutional Relational Machine (CRM) for recognizing group activities that utilizes the information in spatial relations between individual persons in image or video. It learns to produce ...
From Bits to Images: Inversion of Local Binary Descriptors (Nov 06 2012). Local Binary Descriptors are becoming more and more popular for image matching tasks, especially when going mobile. While they are extensively studied in this context, their ability to carry enough information in order to infer the original image is seldom ...
Characterizing and Improving Stability in Neural Style Transfer (May 05 2017). Recent progress in style transfer on images has focused on improving the quality of stylized images and the speed of methods. However, real-time methods are highly unstable, resulting in visible flickering when applied to videos. In this work we characterize ...
Forecasting Social Navigation in Crowded Complex Scenes (Jan 05 2016). When humans navigate a crowded space such as a university campus or the sidewalks of a busy street, they follow common sense rules based on social etiquette. In this paper, we argue that in order to enable the design of new algorithms that can take fully ...
Unsupervised Learning of Long-Term Motion Dynamics for Videos (Jan 07 2017, Apr 11 2017). We present an unsupervised representation learning approach that compactly encodes the motion dependencies in videos. Given a pair of images from a video clip, our framework learns to predict the long-term 3D motions. To reduce the complexity of the learning ...
Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks (Mar 29 2018). Understanding human motion behavior is critical for autonomous moving platforms (like self-driving cars and social robots) if they are to navigate human-centric environments. This is challenging because human motion is inherently multimodal: given a history ...
Towards Viewpoint Invariant 3D Human Pose Estimation (Mar 23 2016, Jul 26 2016). We propose a viewpoint invariant model for 3D human pose estimation from a single depth image. To achieve this, our discriminative model embeds local regions into a learned viewpoint invariant feature space. Formulated as a multi-task learning problem, ...
CAR-Net: Clairvoyant Attentive Recurrent Network (Nov 28 2017, Jul 31 2018). We present an interpretable framework for path prediction that leverages dependencies between agents' behaviors and their spatial navigation environment. We exploit two sources of information: the past motion trajectory of the agent of interest and a ...
Video Analysis for Body-worn Cameras in Law Enforcement (Apr 11 2016). The social conventions and expectations around the appropriate use of imaging and video have been transformed by the availability of video cameras in our pockets. The impact on law enforcement can easily be seen by watching the nightly news; more and more ...
Towards Vision-Based Smart Hospitals: A System for Tracking and Monitoring Hand Hygiene Compliance (Aug 01 2017, Apr 24 2018). One in twenty-five patients admitted to a hospital will suffer from a hospital acquired infection. If we can intelligently track healthcare staff, patients, and visitors, we can better understand the sources of such infections. We envision a smart hospital ...
