Shape Priors for Real-Time Monocular Object Localization in Dynamic Environments

INFER: A visual abstract


Reconstruction of dynamic objects in a scene is a highly challenging problem in the context of SLAM. In this paper, we present a real-time monocular object localization system that estimates the shape and pose of dynamic objects in real-time, using video frames captured from a moving monocular camera. Although the problem seems to be ill-posed, we demonstrate that, by incorporating prior knowledge of the object category, we can obtain more detailed instance-level reconstructions. As opposed to earlier object model specifications, the proposed shape-prior model leads to the formulation of a Bundle Adjustment-like optimization problem for simultaneous shape and pose estimation. Leveraging recent successes of Convolutional Neural Networks (CNNs) for object keypoint localization, we present a CNN architecture that performs precise keypoint localization. We then demonstrate how these keypoints can be used to recover 3D object properties, while accounting for any 2D localization errors and self-occlusion. We show significant performance improvements compared to state-of-the-art monocular competitors for 2D keypoint detection, as well as 3D localization and reconstruction of dynamic objects.

In International Conference on Intelligent Robots and Systems
Click the Cite button above to demo the feature to enable visitors to import publication metadata into their reference management software.
Click the Slides button above to demo Academic’s Markdown slides feature.

Supplementary notes can be added here, including code and math.

Krishna Murthy Jatavallabhula
Krishna Murthy Jatavallabhula
PhD Candidate

My research blends robotics, computer vision, graphics, and physics with deep learning.

Sarthak Sharma
Masters Student
Madhava Krishna
comments powered by Disqus