Multi-object monocular SLAM for dynamic environments

Monocular multibody SLAM


— In this paper, we tackle the problem of multibody SLAM from a monocular camera. The term multibody, implies that we track the motion of the camera, as well as that of other dynamic participants in the scene. The quintessential challenge in dynamic scenes is unobservability; it is not possible to unambiguously triangulate a moving object from a moving monocular camera. Existing approaches solve restricted variants of the problem, but the solutions suffer relative scale ambiguity (i.e., a family of infinitely many solutions exist for each pair of motions in the scene). We solve this rather intractable problem by leveraging single-view metrology, advances in deep learning, and category-level shape estimation. We propose a multi posegraph optimization formulation, to resolve the relative and absolute scale factor ambiguities involved. This optimization helps us reduce the average error in trajectories of multiple bodies over real-world datasets, such as KITTI. To the best of our knowledge, our method is the first practical monocular multi-body SLAM system to perform dynamic multi-object and ego localization in a unified framework in metric scale.

In Intelligent Vehicles Symposium
Click the Cite button above to demo the feature to enable visitors to import publication metadata into their reference management software.
Click the Slides button above to demo Academic’s Markdown slides feature.

Supplementary notes can be added here, including code and math.

Junaid Ahmed Ansari
Masters Student
Krishna Murthy Jatavallabhula
Krishna Murthy Jatavallabhula

My research builds “world models” – a necessity for intelligent embodied agents acting in the real world. My work spans robotics, computer vision, graphics, and deep learning.

Madhava Krishna
comments powered by Disqus