We propose a method for accurate camera pose estimation in urban environments from single images and 2D maps made of the surrounding buildings’ outlines. Our approach bridges the gap between learning-based and geometric approaches: we use recent semantic segmentation techniques to extract the buildings’ edges and the façades’ normals from the images, and minimal solvers to compute the camera pose accurately and robustly. We propose two such minimal solvers: one based on three correspondences between buildings’ corners in the image and in the 2D map, and another based on two corner correspondences plus one façade correspondence. We show on a challenging dataset that, compared to the recent state of the art, this approach is both faster and more accurate.
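To make the three-corner case concrete, the following is a minimal sketch of a planar resection from three corner correspondences. It is not the authors' solver, whose formulation the abstract does not give. It assumes that each vertical building corner detected in the image has already been converted to a horizontal bearing angle in the camera frame (which in turn assumes known intrinsics and gravity direction), so that the pose reduces to a 2D position and a yaw angle. The names `pose_from_three_corners` and `det3` are hypothetical. Each correspondence yields one equation that is linear in (cos θ, sin θ, t_x, t_y), so three correspondences leave a one-dimensional null space whose scale is fixed by cos²θ + sin²θ = 1.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def pose_from_three_corners(map_pts, bearings):
    """Planar camera pose from three map corners and their bearings.

    map_pts  : three (x, y) corner positions in the 2D map frame.
    bearings : three angles (radians) of the same corners, measured in the
               camera frame relative to the camera's forward axis
               (assumed to be derived from the image beforehand).
    Returns (cx, cy, yaw): camera position and heading in the map frame.
    """
    # One row per correspondence: the map point, moved into the camera
    # frame, must be parallel to the observed bearing direction.
    rows = []
    for (px, py), a in zip(map_pts, bearings):
        s, c = math.sin(a), math.cos(a)
        rows.append([px * s - py * c, px * c + py * s, s, -c])

    # Null vector of the 3x4 system via signed 3x3 minors.
    v = []
    for j in range(4):
        minor = [[r[k] for k in range(4) if k != j] for r in rows]
        v.append((-1) ** j * det3(minor))

    # Fix the scale so that (cos, sin) has unit norm.
    norm = math.hypot(v[0], v[1])
    if norm < 1e-12:
        # E.g. the camera lies on the circle through the three corners.
        raise ValueError("degenerate configuration")
    ct, st, tx, ty = (x / norm for x in v)

    # Resolve the remaining sign: corners must lie along their bearings,
    # not in the opposite direction.
    depth = 0.0
    for (px, py), a in zip(map_pts, bearings):
        qx = ct * px + st * py + tx
        qy = -st * px + ct * py + ty
        depth += qx * math.cos(a) + qy * math.sin(a)
    if depth < 0:
        ct, st, tx, ty = -ct, -st, -tx, -ty

    # Camera position c = -R t, with R the camera-to-map rotation.
    cx = -(ct * tx - st * ty)
    cy = -(st * tx + ct * ty)
    return cx, cy, math.atan2(st, ct)
```

In a robust pipeline, a solver of this kind would typically be called inside a hypothesize-and-verify loop such as RANSAC over candidate corner matches; that outer loop is omitted here.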
Title of host publication: Proceedings of the British Machine Vision Conference (BMVC)
Published: 2017
28th British Machine Vision Conference: BMVC 2017 - London, United Kingdom
Duration: 4 Sept 2017 → 7 Apr 2018