We present a novel 3D pose refinement approach based on differentiable rendering for objects of arbitrary categories in the wild. In contrast to previous methods, we make two main contributions: First, instead of comparing real-world images and synthetic renderings in the RGB or mask space, we compare them in a feature space optimized for 3D pose refinement. Second, we introduce a novel differentiable renderer that learns to approximate the rasterization backward pass from data instead of relying on a hand-crafted algorithm. For this purpose, we predict deep cross-domain correspondences between RGB images and 3D model renderings in the form of what we call geometric correspondence fields. These correspondence fields serve as pixel-level gradients which are analytically propagated backward through the rendering pipeline to perform a gradient-based optimization directly on the 3D pose. In this way, we precisely align 3D models to objects in RGB images, which results in significantly improved 3D pose estimates. We evaluate our approach on the challenging Pix3D dataset and achieve up to 55% relative improvement compared to state-of-the-art refinement methods in multiple metrics.
Lecture Notes in Computer Science
16th European Conference on Computer Vision
23 August 2020 → 28 August 2020
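To make the core idea concrete, here is a minimal toy sketch (my own illustration, not the paper's code) of gradient-based pose refinement: per-point residuals between a transformed model and its observed correspondences stand in for the paper's geometric correspondence fields, and their analytic gradients are propagated back to the pose parameters. A 2D rigid pose (one rotation angle plus a translation) replaces the full 3D pose and rendering pipeline.

```python
import numpy as np

def transform(points, theta, t):
    """Apply a 2D rigid transform to an (N, 2) point set."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return points @ R.T + t

def refine_pose(model, observed, theta, t, lr=0.1, steps=500):
    """Refine (theta, t) by gradient descent on a point-alignment loss.

    Loss: 0.5 * mean_i ||R(theta) m_i + t - y_i||^2.
    The residuals play the role of pixel-level correspondence gradients;
    here they are propagated to the pose analytically.
    """
    for _ in range(steps):
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])
        resid = model @ R.T + t - observed      # per-point "correspondence" error
        dR = np.array([[-s, -c], [c, -s]])      # dR/dtheta
        g_theta = np.mean(np.sum(resid * (model @ dR.T), axis=1))
        g_t = resid.mean(axis=0)
        theta -= lr * g_theta
        t -= lr * g_t
    return theta, t

# Synthetic setup: observed points are the model under an unknown true pose.
model = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
true_theta, true_t = 0.3, np.array([0.5, -0.2])
observed = transform(model, true_theta, true_t)

# Start from a perturbed (identity) pose and refine toward the truth.
theta, t = refine_pose(model, observed, theta=0.0, t=np.zeros(2))
print(theta, t)
```

In the paper's setting, the residuals are not given but predicted by a network as geometric correspondence fields, and the backward pass runs through a full renderer; the optimization loop above mirrors only the outer gradient-descent structure.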