Single View Depth Completion of Sparse 3D Reconstructions
Type of document
Master thesis
Author
Rakshith Madhavan
Supervisor
Pajdla Tomáš
Opponent
Zimmermann Karel
Field of study
Cybernetics and Robotics
Study program
Cybernetics and Robotics
Institution assigning rank
Department of Cybernetics
Rights
A university thesis is a work protected by the Copyright Act. Extracts, copies and transcripts of the thesis are allowed for personal use only and at one's own expense. The use of the thesis should be in compliance with the Copyright Act http://www.mkcr.cz/assets/autorske-pravo/01-3982006.pdf and the citation ethics http://knihovny.cvut.cz/vychova/vskp.html
Abstract
This work outlines a methodology to infer dense depth of a scene from an RGB image and its corresponding sparse point cloud, using an unsupervised training paradigm and combining it with a visual odometry algorithm such as ORB-SLAM [2] in an offline step to densify the point clouds from its sparse mapping. The network consists of a sparse-to-dense module and an encoder that creates a 3D positional encoding of the image with a Calibrated Backprojection layer; the decoder produces the dense depth map. The network is trained without supervision on data from SLAM by minimizing the photometric reprojection error between frames. Inference is then run on the SLAM keyframes and the sparse depth from their corresponding keypoints to produce dense depth. With the depth estimate, points from these keyframes are back-projected into the point cloud, resulting in a denser representation of the scene, especially in low-textured areas where reconstruction from SLAM usually fails.
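The densification step described above can be sketched as follows: given a keyframe's predicted dense depth map, its pinhole intrinsics, and its camera-to-world pose from SLAM, every pixel is back-projected into a world-space 3D point. This is a minimal illustrative sketch, not the thesis implementation; the function name and array conventions are assumptions.

```python
import numpy as np

def backproject_depth(depth, K, pose):
    """Back-project a dense depth map into world-space 3D points.

    depth: (H, W) dense depth map (e.g. the network's prediction)
    K:     (3, 3) pinhole camera intrinsics
    pose:  (4, 4) camera-to-world transform of the keyframe (from SLAM)
    Returns an (H*W, 3) array of world-space points.
    """
    H, W = depth.shape
    # Pixel grid in homogeneous coordinates, shape 3 x (H*W)
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    # Normalized camera rays, scaled by per-pixel depth
    rays = np.linalg.inv(K) @ pix
    pts_cam = rays * depth.reshape(1, -1)
    # Lift to homogeneous coordinates and transform into world frame
    pts_h = np.vstack([pts_cam, np.ones((1, pts_cam.shape[1]))])
    pts_world = (pose @ pts_h)[:3].T
    return pts_world
```

Appending the returned points from each keyframe to the SLAM map yields the denser scene representation; in practice one would also filter points by a depth-confidence or reprojection-error threshold.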
Collections
- Master theses - 13133 [462]