|dc.description.abstract||The thesis presents several results in the area of 3D perception, with a focus on combining
learning and planning in active 3D mapping.
Autonomous robots, including those deployed in search and rescue operations or autonomous
vehicles, must build and maintain accurate representations of their surroundings
to operate efficiently and safely in human environments. These representations, or
maps, should encompass both low-level information about the geometry of the scene and
high-level semantic information, including recognized categories or individual objects.
In the first part, we propose a method of 3D object recognition based on matching local
invariant features, which is further extended to the 3D point cloud registration task and
evaluated on challenging real-world datasets. The method builds on a multi-stage feature
extraction pipeline composed of sparse keypoint detection to reduce complexity of
further stages, establishing local reference frames as a means to achieve invariance with
respect to rigid transformations without sacrificing descriptiveness of the underlying 3D
shape, and a compact description of the shape based on area-weighted normal projections.
For a moderate overlap between the laser scans, the registration method provides
superior registration accuracy compared to state-of-the-art methods including Generalized
ICP, 3D Normal-Distribution Transform, Fast Point-Feature Histograms, and
4-Points Congruent Sets.
In the second part, two tasks from the area of active 3D mapping are addressed,
namely simultaneous exploration and segmentation with a mobile robot in a search and
rescue scenario, and active 3D mapping using a sensor with steerable depth-measuring
rays, with applications in autonomous driving. For these tasks, we assume that the
localization is provided by an external source.
In the simultaneous exploration and segmentation task, we consider a mobile robot
exploring an unknown environment along a known path, using a static panoramic sensor
providing RGB and depth measurements, and controlling a narrow field-of-view
thermal camera mounted on a pan-tilt unit. The task is to control the sensor along
the path to maximize accuracy of segmentation of the surroundings into human body
and background categories. Since computationally demanding optimal control does not allow for online
replanning, we instead employ the optimal planner offline to provide guiding trajectories
for learning a CNN-based control policy in a guided Q-learning framework. A policy
initialization is proposed which takes advantage of a special structure of the task and
allows efficient learning of the policy.
In the active 3D mapping task, our method simultaneously learns to reconstruct a
dense 3D occupancy map from sparse measurements and optimizes the reactive control
of depth-measuring rays. We propose a fast prioritized greedy algorithm to solve
the control subtask online, which needs to update the cost function for only a small
fraction of possible rays in each iteration. An approximation ratio of the algorithm is
derived. We experimentally demonstrate, using the publicly available KITTI dataset, that
accuracy of the 3D reconstruction improves significantly when learning-to-reconstruct is coupled with
the optimization of depth-measuring rays.||cze