Manhattan-world urban scenes are common in the real world.
We propose a fully automatic approach for reconstructing such scenes
from 3D point samples. Our key idea is to represent the geometry of the
buildings in the scene using a set of well-aligned boxes. We first extract
plane hypotheses from the points and then refine them iteratively.
Then, candidate boxes are obtained by partitioning the space of the point
cloud into a non-uniform grid. After that, we choose an optimal subset of
the candidate boxes to approximate the geometry of the buildings. Our
main contribution is to cast scene reconstruction as a labeling problem,
which we solve with a novel Markov Random Field formulation.
Unlike previous methods designed for particular types of
input point clouds, our method can obtain faithful reconstructions from
a variety of data sources. Experiments demonstrate that our method is
superior to state-of-the-art methods.
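
As an illustration of the candidate-generation step only, the following minimal sketch enumerates the cells of a non-uniform grid as axis-aligned boxes. It assumes the refined planes are already aligned with the three coordinate axes and are given as lists of offsets; the function name `candidate_boxes` and the sample offsets are hypothetical and do not come from the method itself. The plane extraction and the Markov Random Field selection are not shown.

```python
from itertools import product


def candidate_boxes(xs, ys, zs):
    """Enumerate the cells of a non-uniform grid as axis-aligned boxes.

    xs, ys, zs are the offsets of axis-aligned planes along each axis
    (an assumption of this sketch). Each box is returned as a pair
    (min_corner, max_corner).
    """
    def intervals(values):
        # Sort and de-duplicate offsets, then pair up consecutive ones.
        v = sorted(set(values))
        return list(zip(v[:-1], v[1:]))

    return [
        ((x0, y0, z0), (x1, y1, z1))
        for (x0, x1), (y0, y1), (z0, z1) in product(
            intervals(xs), intervals(ys), intervals(zs)
        )
    ]


# Hypothetical plane offsets: 3 planes along x, 2 along y, 3 along z.
boxes = candidate_boxes([0.0, 4.0, 10.0], [0.0, 3.0], [0.0, 2.5, 6.0])
print(len(boxes))  # 2 * 1 * 2 = 4 candidate boxes
```

Because the grid lines follow the detected planes rather than a fixed spacing, the cells vary in size; a later selection step would then label each candidate as kept or discarded.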