Abstract:
3D models have become an essential part of many applications, ranging from computer games and movie special effects to architectural design, virtual heritage, visual impact studies, and virtual environments. Traditionally, 3D model creation is done using modelling systems such as Maya or Blender. Although these systems enable the construction of highly realistic and complex 3D models, they have a steep learning curve and require a considerable amount of training to use. The introduction of specialised hardware, such as laser scanners, has simplified the creation of models from real physical objects. However, while many of these systems produce highly accurate results, they are usually extremely costly and often impose restrictions on the size and surface properties of objects in the scene. Creating photorealistic 3D models of a scene from a collection of unconstrained and uncalibrated images is a demanding and enduring research problem in both computer vision and computer graphics. Although immense progress has recently been made on 3D reconstruction approaches, they are still not accurate and robust enough for general-purpose production use. Additionally, most 3D reconstruction methods still depend heavily on calibrated cameras with known orientations to establish the transformation between the three-dimensional object space and the two-dimensional image space. This renders them impractical when the intrinsic and extrinsic parameters of the camera are unavailable. To make 3D reconstruction and visualisation processes more accessible to a wider group of users, the content creation step must be simplified. Hence, there is a critical need for tools that allow non-expert users to quickly and efficiently create complex 3D scenes. This thesis explores the practical aspects of visual-geometric reconstruction of a complex 3D scene from a sequence of unconstrained and uncalibrated 2D images.
These image sequences can be acquired with a video camera or a handheld digital camera, without the need for camera calibration. We propose a novel approach that integrates uncalibrated structure from motion, a patch-based multi-view stereopsis algorithm, and surface reconstruction to facilitate the 3D reconstruction process. Once supplied with the input images, our system automatically processes them and produces a 3D model. Our approach does not require any a priori information about the cameras being used. We tested our algorithm on a variety of datasets of objects at different scales, acquired under different weather and lighting conditions. The results indicate that our algorithm is stable and enables inexperienced users to easily create complex 3D content using a standard consumer-level camera.