Voxel-Based 3D Object Generation from Single Images Using an Enhanced Deep Learning Architecture.
Deep learning has revolutionised 3D modelling by providing powerful tools for generating three-dimensional objects from various input sources, such as images, point clouds, and even textual descriptions. The ability to reconstruct accurate 3D models from limited information is crucial for numerous applications, including computer-aided design (CAD), virtual reality (VR), augmented reality (AR), and game development. However, generating precise 3D objects from single-view images remains a significant challenge because of geometric complexity, occlusion, and computational cost. This study aims to improve and optimise deep learning algorithms for generating accurate 3D objects, building on existing computer vision and graphics methods and thereby improving the efficiency of computer-aided design. The study used the ShapeNetCore (v2) dataset, which includes over 51,000 unique 3D models across 55 categories. An enhanced deep learning architecture was used to generate 3D objects as voxel grids from single-view images. Data processing and model training were conducted in the PyTorch framework, which offers flexibility and efficiency in building and training deep neural networks. To address geometric complexity and occlusion, data preprocessing techniques, including data augmentation and normalisation, were incorporated to enhance the quality and diversity of the training data. Model performance was evaluated with Chamfer Distance and Intersection-over-Union (IoU). Chamfer Distance quantifies the dissimilarity between the predicted and ground-truth point clouds (lower is better), while IoU measures the overlap between the predicted and actual voxel grids (higher is better). Preliminary experimental results demonstrate that the proposed model generates accurate 3D objects from single images, achieving an overall IoU score of 0.6549.
These initial findings suggest that the model performs well across various object categories. This work contributes to the field of 3D object generation by presenting an optimised deep learning solution that enhances the accuracy of reconstructed objects. The model's adaptability to various object categories and its potential applications in computer-aided design, virtual reality, and game development highlight its significance in advancing 3D modelling technologies. [ABSTRACT FROM AUTHOR]
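The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses NumPy rather than PyTorch to stay self-contained, and the function names `voxel_iou` and `chamfer_distance`, the 0.5 occupancy threshold, and the squared-distance form of the Chamfer Distance are all assumptions made for illustration (definitions of Chamfer Distance vary between papers).

```python
import numpy as np


def voxel_iou(pred, target, threshold=0.5):
    """IoU between a predicted occupancy grid and a binary ground-truth grid.

    The 0.5 threshold is an illustrative assumption, not a value from the paper.
    """
    pred_occ = np.asarray(pred) >= threshold      # binarise predicted occupancies
    true_occ = np.asarray(target).astype(bool)
    union = np.logical_or(pred_occ, true_occ).sum()
    if union == 0:
        return 1.0  # both grids empty: treated here as a perfect match (a convention)
    intersection = np.logical_and(pred_occ, true_occ).sum()
    return float(intersection / union)


def chamfer_distance(a, b):
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3).

    Uses squared Euclidean distances averaged per direction (one common variant).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise squared distances, shape (N, M). Memory grows as N * M.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    # Mean nearest-neighbour distance from a to b, plus the same from b to a.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())


# Toy voxel grids: 8 predicted voxels, 12 ground-truth voxels, 8 shared.
pred = np.zeros((4, 4, 4))
pred[:2, :2, :2] = 0.9
target = np.zeros((4, 4, 4))
target[:2, :2, :3] = 1
iou = voxel_iou(pred, target)  # intersection 8, union 12

# Toy point clouds that differ in a single point.
points_a = [[0, 0, 0], [1, 0, 0]]
points_b = [[0, 0, 0], [1, 0, 1]]
cd = chamfer_distance(points_a, points_b)

print(f"IoU = {iou:.4f}, Chamfer Distance = {cd:.4f}")
```

In this sketch, IoU rises towards 1 as the grids overlap more, while Chamfer Distance falls towards 0 as the point sets coincide, so the two metrics move in opposite directions as reconstruction quality improves.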
Copyright of Vilnius University Proceedings is the property of Vilnius University and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)