dc.contributor Graduate Program in Computer Engineering.
dc.contributor.advisor Uğur, Emre.
dc.contributor.advisor Yanardağ, Pınar.
dc.contributor.author Dirik, Alara.
dc.date.accessioned 2023-10-15T06:27:10Z
dc.date.available 2023-10-15T06:27:10Z
dc.date.issued 2022
dc.identifier.other CMPE 2022 D57
dc.identifier.uri http://digitalarchive.boun.edu.tr/handle/123456789/19689
dc.description.abstract Computer graphics, 3D computer vision and robotics communities have produced multiple approaches to represent and generate 3D shapes, as well as a vast number of use cases. These use cases include, but are not limited to, data encoding and compression, shape completion and reconstruction from partial 3D views. However, controllable 3D shape generation and single-view reconstruction remain relatively unexplored topics that are tightly intertwined and can unlock new design approaches. In this work, we propose a unified 3D shape manipulation and single-view reconstruction framework that builds upon Deep Implicit Templates (DIT) [1], a 3D generative model that can also generate correspondence heat maps for a set of 3D shapes belonging to the same category. For this purpose, we start by providing a comprehensive overview of 3D shape representations and related work, and then describe our framework and proposed methods. Our framework uses ShapeNetV2 [2] as the core dataset and enables finding both unsupervised and supervised directions within Deep Implicit Templates. More specifically, we use PCA to find unsupervised directions within Deep Implicit Templates, which are shown to encode a variety of local and global changes across each shape category. In addition, we use the latent codes of encoded shapes and the metadata of the ShapeNet dataset to train linear SVMs and perform supervised manipulation of 3D shapes. Finally, we propose a novel framework that leverages the intermediate latent spaces of Vision Transformer (ViT) [3] and a joint image-text representation model, CLIP [4], for fast and efficient Single View Reconstruction (SVR). More specifically, we propose a novel mapping network architecture that learns a mapping between the latent spaces of ViT and CLIP, and DIT. Our results show that our method is view-agnostic and enables high-quality, real-time SVR.
dc.publisher Thesis (M.S.) - Boğaziçi University. Institute for Graduate Studies in Science and Engineering, 2022.
dc.subject.lcsh Three-dimensional imaging.
dc.subject.lcsh Robotics -- Computer simulation.
dc.title 3D shape generation and manipulation
dc.format.pages xii, 50 leaves
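
The abstract above describes finding unsupervised directions with PCA and supervised directions with linear SVMs over the latent codes of encoded shapes. The following is a minimal, hypothetical sketch of that general recipe, not the thesis implementation; the array names, attribute label, and file paths (latent_codes, has_armrests) are illustrative placeholders.

# Hypothetical sketch (not the thesis code): finding edit directions over an
# (N, D) matrix of per-shape latent codes. The .npy files are assumed placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

latent_codes = np.load("latent_codes.npy")   # (N, D): one latent code per shape
has_armrests = np.load("has_armrests.npy")   # (N,): binary attribute from shape metadata

# Unsupervised directions: principal components of the latent codes.
pca = PCA(n_components=10).fit(latent_codes)
unsupervised_directions = pca.components_    # (10, D): each row is a direction

# Supervised direction: unit normal of a linear SVM separating the attribute.
svm = LinearSVC(C=1.0, max_iter=10000).fit(latent_codes, has_armrests)
supervised_direction = svm.coef_[0] / np.linalg.norm(svm.coef_[0])

# A shape is edited by shifting its code along a direction and decoding the
# result with the generative model (the decoder itself is not shown here).
edited_code = latent_codes[0] + 2.0 * supervised_direction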

