GenPoly: Learning Generalized and Tessellated Shape Priors via
3D Polymorphic Evolving

Bangzhen Liu1, Yuyang Yu1, Xuemiao Xu1, Cheng Xu2,
Chenxi Zheng1, Haoxin Yang1, Shaoyu Huang3, Shengfeng He4
1South China University of Technology, 2The Hong Kong Polytechnic University, 3Guangzhou Yichuang Information Technology Co., Ltd., 4Singapore Management University
teaser figure

We introduce a novel 3D prior model for generalized, fine-detailed 3D generation. (a) Current methods decompose objects into coarse-grained parts and focus on reassembling them into complete shapes, sacrificing local geometric details. (b) In contrast, our prior model explicitly excavates multi-level intricate local geometry variations and progressively refines shape details in a coarse-to-fine manner. This process yields versatile, tessellated priors that enable high-fidelity 3D generation across various downstream tasks.


Abstract

We introduce GenPoly, a novel generalized 3D prior model designed for multiple 3D generation tasks, with a focus on preserving fine details. While previous works learn generalizable representations by decomposing objects into coarse-grained components and reassembling them into a coherent global structure, this approach sacrifices small-scale details. In this paper, we take a different perspective, formulating 3D prior modeling as a bottom-up polymorphic evolving process. Our key insight is that, beyond global structures, intricate local geometry variations hold rich contextual information that should be incorporated into the modeling process to learn fine-grained, generalizable representations. This allows coarse shapes to progressively evolve through multi-granular local geometry refinements, enabling high-fidelity 3D generation. To this end, we first introduce a polymorphic variational autoencoder (Poly-VAE), which constructs a versatile shape residual codebook via a polymorphic quantization mechanism. This codebook strategically encodes intricate local geometry representations from tessellated shapes within the latent space. Building on these representations, a 3D polymorphic evolving scheme is further developed to progressively refine local details in a coarse-to-fine manner. In this way, visually compelling 3D shapes with rich and complex details can ultimately be generated. The effectiveness of our method is demonstrated through extensive qualitative and quantitative evaluations, where GenPoly consistently surpasses state-of-the-art methods across various downstream tasks, particularly in local detail preservation.


Method

method figure

Overview of the proposed Poly-VAE framework. Starting from an input shape \( \mathcal{X} \), the shape feature \( Z \) is first extracted by a 3D encoder \( \mathit{E} \) and progressively quantized into polymorphic geometric representations to construct a diverse polymorphic residual codebook \( \mathcal{C} \). This is achieved by an \( n \)-branch polymorphic quantization mechanism, in which the first branch quantizes the coarse shape features \( Z_1 \) and the subsequent branches capture the polymorphic residuals \( \left\{Z_2, ..., Z_n\right\} \). These local geometric features, carrying diverse geometric contexts, are maintained in a unified polymorphic residual codebook \( \mathcal{C} \) to facilitate the generation of 3D shapes with rich details. Finally, the quantized features \( \{\hat{Z}_1, ..., \hat{Z}_n\} \) are aggregated and decoded to reconstruct the input 3D shape via a shared decoder \( \mathit{D} \).
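The multi-branch quantization described above follows the general pattern of residual vector quantization: each branch quantizes what the previous branches failed to capture, and the branch outputs are summed before decoding. The following numpy sketch illustrates this pattern only; the function names, codebook size, and feature dimensions are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def quantize(z, codebook):
    """Nearest-neighbor lookup: snap each feature vector to its closest code."""
    # z: (m, d), codebook: (k, d) -> pairwise squared distances: (m, k)
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return codebook[dists.argmin(axis=1)]

def polymorphic_residual_quantize(z, codebook, n_branches=3):
    """n-branch residual quantization: branch 1 encodes the coarse features,
    and each later branch encodes the residual left by the branches before it."""
    residual = z
    aggregated = np.zeros_like(z)
    per_branch = []
    for _ in range(n_branches):
        q = quantize(residual, codebook)  # quantized features of this branch
        per_branch.append(q)
        aggregated += q                   # sum of all branch outputs so far
        residual = residual - q           # what remains for the next branch
    return aggregated, per_branch

rng = np.random.default_rng(0)
codebook = rng.normal(size=(64, 8))   # a shared codebook (64 codes, 8 dims) -- illustrative
z = rng.normal(size=(16, 8))          # stand-in for the encoder output Z
z_hat, branches = polymorphic_residual_quantize(z, codebook, n_branches=4)
```

A decoder would then map the aggregated `z_hat` back to a shape, with the deeper branches contributing progressively finer geometric corrections.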


Unconditional 3D Shape Generation

[Gallery: generated chair shapes]
[Gallery: generated car shapes]
[Gallery: generated airplane shapes]
(a) TIGER
(b) SDF-Diff
(c) 3DQD
(d) Ours
Qualitative comparisons for unconditional generation. For a fairer evaluation, we select visually similar shapes generated by each method. Our generated shapes are of high quality, with fine geometric details and smooth surfaces. Best viewed zoomed in, in the digital version.
[Gallery: additional generated chair shapes]
[Gallery: additional generated car shapes]
[Gallery: additional generated airplane shapes]
Here are more qualitative results of unconditional generation for chairs, cars, and airplanes on ShapeNet. Our Poly-VAE sufficiently captures geometric information at various scales, facilitating detail-preserving 3D shape generation with diverse styles and fine details. Best viewed zoomed in, in the digital version.

Text-conditioned Generation

A chair with circular seat.
[Gallery: generated results]
A chair that has a really long backrest.
[Gallery: generated results]
A chair with an opening at the bottom of the backrest.
[Gallery: generated results]
A wooden chair featuring slats on all sides, including the armrests.
[Gallery: generated results]
(a) 3DQD
(b) SDFusion
(c) Ours
Qualitative comparisons for text-conditioned generation. Our generated shapes better preserve details and align more closely with the text descriptions. Best viewed zoomed in, in the digital version.

Single-view Shape Reconstruction

[Gallery: input images and corresponding reconstructed shapes]
Examples of single-view shape reconstruction. With low-cost fine-tuning, our method quickly and flexibly generalizes to image-conditioned reconstruction tasks in real-world scenes.