We introduce GenPoly, a novel generalized 3D prior model designed for multiple 3D generation tasks, with a focus on preserving fine details. Previous works learn generalizable representations by decomposing objects into coarse-grained components and reassembling them into a coherent global structure, but this approach sacrifices small-scale details. In this paper, we take a different perspective and formulate 3D prior modeling as a bottom-up polymorphic evolving process. Our key insight is that, beyond global structures, intricate local geometry variations hold rich contextual information that should be incorporated into the modeling process to learn fine-grained, generalizable representations. This allows coarse shapes to evolve progressively through multi-granular local geometry refinements, enabling high-fidelity 3D generation. To this end, we first introduce a polymorphic variational autoencoder (Poly-VAE), which constructs a versatile shape residual codebook via a polymorphic quantization mechanism. This codebook encodes intricate local geometry representations from tessellated shapes within the latent space. Building on these representations, we further develop a 3D polymorphic evolving scheme that progressively refines local details in a coarse-to-fine manner. In this way, visually compelling 3D shapes with rich and complex details can ultimately be generated. We demonstrate the effectiveness of our method through extensive qualitative and quantitative evaluations, in which GenPoly consistently surpasses state-of-the-art methods across various downstream tasks, particularly in local detail preservation.
Overview of the proposed Poly-VAE framework. Starting from an input shape \( \mathcal{X} \), the shape feature \( Z \) is first extracted by a 3D encoder \( \mathit{E} \) and then progressively quantized into polymorphic geometric representations, which are used to construct a diverse polymorphic residual codebook \( \mathcal{C} \). This is achieved with an \( n \)-branch polymorphic quantization mechanism: the first branch quantizes the coarse shape features \( Z_1 \), and the subsequent branches capture the polymorphic residuals \( \left\{Z_2, \ldots, Z_n\right\} \). These local geometric features, which carry diverse geometric contexts, are maintained in a unified polymorphic residual codebook \( \mathcal{C} \) to facilitate the generation of 3D shapes with rich details. Finally, the quantized features \( \{\hat{Z}_1, \ldots, \hat{Z}_n\} \) are aggregated and decoded by a shared decoder \( \mathit{D} \) to reconstruct the input 3D shape.
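To make the branch structure in the overview concrete, the following is a minimal sketch of plain residual vector quantization with a single shared codebook, written in NumPy. It is an illustration under stated assumptions rather than the Poly-VAE implementation: the names `nearest_code` and `residual_quantize` are hypothetical, the nearest-neighbour lookup and the aggregation by summation are assumed, and the encoder \( \mathit{E} \), decoder \( \mathit{D} \), and any training losses are omitted.

```python
import numpy as np


def nearest_code(z, codebook):
    """Return the closest codebook entry (and its index) for each row of z.

    z:        (m, d) array of feature vectors
    codebook: (k, d) array of code vectors
    """
    # Squared Euclidean distance between every feature and every code: (m, k).
    dist2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = dist2.argmin(axis=1)
    return codebook[idx], idx


def residual_quantize(z, codebook, n_branches):
    """Quantize z with n_branches passes over one shared codebook.

    Assumption: branch 1 quantizes the feature itself (the coarse part) and
    each later branch quantizes what the earlier branches left unexplained.
    Assumption: the quantized branch outputs are aggregated by summation.
    """
    residual = z.copy()
    quantized, indices = [], []
    for _ in range(n_branches):
        z_hat, idx = nearest_code(residual, codebook)
        quantized.append(z_hat)
        indices.append(idx)
        residual = residual - z_hat  # remainder passed to the next branch
    z_agg = np.sum(quantized, axis=0)  # aggregated feature for a decoder
    return z_agg, quantized, indices, residual


# Toy usage with random data standing in for encoder features.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(64, 8))  # k = 64 codes of dimension d = 8
z = rng.normal(size=(16, 8))         # m = 16 feature vectors
z_agg, quantized, indices, residual = residual_quantize(z, codebook, n_branches=4)
print(z_agg.shape, len(quantized), indices[0].shape)
```

In this sketch every branch indexes the same array, mirroring the unified codebook \( \mathcal{C} \) in the caption; by construction, the aggregated feature plus the final residual equals the original feature. How the actual method forms \( Z_1, \ldots, Z_n \) and aggregates \( \hat{Z}_1, \ldots, \hat{Z}_n \) may differ.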