DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data
1VCIP, CS, Nankai University
2NKIARI, Shenzhen Futian
3Horizon Robotics
*Work done as Research Intern
†Corresponding Author
ArXiv Preprint 2025
Abstract
We present DIPO, a novel framework for the controllable generation of articulated 3D objects from a pair of images: one depicting the object in a resting state and the other in an articulated state. Compared to single-image approaches, our dual-image input adds only a modest data-collection overhead while providing motion information that reliably guides the prediction of kinematic relationships between parts. Specifically, we propose a dual-image diffusion model that captures relationships between the image pair to generate part layouts and joint parameters. In addition, we introduce a Chain-of-Thought (CoT) based graph reasoner that explicitly infers part connectivity relationships. To further improve robustness and generalization on complex articulated objects, we develop a fully automated dataset expansion pipeline, named LEGO-Art, that enriches the diversity and complexity of the PartNet-Mobility dataset. Building on this pipeline, we propose PM-X, a large-scale dataset of complex articulated 3D objects, accompanied by rendered images, URDF annotations, and textual descriptions. Extensive experiments demonstrate that DIPO significantly outperforms existing baselines in both the resting state and the articulated state, while the proposed PM-X dataset further enhances generalization to diverse and structurally complex articulated objects.
Method Overview
Our DIPO framework consists of three key components:
1. Dual-State Image Conditioning: We condition the denoising process on both resting-state and articulated-state images using DINOv2 features. A Dual-State Injection Module integrates motion-aware cues by performing cross-attention between the two states (see the sketch after this list).
2. Chain-of-Thought Graph Reasoner: This module predicts articulated part connectivity graphs from dual-state images using a step-by-step reasoning paradigm. It identifies candidate parts, estimates spatial layouts, verifies articulation rules, and infers attachment relationships.
3. LEGO-Art Pipeline: A fully automated synthesis pipeline that generates complex articulated 3D assets by assembling part primitives from existing datasets. It includes Description Roller, Layout Builder, Scripting Toolkit, Retrieval & Render, and Visual Filter modules.
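As a concrete illustration of the dual-state conditioning in the first component, below is a minimal PyTorch sketch, assuming DINOv2 patch tokens are extracted from both images and fused by cross-attention before conditioning the denoiser. The module name DualStateInjection, the dimensions, and the residual/concatenation scheme are illustrative assumptions, not the released implementation.

import torch
import torch.nn as nn

class DualStateInjection(nn.Module):
    """Hedged sketch: fuse resting-state and articulated-state image tokens via
    cross-attention so the diffusion condition carries motion-aware cues about
    which parts moved between the two states."""

    def __init__(self, dim: int = 768, num_heads: int = 8):
        super().__init__()
        # Resting-state tokens attend to articulated-state tokens, and vice versa.
        self.rest_to_art = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.art_to_rest = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, rest_tokens: torch.Tensor, art_tokens: torch.Tensor) -> torch.Tensor:
        # rest_tokens, art_tokens: (B, N, dim) patch tokens from a frozen DINOv2 encoder.
        rest_ctx, _ = self.rest_to_art(rest_tokens, art_tokens, art_tokens)
        art_ctx, _ = self.art_to_rest(art_tokens, rest_tokens, rest_tokens)
        # Concatenate the motion-aware tokens of both states as the conditioning sequence.
        return self.norm(torch.cat([rest_tokens + rest_ctx, art_tokens + art_ctx], dim=1))

# Shape-only usage: DINOv2 ViT-B/14 yields 768-dim patch tokens.
fuser = DualStateInjection(dim=768)
cond = fuser(torch.randn(2, 256, 768), torch.randn(2, 256, 768))  # -> (2, 512, 768)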
PM-X Dataset & LEGO-Art
We introduce PM-X (PartNet-Mobility-Complex), a large-scale dataset of structurally complex articulated objects built using our LEGO-Art pipeline.
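For intuition about what a sample contains, the sketch below shows one hypothetical way a PM-X record could be represented in Python. The class and field names are illustrative assumptions; they only mirror standard URDF joint attributes and the accompanying assets described in the abstract (rendered images, URDF annotations, textual descriptions), not the dataset's actual schema.

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Joint:
    # Fields mirror common URDF joint attributes; names and defaults are illustrative.
    name: str
    joint_type: str                          # e.g. "revolute", "prismatic", "fixed"
    parent: str                              # parent part (link) name
    child: str                               # child part (link) name
    axis: Tuple[float, float, float] = (0.0, 0.0, 1.0)
    origin_xyz: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    limit: Tuple[float, float] = (0.0, 1.57)  # lower/upper bound of the motion range

@dataclass
class Part:
    name: str
    bbox: Tuple[float, ...]                  # part layout as a 3D box (center + size)

@dataclass
class ArticulatedObject:
    description: str                         # textual description shipped with the sample
    image_paths: List[str] = field(default_factory=list)  # rendered resting/articulated views
    parts: List[Part] = field(default_factory=list)
    joints: List[Joint] = field(default_factory=list)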
Quantitative Results
We evaluate our method on both PartNet-Mobility and ACD datasets, demonstrating superior performance in reconstruction quality and graph prediction accuracy.
Results on PartNet-Mobility Test Set
RS and AS denote metrics computed in the resting state and the articulated state, respectively; the first six columns measure reconstruction quality and the last column measures articulation graph prediction accuracy.

| Method | RS-dgIoU ↓ | AS-dgIoU ↓ | RS-dcDist ↓ | AS-dcDist ↓ | RS-dCD ↓ | AS-dCD ↓ | Graph Acc% ↑ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| URDFormer [6] | 1.2327 | 1.2332 | 0.2885 | 0.4403 | 0.4417 | 0.6910 | 6.62 |
| NAP-ICA [18] | 0.5706 | 0.5765 | 0.0563 | 0.2547 | 0.0209 | 0.3473 | 25.06 |
| SINGAPO [23] | 0.5134 | 0.5236 | 0.0487 | 0.1107 | 0.0191 | 0.1270 | 75.97 |
| DIPO (Ours) | 0.4561 | 0.4683 | 0.0359 | 0.0732 | 0.0132 | 0.0423 | 85.06 |
Qualitative Comparison
Our method produces higher visual quality and more accurate articulation graph predictions. Thanks to the large-scale, structurally diverse training data provided by the PM-X dataset, DIPO is also more robust when handling complex objects and real-world data.
Acknowledgements
This work was done while Ruiqi Wu was a Research Intern with Horizon Robotics. We thank the reviewers for their valuable feedback.
BibTeX
@article{wu2025dipo,
title={DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data},
author={Wu, Ruiqi and Wang, Xinjie and Liu, Liu and Guo, Chunle and Qiu, Jiaxiong and Li, Chongyi and Huang, Lichao and Su, Zhizhong and Cheng, Ming-Ming},
journal={arXiv preprint arXiv:2505.20460},
year={2025}
}
Contact
Feel free to contact us at wuruiqi@mail.nankai.edu.cn