DoughNet 🍩
A Visual Predictive Model for Topological Manipulation of Deformable Objects
Dominik Bauer 1, Zhenjia Xu 1,2, Shuran Song 1,2
1 Columbia University, 2 Stanford University
tl;dr Our predictive model enables planning of robotic manipulation for geometrical deformation and topological change of elastoplastic objects; taking only a single RGBD observation as input, we select the best tool, its pose and opening width to recreate the desired robot- or human-made goal shape.
Planning Topological Manipulation
Given a partial RGBD observation of (held-out) objects and a set of (held-out) tools that the robot may use, our approach selects (1) the best-suited tool, (2) its in-plane pose, and (3) its final opening width to satisfy both the geometry and the topology of a given goal state.
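As a rough illustration of this kind of selection, the sketch below scores every candidate (tool, pose, opening width) with a cost that combines a geometric term and a topological term, and keeps the cheapest candidate. Everything in it is an assumption made for illustration and not the DoughNet implementation: the `Action` and `State` types, the Chamfer distance as geometric term, the component count as a stand-in for topology, the 1D `position` as a stand-in for the in-plane pose, the weight `topo_weight`, and the hand-written `toy_predict` that takes the place of the learned model.

```python
from dataclasses import dataclass
from itertools import product
from math import dist
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Action:
    tool: str
    position: float  # hypothetical 1D stand-in for the in-plane pose
    width: float     # final opening width


@dataclass(frozen=True)
class State:
    points: Tuple[Point, ...]
    num_components: int  # hypothetical stand-in for the object's topology


def chamfer(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance between two non-empty point sets."""
    def one_way(src, dst):
        return sum(min(dist(p, q) for q in dst) for p in src) / len(src)
    return one_way(a, b) + one_way(b, a)


def cost(pred: State, goal: State, topo_weight: float = 10.0) -> float:
    """Geometric error plus a penalty for a mismatched component count."""
    topo_error = abs(pred.num_components - goal.num_components)
    return chamfer(pred.points, goal.points) + topo_weight * topo_error


def plan(initial: State, goal: State, tools: Sequence[str],
         positions: Sequence[float], widths: Sequence[float],
         predict: Callable[[State, Action], State]) -> Action:
    """Exhaustively score all candidates and return the cheapest one."""
    candidates = (Action(t, p, w) for t, p, w in product(tools, positions, widths))
    return min(candidates, key=lambda a: cost(predict(initial, a), goal))


def toy_predict(state: State, action: Action) -> State:
    """Toy dynamics: only a fully closed knife cuts the rod in two."""
    if action.tool == "knife" and action.width == 0.0:
        kept = tuple(p for p in state.points if abs(p[0] - action.position) > 0.05)
        return State(kept, 2)
    return state


# A rod of 21 points along the x-axis; the goal is the rod cut at x = 0.5.
initial = State(tuple((i / 20, 0.0) for i in range(21)), 1)
goal = toy_predict(initial, Action("knife", 0.5, 0.0))
best = plan(initial, goal, ["gripper", "knife"], [0.25, 0.5, 0.75], [0.0, 0.1], toy_predict)
print(best)
```

The topological term is what separates candidates that look geometrically close to the goal but leave the object in one piece from those that actually split it.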
Below, we provide the last frame in the left video as goal state and the first frame in the right video as initial state to our method. The left video shows the creation of the goal state; the right video shows the execution of our plan to achieve this goal state (from a similar initialization).
Predicting Deformation and Topological Change
The topological manipulation shown above is enabled by our learned predictive model. It consists of two components: (1) a denoising autoencoder that embeds and completes partial point-cloud observations in a topology-aware latent space; and (2) an autoregressive set-to-set model that takes this representation of the current object state together with the desired tool motion and predicts the next latent state.
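The autoregressive structure can be sketched in a few lines: the observation is encoded once, and each subsequent latent state is predicted from the previous prediction and the next tool motion. The function name `rollout` and the callables `encode` and `step` are hypothetical placeholders for the two learned components; the scalar "latent" in the usage example is a toy and says nothing about the actual latent representation.

```python
from typing import Callable, List, Sequence, TypeVar

Obs = TypeVar("Obs")
Latent = TypeVar("Latent")
Motion = TypeVar("Motion")


def rollout(observation: Obs,
            tool_motions: Sequence[Motion],
            encode: Callable[[Obs], Latent],
            step: Callable[[Latent, Motion], Latent]) -> List[Latent]:
    """Encode once, then predict each latent state from the previous prediction."""
    latent = encode(observation)
    latents = [latent]
    for motion in tool_motions:
        # The model consumes its own previous output, not a new observation.
        latent = step(latent, motion)
        latents.append(latent)
    return latents


# Toy usage: the "latent" is the object's thickness, each motion closes the tool a bit.
trajectory = rollout(1.0, [0.25, 0.25],
                     encode=lambda obs: obs,
                     step=lambda z, closing: z - closing)
print(trajectory)
```

Because only the first step sees a real observation, any prediction error is carried into all later steps, which is one reason a completed, denoised latent state is useful as the starting point.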
In the examples below, the top row shows color and depth observations of a real-world manipulation trajectory. The bottom row shows the partial point cloud extracted from these observations together with the predictions of DoughNet, which takes the partial observation in the left column as input and predicts subsequent completed states from its own previous output. Different colors indicate the assignment to different components. The graph in the lower-right corner visualizes the predicted topology.
Synthesizing a Topological Manipulation Dataset
To determine whether a topological change has occurred, objects need to be perturbed. In the real world, however, this perturbation is destructive and introduces unwanted geometrical deformation. Instead, we use an MPM-based simulation to create a synthetic dataset of topological manipulations by employing two checking operations: (1) pulling previously disconnected components apart (opposite to the tool direction) to test for (self-)merging, and (2) pulling previously connected components apart (orthogonal to the tool direction) to test for splitting.
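The logic of the two checks can be illustrated on particle positions recorded after the pulling perturbation: if previously connected parts no longer hold together, a split has occurred; if previously disconnected parts still hold together, they have merged. The sketch below is an assumption-laden toy and not the dataset pipeline: it does not run an MPM simulation, and the connectivity test (particles closer than a hypothetical `radius` count as connected), the function names, and the 2D points are all chosen for illustration only.

```python
from math import dist
from typing import Sequence, Tuple

Point = Tuple[float, float]


def count_components(points: Sequence[Point], radius: float) -> int:
    """Count connected components of the graph linking points within `radius`."""
    parent = list(range(len(points)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= radius:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})


def classify_event(previously_connected: bool,
                   points_after_pull: Sequence[Point],
                   radius: float) -> str:
    """Compare connectivity before the manipulation with connectivity after the pull."""
    still_connected = count_components(points_after_pull, radius) == 1
    if previously_connected and not still_connected:
        return "split"
    if not previously_connected and still_connected:
        return "merge"
    return "none"


# Previously one piece; after the pull the particles form two separate groups.
pulled_apart = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0)]
print(classify_event(True, pulled_apart, radius=0.2))

# Previously two pieces; after the pull the particles still hold together.
held_together = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0)]
print(classify_event(False, held_together, radius=0.15))
```

In simulation the pull can be applied to a copy of the state, so the check does not disturb the trajectory being recorded, which is what makes it impractical on a real object.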
BibTeX
@article{bauer2024doughnet,
title={DoughNet: A Visual Predictive Model for Topological Manipulation of Deformable Objects},
author={Bauer, Dominik and Xu, Zhenjia and Song, Shuran},
journal={European Conference on Computer Vision (ECCV)},
year={2024}
}
Contact
If you have any questions, please contact Dominik Bauer.