
Interpretability in Contact-Rich Manipulation via Kinodynamic Images

Deep Neural Networks (NNs) have been widely utilized in contact-rich manipulation tasks to model the complicated contact dynamics. However, NN-based models are often difficult to decipher, which can lead to seemingly inexplicable behaviors and unidentifiable failure cases. In this work, we address the interpretability of NN-based models by introducing kinodynamic images. We propose a methodology that creates images from the kinematic and dynamic data of a contact-rich manipulation task. Our formulation visually reflects the task's state by encoding its kinodynamic variations and temporal evolution. By using images as the state representation, we enable the application of interpretability modules that were previously limited to vision-based tasks. We use this representation to train a Convolutional Neural Network (CNN), and we extract interpretations of the model's decisions with Grad-CAM, a technique that produces visual explanations. Our method is versatile and can be applied to any classification problem in manipulation tasks to visually interpret which parts of the input drive the model's decisions and to distinguish its failure modes, regardless of the features used. We evaluate this approach on two examples of real-world contact-rich manipulation, pushing and cutting, with known and unknown objects. Finally, we demonstrate that our method enables both detailed visual inspections of sequences in a task and high-level evaluations of a model's behavior and tendencies.
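As a rough illustration of what turning kinematic and dynamic data into an image can look like, the sketch below maps a window of time-series measurements to a grayscale grid with one row per feature and one column per time step. This is not the paper's formulation, which is not given on this page: the function name `window_to_image`, the per-feature ranges `lo` and `hi`, the row/column layout, and the linear intensity scaling are all assumptions made for illustration.

```python
def window_to_image(window, lo, hi, levels=255):
    """Encode a time window of features as a 2D grid of pixel intensities.

    Hypothetical helper, not the authors' method.
    window: list of time steps, each a list of feature values
            (e.g. positions, velocities, forces).
    lo, hi: assumed per-feature minimum and maximum used for scaling.
    Returns a list of rows: one row per feature, one column per time step.
    """
    n_features = len(window[0])
    image = []
    for f in range(n_features):
        span = hi[f] - lo[f]
        row = []
        for step in window:
            # Clamp to the assumed range so outliers do not overflow the scale.
            x = min(max(step[f], lo[f]), hi[f])
            # Linearly map the clamped value to an integer intensity.
            row.append(round((x - lo[f]) / span * levels) if span > 0 else 0)
        image.append(row)
    return image


# Three time steps of two made-up features (say, a velocity and a force).
example = window_to_image(
    [[0.0, -1.0], [0.5, 0.0], [1.0, 1.0]],
    lo=[0.0, -1.0],
    hi=[1.0, 1.0],
    levels=10,
)
```

With a layout like this, reading a row left to right shows how one feature evolves over the window, which is the kind of structure a CNN and a saliency method such as Grad-CAM can then operate on.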

 

Ioanna Mitsioni, Joonatan Mänttäri, Yiannis Karayiannidis, John Folkesson, and Danica Kragic, "Interpretability in Contact-Rich Manipulation via Kinodynamic Images", in IEEE International Conference on Robotics and Automation (ICRA) 2021, Xi'an, China [link to the paper]

