Soft robot fingers equipped with tactile sensors grasping an egg. The bottom-right images show the tactile sensing results. Credit: Binghao Huang.

Integrated multi-modal sensing and learning system could give robots new capabilities

by · Tech Xplore

To assist humans with household chores and other everyday manual tasks, robots should be able to effectively manipulate objects that vary in composition, shape and size. The manipulation skills of robots have improved significantly over the past few years, in part due to the development of increasingly sophisticated cameras and tactile sensors.

Researchers at Columbia University have developed a new system that simultaneously captures both visual and tactile information. The tactile sensor they developed, introduced in a paper presented at the Conference on Robot Learning (CoRL) 2024 in Munich, could be integrated into robotic grippers and hands to further enhance the manipulation skills of robots with varying body structures.

The paper was published on the arXiv preprint server.

"Humans perceive the environment from multiple sensory modalities, among which touch plays a critical role in understanding physical interactions," Yunzhu Li, senior author of the paper, told Tech Xplore. "Our goal is to equip robots with similar capabilities, enabling them to sense the environment through both vision and touch for fine-grained robotic manipulation tasks."

As part of their study, the researchers set out to develop a multi-modal sensing system that gathers both visual data, which can be used to estimate the position and geometry of objects in the system's field of view, and tactile information, such as contact location, force and local interaction patterns.
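To make the tactile quantities mentioned above more concrete, here is a minimal sketch of how contact location and an overall force proxy might be read from a single frame of a grid-like tactile sensor. It is an illustration only, not the researchers' method: the function name `summarize_tactile_frame`, the 4x4 grid, the threshold value and the assumption that each reading grows with applied pressure are all hypothetical.

```python
def summarize_tactile_frame(frame, threshold=0.1):
    """Summarize one tactile frame given as a list of rows of readings.

    Hypothetical helper: assumes each reading grows with applied pressure.
    Returns (contacts, total, centroid), where contacts lists the (row, col)
    cells above the threshold, total is the sum of their readings (a rough
    force proxy), and centroid is the reading-weighted contact location,
    or None if nothing is touching the sensor.
    """
    contacts = []
    total = 0.0
    row_moment = 0.0
    col_moment = 0.0
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            if value > threshold:
                contacts.append((r, c))
                total += value
                row_moment += r * value
                col_moment += c * value
    if not contacts:
        return contacts, 0.0, None
    return contacts, total, (row_moment / total, col_moment / total)


# Made-up 4x4 frame with two equally pressed cells in column 2.
frame = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
contacts, total, centroid = summarize_tactile_frame(frame)
print(contacts, total, centroid)
```

In this toy frame the weighted contact location falls midway between the two pressed cells. A real sensor would also need calibration to turn raw readings into physical force, which this sketch leaves out.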

The integrated multi-modal sensing and learning system they developed, called 3D-ViTac, could give robots new sensing capabilities, allowing them to better tackle real-world manipulation tasks.

"Compared with existing state-of-the-art solutions, especially optical-based sensors, our sensor is as thin as a piece of paper, flexible, scalable and more robust for long-term use and large-scale data collection," explained Li.

"Coupled with visual observation, we developed an end-to-end imitation framework that enables robots to perform a variety of manipulation tasks, demonstrating significant improvements in safe interactions with fragile items and long-horizon tasks involving in-hand manipulation."

Li and his colleagues tested their sensor and the end-to-end imitation learning framework they developed in a series of experiments employing a real robotic system. Specifically, they integrated two of their sheet-like sensing devices onto each of a robotic gripper's fin-like hands.

The team then tested the gripper's performance on four challenging manipulation tasks: steaming an egg, placing grapes on a plate, grasping a hex key and serving a sandwich. The findings of these initial tests were very promising, as the sensor appeared to improve the gripper's ability to successfully complete all four tasks.

"We demonstrate that our proposed visuo-tactile imitation learning framework enables even low-cost robots to perform precise manipulation tasks," said Li. "It significantly outperforms vision-only approaches, particularly in handling fragile objects and achieving high precision in fine-grained manipulation."

The new sensor developed by this team of researchers could soon be deployed on other robotic systems and assessed on a broader range of object manipulation tasks that require high levels of precision. Meanwhile, Li and his colleagues plan to develop simulation methods and integration strategies that could make their sensor easier to apply and test on other robots.

"In our next studies, we aim to develop simulation techniques for tactile signals, explore ways to integrate the sensor into dexterous robotic hands and larger-scale surfaces (e.g., robot skin) and democratize tactile sensing in robotics," added Li.

"This will facilitate large-scale data collection and contribute toward multimodal robotic foundation models that better understand physical interactions through touch."

More information: Binghao Huang et al, 3D-ViTac: Learning Fine-Grained Manipulation with Visuo-Tactile Sensing, arXiv (2024). DOI: 10.48550/arxiv.2410.24091
Journal information: arXiv