Robots that can learn to navigate warehouses safely

April 23, 2022

(News from Nanowerk) Robots have been working in factories for many years. But given the complex and diverse tasks they perform, as well as the safety concerns involved, most of them operate inside cages or behind safety glass to limit or prevent interaction with humans.

In warehouse operations, where goods are continuously sorted and moved, robots cannot be caged or stationary. And while large companies like Amazon have already integrated robots into their warehouses, these are highly customized and expensive systems in which robots are designed to operate in a particular facility, on predefined grids or well-defined pathways, under specific centralized programming that carefully directs their activity.

“For robots to be most useful in a warehouse, they will need to be smart enough to deploy easily and quickly in any facility; able to learn to navigate new, dynamic environments; and, most importantly, able to work safely with humans as well as with large fleets of other robots,” said Ding Zhao, principal investigator and assistant professor of mechanical engineering.

At Carnegie Mellon University, a team of engineers and computer scientists used their expertise in advanced manufacturing, robotics and artificial intelligence to develop the warehouse robots of the future.

The collaboration was formed at the university’s Manufacturing Futures Institute (MFI), which funds this research with grants from the Richard King Mellon Foundation. The foundation awarded a $20 million core grant in 2016 and an additional $30 million in May 2021 to support advanced manufacturing research and development at MFI.

Zhao and Martial Hebert, dean of the School of Computer Science and professor at the Robotics Institute, lead the warehouse robot project. They investigated several reinforcement learning techniques that showed measurable improvements over previous methods in simulated motion planning experiments. The software used in their test robot also performed well in path planning experiments at Mill 19, MFI’s collaborative workspace for advanced manufacturing.

“Thanks to advances in AI chips, sensors and algorithms, we are poised to revolutionize manufacturing robots,” Zhao said. The team builds on previous work on self-driving cars to develop warehouse robots capable of learning multitask route planning via safe reinforcement learning, training robots to quickly adapt to new environments and operate safely alongside human workers and human-operated vehicles.

MAPPER: robots capable of learning to plan their own trajectories

The group first developed a method (IEEE/RSJ 2020 International Conference on Intelligent Robots and Systems, “MAPPER: Multi-Agent Path Planning with Scalable Reinforcement Learning in Mixed Dynamic Environments”) that allows robots to continuously learn how to plan paths in large dynamic environments. Multi-Agent Path Planning with Scalable Reinforcement Learning (MAPPER) lets robots explore on their own and learn by trial and error, much as human babies accumulate experience over time to handle a variety of situations.
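The trial-and-error idea can be illustrated with a textbook tabular Q-learning loop. This is a generic sketch, not the MAPPER algorithm; the corridor environment and hyperparameters below are invented for illustration:

```python
import numpy as np

# Textbook tabular Q-learning on a 1-D corridor (states 0..5, goal at state 5),
# a generic illustration of trial-and-error learning; not the MAPPER algorithm.
rng = np.random.default_rng(0)
n_states = 6
actions = (-1, +1)                          # move left or move right
Q = np.zeros((n_states, len(actions)))
alpha, gamma, eps = 0.5, 0.9, 0.2           # learning rate, discount, exploration

for episode in range(200):
    s = 0
    while s != n_states - 1:
        # Explore occasionally, otherwise exploit the current value estimates.
        a = rng.integers(2) if rng.random() < eps else int(np.argmax(Q[s]))
        s2 = min(max(s + actions[a], 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else -0.1   # goal reward, small step cost
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
        s = s2

# After enough trial and error, the greedy policy heads right from every state.
policy = [int(np.argmax(Q[s])) for s in range(n_states - 1)]
print(policy)  # → [1, 1, 1, 1, 1]
```

The robot is never told the route; repeated episodes of exploration and reward feedback are what shape the value estimates, which is the sense in which such agents "accumulate experience."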

The decentralized method eliminates the need to program robots from a powerful central control computer. Instead, robots make independent decisions based on their own local observations. In this partially observable setting, each robot’s onboard sensors detect dynamic obstacles within a range of 10 to 30 meters. And with reinforcement learning, the robots can train continuously, even indefinitely, to handle unknown dynamic obstacles.
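As a rough illustration of decision-making from local observations only, the sketch below hand-codes a stand-in for the learned policy: a robot extracts a small occupancy window around itself and greedily picks a collision-free step toward its goal, waiting when a dynamic obstacle blocks the way. The helper names (`local_window`, `greedy_step`) and the grid setup are hypothetical, not from the paper:

```python
import numpy as np

# Hand-coded stand-in for a learned local policy (illustrative only): each
# robot sees only a small occupancy window around itself and independently
# picks a collision-free step toward its goal.

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def local_window(grid, pos, radius=2):
    """The robot's partial observation: a (2r+1) x (2r+1) patch around pos."""
    r, c = pos
    padded = np.pad(grid, radius, constant_values=1)  # out-of-bounds = blocked
    return padded[r:r + 2 * radius + 1, c:c + 2 * radius + 1]

def greedy_step(grid, pos, goal):
    """Pick the free neighboring cell closest to the goal; wait if none helps."""
    obs = local_window(grid, pos)
    center = obs.shape[0] // 2
    best = pos
    best_dist = abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])
    for dr, dc in MOVES:
        if obs[center + dr, center + dc] == 1:        # blocked in local view
            continue
        cand = (pos[0] + dr, pos[1] + dc)
        dist = abs(cand[0] - goal[0]) + abs(cand[1] - goal[1])
        if dist < best_dist:
            best, best_dist = cand, dist
    return best

grid = np.zeros((5, 6), dtype=int)
pos, goal = (2, 0), (2, 5)
path = [pos]
for t in range(8):
    grid[:] = 0
    if t < 3:
        grid[2, 2] = 1      # a dynamic obstacle blocks the aisle for 3 steps
    pos = greedy_step(grid, pos, goal)
    path.append(pos)
print(path)  # the robot waits at (2, 1) until the obstacle clears, then proceeds
```

Because each robot reasons only over its own window, no central computer needs a global map of every agent, which is what makes adding or removing robots cheap.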

In November 2021, Ding Zhao and his students demonstrated the capabilities of their warehouse robot to Pennsylvania Senators Ryan Aument, Joe Pittman, and Pat Stefano, and Representatives Josh Kail and Natalie Mihalek, who were visiting Carnegie Mellon College of Engineering. (Photo: Carnegie Mellon University)

These smart robots can make it easier and faster for warehouses to deploy large robot fleets. Since computation is performed with each robot’s onboard resources, computational complexity grows only modestly as the number of robots increases, making it easy to add, remove, or replace robots.

Energy consumption could also be reduced, since robots independently learn to plan their own efficient routes and therefore travel shorter distances. The decentralized, partially observable setting also reduces communication and computation energy compared with classical centralized methods.

RCE: robots that prioritize safety in the pursuit of a programmed objective

Another successful study (“Constrained Model-based Reinforcement Learning with Robust Cross-Entropy Method”) applied constrained model-based reinforcement learning using the Robust Cross-Entropy (RCE) method.

Researchers must explicitly build safety constraints into a learning robot so that it does not sacrifice safety to complete its tasks. For example, the robot must achieve its goal without colliding with other robots, damaging property, or interfering with equipment.

“Although reinforcement learning methods have achieved great success in virtual applications, such as computer games, there are still a number of difficulties in applying them to real-world robotic applications. Among them, safety is paramount,” Zhao said.

Creating safety constraints that are respected at all times and under all conditions goes beyond traditional reinforcement learning methods into the increasingly important field of safe reinforcement learning, which is essential to the deployment of these new technologies.
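The core idea of a cross-entropy method under constraints can be sketched in a toy one-dimensional setting: sample candidate actions, heavily penalize those that enter an unsafe region, and refit the sampling distribution to the best safe candidates. The actual RCE method plans over a learned dynamics model; this simplified sketch, with invented `task_cost` and `constraint_violation` functions, shows only the constrained-optimization step:

```python
import numpy as np

# Toy 1-D illustration of the constrained cross-entropy idea (invented setup,
# not the RCE implementation): sample actions, rank safe ones by task cost,
# and refit the sampler to the elite candidates.
rng = np.random.default_rng(0)

def task_cost(a):
    """Distance from the goal position 1.0 after taking action a from 0.0."""
    return abs(a - 1.0)

def constraint_violation(a):
    """1.0 if the action lands in the unsafe zone [0.4, 0.6], else 0.0."""
    return float(0.4 <= a <= 0.6)

def constrained_cem(n_iters=20, n_samples=200, n_elite=20):
    mu, sigma = 0.0, 1.0
    for _ in range(n_iters):
        samples = rng.normal(mu, sigma, n_samples)
        # A large penalty ranks unsafe samples last, so elites are safe first.
        scores = np.array([task_cost(a) + 1e3 * constraint_violation(a)
                           for a in samples])
        elites = samples[np.argsort(scores)[:n_elite]]
        mu, sigma = elites.mean(), elites.std() + 1e-6
    return mu

best = constrained_cem()
print(round(best, 2))  # converges near the best safe action, 1.0
```

The penalty term is what keeps the optimizer from "sacrificing safety to complete tasks": an unsafe action can never outrank a safe one, no matter how low its task cost.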

Mengdi Xu, a third-year Ph.D. student at CMU’s Safe AI Lab, works with an intelligent manufacturing manipulation robot. (Photo: Carnegie Mellon University)

The team evaluated its new RCE method in Safety Gym, a set of virtual environments and tools for measuring progress toward reinforcement learning agents that respect safety constraints during training. The results showed that their approach allowed the robot to learn its tasks with far fewer constraint violations than state-of-the-art baselines. Moreover, they achieved several orders of magnitude better sample efficiency than model-free RL approaches.

CASRL: Robots that can learn to adapt to current conditions

To better understand how robots can safely navigate typical warehouse environments where people and other robots move freely (what the researchers call non-stationary disturbances), the group developed safe and context-aware reinforcement learning (CASRL), a meta-learning framework in which robots learn to safely adapt to non-stationary disturbances as they occur (2021 IEEE International Conference on Robotics and Automation, “Safe and context-aware reinforcement learning for non-stationary environments”).

In addition to workers or other robots moving through a warehouse, the CASRL method would also allow robots to learn to navigate safely through other situations, including inaccurate sensor measurements, broken robot parts, or obstructions such as trash and other debris in the environment. The team is also applying CASRL to tool manipulation and human interaction, which can be applied directly to assembly in manufacturing.

“Non-stationary disturbances are everywhere in real-world applications, offering endless variations of scenarios. An intelligent robot should be able to generalize to unseen cases rather than just memorizing examples provided by humans. This is one of the ultimate challenges of trustworthy AI,” said Zuxin Liu, a third-year Ph.D. student at CMU’s Safe AI Lab, supported by the MFI award.

Zhao explains that the robot must learn to determine whether previously learned planning policies still suit the current situation. The robot updates its policy based on recent local observations through online training, so it can readily adapt to new situations with unseen disturbances while ensuring safety with high probability. Given sensor data from the last seconds or minutes, the robot can automatically infer and model potential disturbances and update its planning policy accordingly.
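That adaptation loop can be caricatured as: infer the disturbance from recent prediction errors, then compensate for it in the controller. The sketch below is purely illustrative (a constant drift, hypothetical helper names), not the CASRL implementation:

```python
import numpy as np

# Illustrative caricature of context-aware adaptation (invented names and
# dynamics, not CASRL itself): infer an unknown drift from recent prediction
# errors, then cancel it when choosing the next control.

def estimate_disturbance(history):
    """Average gap between predicted and observed positions (the drift)."""
    errors = [obs - pred for pred, obs in history]
    return float(np.mean(errors)) if errors else 0.0

def adapted_control(pos, goal, drift_hat):
    """Step toward the goal, compensating for the inferred drift."""
    step = float(np.clip(goal - pos, -0.5, 0.5))
    return step - drift_hat

true_drift, pos, goal = 0.3, 0.0, 5.0   # the environment secretly pushes +0.3
history = []
for _ in range(30):
    drift_hat = estimate_disturbance(history[-5:])   # recent observations only
    u = adapted_control(pos, goal, drift_hat)
    predicted = pos + u                  # what the drift-free model expects
    pos = pos + u + true_drift           # what the real environment does
    history.append((predicted, pos))
print(round(pos, 2))  # → 5.0: the robot reaches the goal despite the drift
```

Only the last few observations feed the estimate, mirroring the article's point that inference runs online over recent sensor data rather than over a fixed training set.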

Zhao’s team is further extending the method to task-agnostic online learning, which uses online reinforcement learning to continuously solve unseen tasks: the robot can not only adapt to unseen but similar tasks, but also identify and learn to solve entirely distinct ones.

In each of the above studies, the new models and methods improved on previous approaches to training robots to move safely and efficiently in new and changing environments. Such incremental successes are essential steps toward the verifiable level of reliability required for better warehouse robots.

The team will continue working on deployment for manufacturing logistics and assembly handling. Zhao will also work on a new MFI-funded project on generating safety- and security-critical digital twins and metaverses, which will be an essential tool for developing trustworthy intelligent manufacturing robots safely and efficiently.

“The future of next generation manufacturing is now,” Zhao said.
