Robots can now navigate without maps, but navigating efficiently remains a challenge. Backtracking wastes time, and unexpected obstacles can lead to falls. Facebook has developed a promising solution: a distributed reinforcement learning algorithm whose agent reaches its destination 99.9% of the time while deviating only 3% from the ideal path.
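To make those two figures concrete, here is a minimal sketch of how a success rate and a path deviation could be computed from a batch of navigation episodes. Facebook's exact evaluation formula is not given here, so the function name, the episode format, and the sample numbers below are all hypothetical and purely illustrative.

```python
def navigation_metrics(episodes):
    """Compute success rate and mean path deviation.

    episodes: list of (reached_goal, actual_path_length, shortest_path_length).
    This tuple layout is an assumption made for illustration.
    """
    successes = [e for e in episodes if e[0]]
    success_rate = len(successes) / len(episodes)

    # Extra distance travelled relative to the shortest path,
    # averaged over successful episodes only.
    deviations = [(actual - ideal) / ideal for _, actual, ideal in successes]
    mean_deviation = sum(deviations) / len(deviations) if deviations else float("nan")
    return success_rate, mean_deviation


# Made-up episodes, not real results.
episodes = [
    (True, 10.3, 10.0),
    (True, 20.6, 20.0),
    (True, 5.15, 5.0),
    (False, 12.0, 8.0),
]
success_rate, mean_deviation = navigation_metrics(episodes)
print(f"success rate: {success_rate:.1%}, mean deviation: {mean_deviation:.1%}")
```

In this toy data, each successful run travels 3% farther than the shortest route, which is the kind of gap the reported figure describes.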
Named DD-PPO (Decentralized Distributed Proximal Policy Optimization), the algorithm needs only a standard RGB camera with depth sensing, GPS, and a compass to work. The key innovation is a new training method that scales efficiently while keeping its workers synchronized, even under varying workloads. Previous attempts struggled with high computational demands.
Facebook's virtual agent learned point-to-point navigation in simulation, accumulating approximately 2.5 billion steps, the equivalent of about 80 years of human experience. This extensive training lets the agent choose the correct path in indoor environments and quickly recognize when it has strayed from the intended route. According to Facebook, the system is learning to grasp the "structural regularities" of buildings.
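The two training figures line up if each simulated step is counted as roughly one second of experience. That conversion rate is an assumption made here for illustration, not something stated by Facebook, and the helper name is hypothetical. A quick check:

```python
# Seconds in an average year, counting leap years (about 31.6 million).
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def steps_to_years(steps, seconds_per_step=1.0):
    """Convert simulation steps to years of experience.

    seconds_per_step=1.0 is an assumed conversion rate, not a published one.
    """
    return steps * seconds_per_step / SECONDS_PER_YEAR


years = steps_to_years(2.5e9)
print(f"{years:.1f} years")  # prints "79.2 years"
```

Under that assumption, 2.5 billion steps comes to a little over 79 years, consistent with the rounded figure of 80.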
The technology is still in its infancy and struggles with outdoor navigation and complex scenarios, particularly when sensors become unavailable. However, Facebook is committed to sharing its advancements, which could pave the way for robots that navigate seamlessly through a variety of spaces. The work may also improve augmented reality systems that guide users through unfamiliar environments.