Author: Will Douglas Heaven
Translation: Shen Chao TechFlow
Shen Chao Introduction: Niantic has turned the 30 billion city photos taken by Pokémon Go players into a new business. Its AI subsidiary Niantic Spatial has trained a visual positioning system with this data, achieving centimeter-level accuracy, far surpassing GPS performance in urban canyons. The first major client is the delivery robot company Coco Robotics. From catching Pikachu to delivering pizza, this may be one of the most unexpected commercialization paths for crowd-sourced data.
The full text is as follows:
Pokémon Go was the world's first blockbuster AR game. Released in 2016 by Niantic, a Google spinout, the game overlaid augmented-reality gameplay on the Pokémon IP and quickly swept the globe. From Chicago to Oslo to Enoshima, players took to the streets hoping to catch a Jigglypuff, a Squirtle, or (if lucky) a super-rare Galarian Zapdos, creatures that hover just above the real world, tantalizingly out of reach.
In simple terms, this means millions of people were pointing their phones at a great many buildings. "500 million people downloaded this app in 60 days," said Brian McClendon, CTO of Niantic Spatial, the AI company spun off from Niantic last May. According to gaming company Scopely, which acquired Pokémon Go from Niantic at the same time, the game still had over 100 million active players in 2024, eight years after its release.
Now, Niantic Spatial is leveraging this unparalleled trove of crowd-sourced data (city landmark photos from hundreds of millions of Pokémon Go players' phones, accompanied by super-precise location tags) to build a world model. World models are a hot trend in technology today, aiming to anchor the intelligence of LLMs in real-world environments.
The company's latest product is a model that, given just a few snapshots of buildings or landmarks, can pinpoint your location on a map to within a few centimeters. The company wants to use it to help robots navigate more precisely in areas where GPS is unreliable.
As a large-scale validation of the technology, Niantic Spatial has just partnered with Coco Robotics. Coco is a startup deploying last-mile delivery robots in several cities across the U.S. and Europe. "Everyone believes AR is the future, and AR glasses are coming," McClendon said, "but it turns out robots became the first users."
From Pikachu to Pizza Delivery
Coco Robotics has deployed about 1,000 suitcase-sized robots in Los Angeles, Chicago, Jersey City, Miami, and Helsinki, capable of carrying up to 8 large pizzas or 4 bags of groceries. According to CEO Zach Rash, these robots have completed over 500,000 deliveries, traveling millions of miles under various weather conditions.
But to compete with human riders, Coco's robots (which travel on sidewalks at about 5 mph) must be reliable enough. "Our best way of operating is to arrive on time at the moment we told you," Rash said. This means they cannot get lost.
The challenge for Coco is that its robots cannot rely on GPS. In urban areas, radio signals bounce off buildings and interfere with one another, degrading GPS accuracy. "We do deliveries in many dense areas with skyscrapers, underground passages, and overpasses, where GPS basically never works well," Rash said.
"Urban canyons are the worst places in the world for GPS performance," McClendon said. "You see that blue dot on your phone, it often drifts 50 meters, putting you in another block, another direction, or across the road." This is the problem Niantic Spatial seeks to solve.
Over the past few years, Niantic Spatial has been organizing data generated by players of Pokémon Go and Ingress (Niantic's earlier mobile AR game, released in 2013) to build a visual positioning system, which determines your location from what you see. "Having Pikachu realistically run around on the street and having Coco's robots safely navigate through the city is essentially the same problem," said John Hanke, CEO of Niantic Spatial.
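To put the 50-meter drift McClendon describes in concrete terms, here is a minimal Python sketch that shifts a position 50 meters due north and measures the resulting error with the haversine formula. The coordinates are a hypothetical point in downtown Chicago chosen purely for illustration, the function names are invented for this example, and the spherical-Earth radius is an approximation. None of this comes from Niantic Spatial or Coco.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def offset_north(lat, lon, meters):
    """Return the point `meters` due north of (lat, lon)."""
    return lat + math.degrees(meters / EARTH_RADIUS_M), lon


# Hypothetical "true" position and a GPS fix that has drifted 50 m north.
true_fix = (41.8781, -87.6298)
gps_fix = offset_north(*true_fix, 50.0)

drift_m = haversine_m(*true_fix, *gps_fix)
lat_error_deg = gps_fix[0] - true_fix[0]

print(f"drift: {drift_m:.2f} m, latitude error: {lat_error_deg:.6f} degrees")
```

A 50-meter error is well under a thousandth of a degree of latitude, yet at street scale it can be enough to place the dot on the wrong side of a road, which is why a system that localizes to within centimeters matters for sidewalk robots.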
"Visual positioning is not a new technology," said Konrad Wenzel of the digital mapping and geospatial analysis company ESRI, "but it’s clear that the more outdoor cameras there are, the better it works."
Niantic Spatial trained its model using 30 billion images captured in urban environments. These images are particularly dense around "hot spots": important locations that Niantic's games encourage players to visit, such as Pokémon battle gyms. "We have over 1 million locations worldwide where we can accurately pinpoint your position," McClendon said. "We know where you are standing, with an accuracy of a few centimeters. More importantly, we know which direction you are facing."
As a result, for each of these 1 million locations, Niantic Spatial has thousands of photos taken from almost the same position, but at different angles, times, and weather conditions. Each photo comes with detailed metadata: the precise location of the phone in space, orientation, posture, whether it was moving, speed, and direction, among others.
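The kind of per-photo metadata described above could be represented roughly as follows. This is only an illustrative sketch: the record layout, every field name, and the `group_by_hotspot` helper are assumptions made for this example, since the article does not describe Niantic Spatial's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PhotoRecord:
    """One crowd-sourced photo with its capture metadata (hypothetical layout)."""
    hotspot_id: str     # hypothetical identifier for the location the photo covers
    lat: float          # latitude of the phone, in degrees
    lon: float          # longitude of the phone, in degrees
    heading_deg: float  # compass direction the camera was facing
    pitch_deg: float    # tilt of the phone above or below the horizon
    is_moving: bool     # whether the phone was moving at capture time
    speed_mps: float    # speed in meters per second
    course_deg: float   # direction of travel
    hour: int           # local hour of day, 0 to 23
    weather: str        # coarse weather label, e.g. "clear" or "rain"


def group_by_hotspot(records):
    """Collect records per hot spot so each location's views can be compared."""
    groups = defaultdict(list)
    for record in records:
        groups[record.hotspot_id].append(record)
    return dict(groups)


# Two made-up photos of the same hot spot under different conditions.
records = [
    PhotoRecord("gym-001", 59.9139, 10.7522, 90.0, 5.0, False, 0.0, 0.0, 9, "clear"),
    PhotoRecord("gym-001", 59.9139, 10.7523, 135.0, 2.0, True, 1.2, 130.0, 21, "rain"),
    PhotoRecord("gym-002", 35.2996, 139.4803, 270.0, 0.0, False, 0.0, 0.0, 14, "cloudy"),
]

groups = group_by_hotspot(records)
conditions = {(r.hour, r.weather) for r in groups["gym-001"]}
print(len(groups["gym-001"]), "photos of gym-001 under", len(conditions), "conditions")
```

Grouping by location makes the article's point visible in miniature: the same spot is covered by many photos that differ in angle, time of day, and weather.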
The company uses this dataset to train the model to predict a camera's position from what it sees. The model works even in areas outside those 1 million hot spots, where image and location data are relatively sparse.
Besides GPS, Coco's robots (equipped with 4 cameras) now also use this model to determine where they are and where they need to go. The robots' cameras are mounted at hip height and face in all directions; the perspective is somewhat different from Pokémon Go players, but Rash said the data adaptation is not complex.
Competitors are also using visual positioning systems. For example, Starship Technologies, a robot delivery company founded in Estonia in 2014, claims its robots use sensors to build 3D maps of their surroundings, marking building edges and lamp post locations.
But Rash bets that Niantic Spatial's technology will give Coco an edge. He believes this will enable the robots to accurately stop at the correct pickup location outside the restaurant, not blocking anyone's way, and halt at the customer's doorstep rather than just a few steps away — a situation that has occurred in the past.
The Cambrian Explosion of Robots
When Niantic Spatial began developing the visual positioning system, the goal was to use it for augmented reality, Hanke said. "If you're wearing AR glasses, and you want the virtual world anchored in the direction you're looking, you need a way to achieve that. But now we are witnessing the Cambrian explosion in the field of robots."
Some robots need to share space with humans in places such as construction sites and sidewalks. "If robots are to blend into these environments without disturbing humans, they must possess spatial understanding abilities similar to humans," Hanke said. "When robots are pushed or bumped, we can help them accurately find out where they are."
The partnership with Coco Robotics is just the beginning. Hanke said Niantic Spatial is building the first components of what he calls a "Living Map": an ultra-high-precision virtual simulation of the world that changes as the real world changes. As robots from Coco and other companies navigate the world, they will provide new sources of mapping data, making this digital replica ever more detailed.
In Hanke's and McClendon's view, maps are not only becoming more precise but are also increasingly utilized by machines. This changes the purpose of maps. For a long time, maps have helped humans locate themselves. From 2D to 3D to 4D (think of real-time simulations like digital twins), the basic principle remains unchanged: points on the map correspond to points in space or time.
But maps designed for machines may need to become more like travel guides, filled with information that humans take for granted. Companies like Niantic Spatial and ESRI want to add descriptions to maps, telling machines what they are actually seeing, with each object labeled with a series of attributes. "The challenge of this era is to build a useful world description for machines," Hanke said. "The data we have is a very good starting point for understanding how the connections and organizations of the world operate."
World models are very popular now, and Niantic Spatial is well aware of this. LLMs seem to understand everything, but they lack common sense when interpreting and interacting with everyday environments. World models aim to address this issue. Some companies, like Google DeepMind and World Labs, are developing models that can instantly generate virtual fantastical worlds to use as training grounds for AI agents.
Niantic Spatial says it is approaching this problem from a different angle. Push mapping to its extreme and you will eventually capture everything, McClendon said: "We are not there yet, but we aim to get there. I am currently very focused on trying to reconstruct the real world."