The researchers claim that their approach could be used to train AI to carry out other tasks. To begin with, it could be used to for bots that use a keyboard and mouse to navigate websites, book flights, or buy groceries online. But in theory it could be used to train robots to carry out physical, real-world tasks by copying first-person video of people doing those things. “It’s plausible,” says Stone.
Matthew Guzdial at the University of Alberta in Canada, who has used videos to teach AI the rules of games like Super Mario Bros., does not think it will happen any time soon, however. Actions in games like Minecraft and Super Mario Bros. are performed by pressing buttons. Actions in the physical world are far more complicated and harder for a machine to learn. “It unlocks a whole mess of new research problems,” says Guzdial.
“This work is another testament to the power of scaling up models and training on massive data sets to get good performance,” says Natasha Jaques, who works on multi-agent reinforcement learning at Google and the University of California, Berkeley.
Large internet-sized data sets will certainly unlock new capabilities for AI, says Jaques: “We’ve seen that over and over again, and it’s a great approach.” But OpenAI places a lot of faith in the power of large data sets alone, she says: “Personally, I’m a little more skeptical that data can solve any problem.”
Still, Baker and his colleagues think that collecting more than a million hours of Minecraft videos will make their AI even better. It’s probably the best Minecraft-playing bot yet, says Baker: “But with more data and bigger models, I would expect it to feel like you’re watching a human playing the game, as opposed to a baby AI trying to mimic a human.”