Saying NVIDIA is a graphics card company is really selling the company short, and nothing showcases that more than the last few years of advances in Artificial Intelligence. Beyond the supercomputing, self-driving car, intelligent machine, cloud, and data center applications, they’ve also given consumers AI-driven image upscaling in DLSS, noise reduction for our microphones, and intelligent background removal/replacement for our cameras; they power our media centers with their Shield products, and yes…they also make some powerful video cards. Well, today NVIDIA is unveiling a bit of a side project that could have a profound impact on the entertainment industry, and likely beyond. Let’s meet GANverse3D. Ok, maybe we need to work on that name…
GANs are Generative Adversarial Networks: a machine learning framework in which two neural networks “compete” with each other. One network generates new data modeled on a training image set, while the other tries to tell the generated results apart from the real thing. Through this competitive approach, the networks iterate on their knowledge of an object. Here, the result is somewhat unexpected: translating 2D images into 3D objects. More than that, the 3D version also inherits the properties of the object in question. Let me try to boil down the expansive research paper into the most lay terms I can.
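For the curious, that adversarial tug-of-war can be sketched in miniature. The toy below is purely illustrative and nothing like GANverse3D’s actual networks: a two-parameter “generator” learns to mimic samples from a hypothetical one-dimensional dataset (a Gaussian), while a logistic “discriminator” tries to tell its fakes from the real samples. All the names and hyperparameters here are made up for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# "Real" training data: a Gaussian the generator never sees directly.
REAL_MEAN, REAL_STD = 4.0, 1.0

# Generator g(z) = a*z + b, a stand-in for a real generator network.
a, b = 1.0, 0.0
# Discriminator d(x) = sigmoid(w*x + c), a stand-in for a real critic.
w, c = 0.1, 0.0

LR_D, LR_G, BATCH = 0.05, 0.02, 64

for step in range(4000):
    # Discriminator turn: push d(real) toward 1 and d(fake) toward 0
    # (gradient ascent on log d(real) + log(1 - d(fake))).
    xr = rng.normal(REAL_MEAN, REAL_STD, BATCH)
    xf = a * rng.normal(0.0, 1.0, BATCH) + b
    dr, df = sigmoid(w * xr + c), sigmoid(w * xf + c)
    w += LR_D * np.mean((1 - dr) * xr - df * xf)
    c += LR_D * np.mean((1 - dr) - df)

    # Generator turn: push d(fake) toward 1, i.e. try to fool the critic
    # (gradient ascent on log d(fake)).
    z = rng.normal(0.0, 1.0, BATCH)
    xf = a * z + b
    df = sigmoid(w * xf + c)
    a += LR_G * np.mean((1 - df) * w * z)
    b += LR_G * np.mean((1 - df) * w)

# After training, the generator's output distribution should have drifted
# toward the real data's, even though it only ever saw the critic's verdicts.
fakes = a * rng.normal(0.0, 1.0, 1000) + b
print(f"generated mean ~ {fakes.mean():.2f} (target {REAL_MEAN})")
```

Notice the generator never touches the real data; it improves solely by reading how the discriminator scored its output, which is the core GAN idea scaled down to two numbers.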
What would happen if we fed 55,000 different images of cars into a neural network, taught the algorithm to detect the object and delete the background, and then turned it loose to interpret all of the angles of the object and create a 3D version of it? Well, computers are pretty terrible at that, or at least they were. This GANverse3D research was able to deliver not only a clean 3D model, but one that doesn’t require an animator to cobble it together. Why does that matter to you? Faster turnaround to create worlds, for sure, but the more important piece comes from what those 3D versions inherit.
By using realistic images of vehicles, the GANverse3D system is able to understand not only the look of an object, but also that object’s obvious properties. This means the AI can understand the object’s component parts (the colored pieces in the video below) and some of their properties. It can do all of this without a human having to draw, render, rig, animate, or otherwise manipulate the object. It means that you could be rendering a New York street for Spider-Man to swing through, and you could let the AI simply generate a whole busy street full of realistic cars without having to hand-create them. Fed a single image of the car from Knight Rider, the AI was able to deliver this video:
This research paper is a hint of what’s to come, with realistic objects like horses and birds also being tested. In their current state, the results are somewhat primitive, untextured models, but in the future these could become part of an AI-powered world: developers feed the GAN a handful of 2D objects they’d like to bring to life, and then let the AI deliver a mostly-finished 3D object.
It may be in its infancy, but it’s very easy to imagine how this could be used in gaming and movies, with background objects becoming easier to “fill in” without an army of animators. Just like with DLSS before it, it’ll be interesting to see how developers of all stripes put this new tech to use as NVIDIA iterates on it.