Nvidia (a client of the author) has lately been doing a lot of fascinating things, from creating workstations designed to build the metaverse, to digital assistants that are evolving into human digital twins, to tools that could let anyone create compelling art. One of the more interesting tools is StyleGAN, a generative network that creates people’s faces by blending pictures.
The training set for this artificial-intelligence-based offering contains 70,000 high-quality PNG images (each at a resolution of 1024×1024 pixels) that allow a user almost unlimited flexibility of source material.
StyleGAN has been around since 2018, became more widely available in 2019 when Nvidia released the source code as open source, and is now in its third iteration; StyleGAN3 was released last October.
The advantages for those of us who work with images include the eventual ability to craft new images from large pools of protected source material without risking copyright infringement. And because the process is essentially an image-blending engine, as it evolves to handle other kinds of images it could let you blend professional photographs from a variety of sources into uniquely beautiful images, or into paintings drawn from memory or imagination with little or no connection to anything real.
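To make the "blending" idea concrete: StyleGAN-style generators don't mix pixels directly; they mix points in a learned latent space, and the network maps the mixed point to a new face. Here is a minimal, hypothetical NumPy sketch of that interpolation step (the names `blend_latents`, `z_a`, and `z_b` are illustrative, and the generator network itself is omitted):

```python
import numpy as np

def blend_latents(z_a, z_b, alpha):
    """Linearly interpolate between two latent codes.

    alpha=0.0 returns z_a unchanged; alpha=1.0 returns z_b;
    values in between produce a mix of the two source faces.
    """
    return (1.0 - alpha) * z_a + alpha * z_b

# Two hypothetical 512-dimensional latent codes, one per source face.
rng = np.random.default_rng(0)
z_a = rng.standard_normal(512)
z_b = rng.standard_normal(512)

# A code that is 30% face A and 70% face B; a trained generator
# (not shown) would map z_mix to a single photorealistic face.
z_mix = blend_latents(z_a, z_b, 0.7)
```

The resulting face corresponds to no one in the training set, which is what makes the copyright and likeness arguments discussed below possible.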
An AI-driven image-blending tool like StyleGAN could dramatically change and improve a number of industries and practices (or be used for more nefarious deepfakes). Let’s explore.
I watch a lot of crime procedurals on TV; there’s usually a segment where someone sits in front of a sketch artist to create an image of a suspect they observed. That entire process could be automated by a conversational AI. The witness could be shown an evolving picture with examples of features that are blended on command until the picture matches the witness’s memory. The end result would be a photorealistic image that facial recognition programs could use to locate the suspect quickly. (The collateral damage: law enforcement would no longer need sketch artists.)
One area where this technology might have a big impact is in locating kidnapped children. The AI could rapidly age the image of the child so they might be better identified later in life.
A lot of marketing material uses stock images or models in production. The problem with the former is that the same images can appear in other campaigns, inadvertently connecting disparate ones. For instance, if the same image is used in a medication ad and a restaurant ad, customers might associate the two and avoid the restaurant. The same problem can arise with a live model who later appears in another campaign, since actors and models often work for competitors. And live models and actors can have personal problems that damage a brand or ad campaign.
But using blended images and videos from something like StyleGAN means you can create an image that your firm can copyright, that is distinct from any stock image, and that is not connected to any actor or model, living or dead. The result is lower cost and, more importantly, lower risk. You get results faster, and the need for models and actors would be reduced. You might only use actors in 3D-imaging suits that obscure their identities — and with advances in metaverse tools and 3D imagers, you might not even need them. It also takes us a big step closer to not needing actors for movies.
Another area Nvidia is exploring involves the creation of digital twins for the metaverse. As the AI behind these twins improves, they’d become increasingly indistinguishable from the source material. When that happens, who owns the result? You can make an argument that an employee should own their digital twin. But if a tool like StyleGAN is used to blend both images and an employee’s skills, that position becomes more tenuous; a company might be able to defend its ownership of the result. (I expect future employees and unions could have significant problems with something like this being used to displace employees without compensation.)
The ability to blend source material that may (or may not) be protected at scale is compelling — especially if it eliminates potential legal issues. Nvidia’s process uses a vetted source of images that eliminates legal exposure, but tools like this don’t have to rely only on stock photo databases; they could be used on images of public figures taken from social media posts, movies, or other advertising material.
At some point, I expect this technology will force a rewrite of copyright laws dealing with composite images. At the same time, it would reduce the effort and cost that go into creating photorealistic movies and images for business and entertainment. It’s an early example of major changes coming to current business practices and related income for those engaged as models, actors, or directors, and for artists tasked with creating images that depict remembered events.
Tools like StyleGAN will redefine the future of virtual media for business, government, and entertainment.