Driving real value with synthetic data

Ask someone if they’re familiar with generative adversarial networks, and they might say no. But ask that same person if they’ve heard of deepfakes, and they’ll probably say yes. Deepfakes – a combination of “deep learning” and “fake” – get a lot of attention, and it’s easy to see why. With technology that’s already available, almost anyone can generate convincing audio or video of someone saying or doing things they never said or did. In a well-known example, a scammer used an audio deepfake to con a U.K. energy firm out of €220,000.

Deepfakes, though, are just one application of generative adversarial networks (GANs). And focusing only on the negative applications of GANs means businesses are leaving real value on the table.

Generative network technology is about creating realistic synthetic data. In the case of a deepfake, the goal is to create audio or video that can fool human viewers. For businesses, though, synthetic data can be used to generate value: in product development, better training for AI systems, artistic enhancement, and even consumer privacy.

Researchers at Labs are working in these spaces and more. They’ve used synthetic data to speed up the process of testing new product formulations, while also exploring more possible formulations than before. They’re looking at how synthetic data can be used to better train computer vision systems; this will drive improved retail experiences for customers and companies alike. And they’re working to help businesses strike the right balance while using synthetic data for privacy purposes. An automated privacy assessment tool will give companies a way to evaluate different anonymization strategies for data, including synthetic datasets, and choose the one that’s best for a given task.

Of course, even as companies use synthetic data to generate value, bad actors will continue their efforts. Our Labs researchers have also been active in deepfake detection, applying an ensemble of AI models to analyze content. Identifying malicious deepfake content will be key as companies look to drive value with synthetic data while guarding against misuse.

It’s shortsighted, though, to focus only on the potential for negative impact, and businesses that do so are missing out. There’s a lot to gain from creative, thoughtful use of these innovations. Synthetic data offers real value in everything from product development to healthcare to the entertainment industry. How will you capture it?