What do we do?

Image by Lucas George Wendt

Data Generation.

Data Labeling.


Synthetic Data is Data 2.0

If you are training computer vision models to solve complex problems, you need synthetic data. 

Synthetic data is the future of all machine learning training data.


Unlike manually harvested and human-labeled data, our fabrication process produces pixel-perfect labels and annotations. The sheer volume of the data sets we're able to create gives our synthetic data the potential to cover edge cases and reduce bias in your model. And unlike real data that has been scraped and stripped of personal identifiers, synthetic data contains no real people to re-identify, guaranteeing privacy and compliance with the CCPA and GDPR.

How is Zumo Labs synthetic data different from my current process?

Our Training Data as a Service (TDaaS) model pays for itself when you consider the efficiencies we can introduce to your training data pipeline. To name a few:

  • Reduce Bias: Create more representative data sets, generate more edge cases, and exponentially increase image volume in order to reduce bias in your model.

  • Improve Model Performance and Accelerate Model Development: Object recognition, keypoint detection, and occlusion handling are all solvable problems when you train on CGI-rendered data sets.

  • Be Free of Privacy Concerns: No PII to strip means re-identification is never a concern with synthetic data, guaranteeing privacy and compliance with laws like the CCPA and GDPR.

Solving difficult computer vision problems is what we do. 

How does it work?

Tell us what you need. We then create a simulation or a scene, like this one of some people doing push-ups:

We then generate your data from the scene (images, animations, videos, you name it). Because the data is synthetic, we have full state knowledge of the scene, which means we can give you:

  • Full keypoint skeleton

  • Class segmentation

  • Instance segmentation

  • ...and more
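
To make "full state knowledge" concrete, here is a minimal sketch of how labels can fall out of a rendered scene with no human annotation. It is an illustration only, not Zumo Labs' actual pipeline: the instance map, the `INSTANCE_CLASS` and `CLASS_IDS` tables, and both function names are hypothetical. The assumption is that the renderer can record, for every pixel, which object instance produced it.

```python
from typing import Dict, List, Tuple

# Hypothetical renderer output: each pixel stores the ID of the object
# instance that produced it (0 means background).
INSTANCE_MAP: List[List[int]] = [
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
]

# Hypothetical scene metadata: the class of each instance, and class IDs.
INSTANCE_CLASS: Dict[int, str] = {1: "person", 2: "person"}
CLASS_IDS: Dict[str, int] = {"background": 0, "person": 1}


def class_segmentation(
    instance_map: List[List[int]],
    instance_class: Dict[int, str],
    class_ids: Dict[str, int],
) -> List[List[int]]:
    """Collapse per-pixel instance IDs into per-pixel class IDs."""
    return [
        [class_ids[instance_class.get(pixel, "background")] for pixel in row]
        for row in instance_map
    ]


def bounding_boxes(
    instance_map: List[List[int]],
) -> Dict[int, Tuple[int, int, int, int]]:
    """Return {instance_id: (x_min, y_min, x_max, y_max)} in pixel coordinates."""
    boxes: Dict[int, Tuple[int, int, int, int]] = {}
    for y, row in enumerate(instance_map):
        for x, inst in enumerate(row):
            if inst == 0:
                continue  # skip background pixels
            if inst not in boxes:
                boxes[inst] = (x, y, x, y)
            else:
                x0, y0, x1, y1 = boxes[inst]
                boxes[inst] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return boxes


class_mask = class_segmentation(INSTANCE_MAP, INSTANCE_CLASS, CLASS_IDS)
boxes = bounding_boxes(INSTANCE_MAP)
```

In this toy scene the two people share one class in `class_mask` but stay distinct in the instance map, and each gets a tight bounding box in `boxes`. Because every label is computed from the scene itself, it is exact by construction.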


