Kory Stiger

We're Building AutoML for Computer Vision Data

Computer vision is hard. The modern computer vision engineer needs to know how to frame a real-world problem as a computer vision problem, choose a model architecture, train at scale, tune hyperparameters, run experiments, and, sometimes, deploy onto IoT hardware. With the advent of synthetic data, another skill gets added to that list—3D data engineering. With computer-generated training data, you no longer need to rely on clever hacks or manual data annotation to train your models properly. Instead, you can design your dataset to fit your problem.


Much as automated machine learning (AutoML) revolutionized model development, “AutoML for data” (or maybe even AutoData) is poised to revolutionize dataset development. But what exactly does “AutoML for data” mean?


Let’s look at the list of tasks involved in machine learning today, per automl.org:

  1. Collect the data.

  2. Preprocess and clean the data.

  3. Select and construct appropriate features.

  4. Select an appropriate model family.

  5. Optimize model hyperparameters.

  6. Postprocess machine learning models.

  7. Critically analyze the results obtained.

For the most part, numbers 3 through 6 can be abstracted away with various established products on the market, to the point that a machine learning engineer only needs to set up their AutoML tool of choice and focus on #7.


#7 is challenging to do automatically and will likely require a human brain for a while longer (until artificial general intelligence [AGI] becomes a thing and ushers in a utopia guided by our benevolent robot overlords). So, let’s focus on numbers 1 and 2, how they’re changing, and where AutoML for data fits in.


Currently, data pipelines look something like this:

  • Buy, scrape, or otherwise collect real-world images into a dataset.

  • Manually inspect the dataset to throw out bad data and label good data according to specifications.

  • Optionally: run various augmentations over the images to enhance the dataset (filters, cropping, flipping, etc.).
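
To make that optional augmentation step concrete, here is a minimal sketch in plain Python. It treats an image as a list of rows of grayscale pixel values; a real pipeline would use an image library, and every function name here is made up for illustration.

```python
import random

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def crop(image, top, left, height, width):
    """Return the height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]

def adjust_brightness(image, factor):
    """A simple 'filter': scale every pixel by factor, clamped to 0-255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]

def augment(image, rng):
    """Apply a random flip, a random 3/4-size crop, and a random brightness change."""
    if rng.random() < 0.5:
        image = flip_horizontal(image)
    height, width = len(image), len(image[0])
    crop_h, crop_w = max(1, height * 3 // 4), max(1, width * 3 // 4)
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    image = crop(image, top, left, crop_h, crop_w)
    return adjust_brightness(image, rng.uniform(0.8, 1.2))
```

Note what augmentations like these cannot do: they reshuffle pixels you already captured, so they can't change the capture angle or the camera.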

When this is done, you run through the aforementioned steps 3 through 6 until you have, hopefully, achieved a performant model for your use case. But what happens when you get to #7 and realize that the specs have changed and the capture angle, camera intrinsics, or other metadata are no longer appropriate? Time to re-collect and/or relabel, which is time-consuming and expensive. But it's also unnecessary if you’re using synthetic data.


The synthetic data flow is similar to the above, except that instead of waiting for real-life data collection and labeling, you (or Zumo Labs, on your behalf) create a 3D simulation that models your real-world application and adjust that sim over time. This is where the dream of AutoML for data comes in: unlocking novel use cases and rapid experiment iteration in a way that has never been possible before!


So, what’s the secret sauce? It’s simple. We create runnable simulations using a flexible framework for controlling data generation with parameters that can hook into the next part of the existing AutoML flow. Let’s look at an example. Say you need to do detection in a room with many windows and lights that may be on or off. You would need to create a simulation that takes several properties at runtime, perhaps like this:


{
    # True is on, False is off
    "room_lights": True,

    # 0 is no ambient light, 1 is the most ambient light
    # (set at an arbitrarily bright point)
    "ambient_light_brightness": 0.5,

    # Color temperature in Kelvin, 1000 to 10000
    "ambient_light_warmth": 1500
}

Then you would generate a distribution of data that makes sense for your experiment and analyze how model loss correlates with the dataset parameters. That tells you which sim parameters matter most for your model's performance!
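
Here is one way that loop might look, as a rough sketch rather than our actual tooling. It assumes the three lighting parameters from the example, draws each one uniformly from its documented range, and ranks parameters by a plain Pearson correlation with loss. The function names are hypothetical, and the loss values have to come from your own training and evaluation runs.

```python
import math
import random

def sample_sim_params(rng):
    """Draw one parameter set for the lighting sim, uniformly over each range."""
    return {
        "room_lights": rng.random() < 0.5,
        "ambient_light_brightness": rng.uniform(0.0, 1.0),
        "ambient_light_warmth": rng.uniform(1000, 10000),
    }

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_parameters(records):
    """Rank sim parameters by the absolute value of their correlation with loss.

    records is a list of (params, loss) pairs, where loss is the model's
    measured loss on data generated with those params.
    """
    losses = [loss for _, loss in records]
    names = records[0][0].keys()
    scores = {
        name: pearson([float(params[name]) for params, _ in records], losses)
        for name in names
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```

In use, you would call `sample_sim_params` once per generated batch, record the loss your model reaches on each batch, and pass the pairs to `rank_parameters`. Pearson correlation only captures linear relationships, so treat the ranking as a first look rather than a verdict.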


And we can get even more specific: what if the arbitrary brightest point at 1 isn’t bright enough, or you find out that the room lights have dimmers? No problem. A few simple tweaks to the sim will produce new data according to the evolving specifications. And while a future where the simulation self-improves independent of human input is still a ways out, it's easy to imagine the dataset improving itself (through domain randomization, augmentations, etc.) in the very near future.
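
As an illustration of how small such a tweak can be on the parameter side, here is a sketch that migrates the on/off schema to a dimmer-aware one. The new keys `room_light_level` and `ambient_light_scale` are invented for this example and are not part of any real sim.

```python
def upgrade_params(old):
    """Convert the on/off lighting schema to a hypothetical dimmer-aware one."""
    new = {key: value for key, value in old.items() if key != "room_lights"}
    # The boolean switch becomes a dimmer level: 0.0 is off, 1.0 is fully on.
    new["room_light_level"] = 1.0 if old["room_lights"] else 0.0
    # Multiplier on the old brightest point; 1.0 keeps the original ceiling,
    # and larger values make the top of the brightness scale brighter.
    new.setdefault("ambient_light_scale", 1.0)
    return new

def validate_params(params):
    """Raise ValueError if any value falls outside its documented range."""
    if not 0.0 <= params["room_light_level"] <= 1.0:
        raise ValueError("room_light_level must be between 0 and 1")
    if not 0.0 <= params["ambient_light_brightness"] <= 1.0:
        raise ValueError("ambient_light_brightness must be between 0 and 1")
    if params["ambient_light_scale"] <= 0:
        raise ValueError("ambient_light_scale must be positive")
    if not 1000 <= params["ambient_light_warmth"] <= 10000:
        raise ValueError("ambient_light_warmth must be between 1000 and 10000")
```

Old parameter sets keep working after the upgrade, so earlier experiments stay comparable while new runs can explore half-dimmed rooms or a brighter ceiling.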


Savvy readers may recognize that creating custom domain-specific simulations is no small task. That’s where zpy comes in. Our open-source computer vision library and add-on for Blender enables our internal sim creators to feel like heroes when creating new 3D simulations. You can import existing 3D assets or build your sims entirely from scratch. Give it a try. It’s cool, it works with our scalable backend, and we’re making it better every day.


If you’d like to get deeper into the weeds on synthetic data or just have questions you think we can help with, feel free to reach out and schedule some time.


