Synthetic Data Starters: Shelf Simulation
We recently published zpy, a versatile toolkit for rapidly developing synthetic data. It can save teams tons of time on image collection and labeling. However, synthetic data can be intimidating to start experimenting with due to the multiple skills required for development. This is where starter simulations come in: to get you started with your machine learning needs while teaching you how to build your own sims.
One of the rapidly emerging use cases for synthetic data is object detection. That type of synthetic training data is easily created inside of Blender using the zpy add-on. After downloading the Simulation Package, and making sure you’ve installed Blender and zpy (as both an add-on and a library), you can open it up and hit the “Run” button in the Execute Panel. That's all you need to do to get annotated, randomized images that are ready to use as training data. But this file is just a template, meaning you can customize it to your specific use case. Want to teach a computer vision algorithm to recognize a Coke can instead of a Red Bull can? Swap the objects in the simulation. We’ve made it our goal not to sacrifice flexibility for ease of use while supporting your efforts to create custom synthetic data.
Customizing the Simulation Using the User Interface
When you open the Simulation, you’ll notice that Blender has different work areas that provide different tools. While we’ve written the script that controls your Simulation using Python, there are several additional things you can tweak, change, and control using just the User Interface—without knowing a lick of code. First, let’s look at how to change the objects included in the simulation to customize the content to your use case.
You can import any object using File → Import and choosing a file.
When the file loads, find the imported object in the window in the top left corner and drag and drop it into the Group labeled ‘CONTENTS.’
If you run your simulation now using the zpy addon’s run button, you will see that the spawning system has included the object of your choice in the simulation.
You can use this same principle to add any number of objects to the CONTENTS group, or delete them from it, to tell the simulation which objects you want to run with.
While customizing the contents of your simulation is a cool feature, we also like to expose tons of easy, one-click customization to our users.
If you go into the modifiers tab while having your Spawner selected, you’ll see several things you can play with, including swapping your CONTENTS group for another group of objects.
Customizing the Simulation Using Python
We’ve written zpy on top of Blender because of its incredible flexibility and its use of a Python API. Python is the language of choice for machine learning, which means it’s easy for developers in this field to use our library of commands to create the data they need to train on, instead of contracting it out to another synthetic data company or a traditional annotation firm. The anatomy of a zpy script (as we write them; feel free to come up with your own style!) includes keyword arguments (kwargs) at the top.
These arguments are reused within your script to customize values and value ranges, and even to drive randomization. For instance, you can change the shelf_length_range parameter and see that it is later used to decide how many shelves can spawn.
So even within the more complicated API ecosystem, you can quickly and easily customize your script and iterate rapidly, making your testing cycle even faster.
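As a rough sketch of that pattern, the plain-Python function below takes its tunable values as keyword arguments and reuses one of them further down. Only shelf_length_range comes from the text above; the function name run, the seed argument, and the returned value are hypothetical stand-ins, and a real zpy script would spawn shelves in Blender rather than return a number.

```python
import random


def run(shelf_length_range=(2, 5), seed=0):
    """Hypothetical run function: tunable values live in the kwargs."""
    # Seeded generator so the same kwargs reproduce the same result.
    rng = random.Random(seed)

    # The kwarg is reused here to bound how many shelves may spawn.
    low, high = shelf_length_range
    num_shelves = rng.randint(low, high)  # inclusive on both ends
    return num_shelves


# Changing one kwarg changes the range of possible outcomes.
num_shelves = run(shelf_length_range=(1, 3))
```

Keeping every knob in the signature means a change to the range never requires touching the body of the script.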
As you progress through your zpy script, it’s crucial to notice segmentation and other annotation data that can be rendered out with your images. This is the essential and revolutionary part of synthetic data—you will get perfect annotations every time. Check out our documentation here to get more information on specific calls that are available. The TLDR: we create an ImageSaver object, add a category, add that category to an object or group of objects, and then later, when we’re rendering the image, we can apply that category data to the object we designated.
It’s a simple concept but is integral to getting the best out of synthetic images.
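The flow described above can be sketched in plain Python. The SketchSaver class here is a hypothetical stand-in written for illustration, not the zpy ImageSaver API, and the object, category, and image names are made up; check the zpy documentation for the real calls.

```python
class SketchSaver:
    """Hypothetical stand-in for an image saver; not the zpy API."""

    def __init__(self):
        self.categories = {}   # category name -> integer id
        self.annotations = []  # one record per labeled object per image

    def add_category(self, name):
        # Register the category once and hand back its id.
        self.categories.setdefault(name, len(self.categories))
        return self.categories[name]

    def add_annotation(self, image_name, object_name, category):
        # Called at render time: attach the category to the object.
        self.annotations.append({
            "image": image_name,
            "object": object_name,
            "category_id": self.categories[category],
        })


saver = SketchSaver()                        # 1. create the saver
saver.add_category("can")                    # 2. add a category
object_categories = {"red_bull_can": "can"}  # 3. tag an object with it

# 4. When an image is rendered, apply the category data to the object.
for object_name, category in object_categories.items():
    saver.add_annotation("image_0000.png", object_name, category)
```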
Continuing on in our overview of a zpy script, the workhorse will be a for loop that allows you to create as many synthetic data images as you want.
Inside of this for loop, every value that needs to be randomized will be, meaning each loop of the script will capture a different simulation within your parameters. You’ve decided that you want your object to rotate x degrees for each image? Put that inside the for loop.
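A minimal sketch of such a loop, in plain Python, might look like the following. The names plan_rotations, num_images, and max_degrees are hypothetical; in an actual run script the sampled angle would be assigned to a Blender object (its rotation_euler property takes radians) and an image would be rendered at each step.

```python
import math
import random


def plan_rotations(num_images=4, max_degrees=30.0, seed=0):
    """Sample one random rotation per image, returned in radians."""
    rng = random.Random(seed)
    rotations = []
    for step in range(num_images):
        # Everything that should vary between images is sampled
        # inside the loop, so each step gets a fresh value.
        degrees = rng.uniform(-max_degrees, max_degrees)
        rotations.append(math.radians(degrees))
        # In Blender, this is where the object would be rotated
        # and the image for this step rendered.
    return rotations


rotations = plan_rotations()
```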
Blender API Debugging Tips
Now that we’ve spent a little bit of time going through what makes a zpy run script tick, we have to address another area of knowledge that facilitates a zpy simulation. The Blender API is a fantastic tool to manipulate data through your run script: you can get a handle to an object, move objects between groups, and randomize materials. Anything you can do in the UI in Blender, you can automate using the API. You can see examples of these in the run script, usually starting with ‘bpy.’ This, of course, means that you don’t just need to learn zpy, but also Blender’s API, which takes some effort. Two handy tools can help you practice your Blender API skills. The first is the Python console shipped with Blender, which lets you try out commands and see if they work.
The other is the Info Window: when you perform an action in the UI, it shows you the Python command that was run.
Additionally, if you enable Python tooltips in the Preferences menu, you can mouse over buttons and fields and see the Python property or operator behind them right in the tooltip.
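As something to try in the console, here is a small sketch of moving an object between groups. It assumes the groups are Blender collections (as in Blender 2.8 and later) and relies on the users_collection attribute and the link and unlink methods; the helper name move_to_collection and the object name "MyCan" are hypothetical.

```python
def move_to_collection(obj, target):
    """Unlink obj from every collection holding it, then link to target."""
    # Copy to a list first, because unlinking changes users_collection.
    for collection in list(obj.users_collection):
        collection.objects.unlink(obj)
    target.objects.link(obj)


# Inside Blender's Python console, where bpy is available:
#   import bpy
#   can = bpy.data.objects["MyCan"]              # get a handle to an object
#   contents = bpy.data.collections["CONTENTS"]  # get a handle to the group
#   move_to_collection(can, contents)
```

The helper only touches attributes of the objects passed in, so it can be pasted into the console as-is and called with handles from bpy.data.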
This public sim comes with a package of textures, models, and distractors you can play with. Using these building blocks, it’s simple to create a robust simulation for your own use case. And if it can’t cover your use case? Build your own! Zpy tools do all the heavy lifting of designing a sim and leave you free to build better, more equitable, and less expensive synthetic data.
Feel free to reach out to us directly with questions or feedback.