The FTC Issues a Warning on Algorithmic Bias
This week the FTC published an explicit rebuke of algorithmic bias and the creation or use of AI systems that produce biased outcomes. The piece, titled “Aiming for truth, fairness, and equity in your company’s use of AI,” reiterates some of the finer points of the business guidance on the use of AI that the agency provided last April. It ends with the warning “Hold yourself accountable – or be ready for the FTC to do it for you.”
That sounds a bit like a threat. And while it’s easy to think the FTC may be targeting a handful of Big Tech companies (it mentions at least two complaints against Facebook), the piece appears to address the broader business sector. It specifically cites recent action taken against a company called Everalbum, which enabled facial recognition technology on user-uploaded photos without consent.
The FTC’s piece otherwise includes what should be common-sense guidelines around building and deploying AI, although its “I can’t believe I have to tell you this” tone makes it clear these have yet to be widely put into practice. The instructions include: start with the right foundation; watch out for discriminatory outcomes; embrace transparency and independence; don’t exaggerate what your algorithm can do or whether it can deliver fair or unbiased results; tell the truth about how you use data; and do more good than harm.
It’s all very sound business advice that can be reduced to “do the right thing.” And while each individual guideline is vital, they’re also all dependent on the very first: Start with the right foundation. The FTC is acknowledging how critical it is to start with diverse and balanced data when you first set out to build a deep learning model. They say explicitly, “If a data set is missing information from particular populations, using that data to build an AI model may yield results that are unfair or inequitable to legally protected groups. From the start, think about ways to improve your data set, design your model to account for data gaps, and – in light of any shortcomings – limit where or how you use the model.”
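To make that a little more concrete, here is a minimal sketch of what a first-pass check for data gaps might look like. It is an illustration, not the FTC’s method or anyone’s production tooling: the function name `representation_gaps`, the `min_share` threshold of 10%, and the group labels are all hypothetical, and it assumes you already have a group annotation for each training example.

```python
from collections import Counter


def representation_gaps(labels, expected_groups, min_share=0.10):
    """Return {group: share} for every expected group whose share of the
    dataset falls below min_share, including groups that are absent.

    labels          -- one (hypothetical) group annotation per example
    expected_groups -- the groups you believe the model must serve
    min_share       -- illustrative threshold, not a legal or statistical standard
    """
    counts = Counter(labels)
    total = len(labels)
    gaps = {}
    for group in expected_groups:
        # Counter returns 0 for groups that never appear in the data.
        share = counts[group] / total if total else 0.0
        if share < min_share:
            gaps[group] = share
    return gaps


if __name__ == "__main__":
    # Toy annotations: group "d" is missing from the data entirely.
    sample = ["a"] * 90 + ["b"] * 8 + ["c"] * 2
    print(representation_gaps(sample, ["a", "b", "c", "d"]))
```

Passing the expected groups in explicitly is the point of the design: a population that is missing from the data altogether never shows up in a simple count, so you have to name it to notice that it’s gone. A share below the threshold is only a prompt to look closer, not proof of a biased model.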
This is something we at Zumo Labs think about daily. You can’t engineer yourself out of a biased model without the dataset to do it. It’s part of the reason we’re so focused on synthetic data. While not a silver bullet, synthetic data is a proven way to address gaps in your training data. It’s also a way to train models to recognize people without training those models on questionably sourced images of real people. In doing so, synthetic data removes the need to store (or ultimately compromise) anyone’s personal data for the sake of training AI.
But like any other tool, synthetic data works best when it’s used by thoughtful craftspeople. Ask yourself not only how the AI system you’re building will impact you and your customers, but how it might impact folks from every walk of life. What might be invisible to you in the early stages of model development? What might the downstream unintended effects be? It’s exciting to solve an interesting problem with technology, but that’s no longer enough.
We applaud the FTC for calling upon individuals and teams building AI systems to think more deeply and critically about their process and its consequences. If we can help you do that, please reach out.