by Donald Martin, Jr. and Andrew Moore
October 28, 2020
Artificial Intelligence (AI) has become one of the biggest drivers of technological change, impacting industries and creating entirely new opportunities. From an engineering standpoint, AI is just a more advanced form of data engineering. Most good AI projects function more like muddy pickup trucks than spotless race cars: they are workhorse technologies that humbly make a production line 5% safer or movie recommendations a little more on point. However, more so than with many other technologies, it is very easy for a well-intentioned AI practitioner to inadvertently do harm when setting out to do good. AI has the power to amplify unfair biases, making innate biases far more harmful.
As Google AI practitioners, we understand that how AI technology is developed and used will have a significant impact on society for many years to come. As such, it’s crucial to formulate best practices. This starts with the responsible development of the technology and mitigating any potential unfair bias which may exist, both of which require technologists to look more than one step ahead: not “Will this delivery automation save 15% on the delivery cost?” but “How will this change affect the cities where we operate and the people — at-risk populations in particular — who live there?”
This has to be done the old-fashioned way: by human data scientists understanding the process that generates the variables that end up in datasets and models. What’s more, that understanding can only be achieved in partnership with the people represented by and impacted by these variables — community members and stakeholders, such as experts who understand the complex systems that AI will ultimately interact with.
Faulty causal assumptions can lead to unfair bias.
How do we actually implement this goal of building fairness into these new technologies — especially when they often work in ways we might not expect? As a first step, computer scientists need to do more to understand the contexts in which their technologies are being developed and deployed.
Despite our advances in measuring and detecting unfair bias, causation mistakes can still lead to harmful outcomes for marginalized communities. What’s a causation mistake? Take, for example, the observation during the Middle Ages that sick people attracted fewer lice, which led to an assumption that lice were good for you. In actual fact, lice don’t like living on people with fevers. Causation mistakes like this, where a correlation is wrongly thought to signal a cause and effect, can be extremely harmful in high-stakes domains such as health care and criminal justice. AI system developers — who usually do not have social science backgrounds — typically do not understand the underlying societal systems and structures that generate the problems their systems are intended to solve. This lack of understanding can lead to designs based on oversimplified, incorrect causal assumptions that exclude critical societal factors and can lead to unintended and harmful outcomes.
For instance, the researchers who discovered that a medical algorithm widely used in U.S. health care was racially biased against Black patients identified the root cause as a mistaken causal assumption made by the algorithm designers: that people with more complex health needs will have spent more money on health care. This assumption ignores critical factors, such as lack of trust in the health care system and lack of access to affordable health care, that tend to decrease spending on health care by Black patients regardless of the complexity of their health care needs.
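The mechanism described above can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not a reconstruction of the algorithm the researchers studied: the groups "A" and "B", the `access_b` factor, and every number below are hypothetical assumptions chosen only to show how ranking patients by spending can diverge from ranking them by need.

```python
import random


def simulate_proxy_selection(n=10_000, access_b=0.6, top_fraction=0.2, seed=0):
    """Compare selecting patients by true need vs. by a spending proxy.

    All values are synthetic. Both groups draw health need from the same
    distribution; group B is assumed (hypothetically) to spend less per
    unit of need because of reduced access to care.
    """
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        group = "A" if rng.random() < 0.5 else "B"
        need = rng.gauss(50, 15)  # identical need distribution for both groups
        access = 1.0 if group == "A" else access_b
        spending = need * access + rng.gauss(0, 5)  # proxy the model would see
        patients.append((group, need, spending))

    k = int(n * top_fraction)

    def share_of_group_b(column):
        # Take the top-k patients by the given column and measure group B's share.
        top = sorted(patients, key=lambda p: p[column], reverse=True)[:k]
        return sum(1 for p in top if p[0] == "B") / k

    return {
        "by_need": share_of_group_b(1),      # selection on the true target
        "by_spending": share_of_group_b(2),  # selection on the proxy
    }


if __name__ == "__main__":
    print(simulate_proxy_selection())
```

Under these assumptions, group B makes up about half of the patients selected by need but only a small fraction of those selected by spending, even though the two groups are equally sick by construction. Nothing in the spending data alone reveals the gap; it only becomes visible once someone knows to ask why spending differs.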
Researchers make this kind of causation/correlation mistake all the time. But things are worse for a deep learning system, which searches billions of possible correlations in order to find the most accurate way to predict data, and thus has billions of opportunities to make causal mistakes. Complicating the issue further, it is very hard, even with modern tools such as Shapley analysis, to understand why such a mistake was made. A human data scientist sitting in a lab with their supercomputer can never deduce from the data itself what the causation mistakes may be. This is why, among scientists, it is never acceptable to claim to have found a causal relationship in nature just by passively looking at data. You must formulate the hypothesis and then conduct an experiment in order to tease out the causation.
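To see why searching many correlations multiplies the opportunities for error, consider a minimal sketch in which every variable is pure random noise. The function names, sample size, and feature count are illustrative assumptions, far smaller than anything a real deep learning system would search; no variable here has any causal link to the target.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_spurious_correlation(n_samples=30, n_features=2000, seed=0):
    """Search independent random features for the one that best 'predicts'
    an independent random target, and return its absolute correlation."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, target)))
    return best


if __name__ == "__main__":
    print(f"strongest correlation found in noise: {strongest_spurious_correlation():.2f}")
```

With only 30 samples and 2,000 candidate features, the best match typically shows a correlation that would look meaningful in isolation, despite being an artifact of the search. The data alone cannot distinguish it from a real effect, which is why a hypothesis and an experiment are needed.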
Addressing these causal mistakes requires taking a step back. Computer scientists need to do more to understand and account for the underlying societal contexts in which these technologies are developed and deployed.
Here at Google, we have started to lay the foundations for what this approach might look like. In a recent paper co-written by DeepMind, Google AI, and our Trust & Safety team, we argue that considering these societal contexts requires embracing the fact that they are dynamic, complex, non-linear, adaptive systems governed by hard-to-see feedback mechanisms. We all participate in these systems, but no individual person or algorithm can see them in their entirety or fully understand them. So, to account for these inevitable blind spots and innovate responsibly, technologists must collaborate with stakeholders (representatives from sociology, behavioral science, and the humanities, as well as from vulnerable communities) to form a shared hypothesis of how these systems work. This process should happen at the earliest stages of product development, even before product design starts, and be done in full partnership with the communities most vulnerable to algorithmic bias.
This participatory approach to understanding complex social systems — called community-based system dynamics (CBSD) — requires building new networks to bring these stakeholders into the process. CBSD is grounded in systems thinking and incorporates rigorous qualitative and quantitative methods for collaboratively describing and understanding complex problem domains, and we’ve identified it as a promising practice in our research. Building the capacity to partner with communities in fair and ethical ways that provide benefits to all participants needs to be a top priority. It won’t be easy. But the societal insights gained from a deep understanding of the problems that matter most to the most vulnerable in society can lead to technological innovations that are safer and more beneficial for everyone.
Shifting from a mindset of “building because we can” to “building what we should.”
When communities are underrepresented in the product development and design process, they are underserved by the products that result. Right now, we’re designing what the future of AI will look like. Will it be inclusive and equitable? Or will it reflect the most unfair and unjust elements of our society? The more just option isn’t a foregone conclusion; we have to work towards it. Our vision for the technology is one where a full range of perspectives, experiences and structural inequities are accounted for. We work to seek out and include these perspectives in a range of ways, including human rights diligence processes, research sprints, and direct input from vulnerable communities and from organizations focused on inclusion, diversity, and equity, such as WiML (Women in ML) and Latinx in AI. Many of these organizations, such as Black in AI and Queer in AI, are also co-founded and co-led by researchers at Google.
If we, as a field, want this technology to live up to our ideals, then we need to change how we think about what we’re building, shifting our mindset from “building because we can” to “building what we should.” This means fundamentally shifting our focus to understanding deep problems and working to ethically partner and collaborate with marginalized communities. This will give us a more reliable view of both the data that fuels our algorithms and the problems we seek to solve. This deeper understanding could allow organizations in every sector to unlock new possibilities of what they have to offer while being inclusive, equitable and socially beneficial.
Donald Martin, Jr. is Sr. Staff Technical Program Manager and Social Impact Technology Strategist at Google.
Andrew Moore is Head of Google Cloud AI & Industry Solutions.