Teaching Machines ‘Fairness’
Teaching anyone about “fairness” is a laudable goal.
Source: Junko Yoshida
As humans, we may not necessarily agree on what’s fair. It sometimes depends on the context. Teaching kids to be fair — both at home and in school — is fundamental, but it’s easier said than done. With this in mind, how can we, as a society, communicate the nuances of “being fair” to artificial intelligence (AI) systems?
A team of researchers at IBM Research is taking the first crack at this conundrum. IBM is rolling out a tool kit for developers called “AI Fairness 360.” As part of this effort, IBM is offering businesses a new “cloud-based, bias-detection, and mitigation service” that corporations can use to test and verify how AI-driven systems are behaving.
In a phone interview with EE Times, Saska Mojsilovic, a Fellow at IBM Research, told us that scientists and AI practitioners have been far too focused on the accuracy of AI. Typically, the first question that people ask about AI is, “Can machines beat humans?”
But what about fairness? The fairness void in AI has the potential to induce catastrophic consequences in, for example, health care or autonomous vehicles, she said.
What if a dataset used to train a machine is biased? If AI can’t explain how it came to a decision, how could we verify its “rightness?” Can AI reveal if data has been somehow manipulated during AI processing? Could AI assure us that its data has never been attacked or compromised, including during pre- and post-processing?
In short, is there any such thing as introspective AI? The simple answer: No.
Without being transparent to AI users, developers, and practitioners, AI systems cannot gain trust from society, said Mojsilovic.
A bigger question is how to teach the machine what fairness is. Mojsilovic noted, “Because we are scientists, the first thing we did was to decompose ‘fairness.’ We needed to get our hands around it.” They broke down fairness in terms of metrics, algorithms, and bias practiced in AI implementation.
Kush Varshney, Research Scientist, IBM, explained that the team looked at bias and fairness in AI algorithms and AI decision-making. “There is fairness to individuals and there is fairness to groups. We looked at different attributes of groups, ranging from gender to race. Legal and regulatory issues are also considered.” In the end, the team measured 30 different metrics to look for bias in datasets, AI models, and algorithms.
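To make the idea of a group-level metric concrete, here is a minimal Python sketch of two widely used measures, statistical parity difference and disparate impact, computed on a made-up dataset. The function names, group labels, and toy data are assumptions for illustration only. This is not code from IBM's toolkit, and the article does not list which metrics the team settled on.

```python
from typing import Hashable, Sequence


def favorable_rate(outcomes: Sequence[int], groups: Sequence[Hashable], group: Hashable) -> float:
    """Share of members of `group` who received the favorable outcome (1)."""
    selected = [y for y, g in zip(outcomes, groups) if g == group]
    if not selected:
        raise ValueError(f"no records for group {group!r}")
    return sum(selected) / len(selected)


def statistical_parity_difference(outcomes, groups, unprivileged, privileged) -> float:
    """Unprivileged favorable rate minus privileged favorable rate.

    0.0 means parity; a negative value means the unprivileged group
    receives the favorable outcome less often.
    """
    return favorable_rate(outcomes, groups, unprivileged) - favorable_rate(
        outcomes, groups, privileged
    )


def disparate_impact(outcomes, groups, unprivileged, privileged) -> float:
    """Ratio of the two favorable rates; 1.0 means parity."""
    privileged_rate = favorable_rate(outcomes, groups, privileged)
    if privileged_rate == 0:
        raise ValueError("privileged group never receives the favorable outcome")
    return favorable_rate(outcomes, groups, unprivileged) / privileged_rate


# Hypothetical toy data: 1 = favorable decision (say, a loan approved), 0 = unfavorable.
outcomes = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
groups = ["a"] * 5 + ["b"] * 5  # assumed labels: "a" privileged, "b" unprivileged

spd = statistical_parity_difference(outcomes, groups, unprivileged="b", privileged="a")
di = disparate_impact(outcomes, groups, unprivileged="b", privileged="a")
print(f"statistical parity difference: {spd:.2f}")
print(f"disparate impact: {di:.2f}")
```

On this toy data, group "b" gets the favorable outcome 20% of the time against 80% for group "a", which gives a difference of -0.6 and a ratio of 0.25. Both are group-level measures; fairness to individuals, the other notion Varshney mentions, needs a different kind of test.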
These findings are incorporated into the AI Fairness 360 toolbox that IBM launched this week. The company described it as “a comprehensive open-source toolkit of metrics to check for unwanted bias in datasets and machine-learning models.”
Although many scientists are already working to spot discrimination in AI algorithms, Mojsilovic said that IBM’s approach differs by including algorithms not just to find bias but also to remove it.
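What might debiasing look like in practice? One well-known pre-processing technique is reweighing, which gives each training record a weight so that group membership and outcome become statistically independent in the weighted data. The Python sketch below illustrates that general idea on made-up data. The function names and the dataset are hypothetical, and the article does not say which debiasing algorithms IBM's toolkit uses, so this should not be read as IBM's method.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def reweighing_weights(outcomes: Sequence[int], groups: Sequence[Hashable]) -> List[float]:
    """Weight each record by P(group) * P(outcome) / P(group, outcome).

    Combinations that are over-represented relative to independence get
    weights below 1; under-represented combinations get weights above 1.
    """
    n = len(outcomes)
    group_counts = Counter(groups)
    outcome_counts = Counter(outcomes)
    joint_counts = Counter(zip(groups, outcomes))
    return [
        group_counts[g] * outcome_counts[y] / (n * joint_counts[(g, y)])
        for g, y in zip(groups, outcomes)
    ]


def weighted_favorable_rate(outcomes, groups, weights, group) -> float:
    """Weighted share of `group` that received the favorable outcome (1)."""
    favorable = sum(w for y, g, w in zip(outcomes, groups, weights) if g == group and y == 1)
    total = sum(w for g, w in zip(groups, weights) if g == group)
    return favorable / total


# Hypothetical toy data: group "a" is favored 80% of the time, group "b" 20%.
outcomes = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
groups = ["a"] * 5 + ["b"] * 5

weights = reweighing_weights(outcomes, groups)
rate_a = weighted_favorable_rate(outcomes, groups, weights, "a")
rate_b = weighted_favorable_rate(outcomes, groups, weights, "b")
print(f"weighted favorable rate, group a: {rate_a:.2f}")
print(f"weighted favorable rate, group b: {rate_b:.2f}")
```

After reweighing, both groups show a weighted favorable rate of 0.5, the overall rate in the data. A model trained with these weights sees a dataset in which the outcome no longer correlates with group membership, though that alone does not guarantee a fair model.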
On a basic level, you’ve got to ask: Computer scientists defining fairness? This is a task normally assigned to social scientists. Aware of this incongruity, IBM made it clear that neither Mojsilovic nor Varshney is working in a vacuum. They brought in a host of scholars and institutes. Varshney participated in the Uehiro-Carnegie-Oxford Ethics Conference sponsored by the Carnegie Council for Ethics in International Affairs. Mojsilovic participated in an AI Workshop in Berkeley, California, sponsored by the UC Berkeley Law School.
Is an algorithm neutral?
Social scientists have been pointing out the issue of AI bias for some time.
Young Mie Kim, professor, School of Journalism and Mass Communication at University of Wisconsin–Madison, explained, “AI discrimination (or AI bias) can happen when it implicitly or explicitly reinforces existing unequal social orders and biases (e.g., gender, race, age, social/economic status, etc.).” Examples range from sampling errors (e.g., under-representation of certain demographics due to inappropriate sampling methods or difficulties in sampling) to human biases in machine training (modeling). Kim argued that AI bias exists even with “strategic decisions” in design or modeling, such as political advertising algorithms.
In her recent study entitled “Algorithmic Opportunity: Digital Advertising and Inequality of Political Involvement,” Kim demonstrated how inequality can be reinforced in algorithm-based decisions.
The technical community might argue that “an algorithm is neutral” or can be “educated” (trained). Kim noted, “That does not acknowledge that biases enter at any stage in algorithm development.”
It’s not just consumers who are afraid of AI. Many computer scientists are also voicing concerns.
Aws Albarghouthi, assistant professor of computer sciences at the University of Wisconsin, told EE Times, “In the short term, I am concerned about the increasing use of data-driven decision-making, which has the potential to propagate bias and injustice. In the long term, I am concerned with the use of AI in automating warfare.”
AI’s inability to explain its decision-making is a persistent anxiety. Loris D’Antoni, assistant professor of computer sciences at the University of Wisconsin, told us, “Right now, programmers are starting to treat powerful machine-learning algorithms as tools in their programming toolbox. However, these algorithms are complicated, hard to predict, and hard to explain. For example, no one knows what the resulting machine-learned models exactly do.” In his opinion, “In general, automated decision-making needs to be understood and regulated, and possibly in a formal way.”
For each step of the process, from training data and processing to testing and deployment, IBM noted that its toolkit provides explanations that show which factors weighted a decision in one direction versus another, confidence in the recommendation, and the factors supporting that confidence.
IBM hopes that the AI Fairness 360 toolkit will be widely used so that developers can contribute to it, helping to build trust in AI.
In parallel, IBM will offer businesses its new software service, designed to automatically detect bias and explain how AI makes decisions — as decisions are being made. It runs on the IBM Cloud and helps corporations manage AI systems.
According to IBM, its fairness monitoring services on the IBM Cloud will work with models from a wide variety of machine-learning frameworks and AI-build environments such as Watson, TensorFlow, SparkML, AWS SageMaker, and AzureML. “This means that organizations can take advantage of these new controls for most of the popular AI frameworks used by enterprises,” claimed IBM. IBM is also promising customization. “The software service can also be programmed to monitor the unique decision factors of any business workflow, enabling it to be customized to the specific organizational use.”
The AI research community, increasingly aware of algorithmic fairness issues in AI, is welcoming IBM’s announcement. University of Wisconsin computer scientist D’Antoni told us, “Seeing fairness-checking techniques making it to industry and to practice is really exciting.” He added, “I think that the tool will make the community more aware of the importance of this problem.”
Albarghouthi called IBM’s effort “a good start.”
But for the AI Fairness 360 toolkit to become truly effective, it needs to be disseminated to many developers, who will need to understand it. Albarghouthi explained that the tool needs to “co-evolve with the state-of-the-art in fairness as explored by the research community.”
He cautioned that the tool may very well stagnate “if research and understanding move beyond the current definitions and techniques.”
In the end, what could make or break any AI fairness toolkit comes back to the thorny issue of how to define fairness. IBM’s Mojsilovic acknowledged that fairness manifests itself differently in different applications. Just as humans see fairness differently, users, clients, and the general public may judge fairness differently depending on the situation.
Asked about the pitfalls of an AI fairness toolkit, Albarghouthi said that one is “in the choice of the fairness definition and whether it is adequate.” After all, fairness “highly depends on the domain, its sensitivity, and the regulation involved.” He added, “I do believe that these tools will be effective only if their use is made a standard part of the machine-learning pipeline.”
D’Antoni expressed his own concern. “There are many definitions of fairness and bias, and it is hard to take them all into account and enforce them all at once.” In other words, “fairness definitions are often ‘conflicting’,” he said. “There are impossibility results showing that certain notions of fairness cannot be enforced at the same time.”
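A small made-up example shows how two fairness definitions can pull against each other. In the Python sketch below, two groups have different underlying rates of positive cases. A classifier that is perfectly accurate treats both groups identically in terms of error rates, yet selects them at different rates. A classifier forced to select both groups at the same rate ends up with unequal error rates. The data, names, and the two classifiers are hypothetical, and this is only an illustration of the tension, not one of the formal impossibility results D’Antoni refers to.

```python
from typing import Dict, Sequence


def rates(labels: Sequence[int], preds: Sequence[int]) -> Dict[str, float]:
    """Selection rate, true-positive rate, and false-positive rate for one group."""
    positives = sum(labels)
    negatives = len(labels) - positives
    true_pos = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    false_pos = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return {
        "selection": sum(preds) / len(preds),
        "tpr": true_pos / positives,
        "fpr": false_pos / negatives,
    }


# Hypothetical ground truth: 75% of group "a" are true positives, 25% of group "b".
labels = {"a": [1, 1, 1, 0], "b": [1, 0, 0, 0]}

# Classifier 1: perfectly accurate, so its predictions equal the labels.
perfect = {g: list(y) for g, y in labels.items()}

# Classifier 2: forced to select exactly half of each group (equal selection rates).
parity = {"a": [1, 1, 0, 0], "b": [1, 1, 0, 0]}

perfect_rates = {g: rates(labels[g], perfect[g]) for g in labels}
parity_rates = {g: rates(labels[g], parity[g]) for g in labels}

for name, result in (("perfect", perfect_rates), ("parity", parity_rates)):
    for group, r in result.items():
        print(
            f"{name:8s} group {group}: selection={r['selection']:.2f} "
            f"tpr={r['tpr']:.2f} fpr={r['fpr']:.2f}"
        )
```

The perfect classifier has identical true-positive and false-positive rates for both groups but selects 75% of group "a" and 25% of group "b". The parity classifier selects 50% of each group, but it misses a third of group "a"'s true positives and wrongly selects a third of group "b"'s negatives. Whenever base rates differ between groups, some trade-off of this kind has to be made.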
Also, there is the fact that “the machine-learning community is currently not trained to understand what the existing fairness and bias definitions mean and how to use them,” he added.
Social scientist Kim agreed. “An awareness of such [AI bias] problems is the very first, but most important, step to fight against AI discrimination.” She noted, “In a human society, we come up with some policies and regulations to mitigate inequality and discrimination. But the problem is that AI remains a secret. As in most cases, data and modeling are proprietary. This makes any public policy or regulatory discussion/debate more difficult.”
Understanding the tradeoffs and fluidity in defining fairness, IBM researchers believe that the priority should be the transparency of AI practices and implementation.
IBM’s Mojsilovic proposes that a Supplier’s Declaration of Conformity (which she calls a fact sheet) be completed and voluntarily released by AI service developers and providers “to increase the transparency of their services and engender trust in them.” She likened it to “nutrition labels for foods,” or “information sheets for appliances.”
The industry needs a standard for information about AI systems deployed in different services. IBM is keenly aware that this won’t happen overnight. Just as it took a long time for nutrition labels to evolve, the development of fact sheets could be a long process. Mojsilovic cautioned that the industry has only begun its AI journey.
Like IBM, the research community is also contending with AI fairness. In this unsettled realm, IBM’s AI Fairness 360 toolkit appears to be groundbreaking. D’Antoni told us, “I’m not aware of existing general-purpose fairness-checking tools for AI models.”
On the other hand, he added, “There are many exciting prototype tools proposed by the research community. For example, FairTest by Tramèr et al. and FairSquare from our own group.”