“Algorithmic Accountability”, risks and opportunities of automation. Interview with Joshua A. Kroll

Broadband4Europe. The role of algorithms in human decision-making keeps growing. What threats does this trend pose, and what are the main benefits of the wider use of algorithms?

Joshua A. Kroll. Fundamentally, we are making the same decisions and performing the same tasks as before, but advances in technology enable a greater range of these tasks to be situated in technology rather than performed by humans. As more human tasks become automated, there is great benefit: machines can do things more quickly and more consistently than people, enabling high-quality output at lower cost, greater speed, and wider scale. Also, automation frees up humans for other tasks, giving them more time and augmenting their capacity. For example, improvements in computer vision have enabled automatic sorting of recyclable materials, increasing the rate of material recovery while decreasing the cost of mixed-stream recycling, thereby making it easier for ordinary people to keep their waste out of landfills. Autopilots relieve pilots of the need to make continuous corrections to heading and altitude, leaving them time to consider the flight plan and how to find the safest, most comfortable, and most direct route.

There are two major risks from automation. The first is that the oversight, governance, and control mechanisms we’ve developed for our decision-making processes over time might not adapt to this new speed and scale or might not be prepared to make determinations about software-driven decision-making. For example, determining what content should or shouldn’t be published in a newspaper advertisement is the work of trained editors, and sometimes it’s very challenging. But in an Internet-sized advertising network or social media platform, there are too many ads and too many contexts for those ads and posts for a human to review them all. So we see why that system can be abused for political gain, fraud, or simply the marketing of low-quality products. New governance approaches are needed to handle these new problems, but they’re not as well established or tested as those that work for existing technologies.

The second risk is more subtle: as automation takes over human tasks and performs more work, fewer humans are responsible for more output from the same systems. As a result, the remaining humans are both less aware of what’s happening and less experienced with how to operate the system. An easy example of this is that people follow the GPS on their phones and don’t understand how to read maps as well as they once did. Sometimes, a driver instructed to “turn left” will do so immediately and end up in a field, rather than turning at an upcoming turnout or intersection. In more elaborate cases, automation confuses even trained operators like airline pilots or ship driving teams.

These risks can overlap, as well. For example, if a car driving under an autopilot hands control back to a human suddenly and the car crashes a few seconds later, should we blame the human operator for failing to prevent the accident or the car’s automation for not understanding the road conditions? Should legal liability for the accident fall on the human, the software vendor, or the company that built the car? How do we set standards for the behavior of these new ensemble human-machine systems? All of these are difficult questions.

Broadband4Europe. What needs to be done to achieve wider accountability in algorithmic decision-making processes?

Joshua A. Kroll. It’s important to view these new technologies in the context of the systems they are part of. That is, we’re looking for accountability in those systems, not just of the algorithms themselves. For example, if doctors use a software tool to help them diagnose cancers in imaging tests (X-rays, MRIs, etc.), the doctor remains responsible for the diagnosis and must know the tool well enough and the case well enough to make a good decision. Of course, that’s easier if the software is more accurate or can be audited, but accountability is about who is responsible.

In too many systems, accountability is obscured: corporations and governments make decisions by committee or don’t keep records that explain why or even how decisions were taken. When systems are complex, many factors can lead to problems and accidents. The desire to give a “root cause” for a failure can be misleading. For example, in 1996, the first flight of the ESA’s new launch vehicle, the Ariane 5, ended in an explosion. The direct costs were over €500 million, and it set back a multi-billion euro program by about ten years. Although the rocket veered off course due to a software error, the inquiry board did not blame the programmers. Why? Because other design decisions had led the programmers to be given requirements that led to a risky final design. Nobody foresaw that risk, and few resources were given to testing or simulating the new system in ways that might have discovered it.

To make automated systems accountable, we must look to the ways that existing human systems are accountable and attempt to (1) adapt them to include new machine-driven or software-driven components; and (2) enhance their capacity to understand and reason about those components. That means improving and educating existing oversight structures, but also creating new methods. For example, unlike human decisions or bureaucratic decisions, software-made decisions can be recorded and replicated exactly. So we must develop a new discipline of software system accounting, tracing, and auditing. Some of that must be done by the entities that create and operate software systems, but some of it could be done by opening parts of those systems to researchers, journalists, or the public. There’s a lot of opportunity.

Broadband4Europe. Who should be considered responsible for an algorithm’s mistakes when they occur? Should there always be a responsible party, someone in charge of the correct functioning of algorithms?

Joshua A. Kroll. Ultimately, yes, someone must be responsible. And when decisions are made by humans, we generally hold those humans responsible. That gets harder when humans work in groups or organizations. If a committee votes on a decision and that decision leads to an accident where people die, should we treat the committee members who voted against the decision as responsible? This isn’t a new problem, but it’s a difficult problem. We need to organize the process for making decisions so that someone or some organization is always responsible.

One risk here is that, if someone is always responsible, then people won’t want to do new things or take risks. But it’s possible to manage those risks in many cases – airliners are complicated, but they generally fly very safely because that complexity and the attendant risk are carefully managed. That management happens because of good engineering but also because of culture, policy, and law – crew are not blamed for reporting safety problems, for example. In the financial industry in the US, there’s a discipline of “model risk management” that is supposed to demonstrate risk management practices to regulators. It’s not perfect, but it balances being careful with taking new risks. Of course, risk judgements can be spectacularly wrong – that’s what happened in the 2008 financial crisis. But that shows something else – in that crisis, the judgements were wrong because the credit rating agencies that were making them were paid to judge the risks as low. No technical control can be stronger than the organization that uses it.

Broadband4Europe. In your work you describe a new technological toolkit for verifying the compliance of automated decisions. Can you describe the toolkit and its main benefits? How does it work?

Joshua A. Kroll. My work suggests building a kind of audit trail for algorithms, which records what the inputs and outputs were and how they are supposed to be related. Using some computer science and cryptography, it’s possible to structure this audit trail so that lots of it can be kept secret and yet you can still be convinced of many useful things, like that my decision was made by the same rule as your decision. Then later, you can show the audit trail to an oversight entity like a court or a regulator to establish whether any specific decision was appropriate. Further, the oversight process can review the rules in secret and announce that they are acceptable, that they don’t use inappropriate data or discriminate, for example.

I think this sort of accounting forms the base level for the accountability structures we need to make sure our new, automated world is governed to the same level we’re used to. This approach doesn’t solve every problem – some rules might look OK on their face but disfavor specific people or groups inappropriately, and that can be hard to spot – but it provides an important baseline.
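Editorial illustration: the audit trail described in this answer can be sketched with a simple hash-based commitment. This is a minimal sketch under stated assumptions, not the method from the interviewee's work: the decision rule (`POLICY`, `decide`), the record layout, and all function names are hypothetical, and the salted SHA-256 commitment stands in for the more capable cryptography the answer alludes to. The idea shown is only that an operator can publish a short fingerprint of its rule in advance, attach that fingerprint to every decision, and later open the records to an overseer who checks them.

```python
import hashlib
import hmac
import json
import secrets


def commit(value: bytes) -> tuple[str, bytes]:
    """Return (commitment, nonce): a salted SHA-256 hash that hides `value`."""
    nonce = secrets.token_bytes(32)
    return hashlib.sha256(nonce + value).hexdigest(), nonce


def verify(commitment: str, nonce: bytes, value: bytes) -> bool:
    """Check that (nonce, value) opens the given commitment."""
    expected = hashlib.sha256(nonce + value).hexdigest()
    return hmac.compare_digest(commitment, expected)


def encode(obj) -> bytes:
    """Canonical byte encoding so equal objects always hash identically."""
    return json.dumps(obj, sort_keys=True).encode("utf-8")


# Hypothetical decision rule, invented purely for illustration.
POLICY = {"rule": "approve if score >= threshold", "threshold": 600}


def decide(policy: dict, applicant: dict) -> str:
    return "approve" if applicant["score"] >= policy["threshold"] else "deny"


# Step 1: before deciding anything, the operator publishes this commitment.
policy_commitment, policy_nonce = commit(encode(POLICY))


def log_decision(applicant: dict):
    """Step 2: decide, then commit to a record tied to the published rule."""
    record = {
        "input": applicant,
        "output": decide(POLICY, applicant),
        "policy": policy_commitment,
    }
    record_commitment, record_nonce = commit(encode(record))
    return record, record_commitment, record_nonce


def audit(policy, policy_nonce, record, record_commitment, record_nonce) -> bool:
    """Step 3: an overseer given the openings re-checks the whole chain."""
    return (
        verify(record["policy"], policy_nonce, encode(policy))
        and verify(record_commitment, record_nonce, encode(record))
        and decide(policy, record["input"]) == record["output"]
    )
```

Two people who each hold a record can compare the `policy` field to see that the same committed rule applied to both, without either learning the rule itself. In this sketch the overseer must see the rule and the inputs in the clear; letting the public check such properties without any disclosure would require additional tools such as zero-knowledge proofs, which are not shown here.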

Broadband4Europe. In your work you say that transparency of the algorithm is not enough to solve the accountability question. What do you mean?

Joshua A. Kroll. Lots of policy documents suggest that transparency is an important governance tool. I agree. But I worry that we might stop there, when we need to go further. Knowing how a system works or how a decision was taken is not the same as knowing why it works that way or who made the decision to set it up that way. And even knowing all of that doesn’t tell you who should be responsible for a decision, so it doesn’t tell you what to do if you don’t like the decision or want to change it.

In many important applications, transparency isn’t even possible. If private data are involved, you likely can’t disclose them – we have this issue with the Census in the US. By law, the Census agency cannot disclose what people put on their response forms, but it’s been possible historically to make good guesses about responses from the aggregated statistics the agency produces. So the agency is moving to use a modern privacy tool to guarantee this won’t be possible in the future. That’s a hard, research-level problem, but they’re making progress on it. It causes tradeoffs, because the tool requires them to add noise to the data and this makes it hard for population scientists to use the data in the way they’re used to. It’s not a perfect solution, but it’s a step in a good direction. They can be transparent about how they’re protecting private data and the statistics they produce, while also making sure they protect the data of individuals.
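Editorial illustration: the answer does not name the privacy tool, but one widely used technique that works by adding noise to published statistics is differential privacy. The sketch below shows its simplest building block, the Laplace mechanism applied to a single count. It is an assumption-laden toy, not the agency's actual system: the function names, the `epsilon` values, and the sensitivity of 1 are all chosen for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise of scale 1/epsilon.

    Assumes adding or removing one person changes the count by at most 1
    (sensitivity 1). A smaller epsilon means more noise and more privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed only so the demo is repeatable
    for epsilon in (0.1, 1.0, 10.0):
        print(epsilon, round(noisy_count(1000, epsilon, rng), 2))
```

The tradeoff mentioned in the answer is visible in the single parameter: at a small `epsilon` the released count can be off by tens of people, which protects individuals but frustrates analysts working with small populations, while at a large `epsilon` the count is nearly exact and offers little protection.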

In national security applications, we often can’t be transparent for other reasons. Namely, if adversaries understood the process, they could use that information to attack it. That concern isn’t specific to military applications, by the way – lots of automated decision systems can be manipulated if people understand how they work. For example, e-mail spam has an easier time getting through when the spammer understands what will and won’t be blocked and why, and many techniques spammers use are designed to overcome parts of the spam defense system.

Finally, my favorite example of this is a lottery. Lotteries are perfectly transparent – we know exactly how they are supposed to work, and the transparency is the point. Even though we understand exactly how the winner is chosen, we can’t predict who the winner will be ahead of time. Nor can we replicate the outcome. If I told you that I ran a lottery in the back room of a nightclub, for example, you probably wouldn’t want to buy a ticket because I might keep re-choosing the winner until one of my friends wins, so you aren’t likely to win ever. These problems get worse when lotteries are run on computers for interesting technical reasons, but the result is that there have been many interesting lottery fraud cases involving computer lotteries. One interesting case is the lottery that was used in 2012 to determine who would get certain immigration visas to live in the USA. A bug in the program caused it to choose mostly people who applied in the first few days of the application period, and it was found by a court to be illegal, since it didn’t give all the applicants an equal chance. So even though the winners had been notified, many of them didn’t end up getting the visas.
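Editorial illustration: the replication problem described above (an operator quietly re-running the draw) can be reduced if the random seed is fixed before the draw and the draw itself is a deterministic function of that seed. The following is a minimal sketch under assumptions, not a description of any real lottery: the function names are hypothetical, applicant identifiers are assumed to be unique strings, and the seed is assumed to be 32 random bytes that the operator commits to in public before applications close.

```python
import hashlib


def seed_commitment(seed: bytes) -> str:
    """Fingerprint of the seed, published before the draw takes place."""
    return hashlib.sha256(seed).hexdigest()


def draw_winners(seed: bytes, applicant_ids: list[str], n_winners: int) -> list[str]:
    """Rank applicants by SHA-256(seed | id) and take the first n_winners.

    The ranking depends only on the seed and each identifier, not on the
    order in which applications arrived.
    """
    def ticket(applicant_id: str) -> bytes:
        return hashlib.sha256(seed + b"|" + applicant_id.encode("utf-8")).digest()

    return sorted(applicant_ids, key=ticket)[:n_winners]


def verify_draw(commitment: str, seed: bytes, applicant_ids: list[str],
                announced: list[str]) -> bool:
    """Anyone holding the revealed seed can replay and check the draw."""
    return (
        seed_commitment(seed) == commitment
        and draw_winners(seed, applicant_ids, len(announced)) == announced
    )
```

After the draw, the operator reveals the seed; because it must match the earlier fingerprint, re-choosing the winner would require a different seed and would be detected. This sketch does not stop an operator from picking a favorable seed in advance with knowledge of the applicant list, so a real design would also need the seed to come from a source the operator cannot control.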

The overall point is that transparency helps, but isn’t always desirable, and even when you have it, it might not give you what you want.

Broadband4Europe. In general, the accountability of algorithms is central to safeguarding the interests of citizens and society as a whole. How can accountability be ensured?

Joshua A. Kroll. As I said, the important thing is the accountability of the entire system, not just the algorithms. As you say, it’s the interests of citizens and of society as a whole that matter. I think if we start from that perspective, we see the set of possible interventions more clearly.

For example, there’s currently a policy discussion in the US about whether big technology platforms are illegal monopolies. Without taking a position on that question, we can ask whether breaking up these platforms would serve the interests we actually care about. The truth is that plenty of small platforms have the same problems and the things we can do to serve the interests of society aren’t directly relevant to competition issues, though competition might drive the big platforms to take problems more seriously. But interventions like improving the capacity of oversight agencies or requiring third-party audits or instituting risk management guidelines could happen before or after competition enforcement.

These problems are about power, not about technology. Technology can centralize power, but it can also democratize it. We need to work to make sure it does the latter.

Broadband4Europe. Is it possible to detect and prevent the risk of discrimination in algorithmic decision-making processes?

Joshua A. Kroll. We know many ways to identify and measure discrimination in decision-making processes and some ways to fix those problems. The machine learning community has gotten very interested in establishing “fairness”, but the law and our tools are both better at establishing unfairness or discrimination. Most often, the problem is again one of power dynamics and whether the decision-making process favors certain groups systematically. That can happen because of bias in data, which can often be seen with careful auditing, though it can also be a subtle or insidious problem. But it can also happen because of the way the goal of a system is established, which has nothing to do with the data behind a system or the design of that system. For example, there’s a set of tools that are meant to automate the interviewing process by analyzing video of job applicants. We aren’t yet to the point where AI can understand the answers the applicant gives to the interview questions, and these systems seem to work by analyzing the emotions of the applicant as expressed through facial behavior. Of course, there’s no reason to believe that the small movements of an applicant’s face during their job interview have anything to do with their job performance. These systems are essentially the 21st century version of phrenology, making decisions based on variations in irrelevant characteristics. Yet many companies rely on these tools, which are very likely discriminatory on the basis of protected attributes like race or health status or disability. The risk here is not that the system is based on biased data or that anyone did anything wrong in the process of building these models. It’s that the problem the system is trying to solve is poorly formulated. There’s no “fair” version of phrenology and it would be wrong to try to pursue one.
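Editorial illustration: one simple audit measurement of the kind mentioned at the start of this answer compares selection rates across groups. The sketch below is an assumption, not a tool named in the interview: the function names and the data layout are hypothetical. It computes each group's selection rate and the ratio of the lowest rate to the highest, a quantity used in the "four-fifths" rule of thumb in US employment guidance, under which a ratio below 0.8 is commonly treated as a signal of adverse impact that deserves scrutiny.

```python
from collections import Counter


def selection_rates(decisions):
    """Per-group selection rate from an iterable of (group, selected) pairs."""
    totals, selected = Counter(), Counter()
    for group, was_selected in decisions:
        totals[group] += 1
        if was_selected:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


def impact_ratio(rates):
    """Lowest group selection rate divided by the highest."""
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group had any selections")
    return min(rates.values()) / highest
```

A check like this can flag unequal outcomes, but it says nothing about whether the thing being scored is relevant in the first place. A screening tool could pass it and still be measuring something unrelated to job performance, which is the distinction the answer draws.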

Again, it’s a question of risk management. Here, the risks are in the decision to build or buy a certain product or to use a certain process for a sensitive function like hiring. A company might say “it’s OK, because the final hiring decisions are made by humans and the automated tools are just used for screening.” But what happens to the people who are screened out by these tools? Do they have an opportunity to object? And what of manipulation? In Korea, university students take classes to learn to score well on these automated assessments. So you can see that the process reinforces existing power structures rather than being more “objective” through the use of automation.