Algorithms should not decide who spends time in a California jail. But that's exactly what will happen under S.B. 10, a new law slated to take effect in October 2019. The law, which Governor Jerry Brown signed in September, requires the state's criminal justice system to replace cash bail with an algorithmic pretrial risk assessment. Each county in California must use some form of pretrial risk assessment to categorize every person arrested as a low, medium, or high risk of failing to appear for court or of committing another crime that poses a risk to public safety. Under S.B. 10, if someone receives a high risk score, that person must be detained prior to arraignment, effectively placing crucial decisions about a person's freedom into the hands of the companies that make assessment tools.

Some see risk assessment tools as more impartial than judges because they make determinations using algorithms. But that assumption ignores the fact that algorithms, when not carefully calibrated, can cause the same sort of discriminatory outcomes as existing systems that rely on human judgment, and can even make new, unexpected errors. We doubt these algorithmic tools are ready for prime time, and the state of California should not have embraced their use before establishing ways to scrutinize them for bias, fairness, and accuracy.

EFF in July joined more than a hundred advocacy groups to urge jurisdictions in California and across the country that already use these algorithmic tools to stop until they have considered the many risks and consequences of their use. Our concerns are now even more urgent in California, with less than a year to implement S.B. 10. We urge the state to start working now to make sure that S.B. 10 does not reinforce existing inequity in the criminal justice system, or introduce new disparities.

This is not a merely theoretical concern. Researchers at Dartmouth College found in January that one widely used tool, COMPAS, incorrectly classified black defendants as being at risk of committing a misdemeanor or felony within two years at a rate of 40%, versus 25.4% for white defendants.

There are ways to minimize bias and unfairness in pretrial risk assessment, but doing so requires proper guidance and oversight. S.B. 10 offers no guidance for how counties should calculate risk levels. It also fails to lay out procedures to protect against unintentional, unfair, biased, or discriminatory outcomes.

The state's Judicial Council is expected to post the first of its rules mandated by S.B. 10 for public comment within the coming days. The state should also release, for public review, information about the various algorithmic tools counties can consider, and it should do so soon. To date, we don't even have a list of the tools up for consideration across the state, let alone the information and data needed to assess them and safeguard against algorithmic bias.

We offer four key criteria that anyone using a pretrial risk assessment tool must satisfy to ensure that the tool reduces existing inequities in the criminal justice system rather than reinforces them, and avoids introducing new disparities. Counties must engage the public in setting goals, assess whether the tools they are considering use the right data for their communities, and ensure the tools are fair. They must also be transparent and open to regular independent audits and future correction.

Policymakers and the Public, Not Companies, Must Decide What a Tool Prioritizes

As the state considers which tools to recommend, the first step is to decide what its objective is. Is the goal to have fewer people in prisons? Is it to cut down on unfairness and inequality? Is it both? How do you measure if the tool is working?

These are complex questions. It is, for example, possible to optimize an algorithm to maximize true positives, meaning to correctly identify those who are likely to fail to appear or to commit another dangerous crime if released. Optimizing an algorithm that way, however, also tends to increase the number of false positives, meaning more people will be held in custody unnecessarily.

It's also important to define what constitutes success. A system that recommends detention for everyone, after all, would have both a 100% true positive rate and a 100% false positive rate, and would be horribly unjust. As Matthias Spielkamp wrote for the MIT Technology Review: "What trade-offs should we make to ensure justice and lower the massive social costs of incarceration?"
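
To make the trade-off concrete, here is a minimal sketch in Python using entirely invented risk scores and outcomes (not data from any real tool). It shows how lowering the detention threshold to catch more true positives also drives up false positives, and how "detain everyone" produces a 100% rate of both.

```python
# A toy illustration with invented risk scores and outcomes (1 = failed to
# appear or was re-arrested for a dangerous offense, 0 = did not).
risk_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
outcomes    = [1,   0,   1,   0,   0,   1,   0,   0]

def rates(threshold):
    """Detain everyone at or above the threshold; report TPR and FPR."""
    detained = [score >= threshold for score in risk_scores]
    tp = sum(d and y for d, y in zip(detained, outcomes))      # rightly detained
    fp = sum(d and not y for d, y in zip(detained, outcomes))  # needlessly detained
    tpr = tp / sum(outcomes)                     # share of risky people caught
    fpr = fp / (len(outcomes) - sum(outcomes))   # share of safe people detained
    return tpr, fpr

for threshold in (0.75, 0.5, 0.25, 0.0):
    tpr, fpr = rates(threshold)
    print(f"threshold {threshold:.2f}: true positive rate {tpr:.0%}, "
          f"false positive rate {fpr:.0%}")
# At threshold 0.0 ("detain everyone"), both rates hit 100%.
```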

Lawmakers, the courts, and the public, not the companies that make and sell algorithmic tools, should decide together what we want pretrial risk assessment tools to prioritize and how to ensure that they are fair.

The Data and Assumptions Used to Develop the Algorithm Must Be Scrutinized

Part of the problem is that many of these pretrial risk assessment tools must be trained by examining existing data. But the assumptions a developer makes when creating an assessment don't always apply to the communities in which they are used. For example, the dataset used to train a machine-learning algorithm might not be representative of the community that will eventually use the risk assessment. If the risk assessment tool learned from bad training data, it will produce bad risk assessments.

How might the training data for a machine-learning algorithm be bad?

For example, the rate of re-arrest of released defendants could be used as a way to measure someone's risk to public safety when building an algorithm. But does the re-arrest rate actually tell us about risk to public safety? In fact, not all jurisdictions define re-arrest in the same way. Some include only re-arrests that actually result in bail revocation, but some include traffic or misdemeanor offenses that don't truly reflect a risk to society.

Training data can also often be gummed up by our own systemic biases. Data collected by the Stanford Open Policing Project shows that officers' own biases cause them to stop black drivers at higher rates than white drivers and to ticket, search, and arrest black and Hispanic drivers during traffic stops more often than whites. Using a rate of arrest that includes traffic offenses could therefore introduce more racial bias into the system, rather than reduce it.
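
A small, purely hypothetical simulation illustrates the problem. Suppose two groups have the same underlying rate of dangerous conduct, but one group is stopped and arrested for low-level offenses three times as often. The "re-arrest" label a tool trains on then reflects policing patterns rather than actual risk. (All rates below are invented for illustration, not drawn from any real dataset.)

```python
# Purely illustrative simulation, all numbers invented: both groups have the
# SAME underlying rate of dangerous conduct, but one group is stopped and
# arrested for low-level offenses three times as often.
import random

random.seed(0)

TRUE_RISK_RATE = 0.20                                   # identical in both groups
LOW_LEVEL_ARREST_RATE = {"group_1": 0.10, "group_2": 0.30}

def observed_rearrest_rate(group, n=100_000):
    rearrests = 0
    for _ in range(n):
        truly_risky = random.random() < TRUE_RISK_RATE
        # a "re-arrest" label is recorded if the person actually reoffends OR
        # gets swept up in a traffic/misdemeanor arrest
        low_level = random.random() < LOW_LEVEL_ARREST_RATE[group]
        rearrests += truly_risky or low_level
    return rearrests / n

for group in ("group_1", "group_2"):
    print(f"{group}: observed re-arrest rate = {observed_rearrest_rate(group):.1%}")
# The label says group_2 is "riskier" even though the underlying risk is equal,
# so any model trained on it will learn to score group_2 higher.
```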

Taking the time to clean datasets and carefully vet tools before implementation is necessary to protect against unfair, biased, or discriminatory outcomes.

Fairness and Bias Must Be Considered and Corrected

Beyond examining the training data algorithms use, it's also important to understand how the algorithm makes its decisions. The fairness of any algorithmic system should be defined and reviewed before implementation as well as throughout the system's use. Does an algorithm treat all groups of people the same? Is the system optimizing for fairness, for public safety, for equal treatment, or for the most efficient allocation of resources?
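
One way to begin answering the first of those questions is a group-level audit of a tool's outputs. The sketch below, using invented records, compares false positive rates across two groups, that is, how often people who did not go on to reoffend or miss court were nonetheless flagged as high risk. This is the kind of disparity the Dartmouth researchers reported for COMPAS.

```python
# A hypothetical audit sketch: all groups and records are invented. A real
# review would use the tool's actual outputs and court records.
from collections import defaultdict

# each record: (group, flagged as high risk?, actually reoffended or missed court?)
records = [
    ("group_1", True,  False), ("group_1", True,  True),  ("group_1", False, False),
    ("group_1", True,  False), ("group_1", False, True),  ("group_1", False, False),
    ("group_2", True,  False), ("group_2", False, False), ("group_2", False, True),
    ("group_2", False, False), ("group_2", True,  True),  ("group_2", False, False),
]

false_pos = defaultdict(int)   # flagged high risk but did not reoffend
negatives = defaultdict(int)   # everyone who did not reoffend

for group, flagged, reoffended in records:
    if not reoffended:
        negatives[group] += 1
        if flagged:
            false_pos[group] += 1

for group in sorted(negatives):
    rate = false_pos[group] / negatives[group]
    print(f"{group}: false positive rate = {rate:.0%}")
# If these rates diverge sharply between groups, the tool is not treating all
# groups of people the same, no matter how accurate it is overall.
```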

Biased decision-making is a trap that both simple and complicated algorithms can fall into. Even a tool built on carefully vetted data can produce unfair assessments if it focuses too narrowly on a single measure of success. (See, for example, Goodhart's Law.) Algorithmic systems used in criminal justice, education policy, insurance, and lending have all exhibited these problems.

It's important to note that simply eliminating race or gender data will not make a tool fair, because of the way machine learning algorithms process information. Machine learning algorithms can make prejudiced or biased decisions even if data on demographic categories is deliberately excluded, a phenomenon known in statistics as omitted variable bias. For example, if a system is asked to predict a person's risk to public safety but lacks information about their access to supportive resources, it could improperly learn to use their postal code as a way to determine their threat to public safety.

In this way, a risk assessment can use factors that appear neutral, such as a person's income level, but produce the same unequal results as if it had used prohibited factors such as race or sex.
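
A small, hypothetical simulation shows how this proxy effect works. The model below never sees group membership; it only learns an average risk score per postal code. Because postal code correlates with group, and the historical labels are themselves skewed, the disparity reappears in the model's predictions. (All groups, postal codes, and rates are invented for illustration.)

```python
# Hypothetical illustration: group membership is never given to the model,
# but postal code correlates with it, so the disparity in the (biased)
# historical labels carries straight through to the model's predictions.
from collections import defaultdict
import random

random.seed(1)

def sample_person():
    group = random.choice(["group_1", "group_2"])
    # postal code strongly tracks group membership (invented codes)
    if group == "group_2":
        zipcode = "00001" if random.random() < 0.9 else "00002"
    else:
        zipcode = "00002" if random.random() < 0.9 else "00001"
    # historical "re-arrest" labels are skewed against group_2
    label = random.random() < (0.40 if group == "group_2" else 0.25)
    return group, zipcode, label

people = [sample_person() for _ in range(50_000)]

# "training": the model sees only postal code -> label
labels_by_zip = defaultdict(list)
for _, zipcode, label in people:
    labels_by_zip[zipcode].append(label)
risk_by_zip = {z: sum(v) / len(v) for z, v in labels_by_zip.items()}

# audit: average predicted risk, broken out by the group the model never saw
scores_by_group = defaultdict(list)
for group, zipcode, _ in people:
    scores_by_group[group].append(risk_by_zip[zipcode])
for group, scores in sorted(scores_by_group.items()):
    print(f"{group}: average predicted risk = {sum(scores) / len(scores):.1%}")
```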

Automated assessments can also fail to take important, but less obvious, information about people's lives into account, reducing people to the sum of their data and ignoring their humanity. A risk assessment may not, for example, consider familial relationships and responsibilities. Yet a person who is the primary caregiver for a sick relative may be at significantly higher risk of failing to appear in court without purposely absconding. If these relationships are not considered, the system may conflate such life circumstances with a risk of flight, which would lead to inaccurate, potentially biased, and discriminatory outcomes.

There are sensible solutions to address omitted variable bias, and they must be applied properly to offset existing biases inherent in the training data.

The Public and Independent Experts Must Be Informed and Consulted

Any government decision to adopt a system or tool that uses algorithmic decision-making is a policy decision, whether the system is being used for pretrial risk assessment or to determine whether to cut people off from healthcare, and the public needs to be able to hold the government accountable for those decisions. Thus, even when decision makers have thought through the steps we've outlined as they choose vendors, it's equally vital that they let the public and independent data scientists review those choices.

Developers must be upfront about how their tools work, so that courts, policymakers, and the public understand how the tools fit their communities. If these tools are allowed to be a "black box," a system or device that doesn't reveal how it reaches its conclusions, the public is robbed of its right to understand what the algorithm does and to test its fairness and accuracy. Without knowing what goes into the black box, it's hard to assess the fairness and validity of what comes out of it.

The public must have access to the source code and the materials used to develop these tools, and the results of regular independent audits of the system, to ensure tools are not unfairly detaining innocent people or disproportionately affecting specific classes of people.

Transparency gives people a way to measure progress and ensure government accountability. As Algorithm Watch says, "The fact that most [algorithmic decision making] procedures are black boxes to the people affected by them is not a law of nature. It must end."

California Needs To Address These Issues Immediately

As California looks to implement S.B. 10, it should not rely on vendor companies' marketing promises. We urge the state to thoroughly vet any algorithmic tools under consideration, and to enable independent experts and auditors to do the same. There must be thorough and independent evaluations of whether the tools up for consideration are fair and appropriate.

Any recommendation to take away someone's liberty must receive immediate human review. These considerations should have been baked into S.B. 10 from the start. But it is critical that California satisfy these four criteria now, and that policymakers across the country considering similar laws build these safeguards directly into their legislation.