The AI Bill of Rights Is a Great Step – But More Is Needed

*Read more about author Mamdouh Refaat.*

The White House Office of Science and Technology Policy (OSTP) has issued the Blueprint for an AI Bill of Rights. This document describes the rights that should be protected when implementing automated systems using AI technology. It lists the following five principles that define these rights:

1. The right to be protected against unsafe or ineffective systems.

2. The right to be protected against discrimination by algorithms. AI automated systems should be designed equitably.

3. The right to be protected against abusive data practices and have agency over how one’s data is used.

4. The right to know how these systems are used and understand how they make decisions.

5. The right (where appropriate!) to opt out and have access to human alternatives to resolve problems.

Before we discuss the document and make some proposals on additional considerations, let’s review the basic technology that this bill of rights is designed to address.

The term AI (artificial intelligence) is used to describe a system that implements a computer program that can perform a function and make decisions that are the domain of intelligent creatures, i.e., humans.

For example, playing a game of chess is the domain of humans, and those who are really good at it are usually intelligent. We can write computer programs that play chess as well as, or even better than, humans in two nonexclusive ways. The first method relies on asking expert players to tell us how they analyze situations in games and the rules they use to determine each move. These rules can then be coded as a computer program that can be as good as or better than human players. Such systems are called “expert systems.”

The second method is based on collecting the moves made by expert players from a large number of games and applying some algorithm that automatically extracts the rules that the players used in these games to win. This technique is called “machine learning” and it is based on the “data” collected from these games. The rules (or equations) extracted from the data by the learning algorithm are what is called the “ML model.” This model is then coded as a computer program to be used to make decisions – in this case, the moves in a game of chess. Most AI systems are now based on ML models developed using collected data.

In either method, the developed computer code (system) is used to make decisions in an automated way. These systems are the subject of the AI Bill of Rights.

Because most AI systems are now developed using the machine learning approach, we should note that the concerns with these systems arise from four possible sources:

The data used for creating the ML model: The data could be biased, resulting in discrimination or simply leading to models that make wrong decisions. The data could also include private data that was collected and used without the knowledge and/or consent of the persons from whom it was collected. Furthermore, although data is now considered a commodity, it does not have an expiration date. It lives forever. For example, when a person gets a credit card from a bank, the entire history of the transactions on this account is stored and kept virtually forever. The bank may continue to use this data in creating ML models for years, even after the account is closed and the card is canceled. Although in principle the customers own their data, the bank would continue to use the data indefinitely. Some of the data is also private and sensitive in nature, such as the personal data entered at the time of application (SSN, income, date of birth, etc.) or records of purchases that the person wants to keep private.

The learning algorithm and modeling process: The mathematical complexity of the learning algorithms of ML models is increasing each day, making it more and more difficult to audit or track the process and the discipline used to create the ML models. The process of creating ML models involves many complex decisions by the analyst (usually a data scientist), so that, even with detailed documentation of the modeling process, it is hard to scrutinize the effect of these decisions on the integrity of the decisions made by the model in automated deployment. For example, the common practice of “binning” continuous variables, such as “age,” could introduce some bias because of the specific choice of the bin boundaries. For instance, if the age of a group of customers ranges between 21 and 99, and we group the customers into three bins, say (21-35, 36-55, 56-99), a customer may get two different decisions on two dates only a few days apart around the 36th birthday.
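The binning effect described above can be made concrete with a small sketch. The bin boundaries come from the example in the text; the decision attached to each bin, the function names, and the dates are hypothetical and chosen only to show how a decision can flip overnight.

```python
from datetime import date

# Bin boundaries from the example in the text: (low, high, label).
AGE_BINS = [(21, 35, "21-35"), (36, 55, "36-55"), (56, 99, "56-99")]

# Hypothetical decision per bin, for illustration only.
DECISION_BY_BIN = {"21-35": "reject", "36-55": "accept", "56-99": "accept"}


def age_on(birth_date, on_date):
    """Age in whole years on a given date."""
    years = on_date.year - birth_date.year
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1  # birthday has not happened yet this year
    return years


def age_bin(age):
    """Return the label of the bin that contains the given age."""
    for low, high, label in AGE_BINS:
        if low <= age <= high:
            return label
    raise ValueError(f"age {age} is outside the modeled range 21-99")


def decide(birth_date, on_date):
    """Decision that depends only on the age bin (a deliberate simplification)."""
    return DECISION_BY_BIN[age_bin(age_on(birth_date, on_date))]


birth = date(1990, 6, 15)
day_before = decide(birth, date(2026, 6, 14))  # customer is still 35
on_birthday = decide(birth, date(2026, 6, 15))  # customer has turned 36
print(day_before, on_birthday)
```

Nothing about the customer changed between the two dates except the bin they fall into, which is why the choice of boundaries deserves scrutiny.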

The model accuracy: No ML model is 100% accurate. Therefore, automated AI systems relying on machine learning models will, by definition (and design), make some wrong decisions. When these decisions are binary (yes/no, accept/reject, etc.), we denote one of the two levels as the positive event and the other as the negative event. In some applications, the choice of which level we call positive is clear, such as medical testing for a specific condition or infection. A recent example is the COVID testing of the last couple of years. But since no model is perfect, we will always have false positive and false negative outcomes. The better the model, the fewer of these two types of errors we will have. But what about the unlucky individuals whom the automated system erroneously identifies as positive or negative? Although the analysis of these errors often leads to a systematic explanation of when they occur, we first have to find the errors and analyze them. And in most cases, we cannot eliminate them.
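As a minimal sketch of how the two error types are counted, the snippet below tallies true and false positives and negatives for a binary decision. The labels and predictions are made-up illustrative values, not real test results, and the function name is an assumption.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary outcomes."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["tp"] += 1  # correctly flagged as positive
        elif p and not a:
            counts["fp"] += 1  # flagged as positive, actually negative
        elif not p and a:
            counts["fn"] += 1  # missed a real positive
        else:
            counts["tn"] += 1  # correctly flagged as negative
    return counts


# Made-up data: True means the positive event.
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]

counts = confusion_counts(actual, predicted)

# Share of real negatives wrongly flagged, and of real positives missed.
false_positive_rate = counts["fp"] / (counts["fp"] + counts["tn"])
false_negative_rate = counts["fn"] / (counts["fn"] + counts["tp"])
print(counts, false_positive_rate, false_negative_rate)
```

Each count in `fp` and `fn` corresponds to an individual who received a wrong automated decision, which is the group the paragraph above is concerned with.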

Model deployment: Models are always deployed within an operational system that runs a specific business. For example, a model used to approve a loan to customers of a bank will be implemented within the banking system that manages customers and their accounts. The decisions of the ML model will be fed into the banking system. Programming the interface between the banking system and the decision-making model is an IT task that is subject to the risk of errors and bugs. A customer’s loan application could be denied, or simply delayed, because of an undetected error or bug.

With the above-highlighted issues that may be encountered in the development of AI automated systems, let’s now discuss each of the five principles of the proposed bill.

1. Unsafe or Ineffective Systems

This principle is of course the most important one. The bill discusses this issue in reasonably good detail. However, one could add that we should expect implementations to include more than one model or AI system to make important decisions. For example, in critical applications such as medical treatment, security, or financial decisions with a high impact on the lives of individuals, the automated AI system should not rely on one ML model or decision engine, but rather on several such models that take different points of view and consider different modeling data. The final decision would be a pooled decision from these models (using some aggregation or voting scheme). This will reduce the number of false positives and false negatives by allowing some models to compensate for the weaknesses of others. This idea has its roots in the medical practice of seeking a second (or further) opinion in critical, difficult cases. The level of redundancy should be proportional to the importance of the application. The more critical the decisions made by the AI automated system, the more redundant systems and models should be implemented.
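One possible voting scheme for such a pooled decision is a simple majority vote, sketched below. The three rule-based "models," their thresholds, and the fallback label `refer_to_human` are hypothetical stand-ins; real systems would pool trained ML models and might weight their votes.

```python
from collections import Counter


# Hypothetical stand-ins for independently built models.
def model_income(applicant):
    return "approve" if applicant["income"] >= 40000 else "deny"


def model_history(applicant):
    return "approve" if applicant["late_payments"] <= 1 else "deny"


def model_debt(applicant):
    return "approve" if applicant["debt_ratio"] < 0.4 else "deny"


def pooled_decision(applicant, models):
    """Majority vote; without a strict majority, hand off to a person."""
    votes = Counter(model(applicant) for model in models)
    decision, count = votes.most_common(1)[0]
    if count * 2 <= len(models):  # tie or no strict majority
        return "refer_to_human"
    return decision


applicant = {"income": 35000, "late_payments": 0, "debt_ratio": 0.3}

# Income model says deny, the other two say approve.
three_models = pooled_decision(applicant, [model_income, model_history, model_debt])

# With only two disagreeing models there is no majority.
two_models = pooled_decision(applicant, [model_income, model_history])
print(three_models, two_models)
```

Using an odd number of models avoids ties for binary decisions; the human fallback is one way to handle the remaining ambiguous cases.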

2. Preventing Algorithmic Discrimination

Algorithmic discrimination results from using biased data and unfair manual tuning and/or manual overrides. Removing, or at least minimizing, bias in the data is not as hard or expensive as some may think. And in certain areas, existing laws already protect consumers against such bias. For example, the Equal Credit Opportunity Act (ECOA), and its amendments, clearly state the prohibited areas of discrimination in making credit and lending decisions. Data elements related to or derived from these areas may not be used in any credit or lending decision, whether automated or manual. Such fields include data identifying race, gender, age, religious affiliation, and marital status. The AI Bill of Rights could easily extend the scope of these laws to cover all AI automated systems. This could be an easy entry point for determining the minimum standard to prevent algorithmic discrimination.
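A very rough sketch of the first step, keeping the prohibited fields out of the modeling data, might look like the following. The field names are hypothetical column names mapped from the list in the text, and the sketch does not address fields derived from or correlated with the prohibited ones, which need a separate review.

```python
# Hypothetical column names for the prohibited areas listed in the text.
PROHIBITED_FIELDS = {
    "race",
    "gender",
    "age",
    "religious_affiliation",
    "marital_status",
}


def prohibited_fields_present(record):
    """List the prohibited fields found in a record, for audit logging."""
    return sorted(PROHIBITED_FIELDS.intersection(record))


def strip_prohibited(record):
    """Return a copy of the record without the prohibited fields."""
    return {k: v for k, v in record.items() if k not in PROHIBITED_FIELDS}


# Made-up applicant record.
applicant = {
    "income": 52000,
    "late_payments": 1,
    "age": 41,
    "marital_status": "single",
}

flagged = prohibited_fields_present(applicant)
cleaned = strip_prohibited(applicant)
print(flagged, cleaned)
```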

3. Data Privacy

Data privacy is a difficult subject because we need to balance two things: the ability of organizations to use data to serve their clients better and to customize their products and services to those clients’ needs, and the right of the individual to privacy.

A good reference point for the AI Bill of Rights would be the European Union (EU) General Data Protection Regulation (GDPR), issued in 2016. It clearly defines who can access what data and how they can use it. It also clearly states the conditions of consent and the rights of individuals regarding their data. Similarly detailed regulations, or at least recommendations, should be added to the AI Bill of Rights. Note that entities outside the EU, including organizations in the USA, are bound to comply with the GDPR if they process or store any data protected under this regulation. It is ironic that companies in the US must follow stricter regulations in handling the data of EU residents than in handling the data of US or other residents. US citizens and residents deserve at least the same level of protection offered to EU residents, if not better.

4. Notice and Explanation

Individuals and communities have the right to know how decisions regarding their interests are made. These decisions and how they are made should be easy to understand, and the decisions should be justified. The AI Bill of Rights could follow the example of the credit risk industry, which has standardized the machine learning models for all lending and credit risk procedures into one form called the “Standard Scorecard” format. In this format, each customer or account gets a specific number of points for matching certain criteria involving each of the predictors or attributes used in the model. The final score representing creditworthiness, and hence the decision to allow or deny the credit, is the sum of the points from each model attribute. This methodology has proved robust enough to be the basis of lending and credit decisions for the last three decades. It is supported by many software vendors, including Altair.
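The additive structure of a scorecard is what makes it easy to explain, as the following minimal sketch suggests. The attributes, point values, and cutoff are invented for illustration and do not come from any real scorecard.

```python
# Hypothetical scorecard: attribute -> list of (low, high, points) ranges.
SCORECARD = {
    "income": [(0, 29999, 10), (30000, 69999, 25), (70000, 10**9, 35)],
    "years_at_address": [(0, 2, 5), (3, 10, 15), (11, 100, 20)],
    "late_payments": [(0, 0, 40), (1, 2, 20), (3, 1000, 0)],
}
CUTOFF = 55  # hypothetical minimum score for approval


def points_for(attribute, value):
    """Look up the points awarded for one attribute value."""
    for low, high, points in SCORECARD[attribute]:
        if low <= value <= high:
            return points
    raise ValueError(f"{attribute}={value} is not covered by the scorecard")


def score(applicant):
    """Return the total score and the per-attribute breakdown."""
    breakdown = {attr: points_for(attr, applicant[attr]) for attr in SCORECARD}
    return sum(breakdown.values()), breakdown


applicant = {"income": 45000, "years_at_address": 4, "late_payments": 1}
total, breakdown = score(applicant)
decision = "approve" if total >= CUTOFF else "deny"
print(total, breakdown, decision)
```

The per-attribute breakdown doubles as the explanation: an applicant can see exactly which criteria contributed how many points to the final decision.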

In addition to standard scorecards, there exist a number of tools providing what is known as “explainable AI.” Mandating the use of these tools, or of a standardized model form similar to that used in the credit industry, as a standard component of AI system development is a recommendation that the AI Bill of Rights could promote.

5. Human Considerations and Fallbacks – Right to Opt Out

AI automated systems provide efficient ways for entities to offer services and products. Allowing opt-out and providing human alternatives and fallbacks will result in an additional cost that these entities will bear only if it is justified by an increase in tangible measures of success, such as customer loyalty, increased revenue from certain sectors of society, and so on. To reduce the impact of these additional costs, commercial companies may resort to making it difficult to opt out and communicate with a human instead of the AI automated system. For example, they could allow the waiting times on telephone helplines to be unreasonably long. The AI Bill of Rights should encourage legislation that deters this possible reaction to make sure that this principal right is protected.

Additional Rights to Consider

The Right to be Informed

AI automated systems based on machine learning models learn from existing data. Embedded in this data are certain behaviors that the machine learning models learn and try to replicate when deployed. For example, a customer contact strategy based on a machine learning model fitted to the data of successful past sales will focus on customers who bought the products and services of the organization in the past. That is, it will try to repeat the past. It will not try to reach out to customers who did not buy in the past. Therefore, these customers may not be informed, or be aware at all, of the new products being offered. How will they even know about the product or service when the contact strategy systematically excludes them?

Resolving this issue requires balancing the right of the individual to know about possible opportunities against the right of the commercial entity to maximize profit by marketing only to the most likely buyers.

The right to know becomes more important in areas such as customized online news, where outlets want their consumers to keep watching for longer to maximize advertising revenue. When an online news outlet keeps sending subscribers news alerts on the issues they like to read about or watch because those issues match their point of view, how will these subscribers learn about what’s happening in the rest of the world, or about other points of view, when they are overwhelmed by items that are, yes, of interest to them but leave them no time to learn about anything else? The long-term effect is less tolerance for different views. Again, it is a balance between the responsibility of the news outlet to inform and its right to maximize profits.

The Right to Anonymity

An individual should have the right to access products and services anonymously to the maximum extent that will not harm commercial providers. For example, one should be able to buy anything online anonymously, the same way one can buy the same products in stores using cash. Today, this is virtually impossible.

Specific limitations could be imposed to protect society against certain kinds of anonymous access. For example, buying weapons or drugs (wherever they are legal) online could be excluded from the right to anonymity to guarantee compliance with applicable laws. But one should be able to buy shoes online without being asked to provide all of one’s personal information and without being limited to a credit card that can be tracked forever.

Concluding Thoughts

The AI Bill of Rights is a welcome initiative by the White House and should be praised for starting the discussion on the issues related to automated AI systems. However, it should be augmented to support and extend the scope of existing laws, such as ECOA, should learn and borrow from the EU GDPR, and should extend the scope of individual rights and ownership of data. Most importantly, the translation of the Bill’s principles into laws and regulations should be accelerated, and those laws should implement the principles in practice and clearly define incentives for compliance and penalties for violations. At the current rate of technical development in AI and machine learning, governments all over the world, not only in the US, will have to quickly catch up with laws and guidelines that keep the balance between the interests and freedoms of individuals and communities and the interests of businesses, without impacting their ability to innovate and lead progress. AI automated systems should be a tool that creates new opportunities to better the lives of individuals without sacrificing their freedoms and rights.