What Does It Mean to Certify an AI Product as Safe?

By James Kobielus

Artificial intelligence (AI) scares many people. Personally, I think those worries are seriously overblown. But we should respect the general public’s need for meaningful assurances that the next AI-infused product they acquire won’t unexpectedly transform into an evil overlord or insidious snoop.

Electricity frightened a lot of people when it first entered their lives in the late 19th century. Responding to those concerns, most parts of the civilized world instituted regulations over electric utilities. At the same time, the private sector spawned electrical testing and certification groups such as Underwriters Laboratories (UL).

Thanks to safeguards such as these, we needn’t worry about having 10 zillion volts shoot through our bodies the next time we plug in a toaster. Electricity is a natural phenomenon that can be detected, controlled, and neutralized before it does harm. But how in the world can any testing lab certify that some capability as versatile as AI doesn’t stray beyond its appointed function and, perhaps, begin to sell our children to the highest bidder on eBay?

I pondered this while reading a recent Datanami article discussing UL’s desire to include AI in the scope of its product testing. At first, it seemed like a bit of strategic overreach on UL’s part, considering that it has traditionally limited itself to testing discrete physical products in controlled laboratory settings. Let’s remember that much of what we regard as AI is in fact a distributed ecosystem of Cloud server farms, Big Data repositories, and statistical processing tools. And, unlike electrical appliances and gas stoves, AI, even when it’s embedded in consumer products, rarely poses a direct physical threat to human beings when it malfunctions.

But product testing and certification must keep pace with technological innovation. Given that we’re now living in the digital age, it’s not inconceivable that consumers might someday rely on this or other trusted organizations to certify that some AI-powered gadget can’t accidentally (or deliberately, when tampered with by malicious people) disable our home fire alarms and carbon monoxide detectors. To its credit, UL has progressively expanded the range of consumer-product safety issues it addresses beyond electrical and fire hazards. It now also tests for water and food safety, environmental sustainability, and hazardous substances in a wide range of products.

When it comes to AI safety, labs will need to test for behavioral risks in how a product performs, in addition to continuing to vet any physical defects, weaknesses, or limitations that could harm consumers. With that in mind, I’d like to propose that UL and equivalent organizations around the world institute safety testing for AI-equipped products that addresses the following key concerns:

  • AI Rogue Agency: AI must always be under the control of the user or a designated third party. For example, the smart speakers mentioned here should have been trained to refrain from placing orders for what they mistakenly interpreted as voice-activated purchase requests, but which in fact came from a small child without parental authorization. Though this could have been handled through multifactor authentication rather than through algorithmic training, it’s clear that voice-activated AI-enabled products may, in many environmental scenarios, need to step through complex algorithms when deciding which multifactor methods to use for strong authentication and delegated permissioning. Laboratory testing should be able to certify that AI products don’t have the potential to “go rogue.” In this regard, there should be testing to measure the extent to which the user can rescind AI-driven decisioning agency in circumstances where the uncertainty is too great to justify autonomous actions.
  • AI Instability: AI’s foundation in Machine Learning (ML) means that much of its operation will be probabilistic and statistical in nature, rather than governed by fixed, repeatable rules. Though that data-driven intelligence is the foundation of what makes AI useful, it introduces an underlying uncertainty into how the technology will behave under operating circumstances outside the range for which its core ML models were built and trained. Independent testing should assess the extent to which AI-driven products behave in consistent, predictable patterns, free from unintended side effects, even when they are required to dynamically adapt to changing circumstances. Testers should also be able to certify that the AI fails gracefully, rather than catastrophically, when environmental data departs significantly from the circumstances for which it was trained. And they should verify that an AI-based product incorporates failsafe mechanisms that allow users to take back control when automated AI applications reach the limits of their competency.
  • AI Sensor Blindspots: When AI is incorporated into robots, drones, self-driving vehicles, and other sensor-equipped devices, there should be some indication to the consumer about the visuals, sounds, smells, and other sensory inputs they’re unable to detect in realistic operating environments. The “blindspots” may have to do with limitations in the sensors, in specific deployment configurations, in the AI algorithms that process sensor data, or in the training that was used to ready sensor-equipped devices for use. For example, this security robot would have been better able to learn the range of locomotion challenges in the indoor and outdoor environments where it was designed to patrol if its reinforcement-learning training had been up to snuff. Equipping the robot with a built-in video camera and thermal imaging simply wasn’t enough to keep it from rolling over into a public fountain. Independent testing could have uncovered this risk, as well as any consequent risks from faulty collision avoidance and defensive maneuvering algorithms.
  • AI Privacy Vulnerabilities: Considering that many AI-driven products, such as Alexa, are in the consumer end of the Internet of Things, there must be safeguards to prevent them from inadvertently invading people’s privacy, or from exposing people to surveillance hack attacks by external parties. For example, the risks to a child using this cute little AI-powered toy robot are partly in its physical construction, but also partly in what it may do in the child’s grasp, toybox, or home. One would want it tested to make sure that unauthorized third parties, such as potential kidnappers, can’t use it to track or eavesdrop on a child or anybody in their family. Likewise, it would be important for a parent to know that a mischievous party can’t hack it to voice obscenities or other objectionable language.
  • AI Adversarial Exposure: Vulnerabilities in your Deep Neural Networks can expose your company to considerable risk if third parties discover and exploit them before you realize they exist or have implemented defenses. The potential for adversarial attacks against Deep Neural Networks, such as those behind computer vision, speech recognition, and Natural Language Processing, is an increasing cause for concern within the Data Science profession. The research literature is full of documented instances where Deep Neural Networks have been fooled by adversarial attacks. Testing laboratories should be able to certify that AI-infused products are able to withstand some of the most likely sources of adversarial attacks.
  • AI Algorithmic Inscrutability: Many safety issues with AI may stem from the “blackbox” complexity of its algorithms. If you’ve placed total faith in, say, an AI-driven healthcare app’s recommendations, it would be important to know how the program reached its conclusions and whether it incorporated all relevant variables pertaining to your personal situation. But even if the relevant input variables are exhaustively listed, that runs up against the fact that many of the most predictive machine learning models generate results based on high-order weighted combinations of these variables, which can be exceptionally hard to boil down to human-comprehensible explanations. Independent testing of an AI product should call out the risks a consumer faces when using products that embed such algorithms. And there should be disclaimers on AI-driven products that are not ideally transparent, explicable, and interpretable to average humans.
  • AI Liability Obscurity: Just as every ingredient in the food chain may be traceable back to a source, the provenance of every AI component of a product should be transparent. Consumer confidence in AI-infused products rests on knowing that there’s always a clear indication of human accountability, responsibility, and liability for their algorithmic outcomes. In fact, this will almost certainly become a legal requirement in most industrialized countries, so testing labs should start certifying that products ensure transparency of accountability.
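
To make the “fails gracefully” and “take back control” tests a bit more concrete, here is a minimal Python sketch of the sort of guard a lab might probe. Everything in it is hypothetical: the names Decision and decide, the 0.9 confidence threshold, and the idea of summarizing a model’s training conditions as a single numeric range are my own assumptions for illustration, not anything UL or any vendor has specified.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    action: str  # either "act" or "defer_to_user"
    reason: str  # human-readable explanation of the choice


def decide(confidence: float, reading: float,
           trained_min: float, trained_max: float,
           min_confidence: float = 0.9) -> Decision:
    """Hypothetical guard: act autonomously only when the sensor reading
    lies inside the range seen in training AND the model is confident."""
    # Inputs outside the trained range are handed back to the user.
    if not (trained_min <= reading <= trained_max):
        return Decision("defer_to_user", "input outside trained range")
    # Low confidence is also treated as a reason to give up autonomy.
    if confidence < min_confidence:
        return Decision("defer_to_user", "confidence below threshold")
    return Decision("act", "within trained range and confident")


# A tester could sweep inputs and check that the product never acts
# on its own outside the conditions it was trained for.
for conf, reading in [(0.95, 20.0), (0.95, 55.0), (0.50, 20.0)]:
    print(conf, reading, decide(conf, reading, trained_min=0.0, trained_max=40.0))
```

A real product would need a far richer notion of “outside the trained range” than one number, but the testable property is the same: when the inputs or the model’s confidence stray too far, the product defers to a human rather than pressing on.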

What I’ve just discussed are AI behavioral features that, if they’re embedded into products, can be tested in controlled laboratory environments. What’s not directly ascertainable under such circumstances are the following considerations, many of which also fall outside the scope of “safety” into a broader range of equally valid societal concerns:

  • Socioeconomic biases that may be baked into AI-driven products;
  • Ethical and cultural faux pas that AI-driven products may commit and that might offend some segments of the population;
  • Choreography breakdowns that may take place in the complex interactions among AI-driven automated agents and human agents, in every conceivable combination and under every future real-world scenario; and
  • Deficiencies in the predictive power of AI-driven products at any point in the indefinite future due to failures to ensure high-quality training, monitoring, and governance over models, data, and other algorithmic artifacts.

Clearly, there’s a limit to what we can test about an AI-equipped product in a lab. Just as society can’t certify that the electrical grid won’t black out under unusual circumstances, we can’t always assure ourselves that the AI we’re putting into everything is 100 percent safe.
