Swense Tech

AI Ethics Tempted But Hesitant To Use AI Adversarial Attacks Against The Evils Of Machine Learning, Including For Self-Driving Cars

It is widely accepted sage wisdom to garner as much as you can about your adversaries.

Frederick The Great, the famous king of Prussia and a noted military strategist, stridently said this: “Great advantage is drawn from knowledge of your adversary, and when you know the measure of their intelligence and character, you can use it to play on their weakness.”

Astutely leveraging the awareness of your adversaries is both a vociferous defense and a compelling offense-driven strategy in life. On the one hand, you can be better prepared for whatever your adversary might try to destructively do to you. The other side of that coin is that you are likely able to carry out better attacks against your adversary via the known and suspected weaknesses of any vaunted foe.

Per the historically revered statesman and ingenious inventor Benjamin Franklin, those that are on their guard and appear ready to receive their adversaries are in much less danger of being attacked than those that are unawares, supine, and negligent in preparation.

Why all this talk about adversaries?

Because one of the biggest concerns facing much of today’s AI is that cyber crooks and other evildoers are deviously attacking AI systems using what is commonly referred to as adversarial attacks. This can cause an AI system to falter and fail to perform its designated functions. As you’ll see in a moment, there are a variety of vexing AI Ethics and Ethical AI issues underlying the matter, such as ensuring that AI systems are protected against such scheming adversaries, see my ongoing and extensive coverage of AI Ethics at the link here and the link here, just to name a few.

Perhaps even worse than getting the AI to simply stumble, the adversarial attack can sometimes be used to get AI to perform as the wrongdoer wishes the AI to perform. The attacker can essentially trick the AI into doing the bidding of the malefactor. Whereas some adversarial attacks seek to disrupt or confound the AI, another equally if not more insidious form of deception involves getting the AI to act on the behalf of the attacker.

It is almost as though one might use a mind trick or hypnotic means to get a human to do wrong acts and yet the person is blissfully unaware that they have been fooled into doing something that they should not particularly have done. To clarify, the act that is performed does not necessarily have to be wrong per se or illegal in its merits. For example, conning a bank teller to open the safe or vault for you is not in itself a wrong or illegal act. The bank teller is doing what they legitimately are able to perform as a valid bank-approved task. Of course, if they open the vault and doing so allows a robber to steal the money and all of the gold bullion therein, the bank teller has been tricked into performing an act that they should not have undertaken in the given circumstances.

The use of adversarial attacks against AI has to a great extent arisen because of the way in which much of contemporary AI is devised. You see, this latest era of AI has tended to emphasize the use of Machine Learning (ML) and Deep Learning (DL). These are computational pattern matching techniques and technologies which have dramatically aided the advancement of modern-day AI systems. ML/DL is often used as a key element in many of the AI systems that you interact with daily, such as the use of conversational interactive systems or Natural Language Processing (NLP) akin to Alexa and Siri.

The manner in which ML/DL is designed and fielded provides a fertile opening for the leveraging of adversarial attacks. Cybercrooks generally can guess how the ML/DL was built. They can make reasoned guesses about how the ML/DL will react when put into use. There are only so many ways that ML/DL is usually constructed. As such, the evildoer hackers can try a slew of underhanded ML/DL adversarial tricks to get the AI to either go awry or do their bidding.

In contrast, during the prior era of AI systems, it was somewhat harder to undertake adversarial attacks since much of the AI was more idiosyncratic and written in a more proprietary or individualistic manner. You would have had a more challenging time trying to guess how the AI was constructed and also how it might react when placed into active use. In comparison, ML/DL is largely more predictable as to its susceptibilities (this is not always the case, and please know that I am broadly generalizing).

You might be thinking that if adversarial attacks can be targeted so specifically at ML/DL, then certainly there should be a boatload of cybersecurity measures available to protect against those attacks. One would hope that those devising and releasing their AI applications would ensure that the app can securely fend off those adversarial attacks.

The answer is yes and no.

Yes, there exist numerous cybersecurity protections that can be used by and within ML/DL to guard against adversarial attacks. Unfortunately, the answer is also somewhat a “no” in that many of the AI builders are not especially versed in those protections or are not explicitly including those protections.

There are lots of reasons for this.

One is that some AI software engineers concentrate solely on the AI side and are not particularly caring about the cybersecurity elements. They figure that someone else further along in the chain of making and releasing the AI will deal with any needed cybersecurity protections. Another reason for the lack of protection against adversarial attacks is that it can be a burden of sorts to the AI project. An AI project might be under a tight deadline to get the AI out the door. Adding into the mix a bunch of cybersecurity protections that need to be crafted or set up will potentially delay the production cycle of the AI. Furthermore, the cost of creating AI is bound to go up too.

Note that none of those reasons is a satisfactory excuse for leaving an AI system vulnerable to adversarial attacks. Those in the know would say that the famous line of either pay me now or pay me later comes into play in this instance. You can skirt past the cybersecurity portions to get an AI system into production sooner, but the chances are that it will then suffer an adversarial attack. A cost-benefit analysis and ROI (return on investment) assessment is needed to weigh the upfront cost of protection and its benefits against the cost of repairing and dealing with cybersecurity intrusions further down the pike.

There is no free lunch when it comes to making ML/DL that is well-protected against adversarial attacks.

That being said, you don’t necessarily need to move heaven and earth to be moderately protected against those evildoing tricks. Savvy specialists that are versed in cybersecurity protections can pretty much sit side-by-side with the AI crews and dovetail the security into the AI as it is being devised. There is also the assumption that a well-versed AI builder can readily use AI constructing techniques and technologies that simultaneously aid their AI building and seamlessly encompass adversarial attack protections. To adequately do so, they usually need to know about the nature of adversarial attacks and how to best blunt or mitigate them. This is something only gradually becoming regularly instituted as part of devising AI systems.

A twist of sorts is that more and more people are getting into the arena of developing ML/DL applications. Regrettably, some of those people are not versed in AI per se, and neither are they versed in cybersecurity. The idea overall is that perhaps by making the ability to craft AI systems with ML/DL widely available to all we are aiming to democratize AI. That sounds good, but there are downsides to this popular exhortation, see my analysis and coverage at the link here.

Speaking of twists, I will momentarily get to the biggest twist of them all, namely, I am going to shock you with a recently emerging notion that some find sensible and others believe is reprehensible. I’ll give you a taste of where I am heading on this heated and altogether controversial matter.

Are you ready?

There is a movement toward using adversarial attacks as a means to disrupt or fool AI systems that are being used by wrongdoers.

Let me explain.

So far, I have implied that AI is seemingly always being used in the most innocent and positive of ways and that only miscreants would wish to confound the AI via the use of adversarial attacks. But keep in mind that bad people can readily devise AI and use that AI for doing bad things.

You know how it is, what’s good for the goose is good for the gander.

Criminals and cybercrooks are eagerly wising up to building and using AI ML/DL to carry out untoward acts. When you come in contact with an AI system, you might not have any means of knowing whether it is an AI For Good versus an AI For Bad type of system. Be on the watch! Just because AI is being deployed someplace does not somehow guarantee that the AI was crafted by well-intended builders. The AI could be deliberately devised for foul purposes.

Here then is the million-dollar question.

Should we be okay with using adversarial attacks on purportedly AI For Bad systems?

I’m sure that your first thought is that we ought to indeed be willing to fight fire with fire. If AI For Good systems can be shaken up via adversarial attacks, we can use those same evildoing adversarial attacks to shake up those atrocious AI For Bad systems. We can rightfully turn the attacking capabilities into an act of goodness. Fight evil using the appalling trickery of evil. The net result would seem to be an outcome of good.

Not everyone agrees with that sentiment.

From an AI Ethics perspective, there is a lot of handwringing going on about this meaty topic. Some would argue that by leveraging adversarial attacks, even when the intent is for the good, you are perpetuating the use of adversarial attacks all-told. You are basically saying that it is okay to launch and promulgate adversarial attacks. Shame on you, they exclaim. We ought to be stamping out evil rather than encouraging or expanding upon evil (even if the evil is ostensibly aiming to offset evil and carry out the work of the good).

Those against the use of adversarial attacks would also argue that by keeping adversarial attacks in the game, you are merely stepping into quicksand. More and stronger adversarial attacks will be devised under the guise of attacking the AI For Bad systems. That seems like a tremendously noble pursuit. The problem is that the evildoers will undoubtedly also grab hold of those emboldened and super-duper adversarial attacks and aim them squarely at the AI For Good.

You are blindly promoting the cat and mouse gambit. We might be shooting ourselves in the foot.

A retort to this position is that there are no practical means of stamping out adversarial attacks. No matter whether you want them to exist or not, the evildoers are going to make sure they do persist. In fact, the evildoers are probably going to be making the adversarial attacks more resilient and potent, doing so to overcome whatever cyber protections are put in place to block them. Thus, a proverbial head-in-the-sand approach to dreamily pretending that adversarial attacks will simply slip quietly away into the night is pure nonsense.

You could contend that adversarial attacks against AI are a double-edged sword. AI researchers have noted this quandary, as stated by these authors in a telling article in AI And Ethics journal: “Sadly, AI solutions have already been utilized for various violations and theft, even receiving the name AI or Crime (AIC). This poses a challenge: are cybersecurity experts thus justified to attack malicious AI algorithms, methods and systems as well, to stop them? Would that be fair and ethical? Furthermore, AI and machine learning algorithms are prone to be fooled or misled by the so-called adversarial attacks. However, adversarial attacks could be used by cybersecurity experts to stop the criminals using AI, and tamper with their systems. The paper argues that this kind of attacks could be named Ethical Adversarial Attacks (EAA), and if used fairly, within the regulations and legal frameworks, they would prove to be a valuable aid in the fight against cybercrime” (article by Michał Choraś and Michał Woźniak, “The Double-Edged Sword Of AI: Ethical Adversarial Attacks To Counter Artificial Intelligence For Crime”).

I’d ask you to mull this topic over and render a vote in your mind.

Is it unethical to use AI adversarial attacks against AI For Bad, or can we construe this as an entirely unapologetic Ethical AI practice?

You might be vaguely aware that one of the loudest voices these days in the AI field and even outside the field of AI consists of clamoring for a greater semblance of Ethical AI. Let’s take a look at what it means to refer to AI Ethics and Ethical AI. On top of that, we can set the stage by looking at some examples of adversarial attacks to establish what I mean when I speak of Machine Learning and Deep Learning.

One particular segment or portion of AI Ethics that has been getting a lot of media attention consists of AI that exhibits untoward biases and inequities. You might be aware that when the latest era of AI got underway there was a huge burst of enthusiasm for what some now call AI For Good. Unfortunately, on the heels of that gushing excitement, we began to witness AI For Bad. For example, various AI-based facial recognition systems have been revealed as containing racial biases and gender biases, which I’ve discussed at the link here.

Efforts to fight back against AI For Bad are actively underway. Besides vociferous legal pursuits of reining in the wrongdoing, there is also a substantive push toward embracing AI Ethics to righten the AI vileness. The notion is that we ought to adopt and endorse key Ethical AI principles for the development and fielding of AI, doing so to undercut the AI For Bad while simultaneously heralding and promoting the preferable AI For Good.

On a related notion, I am an advocate of trying to use AI as part of the solution to AI woes, fighting fire with fire in that manner of thinking. We might for example embed Ethical AI components into an AI system that will monitor how the rest of the AI is doing things and thus potentially catch in real-time any discriminatory efforts, see my discussion at the link here. We could also have a separate AI system that acts as a type of AI Ethics monitor. The AI system serves as an overseer to track and detect when another AI is going into the unethical abyss (see my analysis of such capabilities at the link here).

In a moment, I’ll share with you some overarching principles underlying AI Ethics. There are lots of these kinds of lists floating around here and there. You could say that there isn’t as yet a singular list of universal appeal and concurrence. That’s the unfortunate news. The good news is that at least there are readily available AI Ethics lists and they tend to be quite similar. All told, this suggests that by a form of reasoned convergence of sorts we are finding our way toward a general commonality of what AI Ethics consists of.

First, let’s cover briefly some of the overall Ethical AI precepts to illustrate what ought to be a vital consideration for anyone crafting, fielding, or using AI.

For example, as stated by the Vatican in the Rome Call For AI Ethics and as I’ve covered in-depth at the link here, these are their identified six primary AI ethics principles:

  • Transparency: In principle, AI systems must be explainable
  • Inclusion: The needs of all human beings must be taken into consideration so that everyone can benefit, and all individuals can be offered the best possible conditions to express themselves and develop
  • Responsibility: Those who design and deploy the use of AI must proceed with responsibility and transparency
  • Impartiality: Do not create or act according to bias, thus safeguarding fairness and human dignity
  • Reliability: AI systems must be able to work reliably
  • Security and privacy: AI systems must work securely and respect the privacy of users.

As stated by the U.S. Department of Defense (DoD) in their Ethical Principles For The Use Of Artificial Intelligence and as I’ve covered in-depth at the link here, these are their five primary AI ethics principles:

  • Responsible: DoD personnel will exercise appropriate levels of judgment and care while remaining responsible for the development, deployment, and use of AI capabilities.
  • Equitable: The Department will take deliberate steps to minimize unintended bias in AI capabilities.
  • Traceable: The Department’s AI capabilities will be developed and deployed such that relevant personnel possess an appropriate understanding of the technology, development processes, and operational methods applicable to AI capabilities, including transparent and auditable methodologies, data sources, and design procedure and documentation.
  • Reliable: The Department’s AI capabilities will have explicit, well-defined uses, and the safety, security, and effectiveness of such capabilities will be subject to testing and assurance within those defined uses across their entire lifecycles.
  • Governable: The Department will design and engineer AI capabilities to fulfill their intended functions while possessing the ability to detect and avoid unintended consequences, and the ability to disengage or deactivate deployed systems that demonstrate unintended behavior.

I’ve also discussed various collective analyses of AI ethics principles, including having covered a set devised by researchers that examined and condensed the essence of numerous national and international AI ethics tenets in a paper entitled “The Global Landscape Of AI Ethics Guidelines” (published in Nature), and that my coverage explores at the link here, which led to this keystone list:

  • Transparency
  • Justice & Fairness
  • Non-Maleficence
  • Responsibility
  • Privacy
  • Beneficence
  • Freedom & Autonomy
  • Trust
  • Sustainability
  • Dignity
  • Solidarity

As you might directly guess, trying to pin down the specifics underlying these principles can be extremely hard to do. Even more so, the effort to turn those broad principles into something entirely tangible and detailed enough to be used when crafting AI systems is also a tough nut to crack. It is easy to do some overall handwaving about what AI Ethics precepts are and how they should be generally observed, while it is a much more complicated situation when the AI coding has to be the veritable rubber that meets the road.

The AI Ethics principles are to be utilized by AI developers, along with those that manage AI development efforts, and even those that ultimately field and perform upkeep on AI systems. All stakeholders throughout the entire AI life cycle of development and usage are considered within the scope of abiding by the being-established norms of Ethical AI. This is an important highlight since the usual assumption is that “only coders” or those that program the AI are subject to adhering to the AI Ethics notions. In short, it takes a village to devise and field AI, and the entire village has to be versed in and abide by AI Ethics precepts.

Let’s also make sure we are on the same page about the nature of today’s AI.

There isn’t any AI today that is sentient. We don’t have this. We don’t know if sentient AI will be possible. Nobody can aptly predict whether we will attain sentient AI, nor whether sentient AI will somehow miraculously spontaneously arise in a form of computational cognitive supernova (usually referred to as the singularity, see my coverage at the link here).

The type of AI that I am focusing on consists of the non-sentient AI that we have today. If we wanted to wildly speculate about sentient AI, this discussion could go in a radically different direction. A sentient AI would supposedly be of human quality. You would need to consider that the sentient AI is the cognitive equivalent of a human. More so, since some speculate we might have super-intelligent AI, it is conceivable that such AI could end up being smarter than humans (for my exploration of super-intelligent AI as a possibility, see the coverage here).

Let’s keep things more down to earth and consider today’s computational non-sentient AI.

Realize that today’s AI is not able to “think” in any fashion on par with human thinking. When you interact with Alexa or Siri, the conversational capacities might seem akin to human capacities, but the reality is that it is computational and lacks human cognition. The latest era of AI has made extensive use of Machine Learning (ML) and Deep Learning (DL), which leverage computational pattern matching. This has led to AI systems that have the appearance of human-like proclivities. Meanwhile, there isn’t any AI today that has a semblance of common sense, nor any of the cognitive wonderment of robust human thinking.

ML/DL is a form of computational pattern matching. The usual approach is that you assemble data about a decision-making task. You feed the data into the ML/DL computer models. Those models seek to find mathematical patterns. After finding such patterns, if so found, the AI system then will use those patterns when encountering new data. Upon the presentation of new data, the patterns based on the “old” or historical data are applied to render a current decision.
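To make that flow a bit more concrete, here is a minimal sketch in Python of the train-then-apply cycle just described. It is a toy nearest-centroid matcher rather than anything resembling a production ML/DL model, and the data, the labels, and the function names (`train`, `predict`) are all made up purely for illustration.

```python
# Toy illustration of "find patterns in old data, apply them to new data".
# The samples, labels, and function names are invented for this sketch.

def train(samples, labels):
    """Compute one centroid (the learned 'pattern') per label from historical data."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Apply the learned patterns to new data by picking the closest centroid."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], features))

# Historical ("old") data about a hypothetical decision-making task.
historical_samples = [[1.0, 1.2], [0.8, 1.0], [4.0, 4.2], [4.2, 3.9]]
historical_labels = ["decline", "decline", "approve", "approve"]

model = train(historical_samples, historical_labels)

# New data arrives and the old patterns render the current decision.
new_decision = predict(model, [3.8, 4.0])
print(new_decision)
```

Notice that the model has no notion of why the historical decisions were made. It simply sides with whichever past pattern the new data sits closest to.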

I think you can guess where this is heading. If the humans whose decisions are being patterned upon have been incorporating untoward biases, the odds are that the data reflects this in subtle but significant ways. Machine Learning or Deep Learning computational pattern matching will simply try to mathematically mimic the data accordingly. There is no semblance of common sense or other sentient aspects of AI-crafted modeling per se.

Furthermore, the AI developers might not realize what is going on either. The arcane mathematics in the ML/DL might make it difficult to ferret out the now hidden biases. You would rightfully hope and expect that the AI developers would test for the potentially buried biases, though this is trickier than it might seem. A solid chance exists that even with relatively extensive testing there will be biases still embedded within the pattern matching models of the ML/DL.

You could somewhat use the famous or infamous adage of garbage-in garbage-out. The thing is, this is more akin to biases-in that insidiously get infused as biases submerged within the AI. The algorithmic decision-making (ADM) of AI axiomatically becomes laden with inequities.
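A tiny sketch in Python can illustrate the biases-in, biases-out effect. Everything here is hypothetical: the eight historical records, the group names "A" and "B", and the helper functions are assumptions made up for illustration, and real ML/DL models absorb bias in far less obvious ways.

```python
from collections import defaultdict

# Hypothetical historical decisions: (group, qualified, approved).
# Equally qualified applicants in group "B" were approved less often.
history = [
    ("A", True, True), ("A", True, True), ("A", True, True), ("A", True, False),
    ("B", True, True), ("B", True, False), ("B", True, False), ("B", True, False),
]

def fit_approval_rates(records):
    """'Learn' the approval rate per group, exactly as the data presents it."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for group, _qualified, was_approved in records:
        total[group] += 1
        approved[group] += int(was_approved)
    return {group: approved[group] / total[group] for group in total}

def predict_approval(rates, group, threshold=0.5):
    """Approve when the learned rate for the applicant's group meets the threshold."""
    return rates[group] >= threshold

rates = fit_approval_rates(history)
print(rates)                         # per-group approval rates mimicked from the data
print(predict_approval(rates, "A"))  # decision for a qualified group "A" applicant
print(predict_approval(rates, "B"))  # decision for an equally qualified group "B" applicant
```

The pattern matcher never looks at qualifications at all. It mathematically mimics the historical approvals, so the inequity in the old decisions resurfaces in the new ones.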

Not good.

I trust that you can readily see how adversarial attacks fit into these AI Ethics matters. Evildoers are undoubtedly going to use adversarial attacks against ML/DL and other AI that is supposed to be doing AI For Good. Meanwhile, those evildoers are indubitably going to be devising AI For Bad that they foist upon us all. To try and fight against those AI For Bad systems, we could arm ourselves with adversarial attacks. The question is whether we are doing more good or more harm by leveraging and continuing the advent of adversarial attacks.

Time will tell.

One vexing issue is that there is a myriad of adversarial attacks that can be used against AI ML/DL. You might say there are more than you can shake a stick at. Trying to devise protective cybersecurity measures to negate all of the various possible attacks is somewhat problematic. Just when you might think you’ve done a great job of dealing with one type of adversarial attack, your AI might get blindsided by a different variant. A determined evildoer is likely to toss all manner of adversarial attacks at your AI and be hoping that at least one or more sticks. Of course, if we are using adversarial attacks against AI For Bad, we too would take the same advantageous scattergun approach.

Some of the most popular types of adversarial attacks include:

  • Adversarial falsifications such as false-positive attacks and false-negative attacks
  • Adversarial black-box attacks that are done without knowing what is inside the ML/DL
  • Adversarial white-box attacks that are done when the internals of the ML/DL are known
  • Adversarial run-time attacks that occur once the ML/DL is placed into active use
  • Adversarial training-mode attacks that happen while the ML/DL is being trained
  • Adversarial one-time attacks that are used on a one-and-done basis against ML/DL
  • Adversarial iterative incremental attacks that dig away stepwise at ML/DL
  • Etc.
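To give a flavor of how two of these categories differ, the following Python sketch contrasts a one-time attack with an iterative incremental attack against a toy linear scorer. It is a white-box setup, since the attacker is assumed to know the weights. The signed-step perturbation is a common textbook-style technique that I am swapping in for illustration, and the weights, bias, input, and step sizes are all hypothetical.

```python
def score(weights, bias, x):
    """Toy linear model: a score above zero means the 'positive' class."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def sign(v):
    return (v > 0) - (v < 0)

def one_step_attack(weights, x, epsilon):
    """One-and-done: a single signed step that pushes the score downward.
    For a linear model the gradient of the score w.r.t. x is just `weights`."""
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]

def iterative_attack(weights, bias, x, step, max_steps):
    """Iterative: dig away in small steps, stopping once the decision flips."""
    current = list(x)
    steps_taken = 0
    while score(weights, bias, current) > 0 and steps_taken < max_steps:
        current = one_step_attack(weights, current, step)
        steps_taken += 1
    return current, steps_taken

# Hypothetical model internals known to the attacker (white-box assumption).
weights, bias = [2.0, -1.0], -0.5
x = [1.0, 0.5]  # original input, scored as positive

one_shot = one_step_attack(weights, x, 0.4)
stepwise, steps_used = iterative_attack(weights, bias, x, 0.1, 10)

print(score(weights, bias, x) > 0)         # original decision
print(score(weights, bias, one_shot) > 0)  # decision after the single large step
print(steps_used)                          # small steps needed to flip the decision
```

The one-time variant gambles on a single larger perturbation, whereas the iterative variant trades extra queries for smaller and harder-to-notice changes per step.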

At this juncture of this weighty discussion, I’d bet that you are desirous of some illustrative examples that might showcase the nature and scope of adversarial attacks against AI and particularly aimed at Machine Learning and Deep Learning. There is a special and assuredly popular set of examples that are close to my heart. You see, in my capacity as an expert on AI including the ethical and legal ramifications, I am frequently asked to identify realistic examples that showcase AI Ethics dilemmas so that the somewhat theoretical nature of the topic can be more readily grasped. One of the most evocative areas that vividly presents this ethical AI quandary is the advent of AI-based true self-driving cars. This will serve as a handy use case or exemplar for ample discussion on the topic.

Here’s then a noteworthy question that is worth contemplating: Does the advent of AI-based true self-driving cars illuminate anything about the nature of adversarial attacks against AI, and if so, what does this showcase?

Allow me a moment to unpack the question.

First, note that there isn’t a human driver involved in a true self-driving car. Keep in mind that true self-driving cars are driven via an AI driving system. There isn’t a need for a human driver at the wheel, nor is there a provision for a human to drive the vehicle. For my extensive and ongoing coverage of Autonomous Vehicles (AVs) and especially self-driving cars, see the link here.

I’d like to further clarify what is meant when I refer to true self-driving cars.

Understanding The Levels Of Self-Driving Cars

As a clarification, true self-driving cars are ones where the AI drives the car entirely on its own and there isn’t any human assistance during the driving task.

These driverless vehicles are considered Level 4 and Level 5 (see my explanation at this link here), while a car that requires a human driver to co-share the driving effort is usually considered at Level 2 or Level 3. The cars that co-share the driving task are described as being semi-autonomous, and typically contain a variety of automated add-ons that are referred to as ADAS (Advanced Driver-Assistance Systems).

There is not yet a true self-driving car at Level 5, and we don’t yet even know if this will be possible to achieve, nor how long it will take to get there.

Meanwhile, the Level 4 efforts are gradually trying to get some traction by undergoing very narrow and selective public roadway trials, though there is controversy over whether this testing should be allowed per se (we are all life-or-death guinea pigs in an experiment taking place on our highways and byways, some contend, see my coverage at this link here).

Since semi-autonomous cars require a human driver, the adoption of those types of cars won’t be markedly different than driving conventional vehicles, so there’s not much new per se to cover about them on this topic (though, as you’ll see in a moment, the points next made are generally applicable).

For semi-autonomous cars, it is important that the public be forewarned about a disturbing aspect that’s been arising lately, namely that despite those human drivers that keep posting videos of themselves falling asleep at the wheel of a Level 2 or Level 3 car, we all need to avoid being misled into believing that the driver can take away their attention from the driving task while driving a semi-autonomous car.

You are the responsible party for the driving actions of the vehicle, regardless of how much automation might be tossed into a Level 2 or Level 3.

Self-Driving Cars And Adversarial Attacks Against AI

For Level 4 and Level 5 true self-driving vehicles, there won’t be a human driver involved in the driving task.

All occupants will be passengers.

The AI is doing the driving.

One aspect to immediately discuss entails the fact that the AI involved in today’s AI driving systems is not sentient. In other words, the AI is altogether a collective of computer-based programming and algorithms, and most assuredly not able to reason in the same manner that humans can.

Why is this added emphasis about the AI not being sentient?

Because I want to underscore that when discussing the role of the AI driving system, I am not ascribing human qualities to the AI. Please be aware that there is an ongoing and dangerous tendency these days to anthropomorphize AI. In essence, people are assigning human-like sentience to today’s AI, despite the undeniable and inarguable fact that no such AI exists as yet.

With that clarification, you can envision that the AI driving system won’t natively somehow “know” about the facets of driving. Driving and all that it entails will need to be programmed as part of the hardware and software of the self-driving car.

Let’s dive into the myriad of aspects that come to play on this topic.

First, it is important to realize that not all AI self-driving cars are the same. Each automaker and self-driving tech firm is taking its own approach to devising self-driving cars. As such, it is difficult to make sweeping statements about what AI driving systems will do or not do.

Furthermore, whenever stating that an AI driving system doesn’t do some particular thing, this can, later on, be overtaken by developers that in fact program the computer to do that very thing. Step by step, AI driving systems are being gradually improved and extended. An existing limitation today might no longer exist in a future iteration or version of the system.

I hope that provides a sufficient litany of caveats to underlie what I am about to relate.

As earlier mentioned, some of the most popular types of adversarial attacks include:

  • Adversarial falsifications such as false-positive attacks and false-negative attacks
  • Adversarial black-box attacks that are done without knowing what is inside the ML/DL
  • Adversarial white-box attacks that are done when the internals of the ML/DL are known
  • Adversarial run-time attacks that occur once the ML/DL is placed into active use
  • Adversarial training-mode attacks that happen while the ML/DL is being trained
  • Adversarial one-time attacks that are used on a one-and-done basis against ML/DL
  • Adversarial iterative incremental attacks that dig away stepwise at ML/DL
  • Etc.

We can showcase the nature of each such adversarial attack and do so in the context of AI-based self-driving cars.

Adversarial Falsification Attacks

Consider the use of adversarial falsifications.

There are generally two such types: (1) false-positive attacks, and (2) false-negative attacks. In the false-positive attack, the emphasis is on presenting to the AI a so-called negative sample that is then incorrectly classified by the ML/DL as a positive one. The jargon for this is that it is a Type I error (this is reminiscent perhaps of your days of taking a statistics class in college). In contrast, the false-negative attack entails presenting a positive sample that the ML/DL incorrectly classifies as a negative instance, known as a Type II error.
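For those that prefer to see the bookkeeping, here is a small Python sketch that tallies the two error types from a set of classifications. The sample truth values and predictions are invented for illustration, and the function name `confusion_counts` is merely my own label.

```python
def confusion_counts(actual, predicted):
    """Tally outcomes; a false positive is a Type I error, a false negative a Type II error."""
    counts = {"true_positive": 0, "false_positive": 0,
              "true_negative": 0, "false_negative": 0}
    for truth, guess in zip(actual, predicted):
        if guess and truth:
            counts["true_positive"] += 1
        elif guess and not truth:
            counts["false_positive"] += 1   # Type I: negative sample called positive
        elif not guess and truth:
            counts["false_negative"] += 1   # Type II: positive sample called negative
        else:
            counts["true_negative"] += 1
    return counts

# Hypothetical ground truth ("is this really a positive sample?") and model output.
actual    = [True, True,  False, False, False]
predicted = [True, False, True,  False, False]

counts = confusion_counts(actual, predicted)
print(counts)
```

A false-positive attack aims to inflate the Type I tally, while a false-negative attack aims to inflate the Type II tally.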

Suppose that we had trained an AI driving system to detect Stop signs. We used an ML/DL that we had trained beforehand with thousands of images that contained Stop signs. The idea is that we would be using video cameras on the self-driving car to collect video and images of the roadway scene surrounding the autonomous vehicle during a driving journey. As the digital imagery streams in real time into an onboard computer, the ML/DL scans the digital data to detect any indication of a nearby Stop sign. The detection of a Stop sign is obviously crucial for the AI driving system. If a Stop sign is detected by the ML/DL, this is conveyed to the AI driving system and the AI would need to ascertain a suitable means to use the driving controls to bring the self-driving car to a proper and safe stop.

Humans seem to readily be able to detect Stop signs, at least most of the time. Our human perception of such signs is keenly honed by our seemingly innate cognitive pattern matching capacities. All we need to do is learn what a Stop sign looks like and we take things from there. A toddler learns soon enough that a Stop sign is typically red in color, contains the word “STOP” in large letters, has a distinctive octagonal shape, usually is posted adjacent to the roadway and resides at a person’s height, and so on.

Imagine an evildoer that wants to make trouble for self-driving cars.

In a false-positive adversarial attack, the wrongdoer would try to trick the ML/DL into computationally calculating that a Stop sign exists even when there isn’t a Stop sign present. Maybe the wrongdoer puts up a red sign along a roadway that looks generally similar to a Stop sign but lacks the word “STOP” on it. A human would likely realize that this is merely a red sign and not a driving directive. The ML/DL might though calculate that the sign sufficiently resembles a Stop sign such that the AI ought to treat the sign as in fact a Stop sign.

You might be tempted to think that this is not much of an adversarial attack and that it seems rather innocuous. Well, suppose that you are driving in a car and meanwhile a self-driving car that is ahead of you suddenly and seemingly without any basis for doing so comes to an abrupt stop (due to having misconstrued a red sign near the roadway as being a Stop sign). You might ram into that self-driving car. It could be that the AI was fooled into computationally calculating that a non-stop sign was a Stop sign, thus committing a false-positive error. You get injured, the passengers in the self-driving car get injured, and perhaps even pedestrians get injured by this dreadful false-positive adversarial attack.

A false-negative adversarial attack is somewhat akin to this preceding depiction though based on tricking the ML/DL into misclassifying in the other direction, as it were. Imagine that a Stop sign is sitting next to the roadway and for all usual visual reasons seems to be a Stop sign. Humans accept that this is indeed a valid Stop sign.

An evildoer covers up part of the word “STOP” with some tape. Humans still realize this is a Stop sign. The ML/DL might be computationally calculating that since the word “STOP” does not seem entirely present on the sign, the sign is not ranked as a Stop sign. This is conveyed to the AI driving system. Sure enough, the AI driving system allows the self-driving car to plow ahead and completely fails to come to a stop at the Stop sign.

I hope that you can envision how dangerous this false-negative attack can be. A human-driven car coming from a cross-street might be under the assumption that the self-driving car was going to come to a proper legal stop at the Stop sign. Sadly, the two cars crash while in the middle of the intersection. People are injured or possibly killed. The evildoer has wreaked their nefarious havoc.

That briefly explains the nature of false-positive and false-negative attacks. There can be a lot more trickery involved. I mentioned the simplest case as a means of showcasing what those types of adversarial attacks consist of. For the moment, we will continue to explore how adversarial attacks work. Once I’ve covered those fundamentals, we will revisit the question of whether it is proper or improper to use these adversarial attacks when wishing to confront AI For Bad.

Adversarial Black-Box And White-Box Attacks

Recall that I opened this column by pointing out that it is wise to know as much about your adversary as you possibly can. With adversarial attacks, sometimes you might know quite a bit about how the ML/DL internally functions, while in other instances you might have little if any clues about how a particular ML/DL was devised.

In a black-box adversarial attack, the evildoer has little if any understanding of how the particular ML/DL being targeted has been set up. The cybercrook has to make some form of reasoned guess. This guess might be completely on-target. Luck favored their devious deed. Other times the guess might be wildly off-target. Luck was not on their evil side.

In a white-box adversarial attack, the evildoer has a detailed understanding of how the particular ML/DL being targeted has been set up. This makes life a lot easier for the evildoer. They can devise an adversarial attack that they pretty much know or believe strongly will undercut the ML/DL. If you’ve ever wondered why sometimes an AI builder might try to hide the details of how their ML/DL has been set up, this is a reason for that secrecy (there are other valid reasons too).

Here’s an example of how the black-box and white-box adversarial attack approaches come into play.

Returning to the Stop sign scenario, an evildoer is having to decide whether to try and use a false-positive or a false-negative attack against a self-driving car. If the cybercrook knows the internals of the ML/DL involved (“white-box”), they can figure out which of the two means is the better choice. Perhaps the ML/DL is well-fortified against the false-positive but unduly weak against the false-negative. Voila, the attacker would likely choose to use the false-negative for their insidious trickery.

That being said, sometimes an evildoer might not have any ready way of discerning how the ML/DL has been established (“black-box”). In that case, the cybercrook might have to make a guess at which of various adversarial attacks will be most successful. The guess could be right or the guess could be wrong. The odds are that when facing a black-box situation, the evildoer will resort to using a scattergun approach in hopes that something will do the trick for them.
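The contrast between the two approaches can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: the “detector” is a made-up linear scorer over four hypothetical image features (real detectors are deep networks), and the weights, the eps perturbation budget, and the function names are all my own inventions. The white-box attacker reads the weights and nudges each feature in the most damaging direction, whereas the black-box attacker can only ask yes/no questions and tries random bounded nudges, which is the scattergun approach.

```python
import random

# Hypothetical toy "Stop sign detector": a linear scorer over 4 features.
# These weights are invented for illustration only.
WEIGHTS = [0.9, -0.4, 0.7, 0.2]
BIAS = -0.5


def score(x):
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def is_stop_sign(x):
    """The only function a black-box attacker is allowed to call."""
    return score(x) > 0


def sign(v):
    return (v > 0) - (v < 0)


def white_box_attack(x, eps):
    """Attacker knows WEIGHTS: move each feature against its weight's sign."""
    return [xi - eps * sign(w) for xi, w in zip(x, WEIGHTS)]


def black_box_attack(x, eps, max_queries, rng):
    """Attacker sees only yes/no answers: try random bounded perturbations."""
    for queries in range(1, max_queries + 1):
        candidate = [xi + rng.uniform(-eps, eps) for xi in x]
        if not is_stop_sign(candidate):
            return candidate, queries
    return None, max_queries


genuine = [1.0, 0.5, 0.5, 0.5]  # features of a real Stop sign (made up)
print("Detected before attack:", is_stop_sign(genuine))
print("Detected after white-box step:",
      is_stop_sign(white_box_attack(genuine, eps=0.4)))

found, used = black_box_attack(genuine, eps=0.4, max_queries=1000,
                               rng=random.Random())
print("Black-box succeeded:", found is not None, "after", used, "queries")
```

The white-box attacker succeeds in a single calculated step here, while the black-box attacker may need many queries or may fail within its query budget, depending on luck.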

Adversarial Run-Time And Training-Time Attacks

The examples so far have assumed that the self-driving car is being attacked while it is underway on the roadway. This can be characterized as doing an adversarial attack during the run-time of the AI ML/DL system.

An evildoer might though try to get to the AI before the ML/DL is placed into active use, doing so during the time that the ML/DL is being trained. Remember that the ML/DL presumably had to be “trained” to do whatever tasks are being undertaken. It is conceivable that a determined wrongdoer could get access to the ML/DL when it is still being set up. If so, the evildoer can then potentially implant or shape the ML/DL to contain a particular vulnerability.

Take the example of the Stop sign. Imagine that the evildoer was able to access the ML/DL when the Stop sign detection training was being accomplished. Perhaps specially fabricated data is fed into the ML/DL. Or the ML/DL parameters are tweaked by the wrongdoer. All in all, the idea is that this might undercut how the ML/DL is going to work once it is placed into active use.
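Here is a minimal Python sketch of how specially fabricated training data could shift what the ML/DL learns. Everything in it is an assumption for illustration: the single “sign-likeness” feature, the tiny data sets, and the simple nearest-centroid classifier standing in for a real ML/DL model.

```python
def train_threshold(samples):
    """Fit a 1-D nearest-centroid classifier and return its threshold.

    samples is a list of (feature, is_stop_sign) pairs. The threshold is
    the midpoint between the mean feature of each class.
    """
    pos = [f for f, label in samples if label]
    neg = [f for f, label in samples if not label]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(threshold, feature):
    """Return True if the feature is classified as a Stop sign."""
    return feature > threshold


# Made-up clean training data: (sign-likeness feature, is it a Stop sign?)
clean_data = [(0.8, True), (0.9, True), (0.85, True), (0.95, True),
              (0.1, False), (0.2, False), (0.15, False), (0.25, False)]

# Fabricated samples slipped in by the attacker: Stop-sign-like features
# deliberately mislabeled as "not a Stop sign".
poison = [(0.9, False), (0.95, False), (0.9, False), (0.95, False)]

clean_threshold = train_threshold(clean_data)
poisoned_threshold = train_threshold(clean_data + poison)

worn_sign = 0.7  # a real but slightly faded Stop sign (hypothetical value)
print("Clean model detects it:", predict(clean_threshold, worn_sign))
print("Poisoned model detects it:", predict(poisoned_threshold, worn_sign))
```

The mislabeled samples drag the decision threshold upward, so the poisoned model misses a sign that the clean model would have caught, which is a false-negative weakness planted ahead of time.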

An evildoer that has done their dirty work beforehand would potentially know for sure that the AI ML/DL is vulnerable to, for example, a false-positive attack, or likewise vulnerable to a false-negative attack. Whereas in the usual white-box approach the baddies might simply inspect the ML/DL to see if they can uncover any weaknesses, this effort to seed weaknesses is a decidedly more aggressive evildoing avenue.

Complicating and worsening such matters is the possibility that the ML/DL also trains during run-time. You might want the ML/DL to adjust by itself while it is in active use. In that case, the evildoer can potentially do nefarious training infusion acts, even while the ML/DL is in the midst of being used. For more on these kinds of AI ML/DL cybersecurity aspects, see my coverage at the link here.

Adversarial Attack Of One-Time Or Iterative Nature

Another perspective on adversarial attacks against AI consists of doing them on a one-time basis versus doing so iteratively on an incremental basis.

It could be that an evildoer repeatedly tries the Stop sign trickery and hopes that eventually, a self-driving car falls for the appalling act. Or the wrongdoer might decide that the attack should be tried just once. An issue for an attacker is that if they repeatedly do an attack, the AI being attacked might detect that such attacks are being launched and then alert someone or invoke a built-in counterattack.
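The detection risk for a repeat attacker can be sketched as a simple sliding-window monitor. This is only one hypothetical way such detection might work. The class name AttackMonitor, the window and limit settings, and the notion of a per-frame “suspicious” flag are all my assumptions rather than features of any real AI driving system.

```python
from collections import deque


class AttackMonitor:
    """Flags a burst of suspicious detections inside a sliding window.

    window and limit are hypothetical tuning knobs chosen for illustration.
    """

    def __init__(self, window=10, limit=3):
        self.recent = deque(maxlen=window)  # keeps only the latest events
        self.limit = limit

    def record(self, suspicious):
        """Log one observation; return True if an alert should be raised."""
        self.recent.append(bool(suspicious))
        return sum(self.recent) >= self.limit


# A one-and-done attack: a single suspicious frame among normal ones.
one_time = [False] * 5 + [True] + [False] * 5
monitor = AttackMonitor()
print("One-time attack alerts:", any(monitor.record(e) for e in one_time))

# An iterative attack: the trick is retried every other frame.
iterative = [True, False, True, False, True]
monitor = AttackMonitor()
print("Iterative attack alerts:", any(monitor.record(e) for e in iterative))
```

Under these assumed settings, the single attempt slips beneath the alert limit while the repeated attempts trip it on the third suspicious frame.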


Now that I’ve covered the overall nature of adversarial attacks against AI, we should revisit the latest AI Ethics twist that consists of debating whether or not those kinds of nefarious attacks are okay to be used by those seeking to take down AI For Bad.

Yes, you might say, of course, it is perfectly fine to do so. Use whatever means we can to curtail AI For Bad. The ends justify the means.

No, you might reply, it isn’t suitable to employ adversarial attacks, even when used to deal with AI For Bad. We must resist getting into an arms race of adversarial attacks. The more that we entertain the use of them, the worse things will get. Put aside this Pandora’s box and avoid the temptation that it entails. Also, keep in mind that we are potentially shooting ourselves in the foot, including that we might inadvertently use adversarial attacks against AI For Good that we falsely thought was AI For Bad.

While you mull over these hefty matters, let’s end the discussion for now with some insightful quotes.

Evagrius Ponticus, a famous monk and highly influential theologian, said this about adversaries: “The further the soul advances, the greater are the adversaries against which it must contend.” And, as further contemplative milieu, Tennessee Williams, the revered American playwright, noted this about adversaries: “But I think the spirit of man is a good adversary.”

You be the judge and jury when it comes to deciding whether or not we are to proceed to embrace adversarial attacks as a full-fledged member of the AI Ethics club, or whether such methods should be relegated to the junk heap as something reviled and never to be used (even when employed for the pursuit of goodness and aiming to virtuously root out Unethical AI).

Welcome to the world of the vexing ethical conundrum about AI adversarial attacks.