AI systems can have embedded biases, including in AI self-driving cars.
The news has been replete with stories about AI systems that regrettably exhibit adverse biases, including racial bias, gender bias, age discrimination, and other lamentable prejudices.
How is this happening?
Initially, some pointed fingers at the AI developers who craft AI systems.
It was thought that their personal biases were being carried over into the programming and the AI code being formulated. As such, a call for greater diversity in the AI software development field was launched, and efforts to achieve such aims are underway.
It turns out, though, that the perspectives of the AI programmers are not necessarily the dominant factor, and many began to realize that the algorithms being utilized were a significant element.
There is yet another twist.
Many of the AI algorithms used for Machine Learning (ML) and Deep Learning (DL) are essentially doing pattern matching, and thus if the data being used to train or prepare an AI system contains numerous examples with inherent biases in them, there’s a solid chance those will be carried over into the AI system and how it ultimately performs.
In that sense, it’s not that the algorithms are intentionally generating biases (they are not sentient); rather, the relatively rudimentary pattern matching subtly picks up mathematically “hidden” biases from the data being fed into the development of the AI system.
Imagine a computer system that had no knowledge of the world, and you repeatedly showed it a series of pictures of people standing and looking at the camera. Pretend that the pictures were labeled as to what kind of occupations the people held.
We’ll use the pictures as the data that will be fed into the ML/DL.
The algorithm that’s doing pattern matching might computationally begin to calculate that if someone is tall then they are a basketball player.
Of course, being tall doesn’t always mean that a person is a basketball player and thus already the pattern matching is creating potential issues as to what it will do when presented with new pictures and asked to classify what the person does for a living.
Realize too that there are two sides to that coin.
A new picture of a tall person gets a suggested classification of being a basketball player. In addition, a new picture of a person that is not tall will be unlikely to get that classification (thus, the approach errs in both directions: it wrongly includes some people and wrongly excludes others).
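To make this concrete, here is a minimal, hypothetical sketch of a single-feature "pattern matcher" fit on synthetic labeled pictures (each picture reduced to just a height value). The data, the threshold, and the occupations are all invented for illustration; no real ML/DL pipeline is this simple.

```python
# Each synthetic record: (height_cm, labeled_occupation)
training_data = [
    (201, "basketball player"), (198, "basketball player"),
    (205, "basketball player"), (172, "teacher"),
    (168, "teacher"), (180, "teacher"), (175, "teacher"),
]

TALL_THRESHOLD = 190  # the cutoff the matcher effectively "lands on"

def classify(height_cm):
    # The learned pattern: tall => basketball player.
    return "basketball player" if height_cm >= TALL_THRESHOLD else "teacher"

# The rule looks perfect on the training data...
correct = sum(1 for height, occ in training_data if classify(height) == occ)
print(f"training accuracy: {correct}/{len(training_data)}")  # 7/7

# ...yet errs in both directions on new pictures:
print(classify(200))  # a tall librarian is wrongly labeled a player
print(classify(185))  # a shorter player is wrongly labeled a teacher
```

The rule is flawless on the pictures it was shown, which is exactly why the spurious pattern goes unnoticed until new pictures arrive.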
Instead of using height, the pattern matching might calculate that if someone is wearing a sports jersey, they are a basketball player.
Once again, this presents issues since the wearing of a sports jersey is not a guarantee of being a basketball player, nor necessarily that someone is a sports person at all.
Among the many factors that might be explored, it could be that the pattern matching opts to consider the race of the people in the pictures and uses that as a factor in finding patterns.
Depending upon how many pictures contain people of various races, the pattern matching might calculate that a person in occupation X is associated with being a race of type R.
As a result, rather than using height or sports jerseys or any other such factors, the algorithm landed on race as a key element and henceforth will use that factor when trying to classify newly presented pictures.
If you then put this AI system into use, say in an app that lets you take a picture of yourself and asks what kind of occupation you are most suited for, consider the kind of jobs it might suggest, doing so in a manner that is race biased.
Scarier still is that no one might realize how the AI system is making its recommendations and the race factor is buried within the mathematical calculations.
Your first reaction to this might be that the algorithm is badly devised if it has opted to use race as a key factor.
The thing is that many of the ML/DL algorithms are merely examining, full-throttle, all available facets of whatever the data contains, and therefore it’s not as though race was programmed or pre-established as a factor.
In theory, the AI developers and data scientists that are using these algorithms should be analyzing the results of the pattern matching to try and ascertain in what ways are the patterns being solidified.
Unfortunately, it gets complicated: as the complexity of the pattern matching increases, the patterns are not so clearly laid out that you could readily realize that race or gender or other such properties were mathematically at the root of what the AI system has landed upon.
There is a looming qualm that these complex algorithms, fed with tons of data, are unable to explain or illuminate what factors were discovered and are being relied upon. A growing call for XAI (explainable AI) continues to mount as more and more AI systems are fielded and underlie our everyday lives.
Here’s an interesting question: Could AI-based true self-driving cars become racially biased (and/or biased in other factors such as age, gender, etc.)?
Sure, it could happen.
This is a matter that ought to be on the list of things that the automakers and self-driving tech firms should be seeking to avert.
Let’s unpack the matter.
The Levels Of Self-Driving Cars
It is important to clarify what I mean when referring to true self-driving cars.
True self-driving cars are ones that the AI drives the car entirely on its own and there isn’t any human assistance during the driving task.
These driverless vehicles are considered Level 4 and Level 5, while a car that requires a human driver to co-share the driving effort is usually considered Level 2 or Level 3. The cars that co-share the driving task are described as being semi-autonomous and typically contain a variety of automated add-ons referred to as ADAS (Advanced Driver-Assistance Systems).
There is not yet a true self-driving car at Level 5; we don’t yet know whether this will be possible to achieve, nor how long it will take to get there.
Meanwhile, the Level 4 efforts are gradually trying to get some traction by undergoing very narrow and selective public roadway trials, though there is controversy over whether this testing should be allowed per se (we are all life-or-death guinea pigs in an experiment taking place on our highways and byways, some point out).
Since semi-autonomous cars require a human driver, the adoption of those types of cars won’t be markedly different than driving conventional vehicles, so there’s not much new per se to cover about them on this topic (though, as you’ll see in a moment, the points made next are generally applicable).
For semi-autonomous cars, it is important that the public be forewarned about a disturbing aspect that’s been arising lately: despite those human drivers who keep posting videos of themselves falling asleep at the wheel of a Level 2 or Level 3 car, we all need to avoid being misled into believing that the driver can take their attention away from the driving task while driving a semi-autonomous car.
You are the responsible party for the driving actions of the vehicle, regardless of how much automation might be tossed into a Level 2 or Level 3.
Self-Driving Cars And Biases
For Level 4 and Level 5 true self-driving vehicles, there won’t be a human driver involved in the driving task.
All occupants will be passengers.
The AI is doing the driving.
Consider one important act of driving, namely the need to gauge what pedestrians are going to do.
When you drive your car around your neighborhood or downtown area, the odds are that you are looking at pedestrians that are standing at a corner and waiting to enter into the crosswalk, particularly when the crosswalk is not controlled by a traffic signal.
You carefully give a look at those pedestrians because you know from experience that sometimes a pedestrian will go into a crosswalk even when it is not safe for them to cross.
According to the NHTSA (National Highway Traffic Safety Administration), approximately 60% of pedestrian fatalities occur at crosswalks.
Consider these two crucial questions:
· By what means do you decide whether a pedestrian is going to cross?
· And, by what means do you decide to come to a stop and let a pedestrian cross?
There have been various studies that have examined these questions, and some of the research suggests that at times there are human drivers that will apparently make their decisions based on race.
In one such study by the NITC (National Institute for Transportation and Communities), an experiment was undertaken and “revealed that black pedestrians were passed by twice as many cars and experienced wait times that were 32% longer than white pedestrians.”
The researchers concluded that the “results support the hypothesis that minority pedestrians experience discriminatory treatment by drivers.”
Analysts and statisticians argue that you should be cautious in interpreting and making broad statements based on such studies, since there are a number of added facets that come into play.
There is also the aspect of explicit bias versus implicit bias that enters into the matter.
Some researchers believe that a driver might not realize they hold such biases, being explicitly unaware, and yet might harbor them implicitly; in the split-second decision of whether to keep driving through a crosswalk or stop to let the pedestrian proceed, there is a reactive and nearly subconscious element involved.
Put aside for the moment the human driver aspects and consider what this might mean when trying to train an AI system.
If you collected lots of data about instances of crosswalk crossing, which included numerous examples of drivers that choose to stop for a pedestrian to cross and those that don’t stop, and you fed this data into an ML/DL, what might the algorithm land on as a pattern?
Based on the data presented, the ML/DL might computationally calculate that there are occasions when human drivers do and do not stop, and within that, there might be a statistical calculation potentially based on using race as a factor.
In essence, similar to the earlier example about occupations, the AI system might “mindlessly” find a mathematical pattern that uses race.
Presumably, if human drivers are indeed using such a factor, the chances of the pattern matching doing the same are likely increased, though even if human drivers aren’t doing so it could still become a factor by the ML/DL computations.
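The crosswalk scenario above can be sketched with a deliberately simple frequency-table "model" fit on synthetic observations. The disparity is built into the fabricated data on purpose, purely to show how plain pattern matching reproduces whatever is in its training data; the group names and rates are invented and imply nothing about real-world behavior.

```python
from collections import defaultdict

# Synthetic observations: (pedestrian_group, driver_stopped)
observations = (
    [("group_a", True)] * 80 + [("group_a", False)] * 20 +
    [("group_b", True)] * 55 + [("group_b", False)] * 45
)

counts = defaultdict(lambda: [0, 0])  # group -> [stops_observed, total]
for group, stopped in observations:
    counts[group][1] += 1
    if stopped:
        counts[group][0] += 1

def predicted_stop_rate(group):
    # The "learned" stopping behavior is just the observed frequency.
    stops, total = counts[group]
    return stops / total

# The model mindlessly mirrors the bias baked into the data.
print(round(predicted_stop_rate("group_a"), 2))  # 0.8
print(round(predicted_stop_rate("group_b"), 2))  # 0.55
```

A real ML/DL system is vastly more complex, but the core mechanism is the same: the statistics of the training data become the behavior of the system.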
Thus, the AI systems that drive self-driving cars can incorporate biases in a myriad of ways, doing so as a result of being fed lots of data and trying to mathematically figure out what patterns seem to exist.
Figuring out that the AI system has come to that computational juncture is problematic.
If the ML/DL itself is essentially inscrutable, you have little chance of ferreting out the bias.
Another approach would be to test for biases that have crept into the AI system, yet the required testing is bound to be voluminous and might still not reveal the biases, especially if they are subtle and assimilated into other correlated factors.
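One hedged sketch of such testing (not a procedure any automaker is known to use) is a counterfactual probe: feed the model matched scenario pairs that differ only in a sensitive attribute and flag any scenario where the decision diverges. The model, attribute names, and scenarios below are all hypothetical.

```python
def audit_counterfactual(model, scenarios, sensitive_values):
    """Return the scenarios where flipping only the sensitive
    attribute changes the model's decision."""
    flagged = []
    for scenario in scenarios:
        decisions = set()
        for value in sensitive_values:
            probe = dict(scenario, sensitive=value)  # vary one attribute
            decisions.add(model(probe))
        if len(decisions) > 1:  # decision depended on the attribute
            flagged.append(scenario)
    return flagged

# A deliberately biased toy stop/no-stop model, for demonstration only.
def toy_model(features):
    if features["sensitive"] == "group_b" and features["speed"] > 20:
        return "no_stop"
    return "stop"

scenarios = [{"speed": 10}, {"speed": 25}, {"speed": 30}]
flagged = audit_counterfactual(toy_model, scenarios, ["group_a", "group_b"])
print(flagged)  # only the high-speed scenarios get flagged
```

Note the limits: such probes only catch biases that surface on the scenarios you thought to test, and a bias laundered through correlated features won't flip when you toggle the attribute directly.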
It’s a conundrum.
Dealing With The Concerns
Some would argue that the AI developers ought to forego using data and instead programmatically develop the code to detect pedestrians and decide whether to accede to their crossing.
Or, maybe just always come to a stop at a crosswalk for all pedestrians, thus presumably vacating any chance of an inherent bias.
Well, there’s no free lunch in any of this.
Yes, directly programming the pedestrian detection and crossing decisions is indeed what many of the automakers and self-driving tech firms are doing, though, again, this does not guarantee that some form of bias won’t be in the code.
Furthermore, the benefit of using ML/DL is that the algorithms are pretty much already available, and you don’t need to write something from scratch. Instead, you pull together the data and feed it into the ML/DL. This is generally faster than the coding-from-scratch approach and might be more proficient and exceed what a programmer could otherwise write on their own.
In terms of the always-coming-to-a-stop approach, some automakers and self-driving tech firms are using this as a rule of thumb, though you can imagine that it tends to make other human drivers upset and angered at self-driving cars (have you ever been behind a timid driver that always stops at crosswalks? It’s a good bet you got steamed at such a driver), and might lead to an increase in fender benders as driverless cars keep abruptly coming to a stop.
Widening the perspective on AI and self-driving cars, keep in mind that the pedestrian at a crosswalk is merely one such example to consider.
Another commonly voiced concern involves how self-driving cars will choose to get to wherever a human passenger asks to go.
A passenger might request that the AI take them to the other side of town.
Suppose the AI system opts to take a route that avoids a certain part of the town, and then over and over again uses this same route. Gradually, the ML/DL might become computationally stuck-in-a-rut and always take that same path.
This could mean that parts of a town will never tend to see any self-driving cars roaming through their neighborhood.
Some worry that this could become a kind of bias or discriminatory practice by self-driving cars.
How could it happen?
Once again, the possibility of the data being fed into the AI system could be the primary culprit.
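The stuck-in-a-rut dynamic can be sketched as a greedy route chooser that always exploits the route with the best historical average travel time, where each trip feeds more data back into that same history. Everything here is a toy: the route names, the identical true travel times, and the pure-greedy policy are assumptions for illustration, not how any real routing stack works.

```python
# Seed history: both routes are nearly identical, with one slightly
# better observation for the "south" route.
history = {"route_through_north": [18.0], "route_through_south": [17.5]}

def observed_travel_time(route):
    # The underlying true travel times are identical for both routes.
    return 18.0

def pick_route():
    # Greedy: always exploit the best historical average, never explore.
    return min(history, key=lambda r: sum(history[r]) / len(history[r]))

for _ in range(50):
    route = pick_route()
    history[route].append(observed_travel_time(route))

# Subtract the seed observation to count actual trips taken.
trips = {route: len(times) - 1 for route, times in history.items()}
print(trips)  # every trip piles onto one route; the other is starved
```

Because the neglected route never gets re-sampled, its slightly worse seed average is never corrected, and whole neighborhoods along it simply stop seeing driverless cars; exploration strategies exist precisely to counter this feedback loop.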
Enlarge the view even further and consider that all the self-driving cars in a fleet might be contributing their driving data to the cloud of the automaker or self-driving tech firm that is operating the fleet.
The hope is that by collecting this data from hundreds, or thousands, or eventually millions of driverless cars, it can be scanned and examined to presumably improve the driving practices of the self-driving cars.
Via the use of OTA (Over-The-Air) electronic communications, the data will be passed along up to the cloud, and whenever new updates or patches are needed in the self-driving cars they will be pushed down into the vehicles.
I’ve already forewarned that this has the potential for a tremendous kind of privacy intrusion: a self-driving car is loaded with cameras, radar, LIDAR, ultrasonic, thermal, and other data-collecting devices, and it is going to unabashedly capture whatever it sees or detects during a driving journey.
A driverless car that passes through your neighborhood and goes down your block will tend to record whatever is occurring within its detectable range.
There you are on your front lawn, playing ball with your kids, and the scene is collected onto video and later streamed up to the cloud.
Assuming that driverless cars are pretty much continuously cruising around to be available for those that need a ride, this could end up knitting together our daily efforts and activities.
In any case, could the ML/DL that computationally pattern matches on this vast set of data be vulnerable to landing on inherently biased elements and then opt to use those by downloading updates into the fleet of driverless cars?
This description of a problem is one that somewhat predates the appearance of the problem.
There are so few self-driving cars on our roadways that there’s no immediate way to know whether or not those driverless cars might already embody any kind of biases.
Until the number of self-driving cars gets large enough, we might not be cognizant of the potential problem of embedded and rather hidden computational biases.
Some people seem to falsely believe that AI systems have common sense and thus won’t allow biases to enter into their “thinking” processes.
Nope, there is no such thing yet as robust common-sense reasoning for AI systems, at least not anywhere close to what humans can do in terms of employing common sense.
There are others that assume that AI will become sentient and presumably be able to discuss with us humans any biases it might have and then squelch those biases.
Sorry, do not hold your breath for the so-called singularity to arrive anytime soon.
For now, the focus needs to be on doing a better job at examining the data that is being used to train AI systems, along with doing a better job at analyzing what the ML/DL formulates, and also pursuing the possibility of XAI that might provide an added glimpse into what the AI system is doing.
It’s a human devised problem that requires a human devised resolution, and not an AI “cognition” problem that should be awaiting an AI-sentient solution.