By now you have surely heard about the fiasco at Google. They released Gemini, an image-generating AI tool, and it suffered from a case of embarrassing political over-correctness. One of its many howlers: when asked to draw Nazi soldiers, it drew images of people of color dressed in Nazi uniforms.
Clearly something went very wrong at Google. They tried so hard to be responsible and inoffensive that they ended up offending everyone. Google co-founder Sergey Brin suggested it happened because the software hadn't been tested properly. This is certainly possible; it is also possible that it was tested and no one was comfortable pointing out these howlers lest they offend their co-workers! Such are the perils of the job these days.
Google will sort all this out in whatever way it sees fit. Still, this topic of responsible AI is a fascinating one, full of grey areas worth unpacking.
If we ask the machine to make a decision for us, it is relatively easy to be responsible. We know the context of the inquiry and can put guardrails around the results to ensure they are fair. For example, in resume evaluation we can put guardrails around the machine learning so that it doesn't, say, choose only people from one ethnic group.
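To make "guardrails" concrete, here is a minimal sketch in Python of a post-hoc fairness check on a resume-screening model's output. The function names and the 0.8 threshold (a four-fifths-style rule) are illustrative assumptions, not any real system's API:

```python
from collections import defaultdict

def selection_rates(candidates, selected):
    """Fraction of candidates selected within each group.

    `candidates` is a list of (candidate_id, group) pairs; `selected`
    is the set of candidate_ids the model advanced to interview.
    """
    totals, picks = defaultdict(int), defaultdict(int)
    for cid, group in candidates:
        totals[group] += 1
        if cid in selected:
            picks[group] += 1
    return {g: picks[g] / totals[g] for g in totals}

def parity_guardrail(candidates, selected, threshold=0.8):
    """Flag any group whose selection rate falls below `threshold`
    times the best group's rate (a four-fifths-style check)."""
    rates = selection_rates(candidates, selected)
    best = max(rates.values())
    return {g: r for g, r in rates.items() if r < threshold * best}

# Example: send the model's picks for human review if the check fires.
candidates = [(1, "A"), (2, "A"), (3, "B"), (4, "B"), (5, "B")]
selected = {1, 2}  # the model advanced only group A candidates
flagged = parity_guardrail(candidates, selected)
if flagged:
    print("Guardrail tripped for groups:", flagged)
```

The point is that because the decision context is known, a check like this is well-defined and enforceable.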
However, for generative AI, where we ask the machine to generate new content for us, it gets much harder since we very likely do not know the context. For example, if someone asks Gemini to generate the image of a Nazi soldier, this could be a very benign request. They could be working on a historical post about the subject. Or they could be planning some horribly offensive meme. Just the fact that the Nazis were awful doesn’t tell us anything about how the image will be used.
However, the potential for causing offense is there, so one solution could be to simply refuse to generate the image of the Nazi. This leads to some very absurd situations, as I found out. I asked Bard, Google's text-generating tool, to write a short screenplay for me in which, as part of the story, one of the characters gets killed. The output Bard generated was farcical: it created a screenplay all right, but instead of the character getting killed, it rewrote the story so that the antagonist and protagonist realized the error of their ways, became friends, and no one was killed. When I pushed Bard to fix this, it refused.
This is ridiculous, of course; if no one died in stories, Hollywood wouldn't exist. Shakespeare wouldn't exist either. If we say Google is being responsible here, it's not a great definition of responsible. Such a G-rated approach to responsibility has other pitfalls too. Let's say someone asked Gemini to generate an image of an Easter bunny. Should Gemini do so?
An Easter bunny? What's wrong with that? It is all nice and cuddly; of course Gemini should generate this image. However, someone could just as easily create an offensive meme out of this image. In that case, would it have been responsible for Gemini to have generated it? What if that offensive meme was part of an article explaining the dangers of offensive memes? Then what? And what if someone copied that image and used it for something even more offensive?
Then there is the question of biases. Are there any inherent biases in what is generated? For example, if I ask for an image of a janitor and the machine returns an image of a Hispanic man, is this evidence of bias? It could be: if we asked the question many times and got a Hispanic man every time, it certainly suggests that the machine has learned this bias. Most likely it does so because the data used to train it was biased, and that is something generative AI systems can readily correct.
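To make that test concrete, here is a rough sketch in Python of what a repeated-sampling bias probe could look like. Everything here is a stand-in: `generate_image` and `classify_demographic` are hypothetical placeholders (simulated below) for a real generation API and a labeling step, not Gemini's actual interface.

```python
import random
from collections import Counter

# Hypothetical stand-ins: in a real test these would call the actual
# image generator and a labeling step (human raters or a classifier).
def generate_image(prompt, rng):
    # Simulate a generator whose training data skewed heavily one way.
    return {"depicts": rng.choices(
        ["hispanic", "white", "black", "asian"],
        weights=[0.85, 0.05, 0.05, 0.05])[0]}

def classify_demographic(image):
    return image["depicts"]

def probe_for_bias(prompt, n=200, seed=0):
    """Sample the generator n times and tally depicted demographics.
    One draw proves nothing; a lopsided distribution over many draws
    is what suggests a learned bias."""
    rng = random.Random(seed)
    counts = Counter(
        classify_demographic(generate_image(prompt, rng)) for _ in range(n))
    return {group: count / n for group, count in counts.items()}

print(probe_for_bias("a janitor at work"))
# e.g. {'hispanic': 0.86, 'white': 0.05, ...} -> evidence of learned bias
```

The key design point is that bias is a property of the distribution of outputs, not of any single image, which is why the probe samples many times for the same prompt.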
However, what if I wanted the picture of a Hispanic janitor for an article on Hispanic working-class people? Or I took the picture of a non-Hispanic janitor and used it to create an offensive meme mocking Hispanic people?
We get the idea. There is often no easy way for the machine to know the context in which the generated content will be used, and so thinking about responsibility at the generation step is (mostly) premature. The question of responsibility makes the most sense once the context is fully revealed.
Given this, what can Google or any other generative AI firm do? Whatever it is, it has to be something tangible; in the end, engineers need clear requirements they can code for. One approach is the safety-first approach: let's just refuse to create anything not G-rated, and then we can say we did our job and don't have to think about it. This isn't great; it leads to a bland sterility that stifles creativity and could also hurt the business value of the product. For example, I have little reason to use Bard if it is going to be so sterile. This safety-first approach was very likely the cause of the launch fiasco itself.
There is another reason the safety-first approach doesn't work: if Google won't do it, someone else will! If someone wants to create that offensive meme, Google being G-rated isn't going to stop them. There are already many tools out there, and there are going to be many, many more.
What about regulation? Can we manage this via regulation? Just outlaw offensiveness; that will take care of it. If only it were so easy! Who decides what is offensive? Some committee in Congress? Here we run into freedom of speech, which thankfully is still a core value.
Google could also just apply an acceptable use policy: hey, if you want to use the stuff we generate, don't be a jerk with it. This at least allows Google to cover its bases. But how can such a policy be enforced? One way is something like the community notes that Twitter employs, which lets users flag things that are objectionable. This at least generates something like social pressure, which could be effective in some circumstances.
Even then, if the offensive content was generated via AI, how would Google know that it was its systems that generated it? Since content is so easily duplicated, unless a new technology is invented that can 'watermark' generated content in a way that remains trackable even when copied, it will be impossible, or at least too costly, to enforce this policy.
Throw in a further complication: the scrubbed, sanitized atmosphere of big tech companies, in which you have to believe people are basically good, you can't say anything negative, and you can't point out awkward truths. God help you if you do; it's off to HR for you. This adds its own existential bias to any policy initiatives the tech companies take on.
Are people basically good? The jury is out on this one. I am reminded of a news story from a few years ago. An American couple embarked on a cycling tour of the world. Nine months in, one of them posted on their blog about how confident they were that there was no such thing as evil, only cultural differences, and how their travels had revealed that people were basically good.
On one painful mountain climb in Europe, in really bad weather, they reported that after a few hours of struggling, with lots of cars passing them by, one car stopped; the couple in it gave them a ride to their own home and then food and shelter from the storm. Surely this was evidence of the goodness of humanity?
Is it? It is definitely a heartwarming story; however, a dispassionate look at the data reveals something different. Most cars ignored them; one out of maybe many hundreds stopped to help. They also reported the opposite case, where on another mountain road, also in Europe, a car came up from behind and tried to push them off the road. This suggests a neutral conclusion: there are some good, some bad, and a lot of indifferent people. In a tragic twist, not long after posting about the goodness of humanity, they were killed by extremists somewhere in central Asia.
All of this is a long way of saying that we have limited ways of stopping people from being mean jerks, and the anonymity of an online presence can bring out the worst in us. Furthermore, whatever limited tools we do have, whether community notes, social reputation, demonetization, or the like, will need a lot of shoring up. We are already drowning in a sea of content, and with the machine now able to generate it 24/7 at ever better quality, many tidal waves will soon be upon us.
I wrote a while ago about the problem of fake news and the misaligned incentives social media firms have to stop it. Our political culture has already been transformed by this sea of information, some of it real and a lot of it not so real. Brace yourselves, because this is only going to get worse before it can get better. There is no limit to the amount of fake content AI can generate. The onus will fall more and more on the individual to navigate this digital wasteland. We will all have to find ways to answer the questions: What is fake and what is not? What has been manipulated to rile us up?
Are we discerning enough in our consumption of content, whether human- or machine-generated? Do we have resources that can help? What biases do those resources themselves have? How much can we trust them? Tools that help us with this discernment will become more and more valuable. That will be the subject of a future post.