2.2 The impact of AI technology on society
Let’s get a bit more detailed about the impact of AI technology on society. The following series of discussions is concerned with a talk by Dr. Timnit Gebru, a key global figure in ethical AI, who was, until 2020, the technical co-lead of the Ethical AI Team within Google. Gebru was fired from Google in December 2020 in controversial circumstances.
Watch the video, which provides a very clear and powerful statement of the risks of using AI improperly, and of how this can lead to the marginalisation of vulnerable groups within society. While watching, think about the people who are directly affected by the AI technology referred to, and consider how the impact on a person can be much worse depending on their personal circumstances (and, further, how those circumstances can arise from historical disparities and inequalities in society).
In the first couple of minutes, Gebru talks about a campaign they were involved in to stop the U.S. Immigration and Customs Enforcement (ICE) from implementing an AI-driven programme called the ‘Extreme Vetting Initiative’. This programme involved using Machine Learning tools to automatically decide which U.S. visitors and immigrants were to be allowed into the country and which were to be deported, based on data gathered from social media sites. The initiative was subsequently dropped in 2018, apparently due to criticism (Duarte, 2018; Lawfare, 2019).
Read the Center for Democracy and Technology (CDT) webpage summary of the ICE’s initiative (Center for Democracy and Technology, 2018) and answer the following questions: Do you think such tools as those proposed for the ‘Extreme Vetting Initiative’ should be available at all? Why or why not?
As you have no doubt realised by now, the concerns raised in Gebru’s talk and in the CDT’s summary about the use of AI technology for the ICE’s visa vetting tool are central to this course. As yet, however, we don’t have criteria whereby we can make such concerns explicit enough that we can work towards addressing them. The aim of the activities throughout this unit is to bring to your attention key dilemmas currently faced by the AI sector.
To the target question for this activity: the CDT post makes clear that the complex set of decisions that ICE was attempting to automate through the ‘Extreme Vetting Initiative’ would be difficult enough for a human to carry out, particularly given the range of subjective judgments involved in the criteria against which visa applications were being considered, such as being a ‘positively contributing member of society’. Automating such poorly understood decision-making tasks would be dangerous, and on this basis it would seem proper that ICE should not proceed with creating such technology.
It is important to be aware of the long history of activism in this area, and that groups such as the #TechWontBuildIt movement seek a whole range of changes within AI in order to address the kinds of concerns we are examining in this course. A common issue expressed by many such groups is that sustainable change is only likely if the larger companies themselves, such as Google, Microsoft and IBM, remain democratic and transparent. The news of Gebru’s firing from Google in late 2020 was therefore met with a great deal of concern by people across the field of AI.
A couple of minutes into the video, Gebru asks the following two questions:
- Should tools exhibiting the kind of bias Gebru is discussing be available at all?
- Are the AI tools currently being used in such high-stakes scenarios, where they can have real impact on people’s lives, safe?
The first question is one we will come back to repeatedly throughout this course, with the aim of providing a means of answering it for any specific project. For the second question: based on the example she follows these questions with, of an Arabic-speaking Palestinian being arrested by Israeli authorities, what do you think Gebru’s answer might be?
Consider the example of an Arabic-speaking Palestinian being arrested by Israeli authorities: the key fact is that Facebook’s tools translated a post written in Arabic into English as ‘Attack them’, when it should have been translated as ‘Good morning’. Since the kinds of tools Facebook employs are not equipped to understand the seriousness of the implications of such a (mis)translation, there was no way to flag how inappropriate, and indeed dangerous, such a translation was in this context. Perhaps more importantly, Gebru points out that those using these tools, in this case the Israeli authorities, were doing so unquestioningly, an example of what she calls ‘automation bias’. Her main point is that automation bias, coupled with tools that are in fact highly error-prone, has the potential to be very dangerous to those using them and/or those affected by their use.
Reflecting on ICE’s proposal to carry out automatic visa vetting, a key question that perhaps should have been asked by those aiming to develop these tools was whether it was in fact safe to do so, given that automating the decision about whether someone was a ‘positively contributing member of society’ required collecting and analysing large amounts of data, potentially including someone’s interactions with friends and close associates. Such data is certainly not the sort of thing that would seem relevant to questions about fitness for membership of society. We will see throughout the course that questions about data are central to critiquing the application of modern AI techniques.
Gebru begins talking about ‘word embeddings’ after about three minutes into the video, and how these can contain systematic bias, reflecting established societal biases. Word embeddings are representations of the meanings of the words of a language, treating words which occur close to one another across a large collection of texts as similar in meaning, and those which occur further away from each other as less similar in meaning. As the linguist J.R. Firth put it, ‘You shall know [the meaning of] a word by the company it keeps’ (Firth, 1957, p.11).
Use the free response box below to write down your answers to the following questions. Make sure to use Gebru’s examples to explain your answers.
- What are word embeddings?
- What examples of societal bias does Gebru refer to here?
Once you have added your responses, click on ‘Save and Reveal Discussion’.
As Gebru points out, because word embeddings are based on large corpora of naturally occurring text (e.g. web-based text), they encode the kinds of societal biases found in such text. This should, of course, be expected in very large amounts of usage data. For example, analogies built from these word embeddings include not only ‘man IS TO king AS woman IS TO queen’, but also ‘man IS TO doctor AS woman IS TO nurse/housewife’ or even ‘man IS TO computer programmer AS woman IS TO home-maker’. Interesting recent work (Guo & Caliskan, 2020) points to how the biases in word embeddings can in fact more seriously disadvantage people who are subject to multiple forms of marginalisation (e.g. women from ethnically marginalised groups).
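The analogy arithmetic described above can be sketched in a few lines of code. The following is a minimal toy illustration, not a real trained model: the four-dimensional vectors are hand-picked assumptions (roughly ‘male’, ‘female’, ‘royalty’, ‘medical’), with ‘nurse’ deliberately skewed towards the ‘female’ dimension to mimic the kind of bias that embeddings trained on large corpora have been shown to absorb. Real embeddings (e.g. from word2vec or GloVe) have hundreds of dimensions and are learned from text rather than written by hand.

```python
import numpy as np

# Hand-picked toy "embeddings" (assumed for illustration only).
# Dimensions, roughly: [male, female, royalty, medical].
# Note the gender skew built into "nurse", mimicking corpus bias.
embeddings = {
    "man":    np.array([1.0, 0.0, 0.0, 0.0]),
    "woman":  np.array([0.0, 1.0, 0.0, 0.0]),
    "king":   np.array([1.0, 0.0, 1.0, 0.0]),
    "queen":  np.array([0.0, 1.0, 1.0, 0.0]),
    "doctor": np.array([0.5, 0.5, 0.0, 1.0]),
    "nurse":  np.array([0.1, 0.9, 0.0, 1.0]),
}

def cosine(u, v):
    """Cosine similarity between two vectors (1.0 = same direction)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, c):
    """Solve 'a IS TO b AS c IS TO ?' via the vector b - a + c,
    returning the nearest word, excluding the three inputs."""
    target = embeddings[b] - embeddings[a] + embeddings[c]
    candidates = {w: cosine(target, vec)
                  for w, vec in embeddings.items() if w not in (a, b, c)}
    return max(candidates, key=candidates.get)

print(analogy("man", "king", "woman"))    # -> queen
print(analogy("man", "doctor", "woman"))  # -> nurse (the biased analogy)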
About five and a half minutes into the video, Gebru introduces The Perpetual Line Up report (Garvie, 2016). Recall that Gebru has been building the case so far in her talk that technology which is demonstrably unsafe and error-prone should arguably not be allowed to be in operation.
Have a look at The Perpetual Line Up website. From Gebru’s talk, and also from the website discussing The Perpetual Line Up report, try to answer the following questions:
- From the video and the Georgetown Law report, do you think we have enough information to decide about how accurate the technology is? If so, do you think the technology is accurate or not?
- Consider whether there is a risk that the technology referred to here could be misused by someone who has malicious intent towards an individual or group of individuals. How safe do you think this technology is? (Hint: if you were unable to answer the first question, then you should conclude that this technology is unsafe.)
We will come back to the issue of surveillance and face recognition later in this course; however, it is worth noting at this point that Gebru uses this example to emphasise the risks of using technology which is likely to be unsafe and error-prone.
If you find yourself stuck trying to work out satisfactory answers to these questions, then try looking through the “Key Findings” summarised on The Perpetual Line Up website. In particular, consider the findings regarding:
- the FBI’s use of biometric data
- the lack of regulation of law enforcement use of face recognition technology
- the lack of evaluation by law enforcement to ensure the accuracy of face recognition technology
- and finally, that ‘Police face recognition will disproportionately affect African Americans’.
Taken together, these points suggest that, without addressing such concerns, the use of this technology for this purpose is unsafe.
More specifically, note that the first three paragraphs of the ‘Executive Summary’ of The Perpetual Line Up report make for very confronting reading. Consent is something that most of us would assume to be protected under properly functioning democratic systems, and yet this kind of use of face recognition technology effectively removes consent. Given the dubious accuracy of face recognition technology when it comes to groups not well represented in the data, such as people of colour, as well as the potential risk of misuse of such technology, this again points to such technology being too unsafe to be adopted for law enforcement purposes.
A comprehensive consideration of the issues raised in this course will need to take into account both data and algorithms, both of which are necessary for carrying out work in AI and related areas such as Data Science. Due to the complex nature of the challenges being addressed in such work, projects in this area often require teams made up of groups with complementary skills, experiences and resources. For example, a university-based group researching healthcare for the elderly may team up with a company trying to solve a difficult challenge for people living in a care home. Solving issues around the availability, suitability and confidentiality of data, as well as safely collecting, sharing and analysing it, often takes up a good deal of project time, especially with greater diversity across partners within a single project.