Susan Etlinger: We are at a critical turning point about data
Artificial Intelligence (AI) is coming and in big waves. But the way it functions, analyzes, and makes decisions without human input is already affecting the lives of billions of people.
We all need to understand the basic benefits and dangers of the technology to make sure AI becomes a useful tool. That was the main topic of last week’s keynote by Susan Etlinger at the Smart City Congress in Barcelona. Etlinger is an industry analyst at the think tank Altimeter Group, where she focuses on data strategy, analytics and ethical data use.
What worries Etlinger is the way that we use different AI technologies, especially who controls the information, how it is collected and shared, and the opacity of most algorithms currently used to make decisions.
I sat down with Susan the day before her keynote to talk about the challenges of AI and society. One inevitable topic was the role of government in big data and surveillance.
“Last week [Nov. 8th] we elected a new president in the United States,” she said, “and suddenly we’ve gone from a government where we had an understanding, a general understanding, particularly after Snowden, of how data was used, to big questions. Fundamentally, we are at a critical turning point in terms of how we think about data, and how we use data both for governments and cities, and also for businesses and other institutions.”
When asked about the possibility of people using data and AI to influence political decisions and distort information to the public, Etlinger is outspoken:
“We don’t even know the level of intentional misinformation that has been shared,” Etlinger says. “Obviously the US news media, as an example, is full of conspiracy theories right now. The reality is [AI] is an incredibly powerful technology, even more because it is very difficult, and in some cases impossible, to go back and understand exactly what happens in an algorithm, and AI.”
Etlinger says, “It could be potentially a very scary time. Some people in the US are talking about ‘living in a post-facts society.’ That is a real danger.”
Algorithms, the “White Guy” and “Black Sheep” problems
Bias in algorithms is a favorite topic of Etlinger’s. How data scientists and mathematicians design the AI models influences the outcome and has implications for the decisions based on analysis coming from those algorithms.
“There is this sort of assumption that mathematics is inherently neutral. And, in the world of data science, nothing can be further from the truth.”
“I’ll give you an example,” she continues. “There is a data set, a corpus of data within Google called ‘Word2Vec.’ It is a data set made from words from 300 million Google News stories. It is used to train algorithms for the purpose of search and other applications. There is a group of researchers from the University of Boston and Microsoft, who looked at Word2Vec dataset and concluded, using very sophisticated mathematics, that it was, in their words, ‘blatantly sexist’… [Because] using Google News as the source of words, even massive, the algorithms made connections such as ‘man is to computer programmer as woman is to homemaker,’ and ‘man is to doctor as woman is to nurse.’ That shocked the researchers because they originally thought that using the massive dataset of Google News would be more neutral, unbiased than average everyday speech.”
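To make the analogy claim concrete, here is a minimal sketch of how such "a is to b as c is to ?" questions are typically answered with word vectors: take the vector for b, subtract a, add c, and pick the nearest remaining word by cosine similarity. The three-dimensional vectors below are invented for illustration and are not real Word2Vec values; the function name `analogy` is also just a label chosen for this sketch. The point is only that if a gender direction leaks into occupation words during training, the arithmetic reproduces the stereotype.

```python
import math

# Toy 3-dimensional vectors, invented for illustration (NOT real Word2Vec values).
# The first coordinate plays the role of a "gender" direction that has leaked
# into the occupation words from the training text.
VECTORS = {
    "man":        (1.0, 0.0, 0.0),
    "woman":      (-1.0, 0.0, 0.0),
    "programmer": (0.9, 1.0, 0.2),
    "homemaker":  (-0.9, 1.0, -0.2),
    "doctor":     (0.7, 1.0, 0.8),
    "nurse":      (-0.7, 1.0, 0.8),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors=VECTORS):
    """Answer 'a is to b as c is to ?' with the word nearest to b - a + c."""
    target = tuple(
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    )
    # The three input words are excluded, as is common for analogy queries.
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


print(analogy("man", "programmer", "woman"))
print(analogy("man", "doctor", "woman"))
```

With these toy values the two queries return "homemaker" and "nurse". Nothing in the arithmetic is sexist in itself; the result simply mirrors how the occupation vectors were positioned, which in a real model is learned from the text.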
It’s called “the Black Sheep Problem,” Etlinger explains:
“In English and most European languages, if you were to come from Mars and try to understand a little bit about Earth and you try to understand the most common color of sheep, you would conclude, by looking at Google, that the most common color of sheep is black sheep, because we have this phrase ‘Black Sheep,’ which is mentioned many, many more times — I think in English is 14 to 1 — than white sheep. So, it confuses volume for meaning.”
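A small sketch shows how counting phrases can confuse volume for meaning. The four-sentence corpus is made up for this example, and `color_counts` is a hypothetical helper, not part of any library. It simply counts which word appears immediately before "sheep".

```python
import re
from collections import Counter

# A made-up mini corpus: the idiom "black sheep" is written about far more
# often than literal white sheep, even though white sheep are more common.
CORPUS = [
    "He was always the black sheep of the family.",
    "Every team has a black sheep.",
    "The black sheep of the class skipped the exam.",
    "A white sheep grazed on the hill.",
]


def color_counts(sentences, noun="sheep"):
    """Count the word that directly precedes `noun` in each sentence."""
    pattern = re.compile(r"\b(\w+)\s+" + re.escape(noun) + r"\b", re.IGNORECASE)
    counts = Counter()
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            counts[match.group(1).lower()] += 1
    return counts


counts = color_counts(CORPUS)
print(counts.most_common())
```

A naive reader of these counts would conclude that sheep are mostly black. The counts are accurate about the text and wrong about the world, which is the gap Etlinger is pointing to.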
“This is one of the challenges with algorithms, that there are biases and assumptions built into algorithms, that sometimes we can’t see until we see the results of a work,” she says, and mentions “Artificial Intelligence’s White Guy Problem,” a phrase coined by Kate Crawford in a New York Times op-ed piece. Crawford is a researcher at Microsoft and co-chairwoman of a White House symposium on society and AI.
The “white guy problem” occurs when algorithms are written and designed primarily by white men, who can miss signals and implications for people of color and for other social groups.
As Crawford puts it: “Like all technologies before it, artificial intelligence will reflect the values of its creators. So inclusivity matters — from who designs it to who sits on the company boards and which ethical perspectives are included. Otherwise, we risk constructing machine intelligence that mirrors a narrow and privileged vision of society, with its old, familiar biases and stereotypes.”
Etlinger concurs: “We need data scientists, […] but we also need a diversity of voices to make sure that algorithms actually reflect what we want to reflect, as opposed to actually amplifying our existing biases.”
AI applications and challenges
Etlinger says that there are many jobs and processes, especially repetitive ones, that are good candidates to be replaced by artificial intelligence and robotics. “Here is the challenge,” she says, “some of those jobs won’t be replaced right away […] let’s say that driverless cars [and trucks] become dominant in the next 5 or 10 years. What happens to all those people? I don’t know. Retraining is not necessarily an option,” she says, adding that it is not her area of expertise.
“The important thing for us to look at, in terms of AI, will be: What are the best applications of AI that allow for complementary relationships between humans and machines, where machines do the majority of routine work and humans can provide the intuition, the expertise and the human context?”
She mentions Facebook’s News Feed as a timely example of where this kind of relationship has failed entirely in the past two years.
“In the Facebook News feed, which is optimized for engagement, the consequence is that the most controversial and provocative stories tend to be shared more than real news reporting, and Facebook has not had a way to make verification and authenticity an important part of the algorithm and then Facebook started trending false news stories on a regular basis.” That, Etlinger says, “is an example where a machine has too much responsibility.”
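The mechanism Etlinger describes can be illustrated with a toy ranking. This is a sketch under stated assumptions and not Facebook's actual algorithm: the `Story` fields, the sample numbers, and the `unverified_penalty` parameter are all hypothetical. It contrasts sorting purely by engagement with a score that also takes a human or fact-checking signal into account.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    engagement: float  # hypothetical score, e.g. shares plus comments
    verified: bool     # hypothetical flag set by a human or fact-checking step


# Invented sample data for illustration only.
STORIES = [
    Story("Shocking rumor about a candidate", 9500, False),
    Story("City council passes budget", 1200, True),
    Story("Investigation into water quality", 3100, True),
]


def rank_by_engagement(items):
    """Rank purely by engagement, with no notion of authenticity."""
    return sorted(items, key=lambda s: s.engagement, reverse=True)


def rank_with_verification(items, unverified_penalty=0.1):
    """Rank by engagement, scaled down for stories that are not verified."""
    def score(story):
        return story.engagement * (1.0 if story.verified else unverified_penalty)
    return sorted(items, key=score, reverse=True)


print([s.title for s in rank_by_engagement(STORIES)])
print([s.title for s in rank_with_verification(STORIES)])
```

In the first ranking the unverified rumor comes out on top because nothing in the objective distinguishes it from reporting. In the second, the verification signal, which in practice would depend on human judgment, moves it to the bottom.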
“Other examples are predictive policing, surveillance, profiling etc., which originally had good intentions,” she says. “But there are so many assumptions built into the [American] criminal justice system about the type of people who tend to get in prison, and arrested, that it can amplify bias, thus creating a real disadvantage for vulnerable populations.”
The Data Poor Challenge
“Some of the challenges have to do with context,” Etlinger says.
“Some of the data sets [needed to provide context] are either difficult to get or cost money,” she argues. “If you want to understand how people feel about something, Twitter is a great place, but you need to buy that data, and if you want all of it, it’s a million dollars a year.” That, she says, “raises the issue of the data rich and the data poor.”
Another important point Etlinger mentioned is the need for diversity of voices and experiences, and how academia can help. “I’ve seen that a number of [the] challenges of AI are being looked at very closely by the academic community,” she says.
“I think there is enormous potential for collaboration between academics, governments, NGOs, and businesses, to look at some of these issues, from multiple points of view, because we are dealing with unprecedented questions, and new methods are being developed all the time to deal with all kinds of data. And so we do really need, irrespectively of regulation, common ways of working that are replicable, and can stand up to scrutiny.”