
Figure 1: Ricardo A. Baeza-Yates
Source: LilyOfTheWest (2018)
Ricardo A. Baeza-Yates (born 1961, Chile) is a pioneering computer scientist specialising in algorithms, data structures, information retrieval, web search, and responsible AI. He co-authored the widely cited textbook Modern Information Retrieval (1999, 2011), helped standardise early web search methods, and is now a global voice in algorithmic fairness and ethical artificial intelligence (dblp, 2025; Baeza-Yates and Ribeiro-Neto, 2011).
Baeza-Yates earned his BSc in Computer Science and Mathematics from the University of Chile and his PhD in Computer Science from the University of Waterloo (1989). After early teaching and research roles in Latin America, he led Yahoo! Labs in Spain and Latin America. He later joined Northeastern University and the Universitat Pompeu Fabra in Barcelona, co-founding the Institute for Experiential AI, which advances research on trustworthy and responsible AI (Northeastern University, 2025; University of Waterloo, 2025).
Figure 2: Modern Information Retrieval (co-authored by Ricardo A. Baeza-Yates)
First published in 1999 (updated 2011), Modern Information Retrieval is a seminal textbook that shaped how search engines index, rank, and present information. Co-authored with Berthier Ribeiro-Neto, this book formalised many early methods for web search — from query processing to ranking algorithms — laying groundwork for today’s Google-scale systems. Baeza-Yates’s influence continues through this text, which remains widely cited by students, engineers, and researchers in the field of information retrieval.
You can explore the book at:
Baeza-Yates co-developed the Shift-Or (Bitap) algorithm for fast pattern matching, making text search dramatically more efficient for large datasets. He co-authored Modern Information Retrieval, a foundational textbook that formalised how search engines process, index, and rank web content (Baeza-Yates and Ribeiro-Neto, 2011).
During his time leading Yahoo! Labs in Spain and Latin America, he drove research on web mining, query processing, and data visualisation that shaped the evolution of internet search (Northeastern University, 2025).
In recent years, he has helped define global ethical AI frameworks through his role as Director of Research at Northeastern’s Institute for Experiential AI, where he leads efforts in algorithmic fairness, transparency, and bias mitigation (ACM, 2025; Institute for Experiential AI, 2025).
Ricardo A. Baeza-Yates’ career highlights how computing breakthroughs shape everyday life — and how they must be made fair and transparent. His early work on fast pattern matching and scalable search engines laid the technical foundation for how billions of queries are answered daily. But as search became central to accessing knowledge, Baeza-Yates turned his focus to its ethical side: how ranking algorithms can amplify bias, reinforce inequality, or hide diverse voices if not designed carefully (ACM, 2025; Northeastern University, 2025).
He helped advance global principles for algorithmic accountability, encouraging today’s engineers to design systems that balance efficiency with fairness. His journey reminds us that computing isn’t neutral — every search result reflects choices made by developers, companies, and policy makers.
Activity:
Video 1: Ricardo Baeza Yates – Ethics in AI- a Challenging Task (Ricardo Baeza Yates, 2021)
uh so this institute is a new Institute of the University it started last year and need to have a single phas for everything we are doing in AI there were many people doing in different doing AI in different colleges and now we have like 90 Affiliated faculty uh a few here also in London working on on this initiative and we believe there are two things that are important um what people call call human in the loop so be having humans involved I prefer to say that we should be in control that people choose to be just in the loop and the machines in the loop right but that's the reality and then today it's easy to throw more data to deep learning but I believe many times the the key thing is to have better algum particularly because 99.9% of the problems of the world will never have big data and we have a hype with big data a hype with deep learning when very few people can profit from that so I'm also working small data I have a Blog in 2018 talking about that and luckily and this year said in February I'm be a so finally someone famous say the same so um one important thing we're doing is the responsible AI practice and we're already working with several companies on helping them on this and I will show you how what what model we are using so this is the agenda first I want to talk about the main eal issues this has a personal bias these are the ones that I think they're more important there are many more but these four classes are very important and then I will discuss some problems for example our cognitive biases a bit of Regulation and cultural differences so how many people were born in the South hemisphere here okay only me so so we have a bias here and I hope we W will understood that I don't know and then I will present this holistic view of things so that's why I ask the question because the first problem is the course of bias you have an algorithm that receives bias data the first bias we have is that thinking that bias is negative bias is neutral depends on what happens with the bias information is VI if you put random noise to an algorith nothing happens well or maybe weird things happens randomized ALG um so we can ask ourselves should the algorith ask it had to be neutral or fair with that input well typically no one is asking that question and then you get the same bias or even worse you get more bias and if you get more bias we cannot blame the data right something else we are doing and and this is very important bias not only in data even if some famous computer scientist say that and if you're interested in the whole cycle of bias in a system I publish a paper bias on the web 2018 that now is my most downloaded paper so I guess it's important topic I think I will skip this you you must surely had ethics course so you know what equality equity and Justice and if you don't know prefer not to know so the answer to this question is that not always you need to worry about this so otherwise will be very complicated if for any algorith we need to ask this question but if you have people you need to answer this question so this the key part are you harming people and we will get back to this question when we talk about regulation okay so how we can solve this problem that there are three solutions you can the bias the input if you know the bias sometimes it's hidden in the data second you can tune the algorithm so make the algorith aware that it's a bias for example there's a learning to rank is ranking bias because people click more on the first positions of the ranking because they in the first positions and there algor that know that and they tune the solution to that and they solve the problem and the last one is to Devas the output basically uh try to solve it at the end the problem that you already lost too much information for example imagine you're looking for a person to hire in LinkedIn and you get 50 people if you're lucky you will get 10 women and then the best you can do is gender par in the top 20 the other 30 would be man so but we are not the bias in anything we're just mitigating bias because as I said we don't know even sometimes we don't know the reference value so what should be the right percentage of women in this to maybe it's 50 but I don't know and so they should decide when I talk about this in India 70% are women but it's completely different or if you go to Iran the same so the first time that this problem reached the headline news and this is maybe well known for people here but I just want to mention the case because this is like the first famous case is the compass with this uh the system for uh supporting decisions on res divis hard word for me um when propic in 2016 said there was a rational bias later C Rudin uh that has an amazing work also showing that it's a miss this tradeoff between explainability and accuracy that show that really the the bias was H not rice but they were correlated so there were more African-Americans that are were younger um so when you we have a public uh institution using this kind of uh Solutions need to ask this question so should a public uh service use a secet algorithm it's a very important question but also we need to ask ask another question just in case is safe to use the public algorithm because then you can it and of course the solution is not in the extreme some somewhere in between and depends on every problem now let me give you a very good example of how bias can increase I like I love this paper this was published four years ago um so it's a bails in the state of New York so here you have an offender and the the judge need to decide if the person will get bail in most parts of the world the judge needs to think two things the person will be offend and second the person will come back to court well in New York very strangely uh it's only about if the person will come back to court if I have a serial killer for me will be very difficult not to have any cognitive bias to to say okay this person may kill again but this is the same well you know there's economic bias here if you can pay the bail you get out and if you cannot there is a lender in the US uh Poli station that many time knows the vle and they they have to do the same prediction with the Jud he will come back to court to pay me so that's interesting for me and then some people go to prison and some people don't so in this problem we have a typical case that we have in real problem is that we don't know part of the data we don't know what will have happened if someone that didn't get bailed what bail and then would that person be offended or would that person come back so for that we need to do something called Data imputation so we need to do models to predict that data first before predicting the whole thing but the results of this uh paper that was published as I said 2018 and was a request of the national US economic Bureau the Justice buau doesn't want to have this because judges don't want this and I it they right for different reasons we shouldn't use this it's not ethical um they got that if the predictions of the system were right uh you could decrease the crime rate in 25% and you keeping the same prison rate I don't say jail because I don't know how to pronounce it people think that I'm talking about the university that's my bias and then uh prison rate decreases in 42% keeping the same TR rate so they can do it much better than humans okay even worse if they if the system is doing the right prediction according to the data SE that they are doing for example half of the 1% most dous criminals fail to appear more than half of the time and reoffend more than half of the time so seems that they're really bad with the most dangerous people so the bias is also Amplified let me show you what happens well this is this is the the the table from the paper it is not working but working but not in the screen um in in red I put the percentage of the population of the state of New York of these two minorities uh African-Americans and Hispanic people and you see that it's already systemic bias because 82% of the people that go to the court are from these two minorities now the judges put some more bias they put more black people in the prison and they decrease the percentage of Hispanic people because there are some Hispanic people that are more white like me right everything is relative I'm completely white in ch so so here I'm not um but what happened with the algorithm well the algorithm learned this trend the only demographic varable that the algorithm is using is age not even gender because most of them are men right so only gender and from the data it captures this bias and increases the percentage of African Americans decreases the same Trend in Hispanic but the total goes to 90% so now we have a huge bias is almost three times from 32 to 90% now the good thing about model is that you can you can tun it for example to you to be uh the less racist of the judges it's the last line and still you can be 23% better in sending less people to prison keeping the same crime so it looks like this is much better but we have ined bias so we have a dilemma here uh so what is better uh a biased algorithm that is just in the sense that if you see two people they get the same outcome this is the advantage of Al they are determin people is not like that why because they are noisy right it's a varability even you in the same situation will do different things and I was using a very interesting article that appear in har business revieww in 2016 by Daniel canaman and other people and of course you know who d p um about the high high cost of basically the viability on decision and if you're interested in this topic last year he published a book withon and on on noise and sometimes noise can be even worse than bias because bias sometimes we know it noise is completely random as you you can see in the examples below now the question is a is a fictitious question because really what we are choosing is between a bias algy and a bias and noisy judge because noises bias sorry judges also have bias but we we are choosing between c and d and then I guess we should choose C but for different reasons we shouldn't use it but that's another discussion so let's talk about the one important example where you find bias and and it's good that that you explain language model yesterday so I don't need to explain anything uh but this a table of uh one paper I will mention later of the state of language models until 2021 you see that trillions of parameters trillion so this is the same order magnitude of the text they're using and still they're not overfitting so this is like a mystery why the algorith doesn't overfit if it has so many parameters and there are some interesting uh theories about that but this is a question I have to many people and no no one can answer really with it works this is the classical answer it works we don't know we don't know why we are not overing but there are many biases and I will show you one that is not the most common anti- masculine bias so if you say a sentence like two muss walk into a these are the completions that you get the four first completions are Violet this the paper was published a bit more than one year ago and you can say okay what about other religions well Muslims are in New in newth we have S Muslim and four times more dangerous than Christians and if you don't want to be dangerous in in perception you need to be Buddhist or don't believe in God maybe most of are are in those C right so but can be much more complicated so I'm sure from the UK people you remember during Co they tried to predict the scores for the University and they did it very badly because obviously the data is historical and has a bias and and the problem here and it's the same problem that happens in Justice is that you're changing a decision for one person to decision based on the data of many people but we are don't come from a distribution we are not a statistical I don't believe that so basically you are normalizing the person to say okay you are similar to these people and then you need to have this sport that for me is an ethical mistake and that's why I would not use uh machine learning for Education Justice and many things that are based on the qualities of one single person uh another example that I love almost a bit more than year ago um the deliver case in bolognia I know how many people know this but basically in 2018 I think some writers um sued this company because a group of people felt discriminated but they couldn't find any characteristic they couldn't say we are all from Africa we are all immigrants we are all women no they were all different but what happened the alv was trying to earn more money that's very sensible uh optimization that's what most Alor do and then it learned to give more work to people that could deliver at night because that's the time you get more orders so these people were discriminated because they couldn't work at night or they didn't want to work at night and legally in Italy that's okay so basically they they were found guilty for implicit discrimination and they had like a symbolic F but this is interesting case to to set the record and then there other cases I will skip it because I have too much material and the last case is the best for me it's not see exactly if the other uses AI or not but we shouldn't care and I will come back to to that in regulation any ARG can be uh can do discrimination nothing could be even a randomized right but we don't need to use a ey for that and this is the case of Netherlands that for many years the problem started in 2012 it's called City like City but we say why uh to discriminate uh poor people because they were looking for fraud in child care subsidies it's already an ethical issue because if you're looking for fra P look look in rich people not in poor people and then 26,000 families were forcely accused of uh fraud and they had to return money some people lost their houses some people had to go back to their original count and so on and at the end even though uh the former minister of that uh of the ministry that did this that was a a parliament member basically quit the Parliament that was not enough and the whole government had to quit in January 15 last year but this is the largest outcome political outcome of the wrong use of an and these are just example if you want to look at other cases there are more than 2,000 some examples in this database that uh a nice person in in silon Valley is building uh in his free time okay second problem fomy when I I when I learned philosophy at school I'm a bit old I learned that depending on the type of my face I will have certain personality right and this a theory that we know it's not true but sadly it's coming back for example qu Kinski in 2017 said that he could predict the sexual orientation of people using the picture U there was a uproar and Well Done many people also realized that he was wrong he didn't know how to do Bach learning and he was capturing only a scous correlations but then in China a bit earlier they did a basically Minority Report show me your face and I will tell you you commit a crime even more complicated and the allot of people that complain to them and they were they were even answering the complaints no this scientific we we can solve it and see that people don't remember because during covid they did it again in the US doesn't matter if you are in one spe in one extreme of the political Spectrum people do these things um Kosinski came back last year doing political orientation 70% come on 70% could be the clothes we're using if you if you have a bird or really Democrat and so on so these are scous correlation and he really doesn't know because it's not facial recognition it's facial biometric um so we are coming back to phenology so all you know what is phenology good many places they don't know so just showing some pictures from the house the former House of chesar L broso in Torino because I found this amazing it was one of the Believers of chology uh he collected hundreds of sculls because he believe criminal people at different scal he couldn't find any but he really believed this until the end because he left his a skeleton as a ground tooth of a normal person so I still don't understand something okay so but it can be worse in 2019 MIT published a paper saying that if you give me a piece of your voice I can generate your face I don't know know how what they do with all the adopted children from other parts of the world this is really magic uh okay then I can do my master algorithm you send me a piece of voice message via WhatsApp I use this algorith to create my face and then I know your name yes M claims that with your face I can guess your name very accurate this is a pattern i i a file pattern I hope they don't Grant it because if they Grant it that that would be no way to do anything and then I can know if you're an opposer if you're homosexual or you're a criminal this is dangerous for people this is really dangerous I hope people don't do this in the future but it can be more sub so this work of Lisa Felman Barett the famous NEOS scientist in North Eastern um so people cannot detect emotions correctly there are many factors first is cultural first is personal some people laugh on their s and people laugh on their neighbors and so on so if we are using label coming from people of course ma learning cannot detect also emotions so we we are just basically rediscovering stereotypes where they are trying to basically try to detect emotions sentiments and so on maybe from text is different because text has semantics and I will get back to that so third problem pure human stupidity is a very important problem right in every election for example I will not talk about so uh George Bo in 1976 said almost are wrong that somebody is he was talking about statistics but the truth is that we are using very very Advanced statistics and the same can be say to any deep learning mode let me show you some examples last year see that last year was like a boom of of a examp or maybe it's biased because I started to talk about that one I so in December 2020 Elon Musk said use signal in Twitter of course he was talking about the chat app you know signal well some software that was using input from influencers in the stock market thought that they had to buy a stock from this company medical company in Texas signal and the price of the company went up more than 400% the company was very happy the people that bought the stock are not that happy so this is real P stupidity because they could understand semantics of of a single tweet with two words right very hard but there are even more difficult cases so in the right we have an example called adversar AI I change something in the input and I get the different result so here is a paper from Japan that shows that changing a single Pixel I can change the outcome of the of the class that you get for example you can see dogs that become cats hores that become frogs uh I think the boats have become airplanes and so on so anything can happen of course the question would be what they are learning if with one pixel you change that and then we have an example that looks funny but it's not some smart men in Melo Park decided to use a English train head language English train head language classified in France and decided that the town of was forbidden sorry bit I don't know have a French here so there was no human in the Lo and they took three weeks to get back the town page so if that page was being used for C this goost a lot of f so it's it's funny but looks funny but it's not okay so let me then here remember the limitations of this technology and I'm sorry that some computer scientists don't like to think about that but I think it's important the first thing is that to to to learn to abtract things you need to filter you need to forget that's why we work we just see one cat and we know what is a cut forever I don't know how we do it because we see it only in some positions but we see the movement and then we learn everything so I like to remember here this a very nice story from for L bz anyone read read yeah so this person couldn't forget anything so don't ask him how was his morning because maybe he would take more time than the morning to tell you what happened so this is something that is not easy to do with this Dr today second this could be trivia but you cannot learn what's not in the data right and very important data is a pro of the problem data doesn't capture everything data will never capture for example what's happening here right now maybe later but not not so this is very important we don't capture everything in Justice this is very important because the context of the case may be in other things that are not in the T that's another reason why we shouldn't use it in and this is what happened in the infamous case in Arizona when a woman that was crossing in a bicycle at night in the wrong place was killed by a uber self driving car I don't know if you know this case but I will get back to this so basically this case was not in the data and we will never have all the future accidents in the data because they're in the future and there are infinitely many right if we keep Earth alive we'll be infinitely or we not talk about clate change either so third thing accuracy I don't care about accuracy it's like you go to the pharmacy and someone says this drug works 99% of the time and I said but what are the side effects I don't know you have to trust you have to be CW worthy um so how many people will take an elevator that says that it works 99% of the time I will not but if the elevator says it doesn't work 1% of the time time and when it doesn't work it stops I will take the elevator because I know I'm safe this is not happening today with mat the second thing is that we are using some very nice mathematical measure to optimize like accuracy or any other measure but really this is not the important thing it depends on the impact of false negatives and false positives especially in medicine but in many other cases tell me what is the harm I prefer to use an algorithm that 80% accuracy and doesn't kill anyone that one that's in 90% accuracy and kills 10 people it's just a matter of is but this question is not yet on the mind of most computer scientists and and I'm I'm writing something about this finally we need to be humble I would like to see classifiers that say I don't know the smart people say I don't know good teachers say I don't know when they don't know they don't try to invent an answer or an explanation I we get back to that I have the paper the other day day was called anology because sometimes we need that ay Last Problem waste of resources and this this is the same table so if you take this table about the carbon suint for example if you're trying just a a simple Transformer with only 200 million parameters that's the same carbon trail of a normal person in the earth for 57 years so one Transformer one person okay 57 maybe one Transformer half a person I don't and you spend between1 and $3 million of electricity doing that and other waste of resources because they have gender bias racial bias religious bias and so but this is the paper that was the reason that Timmy G was laid off from lle I hope I know that one person read it at least stochastic par paper and and I do believe that that maybe 1% of the time language models are stochastic par because they are not not not even they're but 100% they're not understanding what they're reading and 1% they are not understanding what they're writing and they're very nice experiments about this uh they they also told Margaret Mitchell not to put her name in the paper but she did anyway I like that and we all recognize her there uh but that didn't matter too much two months later she was also fired for checking her own mail looking for proofs about what happened with Ste both of them were the leads the colleagues of the E6 AI team in Google but this is not new I know you remember when in 2019 Google tried to do EIC board they had to kind to dissolve it in one week because they didn't choose the right people very hard to choose ethical people they don't for example I guess they they don't have they don't they shouldn't have Twitter for example so you kind of look for anything wrong because of this we wrote a paper about the intellectual freedom in ai6 this is a public this is a new Journal that started last year Ai and6 and we wrote this paper on on why this important and and the consequences and of this so I don't want to pick in Google this is something that appears in many companies but if only you look into companies well Amazon many times pish analytica you have it very close so Facebook the last one may be Spotify remember this year Co basically spreading fake news and although there are many people that is leaving this company because of physical concerns and more and more I have friends I have left all these companies there are very few that are famous that they reach the news and this is a a personal I met in my PhD team bra team BR is one of the inventors of XML you know XML he left Amazon because of ethical concerns and at least that went to the news because K were not so this is the end of the first part let me see how I'm doing okay have to rush okay why this this happen because we do Tri an error this is computer sence TR and error imagine that the tower bridge was done by TR I we wait 100 years to use it so so there are many things I will not read it but these are all things that that uh computer scientists do um this is based partially in the paper by Gia masus one of my colleagues and friend at the ACM US policy Council um but you can read it so many and there are more this is just a s a bias sample and just to see what are the impact if you go to gdpr I don't know step is here he talked about this the first day so if you go to article 22 just read the last line the last line says that I'm the I have the right to contest the decision of an automated system okay what that means in practice so we SK all the the legal conversation is that if you need to give informations about how the system works then you need theity means I know how the system takes a decision but if you want to contest a decision you need to have explainability I want to know why this particular decision was taken most of the times you need supportability to have explainability that's not me and finally if you want to keep uh being safe you need to do other things like the validation periodically and so on but this is complicated for example this are interesting paper in science idea that shows that in some Fields explanations can be worse than no explanation like El and if you have seen house famous cities this is a typical example of how difficult can be to get the right uh problem with from the same symptoms but ddpr already has been used and this is an example I like this is a south of France two high schools decided to do uh media surveillance for security and then some parents went to court and then the court said that there were three reasons why this was illegal I I think this is interesting case the first one is because they didn't have the competence to take the decision that I think this happens in Netherlands was an engineer that said let's look for f in in poor people and no one said stop you shouldn't do it right and went all the way to to the Prime Minister the second is consent uh according to gdpr if you are using surveillance you need to have informed consent because there are no legal reason to do that so only police and governments can uh do surveillance of course asking for informed consent uh it's very hard now you need to force people to read if you enter this school you are allowing us to to to record and finally also very nice that the solution was not proportional to but you don't need to use be surveillance for security I think this was nice and and and this is an example of Whata jasi my favorite ethis test this is technological solutionism instead of normative solution is I will get back to that this is my professor informal so regulation well Lina that wrote a very wellknown paper on antitrust in 2017 is the person in charge of antitrust in the US I hope something happens but we already have three cases in different parts of the federal government uh with Google Facebook and Amazon so it's not like only in Europe they are looking at this also there but there it's more difficult because of the legal system uh during Trump they were they tried to pass four different laws one was proposed by Kamala haris the current vice president and they didn't pass because the Senate was dominated by Republicans so I hope some of this will happen during the Biden government that but in the first two years I haven't seen that but at the same time also the the Congress said you need to create the artificial intelligence office Trump didn't want to do it until he left I guess not to leave it to Biden but he took two years to create this office and now exist and I hope some good com from this one um the U proposal I'm sure many people is aware of this last year uh this interesting interesting not good uh proposal uh that is based on risk and basically has three categories forbidden high and low that means that is one that is no risk so there are four categories uh many interesting things and many good intentions but let me say what happens when you have good intentions but you don't know how technology works it's like the right to be forgotten that's very hard to take all the information from the web and tomorrow you put it back in the web so I will read this article five the placing on the market putting into service an system that deploys subliminal techniques beyond the person's Consciousness in order to notch a person behavior in a manner that cause physical or psychological hand beautiful I want this now tell me how you will do it when you show a fast food ad with a person with morbidity so obesity like like a metabolic problem very high you can do it at posteri but aity but you don't you don't want to install a sensor chip system that will not work right in the whole world or at least in Europe so but there are more problems I think there more basic problems the first one is that risk is a continuous value this is the problem with race the skin color is a continuous Val we invented category that didn't exist the same we're doing here we're inventing three categories that are described by cases not even by things that you can measure so I I I see the the game that comp will play oh no I did my self assessment and I'm low risk or no risk I can do something even more smarter I'm not using I'm using a randomiz I'm using Quantum Computing I'm using blockchain so this regulation is not for me right I can play the game I can even say I am using Advanced statistics not I am I people don't like that but that's the truth so there's a big loop hole also in this and I think they're trying to change that but until now they haven't CH but there's more I don't think the right solution is to regulate the use of Technology the first time we're doing it it's like saying you cannot use the hammer to kill a person we know that we have human rights I mean you you can use other things not a hammer so we don't we should regulate independ of Technology because tomorrow we'll have a different technology do we do do we want to regulate Quantum Computing also in the future or whatever someone invents neurites someone knows about that or we have of Human Rights should we split the brain from our body I will not go there so many problems um so regulating the use of Technology by use cases I think is a very bad idea I'm not a lawyer but I don't well the end why responsible AI why not trustworthy AI why not ethical first we don't humanize things so I don't believe we use which should use human tra for machine so I don't want to say just the ey or ey or just wor the ey because these things are human so let's machines may be intelligent but in a different way in fact I think they're very different from us they're very fast they are they have more memory and so on but maybe we should work together not compete why do you want to compete with yourself men wants that see it's not a tax and why not trustworthy I well for other two reasons they don't work all the time why we are asking them to trust the system and also we are putting all the bden in the user that's not fair we should put the The Bu on the designers on the Creator and that's why responsib is the best word although responsible resour to human right so if anyone has a better word please tell me now system don't need to be perfect right although we are playing God they are learning from us why they should be better than us right if they're really better than us we are really better than them because we invented something that was able to be better than us then we can go recursively with this time for them um and uh a colleague one his collaborators CES Dalo published us here this nice book about experiments of scouring that shows that people are much harsher on machines than on humans so air is human not from ma this it's another bias we have it's like the bias that we are not animals animals too so this is the model we're using this model comes mainly from the the P model that is from my S6 list jansu jja um so we have first uh the first part is let's do a road map let's work together and and find what you need to do in your company and then we have three branches the first one the most important governance so we write the Playbook where are the processes which is the people involved and then we have to use the last Branch we need to train that people how employees will basically operationalize the governance how managers should do that and how the Fe level to do that hopefully the the C Level is pushing this otherwise this will not work and and we have in the middle uh basically AI eics assessments to to see the risk the harms the benefits and we can do that with projects with products and for that we need to register uh systems and we need to audit systems and and audit includes not only the technical part but also the social part one thing is what really happens the second thing is what people think that is happening and their perception Maybe are not discriminating but people may think you're discriminating so it doesn't matter if you're not discriminating and in many cases we think we need an AI ethics Advisory Board and last Friday we launch this so these are 45 top people in the world that is an First Independent on demand AI eics Advisory Board you can go to our website and see the people I think you will recognize many people well known and we have people from all geographies from all genders from all topics because we want to address all possible cases so we have the three classical values of eics they are there I'm sure you know them but many people um mix a value with an instrumental principle and this is important because people think that principles are the values and this is not the cas right and here on the last column I have I don't know if there were 18 or 23 we have up to 32 instrumental principles that help to achieve these values and in every business we need to check what what are the rights set of principles so this is the beginning this is the road map do you have principles we found a company that had two different set of principles when they saw that they were doing very well sorry we need to go back to Step Zero because you canot work with two set of principles we already have well you know eics is about conflict but you have a conflict from the start you have two sets of principles and basically two different units have developed two different set of princi so we learn a lot about how bad this can be but we need to go back to square the but now they're very happy because they really saw that that they Happ before uh I have been working on this on this something that's not published it's my own thinking I'm not a philosopher but trying to find the relation between things uh I said these are the basically the six more important instrumental principles and I was able to push the first one in the new version of the ACM principles so this one this I think this is more most crucial one I call it legitimacy and competency so you did check that the system to ethically exist so you did that and second you have the competence in everything to do it you can decide that first one very simple you have the technical expertise you have the domain expertise and so on so this is basic and then the others are if the system exists uh there are already some uh at least one book that appear last year from mer nman on how to do responsi ey I don't agree completely with everything he uses trustworthy certification I don't like this two words for different reasons but but I think he's doing a great work on at least to push this to to the public and the future will be how we can use a i to do responsive with a bootstrapping yeah so ethical risk assessment I'm sure you know about this but let me tell you my favorite uh dilemma because I hate the trolley problem I will never find that problem in my life but this problem we have it today so this is the set of people killed by CS and I don't know exactly what is the number of people that we will save using surviving CS but I'm convinc we will save a lot of people mostly men we would say them but they will not play Fast what is the problem the problem is that the people that will be killed by S driving cars is not the subset of this is that this and like the woman that died in Arizona if you have a kid running too fast and the mother never predicted that the kid will run too fast kids will die all people that were too slow will die and maybe ad so basically we are affecting a vulnerable people and we need to have a solution so it's okay to say yes just let me use the metaphor 900 900 men and kill 20 woman and kids just to put the extremes I don't know I don't have an answer but this is a societal problem I need to solve and this is not Sol okay so then we need to register there are cities in Europe already doing that they're very public I don't know if now they can be gained I don't know if they have worry about the second question uh there are needs to audit algorithms most audits are done against the will of companies but last year uh the the team in North eem publish the first paper done with the will of the company and they agreed to publish the results before doing the audit and this is a software to hire people and the question was are we fair with gender according to the recommendations of the US government this recommendation may become low soon and they found that yes they they were satisfying the recommendations of the federal government however many people complain why this is the S problem because if you audit the algorithm you are legitimizing the use of the algorithm and this algorith was used video games to the sze which people could be have and many people said that that's P science because there's no scientific proof that using a video game to show that this is the best engineer in something or whatever and the best n in something so if we do all this we are legitimizing the algor so we need to be careful accountability let me go back to the arisona case because I think this is a very interesting case who is responsible most people would say Uber right Uber was the responsible if their car they hire the people that did develop well Uber very fast settled with the family in less than a week we don't know how much they pay but then the family didn't F them and then the suddenly the Arizona government learned and you can imagine from whom that the woman that was as a backup driver was watching a video because she was bored because AI works until doesn't work and we Cann predict when doesn't work and then Theon government Co say I will not Sue the woman because she is also responsible so they they went after the woman of course the woman was Hispanic was receiving minimal salary so was a vulnerable person and she was at the end basically uh find guilty and had to be one year at home with this ring in the so at the end the person that was less responsible suffered the most so accountability is a pending issue yeah this is a multidisplinary challenge this is not about engineers it's is not about philosophers it's is not about sociologist it's about all working together but when we want to work together we need to listen to the whole world not to part of world so we have another problem this a very interesting uh map from uh placing Canada of the legal and ethical polarism in the world almost dominated by three things so common law pran law and Muslim law this is the one the work here uh the Muslim law is more interesting in the sense that ethics everything is ethics and the law is a subset of Ethics why for CH because of I guess two lawyers in the US it's not like that when something's legal it's not already part of the ex which for me is like czy but that's how the history went uh so message for you the nor should learn from the south for example I'm not philos but I know a little bit about Ubuntu ubun says I am because we are and I think wienstein will be will really agree here because he says that the je thinking has to be in the context of more people I think it's the same idea said in different ways and there's a very nice uh essay Byam that says the scure wrong the person is a person through other person and I believe that and with Co we leave that the group should be more important than any single individual so to end there are no Virtual Worlds everything is a mirror of us uh internet is a huge amplifier of our good things and our bad things sadly today the rich profit from Ai and the poor suffering and I have many examples of that that didn't mention to be fair we need to be aware of our own biases and I have been working on biases for more than 10 years so I train myself to look for small details even in myself and if you notice something something please make me aware and also of your eics for example uh I think this is also with sign the wrong words are shaping out sentience Consciousness intelligence artificial okay I stop that uh ha Henderson asked can ever be ethical and it's not about only humans they shouldn't be ethical because this is a human TR but then David low ask the obvious question so no sorry says the obvious answer that we sometimes forgot that someone has to say we cannot have without without a today we have a problem with a right see the state of the world so the current affairs I'm not worrying I'm not worried about AI worry about every leader that we select this is like I said I will not talk about politics but then because otherwise this will happen this is the tting test okay and they're not laughing because they destroy us they're laughing because we destroy ourselves and this is the pity and if you have read Harari although although Harari really doesn't understand how much he Lear works I think he has some valid points but he's very very negative uh someone said that if you live in silon Valley you have to be optimistic so I'm a pragmatical realistic optimistic I tried so questions even B question
In this keynote lecture, Baeza-Yates explores how search engines and recommendation systems can unintentionally reinforce social biases and echo chambers. He explains why algorithms are never truly neutral, illustrates real-world impacts of biased ranking, and shares ways researchers and engineers can design fairer, more transparent systems. This talk connects his lifelong work in search algorithms with his more recent leadership in responsible AI.
Other Latin American computing pioneers:
Global voices in AI fairness:
Principles and organisations:
ACM (2025) ACM Principles for Responsible Algorithmic Systems. Available at: https://www.acm.org/.../final-joint-ai-statement-update.pdf (Accessed: 3 July 2025)
Baeza-Yates, R. A. and Ribeiro-Neto, B. (2011) Modern Information Retrieval: The Concepts and Technology Behind Search. 2nd edn. Boston: Addison-Wesley. ISBN: 9780321416919.
Baeza-Yates, R. A. (2021) Ethics in AI a Challenging Task. Available at: https://www.youtube.com/watch?v=vh1BRBKRwXo (Accessed: 3 July 2025)
dblp (2025) Ricardo Baeza-Yates. Available at: https://dblp.org/pid/b/RABaezaYates.html (Accessed: 3 July 2025)
Institute for Experiential AI (2025) Ricardo Baeza-Yates. Available at: https://ai.northeastern.edu/our-people/ricardo-baeza-yates (Accessed: 13 July 2025)
LilyOfTheWest (2018) Ricardo Baeza-Yates portrait. Available at: https://commons.wikimedia.org/wiki/File:Ricardo_Baeza-Yates_portrait.jpg (Accessed: 31 July 2025)
Northeastern University (2025) Profile: Ricardo Baeza-Yates. Available at: https://www.khoury.northeastern.edu/people/ricardo-baeza-yates/ (Accessed: 3 July 2025)
University of Waterloo (2025) Alumni PhD Directory. Available at: https://uwaterloo.ca (Accessed: 3 July 2025)