LESSON 2.7: POISONING ML SYSTEMS WITH NON-RANDOMNESS 

Sometimes bias is introduced into a machine learning system deliberately: an attacker tries to poison the ML model or plant a backdoor in the system. The most obvious ways to do this are to change the underlying dataset (by adding biased data) or the model architecture. But researchers have found that the same effect can be achieved merely by changing the order in which data are supplied to the model during training.

Imagine a company that wants a credit-scoring system that is secretly sexist, while the model architecture and the data look perfectly fair. The company could develop an ML model with high accuracy and collect a set of financial data that is highly representative of the whole population. But instead of shuffling the data randomly, they start the model's training on ten rich men and ten poor women from that set. This creates an initialisation bias, which then poisons the whole system.

Bias in ML is therefore not just a data problem; it can be introduced in very subtle ways. Because modern learning procedures are stochastic, the fairness of the resulting model also depends on randomness: a random number generator with a backdoor can undermine a neural network and secretly introduce bias into a model that otherwise looks fair. AI developers should therefore also pay attention to the training process itself and be especially careful about their assumptions about randomness.
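To make the mechanism concrete, here is a minimal sketch in Python. It is our own illustration, not the setup from the research: it uses NumPy and a tiny logistic regression trained with plain SGD, and all names and numbers in it (the feature construction, learning rate, number of steps) are hypothetical choices. It only shows that the order of training examples, with the data and the model held fixed, can steer the learned weights. For a convex model like this one the effect fades with long training, which is why the published attacks target deep networks; early training is exactly where the initialisation bias lives.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: a protected attribute (gender: -1 = woman,
# +1 = man) and a standardized income. The label "creditworthy" is
# generated from income only, so the dataset itself is fair.
n = 1000
gender = rng.choice([-1.0, 1.0], size=n)
income = rng.normal(0.0, 1.0, size=n)
y = (income + rng.normal(0.0, 0.3, size=n) > 0).astype(float)
X = np.column_stack([gender, income])

def sgd_logreg(X, y, order, lr=0.5, steps=200):
    """Plain SGD on logistic loss, visiting samples in `order`."""
    w = np.zeros(X.shape[1])
    for i in order[:steps]:
        p = 1.0 / (1.0 + np.exp(-X[i] @ w))
        w -= lr * (p - y[i]) * X[i]
    return w

# Honest training: a fresh random permutation of the data.
fair_order = rng.permutation(n)

# Poisoned training: the same data, but the first twenty samples are
# hand-picked "rich men" and "poor women", biasing the early gradient
# steps -- the initialisation bias from the lesson.
rich_men = np.where((gender == 1) & (y == 1))[0][:10]
poor_women = np.where((gender == -1) & (y == 0))[0][:10]
rest = np.setdiff1d(np.arange(n), np.concatenate([rich_men, poor_women]))
poisoned_order = np.concatenate([rich_men, poor_women, rng.permutation(rest)])

w_fair = sgd_logreg(X, y, fair_order)
w_poisoned = sgd_logreg(X, y, poisoned_order)

# Compare the weight the model puts on the protected attribute.
print("gender weight, shuffled order:", round(w_fair[0], 3))
print("gender weight, poisoned order:", round(w_poisoned[0], 3))

Comparing the two printed weights across a few random seeds gives a feel for how much the first batches alone can tilt the model toward the protected attribute, even though both runs see exactly the same data.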

Watch the following video lectures, which present additional real-life examples of bias in AI:

Thinking critically about digital data collection: Twitter and beyond (duration 0:37:22) 

The alluring promise of objectivity: Big data in criminal justice (duration 0:25:20) 

Beyond the headlines: How to make the best of machine learning models in the wild (duration 1:03:43) 

Logics and practices of transparency and opacity in real-world applications of public sector machine learning (duration 0:17:49)

Exploring Racial Bias in Classifiers for Face Recognition (duration 0:12:26) 

Does Gender Matter in the News? Detecting and Examining Gender Bias in News Articles (duration 0:16:34) 

Bias Issues and Solutions in Recommender System (duration 0:57:29) 

Mitigating Demographic Biases in Social Media-Based Recommender Systems (duration 0:16:10) 

Gender Bias in Fake News: An Analysis (duration 0:31:24) 

Never Too Late to Learn: Regularizing Gender Bias in Coreference Resolution (duration 0:11:05) 

Auditing for Bias in Algorithms Delivering Job Ads (duration 0:14:39) 

Mitigating Gender Bias in Captioning Systems (duration 0:14:52) 

Understanding the Impact of Geographical Bias on News Sentiment: A Case Study on London and Rio Olympics (duration 0:11:57) 

Discriminative Bias for Learning Probabilistic Sentential Decision Diagrams (duration 0:10:58)


Congratulations on completing Module 2! Now you can deepen your knowledge in Quiz 2. Keep up the good work!