3.1 Machine learning
Machine learning is a technique for automatically finding patterns in large amounts of data. Watch the following video extract in which Wing discusses several applications of machine learning.
Activity 10 Machine learning
Transcript: Machine learning
So let’s flip it. How about one method that has influenced many disciplines? So I actually have to call on any computer scientists in the audience to help me answer this one, because I doubt the general public would know the answer. But I know there are some computer scientists in the audience so I challenge you. What method do you think has influenced so many disciplines today? – conceptional modelling – Of course I have the answer I want you to say. The answer is ‘machine learning’. How many people have heard of ‘machine learning’? More people have heard about ‘machine learning’ than ‘model checking’ so that’s good. So machine learning has completely transformed the whole field of statistics. In fact there is a whole department of machine learning at Carnegie Mellon, if you can believe that, within the School of Computer Science, and it’s made up of faculty from Computer Science and Statistics, and what they really do is Statistical Machine Learning. Another example is that there are departments of statistics now at universities, for instance Purdue, which has an excellent Statistics department, who are hiring computer scientists because they see their future as in this combination of statistics and computing, particularly in Statistical Machine Learning. So we are seeing this in the next generation, if you will. So let me give you some examples of where machine learning has influenced many, many fields. So in the sciences it is a technique that has been used to discover new brown dwarfs and fossil galaxies. These are really new scientific discoveries because of this particular technique. In medicine it has been used for discovering and inventing these kinds of drugs and uses in these applications. In meteorology: for tornado formation. In the neurosciences: for understanding the brain. So I should at least give you a one-sentence definition of Machine Learning, given that many people may not know it.
What it is, it’s a technique that allows you to analyse huge data sets, large amounts of data, and find patterns and clusters in large amounts of data. So in this particular case what you feed this algorithm is lots and lots of fMRI scans (scans of your brain). What they are able to do through using Machine Learning is to find out what part of your brain lights up when a subject sees a noun versus a verb, or this kind of adjective versus that kind of adjective, and it’s looking at lots and lots of fMRI scans that allows you to see those clusters, those patterns. Machine learning has been used beyond science and engineering, so for instance it is used in detecting credit card fraud. It’s used on Wall Street (the answer from the back of the room). When you go to the supermarket and you hand the clerk your Safeway card, your affinity cards, they are tracking your purchases, and the coupon you get out after your receipt is using that kind of analysis of large amounts of data. Recommendation Systems and Reputation Systems like Netflix, when you go to Travelocity to find out what customers use, and so on. Machine learning is even used in sports, so I don’t know what basketball player this is, but maybe you do, but what they did was videotape lots of professional basketball players and then the coaches would use Machine Learning to find out what are the skills of these professionals so they can teach those skills to their own students. This is Lance Armstrong, who used Machine Learning to analyse the kind of data he kept of himself. As you know, he is a machine – and he was quite mathematical and analytical in his training so that he could really hone his skills.
Wing gives the example of basketball coaches who use machine learning to find out which skills distinguish good players. Once they know this, they can teach those skills to their own players. Machine learning makes this possible by automatically finding patterns in the behaviour of professional players from a large collection of video recordings.
Describe this use of machine learning in terms of abstraction as modelling.
Discussion
One can view this use of machine learning as an attempt to automatically build a model of a good basketball player. Such a model is an abstraction that needs to capture the skills that distinguish a good player and ignores anything else. At a different level, machine learning is itself based on a model of learning by humans or animals. Being an abstraction, it ignores many of the details of human and animal learning, while preserving some key properties (in particular, the idea that one can learn from examples).
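To make the idea of 'finding patterns and clusters' in data more concrete, the following is a minimal sketch of one simple clustering method, k-means, written in Python. It is not the method used in any of the applications Wing mentions. The function name `k_means`, the sample measurements and the starting centroids are all assumptions made up for illustration.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def k_means(points, centroids, iterations=10):
    """Group points around the nearest of the given centroids.

    Returns the final centroids and the clusters produced by the
    last assignment step.
    """
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for old, cluster in zip(centroids, clusters):
            if cluster:
                mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep a centroid that attracted no points
        if new_centroids == centroids:
            break  # nothing moved, so the grouping is stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical measurements: two numbers recorded for each of six players.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

# Starting guesses for the two group centres (also made up).
centroids, clusters = k_means(points, [(0.0, 0.0), (10.0, 10.0)])

for centre, members in zip(centroids, clusters):
    print(centre, members)
```

The program is never told which players belong together. It only receives the measurements and two starting guesses, and the grouping emerges from the data. In that sense the resulting clusters are a small model of the data that ignores everything except the two measured quantities.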
Activity 11 Chemistry, physics, economics …
Watch the following video.
Transcript: Chemistry, physics, economics...
In chemistry, computational methods, computational concepts, have been used to actually invent new molecular structures that would have desired chemical properties. In physics (this is one of my favourite examples), I am going to labour on this a little bit, in quantum computing. There is a very young professor at MIT who works in quantum computing; he’s actually in the Computer Science department but he works as a physicist. In computer science, there is a kind of quantum computer where you can represent the problem in terms of a, let’s call it a graph, but you can solve the problem if you can morph this graph into some other representation, some other graph. You can solve it more easily in this second representation, and so the idea is then how can you get this graph to be morphed into this graph. There is a procedure that you can use that would allow you to make manipulations of each graph until you finally come in the middle, and have that intermediate. Physicists know about this kind of quantum computer. So, the question that quantum scientists ask is: how fast can you do this? This is something that computer scientists ask all the time, how fast can you do it, right? I already said that is the question they always ask about efficiency in terms of space and time. The physicist never asks that question of themselves; they just know that there exists some path such that you can do that, but they didn’t ask, how fast can you do it? Now again, for the computer scientists in the audience, the reason this was such a tantalising question to answer is that, if it turns out that the convergence is polynomial, it means that you might have an answer to the P=NP question. So you can see why this young computer scientist at MIT would very much like to know the answer to that convergence question, and it turns out that it goes really, really, really fast until this very small window where it is exponential.
Even so, it was a very interesting question to pose, and now he tells me that the physicists are going around asking ‘how fast?’ on all their convergence questions. This to me is a deep way of computational thinking influencing other people’s thinking, because you ask these basic questions that we would ask, and had the answer been polynomial then wow, what a breakthrough in computer science. We see it in Mathematics, we see it in Engineering and I think we are going to see more of it in all other fields. In society, like Economics, right now, in fact tomorrow and Friday, there is a workshop at Cornell on computer science and economics that an assessor is helping to sponsor, and we are seeing a lot of the theories of economics and models of economics. In fact, we already know that some of them are not applying to our real world, but they are certainly not necessarily applicable to, say, the internet and internet economics, and so there is all of a sudden this new interest, completely new interest, in revisiting economic models and theories in terms of computing models and vice versa. So for instance we are very much using game theoretic models in looking at lots of problems in computer science, game theory from economics, but many of the direct relationships between economics and computer science have to do with things like ad placement and keyword auctions and so on. So for instance when you type into Google and you do a search, maybe you realise to the right of your search you have all these ads; well, companies actually bid on where they want their company to be listed in that list of ads. You pay a lot to be listed number 1. But maybe it doesn’t matter to be listed number 1 or 2; you pay a little less to be listed number 2, you are still number 2, you are up there, right, so that’s the kind of game theoretic thinking that people are doing in computer science. In Law we are seeing computational thinking, and in the Humanities.
Again, in the sense of digging into the data, the challenge is analysing lots and lots of data.
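The keyword auction Wing describes, in which companies bid for a position in a list of ads, can be sketched in a few lines of Python. The talk does not say how prices are set, so this sketch assumes a simplified 'second-price' rule in which each winner pays the bid of the advertiser ranked just below. The function name `assign_ad_slots`, the company names and the bid amounts are all hypothetical.

```python
def assign_ad_slots(bids, slots):
    """Rank advertisers by bid and fill the available ad positions.

    Assumption for illustration only: each winner pays the bid of the
    advertiser ranked immediately below (0.0 if there is none).
    """
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    results = []
    for position in range(min(slots, len(ranked))):
        name, _own_bid = ranked[position]
        if position + 1 < len(ranked):
            price = ranked[position + 1][1]  # the next-highest bid
        else:
            price = 0.0
        results.append((position + 1, name, price))
    return results


# Made-up bids from three made-up companies competing for two ad positions.
bids = {"acme": 2.50, "globex": 1.75, "initech": 0.90}
placements = assign_ad_slots(bids, slots=2)

for position, name, price in placements:
    print(position, name, price)
```

Under this assumed rule the company in position 2 pays less than the company in position 1, which is the trade-off Wing points to: an advertiser has to weigh how much the top position is worth against the cheaper position below it.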