Transcript

I want to be slightly a little more technical in this part of my talk that’s not the shotgun algorithm, so I am going to go through one example that is not the shotgun algorithm to show a different kind of computational model and thinking that has influenced biology, and this is very recent work. So to do that you have to bear with me for a few slides because I am going to go into a slightly more technical part of my talk and I going to do a primer on this particular computational method, and its called model checking. I always in my talk at this point ask the audience … how many people have heard of model checking? I am surprised. After this slide you will all know what model checking is, at least in the abstract. OK, so let me go through it. There is a black box, it’s called the ‘model checker’ and it has two inputs. One is a finite state machine – what that means is that there are states and those are circles and there are transitions between the states, those are the state transitions, the lines, the arrows. And, what makes this a finite state machine is that there are a finite number of circles, finite number of states. OK, so that is one input and that is critical for this technique that it be finite. The second input is a property that we would normally write in English but, because computers can’t understand English well enough yet we write it in a more formal language. I picked this particular logic, its called Temporal Logic, it doesn’t matter, it is a formal language, a mathematical language that can reason about the ordering of events. So basically you have these two inputs, and what happens is, you describe the system of interest using this input and you describe a desirable property using this input. So you might want to say something like … when you go to the ATM machine, eventually I get my money out. So you would say … describe the ATM machine in terms of the finite state machine and all the kinds of things you could with it. Like; put your card in, punch buttons, deposit money, withdraw money and you would describe the property of ‘eventually getting my money out’ in terms of here and ideally for any ATM machine you would like to know that this property holds. OK ideally. That’s how we like to design our systems. Of course we know in reality there are always bugs. OK so those are the two inputs. The output is ‘yes’, every single ATM is designed, will always give your money back but, more realistically you will say ‘no’, here is a bug in this ATM machine, not only is there a bug but here is an example of a sequence of steps that you will take in terms of that machine where you are not going to get your money back. OK and that is called the Counter Example. So that is my primer on model checking. Basically it has two inputs, you press the button and out will say, yes, every single time you interact with the ATM machine this property will hold or, no, you will get into trouble if you do this sequence of events. OK? That is a particular computation model, computational concept, computational approach and it has been used in fact very successfully for hardware design, companies like Intel, HP, Fujitsu, IBM, they all use this kind of technique for verifying and debugging hardware/processor design. This is almost a ‘bread and butter’ kind of technique in computing. Let me … I’m going to skip that … for those of you who are interested in the formal definitions that is what it looks like. OK there are many efficient algorithms and many efficient data structures that allow us to push a button and get that black box answer. What does that have to do with biology? Well for the face of it absolutely nothing! It certainly wasn’t invented because there was a biological problem to solve. But it turns out that when you put computer scientists and biologists together or when you have a person trained in both biology and computer sciences sparks will fly and that is exactly what happened to that particular person who worked in this intersection of computer science and biology, particular model checking and protein folding and he represents a protein that has three residues in this particular case in terms of this very simplistic finite state machine where you could imagine where this state with three zeros (000) represents the folded state and this state with the three ones (111) represents the unfolded state and these numbers in parenthesis represent how much energy does it take to go from this state to this state, where you are unfolding one of the residues. OK, and so the question is what are all the paths, the ways in which this protein can unfold itself and what are the paths in which the protein can unfold itself using the least amount of energy. So now this is a very simple example, you can probably figure this out in your head, right, but suppose you had 76 residues? You couldn’t even write that out on a piece of paper! There is actually where the scale of computers come in handy because they can do that sort of calculation, no problem. So in fact they show that you can use this modelling checking technique to explore state spaces as large as 2 to the 76th using this kind of particular model-checking algorithm/model-checking approach, and this technique, it’s a pretty recent result, you see its 2007, with 14 orders of magnitude greater than comparable techniques. So this is already new stuff that we are talking about protein folding and so on but here is where you can see sparks fly when you bring computational methods to these kinds of biological problems. OK, so I promise this is the most technical part of my presentation. So that’s one discipline, many methods …