
Proteins: Weird blobs that do important things

Episode Transcript

1 00:00:00,320 --> 00:00:04,640 In today's episode, we're going to talk  about the wonderful world of proteins.   2 00:00:04,640 --> 00:00:08,000 Proteins are all around our body.  We use them in our daily lives,   3 00:00:08,000 --> 00:00:10,800 and they do amazing things to keep us going.   4 00:00:10,800 --> 00:00:17,120 Protein design just won a Nobel Prize  and we are going to do a mini-series of   5 00:00:17,120 --> 00:00:23,920 episodes here to talk about AlphaFold and  other AI systems used to design proteins,   6 00:00:23,920 --> 00:00:29,520 whether people can increasingly design dangerous  proteins, not just medicines, and whether protein   7 00:00:29,520 --> 00:00:36,720 design can help us get cures for some of the  toughest diseases that still plague humanity.   8 00:00:39,600 --> 00:00:45,280 But first, let's start with the basics. You might  remember being in high school biology and seeing   9 00:00:45,280 --> 00:00:51,440 a simple diagram of a cell. It probably looked a  bit like a fried egg or a sunny side up. There was   10 00:00:51,440 --> 00:00:56,080 the nucleus, which was a bit like the egg yolk.  And then there were a few other things scattered   11 00:00:56,080 --> 00:01:01,600 around, like mitochondria and ribosomes,  but that was a massive simplification.   12 00:01:01,600 --> 00:01:07,840 In reality, cells are incredibly busy. There  are billions of molecules in every cell,   13 00:01:07,840 --> 00:01:13,680 including loads of proteins, which have different  functions. So let me just think about what are   14 00:01:13,680 --> 00:01:18,400 the different things that the proteins are  doing? Well, there are structural proteins;   15 00:01:18,400 --> 00:01:23,920 they provide shape and strength to cells.  There are storage proteins; they store   16 00:01:23,920 --> 00:01:28,640 little molecules. There are signalling proteins  that help cells communicate with each other.   
17 00:01:28,640 --> 00:01:34,240 So insulin, for example, is a hormone, and  it's made in the pancreas and it tells cells   18 00:01:34,240 --> 00:01:39,920 to take up glucose from the bloodstream,  and that lowers blood sugar after eating.   19 00:01:39,920 --> 00:01:45,600 There are also transport proteins that  move molecules between cells. Haemoglobin,   20 00:01:45,600 --> 00:01:50,640 for example, is a protein in red blood cells  that binds to oxygen and carries it around in   21 00:01:50,640 --> 00:01:56,320 the blood. There are also enzymes — enzymes speed  up chemical reactions in our body, by lowering   22 00:01:56,320 --> 00:02:02,720 the activation energy needed for them. There are  regulatory proteins that control other proteins   23 00:02:02,720 --> 00:02:08,800 and pathways. And there are defence proteins  that protect us from attack; so antibodies are   24 00:02:08,800 --> 00:02:13,760 a type of protein. Snakes and spiders have  venoms, which are proteins that help them   25 00:02:13,760 --> 00:02:20,640 disable their threats. There are so many different  types of jobs that a protein might have, and many   26 00:02:20,640 --> 00:02:26,320 proteins have multiple jobs at the same time. And this means that this basic diagram view,   27 00:02:26,320 --> 00:02:32,640 that you might've had of a cell, was quite  simple. In reality, the cell is extremely   28 00:02:32,640 --> 00:02:39,840 busy. It's more like a bustling city, and there  are literally billions of molecules, proteins,   29 00:02:39,840 --> 00:02:46,560 DNA, RNA, fats, sugars, and ions — all moving  around, reacting and interacting with each other.   30 00:02:46,560 --> 00:02:53,520 Every part of the cell has its own job and it's  a bit like different districts in the city.   31 00:02:53,520 --> 00:02:59,840 There's a great blog post by Niko McCarty where he  describes this, and I thought it would be helpful   32 00:02:59,840 --> 00:03:06,480 just to have a sense of what's going on. 
He says,  "A microbe's guts are a veritable Times Square,   33 00:03:06,480 --> 00:03:12,080 crowded with sugars, proteins, and water molecules  that ricochet and smash into each other billions   34 00:03:12,080 --> 00:03:19,040 of times each second. Space is limited. A  bacterium's insides are 70% water by mass;   35 00:03:19,040 --> 00:03:25,760 the other 30% is dominated by proteins first,  followed by RNA and lipids. DNA accounts for   36 00:03:25,760 --> 00:03:34,080 just 1%. And all of this stuff fits inside a  volume that is one quadrillionth of a litre."   37 00:03:34,080 --> 00:03:37,920 That's a lot of proteins and I  can't even see one of them.   38 00:03:37,920 --> 00:03:46,160 Right? They're so small. And so if you think  of this city, of each cell, the nucleus is   39 00:03:46,160 --> 00:03:51,600 something like the city hall, it's managing the  information; it has instructions for what should   40 00:03:51,600 --> 00:03:57,840 happen. There are mitochondria; the power stations  of the cell. There are ribosomes that construct   41 00:03:57,840 --> 00:04:02,240 new proteins. And then there are proteins, that  are the workers and the machines of the city,   42 00:04:02,240 --> 00:04:07,120 but they're also the structural components and the  signalling molecules and all of these things.   43 00:04:07,120 --> 00:04:11,600 Our body is doing so much  with all of those proteins.   44 00:04:11,600 --> 00:04:19,520 Are proteins used outside of the body too? They are! In fact, if you've done any cooking,   45 00:04:19,520 --> 00:04:25,520 you would know, for example, that chemical  reactions change the proteins that you're cooking   46 00:04:25,520 --> 00:04:32,720 with. So, for example, if you cook an egg white,  it becomes firm when it's cooked. 
That's because   47 00:04:32,720 --> 00:04:38,880 the heat denatures the proteins — it makes them  unfold — and then it makes them coagulate into a   48 00:04:38,880 --> 00:04:44,400 different kind of mesh, and that makes it opaque. There's also gluten, which is a protein that gives   49 00:04:44,400 --> 00:04:50,160 bread its stretchy texture — that's made of two  proteins. There are also lots of proteins that   50 00:04:50,160 --> 00:04:57,360 are used in industry and biotechnology. If you've  done your laundry recently, you might have used   51 00:04:57,360 --> 00:05:04,880 a detergent that was made of enzymes, and the  enzymes break down stains, like fat or blood.   52 00:05:04,880 --> 00:05:10,240 Then there are a bunch of proteins that are used  in baking and brewing and textile manufacturing.   53 00:05:10,240 --> 00:05:14,720 Of course there are lots of proteins that are  used in medicine as well. So I mentioned that   54 00:05:14,720 --> 00:05:21,600 antibodies are a type of protein, and lots of  medicines are types of antibodies. There's also   55 00:05:21,600 --> 00:05:29,680 insulin, which people use in diabetes; it's  a protein that is also a therapeutic drug.   56 00:05:29,680 --> 00:05:37,120 What actually are proteins? What do  they look like and how do they form?   57 00:05:37,120 --> 00:05:45,680 Proteins are long chains of amino acids. You  can sort of think of that as like beads on a   58 00:05:45,680 --> 00:05:53,200 string. And then that string, or that chain, is  folded into some kind of 3D shape. The string   59 00:05:53,200 --> 00:06:00,720 is the protein's backbone, and each bead is an  amino acid. Each amino acid has unique features.   
60 00:06:00,720 --> 00:06:06,240 So as this string falls into a structure, you  can kind of imagine that maybe happening at a   61 00:06:06,240 --> 00:06:12,640 small scale — maybe there's like a little helix  of the string in some place, or maybe there are   62 00:06:12,640 --> 00:06:18,320 two parallel strings next to each other. But  imagine that... we have to kind of zoom out   63 00:06:18,320 --> 00:06:25,200 and this whole 3D shape of the protein could also  be connected to another protein; it could be two   64 00:06:25,200 --> 00:06:33,200 proteins together, making a protein complex. How is that made? I know I eat some protein,   65 00:06:33,200 --> 00:06:38,880 but I think we make some too. That's right. So you have lots of   66 00:06:38,880 --> 00:06:46,320 DNA in your cells, and the DNA, which is the  code of life, is the instructions for which   67 00:06:46,320 --> 00:06:52,320 proteins to make and how they should look.  The DNA is transcribed into RNA, which is   68 00:06:52,960 --> 00:07:00,800 typically this temporary molecule, and then the  RNA is then translated into protein by ribosomes.   69 00:07:01,840 --> 00:07:10,240 They sort of form one-by-one into this chain, and  then rapidly fold into a much bigger structure.   70 00:07:10,800 --> 00:07:16,640 This was kind of interesting to me because when  I was reading this, I was thinking, okay, how did   71 00:07:16,640 --> 00:07:23,520 the first protein that was ever discovered look?  What did people think when they first saw it?   72 00:07:24,080 --> 00:07:32,960 And that was fascinating because the first protein  whose structure was determined was in 1958,   73 00:07:32,960 --> 00:07:40,560 and that was myoglobin. This was determined  by John Kendrew, a British scientist. 
When he   74 00:07:40,560 --> 00:07:47,360 discovered this, it was only four years after  the discovery of DNA's structure — DNA is of   75 00:07:47,360 --> 00:07:55,840 course very beautiful; it has this symmetrical  structure, of this helix. And he was really   76 00:07:55,840 --> 00:08:01,920 disappointed when he figured out what myoglobin  looked like. He wrote in this paper: "Perhaps the   77 00:08:01,920 --> 00:08:07,040 most remarkable features of the molecule are  its complexity and its lack of symmetry."   78 00:08:07,040 --> 00:08:13,520 Oh no, it's ugly. But in hindsight, the irregularity is exactly what   79 00:08:13,520 --> 00:08:23,360 makes proteins so powerful. It's not really like  DNA, which has this kind of linear messaging — it   80 00:08:23,360 --> 00:08:30,480 has the code, and then the code just linearly  turns into RNA. But a protein is actually doing   81 00:08:30,480 --> 00:08:37,600 multiple things. It's in the cell being bombarded  sometimes with lots of different molecules,   82 00:08:37,600 --> 00:08:43,600 and it needs to be able to recognise these  different shapes and structures, and sometimes,   83 00:08:43,600 --> 00:08:50,720 it has multiple functions — and this function  of every protein depends on that 3D structure.   84 00:08:51,760 --> 00:08:57,360 The folded shape means that there are like  little pockets, grooves and surfaces that   85 00:08:57,360 --> 00:09:04,240 the protein uses to bind to other molecules,  or carry out specific chemical reactions,   86 00:09:04,240 --> 00:09:10,960 or even receive signals and then change shape in  response. That means the same protein molecule   87 00:09:10,960 --> 00:09:16,160 might be doing multiple things at once. It could  be doing a chemical reaction, but also binding   88 00:09:16,160 --> 00:09:21,360 to something else, and then when it gets some  regulatory signal, it could be changing shape and   89 00:09:21,360 --> 00:09:28,240 stopping that chemical reaction from happening. 
So there's benefits to being a weird blob. There's   90 00:09:28,240 --> 00:09:39,440 nothing wrong with being a weird blob. I thought it would be fun if we both share   91 00:09:39,440 --> 00:09:48,240 some fun facts about proteins. I found these from  the book Biology by the Numbers, which is a great   92 00:09:48,240 --> 00:09:56,400 textbook, and it's also free online. The authors  create these rough estimates and pull together key   93 00:09:56,400 --> 00:10:03,920 numbers on lots of different things related to  cell biology. Some of them are rough estimates,   94 00:10:03,920 --> 00:10:09,120 but they're kind of our best guess right now. Hit me.   95 00:10:09,120 --> 00:10:15,520 Alright, first one, how many  proteins are in a human cell?   96 00:10:15,520 --> 00:10:21,600 They're busy, so I'm going to guess a lot.  And I'm going to guess it depends on the cell,   97 00:10:21,600 --> 00:10:30,320 but I will go with a hundred million. That is a lot, and it does depend on   98 00:10:30,320 --> 00:10:38,480 the cell. But the estimate for the average  number is ten billion proteins per cell.   99 00:10:38,480 --> 00:10:44,320 Oh no. Two orders of magnitude wrong, not  a good start. Okay, well, I've got one.   100 00:10:47,040 --> 00:10:54,320 Which is bigger: the protein or the  mRNA that codes for the protein?   101 00:10:54,320 --> 00:11:04,873 Um... surely the protein is bigger, no? Why would  the instructions be bigger than the protein?   102 00:11:04,873 --> 00:11:10,480 That's what I always think, and it's the other  way around. So the mRNA is bigger — you look at   103 00:11:10,480 --> 00:11:18,720 them side by side - well, images of 'em - and  the mRNA is like 10 times bigger. Because each   104 00:11:18,720 --> 00:11:23,840 amino acid is coded for by three nucleotides,  and the nucleotides themselves are bigger and   105 00:11:23,840 --> 00:11:31,520 heavier. 
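[Editor's sketch] The size comparison above can be checked with rough arithmetic. The sketch below assumes average masses of about 110 daltons per amino acid and about 330 daltons per RNA nucleotide. Those two figures are common textbook approximations, not numbers stated in the episode, and the function names are hypothetical. It counts only the coding region, so a real mRNA with its untranslated regions would be larger still.

```python
# Rough mass comparison: a protein versus the mRNA coding region that encodes it.
# ASSUMPTIONS (not from the episode): average residue masses below are
# textbook-style approximations used only for an order-of-magnitude check.
AVG_AMINO_ACID_DA = 110.0   # assumed average mass of one amino acid residue
AVG_NUCLEOTIDE_DA = 330.0   # assumed average mass of one RNA nucleotide
NUCLEOTIDES_PER_CODON = 3   # stated in the episode: three nucleotides per amino acid


def protein_mass_da(n_amino_acids: int) -> float:
    """Approximate protein mass in daltons."""
    return n_amino_acids * AVG_AMINO_ACID_DA


def coding_mrna_mass_da(n_amino_acids: int) -> float:
    """Approximate mass of the coding region only (no UTRs, no stop codon)."""
    return n_amino_acids * NUCLEOTIDES_PER_CODON * AVG_NUCLEOTIDE_DA


def mrna_to_protein_mass_ratio(n_amino_acids: int) -> float:
    """How many times heavier the coding mRNA is than its protein."""
    return coding_mrna_mass_da(n_amino_acids) / protein_mass_da(n_amino_acids)


if __name__ == "__main__":
    n = 300  # hypothetical protein length
    print(f"protein: {protein_mass_da(n):,.0f} Da")
    print(f"coding mRNA: {coding_mrna_mass_da(n):,.0f} Da")
    print(f"ratio: {mrna_to_protein_mass_ratio(n):.1f}x")
```

Under these assumed masses the ratio does not depend on the protein's length, and it comes out close to the "10 times bigger" figure mentioned in the conversation.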
So it's counterintuitive to me,  but you know, it makes sense, I guess,   106 00:11:31,520 --> 00:11:36,480 when you think about it physically. That does make sense... well,   107 00:11:36,480 --> 00:11:39,600 I don't know if that makes sense. I feel  like I need to think about this more.   108 00:11:39,600 --> 00:11:44,160 Yeah, it doesn't make sense from a computer  science point of view, but from a physical point   109 00:11:44,160 --> 00:11:48,240 of view it feels like, yeah. Right.   110 00:11:48,240 --> 00:11:54,080 I have one. So, you know, as a small person,  I wanted to find out which protein was the   111 00:11:54,080 --> 00:11:59,040 smallest. Do you have any guesses? The protein that's the smallest? Well,   112 00:11:59,040 --> 00:12:05,840 the definition of a protein... I wonder if I'm  allowed to have- it's got to have at least two   113 00:12:05,840 --> 00:12:12,000 amino acids, so I know it's not going to be  less than two, but that probably wouldn't   114 00:12:12,000 --> 00:12:16,320 count as a protein because it wouldn't fold  into anything, wouldn't have much function.   115 00:12:16,320 --> 00:12:23,920 So I'm going to guess philosophically,  two, and then, literally, more than two.   116 00:12:23,920 --> 00:12:31,920 Well, you're right. I think the typical definition  of a protein is something that floats on its own   117 00:12:31,920 --> 00:12:37,920 in water and can fold into a stable shape. If  you use that definition, then the smallest ones   118 00:12:37,920 --> 00:12:44,080 are some 20 to 30 amino acids long. There are  actually lots of really tiny proteins, and these   119 00:12:44,080 --> 00:12:51,680 tiny proteins are called "microproteins", and  they're less than a hundred amino acids or so. One   120 00:12:51,680 --> 00:13:01,600 example that's actually even smaller than 20 or 30  is somatostatin, which is a hormone that controls   121 00:13:01,600 --> 00:13:07,680 other hormones, so it controls growth hormone and  insulin. 
And that's only 14 amino acids long.   122 00:13:07,680 --> 00:13:12,640 Oh wow, it's that small. Oh okay. Right. It still has a stable shape,   123 00:13:12,640 --> 00:13:18,320 because parts of the chain are connected to each  other. So it's not considered a typical protein,   124 00:13:18,320 --> 00:13:23,680 but it's a peptide and it's very small. Got it, okay. What's the biggest? I think you   125 00:13:23,680 --> 00:13:28,160 know the answer to this one. I think I do. Is it titin?   126 00:13:28,160 --> 00:13:32,640 It's titin. That's the biggest human protein at  least, I don't know outside of humans. But that   127 00:13:32,640 --> 00:13:37,840 one is 33,000 amino acids long. I got one. What's the most   128 00:13:37,840 --> 00:13:43,120 abundant protein on earth? I am going to guess it has something   129 00:13:43,120 --> 00:13:49,040 to do with photosynthesis, because that seems  like one of the biggest functions on earth.   130 00:13:49,040 --> 00:13:56,480 Very good guess. So it's kind of a tie, and  we're not really sure which one is more abundant,   131 00:13:56,480 --> 00:13:59,200 so that was a bit of a trick question. Oh wow.   132 00:13:59,200 --> 00:14:08,080 But one of them is RuBisCO, and that is used in  photosynthesis; it's used to grab carbon from the   133 00:14:08,080 --> 00:14:17,040 air and turn it into useful organic material. And  that's used by all photosynthetic organisms. And   134 00:14:17,040 --> 00:14:22,720 scientists estimate that there are about five  kilogrammes of RuBisCO per person on earth.   135 00:14:22,720 --> 00:14:28,880 Oh my god. What?! Wow. I guess there are a lot of plants.   136 00:14:28,880 --> 00:14:32,640 Yeah, fair enough. They're winning.  They're winning... for now...   137 00:14:32,640 --> 00:14:37,040 There's actually the second, which  might be ahead. We're not sure-   138 00:14:37,040 --> 00:14:42,480 Oh right. -and that is collagen. 
That is   139 00:14:42,480 --> 00:14:51,040 used as a kind of structural protein, and it makes  up about 30% of the protein mass in your body — so   140 00:14:51,040 --> 00:14:57,280 about three kilogrammes of collagen per person.  But it's not just humans that have collagen,   141 00:14:57,280 --> 00:15:05,920 it's also the livestock and all animals. That  means there's- well, the total number- the total   142 00:15:05,920 --> 00:15:12,320 mass of livestock is also enormous, right? And so  this means there's roughly four to six kilogrammes   143 00:15:12,320 --> 00:15:17,520 of collagen per person on earth. Ready for another fun fact?   144 00:15:17,520 --> 00:15:20,240 Yes. Well, enzymes are a   145 00:15:20,240 --> 00:15:28,480 type of protein that speed up reactions... so how  much do you think enzymes speed up reactions?   146 00:15:28,480 --> 00:15:37,520 Mmm... a thousand times, maybe? Two thousand?  I feel like... a lot. But I don't know.   147 00:15:37,520 --> 00:15:45,920 A lot. A lot. And I bet some do a thousand, but  if you're really looking at the best of the best,   148 00:15:45,920 --> 00:15:54,400 we're talking billions of times, and possibly  trillions of times, so we're talking millions   149 00:15:54,400 --> 00:16:01,040 of reactions per second per enzyme in some  cases, and just totally changing what is   150 00:16:01,040 --> 00:16:06,640 happening at the molecular level. That's crazy. That means, I guess,   151 00:16:06,640 --> 00:16:10,640 some reactions just wouldn't happen  if the enzymes weren't there.   152 00:16:10,640 --> 00:16:14,000 Oh, absolutely. Yeah. I mean,  statistically speaking, yeah.   153 00:16:14,000 --> 00:16:18,720 So we were talking about protein folding  the other day, and I was thinking: well,   154 00:16:18,720 --> 00:16:22,480 how fast do proteins fold into  shape? Do you have any guesses?   155 00:16:22,480 --> 00:16:25,758 Oh... 
that is a tough one because, well, we just  had a very long protein that took forever, but   156 00:16:25,758 --> 00:16:29,680 I bet most proteins don't take long at all. The  folding has to happen quickly, otherwise they'll   157 00:16:29,680 --> 00:16:40,560 get distracted by other forces. So I will go with  tenths of seconds, no, hundredths of seconds.   158 00:16:40,560 --> 00:16:44,480 Pretty close. So, on average,  proteins fold in milliseconds,   159 00:16:44,480 --> 00:16:52,800 but some proteins fold really quickly, in  microseconds, which are millionths of a second. And   160 00:16:52,800 --> 00:16:57,440 I guess you're right that it really does have  to happen fast, because there's so much other   161 00:16:57,440 --> 00:17:03,280 stuff going on in the cell. It could just be  bombarded with something else before it folds.   162 00:17:03,280 --> 00:17:04,868 Yeah, well, no fun. One final one  from me. How quick do they move? Let's   163 00:17:04,868 --> 00:17:04,941 say you're in a cell. How quick does  the protein move across the cell?   164 00:17:04,941 --> 00:17:05,035 I love the idea that I've shrunk myself to the  size that I can fit inside a cell. And now I'm   165 00:17:05,035 --> 00:17:05,119 trying to race with these little proteins.  To get across a cell... uh... I dunno. A   166 00:17:05,119 --> 00:17:05,200 second? Maybe half a second? I dunno. A small protein could be 10 milliseconds   167 00:17:05,200 --> 00:17:05,300 to get across a cell. The thing, though, is that  cells are small. So if you haven't shrunk yourself   168 00:17:05,300 --> 00:17:05,385 all the way down, and are just visualising  the human scale, how long would it take a   169 00:17:05,385 --> 00:17:05,479 protein to move a whole centimetre? Well, then  you'd need 20 days for some of the proteins.   
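[Editor's sketch] The jump from milliseconds to days follows from how diffusion scales: the time grows with the square of the distance. The sketch below uses the standard three-dimensional estimate t = x^2 / (6 D). The diffusion coefficient of 10 square micrometres per second and the 1 micrometre (bacterium-sized) cell are assumptions chosen for illustration; neither value is stated in the episode, and the function name is hypothetical.

```python
# Characteristic diffusion time for a protein, t = x^2 / (6 * D).
# ASSUMPTIONS (not from the episode): D = 10 um^2/s for a small protein in
# cytoplasm, and a cell about 1 um across (roughly bacterium-sized).
SECONDS_PER_DAY = 86_400.0
ASSUMED_D_UM2_PER_S = 10.0
ASSUMED_CELL_WIDTH_UM = 1.0
ONE_CENTIMETRE_UM = 10_000.0


def diffusion_time_s(distance_um: float,
                     d_um2_per_s: float = ASSUMED_D_UM2_PER_S) -> float:
    """Rough time in seconds to diffuse a given distance in three dimensions."""
    return distance_um ** 2 / (6.0 * d_um2_per_s)


across_cell_ms = diffusion_time_s(ASSUMED_CELL_WIDTH_UM) * 1000.0
across_cm_days = diffusion_time_s(ONE_CENTIMETRE_UM) / SECONDS_PER_DAY

if __name__ == "__main__":
    print(f"across a 1 um cell: about {across_cell_ms:.0f} ms")
    print(f"across 1 cm: about {across_cm_days:.0f} days")
```

With these assumed values the estimate lands in the tens of milliseconds for the cell and at roughly 19 days for the centimetre, the same order as the figures quoted in the conversation. The distance is 10,000 times longer, so the time is 100 million times longer.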
170 00:17:05,479 --> 00:17:05,559 Well, so at first I thought you said -  okay, that's quite fast - they're taking   171 00:17:05,559 --> 00:17:05,646 10 milliseconds to cross the cell. But 20  days to travel one centimetre is quite slow,   172 00:17:05,646 --> 00:17:05,714 I could do that much faster. Yeah, I think you're going to win.   173 00:17:05,714 --> 00:17:10,560 ... but maybe not if I'm shrunk to that size.  Okay, I got another one. How fast are enzymes   174 00:17:10,560 --> 00:17:15,600 colliding with other molecules in the cell? Or  how many collisions are there per second?   175 00:17:15,600 --> 00:17:20,640 Okay. I have the sense that things are  just crazy up in there and everyone's   176 00:17:20,640 --> 00:17:26,720 sort of bumping around. So I'm going to  say a thousand collisions a second.   177 00:17:26,720 --> 00:17:33,600 Well, you were right with the idea. Oh no, I should have just said "A lot."   178 00:17:33,600 --> 00:17:42,640 But I think the estimate is 500,000 molecules  are colliding with an enzyme per second.   179 00:17:42,640 --> 00:17:47,920 Wow. And that's a lot! And that makes me think that   180 00:17:47,920 --> 00:17:54,000 proteins have to be really specific in how they  bind to their targets. It's like, you know, if   181 00:17:54,000 --> 00:18:00,720 you're at a really crowded party and you're trying  to find a friend, you would just bump into so many   182 00:18:00,720 --> 00:18:05,600 people before you actually find your friend. So  you have to actually be able to recognise them   183 00:18:05,600 --> 00:18:12,160 among the 500,000 random strangers around you. Yep. That's tricky. Okay, Saloni,   184 00:18:12,160 --> 00:18:18,000 what's your favourite protein? My favourite protein is tubulin. It's part   185 00:18:18,000 --> 00:18:25,680 of microtubules. The microtubules are kinda the  skeletons of your cells... That sounds a bit grim,   186 00:18:25,680 --> 00:18:33,840 actually. 
But they are basically formed of these  hollow tubes that are made of this protein,   187 00:18:33,840 --> 00:18:41,040 and each of the little structures is kind of  like a tiny corn kernel. That tube can sort of   188 00:18:41,040 --> 00:18:47,920 assemble and disassemble in response to signals,  and that means that the entire skeleton can kind   189 00:18:47,920 --> 00:18:54,480 of assemble and disassemble... which means the  whole cell can change its shape or its size and   190 00:18:54,480 --> 00:19:02,880 move around, because of these microtubules.  The microtubules also act as tracks to move   191 00:19:02,880 --> 00:19:09,360 things around, so they're a bit like a cellular  railway or something, which I think is just super   192 00:19:09,360 --> 00:19:14,720 cool. And I remember learning about this in  my undergrad and just seeing some diagrams   193 00:19:14,720 --> 00:19:18,320 and thinking, wow, that's amazing. That's a good one. I have an even   194 00:19:18,320 --> 00:19:25,680 better one though, which is gluten  in bread! Woo! I'm a bread guy.   195 00:19:25,680 --> 00:19:29,760 That's a good one. We each have our favourites.   196 00:19:29,760 --> 00:19:36,960 This was the first of a series of mini episodes  we're doing on proteins. Stay tuned for our next   197 00:19:36,960 --> 00:19:45,280 episode on the history of insulin. And if you like  this, share it with your friends and subscribe.
