Artificial intelligence has been used to predict the structures of almost all the proteins made by the human body.
Proteins are essential building blocks of living organisms; every cell we have in us is packed with them.
Understanding the structures of proteins is essential for advancing medicine, but so far only a fraction of these have been worked out.
Researchers used a program to predict 350,000 protein structures belonging to humans and other organisms.
The instructions for making human proteins are held in our genomes – the DNA contained in the nuclei of human cells.
There are approximately 20,000 of these proteins expressed by the human genome. Collectively, biologists call this full complement the “proteome”.
The AI program used for the job is called AlphaFold. It was able to make a confident prediction of the structural positions of 58% of amino acids (the building blocks of proteins) in the human proteome.
The positions of 35.7% of amino acids were predicted with a very high degree of confidence, double the number confirmed by experiments.
“We believe this is the most complete and accurate picture of the human proteome to date,” said Dr Demis Hassabis, CEO and co-founder of DeepMind.
“We believe this work represents the most significant contribution AI has made to the advancement of scientific knowledge to date.
“And I think that’s a great illustration and an example of the kind of benefits AI can bring to society.”
In the prestigious scientific journal Nature, DeepMind researchers detailed how AlphaFold predicted the structures of 350,000 different proteins, including not only the 20,000 of the human proteome, but also those of so-called model organisms used in scientific research, such as E. coli, yeast, the fruit fly and the mouse.
The structural arrangement of different proteins can be worked out using a variety of techniques including X-ray crystallography, cryogenic electron microscopy (cryo-EM), and others. But none of this is easy to do: “It takes a tremendous amount of money and resources to build structures,” Professor John McGeehan, a structural biologist at the University of Portsmouth, told BBC News.
Structures are therefore usually determined as part of targeted scientific investigations, and no project until now has systematically determined the structures of all the proteins made by the body.
In fact, only 17% of the proteome is covered by an experimentally confirmed structure.
Commenting on AlphaFold’s predictions, Professor McGeehan said: “It’s just the speed – the fact that it took us six months per structure and now it takes a few minutes. We couldn’t really have predicted that it would happen so fast.”
Professor Edith Heard, of the European Molecular Biology Laboratory (EMBL), said: “At EMBL we believe this will transform our understanding of how life works. This is because proteins represent the basic building blocks from which living organisms are made.
“Applications are only limited by our understanding.”
Applications we can envision now include developing new drugs and treatments for disease, designing future crops that can withstand climate change, and creating enzymes that can break down the plastic that pervades the environment.
Dr Ewan Birney, director of EMBL’s European Bioinformatics Institute, said the structures predicted by AlphaFold were “one of the most important datasets since mapping the human genome.”
DeepMind has partnered with EMBL to make the AlphaFold code and its protein structure predictions freely accessible to the global scientific community.
Dr Hassabis said DeepMind plans to dramatically expand the database’s coverage to almost every sequenced protein known to science – over 100 million structures.