“We are learning about protein sequences three thousand times faster than we are learning about protein structures. We have this enormous gap in which certain things, like genetics, are very easy to measure and certain things, like three-dimensional (3D) structures, are very hard to measure, and we would love to close this gap,” said John Jumper, Breakthrough Prize laureate.
Delivering the TNQ Distinguished Lectures in the Life Sciences – 2024 on ‘Highly Accurate Protein Structure Predictions: Using AI to Solve Biology Problems in Minutes Instead of Years’ at the JN Tata Auditorium, IISc, Mr. Jumper said that protein structures are precious and should be collected into a central resource.
Mr. Jumper, who currently leads the AlphaFold2 project at Google DeepMind, spoke about the project, which has made structure predictions for over 200 million proteins.
AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. The programme is said to reduce the time scientists take to determine protein structures, and to demonstrate the impact artificial intelligence (AI) can have on scientific discovery.
Mr. Jumper has been developing novel methods to apply AI and machine learning to protein biology.
“We really want to make AlphaFold have a wider domain and be more useful. There is a tremendous amount of interaction between protein and deoxyribonucleic acid (DNA), protein and ribonucleic acid (RNA), etc., so we really want to achieve this goal of the whole Protein Data Bank (PDB): how do we make predictions for all the atoms that you could see within the PDB? This is still a work in progress,” he said.
Mr. Jumper said that data has to be diverse for machine learning (ML).
“There are a couple of key components for ML. Data has to be diverse. Data has to be as diverse as the problem you are looking to solve. Bigger and more general problems are easier to solve with ML,” Mr. Jumper said.