Genome sequencing methods have allowed researchers to determine the full genome sequences of tens of thousands of humans as well as thousands of other species. They have also revealed the staggering array of mutations each of us carries. “There are millions of differences (mutations) between the genomes of unrelated individuals,” says Haiyuan Yu, Biological Statistics and Computational Biology. “A lot of these mutations fundamentally determine who we are, genetically speaking. They influence things like height and also our predisposition to certain diseases. But even though we researchers have accumulated tens of millions of mutations in our databases, we don’t understand what the vast majority do. This limits how we can take advantage of these sequencing results.”
Using a systems biology approach, Yu and his lab are determined to shed light on the specific roles played by various mutations. “In each cell there is an underlying molecular network,” he explains. “All biomolecules are connected through this network. If a mutation changes one protein, just one pathway, there is a big ripple effect. The impact is never limited just to that protein.”
Networking within a Cell, Uncovering the Big Picture
Yu’s approach is to study biological cellular networks, specifically protein-protein networks (where proteins act upon each other) and gene regulatory networks (where proteins regulate the activity of genes). “We are looking at the big picture,” Yu says. “That’s especially important to understand complex diseases such as cancer or autism. There’s not just one mutation that causes the disease. There are multiple mutations in multiple proteins, and often the same disease can be the result of different sets of mutations on completely different sets of proteins. To really understand this, we have to look at how mutations on individual proteins affect the whole network, and more importantly, how multiple mutations on different proteins together amplify or reduce each other’s impact. In this situation, one plus one is never two. It’s either much bigger or much smaller than two.”
Identifying functional mutations is an important task for researchers, and many tools have been developed to ascertain whether a mutation can disrupt any aspect of a given protein’s function. If it can, then researchers decide whether it is a damaging mutation based on the importance of the site it affects. A site that has been biologically conserved—meaning that it has not changed through time across individuals within a species or across species—is of high importance for correct biological function, and any mutation at that site is assumed to be damaging.
“The problem I have with this approach is that it’s not specific enough and not taking network into consideration,” says Yu. “The idea that a mutation at a conserved site is damaging is a very simple assumption that is right as far as it goes, but we need to look at much more. Proteins have many different functions, and we know that disease mutations on proteins often cause two different diseases. One mutation might affect the interactions of proteins in a particular pathway, while another might not affect that pathway at all, but instead disrupt a different pathway causing a totally different disease.”
Cellular Networking and Disease Mutations
In the summer of 2018, Yu and his collaborators published a breakthrough paper in the journal Nature Genetics that put forth a framework for identifying disease mutations that takes the cellular network into account. Using algorithms, the researchers looked at the DNA sequences of thousands of autistic children and their unaffected siblings, searching for mutations whose effects on the network make them likely candidates to contribute to the disease. “We used a lot of experiments to validate that our prediction algorithm works well, and then we applied it to the data,” Yu says. “We still have a ways to go, but our framework is the first that is able to prioritize likely causal missense mutations. That’s very satisfying and exciting.”
Yu is now applying the same approach to cancer. Working with Weill Cornell Medicine faculty member Steven M. Lipkin, Medicine/Genetic Medicine, the Yu Lab is analyzing genetic data on a cohort of patients and their relatives with a family history of multiple myeloma. “Again, we are looking to predict not only whether a mutation damages, but to be very precise and say exactly what functions the mutation affects,” Yu says.
In these studies, Yu and his collaborators used existing methods to elucidate the interactions of proteins. One is the yeast two-hybrid (Y2H) method, which requires researchers to clone two proteins and then put them into a simple yeast system to study how they interact with each other. Another is proteomics, which depends on pulling a target protein out of a cell and then analyzing the accompanying material in a mass spectrometer to identify the other proteins that were extracted along with the target. Both of these methods are very labor-intensive and require tens of thousands of experiments to pinpoint each protein’s interactions. “Even then you don’t get all the possible interactions because you lose time points and tissue specificity,” Yu says. “You don’t have the dynamics.”
Maps to Show the Interactions within a Cell
To address this problem, a major part of Yu’s research focuses on developing novel, proteomics-based technologies to generate maps of protein networks. Collaborating with Hening Lin, Chemistry and Chemical Biology, the Yu Lab is creating a new approach to detect protein interactions using state-of-the-art cross-linking mass spectrometry. The researchers hope to freeze all the interactions in a cell at a given time point and then use a mass spectrometer to identify all the cross-linked protein interactions. “We want to see which proteins are connected to each other right at the moment we froze the cell,” Yu explains. “If we are successful, this will revolutionize the scale at which we identify interactions and totally change the type of interactions we can gain. This new method will allow us to map out the networks.
“This technology will take time to develop, and then it will take even more time to make it into clinical applications,” Yu continues. “But that’s what keeps me going at night and gets me up early in the morning. I come to work full of energy to attack these problems and make progress because I can see the light at the end of the tunnel.”