Which Single Amino Acid Mutations Cause Disease?

Almost all proteins function by interacting with other proteins. Studies have shown that the vast majority of damaging single amino acid mutations disrupt only a subset of specific protein-protein interactions. Research also shows that mutations in the same protein that disrupt different interactions tend to cause clinically distinct disorders. Determining which protein interactions each mutation disrupts could therefore help pinpoint the cause of various diseases.

More than 10 million single nucleotide variants (SNVs) exist in protein-coding genes in the human population. Unfortunately, no existing method can predict the specific impact of a large fraction of these SNVs on individual protein-protein interactions.

Haiyuan Yu, Biological Statistics and Computational Biology, and Andrew G. Clark, Molecular Biology and Genetics, with Emil G. Alexov (Clemson University), are developing a high-throughput pipeline to quickly clone and directly test a large number of SNVs for their impact on the human interactome network, the whole set of molecular interactions in a cell. The massive amount of new data generated will allow researchers, for the first time, to comprehensively assess the relationship between the impact of SNVs on interactions and the population genetic attributes of those SNVs.

The research team will also use these data to build a machine learning pipeline that can accurately predict specific impacts on all individual protein-protein interactions for all SNVs. The work will fuel hypothesis-driven research, significantly improve the functional understanding of variants, and fundamentally change the experimental design and data interpretation for whole-genome studies, with broad clinical and therapeutic applications.

Cornell Researchers

Funding Received

$2.3 million over 4 years

Sponsored by