Proteins and Facebook Friends

New model of cellular interaction developed by computer scientists could help researchers find better biological targets to study

To fully understand how a disease takes hold in the body, biologists have to get down in the weeds to learn how individual proteins or genes work, often spending years uncovering tiny interactions that can promote illness or health. Deciding exactly which genes or proteins to examine has always been difficult, because researchers still don’t know what many of them do.

Now a team of Tufts computer scientists and their colleagues has developed a model to enhance their understanding of protein function by methodically tracking the interactions among related proteins in a cell. The discovery could provide biologists with better targets for their research.

Proteins play a critical role in the body. They do most of the work in cells, and are required for the structure, function and regulation of the body’s tissues and organs, according to the National Institutes of Health’s Genetics Home Reference.

Using an organism whose genome has been well mapped and fairly well annotated—the common baker’s yeast S. cerevisiae—the researchers realized they needed to understand the relationships among the many proteins present in the yeast to more accurately assess how the individual contributors work. Current thinking holds that proteins that interact in the cell often have similar functions.

But some proteins are multitaskers, performing many functions at once, and so might be giving erroneous clues about what their neighbors are up to. The researchers’ computer model paints a more detailed picture of the relationships among the proteins, and thus a better prediction of their likely roles. Their findings were published this week in the open-access, peer-reviewed journal PLOS One.

“We know which proteins are interacting and collaborating to do a job,” says Lenore Cowen, a professor of computer science in the School of Engineering. “We know what some of these proteins do, and others we don’t.” Much like groups of adolescents, “who you hang out with might tell me a lot about who you are and what you do,” she says.

Their predictions about the relationships among proteins were derived from social networking. Imagine you’re on Facebook, says Ben Hescott, an assistant professor of computer science in the School of Engineering and lead author on the paper. It’s reasonable to assume that your friends’ interests will be a tip-off to your interests.

But what if one of your Facebook friends is Beyoncé? The odds that a random friend of Beyoncé’s would actually reveal much information about you are pretty remote—after all, she’s got millions of Facebook friends. To really understand your interests, it makes sense to tamp down the influence of your mutual friends with Beyoncé, and amplify the interests of friends who are more likely to be your real friends, Hescott says.

In the case of that blob of S. cerevisiae yeast, some proteins within the organism are like Beyoncé: they have connections with a lot more proteins than most. Inside S. cerevisiae, there are about 4,900 proteins and 74,310 “edges” or connections between proteins that physically interact. “It’s like a hairball of interconnectedness,” says Cowen.

Teasing out the connections among the proteins helps pinpoint their exact functions. Existing algorithms try to do that, but they simply assume that the closer two proteins are in the interaction network, the more closely related their functions are.
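The distance-based assumption can be illustrated with a short sketch. The article does not say which existing algorithms are meant, so this example assumes the simplest possible notion of distance: the number of edges on the shortest path between two proteins. The toy network and every name in it (`toy_network`, `HUB`, `hop_distances`) are hypothetical.

```python
from collections import deque


def hop_distances(adjacency, source):
    """Breadth-first search: edges on the shortest path from source to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


# Hypothetical toy network: HUB touches every protein (the "Beyonce" of the
# group), while A-B and C-D are small, tightly linked pairs.
toy_network = {
    "HUB": ["A", "B", "C", "D"],
    "A": ["HUB", "B"],
    "B": ["HUB", "A"],
    "C": ["HUB", "D"],
    "D": ["HUB", "C"],
}

from_a = hop_distances(toy_network, "A")
print(from_a)  # HUB and B are both one hop from A; C and D are two hops away
```

Measured this way, the hub looks exactly as close to A as A's real partner B does, which is the kind of misleading clue a highly connected protein can give.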

The researchers’ new computer model, called “diffusion state distance,” takes into account additional complexities of the interactions—for instance, that sometimes proteins interact but are not crucial to each other for functioning—and suggests what tasks each protein likely is responsible for, says collaborator Mark Crovella, a professor of computer science at Boston University.
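The article does not give the math behind diffusion state distance, so the following is only a rough sketch under a stated assumption: that each protein is described by how often a short random walk starting from it is expected to visit every other protein, and that two proteins are compared by summing the absolute differences between those visit counts. The toy network, the function names, and the walk length of three steps are all hypothetical choices made for illustration, not details from the paper.

```python
def visit_counts(adjacency, start, steps):
    """Expected visits to each node by a random walk of `steps` steps from `start`.

    Assumes every node has at least one neighbor.
    """
    prob = {node: 0.0 for node in adjacency}
    prob[start] = 1.0
    visits = dict(prob)  # step 0 counts as one visit to the start node
    for _ in range(steps):
        nxt = {node: 0.0 for node in adjacency}
        for node, p in prob.items():
            if p:
                share = p / len(adjacency[node])  # walk picks a neighbor uniformly
                for neighbor in adjacency[node]:
                    nxt[neighbor] += share
        prob = nxt
        for node in adjacency:
            visits[node] += prob[node]
    return visits


def diffusion_distance(adjacency, u, v, steps=3):
    """Sum of absolute differences between the visit-count profiles of u and v."""
    hu = visit_counts(adjacency, u, steps)
    hv = visit_counts(adjacency, v, steps)
    return sum(abs(hu[node] - hv[node]) for node in adjacency)


# Hypothetical toy network: HUB is connected to everything; A-B and C-D are pairs.
toy_network = {
    "HUB": ["A", "B", "C", "D"],
    "A": ["HUB", "B"],
    "B": ["HUB", "A"],
    "C": ["HUB", "D"],
    "D": ["HUB", "C"],
}

a_to_b = diffusion_distance(toy_network, "A", "B")
a_to_hub = diffusion_distance(toy_network, "A", "HUB")
print(a_to_b, a_to_hub)
```

In this toy network B and HUB are both direct neighbors of A, yet the sketch gives A-to-B a distance of 1.25 and A-to-HUB a distance of 2.5. The highly connected node ends up farther away because a walk starting there spreads across the whole network rather than staying near A.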

Crovella suggests an analogy. The difference between the new and old systems, he says, is like that between a regular map of the United States, showing only major roads, and the two-page map in the Rand McNally atlas that specifies distances in miles between many interconnected cities: there’s much more information available at a glance.

“We’re modifying the way to look at the data,” says Hescott. The results aren’t perfect, he notes, but they are distinctly better than those of existing algorithms. “By looking at the data as computer scientists at a very high level, we can predict what a protein or a gene does.”

That’s the approach that computational biologists take: they use computers to analyze data to study living mechanisms. It’s related, the researchers say, to the work of this year’s recipients of the Nobel Prize in chemistry. The three laureates are computational chemists who use computer models—instead of laboratory research—to advance knowledge of complex chemical systems.

Like any model, though, “the only way we’ll know if we’re right is if the biologists go to the wet lab and test this,” says Cowen.

The research group, which includes several graduate students in computer science working with Hescott and Cowen, is also looking at applications beyond biological networks, because so many systems, from the Internet to transportation, are interconnected networks.

Taylor McNeil can be reached at
