The evolution of the immunome network was investigated by combining
information about genes and proteins forming the immunome. A total of 847
essential genes and proteins related to the human immune system were
identified by text and bioinformatic data mining. The evolutionary history
and relationships of the immunome proteins were obtained from ImmTree,
which contains information on the orthologs for each protein. An
evolutionary level was assigned for each protein. The level denotes
when the protein emerged during evolution. The levels are presented in
Table 1. The third essential component, the PPIs, came from the Human Protein Reference Database (HPRD), which contains only experimentally proven interactions.
Table 1. Network parameters for the human immunome at the evolutionary levels of the corresponding subnetworks.
Further, interactions for low-level subnetworks, levels 6–9, were collected for Drosophila melanogaster [42,43] and Caenorhabditis elegans.
This was done to reconstruct early interactions between immunome
proteins which might have been lost during the evolutionary processes.
Only 13 new interactions were identified in these subnetworks and thus
they did not affect the overall trends. It would have been beneficial
to have PPI data for model organisms on all 10 evolutionary levels.
This is not currently possible because proteome-wide PPI data are lacking.
For the analysis, we made some assumptions. We used PPIs identified
in the human proteome and assumed that if two proteins interact
in human, they also interact in any organism in which the two proteins
coexist. We also assumed that the interaction has existed throughout
evolution ever since both proteins were present in a lineage. The model is
a simplification of all possible cases. This assumption is also used
in interaction prediction [44,45],
and although it does not hold without exception, it is still true for the
majority of interactions. Our results for interactions in fruit fly and worm
support this idea, because hardly any new interactions were found in these
low-level subnetworks. The model simplifies the evolution of the entire
interaction network to the evolution of its nodes. This is necessary
since we do not have a method to track the evolutionary past of
interactions, while the phylogenetic analysis of the proteins has well
established and accepted procedures.
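
Under the assumptions above, the network at a given evolutionary level reduces to the human interactions whose two partners both already existed at that level. A minimal sketch of this reduction follows; the function name, the toy proteins and their levels are hypothetical, and the convention that a larger level number means an older protein follows the level numbering used in the text (level 0 is Homo sapiens).

```python
def subnetwork(edges, level, k):
    """Return the interactions whose two proteins both have level >= k.

    edges: iterable of (protein_a, protein_b) pairs from the human network.
    level: dict mapping protein -> evolutionary level (larger = older).
    k:     evolutionary level of the subnetwork to reconstruct.
    """
    return [(a, b) for a, b in edges if level[a] >= k and level[b] >= k]


# Hypothetical toy data, not taken from the study.
level = {"A": 7, "B": 7, "C": 4, "D": 0}
edges = [("A", "B"), ("A", "C"), ("C", "D")]

for k in (7, 4, 0):
    print(k, subnetwork(edges, level, k))
```

Each subnetwork is a subset of the next one, so the series of subnetworks only ever gains nodes and interactions as the level number decreases.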
The human immunome PPI network contains 1349 interactions for 584 proteins (Table 1).
Since the network includes only experimentally proven interactions from
the HPRD database, we can assume it represents a real, albeit
incomplete, network model of protein interactions in the human immune
system. Not all the immunome proteins are included because the data is
not complete and does not cover all proteins in all cell types and
Of the investigated proteins, protein-tyrosine kinase FYN has the
highest number of interactions. The subnetwork of FYN and its first
neighborhood includes 47 proteins and 93 interactions, which account
for about 8% of all the immunome nodes and 6.9% of interactions.
Another Src-family member, lymphocyte-specific protein tyrosine kinase
(LCK), is the second most linked protein with 37 interactions. Of
the 584 proteins, 64 have more than 10 interactions. Many of these are
mediators of signal transduction pathways, for example the Janus
kinases (JAKs), the signal transducer and activator of transcription
(STAT) family members, and the TNF receptor-associated factors (TRAFs)
(see Additional file 1).
Additional file 1.
Proteins of the protein interaction network of the human immune system.
Basic information, including protein name, Entrez Gene ID for the
coding gene, evolutionary level and degree, is listed for all the
proteins in the human immunome PPI network.
For further analysis, subnetworks were created for eight evolutionary
levels. Levels 8 and 9 were excluded from the analysis because these
levels contain only a few proteins and interactions. All subnetworks
contain the nodes from the examined and earlier levels, and the
interactions between them (Table 1, Fig 1, Additional file 2).
They thus represent the interaction network that existed at different
steps during evolution. The lowest, level 7, subnetwork is small, with
112 proteins and 133 interactions which existed in the ancestors of Homo sapiens when
the taxon Bilateria was formed. In each evolutionary step the number of
proteins and the number of interactions between them grow substantially. The
degree distribution of the subnetworks follows a power-law distribution
with an exponent between 2 and 3 (Fig 2). The log-likelihood ratio (-2logΛ),
which marks the likelihood of the power law degree distribution of the
dataset, is much higher in the higher level networks (Table 1).
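
As an illustration of how a power law exponent can be estimated from a list of node degrees, the sketch below uses a commonly used closed-form approximation to the discrete maximum likelihood estimator. This is an assumption made for illustration and is not necessarily the fitting procedure used for Fig 2; the cutoff k_min and the function name are hypothetical, and the approximation is known to be less accurate for small k_min.

```python
import math


def powerlaw_alpha(degrees, k_min):
    """Approximate maximum likelihood estimate of the power law exponent.

    Uses alpha ~ 1 + n / sum(ln(k / (k_min - 0.5))) over all degrees
    k >= k_min, where n is the number of such degrees.
    """
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum


# Hypothetical degree sequence, not data from the study.
example_degrees = [1, 1, 1, 1, 2, 2, 3, 5, 8, 20]
print(powerlaw_alpha(example_degrees, k_min=1))
```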
Additional file 2.
Graph representation of the immunome network at the evolutionary
levels. Graphs for each level are presented separately. The gene
symbols are shown.
Figure 1. Graph representation of the immunome network at the evolutionary levels. Colors represent the levels of nodes as shown. See Additional file 2 for gene names and the network on the different evolutionary levels.
Figure 2. Estimates for the power law exponents (α) of the networks on the different levels. The bars show the confidence intervals with the 2.5% lower and 97.5% upper boundaries.
When analyzing the relationship between the evolutionary levels and the
degrees (the number of interactions of the proteins), we expected
proteins which appeared early on to have more interactions,
and thus nodes with higher evolutionary levels should have higher
degrees. However, a simple comparison of node level numbers and their
degree in the level 0 network does not show this phenomenon (Fig 3).
When we tested whether the new nodes introduced in evolutionary steps
tend to attach to nodes with higher numbers of connections, a
statistically significant preferential attachment is clear (Fig 4).
We compared the degree distribution of all the nodes and of those nodes
which get new connections in the next evolutionary level. The nodes
with new connections have higher degrees in each step. This implies
that when a new node is introduced into the immunome network, it most
likely attaches to a node with a higher degree, so there is a bias
toward attachment to higher degree nodes.
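
The comparison described above can be sketched as follows: for one evolutionary step, collect the degrees of all nodes already in the network and, separately, the degrees of those nodes that receive a connection from a newly introduced node. The function names and toy data are hypothetical, and the sketch only compares medians; a rank-based test such as Kruskal-Wallis would then be applied to the two degree lists.

```python
from collections import Counter
from statistics import median


def degree_counts(edges):
    """Count the number of interactions of each node."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


def attachment_comparison(edges, level, k):
    """Degrees in the level-k subnetwork for all nodes and for the nodes
    that gain a link from a node first appearing at level k - 1."""
    old_nodes = {n for n, lvl in level.items() if lvl >= k}
    old_edges = [(a, b) for a, b in edges if a in old_nodes and b in old_nodes]
    deg = degree_counts(old_edges)
    targets = set()
    for a, b in edges:
        for new, old in ((a, b), (b, a)):
            if level[new] == k - 1 and old in old_nodes:
                targets.add(old)
    all_deg = [deg[n] for n in old_nodes]  # nodes without links count as 0
    target_deg = [deg[n] for n in targets]
    return all_deg, target_deg


# Hypothetical toy network: H is a hub at level 3, N1 and N2 appear at level 2.
level = {"H": 3, "P": 3, "Q": 3, "N1": 2, "N2": 2}
edges = [("H", "P"), ("H", "Q"), ("N1", "H"), ("N2", "H")]

all_deg, target_deg = attachment_comparison(edges, level, 3)
print(median(all_deg), median(target_deg))
```

In the toy network both new nodes attach to the hub, so the median degree of the nodes with new connections exceeds the median over all nodes.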
Figure 3. Degree distribution for evolutionary levels of the protein-protein interaction network. Notches represent the 95% confidence interval of the median.
Figure 4. Comparison of degree distributions of all the nodes and the nodes
with new connections in the immunome protein interaction network at
evolutionary levels 0–7. Degree distribution for all the nodes in the
network is on the lower half of the subgraphs, while degree
distribution only for nodes with new connections, representing the
proteins with newly formed interactions, is on the upper half. P values
for the Kruskal-Wallis Rank Sum Test are shown on the plots. Nodes with
new connections have higher degrees than the others, and the difference
is considered significant at levels 1, 2, 4, 5 and 6. Notches represent
the 95% confidence interval of the median.
scale free models although the immunome protein interaction network,
like other PPI networks, does not contain enough nodes to fulfill the
statistical criteria for scale freeness. Therefore we mostly used
general descriptive measures of networks, such as efficiency, and avoided
aspects specific to scale free networks in our conclusions.
An important feature of the scale free protein interaction networks
is that highly connected nodes tend to be essential and therefore more
conserved [40,46]. We used the average entropy 
of the proteins to measure how conserved, and thus how essential they
are. Entropy was used to measure the variability of the sites in a
multiple protein sequence alignment instead of comparing a human
sequence to an ortholog in a reference genome. We thus take variability
into account from many sequences instead of a sequence pair. Proteins
with high connectivity never have high entropy; yet on the other hand,
some of the proteins with just a few connections have very high average
entropy, which means that they are not conserved (Fig 5).
We detect this phenomenon on all the evolutionary levels. These levels
also contain enough nodes to allow binning of the data (Fig 6). More conserved proteins are more connected during the evolution of the immunome protein interaction network.
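
To make the conservation measure concrete, the sketch below computes the Shannon entropy of each column of a multiple sequence alignment and averages it over the columns. The exact entropy definition used in the study is not given in this section, so the base-2 logarithm, the treatment of gaps (ignored) and the function names are assumptions.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    counts = Counter(c for c in column if c != "-")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def average_entropy(alignment):
    """Mean column entropy of an alignment given as equal-length strings."""
    columns = list(zip(*alignment))
    return sum(column_entropy(col) for col in columns) / len(columns)


# Hypothetical alignment: first column fully conserved, second fully variable.
print(average_entropy(["AC", "AG"]))
```

A fully conserved column contributes zero entropy, so a low average corresponds to a well conserved protein and a high average to a variable one.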
Figure 5. Conservation of the proteins as a function of their connectivity in the human immunome PPI network.
Conservation is measured by the average entropy of the proteins. Data
points are binned so that a minimum of 6 points are in each degree
interval. The conclusion, that proteins with high connectivity never
have high entropy, does not depend on the binning. The notation [11,14)
means a bin for degree values 11, 12 and 13.
Figure 6. Conservation
of the proteins as a function of their connectivity in the immunome
protein-interaction networks at the eight evolutionary levels. Data is presented in the same way as for Fig 5.
We further studied how several network characteristics changed during
the evolution of the network to find out what kind of selective forces
affect its development. Global efficiency quantifies the efficiency of
the network in sending information between nodes.
According to earlier studies, the efficiency of a scale free network is
expected to decrease when the size of the network is growing [36,48].
Surprisingly, the efficiency of the immunome network grows through the
evolutionary steps, from an initial value of around 0.24 to a final
value of around 0.41, although the numbers of nodes and edges are also
growing (Fig 7A).
This means that despite the number of nodes increasing from 112 to 584,
the average number of steps necessary to reach one random node from
another decreases. Since this is against the expected behavior (Fig 7A), we assume that a selection pressure exists which shapes the immunome PPI networks to become more efficient during evolution.
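
A minimal sketch of the two quantities compared here follows. Global efficiency is computed with the common definition as the mean of inverse shortest path lengths over all ordered node pairs, with unreachable pairs contributing zero; this definition and the function names are assumptions, since the section does not spell out the formula. The expected value is obtained by scaling 1/(ln N) so that it matches the observed efficiency of the starting network, as described for Fig 7A.

```python
import math
from collections import deque


def global_efficiency(nodes, edges):
    """Mean of 1/d(i, j) over ordered node pairs in an undirected graph."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for src in adj:
        # Breadth-first search gives shortest path lengths from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return total / (n * (n - 1))


def expected_efficiency(e_start, n_start, n):
    """1/(ln N) curve scaled to equal e_start when N equals n_start."""
    return e_start * math.log(n_start) / math.log(n)


# Hypothetical toy graph: a path of three nodes.
print(global_efficiency(["A", "B", "C"], [("A", "B"), ("B", "C")]))
print(expected_efficiency(0.24, 112, 584))
```

With the node counts and starting efficiency reported above, the 1/(ln N) scaling in this sketch gives roughly 0.18 for the full network, well below the observed value of around 0.41.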
Figure 7. Characteristics of the human immunome network during evolution. (A) Efficiency (solid line) and expected efficiency calculated as a function of 1/(ln ln N) (dashed line), and 1/(ln N) (dotted line), where N is
the number of nodes in the network. Expected efficiency curves are
scaled to have the same starting values as the observed network. The
shape of the observed efficiency curve shows the opposite trend to that
expected, suggesting that selective forces during evolution favor
higher efficiency. (B) Maximal vulnerability.
The critical components of a network can be found by looking for the most vulnerable nodes.
Vulnerability is defined as the drop in efficiency when a node and all
its edges are removed from the network. The maximal value of the
vulnerability is the overall vulnerability of the whole network. The
maximal vulnerability of the immunome network constantly decreases
during the evolutionary steps from the initial value of 0.28 (Fig 7B). At the level of Homo sapiens (level
0) the value is 0.003, which means that the maximal drop in the efficiency
of the network is 0.3% if one of the nodes is deleted from the network.
Scale free networks are known to be tolerant of errors in randomly
chosen nodes, a feature that is also important for biological interaction networks.
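
The vulnerability calculation can be sketched as below. The sketch assumes that the drop in efficiency is expressed relative to the efficiency of the intact network, which is consistent with reading a value of 0.003 as a 0.3% drop, and that the removed node is also excluded from the node count; both points, the function names and the toy network are assumptions. The network must contain at least one interaction, otherwise the baseline efficiency is zero.

```python
from collections import deque


def efficiency(adj):
    """Mean of 1/d over ordered node pairs; unreachable pairs add 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return total / (n * (n - 1))


def without(adj, node):
    """Copy of the adjacency map with one node and its edges removed."""
    return {u: nbrs - {node} for u, nbrs in adj.items() if u != node}


def vulnerabilities(adj):
    """Relative drop in efficiency caused by removing each node in turn."""
    base = efficiency(adj)
    return {n: (base - efficiency(without(adj, n))) / base for n in adj}


# Hypothetical star network: hub C connected to three leaves.
star = {"C": {"L1", "L2", "L3"}, "L1": {"C"}, "L2": {"C"}, "L3": {"C"}}
vuln = vulnerabilities(star)
most_vulnerable = max(vuln, key=vuln.get)
print(most_vulnerable, vuln[most_vulnerable])
```

In the toy star network the hub is the single critical component, because removing it disconnects every remaining pair. A network whose maximal vulnerability is close to zero has no such single point of failure.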